Define nested dataclass hierarchies as clean, readable schemas — no boilerplate, no dependencies.
Python's `@dataclass` requires you to define each level of a nested hierarchy separately, then wire them together manually. Each inner class name must appear three times: once in the class definition, once as the field type hint, and once in `field(default_factory=...)`:

```python
from dataclasses import dataclass, asdict, field

@dataclass
class NestedParent:
    @dataclass
    class Child:
        @dataclass
        class GrandChild:
            grandchild_str: str = "grandchild1"
            grandchild_num: int = 1
        grandchild: GrandChild = field(default_factory=GrandChild)
        child_str: str = "child"
    child: Child = field(default_factory=Child)
    parent_str: str = "parent"
```

And even after all that boilerplate, the `asdict` round-trip is broken:

```python
NestedParent(**asdict(NestedParent())) == NestedParent()  # False
```

This is because `asdict` serialises nested instances to plain dicts, but `@dataclass` does not coerce them back on construction, unlike flat dataclasses where this works naturally.

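To see concretely what goes wrong, the following stdlib-only sketch inspects the rebuilt object. The `Inner`, `Outer`, and `CoercingOuter` classes are hypothetical names introduced for this illustration. The `__post_init__` hook is one hand-written workaround, shown only to make the missing coercion step visible; it is not a description of how `deep_dataclasses` works internally.

```python
from dataclasses import dataclass, asdict, field


@dataclass
class Inner:
    value: int = 1


@dataclass
class Outer:
    inner: Inner = field(default_factory=Inner)
    name: str = "outer"


original = Outer()
rebuilt = Outer(**asdict(original))

# asdict() flattened the nested instance, and Outer() stored the dict as-is.
print(type(rebuilt.inner).__name__)  # dict
print(rebuilt == original)           # False


@dataclass
class CoercingOuter:
    inner: Inner = field(default_factory=Inner)
    name: str = "outer"

    def __post_init__(self):
        # Hand-written coercion: turn a plain dict back into an Inner instance.
        if isinstance(self.inner, dict):
            self.inner = Inner(**self.inner)


fixed = CoercingOuter(**asdict(CoercingOuter()))
print(fixed == CoercingOuter())      # True
```

A hook like this has to be repeated at every level of the hierarchy, which adds to the boilerplate described above.
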
`@deep_dataclass` lets you express the same hierarchy as a natural nested schema. The decorator infers the class name, type hint, and `default_factory` from the nested block, with no repetition:

```python
from deep_dataclasses import deep_dataclass

@deep_dataclass(autosnake=True)
class DeepParent:
    class Child:
        class Grandchild:
            grandchild_str: str = "grandchild1"
            grandchild_num: int = 1
        child_str: str = "child"
    parent_str: str = "parent"

print(DeepParent().child.grandchild)
# Grandchild(grandchild_str='grandchild1', grandchild_num=1)
```

The `autosnake=True` option converts PascalCase inner class names to snake_case field names. Without it the field name matches the class name exactly.

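The PascalCase to snake_case conversion can be illustrated with a short regular expression. This is a minimal sketch of one common conversion rule, not the library's actual implementation: `pascal_to_snake` is a hypothetical helper, and how `autosnake` treats edge cases such as acronyms is not specified here.

```python
import re


def pascal_to_snake(name: str) -> str:
    """Hypothetical helper: 'TrainMode' -> 'train_mode'."""
    # Insert "_" before every uppercase letter except the first character,
    # then lowercase the whole string.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


for class_name in ("Child", "Grandchild", "GrandChild", "TrainMode"):
    print(class_name, "->", pascal_to_snake(class_name))
# Child -> child
# Grandchild -> grandchild
# GrandChild -> grand_child
# TrainMode -> train_mode
```

Under this rule the capitalisation of the class name matters: `Grandchild` yields `grandchild`, while `GrandChild` would yield `grand_child`.
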
`@deep_dataclass` produces standard dataclass instances, so all stdlib tools work as expected, and the `asdict` round-trip is fixed:

```python
d1 = NestedParent()  # vanilla dataclass hierarchy
d2 = DeepParent()    # deep_dataclass equivalent

asdict(d1) == asdict(d2)          # True: identical structure
DeepParent(**asdict(d2)) == d2    # True: coercion works
NestedParent(**asdict(d1)) == d1  # False: vanilla @dataclass doesn't coerce nested dicts
```

## Validation and Config Loading: A poor man's pydantic

`to_json_schema` exports any `@deep_dataclass` schema for use with third-party validators. Because `@deep_dataclass` coerces nested dicts at construction time, the validate-then-construct pattern works at all depths:

```python
from deep_dataclasses import to_json_schema
import json
import jsonschema

raw = json.loads('{"child": {"grandchild": {"grandchild_num": 2}}}')
jsonschema.validate(raw, to_json_schema(DeepParent))  # validate first
cfg = DeepParent(**raw)                               # then construct, fully typed
assert isinstance(cfg.child, DeepParent.child)        # True
```

Validation catches type violations at any nesting depth:

```python
data = asdict(DeepParent())
data['child']['child_str'] = 3                         # inject a type error
jsonschema.validate(data, to_json_schema(DeepParent))  # raises ValidationError
```

```
Failed validating 'type' in schema['properties']['child']['properties']['child_str']:
    {'type': 'string', 'default': 'child'}

On instance['child']['child_str']:
    3
```

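The path in the error message suggests that the exported schema mirrors the nesting of the classes, with one `properties` object per level. A minimal stdlib-only sketch of such a mapping follows. `sketch_schema` is a hypothetical function written for this illustration; it handles only a few primitive types and is an assumption about the schema's shape, not the implementation of `to_json_schema`.

```python
from dataclasses import MISSING, dataclass, field, fields, is_dataclass

# Assumed mapping from Python primitives to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def sketch_schema(cls):
    """Build a nested JSON-Schema-like dict from a dataclass (illustration only)."""
    props = {}
    for f in fields(cls):
        if is_dataclass(f.type):
            # Nested dataclass: recurse, producing one "properties" level per class.
            props[f.name] = sketch_schema(f.type)
        else:
            entry = {"type": _JSON_TYPES[f.type]}
            if f.default is not MISSING:
                entry["default"] = f.default
            props[f.name] = entry
    return {"type": "object", "properties": props}


@dataclass
class Child:
    child_str: str = "child"


@dataclass
class Parent:
    child: Child = field(default_factory=Child)
    parent_str: str = "parent"


schema = sketch_schema(Parent)
print(schema["properties"]["child"]["properties"]["child_str"])
# {'type': 'string', 'default': 'child'}
```
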
`@deep_dataclass` works with the full range of `typing` annotations. The `@auxiliary` decorator marks an inner class as a type-only helper: it won't become a standalone field, but can be referenced in `Union[...]`, `List[...]`, `Optional[...]`, etc.

```python
from dataclasses import field, asdict
from typing import Literal, List, Union
from deep_dataclasses import deep_dataclass, auxiliary, to_json_schema
import jsonschema

@deep_dataclass
class Config:
    @auxiliary
    class TrainMode:
        lr: float = 0.001
        pseudo_batch_size: int = 32

    @auxiliary
    class TestMode:
        metric: str = "accuracy"
        folds: int = 5

    mode: Union[TrainMode, TestMode] = field(default_factory=TrainMode)
    device: Literal["cpu", "cuda"] = "cpu"
    images: List[str] = field(default_factory=list)
```

When constructing from a dict, `@deep_dataclass` selects the `Union` variant whose field names best cover the keys supplied; an exact match is always preferred over a partial one:

```python
cfg_train = Config(mode={"lr": 0.05})     # 'lr' is a TrainMode field
cfg_test = Config(mode={"metric": "f1"})  # 'metric' is a TestMode field

assert isinstance(cfg_train.mode, Config.TrainMode)
assert isinstance(cfg_test.mode, Config.TestMode)
assert cfg_train.mode.pseudo_batch_size == 32  # unspecified fields get defaults
```

Schema validation enforces `Literal` values, `Union` structure, and `List` element types:

```python
jsonschema.validate(asdict(Config()), to_json_schema(Config))  # passes

bad = asdict(Config())
bad["device"] = "tpu"                             # not in Literal["cpu", "cuda"]
jsonschema.validate(bad, to_json_schema(Config))  # raises ValidationError
```

```shell
pip install deep-dataclasses
```

`deep_dataclasses` has zero mandatory dependencies: it uses only `re`, `dataclasses`, and `typing` from the standard library. `jsonschema` (used in the examples above) is an optional third-party package needed only if you want schema validation.

| Feature | `@dataclass` | `@deep_dataclass` |
|---|---|---|
| Nested hierarchy | Manual, verbose | Inline, readable |
| `field(default_factory=...)` | Required per field | Automatic |
| Nested dict → instance coercion | ❌ | ✅ (recursive, all depths) |
| `Union` variant selection from dict | ❌ | ✅ (best-match by field coverage) |
| `asdict()` / `==` / `__repr__` | ✅ | ✅ |
| `frozen`, `slots`, etc. | ✅ | ✅ (tested) |
| Type validation | ❌ | Exports to jsonschema |
| Mandatory dependencies | stdlib | stdlib only |
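
The "best-match by field coverage" row can be made concrete with a small stdlib-only sketch. The scoring below is one possible reading of the rule (an exact key match beats a full cover, which beats a partial overlap, with ties going to the first declared variant). `pick_variant` is a hypothetical function for illustration, and the library's actual scoring may differ.

```python
from dataclasses import dataclass, fields


@dataclass
class TrainMode:
    lr: float = 0.001
    pseudo_batch_size: int = 32


@dataclass
class TestMode:
    metric: str = "accuracy"
    folds: int = 5


def pick_variant(data, variants):
    """Return the variant class whose field names best cover the keys in data."""
    keys = set(data)

    def score(cls):
        names = {f.name for f in fields(cls)}
        # Assumed ranking: exact match, then all keys covered, then overlap size.
        return (keys == names, keys <= names, len(keys & names))

    # max() keeps the first variant on ties, i.e. declaration order.
    return max(variants, key=score)


variants = (TrainMode, TestMode)
train = pick_variant({"lr": 0.05}, variants)(lr=0.05)
test = pick_variant({"metric": "f1"}, variants)(metric="f1")
print(train)  # TrainMode(lr=0.05, pseudo_batch_size=32)
print(test)   # TestMode(metric='f1', folds=5)
```
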
Early release. Core functionality is complete with 100% test coverage. API may evolve — feedback welcome on discuss.python.org.
Issues and PRs welcome. See the issue tracker for known TODOs.