deep_dataclasses

Define nested dataclass hierarchies as clean, readable schemas — no boilerplate, no dependencies.

The Problem

Python's @dataclass requires you to define each level of a nested hierarchy separately, then wire them together manually. Each inner class name must appear three times: once in the class definition, once as the field type hint, and once in field(default_factory=...):

from dataclasses import dataclass, asdict, field

@dataclass
class NestedParent:
    @dataclass
    class Child:
        @dataclass
        class GrandChild:
            grandchild_str: str = "grandchild1"
            grandchild_num: int = 1

        grandchild: GrandChild = field(default_factory=GrandChild)
        child_str: str = "child"

    child: Child = field(default_factory=Child)
    parent_str: str = "parent"

And even after all that boilerplate, the asdict round-trip is broken:

NestedParent(**asdict(NestedParent())) == NestedParent()  # False

This is because asdict serialises nested instances to plain dicts, but @dataclass does not coerce them back on construction — unlike flat dataclasses where this works naturally.
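
The failure can be reproduced with the standard library alone. The sketch below uses hypothetical classes `Flat`, `Inner`, and `Outer` (illustrative names, not part of the hierarchy above) to contrast a flat dataclass, which survives the round-trip, with a nested one, whose inner field comes back as a plain dict.

```python
from dataclasses import dataclass, asdict, field


@dataclass
class Flat:
    name: str = "flat"
    count: int = 0


@dataclass
class Inner:
    value: int = 1


@dataclass
class Outer:
    inner: Inner = field(default_factory=Inner)
    name: str = "outer"


# Flat case: every field is a plain value, so the round-trip is lossless.
flat_ok = Flat(**asdict(Flat())) == Flat()

# Nested case: asdict() turns Inner into a dict, and Outer.__init__
# stores that dict as-is instead of rebuilding an Inner instance.
rebuilt = Outer(**asdict(Outer()))
nested_type = type(rebuilt.inner).__name__
nested_ok = rebuilt == Outer()

print(flat_ok, nested_type, nested_ok)
```

The nested comparison fails because a `dict` is never equal to an `Inner` instance, even when the contents match.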

The Solution

@deep_dataclass lets you express the same hierarchy as a natural nested schema. The decorator infers the class name, type hint, and default_factory from the nested block — no repetition:

from deep_dataclasses import deep_dataclass

@deep_dataclass(autosnake=True)
class DeepParent:
    class Child:
        class Grandchild:
            grandchild_str: str = "grandchild1"
            grandchild_num: int = 1
        child_str: str = "child"
    parent_str: str = "parent"

print(DeepParent().child.grandchild)
# Grandchild(grandchild_str='grandchild1', grandchild_num=1)

The autosnake=True option converts PascalCase inner class names to snake_case field names. Without it the field name matches the class name exactly.
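
To make the naming rule concrete, here is a minimal sketch of a PascalCase to snake_case conversion. The helper name `pascal_to_snake` and the regular expression are assumptions made for illustration; the library's actual conversion rules may differ, especially for acronyms.

```python
import re


def pascal_to_snake(name: str) -> str:
    """Hypothetical helper: insert '_' wherever a lowercase letter or digit
    is directly followed by an uppercase letter, then lowercase the result."""
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


for class_name in ("Child", "Grandchild", "GrandChild", "TrainMode"):
    print(class_name, "->", pascal_to_snake(class_name))
```

Under this rule `Grandchild` maps to `grandchild`, while `GrandChild` would map to `grand_child`, so the capitalisation of the inner class name decides the field name.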

Fully Compatible with `dataclasses`

@deep_dataclass produces standard dataclass instances — all stdlib tools work as expected, and the asdict round-trip is fixed:

d1 = NestedParent()  # vanilla dataclass hierarchy
d2 = DeepParent()    # deep_dataclass equivalent

asdict(d1) == asdict(d2)        # True — identical structure
DeepParent(**asdict(d2)) == d2  # True — coercion works
NestedParent(**asdict(d1)) == d1  # False — vanilla @dataclass doesn't coerce nested dicts
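
For intuition, recursive coercion of nested dicts can be approximated with the standard library. This is a sketch under stated assumptions, not the library's implementation: `coerce` is a hypothetical standalone helper (the library performs coercion inside the constructor), and it only handles fields whose annotation is itself a dataclass.

```python
from dataclasses import asdict, dataclass, field, fields, is_dataclass


def coerce(cls, data: dict):
    """Hypothetical helper: build `cls` from `data`, rebuilding nested
    dataclass fields from dicts. Keys that are not fields are ignored."""
    kwargs = {}
    for f in fields(cls):
        if f.name not in data:
            continue  # let the dataclass default apply
        value = data[f.name]
        if is_dataclass(f.type) and isinstance(value, dict):
            value = coerce(f.type, value)  # recurse into the nested level
        kwargs[f.name] = value
    return cls(**kwargs)


@dataclass
class Grandchild:
    grandchild_str: str = "grandchild1"
    grandchild_num: int = 1


@dataclass
class Child:
    grandchild: Grandchild = field(default_factory=Grandchild)
    child_str: str = "child"


@dataclass
class Parent:
    child: Child = field(default_factory=Child)
    parent_str: str = "parent"


restored = coerce(Parent, asdict(Parent()))
partial = coerce(Parent, {"child": {"grandchild": {"grandchild_num": 2}}})
print(restored == Parent(), partial.child.grandchild)
```

Fields missing from the input fall back to their defaults, which is why a partial dict such as the one above still yields a fully populated instance.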

Validation and Config Loading: A poor man's pydantic

to_json_schema exports any @deep_dataclass schema for use with third-party validators. Because @deep_dataclass coerces nested dicts at construction time, the validate-then-construct pattern works at all depths:

from deep_dataclasses import to_json_schema
import json
import jsonschema

raw = json.loads('{"child": {"grandchild": {"grandchild_num": 2}}}')
jsonschema.validate(raw, to_json_schema(DeepParent))  # validate first
cfg = DeepParent(**raw)                               # then construct — fully typed
assert isinstance(cfg.child, DeepParent.child)        # True

Validation catches type violations at any nesting depth:

data = asdict(DeepParent())
data['child']['child_str'] = 3                         # inject a type error
jsonschema.validate(data, to_json_schema(DeepParent))  # raises ValidationError

Failed validating 'type' in schema['properties']['child']['properties']['child_str']:
    {'type': 'string', 'default': 'child'}

On instance['child']['child_str']:
    3
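
The error message above shows that each nested dataclass becomes a nested `properties` object in the exported schema. The sketch below illustrates how such a structure could be derived from stdlib dataclasses. `sketch_schema` is a hypothetical helper written for this example, not the library's `to_json_schema`, and it only supports a handful of primitive types.

```python
from dataclasses import MISSING, dataclass, field, fields, is_dataclass

# Assumed mapping from Python primitives to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def sketch_schema(cls) -> dict:
    """Hypothetical helper: emit an object schema for a dataclass,
    recursing into fields whose annotation is another dataclass.
    Raises KeyError for annotations outside _JSON_TYPES."""
    props = {}
    for f in fields(cls):
        if is_dataclass(f.type):
            props[f.name] = sketch_schema(f.type)
        else:
            entry = {"type": _JSON_TYPES[f.type]}
            if f.default is not MISSING:
                entry["default"] = f.default
            props[f.name] = entry
    return {"type": "object", "properties": props}


@dataclass
class Child:
    child_str: str = "child"


@dataclass
class Parent:
    child: Child = field(default_factory=Child)
    parent_str: str = "parent"


schema = sketch_schema(Parent)
print(schema["properties"]["child"]["properties"]["child_str"])
```

The printed fragment has the same shape as the one quoted in the validation error, which is what lets a validator report the exact path of a nested type violation.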

Data Modelling with Type Hints

@deep_dataclass works with the full range of typing annotations. The @auxiliary decorator marks an inner class as a type-only helper — it won't become a standalone field, but can be referenced in Union[...], List[...], Optional[...], etc.

from dataclasses import field, asdict
from typing import Literal, List, Union
from deep_dataclasses import deep_dataclass, auxiliary, to_json_schema
import jsonschema

@deep_dataclass
class Config:
    @auxiliary
    class TrainMode:
        lr: float = 0.001
        pseudo_batch_size: int = 32
    @auxiliary
    class TestMode:
        metric: str = "accuracy"
        folds: int = 5
    mode: Union[TrainMode, TestMode] = field(default_factory=TrainMode)
    device: Literal["cpu", "cuda"] = "cpu"
    images: List[str] = field(default_factory=list)

When constructing from a dict, @deep_dataclass selects the Union variant whose field names best cover the keys supplied — an exact match is always preferred over a partial one:

cfg_train = Config(mode={"lr": 0.05})           # 'lr' is a TrainMode field
cfg_test  = Config(mode={"metric": "f1"})        # 'metric' is a TestMode field

assert isinstance(cfg_train.mode, Config.TrainMode)
assert isinstance(cfg_test.mode,  Config.TestMode)
assert cfg_train.mode.pseudo_batch_size == 32    # unspecified fields get defaults
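
The selection rule can be sketched with the standard library. The scoring below is an assumption about what "best cover" means (exact key match first, then fewest unknown keys, then most matching keys), and `pick_variant` is a hypothetical helper rather than the library's own function.

```python
from dataclasses import dataclass, fields
from typing import Union, get_args


def pick_variant(union_type, data: dict):
    """Hypothetical helper: return the Union member whose field names
    best match the keys of `data`."""
    keys = set(data)

    def score(cls):
        names = {f.name for f in fields(cls)}
        exact = keys == names          # exact match always wins
        unknown = len(keys - names)    # keys the variant does not know
        covered = len(keys & names)    # keys the variant does know
        return (exact, -unknown, covered)

    return max(get_args(union_type), key=score)


@dataclass
class TrainMode:
    lr: float = 0.001
    pseudo_batch_size: int = 32


@dataclass
class TestMode:
    metric: str = "accuracy"
    folds: int = 5


Mode = Union[TrainMode, TestMode]
train_cls = pick_variant(Mode, {"lr": 0.05})
test_cls = pick_variant(Mode, {"metric": "f1"})
print(train_cls.__name__, test_cls.__name__)
```

With this scoring, ties go to the variant listed first in the `Union`, so an empty dict would resolve to `TrainMode` here; whether the library behaves the same way on ties is not stated above.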

Schema validation enforces Literal values, Union structure, and List element types:

jsonschema.validate(asdict(Config()), to_json_schema(Config))  # passes

bad = asdict(Config())
bad["device"] = "tpu"                            # not in Literal["cpu", "cuda"]
jsonschema.validate(bad, to_json_schema(Config)) # raises ValidationError
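
These checks are possible because `Literal`, `List`, and `Union` each have a natural JSON Schema counterpart (`enum`, `array` with `items`, and `anyOf`). The mapping below is a sketch of that correspondence. `annotation_to_schema` is a hypothetical helper, and the exact keywords emitted by `to_json_schema` may differ.

```python
from typing import List, Literal, Union, get_args, get_origin

# Assumed mapping from Python primitives to JSON Schema type names.
_PRIMITIVES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def annotation_to_schema(tp) -> dict:
    """Hypothetical helper: translate a small subset of typing
    annotations into JSON Schema fragments."""
    origin = get_origin(tp)
    if origin is Literal:
        return {"enum": list(get_args(tp))}
    if origin is list:
        (item,) = get_args(tp)  # List[X] carries exactly one argument
        return {"type": "array", "items": annotation_to_schema(item)}
    if origin is Union:
        return {"anyOf": [annotation_to_schema(a) for a in get_args(tp)]}
    return {"type": _PRIMITIVES[tp]}


device_schema = annotation_to_schema(Literal["cpu", "cuda"])
images_schema = annotation_to_schema(List[str])
either_schema = annotation_to_schema(Union[int, str])
print(device_schema, images_schema, either_schema)
```

A value such as `"tpu"` fails against the `enum` fragment, which matches the `ValidationError` raised in the example above.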

Installation

pip install deep-dataclasses

deep_dataclasses has zero mandatory dependencies — it uses only re, dataclasses, and typing from the standard library. jsonschema (used in the examples above) is an optional third-party package needed only if you want schema validation.

Comparison

| Feature | `@dataclass` | `@deep_dataclass` |
| --- | --- | --- |
| Nested hierarchy | Manual, verbose | Inline, readable |
| `field(default_factory=...)` | Required per field | Automatic |
| Nested dict → instance coercion | ❌ | ✅ (recursive, all depths) |
| `Union` variant selection from dict | ❌ | ✅ (best-match by field coverage) |
| `asdict()` / `==` / `__repr__` | ✅ | ✅ |
| `frozen`, `slots`, etc. | ✅ | ✅ (tested) |
| Type validation | ❌ | Exports to jsonschema |
| Mandatory dependencies | stdlib | stdlib only |

Status

Early release. Core functionality is complete with 100% test coverage. API may evolve — feedback welcome on discuss.python.org.

Contributing

Issues and PRs welcome. See the issue tracker for known TODOs.
