
feat(dataclass): add asdict, astuple, and match_args support #556

Merged
junrushao merged 2 commits into apache:main from
junrushao:junrushao/2026-04-17/dataclass-astuple-asdict
Apr 18, 2026


@junrushao
Member

Summary

Adds three stdlib-parity features to tvm_ffi.dataclasses:

  1. asdict(obj, *, dict_factory=dict) — recursively converts a @py_class / @c_class instance to a plain Python dict. FFI containers (Array, List) recurse into list; (Map, Dict) recurse into dict, yielding JSON-ready output.
  2. astuple(obj, *, tuple_factory=tuple) — the tuple analogue of asdict, with the same recursion rules.
  3. match_args: bool = True parameter on @py_class and @c_class — sets cls.__match_args__ to the tuple of positional __init__ field names (init=True and not kw_only), enabling Python 3.10+ match statements. Skipped when the class body already defines __match_args__.

Semantics follow CPython's dataclasses module: asdict/astuple raise TypeError for types and non-dataclass values; kw-only fields (via field(kw_only=True), decorator-level kw_only=True, or the KW_ONLY sentinel) are excluded from __match_args__.

Design notes

  • _is_ffi_dataclass_instance distinguishes FFI dataclass instances from FFI container instances (Array, List, Map, Dict). Both carry __tvm_ffi_type_info__, so the container isinstance check runs first during recursion.
  • _set_match_args walks the TypeInfo.parent_type_info chain in parent-first order, matching the order of the auto-generated __init__ signature.
  • Recursion uses an _ATOMIC_TYPES frozenset (mirroring stdlib dataclasses._ATOMIC_TYPES) for the fast path on immutable leaves.
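The ordering described in these notes can be illustrated with a small pure-Python sketch. Everything here is a hypothetical stand-in rather than the code in common.py: `FakeObject`, `FakeArray`, `FakeMap`, `Pair`, and the `fake_type_info` marker only imitate the situation where containers and dataclasses share one marker attribute, and the atomic-type set is a shortened assumption.

```python
import copy

# Shortened, assumed set of immutable leaf types (the real set is larger).
_ATOMIC_TYPES = frozenset({type(None), bool, int, float, str, bytes})


class FakeObject:
    # Shared marker, standing in for the attribute that both FFI containers
    # and FFI dataclasses carry.
    fake_type_info = True


class FakeArray(FakeObject):
    def __init__(self, items):
        self.items = list(items)


class FakeMap(FakeObject):
    def __init__(self, mapping):
        self.mapping = dict(mapping)


class Pair(FakeObject):
    field_names = ("key", "values")

    def __init__(self, key, values):
        self.key = key
        self.values = values


def is_record_instance(obj):
    # The marker alone cannot tell a record from a container, so containers
    # are excluded explicitly.
    has_marker = hasattr(type(obj), "fake_type_info")
    return has_marker and not isinstance(obj, (FakeArray, FakeMap))


def asdict_inner(obj):
    if type(obj) in _ATOMIC_TYPES:  # fast path for immutable leaves
        return obj
    if isinstance(obj, FakeArray):  # container checks run before the record check
        return [asdict_inner(v) for v in obj.items]
    if isinstance(obj, FakeMap):
        return {asdict_inner(k): asdict_inner(v) for k, v in obj.mapping.items()}
    if is_record_instance(obj):
        return {name: asdict_inner(getattr(obj, name)) for name in obj.field_names}
    return copy.deepcopy(obj)  # any other leaf is copied


nested = Pair("root", FakeArray([1, FakeMap({"a": Pair("leaf", FakeArray([]))})]))
result = asdict_inner(nested)
print(result)
```

The output contains only `dict`, `list`, and atomic values, which is what makes the result suitable for JSON serialization.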

Test plan

  • uv run pytest tests/python/ — 2184 passed, 38 skipped, 3 xfailed
  • pre-commit run --files <touched files> — all hooks pass (ruff, ty, format)
  • New coverage in tests/python/test_dataclass_common.py:
    • TestAsdict (15 tests): basic, nested, inheritance, FFI Array/List/Map/Dict recursion, dict_factory, result independence, error paths
    • TestAstuple (10 tests): basic, nested, recursion, tuple_factory, error paths
    • TestMatchArgs (11 tests): defaults, init=False, kw_only (field / decorator / KW_ONLY sentinel), inheritance order, match_args=False opt-out, user-defined __match_args__ override, c_class basic and inheritance

Architecture:
- common.py gains asdict/astuple plus _asdict_inner/_astuple_inner
  recursion driven by an _ATOMIC_TYPES frozenset mirroring stdlib
  dataclasses. FFI sequence containers (Array, List) recurse to Python
  list; FFI mapping containers (Map, Dict) recurse to Python dict so
  the result is plain Python data (JSON-ready). Non-dataclass leaves
  fall back to copy.deepcopy.
- _dunder.py gains _set_match_args(cls, type_info), a helper that
  walks the TypeInfo parent chain parent-first and assembles the
  positional __init__ field names (init=True and not kw_only), then
  sets cls.__match_args__ if the class does not already define one.
  _install_dataclass_dunders accepts a new match_args parameter and
  invokes the helper alongside the other dunders.
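A rough sketch of the parent-first walk follows. It is not the code in _dunder.py: `FieldInfo`, `TypeInfo.fields`, and the helper names are hypothetical, and only the `parent_type_info` link and the `init=True and not kw_only` filter come from the description above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class FieldInfo:  # hypothetical stand-in for one field's metadata
    name: str
    init: bool = True
    kw_only: bool = False


@dataclass(frozen=True)
class TypeInfo:  # hypothetical stand-in; only parent_type_info is named in the PR
    fields: Tuple[FieldInfo, ...]
    parent_type_info: Optional["TypeInfo"] = None


def positional_field_names(type_info):
    # Collect the chain child -> root, then reverse it so parents come first.
    chain = []
    node = type_info
    while node is not None:
        chain.append(node)
        node = node.parent_type_info
    names = []
    for info in reversed(chain):
        names.extend(f.name for f in info.fields if f.init and not f.kw_only)
    return tuple(names)


def set_match_args(cls, type_info):
    # Respect a __match_args__ written in the class body itself.
    if "__match_args__" in cls.__dict__:
        return
    cls.__match_args__ = positional_field_names(type_info)


base_info = TypeInfo(fields=(FieldInfo("x"), FieldInfo("cache", init=False)))
child_info = TypeInfo(
    fields=(FieldInfo("y"), FieldInfo("name", kw_only=True)),
    parent_type_info=base_info,
)


class Child:
    pass


class Custom:
    __match_args__ = ("only",)


set_match_args(Child, child_info)
set_match_args(Custom, child_info)
print(Child.__match_args__)   # ('x', 'y')
print(Custom.__match_args__)  # ('only',)
```

Checking `cls.__dict__` rather than `hasattr` is a deliberate choice in this sketch: it lets a subclass receive its own tuple even when a parent class already has `__match_args__`.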

Public Interfaces:
- tvm_ffi.dataclasses now exports asdict and astuple.
- @py_class and @c_class accept match_args: bool = True, mirroring
  stdlib dataclasses. With the default, a class that defines its own
  __match_args__ keeps it unchanged; every other decorated class now
  gains __match_args__ automatically.

UI/UX: none.

Behavioral Changes:
- Python 3.10+ match statements against @py_class / @c_class
  instances now bind positional captures by field order, matching
  stdlib dataclass expectations.
- asdict and astuple raise TypeError when passed a type or a
  non-FFI-dataclass value, mirroring stdlib.
- _is_ffi_dataclass_instance distinguishes FFI dataclass instances
  from FFI container instances (Array/List/Map/Dict), so asdict and
  astuple do not mistake containers for dataclasses.

Docs:
- API docs for the new decorator parameter and the two helpers live
  in the inline docstrings; no Sphinx RST updates required for this
  patch since public names are discovered via __all__.

Tests:
- Executed: uv run pytest tests/python/
- Result: 2184 passed, 38 skipped, 3 xfailed.
- Added TestAsdict (15), TestAstuple (10), TestMatchArgs (11) in
  tests/python/test_dataclass_common.py covering inheritance,
  init=False, kw_only (field, decorator, KW_ONLY sentinel),
  match_args=False opt-out, user-defined __match_args__ override,
  FFI Array/List/Map/Dict recursion, dict_factory/tuple_factory, and
  error paths.

Untested Edge Cases:
- match statement semantics on Python 3.8/3.9 are not exercised
  because match syntax is 3.10+; the __match_args__ tuple is still
  populated on older interpreters and is tested there.
- Deeply nested cyclic FFI graphs are not exercised by asdict /
  astuple; recursion relies on the caller's graph being acyclic
  (same assumption as stdlib dataclasses.asdict).
@junrushao junrushao force-pushed the junrushao/2026-04-17/dataclass-astuple-asdict branch from 0dad751 to 29479ed April 17, 2026 23:54
@junrushao junrushao mentioned this pull request Apr 17, 2026
@junrushao junrushao linked an issue Apr 17, 2026 that may be closed by this pull request
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces asdict and astuple utility functions for FFI dataclasses, mirroring the standard library's dataclasses functionality. It also adds support for __match_args__ in both @c_class and @py_class decorators to enable structural pattern matching. The review feedback correctly identifies a bug in the recursive handling of collections.defaultdict within both asdict and astuple, where the code incorrectly checks the class instead of the instance for the default_factory attribute.

Comment thread python/tvm_ffi/dataclasses/common.py
Comment thread python/tvm_ffi/dataclasses/common.py
Adds direct tests for the `_asdict_inner` and `_astuple_inner`
defaultdict branches, locking in the stdlib-parity contract.

Tests/Evidence:
- The class-level check `hasattr(obj_type, 'default_factory')`
  mirrors CPython `dataclasses._asdict_inner` (Lib/dataclasses.py).
  `collections.defaultdict` exposes `default_factory` as a
  class-level member_descriptor, so the check returns True.
- Direct-branch coverage is necessary because storing a
  `defaultdict` in an FFI `Any` field converts it to an FFI `Map`
  on readback — the public `asdict`/`astuple` path never reaches
  this branch for FFI-backed graphs.

Refs: review thread on PR apache#556 (gemini-code-assist).
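The class-level check discussed in this commit message can be confirmed on CPython with the stdlib alone. The snippet below is a minimal illustration, not the test added in the PR; the rebuild loop is a simplified assumption about how a `defaultdict` branch might reconstruct its result.

```python
import collections

dd = collections.defaultdict(list)
dd["a"].append(1)

# The attribute is visible on the class, not only on instances.
class_level = hasattr(collections.defaultdict, "default_factory")
plain_dict = hasattr(dict, "default_factory")

# On CPython the class attribute is a member_descriptor.
descriptor_kind = type(collections.defaultdict.__dict__["default_factory"]).__name__

# Simplified rebuild: keep the same type and factory, copy the values.
rebuilt = type(dd)(dd.default_factory)
for key, value in dd.items():
    rebuilt[key] = list(value)

print(class_level, plain_dict, descriptor_kind)
print(rebuilt)
```

Because `dict` has no such attribute, checking `hasattr` on the type is enough to separate `defaultdict` from a plain `dict` without inspecting the instance.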
@junrushao junrushao merged commit 9c4f598 into apache:main Apr 18, 2026
9 checks passed

Development

Successfully merging this pull request may close these issues.

How far are we from @dataclass
