fix: use msgspec.UNSET for optional typedef fields to omit them from serialization#870
fix: use msgspec.UNSET for optional typedef fields to omit them from serialization#870Aryamanz29 merged 5 commits intomainfrom
Conversation
…serialization Switches optional fields in TypeDef and subclasses from None defaults to msgspec.UNSET with omit_defaults=True, ensuring absent fields are not serialized as null. Adds serde contract tests to verify round-trip behaviour. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR updates the v9 typedef msgspec models to use msgspec.UNSET + omit_defaults=True so that absent optional fields are omitted from JSON serialization (instead of being emitted as null), and adds a dedicated serde “contract” test suite to prevent regressions.
Changes:
- Switch optional
TypeDef/AttributeDef(and related typedef structs) defaults fromNone/ empty collections tomsgspec.UNSET, and enableomit_defaults=Trueto keep omitted fields out of serialized payloads. - Introduce a strongly-typed
RelationshipEndDefstruct forRelationshipDef.end_def1/end_def2rather than decoding them as plain dicts. - Add extensive serde contract tests for decode/encode round-tripping and UNSET/null distinction.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
pyatlan_v9/model/typedef.py |
Moves typedef structs to UNSET defaults with omit_defaults=True, and adds typed RelationshipEndDef to avoid dict-based decoding. |
tests_v9/unit/test_v9_typedef_serde_contract.py |
Adds guardrail tests enforcing sparse round-trip behavior and typed decoding for typedef payloads. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
- Rename test_string_boolean_is_locked_accepted → test_string_boolean_is_optional_accepted and update docstring to match the isOptional field actually being tested - Rename test_string_encoded_integer_cardinality_accepted → test_string_boolean_is_indexable_accepted and update docstring to reflect isIndexable: 'false' string coercion - Fix propagateTagPropagation key assertion to use correct camelCase propagateTags Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Change TypeDef.category to Union[AtlanTypeCategory, UnsetType] = UNSET so the base class no longer makes it required (allows absent from JSON) - Remove hardcoded category defaults from all subclasses (EnumDef, StructDef, AtlanTagDef, EntityDef, RelationshipDef, CustomMetadataDef) to prevent omit_defaults=True from dropping the field when its value equals the default - Add __post_init__ to each subclass to unconditionally set the correct AtlanTypeCategory, matching the pattern used by generated asset classes - Update test assertions from `is None` to `is msgspec.UNSET` for absent fields Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
| """Internal use only.""" | ||
|
|
||
| def __post_init__(self) -> None: | ||
| self.category = AtlanTypeCategory.ENUM |
There was a problem hiding this comment.
__post_init__ injects category into sparse round-trip output
Medium Severity
Every TypeDef subclass (EnumDef, StructDef, AtlanTagDef, EntityDef, RelationshipDef, CustomMetadataDef) unconditionally sets self.category in __post_init__. Since __post_init__ also runs after deserialization, category is always set to a non-UNSET value and always appears in re-encoded output — even when absent from the source JSON. This contradicts the PR's invariant that sparse payloads are not expanded on re-encode. The old approach of using class-level defaults (e.g., category: AtlanTypeCategory = AtlanTypeCategory.ENUM) respected the omit_defaults contract without injecting extra fields.
Additional Locations (2)
There was a problem hiding this comment.
Frankly this should be invariant on a re-serialization (enforces correct category), so I think better to take this approach than risk we don't serialize the category at all because we have omit_defaults=True and the default is preset (hard-coded) to the category. (As this would also mean that when the category actually IS set, it's not serialized — which is a bigger incorrectness overall.)
There was a problem hiding this comment.
The unconditional __post_init__ injection of category is intentional by design. category is required by Atlas in write payloads but may be absent in read responses. Injecting the structurally-correct value when absent is the only way to satisfy both 'always present in writes' and 'no clobbering of explicit values' — since the injected value is always the one that belongs to the subclass type (an EnumDef can only ever be ENUM), there is no risk of clobbering. The minor round-trip change (adding category when absent in source) is an accepted trade-off per the design goals for this PR.
…_post_init__ injection Both fields had concrete defaults (False / "v2") that caused omit_defaults=True to silently drop them when the server sent those exact values, breaking round-trip fidelity. Changed both to msgspec.UNSET and added a conditional __post_init__ on AttributeDef.Options that injects the defaults only when absent, so explicit values from Atlas responses are never clobbered. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 2 total unresolved issues (including 1 from previous review).
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
| if self.custom_metadata_version is msgspec.UNSET: | ||
| self.custom_metadata_version = "v2" | ||
| if self.is_rich_text is msgspec.UNSET: | ||
| self.is_rich_text = False |
There was a problem hiding this comment.
Options.__post_init__ injects stale defaults breaking sparse round-trip
Medium Severity
The __post_init__ in AttributeDef.Options sets custom_metadata_version = "v2" and is_rich_text = False whenever they're UNSET. Because the field defaults are now msgspec.UNSET (not "v2" / False), these injected values no longer match their defaults, so omit_defaults=True always serializes them. Previously these values matched their hard-coded defaults and were always omitted from output. Now every Options object emits customMetadataVersion and isRichText in serialized JSON, even when absent from the source payload — violating the PR's stated invariant about sparse round-trip fidelity and adding extra fields to Atlas API write payloads.


Summary
TypeDefand all subclasses fromNonedefaults tomsgspec.UNSETwithomit_defaults=Truenulltests_v9/unit/test_v9_typedef_serde_contract.pywith serde contract tests to verify round-trip behaviourTest plan
pytest tests_v9/unit/test_v9_typedef_serde_contract.pyto verify serde contract tests passnullvalues for unset optional fields🤖 Generated with Claude Code
Note
Medium Risk
Changes default values and serialization for typedef models, which can affect Atlas API request/response payload shapes and downstream tooling expectations. Added contract tests reduce risk but any consumer relying on
nullor empty-list emission may see behavior changes.Overview
Updates
pyatlan_v9/model/typedef.pyto treat most optional typedef fields as absent-by-default usingmsgspec.UNSETwithomit_defaults=True, ensuring sparse Atlas typedef payloads round-trip without injectingnullfields or empty lists.Adds a strongly-typed
RelationshipEndDefforRelationshipDef.end_def1/end_def2, setscategoryvia__post_init__on typedef subclasses, and adjustsAttributeDef.Optionsto avoid injecting stale defaults unless explicitly set.Updates unit tests to assert
UNSETsemantics and introducestests_v9/unit/test_v9_typedef_serde_contract.pyas a guardrail suite for decode/encode invariants critical to typedef write compatibility.Written by Cursor Bugbot for commit b701d1e. This will update automatically on new commits. Configure here.