BLDX-434 | Migrate to msgspec.Struct models#811
@greptile review
Too many files changed for review. (
@greptile re-review
@claude /review
@claude /review
Claude encountered an error. I'll analyze this and get back to you.
@claude /review
Claude encountered an error. I'll analyze this and get back to you.
Code Review
This PR migrates the pyatlan SDK's internal model layer from Pydantic BaseModel to
Confidence Score: 2/5
Important Files Changed
Change Flow
sequenceDiagram
participant User as User Code
participant Client as AtlanClient (v9)
participant SubClient as Sub-client (e.g. AssetClient)
participant Validate as validate_arguments
participant Bridge as _is_model_instance
participant Msgspec as msgspec.Struct model
participant HTTP as httpx + PyatlanTransport
participant Legacy as Legacy Pydantic layer
User->>Client: AtlanClient(base_url, api_key)
Client->>Client: __post_init__ (env fallbacks, session init)
User->>SubClient: client.assets.get_by_guid(guid)
SubClient->>Validate: @validate_arguments checks type
Validate->>Bridge: _is_model_instance(value, expected_type)
Bridge-->>Validate: match by MRO name
SubClient->>HTTP: _call_api(endpoint, request_obj)
HTTP->>Legacy: deserialize response (Pydantic)
Legacy-->>SubClient: legacy Asset object
SubClient->>Msgspec: (v9 path) msgspec.convert(raw, TargetStruct)
Msgspec-->>User: typed v9 model instance
Findings
d6a8ffc to ec57960
- Remove dead code: admin/, checkpoint.py, exceptions.py, py.typed
- Delete old single-file client.py (replaced by client/ package later)
- Rename models/ → model/assets/ for consistency with legacy pyatlan/model/assets/
- Move infrastructure files (conversion_utils.py, serde.py, transform.py) to model/
- Add model/__init__.py for package-level re-exports
- Update all import paths (pyatlan_v9.models → pyatlan_v9.model.assets)
- Update all model test imports to match new paths
Migrate all legacy pyatlan model files (AtlanObject/Pydantic BaseModel) to pyatlan_v9 msgspec.Struct equivalents:

Infrastructure:
- core.py: AtlanObject base, AtlanTag, AtlanField helpers
- structs.py: SourceTagAttachment, BadgeCondition, etc.
- translators.py/retranslators.py: Tag name translation pipeline

Models (28 new files):
- search.py: DSL, IndexSearchRequest, Query types
- typedef.py: EnumDef, StructDef, AtlanTagDef, CustomMetadataDef, etc.
- lineage.py: LineageListRequest, FluentLineage, LineageResponse, etc.
- audit.py, search_log.py: AuditSearchRequest, SearchLogRequest
- response.py: AssetMutationResponse, AssetResponse
- group.py, user.py, role.py: GroupRequest, AtlanUser, AtlanRole
- credential.py, oauth_client.py, sso.py, api_tokens.py
- events.py, keycloak_events.py: AtlanEvent, KeycloakEvent
- query.py, task.py, workflow.py, suggestions.py
- aggregation.py, atlan_image.py, contract.py, custom_metadata.py
- data_mesh.py, dq_rule_conditions.py, file.py, internal.py, lineage_ref.py

Assets:
- purpose.py: Purpose model with tag translation support
- snowflake_dynamic_table.py: SnowflakeDynamicTable model

All models use msgspec conventions: kw_only=True, UNSET/UnsetType, rename='camel' where needed, and proper serialization methods.
…, add S3Object.create_with_prefix alias
- Add _normalize_camel_key to handle API keys with uppercase abbreviations (e.g., apiPathRawURI→apiPathRawUri, dataProductAssetsDSL→dataProductAssetsDsl)
- Fix announcement_type UNSET assertions in 5 asset test files (data_studio, gcs, preset, superset, api) for partial update responses
- Add S3Object.create_with_prefix as alias for creator_with_prefix
Made-with: Cursor
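The key normalization described in this commit could look roughly like the sketch below. The PR's actual `_normalize_camel_key` is not shown in this conversation, so `normalize_camel_key` here is a hypothetical reimplementation that only reproduces the two mappings given as examples, on the assumption that the goal is to make API keys match what a camelCase rename of a snake_case field name produces.

```python
import re

# A run of two or more capitals that is not immediately followed by a
# lowercase letter. The lookahead keeps the capital that starts the next
# word (the "V" in "ARNValue") out of the run.
_ACRONYM_RUN = re.compile(r"([A-Z])([A-Z]+)(?![a-z])")


def normalize_camel_key(key: str) -> str:
    """Lower-case the tail of each uppercase abbreviation in a camelCase key."""
    return _ACRONYM_RUN.sub(lambda m: m.group(1) + m.group(2).lower(), key)


print(normalize_camel_key("apiPathRawURI"))         # apiPathRawUri
print(normalize_camel_key("dataProductAssetsDSL"))  # dataProductAssetsDsl
print(normalize_camel_key("qualifiedName"))         # qualifiedName (unchanged)
```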
…ests
Change `is None` to `not` for immediate_downstream and immediate_upstream checks to handle msgspec UNSET values in v9 lineage test assertions.
Made-with: Cursor
…UNSET assertions, Anaplan creator params
- Add explicit msgspec.field(name=...) mappings for QuickSight model fields (quick_sight_folder_type, quick_sight_dataset_import_mode, quick_sight_dataset_field_type)
- Fix retranslators.py NoneType iteration when classification_names is None
- Update S3 and test_client assertions to handle UNSET vs None for optional fields
- Normalize Anaplan creator connection_qualified_name parameter signatures
- Fix glossary _assert_relationship for v9 flat model structure
Made-with: Cursor
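The NoneType iteration fix mentioned for retranslators.py follows a common pattern, sketched here. `translate_tag_names` and its arguments are hypothetical stand-ins, not the code in retranslators.py.

```python
from typing import Dict, List, Optional


def translate_tag_names(
    names: Optional[List[str]], id_to_name: Dict[str, str]
) -> List[str]:
    """Map internal tag IDs to display names, tolerating a missing list."""
    # `names or []` turns None into an empty list, so the loop body is
    # skipped instead of raising "TypeError: 'NoneType' object is not iterable".
    return [id_to_name.get(name, name) for name in (names or [])]


print(translate_tag_names(None, {}))                # []
print(translate_tag_names(["abc"], {"abc": "PII"}))  # ['PII']
```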
…ration tests
Systematically convert `assert field is None` to `assert not field` for msgspec model attributes that return UNSET instead of None when absent.
Changes across 21 test files:
- test_client.py: certificate_status, anchor, classification_names, views
- test_sql_assets.py: mutated_entities.CREATE, trim_to_required checks
- test_index_search.py: aggregations, nested_results, timestamp coercion
- custom_metadata_test.py: RACI, IPR, DQ empty attribute checks
- glossary_test.py: mutated_entities checks
- document_db_asset_test.py: trim_to_required checks
- s3_asset_test.py: aws_arn, s3_object_key absent fields
- suggestions_test.py: owner_groups, meanings, description
- test_workflow_client.py: credential extras/level/metadata
- All aio/ async counterparts with same patterns
- Convert integer timestamps to datetime for Pydantic validation
Made-with: Cursor
…on, glossary UDR
- Change certificate_status/message defaults from None to UNSET in Asset so setting them to None is properly serialized (not omitted by omit_defaults)
- Simplify PopularityInsights.record_last_timestamp type to Union[int, None] to fix msgspec type union restriction while keeping __post_init__ conversion
- Construct PopularityInsights via kwargs (not post-init mutation) so datetime→epoch conversion runs during construction
- Add msgspec.convert in test_sql_assets verify_popularity to handle dict responses from API for source_read_recent_user_record_list
- Fix glossary _assert_relationship to check relationship_attributes dict
Made-with: Cursor
…, Entity UNSET defaults
- Procedure.updater(): make definition optional (default '') so updater(qualified_name, name) works
- Entity: set classification_names, meanings, labels, pending_tasks defaults to UNSET so workflow package serialization omits them (fixes test_packages AssertionError on connection payload)
- Audit search: convert audit_search_paging.json entity audit keys to camelCase (entityQualifiedName, typeName, entityId) for EntityAudit decode; assert helpers use first['entityId'] etc.
Made-with: Cursor
tests/unit and tests/unit/aio assert on audit_search_paging.json, which was updated to camelCase (entityQualifiedName, typeName, entityId); update _assert_audit_search_results to read first['entityId'] etc.
Made-with: Cursor
…validate_arguments
Delete the custom validate_arguments wrapper and _is_model_instance utility. All legacy client, model, and UI code now uses pydantic.v1's validate_arguments directly. Also removes the msgspec cross-type encoder registration from model/core.py and _is_msgspec_cross_type_match from utils.py, and updates the batch __track method to support v9 models via getattr type_name check.
Made-with: Cursor
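A getattr-based type_name check of the kind mentioned for the batch __track method might look like the sketch below. The real method is not shown in this conversation, so `resolve_type_name` and both example classes are hypothetical; the point is only that reading the attribute by name works for any model class without an isinstance check against one base class.

```python
from dataclasses import dataclass
from typing import Any


def resolve_type_name(asset: Any) -> str:
    """Read type_name by duck typing, falling back to the class name."""
    type_name = getattr(asset, "type_name", None)
    if isinstance(type_name, str) and type_name:
        return type_name
    return type(asset).__name__


@dataclass
class LegacyTable:
    """Stand-in for a legacy model that carries a type_name attribute."""

    type_name: str = "Table"


class View:
    """Stand-in for an object without a usable type_name."""


print(resolve_type_name(LegacyTable()))  # Table
print(resolve_type_name(View()))         # View
```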
…essages
Pydantic v1's validate_arguments produces different error messages than the removed custom validator (e.g. "value is not a valid dict" instead of "instance of X expected"). Update all assertion strings and change exact equality checks to substring checks where pydantic appends type info.
Made-with: Cursor
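The move from exact equality to substring checks can be sketched without pydantic itself. The error string below is a hand-written stand-in for a message with type info appended, and `assert_error_contains` is a hypothetical helper, not one from the test suite.

```python
def assert_error_contains(exc: Exception, expected: str) -> None:
    """Assert that the expected text appears somewhere in the error message."""
    message = str(exc)
    assert expected in message, f"{expected!r} not found in {message!r}"


# Stand-in for a validation error whose message carries extra type info
# after the core text.
err = ValueError("value is not a valid dict (type=type_error.dict)")

# An exact comparison breaks as soon as a suffix is appended...
print(str(err) == "value is not a valid dict")  # False

# ...while a substring check keeps passing.
assert_error_contains(err, "value is not a valid dict")
```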
Adds validate_arguments decorators to async SSO client, fixes SSO model fields, improves asset/batch client v9 compatibility, and various model enhancements (badge, entity, persona, schema, etc.). Includes v9 validate module and pkg utilities.
Made-with: Cursor

Fix SSO test missing group_map_name param, task client assertion, batch test mock patches. Add v9-specific validation constants with custom validator error messages (separate from pydantic messages in legacy).
Made-with: Cursor

Remove legacy client usage from v9 integration tests, fix SSO test assertions, update async conftest and utilities for v9 client.
Made-with: Cursor

…l qa-checks (ruff format, ruff check, mypy)
Made-with: Cursor

Legacy tests under tests/ are kept in sync with main branch. V9-specific tests live under tests_v9/.
Made-with: Cursor
…scriptors and overlay system
- Regenerate 557 asset files from Pkl type definitions with field descriptors (ClassVar placeholders + deferred KeywordField/BooleanField/NumericField init)
- Add 82 overlay files (_overlays/) with custom methods (creator, updater, policies)
- Add /generate-v9-models Claude Code skill for model regeneration from models repo
- Add pyatlan_v9 model generation section to README
- Post-sync patches: set[str] fields in asset.py, relationship_attributes in related_entity.py, Process.Attributes backward-compat alias
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Regenerate all asset models using updated Pkl generator:
  - set[str] fields now generated natively (no post-sync sed patch)
  - validate/minimize/relate methods removed (dead code)
  - relationship_attributes field in RelatedEntity base class
- Move overlay files to SDK repo (_overlays/ directory)
- Fix overlay imports: ai_model.py, collection.py, connection.py
- Add ruff exclusion for _overlays/ and F821 per-file-ignores
- Update generate-v9-models skill: sdkOnly mode, temp staging, ruff auto-fix+format step, removed asset.py sed patch
- Update process_test.py to use Process.generate_qualified_name
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Updated 60 existing asset files with new fields/changes
- Added 5 new Snowflake Semantic files (dimension, fact, logical_table, metric, view)
- Ruff-cleaned 339 unused import errors
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…notations
All struct field annotations now use Union[X, None, UnsetType] instead of X | None | UnsetType, and List/Dict/Set from typing instead of builtin subscripting. Verified passing 6376 tests on both Python 3.9 and 3.11.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Aryamanz29 left a comment
LGTM - lets goooooooooo 🚀
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
✨ Description
https://linear.app/atlan-epd/issue/BLDX-434/plan-productionization-of-msgspecstruct-models
🧩 Type of change
Select all that apply:
📋 Checklist