expose additional fields in PlanDagSpec #2231
sqlmesh/schedulers/airflow/common.py
Outdated
end_bounded: bool
ensure_finalized_snapshots: bool
directly_modified_snapshots: t.List[SnapshotId]
indirectly_modified_snapshots: t.Dict[SnapshotId, t.Set[SnapshotId]]
This key is likely not going to work with pydantic. AFAIK, Pydantic doesn't support (de)serialization of complex types as dict keys / set items.
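To illustrate the general concern, here is a minimal sketch using only the standard-library `json` module. It does not exercise pydantic itself, whose behavior may differ by version; it only shows the underlying JSON limitation that object keys must be strings and that sets have no JSON form. The `SnapshotId` class and the `try_json` helper below are hypothetical stand-ins, not the SQLMesh definitions.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SnapshotId:
    """Hypothetical stand-in for the real SnapshotId; not the SQLMesh class."""

    name: str
    identifier: str


def try_json(payload: object) -> str:
    """Return the JSON text, or the error message if encoding fails."""
    try:
        return json.dumps(payload)
    except TypeError as exc:
        return f"TypeError: {exc}"


parent = SnapshotId("db.parent", "1")

# An object used as a dict key: JSON object keys must be strings.
complex_key_result = try_json({parent: ["x"]})

# A set used as a value: JSON has no set type.
set_value_result = try_json({"db.parent": {"x"}})

# String keys with list values encode without trouble.
plain_result = try_json({"db.parent": ["x"]})
```

Both of the first two calls fail with a `TypeError`, while the string-keyed, list-valued payload encodes cleanly.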
sqlmesh/schedulers/airflow/client.py
Outdated
models_to_backfill: t.Optional[t.Set[str]] = None,
end_bounded: bool = False,
ensure_finalized_snapshots: bool = False,
directly_modified_snapshots: t.List[SnapshotId] = [],
We should never use [] / {} as default method arguments in Python. See, for example, https://docs.python-guide.org/writing/gotchas/#mutable-default-arguments
oh nice I didn't know about this, thanks
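For anyone else reading this thread, a minimal sketch of the mutable-default pitfall discussed above. The function names are made up for illustration and are not from the PR; the point is only that a `[]` default is created once, when the function is defined, and is then shared across calls.

```python
from typing import List, Optional


def append_buggy(item: str, bucket: List[str] = []) -> List[str]:
    # The default list is created once and reused by every call that omits `bucket`.
    bucket.append(item)
    return bucket


def append_fixed(item: str, bucket: Optional[List[str]] = None) -> List[str]:
    # Use None as the sentinel and build a fresh list on each call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


# Copy the results so later calls cannot change what we recorded.
first_buggy = list(append_buggy("a"))
second_buggy = list(append_buggy("b"))  # still contains "a" from the first call

first_fixed = list(append_fixed("a"))
second_fixed = list(append_fixed("b"))  # starts from an empty list
```

The `Optional[...] = None` form is the same pattern the updated signature in this PR uses.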
sqlmesh/schedulers/airflow/client.py
Outdated
end_bounded: bool = False,
ensure_finalized_snapshots: bool = False,
directly_modified_snapshots: t.Optional[t.List[SnapshotId]] = None,
indirectly_modified_snapshots: t.Optional[t.List[SnapshotId]] = None,
Will we ever care about which snapshot caused the indirectly modified one? We can consider making it:
t.Dict[str, t.List[SnapshotId]]
and use the snapshot name as a key.
For the specific thing I'm trying to do currently it isn't needed, but I think this information would be nice to have in general, so it's probably worth including as a dict key.
Yeah, which snapshot caused the indirect mod is good info. In an ideal world, Airflow would provide a Plan object with a full ContextDiff.
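A rough sketch of how the name-keyed shape suggested in this thread could be derived from the original mapping. The `SnapshotId` dataclass and the `to_name_keyed` helper are hypothetical stand-ins for illustration, not code from the PR. One trade-off to be aware of under these assumptions: keying by name alone drops whatever else identifies the parent snapshot.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class SnapshotId:
    """Hypothetical stand-in for the real SnapshotId; not the SQLMesh class."""

    name: str
    identifier: str


def to_name_keyed(
    mapping: Dict[SnapshotId, Set[SnapshotId]],
) -> Dict[str, List[SnapshotId]]:
    """Key by the parent's name and turn each set into a sorted list."""
    return {
        parent.name: sorted(children, key=lambda s: (s.name, s.identifier))
        for parent, children in mapping.items()
    }


parent = SnapshotId("db.parent", "1")
children = {SnapshotId("db.child_b", "3"), SnapshotId("db.child_a", "2")}

name_keyed = to_name_keyed({parent: children})

# With string keys and list values, the structure encodes as plain JSON;
# `default=asdict` converts each SnapshotId into a dict.
encoded = json.dumps(name_keyed, default=asdict)
decoded = json.loads(encoded)
```

Sorting the children is only there to make the output deterministic; the order itself carries no meaning.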
this is meant to help EnterpriseSnapshotDagGenerator get some additional plan-related information for observer