Skip to content

expose additional fields in PlanDagSpec#2231

Merged
andrew-sha merged 4 commits intomainfrom
plan_dag_spec_new_snapshot_fields
Mar 12, 2024
Merged

expose additional fields in PlanDagSpec#2231
andrew-sha merged 4 commits intomainfrom
plan_dag_spec_new_snapshot_fields

Conversation

@andrew-sha
Copy link
Copy Markdown
Contributor

this is meant to help EnterpriseSnapshotDagGenerator get some additional plan-related information for observer

@andrew-sha andrew-sha requested a review from izeigerman March 6, 2024 23:26
end_bounded: bool
ensure_finalized_snapshots: bool
directly_modified_snapshots: t.List[SnapshotId]
indirectly_modified_snapshots: t.Dict[SnapshotId, t.Set[SnapshotId]]
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This key is likely not going to work with pydantic. AFAIK, Pydantic doesn't support (de)serialization of complex types as dict keys / set items.

models_to_backfill: t.Optional[t.Set[str]] = None,
end_bounded: bool = False,
ensure_finalized_snapshots: bool = False,
directly_modified_snapshots: t.List[SnapshotId] = [],
Copy link
Copy Markdown
Collaborator

@izeigerman izeigerman Mar 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should never use [] / {} as default method arguments in Python. See, for example, this https://docs.python-guide.org/writing/gotchas/#mutable-default-arguments

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh nice I didn't know about this, thanks

end_bounded: bool = False,
ensure_finalized_snapshots: bool = False,
directly_modified_snapshots: t.Optional[t.List[SnapshotId]] = None,
indirectly_modified_snapshots: t.Optional[t.List[SnapshotId]] = None,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will we ever care about which snapshot caused the indirectly modified one? We can consider making it:

t.Dict[str, t.List[SnapshotId]]

and use the snapshot name as a key.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CC: @crericha

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the specific thing im trying to do currently it isn't needed but i think this information would be nice to have in general so its probably worth including as a dict key

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, which snapshot caused the indirect mod is good info. In an ideal world, Airflow would provide a Plan object with a full ContextDiff.

@andrew-sha andrew-sha requested a review from izeigerman March 7, 2024 22:22
@andrew-sha andrew-sha merged commit 02ab507 into main Mar 12, 2024
@andrew-sha andrew-sha deleted the plan_dag_spec_new_snapshot_fields branch March 12, 2024 14:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants