-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DS][24/n] Remove all AssetSubset references from the context and result objects #21612
[DS][24/n] Remove all AssetSubset references from the context and result objects #21612
Conversation
132764e
to
5744f34
Compare
b9355f4
to
6baf92a
Compare
5744f34
to
8bc6892
Compare
6baf92a
to
e5e6ed8
Compare
def compute_union(self, other: "AssetSlice") -> "AssetSlice": | ||
return _slice_from_subset( | ||
self._asset_graph_view, self._compatible_subset | other.convert_to_valid_asset_subset() | ||
) | ||
|
||
def compute_intersection(self, other: "AssetSlice") -> "AssetSlice": | ||
return _slice_from_subset( | ||
self._asset_graph_view, self._compatible_subset & other.convert_to_valid_asset_subset() | ||
) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmmm i don't think we need to use the compute
language here right? This is guaranteed to be fairly fast I think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think at the moment there's one case where things get a bit rough, which is if you have an in-memory time-window subset which only contains the time windows, not the partition keys themselves.
in order to compute the intersection, the current logic has to compute the partition keys for those time windows.
we could theoretically make this more efficient if you have time window to time window intersections (just intersect the time windows themselves), but I think there will always be the case where you have one subset that's encoded in terms of partition keys and another subset encoded with time windows, and it's non-negligible to convert one to the other
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah and then there's the especially-nasty case of multi-partitions definitions, which would require even more cleverness to make efficient in all cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. Very happy we are doing compute then instead of an operator that might just silently do that!
@@ -433,7 +435,8 @@ def evaluate_for_asset(self, context: "SchedulingContext") -> "SchedulingResult" | |||
ignore_subset=context.legacy_context.materialized_requested_or_discarded_since_previous_tick_subset, | |||
) | |||
) | |||
return SchedulingResult.create(context, true_subset, subsets_with_metadata) | |||
true_slice = context.asset_graph_view.get_asset_slice_from_subset(true_subset) | |||
return SchedulingResult.create(context, true_slice, subsets_with_metadata) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gentle reminder that i do want to revist the true_slice
verbiage
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
noted !
8bc6892
to
7b1cf10
Compare
e5e6ed8
to
41e4314
Compare
7b1cf10
to
45c84a3
Compare
41e4314
to
a03b80f
Compare
Merge activity
|
45c84a3
to
9fffe56
Compare
a03b80f
to
2fb533a
Compare
…ult objects (#21612) ## Summary & Motivation As title -- for the new codepaths, helps avoid a lot of convert_to_valid_asset_subset() business. I didn't bother updating the AutoMaterializeRule side of the world in a nicer way, as this code will eventually be removed anyway. Had to add some new methods to replace the & / | / - operators, this time properly labeled with "compute_" ## How I Tested These Changes
# construct is used here for performance | ||
return SchedulingContext.construct( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
X
…ult objects (dagster-io#21612) ## Summary & Motivation As title -- for the new codepaths, helps avoid a lot of convert_to_valid_asset_subset() business. I didn't bother updating the AutoMaterializeRule side of the world in a nicer way, as this code will eventually be removed anyway. Had to add some new methods to replace the & / | / - operators, this time properly labeled with "compute_" ## How I Tested These Changes
Summary & Motivation
As title -- for the new codepaths, helps avoid a lot of convert_to_valid_asset_subset() business.
I didn't bother updating the AutoMaterializeRule side of the world in a nicer way, as this code will eventually be removed anyway.
Had to add some new methods to replace the & / | / - operators, this time properly labeled with "compute_"
How I Tested These Changes