Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DS][27/n] Make candidate_slice and true_slice Optional #21648

Merged

Conversation

OwenKephart
Copy link
Contributor

Summary & Motivation

This helps avoid a class of errors using the type system. We have these concepts of "AssetSubsets" vs. "ValidAssetSubsets", where valid asset subsets are known to have been created with the asset's current-tick PartitionsDefinition, and AssetSubsets have gone through a serdes layer and therefore may have a differen partitions definition than they did at serialization time.

This means that care needs to be taken when using a bare AssetSubset.

Previously, we had these fancy operators which would treat AssetSubsets which had invalid partitions_definitions as the empty subset, but that is not actually always accurate. i.e. you may have been keeping track of all the materialized partitions of an asset using a subset, then you update the partitions definition. The correct thing to do here is to recompute the materialized partitions from scratch, NOT assume that this susbet is equivalent to the empty subset.

By explicitly creating a None value for candidate_slice / true_slice in these cases, we force authors of conditions to handle these cases through the type system. This already uncovered a potential bug in the new UpdatedSinceCron condition, in which it would return an empty set in the case that a partition was updated since the latest cron tick, but then the partitions definition was updated.

How I Tested These Changes

Copy link
Contributor Author

OwenKephart commented May 3, 2024

This stack of pull requests is managed by Graphite. Learn more about stacking.

Join @OwenKephart and the rest of your teammates on Graphite Graphite

@OwenKephart OwenKephart changed the title [DS][29/n] Make candidate_slice and true_slice Optional [DS][28/n] Make candidate_slice and true_slice Optional May 6, 2024
@OwenKephart OwenKephart requested a review from schrockn May 6, 2024 21:31
@OwenKephart OwenKephart force-pushed the 05-02-Update_values_within_InLatestTimeWindow branch from a226325 to 29f9484 Compare May 6, 2024 23:44
@OwenKephart OwenKephart force-pushed the 05-03-Make_candidate_slice_and_true_slice_Optional branch from e138ccc to f8b4f5d Compare May 6, 2024 23:45
@OwenKephart OwenKephart force-pushed the 05-03-Make_candidate_slice_and_true_slice_Optional branch from f8b4f5d to 7c8a70c Compare May 7, 2024 21:33
@OwenKephart OwenKephart changed the title [DS][28/n] Make candidate_slice and true_slice Optional [DS][27/n] Make candidate_slice and true_slice Optional May 8, 2024
@OwenKephart OwenKephart force-pushed the 05-02-Update_values_within_InLatestTimeWindow branch from 29f9484 to 0534e7c Compare May 8, 2024 23:00
@OwenKephart OwenKephart force-pushed the 05-03-Make_candidate_slice_and_true_slice_Optional branch from 7c8a70c to a4f2744 Compare May 8, 2024 23:00
@OwenKephart OwenKephart force-pushed the 05-02-Update_values_within_InLatestTimeWindow branch from 0534e7c to f34a068 Compare May 10, 2024 22:44
@OwenKephart OwenKephart force-pushed the 05-03-Make_candidate_slice_and_true_slice_Optional branch 2 times, most recently from ba19f8d to 1bd133f Compare May 13, 2024 17:29
@OwenKephart OwenKephart force-pushed the 05-02-Update_values_within_InLatestTimeWindow branch from f34a068 to f9c81a0 Compare May 14, 2024 13:47
@OwenKephart OwenKephart force-pushed the 05-03-Make_candidate_slice_and_true_slice_Optional branch from 1bd133f to 56c11e4 Compare May 14, 2024 13:47
Copy link
Contributor Author

OwenKephart commented May 14, 2024

Merge activity

  • May 14, 10:38 AM EDT: @OwenKephart started a stack merge that includes this pull request via Graphite.
  • May 14, 10:43 AM EDT: Graphite rebased this pull request as part of a merge.
  • May 14, 10:44 AM EDT: @OwenKephart merged this pull request with Graphite.

@OwenKephart OwenKephart force-pushed the 05-02-Update_values_within_InLatestTimeWindow branch from f9c81a0 to e678b9c Compare May 14, 2024 14:39
Base automatically changed from 05-02-Update_values_within_InLatestTimeWindow to master May 14, 2024 14:43
@OwenKephart OwenKephart force-pushed the 05-03-Make_candidate_slice_and_true_slice_Optional branch from 56c11e4 to 10b0a4b Compare May 14, 2024 14:43
@OwenKephart OwenKephart merged commit c209151 into master May 14, 2024
1 check was pending
@OwenKephart OwenKephart deleted the 05-03-Make_candidate_slice_and_true_slice_Optional branch May 14, 2024 14:44
danielgafni pushed a commit to danielgafni/dagster that referenced this pull request Jun 18, 2024
…1648)

## Summary & Motivation

This helps avoid a class of errors using the type system. We have these concepts of "AssetSubsets" vs. "ValidAssetSubsets", where valid asset subsets are known to have been created with the asset's current-tick PartitionsDefinition, and AssetSubsets have gone through a serdes layer and therefore may have a differen partitions definition than they did at serialization time.

This means that care needs to be taken when using a bare AssetSubset.

Previously, we had these fancy operators which would treat AssetSubsets which had invalid partitions_definitions as the empty subset, but that is not actually always accurate. i.e. you may have been keeping track of all the materialized partitions of an asset using a subset, then you update the partitions definition. The correct thing to do here is to recompute the materialized partitions from scratch, NOT assume that this susbet is equivalent to the empty subset.

By explicitly creating a None value for candidate_slice / true_slice in these cases, we force authors of conditions to handle these cases through the type system. This already uncovered a potential bug in the new UpdatedSinceCron condition, in which it would return an empty set in the case that a partition was updated since the latest cron tick, but then the partitions definition was updated.

## How I Tested These Changes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants