Feat!: Differentiate between deployable and non-deployable snapshots during plan evaluation #1610

Merged
izeigerman merged 13 commits into main from feat-introduce-deployability on Oct 30, 2023

@izeigerman commented Oct 25, 2023

Previously we used the is_dev flag everywhere to determine the following:

  1. Whether to use dev intervals or regular intervals when computing missing intervals.
  2. Whether to reference a prod table or a dev / temp table when evaluating a snapshot (physical table mapping).
  3. Whether to write into a prod table or a dev / temp table.
  4. The order of the backfill during plan evaluation.

Using is_dev had the following shortcomings:

  1. We had to repeat the same logic in many places. E.g., a condition like if is_dev and snapshot.is_paused and (snapshot.is_forward_only or snapshot.is_non_breaking)... had to be repeated all over the code base.
  2. The decision about whether to use a prod table or a dev / temp table could only be derived from the current state of an individual snapshot. There was no way to influence the decision based on the current state of the DAG as a whole. We found use cases in which the state of parent snapshots may affect this decision.
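
To make the second shortcoming concrete, here is a minimal sketch, not the project's actual code. FakeSnapshot and legacy_check are hypothetical names, and the check reproduces only the part of the condition quoted above; the rest is elided in the description and stays elided here. The function sees a single snapshot and the is_dev flag, so nothing about the parent snapshots can influence the result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeSnapshot:
    """Hypothetical stand-in exposing only the flags quoted in the description."""

    name: str
    is_paused: bool
    is_forward_only: bool
    is_non_breaking: bool


def legacy_check(snapshot: FakeSnapshot, is_dev: bool) -> bool:
    # Only the quoted portion of the condition; the remainder is elided in the source.
    return is_dev and snapshot.is_paused and (
        snapshot.is_forward_only or snapshot.is_non_breaking
    )


child = FakeSnapshot("child", is_paused=True, is_forward_only=True, is_non_breaking=False)

# The answer depends only on this snapshot and the global flag.
print(legacy_check(child, is_dev=True))
print(legacy_check(child, is_dev=False))
```

Every call site that needed this answer had to carry the same expression, and none of them could take the state of the surrounding DAG into account.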

In essence, during evaluation, each snapshot can be in 1 of 3 states:

  1. Deployable: this means that the output that snapshot produces can be safely reused when promoted to prod.
  2. Non-deployable: this means that the output can't be safely reused in prod and should be discarded.
  3. Representative: this means that the snapshot might not be deployable due to its nature (e.g., FORWARD_ONLY or INDIRECT_NON_BREAKING), but since it's currently promoted in prod, we should refer to its prod table when computing intervals or determining the physical table mapping. Therefore, all deployable snapshots are also representative, but the reverse is not always true.
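
The relationship between the three states can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: State, classify, is_deployable and is_representative are hypothetical names, and State.REPRESENTATIVE here stands for "representative but not deployable".

```python
from enum import Enum


class State(Enum):
    DEPLOYABLE = "deployable"
    REPRESENTATIVE = "representative"  # promoted in prod, but output is not reusable
    NON_DEPLOYABLE = "non-deployable"


def classify(output_reusable_in_prod: bool, promoted_in_prod: bool) -> State:
    """Assumed inputs: two booleans that summarize the descriptions above."""
    if output_reusable_in_prod:
        return State.DEPLOYABLE
    if promoted_in_prod:
        return State.REPRESENTATIVE
    return State.NON_DEPLOYABLE


def is_deployable(state: State) -> bool:
    # Only deployable output can be safely reused when promoted to prod.
    return state is State.DEPLOYABLE


def is_representative(state: State) -> bool:
    # Every deployable snapshot is also representative; the reverse does not hold.
    return state in (State.DEPLOYABLE, State.REPRESENTATIVE)


for state in State:
    print(state.value, is_deployable(state), is_representative(state))
```

In this sketch, is_representative answers "should the prod table be referenced?", while is_deployable answers "can the output be reused in prod?".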

To the best of my knowledge, these three states encompass all the use cases we're dealing with here.

To identify which state each snapshot is in, I created a new entity called DeployabilityIndex. It is built once at the beginning of plan evaluation and then used unchanged throughout the process. It's constructed using the entire DAG, which means that if a parent is non-deployable, this property also propagates to its children.
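
The propagation described above can be illustrated with a small sketch. It is not the actual DeployabilityIndex code: build_non_deployable, the parents mapping, and the set of "inherently non-deployable" names are all assumptions made for the example. It walks the DAG in topological order using the standard library's graphlib (Python 3.9+), so each parent is decided before its children.

```python
from graphlib import TopologicalSorter
from typing import Dict, FrozenSet, Iterable, Set


def build_non_deployable(
    parents: Dict[str, Iterable[str]],
    inherently_non_deployable: Set[str],
) -> FrozenSet[str]:
    """Return every node that is non-deployable itself or has such an ancestor."""
    non_deployable: Set[str] = set()
    # static_order() yields each node only after all of its parents.
    for node in TopologicalSorter(parents).static_order():
        has_bad_parent = any(p in non_deployable for p in parents.get(node, ()))
        if node in inherently_non_deployable or has_bad_parent:
            non_deployable.add(node)
    # Frozen, mirroring "built once and then used unchanged".
    return frozenset(non_deployable)


# Hypothetical DAG: a -> b -> c, and an unrelated node d.
dag = {"b": ["a"], "c": ["b"], "d": []}
print(sorted(build_non_deployable(dag, {"a"})))
print(sorted(build_non_deployable(dag, {"c"})))
```

With "a" marked as non-deployable, both "b" and "c" inherit the property while "d" is unaffected; marking only the leaf "c" affects nothing else.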

Thus, DeployabilityIndex is now a single source of truth for whether or not a snapshot produces deployable results, while the clients no longer need to worry about the current snapshot state (e.g., whether it's paused or not) or what the target environment is.
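
From a client's point of view, the change might look roughly like the sketch below. ToyIndex, physical_table and the "__temp" suffix are invented for illustration and are not the real API or naming scheme; the point is only that the caller asks the index one question instead of inspecting is_dev, paused state, or change category.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class ToyIndex:
    """Hypothetical read-only index, built once before evaluation starts."""

    deployable: FrozenSet[str]
    representative_only: FrozenSet[str] = frozenset()

    def is_deployable(self, name: str) -> bool:
        return name in self.deployable

    def is_representative(self, name: str) -> bool:
        # Deployable snapshots are always representative as well.
        return name in self.deployable or name in self.representative_only


def physical_table(name: str, index: ToyIndex) -> str:
    # Assumed naming: representative snapshots map to the prod table,
    # everything else to a dev / temp table.
    return name if index.is_representative(name) else f"{name}__temp"


index = ToyIndex(
    deployable=frozenset({"orders"}),
    representative_only=frozenset({"customers"}),
)
for snapshot_name in ("orders", "customers", "events"):
    print(snapshot_name, "->", physical_table(snapshot_name, index))
```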

Additionally, DeployabilityIndex presents an opportunity to inform a user at plan creation time about whether or not what they're about to build is going to be deployable. This, however, is out of scope of this PR.

@izeigerman izeigerman force-pushed the feat-introduce-deployability branch from 632404f to 5420b20 Compare October 25, 2023 21:19
@izeigerman izeigerman force-pushed the feat-introduce-deployability branch from 5420b20 to cfc0c71 Compare October 26, 2023 00:21
@izeigerman izeigerman force-pushed the feat-introduce-deployability branch from d6ee840 to 12a86d0 Compare October 27, 2023 00:05
@izeigerman izeigerman force-pushed the feat-introduce-deployability branch from 12a86d0 to f80419f Compare October 27, 2023 18:42
@izeigerman izeigerman force-pushed the feat-introduce-deployability branch from 1b3e7e5 to 72fbb9b Compare October 27, 2023 21:50
@izeigerman izeigerman merged commit 6267f24 into main Oct 30, 2023
@izeigerman izeigerman deleted the feat-introduce-deployability branch October 30, 2023 19:57
3 participants