DV2 destinations: Build DestinationState / Migration framework #35303

Merged
gisripa merged 2 commits into master from edgao/destination_state_table
Mar 1, 2024

@edgao edgao commented Feb 14, 2024

Add a concept of DestinationState, which is tracked in a table in the destination warehouse. Use this concept to implement a generic Migration interface. This involves several changes:

  • First, move DestinationInitialState / DestinationInitialStateImpl to a Kotlin data class. This is mostly a no-op (originally I had wanted to use the copy method, but ended up not needing it).
  • Define a MinimumDestinationState interface, to enforce that all destination states MUST track a soft-reset parameter.
  • DestinationInitialState now takes a type parameter DestinationState extends MinimumDestinationState. This is propagated across all our classes. It also has a field destinationState: DestinationState.
  • DestinationHandler has a new method, commitDestinationStates, and gatherInitialState is now expected to populate the destinationState field.
    • See JdbcDestinationHandler for sample implementations.
  • Add a Migration interface, with two methods. One operates solely on the DestinationState object; the other should query the DB to confirm the migration is necessary and then execute the migration.
  • DefaultTyperDeduper + NoopTyperDeduperWithMigrations now accept a List<Migration>. These are executed within prepareSchemasAndRawTables. Also, we call commitDestinationStates from both prepareSchemasAndRawTables and prepareFinalTables.
    • The logic for triggering a soft reset has also been updated, to check the DestinationState / migration result.
    • TyperDeduperUtil is also updated to handle these migrations correctly. The old migrators have been moved into executeWeirdMigrations.
  • There are new test cases in DefaultTyperDeduperTest to verify new behavior.
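
The changes listed above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the CDK's actual code: only the names MinimumDestinationState and Migration come from the description. The method names (requireMigration, migrateIfNecessary), ExampleState, FakeWarehouse, DropLegacyTable and runMigrations are hypothetical, and an in-memory set stands in for the destination warehouse.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class DestinationStateSketch {

    // Per the description: every destination state must track a soft-reset flag.
    interface MinimumDestinationState {
        boolean needsSoftReset();
    }

    // Hypothetical state for one destination: the mandatory flag plus one migration marker.
    record ExampleState(boolean needsSoftReset, boolean legacyTableDropped)
            implements MinimumDestinationState {}

    // Hypothetical in-memory stand-in for the warehouse and its state table.
    static final class FakeWarehouse {
        final Set<String> tables = new HashSet<>();
        ExampleState committedState;
    }

    // Two methods, as described: one looks only at the state object,
    // the other checks the database and then performs the migration.
    // The method names are assumptions made for this sketch.
    interface Migration<S extends MinimumDestinationState> {
        boolean requireMigration(S state);

        S migrateIfNecessary(FakeWarehouse warehouse, S state);
    }

    // Hypothetical migration: drop a legacy table if it still exists.
    static final class DropLegacyTable implements Migration<ExampleState> {
        @Override
        public boolean requireMigration(ExampleState state) {
            return !state.legacyTableDropped();
        }

        @Override
        public ExampleState migrateIfNecessary(FakeWarehouse warehouse, ExampleState state) {
            // Set.remove returns true only if the table was actually present.
            boolean migrated = warehouse.tables.remove("legacy_raw_table");
            // Request a soft reset only when the migration really changed something.
            return new ExampleState(state.needsSoftReset() || migrated, true);
        }
    }

    // Run every migration in order, then persist the resulting state.
    static ExampleState runMigrations(FakeWarehouse warehouse,
                                      ExampleState initial,
                                      List<Migration<ExampleState>> migrations) {
        ExampleState current = initial;
        for (Migration<ExampleState> migration : migrations) {
            if (migration.requireMigration(current)) {
                current = migration.migrateIfNecessary(warehouse, current);
            }
        }
        warehouse.committedState = current; // stand-in for commitDestinationStates
        return current;
    }

    public static void main(String[] args) {
        FakeWarehouse warehouse = new FakeWarehouse();
        warehouse.tables.add("legacy_raw_table");
        ExampleState result = runMigrations(
                warehouse, new ExampleState(false, false), List.of(new DropLegacyTable()));
        System.out.println("soft reset needed: " + result.needsSoftReset());
    }
}
```

In this sketch the state-only check lets a later sync skip the database query entirely once the marker has been committed, which is presumably the point of keeping the state in a table in the destination.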

See also #35308 for an example implementation of a state + migration (still WIP).

@edgao edgao force-pushed the edgao/destination_state_table branch from 67f0f17 to f9156fd Compare February 16, 2024 19:05
@edgao edgao changed the base branch from master to gireesh/02-15-cdk/td-init-state February 16, 2024 19:06
@edgao edgao force-pushed the edgao/destination_state_table branch 3 times, most recently from 37ff176 to 34a4fb3 Compare February 20, 2024 18:29
@edgao edgao changed the base branch from gireesh/02-15-cdk/td-init-state to edgao/typer_deduper_interface_cleanup February 20, 2024 18:43
@edgao edgao force-pushed the edgao/destination_state_table branch from 34a4fb3 to fa52885 Compare February 20, 2024 22:23
@@ -94,12 +149,15 @@ void emptyDestination() throws Exception {
 initialStates.forEach(initialState -> when(initialState.isFinalTablePresent()).thenReturn(false));

 typerDeduper.prepareSchemasAndRawTables();
-verify(destinationHandler).execute(separately("CREATE SCHEMA overwrite_ns", "CREATE SCHEMA append_ns", "CREATE SCHEMA dedup_ns"));
+verify(destinationHandler).execute(separately("CREATE SCHEMA airbyte_internal", "CREATE SCHEMA overwrite_ns", "CREATE SCHEMA append_ns", "CREATE SCHEMA dedup_ns"));

Previously, the tests didn't expect a CREATE SCHEMA airbyte_internal statement because the StreamIds had a null rawNamespace. We're now setting those to airbyte_internal, so these tests need to be updated.
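
To illustrate why the expected statement list changes, here is a small sketch under stated assumptions. It is not the CDK's implementation: the StreamId record below is a simplified, hypothetical stand-in with only two fields, and createSchemaStatements is an invented helper that collects the distinct non-null namespaces in encounter order.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class SchemaListSketch {

    // Simplified, hypothetical stand-in for a stream identifier.
    record StreamId(String rawNamespace, String finalNamespace) {}

    // One CREATE SCHEMA statement per distinct, non-null namespace.
    static List<String> createSchemaStatements(List<StreamId> streams) {
        return streams.stream()
                .flatMap(id -> Stream.of(id.rawNamespace(), id.finalNamespace()))
                .filter(Objects::nonNull) // a null rawNamespace contributes nothing
                .distinct()               // keeps the first occurrence, in encounter order
                .map(namespace -> "CREATE SCHEMA " + namespace)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Old test setup: rawNamespace left as null.
        List<StreamId> before = Arrays.asList(
                new StreamId(null, "overwrite_ns"),
                new StreamId(null, "append_ns"),
                new StreamId(null, "dedup_ns"));
        // New test setup: rawNamespace set to airbyte_internal.
        List<StreamId> after = Arrays.asList(
                new StreamId("airbyte_internal", "overwrite_ns"),
                new StreamId("airbyte_internal", "append_ns"),
                new StreamId("airbyte_internal", "dedup_ns"));
        System.out.println(createSchemaStatements(before));
        System.out.println(createSchemaStatements(after));
    }
}
```

With null raw namespaces the sketch yields three statements; once every stream carries airbyte_internal, a fourth statement for that schema appears first, matching the updated expectation in the diff.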

@edgao edgao force-pushed the edgao/destination_state_table branch 2 times, most recently from be41e16 to b579660 Compare February 21, 2024 20:35
@edgao edgao force-pushed the edgao/typer_deduper_interface_cleanup branch from e282bf3 to 71904f4 Compare February 21, 2024 20:35
@edgao edgao force-pushed the edgao/destination_state_table branch from eae10b0 to 3c448f8 Compare February 21, 2024 21:32
@edgao edgao force-pushed the edgao/typer_deduper_interface_cleanup branch from 71904f4 to 829841d Compare February 21, 2024 21:32
@edgao edgao force-pushed the edgao/destination_state_table branch 2 times, most recently from 9e3d10f to 2790a2d Compare February 21, 2024 21:52
@edgao edgao force-pushed the edgao/typer_deduper_interface_cleanup branch 2 times, most recently from dcb79ff to ff34c69 Compare February 21, 2024 21:58
@edgao edgao force-pushed the edgao/destination_state_table branch 2 times, most recently from 50500a8 to 1f86d02 Compare February 21, 2024 22:13
@edgao edgao force-pushed the edgao/typer_deduper_interface_cleanup branch from ff34c69 to d2c07e5 Compare February 21, 2024 22:13
@edgao edgao force-pushed the edgao/destination_state_table branch 2 times, most recently from c0221ad to ca6ab2a Compare February 21, 2024 22:46
@edgao edgao force-pushed the edgao/destination_state_table branch 4 times, most recently from b274ff7 to 75cc1f1 Compare February 23, 2024 19:12
@edgao edgao force-pushed the edgao/typer_deduper_interface_cleanup branch from 90b10f3 to d6199c6 Compare February 23, 2024 21:12
@edgao edgao force-pushed the edgao/destination_state_table branch from 75cc1f1 to 73f7fe4 Compare February 23, 2024 21:12
@edgao edgao changed the title from "state stuff" to "DV2 destinations: Build DestinationState / Migration framework" Feb 23, 2024
@gisripa gisripa force-pushed the edgao/typer_deduper_interface_cleanup branch 2 times, most recently from f0a9727 to c7e2466 Compare February 27, 2024 22:54
@edgao edgao force-pushed the edgao/typer_deduper_interface_cleanup branch from c7e2466 to d6199c6 Compare February 27, 2024 22:57
Base automatically changed from edgao/typer_deduper_interface_cleanup to master February 28, 2024 22:41
@gisripa gisripa force-pushed the edgao/destination_state_table branch 5 times, most recently from 5022d37 to d8ce2d2 Compare February 29, 2024 19:35
@gisripa gisripa force-pushed the edgao/destination_state_table branch from d8ce2d2 to d158042 Compare March 1, 2024 18:35
Signed-off-by: Gireesh Sreepathi <gisripa@gmail.com>

gisripa commented Mar 1, 2024

/publish-java-cdk

🕑 https://github.com/airbytehq/airbyte/actions/runs/8115901215
✅ Successfully published Java CDK version=0.23.10!

@gisripa gisripa merged commit 4efc065 into master Mar 1, 2024
26 checks passed
@gisripa gisripa deleted the edgao/destination_state_table branch March 1, 2024 19:04
xiaohansong pushed a commit that referenced this pull request Mar 7, 2024
Signed-off-by: Gireesh Sreepathi <gisripa@gmail.com>
Co-authored-by: Gireesh Sreepathi <gisripa@gmail.com>
Labels
CDK Connector Development Kit
3 participants