Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DV2 TyperDeduper: Extract migrations to separate method #35376

Merged
merged 5 commits into from
Feb 28, 2024

Conversation

edgao
Copy link
Contributor

@edgao edgao commented Feb 16, 2024

Based on top of #35342

Coincidentally, closes https://github.com/airbytehq/airbyte-internal-issues/issues/2469. After this PR, we'll trigger soft resets after raw table creation.

Extract some common code from the DefaultTyperDeduper and NoopTyperDeduperWithMigrations classes. This is just a refactor to simplify some future work.

It also fixes some tests in DefaultTyperDeduperTest. Previously some tests weren't actually doing anything, because they called clearInvocations() followed immediately by verifyNoMoreInteractions(), which is effectively just assertTrue(true).

The CDK build.gradle changes are just copied out of #34637. Once that's merged, there should be no diff with master.

Copy link

vercel bot commented Feb 16, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Ignored Deployment
Name Status Preview Comments Updated (UTC)
airbyte-docs ⬜️ Ignored (Inspect) Visit Preview Feb 28, 2024 7:13pm

@octavia-squidington-iii octavia-squidington-iii added the CDK Connector Development Kit label Feb 16, 2024
@edgao edgao force-pushed the edgao/typer_deduper_interface_cleanup branch 2 times, most recently from 796b41f to 98802c6 Compare February 16, 2024 23:44
@edgao edgao changed the base branch from master to gireesh/02-15-cdk/td-init-state February 16, 2024 23:44
@edgao edgao force-pushed the edgao/typer_deduper_interface_cleanup branch from e596cd7 to d0fbcd8 Compare February 17, 2024 01:21
@gisripa gisripa force-pushed the gireesh/02-15-cdk/td-init-state branch from fb3ce67 to 52ef29f Compare February 21, 2024 02:59
@edgao edgao force-pushed the edgao/typer_deduper_interface_cleanup branch from e282bf3 to 71904f4 Compare February 21, 2024 20:35
@edgao edgao changed the title Extract migrations to separate method DV2 TyperDeduper: Extract migrations to separate method Feb 21, 2024
@edgao edgao force-pushed the edgao/typer_deduper_interface_cleanup branch 2 times, most recently from 829841d to dcb79ff Compare February 21, 2024 21:52
if (overwriteStreamsWithTmpTable != null) {
throw new IllegalStateException("Tables were already prepared.");
}
overwriteStreamsWithTmpTable = ConcurrentHashMap.newKeySet();
LOGGER.info("Preparing tables");

// This is intentionally not done in parallel to avoid rate limits in some destinations.
prepareSchemas(parsedCatalog);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this call now happens in prepareSchemasAndRawTables

// This is intentionally not done in parallel to avoid rate limits in some destinations.
prepareSchemas(parsedCatalog);

// TODO: Either the migrations run the soft reset and create v2 tables or the actual prepare tables.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this logic now happens in TyperDeduperUtil.executeRawTableMigrations

import java.util.concurrent.ExecutorService


class TyperDeduperUtil {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this class is partly taken from @jbfbell 's https://github.com/airbytehq/airbyte/pull/34637/files#diff-c9e7bf1074e965f63b135422dd2a7ac6ba196046b226a0ffe3a855a4eb82ed3b

it'll have some merge conflicts b/c I reordered the arguments in prepareAllSchemas but 🤷 just wanted to be consistent across the methods, and I generally prefer infra-ish params (executorservice, destinationhandler, etc.) to come before functional-ish params (parsedcatalog, streamconfig, etc.)

@edgao edgao force-pushed the edgao/typer_deduper_interface_cleanup branch from dcb79ff to ff34c69 Compare February 21, 2024 21:58
// with current state of raw tables & final tables. This is done first before gather initial state
// to avoid recreating
// final tables later again.
val runMigrationsResult =
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as noted above - this code originally lived in DefaultTyperDeduper, I just extracted it to a separate class.

@@ -161,12 +165,14 @@ void existingEmptyTableMatchingSchema() throws Exception {
initialStates.forEach(initialState -> {
when(initialState.isFinalTablePresent()).thenReturn(true);
when(initialState.isFinalTableEmpty()).thenReturn(true);
when(initialState.isSchemaMismatch()).thenReturn(true);
when(initialState.isSchemaMismatch()).thenReturn(false);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test previously didn't actually assert anything, and was secretly broken. This is the correct version.

(specifically: if there are no final tables, isSchemaMismatch is supposed to return false)

@edgao edgao force-pushed the edgao/typer_deduper_interface_cleanup branch from ff34c69 to d2c07e5 Compare February 21, 2024 22:13
@edgao edgao marked this pull request as ready for review February 21, 2024 22:38
@edgao edgao requested a review from a team as a February 21, 2024 22:38
@gisripa gisripa force-pushed the gireesh/02-15-cdk/td-init-state branch from d84938e to 71d4578 Compare February 21, 2024 23:45
@edgao edgao force-pushed the edgao/typer_deduper_interface_cleanup branch from d2c07e5 to d307cb7 Compare February 22, 2024 00:36
@gisripa gisripa force-pushed the gireesh/02-15-cdk/td-init-state branch 2 times, most recently from 2aab345 to 16843df Compare February 22, 2024 22:07
Base automatically changed from gireesh/02-15-cdk/td-init-state to master February 22, 2024 23:00
@edgao edgao force-pushed the edgao/typer_deduper_interface_cleanup branch 2 times, most recently from 06ac33d to 90b10f3 Compare February 22, 2024 23:45
@gisripa gisripa force-pushed the edgao/typer_deduper_interface_cleanup branch from d6199c6 to f0a9727 Compare February 27, 2024 22:53
@gisripa gisripa changed the title DV2 TyperDeduper: Extract migrations to separate method wild idea Feb 27, 2024
@gisripa gisripa force-pushed the edgao/typer_deduper_interface_cleanup branch from f0a9727 to c7e2466 Compare February 27, 2024 22:54
@edgao edgao force-pushed the edgao/typer_deduper_interface_cleanup branch from c7e2466 to d6199c6 Compare February 27, 2024 22:57
@edgao edgao changed the title wild idea DV2 TyperDeduper: Extract migrations to separate method wild idea Feb 27, 2024
@gisripa gisripa changed the title DV2 TyperDeduper: Extract migrations to separate method wild idea DV2 TyperDeduper: Extract migrations to separate method Feb 28, 2024
@gisripa
Copy link
Contributor

gisripa commented Feb 28, 2024

/publish-java-cdk

🕑 https://github.com/airbytehq/airbyte/actions/runs/8086198897
✅ Successfully published Java CDK version=0.23.7!

@gisripa gisripa merged commit f0f7a98 into master Feb 28, 2024
26 checks passed
@gisripa gisripa deleted the edgao/typer_deduper_interface_cleanup branch February 28, 2024 22:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CDK Connector Development Kit
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants