convert destination-bigquery to kotlin CDK #36899

Conversation

stephane-airbyte
Contributor

No description provided.

@octavia-squidington-iii octavia-squidington-iii added area/connectors Connector related issues CDK Connector Development Kit connectors/destination/bigquery labels Apr 8, 2024
@stephane-airbyte stephane-airbyte marked this pull request as ready for review April 8, 2024 19:14
@stephane-airbyte stephane-airbyte requested review from a team as code owners April 8, 2024 19:14
@@ -48,7 +48,7 @@ public void flush(final StreamDescriptor decs, final Stream<PartialAirbyteMessag

     stream.forEach(record -> {
       try {
-        writer.accept(record.getSerialized(), record.getRecord().getEmittedAt());
+        writer.accept(record.getSerialized(), Jsons.serialize(record.getRecord().getMeta()), record.getRecord().getEmittedAt());
Contributor Author

@gisripa I think this is what I need to do, but I'd like you to confirm that

Contributor

Looks correct to me. One thing to verify is whether BigQuery is OK with the extra value in the CSV file. In Snowflake we had to pass a flag to ignore this meta information until we add the migration code that populates meta in the raw table.

Contributor Author

I think that my addition of V2_WITH_META means that this value will be ignored when writing the CSV

Contributor

yeah looks like it. seems ok, when we introduce meta in BQ this will work.
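
A minimal Java sketch of the idea discussed in this thread: the writer's `accept` takes the serialized meta as an extra argument, and a column-layout switch decides whether that value actually reaches the CSV row. Everything here is hypothetical (the `MetaColumnSketch` class, the `ColumnLayout` enum, the `SketchCsvWriter` class, and the column order); only the name `V2_WITH_META` comes from the conversation, and the real CDK writer may behave differently.

```java
import java.util.ArrayList;
import java.util.List;

public class MetaColumnSketch {

    // Hypothetical layouts: only V2_WITH_META emits a meta column.
    enum ColumnLayout { V2, V2_WITH_META }

    // Hypothetical stand-in for a CSV record writer; not the CDK class.
    static final class SketchCsvWriter {
        private final ColumnLayout layout;
        private final List<String> rows = new ArrayList<>();

        SketchCsvWriter(ColumnLayout layout) {
            this.layout = layout;
        }

        // Three-argument accept, mirroring the shape of the call in the diff.
        void accept(String serializedData, String serializedMeta, long emittedAt) {
            List<String> cells = new ArrayList<>();
            cells.add(Long.toString(emittedAt));
            cells.add(quote(serializedData));
            if (layout == ColumnLayout.V2_WITH_META) {
                // The meta value is written only when the layout has a column for it.
                cells.add(quote(serializedMeta));
            }
            rows.add(String.join(",", cells));
        }

        List<String> rows() {
            return rows;
        }

        // CSV quoting: wrap in double quotes and double any embedded quotes.
        private static String quote(String value) {
            return "\"" + value.replace("\"", "\"\"") + "\"";
        }
    }

    public static void main(String[] args) {
        SketchCsvWriter withoutMeta = new SketchCsvWriter(ColumnLayout.V2);
        withoutMeta.accept("{}", "{}", 1712600000000L);
        System.out.println(withoutMeta.rows());

        SketchCsvWriter withMeta = new SketchCsvWriter(ColumnLayout.V2_WITH_META);
        withMeta.accept("{}", "{}", 1712600000000L);
        System.out.println(withMeta.rows());
    }
}
```

Under this assumption the caller can always pass the meta value, and the layout alone controls whether the destination table needs a matching column.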

@stephane-airbyte stephane-airbyte force-pushed the stephane/04-08-convert_destination-bigquery_to_kotlin_cdk branch from 1467569 to a02d678 Compare April 8, 2024 21:52
@stephane-airbyte stephane-airbyte requested a review from a team as a code owner April 8, 2024 21:52
@stephane-airbyte stephane-airbyte force-pushed the stephane/04-08-convert_destination-bigquery_to_kotlin_cdk branch 2 times, most recently from 4fea7a9 to c2074de Compare April 8, 2024 22:49
@stephane-airbyte stephane-airbyte force-pushed the stephane/04-08-convert_destination-bigquery_to_kotlin_cdk branch 2 times, most recently from 0d28feb to 3598443 Compare April 9, 2024 18:20
Contributor

@edgao edgao left a comment

not sure what's up with most of the test failures (... they're claiming that we failed to write the data correctly, which sounds sketchy)

but at least this one I think is just an unrelated breaking change in the cdk - lmk if you want a hand with it, but you should be able to mostly copy these files from e.g. redshift's test resources

BigQuerySqlGeneratorIntegrationTest > testV1V2migration() FAILED
    org.gradle.internal.exceptions.DefaultMultiCauseException: Multiple Failures (2 failures)
    	java.lang.IllegalArgumentException: resource sqlgenerator/alltypes_v1v2_expectedrecords_raw.jsonl not found.
    	java.lang.IllegalArgumentException: resource sqlgenerator/alltypes_v1v2_expectedrecords_final.jsonl not found.

*
* The @JvmSuppressWildcards is here so that the 2nd parameter of accept stays a java
* Map<StreamDescriptor, StreamSyncSummary>
* ```
Contributor

nit: this formatting looks off? (also TIL JvmSuppressWildcards)

Contributor

gisripa commented Apr 9, 2024

they're claiming that we failed to write the data correctly

The failures look much like what we hit in Snowflake: adding extra data to the CSV (airbyte_meta, added in the base Writer class) that has no mapped column in the table seems to make the insert fail for all of the data. So are we silently failing, like Snowflake, when the CSV rows don't match the number of columns?

Contributor

gisripa commented Apr 9, 2024

https://cloud.google.com/bigquery/docs/loading-data-cloud-storage-csv
I guess --ignore_unknown_values is the temporary fix until we introduce meta?
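
To make the failure mode concrete, here is a toy Java model of a CSV load in which rows carry one more value than the table has columns. It is a sketch and not BigQuery's implementation: `IgnoreUnknownValuesSketch`, `LoadResult`, and `load` are hypothetical names, and the model assumes the extra value is trailing. Per the linked documentation, for CSV the `ignore_unknown_values` option ignores extra values at the end of a line.

```java
import java.util.ArrayList;
import java.util.List;

public class IgnoreUnknownValuesSketch {

    /** Result of the simulated load: rows that were kept and rows that were rejected. */
    static final class LoadResult {
        final List<List<String>> loaded = new ArrayList<>();
        final List<Integer> rejectedRowIndexes = new ArrayList<>();
    }

    /**
     * Toy loader: a row with more values than the table has columns is either
     * truncated (ignoreUnknownValues = true) or rejected (ignoreUnknownValues = false).
     */
    static LoadResult load(int tableColumnCount, List<List<String>> csvRows, boolean ignoreUnknownValues) {
        LoadResult result = new LoadResult();
        for (int i = 0; i < csvRows.size(); i++) {
            List<String> row = csvRows.get(i);
            if (row.size() <= tableColumnCount) {
                result.loaded.add(new ArrayList<>(row));
            } else if (ignoreUnknownValues) {
                // Drop the trailing values that have no matching column.
                result.loaded.add(new ArrayList<>(row.subList(0, tableColumnCount)));
            } else {
                // Treat the row as bad; nothing from it is loaded.
                result.rejectedRowIndexes.add(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // One row with a trailing meta value the 3-column table does not know about.
        List<List<String>> rows = List.of(
            List.of("id-1", "2024-04-09", "{\"a\":1}", "{\"changes\":[]}"));
        System.out.println(load(3, rows, false).loaded.size());
        System.out.println(load(3, rows, true).loaded.get(0));
    }
}
```

In this model every row is rejected without the flag, which matches the symptom described above where none of the data ends up in the table.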

@stephane-airbyte stephane-airbyte force-pushed the stephane/04-08-convert_destination-bigquery_to_kotlin_cdk branch 6 times, most recently from 9a617ca to 0b9fa0f Compare April 10, 2024 16:16
@stephane-airbyte stephane-airbyte force-pushed the stephane/04-08-convert_destination-bigquery_to_kotlin_cdk branch 3 times, most recently from b456f57 to 41db851 Compare April 25, 2024 17:51
Contributor Author

stephane-airbyte commented Apr 25, 2024

/publish-java-cdk

🕑 https://github.com/airbytehq/airbyte/actions/runs/8837157547
❌ Publish Java CDK version=0.30.11 failed!

@stephane-airbyte stephane-airbyte force-pushed the stephane/04-08-convert_destination-bigquery_to_kotlin_cdk branch 3 times, most recently from 2584ef3 to 1a27dfe Compare April 25, 2024 18:37
@octavia-squidington-iii octavia-squidington-iii added the area/documentation Improvements or additions to documentation label Apr 25, 2024
defaultNamespace = defaultNamespace,
flushFailure = FlushFailure(),
workerPool = workerPool,
)
Contributor

I aggressively removed the overloaded constructors hoping the 2 usages were already in Kotlin with defaults injected. Here it is yet another Java usage 🤦‍♂️

Contributor Author

yep, I noticed your removal 😠 😏

Contributor

@edgao edgao left a comment

one nit, otherwise lgtm

Contributor

@gisripa gisripa left a comment

lgtm.
nit: any chance we can avoid that overloaded constructor in AsyncStreamConsumer, even if it means copy-pasting the required ones from StagingConsumerFactory into BQ land?

Contributor Author

stephane-airbyte commented Apr 25, 2024

/publish-java-cdk

🕑 https://github.com/airbytehq/airbyte/actions/runs/8838668760
✅ Successfully published Java CDK version=0.30.11!

@stephane-airbyte stephane-airbyte force-pushed the stephane/04-08-convert_destination-bigquery_to_kotlin_cdk branch from 226ee35 to d60c192 Compare April 25, 2024 20:07
Contributor Author

stephane-airbyte commented Apr 25, 2024

/publish-java-cdk force=true

🕑 https://github.com/airbytehq/airbyte/actions/runs/8838745192
✅ Successfully published Java CDK version=0.30.11!

@stephane-airbyte stephane-airbyte force-pushed the stephane/04-08-convert_destination-bigquery_to_kotlin_cdk branch from b5ffe32 to 7a68dfd Compare April 25, 2024 21:33
@stephane-airbyte stephane-airbyte merged commit c4ad3d9 into master Apr 25, 2024
34 checks passed
@stephane-airbyte stephane-airbyte deleted the stephane/04-08-convert_destination-bigquery_to_kotlin_cdk branch April 25, 2024 21:46
Labels
area/connectors Connector related issues area/documentation Improvements or additions to documentation CDK Connector Development Kit connectors/destination/bigquery
5 participants