DV2: Only run T+D if we have zero records or the previous sync left behind un-T+D-ed records #33232

Merged
merged 24 commits into from
Dec 13, 2023

Conversation

edgao
Contributor

@edgao edgao commented Dec 7, 2023

Seeking early review: lmk if this interface makes sense. BigQuery and Snowflake are both implemented.

Non-draft to trigger CI.

Rough description:

  • Add a recordCounts map to the OnCloseFunction interfaces, which tracks how many records were processed during this sync per stream.
  • AsyncStreamConsumer - track this map
    • The Executors.newFixedThreadPool(5) thing is copied from this FlushWorkers constructor, to avoid duplicate constructor code
  • BufferedStreamConsumer - don't bother fixing this class, we want to kill it anyway
  • Pass recordCounts into the appropriate TyperDeduper method
  • Modify the DestinationHandler interface to find more detailed information at the start of a sync (we were already computing this)
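
A minimal sketch of the idea in the list above, not the CDK code itself. It assumes the intended rule is "run T+D for a stream when this sync processed at least one record for it, or when the previous sync left unprocessed raw records behind". `RecordCountTracker`, `TypingDedupingGate`, and `SyncSimulation` are hypothetical names made up for illustration; the real `AsyncStreamConsumer`, `OnCloseFunction`, `TyperDeduper`, and `DestinationHandler` signatures are not reproduced here. The five-thread pool mirrors the `Executors.newFixedThreadPool(5)` mentioned above.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the per-stream recordCounts map a consumer would track.
class RecordCountTracker {
  private final ConcurrentHashMap<String, AtomicLong> recordCounts = new ConcurrentHashMap<>();

  // Called once per record; safe to call from several worker threads at once.
  void recordProcessed(String streamName) {
    recordCounts.computeIfAbsent(streamName, k -> new AtomicLong()).incrementAndGet();
  }

  // Number of records this sync processed for the stream (0 if none were seen).
  long countFor(String streamName) {
    AtomicLong count = recordCounts.get(streamName);
    return count == null ? 0L : count.get();
  }
}

// Hypothetical decision helper; an assumption about the rule, not the CDK's TyperDeduper.
class TypingDedupingGate {
  static boolean shouldRunTypingAndDeduping(long recordsThisSync, boolean hadUnprocessedRecordsAtStart) {
    return recordsThisSync > 0 || hadUnprocessedRecordsAtStart;
  }
}

// Simulates a sync in which five workers each process the same number of records.
class SyncSimulation {
  static RecordCountTracker run(String streamName, int recordsPerWorker) {
    RecordCountTracker tracker = new RecordCountTracker();
    ExecutorService workers = Executors.newFixedThreadPool(5);
    for (int w = 0; w < 5; w++) {
      workers.submit(() -> {
        for (int i = 0; i < recordsPerWorker; i++) {
          tracker.recordProcessed(streamName);
        }
      });
    }
    workers.shutdown();
    try {
      // Wait for all workers so the counts are final before anyone reads them.
      if (!workers.awaitTermination(30, TimeUnit.SECONDS)) {
        throw new IllegalStateException("workers did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return tracker;
  }
}
```

At close time, a caller would look up `tracker.countFor(stream)` for each stream and combine it with whatever the destination reported at the start of the sync, e.g. `TypingDedupingGate.shouldRunTypingAndDeduping(tracker.countFor("users"), false)`. A stream that received no records and had nothing left over from the previous sync skips T+D entirely.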

The failing CI check is unrelated; investigation is ongoing.

BigQueryStandardInsertsTypingDedupingTest > testRawTableJsonToStringMigration() FAILED
    io.airbyte.workers.exception.TestHarnessException: Could not find image: airbyte/destination-bigquery:1.9.0

@edgao edgao requested a review from a team as a code owner December 7, 2023 20:15
@octavia-squidington-iii octavia-squidington-iii added area/connectors Connector related issues CDK Connector Development Kit labels Dec 7, 2023
# Conflicts:
#	airbyte-integrations/connectors/destination-snowflake/src/main/java/io/airbyte/integrations/destination/snowflake/SnowflakeInternalStagingDestination.java
#	airbyte-integrations/connectors/destination-snowflake/src/main/java/io/airbyte/integrations/destination/snowflake/SnowflakeSqlOperations.java
@edgao
Contributor Author

edgao commented Dec 13, 2023

can't repro the test failure locally 🤔 so, uh, idk

Contributor

@gisripa gisripa left a comment

lgtm pending logistics

@octavia-squidington-iii octavia-squidington-iii added the area/documentation Improvements or additions to documentation label Dec 13, 2023
@edgao
Contributor Author

edgao commented Dec 13, 2023

agh I need to publish the cdk here >.> will literally never remember to do this

@edgao
Contributor Author

edgao commented Dec 13, 2023

... which also means I can't merge this until #33369 is merged 😢

@gisripa
Contributor

gisripa commented Dec 13, 2023

#33369 is merged

# Conflicts:
#	airbyte-cdk/java/airbyte-cdk/core/src/main/resources/version.properties
@edgao
Contributor Author

edgao commented Dec 13, 2023

/publish-java-cdk

🕑 https://github.com/airbytehq/airbyte/actions/runs/7202070056
✅ Successfully published Java CDK version=0.7.4!

@edgao edgao enabled auto-merge (squash) December 13, 2023 23:13
@edgao edgao merged commit d27ea33 into master Dec 13, 2023
25 of 26 checks passed
@edgao edgao deleted the edgao/td_if_records branch December 13, 2023 23:47
jatinyadav-cc pushed a commit to ollionorg/datapipes-airbyte that referenced this pull request Feb 26, 2024
…ehind un-T+D-ed records (airbytehq#33232)

Co-authored-by: edgao <edgao@users.noreply.github.com>
Labels
area/connectors Connector related issues area/documentation Improvements or additions to documentation CDK Connector Development Kit checklist-action-run connectors/destination/bigquery connectors/destination/snowflake
3 participants