Add checkpointing to JDBC Source databases #10909

Closed
alafanechere opened this issue Mar 7, 2022 · 6 comments · Fixed by #16238, #16256, #16258, #16259 or #16261

Comments

@alafanechere
Contributor

alafanechere commented Mar 7, 2022

Summary

Created from Slack: https://airbytehq.slack.com/archives/C01MFR03D5W/p1640005429033600?thread_ts=1640005429.033600&cid=C01MFR03D5W

When starting a full sync on a huge table, I can see a long-running query in the DB (something like this):
SELECT <all columns> FROM <table>

But the table is 4 TB, so this first query will take days to finish (and it will probably fail before then). Is there any way to configure the first sync to be divided into smaller chunks?
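
To illustrate what "dividing the first sync into smaller chunks" could look like, here is a minimal sketch of keyset pagination: each query reads a bounded number of rows ordered by a key column, and the next query starts after the last key seen. This is not how the Airbyte JDBC sources are implemented; the function `read_in_chunks`, the table `big_table`, and the use of SQLite as a stand-in database are all assumptions made for the example.

```python
import sqlite3


def read_in_chunks(conn, table, key_column, chunk_size):
    """Yield lists of rows, each fetched by its own bounded query.

    Hypothetical helper: table and key_column are interpolated into the SQL,
    so they must come from trusted configuration, not from user input.
    """
    last_key = None
    while True:
        if last_key is None:
            sql = f"SELECT * FROM {table} ORDER BY {key_column} LIMIT ?"
            params = (chunk_size,)
        else:
            sql = (
                f"SELECT * FROM {table} WHERE {key_column} > ? "
                f"ORDER BY {key_column} LIMIT ?"
            )
            params = (last_key, chunk_size)
        cur = conn.execute(sql, params)
        names = [d[0] for d in cur.description]
        rows = cur.fetchall()
        if not rows:
            return
        yield rows
        # Remember where this chunk ended so the next query starts after it.
        last_key = rows[-1][names.index(key_column)]


# Demo on an in-memory table with 10 rows (stand-in for the 4 TB table).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big_table (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO big_table (id, payload) VALUES (?, ?)",
    [(i, f"row-{i}") for i in range(1, 11)],
)
sizes = [len(chunk) for chunk in read_in_chunks(conn, "big_table", "id", 4)]
print(sizes)  # prints [4, 4, 2]
```

The approach assumes the key column is unique and indexed; without an index, each chunk query would still scan the table.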

Connector publication

| Connector | Old | New | PR |
| --- | --- | --- | --- |
| source-postgres | 0.4.40 | 0.4.41 | #14903 |
| source-postgres-strict-encrypt | ^ | ^ | ^ |
| source-mssql | 0.4.16 | 0.4.17 | #16261 |
| source-mssql-strict-encrypt | ^ | ^ | ^ |
| source-mysql | 0.6.7 | 0.6.8 | #16259 |
| source-mysql-strict-encrypt | ^ | ^ | ^ |
| source-oracle | 0.3.20 | 0.3.21 | #16238 |
| source-oracle-strict-encrypt | ^ | ^ | ^ |
| source-redshift | 0.3.13 | 0.3.14 | #16258 |
| source-snowflake | 0.1.19 | 0.1.20 | ^ |
| source-clickhouse | 0.1.12 | 0.1.13 | #16256 |
| source-clickhouse-strict-encrypt | 0.1.9 | ^ | ^ |
| source-cockroachdb | 0.1.16 | 0.1.17 | ^ |
| source-cockroachdb-strict-encrypt | ^ | ^ | ^ |
| source-db2 | 0.1.14 | 0.1.15 | ^ |
| source-db2-strict-encrypt | ^ | ^ | ^ |
| source-tidb | 0.2.0 | 0.2.1 | ^ |

^ means the value is the same as in the row above.

@alafanechere alafanechere changed the title from "Explain how reads on RDBMS are done" to "Explain how reads on RDBMS work" Mar 7, 2022
@leoalmeidasant

Hi @alafanechere, did you find a solution to this issue? I've been running into the same problem for about three weeks and I don't know how to solve it.

@grishick grishick changed the title from "Explain how reads on RDBMS work" to "Add checkpointing to JDBC databases" Jun 30, 2022
@grishick
Contributor

The issue will be addressed once both the source and the destination in a given connection make periodic checkpoints.
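
As a rough illustration of periodic checkpointing on the source side, the sketch below interleaves state messages with records so that an interrupted sync can resume from the last saved cursor value instead of restarting. It is a simplified model, not the actual Airbyte protocol or connector code: the message tuples, the `events` table, the `read_with_checkpoints` function, and the `checkpoint_every` parameter are all assumptions made for the example.

```python
import sqlite3


def read_with_checkpoints(conn, state, checkpoint_every):
    """Yield ("RECORD", row) and ("STATE", {...}) messages, resuming from state.

    Rows are read in cursor order so that a saved cursor value is a valid
    resume point. All names here are hypothetical.
    """
    cursor_value = state.get("cursor", 0)
    query = "SELECT id, payload FROM events WHERE id > ? ORDER BY id"
    emitted = 0
    for row in conn.execute(query, (cursor_value,)):
        yield ("RECORD", row)
        cursor_value = row[0]
        emitted += 1
        if emitted % checkpoint_every == 0:
            # Periodic checkpoint: everything up to cursor_value was emitted.
            yield ("STATE", {"cursor": cursor_value})
    # Final checkpoint at the end of the stream.
    yield ("STATE", {"cursor": cursor_value})


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO events (id, payload) VALUES (?, ?)",
    [(i, f"event-{i}") for i in range(1, 8)],
)

# First attempt: simulate a failure after 5 records have been read.
saved_state = {}
records_seen = 0
for kind, payload in read_with_checkpoints(conn, {}, checkpoint_every=3):
    if kind == "STATE":
        saved_state = payload
    else:
        records_seen += 1
        if records_seen == 5:
            break  # simulated crash

# Second attempt: resume from the last checkpoint that was saved.
resumed_ids = [
    payload[0]
    for kind, payload in read_with_checkpoints(conn, saved_state, checkpoint_every=3)
    if kind == "RECORD"
]
print(saved_state, resumed_ids)  # prints {'cursor': 3} [4, 5, 6, 7]
```

Rows 4 and 5 are read twice in this run, so a design like this gives at-least-once delivery; the destination would need to tolerate or deduplicate the repeats.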

@grishick grishick added the team/databases label and removed the area/documentation and team/documentation labels Jun 30, 2022
@grishick grishick changed the title from "Add checkpointing to JDBC databases" to "Add checkpointing to JDBC Source databases" Aug 5, 2022
@grishick
Contributor

grishick commented Aug 5, 2022

@tuliren, it looks like you have already done this for Postgres, so it should now be possible to do the same for MySQL and the other JDBC database sources. If the implementation requires source-specific changes, please file a separate issue for each source.

@tuliren tuliren self-assigned this Aug 24, 2022
@grishick
Contributor

Work left:

@evantahler
Contributor

Is this issue a duplicate of #14770? If so, feel free to close one of them!

@tuliren
Contributor

tuliren commented Sep 1, 2022

@evantahler, no, they are not duplicates. #14770 is for the destination; this one is for the source.

@tuliren tuliren linked a pull request Sep 2, 2022 that will close this issue
@tuliren tuliren reopened this Sep 2, 2022