🎉 JDBC source: adjust fetch size based on max memory and max row size #13435

Merged
tuliren merged 16 commits into master from liren/jdbc-fetch-size-v2 on Jun 7, 2022


@tuliren tuliren commented Jun 2, 2022

What

How

  • Allocate 50% of max heap as JDBC buffer.
  • Calculate fetch size based on the max row size to be safer.
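The two steps above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the class name `FetchSizeSketch`, the method names, and the `MIN_FETCH_SIZE` / `MAX_FETCH_SIZE` bounds are assumptions made for the example. Only the 50% buffer ratio and the idea of dividing by the max row size come from the description.

```java
final class FetchSizeSketch {
    // From the PR description: 50% of the max heap is used as the JDBC buffer.
    static final double BUFFER_RATIO = 0.5;
    // Assumed bounds for illustration; the PR text does not state them.
    static final int MIN_FETCH_SIZE = 1;
    static final int MAX_FETCH_SIZE = 100_000;

    // Buffer budget in bytes for a given max heap size.
    static long bufferBytes(long maxHeapBytes) {
        return (long) (maxHeapBytes * BUFFER_RATIO);
    }

    // Number of rows that fit in the buffer if every row were as large
    // as the largest row seen so far, clamped to the assumed bounds.
    static int computeFetchSize(long bufferBytes, long maxRowBytes) {
        if (maxRowBytes <= 0) {
            return MAX_FETCH_SIZE; // no row measured yet
        }
        long rows = bufferBytes / maxRowBytes;
        return (int) Math.max(MIN_FETCH_SIZE, Math.min(MAX_FETCH_SIZE, rows));
    }

    public static void main(String[] args) {
        long buffer = bufferBytes(Runtime.getRuntime().maxMemory());
        // Example: assume the widest row observed so far is 64 KiB.
        System.out.println(computeFetchSize(buffer, 64 * 1024));
    }
}
```

Sizing by the largest row rather than the average is the conservative choice: the buffer should hold even if every row in a batch turns out to be as wide as the widest one seen.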

🚨 User Impact 🚨

  • So far, this update is only published for MS SQL.
  • This PR should reduce the likelihood of out-of-memory errors (OOME).
  • Within a single sync, the fetch size can only go down, because it is derived from the max row size seen so far. This could theoretically slow down Postgres syncs.
    • This is mitigated by a total buffer size that grows with the max heap.
    • The longer-term mitigation is to checkpoint state more frequently and replace one big, long sync with several shorter syncs.
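The shrinking behavior and its mitigation can be illustrated with a small tracker. This is a hedged sketch under stated assumptions, not the connector's implementation: `ShrinkingFetchSize`, `observeRow`, and the `MAX_FETCH_SIZE` cap are hypothetical names and values chosen for the example.

```java
final class ShrinkingFetchSize {
    // Assumed upper bound for illustration; not taken from the PR.
    static final long MAX_FETCH_SIZE = 100_000;

    private final long bufferBytes;
    private long maxRowBytes = 0;

    ShrinkingFetchSize(long maxHeapBytes) {
        // 50% of the max heap, as stated in the PR description.
        this.bufferBytes = maxHeapBytes / 2;
    }

    // Record a row size; the tracked maximum never decreases.
    void observeRow(long rowBytes) {
        maxRowBytes = Math.max(maxRowBytes, rowBytes);
    }

    // Because maxRowBytes only grows, this value only shrinks within a sync.
    long fetchSize() {
        if (maxRowBytes == 0) {
            return MAX_FETCH_SIZE; // nothing measured yet
        }
        return Math.max(1, Math.min(MAX_FETCH_SIZE, bufferBytes / maxRowBytes));
    }

    public static void main(String[] args) {
        ShrinkingFetchSize tracker = new ShrinkingFetchSize(Runtime.getRuntime().maxMemory());
        for (long rowBytes : new long[] {200, 150, 8_000, 300}) {
            tracker.observeRow(rowBytes);
            System.out.println(rowBytes + " bytes -> fetch size " + tracker.fetchSize());
        }
    }
}
```

In this sketch a single wide row lowers the fetch size for the rest of the sync, even if later rows are narrow. Doubling the heap doubles the buffer and therefore the fetch size for the same row width, which is the mitigation the first sub-bullet describes.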

@github-actions github-actions bot added the area/connectors (Connector related issues) and area/documentation (Improvements or additions to documentation) labels Jun 2, 2022
@tuliren tuliren requested a review from edgao June 2, 2022 17:15

tuliren commented Jun 7, 2022

/test connector=connectors/source-postgres

🕑 connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/2452338970
❌ connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/2452338970
🐛 https://gradle.com/s/sbljyjk46lx5g

Build Failed

Test summary info:

Could not find result summary

tuliren commented Jun 7, 2022

/test connector=connectors/source-mysql

🕑 connectors/source-mysql https://github.com/airbytehq/airbyte/actions/runs/2452339435
❌ connectors/source-mysql https://github.com/airbytehq/airbyte/actions/runs/2452339435
🐛 https://gradle.com/s/seg54tg2jmade

Build Failed

Test summary info:

Could not find result summary

🕑 connectors/source-mysql https://github.com/airbytehq/airbyte/actions/runs/2452339435
❌ connectors/source-mysql https://github.com/airbytehq/airbyte/actions/runs/2452339435
🐛 https://gradle.com/s/35heq4p4do6qw

Build Failed

Test summary info:

Could not find result summary

Oh, the MySQL source CDC integration test has been broken for a while.

[Screenshot: Screen Shot 2022-06-07 at 00 18 55]

tuliren commented Jun 7, 2022

/test connector=connectors/source-mssql

🕑 connectors/source-mssql https://github.com/airbytehq/airbyte/actions/runs/2452339717
✅ connectors/source-mssql https://github.com/airbytehq/airbyte/actions/runs/2452339717
No Python unittests run

Build Passed

Test summary info:

All Passed

tuliren commented Jun 7, 2022

/test connector=connectors/source-postgres

🕑 connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/2452948350
✅ connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/2452948350
No Python unittests run

Build Passed

Test summary info:

All Passed

tuliren commented Jun 7, 2022

/publish connector=connectors/source-postgres

🕑 connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/2456712527
❌ Failed to publish connectors/source-postgres
❌ Couldn't auto-bump version for connectors/source-postgres

tuliren commented Jun 7, 2022

/publish connector=connectors/source-mssql

🕑 connectors/source-mssql https://github.com/airbytehq/airbyte/actions/runs/2456713626
🚀 Successfully published connectors/source-mssql
🚀 Auto-bumped version for connectors/source-mssql
✅ connectors/source-mssql https://github.com/airbytehq/airbyte/actions/runs/2456713626

tuliren commented Jun 7, 2022

/publish connector=connectors/source-mssql-strict-encrypt

🕑 connectors/source-mssql-strict-encrypt https://github.com/airbytehq/airbyte/actions/runs/2457458497
🚀 Successfully published connectors/source-mssql-strict-encrypt
❌ Couldn't auto-bump version for connectors/source-mssql-strict-encrypt

@tuliren tuliren changed the title 🎉 Postgres source: adjust fetch size based on max memory and max row size 🎉 JDBC source: adjust fetch size based on max memory and max row size Jun 7, 2022
@tuliren tuliren merged commit 545a7a3 into master Jun 7, 2022
@tuliren tuliren deleted the liren/jdbc-fetch-size-v2 branch June 7, 2022 21:29

tuliren commented Jun 7, 2022

Postgres will be published in a follow-up PR.


4 participants