Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Source-MsSQL] Set default cursor for cdc mode #29493

Merged
merged 7 commits into from Aug 23, 2023

Conversation

nguyenaiden
Copy link
Contributor

@nguyenaiden nguyenaiden commented Aug 16, 2023

What

To prepare for Destination v2, Cdc-enabled DB sources need to have a pre-defined cursor for normalization to operate properly.

How

image
During the discover phase, source_defined_cursor is set to true, and a new composite cursor column called _ab_cdc_cursor is selected as the predefined cursor. This column is constructed from:

  1. Converting the emittedAt timestamp , generated at the read step, to epoch seconds.
  2. Convert that to nanoseconds.
  3. Initialize a threadsafe recordCounter = 1
  4. Add the recordCounter to the converted timestamp
  5. Increment the counter

The numeric cursor value will be computed for every record and allows for fast comparison in tie-breaking scenarios for records with the same Primary Key.

Updated Streamstate
{
  "cdc": false,
  "streams": [
    {
      "stream_name": "users",
      "cursor_field": [
        "_ab_cdc_cursor"
      ],
      "stream_namespace": "dbo"
    }
  ],

Recommended reading order

  1. MsSQLSource.java
  2. MsSqlCdcConnectorMetadataInjector.java
  3. All the test files

🚨 User Impact 🚨

New CDC syncs that have refreshed their source schema will see this cursor field be chosen.
This update requires that users do a full reset OR drop their SCD tables to ensure that syncs continue operating normally
Customers who reset their data will continue to see old behavior until their source goes through the discover() phase again.

@github-actions
Copy link
Contributor

github-actions bot commented Aug 16, 2023

Before Merging a Connector Pull Request

Wow! What a great pull request you have here! 🎉

To merge this PR, ensure the following has been done/considered for each connector added or updated:

  • PR name follows PR naming conventions
  • Breaking changes are considered. If a Breaking Change is being introduced, ensure an Airbyte engineer has created a Breaking Change Plan.
  • Connector version has been incremented in the Dockerfile and metadata.yaml according to our Semantic Versioning for Connectors guidelines
  • You've updated the connector's metadata.yaml file any other relevant changes, including a breakingChanges entry for major version bumps. See metadata.yaml docs
  • Secrets in the connector's spec are annotated with airbyte_secret
  • All documentation files are up to date. (README.md, bootstrap.md, docs.md, etc...)
  • Changelog updated in docs/integrations/<source or destination>/<name>.md with an entry for the new version. See changelog example
  • Migration guide updated in docs/integrations/<source or destination>/<name>-migrations.md with an entry for the new version, if the version is a breaking change. See migration guide example
  • If set, you've ensured the icon is present in the platform-internal repo. (Docs)

If the checklist is complete, but the CI check is failing,

  1. Check for hidden checklists in your PR description

  2. Toggle the github label checklist-action-run on/off to re-run the checklist CI.

@nguyenaiden nguyenaiden marked this pull request as ready for review August 16, 2023 18:18
@nguyenaiden nguyenaiden requested a review from a team as a code owner August 16, 2023 18:18
@octavia-squidington-iii
Copy link
Collaborator

source-mssql test report (commit c6f9dee920) - ❌

⏲️ Total pipeline duration: 20mn03s

Step Result
Validate airbyte-integrations/connectors/source-mssql/metadata.yaml
Connector version semver check
Connector version increment check
QA checks
Build connector tar
Build source-mssql docker image for platform linux/x86_64
./gradlew :airbyte-integrations:connectors:source-mssql:integrationTest
Acceptance tests

🔗 View the logs here

☁️ View runs for commit in Dagger Cloud

Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command

airbyte-ci connectors --name=source-mssql test

@octavia-squidington-iii
Copy link
Collaborator

source-mssql test report (commit c6f9dee920) - ❌

⏲️ Total pipeline duration: 18mn48s

Step Result
Validate airbyte-integrations/connectors/source-mssql/metadata.yaml
Connector version semver check
Connector version increment check
QA checks
Build connector tar
Build source-mssql docker image for platform linux/x86_64
./gradlew :airbyte-integrations:connectors:source-mssql:integrationTest
Acceptance tests

🔗 View the logs here

☁️ View runs for commit in Dagger Cloud

Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command

airbyte-ci connectors --name=source-mssql test

@octavia-squidington-iii
Copy link
Collaborator

source-mssql test report (commit b056c83f60) - ❌

⏲️ Total pipeline duration: 20mn24s

Step Result
Validate airbyte-integrations/connectors/source-mssql/metadata.yaml
Connector version semver check
Connector version increment check
QA checks
Build connector tar
Build source-mssql docker image for platform linux/x86_64
./gradlew :airbyte-integrations:connectors:source-mssql:integrationTest
Acceptance tests

🔗 View the logs here

☁️ View runs for commit in Dagger Cloud

Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command

airbyte-ci connectors --name=source-mssql test

@nguyenaiden
Copy link
Contributor Author

nguyenaiden commented Aug 16, 2023

/legacy-test connector=connectors/source-mssql

🕑 connectors/source-mssql https://github.com/airbytehq/airbyte/actions/runs/5883654001
❌ connectors/source-mssql https://github.com/airbytehq/airbyte/actions/runs/5883654001
🐛 https://gradle.com/s/3abp55vbaktby

Build Failed

Test summary info:

Could not find result summary

@nguyenaiden nguyenaiden changed the title [Source-MsSQL] Set default cursor for cdc mode [DRAFT DO NOT MERGE] [Source-MsSQL] Set default cursor for cdc mode Aug 16, 2023
@octavia-squidington-iii octavia-squidington-iii added the area/documentation Improvements or additions to documentation label Aug 16, 2023
@octavia-squidington-iii
Copy link
Collaborator

source-mssql test report (commit 67e3ec3cc1) - ❌

⏲️ Total pipeline duration: 20mn51s

Step Result
Validate airbyte-integrations/connectors/source-mssql/metadata.yaml
Connector version semver check
Connector version increment check
QA checks
Build connector tar
Build source-mssql docker image for platform linux/x86_64
./gradlew :airbyte-integrations:connectors:source-mssql:integrationTest
Acceptance tests

🔗 View the logs here

☁️ View runs for commit in Dagger Cloud

Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command

airbyte-ci connectors --name=source-mssql test

@octavia-squidington-iii
Copy link
Collaborator

source-mssql test report (commit bc733328b1) - ❌

⏲️ Total pipeline duration: 20mn27s

Step Result
Validate airbyte-integrations/connectors/source-mssql/metadata.yaml
Connector version semver check
Connector version increment check
QA checks
Build connector tar
Build source-mssql docker image for platform linux/x86_64
./gradlew :airbyte-integrations:connectors:source-mssql:integrationTest
Acceptance tests

🔗 View the logs here

☁️ View runs for commit in Dagger Cloud

Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command

airbyte-ci connectors --name=source-mssql test

Copy link
Contributor

@prateekmukhedkar prateekmukhedkar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we need to do the changes for source-mssql-strict-encrypt also.

@@ -24,5 +24,5 @@ ENV APPLICATION source-mssql

COPY --from=build /airbyte /airbyte

LABEL io.airbyte.version=1.1.1
LABEL io.airbyte.version=2.0.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you also want to do this for mssql-strict-encrypt?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. Will add that

@@ -453,10 +456,12 @@ public List<AutoCloseableIterator<AirbyteMessage>> getIncrementalIterators(
MssqlCdcTargetPosition.getTargetPosition(database, sourceConfig.get(JdbcUtils.DATABASE_KEY).asText()), true, firstRecordWaitTime,
OptionalInt.empty());

final MssqlCdcConnectorMetadataInjector mssqlCdcConnectorMetadataInjector = MssqlCdcConnectorMetadataInjector.getInstance(emittedAt);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One quick thought I had was :
if this is now stateful, do we want to declare this in the constructor to prevent other parts of the code from declaring multiple versions of the CdcConnectorMetadataInjector?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@akashkulk I actually already turned it into a singleton. Refer to MssqlCdcConnectorMetadataInjector.java

Copy link
Contributor

@akashkulk akashkulk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving to unblock. Some quick nits

Copy link
Contributor

@akashkulk akashkulk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually probably want to look into test failures before merging

@nguyenaiden
Copy link
Contributor Author

/legacy-test connector=connectors/source-mssql

@github-actions
Copy link
Contributor

source-mssql test report (commit 06d2655046) - ❌

⏲️ Total pipeline duration: 38mn20s

Step Result
Java Connector Unit Tests
Build connector tar
Build source-mssql docker image for platform linux/x86_64
Java Connector Integration Tests
Acceptance tests
Validate airbyte-integrations/connectors/source-mssql/metadata.yaml
Connector version semver check
Connector version increment check
QA checks

🔗 View the logs here

☁️ View runs for commit in Dagger Cloud

Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command

airbyte-ci connectors --name=source-mssql test

@github-actions
Copy link
Contributor

source-mssql test report (commit f964644809) - ❌

⏲️ Total pipeline duration: 18mn26s

Step Result
Java Connector Unit Tests
Build connector tar
Build source-mssql docker image for platform linux/x86_64
Java Connector Integration Tests
Acceptance tests
Validate airbyte-integrations/connectors/source-mssql/metadata.yaml
Connector version semver check
Connector version increment check
QA checks

🔗 View the logs here

☁️ View runs for commit in Dagger Cloud

Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command

airbyte-ci connectors --name=source-mssql test

@nguyenaiden
Copy link
Contributor Author

@nguyenaiden
Copy link
Contributor Author

/approve-and-merge reason="legacy tests + all QA checks passed"

@octavia-approvington
Copy link
Contributor

After a careful ML study,
we think this looks okay.
imagine code being okay

@octavia-approvington octavia-approvington merged commit f68a828 into master Aug 23, 2023
19 checks passed
@octavia-approvington octavia-approvington deleted the sql-server-cdc-cursor branch August 23, 2023 19:01
@github-actions
Copy link
Contributor

source-mssql-strict-encrypt test report (commit c23dcb3513) - ❌

⏲️ Total pipeline duration: 316mn20s

Step Result
Java Connector Unit Tests
Build connector tar
Build source-mssql-strict-encrypt docker image for platform linux/x86_64
Java Connector Integration Tests
Acceptance tests
Validate airbyte-integrations/connectors/source-mssql-strict-encrypt/metadata.yaml
Connector version semver check
Connector version increment check
QA checks

🔗 View the logs here

☁️ View runs for commit in Dagger Cloud

Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command

airbyte-ci connectors --name=source-mssql-strict-encrypt test

@github-actions
Copy link
Contributor

source-mssql test report (commit c23dcb3513) - ❌

⏲️ Total pipeline duration: 35mn06s

Step Result
Java Connector Unit Tests
Build connector tar
Build source-mssql docker image for platform linux/x86_64
Java Connector Integration Tests
Acceptance tests
Validate airbyte-integrations/connectors/source-mssql/metadata.yaml
Connector version semver check
Connector version increment check
QA checks

🔗 View the logs here

☁️ View runs for commit in Dagger Cloud

Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command

airbyte-ci connectors --name=source-mssql test

@github-actions
Copy link
Contributor

source-mssql test report (commit 46a8fd444e) - ❌

⏲️ Total pipeline duration: 06mn30s

Step Result
Java Connector Unit Tests
Build connector tar
Build source-mssql docker image for platform linux/x86_64
Java Connector Integration Tests
Acceptance tests
Validate airbyte-integrations/connectors/source-mssql/metadata.yaml
Connector version semver check
Connector version increment check
QA checks

🔗 View the logs here

☁️ View runs for commit in Dagger Cloud

Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command

airbyte-ci connectors --name=source-mssql test

@github-actions
Copy link
Contributor

source-mssql-strict-encrypt test report (commit 46a8fd444e) - ❌

⏲️ Total pipeline duration: 04mn00s

Step Result
Java Connector Unit Tests
Build connector tar
Build source-mssql-strict-encrypt docker image for platform linux/x86_64
Java Connector Integration Tests
Acceptance tests
Validate airbyte-integrations/connectors/source-mssql-strict-encrypt/metadata.yaml
Connector version semver check
Connector version increment check
QA checks

🔗 View the logs here

☁️ View runs for commit in Dagger Cloud

Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command

airbyte-ci connectors --name=source-mssql-strict-encrypt test

harrytou pushed a commit to KYVENetwork/airbyte that referenced this pull request Sep 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants