Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Source-postgres]: Add logging in case of multiple records with same LSN #35842

Merged
merged 13 commits into from
Mar 7, 2024

Conversation

akashkulk
Copy link
Contributor

No description provided.

Copy link

vercel bot commented Mar 6, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Ignored Deployment
Name Status Preview Comments Updated (UTC)
airbyte-docs ⬜️ Ignored (Inspect) Visit Preview Mar 7, 2024 1:01am

@octavia-squidington-iii octavia-squidington-iii added area/connectors Connector related issues CDK Connector Development Kit connectors/source/postgres labels Mar 6, 2024
@@ -57,6 +57,7 @@ private ContainerModifier(String methodName) {

}

@SuppressWarnings("deprecation")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

??

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure, but this underlying library seems to have a deprecation that is failing the gradle check, so it i's necessary

@akashkulk akashkulk marked this pull request as ready for review March 6, 2024 20:35
@akashkulk akashkulk requested a review from a team as a code owner March 6, 2024 20:35
@akashkulk
Copy link
Contributor Author

akashkulk commented Mar 7, 2024

/publish-java-cdk

🕑 https://github.com/airbytehq/airbyte/actions/runs/8181051859
✅ Successfully published Java CDK version=0.23.16!

@akashkulk akashkulk enabled auto-merge (squash) March 7, 2024 01:08
@akashkulk akashkulk merged commit 4fcff41 into master Mar 7, 2024
28 checks passed
@akashkulk akashkulk deleted the akash/psg-multiple-lsn-logging branch March 7, 2024 01:09
if (targetPosition.isEventAheadOffset(checkpointOffsetToSend, event)) {
sendCheckpointMessage = true;
} else {
LOGGER.info("Encountered {} records with the same event offset", recordsLastSync);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This log message is causing our airbyte logs to significantly grow in size as we commonly see this message >1 million times in each sync.

Is it possible to selectively disable this log or would we have to change our log level? We don't believe multiple records with the same offset is an issue for us because most of our database operations are batch inserts.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@WillKoehrsen are you able to submit a PR which limits how often this is logged? Maybe only once per offset or stop after 100 logs?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I don't have the experience with the codebase to feel comfortable opening a PR. We were able to get around the issue in the short term by downgrading our postgres connector to 3.3.12.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, we'll track this bug via #36903

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@WillKoehrsen we have removed this log line in the new version of source-postgres

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot! Really appreciate it.

testExecutionConcurrency=-1

JunitMethodExecutionTimeout=5 m
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@akashkulk what's this for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this PR, CDK 0.23.15 #35827 default timeout was reduced from 5m to 1m by default. It wasn't really validated against any other certified connectors other than source-mssql for which it seems like this fix exclusively applied.

This timeout just adds back that functionality.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/connectors Connector related issues area/documentation Improvements or additions to documentation CDK Connector Development Kit connectors/source/postgres
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants