[CDCSDK] CDC nemesis case fails with org.apache.kafka.connect.errors.ConnectException: Unable to obtain valid replication slot. #21780
shamanthchandra-yb added the priority/high (High Priority), area/cdcsdk (CDC SDK), and status/awaiting-triage (Issue awaiting triage) labels on Apr 2, 2024
dr0pdb added a commit that referenced this issue on Apr 11, 2024
Summary: This revision adds more logs in various phases for ease of debugging of stress runs for CDC. Most of them are VLOGs. The only INFO log is the Walsender startup log which is a one-time log and shouldn't have any performance implications.
Jira: DB-10655
Test Plan:
Jenkins: test regex: .*ReplicationSlot.*
Ran tests and checked log manually.
Reviewers: asrinivasan
Reviewed By: asrinivasan
Subscribers: ybase, ycdcxcluster, yql, bogdan
Differential Revision: https://phorge.dev.yugabyte.com/D33989
dr0pdb added a commit that referenced this issue on Apr 23, 2024
…er fixes

Summary:

##### Backport Description
Clean merges. No conflicts.

##### Original Description
Original commits:
6bd88e6 / D33989
868d626 / D34162
ab43084 / D34320

###### [#21780] YSQL: Introduce more debug logs for better debuggability
This revision adds more logs in various phases for ease of debugging of stress runs for CDC. Most of them are VLOGs. The only INFO log is the Walsender startup log which is a one-time log and shouldn't have any performance implications.
Jira: DB-10655

###### [#21519] YSQL: Skip RollbackToSubTransaction RPC to local tserver proxy if not using a distributed transaction
Before this revision, every RollbackToSubTransaction operation in PG would lead to a corresponding RPC call to the local tserver. The local tserver used to return early in case there was no distributed transaction.
This revision adds the logic in the PG layer (pg_session) to skip sending the RPC if the transaction is read-only or a fast-path transaction i.e., has NON_TRANSACTIONAL isolation level. Note that we were already doing that for transaction commit/aborts but weren't skipping the RPC for rollback of sub-transaction.
This change was proposed as part of the implementation of the PG compatible logical replication support. While streaming the changes to the Walsender, it starts and aborts transactions for every transaction that gets streamed. This is required for reading PG catalog tables. As a result, we were seeing a lot of unnecessary RPC calls to the local tserver.
Jira: DB-10402

###### [#21652] YSQL: Add more debug logs in the ListReplicationSlots function for debugging
This revision introduces more VLOG statements in the ListReplicationSlots function in pg_client_service. This will aid us in debugging the issues observed while reading the CDC state table from the tablet server.
Jira: DB-10546

Test Plan: Jenkins: test regex: .*ReplicationSlot.*
Reviewers: asrinivasan
Reviewed By: asrinivasan
Subscribers: bogdan, yql, ycdcxcluster, ybase
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D34410
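The RPC-skipping rule described under [#21519] can be illustrated with a small sketch. This is not the actual pg_session code (which is C++ and is not shown in this issue); it is a Python model under stated assumptions. Every name here (`TxnState`, `FakeTserverProxy`, `needs_rollback_rpc`, `rollback_to_sub_transaction`) is hypothetical, and the only behaviour taken from the commit message is that the RPC is skipped when the transaction is read-only or has the NON_TRANSACTIONAL isolation level.

```python
from dataclasses import dataclass
from enum import Enum, auto


class IsolationLevel(Enum):
    # Only NON_TRANSACTIONAL is named in the commit message; the other
    # members are placeholders for "some distributed isolation level".
    NON_TRANSACTIONAL = auto()  # fast-path transaction
    SNAPSHOT = auto()
    SERIALIZABLE = auto()


@dataclass
class TxnState:
    """Hypothetical view of what the PG layer knows about the transaction."""
    isolation: IsolationLevel
    read_only: bool = False


class FakeTserverProxy:
    """Stand-in for the local tserver proxy; it only counts RPCs."""

    def __init__(self) -> None:
        self.rollback_rpcs = 0

    def rollback_to_sub_transaction(self, sub_txn_id: int) -> None:
        self.rollback_rpcs += 1


def needs_rollback_rpc(txn: TxnState) -> bool:
    """Return False for read-only or fast-path transactions."""
    if txn.read_only:
        return False
    if txn.isolation is IsolationLevel.NON_TRANSACTIONAL:
        return False
    return True


def rollback_to_sub_transaction(
    txn: TxnState, proxy: FakeTserverProxy, sub_txn_id: int
) -> bool:
    """Send the RPC only when needed; return True if it was sent."""
    if not needs_rollback_rpc(txn):
        return False  # skipped in the PG layer, no round trip
    proxy.rollback_to_sub_transaction(sub_txn_id)
    return True
```

In this model, a Walsender-style workload that repeatedly starts and aborts read-only transactions never touches the proxy, which matches the motivation given in the commit message.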
ZhenYongFan pushed a commit to ZhenYongFan/yugabyte-db that referenced this issue on Jun 15, 2024
…L: Backport misc walsender fixes
Jira Link: DB-10655
Description
The stress run link is in the JIRA description.
A nemesis was triggered, and we observed the error below during that time:
However, after that, we saw a sequence of logs stating:
and the run finally failed with:
Source connector version
fourpointfour/ybdb-debezium:0.2
Connector configuration
YugabyteDB version
2.23.0.0-b86
Issue Type
kind/bug
Warning: Please confirm that this issue does not contain any sensitive information