[DocDB] Fatal: Couldn't find connection for any index to Connection (0x000017e13de943d8) #21738
Closed
1 task done
Labels
2024.1 Backport Required
2024.1_blocker
area/docdb
YugabyteDB core features
kind/bug
This issue is a bug
priority/medium
Medium priority issue
qa_stress
Bugs identified via Stress automation
QA
QA filed bugs
Comments
shishir2001-yb added the area/docdb (YugabyteDB core features), QA (QA filed bugs), status/awaiting-triage (Issue awaiting triage), and qa_stress (Bugs identified via Stress automation) labels on Mar 29, 2024
yugabyte-ci added the kind/bug (This issue is a bug) and priority/medium (Medium priority issue) labels on Mar 29, 2024
Probably related to #20661, as indicated by Shishir on Slack offline.
spolitov added a commit that referenced this issue on Apr 6, 2024

Summary: A connection can receive two failures simultaneously, especially when the underlying TCP stream breaks. A flag, `queued_destroy_connection_`, exists to avoid calling DestroyConnection multiple times. However, on an RPC heartbeat timeout the connection is destroyed without setting that flag. As a result, Reactor::DestroyConnection can be called twice for the same connection. For a client connection, the first call removes the connection from `client_conns_`, so the second call cannot find the connection in this map. The sanity check then fails and the process dies because of the check failure.

Fixed RPC heartbeat timeout handling to check and set `queued_destroy_connection_`. Also added logic to avoid removing the connection from `client_conns_` if the connection was already destroyed.

Jira: DB-10612
Test Plan: Jenkins
Reviewers: bogdan, mbautin
Reviewed By: bogdan
Subscribers: ybase
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D33685
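The failure mode and the fix described in the summary can be illustrated with a small sketch. This is not the YugabyteDB code: the `Connection` and `Reactor` types, `QueueDestroy`, `AddClientConnection`, and `num_client_connections` are hypothetical stand-ins, and only the idea of a "destroy already queued" flag and the `client_conns_` registry are taken from the commit message. The sketch assumes that every failure path checks and sets the flag before destroying, so the registry lookup can no longer miss.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_set>

// Hypothetical stand-in for an RPC connection (not the real class).
struct Connection {
  // Plays the role that `queued_destroy_connection_` has in the summary.
  bool queued_destroy = false;
};

using ConnectionPtr = std::shared_ptr<Connection>;

// Hypothetical stand-in for the reactor that owns client connections.
class Reactor {
 public:
  ConnectionPtr AddClientConnection() {
    auto conn = std::make_shared<Connection>();
    client_conns_.insert(conn);
    return conn;
  }

  // Assumed single entry point for all failure paths (broken stream,
  // heartbeat timeout, and so on). Returns true only for the caller
  // that actually triggered the destroy.
  bool QueueDestroy(const ConnectionPtr& conn) {
    if (conn->queued_destroy) {
      return false;  // A previous failure already queued the destroy.
    }
    conn->queued_destroy = true;
    DestroyConnection(conn);
    return true;
  }

  std::size_t num_client_connections() const { return client_conns_.size(); }

 private:
  void DestroyConnection(const ConnectionPtr& conn) {
    // Sanity check: the connection must still be registered. Without the
    // flag check above, a second call would erase nothing and fail here.
    const std::size_t erased = client_conns_.erase(conn);
    assert(erased == 1);
    (void)erased;  // Silence unused-variable warnings when NDEBUG is set.
  }

  std::unordered_set<ConnectionPtr> client_conns_;
};
```

In this sketch, reporting a second failure for the same connection is a no-op: the second `QueueDestroy` call returns `false` and never reaches the sanity check.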
Closed
1 task
The fix needs to be backported to 2024.1.
spolitov added a commit that referenced this issue on Apr 20, 2024

…ction

Summary: A connection can receive two failures simultaneously, especially when the underlying TCP stream breaks. A flag, `queued_destroy_connection_`, exists to avoid calling DestroyConnection multiple times. However, on an RPC heartbeat timeout the connection is destroyed without setting that flag. As a result, Reactor::DestroyConnection can be called twice for the same connection. For a client connection, the first call removes the connection from `client_conns_`, so the second call cannot find the connection in this map. The sanity check then fails and the process dies because of the check failure.

Fixed RPC heartbeat timeout handling to check and set `queued_destroy_connection_`. Also added logic to avoid removing the connection from `client_conns_` if the connection was already destroyed.

Jira: DB-10612
Original commit: 136b6e4 / D33685
Test Plan: Jenkins
Reviewers: bogdan, mbautin, rthallam
Reviewed By: rthallam
Subscribers: ybase
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D34294
Jira Link: DB-10612
Description
Tried on version 2.23.00-b65
Logs: https://drive.google.com/file/d/1mgQ4cbpVThLNgCoS4HObGKN3uwmGO_hp/view?usp=sharing
Encountered the following fatal error while running the cross-DB DDLs test with PITR and Backup/Restore.
Test details:
Observed a coredump as well
G-flags:
Issue Type
kind/bug