-
Notifications
You must be signed in to change notification settings - Fork 6.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid Keeper crash on shutdown (fix test_keeper_snapshot_on_exit
)
#44908
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@antonio2368 thank you for the fix!
@@ -523,7 +523,7 @@ void KeeperStateMachine::processReadRequest(const KeeperStorage::RequestForSessi | |||
true /*check_acl*/, | |||
true /*is_local*/); | |||
for (const auto & response : responses) | |||
if (!responses_queue.push(response)) | |||
if (!responses_queue.push(response) && !responses_queue.isFinished()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it is better to move this check out from loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The question: is it possible that the queue is finished between the loop iterations? If so it's better to add:
if (responses_queue.isFinished())
break;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To address this and the comment above, the biggest problem is the NuRaft.
If I throw an exception from one of its threads, std::abort
happens (that's what caused the problem), that lib doesn't handle exceptions nicely as CH. Assume all methods here are called by NuRaft and not CH.
The queue can be finished in between the loop iterations but it can also be finished between the checks:
if finished -> finished is false
break;
if (!push(...)) -> finished is true, i.e we have same problem
I can add the check for break to not do unnecessary iterations as @novikd suggested, but because I would still need to keep the check after an unsuccessful push I wouldn't change much code-wise.
So I would like to merge this to fix CI ASAP and then polish it.
@@ -248,7 +248,7 @@ nuraft::ptr<nuraft::buffer> KeeperStateMachine::commit(const uint64_t log_idx, n | |||
session_id = storage->getSessionID(session_id_request.session_timeout_ms); | |||
LOG_DEBUG(log, "Session ID response {} with timeout {}", session_id, session_id_request.session_timeout_ms); | |||
response->session_id = session_id; | |||
if (!responses_queue.push(response_for_session)) | |||
if (!responses_queue.push(response_for_session) && !responses_queue.isFinished()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it is better to move this check to the beginning of the function?
Also what will be if exception will be thrown if queue is finished? Or at least some log message maybe?
@antonio2368 what do you think?
@@ -523,7 +523,7 @@ void KeeperStateMachine::processReadRequest(const KeeperStorage::RequestForSessi | |||
true /*check_acl*/, | |||
true /*is_local*/); | |||
for (const auto & response : responses) | |||
if (!responses_queue.push(response)) | |||
if (!responses_queue.push(response) && !responses_queue.isFinished()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The question: is it possible that the queue is finished between the loop iterations? If so it's better to add:
if (responses_queue.isFinished())
break;
Changelog category (leave one):
Fix the issue found by
test_keeper_snapshot_on_exit
after #44881 was merged.