-
Notifications
You must be signed in to change notification settings - Fork 6.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prevent manual flush hanging in read-only mode #4615
Prevent manual flush hanging in read-only mode #4615
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
db/db_impl_compaction_flush.cc
Outdated
@@ -1543,6 +1543,12 @@ Status DBImpl::WaitUntilFlushWouldNotStallWrites(ColumnFamilyData* cfd, | |||
WriteStallCondition write_stall_condition = WriteStallCondition::kNormal; | |||
do { | |||
if (write_stall_condition != WriteStallCondition::kNormal) { | |||
// If the DB is in read-only mode and not coming back, no point waiting. | |||
if (error_handler_.IsBGWorkStopped() && | |||
!error_handler_.IsRecoveryInProgress()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be enough to check for IsBGWorkStopped(). There is no guarantee that a recovery, even if in progress, would finish in a timely manner. For example, if recovery is being attempted due to a NoSpace() error, it could go on forever if not enough space is freed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, that should be consistent with how we stop user writes vs return errors to writers.
@ajkr has updated the pull request. Re-import the pull request |
@ajkr has updated the pull request. Re-import the pull request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. LGTM
The logic to wait for stall conditions to clear before beginning a manual flush didn't take into account whether the DB was in read-only mode. In read-only mode the stall conditions would never clear since no background work is happening, so the wait would be never-ending. It's probably better to return an error to the user.
Test Plan: