[commitlog] Add setter functions for the onFlush callback #4045
Merged: marcushill merged 8 commits into m3db:master from marcushill:fix-commitlog-writewait-bug on Jan 12, 2022
This commit adds a setOnFlush method to both the commitLogWriter and chunkWriter interfaces. The method is needed to work around a subtle bug in the WriteWait commitlog strategy that causes the pendingFlushFns callbacks (and therefore the return of commitLog.Write) to lag. After the first call to RotateLogs, onFlush ends up reading from the wrong list of pendingFlushFns. This is hard to detect because it happens after the data has already been flushed to the correct file on disk, so it only affects completion reporting. No test catches it because the next call to RotateLogs (or Close) resolves the problem, and because we mostly test the default WriteBehind strategy rather than the WriteWait strategy.
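The failure mode described above can be sketched with a toy model. This is not the m3db code: the `writer`, `slot`, and `commitLog` types, the `rebind` flag, and the way rotation swaps writers are all assumptions made for illustration. Only the names `setOnFlush`, `onFlush`, `pendingFlushFns` (modelled here as `pending`), and `RotateLogs` come from the description. The sketch shows a callback bound once at construction that keeps draining the list it was originally bound to, and how rebinding it through a setter after rotation restores timely completion.

```go
package main

import "fmt"

// flushFn reports the outcome of one pending write (hypothetical type).
type flushFn func(err error)

// writer is a toy stand-in for a commit log file writer.
type writer struct {
	onFlush func(err error)
}

// setOnFlush rebinds the callback invoked after each flush.
func (w *writer) setOnFlush(f func(err error)) { w.onFlush = f }

// flush pretends the buffered data reached disk, then reports completion.
func (w *writer) flush() { w.onFlush(nil) }

// slot pairs a writer with the callbacks waiting on its next flush.
type slot struct {
	w       *writer
	pending []flushFn
}

// drain returns a callback that completes and clears s.pending.
func (s *slot) drain() func(err error) {
	return func(err error) {
		for _, fn := range s.pending {
			fn(err)
		}
		s.pending = nil
	}
}

type commitLog struct {
	primary, secondary *slot
	rebind             bool // true models the fix (assumption of this sketch)
}

func newCommitLog(rebind bool) *commitLog {
	c := &commitLog{
		primary:   &slot{w: &writer{}},
		secondary: &slot{w: &writer{}},
		rebind:    rebind,
	}
	// Callbacks are bound once, at construction time.
	c.primary.w.setOnFlush(c.primary.drain())
	c.secondary.w.setOnFlush(c.secondary.drain())
	return c
}

// write registers fn to run once the primary writer flushes.
func (c *commitLog) write(fn flushFn) {
	c.primary.pending = append(c.primary.pending, fn)
}

// rotateLogs flushes the primary writer, then swaps the two writers.
func (c *commitLog) rotateLogs() {
	c.primary.w.flush()
	c.primary.w, c.secondary.w = c.secondary.w, c.primary.w
	if c.rebind {
		// Point each writer's callback at the slot it now serves.
		c.primary.w.setOnFlush(c.primary.drain())
		c.secondary.w.setOnFlush(c.secondary.drain())
	}
}

// completedAfterRotate counts how many of n writes issued after one
// rotation are reported complete by the next flush.
func completedAfterRotate(rebind bool, n int) int {
	c := newCommitLog(rebind)
	c.rotateLogs()
	done := 0
	for i := 0; i < n; i++ {
		c.write(func(error) { done++ })
	}
	c.primary.w.flush()
	return done
}

func main() {
	fmt.Println("stale callback:", completedAfterRotate(false, 3))
	fmt.Println("rebound callback:", completedAfterRotate(true, 3))
}
```

In this model the stale callback drains the other slot's empty list, so the three writes stay pending even though the flush itself happened. A second rotation swaps the writers back, and the flush after it drains the stranded callbacks, which is consistent with the lag resolving itself on the next RotateLogs.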
marcushill force-pushed the fix-commitlog-writewait-bug branch from 2b124bd to a2a4945 on January 6, 2022 15:29
arnikola approved these changes on Jan 6, 2022
LGTM but may want a second opinion, not too familiar with this path here
robskillington approved these changes on Jan 12, 2022
LGTM
Special notes for your reviewer:
I was unable to come up with a satisfactory test for this issue since it resolves itself. We discovered it because we have a scenario that depends on the implied guarantee that, in the WriteWait strategy, all writes pending before RotateLogs will eventually be flushed and return either success or failure after RotateLogs. Instead, we found that we were blocked waiting for Write to return.
Does this PR introduce a user-facing and/or backwards incompatible change?:
Does this PR require updating code package or user-facing documentation?: