[commitlog] Add setter functions for the onFlush callback #4045

Merged 8 commits on Jan 12, 2022

Conversation

marcushill
Collaborator

What this PR does / why we need it:

This commit adds a setOnFlush method to both the commitLogWriter and chunkWriter interfaces.

This function is needed to work around a subtle bug in the WriteWait commitlog strategy
that causes the pendingFlushFns callbacks to fire late, which in turn delays commitLog.Write from returning.

The issue is that after the first call to RotateLogs, when onFlush is called, it ends up reading
from the wrong list of pendingFlushFns. This is hard to detect because it happens after the data has
already been flushed to the correct file on disk, so it only affects the reporting of completion.

It is not caught by any tests because the next call to RotateLogs (or Close) will resolve the problem.
Additionally, we mostly test the default WriteBehind strategy rather than the WriteWait strategy.
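
To make the failure mode concrete, here is a minimal sketch of how a flush callback fixed at construction time can drain the wrong pending list after a rotation, and how a setter lets each writer be rebound to its own list. This is a simplified model, not the actual M3 code: `slot`, `drain`, `newCommitLog`, `write`, `rotate`, and `completedAfter` are hypothetical names, and only `setOnFlush`, `onFlush`, and the idea of pending flush functions come from the description above.

```go
package main

// flushFn is a hypothetical completion callback; a blocked Write returns once it fires.
type flushFn func(err error)

// writer stands in for a commit log writer that reports completed flushes.
type writer struct {
	onFlush func(err error)
}

func newWriter(onFlush func(err error)) *writer { return &writer{onFlush: onFlush} }

// setOnFlush replaces the callback chosen at construction time.
func (w *writer) setOnFlush(f func(err error)) { w.onFlush = f }

// flush pretends the data reached disk and reports completion.
func (w *writer) flush() { w.onFlush(nil) }

// slot (hypothetical) pairs a writer with the callbacks waiting on its next flush.
type slot struct {
	w       *writer
	pending []flushFn
}

type commitLog struct {
	primary, secondary *slot
}

// drain fires and clears every callback queued on s.
func drain(s *slot, err error) {
	fns := s.pending
	s.pending = nil
	for _, fn := range fns {
		fn(err)
	}
}

func newCommitLog(rebind bool) *commitLog {
	l := &commitLog{}
	// Shared callback: drains whichever slot is primary at the time it runs.
	shared := func(err error) { drain(l.primary, err) }
	l.primary = &slot{w: newWriter(shared)}
	l.secondary = &slot{w: newWriter(shared)}
	if rebind {
		// Assumed fix: bind each writer's callback to its own slot.
		for _, s := range []*slot{l.primary, l.secondary} {
			s := s
			s.w.setOnFlush(func(err error) { drain(s, err) })
		}
	}
	return l
}

// write queues a completion callback on the primary slot (WriteWait-style).
func (l *commitLog) write(done flushFn) {
	l.primary.pending = append(l.primary.pending, done)
}

// rotate swaps the slots, then flushes the writer that was just retired.
func (l *commitLog) rotate() {
	l.primary, l.secondary = l.secondary, l.primary
	l.secondary.w.flush()
}

// completedAfter issues the given number of writes, rotates the given number
// of times, and returns how many writes were reported complete.
func completedAfter(rebind bool, writes, rotations int) int {
	l := newCommitLog(rebind)
	completed := 0
	for i := 0; i < writes; i++ {
		l.write(func(error) { completed++ })
	}
	for i := 0; i < rotations; i++ {
		l.rotate()
	}
	return completed
}
```

In this model, `completedAfter(false, 3, 1)` reports no completions because the retired writer's callback drains the new, empty primary list. A second rotation swaps the slots back and the stuck callbacks finally fire, which mirrors why the problem resolves itself. With `rebind` set to true, all three writes complete after the first rotation.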

Special notes for your reviewer:
I was unable to come up with a satisfactory test for this issue, since it resolves itself. We discovered it in a scenario that depends on an implied guarantee of the WriteWait strategy: every write issued before RotateLogs is eventually flushed and returns either success or failure after RotateLogs. Instead, we found that we were blocked waiting for Write to return.

Does this PR introduce a user-facing and/or backwards incompatible change?:


Does this PR require updating code package or user-facing documentation?:


@marcushill marcushill marked this pull request as ready for review January 6, 2022 15:46
Collaborator

@arnikola arnikola left a comment

LGTM, but you may want a second opinion; I'm not too familiar with this code path.

Collaborator

@robskillington robskillington left a comment

LGTM

@marcushill marcushill enabled auto-merge (squash) January 12, 2022 14:01
@marcushill marcushill merged commit 424b0ed into m3db:master Jan 12, 2022
3 participants