Backport bug fixes for a v1.1.1 release #2160
Merged
In large-scale deployments, we observed the disk monitor's erase request returning a timeout error, which cancelled the ongoing request. This is a two-fold bug fix. First, we must not use a timeout for process-internal actor communication, especially not for complicated nested loops that leave the actor in an undefined state when they are partially executed and never resumed. Second, we must treat erase requests with high priority, because they should never be held up by queued requests on the read or write path.
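To illustrate the second point, here is a minimal sketch in Python. It is not the code changed in this PR and does not use the actor framework; the `Mailbox` class, the `HIGH`/`NORMAL` constants, and the message names are all hypothetical. It only shows the general idea that a high-priority erase request is handled before read and write requests that were queued earlier.

```python
import heapq
import itertools

# Hypothetical priority levels: the lower value is dequeued first.
HIGH, NORMAL = 0, 1


class Mailbox:
    """Toy priority mailbox: high-priority messages overtake normal ones."""

    def __init__(self):
        self._heap = []
        # Monotonic counter used as a tie-breaker so that messages with the
        # same priority keep their arrival (FIFO) order.
        self._seq = itertools.count()

    def push(self, message, priority=NORMAL):
        heapq.heappush(self._heap, (priority, next(self._seq), message))

    def pop(self):
        _, _, message = heapq.heappop(self._heap)
        return message

    def __len__(self):
        return len(self._heap)


def drain(mailbox):
    """Handle every queued message and return the order of handling."""
    handled = []
    while mailbox:
        handled.append(mailbox.pop())
    return handled


mailbox = Mailbox()
mailbox.push("write-1")
mailbox.push("read-1")
mailbox.push("erase", priority=HIGH)  # arrives third, handled first
mailbox.push("write-2")
order = drain(mailbox)
print(order)  # ['erase', 'write-1', 'read-1', 'write-2']
```

With a single normal-priority queue, the erase request would instead wait behind `write-1` and `read-1`, which is the kind of delay the description says erase requests should never be subject to.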
Co-authored-by: Benno Evers <benno.evers@tenzir.com>
lava
approved these changes
Mar 24, 2022
Changes look good. There are no unit tests, but we verified manually that the changes work and improve disk monitor behavior on the testbed.
If these changes do not fix the bug, a release containing them will at the very least help us track down the actual source of the issue, because it will no longer be shadowed by the request timeout error.
📝 Checklist
🎯 Review Instructions
Run on our testbed.