-
Notifications
You must be signed in to change notification settings - Fork 876
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(rdb_load): EnsureRead(min) requesting more bytes than min #2604
Conversation
70379f6
to
37c535f
Compare
37c535f
to
e2e97fe
Compare
@dranikpg I was wrong, there is no other bug other than the one I fix here (the flake I was looking at was my bug once I was trying to reproduce this one). I rerun the test 800 times without any failures. We should be good. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍🏻
src/server/rdb_load.cc
Outdated
} | ||
|
||
error_code RdbLoaderBase::EnsureReadInternal(size_t min_sz) { | ||
DCHECK_LT(mem_buf_->InputLen(), min_sz); | ||
DCHECK_LT(mem_buf_->InputLen(), min_sz + mem_buf_->InputLen()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is awalys true for unsigned integers 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's just that the meaning of the variable changed now, it's not min_sz (as min required size), but min_to_read (as to minimally should be read), so the fix could have also been applied here
…flydb#2604) * fixes EnsureRead(min) to request exactly min bytes
This is the bug we saw in a few regression tests (redis replication, rotating master).
The issue is that
EnsureRead(num)
did not include the bytes already read in theInputBuffer
and in some rare cases ended up blocking forever. An example of that is the last 8 empty bytes after aSendFullSyncCut()
. Replica node would receiveRDB_OPCODE_FULLSYNC_END
and theInputBuffer
would already contain n out of 8 empty bytes. If let's say n=4,EnsureRead(8)
would request 8 bytes when in fact we needed 4 more leading to a deadlock (because full sync would not proceed).