fix(rdb_load): EnsureRead(min) requesting more bytes than min #2604

Merged
merged 4 commits into from
Feb 19, 2024

kostasrim
Contributor

@kostasrim kostasrim commented Feb 16, 2024

This is the bug we saw in a few regression tests (redis replication, rotating master).

The issue is that EnsureRead(num) did not account for the bytes already present in the InputBuffer, so in some rare cases it blocked forever. One example is the 8 empty bytes that follow a SendFullSyncCut(). The replica node would receive RDB_OPCODE_FULLSYNC_END while the InputBuffer already contained n of those 8 empty bytes. If, say, n=4, EnsureRead(8) would request 8 more bytes when only 4 more were needed, leading to a deadlock (because the full sync would not proceed).
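
To make the n=4 case concrete, here is a minimal sketch of the arithmetic. It is not the Dragonfly code: `FakeLoader`, `ReadAtLeast`, `EnsureReadBuggy` and `EnsureReadFixed` are hypothetical stand-ins, and a read that would block forever is modeled as returning `false`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the loader's input buffer plus its socket.
struct FakeLoader {
  size_t buffered;     // bytes already sitting in the input buffer
  size_t on_the_wire;  // bytes the peer will still send, and no more

  // Models a blocking "read at least n bytes". Returns false in the case
  // where a real blocking read would wait forever for data that never comes.
  bool ReadAtLeast(size_t n) {
    if (on_the_wire < n) return false;
    buffered += on_the_wire;
    on_the_wire = 0;
    return true;
  }
};

// Before the fix: asks the source for min_sz bytes, ignoring what is buffered.
bool EnsureReadBuggy(FakeLoader& l, size_t min_sz) {
  if (l.buffered >= min_sz) return true;
  return l.ReadAtLeast(min_sz);
}

// After the fix: asks only for the bytes that are still missing.
bool EnsureReadFixed(FakeLoader& l, size_t min_sz) {
  if (l.buffered >= min_sz) return true;
  return l.ReadAtLeast(min_sz - l.buffered);
}
```

With 4 bytes buffered and 4 more on the wire, `EnsureReadBuggy(loader, 8)` requests 8 bytes that will never arrive, while `EnsureReadFixed(loader, 8)` requests 4 and completes.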

@kostasrim kostasrim self-assigned this Feb 16, 2024
@kostasrim kostasrim marked this pull request as ready for review February 18, 2024 14:57
@kostasrim
Contributor Author

@dranikpg I was wrong: there is no bug other than the one I fix here (the flake I was looking at was a bug I introduced myself while trying to reproduce this one).

I reran the test 800 times without any failures. We should be good.

dranikpg
dranikpg previously approved these changes Feb 19, 2024
Contributor

@dranikpg dranikpg left a comment

👍🏻

 }

 error_code RdbLoaderBase::EnsureReadInternal(size_t min_sz) {
-  DCHECK_LT(mem_buf_->InputLen(), min_sz);
+  DCHECK_LT(mem_buf_->InputLen(), min_sz + mem_buf_->InputLen());

This is always true for unsigned integers 🙂
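
A small sketch of why the check is vacuous, assuming both operands are `size_t`. `CheckHolds` is a hypothetical helper that mirrors the condition in the DCHECK. It only evaluates to false when `min_sz` is 0 or the sum wraps around, so it cannot catch a caller that passes the wrong size.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical helper mirroring DCHECK_LT(input_len, min_sz + input_len).
bool CheckHolds(size_t input_len, size_t min_sz) {
  // Unsigned addition wraps modulo 2^N, so this is well-defined for all inputs.
  return input_len < min_sz + input_len;
}
```

For example, `CheckHolds(100, 8)` is true even though the buffer already holds far more than 8 bytes, which is exactly the situation the original check was meant to exclude.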


It's just that the meaning of the variable has changed: it is no longer min_sz (the minimum required size) but min_to_read (the minimum number of bytes that still has to be read), so the fix could also have been applied here.
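
The two placements of the fix can be sketched side by side. This is an illustration only: `RequestCallerSubtracts` and `RequestHelperSubtracts` are hypothetical names, `assert` stands in for DCHECK, and each function returns the number of bytes that would be requested from the source.

```cpp
#include <cassert>
#include <cstddef>

// Variant A: the caller subtracts, so the helper receives min_to_read.
// The helper can no longer compare its argument against the buffered length.
size_t RequestCallerSubtracts(size_t input_len, size_t min_sz) {
  size_t min_to_read = min_sz - input_len;  // done by the caller
  assert(min_to_read > 0);
  return min_to_read;
}

// Variant B: the helper keeps the min_sz meaning and subtracts internally,
// so the original precondition (buffer is short of min_sz) stays meaningful.
size_t RequestHelperSubtracts(size_t input_len, size_t min_sz) {
  assert(input_len < min_sz);
  return min_sz - input_len;
}
```

Both variants request the same number of bytes. They differ only in which function owns the subtraction and which precondition can still be asserted.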

@kostasrim kostasrim merged commit 58dda3b into main Feb 19, 2024
10 checks passed
@kostasrim kostasrim deleted the fix_ensure_read branch February 19, 2024 12:41
lsvmello pushed a commit to lsvmello/dragonfly that referenced this pull request Feb 19, 2024