stable sync replication dcheck fails over long latency connections #3112

Closed
romange opened this issue Jun 1, 2024 · 3 comments

@romange
Collaborator

romange commented Jun 1, 2024

Describe the bug
Over unreliable, high-latency connections, the replica crashes with: tx_executor.cc:116] Check failed: tx_data.lsn == *lsn_

To Reproduce

  1. Run DF on a cloud machine and make its port 6379 accessible from your local machine.
  2. Run DF on your local machine and configure it as a replica of the cloud DF.
  3. Run memtier on the cloud machine (DF master) with enough write traffic: memtier_benchmark -c 10 --threads 4 --ratio 1:0 --key-maximum 100000000 -d 100 -n 30000 --hide-histogram --distinct-client-seed
  4. At some point, the master DF (on the cloud machine) will print: dflycmd.cc:273] Partial sync requested from stale LSN...
  5. The replica will crash around this time.
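
The commands from steps 2 and 3 can be collected in a small helper so the reproduction is easy to repeat. This is only a sketch: the issue does not say how the replica was configured, so using `redis-cli ... REPLICAOF` is an assumption, and the host name `cloud-host.example` is a hypothetical placeholder. The memtier arguments are copied verbatim from step 3.

```python
import shlex

MASTER_PORT = 6379  # port named in step 1


def replicaof_command(master_host, master_port=MASTER_PORT, replica_port=6379):
    """Build the redis-cli call for step 2 (assumed mechanism, not from the issue)."""
    return [
        "redis-cli", "-p", str(replica_port),
        "REPLICAOF", master_host, str(master_port),
    ]


def memtier_command():
    """Write-only load from step 3, copied verbatim from the issue."""
    return shlex.split(
        "memtier_benchmark -c 10 --threads 4 --ratio 1:0 "
        "--key-maximum 100000000 -d 100 -n 30000 "
        "--hide-histogram --distinct-client-seed"
    )


if __name__ == "__main__":
    # Hypothetical host name; replace with the cloud machine's address.
    print(shlex.join(replicaof_command("cloud-host.example")))
    print(shlex.join(memtier_command()))
```

Run the first printed command on the local machine and the second on the cloud machine; the helper only builds the argument lists and does not start anything itself.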
@romange romange added the bug Something isn't working label Jun 1, 2024
@adiholden adiholden assigned BorysTheDev and unassigned adiholden Jul 3, 2024
@adiholden
Collaborator

I think that #3686 fixes this, but we need to verify.
When finalizing the full sync, the journal LSN was sent before calling the journal callback unregister, which can block. More commands keep writing to the journal until the unregister returns, so the replica might receive a wrong journal offset when the full sync finishes.
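
The ordering problem described above can be illustrated with a toy model. This is not Dragonfly's code: the `Journal` class, both `finalize_*` functions, and the `writes_while_blocked` parameter are hypothetical names, and the model assumes the LSN advances by one per journal write. It only shows why reading the LSN before a blocking unregister can report a stale offset.

```python
class Journal:
    """Toy journal: every write appends one entry and advances the LSN."""

    def __init__(self):
        self.lsn = 0
        self.callback_registered = True

    def write(self):
        self.lsn += 1

    def unregister_callback(self, writes_while_blocked):
        # Models a blocking unregister: other writers keep appending
        # to the journal until the call returns.
        for _ in range(writes_while_blocked):
            self.write()
        self.callback_registered = False


def finalize_full_sync_early_read(journal, writes_while_blocked):
    """Reads the LSN first, then unregisters (the ordering described above)."""
    reported = journal.lsn
    journal.unregister_callback(writes_while_blocked)
    return reported


def finalize_full_sync_late_read(journal, writes_while_blocked):
    """Unregisters first, then reads the LSN once no more writes slip in."""
    journal.unregister_callback(writes_while_blocked)
    return journal.lsn


if __name__ == "__main__":
    j = Journal()
    for _ in range(100):
        j.write()
    reported = finalize_full_sync_early_read(j, writes_while_blocked=5)
    print(f"reported={reported} actual={j.lsn}")  # reported=100 actual=105
```

In this model the early read reports an offset that is behind the journal by however many writes landed during the blocking call, which is the kind of mismatch a check like `tx_data.lsn == *lsn_` would trip on.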

@romange
Collaborator Author

romange commented Sep 11, 2024

I agree that someone needs to run this.

@BorysTheDev
Contributor

I've run it a couple of times and everything works well.
