Fix copy-paste bug in disk-based replica sync sending wrong AOF address #1633
Merged
vazois merged 1 commit into vazois/mmrt-dev on Mar 16, 2026
…k-based replica sync

In TryReplicateDiskbasedSync, ExecuteClusterInitiateReplicaSync was sending beginAddress.Span for both the aofBeginAddress and aofTailAddress parameters. This was introduced in commit 6fb99e5 when converting from ToByteArray() to Span-based calls.

The primary uses the replica's tail address to compute the AOF sync replay range. With both parameters being the begin address, the primary couldn't determine where the replica's AOF actually ended, causing the replica to never receive AOF records and remain stuck at offset 64 (kFirstValidAofAddress).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Problem
`ClusterSRNoCheckpointRestartSecondary` tests fail because the replica remains stuck at AOF offset 64 (`kFirstValidAofAddress`) after restart, never syncing with the primary.

Error:
[127.0.0.1:7000]: 1,140288 != [127.0.0.1:7001]: 1,64

Root Cause
Copy-paste bug in `ReplicaDiskbasedSync.TryReplicateDiskbasedSync()` introduced in commit `6fb99e58` when converting from `ToByteArray()` to `Span`-based calls. The `aofTailAddress` parameter was accidentally set to `beginAddress.Span` instead of `tailAddress.Span`.

The primary uses the replica's tail address in `ComputeAofSyncReplayAddress()` to determine the AOF sync replay range. With both parameters being the begin address, the primary couldn't determine where the replica's AOF ended, so the replica never received AOF records after restart.

Fix
One-line fix: change `beginAddress.Span` to `tailAddress.Span` for the `aofTailAddress` parameter of `ExecuteClusterInitiateReplicaSync`.

Testing
All 4 `ClusterReplicationTLS.ClusterSRNoCheckpointRestartSecondary` variants now pass.
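
For reviewers who want to see why this class of bug compiles cleanly, here is a minimal, self-contained sketch. It is not the actual Garnet code: `InitiateReplicaSync`, `Encode`, the 8-byte little-endian encoding, and the tail value 4096 are all assumptions made for illustration. Only the parameter names `aofBeginAddress` and `aofTailAddress`, the `.Span` accessors, and the offset 64 come from this PR. Because both parameters have the same span type, passing the begin address twice is accepted by the compiler, and the receiver sees a tail equal to the begin.

```csharp
using System;
using System.Buffers.Binary;

// Illustrative values only. 64 mirrors kFirstValidAofAddress from the failing
// test output; 4096 is an arbitrary tail chosen for this sketch.
Memory<byte> beginAddress = Encode(64);
Memory<byte> tailAddress = Encode(4096);

// Buggy call shape: the begin address is passed for both parameters.
var buggy = InitiateReplicaSync(beginAddress.Span, beginAddress.Span);

// Fixed call shape: the tail parameter receives the tail address.
var fixedRange = InitiateReplicaSync(beginAddress.Span, tailAddress.Span);

Console.WriteLine($"buggy: begin={buggy.Begin}, tail={buggy.Tail}");           // buggy: begin=64, tail=64
Console.WriteLine($"fixed: begin={fixedRange.Begin}, tail={fixedRange.Tail}"); // fixed: begin=64, tail=4096

// Hypothetical helper: serialize an address to 8 bytes (encoding is an assumption).
byte[] Encode(long address)
{
    var buffer = new byte[sizeof(long)];
    BinaryPrimitives.WriteInt64LittleEndian(buffer, address);
    return buffer;
}

// Hypothetical stand-in for the receiving side of the sync request. It only
// decodes what it was sent; it does not model ComputeAofSyncReplayAddress().
(long Begin, long Tail) InitiateReplicaSync(
    ReadOnlySpan<byte> aofBeginAddress,
    ReadOnlySpan<byte> aofTailAddress)
{
    return (BinaryPrimitives.ReadInt64LittleEndian(aofBeginAddress),
            BinaryPrimitives.ReadInt64LittleEndian(aofTailAddress));
}
```

In the buggy call the decoded tail equals the begin address, which is consistent with the symptom described above: the primary has no way to learn where the replica's AOF ends.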