Set IsReplicating on APPENDLOG init handshake to prevent idle resync loop #1828

Merged
vazois merged 3 commits into main from vazois/fix-reconnect on May 27, 2026

@nattress
Member

After a TCP disruption on an idle cluster, the replica's EnsureReplication
would trigger repeated resyncs because IsReplicating was only set when a
data APPENDLOG arrived, not on the init handshake (-1,-1,-1). On an idle
stream no data APPENDLOG follows, so the flag stayed false and
EnsureReplication fired every ClusterReplicationReestablishmentTimeout.

Move IsReplicating = true to the top of NetworkClusterAppendLog so it is
set on every APPENDLOG message including the init handshake. The flag is
cleared implicitly when the session is disposed on connection drop.
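
The failure mode described above can be reproduced in a toy model. The sketch below is in Python rather than the project's C#. Apart from the (-1,-1,-1) handshake and the IsReplicating flag, everything in it is a hypothetical stand-in: `ReplicaSession`, `count_resyncs`, the tick loop that plays the role of EnsureReplication firing every ClusterReplicationReestablishmentTimeout, and the shape of the data message. It illustrates the reasoning only and is not the actual NetworkClusterAppendLog code.

```python
from dataclasses import dataclass

# Init handshake marker for APPENDLOG, as described in the PR text.
INIT_HANDSHAKE = (-1, -1, -1)


@dataclass
class ReplicaSession:
    """Hypothetical stand-in for a replica-side cluster session."""
    set_flag_on_handshake: bool   # True models the fix, False the old behaviour
    is_replicating: bool = False  # a new session always starts with the flag cleared

    def on_append_log(self, message):
        """Handle one APPENDLOG message (handshake or data)."""
        if self.set_flag_on_handshake:
            # Fixed behaviour: every APPENDLOG marks the stream as active.
            self.is_replicating = True
        if message == INIT_HANDSHAKE:
            return "OK"
        # Old behaviour: only a data APPENDLOG reaches this assignment.
        self.is_replicating = True
        return "OK"


def count_resyncs(set_flag_on_handshake, timer_ticks, data_messages=0):
    """Count resyncs over `timer_ticks` firings of the re-establishment check."""
    session = ReplicaSession(set_flag_on_handshake)
    session.on_append_log(INIT_HANDSHAKE)
    for i in range(data_messages):
        # Placeholder data payload; the real message layout is not modelled.
        session.on_append_log((0, i, i + 1))

    resyncs = 0
    for _ in range(timer_ticks):
        if not session.is_replicating:
            resyncs += 1
            # A resync replaces the session, which receives a new handshake.
            session = ReplicaSession(set_flag_on_handshake)
            session.on_append_log(INIT_HANDSHAKE)
    return resyncs


if __name__ == "__main__":
    print("old behaviour, idle stream:", count_resyncs(False, timer_ticks=5))
    print("fixed behaviour, idle stream:", count_resyncs(True, timer_ticks=5))
```

In this model the old behaviour resyncs on every tick while the stream stays idle, and a single data message is enough to stop the loop, which matches the observation that the problem only appeared on idle clusters.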

Copilot AI review requested due to automatic review settings May 26, 2026 23:00
Contributor

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adjusts replication-session semantics so IsReplicating is set for the replication stream starting with the APPENDLOG init handshake, preventing unnecessary resyncs when the stream is idle.

Changes:

  • Updated the IClusterSession.IsReplicating XML doc to reflect “active replication stream” semantics (including the init handshake).
  • Moved APPENDLOG “initialization message” handling into the replica path and set IsReplicating = true before the init handshake returns OK.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| libs/server/Cluster/IClusterSession.cs | Clarifies IsReplicating contract to include init handshake as part of active replication stream. |
| libs/cluster/Session/RespClusterReplicationCommands.cs | Marks sessions as replicating earlier and handles init handshake in the replica branch to avoid spurious resyncs. |

Comment thread libs/cluster/Session/RespClusterReplicationCommands.cs
Comment thread libs/cluster/Session/RespClusterReplicationCommands.cs
vazois and others added 3 commits May 27, 2026 10:17
…loop

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Validate that the node is a replica of the expected primary before
initializing the ReplicaReplayDriver or setting IsReplicating. There is
no point in performing initialization work if the role/primary checks
would throw immediately after.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
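
The ordering argument in the commit message above (validate the role and primary first, initialize afterwards) can be sketched as follows. This is a Python toy with hypothetical names throughout (`ToyReplica`, `NotReplicaOfPrimaryError`, the string role values, and `replay_driver_initialized` standing in for ReplicaReplayDriver setup). It assumes nothing about the real signatures in RespClusterReplicationCommands.cs.

```python
class NotReplicaOfPrimaryError(Exception):
    """Hypothetical error for a failed role/primary check."""


class ToyReplica:
    """Hypothetical node state; not the real session or cluster types."""

    def __init__(self, role, primary_id):
        self.role = role
        self.primary_id = primary_id
        self.is_replicating = False
        self.replay_driver_initialized = False

    def on_append_log(self, sender_id):
        # 1. Validate first: a failed check must leave the session untouched.
        if self.role != "replica" or self.primary_id != sender_id:
            raise NotReplicaOfPrimaryError(
                f"node is not a replica of {sender_id!r}"
            )
        # 2. Only now do the initialization work and mark the stream active.
        self.replay_driver_initialized = True
        self.is_replicating = True
        return "OK"


if __name__ == "__main__":
    node = ToyReplica(role="replica", primary_id="primary-A")
    try:
        node.on_append_log("primary-B")
    except NotReplicaOfPrimaryError as err:
        print("rejected:", err, "| is_replicating =", node.is_replicating)
    print(node.on_append_log("primary-A"), "| is_replicating =", node.is_replicating)
```

With the checks placed first, a rejected message has no side effects, so the flag cannot be left set on a session that was never a valid replication stream.
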
vazois force-pushed the vazois/fix-reconnect branch from 1cb8c92 to 2cb87ea on May 27, 2026 17:19
vazois merged commit 180f4e6 into main on May 27, 2026
268 of 269 checks passed
vazois deleted the vazois/fix-reconnect branch on May 27, 2026 18:55
4 participants