Reject snapshot-less join against primary holding a more recent snapshot by Copilot · Pull Request #7844 · microsoft/CCF

Copilot · 2026-04-28T10:58:43Z

The primary should express an opinion about what snapshot is acceptable. Currently this opinion is "at least my startup_seqno".
To fix #7835 we should have a two-step system.
On the first try, the primary should require the joiner to have a snapshot at least as recent as the primary's latest snapshot.
On subsequent tries, the primary should fall back to the safety constraint of at least its startup_seqno.

Copilot generated description:

A joiner with startup_seqno == 0 was accepted by an original (also startup_seqno == 0) primary even when that primary held committed snapshots, causing the joiner to replay the entire ledger.

Changes

- Join request: new optional `retry_count` field on `JoinNetworkNodeToNode::In`, incremented by the joiner each time a `StartupSeqnoIsOld` response triggers a snapshot fetch + retry.
- Primary-side check (`node_frontend.h` accept handler):
  - `retry_count == 0`: require `startup_seqno >= max(this_startup_seqno, latest_on_disk_snapshot_seqno)`. The `max` keeps the existing behaviour when this node holds no snapshot.
  - `retry_count > 0`: fall back to the prior `startup_seqno >= this_startup_seqno` check, so the joiner doesn't chase a snapshot committed mid-fetch.
- Latest snapshot source: read from disk via `NodeConfigurationSubsystem` + `snapshots::find_latest_committed_snapshot_in_directory` (i.e. what the joiner could actually fetch from this node), not from the `SNAPSHOT_STATUS` KV table.
- Joiner (`NodeState`): tracks `join_retry_count`, bumped where the existing `StartupSeqnoIsOld` + `fetch_recent_snapshot` path is taken, and sent on the next request.
- Test: `run_join_no_snapshot_against_original_primary` in `tests/reconfiguration.py` starts a single-node network, generates a committed snapshot, then asserts:
  - join with `from_snapshot=False`, `fetch_recent_snapshot=False` is rejected with `StartupSeqnoIsOld` (previously silently replayed the ledger);
  - join with `from_snapshot=False`, `fetch_recent_snapshot=True` succeeds and the joiner reports a non-zero `startup_seqno`.
- CHANGELOG: new `[Unreleased]` / Fixed entry.
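
The primary-side rule described above can be modelled in a few lines. This is a minimal Python sketch of the decision only, not the C++ in `node_frontend.h`; the function names `min_acceptable_seqno` and `accept_join` are hypothetical, and a missing on-disk snapshot is represented here as `None`, which is an assumption about how "no snapshot" might be expressed.

```python
from typing import Optional


def min_acceptable_seqno(
    retry_count: int,
    this_startup_seqno: int,
    latest_on_disk_snapshot_seqno: Optional[int],
) -> int:
    """Lowest joiner startup_seqno the primary accepts (illustrative model)."""
    # First attempt: prefer the latest committed snapshot held on disk.
    # max() keeps the old behaviour when the primary holds no snapshot.
    if retry_count == 0 and latest_on_disk_snapshot_seqno is not None:
        return max(this_startup_seqno, latest_on_disk_snapshot_seqno)
    # Retries: fall back to the prior safety constraint only.
    return this_startup_seqno


def accept_join(
    joiner_startup_seqno: int,
    retry_count: int,
    this_startup_seqno: int,
    latest_on_disk_snapshot_seqno: Optional[int],
) -> bool:
    """True if the join request passes the startup_seqno check."""
    required = min_acceptable_seqno(
        retry_count, this_startup_seqno, latest_on_disk_snapshot_seqno
    )
    return joiner_startup_seqno >= required
```

Under this model, a snapshot-less joiner (`startup_seqno == 0`) is rejected on its first attempt by an original primary holding a snapshot at, say, seqno 42, while the same request with `retry_count == 1` is only checked against the primary's own `startup_seqno`.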

Copilot

Pull request overview

This PR hardens the node join protocol to prevent snapshot-less joiners from being accepted by an “original” primary that has already committed snapshots, avoiding full-ledger replay by requiring a sufficiently recent startup snapshot on the initial join attempt.

Changes:

- Add retry_count to the node-to-node join request and track it on the joiner to distinguish first-attempt joins from retries after snapshot fetch.
- Update primary-side join acceptance logic to compare the joiner’s startup_seqno against the latest committed snapshot on disk (first attempt only), falling back to the previous check on retries.
- Add an E2E regression test covering rejection of snapshot-less joins and successful join after fetching a snapshot; add a changelog entry.
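
The interaction between joiner and primary summarised above can be sketched as a small simulation. Everything here is illustrative: `FakePrimary`, `JoinRequest`, `join`, and the `advance_after_fetch` knob are hypothetical names, and the real joiner logic lives in `NodeState` in C++. The sketch assumes the joiner bumps `retry_count` exactly once per snapshot fetch.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class JoinRequest:
    startup_seqno: int
    retry_count: int = 0


class FakePrimary:
    """Toy stand-in for the primary's accept handler (assumption, not CCF code)."""

    def __init__(self, startup_seqno: int, snapshot_seqno: int,
                 advance_after_fetch: int = 0) -> None:
        self.startup_seqno = startup_seqno
        self.snapshot_seqno = snapshot_seqno
        self.advance_after_fetch = advance_after_fetch

    def handle_join(self, req: JoinRequest) -> str:
        required = self.startup_seqno
        if req.retry_count == 0:
            required = max(required, self.snapshot_seqno)
        return "OK" if req.startup_seqno >= required else "StartupSeqnoIsOld"

    def fetch_snapshot(self) -> int:
        fetched = self.snapshot_seqno
        # Optionally model a newer snapshot being committed mid-fetch.
        self.snapshot_seqno += self.advance_after_fetch
        return fetched


def join(primary: FakePrimary, fetch_recent_snapshot: bool,
         max_attempts: int = 3) -> Tuple[str, int]:
    """Return (final response, joiner startup_seqno)."""
    startup_seqno, retry_count = 0, 0
    for _ in range(max_attempts):
        resp = primary.handle_join(JoinRequest(startup_seqno, retry_count))
        if resp == "OK" or not fetch_recent_snapshot:
            return resp, startup_seqno
        # StartupSeqnoIsOld: fetch a snapshot, bump the counter, retry.
        startup_seqno = primary.fetch_snapshot()
        retry_count += 1
    return "StartupSeqnoIsOld", startup_seqno
```

With `advance_after_fetch` set, the primary commits a newer snapshot while the joiner is fetching; because the retry carries a non-zero `retry_count`, the joiner is still accepted instead of chasing the newer snapshot.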

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file

| File | Description |
| --- | --- |
| `src/node/rpc/node_call_types.h` | Adds optional `retry_count` to join request type. |
| `src/node/rpc/serialization.h` | Adds `retry_count` to JSON serialization for join request. |
| `src/node/node_state.h` | Tracks and sends `join_retry_count` when retrying join after snapshot fetch. |
| `src/node/rpc/node_frontend.h` | Implements primary-side min acceptable seqno check using latest on-disk committed snapshot on first attempt. |
| `tests/reconfiguration.py` | Adds regression test for joining without a snapshot against an original primary holding committed snapshots. |
| `CHANGELOG.md` | Adds `[Unreleased]` fixed entry describing the behavior change. |
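
To illustrate what "latest on-disk committed snapshot" could mean in practice, here is a rough Python sketch of a directory scan. It is not `snapshots::find_latest_committed_snapshot_in_directory`; the helper names are hypothetical, and the file naming pattern `snapshot_<seqno>_<evidence_seqno>.committed` is an assumption made for this example, not something stated in this PR.

```python
import os
import re
import tempfile
from typing import Optional

# Assumed naming convention for committed snapshot files (illustrative only).
_COMMITTED = re.compile(r"^snapshot_(\d+)_(\d+)\.committed$")


def latest_committed_snapshot_seqno(directory: str) -> Optional[int]:
    """Highest seqno among committed snapshot files, or None if there are none."""
    best: Optional[int] = None
    for name in os.listdir(directory):
        match = _COMMITTED.match(name)
        if match is None:
            continue  # Ignore uncommitted snapshots and unrelated files.
        seqno = int(match.group(1))
        if best is None or seqno > best:
            best = seqno
    return best


def demo() -> Optional[int]:
    """Build a throwaway directory with two committed and one uncommitted file."""
    with tempfile.TemporaryDirectory() as directory:
        for name in (
            "snapshot_10_11.committed",
            "snapshot_20_21.committed",
            "snapshot_30_31",  # Not committed, so it is skipped.
        ):
            open(os.path.join(directory, name), "w").close()
        return latest_committed_snapshot_seqno(directory)
```

Reading from disk rather than from a KV table matches the stated intent that the primary should only require what a joiner could actually fetch from it.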

Copilot · 2026-04-28T12:29:49Z

Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

https://api.github.com/graphql
- Triggering command: /usr/bin/gh gh pr view --json number (http block)

If you need me to access, download, or install something from one of these locations, you can either:

- Configure Actions setup steps to set up my environment, which run before the firewall is enabled
- Add the appropriate URLs or hosts to the custom allowlist in this repository's Copilot coding agent settings (admins only)

achamayou · 2026-04-28T16:58:35Z

+              ccf::errors::StartupSeqnoIsOld,
+              fmt::format(
+                "Node requested to join from seqno {} which is older than "
+                "this node's preferred recent snapshot seqno {} (the latest "


We should avoid apostrophes/single quotes in error messages and keep them compact:

"Joiner startup snapshot sequence number ({}) is lower than minimum acceptable value ({})", in.startup_seqno.value(), preferred_seqno
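
As a quick check of how the suggested wording reads once filled in, the same `{}` placeholders can be exercised with Python's `str.format`. This is only an illustration of the message text; the function name is hypothetical and the real code uses `fmt::format` in C++.

```python
def startup_seqno_too_old_message(joiner_seqno: int, min_acceptable: int) -> str:
    """Render the compact, apostrophe-free error text suggested in review."""
    return (
        "Joiner startup snapshot sequence number ({}) is lower than "
        "minimum acceptable value ({})"
    ).format(joiner_seqno, min_acceptable)
```

For a snapshot-less joiner against a primary whose minimum is 42, this renders as `Joiner startup snapshot sequence number (0) is lower than minimum acceptable value (42)`.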

achamayou · 2026-04-28T16:59:01Z

              "Node requested to join from seqno {} which is older than this "
-              "node startup seqno {}. A snapshot at least as recent as {} must "
-              "be used instead.",
+              "node's required minimum snapshot seqno {} (this node's "


Same as above.

Initial plan

2d9df82

Copilot AI assigned Copilot and cjen1-msft Apr 28, 2026

Copilot started work on behalf of cjen1-msft April 28, 2026 10:58 View session

Copilot AI linked an issue Apr 28, 2026 that may be closed by this pull request

Join of a snapshot-less node to an original node replays the ledger #7835

Open

Implement primary-side join check against latest on-disk snapshot

043687d

Agent-Logs-Url: https://github.com/microsoft/CCF/sessions/d3383e6c-3063-4f30-ba6d-e324856cc41f Co-authored-by: cjen1-msft <190360281+cjen1-msft@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Implement behavior for joining nodes to original ledger~~ Reject snapshot-less join against primary holding a more recent snapshot Apr 28, 2026

Copilot finished work on behalf of cjen1-msft April 28, 2026 11:22

Copilot AI requested a review from cjen1-msft April 28, 2026 11:22

cjen1-msft reviewed Apr 28, 2026

Comment thread src/node/rpc/node_call_types.h

cjen1-msft reviewed Apr 28, 2026

Comment thread src/node/rpc/node_frontend.h

Copilot started work on behalf of cjen1-msft April 28, 2026 12:10 View session

Shorten retry_count doc comment per review

ef73def

Agent-Logs-Url: https://github.com/microsoft/CCF/sessions/ee92835a-b9f3-4451-aa71-bfc04a2d774e Co-authored-by: cjen1-msft <190360281+cjen1-msft@users.noreply.github.com>

Copilot finished work on behalf of cjen1-msft April 28, 2026 12:14

Copilot AI requested a review from cjen1-msft April 28, 2026 12:14

cjen1-msft marked this pull request as ready for review April 28, 2026 12:21

cjen1-msft requested a review from a team as a code owner April 28, 2026 12:21

Copilot AI review requested due to automatic review settings April 28, 2026 12:21

Copilot started reviewing on behalf of cjen1-msft April 28, 2026 12:22 View session

eddyashton mentioned this pull request Apr 28, 2026

Eliminate get_committed_snapshots from join path. #7832

Open

Copilot AI reviewed Apr 28, 2026

Comment thread CHANGELOG.md Outdated

Comment thread src/node/rpc/node_frontend.h Outdated

eddyashton reviewed Apr 28, 2026

Comment thread CHANGELOG.md Outdated

Copilot started work on behalf of eddyashton April 28, 2026 12:27 View session

CHANGELOG: rename Unreleased to 7.0.3, bump pyproject and add PR ref

8e9c582

Agent-Logs-Url: https://github.com/microsoft/CCF/sessions/77ef7836-6f94-4a76-a8fd-ec303efa9014 Co-authored-by: eddyashton <6000239+eddyashton@users.noreply.github.com>

Copilot AI requested a review from eddyashton April 28, 2026 12:29

Copilot finished work on behalf of eddyashton April 28, 2026 12:29

Copilot started work on behalf of cjen1-msft April 28, 2026 12:30 View session

CHANGELOG: shorten 7.0.3 entry

47de9f8

Agent-Logs-Url: https://github.com/microsoft/CCF/sessions/0152a93d-bea4-4acf-8376-7be729e36da8 Co-authored-by: cjen1-msft <190360281+cjen1-msft@users.noreply.github.com>

Copilot finished work on behalf of cjen1-msft April 28, 2026 12:31

eddyashton reviewed Apr 28, 2026

Comment thread src/node/rpc/node_frontend.h Outdated

Copilot started work on behalf of eddyashton April 28, 2026 12:32 View session

node_frontend: distinguish preferred vs required snapshot seqno in er…

730bbb5

…ror messages Agent-Logs-Url: https://github.com/microsoft/CCF/sessions/1dbff290-d0e6-4564-a52f-6633edf7df3e Co-authored-by: eddyashton <6000239+eddyashton@users.noreply.github.com>

Copilot finished work on behalf of eddyashton April 28, 2026 12:35

Copilot AI requested a review from eddyashton April 28, 2026 12:35

eddyashton reviewed Apr 28, 2026

Comment thread src/node/rpc/node_frontend.h Outdated

Copilot started work on behalf of cjen1-msft April 28, 2026 13:49 View session

node_frontend: treat missing retry_count as legacy joiner (value_or(1))

f3d26b8

Agent-Logs-Url: https://github.com/microsoft/CCF/sessions/a18950f8-94db-4a15-b9a5-cae39445e222 Co-authored-by: cjen1-msft <190360281+cjen1-msft@users.noreply.github.com>
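
The commit title above indicates that a request without `retry_count` is treated as if it were a retry (`value_or(1)`), so joiners that predate the field keep the earlier acceptance rule. A minimal Python sketch of that defaulting, assuming the request body is a JSON-like dictionary (the dictionary shape and function names are hypothetical):

```python
from typing import Any, Dict


def effective_retry_count(join_request: Dict[str, Any]) -> int:
    """Mirror value_or(1): an absent retry_count is treated as a retry."""
    value = join_request.get("retry_count")
    return 1 if value is None else int(value)


def uses_first_attempt_check(join_request: Dict[str, Any]) -> bool:
    """Only an explicit retry_count of 0 triggers the stricter snapshot check."""
    return effective_retry_count(join_request) == 0
```

Under this model, a legacy request such as `{"startup_seqno": 0}` is checked only against the primary's own `startup_seqno`, while a new joiner sending `{"startup_seqno": 0, "retry_count": 0}` gets the stricter first-attempt check.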

Copilot finished work on behalf of cjen1-msft April 28, 2026 13:52

achamayou reviewed Apr 28, 2026

achamayou approved these changes Apr 28, 2026

Copilot wants to merge 7 commits into main from copilot/implement-snapshot-joining-behaviour

5 participants
