test: promote replicated simulation scenarios #65

Merged
skel84 merged 4 commits into main from issue-54-promote-replicated-tests
Mar 13, 2026

Conversation

@skel84
Owner

@skel84 skel84 commented Mar 13, 2026

Summary

  • promote the missing replicated partition and primary-crash scenario families into deterministic harness tests
  • add minimal retry-aware harness support for ambiguous client outcomes across failover
  • align replicated testing and status docs with the new executable coverage

Validation

  • cargo test -p allocdb-node replicated_simulation -- --nocapture
  • ./scripts/preflight.sh

Closes #54

Add deterministic partition and primary-crash regression coverage to the real three-replica harness, plus the minimal retry-aware harness support needed to replay ambiguous client outcomes after failover.
@coderabbitai

coderabbitai bot commented Mar 13, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d4dfe091-4977-4a60-acd9-cf42ca51c692

📥 Commits

Reviewing files that changed from the base of the PR and between d8348da and bbc3d22.

📒 Files selected for processing (4)
  • crates/allocdb-node/src/replicated_simulation.rs
  • crates/allocdb-node/src/replicated_simulation_tests.rs
  • docs/status.md
  • docs/testing.md
📜 Recent review details
🧰 Additional context used
📓 Path-based instructions (3)
**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

**/*.rs: Write extensive tests for every meaningful behavior change. Favor invariant tests, negative-path tests, recovery tests, and regression tests over shallow happy-path coverage.
Add extensive logging where it materially improves debuggability or operational clarity. Use the right log level: error for invariant breaks, corruption, and failed operations that require intervention; warn for degraded but expected conditions such as overload, lag, or rejected requests; info for meaningful lifecycle and state-transition events; debug for detailed execution traces useful in development; trace only for very high-volume diagnostic detail.
Logging must be structured and purposeful. Do not add noisy logs that obscure signal or hide bugs.

Files:

  • crates/allocdb-node/src/replicated_simulation_tests.rs
  • crates/allocdb-node/src/replicated_simulation.rs
**/*.md

📄 CodeRabbit inference engine (AGENTS.md)

Keep documentation up to date with the code and design. If a change affects behavior, invariants, failure modes, operational semantics, testing strategy, or implementation sequencing, update the relevant docs in the same task or PR.

Files:

  • docs/testing.md
  • docs/status.md
docs/status.md

📄 CodeRabbit inference engine (AGENTS.md)

Keep docs/status.md current as the single-file progress snapshot for the repository. Update it whenever milestone state, implementation coverage, or the recommended next step materially changes.

Files:

  • docs/status.md
🧠 Learnings (2)
📚 Learning: 2026-03-12T15:18:53.086Z
Learnt from: CR
Repo: skel84/allocdb PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-03-12T15:18:53.086Z
Learning: Applies to **/*.rs : Write extensive tests for every meaningful behavior change. Favor invariant tests, negative-path tests, recovery tests, and regression tests over shallow happy-path coverage.

Applied to files:

  • crates/allocdb-node/src/replicated_simulation_tests.rs
📚 Learning: 2026-03-12T15:18:53.086Z
Learnt from: CR
Repo: skel84/allocdb PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-03-12T15:18:53.086Z
Learning: Applies to docs/status.md : Keep [`docs/status.md`](./docs/status.md) current as the single-file progress snapshot for the repository. Update it whenever milestone state, implementation coverage, or the recommended next step materially changes.

Applied to files:

  • docs/status.md
🧬 Code graph analysis (2)
crates/allocdb-node/src/replicated_simulation_tests.rs (4)
crates/allocdb-node/src/replica.rs (1)
  • engine (719-721)
crates/allocdb-node/src/simulation.rs (1)
  • engine (341-343)
crates/allocdb-node/src/replicated_simulation.rs (3)
  • replica (535-540)
  • published_result (682-684)
  • configured_primary (578-580)
crates/allocdb-core/src/state_machine.rs (1)
  • deadline_slot (225-230)
crates/allocdb-node/src/replicated_simulation.rs (2)
crates/allocdb-node/src/replica.rs (1)
  • engine (719-721)
crates/allocdb-node/src/simulation.rs (1)
  • engine (341-343)
🔇 Additional comments (9)
docs/testing.md (1)

261-263: LGTM!

The documentation updates accurately reflect the new executable test coverage. The scenario descriptions in lines 261-263 now correctly emphasize retry recovery and commit propagation semantics, and the expanded coverage list in lines 324-341 aligns with the newly promoted test cases.

Also applies to: 324-341

crates/allocdb-node/src/replicated_simulation_tests.rs (4)

745-794: LGTM!

This test correctly validates that retry-cache hits do not overwrite canonical published results. The assertions on lines 758, 767, 773, 786, and 793 comprehensively verify:

  1. Original commit produces a non-retry-cache result
  2. Retry returns a from_retry_cache result while preserving the original
  3. Conflicting payload returns OperationConflict without corrupting the canonical entry

948-1010: LGTM!

This regression test directly validates the fix for the primary-only prepared promotion bug raised in past reviews. The test correctly:

  1. Creates a prepare that only exists on the primary (lines 961-984 verify backups have no prepared entry)
  2. Forces view change and verifies the entry is NOT promoted (lines 988-996)
  3. Confirms retry creates a fresh Prepared entry in the new view (lines 1002-1009)

1012-1071: LGTM!

This test validates the positive case for prepared-suffix recovery during view change. It correctly demonstrates that when a backup (replica 3) holds a prepared entry that proves majority append, the new primary (replica 2) can reconstruct the committed prefix even though it never directly received the prepare.
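The recovery path this comment describes can be sketched roughly as follows. This is a minimal illustration, not the harness code: `Replica`, `PreparedEntry`, and `reconstruct_committed_prefix` are hypothetical stand-ins, and `find_prepared_entry` only borrows its name from the review thread (its real signature is not shown here).

```rust
use std::collections::BTreeMap;

type Lsn = u64;

/// Hypothetical prepared log entry; the real type is not shown in the review.
#[derive(Debug, Clone, PartialEq, Eq)]
struct PreparedEntry {
    lsn: Lsn,
    payload: Vec<u8>,
}

/// Hypothetical replica state: prepared entries by LSN plus a committed prefix.
#[derive(Debug, Default)]
struct Replica {
    prepared: BTreeMap<Lsn, PreparedEntry>,
    commit_lsn: Lsn,
}

/// Returns the first copy of the entry at `lsn` held by any peer.
fn find_prepared_entry(peers: &[Replica], lsn: Lsn) -> Option<&PreparedEntry> {
    peers.iter().find_map(|peer| peer.prepared.get(&lsn))
}

/// Advances the new primary's committed prefix up to `target`, copying any
/// entry it is missing from a peer. Stops and returns false at the first gap.
fn reconstruct_committed_prefix(new_primary: &mut Replica, peers: &[Replica], target: Lsn) -> bool {
    for lsn in (new_primary.commit_lsn + 1)..=target {
        if !new_primary.prepared.contains_key(&lsn) {
            match find_prepared_entry(peers, lsn) {
                Some(entry) => {
                    new_primary.prepared.insert(lsn, entry.clone());
                }
                // No voter can supply the entry, so the prefix cannot advance.
                None => return false,
            }
        }
        new_primary.commit_lsn = lsn;
    }
    true
}
```

In the scenario from the test, replica 3 would play the peer holding the prepared entry and replica 2 the new primary that never received it.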


1073-1297: LGTM!

The partition and primary-crash scenario tests comprehensively cover the failure modes documented in docs/testing.md:

  • minority_partition (lines 1073-1108): Validates quorum preservation and stale backup catch-up
  • split_cluster (lines 1110-1164): Validates fail-closed behavior and later rejoin convergence
  • crash_before_quorum (lines 1166-1203): Validates pre-quorum retry creates fresh prepare
  • crash_after_quorum (lines 1205-1249): Validates retry resolves from reconstructed commit cache
  • crash_after_reply (lines 1251-1297): Validates read and retry preservation on new primary

The tests correctly use ReplicatedClientRequestOutcome to distinguish between Prepared and Published outcomes and verify the from_retry_cache flag appropriately.
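The quorum rule behind the minority_partition and split_cluster cases can be illustrated with a small sketch. The function name `can_commit` and the use of plain integer replica ids are assumptions made for this example; the harness may model connectivity differently.

```rust
use std::collections::BTreeSet;

/// Returns true when the primary sits in a partition that still holds a
/// strict majority of the configured cluster, so a prepare can reach quorum.
fn can_commit(primary: u64, partition: &BTreeSet<u64>, cluster_size: usize) -> bool {
    partition.contains(&primary) && partition.len() > cluster_size / 2
}
```

With three replicas, a primary that can still reach one backup keeps quorum, while a fully split cluster of three singletons fails closed because no partition holds two members.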

docs/status.md (1)

50-57: LGTM!

The status updates accurately reflect the PR's contributions:

  • Lines 50-57: New promoted scenarios (partition, crash, retry) are documented
  • Lines 172-173: Retry-aware client submit helper matches the new client_submit_or_retry API
  • Lines 185-189: Regression coverage list now includes all newly promoted test families

The validation commands reorganization (lines 191-195) improves clarity by grouping related tests.

Also applies to: 172-173, 185-189

crates/allocdb-node/src/replicated_simulation.rs (3)

60-64: LGTM!

The ReplicatedClientRequestOutcome enum and client_submit_or_retry method correctly implement retry-aware submission. Crucially, when a retry cache hit occurs (lines 674-676), the method returns the result directly without writing to published_results, preserving the canonical per-LSN result. This addresses the previously raised concern about overwriting canonical published results.

Also applies to: 659-680
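A reduced model of that behavior, for illustration only: the `Harness` struct, its fields, and the `commit` helper below are hypothetical, and the real client_submit_or_retry takes a primary, slot, and payload rather than a bare operation id. The point of the sketch is that a retry hit is answered without touching the canonical per-LSN map.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq, Eq)]
struct SubmissionResult {
    applied_lsn: u64,
    from_retry_cache: bool,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct PreparedEntry {
    lsn: u64,
}

/// Simplified stand-in for ReplicatedClientRequestOutcome.
#[derive(Debug, Clone, PartialEq, Eq)]
enum ClientRequestOutcome {
    Prepared(PreparedEntry),
    Published(SubmissionResult),
}

/// Hypothetical harness state: operation id -> applied LSN, and the
/// canonical published result per LSN.
#[derive(Debug, Default)]
struct Harness {
    retry_cache: HashMap<u64, u64>,
    published_results: HashMap<u64, SubmissionResult>,
    next_lsn: u64,
}

impl Harness {
    /// Records a committed operation and its canonical result.
    fn commit(&mut self, operation_id: u64, lsn: u64) {
        self.retry_cache.insert(operation_id, lsn);
        self.published_results.insert(
            lsn,
            SubmissionResult { applied_lsn: lsn, from_retry_cache: false },
        );
    }

    fn client_submit_or_retry(&mut self, operation_id: u64) -> ClientRequestOutcome {
        if let Some(&applied_lsn) = self.retry_cache.get(&operation_id) {
            // Retry hit: answer directly and leave published_results untouched.
            return ClientRequestOutcome::Published(SubmissionResult {
                applied_lsn,
                from_retry_cache: true,
            });
        }
        // No usable retry result: fall through to a fresh prepare.
        self.next_lsn += 1;
        ClientRequestOutcome::Prepared(PreparedEntry { lsn: self.next_lsn })
    }
}
```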


1207-1244: LGTM!

The lookup_retry_result helper correctly implements the retry cache lookup:

  • Returns None when the operation is not found (line 1227), allowing a fresh prepare
  • Uses fingerprint comparison to detect conflicts and return OperationConflict (lines 1230-1238)
  • Properly sets from_retry_cache: true to distinguish from canonical commits
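The three outcomes listed above (miss, hit, conflict) can be sketched as below. The types are hypothetical simplifications: the real helper takes a primary, a slot, and an encoded payload, and whether the conflict result also carries from_retry_cache: true is an assumption of this sketch.

```rust
use std::collections::HashMap;

/// Hypothetical result codes; only OperationConflict is named in the review.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ResultCode {
    Ok,
    OperationConflict,
}

/// Hypothetical retry-cache record kept per operation id.
#[derive(Debug, Clone, Copy)]
struct RetryRecord {
    command_fingerprint: u64,
    applied_lsn: u64,
    result_code: ResultCode,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct SubmissionResult {
    applied_lsn: u64,
    result_code: ResultCode,
    from_retry_cache: bool,
}

fn lookup_retry_result(
    cache: &HashMap<u64, RetryRecord>,
    operation_id: u64,
    command_fingerprint: u64,
) -> Option<SubmissionResult> {
    // Miss: return None so the caller can issue a fresh prepare.
    let record = cache.get(&operation_id)?;
    // Same operation id with a different payload is a conflict, not a retry.
    let result_code = if record.command_fingerprint == command_fingerprint {
        record.result_code
    } else {
        ResultCode::OperationConflict
    };
    Some(SubmissionResult {
        applied_lsn: record.applied_lsn,
        result_code,
        from_retry_cache: true,
    })
}
```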

1315-1337: LGTM!

The fix for the primary-only prepared promotion bug is correctly implemented. The view_change_target_commit_lsn method now:

  1. Takes old_primary as a parameter (line 1318)
  2. Excludes the old primary's prepared suffix from the target commit calculation (lines 1329-1331)
  3. Only considers backup-held prepared suffixes as proof of majority append

The inline comment (lines 1326-1328) clearly documents the rationale. This addresses the critical issue raised in past reviews about incorrectly promoting uncommitted entries that only existed on the primary.
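As a rough sketch of the rule described here, assuming a hypothetical `VoterState` snapshot per replica and assuming that every voter's committed LSN still counts toward the target (the review only confirms that the old primary's prepared suffix is excluded):

```rust
/// Hypothetical per-voter snapshot used during view change.
#[derive(Debug, Clone, Copy)]
struct VoterState {
    replica_id: u64,
    commit_lsn: Option<u64>,
    highest_prepared_lsn: Option<u64>,
}

/// Picks the commit target for the new view. A prepared suffix counts as
/// proof of majority append only when a replica other than the old primary
/// holds it.
fn view_change_target_commit_lsn(voters: &[VoterState], old_primary: u64) -> Option<u64> {
    voters
        .iter()
        .flat_map(|voter| {
            let prepared = if voter.replica_id == old_primary {
                // A prepare seen only by the old primary was never replicated.
                None
            } else {
                voter.highest_prepared_lsn
            };
            [voter.commit_lsn, prepared]
        })
        .flatten()
        .max()
}
```

With replica 1 as the old primary holding a prepare at LSN 5 and both backups at commit LSN 4, the target stays at 4. Once a backup also holds LSN 5, the target rises to 5.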


Summary by CodeRabbit

  • New Features

    • Retry cache mechanism for client request recovery after primary failures
    • Improved prepared suffix handling during failover and view changes
    • Multi-process local cluster runner with durable workspace persistence across restarts
  • Tests

    • Comprehensive test coverage for primary crash scenarios, partition healing, and recovery workflows
    • New validation for retry cache behavior and commit reconstruction
  • Documentation

    • Updated status documentation reflecting deterministic partition handling and durability guarantees
    • Enhanced testing scenarios covering failover, minority partitions, and rejoin strategies

Walkthrough

Adds a retry-aware client submit flow to the replicated simulation, exposes ReplicaNode::highest_prepared_lsn, updates view-change commit selection to consider prepared suffixes, and adds many tests and docs covering partition, crash, and rejoin scenarios.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Replica Prepared LSN Tracking (crates/allocdb-node/src/replica.rs) | Adds pub fn highest_prepared_lsn(&self) -> Option<Lsn> to return the highest LSN among prepared entries. |
| Replicated Simulation Core (crates/allocdb-node/src/replicated_simulation.rs) | Adds ReplicatedClientRequestOutcome enum, client_submit_or_retry() method, and lookup_retry_result() helper; integrates retry-cache lookup into client submit flow; updates view_change_target_commit_lsn to consider highest_prepared_lsn; expands imports for request decoding and result handling. |
| Replicated Simulation Tests (crates/allocdb-node/src/replicated_simulation_tests.rs) | Adds extensive tests validating retry-cache semantics, prepared-suffix recovery across view changes, partition/crash/rejoin scenarios, and exposes ReplicatedClientRequestOutcome for assertions. |
| Documentation (docs/status.md, docs/testing.md) | Updates status and testing docs to describe retry-aware submit behavior, primary-crash/rejoin/partition scenarios, and expanded test coverage and scenarios. |
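For orientation, the accessor named in the table could look something like the sketch below. Only the signature `highest_prepared_lsn(&self) -> Option<Lsn>` comes from the walkthrough; the `BTreeMap` storage and the `Lsn = u64` alias are assumptions made so the example is self-contained.

```rust
use std::collections::BTreeMap;

/// Assumed alias; the real Lsn type is defined elsewhere in the crate.
type Lsn = u64;

/// Hypothetical reduced replica: prepared payloads keyed by LSN.
struct ReplicaNode {
    prepared: BTreeMap<Lsn, Vec<u8>>,
}

impl ReplicaNode {
    /// Highest LSN among prepared entries, or None when nothing is prepared.
    pub fn highest_prepared_lsn(&self) -> Option<Lsn> {
        // BTreeMap keys iterate in ascending order, so the last key is the maximum.
        self.prepared.keys().next_back().copied()
    }
}
```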

Sequence Diagram

sequenceDiagram
    participant Client
    participant Simulation as Replicated Simulation
    participant RetryCache as Retry Lookup
    participant Engine as Primary Engine State

    Client->>Simulation: client_submit_or_retry(primary, slot, payload)
    activate Simulation

    Simulation->>RetryCache: lookup_retry_result(primary, slot, payload)
    activate RetryCache

    RetryCache->>Engine: fetch active primary engine state
    Engine-->>RetryCache: engine state / prepared & published records

    alt Retry Found (Fingerprint match -> published)
        RetryCache-->>Simulation: SubmissionResult (published)
        Simulation-->>Client: Published(SubmissionResult)
    else No usable retry result
        RetryCache-->>Simulation: None
        deactivate RetryCache
        Simulation->>Simulation: client_submit() -> produce Prepared entry
        Simulation-->>Client: Prepared(ReplicaPreparedEntry)
    end

    deactivate Simulation

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 57.14% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'test: promote replicated simulation scenarios' accurately summarizes the main change: promoting replicated simulation scenarios into executable tests. |
| Description check | ✅ Passed | The description covers the main changes (promoting scenarios, adding retry support, aligning docs), includes validation steps, and references the closed issue. |
| Linked Issues check | ✅ Passed | All requirements from issue #54 are met: partition/primary-crash/rejoin scenarios are promoted into executable tests [#54], retry-aware support is added, docs are aligned, and transcripts remain replayable. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to issue #54: new test scenarios, retry cache implementation for ambiguous outcomes, documentation updates, and a new method for accessing prepared LSN state. |


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/allocdb-node/src/replicated_simulation_tests.rs`:
- Around line 1029-1045: The test currently observes the outcome via
published_result(entry.lsn) which is set by handle_prepare_ack() as soon as
quorum append completes, so it does not model a distinct “reply not yet
published” boundary; either change the test to explicitly model pre-reply by
removing/avoiding the published_result check and instead assert quorum-append
via a lower-level indicator (e.g., check a quorum-appended helper or persisted
log state) before crashing, or explicitly publish the reply before the crash if
you intend the “reply exposed” scenario (call the harness method that publishes
replies, e.g., harness.publish_reply(entry.lsn) or equivalent), and update the
test name/docs to match the chosen behavior; locate changes around
client_submit, deliver_protocol_message, published_result(entry.lsn), and
handle_prepare_ack() to implement this fix.

In `@crates/allocdb-node/src/replicated_simulation.rs`:
- Around line 1325-1329: The patch missed a regression test where a newly
elected primary must fetch a prepared suffix from another voter: add a test in
replicated_simulation.rs that simulates a 3-replica view where only replica 3
holds the prepared entry and replica 2 wins the new view, forcing the path that
calls highest_prepared_lsn() and then find_prepared_entry() to reconstruct the
missing entries; exercise the failover/recovery flow, assert that replica 2
successfully copies the prepared suffix (verify commit LSNs and that the
prepared entry appears on replica 2), and ensure the test covers the
negative/recovery case so this regression cannot recur.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8665918b-a8fd-42d1-80f6-de232bca5797

📥 Commits

Reviewing files that changed from the base of the PR and between bac464b and b97b5d4.

📒 Files selected for processing (5)
  • crates/allocdb-node/src/replica.rs
  • crates/allocdb-node/src/replicated_simulation.rs
  • crates/allocdb-node/src/replicated_simulation_tests.rs
  • docs/status.md
  • docs/testing.md
📜 Review details
🧰 Additional context used
📓 Path-based instructions (3)
**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

**/*.rs: Write extensive tests for every meaningful behavior change. Favor invariant tests, negative-path tests, recovery tests, and regression tests over shallow happy-path coverage.
Add extensive logging where it materially improves debuggability or operational clarity. Use the right log level: error for invariant breaks, corruption, and failed operations that require intervention; warn for degraded but expected conditions such as overload, lag, or rejected requests; info for meaningful lifecycle and state-transition events; debug for detailed execution traces useful in development; trace only for very high-volume diagnostic detail.
Logging must be structured and purposeful. Do not add noisy logs that obscure signal or hide bugs.

Files:

  • crates/allocdb-node/src/replica.rs
  • crates/allocdb-node/src/replicated_simulation_tests.rs
  • crates/allocdb-node/src/replicated_simulation.rs
**/*.md

📄 CodeRabbit inference engine (AGENTS.md)

Keep documentation up to date with the code and design. If a change affects behavior, invariants, failure modes, operational semantics, testing strategy, or implementation sequencing, update the relevant docs in the same task or PR.

Files:

  • docs/status.md
  • docs/testing.md
docs/status.md

📄 CodeRabbit inference engine (AGENTS.md)

Keep docs/status.md current as the single-file progress snapshot for the repository. Update it whenever milestone state, implementation coverage, or the recommended next step materially changes.

Files:

  • docs/status.md
🧬 Code graph analysis (2)
crates/allocdb-node/src/replicated_simulation_tests.rs (1)
crates/allocdb-node/src/replicated_simulation.rs (2)
  • replica (535-540)
  • configured_primary (578-580)
crates/allocdb-node/src/replicated_simulation.rs (2)
crates/allocdb-node/src/replica.rs (5)
  • engine (719-721)
  • new (210-220)
  • new (452-456)
  • new (539-543)
  • new (1376-1378)
crates/allocdb-node/src/simulation.rs (4)
  • engine (341-343)
  • new (86-88)
  • new (114-119)
  • new (166-185)

@skel84
Owner Author

skel84 commented Mar 13, 2026

Addressed CodeRabbit feedback in commit d8348da.

Changes:

  • kept retry-cache hits from overwriting canonical published results
  • renamed the majority-appended crash regression to match the boundary the harness actually models and asserted the committed prefix via replica state
  • added regressions for prepared-suffix recovery from another voter during takeover and for retry-cache hits preserving canonical published results
  • updated replicated testing/status docs to match the executable coverage

Validation rerun:

  • cargo test -p allocdb-node replicated_simulation -- --nocapture
  • ./scripts/preflight.sh

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
docs/testing.md (1)

261-263: ⚠️ Potential issue | 🟡 Minor

Don't describe an unmodeled pre-reply crash window.

The harness still publishes the client outcome as soon as quorum append completes in handle_prepare_ack(), so this scenario only models interrupted commit propagation. Please remove the “reply delivery” wording here to keep the docs aligned with the executable behavior.

As per coding guidelines, **/*.md: Keep documentation up to date with the code and design. If a change affects behavior, invariants, failure modes, operational semantics, testing strategy, or implementation sequencing, update the relevant docs in the same task or PR.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/testing.md` around lines 261 - 263, Update the test description to
remove the unmodeled "reply delivery" wording so it matches the harness
behavior: change the second bullet to state that the primary is crashed after
quorum append (and after reply) modeling only interrupted commit propagation,
and explicitly reference the existing behavior in handle_prepare_ack() which
publishes client outcome as soon as quorum append completes; remove any claim
that reply delivery can be interrupted because that pre-reply crash window is
not modeled.
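The behavior this comment refers to, where the outcome becomes visible the moment quorum append completes, can be modeled with a small sketch. The `Primary` struct, its fields, and the string result are hypothetical; only the names handle_prepare_ack and published_result come from the thread.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical reduced primary: tracks which replicas appended each LSN.
struct Primary {
    self_id: u64,
    cluster_size: usize,
    acks: BTreeMap<u64, BTreeSet<u64>>,
    published_results: BTreeMap<u64, &'static str>,
}

impl Primary {
    fn new(self_id: u64, cluster_size: usize) -> Self {
        Primary {
            self_id,
            cluster_size,
            acks: BTreeMap::new(),
            published_results: BTreeMap::new(),
        }
    }

    /// The primary's own append counts toward quorum.
    fn prepare(&mut self, lsn: u64) {
        self.acks.entry(lsn).or_default().insert(self.self_id);
    }

    /// Publishes the client outcome as soon as a majority has appended.
    /// There is no separate "reply pending" state between quorum and publish.
    fn handle_prepare_ack(&mut self, lsn: u64, from: u64) {
        let acks = self.acks.entry(lsn).or_default();
        acks.insert(from);
        if acks.len() > self.cluster_size / 2 {
            self.published_results.entry(lsn).or_insert("committed");
        }
    }

    fn published_result(&self, lsn: u64) -> Option<&'static str> {
        self.published_results.get(&lsn).copied()
    }
}
```

Because publish happens inside the ack handler, a crash injected after the quorum ack can only interrupt commit propagation to backups, which is why the docs should not describe an interrupted reply.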
🧹 Nitpick comments (1)
crates/allocdb-node/src/replicated_simulation.rs (1)

1206-1243: Log retry-cache hits and misses in this lookup path.

This is the new ambiguity-resolution surface, but hit/miss/conflict outcomes are silent today. A structured debug! here with primary, operation_id, request_slot, applied_lsn, and result_code would make failover regressions much easier to triage.

As per coding guidelines, **/*.rs: Add extensive logging where it materially improves debuggability or operational clarity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/allocdb-node/src/replicated_simulation.rs` around lines 1206 - 1243,
The lookup_retry_result path lacks observability for retry-cache behavior; add a
structured debug log in lookup_retry_result (after decoding request and after
retrieving the record) that logs primary, request.operation_id, request_slot,
and whether the record was found (hit/miss); when found include applied_lsn,
record.result_code and whether the outcome was a conflict (compare
record.command_fingerprint vs request.command.fingerprint()), and indicate
from_retry_cache=true in the log so operators can trace retry-cache hits/misses
and conflict outcomes for SubmissionResult.
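One possible shape for the fields such a log event might carry is sketched below. To stay dependency-free it renders a key=value string; real code would presumably pass the same fields to whichever logging macro the crate already uses (the review mentions debug!). `RetryLookup` and `retry_lookup_log_line` are hypothetical names.

```rust
/// Hypothetical summary of a retry-cache lookup.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RetryLookup {
    Miss,
    Hit { applied_lsn: u64 },
    Conflict { applied_lsn: u64 },
}

/// Renders the lookup as structured key=value fields.
fn retry_lookup_log_line(
    primary: u64,
    operation_id: u64,
    request_slot: u64,
    outcome: RetryLookup,
) -> String {
    let prefix = format!(
        "retry_lookup primary={} operation_id={} request_slot={}",
        primary, operation_id, request_slot
    );
    match outcome {
        RetryLookup::Miss => format!("{} outcome=miss", prefix),
        RetryLookup::Hit { applied_lsn } => format!(
            "{} outcome=hit applied_lsn={} from_retry_cache=true",
            prefix, applied_lsn
        ),
        RetryLookup::Conflict { applied_lsn } => format!(
            "{} outcome=conflict applied_lsn={} from_retry_cache=true",
            prefix, applied_lsn
        ),
    }
}
```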
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/allocdb-node/src/replicated_simulation.rs`:
- Around line 1323-1327: The code is incorrectly using
node.highest_prepared_lsn() from every voter when computing target_commit_lsn,
which allows a prepare that exists only on the currently-configured primary to
be promoted into committed history; change the logic that sets target_commit_lsn
(the two max(...) lines) to ignore the still-configured primary (i.e., skip the
node whose metadata().is_configured_primary or matching view/primary identifier)
when calling highest_prepared_lsn(), and ensure
find_prepared_entry()/complete_view_change() logic only considers prepared LSNs
from non-primary voters; add a regression test that simulates a "primary-only
prepared entry" (force complete_view_change() while primary is still in config)
and asserts that such an LSN is not copied/committed on the new primary.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c6085148-432e-4d28-aace-28b83c28e380

📥 Commits

Reviewing files that changed from the base of the PR and between b97b5d4 and d8348da.

📒 Files selected for processing (4)
  • crates/allocdb-node/src/replicated_simulation.rs
  • crates/allocdb-node/src/replicated_simulation_tests.rs
  • docs/status.md
  • docs/testing.md
📜 Review details
🧰 Additional context used
📓 Path-based instructions (3)
**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

**/*.rs: Write extensive tests for every meaningful behavior change. Favor invariant tests, negative-path tests, recovery tests, and regression tests over shallow happy-path coverage.
Add extensive logging where it materially improves debuggability or operational clarity. Use the right log level: error for invariant breaks, corruption, and failed operations that require intervention; warn for degraded but expected conditions such as overload, lag, or rejected requests; info for meaningful lifecycle and state-transition events; debug for detailed execution traces useful in development; trace only for very high-volume diagnostic detail.
Logging must be structured and purposeful. Do not add noisy logs that obscure signal or hide bugs.

Files:

  • crates/allocdb-node/src/replicated_simulation_tests.rs
  • crates/allocdb-node/src/replicated_simulation.rs
**/*.md

📄 CodeRabbit inference engine (AGENTS.md)

Keep documentation up to date with the code and design. If a change affects behavior, invariants, failure modes, operational semantics, testing strategy, or implementation sequencing, update the relevant docs in the same task or PR.

Files:

  • docs/status.md
  • docs/testing.md
docs/status.md

📄 CodeRabbit inference engine (AGENTS.md)

Keep docs/status.md current as the single-file progress snapshot for the repository. Update it whenever milestone state, implementation coverage, or the recommended next step materially changes.

Files:

  • docs/status.md
🧠 Learnings (2)
📚 Learning: 2026-03-12T15:18:53.086Z
Learnt from: CR
Repo: skel84/allocdb PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-03-12T15:18:53.086Z
Learning: Applies to **/*.rs : Write extensive tests for every meaningful behavior change. Favor invariant tests, negative-path tests, recovery tests, and regression tests over shallow happy-path coverage.

Applied to files:

  • crates/allocdb-node/src/replicated_simulation_tests.rs
  • docs/status.md
📚 Learning: 2026-03-12T15:18:53.086Z
Learnt from: CR
Repo: skel84/allocdb PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-03-12T15:18:53.086Z
Learning: Applies to docs/status.md : Keep [`docs/status.md`](./docs/status.md) current as the single-file progress snapshot for the repository. Update it whenever milestone state, implementation coverage, or the recommended next step materially changes.

Applied to files:

  • docs/status.md
🧬 Code graph analysis (1)
crates/allocdb-node/src/replicated_simulation.rs (2)
crates/allocdb-node/src/simulation.rs (1)
  • engine (341-343)
crates/allocdb-node/src/replica.rs (1)
  • engine (719-721)

@skel84
Owner Author

skel84 commented Mar 13, 2026

Addressed the latest CodeRabbit finding in bbc3d22.

Changes:

  • exclude the previous primary when view-change recovery derives a commit target from prepared suffixes
  • add a regression that forces a view change while only the old primary holds the prepared entry, and assert the new primary does not promote it

Reran:

  • cargo test -p allocdb-node replicated_simulation -- --nocapture
  • ./scripts/preflight.sh

@skel84 skel84 merged commit 481b868 into main Mar 13, 2026
2 checks passed
Development

Successfully merging this pull request may close these issues.

M7-T06 Promote replicated simulation scenarios into executable tests