feat(tier-2): Lane A — SWAPDB, MOVE, COPY DB n, CLUSTER REPLICAS/SLAVES, COUNT-FAILURE-REPORTS by TinDang97 · Pull Request #100 · pilotspace/moon

TinDang97 · 2026-05-24T06:36:15Z

Summary

Tier 2 Lane A — 5 Redis-parity commands + 1 hot-path perf fix + 1 hygiene refactor. T2.6 (READONLY/READWRITE) intentionally deferred to a follow-up PR because it changes route_slot() signature and cascades through 4 handler files.

Commits (7)

c381b31 T2.1 SWAPDB — cross-shard atomic swap via ShardMessage::SwapDb, WAL-durable, BGREWRITEAOF CAS guard, restart-replay test
4958dc9 T2.2 MOVE key db — with_two_dbs_locked helper (lower-index first), WAL-durable, intercept in all 4 handler paths
bbc6117 T2.3 COPY ... DB n — extends copy() to accept target db index, reuses with_two_dbs_locked, WAL-durable
f538589 T2.4 CLUSTER REPLICAS / SLAVES — extracts shared format_node_line(node, self_node_id) helper between CLUSTER NODES and CLUSTER REPLICAS
ebd240a T2.5 CLUSTER COUNT-FAILURE-REPORTS — counts non-stale pfail_reports; exposes DEFAULT_NODE_TIMEOUT_MS as pub(crate)
608e2d1 perf(handler): collapse duplicate is_write gate on MOVE/COPY hot path — restores s=1 SET p=1 throughput (-9.5% → +0.9%)
e429b2b refactor(storage): split segment.rs (1587 LOC) into segment/ directory module (4 files, each ≤ 819 LOC). Mechanical — zero semantic change. Brings the file under the 1500-LOC limit ahead of any future hot-path additions.

Sprint 3 perf scope (S3.1–S3.6)

Investigated during this PR. RFC at .planning/rfcs/v02-enterprise-architecture.md was stale by ~1 month:

Task	Status	Notes
S3.1 pre-size DashTable	Already shipped in main	PERF-08 era work
S3.2 single-probe `insert_or_update`	Already shipped in main	PERF-08 era work
S3.3 second-byte fingerprint filter	Dropped	Quantitative model: ~0.07% CPU savings at production load factor (3 orders below CLAUDE.md's stated ~3% noise floor). Not shipped.
S3.4 hoist `write_db` across same-db writes	Dropped	Holding `RwLockWriteGuard` across batch-loop iterations requires enumerating every `.await` in the loop (CLIENT PAUSE sleep, SWAPDB/FT/cross-shard scan handlers, multi-key coordinator, AOF tx) — structurally incompatible with the "no lock across `.await`" rule. Marginal +2-5% gain doesn't justify the maintenance burden. A clean scan-ahead design (compute longest prefix of consecutive same-db legacy writes, hold one guard for that sync sub-loop) is the right path forward in a separate PR.
S3.5a/b dispatch-counter atomics	Already shipped in main	Verified Round 5
S3.6 HNSW devirtualization	Already shipped in main	`src/vector/distance/mod.rs:23`

The segment-split commit (e429b2b) is kept regardless — defensible standalone hygiene.

CI gates green

cargo fmt --check
cargo clippy --release -- -D warnings
cargo clippy --no-default-features --features runtime-tokio,jemalloc -- -D warnings
cargo test --release --lib — 3273 PASS / 0 FAIL (pre-split); split commit re-verified cargo test --release --lib storage::dashtable — 50 PASS / 0 FAIL on both runtimes
cargo test --no-default-features --features runtime-tokio,jemalloc --lib — 2668 PASS / 0 FAIL

Performance verification (L-A.G3)

Initial finding — s=1 SET p=1 regression

First scaling-bench-median run on Lane A head vs scaling-matrix-orbstack-2026-05-23.md showed 3 s=1 cells (-8.5% / -11.4% / -11.7%) flagged as regressions. Strict in-session A/B against pre-Lane-A merge base bddda41 cleared most cells, but s=1 SET p=1 measured 218K vs base 241K = -9.5% — the only real signal once cross-session VM variance was controlled for.

Root cause

handler_monoio/mod.rs:1077 and handler_sharded/mod.rs:1133 (added by T2.2/T2.3) wrapped MOVE/COPY interception in an outer if metadata::is_write(cmd) { guard. The existing write-path block below already calls metadata::is_write(cmd), so every SET, HSET, ZADD, etc. paid two PHF table lookups instead of one. At pipeline=1 where per-command overhead dominates, that compounded to -9.5%. At pipeline ≥ 16 the cost amortized into noise.

Fix (commit `608e2d1`)

Remove the redundant outer wrapper. MOVE/COPY rely directly on inner eq_ignore_ascii_case checks; for non-MOVE/COPY workloads the branch predictor learns "false" on both inline checks for free (~5ns each, fully predictable). handler_single.rs and shard/spsc_handler.rs were already correct — they gate MOVE/COPY directly without an outer wrapper.

Post-fix in-session A/B vs `46ed28a` (May-23 baseline commit)

Strict back-to-back same-VM-state, s=1, c=400, median-of-3:

cell	`46ed28a`	Lane A pre-fix	Lane A post-fix	post-fix vs base
GET p=1	198K	211K	201K	+1.5%
GET p=16	2.65M	2.88M	2.52M	-4.9%
GET p=64	7.25M	7.54M	7.18M	-1.0%
SET p=1	241K	218K (-9.5%)	243K	+0.9% ✅
SET p=16	1.49M	1.50M	1.48M	-0.7%
SET p=64	2.74M	2.76M	2.63M	-4.0%

All 6 cells within ±5% of the merge-base in-session.

Post-fix scaling matrix (run-2 canonical, run-1 hit VM contention)

	Redis	s=1	s=2	s=4	best Moon	vs Redis
GET p=1	202K	217K	167K	160K	217K	1.07×
GET p=16	2.38M	2.75M	2.28M	2.14M	2.75M	1.15×
GET p=64	4.48M	7.35M	4.63M	6.17M	7.35M	1.64×
SET p=1	184K	232K	170K	162K	232K	1.26×
SET p=16	953K	1.46M	1.44M	1.51M	1.51M	1.58×
SET p=64	1.69M	2.82M	2.63M	3.33M	3.33M	1.97×

Run-1 of the same matrix saw a transient s=2 SET p=1 collapse (125K, monotonically declining within the cell) which also dragged Redis SET p=16 down -26% — confirmed VM-environmental, not code, by the run-2 recovery.

Methodology note for the L-A.G3 gate

Cross-session comparison against scaling-matrix-orbstack-2026-05-23.md is not reliable at ±5% — the file itself flags ±10-15% as the VM variance band, and we observed ±15-26% cross-session Redis movement on the same hardware. Recommend the L-A.G3 wording change to: "±5% of in-session A/B against the merge base, measured strict back-to-back on the same VM boot." The baseline file remains a long-term trend reference.

Functional verification (post-fix, --shards 1)

MOVE k0 1 → :1, GET (db0) = nil, GET (db1) = v0 — PASS
COPY src dst DB 2 → :1, GET (db2) = srcval, src preserved — PASS
SWAPDB 0 3 → +OK, GET (db0) = v3, GET (db3) = v0 — PASS

Out of scope

T2.6 READONLY/READWRITE — deferred (changes route_slot() signature, must land last)
Tier 2 Lane B (T2.7-T2.9)
Tier 2 Lane C (T2.10-T2.11)

Test plan

Summary by CodeRabbit

New Features
- SWAPDB: atomic cross‑database swap (sharded and single), returns OK, is a no‑op for identical indices, and errors for out‑of‑range indices.
- CLUSTER: added REPLICAS/SLAVES and COUNT‑FAILURE‑REPORTS for improved cluster introspection.
- MOVE/COPY: support cross‑database operations via optional DB clause.
Bug Fixes
- BGREWRITEAOF now prevents concurrent rewrites to avoid races.
Tests
- Added comprehensive tests for SWAPDB, cluster subcommands, and AOF/WAL replay.

Implements Redis-compatible SWAPDB with full multi-shard broadcast, WAL durability, BGREWRITEAOF guard, bounds checking, same-index no-op, and crash-recovery replay. ## Architecture SWAPDB is intercepted at the handler layer (not inside `cmd_dispatch`) because it requires async execution, multi-db access, and cross-shard coordination: - **handler_monoio/dispatch.rs** — `try_handle_swapdb()` async fn - **handler_sharded/dispatch.rs** — `try_handle_swapdb()` async fn - **handler_single.rs** — inline SWAPDB path for single-shard tokio mode ## Multi-shard broadcast `shard::coordinator::coordinate_swapdb()` fans out to all shards: - **Remote shards**: `ShardMessage::SwapDb { a, b, reply_tx }` sent via SPSC. The SPSC handler (`spsc_handler::SwapDb` arm) emits a per-shard WAL record via `wal_append_and_fanout` BEFORE performing the swap, then calls `ShardDatabases::swap_dbs(shard_id, a, b)` and signals the coordinator via an OneshotSender<()>. - **Local shard**: handled inline in the coordinator (ChannelMesh has no self-send slot — `target_index` panics on `my_id == target_id`). WAL bytes are sent via `ShardDatabases::wal_append()` (the per-shard MPSC channel that the event-loop drains on its 1ms tick into the WAL writer). The swap is performed via `ShardDatabases::swap_dbs()`. The coordinator awaits all remote-shard acks before returning +OK. Brief-skew acceptance: between the first and last ack, interleaved GET may see pre-swap on one shard + post-swap on another. This matches Redis cluster relaxed semantics and is acceptable for SWAPDB semantics. ## WAL durability Each shard emits a "SWAPDB <a> <b>" WAL record before performing the swap (both remote shards via SPSC handler and local shard via the wal_append channel). On crash recovery, `persistence::replay::DispatchReplayEngine` intercepts SWAPDB before calling `cmd_dispatch` (which only sees one `&mut Database`), operates directly on the `&mut [Database]` slice, and uses `split_at_mut` + `std::mem::swap` to apply the exchange. Verified with restart-survival test: SET key in db0 → SWAPDB 0 1 → SIGTERM → restart → key found in db1 (WAL v2 path, `--disk-offload disable --appendonly yes`). ## BGREWRITEAOF guard (CAS fix) Replaced unconditional `store(true)` in `bgrewriteaof_start` and `bgrewriteaof_start_sharded` with `compare_exchange(false, true)` to prevent a second concurrent BGREWRITEAOF call from clearing the flag mid-rewrite. The flag is only cleared by the caller that set it (on channel-send failure) or by the AOF writer task on completion. ## Correctness properties - `ShardDatabases::swap_dbs()` acquires write locks in ascending-index order (lower index first) to prevent deadlock on concurrent SWAPDB calls with the same pair from opposite directions. - Same-index SWAPDB (SWAPDB 0 0) returns +OK immediately with no WAL emitted and no lock contention. - Out-of-range indices (SWAPDB 0 999 with db_count=16) return ERR. - BGREWRITEAOF guard: rejects SWAPDB if `AOF_REWRITE_IN_PROGRESS` is set. The flag is set by `bgrewriteaof_start{,_sharded}` (via CAS) and cleared by the AOF writer task after rewrite completes (both monoio and tokio paths). - `ShardMessage::SwapDb { a: usize, b: usize, reply_tx: OneshotSender<()> }` fits within the 64-byte (1 cache-line) cap asserted at compile time. ## Tests added - `persistence::replay::tests::replay_swapdb_exchanges_databases` - `persistence::replay::tests::replay_swapdb_same_index_noop` - `persistence::replay::tests::replay_swapdb_out_of_range_skips` - `shard::shared_databases::tests::test_swap_dbs_exchanges_contents` - `shard::shared_databases::tests::test_swap_dbs_reverse_order_same_result` - `command::tests::swapdb_dispatch_stub_returns_error` (replaces old stub test) - `scripts/test-consistency.sh`: SWAPDB section added (key movement, same-index no-op, out-of-range ERR, reverse swap) ## E2E smoke (4-shard, `--disk-free-min-pct 0`) - `SWAPDB 0 1` — OK, db0 DBSIZE 0 db1 DBSIZE 1, key moved correctly - `SWAPDB 0 0` — OK (no-op) - `SWAPDB 0 9999` — ERR DB index is out of range - `SWAPDB 1 0` (reverse) — OK, key restored to db0 ## CI gates - cargo fmt --check: PASS - cargo clippy -- -D warnings (default features): PASS - cargo clippy --no-default-features --features runtime-tokio,jemalloc -- -D warnings: PASS - cargo test --no-default-features --features runtime-tokio,jemalloc --lib: 2618 PASS, 0 FAIL author: Tin Dang

Atomically moves a key from the connection's selected database to another database on the same shard. Both source and destination are on the same Moon shard — cross-shard moves return ERR directing users to MIGRATE (T2.8). ## Semantics (Redis-compatible) - Returns :1 on success - Returns :0 if the key is absent in the source database - Returns :0 if the key already exists in the destination database (no clobber) - Returns :0 if src_db == dst_db (no-op, Redis-compatible) - Returns ERR on wrong arity or out-of-range db index ## Architecture MOVE cannot go through the central dispatch() function which only receives one &mut Database. Each of the four handler paths intercepts MOVE before reaching dispatch(): - handler_single: drops the single-db write guard, calls with_two_dbs_locked, then re-acquires the guard to restore the loop invariant - handler_monoio: intercepted inside the is_write gate before the write path - handler_sharded: same pattern as handler_monoio - spsc_handler: intercepted before cmd_dispatch; uses with_two_slice_dbs on the ShardSlice path (no RwLock needed — SPSC is single-threaded) and with_two_dbs_locked on the legacy ShardDatabases path ## Lock ordering Lower database index is always locked first in with_two_dbs_locked to prevent deadlocks with concurrent reverse MOVE operations from other connections. The ShardSlice path uses split_at_mut (no locking) so deadlock is impossible. ## WAL WAL is appended (wal_append_and_fanout / aof_tx) only on the :1 (success) path, matching Redis AOF semantics. ## New module: src/command/keyspace/ Shared helpers used by both MOVE (T2.2) and future COPY DB n (T2.3): - move_core() — pure data-plane MOVE logic - parse_move_args() — validates MOVE key db argument structure - with_two_dbs_locked() — deadlock-safe RwLock acquisition (lower index first) - with_two_slice_dbs() — split_at_mut borrow for ShardSlice path ## Tests (13 unit tests in move_cmd.rs) - move_core: success, missing key, collision → key restored in src - parse_move_args: ok, wrong arity, negative db, out-of-range, non-numeric - with_two_dbs_locked: lower-first and higher-first orderings - with_two_slice_dbs: lower-src and higher-src orderings Smoke-tested against a live server (--appendonly no): MOVE mykey 1 → 1 (key moved) MOVE nonexistent 1 → 0 MOVE colkey 1 → 0 (collision preserved both sides) MOVE samekey 0 → 0 (same-db no-op) MOVE (bad arity) → ERR MOVE db2key 2 → 1 (move to db2 works) author: Tin Dang

Extends COPY to support cross-database copy via the `DB n` clause: COPY src dst DB n [REPLACE] The `DB n` form copies a key from the connection's selected database to database n on the same shard. Without `DB n` the command falls through to the existing same-db path in key_extra::copy (no behaviour change for the common case). ## Semantics (Redis-compatible) - Returns :1 on success (key copied to destination database) - Returns :0 if source key does not exist - Returns :0 on collision in destination without REPLACE - Returns :1 with REPLACE on collision (overwrites destination) - `DB n == current_db` falls through to key_extra::copy (same-db path) - No DB clause falls through to key_extra::copy (same-db path) - Returns ERR on invalid DB index or syntax errors ## Architecture COPY with DB n cannot go through dispatch() which only receives one &mut Database. The same handler-level interception strategy as MOVE is used: - handler_single: COPY DB n block after the MOVE block; same guard drop/re-acquire pattern; db.as_slice() dereferences SharedDatabases - handler_monoio: COPY block inside the is_write gate, before normal write path - handler_sharded: COPY block inside the is_write gate - spsc_handler: COPY block after MOVE block; same slice/locked dual-path parse_copy_db_args() returns: None — no DB clause or same-db (caller falls through to dispatch) Some(Ok(args)) — cross-db: intercept and execute copy_core Some(Err(f)) — invalid DB index or syntax: return error frame immediately ## key_extra.rs — same-db DB clause fallthrough When parse_copy_db_args returns None (same-db case), dispatch() reaches key_extra::copy(). The DB token and its index argument are consumed there so existing same-db COPY DB 0 (from current db 0) works correctly. ## New items in src/command/keyspace/move_cmd.rs - copy_core() — pure data-plane COPY logic (no locking, no WAL) - CopyDbArgs struct — parsed cross-db COPY arguments - parse_copy_db_args() — tokenises COPY args, identifies cross-db case ## Tests (9 new unit tests in move_cmd.rs) - copy_core: success, missing src, collision without replace, collision with replace - parse_copy_db_args: no DB clause (→ None), same db (→ None), cross db, with REPLACE, invalid DB index Smoke-tested against a live server (--appendonly no): COPY src dst DB 1 → 1 (key copied, src preserved) COPY src dst DB 1 (collision) → 0 (dst untouched) COPY src dst DB 1 REPLACE → 1 (dst overwritten) COPY origkey newkey DB 2 → 1 (different src/dst names across dbs) COPY nosuchkey dst DB 1 → 0 (missing src) COPY cpyfallthrough cpy2 → 1 (no DB clause, same-db fallthrough) COPY samedbkey samedbdst DB 0 → 1 (same-db DB clause, fallthrough) author: Tin Dang

Add CLUSTER REPLICAS <node-id> and its deprecated alias CLUSTER SLAVES. Wire protocol: - Returns Frame::Array of Frame::BulkString entries (one per replica). - Each entry is a CLUSTER NODES-format line with no trailing newline. - Empty array when the master has no replicas. - ERR Unknown node <id> when the requested node-id is not in cluster state. Implementation: - Extract format_node_line(node, self_node_id) helper (no trailing \n) so that CLUSTER NODES and CLUSTER REPLICAS share one formatter. - Refactor handle_cluster_nodes to use format_node_line + push('\n'). - handle_cluster_replicas filters nodes.values() by NodeFlags::Replica { master_id } == target_id. - Both REPLICAS and SLAVES dispatch to handle_cluster_replicas in the CLUSTER subcommand match arm — single code path, zero duplication. Tests added (5): - cluster_replicas_returns_empty_for_master_with_no_replicas - cluster_replicas_lists_replicas (≥9 fields, no trailing \n, both ids present) - cluster_replicas_rejects_unknown_node_id - cluster_slaves_is_alias_for_replicas (dispatch via handle_cluster_command) - cluster_replicas_includes_myself_marker_when_self_is_replica author: Tin Dang

Add CLUSTER COUNT-FAILURE-REPORTS <node-id>. Wire protocol: - Returns Frame::Integer(count) — the number of active PFAIL reports for the given node. - Returns :0 for an unknown node-id (matches real Redis behaviour). - A report is active when its age is strictly less than 2 * node_timeout_ms. Stale reports (age >= 2 * node_timeout_ms) are excluded from the count. Staleness window: - Uses DEFAULT_NODE_TIMEOUT_MS (30_000 ms) from cluster::failover, now exported as pub(crate) so both try_mark_fail_with_consensus and this handler apply the same 2× cutoff — no divergence between the two paths that read pfail_reports. - Follow-up: centralize node_timeout into ClusterState so it is configurable at runtime (current --cluster-node-timeout is 15_000 ms in Config but not yet threaded to this handler; hardcoded 30_000 used here matches the existing failover.rs constant). Implementation: - handle_cluster_count_failure_reports reads ClusterState under a shared read lock; no writes, no WAL emission. - Dispatch wired as b"COUNT-FAILURE-REPORTS" arm in handle_cluster_command. - failover.rs: DEFAULT_NODE_TIMEOUT_MS promoted from private const to pub(crate) const with doc comment. Tests added (4): - cluster_count_failure_reports_returns_zero_for_unknown_node - cluster_count_failure_reports_returns_zero_for_healthy_node - cluster_count_failure_reports_counts_active_reports - cluster_count_failure_reports_excludes_stale_reports (uses ts=0 for stale, ts=u64::MAX/2 for active — clock-skew-safe) author: Tin Dang

qodo-code-review · 2026-05-24T06:36:19Z

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

coderabbitai · 2026-05-24T06:36:30Z

Warning

Review limit reached

@TinDang97, we couldn't start this review because you've used your available PR reviews for now.

Your plan currently allows 1 review/hour. Refill in 26 minutes and 43 seconds.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more review capacity refills, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than trial, open-source, and free plans. In all cases, review capacity refills continuously over time.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a30ab4ca-af48-4e46-87fb-6e415c00f864

📥 Commits

Reviewing files that changed from the base of the PR and between e429b2b and 3267648.

📒 Files selected for processing (11)

CHANGELOG.md
src/command/keyspace/move_cmd.rs
src/server/conn/handler_monoio/dispatch.rs
src/server/conn/handler_monoio/mod.rs
src/server/conn/handler_sharded/dispatch.rs
src/server/conn/handler_sharded/mod.rs
src/server/conn/handler_single.rs
src/shard/coordinator.rs
src/shard/shared_databases.rs
src/shard/spsc_handler.rs
src/storage/dashtable/segment/insert.rs

📝 Walkthrough

Walkthrough

Adds MOVE/COPY core functions and parsing, two-db locking/slice helpers, handler-level SWAPDB with cross-shard coordination and WAL, AOF rewrite gating and lifecycle, replay-time SWAPDB handling, CLUSTER REPLICAS/COUNT-FAILURE-REPORTS handlers, a new DashTable Segment implementation, and supporting tests/scripts.

Changes

Cross-Database Commands (MOVE, COPY, SWAPDB)

Layer / File(s)	Summary
Data-plane core operations and argument parsing `src/command/keyspace/mod.rs`, `src/command/keyspace/move_cmd.rs`	`move_core` and `copy_core` implement atomic key transfer/copy semantics; `parse_move_args` and `parse_copy_db_args` parse/validate cross-db clauses; `with_two_dbs_locked` and `with_two_slice_dbs` provide deadlock-safe two-db access; unit tests added.
Command dispatch fallbacks & metadata `src/command/mod.rs`, `src/command/metadata.rs`, `src/command/key_extra.rs`	Exports `keyspace` module; updates direct-dispatch fallbacks: `MOVE`/`SWAPDB` now require handler-layer invocation; `COPY` DB token parsing now errors on missing index and falls through for same-db; tests updated.
Single-mode connection handler `src/server/conn/handler_single.rs`	`SWAPDB` intercept acquires two write locks in ascending order, swaps databases, emits WAL when enabled; write-path `MOVE`/`COPY` drop single-db guard, acquire two-db locks, perform op, reacquire guard, and append AOF on success.
Sharded monoio handler `src/server/conn/handler_monoio/dispatch.rs`, `src/server/conn/handler_monoio/mod.rs`	Async `SWAPDB` dispatcher validates args, blocks during BGREWRITEAOF, coordinates cross-shard swap; local `MOVE`/`COPY` fast-paths use shard-slice or ordered two-db locking and append AOF on non-error responses.
Sharded handler `src/server/conn/handler_sharded/dispatch.rs`, `src/server/conn/handler_sharded/mod.rs`	Async handler-layer `SWAPDB` intercept rejects MULTI/queued execution, validates indices, delegates to `coordinate_swapdb`; local `MOVE`/`COPY` handled early via `keyspace::move_cmd`.
Shard coordinator and message dispatch `src/shard/shared_databases.rs`, `src/shard/dispatch.rs`, `src/shard/coordinator.rs`, `src/shard/spsc_handler.rs`	`ShardDatabases::swap_dbs` atomically swaps two DBs under ordered locks; `ShardMessage::SwapDb` carries swap indices and oneshot reply; `coordinate_swapdb` broadcasts swap to shards, emits WAL before local swap, and awaits acknowledgements; SPSC handler appends WAL and swaps.
AOF concurrency control and replay `src/command/persistence.rs`, `src/persistence/aof.rs`, `src/persistence/replay.rs`	Introduces `AOF_REWRITE_IN_PROGRESS` atomic flag to gate BGREWRITEAOF via CAS with rollback on send failure; writer loops clear the flag after rewrite; `DispatchReplayEngine::replay_command` intercepts `SWAPDB` and swaps DBs via safe slice splitting for replay; tests added.
Consistency testing `scripts/test-consistency.sh`	Adds SWAPDB consistency checks: seed db0/db1, execute SWAPDB, verify key movement and absence, check same-index no-op and out-of-range error behavior, and swap back to restore state.

Cluster Commands Enhancement

Layer / File(s)	Summary
CLUSTER REPLICAS and COUNT-FAILURE-REPORTS `src/cluster/command.rs`, `src/cluster/failover.rs`	`handle_cluster_replicas` (alias `SLAVES`) filters replica nodes by master id and returns formatted node lines via `format_node_line`; `handle_cluster_count_failure_reports` counts recent pfail reports using a stale cutoff derived from `DEFAULT_NODE_TIMEOUT_MS` (now `pub(crate)`); tests added.

Storage: DashTable Segment

Layer / File(s)	Summary
Segment lookup and accessors `src/storage/dashtable/segment/find.rs`	Adds `find`, `get`, `get_mut`, `get_key_value`, and `is_in_non_home_group` with architecture-specific optimizations and unsafe slot access guards.
Segment insert and insert-or-update `src/storage/dashtable/segment/insert.rs`	Implements `insert`, `insert_or_update_at`, `find_free_slot_in_group`, `write_slot`, probe instrumentation, and group/fallback scanning logic.
Segment core, remove, split, and public API `src/storage/dashtable/segment/mod.rs`, `src/storage/dashtable/segment/ops.rs`	Defines `Segment` struct, control-byte layout, `h2`, `home_buckets`, `remove`, `split`, Drop safety, Send/Sync impls, and an extensive test suite validating correctness and split behavior.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant Handler
  participant Coordinator
  participant LocalShard
  participant RemoteShard
  participant AOFWriter

  Client->>Handler: SWAPDB a b
  Handler->>Handler: parse indices, check MULTI / AOF_REWRITE_IN_PROGRESS
  Handler->>Coordinator: coordinate_swapdb(a,b)
  Coordinator->>LocalShard: emit WAL (SWAPDB a b)
  LocalShard->>LocalShard: swap_dbs(a,b)
  Coordinator->>RemoteShard: ShardMessage::SwapDb(a,b)
  RemoteShard->>RemoteShard: emit WAL, swap_dbs(a,b)
  RemoteShard->>Coordinator: oneshot ack
  Coordinator->>Handler: +OK
  Note right of AOFWriter: WAL persistence occurs asynchronously

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

pilotspace/moon#63: Related BGREWRITEAOF startup and AOF rewrite flow changes used by this PR’s AOF gating.
pilotspace/moon#99: Prior dispatch-layer changes around MOVE/SWAPDB fallbacks that this PR updates.
pilotspace/moon#17: Earlier multi-shard database infrastructure work related to ShardDatabases used by SWAPDB.

Suggested labels

enhancement

Suggested reviewers

pilotspacex-byte

Poem

A rabbit nibbles keys at dawn, 🐇
Two DBs swapped before the lawn,
WALs are written, locks align,
Shards all nod — the swap's divine,
Tests hop through, the meadow's calm.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title comprehensively and clearly summarizes all 5 main features (SWAPDB, MOVE, COPY DB n, CLUSTER REPLICAS/SLAVES, COUNT-FAILURE-REPORTS) plus the perf fix in a concise, scannable format that accurately represents the changeset.
Description check	✅ Passed	The description is highly detailed and well-structured, covering objectives, commits, performance analysis, verification, and test plans. However, it deviates from the template by omitting the standard checklist sections (cargo fmt, cargo clippy, cargo test, consistency tests).
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/tier-2-lane-a

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 18

🧹 Nitpick comments (1)

src/cluster/command.rs (1)

925-961: 💤 Low value

Test won't catch the duplicate master_id bug.

The test checks fields.len() >= 9 but doesn't validate field content. With the current bug in format_node_line, replica lines will have 10 fields (duplicate master_id), and this test will pass.

Consider adding a check after fixing the bug:

// For replica nodes, verify master_id appears exactly once in the correct position
if fields[3] == "slave" || fields[3].starts_with("myself,slave") {
    assert_eq!(fields[3], "slave", "flags should be 'slave' for non-self replicas");
    assert!(fields[4].len() == 40, "field 4 should be 40-char master_id");
}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cluster/command.rs` around lines 925 - 961, The test
cluster_replicas_lists_replicas is too loose and won't catch the duplicate
master_id bug from format_node_line; update the test (inside the loop that
parses fields) to assert replica-specific layout: when fields[3] indicates a
replica (either exactly "slave" or starts with "myself,slave" for the self
case), assert flags are "slave" for non-self replicas and verify the master_id
field is in the expected position and is exactly 40 chars long (i.e., check
fields[3] == "slave" for non-self replicas and fields[4].len() == 40), and also
ensure the master_id does not appear elsewhere in the line so duplicates would
fail; add these checks after the fields: Vec<&str> =
line.split_whitespace().collect() block to catch the duplicate-master_id bug
from format_node_line.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/test-consistency.sh`:
- Around line 548-553: The seed step uses the helper/command "both" but relies
on a separate SELECT that won’t persist across invocations, so the final "both
DEL swapkey" runs against DB0 instead of DB1; update the seed to target DB1
explicitly when deleting (i.e., remove the separate SELECT 1 and invoke DEL with
DB index specified for "both" so the DEL runs against DB1 for key "swapkey") and
ensure the commands around "both SET swapkey hello" and the subsequent delete
use explicit DB targeting.
- Around line 570-576: The test currently reads redis_oor but only checks
rust_oor for "ERR" so redis_oor is unused and parity isn't asserted; update the
conditional to verify both redis_oor and rust_oor contain "ERR" (e.g., use grep
-qi "ERR" on both $redis_oor and $rust_oor) and only increment PASS when they
both match, otherwise increment FAIL and echo a message showing both $redis_oor
and $rust_oor to highlight the discrepancy; keep the existing variables
(redis_oor, rust_oor, PASS, FAIL, SWAPDB) when implementing this change.

In `@src/cluster/command.rs`:
- Around line 102-118: The flags string in format_node_line incorrectly includes
the replica's master_id for NodeFlags::Replica, causing an extra token; change
the match arms that handle NodeFlags::Replica { master_id } so they produce
"myself,slave" and "slave" (no master_id appended) when building flags_str, and
leave the separate master-id field logic (the master_id_field variable/output)
intact so the master node id is emitted only in its dedicated column.

In `@src/command/key_extra.rs`:
- Around line 42-49: The code currently consumes the DB index token by doing i
+= 1 without validating its content; instead, validate the consumed token
(args[i] before or after increment as appropriate) is a non-negative integer
(e.g., parse bytes to a usize/u64 after converting from bytes) and return
Frame::Error(Bytes::from_static(b"ERR invalid DB index")) on parse failure.
Update the branch that comments about parse_copy_db_args/same-db path to
explicitly parse and validate the DB token (the args and i variables in this
block) before proceeding so malformed values like "DB abc" are rejected.

In `@src/command/keyspace/move_cmd.rs`:
- Around line 255-264: The code in with_two_dbs_locked currently takes two write
locks even when src_idx == dst_idx, causing a self-deadlock; add an explicit
guard at the start of with_two_dbs_locked that checks if src_idx == dst_idx and
either panic! or return an Err/early-return to enforce the contract. Locate the
with_two_dbs_locked function and insert the equality check before any locking on
dbs (use the same src_idx, dst_idx variables used in the diff), so the
subsequent code that acquires lo/hi locks and calls f(&mut lo, &mut hi) never
runs for equal indices.

In `@src/server/conn/handler_monoio/dispatch.rs`:
- Around line 1016-1034: The SWAPDB handler currently parses DB indices without
verifying argument count, allowing extra args and returning the wrong error for
too few args; before calling parse_db_index on cmd_args, check that
cmd_args.len() == 2 and if not push the standard arity error via
responses.push(Frame::Error(Bytes::from_static(b"ERR wrong number of arguments
for 'swapdb' command"))) and return true; then proceed to parse using the
existing parse_db_index closure and the existing variables (parse_db_index,
cmd_args, a, b, responses, Frame::Error) as-is.

In `@src/server/conn/handler_monoio/mod.rs`:
- Around line 789-792: SWAPDB is being handled before ACL checks, allowing
unauthorized access; move the SWAPDB intercept so it runs after the ACL gate
(i.e., after the call to dispatch::try_enforce_acl) or explicitly call
dispatch::try_enforce_acl before invoking dispatch::try_handle_swapdb in the
monoio path (the symbols to change are dispatch::try_handle_swapdb and
dispatch::try_enforce_acl in handler_monoio/mod.rs) so behavior matches the
sharded handler and unauthorized clients cannot reach SWAPDB.

In `@src/server/conn/handler_sharded/dispatch.rs`:
- Around line 455-472: The SWAPDB handler currently accepts extra args and
misreports missing args because it parses only the first two values; update the
dispatch logic to require exact arity by checking cmd_args.len() == 2 before
parsing. Specifically, add a guard that if cmd_args.len() != 2 pushes the same
Frame::Error(Bytes::from_static(b"ERR wrong number of arguments for 'swapdb'
command")) (or the existing error message you prefer) into responses and returns
true; then continue to use the parse_db_index closure to parse both entries
(parse_db_index, cmd_args.first(), cmd_args.get(1)) only after the arity check
so trailing args are rejected and missing args are reported correctly.

In `@src/server/conn/handler_sharded/mod.rs`:
- Around line 1131-1200: The MOVE/COPY handlers bypass normal write-path
undo/intent bookkeeping when a cross-store transaction is active; add an early
rejection that checks the active_cross_txn flag and returns an error before any
MOVE/COPY work or AOF append. Specifically, in the metadata::is_write branch
before handling cmd.eq_ignore_ascii_case(b"MOVE") and
cmd.eq_ignore_ascii_case(b"COPY"), detect the active cross-store txn (e.g.,
conn.active_cross_txn or ctx.active_cross_txn as available) and push a
Frame::Error like "ERR cross-store transaction active; MOVE/COPY not allowed" to
responses and continue, ensuring no MOVE/COPY logic or AofMessage::Append is
executed.

In `@src/server/conn/handler_single.rs`:
- Around line 630-633: The SWAPDB handler currently only reads the first two
arguments (using idx_a = cmd_args.first() and idx_b = cmd_args.get(1)) which
lets extra trailing args pass; update the handler to enforce exact arity before
parsing by checking that cmd_args.len() == 2 (or equivalent) and returning an
argument-count error if not, then proceed to call parse_idx on both entries
(idx_a/idx_b) and the existing match branch (Some(a), Some(b)); reference the
variables idx_a, idx_b and the parse_idx usage to locate where to add the length
check and early error return.
- Around line 656-677: The code drops the result of aof_tx.try_send when
enqueueing the SWAPDB WAL, which can cause the handler to acknowledge success
even if the WAL append failed; update the SWAPDB path (the block using aof_tx,
itoa::Buffer, Frame::Array, serialize_command and AofMessage::Append) to handle
the try_send Result: check for Err and on failure propagate an error (or return
a failed response) so the state mutation is not acknowledged, or retry/send
synchronously (await) until persisted; ensure the caller sees the failure and
does not reply OK when tx.try_send returns an error.

In `@src/shard/coordinator.rs`:
- Around line 1660-1728: The file exceeds the repository size cap and the new
coordination logic (coordinate_swapdb) should be moved into a dedicated
submodule; extract the SWAPDB coordination code (the function coordinate_swapdb
plus any small helpers it needs like the WAL emit/serialize callsite and imports
for ShardMessage, spsc_send, ShardDatabases, channel types) into a new submodule
(e.g., shard::coordinator::swapdb) and update the parent coordinator module to
re-export or call into swapdb::coordinate_swapdb. Ensure you move or import all
referenced symbols (spsc_send, ShardMessage, channel::oneshot/OneshotReceiver,
shard_databases.swap_dbs/wal_append, crate::persistence::aof::serialize_command,
Frame, etc.), adjust visibility (pub(crate) or pub) and mod declarations (mod
swapdb; pub use swapdb::coordinate_swapdb;) so existing callers compile, and run
cargo check to fix any missing use paths.
- Around line 1710-1713: coordinate_swapdb currently serializes the WAL frame
and calls shard_databases.wal_append(my_shard, serialized) but proceeds to
shard_databases.swap_dbs(my_shard, a, b) even if wal_append failed (it can
silently drop send failures); change the flow to check the result of wal_append
(or make wal_append return a Result/boolean) and only call
shard_databases.swap_dbs when WAL enqueue succeeds, returning or propagating an
error from coordinate_swapdb on failure; update uses of serialize_command,
wal_append, and swap_dbs to handle and propagate the WAL enqueue failure so
local swap never happens without durable WAL record.

In `@src/shard/shared_databases.rs`:
- Around line 797-806: The swap_dbs function currently only has a debug_assert
to catch a == b but will deadlock in release builds by taking the same write
lock twice; add an explicit no-op early return at the top of swap_dbs (before
computing lo/hi and before acquiring locks) that checks if a == b and returns
immediately, preserving the existing debug_assert and then proceeding to acquire
self.shards[shard_id][lo].write() and self.shards[shard_id][hi].write() only for
lo != hi.

In `@src/shard/spsc_handler.rs`:
- Around line 2296-2325: The SwapDb handler currently performs
shard_databases.swap_dbs(a,b) without coordinating with snapshots; change it to
block or snapshot-protect SWAPDB by acquiring the snapshot protection/guard
before serialising/applying the swap (or by checking SnapshotState and
waiting/returning an error if a SnapshotState is active). Concretely, in the
ShardMessage::SwapDb branch wrap the WAL emission + shard_databases.swap_dbs
call with the existing SnapshotState API (e.g. obtain a SnapshotGuard or call
snapshot_state.block_mutations()/wait_for_no_snapshots) so swaps cannot run
concurrently with an in-flight snapshot, and keep the existing ascending-index
write-lock ordering when holding that snapshot guard. Ensure you reference the
ShardMessage::SwapDb handler, shard_databases.swap_dbs, and the
SnapshotState/snapshot guard API when implementing the coordination.
- Around line 464-509: The MOVE handler returns early (Integer(0)) for dst_db ==
db_idx and bypasses snapshot copy-on-write; before deciding or executing the
move (both the dst_db == db_idx branch and the actual move branch that calls
ksmv::move_core), invoke the snapshot COW interceptor for the source key so
snapshots capture the pre-move state—i.e., after parsing Ok((key, dst_db)) call
the existing cow_intercept function (e.g. crate::shard::snapshot::cow_intercept
or the project’s equivalent) for (db_idx, &key) while still on the shard context
(the same place where you later call ksmv::with_two_slice_dbs or
ksmv::with_two_dbs_locked) so the subsequent Integer(0) return or
ksmv::move_core cannot delete the pre-snapshot data.
- Around line 464-549: The MOVE and COPY intercept paths call
move_core/copy_core without refreshing the source and destination DBs, so
expired keys may be treated incorrectly; update the closures passed to
ksmv::with_two_slice_dbs and ksmv::with_two_dbs_locked in both the MOVE (where
move_core is invoked) and COPY (where copy_core is invoked) branches to call
refresh_now_from_cache(cached_clock) on both src and dst before calling
ksmv::move_core or ksmv::copy_core (i.e., inside each closure, first run
src.refresh_now_from_cache(cached_clock);
dst.refresh_now_from_cache(cached_clock); then invoke the core function).

---

Nitpick comments:
In `@src/cluster/command.rs`:
- Around line 925-961: The test cluster_replicas_lists_replicas is too loose and
won't catch the duplicate master_id bug from format_node_line; update the test
(inside the loop that parses fields) to assert replica-specific layout: when
fields[3] indicates a replica (either exactly "slave" or starts with
"myself,slave" for the self case), assert flags are "slave" for non-self
replicas and verify the master_id field is in the expected position and is
exactly 40 chars long (i.e., check fields[3] == "slave" for non-self replicas
and fields[4].len() == 40), and also ensure the master_id does not appear
elsewhere in the line so duplicates would fail; add these checks after the
fields: Vec<&str> = line.split_whitespace().collect() block to catch the
duplicate-master_id bug from format_node_line.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7468036a-b29c-4c8d-85aa-c4dfe0a9944f

📥 Commits

Reviewing files that changed from the base of the PR and between bddda41 and ebd240a.

📒 Files selected for processing (20)

scripts/test-consistency.sh
src/cluster/command.rs
src/cluster/failover.rs
src/command/key_extra.rs
src/command/keyspace/mod.rs
src/command/keyspace/move_cmd.rs
src/command/metadata.rs
src/command/mod.rs
src/command/persistence.rs
src/persistence/aof.rs
src/persistence/replay.rs
src/server/conn/handler_monoio/dispatch.rs
src/server/conn/handler_monoio/mod.rs
src/server/conn/handler_sharded/dispatch.rs
src/server/conn/handler_sharded/mod.rs
src/server/conn/handler_single.rs
src/shard/coordinator.rs
src/shard/dispatch.rs
src/shard/shared_databases.rs
src/shard/spsc_handler.rs

💤 Files with no reviewable changes (1)

src/command/metadata.rs

coderabbitai · 2026-05-24T06:45:14Z

+# Seed: db0 has swapkey=hello, db1 is empty
+both SELECT 0
+both SET swapkey hello
+both SELECT 1
+both DEL swapkey
+


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use explicit DB targeting when seeding SWAPDB state

both SELECT ... does not persist DB selection across later redis-cli invocations. On Line 552, DEL swapkey runs against default DB 0, which can invalidate the intended seed and make the swap check non-representative.

Suggested fix

-# Seed: db0 has swapkey=hello, db1 is empty -both SELECT 0 -both SET swapkey hello -both SELECT 1 -both DEL swapkey +# Seed: db0 has swapkey=hello, db1 is empty +redis-cli -p "$PORT_REDIS" -n 0 SET swapkey hello >/dev/null 2>&1 || true +redis-cli -p "$PORT_RUST" -n 0 SET swapkey hello >/dev/null 2>&1 || true +redis-cli -p "$PORT_REDIS" -n 1 DEL swapkey >/dev/null 2>&1 || true +redis-cli -p "$PORT_RUST" -n 1 DEL swapkey >/dev/null 2>&1 || true

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/test-consistency.sh` around lines 548 - 553, The seed step uses the helper/command "both" but relies on a separate SELECT that won’t persist across invocations, so the final "both DEL swapkey" runs against DB0 instead of DB1; update the seed to target DB1 explicitly when deleting (i.e., remove the separate SELECT 1 and invoke DEL with DB index specified for "both" so the DEL runs against DB1 for key "swapkey") and ensure the commands around "both SET swapkey hello" and the subsequent delete use explicit DB targeting.

coderabbitai · 2026-05-24T06:45:15Z

+redis_oor=$(redis-cli -p "$PORT_REDIS" SWAPDB 0 9999 2>&1) || true
+rust_oor=$(redis-cli -p "$PORT_RUST" SWAPDB 0 9999 2>&1) || true
+if echo "$rust_oor" | grep -qi "ERR"; then
+    PASS=$((PASS + 1))
+else
+    FAIL=$((FAIL + 1)); echo "  FAIL: SWAPDB out-of-range should return ERR, got: $rust_oor"
+fi


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Assert out-of-range parity against Redis (and use redis_oor)

Line 570 captures redis_oor, but the check only validates moon output contains ERR. This misses direct parity verification and leaves redis_oor unused.

Suggested fix

redis_oor=$(redis-cli -p "$PORT_REDIS" SWAPDB 0 9999 2>&1) || true rust_oor=$(redis-cli -p "$PORT_RUST" SWAPDB 0 9999 2>&1) || true -if echo "$rust_oor" | grep -qi "ERR"; then - PASS=$((PASS + 1)) -else - FAIL=$((FAIL + 1)); echo " FAIL: SWAPDB out-of-range should return ERR, got: $rust_oor" -fi +assert_eq "SWAPDB out-of-range parity" "$redis_oor" "$rust_oor"

🧰 Tools

🪛 Shellcheck (0.11.0)

[warning] 570-570: redis_oor appears unused. Verify use (or export if used externally).

(SC2034)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/test-consistency.sh` around lines 570 - 576, The test currently reads redis_oor but only checks rust_oor for "ERR" so redis_oor is unused and parity isn't asserted; update the conditional to verify both redis_oor and rust_oor contain "ERR" (e.g., use grep -qi "ERR" on both $redis_oor and $rust_oor) and only increment PASS when they both match, otherwise increment FAIL and echo a message showing both $redis_oor and $rust_oor to highlight the discrepancy; keep the existing variables (redis_oor, rust_oor, PASS, FAIL, SWAPDB) when implementing this change.

coderabbitai · 2026-05-24T06:45:15Z

+fn format_node_line(node: &ClusterNode, self_node_id: &str) -> String {
+    let flags_str = if node.node_id == self_node_id {
+        // self: prepend "myself,"
+        match &node.flags {
+            NodeFlags::Master => "myself,master".to_string(),
+            NodeFlags::Replica { master_id } => format!("myself,slave {}", master_id),
+            NodeFlags::Pfail => "myself,pfail".to_string(),
+            NodeFlags::Fail => "myself,fail".to_string(),
+        }
+    } else {
+        match &node.flags {
+            NodeFlags::Master => "master".to_string(),
+            NodeFlags::Replica { master_id } => format!("slave {}", master_id),
+            NodeFlags::Pfail => "pfail".to_string(),
+            NodeFlags::Fail => "fail".to_string(),
+        }
+    };


⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

Redis CLUSTER NODES output format for replica nodes

💡 Result:

In Redis Cluster, redis-cli CLUSTER NODES output is line-based; each line corresponds to one node entry in the cluster state as seen by the contacted node. Replica nodes are represented as one of the entries whose flags include “slave” (older name) or “replica”, and whose “master” field contains the node ID of the primary it replicates (or “-” if the primary isn’t known). The line also includes fields such as “config-epoch”, status/link info, and the list of hash slots (replicas typically don’t list slots). [1] Replica node-specific fields/markers you’ll see in each CLUSTER NODES line: 1) Flags token: “slave” / “replica” (indicates the node is a replica). [1] 2) Master node ID field: if the node is a replica and the primary is known, the “master” field is the primary’s node ID; otherwise it is “-”. [1] 3) “config-epoch” field: for a replica, this is the configuration epoch of the current primary (the “real” meaning is tied to the primary; replicas broadcast the primary’s config epochs via heartbeats). [1] Example shape (tokens/fields), taken from the documented explanation of how CLUSTER NODES lines are structured: - Node ID - ip:port@cluster-bus-port - flags (includes “slave”/“replica” for replicas) - master (primary node ID, or “-” if unknown) - … other timing/config/link fields - … hash slots list (typically only present for primaries). [1] So, to identify replica nodes in CLUSTER NODES output: look for lines whose flags include “slave”/“replica”, then confirm the replicated primary via the “master” field in that same line. [1]

Citations:

1: https://redis.io/docs/latest/commands/cluster-nodes/

🏁 Script executed:

#!/bin/bash set -euo pipefail # Show the helper implementation around the referenced line range nl -ba src/cluster/command.rs | sed -n '80,180p'

Repository: pilotspace/moon

Length of output: 102

🏁 Script executed:

#!/bin/bash set -euo pipefail awk '{printf "%d\t%s\n", NR, $0}' src/cluster/command.rs | sed -n '80,180p'

Repository: pilotspace/moon

Length of output: 4208

Fix CLUSTER NODES replica formatting: don’t embed master-id in the flags field

format_node_line builds flags_str for NodeFlags::Replica { master_id } as "myself,slave {master_id}" / "slave {master_id}" (lines 103-118) and then also outputs master_id_field right after (lines 120-123, 132-145). Because flags_str contains a space, the line splits into an extra token, duplicating/misalaligning the master-id field. Redis expects the flags field to be a single token (slave / myself,slave) with the replicated primary’s node id in the following master-id field.

🔧 Proposed fix

let flags_str = if node.node_id == self_node_id { // self: prepend "myself," match &node.flags { NodeFlags::Master => "myself,master".to_string(), - NodeFlags::Replica { master_id } => format!("myself,slave {}", master_id), + NodeFlags::Replica { master_id: _ } => "myself,slave".to_string(), NodeFlags::Pfail => "myself,pfail".to_string(), NodeFlags::Fail => "myself,fail".to_string(), } } else { match &node.flags { NodeFlags::Master => "master".to_string(), - NodeFlags::Replica { master_id } => format!("slave {}", master_id), + NodeFlags::Replica { master_id: _ } => "slave".to_string(), NodeFlags::Pfail => "pfail".to_string(), NodeFlags::Fail => "fail".to_string(), } };

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/cluster/command.rs` around lines 102 - 118, The flags string in format_node_line incorrectly includes the replica's master_id for NodeFlags::Replica, causing an extra token; change the match arms that handle NodeFlags::Replica { master_id } so they produce "myself,slave" and "slave" (no master_id appended) when building flags_str, and leave the separate master-id field logic (the master_id_field variable/output) intact so the master node id is emitted only in its dedicated column.

coderabbitai · 2026-05-24T06:45:15Z

+            // Cross-db COPY is intercepted by handler-level code before dispatch().
+            // When this branch is reached, the DB index has already been validated
+            // as the same database (parse_copy_db_args returned None → same-db path).
+            // Consume the DB index argument and continue as a same-db copy.
+            i += 1; // skip the db-index token
+            if i >= args.len() {
+                return Frame::Error(Bytes::from_static(b"ERR syntax error"));
+            }


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate the consumed DB index token instead of accepting arbitrary bytes.

This branch only checks presence, so malformed values (e.g., DB abc) are silently accepted if interception is bypassed. Validate that the skipped token is a non-negative integer before continuing.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/command/key_extra.rs` around lines 42 - 49, The code currently consumes the DB index token by doing i += 1 without validating its content; instead, validate the consumed token (args[i] before or after increment as appropriate) is a non-negative integer (e.g., parse bytes to a usize/u64 after converting from bytes) and return Frame::Error(Bytes::from_static(b"ERR invalid DB index")) on parse failure. Update the branch that comments about parse_copy_db_args/same-db path to explicitly parse and validate the DB token (the args and i variables in this block) before proceeding so malformed values like "DB abc" are rejected.

coderabbitai · 2026-05-24T06:45:15Z

+                if cmd.eq_ignore_ascii_case(b"MOVE") {
+                    use crate::command::keyspace::move_cmd as ksmv;
+                    let response = match ksmv::parse_move_args(args, db_count) {
+                        Err(e) => e,
+                        Ok((_key, dst_db)) if dst_db == db_idx => {
+                            crate::protocol::Frame::Integer(0)
+                        }
+                        Ok((key, dst_db)) => {
+                            // SPSC runs single-threaded per shard; no concurrent MOVE can
+                            // deadlock. slice path uses split_at_mut (no locking needed).
+                            if crate::shard::slice::is_initialized() {
+                                crate::shard::slice::with_shard(|s| {
+                                    ksmv::with_two_slice_dbs(
+                                        &mut s.databases,
+                                        db_idx,
+                                        dst_db,
+                                        |src, dst| ksmv::move_core(src, dst, &key),
+                                    )
+                                })
+                            } else {
+                                // Lock ordering (lower index first) prevents deadlock with
+                                // handler_monoio/sharded connections on the same shard.
+                                ksmv::with_two_dbs_locked(
+                                    &shard_databases.all_shard_dbs()[shard_id],
+                                    db_idx,
+                                    dst_db,
+                                    |src, dst| ksmv::move_core(src, dst, &key),
+                                )
+                            }
+                        }
+                    };
+                    if matches!(response, crate::protocol::Frame::Integer(1)) {
+                        let serialized = aof::serialize_command(&command);
+                        wal_append_and_fanout(
+                            &serialized,
+                            wal_writer,
+                            wal_v3_writer,
+                            repl_backlog,
+                            replica_txs,
+                            repl_state,
+                            shard_id,
+                        );
+                    }
+                    let _ = reply_tx.send(response);
+                    return;
+                }


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Preserve snapshot COW for MOVE.

This early return bypasses the usual cow_intercept on the source key. If a snapshot is active and the source segment has not been serialized yet, MOVE can delete the key before the snapshot captures its pre-command state.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/shard/spsc_handler.rs` around lines 464 - 509, The MOVE handler returns early (Integer(0)) for dst_db == db_idx and bypasses snapshot copy-on-write; before deciding or executing the move (both the dst_db == db_idx branch and the actual move branch that calls ksmv::move_core), invoke the snapshot COW interceptor for the source key so snapshots capture the pre-move state—i.e., after parsing Ok((key, dst_db)) call the existing cow_intercept function (e.g. crate::shard::snapshot::cow_intercept or the project’s equivalent) for (db_idx, &key) while still on the shard context (the same place where you later call ksmv::with_two_slice_dbs or ksmv::with_two_dbs_locked) so the subsequent Integer(0) return or ksmv::move_core cannot delete the pre-snapshot data.

coderabbitai · 2026-05-24T06:45:15Z

+        ShardMessage::SwapDb { a, b, reply_tx } => {
+            // WAL-before-swap: emit the SWAPDB record so that crash-recovery
+            // replay can re-apply the swap in the correct order.  The record
+            // is written even when wal_writer/wal_v3_writer are None (the
+            // fast-path in wal_append_and_fanout will skip it cheaply).
+            //
+            // Serialise "SWAPDB <a> <b>" without heap allocation on the number
+            // formatting (itoa writes into a stack buffer).
+            let mut a_buf = itoa::Buffer::new();
+            let mut b_buf = itoa::Buffer::new();
+            let a_str = a_buf.format(a);
+            let b_str = b_buf.format(b);
+            let wal_frame = crate::protocol::Frame::Array(crate::framevec![
+                crate::protocol::Frame::BulkString(bytes::Bytes::from_static(b"SWAPDB")),
+                crate::protocol::Frame::BulkString(bytes::Bytes::copy_from_slice(a_str.as_bytes())),
+                crate::protocol::Frame::BulkString(bytes::Bytes::copy_from_slice(b_str.as_bytes())),
+            ]);
+            let serialized = aof::serialize_command(&wal_frame);
+            wal_append_and_fanout(
+                &serialized,
+                wal_writer,
+                wal_v3_writer,
+                repl_backlog,
+                replica_txs,
+                repl_state,
+                shard_id,
+            );
+
+            // Perform the in-place swap under ascending-index write locks.
+            shard_databases.swap_dbs(shard_id, a, b);


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Block or snapshot-protect SWAPDB.

SWAPDB mutates entire databases but never goes through the normal write/COW path. Running this while SnapshotState is active can expose swapped contents to an in-flight snapshot with no way to reconstruct the pre-swap view.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/shard/spsc_handler.rs` around lines 2296 - 2325, The SwapDb handler currently performs shard_databases.swap_dbs(a,b) without coordinating with snapshots; change it to block or snapshot-protect SWAPDB by acquiring the snapshot protection/guard before serialising/applying the swap (or by checking SnapshotState and waiting/returning an error if a SnapshotState is active). Concretely, in the ShardMessage::SwapDb branch wrap the WAL emission + shard_databases.swap_dbs call with the existing SnapshotState API (e.g. obtain a SnapshotGuard or call snapshot_state.block_mutations()/wait_for_no_snapshots) so swaps cannot run concurrently with an in-flight snapshot, and keep the existing ascending-index write-lock ordering when holding that snapshot guard. Ensure you reference the ShardMessage::SwapDb handler, shard_databases.swap_dbs, and the SnapshotState/snapshot guard API when implementing the coordination.

Lane A's T2.2 (MOVE) and T2.3 (COPY ... DB n) wrapped their command-name checks in an outer `if metadata::is_write(cmd) { ... }` guard inside the per-frame dispatch loop of handler_monoio and handler_sharded. The existing write-path block below already calls `metadata::is_write(cmd)`, so every write command (SET, HSET, ZADD, etc.) was paying TWO PHF table lookups instead of one. In-session A/B on moon-dev (s=1, c=400, median-of-3) measured Lane A SET p=1 at 218K rps versus the pre-Lane-A baseline (46ed28a) at 241K rps — a -9.5% regression in the per-command-overhead-dominated cell. Other cells (GET p=*, SET p>=16) were unaffected because the extra PHF lookup amortizes away at pipeline depth >= 16. ## Fix Remove the redundant outer `if metadata::is_write(cmd) {` wrapper. MOVE and COPY are themselves write commands; the inner `cmd.eq_ignore_ascii_case(b"MOVE")` / `b"COPY"` checks already gate correctly. For non-MOVE/COPY workloads (SET, GET, HSET, etc.) the branch predictor learns "false" on both inline checks for free (~5ns each, fully predictable). ## Result (post-fix, in-session vs 46ed28a baseline) | cell | 46ed28a | Lane A pre-fix | Lane A post-fix | delta vs base | |-------------|---------|----------------|-----------------|---------------| | GET p=1 | 198K | 211K | 201K | +1.5% | | GET p=16 | 2.65M | 2.88M | 2.52M | -4.9% | | GET p=64 | 7.25M | 7.54M | 7.18M | -1.0% | | SET p=1 | 241K | 218K (-9.5%) | 243K | +0.9% | | SET p=16 | 1.49M | 1.50M | 1.48M | -0.7% | | SET p=64 | 2.74M | 2.76M | 2.63M | -4.0% | All cells now within +/-5% of 46ed28a in-session. ## Files - src/server/conn/handler_monoio/mod.rs: lines 1075-1173 (wrapper removed) - src/server/conn/handler_sharded/mod.rs: lines 1131-1200 (wrapper removed) ## Functional verification (post-fix, --shards 1) - MOVE k0 1 -> :1, GET 0 = (nil), GET (db1) = v0 PASS - COPY src dst DB 2 -> :1, GET (db2) = srcval, src preserved PASS - SWAPDB 0 3 -> +OK, GET (db0) = v3, GET (db3) = v0 PASS ## CI gates - cargo fmt --check: PASS - cargo clippy --release -- -D warnings: PASS - cargo test --release --lib: 3273 PASS / 0 FAIL - cargo test --no-default-features --features runtime-tokio,jemalloc --lib: 2668 PASS / 0 FAIL handler_single.rs and shard/spsc_handler.rs were already correct (they gate MOVE/COPY directly with eq_ignore_ascii_case, no outer is_write wrapper). author: Tin Dang

qodo-code-review · 2026-05-24T07:55:07Z

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

src/server/conn/handler_monoio/mod.rs (1)

1120-1174: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Run the normal eviction/OOM gate before cross-DB COPY.

COPY ... DB n can grow memory, but this fast-path returns before the write path's try_evict_if_needed / try_evict_if_needed_async_spill checks. Under maxmemory or spill-offload, it can duplicate a large value without the usual backpressure.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/server/conn/handler_monoio/mod.rs` around lines 1120 - 1174, The
fast-path for cross-DB COPY currently performs ksmv::copy_core without invoking
the usual eviction/OOM gate; before executing the cross-db branch (i.e., after
parse_copy_db_args returns Ok(ca) and before calling ksmv::with_two_slice_dbs /
ksmv::with_two_dbs_locked and ksmv::copy_core) call the same eviction checks
used by normal writes (try_evict_if_needed and/or
try_evict_if_needed_async_spill on ctx) with the expected size delta for the
copy so we apply backpressure or async spill before duplicating large values;
keep the existing AOF append (ctx.aof_tx / aof::serialize_command /
AofMessage::Append) behavior unchanged and only proceed to call ksmv::copy_core
if the eviction gate allows the write.

src/server/conn/handler_sharded/mod.rs (1)

1170-1202: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Preserve maxmemory/OOM enforcement for cross-DB COPY.

This intercept mutates the destination DB before the normal write path runs, so it skips try_evict_if_needed. A large COPY ... DB n can therefore bypass the shard's memory guard.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/server/conn/handler_sharded/mod.rs` around lines 1170 - 1202, The current
COPY DB branch mutates the destination DB via ksmv::copy_core (called through
ksmv::with_two_slice_dbs or ksmv::with_two_dbs_locked) before the shard-level
memory enforcement runs, letting large cross-DB COPY bypass try_evict_if_needed;
fix by invoking the shard eviction/OOM enforcement for the destination DB prior
to performing the copy (e.g., detect cross-DB copy from parse_copy_db_args, and
call the existing try_evict_if_needed mechanism on the destination DB via
ctx.shard_databases (or by obtaining the dst DB handle and running
try_evict_if_needed) before calling
with_two_slice_dbs/with_two_dbs_locked/copy_core), ensuring the call path
mirrors the normal write path’s eviction check so COPY cannot exceed maxmemory.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/server/conn/handler_monoio/mod.rs`:
- Around line 1110-1114: The current AOF append gate `if !matches!(response,
Frame::Error(_)) { ... ctx.aof_tx ... aof::serialize_command(&frame) ...
AofMessage::Append(...) }` wrongly persists no-op responses like
`Frame::Integer(0)` and also appends MOVE/COPY even when they returned 0; change
the condition so you skip appending when the response is `Frame::Integer(0)`
and, for MOVE/COPY commands, only send the serialized command to `ctx.aof_tx`
when the response equals `Frame::Integer(1)` (success); keep the existing
skip-on-`Frame::Error(_)` behavior and use the same checks in the parallel block
at the other location (around lines 1164-1167).

In `@src/server/conn/handler_sharded/mod.rs`:
- Around line 1161-1165: The current AOF append branch treats any non-Error
response as a write; change it to only append when the command actually
succeeded and the selected DB is :1: replace the condition around
aof_bytes/AofMessage::Append so it checks (1) the response is a success frame
(not Frame::Error and not a noop/null result — mirror the success-check logic
used in handler_single.rs) and (2) the connection context indicates DB 1 (use
the same ctx DB selector accessor used elsewhere in this module), and apply the
identical fix at the other occurrence (the block at 1193-1196); keep using
aof_bytes, ctx.aof_tx and AofMessage::Append for the append path.

---

Outside diff comments:
In `@src/server/conn/handler_monoio/mod.rs`:
- Around line 1120-1174: The fast-path for cross-DB COPY currently performs
ksmv::copy_core without invoking the usual eviction/OOM gate; before executing
the cross-db branch (i.e., after parse_copy_db_args returns Ok(ca) and before
calling ksmv::with_two_slice_dbs / ksmv::with_two_dbs_locked and
ksmv::copy_core) call the same eviction checks used by normal writes
(try_evict_if_needed and/or try_evict_if_needed_async_spill on ctx) with the
expected size delta for the copy so we apply backpressure or async spill before
duplicating large values; keep the existing AOF append (ctx.aof_tx /
aof::serialize_command / AofMessage::Append) behavior unchanged and only proceed
to call ksmv::copy_core if the eviction gate allows the write.

In `@src/server/conn/handler_sharded/mod.rs`:
- Around line 1170-1202: The current COPY DB branch mutates the destination DB
via ksmv::copy_core (called through ksmv::with_two_slice_dbs or
ksmv::with_two_dbs_locked) before the shard-level memory enforcement runs,
letting large cross-DB COPY bypass try_evict_if_needed; fix by invoking the
shard eviction/OOM enforcement for the destination DB prior to performing the
copy (e.g., detect cross-DB copy from parse_copy_db_args, and call the existing
try_evict_if_needed mechanism on the destination DB via ctx.shard_databases (or
by obtaining the dst DB handle and running try_evict_if_needed) before calling
with_two_slice_dbs/with_two_dbs_locked/copy_core), ensuring the call path
mirrors the normal write path’s eviction check so COPY cannot exceed maxmemory.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 920bf565-5bc3-4fdd-9eed-fdb19d4645a4

📥 Commits

Reviewing files that changed from the base of the PR and between ebd240a and 608e2d1.

📒 Files selected for processing (2)

src/server/conn/handler_monoio/mod.rs
src/server/conn/handler_sharded/mod.rs

The segment.rs file was 1587 LOC, 87 over the 1500-line per-file limit in CLAUDE.md. Splitting was needed before adding any new code to the segment hot path. This commit is a mechanical refactor with zero semantic change. ## Layout src/storage/dashtable/segment/ mod.rs (819 LOC) — types, constants, Segment struct, basic accessors (count/depth/is_full/has_non_home_keys/ ctrl_byte/iter_occupied/key_ref/value_ref/ value_mut), prefetch_ptr, Drop, Send/Sync, and ALL tests (tests exercise the cross-file public API so they live with the types) find.rs (291 LOC) — find / get / get_mut / find_slot_mut / get_key_value / is_in_non_home_group (test helper) insert.rs (367 LOC) — insert / insert_or_update_at + helpers (find_free_slot_in_group, write_slot, probe_count / bump_probe_count cfg variants) ops.rs (150 LOC) — remove / split / insert_during_split, plus the free function home_buckets Each file ≤ 367 LOC; mod.rs at 819 LOC includes ~500 LOC of tests. All files comfortably under the 1500 limit. ## Visibility adjustments A few previously-private items needed `pub(super)` for cross-file access: - Segment::ctrl, count, depth, has_non_home_keys, probe_count (test), keys, values - Segment::is_full_ctrl (used by Drop in mod, split in ops) - Segment::write_slot, find_free_slot_in_group (used by insert and ops) - prefetch_ptr (used by find and insert) `home_buckets` is re-exported from `mod.rs` via `pub use ops::home_buckets;` so the public API path (`segment::home_buckets`) is unchanged. ## Verification - cargo build --release: PASS - cargo test --release --lib storage::dashtable: 50 passed / 0 failed - cargo test --no-default-features --features runtime-tokio,jemalloc --lib storage::dashtable: 50 passed / 0 failed - cargo fmt --check: PASS - cargo clippy --release -- -D warnings: PASS - cargo clippy --no-default-features --features runtime-tokio,jemalloc -- -D warnings: PASS No semantic change; the split prepares for S3.3 (second-byte fingerprint filter) which adds ~180 LOC to find/insert paths. author: Tin Dang

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/storage/dashtable/segment/insert.rs`:
- Around line 350-365: The linear-scan fallback in the block that computes
free_slot (when first_free is None) can place the key into a non-home group but
does not set has_non_home_keys, causing future find() calls to skip the
fallback; modify that closure so when it finds and returns a slot whose group !=
home_group it sets has_non_home_keys = true before returning (use the same
home-group check logic as the other insert path), then continue to call make(),
self.write_slot(free_slot, h2, k, v) and return SegmentInsertOrUpdate::Inserted
{ slot: free_slot } as before; reference symbols: first_free, has_non_home_keys,
ctrl_byte, TOTAL_SLOTS, home_group, write_slot, make(),
SegmentInsertOrUpdate::Inserted.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 36363094-8c7a-4a80-8096-77794fb3696b

📥 Commits

Reviewing files that changed from the base of the PR and between 608e2d1 and e429b2b.

📒 Files selected for processing (5)

src/storage/dashtable/segment.rs
src/storage/dashtable/segment/find.rs
src/storage/dashtable/segment/insert.rs
src/storage/dashtable/segment/mod.rs
src/storage/dashtable/segment/ops.rs

💤 Files with no reviewable changes (1)

src/storage/dashtable/segment.rs

Round 1 of CodeRabbit's review surfaced 21 actionable comments across the five Lane A commits + the segment split. This commit addresses every issue that is a real bug or a quick durability/security win; structural items (coordinator.rs > 1500 LOC split, snapshot-COW interaction with MOVE/SWAPDB) are scoped out to a follow-up PR per the inline replies. ## Security - `handler_monoio/mod.rs`: SWAPDB intercept moved to run AFTER `try_enforce_acl`. Previously an unauthenticated client could SWAPDB via the monoio path even though handler_sharded already enforced the gate — closes the runtime-drift authorization bypass. ## Durability - `shared_databases.rs`: introduce `try_wal_append_required(...)` — strict variant of `wal_append` that returns `false` only when persistence is configured AND the channel rejected the enqueue. SWAPDB has no command-level rollback, so dropping the WAL silently would diverge cluster state from the on-disk log. - `coordinator.rs::coordinate_swapdb`: use the strict variant and return `ERR SWAPDB aborted: WAL enqueue failed` rather than performing a non-durable swap. - `handler_single.rs` SWAPDB: same — try_send first, only swap on success. ## Correctness / data integrity - `storage/dashtable/segment/insert.rs::insert_or_update_at`: set `has_non_home_keys = true` whenever the chosen free slot is in a non-home regular group (mirrors `insert()` lines 72-79). Without this, subsequent `find()` calls would skip the fallback scan (gated on the flag) and miss the just-inserted key. - `move_cmd.rs::with_two_dbs_locked`: hard `assert_ne!` on src == dst. parking_lot::RwLock is not reentrant; the second write() would self-deadlock in release builds. Callers (MOVE/COPY) short-circuit src == dst → :0 already, but defending here is cheap and surfaces caller misuse. - `shared_databases.rs::swap_dbs`: same-index short-circuit (no-op return) in release builds. The previous `debug_assert_ne!` got compiled out and a rogue SPSC SwapDb message with a == b would stall the shard event loop. ## TXN bookkeeping - handler_sharded + handler_monoio: `MOVE`, `COPY ... DB n`, and `SWAPDB` reject with `ERR_TXN_CROSS_SHARD` while `conn.in_cross_txn()` is true. These intercepts bypass the regular write-path undo/intents capture, so TXN.ABORT cannot reconstruct pre-command state. Same policy as cross-shard writes — rejected today, can be relaxed in a future PR once two-DB transaction bookkeeping lands. ## Protocol conformance - `handler_{single,sharded,monoio}` SWAPDB: exact arity check before parsing (`cmd_args.len() != 2 → ERR wrong number of arguments for 'swapdb' command`). Previously `SWAPDB 0 1 extra` would execute and `SWAPDB 0` would return the integer/range error. - handler_sharded + handler_monoio MOVE/COPY: AOF append gated on `Frame::Integer(1)` (success) rather than `!Error(_)`. The previous predicate logged `:0` no-ops, drifting from handler_single (which is success-only). ## Expiry semantics - `spsc_handler.rs` MOVE/COPY arms: call `refresh_now_from_cache` on both source and destination DBs before `move_core` / `copy_core`. The regular single-DB write path does this (line 583); skipping it for the two-DB intercepts let expired source keys appear movable and expired destination keys block REPLACE-less COPY. ## CI hygiene - `CHANGELOG.md`: PR #100 entry under "Tier 2 Lane A" plus "Fixed" / "Refactor" sections (unblocks the Lint CHANGELOG gate). ## Scoped out (separate PR — tracked in inline replies) - `coordinator.rs` is 1700+ LOC; the split is a heavy mechanical refactor that belongs in its own PR alongside the parallel handler split (both files exceed the 1500-LOC limit). - Snapshot-COW interaction for MOVE/SWAPDB (CodeRabbit comments on `spsc_handler.rs:509` / `:2325`): requires a design pass on snapshot reachability for two-DB commands, not a quick fix. - `CLUSTER REPLICAS` format pedantry: needs deeper Redis-format verification. - `test-consistency.sh` SWAPDB seeding: bash script polish, low value. - `key_extra.rs:49` DB index byte validation: belongs in the COPY arg parser rewrite, not here. ## Verification - `cargo build --release` PASS - `cargo build --no-default-features --features runtime-tokio,jemalloc` PASS - `cargo clippy --release -- -D warnings` PASS - `cargo clippy --no-default-features --features runtime-tokio,jemalloc -- -D warnings` PASS - `cargo fmt --check` PASS - `cargo test --release --lib` — 3256 PASS / 0 FAIL - `cargo test --no-default-features --features runtime-tokio,jemalloc --lib` — 2649 PASS / 0 FAIL author: Tin Dang

Second pass on the CodeRabbit review surfaced four findings that the previous fix wave did not address. This commit closes them all. ## Real bugs - `src/cluster/command.rs::format_node_line`: `NodeFlags::Replica { master_id }` was rendered as `"slave <master_id>"` inside the flags column AND the master-id was emitted again in its own dedicated column. The result was a malformed `CLUSTER NODES` / `CLUSTER REPLICAS` line where the master-id appeared twice and every column right of "flags" shifted by one — breaking whitespace-tokenized parsers (including redis-cli's own splitter). Flags is now `"slave"` / `"myself,slave"` only; master-id stays in its own column. - `src/command/key_extra.rs` `COPY ... DB <idx>`: the DB index token was consumed by `i += 1` without parsing. The handler-level intercept normally validates the DB index before reaching dispatch, but any path that bypassed the intercept (or future caller wiring) would silently accept garbage like `COPY src dst DB abc`. Now defensively `parse::<usize>()`s the token and returns the canonical `ERR value is not an integer or out of range` on malformed input. ## Test hardening - `src/cluster/command.rs` test `cluster_replicas_lists_replicas`: was loose (only checked >=9 fields, gathered ids). It would not have caught the duplicate-master-id bug above. Now pins the exact replica-line layout: 9 whitespace-separated columns, `fields[2] == "slave"` (not "slave <master_id>"), `fields[3]` is exactly the 40-char master id, and the master-id appears exactly once on the line. Any future regression that re-embeds master-id in the flags column will flip the column count from 9 → 10 and trip the assertion. ## Script hygiene - `scripts/test-consistency.sh` SWAPDB block: * Seed step previously used `both SELECT 0; both SET swapkey hello; both SELECT 1; both DEL swapkey`. `redis-cli SELECT` does NOT persist across separate invocations, so the trailing `DEL swapkey` ran against db0 and removed the key we had just seeded. Now uses explicit `redis-cli -n <db>` for each step. * Out-of-range parity check captured `$redis_oor` but only asserted `$rust_oor` contained `ERR`. A divergence (moon errors, Redis silently OKs) would have passed. Now requires both sides to return `ERR` and prints both messages on failure. ## Scoped out (intentionally not in this PR) The remaining CodeRabbit comments need design work, not patches: - `src/shard/coordinator.rs` is over the 1500-LOC cap. Splitting it out (e.g. `coordinator::swapdb`) is mechanical refactor work that belongs with the parallel `handler_{sharded,monoio}/mod.rs` split — they're all over the cap and should land together. - Snapshot-COW interaction with `SwapDb` / `MOVE` in `src/shard/spsc_handler.rs` needs a design pass on snapshot reachability for two-DB commands, not a quick guard. ## Verification - `cargo build --release` PASS - `cargo build --no-default-features --features runtime-tokio,jemalloc` PASS - `cargo clippy --release -- -D warnings` PASS - `cargo clippy --no-default-features --features runtime-tokio,jemalloc -- -D warnings` PASS - `cargo fmt --check` PASS - `cargo test --release --lib` — 3256 PASS / 0 FAIL - `cargo test --no-default-features --features runtime-tokio,jemalloc --lib` — 2649 PASS / 0 FAIL - Functional: `COPY src dst DB abc` returns ERR; `COPY src dst DB 99999` returns ERR (out-of-range intercept still fires); CLUSTER REPLICAS layout verified by the strengthened unit test. author: Tin Dang

…ti-shard) (#102) Three small helper scripts used during PR #100 validation. Captured here so future contributors can rerun the same exact methodology without recreating it from scratch. ## Scripts - `scripts/bench-s1-only.sh` — tight L-A.G3 verification: s=1 only, GET+SET, p=1/16/64, median-of-3. Used to isolate whether an s=1 regression is real or a bench-environment artifact. Moon-only (no Redis comparison). - `scripts/bench-pr100.sh` — Moon vs Redis side-by-side at s=1. GET+SET, p=1/16/64, c=400, median-of-3. Random keys via `-r 100000` so SET writes spread across the keyspace. ~15 min total runtime. - `scripts/bench-pr100-multishard.sh` — same matrix but cycles Moon at s=1/2/4 against a single-process Redis baseline. ~25-30 min. Useful for documenting the well-known multi-shard scaling profile on random keyspace (see `project_multi_shard_limitations.md`). All three scripts: - assume `redis-server` + `redis-benchmark` on PATH (works in the OrbStack `moon-dev` VM out of the box) - pass `--appendonly no --disk-offload disable` to Moon (fair-bench defaults per `feedback_bench_disk_offload_fairness.md`) - emit a markdown summary table + a TSV with raw `r1/r2/r3/median` values for archiving No source changes. Reference results from these scripts are inlined in the PR #100 description. author: Tin Dang

TinDang97 added 5 commits May 24, 2026 13:09

coderabbitai Bot reviewed May 24, 2026

View reviewed changes

TinDang97 marked this pull request as draft May 24, 2026 07:03

TinDang97 marked this pull request as ready for review May 24, 2026 07:55

coderabbitai Bot reviewed May 24, 2026

View reviewed changes

Comment thread src/server/conn/handler_monoio/mod.rs Outdated

Comment thread src/server/conn/handler_sharded/mod.rs Outdated

coderabbitai Bot reviewed May 24, 2026

View reviewed changes

Comment thread src/storage/dashtable/segment/insert.rs

TinDang97 merged commit 22c4e40 into main May 24, 2026
5 of 8 checks passed

TinDang97 deleted the feat/tier-2-lane-a branch May 24, 2026 12:27

TinDang97 mentioned this pull request May 24, 2026

fix(tier-2): CodeRabbit follow-ups from PR #100 #101

Merged

4 tasks

TinDang97 mentioned this pull request May 24, 2026

chore(scripts): add PR #100 bench helpers #102

Merged

2 tasks

Conversation

TinDang97 commented May 24, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Commits (7)

Sprint 3 perf scope (S3.1–S3.6)

CI gates green

Performance verification (L-A.G3)

Initial finding — s=1 SET p=1 regression

Root cause

Fix (commit 608e2d1)

Post-fix in-session A/B vs 46ed28a (May-23 baseline commit)

Post-fix scaling matrix (run-2 canonical, run-1 hit VM contention)

Methodology note for the L-A.G3 gate

Functional verification (post-fix, --shards 1)

Out of scope

Test plan

Summary by CodeRabbit

Uh oh!

qodo-code-review Bot commented May 24, 2026

Qodo reviews are paused for this user.

Uh oh!

coderabbitai Bot commented May 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 24, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot May 24, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot May 24, 2026

Choose a reason for hiding this comment

Uh oh!

qodo-code-review Bot commented May 24, 2026

Qodo reviews are paused for this user.

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

TinDang97 commented May 24, 2026 •

edited by coderabbitai Bot

Loading

Fix (commit `608e2d1`)

Post-fix in-session A/B vs `46ed28a` (May-23 baseline commit)

coderabbitai Bot commented May 24, 2026 •

edited

Loading