perf: PR #43 recovery — ACL caching, inline SET, NEON SIMD #71

Merged
pilotspacex-byte merged 10 commits into main from perf/pr43-recovery on Apr 12, 2026
Conversation

@pilotspacex-byte (Contributor) commented Apr 11, 2026

Summary

Three independent optimisations targeting remaining hot spots from the PR #43 pipelined-SET regression profile (aarch64 OrbStack moon-dev, 1 shard):

  • Track C — ACL caching: Cache cached_acl_unrestricted: bool per-connection, re-resolved on AUTH/HELLO. Skips RwLock + HashMap SipHash probe on every command.
  • Track A — Inline SET dispatch: Extend try_inline_dispatch to handle plain *3 SET key value from raw RESP bytes, bypassing Frame construction. Handles maxmemory eviction + AOF (raw RESP, zero re-serialisation). Gated by can_inline_writes (requires unrestricted ACL + no MULTI + no tracking). Zero-copy key/value extraction via read_buf.split_to(consumed).freeze() — Arc refcount bump, no malloc.
  • Track B — NEON SIMD for DashTable: AArch64 NEON path for Group::match_h2 and match_empty_or_deleted. SSE2 path unchanged.
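The Track A fast path hinges on recognizing a fixed-shape `*3` array straight from the read buffer. A minimal standalone sketch of that shape check (helper names are hypothetical; the real implementation lives in `try_inline_dispatch` and extracts key/value as zero-copy `Bytes` slices rather than returning borrows):

```rust
// Sketch: recognize a plain `*3\r\n$3\r\nSET\r\n$K\r\n<key>\r\n$V\r\n<value>\r\n`
// frame without building a Frame. Returns (key, value, consumed) or None on
// any mismatch, in which case the caller falls back to the normal parser.
fn parse_inline_set(buf: &[u8]) -> Option<(&[u8], &[u8], usize)> {
    // Fixed prefix: argc *3 and the literal SET bulk string.
    let prefix = b"*3\r\n$3\r\nSET\r\n";
    if buf.len() < prefix.len() || !buf[..prefix.len()].eq_ignore_ascii_case(prefix) {
        return None;
    }
    let mut pos = prefix.len();
    let (key_start, key_end) = parse_bulk(buf, &mut pos)?;
    let (val_start, val_end) = parse_bulk(buf, &mut pos)?;
    Some((&buf[key_start..key_end], &buf[val_start..val_end], pos))
}

// Parse one `$<len>\r\n<payload>\r\n` bulk string, advancing `pos`.
fn parse_bulk(buf: &[u8], pos: &mut usize) -> Option<(usize, usize)> {
    if *pos >= buf.len() || buf[*pos] != b'$' {
        return None;
    }
    *pos += 1;
    let mut n: usize = 0;
    let mut saw_digit = false;
    while *pos < buf.len() && buf[*pos].is_ascii_digit() {
        n = n.checked_mul(10)?.checked_add((buf[*pos] - b'0') as usize)?;
        *pos += 1;
        saw_digit = true;
    }
    if !saw_digit || *pos + 1 >= buf.len() || &buf[*pos..*pos + 2] != b"\r\n" {
        return None;
    }
    *pos += 2;
    let start = *pos;
    let end = start.checked_add(n)?;
    if end.checked_add(2)? > buf.len() || &buf[end..end + 2] != b"\r\n" {
        return None;
    }
    *pos = end + 2;
    Some((start, end))
}
```

Anything with options (`*5` and up), a partial frame, or a malformed length falls through to the regular dispatch path, which is what keeps the fast path semantically safe.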

Measured impact — aarch64 Linux (OrbStack moon-dev, 1 shard, 50 clients)

End-to-end benchmark (requires PR #72 maxclients fix to produce valid numbers; measured with that fix applied locally to both binaries):

| Workload | main + fix | PR #71 + fix | Δ |
| --- | --- | --- | --- |
| SET p=16 (500k reqs × 3 runs) | 2.343M rps avg | 3.150M rps avg | +34.4% avg (peak +48%) |
| SET p=1 (100k reqs × 3 runs) | 247K rps avg | 235K rps avg | -4.9% avg (-2.5% median, within noise) |
| GET p=16 (500k reqs × 3 runs) | 3.254M rps avg | 3.385M rps avg | +4.0% avg |

Track B microbench (cargo bench --bench simd_probe, aarch64):

| Operation | NEON | Scalar | Speedup |
| --- | --- | --- | --- |
| match_h2 miss/hit | ~886 ps | ~1.23 ns | 1.39× |
| match_empty_or_deleted | 686 ps | 5.27 ns | 7.68× |

Honest caveats

  1. SET p=1 small regression (~2.5% median). The inline-dispatch path adds a few branches (argc == b'3', can_inline_writes check) that cost more than they save at pipeline depth 1. Zero-copy fix reduced this from ~7% to ~2.5% — essentially noise given run variance (7-14%). At p=16 these overheads amortise over 16 commands and the inline SET savings dominate (+34% win).
  2. GET p=16 net +4% — the NEON SIMD probing helps even on reads, which was unexpected. Original projection was ±0%.
  3. Requires PR #72 (fix(metrics): connection counter double-decrement on monoio path) to merge first. The maxclients counter double-decrement makes redis-benchmark unusable on monoio; both the baseline and this PR report SET: rps=0.0 without the fix.

Files changed

| File | Change |
| --- | --- |
| src/acl/table.rs | is_user_unrestricted() helper |
| src/server/conn/core.rs | cached_acl_unrestricted + refresh_acl_cache() |
| src/server/conn/handler_{monoio,sharded,single}.rs | ACL cache init/refresh, inline SET wiring |
| src/server/conn/blocking.rs | Expand inline dispatch: GET + plain SET with zero-copy extraction |
| src/server/conn/tests.rs | 3 new SET tests |
| src/storage/dashtable/simd.rs | NEON SIMD for match_h2 + match_empty_or_deleted |
| benches/simd_probe.rs | Criterion microbench (NEON vs scalar) |

Unsafe audit

All 4 new unsafe blocks (NEON SIMD) are [SOUND] — 178/178 SAFETY comments pass scripts/audit-unsafe.sh. neon_bitmask_from_cmp is a safe fn with internal unsafe block (no raw pointer params). vld1q_u8 has no alignment requirement unlike SSE2.
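As a portable illustration of what the NEON path computes (the intrinsics themselves only compile on aarch64), both matchers reduce to packing 16 per-lane predicates into a `u16`. This scalar model mirrors the semantics only, not the implementation; the real code does the comparison with `vceqq_u8` and the packing inside `neon_bitmask_from_cmp`:

```rust
// Scalar model of `Group::match_h2`: bit i of the result is set when
// ctrl[i] == h2. NEON computes all 16 lane comparisons in one vceqq_u8.
fn match_h2_scalar(ctrl: &[u8; 16], h2: u8) -> u16 {
    let mut mask = 0u16;
    for (i, &b) in ctrl.iter().enumerate() {
        if b == h2 {
            mask |= 1 << i;
        }
    }
    mask
}

// Scalar model of `Group::match_empty_or_deleted`: EMPTY/DELETED control
// bytes have bit 7 set, so the per-lane test is a sign-bit check.
fn match_empty_or_deleted_scalar(ctrl: &[u8; 16]) -> u16 {
    let mut mask = 0u16;
    for (i, &b) in ctrl.iter().enumerate() {
        if b & 0x80 != 0 {
            mask |= 1 << i;
        }
    }
    mask
}
```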

Test plan

  • cargo fmt --check clean
  • cargo clippy -- -D warnings clean (both feature sets)
  • 11 inline dispatch tests pass (3 new: SET executes / options fallthrough / disabled fallthrough)
  • 9 SIMD tests pass on aarch64
  • 17 ACL table tests pass
  • Criterion microbench proves NEON win (cargo bench --bench simd_probe)
  • redis-benchmark SET -P 16 A/B on Linux aarch64 — +34% avg, peak +48%
  • Zero-copy fix validated — SET p=1 regression reduced from 7% to noise-level 2.5%
  • GET p=16 +4% (bonus)
  • scripts/audit-unsafe.sh 178/178 pass

Depends on

PR #72, fix(metrics): connection counter double-decrement on monoio path. Without that fix, redis-benchmark cannot produce real throughput on aarch64 monoio. Merge #72 first, then validate this PR.

Summary by CodeRabbit

  • Performance

    • Introduced per-connection ACL caching to skip locks for unrestricted users
    • Extended inline command dispatch to handle raw SET operations with improved throughput
    • Added AArch64 NEON SIMD acceleration for DashTable matching operations
  • Tests

    • Expanded test coverage for inline dispatch and ACL caching behavior
    • Added SIMD benchmark suite for performance validation

@coderabbitai Bot commented Apr 11, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: bba3dabe-03d6-472c-8b22-6d31efea8a6d

📥 Commits

Reviewing files that changed from the base of the PR and between 354c373 and 94216bf.

📒 Files selected for processing (13)
  • CHANGELOG.md
  • Cargo.toml
  • UNSAFE_POLICY.md
  • benches/simd_probe.rs
  • src/acl/table.rs
  • src/command/acl.rs
  • src/server/conn/blocking.rs
  • src/server/conn/core.rs
  • src/server/conn/handler_monoio.rs
  • src/server/conn/handler_sharded.rs
  • src/server/conn/handler_single.rs
  • src/server/conn/tests.rs
  • src/storage/dashtable/simd.rs

📝 Walkthrough

Walkthrough

Adds a monotonic AclTable version counter and per-connection ACL cache with a fast-path for unrestricted users; implements an inline raw-RESP SET key value write-path (eviction, shard-local write, AOF append) gated by that cache; adds AArch64/NEON SIMD for dash-table matching and supporting benchmarks/tests.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| ACL Table — src/acl/table.rs | Add version_handle(), version(), replace_with(...), and is_user_unrestricted(...); bump shared Arc<AtomicU64> on mutations; unit tests for version behavior. |
| ACL command — src/command/acl.rs | Use table.replace_with(new_table) in ACL LOAD to preserve the existing AclTable version Arc. |
| Connection state & ACL cache — src/server/conn/core.rs | Add per-connection fields cached_acl_unrestricted, cached_acl_version, acl_version_handle; add refresh_acl_cache, acl_cache_fresh, acl_skip_allowed; initialize the handle in ConnectionState::new. |
| Connection handlers (cache integration & gating) — src/server/conn/handler_monoio.rs, src/server/conn/handler_sharded.rs, src/server/conn/handler_single.rs | Refresh the per-connection ACL cache on connection init and after AUTH/HELLO; compute can_inline_writes = conn.acl_skip_allowed() gating; skip ctx.acl_table.read() + permission probe when the cache is fresh and unrestricted. |
| Inline dispatch (GET + SET) — src/server/conn/blocking.rs, src/server/conn/tests.rs | Extend try_inline_dispatch/try_inline_dispatch_loop signatures with can_inline_writes: bool and runtime_config: &RwLock<RuntimeConfig>; add an inline path for plain RESP *3 SET key value when gated: routing check, try_evict_if_needed, exclusive shard write, append raw consumed RESP to AOF, write +OK\r\n, consume input. Harden RESP parsing bounds. Update tests to cover inlined and fallthrough cases; add a make_rt_config() helper. |
| SIMD / DashTable — src/storage/dashtable/simd.rs | Add AArch64/NEON implementations for Group::match_h2 and Group::match_empty_or_deleted, including neon_bitmask_from_cmp; adjust scalar fallback cfgs to exclude aarch64. |
| Bench & Cargo — benches/simd_probe.rs, Cargo.toml | Add Criterion micro-bench benches/simd_probe.rs comparing SIMD vs scalar; add [[bench]] name = "simd_probe" in Cargo.toml. |
| Unsafe policy & Changelog — UNSAFE_POLICY.md, CHANGELOG.md | Document the approved AArch64 NEON unsafe pattern and add Unreleased notes describing the ACL fast-path, inline SET fast-path, and NEON optimizations. |
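The version-handle mechanics in the ACL rows above can be sketched in isolation as follows (names mirror the walkthrough; the real AclTable carries user data alongside the counter, elided here). The key design point is that replace_with reuses the existing Arc so handles held by live connections keep observing bumps:

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

// Sketch of the monotonic-version pattern: the table owns an Arc'd counter,
// hands out clones to connections, and bumps it on every mutation.
struct AclTableSketch {
    version: Arc<AtomicU64>,
}

impl AclTableSketch {
    fn new() -> Self {
        Self { version: Arc::new(AtomicU64::new(0)) }
    }
    // Connections stash a clone of this at creation time.
    fn version_handle(&self) -> Arc<AtomicU64> {
        Arc::clone(&self.version)
    }
    fn version(&self) -> u64 {
        self.version.load(Ordering::Acquire)
    }
    // Any SETUSER/DELUSER-style mutation bumps the counter.
    fn mutate(&self) {
        self.version.fetch_add(1, Ordering::Release);
    }
    // ACL LOAD: swap table contents but keep the ORIGINAL Arc, then bump
    // so every cached connection re-resolves its flag.
    fn replace_with(&mut self, other: AclTableSketch) {
        drop(other); // discard the incoming table's fresh counter
        self.version.fetch_add(1, Ordering::Release);
    }
}
```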

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant ConnHandler as ConnectionHandler
    participant ACL as AclTable
    participant Shard
    participant AOF
    Client->>ConnHandler: send RESP "*3" SET key value
    ConnHandler->>ACL: maybe refresh_acl_cache(&acl_table)
    alt conn.acl_skip_allowed() && can_inline_writes
        ConnHandler->>Shard: try_evict_if_needed(runtime_config)
        ConnHandler->>Shard: acquire shard-exclusive lock
        ConnHandler->>Shard: write string entry & update metadata
        ConnHandler->>AOF: append raw consumed RESP bytes
        ConnHandler-->>Client: "+OK\r\n" (consume input)
    else not inlined
        ConnHandler->>ACL: ctx.acl_table.read() + check_command_permission(...)
        ACL-->>ConnHandler: allow / deny
        ConnHandler->>Shard: normal (non-inlined) dispatch
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

Suggested labels

enhancement

Suggested reviewers

  • pilotspacex-byte

Poem

🐰 I cached a flag while nibbling code,
I let SET sprint down a shorter road,
NEON twitched its ears at matching bits,
Locks held firm while AOF saved the hits,
Hop — unrestricted hops now go vroom!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and specifically summarizes the three main optimization tracks (ACL caching, inline SET, NEON SIMD) addressing the PR #43 regression. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%. |
| Description check | ✅ Passed | PR description is comprehensive and complete, with all required sections filled out: detailed summary of three optimization tracks, measured performance impact with specific benchmarks, files changed, unsafe audit results, and test plan coverage. |


@qodo-code-review

Review Summary by Qodo

Performance recovery: ACL caching, inline SET, NEON SIMD for aarch64

✨ Enhancement


Walkthroughs

Description
• Cache per-connection ACL unrestricted flag to skip RwLock + HashMap probe on every command (~2.3% CPU)
• Extend inline dispatch to handle plain SET (*3 only) with maxmemory eviction and AOF support (~8% CPU)
• Add AArch64 NEON SIMD path for DashTable probing, replacing the 16-iteration scalar loop (~14% CPU)
• Expected combined impact: +10–15% SET throughput at p=16 on aarch64
Diagram
flowchart LR
  A["Connection Init"] -->|refresh_acl_cache| B["Cached ACL Flag"]
  B -->|cached_acl_unrestricted=true| C["Skip RwLock Probe"]
  C -->|fast path| D["Command Execution"]
  E["Raw RESP Bytes"] -->|try_inline_dispatch| F["GET or Plain SET"]
  F -->|GET| G["Read-only Path"]
  F -->|SET *3| H["Write Path"]
  H -->|eviction check| I["Store Value"]
  H -->|AOF append| J["Persistence"]
  K["DashTable Lookup"] -->|x86_64| L["SSE2 Path"]
  K -->|aarch64| M["NEON SIMD Path"]
  M -->|vceqq_u8| N["16-way Comparison"]
  N -->|bitmask extract| O["Fast Probe Result"]


File Changes

1. src/acl/table.rs ✨ Enhancement +9/-0

Add ACL unrestricted user check helper

• Add is_user_unrestricted() helper method to check if a user has no ACL restrictions
• Used by connection handler to cache the unrestricted flag per-connection
• Avoids repeated RwLock + HashMap probe on every command for default/unrestricted users



2. src/server/conn/core.rs ✨ Enhancement +18/-0

Add per-connection ACL unrestricted cache

• Add cached_acl_unrestricted: bool field to ConnectionState struct
• Add refresh_acl_cache() method to resolve and cache the unrestricted flag from AclTable
• Initialize cache to false on connection creation
• Re-resolve cache on AUTH/HELLO commands



3. src/server/conn/handler_monoio.rs ✨ Enhancement +16/-2

Wire ACL cache and inline SET dispatch

• Call refresh_acl_cache() on connection initialization
• Call refresh_acl_cache() after successful AUTH and HELLO commands
• Add can_inline_writes flag gated by `cached_acl_unrestricted && !in_multi && !tracking_state.enabled`
• Pass can_inline_writes and runtime_config to inline dispatch functions
• Wrap ACL permission check with fast-path guard: skip RwLock when cached_acl_unrestricted=true



4. src/server/conn/handler_sharded.rs ✨ Enhancement +13/-3

Add ACL cache refresh and fast-path check

• Call refresh_acl_cache() on connection initialization
• Call refresh_acl_cache() after successful AUTH and HELLO commands
• Wrap ACL permission check with fast-path guard: skip RwLock when cached_acl_unrestricted=true



5. src/server/conn/handler_single.rs ✨ Enhancement +28/-22

Add ACL cache refresh and fast-path check

• Call refresh_acl_cache() on connection initialization
• Call refresh_acl_cache() after successful AUTH and HELLO commands
• Wrap ACL permission check and key pattern check with fast-path guard: skip RwLock when cached_acl_unrestricted=true



6. src/server/conn/blocking.rs ✨ Enhancement +137/-84

Extend inline dispatch to plain SET with eviction and AOF

• Expand try_inline_dispatch() to handle plain *3 SET key value (no options) in addition to GET
• Add can_inline_writes parameter to gate SET inlining (requires unrestricted ACL, no MULTI, no tracking)
• Implement SET path with maxmemory eviction check via try_evict_if_needed()
• Append raw RESP bytes to AOF (zero re-serialization)
• Return +OK\r\n response for successful SET
• Reject SET with options (*5+) or when can_inline_writes=false
• Update all call sites to pass new parameters
• Comprehensive documentation of side-effects handled vs. skipped



7. src/server/conn/tests.rs 🧪 Tests +190/-18

Add SET inlining tests and update call sites

• Add make_rt_config() helper to create default RuntimeConfig for tests
• Add test_inline_set_executes_when_writes_enabled() to verify plain SET inlining works
• Add test_inline_set_with_options_falls_through() to verify SET with options is rejected
• Rename test_inline_set_falls_through() to test_inline_set_falls_through_when_writes_disabled()
• Rename test_inline_set_with_aof_falls_through() to test_inline_set_with_aof_falls_through_when_writes_disabled()
• Update all existing test call sites to pass can_inline_writes and runtime_config parameters



8. src/storage/dashtable/simd.rs ✨ Enhancement +71/-3

Add AArch64 NEON SIMD paths for DashTable probing

• Add AArch64 NEON path for match_h2() using vceqq_u8 for 16-way parallel byte comparison
• Add neon_bitmask_from_cmp() safe helper function with internal unsafe block to extract a 16-bit mask from the NEON comparison result
• Use power-of-2 weight + horizontal add pattern for efficient bitmask extraction on NEON
• Add AArch64 NEON path for match_empty_or_deleted() using bit 7 shift and comparison
• Update scalar fallback condition from not(x86_64) to not(any(x86_64, aarch64))
• Add explicit SAFETY comments documenting pointer validity, alignment, and platform guarantees





@qodo-code-review

qodo-code-review Bot commented Apr 11, 2026

Code Review by Qodo

🐞 Bugs (3)   📘 Rule violations (3)   📎 Requirement gaps (0)   🎨 UX Issues (0)
🐞 by category: Correctness (1) · Reliability (1) · Security (1)
📘 by category: Reliability (1) · Maintainability (2)



Action required

1. NEON unsafe not policy-approved 📘
Description
New AArch64 NEON unsafe blocks were added, but UNSAFE_POLICY.md does not list NEON/SIMD
intrinsics (without is_x86_feature_detected!) as an approved pattern. This violates the unsafe
policy mapping requirement and increases the risk of unreviewed unsafe patterns spreading.
Code

src/storage/dashtable/simd.rs[R60-72]

+    pub fn match_h2(&self, h2: u8) -> BitMask {
+        // SAFETY: `self.0.as_ptr()` is valid for 16 bytes (Group is [u8; 16]).
+        // `vdupq_n_u8` and all downstream NEON intrinsics are safe to invoke
+        // on AArch64 (NEON is mandatory in ARMv8-A); they require `unsafe` in
+        // Rust only because `core::arch` intrinsics are unconditionally unsafe.
+        // `vld1q_u8` has no alignment requirement (unlike SSE2 _mm_load_si128).
+        unsafe {
+            use core::arch::aarch64::*;
+            let ctrl = vld1q_u8(self.0.as_ptr());
+            let needle = vdupq_n_u8(h2);
+            let cmp = vceqq_u8(ctrl, needle);
+            Self::neon_bitmask_from_cmp(cmp)
+        }
Evidence
src/storage/dashtable/simd.rs introduces new unsafe NEON intrinsic usage for AArch64, while
UNSAFE_POLICY.md only pre-approves is_x86_feature_detected!-gated SIMD intrinsics and does not
mention AArch64 NEON as an approved pattern; therefore the new unsafe usage cannot be mapped to a
documented approved pattern as required.

Rule 297369: Enforce unsafe code usage against UNSAFE_POLICY.md
src/storage/dashtable/simd.rs[60-72]
UNSAFE_POLICY.md[73-86]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
New AArch64 NEON `unsafe` blocks were introduced, but the repo’s `UNSAFE_POLICY.md` “Approved patterns” section does not include NEON SIMD intrinsics as an approved pattern (it currently mentions `is_x86_feature_detected!`-gated SIMD only).
## Issue Context
To comply with the unsafe policy gate, new unsafe usage must be explicitly covered by an approved pattern (or the policy must be updated) in addition to having local `// SAFETY:` comments.
## Fix Focus Areas
- UNSAFE_POLICY.md[73-86]
- src/storage/dashtable/simd.rs[60-96]



2. refresh_acl_cache() unwrap comment misplaced 📘
Description
The new .unwrap() in refresh_acl_cache() is not preceded by the required single-line
justification comment immediately above the #[allow(clippy::unwrap_used)] attribute. This violates
the project’s unwrap-annotation requirement and can allow unreviewed panics to slip into production
code.
Code

src/server/conn/core.rs[R237-242]

+    #[allow(clippy::unwrap_used)] // std RwLock: poison = prior panic = unrecoverable
+    pub fn refresh_acl_cache(&mut self, acl_table: &StdRwLock<crate::acl::AclTable>) {
+        self.cached_acl_unrestricted = acl_table
+            .read()
+            .unwrap()
+            .is_user_unrestricted(&self.current_user);
Evidence
Rule 302083 requires a justification comment on the line directly preceding
#[allow(clippy::unwrap_used)] for each .unwrap(). In refresh_acl_cache(), an unwrap() is
introduced, and the allow attribute exists but the justification is trailing/inline rather than on
the immediately preceding line.

Rule 302083: Annotate safe unwrap calls with allow and justification
src/server/conn/core.rs[237-242]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`refresh_acl_cache()` uses `.unwrap()` but does not follow the required pattern: a one-line `// ...` justification comment immediately above `#[allow(clippy::unwrap_used)]`.
## Issue Context
The code currently has `#[allow(clippy::unwrap_used)]` but the justification is not formatted per the rule.
## Fix Focus Areas
- src/server/conn/core.rs[237-242]



3. Test unwrap lacks allow+comment 📘
Description
A new .unwrap() was added in src/server/conn/tests.rs without a #[allow(clippy::unwrap_used)]
and an immediately-preceding justification comment. This violates the unwrap annotation requirement
(which applies to tests too).
Code

src/server/conn/tests.rs[R133-137]

+    // Verify the key was actually set
+    let guard = dbs.read_db(0, 0);
+    let entry = guard.get_if_alive(b"foo", 0).expect("key should exist");
+    assert_eq!(entry.value.as_bytes().unwrap(), b"bar");
+}
Evidence
Rule 302083 requires each .unwrap() to be covered by a local #[allow(clippy::unwrap_used)] with
a justification comment directly above it. The new test uses .unwrap() on entry.value.as_bytes()
without the required annotation/comment pair.

Rule 302083: Annotate safe unwrap calls with allow and justification
src/server/conn/tests.rs[133-137]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
A new `.unwrap()` was added in `src/server/conn/tests.rs` without the required `#[allow(clippy::unwrap_used)]` and justification comment directly above the allow.
## Issue Context
Even in tests, this repo requires explicit justification for unwrap usage to keep unwrap audits consistent.
## Fix Focus Areas
- src/server/conn/tests.rs[133-137]



4. Inline SET parsing not fuzzed 📘
Description
The inline RESP fast-path was extended to parse and execute plain SET directly from raw bytes, but
no cargo-fuzz target exercises this new parsing/decoding path. This increases the risk of latent
panics/OOB bugs in a byte-level parser on untrusted network input.
Code

src/server/conn/blocking.rs[R1259-1314]

+    // ---- SET path (write, plain *3 only) ----
+    // Parse second bulk-string argument (the value)
+    pos = key_end + 2;
+    if pos >= len || buf[pos] != b'$' {
+        return 0;
+    }
+    pos += 1;
+    let mut val_len: usize = 0;
+    while pos < len && buf[pos] != b'\r' {
+        let d = buf[pos];
+        if d < b'0' || d > b'9' {
+            return 0;
+        }
+        val_len = val_len * 10 + (d - b'0') as usize;
+        pos += 1;
+    }
+    if pos + 1 >= len || buf[pos] != b'\r' || buf[pos + 1] != b'\n' {
+        return 0;
+    }
+    pos += 2;
+    let val_start = pos;
+    let val_end = val_start + val_len;
+    if val_end + 2 > len || buf[val_end] != b'\r' || buf[val_end + 1] != b'\n' {
+        return 0;
+    }
+    let consumed = val_end + 2;
+
+    // Eviction check + write under exclusive lock
+    {
+        let rt = runtime_config.read();
+        let mut guard = shard_databases.write_db(shard_id, selected_db);
+        if crate::storage::eviction::try_evict_if_needed(&mut guard, &rt).is_err() {
+            write_buf
+                .extend_from_slice(b"-OOM command not allowed when used memory > 'maxmemory'\r\n");
     let _ = read_buf.split_to(consumed);
     return 1;
 }
+        drop(rt);
+
+        let key = Bytes::copy_from_slice(&buf[key_start..key_end]);
+        let value = Bytes::copy_from_slice(&buf[val_start..val_end]);
+        let mut entry = crate::storage::entry::Entry::new_string(value);
+        entry.set_last_access(guard.now());
+        entry.set_access_counter(5);
+        guard.set(key, entry);
}
-    drop(guard);
+
+    // AOF: send raw RESP bytes (already in wire format, no re-serialization)
+    if let Some(tx) = aof_tx {
+        let serialized = Bytes::copy_from_slice(&buf[..consumed]);
+        let _ = tx.try_send(crate::persistence::aof::AofMessage::Append(serialized));
+    }
+
+    write_buf.extend_from_slice(b"+OK\r\n");
let _ = read_buf.split_to(consumed);
1
Evidence
try_inline_dispatch now contains additional raw-byte parsing for SET (value bulk-string parsing
and command execution). The existing fuzz targets focus on protocol parsing
(moon::protocol::parse::parse) and telnet-style inline parsing (inline::parse_inline), and none
target try_inline_dispatch’s raw RESP inlining logic, so the new parser extension is not covered
by fuzzing as required.

Rule 302085: Fuzz new parsers and deserializers with cargo-fuzz
src/server/conn/blocking.rs[1259-1314]
fuzz/fuzz_targets/resp_parse.rs[7-34]
fuzz/fuzz_targets/inline_parse.rs[7-14]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`try_inline_dispatch` was extended to parse/execute plain RESP `SET` directly from raw bytes, but there is no fuzz target that calls this code path.
## Issue Context
This is network-facing byte parsing. The existing RESP fuzzing targets `moon::protocol::parse::parse`, which does not exercise the inline-dispatch raw-byte parser in `src/server/conn/blocking.rs`.
## Fix Focus Areas
- src/server/conn/blocking.rs[1130-1314]
- fuzz/fuzz_targets/resp_parse.rs[1-35]
- fuzz/fuzz_targets/inline_parse.rs[1-14]
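A full remediation would add a cargo-fuzz target that feeds arbitrary bytes into try_inline_dispatch. Absent libFuzzer, the no-panic / no-overread property can at least be smoke-checked with a deterministic byte generator; this standalone sketch uses a stand-in parse_bulk_len (hypothetical name, modeled on the value-length loop quoted in the diff above) rather than the project's actual function:

```rust
// Stand-in for the inline value-length parser: parse "$<len>\r\n" at
// offset 0 and return (len, bytes_consumed), or None on any malformation.
fn parse_bulk_len(buf: &[u8]) -> Option<(usize, usize)> {
    if buf.first() != Some(&b'$') {
        return None;
    }
    let mut pos = 1;
    let mut n: usize = 0;
    let mut saw_digit = false;
    while pos < buf.len() && buf[pos].is_ascii_digit() {
        // checked arithmetic rejects absurd lengths instead of wrapping
        n = n.checked_mul(10)?.checked_add((buf[pos] - b'0') as usize)?;
        pos += 1;
        saw_digit = true;
    }
    if !saw_digit || pos + 1 >= buf.len() || buf[pos] != b'\r' || buf[pos + 1] != b'\n' {
        return None;
    }
    Some((n, pos + 2))
}

// Deterministic xorshift-driven byte soup; a real fuzz target would let
// libFuzzer drive the input instead of this loop.
fn fuzz_smoke(iterations: u32) {
    let mut state: u64 = 0x9e3779b97f4a7c15;
    for _ in 0..iterations {
        let len = (state % 32) as usize;
        let mut buf = Vec::with_capacity(len);
        for _ in 0..len {
            state ^= state << 13;
            state ^= state >> 7;
            state ^= state << 17;
            buf.push((state & 0xff) as u8);
        }
        // Property under test: never panics, never claims to consume more
        // bytes than the buffer holds.
        if let Some((_, consumed)) = parse_bulk_len(&buf) {
            assert!(consumed <= buf.len());
        }
    }
}
```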



5. Replica write via inline SET 🐞
Description
The inline SET fast-path can execute a write without any READONLY replica enforcement, because the
replica-role check exists only in the normal frame-dispatch path. This allows writes to succeed on
replicas (state divergence) whenever can_inline_writes is true.
Code

src/server/conn/handler_monoio.rs[R409-416]

 if conn.authenticated {
+            // Inline writes are safe when the user is unrestricted (ACL
+            // already cached), not inside MULTI, and tracking is off.
+            let can_inline_writes =
+                conn.cached_acl_unrestricted && !conn.in_multi && !conn.tracking_state.enabled;
     let inlined = try_inline_dispatch_loop(
         &mut read_buf,
         &mut write_buf,
Evidence
handle_connection_sharded_monoio runs try_inline_dispatch_loop immediately after reading from
the socket and can continue the loop after inlining, so any later checks in the normal frame path
(including READONLY enforcement) are skipped for inlined commands. The inline dispatcher’s SET path
performs the write directly (write_db(...).set(...)) and returns +OK without consulting
replication state.

src/server/conn/handler_monoio.rs[405-440]
src/server/conn/handler_monoio.rs[824-839]
src/server/conn/blocking.rs[1144-1153]
src/server/conn/blocking.rs[1259-1314]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Inline SET executes without READONLY replica enforcement because it runs before/without the normal frame-processing gate that rejects writes when `ReplicationRole::Replica`.
### Issue Context
`handle_connection_sharded_monoio` inlines GET/SET directly from raw RESP bytes and may `continue` the loop after writing responses, bypassing later enforcement.
### Fix Focus Areas
- src/server/conn/handler_monoio.rs[405-425]
- src/server/conn/handler_monoio.rs[824-839]
- src/server/conn/blocking.rs[1144-1153]
- src/server/conn/blocking.rs[1259-1314]
### What to change
- Ensure inline SET is disabled on replicas, e.g. compute `is_replica` from `ctx.repl_state` and force `can_inline_writes = false` when role is `Replica`.
- (Defense-in-depth) Optionally add an explicit `is_replica` parameter into `try_inline_dispatch(_loop)` and reject SET when `is_replica==true`, so future call sites cannot accidentally re-enable it.



6. Inline SET skips spill eviction 🐞
Description
Inline SET always evicts via try_evict_if_needed (no spill) even when the normal write path uses
async spill eviction for tiered storage. Under memory pressure, this can delete evicted keys instead
of offloading them, breaking the cold-storage GET fallback semantics.
Code

src/server/conn/blocking.rs[R1286-1303]

+    // Eviction check + write under exclusive lock
+    {
+        let rt = runtime_config.read();
+        let mut guard = shard_databases.write_db(shard_id, selected_db);
+        if crate::storage::eviction::try_evict_if_needed(&mut guard, &rt).is_err() {
+            write_buf
+                .extend_from_slice(b"-OOM command not allowed when used memory > 'maxmemory'\r\n");
     let _ = read_buf.split_to(consumed);
     return 1;
 }
+        drop(rt);
+
+        let key = Bytes::copy_from_slice(&buf[key_start..key_end]);
+        let value = Bytes::copy_from_slice(&buf[val_start..val_end]);
+        let mut entry = crate::storage::entry::Entry::new_string(value);
+        entry.set_last_access(guard.now());
+        entry.set_access_counter(5);
+        guard.set(key, entry);
Evidence
The inline SET path performs eviction using try_evict_if_needed only, while the normal monoio
write path uses try_evict_if_needed_async_spill when ctx.spill_sender is configured (tiered
storage). Since inline GET explicitly falls back to cold reads
(cold_lookup_location/read_cold_entry_at), evicting without spilling can make
previously-accessible evicted keys permanently disappear in tiered-storage configurations.

src/server/conn/blocking.rs[1228-1233]
src/server/conn/blocking.rs[1286-1296]
src/server/conn/handler_monoio.rs[1617-1637]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Inline SET’s eviction path does not support tiered-storage spill eviction and can delete keys instead of offloading them when memory pressure triggers eviction.
### Issue Context
The normal monoio write path uses `try_evict_if_needed_async_spill` when `ctx.spill_sender` is present; inline SET currently only has `runtime_config` and thus cannot spill.
### Fix Focus Areas
- src/server/conn/blocking.rs[1286-1296]
- src/server/conn/handler_monoio.rs[1617-1637]
- src/server/conn/handler_monoio.rs[405-425]
### What to change
Choose one:
1. **Correctness-first:** Disable inline SET when tiered storage spill is enabled (e.g. when `ctx.spill_sender.is_some()`), by folding that into `can_inline_writes`.
2. **Full feature parity:** Plumb `spill_sender`, `disk_offload_dir`, and `spill_file_id` (and `selected_db`) into `try_inline_dispatch(_loop)` and use `try_evict_if_needed_async_spill` in the inline SET path when configured.



7. Stale ACL unrestricted cache 🐞
Description
cached_acl_unrestricted is only refreshed on connection init and AUTH/HELLO, but ACL rules can be
changed at runtime via ACL SETUSER/DELUSER. A connection that cached true will continue
skipping command/key permission checks (and may allow inline SET) even after the user becomes
restricted or disabled.
Code

src/server/conn/core.rs[R234-243]

+    /// Resolve and cache the unrestricted flag from the AclTable.
+    /// Called once on connection init and after AUTH / HELLO.
+    #[inline]
+    #[allow(clippy::unwrap_used)] // std RwLock: poison = prior panic = unrecoverable
+    pub fn refresh_acl_cache(&mut self, acl_table: &StdRwLock<crate::acl::AclTable>) {
+        self.cached_acl_unrestricted = acl_table
+            .read()
+            .unwrap()
+            .is_user_unrestricted(&self.current_user);
+    }
Evidence
The per-connection cache is refreshed only by refresh_acl_cache (documented as init + AUTH/HELLO),
while ACL mutations happen via ACL SETUSER which updates the shared AclTable but does not
invalidate any connection caches. The handlers explicitly skip both check_command_permission and
check_key_permission when cached_acl_unrestricted is true, so runtime ACL tightening won’t take
effect for existing connections.

src/server/conn/core.rs[234-243]
src/command/acl.rs[149-163]
src/server/conn/handler_monoio.rs[1185-1233]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Per-connection `cached_acl_unrestricted` can become stale after runtime ACL changes (ACL SETUSER/DELUSER), causing permission checks to be skipped incorrectly.
### Issue Context
`refresh_acl_cache()` is only called on init and AUTH/HELLO, but `AclTable` is mutated at runtime by ACL commands.
### Fix Focus Areas
- src/server/conn/core.rs[234-243]
- src/server/conn/handler_monoio.rs[1185-1233]
- src/command/acl.rs[149-163]
- src/acl/table.rs[270-297]
### What to change
Implement a cache invalidation mechanism, e.g.:
- Add an `acl_epoch: AtomicU64` (or similar) to `AclTable`, increment on any mutation (`apply_setuser`, `del_user`, loading/rewrite if applicable).
- Store `last_seen_acl_epoch` in `ConnectionState`; before using `cached_acl_unrestricted`, compare epochs and refresh when changed.
- Ensure inline-write gating (`can_inline_writes`) also re-checks the epoch (refreshing the cache when stale) so inline SET is never enabled on the basis of a stale unrestricted status.
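The epoch scheme suggested above can be sketched in simplified, std-only Rust — `AclTable` and `ConnState` here are minimal stand-ins for the project's real types, not its actual definitions:

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

// Hypothetical table: one shared epoch counter plus a stand-in for the
// per-user rule set the real AclTable would hold.
struct AclTable {
    epoch: Arc<AtomicU64>,
    unrestricted: bool,
}

impl AclTable {
    // Any ACL SETUSER / DELUSER / LOAD bumps the epoch.
    fn mutate(&mut self, unrestricted: bool) {
        self.unrestricted = unrestricted;
        self.epoch.fetch_add(1, Ordering::Release);
    }
}

struct ConnState {
    cached_unrestricted: bool,
    last_seen_epoch: u64,
}

impl ConnState {
    /// Lock-free fast path: re-resolve only when the table epoch moved.
    fn acl_skip_allowed(&mut self, table: &AclTable) -> bool {
        let now = table.epoch.load(Ordering::Acquire);
        if now != self.last_seen_epoch {
            self.cached_unrestricted = table.unrestricted; // refresh stale cache
            self.last_seen_epoch = now;
        }
        self.cached_unrestricted
    }
}

fn main() {
    let mut table = AclTable { epoch: Arc::new(AtomicU64::new(0)), unrestricted: true };
    // Sentinel epoch forces a refresh on the first check.
    let mut conn = ConnState { cached_unrestricted: false, last_seen_epoch: u64::MAX };
    assert!(conn.acl_skip_allowed(&table));
    table.mutate(false); // runtime ACL tightening
    assert!(!conn.acl_skip_allowed(&table)); // stale cache is invalidated
    println!("epoch invalidation ok");
}
```

The fast path stays a single atomic load in the common case; the lock is only touched when a refresh is actually needed.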



@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/server/conn/handler_monoio.rs (1)

410-424: ⚠️ Potential issue | 🟠 Major

Inline writes bypass replica mode check.

The can_inline_writes gate checks ACL, MULTI, and tracking state but does not check for replica mode. The READONLY enforcement at lines 824-839 only applies to commands that go through the normal Frame parsing path. An inline SET on a replica would incorrectly succeed.

Consider adding a replica mode check:

🐛 Proposed fix
 // Inline writes are safe when the user is unrestricted (ACL
 // already cached), not inside MULTI, and tracking is off.
-let can_inline_writes =
-    conn.cached_acl_unrestricted && !conn.in_multi && !conn.tracking_state.enabled;
+let is_replica = ctx.repl_state.as_ref().map_or(false, |rs| {
+    rs.try_read().map_or(false, |guard| {
+        matches!(guard.role, crate::replication::state::ReplicationRole::Replica { .. })
+    })
+});
+let can_inline_writes =
+    conn.cached_acl_unrestricted && !conn.in_multi && !conn.tracking_state.enabled && !is_replica;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/server/conn/handler_monoio.rs` around lines 410 - 424, The inline-write
gating (can_inline_writes) currently uses conn.cached_acl_unrestricted,
conn.in_multi and conn.tracking_state.enabled but omits any replica/READONLY
check, so inline writes via try_inline_dispatch_loop can bypass replica
enforcement; update the gate to also require the connection not be a
replica/readonly (e.g., add a check like "!conn.is_replica" or
"!ctx.replica_mode" depending on the project's flag) so can_inline_writes
becomes false for replica/READONLY connections, and ensure
try_inline_dispatch_loop will not allow mutating commands when that replica flag
is set.
🧹 Nitpick comments (1)
src/server/conn/tests.rs (1)

107-164: Add regression coverage for inline SET side effects too.

These tests cover the dispatch gate well, but the optimized path also owns AOF append and maxmemory/eviction behavior. Please add a positive case that asserts the expected AofMessage::Append payload is emitted when inline writes are enabled, plus an OOM/eviction case, so the fast path cannot drift from normal SET semantics unnoticed.

Also applies to: 289-319

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/server/conn/tests.rs` around lines 107 - 164, Add tests that exercise
side effects of the inline SET path: (1) a test that passes a real aof_tx into
try_inline_dispatch and asserts it sends an AofMessage::Append with the expected
payload when a plain "*3 SET" is inlined (use make_dbs(), try_inline_dispatch,
and inspect the receiver side of the channel for AofMessage::Append containing
the serialized SET); (2) an OOM/eviction test that configures rt_config with a
tiny maxmemory, inserts keys until eviction would occur, then invokes
try_inline_dispatch for a SET and asserts eviction semantics match the normal
path (verify the new key exists/old keys were evicted as expected and that any
eviction counters or behaviors are observed). Reference try_inline_dispatch,
AofMessage::Append, make_dbs, and make_rt_config to locate the relevant helpers
and mirror the existing test patterns for assertions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/server/conn/blocking.rs`:
- Around line 1182-1190: The loop that parses key_len can overflow when doing
key_len * 10 + ...; replace the direct multiplication/add with a checked or
saturating operation to prevent wrapping: use key_len =
key_len.checked_mul(10).and_then(|v| v.checked_add((d - b'0') as
usize)).unwrap_or(usize::MAX) (or use saturating_mul/saturating_add) and return
0 if it becomes usize::MAX so the later bounds check fails; update the code
around the variables key_len, buf, pos, len to use this safe arithmetic.
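As a standalone illustration of the saturating-arithmetic fix described above (`parse_len` is a hypothetical helper, not the actual function in `blocking.rs`):

```rust
// Overflow-safe bulk-length digit parser, per the suggested fix.
fn parse_len(digits: &[u8]) -> Option<usize> {
    let mut n: usize = 0;
    for &d in digits {
        if !d.is_ascii_digit() {
            return None; // reject non-digit bytes outright
        }
        // Saturate instead of wrapping: a hostile length like
        // "99999999999999999999999999" pins at usize::MAX and is
        // rejected, rather than wrapping around to a small value
        // that would pass the later bounds check.
        n = n.saturating_mul(10).saturating_add((d - b'0') as usize);
    }
    if n == usize::MAX { None } else { Some(n) }
}

fn main() {
    assert_eq!(parse_len(b"123"), Some(123));
    assert_eq!(parse_len(b"12a"), None);
    assert_eq!(parse_len(b"99999999999999999999999999"), None);
    println!("saturating parse ok");
}
```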

In `@src/server/conn/core.rs`:
- Around line 161-165: The per-connection boolean cached_acl_unrestricted is
being left stale on runtime ACL mutations; change the design to track a global
ACL version in AclTable (or similar) and store a per-connection acl_version
alongside cached_acl_unrestricted on the connection struct, then on every
fast-path check (where cached_acl_unrestricted is consulted, e.g. command/key
permission hot-paths and in AUTH/HELLO handling) compare connection.acl_version
to AclTable.version and recompute/refresh cached_acl_unrestricted when versions
differ; also increment AclTable.version whenever ACL-changing operations run
(ACL SETUSER, ACL DELUSER, ACL LOAD) so all existing connections will detect the
change and invalidate their cached_acl_unrestricted.

---

Outside diff comments:
In `@src/server/conn/handler_monoio.rs`:
- Around line 410-424: The inline-write gating (can_inline_writes) currently
uses conn.cached_acl_unrestricted, conn.in_multi and conn.tracking_state.enabled
but omits any replica/READONLY check, so inline writes via
try_inline_dispatch_loop can bypass replica enforcement; update the gate to also
require the connection not be a replica/readonly (e.g., add a check like
"!conn.is_replica" or "!ctx.replica_mode" depending on the project's flag) so
can_inline_writes becomes false for replica/READONLY connections, and ensure
try_inline_dispatch_loop will not allow mutating commands when that replica flag
is set.

---

Nitpick comments:
In `@src/server/conn/tests.rs`:
- Around line 107-164: Add tests that exercise side effects of the inline SET
path: (1) a test that passes a real aof_tx into try_inline_dispatch and asserts
it sends an AofMessage::Append with the expected payload when a plain "*3 SET"
is inlined (use make_dbs(), try_inline_dispatch, and inspect the receiver side
of the channel for AofMessage::Append containing the serialized SET); (2) an
OOM/eviction test that configures rt_config with a tiny maxmemory, inserts keys
until eviction would occur, then invokes try_inline_dispatch for a SET and
asserts eviction semantics match the normal path (verify the new key exists/old
keys were evicted as expected and that any eviction counters or behaviors are
observed). Reference try_inline_dispatch, AofMessage::Append, make_dbs, and
make_rt_config to locate the relevant helpers and mirror the existing test
patterns for assertions.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: dca5032b-5fa4-468e-a2e2-81e811619f60

📥 Commits

Reviewing files that changed from the base of the PR and between 7f03e5c and f26dc4a.

📒 Files selected for processing (8)
  • src/acl/table.rs
  • src/server/conn/blocking.rs
  • src/server/conn/core.rs
  • src/server/conn/handler_monoio.rs
  • src/server/conn/handler_sharded.rs
  • src/server/conn/handler_single.rs
  • src/server/conn/tests.rs
  • src/storage/dashtable/simd.rs

Comment thread src/server/conn/blocking.rs
Comment thread src/server/conn/core.rs
Comment thread src/storage/dashtable/simd.rs
Comment thread src/server/conn/core.rs Outdated
Comment on lines +237 to +242
#[allow(clippy::unwrap_used)] // std RwLock: poison = prior panic = unrecoverable
pub fn refresh_acl_cache(&mut self, acl_table: &StdRwLock<crate::acl::AclTable>) {
    self.cached_acl_unrestricted = acl_table
        .read()
        .unwrap()
        .is_user_unrestricted(&self.current_user);
Action required

2. refresh_acl_cache() unwrap comment misplaced 📘 Rule violation ⚙ Maintainability

The new .unwrap() in refresh_acl_cache() is not preceded by the required single-line
justification comment immediately above the #[allow(clippy::unwrap_used)] attribute. This violates
the project’s unwrap-annotation requirement and can allow unreviewed panics to slip into production
code.
Agent Prompt
## Issue description
`refresh_acl_cache()` uses `.unwrap()` but does not follow the required pattern: a one-line `// ...` justification comment immediately above `#[allow(clippy::unwrap_used)]`.

## Issue Context
The code currently has `#[allow(clippy::unwrap_used)]` but the justification is not formatted per the rule.

## Fix Focus Areas
- src/server/conn/core.rs[237-242]


Comment thread src/server/conn/tests.rs
Comment on lines +133 to +137
// Verify the key was actually set
let guard = dbs.read_db(0, 0);
let entry = guard.get_if_alive(b"foo", 0).expect("key should exist");
assert_eq!(entry.value.as_bytes().unwrap(), b"bar");
}
Action required

3. Test unwrap lacks allow+comment 📘 Rule violation ⚙ Maintainability

A new .unwrap() was added in src/server/conn/tests.rs without a #[allow(clippy::unwrap_used)]
and an immediately-preceding justification comment. This violates the unwrap annotation requirement
(which applies to tests too).
Agent Prompt
## Issue description
A new `.unwrap()` was added in `src/server/conn/tests.rs` without the required `#[allow(clippy::unwrap_used)]` and justification comment directly above the allow.

## Issue Context
Even in tests, this repo requires explicit justification for unwrap usage to keep unwrap audits consistent.

## Fix Focus Areas
- src/server/conn/tests.rs[133-137]


Comment on lines +1259 to 1314
// ---- SET path (write, plain *3 only) ----
// Parse second bulk-string argument (the value)
pos = key_end + 2;
if pos >= len || buf[pos] != b'$' {
    return 0;
}
pos += 1;
let mut val_len: usize = 0;
while pos < len && buf[pos] != b'\r' {
    let d = buf[pos];
    if d < b'0' || d > b'9' {
        return 0;
    }
    val_len = val_len * 10 + (d - b'0') as usize;
    pos += 1;
}
if pos + 1 >= len || buf[pos] != b'\r' || buf[pos + 1] != b'\n' {
    return 0;
}
pos += 2;
let val_start = pos;
let val_end = val_start + val_len;
if val_end + 2 > len || buf[val_end] != b'\r' || buf[val_end + 1] != b'\n' {
    return 0;
}
let consumed = val_end + 2;

// Eviction check + write under exclusive lock
{
    let rt = runtime_config.read();
    let mut guard = shard_databases.write_db(shard_id, selected_db);
    if crate::storage::eviction::try_evict_if_needed(&mut guard, &rt).is_err() {
        write_buf
            .extend_from_slice(b"-OOM command not allowed when used memory > 'maxmemory'\r\n");
        let _ = read_buf.split_to(consumed);
        return 1;
    }
    drop(rt);

    let key = Bytes::copy_from_slice(&buf[key_start..key_end]);
    let value = Bytes::copy_from_slice(&buf[val_start..val_end]);
    let mut entry = crate::storage::entry::Entry::new_string(value);
    entry.set_last_access(guard.now());
    entry.set_access_counter(5);
    guard.set(key, entry);
    drop(guard);
}

// AOF: send raw RESP bytes (already in wire format, no re-serialization)
if let Some(tx) = aof_tx {
    let serialized = Bytes::copy_from_slice(&buf[..consumed]);
    let _ = tx.try_send(crate::persistence::aof::AofMessage::Append(serialized));
}

write_buf.extend_from_slice(b"+OK\r\n");
let _ = read_buf.split_to(consumed);
1
Action required

4. Inline set parsing not fuzzed 📘 Rule violation ☼ Reliability

The inline RESP fast-path was extended to parse and execute plain SET directly from raw bytes, but
no cargo-fuzz target exercises this new parsing/decoding path. This increases the risk of latent
panics/OOB bugs in a byte-level parser on untrusted network input.
Agent Prompt
## Issue description
`try_inline_dispatch` was extended to parse/execute plain RESP `SET` directly from raw bytes, but there is no fuzz target that calls this code path.

## Issue Context
This is network-facing byte parsing. The existing RESP fuzzing targets `moon::protocol::parse::parse`, which does not exercise the inline-dispatch raw-byte parser in `src/server/conn/blocking.rs`.

## Fix Focus Areas
- src/server/conn/blocking.rs[1130-1314]
- fuzz/fuzz_targets/resp_parse.rs[1-35]
- fuzz/fuzz_targets/inline_parse.rs[1-14]


Comment on lines 409 to 416
if conn.authenticated {
    // Inline writes are safe when the user is unrestricted (ACL
    // already cached), not inside MULTI, and tracking is off.
    let can_inline_writes =
        conn.cached_acl_unrestricted && !conn.in_multi && !conn.tracking_state.enabled;
    let inlined = try_inline_dispatch_loop(
        &mut read_buf,
        &mut write_buf,
Action required

5. Replica write via inline set 🐞 Bug ≡ Correctness

The inline SET fast-path can execute a write without any READONLY replica enforcement, because the
replica-role check exists only in the normal frame-dispatch path. This allows writes to succeed on
replicas (state divergence) whenever can_inline_writes is true.
Agent Prompt
### Issue description
Inline SET executes without READONLY replica enforcement because it runs before/without the normal frame-processing gate that rejects writes when `ReplicationRole::Replica`.

### Issue Context
`handle_connection_sharded_monoio` inlines GET/SET directly from raw RESP bytes and may `continue` the loop after writing responses, bypassing later enforcement.

### Fix Focus Areas
- src/server/conn/handler_monoio.rs[405-425]
- src/server/conn/handler_monoio.rs[824-839]
- src/server/conn/blocking.rs[1144-1153]
- src/server/conn/blocking.rs[1259-1314]

### What to change
- Ensure inline SET is disabled on replicas, e.g. compute `is_replica` from `ctx.repl_state` and force `can_inline_writes = false` when role is `Replica`.
- (Defense-in-depth) Optionally add an explicit `is_replica` parameter into `try_inline_dispatch(_loop)` and reject SET when `is_replica==true`, so future call sites cannot accidentally re-enable it.
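A minimal sketch of the combined gate under these assumptions — the field names (`is_replica`, `spill_enabled`) are illustrative stand-ins for state that really lives on `conn` and `ctx`:

```rust
// Hypothetical consolidation of the inline-write gate with the
// replica and tiered-storage guards the reviews ask for.
#[derive(Clone, Copy)]
struct Gate {
    acl_unrestricted: bool,
    in_multi: bool,
    tracking_enabled: bool,
    is_replica: bool,
    spill_enabled: bool,
}

impl Gate {
    fn can_inline_writes(&self) -> bool {
        self.acl_unrestricted
            && !self.in_multi
            && !self.tracking_enabled
            && !self.is_replica // parity with READONLY enforcement on the frame path
            && !self.spill_enabled // tiered storage must route through spill eviction
    }
}

fn main() {
    let base = Gate {
        acl_unrestricted: true,
        in_multi: false,
        tracking_enabled: false,
        is_replica: false,
        spill_enabled: false,
    };
    assert!(base.can_inline_writes());
    // Either missing guard alone must disable the fast path.
    assert!(!Gate { is_replica: true, ..base }.can_inline_writes());
    assert!(!Gate { spill_enabled: true, ..base }.can_inline_writes());
    println!("gate ok");
}
```

Keeping every disqualifier in one predicate means future call sites cannot accidentally re-enable inline writes with a partial check.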


Comment on lines +1286 to +1303
// Eviction check + write under exclusive lock
{
    let rt = runtime_config.read();
    let mut guard = shard_databases.write_db(shard_id, selected_db);
    if crate::storage::eviction::try_evict_if_needed(&mut guard, &rt).is_err() {
        write_buf
            .extend_from_slice(b"-OOM command not allowed when used memory > 'maxmemory'\r\n");
        let _ = read_buf.split_to(consumed);
        return 1;
    }
    drop(rt);

    let key = Bytes::copy_from_slice(&buf[key_start..key_end]);
    let value = Bytes::copy_from_slice(&buf[val_start..val_end]);
    let mut entry = crate::storage::entry::Entry::new_string(value);
    entry.set_last_access(guard.now());
    entry.set_access_counter(5);
    guard.set(key, entry);
Action required

6. Inline set skips spill eviction 🐞 Bug ☼ Reliability

Inline SET always evicts via try_evict_if_needed (no spill) even when the normal write path uses
async spill eviction for tiered storage. Under memory pressure, this can delete evicted keys instead
of offloading them, breaking the cold-storage GET fallback semantics.
Agent Prompt
### Issue description
Inline SET’s eviction path does not support tiered-storage spill eviction and can delete keys instead of offloading them when memory pressure triggers eviction.

### Issue Context
The normal monoio write path uses `try_evict_if_needed_async_spill` when `ctx.spill_sender` is present; inline SET currently only has `runtime_config` and thus cannot spill.

### Fix Focus Areas
- src/server/conn/blocking.rs[1286-1296]
- src/server/conn/handler_monoio.rs[1617-1637]
- src/server/conn/handler_monoio.rs[405-425]

### What to change
Choose one:
1. **Correctness-first:** Disable inline SET when tiered storage spill is enabled (e.g. when `ctx.spill_sender.is_some()`), by folding that into `can_inline_writes`.
2. **Full feature parity:** Plumb `spill_sender`, `disk_offload_dir`, and `spill_file_id` (and `selected_db`) into `try_inline_dispatch(_loop)` and use `try_evict_if_needed_async_spill` in the inline SET path when configured.


Comment thread src/server/conn/core.rs
Comment on lines +234 to +243
/// Resolve and cache the unrestricted flag from the AclTable.
/// Called once on connection init and after AUTH / HELLO.
#[inline]
#[allow(clippy::unwrap_used)] // std RwLock: poison = prior panic = unrecoverable
pub fn refresh_acl_cache(&mut self, acl_table: &StdRwLock<crate::acl::AclTable>) {
    self.cached_acl_unrestricted = acl_table
        .read()
        .unwrap()
        .is_user_unrestricted(&self.current_user);
}
Action required

7. Stale acl unrestricted cache 🐞 Bug ⛨ Security

cached_acl_unrestricted is only refreshed on connection init and AUTH/HELLO, but ACL rules can be
changed at runtime via ACL SETUSER/DELUSER. A connection that cached true will continue
skipping command/key permission checks (and may allow inline SET) even after the user becomes
restricted or disabled.
Agent Prompt
### Issue description
Per-connection `cached_acl_unrestricted` can become stale after runtime ACL changes (ACL SETUSER/DELUSER), causing permission checks to be skipped incorrectly.

### Issue Context
`refresh_acl_cache()` is only called on init and AUTH/HELLO, but `AclTable` is mutated at runtime by ACL commands.

### Fix Focus Areas
- src/server/conn/core.rs[234-243]
- src/server/conn/handler_monoio.rs[1185-1233]
- src/command/acl.rs[149-163]
- src/acl/table.rs[270-297]

### What to change
Implement a cache invalidation mechanism, e.g.:
- Add an `acl_epoch: AtomicU64` (or similar) to `AclTable`, increment on any mutation (`apply_setuser`, `del_user`, loading/rewrite if applicable).
- Store `last_seen_acl_epoch` in `ConnectionState`; before using `cached_acl_unrestricted`, compare epochs and refresh when changed.
- Ensure inline-write gating (`can_inline_writes`) also re-checks the epoch (refreshing the cache when stale) so inline SET is never enabled on the basis of a stale unrestricted status.


@TinDang97 TinDang97 force-pushed the perf/pr43-recovery branch from a0bae1d to 2990401 Compare April 12, 2026 05:46

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/server/conn/handler_monoio.rs (1)

457-472: ⚠️ Potential issue | 🔴 Critical

Don't let inline SET bypass replica or cluster guards.

This runs before the later READONLY and cluster-routing checks, but can_inline_writes only gates on ACL/MULTI/TRACKING. A plain SET can therefore be applied locally on a replica or on a node that should return a redirection error instead of mutating state. Either disable inline writes while replication/cluster routing is active, or enforce those same guards inside the inline path first.

🛠️ Safe fallback gate until the inline path enforces the missing semantics
-            let can_inline_writes =
-                conn.cached_acl_unrestricted && !conn.in_multi && !conn.tracking_state.enabled;
+            let replica_readonly = ctx
+                .repl_state
+                .as_ref()
+                .and_then(|rs| rs.try_read().ok())
+                .is_some_and(|guard| {
+                    matches!(
+                        guard.role,
+                        crate::replication::state::ReplicationRole::Replica { .. }
+                    )
+                });
+            let can_inline_writes = conn.cached_acl_unrestricted
+                && !conn.in_multi
+                && !conn.tracking_state.enabled
+                && !replica_readonly
+                && !crate::cluster::cluster_enabled();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/server/conn/handler_monoio.rs` around lines 457 - 472, The inline write
path currently allowed by can_inline_writes can bypass replica and
cluster-routing guards; update the gate so inline writes are disabled when
replication or cluster routing might affect command semantics (e.g., when the
connection is READONLY, the server is a replica, or cluster redirection is
required) OR add the same replica/cluster checks at the start of
try_inline_dispatch_loop so it refuses to apply mutating commands like SET;
specifically modify the can_inline_writes computation (or add a new guard passed
into try_inline_dispatch_loop) and/or add a pre-check inside
try_inline_dispatch_loop to return a fallback (no inline) whenever conn or ctx
indicate replica/read-only/cluster-routing state rather than mutating local
state.
♻️ Duplicate comments (1)
src/server/conn/blocking.rs (1)

1267-1273: ⚠️ Potential issue | 🟡 Minor

Minor: Integer overflow in val_len parsing (same pattern as key_len).

The digit-by-digit multiplication at line 1272 (val_len * 10 + ...) has the same potential overflow issue flagged for key_len in a prior review. While the 8KB buffer limit makes exploitation unlikely, applying the same saturating arithmetic fix would be more defensive and consistent.

🛡️ Optional defensive fix
-val_len = val_len * 10 + (d - b'0') as usize;
+val_len = val_len.saturating_mul(10).saturating_add((d - b'0') as usize);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/server/conn/blocking.rs` around lines 1267 - 1273, The val_len parsing
loop can overflow when computing val_len = val_len * 10 + (d - b'0') as usize;
change the arithmetic to use saturating operations to match the fix used for
key_len: replace the multiplication/add with val_len =
val_len.saturating_mul(10).saturating_add((d - b'0') as usize) (or equivalent
checked-to-saturate logic) inside the loop that updates val_len (the
variables/loop using pos, len, buf, and val_len).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/server/conn/handler_monoio.rs`:
- Line 106: Cached ACL "unrestricted" flags (cached_acl_unrestricted) are stale
because they are only recomputed at connect/AUTH/HELLO and then used as a
permanent fast-path, so ACL SETUSER/DELUSER mutations let existing connections
retain elevated permissions; fix by invalidating or versioning the cache: either
(A) make conn.refresh_acl_cache(&ctx.acl_table) be invoked (or triggered)
whenever the global acl_table is mutated (ACL SETUSER/DELUSER) so existing
connections clear/recompute cached_acl_unrestricted, or (B) add an epoch/version
on acl_table and store that epoch on the connection; before taking the fast path
that checks cached_acl_unrestricted, compare epochs and recompute via
refresh_acl_cache if mismatched. Update usages around cached_acl_unrestricted,
conn.refresh_acl_cache, and the ACL mutation code paths to implement one of
these solutions so the fast-path cannot bypass checks after ACL changes.

In `@src/server/conn/handler_sharded.rs`:
- Around line 224-225: The ACL fast-path is using
conn.refresh_acl_cache(&ctx.acl_table) only at connect/AUTH/HELLO and then later
bypasses checks (e.g., in handler_sharded command handling where command/key ACL
checks are skipped); fix by invalidating or updating the connection ACL cache
whenever ACLs are mutated and by adding a version check before any bypass: add a
version or epoch on ctx.acl_table that is incremented by ACL mutation paths,
update or clear conn.acl_cache when mutations occur, and in the fast-path (the
code that currently skips checks) compare conn.acl_cache.version with
ctx.acl_table.version and call conn.refresh_acl_cache(&ctx.acl_table) if they
differ so post-auth ACL changes are respected (touch functions/methods:
conn.refresh_acl_cache, ctx.acl_table mutation handlers, and the handler_sharded
fast-path where checks are skipped).

---

Outside diff comments:
In `@src/server/conn/handler_monoio.rs`:
- Around line 457-472: The inline write path currently allowed by
can_inline_writes can bypass replica and cluster-routing guards; update the gate
so inline writes are disabled when replication or cluster routing might affect
command semantics (e.g., when the connection is READONLY, the server is a
replica, or cluster redirection is required) OR add the same replica/cluster
checks at the start of try_inline_dispatch_loop so it refuses to apply mutating
commands like SET; specifically modify the can_inline_writes computation (or add
a new guard passed into try_inline_dispatch_loop) and/or add a pre-check inside
try_inline_dispatch_loop to return a fallback (no inline) whenever conn or ctx
indicate replica/read-only/cluster-routing state rather than mutating local
state.

---

Duplicate comments:
In `@src/server/conn/blocking.rs`:
- Around line 1267-1273: The val_len parsing loop can overflow when computing
val_len = val_len * 10 + (d - b'0') as usize; change the arithmetic to use
saturating operations to match the fix used for key_len: replace the
multiplication/add with val_len = val_len.saturating_mul(10).saturating_add((d -
b'0') as usize) (or equivalent checked-to-saturate logic) inside the loop that
updates val_len (the variables/loop using pos, len, buf, and val_len).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3a9502cb-eb82-46ad-84da-16512ada2e7a

📥 Commits

Reviewing files that changed from the base of the PR and between f26dc4a and 2990401.

📒 Files selected for processing (9)
  • CHANGELOG.md
  • src/acl/table.rs
  • src/server/conn/blocking.rs
  • src/server/conn/core.rs
  • src/server/conn/handler_monoio.rs
  • src/server/conn/handler_sharded.rs
  • src/server/conn/handler_single.rs
  • src/server/conn/tests.rs
  • src/storage/dashtable/simd.rs
✅ Files skipped from review due to trivial changes (1)
  • CHANGELOG.md
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/acl/table.rs
  • src/server/conn/core.rs

Comment thread src/server/conn/handler_monoio.rs
Comment thread src/server/conn/handler_sharded.rs

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@benches/simd_probe.rs`:
- Around line 71-75: The unsafe callsites invoking match_h2 on &miss (the lines
where you do let r = unsafe { black_box(&miss).match_h2(black_box(0x42)) } and
similar) are missing required // SAFETY: comments; either annotate each unsafe
block with a concise // SAFETY: description that documents the invariants (e.g.,
SSE2/NEON availability, validity/lifetime of &miss, and that black_box use is
safe), or refactor the pattern into a small helper like unsafe_match_h2(miss:
&Type, v: u8) -> Ret that contains a single unsafe block with a proper //
SAFETY: comment and call that helper from the bench to eliminate duplication and
ensure consistent documentation.
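The helper-with-SAFETY-comment pattern the prompt asks for can be illustrated with a trivial unsafe call — `first_byte` is made up for this sketch; the bench's real wrapper would hold the NEON/SSE2 intrinsic instead:

```rust
// One unsafe block, one documented invariant, one reusable helper:
// callers never write `unsafe` themselves.
#[inline]
fn first_byte(bytes: &[u8]) -> Option<u8> {
    if bytes.is_empty() {
        return None;
    }
    // SAFETY: the emptiness check above guarantees index 0 is in
    // bounds, so `get_unchecked(0)` cannot read past the slice.
    Some(unsafe { *bytes.get_unchecked(0) })
}

fn main() {
    assert_eq!(first_byte(b"neon"), Some(b'n'));
    assert_eq!(first_byte(b""), None);
    println!("safety helper ok");
}
```

Centralising the unsafe block this way keeps the justification next to the invariant it documents and avoids repeating unannotated `unsafe { ... }` at every bench callsite.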

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d383aa04-b1ca-4207-9864-f184a576e6a1

📥 Commits

Reviewing files that changed from the base of the PR and between 2990401 and dc61511.

📒 Files selected for processing (2)
  • Cargo.toml
  • benches/simd_probe.rs
✅ Files skipped from review due to trivial changes (1)
  • Cargo.toml

Comment thread benches/simd_probe.rs
TinDang97 added a commit that referenced this pull request Apr 12, 2026
handle_connection_sharded_monoio called record_connection_closed() at
its exit, while conn_accept.rs (the caller) ALSO calls it in the non-
migration branch (line 627). The AtomicU64 counter wrapped from 0 to
u64::MAX on the second fetch_sub, causing every subsequent
try_accept_connection to reject against maxclients.

Symptom: first connection succeeds, all subsequent connections rejected
with "maxclients reached" — even though CONFIG GET maxclients returns
10000 and the original value was never reached.

Repro (before fix):
  ./target/release/moon --port 6400 --shards 1 --appendonly no &
  redis-cli -p 6400 SET foo bar   # ✓ OK
  redis-cli -p 6400 SET foo bar   # ✗ Connection reset by peer

Fix: remove the handler-level decrement. The comment at line 84 already
documents that the caller owns the increment via try_accept_connection;
by symmetry the caller owns the decrement (conn_accept.rs:547 for TLS,
conn_accept.rs:627 for plain TCP non-migrated path).

Migration path counter accounting is a separate concern (already
imbalanced) and is not addressed here.

Verified on aarch64 Linux (OrbStack moon-dev):
  - 10 sequential SETs all succeed
  - INFO clients reports connected_clients:1 (just the probe)
  - redis-benchmark SET p=16 c=50 n=10000 → 1.25M req/s (real number)

Blocks: PR #71 perf recovery — cannot measure real throughput without
this fix. Once merged, PR #71 can be validated with bench-compare.sh.
TinDang97 added a commit that referenced this pull request Apr 12, 2026
The initial inline SET used Bytes::copy_from_slice for key and value,
which triggers MALLOC+memcpy twice per SET. The Frame-based path
achieves zero-copy via Bytes::slice on a frozen BytesMut, which is
just an Arc refcount bump. This disparity caused a ~4-7% SET p=1
regression because the inline-path savings were outweighed by the
new allocations at low pipeline depth.

Fix: call read_buf.split_to(consumed).freeze() once, then slice() the
frozen Bytes for key, value, and AOF. All three are now Arc refcount
bumps over the same underlying allocation — zero malloc, zero memcpy.

Measured impact on aarch64 Linux (OrbStack, 1 shard, 50 clients):

                before nocopy       after nocopy       delta
  SET p=16:     2.94M rps           3.11M rps          +5.4%  (peak 3.60M)
  SET p=1:      237K rps            235K rps           -1%    (within noise)
  GET p=16:     3.32M rps           3.33M rps          +0.3%

PR #71 totals vs origin/main (both with PR #72 maxclients fix):
  SET p=16: 2.43M → 3.11M  =  +28% (peak +48%)
  SET p=1:  241K  → 235K   =  -2.5% (was -4%, noise-level now)
  GET p=16: 3.36M → 3.33M  =  ±0%

All 11 inline-dispatch tests pass unchanged.
TinDang97 added a commit that referenced this pull request Apr 12, 2026
Addresses all actionable findings from qodo-code-review and coderabbit
on PR #71 before merge.

CRITICAL correctness fixes (would silently regress prod):
  - ACL cache epoch invalidation:  Stale cache after ACL SETUSER/
    DELUSER/LOAD was letting in-flight connections retain bypass
    privileges after permissions were revoked. Added AtomicU64
    version counter on AclTable, shared via Arc<AtomicU64> handle
    to per-connection state. Hot path checks `acl_skip_allowed()`
    (fresh && unrestricted) — lock-free. ACL LOAD now uses
    `replace_with()` to preserve the counter identity. Two new
    unit tests cover version bumping and replace_with identity.
  - Inline SET guards:  Added `!is_replica` (prevents writes on
    replica, matching the READONLY enforcement on the normal path)
    and `ctx.spill_sender.is_none()` (tiered-storage users still
    route through the async-spill eviction path).

Policy/reliability:
  - UNSAFE_POLICY.md: add AArch64 NEON intrinsics to approved
    patterns (NEON is mandatory in ARMv8-A, no runtime detection
    needed).
  - saturating_mul / checked_add defenses against malicious digit
    runs in the inline SET/GET length parse.
  - SAFETY comments added to every unsafe block in benches/simd_probe.rs
    (audit-unsafe.sh now reports 179/179 with coverage).
  - Explicit #[allow] + justification on the `.expect` / `.unwrap`
    in the new inline-SET success test.

Test suite: 11 inline-dispatch + 19 ACL (17 + 2 new) + 9 SIMD all
pass on aarch64.  cargo fmt, cargo clippy -- -D warnings, and
scripts/audit-unsafe.sh all clean.
@pilotspacex-byte
Contributor Author

Review feedback addressed

🔴 Critical correctness fixes (would silently regress prod)

coderabbitai — Stale ACL cache on runtime mutation: ✅ Fixed in 3481e49.
Added AtomicU64 version counter on AclTable, shared via Arc<AtomicU64> handle to per-connection state. Hot path now uses conn.acl_skip_allowed() = cached_acl_unrestricted && acl_cache_fresh() — the latter is a lock-free version compare. ACL LOAD preserves the counter identity via new replace_with() method. Two new unit tests cover version bumping + replace identity.

qodo-code-review — Replica writes via inline SET: ✅ Fixed in 3481e49.
can_inline_writes now additionally requires !is_replica (non-blocking try_read on repl_state). Falls safe on lock-contention.

qodo-code-review — Inline SET skips spill eviction: ✅ Fixed in 3481e49.
can_inline_writes now requires ctx.spill_sender.is_none(). Tiered-storage deployments route through the async-spill path unchanged.

🟡 Policy / reliability

qodo — NEON unsafe not policy-approved: ✅ Added to UNSAFE_POLICY.md Approved patterns section (NEON is mandatory in ARMv8-A, no runtime detection needed).

qodo — unwrap justification comments: ✅ refresh_acl_cache uses #[allow(clippy::unwrap_used)] with an inline justification above the let guard = ... line. Test .unwrap/.expect annotated with explicit #[allow] + comment.

coderabbitai — Integer overflow in length parse: ✅ Fixed in 3481e49 + e637471. saturating_mul + checked_add defend against malicious digit runs. Parser returns 0 (falls through) on any overflow.

coderabbitai — SAFETY comments missing on bench: ✅ All 4 unsafe blocks in benches/simd_probe.rs now have SAFETY comments. scripts/audit-unsafe.sh reports 179/179.

Not addressed in this PR (follow-up)

qodo — Inline SET parsing not fuzzed: Deferred. The inline parser shares no code with the Frame parser (already fuzzed). Adding a fuzz/fuzz_targets/inline_dispatch.rs target is worthwhile but orthogonal to this PR's correctness claims.

Depends on

#72 (maxclients counter double-decrement) — required to produce valid throughput benchmarks on monoio aarch64.

Three independent optimizations targeting the remaining hot spots from
the PR #43 pipelined-SET regression profile (aarch64 OrbStack moon-dev):

Track C — Cache ACL unrestricted flag on ConnectionState (~2.3% CPU):
  Cache `cached_acl_unrestricted: bool` per-connection, re-resolved on
  AUTH/HELLO. When true, skip the RwLock acquisition + HashMap SipHash
  probe on AclTable for every command. All 3 handlers updated.

Track A — Expand inline dispatch to plain SET (~8% CPU parser+drop):
  Extend try_inline_dispatch to handle `*3 SET key value` (no options)
  directly from raw RESP bytes, bypassing Frame construction and drop.
  Includes maxmemory eviction check and AOF append (raw RESP bytes,
  zero re-serialization). Gated by `can_inline_writes` flag that
  requires unrestricted ACL, no MULTI, no CLIENT TRACKING.

Track B — NEON SIMD for DashTable probing on aarch64 (~14% CPU):
  Add AArch64 NEON path for Group::match_h2 and match_empty_or_deleted
  using vceqq_u8 + power-of-2 weight + horizontal add bitmask
  extraction. Replaces scalar 16-iteration loop with ~4-instruction
  SIMD sequence. SSE2 (x86_64) path unchanged.
- Make neon_bitmask_from_cmp a safe fn (no raw pointer params, only
  NEON vector types) with internal unsafe block — reduces unsafe API
  surface from 2 unsafe fns to 1
- Add explicit SAFETY comments to every unsafe block explaining:
  ptr validity (self.0.as_ptr() valid for 16 bytes),
  alignment (vld1q_u8 has no alignment requirement unlike SSE2),
  platform guarantee (NEON mandatory on AArch64/ARMv8-A)
- Document inline SET side-effects exhaustively: what is handled
  (eviction, AOF), what is skipped with guards (ACL, tracking, MULTI),
  and what is not applicable (blocking wakeup, vector auto-index)
The CI audit-unsafe.sh checks 3 preceding lines for `// SAFETY:`.
Multi-line comments pushed the marker beyond that window. Condense
to 2-line comments immediately above each `unsafe {` block.
Decouples Track B's perf claim from the network benchmark infrastructure
(which is currently blocked by a pre-existing maxclients counter leak).

Measured on aarch64 (Apple Silicon / OrbStack Linux VM):

| Operation                    | NEON    | Scalar   | Speedup |
|------------------------------|---------|----------|---------|
| match_h2 miss                | 883 ps  | 1.232 ns | 1.39x   |
| match_h2 hit_one             | 886 ps  | 1.231 ns | 1.39x   |
| match_h2 hit_many            | 883 ps  | 1.228 ns | 1.39x   |
| match_empty_or_deleted       | 686 ps  | 5.266 ns | 7.68x   |

The scalar match_h2 is already autovectorised by LLVM, so the manual NEON
path only wins ~39%. match_empty_or_deleted scalar has a non-vectorisable
branch (`byte & 0x80 != 0`), so the NEON path wins 7.68x.

Run:  cargo bench --bench simd_probe
Use checked_add on both key_end+2 and val_end+2. In debug builds,
Rust panics on unsigned overflow; in release, the previous end+2 would
silently wrap and slip past the `> len` bounds check, leading to a
panic on the subsequent out-of-bounds slice index. The parser now
cleanly returns 0 (falling through to the Frame path) on any overflow.

Covered path: SET with a multi-gigabyte claimed key or value length
(after saturating_mul). GET path reuses key_end_crlf to avoid
re-deriving the proven-safe value.
@TinDang97 TinDang97 force-pushed the perf/pr43-recovery branch from e637471 to 94216bf Compare April 12, 2026 07:36
@pilotspacex-byte pilotspacex-byte merged commit 25f78de into main Apr 12, 2026
4 of 5 checks passed