Skip to content

chore: PR #3 review hygiene followups#4

Merged
membphis merged 2 commits into
mainfrom
worktree-review-followup-hygiene
May 15, 2026
Merged

chore: PR #3 review hygiene followups#4
membphis merged 2 commits into
mainfrom
worktree-review-followup-hygiene

Conversation

@membphis
Copy link
Copy Markdown
Collaborator

Summary

Address the local code-review findings from PR #3 (build: Makefile with bench target + AVX2 scanner perf fixes):

  • Important items resolved: tighten the scanner_crosscheck proptest to assert full Result + indices equality (was only comparing indices on Ok); switch the bench RNG from math.random (libc-dependent) to a deterministic Park-Miller LCG; cap the per-iteration upper bound in make_payload so payload sizes land within ~0.1% of their label (was up to +49%); remove a dead branch in the AVX2 tail handler; drop the inaccurate "defensive" wording from validate_brackets's doc comment.
  • Minor items resolved: Makefile help now matches the spec's "target — description" em-dash format and uses a non-greedy :[^#]*## FS pattern; bump 2m/5m/10m bench iters from 10 to 20 for stability; rename the misleading escaped_quotes_do_not_trip_fastpath test to escaped_quotes_remain_correct_with_fastpath.

Items intentionally left out per the prior assessment ("剩下都是 hygiene"): bench payload using 'A' instead of real base64 (doesn't affect scanner measurements), LUA_CPATH single-quote hardening, scan_emit_resume i += 2 minor over-step at EOF (functionally correct), and README # Safety doc visibility — already tracked in Deferred.

Test plan

  • cargo test --release — 106 tests across 9 binaries + 2000-case scalar_avx2_bit_identical proptest, all pass with the tightened assertion
  • cargo test --release --no-default-features — scalar-only build, 96 tests pass
  • make bench — all 9 scenarios' actual byte counts now match their labels within 0.1% (100k→102351, 1m→1048527, 10m→10485711)
  • make help — em-dash output verified
  • CI green on this branch

membphis added 2 commits May 15, 2026 16:28
Five Important findings + four small Minor follow-ups from the PR #3
local code review:

Important
- tests/scanner_crosscheck.rs — drop the stale "AVX2 does not validate
  brackets" comment and tighten the proptest to require full Result
  equality (Ok/Err verdict + error offset) and indices equality on every
  case, not just on Ok. After the tail-bypass fix scalar and AVX2 run the
  same scan_emit_resume + validate_brackets pipeline, so this is now
  enforceable. Still passes the 2000-case proptest.
- benches/lua_bench.lua — switch image-size RNG from math.random (which
  delegates to libc rand() and varies across machines) to a deterministic
  Park-Miller LCG. Same target_bytes now produces byte-identical output
  on any LuaJIT 2.1 host.
- benches/lua_bench.lua — tighten the loop so the actual payload size
  matches its label. Cap the per-iteration `upper` at `remaining` and
  allow the last image to shrink below the 50 KB floor when fewer bytes
  remain. Observed: every scenario now lands within ~0.1% of its label
  (100k -> 102351 bytes, 1m -> 1048527, 10m -> 10485711) vs up to +49%
  before.
- src/scan/avx2.rs — remove the dead `else if in_string != 0` branch
  inside the tail handler. `i < buf.len()` makes the `scalar_start <=
  buf.len()` check trivially true, and scan_emit_resume already returns
  Err(buf.len()) when start == buf.len() and in_string is set. Replace
  the unreachable branch with a comment that documents the invariant.
- src/scan/mod.rs — drop the inaccurate "the check is defensive" wording
  from validate_brackets. The function is correctness-coupled with the
  scanner that produced its index list; a forged quote would flip
  in_string and mask later mismatches.

Minor
- Makefile — match the spec's "target — description" help format (em-dash
  separator) and tighten the awk FS pattern from `:.*## ` to `:[^#]*## `
  so descriptions containing `##` aren't truncated by the greedy `.*`.
- benches/lua_bench.lua — bump 2m / 5m / 10m iters from 10 to 20 so
  bigger-payload measurements ride out one-shot allocator / page-fault
  noise.
- src/scan/avx2.rs — rename `escaped_quotes_do_not_trip_fastpath` to
  `escaped_quotes_remain_correct_with_fastpath`. The test asserts parity
  with scalar, not that the branch was taken (we have no counter to
  observe that), so the name should reflect what's actually checked.

cargo test --release: 70+3+10+1+5+3+12+1+1 = 106 unit/integration tests
plus the 2000-case proptest, all pass. cargo test --release
--no-default-features (scalar-only build): 60+3+10+1+5+3+12+1+1 = 96
tests, all pass.
Address the Important #1 + three Minor findings from the local review
of PR #4 (the docs drifted from the code introduced in the same PR):

- benches/lua_bench.lua: rewrite the size-accuracy comment. The prior
  draft referenced a `remaining + slack` upper-bound expression that
  the final code does not use; replace with the two-branch behaviour
  actually implemented (normal `min(500K, remaining)`; final image
  falls through to `max(1024, remaining)`) and the real worst-case
  overshoot (~1 KB, not "~10 KB").
- Makefile: explain the `:[^#]*## ` FS choice and its tradeoff (targets
  with `#` in the prerequisite list won't render — none today).
- tests/scanner_crosscheck.rs: clarify that the indices assertion holds
  for both Ok and Err cases because scan_emit_resume always completes
  emission before any potential Err, and validate_brackets does not
  modify the index list.
- src/scan/avx2.rs: spell out the scalar_start invariant including the
  exact boundary case (scalar_start == buf.len() when i == buf.len()-1
  && in_string != 0 && bs_carry != 0) that scan_emit_resume's post-loop
  in_str check covers.

No code changes. All tests still pass.
@membphis membphis merged commit 0a5a758 into main May 15, 2026
2 checks passed
@membphis membphis deleted the worktree-review-followup-hygiene branch May 15, 2026 16:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant