test: cargo-fuzz loader targets + SignBitmap malformed-loader coverage #4

Merged
Fieldnote-Echo merged 7 commits into main from prod/fuzz on May 23, 2026
Conversation

@Fieldnote-Echo
Owner

Fuzz targets + SignBitmap loader-test gap

Stacked on #1 (hygiene) — merge #1 first.

cargo-fuzz scaffold (fuzz/)

Four libFuzzer targets, one per rank_io loader — load_rank, load_rankquant, load_bitmap, load_sign_bitmap — driving the loaders directly with arbitrary bytes via a unique temp file (auto-cleaned). Deps are isolated in fuzz/Cargo.toml; the main crate manifest is untouched.

Results: each target was run at -runs=50000 with no crashes, aborts, or OOM/timeout artifacts. Coverage feedback even rediscovered each format's magic (TVR1/TVRQ/TVBM/TVSB), confirming the targets reach past the header gate.
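
For readers unfamiliar with the shape of such a target, here is a std-only sketch of one fuzz iteration. It is not the PR's code: the real targets use libfuzzer-sys and the tempfile crate and call ordvec::rank_io::load_* directly. `toy_load` and `drive_loader` are hypothetical stand-ins, and the header layout (4-byte magic followed by a little-endian u32 dim) is an assumption made for illustration only.

```rust
use std::fs;
use std::io::{self, Read};
use std::path::{Path, PathBuf};
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical stand-in for a loader such as `rank_io::load_rank`.
/// Assumed layout: 4-byte magic, then a little-endian u32 `dim`.
fn toy_load(path: &Path) -> io::Result<u32> {
    let mut buf = Vec::new();
    fs::File::open(path)?.read_to_end(&mut buf)?;
    let bad = |msg: &str| io::Error::new(io::ErrorKind::InvalidData, msg.to_string());
    if buf.len() < 8 {
        return Err(bad("truncated header"));
    }
    if &buf[..4] != b"TVR1" {
        return Err(bad("bad magic"));
    }
    let dim = u32::from_le_bytes([buf[4], buf[5], buf[6], buf[7]]);
    if dim == 0 {
        return Err(bad("dim must be non-zero"));
    }
    Ok(dim)
}

/// Counter used to give every iteration its own file name.
static COUNTER: AtomicU64 = AtomicU64::new(0);

/// Body of one fuzz iteration: write the arbitrary bytes to a unique
/// temp file, run the loader, then delete the file. A real fuzz target
/// would drop the result; a panic inside the loader is what counts as a crash.
fn drive_loader(data: &[u8]) -> io::Result<u32> {
    let n = COUNTER.fetch_add(1, Ordering::Relaxed);
    let path: PathBuf = std::env::temp_dir()
        .join(format!("fuzz-sketch-{}-{}.bin", std::process::id(), n));
    fs::write(&path, data)?;
    let result = toy_load(&path);
    let _ = fs::remove_file(&path); // best-effort cleanup
    result
}
```

In a real cargo-fuzz target the body of `drive_loader` would sit inside the `fuzz_target!` closure, and the returned `io::Result` would simply be discarded, since the no-panic contract is the only assertion.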

SignBitmap loader-test gap

The cross-loader malformed-file test (rank_io_loaders_reject_malformed_files_without_panicking) omitted SignBitmapIndex::load. Added a fourth catch_unwind block asserting it returns Err (never panics) on every case, plus three TVSB-shaped malformed inputs (dim_not_64, oversize, truncated) so the .tvsb header path is exercised on its own, not just via foreign magic.
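
The catch_unwind pattern described above can be sketched roughly as follows. This is an illustration only, not the test in tests/rank_index/main.rs: `classify`, `Outcome`, `careless_load`, and `careful_load` are hypothetical names, and the loaders here take bytes rather than a file path to keep the example short.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Outcome of running a loader on one malformed input.
#[derive(Debug, PartialEq)]
enum Outcome {
    RejectedCleanly,
    AcceptedUnexpectedly,
    Panicked,
}

/// Run `load` on `bytes` under `catch_unwind` so that a panic is
/// reported as a value instead of aborting the whole test.
fn classify<T, E>(load: impl Fn(&[u8]) -> Result<T, E>, bytes: &[u8]) -> Outcome {
    match panic::catch_unwind(AssertUnwindSafe(|| load(bytes))) {
        Ok(Err(_)) => Outcome::RejectedCleanly,
        Ok(Ok(_)) => Outcome::AcceptedUnexpectedly,
        Err(_) => Outcome::Panicked,
    }
}

/// Hypothetical careless loader: indexes without a length check,
/// so a short input panics.
fn careless_load(bytes: &[u8]) -> Result<u8, String> {
    Ok(bytes[4])
}

/// Hypothetical careful loader: the same read, but bounds-checked.
fn careful_load(bytes: &[u8]) -> Result<u8, String> {
    bytes.get(4).copied().ok_or_else(|| "truncated".to_string())
}
```

A malformed-file test in this style would assert `Outcome::RejectedCleanly` for every (loader, case) pair, which is stricter than "does not panic" alone because it also catches a loader that silently accepts bad input.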

Verification

  • cargo +nightly fuzz build — all four targets compile; 50k runs each, zero crashes.
  • cargo test: 80 passed, 0 failed; with --features experimental: 86 passed, 0 failed.

All 26 clippy errors fixed across src/, tests/, and examples/:
- manual_is_multiple_of (13×): x % n == 0 → x.is_multiple_of(n),
  stable since 1.87, safe on MSRV 1.89
- manual_range_contains (2×): negated range comparisons →
  !(a..=b).contains(&x)
- manual_repeat_n (1×): repeat(v).take(n) → repeat_n(v, n),
  stable 1.82
- too_many_arguments (7×): #[allow] with justifying comment on
  scan_b2_fastscan_avx512, scan_b2_fastscan_scalar,
  scan_via_lut_scalar (src), finalise_row, bench_two_stage,
  bench_two_stage_batched, bench_sign_two_stage_batched (examples)
- needless_range_loop (9×): #[allow] on all SIMD kernel loops
  (bitmap.rs AVX-512 kernels, sign_bitmap.rs AVX-512 kernel,
  fastscan.rs scalar finalize) plus two clear mechanical rewrites
  in tests/rank_index/quant.rs and two #[allow] in
  tests/rank_index/index.rs (raw index used in assertion message)
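
For reference, the three mechanical rewrites listed above look roughly like this. The function names and the specific bounds (64, 1..=4) are made up for illustration and are not taken from the crate; the snippet needs a toolchain new enough for `is_multiple_of` and `repeat_n`.

```rust
use std::iter;

/// manual_is_multiple_of: `dim % 64 == 0` becomes `dim.is_multiple_of(64)`.
fn dim_is_aligned(dim: usize) -> bool {
    dim.is_multiple_of(64)
}

/// manual_range_contains: `bits < 1 || bits > 4` becomes a negated
/// inclusive-range check.
fn bits_out_of_range(bits: u32) -> bool {
    !(1..=4).contains(&bits)
}

/// manual_repeat_n: `repeat(0u8).take(dim)` becomes `repeat_n(0u8, dim)`.
fn zero_row(dim: usize) -> Vec<u8> {
    iter::repeat_n(0u8, dim).collect()
}
```

One behavioral detail worth keeping in mind when applying the first rewrite: `x % 0` panics, whereas `x.is_multiple_of(0)` returns true only when `x` is 0, so the two forms differ if the divisor can ever be zero.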

cargo fmt --all run; reformatted bitmap.rs, fastscan.rs and
several test/example files. No behavior change.
Old lockfile was version 3 and still listed serde as a transitive
dependency that no longer exists in the dependency tree. Regenerated
with `cargo generate-lockfile`; new lockfile is version 4, contains
no serde entries, and passes `cargo build --locked` and
`cargo test --locked`.
@qodo-code-review

Review Summary by Qodo

Add libFuzzer fuzz targets and SignBitmapIndex loader test coverage

✨ Enhancement 🧪 Tests

Walkthroughs

Description
• Add libFuzzer targets for all four rank_io loaders (TVR1, TVRQ, TVBM, TVSB)
• Extend malformed-file test to cover SignBitmapIndex with three TVSB-specific cases
• Apply cargo fmt and fix 26 clippy warnings across codebase
• Regenerate Cargo.lock to remove stale serde dependency
Diagram
flowchart LR
  A["Fuzz Scaffold<br/>fuzz/Cargo.toml"] -->|"Four targets"| B["load_rank<br/>load_rankquant<br/>load_bitmap<br/>load_sign_bitmap"]
  B -->|"Drive loaders<br/>with arbitrary bytes"| C["rank_io module"]
  D["Malformed-file test"] -->|"Add SignBitmapIndex<br/>+ TVSB cases"| E["rank_io_loaders_reject_malformed_files"]
  F["Code cleanup"] -->|"fmt + clippy fixes"| G["26 warnings resolved"]

File Changes

1. fuzz/Cargo.toml ⚙️ Configuration changes +51/-0
   New cargo-fuzz scaffold with isolated dependencies
2. fuzz/fuzz_targets/load_rank.rs 🧪 Tests +38/-0
   libFuzzer target for TVR1 rank loader
3. fuzz/fuzz_targets/load_rankquant.rs 🧪 Tests +35/-0
   libFuzzer target for TVRQ rankquant loader
4. fuzz/fuzz_targets/load_bitmap.rs 🧪 Tests +33/-0
   libFuzzer target for TVBM bitmap loader
5. fuzz/fuzz_targets/load_sign_bitmap.rs 🧪 Tests +35/-0
   libFuzzer target for TVSB sign bitmap loader
6. tests/rank_index/main.rs 🧪 Tests +56/-13
   Add SignBitmapIndex to malformed-file test coverage
7. examples/bench_rank.rs Formatting +141/-63
   Format and add clippy allow annotations
8. src/rank_io.rs ✨ Enhancement +11/-21
   Replace modulo checks with is_multiple_of method
9. src/rank_index/bitmap.rs ✨ Enhancement +19/-29
   Apply is_multiple_of and add needless_range_loop allows
10. src/rank_index/quant.rs ✨ Enhancement +30/-15
    Replace modulo checks and format function signatures
11. src/rank_index/fastscan.rs ✨ Enhancement +10/-19
    Add too_many_arguments and needless_range_loop allows
12. src/rank_index/quant_kernels.rs ✨ Enhancement +13/-8
    Add too_many_arguments allow and format assertions
13. src/sign_bitmap.rs ✨ Enhancement +10/-11
    Replace modulo checks and add needless_range_loop allows
14. src/rank.rs ✨ Enhancement +15/-8
    Add needless_range_loop allow and format assertions
15. src/rank_index/util.rs Formatting +2/-7
    Format function signatures and simplify conditions
16. src/rank_index/index.rs ✨ Enhancement +5/-1
    Add needless_range_loop allows and format struct init
17. tests/rank_index/bitmap.rs Formatting +40/-23
    Reorder imports and format long lines
18. tests/rank_index/quant.rs ✨ Enhancement +17/-14
    Reorder imports and refactor loop to iterator
19. tests/rank_index/fastscan.rs Formatting +15/-7
    Reorder imports and format long expressions
20. tests/rank_index/index.rs ✨ Enhancement +3/-1
    Reorder imports and add needless_range_loop allows
21. tests/rank_index/multi_bucket.rs Formatting +2/-2
    Reorder imports and format iterator chain
22. tests/redteam_alpha.rs Formatting +21/-6
    Format long expressions and assertions
23. tests/redteam_beta.rs Formatting +11/-29
    Simplify filter chains and format assignments
24. tests/redteam_delta.rs Formatting +7/-5
    Format file creation and path construction
25. src/rank_index/multi_bucket.rs Additional files +3/-6
    ...


qodo-code-review Bot commented May 22, 2026

Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (0) 📎 Requirement gaps (0)

Great, no issues found!

Qodo reviewed your code and found no material issues that require review


@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a new fuzzing suite for the ordvec library, targeting its various index loaders to ensure robustness against malformed inputs. It also includes widespread code reformatting, the addition of Clippy lint suppressions, and the removal of the serde dependency. Review feedback highlights potential toolchain compatibility issues with Rust 1.82.0+ features like is_multiple_of and repeat_n, suggesting more compatible alternatives. Additionally, the reviewer recommended refactoring the I/O logic to support in-memory operations, which would significantly improve fuzzing efficiency by reducing disk I/O overhead.

Comment thread src/rank_io.rs
Comment thread tests/rank_index/main.rs
Comment thread fuzz/fuzz_targets/load_bitmap.rs
Fieldnote-Echo and others added 3 commits May 22, 2026 18:01
The x86 SIMD dispatch (select_simd_tier + the SimdTier match arms, the AVX
kernels, BATCHED_AVX512_CHUNK) is cfg(target_arch=x86_64)-gated, but the glue
it references — the SimdTier::Avx2/Avx512 variants, the batched-chunk consts,
and the simd_tier / centre_drop_used bindings — was defined unconditionally.
On non-x86 (aarch64 / macos-latest CI) those are dead/unused and, under
RUSTFLAGS=-D warnings, fail the build with 7 dead_code/unused errors.

Add cfg_attr(not(target_arch=x86_64), allow(...)) to each so the crate builds
clean on aarch64 (scalar path) while x86 is untouched. Verified: aarch64 lib +
tests + examples compile clean under -D warnings; x86 fmt/clippy/test 80/86.
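
The gating approach in the commit message above can be sketched as below. This is a simplified illustration, not the crate's code: the enum and constant reuse names from the message, but `chunk_len`, the value 16, and the exact attribute placement are assumptions.

```rust
/// Simplified mirror of a SIMD tier enum. The AVX variants are only
/// constructed on x86_64, so elsewhere they would trip `dead_code`
/// under `-D warnings` without the conditional allow.
#[derive(Debug, Clone, Copy, PartialEq)]
enum SimdTier {
    Scalar,
    #[cfg_attr(not(target_arch = "x86_64"), allow(dead_code))]
    Avx2,
    #[cfg_attr(not(target_arch = "x86_64"), allow(dead_code))]
    Avx512,
}

/// Only referenced from x86_64-gated code (value is illustrative).
#[cfg_attr(not(target_arch = "x86_64"), allow(dead_code))]
const BATCHED_AVX512_CHUNK: usize = 16;

#[cfg(target_arch = "x86_64")]
fn select_simd_tier() -> SimdTier {
    if is_x86_feature_detected!("avx512f") {
        SimdTier::Avx512
    } else if is_x86_feature_detected!("avx2") {
        SimdTier::Avx2
    } else {
        SimdTier::Scalar
    }
}

#[cfg(not(target_arch = "x86_64"))]
fn select_simd_tier() -> SimdTier {
    SimdTier::Scalar
}

/// Hypothetical consumer: the SIMD arms exist only on x86_64.
fn chunk_len(tier: SimdTier) -> usize {
    match tier {
        SimdTier::Scalar => 1,
        #[cfg(target_arch = "x86_64")]
        SimdTier::Avx2 | SimdTier::Avx512 => BATCHED_AVX512_CHUNK,
        #[cfg(not(target_arch = "x86_64"))]
        _ => 1,
    }
}
```

Using `cfg_attr(..., allow(...))` rather than gating the definitions themselves keeps the type's shape identical on every architecture, which is presumably why the commit leaves the x86 build untouched.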
Scaffold a cargo-fuzz crate under fuzz/ with one libFuzzer target per
on-disk loader in src/rank_io.rs: load_rank (TVR1), load_rankquant
(TVRQ), load_bitmap (TVBM), load_sign_bitmap (TVSB).

rank_io is pub at the crate root, so each target drives
ordvec::rank_io::load_* directly — same loader code as the
RankIndex/RankQuantIndex/BitmapIndex/SignBitmapIndex::load wrappers,
one fewer indirection. Each target writes the arbitrary input to a
unique tempfile (auto-cleaned) and lets the io::Result drop; libFuzzer
treats any panic/abort/OOB as a crash, so the no-panic contract is the
assertion.

The fuzz crate's deps (libfuzzer-sys, arbitrary, tempfile) are isolated
in fuzz/Cargo.toml and do not touch the main crate manifest. Build
artifacts, corpora, and crash artifacts are gitignored.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The shared malformed-file fuzz test
(rank_io_loaders_reject_malformed_files_without_panicking) ran every
malformed case through RankIndex/RankQuantIndex/BitmapIndex::load but
omitted SignBitmapIndex::load, leaving the .tvsb loader without the same
no-panic discipline (Grumpy P1).

Add a fourth catch_unwind block asserting SignBitmapIndex::load returns
Err (never panics) on each case, plus three TVSB-shaped malformed
inputs (dim not a multiple of 64, overflowing payload, truncated
payload) so the sign-bitmap format is exercised on its own header path
rather than only on foreign magic.
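
To make the three TVSB-shaped cases concrete, here is a sketch of how such inputs could be built and rejected. Only the TVSB magic and the "dim must be a multiple of 64" rule come from the text above; the rest of the header layout (u32 dim, u64 vector count, dim/8 payload bytes per vector) and the names `tvsb_bytes` and `validate_tvsb` are assumptions for illustration, not the real .tvsb format.

```rust
const HEADER_LEN: usize = 16;

/// Build a file image with an ASSUMED header layout:
/// b"TVSB", u32 LE dim, u64 LE vector count, then a zeroed payload.
fn tvsb_bytes(dim: u32, n: u64, payload_len: usize) -> Vec<u8> {
    let mut out = Vec::with_capacity(HEADER_LEN + payload_len);
    out.extend_from_slice(b"TVSB");
    out.extend_from_slice(&dim.to_le_bytes());
    out.extend_from_slice(&n.to_le_bytes());
    out.resize(HEADER_LEN + payload_len, 0);
    out
}

/// Hypothetical validator: every malformed case returns Err, none panic.
fn validate_tvsb(bytes: &[u8]) -> Result<(u32, u64), &'static str> {
    if bytes.len() < HEADER_LEN || &bytes[..4] != b"TVSB" {
        return Err("bad header");
    }
    let dim = u32::from_le_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]);
    let mut raw_n = [0u8; 8];
    raw_n.copy_from_slice(&bytes[8..16]);
    let n = u64::from_le_bytes(raw_n);
    if dim == 0 || dim % 64 != 0 {
        return Err("dim not a multiple of 64");
    }
    // checked_mul guards against a header whose payload size overflows.
    let expected = n
        .checked_mul(u64::from(dim / 8))
        .ok_or("payload size overflows")?;
    if (bytes.len() - HEADER_LEN) as u64 != expected {
        return Err("payload length mismatch");
    }
    Ok((dim, n))
}
```

Under this assumed layout, `tvsb_bytes(100, 1, 13)` plays the role of dim_not_64, `tvsb_bytes(64, u64::MAX, 0)` the oversize case, and `tvsb_bytes(64, 4, 10)` the truncated case. All three carry the correct magic, so they reach the checks behind the header gate.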

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI left a comment

Pull request overview

Adds a cargo-fuzz harness for the rank_io loaders and extends the existing malformed-file regression test to cover SignBitmapIndex::load, while also applying a set of small refactors/formatting/clippy-appeasement changes across tests and SIMD scanning code.

Changes:

  • Add fuzz/ with four libFuzzer targets to exercise rank_io::{load_rank, load_rankquant, load_bitmap, load_sign_bitmap} on arbitrary bytes via temp files.
  • Extend rank_io_loaders_reject_malformed_files_without_panicking to include SignBitmapIndex::load plus TVSB-shaped malformed inputs.
  • Minor internal cleanups: replace % divisibility checks with is_multiple_of, add targeted clippy allows, and apply formatting-only rewrites across tests/examples.

Reviewed changes

Copilot reviewed 26 out of 28 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| tests/redteam_delta.rs | Formatting-only rewrite of temp-file helper. |
| tests/redteam_beta.rs | Formatting-only rewrites in assertions/collections. |
| tests/redteam_alpha.rs | Formatting-only rewrites in helpers and assertions. |
| tests/rank_index/quant.rs | Import ordering + small loop rewrite in a test (equivalent semantics). |
| tests/rank_index/multi_bucket.rs | Import ordering only. |
| tests/rank_index/main.rs | Adds SignBitmap loader to malformed-file no-panic test + adds TVSB malformed cases. |
| tests/rank_index/index.rs | Import ordering + clippy allow annotations in tests. |
| tests/rank_index/fastscan.rs | Import ordering + formatting-only rewrites. |
| tests/rank_index/bitmap.rs | Import ordering + formatting-only rewrites. |
| src/sign_bitmap.rs | Minor refactors: is_multiple_of checks + clippy allows for range loops + dead_code gating. |
| src/rank.rs | Add clippy allow for a range loop + formatting-only changes in tests. |
| src/rank_io.rs | Uses range checks + is_multiple_of for loader validation; formatting-only signature changes. |
| src/rank_index/util.rs | Formatting-only line wrapping changes. |
| src/rank_index/quant.rs | is_multiple_of in SIMD dispatch guards + clippy cfg_attr allows + formatting-only rewrites. |
| src/rank_index/quant_kernels.rs | Adds clippy allow for kernel argument count + formatting-only rewrites. |
| src/rank_index/multi_bucket.rs | Formatting-only rewrite of Rayon loop. |
| src/rank_index/index.rs | Formatting-only struct literal expansion. |
| src/rank_index/fastscan.rs | Adds clippy allows for kernel signatures/range loops + formatting-only rewrites. |
| src/rank_index/bitmap.rs | is_multiple_of checks + clippy allows for range loops + formatting-only rewrites. |
| fuzz/fuzz_targets/load_sign_bitmap.rs | New libFuzzer target for rank_io::load_sign_bitmap. |
| fuzz/fuzz_targets/load_rankquant.rs | New libFuzzer target for rank_io::load_rankquant. |
| fuzz/fuzz_targets/load_rank.rs | New libFuzzer target for rank_io::load_rank. |
| fuzz/fuzz_targets/load_bitmap.rs | New libFuzzer target for rank_io::load_bitmap. |
| fuzz/Cargo.toml | New isolated fuzz crate manifest. |
| fuzz/Cargo.lock | New lockfile for fuzz-only dependencies. |
| fuzz/.gitignore | Ignore fuzz build artifacts/corpus outputs. |
| examples/bench_rank.rs | Import ordering + formatting-only rewrites + clippy allows for helper arity. |
| Cargo.lock | Lockfile v4 update / dependency set adjustment (stacked hygiene). |

Comment thread tests/rank_index/main.rs Outdated
Comment thread fuzz/Cargo.toml
Address bot review on PR #4:
- fuzz/Cargo.toml: remove the arbitrary dependency — no target uses the
  Arbitrary derive (all drive the loaders with raw &[u8]), so it was unused.
- tests/rank_index/main.rs: the tvr_truncated explanatory comment had been
  aligned ~44 cols right by rustfmt (it continued a trailing line-comment);
  move it above those lines so rustfmt keeps it at normal indent.
The prior commit removed arbitrary from fuzz/Cargo.toml but left the lockfile
recording it as a direct dependency of ordvec-fuzz, so cargo build --locked was
inconsistent. Stage the regenerated lockfile: arbitrary is dropped from
ordvec-fuzz's direct deps (it stays transitively via libfuzzer-sys). Verified
with cargo +nightly build --locked.
@Fieldnote-Echo Fieldnote-Echo merged commit 40154d6 into main May 23, 2026
@Fieldnote-Echo Fieldnote-Echo deleted the prod/fuzz branch May 23, 2026 00:03
Fieldnote-Echo added a commit that referenced this pull request May 23, 2026
… review)

Address the bot review wave (gemini/Codex/qodo) — all "convert core panics to
clean Python errors", completing the binding's boundary-guard design:

- Width validation (check_width): every f32 input now checks ncols == dim (2-D)
  / len == dim (1-D). The core derives n = len/dim and only asserts divisibility,
  so a wrong-but-divisible shape (e.g. (1,128) into a dim-64 index) was silently
  reinterpreted as a different vector count, or panicked on the result reshape.
  Now a clean ValueError. (gemini x3 critical, Codex x2 P1, qodo #3)
- Constructor validation: Rank/RankQuant/Bitmap/SignBitmap `new` return PyResult
  and validate against the EXACT core asserts (dim in [2, u16::MAX]; bits in
  {1,2,4} + dim multiple of 8/bits and 2^bits; dim % 64 + 0 < n_top < dim;
  dim % 64 + <= MAX_SIGN_BITMAP_DIM) -> ValueError instead of panic. (gemini x4)
- swap_remove (Rank, RankQuant): bounds-check -> IndexError, not panic.
  (gemini high, qodo #4)
- README provenance tightened to the canonical "developed within turbovec,
  factored out" phrasing. (qodo #2)

Tests: +9 (width-mismatch x6, swap_remove OOB x3); constructor-rejection tests
tightened from BaseException to ValueError. Suite now 117 passed + 1 xfail.
clippy -D warnings + fmt clean; MSRV 1.89 builds core + binding.

Not changed: qodo #1 (ndarray via numpy) is a deliberate, documented core-vs-
binding split (deps grep + publish scoped to -p ordvec; the core's published lock
is clean; the binding is publish = false, PyPI-only) -- explained on-thread.
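
The width-validation point in the commit message above is easy to see in a small sketch. The names `check_width` and `core_vector_count` are illustrative (the former borrows the name used in the message, but this body is an assumption, and the real binding returns a Python ValueError rather than a String).

```rust
/// Boundary check in the spirit of the described `check_width`:
/// the input's width must equal the index dimension exactly.
fn check_width(ncols: usize, dim: usize) -> Result<(), String> {
    if ncols == dim {
        Ok(())
    } else {
        Err(format!("expected width {}, got {}", dim, ncols))
    }
}

/// How a core that only checks divisibility would count vectors
/// in a flat buffer of `len` floats.
fn core_vector_count(len: usize, dim: usize) -> Option<usize> {
    if dim != 0 && len % dim == 0 {
        Some(len / dim)
    } else {
        None
    }
}
```

With a dim-64 index, a (1, 128) array flattens to 128 floats: `core_vector_count(128, 64)` is `Some(2)`, so the divisibility check alone would treat one wide row as two vectors, while `check_width(128, 64)` rejects it up front.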
Fieldnote-Echo added a commit that referenced this pull request May 25, 2026
… (qodo)

Two robustness fixes to part (2) of release_publish_invariants.sh:

- Publish-scoped (qodo #1): step_line grepped the whole workflow with head -1,
  so a download-artifact in another job could satisfy the ordering even if the
  publish job regressed. Now extract the 'publish:' job body (its key to the next
  2-space-indented job key or EOF) and search only within it.

- Multi-line aware (qodo #4): the cleanup was anchored on a single-line 'run:',
  so a 'run: |' block would false-fail. Now match the delete command on its own
  line, so both 'run: ... -delete' and a multi-line 'run: |' block work — still
  requiring a real delete ('find ... -delete' or 'rm ... *.cdx.json').

Verified A-G: passes on the real workflow and multi-line run:| (F); fails on
removed / reordered-in-publish / comment-only / non-deleting-find; and is no
longer fooled by a decoy download-artifact in another job (G).