test: cargo-fuzz loader targets + SignBitmap malformed-loader coverage#4
Conversation
All 26 clippy errors fixed across src/, tests/, and examples/: - manual_is_multiple_of (13×): x % n == 0 → x.is_multiple_of(n), stable since 1.87, safe on MSRV 1.89 - manual_range_contains (2×): negated range comparisons → !(a..=b).contains(&x) - manual_repeat_n (1×): repeat(v).take(n) → repeat_n(v, n), stable 1.82 - too_many_arguments (7×): #[allow] with justifying comment on scan_b2_fastscan_avx512, scan_b2_fastscan_scalar, scan_via_lut_scalar (src), finalise_row, bench_two_stage, bench_two_stage_batched, bench_sign_two_stage_batched (examples) - needless_range_loop (9×): #[allow] on all SIMD kernel loops (bitmap.rs AVX-512 kernels, sign_bitmap.rs AVX-512 kernel, fastscan.rs scalar finalize) plus two clear mechanical rewrites in tests/rank_index/quant.rs and two #[allow] in tests/rank_index/index.rs (raw index used in assertion message) cargo fmt --all run; reformatted bitmap.rs, fastscan.rs and several test/example files. No behavior change.
Old lockfile was version 3 and still listed serde as a transitive dependency that no longer exists in the dependency tree. Regenerated with `cargo generate-lockfile`; new lockfile is version 4, contains no serde entries, and passes `cargo build --locked` and `cargo test --locked`.
Review Summary by QodoAdd libFuzzer fuzz targets and SignBitmapIndex loader test coverage
WalkthroughsDescription• Add libFuzzer targets for all four rank_io loaders (TVR1, TVRQ, TVBM, TVSB) • Extend malformed-file test to cover SignBitmapIndex with three TVSB-specific cases • Apply cargo fmt and fix 26 clippy warnings across codebase • Regenerate Cargo.lock to remove stale serde dependency Diagramflowchart LR
A["Fuzz Scaffold<br/>fuzz/Cargo.toml"] -->|"Four targets"| B["load_rank<br/>load_rankquant<br/>load_bitmap<br/>load_sign_bitmap"]
B -->|"Drive loaders<br/>with arbitrary bytes"| C["rank_io module"]
D["Malformed-file test"] -->|"Add SignBitmapIndex<br/>+ TVSB cases"| E["rank_io_loaders_reject_malformed_files"]
F["Code cleanup"] -->|"fmt + clippy fixes"| G["26 warnings resolved"]
File Changes1. fuzz/Cargo.toml
|
There was a problem hiding this comment.
Code Review
This pull request introduces a new fuzzing suite for the ordvec library, targeting its various index loaders to ensure robustness against malformed inputs. It also includes widespread code reformatting, the addition of Clippy lint suppressions, and the removal of the serde dependency. Review feedback highlights potential toolchain compatibility issues with Rust 1.82.0+ features like is_multiple_of and repeat_n, suggesting more compatible alternatives. Additionally, the reviewer recommended refactoring the I/O logic to support in-memory operations, which would significantly improve fuzzing efficiency by reducing disk I/O overhead.
The x86 SIMD dispatch (select_simd_tier + the SimdTier match arms, the AVX kernels, BATCHED_AVX512_CHUNK) is cfg(target_arch=x86_64)-gated, but the glue it references — the SimdTier::Avx2/Avx512 variants, the batched-chunk consts, and the simd_tier / centre_drop_used bindings — was defined unconditionally. On non-x86 (aarch64 / macos-latest CI) those are dead/unused and, under RUSTFLAGS=-D warnings, fail the build with 7 dead_code/unused errors. Add cfg_attr(not(target_arch=x86_64), allow(...)) to each so the crate builds clean on aarch64 (scalar path) while x86 is untouched. Verified: aarch64 lib + tests + examples compile clean under -D warnings; x86 fmt/clippy/test 80/86.
Scaffold a cargo-fuzz crate under fuzz/ with one libFuzzer target per on-disk loader in src/rank_io.rs: load_rank (TVR1), load_rankquant (TVRQ), load_bitmap (TVBM), load_sign_bitmap (TVSB). rank_io is pub at the crate root, so each target drives ordvec::rank_io::load_* directly — same loader code as the RankIndex/RankQuantIndex/BitmapIndex/SignBitmapIndex::load wrappers, one fewer indirection. Each target writes the arbitrary input to a unique tempfile (auto-cleaned) and lets the io::Result drop; libFuzzer treats any panic/abort/OOB as a crash, so the no-panic contract is the assertion. The fuzz crate's deps (libfuzzer-sys, arbitrary, tempfile) are isolated in fuzz/Cargo.toml and do not touch the main crate manifest. Build artifacts, corpora, and crash artifacts are gitignored. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The shared malformed-file fuzz test (rank_io_loaders_reject_malformed_files_without_panicking) ran every malformed case through RankIndex/RankQuantIndex/BitmapIndex::load but omitted SignBitmapIndex::load, leaving the .tvsb loader without the same no-panic discipline (Grumpy P1). Add a fourth catch_unwind block asserting SignBitmapIndex::load returns Err (never panics) on each case, plus three TVSB-shaped malformed inputs (dim not a multiple of 64, overflowing payload, truncated payload) so the sign-bitmap format is exercised on its own header path rather than only on foreign magic. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Adds a cargo-fuzz harness for the rank_io loaders and extends the existing malformed-file regression test to cover SignBitmapIndex::load, while also applying a set of small refactors/formatting/clippy-appeasement changes across tests and SIMD scanning code.
Changes:
- Add
fuzz/with four libFuzzer targets to exerciserank_io::{load_rank, load_rankquant, load_bitmap, load_sign_bitmap}on arbitrary bytes via temp files. - Extend
rank_io_loaders_reject_malformed_files_without_panickingto includeSignBitmapIndex::loadplus TVSB-shaped malformed inputs. - Minor internal cleanups: replace
%divisibility checks withis_multiple_of, add targeted clippy allows, and apply formatting-only rewrites across tests/examples.
Reviewed changes
Copilot reviewed 26 out of 28 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
tests/redteam_delta.rs |
Formatting-only rewrite of temp-file helper. |
tests/redteam_beta.rs |
Formatting-only rewrites in assertions/collections. |
tests/redteam_alpha.rs |
Formatting-only rewrites in helpers and assertions. |
tests/rank_index/quant.rs |
Import ordering + small loop rewrite in a test (equivalent semantics). |
tests/rank_index/multi_bucket.rs |
Import ordering only. |
tests/rank_index/main.rs |
Adds SignBitmap loader to malformed-file no-panic test + adds TVSB malformed cases. |
tests/rank_index/index.rs |
Import ordering + clippy allow annotations in tests. |
tests/rank_index/fastscan.rs |
Import ordering + formatting-only rewrites. |
tests/rank_index/bitmap.rs |
Import ordering + formatting-only rewrites. |
src/sign_bitmap.rs |
Minor refactors: is_multiple_of checks + clippy allows for range loops + dead_code gating. |
src/rank.rs |
Add clippy allow for a range loop + formatting-only changes in tests. |
src/rank_io.rs |
Uses range checks + is_multiple_of for loader validation; formatting-only signature changes. |
src/rank_index/util.rs |
Formatting-only line wrapping changes. |
src/rank_index/quant.rs |
is_multiple_of in SIMD dispatch guards + clippy cfg_attr allows + formatting-only rewrites. |
src/rank_index/quant_kernels.rs |
Adds clippy allow for kernel argument count + formatting-only rewrites. |
src/rank_index/multi_bucket.rs |
Formatting-only rewrite of Rayon loop. |
src/rank_index/index.rs |
Formatting-only struct literal expansion. |
src/rank_index/fastscan.rs |
Adds clippy allows for kernel signatures/range loops + formatting-only rewrites. |
src/rank_index/bitmap.rs |
is_multiple_of checks + clippy allows for range loops + formatting-only rewrites. |
fuzz/fuzz_targets/load_sign_bitmap.rs |
New libFuzzer target for rank_io::load_sign_bitmap. |
fuzz/fuzz_targets/load_rankquant.rs |
New libFuzzer target for rank_io::load_rankquant. |
fuzz/fuzz_targets/load_rank.rs |
New libFuzzer target for rank_io::load_rank. |
fuzz/fuzz_targets/load_bitmap.rs |
New libFuzzer target for rank_io::load_bitmap. |
fuzz/Cargo.toml |
New isolated fuzz crate manifest. |
fuzz/Cargo.lock |
New lockfile for fuzz-only dependencies. |
fuzz/.gitignore |
Ignore fuzz build artifacts/corpus outputs. |
examples/bench_rank.rs |
Import ordering + formatting-only rewrites + clippy allows for helper arity. |
Cargo.lock |
Lockfile v4 update / dependency set adjustment (stacked hygiene). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Address bot review on PR #4: - fuzz/Cargo.toml: remove the arbitrary dependency — no target uses the Arbitrary derive (all drive the loaders with raw &[u8]), so it was unused. - tests/rank_index/main.rs: the tvr_truncated explanatory comment had been aligned ~44 cols right by rustfmt (it continued a trailing line-comment); move it above those lines so rustfmt keeps it at normal indent.
The prior commit removed arbitrary from fuzz/Cargo.toml but left the lockfile recording it as a direct dependency of ordvec-fuzz, so cargo build --locked was inconsistent. Stage the regenerated lockfile: arbitrary is dropped from ordvec-fuzz's direct deps (it stays transitively via libfuzzer-sys). Verified with cargo +nightly build --locked.
… review) Address the bot review wave (gemini/Codex/qodo) — all "convert core panics to clean Python errors", completing the binding's boundary-guard design: - Width validation (check_width): every f32 input now checks ncols == dim (2-D) / len == dim (1-D). The core derives n = len/dim and only asserts divisibility, so a wrong-but-divisible shape (e.g. (1,128) into a dim-64 index) was silently reinterpreted as a different vector count, or panicked on the result reshape. Now a clean ValueError. (gemini x3 critical, Codex x2 P1, qodo #3) - Constructor validation: Rank/RankQuant/Bitmap/SignBitmap `new` return PyResult and validate against the EXACT core asserts (dim in [2, u16::MAX]; bits in {1,2,4} + dim multiple of 8/bits and 2^bits; dim % 64 + 0 < n_top < dim; dim % 64 + <= MAX_SIGN_BITMAP_DIM) -> ValueError instead of panic. (gemini x4) - swap_remove (Rank, RankQuant): bounds-check -> IndexError, not panic. (gemini high, qodo #4) - README provenance tightened to the canonical "developed within turbovec, factored out" phrasing. (qodo #2) Tests: +9 (width-mismatch x6, swap_remove OOB x3); constructor-rejection tests tightened from BaseException to ValueError. Suite now 117 passed + 1 xfail. clippy -D warnings + fmt clean; MSRV 1.89 builds core + binding. Not changed: qodo #1 (ndarray via numpy) is a deliberate, documented core-vs- binding split (deps grep + publish scoped to -p ordvec; the core's published lock is clean; the binding is publish = false, PyPI-only) -- explained on-thread.
… (qodo) Two robustness fixes to part (2) of release_publish_invariants.sh: - Publish-scoped (qodo #1): step_line grepped the whole workflow with head -1, so a download-artifact in another job could satisfy the ordering even if the publish job regressed. Now extract the 'publish:' job body (its key to the next 2-space-indented job key or EOF) and search only within it. - Multi-line aware (qodo #4): the cleanup was anchored on a single-line 'run:', so a 'run: |' block would false-fail. Now match the delete command on its own line, so both 'run: ... -delete' and a multi-line 'run: |' block work — still requiring a real delete ('find ... -delete' or 'rm ... *.cdx.json'). Verified A-G: passes on the real workflow and multi-line run:| (F); fails on removed / reordered-in-publish / comment-only / non-deleting-find; and is no longer fooled by a decoy download-artifact in another job (G).
Fuzz targets + SignBitmap loader-test gap
cargo-fuzz scaffold (
fuzz/)Four libFuzzer targets, one per
rank_ioloader —load_rank,load_rankquant,load_bitmap,load_sign_bitmap— driving the loaders directly with arbitrary bytes via a unique temp file (auto-cleaned). Deps are isolated infuzz/Cargo.toml; the main crate manifest is untouched.Results: each target run at
-runs=50000— no crashes, aborts, or OOM/timeout artifacts. Coverage feedback even rediscovered each format's magic (TVR1/TVRQ/TVBM/TVSB), confirming the targets reach past the header gate.SignBitmap loader-test gap
The cross-loader malformed-file test (
rank_io_loaders_reject_malformed_files_without_panicking) omittedSignBitmapIndex::load. Added a fourthcatch_unwindblock asserting it returnsErr(never panics) on every case, plus three TVSB-shaped malformed inputs (dim_not_64,oversize,truncated) so the.tvsbheader path is exercised on its own, not just via foreign magic.Verification
cargo +nightly fuzz build— all four targets compile; 50k runs each, zero crashes.cargo test80/0,--features experimental86/0.