Skip to content

[codex] add ordvec C ABI crate#101

Merged
project-navi-bot merged 1 commit into
mainfrom
codex/c-abi-ffi
May 29, 2026
Merged

[codex] add ordvec C ABI crate#101
project-navi-bot merged 1 commit into
mainfrom
codex/c-abi-ffi

Conversation

@Fieldnote-Echo
Copy link
Copy Markdown
Owner

Summary

  • add the ordvec-ffi workspace crate with cdylib/staticlib outputs and a committed cbindgen-generated C header
  • expose ABI v1 status/version/init, thread-local error details, load/info/free, and synchronous single-query search for .tvrq RankQuant and .tvbm Bitmap
  • add exact struct-size validation, caller-owned hit buffers, stats output, panic containment, C/C++ header smoke tests, and core Bitmap::search_subset

Validation

  • cargo fmt --all --check
  • cbindgen ordvec-ffi --config ordvec-ffi/cbindgen.toml --output ordvec-ffi/include/ordvec.h --verify
  • cargo test
  • cargo test -p ordvec-ffi
  • cargo clippy --all-targets -- -D warnings
  • cargo clippy -p ordvec-ffi --all-targets -- -D warnings
  • cargo build -p ordvec-ffi
  • full stack also validated with GOCACHE=/tmp/go-build go test -count=1 ./... and GOCACHE=/tmp/go-build go test -race -count=1 ./... from ordvec-go

Stack base: main. Next PR in stack: codex/c-api-docs.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 28, 2026

Codecov Report

❌ Patch coverage is 0% with 24 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/bitmap.rs 0.00% 24 Missing ⚠️

📢 Thoughts on this report? Let us know!

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a C FFI wrapper crate (ordvec-ffi) for the ordvec library, enabling index loading, metadata retrieval, and search capabilities from C/C++ environments, alongside adding a search_subset method to the Bitmap index. The review feedback highlights critical safety issues in the FFI code where raw pointers pointing to potentially uninitialized memory are dereferenced to create Rust references, which is immediate Undefined Behavior; using ptr::addr_of! and ptr::write is recommended instead. Additionally, the feedback suggests handling the absence of a C compiler gracefully in smoke tests to prevent unnecessary test failures in environments without cc.

Comment thread ordvec-ffi/src/lib.rs Outdated
Comment thread ordvec-ffi/src/lib.rs
Comment thread ordvec-ffi/tests/header_smoke.rs Outdated
@qodo-code-review
Copy link
Copy Markdown

Review Summary by Qodo

Add ordvec C ABI v1 crate with RankQuant and Bitmap search support
✨ Enhancement

Grey Divider

Walkthroughs

Description
• Add ordvec-ffi workspace crate with C ABI v1 bindings
• Expose synchronous search for RankQuant and Bitmap indexes
• Implement exact struct-size validation and panic containment
• Add C/C++ header smoke tests and Bitmap::search_subset core method
Diagram
flowchart LR
  Core["Core ordvec<br/>RankQuant/Bitmap"]
  FFI["ordvec-ffi crate<br/>C ABI v1"]
  Header["Generated ordvec.h<br/>C/C++ compatible"]
  Tests["Header smoke tests<br/>C11/C++17"]
  
  Core -->|wraps| FFI
  FFI -->|generates| Header
  Header -->|validated by| Tests
  FFI -->|adds| Subset["Bitmap::search_subset"]

Loading

Grey Divider

File Changes

1. ordvec-ffi/src/lib.rs ✨ Enhancement +1317/-0

Complete C ABI v1 implementation with search and validation

• Implement complete C ABI v1 with status codes, error handling, and thread-local error details
• Expose index load/info/free operations and synchronous single-query search
• Support full-index and subset search for both RankQuant and Bitmap indexes
• Validate struct sizes, pointers, query dimensions, and candidate row IDs
• Implement panic containment via catch_unwind with error message propagation
• Include comprehensive unit tests for load, search, validation, and error cases

ordvec-ffi/src/lib.rs


2. ordvec-ffi/tests/header_smoke.rs 🧪 Tests +83/-0

C and C++ header compilation smoke tests

• Add C11 compilation test for generated ordvec.h header
• Add C++17 compilation test for C++ compatibility
• Verify header includes and struct initialization functions compile correctly

ordvec-ffi/tests/header_smoke.rs


3. src/bitmap.rs ✨ Enhancement +30/-0

Add Bitmap subset search capability

• Add search_subset method to search against caller-supplied document ID subset
• Support unsorted and duplicate document IDs with independent scoring
• Maintain consistent tie-breaking policy (overlap descending, row ID ascending)

src/bitmap.rs


View more (4)
4. Cargo.toml ⚙️ Configuration changes +1/-1

Register ordvec-ffi workspace member

• Add ordvec-ffi to workspace members list

Cargo.toml


5. ordvec-ffi/Cargo.toml ⚙️ Configuration changes +14/-0

New ordvec-ffi crate configuration

• Create new crate manifest for ordvec-ffi
• Configure library outputs as rlib, cdylib, and staticlib
• Declare dependency on core ordvec crate

ordvec-ffi/Cargo.toml


6. ordvec-ffi/cbindgen.toml ⚙️ Configuration changes +29/-0

cbindgen configuration for C header generation

• Configure cbindgen for C header generation with C11/C++17 compatibility
• Include compile-time struct size assertions for ABI stability
• Exclude internal types from C header export

ordvec-ffi/cbindgen.toml


7. ordvec-ffi/include/ordvec.h 📝 Documentation +228/-0

Generated C ABI v1 header file

• Generated C header with complete ABI v1 type definitions and function declarations
• Define status codes, index kinds, capabilities, and search flags
• Declare struct initialization, index load/info/free, and search functions
• Include compile-time struct size assertions for C11 and C++

ordvec-ffi/include/ordvec.h


Grey Divider

Qodo Logo

@qodo-code-review
Copy link
Copy Markdown

qodo-code-review Bot commented May 29, 2026

Code Review by Qodo

🐞 Bugs (2) 📘 Rule violations (0)

Grey Divider


Remediation recommended

1. C11 smoke test panics 🐞 Bug ☼ Reliability
Description
header_compiles_as_c11 unconditionally panics if the cc compiler is missing, making `cargo test
-p ordvec-ffi` fail on environments without a C toolchain. The C++ variant already treats a missing
compiler as a skip, so behavior is inconsistent and fragile.
Code

ordvec-ffi/tests/header_smoke.rs[R37-46]

Evidence
The C11 test currently panics on missing cc via .expect(...), while the C++ test explicitly
treats NotFound as a skip, demonstrating the intended non-fatal behavior when a compiler isn't
present.

ordvec-ffi/tests/header_smoke.rs[22-50]
ordvec-ffi/tests/header_smoke.rs[52-82]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`ordvec-ffi/tests/header_smoke.rs` spawns `cc` and uses `.expect(...)`, which panics when `cc` is not installed (NotFound). This makes `cargo test -p ordvec-ffi` brittle across CI/dev environments.

### Issue Context
The C++ smoke test in the same file already handles a missing compiler by skipping when `ErrorKind::NotFound`. The C test should match that pattern.

### Fix Focus Areas
- ordvec-ffi/tests/header_smoke.rs[22-50]

### Suggested change
- Replace the `.status().expect("failed to spawn C compiler")` call with `.status()`.
- `match` on the result similarly to the C++ test:
 - `Ok(status)` => assert success
 - `Err(err)` if `err.kind() == NotFound` => return/skip
 - otherwise panic with a useful message
- (Optional) Consider honoring a `CC` environment variable if you want to be extra portable.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. Subset search full-sorts 🐞 Bug ➹ Performance
Description
Bitmap::search_subset fully sorts all candidate rows and only then truncates to k, causing
avoidable O(m log m) CPU and extra memory for large candidate lists. This is directly exercised by
the C ABI subset-search path for bitmap indexes.
Code

src/bitmap.rs[R429-443]

Evidence
The implementation allocates per-candidate buffers and performs a full sort_unstable_by before
slicing to k_eff, and the C ABI explicitly calls this method for bitmap subset searches.

src/bitmap.rs[416-444]
ordvec-ffi/src/lib.rs[811-830]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`Bitmap::search_subset` builds an `entries` vector for all `doc_ids`, sorts the entire vector, and then takes `k_eff`. This is unnecessarily expensive when `doc_ids.len()` is large and `k` is small.

### Issue Context
The new C ABI calls `Bitmap::search_subset` for subset searches (`ordvec_index_search` when `candidate_rows` is provided for bitmap indexes). If callers provide large candidate lists (e.g., tens/hundreds of thousands), full sorting becomes a bottleneck.

### Fix Focus Areas
- src/bitmap.rs[416-444]
- ordvec-ffi/src/lib.rs[811-830]

### Suggested change
- Keep the same ordering semantics (score desc, row_id asc), but avoid sorting the whole list:
 - Build `entries: Vec<(u32 score, u32 row)>` as today.
 - Use `entries.select_nth_unstable_by(k_eff - 1, cmp)` with `cmp` = `(score desc, row asc)`.
 - Then sort only `entries[..k_eff]` with the same `cmp` before producing outputs.
- This reduces complexity to ~O(m) partition + O(k log k) sort of the head, while preserving deterministic tie behavior.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

Qodo Logo

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Introduces a new ordvec-ffi workspace crate that exposes a stable C ABI (v1) for loading and searching .tvrq RankQuant and .tvbm Bitmap indexes, with a committed cbindgen-generated ordvec.h, status/version/init scaffolding, thread-local error details, exact struct-size validation, caller-owned hit buffers, optional stats, and panic containment. To support subset search through the C ABI on Bitmap indexes, the core crate also gains a public Bitmap::search_subset built on top of the existing body_overlap_scores_subset.

Changes:

  • Add ordvec-ffi crate with cdylib/staticlib/rlib outputs, ABI v1 C surface (load/info/free + synchronous single-query search for RankQuant and Bitmap), thread-local last-error, panic catching, and exact struct-size validation; commit cbindgen.toml and generated include/ordvec.h with C/C++ static size assertions.
  • Add C11 and C++17 header smoke tests plus extensive Rust-side unit tests covering load/info/search happy paths and a broad matrix of validation failures.
  • Add Bitmap::search_subset in core, returning top-k for a caller-supplied (possibly unsorted/duplicated) candidate set with deterministic (score desc, row asc) tie-breaking; wire ordvec-ffi into the workspace.

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
Cargo.toml Adds ordvec-ffi to workspace members (default-members unchanged, so root CI doesn't cover it).
Cargo.lock Lockfile entry for the new crate.
ordvec-ffi/Cargo.toml New unpublished crate with rlib/cdylib/staticlib, depends only on the core crate.
ordvec-ffi/cbindgen.toml cbindgen config: C language, cpp_compat, excludes internal Rust types, appends static size asserts.
ordvec-ffi/include/ordvec.h Committed generated header with ABI v1 types, status/kind constants, and static size assertions outside the include guard.
ordvec-ffi/src/lib.rs Full FFI implementation: opaque handle, status codes, struct-size checks, candidate validation, panic-safe boundary, search dispatch + stats, extensive unit tests.
ordvec-ffi/tests/header_smoke.rs Compiles ordvec.h as C11 (hard-failure if cc missing) and C++17 (silently skipped if c++ missing).
src/bitmap.rs Adds Bitmap::search_subset (single query, caller-supplied candidate IDs, deterministic top-k) used by the FFI subset path.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread ordvec-ffi/src/lib.rs
Comment thread ordvec-ffi/src/lib.rs
Comment thread Cargo.toml
Comment thread ordvec-ffi/include/ordvec.h Outdated
Comment thread ordvec-ffi/tests/header_smoke.rs
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
Copy link
Copy Markdown
Owner Author

Addressed the remaining bot findings for this PR:

  • stats_out struct-size validation now reads only the leading field with ptr::addr_of!(...).read() without forming a Rust reference to potentially uninitialized output storage.
  • ORDVEC_CAP_STATS/stats semantics are documented in the generated header comments: ABI v1 populates total time and search-space counters while granular fields remain reserved/zero.
  • RankQuant subset materialization has an explicit correctness comment explaining the global row-id tie-order requirement and why this path asks core for all candidates before FFI truncation.
  • ordvec-ffi has an explicit CI lane for cbindgen --verify, build, tests, and clippy.
  • C and C++ header smoke tests handle missing compilers consistently.
  • Bitmap::search_subset now uses select_nth_unstable_by and sorts only the top-k head.
  • Added a linked C smoke test that builds and runs a C program against libordvec_ffi.a.

Validation run locally: cargo fmt --all --check, cbindgen --verify, cargo test, cargo build -p ordvec-ffi && cargo test -p ordvec-ffi, cargo clippy --all-targets -- -D warnings, and cargo clippy -p ordvec-ffi --all-targets -- -D warnings.

@project-navi-bot project-navi-bot merged commit a875f4e into main May 29, 2026
33 checks passed
@project-navi-bot project-navi-bot deleted the codex/c-abi-ffi branch May 29, 2026 03:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants