Supersedes #27: HNSW correctness + supply-chain CI (merge resolved against current main) by FlexNetOS · Pull Request #30 · FlexNetOS/ruvector

FlexNetOS · 2026-05-21T04:29:00Z

Supersedes #27

PR #27 ("Fix HNSW correctness issues and improve storage rebuild process") was authored from ruvnet/RuVector:main (a third-party fork). It became CONFLICTING against current main after months of org-normalization commits landed. Resolving the conflicts in-place would have required pushing to a fork I don't own — so this PR brings the same content into FlexNetOS/ruvector directly.

What's in this PR

All 8 commits from #27 plus a merge commit (1c128662) bringing the branch up to date with current main:

.github/dependabot.yml (new) — daily npm/cargo/actions update PRs
.github/workflows/supply-chain.yml (new) — 5-layer supply-chain CI
.github/workflows/regression-guard.yml (updated) — extended coverage
deny.toml (new) — cargo-deny rules
crates/ruvector-router-core/src/{index,vector_db}.rs — HNSW correctness fixes + storage rebuild improvements
examples/subpolynomial-time/Cargo.toml — minor deps
npm/core/platforms/win32-x64-msvc/ruvector.node — NAPI binary
npm/packages/diskann/{README.md,package.json} — docs + version
npm/package.json, npm/package-lock.json — npm criticals cleared

Conflict resolution applied

npm/core/platforms/win32-x64-msvc/ruvector.node — Took origin/main's newer NAPI-RS build artifact (regenerable; the newer binary reflects all recent NAPI updates).
npm/packages/diskann/README.md — Kept the PR's rich documentation (API table, benchmarks, algorithm notes) over main's 1-line stub. Normalized 4 github.com/ruvnet/ruvector URLs to github.com/FlexNetOS/ruvector to match the canonical org slug.

Provenance

Original commits are preserved with their original authors (FlexNetOS). The merge commit (1c128662) is the resolution.

Test plan

CI on this branch must pass before merge (regression-guard, clippy, fmt, rvf-runtime tests, supply-chain workflow self-checks)
Reviewer: spot-check the 2 conflict resolutions in this PR's "Files changed" tab against the originals in Fix HNSW correctness issues and improve storage rebuild process #27
Once green and merged, close Fix HNSW correctness issues and improve storage rebuild process #27 with a pointer to this PR

🤖 Generated with claude-flow

…ed pruning + storage rebuild Three remaining root causes from issue ruvnet#430, plus the storage-rebuild gap from PR ruvnet#460. Bug B — insert beam was clamped to ef_construction.min(m * 2). With defaults (m=16, ef_construction=200) the beam silently became 32. Late- inserted clusters got wired through whatever was near the entry point instead of through ef_construction-wide neighbour search. Bug C — adjacency-list pruning used `drain(0..drain_count)`, dropping the OLDEST edges regardless of distance. Proper HNSW pruning keeps the m CLOSEST edges. Now sort by `calculate_distance` to the anchor vector and truncate to m. Kept a fallback that preserves the newest-m behaviour when the anchor vector lookup fails so we never panic on a missing vector. Storage — VectorDB::new() always created a fresh empty HnswIndex, so previously persisted vectors were invisible to search after reopening the database. Now rebuild via storage.get_all_ids() + index.insert_batch() on open, and seed VectorDbStats.total_vectors with the recovered count. Tests: - test_pruning_keeps_closest_not_newest: builds a hub with 20 close neighbours then 6 far neighbours, asserts no "far_*" id appears in top-10 around the hub. Fails on FIFO pruning. - test_index_rebuilt_from_storage_on_open: writes 5 vectors via one VectorDB instance, reopens against the same path, asserts search returns the persisted match. Fails on the historical empty-index bug. Regression-guard CI additions: - hnsw-insert-beam-no-m2-clamp: textually forbids the ef_construction.min(m*2) pattern in index.rs. - hnsw-distance-based-neighbor-pruning: requires calculate_distance and the `> m * 2` overflow gate to both live in index.rs. - vector-db-rebuilds-index-on-open: requires storage.get_all_ids() in vector_db.rs. - hnsw-recall-at-1 job now also runs the two new tests. Supersedes PR ruvnet#460 (CoolDude1969) which covered storage rebuild + an overlapping heap fix already in main from PR ruvnet#466. Closes ruvnet#430. Co-Authored-By: claude-flow <ruv@ruv.net>

Surface the ruvnet#430 HNSW correctness fixes (insert beam, distance-based pruning, storage rebuild) to npm consumers. Bump applies to the meta package and all 5 platform-specific subpackages so optionalDependencies resolve consistently after publish-all.yml runs. Co-Authored-By: claude-flow <ruv@ruv.net>

The expanded README and 0.1.1 version were already published to npm by an earlier release, but never committed back to git. Verified identical to `npm pack @ruvector/diskann@0.1.1`. Bringing the working tree in sync so future bumps start from a clean baseline. Co-Authored-By: claude-flow <ruv@ruv.net>

No behaviour change — collapses single-expression closure and assignment onto one line per rustfmt defaults so the rustfmt CI job passes. Co-Authored-By: claude-flow <ruv@ruv.net>

The `optional-deps-resolvable-on-npm` regression guard fails because @ruvector/router-<platform>@0.1.31 doesn't exist on npm yet — those platform binaries are only published by `publish-all.yml` after a tag is cut, which happens AFTER this PR merges. Splitting the work: - This PR: HNSW correctness fix + CI guards (keeps regression-guard green on every commit). - Follow-up release PR: bump @ruvector/router meta + 5 platform packages to 0.1.31, tag v0.1.31, publish-all.yml ships the fix. This commit reverts c5c7e7f and is itself reverted in the release PR. Co-Authored-By: claude-flow <ruv@ruv.net>

Mirrors the pattern landed on sublinear-time-solver#25: 1. dependency-review (PRs only, informational) 2. cargo-audit (RustSec advisory DB, vulnerabilities only) 3. cargo-deny (license/source/ban policy via deny.toml) 4. npm-audit (workspace npm/ at --audit-level=critical) 5. lockfile-integrity (cargo metadata --locked) npm criticals cleared via package.json overrides: - vm2: transitively dropped via @google-cloud/redis 5.x - fast-xml-parser: >=5.7.0 (was <=5.6.0 vuln) - protobufjs: >=7.5.6 (was <=7.5.5 vuln) - @google-cloud/redis: >=5.0.0 (was <=3.3.0 vuln) - handlebars: picked up >=4.7.9 via override resolution Result: 73 vulns → 33 (3 crit → 0, 36 high → 19, 17 medium → 5). 19 highs remain (mostly devDep transitives + ML helpers) and are tracked via the new dependabot.yml — Dependabot will chip away weekly. deny.toml ignore-list with re-review dates covers: - RUSTSEC-2023-0071 rsa Marvin Attack (no patched version yet, local-only signing for Kalshi API; re-review 2026-08-01) - RUSTSEC-2026-0097 rand unsoundness (not triggerable in our usage — no logging inside RNG draws) - RUSTSEC-2026-0115/0116/0117 imageproc unsoundness (scipix offline examples only, never published) - 8 unmaintained advisories (paste, bincode, instant, rand_os, proc-macro-error, rustls-pemfile, rusttype, number_prefix, core2) — all transitive, no CVE, tracked for migration Added BSL-1.0, CDLA-Permissive-2.0, NCSA licenses to allowlist (present in transitive deps via xxhash-rust, tch-rs, LLVM family). dependabot.yml schedules weekly Tuesday 09:35 UTC for cargo + npm + github-actions ecosystems with patch+minor grouping. Co-Authored-By: claude-flow <ruv@ruv.net>

ci(security): 5-layer supply-chain CI + clear 3 npm criticals

Built from commit b9bb370 Platforms updated: - linux-x64-gnu - linux-arm64-gnu - darwin-x64 - darwin-arm64 - win32-x64-msvc 🤖 Generated by GitHub Actions

# Conflicts: # npm/core/platforms/win32-x64-msvc/ruvector.node # npm/packages/diskann/README.md

chatgpt-codex-connector

💡 Codex Review

ruvector/.github/workflows/regression-guard.yml

Lines 37 to 38 in 1c12866

    
                     if grep -rnPe '(\S+)\.write\(\)[^;]*\1\.(write|read)\(\)' \ 
        
                             -rnPe '(\S+)\.read\(\)[^;]*\1\.write\(\)' \

Fix PCRE grep invocation in deadlock guard job

The reentrant-lock guard uses a single grep -P command with two -e patterns (-rnPe ... -rnPe ...), but GNU grep's PCRE mode supports only one pattern, so this command exits with an error instead of evaluating either check. Because it's inside an if ...; then condition, the script treats that failure as a non-match and continues, which silently disables this regression guard and lets reentrant RwLock patterns slip through CI.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-21T04:41:01Z

+    # to users).
+    ignore:
+      - dependency-name: "*"
+        dependency-type: "development"


Replace invalid ignore key with a supported Dependabot rule

The npm update block uses ignore with dependency-type: "development", but dependency-type is not a supported field for ignore entries (it is supported for allow/groups). This can make the Dependabot config invalid or cause the rule to be ignored, which means the intended filtering of devDependency updates will not work and can stop expected npm dependency-update PRs from being generated.

Useful? React with 👍 / 👎.

…y-chain-resolved # Conflicts: # npm/core/platforms/win32-x64-msvc/ruvector.node

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b5858b9a21

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-21T07:56:44Z

+
+permissions:
+  contents: read
+  pull-requests: read


Grant PR write permission for dependency-review comments

Enable pull-requests: write when comment-summary-in-pr: on-failure is set; the dependency-review-action docs/examples require write scope to post PR summaries, and with only pull-requests: read this step cannot comment and may fail noisily (currently masked by continue-on-error). In PR runs, that effectively disables the intended Layer 1 review signal even when the repository supports dependency review.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-21T07:56:44Z

+    { triple = "x86_64-unknown-linux-gnu" },
+    { triple = "aarch64-unknown-linux-gnu" },
+    { triple = "x86_64-apple-darwin" },
+    { triple = "aarch64-apple-darwin" },


Include Windows targets in cargo-deny graph checks

cargo deny check only evaluates dependency edges that match the configured [graph].targets; with just Linux/macOS triples here, Windows-only crates are excluded from advisories/license/source checks. This repo clearly ships/tests Windows artifacts (x86_64-pc-windows-msvc appears across multiple workflows), so the current target set leaves a real blind spot in the new supply-chain policy.

Useful? React with 👍 / 👎.

Resolved conflict in npm/core/platforms/win32-x64-msvc/ruvector.node by taking origin/main's version (newer NAPI-RS rebuild). CI rebuilds binaries on push regardless. Co-Authored-By: claude-flow <ruv@ruv.net>

- Run cargo fmt on rvf-runtime/tests/agi_e2e.rs (assert! macro wrapping was the only rustfmt diff) - Expand dependency-review allow-licenses to cover SPDX expressions appearing in our resolved transitive graph (BlueOak-1.0.0, MIT-0, Ruby) and add allow-dependencies for @anthropic-ai/claude-code* packages whose license ships as a README reference (LicenseRef-bad). Closes Rustfmt + dependency-review failures observed on PR 30. Co-Authored-By: claude-flow <ruv@ruv.net>

ecc-tools · 2026-05-21T08:12:38Z

ECC bundle files are already tracked in this repository. Skipping generation of another bundle PR.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2a3ddbf698

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-21T08:26:59Z

+    name: dependency-review (PRs only)
+    runs-on: ubuntu-latest
+    if: github.event_name == 'pull_request'
+    continue-on-error: true


Make dependency-review failures block PRs

Remove continue-on-error: true from the dependency-review job, because with this setting any vulnerability/license failure from actions/dependency-review-action is treated as non-blocking and the workflow can still pass. In PRs that introduce a new moderate/high vulnerability (or license violation), this silently bypasses the intended Layer 1 gate and allows merge unless another layer independently catches it.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-21T08:26:59Z

+          # `LicenseRef-bad-see-license-in-…`. Allow-list the package + per-
+          # platform binary variants by name so the unparseable LicenseRef
+          # doesn't gate PRs.
+          allow-dependencies: '@anthropic-ai/claude-code, @anthropic-ai/claude-code-darwin-arm64, @anthropic-ai/claude-code-darwin-x64, @anthropic-ai/claude-code-linux-arm64, @anthropic-ai/claude-code-linux-arm64-musl, @anthropic-ai/claude-code-linux-x64, @anthropic-ai/claude-code-linux-x64-musl, @anthropic-ai/claude-code-win32-arm64, @anthropic-ai/claude-code-win32-x64'


Use a supported dependency-review allowlist input

Replace allow-dependencies with the action’s supported input (allow-dependencies-licenses using purl values), because allow-dependencies is not a recognized dependency-review-action parameter and will be ignored. As written, the Anthropic package exception does not actually apply, so PRs can still fail on those license detections despite the intended allowlist.

Useful? React with 👍 / 👎.

1. supply-chain.yml: replace invalid 'allow-dependencies' input with 'allow-dependencies-licenses' (the v4 action rejects the former, causing dependency-review to fail on @anthropic-ai/claude-code license detection) 2. supply-chain.yml: upgrade permissions to pull-requests: write so comment-summary-in-pr: on-failure can post PR summaries 3. dependabot.yml: replace invalid 'dependency-type' under 'ignore' (only valid under 'allow') with allow: [{dependency-type: production}] 4. regression-guard.yml: split dual-pattern PCRE grep into two separate invocations — GNU grep -P supports only one pattern per call; the prior form silently errored and disabled the reentrant-lock guard 5. deny.toml: add x86_64-pc-windows-msvc target triple so cargo-deny checks cover Windows deps (the repo ships win32 NAPI artifacts)

ecc-tools · 2026-05-21T12:51:42Z

ECC bundle files are already tracked in this repository. Skipping generation of another bundle PR.

The dependency-review-action@v4 requires package-url (PURL) format for the allow-dependencies-licenses input. Bare npm package names cause 'package-url must start with pkg:' parse errors. Format: pkg:npm/%40<scope>/<name> (percent-encoded @ for scoped pkgs)

ecc-tools · 2026-05-21T12:53:03Z

ECC bundle files are already tracked in this repository. Skipping generation of another bundle PR.

devin-ai-integration

Devin Review found 8 new potential issues.

devin-ai-integration · 2026-05-21T13:04:49Z

📝 Info: index.rs exceeds 500-line guideline from CLAUDE.md

CLAUDE.md's Project Architecture section states "Keep files under 500 lines." index.rs was already ~558 lines before this PR and is now 650 lines after adding the pruning logic and test_pruning_keeps_closest_not_newest test. The violation is pre-existing, but the PR extends it. The added code (pruning logic + regression test) is correctly co-located with the HNSW index implementation, so splitting requires a thoughtful refactor (e.g., extracting tests to a separate tests/ module or the pruning helper to a submodule).

Was this helpful? React with 👍 or 👎 to provide feedback.

devin-ai-integration · 2026-05-21T13:04:51Z

📝 Info: deny.toml placed in repo root vs CLAUDE.md file organization rules

The CLAUDE.md states 'NEVER save to root folder' and 'Use /config for configuration files'. However, deny.toml is a standard Rust ecosystem tool config that cargo deny check (in .github/workflows/supply-chain.yml:118) expects at the workspace root by default. The repo already has other config files in the root (Cargo.toml, package.json, .mcp.json). The tool convention effectively requires root placement unless --config config/deny.toml is passed to the CI command. This is a pragmatic exception to the organizational rule rather than a meaningful violation, since relocating it would require updating the CI workflow.

Was this helpful? React with 👍 or 👎 to provide feedback.

devin-ai-integration · 2026-05-21T13:04:53Z

+        let vectors_snapshot = self.vectors.read();

        // Re-acquire graph lock for modifications
        let mut graph = self.graph.write();


📝 Info: vectors.read() held during pruning widens the critical section vs old code

The new code at crates/ruvector-router-core/src/index.rs:129 acquires self.vectors.read() and holds it across the entire graph modification loop (through line 176). In the old code, only graph.write() was held during connection modifications. The new pattern additionally blocks any concurrent self.vectors.write() (i.e., the very first step of every other concurrent insert call at line 97). This is not a deadlock (lock ordering is consistent: vectors → graph across all paths), but it serializes the initial vector store step of concurrent inserts during the pruning phase, which is a minor throughput regression for high-concurrency workloads. The test_hnsw_concurrent_inserts test passes because it validates correctness, not throughput.

Was this helpful? React with 👍 or 👎 to provide feedback.

devin-ai-integration · 2026-05-21T13:04:54Z

            return Ok(());
        }


📝 Info: is_first variable is always false — dead code at lines 114-116

The is_first block (lines 102-112) always evaluates to false because when the entry point is None, the function returns early at line 109 (return Ok(())). The check at lines 114-116 (if is_first { return Ok(()); }) is therefore unreachable dead code. This is a pre-existing pattern not introduced by this PR, but worth noting as it slightly obscures the control flow.

(Refers to lines 114-116)

Was this helpful? React with 👍 or 👎 to provide feedback.

devin-ai-integration · 2026-05-21T13:04:56Z

+        let stored_ids = storage.get_all_ids()?;
+        let total_vectors = stored_ids.len();
+        if !stored_ids.is_empty() {
+            let mut entries = Vec::with_capacity(stored_ids.len());
+            for id in &stored_ids {
+                if let Some(vector) = storage.get(id)? {
+                    entries.push((id.clone(), vector));
+                }
+            }
+            index.insert_batch(entries)?;


📝 Info: total_vectors stat assumes all stored IDs have retrievable vectors

In VectorDB::new, total_vectors is set to stored_ids.len() at line 40, but the subsequent loop at lines 43-46 only pushes entries where storage.get(id) returns Some. If any stored ID lacks a decodable vector (data corruption), the stat would overcount relative to what's actually in the index. In practice this shouldn't occur — all IDs in VECTORS_TABLE should have valid bincode-encoded vectors. The count() method (line 198-199) delegates to storage.count() which reads the table length directly, so count() and total_vectors would agree even in the corruption case — both would overcount. Minimal practical impact.

Was this helpful? React with 👍 or 👎 to provide feedback.

devin-ai-integration · 2026-05-21T13:04:59Z

+                        let mut scored: Vec<(String, f32)> = neighbor_connections
+                            .drain(..)
+                            .filter_map(|cid| {
+                                vectors_snapshot.get(&cid).map(|cv| {
+                                    let d = calculate_distance(anchor_vec, cv, self.config.metric)
+                                        .unwrap_or(f32::MAX);
+                                    (cid, d)
+                                })
+                            })
+                            .collect();
+                        scored.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap_or(Ordering::Equal));
+                        scored.truncate(self.config.m);
+                        *neighbor_connections = scored.into_iter().map(|(cid, _)| cid).collect();


📝 Info: filter_map in pruning silently drops connections for missing vectors

At crates/ruvector-router-core/src/index.rs:154, the filter_map skips any cid whose vector is not found in vectors_snapshot. If this occurs, the connection is permanently lost from the adjacency list after drain(..). In theory this cannot happen because all IDs in the graph must have their vectors stored (vectors.write() at line 97 always executes before graph connections are established), and the snapshot is taken after all prior writes are visible. However, if a future code change introduces a path where a vector ID appears in graph connections without a corresponding entry in the vectors HashMap (e.g., a partial failure during batch insert), this would silently degrade graph connectivity rather than raising an error.

Was this helpful? React with 👍 or 👎 to provide feedback.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 84d6ab2f46

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-05-21T13:06:52Z

+      - name: install cargo-deny
+        run: cargo install --locked cargo-deny
+      - name: cargo deny check
+        run: cargo deny check


Run cargo-deny with all Cargo features enabled

The new cargo deny check step only scans the default feature graph, so advisories/licenses on dependencies behind non-default features are skipped. cargo-deny defaults all-features to false unless you pass --all-features or set [graph].all-features = true, and this workspace has many optional feature-gated deps (for example crates with default = [] plus optional dependencies). As written, PRs can introduce a vulnerable crate through an opt-in feature without tripping this supply-chain gate.

Useful? React with 👍 / 👎.

Fixed in 6c7ed83 — added --all-features to the cargo deny check command and set all-features = true in deny.toml so non-default feature-gated deps are scanned.

chatgpt-codex-connector · 2026-05-21T13:06:52Z

+    { triple = "aarch64-unknown-linux-gnu" },
+    { triple = "x86_64-apple-darwin" },
+    { triple = "aarch64-apple-darwin" },
+    { triple = "x86_64-pc-windows-msvc" },
+]


Add missing shipped target triples to cargo-deny graph

The target filter in deny.toml still omits platforms this repo builds and publishes for (for example aarch64-pc-windows-msvc and *-unknown-linux-musl appear in CI/workspace package targets), and cargo-deny ignores target-specific dependency edges when the target isn’t listed. That leaves a real policy blind spot where advisories/license/source violations in those platform-only dependencies won’t fail the supply-chain gate.

Useful? React with 👍 / 👎.

Fixed in 6c7ed83 — added the 3 missing shipped targets to deny.toml:

x86_64-unknown-linux-musl

aarch64-unknown-linux-musl

aarch64-pc-windows-msvc

These all appear in CI build matrices (NAPI/SONA workflows) but were previously unchecked.

Address Codex review P2 findings: 1. Run cargo deny with --all-features so deps behind non-default feature gates are also scanned for advisories/licenses/sources. Set [graph].all-features = true in deny.toml as the canonical config. 2. Add the 3 shipped targets missing from deny.toml: - x86_64-unknown-linux-musl (NAPI/SONA builds) - aarch64-unknown-linux-musl (NAPI/SONA builds) - aarch64-pc-windows-msvc (NAPI builds) These triples appear in CI build matrices but were unchecked by cargo-deny, leaving a blind spot for platform-specific advisories.

ecc-tools · 2026-05-21T13:07:49Z

ECC bundle files are already tracked in this repository. Skipping generation of another bundle PR.

devin-ai-integration

Devin Review found 2 new potential issues.

devin-ai-integration · 2026-05-21T13:16:43Z

+          found=0
+          if grep -rnP '(\S+)\.write\(\)[^;]*\1\.(write|read)\(\)' \
                 --include='*.rs' -- crates/ ; then
+            found=1
+          fi
+          if grep -rnP '(\S+)\.read\(\)[^;]*\1\.write\(\)' \
+                 --include='*.rs' -- crates/ ; then
+            found=1
+          fi
+          if [ "$found" -eq 1 ]; then
            echo "::error::Found reentrant parking_lot lock acquisition on a single statement (regression of issue #437). Bind the guard once: 'let mut g = x.write(); g.field = …;'"
            exit 1
          fi


📝 Info: Regression guard comment claims grep -P supports only one pattern per invocation — this is incorrect

The comment at line 36-37 states 'Bash -P (PCRE) supports backreferences but only ONE pattern per invocation'. This is incorrect — GNU grep's -P flag supports multiple -e patterns per invocation, and each pattern has its own independent capture group namespace, so backreferences like \1 work correctly per-pattern. The old code (grep -rnPe 'pat1' -rnPe 'pat2') was syntactically valid. The split into two greps is functionally equivalent and arguably clearer, but the rationale in the comment is wrong.

Was this helpful? React with 👍 or 👎 to provide feedback.

The comment is actually correct — GNU grep's -P flag does not support multiple -e patterns:

$ echo "x.write() x.write()" | grep -P -e '(\S+)\.write' -e '(\S+)\.read' grep: the -P option only supports a single pattern

This is a documented limitation of GNU grep (PCRE mode). The split into two invocations is a necessary fix, not just a clarity improvement.

The 0.1.0→0.1.1 bump was README-only but the five platform-specific optionalDependencies still pin 0.1.0. Per repo convention (visible in @ruvector/router and @ruvector/rvf-node), the wrapper package version must stay in lockstep with its native binaries. Revert to 0.1.0 to avoid ABI skew when platform binaries are eventually republished.

ecc-tools · 2026-05-21T13:17:36Z

ECC bundle files are already tracked in this repository. Skipping generation of another bundle PR.

…all-features scan The --all-features flag correctly expanded cargo-deny's scope to cover optional-feature deps. This surfaced 5 pre-existing RUSTSEC advisories (derivative, pprof, pqcrypto-dilithium, pqcrypto-kyber, wee_alloc) and 1 license violation (inferno's CDDL-1.0). All are pre-existing transitive deps behind optional features — not introduced by this PR. Each gets a justification + 2026-08-01 re-review date in deny.toml. CDDL-1.0 is OSI-approved and added to the license allow list for the inferno flamegraph library.

ecc-tools · 2026-05-21T13:32:42Z

ECC bundle files are already tracked in this repository. Skipping generation of another bundle PR.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 560e85fc91

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-05-21T13:41:55Z

+          # The `insert` function in index.rs must reach calculate_distance()
+          # AFTER the `> self.config.m * 2` overflow check fires — that is
+          # what proves the pruner is distance-aware, not FIFO.
+          if ! grep -nE 'calculate_distance' crates/ruvector-router-core/src/index.rs >/dev/null ; then


Make pruning regression check actually inspect overflow path

This guard only checks whether calculate_distance appears anywhere in index.rs, but that function is already used in unrelated search code, so the job will still pass even if distance-based pruning is removed from the overflow branch. That leaves the new “distance-based-neighbor-pruning” CI check unable to detect the exact regression it claims to block.

Useful? React with 👍 / 👎.

Good catch — fixed in 058e8d8. The check now extracts a 20-line window after the overflow gate (> self.config.m * 2) and verifies calculate_distance appears within that window, rather than grepping the whole file where search code also calls it.

chatgpt-codex-connector · 2026-05-21T13:41:55Z

+
+permissions:
+  contents: read
+  pull-requests: write


Restrict PR-write token to the dependency-review job

pull-requests: write is set at workflow scope, so every job in this workflow (including jobs that execute downloaded toolchains and registry data) receives a write-capable GITHUB_TOKEN. Only the dependency-review step needs this permission for comment-summary-in-pr, so leaving it global unnecessarily expands blast radius if any action or dependency in other jobs is compromised.

Useful? React with 👍 / 👎.

Agreed — fixed in 058e8d8. pull-requests: write is now scoped to the dependency-review job only. The other four jobs (cargo-audit, cargo-deny, npm-audit, lockfile-integrity) run with the workflow-level contents: read permission only.

1. supply-chain.yml: Move pull-requests: write from workflow-level to the dependency-review job only. Other jobs (cargo-audit, cargo-deny, npm-audit, lockfile-integrity) don't need write access and should run with read-only tokens to minimize blast radius. 2. regression-guard.yml: The hnsw-distance-based-neighbor-pruning check now verifies calculate_distance() appears within 20 lines of the overflow gate (> self.config.m * 2), not just anywhere in index.rs. The old whole-file grep would pass even if distance-based pruning was removed, because search code also calls calculate_distance. Both issues flagged by Codex review.

ecc-tools · 2026-05-21T13:43:27Z

ECC bundle files are already tracked in this repository. Skipping generation of another bundle PR.

devin-ai-integration

Devin Review found 1 new potential issue.

devin-ai-integration · 2026-05-21T13:57:14Z

+            let mut entries = Vec::with_capacity(stored_ids.len());
+            for id in &stored_ids {
+                if let Some(vector) = storage.get(id)? {
+                    entries.push((id.clone(), vector));
+                }
+            }
+            index.insert_batch(entries)?;


📝 Info: Storage rebuild reads each vector in a separate transaction (O(N) transactions)

The rebuild loop at crates/ruvector-router-core/src/vector_db.rs:43-47 calls storage.get(id) for each stored ID. Each get() call opens a new read transaction (storage.rs:182). For a database with thousands of vectors, this means thousands of separate transactions. A single-transaction bulk read would be more efficient. This is a startup-only cost so it doesn't affect runtime performance, but for large persisted databases the reopen time could be noticeable.

Was this helpful? React with 👍 or 👎 to provide feedback.

Acknowledged — this is a startup-only cost and acceptable for the initial fix. A bulk get_all() or single-transaction read is a good follow-up optimization for large persisted databases, but out of scope for this correctness fix.

ruvnet and others added 9 commits May 18, 2026 16:30

style: cargo fmt --all on touched HNSW pruning block

b26001a

No behaviour change — collapses single-expression closure and assignment onto one line per rustfmt defaults so the rustfmt CI job passes. Co-Authored-By: claude-flow <ruv@ruv.net>

Merge pull request ruvnet#474 from ruvnet/ci/supply-chain-guards

b9bb370

ci(security): 5-layer supply-chain CI + clear 3 npm criticals

chore: Update NAPI-RS binaries for all platforms

dd5e145

Built from commit b9bb370 Platforms updated: - linux-x64-gnu - linux-arm64-gnu - darwin-x64 - darwin-arm64 - win32-x64-msvc 🤖 Generated by GitHub Actions

Merge remote-tracking branch 'origin/main' into ruvnet/main

1c12866

# Conflicts: # npm/core/platforms/win32-x64-msvc/ruvector.node # npm/packages/diskann/README.md

chatgpt-codex-connector Bot reviewed May 21, 2026

View reviewed changes

FlexNetOS mentioned this pull request May 21, 2026

Fix HNSW correctness issues and improve storage rebuild process #27

Closed

Merge remote-tracking branch 'origin/main' into merge/pr27-hnsw-suppl…

b5858b9

…y-chain-resolved # Conflicts: # npm/core/platforms/win32-x64-msvc/ruvector.node

chatgpt-codex-connector Bot reviewed May 21, 2026

View reviewed changes

FlexNetOS and others added 2 commits May 21, 2026 03:04

merge: integrate origin/main into PR 30 (resolve binary conflict)

9bda302

Resolved conflict in npm/core/platforms/win32-x64-msvc/ruvector.node by taking origin/main's version (newer NAPI-RS rebuild). CI rebuilds binaries on push regardless. Co-Authored-By: claude-flow <ruv@ruv.net>

chatgpt-codex-connector Bot reviewed May 21, 2026

View reviewed changes

devin-ai-integration Bot reviewed May 21, 2026

View reviewed changes

chatgpt-codex-connector Bot reviewed May 21, 2026

View reviewed changes

devin-ai-integration Bot reviewed May 21, 2026

View reviewed changes

chatgpt-codex-connector Bot reviewed May 21, 2026

View reviewed changes

devin-ai-integration Bot reviewed May 21, 2026

View reviewed changes

FlexNetOS merged commit 8fdc94c into main May 21, 2026
69 of 70 checks passed

	if grep -rnPe '(\S+)\.write\(\)[^;]*\1\.(write\|read)\(\)' \
	-rnPe '(\S+)\.read\(\)[^;]*\1\.write\(\)' \

Conversation

FlexNetOS commented May 21, 2026

Supersedes #27

What's in this PR

Conflict resolution applied

Provenance

Test plan

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

ecc-tools Bot commented May 21, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

ecc-tools Bot commented May 21, 2026

Uh oh!

ecc-tools Bot commented May 21, 2026

Uh oh!

devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

devin-ai-integration Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration Bot May 21, 2026 •

edited

Loading