Ruvector updates #12

Merged
ruvnet merged 9 commits into main from claude/review-dna-capabilities-01EG6MdGiUpao5c8iCn1624X
Nov 25, 2025

Conversation

@ruvnet
Owner

@ruvnet ruvnet commented Nov 25, 2025

This pull request introduces two new crates, ruvector-collections and ruvector-cluster, and integrates them into the workspace. These crates provide foundational components for managing vector collections and distributed clustering/sharding. The changes include implementations for collection configuration/validation, error handling, and cluster node discovery (static, gossip, and multicast). Additionally, there are minor improvements to workspace configuration and dependency management.

New Crates and Core Functionality

  • Added the ruvector-collections crate, which defines collection types, configuration validation, human-readable statistics, and error handling for vector collections, and integrates with the core vector database.
  • Added the ruvector-cluster crate, which implements distributed clustering and sharding primitives: node discovery (static, gossip, multicast), node registration and heartbeat logic, and basic gossip protocol statistics and tests.
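
The registration and heartbeat logic mentioned above can be sketched roughly as follows. This is a minimal illustration, not the ruvector-cluster API: `NodeRegistry`, `register`, `heartbeat`, and `alive_nodes` are hypothetical names, and the timeout policy is an assumption.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical registry; names are illustrative, not the ruvector-cluster API.
pub struct NodeRegistry {
    timeout: Duration,
    last_seen: HashMap<String, Instant>,
}

impl NodeRegistry {
    pub fn new(timeout: Duration) -> Self {
        Self { timeout, last_seen: HashMap::new() }
    }

    /// Register a node (or refresh an existing one) as seen at `now`.
    pub fn register(&mut self, node_id: &str, now: Instant) {
        self.last_seen.insert(node_id.to_string(), now);
    }

    /// Record a heartbeat; returns false if the node was never registered.
    pub fn heartbeat(&mut self, node_id: &str, now: Instant) -> bool {
        match self.last_seen.get_mut(node_id) {
            Some(seen) => {
                *seen = now;
                true
            }
            None => false,
        }
    }

    /// Ids of nodes whose last heartbeat is within the timeout, sorted.
    pub fn alive_nodes(&self, now: Instant) -> Vec<String> {
        let mut alive: Vec<String> = self
            .last_seen
            .iter()
            .filter(|(_, seen)| now.saturating_duration_since(**seen) <= self.timeout)
            .map(|(id, _)| id.clone())
            .collect();
        alive.sort();
        alive
    }
}
```

Passing `now` explicitly keeps the expiry rule easy to test without sleeping; a static, gossip, or multicast discovery source would only differ in how it feeds `register` and `heartbeat`.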

Workspace and Dependency Updates

  • Registered the new crates in the workspace by updating the members list in Cargo.toml. ([Cargo.tomlR8-R22])
  • Updated the bincode dependency to enable the serde feature, improving serialization compatibility. ([Cargo.tomlL37-R45])
  • Removed the opt-level override for the test profile, so it now uses the default optimization level. ([Cargo.tomlL102])

claude and others added 9 commits November 25, 2025 01:17
- Fix import paths in comparison_benchmark.rs and hnsw_search.rs
- Add Python benchmark suite comparing rUvector vs Qdrant
- Create detailed performance comparison documentation

Key findings:
- rUvector: 22x faster search at 50K vectors
- HNSW search: 45-165µs latency (k=1 to k=100)
- Distance calculations: 22-135ns (SIMD-optimized)
- Quantization: 4-32x memory compression
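
For orientation, the 4x and 32x ends of the compression range match what plain arithmetic gives for 8-bit scalar quantization and 1-bit binary quantization of `f32` vectors. The sketch below assumes those two schemes for illustration; the function names are hypothetical and the source does not say which quantizers rUvector uses.

```rust
/// Scalar-quantize f32 values to u8 using the vector's own min/max range.
/// Returns (codes, min, scale) so values can be approximately reconstructed
/// as `min + code * scale`.
pub fn quantize_u8(v: &[f32]) -> (Vec<u8>, f32, f32) {
    let min = v.iter().cloned().fold(f32::INFINITY, f32::min);
    let max = v.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    // Guard against a zero-width range (constant or empty input).
    let scale = if max > min { (max - min) / 255.0 } else { 1.0 };
    let codes = v.iter().map(|x| ((x - min) / scale).round() as u8).collect();
    (codes, min, scale)
}

/// Binary-quantize: one sign bit per dimension, packed 8 per byte.
pub fn quantize_binary(v: &[f32]) -> Vec<u8> {
    let mut out = vec![0u8; (v.len() + 7) / 8];
    for (i, x) in v.iter().enumerate() {
        if *x > 0.0 {
            out[i / 8] |= 1 << (i % 8);
        }
    }
    out
}

/// Ratio of the original f32 storage to the quantized storage.
pub fn compression_ratio(dims: usize, quantized_bytes: usize) -> f32 {
    (dims * std::mem::size_of::<f32>()) as f32 / quantized_bytes as f32
}
```

A 384-dimensional vector takes 1536 bytes as `f32`, 384 bytes as `u8` codes (4x), and 48 bytes as packed sign bits (32x).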
Detailed feature gap analysis and implementation plan covering:

Priority 1 (Critical):
- REST/gRPC API server with OpenAPI spec
- Advanced payload indexing (9 index types)
- Multi-collection management with aliases
- Snapshots and S3 backup support

Priority 2 (Scalability):
- Distributed mode with sharding
- Raft consensus for metadata
- Configurable replication

Priority 3 (Enterprise):
- Authentication with JWT RBAC
- TLS support (client + inter-node)
- Prometheus/OpenTelemetry metrics

Priority 4 (Performance):
- Asymmetric quantization
- Variable bit-width (1.5-bit, 2-bit)
- Tiered storage (hot/warm/cold)

Priority 5 (DX):
- Python/Go/Java SDKs
- Web dashboard
- Migration tools (FAISS, Pinecone, Weaviate)

Preserves rUvector advantages: 22x faster search, WASM,
hypergraphs, AgenticDB, sub-100µs latency
New Crates:
- ruvector-server: REST API server using axum (collections, points, health endpoints)
- ruvector-collections: Multi-collection management with aliases
- ruvector-filter: Advanced payload indexing (9 index types, geo, full-text)
- ruvector-snapshot: Backup/restore with gzip compression and checksums
- ruvector-metrics: Prometheus metrics and health checks

Integrations:
- Node.js NAPI-RS: CollectionManager, filters, metrics, health endpoints
- WASM: CollectionManager, FilterBuilder (with feature flag)

Performance Benchmarks:
- HNSW search: 41-151µs (k=1 to k=100)
- Distance calc: 16-142ns (128-1536 dims)
- Batch distances: 278µs (1000x384)

All crates compile in both debug and release modes.
…cation

- ruvector-cluster: Distributed coordination with DAG-based consensus,
  consistent hashing sharding, node discovery (static/gossip/multicast),
  and load balancing across shards

- ruvector-raft: Full Raft consensus implementation following the paper
  spec, including leader election, log replication, snapshots, and RPC
  messages with bincode 2.0 serialization

- ruvector-replication: Data replication with sync/async/semi-sync modes,
  vector clock conflict resolution, CRDT-inspired merge strategies,
  change streaming with checkpointing, and automatic failover with
  quorum-based decisions
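
As a rough illustration of the consistent hashing sharding named for ruvector-cluster, here is a minimal hash ring with virtual nodes built only on the standard library. `HashRing` and its methods are hypothetical names; the crate's actual types, hash function, and virtual-node count are not shown in the source.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut h = DefaultHasher::new();
    value.hash(&mut h);
    h.finish()
}

/// Hypothetical hash ring with virtual nodes; illustrative only.
pub struct HashRing {
    vnodes: usize,
    ring: BTreeMap<u64, String>,
}

impl HashRing {
    pub fn new(vnodes: usize) -> Self {
        Self { vnodes, ring: BTreeMap::new() }
    }

    /// Place `vnodes` points on the ring for this node.
    pub fn add_node(&mut self, node: &str) {
        for i in 0..self.vnodes {
            self.ring.insert(hash_of(&(node, i)), node.to_string());
        }
    }

    /// Drop every ring point owned by this node.
    pub fn remove_node(&mut self, node: &str) {
        self.ring.retain(|_, owner| owner.as_str() != node);
    }

    /// The owner is the first ring point at or after the key's hash,
    /// wrapping around to the smallest point if none is found.
    pub fn node_for(&self, key: &str) -> Option<&str> {
        let h = hash_of(&key);
        self.ring
            .range(h..)
            .next()
            .or_else(|| self.ring.iter().next())
            .map(|(_, owner)| owner.as_str())
    }
}
```

The property that matters for sharding is that removing a node only remaps the keys that node owned; keys owned by other nodes keep their placement.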

All 56 tests pass across the 3 new crates. Fixed several issues during
review: bincode error types, Send bounds for async spawns, unnecessary
async methods converted to sync.
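
The vector clock conflict resolution listed for ruvector-replication rests on a partial order over per-replica counters. A minimal sketch of that comparison follows; `VectorClock` and its methods are hypothetical names, and the crate's real representation and merge strategies may differ.

```rust
use std::cmp::Ordering;
use std::collections::BTreeMap;

/// Hypothetical vector clock keyed by replica id; illustrative only.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct VectorClock(BTreeMap<String, u64>);

impl VectorClock {
    /// Increment this replica's counter for a local update.
    pub fn tick(&mut self, replica: &str) {
        *self.0.entry(replica.to_string()).or_insert(0) += 1;
    }

    /// Some(ordering) when one clock happened-before the other or they are
    /// equal; None when the updates are concurrent and need a merge strategy.
    pub fn compare(&self, other: &Self) -> Option<Ordering> {
        let (mut less, mut greater) = (false, false);
        for key in self.0.keys().chain(other.0.keys()) {
            let a = self.0.get(key).copied().unwrap_or(0);
            let b = other.0.get(key).copied().unwrap_or(0);
            if a < b {
                less = true;
            }
            if a > b {
                greater = true;
            }
        }
        match (less, greater) {
            (false, false) => Some(Ordering::Equal),
            (true, false) => Some(Ordering::Less),
            (false, true) => Some(Ordering::Greater),
            (true, true) => None,
        }
    }

    /// Element-wise maximum, applied after a conflict has been resolved.
    pub fn merge(&mut self, other: &Self) {
        for (k, v) in &other.0 {
            let e = self.0.entry(k.clone()).or_insert(0);
            *e = (*e).max(*v);
        }
    }
}
```

A `None` result is the case where a merge strategy has to decide the winner; the ordered cases can simply keep the newer value.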
The platform-specific packages (darwin-arm64, darwin-x64, linux-arm64-gnu,
linux-x64-gnu, win32-x64-msvc) were updated to 0.1.2, but the main
npm/core/package.json still referenced 0.1.1, causing CI build failures.

This commit updates the optionalDependencies to match the actual package
versions and syncs the package-lock.json accordingly.

Fixes build failures in PR #12.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
The "Find built .node files (debug)" step was failing on Windows because
it defaulted to PowerShell, which doesn't understand /dev/null redirection.

Adding shell: bash makes it consistent with the other build steps and
ensures cross-platform compatibility.

Fixes Windows build failures in PR #12.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Replace deprecated macos-13 with macos-15-large to avoid brownouts
and ensure Intel Mac builds continue to work.

GitHub is deprecating macos-13 runners:
actions/runner-images#13046

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
macos-15-large doesn't appear to be available. Using macos-13-xlarge
which is the larger Intel runner still available during the transition period.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Based on GitHub's official documentation, macos-15-intel is the correct
replacement for deprecated macos-13 runners for x86_64 architecture.

Reference: actions/runner-images#13045

This is the last available x86_64 image from Actions, available until
August 2027.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@ruvnet ruvnet merged commit 85b67c4 into main Nov 25, 2025
6 checks passed
ruvnet added a commit that referenced this pull request Feb 20, 2026
ruvnet added a commit that referenced this pull request Feb 20, 2026
ruvnet added a commit that referenced this pull request Feb 20, 2026
ruvnet added a commit that referenced this pull request Apr 22, 2026
… for AC-2

Commit 19 (d06e80f on feat/analysis-rate-encoder, merged) ran a
controlled A/B on the same 8-protocol labeled corpus that disproved
SDPA at discovery #10: raw per-neuron-per-time-bin spike counts (the
crudest possible encoder; no projection, no attention) scored
rate-histogram precision@5 = 0.079 vs SDPA's 0.072 — delta +0.007,
inside the ±0.05 tie band.

Both encoders score below random chance for 8 classes (0.125). The
crudest encoder that preserves all raster information ties the
shipped encoder. That rules out the encoder axis of ADR §17 item
10's three-axis framing.
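
The arithmetic behind the comparison above can be checked with a small sketch. The helper names are hypothetical, and the chance level of 1/8 assumes balanced classes, as the 0.125 figure implies; the branch's actual evaluation code is not shown in the source.

```rust
/// Fraction of the top-k retrieved labels that match the query's label.
pub fn precision_at_k(query_label: u32, retrieved: &[u32], k: usize) -> f64 {
    let top = &retrieved[..k.min(retrieved.len())];
    if top.is_empty() {
        return 0.0;
    }
    top.iter().filter(|&&l| l == query_label).count() as f64 / top.len() as f64
}

/// Expected precision of a random retriever over balanced classes.
pub fn chance_level(num_classes: usize) -> f64 {
    1.0 / num_classes as f64
}

/// Two scores tie when their absolute difference is inside the band.
pub fn is_tie(a: f64, b: f64, band: f64) -> bool {
    (a - b).abs() <= band
}
```

With the reported numbers, `is_tie(0.079, 0.072, 0.05)` holds and both scores sit under `chance_level(8)`, which is the basis for ruling out the encoder axis.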

Remaining AC-2 levers:
  - substrate: real FlyWire v783 ingest replaces synthetic SBM
    (predicted to separate under its heavier non-hub tail)
  - labels:    raster-regime labels replace stimulus-protocol
    labels (matches what the encoder actually captures)

Both are research-level pivots for a separate ADR, not engineering
levers on this branch.

The branch's broader pattern of measurement-disproving pre-measurement
diagnoses now stands at 11-of-12 named levers tested surfacing at
least one honest surprise. The sole unambiguous win remains commit 10
(adaptive detect cadence, 4.29×) — changed *when*, not *what*.

Co-Authored-By: claude-flow <ruv@ruv.net>
ruvnet added a commit that referenced this pull request Apr 22, 2026
…ent + lesion + audit

Ships the public ABIs + productized wrappers that move three of
Connectome OS's exotic applications (README Part 3) one concrete
step closer to feasible. Each is scaffolding, not a full
implementation — the production pieces (MuJoCo bridge, mouse
connectome, real FlyWire data) genuinely can't ship from this
branch — but each gives external code the typed surface to build
against today.

Three new top-level modules:

1. src/embodiment.rs — BodySimulator trait + 2 implementations
   (247 LOC incl. tests)

   The slot where a physics body sits between the connectome's
   motor outputs and sensory inputs. Defines the per-tick ABI
   (, , ) that Phase-3 MuJoCo + NeuroMechFly
   will drop into. Ships two impls:
     - StubBody — deterministic open-loop drive over an existing
       Stimulus schedule. Preserves AC-1. This is what the
       Tier-1 demo runs with.
     - MujocoBody — Phase-3 panic-stub. Constructs without
       panicking (so downstream code can Box<dyn BodySimulator>
       against it today); panics on step/reset with an
       actionable diagnostic pointing at ADR-154 §13 and
       04-embodiment.md.

   Unblocks application #10 — 'embodied fly navigation in VR'.
   The remaining Phase-3 work is the cxx bridge + NeuroMechFly
   MJCF ingest; the wiring is now waiting, not un-designed.

2. src/lesion.rs — LesionStudy + CandidateCut + LesionReport
   (374 LOC incl. tests)

   Productization of AC-5 σ-separation. Outside code can now
   answer 'which edges are load-bearing for behaviour X?'
   without copy-pasting the test internals. Paired-trial loop,
   σ distance against a nominated reference cut, deterministic
   across repeat runs. Includes boundary_edges() / interior_edges()
   helpers so callers can build cuts from a FunctionalPartition
   without re-deriving the traversal.

   Unblocks application #11 — 'in-silico circuit-lesion studies'.
   Also powers the audit module (next).

3. src/audit.rs — StructuralAudit + StructuralAuditReport
   (235 LOC incl. tests)

   One-call orchestrator that runs every analysis primitive
   (Fiedler coherence, structural mincut, functional mincut,
   SDPA motif retrieval, AC-5-shaped causal perturbation) and
   returns a single report a reviewer can read top-to-bottom.
   Auto-generates boundary-vs-interior candidate cuts when the
   caller doesn't supply explicit ones. Same determinism
   contract as every underlying primitive.

   Unblocks application #13 — 'connectome-grounded AI safety
   auditing'. The framing is 'safety auditing'; the deliverable
   is a reproducible report, not a safety guarantee.
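
The shape of the embodiment slot described in item 1 can be sketched as a small object-safe trait with a replaying stub and a panic stub. Only the names `BodySimulator`, `StubBody`, `MujocoBody`, `step`, and `reset` come from the text; the signatures, the `Vec<f32>` payloads, and the schedule representation are assumptions made for illustration.

```rust
/// Hypothetical per-tick ABI; the method signatures are assumptions.
pub trait BodySimulator {
    /// Advance one tick: consume motor drive, return sensory input.
    fn step(&mut self, motor: &[f32]) -> Vec<f32>;
    /// Return the body to its initial state.
    fn reset(&mut self);
}

/// Open-loop stub: replays a fixed schedule and ignores motor output.
pub struct StubBody {
    schedule: Vec<Vec<f32>>,
    tick: usize,
}

impl StubBody {
    pub fn new(schedule: Vec<Vec<f32>>) -> Self {
        Self { schedule, tick: 0 }
    }
}

impl BodySimulator for StubBody {
    fn step(&mut self, _motor: &[f32]) -> Vec<f32> {
        // Past the end of the schedule the stub returns an empty input.
        let out = self.schedule.get(self.tick).cloned().unwrap_or_default();
        self.tick += 1;
        out
    }

    fn reset(&mut self) {
        self.tick = 0;
    }
}

/// Placeholder that constructs without panicking but panics when driven.
pub struct MujocoBody;

impl BodySimulator for MujocoBody {
    fn step(&mut self, _motor: &[f32]) -> Vec<f32> {
        panic!("MujocoBody is a Phase-3 stub; see ADR-154 §13 and 04-embodiment.md")
    }

    fn reset(&mut self) {
        panic!("MujocoBody is a Phase-3 stub; see ADR-154 §13 and 04-embodiment.md")
    }
}
```

Because the trait is object-safe, downstream code can hold a `Box<dyn BodySimulator>` today and swap the stub for a real physics bridge later without changing call sites.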

Application #12 ('cross-species connectome transfer') needs a
second heterogeneous connectome; today we have the fly-scale
substrate only. Deferred until Tier-2 mouse data lands.

Application #14 ('substrate for structural-intelligence research
papers') was already open — it's the meta-application, no
scaffolding needed.

Lib.rs re-exports the new public types so downstream consumers
can
directly.

Measurements:
  10/10 new unit tests pass on :
    embodiment: 5 tests (trait object-safe, stub determinism +
                windowing, mujoco stub construct-ok +
                step-panics-with-diagnostic)
    lesion:     3 tests (report shape, boundary/interior disjoint,
                deterministic across repeats)
    audit:      2 tests (populates every field, deterministic)

  All 73 prior tests still pass; no API regression.
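
The boundary/interior disjointness that the lesion tests check can be illustrated with a minimal sketch. The helper names `boundary_edges` and `interior_edges` come from the text, but the signatures here are assumptions: the partition is modelled as a slice where `partition[n]` is the module id of node `n`, which may not match the real `FunctionalPartition` type.

```rust
/// Edges whose endpoints fall in different modules (assumed signature).
pub fn boundary_edges(edges: &[(usize, usize)], partition: &[u32]) -> Vec<(usize, usize)> {
    edges
        .iter()
        .copied()
        .filter(|&(a, b)| partition[a] != partition[b])
        .collect()
}

/// Edges whose endpoints fall in the same module (assumed signature).
pub fn interior_edges(edges: &[(usize, usize)], partition: &[u32]) -> Vec<(usize, usize)> {
    edges
        .iter()
        .copied()
        .filter(|&(a, b)| partition[a] == partition[b])
        .collect()
}
```

Every edge lands in exactly one of the two sets, so a boundary cut and an interior cut built this way never overlap.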

  Total new LOC: 856 (247 + 374 + 235) src + tests; all files
  under the 500-line ADR-154 §3.2 file budget.

Positioning rubric held. Scaffolding is scaffolding — not new
scientific claims. Every module docstring links back to the
Connectome-OS README Part 3 application it unblocks.

Co-Authored-By: claude-flow <ruv@ruv.net>