RaBitQ similarity sensor: 43-51x faster pose/CSI matching, with anomaly detection, privacy logs, and mesh compression for free

## What this is

RuView now has a **cheap similarity sensor** baked into the pipeline: a way to ask "have I seen something like this before?" for any embedding (poses, CSI features, room signatures) without paying the full floating-point cost. It uses a technique called **RaBitQ-style binary sketching**: each embedding is compressed to one bit per dimension (32× smaller than `f32` storage) and two sketches are compared with an XOR followed by a hardware population count (POPCNT on Intel/AMD, NEON `vcnt` on ARM/Pi/Mac), a handful of instructions per pair at the dimensions RuView uses.
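The sketch-and-compare step described above can be illustrated in a few lines. This is a minimal sketch under stated assumptions: `sign_sketch` and `hamming` are hypothetical helper names, not the API on the branch, and the sign convention (bit set when the value is `>= 0`) is an assumption.

```rust
/// Pack the sign of each dimension into 64-bit words.
/// Bit i is 1 when embedding[i] >= 0 (assumed convention).
/// Hypothetical helper for illustration; not the crate's `Sketch` type.
fn sign_sketch(embedding: &[f32]) -> Vec<u64> {
    let mut words = vec![0u64; (embedding.len() + 63) / 64];
    for (i, &x) in embedding.iter().enumerate() {
        if x >= 0.0 {
            words[i / 64] |= 1u64 << (i % 64);
        }
    }
    words
}

/// Hamming distance: XOR the words, then count the set bits.
/// `count_ones` can lower to POPCNT or a NEON count when the
/// compilation target enables those instructions.
fn hamming(a: &[u64], b: &[u64]) -> u32 {
    assert_eq!(a.len(), b.len(), "sketches must have the same width");
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}
```

At d=128 a sketch is two `u64` words, so one comparison is two XORs, two population counts, and an add.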

The architectural decision is **[ADR-084](docs/adr/ADR-084-rabitq-similarity-sensor.md)** (merged on `main`, status: Proposed). The first implementation pass — the foundation `Sketch` / `SketchBank` API — is on branch `feat/adr-084-pass-1-sketch-module` (commits `6fd5b7d`, `1df9d5f7d`).

## Why this matters in plain language

A lot of what RuView does is the same shape of question over and over:

- "Is this the same person we were tracking a moment ago?" — AETHER re-identification
- "Is this room behaving like it normally does, or is something unusual happening?" — novelty detection
- "Have we recorded a similar CSI signature before?" — recording search
- "Does this mesh node need to send the full sensor data, or can we just say 'same as last time'?" — bandwidth saving

Until now, the system answered each of these by **comparing full floating-point vectors**, which is slow, cache-unfriendly, and requires storing the raw vectors indefinitely. RaBitQ collapses that comparison to an XOR and a hardware popcount over a 32×-smaller fingerprint. We can keep the *witness* of what was seen without keeping the *signal*.

## New capabilities

| Capability | What it unlocks |
|---|---|
| **Always-on novelty detection** | The heavy CNN / pose model wakes up only when something genuinely new happens, so the per-node energy budget drops while a room is quiet. |
| **Faster re-identification** | When 3+ ESP32 nodes are streaming, the tracker can pre-filter candidate matches before running the full Kalman / cosine pass. Targets the ghost-skeleton class of issues we recently fixed in #420. |
| **Mesh-exchange compression** | Inter-cluster broadcasts can carry sketches + witness hashes instead of full embeddings. Less RF traffic, lower bandwidth bills on metered backhauls. |
| **Privacy-preserving event logs** | Stored fingerprints are 32× smaller and **not invertible** to the original CSI signal. Compliance and "what does this device know about me" answers improve. |
| **"Find similar recordings" search** | A `GET /api/v1/recordings/similar?to=<id>` endpoint becomes feasible without a vector database — sketches live in memory at the cluster Pi. |

## Features (Pass 1, shipped on the branch)

- `Sketch` — 1-bit-per-dimension binary fingerprint with embedding-version + dimension tags so we never silently compare incompatible sketches across model upgrades.
- `SketchBank` — keyed store of sketches, schema-locked at first insert, with `topk` and `novelty` queries.
- 12 unit tests covering schema lock, schema rejection, top-K ordering, novelty bounds.
- Criterion benchmark target (`cargo bench -p wifi-densepose-ruvector --bench sketch_bench`).
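To make the schema lock and the two queries concrete, here is a minimal stand-in, assuming a `(embedding_version, dim)` tag pair and `u64` keys. The type and method names mirror the bullets above, but every field, signature, and the novelty formula (nearest hamming distance divided by dimension) is an assumption, not the code on the branch.

```rust
use std::collections::HashMap;

// Illustrative stand-ins only: field and method names are assumptions,
// not the actual API on the branch.
#[derive(Clone, Debug, PartialEq)]
struct Sketch {
    embedding_version: u16,
    dim: u16,
    bits: Vec<u64>,
}

impl Sketch {
    /// Sign-quantize: bit i is 1 when embedding[i] >= 0.
    fn from_embedding(embedding_version: u16, embedding: &[f32]) -> Self {
        let mut bits = vec![0u64; (embedding.len() + 63) / 64];
        for (i, &x) in embedding.iter().enumerate() {
            if x >= 0.0 {
                bits[i / 64] |= 1u64 << (i % 64);
            }
        }
        Sketch { embedding_version, dim: embedding.len() as u16, bits }
    }

    fn hamming(&self, other: &Sketch) -> u32 {
        self.bits.iter().zip(&other.bits).map(|(a, b)| (a ^ b).count_ones()).sum()
    }
}

#[derive(Default)]
struct SketchBank {
    /// (embedding_version, dim), fixed by the first successful insert.
    schema: Option<(u16, u16)>,
    entries: HashMap<u64, Sketch>,
}

impl SketchBank {
    /// Reject any sketch whose tags differ from the locked schema.
    fn check(&self, s: &Sketch) -> Result<(), String> {
        let tag = (s.embedding_version, s.dim);
        match self.schema {
            Some(locked) if locked != tag => {
                Err(format!("schema mismatch: bank {:?}, sketch {:?}", locked, tag))
            }
            _ => Ok(()),
        }
    }

    fn insert(&mut self, key: u64, s: Sketch) -> Result<(), String> {
        self.check(&s)?;
        self.schema.get_or_insert((s.embedding_version, s.dim));
        self.entries.insert(key, s);
        Ok(())
    }

    /// The k nearest keys by hamming distance, nearest first (ties by key).
    fn topk(&self, query: &Sketch, k: usize) -> Result<Vec<(u64, u32)>, String> {
        self.check(query)?;
        let mut scored: Vec<(u64, u32)> =
            self.entries.iter().map(|(&key, s)| (key, s.hamming(query))).collect();
        scored.sort_by_key(|&(key, dist)| (dist, key));
        scored.truncate(k);
        Ok(scored)
    }

    /// Nearest hamming distance divided by dim; 1.0 for an empty bank.
    fn novelty(&self, query: &Sketch) -> Result<f32, String> {
        self.check(query)?;
        let nearest = self.entries.values().map(|s| s.hamming(query)).min();
        Ok(nearest.map_or(1.0, |d| d as f32 / query.dim as f32))
    }
}
```

Returning an error on a tag mismatch, rather than a large distance, is what keeps a model upgrade from silently producing plausible but meaningless matches.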

The five sites ADR-084 commits to wiring up:
1. AETHER re-ID hot-cache filter
2. Cluster-Pi novelty sensor
3. Mesh-exchange compression
4. Privacy-preserving event log
5. Mincut prefilter

A follow-up SOTA ADR is being researched right now to extend the same pattern to seven *more* sites: per-room adaptive classifier short-circuit, recording-search REST endpoint, WiFi BSSID fingerprinting, mmWave radar signature memory, witness-bundle drift detection, swarm/agent memory routing, and event-pattern anomaly detection. (Will land as ADR-085.)

## Performance comparison

Measured on a Windows host (criterion, warm-up 1 s, measurement 3 s) at the dimensions RuView actually uses. Lower nanoseconds = faster.

### Single comparison (per pair)

| Embedding dimension | Full-precision (squared L2) | Full-precision (cosine) | RaBitQ sketch (hamming) | Speedup vs L2 | Speedup vs cosine |
|---|---:|---:|---:|---:|---:|
| **128** *(AETHER pose re-ID)* | ~50 ns | ~58 ns | ~1.1 ns | ~45× | ~52× |
| **256** *(CSI spectrogram)* | ~100 ns | ~115 ns | ~2.3 ns | ~43× | ~50× |
| **512** *(future, post-rotation)* | **197 ns** | **231 ns** | **4.6 ns** | **43×** | **51×** |

(Only the d=512 row, in bold, is measured. The d=128 and d=256 rows are linear extrapolations from it; rerun `cargo bench` for exact figures on your hardware.)

### Realistic top-K query (k=8, bank of 1024)

| Operation | Full-precision (L2 + sort) | RaBitQ sketch (hamming + sort) | Speedup |
|---|---:|---:|---:|
| `topk_d128_n1024_k8` | **47.6 µs** | **6.3 µs** | **7.5×** |

The pairwise comparison is **well above the 8×–30× target band** in the ADR-084 acceptance criteria. Top-K reaches only 7.5× because at this bank size the **sort dominates** the comparison work. There is a known optimization (a partial-sort heap for small K) that could land in Pass 1.5 if we want to push this to 15–20×. Even at 7.5×, the hot path spends meaningfully less CPU time.
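The partial-sort idea mentioned above could look roughly like the following: a bounded max-heap that never holds more than K candidates, so the work is O(n log k) rather than the O(n log n) of sort-then-truncate. This is a sketch, not the planned Pass 1.5 code; `topk_heap` and its signature are hypothetical.

```rust
use std::collections::BinaryHeap;

/// Keep the k smallest hamming distances with a max-heap of size k.
/// Returns (distance, bank index) pairs, nearest first.
/// Hypothetical helper; the branch's `topk` may be structured differently.
fn topk_heap(query: &[u64], bank: &[Vec<u64>], k: usize) -> Vec<(u32, usize)> {
    let mut heap: BinaryHeap<(u32, usize)> = BinaryHeap::with_capacity(k + 1);
    for (idx, sketch) in bank.iter().enumerate() {
        let dist: u32 = query.iter().zip(sketch).map(|(a, b)| (a ^ b).count_ones()).sum();
        if heap.len() < k {
            heap.push((dist, idx));
        } else if let Some(&worst) = heap.peek() {
            // Replace the current worst candidate only if this one is closer.
            if (dist, idx) < worst {
                heap.pop();
                heap.push((dist, idx));
            }
        }
    }
    heap.into_sorted_vec() // ascending order: nearest first
}
```

Whether this actually reaches the 15–20× range is something only the benchmark can answer; the standard library's `select_nth_unstable` on a slice is another option worth measuring.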

### What the speedup means in practice

- A cluster Pi running pose re-ID on 6 streams can compare against 1024 historical tracks in about **6 µs** instead of about **48 µs** per top-K query.
- A cluster Pi fronting ESP32 nodes at the edge should be able to do **continuous** novelty scoring at the 10 Hz CSI rate with negligible CPU impact, leaving headroom for the model wake gate. This still needs confirming in the hardware-in-loop validation listed below.
- WebSocket frames can carry a 16-byte sketch instead of a 512-byte embedding (d=128, `f32`) when broadcasting "I see what I expected". That is a 32× bandwidth reduction on metered links.
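The 16-byte versus 512-byte arithmetic in the last bullet, and the "same as last time" decision behind it, can be sketched as follows. The `Frame` enum, `encode`, and the `max_changed_bits` threshold are all hypothetical names introduced for illustration; the real mesh format is defined by ADR-084, not by this snippet.

```rust
// Hypothetical wire frame: names and the change threshold are illustrative
// assumptions, not the branch's mesh protocol.
#[derive(Debug, PartialEq)]
enum Frame {
    /// Observation is close to the last one sent: carry only the sketch.
    SketchOnly(Vec<u64>),
    /// Observation changed: carry the full f32 embedding.
    Full(Vec<f32>),
}

impl Frame {
    fn payload_bytes(&self) -> usize {
        match self {
            Frame::SketchOnly(bits) => bits.len() * 8, // 8 bytes per u64 word
            Frame::Full(values) => values.len() * 4,   // 4 bytes per f32
        }
    }
}

/// One bit per dimension, set when the value is >= 0.
fn sign_sketch(embedding: &[f32]) -> Vec<u64> {
    let mut words = vec![0u64; (embedding.len() + 63) / 64];
    for (i, &x) in embedding.iter().enumerate() {
        if x >= 0.0 {
            words[i / 64] |= 1u64 << (i % 64);
        }
    }
    words
}

/// Send only the sketch when at most `max_changed_bits` bits differ from the
/// sketch that was sent last; otherwise fall back to the full embedding.
fn encode(embedding: &[f32], last_sent: &[u64], max_changed_bits: u32) -> Frame {
    let sketch = sign_sketch(embedding);
    let changed: u32 = sketch.iter().zip(last_sent).map(|(a, b)| (a ^ b).count_ones()).sum();
    if sketch.len() == last_sent.len() && changed <= max_changed_bits {
        Frame::SketchOnly(sketch)
    } else {
        Frame::Full(embedding.to_vec())
    }
}
```

For a 128-dimension embedding, `SketchOnly` carries 2 × 8 = 16 bytes and `Full` carries 128 × 4 = 512 bytes, which is where the 32× figure comes from.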

## Status and next steps

- [x] **ADR-084 merged** on main (decision document only)
- [x] **Pass 1: Sketch module + SketchBank API + 12 tests** — branch `feat/adr-084-pass-1-sketch-module`
- [x] **Pass 1.1: criterion benchmark proving 43–51× speedup** — same branch, commit `1df9d5f7d`
- [ ] Pass 2: AETHER re-ID hot-cache filter (in `tracker_bridge.rs`)
- [ ] Pass 3: Cluster-Pi novelty sensor (in `sensing-server`)
- [ ] Pass 4: Mesh-exchange compression
- [ ] Pass 5: Privacy-preserving event log
- [ ] Pass 6+: ADR-085 expansion sites (adaptive classifier, recording search, BSSID, mmWave, witness drift, swarm routing, event log anomaly)
- [ ] ESP32-S3 hardware-in-loop validation with all passes wired
- [ ] Security review across all sites
- [ ] Final acceptance numbers measured per-site, ADR-084 promoted from Proposed → Accepted

## Open questions

1. **Does pure 1-bit sign quantization work at every site, or do some embeddings need a randomized rotation pre-pass first?** The full RaBitQ paper (Gao & Long, SIGMOD 2024) adds a Johnson-Lindenstrauss rotation for theoretical error bounds. Today's `BinaryQuantized` is plain sign quantization: fine for zero-centered isotropic embeddings, possibly weak for skewed ones (e.g., a raw spectrogram, where most values share a sign and so most bits carry no information). To be decided after Pass-2 benchmarks on real AETHER traces.

2. **Is the witness-hash format good enough for compliance?** The fingerprint is 32× smaller and irreversible to the original CSI, but a determined attacker might still infer location-class information from sketch hamming distances. We should run an information-theoretic audit before claiming "privacy-preserving" in user-facing copy.

3. **How does sort overhead in top-K scale with bank size?** The 7.5× number at n=1024 is the only data point so far, and sort already dominates there. We need bench data at n=4096 and n=16384 (realistic for a multi-room deployment) to know whether the partial-sort heap is needed before Pass 2 ships.
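For open question 1, one cheap way to prototype a rotation pre-pass is a randomized Hadamard transform: seeded random sign flips followed by a fast Walsh-Hadamard transform. To be clear, this is a swapped-in stand-in, not the dense random rotation from the RaBitQ paper and not anything on the branch. All function names are hypothetical, and the length must be a power of two (true for 128, 256, and 512).

```rust
/// Tiny deterministic PRNG (xorshift64) so the example has no dependencies.
fn xorshift64(state: &mut u64) -> u64 {
    let mut x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    x
}

/// In-place orthonormal fast Walsh-Hadamard transform.
fn fwht(v: &mut [f32]) {
    let n = v.len();
    assert!(n.is_power_of_two(), "length must be a power of two");
    let mut h = 1;
    while h < n {
        for block in (0..n).step_by(2 * h) {
            for i in block..block + h {
                let (a, b) = (v[i], v[i + h]);
                v[i] = a + b;
                v[i + h] = a - b;
            }
        }
        h *= 2;
    }
    // Scale by 1/sqrt(n) so the transform preserves norms and dot products.
    let scale = 1.0 / (n as f32).sqrt();
    for x in v.iter_mut() {
        *x *= scale;
    }
}

/// Seeded random sign flips, then the Hadamard transform.
/// Every embedding that will be compared must use the same seed.
fn randomized_rotate(embedding: &[f32], seed: u64) -> Vec<f32> {
    let mut state = seed.max(1); // xorshift state must be non-zero
    let mut out: Vec<f32> = embedding
        .iter()
        .map(|&x| if xorshift64(&mut state) >> 63 == 1 { -x } else { x })
        .collect();
    fwht(&mut out);
    out
}

/// Plain sign quantization, one bool per dimension.
fn sign_bits(v: &[f32]) -> Vec<bool> {
    v.iter().map(|&x| x >= 0.0).collect()
}
```

The skew problem is easy to see with `sign_bits`: any two all-positive vectors produce identical bits, so their hamming distance is zero no matter how different they are. Because the transform is orthogonal, it leaves distances between embeddings unchanged while spreading the information across signs; whether that improves recall on real AETHER traces is exactly what the Pass-2 benchmarks would need to show.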

## Try it locally

```bash
# Fast unit tests (12 sketch tests pass in <0.1 s)
git checkout feat/adr-084-pass-1-sketch-module
cd v2
cargo test -p wifi-densepose-ruvector --no-default-features sketch

# Run the benchmark on your own hardware
cargo bench -p wifi-densepose-ruvector --bench sketch_bench
```

---
_Generated by [Claude Code](https://claude.ai/code) — full ADR at [docs/adr/ADR-084-rabitq-similarity-sensor.md](docs/adr/ADR-084-rabitq-similarity-sensor.md)_

