📋 Status: Ready for maintainer review (2026-04-14)
Proposal fully implemented, benchmarked on real public ANN datasets, and committed to branch issue/eml on fork shaal/ruvector. PR #352 is open but this issue contains everything needed to evaluate the work end-to-end. All four proof reports live in bench_results/ on the branch.
One-line ask: merge the branch (or cherry-pick the relevant commits) if the SIFT1M real-data win and the honest failure-mode documentation meet the bar; reject with specific concerns if not.
TL;DR
Four optimizations inspired by Odrzywołek's "All elementary functions from a single operator" paper, validated through a rigorous four-stage proof chain that includes one self-disproof and one hypothesis-test failure — honestly reported.
Headline real-data result (the single most important number)
On SIFT1M (100K subset, the canonical ANN benchmark, 3 seeds): +14.0% QPS at ef=64, +0.75% Recall@1, −3.3% build time. Recall preserved or slightly improved across k=1/10/100. Full table below.
Honest caveat
On GloVe-100d: −10.4% QPS at ef=256 (recall preserved). The non-power-of-two 100D dimension causes SimSIMD's scalar tail-handling to execute per distance call. We tried zero-padding as a fix (v4) and it made the regression worse due to per-call padding overhead. The correct fix (pad-at-insert) is scoped as follow-up work; the API hook for it is already shipped.
Four-stage proof chain
1. v1 (bench_results/eml_proof_2026-04-14.md) — initial synthetic benchmarks; self-disproof of the hand-rolled branch-free kernel.
2. v2 (bench_results/eml_proof_2026-04-14_v2.md) — port compute() to SimSIMD, re-run synthetic benchmarks.
3. v3 (bench_results/eml_proof_2026-04-14_v3.md) — real-dataset ANN proof on SIFT1M and GloVe-100d.
4. v4 (bench_results/eml_proof_2026-04-14_v4.md) — zero-padding hypothesis test on GloVe (disproof).
The Proposal
Four optimizations derived from the proof that eml(x, y) = exp(x) − ln(y), together with the constant 1, is functionally complete for all elementary mathematical functions — analogous to NAND gates in Boolean logic. The paper's identities we use:
- eml(x, 1) = exp(x) — exponential
- eml(0, y) = 1 − ln(y) — shifted logarithm
- eml(ln(a), exp(b)) = a − b — subtraction
Optimization 1 — LogQuantized (Logarithmic Scalar Quantization)
Problem. ScalarQuantized maps values linearly to [0, 255]. Real embedding distributions (transformer outputs, ReLU activations, frequency features) are non-uniform — values cluster near zero with heavy tails. Linear mapping wastes bits on the sparse tail.
Solution. Apply ln(x − min + 1) before uniform quantization and exp(q) − 1 + min to reconstruct. Same 4× compression as scalar, finer granularity where values are dense.
Implementation. src/quantization.rs — same u8 storage as ScalarQuantized, same SIMD-accelerated distance path.
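The transform pair can be sketched as follows. This is illustrative standalone code under assumed conventions (function names are hypothetical, not the actual LogQuantized API):

```rust
// Logarithmic scalar quantization sketch: ln(x − min + 1) before uniform
// u8 quantization, exp(q) − 1 + min to reconstruct.
fn log_quantize(values: &[f32]) -> (Vec<u8>, f32, f32) {
    let min = values.iter().cloned().fold(f32::INFINITY, f32::min);
    let max = values.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    // Guard against a degenerate all-equal input (log_max would be 0).
    let log_max = (max - min + 1.0).ln().max(f32::EPSILON);
    let codes = values
        .iter()
        // Dense values near `min` get a disproportionate share of the
        // 256 codes; the sparse tail is compressed.
        .map(|&x| (((x - min + 1.0).ln() / log_max) * 255.0).round() as u8)
        .collect();
    (codes, min, log_max)
}

fn log_dequantize(codes: &[u8], min: f32, log_max: f32) -> Vec<f32> {
    codes
        .iter()
        .map(|&q| ((q as f32 / 255.0) * log_max).exp() - 1.0 + min)
        .collect()
}

fn main() {
    // Heavy-tailed toy input: clustered near zero, one large outlier.
    let vals = vec![0.0_f32, 0.1, 0.2, 0.5, 1.0, 4.0];
    let (codes, min, log_max) = log_quantize(&vals);
    let recon = log_dequantize(&codes, min, log_max);
    for (a, b) in vals.iter().zip(&recon) {
        assert!((a - b).abs() < 0.05, "reconstruction error too large");
    }
    println!("round-trip ok for {} values", vals.len());
}
```

Same u8 storage cost as linear scalar quantization; only the code-to-value mapping changes.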
Optimization 2 — UnifiedDistanceParams (Branch-free Distance Kernel)
Problem. distance() dispatches via match metric { … } on every call. Inside HNSW search, this match executes thousands of times per query.
Solution. Pre-resolve the metric once into boolean flags (normalize, use_abs_diff, use_sq_diff, apply_sqrt, negate). The inner loop then evaluates a predictable if/else if cascade on those flags, which branch-predicts perfectly for a given index.
Implementation (src/advanced/eml.rs). After v1's disproof, compute() now dispatches directly to simsimd::SpatialSimilarity::{cosine, sqeuclidean, dot} — same hand-tuned AVX2/NEON kernels as the baseline.
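The flag-resolution idea, in a simplified scalar form. The field names follow this issue, but the reference loop below is illustrative — the shipped `compute()` dispatches to SimSIMD kernels instead:

```rust
// Metric pre-resolved once into flags; no `match metric` in the hot loop.
struct UnifiedDistanceParams {
    use_abs_diff: bool, // Manhattan
    use_sq_diff: bool,  // Euclidean
    apply_sqrt: bool,
    negate: bool,       // dot product scored as a distance
}

impl UnifiedDistanceParams {
    fn euclidean() -> Self {
        Self { use_abs_diff: false, use_sq_diff: true, apply_sqrt: true, negate: false }
    }
    fn manhattan() -> Self {
        Self { use_abs_diff: true, use_sq_diff: false, apply_sqrt: false, negate: false }
    }
    fn dot_product() -> Self {
        Self { use_abs_diff: false, use_sq_diff: false, apply_sqrt: false, negate: true }
    }

    // The flags are fixed for the lifetime of an index, so these branches
    // are perfectly predictable inside HNSW search.
    fn compute(&self, a: &[f32], b: &[f32]) -> f32 {
        let mut acc = 0.0_f32;
        for (x, y) in a.iter().zip(b) {
            let d = x - y;
            if self.use_sq_diff {
                acc += d * d;
            } else if self.use_abs_diff {
                acc += d.abs();
            } else {
                acc += x * y;
            }
        }
        let mut out = if self.apply_sqrt { acc.sqrt() } else { acc };
        if self.negate {
            out = -out;
        }
        out
    }
}

fn main() {
    let (a, b) = ([0.0_f32, 3.0], [4.0_f32, 0.0]);
    assert!((UnifiedDistanceParams::euclidean().compute(&a, &b) - 5.0).abs() < 1e-6);
    assert!((UnifiedDistanceParams::manhattan().compute(&a, &b) - 7.0).abs() < 1e-6);
    assert!((UnifiedDistanceParams::dot_product().compute(&[1.0, 2.0], &[3.0, 4.0]) + 11.0).abs() < 1e-6);
    println!("flag-resolved kernel matches the closed-form distances");
}
```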
Optimization 3 — EmlTree / EmlModel (Learned-Index CDF Approximator)
Problem. RecursiveModelIndex uses linear models for CDF approximation; real distributions are rarely linear.
Solution. Replace linear model with a trainable binary tree of eml() nodes. The paper's functional-completeness proof guarantees any elementary CDF can be represented.
Evidence. On y = exp(x), a depth-1 EML tree converges in 12 iterations to a test-MSE that rounds to 0.000000 at six decimal places, vs the linear model's 0.194 — roughly a 2-billion-fold reduction.
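Why a single node can drive the error to zero on y = exp(x): with its second input pinned to the constant 1, the node reduces to the exponential identity from the paper. A hypothetical depth-1 model (the parameter layout is illustrative, not the shipped EmlTree):

```rust
fn eml(x: f64, y: f64) -> f64 {
    x.exp() - y.ln()
}

// Depth-1 "tree": one eml node with affine inputs,
// model(x) = eml(w1*x + b1, w2*x + b2).
fn model(x: f64, (w1, b1, w2, b2): (f64, f64, f64, f64)) -> f64 {
    eml(w1 * x + b1, w2 * x + b2)
}

fn main() {
    // With the second argument held at the constant 1, the node is
    // model(x) = eml(x, 1) = exp(x) exactly — hence test-MSE 0.
    let p = (1.0, 0.0, 0.0, 1.0);
    let mse: f64 = (0..100)
        .map(|i| {
            let x = -2.0 + 4.0 * (i as f64) / 99.0;
            let err = model(x, p) - x.exp();
            err * err
        })
        .sum::<f64>()
        / 100.0;
    // No linear model a*x + c can match this: exp(x) is strictly convex,
    // so any line has irreducible error on the interval.
    assert!(mse < 1e-20);
    println!("depth-1 eml tree MSE on exp(x): {mse:e}");
}
```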
Optimization 4 — EmlScoreFusion (Non-linear Hybrid Search Scoring)
Problem. Linear fusion α·vector + (1−α)·bm25 can't capture non-linear interactions.
Solution. eml(a·vector + b, c·bm25 + d) — captures the real-world pattern that vector similarity has exponential impact while keyword matching has logarithmic impact. Measured 3.16× asymmetry in tests/eml_proof.rs (PROOF 5).
Overhead. 2.5 ns/pair absolute (4.3× slower than linear). Negligible for top-K hybrid scoring.
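A sketch of the fusion shape. The parameter values below are illustrative placeholders, not the tuned ones behind the 3.16× figure, and the sign convention is simplified:

```rust
fn eml(x: f64, y: f64) -> f64 {
    x.exp() - y.ln()
}

// Hypothetical fused score: eml(a·vector + b, c·bm25 + d).
fn fused(vector_sim: f64, bm25: f64) -> f64 {
    let (a, b, c, d) = (1.0, 0.0, 1.0, 1.0); // illustrative, untuned
    eml(a * vector_sim + b, c * bm25 + d)
}

fn main() {
    // Exponential sensitivity to vector similarity...
    let dv = (fused(1.0, 5.0) - fused(0.9, 5.0)).abs();
    // ...versus logarithmic sensitivity to the BM25 term.
    let dk = (fused(0.9, 6.0) - fused(0.9, 5.0)).abs();
    // The asymmetry linear fusion cannot express: an equal-sized input
    // change moves the score more through the vector channel.
    assert!(dv > dk);
    println!("Δvector = {dv:.3}, Δbm25 = {dk:.3}");
}
```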
Evidence Detail
Micro-evidence (tests/eml_proof.rs — 8 integration tests, all passing)
| Claim | Measurement |
|---|---|
| LogQuantized MSE (half-normal / ReLU) | 20.9% lower than Scalar |
| LogQuantized MSE (exponential) | 52.5% lower than Scalar |
| LogQuantized MSE (log-normal) | 5.1% lower than Scalar |
| Unified distance correctness | max diff < 3.34e-6 across 1,710 vector pairs |
| EML tree on exp(x) | ~2 billion× lower test-MSE than linear model |
| EML score fusion asymmetry | 3.16× vector/keyword ratio |
| Zero-padding semantic neutrality | proven for Euclidean / Cosine / DotProduct / Manhattan |
End-to-End real-data evidence (SIFT1M — the headline win)
Dataset: SIFT1M from Texmex corpus, 100K base × 500 queries × 128D, Euclidean.
HNSW: M=16, ef_construction=200. Three seeds [42, 1337, 2024].
| Metric | Baseline (Scalar+Dispatch) | LogQuantized (Log+Unified SIMD) | Delta |
|---|---|---|---|
| QPS ef=64 | 2,946 ± 300 | 3,358 ± 262 | +14.0% ⭐ |
| QPS ef=256 | 1,604 ± 63 | 1,653 ± 94 | +3.0% |
| Recall@1 (ef=256) | 0.9807 ± 0.0057 | 0.9880 ± 0.0016 | +0.75% ⭐ |
| Recall@10 | 0.9843 | 0.9853 | +0.1% (~same) |
| Recall@100 | 0.9785 | 0.9785 | ~0% |
| Build time | 39.0 s | 37.7 s | −3.3% |
| p50 ef=64 | 337 µs | 298 µs | −11.5% (better) |
| p95 ef=64 | 480 µs | 417 µs | −13.1% (better) |
| p99 ef=64 | 538 µs | 504 µs | −6.3% (better) |
| p99.9 ef=64 | 658 µs ± 147 | 882 µs ± 551 | +34.0% (high stddev) |
| p50 ef=256 | 618 µs | 591 µs | −4.4% |
| p95/p99 ef=256 | — | — | ≈baseline ±10% |
| p99.9 ef=256 | 1,190 µs | 1,628 µs | +36.9% (tail noise) |
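For reference, the Texmex .fvecs container holding these vectors is simple: each vector is a little-endian i32 dimension followed by that many little-endian f32 components. A minimal standalone parser sketch (illustrative; the branch ships its own loader in benches/eml_end_to_end.rs):

```rust
// Parse an in-memory .fvecs buffer. Assumes well-formed input:
// [i32 dim][dim × f32] repeated, all little-endian.
fn parse_fvecs(bytes: &[u8]) -> Vec<Vec<f32>> {
    let mut out = Vec::new();
    let mut i = 0;
    while i + 4 <= bytes.len() {
        let d = i32::from_le_bytes(bytes[i..i + 4].try_into().unwrap()) as usize;
        i += 4;
        let mut v = Vec::with_capacity(d);
        for _ in 0..d {
            v.push(f32::from_le_bytes(bytes[i..i + 4].try_into().unwrap()));
            i += 4;
        }
        out.push(v);
    }
    out
}

fn main() {
    // Encode [1.0, 2.0] and [3.0] in .fvecs layout, then parse back.
    let mut bytes = Vec::new();
    for v in [vec![1.0_f32, 2.0], vec![3.0_f32]] {
        bytes.extend_from_slice(&(v.len() as i32).to_le_bytes());
        for x in &v {
            bytes.extend_from_slice(&x.to_le_bytes());
        }
    }
    let parsed = parse_fvecs(&bytes);
    assert_eq!(parsed, vec![vec![1.0, 2.0], vec![3.0]]);
    println!("parsed {} vectors", parsed.len());
}
```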
End-to-End real-data evidence (GloVe-100d — the honest regression)
Dataset: GloVe-100d from Stanford NLP, 100K base × 500 queries × 100D, Cosine.
HNSW: identical config.
| Metric | Baseline | Log+Unified | Delta |
|---|---|---|---|
| QPS ef=64 | 2,483 | 2,429 | −2.2% (~same) |
| QPS ef=256 | 1,198 | 1,073 | −10.4% WORSE ⚠️ |
| Recall@1 | 0.9740 | 0.9713 | −0.27% (noise) |
| Recall@10 | 0.9698 | 0.9705 | +0.1% |
| Recall@100 | 0.9374 | 0.9375 | ~0% |
Root cause: 100D is not a multiple of 8, so SimSIMD runs a scalar tail-loop on every distance call. Cosine normalization also dispatches to the same simsimd::cosine() function in both baseline and unified paths, so the theoretical branch-free advantage has no room to manifest.
v4 padding hypothesis test (honest disproof)
We added pad_to_power_of_two: bool to UnifiedDistanceParams and HnswIndex::new_unified_padded(), zero-padding 100D → 104D per-call via thread-local scratch. Re-ran the same GloVe benchmark:
| GloVe Δ vs baseline | v3 unpadded | v4 padded | Interpretation |
|---|---|---|---|
| QPS ef=64 | −2.2% | −24.6% WORSE | padding hurt |
| QPS ef=256 | −10.4% | −23.4% WORSE | padding hurt |
| Build time | +3.3% | +31.7% WORSE | padding hurt |
Why: per-call padding path = RefCell::borrow_mut() × 2 + clear() × 2 + extend_from_slice(100) × 2 (400-byte memcpy each) + resize(104) × 2 + SimSIMD call. That's ~60–100 cycles overhead per distance call vs ~5–10 cycles saved on the SimSIMD tail. Net loss. HNSW does thousands of distance calls per query, so the loss amplifies.
Correct fix (scoped as follow-up, not in this PR): pad vectors once at add() / search() time, store padded in HNSW, let compute_raw() see pre-aligned slices with zero per-call overhead. The pad_to_power_of_two API hook is shipped so this follow-up lands without a public API change. A TODO(future) comment near UnifiedDistanceParams::with_padding() records this.
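The two strategies can be contrasted in a few lines of illustrative standalone code (function names are hypothetical, not the branch's API):

```rust
use std::cell::RefCell;

thread_local! {
    // Per-call scratch buffer — the approach v4 tested, and that lost.
    static PAD: RefCell<Vec<f32>> = RefCell::new(Vec::new());
}

// Per-call padding: borrow + clear + copy + resize on EVERY distance call.
fn pad_per_call(a: &[f32], target: usize, kernel: impl Fn(&[f32]) -> f32) -> f32 {
    PAD.with(|buf| {
        let mut buf = buf.borrow_mut();
        buf.clear();
        buf.extend_from_slice(a);
        buf.resize(target, 0.0);
        kernel(&buf)
    })
}

// Pad-at-insert: pay the cost once when the vector enters the index.
fn pad_at_insert(mut v: Vec<f32>, target: usize) -> Vec<f32> {
    v.resize(target, 0.0);
    v
}

fn main() {
    // Stand-in "kernel": sum of squares over the padded slice.
    let kernel = |v: &[f32]| v.iter().map(|x| x * x).sum::<f32>();
    let a = vec![1.0_f32; 100]; // 100D, like GloVe
    let per_call = pad_per_call(&a, 104, kernel);
    let at_insert = kernel(&pad_at_insert(a.clone(), 104));
    // Zero-padding is semantically neutral: both give the same answer...
    assert_eq!(per_call, at_insert);
    // ...but per-call repeats the buffer work on every one of the thousands
    // of distance calls per query; pad-at-insert pays once per vector.
    println!("identical result: {per_call}");
}
```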
API Surface (all opt-in; defaults unchanged)
```rust
// New variant in existing enum — backward compatible
pub enum QuantizationConfig {
    None, Scalar, Product { ... }, Binary,
    Log, // ← NEW
}

// New constructors alongside existing HnswIndex::new()
impl HnswIndex {
    pub fn new(...)                // UNCHANGED — default
    pub fn new_unified(...)        // ← NEW: branch-free kernel
    pub fn new_unified_padded(...) // ← NEW: + zero-padding (experimental, see v4)
}

// New type in advanced::eml
pub struct UnifiedDistanceParams {
    pub negate: bool,
    pub normalize: bool,
    pub use_abs_diff: bool,
    pub use_sq_diff: bool,
    pub apply_sqrt: bool,
    pub pad_to_power_of_two: bool, // ← NEW (default false)
}

impl UnifiedDistanceParams {
    pub fn euclidean() -> Self { ... }
    pub fn cosine() -> Self { ... }
    pub fn dot_product() -> Self { ... }
    pub fn manhattan() -> Self { ... }
    pub fn from_metric(DistanceMetric) -> Self { ... }
    pub fn with_padding(self, bool) -> Self { ... } // ← NEW
    pub fn compute(&self, &[f32], &[f32]) -> f32 { ... }
    pub fn batch_compute(...) -> Vec<f32> { ... }
    pub fn batch_compute_parallel(...) -> Vec<f32> { ... } // feature = "parallel"
}

// Plus: EmlTree, EmlModel, EmlScoreFusion, TrainConfig, eml(), eml_safe()
// all under `ruvector_core::advanced::eml`, opt-in usage.
```
Zero breaking changes. Default VectorDB::new(DbOptions::default()) still produces ScalarQuantized + match-dispatched distance + standard HnswIndex::new(). Users opt into each EML feature explicitly.
Test & Validation Coverage
| Suite | Count | Result |
|---|---|---|
| EML unit tests (advanced::eml::tests) | 27 | ✅ all pass (21 original + 6 new padding correctness) |
| Quantization tests (incl. LogQuantized) | 18 | ✅ all pass |
| HNSW tests | 7 | ✅ all pass |
| Integration proofs (tests/eml_proof.rs) | 8 | ✅ all pass |
| Total across all affected modules | 60+ | ✅ all pass |
Micro-benchmarks: cargo bench -p ruvector-core --bench eml_bench — 6 Criterion groups.
End-to-end: cargo bench -p ruvector-core --bench eml_end_to_end — fast Criterion subset runs in ~1 min; full real-dataset proof in ~15 min.
Reproducibility (maintainer can re-run from scratch)
```sh
# Check out the branch
git fetch https://github.com/shaal/ruvector.git issue/eml
git checkout -b review/eml-issue-351 FETCH_HEAD

# Pre-cache datasets (one-time, ~1GB; gitignored)
mkdir -p bench_data && cd bench_data
curl -fLO ftp://ftp.irisa.fr/local/texmex/corpus/sift.tar.gz && tar xzf sift.tar.gz
curl -fLO https://nlp.stanford.edu/data/glove.6B.zip && unzip -o glove.6B.zip glove.6B.100d.txt
cd ..

# Run correctness tests (< 1 min)
cargo test -p ruvector-core --lib --release -- advanced::eml:: quantization:: index::hnsw::
cargo test -p ruvector-core --test eml_proof --release -- --nocapture

# Micro-benchmarks (~5 min)
cargo bench -p ruvector-core --bench eml_bench

# End-to-end real-dataset proof (v3 table, ~15 min at 100K)
EML_FULL_PROOF=1 EML_REAL_DATASETS=1 EML_PROOF_N=100000 EML_PROOF_Q=500 \
  cargo bench -p ruvector-core --bench eml_end_to_end -- eml_e2e_full_proof

# v4 padding test on GloVe only (~5 min)
EML_FULL_PROOF=1 EML_REAL_DATASETS=1 EML_SYNTHETIC_DATASETS=0 EML_SKIP_SIFT1M=1 \
  EML_PAD_UNIFIED=1 EML_PROOF_N=100000 EML_PROOF_Q=500 \
  cargo bench -p ruvector-core --bench eml_end_to_end -- eml_e2e_full_proof
```
Each benchmark prints a Markdown table + inline CSV to stderr, identical format to the committed proof reports.
Files Changed (on branch issue/eml)
| File | Role |
|---|---|
| src/advanced/eml.rs | Core EML operator, EmlTree, training, SIMD UnifiedDistanceParams with padding, EmlScoreFusion |
| src/quantization.rs | LogQuantized type with SIMD distance |
| src/types.rs | QuantizationConfig::Log variant + doc comment with power-of-two caveat + all proof refs |
| src/index/hnsw.rs | HnswDistanceFn enum, DistanceStrategy, new_unified{_padded}() |
| benches/eml_bench.rs | 6 Criterion micro-benchmark groups |
| benches/eml_end_to_end.rs | End-to-end ANN proof: .fvecs + GloVe loaders, Dataset abstraction, env-gated proof runner |
| tests/eml_proof.rs | 8 integration proof tests |
| docs/adr/ADR-033-eml-operator-optimizations.md | Architecture decision record |
| docs/adr/DDD-eml-bounded-context.md | Domain-driven design context |
| docs/benchmarks/BENCHMARKING_GUIDE.md | New section + headline numbers + v1–v4 references |
| bench_results/eml_proof_2026-04-14{,_v2,_v3,_v4}.md | Four proof reports (evidence chain) |
| .gitignore | /bench_data/ excluded |
Commit history (4 commits on the branch, ready for cherry-pick or squash-merge)
```
9bc8d165 docs: align types.rs + BENCHMARKING_GUIDE.md with v4 finding + TODO
9cdc6907 feat: real-dataset ANN proof (SIFT1M + GloVe) + dimension padding option
ab18edf9 feat: end-to-end ANN proof for EML optimizations + SIMD port
5a4b5f8e feat: add EML operator-inspired optimizations for quantization, distance, and learned indexes
```
Known Follow-up Work (explicitly NOT in this PR)
- Pad-at-insert HNSW variant. Closes the GloVe regression if the hypothesis holds at the insert layer instead of the per-distance-call layer. The API hook (pad_to_power_of_two flag + new_unified_padded()) is already shipped, so this lands without a public API change. Estimated scope: change HnswIndex::add_batch() and search_with_ef() to pad once; ~100 LOC.
- Full 1M SIFT1M run. Current validation uses the 100K subset for a ~15-min run; 1M would take ~3 hours but confirm the win scales.
- ANN-Benchmarks protocol on Deep1M / MS MARCO / Fashion-MNIST. Broader corpus coverage before declaring the wins universally production-ready.
- Faiss / pgvector comparative benchmarks. Cross-library validation.
None of these are blocking for the current PR — they're future work the proof chain identifies, not gaps in the current evidence.
What to Review
For the maintainer, the minimum viable review path is:
- Read the TL;DR above — is the SIFT1M headline win credible and valuable?
- Open the v3 report — scan the Markdown table for SIFT1M; verify the reproducibility instructions make sense.
- Open the v4 report — verify the honest regression disclosure for GloVe is adequate and the pad-at-insert follow-up plan is sound.
- Skim types.rs around line 105 — verify the QuantizationConfig::Log doc carries the caveat clearly.
- Verify defaults are unchanged — DbOptions::default() still produces ScalarQuantized; no existing user gets silently opted into new behavior.
If those five checks are satisfied, this can merge. If any concerns surface, happy to address them directly in comments or spin up additional benchmarks.
References