CREATE VECTOR INDEX DSL has no quantization params — engine supports SQ8/PQ/IVF-PQ but unreachable from SQL #25

@hollanf

Summary

docs/vectors.md advertises four index configurations (HNSW FP32, HNSW+SQ8, HNSW+PQ, IVF-PQ) and the engine implements them, but the SQL DDL for CREATE VECTOR INDEX accepts no quantization parameters and silently passes empty/zero defaults to the executor. Result: every vector index created via SQL is full-FP32 HNSW, the most memory-hungry configuration, with no SQL way to opt into the other documented tiers.

Severity

High — every NodeDB deployment that uses vectors via SQL pays the worst-case RAM overhead. From the docs table:

Tier           Memory / 384d vector   Recall     Saving vs FP32
HNSW (FP32)    ~1.5 KB                ~99%       (baseline)
HNSW + SQ8     ~384 B                 ~98%       4×
HNSW + PQ      ~96 B                  ~95%       16×
IVF-PQ         ~16 B                  ~85-95%    96×

For 1024-dim Voyage/OpenAI embeddings the FP32 cost is ~4 KB/vec. A 1M-row collection sits at ~4 GB resident. Users have no SQL path to the documented sub-1KB tiers.
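
The arithmetic behind these figures can be sketched as below. This is a rough illustration only: the helper names are hypothetical, it counts raw vector payload (no HNSW graph links, codebooks, or centroids), and it assumes 4 bytes per FP32 component and 1 byte per SQ8 component.

```rust
/// Raw payload of one FP32 vector: 4 bytes per dimension (assumption: no padding).
fn fp32_bytes(dim: u64) -> u64 {
    dim * 4
}

/// Raw payload of one SQ8 vector: 1 byte per dimension (assumption).
fn sq8_bytes(dim: u64) -> u64 {
    dim
}

/// How many times smaller a compressed vector is than its FP32 form.
fn saving_factor(dim: u64, compressed_bytes: u64) -> f64 {
    fp32_bytes(dim) as f64 / compressed_bytes as f64
}

/// Resident payload for a whole collection stored as FP32.
fn fp32_collection_bytes(dim: u64, rows: u64) -> u64 {
    fp32_bytes(dim) * rows
}
```

With these helpers, a 384-dim vector comes to 1536 bytes in FP32 and 384 bytes in SQ8, and a 1M-row collection of 1024-dim vectors comes to 4,096,000,000 bytes, which matches the ~4 GB estimate above.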

Repro

CREATE COLLECTION articles TYPE document;
CREATE VECTOR INDEX idx_articles_embedding ON articles METRIC cosine DIM 1024;

-- Tried (expected per docs/vectors.md tiers):
CREATE VECTOR INDEX idx ON articles METRIC cosine DIM 1024 INDEX_TYPE hnsw_pq PQ_M 8;
CREATE VECTOR INDEX idx ON articles METRIC cosine DIM 1024 INDEX_TYPE ivf_pq IVF_CELLS 256 IVF_NPROBE 16;

-- Both ignored — parser doesn't recognize INDEX_TYPE/PQ_M/IVF_CELLS/IVF_NPROBE.
-- Always falls through to default HNSW FP32.

Root cause

In nodedb/src/control/server/pgwire/ddl/dsl.rs:

  • Line 229 declares syntax: CREATE VECTOR INDEX <name> ON <coll> [METRIC ...] [M <m>] [EF_CONSTRUCTION <ef>] [DIM <dim>] — no quantization params at all.
  • Lines 253-256 parse only metric / m / ef_construction / dim.
  • Lines 269-279 dispatch VectorOp::SetParams with hardcoded zeros:
    index_type: String::new(),
    pq_m: 0,
    ivf_cells: 0,
    ivf_nprobe: 0,
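
One possible shape for the missing parsing step is sketched below. It is not the code in dsl.rs: the QuantParams struct, the parse_quant_params function, the whitespace tokenization, and the usize field types are all assumptions for illustration. The keyword spellings are the ones tried in the repro above.

```rust
/// Hypothetical container for the four fields currently hardcoded to empty/zero.
#[derive(Debug, Default, PartialEq)]
struct QuantParams {
    index_type: String,
    pq_m: usize,
    ivf_cells: usize,
    ivf_nprobe: usize,
}

/// Scan a CREATE VECTOR INDEX statement for optional quantization keywords.
/// Keywords that are absent keep their empty/zero value, so the executor's
/// existing fallbacks would still apply.
fn parse_quant_params(sql: &str) -> Result<QuantParams, String> {
    let tokens: Vec<&str> = sql.split_whitespace().collect();
    let mut out = QuantParams::default();
    let mut i = 0;
    while i < tokens.len() {
        let key = tokens[i].to_ascii_uppercase();
        match key.as_str() {
            "INDEX_TYPE" | "PQ_M" | "IVF_CELLS" | "IVF_NPROBE" => {
                let raw = tokens
                    .get(i + 1)
                    .ok_or_else(|| format!("{key} requires a value"))?;
                // Tolerate a trailing statement terminator on the last value.
                let value = raw.trim_end_matches(';');
                if key == "INDEX_TYPE" {
                    out.index_type = value.to_ascii_lowercase();
                } else {
                    let n: usize = value
                        .parse()
                        .map_err(|_| format!("{key} expects an integer, got {value:?}"))?;
                    match key.as_str() {
                        "PQ_M" => out.pq_m = n,
                        "IVF_CELLS" => out.ivf_cells = n,
                        _ => out.ivf_nprobe = n,
                    }
                }
                i += 2; // skip the keyword and its value
            }
            _ => i += 1, // not a quantization keyword
        }
    }
    Ok(out)
}
```

Returning an error for a malformed value, rather than falling back to zero, would avoid repeating the silent-ignore behaviour this issue describes.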

Downstream, the engine handler at nodedb/src/data/executor/handlers/vector.rs:293 calls IndexType::parse(""), which returns Some(IndexType::Hnsw) (the default), and stores the config. The pq_m/ivf_cells/ivf_nprobe zero-fallbacks at lines 317-319 even auto-default to 8 / 256 / 16, meaning the only piece the SQL surface needs to wire is index_type; the rest already has sensible defaults baked in.
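
The fallback behaviour described above can be modelled roughly as follows. This is a reconstruction from the description, not the handler's actual code: the resolve and or_default helpers are hypothetical, and the accepted string spellings ("hnsw", "hnsw_pq", "ivf_pq") are assumptions. Only the empty-string default, the enum variants, and the 8 / 256 / 16 defaults come from the text.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum IndexType {
    Hnsw,
    HnswPq,
    IvfPq,
}

impl IndexType {
    /// Empty input maps to the default HNSW index; unknown names are rejected.
    /// The non-empty spellings are assumed, not confirmed by the source.
    fn parse(s: &str) -> Option<IndexType> {
        match s.to_ascii_lowercase().as_str() {
            "" | "hnsw" => Some(IndexType::Hnsw),
            "hnsw_pq" => Some(IndexType::HnswPq),
            "ivf_pq" => Some(IndexType::IvfPq),
            _ => None,
        }
    }
}

/// Treat zero as "not specified" and substitute the default.
fn or_default(value: usize, default: usize) -> usize {
    if value == 0 { default } else { value }
}

/// Resolve raw SetParams fields into an effective configuration:
/// (index type, pq_m, ivf_cells, ivf_nprobe).
fn resolve(
    index_type: &str,
    pq_m: usize,
    ivf_cells: usize,
    ivf_nprobe: usize,
) -> Option<(IndexType, usize, usize, usize)> {
    let kind = IndexType::parse(index_type)?;
    Some((
        kind,
        or_default(pq_m, 8),
        or_default(ivf_cells, 256),
        or_default(ivf_nprobe, 16),
    ))
}
```

Under this model, the values the SQL DSL sends today (empty string and three zeros) resolve to an HNSW index with the default quantization parameters, while a non-empty index_type alone would be enough to select another tier.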

The native NDB protocol does propagate index_type (nodedb/src/control/server/native/dispatch/plan_builder/vector.rs:125-128), so the gap is purely in the SQL DSL.

Pointers

  • docs/vectors.md — advertises tiers users cannot reach
  • nodedb-vector/src/index_config.rs:7-15 — IndexType enum (Hnsw / HnswPq / IvfPq)
  • nodedb-vector/src/index_config.rs:30-41 — IndexConfig (pq_m, ivf_cells, ivf_nprobe)
  • nodedb-vector/src/quantize/sq8.rs — SQ8 codec
  • nodedb-vector/src/quantize/pq.rs — PQ codec
  • nodedb-vector/src/ivf.rs — IVF-PQ index
  • nodedb-vector/src/collection/segment.rs:45 — `Optional SQ8 quantized vectors for accelerated traversal`
  • nodedb/src/control/server/pgwire/ddl/dsl.rs:229-300 — DSL parser (where new keywords would land)
  • nodedb/src/control/server/pgwire/ddl/dsl.rs:269-279 — hardcoded zeros sent to SetParams
  • nodedb/src/control/server/native/dispatch/plan_builder/vector.rs:125-128 — native protocol already exposes index_type
  • nodedb/src/data/executor/handlers/vector.rs:293-320 — engine handler that already understands the params
