Motivation
Issues #100 and #147 request GPU-accelerated indexing. zvec already has standalone GPU backends (FAISS GPU, cuVS CAGRA/IVF-PQ, Apple MPS) but they're not integrated with the Collection API — users can't do collection.query() with GPU acceleration.
Proposed approach
1. UnifiedGpuIndex ABC (python/zvec/backends/unified.py)
Abstract base with train() + add() + search() + size() + backend_name, plus 6 adapters:
| Adapter | Wraps | Priority |
|---|---|---|
| `CppCuvsAdapter` | Native `_zvec` pybind11 (zero-copy) | 1 (highest) |
| `CuvsCAGRAAdapter` | `cuvs.neighbors.cagra` | 2 |
| `CuvsIvfPqAdapter` | `cuvs.neighbors.ivf_pq` | 3 |
| `FaissGpuAdapter` | `backends/gpu.py::GPUIndex` | 4 |
| `AppleMpsAdapter` | `backends/apple_silicon.py` | 5 |
| `FaissCpuAdapter` | FAISS CPU fallback | 6 |
C++ native path is preferred (avoids Python→GPU data copies).
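To make the interface concrete, here is a minimal sketch of what the abstract base and one adapter might look like. The method signatures, the `(index, distance)` return shape, and the `BruteForceCpuAdapter` class are all assumptions for illustration; the real adapters wrap FAISS, cuVS, or MPS rather than a pure-Python scan.

```python
from abc import ABC, abstractmethod
import heapq
import math


class UnifiedGpuIndex(ABC):
    """Hypothetical shape of the shared interface; signatures are assumed."""

    backend_name: str = "abstract"

    @abstractmethod
    def train(self, vectors: list[list[float]]) -> None: ...

    @abstractmethod
    def add(self, vectors: list[list[float]]) -> None: ...

    @abstractmethod
    def search(self, query: list[float], topk: int) -> list[tuple[int, float]]: ...

    @abstractmethod
    def size(self) -> int: ...


class BruteForceCpuAdapter(UnifiedGpuIndex):
    """Illustrative stand-in for a CPU fallback: exact L2 scan, no FAISS."""

    backend_name = "bruteforce_cpu"

    def __init__(self) -> None:
        self._vectors: list[list[float]] = []

    def train(self, vectors: list[list[float]]) -> None:
        # A flat (exact) index has nothing to learn, so training is a no-op.
        return None

    def add(self, vectors: list[list[float]]) -> None:
        self._vectors.extend(list(v) for v in vectors)

    def search(self, query: list[float], topk: int) -> list[tuple[int, float]]:
        # Score every stored vector, then keep the topk smallest distances.
        scored = ((i, math.dist(query, v)) for i, v in enumerate(self._vectors))
        return heapq.nsmallest(topk, scored, key=lambda pair: pair[1])

    def size(self) -> int:
        return len(self._vectors)


index = BruteForceCpuAdapter()
index.train([])
index.add([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])
hits = index.search([0.9, 0.1], topk=2)  # nearest first: rows 1, then 0
```

Because callers only see `train()`, `add()`, `search()`, `size()`, and `backend_name`, swapping one adapter for another should not require changes on the calling side.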
2. GpuIndex bridge (python/zvec/gpu_index.py)
Connects GPU search to Collection:
```python
gpu = collection.gpu_index("embedding")
gpu.build(vectors, ids)
docs = gpu.query(query_vector, topk=10, output_fields=["title"])
# Returns list[Doc], the same format as collection.query()
```

Flow: GPU search → map indices to doc IDs → collection.fetch() → attach scores → return list[Doc].
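The flow can be sketched end to end with stand-ins. Everything here is hypothetical: `Doc` is a simplified dataclass rather than zvec's real class, `FakeCollection.fetch()` assumes a dict-returning signature, and `search_fn` stands in for whichever GPU backend is active.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Simplified stand-in for zvec's Doc; the real class may differ."""
    id: str
    score: float = 0.0
    fields: dict = field(default_factory=dict)


class FakeCollection:
    """Stand-in for a Collection that exposes only fetch() (assumed signature)."""

    def __init__(self, rows):
        self._rows = rows

    def fetch(self, ids, output_fields=None):
        out = {}
        for doc_id in ids:
            row = self._rows[doc_id]
            keep = row if output_fields is None else {k: row[k] for k in output_fields}
            out[doc_id] = Doc(id=doc_id, fields=dict(keep))
        return out


def query_via_gpu(search_fn, ids, collection, query_vector, topk, output_fields=None):
    # 1. GPU search returns (row_index, score) pairs, best match first.
    hits = search_fn(query_vector, topk)
    # 2. Map row indices back to the doc IDs supplied at build time.
    hit_ids = [ids[i] for i, _ in hits]
    # 3. Fetch the matching documents from the collection.
    fetched = collection.fetch(hit_ids, output_fields=output_fields)
    # 4. Attach scores while preserving the ranking order of the search.
    docs = []
    for i, score in hits:
        doc = fetched[ids[i]]
        doc.score = score
        docs.append(doc)
    return docs


def fake_search(query_vector, topk):
    # Pretend the GPU ranked row 1 ahead of row 0.
    return [(1, 0.1), (0, 0.7)][:topk]


collection = FakeCollection({
    "a": {"title": "Alpha", "body": "x"},
    "b": {"title": "Beta", "body": "y"},
})
docs = query_via_gpu(fake_search, ["a", "b"], collection, [0.0], topk=2,
                     output_fields=["title"])
```

The positional `ids` list is what ties a GPU row index back to a document, which is why the index and the IDs have to be built together in this design.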
3. Backend detection (python/zvec/backends/detect.py)
Adds C++ cuVS and Python cuVS detection with proper priority chain.
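A priority chain of this kind could be sketched as an ordered list of probes, where the first one that succeeds wins. The backend labels, the `detect_backend` and `module_available` helpers, and the assumption that the native extension is importable as a top-level `_zvec` module are all illustrative; the actual `detect.py` logic may differ.

```python
from importlib.util import find_spec


def module_available(name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    try:
        return find_spec(name) is not None
    except (ImportError, ValueError):
        return False


# Ordered from highest to lowest priority, loosely mirroring the adapter table.
# These probes only check that a module can be found; a real detector would
# also need to confirm that a usable GPU device is present.
DEFAULT_PROBES = [
    ("cpp_cuvs", lambda: module_available("_zvec")),    # assumed module name
    ("cuvs_python", lambda: module_available("cuvs")),
    ("faiss", lambda: module_available("faiss")),
]


def detect_backend(probes=DEFAULT_PROBES, fallback="cpu"):
    """Return the label of the first probe that succeeds, else the fallback."""
    for name, probe in probes:
        if probe():
            return name
    return fallback


# Injecting probes makes the priority order easy to exercise without a GPU.
chosen = detect_backend([
    ("cpp_cuvs", lambda: False),
    ("cuvs_python", lambda: True),
    ("faiss", lambda: True),
])
```

Keeping the order in a single list makes the answer to "which backend wins" a matter of list position, which may simplify reconciling the new cuVS entries with the existing detection.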
Tested on RTX 4090
| Backend | QPS (50K vectors, dim=128) |
|---|---|
| FAISS GPU (flat) | 529,316 |
| cuVS IVF-PQ | 45,771 |
| cuVS CAGRA | 43,711 |
Questions for maintainers
- Is
collection.gpu_index(field_name)the right API surface, or would you prefer a different entry point? - The current design requires explicit
gpu.build(vectors, ids)since_Collectionhas no scan/iterate API. Would you consider exposing a bulk vector extraction API from C++? - Any preference on how the C++ cuVS priority should interact with the existing backend detection?