This project implements a REST API for a Vector Database using FastAPI, with a Rust (PyO3) core for Brute Force, IVF Flat, and IVF PQ indexing plus an official Python SDK. It supports CRUD for libraries, documents, and chunks, indexing, and kNN search. Data is stored in-memory with RW locks for concurrency and persisted to disk as numpy snapshots. No external vector DB libs used, as per guidelines.
Key features:
- Fixed schemas with Pydantic validation and Value Objects for invariants (non-empty IDs, valid embeddings).
- Three indexing algorithms implemented in Rust and exposed to Python via PyO3: Brute Force, IVF Flat, and IVF PQ with documented complexities.
- Thread-safe operations via per-library RWLock.
- Dockerized API.
- Python SDK for easy interaction.
- On-disk snapshots (`data/<library_id>/`) with an atomic manifest plus NumPy arrays (`ids.npy`, `vectors.npy`, etc.) and metadata indices.
Extra: Domain-Driven Design elements (VO, services decoupling), FP style where applicable (itertools for traversals, but kept simple).
Use uv for dependency management (fast, modern alternative to pip/poetry).
- Install uv: `curl -LsSf https://astral.sh/uv/install.sh | sh`, or see the uv installation docs for Windows or installation with pip.
- Sync all packages: `uv sync --all-packages`
  - This installs every workspace member (`api`, `core`, `sdk`, `rindex`, `dashboard`) and keeps their lockfile in sync.
  - Re-run after adding dependencies or pulling changes that modify `pyproject.toml` / `uv.lock`.
- Build the Rust core library (rindex) locally: `uv run poe rindex-develop`
- (Optional) Install pre-commit: `uv run pre-commit install`
Run from root: uv run poe <task>
Core workspace tasks (root pyproject.toml):
- `format`: Format code with Ruff.
- `lint`: Lint and auto-fix with Ruff.
- `typecheck`: Static typing with basedpyright.
- `test`: Execute the aggregated test suite (`core`, `api`, `sdk`).
- `pre-commit`: Convenience wrapper for lint → format → tests → typecheck.
- `all`: Runs `format`, `lint`, `typecheck`, `test` sequentially.
Package-specific tasks (invoked with uv run poe <task> as well, thanks to the included configuration):
- `rindex-develop`: Build the PyO3 extension in editable mode (`packages/rindex`).
- `api-test`, `core-test`, `sdk-test`: Scoped pytest runs per package.
- `dashboard`: Launch the Streamlit UI (`packages/streamlit`).
Discover the full list at any time: uv run poe --list
For watch mode typecheck: uv run poe typecheck --watch
Use the existing Poe task so the right module path and reload flags stay in sync:
`uv run poe api-serve`
Open http://localhost:8000/docs for Swagger UI.
Build the image (multi-stage, ~1 GB final size; this can be reduced with `cargo build --release`, a refined `.dockerignore`, and further optimizations):
`docker build -t vector-db-api .`
Run the container, exposing the FastAPI service on port 8000:
`docker run --rm -p 8000:8000 vector-db-api`
The image already contains the compiled rindex extension and all Python workspace dependencies (including NumPy). Snapshots created by the API are written inside the container under `/app/data`; mount a host volume if you need persistence between runs.
Spin up a local UI that exercises the API through the SDK.
- Start the API (see above).
- Run the dashboard: `uv run poe dashboard`
- Point it to your API base URL (defaults to `http://localhost:8000`).
- The UI lets you create/delete libraries, documents, and chunks, rebuild indexes, and run searches.
- Embeddings are generated automatically with the `sentence-transformers/all-MiniLM-L6-v2` model (cached locally on first use).
- It surfaces API errors inline (e.g., duplicate IDs, empty indexes), making it easy to confirm validation flows without crafting HTTP calls manually.
Implemented three algorithms in Rust (PyO3 bindings) without external vector DB libs:
- Brute Force:
  - Build: O(1), just stores the list.
  - Query: O(N * d) time (N chunks, d dimensions), O(N) space.
  - Exact, simple baseline. Good for small N.
- IVF Flat (Inverted File Flat):
  - Build: O(N log N) via K-Means clustering into n_lists centroids.
  - Query: O(n_probes * list_size * d) on average, O(N) worst case; probes the nearest centroids and scans their lists exactly.
  - Space: O(N * d).
  - Uses K-Means to partition vectors into inverted lists, with exact search within the probed lists. Configurable n_lists (default 16) and n_probes (default ~4). Good for medium-large N; balances speed and exactness.
- IVF PQ (Inverted File with Product Quantization):
  - Build: O(N log N + N * num_subvectors * num_codewords) for K-Means + PQ training.
  - Query: approximately O(n_probes * list_size * num_subvectors), using precomputed codes and codebooks for fast distance estimation.
  - Space: O(N * num_subvectors * log2(num_codewords)) bits for the codes, plus centroids/codebooks; highly compressed.
  - Combines IVF clustering with PQ: residuals are quantized into one code per subvector (default num_subvectors=4, num_codewords=16). Approximate, with tunable precision vs. space. Ideal for high-dimensional vectors (e.g., 768) and large N.
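To make the difference between the exhaustive scan and the probed scan concrete, here is a minimal pure-Python sketch. It is not the Rust implementation: the function names, the toy Lloyd-style K-Means, and the L2 metric are assumptions chosen for illustration only.

```python
import math
import random


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_search(vectors, query, k):
    """Exact kNN: score every stored vector, O(N * d)."""
    scored = sorted((l2(v, query), i) for i, v in enumerate(vectors))
    return [i for _, i in scored[:k]]


def assign(vectors, centroids):
    """Put each vector index into the inverted list of its nearest centroid."""
    lists = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: l2(v, centroids[c]))
        lists[nearest].append(i)
    return lists


def build_ivf(vectors, n_lists, iters=10, seed=0):
    """Toy K-Means (hypothetical helper) returning centroids and inverted lists."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, n_lists)]
    for _ in range(iters):
        lists = assign(vectors, centroids)
        for c, members in enumerate(lists):
            if members:  # keep the previous centroid if a list ends up empty
                dim = len(centroids[c])
                centroids[c] = [
                    sum(vectors[i][j] for i in members) / len(members)
                    for j in range(dim)
                ]
    return centroids, assign(vectors, centroids)


def ivf_search(vectors, centroids, lists, query, k, n_probes):
    """Scan only the n_probes lists whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))
    candidates = [i for c in order[:n_probes] for i in lists[c]]
    scored = sorted((l2(vectors[i], query), i) for i in candidates)
    return [i for _, i in scored[:k]]
```

Probing every list (n_probes equal to n_lists) reproduces the brute-force result; lowering n_probes reduces the number of vectors scanned and can miss true neighbours that fall into unprobed lists.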
Trade-offs: Brute Force for accuracy on small data and as a reference; IVF Flat for faster queries that stay exact within the probed lists; IVF PQ for scalable approximate search with a small index size. All indices expose save/load to persist snapshots. Python facades map string IndexVectorID values to sequential integers (stored in id_map.json) before delegating to Rust.
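The string-to-integer mapping done by the Python facades could look roughly like the sketch below. The `IdMap` class, its method names, and the JSON layout (a flat object from string ID to integer) are hypothetical; the actual facade and the real shape of id_map.json may differ.

```python
import json
from pathlib import Path


class IdMap:
    """Hypothetical helper that assigns sequential integers to string IDs."""

    def __init__(self):
        self._to_int = {}
        self._to_str = []

    def add(self, vector_id: str) -> int:
        """Return the integer for vector_id, allocating the next one if new."""
        if vector_id not in self._to_int:
            self._to_int[vector_id] = len(self._to_str)
            self._to_str.append(vector_id)
        return self._to_int[vector_id]

    def to_str(self, index: int) -> str:
        """Translate an integer returned by the index back to the string ID."""
        return self._to_str[index]

    def save(self, path) -> None:
        # Assumed layout: {"<string id>": <int>, ...}
        Path(path).write_text(json.dumps(self._to_int))

    @classmethod
    def load(cls, path) -> "IdMap":
        id_map = cls()
        mapping = json.loads(Path(path).read_text())
        # Re-insert in integer order so every ID gets the same integer back.
        for vector_id, _ in sorted(mapping.items(), key=lambda kv: kv[1]):
            id_map.add(vector_id)
        return id_map
```

Keeping the integers dense and sequential means the Rust side can treat them as plain array offsets, while search results are translated back to string IDs with `to_str`.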
- Each library snapshot lives in `data/<library_id>/` and contains:
  - `manifest.json` (written by the Rust core) describing index type, metric, params, version.
  - NumPy arrays (`ids.npy`, `vectors.npy`, `centroids.npy`, etc.) for index payloads.
  - `id_map.json` (string -> integer mapping used inside Rust indices).
  - `metadata.json` (vector -> metadata/doc metadata dump) and `meta_index.json` (inverted metadata index for quick filtering).
- `IndexingService.build_index` saves a fresh snapshot; `ensure_index` lazily reloads from disk when registry state is missing or outdated.
- Snapshots are written atomically (`manifest.json` renamed last) so partially written indexes are avoided.
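The "manifest renamed last" idea can be sketched in a few lines of Python. In this project the manifest is written by the Rust core, so the code below only illustrates the pattern: `write_snapshot`, `snapshot_is_complete`, and the payload file name used in the usage note are hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path


def write_snapshot(directory, payloads: dict[str, bytes], manifest: dict) -> None:
    """Write payload files first, then publish manifest.json via an atomic rename."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    for name, data in payloads.items():
        (directory / name).write_bytes(data)
    # The temp file lives in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(manifest, handle)
        handle.flush()
        os.fsync(handle.fileno())
    # os.replace is atomic: readers see the old manifest or the new one, never half of it.
    os.replace(tmp_path, directory / "manifest.json")


def snapshot_is_complete(directory) -> bool:
    """A snapshot counts as usable only once its manifest exists."""
    return (Path(directory) / "manifest.json").exists()
```

A call such as `write_snapshot("data/lib-1", {"vectors.bin": b"..."}, {"index_type": "brute"})` leaves either no manifest (if interrupted) or a complete one, so a loader that checks for `manifest.json` first never picks up a half-written index.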
Selected per library via index_type; default brute.
- InMemoryLibraryRepo uses RWLock per library ID.
- Reads (get, search): acquire_read (multiple concurrent).
- Writes (create/update/delete, index): acquire_write (exclusive, waits for active readers to finish).
- Ensures no races on shared library state.
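A readers-writer lock with these semantics can be sketched with `threading.Condition`. The method names `acquire_read` and `acquire_write` come from the description above; exposing them as context managers, the `lock_for` registry helper, and the lack of writer preference are assumptions of this sketch, not a description of the project's RWLock.

```python
import threading
from contextlib import contextmanager


class RWLock:
    """Minimal readers-writer lock: many concurrent readers or one writer."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    @contextmanager
    def acquire_read(self):
        with self._cond:
            while self._writer:  # readers only wait for an active writer
                self._cond.wait()
            self._readers += 1
        try:
            yield
        finally:
            with self._cond:
                self._readers -= 1
                if self._readers == 0:
                    self._cond.notify_all()

    @contextmanager
    def acquire_write(self):
        with self._cond:
            while self._writer or self._readers > 0:  # wait for exclusivity
                self._cond.wait()
            self._writer = True
        try:
            yield
        finally:
            with self._cond:
                self._writer = False
                self._cond.notify_all()


_registry_guard = threading.Lock()
_locks = {}


def lock_for(library_id: str) -> RWLock:
    """Hypothetical registry: one RWLock per library ID."""
    with _registry_guard:
        if library_id not in _locks:
            _locks[library_id] = RWLock()
        return _locks[library_id]
```

Because this sketch gives writers no priority, a steady stream of readers can delay a writer indefinitely; a production lock would usually add a "writers waiting" counter to avoid that.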
- Value Objects: ID (non-empty str), Embedding (non-empty numbers, normalizable).
- Invariants/Preconditions: Pydantic validators (e.g., lib name non-empty, valid chunks).
- Entities: Library/Document/Chunk with identity.
- Services: Decouple API from logic (IndexingService, QueryService).
- Callee decides: expose primitives (e.g., raw embeddings) alongside VO.
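The Value Object invariants listed above can be illustrated with frozen dataclasses. The project itself uses Pydantic validators; this sketch swaps in the standard library to stay dependency-free, and the field names, error messages, and `normalized` method are assumptions.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ID:
    """Value Object: a non-empty string identifier."""

    value: str

    def __post_init__(self):
        if not isinstance(self.value, str) or not self.value.strip():
            raise ValueError("ID must be a non-empty string")


@dataclass(frozen=True)
class Embedding:
    """Value Object: a non-empty vector of finite numbers."""

    values: tuple[float, ...]

    def __post_init__(self):
        values = tuple(float(v) for v in self.values)
        if not values:
            raise ValueError("Embedding must not be empty")
        if not all(math.isfinite(v) for v in values):
            raise ValueError("Embedding values must be finite")
        # Frozen dataclasses need object.__setattr__ to store the coerced tuple.
        object.__setattr__(self, "values", values)

    def normalized(self) -> "Embedding":
        """Return a unit-length copy; a zero vector cannot be normalized."""
        norm = math.sqrt(sum(v * v for v in self.values))
        if norm == 0.0:
            raise ValueError("cannot normalize a zero vector")
        return Embedding(tuple(v / norm for v in self.values))
```

Validating in the constructor means any `ID` or `Embedding` that exists is already valid, so services can accept them without re-checking.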
- uv: Fast resolver/installer, workspace support for monorepo.
- ruff: Extremely fast linter and formatter that combines flake8, isort, pycodestyle, pyflakes, and pylint rules in one tool.
- pytest: De facto standard for testing in Python.
- basedpyright: For type checking (stricter than mypy, faster, no Node.js needed via wheel). ty or pyrefly may become better options once their conformance improves.
  - Mode: "recommended" (strict but practical).
  - Stricter checks catch more errors.
  - Watch mode: `uv run poe typecheck --watch` for live feedback.
  - Prefer `# pyright: ignore` over `# type: ignore` for specificity.
  - Vs mypy: See Pyright vs Mypy.
- poethepoet: Task runner, since the Python ecosystem has no default one; developers use it to run the project's defined tasks and scripts.
- In-memory storage with on-disk snapshots for persistence (data/<library_id>/ with manifest.json, NumPy arrays, and metadata indices).
- Basic IVF PQ (approximate distances via decoding; no advanced techniques such as ADC lookup tables or OPQ).
- No auth (extensible via headers).
- Tests: Unit (indices/services), integration (API via TestClient).
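The "approximate distances via decoding" approach noted for IVF PQ can be sketched as follows: the residual (vector minus its coarse centroid) is split into subvectors, each subvector is replaced by the index of its nearest codeword, and at query time the vector is reconstructed from the codes before measuring distance. All function names are hypothetical, the codebooks are assumed to be already trained, and the Rust core may organise this differently.

```python
import math


def sq_l2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def split(vector, num_subvectors):
    """Cut a vector into equally sized contiguous subvectors."""
    if len(vector) % num_subvectors != 0:
        raise ValueError("dimension must be divisible by num_subvectors")
    size = len(vector) // num_subvectors
    return [vector[i * size:(i + 1) * size] for i in range(num_subvectors)]


def encode(vector, centroid, codebooks):
    """Quantize the residual: one codeword index per subvector."""
    residual = [v - c for v, c in zip(vector, centroid)]
    parts = split(residual, len(codebooks))
    return [
        min(range(len(book)), key=lambda j: sq_l2(part, book[j]))
        for part, book in zip(parts, codebooks)
    ]


def reconstruct(centroid, codes, codebooks):
    """Decode: concatenate the chosen codewords and add the centroid back."""
    residual = []
    for code, book in zip(codes, codebooks):
        residual.extend(book[code])
    return [c + r for c, r in zip(centroid, residual)]


def approx_distance(query, centroid, codes, codebooks):
    """Distance between the query and the decoded (lossy) vector."""
    return math.sqrt(sq_l2(query, reconstruct(centroid, codes, codebooks)))
```

Decoding costs O(d) per candidate; an ADC-style variant would instead precompute per-subvector distance tables once per query and sum table lookups per candidate.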