dhaya/ctxengine

vsearch

Minimal vector search stack:

  • server (Go): HTTP API for training, adding vectors, and searching.
  • faiss (Python/FastAPI): FAISS index service (GPU-first).

Configuration is TOML (config/vsearch.toml) and is mounted into containers at /etc/vsearch/vsearch.toml.

Architecture

client (embeddings) ──HTTP──> server (Go) ──HTTP──> faiss service (FastAPI + FAISS)
  • The system is vector-in / vector-out. There is no embedding model or document ingestion yet.
  • Index parameters live under [index] in config/vsearch.toml and are consumed by the FAISS service.

Run (container-first)

Prereqs:

  • Docker + Docker Compose v2
  • NVIDIA Container Toolkit + a GPU-capable runtime (required for the faiss service today)

Start the stack:

docker compose -f docker/compose.yaml up --build

No local GPU? You can still run only the Go API container and point it at a remote FAISS service through the [faiss] section of config/vsearch.toml:

docker compose -f docker/compose.yaml up --no-deps --build server

Endpoints (default):

  • API server: http://localhost:8080
  • FAISS service: http://localhost:50051

Health:

curl -sS http://localhost:8080/health
curl -sS http://localhost:50051/health
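
Right after `docker compose up`, the two health endpoints may refuse connections for a few seconds. The sketch below polls both until they answer, using only the Python standard library. It assumes that a healthy service replies to GET /health with a 2xx status, which this README does not state; the function names `http_probe` and `wait_healthy` are illustrative and not part of the repo.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Iterable

# Default endpoints from this README; adjust if you changed the port mappings.
HEALTH_URLS = (
    "http://localhost:8080/health",   # Go API server
    "http://localhost:50051/health",  # FAISS service
)


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if GET url answers with a 2xx status (assumed to mean healthy)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx HTTP error.
        return False


def wait_healthy(
    urls: Iterable[str] = HEALTH_URLS,
    probe: Callable[[str], bool] = http_probe,
    attempts: int = 30,
    delay: float = 1.0,
) -> bool:
    """Poll the URLs until all have answered healthy or the attempts run out."""
    pending = list(urls)
    for attempt in range(attempts):
        # Keep only the URLs that are still failing.
        pending = [u for u in pending if not probe(u)]
        if not pending:
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling `wait_healthy()` with no arguments checks both default endpoints for up to roughly 30 seconds. The `probe` parameter is injectable so the retry loop can be exercised without a running stack.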

Configuration

Single source of truth: config/vsearch.toml.

Key sections:

  • [server]: bind address, timeouts
  • [faiss]: where the Go server reaches the FAISS service
  • [index]: FAISS index shape and search params (dimension, metric, nlist, nprobe)

Notes:

  • index.use_gpu is not yet wired into the FAISS service; the service always attempts GPU initialization regardless of this setting.
  • server.max_concurrent_requests and [metrics] are present in config but not enforced/exposed in the Go server yet.

HTTP API

The Go server exposes:

  • GET /health
  • POST /v1/train (train the IVF index)
  • POST /v1/vectors (add vectors with external IDs)
  • POST /v1/search (kNN search)

Every vector in a request payload must contain exactly index.dimension elements.
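
A client can check the dimension rule before sending anything. The helper below is a minimal sketch of such a check; `check_vectors` is a hypothetical name, and how the server itself reports a mismatch is not documented in this README.

```python
from typing import Sequence


def check_vectors(vectors: Sequence[Sequence[float]], dimension: int) -> None:
    """Raise ValueError if any vector does not have exactly `dimension` elements."""
    for i, vec in enumerate(vectors):
        if len(vec) != dimension:
            raise ValueError(
                f"vectors[{i}] has length {len(vec)}, expected {dimension}"
            )
```

For example, `check_vectors([[0.0, 0.0, 0.0, 0.0]], 4)` passes silently, while a three-element vector against the same dimension raises `ValueError`.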

Smoke test (small vectors)

For a quick manual test, temporarily set the following in config/vsearch.toml:

  • index.dimension = 4
  • index.nlist = 1
  • index.nprobe = 1

Then restart the containers and run the three steps below in order (a FAISS IVF index must be trained before vectors can be added):

Train:

curl -sS -X POST http://localhost:8080/v1/train \
  -H 'Content-Type: application/json' \
  -d '{"vectors":[[0.0,0.0,0.0,0.0],[1.0,1.0,1.0,1.0],[2.0,2.0,2.0,2.0]]}'

Add:

curl -sS -X POST http://localhost:8080/v1/vectors \
  -H 'Content-Type: application/json' \
  -d '{"vectors":[[0.1,0.1,0.1,0.1],[1.1,1.1,1.1,1.1]],"ids":["a","b"]}'

Search:

curl -sS -X POST http://localhost:8080/v1/search \
  -H 'Content-Type: application/json' \
  -d '{"vectors":[[0.0,0.0,0.0,0.0]],"k":2}'
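
The same three calls can be scripted from Python with the standard library. This is a sketch, not a client shipped with the repo: the function names are hypothetical, the payloads mirror the curl commands above, and the response bodies are printed raw because their shape is not documented in this README.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # default API server address from this README


def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for one of the /v1 endpoints."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def smoke_requests() -> list:
    """Return the train, add and search requests from the smoke test, in order."""
    return [
        build_request("/v1/train", {"vectors": [[0.0] * 4, [1.0] * 4, [2.0] * 4]}),
        build_request(
            "/v1/vectors",
            {"vectors": [[0.1] * 4, [1.1] * 4], "ids": ["a", "b"]},
        ),
        build_request("/v1/search", {"vectors": [[0.0] * 4], "k": 2}),
    ]


def run_smoke_test() -> None:
    """Send each request in order and print the raw response body."""
    for req in smoke_requests():
        with urllib.request.urlopen(req, timeout=10) as resp:
            print(req.full_url, resp.status, resp.read().decode("utf-8"))
```

With the stack running and index.dimension set to 4, calling `run_smoke_test()` sends the requests in the required order; `urlopen` raises `urllib.error.HTTPError` if any step returns a non-2xx status.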

Repo layout

  • cmd/vsearch/: Go entrypoint
  • internal/server/: HTTP routes and request/response shapes
  • internal/faiss/: Go client for the FAISS service
  • faiss_service/: Python FastAPI service hosting the FAISS index
  • config/vsearch.toml: runtime config (mounted into containers)
  • docker/: Dockerfiles and docker/compose.yaml

Current status (important gaps)

  • The FAISS index is in-memory only; there is no persistence/snapshots.
  • No ingestion pipeline: PDFs/HTML/code parsing, chunking, embeddings, dedup, and backfills are not implemented.
  • No authn/z, multitenancy, quotas, or schema/versioning for stored content.
  • Metrics are not exposed from the Go server (package exists but is unused).

Next phase

See PHASE2_PLAN.md.

About

GPU-accelerated vector retrieval server with FAISS and Go
