A lightweight vector database written in Go, inspired by Pinecone. It provides HNSW-based approximate nearest neighbor search, metadata filtering, disk persistence, and OpenAI embedding integration.
- HNSW Index - Approximate nearest neighbor search with roughly O(log n) query time
- Multiple Distance Metrics - Cosine similarity (default), Euclidean, Dot product
- Metadata Filtering - MongoDB-style query operators
- Disk Persistence - Binary format with automatic index serialization
- OpenAI Integration - Text-to-embedding with text-embedding-3-small
- REST API - Full CRUD and query endpoints
- CLI - Command-line interface for all operations
# Clone and build
git clone <repo-url>
cd Go-Vector-Database
go build -o vecdb ./cmd/vecdb
# Configure OpenAI API key (required for text embeddings)
echo "OPENAI_API_KEY=your-api-key-here" > .env

./vecdb serve
# Server starts on http://localhost:8080

# Insert with text (auto-embeds via OpenAI)
curl -X POST http://localhost:8080/vectors/text \
-H "Content-Type: application/json" \
-d '{
"id": "doc1",
"text": "Machine learning is a subset of artificial intelligence",
"metadata": {"category": "tech", "author": "john"}
}'
# Insert with raw vector values
curl -X POST http://localhost:8080/vectors \
-H "Content-Type: application/json" \
-d '{
"vectors": [
{"id": "vec1", "values": [0.1, 0.2, ...], "metadata": {"type": "embedding"}}
]
}'

# Query with text
curl -X POST http://localhost:8080/query/text \
-H "Content-Type: application/json" \
-d '{
"text": "AI and deep learning",
"top_k": 5,
"include_metadata": true
}'
# Query with filters
curl -X POST http://localhost:8080/query/text \
-H "Content-Type: application/json" \
-d '{
"text": "programming languages",
"top_k": 10,
"filter": {"category": "tech"},
"include_metadata": true
}'

curl -X POST http://localhost:8080/save

| Endpoint | Method | Description |
|---|---|---|
| /health | GET | Health check and stats |
| /vectors | POST | Upsert vectors (batch) |
| /vectors/text | POST | Upsert single vector with text |
| /vectors/{id} | GET | Get vector by ID |
| /vectors/{id} | DELETE | Delete vector |
| /query | POST | Query with vector values |
| /query/text | POST | Query with text |
| /stats | GET | Database statistics |
| /save | POST | Persist to disk |
| /load | POST | Load from disk |
| /clear | POST | Clear all vectors |
# Start HTTP server
./vecdb serve [-port 8080] [-host 0.0.0.0]
# Insert vector with text
./vecdb insert -id doc1 -text "Hello world" -metadata '{"key": "value"}'
# Insert vector with values
./vecdb insert -id vec1 -values "0.1,0.2,0.3,..."
# Query with text
./vecdb query -text "search query" -k 5
# Query with values
./vecdb query -values "0.1,0.2,0.3,..." -k 5
# Show statistics
./vecdb stats
# Show version
./vecdb version

Supports MongoDB-style operators:
// Comparison
{"field": {"$eq": "value"}}
{"field": {"$ne": "value"}}
{"field": {"$gt": 10}}
{"field": {"$gte": 10}}
{"field": {"$lt": 20}}
{"field": {"$lte": 20}}
// Array
{"field": {"$in": ["a", "b", "c"]}}
{"field": {"$nin": ["x", "y"]}}
// Logical
{"$and": [{"field1": "a"}, {"field2": "b"}]}
{"$or": [{"field1": "a"}, {"field2": "b"}]}
// String
{"field": {"$contains": "substring"}}
{"field": {"$startswith": "prefix"}}
{"field": {"$endswith": "suffix"}}
// Existence
{"field": {"$exists": true}}

Environment variables (or .env file):

| Variable | Default | Description |
|---|---|---|
| OPENAI_API_KEY | - | OpenAI API key for embeddings |
| OPENAI_MODEL | text-embedding-3-small | Embedding model |
| SERVER_PORT | 8080 | HTTP server port |
| SERVER_HOST | 0.0.0.0 | HTTP server host |
| DATA_DIR | ./data | Data directory |
| DATABASE_FILE | vecdb.dat | Database filename |
| HNSW_M | 16 | HNSW connections per node |
| HNSW_EF_CONSTRUCTION | 200 | HNSW construction parameter |
| HNSW_EF_SEARCH | 50 | HNSW search parameter |
.
├── cmd/vecdb/main.go # CLI entry point
├── pkg/
│ ├── api/ # REST API server
│ ├── config/ # Configuration
│ ├── database/ # Database orchestrator
│ ├── embedding/ # OpenAI client
│ ├── index/ # HNSW algorithm
│ └── storage/ # Vector storage & persistence
├── test/ # Tests
├── go.mod
└── .env
- Embedding: Text is converted to 1536-dimensional vectors via OpenAI's embedding API
- Indexing: Vectors are inserted into an HNSW (Hierarchical Navigable Small World) graph
- Search: Queries traverse the graph greedily from the top layer down to find approximate nearest neighbors, in roughly O(log n) time on average
- Filtering: Metadata filters are applied post-search to refine results
- Persistence: Binary format stores vectors + serialized HNSW graph
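
The filtering step can be pictured as a predicate that each search hit's metadata must pass. The sketch below evaluates the MongoDB-style operators listed earlier against a metadata map. It is an illustration under stated assumptions, not the project's implementation: the names `Match`, `applyOp`, `equal`, and `norm` are hypothetical, and the handling of missing fields (a missing field satisfies `$ne` and `$nin`) follows MongoDB's behavior rather than anything confirmed for this codebase.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// norm converts ints to float64 so JSON-decoded numbers and Go literals compare alike.
func norm(v any) any {
	if n, ok := v.(int); ok {
		return float64(n)
	}
	return v
}

// equal compares two metadata values after numeric normalization.
func equal(a, b any) bool { return reflect.DeepEqual(norm(a), norm(b)) }

// applyOp evaluates one operator against a metadata value.
// Assumption: a missing field satisfies $ne, $nin, and {"$exists": false} only.
func applyOp(op string, val any, present bool, arg any) bool {
	switch op {
	case "$exists":
		return present == (arg == true)
	case "$eq", "$ne":
		return (present && equal(val, arg)) == (op == "$eq")
	case "$in", "$nin":
		list, _ := arg.([]any)
		found := false
		for _, item := range list {
			found = found || (present && equal(val, item))
		}
		return found == (op == "$in")
	case "$gt", "$gte", "$lt", "$lte":
		x, okX := norm(val).(float64)
		y, okY := norm(arg).(float64)
		if !okX || !okY {
			return false
		}
		return map[string]bool{"$gt": x > y, "$gte": x >= y, "$lt": x < y, "$lte": x <= y}[op]
	case "$contains", "$startswith", "$endswith":
		s, okS := val.(string)
		sub, okSub := arg.(string)
		if !okS || !okSub {
			return false
		}
		return map[string]bool{
			"$contains":   strings.Contains(s, sub),
			"$startswith": strings.HasPrefix(s, sub),
			"$endswith":   strings.HasSuffix(s, sub),
		}[op]
	}
	return false // unknown operators never match
}

// Match reports whether meta satisfies every condition in filter.
func Match(meta, filter map[string]any) bool {
	for key, cond := range filter {
		val, present := meta[key]
		ops, isOps := cond.(map[string]any)
		switch {
		case key == "$and" || key == "$or":
			subs, _ := cond.([]any)
			matched := 0
			for _, s := range subs {
				if sub, ok := s.(map[string]any); ok && Match(meta, sub) {
					matched++
				}
			}
			if (key == "$and" && matched != len(subs)) || (key == "$or" && matched == 0) {
				return false
			}
		case !isOps: // a bare value means implicit equality
			if !present || !equal(val, cond) {
				return false
			}
		default:
			for op, arg := range ops {
				if !applyOp(op, val, present, arg) {
					return false
				}
			}
		}
	}
	return true
}

func main() {
	meta := map[string]any{"category": "tech", "year": 2021}
	filter := map[string]any{
		"category": "tech",
		"year":     map[string]any{"$gte": 2020},
	}
	fmt.Println(Match(meta, filter))
}
```

Because a filter of this kind runs after the graph search, it can only discard candidates the search already returned.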
go test ./... -v

License: MIT