wisvec is an embedded vector database for Go applications.
It is designed to be used as a package (no external service required), with durable local storage and OpenAI-compatible embedding support.
- Embedded usage in your Go process
- Document CRUD with vector and text search
- Incremental HNSW index + exact fallback
- Hybrid retrieval (dense + sparse + RRF)
- Persistent storage (manifest + WAL + snapshots)
- Segment lifecycle (hot/cold tiers, disk chunks, coarse routing)
- Unified initialization config (`DBOptions`)
- Go 1.21+
- An OpenAI-compatible embeddings endpoint (`POST /v1/embeddings`)
```
go get github.com/jacoblai/wisvec
```

```go
package main

import (
	"context"
	"fmt"
	"log"
	"os"

	vecdb "github.com/jacoblai/wisvec"
)

func main() {
	db, err := vecdb.OpenWithModelDirOptions("./data-bge-m3-lan", "bge-m3-lan", vecdb.ModelDirOptions{
		BaseURL: "http://192.168.2.76:8081/v1",
		APIKey:  os.Getenv("OPENAI_API_KEY"), // optional for local llama-server
		Model:   "bge-m3",
	})
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	coll, err := db.GetOrCreateCollection("docs", map[string]string{"env": "demo"})
	if err != nil {
		log.Fatal(err)
	}

	_ = coll.AddDocuments(context.Background(), []*vecdb.Document{
		{ID: "1", Content: "Go context controls cancellation and deadlines."},
		{ID: "2", Content: "Redis is often used for caching hot keys."},
	})

	results, err := coll.SearchByText(context.Background(), "golang cancellation", 3)
	if err != nil {
		log.Fatal(err)
	}
	for i, r := range results {
		fmt.Printf("%d) %s (%.3f)\n", i+1, r.ID, r.Similarity)
	}
}
```

```
go run ./examples/native
```

With explicit flags:
```
go run ./examples/native \
  -model_dir bge-m3-lan \
  -data_dir ./data-bge-m3-lan \
  -base_url http://192.168.2.76:8081/v1 \
  -api_key "" \
  -model bge-m3 \
  -collection documents
```

For package users, prefer configuring everything at initialization time:
```go
opts := &vecdb.DBOptions{
	Index: vecdb.IndexConfig{
		EfSearch: 160,
		M:        32,
	},
	Collection: vecdb.CollectionRuntimeOptions{
		Search: vecdb.CollectionSearchConfig{
			HybridRRFK: 80,
		},
		SegmentOptimizeEvery:   32,
		OptimizerMaxQueue:      128,
		DiskRouteCacheEntries:  512,
		TelemetrySnapshotEvery: 128,
		CheckpointEvery:        512,
		WALSyncEveryOps:        32,
	},
}
db, err := vecdb.OpenWithOptions("./data", embedder, opts)
```

If you open via `OpenWithModelDirOptions`, pass the same runtime options through `ModelDirOptions.DBOptions`.
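For example (a configuration sketch; it assumes the `DBOptions` field on `ModelDirOptions` named above, and that `DefaultDBOptions()` returns options you can adjust before opening):

```go
opts := vecdb.DefaultDBOptions()
opts.Index.EfSearch = 160

db, err := vecdb.OpenWithModelDirOptions("./data-bge-m3-lan", "bge-m3-lan", vecdb.ModelDirOptions{
	BaseURL:   "http://192.168.2.76:8081/v1",
	Model:     "bge-m3",
	DBOptions: opts,
})
```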
Use `DBOptions` for unified initialization-time configuration.

| Field | Default | Typical range | Impact |
|---|---|---|---|
| `Index.EfSearch` | 120 | 80-400 | Higher values usually improve ANN recall but increase query latency/CPU. |
| `Index.M` | 24 | 16-64 | Higher values increase graph connectivity and memory usage; may improve recall. |
| `Collection.Search.HybridRRFK` | 60 | 30-120 | Controls dense+sparse fusion smoothness; lower values favor top-ranked inputs more aggressively. |
| `Collection.Search.HybridDenseBudgetMultiplier` | 2.0 | 1.0-4.0 | In hybrid mode, enlarges the dense candidate pool before fusion; higher cost, potentially better recall. |
| `Collection.Search.HybridSparseBudgetMultiplier` | 2.0 | 1.0-4.0 | In hybrid mode, enlarges the sparse candidate pool before fusion; useful for keyword-heavy workloads. |
| `Collection.Search.EnableDiskCoarseRouting` | true | true/false | Enables segment/chunk coarse routing for cold data; usually reduces cold-scan latency. |
| `Collection.Search.DiskMaxSegmentsToScan` | 4 | 1-16 | Caps how many cold segments are scanned per query; lower is faster, higher may improve recall. |
| `Collection.Search.DiskMaxChunksPerSegment` | 4 | 1-16 | Caps chunk reads per selected cold segment; lower is faster, higher may improve recall. |
| `Collection.Search.EnablePQExactScan` | true | true/false | Enables PQ coarse ranking in the exact path; lowers memory/CPU pressure on large sets. |
| `Collection.Search.PQSubspaces` | 8 | divisors of vector dims | PQ granularity; must divide the embedding dimension. |
| `Collection.Search.PQCentroids` | 16 | 8-256 | PQ codebook size; larger improves approximation quality but increases training cost. |
| `Collection.Search.PQMinTrainDocs` | 64 | 64-10k+ | Minimum documents required to activate PQ training for a collection. |
| `Collection.Search.PQRerankMultiplier` | 12 | 4-32 | Number of PQ candidates kept for exact rerank; higher improves recall but costs CPU. |
| `Collection.Search.EnableSQ8ExactScan` | true | true/false | Enables the SQ8 coarse path for the exact-search fallback. |
| `Collection.Search.SQ8RerankMultiplier` | 8 | 4-32 | SQ8 coarse candidate expansion before exact rerank. |
| `Collection.SegmentOptimizeEvery` | 16 | 8-256 | Frequency of optimizer trigger checks on the write path. |
| `Collection.OptimizerMaxQueue` | 64 | 32-1024 | Caps in-memory optimizer tasks per collection. |
| `Collection.DiskRouteCacheEntries` | 256 | 64-4096 | Cache size for the query-to-cold-segment route shortlist; improves repeated-query latency. |
| `Collection.TelemetrySnapshotEvery` | 64 | 16-1024 | Query-count interval for persisting a telemetry snapshot to metadata. |
| `Collection.CheckpointEvery` | 256 | 64-4096 | Write-operation interval for checkpoint snapshot + WAL reset. |
| `Collection.WALSyncEveryOps` | 16 | 1-128 | WAL fsync batching interval; lower is safer, higher usually increases write throughput. |
Notes:
- Start from `DefaultDBOptions()` and tune one knob at a time.
- For production changes, benchmark on representative data and query mixes.
- If recall drops after aggressive latency tuning, first increase rerank multipliers and ANN parameters.
- `OpenWithOptions(...)` / `OpenWithModelDirOptions(...)`
- `GetOrCreateCollection(name, metadata)`
- `AddDocuments(ctx, docs)`
- `DeleteDocuments(ctx, ids...)`
- `SearchByText(ctx, query, limit)`
- `SearchByVector(vector, limit)`
- `RebuildIndex(collection)`
- Writes are serialized by a bounded per-collection worker.
- Durable order: WAL append -> apply -> periodic checkpoint.
- Recovery: load manifest -> load snapshot -> replay WAL.
- Query planner selects ANN / iterative filtered ANN / exact / hybrid.
- Cold segments are chunked on disk with coarse routing to reduce scan cost.
One DB root directory corresponds to one embedder identity.
- Use different `rootDir` values for different embedding models.
- Reopening the same `rootDir` with a different embedder identity is rejected.
- Error messages include a suggested model-specific directory suffix (for example `./data-bge-m3-lan`).
- Collection data is stored under a sharded path: `collections/<bucket>/<collection-name>`.
- This improves directory scaling as the collection count grows.
- During active development, if you upgrade across incompatible storage revisions, delete the old data directories and start fresh.
- Directory strategy
  - Use a dedicated data directory per environment and model namespace.
  - Keep `rootDir` on reliable local disk (fast SSD preferred).
- Initialization config
  - Set runtime options via `OpenWithOptions` (or `ModelDirOptions.DBOptions`) at startup.
  - Avoid mutating behavior through ad-hoc runtime patches in app code.
- Throughput/latency tuning
  - Start with `DefaultDBOptions()` and tune gradually: `Index.EfSearch`, `Index.M`, `Collection.Search.HybridRRFK`, `Collection.SegmentOptimizeEvery`, `Collection.CheckpointEvery`, `Collection.WALSyncEveryOps`.
- Resilience
  - Validate restart/recovery in staging: stop the process abruptly, then verify data and query behavior after reopening.
  - Monitor WAL growth and checkpoint cadence with your service metrics.
- Observability
  - Read collection metrics (`Collection.Metrics()`) and track query count, hybrid/sparse usage, and budget trends.
  - Keep telemetry snapshots enabled (`TelemetrySnapshotEvery`) to preserve a baseline across restarts.
- Upgrade safety
  - Run `go test ./...` in CI before deployment.
  - For major config changes, replay a representative workload in staging first.
- `embeddings request failed: status=...`
  - Check `BaseURL`, `APIKey`, and model compatibility.
- `embeddings response size mismatch`
  - The embedding API returned a vector count different from the input size.
- `db embedder mismatch: existing=... current=...`
  - Open with a different `rootDir` for this model (or remove the old dev data directory and recreate it).
Run all unit/functional tests:

```
go test ./...
```

Run live integration tests against your OpenAI-compatible endpoint:

```
go test -tags live ./...
```