wisvec

wisvec is an embedded vector database for Go applications.

It is designed to be used as a package (no external service required), with durable local storage and OpenAI-compatible embedding support.

Features

  • Embedded usage in your Go process
  • Document CRUD with vector and text search
  • Incremental HNSW index + exact fallback
  • Hybrid retrieval (dense + sparse + RRF)
  • Persistent storage (manifest + WAL + snapshots)
  • Segment lifecycle (hot/cold tiers, disk chunks, coarse routing)
  • Unified initialization config (DBOptions)

Requirements

  • Go 1.21+
  • An OpenAI-compatible embeddings endpoint (POST /v1/embeddings)

Install

go get github.com/jacoblai/wisvec

Quick Start (5 minutes)

package main

import (
	"context"
	"fmt"
	"log"
	"os"

	vecdb "github.com/jacoblai/wisvec"
)

func main() {
	db, err := vecdb.OpenWithModelDirOptions("./data-bge-m3-lan", "bge-m3-lan", vecdb.ModelDirOptions{
		BaseURL: "http://192.168.2.76:8081/v1",
		APIKey:  os.Getenv("OPENAI_API_KEY"), // optional for local llama-server
		Model:   "bge-m3",
	})
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	coll, err := db.GetOrCreateCollection("docs", map[string]string{"env": "demo"})
	if err != nil {
		log.Fatal(err)
	}

	err = coll.AddDocuments(context.Background(), []*vecdb.Document{
		{ID: "1", Content: "Go context controls cancellation and deadlines."},
		{ID: "2", Content: "Redis is often used for caching hot keys."},
	})
	if err != nil {
		log.Fatal(err)
	}

	results, err := coll.SearchByText(context.Background(), "golang cancellation", 3)
	if err != nil {
		log.Fatal(err)
	}
	for i, r := range results {
		fmt.Printf("%d) %s (%.3f)\n", i+1, r.ID, r.Similarity)
	}
}

Use the Example App

go run ./examples/native

With explicit flags:

go run ./examples/native \
  -model_dir bge-m3-lan \
  -data_dir ./data-bge-m3-lan \
  -base_url http://192.168.2.76:8081/v1 \
  -api_key "" \
  -model bge-m3 \
  -collection documents

Unified Configuration (recommended)

For package users, prefer configuring everything at initialization time:

opts := &vecdb.DBOptions{
	Index: vecdb.IndexConfig{
		EfSearch: 160,
		M:        32,
	},
	Collection: vecdb.CollectionRuntimeOptions{
		Search: vecdb.CollectionSearchConfig{
			HybridRRFK: 80,
		},
		SegmentOptimizeEvery:   32,
		OptimizerMaxQueue:      128,
		DiskRouteCacheEntries:  512,
		TelemetrySnapshotEvery: 128,
		CheckpointEvery:        512,
		WALSyncEveryOps:        32,
	},
}
db, err := vecdb.OpenWithOptions("./data", embedder, opts)

If you open via OpenWithModelDirOptions, pass the same runtime options through ModelDirOptions.DBOptions.
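For example (a sketch; it assumes ModelDirOptions carries a DBOptions field as described above, and that DefaultDBOptions returns a ready-to-edit options value — endpoint details are the same placeholders as in Quick Start):

```go
opts := vecdb.DefaultDBOptions()
opts.Collection.WALSyncEveryOps = 32 // trade a little durability latency for write throughput

db, err := vecdb.OpenWithModelDirOptions("./data-bge-m3-lan", "bge-m3-lan", vecdb.ModelDirOptions{
	BaseURL:   "http://192.168.2.76:8081/v1",
	Model:     "bge-m3",
	DBOptions: opts,
})
```

This keeps all tuning in one place at startup, which matches the "unified initialization" recommendation above.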

Configuration Reference

Use DBOptions for unified initialization-time configuration.

| Field | Default | Typical range | Impact |
| --- | --- | --- | --- |
| Index.EfSearch | 120 | 80-400 | Higher value usually improves ANN recall but increases query latency/CPU. |
| Index.M | 24 | 16-64 | Higher value increases graph connectivity and memory usage; may improve recall. |
| Collection.Search.HybridRRFK | 60 | 30-120 | Controls dense+sparse fusion smoothness; lower values favor top-ranked inputs more aggressively. |
| Collection.Search.HybridDenseBudgetMultiplier | 2.0 | 1.0-4.0 | In hybrid mode, increases dense candidate pool before fusion; higher cost, potentially better recall. |
| Collection.Search.HybridSparseBudgetMultiplier | 2.0 | 1.0-4.0 | In hybrid mode, increases sparse candidate pool before fusion; useful for keyword-heavy workloads. |
| Collection.Search.EnableDiskCoarseRouting | true | true/false | Enables segment/chunk coarse routing for cold data; usually reduces cold-scan latency. |
| Collection.Search.DiskMaxSegmentsToScan | 4 | 1-16 | Caps how many cold segments are scanned per query; lower is faster, higher may improve recall. |
| Collection.Search.DiskMaxChunksPerSegment | 4 | 1-16 | Caps chunk reads per selected cold segment; lower is faster, higher may improve recall. |
| Collection.Search.EnablePQExactScan | true | true/false | Enables PQ coarse ranking in exact path; lowers memory/CPU pressure on large sets. |
| Collection.Search.PQSubspaces | 8 | divisors of vector dims | PQ granularity; must divide the embedding dimension. |
| Collection.Search.PQCentroids | 16 | 8-256 | PQ codebook size; larger improves approximation quality but increases training cost. |
| Collection.Search.PQMinTrainDocs | 64 | 64-10k+ | Minimum docs required to activate PQ training for a collection. |
| Collection.Search.PQRerankMultiplier | 12 | 4-32 | Number of PQ candidates kept for exact rerank; higher improves recall, increases CPU. |
| Collection.Search.EnableSQ8ExactScan | true | true/false | Enables SQ8 coarse path for exact search fallback. |
| Collection.Search.SQ8RerankMultiplier | 8 | 4-32 | SQ8 coarse candidate expansion before exact rerank. |
| Collection.SegmentOptimizeEvery | 16 | 8-256 | Frequency of optimizer trigger checks on write path. |
| Collection.OptimizerMaxQueue | 64 | 32-1024 | Caps in-memory optimizer tasks per collection. |
| Collection.DiskRouteCacheEntries | 256 | 64-4096 | Cache size for query->cold-segment route shortlist; improves repeated-query latency. |
| Collection.TelemetrySnapshotEvery | 64 | 16-1024 | Query count interval for persisting telemetry snapshot to metadata. |
| Collection.CheckpointEvery | 256 | 64-4096 | Write operation interval for checkpoint snapshot + WAL reset. |
| Collection.WALSyncEveryOps | 16 | 1-128 | WAL fsync batching interval; lower is safer, higher usually increases write throughput. |

Notes:

  • Start from DefaultDBOptions() and tune one knob at a time.
  • For production changes, benchmark on representative data and query mixes.
  • If recall drops after aggressive latency tuning, first increase rerank multipliers and ANN parameters.
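In practice the first note looks like this (a sketch; it assumes DefaultDBOptions returns an options value whose fields can be edited before opening):

```go
opts := vecdb.DefaultDBOptions()
// One knob at a time: raise ANN recall first, re-benchmark, then move on.
opts.Index.EfSearch = 200

db, err := vecdb.OpenWithOptions("./data", embedder, opts)
```

Change a single field per benchmark run so you can attribute any latency or recall shift to that field.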

Core API

  • OpenWithOptions(...) / OpenWithModelDirOptions(...)
  • GetOrCreateCollection(name, metadata)
  • AddDocuments(ctx, docs)
  • DeleteDocuments(ctx, ids...)
  • SearchByText(ctx, query, limit)
  • SearchByVector(vector, limit)
  • RebuildIndex(collection)
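A brief tour of the remaining calls, continuing the Quick Start session (a sketch: queryVec stands in for an embedding you obtained yourself, and RebuildIndex is shown taking the collection name, which is an assumption based on the list above):

```go
ctx := context.Background()

// Remove a document by ID.
if err := coll.DeleteDocuments(ctx, "2"); err != nil {
	log.Fatal(err)
}

// Search with a precomputed query vector (its dimension must match the embedder).
results, err := coll.SearchByVector(queryVec, 5)
if err != nil {
	log.Fatal(err)
}
_ = results

// Rebuild the ANN index, e.g. after bulk deletes.
if err := db.RebuildIndex("docs"); err != nil {
	log.Fatal(err)
}
```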

How It Works (short)

  • Writes are serialized by a bounded per-collection worker.
  • Durable order: WAL append -> apply -> periodic checkpoint.
  • Recovery: load manifest -> load snapshot -> replay WAL.
  • Query planner selects ANN / iterative filtered ANN / exact / hybrid.
  • Cold segments are chunked on disk with coarse routing to reduce scan cost.
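The hybrid branch of the planner fuses dense and sparse rankings with Reciprocal Rank Fusion, where each list contributes 1/(k + rank) per document. The following self-contained sketch illustrates the mechanics (it is not wisvec's internal code; the k parameter plays the role of HybridRRFK):

```go
package main

import (
	"fmt"
	"sort"
)

// rrfFuse merges two ranked ID lists with Reciprocal Rank Fusion:
// score(id) = sum over lists of 1/(k + rank), rank starting at 1.
// A larger k flattens the contribution curve; a smaller k lets
// top-ranked inputs dominate the fused order.
func rrfFuse(dense, sparse []string, k float64) []string {
	scores := map[string]float64{}
	for rank, id := range dense {
		scores[id] += 1.0 / (k + float64(rank+1))
	}
	for rank, id := range sparse {
		scores[id] += 1.0 / (k + float64(rank+1))
	}
	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(i, j int) bool { return scores[ids[i]] > scores[ids[j]] })
	return ids
}

func main() {
	dense := []string{"a", "b", "c"}  // ANN ranking
	sparse := []string{"c", "a", "d"} // keyword ranking
	fmt.Println(rrfFuse(dense, sparse, 60)) // "a" wins: ranked high in both lists
}
```

Note how "a" outranks "c" even though "c" tops the sparse list: appearing near the top of both lists beats a single first place, which is the behavior the HybridRRFK knob tunes.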

Model Isolation

One DB root directory corresponds to one embedder identity.

  • Use different rootDir values for different embedding models.
  • Reopening the same rootDir with a different embedder identity is rejected.
  • Error messages include a suggested model-specific directory suffix (for example ./data-bge-m3-lan).
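For example, two embedding models can coexist by giving each its own root directory (a sketch reusing the Quick Start call; the second endpoint and model name are placeholders):

```go
dbM3, err := vecdb.OpenWithModelDirOptions("./data-bge-m3-lan", "bge-m3-lan", vecdb.ModelDirOptions{
	BaseURL: "http://192.168.2.76:8081/v1",
	Model:   "bge-m3",
})

dbLarge, err := vecdb.OpenWithModelDirOptions("./data-bge-large", "bge-large", vecdb.ModelDirOptions{
	BaseURL: "http://192.168.2.76:8082/v1",
	Model:   "bge-large",
})
```

Opening ./data-bge-m3-lan with the bge-large embedder would be rejected with the embedder-mismatch error described in Troubleshooting.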

Storage Layout

  • Collection data is stored under a sharded path: collections/<bucket>/<collection-name>.
  • This improves directory scaling when collection count grows.
  • During active development, upgrades across incompatible storage revisions are not migrated automatically; delete the old data directory and let it be recreated on the next open.

Production Checklist

  • Directory strategy
    • Use a dedicated data directory per environment and model namespace.
    • Keep rootDir on reliable local disk (fast SSD preferred).
  • Initialization config
    • Set runtime options via OpenWithOptions (or ModelDirOptions.DBOptions) at startup.
    • Avoid mutating behavior through ad-hoc runtime patches in app code.
  • Throughput/latency tuning
    • Start with DefaultDBOptions() and tune gradually:
      • Index.EfSearch, Index.M
      • Collection.Search.HybridRRFK
      • Collection.SegmentOptimizeEvery
      • Collection.CheckpointEvery
      • Collection.WALSyncEveryOps
  • Resilience
    • Validate restart/recovery in staging: stop process abruptly, then verify data and query behavior after reopen.
    • Monitor WAL growth and checkpoint cadence with your service metrics.
  • Observability
    • Read collection metrics (Collection.Metrics()) and track query count, hybrid/sparse usage, and budget trends.
    • Keep telemetry snapshot enabled (TelemetrySnapshotEvery) to preserve baseline across restarts.
  • Upgrade safety
    • Run go test ./... in CI before deployment.
    • For major config changes, replay a representative workload in staging first.

Troubleshooting

  • embeddings request failed: status=...
    • check BaseURL, APIKey, and model compatibility
  • embeddings response size mismatch
    • the embeddings API returned a different number of vectors than the number of input texts
  • db embedder mismatch: existing=... current=...
    • open with a different rootDir for this model (or remove the old dev data directory and recreate)

Testing

Run all unit/functional tests:

go test ./...

Run live integration tests against your OpenAI-compatible endpoint:

go test -tags live ./...
