v7.0.0

@sysid sysid released this 05 Apr 06:27

Fully Offline Semantic Search

v7.0.0 replaces the OpenAI API-based embedding system with a fully local, offline vector search pipeline. Semantic search now requires zero API keys and makes zero network calls. All processing happens on your machine.

Breaking Changes

1. --openai CLI flag removed

The global --openai flag no longer exists. Semantic search is always available — no opt-in required.

# v6.x (required --openai to enable embeddings)
bkmr --openai sem-search "kubernetes security"
bkmr --openai backfill

# v7.0 (just works)
bkmr sem-search "kubernetes security"
bkmr backfill

2. Embedding storage migrated

Embeddings moved from the bookmarks.embedding BLOB column to a dedicated vec_bookmarks virtual table (powered by sqlite-vec).

  • The migration clears all existing embeddings (SET embedding = NULL)
  • You must regenerate embeddings after upgrading:
bkmr backfill --force

A database backup is created automatically before the migration runs (e.g., bkmr_backup_20260405.db).
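The backup-then-clear step described above can be pictured with a minimal Python sketch. This is an illustration only, not bkmr's actual migration code: the table layout beyond the bookmarks.embedding BLOB column, the function name backup_and_clear_embeddings, and the in-memory databases are all assumptions made for the example.

```python
import sqlite3


def backup_and_clear_embeddings(conn, backup_conn):
    """Copy the whole database to backup_conn, then null out stored embeddings.

    Returns the number of rows whose embedding was cleared.
    """
    # Standard-library wrapper around SQLite's online backup API.
    conn.backup(backup_conn)
    cur = conn.execute(
        "UPDATE bookmarks SET embedding = NULL WHERE embedding IS NOT NULL"
    )
    conn.commit()
    return cur.rowcount


# In-memory stand-ins for the live database and the backup file (hypothetical schema).
live = sqlite3.connect(":memory:")
live.execute(
    "CREATE TABLE bookmarks (id INTEGER PRIMARY KEY, url TEXT, embedding BLOB)"
)
live.executemany(
    "INSERT INTO bookmarks (url, embedding) VALUES (?, ?)",
    [("https://example.com/a", b"\x00\x01"), ("https://example.com/b", None)],
)
live.commit()

backup = sqlite3.connect(":memory:")
cleared = backup_and_clear_embeddings(live, backup)
```

After this runs, the live table holds no embeddings while the backup copy still contains the old one, which is why a fresh bkmr backfill --force is needed after upgrading.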

3. Embedding model changed

| | v6.x | v7.0 |
| Provider | OpenAI API (text-embedding-ada-002) | Local fastembed (ONNX Runtime) |
| Default model | text-embedding-ada-002 (1536 dims) | NomicEmbedTextV15 (768 dims) |
| Network | Required per embed call | None (model downloaded once, ~130MB) |
| Cost | Per-token API charges | Free |

The model is configurable in ~/.config/bkmr/config.toml:

[embeddings]
model = "NomicEmbedTextV15"  # default, 768 dims
# Alternatives: AllMiniLML6V2 (384), BGESmallENV15 (384), BGEM3 (1024)

New Features

Hybrid Search (hsearch)

A new command that combines full-text search (FTS) and semantic search using Reciprocal Rank Fusion (RRF):

# Both keyword precision AND conceptual recall in one query
bkmr hsearch "containerized application security"

# FTS-only mode (skip semantic)
bkmr hsearch "docker AND security" --mode exact

# With tag filters, JSON output, FZF
bkmr hsearch -t python "web framework" --json
bkmr hsearch --fzf "deployment patterns"

RRF merges the ranked lists from both engines: documents found by both get boosted, while results found by only one engine still appear.
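To make the fusion step concrete, here is a minimal Python sketch of textbook Reciprocal Rank Fusion. It is not taken from bkmr's source: the function name rrf_merge, the constant k=60 (a commonly used default), and the bookmark IDs are assumptions for illustration, and bkmr's actual constant and tie-breaking may differ.

```python
from collections import defaultdict


def rrf_merge(ranked_lists, k=60):
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. k=60 is an assumed, conventional default.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


fts_hits = [3, 1, 7]       # hypothetical bookmark IDs from full-text search
semantic_hits = [1, 9, 3]  # hypothetical bookmark IDs from semantic search
fused = rrf_merge([fts_hits, semantic_hits])
print(fused)
```

In this example bookmarks 1 and 3 appear in both lists, so they rise to the top, while 9 and 7 (each found by a single engine) still appear further down.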

--embeddable search filter

# Show only bookmarks that have embeddings
bkmr search --embeddable

Embedding statistics in bkmr info

bkmr info
# Now shows: model name, dimensions, embedded count, vector table status

Dimension mismatch detection

If you change the embedding model without regenerating, bkmr tells you exactly what to do:

Embedding dimension mismatch: model=384, stored=768. Run `bkmr backfill --force` to regenerate.
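A check of this kind can be sketched in a few lines of Python. The model dimensions come from the configuration comment in these notes; the function name check_dimensions and the idea of passing the stored dimension as an integer are assumptions for illustration, not bkmr's actual implementation.

```python
# Dimensions per model, as listed in the release notes.
MODEL_DIMS = {
    "NomicEmbedTextV15": 768,
    "AllMiniLML6V2": 384,
    "BGESmallENV15": 384,
    "BGEM3": 1024,
}


def check_dimensions(model_name, stored_dims):
    """Return None when dimensions agree, else a message in the notes' format.

    stored_dims is None when no embeddings have been stored yet.
    """
    model_dims = MODEL_DIMS[model_name]
    if stored_dims is None or stored_dims == model_dims:
        return None
    return (
        f"Embedding dimension mismatch: model={model_dims}, stored={stored_dims}. "
        "Run `bkmr backfill --force` to regenerate."
    )


# Switching from the 768-dim default to a 384-dim model without regenerating:
message = check_dimensions("AllMiniLML6V2", 768)
print(message)
```

Comparing dimensions before querying matters because vectors of different lengths cannot be meaningfully compared, so failing early with an actionable message is preferable to returning wrong results.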

Upgrade Guide

# 1. Install v7.0.0
cargo install bkmr   # or pip install bkmr, or brew upgrade bkmr

# 2. First run triggers auto-migration (backup created automatically)
bkmr search ""

# 3. Regenerate all embeddings locally (one-time; the ~130MB embedding model is downloaded automatically)
bkmr backfill --force

# 4. Verify
bkmr info              # Check embedding stats
bkmr sem-search "test" # Should return results

Full Changelog: v6.7.0...v7.0.0