VecDB
SQLite-backed semantic search with file splitting, embedding fetch, and cached vector storage.
VecDB stores vectors in SQLite and searches them through the vec0 virtual table module, which is provided by the sqlite-vec extension.
The active embedding table is created as a vec0 virtual table with cosine distance, along with scope and line-range metadata.
Before embedding, files are split into search chunks using content-aware splitters:
- Trajectory JSON: 4 messages per chunk, with a 1-message overlap
- Markdown: heading-aware sections with frontmatter support
- Code: AST-aware token windows through the AST splitter path
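The trajectory rule above (4 messages per chunk, 1 message shared between neighbours) can be illustrated with a small sliding-window sketch. This is not the VecDB splitter itself; the function name `chunk_messages` and its parameters are hypothetical, and only the 4/1 numbers come from the description above.

```python
from typing import Any


def chunk_messages(messages: list[Any], size: int = 4, overlap: int = 1) -> list[list[Any]]:
    """Split messages into windows of `size` that share `overlap` items with the previous window.

    Hypothetical helper: the defaults mirror the 4-message / 1-overlap rule described above.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and overlap must be in [0, size)")
    step = size - overlap  # how far the window advances each time
    chunks: list[list[Any]] = []
    for start in range(0, len(messages), step):
        chunks.append(messages[start:start + size])
        if start + size >= len(messages):
            break  # the last window already reached the end of the input
    return chunks
```

With ten messages numbered 0 to 9 this yields three chunks, and the last message of each chunk reappears as the first message of the next, so context is not lost at chunk boundaries.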
Embeddings are fetched through an external HTTP API. The vectorization path batches requests and retries failed ones. The background flow is:
- enqueue documents
- split them
- check the cache
- embed missing chunks
- store vectors
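The cache-then-embed part of this flow can be sketched as follows. It is a minimal illustration under stated assumptions, not the actual implementation: the names `vectorize` and `embed_with_retry`, the SHA-256 cache key, the batch size, and the retry count are all hypothetical, and `embed` stands in for whatever function calls the external embedding API.

```python
import hashlib
import time
from typing import Callable

# Stand-in for the external HTTP embedding call: texts in, one vector per text out.
Embedder = Callable[[list[str]], list[list[float]]]


def embed_with_retry(embed: Embedder, batch: list[str],
                     retries: int = 3, delay: float = 0.0) -> list[list[float]]:
    """Call `embed` on one batch, retrying with exponential backoff (assumed policy)."""
    for attempt in range(retries):
        try:
            return embed(batch)
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts: surface the error to the caller
            time.sleep(delay * (2 ** attempt))
    raise ValueError("retries must be at least 1")


def vectorize(chunks: list[str], cache: dict[str, list[float]],
              embed: Embedder, batch_size: int = 2) -> list[list[float]]:
    """Return one vector per chunk, embedding only the chunks missing from `cache`."""
    keys = [hashlib.sha256(c.encode("utf-8")).hexdigest() for c in chunks]
    missing: dict[str, str] = {}
    for key, chunk in zip(keys, chunks):
        if key not in cache:
            missing.setdefault(key, chunk)  # de-duplicate identical chunks
    items = list(missing.items())
    for i in range(0, len(items), batch_size):
        batch = items[i:i + batch_size]
        vectors = embed_with_retry(embed, [text for _, text in batch])
        for (key, _), vector in zip(batch, vectors):
            cache[key] = vector  # "store vectors" step
    return [cache[key] for key in keys]
```

Keying the cache on a hash of the chunk text means an unchanged chunk is never sent to the API twice, which is the point of checking the cache before embedding.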
Search uses cosine KNN queries, then applies a reject threshold and normalizes the usefulness score. Scope-prefix filtering is applied after the query when needed.
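The post-processing order described above (KNN first, then rejection, score normalization, and scope filtering) can be sketched in plain Python. The brute-force distance loop stands in for the vec0 KNN query, and the normalization formula, the default threshold of 0.5, and the names `search` and `cosine_distance` are assumptions made for illustration; the real values and formula may differ.

```python
import math
from typing import Optional


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance: 0.0 for identical direction, 1.0 for orthogonal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


def search(query: list[float], rows: list[tuple[str, list[float]]], top_k: int = 5,
           reject_threshold: float = 0.5,
           scope_prefix: Optional[str] = None) -> list[tuple[str, float]]:
    """Return (scope, usefulness) pairs for the nearest rows that survive filtering."""
    # 1. KNN: take the top_k rows by cosine distance (brute force stands in for vec0).
    nearest = sorted((cosine_distance(query, vec), scope) for scope, vec in rows)[:top_k]
    results = []
    for distance, scope in nearest:
        # 2. Reject matches that are too far from the query.
        if distance > reject_threshold:
            continue
        # 3. Scope-prefix filtering happens after the query, as described above.
        if scope_prefix is not None and not scope.startswith(scope_prefix):
            continue
        # 4. Assumed normalization: 1.0 at distance 0, 0.0 at the reject threshold.
        results.append((scope, 1.0 - distance / reject_threshold))
    return results
```

One consequence of filtering after the query is that a scoped search can return fewer than `top_k` results, because rows outside the scope still occupy slots in the KNN result.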
VecDB cleans up old embedding tables by keeping the 10 newest tables and dropping tables older than 7 days.
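A retention rule of this kind can be sketched as below. The sketch takes one possible reading of the sentence above, namely that a table is dropped when it falls outside the 10 newest or is older than 7 days; the exact way the two conditions combine in VecDB is not stated here. The function name `tables_to_drop` and the name-to-timestamp mapping are hypothetical.

```python
from datetime import datetime, timedelta


def tables_to_drop(tables: dict[str, datetime], now: datetime, keep_newest: int = 10,
                   max_age: timedelta = timedelta(days=7)) -> list[str]:
    """Return table names to drop, newest first.

    Assumed rule: drop a table if it is not among the `keep_newest` most recent
    tables, or if it was created more than `max_age` ago.
    """
    newest_first = sorted(tables, key=lambda name: tables[name], reverse=True)
    keep = set(newest_first[:keep_newest])
    return [
        name for name in newest_first
        if name not in keep or now - tables[name] > max_age
    ]
```

A real cleanup would also need to make sure the currently active embedding table is never selected for dropping, whatever its age.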
The VecDB search endpoint is /vecdb-search.
Refact on GitHub: https://github.com/JegernOUTT/refact