Semantic code search for codebases
- API-first embeddings
- Incremental indexing: only modified files are re-embedded.
- Background daemon: automatic re-indexing on file changes.
- Project state lives in .codeindex/.
src/ Go implementation
.codeindex/ Project state and index
codeindex Built binary
curl -fsSL https://raw.githubusercontent.com/QuinsZouls/code-index/master/install.sh | bash
codeindex version
# Setup
codeindex onboard
codeindex init
codeindex index
codeindex search "authentication logic"
# Search with path
codeindex search -path . "authentication logic"
codeindex status
codeindex daemon start --interval 2s
codeindex daemon list
codeindex daemon stop
npx skills add QuinsZouls/code-index
curl -fsSL https://raw.githubusercontent.com/QuinsZouls/code-index/master/install.sh | bash
Windows is detected automatically; the same script will download the .zip release and install the .exe.
go install github.com/QuinsZouls/code-index/src@latest
Download the matching asset from GitHub Releases and copy codeindex into your PATH.
The project config is stored in .codeindex/settings.json.
Example:
{
"embedding": {
"provider": "openrouter",
"model": "qwen/qwen3-embedding-8b",
"base_url": "https://openrouter.ai/api/v1",
"api_key_env": "OPENROUTER_API_KEY",
"rate_limit": 10,
"timeout": "60s"
}
}
Supported providers:
- openai
- openai-compatible
- openrouter
- mistral
- gemini
- ollama
- lmstudio
- llamacpp
The rate_limit field controls request frequency to avoid API throttling:
- rate_limit: requests per second (default: 0 = disabled)
- Example: 10 means at most 10 requests per second
The timeout field sets the HTTP request timeout:
- Format: duration string like "30s", "1m", "2m30s"
- Default: "60s"
The retry system handles transient HTTP errors automatically:
- max_retries: maximum retry attempts (default: 0 = disabled)
- retry_initial_delay: first retry delay (default: "1s")
- retry_max_delay: maximum delay cap (default: "30s")
- Uses exponential backoff (the delay doubles on each retry)
- If all retries fail for a file, the indexer skips it and continues with the next file
Example with retry enabled:
{
"embedding": {
"provider": "openai",
"model": "text-embedding-3-small",
"max_retries": 3,
"retry_initial_delay": "1s",
"retry_max_delay": "30s"
}
}
Retryable errors include: rate limits (429), server errors (502, 503), timeouts, and connection issues.
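The backoff schedule described above can be sketched in Go. This is a minimal illustration, not the project's actual code: the function name backoffDelays is hypothetical, and capping each delay at retry_max_delay is an assumption about how the cap is applied.

```go
package main

import (
	"fmt"
	"time"
)

// backoffDelays returns the wait before each retry attempt.
// Hypothetical helper: the delay starts at initial, doubles after every
// attempt, and is capped at maxDelay (assumed to mirror retry_max_delay).
func backoffDelays(maxRetries int, initial, maxDelay time.Duration) []time.Duration {
	var delays []time.Duration
	delay := initial
	for i := 0; i < maxRetries; i++ {
		if delay > maxDelay {
			delay = maxDelay
		}
		delays = append(delays, delay)
		delay *= 2
	}
	return delays
}

func main() {
	// Values from the example config: 3 retries, 1s initial, 30s cap.
	fmt.Println(backoffDelays(3, time.Second, 30*time.Second)) // [1s 2s 4s]
	// With more retries the cap takes effect.
	fmt.Println(backoffDelays(6, time.Second, 30*time.Second)) // [1s 2s 4s 8s 16s 30s]
}
```

With max_retries set to 0 the slice is empty, which matches the documented "disabled" default.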
The indexer splits files into chunks for embedding. By default, it uses chunk_size (lines) to determine chunk boundaries:
- chunk_size: maximum lines per chunk (default: 120)
- chunk_overlap: lines to overlap between chunks (default: 20)
- min_chunk_size: minimum lines for a valid chunk (default: 8)
Some embedding models have small context windows (e.g., 512 tokens). Use context_size to limit chunks by character count instead of lines:
- context_size: maximum characters per chunk (default: 0 = disabled)
When context_size is set and a chunk would exceed this limit, the indexer switches to character-based chunking:
- Chunks are split at line boundaries (no mid-line splits)
- Overlap is respected
- A single long line exceeding context_size remains as one chunk
Example for a 512-token model:
{
"chunk_size": 120,
"chunk_overlap": 20,
"context_size": 2048
}
This ensures chunks fit within the model's context window while respecting line boundaries for readability.
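The chunking rules above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the indexer's real implementation: the names span and chunkLines are hypothetical, each line is counted as its length plus one newline character, min_chunk_size is ignored, and overlap is dropped when a size-limited chunk is shorter than the overlap so the loop always advances.

```go
package main

import "fmt"

// span is a half-open range of line indexes [Start, End).
type span struct{ Start, End int }

// chunkLines splits lines into chunks of at most chunkSize lines.
// When contextSize > 0, a chunk is also cut at the last line boundary
// that keeps it within contextSize characters (assumed accounting:
// len(line) + 1 for the newline).
func chunkLines(lines []string, chunkSize, overlap, contextSize int) []span {
	if chunkSize <= 0 {
		return nil
	}
	var chunks []span
	start := 0
	for start < len(lines) {
		end := start + chunkSize
		if end > len(lines) {
			end = len(lines)
		}
		if contextSize > 0 {
			total, fit := 0, start
			for fit < end {
				cost := len(lines[fit]) + 1
				if total+cost > contextSize {
					break
				}
				total += cost
				fit++
			}
			if fit == start {
				fit = start + 1 // a single over-long line stays as one chunk
			}
			end = fit
		}
		chunks = append(chunks, span{start, end})
		if end == len(lines) {
			break
		}
		next := end - overlap
		if next <= start {
			next = end // chunk too short to overlap; keep making progress
		}
		start = next
	}
	return chunks
}

func main() {
	lines := make([]string, 10)
	for i := range lines {
		lines[i] = "123456789" // 9 characters + newline = 10 per line
	}
	fmt.Println(chunkLines(lines, 4, 1, 0))  // line-based: [{0 4} {3 7} {6 10}]
	fmt.Println(chunkLines(lines, 4, 1, 25)) // character-limited: two lines per chunk
}
```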
- api_key_env should contain the environment variable name, not the secret itself.
- For OpenAI-compatible backends, set base_url to the provider endpoint.
- Default local endpoints:
  - Ollama: http://localhost:11434
  - LM Studio: http://localhost:1234/v1
  - llama.cpp Server: http://localhost:8080/v1
To use the llama.cpp server for embeddings, start it with:
llama-server -hf nomic-ai/nomic-embed-text-v1.5-GGUF -c 8192 -ub 8192 --embeddings
Based on our tests, we recommend the model nomic-ai/nomic-embed-text-v1.5-GGUF.
Example configuration:
{
"embedding": {
"provider": "openai-compatible",
"model": "nomic-ai/nomic-embed-text-v2-moe-GGUF",
"base_url": "http://127.0.0.1:8080/v1"
}
}
The llama.cpp server exposes an OpenAI-compatible /v1/embeddings endpoint. No API key is required by default.
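For reference, a request to an OpenAI-compatible /v1/embeddings endpoint can be built in Go along these lines. This is a sketch, not the client codeindex ships: the type embeddingRequest and the function newEmbeddingRequest are hypothetical names, and the body uses the standard OpenAI embeddings fields model and input. The request is only constructed here, not sent.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"strings"
)

// embeddingRequest is the JSON body for an OpenAI-style embeddings call.
type embeddingRequest struct {
	Model string   `json:"model"`
	Input []string `json:"input"`
}

// newEmbeddingRequest builds a POST to <baseURL>/embeddings.
// baseURL is expected to already include the /v1 prefix, as in base_url.
func newEmbeddingRequest(baseURL, model string, inputs []string) (*http.Request, error) {
	body, err := json.Marshal(embeddingRequest{Model: model, Input: inputs})
	if err != nil {
		return nil, err
	}
	url := strings.TrimRight(baseURL, "/") + "/embeddings"
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newEmbeddingRequest(
		"http://127.0.0.1:8080/v1",
		"nomic-ai/nomic-embed-text-v2-moe-GGUF",
		[]string{"func main() {}"},
	)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL) // POST http://127.0.0.1:8080/v1/embeddings
}
```

For a hosted provider, an Authorization header carrying the key read from the variable named in api_key_env would be added before sending.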
codeindex init creates .codeindex/settings.json from ~/.codeindex/default_settings.json when present, then ensures .gitignore excludes .codeindex/.
codeindex index scans the repository, chunks text, sends embedding requests, and persists the index in .codeindex/index.gob.
Default output:
- spinner while indexing
- final count of indexed files
Verbose output:
- new
- updated
- unchanged
- final file/chunk summary
Example:
codeindex index -path . --verbose
Config options:
- worker_count: override parallel file workers
- checkpoint_every: override how often the .gob checkpoint is flushed
codeindex search runs vector search against the stored index.
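As a rough picture of what a vector search over stored chunks involves, the sketch below ranks chunks by cosine similarity to a query embedding. The README does not state which similarity metric codeindex uses, so cosine similarity is an assumption, and the names chunk, hit, cosine, and topK are hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// chunk pairs a source file with its embedding vector.
type chunk struct {
	File   string
	Vector []float32
}

// hit is a search result with its similarity score.
type hit struct {
	File  string
	Score float64
}

// cosine returns the cosine similarity of a and b, or 0 when the
// vectors are empty, of different length, or have zero magnitude.
func cosine(a, b []float32) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		x, y := float64(a[i]), float64(b[i])
		dot += x * y
		na += x * x
		nb += y * y
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// topK scores every chunk against the query and keeps the k best.
func topK(query []float32, chunks []chunk, k int) []hit {
	hits := make([]hit, 0, len(chunks))
	for _, c := range chunks {
		hits = append(hits, hit{File: c.File, Score: cosine(query, c.Vector)})
	}
	sort.Slice(hits, func(i, j int) bool { return hits[i].Score > hits[j].Score })
	if k < 0 {
		k = 0
	}
	if k < len(hits) {
		hits = hits[:k]
	}
	return hits
}

func main() {
	chunks := []chunk{
		{File: "auth/login.go", Vector: []float32{1, 0}},
		{File: "db/conn.go", Vector: []float32{0, 1}},
		{File: "auth/token.go", Vector: []float32{1, 1}},
	}
	for _, h := range topK([]float32{1, 0}, chunks, 2) {
		fmt.Printf("%.3f %s\n", h.Score, h.File)
	}
}
```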
codeindex status prints file count, chunk count, and language distribution.
codeindex daemon manages the background daemon for automatic re-indexing.
| Command | Description |
|---|---|
| daemon start | Start daemon for a project |
| daemon stop | Stop daemon (by PID or project) |
| daemon list | List all running daemons |
| daemon status | Show daemon status for current project |
Starts a background daemon that monitors file changes and re-indexes automatically.
codeindex daemon start [--path .] [--interval 2s] [--debounce 500ms] [--verbose]
Options:
| Flag | Default | Description |
|---|---|---|
| --path | . | Project root directory |
| --interval | 2s | Polling interval for file scanning |
| --debounce | 500ms | Wait time before processing batch |
| --verbose | false | Show re-indexing activity |
The daemon:
- Polls for file changes (no external dependencies)
- Debounces rapid changes to process in batches
- Re-indexes only modified files (partial updates)
- Removes deleted files from the index
- Stores PID registry in ~/.codeindex/daemons.json
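The polling behaviour listed above amounts to comparing two snapshots of the project tree. The sketch below shows one way to compute which files need re-indexing and which should be removed, using size and modification time as the change signal. The names fileState and diffSnapshots are hypothetical, and how the daemon actually gathers and stores its snapshots is not described in this README.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// fileState is the per-file information compared between polls.
type fileState struct {
	Size    int64
	ModTime time.Time
}

// diffSnapshots compares the previous and current snapshots.
// changed holds new or modified paths; removed holds deleted paths.
func diffSnapshots(prev, curr map[string]fileState) (changed, removed []string) {
	for path, now := range curr {
		old, ok := prev[path]
		if !ok || old.Size != now.Size || !old.ModTime.Equal(now.ModTime) {
			changed = append(changed, path)
		}
	}
	for path := range prev {
		if _, ok := curr[path]; !ok {
			removed = append(removed, path)
		}
	}
	// Map iteration order is random, so sort for stable output.
	sort.Strings(changed)
	sort.Strings(removed)
	return changed, removed
}

func main() {
	t0 := time.Date(2026, 4, 5, 10, 30, 0, 0, time.UTC)
	prev := map[string]fileState{
		"main.go": {Size: 100, ModTime: t0},
		"old.go":  {Size: 50, ModTime: t0},
	}
	curr := map[string]fileState{
		"main.go": {Size: 120, ModTime: t0.Add(time.Minute)},
		"new.go":  {Size: 10, ModTime: t0.Add(time.Minute)},
	}
	changed, removed := diffSnapshots(prev, curr)
	fmt.Println("re-index:", changed) // re-index: [main.go new.go]
	fmt.Println("remove:", removed)   // remove: [old.go]
}
```

A debounce step would collect the results of several polls and process them as one batch once no further changes arrive within the debounce window.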
Stops a running daemon.
codeindex daemon stop [--path .] [pid]
If no PID is provided, stops the daemon for the current project.
Lists all running daemons across projects.
codeindex daemon list
Output example:
PID PROJECT STATUS STARTED
12345 /home/user/projects/myapp running 2026-04-05 10:30:00
Shows detailed status for the current project's daemon.
codeindex daemon status [--path .]
Output example:
PID: 12345
Project: /home/user/projects/myapp
Status: running
Started: 2026-04-05 10:30:00
Interval: 2s
Debounce: 500ms
Files: 150
Chunks: 500
Embeddings are stored inside .codeindex/index.gob alongside chunk metadata.
Daemon registry is stored in ~/.codeindex/daemons.json with PID and project information.
Lock files are stored in /tmp/codeindex-<hash>.lock to prevent duplicate daemons per project.
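To illustrate how embeddings and chunk metadata can share one .gob file, here is a round trip through Go's encoding/gob. The struct layout is purely hypothetical (indexedChunk, index, and their fields are invented for this sketch); the real on-disk schema of index.gob is not documented here.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// indexedChunk is a hypothetical record: metadata plus its embedding.
type indexedChunk struct {
	File      string
	StartLine int
	EndLine   int
	Vector    []float32
}

// index is a hypothetical top-level container for all chunks.
type index struct {
	Chunks []indexedChunk
}

// encodeIndex serializes the index with gob.
func encodeIndex(idx index) ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(idx); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decodeIndex reads an index back from gob-encoded bytes.
func decodeIndex(data []byte) (index, error) {
	var idx index
	err := gob.NewDecoder(bytes.NewReader(data)).Decode(&idx)
	return idx, err
}

func main() {
	in := index{Chunks: []indexedChunk{
		{File: "main.go", StartLine: 1, EndLine: 120, Vector: []float32{0.5, 0.25}},
	}}
	data, err := encodeIndex(in)
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	out, err := decodeIndex(data)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(out.Chunks[0].File, out.Chunks[0].Vector)
}
```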
- Files are skipped when size and modtime match the previous index.
- Indexing runs with a small worker pool to overlap file IO and API calls.
- Only modified files are re-embedded.
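A small worker pool of the kind mentioned above can be sketched as follows. This is an illustration only: processFiles and errEmbed are hypothetical names, and the process callback stands in for reading, chunking, and embedding one file. Failed files are recorded and skipped so the remaining files still get processed.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errEmbed is a placeholder error used by the example below.
var errEmbed = errors.New("embedding failed")

// processFiles runs process for every path using a fixed number of
// workers and returns the paths that failed, with their errors.
func processFiles(paths []string, workers int, process func(string) error) map[string]error {
	if workers < 1 {
		workers = 1
	}
	jobs := make(chan string)
	failed := make(map[string]error)
	var mu sync.Mutex // guards failed
	var wg sync.WaitGroup

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for p := range jobs {
				if err := process(p); err != nil {
					mu.Lock()
					failed[p] = err
					mu.Unlock()
				}
			}
		}()
	}
	for _, p := range paths {
		jobs <- p
	}
	close(jobs) // lets workers exit once the queue drains
	wg.Wait()
	return failed
}

func main() {
	paths := []string{"a.go", "b.go", "c.go"}
	failed := processFiles(paths, 2, func(p string) error {
		if p == "b.go" {
			return errEmbed // simulate a file whose retries were exhausted
		}
		return nil
	})
	fmt.Println("skipped files:", len(failed))
}
```

The worker count here plays the role of the worker_count config option.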
go test ./...