Semantic search for your codebase. Parses every tracked file with tree-sitter, generates vector embeddings per chunk, and stores them on a dedicated orphan Git branch that mirrors your source tree — so the whole team can share embeddings without re-indexing.
main branch              semantic branch (orphan)
──────────────────       ──────────────────────────────
src/main.rs          →   src/main.rs          ← [{start_line, end_line, text, embedding}, ...]
src/db.rs            →   src/db.rs            ← [{...}, ...]
src/chunking/mod.rs  →   src/chunking/mod.rs
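
The chunk records in the diagram can be pictured as a small Rust struct. This is only a sketch: the field names come from the diagram, but the field types, the 1-based inclusive line numbering, and the `whole_file_chunk` helper are assumptions made for illustration, not the contents of `models.rs`. The helper mirrors the single-chunk fallback this README describes for files with unrecognized extensions.

```rust
/// One chunk record with the four fields shown in the diagram.
/// Field types are assumed for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct CodeChunk {
    pub start_line: usize,
    pub end_line: usize,
    pub text: String,
    pub embedding: Vec<f32>,
}

/// Hypothetical helper: wrap an entire file as a single chunk.
/// Assumes 1-based, inclusive line numbers; an empty file counts as one line.
pub fn whole_file_chunk(text: &str, embedding: Vec<f32>) -> CodeChunk {
    let line_count = text.lines().count().max(1);
    CodeChunk {
        start_line: 1,
        end_line: line_count,
        text: text.to_string(),
        embedding,
    }
}
```

A mirrored file on the semantic branch would then hold a list of such records, one per chunk.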
- git-semantic index parses all tracked files with tree-sitter, embeds each chunk, and commits the mirrored JSON files to the semantic orphan branch. On subsequent runs it only re-embeds files that changed since the last index (incremental)
- git push origin semantic shares the embeddings with the team
- Contributors run git fetch origin semantic + git-semantic hydrate to populate their local SQLite search index, with no re-embedding needed
- git-semantic grep runs KNN vector similarity search against the local index
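
To make the last step concrete, here is a minimal brute-force sketch of KNN search over chunk embeddings. It is an illustration under assumptions, not the tool's implementation: the real search runs inside SQLite, the README does not state which distance metric is used (cosine similarity is assumed here), and the `Chunk` struct and function names are hypothetical.

```rust
/// Hypothetical in-memory stand-in for one indexed chunk.
#[derive(Debug, Clone)]
pub struct Chunk {
    pub path: String,
    pub start_line: usize,
    pub end_line: usize,
    pub embedding: Vec<f32>,
}

/// Cosine similarity; returns 0.0 for empty, mismatched, or zero-length vectors.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() || a.is_empty() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a * norm_b)
}

/// Brute-force k-nearest-neighbour search:
/// score every chunk, sort by similarity (highest first), keep the top k.
pub fn knn<'a>(query: &[f32], chunks: &'a [Chunk], k: usize) -> Vec<(&'a Chunk, f32)> {
    let mut scored: Vec<(&Chunk, f32)> = chunks
        .iter()
        .map(|c| (c, cosine_similarity(query, &c.embedding)))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```

The query vector would come from embedding the search string with the same provider that produced the chunk embeddings; mixing providers would make the scores meaningless.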
Indexing only needs to happen once — whoever runs it pushes the semantic branch and the whole team benefits. Nobody else needs an API key or has to re-embed anything.
You can run indexing manually from any machine, or automate it in your CI/CD pipeline so embeddings stay fresh after every merge.
# Anyone with an API key runs this once (or after significant changes)
git-semantic index
git push origin semantic
# Everyone else
git fetch origin semantic
git-semantic hydrate
git-semantic grep "..."

Add .github/workflows/semantic-index.yml to your repository and indexing happens automatically on every merge to main:
name: Semantic Index
on:
  push:
    branches: [main]
jobs:
  index:
    runs-on: ubuntu-latest
    permissions:
      contents: write  # lets the job push the semantic branch when the default token is read-only
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
          token: ${{ secrets.GITHUB_TOKEN }}
      - name: Install git-semantic
        run: cargo install git-semantic
      - name: Index codebase
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
        run: git-semantic index
      - name: Push semantic branch
        run: |
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git push origin semantic

Requirements:
- Rust 1.65 or higher
- Git 2.0 or higher

Install from crates.io:
cargo install git-semantic

Or build from source:
git clone https://github.com/ccherrad/git-semantic.git
cd git-semantic
cargo install --path .

Parses and embeds files, then commits the result to the semantic orphan branch.
git-semantic index
- First run: full index of all tracked files, writes .indexed-at with the current HEAD SHA
- Subsequent runs: incremental; diffs against the last indexed SHA and updates only files that were added, modified, renamed, or deleted
- Respects .gitignore (uses git ls-files)
- Skips binary files
- Files with unrecognized extensions are stored as a single chunk
- Creates the semantic branch automatically on first run
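
The incremental behaviour can be sketched as a small planning step. The example below assumes the changed files are obtained from `git diff --name-status <last-indexed-sha>..HEAD`, whose output is one tab-separated line per file; whether git-semantic uses this exact command is not stated here, and the `Action` enum and `plan_actions` function are hypothetical names.

```rust
/// What the indexer should do with one path (hypothetical names).
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    /// Parse and embed the file at this path.
    Embed(String),
    /// Drop the mirrored file at this path from the semantic branch.
    Remove(String),
}

/// Turn `git diff --name-status` output into indexer actions.
/// Each line looks like "M\tpath", "A\tpath", "D\tpath" or "R100\told\tnew".
pub fn plan_actions(name_status: &str) -> Vec<Action> {
    let mut actions = Vec::new();
    for line in name_status.lines() {
        let mut fields = line.split('\t');
        let status = match fields.next() {
            Some(s) if !s.is_empty() => s,
            _ => continue, // skip blank lines
        };
        let first = fields.next();
        let second = fields.next();
        match (status.chars().next(), first, second) {
            (Some('A'), Some(p), _) | (Some('M'), Some(p), _) => {
                actions.push(Action::Embed(p.to_string()));
            }
            (Some('D'), Some(p), _) => actions.push(Action::Remove(p.to_string())),
            (Some('R'), Some(old), Some(new)) => {
                // A rename removes the old mirror and embeds the new path.
                actions.push(Action::Remove(old.to_string()));
                actions.push(Action::Embed(new.to_string()));
            }
            _ => {} // other statuses (C, T, U, ...) are ignored in this sketch
        }
    }
    actions
}
```

Unchanged files never appear in the diff, which is what keeps a re-index cheap after small commits.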
Reads the semantic branch and populates the local .git/semantic.db search index.
git-semantic hydrate
Attempts to fetch origin/semantic first, then falls back to the local branch.
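
The fetch-then-fall-back order can be expressed as a tiny decision function. This is a sketch with hypothetical names (`Source`, `pick_source`); the two closures stand in for whatever checks the tool actually performs, for example running `git fetch origin semantic` and testing whether a local `semantic` branch exists.

```rust
/// Where the semantic branch content is read from (hypothetical names).
#[derive(Debug, PartialEq, Eq)]
pub enum Source {
    Remote,
    Local,
}

/// Try the remote first; only consult the local branch if the fetch fails.
/// Returns None when neither source is available.
pub fn pick_source<F, L>(fetch_remote: F, local_exists: L) -> Option<Source>
where
    F: FnOnce() -> bool,
    L: FnOnce() -> bool,
{
    if fetch_remote() {
        Some(Source::Remote)
    } else if local_exists() {
        Some(Source::Local)
    } else {
        None
    }
}
```

Passing the checks as closures keeps the ordering testable without a Git repository, and the local check is skipped entirely when the fetch succeeds.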
Search code semantically using natural language.
git-semantic grep "authentication logic"
git-semantic grep "error handling" -n 5

Injects code search instructions into CLAUDE.md so coding agents automatically use git-semantic grep instead of git grep.
git-semantic agentic-setup
- Appends instructions to an existing CLAUDE.md, or creates one if it doesn't exist
- Idempotent, so it is safe to run multiple times
- Works with Claude Code, Cursor, GitHub Copilot (via .cursor/rules or .github/copilot-instructions.md equivalents)
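
One common way to get this kind of idempotent append is to guard the inserted block with a marker line and skip the write when the marker is already present. The sketch below assumes that approach; the marker text and the `ensure_instructions` function are hypothetical and may differ from what git-semantic actually writes.

```rust
/// Hypothetical marker used to detect a previous run.
pub const MARKER: &str = "<!-- git-semantic:begin -->";

/// Return the file content that should be on disk after setup.
/// `existing` is None when the file does not exist yet.
pub fn ensure_instructions(existing: Option<&str>, instructions: &str) -> String {
    let block = format!("{}\n{}\n", MARKER, instructions);
    match existing {
        // No file yet: create it with just the block.
        None => block,
        // Marker already present: leave the file untouched.
        Some(text) if text.contains(MARKER) => text.to_string(),
        // Otherwise append the block, separated by a blank line.
        Some(text) => {
            let mut out = text.to_string();
            if !out.is_empty() {
                if !out.ends_with('\n') {
                    out.push('\n');
                }
                out.push('\n');
            }
            out.push_str(&block);
            out
        }
    }
}
```

Running the function on its own output returns the same text, which is the property that makes repeated runs safe.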
Configure the embedding provider.
git-semantic config --list
git-semantic config gitsem.provider openai
git-semantic config gitsem.provider onnx
git-semantic config --get gitsem.provider
git-semantic config --unset gitsem.onnx.modelPath

OpenAI provider:
export OPENAI_API_KEY="sk-..."
git-semantic config gitsem.provider openai

ONNX provider:
git-semantic config gitsem.provider onnx
git-semantic config gitsem.onnx.modelPath /path/to/model.onnx

Supported languages: Rust, Python, JavaScript, TypeScript, Java, C, C++, Go
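
Language detection for the list above can be sketched as a lookup on the file extension. The extension-to-language mapping here is an assumption chosen for illustration (the project's `chunking/` module may recognize more or different extensions); `None` corresponds to the case where a file is stored as a single chunk.

```rust
use std::path::Path;

/// The languages listed in this README.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Language {
    Rust,
    Python,
    JavaScript,
    TypeScript,
    Java,
    C,
    Cpp,
    Go,
}

/// Guess the language from the file extension (mapping is assumed).
/// Returns None for unknown or missing extensions.
pub fn detect_language(path: &str) -> Option<Language> {
    let ext = Path::new(path).extension()?.to_str()?;
    match ext {
        "rs" => Some(Language::Rust),
        "py" => Some(Language::Python),
        "js" => Some(Language::JavaScript),
        "ts" => Some(Language::TypeScript),
        "java" => Some(Language::Java),
        "c" | "h" => Some(Language::C),
        "cpp" | "cc" | "hpp" => Some(Language::Cpp),
        "go" => Some(Language::Go),
        _ => None,
    }
}
```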
git-semantic/
├── src/
│ ├── main.rs # CLI and command handlers
│ ├── models.rs # CodeChunk data structure
│ ├── db.rs # SQLite + sqlite-vec search index
│ ├── embed.rs # Embedding generation
│ ├── semantic_branch.rs # Orphan branch read/write via git worktree
│ ├── embeddings/ # OpenAI and ONNX provider implementations
│ └── chunking/ # tree-sitter parsing and language detection
├── Cargo.toml
└── README.md
cargo build --release
cargo test

License: MIT OR Apache-2.0