Cross-repo code intelligence CLI for LLMs.
FlowMap indexes your codebases with tree-sitter AST parsing, stores them in a local vector database, and gives you fast hybrid search (semantic + keyword + symbol) across all your repos. Built for developers who use LLMs for code navigation and want better context than grep.
You have 5 repos. You know the retry logic exists somewhere. With grep you'd need to search each repo, wade through hundreds of string matches, and hope you find the right function. With FlowMap:
$ flowmap search "retry logic with exponential backoff"
[1] api-gateway/src/utils/retry.ts:12-45 (rrf: 0.0312, via: ripgrep+semantic)
export async function withRetry<T>(fn: () => Promise<T>, opts: RetryOptions): Promise<T> { ...
[2] payment-service/src/http/client.py:88-120 (rrf: 0.0198, via: semantic+symbol)
def retry_with_backoff(func: Callable, max_retries: int = 3, base_delay: float = 1.0): ...
[3] shared-lib/pkg/resilience/retry.go:15-52 (rrf: 0.0147, via: semantic)
func Retry(ctx context.Context, fn func() error, opts ...Option) error { ...
One query. Three repos. Three languages. Ranked by relevance. That's the point.
- Indexes your repos with tree-sitter, extracting functions, classes, methods, and their signatures
- Hybrid search fuses 3 channels: ripgrep (keyword), vector similarity (semantic), and symbol lookup (exact match) using Reciprocal Rank Fusion
- Incremental reindexing via git diff -- only re-embeds changed files
- Structural history shows AST-level diffs (function added/removed/signature changed) over time
- Works across repos -- search one query, get results from all your projects
Full tree-sitter parsing (functions, classes, methods, signatures):
Python | TypeScript | JavaScript | TSX/JSX | Go | Java | YAML | JSON
These languages get line-based fallback chunking (indexed but no symbol extraction):
Rust | C | C++ | Kotlin | Ruby | PHP | C# | Swift | SQL | GraphQL | Protobuf | Terraform | Shell | Markdown
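For the fallback languages, line-based chunking can be pictured as fixed-size windows of lines sliding over the file. The sketch below is illustrative only: the names `LineChunk` and `chunk_by_lines`, the window size, and the overlap are assumptions, not FlowMap's actual parameters.

```python
from dataclasses import dataclass


@dataclass
class LineChunk:
    start_line: int  # 1-indexed, inclusive
    end_line: int    # 1-indexed, inclusive
    text: str


def chunk_by_lines(source: str, window: int = 40, overlap: int = 5) -> list[LineChunk]:
    """Split source text into overlapping windows of lines (hypothetical sizes)."""
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("need window > 0 and 0 <= overlap < window")
    lines = source.splitlines()
    chunks: list[LineChunk] = []
    step = window - overlap
    start = 0
    while start < len(lines):
        end = min(start + window, len(lines))
        chunks.append(LineChunk(start + 1, end, "\n".join(lines[start:end])))
        if end == len(lines):
            break  # the last window reached the end of the file
        start += step
    return chunks
```

Chunks produced this way carry line ranges but no symbol names, which matches the "indexed but no symbol extraction" behavior described above.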
git clone https://github.com/aniket-agi/flowmap-cli.git
cd flowmap-cli
uv sync
Don't have uv? Install it:
curl -LsSf https://astral.sh/uv/install.sh | sh
Add this to your ~/.zshrc (or ~/.bashrc):
export PATH="/path/to/flowmap-cli/.venv/bin:$PATH"
Then reload:
source ~/.zshrc
Verify it works:
flowmap --help
FlowMap uses Ollama for embeddings by default. It's free, runs locally, and needs no API keys.
# Install Ollama (macOS)
brew install ollama
# Start the server
ollama serve
# Pull the embedding model (~400MB download, one-time)
ollama pull qwen3-embedding:0.6b
# macOS
brew install ripgrep
# Ubuntu/Debian
apt install ripgrep
# Or see https://github.com/BurntSushi/ripgrep#installation
# Initialize config
flowmap init
# Add repos (use absolute paths)
flowmap repos add /path/to/your/project
flowmap repos add /path/to/another/project
# Verify
flowmap repos list
flowmap index
This walks each repo, parses files with tree-sitter, generates embeddings via Ollama, and stores everything locally. First run takes a few minutes depending on repo size. Subsequent runs are incremental (seconds).
flowmap search "retry logic"
That's it. You're searching across all your repos.
These are the things FlowMap is actually good at. Copy-paste these.
flowmap repos add /path/to/the/repo
flowmap index
flowmap map
This gives you every class, function, and file at a glance -- across all repos.
flowmap search "payment processing"
flowmap search "retry logic"
flowmap search "database connection pool"
This is the core use case. You describe what you're looking for in plain English, and FlowMap finds the relevant functions/classes across all your repos. It's not just grep -- it matches on meaning through embeddings, not only on literal text.
flowmap search "processOrder" --mode symbol
Symbol mode does exact/fuzzy matching on function and class names. Instant results, no embeddings needed.
# From search results, you see: auth-service/src/auth.py:25-70
flowmap cat src/auth.py --repo auth-service --lines 25-70
# Or jump directly to a symbol
flowmap cat src/auth.py --repo auth-service --symbol validateToken
flowmap history "validateToken"
flowmap history "payment" --repo payment-service --since "3 months ago"
This shows AST-level diffs -- not just "file changed" but "function processPayment had its signature changed on March 3rd."
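To make "AST-level diff" concrete, here is a minimal sketch that compares two versions of a Python file and reports added, removed, and signature-changed functions. It uses Python's built-in ast module rather than tree-sitter, and the function names are hypothetical; FlowMap's own diffing may work differently.

```python
import ast


def signatures(source: str) -> dict[str, str]:
    """Map each function name to a printable signature string."""
    result: dict[str, str] = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Simplification: same-named methods in different classes collide.
            result[node.name] = f"{node.name}({ast.unparse(node.args)})"
    return result


def structural_diff(old_source: str, new_source: str) -> list[str]:
    """Describe function-level changes between two versions of a file."""
    old, new = signatures(old_source), signatures(new_source)
    changes: list[str] = []
    for name in sorted(new.keys() - old.keys()):
        changes.append(f"added: {new[name]}")
    for name in sorted(old.keys() - new.keys()):
        changes.append(f"removed: {old[name]}")
    for name in sorted(old.keys() & new.keys()):
        if old[name] != new[name]:
            changes.append(f"signature changed: {old[name]} -> {new[name]}")
    return changes
```

Running this over consecutive git revisions of a file would yield a timeline of entries like the "signature changed" example above, independent of whitespace or comment edits.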
# Structural overview
flowmap map --format json
# Find relevant code for a question
flowmap search "how does auth work" --format json
# Read specific files
flowmap cat src/auth.py --repo my-service --format json
All commands support --format json. Pipe them to Claude, ChatGPT, or any LLM tool.
flowmap index
That's it. FlowMap detects what changed via git diff and only re-embeds the modified files. Takes seconds.
flowmap doctor
This checks: Ollama running? Model pulled? Repos exist? Index healthy? Dimension mismatch? It tells you exactly what's wrong and how to fix it.
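The "Ollama running? Model pulled?" part of such a health check can be approximated by querying Ollama's /api/tags endpoint, which lists locally available models. The sketch below is an assumption about how a check like this could look, not FlowMap's implementation; both function names are hypothetical.

```python
import json
import urllib.request


def model_is_pulled(tags_payload: dict, model: str) -> bool:
    """Return True if `model` appears in an Ollama /api/tags response."""
    # Models pulled without an explicit tag are listed as "<name>:latest".
    names = {m.get("name", "") for m in tags_payload.get("models", [])}
    return model in names


def check_ollama(url: str = "http://localhost:11434",
                 model: str = "qwen3-embedding:0.6b") -> str:
    """Return a short status line for the Ollama server and embedding model."""
    try:
        with urllib.request.urlopen(f"{url}/api/tags", timeout=3) as resp:
            payload = json.load(resp)
    except (OSError, ValueError):  # connection failures and malformed JSON
        return "Ollama is not reachable -- try: ollama serve"
    if not model_is_pulled(payload, model):
        return f"model missing -- try: ollama pull {model}"
    return "ok"
```

Keeping the payload check separate from the network call makes the logic easy to test without a running server.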
flowmap search "TODO" --mode keyword
flowmap search "FIXME" --mode keyword
Keyword mode uses ripgrep directly. No embeddings, no Ollama, instant results.
flowmap index # Build/update the index (incremental)
flowmap index --full # Force full rebuild
flowmap search "query" # Hybrid search (semantic + keyword + symbol)
flowmap search "fn" --mode symbol # Find by function/class name
flowmap search "x" --mode keyword # Grep-style (no Ollama needed)
flowmap map # Show repo structure
flowmap symbols # List all functions/classes
flowmap symbols "auth" # Search symbols by name
flowmap cat file.py --repo R # Read a file
flowmap cat file.py --symbol fn # Jump to a symbol
flowmap history "query" # Show structural changes over time
flowmap status # Check index health
flowmap doctor # Full system health check
flowmap repos add /path # Add a repo
flowmap repos list # List configured repos
flowmap reset --all # Delete all index data
The main command. Searches across all indexed repos.
# Default: hybrid search (semantic + keyword + symbol fusion)
flowmap search "database connection pooling"
# Semantic only (vector similarity)
flowmap search "error handling patterns" --mode semantic
# Keyword only (ripgrep -- no embeddings needed)
flowmap search "TODO" --mode keyword
# Symbol lookup (exact/fuzzy match on function/class names)
flowmap search "AuthMiddleware" --mode symbol
# Filter to one repo
flowmap search "validateToken" --repo auth-service
# JSON output (for piping to LLMs)
flowmap search "payment processing" --format json
# Cross-encoder reranking (slower but higher quality)
flowmap search "complex query" --rerank
Search modes:
| Mode | What it does | Speed | Needs Ollama? |
|---|---|---|---|
| hybrid (default) | Fuses ripgrep + vector + symbol search via RRF | ~1-2s | Yes |
| semantic | Vector similarity only | ~0.5s | Yes |
| keyword | ripgrep only (live filesystem grep) | ~0.1s | No |
| symbol | Exact/suffix/contains match on symbol names | ~0.1s | No |
Build or update the search index.
# Index all repos (incremental -- only re-embeds changed files)
flowmap index
# Force full re-index
flowmap index --full
# Index a specific repo
flowmap index --repo my-service
# Preview what would be indexed (fast, no parsing)
flowmap index --dry-run
Show a structural overview of your indexed repos -- classes, functions, file counts, languages.
flowmap map
flowmap map --repo my-service
flowmap map --format json
List and search symbols (functions, classes, methods) across repos.
# List all symbols
flowmap symbols
# Search for symbols matching a name
flowmap symbols "process"
# Filter by type
flowmap symbols --type class
flowmap symbols --type function --repo my-service
# JSON output
flowmap symbols "validate" --format json
Read source files from configured repos. Supports line ranges and symbol-based lookup.
# Read a file (auto-detects repo from path)
flowmap cat my-service/src/auth.py
# Specific line range
flowmap cat src/auth.py --repo my-service --lines 25-70
# Jump to a symbol
flowmap cat src/auth.py --repo my-service --symbol validateToken
# JSON output (useful for LLM context)
flowmap cat src/service.ts --repo my-service --format json
Show a timeline of structural changes -- which functions were added, removed, or had their signatures changed.
# What changed around "auth"?
flowmap history "validateToken"
# Scoped to a repo and time window
flowmap history "payment" --repo payment-service --since "3 months ago"
# Focus on a specific symbol
flowmap history "OrderProcessor" --symbol OrderProcessor.process
# JSON output
flowmap history "auth" --format json
Show index status for all repos.
flowmap status
Index: 12,450 total chunks
my-service 4,230 chunks 2026-04-01 (main, abc1234)
auth-service 3,100 chunks 2026-04-01 (main, def5678)
shared-lib 5,120 chunks 2026-03-28 (main, 789abcd)
Check that everything is set up correctly.
flowmap doctor
Checks: repo paths exist, Ollama is running, embedding model is pulled, ripgrep is installed, index is healthy, no dimension mismatches.
Manage configured repositories.
flowmap repos add /path/to/repo # Add a repo
flowmap repos add /path/to/repo --name custom-name # Add with a custom alias
flowmap repos list # List all repos and their index status
flowmap repos paths # Output repo paths (one per line)
Delete index data.
flowmap reset --repo my-service # Reset one repo
flowmap reset --all # Reset everything
Create a starter config file.
flowmap init # Creates ~/.flowmap/config.yaml
flowmap init --force # Overwrite existing config (preserves repo list)
Config lives at ~/.flowmap/config.yaml. Created by flowmap init.
# FlowMap configuration
repos:
  - name: my-service
    path: /Users/you/code/my-service
  - name: auth-service
    path: /Users/you/code/auth-service
data_dir: ~/.flowmap/data
embedding:
  backend: ollama # ollama | sentence-transformers
  model: qwen3-embedding:0.6b # model name
  ollama_url: http://localhost:11434
reranking:
  enabled: false # Enable with --rerank flag instead
  model: cross-encoder/ms-marco-MiniLM-L-6-v2
Use a config file at a different path with --config:
flowmap --config /path/to/config.yaml search "query"
Add a .flowmapignore file to any repo root to exclude files from indexing. Uses gitignore syntax.
# .flowmapignore
generated/
*.pb.go
*_test.go
vendor/
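As a rough illustration of how the patterns above select files, the sketch below handles only the two pattern shapes used in this example: directory names ending in "/" and basename globs. It is a simplified approximation, not full gitignore semantics (no negation, anchoring, or `**`), and the function names are hypothetical. A complete matcher would more likely rely on a dedicated library such as pathspec.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def load_patterns(text: str) -> list[str]:
    """Keep non-empty, non-comment lines from a .flowmapignore file."""
    return [
        line.strip()
        for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]


def is_ignored(rel_path: str, patterns: list[str]) -> bool:
    """Simplified check for directory patterns and basename globs only."""
    parts = PurePosixPath(rel_path).parts
    if not parts:
        return False
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: ignore anything below a directory of this name.
            if pattern.rstrip("/") in parts[:-1]:
                return True
        elif fnmatchcase(parts[-1], pattern):
            # Basename glob: matches the file name at any depth.
            return True
    return False
```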
Free, local, no API keys. Runs on CPU or GPU.
ollama serve
ollama pull qwen3-embedding:0.6b
Config:
embedding:
  backend: ollama
  model: qwen3-embedding:0.6b
  ollama_url: http://localhost:11434
Local Python-based embeddings. No external server needed, but requires PyTorch.
uv sync --extra local-embeddings
Config:
embedding:
  backend: sentence-transformers
  model: nomic-ai/CodeRankEmbed
Your repos          FlowMap                                  Search
-----------         ---------                                --------
.py .ts .go  --->   tree-sitter AST parsing
.java .yaml         chunk into functions,
                    classes, methods
                        |
                        v
                    Ollama / sentence-transformers
                    generate embeddings
                        |
                        v
                    LanceDB (vectors) + SQLite (metadata)
                    local storage, no cloud
                        |
                        v
                    3-way hybrid search              <---    "your query"
                    ripgrep + vector + symbol
                        |
                        v
                    Reciprocal Rank Fusion
                    merge & score results
                        |
                        v
                    Ranked results with
                    file, line, symbol, score
Indexing pipeline:
- Walk repos via git ls-files (respects .gitignore)
- Parse each file with tree-sitter to extract functions, classes, methods
- Generate embeddings via Ollama (batched, with retry)
- Store in LanceDB (vectors) + SQLite (metadata, state tracking)
- Incremental updates via git diff -- only changed files are re-embedded
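The incremental step can be sketched as follows: ask git which files changed since the last indexed commit, then split them into files to re-embed and files whose chunks should be dropped. This is a minimal sketch under assumptions; the function names and the idea of passing in a stored `last_indexed_sha` are hypothetical, and FlowMap's own reindex logic may differ.

```python
import subprocess


def parse_name_status(output: str) -> tuple[list[str], list[str]]:
    """Split `git diff --name-status` output into (to_reindex, to_delete)."""
    to_reindex: list[str] = []
    to_delete: list[str] = []
    for line in output.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        status = fields[0]
        if status.startswith("D"):
            to_delete.append(fields[1])
        elif status.startswith(("R", "C")):
            # Rename/copy lines look like "R100<TAB>old<TAB>new".
            if status.startswith("R"):
                to_delete.append(fields[1])
            to_reindex.append(fields[2])
        else:
            # A (added), M (modified), T (type change), and similar.
            to_reindex.append(fields[1])
    return to_reindex, to_delete


def changed_since(repo_path: str, last_indexed_sha: str) -> tuple[list[str], list[str]]:
    """Classify files changed between the last indexed commit and HEAD."""
    out = subprocess.run(
        ["git", "-C", repo_path, "diff", "--name-status", f"{last_indexed_sha}..HEAD"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_name_status(out)
```

Only the paths in the first list need parsing and embedding again, which is why an incremental run is so much cheaper than a full one.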
Search pipeline:
- Classify query (identifier vs natural language vs mixed)
- Run 3 search channels in parallel: ripgrep, vector similarity, symbol lookup
- Map ripgrep line hits to stored chunks (dedup before scoring)
- Fuse with weighted Reciprocal Rank Fusion (weights based on query type)
- Optional cross-encoder reranking on top-30 candidates
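The classification and fusion steps above can be sketched in a few lines. Reciprocal Rank Fusion scores each result as the sum over channels of weight / (k + rank). The weights, the constant k = 60, and the classification heuristic below are all assumptions chosen for illustration; FlowMap's actual values are not documented here.

```python
import re
from collections import defaultdict

# Hypothetical per-channel weights keyed by query type.
WEIGHTS = {
    "identifier": {"ripgrep": 1.0, "semantic": 0.5, "symbol": 2.0},
    "natural": {"ripgrep": 0.5, "semantic": 2.0, "symbol": 0.5},
    "mixed": {"ripgrep": 1.0, "semantic": 1.0, "symbol": 1.0},
}

IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_.]*$")


def classify(query: str) -> str:
    """Rough heuristic: a single identifier-like token, plain words, or a mix."""
    tokens = query.split()
    if len(tokens) == 1 and IDENTIFIER.match(tokens[0]):
        return "identifier"
    looks_like_code = any(re.search(r"[_.()]|[a-z][A-Z]", t) for t in tokens)
    return "mixed" if looks_like_code else "natural"


def weighted_rrf(rankings: dict[str, list[str]],
                 weights: dict[str, float],
                 k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked lists of chunk ids into one list sorted by fused score."""
    scores: dict[str, float] = defaultdict(float)
    for channel, ranked in rankings.items():
        for rank, chunk_id in enumerate(ranked, start=1):
            scores[chunk_id] += weights.get(channel, 1.0) / (k + rank)
    # Sort by descending score; break ties by id for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

Because RRF uses only ranks, the channels do not need comparable raw scores; a chunk that appears in several channels accumulates score from each, which is why multi-channel hits tend to rise to the top.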
All commands support --format json for piping to LLMs or scripts:
# Feed search results to an LLM
flowmap search "auth middleware" --format json | llm "explain these results"
# Get repo map as structured data
flowmap map --format json | jq '.repos[].classes[].name'
# Read a file for LLM context
flowmap cat src/auth.py --repo my-service --format json
Run flowmap doctor first. It checks everything and tells you what's wrong.
ollama serve # Start Ollama
ollama pull qwen3-embedding:0.6b # Pull the model
Keyword search and hybrid mode need ripgrep. Install it:
brew install ripgrep # macOS
apt install ripgrep # Ubuntu/Debian
Without ripgrep, --mode semantic and --mode symbol still work.
You changed the embedding model after indexing. Fix:
flowmap index --full
flowmap status # Check if repos are indexed
flowmap index # Re-index if needed
flowmap doctor # Check system health
First index is slow (parses all files + generates embeddings). Subsequent runs are incremental and fast. For very large repos, ensure Ollama has enough resources:
# Check Ollama is responsive
curl http://localhost:11434/api/tags
- Python >= 3.11
- Ollama (for embeddings) -- install
- ripgrep (for keyword search) -- install
- git (for file listing and incremental reindex)
# Clone
git clone https://github.com/aniket-agi/flowmap-cli.git
cd flowmap-cli
# Install with dev dependencies
uv sync --extra dev
# Run tests
uv run pytest tests/ -v
# Lint
uv run ruff check flowmap/
291 tests covering:
- Tree-sitter chunking (Python, TypeScript, Go, Java, YAML, JSON)
- LanceDB store operations (real database, not mocked)
- CLI commands (all 11 commands)
- Hybrid search fusion and deduplication
- Incremental reindexing with git
- End-to-end: index -> search -> cat pipeline
- SQL escaping and special character handling
- Crash recovery (embedding failure preserves data)
- History/timeline with structural diffs
ripgrep finds literal text matches. FlowMap understands code structure -- it knows what a function is, what class it belongs to, and can find semantically similar code even when the exact words don't match. FlowMap actually uses ripgrep as one of its three search channels and fuses the results.
GitHub code search works on github.com. FlowMap works on your local repos, offline, with no data leaving your machine. It also searches across multiple repos at once and provides structural history (AST-level diffs over time).
No. Everything runs locally. Embeddings are generated by Ollama on your machine. Data is stored in ~/.flowmap/data. No cloud services, no API keys, no telemetry.
Roughly 1-2 MB per 1,000 source files. A 10-repo setup with 50K files typically uses ~100 MB for the LanceDB vector store.
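Applying the rule of thumb directly gives a quick range for any setup. The helper below is a hypothetical convenience, not part of FlowMap; it only restates the 1-2 MB per 1,000 files figure as arithmetic.

```python
def estimate_index_mb(num_files: int,
                      mb_per_1000_files: tuple[float, float] = (1.0, 2.0)) -> tuple[float, float]:
    """Return a (low, high) disk estimate in MB from the rule of thumb."""
    low, high = mb_per_1000_files
    return num_files / 1000 * low, num_files / 1000 * high


low_mb, high_mb = estimate_index_mb(50_000)
print(f"{low_mb:.0f}-{high_mb:.0f} MB")  # prints "50-100 MB"
```

For 50K files this yields 50-100 MB, so the ~100 MB figure quoted above sits at the upper end of the range.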
Not currently. FlowMap supports Ollama (recommended) and sentence-transformers. Adding API-based backends is straightforward if there's demand -- open an issue.
First full index: ~1-5 minutes for a typical repo (depends on size and Ollama speed). Incremental updates after git pull: seconds -- only changed files are re-embedded.
Yes. Use --format json to pipe structured results into any LLM tool:
flowmap search "auth middleware" --format json
flowmap cat src/auth.py --repo my-service --format json
flowmap map --format json
- Use a GPU-accelerated Ollama install for faster embeddings
- Use --mode keyword or --mode symbol for searches that don't need embeddings
- The default model (qwen3-embedding:0.6b) is small and fast -- larger models are more accurate but slower
If tree-sitter has a grammar for your language, yes. Add the grammar package to pyproject.toml, register the extension mapping in flowmap/parsing/languages.py, and define symbol extraction rules in flowmap/parsing/chunker.py. PRs welcome.
Contributions are welcome. Here's how to get started:
git clone https://github.com/aniket-agi/flowmap-cli.git
cd flowmap-cli
uv sync --extra dev
uv run pytest tests/ -v
All 291 tests should pass. If they don't, your environment has an issue -- fix that first.
- Create a branch from master
- Write your code -- follow the existing style (4-space indent, no docstrings on obvious functions, no unnecessary abstractions)
- Add tests for any new behavior -- look at existing tests for patterns
- Run the full test suite -- uv run pytest tests/ -v
- Lint -- uv run ruff check flowmap/
- Open a PR with a clear description of what and why
- Bug fixes with a test that would have caught the bug
- New tree-sitter language grammars (Python, TS, Go, Java are done -- Rust, C, Ruby are not)
- Performance improvements with before/after measurements
- Better error messages for common failure modes
- Don't add features nobody asked for -- open an issue first to discuss
- Don't refactor working code for style preferences
- Don't add dependencies without a strong reason
- Don't break the --format json contract (other tools depend on it)
flowmap/
  cli.py                      # Click commands (entry point)
  config.py                   # YAML config loading, defaults
  store.py                    # LanceDB vector store
  state.py                    # SQLite metadata (indexed SHAs, pending markers)
  embeddings.py               # Ollama + sentence-transformers backends
  indexer.py                  # File walking, chunking orchestration
  reindex.py                  # Incremental reindex via git diff
  render.py                   # Output formatting (text + JSON)
  parsing/
    chunker.py                # Tree-sitter AST chunking
    languages.py              # Grammar registry
  search/
    hybrid.py                 # 3-way fusion + RRF + reranking
    ripgrep.py                # ripgrep subprocess wrapper
  services/
    indexing.py               # Index orchestration (full + incremental)
    file_resolver.py          # Resolve file paths across repos
    symbol_lookup.py          # Symbol resolution for --symbol flag
    map_builder.py            # Structural map data builder
  history/
    timeline.py               # Temporal query orchestration
    structural_diff.py        # AST-level diffing between versions
    git_ops.py                # Git log, show, pickaxe wrappers
tests/
  conftest.py                 # Shared fixtures (MockBackend, hash_vector)
  test_chunker.py             # 49 tests -- AST parsing for all languages
  test_store_integration.py   # 26 tests -- real LanceDB operations
  test_cli_commands.py        # 20 tests -- all CLI commands
  test_end_to_end.py          # 6 tests -- full index->search pipeline
  test_reindex.py             # 18 tests -- incremental reindex + ordering
  test_history.py             # 19 tests -- git ops + structural diff
  ...and more
- Large repos (>50K files): Memory usage scales with repo size. The entire chunk + embedding set is held in memory during indexing. For very large monorepos, ensure adequate RAM.
- Windows: Best-effort support. The advisory file lock behavior differs from Unix. Primary development and testing is on macOS/Linux.
- AST coverage: Only Python, TypeScript, JavaScript, Go, Java, YAML, and JSON have full tree-sitter grammars. Other languages fall back to line-based chunking (still indexed, but no symbol extraction).
- Concurrent access: Designed for single-user CLI use. Don't run flowmap index from multiple terminals simultaneously. Search during indexing may return partial results.
This section helps AI coding assistants (Claude Code, Cursor, Copilot, Aider, etc.) discover and use FlowMap when it's available on the user's machine.
FlowMap is a CLI tool installed on this machine that provides cross-repo code search. If the user has configured repos with FlowMap, you can use it to find code faster than reading files manually.
flowmap --help # Check if installed
flowmap status # Check if repos are indexed
flowmap doctor # Full health check
# Find code by meaning (semantic + keyword + symbol fusion)
flowmap search "authentication middleware" --format json
# Find a specific function or class by name
flowmap search "processOrder" --mode symbol --format json
# Grep-style keyword search (no embeddings needed)
flowmap search "TODO" --mode keyword --format json
# Get structural overview of all indexed repos
flowmap map --format json
# Read a specific file with line numbers
flowmap cat src/auth.py --repo my-service --format json
# Jump to a specific symbol's source code
flowmap cat src/auth.py --repo my-service --symbol validateToken --format json
# See what functions changed recently
flowmap history "validateToken" --format json
| Scenario | Use FlowMap | Use file reads |
|---|---|---|
| "Where is the retry logic?" | flowmap search "retry logic" | - |
| "What does processOrder do?" | flowmap search "processOrder" --mode symbol | Then flowmap cat the result |
| "Show me all classes in the project" | flowmap symbols --type class | - |
| "Read lines 50-100 of auth.py" | - | Read the file directly |
| "What changed in auth recently?" | flowmap history "auth" | - |
- Always use --format json when calling FlowMap -- it gives structured output you can parse
- flowmap search returns results ranked by relevance -- the first result is usually the best
- If flowmap status shows "not indexed", the user needs to run flowmap index first
- FlowMap searches across ALL configured repos at once -- you don't need to know which repo a function is in
MIT