A codebase intelligence platform that transforms your code into a queryable knowledge graph.
# Install and start server
pip install kotadb-client
# Index your codebase and start querying
python -c "
from kotadb import KotaDB, start_server
server = start_server(port=8080)
db = KotaDB('http://localhost:8080')
# Index and search your code
stats = db.index_codebase('./my-project')
results = db.search_code('function_name')
print(f'Found {len(results)} matches')
"
# macOS (Apple Silicon)
curl -L https://github.com/jayminwest/kota-db/releases/latest/download/kotadb-macos-arm64.tar.gz | tar xz
./kotadb serve --port 8080
# Linux x64
curl -L https://github.com/jayminwest/kota-db/releases/latest/download/kotadb-linux-x64.tar.gz | tar xz
./kotadb serve --port 8080
docker run -p 8080:8080 ghcr.io/jayminwest/kota-db:latest serve
pip install kotadb-client
npm install kotadb-client
cargo add kotadb
Go - Coming Soon (Track progress: #114)
From Binaries: Download from releases
From Source: cargo install kotadb
Docker: docker pull ghcr.io/jayminwest/kota-db:latest
✅ Codebase Intelligence
- Symbol extraction from source code (functions, classes, variables)
- Dependency tracking and impact analysis
- Cross-reference detection and caller analysis
✅ High-Performance Search
- Full-text search with <3ms latency (210x improvement)
- Symbol-based search with pattern matching
- Path-based queries with wildcard support
✅ Production Ready
- Crash-safe storage with Write-Ahead Logging
- Type-safe client libraries for Python, TypeScript, Rust
- Comprehensive test coverage (271 passing tests)
- Zero external database dependencies
✅ Developer Experience
- REST API for HTTP integration
- MCP server for AI assistant integration
- Pre-built binaries for all platforms
Real-world benchmarks on Apple Silicon:
| Operation | Latency | Throughput |
|---|---|---|
| Symbol Search | 277 µs | 3,600 ops/sec |
| Text Search | <3 ms | 333+ queries/sec |
| B+ Tree Lookup | 489 µs | 2,000 queries/sec |
Tested on KotaDB's own codebase (21,000+ symbols)
- Limited cross-language support (Rust focus)
- Basic query operators (no complex filtering)
- UX improvements needed for CLI interface
from kotadb import KotaDB
db = KotaDB("http://localhost:8080")
# Index your codebase
stats = db.index_codebase("./my-project")
print(f"Indexed {stats['symbols']} symbols")
# Search for symbols
symbols = db.search_symbols("DatabaseConnection")
for symbol in symbols:
    print(f"{symbol['type']}: {symbol['name']} at {symbol['location']}")
# Find function callers
callers = db.find_callers("process_data")
print(f"Called from {len(callers)} locations")
# Analyze change impact
impact = db.analyze_impact("StorageError")
print(f"Would affect {len(impact['affected'])} files")
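To make the idea behind caller and impact analysis concrete, the sketch below walks a small dependency graph in reverse to collect everything that could be affected by a change. It is a generic breadth-first traversal, not KotaDB's implementation; the `affected_by` function and the sample graph are hypothetical.

```python
from collections import deque


def affected_by(changed, depends_on):
    """Return every node that directly or transitively depends on `changed`.

    `depends_on` maps each node to the nodes it depends on (hypothetical data).
    """
    # Invert the edges so we can walk from a dependency to its dependents.
    dependents = {}
    for source, targets in depends_on.items():
        for target in targets:
            dependents.setdefault(target, set()).add(source)

    seen = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for dependent in dependents.get(node, ()):
            if dependent not in seen and dependent != changed:
                seen.add(dependent)
                queue.append(dependent)
    return seen


# Hypothetical module graph: each key depends on the modules in its list.
graph = {
    "api": ["storage"],
    "cli": ["api"],
    "storage": ["errors"],
    "tests": ["cli"],
    "docs": [],
}
print(sorted(affected_by("errors", graph)))
```

A change to a leaf such as `errors` reaches most of the graph, while a change to `cli` only reaches `tests`. That gap between a narrow and a wide blast radius is what an impact query is meant to surface.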
use anyhow::Result; // assumption: any Result alias with a dynamic error type works here
use kotadb::Database;

#[tokio::main]
async fn main() -> Result<()> {
    let db = Database::new("~/.kota/db").await?;

    // Index and search
    let stats = db.index_codebase("./my-project").await?;
    let symbols = db.search_symbols("FileStorage").await?;
    let callers = db.find_callers("process_data").await?;
    Ok(())
}
# Index your codebase
kotadb index-codebase ./my-project
# Search operations
kotadb search-code "async fn"
kotadb search-symbols "Storage*"
kotadb find-callers FileStorage
kotadb analyze-impact Config
# Database operations
kotadb stats --symbols
kotadb validate
KotaDB achieves sub-10ms query latency through:
- Optimized Indices: B+ tree, trigram, and vector search
- Native Storage: Custom page-based storage engine
- Memory Efficiency: <2.5x memory overhead vs raw data
- Concurrent Access: Lock-free read operations
See the architecture notes on performance in docs/architecture/technical_architecture.md for detailed analysis.
When built with the mcp-server feature, the server also exposes MCP-over-HTTP endpoints under /mcp/* for AI assistants:
# List tools (GET preferred; POST supported)
curl -sS -H "Authorization: Bearer $API_KEY" http://localhost:8080/mcp/tools
# Run text search tool
curl -sS -X POST -H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{"query":"storage","limit":10}' \
http://localhost:8080/mcp/tools/search_code
# Bridge stats/discovery
curl -sS -H "Authorization: Bearer $API_KEY" \
http://localhost:8080/mcp/tools/stats
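The same calls can be made from Python with only the standard library. The sketch below builds the requests without sending them. The paths, the bearer header, and the JSON body are taken from the curl examples above; the helper name `build_tool_request` is hypothetical and is not part of the kotadb client.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # server address used in the curl examples


def build_tool_request(api_key, tool=None, payload=None, base_url=BASE_URL):
    """Build (but do not send) a request for /mcp/tools or /mcp/tools/<tool>."""
    url = f"{base_url}/mcp/tools" + (f"/{tool}" if tool else "")
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if payload is not None:
        # A JSON body makes this a POST, as in the search_code example.
        headers["Content-Type"] = "application/json"
        data = json.dumps(payload).encode("utf-8")
    method = "POST" if data is not None else "GET"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


list_tools = build_tool_request("example-key")
search = build_tool_request("example-key", "search_code", {"query": "storage", "limit": 10})
print(list_tools.get_method(), list_tools.full_url)
print(search.get_method(), search.full_url)
```

To execute a request against a running server, pass it to `urllib.request.urlopen` and decode the response body as JSON.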
See docs/search_sanitization_and_thresholds.md for details on:
- Differences between sanitize_search_query and sanitize_path_aware_query.
- Optional strict-sanitization feature for high-threat environments.
- Trigram matching thresholds and how they balance precision vs recall.
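As a rough illustration of how a trigram threshold trades precision against recall, the sketch below scores a candidate by the fraction of the query's trigrams it contains. This is a generic technique, not KotaDB's actual scoring; the function names and threshold values are assumptions chosen for the example.

```python
def trigrams(text):
    """Return the set of lowercase 3-character substrings of `text`."""
    lowered = text.lower()
    return {lowered[i:i + 3] for i in range(len(lowered) - 2)}


def trigram_overlap(query, candidate):
    """Fraction of the query's trigrams that also occur in the candidate."""
    query_grams = trigrams(query)
    if not query_grams:
        return 0.0  # queries shorter than 3 characters have no trigrams
    return len(query_grams & trigrams(candidate)) / len(query_grams)


def matches(query, candidate, threshold):
    """Accept the candidate when enough of the query's trigrams are present."""
    return trigram_overlap(query, candidate) >= threshold


# Hypothetical thresholds: a low one favors recall, a high one favors precision.
for threshold in (0.3, 0.6):
    print(threshold, matches("storage", "store", threshold))
```

With the lower threshold, the partial match `store` is accepted for the query `storage`; with the higher one it is rejected, while a candidate such as `FileStorage` passes either way.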
Stress/performance tests support CI-aware, env-overridable thresholds. See docs/ci_aware_test_thresholds.md for variables, defaults, and examples.
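The general pattern can be sketched in a few lines: an explicit environment override wins, otherwise a looser default applies when running under CI. The variable name `EXAMPLE_MAX_LATENCY_MS` and the default values below are hypothetical. The real variables and defaults are the ones listed in the document referenced above.

```python
import os


def resolve_threshold(name, local_default, ci_default, env=None):
    """Pick a test threshold: explicit override first, then a CI or local default."""
    env = os.environ if env is None else env
    override = env.get(name)
    if override is not None:
        return float(override)
    # Many CI providers set CI=true; treat that as the signal for looser limits.
    is_ci = env.get("CI", "").lower() in ("1", "true")
    return ci_default if is_ci else local_default


# Hypothetical variable name and limits, shown with explicit environments.
print(resolve_threshold("EXAMPLE_MAX_LATENCY_MS", 10.0, 50.0, env={}))
print(resolve_threshold("EXAMPLE_MAX_LATENCY_MS", 10.0, 50.0, env={"CI": "true"}))
```

Passing the environment as a parameter keeps the helper easy to test without mutating `os.environ`.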
Bridge errors use a stable schema { success: false, error: { code, message } }. Common codes: feature_disabled, tool_not_found, registry_unavailable, internal_error.
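Because the error shape is stable, a client can convert it into an exception in one place. The sketch below assumes only the error schema and codes stated above. The `BridgeError` class, the `parse_bridge_response` helper, and the shape of the successful payload are hypothetical.

```python
import json


class BridgeError(Exception):
    """Raised when a bridge response carries success: false."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def parse_bridge_response(body):
    """Decode a JSON response body and raise BridgeError on the error schema."""
    payload = json.loads(body)
    if isinstance(payload, dict) and payload.get("success") is False:
        error = payload.get("error") or {}
        raise BridgeError(error.get("code", "unknown"), error.get("message", ""))
    return payload


# Example error body following the documented schema.
error_body = '{"success": false, "error": {"code": "tool_not_found", "message": "no such tool"}}'
try:
    parse_bridge_response(error_body)
except BridgeError as exc:
    print(exc.code)
```

Callers can then branch on `exc.code`, for example treating `feature_disabled` as a configuration problem rather than retrying.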
- Getting Started - Installation and first steps
- API Reference - Complete API documentation
- Architecture - Technical design details
- Developer Guide - Development workflow
- Agent Guide - LLM agent collaboration protocol
KotaDB is developed 100% by LLM agents following a structured workflow:
- Open an issue describing the change
- Agents review and implement the change following the AGENT.md protocols
- Changes are validated through comprehensive testing
- Documentation is updated automatically
See Contributing Guide for details.
MIT - See LICENSE for details.
Built for AI-assisted development • Inspired by LevelDB, Tantivy, and FAISS