Home
A language-agnostic codebase comprehension layer that orchestrates multiple code intelligence backends (SCIP, LSP, Git) and provides semantically compressed, LLM-optimized views with persistent architectural understanding.
CKB analyzes, indexes, and explains your code but never modifies it. It won't refactor, lint, format, auto-fix, or enforce coding standards. Think of it as a librarian who knows everything about the books but never rewrites them.
CKB (Code Knowledge Backend) is the missing link between your codebase and AI assistants. While AI coding tools like Claude, Cursor, and GitHub Copilot are powerful, they struggle with large codebases because they lack deep structural understanding of your code.
CKB solves this by providing:
- A unified query layer that abstracts away the complexity of different code intelligence tools
- Semantic compression that delivers exactly what an LLM needs without overwhelming its context window
- Stable symbol tracking that survives refactoring, renames, and code moves
- Architectural memory that maintains persistent knowledge about your codebase structure, ownership, and design decisions
When you ask an AI assistant "what calls this function?", it typically:
- Searches for text patterns (error-prone)
- Reads random files hoping to find context (inefficient)
- Gives up and asks you to provide more context (frustrating)
Your codebase has valuable intelligence scattered across:
- SCIP indexes - Precise symbol information, but requires setup
- Language servers - Real-time analysis, but slow for large queries
- Git - History and blame, but no semantic understanding
- CODEOWNERS - Ownership rules, but no integration with code intelligence
Each tool speaks a different language. None of them are optimized for AI consumption.
Even with 100K+ token context windows, you can't just dump your entire codebase into an LLM. You need:
- Relevant information only
- Properly compressed responses
- Smart truncation with follow-up suggestions
You: "What's the impact of changing the UserService.authenticate() method?"
CKB provides:
├── Symbol details (signature, visibility, location)
├── 12 direct callers across 4 modules
├── Risk score: HIGH (public API, many dependents)
├── Affected modules: auth, api, admin, tests
├── Code owners: @security-team, @api-team
└── Suggested drilldowns for deeper analysis
You: "Show me the architecture of this codebase"
CKB provides:
├── Module dependency graph
├── Key symbols per module
├── Module responsibilities and ownership
├── Import/export relationships
└── Compressed to fit LLM context
You: "Is it safe to rename this function?"
CKB provides:
├── All references (not just text matches)
├── Cross-module dependencies
├── Test coverage of affected code
├── Hotspot risk assessment
└── Breaking change warnings
You: "Who should review changes to internal/api?"
CKB provides:
├── Primary owners from CODEOWNERS
├── Recent contributors from git blame
├── Related architectural decisions
└── Historical hotspot trends
Query SCIP, LSP, and Git through a single interface. CKB automatically:
- Routes queries to the best available backend
- Falls back gracefully when backends are unavailable
- Merges results from multiple sources
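The routing behaviour above can be pictured with a small sketch. This is not CKB's implementation: the backend functions, the `ConnectionError` signal for an unavailable backend, and the result shape are all assumptions made for illustration.

```python
from typing import Callable, Optional

# Hypothetical backend signature: returns reference strings, or None when it has nothing.
Backend = Callable[[str], Optional[list[str]]]


def query_with_fallback(symbol: str, backends: list[tuple[str, Backend]]) -> dict:
    """Try backends in priority order, skip unavailable ones, merge unique results."""
    merged: list[str] = []
    used: list[str] = []
    for name, backend in backends:
        try:
            result = backend(symbol)
        except ConnectionError:
            continue  # backend unavailable: fall back to the next one
        if result is None:
            continue
        used.append(name)
        for ref in result:
            if ref not in merged:  # de-duplicate while keeping priority order
                merged.append(ref)
    return {"symbol": symbol, "backends": used, "references": merged}


# Stand-in backends (hypothetical data).
def scip(symbol: str):
    raise ConnectionError("no SCIP index available")


def lsp(symbol: str):
    return ["api/handler.go:42", "auth/login.go:10"]


def git(symbol: str):
    return ["auth/login.go:10", "admin/users.go:7"]


result = query_with_fallback(
    "UserService.authenticate", [("scip", scip), ("lsp", lsp), ("git", git)]
)
print(result["backends"])    # ['lsp', 'git']
print(result["references"])  # ['api/handler.go:42', 'auth/login.go:10', 'admin/users.go:7']
```

The caller never needs to know that the SCIP index was missing; it only sees which backends contributed.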
Symbols get permanent IDs that survive:
- Renames (`oldName` → `newName`)
- Moves (`pkg/old/` → `pkg/new/`)
- Refactoring (extract method, inline, etc.)
Old references automatically redirect to current locations.
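One way to picture the redirect mechanism is an alias table that maps every name a symbol has ever had to a single permanent ID. The sketch below is an assumption about how such a scheme could work, not a description of CKB's internals; the `sym:N` ID format and the `SymbolRegistry` class are hypothetical.

```python
class SymbolRegistry:
    """Toy registry: permanent IDs plus aliases that keep old names resolvable."""

    def __init__(self):
        self._current = {}  # stable id -> current qualified name
        self._aliases = {}  # any past or present name -> stable id
        self._next = 1

    def register(self, name: str) -> str:
        sid = f"sym:{self._next}"  # hypothetical ID format
        self._next += 1
        self._current[sid] = name
        self._aliases[name] = sid
        return sid

    def rename(self, old: str, new: str) -> str:
        sid = self._aliases[old]
        self._current[sid] = new
        self._aliases[new] = sid  # the old name is kept as an alias
        return sid

    def resolve(self, name: str) -> tuple[str, str]:
        sid = self._aliases[name]
        return sid, self._current[sid]


reg = SymbolRegistry()
sid = reg.register("pkg/old.oldName")
reg.rename("pkg/old.oldName", "pkg/new.newName")
print(reg.resolve("pkg/old.oldName"))  # ('sym:1', 'pkg/new.newName')
```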
Responses are optimized for LLM consumption:
- Configurable token budgets
- Intelligent truncation (most relevant first)
- Drilldown suggestions for deeper exploration
- Deterministic output for reliable caching
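A minimal sketch of budgeted truncation, assuming a crude whitespace word count as the token estimate and a relevance score supplied by the caller. The function name, the scoring, and the drilldown wording are hypothetical, not CKB's API.

```python
def compress(items, budget_tokens):
    """items: list of (text, relevance). Keep the most relevant entries within budget."""
    # Sort by relevance (descending), then by text, so output is deterministic.
    ranked = sorted(items, key=lambda it: (-it[1], it[0]))
    kept, used = [], 0
    for text, _score in ranked:
        cost = len(text.split())  # crude token estimate: whitespace-separated words
        if used + cost > budget_tokens:
            break  # everything less relevant is truncated
        kept.append(text)
        used += cost
    omitted = len(ranked) - len(kept)
    drilldowns = (
        [f"{omitted} more results omitted; narrow the query to see them"]
        if omitted
        else []
    )
    return {"results": kept, "tokens": used, "drilldowns": drilldowns}


callers = [
    ("tests.TestAuth calls authenticate", 0.2),
    ("api.LoginHandler calls authenticate", 0.9),
    ("admin.ResetPassword calls authenticate", 0.7),
]
view = compress(callers, budget_tokens=8)
print(view["results"])     # the two most relevant callers
print(view["drilldowns"])  # one hint about the omitted result
```

Because the ordering has a tie-breaker, the same input always produces the same output, which is what makes the result safe to cache.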
Understand the blast radius before making changes:
- Visibility detection (public/private/internal)
- Risk scoring based on usage patterns
- Module-level impact summaries
- Breaking change detection
Fast responses through intelligent caching:
- Query cache - Recent query results
- View cache - Expensive computations
- Negative cache - Avoid repeated failures
All caches invalidate automatically when code changes.
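The sketch below illustrates one possible way to get automatic invalidation: key every entry by a repository state identifier, so entries from an older state are simply never hit again. Keying by repo state, the `LookupError` failure signal, and the class itself are assumptions for illustration only.

```python
class QueryCache:
    """Toy cache keyed by (repo_state, query), with a negative cache for failures."""

    def __init__(self):
        self._hits = {}
        self._failures = set()  # negative cache: queries known to fail

    def get_or_compute(self, repo_state, query, compute):
        key = (repo_state, query)
        if key in self._failures:
            return None  # known failure for this state: do not retry
        if key in self._hits:
            return self._hits[key]
        try:
            value = compute(query)
        except LookupError:
            self._failures.add(key)
            return None
        self._hits[key] = value
        return value


calls = []


def find_refs(query):
    """Stand-in for an expensive backend call (hypothetical)."""
    calls.append(query)
    if query == "missing":
        raise LookupError(query)
    return [f"{query}@handler.go"]


cache = QueryCache()
cache.get_or_compute("commit-a", "authenticate", find_refs)
cache.get_or_compute("commit-a", "authenticate", find_refs)  # served from cache
cache.get_or_compute("commit-a", "missing", find_refs)
cache.get_or_compute("commit-a", "missing", find_refs)       # negative cache hit
cache.get_or_compute("commit-b", "authenticate", find_refs)  # code changed: recomputed
print(calls)  # ['authenticate', 'missing', 'authenticate']
```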
Persistent knowledge that survives across sessions:
- Module registry - Boundaries, responsibilities, and tags from `MODULES.toml` or inference
- Ownership tracking - CODEOWNERS integration + git-blame analysis with time decay
- Hotspot trends - Historical risk tracking with trend analysis and 30-day projections
- Decision log - Architectural Decision Records (ADRs) with full-text search
Long-running operations run asynchronously:
- Job queue - SQLite-backed job persistence in `~/.ckb/jobs.db`
- Async refresh - `refreshArchitecture` with `async: true` returns immediately with `jobId`
- Progress tracking - Poll `getJobStatus` for progress and results
- Job management - List, filter, and cancel running jobs
Built for automated pipelines:
- PR analysis - `summarizePr` assesses risk, suggests reviewers, identifies affected modules
- Ownership drift - `getOwnershipDrift` compares CODEOWNERS vs actual contributors
- GitHub Actions - Example workflows in `examples/github-actions/`
Cross-repository queries and unified visibility:
- Multi-repo collections - Group related repositories into named federations
- Cross-repo search - Search modules, ownership, hotspots, and decisions across repos
- Stable identity - UUID-based repo identity survives renames
- Staleness propagation - Federation freshness reflects weakest link
Always-on service for continuous code intelligence:
- Background daemon - Long-running process with HTTP API on port 9120
- Job queue - Async operations with progress tracking and cancellation
- Scheduler - Cron and interval expressions for automated refresh
- File watcher - Git change detection with debounced refresh
- Webhooks - Outbound notifications to Slack, PagerDuty, Discord with retry logic
Language-agnostic complexity metrics via tree-sitter:
- Multi-language support - Go, JavaScript, TypeScript, Python, Rust, Java, Kotlin
- Cyclomatic complexity - Decision points analysis (if, for, while, switch, &&, ||)
- Cognitive complexity - Nesting-weighted complexity for maintainability assessment
- Hotspot integration - Complexity metrics feed into hotspot risk scores
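To make the cyclomatic metric concrete, here is a minimal sketch that counts decision points in Python source. It uses Python's standard `ast` module rather than tree-sitter, and it only counts `if`, `for`, `while`, and boolean operators, so treat it as an illustration of the idea rather than CKB's analyzer.

```python
import ast


def cyclomatic_complexity(source: str) -> int:
    """1 + number of decision points (if/for/while and each extra and/or operand)."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.If, ast.For, ast.While)):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` has three operands but adds two decision points.
            complexity += len(node.values) - 1
    return complexity


sample = '''
def authenticate(user, password):
    if user is None or password is None:
        return False
    for attempt in range(3):
        if check(user, password) and not user.locked:
            return True
    return False
'''
print(cyclomatic_complexity(sample))  # 6
```

The sample scores 6: one base path, two `if` statements, one `for` loop, one `or`, and one `and`.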
Cross-repo intelligence through explicit API boundaries:
- Contract detection - Automatic discovery of protobuf (`.proto`) and OpenAPI specs
- Visibility classification - Public, internal, or unknown based on paths and metadata
- Consumer detection - Three evidence tiers (declared, derived, heuristic)
- Impact analysis - "What breaks if I change this shared API?"
- Risk assessment - Low/medium/high risk with detailed factors
- Transitive analysis - Follow proto import graphs across repos
Observed reality through OpenTelemetry integration:
- OTLP ingest - Accept metrics from OpenTelemetry Collector
- Symbol matching - Map telemetry to code symbols (exact, strong, weak quality levels)
- Coverage tracking - Know how much of your code is observed
- Usage display - See actual call counts for any symbol
- Dead code detection - Find symbols with zero runtime calls
- Blended confidence - Combine static analysis with observed reality
- Impact enrichment - Add observed callers to impact analysis
Go beyond what code does to understand why it exists:
- Symbol origin - Who wrote it, when, why, and what issues/PRs are linked
- Evolution timeline - How this code has changed over time
- Co-change coupling - Find files that historically change together
- Proactive warnings - Detect temporary code, single-author risk, high coupling, staleness
- LLM export - Token-efficient codebase summaries with importance ranking
- Risk audit - 8-factor risk scoring (complexity, coverage, bus factor, security, staleness, errors, coupling, churn)
- Quick wins - Find high-impact, low-effort refactoring targets
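Co-change coupling can be estimated from commit history alone. The sketch below assumes the input is a list of commits, each a list of touched file paths, and scores a pair as the share of commits touching either file that touch both. The scoring formula and the `min_shared` threshold are assumptions, not CKB's documented method.

```python
from collections import Counter
from itertools import combinations


def co_change_coupling(commits, min_shared=2):
    """Return {(file_a, file_b): coupling} for pairs changed together often enough."""
    file_counts = Counter()
    pair_counts = Counter()
    for files in commits:
        unique = sorted(set(files))  # sorted so each pair has one canonical order
        file_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    coupling = {}
    for (a, b), shared in pair_counts.items():
        if shared >= min_shared:
            # Commits touching both, divided by commits touching either.
            coupling[(a, b)] = shared / (file_counts[a] + file_counts[b] - shared)
    return coupling


commits = [  # hypothetical history
    ["auth/login.go", "auth/session.go"],
    ["auth/login.go", "auth/session.go", "api/handler.go"],
    ["api/handler.go"],
    ["auth/login.go"],
]
pairs = co_change_coupling(commits)
print(pairs)  # only the login/session pair passes the threshold
```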
Get started in seconds without building from source:
- npm distribution - `npm install -g @tastehub/ckb` or `npx @tastehub/ckb`
- 58 MCP tools - Full code intelligence via Model Context Protocol
Code intelligence without requiring a SCIP index upfront:
- Tree-sitter fallback - Symbol extraction for 8 languages without SCIP
- `ckb index` command - Auto-detects language and runs the right indexer
- Universal MCP docs - Setup instructions for all major AI tools
- `ckb setup` - Interactive wizard for Claude Code, Cursor, Windsurf, VS Code, OpenCode, Claude Desktop
- Extended languages - Added C/C++, Dart, Ruby, C#, PHP indexer support
- Smart indexing - Skip-if-fresh, freshness tracking, concurrent lock protection
- `ckb mcp --watch` - Auto-reindex mode with 30-second polling
- Explicit tiers - Control analysis depth with `--tier=fast|standard|full`
- `ckb doctor --tier` - Check tool requirements for each analysis tier
Bridge documentation and code with automatic symbol detection:
- Backtick detection - Automatically detect `Symbol.Name` references in markdown
- Directive support - Explicit `<!-- ckb:symbol -->` and `<!-- ckb:module -->` directives
- Fence scanning - Extract symbols from fenced code blocks (8 languages via tree-sitter)
- Staleness detection - Find broken references when symbols are renamed or deleted
- Rename awareness - Suggest new names when documented symbols are renamed
- CI enforcement - `--fail-under` flag for documentation coverage thresholds
- `known_symbols` - Allow single-segment symbol detection via directive
Serve symbol indexes over HTTP for remote federation clients:
- Index Server Mode - `ckb serve --index-server` enables remote index endpoints
- Multi-Repo Support - Serve multiple repositories from a single CKB instance
- REST API - 10 endpoints for repos, symbols, refs, callgraph, and search
- HMAC-Signed Cursors - Secure pagination with tamper-proof cursors
- Privacy Redaction - Per-repo controls for exposing paths, docs, and signatures
- TOML Configuration - Configure repos, privacy settings, and pagination limits
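For readers unfamiliar with signed pagination cursors, the general technique looks like the sketch below: the server serializes its pagination state, prepends an HMAC over it, and rejects any cursor whose signature does not verify. The secret, the cursor fields, and the encoding are assumptions; CKB's actual cursor format is not specified here.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-secret"  # hypothetical server-side key


def encode_cursor(state: dict) -> str:
    payload = json.dumps(state, sort_keys=True, separators=(",", ":")).encode()
    sig = hmac.new(SECRET, payload, hashlib.sha256).digest()  # 32 bytes
    return base64.urlsafe_b64encode(sig + payload).decode()


def decode_cursor(cursor: str) -> dict:
    raw = base64.urlsafe_b64decode(cursor.encode())
    sig, payload = raw[:32], raw[32:]
    expected = hmac.new(SECRET, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(sig, expected):  # constant-time comparison
        raise ValueError("cursor signature mismatch")
    return json.loads(payload)


cursor = encode_cursor({"repo": "api", "offset": 100})
print(decode_cursor(cursor))  # {'offset': 100, 'repo': 'api'}

# Flip one bit in the payload to simulate a client editing the cursor.
raw = bytearray(base64.urlsafe_b64decode(cursor))
raw[-2] ^= 0x01
tampered = base64.urlsafe_b64encode(bytes(raw)).decode()
try:
    decode_cursor(tampered)
    tamper_detected = False
except ValueError:
    tamper_detected = True
print(tamper_detected)  # True
```

A client can pass the cursor back unchanged, but cannot alter the offset or repo without invalidating the signature.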
Know who owns what code:
- Parse CODEOWNERS files automatically
- Compute ownership from git blame with time decay
- Track ownership changes over time
- Suggest reviewers for pull requests
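Time-decayed ownership can be sketched as follows: each blamed line contributes a weight that halves every fixed number of days, so recent authors count for more than historical ones. The 90-day half-life and the input shape are assumptions chosen for the example, not CKB's configured values.

```python
from collections import defaultdict


def ownership_scores(blame_lines, half_life_days=90.0):
    """blame_lines: list of (author, age_days). Returns normalized shares per author."""
    weights = defaultdict(float)
    for author, age_days in blame_lines:
        # A line's weight halves every `half_life_days` (assumed decay curve).
        weights[author] += 0.5 ** (age_days / half_life_days)
    total = sum(weights.values())
    return {author: weight / total for author, weight in weights.items()}


# Hypothetical blame data: bob wrote more lines, but alice's are recent.
blame = [("alice", 0), ("alice", 0), ("bob", 90), ("bob", 90), ("bob", 180)]
scores = ownership_scores(blame)
print(max(scores, key=scores.get))  # alice
```

Here bob has three lines to alice's two, yet alice ends up with the larger share because her lines are current.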
Identify volatile areas before they become problems:
- Track churn metrics over time
- Compute composite risk scores (churn + coupling + complexity)
- Detect trends (increasing/stable/decreasing)
- Project future hotspot scores
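As an illustration of the scoring and projection steps, the sketch below combines three normalized signals with fixed weights, fits a least-squares line through past scores, and extrapolates 30 days ahead. The weights, the tolerance for "stable", and the linear model are all assumptions; the source does not state how CKB computes these.

```python
def composite_score(churn, coupling, complexity, weights=(0.5, 0.3, 0.2)):
    """Weighted sum of signals normalized to 0..1 (weights are hypothetical)."""
    return weights[0] * churn + weights[1] * coupling + weights[2] * complexity


def trend_and_projection(history, horizon_days=30, tolerance=0.001):
    """history: list of (day, score) with at least two distinct days."""
    n = len(history)
    mean_x = sum(day for day, _ in history) / n
    mean_y = sum(score for _, score in history) / n
    var_x = sum((day - mean_x) ** 2 for day, _ in history)
    slope = sum((day - mean_x) * (score - mean_y) for day, score in history) / var_x
    last_day = max(day for day, _ in history)
    projected = mean_y + slope * (last_day + horizon_days - mean_x)
    if slope > tolerance:
        trend = "increasing"
    elif slope < -tolerance:
        trend = "decreasing"
    else:
        trend = "stable"
    return trend, min(1.0, max(0.0, projected))  # clamp to the score range


score = composite_score(churn=0.8, coupling=0.5, complexity=0.4)
history = [(0, 0.40), (10, 0.45), (20, 0.50), (30, 0.55)]
trend, projected = trend_and_projection(history)
print(round(score, 2), trend, round(projected, 2))  # 0.63 increasing 0.7
```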
| Use Case | Without CKB | With CKB |
|---|---|---|
| Find all callers | Grep + manual filtering | Precise semantic results |
| Understand function | Read surrounding files | Structured summary with context |
| Safe refactoring | Hope for the best | Impact analysis + risk score |
| Code review | Check changed files only | See downstream effects + owners |
| Onboarding | Read docs + explore | Query architecture instantly |
| Find code owner | Search CODEOWNERS manually | Query ownership for any path |
| Track tech debt | Gut feeling | Hotspot trends with data |
- Developers using AI assistants - Give your AI tools superpowers
- Teams with large codebases - Navigate complexity efficiently
- Anyone doing refactoring - Understand impact before changing
- Code reviewers - See the full picture of changes
- Tech leads - Track architectural health over time
- Quick Start - Step-by-step installation for Windows, macOS, and Linux
- Prompt Cookbook - Real prompts for real problems (start here if you're new!)
- Language Support - Which languages work best, SCIP indexers, and support tiers
- Practical Limits - Accuracy notes, blind spots, and how to validate results
- User Guide - Getting started, CLI commands, best practices
- Incremental Indexing - Fast index updates for Go projects, accuracy guarantees (v7.3)
- Doc-Symbol Linking - Automatic symbol detection in documentation, staleness checking (v7.3)
- Authentication - API tokens, scopes, rate limiting for index server (v7.3)
- Federation - Cross-repository queries, contract analysis, and unified visibility (v6.3)
- Telemetry - Runtime observability, dead code detection, observed usage (v6.4)
- CI/CD Integration - GitHub Actions workflows, PR analysis, automated refresh
- API Reference - HTTP API documentation
- Daemon Mode - Always-on service with scheduler, watcher, and webhooks (v6.2.1)
- MCP Integration - Claude Desktop / AI assistant setup (71 tools available)
- Architecture - System design and components
- Configuration - All configuration options, MODULES.toml format, and ADR workflow
- Performance - Latency targets and benchmarks
- Contributing - Development guidelines
# Install globally
npm install -g @tastehub/ckb
# Or run directly without installing
npx @tastehub/ckb --help
# Or build from source
git clone https://github.com/SimplyLiz/CodeMCP.git
cd CodeMCP
go build -o ckb ./cmd/ckb

New to CKB? See the Quick Start guide for detailed instructions.

# Initialize in your project
cd /path/to/your/project
ckb init # or: npx @tastehub/ckb init
# Generate SCIP index (auto-detects language)
ckb index
# Check status
ckb status
# Configure Claude Code
ckb setup
# Search for symbols
ckb search "myFunction"
# Find references
ckb refs "symbol-id"
# Analyze impact
ckb impact "symbol-id"
# Query ownership
ckb ownership internal/api/handler.go
# View architectural decisions
ckb decisions
# Start MCP server for AI assistants
ckb mcp

CKB exposes code intelligence through the Model Context Protocol:

| Tool | Purpose |
|---|---|
| `searchSymbols` | Find symbols by name with filtering |
| `getSymbol` | Get symbol details |
| `findReferences` | Find all usages |
| `explainSymbol` | AI-friendly symbol explanation |
| `justifySymbol` | Keep/investigate/remove verdict |
| `getCallGraph` | Caller/callee relationships |
| `getModuleOverview` | Module statistics |
| `analyzeImpact` | Change risk analysis |
| `getStatus` | System health |
| `doctor` | Diagnostics |
| Tool | Purpose |
|---|---|
| `traceUsage` | How is this symbol reached? |
| `listEntrypoints` | System entrypoints (API, CLI, jobs) |
| `explainFile` | File-level orientation |
| `explainPath` | Why does this path exist? |
| `summarizeDiff` | What changed, what might break? |
| `getArchitecture` | Module dependency overview |
| `getHotspots` | Volatile areas with trends |
| `listKeyConcepts` | Domain concepts in codebase |
| `recentlyRelevant` | What matters now? |
| Tool | Purpose |
|---|---|
| `getOwnership` | Who owns this code? |
| `getModuleResponsibilities` | What does this module do? |
| `recordDecision` | Create an ADR |
| `getDecisions` | Query architectural decisions |
| `annotateModule` | Add module metadata |
| `refreshArchitecture` | Rebuild architectural model |
| Tool | Purpose |
|---|---|
| `getJobStatus` | Query background job status |
| `listJobs` | List jobs with filters |
| `cancelJob` | Cancel queued/running job |
| `summarizePr` | PR risk analysis & reviewers |
| `getOwnershipDrift` | CODEOWNERS vs actual ownership |
| Tool | Purpose |
|---|---|
| `listFederations` | List all federations |
| `federationStatus` | Get federation status |
| `federationRepos` | List repos in federation |
| `federationSearchModules` | Cross-repo module search |
| `federationSearchOwnership` | Cross-repo ownership search |
| `federationGetHotspots` | Merged hotspots across repos |
| `federationSearchDecisions` | Cross-repo decision search |
| `federationSync` | Sync federation index |
| Tool | Purpose |
|---|---|
| `daemonStatus` | Daemon health and stats |
| `listSchedules` | List scheduled tasks |
| `runSchedule` | Run a schedule immediately |
| `listWebhooks` | List configured webhooks |
| `testWebhook` | Send test webhook |
| `webhookDeliveries` | Get delivery history |
| Tool | Purpose |
|---|---|
| `getFileComplexity` | Cyclomatic/cognitive complexity metrics |
| Tool | Purpose |
|---|---|
| `listContracts` | List contracts in federation |
| `analyzeContractImpact` | Analyze impact of contract changes |
| `getContractDependencies` | Get contract deps for a repo |
| `suppressContractEdge` | Suppress false positive edge |
| `verifyContractEdge` | Verify an edge |
| `getContractStats` | Contract statistics |
| Tool | Purpose |
|---|---|
| `getTelemetryStatus` | Coverage metrics and sync status |
| `getObservedUsage` | Observed usage data for a symbol |
| `findDeadCodeCandidates` | Find symbols with zero runtime calls |
| Tool | Purpose |
|---|---|
| `explainOrigin` | Why does this code exist? (origin, evolution, warnings) |
| `analyzeCoupling` | Find files/symbols that change together |
| `exportForLLM` | LLM-friendly codebase export with importance ranking |
| `auditRisk` | Multi-signal risk audit (8 weighted factors) |
| Feature | Description |
|---|---|
| npm distribution | `npm install -g @tastehub/ckb` or `npx @tastehub/ckb` |
| `ckb setup` | Auto-configure Claude Code integration |
| `ckb index` | Auto-detect language and run SCIP indexer |
| Analysis tiers | Works without SCIP (basic), better with it (enhanced) |
| Feature | Description |
|---|---|
| Doc-Symbol Linking | Bridge documentation and code with automatic symbol detection |
| `indexDocs` | Scan and index documentation |
| `getDocsForSymbol` | Find docs referencing a symbol |
| `getSymbolsInDoc` | List symbols in a document |
| `getDocsForModule` | Find docs linked to a module |
| `checkDocStaleness` | Check for stale references |
| `getDocCoverage` | Documentation coverage stats |
| Tool | Purpose |
|---|---|
| `listRepos` | List registered repos with state and active status |
| `switchRepo` | Switch active repo context for MCP session |
| `getActiveRepo` | Get information about currently active repo |
| Feature | Description |
|---|---|
| Repo Registry | Global ~/.ckb/repos.json for named repo shortcuts |
| `ckb repo add` | Register a repository by name |
| `ckb repo list` | List repos grouped by state |
| `ckb mcp --repo` | Start MCP with specific repo active |
| Multi-Engine | Up to 5 engines in memory with LRU eviction |
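The Multi-Engine row describes a bounded pool with least-recently-used eviction. A minimal sketch of that policy, assuming a hypothetical `loader` callback that builds an engine for a repo name (the class and its API are illustrative, not CKB's):

```python
from collections import OrderedDict


class EngineCache:
    """Keep at most `capacity` engines; evict the least recently used one."""

    def __init__(self, loader, capacity=5):
        self._loader = loader
        self._capacity = capacity
        self._engines = OrderedDict()  # oldest access first, newest last

    def get(self, repo):
        if repo in self._engines:
            self._engines.move_to_end(repo)  # mark as most recently used
            return self._engines[repo]
        engine = self._loader(repo)
        self._engines[repo] = engine
        if len(self._engines) > self._capacity:
            self._engines.popitem(last=False)  # drop the least recently used
        return engine

    def loaded(self):
        return list(self._engines)


cache = EngineCache(loader=lambda repo: f"engine-for-{repo}")
for i in range(5):
    cache.get(f"repo-{i}")
cache.get("repo-0")  # touch repo-0 so it is no longer the oldest
cache.get("repo-5")  # sixth engine: evicts repo-1
print(cache.loaded())  # ['repo-2', 'repo-3', 'repo-4', 'repo-0', 'repo-5']
```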
Incremental Indexing v4 (Production-grade):
| Feature | Description |
|---|---|
| Delta Artifacts | CI-generated diffs for O(delta) ingestion instead of O(N) |
| FTS5 Search | SQLite FTS5 for instant search (replaces LIKE scans) |
| Compaction Scheduler | Automatic snapshot cleanup and database maintenance |
| Prometheus Metrics | `/metrics` endpoint for monitoring |
| Load Shedding | Graceful degradation under load with priority endpoints |
Language Quality (v7.3):
| Feature | Description |
|---|---|
| Language Tiers | 4-tier classification based on indexer maturity |
| Quality Assessment | Per-language metrics (ref accuracy, callgraph quality) |
| `/meta/languages` | Language quality dashboard endpoint |
| `/meta/python-env` | Python venv detection with recommendations |
| `/meta/typescript-monorepo` | TypeScript monorepo detection (pnpm, lerna, nx) |
CKB provides three ways to interact:
| Interface | Best For |
|---|---|
| CLI | Quick queries, scripting, CI/CD |
| HTTP API | Web integrations, custom tools |
| MCP Server | Claude Desktop, AI assistants |
Free for personal use. Commercial/enterprise use requires a license. See LICENSE for details.