CKB - Code Knowledge Backend

A language-agnostic codebase comprehension layer that orchestrates multiple code intelligence backends (SCIP, LSP, Git) and provides semantically compressed, LLM-optimized views with persistent architectural understanding.

CKB analyzes, indexes, and explains your code but never modifies it. It won't refactor, lint, format, auto-fix, or enforce coding standards. Think of it as a librarian who knows everything about the books but never rewrites them.

What is CKB?

CKB (Code Knowledge Backend) is the missing link between your codebase and AI assistants. While AI coding tools like Claude, Cursor, and GitHub Copilot are powerful, they struggle with large codebases because they lack deep structural understanding of your code.

CKB solves this by providing:

A unified query layer that abstracts away the complexity of different code intelligence tools
Semantic compression that delivers exactly what an LLM needs without overwhelming its context window
Stable symbol tracking that survives refactoring, renames, and code moves
Architectural memory that maintains persistent knowledge about your codebase structure, ownership, and design decisions

The Problem CKB Solves

AI Assistants Are Blind to Code Structure

When you ask an AI assistant "what calls this function?", it typically:

Searches for text patterns (error-prone)
Reads random files hoping to find context (inefficient)
Gives up and asks you to provide more context (frustrating)

Existing Tools Don't Talk to Each Other

Your codebase has valuable intelligence scattered across:

SCIP indexes - Precise symbol information, but requires setup
Language servers - Real-time analysis, but slow for large queries
Git - History and blame, but no semantic understanding
CODEOWNERS - Ownership rules, but no integration with code intelligence

Each tool speaks a different language. None of them are optimized for AI consumption.

Context Windows Are Limited

Even with 100K+ token context windows, you can't just dump your entire codebase into an LLM. You need:

Relevant information only
Properly compressed responses
Smart truncation with follow-up suggestions

How CKB Helps

For AI-Assisted Development

You: "What's the impact of changing the UserService.authenticate() method?"

CKB provides:
├── Symbol details (signature, visibility, location)
├── 12 direct callers across 4 modules
├── Risk score: HIGH (public API, many dependents)
├── Affected modules: auth, api, admin, tests
├── Code owners: @security-team, @api-team
└── Suggested drilldowns for deeper analysis

For Code Understanding

You: "Show me the architecture of this codebase"

CKB provides:
├── Module dependency graph
├── Key symbols per module
├── Module responsibilities and ownership
├── Import/export relationships
└── Compressed to fit LLM context

For Refactoring Safety

You: "Is it safe to rename this function?"

CKB provides:
├── All references (not just text matches)
├── Cross-module dependencies
├── Test coverage of affected code
├── Hotspot risk assessment
└── Breaking change warnings

For Code Review

You: "Who should review changes to internal/api?"

CKB provides:
├── Primary owners from CODEOWNERS
├── Recent contributors from git blame
├── Related architectural decisions
└── Historical hotspot trends

Key Features

Multi-Backend Orchestration

Query SCIP, LSP, and Git through a single interface. CKB automatically:

Routes queries to the best available backend
Falls back gracefully when backends are unavailable
Merges results from multiple sources
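
The routing and fallback behavior can be sketched roughly as follows. This is a minimal illustration, not CKB's implementation: the `query_with_fallback` helper, the stand-in backend functions, and the use of `ConnectionError` to signal an unavailable backend are all assumptions. It covers only priority-ordered fallback and leaves result merging out.

```python
def query_with_fallback(symbol, backends):
    """Try (name, backend) pairs in priority order; return the first answer."""
    for name, backend in backends:
        try:
            result = backend(symbol)
        except ConnectionError:
            continue  # backend unavailable: fall through to the next one
        if result is not None:
            return name, result
    return "none", []

# Stand-in backends for illustration only (hypothetical, not CKB APIs).
def scip(symbol):
    raise ConnectionError("no SCIP index found")

def lsp(symbol):
    return [f"{symbol} @ internal/api/handler.go:42"]

def git_grep(symbol):
    return [f"{symbol} (text match)"]

source, refs = query_with_fallback(
    "Authenticate", [("scip", scip), ("lsp", lsp), ("git", git_grep)]
)
print(source, refs)
```

Here the SCIP stand-in fails, so the query is answered by the next backend in the list and the caller learns which source produced the result.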

Stable Symbol Identity

Symbols get permanent IDs that survive:

Renames (oldName → newName)
Moves (pkg/old/ → pkg/new/)
Refactoring (extract method, inline, etc.)

Old references automatically redirect to current locations.
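
One way to picture the redirect behavior is a registry that keeps a stable ID separate from the symbol's current location, plus an alias table for retired IDs. The sketch below is an assumption about the general technique, not CKB's data model; `SymbolRegistry`, the `sym:` ID format, and the location strings are hypothetical.

```python
class SymbolRegistry:
    def __init__(self):
        self.locations = {}  # stable id -> current location
        self.aliases = {}    # retired id -> replacement id

    def register(self, stable_id, location):
        self.locations[stable_id] = location

    def move(self, stable_id, new_location):
        # A rename or move changes the location, never the stable id.
        self.locations[stable_id] = new_location

    def alias(self, old_id, new_id):
        self.aliases[old_id] = new_id

    def resolve(self, symbol_id):
        """Follow alias links until a live id is reached."""
        seen = set()
        while symbol_id in self.aliases:
            if symbol_id in seen:
                raise ValueError("alias cycle")
            seen.add(symbol_id)
            symbol_id = self.aliases[symbol_id]
        return self.locations[symbol_id]

registry = SymbolRegistry()
registry.register("sym:42", "pkg/old/auth.go#oldName")
registry.move("sym:42", "pkg/new/auth.go#newName")
registry.alias("sym:legacy-7", "sym:42")
print(registry.resolve("sym:legacy-7"))  # pkg/new/auth.go#newName
```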

Smart Compression

Responses are optimized for LLM consumption:

Configurable token budgets
Intelligent truncation (most relevant first)
Drilldown suggestions for deeper exploration
Deterministic output for reliable caching
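
A token budget with relevance-first truncation might look like the sketch below. The `compress` function, the `score` field, and the whitespace-based token count are assumptions made for illustration; a real tokenizer and CKB's actual ranking will differ.

```python
def compress(items, budget, count_tokens=lambda text: len(text.split())):
    """Keep the highest-scoring items that fit within a token budget."""
    # Sort by score, then text, so equal inputs always give equal output.
    ranked = sorted(items, key=lambda item: (-item["score"], item["text"]))
    kept, used = [], 0
    for item in ranked:
        cost = count_tokens(item["text"])
        if used + cost <= budget:
            kept.append(item["text"])
            used += cost
    omitted = len(ranked) - len(kept)
    drilldowns = []
    if omitted:
        drilldowns.append(f"{omitted} result(s) omitted; narrow the query to see them")
    return {"kept": kept, "tokens": used, "drilldowns": drilldowns}

items = [
    {"text": "tests exercise UserService.authenticate in twelve places", "score": 0.2},
    {"text": "auth.Login calls UserService.authenticate", "score": 0.9},
    {"text": "admin.Reset calls UserService.authenticate", "score": 0.7},
]
result = compress(items, budget=8)
print(result["kept"], result["tokens"], result["drilldowns"])
```

Sorting on a full key rather than relying on input order is what makes the output deterministic, which in turn makes the response safe to cache.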

Impact Analysis

Understand the blast radius before making changes:

Visibility detection (public/private/internal)
Risk scoring based on usage patterns
Module-level impact summaries
Breaking change detection

Three-Tier Caching

Fast responses through intelligent caching:

Query cache - Recent query results
View cache - Expensive computations
Negative cache - Avoid repeated failures

All caches invalidate automatically when code changes.
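
The three tiers and the invalidation rule can be sketched as a small in-memory class. This is a simplified illustration under assumptions: `TieredCache`, the `repo_state` key (for example a HEAD commit hash), and the use of `LookupError` for failed lookups are hypothetical and not taken from CKB.

```python
class TieredCache:
    def __init__(self):
        self.repo_state = None
        self.query = {}     # recent query results
        self.view = {}      # expensive derived views
        self.negative = {}  # remembered failures

    def sync(self, repo_state):
        """Clear every tier when the repository state changes."""
        if repo_state != self.repo_state:
            self.repo_state = repo_state
            for tier in (self.query, self.view, self.negative):
                tier.clear()

    def get_or_compute(self, key, compute, tier="query"):
        if key in self.negative:
            raise LookupError(self.negative[key])  # fail fast, skip recomputation
        store = self.view if tier == "view" else self.query
        if key not in store:
            try:
                store[key] = compute()
            except LookupError as err:
                self.negative[key] = str(err)
                raise
        return store[key]

calls = []
def find_refs():
    calls.append("run")
    return ["auth.Login", "admin.Reset"]

cache = TieredCache()
cache.sync("commit-a")
cache.get_or_compute("refs:Authenticate", find_refs)
cache.get_or_compute("refs:Authenticate", find_refs)  # served from the query cache
cache.sync("commit-b")                                 # code changed: tiers cleared
cache.get_or_compute("refs:Authenticate", find_refs)
print(len(calls))  # 2
```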

Architectural Memory

Persistent knowledge that survives across sessions:

Module registry - Boundaries, responsibilities, and tags from MODULES.toml or inference
Ownership tracking - CODEOWNERS integration + git-blame analysis with time decay
Hotspot trends - Historical risk tracking with trend analysis and 30-day projections
Decision log - Architectural Decision Records (ADRs) with full-text search

Background Operations (v6.1)

Long-running operations run asynchronously:

Job queue - SQLite-backed job persistence in ~/.ckb/jobs.db
Async refresh - refreshArchitecture with async: true returns immediately with jobId
Progress tracking - Poll getJobStatus for progress and results
Job management - List, filter, and cancel running jobs

CI/CD Integration (v6.1)

Built for automated pipelines:

PR analysis - summarizePr assesses risk, suggests reviewers, identifies affected modules
Ownership drift - getOwnershipDrift compares CODEOWNERS vs actual contributors
GitHub Actions - Example workflows in examples/github-actions/

Federation (v6.2)

Cross-repository queries and unified visibility:

Multi-repo collections - Group related repositories into named federations
Cross-repo search - Search modules, ownership, hotspots, and decisions across repos
Stable identity - UUID-based repo identity survives renames
Staleness propagation - Federation freshness reflects weakest link
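
The "weakest link" rule amounts to taking the oldest index timestamp across member repositories. A minimal sketch, assuming each repository reports a last-indexed time; the `federation_freshness` helper, the repository names, and the dates are made up for illustration.

```python
from datetime import datetime, timezone

def federation_freshness(last_indexed):
    """Return (repo, timestamp) for the least recently indexed repository."""
    stalest = min(last_indexed, key=last_indexed.get)
    return stalest, last_indexed[stalest]

# Hypothetical federation members and index times.
last_indexed = {
    "api-gateway": datetime(2024, 5, 3, tzinfo=timezone.utc),
    "billing": datetime(2024, 4, 12, tzinfo=timezone.utc),
    "web-client": datetime(2024, 5, 1, tzinfo=timezone.utc),
}
repo, as_of = federation_freshness(last_indexed)
print(repo, as_of.date())  # billing 2024-04-12
```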

Daemon Mode (v6.2.1)

Always-on service for continuous code intelligence:

Background daemon - Long-running process with HTTP API on port 9120
Job queue - Async operations with progress tracking and cancellation
Scheduler - Cron and interval expressions for automated refresh
File watcher - Git change detection with debounced refresh
Webhooks - Outbound notifications to Slack, PagerDuty, Discord with retry logic

Tree-sitter Complexity (v6.2.2)

Language-agnostic complexity metrics via tree-sitter:

Multi-language support - Go, JavaScript, TypeScript, Python, Rust, Java, Kotlin
Cyclomatic complexity - Decision points analysis (if, for, while, switch, &&, ||)
Cognitive complexity - Nesting-weighted complexity for maintainability assessment
Hotspot integration - Complexity metrics feed into hotspot risk scores
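
To make the decision-point idea concrete, here is a small sketch that counts `if`, `for`, `while`, and boolean operators. Note the substitution: it uses Python's built-in `ast` module on Python source rather than tree-sitter, so it illustrates the metric only, not CKB's parser or its exact counting rules.

```python
import ast

def cyclomatic_complexity(source):
    """Count decision points in Python source, starting from a base of 1."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.If, ast.For, ast.While)):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` holds three values but adds two branches
            complexity += len(node.values) - 1
    return complexity

SAMPLE = '''
def classify(n, strict):
    if n < 0 and strict:
        return "invalid"
    for d in range(2, n):
        if n % d == 0:
            return "composite"
    return "prime"
'''
print(cyclomatic_complexity(SAMPLE))  # 5
```

The sample scores 5: one for the function itself, two for the `if` statements, one for the `for` loop, and one for the `and`.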

Contract-Aware Impact Analysis (v6.3)

Cross-repo intelligence through explicit API boundaries:

Contract detection - Automatic discovery of protobuf (.proto) and OpenAPI specs
Visibility classification - Public, internal, or unknown based on paths and metadata
Consumer detection - Three evidence tiers (declared, derived, heuristic)
Impact analysis - "What breaks if I change this shared API?"
Risk assessment - Low/medium/high risk with detailed factors
Transitive analysis - Follow proto import graphs across repos
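
Transitive analysis over an import graph is essentially a breadth-first walk along reversed import edges. The sketch below assumes the graph is already available as a mapping from each `.proto` file to the files it imports; the `transitive_consumers` helper and the file names are hypothetical.

```python
from collections import deque

def transitive_consumers(imports, changed):
    """Return every file that directly or indirectly imports `changed`."""
    # Reverse the edges: dependency -> files that import it.
    reverse = {}
    for importer, deps in imports.items():
        for dep in deps:
            reverse.setdefault(dep, set()).add(importer)
    affected, queue = set(), deque([changed])
    while queue:
        current = queue.popleft()
        for consumer in reverse.get(current, ()):
            if consumer not in affected:
                affected.add(consumer)
                queue.append(consumer)
    return affected

# Hypothetical import graph spanning several repos.
imports = {
    "billing/invoice.proto": ["common/money.proto"],
    "orders/order.proto": ["billing/invoice.proto"],
    "users/user.proto": [],
}
print(sorted(transitive_consumers(imports, "common/money.proto")))
# ['billing/invoice.proto', 'orders/order.proto']
```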

Runtime Telemetry (v6.4)

Observed reality through OpenTelemetry integration:

OTLP ingest - Accept metrics from OpenTelemetry Collector
Symbol matching - Map telemetry to code symbols (exact, strong, weak quality levels)
Coverage tracking - Know how much of your code is observed
Usage display - See actual call counts for any symbol
Dead code detection - Find symbols with zero runtime calls
Blended confidence - Combine static analysis with observed reality
Impact enrichment - Add observed callers to impact analysis

Developer Intelligence (v6.5)

Go beyond what code does to understand why it exists:

Symbol origin - Who wrote it, when, why, and what issues/PRs are linked
Evolution timeline - How has this code changed over time
Co-change coupling - Find files that historically change together
Proactive warnings - Detect temporary code, single-author risk, high coupling, staleness
LLM export - Token-efficient codebase summaries with importance ranking
Risk audit - 8-factor risk scoring (complexity, coverage, bus factor, security, staleness, errors, coupling, churn)
Quick wins - Find high-impact, low-effort refactoring targets
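
Co-change coupling can be approximated by counting how often two files appear in the same commit. The following sketch assumes commits are given as lists of changed file paths; the `co_change_pairs` helper, the `min_shared` threshold, and the normalization by the less frequently changed file are illustrative choices, not CKB's formula.

```python
from collections import Counter
from itertools import combinations

def co_change_pairs(commits, min_shared=2):
    """Score file pairs by the share of commits in which they change together."""
    file_counts, pair_counts = Counter(), Counter()
    for files in commits:
        unique = sorted(set(files))
        file_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    scores = {}
    for (a, b), shared in pair_counts.items():
        if shared >= min_shared:
            # Normalize by the file that changes less often.
            scores[(a, b)] = shared / min(file_counts[a], file_counts[b])
    return scores

# Hypothetical commit history.
commits = [
    ["api.go", "api_test.go"],
    ["api.go", "api_test.go", "README.md"],
    ["api.go", "db.go"],
    ["README.md"],
]
print(co_change_pairs(commits))  # {('api.go', 'api_test.go'): 1.0}
```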

Zero-Friction UX (v7.0)

Get started in seconds without building from source:

npm distribution - npm install -g @tastehub/ckb or npx @tastehub/ckb
58 MCP tools - Full code intelligence via Model Context Protocol

Zero-Friction Operation (v7.1)

Code intelligence without requiring a SCIP index upfront:

Tree-sitter fallback - Symbol extraction for 8 languages without SCIP
ckb index command - Auto-detects language and runs the right indexer
Universal MCP docs - Setup instructions for all major AI tools

Multi-Tool Setup & Smart Indexing (v7.2)

ckb setup - Interactive wizard for Claude Code, Cursor, Windsurf, VS Code, OpenCode, Claude Desktop
Extended languages - Added C/C++, Dart, Ruby, C#, PHP indexer support
Smart indexing - Skip-if-fresh, freshness tracking, concurrent lock protection
ckb mcp --watch - Auto-reindex mode with 30-second polling
Explicit tiers - Control analysis depth with --tier=fast|standard|full
ckb doctor --tier - Check tool requirements for each analysis tier

Doc-Symbol Linking (v7.3)

Bridge documentation and code with automatic symbol detection:

Backtick detection - Automatically detect Symbol.Name references in markdown
Directive support - Explicit directives embedded in documentation
Fence scanning - Extract symbols from fenced code blocks (8 languages via tree-sitter)
Staleness detection - Find broken references when symbols are renamed or deleted
Rename awareness - Suggest new names when documented symbols are renamed
CI enforcement - --fail-under flag for documentation coverage thresholds
known_symbols - Allow single-segment symbol detection via directive
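
Backtick detection can be sketched with two regular expressions: one to find inline code spans and one to decide whether a span looks like a qualified symbol. The patterns and the `find_symbol_refs` helper are assumptions for illustration; here `known_symbols` is passed as a plain argument rather than read from a directive.

```python
import re

SPAN = re.compile(r"`([^`\n]+)`")                   # inline code span
DOTTED = re.compile(r"[A-Za-z_]\w*(\.[A-Za-z_]\w*)+")  # e.g. Symbol.Name

def find_symbol_refs(markdown, known_symbols=()):
    """Return (line number, symbol) pairs for backticked symbol references."""
    refs = []
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        for span in SPAN.findall(line):
            name = span.removesuffix("()")
            # Single-segment names only count when explicitly allowed.
            if DOTTED.fullmatch(name) or name in known_symbols:
                refs.append((lineno, name))
    return refs

doc = "Call `UserService.Authenticate()` first.\nThen run `go build` and check `Engine`.\n"
print(find_symbol_refs(doc))
print(find_symbol_refs(doc, known_symbols={"Engine"}))
```

Requiring at least two segments by default keeps ordinary inline code such as shell commands from being mistaken for symbols.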

Remote Index Serving (v7.3)

Serve symbol indexes over HTTP for remote federation clients:

Index Server Mode - ckb serve --index-server enables remote index endpoints
Multi-Repo Support - Serve multiple repositories from a single CKB instance
REST API - 10 endpoints for repos, symbols, refs, callgraph, and search
HMAC-Signed Cursors - Secure pagination with tamper-proof cursors
Privacy Redaction - Per-repo controls for exposing paths, docs, and signatures
TOML Configuration - Configure repos, privacy settings, and pagination limits

Ownership Intelligence

Know who owns what code:

Parse CODEOWNERS files automatically
Compute ownership from git blame with time decay
Track ownership changes over time
Suggest reviewers for pull requests
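
Blame-based ownership with time decay can be illustrated by weighting each blamed line by its age. The sketch below assumes exponential decay with a 90-day half-life; the `ownership` helper, the half-life, and the input shape are hypothetical, and CKB's actual decay function may differ.

```python
from collections import defaultdict

def ownership(blame_lines, half_life_days=90.0):
    """blame_lines: iterable of (author, age_in_days), one entry per line."""
    weights = defaultdict(float)
    for author, age_days in blame_lines:
        # A line loses half its weight every `half_life_days`.
        weights[author] += 0.5 ** (age_days / half_life_days)
    total = sum(weights.values())
    return {author: weight / total for author, weight in weights.items()}

# bob wrote twice as many lines as alice, but 90 days earlier.
blame = [("alice", 0)] * 2 + [("bob", 90)] * 4
shares = ownership(blame)
print(shares)  # {'alice': 0.5, 'bob': 0.5}
```

With these inputs the two authors end up with equal shares, which shows how recent work can outweigh a larger but older contribution.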

Hotspot Detection

Identify volatile areas before they become problems:

Track churn metrics over time
Compute composite risk scores (churn + coupling + complexity)
Detect trends (increasing/stable/decreasing)
Project future hotspot scores
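
Trend detection and projection can be sketched with an ordinary least-squares line over past scores. This is one plausible approach, stated as an assumption: the `project_hotspot` helper, the tolerance used to call a trend "stable", and the sample history are illustrative and not CKB's scoring model.

```python
def project_hotspot(history, days_ahead=30, tolerance=0.001):
    """history: list of (day, score). Fit a line and extrapolate it."""
    n = len(history)
    mean_x = sum(day for day, _ in history) / n
    mean_y = sum(score for _, score in history) / n
    var_x = sum((day - mean_x) ** 2 for day, _ in history)
    cov = sum((day - mean_x) * (score - mean_y) for day, score in history)
    slope = cov / var_x if var_x else 0.0
    last_day = max(day for day, _ in history)
    projected = mean_y + slope * (last_day + days_ahead - mean_x)
    if slope > tolerance:
        trend = "increasing"
    elif slope < -tolerance:
        trend = "decreasing"
    else:
        trend = "stable"
    return trend, projected

# Hypothetical hotspot scores sampled every 10 days.
history = [(0, 0.40), (10, 0.45), (20, 0.50), (30, 0.55)]
trend, projected = project_hotspot(history)
print(trend, round(projected, 2))  # increasing 0.7
```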

Use Cases

Use Case	Without CKB	With CKB
Find all callers	Grep + manual filtering	Precise semantic results
Understand function	Read surrounding files	Structured summary with context
Safe refactoring	Hope for the best	Impact analysis + risk score
Code review	Check changed files only	See downstream effects + owners
Onboarding	Read docs + explore	Query architecture instantly
Find code owner	Search CODEOWNERS manually	Query ownership for any path
Track tech debt	Gut feeling	Hotspot trends with data

Who Should Use CKB?

Developers using AI assistants - Give your AI tools superpowers
Teams with large codebases - Navigate complexity efficiently
Anyone doing refactoring - Understand impact before changing
Code reviewers - See the full picture of changes
Tech leads - Track architectural health over time

Table of Contents

Quick Start - Step-by-step installation for Windows, macOS, and Linux
Prompt Cookbook - Real prompts for real problems (start here if you're new!)
Language Support - Which languages work best, SCIP indexers, and support tiers
Practical Limits - Accuracy notes, blind spots, and how to validate results
User Guide - Getting started, CLI commands, best practices
Incremental Indexing - Fast index updates for Go projects, accuracy guarantees (v7.3)
Doc-Symbol Linking - Automatic symbol detection in documentation, staleness checking (v7.3)
Authentication - API tokens, scopes, rate limiting for index server (v7.3)
Federation - Cross-repository queries, contract analysis, and unified visibility (v6.3)
Telemetry - Runtime observability, dead code detection, observed usage (v6.4)
CI/CD Integration - GitHub Actions workflows, PR analysis, automated refresh
API Reference - HTTP API documentation
Daemon Mode - Always-on service with scheduler, watcher, and webhooks (v6.2.1)
MCP Integration - Claude Desktop / AI assistant setup (74 tools available)
Architecture - System design and components
Configuration - All configuration options, MODULES.toml format, and ADR workflow
Performance - Latency targets and benchmarks
Contributing - Development guidelines

Installation

npm (Recommended)

# Install globally
npm install -g @tastehub/ckb

# Or run directly without installing
npx @tastehub/ckb --help

Build from Source

git clone https://github.com/SimplyLiz/CodeMCP.git
cd CodeMCP
go build -o ckb ./cmd/ckb

New to CKB? See the Quick Start guide for detailed instructions.

Quick Start

# Initialize in your project
cd /path/to/your/project
ckb init   # or: npx @tastehub/ckb init

# Generate SCIP index (auto-detects language)
ckb index

# Check status
ckb status

# Configure your AI tool (interactive wizard)
ckb setup

# Search for symbols
ckb search "myFunction"

# Find references
ckb refs "symbol-id"

# Analyze impact
ckb impact "symbol-id"

# Query ownership
ckb ownership internal/api/handler.go

# View architectural decisions
ckb decisions

# Start MCP server for AI assistants
ckb mcp

MCP Tools (74 Available)

CKB exposes code intelligence through the Model Context Protocol:

v5.1 — Core Navigation

Tool	Purpose
`searchSymbols`	Find symbols by name with filtering
`getSymbol`	Get symbol details
`findReferences`	Find all usages
`explainSymbol`	AI-friendly symbol explanation
`justifySymbol`	Keep/investigate/remove verdict
`getCallGraph`	Caller/callee relationships
`getModuleOverview`	Module statistics
`analyzeImpact`	Change risk analysis
`getStatus`	System health
`doctor`	Diagnostics

v5.2 — Discovery & Flow

Tool	Purpose
`traceUsage`	How is this symbol reached?
`listEntrypoints`	System entrypoints (API, CLI, jobs)
`explainFile`	File-level orientation
`explainPath`	Why does this path exist?
`summarizeDiff`	What changed, what might break?
`getArchitecture`	Module dependency overview
`getHotspots`	Volatile areas with trends
`listKeyConcepts`	Domain concepts in codebase
`recentlyRelevant`	What matters now?

v6.0 — Architectural Memory

Tool	Purpose
`getOwnership`	Who owns this code?
`getModuleResponsibilities`	What does this module do?
`recordDecision`	Create an ADR
`getDecisions`	Query architectural decisions
`annotateModule`	Add module metadata
`refreshArchitecture`	Rebuild architectural model

v6.1 — Production Ready

Tool	Purpose
`getJobStatus`	Query background job status
`listJobs`	List jobs with filters
`cancelJob`	Cancel queued/running job
`summarizePr`	PR risk analysis & reviewers
`getOwnershipDrift`	CODEOWNERS vs actual ownership

v6.2 — Federation

Tool	Purpose
`listFederations`	List all federations
`federationStatus`	Get federation status
`federationRepos`	List repos in federation
`federationSearchModules`	Cross-repo module search
`federationSearchOwnership`	Cross-repo ownership search
`federationGetHotspots`	Merged hotspots across repos
`federationSearchDecisions`	Cross-repo decision search
`federationSync`	Sync federation index

v6.2.1 — Daemon Mode

Tool	Purpose
`daemonStatus`	Daemon health and stats
`listSchedules`	List scheduled tasks
`runSchedule`	Run a schedule immediately
`listWebhooks`	List configured webhooks
`testWebhook`	Send test webhook
`webhookDeliveries`	Get delivery history

v6.2.2 — Tree-sitter Complexity

Tool	Purpose
`getFileComplexity`	Cyclomatic/cognitive complexity metrics

v6.3 — Contract-Aware Impact Analysis

Tool	Purpose
`listContracts`	List contracts in federation
`analyzeContractImpact`	Analyze impact of contract changes
`getContractDependencies`	Get contract deps for a repo
`suppressContractEdge`	Suppress false positive edge
`verifyContractEdge`	Verify an edge
`getContractStats`	Contract statistics

v6.4 — Runtime Telemetry

Tool	Purpose
`getTelemetryStatus`	Coverage metrics and sync status
`getObservedUsage`	Observed usage data for a symbol
`findDeadCodeCandidates`	Find symbols with zero runtime calls

v6.5 — Developer Intelligence

Tool	Purpose
`explainOrigin`	Why does this code exist? (origin, evolution, warnings)
`analyzeCoupling`	Find files/symbols that change together
`exportForLLM`	LLM-friendly codebase export with importance ranking
`auditRisk`	Multi-signal risk audit (8 weighted factors)

v7.0 — Zero-Friction UX

Feature	Description
npm distribution	`npm install -g @tastehub/ckb` or `npx @tastehub/ckb`
`ckb setup`	Auto-configure Claude Code integration
`ckb index`	Auto-detect language and run SCIP indexer
Analysis tiers	Works without SCIP (basic), better with it (enhanced)

v7.3 — Doc-Symbol Linking & Production Hardening

Feature	Description
Doc-Symbol Linking	Bridge documentation and code with automatic symbol detection
`indexDocs`	Scan and index documentation
`getDocsForSymbol`	Find docs referencing a symbol
`getSymbolsInDoc`	List symbols in a document
`getDocsForModule`	Find docs linked to a module
`checkDocStaleness`	Check for stale references
`getDocCoverage`	Documentation coverage stats

v7.3 — Multi-Repo Management

Tool	Purpose
`listRepos`	List registered repos with state and active status
`switchRepo`	Switch active repo context for MCP session
`getActiveRepo`	Get information about currently active repo

Feature	Description
Repo Registry	Global `~/.ckb/repos.json` for named repo shortcuts
`ckb repo add`	Register a repository by name
`ckb repo list`	List repos grouped by state
`ckb mcp --repo`	Start MCP with specific repo active
Multi-Engine	Up to 5 engines in memory with LRU eviction
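
The Multi-Engine row describes a bounded pool with least-recently-used eviction. A minimal sketch of that policy using `collections.OrderedDict` follows; the `EnginePool` class, the string stand-ins for engines, and the repo names are hypothetical.

```python
from collections import OrderedDict

class EnginePool:
    def __init__(self, capacity=5, loader=lambda repo: f"engine:{repo}"):
        self.capacity = capacity
        self.loader = loader
        self.engines = OrderedDict()  # least recently used entry first

    def get(self, repo):
        if repo in self.engines:
            self.engines.move_to_end(repo)  # mark as most recently used
        else:
            if len(self.engines) >= self.capacity:
                self.engines.popitem(last=False)  # evict least recently used
            self.engines[repo] = self.loader(repo)
        return self.engines[repo]

pool = EnginePool()
for name in ["r1", "r2", "r3", "r4", "r5"]:
    pool.get(name)
pool.get("r1")  # touch r1 so r2 becomes the oldest entry
pool.get("r6")  # pool is full: r2 is evicted
print(list(pool.engines))  # ['r3', 'r4', 'r5', 'r1', 'r6']
```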

Incremental Indexing v4 (Production-grade):

Feature	Description
Delta Artifacts	CI-generated diffs for O(delta) ingestion instead of O(N)
FTS5 Search	SQLite FTS5 for instant search (replaces LIKE scans)
Compaction Scheduler	Automatic snapshot cleanup and database maintenance
Prometheus Metrics	`/metrics` endpoint for monitoring
Load Shedding	Graceful degradation under load with priority endpoints

Language Quality (v7.3):

Feature	Description
Language Tiers	4-tier classification based on indexer maturity
Quality Assessment	Per-language metrics (ref accuracy, callgraph quality)
`/meta/languages`	Language quality dashboard endpoint
`/meta/python-env`	Python venv detection with recommendations
`/meta/typescript-monorepo`	TypeScript monorepo detection (pnpm, lerna, nx)

Interfaces

CKB provides three ways to interact:

Interface	Best For
CLI	Quick queries, scripting, CI/CD
HTTP API	Web integrations, custom tools
MCP Server	Claude Desktop, AI assistants

License

Free for personal use. Commercial/enterprise use requires a license. See LICENSE for details.
