Practical Limits

CKB Practical Limits & Accuracy

Honest guidance on what CKB does well, where it struggles, and how to validate results.

TL;DR: CKB is excellent for static code navigation and comprehension. It's not magic—it has blind spots around dynamic behavior, and you should always verify critical decisions by opening the actual code.

Best-Fit Tasks

CKB excels at these tasks—high accuracy, high value:

Symbol Navigation

Task	Why CKB is great
Find function/type definitions	Direct SCIP index lookup, near-perfect accuracy
List all references to a symbol	Semantic analysis, not text search
Show call graph (callers/callees)	Static analysis with depth control
Explain symbol signature and location	Structured, consistent output

Codebase Orientation

Task	Why CKB is great
Architecture overview	Module detection + dependency analysis
File/path role classification	Pattern matching + import analysis
Key concepts extraction	Symbol clustering + naming analysis
Entrypoint discovery	Framework detection + naming conventions

Change Safety

Task	Why CKB is great
Impact analysis (blast radius)	Reference counting + module spread
Hotspot detection	Git history + churn metrics
Risk scoring	Multi-factor analysis (visibility, callers, volatility)
Diff summarization by risk	Combines file changes with hotspot data

Ownership & Decisions

Task	Why CKB is great
CODEOWNERS lookup	Direct file parsing, 100% accurate
Git blame ownership	Weighted by recency, filters bots
ADR search and creation	Full-text search, structured storage
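
The "weighted by recency, filters bots" idea can be pictured with a small sketch. This is not CKB's actual algorithm: the exponential half-life weighting, the `[bot]` suffix check, and every name below (`blameLine`, `ownershipShares`, `halfLifeDays`) are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// blameLine is a hypothetical record: who last touched a line and how long ago.
type blameLine struct {
	Author  string
	AgeDays float64
}

// ownershipShares weights each line by 2^(-age/halfLife), skips authors whose
// name ends in "[bot]", and normalizes the result so the shares sum to 1.
func ownershipShares(lines []blameLine, halfLifeDays float64) map[string]float64 {
	weights := make(map[string]float64)
	var total float64
	for _, l := range lines {
		if strings.HasSuffix(l.Author, "[bot]") {
			continue // assumed bot naming convention
		}
		w := math.Exp2(-l.AgeDays / halfLifeDays)
		weights[l.Author] += w
		total += w
	}
	if total == 0 {
		return weights
	}
	for author := range weights {
		weights[author] /= total
	}
	return weights
}

func main() {
	lines := []blameLine{
		{Author: "alice", AgeDays: 0},
		{Author: "bob", AgeDays: 90},
		{Author: "dependabot[bot]", AgeDays: 1},
	}
	fmt.Println(ownershipShares(lines, 90))
}
```

With a 90-day half-life, a line touched today counts twice as much as one touched 90 days ago, which is why old history is a weaker ownership signal.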

Low-Impact Tasks

CKB adds less value here—consider alternatives:

Task	Why CKB is limited	Better alternative
Single-file questions	CKB overhead not worth it	Just read the file
Trivial searches	A filename lookup such as "Where is main.go?" needs no semantic index	Use `find` or your IDE
Runtime behavior questions	CKB is static analysis only	Debugger, logging, profiler
Code generation	Not CKB's job	Your AI assistant directly
Style/lint checking	Not CKB's job	ESLint, golangci-lint, etc.
Test generation	Not CKB's job	Your AI assistant directly

When CKB overhead isn't worth it

Small codebases (<10 files): Just read the code
Single-file changes: Open the file directly
Questions your IDE answers: "Go to definition" is faster
Runtime questions: "Why is this slow?" needs profiling, not static analysis

Known Blind Spots

CKB has limitations. Understanding them helps you use it effectively.

Dynamic Dispatch / Reflection

The problem: CKB uses static analysis. It can't see runtime behavior.

// CKB sees: handler is type http.Handler (interface)
// CKB doesn't see: which concrete type implements it at runtime
var handler http.Handler = getHandler() // dynamic
handler.ServeHTTP(w, r)

What CKB misses:

Interface implementations resolved at runtime
Reflection-based calls (reflect.Value.Call, reflect.Value.MethodByName)
Plugin systems loading code dynamically
Dependency injection containers

Impact: Call graphs may be incomplete. "Who calls X?" might miss dynamic callers.

Mitigation:

CKB reports confidence levels—low confidence often means dynamic dispatch
Use "find references" + manual review for interface methods
Check concrete implementations separately
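
To make the reflection case concrete, here is a minimal Go sketch of a call that a static call graph cannot connect. `Greeter`, `Hello`, and `callByName` are hypothetical names used only for this example. The method is selected by a string at runtime, so no static edge leads to `Greeter.Hello`.

```go
package main

import (
	"fmt"
	"reflect"
)

// Greeter is a hypothetical type used only for this illustration.
type Greeter struct{}

// Hello has no direct call site anywhere in this program.
func (Greeter) Hello() string { return "hello" }

// callByName invokes a zero-argument method that returns one string,
// chosen by name at runtime. Static analysis sees only the reflect calls.
func callByName(target any, method string) (string, bool) {
	m := reflect.ValueOf(target).MethodByName(method)
	if !m.IsValid() || m.Type().NumIn() != 0 || m.Type().NumOut() != 1 {
		return "", false
	}
	out := m.Call(nil)
	if out[0].Kind() != reflect.String {
		return "", false
	}
	return out[0].String(), true
}

func main() {
	got, ok := callByName(Greeter{}, "Hello")
	fmt.Println(got, ok)
}
```

A "who calls Hello?" query on code like this would plausibly report no callers, which is why a text search for the method name is a useful cross-check.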

Generated Code

The problem: Code generated at build time may not be indexed.

What CKB misses:

Protocol buffer generated files (.pb.go, _pb2.py)
GraphQL generated types
ORM-generated models
Build-time codegen (go generate, etc.)

Impact: Symbols in generated code may not appear in search or references.

Mitigation:

Regenerate code before indexing: go generate ./... && scip-go
Check if generated files are in .gitignore (they might be excluded)
Some indexers skip generated code by design
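
One way to see which files are generated before indexing is to look for the marker line that Go tooling conventionally places at the top of generated files. The sketch below is an illustration, not part of CKB; `isGenerated` is a hypothetical helper, and stopping at the package clause is a simplification of the convention.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// generatedHeader matches the conventional Go marker for generated files.
var generatedHeader = regexp.MustCompile(`^// Code generated .* DO NOT EDIT\.$`)

// isGenerated reports whether src carries the marker before its package clause.
func isGenerated(src string) bool {
	sc := bufio.NewScanner(strings.NewReader(src))
	for sc.Scan() {
		line := sc.Text()
		if generatedHeader.MatchString(line) {
			return true
		}
		if strings.HasPrefix(line, "package ") {
			return false // marker must appear before the code starts
		}
	}
	return false
}

func main() {
	pb := "// Code generated by protoc-gen-go. DO NOT EDIT.\n\npackage pb\n"
	hand := "package main\n\nfunc main() {}\n"
	fmt.Println(isGenerated(pb), isGenerated(hand))
}
```

If files flagged this way are missing from search results, the likely causes are the ones listed above: they were not regenerated before indexing, or they were excluded.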

Cross-Repository References

The problem: CKB indexes one repository at a time.

What CKB misses:

Calls from external packages that depend on your code
Symbols defined in vendored dependencies (partial support)
Monorepo cross-package references (depends on indexer setup)

Impact: "Who calls X?" only shows callers within your repo.

Mitigation:

For libraries: assume external callers exist for any exported symbol
Impact analysis warns about public API changes for this reason

Conditional Compilation

The problem: Build tags and platform-specific files mean the indexer only sees the code that compiles for the platform it runs on.

//go:build linux

func platformSpecific() { ... }  // May not be indexed on macOS

What CKB misses:

Code behind build tags for other platforms
#ifdef blocks in C/C++ (depends on indexer configuration)
Feature flags that exclude code paths

Impact: Some symbols may be missing or have incomplete references.

Mitigation:

Index on the primary build platform
For cross-platform code, index multiple times and merge (advanced)
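
The "index multiple times and merge" idea amounts to a set union over per-platform results. The sketch below assumes you have already extracted a list of symbol names for each platform. How you obtain those lists depends on your indexer and is not shown. `mergeSymbols` is a hypothetical helper, and each input list is assumed to contain no duplicates.

```go
package main

import (
	"fmt"
	"sort"
)

// mergeSymbols unions per-platform symbol lists and records, for each symbol,
// the sorted list of platforms whose index contained it.
func mergeSymbols(perPlatform map[string][]string) map[string][]string {
	merged := make(map[string][]string)
	for platform, symbols := range perPlatform {
		for _, s := range symbols {
			merged[s] = append(merged[s], platform)
		}
	}
	for s := range merged {
		sort.Strings(merged[s])
	}
	return merged
}

func main() {
	merged := mergeSymbols(map[string][]string{
		"linux":  {"openFile", "platformSpecific"},
		"darwin": {"openFile"},
	})
	fmt.Println(merged["openFile"], merged["platformSpecific"])
}
```

Symbols that appear for only one platform are the ones most likely to look "missing" when you index on a different machine.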

Test-Only Code

The problem: Test files have different visibility rules.

What you might misinterpret:

A symbol with "no callers" might have test-only callers
Test helpers may appear unused (they're not)
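In Go, tests in the same package (_test.go files) can reach unexported symbols, while external test packages (package name with a _test suffix) see only the exported API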

Mitigation:

Always use includeTests: true when checking for dead code
CKB's justifySymbol separates test vs production references
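
Separating test references from production references can be approximated by file name. The sketch below is not how justifySymbol is implemented. It only shows the Go `_test.go` convention applied to a list of reference paths, and `refSplit` and `splitRefs` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// refSplit groups reference locations by whether they live in Go test files.
type refSplit struct {
	Production []string
	Test       []string
}

// splitRefs classifies each path by the Go convention that test files end in _test.go.
func splitRefs(paths []string) refSplit {
	var out refSplit
	for _, p := range paths {
		if strings.HasSuffix(p, "_test.go") {
			out.Test = append(out.Test, p)
		} else {
			out.Production = append(out.Production, p)
		}
	}
	return out
}

func main() {
	refs := splitRefs([]string{"internal/api/handler.go", "internal/api/handler_test.go"})
	fmt.Println(len(refs.Production), len(refs.Test))
}
```

A symbol whose references all land in the Test group is used only by tests, which is different from being unused.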

Very Recent Changes

The problem: The SCIP index reflects the code as of the last indexing run and lags behind later changes.

What CKB misses:

Symbols added/changed since last index generation
Renamed symbols (old name still in index)
Deleted code (may still appear in stale index)

Impact: Results may be outdated.

Mitigation:

Run ckb doctor to check index freshness
Regenerate index after significant changes: scip-go --repository-root=.
CKB warns when index is stale (commits behind HEAD)
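
"Commits behind HEAD" is a simple count once you know which commit the index was built from. The sketch below assumes a commit list ordered newest first, as printed by `git rev-list HEAD`. `commitsBehind` is a hypothetical helper, not a CKB function.

```go
package main

import "fmt"

// commitsBehind returns how many commits in history are newer than the indexed
// commit. history must be ordered newest first. The boolean is false when the
// indexed commit is not in history at all (for example after a rebase).
func commitsBehind(history []string, indexed string) (int, bool) {
	for i, hash := range history {
		if hash == indexed {
			return i, true
		}
	}
	return 0, false
}

func main() {
	history := []string{"c3", "c2", "c1"} // placeholder hashes, newest first
	behind, ok := commitsBehind(history, "c1")
	fmt.Println(behind, ok)
}
```

A result of zero means the index matches HEAD. A missing commit is a stronger signal to regenerate than any particular count.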

Macro-Heavy Languages

The problem: Macros expand at compile time, not index time.

Languages affected:

Rust (procedural macros, derive macros)
C/C++ (preprocessor macros)
Lisp family (reader macros)

What CKB misses:

Symbols generated by macro expansion
Call relationships through macros

Mitigation:

Depends on language-specific indexer capabilities
Some indexers expand macros before indexing (check your indexer docs)

Confidence Levels

CKB reports confidence scores (0.0–1.0). Here's what they mean:

Confidence	What it means	Trust level
0.90–1.0	Full static analysis, high certainty	Trust it
0.80–0.89	Good static analysis, minor gaps	Trust with quick verify
0.70–0.79	Partial analysis or heuristics involved	Verify important decisions
0.50–0.69	Significant uncertainty, heuristics only	Use as starting point only
< 0.50	Low confidence, may be wrong	Always verify manually
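
If you consume confidence scores in a script, the table above maps to a simple threshold check. This is a sketch of that mapping only. `trustLevel` is a hypothetical helper, and the returned strings are copied from the table.

```go
package main

import "fmt"

// trustLevel maps a confidence score in [0.0, 1.0] to the trust level
// from the table, checking the highest band first.
func trustLevel(confidence float64) string {
	switch {
	case confidence >= 0.90:
		return "Trust it"
	case confidence >= 0.80:
		return "Trust with quick verify"
	case confidence >= 0.70:
		return "Verify important decisions"
	case confidence >= 0.50:
		return "Use as starting point only"
	default:
		return "Always verify manually"
	}
}

func main() {
	for _, c := range []float64{0.95, 0.79, 0.45} {
		fmt.Printf("%.2f: %s\n", c, trustLevel(c))
	}
}
```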

Confidence by source

Source	Typical confidence	Notes
SCIP index	0.90–1.0	Best source, regenerate if stale
LSP	0.85–0.95	Good, but may time out on large queries
Git blame	0.79	Time-weighted, may miss recent contributors
Heuristics	0.50–0.70	Pattern matching, naming conventions
Inference	0.40–0.60	Educated guesses

How to Validate

Golden rule: For any decision that matters, open the actual code and verify.

Validation checklist

CKB says...	You should...
"X has no callers"	Search for string "X" in case of dynamic calls
"Safe to remove"	Check for reflection, string-based lookups, external consumers
"High risk change"	Review the listed callers—are they actually affected?
"Y owns this code"	Confirm with the team (ownership can be outdated)
"Z is a hotspot"	Check if churn is meaningful (refactoring vs real changes)

Quick validation prompts

After getting CKB results, follow up with:

Show me the actual code for [symbol]

Open [file path] so I can verify

Search for the string "FunctionName" in case of dynamic calls
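
The string search in the last prompt can also be scripted. The sketch below scans in-memory file contents for a symbol name that appears as a double-quoted string literal, which is how reflection and registry lookups usually refer to it. `findStringMentions` is a hypothetical helper. It matches only the exact quoted name, so concatenated or computed names will still slip through.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// findStringMentions returns the sorted paths of files whose contents include
// name as an exact double-quoted string literal.
func findStringMentions(files map[string]string, name string) []string {
	needle := `"` + name + `"`
	var hits []string
	for path, src := range files {
		if strings.Contains(src, needle) {
			hits = append(hits, path)
		}
	}
	sort.Strings(hits)
	return hits
}

func main() {
	files := map[string]string{
		"dispatch.go": `m := v.MethodByName("Hello")`,
		"direct.go":   `g.Hello()`,
	}
	fmt.Println(findStringMentions(files, "Hello"))
}
```

Any hit here is a candidate dynamic caller that a "no callers" result would not have shown.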

When to always verify

Before deleting code — CKB might miss dynamic callers
Before public API changes — External consumers aren't indexed
When confidence is below 0.8 — CKB is uncertain
For security-sensitive code — Don't trust any tool blindly
When results seem wrong — Trust your instincts, investigate

When to Trust Results

High trust (act on it)

CODEOWNERS lookup (direct file parsing)
Symbol location and signature (SCIP index)
Git blame with recent commits (factual data)
Module detection for standard layouts (well-tested heuristics)

Medium trust (verify for important decisions)

Call graphs (may miss dynamic dispatch)
Impact analysis (depends on complete reference data)
Hotspot trends (git history is factual, interpretation is heuristic)
Dead code detection (may miss reflection/dynamic calls)

Low trust (use as starting point)

Inferred responsibilities (heuristic text analysis)
Role classification (pattern matching)
Key concepts (clustering + naming analysis)
Ownership from old git history (people leave, roles change)

Improving Accuracy

Keep your index fresh

# Check freshness
ckb doctor

# Regenerate if stale (Go)
scip-go --repository-root=.

# Regenerate if stale (TypeScript)
scip-typescript index

Include generated code

# Generate code first, then index
go generate ./...
scip-go --repository-root=.

Use scoped queries

A narrower scope gives faster and more accurate results:

❌ "Find all handlers"
✅ "Find handlers in internal/api"

Report issues

If CKB consistently misses something, it might be a bug or missing feature. Check GitHub Issues.

Summary

Category	Accuracy	Notes
Symbol lookup	⭐⭐⭐⭐⭐	Excellent—direct index lookup
Reference finding	⭐⭐⭐⭐	Very good—may miss dynamic calls
Call graphs	⭐⭐⭐⭐	Very good—static analysis limits
Impact analysis	⭐⭐⭐⭐	Very good—based on reference quality
Ownership (CODEOWNERS)	⭐⭐⭐⭐⭐	Excellent—direct file parsing
Ownership (git blame)	⭐⭐⭐⭐	Very good—weighted, filters bots
Hotspot detection	⭐⭐⭐⭐	Very good—factual git data
Architecture detection	⭐⭐⭐	Good—heuristics for edge cases
Role classification	⭐⭐⭐	Good—pattern matching
Responsibility inference	⭐⭐	Fair—use as starting point

Remember: CKB is a navigation aid, not an oracle. It dramatically speeds up code comprehension, but you're still the one making decisions. Verify what matters.
