codeskeleton

Reveal the skeleton of your codebase: turn any folder of code into a queryable knowledge graph. A single binary with no runtime dependencies, parallel extraction, and incremental re-runs.

codeskeleton .

codeskeleton-out/
├── graph.html         interactive graph — click nodes, search, filter by community
├── GRAPH_REPORT.md    god nodes, surprising connections, suggested questions
├── graph.json         persistent graph — query later without re-reading
└── cache/             SHA256 cache — re-runs only process changed files

Install

Requires: Rust 1.70+

cargo install codeskeleton

Or build from source:

git clone https://github.com/DhanushNehru/codeskeleton.git
cd codeskeleton
cargo build --release
./target/release/codeskeleton .

Usage

codeskeleton .              # analyze current directory
codeskeleton ./src          # analyze a specific folder
codeskeleton . --no-cache   # force full re-extraction

Add a .cographignore file to exclude folders:

# .cographignore
vendor/
node_modules/
dist/
*.generated.py

Same syntax as .gitignore. Patterns match against file paths relative to the analyzed folder.
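
To make the matching rule concrete, here is a minimal sketch of how such patterns could be applied to relative paths. It is not codeskeleton's matcher: `glob_match` and `is_ignored` are hypothetical helpers, and only comments, trailing-slash directory patterns, and `*` within a single path component are handled (no `!` negation, no `**`, no anchored patterns).

```rust
/// Match one path component against a pattern.
/// `*` matches any run of characters (possibly empty).
fn glob_match(pattern: &[u8], text: &[u8]) -> bool {
    match pattern.split_first() {
        None => text.is_empty(),
        // Try every possible length for the run that `*` stands for.
        Some((&b'*', rest)) => (0..=text.len()).any(|skip| glob_match(rest, &text[skip..])),
        Some((&ch, rest)) => match text.split_first() {
            Some((&first, tail)) => ch == first && glob_match(rest, tail),
            None => false,
        },
    }
}

/// Decide whether a file path (relative to the analyzed folder, `/` separated)
/// is excluded by any of the given ignore-file lines.
fn is_ignored(rel_path: &str, patterns: &[&str]) -> bool {
    let components: Vec<&str> = rel_path.split('/').collect();
    // Everything except the last component is a directory name.
    let dirs = &components[..components.len() - 1];
    patterns.iter().any(|raw| {
        let pattern = raw.trim();
        if pattern.is_empty() || pattern.starts_with('#') {
            return false; // blank line or comment
        }
        match pattern.strip_suffix('/') {
            // "vendor/" matches a directory with that name at any depth.
            Some(dir) => dirs.iter().any(|c| glob_match(dir.as_bytes(), c.as_bytes())),
            // "*.generated.py" matches any path component at any depth.
            None => components.iter().any(|c| glob_match(pattern.as_bytes(), c.as_bytes())),
        }
    })
}
```

With the example file above, `is_ignored("src/models.generated.py", &patterns)` would be true, while `src/models.py` would still be analyzed.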

What You Get

God nodes — highest-degree concepts (what everything connects through)

Surprising connections — cross-community edges ranked by structural distance, with plain-English explanations

Communities — automatically detected clusters of related code with cohesion scores

Suggested questions — 4-5 questions the graph is uniquely positioned to answer

Interactive visualization — dark-themed vis.js graph with search, click-to-inspect, community coloring

Incremental builds — SHA256 file caching means re-runs only process changed files
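
As an illustration of the "highest-degree" idea behind god nodes, the sketch below ranks nodes by how many edges touch them. The function name `god_nodes` and the plain edge-list input are assumptions for this example; the real analysis works on the petgraph graph and may weight or filter edges differently.

```rust
use std::collections::HashMap;

/// Rank nodes by degree (number of incident edges), highest first.
/// Ties are broken alphabetically so the output is deterministic.
fn god_nodes<'a>(edges: &[(&'a str, &'a str)], top_n: usize) -> Vec<(&'a str, usize)> {
    let mut degree: HashMap<&'a str, usize> = HashMap::new();
    for &(from, to) in edges {
        *degree.entry(from).or_insert(0) += 1;
        *degree.entry(to).or_insert(0) += 1;
    }
    let mut ranked: Vec<(&'a str, usize)> = degree.into_iter().collect();
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(b.0)));
    ranked.truncate(top_n);
    ranked
}
```

For the edges `app → db`, `api → db`, `worker → db`, and `app → api`, the top node is `db` with degree 3: it is the concept everything else connects through.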

Supported Languages

| Language | Extensions | Extraction |
| --- | --- | --- |
| Python | `.py` | Classes, functions, imports, calls via tree-sitter AST |
| JavaScript | `.js` `.jsx` | Classes, functions, imports, calls via tree-sitter AST |
| TypeScript | `.ts` `.tsx` | Classes, functions, imports, calls via tree-sitter AST |
| Rust | `.rs` | Structs, enums, traits, functions, use declarations via tree-sitter AST |
| Go | `.go` | Types, functions, methods, imports via tree-sitter AST |
| Java | `.java` | Classes, interfaces, methods, imports via tree-sitter AST |
| C | `.c` `.h` | Structs, functions, includes via tree-sitter AST |

How It Works

codeskeleton runs a deterministic AST pass using tree-sitter. No LLM needed — pure structural extraction:

1. Detect: walks the directory tree, respecting .gitignore and .cographignore
2. Cache: computes a SHA256 hash of each file and skips files unchanged since the previous run
3. Extract: tree-sitter parses the files in parallel (Rayon) and extracts classes/structs, functions/methods, imports, and call sites
4. Build: assembles all extractions into a petgraph knowledge graph
5. Cluster: label propagation community detection groups related nodes
6. Analyze: identifies god nodes (highest degree) and surprising cross-community connections, and generates suggested questions
7. Export: writes graph.json, graph.html (vis.js), and GRAPH_REPORT.md
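
The Cluster step can be sketched as follows. This is a simplified, synchronous variant of label propagation with a deterministic tie-break (smallest label wins). It assumes integer node ids and an undirected edge list, and is not the code in cluster.rs, which may use a different update order or tie-breaking.

```rust
use std::cmp::Reverse;
use std::collections::HashMap;

/// Label propagation on an undirected graph with nodes `0..n`.
/// Returns one community label per node.
fn label_propagation(n: usize, edges: &[(usize, usize)], max_iters: usize) -> Vec<usize> {
    let mut neighbors: Vec<Vec<usize>> = vec![Vec::new(); n];
    for &(a, b) in edges {
        neighbors[a].push(b);
        neighbors[b].push(a);
    }
    // Every node starts in its own community.
    let mut labels: Vec<usize> = (0..n).collect();
    for _ in 0..max_iters {
        // Synchronous update: all new labels are computed from the old ones.
        let mut next = labels.clone();
        for node in 0..n {
            let mut counts: HashMap<usize, usize> = HashMap::new();
            for &nb in &neighbors[node] {
                *counts.entry(labels[nb]).or_insert(0) += 1;
            }
            // Most frequent neighbor label; ties go to the smallest label.
            let best = counts
                .iter()
                .map(|(&label, &count)| (count, Reverse(label)))
                .max()
                .map(|(_, Reverse(label))| label);
            if let Some(label) = best {
                next[node] = label;
            }
        }
        if next == labels {
            break; // converged
        }
        labels = next;
    }
    labels
}
```

On two triangles joined by a single bridge edge, each triangle ends up with its own label, and the bridge is exactly the kind of cross-community edge the Analyze step would look at.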

Every relationship is tagged EXTRACTED (found directly in source) or INFERRED (call-graph second pass). You always know what was found vs guessed.
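
One way to picture this tagging is an edge type that carries its provenance. The sketch below is an assumption about shape, not a copy of types.rs: the `Confidence` name appears in the module table, but the variants, the `Edge` fields (including `relation`), and the `extracted_only` helper are hypothetical.

```rust
/// How a relationship was discovered. The variants mirror the two tags
/// described above; the real definition may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Confidence {
    Extracted, // found directly in source
    Inferred,  // added by the call-graph second pass
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct Edge {
    from: String,
    to: String,
    relation: String, // hypothetical field, e.g. "imports" or "calls"
    confidence: Confidence,
}

/// Keep only the edges that were read directly from source.
fn extracted_only(edges: &[Edge]) -> Vec<&Edge> {
    edges
        .iter()
        .filter(|e| e.confidence == Confidence::Extracted)
        .collect()
}
```

A consumer of graph.json that only trusts direct evidence could apply a filter like `extracted_only` before running its own queries.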

Architecture

detect → cache-check → extract (parallel) → build_graph → cluster → analyze → report → export

Each stage is a pure function in its own module. No shared mutable state, no side effects outside codeskeleton-out/.

| Module | Responsibility |
| --- | --- |
| `detect.rs` | Directory walk, file filtering |
| `cache.rs` | SHA256 file caching |
| `languages.rs` | Per-language tree-sitter configs |
| `extract.rs` | Generic AST extraction engine |
| `graph.rs` | petgraph construction |
| `cluster.rs` | Label propagation community detection |
| `analyze.rs` | God nodes, surprising connections |
| `report.rs` | GRAPH_REPORT.md generation |
| `export.rs` | JSON + HTML visualization |
| `types.rs` | Shared types (Node, Edge, Confidence) |

Performance

codeskeleton is written in Rust. Its speed comes from:

Parallel extraction — Rayon processes all files across all CPU cores
Zero-copy parsing — tree-sitter operates on raw bytes, no string allocation
Incremental builds — SHA256 caching means only changed files are re-extracted
Single binary — no Python, no Node.js, no runtime dependencies
Native speed — compiled to optimized machine code with LTO
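
The incremental-build idea can be sketched with an in-memory map from path to content fingerprint. Two loud assumptions: codeskeleton uses SHA256 and persists its cache under codeskeleton-out/cache/, whereas this dependency-free sketch swaps in the standard library's `DefaultHasher` and keeps everything in memory. `fingerprint` and `changed_files` are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in content fingerprint (NOT SHA256; used only to avoid a dependency).
fn fingerprint(content: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    content.hash(&mut hasher);
    hasher.finish()
}

/// Return the paths whose content differs from what the cache remembers,
/// and record their new fingerprints in `cache`.
fn changed_files<'a>(
    cache: &mut HashMap<String, u64>,
    files: &[(&'a str, &[u8])],
) -> Vec<&'a str> {
    let mut changed = Vec::new();
    for &(path, content) in files {
        let fp = fingerprint(content);
        if cache.get(path) != Some(&fp) {
            // New or modified file: remember it and schedule re-extraction.
            cache.insert(path.to_string(), fp);
            changed.push(path);
        }
    }
    changed
}
```

On a first run every file is reported as changed; on an identical second run the list is empty, so only the cache lookup cost is paid.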

Contributing

Adding a language:

1. Add the tree-sitter grammar crate to Cargo.toml
2. Add a variant to SupportedLanguage in languages.rs
3. Define the LanguageSpec with AST node types
4. Add the extension mapping in from_extension()
5. Add an import extractor in extract.rs
6. Add test fixtures
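
For orientation, the variant and extension-mapping steps might look roughly like the sketch below. The names `SupportedLanguage` and `from_extension` come from the steps above, but the variants, the signature, and the `language_for_path` helper are assumptions based on the Supported Languages table, so check languages.rs for the actual definitions.

```rust
use std::path::Path;

/// One variant per supported language (assumed shape).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SupportedLanguage {
    Python,
    JavaScript,
    TypeScript,
    Rust,
    Go,
    Java,
    C,
}

impl SupportedLanguage {
    /// Map a file extension (without the leading dot) to a language.
    /// A new language needs a new variant above and a new arm here.
    fn from_extension(ext: &str) -> Option<Self> {
        match ext {
            "py" => Some(Self::Python),
            "js" | "jsx" => Some(Self::JavaScript),
            "ts" | "tsx" => Some(Self::TypeScript),
            "rs" => Some(Self::Rust),
            "go" => Some(Self::Go),
            "java" => Some(Self::Java),
            "c" | "h" => Some(Self::C),
            _ => None,
        }
    }
}

/// Hypothetical helper: pick the language for a path, if it is supported.
fn language_for_path(path: &str) -> Option<SupportedLanguage> {
    Path::new(path)
        .extension()
        .and_then(|ext| ext.to_str())
        .and_then(SupportedLanguage::from_extension)
}
```

Returning `Option` keeps unsupported files out of the pipeline without treating them as errors.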

License

MIT
