The universal project brain that works with every AI coding tool.
One command scans your codebase and generates context files for Claude Code, Cursor, Codex, Windsurf, and more — auto-detected conventions, dependency health, architecture maps, and smart context routing. Stays fresh via git hooks.
Every AI coding tool needs project context to work well. But each tool has its own format:
- Claude Code wants CLAUDE.md
- Cursor wants .cursorrules
- Codex wants codex.md
- Windsurf wants .windsurfrules
Writing and maintaining these manually is tedious. codebase-md scans your project once and generates all of them from a single source of truth.
- Universal output — generates 6 formats from one scan (CLAUDE.md, .cursorrules, AGENTS.md, codex.md, .windsurfrules, PROJECT_CONTEXT.md)
- Auto-detected conventions — naming style, import patterns, file organization, design patterns (powered by tree-sitter AST)
- Dependency intelligence — health scores, version diffs, breaking change detection, migration plans with code impact
- Architecture mapping — detects monolith/monorepo/microservice/library/CLI patterns, entry points, modules
- Smart context routing — query-based context retrieval with TF-IDF relevance scoring
- Git integration — hooks for auto-regeneration on commit, contributor analysis, file hotspots
- Multi-language — Python, JavaScript, TypeScript (50+ file extensions recognized)
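To illustrate what TF-IDF relevance scoring means for the smart context routing feature listed above, here is a minimal sketch in plain Python. It is not codebase-md's ranker: the function names `tokenize` and `rank_chunks`, the smoothed IDF formula, and the sample chunk contents are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def rank_chunks(query: str, chunks: dict[str, str], top_k: int = 3) -> list[tuple[str, float]]:
    """Return up to top_k (chunk name, score) pairs, best match first.

    Hypothetical helper: chunks maps a chunk name to its text.
    """
    tokenized = {name: tokenize(body) for name, body in chunks.items()}
    n_docs = len(tokenized)

    # Document frequency: in how many chunks does each term appear?
    doc_freq = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))

    query_terms = set(tokenize(query))
    scores = []
    for name, tokens in tokenized.items():
        counts = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if counts[term] == 0:
                continue
            tf = counts[term] / len(tokens)
            # Smoothed IDF, so a term present in every chunk still contributes.
            idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0
            score += tf * idf
        scores.append((name, score))

    scores.sort(key=lambda pair: pair[1], reverse=True)
    # Drop chunks that share no term with the query.
    return [pair for pair in scores[:top_k] if pair[1] > 0]
```

Calling `rank_chunks("how to test", chunks)` with a dictionary of topic chunks returns only the chunks that share terms with the query, ordered by score. A real ranker would likely combine this with other signals.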
Alpha release: this is the first public release of codebase-md. Core functionality is working and tested, but APIs and output formats may change between minor versions. Please pin your version (pip install codebase-md==0.1.0) and report issues.
- Single-command scan — codebase scan . analyzes your entire project in seconds
- 5 output formats — CLAUDE.md, AGENTS.md, .cursorrules, codex.md, .windsurfrules (+ generic PROJECT_CONTEXT.md)
- Language detection — Python, TypeScript, JavaScript, Go, Rust and 50+ file extensions
- Dependency parsing — requirements.txt, pyproject.toml, package.json, go.mod, Cargo.toml, Gemfile
- Convention inference — naming style, import patterns, file organization, design patterns (via tree-sitter AST)
- Architecture detection — monolith, monorepo, microservice, library, CLI tool
- Git insights — commit history, contributor analysis, file change hotspots
- Dependency health — live registry queries (PyPI, npm) with health scoring and breaking change detection
- Smart context routing — TF-IDF relevance scoring for query-based context retrieval
- AST grammars — tree-sitter support is limited to Python, JavaScript, and TypeScript; Go and Rust are parsed via heuristics
- No incremental mode — every scan re-analyzes the full project (no watch/diff mode yet)
- Large monorepos — projects with >10,000 files may experience slower scan times
- Network dependency — DepShift registry queries (PyPI/npm health checks) require network access; use --offline to skip
- No Windows CI — tested on Linux and macOS; Windows should work but is not yet part of CI
The test suite (354 tests) validates against these project archetypes:
| Fixture | Type | Languages |
|---|---|---|
| Python CLI | CLI tool | Python |
| FastAPI App | Web API | Python |
| Next.js App | Full-stack | TypeScript, JavaScript |
| Go CLI | CLI tool | Go |
| Rust CLI | CLI tool | Rust |
| Mixed Language | Multi-lang | Python, JS, Go |
| Monorepo | Monorepo | Multiple |
| Empty Repo | Edge case | — |
Integration tests also run against real-world repositories (see test_real_repos.py).
pip install codebase-md
pip install "codebase-md[ast]"
pip install git+https://github.com/sauravanand542/codebase-md.git
git clone https://github.com/sauravanand542/codebase-md.git
cd codebase-md
pip install -e ".[dev,ast]"
# Initialize config in your project
cd your-project/
codebase init
# Scan your codebase (builds internal project model)
codebase scan .
# Generate context files for all AI tools
codebase generate .
That's it. You now have CLAUDE.md, .cursorrules, AGENTS.md, codex.md, .windsurfrules, and PROJECT_CONTEXT.md in your project root.
Scans your project and builds a complete model: languages, architecture, dependencies, conventions, modules, git history.
codebase scan . # Scan current directory
codebase scan /path/to/project # Scan a specific project
Generates context files from the last scan.
codebase generate . # Generate all formats
codebase generate . --format claude # Generate only CLAUDE.md
Dependency health dashboard: checks versions against registries and computes health scores.
codebase deps . # Health dashboard (queries PyPI/npm)
codebase deps . --offline # Offline mode (no network)
codebase deps . --upgrade typer # Migration plan for a specific package
Query relevant project context with smart ranking.
codebase context "architecture" # Find architecture info
codebase context "dependencies" --max 3 # Top 3 relevant chunks
codebase context "how to test" --compact # Content-only output
Install git hooks for automatic regeneration.
codebase hooks install . # Install post-commit hooks
codebase hooks status . # Show installed hooks
codebase hooks remove . # Remove hooks
Initialize .codebase/ configuration directory.
codebase init # Creates .codebase/config.yaml
| Format | File | AI Tool | Description |
|---|---|---|---|
| claude | CLAUDE.md | Claude Code | Structured markdown with project summary, architecture, conventions |
| cursor | .cursorrules | Cursor | Coding rules, language-specific guidance, tech stack |
| agents | AGENTS.md | Multi-agent | Compact entry points, commands, architecture flow |
| codex | codex.md | Codex CLI | Overview, setup, project structure, conventions |
| windsurf | .windsurfrules | Windsurf | Rules-based format with architecture and file map |
| generic | PROJECT_CONTEXT.md | Any tool | Complete markdown with all sections + metadata |
50+ file extensions recognized. Framework detection for Python (Django, FastAPI, Flask), JavaScript/TypeScript (React, Next.js, Express, Vue).
Monolith, monorepo, microservice, library, CLI tool — detected from folder structure, entry points, and package layout.
- Naming: snake_case, camelCase, PascalCase, kebab-case
- Imports: absolute, relative, mixed
- File organization: modular, layer-based, feature-based, flat
- Design patterns: model, view, controller, service, repository, etc.
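As a rough illustration of how the naming styles listed above could be told apart, the sketch below classifies identifiers with regular expressions and picks the most common style. This is an assumption-laden simplification, not the project's tree-sitter based inference; `classify_name`, `dominant_style`, and the exact patterns are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, Optional

# Hypothetical patterns; single lowercase words such as "name" are treated
# as ambiguous and deliberately match none of them.
PATTERNS = {
    "snake_case": re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)+$"),
    "camelCase": re.compile(r"^[a-z][a-z0-9]*([A-Z][a-z0-9]*)+$"),
    "PascalCase": re.compile(r"^([A-Z][a-z0-9]+)+$"),
    "kebab-case": re.compile(r"^[a-z][a-z0-9]*(-[a-z0-9]+)+$"),
}


def classify_name(name: str) -> Optional[str]:
    """Return the naming style of one identifier, or None if ambiguous."""
    for style, pattern in PATTERNS.items():
        if pattern.match(name):
            return style
    return None


def dominant_style(names: Iterable[str]) -> Optional[str]:
    """Return the most frequent style among the names, or None if none match."""
    counts = Counter(
        style for style in (classify_name(n) for n in names) if style is not None
    )
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

In practice the identifiers would come from parsed source (function, class, and file names) rather than a hand-written list, and each kind of identifier would probably be scored separately.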
Parses package.json, requirements.txt, pyproject.toml, go.mod, Cargo.toml, Gemfile. Health scoring via live registry queries (PyPI, npm).
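For a sense of what parsing one of these manifests involves, here is a small sketch that extracts package names and version specifiers from requirements.txt content. It is a simplified assumption, not the project's parser: `Requirement` and `parse_requirements` are hypothetical names, and option lines, environment markers, and URL-based requirements are skipped or stripped rather than handled fully.

```python
import re
from typing import NamedTuple


class Requirement(NamedTuple):
    name: str
    specifier: str  # e.g. "==0.1.0", or "" when the package is unpinned


# Package name, optional [extras], then whatever version specifier follows.
_LINE = re.compile(r"^([A-Za-z0-9][A-Za-z0-9._-]*)\s*(\[[^\]]*\])?\s*(.*)$")


def parse_requirements(text: str) -> list[Requirement]:
    """Parse simple 'name[extras] specifier' lines from requirements.txt text."""
    requirements = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # skip blanks and options like -r / -e
            continue
        line = line.split(";", 1)[0].strip()  # drop environment markers
        match = _LINE.match(line)
        if match:
            name, _extras, spec = match.groups()
            requirements.append(Requirement(name.lower(), spec.replace(" ", "")))
    return requirements
```

For example, a line such as `pydantic[email] == 2.5.0` yields the name `pydantic` with the specifier `==2.5.0`. A production parser would more likely rely on a dedicated requirements-parsing library than on a regular expression.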
src/codebase_md/
├── cli.py # Typer CLI — all commands
├── model/ # Pydantic v2 data models (frozen, validated)
├── scanner/ # Codebase analysis engine
│ ├── engine.py # Orchestrates all scanners
│ ├── language_detector.py
│ ├── structure_analyzer.py
│ ├── dependency_parser.py
│ ├── convention_inferrer.py # tree-sitter powered
│ ├── ast_analyzer.py # tree-sitter AST
│ └── git_analyzer.py
├── generators/ # Output format generators (plugin-style)
├── depshift/ # Dependency intelligence engine
│ ├── analyzer.py # Health scoring
│ ├── version_differ.py # Breaking change detection
│ ├── usage_mapper.py # Import → source location mapping
│ └── registries/ # PyPI + npm clients
├── context/ # Smart context routing
│ ├── chunker.py # 12 topic-based chunks
│ ├── ranker.py # 6-signal TF-IDF scoring
│ └── router.py # Query pipeline
├── persistence/ # .codebase/ state management
└── integrations/ # Git hooks, GitHub Actions
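The usage_mapper.py entry in the tree above is described as mapping imports to source locations. The sketch below shows one way that idea could work for Python files, using the standard library's ast module rather than tree-sitter. The function name `find_imports` and its return shape are assumptions for illustration, not the module's actual interface.

```python
import ast


def find_imports(source: str, package: str, filename: str = "<string>") -> list[tuple[str, int]]:
    """Return (filename, line number) for every absolute import of `package`."""
    locations = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 means an absolute import; relative imports are ignored.
            modules = [node.module]
        else:
            continue
        # Compare the top-level package, so "typer.testing" counts as "typer".
        if any(mod.split(".")[0] == package for mod in modules):
            locations.append((filename, node.lineno))
    # ast.walk does not guarantee source order, so sort by line number.
    return sorted(locations)
```

Run over every Python file in a project, a mapping like this would indicate which lines are affected when a given dependency is upgraded.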
After codebase init, edit .codebase/config.yaml:
version: 1
generators:
  - claude
  - cursor
  - agents
  - codex
  - windsurf
  - generic
scan:
  exclude:
    - node_modules
    - .venv
    - dist
    - build
hooks:
  post_commit: true
  pre_push: false
See CONTRIBUTING.md for development setup, coding conventions, and PR guidelines.