LLM-enhanced code lifecycle analysis and documentation generator
CodeWiki helps you maintain clean codebases by automatically analyzing files, classifying their lifecycle status, and generating up-to-date documentation.
- 🔍 Repository Scanning: Comprehensive codebase analysis with metadata extraction
- 🤖 Multi-Provider LLM Support: Use any LLM provider - local or cloud, with flexible configuration
  - Local: Ollama, LM Studio (100% private, no API keys)
  - Cloud: OpenAI, Anthropic, Groq, or any OpenAI-compatible API (opt-in)
  - Auto-detects API format (Ollama vs OpenAI-compatible)
  - Priority-based failover for reliability
- ⚡ Local Inference Optimization: Powered by LIR for 2.85x faster inference
- 🧠 Reasoning Model Support: Handles advanced models with `<think>` tag parsing
- 🎯 Hybrid Mode: Smart file selection (77% fewer LLM calls vs full mode)
- 🛡️ Privacy-First Design: 100% local by default, cloud providers disabled until explicitly enabled
- 📊 Automatic Documentation: Generates markdown docs with architecture overviews
- 📈 Operational Profiles: Daily/Weekly/Audit modes for different use cases
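The reasoning-model support in the feature list depends on separating a model's `<think>` reasoning from its final answer. The following is a minimal sketch of one way to do that, assuming the reasoning is wrapped in literal `<think>...</think>` tags. `strip_think_tags` is a hypothetical helper and the sample response is made up; this is not CodeWiki's actual parser.

```python
import re

# Matches a complete <think>...</think> block, including embedded newlines.
_THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL | re.IGNORECASE)


def strip_think_tags(response: str) -> str:
    """Return the response with reasoning blocks removed (hypothetical helper)."""
    cleaned = _THINK_BLOCK.sub("", response)
    # If the model was cut off mid-reasoning, drop the unterminated block too.
    open_idx = cleaned.lower().find("<think>")
    if open_idx != -1:
        cleaned = cleaned[:open_idx]
    return cleaned.strip()


# Made-up model output: reasoning first, then the JSON answer to parse.
raw = '<think>Nothing imports this file.</think>\n{"recommendation": "archive"}'
print(strip_think_tags(raw))
```

Stripping the reasoning before parsing keeps a JSON parser from failing on the free-form text that precedes the answer.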
# Install from source
git clone https://github.com/openjay/codewiki.git
cd codewiki
pip install -e .
# Or with development dependencies
pip install -e ".[dev]"
# 1. Start your local LLM (Ollama or LM Studio)
ollama serve # or start LM Studio GUI
# 2. Run lifecycle classification (rule-based, safe default)
codewiki --mode lifecycle
# 3. Enable LLM enhancement (edit config/code_wiki_config.yaml: use_llm: true)
codewiki --mode lifecycle
# 4. Inspect results
python -m codewiki.inspect_lifecycle_result
CodeWiki supports three operational profiles:
- Daily
  - LLM calls: 50-80 files
  - Runtime: ~1 min
  - Use case: Fast scans, CI checks
  - Config: `llm_max_files: 80`
- Weekly
  - LLM calls: 150 files
  - Runtime: ~2-3 min (LM Studio) / ~20 min (Ollama)
  - Use case: Comprehensive cleanup
  - Config: `llm_max_files: 150`
- Audit
  - LLM calls: All files
  - Runtime: ~15 min (LM Studio) / ~140 min (Ollama)
  - Use case: Major refactoring, full audit
  - Config: `llm_mode: "full"`, `llm_max_files: null`
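The three profiles differ mainly in how many files are sent to the LLM. The sketch below illustrates that idea under the assumption that `llm_max_files` is a simple cap and that `null` means no cap. The `PROFILES` table and `select_llm_files` function are hypothetical names for illustration, not part of CodeWiki's API, and real file selection in hybrid mode is presumably smarter than taking the first N candidates.

```python
from typing import Optional

# llm_max_files per profile, taken from the settings listed above.
PROFILES: dict[str, Optional[int]] = {
    "daily": 80,
    "weekly": 150,
    "audit": None,  # llm_max_files: null, interpreted here as "no cap"
}


def select_llm_files(candidates: list[str], profile: str) -> list[str]:
    """Return the candidates that would be sent to the LLM (hypothetical)."""
    cap = PROFILES[profile]
    return list(candidates) if cap is None else candidates[:cap]


# Illustrative repository with 664 files.
files = [f"module_{i}.py" for i in range(664)]
for name in PROFILES:
    print(name, len(select_llm_files(files, name)))
```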
CodeWiki supports multiple LLM providers through config/llm_providers.json:
- Ollama (Priority 1, Enabled): 100% local, no API key needed
- LM Studio (Priority 2, Enabled): 100% local, OpenAI-compatible
- Cloud Providers (Priority 99, Disabled): OpenAI, Anthropic, Groq
Your code never leaves your machine unless you explicitly enable cloud providers.
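Priority-based failover can be pictured as trying enabled providers in ascending priority order. The sketch below assumes each provider entry carries `provider`, `priority`, and `enabled` fields; the names `ollama` and `lm_studio`, the `send` callback, and both functions are hypothetical and do not describe CodeWiki's internal client.

```python
from typing import Callable

# Illustrative entries mirroring the default priorities described above.
providers = [
    {"provider": "openai", "priority": 99, "enabled": False},
    {"provider": "lm_studio", "priority": 2, "enabled": True},
    {"provider": "ollama", "priority": 1, "enabled": True},
]


def failover_order(entries: list[dict]) -> list[str]:
    """Enabled provider names, lowest priority number first."""
    enabled = [p for p in entries if p["enabled"]]
    return [p["provider"] for p in sorted(enabled, key=lambda p: p["priority"])]


def call_with_failover(entries: list[dict], send: Callable[[str], str]) -> str:
    """Try each enabled provider in order until one answers."""
    errors = []
    for name in failover_order(entries):
        try:
            return send(name)
        except ConnectionError as exc:
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all enabled providers failed: " + "; ".join(errors))


def fake_send(name: str) -> str:
    # Stand-in for a real request: pretend Ollama is not running.
    if name == "ollama":
        raise ConnectionError("not running")
    return f"answered by {name}"


print(failover_order(providers))
print(call_with_failover(providers, fake_send))
```

Because disabled entries are filtered out before sorting, a cloud provider is never contacted unless its `enabled` flag has been set.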
- Edit `config/llm_providers.json`
- Find the provider (e.g., `openai`)
- Set `enabled: true` and configure `api_key`
- Optionally set environment variable:
export OPENAI_API_KEY="sk-proj-..."
codewiki --mode lifecycle
CodeWiki supports any OpenAI-compatible API:
{
"provider": "my_custom_llm",
"api_type": "openai",
"base_url": "http://localhost:8000/v1",
"api_key": "${CUSTOM_API_KEY}",
"models": ["your-model"],
"priority": 1,
"enabled": true
}
See Multi-Provider Architecture for details.
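The `"api_key": "${CUSTOM_API_KEY}"` entry suggests that keys are read from the environment rather than stored in the file. A sketch of how such placeholders could be expanded with the standard library's `string.Template` follows; whether CodeWiki resolves them this way is an assumption, and `expand_env` is a hypothetical helper.

```python
import json
import os
from string import Template


def expand_env(value, env=None):
    """Recursively replace ${VAR} placeholders in strings (hypothetical helper)."""
    env = os.environ if env is None else env
    if isinstance(value, str):
        # safe_substitute leaves unknown placeholders untouched instead of raising.
        return Template(value).safe_substitute(env)
    if isinstance(value, list):
        return [expand_env(item, env) for item in value]
    if isinstance(value, dict):
        return {key: expand_env(item, env) for key, item in value.items()}
    return value


raw = json.loads(
    '{"provider": "my_custom_llm", "api_key": "${CUSTOM_API_KEY}",'
    ' "priority": 1, "enabled": true}'
)
# An explicit mapping stands in for the real environment in this example.
resolved = expand_env(raw, env={"CUSTOM_API_KEY": "example-key"})
print(resolved["api_key"])
```

Leaving unresolved placeholders in place makes a missing variable easy to spot in an error message, rather than silently sending an empty key.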
CodeWiki uses LIR (Local Inference Runtime) for optimized local inference:
- Automatic batching and scheduling for improved throughput
- Thermal-aware throttling to reduce fan noise
- Connection pooling for faster requests
- Multi-provider support with intelligent failover
Installation:
# Install LIR as editable dependency (development)
pip install -e ../lir
# Or install from GitHub (production)
pip install git+https://github.com/openjay/lir.git@v0.1.0
Performance: 2.85x throughput improvement, 8.9x better latency
See LIR Integration Guide for detailed setup.
CodeWiki supports 6+ LLM providers with automatic failover:
- Ollama (Primary): More accurate (85% parse success)
  ollama serve
  ollama pull qwen3:8b
- LM Studio (Backup): Faster (~10x speed)
  - Start LM Studio GUI
  - Load any chat model
  - Automatic failover if Ollama unavailable
- OpenAI (gpt-4, gpt-3.5-turbo)
- Anthropic (claude-3-opus, claude-3-sonnet)
- Groq (mixtral-8x7b, llama2-70b)
- Custom (any OpenAI-compatible endpoint)
All cloud providers are disabled by default for privacy.
Enable via config/llm_providers.json (see Multi-Provider Configuration section).
- `config/code_wiki_config.yaml`: Main configuration
- `config/llm_providers.json`: LLM provider settings
Override default configs:
export CODEWIKI_CONFIG=/path/to/custom/config.yaml
export CODEWIKI_LLM_PROVIDERS=/path/to/custom/providers.json
export CODEWIKI_LIR_POLICY=balanced # silent/balanced/performance
codewiki --mode scan
codewiki --mode lifecycle
# 1. Edit config/code_wiki_config.yaml: use_llm: true
# 2. Run classification
codewiki --mode lifecycle
codewiki --mode lifecycle --preview
codewiki --mode docs
python -m codewiki.inspect_lifecycle_result
python -m codewiki.inspect_lifecycle_result --verbose
- Multi-tier confidence thresholds: Only allow archive/delete with confidence ≥ 0.6
- Graceful fallback: Automatic fallback to rule-based on LLM failures
- No auto-deletion: All destructive actions require human review
- Clear-case detection: 75% of files handled by safe rules, no LLM needed
- Parse success: 85%+ (Ollama), 78%+ (LM Studio)
- Review recommendations: <50 files typical
- False positives: Near zero (validated on 664-file codebase)
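The confidence threshold and the rule-based fallback can be combined in a single decision step. The sketch below assumes the LLM result is a dict with `action` and `confidence` keys, or `None` when the call or parse failed. The `decide` function, the `keep` action label, and the result shape are hypothetical; only the 0.6 threshold comes from the list above.

```python
from typing import Optional

DESTRUCTIVE_ACTIONS = {"archive", "delete"}
MIN_DESTRUCTIVE_CONFIDENCE = 0.6  # threshold stated in the list above


def decide(rule_action: str, llm_result: Optional[dict]) -> dict:
    """Pick the final recommendation for one file (hypothetical logic)."""
    action, source = rule_action, "rules"  # graceful fallback by default
    if llm_result is not None:
        llm_action = llm_result["action"]
        confident = llm_result["confidence"] >= MIN_DESTRUCTIVE_CONFIDENCE
        # Destructive suggestions are accepted only above the threshold.
        if llm_action not in DESTRUCTIVE_ACTIONS or confident:
            action, source = llm_action, "llm"
    # Nothing is deleted automatically: destructive outcomes go to a human.
    return {
        "action": action,
        "source": source,
        "needs_review": action in DESTRUCTIVE_ACTIONS,
    }


print(decide("keep", None))
print(decide("keep", {"action": "delete", "confidence": 0.4}))
print(decide("keep", {"action": "archive", "confidence": 0.9}))
```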
┌─────────────────┐
│ CLI / Module │
└────────┬────────┘
│
┌────────▼────────┐ ┌──────────────┐
│ Orchestrator │────▶│ Repo Scanner │
└────────┬────────┘ └──────────────┘
│
├─────▶ Lifecycle Classifier
│ ├─ Rule-based (fast)
│ └─ LLM-enhanced (hybrid)
│ │
│ ▼
│ ┌─────────────────┐
│ │ Multi-Provider │
│ │ LLM Client │
│ │ ┌────────────┐ │
│ │ │ Ollama │ │
│ │ │ LM Studio │ │
│ │ │ OpenAI │ │
│ │ │ Custom... │ │
│ │ └────────────┘ │
│ └─────────────────┘
│
└─────▶ Doc Generator
└─ Markdown output
- Multi-Provider Architecture - Flexible LLM provider system
- LIR Integration Guide - Local inference optimization setup
- LLM Integration Guide - LLM provider configuration
- Reasoning Model Fix - Support for reasoning models
- Operational Guide - Daily usage and best practices
- Design Document - Architecture and design decisions
Tested on 664-file Python codebase:
| Mode | Runtime | LLM Calls | Parse Success | Review Count |
|---|---|---|---|---|
| Rule-based | 0.1s | 0 | N/A | 0 |
| LLM Full | ~7 min | 664 | 78% | 426 |
| LLM Hybrid | ~2-3 min | 150 | 85% | 6 |
Key Achievement: 77% fewer LLM calls, 99% fewer review recommendations vs full mode.
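The two headline percentages follow directly from the table rows for LLM Full and LLM Hybrid. A short check of the arithmetic, with `percent_reduction` as an illustrative helper:

```python
def percent_reduction(before: int, after: int) -> float:
    """Relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100


# LLM calls: 664 (full) vs 150 (hybrid)
print(f"{percent_reduction(664, 150):.1f}% fewer LLM calls")
# Review recommendations: 426 (full) vs 6 (hybrid)
print(f"{percent_reduction(426, 6):.1f}% fewer review recommendations")
```

This prints 77.4% and 98.6%, which round to the 77% and 99% quoted above.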
- Python ≥ 3.10
- pyyaml >= 6.0
- requests >= 2.28
- Local LLM (Ollama or LM Studio) - optional
# Clone repository
git clone https://github.com/openjay/codewiki.git
cd codewiki
# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate # or `.venv\Scripts\activate` on Windows
# Install with dev dependencies
pip install -e ".[dev]"
# Run tests
pytest tests/
# Format code
black codewiki/
ruff check codewiki/
Contributions welcome! Please:
- Fork the repository
- Create a feature branch
- Add tests for new features
- Ensure all tests pass
- Submit a pull request
MIT License - see LICENSE file for details.
- Digital Me - Autonomous multi-agent AI platform (where CodeWiki was originally developed)
Current: 2.1.0 (Multi-Provider Architecture)
- v2.1.0 (2025-12-30)
  - Multi-provider architecture (6+ providers)
  - OpenAI-compatible API support
  - Automatic API type detection
  - Reasoning model support (`<think>` tag parsing)
  - Flexible authentication (environment variables)
  - Privacy-first defaults maintained
- v2.0.0 (2025-12-20)
  - LIR-powered local inference optimization (2.85x speedup)
  - Thermal-aware throttling
  - Connection pooling
- v1.2.0 (2025-11-20)
  - Hybrid mode implementation
  - LLM integration
- v1.0.0 (2025-11-15)
  - Initial release
  - Rule-based classification
  - Documentation generation
Jay - 2025