An intelligent system that automatically analyzes your codebase and generates comprehensive documentation, with specialized support for AI/ML pipelines.
- Intelligent Code Analysis: AST-based Python code analysis with complexity metrics
- AI/ML Pipeline Detection: Specialized analysis for machine learning components
- Comprehensive Documentation: Generates multiple documentation sections automatically
- Visual Diagrams: Architecture diagrams and data flow visualizations
- Supabase Integration: Complete data logging, monitoring, and debug interface
- CI/CD Integration: GitHub Actions workflow for automated documentation updates
- Beautiful Output: MkDocs Material theme with modern UI
- Fast & Efficient: Optimized for large codebases
The system automatically generates:
- Project Overview: High-level project statistics and summary
- Architecture Documentation: System design and component relationships
- API Reference: Detailed function and class documentation
- Onboarding Guide: A getting-started guide for new developers
- AI/ML Documentation: Machine learning models and pipelines (if detected)
- Code Quality Reports: Complexity analysis and metrics
```bash
# Clone the repository
git clone <your-repo-url>
cd auto_doc_generator

# Install dependencies
pip install -r requirements.txt

# Or install as a package
pip install -e .
```

```bash
# Analyze current directory and generate documentation
python src/main.py --analyze --generate

# Analyze, generate, and build MkDocs site
python src/main.py --analyze --generate --build

# Serve documentation locally
python src/main.py --serve

# Analyze specific repository
python src/main.py --repo /path/to/your/project --analyze --generate
```

```bash
# Build Docker image
docker build -t auto-doc-generator .

# Run analysis on your project
docker run -v /path/to/your/project:/app/source -v /path/to/output:/app/docs auto-doc-generator --repo /app/source
```

```
auto_doc_generator/
├── src/
│   ├── analyzers/
│   │   ├── code_analyzer.py          # Core code analysis
│   │   └── ai_pipeline_analyzer.py   # AI/ML detection
│   ├── generators/
│   │   ├── markdown_generator.py     # Documentation generation
│   │   └── diagram_generator.py      # Visual diagram creation
│   ├── supabase_integration.py       # Supabase logging & storage
│   ├── debug_interface.py            # Web debug interface
│   └── main.py                       # Main entry point
├── config/
│   ├── doc_config.yaml               # Main configuration
│   └── analysis_rules.yaml           # Analysis rules
├── templates/
│   ├── base_template.md              # Base documentation template
│   ├── architecture_template.md      # Architecture documentation
│   ├── api_template.md               # API reference template
│   └── onboarding_template.md        # Onboarding guide template
├── .github/workflows/
│   └── auto-doc.yml                  # GitHub Actions workflow
├── docs/                             # Generated documentation output
├── setup_supabase.py                 # Supabase setup script
├── supabase_setup.sql                # Complete database setup (handles all scenarios)
├── start_debug_server.py             # Debug interface launcher
├── Dockerfile                        # Container definition
├── requirements.txt                  # Python dependencies
└── README.md                         # This file
```
`config/doc_config.yaml`:

```yaml
analysis:
  include_patterns:
    - "*.py"
    - "*.js"
    - "*.ts"
  exclude_patterns:
    - "*/tests/*"
    - "*/__pycache__/*"

ai_analysis:
  detect_frameworks: true
  analyze_pipelines: true
  generate_flow_diagrams: true

generation:
  output_format: "mkdocs"
  theme: "material"
  include_diagrams: true
  include_api_docs: true

deployment:
  target: "github_pages"
  auto_deploy: true
```

`config/analysis_rules.yaml`:

```yaml
complexity_thresholds:
  cyclomatic:
    low: 5
    medium: 10
    high: 15

code_patterns:
  ai_pipeline:
    - "class.*Pipeline"
    - "def.*train"
    - "def.*predict"
```

The system includes comprehensive Supabase integration for logging, data storage, and monitoring of the documentation generation process.
- Analysis Tracking: Complete analysis results storage
- LLM Logging: All AI/LLM interactions with metrics
- Vector Embeddings: Code embeddings with pgvector for fast semantic search
- Quality Assessments: Module quality metrics and insights
- Documentation Tracking: Generation metadata and results
- Debug Interface: Web-based monitoring dashboard
1. Create Supabase Project:
   - Go to supabase.com and create a new project
   - Note your project URL and anon key

2. Set Environment Variables:

   ```bash
   export SUPABASE_URL='https://your-project-ref.supabase.co'
   export SUPABASE_ANON_KEY='your-supabase-anon-key'
   ```

3. Run Setup Script:

   ```bash
   python setup_supabase.py
   ```

4. Execute Database Schema:
   - Copy the contents of `supabase_setup.sql` and execute them in the Supabase SQL Editor
   - This single file handles both new setups and existing installations automatically
The system creates 6 tables for comprehensive data storage:
| Table | Purpose |
|---|---|
| `analysis_steps` | Track each step of the analysis process |
| `llm_interactions` | Log all AI/LLM requests and responses |
| `vector_embeddings` | Store code embeddings using pgvector for fast semantic search |
| `quality_assessments` | Module quality metrics and LLM insights |
| `complete_analysis_results` | Full analysis data (code, AI, quality) |
| `documentation_generations` | Documentation generation tracking |
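Under the hood, pgvector ranks stored embeddings by vector similarity. The self-contained sketch below illustrates the idea with cosine similarity over toy vectors; the embeddings and names are made up for illustration and do not reflect the real `vector_embeddings` schema:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Toy embeddings standing in for rows of the vector_embeddings table
stored = {
    "load_data": [0.9, 0.1, 0.0],
    "train_model": [0.1, 0.9, 0.2],
}
query = [0.2, 0.8, 0.1]  # embedding of a query such as "fit the classifier"

# Rank stored snippets by similarity to the query, as pgvector would
best = max(stored, key=lambda name: cosine_similarity(stored[name], query))
print(best)
# train_model
```

In production this ranking happens inside Postgres via pgvector's nearest-neighbour operators, so the vectors never leave the database.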
Monitor your data with the web-based debug interface:
```bash
# Start debug server
python start_debug_server.py

# Visit dashboard
open http://localhost:5001
```

Available Endpoints:

- `/api/database-stats` - Database statistics
- `/api/llm-interactions` - Recent AI interactions
- `/api/analysis-steps` - Analysis step history
- `/api/complete-analysis-results` - Full analysis data
- `/api/documentation-generations` - Documentation tracking
- `/api/vector-embeddings/search` - Semantic code search
Add the Supabase configuration to your `documentor.yaml`:
```yaml
supabase:
  enabled: true
  url: ${SUPABASE_URL}
  key: ${SUPABASE_ANON_KEY}

# Optional: Customize logging behavior
logging:
  supabase:
    log_analysis_steps: true
    log_llm_interactions: true
    log_quality_assessments: true
    log_complete_results: true
    log_documentation: true
```

The system automatically logs data when Supabase is configured:

```bash
# Run analysis with Supabase logging
python -m auto_doc_generator.main --analyze --generate

# Start debug interface to monitor
python start_debug_server.py
```

- Development: Data stored indefinitely
- Production: Consider implementing data retention policies
- Privacy: All data stored in your Supabase instance
For detailed setup instructions, see `SUPABASE_INTEGRATION.md`.
The system includes a pre-configured GitHub Actions workflow:
- Automatic Triggers: Runs on push to main branch and merged PRs
- Documentation Generation: Analyzes code and generates docs
- GitHub Pages Deployment: Automatically deploys to GitHub Pages
- Quality Checks: Lints documentation and runs completeness checks
- Copy `.github/workflows/auto-doc.yml` to your repository
- Enable GitHub Pages in repository settings
- Push changes to trigger the workflow
- Documentation will be available at `https://username.github.io/repository-name/`
```python
from src.analyzers.code_analyzer import CodeAnalyzer
from src.analyzers.ai_pipeline_analyzer import AIPipelineAnalyzer

# Initialize analyzers
code_analyzer = CodeAnalyzer("/path/to/project")
ai_analyzer = AIPipelineAnalyzer()

# Perform analysis
code_results = code_analyzer.analyze_codebase()
ai_results = ai_analyzer.analyze_ai_components("/path/to/project")
```

```python
from src.generators.markdown_generator import MarkdownGenerator

# Initialize generator
generator = MarkdownGenerator("templates", "output")

# Generate specific documentation
docs = generator.generate_all_documentation(code_results, ai_results)
generator.save_documentation(docs)
```

The system provides specialized analysis for:
- Model Detection: Identifies ML models, classifiers, and regressors
- Pipeline Analysis: Detects data processing pipelines
- Training Scripts: Finds model training functions
- Inference Endpoints: Locates prediction/inference code
- Experiment Tracking: Detects MLflow, WandB, TensorBoard usage
- Data Sources: Identifies data loading and processing patterns
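Detection like this can be driven by the `code_patterns` rules in `config/analysis_rules.yaml`. The sketch below is a minimal, hypothetical illustration of applying those regexes to source text, not the analyzer's actual implementation:

```python
import re

# Regexes mirroring the ai_pipeline entries in config/analysis_rules.yaml
AI_PIPELINE_PATTERNS = [r"class.*Pipeline", r"def.*train", r"def.*predict"]

def find_ai_pipeline_lines(source: str) -> list[str]:
    """Return the source lines that match any AI-pipeline pattern."""
    return [
        line.strip()
        for line in source.splitlines()
        if any(re.search(p, line) for p in AI_PIPELINE_PATTERNS)
    ]

sample = """
class TrainingPipeline:
    def train(self, X, y): ...
    def predict(self, X): ...
"""
print(find_ai_pipeline_lines(sample))
# ['class TrainingPipeline:', 'def train(self, X, y): ...', 'def predict(self, X): ...']
```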
Supported frameworks include:

- TensorFlow/Keras
- PyTorch
- Scikit-learn
- Pandas/NumPy
- MLflow
- Weights & Biases (WandB)
- XGBoost/LightGBM
- Hugging Face Transformers
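One plausible way to recognize these frameworks is to scan a file's imports with Python's `ast` module. The mapping and function below are illustrative only and not the analyzer's real lookup table:

```python
import ast

# Illustrative mapping from top-level import names to framework labels
FRAMEWORKS = {
    "tensorflow": "TensorFlow/Keras",
    "keras": "TensorFlow/Keras",
    "torch": "PyTorch",
    "sklearn": "Scikit-learn",
    "pandas": "Pandas/NumPy",
    "numpy": "Pandas/NumPy",
    "mlflow": "MLflow",
    "wandb": "Weights & Biases (WandB)",
    "xgboost": "XGBoost/LightGBM",
    "lightgbm": "XGBoost/LightGBM",
    "transformers": "Hugging Face Transformers",
}

def detect_frameworks(source: str) -> set[str]:
    """Collect framework labels for every import found in the source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            roots = [alias.name.split(".")[0] for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            roots = [node.module.split(".")[0]]
        else:
            continue
        found.update(FRAMEWORKS[r] for r in roots if r in FRAMEWORKS)
    return found

print(detect_frameworks("import torch\nfrom sklearn.svm import SVC"))
# {'PyTorch', 'Scikit-learn'} (set order may vary)
```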
The system analyzes:
- Cyclomatic Complexity: Function complexity scoring
- Maintainability Index: Code maintainability metrics
- Halstead Metrics: Software complexity measures
- Dependency Analysis: Module interdependencies
- Architecture Patterns: Design pattern detection
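As a rough illustration of cyclomatic complexity scoring (the analyzer itself relies on Radon, per the acknowledgments), here is a simplified stdlib-only counter; it only approximates Radon's rules:

```python
import ast

def cyclomatic_complexity(source: str) -> int:
    """Rough cyclomatic complexity: 1 plus the number of branch points.
    A simplified stand-in for Radon's scoring, for illustration only."""
    branch_nodes = (ast.If, ast.For, ast.While, ast.BoolOp, ast.ExceptHandler)
    return 1 + sum(isinstance(n, branch_nodes) for n in ast.walk(ast.parse(source)))

code = """
def grade(score):
    if score > 90:
        return "A"
    elif score > 80:
        return "B"
    return "C"
"""
print(cyclomatic_complexity(code))
# 3, below the 'low' threshold of 5 from analysis_rules.yaml
```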
Issue: Import errors when running analysis

```bash
# Solution: Set PYTHONPATH
export PYTHONPATH="${PYTHONPATH}:$(pwd)/src"
python src/main.py --analyze --generate
```

Issue: MkDocs not found

```bash
# Solution: Install MkDocs
pip install mkdocs mkdocs-material
```

Issue: Diagrams library errors

```bash
# Solution: Install system dependencies
sudo apt-get install graphviz graphviz-dev
pip install diagrams
```

- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Make your changes
- Add tests if applicable
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
```bash
# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Format code
black src/

# Lint code
flake8 src/
```

- Analysis Speed: ~1000 lines of code per second
- Memory Usage: <100MB for typical projects
- Output Size: ~1-5MB documentation for medium projects
- Build Time: 30-60 seconds for full documentation generation
- No external API calls during analysis
- Local processing only
- Configurable file inclusion/exclusion
- Safe AST parsing without code execution
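For example, Python's `ast` module extracts structure from source text without importing or running any of it, which is what makes the analysis safe:

```python
import ast

source = '''
import os

def helper():
    os.system("echo side effect")  # never runs during analysis

class Service:
    def run(self): ...
'''

tree = ast.parse(source)  # parses text only; nothing is imported or executed
funcs = [n.name for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)]
classes = [n.name for n in ast.walk(tree) if isinstance(n, ast.ClassDef)]
print(funcs, classes)
# ['helper', 'run'] ['Service']
```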
This project is licensed under the MIT License - see the LICENSE file for details.
- Built with MkDocs and Material theme
- Code analysis powered by Python AST and Radon
- Diagrams created with Diagrams library
- Inspired by the need for always up-to-date documentation
- Create an issue for bug reports or feature requests
- Check existing issues for solutions
- Read the generated documentation for usage examples
Auto-generated documentation for the win!