An intelligent 3-agent system powered by CrewAI with RAG (Retrieval-Augmented Generation) that automatically analyzes, refactors, and tests Python code using language models while learning from your existing codebase patterns.
- Clone/Setup the project:
- Create virtual environment:
python -m venv venv
.\venv\Scripts\Activate.ps1 # Windows
- Install dependencies:
pip install -r requirements.txt
- Set up API key:
# Add to .env file
GROQ_API_KEY=your_api_key_here

Instead of hard-coding settings, change behavior with a YAML file:
# Adjust these without touching code:
processing:
  files:
    - "target_repo/bad_code.py"
    - "src/module.py" # Add more files!
  refactoring_level: "aggressive"
  backup_originals: true

Override config on the command line:
# Process a different file
python main.py --file my_file.py
# Use a different config
python main.py --config config.conservative.yaml
# Choose an approach (conservative or aggressive)
python main.py --conservative
python main.py --aggressive

Logs are written automatically to both a file and the console:
# View logs after running
cat logs/codebase_agent.log
# Or on Windows
type logs\codebase_agent.log

Logs include:
- Backup confirmations
- Processing progress
- Rate limit notifications
- Dependency analysis summaries
- Benchmark results
- Error details
- Timestamps for all events
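
For readers who want to see how logging to a file and the console at once can be wired up, here is a minimal sketch using only the standard library. The function name `setup_logging`, the logger name, and the message format are assumptions for illustration; the project's own logging setup may differ.

```python
import logging
from pathlib import Path


def setup_logging(log_file: str = "logs/codebase_agent.log") -> logging.Logger:
    """Return a logger that writes timestamped records to a file and the console."""
    # Create the logs/ directory on first use.
    Path(log_file).parent.mkdir(parents=True, exist_ok=True)

    logger = logging.getLogger("codebase_agent")  # hypothetical logger name
    logger.setLevel(logging.INFO)

    if not logger.handlers:  # avoid attaching duplicate handlers on repeated calls
        formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        handlers = (
            logging.FileHandler(log_file, encoding="utf-8"),  # file output
            logging.StreamHandler(),  # console output (stderr by default)
        )
        for handler in handlers:
            handler.setFormatter(formatter)
            logger.addHandler(handler)
    return logger
```

Calling `setup_logging().info("Backup created")` would then write the same timestamped line to both destinations.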
After each run, open the HTML report:
open reports/refactoring_report.html # macOS
xdg-open reports/refactoring_report.html # Linux
start reports/refactoring_report.html # Windows

Reports show:
- Total files processed & processing time
- Success/failure count
- Per-file analysis results
- Backup locations
- Detailed status for each agent
- Quality metrics before/after comparison (Cyclomatic, MI, Halstead, code size, signal-to-noise)
Originals are backed up to timestamped folders (YYYYMMDD_HHMMSS):
backups/
├── 20260304_143015/
│   └── bad_code.py
├── 20260304_150345/
│   └── bad_code.py
└── 20260304_152812/
    └── bad_code.py
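
The timestamped layout above can be produced with a few lines of standard-library code. This is only a sketch of the idea: `backup_file` is a hypothetical helper, not necessarily the function the project uses.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_file(source: str, backup_root: str = "backups") -> Path:
    """Copy `source` into <backup_root>/<YYYYMMDD_HHMMSS>/ and return the copy's path."""
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    target_dir = Path(backup_root) / stamp
    target_dir.mkdir(parents=True, exist_ok=True)
    # copy2 also preserves file metadata such as modification time.
    return Path(shutil.copy2(source, target_dir))
```

Because the folder name sorts chronologically, the most recent backup is always the last entry in a sorted directory listing.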
Restore a backup:
cp backups/20260304_143015/bad_code.py target_repo/bad_code.py

To give RAG a codebase to learn patterns from, an LLM generates a diverse collection of Python files. Generate it with:
python -m utils.generate_codebase

Before refactoring, agents search the codebase for similar patterns, best practices, and implementations. This makes them context-aware and produces more consistent, higher-quality results.
With RAG over the generated Python code:
- Agents learn from existing code patterns in your repo
- Refactored code stays consistent with your codebase style
- Agents find similar implementations and best practices
- Refactoring decisions are context-aware
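
To make the retrieval step concrete, here is a toy sketch of "find the most similar snippets" using bag-of-words cosine similarity. It is a stand-in, not the project's implementation: a real RAG setup would typically use embeddings and a vector store, and the helper names `tokenize`, `cosine`, and `search` are hypothetical. The `n_results` parameter mirrors the config option of the same name.

```python
import math
import re
from collections import Counter


def tokenize(code: str) -> Counter:
    """Count lowercase identifier-like tokens in a code snippet."""
    return Counter(re.findall(r"[a-z_][a-z0-9_]*", code.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, snippets: list[str], n_results: int = 3) -> list[str]:
    """Return the n_results snippets most similar to the query, best first."""
    query_tokens = tokenize(query)
    ranked = sorted(snippets, key=lambda s: cosine(query_tokens, tokenize(s)), reverse=True)
    return ranked[:n_results]
```

An agent could pass the code it is about to refactor as `query` and include the returned snippets in its prompt as style references.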
Configuration:
rag:
  enable: true # Enable RAG
  index_on_startup: true # Auto-index codebase
  n_results: 3 # Snippets per search

The scanner detects issues such as:
- Silent exception swallowing (except Exception ... + pass)
- SQL injection risk patterns (execute(f"..."))
- Hardcoded credential-like assignments (api_key, token, password, etc.)
- Hardcoded bypass/allow lists
- Unsafe shutdown patterns (os._exit(...))
- Other auditability/data-integrity anti-patterns
Each finding contains:
- rule_id
- severity (critical, high, medium, low)
- category
- file and line
- evidence
- recommendation
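
As an illustration of how two of the checks above could work, the sketch below walks the Python AST to flag `except` blocks that contain only `pass` and `execute(...)` calls whose first argument is an f-string. The `Finding` fields follow the list above; the rule IDs, severities, categories, and the `scan_source` helper are assumptions made for this example, not the scanner's actual rules.

```python
import ast
from dataclasses import dataclass


@dataclass
class Finding:
    rule_id: str
    severity: str  # critical, high, medium, or low
    category: str
    file: str
    line: int
    evidence: str
    recommendation: str


def scan_source(source: str, filename: str) -> list[Finding]:
    """Return findings for two illustrative rules (rule IDs are hypothetical)."""
    lines = source.splitlines()
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        # Rule 1: an except block whose body is only `pass` discards the error.
        if isinstance(node, ast.ExceptHandler) and all(
            isinstance(stmt, ast.Pass) for stmt in node.body
        ):
            findings.append(Finding(
                "SILENT_EXCEPT", "high", "auditability", filename, node.lineno,
                lines[node.lineno - 1].strip(), "Log or re-raise the exception.",
            ))
        # Rule 2: execute(f"...") builds the query text by string interpolation.
        if isinstance(node, ast.Call) and node.args:
            name = getattr(node.func, "attr", getattr(node.func, "id", None))
            if name == "execute" and isinstance(node.args[0], ast.JoinedStr):
                findings.append(Finding(
                    "SQL_FSTRING", "critical", "security", filename, node.lineno,
                    lines[node.lineno - 1].strip(), "Use parameterized queries.",
                ))
    return findings
```

Working on the AST rather than on raw text avoids false positives from comments and string literals that merely mention these patterns.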
Set the severity threshold in configs/config.yaml:
compliance:
  fail_on_severity: "critical" # critical|high|medium|low or null to disable
  findings_file: "reports/compliance_findings.json"
  audit_log_file: "reports/compliance_audit_log.jsonl"

The framework also tracks refactoring impact beyond compliance findings, through these quality metrics:
- Cyclomatic Complexity (avg/max/total) where lower is better
- Maintainability Index (0-100) where higher is better
- Halstead metrics (difficulty, effort, estimated bugs) where lower is better
- Code size (LOC, LLOC, comments, blanks)
- Signal-to-noise score based on metric improvements vs lines changed
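
To show what the avg/max/total cyclomatic complexity figures mean, here is a rough standard-library approximation: each function starts at 1 and gains 1 per branching construct. This is a simplified sketch, not the framework's metric engine; the helper names are hypothetical, and dedicated tools count more constructs and handle nested functions more carefully.

```python
import ast

# Constructs that each add one decision point in this simplified count.
BRANCH_NODES = (ast.If, ast.For, ast.AsyncFor, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic(func: ast.AST) -> int:
    """Approximate cyclomatic complexity of one function node."""
    score = 1
    for node in ast.walk(func):
        if isinstance(node, BRANCH_NODES):
            score += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two extra paths.
            score += len(node.values) - 1
    return score


def complexity_summary(source: str) -> dict:
    """Return avg/max/total complexity over all functions in the source."""
    funcs = [
        node for node in ast.walk(ast.parse(source))
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]
    scores = [cyclomatic(func) for func in funcs]
    if not scores:
        return {"avg": 0.0, "max": 0, "total": 0}
    return {"avg": sum(scores) / len(scores), "max": max(scores), "total": sum(scores)}
```

Running `complexity_summary` on a file before and after refactoring gives the kind of before/after comparison the report describes, with lower values indicating simpler control flow.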
Enable or disable in config:
quality_metrics:
  enable: true

python main.py --file src/my_module.py
- Uses default config
- Creates backup
- Logs to console + file
- Generates HTML report
Edit config.yaml:
processing:
  files:
    - "src/module1.py"
    - "src/module2.py"
    - "utils/helpers.py"

Then run:
python main.py

python main.py --config config.conservative.yaml
- Only 1 iteration per agent
- Sequential process only
- Minimal changes

python main.py --aggressive
- Up to 3 iterations per agent
- Hierarchical process (manager-coordinated)
- Extensive refactoring
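
The two modes above can be pictured as presets selected by a command-line flag. The sketch below is one way such flags might be parsed with `argparse`; the `RunSettings` class, the `PRESETS` table, and the assumption that `--conservative` selects the same limits as the conservative config file are illustrative, not taken from `main.py`.

```python
import argparse
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunSettings:
    max_iterations: int  # iterations allowed per agent
    process: str  # "sequential" or "hierarchical"


# Values taken from the mode descriptions above; the mapping itself is hypothetical.
PRESETS = {
    "conservative": RunSettings(max_iterations=1, process="sequential"),
    "aggressive": RunSettings(max_iterations=3, process="hierarchical"),
}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--file", help="process this file instead of the configured list")
    parser.add_argument("--config", help="path to an alternative YAML config")
    # The two mode flags cannot be combined.
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--conservative", dest="mode", action="store_const", const="conservative")
    mode.add_argument("--aggressive", dest="mode", action="store_const", const="aggressive")
    return parser


def resolve_settings(argv: list[str]) -> tuple[argparse.Namespace, Optional[RunSettings]]:
    """Parse argv; return the parsed args and a preset, or None if no mode flag was given."""
    args = build_parser().parse_args(argv)
    return args, PRESETS.get(args.mode)
```

Returning `None` when no mode flag is present leaves room for the YAML config to supply the defaults.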