CodeBase Agent - Automated code refactoring

An intelligent 3-agent system powered by CrewAI with RAG (Retrieval-Augmented Generation) that automatically analyzes, refactors, and tests Python code using language models while learning from your existing codebase patterns.

Installation

Clone/Setup the project:

Create virtual environment:

```
python -m venv venv
.\venv\Scripts\Activate.ps1  # Windows (PowerShell)
source venv/bin/activate     # macOS/Linux
```

Install dependencies:
```
pip install -r requirements.txt
```

Set up API key:

```
# Add to .env file
GROQ_API_KEY=your_api_key_here
```
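As a rough illustration of what the key setup implies at startup, the sketch below parses `.env`-style text and fails early when `GROQ_API_KEY` is missing. How `main.py` actually loads the file is not shown in this README; `parse_env_text` and `require_api_key` are hypothetical helpers, not part of the project.

```python
import os


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require_api_key(env=None, name="GROQ_API_KEY") -> str:
    """Return the key, or raise with a clear message if it is missing."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return key
```

Failing at startup with a clear message is usually friendlier than letting the first LLM call fail with an authentication error.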

Key features

Configuration file (`config.yaml`)

Instead of hard-coding settings, change behavior with a YAML file:

```
# Adjust these without touching code:
processing:
  files:
    - "target_repo/bad_code.py"
    - "src/module.py"  # Add more files!
  refactoring_level: "aggressive"
  backup_originals: true
```
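One common way to consume a file like this is to overlay it on built-in defaults, so any key the user omits still has a value. The sketch below assumes that approach; `DEFAULTS` and `merge_config` are hypothetical, and the dictionary stands in for what a YAML loader would return for a file like the one above.

```python
from copy import deepcopy

# Hypothetical defaults; the project's real defaults are not shown in this README.
DEFAULTS = {
    "processing": {
        "files": [],
        "refactoring_level": "conservative",
        "backup_originals": True,
    }
}


def merge_config(defaults: dict, overrides: dict) -> dict:
    """Recursively overlay overrides on defaults without mutating either."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# A user config that sets two keys and leaves backup_originals to the default.
user_config = {
    "processing": {
        "files": ["target_repo/bad_code.py", "src/module.py"],
        "refactoring_level": "aggressive",
    }
}
config = merge_config(DEFAULTS, user_config)
```

Here `config["processing"]["backup_originals"]` is still `True` because the user config did not mention it.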

CLI arguments

Override config on the command line:

```
# Process a different file
python main.py --file my_file.py

# Use a different config
python main.py --config config.conservative.yaml

# Choose your approach (conservative or aggressive)
python main.py --conservative
python main.py --aggressive
```
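The flags above could be wired up with `argparse` roughly as follows. This is a sketch, not the project's actual `main.py`: the default config path, the mutual exclusion of `--conservative` and `--aggressive`, and the `apply_overrides` helper are all assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="CodeBase Agent (sketch)")
    parser.add_argument("--file", help="process this file instead of the configured list")
    # Assumed default path; the README does not state one.
    parser.add_argument("--config", default="config.yaml", help="path to a YAML config")
    # Assumption: the two approach flags cannot be combined.
    level = parser.add_mutually_exclusive_group()
    level.add_argument("--conservative", action="store_const", dest="level", const="conservative")
    level.add_argument("--aggressive", action="store_const", dest="level", const="aggressive")
    return parser


def apply_overrides(config: dict, args: argparse.Namespace) -> dict:
    """Return a copy of config with any CLI overrides applied."""
    processing = dict(config.get("processing", {}))
    if args.file:
        processing["files"] = [args.file]
    if args.level:
        processing["refactoring_level"] = args.level
    return {**config, "processing": processing}


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--file", "my_file.py", "--aggressive"])
```

Keeping the override step as a pure function makes the precedence (CLI over config file) easy to test in isolation.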

Logging system

Automatic logging to file + console:

```
# View logs after running
cat logs/codebase_agent.log

# Or on Windows
type logs\codebase_agent.log
```

Logs include:

- Backup confirmations
- Processing progress
- Rate limit notifications
- Dependency analysis summaries
- Benchmark results
- Error details
- Timestamps for all events
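Logging to a file and the console at once is typically done by attaching two handlers to one logger. The sketch below shows that pattern with the standard `logging` module; the logger name, format string, and `setup_logging` function are assumptions rather than the project's actual setup.

```python
import logging
from pathlib import Path


def setup_logging(log_file="logs/codebase_agent.log", level=logging.INFO) -> logging.Logger:
    """Attach one file handler and one console handler to a named logger."""
    logger = logging.getLogger("codebase_agent")  # hypothetical logger name
    logger.setLevel(level)
    if logger.handlers:  # avoid duplicate handlers on repeated calls
        return logger
    Path(log_file).parent.mkdir(parents=True, exist_ok=True)
    # %(asctime)s puts a timestamp on every event.
    formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    for handler in (logging.FileHandler(log_file, encoding="utf-8"), logging.StreamHandler()):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```

A call such as `setup_logging().info("Backup created")` would then write the same timestamped line to both destinations.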

HTML reports

After each run, check:

```
open reports/refactoring_report.html      # macOS
xdg-open reports/refactoring_report.html  # Linux
start reports\refactoring_report.html     # Windows
```

Reports show:

- Total files processed and processing time
- Success/failure count
- Per-file analysis results
- Backup locations
- Detailed status for each agent
- Before/after comparison of quality metrics (Cyclomatic Complexity, Maintainability Index, Halstead, code size, signal-to-noise)
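To make the report contents concrete, here is a deliberately small sketch that renders the file count, timing, and per-file status as HTML. The `render_report` function and the `file`/`status` keys are hypothetical; the project's real report is richer and its generator is not shown here.

```python
from html import escape


def render_report(results: list[dict], seconds: float) -> str:
    """Build a minimal HTML summary from per-file result dicts."""
    ok = sum(1 for r in results if r["status"] == "success")
    rows = "\n".join(
        f"<tr><td>{escape(r['file'])}</td><td>{escape(r['status'])}</td></tr>"
        for r in results
    )
    return (
        "<html><body><h1>Refactoring report</h1>"
        f"<p>{len(results)} files in {seconds:.1f}s, {ok} succeeded, {len(results) - ok} failed</p>"
        f"<table>\n{rows}\n</table></body></html>"
    )
```

Escaping file names with `html.escape` keeps unusual paths from breaking the markup.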

Auto backups

Originals are backed up to timestamped folders (YYYYMMDD_HHMMSS):

```
backups/
├── 20260304_143015/
│   └── bad_code.py
├── 20260304_150345/
│   └── bad_code.py
└── 20260304_152812/
    └── bad_code.py
```

Restore a backup:

```
cp backups/20260304_143015/bad_code.py target_repo/bad_code.py
```
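The timestamped layout above can be produced with a few lines of standard-library code. This is a sketch under the assumption that each run copies files flat into one `YYYYMMDD_HHMMSS` folder; `backup_file` is a hypothetical helper, not the project's function.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_file(source, backup_root="backups", now=None) -> Path:
    """Copy source into backup_root/<YYYYMMDD_HHMMSS>/ and return the copy's path."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    target_dir = Path(backup_root) / stamp
    target_dir.mkdir(parents=True, exist_ok=True)
    # copy2 also preserves file metadata such as modification time.
    return Path(shutil.copy2(source, target_dir / Path(source).name))
```

With a flat layout like this, two files that share a name but live in different directories would overwrite each other inside the same timestamp folder, so a real implementation may need to preserve relative paths.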

RAG Integration

RAG needs a codebase to learn patterns from. To have an LLM generate a diverse collection of Python files for that purpose, run:

```
python -m utils.generate_codebase
```

Agents search the codebase for similar patterns, best practices, and implementations before refactoring. This makes them context-aware and produces more consistent, high-quality results.

With the codebase indexed, RAG lets agents:

- Learn from existing code patterns in your repo
- Maintain consistency with your codebase style
- Find similar implementations and best practices
- Make context-aware refactoring decisions

Configuration:

```
rag:
  enable: true                # Enable RAG
  index_on_startup: true      # Auto-index codebase
  n_results: 3                # Snippets per search
```
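To illustrate what "snippets per search" means, the toy sketch below ranks stored snippets by bag-of-words cosine similarity and returns the top `n_results`. It is a stand-in for the idea only: the project's actual retrieval (embeddings, vector store) is not described in this README, and `tokenize`, `cosine`, and `search` are hypothetical names.

```python
import math
import re
from collections import Counter


def tokenize(code: str) -> Counter:
    """Count lowercased identifier-like tokens."""
    return Counter(re.findall(r"[A-Za-z_]\w*", code.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(query: str, snippets: dict[str, str], n_results: int = 3) -> list[str]:
    """Return the names of the n_results snippets most similar to query."""
    q = tokenize(query)
    ranked = sorted(snippets, key=lambda name: cosine(q, tokenize(snippets[name])), reverse=True)
    return ranked[:n_results]


snippets = {
    "db.py": "def fetch_user(conn, user_id): return conn.execute(query, (user_id,))",
    "io.py": "def read_lines(path): return open(path).read().splitlines()",
    "math.py": "def mean(values): return sum(values) / len(values)",
}
top = search("def load_user(conn, user_id): conn.execute(sql)", snippets, n_results=1)
```

A larger `n_results` gives the agents more context per search at the cost of a longer prompt.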

Compliance

Deterministic rule engine

The scanner detects issues such as:

- Silent exception swallowing (`except Exception ...` + `pass`)
- SQL injection risk patterns (`execute(f"...")`)
- Hardcoded credential-like assignments (`api_key`, `token`, `password`, etc.)
- Hardcoded bypass/allow lists
- Unsafe shutdown patterns (`os._exit(...)`)
- Other auditability/data-integrity anti-patterns

Each finding contains:

- `rule_id`
- `severity` (critical, high, medium, low)
- `category`
- `file` and `line`
- `evidence`
- `recommendation`
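A deterministic scanner of this kind can be approximated with a fixed table of patterns applied line by line. The sketch below covers three of the listed issue types and emits findings with the fields named above. The rule IDs, severities, categories, and regular expressions are illustrative assumptions, not the project's actual rules, and multi-line patterns such as silent exception swallowing would need an AST-based check instead.

```python
import re
from dataclasses import dataclass


@dataclass
class Finding:
    rule_id: str
    severity: str
    category: str
    file: str
    line: int
    evidence: str
    recommendation: str


# (rule_id, severity, category, pattern, recommendation); all values are illustrative.
RULES = [
    ("SEC001", "critical", "security", re.compile(r"\.execute\(\s*f[\"']"),
     "Use parameterized queries."),
    ("SEC002", "high", "security",
     re.compile(r"\b(api_key|token|password)\s*=\s*[\"'][^\"']+[\"']", re.I),
     "Load secrets from the environment."),
    ("OPS001", "medium", "reliability", re.compile(r"\bos\._exit\("),
     "Use sys.exit() so cleanup handlers run."),
]


def scan(source: str, file: str = "<memory>") -> list[Finding]:
    """Apply every rule to every line; the same input always yields the same findings."""
    findings = []
    for number, text in enumerate(source.splitlines(), start=1):
        for rule_id, severity, category, pattern, advice in RULES:
            if pattern.search(text):
                findings.append(
                    Finding(rule_id, severity, category, file, number, text.strip(), advice)
                )
    return findings
```

Because no model call is involved, results are reproducible across runs, which is what makes them usable in an audit log.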

Quality gate

Set the severity threshold in `configs/config.yaml`:

```
compliance:
  fail_on_severity: "critical"  # critical|high|medium|low or null to disable
  findings_file: "reports/compliance_findings.json"
  audit_log_file: "reports/compliance_audit_log.jsonl"
```
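The gate decision can be sketched as a comparison against an ordered severity list. This assumes the gate fails when any finding is at or above `fail_on_severity`, which is the usual reading of such a threshold but is not spelled out here; `gate_fails` is a hypothetical function.

```python
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


def gate_fails(finding_severities, fail_on_severity) -> bool:
    """True if any finding is at or above the threshold; None disables the gate."""
    if fail_on_severity is None:
        return False
    threshold = SEVERITY_ORDER.index(fail_on_severity)
    return any(SEVERITY_ORDER.index(s) >= threshold for s in finding_severities)
```

Under this reading, `gate_fails(["low", "high"], "critical")` is `False`, while lowering the threshold to `"high"` makes the same findings fail the gate.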

Code quality metrics evaluation

Beyond compliance findings, the framework tracks the impact of each refactoring with these metrics:

- Cyclomatic Complexity (avg/max/total): lower is better
- Maintainability Index (0-100): higher is better
- Halstead metrics (difficulty, effort, estimated bugs): lower is better
- Code size (LOC, LLOC, comments, blanks)
- Signal-to-noise score: metric improvements relative to lines changed

Enable or disable in config:

```
quality_metrics:
  enable: true
```
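As an illustration of the first metric, cyclomatic complexity can be approximated by counting decision points in each function's syntax tree. The sketch below is a simplified count using the standard `ast` module. It is not the project's implementation, which is not shown here, and `cyclomatic_complexity` and `summarize` are hypothetical names.

```python
import ast

# Node types that add a decision point in this simplified count.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp, ast.comprehension)


def cyclomatic_complexity(func: ast.AST) -> int:
    """1 + number of decision points inside the function."""
    score = 1
    for node in ast.walk(func):
        if isinstance(node, BRANCH_NODES):
            score += 1
        elif isinstance(node, ast.BoolOp):
            score += len(node.values) - 1  # each extra and/or operand is a branch
    return score


def summarize(source: str) -> dict:
    """Return avg/max/total complexity over all functions in source."""
    funcs = [n for n in ast.walk(ast.parse(source))
             if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))]
    scores = [cyclomatic_complexity(f) for f in funcs]
    if not scores:
        return {"avg": 0.0, "max": 0, "total": 0}
    return {"avg": sum(scores) / len(scores), "max": max(scores), "total": sum(scores)}
```

Running `summarize` on a file before and after refactoring gives the kind of before/after comparison described above. Note that this simple version counts a nested function's branches in its enclosing function as well.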

Common use cases

Use case 1: Single file

```
python main.py --file src/my_module.py
```

- Uses the default config
- Creates a backup
- Logs to console and file
- Generates an HTML report

Use case 2: Batch process multiple files

Edit `config.yaml`:

```
processing:
  files:
    - "src/module1.py"
    - "src/module2.py"
    - "utils/helpers.py"
```

Then run:

```
python main.py
```

Use case 3: Conservative approach for production code

```
python main.py --config config.conservative.yaml
```

- Only 1 iteration per agent
- Sequential process only
- Minimal changes

Use case 4: Aggressive refactoring for internal code

```
python main.py --aggressive
```

- Up to 3 iterations per agent
- Hierarchical process (manager-coordinated)
- Extensive refactoring
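The difference between the conservative and aggressive approaches comes down to a small set of settings, which could be modeled as named profiles. In the sketch below, the iteration counts and process types come from the two use cases above; the `Profile` class, its field names, and `profile_for` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    max_iterations: int  # iterations allowed per agent
    process: str         # "sequential" or "hierarchical"


PROFILES = {
    "conservative": Profile(max_iterations=1, process="sequential"),
    "aggressive": Profile(max_iterations=3, process="hierarchical"),
}


def profile_for(level: str) -> Profile:
    """Look up the settings for a refactoring level, rejecting unknown names."""
    try:
        return PROFILES[level]
    except KeyError:
        raise ValueError(f"unknown refactoring level: {level!r}") from None
```

Rejecting unknown levels explicitly avoids silently falling back to a profile the user did not ask for.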
