PatchPro is a CI code-repair assistant that analyzes code with Ruff and Semgrep, then generates intelligent patch suggestions with an LLM, applying professional finding normalization and deduplication.
For Collaborators: See DEVELOPMENT.md for complete setup and testing instructions.
For End Users: Try the demo repository to see PatchPro in action.
```bash
# Quick test with demo repo
git clone <demo-repo-url>
cd patchpro-demo-repo
echo "OPENAI_API_KEY=your-key" > .env
uv run --with /path/to/patchpro-bot-agent-dev python -m patchpro_bot.run_ci
```
PatchPro Bot is a comprehensive code analysis and patch generation tool that:
- Reads JSON analysis reports from Ruff (Python linter) and Semgrep (static analysis)
- Processes findings with deduplication, prioritization, and aggregation
- Generates intelligent code fixes using OpenAI's LLM
- Creates unified diff patches that can be applied to fix the issues
- Reports comprehensive analysis results and patch summaries
The codebase follows the pipeline described in this mermaid diagram:
```mermaid
flowchart TD
    A[patchpro-demo-repo PR] --> B[GitHub Actions CI]
    B --> C{Analyzers}
    subgraph Analysis
        direction LR
        C1[Ruff: lint issues to JSON]
        C2[Semgrep: patterns to JSON]
    end
    C --> C1
    C --> C2
    C1 & C2 --> D[Artifact storage: artifact/analysis/*.json]
    D --> E[Agent Core]
    E --> F[LLM: OpenAI call + prompt toolkit]
    F --> G[Unified diff + rationale: patch_*.diff]
    G & D --> H[Report generator: report.md]
    H --> I[Sticky PR comment]
    I --> J[Eval/QA judge & metrics: artifact/run_metrics.json]
```
```
src/patchpro_bot/
├── __init__.py            # Package exports
├── agent_core.py          # Main orchestrator
├── run_ci.py              # Legacy CI runner (delegates to agent_core)
├── analysis/              # Analysis reading and aggregation
│   ├── __init__.py
│   ├── reader.py          # JSON file reader for Ruff/Semgrep
│   └── aggregator.py      # Finding aggregation and processing
├── models/                # Pydantic data models
│   ├── __init__.py
│   ├── common.py          # Shared models and enums
│   ├── ruff.py            # Ruff-specific models
│   └── semgrep.py         # Semgrep-specific models
├── llm/                   # LLM integration
│   ├── __init__.py
│   ├── client.py          # OpenAI client wrapper
│   ├── prompts.py         # Prompt templates and builders
│   └── response_parser.py # Parse LLM responses
└── diff/                  # Diff generation and patch writing
    ├── __init__.py
    ├── file_reader.py     # Source file reading
    ├── generator.py       # Unified diff generation
    └── patch_writer.py    # Patch file writing
```
1. Clone the repository:

   ```bash
   git clone <repository-url>
   cd patchpro-bot
   ```

2. Create and activate a virtual environment:

   ```bash
   python -m venv .venv
   source .venv/bin/activate  # On Windows: .venv\Scripts\activate
   ```

3. Install the package:

   ```bash
   pip install -e .
   ```

4. Install development dependencies (optional):

   ```bash
   pip install -e ".[dev]"
   ```

5. Set up your OpenAI API key (see the `.env` sketch after these steps):

   ```bash
   export OPENAI_API_KEY="your-openai-api-key-here"
   ```

6. Prepare analysis files: create the `artifact/analysis/` directory and place your Ruff and Semgrep JSON output files there:

   ```bash
   mkdir -p artifact/analysis
   # Copy your ruff and semgrep JSON files to artifact/analysis/
   ```

7. Run the bot:

   ```bash
   python -m patchpro_bot.agent_core
   ```

   Or use the test pipeline:

   ```bash
   python test_pipeline.py
   ```
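The demo quick-start above writes the key into a `.env` file, and `python-dotenv` is listed as a dependency, so the key can likely also be kept in a `.env` file at the repository root instead of being exported each time. A minimal sketch (the value is a placeholder):

```bash
# .env (placeholder value; keep this file out of version control)
OPENAI_API_KEY=sk-your-openai-api-key
```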
Programmatic usage:

```python
from pathlib import Path

from patchpro_bot import AgentCore, AgentConfig

# Configure the agent
config = AgentConfig(
    analysis_dir=Path("artifact/analysis"),
    artifact_dir=Path("artifact"),
    openai_api_key="your-api-key",
    max_findings=20,
)

# Create and run the agent
agent = AgentCore(config)
results = agent.run()

print(f"Status: {results['status']}")
print(f"Generated {results['patches_written']} patches")
```
The project includes sample data for testing:
```bash
# Copy sample analysis files
cp tests/sample_data/*.json artifact/analysis/

# Copy sample source file
cp tests/sample_data/example.py src/

# Run the test pipeline
python test_pipeline.py
```
The `AgentConfig` class supports the following options:

| Parameter | Default | Description |
|---|---|---|
| `analysis_dir` | `artifact/analysis` | Directory containing JSON analysis files |
| `artifact_dir` | `artifact` | Output directory for patches and reports |
| `base_dir` | Current directory | Base directory for source files |
| `openai_api_key` | `None` | OpenAI API key (can also use `OPENAI_API_KEY` env var) |
| `llm_model` | `gpt-4o-mini` | OpenAI model to use |
| `max_tokens` | `4096` | Maximum tokens for LLM response |
| `temperature` | `0.1` | LLM temperature (0.0 = deterministic) |
| `max_findings` | `20` | Maximum findings to process |
| `max_files_per_batch` | `5` | Maximum files to process in one batch |
| `combine_patches` | `True` | Whether to create a combined patch file |
| `generate_summary` | `True` | Whether to generate patch summaries |
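Because `openai_api_key` falls back to the `OPENAI_API_KEY` environment variable, a configuration only needs to override the values you care about. A small sketch:

```python
from pathlib import Path

from patchpro_bot import AgentConfig, AgentCore

# Rely on OPENAI_API_KEY from the environment; override only the directories and findings cap.
config = AgentConfig(
    analysis_dir=Path("artifact/analysis"),
    max_findings=10,
)

results = AgentCore(config).run()
print(results["status"])
```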
Example Ruff input (JSON output from `ruff check --format=json`):

```json
[
  {
    "code": "F401",
    "filename": "src/example.py",
    "location": {"row": 1, "column": 8},
    "end_location": {"row": 1, "column": 11},
    "message": "`sys` imported but unused",
    "fix": {
      "applicability": "automatic",
      "edits": [{"content": "", "location": {"row": 1, "column": 1}}]
    }
  }
]
```
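The real Ruff models live in `src/patchpro_bot/models/ruff.py`; purely as an illustration of the shape above (class and field names here are hypothetical, not the package's actual API), a minimal Pydantic model could look like this:

```python
from typing import Optional

from pydantic import BaseModel


class Location(BaseModel):
    """Row/column position as emitted by Ruff."""
    row: int
    column: int


class RuffFindingSketch(BaseModel):
    """Illustrative model for one Ruff JSON finding; not the real PatchPro model."""
    code: str
    filename: str
    location: Location
    end_location: Location
    message: str
    fix: Optional[dict] = None


finding = RuffFindingSketch(
    code="F401",
    filename="src/example.py",
    location={"row": 1, "column": 8},
    end_location={"row": 1, "column": 11},
    message="`sys` imported but unused",
)
```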
Example Semgrep input (JSON output from `semgrep --json`):

```json
{
  "results": [
    {
      "check_id": "python.lang.security.hardcoded-password.hardcoded-password",
      "path": "src/auth.py",
      "start": {"line": 12, "col": 13},
      "end": {"line": 12, "col": 35},
      "extra": {
        "message": "Hardcoded password found",
        "severity": "ERROR",
        "metadata": {"category": "security", "confidence": "HIGH"}
      }
    }
  ]
}
```
The bot generates several output files in the `artifact/` directory:

- `patch_001.diff`, `patch_002.diff`, etc. - Individual patch files
- `combined_patch.diff` - Combined patch file (if enabled)
- `patch_summary.md` - Summary of all generated patches
- `report.md` - Comprehensive analysis report
Run the test suite:
```bash
# Run all tests
pytest

# Run with coverage
pytest --cov=src/patchpro_bot

# Run specific test modules
pytest tests/test_analysis.py
pytest tests/test_models.py
pytest tests/test_llm.py
pytest tests/test_diff.py
```
The project uses several tools for code quality:
```bash
# Format code
black src/ tests/

# Type checking
mypy src/

# Linting
ruff check src/ tests/
```
To add support for new analysis tools:

1. Create a new model in `src/patchpro_bot/models/` (a rough sketch follows this list)
2. Update the `AnalysisReader` to detect and parse the new format
3. Add tests for the new functionality
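As a sketch of the first step (the tool and all names below are hypothetical, not part of the current package), a model for a new analyzer's JSON findings might start like this:

```python
from pydantic import BaseModel


class NewToolFinding(BaseModel):
    """Hypothetical finding model for a new analyzer's JSON output."""
    rule_id: str       # the analyzer's check identifier
    path: str          # file the finding was reported in
    line: int
    column: int
    message: str
    severity: str = "warning"
```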
The LLM integration is modular and can be extended (see the sketch after this list):

1. Add new prompt templates in `prompts.py`
2. Extend response parsing in `response_parser.py`
3. Add support for different LLM providers in `client.py`
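For instance, a new prompt template added to `prompts.py` could be as small as the sketch below (the function name and wording are illustrative, not the existing API):

```python
def build_security_fix_prompt(finding_summary: str, source_snippet: str) -> str:
    """Hypothetical prompt builder for security-focused fixes."""
    return (
        "You are a code-repair assistant. Propose a minimal, targeted fix.\n\n"
        f"Finding:\n{finding_summary}\n\n"
        f"Source:\n{source_snippet}\n\n"
        "Respond with a unified diff and a short rationale."
    )
```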
Core dependencies:

- `pydantic` - Data validation and parsing
- `openai` - OpenAI API client
- `unidiff` - Unified diff processing
- `python-dotenv` - Environment variable management
- `typer` - CLI framework
- `rich` - Rich text and beautiful formatting
- `httpx` - HTTP client

Analysis tools:

- `ruff` - Python linter
- `semgrep` - Static analysis tool

Development dependencies:

- `pytest` - Testing framework
- `pytest-cov` - Coverage reporting
- `pytest-asyncio` - Async testing support
- `black` - Code formatting
- `mypy` - Type checking
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests for new functionality
- Run the test suite
- Submit a pull request
MIT License - see LICENSE file for details.

PatchPro Bot - An intelligent patch bot that analyzes static analysis reports from Ruff and Semgrep and generates unified diff patches using LLM-powered suggestions.

PatchPro Bot follows this pipeline:

```
Analysis JSON → Agent Core → LLM Suggestions → Unified Diff Generation → Patch Files
```
The bot reads JSON reports from static analysis tools (Ruff for Python linting, Semgrep for security/pattern analysis), sends the findings to an LLM for intelligent code suggestions, and generates properly formatted unified diff patches.
- **Analysis Module** (`src/patchpro_bot/analysis/`)
  - `AnalysisReader`: Reads and parses JSON files from `artifact/analysis/`
  - `FindingAggregator`: Processes, filters, and organizes findings for LLM consumption
- **LLM Module** (`src/patchpro_bot/llm/`)
  - `LLMClient`: OpenAI integration for generating code suggestions
  - `PromptBuilder`: Creates structured prompts for different fix scenarios
  - `ResponseParser`: Extracts code fixes and diffs from LLM responses
- **Diff Module** (`src/patchpro_bot/diff/`)
  - `DiffGenerator`: Creates unified diffs from code changes
  - `FileReader`: Reads source code files for diff generation
  - `PatchWriter`: Writes patch files to the artifact directory
- **Agent Core** (`src/patchpro_bot/agent_core.py`)
  - Orchestrates the entire pipeline from analysis to patch generation
  - Configurable processing limits and output options
- **Models** (`src/patchpro_bot/models/`)
  - Pydantic models for Ruff and Semgrep JSON schemas
  - Unified `AnalysisFinding` model for cross-tool compatibility
1. Clone the repository and install dependencies:

   ```bash
   cd patchpro-bot
   pip install -e .
   ```

2. Install optional development dependencies:

   ```bash
   pip install -e ".[dev]"
   ```

3. Set up your OpenAI API key:

   ```bash
   export OPENAI_API_KEY="your-api-key-here"
   ```

4. Prepare analysis data: place your Ruff and Semgrep JSON outputs in `artifact/analysis/`:

   ```bash
   mkdir -p artifact/analysis
   ruff check --format=json examples/src/ > artifact/analysis/ruff_output.json
   semgrep --config=auto --json examples/src/ > artifact/analysis/semgrep_output.json
   ```

5. Run the bot:

   ```bash
   python -m patchpro_bot.agent_core
   ```

6. Check the results:
   - Patch files: `artifact/patch_*.diff`
   - Report: `artifact/report.md`
```python
from pathlib import Path

from patchpro_bot import AgentCore, AgentConfig

# Configure the agent
config = AgentConfig(
    analysis_dir=Path("artifact/analysis"),
    openai_api_key="your-api-key",
    max_findings=10,
)

# Run the pipeline
agent = AgentCore(config)
results = agent.run()

print(f"Generated {results['patches_written']} patches")
```
```
patchpro-bot/
├── src/patchpro_bot/
│   ├── __init__.py            # Package exports
│   ├── agent_core.py          # Main orchestrator
│   ├── run_ci.py              # Legacy CI runner
│   ├── analysis/              # Analysis reading & processing
│   │   ├── reader.py          # JSON file reader
│   │   └── aggregator.py      # Finding aggregation
│   ├── llm/                   # LLM integration
│   │   ├── client.py          # OpenAI client
│   │   ├── prompts.py         # Prompt templates
│   │   └── response_parser.py # Response parsing
│   ├── diff/                  # Diff generation
│   │   ├── generator.py       # Unified diff creation
│   │   ├── file_reader.py     # Source file reading
│   │   └── patch_writer.py    # Patch file writing
│   └── models/                # Data models
│       ├── common.py          # Common types
│       ├── ruff.py            # Ruff JSON schema
│       └── semgrep.py         # Semgrep JSON schema
├── tests/                     # Comprehensive test suite
├── examples/                  # Sample data and usage
└── docs/                      # Documentation
```
- `OPENAI_API_KEY`: Your OpenAI API key (required)
- `LLM_MODEL`: Model to use (default: `gpt-4o-mini`)
- `MAX_FINDINGS`: Maximum findings to process (default: `20`)
- `PP_ARTIFACTS`: Artifact directory path (default: `artifact`)
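For example, a CI job could configure the bot entirely through the environment (placeholder values shown; the variables are the ones listed above):

```bash
export OPENAI_API_KEY="sk-your-key"   # required
export LLM_MODEL="gpt-4o-mini"        # optional (default shown)
export MAX_FINDINGS="20"              # optional cap on processed findings
export PP_ARTIFACTS="artifact"        # optional artifact directory
python -m patchpro_bot.agent_core
```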
```python
from pathlib import Path

from patchpro_bot import AgentConfig

config = AgentConfig(
    # Directories
    analysis_dir=Path("artifact/analysis"),
    artifact_dir=Path("artifact"),
    base_dir=Path.cwd(),
    # LLM settings
    openai_api_key="your-key",
    llm_model="gpt-4o-mini",
    max_tokens=4096,
    temperature=0.1,
    # Processing limits
    max_findings=20,
    max_files_per_batch=5,
    # Output settings
    combine_patches=True,
    generate_summary=True,
)
```
**Ruff**

- **Supported**: All Ruff rule categories (F, E, W, C, N, D, S, B, etc.)
- **Features**: Automatic fix extraction, severity inference, rule categorization
- **Format**: JSON output from `ruff check --format=json`

**Semgrep**

- **Supported**: All Semgrep rule types and severities
- **Features**: Security vulnerability detection, metadata extraction
- **Format**: JSON output from `semgrep --json`
Run the comprehensive test suite:
```bash
# Install test dependencies
pip install -e ".[dev]"

# Run all tests
pytest

# Run with coverage
pytest --cov=patchpro_bot

# Run specific test modules
pytest tests/test_analysis.py
pytest tests/test_llm.py
pytest tests/test_diff.py
```
Example of a generated patch:

```diff
diff --git a/src/example.py b/src/example.py
index 1234567..abcdefg 100644
--- a/src/example.py
+++ b/src/example.py
@@ -1,5 +1,4 @@
-import os
 import sys
 import subprocess
 
 def main():
```
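Generated patches are standard unified diffs, so they can be validated and applied with ordinary git tooling, for example:

```bash
# Dry-run first to confirm the patch applies cleanly, then apply it
git apply --check artifact/patch_001.diff
git apply artifact/patch_001.diff
```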
Example `report.md`:

```markdown
# PatchPro Bot Report

## Summary

- **Total findings**: 6
- **Tools used**: ruff, semgrep
- **Affected files**: 2
- **Patches generated**: 3

## Findings Breakdown

- **error**: 3
- **warning**: 2
- **high**: 1
```
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Add tests for your changes
- Ensure tests pass (`pytest`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
- Python 3.12+
- OpenAI API key
- Dependencies listed in `pyproject.toml`
- API keys are loaded from environment variables
- No sensitive data is logged
- Minimal, targeted code changes to reduce risk
- Security-first prioritization in fix suggestions
MIT License - see LICENSE file for details.
- Check the `examples/` directory for usage samples
- Report issues on GitHub
- Review the comprehensive test suite for API usage examples

PatchPro Bot - Intelligent code repair for modern CI/CD pipelines with professional finding normalization.