Autonomous multi-agent AI system for GitHub Pull Request code reviews
Powered by AMD Developer Cloud + ROCm + CrewAI
Paste a GitHub PR URL → Get a full expert code review in under 60 seconds.
Code Surgery Bot dispatches 6 specialized AI agents that work in parallel:
- 🔍 Logic Checker — Finds bugs, null pointers, off-by-one errors, missing error handling
- 🔒 Security Agent — Detects SQL injection, hardcoded credentials, XSS, insecure dependencies
- ✨ Style Agent — Enforces naming conventions, checks complexity, finds code duplication
- 📝 Documentation Agent — Audits docstrings and comments
- 🤖 Coordinator Agent — Orchestrates all agents and resolves conflicts
- 📊 Report Generator — Synthesizes findings into a human-readable review with code fixes
Each agent is powered by specialized open-source LLMs running on AMD MI300X GPU via ROCm.
Try it on Hugging Face Spaces →
(Demo available after deployment)
User submits GitHub PR URL
↓
PR Analyzer Agent (fetches files, diffs, metadata)
↓
Coordinator Agent (decides which agents to run)
↓
┌──────────────────────────────────┐
│ Runs in parallel: │
├──────────────────────────────────┤
│ • Logic Checker Agent │
│ • Security Agent │
│ • Style Agent │
│ • Documentation Agent │
└──────────────────────────────────┘
↓
Report Generator Agent (writes final review)
↓
Output: Scored review report with fixes
- Python 3.10+
- GitHub Personal Access Token
- AMD Developer Cloud account (for GPU)
- Or use HuggingFace API as fallback
# Clone the repository
git clone https://github.com/your-org/code-surgery-bot.git
cd code-surgery-bot
# Create virtual environment
python -m venv venv
# Activate it
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# Copy environment template
cp .env.example .env
# Edit .env with your credentials
nano .env # or use your editorAdd to .env:
GITHUB_TOKEN=your_github_personal_access_token
AMD_API_KEY=your_amd_developer_cloud_key
MODEL_ENDPOINT=http://your-amd-instance:8000/v1
HF_TOKEN=your_huggingface_tokenpython app/gradio_app.pyThen open http://localhost:7860 in your browser.
from src.github import PRFetcher
from src.agents import CoordinatorAgent
# Initialize
pr_fetcher = PRFetcher(github_token="your_token")
coordinator = CoordinatorAgent()
# Fetch and review
pr_url = "https://github.com/facebook/react/pull/28000"
pr_data = pr_fetcher.fetch(pr_url)
review_report = coordinator.review(pr_data)
print(review_report)code-surgery-bot/
│
├── src/
│ ├── agents/
│ │ ├── pr_analyzer.py ← Fetch PR data from GitHub
│ │ ├── coordinator.py ← Orchestrate agents
│ │ ├── logic_checker.py ← Find bugs
│ │ ├── security_agent.py ← Find vulnerabilities
│ │ ├── style_agent.py ← Check code quality
│ │ ├── report_generator.py ← Write final review
│ │ └── __init__.py
│ │
│ ├── github/
│ │ ├── pr_fetcher.py ← GitHub API integration
│ │ └── __init__.py
│ │
│ ├── models/
│ │ ├── llm_client.py ← LLM inference client
│ │ └── __init__.py
│ │
│ ├── utils/
│ │ ├── ast_parser.py ← Code parsing utilities
│ │ └── __init__.py
│ │
│ └── __init__.py
│
├── app/
│ ├── gradio_app.py ← Web UI demo
│ └── fastapi_backend.py ← REST API backend
│
├── tests/
│ ├── test_agents.py ← Agent unit tests
│ └── __init__.py
│
├── docs/
│ ├── architecture.md
│ ├── setup-amd-gpu.md
│ └── agent-prompts.md
│
├── examples/
│ ├── sample_review_output.md ← Example report
│ └── test_pr_urls.txt
│
├── requirements.txt
├── setup.py
├── LICENSE (MIT)
├── .gitignore
├── .env.example
└── README.md (this file)
| Component | Technology | Why |
|---|---|---|
| GPU Compute | AMD Instinct MI300X + ROCm | Fast code inference on open-source |
| Agent Framework | CrewAI | Best for multi-agent orchestration |
| LLMs | DeepSeek-Coder-33B, Llama-2-70B, Mistral-7B | Code-specialized, open-source |
| Inference | vLLM | Batched, fast inference on ROCm |
| GitHub Integration | PyGithub | Fetch PR data, post reviews |
| Code Parsing | tree-sitter, Python AST | Language-aware analysis |
| Backend API | FastAPI | REST endpoint for orchestration |
| Frontend Demo | Gradio | Clean UI for live demo |
| Deployment | Hugging Face Spaces | Free, public, scalable |
Input: GitHub PR URL
Process: Fetches changed files, unified diffs, commit messages using PyGithub API
Output: Structured JSON with { files, diffs, metadata, ast_nodes }
Input: Structured PR data from analyzer
Process: Decides which agents are relevant, dispatches tasks in parallel
Output: Task assignments and conflict resolution
Model: DeepSeek-Coder-33B
Detects:
- Null pointer / undefined variable risks
- Off-by-one errors in loops
- Missing error handling (try/catch gaps)
- Wrong return types
- Unreachable code blocks
- Infinite loop risks
- Race conditions in async code
- Edge case violations
Model: Llama-2-70B-Chat
Detects:
- SQL injection vulnerabilities
- Hardcoded credentials & API keys
- XSS vulnerabilities
- Insecure dependencies (CVE cross-check)
- SSRF & path traversal risks
- Authentication/authorization gaps
- Sensitive data in logs
- Deprecated crypto functions
Model: Mistral-7B-Instruct
Checks:
- Naming convention consistency
- Function complexity (>50 lines → split)
- Code duplication (DRY violations)
- Magic numbers
- Complex conditionals
- Import organization
- Commented-out code
- Type hints coverage
Model: Llama-2-70B-Chat
Process:
- Combines all agent outputs
- Calculates quality score (0-100)
- Groups issues by severity
- Generates corrected code snippets
- Renders GitHub-compatible Markdown
- Produces verdict: ✅ Approve /
⚠️ Request Changes / ❌ Reject
## 🔬 Code Surgery Bot Review — PR #47
**Overall Score: 72/100** ⚠️ Request Changes
### Issue Summary
| Severity | Count |
|----------|-------|
| Critical | 0 |
| High | 2 |
| Medium | 5 |
| Low | 3 |
### 🔒 Security Issues (2 High)
**auth.py line 34**: API key hardcoded in source❌ BEFORE: API_KEY = "sk_live_abc123xyz" users_db = connect(API_KEY)
✅ AFTER: API_KEY = os.getenv("API_KEY") if not API_KEY: raise EnvironmentError("API_KEY not set") users_db = connect(API_KEY)
**db.py line 89**: Raw SQL with f-string interpolation
❌ BEFORE: query = f"SELECT * FROM users WHERE id={user_id}" cursor.execute(query)
✅ AFTER: query = "SELECT * FROM users WHERE id = %s" cursor.execute(query, (user_id,))
### 🔍 Logic Issues (3 High)
[...]
### ✅ Verdict: REQUEST CHANGES
Fix 2 critical security issues before merge.
# Create Space on huggingface.co
# Choose "Gradio" SDK
# Clone Space
git clone https://huggingface.co/spaces/your-username/code-surgery-bot
cd code-surgery-bot
# Copy app
cp app/gradio_app.py ./app.py
cp requirements.txt .
# Push
git add . && git commit -m "Deploy Code Surgery Bot" && git pushVisit: https://huggingface.co/spaces/your-username/code-surgery-bot
# SSH into MI300X instance
ssh user@amd-instance-ip
# Install dependencies
pip install -r requirements.txt
# Start inference server
python -m vllm.entrypoints.openai.api_server \
--model deepseek-ai/deepseek-coder-33b-instruct \
--port 8000 \
--tensor-parallel-size 4
# In another terminal, start FastAPI backend
python app/fastapi_backend.py
# Gradio UI connects to http://amd-instance-ip:8000# Run all tests
pytest tests/
# Run specific test
pytest tests/test_agents.py::test_logic_checker
# With coverage
pytest --cov=src tests/We welcome contributions!
- Fork this repo
- Create a feature branch:
git checkout -b feature/my-feature - Commit changes:
git commit -m "Add my feature" - Push:
git push origin feature/my-feature - Open a Pull Request
- Add support for more languages (Go, Rust, Java)
- Improve security detection rules (OWASP Top 10)
- Add comprehensive test suite
- Performance optimization for large PRs
- Better diff visualization in UI
MIT License — Free to use, modify, and distribute.
See LICENSE for details.
AMD Developer Hackathon 2025 — AI Agents & Agentic Workflows Track
lablab.ai/ai-hackathons/amd-developer
Prize Goal: $2,500 (Track) + $5,000 (Grand Prize)
Deadline: May 11, 2025 — 12:30 AM IST
- 📧 Email: team@codesurgerybot.dev
- 💬 GitHub Issues: Open an issue
- 🐦 Twitter: @CodeSurgeryBot
Help us grow! Give this repo a ⭐ if you find it useful.
Built with ❤️ for the AI community