🐦‍🔥 PhoenixRepair: Rising from Multi-Location Sampling and Iterative Self-Reflection

PhoenixRepair is a multi-agent framework that systematically explores multiple candidate location sets and performs iterative reflection and refinement on patch generation, thereby expanding the search space of repair strategies for automated issue resolution.

🔥 Highlights

State-of-the-art Performance: Achieves 74.4% pass@1 on SWE-bench Verified with DeepSeek-V3.2
Superior Fault Localization: Achieves higher Acc@1 than SWE-agent at file, module, and function granularity
Model-Agnostic: Consistent performance gains across different LLMs (DeepSeek-V3.1, DeepSeek-V3.2, Qwen-Coder-Plus)
Generalizable Framework: Successfully transfers to other agent-based methods (Mini-SWE-agent, Live-SWE-agent)

📊 Performance

Pass@1 on SWE-bench Verified

Method	LLM	Resolved (%)
Mini-SWE-agent	DeepSeek-V3.2	61.0%
Live-SWE-agent	DeepSeek-V3.2	63.8%
SWE-Search	DeepSeek-V3.2	65.4%
Agentless	DeepSeek-V3.2	67.0%
Trae-agent	DeepSeek-V3.2	67.8%
SWE-agent	DeepSeek-V3.2	69.4%
Moatless-tools	Claude-4-Sonnet	70.8%
PhoenixRepair	DeepSeek-V3.2	74.4% (+5.0%)
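
To put the percentages in absolute terms, the following sketch converts the reported resolve rates into instance counts and computes PhoenixRepair's margin over any baseline. It assumes the full 500-instance SWE-bench Verified test split; the helper names are illustrative and not part of the repository.

```python
# Resolve rates (%) copied from the table above.
RESOLVED = {
    "Mini-SWE-agent": 61.0,
    "Live-SWE-agent": 63.8,
    "SWE-Search": 65.4,
    "Agentless": 67.0,
    "Trae-agent": 67.8,
    "SWE-agent": 69.4,
    "Moatless-tools": 70.8,
    "PhoenixRepair": 74.4,
}

TOTAL_INSTANCES = 500  # assumption: full SWE-bench Verified test split


def resolved_count(rate_percent: float, total: int = TOTAL_INSTANCES) -> int:
    """Convert a resolve rate in percent into a number of resolved instances."""
    return round(rate_percent * total / 100)


def margin_over(baseline: str, method: str = "PhoenixRepair") -> float:
    """Difference in resolve rate, in percentage points."""
    return round(RESOLVED[method] - RESOLVED[baseline], 1)


if __name__ == "__main__":
    for name, rate in RESOLVED.items():
        print(f"{name}: {resolved_count(rate)} of {TOTAL_INSTANCES} resolved")
    print("Margin over SWE-agent:", margin_over("SWE-agent"))
```

The margin over SWE-agent, the strongest baseline that uses the same LLM, matches the +5.0 shown in the table.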

Fault Localization Accuracy

Method	LLM	File Acc@1	Module Acc@1	Function Acc@1
SWE-agent	DeepSeek-V3.2	83.04%	74.22%	63.42%
PhoenixRepair	DeepSeek-V3.2	85.64% (+2.60%)	76.82% (+2.60%)	66.64% (+3.22%)

🏗️ Architecture

PhoenixRepair comprises three main phases:

1️⃣ Multi-Location Sets Sampling

Localization Sampling Agent: Iteratively samples N diverse location sets
Graph-based Localization (for difficult tasks): Provides cross-file dependency information
Deduplication: Removes duplicate location sets to obtain final candidates
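
The deduplication step can be sketched as follows. This is a minimal illustration, assuming each location is represented as a (file path, symbol name) pair and that two location sets are duplicates when they contain the same locations regardless of order. Both the representation and the function name are hypothetical, not taken from the PhoenixRepair code.

```python
from typing import Iterable

# Hypothetical representation: (file path, function or class name).
Location = tuple[str, str]


def deduplicate_location_sets(
    samples: Iterable[Iterable[Location]],
) -> list[frozenset[Location]]:
    """Keep the first occurrence of each distinct, non-empty location set."""
    seen: set[frozenset[Location]] = set()
    unique: list[frozenset[Location]] = []
    for sample in samples:
        key = frozenset(sample)  # order-insensitive identity of the set
        if key and key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


if __name__ == "__main__":
    sampled = [
        [("a.py", "parse"), ("b.py", "render")],
        [("b.py", "render"), ("a.py", "parse")],  # same set, different order
        [("c.py", "load")],
    ]
    print(len(deduplicate_location_sets(sampled)))
```

Using `frozenset` makes the comparison independent of the order in which the agent reported the locations, while the list preserves the sampling order of the surviving candidates.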

2️⃣ Iterative Reflection & Refinement

Coder Agent: Generates patches constrained to specific location sets
Analysis & Test Agent: Evaluates patch quality through:
- Test quality assessment
- Regression test pass rate
Selector Agent: Selects top-performing patches
Analysis Agent: Distills guidance from historical attempts
Iterative Refinement: Continues until converging to final location set
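
One way to picture the loop above is the sketch below. It is not the PhoenixRepair implementation: the weighted score, the halving of candidates per round, and all names are assumptions chosen only to show how patch evaluation, selection, and history could fit together. The `generate` callback stands in for the Coder Agent plus the Analysis & Test Agent.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class PatchAttempt:
    location_set: frozenset
    patch: str
    test_quality: float          # assumed to be normalized to [0, 1]
    regression_pass_rate: float  # assumed to be normalized to [0, 1]


def score(attempt: PatchAttempt, quality_weight: float = 0.5) -> float:
    """Hypothetical ranking score combining the two evaluation signals."""
    return (quality_weight * attempt.test_quality
            + (1 - quality_weight) * attempt.regression_pass_rate)


def select_top(attempts: list[PatchAttempt], k: int) -> list[PatchAttempt]:
    """Selector step: keep the k best-scoring attempts."""
    return sorted(attempts, key=score, reverse=True)[:k]


def refine(
    initial_sets: list[frozenset],
    generate: Callable[[frozenset, list[PatchAttempt]], PatchAttempt],
    max_rounds: int = 5,
) -> tuple[frozenset, list[PatchAttempt]]:
    """Narrow the candidates round by round until one location set remains."""
    if not initial_sets:
        raise ValueError("at least one candidate location set is required")
    candidates = list(initial_sets)
    history: list[PatchAttempt] = []
    rounds = 0
    while len(candidates) > 1 and rounds < max_rounds:
        # Each attempt can see the history of earlier attempts as guidance.
        attempts = [generate(loc, history) for loc in candidates]
        history.extend(attempts)
        keep = max(1, len(candidates) // 2)  # assumption: halve each round
        candidates = [a.location_set for a in select_top(attempts, keep)]
        rounds += 1
    return candidates[0], history
```

The returned history is what a later analysis step could draw on when distilling guidance for the final patch.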

3️⃣ Final Round Generation

Analysis Agent: Distills insights from all historical attempts
Final Patch: Generated under the guidance of this distilled knowledge
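
In PhoenixRepair the distillation is performed by an LLM agent. The sketch below is only a deterministic stand-in that shows the kind of structured summary of past attempts such an agent could be given or could produce. The record fields (`locations`, `regression_pass_rate`, `note`) and the function name are hypothetical.

```python
def distill_guidance(history: list[dict], top_n: int = 3) -> str:
    """Summarize the best previous attempts as short guidance lines.

    Each history entry is assumed to look like:
    {"locations": ["file.py::func"], "regression_pass_rate": 0.9, "note": "..."}
    """
    ranked = sorted(
        history, key=lambda a: a["regression_pass_rate"], reverse=True
    )
    lines = ["Guidance distilled from previous attempts:"]
    for rank, attempt in enumerate(ranked[:top_n], start=1):
        locations = ", ".join(attempt["locations"])
        rate = attempt["regression_pass_rate"]
        lines.append(
            f"{rank}. [{rate:.0%} regression pass] {locations}: {attempt['note']}"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    past = [
        {"locations": ["a.py::parse"], "regression_pass_rate": 0.5,
         "note": "fix was incomplete"},
        {"locations": ["b.py::render"], "regression_pass_rate": 1.0,
         "note": "no regressions"},
    ]
    print(distill_guidance(past))
```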

🚀 Quick Start

Installation

Run the following at the repository root:

conda create --name phoenix python=3.11
conda activate phoenix
python -m pip install --upgrade pip && pip install --editable .

Configuration

# Set up your API keys
export OPENAI_API_KEY="your-api-key"
export DEEPSEEK_API_KEY="your-deepseek-key"

Running PhoenixRepair

Phase 1: Multi-Location Sets Sampling

We provide a tool at PhoenixRepair/tools/analysis_complete/ to verify whether each task has completed successfully.

# Fill in the empty "" values before running:
#   --agent.model.name        model name, for example "deepseek-chat"
#   --agent.model.api_base    API URL, for example "https://api.deepseek.com/v1"
#   --specific_instance_ids   instance IDs to run
#   --num_samples             number of sequential samples per task
#   --num_workers             number of tasks executed in parallel
#   --comparison_model_name   model name, for example "deepseek-chat"
#   --comparison_api_key      API key
#   --comparison_api_base     API URL, for example "https://api.deepseek.com/v1"
sweagent run-locate \
    --config config/locate.yaml \
    --agent.model.name "" \
    --agent.model.api_base "" \
    --agent.model.per_instance_cost_limit 3.00 \
    --instances.type swe_bench \
    --instances.subset verified \
    --instances.split test \
    --enable_multi_sampling=True \
    --specific_instance_ids="" \
    --num_samples=5 \
    --num_workers=25 \
    --enable_best_sample_selection=False \
    --comparison_model_name="" \
    --comparison_api_key="" \
    --comparison_api_base=""
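
When launching Phase 1 from a script, one option is to assemble the argument list in Python so that each value can be set and checked separately. The sketch below only reuses the flags shown in the command above. The function name is hypothetical, and the instance ID in the usage example is a placeholder value whose expected format is not documented here.

```python
import os
import shlex


def build_locate_command(
    model_name: str,
    api_base: str,
    instance_ids: str,
    comparison_model_name: str,
    comparison_api_key: str,
    comparison_api_base: str,
    num_samples: int = 5,
    num_workers: int = 25,
) -> list[str]:
    """Assemble the Phase 1 argument list from the flags in the command above."""
    return [
        "sweagent", "run-locate",
        "--config", "config/locate.yaml",
        "--agent.model.name", model_name,
        "--agent.model.api_base", api_base,
        "--agent.model.per_instance_cost_limit", "3.00",
        "--instances.type", "swe_bench",
        "--instances.subset", "verified",
        "--instances.split", "test",
        "--enable_multi_sampling=True",
        f"--specific_instance_ids={instance_ids}",
        f"--num_samples={num_samples}",
        f"--num_workers={num_workers}",
        "--enable_best_sample_selection=False",
        f"--comparison_model_name={comparison_model_name}",
        f"--comparison_api_key={comparison_api_key}",
        f"--comparison_api_base={comparison_api_base}",
    ]


if __name__ == "__main__":
    command = build_locate_command(
        model_name="deepseek-chat",
        api_base="https://api.deepseek.com/v1",
        instance_ids="django__django-11099",  # placeholder example value
        comparison_model_name="deepseek-chat",
        comparison_api_key=os.environ.get("DEEPSEEK_API_KEY", ""),
        comparison_api_base="https://api.deepseek.com/v1",
    )
    # Print the shell-quoted command for inspection.
    # To launch it, pass the list to subprocess.run(command, check=True).
    print(shlex.join(command))
```

Passing the arguments as a list avoids shell quoting issues entirely, since no shell parses the command.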

Phase 2: Iterative Reflection & Refinement

# Fill in the empty "" values before running:
#   --agent.model.name            model name, for example "deepseek-chat"
#   --agent.model.api_base        API URL, for example "https://api.deepseek.com/v1"
#   --specific_instance_ids       instance IDs to run
#   --comparison_model_name       model name, for example "deepseek-chat"
#   --comparison_api_key          API key
#   --comparison_api_base         API URL, for example "https://api.deepseek.com/v1"
#   --deduplicated_patches_root   absolute path obtained in Phase 1, for example "/home/PhoenixRepair/trajectories/root/locate__openai--GLM-4.7__t-0.70__p-1.00__c-0.00___swe_bench_verified_test"
sweagent run-batch \
    --config config/default.yaml \
    --agent.model.name "" \
    --agent.model.api_base "" \
    --agent.model.per_instance_cost_limit 3.00 \
    --instances.type swe_bench \
    --instances.subset verified \
    --instances.split test \
    --enable_multi_sampling=True \
    --specific_instance_ids="" \
    --num_samples=3 \
    --comparison_model_name="" \
    --comparison_api_key="" \
    --comparison_api_base="" \
    --deduplicated_patches_root ""

Phase 3: Evaluation

git clone https://github.com/SWE-bench/SWE-bench.git
python tackle_pred.py
# --results_dir: absolute path obtained in Phase 2, for example "/home/PhoenixRepair/trajectories/root/default__openai--GLM-4.7__t-0.70__p-1.00__c-0.00___swe_bench_verified_test"
python evaluation/run_evaluation.py \
    --results_dir ""
