60 commits
a8a1c8f
current version
xingdi-eric-yuan Jul 27, 2025
f8517f6
Update rag_agent.py
xingdi-eric-yuan Jul 27, 2025
9771228
Update rag_agent.py
xingdi-eric-yuan Jul 27, 2025
6b5de4c
Update rag_agent.py
xingdi-eric-yuan Jul 27, 2025
5bdcad5
Update rag_agent.py
xingdi-eric-yuan Jul 28, 2025
24c7fce
question prompt
xingdi-eric-yuan Jul 28, 2025
a0ccd75
we don't put messages with debug_gym_ignore in history
xingdi-eric-yuan Jul 28, 2025
9b42073
config
xingdi-eric-yuan Jul 28, 2025
6b2a3a0
Update requirements.txt
xingdi-eric-yuan Jul 28, 2025
5cd60f5
add test
xingdi-eric-yuan Jul 28, 2025
5e1e758
minor
xingdi-eric-yuan Jul 28, 2025
2d3e2ba
caching
xingdi-eric-yuan Jul 28, 2025
bbb6e4f
test cases for the caching
xingdi-eric-yuan Jul 28, 2025
abe8519
minor
xingdi-eric-yuan Jul 28, 2025
a950551
Update rag_agent.py
xingdi-eric-yuan Jul 28, 2025
15cc3fa
dedup
xingdi-eric-yuan Jul 28, 2025
96f9999
add back the user message at retrieval steps to log, if we want to re…
xingdi-eric-yuan Jul 28, 2025
1944840
make cash key interpretable
xingdi-eric-yuan Jul 29, 2025
d13506b
suppress_stdout_stderr so they don't break rich
xingdi-eric-yuan Jul 29, 2025
69e7a60
sentence encoder as a service
xingdi-eric-yuan Jul 29, 2025
008fa62
logger
xingdi-eric-yuan Jul 29, 2025
2dae27c
script to pre-generate rag cache
xingdi-eric-yuan Jul 29, 2025
5edff6b
Update generate_rag_cache.py
xingdi-eric-yuan Jul 29, 2025
0f3a6e0
more robust server request
xingdi-eric-yuan Jul 29, 2025
3300b69
Update rag_agent.py
xingdi-eric-yuan Jul 30, 2025
d006ee2
current version
xingdi-eric-yuan Jul 30, 2025
69d2230
Update encoding_service.py
xingdi-eric-yuan Jul 30, 2025
d518a7e
minor
xingdi-eric-yuan Jul 30, 2025
be25542
retrieval as a service
xingdi-eric-yuan Jul 30, 2025
153a5a7
Update utils.py
xingdi-eric-yuan Jul 30, 2025
db0848f
catch cases where encoder crash because of GPU OOM
xingdi-eric-yuan Jul 30, 2025
0596a31
minor
xingdi-eric-yuan Jul 30, 2025
87128c8
remove unnecessary yaml
xingdi-eric-yuan Jul 30, 2025
19e7565
remove unnecessary argument
xingdi-eric-yuan Jul 30, 2025
144004c
skip build index if running multiple workers
xingdi-eric-yuan Jul 30, 2025
c51e645
Update shared_cache.py
xingdi-eric-yuan Jul 30, 2025
04cfafc
fix Race Condition in Index Building
xingdi-eric-yuan Jul 30, 2025
699a625
Update retrieval_service.py
xingdi-eric-yuan Jul 30, 2025
f7fc7d6
Update run.py
xingdi-eric-yuan Jul 30, 2025
27cc864
Update run.py
xingdi-eric-yuan Jul 30, 2025
591ee99
Update run.py
xingdi-eric-yuan Jul 30, 2025
1d65973
Update retrieval_service.py
xingdi-eric-yuan Jul 30, 2025
99c89c1
minor
xingdi-eric-yuan Jul 31, 2025
b22751a
fix tests
xingdi-eric-yuan Jul 31, 2025
61d3707
Update test_retrieval_service.py
xingdi-eric-yuan Jul 31, 2025
2341c3c
Update test_retrieval_service.py
xingdi-eric-yuan Jul 31, 2025
0bafeb7
hang detection and auto restart
xingdi-eric-yuan Jul 31, 2025
b63ba37
make hang detection timeout configurable
xingdi-eric-yuan Jul 31, 2025
2aee44b
minor
xingdi-eric-yuan Jul 31, 2025
ab962bf
improved build flow
xingdi-eric-yuan Jul 31, 2025
949b5c9
moving retrieval server outside
xingdi-eric-yuan Aug 1, 2025
9e3efd4
minor
xingdi-eric-yuan Aug 1, 2025
92f0b3e
fix import
xingdi-eric-yuan Aug 1, 2025
806e8f4
fix tests
xingdi-eric-yuan Aug 1, 2025
0cdd930
absolute path when building index
xingdi-eric-yuan Aug 1, 2025
1ad7a18
minor
xingdi-eric-yuan Aug 1, 2025
792d1be
simplify config
xingdi-eric-yuan Aug 1, 2025
67138d3
add rag agent into readme
xingdi-eric-yuan Aug 1, 2025
6c37408
Merge branch 'main' into external_retrieval_server
xingdi-eric-yuan Aug 6, 2025
5296701
Merge branch 'main' into external_retrieval_server
xingdi-eric-yuan Aug 12, 2025
60 changes: 60 additions & 0 deletions README.md
@@ -101,10 +101,70 @@ We provide the below LLM-based agents, they all have minimal design and serve th
| `debug_agent` | `pdb`, `rewrite`, `view`, `eval` | A minimal agent that dumps all available information into its prompt and queries the LLM to generate a command. |
| `rewrite_agent` | `rewrite`, `view`, `eval` | A `debug_agent` with the `pdb` tool disabled (the agent can only keep rewriting). |
| `debug_5_agent` | `pdb`, `rewrite`, `view`, `eval` | A `debug_agent` in which the `pdb` tool is enabled only after a certain number of rewrites. |
| `rag_agent` | `pdb`, `rewrite`, `view`, `eval` | A retrieval-augmented agent that uses similar debugging examples from past trajectories. **Requires separate retrieval service setup** - see [RAG Agent Setup](#rag-agent-setup) below. |
| `solution_agent` | `pdb`, `eval` | An oracle agent that applies a gold patch (only works with `swebench` and `swesmith` benchmarks for now). The agent checks that tests are failing before applying the patch, and passing after. It also checks that `pdb` tool can be used as expected. |

---

<details>
<summary><strong>RAG Agent Setup</strong> (Click to expand)</summary>

#### 2.2.1. RAG Agent Setup

The `rag_agent` requires a separate retrieval service to function. This service handles embedding generation, caching, and similarity search for retrieving relevant debugging examples.
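
As a rough illustration of the similarity-search step mentioned above, the sketch below ranks stored examples by cosine similarity to a query vector and keeps the top `k`. It is not the retrieval service's implementation: the function names, the toy 2-D vectors, and the example texts are all assumptions made for illustration, and a real sentence encoder would produce much longer embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query_vec, indexed, k=3):
    """Return the texts of the k stored examples most similar to the query.

    `indexed` is a list of (example_text, embedding) pairs.
    """
    scored = [(cosine_similarity(query_vec, emb), text) for text, emb in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Hypothetical toy index; texts and vectors are made up for this example.
index = [
    ("fix off-by-one in loop", [1.0, 0.0]),
    ("handle None input", [0.0, 1.0]),
    ("adjust loop bound", [0.9, 0.1]),
]
print(top_k([1.0, 0.05], index, k=2))
```

Here `k` plays the role that `rag_num_retrievals` appears to play in the agent config; that correspondence is an assumption.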

**Setup Instructions:**

1. **Install the retrieval service:**
```bash
git clone https://github.com/xingdi-eric-yuan/retriever_service
cd retriever_service
pip install -e .
```

2. **Configure the retrieval service:**
Edit `config.yaml` in the retriever service directory:
```yaml
# Model and processing settings (configured server-side)
sentence_encoder_model: "Qwen/Qwen3-Embedding-0.6B"
rag_cache_dir: ".rag_cache"
rag_use_cache: true
rag_indexing_batch_size: 1000

# Service settings
default_port: 8766
default_host: "localhost"
```

3. **Start the retrieval service:**
```bash
python quick_start.py
```
With the configuration above, the service listens on `http://localhost:8766`.

4. **Configure the RAG agent in debug-gym:**
In your debug-gym config file (e.g., `scripts/config_swesmith.yaml`):
```yaml
rag_agent:
  # Retrieval service connection
  rag_retrieval_service_host: "localhost"
  rag_retrieval_service_port: 8766
  rag_retrieval_service_timeout: 300

  # Retrieval settings
  rag_num_retrievals: 3
  rag_indexing_method: "tool_call_with_reasoning-3"
```
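
To make the connection settings concrete, here is a minimal sketch of how a client might validate the `rag_agent` keys and derive the service URL from them. The `RetrievalClientSettings` class is hypothetical and not part of debug-gym; the fallback values simply mirror the example config and are an assumption, as is the unit of the timeout (assumed to be seconds).

```python
from dataclasses import dataclass


@dataclass
class RetrievalClientSettings:
    """Hypothetical container for the `rag_agent` connection keys."""

    host: str
    port: int
    timeout: int  # assumed to be seconds
    num_retrievals: int

    @classmethod
    def from_config(cls, agent_config):
        """Build settings from the parsed `rag_agent` section (a plain dict)."""
        settings = cls(
            host=str(agent_config.get("rag_retrieval_service_host", "localhost")),
            port=int(agent_config.get("rag_retrieval_service_port", 8766)),
            timeout=int(agent_config.get("rag_retrieval_service_timeout", 300)),
            num_retrievals=int(agent_config.get("rag_num_retrievals", 3)),
        )
        if not 1 <= settings.port <= 65535:
            raise ValueError(f"invalid port: {settings.port}")
        if settings.timeout <= 0:
            raise ValueError("timeout must be positive")
        if settings.num_retrievals < 1:
            raise ValueError("rag_num_retrievals must be at least 1")
        return settings

    @property
    def base_url(self):
        """URL the client would connect to, e.g. http://localhost:8766."""
        return f"http://{self.host}:{self.port}"


settings = RetrievalClientSettings.from_config(
    {"rag_retrieval_service_host": "localhost", "rag_retrieval_service_port": 8766}
)
print(settings.base_url)
```

Failing early on a bad port or timeout tends to give a clearer error than a connection failure in the middle of a run.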

**Important Notes:**
- The retrieval service must be running before using `rag_agent`
- Model configuration (sentence encoder, caching) is handled server-side in the retrieval service
- See the [retrieval service repository](https://github.com/xingdi-eric-yuan/retriever_service) for detailed documentation
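
Since the service must be up before the agent runs, a quick pre-flight check can save a failed run. The sketch below only tests whether something accepts TCP connections on the configured host and port; it does not confirm that the listener is actually the retrieval service or that it is healthy. The function name `service_is_reachable` is hypothetical.

```python
import socket


def service_is_reachable(host="localhost", port=8766, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


if service_is_reachable():
    print("Something is listening on localhost:8766")
else:
    print("Nothing reachable on localhost:8766; start the retrieval service first")
```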

</details>

---

#### 2.3. Benchmarks

To demonstrate how to integrate `debug-gym` with coding tasks and repositories, we provide example code importing two widely used benchmarks, namely `aider` and `swebench`, and a small set of minimal buggy code snippets, namely `mini_nightmare`.
7 changes: 7 additions & 0 deletions debug_gym/agents/__init__.py
@@ -1,3 +1,10 @@
from debug_gym.agents.debug_agent import Debug_5_Agent, DebugAgent
from debug_gym.agents.rewrite_agent import RewriteAgent
from debug_gym.agents.solution_agent import AgentSolution

# Conditionally import RAGAgent only if retrieval service is available
try:
    from debug_gym.agents.rag_agent import RAGAgent
except ImportError:
    # RAGAgent is not available if retrieval service is not installed
    RAGAgent = None
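
Because `RAGAgent` can now be `None`, any code that instantiates it should probably check first. The following is a minimal sketch of one way to do that, assuming nothing about how debug-gym actually looks up agents: `optional_import` and `require` are hypothetical helpers, and `no_such_module_for_demo` is a deliberately missing module used to exercise the fallback path.

```python
import importlib


def optional_import(module_name, attr_name):
    """Return `module_name.attr_name`, or None if the module cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, attr_name)


def require(obj, name, hint):
    """Return `obj`, or raise a descriptive error if the optional import failed."""
    if obj is None:
        raise RuntimeError(f"{name} is unavailable: {hint}")
    return obj


# Demonstrate the failure path with a module that does not exist.
missing = optional_import("no_such_module_for_demo", "RAGAgent")
try:
    require(missing, "RAGAgent", "install the retrieval service first")
except RuntimeError as err:
    print(err)
```

Raising at the point of use gives a clearer message than the `TypeError` that calling `None` would otherwise produce.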