Commit e455395

🤖 fix: unblock nightly terminal bench import (#617)
## Summary

- point the terminal_bench package export at mux_agent so the renamed class loads again
- update the benchmark-terminal Makefile target to reference MuxAgent consistently
- refresh the README filenames so contributor docs match the new mux nomenclature

## Testing

- Not run (not requested)

_Generated with `mux`_

1 parent 2b23849 commit e455395

3 files changed: +7 -7 lines changed

Makefile

Lines changed: 1 addition & 1 deletion
@@ -362,7 +362,7 @@ benchmark-terminal: ## Run Terminal-Bench with the mux agent (use TB_DATASET/TB_
 export MUX_TIMEOUT_MS=$$((TB_TIMEOUT * 1000)); \
 uvx terminal-bench run \
 --dataset "$$TB_DATASET" \
---agent-import-path benchmarks.terminal_bench.mux_agent:CmuxAgent \
+--agent-import-path benchmarks.terminal_bench.mux_agent:MuxAgent \
 --global-agent-timeout-sec $$TB_TIMEOUT \
 $$CONCURRENCY_FLAG \
 $$LIVESTREAM_FLAG \
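
The `--agent-import-path` value uses a `module:Class` form, so a stale class name only fails once the path is resolved at run time. The sketch below shows one plausible way such a string could be resolved; it is not Terminal-Bench's actual loader. `resolve_import_path` and the stand-in module `demo_mux_agent` are hypothetical names introduced for illustration.

```python
import importlib
import sys
import types


def resolve_import_path(import_path: str):
    """Resolve a 'package.module:ClassName' string to the named attribute.

    Assumption: this mirrors the general 'module:attr' convention, not the
    exact logic used by terminal-bench.
    """
    module_name, sep, attr_name = import_path.partition(":")
    if not sep or not module_name or not attr_name:
        raise ValueError(f"expected 'module:Class', got {import_path!r}")
    module = importlib.import_module(module_name)
    try:
        return getattr(module, attr_name)
    except AttributeError as exc:
        # A renamed class surfaces here: the module imports, the name does not.
        raise ImportError(
            f"module {module_name!r} has no attribute {attr_name!r}"
        ) from exc


# Hypothetical stand-in module so the sketch runs without the real repository.
class MuxAgent:
    """Placeholder for the real agent class."""


_demo = types.ModuleType("demo_mux_agent")
_demo.MuxAgent = MuxAgent
sys.modules["demo_mux_agent"] = _demo
```

With this stand-in, `resolve_import_path("demo_mux_agent:MuxAgent")` returns the class, while the old name `demo_mux_agent:CmuxAgent` raises `ImportError`, which is the kind of failure the Makefile change avoids.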

benchmarks/terminal_bench/README.md

Lines changed: 4 additions & 4 deletions
@@ -101,8 +101,8 @@ Based on analysis of the Oct 30 nightly run (15-minute timeout):
 
 ## Files
 
-- `cmux_agent.py`: Main agent adapter implementing Terminal-Bench's agent interface
-- `cmux-run.sh`: Shell script that sets up environment and invokes cmux CLI
-- `cmux_payload.py`: Helper to package cmux app for containerized execution
-- `cmux_setup.sh.j2`: Jinja2 template for agent installation script
+- `mux_agent.py`: Main agent adapter implementing Terminal-Bench's agent interface
+- `mux-run.sh`: Shell script that sets up environment and invokes mux CLI
+- `mux_payload.py`: Helper to package the mux app for containerized execution
+- `mux_setup.sh.j2`: Jinja2 template for agent installation script
 - `sample_tasks.py`: Utility to randomly sample tasks from dataset

Lines changed: 2 additions & 2 deletions
@@ -1,3 +1,3 @@
-from .cmux_agent import CmuxAgent
+from .mux_agent import MuxAgent
 
-__all__ = ["CmuxAgent"]
+__all__ = ["MuxAgent"]
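
Since the commit notes that tests were not run, a small check like the following could catch a package export that drifts from its `__all__` list after a rename. This is a minimal sketch, not part of the commit; `missing_exports` and the stand-in package `demo_terminal_bench` are hypothetical names.

```python
import types
from typing import List


def missing_exports(module: types.ModuleType) -> List[str]:
    """Return names listed in module.__all__ that the module does not define."""
    exported = getattr(module, "__all__", [])
    return [name for name in exported if not hasattr(module, name)]


# Hypothetical stand-in for the package after the rename.
pkg = types.ModuleType("demo_terminal_bench")
pkg.MuxAgent = type("MuxAgent", (), {})
pkg.__all__ = ["MuxAgent"]
```

Against the real package, the same function could be called on the imported `benchmarks.terminal_bench` module in a smoke test, assuming the repository root is on `sys.path`.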
