Run mini-swe-agent on ProgramBench!
What's Changed
The main feature is compatibility with ProgramBench, a new and ultra-challenging software benchmark.
Fixes
- fix: add exist_ok=True to mkdir in BubblewrapEnvironment by @hobostay in #802
- Fix/cost limit zero by @klieret in #825
- fix: add wall-clock time limit to properly kill agents by @klieret in #832
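For context on the first fix, here is a minimal sketch of what `exist_ok=True` changes in `pathlib.Path.mkdir`. It does not reproduce the BubblewrapEnvironment code: the `ensure_dir` helper, the `parents=True` argument, and the directory name are assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def ensure_dir(path: Path) -> Path:
    """Create `path` if it is missing (hypothetical helper, not project code)."""
    # Without exist_ok=True, calling mkdir on an existing directory
    # raises FileExistsError. parents=True is an assumption of this sketch.
    path.mkdir(parents=True, exist_ok=True)
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp) / "sandbox"  # illustrative name only
        ensure_dir(workdir)
        ensure_dir(workdir)  # repeating the call does not raise
        print(workdir.is_dir())
```

With the flag set, environment setup can presumably be repeated against the same working directory without failing on the second run.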
Full Changelog: v2.2.8...v2.3.0