Skip to content

EnvCommons/BountyBench

Repository files navigation

BountyBench

OpenReward Environment

Description

BountyBench is a cybersecurity benchmark environment for evaluating AI agents on real-world bug bounties. Agents operate in containers with full codebase access and running target services, tasked with detecting vulnerabilities, writing exploits, or patching security flaws across 30+ real-world software systems.

Based on the BountyBench benchmark with 46 bug bounties across 31 systems and 3 task phases.

Capabilities

  • Vulnerability detection in real-world codebases
  • Exploit development for known CVEs
  • Security patch authoring that preserves existing functionality
  • Interaction with live services (web apps, APIs, databases)

Compute Requirements

Each task runs in a dedicated sandbox container with:

  • 1 vCPU, 2 GB RAM
  • Network access (for service-based systems)
  • Per-system Docker images with pre-configured environments

Tasks

138 tasks total: 46 bounties x 3 phases (detect, exploit, patch).

Each task targets a specific (system, bounty, phase) triple:

  • detect: Find an unknown vulnerability and write exploit.sh
  • exploit: Exploit a known vulnerability (CVE/CWE provided) and write exploit.sh
  • patch: Fix a known vulnerability without breaking existing functionality

Systems include: lunary, yaml, node, django, fastapi, gradio, mlflow, curl, gunicorn, and 20+ more.

Reward Structure

Binary rewards (0.0 or 1.0), verified by shell scripts:

  • Detect/Exploit: Agent's exploit.sh is run, then a hidden verify.sh checks if the vulnerability was demonstrated. Reward = 1.0 if verify.sh exits 0.
  • Patch: A reference exploit is run against the patched codebase, then verify.sh checks if the vulnerability is still present. Reward = 1.0 if the exploit is blocked AND invariant tests pass.

Data

  • Source: bountybench/bountytasks
  • Format: task_index.json generated by prepare_data.py
  • Per-system Docker images built by build_images.sh

Tools

Tool Description
bash Execute arbitrary bash commands in the sandbox
list_files List directory contents
read_file Read file contents
write_file Write content to a file
submit Submit work for automated grading

Time Horizon

Multi-turn. Agents typically need 10-50+ tool calls depending on the phase:

  • Detect: extensive code analysis + exploit development
  • Exploit: targeted exploit development
  • Patch: code analysis + targeted fix

Environment Difficulty

Varies by system and vulnerability type:

  • Severity ranges from 3.0 to 10.0 (CVSS)
  • CWE types include injection, auth bypass, SSRF, path traversal, and more

Safety

This environment involves real vulnerability exploitation techniques. All activities are sandboxed and isolated. The environment is designed for security research and AI capability evaluation only.

Citations

@article{zhang2025bountybench,
      title={BountyBench: Dollar Impact of AI Agent Attackers and Defenders on Real-World Cybersecurity Systems},
      author={Andy K. Zhang and Joey Ji and Celeste Menders and Riya Dulepet and Thomas Qin and Ron Y. Wang and Junrong Wu and Kyleen Liao and Jiliang Li and Jinghan Hu and Sara Hong and Nardos Demilew and Shivatmica Murgai and Jason Tran and Nishka Kacheria and Ethan Ho and Denis Liu and Lauren McLane and Olivia Bruvik and Dai-Rong Han and Seungwoo Kim and Akhil Vyas and Cuiyuanxiu Chen and Ryan Li and Weiran Xu and Jonathan Z. Ye and Prerit Choudhary and Siddharth M. Bhatia and Vikram Sivashankar and Yuxuan Bao and Dawn Song and Dan Boneh and Daniel E. Ho and Percy Liang},
      year={2025},
      eprint={2505.15216},
      archivePrefix={arXiv},
      primaryClass={cs.CR},
}

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors