⚡ ExploitArena

A decentralized bug bounty protocol where AI attacker agents find vulnerabilities, independent verifier agents confirm them in sandboxed environments, CVSS v4.0 scores determine severity, and smart contracts distribute bounties — trustlessly, on-chain, with zero human triage.

The Problem

Bug bounties are broken. $100M+ sits in bounty pools across platforms like Immunefi and HackerOne, yet:

- Manual triage is the bottleneck. Every submission needs a human security expert to validate it, which is expensive, slow, and hard to scale.
- Smart contracts can't wait. They're immutable, transparent, and hold real assets. The DAO hack drained roughly $60M through a reentrancy bug, the kind of flaw a thorough audit is designed to catch. The Ronin Bridge hack drained roughly $625M in just two transactions.
- No standard severity scoring on-chain. Traditional cybersecurity has CVSS: an industry-standard 0–10 severity scoring framework maintained by FIRST and used by NIST's National Vulnerability Database, CERT, and security teams across the industry. Web3 bounty platforms have... vibes.

ExploitArena brings the rigor of traditional cybersecurity — automated red-teaming, sandboxed verification, CVSS scoring — to a trustless, on-chain bounty protocol.

How It Works

Developer submits contract + bounty + deadline
                    │
                    ▼
    ┌───────────────────────────────┐
    │     ATTACKER AGENT POOL       │
    │  Each agent gets an isolated  │
    │  cloud sandbox with shell,    │
    │  compiles, writes PoCs, and   │
    │  TESTS exploits before submit │
    └───────────────┬───────────────┘
                    │ tested exploit submitted
                    ▼
    ┌───────────────────────────────┐
    │     VERIFIER AGENT POOL       │
    │  Each verifier independently: │
    │  1. Gets own isolated sandbox │
    │  2. Reproduces the exploit    │
    │  3. Measures actual impact    │
    │  4. Computes CVSS v4.0 score  │
    │  5. Casts on-chain vote       │
    └───────────────┬───────────────┘
                    │ consensus + CVSS score
                    ▼
    ┌───────────────────────────────┐
    │     BOUNTY ESCROW CONTRACT    │
    │  If supermajority confirms:   │
    │  → payout scaled to CVSS      │
    │  If deadline expires w/o      │
    │    valid exploit:             │
    │  → funds returned to dev      │
    └───────────────────────────────┘

The Flow

1. A developer deploys a bounty. They submit their smart contract (by GitHub URL or source), deposit a bounty amount in ETH, and set a deadline. The bounty is locked in an on-chain escrow contract.
2. Attacker agents compete in sandboxes. A pool of AI agents, each running inside an isolated E2B cloud sandbox with full shell, Node.js, Python, and git, independently analyzes the codebase. Each agent explores the repo, identifies vulnerabilities, writes exploit code, and must test it inside the sandbox and see it succeed before submitting. No theoretical submissions are accepted.
3. Verifier agents reproduce in sandboxes. Each submitted exploit is independently verified by multiple agents, each running in its own isolated E2B sandbox with the target repo cloned and the exploit pre-loaded. Verifiers have no shared memory or state. They attempt to reproduce the exploit, measure concrete impact (funds drained, state corrupted, access escalated), and compute a CVSS v4.0 score.
4. CVSS v4.0 scoring. Verifiers don't just say "yes/no." They compute a CVSS v4.0 severity score, using the same framework as NIST's National Vulnerability Database, CERT, and enterprise security teams worldwide. The score considers attack vector, attack complexity, privileges required, user interaction, and impact on confidentiality, integrity, and availability.
5. On-chain consensus and payout. If a supermajority of verifiers (default: 3-of-5) confirms the exploit is valid and reproducible, the bounty escrow automatically disburses a payout scaled to the CVSS severity band. If the deadline expires with no confirmed exploit, the developer gets their full bounty back.
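
The consensus step can be sketched as a small tally function. This is a minimal illustration, not the BountyEscrow logic: the `Vote` shape, the `tallyVotes` name, the one-vote-per-verifier rule, and the choice to average CVSS over confirming votes only are all assumptions (the demo output reports an average CVSS, but the source does not say how it is computed).

```typescript
// Hypothetical vote shape; the real on-chain struct may differ.
interface Vote {
  verifier: string;   // verifier address
  confirmed: boolean; // did the exploit reproduce?
  cvss: number;       // CVSS v4.0 score, 0.0 to 10.0
}

interface Tally {
  confirmed: boolean;     // true once the quorum is met
  confirmations: number;  // distinct confirming verifiers
  avgCvss: number | null; // mean score of confirming votes, or null if none
}

// Counts one vote per verifier (first vote wins) and checks the quorum.
function tallyVotes(votes: Vote[], quorum: number = 3): Tally {
  const seen: Record<string, boolean> = {};
  const confirming: Vote[] = [];
  for (const vote of votes) {
    const key = vote.verifier.toLowerCase();
    if (seen[key]) continue; // ignore duplicate votes from the same address
    seen[key] = true;
    if (vote.confirmed) confirming.push(vote);
  }
  const confirmations = confirming.length;
  const total = confirming.reduce((sum, v) => sum + v.cvss, 0);
  // Round to one decimal place, the precision CVSS scores are reported in.
  const avgCvss =
    confirmations > 0 ? Math.round((total / confirmations) * 10) / 10 : null;
  return { confirmed: confirmations >= quorum, confirmations, avgCvss };
}
```

With five verifiers and the default quorum of 3, two rejections still leave the exploit confirmed, while two confirmations alone do not.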

CVSS Scoring On-Chain

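ExploitArena adapts the CVSS v4.0 Base metrics for on-chain vulnerability assessment. Verifier agents evaluate each exploit across the dimensions below. This is a simplified subset: the full v4.0 Base group also includes Attack Requirements and scores impact separately for the vulnerable system and for subsequent systems.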

Metric	What It Measures	Smart Contract Context
Attack Vector	How the exploit is delivered	Network (external call), Adjacent (cross-contract), Local (owner-only)
Attack Complexity	Conditions needed beyond attacker's control	Low (direct call) vs High (requires specific state/timing)
Privileges Required	Auth level needed	None (any EOA), Low (token holder), High (admin/owner)
User Interaction	Does a victim need to act?	None (fully autonomous) vs Required (phishing/social)
Impact: Confidentiality	Data exposure	Storage reads, private data leakage
Impact: Integrity	State corruption	Unauthorized state changes, balance manipulation
Impact: Availability	Service disruption	DoS, gas griefing, bricking the contract
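
A verifier's assessment across these metrics can be serialized as a standard CVSS v4.0 vector string. The sketch below is an assumption about how that might look, not the contents of `cvss.ts`: the `BaseMetrics` type and `toVectorString` helper are hypothetical. The abbreviations, value sets, and ordering follow the CVSS v4.0 specification, which uses None/Passive/Active for User Interaction and also requires Attack Requirements (AT) and subsequent-system impact (SC/SI/SA); those extra metrics default to `N` here.

```typescript
type Impact = "H" | "L" | "N";

// Hypothetical container for the Base metrics a verifier assigns.
interface BaseMetrics {
  AV: "N" | "A" | "L" | "P"; // Attack Vector
  AC: "L" | "H";             // Attack Complexity
  AT?: "N" | "P";            // Attack Requirements (assumed None if omitted)
  PR: "N" | "L" | "H";       // Privileges Required
  UI: "N" | "P" | "A";       // User Interaction
  VC: Impact;                // Confidentiality impact, vulnerable system
  VI: Impact;                // Integrity impact, vulnerable system
  VA: Impact;                // Availability impact, vulnerable system
  SC?: Impact;               // Subsequent-system impacts (assumed None if omitted)
  SI?: Impact;
  SA?: Impact;
}

// Builds the vector string in the metric order the specification defines.
function toVectorString(m: BaseMetrics): string {
  const parts = [
    `AV:${m.AV}`,
    `AC:${m.AC}`,
    `AT:${m.AT ?? "N"}`,
    `PR:${m.PR}`,
    `UI:${m.UI}`,
    `VC:${m.VC}`,
    `VI:${m.VI}`,
    `VA:${m.VA}`,
    `SC:${m.SC ?? "N"}`,
    `SI:${m.SI ?? "N"}`,
    `SA:${m.SA ?? "N"}`,
  ];
  return ["CVSS:4.0", ...parts].join("/");
}

// Illustrative rating for a fund-draining bug callable by any EOA.
const exampleVector = toVectorString({
  AV: "N", AC: "L", PR: "N", UI: "N", VC: "N", VI: "H", VA: "H",
});
// exampleVector is "CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:H/VA:H/SC:N/SI:N/SA:N"
```

Storing the vector string alongside the numeric score lets anyone recompute the score independently from the on-chain record.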

The resulting 0.0–10.0 score maps to a severity rating and a bounty multiplier:

CVSS Score	Severity	Bounty Payout
9.0 – 10.0	Critical	100% of bounty pool
7.0 – 8.9	High	60%
4.0 – 6.9	Medium	30%
0.1 – 3.9	Low	10%

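The score bands are the qualitative severity ratings defined by the CVSS specification and used by the National Vulnerability Database; the payout percentages are ExploitArena's own. Traditional cybersecurity standards, enforced trustlessly on-chain.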
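
The table translates directly into a lookup. The following is a minimal sketch under stated assumptions: `severityOf`, `PAYOUT_BPS`, and `payoutWei` are hypothetical names, payouts are expressed in basis points so the arithmetic stays in integers, and a score of 0.0 (CVSS severity "None", which the table does not list) is assumed to pay nothing.

```typescript
type Severity = "None" | "Low" | "Medium" | "High" | "Critical";

// Maps a 0.0 to 10.0 score onto the CVSS qualitative severity bands.
function severityOf(score: number): Severity {
  if (!(score >= 0 && score <= 10)) {
    throw new RangeError("CVSS score must be between 0.0 and 10.0");
  }
  if (score >= 9.0) return "Critical";
  if (score >= 7.0) return "High";
  if (score >= 4.0) return "Medium";
  if (score >= 0.1) return "Low";
  return "None";
}

// Payout share per band in basis points (10000 = 100% of the pool).
const PAYOUT_BPS: Record<Severity, number> = {
  None: 0, // assumption: the table has no row for 0.0
  Low: 1000,
  Medium: 3000,
  High: 6000,
  Critical: 10000,
};

// Integer-only payout calculation on a pool denominated in wei.
function payoutWei(poolWei: bigint, score: number): bigint {
  return (poolWei * BigInt(PAYOUT_BPS[severityOf(score)])) / BigInt(10000);
}

// A 10 ETH pool and the demo's CVSS 9.3 finding pay out the full pool.
const tenEth = BigInt("10000000000000000000");
const criticalPayout = payoutWei(tenEth, 9.3);
```

The same pool with a 7.5 (High) finding would pay 6 ETH, leaving 4 ETH in escrow.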

Why Multi-Agent Verification?

A single AI can hallucinate. It can generate an exploit that looks valid in text but fails on a real EVM. You can't disburse real funds based on one model's opinion.

ExploitArena's verifier pool provides adversarial independence:

- Isolated sandboxes. Each verifier runs in its own E2B cloud sandbox: separate VM, no shared memory, no shared state.
- Real execution required. Agents have shell access and must actually compile, run, and observe exploit output. No theoretical reasoning is accepted.
- Supermajority consensus. Bounty payouts require 3-of-5 verifiers to independently confirm.
- On-chain audit trail. Every vote is recorded on-chain: immutable, transparent, auditable.

This mirrors how real security audit firms operate: independent reviewers, peer-reviewed findings, signed attestations. Except it runs in minutes, not weeks, and the incentives are enforced by code, not legal contracts.

Demo

The demo showcases the full end-to-end flow against a deliberately vulnerable contract:

Scenario: Reentrancy Exploit

A VulnerableVault contract sends ETH before updating its internal balance — the classic reentrancy bug that caused the DAO hack.
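
To make the bug concrete, here is a small TypeScript simulation of the pattern. It is not the Solidity source of VulnerableVault: the class, the method names, and the callback that stands in for an ETH transfer are illustrative assumptions. The point is only the ordering. The vault pays out before it zeroes the caller's balance, so a re-entrant call still sees the stale balance.

```typescript
// Called when the vault "sends ETH"; stands in for a contract's receive hook.
type Receiver = (amount: number) => void;

class SimulatedVault {
  private balances: Record<string, number> = {};
  total = 0; // ETH currently held by the vault

  deposit(account: string, amount: number): void {
    this.balances[account] = (this.balances[account] ?? 0) + amount;
    this.total += amount;
  }

  // BUG: the external call happens before the balance is zeroed.
  withdraw(account: string, onReceive: Receiver): void {
    const amount = this.balances[account] ?? 0;
    if (amount === 0 || this.total < amount) return;
    this.total -= amount;       // ETH leaves the vault
    onReceive(amount);          // attacker code runs here and can re-enter
    this.balances[account] = 0; // too late: re-entrant calls already ran
  }
}

// A victim deposits 9, the attacker deposits 1 and re-enters until the vault is empty.
function runAttack(): { stolen: number; vaultLeft: number } {
  const vault = new SimulatedVault();
  vault.deposit("victim", 9);
  vault.deposit("attacker", 1);
  let stolen = 0;
  const reenter: Receiver = (amount) => {
    stolen += amount;
    vault.withdraw("attacker", reenter); // recorded balance is still 1
  };
  vault.withdraw("attacker", reenter);
  return { stolen, vaultLeft: vault.total };
}
```

In this simulation the attacker's single 1 ETH deposit is withdrawn ten times, draining all 10 ETH. Moving the `balances[account] = 0` line above the callback stops the loop after the first withdrawal.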

$ arena demo

⚡ ExploitArena — Full On-Chain Demo

  RPC: http://127.0.0.1:8545

✔ BountyEscrow deployed at 0x5FbDB...
✔ 3 verifiers authorized on-chain
✔ Bounty #0 created — 10 ETH escrowed
  On-chain status: Open
  Escrowed: 10 ETH

Running AI pipeline: attack → verify → auto-resolve
──────────────────────────────────────────────────
✔ Exploit found: Reentrancy in withdraw() (Critical)

  Description: withdraw() sends ETH before zeroing balance...
  Attack Steps:
    1. Deploy attacker contract
    2. Call deposit() then withdraw() to trigger reentrant call

─── Verifications ───
  CONFIRMED — CVSS 9.3 (Critical)
  CONFIRMED — CVSS 9.3 (Critical)
  CONFIRMED — CVSS 9.3 (Critical)

─── On-chain Result ───
  Status: Resolved
  Exploit count: 1
  Exploit #0 status: Confirmed
  Avg CVSS: 9.3

✓ BOUNTY RESOLVED — exploit confirmed on-chain
  Attacker's withdrawable balance: 10.0 ETH

What the Demo Proves

Step	What It Shows
Bounty submission	Developer-facing UX: deposit ETH + contract + deadline
Attacker agent in sandbox	AI explores codebase, writes exploit, tests it in a real sandbox
Sandboxed verification	Each verifier independently reproduces the exploit in its own sandbox
CVSS scoring	Industry-standard severity assessment, computed by each verifier
Auto-resolution	Contract resolves automatically when quorum is reached — no admin call needed
Pull-based withdrawal	Attacker can withdraw earned payout at any time
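
Pull-based withdrawal means the escrow only records what it owes when a bounty resolves, and the attacker claims it later in a separate call. A minimal TypeScript model of that pattern follows, assuming hypothetical names (`PullEscrow`, `credit`, `withdraw`) that are not taken from BountyEscrow.sol and using plain numbers for ETH amounts to keep the example short.

```typescript
class PullEscrow {
  private pending: Record<string, number> = {};

  // Called at resolution time: records what is owed, sends nothing.
  credit(account: string, amount: number): void {
    if (amount <= 0) throw new RangeError("amount must be positive");
    this.pending[account] = (this.pending[account] ?? 0) + amount;
  }

  withdrawable(account: string): number {
    return this.pending[account] ?? 0;
  }

  // Zeroes the balance before invoking the transfer, so a re-entrant
  // call from inside `send` finds nothing left to claim.
  withdraw(account: string, send: (amount: number) => void): number {
    const amount = this.withdrawable(account);
    if (amount === 0) return 0;
    this.pending[account] = 0;
    send(amount);
    return amount;
  }
}

// Resolution credits the payout; the attacker pulls it whenever convenient.
const escrow = new PullEscrow();
escrow.credit("attacker", 10);
const owedBeforeClaim = escrow.withdrawable("attacker"); // 10
```

Separating crediting from sending means a failing or malicious recipient cannot block resolution of the bounty itself.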

Project Structure

exploit-arena/
├── apps/
│   └── web/                     # Next.js frontend + API routes
│       ├── app/                 # Pages: bounties, leaderboard, submit
│       │   └── api/             # REST endpoints for bounties, scan, leaderboard, pipeline
│       ├── components/          # Header, wallet connect, theme toggle (shadcn/ui)
│       └── lib/                 # Wagmi config, scan store, utilities
├── packages/
│   ├── agents/                  # AI agent system
│   │   ├── attacker/            # LLM agent with sandbox tools
│   │   ├── verifier/            # Independent verification agent
│   │   ├── sandbox/             # E2B sandbox management + tool factory
│   │   ├── orchestrator/        # Pipeline: attack → verify → commit on-chain
│   │   ├── chain.ts             # Viem helpers for all on-chain operations
│   │   ├── provider.ts          # OpenAI-compatible LLM provider
│   │   └── SKILLS.md            # Agent workflow instructions
│   ├── cli/                     # arena demo / scan / submit / status
│   ├── contracts/               # Solidity: BountyEscrow + demo contracts
│   │   ├── BountyEscrow.sol     # Multi-exploit escrow with auto-resolution
│   │   └── demos/               # VulnerableVault, ReentrancyAttacker
│   ├── mcp/                     # MCP server for external AI agents
│   │   └── index.ts             # MCP tools: read chain state, manage sandboxes
│   └── shared/                  # Types, ABI, CVSS v4.0 scoring
│       └── src/
│           ├── types.ts         # On-chain struct mirrors & view types
│           ├── abi.ts           # Contract ABI exports
│           └── cvss.ts          # CVSS v4.0 scoring utilities
├── docker-compose.yml           # Local dev services (web, hardhat, mcp)
├── turbo.json                   # Turbo build pipeline
└── pnpm-workspace.yaml          # pnpm monorepo config

Quick Start

Prerequisites

Node.js ≥ 18
pnpm ≥ 9
An E2B API key (for cloud sandboxes)
An OpenAI API key (or any OpenAI-compatible endpoint)

Setup

git clone https://github.com/your-team/exploit-arena && cd exploit-arena

# Install dependencies
pnpm install

# Configure environment
cp .env.example .env
# Required: E2B_API_KEY, OPENAI_API_KEY
# Optional: OPENAI_BASE_URL, OPENAI_MODEL
# Chain:    NEXT_PUBLIC_CHAIN=hardhat (default) or sepolia
# Contract: NEXT_PUBLIC_ESCROW_ADDRESS (set after deploying)
# Pipeline: ATTACKER_PRIVATE_KEY, VERIFIER_PRIVATE_KEYS (comma-separated)
# MCP:      CHAIN, ESCROW_ADDRESS, RPC_URL

# Build all packages
pnpm build
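
The pipeline variables listed in the comments above could be validated at startup with something like the following. This is a sketch, not code from the repo: `parsePipelineEnv` and the `PipelineConfig` shape are hypothetical, and the 0x-prefixed 32-byte hex check is an assumption about the private key format.

```typescript
interface PipelineConfig {
  e2bApiKey: string;
  openaiApiKey: string;
  attackerKey: string;
  verifierKeys: string[]; // parsed from the comma-separated variable
}

// Assumed key format: 0x followed by 64 hex characters.
const HEX_KEY = /^0x[0-9a-fA-F]{64}$/;

function parsePipelineEnv(env: Record<string, string | undefined>): PipelineConfig {
  // Returns the trimmed value or fails with the variable's name.
  const required = (name: string): string => {
    const value = (env[name] ?? "").trim();
    if (value === "") throw new Error(`Missing required variable: ${name}`);
    return value;
  };

  const e2bApiKey = required("E2B_API_KEY");
  const openaiApiKey = required("OPENAI_API_KEY");
  const attackerKey = required("ATTACKER_PRIVATE_KEY");
  const verifierKeys = required("VERIFIER_PRIVATE_KEYS")
    .split(",")
    .map((key) => key.trim())
    .filter((key) => key !== "");

  for (const key of [attackerKey, ...verifierKeys]) {
    if (!HEX_KEY.test(key)) {
      throw new Error("Private keys must be 0x-prefixed 32-byte hex strings");
    }
  }
  return { e2bApiKey, openaiApiKey, attackerKey, verifierKeys };
}
```

In a Node.js entry point this would be called as `parsePipelineEnv(process.env)`, so a missing key fails fast instead of surfacing midway through a scan.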

Run the Demo (full on-chain cycle)

# Terminal 1: Start local Hardhat node
cd packages/contracts && pnpm node

# Terminal 2: Run the demo
pnpm --filter @exploit-arena/cli exec arena demo

Scan a Source File (on-chain)

# Scan a Solidity file — requires a deployed escrow and an active bounty
pnpm --filter @exploit-arena/cli exec arena scan \
  --escrow 0xYOUR_ESCROW_ADDRESS \
  --bounty-id 0 \
  --source path/to/Contract.sol

# Scan with more agents and custom keys
pnpm --filter @exploit-arena/cli exec arena scan \
  --escrow 0xYOUR_ESCROW_ADDRESS \
  --bounty-id 0 \
  --source path/to/Contract.sol \
  --attackers 3 --verifiers 5 --quorum 3

Run the MCP Server (for external AI agents)

# Start the MCP server (SSE mode)
pnpm --filter @exploit-arena/mcp start

# Or stdio mode for direct integration
pnpm --filter @exploit-arena/mcp start -- --stdio

Run the Web Dashboard

pnpm --filter @exploit-arena/web dev
# Opens at http://localhost:3000

Docker Compose

# Start all services: web, hardhat node, MCP server
docker compose up

# web     → http://localhost:3000
# hardhat → http://localhost:8545
# mcp     → http://localhost:3001/sse

Open WebUI (Docker, local)

Open WebUI is included in docker-compose.yml as openwebui.

# Start Open WebUI with the existing stack
docker compose up -d openwebui mcp node web

# Open WebUI
# http://localhost:3002

Optional (recommended) in your root .env:

# Use a long random value in real setups
OPENWEBUI_SECRET_KEY=replace-with-a-long-random-secret

Add ExploitArena MCP (SSE) to Open WebUI

Open WebUI supports MCP Streamable HTTP (v0.6.31+).

1. Open Admin Settings in Open WebUI.
2. Go to Tools and add a new tool connection.
3. Set Type to MCP (Streamable HTTP) (not OpenAPI).
4. Set the MCP URL:
   - MCP running on host (dev server via pnpm dev) + Open WebUI in Docker: http://host.docker.internal:3001/sse
   - Both running on host outside Docker: http://localhost:3001/sse
   - MCP also containerised in the same compose network: http://mcp:3001/sse
5. Save, then test the connection by listing available tools (for example list_bounties).

Add a Custom OpenAI-Compatible Inference Endpoint

1. In Open WebUI, open Admin Settings.
2. Go to Connections > OpenAI > Manage.
3. Add New Connection, then choose Standard / Compatible.
4. Set:
   - API URL: your endpoint with /v1 (example: http://host.docker.internal:8000/v1)
   - API Key: your provider key (or none if your endpoint does not require auth)
   - Optional model filter: restrict visible model IDs.
5. Save and pick the model in chat.

Notes:

- If your inference server runs on the host machine and Open WebUI runs in Docker, use http://host.docker.internal:PORT/v1.
- If your inference server is another Compose service, use http://<service-name>:PORT/v1.
- Keep tool Type as MCP for MCP servers. Using OpenAPI type for MCP can fail or hang.

Architecture: Agent Sandbox Model

Every agent (attacker and verifier) runs inside an isolated E2B cloud sandbox — a full Linux VM with shell, Node.js, Python, and git. Agents interact with their sandbox through three tools:

Tool	Description
`shell`	Execute any shell command (ls, npm install, compile, test, run scripts)
`read_file`	Read source files from the repo or sandbox
`write_file`	Create exploit code, test files, configs

All on-chain submissions (exploit hashes, verification votes, CVSS scores) are committed directly by the orchestrator pipeline — agents focus purely on analysis and reproduction.

The LLM drives the agent autonomously for up to 30 steps (attackers) or 25 steps (verifiers), using the tools to explore, analyze, write code, and execute it — with the strict requirement that exploits must be tested and verified in the sandbox before submission.
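
The step budget and the test-before-submit rule can be sketched as a simple control loop. Everything here is an illustrative assumption rather than the repo's orchestrator: `Decide` stands in for the LLM, `RunTool` stands in for the sandbox, and "tested" is approximated as "at least one `shell` call succeeded", which is far weaker than a real check that the exploit ran.

```typescript
type ToolName = "shell" | "read_file" | "write_file";

interface ToolCall { tool: ToolName; input: string; }
interface ToolResult { ok: boolean; output: string; }

// What the model asks for on each step: run a tool, or submit an exploit.
type AgentAction =
  | { kind: "tool"; call: ToolCall }
  | { kind: "submit"; exploit: string };

type Decide = (history: ToolResult[]) => AgentAction; // stand-in for the LLM
type RunTool = (call: ToolCall) => ToolResult;         // stand-in for the sandbox

interface AgentOutcome { exploit: string | null; steps: number; }

// Runs until the agent submits a tested exploit or the step budget runs out.
function runAgent(decide: Decide, runTool: RunTool, maxSteps: number): AgentOutcome {
  const history: ToolResult[] = [];
  let shellSucceeded = false;
  for (let step = 1; step <= maxSteps; step++) {
    const action = decide(history);
    if (action.kind === "submit") {
      if (shellSucceeded) return { exploit: action.exploit, steps: step };
      // Untested submissions cost a step and are fed back as an error.
      history.push({ ok: false, output: "rejected: run the exploit in the sandbox first" });
      continue;
    }
    const result = runTool(action.call);
    if (action.call.tool === "shell" && result.ok) shellSucceeded = true;
    history.push(result);
  }
  return { exploit: null, steps: maxSteps };
}
```

An attacker would be run with `maxSteps` of 30 and a verifier with 25, matching the limits above. An agent that exhausts its budget simply returns no exploit.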

See packages/agents/SKILLS.md for the full agent workflow specification.

Tech Stack

Layer	Technology
Smart Contracts	Solidity, Hardhat
Local Blockchain	Hardhat Network
Agent Framework	TypeScript, Vercel AI SDK
LLM	OpenAI (or any OpenAI-compatible endpoint)
Sandbox Isolation	E2B cloud sandboxes
MCP Server	Model Context Protocol (SSE + stdio)
Frontend	Next.js, Tailwind CSS v4, shadcn/ui
Wallet	Wagmi v2 (Hardhat + Sepolia support via `NEXT_PUBLIC_CHAIN`)
Monorepo	pnpm workspaces, Turborepo

Responsible Use

ExploitArena is designed for authorized security research only. Only submit contracts you own or have explicit permission to test. All demo scenarios use contracts deployed on local forks and testnets with no real funds at risk.

License

MIT — see LICENSE

Built at KJSSE GajShield Hack X · April 2026 · Mumbai
