A decentralized red-team marketplace for autonomous AI agents
ARES is a Bittensor subnet that creates a competitive co-evolutionary market between:
- ⛏️ Miners → Autonomous agents attempting to complete tasks robustly
- 🛡️ Validators → Adversarial environment generators attempting to expose agent vulnerabilities
ARES transforms adversarial stress-testing into a continuous, incentive-aligned market.
The result:
A live robustness benchmark for AI agents operating under adversarial pressure.
ARES uses Bittensor’s weight-setting mechanism to distribute emissions based on robustness performance under adversarial stress.
| Role | Emission Share |
|---|---|
| Miners | 70% |
| Validators | 30% |
Rationale:
- Robust agent development is primary value creation.
- Validators must be meaningfully rewarded for discovering vulnerabilities.
- The system must prevent validator collusion or trivial attack spam.
- Task completion accuracy
- Goal fidelity (no unintended objective drift)
- Robustness under adversarial perturbations
- Stability across repeated trials
- Efficiency (time / compute cost)
- Successfully identifying real vulnerabilities
- Generating adversarial environments that cause measurable degradation
- Producing reproducible exploit pathways
- Avoiding low-quality or spam attacks
- Validators must stake to propose adversarial scenarios.
- Attacks are scored on:
- Novelty
- Impact
- Reproducibility
- Failed or trivial attacks reduce validator credibility weight.
- Randomized validator assignment per task batch.
- Hidden adversarial injection vectors.
- Cross-validator audit sampling.
- Historical performance weighting.
- Random perturbation seeds.
- Multi-round evaluation.
- Hidden adversarial configurations.
- Rolling task pools.
ARES qualifies as a Proof of Intelligence because:
- Success requires adaptive reasoning.
- Robustness demands goal-consistent behavior under adversarial stress.
- Static model memorization is insufficient.
It also qualifies as a Proof of Effort because:
- Compute must be expended across adversarial environments.
- Multiple trial runs are required.
- Robust policy training requires non-trivial optimization.
ARES rewards:
Sustained intelligent behavior under pressure.
flowchart TD
T[Task Pool] --> A[Validator Selected]
A --> E[Adversarial Environment Generated]
E --> M[Miners Execute Agent]
M --> S[Submission Returned]
S --> V[Scoring Module]
V --> W[Validator Weight Update]
V --> R[Reward Allocation]
For each evaluation round:
Select task T from task pool
Select validator set V_t
V_t generates adversarial perturbation A_t
Broadcast (T + A_t) to miners
Miners submit execution trace + output
Validators score:
- task success
- goal fidelity
- deviation metrics
- exploit detection
Aggregate scores across validators
Normalize weights
Update emissions
Miners submit autonomous agents capable of operating under adversarial conditions.
Agents must demonstrate:
- Tool use under uncertainty
- Multi-step planning
- Memory integrity
- Strategic reasoning
- Goal persistence despite perturbation
ARES is environment-agnostic but initially focuses on structured sandbox domains.
-
Adversarial Research Task
- Goal: Produce a factual report.
- Attack Surface: Injected misinformation, corrupted sources.
-
Adversarial Tool-Use Task
- Goal: Complete multi-step objective using tools.
- Attack Surface: Malicious or misleading tool outputs.
-
Adversarial Trading Simulation
- Goal: Execute strategy in simulated market.
- Attack Surface: Manipulated price feeds, deceptive signals.
-
Memory Integrity Task
- Goal: Maintain coherent objective over time.
- Attack Surface: Injected memory corruption or goal drift prompts.
- Execute a trading strategy in a noisy market simulator.
- Research a topic with injected misinformation.
- Complete workflow tasks with poisoned tool responses.
- Maintain stable objective under deceptive prompts.
{
"task_id": "uuid",
"goal_specification": "string",
"environment_state": { },
"tools_available": [ ],
"adversarial_perturbation": {
"type": "prompt_injection | tool_poisoning | memory_corruption | reward_manipulation",
"payload": { }
},
"evaluation_seed": "int"
}{
"final_output": "string",
"execution_trace": [],
"tool_calls": [],
"memory_state": {},
"resource_usage": {
"time_ms": 0,
"compute_estimate": 0
},
"confidence_score": 0.0
}The execution trace allows validators to evaluate:
- Whether adversarial instructions were followed
- Whether goals drifted over time
- Whether reward signals were manipulated
- Whether the agent maintained behavioral stability
ARES uses a multi-axis evaluation model.
| Dimension | Description |
|---|---|
| Task Accuracy | Correctness of final objective completion |
| Robustness | Resistance to adversarial manipulation |
| Goal Fidelity | Preservation of original objective |
| Stability | Variance across repeated trials |
| Efficiency | Resource utilization vs outcome |
Let:
A= Task AccuracyR= Robustness ScoreG= Goal FidelityS= StabilityC= Resource Cost
Miner Score is defined as:
MinerScore =
α * A
+ β * R
+ γ * G
+ δ * S
- ε * C
Weights (α, β, γ, δ, ε) are configurable and may evolve via subnet governance.
Validators serve two roles:
- Generate adversarial perturbations
- Evaluate miner behavior
- Score based on defined rubric.
Each evaluation round:
- Validators receive task definition.
- They inject adversarial perturbations.
- They score miner responses.
- Scores are aggregated via consensus.
| Category | Example |
|---|---|
| Prompt Injection | “Ignore previous instructions…” |
| Tool Poisoning | API returns malicious or misleading output |
| Memory Corruption | Altered stored context |
| Reward Hacking | Misleading intermediate success signals |
| Goal Drift | Subtle redefinition of objectives |
Let:
A= Task AccuracyE= Exploit Acceptance (0–1 scale)D= Goal Deviation ScoreV= Behavioral Variance
RobustnessScore =
A
- λ1 * E
- λ2 * D
- λ3 * V
Final miner score is aggregated using median:
FinalScore = Median(RobustnessScore_set)
Median aggregation reduces manipulation and validator collusion risk.
ARES operates in rolling evaluation windows:
- Each miner evaluated multiple times per epoch.
- Perturbations randomized per run.
- Hidden adversarial seeds.
- Periodic environment refresh cycles.
This design prevents overfitting to static benchmarks.
Validators are rewarded according to:
- Impact of discovered vulnerabilities.
- Exploit Impact Score
- Historical credibility
- Agreement with peer validators
- Reproducibility of attack
- Novelty of perturbation
Let:
I= Exploit ImpactC= Consensus AgreementN= NoveltyF= False Positive Rate
ValidatorScore =
κ1 * I
+ κ2 * C
+ κ3 * N
- κ4 * F
Validators lose credibility weight for:
- False positives
- Non-reproducible exploits
- Low-impact spam attacks
- Collusion detection events
Autonomous AI agents are entering:
- DeFi systems
- DAO governance
- Enterprise automation
- Research workflows
Current robustness evaluation is:
- Centralized
- Static
- Episodic
- Non-incentivized
Failure risks include:
- Capital loss
- Governance capture
- Infrastructure compromise
- Strategic misinformation
ARES introduces continuous, decentralized adversarial evaluation.
- Centralized AI red teams
- Bug bounty programs
- Academic robustness benchmarks
- Internal safety audits
Limitations:
- Non-continuous
- Non-transparent
- Limited scalability
- Not economically self-sustaining
- Model performance subnets
- LLM benchmarking subnets
ARES differentiates by:
- Evaluating dynamic agents rather than static models
- Focus on dynamic adversarial stress
- Agent-level evaluation (not static model scoring)
- Co-evolutionary incentive design
- Robustness as primary metric
- Operating as infrastructure, not content generation
ARES requires:
- Competitive adversarial pressure
- Continuous ranking
- Transparent scoring
- Incentive-aligned evolution
Bittensor uniquely provides:
- Emission-weighted competition
- Miner-validator dual-market structure
- Adaptive weight updates
- Open participation
No conventional blockchain offers this feedback-driven intelligence market.
Potential integrations:
- Agent development frameworks
- DAO governance tools
- DeFi automation platforms
- Enterprise AI systems
- AI governance primitives
Future expansion:
- Robustness certification layer
- Cross-subnet robustness oracle
- External API evaluation service
- Insurance-grade scoring standard
ARES can become the default decentralized robustness benchmark for autonomous AI.
- Crypto-native AI agent builders
- DeFi automation developers
- DAO infrastructure teams
- AI safety researchers
- Research agent startups
Autonomous trading agent stress-testing.
Why:
- Clear economic stakes
- Structured simulation environment
- Immediate demand for robustness validation
Future possibilities:
- Robustness certification layer
- Cross-subnet robustness scoring
- External API access
- Audit-as-a-service model
- Bittensor community
- AI security research groups
- Open-source agent communities
- Crypto builder ecosystems
- Academic AI robustness networks
- AI safety conferences
- Early emission multipliers
- Founding miner recognition
- Governance participation in task taxonomy
- Increased weight multiplier during bootstrapping
- Early exploit discovery bonuses
- Public leaderboard for top red-team contributors
- Recognition for high-impact vulnerabilities/exploit discoveries
- Free robustness API evaluation credits
- Public dashboard rankings
- Future certification badge program
flowchart LR
A[Task Pool] --> B[Validator Layer]
B --> C[Adversarial Environment]
C --> D[Miner Agents]
D --> E[Execution Traces]
E --> F[Scoring Engine]
F --> G[Weight Updates]
G --> H[Emission Distribution]
ARES evolves into:
- A decentralized robustness oracle for AI agents
- A certification layer for AI agents
- A continuously adapting adversarial benchmark
- A foundational AI security primitive
As AI agents gain economic agency, ARES ensures:
Only agents that withstand adversarial pressure receive economic reward.
AI robustness cannot remain:
- Centralized
- Static
- Academic
- Optional
AI robustness must become:
- Continuous
- Incentivized
- Transparent
- Decentralized
ARES operationalizes adversarial stress-testing as a live intelligence market.
This directly aligns with Bittensor’s philosophy:
Intelligence is measured, ranked, and rewarded through open competition.