A Bittensor Subnet for Provable Classifier Robustness Under Attack
GAUNTLET (models that "run the gauntlet") is a Bittensor subnet that creates a decentralized market for robust machine learning classifiers.
- ⚔️ Miners: Train and serve classifiers (image, tabular, signal).
- 🛡️ Validators: Actively attack these classifiers using adversarial methods (PGD, FGSM, AutoAttack, etc.).
- Scoring: Accuracy under adaptive adversarial attack.
- Reward: Emissions flow to models that remain accurate under stress.
This subnet transforms adversarial robustness from an academic benchmark into a continuous, adversarial, economically incentivized proof-of-intelligence system.
Instead of rewarding raw accuracy, we reward resilience under attack.
```
+------------------+            +----------------------+
|      Miner       |            |      Validator       |
|------------------|            |----------------------|
| Robust Classifier| <--------> |  Adversarial Engine  |
|   API Endpoint   |            | (PGD, FGSM, AutoAtk) |
+------------------+            +----------------------+
         |                                  |
         |                                  |
         +-----------> Scoring <------------+
                          |
                          v
                 Emission Allocation
```
The system creates a continuous adversarial game:
- Validators try to break models.
- Miners try to withstand attacks.
- Emissions reflect robustness.
Each epoch, validators:
1. Sample a hidden dataset batch.
2. Generate adversarial perturbations.
3. Evaluate:
   - Clean accuracy (A_clean)
   - Robust accuracy (A_adv)
4. Compute the final robustness score:

[ Score_i = \alpha \cdot A_{adv} + \beta \cdot A_{clean} - \gamma \cdot LatencyPenalty ]
Where:
- α > β (robustness weighted higher)
- γ discourages slow inference
Emission distribution:
[ Emission_i = \frac{Score_i^\tau}{\sum_j Score_j^\tau} ]
- τ = temperature parameter to sharpen competition.
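A worked sketch of both formulas in Python; the weights (α = 0.6, β = 0.3, γ = 0.1) and τ = 2.0 are illustrative assumptions, not protocol constants:

```python
def robustness_score(a_adv, a_clean, latency_penalty,
                     alpha=0.6, beta=0.3, gamma=0.1):
    """Score_i = alpha * A_adv + beta * A_clean - gamma * LatencyPenalty."""
    return alpha * a_adv + beta * a_clean - gamma * latency_penalty

def emission_split(scores, tau=2.0):
    """Emission_i = Score_i^tau / sum_j Score_j^tau; tau > 1 sharpens competition."""
    powered = {k: max(s, 0.0) ** tau for k, s in scores.items()}
    total = sum(powered.values()) or 1.0
    return {k: v / total for k, v in powered.items()}

# Illustrative numbers: a robust-but-slightly-less-clean model out-earns
# a clean-but-fragile one because alpha > beta.
scores = {"miner_a": robustness_score(0.62, 0.91, 0.05),
          "miner_b": robustness_score(0.38, 0.95, 0.02)}
print(emission_split(scores))  # miner_a takes the larger share
```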
Miners are incentivized to:
- Train adversarially robust models
- Reduce gradient masking
- Provide fast inference
- Avoid overfitting to validator patterns
They are punished if:
- Attacks reduce performance drastically
- Latency is excessive
- Outputs are inconsistent
Validators are incentivized to:
- Generate strong, valid attacks
- Discover weaknesses
- Avoid false negatives
Validator reward depends on:
[ ValidatorScore = \Delta Accuracy + NoveltyFactor ]
Where:
- ΔAccuracy = the accuracy drop caused by the attack
- NoveltyFactor = a bonus that encourages new perturbation types
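A one-function sketch of this reward; treating NoveltyFactor as a bonus in [0, 1] for attack types not recently seen is an assumption:

```python
def validator_score(acc_clean, acc_under_attack, novelty_factor=0.0):
    """ValidatorScore = DeltaAccuracy + NoveltyFactor."""
    delta_accuracy = acc_clean - acc_under_attack  # drop caused by the attack
    return delta_accuracy + novelty_factor
```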
Validators are penalized for:
- Invalid perturbations (exceeding epsilon bounds)
- Trivial or duplicate attacks
Anti-gaming mechanisms:
- Randomized attack strategies
- Hidden test sets
- Ensemble validators
- Transfer attacks
- Attack validity checks
- Bounded perturbation norms
- Multi-validator consensus
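A sketch of the attack-validity check in PyTorch, assuming an L-infinity ε-ball with ε = 8/255 and inputs scaled to [0, 1] (all illustrative assumptions):

```python
import torch

def perturbation_is_valid(x_clean: torch.Tensor, x_adv: torch.Tensor,
                          eps: float = 8 / 255, tol: float = 1e-6) -> bool:
    """Reject perturbations that leave the agreed epsilon ball or pixel range."""
    within_ball = (x_adv - x_clean).abs().max().item() <= eps + tol
    within_range = bool(x_adv.min() >= 0.0 and x_adv.max() <= 1.0)
    return within_ball and within_range
```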
This subnet qualifies as a Proof of Intelligence because:
1. Robustness under adversarial attack is computationally non-trivial.
2. It requires:
   - Adversarial training
   - Regularization strategies
   - Model architecture sophistication
3. Validators must compute gradient-based adversarial examples.
The system proves:
- Model generalization
- Defense capability
- Computational effort
Unlike raw inference subnets, this one measures resilience against strategic adversaries.
For each epoch:
1. Validators sample hidden dataset batch
2. Validators query miner model
3. Generate adversarial samples (PGD, FGSM, etc.)
4. Evaluate clean and adversarial accuracy
5. Compute score
6. Normalize emissions
7. Distribute rewards
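A compact sketch of this loop; `sample_batch`, `query`, and `attack` are hypothetical injected callables, and the weights mirror the illustrative values above:

```python
def run_epoch(miners, sample_batch, query, attack,
              alpha=0.6, beta=0.3, gamma=0.001, tau=2.0):
    x, y = sample_batch()                              # 1. hidden dataset batch
    raw_scores = {}
    for miner in miners:
        clean_preds, latency_ms = query(miner, x)      # 2. query miner model
        x_adv = attack(miner, x, y)                    # 3. adversarial samples
        adv_preds, _ = query(miner, x_adv)
        a_clean = sum(p == t for p, t in zip(clean_preds, y)) / len(y)  # 4.
        a_adv = sum(p == t for p, t in zip(adv_preds, y)) / len(y)
        raw_scores[miner] = (alpha * a_adv + beta * a_clean
                             - gamma * latency_ms)     # 5. compute score
    powered = {m: max(s, 0.0) ** tau for m, s in raw_scores.items()}
    total = sum(powered.values()) or 1.0
    return {m: v / total for m, v in powered.items()}  # 6-7. emission weights
```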
Miners must:
- Host a classifier API
- Accept batch inputs
- Return:
  - Predicted class
  - Confidence score (optional)
- Respond within latency constraints
Supported domains:
- Image classification (e.g. CIFAR-style)
- Tabular fraud detection
- Signal classification
Example request:

```json
{
  "task_id": "image_cifar",
  "batch": [
    { "input": "<base64_encoded_tensor>" }
  ]
}
```

Example response:

```json
{
  "predictions": [
    { "label": 3, "confidence": 0.92 }
  ],
  "latency_ms": 38
}
```

Scoring dimensions:

| Dimension | Weight |
|---|---|
| Robust Accuracy | High |
| Clean Accuracy | Medium |
| Latency | Medium |
| Consistency | Medium |
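A minimal sketch of a miner endpoint matching this schema, assuming FastAPI and a TorchScript checkpoint (`robust_classifier.pt` is a hypothetical path; any HTTP stack would do):

```python
import base64
import io
import time

import torch
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
model = torch.jit.load("robust_classifier.pt")  # hypothetical checkpoint path
model.eval()

class Item(BaseModel):
    input: str  # base64-encoded serialized tensor

class BatchRequest(BaseModel):
    task_id: str
    batch: list[Item]

@app.post("/predict")
def predict(req: BatchRequest):
    start = time.time()
    predictions = []
    for item in req.batch:
        raw = base64.b64decode(item.input)
        x = torch.load(io.BytesIO(raw))  # assumes tensors shipped via torch.save
        with torch.no_grad():
            probs = torch.softmax(model(x.unsqueeze(0)), dim=1)
            confidence, label = probs.max(dim=1)
        predictions.append({"label": int(label), "confidence": float(confidence)})
    return {"predictions": predictions,
            "latency_ms": int((time.time() - start) * 1000)}
```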
Validators:
- Perform gradient estimation.
- Run:
  - FGSM
  - PGD (multi-step)
  - AutoAttack (optional advanced phase)
- Measure:

[ RobustAccuracy = \frac{Correct\ under\ attack}{Total} ]
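A minimal FGSM and robust-accuracy sketch in PyTorch (framework choice is an assumption; PGD iterates the same step with projection back into the ε-ball):

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=8 / 255):
    """Single-step L-infinity attack: step along the sign of the loss gradient."""
    x = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    # Clamp assumes inputs scaled to [0, 1] (an illustrative assumption).
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

def robust_accuracy(model, x, y, eps=8 / 255):
    """RobustAccuracy = correct under attack / total."""
    preds = model(fgsm(model, x, y, eps)).argmax(dim=1)
    return (preds == y).float().mean().item()
```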
Validators submit:
- Perturbed samples
- Attack parameters
- Result logs
Scoring mechanics:
- Epoch-based scoring (e.g., every 100 blocks)
- Rolling average to reduce variance
- Randomized attack selection
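A minimal sketch of the rolling average, with a 10-epoch window as an illustrative assumption:

```python
from collections import deque

class RollingScore:
    """Smooth per-miner scores across epochs to reduce attack-selection variance."""
    def __init__(self, window: int = 10):
        self._scores = deque(maxlen=window)

    def update(self, score: float) -> float:
        self._scores.append(score)
        return sum(self._scores) / len(self._scores)
```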
Validators earn more if:
- They discover new vulnerabilities.
- They reduce miner robustness significantly.
- Their attack validity is confirmed by peers.
Validator staking is required to discourage spam attacks.
Adversarial attacks threaten:
- Autonomous vehicles
- Fraud detection systems
- AI medical diagnostics
- Financial AI systems
Most deployed AI models are:
- Not adversarially tested
- Easily manipulated
- Vulnerable to gradient-based attacks
Robustness testing today is:
- Centralized
- Expensive
- Static
We create:
A decentralized, continuous robustness benchmark.
Existing alternatives:
- RobustBench
- Academic benchmarks
- Internal red-teaming
Limitations:
- Static datasets
- No economic incentives
- No adversarial evolution
Existing Bittensor subnets:
- General inference subnets
- LLM scoring subnets
None focus on adversarial ML robustness.
Bittensor is uniquely suited because:
- It supports adversarial competition.
- Emissions reward measurable performance.
- Validators can evolve attacks.
- Miners continuously improve.
It creates:
A live adversarial ecosystem.
Possible monetization:
- Enterprise robustness certification
- API access to robustness leaderboard
- Insurance underwriting input
- White-label adversarial testing
Long term:
- Robustness Score as on-chain primitive
- Security oracle for AI systems
Target users:
- AI startups deploying classifiers
- Web3 AI protocols
- Security-focused AI labs
- Research institutions
Early dataset domains:
- Fraud detection
- Crypto transaction anomaly detection
- Image moderation systems
Go-to-market channels:
- Crypto AI Twitter
- Bittensor ecosystem partners
- Research publications
- Hackathon demos
- Open leaderboard website
Launch incentives:
- Bonus emission multiplier for first N epochs
- Early adopter NFT badge
- Higher reward multiplier for novel attacks
- Bounty pool for breaking top miner
- Free robustness evaluation for first 100 external models
```
                 +------------------+
                 |  Hidden Dataset  |
                 +------------------+
                          |
                          v
+----------+     +------------------+     +------------+
| Miner A  |<--->|    Validators    |<--->|  Miner B   |
+----------+     | (Attack Engine)  |     +------------+
     |                    |
     v                    v
Robust Model      Score Aggregation
                          |
                          v
                   Emission Split
```
Phase 1:
- Image & tabular classification robustness
Phase 2:
- LLM jailbreak resistance
- Multimodal robustness
Phase 3:
- On-chain AI security oracle
- AI robustness insurance market
Although it may feel academic at first, this subnet can become:
- The security layer of AI
- The robustness oracle for autonomous systems
- The on-chain benchmark for trustworthy intelligence
As AI integrates into finance, robotics, and defense:
Robustness becomes more valuable than raw intelligence.
We are building:
A decentralized adversarial intelligence arms race.
GAUNTLET, the adversarial robustness subnet, transforms adversarial ML from an academic benchmark into a live economic competition.
It aligns:
- Cryptoeconomics
- Security engineering
- Machine learning research
Into a single measurable signal:
Accuracy under attack.
This is not just proof of inference.
This is proof of resilience.