A Bittensor subnet for provable classifier robustness under attack

JacobKohav/gauntlet


🏁 GAUNTLET Subnet

πŸ›‘οΈ Classifier Adversarial Robustness Subnet

A Bittensor Subnet for Provable Classifier Robustness Under Attack


0. 🧾 Executive Summary

GAUNTLET (named for models that "run the gauntlet") is a Bittensor subnet that creates a decentralized market for robust machine learning classifiers.

  • ⛏️ Miners: Train and serve classifiers (image, tabular, signal).
  • 🛡️ Validators: Actively attack these classifiers using adversarial methods (PGD, FGSM, AutoAttack, etc.).
  • Scoring: Accuracy under adaptive adversarial attack.
  • Reward: Emissions flow to models that remain accurate under stress.

This subnet transforms adversarial robustness from an academic benchmark into a continuous, adversarial, economically incentivized proof-of-intelligence system.

Instead of rewarding raw accuracy, we reward resilience under attack.


1. 🧩 Subnet Design Proposal


1.1 🧭 System Overview

+------------------+          +----------------------+
|      Miner       |          |      Validator       |
|------------------|          |----------------------|
| Robust Classifier| <------> | Adversarial Engine   |
| API Endpoint     |          | (PGD, FGSM, AutoAtk) |
+------------------+          +----------------------+
         |                                |
         |                                |
         +------------> Scoring <---------+
                          |
                          v
                 Emission Allocation

The system creates a continuous adversarial game:

  • Validators try to break models.
  • Miners try to withstand attacks.
  • Emissions reflect robustness.

1.2 ⚖️ Incentive & Mechanism Design


1.2.1 🪙 Emission and Reward Logic

Each epoch:

  1. Validators sample a hidden dataset batch.

  2. Generate adversarial perturbations.

  3. Evaluate:

    • Clean Accuracy (A_clean)
    • Robust Accuracy (A_adv)
  4. Compute final robustness score:

\[ \text{Score}_i = \alpha \cdot A_{adv} + \beta \cdot A_{clean} - \gamma \cdot \text{LatencyPenalty} \]

Where:

  • α > β (robustness weighted higher)
  • γ discourages slow inference
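The score formula above can be sketched in a few lines of Python. This is only an illustration: the function name `miner_score`, the default weights (0.6, 0.3, 0.1), and the clamp at zero are assumptions, not values taken from this proposal.

```python
def miner_score(a_adv, a_clean, latency_penalty, alpha=0.6, beta=0.3, gamma=0.1):
    """Score_i = alpha * A_adv + beta * A_clean - gamma * LatencyPenalty.

    The default weights are illustrative placeholders only.
    """
    if not alpha > beta:
        raise ValueError("alpha must exceed beta so robustness is weighted higher")
    raw = alpha * a_adv + beta * a_clean - gamma * latency_penalty
    # Assumption: clamp at zero so a large latency penalty cannot produce
    # a negative score, which would be awkward to normalize later.
    return max(raw, 0.0)
```

For example, a miner with 50% robust accuracy, 90% clean accuracy and a latency penalty of 0.2 would score 0.6 * 0.5 + 0.3 * 0.9 - 0.1 * 0.2 = 0.55 under these placeholder weights.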

Emission distribution:

\[ \text{Emission}_i = \frac{\text{Score}_i^{\tau}}{\sum_j \text{Score}_j^{\tau}} \]

  • τ = temperature parameter to sharpen competition.
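A minimal sketch of the emission split, assuming scores are non-negative. The function name `emission_shares` and the handling of the all-zero case are assumptions made for illustration.

```python
def emission_shares(scores, tau=2.0):
    """Emission_i = Score_i^tau / sum_j Score_j^tau."""
    # Assumption: negative scores are treated as zero before exponentiation.
    powered = [max(s, 0.0) ** tau for s in scores]
    total = sum(powered)
    if total == 0:
        # Assumption: nobody earns emissions when every score is zero.
        return [0.0 for _ in scores]
    return [p / total for p in powered]
```

With scores 0.5 and 0.25, τ = 1 gives shares of 2/3 and 1/3, while τ = 2 gives 0.8 and 0.2. Raising τ therefore shifts emissions toward the stronger miner, which is the "sharpening" effect described above.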

1.2.2 🎯 Incentive Alignment

Miners

Incentivized to:

  • Train adversarially robust models
  • Reduce gradient masking
  • Provide fast inference
  • Avoid overfitting to validator patterns

They are punished if:

  • Attacks reduce performance drastically
  • Latency is excessive
  • Outputs are inconsistent

Validators

Validators are incentivized to:

  • Generate strong, valid attacks
  • Discover weaknesses
  • Avoid false negatives

Validator reward depends on:

\[ \text{ValidatorScore} = \Delta\text{Accuracy} + \text{NoveltyFactor} \]

Where:

  • ΔAccuracy = drop caused by attack
  • NoveltyFactor = encourages new perturbation types

Validators are penalized for:

  • Invalid perturbations (exceeding epsilon bounds)
  • Trivial or duplicate attacks
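The validator reward and the epsilon-bound check can be sketched together. Everything here is a hypothetical illustration: the proposal does not fix the norm, the input range, or the size of the penalty, so this sketch assumes an L-infinity bound, inputs in [0, 1], and a score of zero for any invalid submission.

```python
def linf_distance(original, perturbed):
    """Largest absolute per-feature change between two flat inputs."""
    return max((abs(p - o) for o, p in zip(original, perturbed)), default=0.0)

def is_valid_perturbation(original, perturbed, epsilon, lo=0.0, hi=1.0, tol=1e-9):
    """Check the assumed L-infinity budget and the assumed valid input range."""
    if len(original) != len(perturbed):
        return False
    in_range = all(lo - tol <= p <= hi + tol for p in perturbed)
    return in_range and linf_distance(original, perturbed) <= epsilon + tol

def validator_score(a_clean, a_adv, novelty_factor, all_valid):
    """ValidatorScore = DeltaAccuracy + NoveltyFactor, zeroed if any sample is invalid."""
    if not all_valid:
        # Assumption: a single out-of-bounds perturbation forfeits the reward.
        return 0.0
    return (a_clean - a_adv) + novelty_factor
```

How `novelty_factor` is computed is left open here, since the proposal does not define it.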

1.2.3 🧹 Discouraging Low-Quality or Adversarial Behavior

Against Miners:

  • Randomized attack strategies
  • Hidden test sets
  • Ensemble validators
  • Transfer attacks

Against Validators:

  • Attack validity checks
  • Bounded perturbation norms
  • Multi-validator consensus
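One simple way to picture multi-validator consensus is to take the median of the robust accuracies reported for a miner and flag validators that disagree strongly. This is an assumed aggregation rule for illustration; the names `consensus_accuracy` and `max_deviation` are hypothetical.

```python
from statistics import median

def consensus_accuracy(reports, max_deviation=0.1):
    """Aggregate per-validator robust accuracy reports for one miner.

    reports: dict mapping validator id -> reported robust accuracy.
    Returns (agreed_value, sorted list of outlier validator ids).
    """
    if not reports:
        raise ValueError("at least one validator report is required")
    agreed = median(reports.values())
    outliers = sorted(
        vid for vid, acc in reports.items() if abs(acc - agreed) > max_deviation
    )
    return agreed, outliers
```

The median is used because a single dishonest or faulty validator cannot move it far, unlike the mean.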

1.2.4 🧠 Proof of Intelligence

This subnet qualifies as a Proof of Intelligence because:

  • Robustness under adversarial attack is computationally non-trivial.

  • It requires:

    • Adversarial training
    • Regularization strategies
    • Model architecture sophistication
  • Validators must compute gradient-based adversarial examples.

The system proves:

  • Model generalization
  • Defense capability
  • Computational effort

Unlike raw inference subnets, this one measures resilience against strategic adversaries.


1.2.5 🧪 High-Level Algorithm

Epoch Loop

For each epoch:
    1. Validators sample hidden dataset batch
    2. Validators query miner model
    3. Generate adversarial samples (PGD/FGSM/etc)
    4. Evaluate clean and adversarial accuracy
    5. Compute score
    6. Normalize emissions
    7. Distribute rewards
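The epoch loop above can be sketched end to end with toy components. This is a sketch under strong assumptions: inputs are single numbers in [0, 1], each miner is a plain Python function, the "attack" simply probes both ends of the epsilon interval as a stand-in for PGD/FGSM, and the latency term is omitted. All names and default values are hypothetical.

```python
def clip(value, lo=0.0, hi=1.0):
    """Keep a perturbed input inside the assumed valid range."""
    return max(lo, min(hi, value))

def run_epoch(miners, batch, epsilon=0.15, alpha=0.6, beta=0.3, tau=2.0):
    """miners: dict name -> predict(x) -> label; batch: list of (x, label)."""
    scores = {}
    for name, predict in miners.items():
        clean_hits = 0
        robust_hits = 0
        for x, label in batch:
            clean_ok = predict(x) == label
            # Toy attack: query the miner at both ends of the epsilon interval.
            probes = [clip(x - epsilon), clip(x + epsilon)]
            robust_ok = clean_ok and all(predict(p) == label for p in probes)
            clean_hits += clean_ok
            robust_hits += robust_ok
        a_clean = clean_hits / len(batch)
        a_adv = robust_hits / len(batch)
        scores[name] = alpha * a_adv + beta * a_clean  # latency penalty omitted
    # Normalize scores into emission shares with temperature tau.
    powered = {name: score ** tau for name, score in scores.items()}
    total = sum(powered.values())
    return {name: (p / total if total else 0.0) for name, p in powered.items()}
```

A usage example with two toy miners, one of which has a misplaced decision threshold and is therefore easier to flip:

```python
batch = [(0.1, 0), (0.3, 0), (0.7, 1), (0.9, 1)]
miners = {
    "good": lambda x: int(x >= 0.5),
    "brittle": lambda x: int(x >= 0.35),
}
shares = run_epoch(miners, batch)
```

Both miners are perfect on clean inputs, but the brittle one loses a sample under perturbation and so receives the smaller share.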

2. ⛏️ Miner Design


2.1 🗂️ Miner Tasks

Miners must:

  • Host a classifier API

  • Accept batch inputs

  • Return:

    • Predicted class
    • Confidence score (optional)
  • Respond within latency constraints

Supported domains:

  • Image classification (e.g. CIFAR-style)
  • Tabular fraud detection
  • Signal classification

2.2 🔁 Expected Input → Output Format

Input (JSON)

{
  "task_id": "image_cifar",
  "batch": [
    { "input": "<base64_encoded_tensor>" }
  ]
}

Output

{
  "predictions": [
    { "label": 3, "confidence": 0.92 }
  ],
  "latency_ms": 38
}
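A miner-side handler for this request and response shape might look like the sketch below. The tensor encoding is an assumption: the proposal only says "base64 encoded tensor", so this example treats it as a flat list of little-endian float32 values. The names `encode_tensor`, `decode_tensor`, and `handle_request` are hypothetical.

```python
import base64
import json
import struct
import time

def encode_tensor(values):
    """Pack floats as little-endian float32 and base64-encode them (assumed format)."""
    return base64.b64encode(struct.pack(f"<{len(values)}f", *values)).decode("ascii")

def decode_tensor(encoded):
    """Inverse of encode_tensor."""
    raw = base64.b64decode(encoded)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))

def handle_request(request_json, classify):
    """classify(features) -> (label, confidence); returns the response as JSON."""
    start = time.perf_counter()
    request = json.loads(request_json)
    predictions = []
    for item in request["batch"]:
        label, confidence = classify(decode_tensor(item["input"]))
        predictions.append({"label": label, "confidence": confidence})
    latency_ms = round((time.perf_counter() - start) * 1000)
    return json.dumps({"predictions": predictions, "latency_ms": latency_ms})
```

Note that a self-reported `latency_ms` is easy to misstate, so a validator would presumably also time the round trip itself.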

2.3 📊 Performance Dimensions

| Dimension       | Weight |
|-----------------|--------|
| Robust Accuracy | High   |
| Clean Accuracy  | Medium |
| Latency         | Medium |
| Consistency     | Medium |

3. 🛡️ Validator Design


3.1 🧪 Scoring Methodology

Validators:

  1. Perform gradient estimation.

  2. Run:

    • FGSM
    • PGD (multi-step)
    • AutoAttack (optional advanced phase)
  3. Measure:

\[ \text{RobustAccuracy} = \frac{\text{Correct under attack}}{\text{Total}} \]
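To make the measurement concrete, here is a small FGSM sketch against a logistic-regression classifier, written in plain Python. It is a white-box illustration that assumes direct access to the weights `w` and bias `b`; a validator that only sees a miner's API would have to estimate the gradient instead, as step 1 describes. The model, the function names, and the omission of input-range clipping are all simplifying assumptions.

```python
import math

def predict_proba(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def sign(value):
    return (value > 0) - (value < 0)

def fgsm(w, b, x, y, epsilon):
    """One-step FGSM: x_adv = x + epsilon * sign(dLoss/dx).

    For binary cross-entropy on logistic regression, dLoss/dx = (p - y) * w.
    """
    p = predict_proba(w, b, x)
    return [xi + epsilon * sign((p - y) * wi) for wi, xi in zip(w, x)]

def robust_accuracy(w, b, samples, epsilon):
    """Fraction of (x, y) samples still classified correctly after FGSM."""
    correct = 0
    for x, y in samples:
        x_adv = fgsm(w, b, x, y, epsilon)
        predicted = int(predict_proba(w, b, x_adv) >= 0.5)
        correct += int(predicted == y)
    return correct / len(samples)
```

Calling `robust_accuracy` with `epsilon=0.0` leaves inputs unchanged and therefore returns the clean accuracy, which gives both quantities used in the miner score from one function.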

Validators submit:

  • Perturbed samples
  • Attack parameters
  • Result logs

3.2 ⏱️ Evaluation Cadence

  • Epoch-based scoring (e.g., every 100 blocks)
  • Rolling average to reduce variance
  • Randomized attack selection
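The rolling average and the randomized attack choice could be sketched as follows. The exponential moving average, the smoothing factor, and the idea of seeding the choice from a per-epoch value are assumptions made for illustration; the proposal does not specify them.

```python
import random

ATTACKS = ["FGSM", "PGD", "AutoAttack"]

def update_rolling_score(previous, new_score, smoothing=0.2):
    """Exponential moving average of a miner's per-epoch score."""
    if previous is None:
        return new_score  # first epoch: nothing to average with yet
    return (1 - smoothing) * previous + smoothing * new_score

def pick_attack(epoch_seed, attacks=ATTACKS):
    """Choose an attack pseudo-randomly but reproducibly for a given epoch seed.

    Assumption: the seed comes from something miners cannot predict in advance.
    """
    return random.Random(epoch_seed).choice(attacks)
```

A lower `smoothing` value reduces variance further but makes the score slower to reflect a genuine improvement or regression.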

3.3 🤝 Validator Incentive Alignment

Validators earn more if:

  • They discover new vulnerabilities.
  • They reduce miner robustness significantly.
  • Their attack validity is confirmed by peers.

Validators must stake to participate, which discourages spam attacks.


4. 💼 Business Logic & Market Rationale


4.1 ❗ The Problem

Adversarial attacks threaten:

  • Autonomous vehicles
  • Fraud detection systems
  • AI medical diagnostics
  • Financial AI systems

Most deployed AI models are:

  • Not adversarially tested
  • Easily manipulated
  • Vulnerable to gradient-based attacks

Robustness testing today is:

  • Centralized
  • Expensive
  • Static

We create:

A decentralized, continuous robustness benchmark.


4.2 🆚 Competing Solutions

Outside Bittensor:

  • RobustBench
  • Academic benchmarks
  • Internal red-teaming

Limitations:

  • Static datasets
  • No economic incentives
  • No adversarial evolution

Within Bittensor:

  • General inference subnets
  • LLM scoring subnets

None focus on adversarial ML robustness.


4.3 🧬 Why Bittensor?

Bittensor is uniquely suited because:

  • It supports adversarial competition.
  • Emissions reward measurable performance.
  • Validators can evolve attacks.
  • Miners continuously improve.

It creates:

A live adversarial ecosystem.


4.4 🛣️ Path to Sustainable Business

Possible monetization:

  1. Enterprise robustness certification
  2. API access to robustness leaderboard
  3. Insurance underwriting input
  4. White-label adversarial testing

Long term:

  • Robustness Score as on-chain primitive
  • Security oracle for AI systems

5. 🚀 Go-To-Market Strategy


5.1 🎯 Initial Target Users

  • AI startups deploying classifiers
  • Web3 AI protocols
  • Security-focused AI labs
  • Research institutions

Early dataset domains:

  • Fraud detection
  • Crypto transaction anomaly detection
  • Image moderation systems

5.2 📣 Distribution Channels

  • Crypto AI Twitter
  • Bittensor ecosystem partners
  • Research publications
  • Hackathon demos
  • Open leaderboard website

5.3 🎁 Incentives for Early Participation

Bootstrapping Miners

  • Bonus emission multiplier for first N epochs
  • Early adopter NFT badge

Bootstrapping Validators

  • Higher reward multiplier for novel attacks
  • Bounty pool for breaking top miner

Bootstrapping Users

  • Free robustness evaluation for first 100 external models

6. 🏗️ Extended Architecture Diagram

                    +------------------+
                    |   Hidden Dataset |
                    +------------------+
                             |
                             v
+----------+        +------------------+        +------------+
|  Miner A | <----> |   Validators     | <----> |  Miner B   |
+----------+        |  (Attack Engine) |        +------------+
       |            +------------------+
       v                      |
  Robust Model                v
                       Score Aggregation
                              |
                              v
                        Emission Split

7. 🔭 Long-Term Vision

Phase 1:

  • Image & tabular classification robustness

Phase 2:

  • LLM jailbreak resistance
  • Multimodal robustness

Phase 3:

  • On-chain AI security oracle
  • AI robustness insurance market

8. 🌙 Why This Is a Moonshot

Although it may look like an academic exercise at first, this subnet can become:

  • The security layer of AI
  • The robustness oracle for autonomous systems
  • The on-chain benchmark for trustworthy intelligence

As AI integrates into finance, robotics, and defense:

Robustness becomes more valuable than raw intelligence.

We are building:

A decentralized adversarial intelligence arms race.


9. ✅ Closing Statement

The Adversarial Robustness Subnet transforms adversarial ML from an academic benchmark into a live economic competition.

It aligns:

  • Cryptoeconomics
  • Security engineering
  • Machine learning research

Into a single measurable signal:

Accuracy under attack.

This is not just proof of inference.

This is proof of resilience.
