Validator update — rank-based scoring + reliability fixes
Date: 2026-05-01 · PRs: #91, #92, #93
Three back-to-back validator PRs landed (or are landing) that change how miners are scored and how validators handle the round lifecycle. Headline: scoring is now rank-based, not delta-based, and validators are quieter and more disk-efficient.
What changed for miners
To help the subnet reach consensus more cleanly, we have deployed the following measures.
Rank-based scoring (PR #93)
A round's reward is no longer (baseline_loss − val_loss) ** 1.2. Instead, the validator finalizes each round by ranking its evaluated miners and writing a discrete score:
| Rank in round | Score |
|---|---|
| 1st | 2.25 |
| 2nd | 1.5 |
| 3rd | 1.0 |
| Everyone else (incl. delta ≤ 0) | 0 |
A miner's chain weight is still the rolling average of these per-round scores over the score window, so consistency across rounds is what compounds. A single great round won't carry you — placing multiple rounds in a row is what builds weight.
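The scoring rule and the rolling average can be sketched as follows. This is a minimal illustration, not the validator's actual code: the function names are hypothetical, the losses are assumed to be distinct, and it assumes miners with delta ≤ 0 are dropped before ranking rather than occupying a rank.

```python
from collections import deque

# Scores for 1st, 2nd and 3rd place, taken from the table above.
RANK_SCORES = (2.25, 1.5, 1.0)


def round_scores(val_losses: dict[str, float], baseline_loss: float) -> dict[str, float]:
    """Return the discrete score of every evaluated miner for one round."""
    scores = {miner: 0.0 for miner in val_losses}
    # Assumption: a miner that does not beat the baseline (delta <= 0)
    # is excluded before ranking, so it cannot take a top-3 slot.
    eligible = [m for m, loss in val_losses.items() if baseline_loss - loss > 0]
    eligible.sort(key=lambda m: val_losses[m])  # lowest val_loss ranks first
    for miner, score in zip(eligible, RANK_SCORES):
        scores[miner] = score
    return scores


def rolling_average(history: deque) -> float:
    """Average of the per-round scores currently inside the score window."""
    return sum(history) / len(history) if history else 0.0
```

With a hypothetical window of four rounds, one win followed by three misses averages 2.25 / 4 = 0.5625, while four consecutive third places average 1.0, which is why consistency outweighs a single great round.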
Why we made this change:
- Sharper separation between miners. Continuous delta-based scoring let near-identical models accumulate near-identical rewards, which made the subnet winner non-deterministic from one round to the next. With each round having ~12 competitors, collapsing everyone outside the top-3 to the same score produces a clear signal: only the round leaders matter, and the rest are treated as performance-equivalent.
- No reward for duplicating a top model. Under the old scheme, copying a top miner's checkpoint guaranteed a comparable reward. Under rank-based scoring, you only earn anything by placing top-3 yourself, so submitting an identical or near-identical copy of someone else's model has no upside.
- Note that this doesn't change the fact that each validator only sets weight on its top-3 miners. If your performance hovers around the top-3, your weight from a validator would still be ~1/3.
Tied losses score zero (PR #93)
If two miners report the exact same val_loss, both get 0 for the round. Identical losses across distinct submissions almost always indicate a duplicated checkpoint, not parallel improvement. We penalize both sides rather than try to disambiguate.
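The tie rule can be sketched like this. It is an illustrative example under stated assumptions: the helper names are hypothetical, and the text does not say whether the remaining miners move up a rank once tied miners are zeroed, so the sketch only overrides the tied miners' scores.

```python
from collections import Counter


def tied_miners(val_losses: dict[str, float]) -> set[str]:
    """Miners whose val_loss is exactly equal to another miner's."""
    counts = Counter(val_losses.values())
    return {m for m, loss in val_losses.items() if counts[loss] > 1}


def apply_tie_penalty(scores: dict[str, float], val_losses: dict[str, float]) -> dict[str, float]:
    """Force the round score of every tied miner to 0; leave the others as-is."""
    tied = tied_miners(val_losses)
    return {m: (0.0 if m in tied else s) for m, s in scores.items()}
```

Exact equality is used on purpose here, matching the "exact same val_loss" wording; a tolerance-based comparison would be a different rule.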
Why we made this change: This is part 1 of the duplicated submission countermeasure. Suppose you have a top-1 model A and try to submit it twice — both copies score 0 because they tie. The natural workaround is to submit A and A' = A + noise, but A' is by construction worse than A, which means A' risks falling out of the top-3 entirely. The duplicator now has to choose between forfeiting the second submission (tied → 0) or degrading it (and risking the rank-3 cutoff).
Part 2 of this countermeasure ships next week and removes the remaining incentive for A' to sit at rank 2 or 3.
Background queue re-checks the top-5 first (PR #93)
A leader who silently regresses used to ride a stale EMA for several rounds because pure staleness ordering deferred their re-eval. The background segment now starts with the five miners that have the highest prior average score, then falls through to the staleness rotation.
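One way to picture the new ordering is the sketch below. The function and parameter names are hypothetical, and it assumes staleness means "least recently evaluated goes first"; the real queue may track staleness differently.

```python
def background_order(
    miners: list[str],
    prior_avg_score: dict[str, float],
    last_eval_round: dict[str, int],
    top_n: int = 5,
) -> list[str]:
    """Order the background segment: current leaders first, then the stalest miners."""
    # Leaders: the top_n miners by prior average score, highest first.
    leaders = sorted(miners, key=lambda m: prior_avg_score.get(m, 0.0), reverse=True)[:top_n]
    leader_set = set(leaders)
    # Everyone else: assumed staleness rotation, oldest evaluation first.
    # Miners never evaluated (missing entry) are treated as the stalest.
    rest = sorted(
        (m for m in miners if m not in leader_set),
        key=lambda m: last_eval_round.get(m, -1),
    )
    return leaders + rest
```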
Why we made this change: It keeps the leaderboard honest. Re-evaluating the current top-5 every round means a regression gets caught on the very next round instead of bleeding into many. It also gives the rest of the field a stable comparison baseline, so a miner can no longer drift to the top by chance — they have to outperform the verified incumbents.
What changed for validators
Two operator-side optimizations to keep disk usage and HF bandwidth in proportion to what the cycle actually needs.
Disk usage drops by ~10× (PR #92)
Both eval paths now delete on-disk .pt files for non-top miners as soon as they're processed. For a 60-miner roster with top_k_miners_to_reward = 3, the miner_submission/ directory shrinks from O(roster) to O(top_k + currently_downloading + pending_eval).
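The pruning step might look roughly like this. It is a sketch only: the one-file-per-miner layout (`<miner_id>.pt` directly under the submission directory) and the function name are assumptions made for illustration, not the validator's actual layout.

```python
from pathlib import Path


def prune_checkpoints(submission_dir: Path, keep: set[str]) -> list[str]:
    """Delete every .pt file whose miner id is not in `keep`.

    `keep` is expected to hold the top-k miners plus anything that is
    currently downloading or still pending evaluation.
    """
    removed = []
    for ckpt in sorted(submission_dir.glob("*.pt")):
        if ckpt.stem not in keep:  # assumed layout: file stem == miner id
            ckpt.unlink()
            removed.append(ckpt.stem)
    return removed
```

Calling it with `keep = top_k | downloading | pending_eval` right after each evaluation keeps the directory at the O(top_k + currently_downloading + pending_eval) size described above.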
bg-download backpressure (PR #92)
When bg-eval has more than 10 fresh checkpoints sitting unconsumed, the download worker pauses until the queue drains. HF bandwidth now goes to the next 10 useful checkpoints, not the full roster.
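A backpressure gate of this kind can be sketched with a condition variable. The class and method names are hypothetical, and the sketch assumes the worker resumes as soon as the unconsumed count is back at or below the threshold; the actual validator may wait for a fuller drain.

```python
import threading


class DownloadGate:
    """Pause the download worker while too many checkpoints sit unconsumed."""

    def __init__(self, max_unconsumed: int = 10):
        self._max = max_unconsumed
        self._unconsumed = 0
        self._cond = threading.Condition()

    def should_pause(self) -> bool:
        with self._cond:
            return self._unconsumed > self._max

    def wait_for_room(self, timeout=None) -> bool:
        """Block until the backlog is at or below the threshold (or timeout)."""
        with self._cond:
            return self._cond.wait_for(lambda: self._unconsumed <= self._max, timeout)

    def downloaded(self) -> None:
        """Called by the download worker after each finished checkpoint."""
        with self._cond:
            self._unconsumed += 1

    def consumed(self) -> None:
        """Called by bg-eval after it has processed one checkpoint."""
        with self._cond:
            self._unconsumed = max(0, self._unconsumed - 1)
            self._cond.notify_all()
```

In this sketch the download loop would call `wait_for_room()` before fetching the next checkpoint and `downloaded()` after it lands, while the eval loop calls `consumed()` once per processed checkpoint.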