Release v7.7.7 — leaderboard CLI + concrete baselines + CI auto-verification · fathom-lab/styxx

styxx 7.7.7 — `styxx leaderboard` CLI + concrete reference baselines + CI auto-verification

The empirical floor is now a runnable, terminal-accessible, CI-verified public challenge.

Added

styxx leaderboard: lightweight CLI that displays the current gauntlet leaderboard. It reads the bundled LEADERBOARD.md from styxx/_data/, so it works on a clean pip install. The --rows-only flag filters the output to just the leaderboard rows for quick scanning.
submissions/baseline_002_classifier/: the shipped dark-core classifier wrapped in the gauntlet. 1/3 bars passed (K2 accuracy 0.77 ✓; K1 F1 0.42 ✗; K3 F1 0.36 ✗).
submissions/baseline_003_length/: a deliberately bad length-only heuristic. 0/3 bars passed; anchors the leaderboard floor with a real numeric row.
.github/workflows/gauntlet-pr.yml: CI workflow that auto-verifies external submission PRs by checking reported numbers against re-verified ones (1e-3 float tolerance).
submissions/GAUNTLET.md: submission protocol documentation.
styxx/_data/LEADERBOARD.md bundled as package data.
4 new tests; full suite: 1083 passed, 8 skipped.
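
The 1e-3 float tolerance mentioned for the CI workflow can be pictured with a short sketch. This is not the code in gauntlet-pr.yml: the helper name `verify_reported`, the metric keys, the choice of an absolute (rather than relative) tolerance, and all numbers below are assumptions made up for illustration.

```python
import math

# Hypothetical helper, not part of styxx. Assumes an absolute tolerance of 1e-3.
TOLERANCE = 1e-3


def verify_reported(reported, recomputed, tol=TOLERANCE):
    """Return the names of metrics whose reported value is missing from,
    or differs by more than `tol` from, the recomputed value."""
    mismatches = []
    for name, claimed in reported.items():
        actual = recomputed.get(name)
        if actual is None or not math.isclose(
            claimed, actual, rel_tol=0.0, abs_tol=tol
        ):
            mismatches.append(name)
    return mismatches


# Illustrative numbers only; they are not taken from any real submission.
reported = {"metric_a": 0.50, "metric_b": 0.80, "metric_c": 0.30}
recomputed = {"metric_a": 0.5004, "metric_b": 0.7998, "metric_c": 0.3120}

print(verify_reported(reported, recomputed))  # ['metric_c']
```

Under these assumptions, a submission whose reported numbers drift from the re-run by less than the tolerance passes, while a larger gap is flagged by metric name.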

What this delivers

pip install styxx==7.7.7
styxx leaderboard --rows-only         # see the floor
# ... write method.py + submission.json ...
styxx gauntlet --method submissions.<name>.method:predict --task classification
# ... open PR; CI re-verifies reported numbers ...
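
The `--method` argument above takes a `module.path:function` string. How styxx resolves that string internally is not stated in these notes; one common way to resolve such a spec in Python is with `importlib`, sketched below. `load_callable` is a hypothetical name, and the example resolves a standard-library function so that it runs without styxx installed.

```python
import importlib


def load_callable(spec):
    """Resolve a 'package.module:attribute' string to the callable it names.

    Hypothetical helper for illustration; not the styxx implementation.
    """
    module_path, sep, attr = spec.partition(":")
    if not sep or not module_path or not attr:
        raise ValueError(f"expected 'module:function', got {spec!r}")
    module = importlib.import_module(module_path)
    obj = getattr(module, attr)
    if not callable(obj):
        raise TypeError(f"{spec!r} does not name a callable")
    return obj


# Stand-in spec; a gauntlet submission would use something shaped like
# 'submissions.<name>.method:predict' instead.
sqrt = load_callable("math:sqrt")
print(sqrt(9.0))  # 3.0
```

Splitting on the colon keeps the module path and the attribute name unambiguous, which is why this form is widely used for entry-point style arguments.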

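Because CI re-verifies the numbers reported in every external submission PR (within a 1e-3 float tolerance), the leaderboard is trustworthy by construction.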

🤖 Generated with Claude Code

Zenodo DOI: 10.5281/zenodo.20418532 (v24 in the concept chain at 10.5281/zenodo.19326174; published 2026-05-27).
