feat: add model comparison arena with ELO scoring #25

Open
wydrox wants to merge 1 commit into main from feat/model-arena

wydrox (Contributor) commented Mar 26, 2026

Summary

  • ppmlx arena <model1> <model2> CLI with Rich side-by-side display
  • Web UI at /arena with split-screen comparison and voting
  • ELO scoring (K=32, start 1500) persisted in ~/.ppmlx/arena.db
  • Leaderboard with win rate and match count
  • 55 new tests pass
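
For reviewers unfamiliar with the rating math, here is a minimal sketch of a standard Elo update using the K=32 and 1500 starting values listed above. The function names `expected_score` and `update_elo` are illustrative assumptions, not necessarily the names used in this PR.

```python
K = 32                 # K-factor stated in the summary
START_RATING = 1500.0  # starting rating stated in the summary


def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = K):
    """Return the new (rating_a, rating_b) after one match.

    score_a is 1.0 if A won the vote, 0.0 if B won, 0.5 for a tie.
    (Whether the arena allows ties is not stated in the PR; hypothetical.)
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # The update is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Two fresh models, the first one wins the vote.
    print(update_elo(START_RATING, START_RATING, 1.0))  # (1516.0, 1484.0)
```

With equal ratings the expected score is 0.5, so a win moves each rating by K/2 = 16 points.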

🤖 Generated with Claude Code

Add side-by-side model comparison via the CLI and a web UI at /arena.
Send the same prompt to two models, display both responses, and let the user vote.
Persist ELO scores in SQLite and display a leaderboard.
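
As a rough illustration of how votes, ratings, and the leaderboard could fit together in SQLite, the sketch below uses Python's built-in `sqlite3` module. The table name `ratings`, its columns, and the helpers `open_db`, `record_vote`, and `leaderboard` are all hypothetical; the actual schema in this PR may differ.

```python
import sqlite3

K = 32          # K-factor from the PR summary
START = 1500.0  # starting rating from the PR summary


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the (hypothetical) ratings table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ratings ("
        "model TEXT PRIMARY KEY, "
        "rating REAL NOT NULL, "
        "wins INTEGER NOT NULL DEFAULT 0, "
        "matches INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def _rating(conn: sqlite3.Connection, model: str) -> float:
    """Fetch a model's rating, inserting it at the start rating if unseen."""
    conn.execute(
        "INSERT OR IGNORE INTO ratings (model, rating) VALUES (?, ?)",
        (model, START),
    )
    row = conn.execute(
        "SELECT rating FROM ratings WHERE model = ?", (model,)
    ).fetchone()
    return row[0]


def record_vote(conn: sqlite3.Connection, winner: str, loser: str) -> None:
    """Apply one Elo update for a vote and persist both rows."""
    rw, rl = _rating(conn, winner), _rating(conn, loser)
    expected_w = 1.0 / (1.0 + 10 ** ((rl - rw) / 400.0))
    delta = K * (1.0 - expected_w)
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE ratings SET rating = rating + ?, wins = wins + 1, "
            "matches = matches + 1 WHERE model = ?",
            (delta, winner),
        )
        conn.execute(
            "UPDATE ratings SET rating = rating - ?, matches = matches + 1 "
            "WHERE model = ?",
            (delta, loser),
        )


def leaderboard(conn: sqlite3.Connection):
    """Return (model, rating, matches, win_rate) rows, best rating first."""
    return conn.execute(
        "SELECT model, rating, matches, "
        "CASE WHEN matches > 0 THEN 1.0 * wins / matches ELSE 0.0 END "
        "FROM ratings ORDER BY rating DESC"
    ).fetchall()
```

Passing a file path such as the `~/.ppmlx/arena.db` location named in the summary (expanded with `os.path.expanduser`) instead of `":memory:"` would make the ratings survive across runs.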

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>