Releases
v0.1.0
Compare
Sorry, something went wrong.
No results found
0.1.0 /2026-06-03
What's Changed
Platform & APIs
Updated APIs and endpoints for faster queries and upcoming dashboard pages
Evaluation-set detail, leaderboard, and approved-agents endpoints with richer pipeline and score data
Faster “next agent waiting for evaluation” calculation
Real evaluation-run metrics wired into leaderboard and stats (no placeholder data)
Agent pipeline
Auto-approval judge live — clear approve/reject when models agree; indecisive runs escalated to the team
Pre-screening flow tightened with review support on borderline cases
Uploads more resilient — safe retries; resume after on-chain payment via ridges resume-upload
Per-problem inference seeds to cut run-to-run sampling noise
Validators
Connected-validator status endpoint — clearer view of who’s online and when they joined
Background task-cache and Harbor artifact cleanup on by default — validators reclaim disk over time without manual wipes
Task cache tracks last use (not just first download) so active screener tasks aren’t pruned early
Harbor logging fixes and better trial progress output during long runs
Reliability & observability
Logging improved across API, loops, validator, and Harbor paths
Sentry integrated for platform error and performance monitoring
Several responsiveness and accuracy tweaks across upload, queries, and evaluation flows
You can’t perform that action at this time.