Backend inference-server calls can now be routed through an optional Undici proxy configured with AITESTBENCH_INFERENCE_PROXY and AITESTBENCH_INFERENCE_NO_PROXY, without exposing proxy settings to the frontend.
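As an illustration of how the two variables might interact, here is a minimal sketch of a bypass check. The helper names `shouldBypassProxy` and `resolveProxyFor` are hypothetical, and the matching rules (comma-separated hosts, domain suffixes, and `*`) are an assumed convention, not the application's confirmed behaviour. Ports and CIDR ranges in the bypass list are not handled.

```typescript
// Hypothetical helper: true when the target host matches an entry in the
// comma-separated bypass list. The matching rules are assumptions.
function shouldBypassProxy(targetUrl: string, noProxy: string | undefined): boolean {
  if (!noProxy) return false;
  const host = new URL(targetUrl).hostname.toLowerCase();
  const entries = noProxy
    .split(",")
    .map((entry) => entry.trim().toLowerCase())
    .filter((entry) => entry.length > 0);
  for (const entry of entries) {
    if (entry === "*") return true;
    // Treat ".example.com" and "example.com" the same way.
    const domain = entry.startsWith(".") ? entry.slice(1) : entry;
    if (host === domain || host.endsWith("." + domain)) return true;
  }
  return false;
}

// Hypothetical helper: returns the proxy URL to use for a backend call,
// or undefined when the call should go out directly.
function resolveProxyFor(
  targetUrl: string,
  env: Record<string, string | undefined>,
): string | undefined {
  const proxy = env.AITESTBENCH_INFERENCE_PROXY;
  if (!proxy) return undefined;
  return shouldBypassProxy(targetUrl, env.AITESTBENCH_INFERENCE_NO_PROXY)
    ? undefined
    : proxy;
}
```

A backend could pass the returned URL to an Undici `ProxyAgent` and use that agent as the request dispatcher. Because the decision is made server-side from environment variables, nothing about the proxy needs to reach the frontend.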
Changed
CI and release workflows now run on Node.js 22 to match current backend dependency requirements.
Fixed
Results dashboard performance graphs now link repeated runs from the same template/model into one series even when generated active test IDs differ.
Results dashboard merged metric graphs now keep different models as separate lines instead of collapsing same-test metrics together.
Results dashboard default date ranges now include the newest result even when its timestamp has seconds or milliseconds, preventing single-run dashboards from appearing empty.
The Settings "Empty database" action now clears all application SQLite tables, including the evaluation prompts and evaluations that feed the leaderboard.
Leaderboard view now clears stale displayed rows immediately after the database is emptied from settings.
Architecture inspection errors now show visible, non-empty diagnostics in the model detail page instead of leaving only a red button state.
MLX architecture inspection now uses config-backed estimation directly, avoiding PyTorch-dependent AutoModel construction and allowing models such as /inferencerlabs/Qwen3-Coder-30B-A3B-Instruct-MLX-6.5bit to inspect successfully from config.json.
Architecture inspector subprocess failures now include captured output or an explicit timeout diagnostic when the Python process exits or is killed without a structured error.
Models page filters now infer provider, quantized provider, format, quantization bit-depth, and use-case metadata from discovered model IDs, and collapse provider-prefixed aliases so the model filter shows clean base model names only.
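The model-ID inference in the last entry can be sketched roughly as follows. This is an illustrative approximation only: `inferModelMetadata`, `uniqueBaseNames`, and the `KNOWN_FORMATS` list are hypothetical names and values, and the real parsing rules may differ. The sketch covers provider, format, bit-depth, and base name; it does not attempt use-case inference or spellings such as `4-bit`.

```typescript
interface ModelMetadata {
  provider: string | null; // path segment before the model name, if any
  baseName: string;        // model name without format and bit-depth tokens
  format: string | null;   // e.g. "MLX"
  bits: number | null;     // quantization bit-depth, e.g. 6.5
}

// Assumed set of format tokens; the application's actual list is not known.
const KNOWN_FORMATS = ["MLX", "GGUF", "AWQ", "GPTQ"];
const BITS_TOKEN = /^(\d+(?:\.\d+)?)bit$/i;

function inferModelMetadata(modelId: string): ModelMetadata {
  const segments = modelId.split("/").filter((s) => s.length > 0);
  const name = segments[segments.length - 1] ?? "";
  const provider =
    segments.length > 1 ? segments[segments.length - 2] ?? null : null;

  const tokens = name.split(/[-_]/).filter((t) => t.length > 0);
  let format: string | null = null;
  let bits: number | null = null;
  const baseTokens: string[] = [];

  for (const token of tokens) {
    const bitsMatch = BITS_TOKEN.exec(token);
    if (bitsMatch && bitsMatch[1] !== undefined) {
      bits = Number(bitsMatch[1]);
    } else if (KNOWN_FORMATS.includes(token.toUpperCase())) {
      format = token.toUpperCase();
    } else {
      baseTokens.push(token);
    }
  }

  // Tokens are rejoined with "-", so "_" separators are normalized.
  return { provider, baseName: baseTokens.join("-"), format, bits };
}

// Collapse provider-prefixed aliases into a sorted list of unique base names.
function uniqueBaseNames(modelIds: string[]): string[] {
  const names = new Set(modelIds.map((id) => inferModelMetadata(id).baseName));
  return Array.from(names).sort();
}
```

Under these assumptions, `/inferencerlabs/Qwen3-Coder-30B-A3B-Instruct-MLX-6.5bit` yields provider `inferencerlabs`, format `MLX`, bit-depth `6.5`, and base name `Qwen3-Coder-30B-A3B-Instruct`, so it shares one filter entry with an unprefixed `Qwen3-Coder-30B-A3B-Instruct`.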