v0.3.2

@github-actions released this 05 May 18:50
· 173 commits to main since this release

Added

  • Backend inference-server calls can now be routed through an optional Undici proxy configured with AITESTBENCH_INFERENCE_PROXY and AITESTBENCH_INFERENCE_NO_PROXY, without exposing proxy settings to the frontend.
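As a rough illustration of how the two variables might interact, the sketch below decides whether a given inference-server URL should go through the proxy. Only the two environment variable names come from these notes. The helper names, the comma-separated list format, and the suffix-matching rules are assumptions modelled on common NO_PROXY conventions, not the backend's actual logic.

```typescript
// Hypothetical helpers: the real routing logic in the backend may differ.
type Env = Record<string, string | undefined>;

// Returns true when `host` matches an entry in a comma-separated no-proxy list.
// Assumed rules: "*" matches everything; "example.com" and ".example.com"
// both match the domain itself and any subdomain. Ports are not considered.
function bypassesProxy(host: string, noProxy: string | undefined): boolean {
  if (!noProxy) return false;
  const target = host.toLowerCase();
  return noProxy
    .split(",")
    .map((entry) => entry.trim().toLowerCase())
    .filter((entry) => entry.length > 0)
    .some((entry) => {
      if (entry === "*") return true;
      const suffix = entry.startsWith(".") ? entry.slice(1) : entry;
      return target === suffix || target.endsWith("." + suffix);
    });
}

// Returns the proxy URL to use for `targetUrl`, or undefined for a direct connection.
function resolveInferenceProxy(targetUrl: string, env: Env): string | undefined {
  const proxy = env.AITESTBENCH_INFERENCE_PROXY?.trim();
  if (!proxy) return undefined; // proxy is optional: unset means direct calls
  const host = new URL(targetUrl).hostname;
  return bypassesProxy(host, env.AITESTBENCH_INFERENCE_NO_PROXY) ? undefined : proxy;
}
```

In a backend built on Undici, a non-undefined result could be handed to Undici's `ProxyAgent` and used as the dispatcher for that request. Because the decision is made server-side from environment variables, nothing about the proxy needs to reach the frontend.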

Changed

  • CI and release workflows now run on Node.js 22 to match current backend dependency requirements.

Fixed

  • Results dashboard performance graphs now link repeated runs from the same template/model into one series even when generated active test IDs differ.
  • Results dashboard merged metric graphs now keep different models as separate lines instead of collapsing same-test metrics together.
  • Results dashboard default date ranges now include the newest result even when its timestamp has seconds or milliseconds, preventing single-run dashboards from appearing empty.
  • The Empty database action in Settings now clears all application SQLite tables, including the evaluation prompts and evaluations that feed the leaderboard.
  • Leaderboard view now clears stale displayed rows immediately after the database is emptied from settings.
  • Architecture inspection errors now show visible, non-empty diagnostics in the model detail page instead of leaving only a red button state.
  • MLX architecture inspection now uses config-backed estimation directly, avoiding PyTorch-dependent AutoModel construction and allowing models such as /inferencerlabs/Qwen3-Coder-30B-A3B-Instruct-MLX-6.5bit to inspect successfully from config.json.
  • Architecture inspector subprocess failures now include captured output or an explicit timeout diagnostic when the Python process exits or is killed without a structured error.
  • Models page filters now infer provider, quantized provider, format, quantization bit-depth, and use-case metadata from discovered model IDs, and collapse provider-prefixed aliases so the model filter shows clean base model names only.
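Two of the results dashboard fixes listed above (linking repeated runs into one series, and default date ranges that include the newest result) can be sketched as follows. This is a minimal illustration under stated assumptions: the `RunResult` shape, its field names, and both function names are hypothetical, and rounding the range end up to the next whole minute is one plausible way to keep a timestamp with seconds or milliseconds inside the range, not necessarily what the project does.

```typescript
// Hypothetical result shape; field names are assumptions for illustration.
interface RunResult {
  activeTestId: string; // generated per run, so it differs between repeated runs
  templateId: string;
  model: string;
  timestamp: string; // ISO 8601, may carry seconds and milliseconds
}

// Group runs by template and model instead of by generated active test ID,
// so repeats form one series while different models stay on separate lines.
function groupIntoSeries(results: RunResult[]): Map<string, RunResult[]> {
  const series = new Map<string, RunResult[]>();
  for (const result of results) {
    const key = `${result.templateId}::${result.model}`;
    const bucket = series.get(key) ?? [];
    bucket.push(result);
    series.set(key, bucket);
  }
  return series;
}

// Default range: floor the start to the minute and push the end to the next
// whole minute, so the newest result always falls strictly before `end`.
function defaultDateRange(results: RunResult[]): { start: Date; end: Date } | undefined {
  if (results.length === 0) return undefined;
  const minuteMs = 60_000;
  const times = results.map((result) => Date.parse(result.timestamp));
  const start = new Date(Math.floor(Math.min(...times) / minuteMs) * minuteMs);
  const end = new Date((Math.floor(Math.max(...times) / minuteMs) + 1) * minuteMs);
  return { start, end };
}
```

With a single run, the range still spans a full minute around that run, which avoids the empty-dashboard case where a minute-precision end bound falls just before the result's timestamp.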