v0.3.2

github-actions released this 05 May 18:50


af8c388

Added

Backend inference-server calls can now be routed through an optional Undici proxy configured with AITESTBENCH_INFERENCE_PROXY and AITESTBENCH_INFERENCE_NO_PROXY, without exposing proxy settings to the frontend.
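As a rough illustration of how a no-proxy list like the one in AITESTBENCH_INFERENCE_NO_PROXY might be evaluated, here is a minimal sketch. The release notes do not describe the actual matching rules, so the helper name `shouldBypassProxy`, the comma-separated format, and the suffix-matching behaviour are all assumptions rather than the project's implementation.

```typescript
// Hypothetical helper, not taken from the project source.
// Decides whether a request URL should skip the proxy, given a
// comma-separated no-proxy list such as "localhost,127.0.0.1,.internal".
function shouldBypassProxy(targetUrl: string, noProxy: string | undefined): boolean {
  if (!noProxy) return false;

  const host = new URL(targetUrl).hostname.toLowerCase();
  const entries = noProxy
    .split(",")
    .map((entry) => entry.trim().toLowerCase())
    .filter((entry) => entry.length > 0);

  return entries.some((entry) => {
    // Assumed convention: "*" disables the proxy for every host.
    if (entry === "*") return true;
    // Match the host exactly, or as a subdomain of the entry.
    const bare = entry.replace(/^\./, "");
    return host === bare || host.endsWith(`.${bare}`);
  });
}
```

In a backend, the second argument would presumably come from the environment variable, and the result could decide whether a request is sent through a proxy dispatcher (for example Undici's `ProxyAgent`) or directly. Keeping that decision on the server is consistent with the note that proxy settings are never exposed to the frontend.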

Changed

CI and release workflows now run on Node.js 22 to match current backend dependency requirements.

Fixed

Results dashboard performance graphs now link repeated runs from the same template/model into one series even when generated active test IDs differ.
Results dashboard merged metric graphs now keep different models as separate lines instead of collapsing same-test metrics together.
Results dashboard default date ranges now include the newest result even when its timestamp has seconds or milliseconds, preventing single-run dashboards from appearing empty.
The Settings "Empty database" action now clears all application SQLite tables, including the evaluation prompts and evaluations that feed the leaderboard.
Leaderboard view now clears stale displayed rows immediately after the database is emptied from settings.
Architecture inspection errors now show visible, non-empty diagnostics in the model detail page instead of leaving only a red button state.
MLX architecture inspection now uses config-backed estimation directly, avoiding PyTorch-dependent AutoModel construction, so models such as /inferencerlabs/Qwen3-Coder-30B-A3B-Instruct-MLX-6.5bit can now be inspected successfully from config.json.
Architecture inspector subprocess failures now include captured output or an explicit timeout diagnostic when the Python process exits or is killed without a structured error.
Models page filters now infer provider, quantized provider, format, quantization bit-depth, and use-case metadata from discovered model IDs. Provider-prefixed aliases are collapsed so the model filter shows only clean base model names.
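The default date range fix listed above can be pictured with a small sketch. This is only one plausible approach, assuming the range is derived from the result timestamps and snapped to whole minutes. The names `defaultDateRange` and `isInRange` and the minute granularity are hypothetical and are not taken from the project.

```typescript
// Hypothetical sketch: names and the minute granularity are assumptions.
interface DateRange {
  start: Date;
  end: Date;
}

const MINUTE_MS = 60_000;

// Builds a default range that covers every timestamp, rounding the start
// down and the end up to whole minutes so that seconds and milliseconds
// never push the newest result outside the range.
function defaultDateRange(timestamps: Date[]): DateRange | null {
  if (timestamps.length === 0) return null;

  const times = timestamps.map((t) => t.getTime());
  const min = Math.min(...times);
  const max = Math.max(...times);

  return {
    start: new Date(Math.floor(min / MINUTE_MS) * MINUTE_MS),
    end: new Date(Math.ceil(max / MINUTE_MS) * MINUTE_MS),
  };
}

// Inclusive bounds check used when filtering results for display.
function isInRange(t: Date, range: DateRange): boolean {
  const ms = t.getTime();
  return ms >= range.start.getTime() && ms <= range.end.getTime();
}
```

With a single run recorded at an arbitrary example time such as 10:15:42.123, a range whose end was truncated to 10:15:00 would exclude that run and the dashboard would look empty. Rounding the end up keeps the run inside the range.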
