v0.5.15
Breaking Changes
- The ArabicMMLU scenario now includes the context field in the inputs; previously, the context field was incorrectly omitted, which made some instances unsolvable (#4218)
Models
- Add GPT-5.4, GPT-5.4 mini, and GPT-5.4 nano (#4145)
- Add Llama 4 Maverick on Vertex AI Llama 4 API Service (#4150)
- Add Qwen3.5 models on Together (#4152)
- Lazily load model in HuggingFaceClient (#4172)
- Add Gemini 3 Flash and Gemini 3.1 Flash-Lite (#4204)
- Work around Pydantic validation error in OpenAIResponseClient (#4223)
Scenarios
- Fix _apply_output_mapping_pattern returning wrong match results (#4192)
- Release Arabic Finance scenario (#4220)
- Fix missing context in ArabicMMLU scenario (#4218)
Frontend
- Add link to per_instance_stats.json on frontend (#4166)
Framework
- Suppress duplicate warnings from truncate_sequence (#4151)
- Default suite to "default" in helm-run and helm-summarize (#4155)
- Don't import PyTorch when registering run spec functions (#4221)
Contributors
Thank you to the following contributors for your work on this HELM release!