v0.5.14
Breaking Changes
- Previously, responses from models on the OpenAI, Anthropic Claude, OpenRouter, NVIDIA NIM, vLLM, and xAI Grok APIs were tokenized locally. These responses will no longer be tokenized. (#4108, #4113)
- The MadinahQA scenario will now include the context field in the inputs for the reading comprehension task (#4127)
Models
- Add Mistral Large 3, Mistral Medium 3.1, Mistral Small 3.2, and Ministral 3 (#4098)
- Add GPT-5.4 (#4099)
- Switch llama-4-scout-17b-16e-instruct to use Vertex AI instead of Together (#4104)
- Add the ability to use models from certain providers without manually configuring model_deployments.yaml (#4100, #4103, #4117):
  - The following providers are supported:
  - Note: Together models will now use the chat API rather than the text completions API by default (#4106)
- Remove tokenization from AnthropicMessagesClient (#4108)
- Remove tokenization from OpenAIClient (#4113)
Scenarios
- Fix MadinahQA scenario to include Context field for reading comprehension (#4127)
Framework
- Remove the human-evaluation optional dependency (#4101)
- Install dependencies for documentation on Read the Docs with uv (#4140)
Contributors
Thank you to the following contributors for your work on this HELM release!