Commit 0908618

Litellm stable release 06 14 2025 (#11737)
* docs: initial commit with stable release changelog notes
* docs: style updates
* docs(index.md): updated changelog
* docs(index.md): cleanup
* docs(index.md): add general proxy improvements
* docs: index.md cleanup
1 parent 327868f commit 0908618

File tree

3 files changed: +304 −6 lines changed
Lines changed: 242 additions & 0 deletions

---
title: "[PRE-RELEASE] v1.72.6-stable"
slug: "v1-72-6-stable"
date: 2025-06-14T10:00:00
authors:
  - name: Krrish Dholakia
    title: CEO, LiteLLM
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://pbs.twimg.com/profile_images/1298587542745358340/DZv3Oj-h_400x400.jpg
  - name: Ishaan Jaffer
    title: CTO, LiteLLM
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg
hide_table_of_contents: false
---

import Image from '@theme/IdealImage';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
## Deploy this version

:::info

This version is not out yet.

:::
## TLDR

* **Why Upgrade**
* **Who Should Read**
* **Risk of Upgrade**
---

## Key Highlights

---
## New / Updated Models

### Pricing / Context Window Updates

| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Type |
| -------- | ----- | -------------- | ------------------- | -------------------- | ---- |
| VertexAI | `vertex_ai/claude-opus-4` | 200K | $15.00 | $75.00 | New |
| OpenAI | `gpt-4o-audio-preview-2025-06-03` | 128K | $2.50 (text), $40.00 (audio) | $10.00 (text), $80.00 (audio) | New |
| OpenAI | `o3-pro` | 200K | $20.00 | $80.00 | New |
| OpenAI | `o3-pro-2025-06-10` | 200K | $20.00 | $80.00 | New |
| OpenAI | `o3` | 200K | $2.00 | $8.00 | Updated |
| OpenAI | `o3-2025-04-16` | 200K | $2.00 | $8.00 | Updated |
| Azure | `azure/gpt-4o-mini-transcribe` | 16K | $1.25 (text), $3.00 (audio) | $5.00 (text) | New |
| Mistral | `mistral/magistral-medium-latest` | 40K | $2.00 | $5.00 | New |
| Mistral | `mistral/magistral-small-latest` | 40K | $0.50 | $1.50 | New |

- Deepgram: `nova-3` cost-per-second pricing is [now supported](https://github.com/BerriAI/litellm/pull/11634).
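As an illustration only (not LiteLLM's internal cost-tracking code), the per-request cost implied by the table above is just the token counts weighted by the $/1M-token rates:

```python
# Illustrative sketch: compute a request's dollar cost from $/1M-token rates.
def request_cost(input_rate_per_1m: float, output_rate_per_1m: float,
                 input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of one request given per-1M-token rates."""
    return (input_tokens * input_rate_per_1m
            + output_tokens * output_rate_per_1m) / 1_000_000

# e.g. `o3` (Updated): $2/1M input, $8/1M output
cost = request_cost(2.0, 8.0, input_tokens=10_000, output_tokens=2_000)
print(round(cost, 4))  # 0.036
```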

### Updated Models

#### Bugs
- **Watsonx**
    - Ignore space id on Watsonx deployments (throws JSON errors) - [PR](https://github.com/BerriAI/litellm/pull/11527)
- **Ollama**
    - Set tool call id for streaming calls - [PR](https://github.com/BerriAI/litellm/pull/11528)
- **Gemini (VertexAI + Google AI Studio)**
    - Fix tool call indexes - [PR](https://github.com/BerriAI/litellm/pull/11558)
    - Handle empty string for arguments in function calls - [PR](https://github.com/BerriAI/litellm/pull/11601)
    - Add audio/ogg mime type support when inferring from file URLs - [PR](https://github.com/BerriAI/litellm/pull/11635)
- **Custom LLM**
    - Fix passing api_base, api_key, litellm_params_dict to custom_llm embedding methods - [PR](https://github.com/BerriAI/litellm/pull/11450) s/o [ElefHead](https://github.com/ElefHead)
- **Huggingface**
    - Add /chat/completions to endpoint URL when missing - [PR](https://github.com/BerriAI/litellm/pull/11630)
- **Deepgram**
    - Support async httpx calls - [PR](https://github.com/BerriAI/litellm/pull/11641)
- **Anthropic**
    - Append prefix (if set) to assistant content start - [PR](https://github.com/BerriAI/litellm/pull/11719)

#### Features
- **VertexAI**
    - Support vertex credentials set via env var on passthrough - [PR](https://github.com/BerriAI/litellm/pull/11527)
    - Support for choosing 'global' region when model is only available there - [PR](https://github.com/BerriAI/litellm/pull/11566)
    - Anthropic passthrough cost calculation + token tracking - [PR](https://github.com/BerriAI/litellm/pull/11611)
    - Support 'global' vertex region on passthrough - [PR](https://github.com/BerriAI/litellm/pull/11661)
- **Anthropic**
    - 'none' tool choice param support - [PR](https://github.com/BerriAI/litellm/pull/11695)
- **Perplexity**
    - Add 'reasoning_effort' support - [PR](https://github.com/BerriAI/litellm/pull/11562)
- **Mistral**
    - Add mistral reasoning support - [PR](https://github.com/BerriAI/litellm/pull/11642)
- **SGLang**
    - Map context window exceeded error for proper handling - [PR](https://github.com/BerriAI/litellm/pull/11575/)
- **Deepgram**
    - Provider-specific params support - [PR](https://github.com/BerriAI/litellm/pull/11638)
- **Azure**
    - Return content safety filter results - [PR](https://github.com/BerriAI/litellm/pull/11655)
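Params like `reasoning_effort` are passed like any other OpenAI-compatible field. A minimal sketch (the model name and values are placeholders; this only builds the request payload rather than sending it):

```python
# Sketch of an OpenAI-compatible payload using `reasoning_effort`.
# Model name and message content are illustrative placeholders.
payload = {
    "model": "perplexity/sonar-reasoning",
    "messages": [{"role": "user", "content": "Why is the sky blue?"}],
    "reasoning_effort": "high",  # typically "low" | "medium" | "high"
}
print(payload["reasoning_effort"])  # high
```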
---

## LLM API Endpoints

#### Bugs
- **Chat Completion**
    - Streaming - Ensure consistent 'created' across chunks - [PR](https://github.com/BerriAI/litellm/pull/11528)

#### Features
- **MCP**
    - Add controls for MCP Permission Management - [PR](https://github.com/BerriAI/litellm/pull/11598)
    - Add permission management for MCP List + Call Tool operations - [PR](https://github.com/BerriAI/litellm/pull/11682)
    - Streamable HTTP server support - [PR](https://github.com/BerriAI/litellm/pull/11628), [PR](https://github.com/BerriAI/litellm/pull/11645)
    - Use experimental dedicated REST endpoints for listing and calling MCP tools - [PR](https://github.com/BerriAI/litellm/pull/11684)
- **Responses API**
    - NEW API Endpoint - List input items - [PR](https://github.com/BerriAI/litellm/pull/11602)
    - Background mode for OpenAI + Azure OpenAI - [PR](https://github.com/BerriAI/litellm/pull/11640)
    - Langfuse/other logging support on Responses API requests - [PR](https://github.com/BerriAI/litellm/pull/11685)
- **Chat Completions**
    - Bridge for Responses API - allows calling codex-mini via `/chat/completions` and `/v1/messages` - [PR](https://github.com/BerriAI/litellm/pull/11632), [PR](https://github.com/BerriAI/litellm/pull/11685)
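A sketch of what a "list input items" call might look like against a proxy (the base URL, key, and response id below are placeholders, and the path mirrors OpenAI's Responses API convention; this only constructs the request, it does not send it):

```python
# Sketch: build a "list input items" request against a LiteLLM proxy.
# BASE_URL, the key, and response_id are illustrative placeholders.
BASE_URL = "http://localhost:4000"   # assumed local proxy address
response_id = "resp_abc123"          # hypothetical Responses API id

url = f"{BASE_URL}/v1/responses/{response_id}/input_items"
headers = {"Authorization": "Bearer sk-1234"}  # demo key from this page

print(url)  # http://localhost:4000/v1/responses/resp_abc123/input_items
```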
---

## Spend Tracking

#### Bugs
- **End Users**
    - Update end-user spend and budget reset date based on budget duration - [PR](https://github.com/BerriAI/litellm/pull/8460) (s/o [laurien16](https://github.com/laurien16))
- **Custom Pricing**
    - Convert scientific notation str to int - [PR](https://github.com/BerriAI/litellm/pull/11655)
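The budget-reset fix amounts to deriving the next reset time from the last reset plus the budget duration. A minimal sketch (the `"30d"`-style duration format here is an assumption for illustration, not LiteLLM's exact parser):

```python
from datetime import datetime, timedelta

# Sketch: compute the next budget reset from a duration string like
# "30d" or "12h". Format is illustrative only.
def next_reset(last_reset: datetime, duration: str) -> datetime:
    units = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}
    value, unit = int(duration[:-1]), duration[-1]
    return last_reset + timedelta(**{units[unit]: value})

print(next_reset(datetime(2025, 6, 14), "30d"))  # 2025-07-14 00:00:00
```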
---

## Management Endpoints / UI

#### Bugs
- **Users**
    - `/user/info` - fix passing user with `+` in user id
    - Add admin-initiated password reset flow - [PR](https://github.com/BerriAI/litellm/pull/11618)
    - Fix default user settings UI rendering error - [PR](https://github.com/BerriAI/litellm/pull/11674)
- **Budgets**
    - Correct success message when new user budget is created - [PR](https://github.com/BerriAI/litellm/pull/11608)

#### Features
- **Leftnav**
    - Show remaining Enterprise users on UI
- **MCP**
    - New server add form - [PR](https://github.com/BerriAI/litellm/pull/11604)
    - Allow editing MCP servers - [PR](https://github.com/BerriAI/litellm/pull/11693)
- **Models**
    - Add Deepgram models on UI
    - Model Access Group support on UI - [PR](https://github.com/BerriAI/litellm/pull/11719)
- **Keys**
    - Trim long user IDs - [PR](https://github.com/BerriAI/litellm/pull/11488)
- **Logs**
    - Add live tail feature to logs view, allowing users to disable auto-refresh in high traffic - [PR](https://github.com/BerriAI/litellm/pull/11712)
    - Audit Logs - preview screenshot - [PR](https://github.com/BerriAI/litellm/pull/11715)
---

## Logging / Guardrails Integrations

#### Bugs
- **Arize**
    - Change space_key header to space_id - [PR](https://github.com/BerriAI/litellm/pull/11595) (s/o [vanities](https://github.com/vanities))
- **Prometheus**
    - Fix total requests increment - [PR](https://github.com/BerriAI/litellm/pull/11718)

#### Features
- **Lasso Guardrails**
    - [NEW] Lasso Guardrails support - [PR](https://github.com/BerriAI/litellm/pull/11565)
- **Users**
    - New `organizations` param on `/user/new` - allows adding users to orgs on creation - [PR](https://github.com/BerriAI/litellm/pull/11572/files)
- **Prevent double logging when using bridge logic** - [PR](https://github.com/BerriAI/litellm/pull/11687)
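The Arize fix above is just a header rename; a minimal sketch of the shape of the change (values and the helper function are illustrative, not LiteLLM's actual integration code):

```python
# Sketch of the header change described above: send "space_id" where
# older clients sent "space_key". Values are placeholders.
def arize_headers(space_id: str, api_key: str) -> dict:
    return {
        "space_id": space_id,  # previously sent as "space_key"
        "api_key": api_key,
    }

print(arize_headers("my-space", "my-key"))
```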
---

## Performance / Reliability Improvements

#### Bugs
- **Tag-based routing**
    - Do not consider 'default' models when request specifies a tag - [PR](https://github.com/BerriAI/litellm/pull/11454) (s/o [thiagosalvatore](https://github.com/thiagosalvatore))

#### Features
- **Caching**
    - New optional `litellm[caching]` pip install for adding disk cache dependencies - [PR](https://github.com/BerriAI/litellm/pull/11600)
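The tag-routing fix can be sketched as a selection rule (data shapes here are assumptions, not LiteLLM's internals): `default`-tagged deployments are only eligible when the request carries no tags.

```python
# Sketch: pick deployments matching the request's tags; "default"-tagged
# deployments are used only when the request specifies no tags.
def eligible(deployments: list[dict], request_tags: list[str]) -> list[dict]:
    if not request_tags:
        return [d for d in deployments if "default" in d.get("tags", [])]
    return [d for d in deployments
            if set(request_tags) & set(d.get("tags", []))]

pool = [
    {"model": "gpt-4o", "tags": ["default"]},
    {"model": "gpt-4o-eu", "tags": ["eu"]},
]
print(eligible(pool, ["eu"]))  # only the "eu"-tagged deployment
```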
---

## General Proxy Improvements

#### Bugs
- **aiohttp**
    - Fixes for transfer encoding error on aiohttp transport - [PR](https://github.com/BerriAI/litellm/pull/11561)

#### Features
- **aiohttp**
    - Enable System Proxy Support for aiohttp transport - [PR](https://github.com/BerriAI/litellm/pull/11616) (s/o [idootop](https://github.com/idootop))
- **CLI**
    - Make all commands show server URL - [PR](https://github.com/BerriAI/litellm/pull/10801)
- **Uvicorn**
    - Allow setting keep-alive timeout - [PR](https://github.com/BerriAI/litellm/pull/11594)
- **Experimental Rate Limiting v2**
    - Support specifying rate limit by output_tokens only - [PR](https://github.com/BerriAI/litellm/pull/11646)
    - Decrement parallel requests on call failure - [PR](https://github.com/BerriAI/litellm/pull/11646)
    - In-memory-only rate limiting support - [PR](https://github.com/BerriAI/litellm/pull/11646)
    - Return remaining rate limits by key/user/team - [PR](https://github.com/BerriAI/litellm/pull/11646)
- **Helm**
    - Support extraContainers in migrations-job.yaml - [PR](https://github.com/BerriAI/litellm/pull/11649)
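The "decrement parallel requests on call failure" behavior can be sketched as an in-memory counter whose slot is released in a `finally` block, so failed upstream calls do not leak capacity (illustration only, not the proxy's implementation):

```python
# Sketch: in-memory parallel-request limiter that frees its slot on
# failure as well as success.
class ParallelLimiter:
    def __init__(self, max_parallel: int):
        self.max_parallel = max_parallel
        self.active = 0

    def acquire(self) -> bool:
        if self.active >= self.max_parallel:
            return False  # over limit; caller would reject with 429
        self.active += 1
        return True

    def release(self) -> None:
        # called in `finally`, so it runs on success AND failure
        self.active = max(0, self.active - 1)

limiter = ParallelLimiter(max_parallel=1)
assert limiter.acquire()
try:
    raise RuntimeError("upstream call failed")
except RuntimeError:
    pass
finally:
    limiter.release()
print(limiter.active)  # 0 -- the slot was freed despite the failure
```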
---

## New Contributors
* @laurien16 made their first contribution in https://github.com/BerriAI/litellm/pull/8460
* @fengbohello made their first contribution in https://github.com/BerriAI/litellm/pull/11547
* @lapinek made their first contribution in https://github.com/BerriAI/litellm/pull/11570
* @yanwork made their first contribution in https://github.com/BerriAI/litellm/pull/11586
* @dhs-shine made their first contribution in https://github.com/BerriAI/litellm/pull/11575
* @ElefHead made their first contribution in https://github.com/BerriAI/litellm/pull/11450
* @idootop made their first contribution in https://github.com/BerriAI/litellm/pull/11616
* @stevenaldinger made their first contribution in https://github.com/BerriAI/litellm/pull/11649
* @thiagosalvatore made their first contribution in https://github.com/BerriAI/litellm/pull/11454
* @vanities made their first contribution in https://github.com/BerriAI/litellm/pull/11595
* @alvarosevilla95 made their first contribution in https://github.com/BerriAI/litellm/pull/11661
---

## Demo Instance

Here's a Demo Instance to test changes:

- Instance: https://demo.litellm.ai/
- Login Credentials:
    - Username: admin
    - Password: sk-1234

## [Git Diff](https://github.com/BerriAI/litellm/compare/v1.72.2-stable...1.72.6.rc)

litellm/model_prices_and_context_window_backup.json

Lines changed: 31 additions & 3 deletions

```diff
@@ -4263,6 +4263,20 @@
     "supports_assistant_prefill": true,
     "supports_tool_choice": true
   },
+  "mistral/magistral-medium-latest": {
+    "max_tokens": 40000,
+    "max_input_tokens": 40000,
+    "max_output_tokens": 40000,
+    "input_cost_per_token": 2e-06,
+    "output_cost_per_token": 5e-06,
+    "litellm_provider": "mistral",
+    "mode": "chat",
+    "source": "https://mistral.ai/news/magistral",
+    "supports_function_calling": true,
+    "supports_assistant_prefill": true,
+    "supports_tool_choice": true,
+    "supports_reasoning": true
+  },
   "mistral/magistral-medium-2506": {
     "max_tokens": 40000,
     "max_input_tokens": 40000,
@@ -4277,15 +4291,29 @@
     "supports_tool_choice": true,
     "supports_reasoning": true
   },
+  "mistral/magistral-small-latest": {
+    "max_tokens": 40000,
+    "max_input_tokens": 40000,
+    "max_output_tokens": 40000,
+    "input_cost_per_token": 0.5e-6,
+    "output_cost_per_token": 1.5e-6,
+    "litellm_provider": "mistral",
+    "mode": "chat",
+    "source": "https://mistral.ai/pricing#api-pricing",
+    "supports_function_calling": true,
+    "supports_assistant_prefill": true,
+    "supports_tool_choice": true,
+    "supports_reasoning": true
+  },
   "mistral/magistral-small-2506": {
     "max_tokens": 40000,
     "max_input_tokens": 40000,
     "max_output_tokens": 40000,
-    "input_cost_per_token": 0.0,
-    "output_cost_per_token": 0.0,
+    "input_cost_per_token": 0.5e-06,
+    "output_cost_per_token": 1.5e-06,
     "litellm_provider": "mistral",
     "mode": "chat",
-    "source": "https://mistral.ai/news/magistral",
+    "source": "https://mistral.ai/pricing#api-pricing",
     "supports_function_calling": true,
     "supports_assistant_prefill": true,
     "supports_tool_choice": true,
```
model_prices_and_context_window.json

Lines changed: 31 additions & 3 deletions

```diff
@@ -4263,6 +4263,20 @@
     "supports_assistant_prefill": true,
     "supports_tool_choice": true
   },
+  "mistral/magistral-medium-latest": {
+    "max_tokens": 40000,
+    "max_input_tokens": 40000,
+    "max_output_tokens": 40000,
+    "input_cost_per_token": 2e-06,
+    "output_cost_per_token": 5e-06,
+    "litellm_provider": "mistral",
+    "mode": "chat",
+    "source": "https://mistral.ai/news/magistral",
+    "supports_function_calling": true,
+    "supports_assistant_prefill": true,
+    "supports_tool_choice": true,
+    "supports_reasoning": true
+  },
   "mistral/magistral-medium-2506": {
     "max_tokens": 40000,
     "max_input_tokens": 40000,
@@ -4277,15 +4291,29 @@
     "supports_tool_choice": true,
     "supports_reasoning": true
   },
+  "mistral/magistral-small-latest": {
+    "max_tokens": 40000,
+    "max_input_tokens": 40000,
+    "max_output_tokens": 40000,
+    "input_cost_per_token": 0.5e-6,
+    "output_cost_per_token": 1.5e-6,
+    "litellm_provider": "mistral",
+    "mode": "chat",
+    "source": "https://mistral.ai/pricing#api-pricing",
+    "supports_function_calling": true,
+    "supports_assistant_prefill": true,
+    "supports_tool_choice": true,
+    "supports_reasoning": true
+  },
   "mistral/magistral-small-2506": {
     "max_tokens": 40000,
     "max_input_tokens": 40000,
     "max_output_tokens": 40000,
-    "input_cost_per_token": 0.0,
-    "output_cost_per_token": 0.0,
+    "input_cost_per_token": 0.5e-06,
+    "output_cost_per_token": 1.5e-06,
     "litellm_provider": "mistral",
     "mode": "chat",
-    "source": "https://mistral.ai/news/magistral",
+    "source": "https://mistral.ai/pricing#api-pricing",
     "supports_function_calling": true,
     "supports_assistant_prefill": true,
     "supports_tool_choice": true,
```
