Releases: BerriAI/litellm
v1.73.2.dev1
What's Changed
- VertexAI Anthropic passthrough cost calc fixes + Filter litellm params from request sent to passthrough endpoint by @krrishdholakia in #11992
- Fix custom pricing logging + Gemini - only use accepted format values + Gemini - cache tools if passing alongside cached content by @krrishdholakia in #11989
- Fix unpack_defs handling of nested $ref inside anyOf items by @colesmcintosh in #11964
- #response_format NVIDIA-NIM add response_format to OpenAI parameters … by @shagunb-acn in #12003
- Add Azure o3-pro Pricing by @marty-sullivan in #11990
- [Bug Fix] SCIM - Ensure new user roles are applied by @ishaan-jaff in #12015
Full Changelog: v1.73.1-nightly...v1.73.2.dev1
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.73.2.dev1
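Once the container is up, the proxy serves an OpenAI-compatible API on port 4000. A minimal sketch of a test request — the `sk-1234` key and `gpt-4o` model name are placeholders for whatever master key and models you have configured:
curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "hello"}]}'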
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 250.0 | 267.8382003869747 | 6.2096771619800935 | 0.0 | 1858 | 0 | 214.47131599995828 | 1466.6541370000346 |
Aggregated | Passed ✅ | 250.0 | 267.8382003869747 | 6.2096771619800935 | 0.0 | 1858 | 0 | 214.47131599995828 | 1466.6541370000346 |
v1.73.1-nightly
What's Changed
- Fix SambaNova 'created' field validation error - handle float timestamps by @neubig in #11971
- Docs - Add Recommended Machine Specifications by @ishaan-jaff in #11980
- fix: make response api support Azure Authentication method by @hsuyuming in #11941
- feat: add Last Success column to health check table by @colesmcintosh in #11903
- Add GitHub Actions workflow for LLM translation testing artifacts by @colesmcintosh in #11780
- Fix markdown table not rendering properly by @mukesh-dream11 in #11969
- [Fix] - Check HTTP_PROXY vars in networking requests by @ishaan-jaff in #11947
- Proxy UI MCP Auth passthrough by @wagnerjt in #11968
- fix unrecognised parameter reasoning_effort by @Shankyg in #11838
- Fixing watsonx error: 'model_id' or 'model' cannot be specified in the request body for models in a deployment space by @cbjuan in #11854
- [Bug Fix] Perplexity - LiteLLM doesn't support 'web_search_options' for Perplexity's Sonar Pro model by @ishaan-jaff in #11983
- feat: implement Perplexity citation tokens and search queries cost calculation by @colesmcintosh in #11938
- [Feat] Enterprise - Allow dynamically disabling callbacks in request headers by @ishaan-jaff in #11985
- Add Mistral 3.2 24B to model mapping by @colesmcintosh in #11926
- [Feat] Add List Callbacks API Endpoint by @ishaan-jaff in #11987
- fix: fix test_get_azure_ad_token_with_oidc_token testcase issue by @hsuyuming in #11988
- [Bug Fix] Bedrock Guardrail - Don't raise exception on intervene action by @ishaan-jaff in #11875
New Contributors
- @mukesh-dream11 made their first contribution in #11969
- @cbjuan made their first contribution in #11854
Full Changelog: v1.73.0.rc.1...v1.73.1-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.73.1-nightly
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 250.0 | 269.80153099125215 | 6.123419901585826 | 0.0 | 1829 | 0 | 217.6905329999954 | 1336.1768169999948 |
Aggregated | Passed ✅ | 250.0 | 269.80153099125215 | 6.123419901585826 | 0.0 | 1829 | 0 | 217.6905329999954 | 1336.1768169999948 |
v1.73.0.rc.1
What's Changed
- (Tutorial) Onboard Users for AI Exploration by @krrishdholakia in #11955
- Management Fixes - don't apply default internal user settings to admins + preserve all model access for teams with empty model list, when team model added + /v2/model/info fixes by @krrishdholakia in #11957
Full Changelog: v1.73.0-nightly...v1.73.0.rc.1
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.73.0.rc.1
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 220.0 | 240.58426726332922 | 6.145106667198675 | 0.0 | 1838 | 0 | 196.0181700000021 | 1838.010895000025 |
Aggregated | Passed ✅ | 220.0 | 240.58426726332922 | 6.145106667198675 | 0.0 | 1838 | 0 | 196.0181700000021 | 1838.010895000025 |
v1.73.0-nightly
What's Changed
- Update Azure o3 pricing to match OpenAI pricing ($2/$8 per 1M tokens) by @ervwalter in #11937
- [BugFix] Ollama response_format not working by @ThakeeNathees in #11880
- fix aws bedrock claude tool call index by @jnhyperion in #11842
- fix(acompletion): allow dict for tool_choice argument by @Jannchie in #11860
- [Chore] Check team counts on license when creating new team by @ishaan-jaff in #11943
- [Docs] [Pre-Release] v1.73.0-stable by @ishaan-jaff in #11950
- Show user all models they can call (Across teams) on UI by @krrishdholakia in #11948
New Contributors
- @ervwalter made their first contribution in #11937
- @ThakeeNathees made their first contribution in #11880
- @jnhyperion made their first contribution in #11842
- @Jannchie made their first contribution in #11860
Full Changelog: v1.72.9-nightly...v1.73.0-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.73.0-nightly
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 230.0 | 252.55233764162196 | 6.182759830384375 | 0.0 | 1850 | 0 | 208.6453730000244 | 1743.1928639999796 |
Aggregated | Passed ✅ | 230.0 | 252.55233764162196 | 6.182759830384375 | 0.0 | 1850 | 0 | 208.6453730000244 | 1743.1928639999796 |
v1.72.9-nightly
What's Changed
- [Feat] MCP - Allow connecting to MCP with authentication headers + Allow clients to specify MCP headers (#11890) by @ishaan-jaff in #11891
- [Fix] Networking - allow using CA Bundles by @ishaan-jaff in #11906
- [Feat] Add AWS Bedrock profiles for the APAC region by @lgruen-vcgs in #11883
- bumps the anthropic package by @rinormaloku in #11851
- Add deployment annotations by @InvisibleMan1306 in #11849
- Enhance Mistral API: Add support for parallel tool calls by @njbrake in #11770
- [UI] QA Items for adding pass through endpoints by @ishaan-jaff in #11909
- build(model_prices_and_context_window.json): mark all gemini-2.5 models support pdf input + Set anthropic custom llm provider property by @krrishdholakia in #11907
- fix(proxy_server.py): fix loading ui on custom root path by @krrishdholakia in #11912
- LiteLLM SDK <-> Proxy improvement (don't transform message client-side) + Bedrock - handle `qs:..` in base64 file data + Tag Management - support adding public model names by @krrishdholakia in #11908
- Add success modal for health check responses by @colesmcintosh in #11899
- Volcengine - thinking param support + Azure - handle more gpt custom naming patterns by @krrishdholakia in #11914
- [Feat] Model Cost Map - Add `gemini-2.5-pro` and set `gemini-2.5-pro` supports_reasoning=True by @ishaan-jaff in #11927
- [Feat] UI Allow testing /v1/messages on the Test Key Page by @ishaan-jaff in #11930
- Feat/add delete callback by @jtong99 in #11654
- add ciphers in command and pass to hypercorn for proxy by @frankzye in #11916
- [Bug Fix] Fix model_group tracked for /v1/messages and /moderations by @ishaan-jaff in #11933
- [Bug Fix] Cost tracking and logging via the /v1/messages API are not working when using Claude Code by @ishaan-jaff in #11928
- [Feat] Add Azure Codex Models on LiteLLM + new /v1 preview Azure OpenAI API by @ishaan-jaff in #11934
- [Feat] UI QA: Pass through endpoints by @ishaan-jaff in #11939
New Contributors
- @lgruen-vcgs made their first contribution in #11883
- @rinormaloku made their first contribution in #11851
- @InvisibleMan1306 made their first contribution in #11849
Full Changelog: v1.72.7-nightly...v1.72.9-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.72.9-nightly
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 240.0 | 259.91299670392254 | 6.2187422072270495 | 0.0 | 1861 | 0 | 210.9276310000041 | 1676.9406920000165 |
Aggregated | Passed ✅ | 240.0 | 259.91299670392254 | 6.2187422072270495 | 0.0 | 1861 | 0 | 210.9276310000041 | 1676.9406920000165 |
v1.72.7-nightly
What's Changed
- feat(azure): Make Azure AD scope configurable by @kjoth in #11621
- Litellm stable docs 06 14 2025 p2 by @krrishdholakia in #11738
- Release note updates + Responses API Bridge improvements by @krrishdholakia in #11740
- VertexAI Anthropic - streaming passthrough cost tracking by @krrishdholakia in #11734
- Fix PrometheusLogger label_filters initialization for non-premium users by @colesmcintosh in #11764
- Add Vertex Imagen-4 models by @emerzon in #11767
- Users page buttons repositioned by @NANDINI-star in #11771
- #11748: Added Mistral Small to BEDROCK_CONVERSE_MODELS for Converse A… by @shagunb-acn in #11760
- [Security] Fixes for docs by @ishaan-jaff in #11776
- [Security] - Add Trivy Security Scan for UI + Docs folder - remove all vulnerabilities by @ishaan-jaff in #11778
- [Fix] Pass through - Langfuse don't log request to Langfuse passthrough on Langfuse by @ishaan-jaff in #11768
- [Deps] Fix aiohttp version requirement by @ishaan-jaff in #11777
- AWS credentials no longer mandatory by @MadsRC in #11765
- build(deps): bump next from 14.2.26 to 14.2.30 in /ui/litellm-dashboard by @dependabot in #11720
- feat: update the feature of ollama_embeddings to work on a sync api by @Abiji-2020 in #11746
- [Feat] Day-0 Support for OpenAI Re-usable prompts Responses API by @ishaan-jaff in #11782
- SSO - Allow passing additional headers + Spend Tags - automatically track spend by user agent (allows cost tracking for claude code) by @krrishdholakia in #11781
- JWT Auth - correctly return user email + UI Model Update - Allow editing model access group for existing model by @krrishdholakia in #11783
- Allow `/models` to return correct models for custom wildcard prefixes by @krrishdholakia in #11784
- Fix JSX syntax error in documentation causing Vercel deployment failure by @colesmcintosh in #11818
- [Fix] Bug Fix for using prom metrics config by @ishaan-jaff in #11779
- [Bug Fixes] MCP - using MCPs defined on config.yaml + fix for MCP error Team doesn't exist in cache by @ishaan-jaff in #11824
- new gemini model pricing + a few openrouter models model_prices_and_context_window.json by @salzubi401 in #11803
- Update bedrock guardrail docs by @orolega in #11826
- [Feat] v2 Pass through endpoints - Add support for subroutes for pass through endpoints + Cleaned up UI by @ishaan-jaff in #11827
- Fix vertex ai claude thinking params by @X4tar in #11796
- Implement health check backend API and storage functionality - fix ci/cd by @colesmcintosh in #11852
- [Fix] v1/messages endpoint always uses us-central1 with vertex_ai-anthropic models by @ishaan-jaff in #11831
- Fix #11856: Update billing.md docs to call the new GPT-4o model by @karen-veigas in #11858
- Add LiteLLM_HealthCheckTable to database schema by @colesmcintosh in #11677
- [SCIM] Add Error handling for existing user on SCIM by @ishaan-jaff in #11862
- feat(speech/): working gemini tts support via openai's `/v1/speech` endpoint by @krrishdholakia in #11832
- Completion-To-Responses Bridge: Support passing image url's by @krrishdholakia in #11833
- Implement health check frontend UI components and dashboard integration by @colesmcintosh in #11679
- Remove retired version of gpt-3.5 from prometheus.md by @Shankyg in #11859
- Minor Fixes by @krrishdholakia in #11868
- Fix boto3 tracer wrapping for observability by @colesmcintosh in #11869
- [Feat] Passthrough - Add support for setting custom cost per pass through request by @ishaan-jaff in #11870
- [Fix] SCIM - Add SCIM PATCH and PUT Ops for Users by @ishaan-jaff in #11863
- [UI] - Move passthrough endpoints under Models + Endpoints by @ishaan-jaff in #11871
- Fix gemini 2.5 flash config by @lowjiansheng in #11830
- Fix: #11853 Updated model version in alerting.md for latest model called when adding metadata to proxy calls. by @karen-veigas in #11855
- [Bug Fix] - Ensure "Request" is tracked for pass through requests on LiteLLM Proxy by @ishaan-jaff in #11873
- Add user agent tags in spend logs payload + Fix Azure ai content type + Fix passing dynamic credentials on retrieve batch by @krrishdholakia in #11872
- UI - allow setting default team for new users by @krrishdholakia in #11874
- Revert "UI - allow setting default team for new users" by @krrishdholakia in #11876
- Revert "Revert "UI - allow setting default team for new users"" by @krrishdholakia in #11877
- Fix default team settings by @NANDINI-star in #11887
- [Feat] UI - Add Allowed MCPs to Creating/Editing Organizations by @ishaan-jaff in #11893
- [Feat] Enable Tool Calling for meta_llama by @ishaan-jaff in #11895
- fix(vertex_ai): Handle missing tokenCount in promptTokensDetails (#11… by @ishaan-jaff in #11896
- [Bug Fix]: Fix gemini - web search error with responses API by @ishaan-jaff in #11894
- Revert "Users page buttons repositioned" by @krrishdholakia in #11904
- [Feat] V2 Add Pass through endpoints on UI by @ishaan-jaff in #11905
- Fix clickable model ID in health check table by @colesmcintosh in #11898
- Fix health check UI table design by @colesmcintosh in #11897
- [Bug Fix] add missing `flash-2.5-flash-lite` for gemini provider, fix `gemini-2.5-flash` pricing by @fcakyon in #11901
- feat: add workload identity federation between GCP and AWS by @pascallim in #10210
New Contributors
- @kjoth made their first contribution in #11621
- @shagunb-acn made their first contribution in #11760
- @MadsRC made their first contribution in #11765
- @Abiji-2020 made their first contribution in #11746
- @salzubi401 made their first contribution in #11803
- @orolega made their first contribution in #11826
- @X4tar made their first contribution in #11796
- @karen-veigas made their first contribution in #11858
- @Shankyg made their first contribution in #11859
- @pascallim made their first contribution in #10210
Full Changelog: v1.72.6.dev1...v1.72.7-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.72.7-nightly
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 190.0 | 211.70463782970805 | 6.29948833219958 | 0.0 | 1885 | 0 | 169.2135669999857 | 2108.276391000004 |
Aggregated | Passed ✅ | 190.0 | 211.70463782970805 | 6.29948833219958 | 0.0 | 1885 | 0 | 169.2135669999857 | 2108.276391000004 |
v1.72.6-stable
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.72.6-stable
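As a quick sanity check after the container starts, you can list the models the proxy exposes via its OpenAI-compatible models endpoint — a sketch assuming `sk-1234` stands in for your configured master key:
curl http://localhost:4000/v1/models \
  -H "Authorization: Bearer sk-1234"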
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 250.0 | 269.27781931947453 | 6.111834388077504 | 0.0 | 1828 | 0 | 215.86210600003142 | 1630.9297619999938 |
Aggregated | Passed ✅ | 250.0 | 269.27781931947453 | 6.111834388077504 | 0.0 | 1828 | 0 | 215.86210600003142 | 1630.9297619999938 |
What's Changed
- [Docs] v1.72.2.rc by @ishaan-jaff in #11519
- Support env var vertex credentials for passthrough + ignore space id on watsonx deployment (throws Json validation errors) by @krrishdholakia in #11527
- Ensure consistent 'created' across all chunks + set tool call id for ollama streaming calls by @krrishdholakia in #11528
- Update enduser spend and budget reset date based on budget duration by @laurien16 in #8460
- feat: add .cursor to .gitignore by @colesmcintosh in #11538
- Add gpt-4o-audio-preview-2025-06-03 pricing configuration by @colesmcintosh in #11560
- [Docs] Fix incorrect reference to database_url as master_key by @fengbohello in #11547
- Update documentation for configuring web search options in config.yaml by @colesmcintosh in #11537
- [Bug fix]: aiohttp fixes for transfer encoding error on aiohttp transport by @ishaan-jaff in #11561
- [Feat] Add `reasoning_effort` support for perplexity models by @ishaan-jaff in #11562
- Make all commands show server URL by @msabramo in #10801
- Simplify `management_cli.md` CLI docs by @msabramo in #10799
- Fix: Adds support for choosing the default region based on where the model is available by @ishaan-jaff in #11566
- [Feat] Add Lasso Guardrail to LiteLLM by @ishaan-jaff in #11565
- Fix gemini tool call indexes by @lowjiansheng in #11558
- Show remaining users on UI + prevent early stream stopping for gemini requests by @krrishdholakia in #11568
- Add VertexAI `claude-opus-4` + Assign users to orgs on creation by @krrishdholakia in #11572
- Pangea/kl/udpate readme by @lapinek in #11570
- Update README.md so docker compose will work as described by @yanwork in #11586
- Add support for new Mistral Magistral models (magistral-medium-2506 and magistral-small-2506) by @colesmcintosh in #11588
- (fix:exception_mapping_utils.py) fix sglang rate limit error issue by @dhs-shine in #11575
- [Feat] LiteLLM Allow setting Uvicorn Keep Alive Timeout by @ishaan-jaff in #11594
- [Bug Fix] No module named 'diskcache' by @ishaan-jaff in #11600
- [Feat] UI - Add controls for MCP Permission Management by @ishaan-jaff in #11598
- [Feat] New LLM API Endpoint - Add List input items for Responses API by @ishaan-jaff in #11602
- Add new o3 models pricing by @krrishdholakia in #11606
- [UI] Polish New MCP Server Add Form by @ishaan-jaff in #11604
- Litellm dev 06 10 2025 p2 by @krrishdholakia in #11605
- Add VertexAI Anthropic passthrough - cost calculation, token tracking by @krrishdholakia in #11611
- fix(internal_user_endpoints.py): support user with `+` in email on user info + handle empty string for arguments on gemini function calls by @krrishdholakia in #11601
- Fix: passes api_base, api_key, litellm_params_dict to custom_llm embedding methods by @ElefHead in #11450
- Add Admin-Initiated Password Reset Flow by @NANDINI-star in #11618
- fix inference endpoints (#11630) by @ishaan-jaff in #11631
- [UI] Add Deepgram provider to supported providers list and mappings by @ishaan-jaff in #11634
- [Bug Fix] Add audio/ogg mapping for Audio MIME types by @ishaan-jaff in #11635
- [Feat] Add Background mode for Responses API - OpenAI, AzureOpenAI by @ishaan-jaff in #11640
- [Feat] Add provider specific params for `deepgram/` by @ishaan-jaff in #11638
- [Feat] MCP - Add support for `streamablehttp_client` MCP Servers by @ishaan-jaff in #11628
- [Feat] Perf fix - ensure deepgram provider uses async httpx calls by @ishaan-jaff in #11641
- Trim the long user ids on the keys page by @NANDINI-star in #11488
- Enable System Proxy Support for aiohttp Transport by @idootop in #11616
- GA Multi-instance rate limiting v2 Requirements + New - specify token rate limit type - output / input / total by @krrishdholakia in #11646
- Add bridge for /chat/completion -> /responses API by @krrishdholakia in #11632
- Convert scientific notation str to int + Bubble up azure content filter results by @krrishdholakia in #11655
- feat(helm): [#11648] support extraContainers in migrations-job.yaml by @stevenaldinger in #11649
- Correct success message when user creates new budget by @vuanhtu52 in #11608
- fix: Do not add default model on tag based-routing when valid tag by @thiagosalvatore in #11454
- Fix default user settings by @NANDINI-star in #11674
- [Pricing] add azure/gpt-4o-mini-transcribe models by @ishaan-jaff in #11676
- Enhance Mistral model support with reasoning capabilities by @colesmcintosh in #11642
- [Feat] MCP expose streamable https endpoint for LiteLLM Proxy by @ishaan-jaff in #11645
- change space_key header to space_id for Arize by @vanities in #11595
- Add performance indexes to LiteLLM_SpendLogs for analytics queries by @colesmcintosh in #11675
- Revert "Add performance indexes to LiteLLM_SpendLogs for analytics queries" by @krrishdholakia in #11683
- [Feat] Use dedicated Rest endpoints for list, calling MCP tools by @ishaan-jaff in #11684
- Chat Completions <-> Responses API Bridge Improvements by @krrishdholakia in #11685
- [UI] Fix MCP Server Table to Match Existing Table Pattern by @ishaan-jaff in #11691
- Logging: prevent double logging logs when bridge is used (anthropic <-> chat completion OR chat completion <-> responses api) by @krrishdholakia in #11687
- fix(vertex_ai): support global location in vertex ai passthrough by @alvarosevilla95 in #11661
- [Feat] UI Allow editing mcp servers by @ishaan-jaff in #11693
- [Feat] UI - Allow setting MCP servers when creating keys, teams by @ishaan-jaff in #11711
- [Feat] Add Authentication + Permission Management for MCP List, Call Tool Ops by @ishaan-jaff in #11682
- Add Live Tail Feature to Logs View by @NANDINI-star in #11712
- [Feat] Add Connect to MCP Page by @ishaan-jaff in #11716
- Enterprise feature preview improvement on Audit Logs by @NANDINI-star in #11715
- Align Model Connection Success Icon and Text by @NANDINI-star in #11717
- fix(prometheus.py): fix total requests increment + add semantic tests fo… by @krrishdholakia in #11718
- Anthropic - add 'prefix' to start of assistant content + Add model access groups on UI by @krrishdholakia in #11719
- Add anthropic 'none' tool choice param support by @krrishdholakia in #11695
- [Feat] UI - Add back favicon by @ishaan-jaff in #11728
- Time taken column logs by @gbrian in #11723
- U...
v1.72.6.post1-nightly
Full Changelog: v1.72.6.dev1...v1.72.6.post1-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.72.6.post1-nightly
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 190.0 | 209.8013988365269 | 6.275681933110413 | 0.0 | 1878 | 0 | 167.48262099997646 | 1487.4784890000115 |
Aggregated | Passed ✅ | 190.0 | 209.8013988365269 | 6.275681933110413 | 0.0 | 1878 | 0 | 167.48262099997646 | 1487.4784890000115 |
v1.72.6.devSCIM
What's Changed
- feat(azure): Make Azure AD scope configurable by @kjoth in #11621
- Litellm stable docs 06 14 2025 p2 by @krrishdholakia in #11738
- Release note updates + Responses API Bridge improvements by @krrishdholakia in #11740
- VertexAI Anthropic - streaming passthrough cost tracking by @krrishdholakia in #11734
- Fix PrometheusLogger label_filters initialization for non-premium users by @colesmcintosh in #11764
- Add Vertex Imagen-4 models by @emerzon in #11767
- Users page buttons repositioned by @NANDINI-star in #11771
- #11748: Added Mistral Small to BEDROCK_CONVERSE_MODELS for Converse A… by @shagunb-acn in #11760
- [Security] Fixes for docs by @ishaan-jaff in #11776
- [Security] - Add Trivy Security Scan for UI + Docs folder - remove all vulnerabilities by @ishaan-jaff in #11778
- [Fix] Pass through - Langfuse don't log request to Langfuse passthrough on Langfuse by @ishaan-jaff in #11768
- [Deps] Fix aiohttp version requirement by @ishaan-jaff in #11777
- AWS credentials no longer mandatory by @MadsRC in #11765
- build(deps): bump next from 14.2.26 to 14.2.30 in /ui/litellm-dashboard by @dependabot in #11720
- feat: update the feature of ollama_embeddings to work on a sync api by @Abiji-2020 in #11746
- [Feat] Day-0 Support for OpenAI Re-usable prompts Responses API by @ishaan-jaff in #11782
- SSO - Allow passing additional headers + Spend Tags - automatically track spend by user agent (allows cost tracking for claude code) by @krrishdholakia in #11781
- JWT Auth - correctly return user email + UI Model Update - Allow editing model access group for existing model by @krrishdholakia in #11783
- Allow `/models` to return correct models for custom wildcard prefixes by @krrishdholakia in #11784
- Fix JSX syntax error in documentation causing Vercel deployment failure by @colesmcintosh in #11818
- [Fix] Bug Fix for using prom metrics config by @ishaan-jaff in #11779
- [Bug Fixes] MCP - using MCPs defined on config.yaml + fix for MCP error Team doesn't exist in cache by @ishaan-jaff in #11824
- new gemini model pricing + a few openrouter models model_prices_and_context_window.json by @salzubi401 in #11803
- Update bedrock guardrail docs by @orolega in #11826
- [Feat] v2 Pass through endpoints - Add support for subroutes for pass through endpoints + Cleaned up UI by @ishaan-jaff in #11827
- Fix vertex ai claude thinking params by @X4tar in #11796
- Implement health check backend API and storage functionality - fix ci/cd by @colesmcintosh in #11852
- [Fix] v1/messages endpoint always uses us-central1 with vertex_ai-anthropic models by @ishaan-jaff in #11831
- Fix #11856: Update billing.md docs to call the new GPT-4o model by @karen-veigas in #11858
- Add LiteLLM_HealthCheckTable to database schema by @colesmcintosh in #11677
- [SCIM] Add Error handling for existing user on SCIM by @ishaan-jaff in #11862
New Contributors
- @kjoth made their first contribution in #11621
- @shagunb-acn made their first contribution in #11760
- @MadsRC made their first contribution in #11765
- @Abiji-2020 made their first contribution in #11746
- @salzubi401 made their first contribution in #11803
- @orolega made their first contribution in #11826
- @X4tar made their first contribution in #11796
- @karen-veigas made their first contribution in #11858
Full Changelog: v1.72.6.dev1...v1.72.6.devSCIM
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.72.6.devSCIM
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 190.0 | 215.1720637640139 | 6.262237464870193 | 0.0 | 1873 | 0 | 171.28891599998042 | 1800.7898239999918 |
Aggregated | Passed ✅ | 190.0 | 215.1720637640139 | 6.262237464870193 | 0.0 | 1873 | 0 | 171.28891599998042 | 1800.7898239999918 |
v1.72.6.SCIM2
Full Changelog: v1.72.6.devSCIM...v1.72.6.SCIM2
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.72.6.SCIM2
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 190.0 | 213.45712869978374 | 6.190773809263607 | 0.0 | 1852 | 0 | 171.36217200004467 | 1296.009626 |
Aggregated | Passed ✅ | 190.0 | 213.45712869978374 | 6.190773809263607 | 0.0 | 1852 | 0 | 171.36217200004467 | 1296.009626 |