Releases: BerriAI/litellm
v1.72.2.devMCP
What's Changed
- fix inference endpoints (#11630) by @ishaan-jaff in #11631
- [UI] Add Deepgram provider to supported providers list and mappings by @ishaan-jaff in #11634
- [Bug Fix] Add audio/ogg mapping for Audio MIME types by @ishaan-jaff in #11635
- [Feat] Add Background mode for Responses API - OpenAI, AzureOpenAI by @ishaan-jaff in #11640
- [Feat] Add provider specific params for `deepgram/` by @ishaan-jaff in #11638
- [Feat] MCP - Add support for `streamablehttp_client` MCP Servers by @ishaan-jaff in #11628
- [Feat] Perf fix - ensure deepgram provider uses async httpx calls by @ishaan-jaff in #11641
- Trim the long user ids on the keys page by @NANDINI-star in #11488
- Enable System Proxy Support for aiohttp Transport by @idootop in #11616
- GA Multi-instance rate limiting v2 Requirements + New - specify token rate limit type - output / input / total by @krrishdholakia in #11646
- Add bridge for /chat/completion -> /responses API by @krrishdholakia in #11632
- Convert scientific notation str to int + Bubble up azure content filter results by @krrishdholakia in #11655
- feat(helm): [#11648] support extraContainers in migrations-job.yaml by @stevenaldinger in #11649
- Correct success message when user creates new budget by @vuanhtu52 in #11608
- fix: Do not add default model on tag based-routing when valid tag by @thiagosalvatore in #11454
- Fix default user settings by @NANDINI-star in #11674
- [Pricing] add azure/gpt-4o-mini-transcribe models by @ishaan-jaff in #11676
- Enhance Mistral model support with reasoning capabilities by @colesmcintosh in #11642
- [Feat] MCP expose streamable https endpoint for LiteLLM Proxy by @ishaan-jaff in #11645 (see the connection sketch after this list)
- change space_key header to space_id for Arize by @vanities in #11595
- Add performance indexes to LiteLLM_SpendLogs for analytics queries by @colesmcintosh in #11675
- Revert "Add performance indexes to LiteLLM_SpendLogs for analytics queries" by @krrishdholakia in #11683
- [Feat] Use dedicated Rest endpoints for list, calling MCP tools by @ishaan-jaff in #11684
- Chat Completions <-> Responses API Bridge Improvements by @krrishdholakia in #11685
- [UI] Fix MCP Server Table to Match Existing Table Pattern by @ishaan-jaff in #11691
- Logging: prevent double logging when a bridge is used (anthropic <-> chat completion OR chat completion <-> responses api) by @krrishdholakia in #11687
- fix(vertex_ai): support global location in vertex ai passthrough by @alvarosevilla95 in #11661
- [Feat] UI Allow editing mcp servers by @ishaan-jaff in #11693
- [Feat] UI - Allow setting MCP servers when creating keys, teams by @ishaan-jaff in #11711
- [Feat] Add Authentication + Permission Management for MCP List, Call Tool Ops by @ishaan-jaff in #11682
- Add Live Tail Feature to Logs View by @NANDINI-star in #11712
- [Feat] Add Connect to MCP Page by @ishaan-jaff in #11716
- Enterprise feature preview improvement on Audit Logs by @NANDINI-star in #11715
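
For reference, here is a minimal sketch of connecting to the proxy's MCP streamable HTTP endpoint with the official MCP Python SDK (the same `streamablehttp_client` transport mentioned above). The endpoint path `/mcp/`, the port, and the `Authorization` header value are assumptions for illustration; check the LiteLLM MCP docs for the exact values on your deployment.

```python
# Sketch: list tools exposed by the LiteLLM proxy's MCP streamable HTTP endpoint.
# The URL path and auth header below are assumptions -- adjust to your proxy.
import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client


async def main():
    async with streamablehttp_client(
        "http://localhost:4000/mcp/",                      # assumed MCP endpoint on the proxy
        headers={"Authorization": "Bearer sk-1234"},       # assumed LiteLLM virtual key
    ) as (read_stream, write_stream, _):
        async with ClientSession(read_stream, write_stream) as session:
            await session.initialize()
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])


asyncio.run(main())
```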
New Contributors
- @idootop made their first contribution in #11616
- @stevenaldinger made their first contribution in #11649
- @thiagosalvatore made their first contribution in #11454
- @vanities made their first contribution in #11595
- @alvarosevilla95 made their first contribution in #11661
Full Changelog: v1.72.5.dev1...v1.72.2.devMCP
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.72.2.devMCP
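
Once the container is up, the proxy speaks the OpenAI API on port 4000. A minimal sketch of calling it with the OpenAI Python SDK follows; the model name and API key are placeholders and assume you have already added a model and a master/virtual key to the proxy.

```python
# Sketch: call the LiteLLM proxy started above using the OpenAI SDK.
# "gpt-4o" and "sk-1234" are placeholders for a model and key configured on your proxy.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:4000",  # the proxy from the docker command above
    api_key="sk-1234",                 # your LiteLLM master or virtual key
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello from the LiteLLM proxy"}],
)
print(response.choices[0].message.content)
```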
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 220.0 | 241.96992280403583 | 6.294425384311064 | 0.0 | 1883 | 0 | 199.48631400001204 | 1258.8171310000007 |
Aggregated | Passed ✅ | 220.0 | 241.96992280403583 | 6.294425384311064 | 0.0 | 1883 | 0 | 199.48631400001204 | 1258.8171310000007 |
v1.72.5.dev1
What's Changed
- fix(internal_user_endpoints.py): support user with `+` in email on user info + handle empty string for arguments on gemini function calls by @krrishdholakia in #11601
- Fix: passes api_base, api_key, litellm_params_dict to custom_llm embedding methods by @ElefHead in #11450
- Add Admin-Initiated Password Reset Flow by @NANDINI-star in #11618
Full Changelog: v1.72.4-nightly...v1.72.5.dev1
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.72.5.dev1
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 250.0 | 271.77221555459084 | 6.153062151618842 | 0.0 | 1841 | 0 | 218.69335899998532 | 1399.0517459999978 |
Aggregated | Passed ✅ | 250.0 | 271.77221555459084 | 6.153062151618842 | 0.0 | 1841 | 0 | 218.69335899998532 | 1399.0517459999978 |
v1.72.4-nightly
What's Changed
- Add support for new Mistral Magistral models (magistral-medium-2506 and magistral-small-2506) by @colesmcintosh in #11588
- (fix:exception_mapping_utils.py) fix sglang rate limit error issue by @dhs-shine in #11575
- [Feat] LiteLLM Allow setting Uvicorn Keep Alive Timeout by @ishaan-jaff in #11594
- [Bug Fix] No module named 'diskcache' by @ishaan-jaff in #11600
- [Feat] UI - Add controls for MCP Permission Management by @ishaan-jaff in #11598
- [Feat] New LLM API Endpoint - Add List input items for Responses API by @ishaan-jaff in #11602
- Add new o3 models pricing by @krrishdholakia in #11606
- [UI] Polish New MCP Server Add Form by @ishaan-jaff in #11604
- Litellm dev 06 10 2025 p2 by @krrishdholakia in #11605
- Add VertexAI Anthropic passthrough - cost calculation, token tracking by @krrishdholakia in #11611
New Contributors
- @dhs-shine made their first contribution in #11575
Full Changelog: v1.72.3-nightly...v1.72.4-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.72.4-nightly
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 180.0 | 204.8932724751626 | 6.202890810178717 | 0.0 | 1852 | 0 | 168.13937000000578 | 1311.1876840000036 |
Aggregated | Passed ✅ | 180.0 | 204.8932724751626 | 6.202890810178717 | 0.0 | 1852 | 0 | 168.13937000000578 | 1311.1876840000036 |
v1.72.2-stable
Full Changelog: v1.72.0.stable...v1.72.2-stable
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.72.2-stable
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 200.0 | 220.70987704382637 | 6.253574765303403 | 0.0 | 1871 | 0 | 179.76551599997492 | 1345.8777239999904 |
Aggregated | Passed ✅ | 200.0 | 220.70987704382637 | 6.253574765303403 | 0.0 | 1871 | 0 | 179.76551599997492 | 1345.8777239999904 |
What's Changed
- Litellm doc fixes 05 31 2025 by @krrishdholakia in #11305
- Converted action buttons to sticky footer action buttons by @NANDINI-star in #11293
- Add support for DataRobot as a provider in LiteLLM by @mjnitz02 in #10385
- fix: remove dupe server_id MCP config servers by @wagnerjt in #11327
- Add unit tests for Cohere Embed v4.0 model by @colesmcintosh in #11329
- Add presidio_language yaml configuration support for guardrails by @colesmcintosh in #11331
- [Fix] Fix SCIM running patch operation case sensitivity by @ishaan-jaff in #11335
- Fix transcription model name mapping by @colesmcintosh in #11333
- [Feat] DD Trace - Add instrumentation for streaming chunks by @ishaan-jaff in #11338
- UI - Custom Server Root Path (Multiple Fixes) by @krrishdholakia in #11337
- [Perf] - Add Async + Batched S3 Logging by @ishaan-jaff in #11340
- fixes: expose flag to disable token counter by @ishaan-jaff in #11344
- Merge in - Gemini streaming - thinking content parsing - return in `reasoning_content` by @krrishdholakia in #11298
- Support returning virtual key in custom auth + Handle provider-specific optional params for embedding calls by @krrishdholakia in #11346
- Doc : Nvidia embedding models by @AnilAren in #11352
- feat: add cerebras/qwen-3-32b model pricing and context window by @colesmcintosh in #11373
- Fix Google/Vertex AI Gemini module linting errors - Remove unused imports by @colesmcintosh in #11374
- [Feat]: Performance add DD profiler to monitor python profile of LiteLLM CPU% by @ishaan-jaff in #11375
- [Fix]: Performance - Don't run auth on /health/liveliness by @ishaan-jaff in #11378
- [Bug Fix] Create/Update team member api 500 error by @hagan in #10479
- add gemini-embeddings-001 model prices and context window by @marty-sullivan in #11332
- [Performance]: Add debugging endpoint to track active /asyncio-tasks by @ishaan-jaff in #11382
- Add Claude 4 Sonnet & Opus, DeepSeek R1, and fix Llama Vision model pricing configurations by @colesmcintosh in #11339
- [Feat] Performance - Don't create 1 task for every hanging request alert by @ishaan-jaff in #11385
- UI / SSO - Update proxy admin id role in DB + Handle SSO redirects with custom root path by @krrishdholakia in #11384
- Anthropic - pass file URLs as Document content type + Gemini - cache token tracking on streaming calls by @krrishdholakia in #11387
- Anthropic - Token tracking for Passthrough Batch API calls by @krrishdholakia in #11388
- update GCSBucketBase to handle GSM project ID if passed by @wwells in #11409
- fix: add enterprise feature gating to RegenerateKeyModal in KeyInfoView by @likweitan in #11400
- Litellm audit log staging by @krrishdholakia in #11418
- Add User ID validation to ensure it is not an email or phone number by @raz-alon in #10102
- [Performance] Performance improvements for /v1/messages route by @ishaan-jaff in #11421
- Add SSO configuration endpoints and UI integration with persistent settings by @colesmcintosh in #11417
- [Build] Bump dd trace version by @ishaan-jaff in #11426
- Add together_ai provided deepseek-r1 family model configuration by @jtsai-quid in #11394
- fix: Use proper attribute for Sagemaker request for embeddings by @tmbo in #11362
- added gemini url context support by @wangsha in #11351
- fix(redis_cache.py): support pipeline redis lpop for older redis vers… by @krrishdholakia in #11425
- Support no reasoning option for gemini models by @lowjiansheng in #11393
- fix(prometheus.py): pass custom metadata labels in litellm_total_toke… by @krrishdholakia in #11414
- Fix None values in usage field for gpt-image-1 model responses by @colesmcintosh in #11448
- Fix HuggingFace embeddings using non-default `input_type` by @seankwalker in #11452
- Add AGENTS.md by @colesmcintosh in #11461
- Custom Root Path Improvements: don't require reserving `/litellm` route by @krrishdholakia in #11460
- [Feat] Make batch size for maximum retention in spend logs a controllable parameter by @ishaan-jaff in #11459
- Add pangea to guardrails sidebar by @ryanmeans in #11464
- [Fix] [Bug]: Knowledge Base Call returning error by @ishaan-jaff in #11467
- [Feat] Return response_id == upstream response ID for VertexAI + Google AI studio (Stream+Non stream) by @ishaan-jaff in #11456
- [Fix]: /v1/messages - return streaming usage statistics when using litellm with bedrock models by @ishaan-jaff in #11469
- fix: supports_function_calling works with llm_proxy models by @pazevedo-hyland in #11381
- feat: add HuggingFace rerank provider support by @cainiaoit in #11438
- Litellm dev 06 05 2025 p2 by @krrishdholakia in #11470
- Fix variable redefinition linting error in vertex_and_google_ai_studio_gemini.py by @colesmcintosh in #11486
- Add Google Gemini 2.5 Pro Preview 06-05 by @PeterDaveHello in #11447
- Feat: add azure endpoint for image endpoints by @ishaan-jaff in #11482
- [Feat] New model - add `codex-mini-latest` by @ishaan-jaff in #11492
- Nebius model pricing info updated by @Aktsvigun in #11445
- [Docs] Add audio / tts section for gemini and vertex by @AyrennC in #11306
- Document batch polling logic to avoid ValueError: Output file id is None error by @fadil4u in #11286
- Revert "Nebius model pricing info updated" by @ishaan-jaff in #11493
- [Bug Fix] Fix: `_transform_responses_api_content_to_chat_completion_content` doesn't support file content type by @ishaan-jaff in #11494
- Fix Fireworks AI rate limit exception mapping - detect "rate limit" text in error messages by @colesmcintosh in #11455
- Update Makefile to match CI workflows and improve contributor experience by @colesmcintosh in #11485
- Fix: Respect user_header_name property for budget selection and user identification by @colesmcintosh in #11419
- Update production doc by @ishaan-jaff in #11499
- Enhance proxy CLI with Rich formatting and improved user experience by @colesmcintosh in #11420
- Remove retired version gpt-3.5 from configs.md by @vuanhtu52 in #11508
- Update model version in deploy.md by @vuanhtu52 in #11506
- [Feat] Allow using litellm.completion with /v1/messages API Spec (use gpt-4, gemini etc with claude code) by @ishaan-jaff in #11502
- Update the correct test directory in contributing_code.md by @vuanhtu52 in https://github.com/BerriAI/...
v1.72.3-nightly
What's Changed
- [Docs] v1.72.2.rc by @ishaan-jaff in #11519
- Support env var vertex credentials for passthrough + ignore space id on watsonx deployment (throws Json validation errors) by @krrishdholakia in #11527
- Ensure consistent 'created' across all chunks + set tool call id for ollama streaming calls by @krrishdholakia in #11528
- Update enduser spend and budget reset date based on budget duration by @laurien16 in #8460
- feat: add .cursor to .gitignore by @colesmcintosh in #11538
- Add gpt-4o-audio-preview-2025-06-03 pricing configuration by @colesmcintosh in #11560
- [Docs] Fix incorrect reference to database_url as master_key by @fengbohello in #11547
- Update documentation for configuring web search options in config.yaml by @colesmcintosh in #11537
- [Bug fix]: aiohttp fixes for transfer encoding error on aiohttp transport by @ishaan-jaff in #11561
- [Feat] Add `reasoning_effort` support for perplexity models by @ishaan-jaff in #11562
- Make all commands show server URL by @msabramo in #10801
- Simplify `management_cli.md` CLI docs by @msabramo in #10799
- Fix: Adds support for choosing the default region based on where the model is available by @ishaan-jaff in #11566
- [Feat] Add Lasso Guardrail to LiteLLM by @ishaan-jaff in #11565
- Fix gemini tool call indexes by @lowjiansheng in #11558
- Show remaining users on UI + prevent early stream stopping for gemini requests by @krrishdholakia in #11568
- Add VertexAI `claude-opus-4` + Assign users to orgs on creation by @krrishdholakia in #11572
- Pangea/kl/udpate readme by @lapinek in #11570
- Update README.md so docker compose will work as described by @yanwork in #11586
New Contributors
- @laurien16 made their first contribution in #8460
- @fengbohello made their first contribution in #11547
- @lapinek made their first contribution in #11570
- @yanwork made their first contribution in #11586
Full Changelog: v1.72.2.rc...v1.72.3-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.72.3-nightly
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 230.0 | 244.23009797848223 | 6.212339526915134 | 0.0 | 1859 | 0 | 195.39837500002477 | 1308.959112999986 |
Aggregated | Passed ✅ | 230.0 | 244.23009797848223 | 6.212339526915134 | 0.0 | 1859 | 0 | 195.39837500002477 | 1308.959112999986 |
v1.72.2.rc
What's Changed
- Litellm doc fixes 05 31 2025 by @krrishdholakia in #11305
- Converted action buttons to sticky footer action buttons by @NANDINI-star in #11293
- Add support for DataRobot as a provider in LiteLLM by @mjnitz02 in #10385
- fix: remove dupe server_id MCP config servers by @wagnerjt in #11327
- Add unit tests for Cohere Embed v4.0 model by @colesmcintosh in #11329
- Add presidio_language yaml configuration support for guardrails by @colesmcintosh in #11331
- [Fix] Fix SCIM running patch operation case sensitivity by @ishaan-jaff in #11335
- Fix transcription model name mapping by @colesmcintosh in #11333
- [Feat] DD Trace - Add instrumentation for streaming chunks by @ishaan-jaff in #11338
- UI - Custom Server Root Path (Multiple Fixes) by @krrishdholakia in #11337
- [Perf] - Add Async + Batched S3 Logging by @ishaan-jaff in #11340
- fixes: expose flag to disable token counter by @ishaan-jaff in #11344
- Merge in - Gemini streaming - thinking content parsing - return in `reasoning_content` by @krrishdholakia in #11298
- Support returning virtual key in custom auth + Handle provider-specific optional params for embedding calls by @krrishdholakia in #11346
- Doc : Nvidia embedding models by @AnilAren in #11352
- feat: add cerebras/qwen-3-32b model pricing and context window by @colesmcintosh in #11373
- Fix Google/Vertex AI Gemini module linting errors - Remove unused imports by @colesmcintosh in #11374
- [Feat]: Performance add DD profiler to monitor python profile of LiteLLM CPU% by @ishaan-jaff in #11375
- [Fix]: Performance - Don't run auth on /health/liveliness by @ishaan-jaff in #11378
- [Bug Fix] Create/Update team member api 500 error by @hagan in #10479
- add gemini-embeddings-001 model prices and context window by @marty-sullivan in #11332
- [Performance]: Add debugging endpoint to track active /asyncio-tasks by @ishaan-jaff in #11382
- Add Claude 4 Sonnet & Opus, DeepSeek R1, and fix Llama Vision model pricing configurations by @colesmcintosh in #11339
- [Feat] Performance - Don't create 1 task for every hanging request alert by @ishaan-jaff in #11385
- UI / SSO - Update proxy admin id role in DB + Handle SSO redirects with custom root path by @krrishdholakia in #11384
- Anthropic - pass file URLs as Document content type + Gemini - cache token tracking on streaming calls by @krrishdholakia in #11387
- Anthropic - Token tracking for Passthrough Batch API calls by @krrishdholakia in #11388
- update GCSBucketBase to handle GSM project ID if passed by @wwells in #11409
- fix: add enterprise feature gating to RegenerateKeyModal in KeyInfoView by @likweitan in #11400
- Litellm audit log staging by @krrishdholakia in #11418
- Add User ID validation to ensure it is not an email or phone number by @raz-alon in #10102
- [Performance] Performance improvements for /v1/messages route by @ishaan-jaff in #11421
- Add SSO configuration endpoints and UI integration with persistent settings by @colesmcintosh in #11417
- [Build] Bump dd trace version by @ishaan-jaff in #11426
- Add together_ai provided deepseek-r1 family model configuration by @jtsai-quid in #11394
- fix: Use proper attribute for Sagemaker request for embeddings by @tmbo in #11362
- added gemini url context support by @wangsha in #11351
- fix(redis_cache.py): support pipeline redis lpop for older redis vers… by @krrishdholakia in #11425
- Support no reasoning option for gemini models by @lowjiansheng in #11393
- fix(prometheus.py): pass custom metadata labels in litellm_total_toke… by @krrishdholakia in #11414
- Fix None values in usage field for gpt-image-1 model responses by @colesmcintosh in #11448
- Fix HuggingFace embeddings using non-default `input_type` by @seankwalker in #11452
- Add AGENTS.md by @colesmcintosh in #11461
- Custom Root Path Improvements: don't require reserving `/litellm` route by @krrishdholakia in #11460
- [Feat] Make batch size for maximum retention in spend logs a controllable parameter by @ishaan-jaff in #11459
- Add pangea to guardrails sidebar by @ryanmeans in #11464
- [Fix] [Bug]: Knowledge Base Call returning error by @ishaan-jaff in #11467
- [Feat] Return response_id == upstream response ID for VertexAI + Google AI studio (Stream+Non stream) by @ishaan-jaff in #11456
- [Fix]: /v1/messages - return streaming usage statistics when using litellm with bedrock models by @ishaan-jaff in #11469
- fix: supports_function_calling works with llm_proxy models by @pazevedo-hyland in #11381
- feat: add HuggingFace rerank provider support by @cainiaoit in #11438
- Litellm dev 06 05 2025 p2 by @krrishdholakia in #11470
- Fix variable redefinition linting error in vertex_and_google_ai_studio_gemini.py by @colesmcintosh in #11486
- Add Google Gemini 2.5 Pro Preview 06-05 by @PeterDaveHello in #11447
- Feat: add azure endpoint for image endpoints by @ishaan-jaff in #11482
- [Feat] New model - add `codex-mini-latest` by @ishaan-jaff in #11492
- Nebius model pricing info updated by @Aktsvigun in #11445
- [Docs] Add audio / tts section for gemini and vertex by @AyrennC in #11306
- Document batch polling logic to avoid ValueError: Output file id is None error by @fadil4u in #11286
- Revert "Nebius model pricing info updated" by @ishaan-jaff in #11493
- [Bug Fix] Fix: `_transform_responses_api_content_to_chat_completion_content` doesn't support file content type by @ishaan-jaff in #11494
- Fix Fireworks AI rate limit exception mapping - detect "rate limit" text in error messages by @colesmcintosh in #11455
- Update Makefile to match CI workflows and improve contributor experience by @colesmcintosh in #11485
- Fix: Respect user_header_name property for budget selection and user identification by @colesmcintosh in #11419
- Update production doc by @ishaan-jaff in #11499
- Enhance proxy CLI with Rich formatting and improved user experience by @colesmcintosh in #11420
- Remove retired version gpt-3.5 from configs.md by @vuanhtu52 in #11508
- Update model version in deploy.md by @vuanhtu52 in #11506
- [Feat] Allow using litellm.completion with /v1/messages API Spec (use gpt-4, gemini etc with claude code) by @ishaan-jaff in #11502
- Update the correct test directory in contributing_code.md by @vuanhtu52 in #11511
New Contributors
- @mjnitz02 made their first contribution in #10385
- @hagan made their first contribution in #10479
- @wwells made their first contribution in #11409
- @likweitan made their first contribution in #11400
- @raz-alon made their first contribution in #10102
- @jtsai-quid made their first contribution in #11394
- @tmbo made their first contribution in #11362
- @wangsha made their first contribution in #11351
- @seankwalker made their first contribution in #11452
- @pazevedo-hyland made their first contribution in #11381
- @cainiaoit made their ...
v1.72.2-nightly
What's Changed
- fix: supports_function_calling works with llm_proxy models by @pazevedo-hyland in #11381
- feat: add HuggingFace rerank provider support by @cainiaoit in #11438
- Litellm dev 06 05 2025 p2 by @krrishdholakia in #11470
- Fix variable redefinition linting error in vertex_and_google_ai_studio_gemini.py by @colesmcintosh in #11486
- Add Google Gemini 2.5 Pro Preview 06-05 by @PeterDaveHello in #11447
- Feat: add azure endpoint for image endpoints by @ishaan-jaff in #11482
- [Feat] New model - add `codex-mini-latest` by @ishaan-jaff in #11492
- Nebius model pricing info updated by @Aktsvigun in #11445
- [Docs] Add audio / tts section for gemini and vertex by @AyrennC in #11306
- Document batch polling logic to avoid ValueError: Output file id is None error by @fadil4u in #11286
- Revert "Nebius model pricing info updated" by @ishaan-jaff in #11493
- [Bug Fix] Fix: `_transform_responses_api_content_to_chat_completion_content` doesn't support file content type by @ishaan-jaff in #11494
- Fix Fireworks AI rate limit exception mapping - detect "rate limit" text in error messages by @colesmcintosh in #11455
- Update Makefile to match CI workflows and improve contributor experience by @colesmcintosh in #11485
- Fix: Respect user_header_name property for budget selection and user identification by @colesmcintosh in #11419
- Update production doc by @ishaan-jaff in #11499
- Enhance proxy CLI with Rich formatting and improved user experience by @colesmcintosh in #11420
- Remove retired version gpt-3.5 from configs.md by @vuanhtu52 in #11508
- Update model version in deploy.md by @vuanhtu52 in #11506
- [Feat] Allow using litellm.completion with /v1/messages API Spec (use gpt-4, gemini etc with claude code) by @ishaan-jaff in #11502
New Contributors
- @pazevedo-hyland made their first contribution in #11381
- @cainiaoit made their first contribution in #11438
- @vuanhtu52 made their first contribution in #11508
Full Changelog: v1.72.1.dev8...v1.72.2-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.72.2-nightly
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 180.0 | 201.25992693793458 | 6.245318195564 | 0.0 | 1869 | 0 | 165.1556739999478 | 1316.0002060000124 |
Aggregated | Passed ✅ | 180.0 | 201.25992693793458 | 6.245318195564 | 0.0 | 1869 | 0 | 165.1556739999478 | 1316.0002060000124 |
v1.72.0-stable
Full Changelog: v1.72.0.rc1...v1.72.0-stable
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.72.0-stable
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 180.0 | 202.72874892497356 | 6.248285605273851 | 0.0 | 1866 | 0 | 166.82234500001414 | 1400.0550900000235 |
Aggregated | Passed ✅ | 180.0 | 202.72874892497356 | 6.248285605273851 | 0.0 | 1866 | 0 | 166.82234500001414 | 1400.0550900000235 |
v1.72.2.dev_image
What's Changed
- fix: supports_function_calling works with llm_proxy models by @pazevedo-hyland in #11381
- feat: add HuggingFace rerank provider support by @cainiaoit in #11438
- Litellm dev 06 05 2025 p2 by @krrishdholakia in #11470
New Contributors
- @pazevedo-hyland made their first contribution in #11381
- @cainiaoit made their first contribution in #11438
Full Changelog: v1.72.1.dev8...v1.72.2.dev_image
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.72.2.dev_image
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 250.0 | 266.1860423928771 | 6.200527110342324 | 0.0 | 1853 | 0 | 215.66505500004496 | 1307.9809779999891 |
Aggregated | Passed ✅ | 250.0 | 266.1860423928771 | 6.200527110342324 | 0.0 | 1853 | 0 | 215.66505500004496 | 1307.9809779999891 |
v1.72.1.dev8
What's Changed
- update GCSBucketBase to handle GSM project ID if passed by @wwells in #11409
- fix: add enterprise feature gating to RegenerateKeyModal in KeyInfoView by @likweitan in #11400
- Litellm audit log staging by @krrishdholakia in #11418
- Add User ID validation to ensure it is not an email or phone number by @raz-alon in #10102
- [Performance] Performance improvements for /v1/messages route by @ishaan-jaff in #11421
- Add SSO configuration endpoints and UI integration with persistent settings by @colesmcintosh in #11417
- [Build] Bump dd trace version by @ishaan-jaff in #11426
- Add together_ai provided deepseek-r1 family model configuration by @jtsai-quid in #11394
- fix: Use proper attribute for Sagemaker request for embeddings by @tmbo in #11362
- added gemini url context support by @wangsha in #11351
- fix(redis_cache.py): support pipeline redis lpop for older redis vers… by @krrishdholakia in #11425
- Support no reasoning option for gemini models by @lowjiansheng in #11393
- fix(prometheus.py): pass custom metadata labels in litellm_total_toke… by @krrishdholakia in #11414
- Fix None values in usage field for gpt-image-1 model responses by @colesmcintosh in #11448
- Fix HuggingFace embeddings using non-default `input_type` by @seankwalker in #11452
- Add AGENTS.md by @colesmcintosh in #11461
- Custom Root Path Improvements: don't require reserving `/litellm` route by @krrishdholakia in #11460
- [Feat] Make batch size for maximum retention in spend logs a controllable parameter by @ishaan-jaff in #11459
- Add pangea to guardrails sidebar by @ryanmeans in #11464
- [Fix] [Bug]: Knowledge Base Call returning error by @ishaan-jaff in #11467
- [Feat] Return response_id == upstream response ID for VertexAI + Google AI studio (Stream+Non stream) by @ishaan-jaff in #11456
- [Fix]: /v1/messages - return streaming usage statistics when using litellm with bedrock models by @ishaan-jaff in #11469
New Contributors
- @wwells made their first contribution in #11409
- @likweitan made their first contribution in #11400
- @raz-alon made their first contribution in #10102
- @jtsai-quid made their first contribution in #11394
- @tmbo made their first contribution in #11362
- @wangsha made their first contribution in #11351
- @seankwalker made their first contribution in #11452
Full Changelog: v1.72.1-nightly...v1.72.1.dev8
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.72.1.dev8
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 180.0 | 202.7306475996812 | 6.284886409238797 | 0.0 | 1881 | 0 | 164.28741799995805 | 1311.9080620000432 |
Aggregated | Passed ✅ | 180.0 | 202.7306475996812 | 6.284886409238797 | 0.0 | 1881 | 0 | 164.28741799995805 | 1311.9080620000432 |