Releases: BerriAI/litellm
v1.74.4-nightly
What's Changed
- Litellm release notes 07 12 2025 by @krrishdholakia in #12563
- Add Bytez to the list of providers in the docs by @inf3rnus in #12588
- [Feat] New LLM API Integration - Add Moonshot API (Kimi) (#12551) by @ishaan-jaff in #12592
- [Feat] Add ai21/jamba-1.7 model family pricing by @ishaan-jaff in #12593
- fix: add implicit caching cost calculation for Gemini 2.x models by @colesmcintosh in #12585
- Updated release notes by @krrishdholakia in #12594
- [Feat] Vector Stores - Add Vertex RAG Engine API as a provider by @ishaan-jaff in #12595
- Wildcard model filter by @NANDINI-star in #12597
- [Bug fix] [Bug]: Verbose log is enabled by default by @ishaan-jaff in #12596
- Control Plane + Data Plane support by @krrishdholakia in #12601
- Claude 4 Bedrock /invoke route support + Bedrock application inference profile tool choice support by @krrishdholakia in #12599
- refactor(prisma_migration.py): refactor to support use_prisma_migrate - for helm hook by @krrishdholakia in #12600
- feat: Add envVars and extraEnvVars support to Helm migrations job by @AntonioKL in #12591
- feat(gemini): Add custom TTL support for context caching (#9810) by @marcelodiaz558 in #12541
- fix(anthropic): fix streaming + response_format + tools bug by @dmcaulay in #12463
- [Bug Fix] Include /mcp in list of available routes on proxy by @ishaan-jaff in #12612
- Add Copy-on-Click for IDs by @NANDINI-star in #12615
- add azure blob cache support by @demoray in #12587
- refactor(mcp): Make MCP_TOOL_PREFIX_SEPARATOR configurable from env by @juancarlosm in #12603
- [Bug Fix] Add swagger docs for LiteLLM /chat/completions, /embeddings, /responses by @ishaan-jaff in #12618
- [Docs] troubleshooting SSO configs by @ishaan-jaff in #12621
- [Feat] MCP Gateway - allow using MCPs with all LLM APIs when using /responses with LiteLLM by @ishaan-jaff in #12546
- rm claude instant 1 and 1.2 from model_prices_and_context_window.json by @staeiou in #12631
- Add "keys import" command to CLI by @msabramo in #12620
- Add token pricing for Together.ai Llama-4 and DeepSeek models by @stefanc-ai2 in #12622
- Add input_cost_per_pixel to values in ModelGroupInfo model by @Mte90 in #12604
- fix: role chaining with webauthentication for aws bedrock by @RichardoC in #12607
- (#11794) use upsert for managed object table rather than create to avoid UniqueViolationError by @yeahyung in #11795
- [Bug Fix] [Bug]: Knowledge Base Call returning error by @ishaan-jaff in #12628
- fix(router.py): use more descriptive error message + UI - enable team admins to update member role by @krrishdholakia in #12629
- fix(proxy_server.py): fixes for handling team only models on UI by @krrishdholakia in #12632
- OpenAI deepresearch models via `.completion` support by @krrishdholakia in #12627
- fix: Handle circular references in spend tracking metadata JSON serialization by @colesmcintosh in #12643
- Fix bedrock nova micro and lite info by @mnguyen96 in #12619
- [New Model] add together_ai/moonshotai/Kimi-K2-Instruct by @ishaan-jaff in #12645
- Add groq/moonshotai-kimi-k2-instruct model configuration by @colesmcintosh in #12648
- [Bug Fix] grok-4 does not support the `stop` param by @ishaan-jaff in #12646
- Add GitHub Copilot LiteLLM tutorial by @colesmcintosh in #12649
- Fix unused imports in completion_extras transformation by @colesmcintosh in #12655
- [MCP Gateway] Allow MCP access groups to be added via the config LIT-312 by @jugaldb in #12654
- [MCP Gateway] List tools from access list for keys by @jugaldb in #12657
- [MCP Gateway] Allow MCP sse and http to have namespaced url for better segregation LIT-304 by @jugaldb in #12658
- [Feat] Allow reading custom logger python scripts from s3 by @ishaan-jaff in #12623
- [Feat] UI - Add `end_user` filter on UI by @ishaan-jaff in #12663
- [Bug Fix] StandardLoggingPayload on cache_hits should track custom llm provider + DD LLM Obs span type by @ishaan-jaff in #12652
- [Bug Fix] SCIM - add GET /ServiceProviderConfig by @ishaan-jaff in #12664
- feat: add input_fidelity parameter for OpenAI image generation by @colesmcintosh in #12662
New Contributors
- @AntonioKL made their first contribution in #12591
- @marcelodiaz558 made their first contribution in #12541
- @dmcaulay made their first contribution in #12463
- @demoray made their first contribution in #12587
- @staeiou made their first contribution in #12631
- @stefanc-ai2 made their first contribution in #12622
- @RichardoC made their first contribution in #12607
- @yeahyung made their first contribution in #11795
- @mnguyen96 made their first contribution in #12619
Full Changelog: v1.74.3.rc.1...v1.74.4-nightly
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.74.4-nightly
```
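Setting `STORE_MODEL_IN_DB=True` lets the proxy persist models added via the Admin UI; alternatively, models can be declared in a mounted `config.yaml`. A minimal sketch of such a config (the model entries and environment variable below are illustrative, not part of this release):

```yaml
# config.yaml -- minimal LiteLLM proxy config (illustrative model entries)
model_list:
  - model_name: gpt-4o                    # alias that clients request
    litellm_params:
      model: openai/gpt-4o                # provider/model string passed to litellm
      api_key: os.environ/OPENAI_API_KEY  # key is read from the environment
```

Mount it and point the proxy at it, e.g. `docker run -v $(pwd)/config.yaml:/app/config.yaml -p 4000:4000 ghcr.io/berriai/litellm:main-v1.74.4-nightly --config /app/config.yaml`.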
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 240.0 | 258.63325774080073 | 6.1786141049802525 | 0.0 | 1848 | 0 | 211.92541800002118 | 1368.992559999981 |
Aggregated | Passed ✅ | 240.0 | 258.63325774080073 | 6.1786141049802525 | 0.0 | 1848 | 0 | 211.92541800002118 | 1368.992559999981 |
v1.74.3.rc.3
Full Changelog: v1.74.3.rc.2...v1.74.3.rc.3
v1.74.3.dev2
What's Changed
- Litellm release notes 07 12 2025 by @krrishdholakia in #12563
- Add Bytez to the list of providers in the docs by @inf3rnus in #12588
- [Feat] New LLM API Integration - Add Moonshot API (Kimi) (#12551) by @ishaan-jaff in #12592
- [Feat] Add ai21/jamba-1.7 model family pricing by @ishaan-jaff in #12593
- fix: add implicit caching cost calculation for Gemini 2.x models by @colesmcintosh in #12585
- Updated release notes by @krrishdholakia in #12594
- [Feat] Vector Stores - Add Vertex RAG Engine API as a provider by @ishaan-jaff in #12595
- Wildcard model filter by @NANDINI-star in #12597
- [Bug fix] [Bug]: Verbose log is enabled by default by @ishaan-jaff in #12596
- Control Plane + Data Plane support by @krrishdholakia in #12601
- Claude 4 Bedrock /invoke route support + Bedrock application inference profile tool choice support by @krrishdholakia in #12599
- refactor(prisma_migration.py): refactor to support use_prisma_migrate - for helm hook by @krrishdholakia in #12600
- feat: Add envVars and extraEnvVars support to Helm migrations job by @AntonioKL in #12591
- feat(gemini): Add custom TTL support for context caching (#9810) by @marcelodiaz558 in #12541
- fix(anthropic): fix streaming + response_format + tools bug by @dmcaulay in #12463
New Contributors
- @AntonioKL made their first contribution in #12591
- @marcelodiaz558 made their first contribution in #12541
- @dmcaulay made their first contribution in #12463
Full Changelog: v1.74.3.rc.1...v1.74.3.dev2
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.74.3.dev2
```
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 190.0 | 206.89425634609142 | 6.19933434941609 | 0.0 | 1855 | 0 | 168.97698900004343 | 1646.9904610000299 |
Aggregated | Passed ✅ | 190.0 | 206.89425634609142 | 6.19933434941609 | 0.0 | 1855 | 0 | 168.97698900004343 | 1646.9904610000299 |
v1.74.3.rc.2
Full Changelog: v1.74.3.rc.1...v1.74.3.rc.2
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.74.3.rc.2
```
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 240.0 | 261.64451627750555 | 6.164255932341009 | 0.0 | 1845 | 0 | 216.88029500000994 | 1499.7778569999696 |
Aggregated | Passed ✅ | 240.0 | 261.64451627750555 | 6.164255932341009 | 0.0 | 1845 | 0 | 216.88029500000994 | 1499.7778569999696 |
v1.74.3.rc.1
What's Changed
- Fix: Output github copilot verification uri immediately when running in docker. by @gemyago in #12558
- [MCP Gateway] add access group documentation by @jugaldb in #12557
- fix: handle missing 'env' attribute in MCP server Prisma model by @colesmcintosh in #12560
- UI - v1.74.3-stable QA fixes by @ishaan-jaff in #12559
- [MCP Gateway] Ensure we use the same param for specifying groups by @ishaan-jaff in #12561
Full Changelog: v1.74.3-stable-draft...v1.74.3.rc.1
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.74.3.rc.1
```
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 230.0 | 255.52282259223816 | 6.28594793060673 | 0.0 | 1881 | 0 | 206.60606100000223 | 1164.8403710000252 |
Aggregated | Passed ✅ | 230.0 | 255.52282259223816 | 6.28594793060673 | 0.0 | 1881 | 0 | 206.60606100000223 | 1164.8403710000252 |
v1.74.3-nightly
What's Changed
- fix bedrock cost calculation for cached tokens by @jdietzsch91 in #12488
- Fix tool call handling in Anthropic pass-through adapter by @iwinux in #12473
- Guardrails AI - pre-call + logging only guardrail (pii detection/competitor names) support by @krrishdholakia in #12506
- Litellm mcp access group on UI by @jugaldb in #12470
- [Enterprise] Support tag based mode for guardrails by @krrishdholakia in #12508
- Litellm mcp access group by @jugaldb in #12514
- Add `Build and push litellm-non_root` to `docker-hub-deploy` CICD workflow by @andresC98 in #12413
- Validation to mcp server name by @jugaldb in #12515
- [Feat] - New guardrail - OpenAI Moderations API by @ishaan-jaff in #12519
- [MCP Gateway] QA - MCP Tool Testing Playground by @ishaan-jaff in #12520
- [Security Fix] - Dont show pure JWT in "Logs" page on UI by @ishaan-jaff in #12524
- [Bug Fix] - QA for MCP Gateway - show the cost config on the root of MCP Settings by @ishaan-jaff in #12526
- [MCP Gateway] access group UI object permission fix by @jugaldb in #12523
- [MCP Gateway] UI Quality check fixes by @jugaldb in #12521
- [MCP Gateway] Allow using stdio MCPs with LiteLLM by @ishaan-jaff in #12530
- docs: Update github.md by @EmaSuriano in #12509
- Remove deprecated pydantic class Config by @strawgate in #12528
- Team Members - reset budget, if duration set + Prometheus - support tag based metrics by @krrishdholakia in #12534
- Consistent layout for Create and Back buttons on all the pages by @NANDINI-star in #12542
- Fix e2e test by @NANDINI-star in #12544
- Align Show Password with Checkbox by @NANDINI-star in #12538
- chore: Update Vertex AI Model Garden LiteLLM integration tutorial by @lizzij in #12428
- [Bug Fix] xai/ translation fix - ensure finish_reason includes tool calls when xai responses with tool calls by @ishaan-jaff in #12545
- Prevent writing default user setting updates to yaml (error in non-root env) + Use central team member budget when max_budget_in_team set on UI by @krrishdholakia in #12533
- [MCP Gateway] Allow mcp access groups on test key and tool calls by @jugaldb in #12529
- [MCP Gateway] UI headers groups example on connect tab by @jugaldb in #12550
- Fix e2e test by @NANDINI-star in #12549
- Integration: Bytez as a model provider by @inf3rnus in #12121
New Contributors
- @jdietzsch91 made their first contribution in #12488
- @iwinux made their first contribution in #12473
- @andresC98 made their first contribution in #12413
- @EmaSuriano made their first contribution in #12509
- @strawgate made their first contribution in #12528
- @inf3rnus made their first contribution in #12121
Full Changelog: v1.74.2-nightly...v1.74.3-nightly
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.74.3-nightly
```
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Failed ❌ | 210.0 | 397.6239629079091 | 5.915434271857456 | 0.0 | 1770 | 0 | 185.11648200001218 | 16174.709219000022 |
Aggregated | Failed ❌ | 210.0 | 397.6239629079091 | 5.915434271857456 | 0.0 | 1770 | 0 | 185.11648200001218 | 16174.709219000022 |
v1.74.3-stable-draft
What's Changed
- Sticky session for Test Key page by @NANDINI-star in #12365
- Truncate long labels and improve tooltip in Top API Keys chart by @NANDINI-star in #12371
- [Bug Fix] s3 config.yaml file - ensure yaml safe load is used by @ishaan-jaff in #12373
- [Feat] Bump langfuse python SDK version and `LANGFUSE_TRACING_ENVIRONMENT` by @ishaan-jaff in #12376
- [Security] Bump mcp version on docker img by @ishaan-jaff in #12362
- fix: make TextCompletionStreamWrapper conversion retain reasoning_content by @aholmberg in #12377
- Bump mcp from 1.9.3 to 1.10.0 by @dependabot[bot] in #12388
- [Feat] Add MCP Cost Tracking by @ishaan-jaff in #12385
- feat: add image support for Responses API when falling back on Chat Completions by @ryan-castner in #12204
- Add 'thinking blocks' to stream chunk builder + remove experimental 'by_tag' metrics on prometheus (fix cardinality issue) by @krrishdholakia in #12395
- Add audit logs on model update by @krrishdholakia in #12396
- Improve Chart Readability for Tags by @NANDINI-star in #12378
- Fix API base url for Github Copilot provider by @kanaka in #12418
- fix(proxy/mcp): Error handling MCP request: Task group is not initialized by @juancarlosm in #12411
- style: update sambanova logos by @jhpiedrahitao in #12431
- [Bug fix] MCP MCP_TOOL_PREFIX_SEPARATOR to work with claude code by @jugaldb in #12430
- Prevent navigation reset after team member operations by @NANDINI-star in #12424
- Fix guardrails_ai.md documentation page by @DmitriyAlergant in #12356
- [Bug fix] Multiple API Keys Created on Startup When max_budget is Enabled by @ishaan-jaff in #12436
- [Feat] Add XInference Image Generation API Provider by @ishaan-jaff in #12439
- [Feat] Bedrock Guardrails - Raise Bedrock output text on 'BLOCKED' actions from guardrail by @ishaan-jaff in #12435
- MCP - usage tracking by @krrishdholakia in #12397
- fix(utils.py): rollback faulty security check on files by @krrishdholakia in #12441
- Fix: Properly close aiohttp client sessions to prevent resource leaks by @colesmcintosh in #12251
- Remove temporary test files by @colesmcintosh in #12442
- (Router) don't add invalid deployment to router pattern match by @krrishdholakia in #12459
- [Feat] MCP Gateway - Allow customizing what client side header to use by @ishaan-jaff in #12460
- [Bug Fix] Ensure supported `bedrock/converse/` params = `bedrock/` params by @ishaan-jaff in #12466
- Litellm mcp internal users by @jugaldb in #12458
- [Feat] SSO - Allow users to run a custom sso login handler by @ishaan-jaff in #12465
- [Bug Fix] `DataDogLLMObsLogger` push `total_cost` by @ishaan-jaff in #12467
- [MCP Gateway] - Allow using custom post call MCP hook for cost tracking by @ishaan-jaff in #12469
- DB Spend Update Writer: fix query + Allow anthropic-beta header when forward_client_headers_to_llm_api is true by @krrishdholakia in #12462
- OTEL - OTEL_RESOURCE_ATTRIBUTES support + Model Hub - new model hub table view, new `/public/model_hub` endpoint, fix duplicates in `/model_group/info` by @krrishdholakia in #12468
- Resolve model group alias on Auth + `/v1/messages` Fallback support by @krrishdholakia in #12440
- add grok-4 configs to table by @fcakyon in #12476
- fix slack alerts by @jugaldb in #12464
- Add devstral-small-2507 and devstral-medium-2507 models by @xingyaoww in #12484
- [Bug Fix] fix parsing environment_variables from config.yaml (arize logger integration fix) by @ishaan-jaff in #12482
- [Chore] Don't emit warning for Max in memory queue flush count by @ishaan-jaff in #12489
- Add Azure OpenAI o3-deep-research model pricing support by @neubig in #12493
- Feat(bedrock): support api key authentication for AWS Bedrock API by @ishaan-jaff in #12495
- Added validate payload error by @jugaldb in #12494
- [MCP Gateway] - Add custom cost configuration for each MCP tool by @ishaan-jaff in #12499
- [Feat] Add support for editing MCP cost per tool by @ishaan-jaff in #12501
- [docs]: Fix typo and import required types for proxy call hooks by @Rayshard in #12487
- fix: handle reasoning parameters and response in responses bridge by @aholmberg in #12433
- Added dashscope (alibaba's cloud - qwen) as a provider by @minghao51 in #12361
- feat: improve user dropdown UI with premium badge and cleaner layout by @colesmcintosh in #12502
- fix bedrock cost calculation for cached tokens by @jdietzsch91 in #12488
- Fix tool call handling in Anthropic pass-through adapter by @iwinux in #12473
- Guardrails AI - pre-call + logging only guardrail (pii detection/competitor names) support by @krrishdholakia in #12506
- Litellm mcp access group on UI by @jugaldb in #12470
- [Enterprise] Support tag based mode for guardrails by @krrishdholakia in #12508
- Litellm mcp access group by @jugaldb in #12514
- Add `Build and push litellm-non_root` to `docker-hub-deploy` CICD workflow by @andresC98 in #12413
- Validation to mcp server name by @jugaldb in #12515
- [Feat] - New guardrail - OpenAI Moderations API by @ishaan-jaff in #12519
- [MCP Gateway] QA - MCP Tool Testing Playground by @ishaan-jaff in #12520
- [Security Fix] - Dont show pure JWT in "Logs" page on UI by @ishaan-jaff in #12524
- [Bug Fix] - QA for MCP Gateway - show the cost config on the root of MCP Settings by @ishaan-jaff in #12526
- [MCP Gateway] access group UI object permission fix by @jugaldb in #12523
- [MCP Gateway] UI Quality check fixes by @jugaldb in #12521
- [MCP Gateway] Allow using stdio MCPs with LiteLLM by @ishaan-jaff in #12530
- docs: Update github.md by @EmaSuriano in #12509
- Remove deprecated pydantic class Config by @strawgate in #12528
- Team Members - reset budget, if duration set + Prometheus - support tag based metrics by @krrishdholakia in #12534
- Consistent layout for Create and Back buttons on all the pages by @NANDINI-star in #12542
- Fix e2e test by @NANDINI-star in #12544
- Align Show Password with Checkbox by @NANDINI-star in #12538
- chore: Update Vertex AI Model Garden LiteLLM integration tutorial by @lizzij in #12428
- [Bug Fix] xai/ translation fix - ensure finish_reason includes tool calls when xai responses with tool calls by @ishaan-jaff in #12545
- Prevent writing default user setting updates to yaml (error in non-root env) + Use central team member budget when max_budget_in_team set on UI by @krrishdholakia in #12533
- [MCP Gateway] Allow mcp access groups on test key and tool calls by @jugaldb in #12529
- [MCP Gateway] UI headers groups example on connect tab by @jugaldb in #12550
- Fix e2e test by @NANDINI-star in #12549
- Integration: Bytez as a model provider by @inf3rnus in #12121
- [CI/CD fix] test_redis_caching_multiple_namespaces by @ishaan-jaff in #12552
- [MCP Gateway] access group fixes on UI for keys and teams by @jugaldb in #12556
- UI - Model Hub Page - minor fixes + improvements (+ Make Model Hub OSS) by @krrishdho...
v1.74.2-nightly
What's Changed
- add grok-4 configs to table by @fcakyon in #12476
- fix slack alerts by @jugaldb in #12464
- Add devstral-small-2507 and devstral-medium-2507 models by @xingyaoww in #12484
- [Bug Fix] fix parsing environment_variables from config.yaml (arize logger integration fix) by @ishaan-jaff in #12482
- [Chore] Don't emit warning for Max in memory queue flush count by @ishaan-jaff in #12489
- Add Azure OpenAI o3-deep-research model pricing support by @neubig in #12493
- Feat(bedrock): support api key authentication for AWS Bedrock API by @ishaan-jaff in #12495
- Added validate payload error by @jugaldb in #12494
- [MCP Gateway] - Add custom cost configuration for each MCP tool by @ishaan-jaff in #12499
- [Feat] Add support for editing MCP cost per tool by @ishaan-jaff in #12501
- [docs]: Fix typo and import required types for proxy call hooks by @Rayshard in #12487
- fix: handle reasoning parameters and response in responses bridge by @aholmberg in #12433
- Added dashscope (alibaba's cloud - qwen) as a provider by @minghao51 in #12361
- feat: improve user dropdown UI with premium badge and cleaner layout by @colesmcintosh in #12502
New Contributors
- @Rayshard made their first contribution in #12487
- @minghao51 made their first contribution in #12361
Full Changelog: v1.74.1-nightly...v1.74.2-nightly
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.74.2-nightly
```
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 200.0 | 216.6222827699786 | 6.190122373161466 | 0.0 | 1852 | 0 | 172.40756000001056 | 1119.208724000032 |
Aggregated | Passed ✅ | 200.0 | 216.6222827699786 | 6.190122373161466 | 0.0 | 1852 | 0 | 172.40756000001056 | 1119.208724000032 |
v1.74.1-nightly
What's Changed
- Add mcp server segregation comma separated support by @jugaldb in #12326
- Fix: Preserve Live Tail State on Log Pages by @NANDINI-star in #12335
- [Feat] JWT - Sync user roles and team memberships when JWT Auth is used by @ishaan-jaff in #11994
- fix watsonx datetime conversion issue py3.10 by @isaken in #12339
- Patch 1 by @isaken in #12338
- UI - Add Azure Content Safety Guardrails (improved UX) by @krrishdholakia in #12330
- UI - Azure Content Guardrails by @krrishdholakia in #12341
- feat(vertex_ai/): add new deepseek-ai api service by @krrishdholakia in #12312
- v1.74.0.rc docs by @ishaan-jaff in #12344
- [Docs] vertex deepseek by @ishaan-jaff in #12345
- docs - 1.74.0.rc by @ishaan-jaff in #12347
- [UI QA] 1.74.0.rc by @ishaan-jaff in #12348
- fix: add proper type annotations for embedding() function by @colesmcintosh in #12262
- Remove stream options from streaming + fix guardrail start time on log duration by @krrishdholakia in #12346
- Add all guardrails to the UI by @krrishdholakia in #12349
- New `/key/service-account/generate` API Endpoint + Team member permissions for creating service account keys by @krrishdholakia in #12350
- Sticky session for Test Key page by @NANDINI-star in #12365
- Truncate long labels and improve tooltip in Top API Keys chart by @NANDINI-star in #12371
- [Bug Fix] s3 config.yaml file - ensure yaml safe load is used by @ishaan-jaff in #12373
- [Feat] Bump langfuse python SDK version and `LANGFUSE_TRACING_ENVIRONMENT` by @ishaan-jaff in #12376
- [Security] Bump mcp version on docker img by @ishaan-jaff in #12362
- fix: make TextCompletionStreamWrapper conversion retain reasoning_content by @aholmberg in #12377
- Bump mcp from 1.9.3 to 1.10.0 by @dependabot in #12388
- [Feat] Add MCP Cost Tracking by @ishaan-jaff in #12385
- feat: add image support for Responses API when falling back on Chat Completions by @ryan-castner in #12204
- Add 'thinking blocks' to stream chunk builder + remove experimental 'by_tag' metrics on prometheus (fix cardinality issue) by @krrishdholakia in #12395
- Add audit logs on model update by @krrishdholakia in #12396
- Improve Chart Readability for Tags by @NANDINI-star in #12378
- Fix API base url for Github Copilot provider by @kanaka in #12418
- fix(proxy/mcp): Error handling MCP request: Task group is not initialized by @juancarlosm in #12411
- style: update sambanova logos by @jhpiedrahitao in #12431
- [Bug fix] MCP MCP_TOOL_PREFIX_SEPARATOR to work with claude code by @jugaldb in #12430
- Prevent navigation reset after team member operations by @NANDINI-star in #12424
- Fix guardrails_ai.md documentation page by @DmitriyAlergant in #12356
- [Bug fix] Multiple API Keys Created on Startup When max_budget is Enabled by @ishaan-jaff in #12436
- [Feat] Add XInference Image Generation API Provider by @ishaan-jaff in #12439
- [Feat] Bedrock Guardrails - Raise Bedrock output text on 'BLOCKED' actions from guardrail by @ishaan-jaff in #12435
- MCP - usage tracking by @krrishdholakia in #12397
- fix(utils.py): rollback faulty security check on files by @krrishdholakia in #12441
- Fix: Properly close aiohttp client sessions to prevent resource leaks by @colesmcintosh in #12251
- Remove temporary test files by @colesmcintosh in #12442
- (Router) don't add invalid deployment to router pattern match by @krrishdholakia in #12459
- [Feat] MCP Gateway - Allow customizing what client side header to use by @ishaan-jaff in #12460
- [Bug Fix] Ensure supported `bedrock/converse/` params = `bedrock/` params by @ishaan-jaff in #12466
- Litellm mcp internal users by @jugaldb in #12458
- [Feat] SSO - Allow users to run a custom sso login handler by @ishaan-jaff in #12465
- [Bug Fix] `DataDogLLMObsLogger` push `total_cost` by @ishaan-jaff in #12467
- [MCP Gateway] - Allow using custom post call MCP hook for cost tracking by @ishaan-jaff in #12469
- DB Spend Update Writer: fix query + Allow anthropic-beta header when forward_client_headers_to_llm_api is true by @krrishdholakia in #12462
- OTEL - OTEL_RESOURCE_ATTRIBUTES support + Model Hub - new model hub table view, new `/public/model_hub` endpoint, fix duplicates in `/model_group/info` by @krrishdholakia in #12468
- Resolve model group alias on Auth + `/v1/messages` Fallback support by @krrishdholakia in #12440
New Contributors
- @isaken made their first contribution in #12339
- @kanaka made their first contribution in #12418
- @juancarlosm made their first contribution in #12411
- @DmitriyAlergant made their first contribution in #12356
Full Changelog: v1.74.0-nightly...v1.74.1-nightly
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.74.1-nightly
```
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 200.0 | 271.9479641086367 | 6.15189964037961 | 0.0033416076264962576 | 1841 | 1 | 168.73124300002473 | 2931.748561000063 |
Aggregated | Passed ✅ | 200.0 | 271.9479641086367 | 6.15189964037961 | 0.0033416076264962576 | 1841 | 1 | 168.73124300002473 | 2931.748561000063 |
v1.74.0-stable
Full Changelog: v1.74.0.rc.2...v1.74.0-stable
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.74.0-stable
```
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 240.0 | 260.8232618687037 | 6.237602465744322 | 0.0 | 1866 | 0 | 210.26094900003045 | 1392.1758870000076 |
Aggregated | Passed ✅ | 240.0 | 260.8232618687037 | 6.237602465744322 | 0.0 | 1866 | 0 | 210.26094900003045 | 1392.1758870000076 |