Releases: BerriAI/litellm
v1.41.4
What's Changed
- fix(router.py): disable cooldowns by @krrishdholakia in #4497
- fix(slack_alerting.py): use in-memory cache for checking request status by @krrishdholakia in #4520
- feat(vertex_httpx.py): Support cachedContent. by @Manouchehri in #4492
- [Fix+Test] /audio/transcriptions - use initialized OpenAI / Azure OpenAI clients by @ishaan-jaff in #4519
- [Fix-Proxy] Background health checks use deep copy of model list for _run_background_health_check by @ishaan-jaff in #4518
- refactor(azure.py): move azure dall-e calls to httpx client by @krrishdholakia in #4523
- feat(dynamic_rate_limiter.py): support dynamic rate limiting on rpm by @krrishdholakia in #4502
- [Enterprise] Check if Key should run secret_detection callback by @ishaan-jaff in #4524
- [Feat] Control Lakera AI per Request by @ishaan-jaff in #4525
Full Changelog: v1.41.3...v1.41.4
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.41.4
```
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 150.0 | 174.7746187513257 | 6.304554345115949 | 0.0 | 1886 | 0 | 120.43884399997751 | 1842.0690810000337 |
Aggregated | Passed ✅ | 150.0 | 174.7746187513257 | 6.304554345115949 | 0.0 | 1886 | 0 | 120.43884399997751 | 1842.0690810000337 |
v1.41.3.dev3
What's Changed
- fix(slack_alerting.py): use in-memory cache for checking request status by @krrishdholakia in #4520
- feat(vertex_httpx.py): Support cachedContent. by @Manouchehri in #4492
- [Fix+Test] /audio/transcriptions - use initialized OpenAI / Azure OpenAI clients by @ishaan-jaff in #4519
- [Fix-Proxy] Background health checks use deep copy of model list for _run_background_health_check by @ishaan-jaff in #4518
- refactor(azure.py): move azure dall-e calls to httpx client by @krrishdholakia in #4523
Full Changelog: v1.41.3.dev2...v1.41.3.dev3
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.41.3.dev3
```
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 100.0 | 121.89055827443096 | 6.607625351081853 | 0.0 | 1975 | 0 | 85.08053399998516 | 1232.177610000008 |
Aggregated | Passed ✅ | 100.0 | 121.89055827443096 | 6.607625351081853 | 0.0 | 1975 | 0 | 85.08053399998516 | 1232.177610000008 |
v1.41.3.dev2
What's Changed
- fix(router.py): disable cooldowns by @krrishdholakia in #4497
Full Changelog: v1.41.3...v1.41.3.dev2
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.41.3.dev2
```
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 130.0 | 146.68007458654787 | 6.4100923703571215 | 0.0 | 1918 | 0 | 104.69528600003741 | 1155.2264040000182 |
Aggregated | Passed ✅ | 130.0 | 146.68007458654787 | 6.4100923703571215 | 0.0 | 1918 | 0 | 104.69528600003741 | 1155.2264040000182 |
v1.41.3.dev1
What's Changed
- Update azure_ai.md by @sauravpanda in #4485
- fix(bedrock_httpx.py): Add anthropic.claude-3-5-sonnet-20240620-v1:0 to converse list by @Manouchehri in #4484
- Fix usage of parameter-based credentials when using vertex_ai_beta route by @ubirdio in #4486
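For the parameter-based credentials fix in #4486, here is a minimal sketch of passing service-account credentials directly on the vertex_ai_beta route; the parameter names follow litellm's Vertex AI docs, and the file path, project, and model values are placeholders:

```python
# Sketch only: pass Google credentials as a call parameter instead of relying
# on env vars / application-default credentials. Path, project, and model
# below are placeholders, not values from this release.
import litellm

with open("/path/to/service_account.json") as f:  # placeholder path
    vertex_credentials = f.read()

response = litellm.completion(
    model="vertex_ai_beta/gemini-1.5-flash",
    messages=[{"role": "user", "content": "hello"}],
    vertex_credentials=vertex_credentials,
    vertex_project="my-gcp-project",   # placeholder project
    vertex_location="us-central1",
)
print(response)
```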
New Contributors
- @sauravpanda made their first contribution in #4485
- @ubirdio made their first contribution in #4486
Full Changelog: v1.41.2-stable...v1.41.3.dev1
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.41.3.dev1
```
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 120.0 | 139.74836169660554 | 6.401173614938302 | 0.0 | 1915 | 0 | 100.05953000001 | 2095.8458609999866 |
Aggregated | Passed ✅ | 120.0 | 139.74836169660554 | 6.401173614938302 | 0.0 | 1915 | 0 | 100.05953000001 | 2095.8458609999866 |
v1.41.3
What's Changed
- Update azure_ai.md by @sauravpanda in #4485
- fix(bedrock_httpx.py): Add anthropic.claude-3-5-sonnet-20240620-v1:0 to converse list by @Manouchehri in #4484
- Fix usage of parameter-based credentials when using vertex_ai_beta route by @ubirdio in #4486
- [Feat] Return Response headers for OpenAI / Azure OpenAI when `litellm.return_response_headers=True` by @ishaan-jaff in #4501 (see the sketch after this list)
- [Feat-Enterprise] log `"x-ratelimit-remaining-tokens"` and `"x-ratelimit-remaining-requests"` on prometheus by @ishaan-jaff in #4503
- fix exception provider not known by @ishaan-jaff in #4504
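A minimal sketch of the new flag from #4501: `litellm.return_response_headers` is the name given in the PR title, while where the raw OpenAI / Azure OpenAI headers surface on the response object is an assumption to verify against your version:

```python
# Sketch only: `litellm.return_response_headers` is the flag named in #4501.
# Where the raw provider headers land on the returned object is an assumption
# here; inspect the response in your version to find them.
import litellm

litellm.return_response_headers = True  # enable header passthrough globally

response = litellm.completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "ping"}],
)
print(response)  # inspect the returned object for the provider headers
```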
New Contributors
- @sauravpanda made their first contribution in #4485
- @ubirdio made their first contribution in #4486
Full Changelog: v1.41.2...v1.41.3
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.41.3
```
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 130.0 | 158.51689002697222 | 6.443513101897096 | 0.0 | 1928 | 0 | 109.03273299999228 | 1351.596542999971 |
Aggregated | Passed ✅ | 130.0 | 158.51689002697222 | 6.443513101897096 | 0.0 | 1928 | 0 | 109.03273299999228 | 1351.596542999971 |
v1.41.2-stable
Full Changelog: v1.41.2...v1.41.2-stable
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.41.2-stable
```
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 130.0 | 155.2631200462578 | 6.418460613857053 | 0.0 | 1924 | 0 | 115.53501399998822 | 1721.5953009999794 |
Aggregated | Passed ✅ | 130.0 | 155.2631200462578 | 6.418460613857053 | 0.0 | 1924 | 0 | 115.53501399998822 | 1721.5953009999794 |
v1.41.2
What's Changed
- [Feat] /spend/report by API Key by @ishaan-jaff in #4471
- fix(utils.py): new `supports_response_schema()` function to check if provider/model supports the param by @krrishdholakia in #4476 (see the sketch after this list)
- [Feature] Support aws_session_token for bedrock client (#4346) by @bschulth in #4371
- [Fix] - Error str in OpenAI, Azure exception by @ishaan-jaff in #4477
- [Fix] DALL-E connection error bug on litellm proxy by @ishaan-jaff in #4480
- [Fix] Proxy ErrorLogs store raw exception in error log by @ishaan-jaff in #4474
- fix(langfuse.py): use clean metadata instead of deepcopy by @krrishdholakia in #4413
- [Fix] Admin UI - fix error users were seeing when logging in (use correct user_id when creating key for admin ui) by @ishaan-jaff in #4479
- feat(vertex_httpx.py): support the 'response_schema' param for older vertex ai models by @krrishdholakia in #4478
- Vertex anthropic json mode by @krrishdholakia in #4482
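A minimal sketch of the `supports_response_schema()` check from #4476, assuming the helper is exposed at the package top level (the changelog only places it in utils.py, so the import path is an assumption) and using a hypothetical example model:

```python
# Sketch: check whether a model advertises support for the `response_schema`
# param before sending structured-output requests. Top-level exposure on
# `litellm` is assumed; PR #4476 only says the helper was added in utils.py.
import litellm

model = "vertex_ai/gemini-1.5-pro"  # hypothetical example model
if litellm.supports_response_schema(model=model):
    print(f"{model} accepts a response_schema param")
else:
    print(f"{model} does not advertise response_schema support")
```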
Full Changelog: v1.41.1...v1.41.2
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.41.2
```
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 81 | 94.17508715704547 | 6.427710896354567 | 0.0 | 1923 | 0 | 69.435722999998 | 1632.053753000008 |
Aggregated | Passed ✅ | 81 | 94.17508715704547 | 6.427710896354567 | 0.0 | 1923 | 0 | 69.435722999998 | 1632.053753000008 |
v1.41.1
What's Changed
- [Doc] Add spec on pass through endpoints by @ishaan-jaff in #4468
- feat - Allow adding authentication on pass through endpoints by @ishaan-jaff in #4469
- tests - pass through endpoint by @ishaan-jaff in #4470
Full Changelog: v1.41.0...v1.41.1
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.41.1
```
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 120.0 | 135.94747413256803 | 6.401388314451501 | 0.0 | 1916 | 0 | 98.61660699999675 | 1625.039872000002 |
Aggregated | Passed ✅ | 120.0 | 135.94747413256803 | 6.401388314451501 | 0.0 | 1916 | 0 | 98.61660699999675 | 1625.039872000002 |
v1.41.0-stable
Full Changelog: v1.41.0...v1.41.0-stable
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.41.0-stable
```
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 120.0 | 135.89504825622356 | 6.44293938318117 | 0.0 | 1928 | 0 | 99.83084399999598 | 1682.5208299999872 |
Aggregated | Passed ✅ | 120.0 | 135.89504825622356 | 6.44293938318117 | 0.0 | 1928 | 0 | 99.83084399999598 | 1682.5208299999872 |
v1.41.0
What's Changed
- feat: decrypts aws keys in entrypoint.sh by @krrishdholakia in #4437
- fix: replicate - catch 422 unprocessable entity error by @krrishdholakia in
- fix: router.py - pre-call-checks (if enabled) only check context window limits for azure models if base_model is set by @krrishdholakia in c9a424d
- fix: utils.py - correctly raise openrouter content filter error by @krrishdholakia in ca04244
Note: This release changes how pre-call-checks run for azure models. Filtering models based on context window limits will only apply to azure models if `base_model` is set.
To enable pre-call-checks 👉 https://docs.litellm.ai/docs/routing#pre-call-checks-context-window-eu-regions
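A minimal sketch of a Router set up so this behavior applies; the parameter names follow the pre-call-checks docs linked above, and the deployment values are placeholders:

```python
# Sketch: enable pre-call checks on the Router and pin base_model for an Azure
# deployment so context-window filtering still applies to it after this change.
# Deployment values below are placeholders.
import os
from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "gpt-4",
            "litellm_params": {
                "model": "azure/my-gpt-4-deployment",        # placeholder deployment
                "api_key": os.environ.get("AZURE_API_KEY"),
                "api_base": os.environ.get("AZURE_API_BASE"),
            },
            # without base_model, context-window filtering now skips this azure model
            "model_info": {"base_model": "azure/gpt-4-1106-preview"},
        }
    ],
    enable_pre_call_checks=True,  # see the pre-call-checks docs linked above
)
```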
Full Changelog: v1.40.31...v1.41.0
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.41.0
```
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 130.0 | 156.9410206132132 | 6.2719899835647945 | 0.0 | 1877 | 0 | 112.84582399997589 | 1745.2864320000003 |
Aggregated | Passed ✅ | 130.0 | 156.9410206132132 | 6.2719899835647945 | 0.0 | 1877 | 0 | 112.84582399997589 | 1745.2864320000003 |