Releases: BerriAI/litellm
v1.40.29
What's Changed
- Updates Databricks provider docs by @djliden in #4442
- [Feat] Improve secret detection call hook - catch more cases by @ishaan-jaff in #4444 (see the sketch after this list)
- [Fix] Secret redaction logic when used with logging callbacks by @ishaan-jaff in #4443
- [fix] error message on /v2/model/info when no models exist by @ishaan-jaff in #4447
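The secret detection hook runs before a request reaches the underlying provider. A minimal sketch of exercising it through the proxy, assuming the enterprise `hide_secrets` callback is enabled in the proxy config (see the enterprise docs); the AWS key below is the standard fake example key:

```python
# Sketch: send a request containing a leaked-looking credential through the
# proxy; with the secret detection pre-call hook enabled, it should be
# redacted before the request is forwarded to the provider.
import openai

client = openai.OpenAI(
    api_key="sk-1234",               # your proxy virtual key
    base_url="http://0.0.0.0:4000",  # proxy started via the docker run below
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "My key is AKIAIOSFODNN7EXAMPLE, help me debug"}],
)
print(response.choices[0].message.content)
```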
Full Changelog: v1.40.28...v1.40.29
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.40.29
```
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 150.0 | 169.90346260031845 | 6.295057404345822 | 0.0 | 1884 | 0 | 116.81983199997603 | 1212.0624549999661 |
Aggregated | Passed ✅ | 150.0 | 169.90346260031845 | 6.295057404345822 | 0.0 | 1884 | 0 | 116.81983199997603 | 1212.0624549999661 |
v1.40.28
What's Changed
- Added openrouter/anthropic/claude-3.5-sonnet & haiku to model json by @paul-gauthier in #4400
- [Feat] Add Fireworks AI Tool calling support by @ishaan-jaff in #4418 (example below)
- fix add ollama codegemma to model cost map by @ishaan-jaff in #4424
- Add return type annotations to util types by @guitard0g in #4420
- [Fix-Proxy] Store SpendLogs when using Whisper, Moderations etc by @ishaan-jaff in #4427
- [Fix-Proxy] Azure Embeddings use AsyncAzureOpenAI Client initialized on litellm.Router for requests by @ishaan-jaff in #4431
- [Feat] New Provider - Add volcano AI Engine by @ishaan-jaff in #4433
- [Fix] Forward OTEL Traceparent Header to provider by @ishaan-jaff in #4423
- [Feat] Add Codestral pricing by @ishaan-jaff in #4435
- [Feat] Add all Vertex AI Models by @ishaan-jaff in #4421
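Fireworks AI tool calling uses the standard OpenAI tools format. A hedged sketch of the new support; the model name is illustrative, and `FIREWORKS_AI_API_KEY` is assumed to be set in your environment:

```python
from litellm import completion

# Hypothetical tool definition, OpenAI function-calling format
tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}]

response = completion(
    model="fireworks_ai/accounts/fireworks/models/firefunction-v2",  # illustrative
    messages=[{"role": "user", "content": "What's the weather in Boston?"}],
    tools=tools,
    tool_choice="auto",
)
print(response.choices[0].message.tool_calls)
```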
New Contributors
- @guitard0g made their first contribution in #4420
Full Changelog: v1.40.27...v1.40.28
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.40.28
```
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 140.0 | 159.37771652368323 | 6.278392214516223 | 0.0 | 1879 | 0 | 117.58081900001116 | 1089.7057880000034 |
Aggregated | Passed ✅ | 140.0 | 159.37771652368323 | 6.278392214516223 | 0.0 | 1879 | 0 | 117.58081900001116 | 1089.7057880000034 |
v1.40.27
✨ Thrilled to launch support for the @NVIDIA NIM LLM API on @LiteLLM 1.40.27 (example below) 👉 Start here: https://docs.litellm.ai/docs/providers/nvidia_nim
🔥 Proxy 100+ LLMs & set budgets
🔑 [Enterprise] Add secret detection pre-call hook https://docs.litellm.ai/docs/proxy/enterprise#content-moderation
🛠️ [Fix] - use `n` in litellm mock completion responses
⚡️ [Feat] add endpoint to debug memory utilization
🔑 [Enterprise] - allow verifying license in an air-gapped VPC
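A minimal sketch of calling the new NVIDIA NIM provider via the SDK; the model name is illustrative, see the provider docs linked above for the supported list:

```python
import os
from litellm import completion

os.environ["NVIDIA_NIM_API_KEY"] = "nvapi-..."  # placeholder credential

response = completion(
    model="nvidia_nim/meta/llama3-8b-instruct",  # "nvidia_nim/" prefix routes to NIM
    messages=[{"role": "user", "content": "Hello from LiteLLM"}],
)
print(response.choices[0].message.content)
```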
What's Changed
- [Fix-Improve] Improve Ollama prompt input and fix Ollama function calling key error and fix Ollama function calling `can only join an iterable` error by @CorrM in #4373
- Fix Groq Prices by @kiriloman in #4401
- [Feat] add endpoint to debug memory util by @ishaan-jaff in #4364
- [Feat-New Provider] Add Nvidia NIM by @ishaan-jaff in #4403
- [Fix] - use `n` in mock completion responses by @ishaan-jaff in #4405
- enterprise - allow verifying license in air gapped vpc by @ishaan-jaff in #4409
- Create litellm user to fix issue with prisma in k8s by @lolsborn in #4402
- [Enterprise] Add secret detection pre call hook by @ishaan-jaff in #4410
- Revert "Create litellm user to fix issue with prisma in k8s " by @krrishdholakia in #4412
- fix(router.py): set `cooldown_time:` per model by @krrishdholakia in #4411
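For the cooldown fix above, a sketch of the intended usage, assuming `cooldown_time` is read from each deployment's `litellm_params` (seconds a failing deployment is cooled down, overriding the router-wide default); the deployment details are illustrative:

```python
from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "gpt-3.5-turbo",
            "litellm_params": {
                "model": "azure/chatgpt-v-2",                        # illustrative
                "api_key": "os.environ/AZURE_API_KEY",
                "api_base": "https://my-endpoint.openai.azure.com",
                "cooldown_time": 0,  # never cool this deployment down
            },
        },
    ]
)
```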
New Contributors
- @CorrM made their first contribution in #4373
- @kiriloman made their first contribution in #4401
Full Changelog: v1.40.26...v1.40.27
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.40.27
```
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 130.0 | 156.61068343517005 | 6.372506185089714 | 0.0 | 1905 | 0 | 109.52021800000011 | 1799.9076889999515 |
Aggregated | Passed ✅ | 130.0 | 156.61068343517005 | 6.372506185089714 | 0.0 | 1905 | 0 | 109.52021800000011 | 1799.9076889999515 |
v1.40.26
What's Changed
- fix: Lunary integration by @7HR4IZ3 in #4379
- [Fix] - Admin UI login bug by @ishaan-jaff in #4382
- Support custom model info for router logic (e.g. `max_input_tokens`) by @krrishdholakia in #4388 (example after this list)
- fix(vertex_httpx.py): cover gemini content violation (on prompt) by @krrishdholakia in #4392
- [Feat-Enterprise] - Allow setting custom public routes by @ishaan-jaff in #4389
- docs control available public routes by @ishaan-jaff in #4394
- Log rejected router requests to langfuse by @krrishdholakia in #4390
- Fix /spend/calculate use model_group_alias when set by @ishaan-jaff in #4395
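For the custom model info change (#4388), a sketch of setting `max_input_tokens` on a deployment so routing logic can account for it; values are illustrative, and `enable_pre_call_checks` is assumed to be the flag that makes the router enforce token limits:

```python
from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "gemini-pro",
            "litellm_params": {"model": "vertex_ai/gemini-pro"},
            "model_info": {"max_input_tokens": 4096},  # custom per-deployment override
        },
    ],
    enable_pre_call_checks=True,  # assumed flag: filter deployments by token limits
)
```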
Full Changelog: v1.40.25...v1.40.26
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.40.26
```
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 130.0 | 156.53979613674755 | 6.329682565511454 | 0.0 | 1894 | 0 | 109.70386399998233 | 2175.4312479999953 |
Aggregated | Passed ✅ | 130.0 | 156.53979613674755 | 6.329682565511454 | 0.0 | 1894 | 0 | 109.70386399998233 | 2175.4312479999953 |
v1.40.25
What's Changed
- feat(dynamic_rate_limiter.py): Dynamic tpm quota (multiple projects) by @krrishdholakia in #4349
- fix(router.py): Content Policy Fallbacks for Azure 'content_filter' responses by @krrishdholakia in #4365
- Disable message redaction in logs via request header by @msabramo in #4352
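For the redaction change (#4352), a sketch of opting a single request out of message redaction, assuming the header is named `LiteLLM-Disable-Message-Redaction` per that PR and that the proxy has `turn_off_message_logging` enabled globally:

```python
import openai

client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

# extra_headers is forwarded to the proxy, which should then log this
# request's message content instead of redacting it.
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "hi"}],
    extra_headers={"LiteLLM-Disable-Message-Redaction": "true"},
)
```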
Full Changelog: v1.40.24...v1.40.25
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.40.25
```
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 120.0 | 137.1328113917474 | 6.561515229902106 | 0.0 | 1963 | 0 | 98.4713450000072 | 1831.7410280000104 |
Aggregated | Passed ✅ | 120.0 | 137.1328113917474 | 6.561515229902106 | 0.0 | 1963 | 0 | 98.4713450000072 | 1831.7410280000104 |
v1.40.24
What's Changed
- refactor(litellm_logging.py): refactors how slack_alerting generates langfuse trace url by @krrishdholakia in #4344
- [Security Fix - Proxy Server ADMIN UI] - Store credentials in cookies + use strong JWT signing secret by @ishaan-jaff in #4357
- [Test] Test routes on LiteLLM Proxy always includes OpenAI Routes by @ishaan-jaff in #4356
- fix - Can't access /v1/audio/speech with some user key by @ishaan-jaff in #4360
Full Changelog: v1.40.22...v1.40.24
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.40.24
```
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 130.0 | 154.96801309401323 | 6.361794835434345 | 0.0 | 1904 | 0 | 115.54615199997897 | 1161.0779019999882 |
Aggregated | Passed ✅ | 130.0 | 154.96801309401323 | 6.361794835434345 | 0.0 | 1904 | 0 | 115.54615199997897 | 1161.0779019999882 |
v1.40.22
What's Changed
- fix: use per-token costs for claude via vertex_ai by @spdustin in #4337
- [Feat] Admin UI - Show Cache hit stats by @ishaan-jaff in #4340
- fix - LiteLLM proxy /moderations endpoint returns 500 error when model is not specified by @ishaan-jaff in #4342
- [Fix + Test] - Spend tags not getting stored on 1.40.9 by @ishaan-jaff in #4345
- Print content window fallbacks on startup to help verify configuration by @lolsborn in #4350
Full Changelog: v1.40.21...v1.40.22
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.40.22
```
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 120.0 | 147.06905652027004 | 6.431081863451109 | 0.0 | 1924 | 0 | 100.04098199999589 | 1834.3141159999732 |
Aggregated | Passed ✅ | 120.0 | 147.06905652027004 | 6.431081863451109 | 0.0 | 1924 | 0 | 100.04098199999589 | 1834.3141159999732 |
v1.40.21
What's Changed
- feat: friendli ai support by @pocca2048 in #4121
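A minimal sketch of the new FriendliAI provider; the provider prefix, model name, and env var are assumptions here, check the LiteLLM provider docs for the exact values:

```python
import os
from litellm import completion

os.environ["FRIENDLI_TOKEN"] = "flp_..."  # placeholder credential (assumed name)

response = completion(
    model="friendliai/meta-llama-3-8b-instruct",  # assumed provider prefix
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```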
New Contributors
- @pocca2048 made their first contribution in #4121
Full Changelog: v1.40.20...v1.40.21
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.40.21
```
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 140.0 | 163.37791068962042 | 6.407997421786419 | 0.0 | 1917 | 0 | 114.2956310000045 | 1749.6762069999932 |
Aggregated | Passed ✅ | 140.0 | 163.37791068962042 | 6.407997421786419 | 0.0 | 1917 | 0 | 114.2956310000045 | 1749.6762069999932 |
v1.40.20
What's Changed
- docs - add algolia search 🫡 by @ishaan-jaff in #4320
- [Feat] allow using custom router strategy by @ishaan-jaff in #4318
- fix(utils.py): allow dropping specific openai params by @krrishdholakia in #4313 (example after this list)
- fix(user_api_key_auth.py): ensure user has access to fallback models by @krrishdholakia in #4321
- Update proxy_cli.py by @vanpelt in #4325
- fix(key_management_endpoints.py): use common _duration_in_seconds function by @krrishdholakia in #4323
- feat(router.py): allow user to call specific deployment via id by @krrishdholakia in #4290
- test(test_python_38.py): add coverage for non-gen settings config.yaml flow by @krrishdholakia in #4328
- [Fix] user field and user_api_key_* are sometimes omitted randomly by @ishaan-jaff in #4322
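For the param-dropping change (#4313), a sketch assuming an `additional_drop_params` kwarg: the listed OpenAI params are stripped from the request before it is sent, which helps with providers that reject them. The model is illustrative:

```python
from litellm import completion

response = completion(
    model="cohere/command-r",  # illustrative provider that may reject the param
    messages=[{"role": "user", "content": "hi"}],
    response_format={"type": "json_object"},
    additional_drop_params=["response_format"],  # drop only this param
)
```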
Full Changelog: v1.40.19...v1.40.20
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.40.20
```
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 79 | 105.64488981589672 | 6.515118790194818 | 0.0 | 1950 | 0 | 67.60507300003837 | 3342.9461570000285 |
Aggregated | Passed ✅ | 79 | 105.64488981589672 | 6.515118790194818 | 0.0 | 1950 | 0 | 67.60507300003837 | 3342.9461570000285 |
v1.40.19
🚨🚨🚨 Known bug on the LiteLLM Proxy Server in this release - we do not recommend upgrading until the issue is fixed
You can use Claude 3.5 Sonnet on older versions - no upgrade is required
What's Changed
- fix(proxy_server.py): fix llm_model_list to use router.get_model_list() by @krrishdholakia in #4274
- Support 'image url' to vertex ai / google ai studio gemini models by @krrishdholakia in #4266 (example after this list)
- Use AWS Key Management System for Encrypted Database URL + Redis Credentials by @krrishdholakia in #4111
- feat - add open router exception mapping by @ishaan-jaff in #4282
- feat - support CURL OPTIONS for `/health/readiness` endpoint by @ishaan-jaff in #4286
- fix(litellm_logging.py): Add missing import statement. by @Manouchehri in #4276
- docs - setting team budgets on ui by @ishaan-jaff in #4292
- [Bug-Fix]: Azure image generation doesn't support HTTPS_PROXY env var by @ishaan-jaff in #4293
- [feat] Add ft:gpt-4, ft:gpt-4o models by @ishaan-jaff in #4294
- build(model_prices_and_context_window.json): fix gemini pricing by @krrishdholakia in #4291
- support vertex_credentials filepath by @hawktang in #4199
- fix(litellm_logging.py): initialize global variables for logging by @krrishdholakia in #4296
- Vertex AI - character based cost calculation by @krrishdholakia in #4295
- Add claude 3.5 sonnet model by @lowjiansheng in #4310
- Add `vertex_ai/claude-3-5-sonnet@20240620` and `bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0` by @ishaan-jaff in #4311
- Updated deepseek coder for v2, added openrouter version by @paul-gauthier in #4308
- Fix model name for deepseek-coder in documentation by @williamjeong2 in #4304
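For the image-url change (#4266), a sketch of sending an image to a Gemini model through the OpenAI-style vision message format; the model name and URL are illustrative:

```python
from litellm import completion

response = completion(
    model="vertex_ai/gemini-1.5-flash",  # or a google ai studio "gemini/..." model
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
        ],
    }],
)
print(response.choices[0].message.content)
```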
New Contributors
- @hawktang made their first contribution in #4199
- @lowjiansheng made their first contribution in #4310
- @williamjeong2 made their first contribution in #4304
Full Changelog: v1.40.17...v1.40.19