Releases: BerriAI/litellm
v1.41.11.dev1
What's Changed
- fix(vertex_httpx.py): support tool calling w/ streaming for vertex ai + gemini by @krrishdholakia in #4579
- fix(router.py): fix setting httpx mounts by @krrishdholakia in #4434
Full Changelog: v1.41.8.dev2...v1.41.11.dev1
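The streaming tool-calling fix above targets Vertex AI Gemini models routed through the proxy. Below is a minimal sketch of exercising it, assuming the proxy started with the command below is listening on port 4000, that a Gemini deployment is configured under the name gemini-1.5-pro, and that sk-1234 is a valid virtual key (all three are placeholders):

# placeholder model name and key; adjust to your proxy config
curl http://localhost:4000/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "gemini-1.5-pro",
    "messages": [{"role": "user", "content": "What is the weather in Boston?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {"location": {"type": "string"}},
          "required": ["location"]
        }
      }
    }],
    "stream": true
  }'

Tool-call deltas come back on the streamed chunks in the OpenAI format.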
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.41.11.dev1
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
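Once the container is up, the proxy serves the OpenAI-compatible /chat/completions route on port 4000, the same route exercised in the load test below. A quick smoke test, assuming a model named gpt-3.5-turbo is configured and sk-1234 is a valid virtual key (both placeholders):

curl http://localhost:4000/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Hello from the proxy"}]
  }'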
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 77 | 88.81311274450754 | 6.537049007999015 | 0.0 | 1957 | 0 | 67.48113100002229 | 1314.0082060000395 |
Aggregated | Passed ✅ | 77 | 88.81311274450754 | 6.537049007999015 | 0.0 | 1957 | 0 | 67.48113100002229 | 1314.0082060000395 |
v1.41.11
What's Changed
- fix: typo in vision docs by @berkecanrizai in #4555
- [Feat] Improve Proxy Mem Util (Reduces proxy startup memory util by 50%) by @ishaan-jaff in #4577
- [fix] UI fix show models as dropdown by @ishaan-jaff in #4574
- UI - don't spam error messages when model list is not defined by @ishaan-jaff in #4575
- Azure proxy tts pricing by @krrishdholakia in #4572
- feat(cost_calculator.py): support openai+azure tts calls by @krrishdholakia in #4571
- [Refactor] Use helper function to encrypt/decrypt model credentials by @ishaan-jaff in #4576
- [Feat-Enterprise] /spend/report view spend for a specific key by @ishaan-jaff in #4578
- build(deps): bump aiohttp from 3.9.0 to 3.9.4 by @dependabot in #4553
- Enforcing sync'd `poetry.lock` via `pre-commit` by @jamesbraza in #4517
- [Feat] OTEL allow setting deployment environment by @ishaan-jaff in #4422
New Contributors
- @berkecanrizai made their first contribution in #4555
Full Changelog: v1.41.8...v1.41.11
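With #4571 and #4572 above, the cost calculator now also tracks spend for OpenAI and Azure text-to-speech calls made through the proxy. A hedged example of such a call against the OpenAI-compatible audio route; the model name and key are placeholders for whatever is configured on your proxy:

curl http://localhost:4000/v1/audio/speech \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{"model": "tts-1", "input": "LiteLLM now tracks TTS spend", "voice": "alloy"}' \
  --output speech.mp3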
v1.41.8.dev2
Full Changelog: v1.41.11...v1.41.8.dev2
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.41.8.dev2
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 130.0 | 145.86702940844324 | 6.331699407694199 | 0.0 | 1895 | 0 | 106.76047199996219 | 387.6153609999733 |
Aggregated | Passed ✅ | 130.0 | 145.86702940844324 | 6.331699407694199 | 0.0 | 1895 | 0 | 106.76047199996219 | 387.6153609999733 |
v1.41.8.dev1
What's Changed
- fix: typo in vision docs by @berkecanrizai in #4555
- [Feat] Improve Proxy Mem Util (Reduces proxy startup memory util by 50%) by @ishaan-jaff in #4577
- [fix] UI fix show models as dropdown by @ishaan-jaff in #4574
- UI - don't spam error messages when model list is not defined by @ishaan-jaff in #4575
- Azure proxy tts pricing by @krrishdholakia in #4572
- feat(cost_calculator.py): support openai+azure tts calls by @krrishdholakia in #4571
- [Refactor] Use helper function to encrypt/decrypt model credentials by @ishaan-jaff in #4576
- [Feat-Enterprise] /spend/report view spend for a specific key by @ishaan-jaff in #4578
- build(deps): bump aiohttp from 3.9.0 to 3.9.4 by @dependabot in #4553
- Enforcing sync'd `poetry.lock` via `pre-commit` by @jamesbraza in #4517
- [Feat] OTEL allow setting deployment environment by @ishaan-jaff in #4422
New Contributors
- @berkecanrizai made their first contribution in #4555
Full Changelog: v1.41.8...v1.41.8.dev1
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.41.8.dev1
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 77 | 87.51172147931786 | 6.46157682222913 | 0.0 | 1934 | 0 | 67.4040659999946 | 582.6242449999768 |
Aggregated | Passed ✅ | 77 | 87.51172147931786 | 6.46157682222913 | 0.0 | 1934 | 0 | 67.4040659999946 | 582.6242449999768 |
v1.41.8
🔥 Excited to launch support for Logging LLM I/O on 🔭 Galileo through LiteLLM (YC W23) Proxy https://docs.litellm.ai/docs/proxy/logging#logging-llm-io-to-galielo
📈 [docs] New example Grafana Dashboards https://github.com/BerriAI/litellm/tree/main/cookbook/litellm_proxy_server/grafana_dashboard
🛡️ feat - control guardrails per api key https://docs.litellm.ai/docs/proxy/guardrails#switch-guardrails-onoff-per-api-key
🛠️ fix - raise report Anthropic streaming errors (thanks David Manouchehri)
✨ [Fix] Add nvidia nim param mapping based on model passed
What's Changed
- fix(anthropic.py): add index to streaming tool use by @igor-drozdov in #4554
- (fix) fixed bug with the watsonx embedding endpoint by @simonsanvil in #4540
- Revert "(fix) fixed bug with the watsonx embedding endpoint" by @krrishdholakia in #4561
- [docs] add example Grafana Dashboard by @ishaan-jaff in #4563
- build(deps): bump certifi from 2023.7.22 to 2024.7.4 by @dependabot in #4568
- fix(proxy/utils.py): support logging rejected requests to langfuse, etc. by @krrishdholakia in #4564
- [Feat] Add Galileo Logging Callback by @ishaan-jaff in #4567
- [Fix] Add nvidia nim param mapping based on `model` by @ishaan-jaff in #4565
- fix - raise report Anthropic streaming errors by @ishaan-jaff in #4566
- feat - control guardrails per api key by @ishaan-jaff in #4569
New Contributors
- @igor-drozdov made their first contribution in #4554
Full Changelog: v1.41.7...v1.41.8
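The NVIDIA NIM change above (#4565) maps OpenAI parameters to what the specific NIM model passed actually supports. A hedged illustration through the proxy; the nvidia_nim/ prefix follows LiteLLM's provider naming, and the model name, parameters, and key here are illustrative only:

curl http://localhost:4000/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "nvidia_nim/meta/llama3-70b-instruct",
    "messages": [{"role": "user", "content": "Summarize this release"}],
    "temperature": 0.2,
    "max_tokens": 128
  }'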
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.41.8
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 120.0 | 148.48763956993193 | 6.382118352365276 | 0.0 | 1909 | 0 | 109.10986900000808 | 1689.413720999994 |
Aggregated | Passed ✅ | 120.0 | 148.48763956993193 | 6.382118352365276 | 0.0 | 1909 | 0 | 109.10986900000808 | 1689.413720999994 |
v1.41.7
What's Changed
- [Bug Fix] Use OpenAI Tool Response Spec When Converting To Gemini/VertexAI Tool Response by @andrewmjc in #4522
- feat - show key alias on prometheus metrics by @ishaan-jaff in #4545
- Deepseek coder now has 128k context by @paul-gauthier in #4541
- Cohere tool calling fix by @krrishdholakia in #4546
- fix: Include vertex_ai_beta in vertex_ai param mapping/Do not use google auth project_id by @t968914 in #4461
- [Fix] Invite Links / Onboarding flow on admin ui by @ishaan-jaff in #4548
- feat - allow looking up model_id on `/model/info` by @ishaan-jaff in #4547
- feat(internal_user_endpoints.py): expose `/user/delete` endpoint by @krrishdholakia in #4386
- Return output_vector_size in get_model_info by @tomusher in #4279
- [Feat] Add Groq/whisper-large-v3 by @ishaan-jaff in #4549
New Contributors
- @andrewmjc made their first contribution in #4522
- @t968914 made their first contribution in #4461
- @tomusher made their first contribution in #4279
Full Changelog: v1.41.6...v1.41.7
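The /model/info change above (#4547) lets you look up a single deployment by its model_id instead of listing everything. A hedged sketch, assuming sk-1234 is an admin key; the litellm_model_id query-parameter name is an assumption based on this changelog item and may differ in your version:

# list all deployments
curl -s http://localhost:4000/model/info \
  -H "Authorization: Bearer sk-1234"

# look up one deployment by id (parameter name is an assumption)
curl -s "http://localhost:4000/model/info?litellm_model_id=<your-model-id>" \
  -H "Authorization: Bearer sk-1234"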
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.41.7
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 130.0 | 152.06898919521237 | 6.419721686734246 | 0.0 | 1921 | 0 | 111.60093299997698 | 1678.7594189999027 |
Aggregated | Passed ✅ | 130.0 | 152.06898919521237 | 6.419721686734246 | 0.0 | 1921 | 0 | 111.60093299997698 | 1678.7594189999027 |
v1.41.6
What's Changed
- real Anthropic tool calling + streaming support by @krrishdholakia in #4536
- fix(utils.py): fix vertex anthropic streaming by @krrishdholakia in #4535
Full Changelog: v1.41.5...v1.41.6
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.41.6
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 99 | 120.03623455821207 | 6.427899106681398 | 0.0 | 1924 | 0 | 83.30082600002697 | 1524.837892999983 |
Aggregated | Passed ✅ | 99 | 120.03623455821207 | 6.427899106681398 | 0.0 | 1924 | 0 | 83.30082600002697 | 1524.837892999983 |
v1.41.5.dev1
What's Changed
- real Anthropic tool calling + streaming support by @krrishdholakia in #4536
- fix(utils.py): fix vertex anthropic streaming by @krrishdholakia in #4535
Full Changelog: v1.41.5...v1.41.5.dev1
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.41.5.dev1
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 130.0 | 150.69639806536688 | 6.33818146467335 | 0.0 | 1897 | 0 | 115.55820499995662 | 1375.4738929999917 |
Aggregated | Passed ✅ | 130.0 | 150.69639806536688 | 6.33818146467335 | 0.0 | 1897 | 0 | 115.55820499995662 | 1375.4738929999917 |
v1.41.5
What's Changed
- Fix: Output Structure of Ollama chat by @edwinjosegeorge in #4089
- Allow calling SageMaker endpoints from different regions by @petermuller in #4499
- Doc set guardrails on litellm config.yaml by @ishaan-jaff in #4530
- [Feat] Allow users to set a guardrail config on proxy server by @ishaan-jaff in #4529
- [Feat] v2 - Control guardrails per LLM Call by @ishaan-jaff in #4532
- fix(vertex_anthropic.py): Vertex Anthropic tool calling - native params by @krrishdholakia in #4531
- Revert "fix(vertex_anthropic.py): Vertex Anthropic tool calling - native params " by @krrishdholakia in #4534
- Fix Granite Prompt template by @nick-rackauckas in #4533
New Contributors
- @petermuller made their first contribution in #4499
Full Changelog: v1.41.4...v1.41.5
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.41.5
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 140.0 | 158.84378552886133 | 6.424808718348793 | 0.030069307574175322 | 1923 | 9 | 83.97341500000266 | 2746.2116009999704 |
Aggregated | Passed ✅ | 140.0 | 158.84378552886133 | 6.424808718348793 | 0.030069307574175322 | 1923 | 9 | 83.97341500000266 | 2746.2116009999704 |
v1.41.4.dev1
What's Changed
- Fix: Output Structure of Ollama chat by @edwinjosegeorge in #4089
Full Changelog: v1.41.4...v1.41.4.dev1
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.41.4.dev1
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 140.0 | 159.74933266389294 | 6.421967444124719 | 0.0 | 1922 | 0 | 117.16217900004722 | 1925.8788660000619 |
Aggregated | Passed ✅ | 140.0 | 159.74933266389294 | 6.421967444124719 | 0.0 | 1922 | 0 | 117.16217900004722 | 1925.8788660000619 |