
Releases: BerriAI/litellm

v1.41.4

03 Jul 17:20

What's Changed

Full Changelog: v1.41.3...v1.41.4

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.41.4
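
Once the container is up, the proxy exposes an OpenAI-compatible API on port 4000, so any standard OpenAI client can point at it. A minimal sketch using the openai Python SDK; the model name and API key below are placeholders for whatever you have configured on your proxy:

```python
from openai import OpenAI

# Point the standard OpenAI client at the local LiteLLM proxy.
client = OpenAI(
    base_url="http://localhost:4000",
    api_key="sk-1234",  # placeholder: use a key configured on your proxy
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # placeholder: any model configured on the proxy
    messages=[{"role": "user", "content": "Hello from the LiteLLM proxy!"}],
)
print(response.choices[0].message.content)
```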

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat


Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|------|--------|---------------------------|----------------------------|------------|------------|---------------|---------------|------------------------|------------------------|
| /chat/completions | Passed ✅ | 150.0 | 174.7746187513257 | 6.304554345115949 | 0.0 | 1886 | 0 | 120.43884399997751 | 1842.0690810000337 |
| Aggregated | Passed ✅ | 150.0 | 174.7746187513257 | 6.304554345115949 | 0.0 | 1886 | 0 | 120.43884399997751 | 1842.0690810000337 |

v1.41.3.dev3

03 Jul 01:26

What's Changed

  • fix(slack_alerting.py): use in-memory cache for checking request status by @krrishdholakia in #4520
  • feat(vertex_httpx.py): Support cachedContent. by @Manouchehri in #4492
  • [Fix+Test] /audio/transcriptions - use initialized OpenAI / Azure OpenAI clients by @ishaan-jaff in #4519 (see the sketch after this list)
  • [Fix-Proxy] Background health checks use deep copy of model list for _run_background_health_check by @ishaan-jaff in #4518
  • refactor(azure.py): move azure dall-e calls to httpx client by @krrishdholakia in #4523
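
For context on the /audio/transcriptions fix above: a minimal sketch of a transcription call through the LiteLLM SDK. The model name and file path are placeholders, and provider credentials are assumed to be set in the environment:

```python
import litellm

# Transcribe an audio file; with #4519 this reuses the initialized
# OpenAI / Azure OpenAI client rather than constructing a new one.
with open("speech.wav", "rb") as audio_file:  # placeholder file path
    transcript = litellm.transcription(
        model="whisper-1",  # placeholder: any configured transcription model
        file=audio_file,
    )
print(transcript.text)
```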

Full Changelog: v1.41.3.dev2...v1.41.3.dev3

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.41.3.dev3

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|------|--------|---------------------------|----------------------------|------------|------------|---------------|---------------|------------------------|------------------------|
| /chat/completions | Passed ✅ | 100.0 | 121.89055827443096 | 6.607625351081853 | 0.0 | 1975 | 0 | 85.08053399998516 | 1232.177610000008 |
| Aggregated | Passed ✅ | 100.0 | 121.89055827443096 | 6.607625351081853 | 0.0 | 1975 | 0 | 85.08053399998516 | 1232.177610000008 |

v1.41.3.dev2

02 Jul 07:55

What's Changed

Full Changelog: v1.41.3...v1.41.3.dev2

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.41.3.dev2

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|------|--------|---------------------------|----------------------------|------------|------------|---------------|---------------|------------------------|------------------------|
| /chat/completions | Passed ✅ | 130.0 | 146.68007458654787 | 6.4100923703571215 | 0.0 | 1918 | 0 | 104.69528600003741 | 1155.2264040000182 |
| Aggregated | Passed ✅ | 130.0 | 146.68007458654787 | 6.4100923703571215 | 0.0 | 1918 | 0 | 104.69528600003741 | 1155.2264040000182 |

v1.41.3.dev1

02 Jul 04:24

What's Changed

  • Update azure_ai.md by @sauravpanda in #4485
  • fix(bedrock_httpx.py): Add anthropic.claude-3-5-sonnet-20240620-v1:0 to converse list by @Manouchehri in #4484
  • Fix usage of parameter-based credentials when using vertex_ai_beta route by @ubirdio in #4486

New Contributors

Full Changelog: v1.41.2-stable...v1.41.3.dev1

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.41.3.dev1

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|------|--------|---------------------------|----------------------------|------------|------------|---------------|---------------|------------------------|------------------------|
| /chat/completions | Passed ✅ | 120.0 | 139.74836169660554 | 6.401173614938302 | 0.0 | 1915 | 0 | 100.05953000001 | 2095.8458609999866 |
| Aggregated | Passed ✅ | 120.0 | 139.74836169660554 | 6.401173614938302 | 0.0 | 1915 | 0 | 100.05953000001 | 2095.8458609999866 |

v1.41.3

02 Jul 06:23

What's Changed

  • Update azure_ai.md by @sauravpanda in #4485
  • fix(bedrock_httpx.py): Add anthropic.claude-3-5-sonnet-20240620-v1:0 to converse list by @Manouchehri in #4484
  • Fix usage of parameter-based credentials when using vertex_ai_beta route by @ubirdio in #4486
  • [Feat] Return Response headers for OpenAI / Azure OpenAI when litellm.return_response_headers=True by @ishaan-jaff in #4501 (see the sketch after this list)
  • [Feat-Enterprise] log "x-ratelimit-remaining-tokens" and "x-ratelimit-remaining-requests" on prometheus by @ishaan-jaff in #4503
  • fix exception provider not known by @ishaan-jaff in #4504
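
To illustrate litellm.return_response_headers from the list above: a minimal sketch. Setting the flag is taken from the PR title; exactly where the headers surface on the response object (`_hidden_params` here) is an assumption, so check the docs for your version:

```python
import litellm

# Opt in to receiving raw provider response headers (e.g. the
# x-ratelimit-* headers mentioned above).
litellm.return_response_headers = True

response = litellm.completion(
    model="gpt-3.5-turbo",  # placeholder model
    messages=[{"role": "user", "content": "ping"}],
)

# Assumption: headers are attached to the response's hidden params;
# the attribute name may differ by version.
print(response._hidden_params.get("additional_headers"))
```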

New Contributors

Full Changelog: v1.41.2...v1.41.3

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.41.3

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|------|--------|---------------------------|----------------------------|------------|------------|---------------|---------------|------------------------|------------------------|
| /chat/completions | Passed ✅ | 130.0 | 158.51689002697222 | 6.443513101897096 | 0.0 | 1928 | 0 | 109.03273299999228 | 1351.596542999971 |
| Aggregated | Passed ✅ | 130.0 | 158.51689002697222 | 6.443513101897096 | 0.0 | 1928 | 0 | 109.03273299999228 | 1351.596542999971 |

v1.41.2-stable

30 Jun 05:52

Full Changelog: v1.41.2...v1.41.2-stable

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.41.2-stable

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|------|--------|---------------------------|----------------------------|------------|------------|---------------|---------------|------------------------|------------------------|
| /chat/completions | Passed ✅ | 130.0 | 155.2631200462578 | 6.418460613857053 | 0.0 | 1924 | 0 | 115.53501399998822 | 1721.5953009999794 |
| Aggregated | Passed ✅ | 130.0 | 155.2631200462578 | 6.418460613857053 | 0.0 | 1924 | 0 | 115.53501399998822 | 1721.5953009999794 |

v1.41.2

30 Jun 04:35

What's Changed

New Contributors

Full Changelog: v1.41.1...v1.41.2

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.41.2

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|------|--------|---------------------------|----------------------------|------------|------------|---------------|---------------|------------------------|------------------------|
| /chat/completions | Passed ✅ | 81 | 94.17508715704547 | 6.427710896354567 | 0.0 | 1923 | 0 | 69.435722999998 | 1632.053753000008 |
| Aggregated | Passed ✅ | 81 | 94.17508715704547 | 6.427710896354567 | 0.0 | 1923 | 0 | 69.435722999998 | 1632.053753000008 |

v1.41.1

29 Jun 19:41

What's Changed

Full Changelog: v1.41.0...v1.41.1

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.41.1

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|------|--------|---------------------------|----------------------------|------------|------------|---------------|---------------|------------------------|------------------------|
| /chat/completions | Passed ✅ | 120.0 | 135.94747413256803 | 6.401388314451501 | 0.0 | 1916 | 0 | 98.61660699999675 | 1625.039872000002 |
| Aggregated | Passed ✅ | 120.0 | 135.94747413256803 | 6.401388314451501 | 0.0 | 1916 | 0 | 98.61660699999675 | 1625.039872000002 |

v1.41.0-stable

29 Jun 17:23

Full Changelog: v1.41.0...v1.41.0-stable

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.41.0-stable

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|------|--------|---------------------------|----------------------------|------------|------------|---------------|---------------|------------------------|------------------------|
| /chat/completions | Passed ✅ | 120.0 | 135.89504825622356 | 6.44293938318117 | 0.0 | 1928 | 0 | 99.83084399999598 | 1682.5208299999872 |
| Aggregated | Passed ✅ | 120.0 | 135.89504825622356 | 6.44293938318117 | 0.0 | 1928 | 0 | 99.83084399999598 | 1682.5208299999872 |

v1.41.0

29 Jun 06:00

What's Changed

Note: This release changes how pre-call checks run for Azure models. Filtering models based on context window limits will only apply to Azure models if base_model is set.

To enable pre-call-checks 👉 https://docs.litellm.ai/docs/routing#pre-call-checks-context-window-eu-regions
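
As a concrete illustration, a minimal Router sketch with pre-call checks enabled; the deployment name, environment variables, and base_model value are placeholders following the linked docs:

```python
import os
from litellm import Router

model_list = [
    {
        "model_name": "gpt-4",
        "litellm_params": {
            "model": "azure/my-gpt-4-deployment",  # placeholder deployment
            "api_key": os.getenv("AZURE_API_KEY"),
            "api_base": os.getenv("AZURE_API_BASE"),
        },
        # As of this release, context-window filtering only applies to
        # Azure models when base_model is set.
        "model_info": {"base_model": "azure/gpt-4-1106-preview"},
    },
]

router = Router(model_list=model_list, enable_pre_call_checks=True)
```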

Full Changelog: v1.40.31...v1.41.0

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.41.0

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|------|--------|---------------------------|----------------------------|------------|------------|---------------|---------------|------------------------|------------------------|
| /chat/completions | Passed ✅ | 130.0 | 156.9410206132132 | 6.2719899835647945 | 0.0 | 1877 | 0 | 112.84582399997589 | 1745.2864320000003 |
| Aggregated | Passed ✅ | 130.0 | 156.9410206132132 | 6.2719899835647945 | 0.0 | 1877 | 0 | 112.84582399997589 | 1745.2864320000003 |