
[Bug]: Caching completely broken with cache_control parameter when using PromptCachingDeploymentCheck #11574

Description

@hnykda

What happened?

  1. We use LiteLLM's Proxy PromptCachingDeploymentCheck to route requests from the same conversation to the same provider, with the goal of increasing cache hits (a rough router sketch follows this list).
  2. That routing step checks the token count of the messages, since anything under 1024 tokens won't be cached.
  3. However, LiteLLM cannot count the tokens of a message marked as cacheable (i.e. one carrying a `cache_control` entry): the counter raises a ValueError, which the check interprets as the prompt being uncacheable.
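For reference, this is roughly how the check is wired up on our side. A minimal sketch, not our production config: the `optional_pre_call_checks=["prompt_caching"]` parameter is our reading of the LiteLLM docs, and the model names are placeholders.

```python
# Minimal sketch: a Router with prompt-caching-aware routing enabled.
# `optional_pre_call_checks` is assumed from the LiteLLM docs; model
# names and keys are placeholders.
import os

from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "claude-sonnet",
            "litellm_params": {
                "model": "anthropic/claude-3-5-sonnet-20241022",
                "api_key": os.environ.get("ANTHROPIC_API_KEY"),
            },
        },
    ],
    optional_pre_call_checks=["prompt_caching"],  # registers PromptCachingDeploymentCheck
)

# A message marked for caching at the message level, as in the error below.
response = router.completion(
    model="claude-sonnet",
    messages=[
        {
            "role": "user",
            "content": "You are presented with the following task: ...",
            "cache_control": {"type": "ephemeral"},
        }
    ],
)
```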

An error we get:

  "message": "Error in is_prompt_caching_valid_prompt: Unsupported type <class 'dict'> for key cache_control in message {'content': 'You are presented with the following task:

...our truncated message...

tools.\\n', 'role': 'user', 'cache_control': {'type': 'ephemeral'}}",
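The failure can be reproduced outside the proxy by token-counting a message that carries a top-level `cache_control` dict. A hypothetical minimal reproduction (the model name is arbitrary, and `litellm.token_counter` may not be the exact code path the deployment check takes):

```python
# Hypothetical reproduction: token counting raises on a message whose
# top-level `cache_control` value is a dict. The deployment check treats
# that failure as "prompt not cacheable".
import litellm

message = {
    "role": "user",
    "content": "You are presented with the following task: ...",
    "cache_control": {"type": "ephemeral"},
}

try:
    n_tokens = litellm.token_counter(
        model="claude-3-5-sonnet-20241022",
        messages=[message],
    )
    print(f"prompt is {n_tokens} tokens")
except ValueError as exc:
    # In our setup this exception surfaces as the log line above and the
    # prompt is then routed as if it were uncacheable.
    print(f"token counting failed: {exc}")
```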

Relevant log output

Are you an ML Ops Team?

No

What LiteLLM version are you on?

v1.72.2.rc

Twitter / LinkedIn details

No response
