[FEATURE] Prompt caching support for LiteLLM #937

@Didir19

Description

Problem Statement

While LiteLLM itself supports prompt caching for Bedrock models, Strands does not support prompt caching when LiteLLM is used as the model provider.

Proposed Solution

LiteLLM supports prompt caching for Bedrock by following the OpenAI prompt caching usage object format: https://docs.litellm.ai/docs/completion/prompt_caching

"usage": {
  "prompt_tokens": 2006,
  "completion_tokens": 300,
  "total_tokens": 2306,
  "prompt_tokens_details": {
    "cached_tokens": 1920
  },
  "completion_tokens_details": {
    "reasoning_tokens": 0
  },
  # ANTHROPIC_ONLY #
  "cache_creation_input_tokens": 0
}

Strands should support that format as well.
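
A minimal sketch of how the Strands LiteLLM provider could map that usage object onto the Bedrock-style usage fields Strands already reports. The helper function and the cacheReadInputTokens / cacheWriteInputTokens field names are illustrative assumptions, not the current SDK API:

from typing import Any, Dict

def map_litellm_usage(usage: Dict[str, Any]) -> Dict[str, int]:
    # Hypothetical helper: translate an OpenAI-style usage object returned by
    # LiteLLM into the Converse-style usage fields Strands reports for Bedrock.
    prompt_details = usage.get("prompt_tokens_details") or {}
    mapped = {
        "inputTokens": usage.get("prompt_tokens", 0),
        "outputTokens": usage.get("completion_tokens", 0),
        "totalTokens": usage.get("total_tokens", 0),
    }
    # Cache hits are reported under prompt_tokens_details.cached_tokens.
    if prompt_details.get("cached_tokens") is not None:
        mapped["cacheReadInputTokens"] = prompt_details["cached_tokens"]
    # Anthropic (and Bedrock via LiteLLM) responses may also report cache writes.
    if usage.get("cache_creation_input_tokens") is not None:
        mapped["cacheWriteInputTokens"] = usage["cache_creation_input_tokens"]
    return mapped

# With the usage object above this would yield:
# {"inputTokens": 2006, "outputTokens": 300, "totalTokens": 2306,
#  "cacheReadInputTokens": 1920, "cacheWriteInputTokens": 0}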

Use Case

The same way prompt caching is used with a Bedrock model directly.
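
A hedged sketch of what that could look like with the LiteLLM provider, mirroring the existing Bedrock cache-point flow. The cachePoint system-prompt block, the LiteLLMModel arguments, and the metrics field names are assumptions for illustration, not confirmed SDK behavior:

from strands import Agent
from strands.models.litellm import LiteLLMModel

# Assumed model id: any Bedrock Anthropic model routed through LiteLLM.
model = LiteLLMModel(model_id="bedrock/anthropic.claude-3-7-sonnet-20250219-v1:0")

# Same cache-point convention used with BedrockModel today (assumed to carry over).
system_prompt = [
    {"text": "Large, static system instructions worth caching..."},
    {"cachePoint": {"type": "default"}},
]

agent = Agent(model=model, system_prompt=system_prompt)
result = agent("Answer using the instructions above.")

# If caching is supported, the accumulated usage would be expected to include
# cacheReadInputTokens / cacheWriteInputTokens (names assumed here).
print(result.metrics.accumulated_usage)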

Alternative Solutions

No response

Additional Context

No response

Metadata

    Labels

    enhancement (New feature or request), refined (Issue is discussed with the team and the team has come to an effort estimate consensus)
