Manual Caching (Anthropic) #158

@hewliyang

# N.B. Currently, Anthropic doesn't cache by default and we currently do not support
# manual caching in chatlas. Note also that this only tracks reads, NOT writes, which
# have their own cost. To track that properly, we would need another caching category and per-token cost.

Many-turn agentic use cases are simply not feasible cost-wise without caching.
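To make the cost argument concrete, here is a rough back-of-the-envelope sketch of input-token cost over a conversation where every request resends the full history. The function name and the multipliers (`write_mult=1.25`, `read_mult=0.10`) are illustrative assumptions, not values taken from chatlas; check the provider's current pricing before relying on them. It also ignores cache expiry and minimum cacheable prompt lengths.

```python
from typing import List


def conversation_input_cost(
    turn_tokens: List[int],
    base_price: float,
    cached: bool,
    write_mult: float = 1.25,  # assumed cache-write multiplier (illustrative)
    read_mult: float = 0.10,   # assumed cache-read multiplier (illustrative)
) -> float:
    """Total input cost when each request resends the whole history.

    turn_tokens[i] is the number of tokens newly added to the history by
    request i (including earlier model output that is sent back as input).
    """
    total = 0.0
    prefix = 0  # tokens already sent in earlier requests
    for new in turn_tokens:
        if cached:
            # The earlier prefix is read from the cache at a discount...
            total += prefix * base_price * read_mult
            # ...and only the new tokens are written to the cache.
            total += new * base_price * write_mult
        else:
            # Without caching, the whole history is billed at the base rate.
            total += (prefix + new) * base_price
        prefix += new
    return total
```

Under these assumptions the uncached cost grows quadratically with the number of turns, while the cached cost is dominated by the much cheaper read term. This is also why writes need their own per-token cost when tracking usage.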

Are there any plans to bring in support for manual caching?

Since Turns support the system role, I think a relatively straightforward and non-invasive way of doing this would be to expose a callback that runs before list[Turn] is transformed into list[ProviderXYZ's Messages].

i.e.

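from typing import List

from chatlas import ChatAnthropic, Turn


def cache_last_message(turns: List[Turn]) -> List[Turn]:
    if not turns:
        return turns

    # Mark the final content block of the last turn as a cache breakpoint.
    contents = turns[-1].contents
    if contents:
        contents[-1].cache_control = {"type": "ephemeral"}

    return turns


chat = ChatAnthropic(...)
# Proposed API: set_turn_callback does not exist in chatlas today.
chat.set_turn_callback(cache_last_message)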
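For illustration, here is a minimal, self-contained sketch of where such a hook could sit in the turn-to-message conversion. `Content`, `Turn`, and `to_anthropic_messages` are hypothetical stand-ins written for this example, not chatlas internals; only the output shape (a `cache_control` key on a content block) follows Anthropic's Messages API.

```python
import copy
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional


@dataclass
class Content:  # hypothetical stand-in for a chatlas content object
    text: str
    cache_control: Optional[Dict[str, str]] = None


@dataclass
class Turn:  # hypothetical stand-in for chatlas's Turn
    role: str
    contents: List[Content] = field(default_factory=list)


TurnCallback = Callable[[List[Turn]], List[Turn]]


def to_anthropic_messages(
    turns: List[Turn], callback: Optional[TurnCallback] = None
) -> List[Dict[str, Any]]:
    """Convert turns to message dicts, running the callback first."""
    if callback is not None:
        # Work on a copy so cache markers never leak into the stored history.
        turns = callback(copy.deepcopy(turns))
    messages: List[Dict[str, Any]] = []
    for turn in turns:
        blocks: List[Dict[str, Any]] = []
        for content in turn.contents:
            block: Dict[str, Any] = {"type": "text", "text": content.text}
            if content.cache_control is not None:
                block["cache_control"] = content.cache_control
            blocks.append(block)
        messages.append({"role": turn.role, "content": blocks})
    return messages


def cache_last_message(turns: List[Turn]) -> List[Turn]:
    """Mark the last content block of the last turn as a cache breakpoint."""
    if turns and turns[-1].contents:
        turns[-1].contents[-1].cache_control = {"type": "ephemeral"}
    return turns
```

Copying the turns before invoking the callback is one possible design choice: the breakpoint is recomputed on every request, so it always tracks the newest message rather than accumulating stale markers on older turns.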
