# CogCache

This notebook shows how to use LangChain with CogCache [chat models](/docs/concepts/#chat-models). For detailed documentation of all ChatCogCache features and configurations, head to the [API reference](https://python.langchain.com/v0.2/api_reference/community/chat_models/langchain_community.chat_models.cogcache.ChatCogCache.html).

CogCache has several chat models. You can find information about their latest models and their costs, context windows, and supported input types in the [CogCache docs](https://cogcache.readme.io/reference/models).


In [1]:
import getpass
import os

if not os.environ.get("COGCACHE_API_KEY"):
    os.environ["COGCACHE_API_KEY"] = getpass.getpass("Enter your CogCache API key: ")

## Installation

The LangChain CogCache integration is imported from `langchain_community`, which is distributed as the `langchain-community` package:

In [None]:
%pip install -qU langchain-community

## Instantiation

Now we can instantiate our model object and generate chat completions:

In [16]:
from langchain_community.chat_models import ChatCogCache

llm = ChatCogCache(
    api_key="YOUR_API_KEY",
    model="gpt-4-1106-preview",
    # temperature=0,
    # max_tokens=None,
    # n=1,
    # verbose=True,
    # timeout=None,
    # model_kwargs={"response_format": {"type": "json_object"}},
    # default_headers={"Cache-Control": "no-store"},
    # other params...
)
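The setup cell at the top stores the key in the `COGCACHE_API_KEY` environment variable, while the cell above passes a placeholder string. A minimal sketch of reading the key back from the environment so it never has to be hard-coded is shown here. The helper name `get_cogcache_api_key` is hypothetical and not part of the integration, and whether `ChatCogCache` picks up the variable on its own is not confirmed in this notebook.

```python
import os


def get_cogcache_api_key(env_var: str = "COGCACHE_API_KEY") -> str:
    """Return the API key stored in the environment, or fail loudly.

    Hypothetical helper for illustration only; it is not part of
    langchain_community.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key


# Intended use (left commented out because it needs the integration and a real key):
# llm = ChatCogCache(api_key=get_cogcache_api_key(), model="gpt-4-1106-preview")
```

Failing early on a missing key gives a clearer error than an authentication failure on the first request.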

## Invocation

In [9]:
messages = [
    (
        "system",
        "You are a helpful assistant that translates English to French. Translate the user sentence.",
    ),
    ("human", "I love programming."),
]
ai_msg = llm.invoke(messages)
ai_msg

AIMessage(content="J'aime la programmation.", response_metadata={'token_usage': {'completion_tokens': 7, 'prompt_tokens': 31, 'total_tokens': 38}, 'model_name': 'gpt-4-1106-preview', 'system_fingerprint': 'fp_5603ee5e2e', 'finish_reason': 'stop', 'logprobs': None}, id='run-c801654f-d8f7-4cb8-aa86-82019e0d844d-0', kwargs={'choices': [{'content_filter_results': {'hate': {'filtered': False, 'severity': 'safe'}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': False, 'severity': 'safe'}}, 'finish_reason': 'stop', 'index': 0, 'logprobs': None, 'message': {'content': "J'aime la programmation.", 'role': 'assistant'}}], 'created': 1727347241, 'id': 'chatcmpl-ABgDJIg9HLpduHwGJ22p6JtiWueGF', 'model': 'gpt-4', 'object': 'chat.completion', 'prompt_filter_results': [{'prompt_index': 0, 'content_filter_results': {'hate': {'filtered': False, 'severity': 'safe'}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexu
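The `response_metadata` shown in the output above includes a `token_usage` dictionary. As a small sketch, the counts can be pulled out as follows. The helper `token_usage` is hypothetical, and the sample dictionary is copied from the output above rather than taken from a live call; with a real response you would pass `ai_msg.response_metadata`.

```python
from typing import Any, Mapping, Tuple


def token_usage(response_metadata: Mapping[str, Any]) -> Tuple[int, int, int]:
    """Return (prompt, completion, total) token counts, defaulting to 0."""
    usage = response_metadata.get("token_usage") or {}
    return (
        usage.get("prompt_tokens", 0),
        usage.get("completion_tokens", 0),
        usage.get("total_tokens", 0),
    )


# Values copied from the AIMessage output shown above.
sample_metadata = {
    "token_usage": {"completion_tokens": 7, "prompt_tokens": 31, "total_tokens": 38},
    "model_name": "gpt-4-1106-preview",
    "finish_reason": "stop",
}

prompt_tokens, completion_tokens, total_tokens = token_usage(sample_metadata)
print(f"prompt={prompt_tokens} completion={completion_tokens} total={total_tokens}")
```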

## Streaming

In [18]:
messages = [
    (
        "system",
        "You are a helpful assistant that translates English to French. Translate the user sentence.",
    ),
    ("human", "I love programming."),
]
ai_stream = llm.stream(messages)

for ai_msg in ai_stream:
    print(ai_msg)

content='' response_metadata={'model_name': 'gpt-4', 'system_fingerprint': 'fp_5603ee5e2e'} id='run-d7165258-d752-405d-ab27-2269f65a3b7a' kwargs={'id': 'chatcmpl-ABgDJIg9HLpduHwGJ22p6JtiWueGF', 'object': 'chat.completion.chunk', 'created': 1727347241, 'model': 'gpt-4', 'system_fingerprint': 'fp_5603ee5e2e', 'prompt_filter_results': [{'prompt_index': 0, 'content_filter_results': {'hate': {'filtered': False, 'severity': 'safe'}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': False, 'severity': 'safe'}}}], 'choices': [{'index': 0, 'delta': {'role': 'assistant', 'content': ''}, 'finish_reason': ''}]}
content='J' response_metadata={'model_name': 'gpt-4', 'system_fingerprint': 'fp_5603ee5e2e'} id='run-d7165258-d752-405d-ab27-2269f65a3b7a' kwargs={'id': 'chatcmpl-ABgDJIg9HLpduHwGJ22p6JtiWueGF', 'object': 'chat.completion.chunk', 'created': 1727347241, 'model': 'gpt-4', 'system_fingerprint': 'fp_5603ee5e2e', 'pr
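As the output above shows, each streamed chunk carries only a fragment of the reply in its `content` attribute, so the full text has to be assembled from the pieces. A minimal sketch of that accumulation follows. It assumes each chunk's `content` is a plain string, and it uses stand-in objects instead of a live `llm.stream(messages)` call; the way the sentence is split across chunks here is invented for illustration.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def collect_stream_text(chunks: Iterable[Any]) -> str:
    """Concatenate the `content` of every chunk into one string."""
    parts = []
    for chunk in chunks:
        parts.append(chunk.content)
    return "".join(parts)


# Stand-in for llm.stream(messages); the chunk boundaries are hypothetical.
fake_stream = [
    SimpleNamespace(content=piece)
    for piece in ["", "J", "'aime", " la", " programmation", "."]
]

full_text = collect_stream_text(fake_stream)
print(full_text)
```

With a real model, the same function could be called as `collect_stream_text(llm.stream(messages))`.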

In [10]:
print(ai_msg.content)

J'aime la programmation.


## Chaining

We can [chain](/docs/how_to/sequence/) our model with a prompt template like so:

In [14]:
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant that translates {input_language} to {output_language}.",
        ),
        ("human", "{input}"),
    ]
)

chain = prompt | llm
chain.invoke(
    {
        "input_language": "English",
        "output_language": "Spanish",
        "input": "I am writing spanish. I love programming.",
    }
)

AIMessage(content='Estoy escribiendo en español. Me encanta programar.', response_metadata={'token_usage': {'completion_tokens': 14, 'prompt_tokens': 31, 'total_tokens': 45}, 'model_name': 'gpt-4-1106-preview', 'system_fingerprint': 'fp_5603ee5e2e', 'finish_reason': 'stop', 'logprobs': None}, id='run-b26f7811-c618-4397-b808-135069cc29b1-0', kwargs={'choices': [{'content_filter_results': {'hate': {'filtered': False, 'severity': 'safe'}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': False, 'severity': 'safe'}}, 'finish_reason': 'stop', 'index': 0, 'logprobs': None, 'message': {'content': 'Estoy escribiendo en español. Me encanta programar.', 'role': 'assistant'}}], 'created': 1727347389, 'id': 'chatcmpl-ABgFhzc3DcSsnhjSbvu867av969GV', 'model': 'gpt-4', 'object': 'chat.completion', 'prompt_filter_results': [{'prompt_index': 0, 'content_filter_results': {'hate': {'filtered': False, 'severity': 'safe'}, 'sel

## Response Headers

Setting `include_response_headers=True` makes the HTTP response headers available under `response_metadata["headers"]` after each invocation:

In [12]:
from langchain_community.chat_models import ChatCogCache

llm = ChatCogCache(
    api_key="YOUR_API_KEY",
    model="gpt-4-1106-preview",
    include_response_headers=True,  # set to True to include headers in the response
)

messages = [
    (
        "system",
        "You are a helpful assistant that translates English to French. Translate the user sentence.",
    ),
    ("human", "I love programming."),
]
response = llm.invoke(messages)
response.response_metadata["headers"]

{'date': 'Fri, 27 Sep 2024 11:04:42 GMT',
 'server': 'uvicorn',
 'x-cache': 'hit',
 'access-control-allow-origin': '*',
 'access-control-allow-credentials': 'true',
 'access-control-expose-headers': 'content-type,x-cache,cogcache-hit-type,cogcache-similarity-match-score,cogcache-hit-processing-ms,cogcache-latency-ms',
 'access-control-allow-methods': 'GET, OPTIONS, POST',
 'access-control-max-age': '600',
 'vary': 'Origin',
 'cogcache-hit-type': 'exact-match',
 'cogcache-prompt-type': 'other',
 'cogcache-cache-entry-id': 'b14b8f518b54e4ef2187972f217dacc0',
 'content-length': '894',
 'content-type': 'application/json',
 'cogcache-hit-processing-ms': '0.81',
 'cogcache-latency-ms': '1.05'}
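The cache-related fields in the headers above can be condensed into a small summary. The sketch below is illustrative only: the helper `summarize_cache_headers` is hypothetical, the sample values are copied from the output above, and treating any `x-cache` value other than `hit` as a miss is an assumption rather than documented behavior.

```python
from typing import Any, Dict, Mapping


def summarize_cache_headers(headers: Mapping[str, str]) -> Dict[str, Any]:
    """Pick out the cache-related fields from a response-headers mapping."""
    latency = headers.get("cogcache-latency-ms")
    return {
        # Assumption: anything other than "hit" counts as a miss.
        "cache_hit": headers.get("x-cache", "").lower() == "hit",
        "hit_type": headers.get("cogcache-hit-type"),
        "latency_ms": float(latency) if latency is not None else None,
    }


# Values copied from the headers shown above.
sample_headers = {
    "x-cache": "hit",
    "cogcache-hit-type": "exact-match",
    "cogcache-latency-ms": "1.05",
}

summary = summarize_cache_headers(sample_headers)
print(summary)
```

With a live response, the input would be `response.response_metadata["headers"]`.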

## Add parameters to an individual invocation

Instead of setting every parameter at class instantiation, we can pass parameters to an individual invocation:

In [13]:
from langchain_community.chat_models import ChatCogCache

llm = ChatCogCache(
    api_key="YOUR_API_KEY",
)

messages = [
    (
        "system",
        "You are a helpful assistant that translates English to French. Translate the user sentence.",
    ),
    ("human", "I love programming."),
]
llm.invoke(
    messages,
    model="gpt-4-1106-preview",
    # temperature=0,
    # max_tokens=None,
    # n=1
)

AIMessage(content="J'aime la programmation.", response_metadata={'token_usage': {'prompt_tokens': 31, 'completion_tokens': 7, 'total_tokens': 38}, 'model_name': 'gpt-35-turbo-0125', 'system_fingerprint': 'fp_5603ee5e2e', 'finish_reason': 'stop', 'logprobs': None}, id='run-a5f3d2b5-480f-4127-bc9f-891d28745761-0', kwargs={'id': 'chatcmpl-ABgDJIg9HLpduHwGJ22p6JtiWueGF', 'object': 'chat.completion', 'created': 1727347241, 'model': 'gpt-4', 'system_fingerprint': 'fp_5603ee5e2e', 'choices': [{'content_filter_results': {'hate': {'filtered': False, 'severity': 'safe'}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': False, 'severity': 'safe'}}, 'finish_reason': 'stop', 'index': 0, 'logprobs': None, 'message': {'content': "J'aime la programmation.", 'role': 'assistant'}}], 'usage': {'prompt_tokens': 31, 'completion_tokens': 7, 'total_tokens': 38}, 'prompt_filter_results': [{'prompt_index': 0, 'content_filter_resul

## API reference

For detailed documentation of all CogCache features and configurations, head to the [API reference](https://cogcache.readme.io/reference/overview).