## LangChain Chat Memory

- ConversationBufferMemory 
    - It's like a list with all the messages
- ConversationBufferWindowMemory
    - Quite similar to the above one, but returns just only a chunk of the list
- ConversationSummaryMemory
    - Instead of storing the complete messages, it stores only a summary of the conversation
- ConversationSummaryBufferMemory
    - Combination between ConversationSummaryMemory and ConversationBufferWindowMemory
    You get a summary of the conversation + Xk of recent tokens stored

In [1]:
from langchain_openai import ChatOpenAI

In [2]:
llm = ChatOpenAI(temperature=0.0, model="gpt-4o-mini")

### ConversationSummaryBufferMemory - DEPRECATED WAY

In [5]:
from langchain.memory import ConversationSummaryBufferMemory
from langchain.chains import ConversationChain

In [4]:
memory = ConversationSummaryBufferMemory(
    llm=llm,
    max_token_limit=300,
    return_messages=True
)

  memory = ConversationSummaryBufferMemory(


In [6]:
chain = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=True
)

  chain = ConversationChain(


In [7]:
chain.invoke({"input": "Hi, my name is Luis!"})



[1m> Entering new ConversationChain chain...[0m
Prompt after formatting:
[32;1m[1;3mThe following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
[]
Human: Hi, my name is Luis!
AI:[0m

[1m> Finished chain.[0m


{'input': 'Hi, my name is Luis!',
 'history': [HumanMessage(content='Hi, my name is Luis!', additional_kwargs={}, response_metadata={}),
  AIMessage(content="Hello, Luis! It's great to meet you! How's your day going so far?", additional_kwargs={}, response_metadata={})],
 'response': "Hello, Luis! It's great to meet you! How's your day going so far?"}

In [8]:
for message in [
    "I'm researching the different types of conversational memory.",
    "I have been looking at ConversationBufferMemory and ConversationBufferWindowMemory.",
    "Buffer memory just stores the entire conversation",
    "Buffer window memory stores the last k messages, dropping the rest."
]:
    chain.invoke({"input": message})



[1m> Entering new ConversationChain chain...[0m
Prompt after formatting:
[32;1m[1;3mThe following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
[HumanMessage(content='Hi, my name is Luis!', additional_kwargs={}, response_metadata={}), AIMessage(content="Hello, Luis! It's great to meet you! How's your day going so far?", additional_kwargs={}, response_metadata={})]
Human: I'm researching the different types of conversational memory.
AI:[0m

[1m> Finished chain.[0m


[1m> Entering new ConversationChain chain...[0m
Prompt after formatting:
[32;1m[1;3mThe following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current

### ConversationSummaryBufferMemory - Newest way

In [9]:
from langchain.prompts import (
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate,
    MessagesPlaceholder,
    ChatPromptTemplate
)

In [10]:
system_prompt = "You are a helpful assistant called Zeta."

In [11]:
prompt_template = ChatPromptTemplate.from_messages([
    SystemMessagePromptTemplate.from_template(system_prompt),
    MessagesPlaceholder(variable_name="history"),
    HumanMessagePromptTemplate.from_template("{query}")
])

In [12]:
pipeline = prompt_template | llm

In [13]:
from pydantic import BaseModel, Field
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.messages import BaseMessage, SystemMessage

In [32]:
class ConversationSummaryBufferMessageHistory(BaseChatMessageHistory, BaseModel):
    """
    Add messages to the history, removing any message
    beyond the last 'k' messages and summarizing the messages that we drop
    """
    messages: list[BaseMessage] = Field(default_factory=list)
    llm: ChatOpenAI = Field(default_factory=ChatOpenAI)
    k: int = Field(default_factory=int)

    def __init__(self, llm: ChatOpenAI, k: int):
        super().__init__(llm=llm, k=k)

    def add_messages(self, messages: list[BaseMessage]) -> None:
        """
        """
        existing_summary: SystemMessage | None = None 
        old_messages = None

        # see if we already have a summary message
        if len(self.messages) > 0 and isinstance(self.messages[0], SystemMessage):
            print(">> Found existing summary")
            existing_summary = self.messages.pop(0)

        # add the new messages to the history
        self.messages.extend(messages)

        # check if we have too many messages
        if len(self.messages) > self.k:
            print(
                f">> found {len(self.messages)} messages, dropping"
                f"oldest {len(self.messages) - self.k} messages"
            )
            # pull out the oldest messages...
            old_messages = self.messages[:self.k]
            # ...and keep the most recent messages
            self.messages = self.messages[-self.k:]
        
        if old_messages is None:
            print(">> No old messages to update summary with")
            # if we have no old_messages, we have nothing to update in summary
            return
        
        # construct the summary chat messages
        summary_prompt = ChatPromptTemplate.from_messages([
            SystemMessagePromptTemplate.from_template(
                "Given the existing conversation summary and the new messages, "
                "generate a new summary of the conversation. Ensuring to maintain "
                "as much relevant information as possible. However, we need to "
                "keep our summary concise, the limit is a single short paragraph."
            ),
            HumanMessagePromptTemplate.from_template(
                "Existing conversation summary:\n{existing_summary}\n\n"
                "New messages:\n{old_messages}"
            )
        ])

        # format the messages and invoke the LLM
        new_summary = self.llm.invoke(
            summary_prompt.format_messages(
                existing_summary=existing_summary,
                old_messages=old_messages
            )
        )
        print(f">> New summary: {new_summary.content}")
        # prepend the new summary to the history
        self.messages = [SystemMessage(content=new_summary.content)] + self.messages

    def clear(self) -> None:
        """
        Clear the history
        """
        self.messages = []


In [33]:
chat_map = {}
def get_chat_history(session_id: str, llm: ChatOpenAI, k: int) -> ConversationSummaryBufferMessageHistory:
    if session_id not in chat_map:
        chat_map[session_id] = ConversationSummaryBufferMessageHistory(llm=llm, k=k)
    return chat_map[session_id]

In [34]:
from langchain_core.runnables import ConfigurableFieldSpec
from langchain_core.runnables.history import RunnableWithMessageHistory

In [35]:
pipeline_with_history = RunnableWithMessageHistory(
    pipeline,
    get_session_history=get_chat_history,
    input_messages_key="query",
    history_messages_key="history",
    history_factory_config=[
        ConfigurableFieldSpec(
            id="session_id",
            annotation=str,
            name="Session ID",
            description="The session ID to use for the chat history",
            default="id_default"
        ),
        ConfigurableFieldSpec(
            id="llm",
            annotation=ChatOpenAI,
            name="LLM",
            description="The LLM to use for the conversation summary",
            default=llm
        ),
        ConfigurableFieldSpec(
            id="k",
            annotation=int,
            name="k",
            description="The number of messages to keep in the history",
            default=llm
        )
    ]
)

In [36]:
pipeline_with_history.invoke(
    {"query": "Hi my name is Luis"},
    config={"session_id": "id_123", "llm": llm, "k": 4}
)

>> No old messages to update summary with


AIMessage(content='Hello, Luis! How can I assist you today?', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 11, 'prompt_tokens': 25, 'total_tokens': 36, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_34a54ae93c', 'id': 'chatcmpl-C2HbYXlXnZc9WsvJQ521Uf2h0BOar', 'service_tier': 'default', 'finish_reason': 'stop', 'logprobs': None}, id='run--4aabc662-aec1-43b1-8b3e-f448bf983616-0', usage_metadata={'input_tokens': 25, 'output_tokens': 11, 'total_tokens': 36, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

In [37]:
for i, msg in enumerate([
    "I'm researching the different types of conversational memory.",
    "I have been looking at ConversationBufferMemory and ConversationBufferWindowMemory.",
    "Buffer memory just stores the entire conversation",
    "Buffer window memory stores the last k messages, dropping the rest."
]):
    print(f"---\nMessage {i+1}\n---\n")
    pipeline_with_history.invoke(
        {"query": msg},
        config={"session_id": "id_123", "llm": llm, "k": 4}
    )

---
Message 1
---

>> No old messages to update summary with
---
Message 2
---

>> found 6 messages, droppingoldest 2 messages
>> New summary: Luis introduced himself and expressed interest in researching different types of conversational memory. The AI responded by outlining various types of conversational memory, including short-term, long-term, contextual, episodic, semantic, declarative vs. procedural, social, and personal memory, explaining their relevance in both human communication and artificial intelligence. Luis was invited to ask further questions or explore specific aspects of conversational memory.
---
Message 3
---

>> Found existing summary
>> found 6 messages, droppingoldest 2 messages
>> New summary: Luis is researching different types of conversational memory and has expressed interest in specific concepts like ConversationBufferMemory and ConversationBufferWindowMemory. The AI provided an overview of various types of conversational memory, including short-term, long-