<a href="https://colab.research.google.com/github/dhanushpachabhatla/My_LangChain_Playground/blob/main/langchain_4.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Conversation Memory for LangChain (Gemini)

In [1]:
!pip install langchain langsmith langchain_google_genai --quiet

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-generativeai 0.8.5 requires google-ai-generativelanguage==0.6.15, but you have google-ai-generativelanguage 0.6.18 which is incompatible.

In [2]:
import os
from getpass import getpass

os.environ["LANGCHAIN_API_KEY"] = getpass("Enter LangSmith API Key: ")

# below should not be changed
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_ENDPOINT"] = "https://api.smith.langchain.com"
# you can change this as preferred
os.environ["LANGCHAIN_PROJECT"] = "pr-husky-bran-20"

Enter LangSmith API Key: ··········


In [3]:
import os
from getpass import getpass
from langchain_google_genai import ChatGoogleGenerativeAI

os.environ['GOOGLE_API_KEY'] = getpass("Enter your GEMINI KEY")

llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0.2)

Enter your GEMINI KEY··········


# Conversational Memory in LangChain

Memory allows the LLM to remember previous interactions, essential for building chatbots or assistants with context.

How It Works

* `ConversationBufferMemory` stores the entire conversation history, as a single string by default or as a list of message objects when `return_messages=True`.

* The model sees the previous conversation in every prompt.
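To make the idea concrete, here is a minimal plain-Python sketch of a buffer memory. It is not LangChain's implementation: the class `BufferMemorySketch` and the helper `build_prompt` are hypothetical names used only for illustration, and the prompt layout is an assumption.

```python
class BufferMemorySketch:
    """Toy stand-in for a conversation buffer; not LangChain's implementation."""

    def __init__(self):
        self.turns = []  # list of (role, text) tuples, oldest first

    def save_context(self, user_text, ai_text):
        # Store one full exchange: the user's message and the AI's reply
        self.turns.append(("Human", user_text))
        self.turns.append(("AI", ai_text))

    def as_text(self):
        # Render the whole history as plain text, one line per message
        return "\n".join(f"{role}: {text}" for role, text in self.turns)


def build_prompt(memory, new_input):
    # Every prompt carries the complete history followed by the new message
    return f"Current conversation:\n{memory.as_text()}\nHuman: {new_input}\nAI:"


memory_sketch = BufferMemorySketch()
memory_sketch.save_context("Hi, my name is James", "Hey James, what's up?")
prompt_text = build_prompt(memory_sketch, "What is my name again?")
print(prompt_text)
```

Because the full history is replayed on every call, the prompt grows with each exchange. That growth is what the other memory variants try to limit.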

Variants of Memory:

* `ConversationBufferMemory`: The simplest form of conversational memory. It keeps the full conversation with no trimming or summarizing.

* `ConversationBufferWindowMemory`: Remembers only the last N interactions (helps control token usage).

* `ConversationTokenBufferMemory`: Remembers the most recent part of the conversation, up to a token limit.

* `ConversationSummaryMemory`: Summarizes old history to save space.

* `ConversationSummaryBufferMemory`: Combines `ConversationSummaryMemory` and `ConversationTokenBufferMemory`: recent messages are kept verbatim up to a token limit, and older ones are folded into a summary.
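The token-limit variant is not demonstrated later in this notebook, so here is a rough plain-Python sketch of the trimming idea. It assumes whitespace-separated words stand in for tokens (a real memory would use the model's tokenizer), and `count_tokens` and `trim_to_token_limit` are hypothetical helper names, not LangChain functions.

```python
def count_tokens(text):
    # Assumption: whitespace-separated words stand in for real tokens
    return len(text.split())


def trim_to_token_limit(messages, max_tokens):
    """Drop the oldest messages until the total fits within max_tokens."""
    kept = list(messages)
    while kept and sum(count_tokens(m) for m in kept) > max_tokens:
        kept.pop(0)  # discard the oldest message first
    return kept


history = [
    "Hi, my name is James",                          # 5 words
    "Hey James, what's up?",                         # 4 words
    "Buffer memory stores the entire conversation",  # 6 words
]
trimmed = trim_to_token_limit(history, max_tokens=10)
print(trimmed)
```

With a limit of 10, the 15-word history loses its oldest message and the remaining two (10 words) are kept.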

# 1. ConversationBufferMemory

There are several ways to add messages to our memory. Using the `save_context` method, we can add a user query (via the `input` key) and the AI's response (via the `output` key). So, to create the following conversation:

```
User: Hi, my name is James
AI: Hey James, what's up? I'm an AI model called Zeta.
User: I'm researching the different types of conversational memory.
AI: That's interesting, what are some examples?
User: I've been looking at ConversationBufferMemory and ConversationBufferWindowMemory.
AI: That's interesting, what's the difference?
User: Buffer memory just stores the entire conversation, right?
AI: That makes sense, what about ConversationBufferWindowMemory?
User: Buffer window memory stores the last k messages, dropping the rest.
AI: Very cool!
```

We do:

In [4]:
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory(return_messages=True)


In [5]:
memory.save_context(
    {"input": "Hi, my name is James"},  # user message
    {"output": "Hey James, what's up? I'm an AI model called Zeta."}  # AI response
)
memory.save_context(
    {"input": "I'm researching the different types of conversational memory."},  # user message
    {"output": "That's interesting, what are some examples?"}  # AI response
)
memory.save_context(
    {"input": "I've been looking at ConversationBufferMemory and ConversationBufferWindowMemory."},  # user message
    {"output": "That's interesting, what's the difference?"}  # AI response
)
memory.save_context(
    {"input": "Buffer memory just stores the entire conversation, right?"},  # user message
    {"output": "That makes sense, what about ConversationBufferWindowMemory?"}  # AI response
)
memory.save_context(
    {"input": "Buffer window memory stores the last k messages, dropping the rest."},  # user message
    {"output": "Very cool!"}  # AI response
)

Before using the memory, we need to load any variables for that memory type. In this case there are none, so we just pass an empty dictionary:

In [6]:
memory.load_memory_variables({})

{'history': [HumanMessage(content='Hi, my name is James', additional_kwargs={}, response_metadata={}),
  AIMessage(content="Hey James, what's up? I'm an AI model called Zeta.", additional_kwargs={}, response_metadata={}),
  HumanMessage(content="I'm researching the different types of conversational memory.", additional_kwargs={}, response_metadata={}),
  AIMessage(content="That's interesting, what are some examples?", additional_kwargs={}, response_metadata={}),
  HumanMessage(content="I've been looking at ConversationBufferMemory and ConversationBufferWindowMemory.", additional_kwargs={}, response_metadata={}),
  AIMessage(content="That's interesting, what's the difference?", additional_kwargs={}, response_metadata={}),
  HumanMessage(content='Buffer memory just stores the entire conversation, right?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='That makes sense, what about ConversationBufferWindowMemory?', additional_kwargs={}, response_metadata={}),
  HumanMess

Alternative approach: add the messages directly through `memory.chat_memory`:

In [7]:
memory = ConversationBufferMemory(return_messages=True)

memory.chat_memory.add_user_message("Hi, my name is James")
memory.chat_memory.add_ai_message("Hey James, what's up? I'm an AI model called Zeta.")
memory.chat_memory.add_user_message("I'm researching the different types of conversational memory.")
memory.chat_memory.add_ai_message("That's interesting, what are some examples?")
memory.chat_memory.add_user_message("I've been looking at ConversationBufferMemory and ConversationBufferWindowMemory.")
memory.chat_memory.add_ai_message("That's interesting, what's the difference?")
memory.chat_memory.add_user_message("Buffer memory just stores the entire conversation, right?")
memory.chat_memory.add_ai_message("That makes sense, what about ConversationBufferWindowMemory?")
memory.chat_memory.add_user_message("Buffer window memory stores the last k messages, dropping the rest.")
memory.chat_memory.add_ai_message("Very cool!")

memory.load_memory_variables({})

{'history': [HumanMessage(content='Hi, my name is James', additional_kwargs={}, response_metadata={}),
  AIMessage(content="Hey James, what's up? I'm an AI model called Zeta.", additional_kwargs={}, response_metadata={}),
  HumanMessage(content="I'm researching the different types of conversational memory.", additional_kwargs={}, response_metadata={}),
  AIMessage(content="That's interesting, what are some examples?", additional_kwargs={}, response_metadata={}),
  HumanMessage(content="I've been looking at ConversationBufferMemory and ConversationBufferWindowMemory.", additional_kwargs={}, response_metadata={}),
  AIMessage(content="That's interesting, what's the difference?", additional_kwargs={}, response_metadata={}),
  HumanMessage(content='Buffer memory just stores the entire conversation, right?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='That makes sense, what about ConversationBufferWindowMemory?', additional_kwargs={}, response_metadata={}),
  HumanMess

Both approaches produce the same history.

In [8]:
from langchain.chains import ConversationChain

chain = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=True
)

chain.invoke({"input": "what is my name again?"})

> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
[HumanMessage(content='Hi, my name is James', additional_kwargs={}, response_metadata={}), AIMessage(content="Hey James, what's up? I'm an AI model called Zeta.", additional_kwargs={}, response_metadata={}), HumanMessage(content="I'm researching the different types of conversational memory.", additional_kwargs={}, response_metadata={}), AIMessage(content="That's interesting, what are some examples?", additional_kwargs={}, response_metadata={}), HumanMessage(content="I've been looking at ConversationBufferMemory and ConversationBufferWindowMemory.", additional_kwargs={}, response_metadata={}), AIMessage(content="That's interesting, what's the differ

{'input': 'what is my name again?',
 'history': [HumanMessage(content='Hi, my name is James', additional_kwargs={}, response_metadata={}),
  AIMessage(content="Hey James, what's up? I'm an AI model called Zeta.", additional_kwargs={}, response_metadata={}),
  HumanMessage(content="I'm researching the different types of conversational memory.", additional_kwargs={}, response_metadata={}),
  AIMessage(content="That's interesting, what are some examples?", additional_kwargs={}, response_metadata={}),
  HumanMessage(content="I've been looking at ConversationBufferMemory and ConversationBufferWindowMemory.", additional_kwargs={}, response_metadata={}),
  AIMessage(content="That's interesting, what's the difference?", additional_kwargs={}, response_metadata={}),
  HumanMessage(content='Buffer memory just stores the entire conversation, right?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='That makes sense, what about ConversationBufferWindowMemory?', additional_kwargs={}

# ConversationBufferMemory with `RunnableWithMessageHistory`

Steps:
1. Insert a `MessagesPlaceholder` into the `ChatPromptTemplate` when creating the prompt.

2. Build a pipeline from the prompt and the LLM using LCEL.

3. Wrap that pipeline in a `RunnableWithMessageHistory` object, passing the `pipeline`, `get_session_history` (our `get_chat_history` function), `input_messages_key`, and `history_messages_key`.

4. Invoke the wrapped pipeline.

In [10]:
from langchain.prompts import (
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate,
    MessagesPlaceholder,
    ChatPromptTemplate
)

system_prompt = "You are a helpful assistant called Zeta."

prompt_template = ChatPromptTemplate.from_messages([
    SystemMessagePromptTemplate.from_template(system_prompt),
    MessagesPlaceholder(variable_name="history"),
    HumanMessagePromptTemplate.from_template("{query}"),
])

In [11]:
pipeline = prompt_template | llm

To add memory, we wrap our `pipeline` in a `RunnableWithMessageHistory` object. This object requires a few input parameters. One of them is `get_session_history`, a function that returns a chat message history object (here an `InMemoryChatMessageHistory`) for a given session ID. We define this function ourselves:

In [12]:
from langchain_core.chat_history import InMemoryChatMessageHistory

chat_map = {}
def get_chat_history(session_id: str) -> InMemoryChatMessageHistory:
    if session_id not in chat_map:
        # if session ID doesn't exist, create a new chat history
        chat_map[session_id] = InMemoryChatMessageHistory()
    return chat_map[session_id]

In [13]:
from langchain_core.runnables.history import RunnableWithMessageHistory

pipeline_with_history = RunnableWithMessageHistory(
    pipeline,
    get_session_history=get_chat_history,
    input_messages_key="query",
    history_messages_key="history"
)

In [14]:
pipeline_with_history.invoke(
    {"query": "Hi, my name is James"},
    # the session ID goes under the "configurable" key of the config
    config={"configurable": {"session_id": "id_123"}}
)

AIMessage(content="Hello James, it's nice to meet you! How can I help you today?", additional_kwargs={}, response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'model_name': 'gemini-1.5-flash', 'safety_ratings': []}, id='run--29493119-1f08-4f79-98d1-60ac0bbae09e-0', usage_metadata={'input_tokens': 14, 'output_tokens': 19, 'total_tokens': 33, 'input_token_details': {'cache_read': 0}})

Our chat history is now stored and will be retrieved whenever we invoke our runnable with the same session ID.

In [15]:
pipeline_with_history.invoke(
    {"query": "What is my name again?"},
    # same session ID as before, so the earlier exchange is loaded
    config={"configurable": {"session_id": "id_123"}}
)

AIMessage(content='Your name is James.', additional_kwargs={}, response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'model_name': 'gemini-1.5-flash', 'safety_ratings': []}, id='run--63533430-b597-4e33-a603-dbdc87d5d362-0', usage_metadata={'input_tokens': 38, 'output_tokens': 6, 'total_tokens': 44, 'input_token_details': {'cache_read': 0}})
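Keying histories by session ID also means that different sessions do not see each other's messages. The plain-Python sketch below illustrates this with a dictionary of lists. Everything in it is hypothetical: `session_store`, `get_history`, `invoke_with_history`, and `fake_model` (a stand-in that only recognizes "my name is ...") are illustration names, not part of LangChain, and the second session ID `id_456` is made up.

```python
session_store = {}  # hypothetical store: session ID -> list of (role, text)


def get_history(session_id):
    # Create an empty history the first time a session ID is seen
    return session_store.setdefault(session_id, [])


def fake_model(history, query):
    # Stand-in for an LLM: looks for an earlier "my name is ..." statement
    for role, text in history:
        if role == "human" and "my name is " in text:
            return f"Your name is {text.split('my name is ')[1]}."
    return "I don't know your name yet."


def invoke_with_history(query, session_id):
    history = get_history(session_id)
    answer = fake_model(history, query)
    # Append both sides of the exchange so the next call can see them
    history.append(("human", query))
    history.append(("ai", answer))
    return answer


invoke_with_history("Hi, my name is James", session_id="id_123")
same_session = invoke_with_history("What is my name again?", session_id="id_123")
other_session = invoke_with_history("What is my name again?", session_id="id_456")
print(same_session)
print(other_session)
```

The first session can recall the name, while the second session starts from an empty history and cannot.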

# With `RunnableLambda`

Steps:
1. Insert a `MessagesPlaceholder` into the `ChatPromptTemplate` when creating the prompt.

2. Build an LCEL pipeline that loads `history` from memory alongside the user `input`, passes both through the `prompt` and the `llm`, and then saves the exchange with `memory.save_context()`. The saving function must be wrapped in `RunnableLambda` so it can be used in the LCEL pipeline.

3. Invoke the pipeline.

In [16]:
memory = ConversationBufferMemory(return_messages=True)

In [17]:
prompt2 = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    MessagesPlaceholder(variable_name="history"),  # Memory placeholder
    ("human", "{input}")  # Latest user input
])


In [18]:
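from langchain_core.runnables import RunnableLambda, RunnablePassthrough

def load_history(_):
    # Read the messages stored so far
    return memory.load_memory_variables({})["history"]

def save_context(input_output):
    # Store the user's query and the model's reply, then return the reply text
    memory.save_context({"input": input_output["input"]},
                        {"output": input_output["output"]})
    return input_output["output"]

memory_runnable = RunnableLambda(save_context)

chain = (
    # Add the stored history next to the incoming {"input": ...}
    RunnablePassthrough.assign(history=RunnableLambda(load_history))
    # Run prompt and model, keeping the original input alongside the reply text
    | RunnablePassthrough.assign(output=prompt2 | llm | (lambda msg: msg.content))
    | memory_runnable
)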

In [19]:
print(chain.invoke({"input": "Hi, who won the FIFA World Cup in 2018?"}))
print(chain.invoke({"input": "What country hosted that tournament?"}))
print(chain.invoke({"input": "Can you summarize what we talked about?"}))


France won the FIFA World Cup in 2018.
Russia hosted the 2018 FIFA World Cup.
We discussed the 2018 FIFA World Cup, specifically that France won the tournament and Russia hosted it.


# 2. ConversationBufferWindowMemory

In [20]:
from langchain.memory import ConversationBufferWindowMemory

memory = ConversationBufferWindowMemory(k=4, return_messages=True)

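To illustrate what a window does, here is a small plain-Python sketch. It assumes that one interaction is a (human, AI) pair, so a window of `k` keeps the last `2 * k` messages. The helper `last_k_exchanges` is a hypothetical name, and the example uses `k=2` on a shortened version of the earlier conversation so that something is actually dropped.

```python
def last_k_exchanges(messages, k):
    """Return the messages from the most recent k exchanges.

    One exchange is assumed to be a (human, AI) pair, so 2 * k messages are kept.
    """
    return messages[-2 * k:] if k > 0 else []


conversation = [
    ("human", "Hi, my name is James"),
    ("ai", "Hey James, what's up? I'm an AI model called Zeta."),
    ("human", "I'm researching the different types of conversational memory."),
    ("ai", "That's interesting, what are some examples?"),
    ("human", "Buffer window memory stores the last k messages, dropping the rest."),
    ("ai", "Very cool!"),
]

window = last_k_exchanges(conversation, k=2)
print(window)
```

With `k=2` the first exchange falls outside the window, so a model that only sees `window` no longer has access to the user's name.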

# ConversationSummaryMemory

In [22]:
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.memory import ConversationSummaryMemory
from langchain_core.runnables import RunnableLambda


In [23]:
llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0.3)

# Summarizing memory uses an LLM to condense history
memory = ConversationSummaryMemory(llm=llm, return_messages=True)



In [24]:
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    MessagesPlaceholder(variable_name="history"),
    ("human", "{input}")
])


In [25]:
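from langchain_core.runnables import RunnablePassthrough

def load_history(_):
    # Read the current summary from memory
    return memory.load_memory_variables({})["history"]

def save_context(input_output):
    # Store the user's query and the model's reply, then return the reply text
    memory.save_context(
        {"input": input_output["input"]},
        {"output": input_output["output"]}
    )
    return input_output["output"]

memory_runnable = RunnableLambda(save_context)

chain = (
    # Add the stored history next to the incoming {"input": ...}
    RunnablePassthrough.assign(history=RunnableLambda(load_history))
    # Run prompt and model, keeping the original input alongside the reply text
    | RunnablePassthrough.assign(output=prompt | llm | (lambda msg: msg.content))
    | memory_runnable
)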


In [26]:
print(chain.invoke({"input": "Explain quantum computing in simple terms."}))
print(chain.invoke({"input": "Who pioneered this field?"}))
print(chain.invoke({"input": "Can you summarize what we've discussed so far?"}))


Imagine a light switch. It can be either ON or OFF, right?  That's how regular computers work – using bits that are either 0 or 1.

Quantum computers are different. They use **qubits**.  A qubit is like a special light switch that can be ON, OFF, or *both at the same time*! This "both at the same time" is called **superposition**.  It's like the light switch is flickering so fast you can't tell if it's on or off.

Another cool thing qubits can do is **entanglement**. Imagine two of these special light switches linked magically. If one is ON, the other is instantly OFF, no matter how far apart they are.  This allows quantum computers to explore many possibilities simultaneously.

Because of superposition and entanglement, quantum computers can solve certain types of problems much faster than regular computers.  Think of it like searching a maze: a regular computer tries each path one by one, while a quantum computer can explore all paths at once.

However, quantum computers are still ve

# ConversationSummaryBufferMemory

In [None]:
from langchain.memory import ConversationSummaryBufferMemory

memory = ConversationSummaryBufferMemory(
    llm=llm,
    max_token_limit=300,
    return_messages=True
)

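The rest of the process is the same as for `ConversationSummaryMemory`: load the history into the prompt, call the model, and save each exchange with `memory.save_context()`.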
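The combined behaviour can be sketched in plain Python: keep recent messages verbatim while they fit a token budget, and fold anything older into a running summary. This is only an illustration under stated assumptions. Words stand in for tokens, and `stub_summarize` is a hypothetical placeholder for the LLM call a real summary memory would make; `count_tokens` and `prune` are likewise made-up helper names.

```python
def count_tokens(text):
    # Assumption: one whitespace-separated word counts as one token
    return len(text.split())


def stub_summarize(previous_summary, dropped_messages):
    # Hypothetical summarizer: a real one would call an LLM here
    note = f"{len(dropped_messages)} earlier message(s) condensed"
    return f"{previous_summary}; {note}" if previous_summary else note


def prune(summary, recent, max_token_limit):
    """Move the oldest messages into the summary until `recent` fits the limit."""
    recent = list(recent)
    dropped = []
    while recent and sum(count_tokens(m) for m in recent) > max_token_limit:
        dropped.append(recent.pop(0))  # oldest message leaves the buffer
    if dropped:
        summary = stub_summarize(summary, dropped)
    return summary, recent


messages = [
    "Explain quantum computing in simple terms.",  # 6 words
    "Qubits can be in superposition.",             # 5 words
    "Who pioneered this field?",                   # 4 words
]
summary, recent = prune("", messages, max_token_limit=9)
print(summary)
print(recent)
```

Here the 15-word history exceeds the budget of 9, so the oldest message is moved into the summary and the two most recent messages stay verbatim.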