# LangChain Memory
In LangChain, memory refers to components that store information from earlier parts of a conversation and feed it back into the prompt, so that an otherwise stateless language model can reference it. This is particularly useful for building applications that require context persistence, such as chatbots, virtual assistants, or multi-step workflows.

LangChain memory modules allow developers to:

- Store and retrieve conversational history: Memory keeps track of user inputs and model responses across multiple turns.

- Maintain context across interactions: This ensures that the model can refer back to previous statements, improving the coherence and relevance of responses.

- Support different memory strategies: LangChain offers several types of memory, including:

 - `ConversationBufferMemory`: Stores the full history of the conversation as a simple buffer.

 - `ConversationSummaryMemory`: Summarizes past interactions to maintain context while using fewer tokens.

 - `ConversationBufferWindowMemory`: Retains only the last k messages, making it suitable for short-term context windows.

 - `CombinedMemory`: Combines multiple memory modules for advanced use cases.

By incorporating memory into a LangChain application, developers can create more intelligent, context-aware systems that provide a natural and consistent user experience.

# Setting Up Work Environment

In [None]:
!pip install --upgrade google-generativeai

In [2]:
!pip install -q -U google-genai

In [None]:
!pip install langchain-community

In [None]:
!pip install -U langchain-google-genai

In [5]:
import os
from google import genai
import google.generativeai as ggenai
from google.colab import userdata
from IPython.display import display
from IPython.display import Markdown

from PIL import Image
from google.genai import types

from IPython.display import HTML

In [40]:
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

from langchain.memory import ConversationBufferWindowMemory

from langchain.memory import ConversationTokenBufferMemory

from langchain.memory import ConversationSummaryBufferMemory

In [7]:
# Read the Gemini API key from Colab secrets (stored here under the name 'genai_api')
key = userdata.get('genai_api')
client = genai.Client(api_key=key)

List the set of available models

In [8]:
ggenai.configure(api_key=key)

models = ggenai.list_models()
# Uncomment the loop below to print the name of every available model.
# for model in models:
#     print(model.name)

In [9]:
# Initialize Gemini LLM
llm = ChatGoogleGenerativeAI(
    model="gemini-1.5-flash",       # or "gemini-1.5-pro", etc.
    temperature=0.0,
    google_api_key=key
)

# Conversation Buffer Memory

`ConversationBufferMemory` is one of the simplest and most commonly used memory modules in LangChain. It maintains a complete, sequential history of the conversation between the user and the language model. This buffer captures all prior inputs and responses, enabling the model to refer back to earlier parts of the dialogue when generating new responses.

In [10]:
# Conversation memory
memory = ConversationBufferMemory()
# Conversation chain
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=True
)



In [11]:
conversation.predict(input="Hi, my name is Gaber")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

Human: Hi, my name is Gaber
AI:

> Finished chain.


'Hi Gaber! It\'s nice to meet you.  My name isn\'t really a "name" in the human sense, as I don\'t have a personal identity like you do.  I\'m a large language model, trained by Google.  I don\'t have feelings or experiences, but I can access and process information from a massive dataset of text and code.  So, tell me, what can I do for you today?  Are you interested in learning something new, writing a story, translating something, or perhaps just having a chat?  I\'m ready for anything!'

In [12]:
conversation.predict(input="What is 1+1?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hi, my name is Gaber
AI: Hi Gaber! It's nice to meet you.  My name isn't really a "name" in the human sense, as I don't have a personal identity like you do.  I'm a large language model, trained by Google.  I don't have feelings or experiences, but I can access and process information from a massive dataset of text and code.  So, tell me, what can I do for you today?  Are you interested in learning something new, writing a story, translating something, or perhaps just having a chat?  I'm ready for anything!
Human: What is 1+1?
AI:

> Finished chain.


"1 + 1 = 2.  That's a pretty straightforward calculation!  It's a fundamental concept in arithmetic, forming the basis for more complex mathematical operations.  Interestingly, the concept of addition, while seemingly simple, has a rich history, evolving from early counting methods to the sophisticated mathematical systems we use today.  Are you interested in learning more about the history of mathematics, or perhaps exploring some more complex mathematical problems?  I'm happy to delve into whatever you'd like!"

In [13]:
conversation.predict(input="What is my name?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hi, my name is Gaber
AI: Hi Gaber! It's nice to meet you.  My name isn't really a "name" in the human sense, as I don't have a personal identity like you do.  I'm a large language model, trained by Google.  I don't have feelings or experiences, but I can access and process information from a massive dataset of text and code.  So, tell me, what can I do for you today?  Are you interested in learning something new, writing a story, translating something, or perhaps just having a chat?  I'm ready for anything!
Human: What is 1+1?
AI: 1 + 1 = 2.  That's a pretty straightforward calculation!  It's a fundamental concept in arithmetic, forming the 

"Your name is Gaber.  We established that at the beginning of our conversation.  I have a pretty good memory (within the constraints of my current session, of course!), so I'm able to recall details from our interaction. Is there anything else I can help you with? Perhaps you'd like to explore a different topic, or maybe try a more challenging question?"

This prompt is generated by LangChain from the default `ConversationChain` template, which steers the model toward a friendly, talkative tone. As the conversation continues through the second and third turns, the template text stays the same; only the "Current conversation" section grows as earlier exchanges are appended.
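
To make the prompt assembly concrete, here is a minimal sketch in plain Python. The template text is copied from the verbose output above; the `build_prompt` helper is a hypothetical name used only for illustration and is not part of LangChain.

```python
# Illustrative sketch only: mimics how the chain fills its prompt template.
# `build_prompt` is a hypothetical helper, not part of LangChain.
TEMPLATE = (
    "The following is a friendly conversation between a human and an AI. "
    "The AI is talkative and provides lots of specific details from its context. "
    "If the AI does not know the answer to a question, it truthfully says it "
    "does not know.\n\n"
    "Current conversation:\n{history}\nHuman: {input}\nAI:"
)


def build_prompt(history: str, user_input: str) -> str:
    """Insert the stored history and the new user message into the template."""
    return TEMPLATE.format(history=history, input=user_input)


# First turn: the history is still empty.
first_turn = build_prompt("", "Hi, my name is Gaber")

# Second turn: the first exchange is now part of the history.
second_turn = build_prompt(
    "Human: Hi, my name is Gaber\nAI: Hi Gaber! It's nice to meet you.",
    "What is 1+1?",
)
print(second_turn)
```

The instruction text never changes between turns; the only part that grows is the history string, which is exactly what the memory object supplies.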

Since a memory object was used to retain conversational history, printing `memory.buffer` will display the stored dialogue up to that point. This allows the model to retain context and respond appropriately based on the ongoing interaction.

In [14]:
print(memory.buffer)

Human: Hi, my name is Gaber
AI: Hi Gaber! It's nice to meet you.  My name isn't really a "name" in the human sense, as I don't have a personal identity like you do.  I'm a large language model, trained by Google.  I don't have feelings or experiences, but I can access and process information from a massive dataset of text and code.  So, tell me, what can I do for you today?  Are you interested in learning something new, writing a story, translating something, or perhaps just having a chat?  I'm ready for anything!
Human: What is 1+1?
AI: 1 + 1 = 2.  That's a pretty straightforward calculation!  It's a fundamental concept in arithmetic, forming the basis for more complex mathematical operations.  Interestingly, the concept of addition, while seemingly simple, has a rich history, evolving from early counting methods to the sophisticated mathematical systems we use today.  Are you interested in learning more about the history of mathematics, or perhaps exploring some more complex mathemat

You can also display the stored conversation with `memory.load_memory_variables({})`, which returns a dictionary whose `history` key holds the conversation. The empty dictionary `{}` is passed as a default input in this case. LangChain supports more advanced usage with customized inputs, but those features are beyond the scope of this article and can be explored in the official LangChain documentation.

For now, you can disregard the reason behind passing an empty dictionary. The key takeaway is that this method retrieves everything the AI and the human have said so far, representing the complete conversation history stored in memory.

In [15]:
memory.load_memory_variables({})

{'history': 'Human: Hi, my name is Gaber\nAI: Hi Gaber! It\'s nice to meet you.  My name isn\'t really a "name" in the human sense, as I don\'t have a personal identity like you do.  I\'m a large language model, trained by Google.  I don\'t have feelings or experiences, but I can access and process information from a massive dataset of text and code.  So, tell me, what can I do for you today?  Are you interested in learning something new, writing a story, translating something, or perhaps just having a chat?  I\'m ready for anything!\nHuman: What is 1+1?\nAI: 1 + 1 = 2.  That\'s a pretty straightforward calculation!  It\'s a fundamental concept in arithmetic, forming the basis for more complex mathematical operations.  Interestingly, the concept of addition, while seemingly simple, has a rich history, evolving from early counting methods to the sophisticated mathematical systems we use today.  Are you interested in learning more about the history of mathematics, or perhaps exploring so

- `ConversationBufferMemory` stores the conversation history for the chain. To add inputs and outputs explicitly, call `save_context` on the memory object directly, which lets you record specific interactions by hand.

- This is the standard way to append exchanges programmatically, so that they are available as context in later interactions.

In [16]:
memory = ConversationBufferMemory()
memory.save_context({"input": "Hi"},
                    {"output": "What's up"})

In [17]:
print(memory.buffer)

Human: Hi
AI: What's up


In [20]:
memory.load_memory_variables({})

{'history': "Human: Hi\nAI: What's up"}
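
The behaviour shown in this section can be mimicked in a few lines of plain Python. The sketch below is only an illustration of the idea, not LangChain's implementation; `SimpleBufferMemory` is a hypothetical class that reuses the method names seen above.

```python
# Illustrative sketch only: a plain-Python stand-in for a full-history buffer.
# `SimpleBufferMemory` is a hypothetical class, not LangChain's implementation.
class SimpleBufferMemory:
    def __init__(self) -> None:
        self.turns: list[tuple[str, str]] = []  # (human message, AI message)

    def save_context(self, inputs: dict, outputs: dict) -> None:
        """Append one human/AI exchange to the history."""
        self.turns.append((inputs["input"], outputs["output"]))

    @property
    def buffer(self) -> str:
        """Render the whole history in the 'Human: ... / AI: ...' format."""
        lines = []
        for human, ai in self.turns:
            lines.append(f"Human: {human}")
            lines.append(f"AI: {ai}")
        return "\n".join(lines)

    def load_memory_variables(self, _inputs: dict) -> dict:
        """Return the history under the 'history' key."""
        return {"history": self.buffer}


sketch = SimpleBufferMemory()
sketch.save_context({"input": "Hi"}, {"output": "What's up"})
result = sketch.load_memory_variables({})
print(result)
```

Because nothing is ever removed, the rendered history grows with every turn, which is the main cost of this strategy.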

# Conversation Buffer Window Memory

ConversationBufferWindowMemory is a variant of LangChain’s memory module designed to manage conversational history more efficiently by retaining only a fixed number of the most recent exchanges. Unlike `ConversationBufferMemory`, which stores the entire dialogue, this memory type helps avoid exceeding token limits and keeps the context focused on the most relevant parts of the conversation.

In [28]:
memory = ConversationBufferWindowMemory(k=1)
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=False
)

In [29]:
memory.save_context({"input": "Hi"},
                    {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"},
                    {"output": "Cool"})

In [30]:
memory.load_memory_variables({})

{'history': 'Human: Not much, just hanging\nAI: Cool'}

Since the value of k was set to 1, only the most recent exchange is retained in memory. This behavior is useful when you want the system to focus solely on the latest part of the conversation. However, in practice, k is typically set to a larger number to retain a broader context.

For example, suppose we rerun the earlier conversation with k = 1: first "Hi, my name is Gaber", then "What is 1+1?", and finally "What is my name?". Because only the last exchange is remembered, the model still sees the most recent question and its answer ("What is 1+1?" and "1 + 1 = 2") but no longer sees the initial message in which the name was given. As a result, when asked to recall the name, the model responds that it does not know it.

This illustrates how ConversationBufferWindowMemory limits the context window to improve efficiency while sacrificing long-term memory.

In [31]:
conversation.predict(input="Hi, my name is Gaber")

"Hi Gaber! It's nice to meet you.  Is there anything I can help you with today?  I'm ready for a chat about pretty much anything, from the history of the Roman Empire to the best way to bake a chocolate chip cookie (though I can't actually *bake* anything, of course!).  Let me know what's on your mind."

In [32]:
conversation.predict(input="What is 1+1?")

"1 + 1 = 2.  That's a pretty straightforward one!  It's a fundamental concept in arithmetic, the branch of mathematics dealing with numbers and their operations.  This particular equation is an example of addition, where we're combining two quantities.  In this case, we're combining one unit with another unit to get a total of two units.  It's the basis for much more complex mathematical calculations, and you'll find it used everywhere from simple counting to advanced calculus.  Anything else I can help you with today, Gaber?"

In [33]:
conversation.predict(input="What is my name?")

'I don\'t know your name.  You haven\'t told me.  My previous response called you "Gaber," but that was just a guess based on common names and the fact that I\'m trying to be conversational.  I don\'t have access to personal information about you unless you explicitly provide it.  What\'s your name?'
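
The windowing behaviour can be sketched in plain Python with a bounded queue. This is a simplified illustration under the assumption that one "exchange" is a human/AI pair; `WindowMemory` is a hypothetical class, not LangChain's implementation.

```python
from collections import deque

# Illustrative sketch only: keep just the last k exchanges.
# `WindowMemory` is a hypothetical class, not LangChain's implementation.
class WindowMemory:
    def __init__(self, k: int) -> None:
        # A deque with maxlen=k silently drops the oldest exchange when full.
        self.turns = deque(maxlen=k)

    def save_context(self, inputs: dict, outputs: dict) -> None:
        self.turns.append((inputs["input"], outputs["output"]))

    def load_memory_variables(self, _inputs: dict) -> dict:
        lines = []
        for human, ai in self.turns:
            lines += [f"Human: {human}", f"AI: {ai}"]
        return {"history": "\n".join(lines)}


window = WindowMemory(k=1)
window.save_context({"input": "Hi"}, {"output": "What's up"})
window.save_context({"input": "Not much, just hanging"}, {"output": "Cool"})
window_history = window.load_memory_variables({})
print(window_history)
```

With `k=1` the first exchange is gone as soon as the second is saved, mirroring the `load_memory_variables` output shown earlier in this section.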

# Conversation Token Buffer Memory

ConversationTokenBufferMemory is a specialized memory type in LangChain that retains recent interactions based on token count, rather than a fixed number of message turns. This is particularly useful when working with large language models that have strict token limits for inputs.

In [35]:
memory = ConversationTokenBufferMemory(llm=llm, max_token_limit=50)

memory.save_context({"input": "AI is what?!"},
                    {"output": "Amazing!"})
memory.save_context({"input": "Backpropagation is what?"},
                    {"output": "Beautiful!"})
memory.save_context({"input": "Chatbots are what?"},
                    {"output": "Charming!"})



In [36]:
memory.load_memory_variables({})

{'history': 'Human: AI is what?!\nAI: Amazing!\nHuman: Backpropagation is what?\nAI: Beautiful!\nHuman: Chatbots are what?\nAI: Charming!'}

- With a token limit of 50, the three short exchanges fit within the budget, so the memory retains the entire conversation history.

In [37]:
memory = ConversationTokenBufferMemory(llm=llm, max_token_limit=20)
memory.save_context({"input": "AI is what?!"},
                    {"output": "Amazing!"})
memory.save_context({"input": "Backpropagation is what?"},
                    {"output": "Beautiful!"})
memory.save_context({"input": "Chatbots are what?"},
                    {"output": "Charming!"})
memory.load_memory_variables({})

{'history': 'AI: Beautiful!\nHuman: Chatbots are what?\nAI: Charming!'}

- If the token limit is reduced to 20, the memory drops the earliest parts of the conversation and preserves only the most recent messages. Pruning happens one message at a time, which is why the retained history begins with `AI: Beautiful!` rather than with a complete exchange. This keeps the total number of tokens within the specified limit, so the system maintains recent context without exceeding its token budget.
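
The pruning logic can be approximated in plain Python. The sketch below assumes that tokens are simply whitespace-separated words, which is much cruder than a real model tokenizer, so the limits are not comparable with the values used above; `TokenBudgetMemory` and `replay` are hypothetical names for illustration.

```python
# Illustrative sketch only: trim the oldest messages once a token budget is exceeded.
# Assumption: tokens are approximated by whitespace-separated words; a real model
# tokenizer counts differently. `TokenBudgetMemory` is a hypothetical class.
class TokenBudgetMemory:
    def __init__(self, max_token_limit: int) -> None:
        self.max_token_limit = max_token_limit
        self.messages: list[tuple[str, str]] = []  # (role, text)

    @staticmethod
    def count_tokens(text: str) -> int:
        return len(text.split())

    def _total(self) -> int:
        return sum(self.count_tokens(text) for _, text in self.messages)

    def save_context(self, inputs: dict, outputs: dict) -> None:
        self.messages.append(("Human", inputs["input"]))
        self.messages.append(("AI", outputs["output"]))
        # Drop whole messages from the front until the budget is met.
        while self.messages and self._total() > self.max_token_limit:
            self.messages.pop(0)

    def load_memory_variables(self, _inputs: dict) -> dict:
        history = "\n".join(f"{role}: {text}" for role, text in self.messages)
        return {"history": history}


def replay(limit: int) -> str:
    """Save the same three exchanges under a given budget and return the history."""
    mem = TokenBudgetMemory(max_token_limit=limit)
    mem.save_context({"input": "AI is what?!"}, {"output": "Amazing!"})
    mem.save_context({"input": "Backpropagation is what?"}, {"output": "Beautiful!"})
    mem.save_context({"input": "Chatbots are what?"}, {"output": "Charming!"})
    return mem.load_memory_variables({})["history"]


roomy = replay(limit=50)  # everything fits
tight = replay(limit=5)   # only the newest messages survive
print(tight)
```

Because whole messages are removed from the front, the surviving history can start with an AI message whose matching human message has already been dropped.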

# Conversation Summary Memory

`ConversationSummaryMemory` is a memory module in LangChain that retains the conversation history in the form of a running summary, rather than storing the entire dialogue verbatim. This approach is especially useful when working within strict token limits or when a high-level understanding of the conversation matters more than exact phrasing. The examples in this section use the closely related `ConversationSummaryBufferMemory`, which keeps the most recent messages verbatim and summarizes older ones once a token limit is exceeded.

In [41]:
schedule = "There is a meeting at 8am with your product team. \
You will need your powerpoint presentation prepared. \
9am-12pm have time to work on your LangChain \
project which will go quickly because Langchain is such a powerful tool. \
At Noon, lunch at the italian resturant with a customer who is driving \
from over an hour away to meet you to understand the latest in AI. \
Be sure to bring your laptop to show the latest LLM demo."

- In this example, we use ConversationSummaryBufferMemory with a maximum token limit of 400, which allows for a relatively large amount of conversational context to be retained. We simulate a dialogue that begins with several short exchanges—such as “Hello,” “What’s up?” “Not much, just hanging,” “Cool,” and “What is on the schedule today?”—followed by a longer response detailing the schedule.

- Given the generous token limit, the memory is able to store a substantial amount of content. When we inspect the memory variables, we can see that the entire conversation, including the lengthy response, has been preserved—demonstrating that 400 tokens were sufficient to capture the full interaction.

In [44]:
memory = ConversationSummaryBufferMemory(llm=llm, max_token_limit=400)
memory.save_context({"input": "Hello"}, {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"},
                    {"output": "Cool"})
memory.save_context({"input": "What is on the schedule today?"},
                    {"output": f"{schedule}"})

In [45]:
memory.load_memory_variables({})

{'history': "Human: Hello\nAI: What's up\nHuman: Not much, just hanging\nAI: Cool\nHuman: What is on the schedule today?\nAI: There is a meeting at 8am with your product team. You will need your powerpoint presentation prepared. 9am-12pm have time to work on your LangChain project which will go quickly because Langchain is such a powerful tool. At Noon, lunch at the italian resturant with a customer who is driving from over an hour away to meet you to understand the latest in AI. Be sure to bring your laptop to show the latest LLM demo."}

- If the maximum token limit is reduced to 100, the ConversationSummaryBufferMemory will invoke a language model—such as the Gemini endpoint in this case—to generate a concise summary of the conversation. This summary replaces the full text of previous exchanges, allowing the memory to retain essential context while remaining within the specified token limit. The resulting summary is shown below.

In [46]:
memory = ConversationSummaryBufferMemory(llm=llm, max_token_limit=100)
memory.save_context({"input": "Hello"}, {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"},
                    {"output": "Cool"})
memory.save_context({"input": "What is on the schedule today?"},
                    {"output": f"{schedule}"})

In [47]:
memory.load_memory_variables({})

{'history': 'System: The human greets the AI, and after some casual exchange, asks what is on the schedule for the day.\nAI: There is a meeting at 8am with your product team. You will need your powerpoint presentation prepared. 9am-12pm have time to work on your LangChain project which will go quickly because Langchain is such a powerful tool. At Noon, lunch at the italian resturant with a customer who is driving from over an hour away to meet you to understand the latest in AI. Be sure to bring your laptop to show the latest LLM demo.'}
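
The summarize-on-overflow idea can be sketched without an LLM. In the illustration below, tokens are approximated by whitespace-separated words and `fake_summarize` is a stub standing in for the model call; both are assumptions, and `SummaryBufferSketch` is a hypothetical class rather than LangChain's implementation.

```python
from typing import Callable

# Illustrative sketch only: summarize overflow instead of discarding it.
# Assumptions: tokens are approximated by whitespace-separated words, and
# `summarize` is a stand-in for an LLM call. All names here are hypothetical.
class SummaryBufferSketch:
    def __init__(self, summarize: Callable[[str, list[str]], str],
                 max_token_limit: int) -> None:
        self.summarize = summarize
        self.max_token_limit = max_token_limit
        self.summary = ""
        self.messages: list[str] = []  # recent messages kept verbatim

    def _total(self) -> int:
        return sum(len(m.split()) for m in self.messages)

    def save_context(self, inputs: dict, outputs: dict) -> None:
        self.messages += [f"Human: {inputs['input']}", f"AI: {outputs['output']}"]
        pruned = []
        # Move the oldest messages out of the buffer until it fits the budget.
        while self.messages and self._total() > self.max_token_limit:
            pruned.append(self.messages.pop(0))
        if pruned:
            # Fold the removed messages into the running summary.
            self.summary = self.summarize(self.summary, pruned)

    def load_memory_variables(self, _inputs: dict) -> dict:
        parts = ([f"System: {self.summary}"] if self.summary else []) + self.messages
        return {"history": "\n".join(parts)}


def fake_summarize(previous: str, pruned: list[str]) -> str:
    """Stub summarizer: a real one would prompt the LLM with both arguments."""
    prefix = previous + " " if previous else ""
    return prefix + f"[{len(pruned)} earlier messages condensed]"


sb = SummaryBufferSketch(fake_summarize, max_token_limit=8)
sb.save_context({"input": "Hello"}, {"output": "What's up"})
sb.save_context({"input": "Not much, just hanging"}, {"output": "Cool"})
sb_history = sb.load_memory_variables({})["history"]
print(sb_history)
```

As in the real output above, the history starts with a `System:` summary line followed by whatever recent messages still fit the budget.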

In [48]:
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=True
)

In [49]:
conversation.predict(input="What would be a good demo to show?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
System: The human greets the AI, and after some casual exchange, asks what is on the schedule for the day.
AI: There is a meeting at 8am with your product team. You will need your powerpoint presentation prepared. 9am-12pm have time to work on your LangChain project which will go quickly because Langchain is such a powerful tool. At Noon, lunch at the italian resturant with a customer who is driving from over an hour away to meet you to understand the latest in AI. Be sure to bring your laptop to show the latest LLM demo.
Human: What would be a good demo to show?
AI:

> Finished chain.


"That's a great question!  Since your customer is driving over an hour to learn about the latest in AI, I'd recommend focusing on a demo that showcases the practical applications of LLMs and is relatively easy to understand, even for someone without a deep technical background.  Here are a few options, ranked in order of my recommendation:\n\n1. **A customized question-answering system using LangChain:** This directly relates to your LangChain project and highlights its power. You could pre-load it with information relevant to your customer's industry or needs.  The demo could involve asking it a few insightful questions, demonstrating its ability to synthesize information and provide concise, accurate answers.  This shows the practical application of LLMs in a business context.  Make sure to choose questions that highlight the benefits for *them* specifically.\n\n2. **A simple text summarization demo:**  This is visually appealing and easy to grasp.  You could feed it a longer documen

# Conclusion

| Feature                     | ConversationBufferMemory | ConversationBufferWindowMemory | ConversationSummaryMemory |
| --------------------------- | ------------------------ | ------------------------------ | ------------------------- |
| Retention method            | Full history             | Last *k* message pairs         | Summarized context        |
| Token efficiency            | Low                      | Medium                         | High                      |
| Retains exact phrasing      | Yes                      | Yes (for last *k*)             | No                        |
| Suitable for long dialogues | No                       | Limited                        | Yes                       |
