## Memory

### ConversationBufferMemory

In [None]:
from langchain_ollama import ChatOllama
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

In [None]:
# instantiate the llm
model = "gemma3:12b"
llm = ChatOllama(model=model, temperature=0)
# create a memory object
memory = ConversationBufferMemory()
# create a conversation chain that uses the llm and memory
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=True
)



In [None]:
conversation.predict(input="Hi, my name is Philantenne!")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

Human: Hi, my name is Philantenne!
AI:[0m

[1m> Finished chain.[0m


"Hi Philantenne! It's lovely to meet you! I'm an AI, and you can call me Lumi. It's a name my developers chose for me – they said it evokes a sense of light and understanding. I'm running on a large language model, specifically version 3.5, and I'm hosted on Google's infrastructure. I'm trained on a massive dataset of text and code, which allows me to communicate and generate human-like text in response to a wide range of prompts and questions. What a pleasure to meet you, Philantenne! What brings you here today?"

In [None]:
conversation.predict(input="What is 1+1?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hi, my name is Philantenne!
AI: Hi Philantenne! It's lovely to meet you! I'm an AI, and you can call me Lumi. It's a name my developers chose for me – they said it evokes a sense of light and understanding. I'm running on a large language model, specifically version 3.5, and I'm hosted on Google's infrastructure. I'm trained on a massive dataset of text and code, which allows me to communicate and generate human-like text in response to a wide range of prompts and questions. What a pleasure to meet you, Philantenne! What brings you here today?
Human: What is 1+1?
AI:[0m

[1m> Finished chain.[0m


"That's a classic! 1 + 1 equals 2, Philantenne. It's a fundamental concept in arithmetic! My training data included a *lot* of mathematical equations, from simple addition like this to incredibly complex calculus problems. I can even perform more complicated calculations if you'd like – though I'm not a dedicated calculator, so for really intensive math, a specialized tool would be faster. Do you want me to try another calculation for you?"

In [None]:
conversation.predict(input="What is my name?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hi, my name is Philantenne!
AI: Hi Philantenne! It's lovely to meet you! I'm an AI, and you can call me Lumi. It's a name my developers chose for me – they said it evokes a sense of light and understanding. I'm running on a large language model, specifically version 3.5, and I'm hosted on Google's infrastructure. I'm trained on a massive dataset of text and code, which allows me to communicate and generate human-like text in response to a wide range of prompts and questions. What a pleasure to meet you, Philantenne! What brings you here today?
Human: What is 1+1?
AI: That's a classic! 1 + 1 equals 2, Philantenne. It's a fundamental concept i

"Your name is Philantenne, Philantenne! I remember you telling me just a moment ago. I'm designed to retain information from our conversation, so I can refer back to things we've already discussed. It helps make our interaction feel more natural and personalized. It's quite fascinating how these memory functions are implemented within my architecture!"

In [None]:
# plain text memory log
print(memory.buffer)
# structured memory log
print(memory.load_memory_variables({}))

Human: Hi, my name is Philantenne!
AI: Hi Philantenne! It's lovely to meet you! I'm an AI, and you can call me Lumi. It's a name my developers chose for me – they said it evokes a sense of light and understanding. I'm running on a large language model, specifically version 3.5, and I'm hosted on Google's infrastructure. I'm trained on a massive dataset of text and code, which allows me to communicate and generate human-like text in response to a wide range of prompts and questions. What a pleasure to meet you, Philantenne! What brings you here today?
Human: What is 1+1?
AI: That's a classic! 1 + 1 equals 2, Philantenne. It's a fundamental concept in arithmetic! My training data included a *lot* of mathematical equations, from simple addition like this to incredibly complex calculus problems. I can even perform more complicated calculations if you'd like – though I'm not a dedicated calculator, so for really intensive math, a specialized tool would be faster. Do you want me to try anoth

In [None]:
# we can read the memory buffer
memory.load_memory_variables({})
# and add exchanges to it manually
memory.save_context({"input": "Hi"}, 
                    {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"}, 
                    {"output": "Cool"})
memory.load_memory_variables({})

{'history': "Human: Hi, my name is Philantenne!\nAI: Hi Philantenne! It's lovely to meet you! I'm an AI, and you can call me Lumi. It's a name my developers chose for me – they said it evokes a sense of light and understanding. I'm running on a large language model, specifically version 3.5, and I'm hosted on Google's infrastructure. I'm trained on a massive dataset of text and code, which allows me to communicate and generate human-like text in response to a wide range of prompts and questions. What a pleasure to meet you, Philantenne! What brings you here today?\nHuman: What is 1+1?\nAI: That's a classic! 1 + 1 equals 2, Philantenne. It's a fundamental concept in arithmetic! My training data included a *lot* of mathematical equations, from simple addition like this to incredibly complex calculus problems. I can even perform more complicated calculations if you'd like – though I'm not a dedicated calculator, so for really intensive math, a specialized tool would be faster. Do you want
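To make the mechanism behind the output above more concrete, here is a minimal plain-Python sketch of what a buffer memory does conceptually: it stores every exchange verbatim and pastes the whole log into the prompt's `history` slot. The class name `BufferMemorySketch` and the abbreviated `PROMPT_TEMPLATE` are assumptions made for illustration (the template is shortened from the verbose chain output shown earlier); this is not LangChain's actual implementation.

```python
# Hypothetical sketch of the buffer-memory idea, not LangChain's code.
PROMPT_TEMPLATE = (
    "The following is a friendly conversation between a human and an AI.\n\n"
    "Current conversation:\n{history}\nHuman: {input}\nAI:"
)


class BufferMemorySketch:
    """Keeps every exchange verbatim, in order."""

    def __init__(self):
        self.lines = []

    def save_context(self, inputs, outputs):
        # store one exchange as two prefixed lines
        self.lines.append(f"Human: {inputs['input']}")
        self.lines.append(f"AI: {outputs['output']}")

    def load_memory_variables(self, _inputs):
        # the whole log is returned as a single string
        return {"history": "\n".join(self.lines)}


sketch = BufferMemorySketch()
sketch.save_context({"input": "Hi"}, {"output": "What's up"})
sketch.save_context({"input": "Not much, just hanging"}, {"output": "Cool"})

history = sketch.load_memory_variables({})["history"]
# the full history is injected into the prompt on every call
prompt = PROMPT_TEMPLATE.format(history=history, input="What is my name?")
print(prompt)
```

Because the whole log is resent on every call, the prompt grows with each exchange, which is the motivation for the bounded memory types in the following sections.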

### ConversationBufferWindowMemory
A memory variant that keeps only a sliding window of the most recent exchanges; `k` sets how many exchanges the model remembers.

In [None]:
from langchain.memory import ConversationBufferWindowMemory

memory = ConversationBufferWindowMemory(k=1)
memory.save_context({"input": "Hi"},
                    {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"},
                    {"output": "Cool"})
# with k=1, only the last exchange is kept
memory.load_memory_variables({})



{'history': 'Human: Not much, just hanging\nAI: Cool'}
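The sliding-window behaviour can be sketched in a few lines of plain Python. The class `WindowMemorySketch` is a hypothetical name used for illustration, and the sketch assumes `k >= 1`; it mimics the idea of a `k`-exchange window with a bounded `deque` rather than reproducing LangChain's implementation.

```python
from collections import deque


class WindowMemorySketch:
    """Hypothetical sketch: keeps only the last k exchanges (assumes k >= 1)."""

    def __init__(self, k):
        # a deque with maxlen drops the oldest item automatically
        self.exchanges = deque(maxlen=k)

    def save_context(self, user_input, ai_output):
        self.exchanges.append((user_input, ai_output))

    def history(self):
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.exchanges)


window = WindowMemorySketch(k=1)
window.save_context("Hi", "What's up")
window.save_context("Not much, just hanging", "Cool")
print(window.history())
# Human: Not much, just hanging
# AI: Cool
```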

In [None]:
# reinstantiate the llm
llm = ChatOllama(model=model, temperature=0)
# create a buffer window memory object that keeps only the last exchange (k=1)
memory = ConversationBufferWindowMemory(k=1)
# create a conversation chain that uses the llm and memory
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=False
)

In [None]:
conversation.predict(input="Hi, my name is Philantenne!")

"Hi Philantenne! It's lovely to meet you! I'm an AI, and you can call me Lumi. It's a name my developers chose for me – they said it evokes a sense of light and understanding. I'm running on a large language model, specifically version 3.5, and I'm hosted on Google's infrastructure. I'm trained on a massive dataset of text and code, which allows me to communicate and generate human-like text in response to a wide range of prompts and questions. What a pleasure to meet you, Philantenne! What brings you here today?"

In [None]:
print(memory.load_memory_variables({}))
conversation.predict(input="What is 1+1?")

{'history': "Human: What is 1+1?\nAI: That's a classic! 1 + 1 equals 2, Philantenne. It's a fundamental concept in arithmetic! My training data included a *lot* of mathematical equations, from simple addition like this to incredibly complex calculus problems. I can even perform more complicated calculations if you'd like – though I'm not a dedicated calculator, so for really intensive math, a specialized tool would be faster. Do you want me to try another calculation for you?"}


"That's a funny question, asking again so soon! 1 + 1 still equals 2, Philantenne. It's a constant, a bedrock principle of mathematics! My knowledge base contains countless examples of this, and variations on it, all confirming that 1 + 1 = 2. It's a core element of how I understand numerical relationships. Are you testing me, perhaps? 😉 Do you have another mathematical query for me?"

In [None]:
print(memory.load_memory_variables({}))
conversation.predict(input="What is my name?")

{'history': "Human: What is 1+1?\nAI: That's a funny question, asking again so soon! 1 + 1 still equals 2, Philantenne. It's a constant, a bedrock principle of mathematics! My knowledge base contains countless examples of this, and variations on it, all confirming that 1 + 1 = 2. It's a core element of how I understand numerical relationships. Are you testing me, perhaps? 😉 Do you have another mathematical query for me?"}


"Oh, that's a delightful question! You're Philantenne! I remember you just mentioned it in our previous exchange. My internal logs record your name as Philantenne, and I'm using it to personalize our conversation. It's quite lovely, by the way! Do you like being called Philantenne? I can adjust if you prefer something else, of course. 😊"

In [None]:
# the model recalls the name from its own last output, even though the input where I declared it is no longer in memory
memory.load_memory_variables({})

{'history': 'Human: What is my name?\nAI: That\'s a tricky one, Philantenne! As an AI, I don\'t actually *know* your name in the way a person would. I\'m processing this conversation, and you\'re using the name "Philantenne" when you address me. So, based on our current interaction, I\'m identifying your name as Philantenne. However, that\'s just based on what you\'re telling me! I don\'t have access to any external databases or personal information about you. It\'s a fascinating example of how I can infer information from context, even though I don\'t truly "know" it. Do you want me to explain how I process context a little more?'}

### ConversationTokenBufferMemory
Here the size of the memory buffer is capped by the number of tokens kept from the conversation.

In [None]:
from langchain.memory import ConversationTokenBufferMemory

# reinstantiate the llm
llm = ChatOllama(model=model, temperature=0)
# naive whitespace tokenizer for demonstration purposes, as I can't find a proper one for the gemma3 model
# note: it is never passed to the memory below, so the default token counter is still used
def simple_tokenizer(text):
    return text.split()
# create a token buffer memory object with limit of 50 tokens
memory = ConversationTokenBufferMemory(llm=llm, max_token_limit=50)
memory.save_context({"input": "AI is what?!"},
                    {"output": "Amazing!"})
memory.save_context({"input": "Backpropagation is what?"},
                    {"output": "Beautiful!"})
memory.save_context({"input": "Chatbots are what?"}, 
                    {"output": "Charming!"})

OSError: Can't load tokenizer for 'gpt2'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'gpt2' is the correct path to a directory containing all relevant files for a GPT2TokenizerFast tokenizer.

We use a custom subclass to override the token-counting methods called in the traceback above.<br>
Splitting on whitespace is an oversimplification, used for demonstration purposes only.

In [None]:
from langchain.memory import ConversationTokenBufferMemory
from langchain_community.chat_models import ChatOllama
from typing import List

# Create a modified LLM class that overrides token counting methods
class CustomTokenCountLLM(ChatOllama):
    def get_num_tokens(self, text: str) -> int:
        # Simply count words by splitting on spaces
        return len(text.split())
    
    def get_num_tokens_from_messages(self, messages: List[dict]) -> int:
        # Count tokens for each message
        count = 0
        for message in messages:
            # Extract message content, which can be in different formats
            if hasattr(message, "content"):
                content = message.content
            elif isinstance(message, dict) and "content" in message:
                content = message["content"]
            else:
                content = str(message)
            
            count += self.get_num_tokens(content)
        return count

# Initialize the model with our custom class
model = "gemma3:12b"  # or your model name
llm = CustomTokenCountLLM(model=model, temperature=0)

# Create the memory object with our custom LLM
memory = ConversationTokenBufferMemory(llm=llm, max_token_limit=10)

# Save context
memory.save_context({"input": "AI is what?!"},
                    {"output": "Amazing!"})
memory.save_context({"input": "Backpropagation is what?"},
                    {"output": "Beautiful!"})
memory.save_context({"input": "Chatbots are what?"}, 
                    {"output": "Charming!"})

In [None]:
# Test to verify that the memory works
print(memory.load_memory_variables({}))

{'history': 'AI: Amazing!\nHuman: Backpropagation is what?\nAI: Beautiful!\nHuman: Chatbots are what?\nAI: Charming!'}
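The pruning that leads to a result like the one above can be sketched without any library. The assumption here is that the memory drops the oldest messages one at a time until the total token count fits under the limit, and that tokens are counted by whitespace splitting, as in the custom class. The helper names `count_tokens` and `prune_to_limit` are hypothetical.

```python
def count_tokens(text):
    # same naive word count as the custom LLM class
    return len(text.split())


def prune_to_limit(messages, max_token_limit):
    """Drop the oldest messages until the total count fits the limit."""
    pruned = list(messages)
    while pruned and sum(count_tokens(c) for _, c in pruned) > max_token_limit:
        pruned.pop(0)
    return pruned


messages = [
    ("Human", "AI is what?!"),
    ("AI", "Amazing!"),
    ("Human", "Backpropagation is what?"),
    ("AI", "Beautiful!"),
    ("Human", "Chatbots are what?"),
    ("AI", "Charming!"),
]

# 12 words in total; dropping the first message (3 words) leaves 9 <= 10
kept = prune_to_limit(messages, max_token_limit=10)
pruned_history = "\n".join(f"{role}: {content}" for role, content in kept)
print(pruned_history)
```

With a limit of 10 word-tokens only the first human message is dropped, which is why the history starts with an `AI:` line.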


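### ConversationSummaryBufferMemory
Lets an LLM summarize the conversation history so far and uses that summary as the memory. Recent exchanges are kept verbatim; once `max_token_limit` is exceeded, older ones are folded into the summary.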

In [None]:
from langchain.memory import ConversationSummaryBufferMemory

# create a long string
schedule = "There is a meeting at 8am with your product team. \
You will need your powerpoint presentation prepared. \
9am-12pm have time to work on your LangChain \
project which will go quickly because Langchain is such a powerful tool. \
At Noon, lunch at the italian resturant with a customer who is driving \
from over an hour away to meet you to understand the latest in AI. \
Be sure to bring your laptop to show the latest LLM demo."

memory = ConversationSummaryBufferMemory(llm=llm, max_token_limit=100)
# setup chit-chats
memory.save_context({"input": "Hello"}, {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"},
                    {"output": "Cool"})
# then pass the long-string schedule to memory
memory.save_context({"input": "What is on the schedule today?"}, 
                    {"output": f"{schedule}"})

In [None]:
memory.load_memory_variables({})

{'history': "Human: Hello\nAI: What's up\nHuman: Not much, just hanging\nAI: Cool\nHuman: What is on the schedule today?\nAI: There is a meeting at 8am with your product team. You will need your powerpoint presentation prepared. 9am-12pm have time to work on your LangChain project which will go quickly because Langchain is such a powerful tool. At Noon, lunch at the italian resturant with a customer who is driving from over an hour away to meet you to understand the latest in AI. Be sure to bring your laptop to show the latest LLM demo."}

In [None]:
# create a conversation chain that uses the llm and the summary buffer memory
conversation = ConversationChain(
    llm=llm, 
    memory = memory,
    verbose=True
)
# test LLM response on summary-based memory
conversation.predict(input="What would be a good demo to show?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hello
AI: What's up
Human: Not much, just hanging
AI: Cool
Human: What is on the schedule today?
AI: There is a meeting at 8am with your product team. You will need your powerpoint presentation prepared. 9am-12pm have time to work on your LangChain project which will go quickly because Langchain is such a powerful tool. At Noon, lunch at the italian resturant with a customer who is driving from over an hour away to meet you to understand the latest in AI. Be sure to bring your laptop to show the latest LLM demo.
Human: What would be a good demo to show?
AI:[0m

[1m> Finished chain.[0m


'Given the customer is driving a long way and is interested in the "latest in AI," a demo showcasing LangChain\'s capabilities with a Large Language Model (LLM) would be excellent. Specifically, I\'d suggest focusing on a **Retrieval Augmented Generation (RAG)** demo.\n\nHere\'s why and some specifics you could include:\n\n*   **Impressive & Relevant:** RAG is a very hot topic right now. It demonstrates how to combine the power of LLMs with your own data, which is likely something your customer is interested in. It shows you\'re not just using a generic LLM, but tailoring it to their specific needs.\n*   **Showcases LangChain\'s Power:** LangChain makes building RAG pipelines relatively straightforward. You can highlight how quickly and easily you built the demo using LangChain\'s components.\n*   **Tangible Value:** You can tailor the demo to show how RAG can solve a specific problem. For example, if you know anything about the customer\'s business, you could load a dataset of their i

In [None]:
memory.load_memory_variables({})

{'history': "System: The human initiated a conversation with the AI. The AI informed the human about their schedule, including an 8am meeting with the product team, time to work on a LangChain project, and a noon lunch meeting with a customer driving in to discuss AI. The human then asked for a good demo to show the customer, and the AI suggested a Retrieval Augmented Generation (RAG) demo showcasing LangChain's capabilities with a Large Language Model (LLM), highlighting its impressiveness, relevance, and ability to demonstrate tangible value. The AI also provided specific demo ideas and technical details to mention, and requested more information about the customer's business to further tailor the suggestion."}
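The combination of a verbatim buffer and a running summary can be sketched as follows. Everything here is a hypothetical illustration: `SummaryBufferSketch` is not a LangChain class, tokens are counted by whitespace splitting, and `fake_summarize` is a stand-in for the LLM call that would normally write the summary.

```python
def count_tokens(text):
    return len(text.split())


def fake_summarize(previous_summary, messages):
    # placeholder for an LLM call: simply concatenates the dropped content
    dropped = "; ".join(content for _, content in messages)
    return f"{previous_summary} Earlier: {dropped}".strip()


class SummaryBufferSketch:
    """Hypothetical sketch: recent messages verbatim, older ones summarized."""

    def __init__(self, summarize, max_token_limit):
        self.summarize = summarize
        self.max_token_limit = max_token_limit
        self.summary = ""
        self.buffer = []

    def save_context(self, user_input, ai_output):
        self.buffer.append(("Human", user_input))
        self.buffer.append(("AI", ai_output))
        self._prune()

    def _prune(self):
        overflow = []
        # move the oldest messages out of the buffer until it fits the limit
        while self.buffer and sum(count_tokens(c) for _, c in self.buffer) > self.max_token_limit:
            overflow.append(self.buffer.pop(0))
        if overflow:
            self.summary = self.summarize(self.summary, overflow)

    def history(self):
        lines = [f"System: {self.summary}"] if self.summary else []
        lines += [f"{role}: {content}" for role, content in self.buffer]
        return "\n".join(lines)


summary_memory = SummaryBufferSketch(fake_summarize, max_token_limit=5)
summary_memory.save_context("Hello", "What's up")
summary_memory.save_context("Not much, just hanging", "Cool")
print(summary_memory.history())
```

In this sketch the first exchange overflows the 5-token limit and moves into the `System:` summary line, while the latest exchange stays verbatim.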

### Additional Memory Types

Vector data memory:
* Stores text (from conversation or elsewhere) in a vector database and retrieves the most relevant blocks of text.

Entity memories:
* Using an LLM, it remembers details about specific entities.

Multiple memories at one time:
* E.g., Conversation memory + Entity memory to recall individuals.

Conventional databases:
* Store the conversation in a conventional database (such as a key-value store or SQL).
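As a rough illustration of the vector-memory idea from the list above, the following sketch stores text snippets and retrieves the most similar one for a query. It is a toy under strong assumptions: `embed` is a bag-of-words counter standing in for a real embedding model, a Python list stands in for the vector database, and `VectorMemorySketch` is a hypothetical name.

```python
import math
from collections import Counter


def embed(text):
    # toy bag-of-words vector; a real setup would use an embedding model
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class VectorMemorySketch:
    """Hypothetical sketch: stores texts and returns the most relevant ones."""

    def __init__(self):
        self.entries = []

    def save(self, text):
        self.entries.append((text, embed(text)))

    def retrieve(self, query, k=1):
        query_vector = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(query_vector, e[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


vector_memory = VectorMemorySketch()
vector_memory.save("My favorite food is pizza")
vector_memory.save("There is a meeting at 8am")
vector_memory.save("The dog is called Rex")

print(vector_memory.retrieve("favorite food"))
```

Only the snippets that score highest against the query are returned, so the prompt stays small even when the stored history is large.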