## **What Is Memory in LangChain?**

In LangChain, memory refers to the **component that enables a language model (LLM) to retain context or information across multiple interactions** or conversations. A plain LLM call is stateless: the model processes one input at a time and retains nothing from earlier calls. Memory lets the application keep track of what was said earlier in a conversation and supply it to the model. This is especially useful for building applications like chatbots, personal assistants, or any system that must respond to user inputs based on prior exchanges.

Memory in LangChain enhances the capabilities of LLMs by allowing them to remember past interactions, providing context for future responses, and improving the user experience in applications that require ongoing conversations or tasks. Whether you need to store entire conversations, summaries, or specific details, LangChain’s memory types make it easy to implement and tailor memory to your application’s needs.


## **Why is Memory Important in LangChain?**
Memory in LangChain is critical for maintaining coherence and relevance in multi-step or conversational tasks. It enables the language model to:

**Remember past interactions:** So it can provide responses that are contextually aware and consistent across multiple turns in a conversation.

**Track user preferences or details:** Such as remembering user names, previous queries, or other specific information during an ongoing session.

**Enable multi-step workflows:** By keeping track of intermediate steps or decisions in a process, making it easier to execute complex tasks over time.

### **Types of Memory in LangChain**
LangChain supports different types of memory, depending on the use case:

**Buffer Memory:**

This stores the entire conversation history as a buffer, where each interaction (both user inputs and model responses) is stored in chronological order.
Use case: Useful in applications like chatbots where the model needs to remember everything discussed so far.

**Summary Memory:**

Instead of storing the entire conversation, this memory periodically summarizes the conversation and stores the summary. This is efficient when you need context but don't want to overload the model with long conversations.
Use case: Useful when the conversation gets long and it's impractical to store everything.
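
The idea of folding older turns into a summary can be sketched in plain Python. This is a minimal illustration, not LangChain's implementation: `SummaryMemorySketch` and `truncate_summarizer` are hypothetical names, and the summarizer is a stand-in that only truncates text, where a real application would ask an LLM to write the summary.

```python
from typing import Callable, List


def truncate_summarizer(text: str, max_words: int = 12) -> str:
    """Stand-in summarizer (assumption): keeps only the first `max_words` words.
    A real application would ask an LLM to write the summary instead."""
    words = text.split()
    suffix = " ..." if len(words) > max_words else ""
    return " ".join(words[:max_words]) + suffix


class SummaryMemorySketch:
    """Keeps a running summary plus a few recent turns verbatim."""

    def __init__(self, summarize: Callable[[str], str], max_recent: int = 2):
        self.summarize = summarize
        self.max_recent = max_recent
        self.summary = ""
        self.recent: List[str] = []

    def add_turn(self, human: str, ai: str) -> None:
        self.recent.append(f"Human: {human}\nAI: {ai}")
        # Fold the oldest turns into the summary once the buffer overflows.
        while len(self.recent) > self.max_recent:
            oldest = self.recent.pop(0)
            self.summary = self.summarize(f"{self.summary} {oldest}".strip())

    def context(self) -> str:
        """Text that would be injected into the next prompt."""
        parts = []
        if self.summary:
            parts.append(f"Summary so far: {self.summary}")
        parts.extend(self.recent)
        return "\n".join(parts)


memory = SummaryMemorySketch(truncate_summarizer, max_recent=1)
memory.add_turn("I want to open a Mexican restaurant.", "Great, what budget do you have?")
memory.add_turn("About 50k.", "That is enough for a small venue.")
print(memory.context())
```

Only the most recent turn is kept verbatim here; everything older survives only through the summary, which is what keeps the prompt short.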

**Vector-based Memory:**

This type of memory stores chunks of information as vectors using embeddings, which allows the model to retrieve relevant parts of the conversation without keeping the entire history in plain text.
Use case: Useful in knowledge retrieval tasks where specific details from past interactions need to be recalled with semantic understanding.
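
Retrieval by similarity can be illustrated with a toy example. The sketch below assumes a bag-of-words count vector as the "embedding" and cosine similarity for ranking; real systems use a learned embedding model and a vector store, and the names `toy_embed` and `VectorMemorySketch` are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def toy_embed(text: str) -> Counter:
    """Toy 'embedding' (assumption): a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorMemorySketch:
    """Stores each snippet with its vector and retrieves the closest ones."""

    def __init__(self) -> None:
        self.items: List[Tuple[str, Counter]] = []

    def add(self, text: str) -> None:
        self.items.append((text, toy_embed(text)))

    def search(self, query: str, k: int = 1) -> List[str]:
        query_vec = toy_embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(query_vec, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


memory = VectorMemorySketch()
memory.add("My favourite cuisine is Mexican food")
memory.add("I live in Berlin and work as a nurse")
memory.add("My budget for the restaurant is 50k")
print(memory.search("What cuisine do I like?"))
```

Only the best-matching snippets would be placed in the prompt, so the amount of context stays small however long the stored history becomes.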

**Key-Value Memory:**

This stores specific pieces of information as key-value pairs. It’s useful for remembering specific facts or details like a user’s preferences, names, or past inputs.
Use case: This is perfect for structured interactions where only certain details need to be retained, such as a personal assistant remembering a user’s preferences.
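
A key-value store of facts is simple enough to sketch with a dictionary. The class below is a hypothetical illustration (the name `KeyValueMemorySketch` and its methods are assumptions, not a LangChain API); it shows how stored facts could be rendered as text for a prompt.

```python
from typing import Dict


class KeyValueMemorySketch:
    """Remembers individual facts and renders them as prompt context."""

    def __init__(self) -> None:
        self.facts: Dict[str, str] = {}

    def remember(self, key: str, value: str) -> None:
        self.facts[key] = value  # a newer value overwrites the older one

    def as_prompt_context(self) -> str:
        # Sort keys so the rendered context is stable between calls.
        return "\n".join(f"{key}: {value}" for key, value in sorted(self.facts.items()))


memory = KeyValueMemorySketch()
memory.remember("name", "Sam")
memory.remember("favourite_cuisine", "Italian")
memory.remember("favourite_cuisine", "Mexican")  # the preference changed
print(memory.as_prompt_context())
```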

## **How Memory Works in LangChain**

Memory in LangChain works by storing relevant information from past interactions and injecting that stored information into the prompts that are sent to the language model. Depending on the memory type, LangChain may store the entire conversation, summarized context, or specific key-value pairs, ensuring that the model maintains awareness of previous exchanges.
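
The store-then-inject cycle described above can be sketched without LangChain. This is a simplified illustration under stated assumptions: `BufferMemorySketch`, `fake_llm`, and `chat` are hypothetical names, and `fake_llm` merely counts the human messages it can see in its prompt instead of calling a real model.

```python
from typing import List, Tuple

PROMPT = "Current conversation:\n{history}\nHuman: {input}\nAI:"


class BufferMemorySketch:
    """Stores every turn and renders the history as plain text."""

    def __init__(self) -> None:
        self.turns: List[Tuple[str, str]] = []

    def save(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))

    def load(self) -> str:
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call (assumption)."""
    seen = sum(line.startswith("Human:") for line in prompt.splitlines())
    return f"I can see {seen} human message(s) in my prompt."


def chat(memory: BufferMemorySketch, user_input: str) -> str:
    prompt = PROMPT.format(history=memory.load(), input=user_input)  # inject stored history
    reply = fake_llm(prompt)
    memory.save(user_input, reply)  # store the new turn for the next call
    return reply


memory = BufferMemorySketch()
print(chat(memory, "Hi, I am Sam."))
print(chat(memory, "What is my name?"))
```

The model itself stays stateless; the second call sees the first exchange only because the application pasted it into the prompt.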

### **Use Cases of Memory in LangChain**

**Chatbots:** Memory allows chatbots to hold long conversations while retaining important information.

**Personal Assistants:** For remembering user preferences, scheduling details, or previous interactions.

**Customer Support:** Retain customer history to provide better, more informed responses based on past issues or queries.

**Interactive Storytelling:** In AI-driven storytelling, memory allows the model to recall previous events in the story and keep the narrative coherent.

# **01: Installation**

In [1]:
!pip install langchain langchain_community

Collecting langchain
  Downloading langchain-0.3.2-py3-none-any.whl.metadata (7.1 kB)
Collecting langchain_community
  Downloading langchain_community-0.3.1-py3-none-any.whl.metadata (2.8 kB)
Collecting langchain-core<0.4.0,>=0.3.8 (from langchain)
  Downloading langchain_core-0.3.9-py3-none-any.whl.metadata (6.3 kB)
Collecting langchain-text-splitters<0.4.0,>=0.3.0 (from langchain)
  Downloading langchain_text_splitters-0.3.0-py3-none-any.whl.metadata (2.3 kB)
Collecting langsmith<0.2.0,>=0.1.17 (from langchain)
  Downloading langsmith-0.1.131-py3-none-any.whl.metadata (13 kB)
Collecting tenacity!=8.4.0,<9.0.0,>=8.1.0 (from langchain)
  Downloading tenacity-8.5.0-py3-none-any.whl.metadata (1.2 kB)
Collecting dataclasses-json<0.7,>=0.5.7 (from langchain_community)
  Downloading dataclasses_json-0.6.7-py3-none-any.whl.metadata (25 kB)
Collecting pydantic-settings<3.0.0,>=2.4.0 (from langchain_community)
  Downloading pydantic_settings-2.5.2-py3-none-any.whl.metadata (3.5 kB)

# **02: Set Up the Environment**

In [6]:
import os

In [7]:
# Create your API key from https://huggingface.co/settings/tokens
os.environ["HUGGINGFACEHUB_API_TOKEN"] = "Your API KEY"
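
To avoid hard-coding the token in a shared notebook, one option is to read it from the environment and prompt for it only when it is missing. The helper below is a sketch; `ensure_hf_token` is a hypothetical name, not part of LangChain or `huggingface_hub`.

```python
import os
from getpass import getpass


def ensure_hf_token(env_var: str = "HUGGINGFACEHUB_API_TOKEN") -> str:
    """Return the token from the environment, prompting only if it is missing."""
    token = os.environ.get(env_var)
    if not token:
        # getpass hides the typed value, so the token never appears in the notebook.
        token = getpass(f"Enter a value for {env_var}: ")
        os.environ[env_var] = token
    return token


# Usage (call once near the top of the notebook):
# ensure_hf_token()
```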

# **Hugging Face**

# **Example 1**

In [3]:
!pip install huggingface_hub



In [4]:
# HuggingFaceHub is provided by the langchain_community package installed above
from langchain_community.llms import HuggingFaceHub

## **03: Memory in LangChain**

If you have used a chatbot application like ChatGPT, you will have noticed that it remembers information from earlier in the conversation.

**HuggingFaceHub:**

HuggingFaceHub is a LangChain wrapper that **lets you call pre-trained language models hosted on the Hugging Face Hub**. The model runs on Hugging Face's servers through an inference client rather than being downloaded to your machine.
Here, HuggingFaceHub is used to initialize the **google/flan-t5-large** model from Hugging Face's collection.

**repo_id="google/flan-t5-large":**

repo_id specifies which model to use from Hugging Face's model hub. In this case, it's google/flan-t5-large, a large T5 model fine-tuned by Google for a variety of natural language processing (NLP) tasks, such as text generation, translation, and summarization.

**model_kwargs={"temperature": 0.8, "max_length": 64}:**

This parameter contains additional settings for configuring the behavior of the language model during generation.

**temperature=0.8:**

The temperature parameter controls the randomness of the model's output. It rescales the model's probabilities over the possible next tokens before one of them is sampled.
A temperature of 1 samples from the model's probabilities unchanged, while a lower temperature (like 0.2) concentrates probability on the most likely tokens and makes the output more deterministic. A value of 0.8 sits between the two, allowing some variety in responses while still maintaining coherence.
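
The effect of temperature can be seen on a small worked example. The sketch below applies a temperature-scaled softmax to three made-up token scores; the logit values are assumptions chosen for illustration, not output from flan-t5.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Turn raw scores into probabilities after dividing them by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens
for t in (0.2, 0.8, 1.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

As the temperature drops, the probability of the top-scoring token moves closer to 1, which is why low temperatures give nearly repeatable output.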

**max_length=64:**

max_length determines the maximum number of tokens (words or word fragments) the model is allowed to generate in its response. Here, it’s set to 64 tokens. This ensures that the model doesn't generate excessively long responses.

In [126]:

llm = HuggingFaceHub(repo_id="google/flan-t5-large", model_kwargs={"temperature":0.8, "max_length":64})

In [113]:
llm

HuggingFaceHub(client=<InferenceClient(model='google/flan-t5-base', timeout=None)>, repo_id='google/flan-t5-base', task='text2text-generation', model_kwargs={'temperature': 0.8, 'max_length': 64})

Note that the output above shows google/flan-t5-base rather than google/flan-t5-large. Judging by the cell numbers, it appears to have been captured from an earlier run, before the model was changed.

**PromptTemplate:**
This is a class from the LangChain library that allows you to create customizable prompt templates. These templates can include placeholders (variables) that will be filled dynamically based on user input or specific data.

**input_variables=['cuisine']:**

input_variables specifies the dynamic part(s) of the template. Here, it indicates that the template contains a variable called cuisine.
When using this template, you’ll need to provide a value for the cuisine variable (e.g., Italian, Chinese, Mexican, etc.).

**template="I want to open a restaurant for {cuisine} food...":**

This is the actual prompt template text. The {cuisine} is a placeholder that will be replaced with the specific cuisine provided at runtime.
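
What the template does with its placeholder can be shown with Python's built-in string formatting. This is a sketch of the idea rather than LangChain's code: `template_variables` and `fill` are hypothetical helpers, with `template_variables` playing the role that input_variables plays above.

```python
from string import Formatter
from typing import List

template = "I want to open a restaurant for {cuisine} food. Suggest a fancy name for this."


def template_variables(text: str) -> List[str]:
    """List the placeholder names found in a format string."""
    return sorted({name for _, name, _, _ in Formatter().parse(text) if name})


def fill(text: str, **values: str) -> str:
    """Replace each placeholder with the value supplied at runtime."""
    return text.format(**values)


print(template_variables(template))
print(fill(template, cuisine="Italian"))
```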

In [114]:
from langchain.prompts import PromptTemplate

prompt_template_name = PromptTemplate(
    input_variables=['cuisine'],
    template="I want to open a restaurant for {cuisine} food. Suggest a fancy name for this."
)

**LLMChain:**
This is a class from the LangChain library that represents a chain involving a language model (LLM) and a prompt. The chain processes inputs through the language model to generate a response.

**chain.run("Mexican"):**

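The **.run()** method runs the chain by taking the input ("Mexican"), inserting it into the prompt template, and passing the resulting prompt to the language model. In recent LangChain releases, .run() is deprecated in favor of .invoke().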
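
The two steps a prompt-plus-model chain performs can be sketched as follows. This is a simplified stand-in, not LangChain's LLMChain: `ChainSketch` and `fake_llm` are hypothetical names, and `fake_llm` only echoes the prompt it receives.

```python
from typing import Callable


class ChainSketch:
    """Fills a prompt template, then hands the result to a model."""

    def __init__(self, llm: Callable[[str], str], template: str, variable: str):
        self.llm = llm
        self.template = template
        self.variable = variable

    def run(self, value: str) -> str:
        prompt = self.template.format(**{self.variable: value})  # step 1: fill the template
        return self.llm(prompt)                                   # step 2: call the model


def fake_llm(prompt: str) -> str:
    """Placeholder model (assumption): reports the prompt it received."""
    return f"[model received: {prompt}]"


chain = ChainSketch(fake_llm, "Suggest a name for a {cuisine} restaurant.", "cuisine")
print(chain.run("Mexican"))
```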

In [115]:
from langchain.chains import LLMChain

chain = LLMChain(llm=llm, prompt=prompt_template_name)
name = chain.run("Mexican")
print(name)

el teo


In [116]:
name = chain.run("Indian")
print(name)

saada


In [117]:
chain.memory

In [118]:
# No memory is attached to the chain by default, so previous inputs are not retained
type(chain.memory)

NoneType

## **ConversationBufferMemory**

We can attach memory to a chain so that it remembers the entire previous conversation.

**ConversationBufferMemory:**
This is a type of memory in LangChain that stores the entire conversation history (both user inputs and model responses) in a buffer. The memory keeps track of all previous interactions, allowing the language model to maintain context across multiple turns in the conversation.

In [119]:
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory()

chain = LLMChain(llm=llm, prompt=prompt_template_name, memory=memory)
name = chain.run("Mexican")
print(name)

el teo


In [120]:
name = chain.run("Arabic")
print(name)

ahmed ahmed


**chain.memory.buffer:**
This is accessing the conversation buffer stored in the ConversationBufferMemory object.
The buffer contains the entire history of the conversation, including all user inputs and the corresponding responses from the language model.

In [121]:
print(chain.memory.buffer)

Human: Mexican
AI: el teo
Human: Arabic
AI: ahmed ahmed


## **ConversationChain**

A ConversationBufferMemory keeps growing without bound as the conversation gets longer.

In many applications it is enough to remember only the most recent exchanges, for example the last 5 or the last 10-20.

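**ConversationChain:**

ConversationChain is designed to facilitate multi-turn conversations with a language model, allowing the model to maintain context across multiple interactions. If no memory is passed in, it uses a ConversationBufferMemory by default, and the stored conversation fills the {history} placeholder of its prompt.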

In [168]:
from langchain.chains import ConversationChain

convo = ConversationChain(llm=llm)
print(convo.prompt.template)

The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
{history}
Human: {input}
AI:


In [169]:
convo.run("Who won the first T20 cricket world cup?")

'The first T20 cricket world cup was won by India in 2003.'

In [170]:
convo.run("what is 5 added with 5?")

'5 is the number of 5 added with 5.'

In [171]:
convo.run("Who was the captain of the winning team?")

'The captain of the winning team was Virat Kohli.'

In [172]:
print(convo.memory.buffer)

Human: Who won the first T20 cricket world cup?
AI: The first T20 cricket world cup was won by India in 2003.
Human: what is 5 added with 5?
AI: 5 is the number of 5 added with 5.
Human: Who was the captain of the winning team?
AI: The captain of the winning team was Virat Kohli.


## **ConversationBufferWindowMemory**

**ConversationBufferWindowMemory(k=4):**
This initializes a memory object that retains only the last 4 exchanges of the conversation, where one exchange is a user input together with the model's response.
The parameter k=4 specifies the size of the window: once a fifth exchange is added, the oldest one is no longer included in the history sent to the model. This is useful when you only need short-term context instead of remembering the entire conversation.
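
The sliding window can be sketched with a bounded deque. This is an illustration of the idea only and not how LangChain implements it: `WindowMemorySketch` is a hypothetical class that discards old turns as new ones are saved, and k=2 is used so the trimming is visible after three turns.

```python
from collections import deque


class WindowMemorySketch:
    """Remembers only the last k exchanges."""

    def __init__(self, k: int) -> None:
        self.turns = deque(maxlen=k)  # the deque drops the oldest turn when full

    def save(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))

    @property
    def buffer(self) -> str:
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)


memory = WindowMemorySketch(k=2)
memory.save("Question 1", "Answer 1")
memory.save("Question 2", "Answer 2")
memory.save("Question 3", "Answer 3")
print(memory.buffer)
```

After the third save, the first exchange is gone, so a follow-up question that depends on it could no longer be answered from memory.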

In [164]:
from langchain.memory import ConversationBufferWindowMemory

memory = ConversationBufferWindowMemory(k=4)

convo = ConversationChain(
    llm=llm,
    memory=memory
)
convo.run("Who won the last soccer world cup?")

'The last soccer world cup was in Brazil in 2018'

In [162]:
convo.run("who is Messi?")

'Messi is a footballer.'

In [165]:
convo.run("Who was the captain of the winning team?")

'The captain of the winning team was Lionel Messi'

In [166]:
convo.run("Which was the other team?")

'The other team was Germany'

In [167]:
print(convo.memory.buffer)

Human: Who won the last soccer world cup?
AI: The last soccer world cup was in Brazil in 2018
Human: Who was the captain of the winning team?
AI: The captain of the winning team was Lionel Messi
Human: Which was the other team?
AI: The other team was Germany


### Note: This notebook is for practicing memory in LangChain. Some of the model outputs shown are factually incorrect, so I encourage you to try other models, such as those from OpenAI.