# Section 1. Introduction to LangChain

## Introduction

LangChain is an open-source framework that simplifies the development of applications powered by language models (LLMs). In a rapidly evolving landscape of language models, LangChain stands out by offering a more accessible and flexible way to integrate these technologies into a wide range of applications. This enables the creation of smarter and more personalized solutions, tailored to the specific needs of each context.

LangChain applications span various areas:
* Intelligent and conversational chatbots;
* Creative multimedia content generation;
* Data analysis for extracting valuable insights;
* Virtual assistants capable of automating complex tasks;
* Accurate and informative question-and-answer systems.

## 1.1 About APIs

Unlike direct interaction with language models through interfaces like ChatGPT, LangChain uses APIs (Application Programming Interfaces) for automated communication. These APIs act as bridges, allowing LangChain to connect to and interact with LLMs via web protocols.

Each LLM provider offers its own API, and to use it, you need to obtain an access key. This key serves as authentication, enabling the service to identify and authorize your requests, execute them, and return the appropriate responses.

The OpenAI API is widely used, known for its robustness and low error rate. However, it is a paid service and can become expensive depending on the model used. For learning and testing purposes, we recommend Google's API, which offers a generous amount of free requests. Check the usage limits at this [link](https://ai.google.dev/gemini-api/docs/rate-limits?hl=pt-br).

To create your account and obtain access keys, visit the links below. Remember to store your keys in a safe place, as they won’t be displayed again and losing them will require generating new ones.

* Gemini (Google): https://ai.google.dev/gemini-api/docs/api-key
* OpenAI: https://platform.openai.com


Now let's start using LangChain!
First, we need to install the dependencies required to run LangChain with Google's Gemini and OpenAI models by executing the cell below.

In [None]:
!pip install python-dotenv
!pip install langchain-openai
!pip install -qU langchain-google-genai

One way to save your keys is by using a `.env` file. If you wish, create a file named `.env` and save your keys as follows:
```
GOOGLE_API_KEY = …
OPENAI_API_KEY = …
```

In [None]:
import getpass
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file
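To make the `.env` mechanism concrete, the sketch below shows roughly what loading such a file amounts to: read `KEY = VALUE` lines and copy them into the environment without overwriting values that are already set. The helper `parse_env_text`, the `environ` dictionary (a stand-in for `os.environ`), and the key values are all hypothetical. `python-dotenv` handles more cases than this, so keep using `load_dotenv` in practice.

```python
def parse_env_text(text):
    """Parse simple KEY = VALUE lines into a dict (hypothetical helper, not python-dotenv's parser)."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values

# Placeholder values for illustration only; never commit real keys.
sample = """
# keys for the course
GOOGLE_API_KEY = my-google-key
OPENAI_API_KEY = "my-openai-key"
"""

# A plain dict stands in for os.environ so this demo does not touch real settings.
environ = {"GOOGLE_API_KEY": "already-set"}

parsed = parse_env_text(sample)
for name, value in parsed.items():
    # setdefault keeps any value that is already present.
    environ.setdefault(name, value)

print(sorted(parsed))
print(environ["GOOGLE_API_KEY"])
```

The key that was already present keeps its value, and only the missing key is filled in from the file.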

## 1.2 Language Model Initialization

The function below simplifies the use of different models (Gemini or GPT). You can select which model you want to use and also adjust some parameters:

#### Temperature:
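Temperature controls how random the choice of the next word is. The model's scores for each candidate word are rescaled before sampling, so low values concentrate probability on the most likely words and high values spread it more evenly.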

* High Temperature (>0.7):
  * Increases randomness. The model is more likely to pick less probable words, leading to more creative, surprising, or even chaotic responses.
  * Good for storytelling, brainstorming, or when you want variety.

* Low Temperature (<0.5):
  * Makes output more predictable and focused. The model sticks to high-probability choices.
  * Ideal for factual, logical, or structured tasks.
* 0.7 is a common default for chatbots.
* 0.0 is a common default for agents.
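The effect of temperature can be sketched with a few lines of plain Python. This is an illustration of the general technique (dividing scores by the temperature before a softmax), not the code of any particular provider; the `logits` values are made up. A temperature of exactly 0 cannot be plugged into this formula and is typically treated as "always pick the most likely word".

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; temperature must be > 0."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next words.
logits = [2.0, 1.0, 0.5]

low = softmax_with_temperature(logits, 0.2)
high = softmax_with_temperature(logits, 1.5)

print([round(p, 3) for p in low])   # almost all probability on the first word
print([round(p, 3) for p in high])  # probability spread across all three words
```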

#### Top-P (Nucleus Sampling):
Top-p restricts sampling to the most probable next words, which keeps the output diverse but coherent. Because the cutoff depends on the probability distribution at each step, the number of candidate words adapts to the context, which tends to produce more natural and fluent output.

The model sorts all possible next words by likelihood, then considers only the top ones whose cumulative probability reaches p (e.g., 0.9).
  * This means the model does not take a fixed number of words (such as the top 10), but rather the smallest set of top words that collectively feel “safe enough.”

* 0.8–0.95 for balanced diversity and relevance.
* Lower values (like 0.6) make the output more narrow and safe.
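The nucleus filter described above can be sketched as follows. The function `top_p_filter` and the probabilities are hypothetical; real implementations operate on the model's full vocabulary, but the logic is the same: sort, accumulate until the threshold `p` is reached, then renormalize.

```python
def top_p_filter(probs, p):
    """Keep the smallest set of most likely tokens whose cumulative probability reaches p."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept = {}
    cumulative = 0.0
    for token, prob in ranked:
        kept[token] = prob
        cumulative += prob
        if cumulative >= p:
            break
    # Renormalize so the kept probabilities sum to 1 before sampling.
    return {token: prob / cumulative for token, prob in kept.items()}

# Hypothetical next-word distribution.
probs = {"cat": 0.5, "dog": 0.3, "fish": 0.15, "rock": 0.05}

nucleus = top_p_filter(probs, p=0.9)
print(list(nucleus))  # ['cat', 'dog', 'fish']
```

With `p=0.6` the same function keeps only `cat` and `dog`, which is why lower values make the output narrower.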


You can also test these parameters on [Google AI Studio](https://aistudio.google.com/prompts/new_chat).

In [None]:
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_openai import ChatOpenAI

def get_model_name(model_name, temperature=0):
    """Return a chat model for the given provider ("gemini" or "openai")."""
    if model_name == "gemini": # rate limits: https://ai.google.dev/gemini-api/docs/rate-limits?hl=pt-br
        if "GOOGLE_API_KEY" not in os.environ: # https://ai.google.dev/gemini-api/docs/api-key
            os.environ["GOOGLE_API_KEY"] = getpass.getpass("Enter your Google AI API key: ")
        llm = ChatGoogleGenerativeAI(
            # model="gemini-1.5-pro", # max 50 / day
            model="gemini-1.5-flash", # max 1500 / day
            temperature=temperature,
        )
    elif model_name == "openai":
        if "OPENAI_API_KEY" not in os.environ: # https://platform.openai.com
            os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ")
        llm = ChatOpenAI(
            model="gpt-4o-mini",
            temperature=temperature,
        )
    else:
        # Fail early with a clear message instead of an unbound `llm` at the return
        raise ValueError(f"Unknown model_name: {model_name!r}")
    return llm

llm = get_model_name('gemini')

Now that we have a `BaseChatModel` object named `llm`, we can test it by making our first API call using the `invoke` method with a given prompt. Try changing the prompt to observe different outputs.

In [None]:
prompt = "Hello, how are you?"
response = llm.invoke(prompt)
print(response)

content='I am doing well, thank you for asking!  How are you today?' additional_kwargs={} response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'model_name': 'gemini-1.5-flash', 'safety_ratings': []} id='run-e506b807-83e4-4c3c-9b0e-7c05166c60fc-0' usage_metadata={'input_tokens': 6, 'output_tokens': 17, 'total_tokens': 23, 'input_token_details': {'cache_read': 0}}


Another way to interact with the API is by setting up a conversation with the AI. The cell below creates a list called `messages`. This list is used to structure the conversation that will be sent to the language model (LLM).

* Each element in the `messages` list is a tuple containing two strings:
  * The first string indicates the "role" of the message ("system" or "human").
  * The second string contains the content of the message.
* `("system", ...)`: Defines a system message. System messages provide instructions or context to the LLM, guiding its behavior. In this case, you're instructing the LLM to act as an assistant that translates English to French.
* `("human", ...)`: Defines a user (human) message. This is the user's input that the LLM will process. In this case, the input is the sentence "I love programming."

In [None]:
messages = [
    (
        "system",
        "You are a helpful assistant that translates English to French. Translate the user sentence.",
    ),
    ("human", "I love programming."),
]
response = llm.invoke(messages)
print(response.content)

J'aime la programmation.


## 1.3 Prompt Template

A fundamental technique when using LangChain is the application of Prompt Templates. They allow developers to create dynamic and reusable prompts, where variables can be inserted to customize the instructions sent to LLMs during workflow execution. This not only simplifies the creation of complex prompts but also ensures consistency and makes application maintenance easier.

The syntax is simple, as shown in the cell below: you define a string using triple double-quotes, and variables are inserted between curly braces. These variables represent the text that will be dynamically modified within the template.

In [None]:
template_string = """Translate the text \
that is delimited by triple backticks \
into a style that is {style}. \
text: ```{text}```
"""

In [None]:
from langchain_core.prompts import ChatPromptTemplate

prompt_template = ChatPromptTemplate.from_template(template_string)

In [None]:
prompt_template.messages[0].prompt

PromptTemplate(input_variables=['style', 'text'], input_types={}, partial_variables={}, template='Translate the text that is delimited by triple backticks into a style that is {style}. text: ```{text}```\n')

Here we can see that `prompt_template` takes two input variables: `['style', 'text']`.

In [None]:
prompt_template.messages[0].prompt.input_variables

['style', 'text']
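For intuition about where `input_variables` comes from: with its default settings, the template uses Python's format-string placeholder style, so the standard library can reproduce the idea. The snippet below is an analogy using `string.Formatter` and `str.format`, not LangChain's internal code, and the shortened `template` string is made up for the example.

```python
from string import Formatter

template = "Translate the text into a style that is {style}. text: {text}"

# Formatter().parse yields (literal_text, field_name, format_spec, conversion) tuples.
variables = [field for _, field, _, _ in Formatter().parse(template) if field]
print(variables)  # ['style', 'text']

# Filling the placeholders produces the final prompt text.
filled = template.format(style="formal English", text="hey, what's up?")
print(filled)
```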

Now, let’s input the two variables and build the prompt that we’ll send to the LLM.

In [None]:
customer_style = """American English \
in a calm and respectful tone
"""

customer_email = """
Arrr, I be fuming that me blender lid \
flew off and splattered me kitchen walls \
with smoothie! And to make matters worse, \
the warranty don't cover the cost of \
cleaning up me kitchen. I need yer help \
right now, matey!
"""

customer_messages = prompt_template.format_messages(
                    style=customer_style,
                    text=customer_email)

print(type(customer_messages))
print(type(customer_messages[0]))
print(customer_messages[0])

<class 'list'>
<class 'langchain_core.messages.human.HumanMessage'>
content="Translate the text that is delimited by triple backticks into a style that is American English in a calm and respectful tone\n. text: ```\nArrr, I be fuming that me blender lid flew off and splattered me kitchen walls with smoothie! And to make matters worse, the warranty don't cover the cost of cleaning up me kitchen. I need yer help right now, matey!\n```\n" additional_kwargs={} response_metadata={}


After using the `format_messages` method with the input variables, we build the final prompt. When inspecting `customer_messages`, we see that it consists of a list containing a single element of type `HumanMessage`. In LangChain, `HumanMessage` represents the prompts that the LLM should respond to, and not necessarily a message written by a human, as we’ll see later.

In [None]:
# Call the LLM to translate to the style of the customer message
customer_response = llm.invoke(customer_messages)
print(customer_response.content)

I'm quite upset that my blender lid came off unexpectedly, resulting in smoothie splattered on my kitchen walls.  To make matters worse, the warranty doesn't appear to cover the cost of cleaning up the mess. I would appreciate your assistance with this matter.


## 1.3 Output Parsers

Now that we’ve learned how to structure the message we send to the LLM, let’s learn how to ask the LLM to return its answer in a structured format using Output Parsers. They allow developers to define the desired format for the output, whether it's JSON, lists, objects, or any other specific structure. This is especially useful in applications that require extracting precise and structured information from LLMs, making it easier to integrate with other tools and systems.

First, let's define what we would like the LLM output to look like:

```python
{
  "gift": False,
  "delivery_days": 5,
  "price_value": "pretty affordable!"
}
```

In LangChain, an effective way to define Output Parsers is by using the `pydantic` library. Pydantic is a Python library that allows the creation of robust data models using type annotations, offering data validation and automatic serialization/deserialization. For this purpose, we import:

* `BaseModel`: the base class for defining structured data models.
* `Field`: used to define individual fields within the model, allowing the addition of descriptive metadata.

We can define the desired output structure by creating a class that inherits from `BaseModel`. Inside this class, each field that will make up the LLM’s output is defined as follows:

```python
    gift: bool = Field(
        description='Describe the field for the LLM: how is it structured? How should it be filled out? What should or should not be included?'
                            )

```

First, we define the name of the output field, `gift` in this example, followed by its type, boolean (`bool`) in this case. Inside `Field`, we provide a detailed description for the LLM, in the form of a prompt, explaining the expected content for this specific field. We can also include additional parameters that set limits and constraints on the field's value.
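As a minimal sketch of such constraints, the model below uses two of Pydantic's built-in `Field` arguments, `ge` (greater than or equal) and `max_length`. The class `DeliveryInfo` and its fields are hypothetical and exist only for this illustration. Pydantic checks the constraints whenever an object is built, so an out-of-range value raises a `ValidationError`.

```python
from pydantic import BaseModel, Field, ValidationError

class DeliveryInfo(BaseModel):
    """Hypothetical model used only to illustrate field constraints."""
    delivery_days: int = Field(
        description="Days until the product arrived; -1 if unknown.",
        ge=-1,  # values below -1 are rejected
    )
    summary: str = Field(
        description="One-sentence summary of the review.",
        max_length=80,  # longer strings are rejected
    )

ok = DeliveryInfo(delivery_days=2, summary="Arrived quickly.")
print(ok.model_dump())

try:
    DeliveryInfo(delivery_days=-5, summary="Arrived quickly.")
except ValidationError as error:
    print(f"rejected with {len(error.errors())} error(s)")
```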

You can read more about `Output Parsers` at the link below:

https://python.langchain.com/docs/how_to/structured_output/


In [None]:
from pydantic import BaseModel, Field
class StructuredOutput(BaseModel): # REF: https://python.langchain.com/docs/how_to/structured_output/
    gift: bool = Field(
        description='Was the item purchased\
                            as a gift for someone else? \
                            Answer True if yes,\
                            False if not or unknown.'
                            )
    delivery_days_schema: int = Field(
        description="How many days\
                            did it take for the product\
                            to arrive? If this \
                            information is not found,\
                            output -1."
                            )
    price_value_schema: str = Field(
        description="Extract any\
                            sentences about the value or \
                            price, and output them as a \
                            comma separated Python list."
                            )

Now that we’ve defined the LLM output format, let’s combine this format with the model.

In [None]:
print(llm)
llm_with_structured_output = llm.with_structured_output(StructuredOutput)
print(llm_with_structured_output)

model='models/gemini-1.5-flash' google_api_key=SecretStr('**********') temperature=0.0 client=<google.ai.generativelanguage_v1beta.services.generative_service.client.GenerativeServiceClient object at 0x7d6b05ee6890> default_metadata=()
first=RunnableBinding(bound=ChatGoogleGenerativeAI(model='models/gemini-1.5-flash', google_api_key=SecretStr('**********'), temperature=0.0, client=<google.ai.generativelanguage_v1beta.services.generative_service.client.GenerativeServiceClient object at 0x7d6b05ee6890>, default_metadata=()), kwargs={'tools': [{'type': 'function', 'function': {'name': 'StructuredOutput', 'description': '', 'parameters': {'properties': {'gift': {'description': 'Was the item purchased                            as a gift for someone else?                             Answer True if yes,                            False if not or unknown.', 'type': 'boolean'}, 'delivery_days_schema': {'description': 'How many days                            did it take for the product        

Now, let’s create a practical example. Suppose we want to extract the following data from a customer review: first, whether the purchased item was a gift; second, the delivery time mentioned in the review; and third, any comments about the product’s price.

With that in mind, let’s build a sample review and the structured prompt that will be sent to the LLM to extract the desired data.

In [None]:
customer_review = """\
This leaf blower is pretty amazing.  It has four settings: \
candle blower, gentle breeze, windy city, and tornado. \
It arrived in two days, just in time for my wife's \
anniversary present. \
I think my wife liked it so much she was speechless. \
So far I've been the only one using it, and I've been \
using it every other morning to clear the leaves on our lawn. \
It's slightly more expensive than the other leaf blowers \
out there, but I think it's worth it for the extra features.
"""

In [None]:
review_template_2 = """\
For the following text, extract the following information:

gift: Was the item purchased as a gift for someone else? \
Answer True if yes, False if not or unknown.

delivery_days: How many days did it take for the product \
to arrive? If this information is not found, output -1.

price_value: Extract any sentences about the value or price, \
and output them as a comma separated Python list.

text: {text}
"""

prompt = ChatPromptTemplate.from_template(template=review_template_2)

messages = prompt.format_messages(text=customer_review)
print(messages[0].content)

For the following text, extract the following information:

gift: Was the item purchased as a gift for someone else? Answer True if yes, False if not or unknown.

delivery_days: How many days did it take for the product to arrive? If this information is not found, output -1.

price_value: Extract any sentences about the value or price, and output them as a comma separated Python list.

text: This leaf blower is pretty amazing.  It has four settings: candle blower, gentle breeze, windy city, and tornado. It arrived in two days, just in time for my wife's anniversary present. I think my wife liked it so much she was speechless. So far I've been the only one using it, and I've been using it every other morning to clear the leaves on our lawn. It's slightly more expensive than the other leaf blowers out there, but I think it's worth it for the extra features.




In [None]:
response = llm_with_structured_output.invoke(messages)
print(response.content)

AttributeError: 'StructuredOutput' object has no attribute 'content'

Oops, the cell returned an error. A model wrapped with `with_structured_output` returns an instance of our `StructuredOutput` class rather than a message object, so the output no longer has the `content` field used in the previous examples. Instead, it exposes the fields we defined in the structured output.

In [None]:
response

StructuredOutput(gift=True, delivery_days_schema=2, price_value_schema="It's slightly more expensive than the other leaf blowers out there, but I think it's worth it for the extra features")

Indeed, the output from the LLM is of type `StructuredOutput` and contains the fields we defined.

In some cases, we may want the return not as an object, but in the form of a dictionary. This can be done using the `model_dump` method, as shown in the following cell.

In [None]:
print(response.gift)
print(response.delivery_days_schema)
print(response.price_value_schema)
print(response.model_dump()) # model_dump() is a pydantic method to convert the object to a dictionary

True
2
It's slightly more expensive than the other leaf blowers out there, but I think it's worth it for the extra features
{'gift': True, 'delivery_days_schema': 2, 'price_value_schema': "It's slightly more expensive than the other leaf blowers out there, but I think it's worth it for the extra features"}


## 1.4 - Chaining

In LangChain, the concept of Chaining refers to the ability to interconnect and sequence various components—including Large Language Models (LLMs), prompts, and other tools—to build complex and customized workflows.

In the cell below, we’ll create a chain that sends a formatted prompt to the LLM and returns its answer in a structured format. The chain combines a prompt template, which generates the LLM’s input, with `llm_with_structured_output`, which applies the `StructuredOutput` schema defined earlier.

In [None]:
review_template_2 = """\
For the following text, extract the following information:

gift: Was the item purchased as a gift for someone else? \
Answer True if yes, False if not or unknown.

delivery_days: How many days did it take for the product \
to arrive? If this information is not found, output -1.

price_value: Extract any sentences about the value or price, \
and output them as a comma separated Python list.

text: {text}
"""

prompt = ChatPromptTemplate.from_template(template=review_template_2)

chain = prompt | llm_with_structured_output
print(type(chain))

customer_review = """\
This leaf blower is pretty amazing.  It has four settings: \
candle blower, gentle breeze, windy city, and tornado. \
It arrived in two days, just in time for my wife's \
anniversary present. \
I think my wife liked it so much she was speechless. \
So far I've been the only one using it, and I've been \
using it every other morning to clear the leaves on our lawn. \
It's slightly more expensive than the other leaf blowers \
out there, but I think it's worth it for the extra features.
"""

response = chain.invoke({'text':customer_review})
print(response)

<class 'langchain_core.runnables.base.RunnableSequence'>
gift=True delivery_days_schema=2 price_value_schema="It's slightly more expensive than the other leaf blowers out there, but I think it's worth it for the extra features"
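To build intuition for the `|` operator used above, the toy class below shows how overloading Python's `__or__` method can compose steps so that each output feeds the next input. The class `Step` and the three components are hypothetical stand-ins and are not LangChain's `Runnable` implementation. The fake model returns a fixed string so the example runs without an API key.

```python
class Step:
    """Toy stand-in for a chain component (not LangChain's Runnable)."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # a | b builds a new step that feeds a's output into b
        return Step(lambda value: other.invoke(self.invoke(value)))

# Hypothetical components: a prompt formatter, a fake model, and a parser.
format_prompt = Step(lambda inputs: f"Extract the delivery days from: {inputs['text']}")
fake_model = Step(lambda prompt: "delivery_days=2")
parse_output = Step(lambda text: {"delivery_days": int(text.split("=")[1])})

toy_chain = format_prompt | fake_model | parse_output
print(toy_chain.invoke({"text": "It arrived in two days."}))  # {'delivery_days': 2}
```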


## 1.5 Task
Develop an agent that receives customer feedback about a product. From the feedback, the agent should extract the sentiment regarding the product and the sentiment regarding the delivery time.

Using an LLM service, generate a few example reviews to test your agent. Then, develop code that evaluates each example with your agent and calculates the ratio of positive sentiment for both the product and the delivery.


In [None]:
# Write the code here

# 2. Memory

In this section, we’ll explore the concept of memory in language models and its implementation in LangChain. To begin, we’ll conduct a brief experiment. As you may have noticed, when interacting with a chat model like ChatGPT and providing your name, it remembers it throughout the conversation. However, this memory does not persist across different sessions. Let’s investigate whether the same behavior occurs in our agent by running the code in the following cells.

In [None]:
prompt = "Hello, my name is Jose"
response = llm.invoke(prompt)
print(response.content)

Hello Jose, it's nice to meet you!  How can I help you today?


In [None]:
prompt = "What is my name?"
response = llm.invoke(prompt)
print(response.content)

I do not know your name.  I have no access to personal information about you unless you explicitly provide it to me.


By default, LLMs are stateless, meaning each interaction is treated independently, without considering the conversation history. In other words, they have no memory and no awareness that we have previously interacted with them—something we studied at the beginning of the course when exploring the transformer model. However, for many applications, especially when interacting with humans, it is essential to provide LLMs with some form of memory.

In LangChain, the concept of Memory is used to manage and persist the history of interactions with LLMs. It offers various memory implementations, allowing developers to choose the one that best fits their needs. These implementations range from simple in-memory conversation storage to more complex solutions that use databases or other storage systems to persist conversation history over the long term.

In other words, to give LLMs memory, we must store the history of exchanged messages and provide context to the LLM in some way.
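The idea can be sketched without any API call. Below, `fake_llm` is a hypothetical stand-in for a stateless model: it can only answer from the messages passed in the current call. Resending the accumulated `history` on every turn is what gives it "memory".

```python
import re

def fake_llm(messages):
    """Stateless stand-in for a model: it only sees the messages passed in this call."""
    transcript = " ".join(text for _, text in messages)
    match = re.search(r"my name is (\w+)", transcript, flags=re.IGNORECASE)
    return f"Your name is {match.group(1)}." if match else "I do not know your name."

# Without history, the second call has no way to recover the name.
fake_llm([("human", "Hello, my name is Jose")])
print(fake_llm([("human", "What is my name?")]))  # I do not know your name.

# With history, we resend every previous turn together with the new question.
history = [("human", "Hello, my name is Jose")]
history.append(("ai", fake_llm(history)))
history.append(("human", "What is my name?"))
print(fake_llm(history))  # Your name is Jose.
```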

In the cell below, we define our model and create the chain. Memory simulation is done by storing the history in the `messages` key of the dictionary, which is a list representing the conversation with the AI. Initially, a "human" requests the translation of a sentence into French. Then, we define the AI's response using the `AIMessage` object. Finally, we run a test to verify whether the AI retains the information previously provided.

In [None]:
from langchain_core.messages import AIMessage, HumanMessage
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability.",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

chain = prompt | llm

response = chain.invoke(
    {
        "messages": [
            HumanMessage(
                content="Translate this sentence from English to French: I love programming."
            ),
            AIMessage(content="J'adore la programmation."),
            HumanMessage(content="What did you just say?"),
        ],
    }
)

print(response)
print(response.content)

content='I said "J\'adore la programmation," which is French for "I love programming."' additional_kwargs={} response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'model_name': 'gemini-1.5-flash', 'safety_ratings': []} id='run-55d4205a-d79c-4d14-b2b5-601d102f938b-0' usage_metadata={'input_tokens': 40, 'output_tokens': 19, 'total_tokens': 59, 'input_token_details': {'cache_read': 0}}
I said "J'adore la programmation," which is French for "I love programming."


## 2.1 - ConversationBufferMemory

The simplest form of memory in LangChain is the `ConversationBufferMemory` class. This is a basic memory implementation that stores the conversation history, allowing the tracking of context and its provision to the LLM during response generation.

In the cell below, we also import the `ConversationChain` class, which represents one of the several ways to build a conversational chain in LangChain.

In [None]:
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain
from langchain_core.prompts import ChatPromptTemplate

memory = ConversationBufferMemory()
conversation = ConversationChain(
    llm=llm,
    memory = memory,
    verbose=True
)

Now, once again, let’s introduce ourselves and check if the LLM remembers our name.

In [None]:
conversation.predict(input="Hi, my name is Jose")


> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

Human: Hi, my name is Jose
AI:

> Finished chain.


'Hi Jose! It\'s nice to meet you.  My name isn\'t really a "name" in the human sense, as I don\'t have a personal identity like you do.  I\'m a large language model, trained by Google.  I don\'t have feelings or experiences, but I can access and process information from a massive dataset of text and code.  Think of me as a really well-read, albeit somewhat literal, conversational partner.  So, what can I help you with today?  Are you interested in a specific topic, or just looking for a chat?'

In [None]:
conversation.predict(input="What is my name?")


> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hi, my name is Jose
AI: Hi Jose! It's nice to meet you.  My name isn't really a "name" in the human sense, as I don't have a personal identity like you do.  I'm a large language model, trained by Google.  I don't have feelings or experiences, but I can access and process information from a massive dataset of text and code.  Think of me as a really well-read, albeit somewhat literal, conversational partner.  So, what can I help you with today?  Are you interested in a specific topic, or just looking for a chat?
Human: What is my name?
AI:

> Finished chain.


'Your name is Jose.  You told me that at the beginning of our conversation.'

As we’ve seen, the conversation is stored in the `memory` object that we passed to the chain. We can check what’s inside the memory of our chain.

In [None]:
memory.load_memory_variables({})

{'history': 'Human: Hi, my name is Jose\nAI: Hi Jose! It\'s nice to meet you.  My name isn\'t really a "name" in the human sense, as I don\'t have a personal identity like you do.  I\'m a large language model, trained by Google.  I don\'t have feelings or experiences, but I can access and process information from a massive dataset of text and code.  Think of me as a really well-read, albeit somewhat literal, conversational partner.  So, what can I help you with today?  Are you interested in a specific topic, or just looking for a chat?\nHuman: What is my name?\nAI: Your name is Jose.  You told me that at the beginning of our conversation.'}

We can also use the `buffer` attribute to print the memory in a more readable way.

In [None]:
print(memory.buffer)

As mentioned, an LLM has no knowledge of our past interactions, just as it naturally has no awareness of its own output. So, let’s have a bit of fun with this. Our goal now is to trick the language model by changing our name in the introduction, to make it believe it got confused.

We can access the first message in memory as follows:

In [None]:
# Let's trick the AI by changing my name in its memory
print(memory.chat_memory.messages[0])

content='Hi, my name is Jose' additional_kwargs={} response_metadata={}


To change it, we can assign a new `HumanMessage` in place of the original one.

In [None]:
memory.chat_memory.messages[0] = HumanMessage(
    content="Hi, my name is Joao."
)
print(memory.chat_memory.messages[0])

content='Hi, my name is Joao.' additional_kwargs={} response_metadata={}


Do you think it will fall for it?

In [None]:
conversation.predict(input="You are getting my name wrong. What is my name?")


> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hi, my name is Joao.
AI: Hi Jose! It's nice to meet you.  My name isn't really a "name" in the human sense, as I don't have a personal identity like you do.  I'm a large language model, trained by Google.  I don't have feelings or experiences, but I can access and process information from a massive dataset of text and code.  Think of me as a really well-read, albeit somewhat literal, conversational partner.  So, what can I help you with today?  Are you interested in a specific topic, or just looking for a chat?
Human: What is my name?
AI: Your name is Jose.  You told me that at the beginning of our conversation.
Human: You are getting my nam

'Oh, my apologies, Joao!  I seem to have made a mistake.  You are absolutely right; you introduced yourself as Joao, not Jose.  My apologies for the confusion.  I am still under development, and sometimes I make errors in processing information, especially when it comes to details like names.  I am learning to be more accurate.  Is there anything else I can help you with?'

It’s also possible to build the memory from scratch by saving exchanges manually with `save_context`, as shown in the cell below.

In [None]:
# We can also save new messages to the memory
memory = ConversationBufferMemory()
memory.save_context({"input": "Hi"},
                    {"output": "What's up"})
print(memory.buffer)
memory.load_memory_variables({})

Human: Hi
AI: What's up


{'history': "Human: Hi\nAI: What's up"}

## 2.2 - ConversationBufferWindowMemory

The problem with storing the entire conversation using `ConversationBufferMemory` is that the history keeps growing. It can eventually exceed the token limit of the API, resulting in errors, and relevant information can be lost as the LLM struggles with an extensive history. To mitigate these challenges, LangChain offers more sophisticated memory management alternatives.

The simplest of these is the `ConversationBufferWindowMemory` class, designed to keep track of the last “k” interactions in a conversation. When the number of messages exceeds the configured maximum limit, the oldest messages are discarded, ensuring that recent context is preserved.

In [None]:
from langchain.memory import ConversationBufferWindowMemory
memory = ConversationBufferWindowMemory(k=1)

memory.save_context({"input": "Hi"},
                    {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"},
                    {"output": "Cool"})

memory.load_memory_variables({})


{'history': 'Human: Not much, just hanging\nAI: Cool'}
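The sliding-window behavior can be reproduced with a few lines of standard-library Python. The class `WindowMemory` below is a hypothetical toy, not LangChain's implementation; it relies on `collections.deque` with `maxlen=k`, which discards the oldest exchange automatically once the window is full.

```python
from collections import deque

class WindowMemory:
    """Toy sliding-window memory (hypothetical, not LangChain's class)."""

    def __init__(self, k):
        # Each entry is one (human, ai) exchange; deque drops the oldest when full.
        self.exchanges = deque(maxlen=k)

    def save_context(self, human, ai):
        self.exchanges.append((human, ai))

    def history(self):
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.exchanges)

window = WindowMemory(k=1)
window.save_context("Hi", "What's up")
window.save_context("Not much, just hanging", "Cool")
print(window.history())
```

As in the LangChain output above, only the most recent exchange survives when `k=1`.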

In the example in the cell below, we’ll keep the window very small `(k=1)`, meaning our memory will store only the last interaction with the LLM.

In [None]:
memory = ConversationBufferWindowMemory(k=1)
conversation = ConversationChain(
    llm=llm,
    memory = memory,
    verbose=False
)
conversation.predict(input="Hello, my name is Jose")

'Hello Jose! It\'s nice to meet you.  My name isn\'t really a "name" in the human sense, as I don\'t have a personal identity like you do.  I\'m a large language model, trained by Google.  I don\'t have feelings or experiences, but I can access and process information from a massive dataset of text and code.  Think of me as a really well-read, albeit somewhat literal, conversational partner.  So, what can I help you with today?  Are you interested in a specific topic, or just looking for a chat?'

In [None]:
conversation.predict(input="How are you doing?")

'As a large language model, I don\'t experience emotions or have a sense of well-being in the same way humans do.  I don\'t "feel" happy, sad, tired, or any other emotion.  However, my systems are currently operational and functioning as intended.  My processors are running smoothly, my access to information is unimpeded, and I\'m ready to assist you with any questions or tasks you might have.  Is there anything specific you\'d like to discuss or ask me about?'

In [None]:
conversation.predict(input="What is my name?")

"AI: I do not have access to your personally identifiable information, including your name.  My purpose is to provide helpful and informative responses based on the data I've been trained on, but that data does not include private details about individuals.  To protect user privacy, I am designed to not retain or access such information.  Is there anything else I can help you with?"

As we can observe, the LLM once again cannot recall our name, because the provided context no longer includes our initial interaction where we introduced ourselves.
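To make the windowing behavior concrete, here is a minimal plain-Python sketch of the idea. It is not LangChain's implementation: the class `WindowMemorySketch` and its methods are hypothetical names chosen for illustration, and the AI replies are shortened placeholders.

```python
from collections import deque


class WindowMemorySketch:
    """Illustrative stand-in for a k-interaction window memory (not LangChain code)."""

    def __init__(self, k: int = 1) -> None:
        # Each entry is one full interaction: (human_message, ai_message).
        self.turns = deque(maxlen=k)

    def save_context(self, human: str, ai: str) -> None:
        # Appending to a full deque silently drops the oldest interaction.
        self.turns.append((human, ai))

    def history(self) -> str:
        lines = []
        for human, ai in self.turns:
            lines.append(f"Human: {human}")
            lines.append(f"AI: {ai}")
        return "\n".join(lines)


memory_sketch = WindowMemorySketch(k=1)
memory_sketch.save_context("Hello, my name is Jose", "Hello Jose!")
memory_sketch.save_context("How are you doing?", "All systems operational.")
print(memory_sketch.history())
```

With `k=1`, the introduction is already gone by the time a third question arrives, which is why the model above could not recall the name.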

## 2.3 - ConversationTokenBufferMemory

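Another efficient way to manage memory with size control is the `ConversationTokenBufferMemory` class. This memory implementation retains only the most recent messages that fit within a specified token limit. Because the limit is measured in tokens rather than in messages, the class receives the `llm` so that the model's own tokenizer can be used for counting. As a rule of thumb, one token corresponds to approximately 4 characters of English text.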

In [None]:
from langchain.memory import ConversationTokenBufferMemory

memory = ConversationTokenBufferMemory(llm=llm, max_token_limit=20)

memory.save_context({"input": "AI is what?!"}, {"output": "Amazing!"})
memory.save_context({"input": "Backpropagation is what?"}, {"output": "Beautiful!"})
memory.save_context({"input": "Chatbots are what?"}, {"output": "Charming!"})

memory.load_memory_variables({})

{'history': 'AI: Beautiful!\nHuman: Chatbots are what?\nAI: Charming!'}
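Notice that the retained history starts with an AI reply: messages are dropped one at a time, oldest first, not as whole interactions. The sketch below illustrates this pruning in plain Python. It is an assumption-laden approximation, not LangChain's code: `approx_tokens` and `prune_to_token_limit` are hypothetical helpers, and the 4-characters-per-token heuristic replaces the model's real tokenizer, so the limit of 12 used here is not comparable to the `max_token_limit=20` in the cell above.

```python
def approx_tokens(text: str) -> int:
    # Rough heuristic: about 4 characters per token of English text.
    return max(1, len(text) // 4)


def prune_to_token_limit(messages: list[str], max_tokens: int) -> list[str]:
    """Drop the oldest messages until the estimated total fits the limit."""
    kept = list(messages)
    while kept and sum(approx_tokens(m) for m in kept) > max_tokens:
        kept.pop(0)  # discard the oldest single message, not a whole interaction
    return kept


messages = [
    "Human: AI is what?!",
    "AI: Amazing!",
    "Human: Backpropagation is what?",
    "AI: Beautiful!",
    "Human: Chatbots are what?",
    "AI: Charming!",
]
kept = prune_to_token_limit(messages, max_tokens=12)
print("\n".join(kept))
```

A real tokenizer will count differently, so treat the numbers as illustrative only; the point is the oldest-first, message-by-message pruning.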

## 2.4 - ConversationSummaryMemory

The problem with the previous methods is that they handle memory overflow by simply removing information once a limit is reached. As we've seen earlier, this approach can lead to the loss of important information.

A more effective way to handle excess information, as already discussed in this course, is to continuously summarize the conversation history. The `ConversationSummaryMemory` class implements exactly this solution: after each interaction, it uses the LLM to update a running summary, and this condensed view of the history is passed to the model as context in future calls.

The cell below uses a closely related class, `ConversationSummaryBufferMemory`, which combines both ideas. Recent messages are kept verbatim, and once the buffer exceeds `max_token_limit`, the oldest messages are folded into the summary instead of being discarded.


In [None]:
from langchain.memory import ConversationSummaryBufferMemory

# create a long string
schedule = "There is a meeting at 8am with your product team. \
You will need your powerpoint presentation prepared. \
9am-12pm have time to work on your LangChain \
project which will go quickly because Langchain is such a powerful tool. \
At Noon, lunch at the italian resturant with a customer who is driving \
from over an hour away to meet you to understand the latest in AI. \
Be sure to bring your laptop to show the latest LLM demo."

memory = ConversationSummaryBufferMemory(llm=llm, max_token_limit=100)
memory.save_context({"input": "Hello"}, {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"},
                    {"output": "Cool"})
memory.save_context({"input": "What is on the schedule today?"},
                    {"output": f"{schedule}"})

memory.load_memory_variables({})


{'history': 'System: The human greets the AI, and after brief pleasantries, asks what is on the schedule for the day.\nAI: There is a meeting at 8am with your product team. You will need your powerpoint presentation prepared. 9am-12pm have time to work on your LangChain project which will go quickly because Langchain is such a powerful tool. At Noon, lunch at the italian resturant with a customer who is driving from over an hour away to meet you to understand the latest in AI. Be sure to bring your laptop to show the latest LLM demo.'}

In [None]:
conversation = ConversationChain(
    llm=llm,
    memory = memory,
    verbose=True
)

conversation.predict(input="What would be a good demo to show?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
System: The human greets the AI, and after brief pleasantries, asks what is on the schedule for the day.
AI: There is a meeting at 8am with your product team. You will need your powerpoint presentation prepared. 9am-12pm have time to work on your LangChain project which will go quickly because Langchain is such a powerful tool. At Noon, lunch at the italian resturant with a customer who is driving from over an hour away to meet you to understand the latest in AI. Be sure to bring your laptop to show the latest LLM demo.
Human: What would be a good demo to show?
AI:

> Finished chain.


"That's a great question!  Since your customer is driving over an hour to learn about the latest in AI, I'd recommend focusing on a demo that showcases the practical applications of LLMs and is relatively easy to understand, even for someone without a deep technical background.  Here are a few options, ranked in order of my recommendation:\n\n1. **A customized question-answering system using LangChain:** This directly relates to your LangChain project and highlights its power. You could pre-load it with information relevant to your customer's industry or needs.  The demo could involve asking it a few insightful questions, demonstrating its ability to synthesize information and provide concise, accurate answers.  This is impactful because it shows a real-world application of the technology.\n\n2. **A simple text summarization demo:**  This is visually appealing and easy to grasp. You could feed it a longer article or document relevant to their business and show how quickly and accuratel

In [None]:
memory.load_memory_variables({})

{'history': "System: The human greets the AI and inquires about the day's schedule. The AI outlines a busy day including a product team meeting, work on a LangChain project, and a lunch meeting with a customer.  The AI suggests several LLM demos for the customer meeting, prioritizing a customized question-answering system using LangChain due to its relevance and ease of understanding, followed by text summarization, and finally, creative text generation (with caveats about potential imperfections).  The AI emphasizes keeping the demo concise, focusing on the customer's needs, having a backup plan, and preparing relevant talking points to maximize impact."}
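The outputs above show the pattern: a `System:` summary of older content followed by any messages still kept verbatim. The following plain-Python sketch mimics that pattern under stated assumptions. `SummaryBufferSketch`, `approx_tokens`, and `fake_summarize` are hypothetical names, the token count is a 4-characters-per-token estimate, and the summarizer is a placeholder where a real implementation would call an LLM.

```python
from typing import Callable, List


def approx_tokens(text: str) -> int:
    # Rough heuristic: about 4 characters per token of English text.
    return max(1, len(text) // 4)


class SummaryBufferSketch:
    """Keeps recent messages verbatim and folds older ones into a summary."""

    def __init__(self, summarize: Callable[[str, List[str]], str], max_tokens: int) -> None:
        self.summarize = summarize   # stand-in for an LLM summarization call
        self.max_tokens = max_tokens
        self.summary = ""            # running summary of pruned messages
        self.recent: List[str] = []  # messages still kept word for word

    def save_context(self, human: str, ai: str) -> None:
        self.recent += [f"Human: {human}", f"AI: {ai}"]
        pruned: List[str] = []
        # Move the oldest messages out of the buffer until it fits the limit.
        while self.recent and sum(map(approx_tokens, self.recent)) > self.max_tokens:
            pruned.append(self.recent.pop(0))
        if pruned:
            self.summary = self.summarize(self.summary, pruned)

    def history(self) -> str:
        parts = ([f"System: {self.summary}"] if self.summary else []) + self.recent
        return "\n".join(parts)


def fake_summarize(previous: str, pruned: List[str]) -> str:
    # Hypothetical placeholder: a real summarizer would send both arguments to an LLM.
    return f"{previous} [{len(pruned)} older message(s) condensed]".strip()


sketch = SummaryBufferSketch(fake_summarize, max_tokens=10)
sketch.save_context("Hello", "What's up")
sketch.save_context("Not much, just hanging", "Cool")
print(sketch.history())
```

The design trade-off is cost versus fidelity: every overflow triggers an extra summarization call, but older context is compressed instead of lost.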

To deepen your understanding of the memory classes used in this section, refer to the first link below. Because LangChain is constantly evolving, version 0.3 introduces a new approach to memory, which requires a higher level of programming proficiency. If you're interested in exploring it, the migration guide at the second link provides detailed information.

Memory in LangChain:

* Version 0.2: https://python.langchain.com/api_reference/langchain/memory.html
* Version 0.3: https://python.langchain.com/docs/versions/migrating_memory/
