# Short-Term Memory

## By default, an Agent has no memory of our conversation

In [1]:
from dotenv import load_dotenv

load_dotenv()

True

In [2]:
from langchain.agents import create_agent

system_prompt = "You are a helpful assistant."

agent = create_agent(
    model="gpt-4o-mini",
    system_prompt=system_prompt
)

from langchain.messages import HumanMessage

response = agent.invoke(
    {"messages": [HumanMessage(content="Hello, my name is Julio and I like vespas.")]}
)

print(response['messages'][-1].content)

Hi Julio! Vespas are great! They have a classic design and are fun to ride. Do you have a favorite model or are you looking to get one?


In [3]:
response = agent.invoke(
    {"messages": [HumanMessage(content="What is my name? What is my favorite scooter?")]}
)

print(response['messages'][-1].content)

I'm sorry, but I don't have access to personal data about you unless you share it with me during our conversation. I also cannot know your favorite scooter. If you'd like to tell me your name or your favorite scooter, I'd be happy to remember that for the duration of our chat!


#### If we use `pprint` to inspect the `response` variable, we see that there is no record of our previous question

In [4]:
from pprint import pprint

pprint(response)

{'messages': [HumanMessage(content='What is my name? What is my favorite scooter?', additional_kwargs={}, response_metadata={}, id='aba8b6d7-92c6-4d68-86ca-6f3814a1e33a'),
              AIMessage(content="I'm sorry, but I don't have access to personal data about you unless you share it with me during our conversation. I also cannot know your favorite scooter. If you'd like to tell me your name or your favorite scooter, I'd be happy to remember that for the duration of our chat!", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 57, 'prompt_tokens': 28, 'total_tokens': 85, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_provider': 'openai', 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_29330a9688', 'id': 'chatcmpl-CuA6EJT8PJSk417YFjC8s0ycRt14v', 'service_tier': 

## What is short-term memory (aka State)?
* Short-term memory is limited to the current conversation you are having with an LLM App.
* The short-term memory of an Agent in LangChain is currently referred to as State.

## How to add short-term memory to an Agent in LangChain 1.0
* To add short-term memory to an agent, we need to specify a `checkpointer` when creating the agent and associate the conversation with a conversation ID (aka thread ID).

In [5]:
from langgraph.checkpoint.memory import InMemorySaver

# This is where we add the short-term memory ability to our agent
agent2 = create_agent(
    "gpt-4o-mini",
    checkpointer=InMemorySaver(),  
)

from langchain.messages import HumanMessage

question = HumanMessage(content="Hello my name is Julio and I like vespas.")

# This is where we set the conversation ID
config = {"configurable": {"thread_id": "1"}}

# This is where we associate our conversation with the conversation ID
response = agent2.invoke(
    {"messages": [question]},
    config,  
)

print(response['messages'][-1].content)

Hi Julio! That's great to hear! Vespas are classic scooters with a lot of charm. Do you have a favorite model or a particular reason you like them?


In [6]:
question = HumanMessage(content="What is my name? What is my favorite scooter?")

response = agent2.invoke(
    {"messages": [question]},
    config,  
)

print(response['messages'][-1].content)

Your name is Julio, and you mentioned that you like Vespas, but you didn't specify a favorite model. Do you have a specific model in mind that you like the most?
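
To build some intuition for why the second call remembered our name, here is a toy model in plain Python. It is only a sketch of the idea (load the history stored for a thread ID, add the new turn, save it back) and not how LangGraph implements `InMemorySaver`. `ToySaver` and `toy_invoke` are hypothetical names, and the assistant reply is faked instead of calling an LLM.

```python
from collections import defaultdict


class ToySaver:
    """Toy stand-in for a checkpointer: keeps one message list per thread ID."""

    def __init__(self):
        self._threads = defaultdict(list)

    def load(self, thread_id):
        # Return a copy so callers cannot modify the stored history by accident
        return list(self._threads[thread_id])

    def save(self, thread_id, messages):
        self._threads[thread_id] = list(messages)


def toy_invoke(saver, thread_id, user_text):
    """Load the thread's history, append the new turn, and save it back."""
    history = saver.load(thread_id)
    history.append({"role": "user", "content": user_text})
    # A real agent would send `history` to the LLM here; we fake the reply
    history.append({"role": "assistant", "content": "(fake reply)"})
    saver.save(thread_id, history)
    return history


saver = ToySaver()
toy_invoke(saver, "1", "Hello, my name is Julio and I like vespas.")
same_thread = toy_invoke(saver, "1", "What is my name?")
other_thread = toy_invoke(saver, "2", "What is my name?")

print(len(same_thread))   # 4: both turns of thread "1" are available
print(len(other_thread))  # 2: thread "2" starts with no history

# A new saver (similar to restarting the app) has nothing stored
fresh_saver = ToySaver()
print(len(fresh_saver.load("1")))  # 0
```

In this sketch the same thread ID sees the earlier turn, a different thread ID starts empty, and a fresh saver has no history at all.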


#### Now, if we use `pprint` to inspect the `response` variable, we can see that the whole conversation is being recorded. In other words, our Agent remembers our conversation.
* Remember: this short-term memory only lasts while our app is running, because `InMemorySaver` keeps it in memory. If we close our app and open it again tomorrow, our app will have no memory of the conversation we had the previous day.

In [7]:
from pprint import pprint

pprint(response)

{'messages': [HumanMessage(content='Hello my name is Julio and I like vespas.', additional_kwargs={}, response_metadata={}, id='5af36357-28b4-4c8e-ae06-00784bcaa8e4'),
              AIMessage(content="Hi Julio! That's great to hear! Vespas are classic scooters with a lot of charm. Do you have a favorite model or a particular reason you like them?", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 34, 'prompt_tokens': 19, 'total_tokens': 53, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_provider': 'openai', 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_29330a9688', 'id': 'chatcmpl-CuA6FC58FWdnF0X9KGAGR58RVB3sd', 'service_tier': 'default', 'finish_reason': 'stop', 'logprobs': None}, id='lc_run--019b8748-65d9-72b0-9136-26a1da8a6402-0', usage_metadata={'input_tok

#### Note: InMemorySaver vs. MemorySaver
* Note that the checkpointer we used in the previous exercise was `InMemorySaver`. As you can see in the import statement, this is functionality we take from LangGraph.
* Both `MemorySaver` and `InMemorySaver` exist and work in LangGraph. Documentation and community examples use both names interchangeably. The LangChain 1.0 documentation uses `InMemorySaver`, as we do here.

#### Instead of `InMemorySaver`, in production we will use a checkpointer backed by a database
* We will have to install a package such as the Postgres checkpointer package:
`pip install langgraph-checkpoint-postgres`

* This way we can use a checkpointer backed by a database:

```python
from langchain.agents import create_agent

from langgraph.checkpoint.postgres import PostgresSaver  


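# Connection string for a local PostgreSQL instance; adjust it to your setup
DB_URI = "postgresql://postgres:postgres@localhost:5442/postgres?sslmode=disable"
with PostgresSaver.from_conn_string(DB_URI) as checkpointer:
    checkpointer.setup()  # creates the checkpoint tables in PostgreSQL
    agent = create_agent(
        "gpt-5",
        tools=[get_user_info],  # get_user_info is a tool assumed to be defined elsewhere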
        checkpointer=checkpointer,  
    )
```

* Remember that this is still short-term memory.

## Customizing the Short-Term Memory Format
* LangChain agents use `AgentState` to manage short-term memory.
* By default, `AgentState` saves the conversation history in the `messages` field.
* If we need to, we can add additional fields to `AgentState`. See how we do it below:

In [8]:
from langgraph.checkpoint.memory import InMemorySaver
from langchain.agents import AgentState

# This is where we set the format (aka schema) of our custom short-term memory (aka State)
# As of LangChain 1.0, custom state schemas must be TypedDict types. 
# Pydantic models and dataclasses are no longer supported.
class CustomAgentState(AgentState):  
    user_id: str
    user_preferences: dict

# This is where we add the short-term memory ability to our agent
agent3 = create_agent(
    "gpt-4o-mini",
    state_schema=CustomAgentState, 
    checkpointer=InMemorySaver(),  
)

response = agent3.invoke(
    {
        "messages": [{
            "role": "user", 
            "content": "My favorite city is San Francisco."}],
        "user_id": "user_123",  # This is just for demo, we are not using this
        "user_preferences": {"converation_style": "Casual"}  # This is just for demo, we are not using this
    },
    {"configurable": {"thread_id": "1"}})


print(response['messages'][-1].content)

That's great! San Francisco is a vibrant city known for its iconic landmarks like the Golden Gate Bridge, Alcatraz Island, and its picturesque neighborhoods such as Chinatown and Haight-Ashbury. The city's diverse culture, culinary scene, and beautiful views make it a popular destination. What do you love most about San Francisco?


In [9]:
from pprint import pprint

pprint(response)

{'messages': [HumanMessage(content='My favorite city is San Francisco.', additional_kwargs={}, response_metadata={}, id='8bdabe5a-701b-4351-84ef-ae540edbbc2e'),
              AIMessage(content="That's great! San Francisco is a vibrant city known for its iconic landmarks like the Golden Gate Bridge, Alcatraz Island, and its picturesque neighborhoods such as Chinatown and Haight-Ashbury. The city's diverse culture, culinary scene, and beautiful views make it a popular destination. What do you love most about San Francisco?", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 65, 'prompt_tokens': 14, 'total_tokens': 79, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_provider': 'openai', 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_29330a9688', 'id': 'chatcmpl-CuA6

In [10]:
response = agent3.invoke(
    {
        "messages": [{
            "role": "user", 
            "content": "Do you know my favorite city? And my preferred conversation style?"}],
        "user_id": "user_123",  
        "user_preferences": {"converation_style": "Casual"}  
    },
    {"configurable": {"thread_id": "1"}})


print(response['messages'][-1].content)

Yes, you mentioned that your favorite city is San Francisco! As for your preferred conversation style, I don't have specific information about that yet. If you let me know what style you prefer—formal, casual, concise, detailed, etc.—I'd be happy to adjust my responses accordingly!


#### Using `pprint`, see how the short-term memory is being processed
* As you can see, the conversation memory and the user data are processed separately. That is why, when we ask the Agent about our preferred conversation style, it answers that it does not have this data in the conversation memory.
* This data is indeed in the short-term memory, but to access it we need a different approach, as you will see below.

In [11]:
from pprint import pprint

pprint(response)

{'messages': [HumanMessage(content='My favorite city is San Francisco.', additional_kwargs={}, response_metadata={}, id='8bdabe5a-701b-4351-84ef-ae540edbbc2e'),
              AIMessage(content="That's great! San Francisco is a vibrant city known for its iconic landmarks like the Golden Gate Bridge, Alcatraz Island, and its picturesque neighborhoods such as Chinatown and Haight-Ashbury. The city's diverse culture, culinary scene, and beautiful views make it a popular destination. What do you love most about San Francisco?", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 65, 'prompt_tokens': 14, 'total_tokens': 79, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_provider': 'openai', 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_29330a9688', 'id': 'chatcmpl-CuA6

#### See below how we access the user preferences data stored in the short-term memory of our agent

In [12]:
print(response['user_preferences'])

{'converation_style': 'Casual'}
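
The behavior above can be pictured with a small plain-Python sketch. It assumes, as a simplification, that only the `messages` field is sent to the model, while the extra fields stay in the state for our own code to read. `ToyAgentState`, `ToyCustomState` and `model_input` are hypothetical names used for illustration, not LangChain APIs.

```python
from typing import TypedDict


class ToyAgentState(TypedDict):
    """Toy stand-in for AgentState: only the conversation history."""
    messages: list


class ToyCustomState(ToyAgentState):
    """Extends the base schema with extra fields, like CustomAgentState."""
    user_id: str
    user_preferences: dict


def model_input(state: ToyCustomState) -> list:
    """Return what the model would see in this toy: the messages only."""
    return list(state["messages"])


state: ToyCustomState = {
    "messages": [{"role": "user", "content": "My favorite city is San Francisco."}],
    "user_id": "user_123",
    "user_preferences": {"conversation_style": "Casual"},
}

seen_by_model = model_input(state)
print(seen_by_model)              # only the user message, no preferences
print(state["user_preferences"])  # the extra data is still readable from the state
```

Under this assumption, the model cannot answer questions about `user_preferences`, yet our code can still read that field directly from the returned state.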


## What if the conversation is longer than the context window?
* In production apps, long conversations can exceed the LLM’s context window. Common solutions to this problem are:
    * Summarize messages.
    * Trim messages.
    * Delete messages.
    * Other custom strategies.

* We will see how to implement some of these solutions in future lessons.
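
As a rough preview of the trimming strategy listed above, here is a minimal sketch in plain Python. It counts messages instead of tokens to keep things simple, and `keep_recent_messages` is a hypothetical helper written for this example, not a LangChain function.

```python
def keep_recent_messages(messages, max_messages):
    """Keep a leading system message (if any) plus the most recent messages."""
    if max_messages < 1:
        raise ValueError("max_messages must be at least 1")
    # Preserve the system message so the agent keeps its instructions
    system = [m for m in messages[:1] if m["role"] == "system"]
    rest = messages[len(system):]
    budget = max_messages - len(system)
    # Guard against budget == 0, because rest[-0:] would return everything
    recent = rest[-budget:] if budget > 0 else []
    return system + recent


history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello, my name is Julio."},
    {"role": "assistant", "content": "Hi Julio!"},
    {"role": "user", "content": "I like vespas."},
    {"role": "assistant", "content": "Vespas are great!"},
]

trimmed = keep_recent_messages(history, max_messages=3)
for message in trimmed:
    print(message["role"], "->", message["content"])
```

With a limit of three messages, the sketch keeps the system message and the last user/assistant pair. The trade-off is visible here too: the message where the user gave their name is dropped, which is why summarizing is sometimes preferred over trimming.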

## Long-Term Memory
* For long-term memory (aka Store) we will use a different approach, also inspired by LangGraph. As the LangChain Team states in the LangChain documentation, this is an advanced topic that requires knowledge of LangGraph. You will therefore be much better equipped to master it after you complete our "2026 Bootcamp: Understand and Build Professional AI Agents".

## How to run this code from Visual Studio Code
* Open Terminal.
* Make sure you are in the project folder.
* Make sure you have the poetry env activated.
* Enter and run the following command:
    * `python 007-short-term-memory.py` 