# Short-Term Memory Management

This notebook demonstrates four short-term memory management techniques for LangChain agents: per-thread checkpointing, message trimming, message deletion, and message summarization.

## Key Concepts
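- **InMemorySaver**: A checkpointer that stores conversation history in process memory (the history is lost when the process exits)
- **Thread Management**: Keep conversations for different users separate by giving each one its own `thread_id`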
- **Message Trimming**: Keep only recent messages to manage context length
- **Message Deletion**: Remove old messages from history
- **Message Summarization**: Condense conversation history while preserving context

## Use Cases
- Multi-user chat applications
- Long conversations that exceed token limits
- Memory optimization for production systems
- Context-aware chatbots

In [1]:
from langchain_ollama import ChatOllama
from langchain.agents import create_agent, AgentState
from langgraph.checkpoint.memory import InMemorySaver
from langchain_core.messages import RemoveMessage, BaseMessage
from langchain_core.messages.utils import trim_messages, count_tokens_approximately

import tools

## Example 1: InMemorySaver for Multiple Users

This example demonstrates how to maintain separate conversation histories for different users using thread IDs.
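
To make the idea of thread isolation concrete, here is a minimal pure-Python sketch. It is not how `InMemorySaver` is implemented; the `histories` dictionary and the `record_turn` helper are hypothetical names used only to illustrate that each `thread_id` maps to its own message list.

```python
from collections import defaultdict

# Hypothetical stand-in for a checkpointer: one message list per thread_id.
histories: dict[str, list[dict[str, str]]] = defaultdict(list)


def record_turn(thread_id: str, user_text: str, reply_text: str) -> list[dict[str, str]]:
    """Append one user/assistant exchange to the history of a single thread."""
    history = histories[thread_id]
    history.append({"role": "human", "content": user_text})
    history.append({"role": "ai", "content": reply_text})
    return history


record_turn("kgptalkie", "Hi! My name is kgptalkie.", "Hello, kgptalkie!")
record_turn("laxmikant", "Hi! My name is laxmikant.", "Hello, laxmikant!")

# Each thread only ever sees its own messages.
for thread_id, history in histories.items():
    print(thread_id, [m["content"] for m in history])
```

Because lookups are keyed by `thread_id`, a question asked on one thread can never be answered from another thread's history.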

In [2]:
# Example 1: one InMemorySaver shared by two different users
checkpointer = InMemorySaver()

agent = create_agent(
    ChatOllama(model="qwen3"),
    tools=[],
    checkpointer=checkpointer,
)

print("✓ Agent created with InMemorySaver")
print("  - Maintains separate conversation histories per thread_id")
print("  - Perfect for multi-user applications")

✓ Agent created with InMemorySaver
  - Maintains separate conversation histories per thread_id
  - Perfect for multi-user applications


### Testing with User 1: kgptalkie

In [3]:
# User 1: kgptalkie
agent.invoke(
    {"messages": "Hi! My name is kgptalkie."},
    {"configurable": {"thread_id": "kgptalkie"}}
)

result = agent.invoke(
    {"messages": "What's my name?"},
    {"configurable": {"thread_id": "kgptalkie"}}
)
print("kgptalkie:", result["messages"][-1].content)

kgptalkie: <think>
Okay, the user asked, "What's my name?" Let me think. Earlier, they introduced themselves as kgptalkie. So their name is kgptalkie. I should confirm that. But wait, maybe they're testing if I remember. I need to make sure I'm not making a mistake. Let me check the conversation history again. Yes, the first message was "Hi! My name is kgptalkie." So their name is definitely kgptalkie. I should respond by stating their name and maybe offer further help. Keep it friendly and open-ended.
</think>

Your name is kgptalkie! 😊 How can I assist you today?


### Testing with User 2: laxmikant

In [4]:
# User 2: laxmikant
agent.invoke(
    {"messages": "Hi! My name is laxmikant."},
    {"configurable": {"thread_id": "laxmikant"}}
)

result = agent.invoke(
    {"messages": "What's my name?"},
    {"configurable": {"thread_id": "laxmikant"}}
)
print("laxmikant:", result["messages"][-1].content)

laxmikant: <think>
Okay, the user asked, "What's my name?" after introducing themselves as Laxmikant. Let me check the conversation history. The user first said, "Hi! My name is laxmikant." Then the assistant responded with a greeting and asked how they were doing. Now the user is asking for their name again.

Wait, maybe the user is testing if I remember their name from the previous message. The assistant should have already known their name from the initial introduction. So the correct response is to confirm their name as Laxmikant. But I need to make sure there's no confusion. The user might be checking if the assistant retained the information. The assistant should acknowledge their name and maybe offer further help. Let me structure the response to be friendly and confirm the name while inviting them to ask more questions.
</think>

Your name is **Laxmikant**! 😊 How can I assist you today?


## Example 2: Trim Messages

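When conversations get too long, we can trim older messages to stay within token limits while keeping recent context. The trade-off is that anything said only in the trimmed messages is no longer visible to the model: in the run below, the agent can no longer recall the user's name.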

### Key Parameters
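- **strategy**: `"last"` keeps the most recent messages and drops the oldest ones first
- **max_tokens**: Maximum number of tokens to keep, as measured by `token_counter` (384 in this example)
- **start_on**: The trimmed history must start on a message of this type; any messages before the first one of this type are dropped
- **end_on**: The trimmed history must end on a message of one of these types; any messages after the last one are dropped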
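
The sketch below illustrates the `"last"` strategy in plain Python, under simplifying assumptions: messages are plain dictionaries, `approx_tokens` is a hypothetical four-characters-per-token estimate, and `end_on` is left out. It is a rough model of the behaviour described above, not the implementation of `trim_messages`.

```python
def approx_tokens(message: dict[str, str]) -> int:
    """Rough token estimate: about four characters per token (an assumption)."""
    return max(1, len(message["content"]) // 4)


def trim_last(
    messages: list[dict[str, str]], max_tokens: int, start_on: str = "human"
) -> list[dict[str, str]]:
    """Keep the newest messages that fit in max_tokens, starting on `start_on`."""
    kept: list[dict[str, str]] = []
    budget = max_tokens
    # Walk backwards from the newest message, keeping whatever still fits.
    for message in reversed(messages):
        cost = approx_tokens(message)
        if cost > budget:
            break
        kept.append(message)
        budget -= cost
    kept.reverse()
    # Drop leading messages until the window starts on the requested type.
    while kept and kept[0]["role"] != start_on:
        kept.pop(0)
    return kept


history = [
    {"role": "human", "content": "hi, my name is bob"},
    {"role": "ai", "content": "Hello Bob! " + "x" * 400},
    {"role": "human", "content": "write a short poem about cats"},
    {"role": "ai", "content": "y" * 600},
    {"role": "human", "content": "what's my name?"},
]

window = trim_last(history, max_tokens=200)
print([m["content"][:30] for m in window])
```

With this budget the two oldest messages fall outside the window, so the introduction containing the name is no longer part of what the model would see.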

In [5]:
# Example 2: Trim messages
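def pre_model_hook(state) -> dict[str, list[BaseMessage]]:
    """Trim the history to roughly the last 384 tokens before each model call."""
    trimmed_messages = trim_messages(
        state["messages"],
        strategy="last",  # keep the newest messages
        token_counter=count_tokens_approximately,
        max_tokens=384,
        start_on="human",  # the trimmed window must begin with a human message
        end_on=("human", "tool"),  # and end with a human or tool message
    )
    # Returning under "llm_input_messages" changes only what the model sees;
    # the full history in state["messages"] is left untouched.
    return {"llm_input_messages": trimmed_messages}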


agent = create_agent(
    ChatOllama(model="qwen3"),
    tools=[],
    pre_model_hook=pre_model_hook,
    checkpointer=InMemorySaver(),
)

print("✓ Agent created with message trimming")
print("  - Automatically trims messages to stay within token limits")
print("  - Keeps most recent context")

✓ Agent created with message trimming
  - Automatically trims messages to stay within token limits
  - Keeps most recent context


In [6]:
agent.invoke(
    {"messages": "hi, my name is bob"},
    {"configurable": {"thread_id": "trim_example"}}
)
agent.invoke(
    {"messages": "write a short poem about cats"},
    {"configurable": {"thread_id": "trim_example"}}
)
agent.invoke(
    {"messages": "now do the same but for dogs"},
    {"configurable": {"thread_id": "trim_example"}}
)
result = agent.invoke(
    {"messages": "what's my name?"},
    {"configurable": {"thread_id": "trim_example"}}
)

print("Trim result:", result["messages"][-1].content)

Trim result: <think>
Okay, the user asked, "what's my name?" I need to figure out how to respond. First, I should check if there's any context or previous conversation where the user mentioned their name. Since this is the first interaction, there's no prior history. In that case, I can't know their name unless they tell me. I should politely ask them to provide their name so I can address them properly. It's important to be friendly and helpful. Maybe I can phrase it as a question to encourage them to share their name. Let me make sure the response is clear and not too robotic. Also, I should avoid making assumptions. Yeah, that's the way to go.
</think>

I don't have access to your name unless you provide it. Could you please share your name so I can address you properly? 😊


## Example 3: Delete Messages

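Instead of trimming, we can explicitly delete old messages from the stored history after each response using a post-model hook. Here the hook removes the two oldest messages whenever more than two are stored, which is why the message count in the runs below stays at 2.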
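
The following pure-Python sketch simulates that bookkeeping for three turns. The helpers `make_message`, `delete_oldest`, and `apply_removals` are hypothetical; they only model the idea of marking messages by id and dropping them, and are not the LangGraph reducer itself.

```python
from itertools import count

_next_id = count(1)


def make_message(role: str, content: str) -> dict:
    """Create a message dict with a unique id (hypothetical helper)."""
    return {"id": next(_next_id), "role": role, "content": content}


def delete_oldest(messages: list[dict]) -> set[int]:
    """Mirror the hook: once more than two messages exist, mark the first two."""
    if len(messages) > 2:
        return {m["id"] for m in messages[:2]}
    return set()


def apply_removals(messages: list[dict], removed_ids: set[int]) -> list[dict]:
    """Drop every message whose id was marked for removal."""
    return [m for m in messages if m["id"] not in removed_ids]


history: list[dict] = []
counts: list[int] = []
for question in ["hi! I'm alice", "what's my name?", "what's my name?"]:
    # One turn adds a human message and a (placeholder) model reply.
    history.append(make_message("human", question))
    history.append(make_message("ai", f"(reply to: {question})"))
    history = apply_removals(history, delete_oldest(history))
    counts.append(len(history))

print(counts)  # [2, 2, 2]
```

After every turn only the latest question and reply remain, so information survives across turns only if the most recent reply happens to repeat it.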

In [7]:
# Example 3: Delete messages
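def delete_messages(state):
    """Once more than two messages are stored, remove the two oldest ones."""
    messages = state["messages"]
    if len(messages) > 2:
        # RemoveMessage marks a stored message for deletion by its id.
        return {"messages": [RemoveMessage(id=m.id) for m in messages[:2]]}
    # Nothing to delete yet: return an empty state update.
    return {}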


agent = create_agent(
    ChatOllama(model="qwen3"),
    tools=[],
    post_model_hook=delete_messages,
    checkpointer=InMemorySaver(),
)

print("✓ Agent created with message deletion")
print("  - Automatically deletes old messages after each response")
print("  - Keeps only the most recent messages")

✓ Agent created with message deletion
  - Automatically deletes old messages after each response
  - Keeps only the most recent messages


In [8]:
agent.invoke(
    {"messages": "hi! I'm alice"},
    {"configurable": {"thread_id": "delete_example"}}
)
result = agent.invoke(
    {"messages": "what's my name?"},
    {"configurable": {"thread_id": "delete_example"}}
)

print("Delete result:", result["messages"][-1].content)
print("Message count:", len(result["messages"]))

Delete result: <think>
Okay, the user just asked, "what's my name?" Let me think about how to respond. Earlier, the user introduced themselves as Alice. So, the name in question is Alice.

I need to make sure I'm not confusing the user. The user might be testing if I remember their name from the previous conversation. Since I did acknowledge Alice as the user's name, I should confirm that. 

But wait, sometimes users might have multiple names or might be checking if I'm paying attention. However, in the conversation history, the user clearly stated their name as Alice. So the answer should be straightforward.

I should respond in a friendly manner, reinforcing that I remember their name. Maybe add an emoji to keep the tone light. Also, offer further assistance in case they have more questions. Let me make sure the response is clear and positive.
</think>

Your name is Alice! 😊 I remember that from our conversation earlier. How can I assist you today?
Message count: 2


In [9]:
result = agent.invoke(
    {"messages": "what's my name?"},
    {"configurable": {"thread_id": "delete_example"}}
)

print("Delete result:", result["messages"][-1].content)
print("Message count:", len(result["messages"]))

Delete result: <think>
Okay, the user is asking "what's my name?" again. Wait, in the previous message, I told them their name is Alice. But maybe they're testing if I remember or if there's a misunderstanding. Let me check the history.

In the first message, the user asked the same question, and I responded that their name is Alice. Now they're asking again. Could they be confused, or is there a mistake? Maybe they don't remember or want to confirm. I should reaffirm their name again, but perhaps also ask if there's something specific they need help with. It's possible they're trying to see if I can recall their name correctly. I should make sure to be clear and friendly, offering further assistance. Let me respond by confirming their name and asking how I can help.
</think>

Your name is Alice! 😊 I remember that from our conversation earlier. How can I assist you today?
Message count: 2


## Example 4: Summarize Messages

The most sophisticated approach: instead of deleting old messages, we summarize them to preserve important context while reducing token count.

### Benefits
- Preserves important conversation context
- Reduces token usage
- Maintains continuity in long conversations

### Key Parameters
- **max_tokens**: Token threshold that triggers summarization (384 in this example)
- **max_summary_tokens**: Maximum size of the summary, in tokens (128 in this example)
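
As a rough illustration of how these two limits interact, the sketch below folds the oldest messages into a running summary once the history exceeds a token budget. Everything here is an assumption made for illustration: `approx_tokens` is a four-characters-per-token estimate, `stub_summarize` is a hypothetical stand-in for the LLM call, and `compact` is not the algorithm used by `SummarizationNode`.

```python
def approx_tokens(text: str) -> int:
    """Rough token estimate: about four characters per token (an assumption)."""
    return max(1, len(text) // 4)


def stub_summarize(previous: str, messages: list[dict], max_summary_tokens: int) -> str:
    """Hypothetical stand-in for an LLM call: join the text and cut it to size."""
    parts = [previous] if previous else []
    parts += [f'{m["role"]}: {m["content"]}' for m in messages]
    return " | ".join(parts)[: max_summary_tokens * 4]


def compact(
    summary: str, messages: list[dict], max_tokens: int, max_summary_tokens: int
) -> tuple[str, list[dict]]:
    """Fold the oldest messages into the summary until the rest fits the budget."""
    total = sum(approx_tokens(m["content"]) for m in messages)
    folded: list[dict] = []
    remaining = list(messages)
    # Always keep at least the newest message verbatim.
    while len(remaining) > 1 and total > max_tokens:
        oldest = remaining.pop(0)
        folded.append(oldest)
        total -= approx_tokens(oldest["content"])
    if folded:
        summary = stub_summarize(summary, folded, max_summary_tokens)
    return summary, remaining


history = [
    {"role": "human", "content": "hi, my name is charlie"},
    {"role": "ai", "content": "Nice to meet you, Charlie! " + "z" * 400},
    {"role": "human", "content": "what's my name?"},
]

summary, recent = compact("", history, max_tokens=50, max_summary_tokens=32)
print(summary)
print([m["content"] for m in recent])
```

Unlike trimming or deletion, the introduction is still represented in the summary text, so a model given the summary plus the recent messages can still recover the name.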

In [None]:
# Example 4: Summarize messages
from langmem.short_term import SummarizationNode, RunningSummary


class State(AgentState):
    context: dict[str, RunningSummary]


model = ChatOllama(model="qwen3")

summarization_node = SummarizationNode(
    token_counter=count_tokens_approximately,
    model=model,
    max_tokens=384,
    max_summary_tokens=128,
    output_messages_key="messages",
)

agent = create_agent(
    model=model,
    tools=[],
    pre_model_hook=summarization_node,
    state_schema=State,
    checkpointer=InMemorySaver(),
)

print("✓ Agent created with message summarization")
print("  - Automatically summarizes old messages")
print("  - Preserves important context")
print("  - Reduces token usage")

✓ Agent created with message summarization
  - Automatically summarizes old messages
  - Preserves important context
  - Reduces token usage


In [17]:
agent.invoke(
    {"messages": "hi, my name is charlie"},
    {"configurable": {"thread_id": "summary_example"}}
)
agent.invoke(
    {"messages": "write a short poem about cats"},
    {"configurable": {"thread_id": "summary_example"}}
)
agent.invoke(
    {"messages": "now do the same but for dogs"},
    {"configurable": {"thread_id": "summary_example"}}
)
result = agent.invoke(
    {"messages": "what's my name?"},
    {"configurable": {"thread_id": "summary_example"}}
)

print("Summary result:", result["messages"][-1].content)
print("\n\nSummary:", result["context"]["running_summary"].summary)



Summary result: <think>
Okay, the user is asking, "what's my name?" Let me check the conversation history to see if there's any mention of their name. 

Looking back, the user initially introduced themselves as Charlie. Then they asked for a poem about cats, and the assistant provided one. Later, they asked for a similar poem about dogs, and the assistant did that too. Now, they're asking for their name. 

Wait, in the first message, the user said, "Okay, the user asked for a summary of the conversation above. Let me check the history." Then they started the conversation with Charlie. So, the user's name is Charlie. But maybe they're testing if the assistant remembers. 

The assistant should confirm that the user's name is Charlie based on the initial introduction. However, the user might be checking if the assistant is paying attention. The response should be straightforward, stating that the user's name is Charlie, as that's what they introduced themselves as. No need to overcomplica