# LangChain Memory Patterns with Amazon Nova

This notebook demonstrates modern memory patterns for maintaining conversation context with Nova models.

**For current recommendations, see:**
- [Short-term memory](https://python.langchain.com/docs/how_to/chatbots_memory/)
- [Long-term memory](https://python.langchain.com/docs/how_to/chatbots_long_term_memory/)

## Setup

Make sure you have set your environment variables:
```bash
export NOVA_API_KEY=your-api-key
export NOVA_BASE_URL="https://api.nova.amazon.com/v1"
```

In [None]:
%env NOVA_API_KEY=<YOUR-API-KEY>
%env NOVA_BASE_URL=https://api.nova.amazon.com/v1

In [2]:
from langchain_amazon_nova import ChatAmazonNova
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage, trim_messages
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_core.chat_history import InMemoryChatMessageHistory

# Initialize the model
llm = ChatAmazonNova(
    model="nova-pro-v1",
    temperature=0.7,
    max_tokens=500,
)

## 1. Manual Message History

The simplest approach: manually manage a list of messages.

In [3]:
# Start with system message
messages = [
    SystemMessage(content="You are a helpful assistant that remembers context.")
]

# Turn 1: User introduces themselves
messages.append(HumanMessage(content="Hi! My name is Alice and I'm learning Python."))
response1 = llm.invoke(messages)
messages.append(AIMessage(content=response1.content))

print("Turn 1:")
print("User: Hi! My name is Alice and I'm learning Python.")
print(f"Assistant: {response1.content}\n")

Turn 1:
User: Hi! My name is Alice and I'm learning Python.
Assistant: Hello, Alice! It's great to meet you, and I'm glad to hear you're learning Python. It's a very versatile and widely-used programming language. What specific areas of Python are you focusing on, or do you have any particular projects or questions in mind? 

If you need help with any concepts, exercises, or projects, feel free to ask!



In [4]:
# Turn 2: Ask about their name (tests memory)
messages.append(HumanMessage(content="What's my name and what am I learning?"))
response2 = llm.invoke(messages)
messages.append(AIMessage(content=response2.content))

print("Turn 2:")
print("User: What's my name and what am I learning?")
print(f"Assistant: {response2.content}\n")

Turn 2:
User: What's my name and what am I learning?
Assistant: Hello again, Alice! Your name is Alice, and you're learning Python. If you have any questions or need further assistance with your Python studies, just let me know!



In [5]:
# Check message history
print(f"Total messages in history: {len(messages)}")
for i, msg in enumerate(messages):
    print(f"{i + 1}. {msg.type}: {msg.content[:60]}...")

Total messages in history: 5
1. system: You are a helpful assistant that remembers context....
2. human: Hi! My name is Alice and I'm learning Python....
3. ai: Hello, Alice! It's great to meet you, and I'm glad to hear y...
4. human: What's my name and what am I learning?...
5. ai: Hello again, Alice! Your name is Alice, and you're learning ...


## 2. RunnableWithMessageHistory (Recommended)

`RunnableWithMessageHistory` wraps a chain and automatically loads and saves each session's messages, keyed by `session_id`.

In [6]:
# Create session store
store = {}

def get_session_history(session_id: str):
    if session_id not in store:
        store[session_id] = InMemoryChatMessageHistory()
    return store[session_id]

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant."),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

chain = prompt | llm
with_message_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="messages",
)

In [7]:
# First conversation
response = with_message_history.invoke(
    {"messages": [HumanMessage(content="Hi, I'm Bob from Seattle")]},
    config={"configurable": {"session_id": "user_123"}},
)

print("User: Hi, I'm Bob from Seattle")
print(f"Assistant: {response.content}\n")

User: Hi, I'm Bob from Seattle
Assistant: Hello, Bob! It's nice to meet you. Seattle is a great city with a lot to offer, from its beautiful landscapes and proximity to the mountains and water to its vibrant culture and tech scene. Is there something specific you’re interested in or need help with today?



In [8]:
# Second conversation - remembers context
response = with_message_history.invoke(
    {"messages": [HumanMessage(content="What's my name and where am I from?")]},
    config={"configurable": {"session_id": "user_123"}},
)

print("User: What's my name and where am I from?")
print(f"Assistant: {response.content}\n")

# Check session history
history = get_session_history("user_123")
print(f"Session has {len(history.messages)} messages")

User: What's my name and where am I from?
Assistant: You mentioned that your name is Bob and you're from Seattle. If you have any more questions or need further information, feel free to ask!

Session has 4 messages
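
Because histories are looked up by `session_id`, a second session starts with an empty history and never sees the first one's messages. The following is a minimal pure-Python sketch of that isolation, not the LangChain implementation: `record_turn`, the `(role, content)` tuples, and the `user_456` session are hypothetical names used only for illustration.

```python
from collections import defaultdict

# Hypothetical stand-in for the notebook's `store`: one message list per session.
store: dict[str, list[tuple[str, str]]] = defaultdict(list)


def record_turn(session_id: str, user_text: str, reply_text: str) -> None:
    """Append one human/ai exchange to the given session's history."""
    store[session_id].append(("human", user_text))
    store[session_id].append(("ai", reply_text))


# Two turns in one session, one turn in another.
record_turn("user_123", "Hi, I'm Bob from Seattle", "Hello, Bob!")
record_turn("user_123", "What's my name?", "Your name is Bob.")
record_turn("user_456", "Hi, I'm Carol", "Hello, Carol!")

for session_id, history in store.items():
    print(f"{session_id}: {len(history)} messages")
```

Each session only grows when its own `session_id` is used, which is why the count for `user_123` above reflects only Bob's two exchanges.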


## 3. Message Trimming (Context Window Management)

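Use `trim_messages` to manage the context window and stay within token limits. The demo below passes `token_counter=len`, which counts each message as one unit, so `max_tokens=5` keeps at most five messages rather than five model tokens.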

In [9]:
messages = [SystemMessage(content="You are a math tutor. Be concise.")]

# Simulate multiple questions
questions = [
    "What is 15 + 27?",
    "What is 8 * 9?",
    "What is 100 / 4?",
    "What is 12 squared?",
    "What is the square root of 144?",
    "What is 50 - 18?",
]

for i, question in enumerate(questions, 1):
    messages.append(HumanMessage(content=question))
    response = llm.invoke(messages)
    messages.append(AIMessage(content=response.content))
    print(f"Q{i}: {question} -> A{i}: {response.content}")

print(f"\nTotal messages: {len(messages)}")

Q1: What is 15 + 27? -> A1: 15 + 27 = 42
Q2: What is 8 * 9? -> A2: 8 * 9 = 72
Q3: What is 100 / 4? -> A3: 100 / 4 = 25
Q4: What is 12 squared? -> A4: 12 squared = 12 * 12 = 144
Q5: What is the square root of 144? -> A5: The square root of 144 is 12.
Q6: What is 50 - 18? -> A6: 50 - 18 = 32

Total messages: 13


In [10]:
# Trim to keep recent conversation
trimmed_messages = trim_messages(
    messages,
    max_tokens=5,  # With token_counter=len, this is a message budget
    strategy="last",
    token_counter=len,  # Counts messages, not tokens (demo only)
    include_system=True,
    allow_partial=False,
)

print(f"Original: {len(messages)} messages")
print(f"Trimmed: {len(trimmed_messages)} messages")
print("\nTrimmed messages:")
for msg in trimmed_messages:
    print(f"  {msg.type}: {msg.content[:50]}...")

Original: 13 messages
Trimmed: 5 messages

Trimmed messages:
  system: You are a math tutor. Be concise....
  human: What is the square root of 144?...
  ai: The square root of 144 is 12....
  human: What is 50 - 18?...
  ai: 50 - 18 = 32...
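
The demo above budgets by message count. A token-based budget works the same way but sums an estimated cost per message. The sketch below is a simplified pure-Python illustration of a "keep the last messages that fit" strategy, not LangChain's implementation: `approx_tokens`, `trim_last`, the `(role, content)` tuples, and the four-characters-per-token heuristic are all assumptions made for illustration.

```python
def approx_tokens(text: str) -> int:
    """Rough heuristic (assumption): about 4 characters per token."""
    return max(1, len(text) // 4)


def trim_last(messages, max_tokens):
    """Keep the leading system message plus the newest messages that fit.

    `messages` is a list of (role, content) tuples.
    """
    # Always keep a leading system message and charge it against the budget.
    system = [m for m in messages[:1] if m[0] == "system"]
    rest = messages[len(system):]
    budget = max_tokens - sum(approx_tokens(content) for _, content in system)

    # Walk backwards from the newest message until the budget runs out.
    kept = []
    for role, content in reversed(rest):
        cost = approx_tokens(content)
        if cost > budget:
            break
        kept.append((role, content))
        budget -= cost

    kept.reverse()  # Restore chronological order.
    return system + kept


history = [
    ("system", "You are a math tutor. Be concise."),
    ("human", "What is 15 + 27?"),
    ("ai", "15 + 27 = 42"),
    ("human", "What is 50 - 18?"),
    ("ai", "50 - 18 = 32"),
]

trimmed = trim_last(history, max_tokens=16)
for role, content in trimmed:
    print(f"{role}: {content}")
```

This sketch does not guarantee that the trimmed history starts on a human message; a production version would likely want that check, along with a tokenizer-based counter instead of the character heuristic.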


## 4. Conversation Summarization (Long-term Memory)

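For long conversations, periodically summarize older messages to save tokens. The savings only appear once the history is much longer than its summary: for the short demo conversation below, the summary is no shorter than the original.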

In [11]:
# Simulate a longer conversation
conversation_messages = []

conversation_pairs = [
    ("What are quantum computers?", "Quantum computers use quantum bits or qubits that can be in superposition..."),
    ("How do they differ from classical computers?", "Classical computers use binary bits while quantum computers leverage quantum mechanics..."),
    ("What are practical applications?", "Applications include cryptography, drug discovery, optimization problems..."),
    ("Are they commercially available?", "Yes, companies like IBM and Google offer cloud access to quantum computers..."),
]

for user_msg, ai_msg in conversation_pairs:
    conversation_messages.append(HumanMessage(content=user_msg))
    conversation_messages.append(AIMessage(content=ai_msg))

print(f"Conversation has {len(conversation_messages)} messages")

Conversation has 8 messages


In [12]:
# Create a summary
summary_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant that summarizes conversations."),
    ("user", "Please provide a concise summary of the following conversation:\n\n{conversation}"),
])

# Format conversation as text
conversation_text = "\n".join([
    f"{msg.type.upper()}: {msg.content}" 
    for msg in conversation_messages
])

summary_chain = summary_prompt | llm
summary_response = summary_chain.invoke({"conversation": conversation_text})

print("\nConversation Summary:")
print(summary_response.content)
print(f"\nOriginal: ~{sum(len(msg.content) for msg in conversation_messages)} characters")
print(f"Summary: ~{len(summary_response.content)} characters")


Conversation Summary:
In the conversation, the HUMAN inquired about quantum computers, learning that they use qubits in superposition, unlike classical computers which use binary bits. The HUMAN then asked about practical applications, to which the AI responded with examples like cryptography, drug discovery, and optimization problems. Finally, the HUMAN asked if quantum computers are commercially available, and the AI confirmed that companies such as IBM and Google offer cloud access to them.

Original: ~452 characters
Summary: ~476 characters
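
To use a summary as memory, one option is to replace the older messages with a single summary message and keep the most recent ones verbatim. The sketch below shows that bookkeeping only, under stated assumptions: `compact_history` and `summarize` are hypothetical helpers, messages are plain `(role, content)` tuples, and `summarize` is a placeholder where a real version would call the model (for example, through a chain like `summary_chain`).

```python
def summarize(messages):
    """Placeholder summarizer (assumption): a real version would call the LLM."""
    topics = "; ".join(content for role, content in messages if role == "human")
    return f"Earlier, the user asked: {topics}"


def compact_history(messages, keep_recent=2):
    """Replace all but the last `keep_recent` messages with one summary message."""
    if len(messages) <= keep_recent:
        return list(messages)  # Nothing old enough to summarize.

    split = len(messages) - keep_recent
    older, recent = messages[:split], messages[split:]
    return [("system", summarize(older))] + recent


history = [
    ("human", "What are quantum computers?"),
    ("ai", "Quantum computers use qubits that can be in superposition..."),
    ("human", "How do they differ from classical computers?"),
    ("ai", "Classical computers use binary bits..."),
    ("human", "What are practical applications?"),
    ("ai", "Applications include cryptography, drug discovery..."),
    ("human", "Are they commercially available?"),
    ("ai", "Yes, companies like IBM and Google offer cloud access..."),
]

compacted = compact_history(history, keep_recent=2)
print(f"{len(history)} messages -> {len(compacted)} messages")
for role, content in compacted:
    print(f"  {role}: {content}")
```

Keeping the latest exchange verbatim preserves exact wording where it matters most, while the summary carries only the gist of earlier turns.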


## 5. Sliding Window Pattern

Keep only the N most recent exchanges for predictable memory usage.

In [13]:
class SlidingWindowMemory:
    """Memory that keeps only the N most recent exchanges."""

    def __init__(self, max_exchanges=3):
        self.max_exchanges = max_exchanges
        self.messages = []

    def add_exchange(self, human_msg: str, ai_msg: str):
        self.messages.append(HumanMessage(content=human_msg))
        self.messages.append(AIMessage(content=ai_msg))

        # Keep only last N exchanges (2 messages per exchange)
        max_msgs = self.max_exchanges * 2
        if len(self.messages) > max_msgs:
            self.messages = self.messages[-max_msgs:]

    def get_messages(self):
        return self.messages

    def clear(self):
        self.messages = []


# Test it
sliding_memory = SlidingWindowMemory(max_exchanges=3)

for i in range(5):
    sliding_memory.add_exchange(
        f"This is question number {i + 1}", f"This is answer number {i + 1}"
    )

print(
    f"After 5 exchanges, memory contains {len(sliding_memory.get_messages())} messages:"
)
for msg in sliding_memory.get_messages():
    print(f"  {msg.type}: {msg.content}")

After 5 exchanges, memory contains 6 messages:
  human: This is question number 3
  ai: This is answer number 3
  human: This is question number 4
  ai: This is answer number 4
  human: This is question number 5
  ai: This is answer number 5
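
The same window can also be expressed with `collections.deque`, whose `maxlen` argument discards the oldest items automatically. This is an alternative sketch rather than a replacement for the class above: `DequeWindowMemory` is a hypothetical name, and it stores plain `(role, content)` tuples instead of LangChain message objects so that it runs with the standard library alone.

```python
from collections import deque


class DequeWindowMemory:
    """Sliding window backed by a bounded deque (2 messages per exchange)."""

    def __init__(self, max_exchanges=3):
        # maxlen makes the deque drop the oldest entries on overflow.
        self.messages = deque(maxlen=max_exchanges * 2)

    def add_exchange(self, human_msg: str, ai_msg: str) -> None:
        self.messages.append(("human", human_msg))
        self.messages.append(("ai", ai_msg))

    def get_messages(self):
        return list(self.messages)


window = DequeWindowMemory(max_exchanges=3)
for i in range(5):
    window.add_exchange(
        f"This is question number {i + 1}", f"This is answer number {i + 1}"
    )

print(f"Memory contains {len(window.get_messages())} messages")
```

Because messages are always appended in pairs and `maxlen` is even, the window keeps whole exchanges and still starts on a human message.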


## Summary

**Memory Pattern Comparison:**

| Pattern | Best For | Pros | Cons |
|---------|----------|------|------|
| Manual Message History | Simple use cases | Full control, straightforward | Manual management |
| RunnableWithMessageHistory | Session-based apps | Automatic session management | Needs a storage backend for persistence |
| Message Trimming | Token management | Predictable usage | May lose context |
| Summarization | Long conversations | Bounded memory | Extra LLM calls, loses detail |
| Sliding Window | Production chatbots | Consistent performance | Limited history |

**Recommended Approaches:**
- **Short conversations**: Manual history or RunnableWithMessageHistory
- **Long sessions**: Combine trimming + summarization
- **Production**: RunnableWithMessageHistory with persistent storage
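
As a rough illustration of what "persistent storage" could mean for session histories, the sketch below keeps messages in SQLite, keyed by `session_id`. It is a standalone example under stated assumptions, not a LangChain class: `SQLiteSessionStore`, its table layout, and its method names are hypothetical, and plugging it into `RunnableWithMessageHistory` would additionally require wrapping it in a chat message history class.

```python
import sqlite3


class SQLiteSessionStore:
    """Hypothetical persistent store: one row per message, keyed by session_id."""

    def __init__(self, path=":memory:"):
        # Pass a file path instead of ":memory:" to survive restarts.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "session_id TEXT NOT NULL, "
            "role TEXT NOT NULL, "
            "content TEXT NOT NULL)"
        )

    def add_message(self, session_id: str, role: str, content: str) -> None:
        # The connection context manager commits on success.
        with self.conn:
            self.conn.execute(
                "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
                (session_id, role, content),
            )

    def get_messages(self, session_id: str):
        """Return (role, content) tuples for one session, oldest first."""
        rows = self.conn.execute(
            "SELECT role, content FROM messages WHERE session_id = ? ORDER BY id",
            (session_id,),
        )
        return rows.fetchall()


db = SQLiteSessionStore()
db.add_message("user_123", "human", "Hi, I'm Bob from Seattle")
db.add_message("user_123", "ai", "Hello, Bob!")
db.add_message("user_456", "human", "Hi, I'm Carol")

print(db.get_messages("user_123"))
```

Ordering by the autoincrement `id` preserves insertion order within a session without relying on timestamps.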

For more details, see the [LangChain memory documentation](https://python.langchain.com/docs/how_to/chatbots_memory/).