# 💭 03: Conversation History

Learn how to maintain context across multiple messages to build natural, coherent conversations with your local LLM.

## 📋 Learning Objectives

By the end of this notebook, you will be able to:

- [ ] Build multi-turn conversations with context
- [ ] Use `chat_with_history()` to maintain conversation state
- [ ] Understand the role of message history in LLM responses
- [ ] Recognize when context matters (and when it doesn't)
- [ ] Manage conversation history efficiently
- [ ] Build a complete Q&A session with memory

## 🎯 Prerequisites

- Completed notebook 02 (Basic Chat)
- Understanding of system prompts and temperature
- LM Studio running with a loaded model

## ⏱️ Estimated Time: 10 minutes

## 1️⃣ The Problem: No Memory in Simple Chat

Each `chat()` call is independent: the LLM doesn't remember previous messages.

In [1]:
from local_llm_sdk import LocalLLMClient
from dotenv import load_dotenv
import os

load_dotenv()

client = LocalLLMClient(
    base_url=os.getenv("LLM_BASE_URL"),
    model=os.getenv("LLM_MODEL")
)

# First message
response1 = client.chat("My name is Alice.")
print("Response 1:", response1)

# Second message - but the LLM won't remember!
response2 = client.chat("What's my name?")
print("\nResponse 2:", response2)

✓ Auto-detected model: qwen/qwen3-coder-30b
Response 1: Hello Alice! It's nice to meet you. How can I help you today?

Response 2: I don't have any information about your name. I don't retain personal information from our conversations, and I don't know who you are unless you tell me. Is there something specific you'd like to discuss or help with?


**❌ The LLM can't remember your name because each call is independent!**

Each `chat()` call creates a new conversation from scratch.

## 2️⃣ The Solution: Conversation History

Use `chat_with_history()` to maintain context across multiple turns.

In [2]:
# Start a conversation with history
history = []

# Turn 1: Introduce yourself
response1, history = client.chat_with_history(
    "My name is Alice.",
    history
)
print("Turn 1:")
print(f"You: My name is Alice.")
print(f"LLM: {response1}")

# Turn 2: Ask about your name
response2, history = client.chat_with_history(
    "What's my name?",
    history
)
print("\nTurn 2:")
print(f"You: What's my name?")
print(f"LLM: {response2}")

print("\n" + "="*50)
print(f"\n📚 Conversation history now has {len(history)} messages")

Turn 1:
You: My name is Alice.
LLM: Hello Alice! It's nice to meet you. How can I help you today?

Turn 2:
You: What's my name?
LLM: Your name is Alice, as you mentioned earlier. It was nice to meet you, Alice! Is there something I can help you with today?


📚 Conversation history now has 4 messages


**✅ Now the LLM remembers!**

The `chat_with_history()` method:
1. Takes your new message
2. Adds it to the conversation history
3. Sends the full history to the LLM
4. Returns both the response AND the updated history
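
The four steps above can be sketched in a few lines. This is an illustration only, not the SDK's actual implementation: `Message`, `fake_llm`, and `chat_with_history_sketch` are hypothetical stand-ins, and the real method may differ in details such as whether it copies or mutates the list.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Message:
    """Hypothetical stand-in for the SDK's ChatMessage."""
    role: str
    content: str


def fake_llm(messages: List[Message]) -> str:
    """Stand-in model: reports how many messages it was sent."""
    return f"I can see {len(messages)} message(s)."


def chat_with_history_sketch(
    user_text: str,
    history: List[Message],
    llm: Callable[[List[Message]], str] = fake_llm,
) -> Tuple[str, List[Message]]:
    # Steps 1-2: take the new message and add it to a copy of the history
    messages = history + [Message("user", user_text)]
    # Step 3: send the full history to the model
    reply = llm(messages)
    # Step 4: return the reply and the updated history
    return reply, messages + [Message("assistant", reply)]


demo_history: List[Message] = []
reply1, demo_history = chat_with_history_sketch("My name is Alice.", demo_history)
reply2, demo_history = chat_with_history_sketch("What's my name?", demo_history)
print(reply2)             # I can see 3 message(s).
print(len(demo_history))  # 4
```

The stand-in model sees three messages on the second turn (user, assistant, user), which is why a real model can answer questions about the first turn.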

## 3️⃣ Understanding Message History

Let's peek inside the history to see what's actually stored.

In [3]:
print("📝 Current conversation history:\n")
for i, message in enumerate(history, 1):
    print(f"Message {i}:")
    print(f"  Role: {message.role}")
    print(f"  Content: {message.content}")
    print()

📝 Current conversation history:

Message 1:
  Role: user
  Content: My name is Alice.

Message 2:
  Role: assistant
  Content: Hello Alice! It's nice to meet you. How can I help you today?

Message 3:
  Role: user
  Content: What's my name?

Message 4:
  Role: assistant
  Content: Your name is Alice, as you mentioned earlier. It was nice to meet you, Alice! Is there something I can help you with today?



**💡 Message Structure:**

Each message is a `ChatMessage` object with two key attributes:
- `role`: Who sent the message ("user", "assistant", or "system")
- `content`: The actual text of the message

Access them using dot notation: `message.role` and `message.content`

The conversation alternates: user → assistant → user → assistant → ...
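
Because each message boils down to a role and a piece of text, a history maps naturally onto a list of dictionaries, which is handy for saving a conversation to disk. The sketch below uses a hypothetical `Message` dataclass as a stand-in; the SDK's real `ChatMessage` may carry extra fields or provide its own serialization, so treat the helper names here as assumptions.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class Message:
    """Hypothetical stand-in for the SDK's ChatMessage."""
    role: str
    content: str


def history_to_json(history: List[Message]) -> str:
    """Serialize a history as a JSON list of {"role", "content"} dicts."""
    return json.dumps([asdict(m) for m in history], indent=2)


def history_from_json(text: str) -> List[Message]:
    """Rebuild Message objects from the JSON produced above."""
    return [Message(**item) for item in json.loads(text)]


saved_demo = [
    Message("user", "My name is Alice."),
    Message("assistant", "Hello Alice!"),
]
saved = history_to_json(saved_demo)
restored = history_from_json(saved)
print(restored == saved_demo)  # True
```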

## 4️⃣ Building a Multi-Turn Conversation

Let's have a real conversation with context building over time.

In [4]:
# Start fresh
history = []

# Define a conversation
conversation = [
    "I'm planning a trip to Japan.",
    "What's the best time of year to visit?",
    "I love food. What should I try?",
    "How much would typical street food cost?",
    "Can you summarize your travel advice for me?"
]

print("🗾 Japan Travel Planning Conversation\n")
print("="*70 + "\n")

for user_message in conversation:
    # Send message with history
    response, history = client.chat_with_history(
        user_message,
        history
    )
    
    # Display the exchange
    print(f"👤 You: {user_message}")
    print(f"🤖 LLM: {response}")
    print("\n" + "-"*70 + "\n")

print(f"\n📚 Final conversation: {len(history)} messages total")

🗾 Japan Travel Planning Conversation


👤 You: I'm planning a trip to Japan.
🤖 LLM: That sounds exciting! Japan is a wonderful destination with so much to offer. Here are some key things to consider for your trip:

**Best Time to Visit:**
- Spring (March-May) for cherry blossoms
- Autumn (September-November) for beautiful foliage
- Winter for skiing and hot springs
- Summer for festivals but it's hot and humid

**Must-See Destinations:**
- Tokyo (modern metropolis with amazing food)
- Kyoto (traditional temples and gardens)
- Osaka (food capital)
- Hiroshima/Miyajima (history and nature)
- Nara (ancient capital, friendly deer)

**Practical Tips:**
- Get a Japan Rail Pass for traveling between cities
- Learn a few basic Japanese phrases
- Carry cash (many places don't accept cards)
- Download translation apps
- Consider staying in ryokans (traditional inns) for unique experiences

**What interests you most?** Food, culture, nature, urban exploration, or something else? I can give you mor

**🎯 Notice how the LLM:**
- Remembers you're planning a Japan trip
- Connects your love of food to restaurant recommendations
- Provides cost estimates in context of food discussion
- Summarizes the entire conversation at the end

This is only possible because we maintained the conversation history!

## 5️⃣ When Context Matters (and When It Doesn't)

Context is important for some tasks but unnecessary for others.

In [5]:
print("✅ WHEN CONTEXT MATTERS:\n")

# Example: Refining requirements
history = []

msg1 = "I need a function to process data."
resp1, history = client.chat_with_history(msg1, history)
print(f"You: {msg1}")
print(f"LLM: {resp1[:100]}...\n")

msg2 = "It should handle CSV files."
resp2, history = client.chat_with_history(msg2, history)
print(f"You: {msg2}")
print(f"LLM: {resp2[:100]}...\n")

msg3 = "And it needs to filter rows where age > 18."
resp3, history = client.chat_with_history(msg3, history)
print(f"You: {msg3}")
print(f"LLM: {resp3[:100]}...\n")

print("="*70 + "\n")
print("❌ WHEN CONTEXT DOESN'T MATTER:\n")

# Example: Independent factual queries
fact1 = client.chat("What is the capital of France?")
print(f"Q: What is the capital of France?")
print(f"A: {fact1}\n")

fact2 = client.chat("What is 25 * 4?")
print(f"Q: What is 25 * 4?")
print(f"A: {fact2}\n")

fact3 = client.chat("Translate 'hello' to Spanish.")
print(f"Q: Translate 'hello' to Spanish.")
print(f"A: {fact3}")

✅ WHEN CONTEXT MATTERS:

You: I need a function to process data.
LLM: I'd be happy to help you create a data processing function! However, I need a bit more information t...

You: It should handle CSV files.
LLM: Here's a comprehensive CSV processing function in Python:

```python
import pandas as pd
import nump...
```

You: And it needs to filter rows where age > 18.
LLM: Here's the updated CSV processing function with age filtering:

```python
import pandas as pd
from t...
```


❌ WHEN CONTEXT DOESN'T MATTER:

Q: What is the capital of France?
A: The capital of France is Paris.

Q: What is 25 * 4?
A: 25 * 4 = 100

Q: Translate 'hello' to Spanish.
A: 'Hello' translated to Spanish is 'hola'.


**💡 Use context when:**
- Building on previous responses
- Refining requirements or ideas
- Having natural conversations
- Working on a single topic over multiple turns
- Using words like "it" or "that" to refer to earlier messages

**💡 Skip context when:**
- Asking independent factual questions
- Performing simple calculations
- Doing batch operations on unrelated items
- Testing different prompts on the same task
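
To get a feel for the cost side of this trade-off, the sketch below counts how many messages are sent to the model in total over a conversation. Message count is only a rough proxy for tokens, and the function assumes no system prompt and exactly one assistant reply per user message; `messages_sent` is a hypothetical helper, not part of the SDK.

```python
def messages_sent(turns: int, with_history: bool) -> int:
    """Total messages sent to the model over `turns` user turns.

    Assumes no system prompt and one assistant reply per user message.
    """
    total = 0
    for turn in range(1, turns + 1):
        # With history, turn k resends the 2*(k-1) earlier messages plus the new one
        total += 2 * (turn - 1) + 1 if with_history else 1
    return total


for n in (3, 10, 50):
    print(n, messages_sent(n, with_history=True), messages_sent(n, with_history=False))
# 3 9 3
# 10 100 10
# 50 2500 50
```

With history the total grows with the square of the number of turns, while independent `chat()` calls grow linearly, which is why unrelated questions are cheaper without context.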

## 6️⃣ Managing History: System Prompts

You can include a system prompt when starting a conversation.

In [6]:
# Start with a system prompt
history = []
system_prompt = "You are a helpful Python tutor who explains concepts simply with code examples."

# First message - include system prompt
response1, history = client.chat_with_history(
    "What is a list comprehension?",
    history,
    system=system_prompt
)
print("Turn 1:")
print(response1)

print("\n" + "="*70 + "\n")

# Subsequent messages - no need to repeat system prompt!
response2, history = client.chat_with_history(
    "Show me a more complex example.",
    history
)
print("Turn 2:")
print(response2)

print("\n" + "="*70 + "\n")
print("📝 History structure:")
for msg in history:
    print(f"  - {msg.role}: {msg.content[:50]}...")

Turn 1:
A list comprehension is a concise way to create lists in Python. It allows you to generate a new list by applying an expression to each item in an existing iterable (like a list, tuple, or range), optionally filtering items with a condition.

## Basic Syntax
```python
[expression for item in iterable if condition]
```

## Examples

### Simple List Comprehension
```python
# Create a list of squares
squares = [x**2 for x in range(5)]
print(squares)  # [0, 1, 4, 9, 16]

# Create a list of even numbers
evens = [x for x in range(10) if x % 2 == 0]
print(evens)  # [0, 2, 4, 6, 8]
```

### With Strings
```python
# Convert strings to uppercase
words = ['hello', 'world', 'python']
uppercase = [word.upper() for word in words]
print(uppercase)  # ['HELLO', 'WORLD', 'PYTHON']

# Extract first letter of each word
first_letters = [word[0] for word in words]
print(first_letters)  # ['h', 'w', 'p']
```

### Nested List Comprehension
```python
# Create a matrix
matrix = [[i*j for j in range(3)]
```

**💡 Important:** Include the `system` parameter only on the FIRST call. The system message becomes part of the history and persists automatically.
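
One way to avoid passing the prompt twice by accident is a small guard that checks whether the history already holds a system message. This is a sketch under assumptions: `Message` is a hypothetical stand-in for `ChatMessage`, `system_for_next_call` is not an SDK function, and whether `chat_with_history()` treats `system=None` the same as omitting the argument has not been verified here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Message:
    """Hypothetical stand-in for the SDK's ChatMessage."""
    role: str
    content: str


def system_for_next_call(history: List[Message], system_prompt: str) -> Optional[str]:
    """Return the system prompt only if the history does not already hold one."""
    if any(m.role == "system" for m in history):
        return None
    return system_prompt


tutor_prompt = "You are a helpful Python tutor."
not_started: List[Message] = []
started = [
    Message("system", tutor_prompt),
    Message("user", "Hi"),
    Message("assistant", "Hello!"),
]
print(system_for_next_call(not_started, tutor_prompt))  # You are a helpful Python tutor.
print(system_for_next_call(started, tutor_prompt))      # None
```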

## 7️⃣ Advanced: Inspecting and Modifying History

You can manually inspect or modify the conversation history.

In [7]:
# Start a conversation
history = []
response, history = client.chat_with_history("My favorite color is blue.", history)
response, history = client.chat_with_history("My favorite food is pizza.", history)
response, history = client.chat_with_history("What are my favorites?", history)

print("Original response:", response)
print("\n" + "="*70 + "\n")

# Try to remove the food preference by keeping only three messages:
# the first user message, the first assistant reply, and the last message.
# Caveat: history[-1] is the assistant's final answer, which restates both
# favorites, so the pizza preference is still in the context.
# To truly drop it, remove every message that mentions it (here: history[:2]).
modified_history = [history[0], history[1], history[-1]]

# Ask again with modified history
response2, _ = client.chat_with_history(
    "What are my favorites?",
    modified_history
)

print("Response with modified history:", response2)
print("\n💡 Pizza is still remembered: the kept assistant reply repeats it!")

Original response: Based on what you've told me so far, your favorites are:

- **Color**: Blue
- **Food**: Pizza

You've shared these two preferences with me, so those are definitely your current favorites! Is there anything else you'd like to share about what you like, or perhaps other favorites you haven't mentioned yet?


Response with modified history: Based on what you've told me so far, your favorites are:

- **Color**: Blue
- **Food**: Pizza

Those are the two specific favorites you've mentioned to me. Is there anything else you'd like to share about your preferences?

💡 Pizza is still remembered: the kept assistant reply repeats it!


**⚠️ Advanced technique:** Manually editing history is useful for:
- Removing sensitive information
- Keeping only relevant context (token optimization)
- Correcting mistakes in the conversation
- Implementing sliding window context (keep last N messages)
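
A sliding window is the most common of these techniques. The sketch below keeps any system messages and the most recent messages, and makes sure the window starts on a user message so the user/assistant alternation stays intact. `Message` and `trim_history` are hypothetical names for illustration, not SDK APIs.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Message:
    """Hypothetical stand-in for the SDK's ChatMessage."""
    role: str
    content: str


def trim_history(history: List[Message], max_messages: int) -> List[Message]:
    """Keep system messages plus at most `max_messages` recent other messages."""
    system = [m for m in history if m.role == "system"]
    rest = [m for m in history if m.role != "system"]
    if max_messages <= 0:
        return system
    recent = rest[-max_messages:]
    # Start the window on a user message so the alternation stays intact
    if recent and recent[0].role == "assistant":
        recent = recent[1:]
    return system + recent


long_history = [Message("system", "You are a quiz host.")]
for i in range(1, 4):
    long_history.append(Message("user", f"question {i}"))
    long_history.append(Message("assistant", f"answer {i}"))

trimmed = trim_history(long_history, max_messages=3)
print([m.content for m in trimmed])
# ['You are a quiz host.', 'question 3', 'answer 3']
```

Asking for three messages returns only two non-system messages here, because the oldest one in the window was an assistant reply whose question had already been cut.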

## 🏋️ Exercise: Build a Quiz Game

**Challenge:** Create a Q&A quiz session where:
1. The LLM asks you 3 trivia questions (one at a time)
2. You provide answers
3. The LLM evaluates each answer
4. At the end, the LLM summarizes your score

Requirements:
- Use `chat_with_history()` to maintain context
- Use a system prompt to make the LLM a quiz host
- Include at least 5 turns (for example: intro, Q1, A1, Q2, A2, Q3, A3, summary)

Try it yourself first!

In [8]:
# Your code here:



<details>
<summary>Click to see solution</summary>

```python
# Solution: Quiz game with conversation history

history = []
system_prompt = """
You are a friendly quiz show host. Ask the user 3 trivia questions, one at a time.
After each answer, tell them if they're correct and give the right answer if they're wrong.
After all 3 questions, summarize their score.
"""

print("🎮 TRIVIA QUIZ GAME\n")
print("="*70 + "\n")

# Start the quiz
response, history = client.chat_with_history(
    "Let's start the quiz!",
    history,
    system=system_prompt
)
print(f"🎤 Host: {response}\n")

# User answers (you can modify these)
answers = [
    "Paris",           # If question is about France's capital
    "I don't know",    # Skip a question
    "42"               # Random guess
]

for i, answer in enumerate(answers, 1):
    # Provide answer
    print(f"👤 You: {answer}")
    response, history = client.chat_with_history(answer, history)
    print(f"🎤 Host: {response}\n")
    print("-"*70 + "\n")

# Get final summary
if "score" not in response.lower() and "how did" not in response.lower():
    response, history = client.chat_with_history(
        "Can you tell me my final score?",
        history
    )
    print(f"👤 You: Can you tell me my final score?")
    print(f"🎤 Host: {response}")

print(f"\n\n📊 Total messages in history: {len(history)}")
```
</details>

In [9]:
# Solution cell (run this to see the answer)
history = []
system_prompt = """
You are a friendly quiz show host. Ask the user 3 trivia questions, one at a time.
After each answer, tell them if they're correct and give the right answer if they're wrong.
After all 3 questions, summarize their score.
"""

print("🎮 TRIVIA QUIZ GAME\n")
print("="*70 + "\n")

# Start the quiz
response, history = client.chat_with_history(
    "Let's start the quiz!",
    history,
    system=system_prompt
)
print(f"🎤 Host: {response}\n")

# User answers (you can modify these)
answers = [
    "Paris",
    "I don't know",
    "42"
]

for i, answer in enumerate(answers, 1):
    print(f"👤 You: {answer}")
    response, history = client.chat_with_history(answer, history)
    print(f"🎤 Host: {response}\n")
    print("-"*70 + "\n")

# Get final summary
if "score" not in response.lower() and "how did" not in response.lower():
    response, history = client.chat_with_history(
        "Can you tell me my final score?",
        history
    )
    print(f"👤 You: Can you tell me my final score?")
    print(f"🎤 Host: {response}")

print(f"\n\n📊 Total messages in history: {len(history)}")

🎮 TRIVIA QUIZ GAME


🎤 Host: Great! I'm ready for the quiz. Please go ahead and ask your first question! 🚀

👤 You: Paris
🎤 Host: Hello! You've mentioned "Paris" - are you looking for information about:

- The city of Paris, France?
- Something specific about Paris (like its landmarks, history, culture, etc.)?
- Or perhaps you're thinking of a quiz question that starts with "Paris"?

I'm ready to help with whatever you'd like to explore about Paris! 🇫🇷✨

----------------------------------------------------------------------

👤 You: I don't know
🎤 Host: No worries! Let me give you a quick quiz to get us started:

**Question 1: What is the capital city of France?**

A) Lyon
B) Marseille
C) Paris
D) Nice

Take your time to think about it! 😊

(If you want to start over or try a different topic, just let me know!)

----------------------------------------------------------------------

👤 You: 42
🎤 Host: 42! That's a classic answer, especially if you were thinking of "The Hitchhiker's Guide t

## ⚠️ Common Pitfalls

### 1. Forgetting to Update History
```python
# ❌ Bad: Not using the returned history
history = []
response, history = client.chat_with_history("Hello", history)
response, _ = client.chat_with_history("What did I just say?", history)  # Wrong!
# The updated history is thrown away, so later turns won't include this exchange

# ✅ Good: Always update history
history = []
response, history = client.chat_with_history("Hello", history)
response, history = client.chat_with_history("What did I just say?", history)
```

### 2. Repeating System Prompts
```python
# ❌ Bad: Adding system prompt on every turn
history = []
system = "You are a helpful assistant."
response, history = client.chat_with_history("Hi", history, system=system)
response, history = client.chat_with_history("How are you?", history, system=system)
# This adds duplicate system messages!

# ✅ Good: System prompt only on first turn
history = []
response, history = client.chat_with_history("Hi", history, system=system)
response, history = client.chat_with_history("How are you?", history)
```

### 3. Using History When Not Needed
```python
# ❌ Bad: Using history for independent tasks
history = []
response, history = client.chat_with_history("What is 2+2?", history)
response, history = client.chat_with_history("What is 5*5?", history)
response, history = client.chat_with_history("What is 10-3?", history)
# Unnecessary context, wastes tokens

# ✅ Good: Use simple chat for independent queries
response1 = client.chat("What is 2+2?")
response2 = client.chat("What is 5*5?")
response3 = client.chat("What is 10-3?")
```

### 4. Not Managing History Size
```python
# ⚠️ Warning: Long conversations consume many tokens
history = []
for i in range(100):  # Very long conversation
    response, history = client.chat_with_history(f"Message {i}", history)
    # Eventually hits token limits!

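# ✅ Good: Implement sliding window for long conversations
MAX_HISTORY = 10  # Keep last 10 messages
history = []
for i in range(100):
    response, history = client.chat_with_history(f"Message {i}", history)
    # Trim after every turn (note: this also drops a leading system message)
    history = history[-MAX_HISTORY:]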
```

## 🎓 What You Learned

✅ **The Problem**: Simple `chat()` doesn't remember previous messages

✅ **The Solution**: Use `chat_with_history()` to maintain context

✅ **Message Structure**: History is a list of `ChatMessage` objects (access via `.role` and `.content`)

✅ **Building Conversations**: Pass history between turns, update with returned value

✅ **When to Use Context**: Building on previous responses vs. independent queries

✅ **System Prompts**: Include once on first turn, persists in history

✅ **Managing History**: Can inspect, modify, or truncate history manually

## 🚀 Next Steps

Now that you can maintain conversations, it's time to supercharge your LLM with **tools**!

➡️ Continue to [04-tool-calling-basics.ipynb](./04-tool-calling-basics.ipynb) to learn how to:
- Understand what tools are and why they're powerful
- Use built-in tools like math_calculator and text_transformer
- See how the SDK automatically executes tools
- Inspect tool calls in responses
- Combine tools with conversation history