# Single-turn and multi-turn prompting

In this notebook, we explore two foundational ways of interacting with LLMs:
- Single-turn prompts: isolated interactions with the model.
- Multi-turn prompts: ongoing conversations in which context is carried across turns, so the model can refer back to previous inputs.

Choosing the right prompt structure can significantly affect the quality and relevance of the model’s responses. Single-turn prompts suit quick, self-contained tasks, while multi-turn prompts allow more natural, conversational interactions. We will use OpenAI’s GPT-4o-mini model via LangChain, a framework that helps structure LLM workflows.

In [1]:
import os

from dotenv import load_dotenv
from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.messages import HumanMessage
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnableLambda
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_openai import ChatOpenAI

# Load variables from a local .env file into the environment
load_dotenv()

# Fail early with a clear message if the OpenAI API key is missing
if not os.getenv("OPENAI_API_KEY"):
    raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file.")

#### Initialize the language model
We initialize an instance of OpenAI's GPT-4o-mini model. This model is lightweight and fast, making it suitable for interactive use cases.

In [2]:
# Initialize the language model
llm = ChatOpenAI(model="gpt-4o-mini-2024-07-18")

## Single-turn prompts
Single-turn prompts are direct, one-time interactions with the model. The model receives an input, generates a response, and discards the context after responding. This type of prompt is ideal for simple questions, fact retrieval, or isolated commands.

In [3]:
# A basic single-turn prompt (no context from previous messages)
single_turn_prompt = "What are the three primary colors?"

# Ask the LLM and print the response
print(llm.invoke(single_turn_prompt).content)

The three primary colors are red, blue, and yellow. These colors cannot be created by mixing other colors and can be combined in various ways to create a wide range of other colors. In additive color mixing (like in light), the primary colors are red, green, and blue (RGB).


- We send the question directly to the language model (`llm.invoke(...)`).
- The `.content` attribute holds the text of the response.

The model returns a response based only on this input; it has no memory of anything before or after.

#### Using prompt templates
To make our prompts more reusable and structured, we can define prompt templates. This lets us build prompts dynamically by filling in placeholders with different values.

In [4]:
# Define a reusable prompt template with a variable {topic}
structured_prompt = PromptTemplate(
    input_variables=["topic"],
    template="Provide a brief explanation of {topic} and list its three main components."
)

# Combine the template and the model into a prompt chain
chain = structured_prompt | llm
# Use the prompt chain by passing a topic
print(chain.invoke({"topic": "color theory"}).content)

Color theory is a conceptual framework used to understand how colors interact, how they can be combined, and the effects they can have on perception and emotions. It encompasses the principles of color mixing, the relationships between colors, and the psychological implications of color use in design and art.

The three main components of color theory are:

1. **Hue**: This refers to the actual color itself, such as red, blue, or yellow. Hue is the attribute of a color that allows it to be classified in the color spectrum.

2. **Saturation**: Also known as intensity, saturation describes the purity or vividness of a color. A highly saturated color appears bright and rich, while a desaturated color appears more muted or grayish.

3. **Value**: This component refers to the lightness or darkness of a color. Value is determined by the amount of light reflected off the surface of an object, affecting how we perceive its color.

Together, these components help create a comprehensive understa

- `PromptTemplate` is a LangChain class that lets us define a generic, reusable prompt.
- `{topic}` is a placeholder: a dynamic input that we fill in at runtime.
- We then "pipe" (`|`) the prompt template into the language model using LangChain's composability. This creates a prompt chain that takes the input, fills the template, and sends the result to the model.
- `chain.invoke({"topic": "color theory"})` replaces `{topic}` with "color theory" and runs the full chain.
- The LLM receives the filled-in prompt and generates a response that follows the requested structure.

This is especially useful when we need to apply the same prompt logic across many topics.
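To make the placeholder mechanism concrete, here is a minimal sketch in plain Python, with no LangChain and no API call. It assumes that, for a simple template like this one, filling the placeholder behaves like Python's `str.format`. `render_prompt` and `rendered` are hypothetical names used only for illustration.

```python
# Plain-Python sketch of template filling (no LangChain, no API call).
# `render_prompt` is a hypothetical helper, not part of LangChain.
TEMPLATE = "Provide a brief explanation of {topic} and list its three main components."


def render_prompt(topic: str) -> str:
    """Fill the {topic} placeholder and return the final prompt text."""
    return TEMPLATE.format(topic=topic)


# The same template yields a different prompt for each topic.
rendered = [render_prompt(topic) for topic in ["color theory", "music theory"]]
for prompt_text in rendered:
    print(prompt_text)
```

Each rendered string is what the model would receive as its prompt in this simplified picture.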

## Multi-turn prompts (Conversations)
Multi-turn prompts simulate a natural conversation with the LLM, where memory is retained between turns. This allows the model to refer back to previous messages and maintain context. To implement this, we use LangChain's `RunnableWithMessageHistory` — which wraps a model and adds memory capabilities.

In [5]:
# Memory store to hold session histories (one for each unique session)
store = {}

# Function to retrieve or create a session-specific message history
def get_session_history(session_id: str) -> InMemoryChatMessageHistory:
    if session_id not in store:
        store[session_id] = InMemoryChatMessageHistory()
    return store[session_id]

# Wrap the LLM so it works with structured message history
def message_chain(inputs):
    # Combine memory (chat history) with current input
    messages = inputs["chat_history"] + [HumanMessage(content=inputs["input"])]
    return llm.invoke(messages)  # returns an AIMessage
    
# Convert our function into a runnable
runnable = RunnableLambda(message_chain)

# Create a memory-aware conversation chain
conversation = RunnableWithMessageHistory(
    runnable=runnable,
    get_session_history=get_session_history,
    input_messages_key="input",  # Where to find user input
    history_messages_key="chat_history"  # Where memory gets tracked
)

# Define a session ID to identify this conversation thread
session_id = "space_learner_001"

# Now we can send structured input
# Define a helper function to interact with the model and print results cleanly
def ask(question):
    response = conversation.invoke(
        {"input": question},
        config={"configurable": {"session_id": session_id}}
    )
    print(f"Q: {question}\nA: {response.content}\n")

# Ask a series of related questions (multi-turn)
ask("Hi, I'm learning about space. Can you tell me about planets?")
ask("What's the largest planet in our solar system?")
ask("How does its size compare to Earth?")

Q: Hi, I'm learning about space. Can you tell me about planets?
A: Of course! Planets are celestial bodies that orbit stars, and in our solar system, they orbit the Sun. There are eight recognized planets in our solar system, which can be divided into two main categories: terrestrial planets and gas giants.

### Terrestrial Planets:
1. **Mercury**: The closest planet to the Sun, Mercury is a small, rocky planet with a thin atmosphere. It has extreme temperature variations and is heavily cratered, similar to our Moon.

2. **Venus**: Often called Earth's "sister planet" due to its similar size and composition, Venus has a thick atmosphere composed mainly of carbon dioxide, leading to a strong greenhouse effect and very high surface temperatures.

3. **Earth**: The only known planet to support life, Earth has a diverse environment with liquid water, a protective atmosphere, and a magnetic field. Its surface is 71% water and has a variety of ecosystems.

4. **Mars**: Known as the "Red Plan

- `store = {}` – Creating a memory store - This acts like a central storage for all session histories. Each unique conversation (like each user or chat session) gets its own memory. This setup makes it possible to keep multiple conversations separate and organized.
- `get_session_history(session_id)` – Manage memory per session - This function ensures that every conversation session gets its own memory. If the session is new, it creates a fresh `InMemoryChatMessageHistory` object. If the session already exists, it retrieves the existing memory.
    - Why? A dedicated history per session is what lets the model “remember” what we have talked about in that session, without mixing in messages from other sessions.
- `message_chain(inputs)` – Combining input and memory - This function combines the conversation history (`chat_history`) with the current user input (`input`) and sends both as a single list to the language model (`llm.invoke(messages)`). Because the model sees the entire conversation along with the new input, it can respond with an understanding of both what was said earlier and what is being asked now. This is the key part that makes the conversation flow naturally.
- `runnable = RunnableLambda(...)` – Wrap the LLM - `RunnableWithMessageHistory` passes the wrapped runnable a dictionary such as `{"input": "message", "chat_history": [...]}`. The LLM cannot take that dictionary directly; here we give it a list of message objects instead. So we wrap `message_chain` in a `RunnableLambda`, which converts the dictionary into that list before calling the model. This step adapts the LLM to work correctly with LangChain’s memory infrastructure.
- `RunnableWithMessageHistory(...)` – Add memory to the model - This is where we combine everything: the language model, the memory system, and input/output formatting. This is the key step that enables multi-turn conversations. It makes the model remember what the user has already said and adapt its responses accordingly. We tell LangChain:
    - How to run the model (`runnable`)
    - How to manage memory (`get_session_history`)
    - Where to find user input (`input_messages_key="input"`)
    - Where to store and retrieve conversation history (`history_messages_key="chat_history"`)
- `session_id = "..."` – Define conversation thread - Each user (or session) should have a unique ID. This is how the system knows which conversation history to use. Without a session ID, the model wouldn't know whether a message is part of a new conversation or a continuation of an old one.
- `ask(...)` – Helper function to interact with the model - This is a utility function that sends input to the model in the correct format (`{"input": question}`), attaches the session ID in the config and prints the model's response nicely.
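The moving parts described above can be sketched without LangChain or an API key. Everything in this block is a hypothetical stand-in: `stub_model` replaces the LLM and only reports how many messages it received, and plain `(role, text)` tuples replace LangChain message objects. The aim is only to show the store, the per-session lookup, and the history being resent on every turn.

```python
# Stdlib-only sketch of session-scoped memory.
# All names here are illustrative stand-ins, not LangChain APIs.
sketch_store: dict[str, list[tuple[str, str]]] = {}


def get_history(session_id: str) -> list[tuple[str, str]]:
    """Return the history for a session, creating an empty one if needed."""
    return sketch_store.setdefault(session_id, [])


def stub_model(messages: list[tuple[str, str]]) -> str:
    """Stand-in for the LLM: reports how much context it was given."""
    return f"(saw {len(messages)} message(s))"


def ask_in_session(session_id: str, question: str) -> str:
    history = get_history(session_id)
    # Send the full history plus the new question, as message_chain does.
    messages = history + [("human", question)]
    answer = stub_model(messages)
    # Record both sides of the turn so the next call can see them.
    history.append(("human", question))
    history.append(("ai", answer))
    return answer


print(ask_in_session("session_a", "What's the largest planet?"))
print(ask_in_session("session_a", "How does its size compare to Earth?"))
print(ask_in_session("session_b", "What is the capital of France?"))
```

The second call in `session_a` receives more messages than the first, while `session_b` starts from an empty history. In the notebook, `RunnableWithMessageHistory` handles the bookkeeping step of appending the new question and answer after each call.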

Now we start asking questions. Each new message references previous ones thanks to the memory mechanism. For example, the third question, *"How does its size compare to Earth?"*, is understood in context — the model knows “its” refers to Jupiter, because that was discussed just before.

## Comparing single-turn and multi-turn behavior
Let’s run the same set of prompts in both modes to see how the responses differ.

In [6]:
# Single-turn prompts
prompts = [
    "What is the capital of France?",
    "What is its population?",
    "What is the city's most famous landmark?"
]

print("Single-turn responses:")
for prompt in prompts:
    print(f"Q: {prompt}")
    print(f"A: {llm.invoke(prompt).content}\n")


# Multi-turn prompts
print("\nMulti-turn responses:")
conversation = RunnableWithMessageHistory(
    runnable=runnable,
    get_session_history=get_session_history,
    input_messages_key="input",  # Where to find user input
    history_messages_key="chat_history"  # Where memory gets tracked
)  # Start a new conversation
session_id = "paris_learner_001"  # Define the session ID for memory continuity
# Send the same prompts, but with memory enabled
for prompt in prompts:
    print(f"Q: {prompt}")
    response = conversation.invoke(
        {"input": prompt},
        config={"configurable": {"session_id": session_id}},
    )
    print(f"A: {response.content}\n")

Single-turn responses:
Q: What is the capital of France?
A: The capital of France is Paris.

Q: What is its population?
A: Could you please specify which location you are referring to when you ask about its population?

Q: What is the city's most famous landmark?
A: To provide an accurate answer, I would need to know which city you are referring to. Each city has its own unique landmarks that are considered famous. For example:

- In Paris, the Eiffel Tower is a renowned landmark.
- In New York City, the Statue of Liberty is iconic.
- In London, the Big Ben and the Houses of Parliament are well-known.
- In Sydney, the Sydney Opera House is a famous site.

Please specify the city, and I can give you more information about its most famous landmark!


Multi-turn responses:
Q: What is the capital of France?
A: The capital of France is Paris.

Q: What is its population?
A: As of my last update in October 2021, the population of Paris was estimated to be around 2.1 million people within the 

- In single-turn prompts, there is no memory, so the model cannot resolve references like "its" or "the city" unless we restate the subject in every prompt.
- In multi-turn prompts, the model maintains context. It understands that "its" refers to Paris and provides more coherent answers across multiple queries.
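One way to act on the first point, sketched here in plain Python, is to restate the subject in every single-turn prompt. The `{city}` placeholder, the `FOLLOW_UPS` list, and the `make_standalone` helper are hypothetical names introduced for illustration; they are not part of the notebook above.

```python
# Sketch: make follow-up questions self-contained for single-turn use.
# The templates and helper below are illustrative assumptions.
FOLLOW_UPS = [
    "What is the population of {city}?",
    "What is the most famous landmark in {city}?",
]


def make_standalone(city: str) -> list[str]:
    """Restate the subject explicitly so each prompt needs no prior context."""
    return [template.format(city=city) for template in FOLLOW_UPS]


for standalone_prompt in make_standalone("Paris"):
    print(standalone_prompt)
```

Each resulting string could then be passed to `llm.invoke(...)` on its own, at the cost of repeating the subject in every prompt.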