# Conversational agent with context awareness using Gemini

This notebook demonstrates how to build a conversational agent that maintains memory across interactions within a session using Google Gemini. Traditional chatbots often suffer from a fundamental limitation: they treat each interaction as an isolated event, completely forgetting what was discussed moments before. By implementing conversation history management, we create an AI assistant that can engage in natural, flowing conversations where each response builds upon previous exchanges.

In [1]:
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_core.runnables.history import RunnableWithMessageHistory
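from langchain_core.chat_history import InMemoryChatMessageHistory as ChatMessageHistory  # In-memory history from langchain_core; langchain.memory is deprecated in newer releases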
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
import os
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

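# Fail early with a clear message if the Google API key is missing
# (load_dotenv() has already exported any value found in the .env file)
if not os.getenv("GOOGLE_API_KEY"):
    raise RuntimeError("GOOGLE_API_KEY is not set; add it to your .env file or environment.")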

- `RunnableWithMessageHistory`: LangChain provides a flexible abstraction for running "chains" of logic, such as combining prompts with models. By default these chains are stateless, meaning they don't remember what happened in previous turns. `RunnableWithMessageHistory` wraps our chain, inserts past conversation messages into the prompt before each call, and updates the memory with the new messages after each interaction.
- `ChatMessageHistory`: a class that holds a list of past messages (like `"human": "Hello"` and `"ai": "Hi there!"`). `RunnableWithMessageHistory` uses it as the actual place where history is stored and retrieved.
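
To make the idea of a message history concrete, here is a minimal plain-Python sketch of such a container. `SimpleHistory` and its `add` method are hypothetical stand-ins written for illustration only. They are not LangChain classes, and the real `ChatMessageHistory` stores message objects rather than tuples.

```python
from dataclasses import dataclass, field


@dataclass
class SimpleHistory:
    """Hypothetical stand-in for a chat history: an ordered list of (role, content) pairs."""
    messages: list[tuple[str, str]] = field(default_factory=list)

    def add(self, role: str, content: str) -> None:
        # Append in arrival order so the list reads like a transcript
        self.messages.append((role, content))


history = SimpleHistory()
history.add("human", "Hello")
history.add("ai", "Hi there!")

for role, content in history.messages:
    print(f"{role}: {content}")
```

A stateless chain never sees this list on its own; something has to pass the stored messages back in on every call, which is the job the wrapper takes over.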

### Initialize the language model
The language model serves as the core intelligence of our conversational agent. Here we will configure the AI model with specific parameters that balance response quality, cost, and performance.

In [2]:
# Initialize the Gemini language model
llm = ChatGoogleGenerativeAI(
    model="gemini-1.5-flash",
    max_output_tokens=1000,  # Cap each response at 1000 output tokens
    temperature=0  # Minimize sampling randomness for more repeatable responses
)

This configuration creates our AI model instance using Google's Gemini model. Setting `max_output_tokens=1000` caps the length of each response (a longer answer is cut off at the limit), while `temperature=0` minimizes randomness in sampling, making the agent's behavior more predictable and consistent across similar inputs. Identical outputs are not strictly guaranteed from run to run, but this setting is particularly valuable for applications that need reliable, repeatable interactions.
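
The effect of `temperature` can be illustrated with a small sketch of temperature-scaled softmax. This is a generic illustration of how sampling temperature is commonly defined, not Gemini's actual decoding code. The function name and the logits are made up for the example.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Turn raw scores into probabilities; smaller temperatures sharpen the distribution.

    temperature must be greater than 0 here; a setting of 0 corresponds to the
    limiting case where the top-scoring token is always chosen.
    """
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # Subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # Made-up scores for three candidate tokens

warm = softmax_with_temperature(logits, temperature=1.0)
cold = softmax_with_temperature(logits, temperature=0.05)

print("temperature=1.0 :", [round(p, 3) for p in warm])
print("temperature=0.05:", [round(p, 3) for p in cold])
```

As the temperature shrinks, nearly all of the probability mass lands on the highest-scoring token, which is why a temperature of 0 is usually treated as always picking the most likely token.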

### Create a simple in-memory store for chat histories
The heart of context awareness lies in effectively managing conversation history across multiple sessions. We need a system that can store, retrieve, and organize conversations for different users or conversation threads. This section implements an in-memory storage solution that maintains separate conversation histories for each unique session identifier.

In [3]:
# Create a simple in-memory store for chat histories.
# This dictionary holds a separate conversation history for each session.
store = {}

def get_chat_history(session_id: str):
    """Retrieve or create a chat history for a given session ID."""
    if session_id not in store:
        # Create new conversation history if session doesn't exist
        store[session_id] = ChatMessageHistory()
    return store[session_id]

Here, we create a session management system using a simple dictionary-based approach. The `get_chat_history` function acts as a factory method that either retrieves an existing conversation history or creates a new one for first-time sessions. Each `ChatMessageHistory` object maintains a chronological record of messages within its session, enabling the AI to reference previous interactions. Keying the store by session ID isolates each conversation's context, so several users or threads can talk to the agent without seeing each other's messages. Because the store is a plain dictionary in process memory, all histories are lost when the process restarts, so a production multi-user application would need a persistent backend.
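
The isolation between sessions can be checked with a small stand-alone sketch. It mirrors the retrieve-or-create pattern using plain lists instead of `ChatMessageHistory`. The names `demo_store` and `get_history`, and the sample session IDs, are hypothetical.

```python
# Hypothetical stand-in: each session's history is a plain list of (role, content) pairs
demo_store: dict[str, list[tuple[str, str]]] = {}


def get_history(session_id: str) -> list[tuple[str, str]]:
    """Return the history for session_id, creating an empty one on first use."""
    return demo_store.setdefault(session_id, [])


get_history("alice").append(("human", "My favorite color is green."))
get_history("bob").append(("human", "What is the capital of France?"))

# Repeated lookups return the same list object, so appended messages persist
print(get_history("alice") is get_history("alice"))

# Each session only ever contains its own messages
print(get_history("alice"))
print(get_history("bob"))
```

`dict.setdefault` does the same retrieve-or-create step as the explicit `if session_id not in store` check, in a single call.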

### Create the prompt template
Effective prompt engineering is crucial for guiding the AI's behavior and ensuring consistent, helpful responses. We will create a structured template that defines the AI's role, incorporates conversation history, and processes user input. The template serves as the blueprint for how our agent interprets and responds to interactions within the context of ongoing conversations.

In [4]:
# Create the prompt template that defines our conversation structure
prompt = ChatPromptTemplate.from_messages([
    # System message establishes the AI's role and behavior guidelines
    ("system", "You are a helpful AI assistant."),
    # MessagesPlaceholder dynamically inserts conversation history - this is where previous messages will be injected to provide context
    MessagesPlaceholder(variable_name="history"),
    # Human message template for processing current user input
    ("human", "{input}")
])

This prompt template establishes a three-part conversation structure that forms the foundation of contextual interactions. The system message sets behavioral expectations for the AI, creating consistency in tone and helpfulness. The `MessagesPlaceholder` is the key piece here: it injects the stored conversation history into each prompt, so the AI has access to the previous context when generating responses. The human input slot holds the current user query, allowing the template to be reused for any user input while maintaining the established structure.
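
As a rough picture of what the filled-in template looks like, the sketch below assembles the same three parts by hand. `render_prompt` is a hypothetical helper written for this illustration. It is not how `ChatPromptTemplate` is implemented, and it uses `(role, content)` tuples instead of LangChain message objects.

```python
def render_prompt(history: list[tuple[str, str]], user_input: str) -> list[tuple[str, str]]:
    """Hypothetical helper: assemble the system message, prior turns, and the new input."""
    messages = [("system", "You are a helpful AI assistant.")]
    messages.extend(history)                # Plays the role of the history placeholder
    messages.append(("human", user_input))  # Plays the role of the {input} slot
    return messages


past_turns = [
    ("human", "Hello! How are you?"),
    ("ai", "I'm doing well, thank you for asking!"),
]

for role, content in render_prompt(past_turns, "What was my previous message?"):
    print(f"{role}: {content}")
```

On the first turn of a conversation the history is empty, so the model receives only the system message and the new input.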

### Combine the prompt and model into a runnable chain
Now we will combine our language model with the prompt template to create a processing pipeline. This chain represents the core logic flow: taking user input, combining it with conversation history through our template, and generating contextually aware responses. The goal is to create a seamless integration between prompt engineering and AI model processing.

In [5]:
# Combine the prompt template and language model into a runnable chain
chain = prompt | llm

This creates a processing pipeline using LangChain's pipe operator. The chain represents the flow from structured input (via our prompt template) to AI-generated output (via the language model). When invoked, this chain will take user input, inject it into our prompt template along with any conversation history, and pass the complete prompt to the language model for response generation.
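
For intuition about the `|` syntax, the following toy sketch shows how Python's `__or__` method can compose two steps into one. The `Step` class is a hypothetical stand-in and is far simpler than LangChain's runnables. The "model" here just upper-cases its input.

```python
from typing import Any, Callable


class Step:
    """Hypothetical stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def invoke(self, value: Any) -> Any:
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # a | b yields a new step that runs a first, then feeds its output to b
        return Step(lambda value: other.invoke(self.invoke(value)))


# Toy "prompt" and "model" steps
format_prompt = Step(lambda data: f"User said: {data['input']}")
fake_model = Step(lambda text: text.upper())

toy_chain = format_prompt | fake_model
print(toy_chain.invoke({"input": "hello"}))
```

The composed object exposes the same `invoke` method as its parts, which is what lets a chain be wrapped or extended like any single step.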

### Wrap the chain with message history
Finally, we wrap our basic conversational chain with history management capabilities. This integration automatically handles the storage and retrieval of conversation messages, transforming our stateless chain into a stateful conversational agent. The system will now automatically maintain context across interactions without requiring manual history management.

In [6]:
# Wrap the chain with message history management capabilities
chain_with_history = RunnableWithMessageHistory(
    chain,  # The basic conversational chain we built
    get_chat_history,  # Function to retrieve/create session history
    input_messages_key="input",  # Key name for user input in our prompt template
    history_messages_key="history"  # Key name for message history in our prompt template
)

This wrapper transforms our basic chain into a fully context-aware conversational agent. The `RunnableWithMessageHistory` class automatically handles several critical functions: it retrieves the appropriate conversation history before each interaction, injects that history into our prompt template, processes the user's input through our AI model, and then stores both the user's message and the AI's response for future reference. The key mappings ensure that user input and conversation history are correctly placed within our prompt template structure.
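
The retrieve, run, and store cycle described above can be sketched in plain Python. Everything here is a simplified assumption for illustration: `with_history`, `fake_chain`, and `sessions` are hypothetical names, and the toy "model" only counts the earlier human messages it was shown. The real `RunnableWithMessageHistory` handles more cases than this.

```python
from typing import Callable

Message = tuple[str, str]
sessions: dict[str, list[Message]] = {}


def fake_chain(history: list[Message], user_input: str) -> str:
    """Toy stateless 'model': reports how many earlier human turns it was shown."""
    seen = sum(1 for role, _ in history if role == "human")
    return f"I have seen {seen} earlier message(s) from you."


def with_history(chain: Callable[[list[Message], str], str]) -> Callable[[str, str], str]:
    """Hypothetical wrapper: load history, call the chain, then record both turns."""

    def invoke(user_input: str, session_id: str) -> str:
        history = sessions.setdefault(session_id, [])  # 1. Retrieve or create
        reply = chain(list(history), user_input)       # 2. Run with prior context
        history.append(("human", user_input))          # 3. Store the new turns
        history.append(("ai", reply))
        return reply

    return invoke


chat = with_history(fake_chain)
print(chat("Hello!", "user_123"))
print(chat("Still there?", "user_123"))
print(chat("Hi, I'm new.", "user_456"))
```

The chain itself stays stateless; only the wrapper touches the store, and a new session ID starts from an empty history.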

### Example usage
With our context-aware system complete, we can now test its capabilities through a series of interactions. This demonstration will show how the agent maintains context across multiple exchanges, remembering previous interactions and building upon them naturally. We will use a consistent session ID to ensure all interactions belong to the same conversation thread.

In [7]:
# Unique identifier for this conversation session
session_id = "user_123"

# First interaction
response1 = chain_with_history.invoke(
    {"input": "Hello! How are you?"},
    config={"configurable": {"session_id": session_id}}
)
print("AI:", response1.content)

# Second interaction
response2 = chain_with_history.invoke(
    {"input": "What was my previous message?"},
    config={"configurable": {"session_id": session_id}}
)
print("AI:", response2.content)

AI: I'm doing well, thank you for asking!  How are you today?
AI: Your previous message was "Hello! How are you?"


Each `invoke` call includes a configuration specifying the session ID, ensuring that both interactions contribute to the same conversational context.

### Print the conversation history
We can examine the stored conversation history.

In [8]:
# Print the complete conversation history for inspection
print("\nConversation History:")
for message in store[session_id].messages:
    print(f"{message.type}: {message.content}")


Conversation History:
human: Hello! How are you?
ai: I'm doing well, thank you for asking!  How are you today?
human: What was my previous message?
ai: Your previous message was "Hello! How are you?"


By iterating through the stored messages, we can see exactly what information the agent has access to during each interaction. The output shows both user messages (`human`) and AI responses (`ai`) in chronological order, demonstrating how the conversation history builds over time.

This foundation provides a starting point for building more advanced conversational AI applications, from customer service assistants to educational tutors, where maintaining context significantly enhances the user experience and interaction quality.