# Building a Conversational Chatbot with LangChain

This notebook demonstrates how to design and implement a basic LLM-powered chatbot that can maintain a conversation and remember previous interactions. We will focus on building a stateful chatbot using LangChain's message history features.

This tutorial covers the fundamentals of building conversational agents, which are essential for more advanced topics like:
* **Conversational RAG**: Enabling a chatbot to answer questions based on external data sources while maintaining conversation history.
* **Agents**: Building chatbots that can not only converse but also take actions (e.g., call tools, access external APIs).

## What You Will Learn:

1.  **Basic LLM Interaction**: Sending messages to an LLM and getting responses.
2.  **Managing Message History**: Implementing memory for a chatbot using `ChatMessageHistory` and `RunnableWithMessageHistory`.
3.  **Session Management**: Handling multiple concurrent conversations using different session IDs.
4.  **Integrating Prompt Templates**: Using `ChatPromptTemplate` and `MessagesPlaceholder` for structured conversation prompts.
5.  **Adding Dynamic Input**: Extending prompts to include additional variables beyond just chat messages (e.g., target language).
6.  **Managing Conversation History Size**: Employing `trim_messages` to prevent context window overflow in long conversations.
7.  **Building Complex Chains with LCEL**: Composing multiple components (trimmer, prompt, model) into a robust conversational chain.

## Key Concepts:

* **Stateful Chatbot**: A chatbot that remembers past turns in a conversation, allowing for more natural and coherent interactions.
* **Message History**: The chronological record of messages exchanged between the user and the AI.
* **`ChatMessageHistory`**: A LangChain class to store and manage chat messages for a single session.
* **`RunnableWithMessageHistory`**: A LangChain Runnable wrapper that automatically manages message history for an underlying chain, handling loading and saving messages.
* **Session ID**: A unique identifier for a conversation, allowing the chatbot to maintain separate histories for different users or chats.
* **`ChatPromptTemplate`**: Used to structure the input to a chat model, often including system instructions and a placeholder for conversation history.
* **`MessagesPlaceholder`**: A component within a `ChatPromptTemplate` that dynamically inserts the current message history into the prompt.
* **Context Window**: The maximum amount of text an LLM can process at one time. Managing history size is critical to avoid exceeding this limit.
* **`trim_messages`**: A LangChain utility to strategically reduce the number of messages in the history to fit within a `max_tokens` limit.
* **LCEL (LangChain Expression Language)**: LangChain's declarative syntax for composing components (`Runnable`s) with the `|` operator, so that the output of one step becomes the input of the next.
* **`RunnablePassthrough.assign`**: An LCEL construct that allows you to add or modify keys in the input dictionary before passing them to the next component in a chain.
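
To make the last two concepts concrete, here is a minimal plain-Python sketch of how `|` composition and an `assign`-style step could behave. The `Step` class and the `assign` helper are hypothetical stand-ins written for illustration; they are not LangChain's implementation.

```python
from typing import Any, Callable


class Step:
    """Hypothetical stand-in for a Runnable: wraps a function and supports `|`."""

    def __init__(self, fn: Callable[[Any], Any]) -> None:
        self.fn = fn

    def invoke(self, value: Any) -> Any:
        return self.fn(value)

    def __or__(self, other: "Step") -> "Step":
        # `a | b` builds a new step that feeds a's output into b.
        return Step(lambda value: other.invoke(self.invoke(value)))


def assign(**computed: Callable[[dict], Any]) -> Step:
    """Return a step that copies the input dict and adds or overwrites keys."""

    def run(inputs: dict) -> dict:
        return {**inputs, **{key: fn(inputs) for key, fn in computed.items()}}

    return Step(run)


# Overwrite "messages" with its last two entries, then render a string.
keep_last_two = assign(messages=lambda d: d["messages"][-2:])
render = Step(lambda d: f"[{d['language']}] " + " / ".join(d["messages"]))
pipeline = keep_last_two | render

result = pipeline.invoke(
    {"messages": ["hi", "I'm Bob", "2 + 2?"], "language": "English"}
)
print(result)
```

The untouched `language` key passes through unchanged, which is the property the trimming chain later in this notebook relies on.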

## Setup: API Keys and Environment Variables

Ensure your `.env` file is configured with `GROQ_API_KEY`. Optionally, set up `LANGCHAIN_API_KEY`, `LANGCHAIN_TRACING_V2`, and `LANGCHAIN_PROJECT` for LangSmith tracing.
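
A missing key usually surfaces later as an authentication error, so it can help to fail early. The helper below is a small sketch using only the standard library; `require_env` and the `DEMO_API_KEY` variable are hypothetical names chosen for this example.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value


# Demonstration with a placeholder variable so the snippet runs anywhere.
os.environ["DEMO_API_KEY"] = "placeholder"
print(require_env("DEMO_API_KEY"))
```

In this notebook you would call `require_env("GROQ_API_KEY")` after `load_dotenv()` instead of reading the variable with `os.getenv` directly.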

## Building A Chatbot
In this notebook, we'll go over an example of how to design and implement an LLM-powered chatbot. This chatbot will be able to hold a conversation and remember previous interactions.

Note that the chatbot we build here uses only the language model to hold a conversation. There are several related concepts that you may be looking for:

- Conversational RAG: Enable a chatbot experience over an external source of data
- Agents: Build a chatbot that can take actions

This tutorial covers the basics, which will be helpful for those two more advanced topics.

In [3]:
import os
from dotenv import load_dotenv

# Load environment variables from the .env file.
load_dotenv()

# Set Groq API key from environment variables.
# This is crucial for authenticating with Groq's models.
groq_api_key = os.getenv("GROQ_API_KEY")

# --- Optional: LangSmith Tracking Setup ---
# Uncomment and set these if you want to use LangSmith for tracing.
# os.environ["LANGCHAIN_API_KEY"] = os.getenv("LANGCHAIN_API_KEY")
# os.environ["LANGCHAIN_TRACING_V2"] = "true"
# os.environ["LANGCHAIN_PROJECT"] = os.getenv("LANGCHAIN_PROJECT")


# Import ChatGroq for using Groq's fast inference models.
from langchain_groq import ChatGroq

# Initialize a ChatGroq model.
# "Gemma2-9b-It" is hosted on Groq's cloud API, so nothing has to be downloaded
# locally (no `ollama pull` is required).
# The model name must match a model that Groq currently serves.
model = ChatGroq(model="Gemma2-9b-It", groq_api_key=groq_api_key)

# Display the initialized model object.
print("--- Initialized Groq Model ---")
print(model)

--- Initialized Groq Model ---
client=<groq.resources.chat.completions.Completions object at 0x115e96d50> async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x115e97750> model_name='Gemma2-9b-It'


### Basic LLM Interaction (Without Memory)

Let's start by showing how the LLM responds to individual messages without any built-in memory. It treats each invocation as a fresh start.

In [4]:
# Import HumanMessage for user input.
from langchain_core.messages import HumanMessage

# Invoke the model with a single HumanMessage.
response = model.invoke([HumanMessage(content="Hi , My name is Krish and I am a Chief AI Engineer")])

# Print the content of the model's response.
print("--- First Interaction Response (No Memory) ---")
print(response.content)

--- First Interaction Response (No Memory) ---
Hi Krish! It's nice to meet you. 

That's an impressive title! As a Chief AI Engineer, I imagine you have a lot of exciting projects on your plate. 

What kind of work are you currently focused on? I'm always eager to learn about new developments in the field of AI.



In [5]:
# Import AIMessage for representing AI's past responses.
from langchain_core.messages import AIMessage

# Invoke the model with a list of messages representing a short conversation.
# The model sees the entire list and responds based on that full context.
# However, it doesn't "remember" this exchange for *future* independent invocations.
response = model.invoke(
    [
        HumanMessage(content="Hi , My name is Krish and I am a Chief AI Engineer"),
        AIMessage(content="Hello Krish! It's nice to meet you. \n\nAs a Chief AI Engineer, what kind of projects are you working on these days? \n\nI'm always eager to learn more about the exciting work being done in the field of AI.\n"),
        HumanMessage(content="Hey What's my name and what do I do?")
    ]
)

# Print the content of the model's response.
print("--- Multi-Turn Interaction Response (Explicit Context) ---")
print(response.content)

--- Multi-Turn Interaction Response (Explicit Context) ---
You are Krish, and you are a Chief AI Engineer.  

Is there anything else you'd like to tell me about yourself or your work? 😊  




### Implementing Message History with `RunnableWithMessageHistory`

To make the chatbot truly stateful and remember past interactions automatically, we use LangChain's `RunnableWithMessageHistory`. This wrapper handles loading and saving chat messages to a history store for each session.

### Message History
We can use a Message History class to wrap our model and make it stateful. This will keep track of inputs and outputs of the model, and store them in some datastore. Future interactions will then load those messages and pass them into the chain as part of the input. Let's see how to use this!
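
The load, call, save cycle described above can be sketched in plain Python. This is a conceptual illustration only, not how `RunnableWithMessageHistory` is implemented; `chat`, `echo_llm`, and the tuple-based message format are assumptions made for the example.

```python
from typing import Callable

Message = tuple[str, str]  # (role, content)

# One list of messages per session ID.
store: dict[str, list[Message]] = {}


def chat(session_id: str, user_text: str, llm: Callable[[list[Message]], str]) -> str:
    """Load the session history, call the model, then save both new messages."""
    history = store.setdefault(session_id, [])
    reply = llm(history + [("human", user_text)])
    history.extend([("human", user_text), ("ai", reply)])
    return reply


def echo_llm(messages: list[Message]) -> str:
    """Hypothetical stand-in for a model: reports how many messages it saw."""
    return f"I can see {len(messages)} message(s)."


first = chat("chat1", "Hi, my name is Krish", echo_llm)
second = chat("chat1", "What's my name?", echo_llm)
other = chat("chat2", "What's my name?", echo_llm)
print(first, second, other, sep="\n")
```

The second call in `chat1` sees three messages because the first exchange was saved, while `chat2` starts from an empty history.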

In [7]:
# Install langchain_community if not already installed (for ChatMessageHistory)
# !pip install langchain_community

# Import ChatMessageHistory for storing messages in memory.
from langchain_community.chat_message_histories import ChatMessageHistory
# Import BaseChatMessageHistory for type hinting.
from langchain_core.chat_history import BaseChatMessageHistory
# Import RunnableWithMessageHistory for stateful chains.
from langchain_core.runnables.history import RunnableWithMessageHistory

# A simple in-memory store for session histories.
# In a real application, this would be a persistent store (e.g., Redis or a SQL database).
store = {}

# Define a function to get the chat history for a given session ID.
# If a session ID doesn't exist, it creates a new ChatMessageHistory.
def get_session_history(session_id: str) -> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]

# Wrap our 'model' (or any chain) with RunnableWithMessageHistory.
# This makes the model stateful by automatically managing message history.
with_message_history = RunnableWithMessageHistory(model, get_session_history)

# Define a configuration for the session.
# The 'configurable' key expects a dictionary with session_id.
config = {"configurable": {"session_id": "chat1"}}

# Invoke the stateful model for the first turn.
# The 'config' dictionary tells RunnableWithMessageHistory which session to use.
response = with_message_history.invoke(
    [HumanMessage(content="Hi , My name is Krish and I am a Chief AI Engineer")],
    config=config
)

# Print the content of the response.
print("--- Stateful Chat (Session 1, Turn 1) ---")
print(response.content)

--- Stateful Chat (Session 1, Turn 1) ---
Hello Krish,

It's nice to meet you!

Being a Chief AI Engineer is an exciting role. What kind of projects are you working on these days?  

I'm eager to learn more about your work and how I can be helpful.



In [8]:
# Invoke the stateful model again for the same session.
# Now, the model should remember the name "Krish" from the previous turn,
# because RunnableWithMessageHistory automatically prepends the history.
response = with_message_history.invoke(
    [HumanMessage(content="What's my name?")],
    config=config,
)

# Print the content of the response.
# It should correctly identify the name.
print("--- Stateful Chat (Session 1, Turn 2) ---")
print(response.content)

--- Stateful Chat (Session 1, Turn 2) ---
Your name is Krish. 

I remember that from our introduction! 😊  



### Managing Multiple Chat Sessions

`RunnableWithMessageHistory` allows you to manage separate conversation histories for different users or sessions by simply changing the `session_id` in the `config`.

In [9]:
# Switch to a new session by changing the session_id in the config.
config1 = {"configurable": {"session_id": "chat2"}}
response = with_message_history.invoke(
    [HumanMessage(content="Whats my name")],
    config=config1
)
response.content

"As an AI, I have no memory of past conversations and do not know your name. If you'd like to tell me your name, I'd be happy to use it! 😊\n"

In [10]:
# Now, in 'chat2', tell the model a new name.
response = with_message_history.invoke(
    [HumanMessage(content="Hey My name is John")],
    config=config1
)

# Print the content of the response.
print("--- Stateful Chat (Session 2, Turn 2 - New Name) ---")
print(response.content)

--- Stateful Chat (Session 2, Turn 2 - New Name) ---
Hi John! It's nice to meet you. 😊 

Is there anything I can help you with today?



In [11]:
# Ask "Whats my name" again in 'chat2'.
# It should now remember "John".
response = with_message_history.invoke(
    [HumanMessage(content="Whats my name")],
    config=config1
)

# Print the content of the response.
print("--- Stateful Chat (Session 2, Turn 3 - Recalls John) ---")
print(response.content)

--- Stateful Chat (Session 2, Turn 3 - Recalls John) ---
Your name is John!  

I remember you told me earlier. 😊  Anything else I can help you with?



### Enhancing Prompts with `ChatPromptTemplate` and `MessagesPlaceholder`

Prompt Templates help transform raw user input and conversation history into a format that the LLM can best utilize. We'll add a system message with instructions and use `MessagesPlaceholder` to insert the conversation history dynamically.

### Prompt templates
Prompt Templates help to turn raw user information into a format that the LLM can work with. In this case, the raw user input is just a message, which we are passing to the LLM. Let's now make that a bit more complicated. First, let's add in a system message with some custom instructions (but still taking messages as input). Next, we'll add in more input besides just the messages.
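
As a rough mental model, a chat prompt template fills in the variables of its system message and then splices the conversation messages in at the placeholder position. The sketch below shows that expansion in plain Python; `render_prompt` and the tuple-based messages are hypothetical and only approximate what `ChatPromptTemplate` produces.

```python
Message = tuple[str, str]  # (role, content)


def render_prompt(system_template: str, messages: list[Message], **variables: str) -> list[Message]:
    """Format the system message, then append the conversation messages."""
    system = ("system", system_template.format(**variables))
    return [system, *messages]


rendered = render_prompt(
    "You are a helpful assistant. Answer in {language}.",
    [("human", "Hi, my name is Krish")],
    language="Hindi",
)
for role, content in rendered:
    print(f"{role}: {content}")
```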

In [None]:
# Import ChatPromptTemplate and MessagesPlaceholder.
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

# Define a ChatPromptTemplate.
# The system message provides overall instructions.
# MessagesPlaceholder marks where the prompt template inserts the conversation messages.
# The variable name "messages" is conventional for message history.
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant. Answer all questions to the best of your ability."),
        MessagesPlaceholder(variable_name="messages")
    ]
)

# Create a simple chain: prompt | model.
# This chain now expects a dictionary input with a "messages" key.
chain = prompt | model

# Invoke the chain directly with a new message, formatted as a list of HumanMessages.
# Note: At this point, `chain` itself doesn't have memory; we're just testing the prompt.
response = chain.invoke({"messages": [HumanMessage(content="Hi My name is Krish")]})

# Print the response content.
print("--- Chain with Prompt Template (No History Yet) ---")
print(response.content)

--- Chain with Prompt Template (No History Yet) ---
Hi Krish, it's nice to meet you!

How can I help you today? 😊  




In [14]:
# Now, wrap this new 'chain' (which includes the prompt) with `RunnableWithMessageHistory`.
# This combines the structured prompting with automatic history management.
with_message_history = RunnableWithMessageHistory(chain, get_session_history)

# Define a new configuration for session 'chat3'.
config = {"configurable": {"session_id": "chat3"}}

# First turn in session 'chat3'.
response = with_message_history.invoke(
    [HumanMessage(content="Hi My name is Krish")], # Input is still a list of messages here
    config=config
)

# Print the response.
print("--- Stateful Chain with Prompt (Session 3, Turn 1) ---")
print(response.content)

# Second turn in session 'chat3'.
# It should now remember "Krish".
response = with_message_history.invoke(
    [HumanMessage(content="What's my name?")],
    config=config,
)

# Print the response content.
print("--- Stateful Chain with Prompt (Session 3, Turn 2) ---")
print(response.content)

--- Stateful Chain with Prompt (Session 3, Turn 1) ---
Hi Krish, it's nice to meet you!

How can I help you today?  😊  

--- Stateful Chain with Prompt (Session 3, Turn 2) ---
Your name is Krish. I remember! 😄 

Is there anything else I can help you with?



In [15]:
response = with_message_history.invoke(
    [HumanMessage(content="What's my name?")],
    config=config,
)

response.content

"Your name is Krish.  \n\nI'm ready for your next question! 😊  \n"

### Adding More Complexity to the Prompt (e.g., Language)

What if we want to pass *additional* information to the LLM besides just the chat history? For example, controlling the output language. We can add more placeholders to our prompt.

When using `RunnableWithMessageHistory` with a chain that has multiple input keys, you need to tell it which key contains the actual chat messages for history management. This is done with the `input_messages_key` parameter.
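
The sketch below illustrates why the key is needed once the input is a dictionary: the wrapper has to know which entry to extend with the stored history and which entries to leave alone. `with_history` and `fake_chain` are hypothetical names, and the logic is a simplification rather than the library's code.

```python
from typing import Callable

Message = tuple[str, str]  # (role, content)


def with_history(chain: Callable[[dict], str], store: dict, input_messages_key: str):
    """Wrap a dict-input chain so one chosen key is extended with stored history."""

    def invoke(inputs: dict, session_id: str) -> str:
        history = store.setdefault(session_id, [])
        new_messages = inputs[input_messages_key]
        # Only the chosen key changes; other keys (e.g. "language") pass through.
        merged = {**inputs, input_messages_key: history + new_messages}
        reply = chain(merged)
        history.extend(new_messages + [("ai", reply)])
        return reply

    return invoke


def fake_chain(inputs: dict) -> str:
    """Hypothetical chain: reports the language and the number of messages seen."""
    return f"({inputs['language']}) saw {len(inputs['messages'])} message(s)"


bot = with_history(fake_chain, {}, input_messages_key="messages")
turn1 = bot({"messages": [("human", "Hi, I am Krish")], "language": "Hindi"}, "chat4")
turn2 = bot({"messages": [("human", "whats my name?")], "language": "Hindi"}, "chat4")
print(turn1)
print(turn2)
```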

In [16]:
# Define a more complex prompt template.
# Now includes a {language} placeholder in the system message.
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability in {language}.",
        ),
        MessagesPlaceholder(variable_name="messages"), # This is where the chat history goes
    ]
)

# Create the chain: prompt | model.
chain = prompt | model

# Invoke the chain with both "messages" and "language" inputs.
# Note: This 'chain' still doesn't have history built-in at this stage.
response = chain.invoke(
    {"messages": [HumanMessage(content="Hi My name is Krish")], "language": "Hindi"}
)

# Print the response content.
print("--- Chain with Multiple Prompt Inputs (No History Yet) ---")
print(response.content)

--- Chain with Multiple Prompt Inputs (No History Yet) ---
नमस्ते कृष्णा! 

मुझे बहुत खुशी हो कि आप मुझसे बात कर रहे हैं। मैं आपकी मदद करने के लिए यहाँ हूँ। 

आप किस बारे में जानना चाहते हैं? 




Let's now wrap this more complicated chain in a Message History class. This time, because there are multiple keys in the input, we need to specify the correct key to use to save the chat history.

In [17]:
# Wrap this more complicated chain with RunnableWithMessageHistory.
# Crucially, we specify `input_messages_key="messages"` to tell it which input key
# holds the actual chat history that needs to be managed.
with_message_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="messages" # This tells the Runnable to look for messages here
)

# Define a new configuration for session 'chat4'.
config = {"configurable": {"session_id": "chat4"}}

# First turn in session 'chat4', providing both messages and language.
response = with_message_history.invoke(
    {"messages": [HumanMessage(content="Hi,I am Krish")], "language": "Hindi"},
    config=config
)

# Print the response content.
print("--- Stateful Chain (Session 4, Turn 1 - Hindi) ---")
print(response.content)

--- Stateful Chain (Session 4, Turn 1 - Hindi) ---
नमस्ते कृष! 

मेरे लिए खुशी है कि आप मुझसे बात कर रहे हैं। क्या मैं आपकी कोई मदद कर सकता हूँ? 



In [18]:
# Second turn in session 'chat4'.
# It should remember "Krish" and continue responding in Hindi.
response = with_message_history.invoke(
    {"messages": [HumanMessage(content="whats my name?")], "language": "Hindi"},
    config=config,
)

# Print the response content.
print("--- Stateful Chain (Session 4, Turn 2 - Hindi) ---")
print(response.content)

--- Stateful Chain (Session 4, Turn 2 - Hindi) ---
आपका नाम कृष है। 😊  



### Managing the Conversation History Size (Context Window Management)

For long conversations, the list of messages can grow very large, eventually exceeding the LLM's context window and increasing costs. It's essential to implement a strategy to limit the size of the messages passed to the LLM.

LangChain provides a `trim_messages` helper to manage history size by token count.

### Managing the Conversation History
One important concept to understand when building chatbots is how to manage conversation history. If left unmanaged, the list of messages will grow unbounded and potentially overflow the context window of the LLM. Therefore, it is important to add a step that limits the size of the messages you are passing in.

We can use the `trim_messages` helper to reduce how many messages we send to the model. The trimmer lets us specify how many tokens to keep, along with other parameters such as whether to always keep the system message and whether to allow partial messages.
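
To show what a "keep the last messages within a token budget" strategy does, here is a simplified plain-Python sketch. It is an approximation written for illustration, not the `trim_messages` implementation: `trim_last` is a hypothetical function, and `count_tokens` assumes one token per whitespace-separated word, which real tokenizers do not do.

```python
Message = tuple[str, str]  # (role, content)


def count_tokens(message: Message) -> int:
    """Hypothetical token counter: one token per whitespace-separated word."""
    return len(message[1].split())


def trim_last(messages: list[Message], max_tokens: int,
              include_system: bool = True, start_on: str = "human") -> list[Message]:
    """Keep the newest whole messages that fit the budget, plus the system message."""
    keep_system = include_system and bool(messages) and messages[0][0] == "system"
    system = messages[:1] if keep_system else []
    rest = messages[1:] if keep_system else list(messages)

    budget = max_tokens - sum(count_tokens(m) for m in system)
    kept: list[Message] = []
    for message in reversed(rest):  # walk from newest to oldest
        cost = count_tokens(message)
        if cost > budget:  # no partial messages: stop at the first one that does not fit
            break
        kept.insert(0, message)
        budget -= cost

    # Drop leading messages until the kept window starts with the requested role.
    while kept and kept[0][0] != start_on:
        kept.pop(0)
    return system + kept


history = [
    ("system", "you're a good assistant"),
    ("human", "hi! I'm bob"),
    ("ai", "hi!"),
    ("human", "I like vanilla ice cream"),
    ("ai", "nice"),
    ("human", "whats 2 + 2"),
    ("ai", "4"),
]
trimmed = trim_last(history, max_tokens=12)
for role, content in trimmed:
    print(f"{role}: {content}")
```

With a 12-token budget, the system message and the final question-and-answer pair survive. The "nice" reply would fit, but it is dropped so that the window starts on a human message.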

In [20]:
# Import the message classes and the trim_messages helper.
# (AIMessage and HumanMessage were imported earlier; they are repeated here for context.)
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage, trim_messages


# Initialize the trimmer.
trimmer = trim_messages(
    max_tokens=45,          # Maximum total number of tokens in the trimmed message list.
    strategy="last",        # Keep the most recent messages.
    token_counter=model,    # Count tokens with the model's own token-counting method.
    include_system=True,    # Always keep the system message.
    allow_partial=False,    # Keep only whole messages; never split one to fit.
    start_on="human"        # The kept history (after the system message) must start with a human message.
)

# Create a long list of sample messages.
messages = [
    SystemMessage(content="you're a good assistant"),
    HumanMessage(content="hi! I'm bob"),
    AIMessage(content="hi!"),
    HumanMessage(content="I like vanilla ice cream"),
    AIMessage(content="nice"),
    HumanMessage(content="whats 2 + 2"),
    AIMessage(content="4"),
    HumanMessage(content="thanks"),
    AIMessage(content="no problem!"),
    HumanMessage(content="having fun?"),
    AIMessage(content="yes!"),
]

# Invoke the trimmer with the long message list.
# This will return a truncated list of messages according to `max_tokens`.
trimmed_messages = trimmer.invoke(messages)

# Print the trimmed messages.
print("--- Trimmed Messages (max_tokens=45) ---")
for msg in trimmed_messages:
    print(msg)



--- Trimmed Messages (max_tokens=45) ---
content="you're a good assistant"
content='I like vanilla ice cream'
content='nice'
content='whats 2 + 2'
content='4'
content='thanks'
content='no problem!'
content='having fun?'
content='yes!'


In [21]:
# Import itemgetter for extracting values from a dictionary.
from operator import itemgetter
# Import RunnablePassthrough for passing through input.
from langchain_core.runnables import RunnablePassthrough

# Reconstruct the chain to include the trimmer *before* the prompt.
# RunnablePassthrough.assign allows us to modify the "messages" key in the input dictionary.
chain = (
    RunnablePassthrough.assign(messages=itemgetter("messages") | trimmer) # Apply trimmer to messages
    | prompt # Then send trimmed messages (and other inputs) to the prompt
    | model  # Then send prompt's output to the model
)

# Invoke the chain with the full message history plus a new question.
# The 'trimmer' inside the chain will automatically reduce the messages.
response = chain.invoke(
    {
        "messages": messages + [HumanMessage(content="What ice cream do I like?")],
        "language": "English"
    }
)

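# Print the response content.
# The model can recall the flavor only if that message survives trimming;
# adding the new question can push it out of the 45-token window.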
print("--- Response with Trimming (Recalls Ice Cream) ---")
print(response.content)

--- Response with Trimming (Recalls Ice Cream) ---
As an AI, I have no memory of past conversations and don't know your preferences.  

What's your favorite flavor of ice cream? 😊🍦



In [22]:
# Invoke the chain with another question from the trimmed history.
response = chain.invoke(
    {
        "messages": messages + [HumanMessage(content="what math problem did i ask?")],
        "language": "English",
    }
)

# Print the response content.
# It should still recall the math problem if it fits within the trimmed history.
print("--- Response with Trimming (Recalls Math Problem) ---")
print(response.content)

--- Response with Trimming (Recalls Math Problem) ---
You asked "What is 2 + 2?" 😊  



Let me know if you have any other math problems you'd like to try!  



### Wrapping the Trimmed Chain in `RunnableWithMessageHistory`

Finally, we wrap this more sophisticated chain (which includes the trimmer) with `RunnableWithMessageHistory` to get the full stateful chatbot experience, now with context window management.
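
One detail worth keeping in mind: in this arrangement the trimmer sits inside the chain, so it limits what the model sees on each call, while the session store still receives every message. The plain-Python sketch below illustrates that split. It is a simplification under stated assumptions: `invoke`, `fake_model`, and the count-based `keep_last` window are hypothetical, and the real trimmer works on tokens rather than message counts.

```python
Message = tuple[str, str]  # (role, content)

store: dict[str, list[Message]] = {}
seen_by_model: list[int] = []


def fake_model(messages: list[Message]) -> str:
    """Hypothetical model: records how many messages it received."""
    seen_by_model.append(len(messages))
    return "ok"


def invoke(session_id: str, new_messages: list[Message], keep_last: int = 4) -> str:
    history = store.setdefault(session_id, [])
    # Trimming applies to the combined list on every call...
    window = (history + new_messages)[-keep_last:]
    reply = fake_model(window)
    # ...but the store receives the untrimmed input and the reply.
    history.extend(new_messages + [("ai", reply)])
    return reply


for turn in range(1, 6):
    invoke("chat5", [("human", f"message {turn}")])

print("stored:", len(store["chat5"]))
print("seen by model per turn:", seen_by_model)
```

After five turns the store holds ten messages, yet the model never sees more than four at once. If the store itself must stay bounded, that has to be handled separately from the trimmer.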

In [24]:
# Wrap the chain (which now includes the trimmer) with RunnableWithMessageHistory.
# This creates a full-fledged stateful chatbot with history management.
with_message_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="messages", # Specify which input key contains the messages for history
)

# Define a new configuration for session 'chat5'.
config = {"configurable": {"session_id": "chat5"}}

# First interaction in 'chat5'.
# The input 'messages' includes the initial long history plus a new human message.
# The trimmer ensures that only the most recent portion is sent to the LLM.
response = with_message_history.invoke(
    {
        "messages": messages + [HumanMessage(content="whats my name?")],
        "language": "English",
    },
    config=config,
)

# Print the response content.
# The model can recall the name "bob" only if "hi! I'm bob" survives trimming;
# with max_tokens=45 that early message is dropped, as the trimmer output above shows.
print("--- Full Stateful Chat (Session 5, Turn 1 - Recalls Name) ---")
print(response.content)

--- Full Stateful Chat (Session 5, Turn 1 - Recalls Name) ---
As an AI, I have no memory of past conversations and don't know your name. What name would you like me to call you? 😊  



In [25]:
# Second interaction in 'chat5'.
# Now the 'messages' input contains only the current human message, because
# `RunnableWithMessageHistory` prepends the stored history automatically.
# The trimmer inside the chain then cuts the combined list down before it reaches the model.
response = with_message_history.invoke(
    {
        "messages": [HumanMessage(content="what math problem did i ask?")],
        "language": "English",
    },
    config=config,
)

# Print the response content.
# It should recall the math problem if it was part of the history kept by the trimmer.
print("--- Full Stateful Chat (Session 5, Turn 2 - Recalls Math Problem) ---")
print(response.content)

--- Full Stateful Chat (Session 5, Turn 2 - Recalls Math Problem) ---
As a helpful assistant, I have no memory of past conversations.  

Can you tell me the math problem you'd like help with? 😊  


