# Build a Chatbot

## Overview

This notebook walks through an example of how to design and implement an LLM-powered chatbot that can hold a conversation and remember previous interactions.


## Concepts

Here are a few of the high-level components we'll be working with:

- [`Chat Models`](https://python.langchain.com/v0.2/docs/concepts/#chat-models). The chatbot interface is based around messages rather than raw text, and therefore is best suited to Chat Models rather than text LLMs.
- [`Prompt Templates`](https://python.langchain.com/v0.2/docs/concepts/#prompt-templates), which simplify the process of assembling prompts that combine default messages, user input, chat history, and (optionally) additional retrieved context.
- [`Chat History`](https://python.langchain.com/v0.2/docs/concepts/#chat-history), which allows a chatbot to "remember" past interactions and take them into account when responding to follow-up questions.

We'll cover how to fit the above components together to create a conversational chatbot.


## Quickstart

First up, let's learn how to use a language model by itself. LangChain supports many different language models that we can use interchangeably. We will be using Google's Gemini model in this notebook, since it offers a free tier.

In [1]:
from langchain_google_genai import ChatGoogleGenerativeAI

model = ChatGoogleGenerativeAI(model="gemini-1.5-flash")

Let's first use the model directly. `ChatModel`s are instances of LangChain "Runnables", which means they expose a standard interface for interacting with them. To call the model, we can pass a list of messages to the `.invoke` method.

In [2]:
from langchain_core.messages import HumanMessage

response = model.invoke([HumanMessage(content="Hi! I'm Bob")])
print(response.content)

Hi Bob! It's nice to meet you. What can I do for you today? 



The model on its own does not have any concept of state. For example, if you ask a follow-up question:

In [3]:
response = model.invoke([HumanMessage(content="What's my name?")])
print(response.content)

As an AI, I do not have access to personal information about you, including your name. 

If you'd like to tell me your name, I'd be happy to learn it! 



We can see that it doesn't take the previous conversation turn into account, and so it cannot answer the question.
This makes for a terrible chatbot experience!

To get around this, we need to pass the entire conversation history into the model. Let's see what happens when we do that:

In [4]:
from langchain_core.messages import AIMessage

response = model.invoke([
    HumanMessage(content="Hi! I'm Bob"),
    AIMessage(content="Hello Bob! How can I assist you today?"),
    HumanMessage(content="What's my name?"),
])

print(response.content)

You said your name is Bob! 



And now we can see that we get a good response!
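
The manual approach can be pictured without any LLM at all. The sketch below is a stand-in, not LangChain code: `toy_model` is a hypothetical stateless function that only sees the messages handed to it, and the `(role, content)` tuples are an assumed simplification of message objects.

```python
# Hypothetical stand-in for a stateless chat model: it can only use
# what is in the `messages` list passed to this single call.
def toy_model(messages):
    for role, content in messages:
        if role == "human" and content.startswith("Hi! I'm "):
            name = content[len("Hi! I'm "):]
            return f"Your name is {name}."
    return "I don't know your name."


history = []  # the caller, not the model, owns the conversation state


def chat(user_text):
    history.append(("human", user_text))
    reply = toy_model(history)  # resend the full history on every turn
    history.append(("ai", reply))
    return reply


# Without history the model cannot answer.
print(toy_model([("human", "What's my name?")]))
# With the accumulated history it can.
chat("Hi! I'm Bob")
print(chat("What's my name?"))
```

The point of the sketch is that the state lives in the caller's `history` list, which grows by two entries per turn.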

This is the basic idea underpinning a chatbot's ability to interact conversationally.
So how do we best implement this?

## Message History

We can use a Message History class to wrap our model and make it stateful.
This will keep track of inputs and outputs of the model, and store them in some datastore.
Future interactions will then load those messages and pass them into the chain as part of the input.
Let's see how to use this!

First, import the relevant classes and set up a chain that wraps the model and adds message history. A key part here is the function we pass in as `get_session_history`. This function is expected to take a `session_id` and return a Message History object. The `session_id` distinguishes separate conversations and should be passed in as part of the config when calling the new chain.

In [11]:
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

store = {}

def get_session_history(session_id: str) -> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]


with_message_history = RunnableWithMessageHistory(model, get_session_history)
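
Conceptually, a history wrapper does roughly three things on every call: load the stored messages for the session, call the model with the history plus the new input, and save both the input and the reply. The following is a stdlib-only sketch of that idea, not LangChain's implementation; `with_history`, `get_history`, and `counting_model` are hypothetical names, and messages are simplified to `(role, content)` tuples.

```python
from typing import Callable, Dict, List, Tuple

Message = Tuple[str, str]  # (role, content); a simplification for this sketch

sessions: Dict[str, List[Message]] = {}


def get_history(session_id: str) -> List[Message]:
    # Create an empty history the first time a session id is seen.
    return sessions.setdefault(session_id, [])


def with_history(model: Callable[[List[Message]], str]):
    def invoke(new_messages: List[Message], session_id: str) -> str:
        history = get_history(session_id)
        reply = model(history + new_messages)  # 1. load and prepend history
        history.extend(new_messages)           # 2. save the new inputs
        history.append(("ai", reply))          # 3. save the model output
        return reply
    return invoke


# Hypothetical model that just reports how many messages it was given.
def counting_model(messages: List[Message]) -> str:
    return f"I can see {len(messages)} message(s)."


bot = with_history(counting_model)
print(bot([("human", "Hi! I'm Bob")], session_id="abc2"))      # 1 message
print(bot([("human", "What's my name?")], session_id="abc2"))  # 3 messages
print(bot([("human", "What's my name?")], session_id="abc3"))  # 1 message
```

Because each session id maps to its own list, a new id starts from an empty history while an existing id picks up where it left off.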

We now need to create a `config` that we pass into the runnable every time. This config contains information that is not part of the input directly, but is still useful. In this case, we want to include a `session_id`. This should look like:

In [12]:
config = {"configurable": {"session_id": "abc2"}}

In [13]:
response = with_message_history.invoke(
    [HumanMessage(content="Hi! I'm Bob")],
    config=config,
)

print(response.content)

Hi Bob! It's nice to meet you. What can I do for you today? 



In [15]:
response = with_message_history.invoke(
    [HumanMessage(content="What's my name?")],
    config=config,
)

print(response.content)

You told me your name is Bob! 



Great! Our chatbot now remembers things about us. If we change the config to reference a different `session_id`, we can see that it starts the conversation fresh.

In [16]:
# Change session_id from `abc2` -> `abc3`
config = {"configurable": {"session_id": "abc3"}}

response = with_message_history.invoke(
    [HumanMessage(content="What's my name?")],
    config=config,
)

print(response.content)

As a large language model, I do not have access to personal information like your name. I can only process and generate text based on the information I have been trained on. 

To tell me your name, you can simply type it out! 



However, we can always go back to the original conversation, since each session's history is kept in the `store` dictionary for as long as the process runs:

In [18]:
config = {"configurable": {"session_id": "abc2"}}

response = with_message_history.invoke(
    [HumanMessage(content="What's my name?")],
    config=config,
)

print(response.content)

You told me your name is Bob. 😊  Is there anything else I can help you with? 



This is how we can support a chatbot having conversations with many users!

Right now, all we've done is add a simple persistence layer around the model. We can start to make the chatbot more complicated and personalized by adding in a prompt template.

## Prompt templates

Prompt Templates help to turn raw user information into a format that the LLM can work with. In this case, the raw user input is just a message, which we are passing to the LLM. Let's now make that a bit more complicated. First, let's add in a system message with some custom instructions (but still taking messages as input). Next, we'll add in more input besides just the messages.

To add the system message, we create a `ChatPromptTemplate` and use `MessagesPlaceholder` to pass all the messages in.

In [19]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability.",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

chain = prompt | model

Note that this slightly changes the input type: rather than passing in a list of messages, we now pass in a dictionary whose `messages` key contains a list of messages.
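
To make the role of the placeholder concrete, here is a rough stdlib-only sketch of what rendering such a template amounts to. It is an illustration under simplifying assumptions, not how `ChatPromptTemplate` is implemented: the `template` list format and the `render` function are hypothetical.

```python
# Assumed simplification: a template is a list whose items are either a
# fixed (role, text) message or the name of a placeholder variable.
template = [
    ("system", "You are a helpful assistant."),
    "messages",  # placeholder: replaced by the whole list under this key
]


def render(template, inputs):
    rendered = []
    for item in template:
        if isinstance(item, str):
            rendered.extend(inputs[item])  # splice in the supplied messages
        else:
            rendered.append(item)
    return rendered


prompt_messages = render(template, {"messages": [("human", "Hi! I'm Bob")]})
for role, text in prompt_messages:
    print(f"{role}: {text}")
```

The dictionary input exists so that the template can look up each variable by name; the list under `messages` is spliced in after the fixed system message.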

In [20]:
response = chain.invoke(
    {
        "messages": [HumanMessage(content="Hi! I'm Bob")]
    }
)

print(response.content)

Hi Bob! 👋  Nice to meet you. What can I do for you today? 😊 



We can now wrap this in the same Message History object as before.

In [21]:
with_message_history = RunnableWithMessageHistory(chain, get_session_history)

In [22]:
config = {"configurable": {"session_id": "abc5"}}

In [23]:
response = with_message_history.invoke(
    [HumanMessage(content="Hi! I'm Alice")],
    config=config,
)

print(response.content)

Hi Alice! 👋 It's nice to meet you. 😊 What can I do for you today? 



In [24]:
response = with_message_history.invoke(
    [HumanMessage(content="What's my name?")],
    config=config,
)

print(response.content)

You said your name is Alice! 😊 



Awesome! Let's now make our prompt a little bit more complicated. Let's assume that the prompt template now looks something like this:

In [25]:
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability in {language}.",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

chain = prompt | model

Note that we have added a new `language` input to the prompt. We can now invoke the chain and pass in a language of our choice.

In [30]:
response = chain.invoke(
    {
        "messages": [HumanMessage(content="Hi! I'm Bob")],
        "language": "French",
    }
)

print(response.content)

Bonjour Bob ! 👋  Ravi de faire ta connaissance. 😊 



Let's now wrap this more complicated chain in a Message History class. This time, because there are multiple keys in the input, we need to specify the correct key to use to save the chat history.
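
One way to see why the key matters: with a dictionary input, the wrapper has to know which value holds the chat messages so that history is merged only there. The sketch below illustrates that idea with a hypothetical `add_history` helper and `(role, content)` tuples; it is not LangChain's code.

```python
def add_history(inputs, history, input_messages_key):
    # Copy so the caller's dictionary is left unchanged.
    merged = dict(inputs)
    # Only the value under `input_messages_key` is treated as chat messages;
    # every other key (such as "language") passes through untouched.
    merged[input_messages_key] = history + inputs[input_messages_key]
    return merged


history = [("human", "Hi! I'm Alice"), ("ai", "¡Hola Alice!")]
inputs = {"messages": [("human", "What is my name?")], "language": "Spanish"}

merged = add_history(inputs, history, input_messages_key="messages")
print(merged["language"])       # Spanish
print(len(merged["messages"]))  # 3
```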

In [31]:
with_message_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="messages",
)

In [32]:
config = {"configurable": {"session_id": "abc11"}}

In [33]:
response = with_message_history.invoke(
    {
        "messages": [HumanMessage(content="Hi! I'm Alice")], 
        "language": "Spanish"
    },
    config=config,
)

print(response.content)

¡Hola Alice! ¡Encantado de conocerte! 😄 ¿Qué tal estás hoy? 



In [35]:
response = with_message_history.invoke(
    {
        "messages": [HumanMessage(content="What is my name?")], 
        "language": "Spanish"
    },
    config=config,
)

print(response.content)

¡Tu nombre es Alice! 😊  ¿Hay algo más que quieras saber? 



## Managing Conversation History

One important concept to understand when building chatbots is how to manage conversation history. If left unmanaged, the list of messages will grow unbounded and potentially overflow the context window of the LLM. Therefore, it is important to add a step that limits the size of the messages you are passing in.

**Importantly, you will want to do this BEFORE the prompt template but AFTER you load previous messages from Message History.**
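
The ordering can be illustrated with three plain functions standing in for the real steps. This is only a sketch: `load_history`, `trim`, and `render_prompt` are hypothetical helpers, and messages are simplified to `(role, content)` tuples.

```python
def load_history(stored, new_messages):
    return stored + new_messages


def trim(messages, k):
    return messages[-k:]


def render_prompt(messages):
    return [("system", "You are a helpful assistant.")] + messages


stored = [("human", f"old message {i}") for i in range(20)]
new = [("human", "latest question")]

# Recommended order: load, then trim, then build the prompt.
good = render_prompt(trim(load_history(stored, new), k=10))

# Trimming after the prompt is built can cut off the system message.
bad = trim(render_prompt(load_history(stored, new)), k=10)

# Trimming before the history is loaded leaves the stored messages uncut.
too_early = render_prompt(load_history(stored, trim(new, k=10)))

print(len(good), good[0][0])  # 11 system
print(len(bad), bad[0][0])    # 10 human
print(len(too_early))         # 22
```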

We can do this by adding a simple step in front of the prompt that modifies the `messages` key appropriately, and then wrap that new chain in the Message History class. First, let's define a function that will modify the messages passed in. Let's make it so that it selects the `k` most recent messages. We can then create a new chain by adding that at the start.

In [88]:
from langchain_core.runnables import RunnablePassthrough


# Return the `k` most recent messages (by default, the 10 most recent).
def filter_messages(messages, k=10):
    return messages[-k:]


chain = (
    RunnablePassthrough.assign(
        messages=lambda x: filter_messages(x["messages"], k=10)
    )
    | prompt
    | model
)

Let's now try it out! If we create a list of messages more than 10 messages long, we can see that it no longer remembers information in the early messages.

In [89]:
messages = [
    HumanMessage(content="Hi! I'm Bob"),
    AIMessage(content="hi!"),
    HumanMessage(content="My favourite ice cream flavour is vanilla."),
    AIMessage(content="nice"),
    HumanMessage(content="whats 2 + 2"),
    AIMessage(content="4"),
    HumanMessage(content="thanks"),
    AIMessage(content="no problem!"),
    HumanMessage(content="having fun?"),
    AIMessage(content="yes!"),
]

len(messages)

10

In [90]:
response = chain.invoke(
    {
        # Add an additional message to get 11 messages in total.
        "messages": messages + [HumanMessage(content="What is my name?")],
        "language": "English",
    }
)

print(response.content)

As an AI, I don't have access to personal information like your name. I can only remember things from our current conversation.  

What would you like me to call you? 😊 



But if we ask about information that is within the last 10 messages, it still remembers it.

In [91]:
response = chain.invoke(
    {
        "messages": messages + [HumanMessage(content="What is my favourite ice cream flavour?")],
        "language": "English",
    }
)

print(response.content)

You told me your favorite ice cream flavor is vanilla! 😊 



Let's now wrap this in the Message History.

In [92]:
with_message_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="messages",
)

config = {"configurable": {"session_id": "abc20"}}

In [93]:
response = with_message_history.invoke(
    {
        "messages": messages + [HumanMessage(content="What is my name?")],
        "language": "English",
    },
    config=config,
)

print(response.content)

As an AI, I don't have access to personal information like your name.  If you'd like to tell me your name, I'd be happy to know! 😊 



There are now two new messages in the chat history: the question we just asked and the model's reply. This means that even more information that used to be accessible in our conversation history is no longer available!
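
The window arithmetic can be checked by hand. The sketch below uses plain strings as labels for the messages and assumes the history wrapper stores every input message plus the model's reply; the `"ai: (reply)"` entry is a placeholder for whatever the model answered.

```python
labels = [
    "human: Hi! I'm Bob", "ai: hi!",
    "human: My favourite ice cream flavour is vanilla.", "ai: nice",
    "human: whats 2 + 2", "ai: 4",
    "human: thanks", "ai: no problem!",
    "human: having fun?", "ai: yes!",
]
k = 10

# First wrapped call: 10 seeded messages plus the name question.
first_call = labels + ["human: What is my name?"]
window = first_call[-k:]
print(any("vanilla" in m for m in window))  # True: still inside the window

# Assumption: the history now also holds the reply, and the next call
# appends one more question.
second_call = first_call + ["ai: (reply)", "human: whats my favorite ice cream?"]
window = second_call[-k:]
print(len(second_call))                     # 13
print(any("vanilla" in m for m in window))  # False: pushed out of the window
```

With 13 messages and `k=10`, the three oldest are dropped, and the vanilla message is the third one.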

In [94]:
response = with_message_history.invoke(
    {
        "messages": [HumanMessage(content="whats my favorite ice cream?")],
        "language": "English",
    },
    config=config,
)

print(response.content)

I don't have access to personal information like your favorite ice cream flavor.  Tell me - what's your favorite ice cream?  🍨 



## Streaming

Now we've got a functional basic chatbot. However, one *really* important UX consideration for chatbot applications is streaming. LLMs can sometimes take a while to respond, so to improve the user experience most applications stream back each token as it is generated. This allows the user to see progress.

All chains expose a `.stream` method, and ones that use message history are no different. We can simply use that method to get back a streaming response.

In [103]:
config = {"configurable": {"session_id": "abc15"}}

for chunk in with_message_history.stream(
    {
        "messages": [HumanMessage(content="Tell me a joke.")],
        "language": "English",
    },
    config=config,
):
    print(chunk.content, end='|', flush=True)

Why| don't they play poker in the jungle?

Because there are too many| cheetahs! 
|
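
In an application you usually want both the incremental display and the complete reply once streaming finishes. The sketch below shows that pattern with a hypothetical `fake_stream` generator standing in for a streaming model; the chunk size and delay are arbitrary choices for illustration.

```python
import time
from typing import Iterator


def fake_stream(text: str, chunk_size: int = 8) -> Iterator[str]:
    # Hypothetical stand-in for a streaming model: yields the reply in pieces.
    for start in range(0, len(text), chunk_size):
        time.sleep(0.01)  # simulate generation latency
        yield text[start:start + chunk_size]


pieces = []
for chunk in fake_stream("Why don't they play poker in the jungle?"):
    print(chunk, end="", flush=True)  # show progress immediately
    pieces.append(chunk)              # keep chunks to rebuild the full reply
print()

full_reply = "".join(pieces)
```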