# Build a Chatbot
Adapted from [Build a Chatbot](https://python.langchain.com/v0.2/docs/tutorials/chatbot/)

## Overview
We'll go over an example of how to design and implement an LLM-powered chatbot. This chatbot will be able to have a conversation and remember previous interactions.

Note that the chatbot we build will only use the language model to have a conversation. There are several other related concepts that you may be looking for:

* [Conversational RAG](https://python.langchain.com/v0.2/docs/tutorials/qa_chat_history/): Enable a chatbot experience over an external source of data
* [Agents](https://python.langchain.com/v0.2/docs/tutorials/agents/): Build a chatbot that can take actions

This tutorial covers the basics, which will be helpful for those two more advanced topics, but feel free to skip directly to them if you prefer.

## Requirements
This is a non-comprehensive list of packages used in this notebook:
```toml
langchain==0.2.1
langchain-community==0.2.1
langchain-core==0.2.1
langchain-openai==0.1.7
langchain-text-splitters==0.2.0
python-dotenv==1.0.1
```

## Concepts
Here are a few of the high-level components we'll be working with:

* [Chat Models](https://python.langchain.com/v0.2/docs/concepts/#chat-models). The chatbot interface is based around messages rather than raw text, and therefore is best suited to Chat Models rather than text LLMs.
* [Prompt Templates](https://python.langchain.com/v0.2/docs/concepts/#prompt-templates), which simplify the process of assembling prompts that combine default messages, user input, chat history, and (optionally) additional retrieved context.
* [Chat History](https://python.langchain.com/v0.2/docs/concepts/#chat-history), which allows a chatbot to "remember" past interactions and take them into account when responding to followup questions.
* Debugging and tracing your application using [LangSmith](https://python.langchain.com/v0.2/docs/concepts/#langsmith)

We'll cover how to fit the above components together to create a powerful conversational chatbot.

In [1]:
import os

print(os.getcwd())

/home/ubuntu/jupyterhub/langchain


## Credentials
Load credentials from a `.env` file:

```toml
LANGCHAIN_API_KEY="<KEY>"
OPENAI_API_KEY="<KEY>"
```

In [2]:
from dotenv import load_dotenv

os.environ["LANGCHAIN_TRACING_V2"] = "true"

load_dotenv()
assert os.environ["LANGCHAIN_API_KEY"]
assert os.environ["OPENAI_API_KEY"]
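
If a variable is missing, `os.environ[...]` raises a `KeyError` before the `assert` is even evaluated, and only the first missing name is reported. A minimal sketch of a friendlier check follows; the helper name `missing_env_vars` is hypothetical and is not part of `python-dotenv` or LangChain.

```python
import os

def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment mapping."""
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]

required = ["LANGCHAIN_API_KEY", "OPENAI_API_KEY"]
missing = missing_env_vars(required)
if missing:
    # Report every missing credential at once instead of failing on the first.
    print("Missing credentials:", ", ".join(missing))
```

Passing a plain dict as `environ` makes the helper easy to try out without touching the real environment.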

## Quickstart
We will use the OpenAI `gpt-3.5-turbo` model

In [3]:
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-3.5-turbo")

Let's first use the model directly. ChatModels are instances of LangChain "Runnables", which means they expose a standard interface for interacting with them. To call the model, we can pass a list of messages to the `.invoke` method.

In [4]:
from langchain_core.messages import HumanMessage

model.invoke([HumanMessage(content="Hi! I'm Bob")])

AIMessage(content='Hello Bob! How can I assist you today?', response_metadata={'token_usage': {'completion_tokens': 10, 'prompt_tokens': 12, 'total_tokens': 22}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-f237eec3-202e-4bf7-9022-e8f68f6bfde8-0')

API Reference: [HumanMessage](https://api.python.langchain.com/en/latest/messages/langchain_core.messages.human.HumanMessage.html)

The model on its own does not have any concept of state. For example, if you ask a followup question:



In [5]:
model.invoke([HumanMessage(content="What's my name?")])

AIMessage(content="I'm sorry, I do not know your name as I am an AI assistant.", response_metadata={'token_usage': {'completion_tokens': 17, 'prompt_tokens': 12, 'total_tokens': 29}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-c58b80ee-41e3-4855-a90d-694e91401975-0')

We can see that it doesn't take the previous conversation turn into account, and cannot answer the question. This makes for a terrible chatbot experience!

To get around this, we need to pass the entire conversation history into the model. Let's see what happens when we do that:

In [6]:
from langchain_core.messages import AIMessage

model.invoke(
    [
        HumanMessage(content="Hi! I'm Bob"),
        AIMessage(content="Hello Bob! How can I assist you today?"),
        HumanMessage(content="What's my name?"),
    ]
)

AIMessage(content='Your name is Bob. How can I help you, Bob?', response_metadata={'token_usage': {'completion_tokens': 13, 'prompt_tokens': 35, 'total_tokens': 48}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-ff91a107-e320-4b5e-872c-694735c933a1-0')

API Reference: [AIMessage](https://api.python.langchain.com/en/latest/messages/langchain_core.messages.ai.AIMessage.html)

And now we can see that we get a good response!

This is the basic idea underpinning a chatbot's ability to interact conversationally. So how do we best implement this?
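
One way to picture the idea is a loop that appends each turn to a list and re-sends the whole list every time. The sketch below uses plain `(role, content)` tuples and a stand-in `fake_model` function, both hypothetical, so it runs without an API key. It is not how LangChain represents messages internally.

```python
import re

def fake_model(messages):
    """Stand-in for a chat model: answers name questions from the messages it is given."""
    name = None
    for role, content in messages:
        match = re.search(r"I'm (\w+)", content)
        if role == "human" and match:
            name = match.group(1)
    last = messages[-1][1]
    if "my name" in last:
        return f"Your name is {name}." if name else "I don't know your name."
    return "Hello!"

history = []  # grows by two entries (human + ai) per turn

def chat(user_text):
    history.append(("human", user_text))
    reply = fake_model(history)  # the model is shown every earlier turn
    history.append(("ai", reply))
    return reply

first = chat("Hi! I'm Bob")
second = chat("What's my name?")
stateless = fake_model([("human", "What's my name?")])  # no history supplied

print(second)     # answered from the accumulated history
print(stateless)  # the same question without history cannot be answered
```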

## Message History
We can use a Message History class to wrap our model and make it stateful. This will keep track of inputs and outputs of the model, and store them in some datastore. Future interactions will then load those messages and pass them into the chain as part of the input. Let's see how to use this!

First, let's make sure to install `langchain-community`, as we will be using an integration from it to store message history.
```bash
! pip3 install langchain-community
```

After that, we can import the relevant classes and set up a chain that wraps the model and adds message history. A key part here is the function we pass in as `get_session_history`. This function is expected to take a `session_id` and return a Message History object. The `session_id` distinguishes separate conversations and should be passed in as part of the config when calling the new chain (we'll show how to do that below).

In [12]:
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

store = {}

def get_session_history(session_id: str) -> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]

with_message_history = RunnableWithMessageHistory(model, get_session_history)

__API Reference__
* [ChatMessageHistory](https://api.python.langchain.com/en/latest/chat_history/langchain_core.chat_history.ChatMessageHistory.html)
* [BaseChatMessageHistory](https://api.python.langchain.com/en/latest/chat_history/langchain_core.chat_history.BaseChatMessageHistory.html)
* [RunnableWithMessageHistory](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.history.RunnableWithMessageHistory.html)

We now need to create a `config` that we pass into the runnable every time. This config contains information that is not part of the input directly, but is still useful. In this case, we want to include a `session_id`. This should look like:

In [14]:
config = {"configurable": {"session_id": "abc2"}}

response = with_message_history.invoke(
    [HumanMessage(content="Hi! I'm Bob")], # A sequence of BaseMessages
    config=config,
) # Each invocation returns a response but also updates the message history

response.content

'Hello Bob! Nice to meet you. How can I help you today?'

Because the history is stored under the `session_id` (here in the in-memory `store` dict), a later call with the same ID continues the same conversation:

In [15]:
config = {"configurable": {"session_id": "abc2"}}

response = with_message_history.invoke(
    [HumanMessage(content="What's my name?")],
    config=config,
)

response.content

'Your name is Bob.'

This is how we can support a chatbot having conversations with many users!
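
To see why the `session_id` keeps users apart, here is a rough sketch of the wrapper pattern in plain Python. The names `with_history` and `echo_model` are hypothetical; the code mirrors the idea behind `RunnableWithMessageHistory` as described above, not its actual implementation.

```python
from collections import defaultdict

store = defaultdict(list)  # session_id -> list of (role, content) tuples

def echo_model(messages):
    """Stand-in model: reports how many human messages it was shown."""
    humans = [content for role, content in messages if role == "human"]
    return f"I have seen {len(humans)} message(s) from you."

def with_history(model, session_store):
    """Wrap `model` so each call loads and then updates one session's history."""
    def invoke(user_text, config):
        session_id = config["configurable"]["session_id"]
        history = session_store[session_id]
        history.append(("human", user_text))
        reply = model(history)
        history.append(("ai", reply))
        return reply
    return invoke

bot = with_history(echo_model, store)
alice = {"configurable": {"session_id": "alice"}}
bob = {"configurable": {"session_id": "bob"}}

bot("hi", alice)
reply_alice = bot("still there?", alice)  # second turn in Alice's session
reply_bob = bot("hello", bob)             # first turn in Bob's session
```

Each session ID maps to its own list, so a call made with Bob's config never sees Alice's messages.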

Right now, all we've done is add a simple persistence layer around the model. We can start to make the chatbot more complicated and personalized by adding in a prompt template.

## Prompt templates


Prompt Templates help to turn raw user information into a format that the LLM can work with. In this case, the raw user input is just a message, which we are passing to the LLM. Let's now make that a bit more complicated. First, let's add in a system message with some custom instructions (but still taking messages as input). Next, we'll add in more input besides just the messages.

First, let's add in a system message. To do this, we will create a `ChatPromptTemplate`. We will utilize `MessagesPlaceholder` to pass all the messages in.

In [43]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability",
        ),
        MessagesPlaceholder(variable_name="messages"),  # Indicates which dict key contains the message history
    ]
)

chain = prompt | model

__API Reference__:
* [ChatPromptTemplate](https://api.python.langchain.com/en/latest/prompts/langchain_core.prompts.chat.ChatPromptTemplate.html)
* [MessagesPlaceholder](https://api.python.langchain.com/en/latest/prompts/langchain_core.prompts.chat.MessagesPlaceholder.html)

Note that this slightly changes the input type: rather than passing in a list of messages, we now pass in a dictionary whose `messages` key contains a list of messages.
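
Conceptually, the template puts the system message first and then splices the list found under the `messages` key into the placeholder's position. A simplified sketch in plain Python, with a hypothetical `render_prompt` helper standing in for the template (the real `ChatPromptTemplate` produces message objects rather than tuples):

```python
def render_prompt(system_text, inputs):
    """Build the final message list: system message first, then inputs["messages"]."""
    return [("system", system_text)] + list(inputs["messages"])

rendered = render_prompt(
    "You are a helpful assistant. Answer all questions to the best of your ability",
    {"messages": [("human", "hi! I'm bob")]},
)
for role, content in rendered:
    print(f"{role}: {content}")
```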

In [44]:
response = chain.invoke(
    {
        "messages": [ HumanMessage(content="hi! I'm bob") ]
    }
)

response.content

'Hello Bob! How can I assist you today?'

We can now wrap this in the same Message History object as before:



In [45]:
with_message_history = RunnableWithMessageHistory(chain, get_session_history)

response = with_message_history.invoke(
    [HumanMessage(content="Hi! I'm Jim")],
    config=config,
)

response.content

'Hello Jim! How can I assist you today?'

Awesome! Let's now make our prompt a little bit more complicated. Let's assume that the prompt template now looks something like this:

In [36]:
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability in {language}.",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

chain = prompt | model

Note that we have added a new `language` input to the prompt. We can now invoke the chain and pass in a language of our choice.

In [39]:
response = chain.invoke(
    {"messages": [HumanMessage(content="hi! I'm bob")],
     "language": "German"}
)

response.content

'Hallo Bob! Wie kann ich Ihnen helfen?'

Let's now wrap this more complicated chain in a Message History class. This time, because there are multiple keys in the input, we need to specify the correct key to use to save the chat history.

In [40]:
with_message_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="messages", # Tells the Runnable in which key to save the chat history
)

config = {"configurable": {"session_id": "abc11"}}

response = with_message_history.invoke(
    {"messages": [HumanMessage(content="hi! I'm todd")], "language": "German"},
    config=config,
)

response.content

'Hallo Todd! Wie kann ich Ihnen helfen?'

In [42]:
response = with_message_history.invoke(
    {"messages": [HumanMessage(content="whats my name?")], "language": "German"},
    config=config,
)

response.content

'Ihr Name ist Todd.'

## Managing Conversation History
One important concept to understand when building chatbots is how to manage conversation history. If left unmanaged, the list of messages will grow unbounded and potentially overflow the context window of the LLM. Therefore, it is important to add a step that limits the size of the messages you are passing in.

Importantly, you will want to do this BEFORE the prompt template but AFTER you load previous messages from Message History.

We can do this by adding a simple step in front of the prompt that modifies the messages key appropriately, and then wrap that new chain in the Message History class. First, let's define a function that will modify the messages passed in. Let's make it so that it selects the `k` most recent messages. We can then create a new chain by adding that at the start.

In [48]:
from langchain_core.runnables import RunnablePassthrough

def filter_messages(messages, k=10):
    return messages[-k:]

chain = (
    RunnablePassthrough.assign(messages=lambda x: filter_messages(x["messages"])) # Filter messages before passing it into prompt template
    | prompt
    | model
)

__API Reference__: [RunnablePassthrough](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.passthrough.RunnablePassthrough.html)

Let's now try it out! If we create a list of messages more than 10 messages long, we can see that it no longer remembers information in the early messages.

In [49]:
# A list of 10 messages
messages = [
    HumanMessage(content="hi! I'm bob"),
    AIMessage(content="hi!"),
    HumanMessage(content="I like vanilla ice cream"),
    AIMessage(content="nice"),
    HumanMessage(content="whats 2 + 2"),
    AIMessage(content="4"),
    HumanMessage(content="thanks"),
    AIMessage(content="no problem!"),
    HumanMessage(content="having fun?"),
    AIMessage(content="yes!"),
]

# Send in an 11th
response = chain.invoke(
    {
        "messages": messages + [HumanMessage(content="what's my name?")],
        "language": "English",
    }
)
response.content

"I'm sorry, I don't have access to that information."

But if we ask about information that is within the last ten messages, it still remembers it:

In [50]:
response = chain.invoke(
    {
        "messages": messages + [HumanMessage(content="what's my fav ice cream")],
        "language": "English",
    }
)
response.content

'Vanilla ice cream'

Let's now wrap this in the Message History class:

In [53]:
with_message_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="messages",
)

config = {"configurable": {"session_id": "abc20"}}

response = with_message_history.invoke(
    {
        "messages": messages + [HumanMessage(content="whats my name?")], # Sends in 10 messages and the 11th
        "language": "English",
    },
    config=config,
)

response.content
# At this point, with_message_history has 12 messages in it

"I'm sorry, I don't know your name as our conversation is anonymous. How can I assist you today?"

There are now two new messages in the chat history. This means that even more information that used to be accessible in our conversation history is no longer available!
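
The message counts behind this can be checked without calling a model. The sketch below replays the windowing with plain strings. It assumes, as the comment in the cell above states, that the stored history holds 12 messages after the first call, and that stored history is placed ahead of the new input; `"(model reply)"` is a placeholder for the model's actual answer.

```python
def filter_messages(messages, k=10):
    """Keep only the k most recent messages."""
    return messages[-k:]

earlier = [
    "hi! I'm bob", "hi!",
    "I like vanilla ice cream", "nice",
    "whats 2 + 2", "4",
    "thanks", "no problem!",
    "having fun?", "yes!",
]

# First call: 10 earlier messages plus the new question = 11, so the oldest is dropped.
first_window = filter_messages(earlier + ["whats my name?"])

# After that call the history holds those 11 messages plus the model's reply = 12.
history = earlier + ["whats my name?", "(model reply)"]

# Next call: 12 stored messages plus the new question = 13, so the oldest 3 are dropped.
second_window = filter_messages(history + ["whats my favorite ice cream?"])

print("hi! I'm bob" in first_window)                # False
print("I like vanilla ice cream" in first_window)   # True
print("I like vanilla ice cream" in second_window)  # False
```

The ice cream message is the third-oldest entry, so it falls out of the window as soon as the history grows past 12 messages.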

In [54]:
response = with_message_history.invoke(
    {
        "messages": [HumanMessage(content="whats my favorite ice cream?")],
        "language": "English",
    },
    config=config,
)

response.content

"I'm sorry, I don't have that information. Would you like some recommendations for ice cream flavors instead?"

## Streaming
Now we've got a functional chatbot. However, one really important UX consideration for chatbot applications is streaming. LLMs can sometimes take a while to respond, so to improve the user experience most applications stream back each token as it is generated. This allows the user to see progress.

It's actually super easy to do this!

All chains expose a `.stream` method, and ones that use message history are no different. We can simply use that method to get back a streaming response.

In [56]:
config = {"configurable": {"session_id": "abc15"}}
for r in with_message_history.stream(
    {
        "messages": [HumanMessage(content="hi! I'm todd. tell me a joke")],
        "language": "English",
    },
    config=config,
):
    print(r.content, end="|")

|Hi| Todd|!| Here|'s| another| joke| for| you|:

|Why| did| the| scare|crow| win| an| award|?

|Because| he| was| outstanding| in| his| field|!||
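
To make the output format above concrete, here is a small sketch of the consumer side that uses a plain generator in place of a model. `fake_stream` is a hypothetical stand-in; it yields empty chunks at the start and end on the assumption that this is why the printed output above begins with `|` and ends with `||`.

```python
def fake_stream(text):
    """Yield text piece by piece, with empty chunks at the start and end."""
    yield ""
    for i, word in enumerate(text.split(" ")):
        yield word if i == 0 else " " + word
    yield ""

chunks = []
for chunk in fake_stream("Hi Todd! Here's a joke"):
    print(chunk, end="|", flush=True)  # show each piece as soon as it arrives
    chunks.append(chunk)

full_reply = "".join(chunks)  # the pieces concatenate back into the whole reply
```

A real application would typically render each chunk in the UI as it arrives and keep the joined text as the final message.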