# Build a Chatbot

## Overview

We'll go over an example of how to design and implement an LLM-powered chatbot. 
This chatbot will be able to have a conversation and remember previous interactions.


Note that the chatbot we build here uses only the language model to hold a conversation.
There are several other related concepts that you may be looking for:

- [Conversational RAG](TODO): Enable a chatbot experience over an external source of data
- [Agents](/docs/tutorials/agents): Build a chatbot that can take actions

This tutorial covers the basics, which will be helpful for those two more advanced topics, but feel free to skip directly to them if you prefer.


## Concepts

Here are a few of the high-level components we'll be working with:

- [`Chat Models`](/docs/concepts/#chat-models). The chatbot interface is based around messages rather than raw text, and therefore is best suited to Chat Models rather than text LLMs.
- [`Prompt Templates`](/docs/concepts/#prompt-templates), which simplify the process of assembling prompts that combine default messages, user input, chat history, and (optionally) additional retrieved context.
- [`Chat History`](/docs/concepts/#chat-history), which allows a chatbot to "remember" past interactions and take them into account when responding to follow-up questions.

We'll cover how to fit the above components together to create a powerful conversational chatbot.

## Setup

### Jupyter Notebook

This guide (and most of the other guides in the documentation) uses [Jupyter notebooks](https://jupyter.org/) and assumes the reader does as well. Jupyter notebooks are well suited to learning how to work with LLM systems because things can often go wrong (unexpected output, API down, etc.), and going through guides in an interactive environment is a great way to better understand them.

You do not NEED to go through the guide in a Jupyter Notebook, but it is recommended. See [here](https://jupyter.org/install) for instructions on how to install.

### Installation

To install LangChain run:

```{=mdx}
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import CodeBlock from "@theme/CodeBlock";

<Tabs>
  <TabItem value="pip" label="Pip" default>
    <CodeBlock language="bash">pip install langchain</CodeBlock>
  </TabItem>
  <TabItem value="conda" label="Conda">
    <CodeBlock language="bash">conda install langchain -c conda-forge</CodeBlock>
  </TabItem>
</Tabs>

```


For more details, see our [Installation guide](/docs/get_started/installation).

### LangSmith

Many of the applications you build with LangChain will contain multiple steps with multiple LLM calls.
As these applications get more and more complex, it becomes crucial to be able to inspect what exactly is going on inside your chain or agent.
The best way to do this is with [LangSmith](https://smith.langchain.com).

After you sign up at the link above, make sure to set your environment variables to start logging traces:

```shell
export LANGCHAIN_TRACING_V2="true"
export LANGCHAIN_API_KEY="..."
```

Or, if in a notebook, you can set them with:

```python
import getpass
import os

os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = getpass.getpass()
```

## Quickstart

First up, let's learn how to use a language model by itself. LangChain supports many different language models that you can use interchangeably. Select the one you want to use below!

```{=mdx}
import ChatModelTabs from "@theme/ChatModelTabs";

<ChatModelTabs openaiParams={`model="gpt-3.5-turbo"`} />
```

In [1]:
# | output: false
# | echo: false

from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-3.5-turbo")

Let's first use the model directly. `ChatModel`s are instances of LangChain "Runnables", which means they expose a standard interface for interacting with them. To call the model, we can pass a list of messages to the `.invoke` method.

In [2]:
from langchain_core.messages import HumanMessage

model.invoke(
    [HumanMessage(content="Hi! I'm Bob")]
)

AIMessage(content='Hello Bob! How can I assist you today?', response_metadata={'token_usage': {'completion_tokens': 10, 'prompt_tokens': 12, 'total_tokens': 22}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': 'fp_c2295e73ad', 'finish_reason': 'stop', 'logprobs': None}, id='run-cbdc1c71-60b2-4c8c-b608-1becf643d6c1-0')

The model on its own does not have any concept of state. For example, if you ask a follow-up question:

In [3]:
model.invoke([HumanMessage(content="What's my name?")])

AIMessage(content="I'm sorry, I am an AI assistant and do not have access to personal information such as your name.", response_metadata={'token_usage': {'completion_tokens': 22, 'prompt_tokens': 12, 'total_tokens': 34}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': 'fp_c2295e73ad', 'finish_reason': 'stop', 'logprobs': None}, id='run-49ada732-c04e-4bec-aaf3-607623edfb56-0')

Let's take a look at the example LangSmith trace here: https://smith.langchain.com/public/5c21cb92-2814-4119-bae9-d02b8db577ac/r

We can see that it doesn't take the previous conversation turn into account, and so cannot answer the question.
This makes for a terrible chatbot experience!

To get around this, we need to pass the entire conversation history into the model. Let's see what happens when we do that:

In [4]:
from langchain_core.messages import AIMessage

model.invoke(
    [
        HumanMessage(content="Hi! I'm Bob"),
        AIMessage(content="Hello Bob! How can I assist you today?"),
        HumanMessage(content="What's my name?"),
    ]
)

AIMessage(content='Your name is Bob.', response_metadata={'token_usage': {'completion_tokens': 5, 'prompt_tokens': 35, 'total_tokens': 40}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': 'fp_c2295e73ad', 'finish_reason': 'stop', 'logprobs': None}, id='run-35b7a521-f745-4080-a17c-7924c9fc6b99-0')

And now we can see that we get a good response!

This is the basic idea underpinning a chatbot's ability to interact conversationally.
So how do we best implement this?
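To make the idea concrete, here is a minimal plain-Python sketch of an application that owns the conversation state and resends it on every turn. `fake_model` and `chat` are hypothetical stand-ins invented for illustration, not LangChain classes; a real chat model would replace `fake_model`.

```python
# Conceptual sketch: a stateless "model" only sees what is passed in on each call.
# `fake_model` is a hypothetical stand-in, not a LangChain class.

def fake_model(messages):
    """Answer "What's my name?" only if an earlier message in the input reveals it."""
    name = None
    for role, content in messages:
        if role == "human" and content.startswith("Hi! I'm "):
            name = content[len("Hi! I'm "):]
    _, last_content = messages[-1]
    if last_content == "What's my name?":
        return f"Your name is {name}." if name else "I don't know your name."
    return f"Hello {name}!" if name else "Hello!"


history = []  # the application, not the model, owns the conversation state


def chat(user_text):
    history.append(("human", user_text))
    reply = fake_model(history)    # pass the full history on every turn
    history.append(("ai", reply))  # record the reply so later turns can see it
    return reply


print(chat("Hi! I'm Bob"))      # Hello Bob!
print(chat("What's my name?"))  # Your name is Bob.
print(fake_model([("human", "What's my name?")]))  # I don't know your name.
```

The last call shows the stateless case again: with no history in the input, the stand-in model has nothing to draw on.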

## Message History

We can use a Message History class to wrap our model and make it stateful.
This will keep track of inputs and outputs of the model, and store them in some datastore.
Future interactions will then load those messages and pass them into the chain as part of the input.
Let's see how to use this!

First, let's make sure to install `langchain-community`, as we will be using an integration in there to store message history.

In [5]:
# ! pip install langchain_community

After that, we can import the relevant classes and set up our chain, which wraps the model and adds message history. A key part here is the function we pass in as `get_session_history`. This function is expected to take a `session_id` and return a Message History object. The `session_id` distinguishes between separate conversations and should be passed in as part of the config when calling the new chain (we'll show how to do that below).

In [6]:
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

store = {}


def get_session_history(session_id: str) -> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]


with_message_history = RunnableWithMessageHistory(
    model,
    get_session_history
)
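As a rough mental model of what such a wrapper does, the plain-Python sketch below keeps one message list per session, loads it before each call, and appends the new turn afterwards. This is a conceptual illustration only and not how `RunnableWithMessageHistory` is implemented; `with_history`, `get_history`, and `counting_model` are hypothetical names.

```python
# Conceptual sketch only: this is NOT the library's implementation.
from typing import Callable, Dict, List, Tuple

Message = Tuple[str, str]  # (role, content)

sessions: Dict[str, List[Message]] = {}


def get_history(session_id: str) -> List[Message]:
    """Return the message list for a session, creating it on first use."""
    return sessions.setdefault(session_id, [])


def with_history(model: Callable[[List[Message]], str]):
    """Wrap a stateless model so each call loads and updates per-session history."""

    def invoke(user_text: str, session_id: str) -> str:
        history = get_history(session_id)
        history.append(("human", user_text))
        reply = model(history)
        history.append(("ai", reply))
        return reply

    return invoke


def counting_model(messages: List[Message]) -> str:
    # Hypothetical stand-in model: reports how many human messages it was shown.
    count = sum(1 for role, _ in messages if role == "human")
    return f"seen {count} human message(s)"


chatbot = with_history(counting_model)
first = chatbot("hello", session_id="abc")         # seen 1 human message(s)
second = chatbot("hello again", session_id="abc")  # seen 2 human message(s)
other = chatbot("hello", session_id="xyz")         # seen 1 human message(s)
```

Because history is keyed by `session_id`, the `"xyz"` conversation starts fresh while `"abc"` keeps accumulating.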

In [7]:
with_message_history.invoke(
    [HumanMessage(content="Hi! I'm Bob")],
    config={"configurable": {"session_id": "abc"}},
)

Error in RootListenersTracer.on_llm_end callback: KeyError('output')


AIMessage(content='Hello Bob! How can I assist you today?', response_metadata={'token_usage': {'completion_tokens': 10, 'prompt_tokens': 12, 'total_tokens': 22}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': 'fp_c2295e73ad', 'finish_reason': 'stop', 'logprobs': None}, id='run-577227f6-7cd8-43f1-adc0-7968023f64f1-0')

In [19]:
with_message_history.invoke(
    [HumanMessage(content="What's my name?")],
    config={"configurable": {"session_id": "abc"}},
)

Error in RootListenersTracer.on_llm_end callback: KeyError('input')


AIMessage(content="I'm sorry, but I am an AI assistant and do not have the ability to know your name unless you tell me. Can you please provide me with your name?", response_metadata={'token_usage': {'completion_tokens': 34, 'prompt_tokens': 12, 'total_tokens': 46}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': 'fp_c2295e73ad', 'finish_reason': 'stop', 'logprobs': None}, id='run-718f828e-bfec-4493-a557-c887fcc6a0fe-0')

## Prompt templates

Let's define a prompt template to make formatting a bit easier. We can create a chain by piping it into the model:

In [8]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability.",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

chain = prompt | model

The `MessagesPlaceholder` above inserts the chat messages passed into the chain's input as `messages` directly into the prompt. Then, we can invoke the chain like this:

In [9]:
chain.invoke(
    {
        "messages": [
            HumanMessage(
                content="Translate this sentence from English to French: I love programming."
            ),
            AIMessage(content="J'adore la programmation."),
            HumanMessage(content="What did you just say?"),
        ],
    }
)

AIMessage(content='I just said, "J\'adore la programmation," which means "I love programming" in French.', response_metadata={'token_usage': {'completion_tokens': 23, 'prompt_tokens': 61, 'total_tokens': 84}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': 'fp_c2295e73ad', 'finish_reason': 'stop', 'logprobs': None}, id='run-34dfa86a-1bff-4813-a7db-80ef74b37c31-0')
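To illustrate the role of the placeholder, here is a small plain-Python sketch of splicing a caller's messages into a fixed template. It is only an analogy for what the prompt template produces; `PLACEHOLDER` and `format_prompt` are hypothetical names, not LangChain APIs.

```python
# Conceptual sketch in plain Python; the names below are hypothetical.
PLACEHOLDER = object()  # marks where the caller's messages are spliced in

template = [
    ("system", "You are a helpful assistant."),
    PLACEHOLDER,
]


def format_prompt(template, messages):
    """Return a flat message list with `messages` spliced in at the placeholder."""
    formatted = []
    for item in template:
        if item is PLACEHOLDER:
            formatted.extend(messages)  # expands to zero or more messages
        else:
            formatted.append(item)
    return formatted


formatted = format_prompt(
    template,
    [("human", "I love programming."), ("ai", "J'adore la programmation.")],
)
for role, content in formatted:
    print(f"{role}: {content}")
```

The fixed system message always comes first, and the conversation passed in by the caller follows it in order.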

## Message history

As a shortcut for managing the chat history, we can use a built-in class to manage message history for us.

In [19]:
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

store = {}


def get_session_history(session_id: str) -> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]


with_message_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="messages",
)

In [20]:
with_message_history.invoke(
    {
        "messages": [
            HumanMessage(
                content="Translate this sentence from English to French: I love programming."
            )
        ]
    },
    config={"configurable": {"session_id": "abc"}},
)

AIMessage(content='The translation of "I love programming" in French is "J\'adore programmer."', response_metadata={'token_usage': {'completion_tokens': 18, 'prompt_tokens': 39, 'total_tokens': 57}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': 'fp_c2295e73ad', 'finish_reason': 'stop', 'logprobs': None}, id='run-dd010b9d-0b1b-4c55-bb43-50042a3b43bc-0')

In [22]:
with_message_history.invoke(
    {
        "messages": [
            HumanMessage(
                content="What did I say?"
            )
        ]
    },
    config={"configurable": {"session_id": "abc"}},
)

AIMessage(content='You asked for the translation of "I love programming" into French.', response_metadata={'token_usage': {'completion_tokens': 14, 'prompt_tokens': 70, 'total_tokens': 84}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': 'fp_c2295e73ad', 'finish_reason': 'stop', 'logprobs': None}, id='run-bd6c29ad-07bb-455e-83d6-fd22be4c79da-0')

And now we have a basic chatbot! We can see what this looks like under the hood by checking out our [LangSmith trace](https://smith.langchain.com/public/c151e332-3cd1-4e8d-86df-b214c4209957/r).

While this chain can serve as a useful chatbot on its own with just the model's internal knowledge, it's often useful to introduce some form of `retrieval-augmented generation`, or RAG for short, over domain-specific knowledge to make our chatbot more focused. We'll cover this next.

## Retrievers

We can set up and use a [`Retriever`](/docs/concepts/#retriever) to pull domain-specific knowledge for our chatbot. To show this, let's expand the simple chatbot we created above to be able to answer questions about LangSmith.

We'll use [the LangSmith documentation](https://docs.smith.langchain.com/overview) as source material and store it in a vectorstore for later retrieval. Note that this example will gloss over some of the specifics around parsing and storing a data source - you can see more [in-depth documentation on creating retrieval systems here](/docs/tutorials/rag/).

Let's set up our retriever. First, we'll install some required dependencies:

In [11]:
%pip install --upgrade --quiet langchain-chroma beautifulsoup4


Next, we'll use a document loader to pull data from a webpage:

In [34]:
from langchain_community.document_loaders import WebBaseLoader

loader = WebBaseLoader("https://docs.smith.langchain.com/overview")
data = loader.load()

Next, we split it into smaller chunks that the LLM's context window can handle:

In [35]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
all_splits = text_splitter.split_documents(data)
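To show what `chunk_size` and `chunk_overlap` control, here is a deliberately naive fixed-width splitter in plain Python. It is a simplified illustration, not the algorithm used by `RecursiveCharacterTextSplitter`, which tries to split on natural separators; `split_text` is a hypothetical helper.

```python
def split_text(text: str, chunk_size: int, chunk_overlap: int = 0) -> list:
    """Naive fixed-width splitting: each chunk starts `chunk_size - chunk_overlap`
    characters after the previous one."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    return [text[i : i + chunk_size] for i in range(0, len(text), step)]


print(split_text("abcdefghij", chunk_size=4))                   # ['abcd', 'efgh', 'ij']
print(split_text("abcdefghij", chunk_size=4, chunk_overlap=1))  # ['abcd', 'defg', 'ghij', 'j']
```

With an overlap of zero, as in the cell above, no text is repeated between chunks; a positive overlap repeats the tail of each chunk at the start of the next.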

Then we embed and store those chunks in a vector database:

In [36]:
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings

vectorstore = Chroma.from_documents(documents=all_splits, embedding=OpenAIEmbeddings())
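Conceptually, a vector store keeps an embedding for each chunk and answers a query by returning the chunks whose embeddings are most similar to the query's. The sketch below imitates that with a toy bag-of-words "embedding" and cosine similarity. It assumes nothing about how Chroma or OpenAI embeddings work internally, and `ToyVectorStore`, `embed`, and `cosine` are hypothetical names.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: word counts (a real embedding model returns dense vectors)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    def __init__(self, texts):
        # Embed once at indexing time and keep (text, vector) pairs.
        self.entries = [(text, embed(text)) for text in texts]

    def search(self, query, k=4):
        """Return the k stored texts most similar to the query."""
        query_vector = embed(query)
        ranked = sorted(
            self.entries, key=lambda entry: cosine(query_vector, entry[1]), reverse=True
        )
        return [text for text, _ in ranked[:k]]


toy_store = ToyVectorStore(
    [
        "tracing shows each step of a chain",
        "evaluation helps with testing an application",
        "pricing depends on the plan",
    ]
)
top = toy_store.search("testing an application", k=1)
print(top)  # ['evaluation helps with testing an application']
```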

And finally, let's create a retriever from our initialized vectorstore:

In [37]:
# k is the number of chunks to retrieve
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})

docs = retriever.invoke("how can langsmith help with testing?")

docs

[Document(page_content='Skip to main contentLangSmith API DocsSearchGo to AppQuick StartUser GuideTracingEvaluationProduction Monitoring & AutomationsPrompt HubProxyPricingSelf-HostingCookbookQuick StartOn this pageGetting started with LangSmithIntroductionLangSmith is a platform for building production-grade LLM applications. It allows you to closely monitor and evaluate your application, so you can ship quickly and with confidence. Use of LangChain is not necessary - LangSmith works on its own!Install', metadata={'description': 'Introduction', 'language': 'en', 'source': 'https://docs.smith.langchain.com/overview', 'title': 'Getting started with LangSmith | 🦜️🛠️ LangSmith'}),
 Document(page_content='Skip to main contentLangSmith API DocsSearchGo to AppQuick StartUser GuideTracingEvaluationProduction Monitoring & AutomationsPrompt HubProxyPricingSelf-HostingCookbookQuick StartOn this pageGetting started with LangSmithIntroductionLangSmith is a platform for bu

Awesome! Let's now start building our RAG chatbot. Again, we will gloss over some of the details of RAG (check out [this tutorial](/docs/tutorials/rag/) for more information on RAG in general).

First, let's create a chain that takes in a list of messages and generates a search query to pass to our retriever. If there is only one message, this is easy: we just pass that message in. But if there are multiple messages, we need to use an LLM to look at the message history and generate a search query.

In [38]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableBranch

# We need a prompt that we can pass into an LLM to generate a transformed search query
query_transform_prompt = ChatPromptTemplate.from_messages(
    [
        MessagesPlaceholder(variable_name="messages"),
        (
            "user",
            "Given the above conversation, generate a search query to look up in order to get information relevant to the conversation. Only respond with the query, nothing else.",
        ),
    ]
)

query_transformation_chain = RunnableBranch(
    (
        lambda x: len(x.get("messages", [])) == 1,
        # If only one message, then we just pass that message's content to retriever
        (lambda x: x["messages"][-1].content),
    ),
    # If messages, then we pass inputs to LLM chain to transform the query, then pass to retriever
    query_transform_prompt | model | StrOutputParser(),
).with_config(run_name="query_transformation")
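The branching logic can be sketched in plain Python as follows. This only mirrors the decision the chain makes; `choose_query` and `naive_rewrite` are hypothetical helpers, and `naive_rewrite` stands in for the LLM call that condenses the conversation into a query.

```python
def choose_query(messages, rewrite):
    """Pick the search query: the lone message's text, or a rewritten query."""
    if len(messages) == 1:
        # Single message: its content is already a standalone question.
        return messages[-1][1]
    # Multiple messages: a follow-up like "expand on that" needs context,
    # so delegate to a rewriting step (an LLM in the real chain).
    return rewrite(messages)


def naive_rewrite(messages):
    # Hypothetical stand-in for the LLM: concatenate the human turns.
    return " ".join(content for role, content in messages if role == "human")


single = choose_query([("human", "how can langsmith help with testing?")], naive_rewrite)
multi = choose_query(
    [
        ("human", "how can langsmith help with testing?"),
        ("ai", "It lets you monitor and evaluate your application."),
        ("human", "expand on that"),
    ],
    naive_rewrite,
)
print(single)  # how can langsmith help with testing?
print(multi)   # how can langsmith help with testing? expand on that
```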

We can pipe the output of this chain into the retriever to form our retrieval step:

In [40]:
retriever_chain = (query_transformation_chain | retriever).with_config(run_name="retrieval_chain")

Great! Now we can create our end-to-end RAG chatbot by first running this chain and then passing the retrieved documents into a second chain that generates an answer.

In [41]:
from langchain_core.runnables import RunnablePassthrough

question_answering_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "Answer the user's questions based on the below context:\n\n{context}",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

# Add a "context" key (the retrieved documents) alongside the original "messages" input
final_chain = (
    RunnablePassthrough.assign(context=retriever_chain)
    | question_answering_prompt
    | model
)
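The first step of this chain keeps the original input and adds a computed `context` key next to it. A rough plain-Python analogy of that "pass through and assign" behavior is shown below; it is a conceptual sketch rather than the library's implementation, and `assign` and `fake_retrieve` are hypothetical names.

```python
def assign(**computed):
    """Return a step that copies its input dict and adds keys computed from it."""

    def run(inputs: dict) -> dict:
        extra = {key: fn(inputs) for key, fn in computed.items()}
        return {**inputs, **extra}

    return run


def fake_retrieve(inputs: dict) -> str:
    # Hypothetical stand-in for the retrieval chain.
    return f"docs about: {inputs['messages'][-1]}"


add_context = assign(context=fake_retrieve)
result = add_context({"messages": ["how can langsmith help with testing?"]})
print(sorted(result))     # ['context', 'messages']
print(result["context"])  # docs about: how can langsmith help with testing?
```

The original `messages` key survives untouched, which is why the prompt that follows can reference both `{context}` and the `messages` placeholder.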

In [44]:
response = final_chain.invoke({"messages": [HumanMessage(content="how can langsmith help with testing?")]})

response.content

'LangSmith can help with testing by allowing you to closely monitor and evaluate your application. This feature enables you to test your application thoroughly and gain confidence in its performance before shipping it.'

Again, by default this chain doesn't have any memory. We can see this if we ask it a follow-up question.

In [46]:
response = final_chain.invoke({"messages": [HumanMessage(content="expand on that")]})

response.content

'The documents provide an introduction to LangSmith, covering various aspects such as pricing, self-hosting options, proxy capabilities, tracing capabilities, evaluation capabilities, and the Prompt Hub - a prompt management tool integrated into LangSmith. Additionally, there is a mention of the LangSmith Cookbook, which is a collection of tutorials and end-to-end walkthroughs using LangSmith.'

Let's now add memory using `RunnableWithMessageHistory`:

In [47]:
with_message_history = RunnableWithMessageHistory(
    final_chain,
    get_session_history,
    input_messages_key="messages",
)

In [48]:
response = with_message_history.invoke(
    {"messages": [HumanMessage(content="how can langsmith help with testing?")]},
    config={"configurable": {"session_id": "123"}},
)

response.content

'LangSmith can help with testing by allowing you to closely monitor and evaluate your application. This helps you ensure that your application is functioning correctly before shipping it out. LangSmith provides production-grade tools for testing LLM applications, making it easier to test and debug your code efficiently.'

In [49]:
response = with_message_history.invoke(
    {"messages": [HumanMessage(content="expand on that")]},
    config={"configurable": {"session_id": "123"}},
)

response.content

"LangSmith provides features for tracing, evaluation, production monitoring, and automations, which are essential for testing applications. \n\n1. **Tracing:** LangSmith allows you to trace the behavior of your application, helping you identify any issues or bottlenecks in the code. This feature is crucial for debugging and optimizing the performance of your application.\n\n2. **Evaluation:** With LangSmith, you can evaluate the performance of your application in different scenarios. This helps you understand how your application behaves under various conditions and allows you to make necessary adjustments to improve its performance.\n\n3. **Production Monitoring:** LangSmith enables you to closely monitor your application in a production environment. This real-time monitoring helps you identify any issues as they occur and allows you to take immediate action to ensure the smooth functioning of your application.\n\n4. **Automations:** LangSmith offers automation tools that can streamli

To help you understand what's happening internally, check out [this LangSmith trace](https://smith.langchain.com/public/f48fabb6-6502-43ec-8242-afc352b769ed/r).

And we now have a chatbot capable of conversational retrieval!

## Next Steps

Now that you understand the basics of how to create a chatbot in LangChain, some more advanced topics you may be interested in are:

- [Tutorial on RAG](/docs/tutorials/rag): deep dive into the RAG concepts we just scratched the surface of in this tutorial
- [Streaming](/docs/how_to/streaming): streaming is *crucial* for chat applications