# LangSmith

Many of the applications you build with LangChain will contain multiple steps with multiple LLM calls. As these applications get more and more complex, it becomes crucial to be able to inspect what exactly is going on inside your chain or agent. The best way to do this is with [LangSmith](https://smith.langchain.com/).

After you sign up at the link above, make sure to set your environment variables to start logging traces:

In [1]:
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
import os

load_dotenv()  # Load environment variables from .env file

os.environ["LANGCHAIN_TRACING_V2"] = "true"
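# load_dotenv() has already copied LANGCHAIN_API_KEY and OPENAI_API_KEY into os.environ
# (when they are defined in .env). Reassigning them from os.getenv() is unnecessary and
# raises a TypeError if either variable is missing, because os.environ rejects None.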

model = ChatOpenAI(model="gpt-4o")
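
Before creating the model, it can help to confirm that the variables are actually set. The following is a minimal sketch, assuming the three variable names used in this tutorial; `missing_env_vars` is a hypothetical helper written for illustration and is not part of LangChain or LangSmith:

```python
import os

REQUIRED_VARS = ("LANGCHAIN_TRACING_V2", "LANGCHAIN_API_KEY", "OPENAI_API_KEY")


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


# Use an explicit mapping here so the example is reproducible.
example_env = {"LANGCHAIN_TRACING_V2": "true", "OPENAI_API_KEY": "sk-placeholder"}
missing = missing_env_vars(REQUIRED_VARS, example_env)
print(missing)  # ['LANGCHAIN_API_KEY']
```

Calling `missing_env_vars(REQUIRED_VARS)` without a mapping checks the real process environment instead.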

Let's first use the model directly. ChatModels are instances of LangChain "Runnables", which means they expose a standard interface for interacting with them. To call the model, we can pass a list of messages to the .invoke method.

In [2]:
from langchain_core.messages import HumanMessage

model.invoke([HumanMessage(content="Hi! I'm Bob")])

AIMessage(content='Hi Bob! How can I assist you today?', response_metadata={'token_usage': {'completion_tokens': 10, 'prompt_tokens': 11, 'total_tokens': 21}, 'model_name': 'gpt-4o', 'system_fingerprint': 'fp_3e7d703517', 'finish_reason': 'stop', 'logprobs': None}, id='run-01c88a4b-af83-47da-92e7-5cf7da68d30f-0', usage_metadata={'input_tokens': 11, 'output_tokens': 10, 'total_tokens': 21})

API Reference: [HumanMessage](https://api.python.langchain.com/en/latest/messages/langchain_core.messages.human.HumanMessage.html)

The model on its own does not have any concept of state. For example, if you ask a follow-up question:

In [4]:
model.invoke([HumanMessage(content="What's my name?")])

AIMessage(content="I'm sorry, but I don't have access to personal information about users. If you need help or have a question, feel free to let me know!", response_metadata={'token_usage': {'completion_tokens': 30, 'prompt_tokens': 11, 'total_tokens': 41}, 'model_name': 'gpt-4o', 'system_fingerprint': 'fp_f4e629d0a5', 'finish_reason': 'stop', 'logprobs': None}, id='run-54b11022-0cbf-418d-8b2c-386431f87249-0', usage_metadata={'input_tokens': 11, 'output_tokens': 30, 'total_tokens': 41})

Let's take a look at the example [LangSmith trace](https://smith.langchain.com/public/5c21cb92-2814-4119-bae9-d02b8db577ac/r).

We can see that it doesn't take the previous conversation turn into account, and cannot answer the question. This makes for a terrible chatbot experience!

To get around this, we need to pass the entire conversation history into the model. Let's see what happens when we do that:

In [4]:
from langchain_core.messages import AIMessage

model.invoke(
    [
        HumanMessage(content="Hi! I'm Bob"),
        AIMessage(content="Hello Bob! How can I assist you today?"),
        HumanMessage(content="What's my name?"),
    ]
)

AIMessage(content='You mentioned that your name is Bob. How can I help you today, Bob?', response_metadata={'token_usage': {'completion_tokens': 17, 'prompt_tokens': 33, 'total_tokens': 50}, 'model_name': 'gpt-4o', 'system_fingerprint': 'fp_9cb5d38cf7', 'finish_reason': 'stop', 'logprobs': None}, id='run-706e9574-694c-4481-8e4d-1426c07dced0-0', usage_metadata={'input_tokens': 33, 'output_tokens': 17, 'total_tokens': 50})

And now we can see that we get a good response!
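
To make the point concrete without calling a real model, here is a small plain-Python sketch. Everything in it is an assumption for illustration: `echo_name_model` is a hypothetical stand-in that can only "remember" what appears in the messages it is handed, and messages are simple `(role, content)` tuples rather than LangChain message objects.

```python
def echo_name_model(messages):
    """Hypothetical stand-in for a chat model: it only sees `messages`."""
    # Look for an earlier human turn of the form "... I'm <name>".
    for role, content in messages:
        if role == "human" and "I'm " in content:
            name = content.split("I'm ", 1)[1].strip(" .!")
            return ("ai", f"Your name is {name}.")
    return ("ai", "I don't know your name.")


history = []  # the caller, not the model, owns the conversation state


def chat(user_text):
    """Append the user turn, call the model on the full history, store the reply."""
    history.append(("human", user_text))
    reply = echo_name_model(history)
    history.append(reply)
    return reply[1]


chat("Hi! I'm Bob")
print(chat("What's my name?"))  # Your name is Bob.
# The same question without any history cannot be answered:
print(echo_name_model([("human", "What's my name?")])[1])  # I don't know your name.
```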

This is the basic idea underpinning a chatbot's ability to interact conversationally. So how do we best implement this?

# Message History
We can use a Message History class to wrap our model and make it stateful. This will keep track of inputs and outputs of the model, and store them in some datastore. Future interactions will then load those messages and pass them into the chain as part of the input. Let's see how to use this!

First, let's make sure to install langchain-community, as we will be using an integration in there to store message history.

In [5]:
# ! pip install langchain_community

After that, we can import the relevant classes and set up our chain, which wraps the model and adds in this message history. A key part here is the function we pass in as get_session_history. This function is expected to take in a session_id and return a Message History object. The session_id is used to distinguish between separate conversations, and should be passed in as part of the config when calling the new chain (we'll show how to do that below).

In [3]:
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

store = {}


def get_session_history(session_id: str) -> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]


with_message_history = RunnableWithMessageHistory(model, get_session_history)

API Reference: [ChatMessageHistory](https://api.python.langchain.com/en/latest/chat_history/langchain_core.chat_history.ChatMessageHistory.html) | [BaseChatMessageHistory](https://api.python.langchain.com/en/latest/chat_history/langchain_core.chat_history.BaseChatMessageHistory.html) | [RunnableWithMessageHistory](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.history.RunnableWithMessageHistory.html)
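
Conceptually, a wrapper like this loads the stored messages for a session, calls the wrapped runnable with the history plus the new input, and then saves both the input and the output. The sketch below illustrates that flow in plain Python. It is a simplified picture, not the actual implementation of RunnableWithMessageHistory; `invoke_with_history`, `count_turns_model`, and the tuple message format are hypothetical names chosen for this example.

```python
from collections import defaultdict

# session_id -> list of (role, content) tuples; stands in for a history store
sessions = defaultdict(list)


def count_turns_model(messages):
    """Hypothetical model stand-in: reports how many messages it was given."""
    return ("ai", f"I can see {len(messages)} message(s).")


def invoke_with_history(model, new_messages, config):
    """Load the session's history, call the model, then save input and output."""
    session_id = config["configurable"]["session_id"]
    past = sessions[session_id]
    reply = model(past + new_messages)
    past.extend(new_messages)
    past.append(reply)
    return reply


cfg_a = {"configurable": {"session_id": "abc2"}}
cfg_b = {"configurable": {"session_id": "abc3"}}

invoke_with_history(count_turns_model, [("human", "Hi! I'm Bob")], cfg_a)
second = invoke_with_history(count_turns_model, [("human", "What's my name?")], cfg_a)
fresh = invoke_with_history(count_turns_model, [("human", "What's my name?")], cfg_b)
print(second[1])  # I can see 3 message(s).
print(fresh[1])   # I can see 1 message(s).
```

The second call in session abc2 sees the earlier exchange, while the first call in session abc3 starts from an empty history.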

We now need to create a config that we pass into the runnable every time. This config contains information that is not part of the input directly, but is still useful. In this case, we want to include a session_id. This should look like:

In [4]:
config = {"configurable": {"session_id": "abc2"}}

In [5]:
response = with_message_history.invoke(
    [HumanMessage(content="Hi! I'm Bob")],
    config=config,
)

response.content

Parent run 0f3b8efe-fd95-4a61-93ff-df0c77866849 not found for run bb15fd2e-589e-41fd-b905-be40bf0fb0ec. Treating as a root run.


'Hi Bob! How can I assist you today?'

In [10]:
response = with_message_history.invoke(
    [HumanMessage(content="What's my name?")],
    config=config,
)

response.content

Parent run c7e1fed2-59f7-4cfe-bf68-126ceca10792 not found for run f9f5cbc2-fdc6-41e1-b96e-33ba089d2981. Treating as a root run.


'You mentioned that your name is Bob. How can I help you today, Bob?'

Great! Our chatbot now remembers things about us. If we change the config to reference a different session_id, we can see that it starts the conversation fresh.

In [11]:
config = {"configurable": {"session_id": "abc3"}}

response = with_message_history.invoke(
    [HumanMessage(content="What's my name?")],
    config=config,
)

response.content

Parent run d0529958-e230-44c2-a1d3-57dc2eac1513 not found for run 52625e82-9054-491f-8e4e-690bc6c4b227. Treating as a root run.


"I don't have access to personal data about users unless it has been shared with me in the course of our conversation. Therefore, I don't know your name. How can I assist you today?"

However, we can always go back to the original conversation, since its messages are still held in the in-memory store dictionary:

In [6]:
config = {"configurable": {"session_id": "abc2"}}

response = with_message_history.invoke(
    [HumanMessage(content="What's my name?")],
    config=config,
)

response.content

Parent run 5c601531-52a7-40b5-9d97-94f15f568763 not found for run 75ca8fec-9667-4324-aaed-c95c68e1ea18. Treating as a root run.


'You mentioned that your name is Bob. How can I help you today?'

This is how we can support a chatbot having conversations with many users!

Right now, all we've done is add a simple persistence layer around the model. We can start to make the chatbot more complicated and personalized by adding in a prompt template.

# Prompt templates

Prompt Templates help to turn raw user information into a format that the LLM can work with. In this case, the raw user input is just a message, which we are passing to the LLM. Let's now make that a bit more complicated. First, let's add in a system message with some custom instructions (but still taking messages as input). Next, we'll add in more input besides just the messages.

First, let's add in a system message. To do this, we will create a ChatPromptTemplate. We will utilize MessagesPlaceholder to pass all the messages in.

In [7]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability.",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

chain = prompt | model

API Reference: [ChatPromptTemplate](https://api.python.langchain.com/en/latest/prompts/langchain_core.prompts.chat.ChatPromptTemplate.html) | [MessagesPlaceholder](https://api.python.langchain.com/en/latest/prompts/langchain_core.prompts.chat.MessagesPlaceholder.html)



Note that this slightly changes the input type: rather than passing in a list of messages, we are now passing in a dictionary with a messages key that contains a list of messages.

In [8]:
response = chain.invoke({"messages": [HumanMessage(content="hi! I'm bob")]})

response.content

'Hi Bob! How can I assist you today?'

We can now wrap this in the same Message History object as before:

In [9]:
with_message_history = RunnableWithMessageHistory(chain, get_session_history)

In [10]:
config = {"configurable": {"session_id": "abc5"}}

In [11]:
response = with_message_history.invoke(
    [HumanMessage(content="Hi! I'm Jim")],
    config=config,
)

response.content

Parent run 04bfb6af-2742-4536-b9ba-fcdcf9f35281 not found for run 49205c91-46e1-4dfc-bf21-e30ddc104bb4. Treating as a root run.


'Hi Jim! How can I assist you today?'

In [13]:
response = with_message_history.invoke(
    [HumanMessage(content="What's my name?")],
    config=config,
)

response.content

Parent run 93c4b985-857a-4d93-bf11-a13eeb28cf2e not found for run a2376f4f-6097-474e-a68b-c609d630af1d. Treating as a root run.


'Your name is Jim! How can I help you today?'

Awesome! Let's now make our prompt a little bit more complicated. Let's assume that the prompt template now looks something like this:

In [14]:
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability in {language}.",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

chain = prompt | model

Note that we have added a new language input to the prompt. We can now invoke the chain and pass in a language of our choice.

In [15]:
response = chain.invoke(
    {"messages": [HumanMessage(content="hi! I'm bob")], "language": "Spanish"}
)

response.content

'¡Hola, Bob! ¿Cómo estás?'
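
As a rough picture of what the prompt step does with this dictionary, the plain-Python sketch below fills the {language} slot in the system text and splices the incoming messages in after it. `format_prompt` and the `(role, content)` tuples are assumptions made for illustration; ChatPromptTemplate itself returns LangChain message objects and does more validation.

```python
SYSTEM_TEMPLATE = (
    "You are a helpful assistant. "
    "Answer all questions to the best of your ability in {language}."
)


def format_prompt(inputs):
    """Build the message list from a dict with 'messages' and 'language' keys."""
    system = ("system", SYSTEM_TEMPLATE.format(language=inputs["language"]))
    # The placeholder position is simply "everything after the system message".
    return [system, *inputs["messages"]]


formatted = format_prompt(
    {"messages": [("human", "hi! I'm bob")], "language": "Spanish"}
)
for role, content in formatted:
    print(f"{role}: {content}")
```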

Let's now wrap this more complicated chain in a Message History class. This time, because there are multiple keys in the input, we need to specify the correct key to use to save the chat history.

In [16]:
with_message_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="messages",
)
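
The reason the key matters can be sketched in plain Python: when the input is a dictionary with several keys, the stored history has to be merged into exactly one of them while the others pass through untouched. This is a simplified illustration under assumed names (`insert_history`, tuple messages), not the library's code.

```python
def insert_history(inputs, past_messages, input_messages_key="messages"):
    """Return a copy of `inputs` with stored history prepended under one key."""
    merged = dict(inputs)  # leave the caller's dict untouched
    merged[input_messages_key] = past_messages + inputs[input_messages_key]
    return merged


past = [("human", "hi! I'm todd"), ("ai", "¡Hola, Todd!")]
call = {"messages": [("human", "whats my name?")], "language": "Spanish"}
merged = insert_history(call, past)
print(merged["language"])       # Spanish
print(len(merged["messages"]))  # 3
```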

In [17]:
config = {"configurable": {"session_id": "abc11"}}

In [18]:
response = with_message_history.invoke(
    {"messages": [HumanMessage(content="hi! I'm todd")], "language": "Spanish"},
    config=config,
)

response.content

Parent run 677569ec-b138-40c0-a026-95e538b0bc9c not found for run 79f2d670-ca4d-4a41-9985-7695d52b8786. Treating as a root run.


'¡Hola, Todd! ¿Cómo estás? ¿En qué puedo ayudarte hoy?'

In [19]:
response = with_message_history.invoke(
    {"messages": [HumanMessage(content="whats my name?")], "language": "Spanish"},
    config=config,
)

response.content

Parent run 545dbfd8-0781-4e87-ad36-2028ea466ec8 not found for run 84d06f2c-0a57-4ec3-a501-5c773c726f04. Treating as a root run.


'Tu nombre es Todd. ¿En qué más puedo ayudarte?'

To help you understand what's happening internally, check out [this LangSmith trace](https://smith.langchain.com/public/f48fabb6-6502-43ec-8242-afc352b769ed/r).

# Managing Conversation History
One important concept to understand when building chatbots is how to manage conversation history. If left unmanaged, the list of messages will grow unbounded and potentially overflow the context window of the LLM. Therefore, it is important to add a step that limits the size of the messages you are passing in.

**Importantly, you will want to do this BEFORE the prompt template but AFTER you load previous messages from Message History.**

We can do this by adding a simple step in front of the prompt that modifies the messages key appropriately, and then wrap that new chain in the Message History class. 

LangChain comes with a few built-in helpers for [managing a list of messages](https://python.langchain.com/v0.2/docs/how_to/#messages). In this case we'll use the [trim_messages](https://python.langchain.com/v0.2/docs/how_to/trim_messages/) helper to reduce how many messages we're sending to the model. The trimmer allows us to specify how many tokens we want to keep, along with other parameters such as whether to always keep the system message and whether to allow partial messages:

In [21]:
from langchain_core.messages import AIMessage, SystemMessage, trim_messages

trimmer = trim_messages(
    max_tokens=65,
    strategy="last",
    token_counter=model,
    include_system=True,
    allow_partial=False,
    start_on="human",
)

messages = [
    SystemMessage(content="you're a good assistant"),
    HumanMessage(content="hi! I'm bob"),
    AIMessage(content="hi!"),
    HumanMessage(content="I like vanilla ice cream"),
    AIMessage(content="nice"),
    HumanMessage(content="whats 2 + 2"),
    AIMessage(content="4"),
    HumanMessage(content="thanks"),
    AIMessage(content="no problem!"),
    HumanMessage(content="having fun?"),
    AIMessage(content="yes!"),
]

trimmer.invoke(messages)

[SystemMessage(content="you're a good assistant"),
 HumanMessage(content='whats 2 + 2'),
 AIMessage(content='4'),
 HumanMessage(content='thanks'),
 AIMessage(content='no problem!'),
 HumanMessage(content='having fun?'),
 AIMessage(content='yes!')]
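
To build intuition for the "last" strategy, here is a much-simplified sketch in plain Python. It is not how trim_messages is implemented and will not reproduce its output exactly: `trim_last` and `count_words` are hypothetical names, messages are `(role, content)` tuples, and a token is approximated as one whitespace-separated word.

```python
def count_words(message):
    """Hypothetical token counter: one token per whitespace-separated word."""
    return len(message[1].split())


def trim_last(messages, max_tokens, count_tokens, include_system=True, start_on="human"):
    """Keep the most recent messages that fit in `max_tokens` (simplified)."""
    system, rest = [], list(messages)
    if include_system and rest and rest[0][0] == "system":
        system, rest = [rest[0]], rest[1:]

    budget = max_tokens - sum(count_tokens(m) for m in system)
    kept = []
    # Walk backwards from the newest message until the budget is spent.
    for message in reversed(rest):
        cost = count_tokens(message)
        if cost > budget:
            break
        kept.insert(0, message)
        budget -= cost

    # Drop leading messages until the window starts on the requested role.
    while kept and kept[0][0] != start_on:
        kept.pop(0)
    return system + kept


demo_messages = [
    ("system", "you're a good assistant"),
    ("human", "hi! I'm bob"),
    ("ai", "hi!"),
    ("human", "whats 2 + 2"),
    ("ai", "4"),
    ("human", "thanks"),
    ("ai", "no problem!"),
    ("human", "having fun?"),
    ("ai", "yes!"),
]

trimmed = trim_last(demo_messages, max_tokens=12, count_tokens=count_words)
for message in trimmed:
    print(message)
```

With a budget of 12 words, the system message is kept, the oldest turns are dropped, and the window is shifted forward so that it begins on a human message.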

To use it in our chain, we just need to run the trimmer before we pass the messages input to our prompt.

Now if we try asking the model our name, it won't know it since we trimmed that part of the chat history:

In [22]:
from operator import itemgetter

from langchain_core.runnables import RunnablePassthrough

chain = (
    RunnablePassthrough.assign(messages=itemgetter("messages") | trimmer)
    | prompt
    | model
)

response = chain.invoke(
    {
        "messages": messages + [HumanMessage(content="what's my name?")],
        "language": "English",
    }
)
response.content

"I'm sorry, but I don't have access to that information. Could you tell me your name?"

But if we ask about information that is within the last few messages, it remembers:

In [23]:
response = chain.invoke(
    {
        "messages": messages + [HumanMessage(content="what math problem did i ask")],
        "language": "English",
    }
)
response.content

'You asked, "What\'s 2 + 2?"'
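
The first step of the chain above takes the input dictionary, replaces its messages entry with the trimmed list, and leaves the other keys alone. In plain Python, that behaviour can be pictured roughly as follows. The `assign` function and the `keep_last_two` trimmer are hypothetical stand-ins for illustration, not LangChain APIs.

```python
from operator import itemgetter


def keep_last_two(messages):
    """Hypothetical trimmer stand-in: keep only the two newest messages."""
    return messages[-2:]


def assign(inputs, **steps):
    """Copy `inputs`, overwriting each named key with its step's result."""
    return {**inputs, **{key: step(inputs) for key, step in steps.items()}}


get_messages = itemgetter("messages")
inputs = {
    "messages": [("human", "hi! I'm bob"), ("ai", "hi!"), ("human", "what's my name?")],
    "language": "English",
}
trimmed_inputs = assign(inputs, messages=lambda d: keep_last_two(get_messages(d)))
print(trimmed_inputs["messages"])  # [('ai', 'hi!'), ('human', "what's my name?")]
print(trimmed_inputs["language"])  # English
```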

Let's now wrap this in the Message History:

In [24]:
with_message_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="messages",
)

config = {"configurable": {"session_id": "abc20"}}

In [25]:
response = with_message_history.invoke(
    {
        "messages": messages + [HumanMessage(content="whats my name?")],
        "language": "English",
    },
    config=config,
)

response.content

Parent run 3ae575d0-27b6-41f3-9843-048ea9ffdf4a not found for run dbbeb72b-7ca2-49d1-96ca-251f5c60f75b. Treating as a root run.


"I'm sorry, but I don't have access to your name. Could you please tell me?"

As expected, the first message where we stated our name has been trimmed. Plus there are now two new messages in the chat history (our latest question and the latest response). This means that even more information that used to be accessible in our conversation history is no longer available! In this case our initial math question has been trimmed from the history as well, so the model no longer knows about it:

In [26]:
response = with_message_history.invoke(
    {
        "messages": [HumanMessage(content="what math problem did i ask?")],
        "language": "English",
    },
    config=config,
)

response.content

Parent run 45053304-6df9-4620-9d93-908a7f2adc94 not found for run 9ec3bb26-9054-4896-af2b-8b062a0026a5. Treating as a root run.


"You haven't asked a math problem yet. If you have a specific math question, feel free to ask, and I'll do my best to help you!"

If you take a look at LangSmith, you can see exactly what is happening under the hood in the [LangSmith trace](https://smith.langchain.com/public/a64b8b7c-1fd6-4dbb-b11a-47cd09a5e4f1/r).

# Streaming

Now we've got a functional chatbot. However, one really important UX consideration for chatbot applications is streaming. LLMs can sometimes take a while to respond, so to improve the user experience most applications stream back each token as it is generated. This allows the user to see progress.

It's actually super easy to do this!

All chains expose a .stream method, and ones that use message history are no different. We can simply use that method to get back a streaming response.

In [27]:
config = {"configurable": {"session_id": "abc15"}}
for r in with_message_history.stream(
    {
        "messages": [HumanMessage(content="hi! I'm todd. tell me a joke")],
        "language": "English",
    },
    config=config,
):
    print(r.content, end="|")

Parent run cb4461eb-a507-4990-8476-a72bdf96cb32 not found for run 07ceaffd-e26e-4650-9a98-dc73e14d47f8. Treating as a root run.


|Hi| Todd|!| Sure|,| here's| a| joke| for| you|:

|Why| don't| scientists| trust| atoms|?

|Because| they| make| up| everything|!||
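
The consuming pattern is the same for any iterator of chunks. The sketch below uses a hypothetical `stream_tokens` generator in place of a real model (it splits on spaces, whereas a real model emits its own tokens) to show how the pieces can be printed as they arrive and also collected into the full reply:

```python
def stream_tokens(text):
    """Hypothetical stand-in for a streaming model: yield one word at a time."""
    for i, word in enumerate(text.split(" ")):
        # Re-attach the space that split() removed, except before the first word.
        yield word if i == 0 else " " + word


chunks = []
for chunk in stream_tokens("Why don't scientists trust atoms?"):
    chunks.append(chunk)
    print(chunk, end="|", flush=True)  # show progress as each piece arrives
print()

full_reply = "".join(chunks)
```

Joining the chunks reconstructs the complete text, which is useful when the application needs both the live display and the final message.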

# Next Steps

Now that you understand the basics of how to create a chatbot in LangChain, some more advanced tutorials you may be interested in are:

- Conversational RAG: Enable a chatbot experience over an external source of data
- Agents: Build a chatbot that can take actions

If you want to dive deeper on specifics, some things worth checking out are:

- Streaming: streaming is crucial for chat applications
- How to add message history: for a deeper dive into all things related to message history
- How to manage large message history: more techniques for managing a large chat history