# Build a Chatbot

:::note

This tutorial previously used the [RunnableWithMessageHistory](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.history.RunnableWithMessageHistory.html) abstraction. You can access that version of the documentation in the [v0.2 docs](https://python.langchain.com/v0.2/docs/tutorials/chatbot/).

As of the v0.3 release of LangChain, we recommend that LangChain users take advantage of [LangGraph persistence](https://langchain-ai.github.io/langgraph/concepts/persistence/) to incorporate `memory` into new LangChain applications.

If your code is already relying on `RunnableWithMessageHistory` or `BaseChatMessageHistory`, you do **not** need to make any changes. We do not plan on deprecating this functionality in the near future as it works for simple chat applications and any code that uses `RunnableWithMessageHistory` will continue to work as expected.

Please see [How to migrate to LangGraph Memory](/docs/versions/migrating_memory/) for more details.

:::

## Overview

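We'll go over an example of how to design and implement an LLM-powered chatbot. This chatbot will be able to hold a conversation and remember previous interactions with a [chat model](/docs/concepts/chat_models).

Note that the chatbot we build will only use the language model to hold a conversation. There are a couple of related concepts you may be looking for:

- [Conversational RAG](/docs/tutorials/qa_chat_history): Enable a chatbot experience over an external source of data
- [Agents](/docs/tutorials/agents): Build a chatbot that can take actions

This tutorial covers the basics, which will be helpful for those two more advanced topics, but feel free to skip directly to them if you prefer.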

## Setup

### Jupyter Notebook

This guide (and most of the other guides in the documentation) uses [Jupyter notebooks](https://jupyter.org/) and assumes the reader is as well. Jupyter notebooks are perfect for learning how to work with LLM systems because oftentimes things can go wrong (unexpected output, API down, etc) and going through guides in an interactive environment is a great way to better understand them.

This and other tutorials are perhaps most conveniently run in a Jupyter notebook. See [here](https://jupyter.org/install) for instructions on how to install.

### Installation

For this tutorial we will need `langchain-core` and `langgraph`. This guide requires `langgraph >= 0.2.28`.

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import CodeBlock from "@theme/CodeBlock";

<Tabs>
  <TabItem value="pip" label="Pip" default>
    <CodeBlock language="bash">pip install langchain-core "langgraph>=0.2.28"</CodeBlock>
  </TabItem>
  <TabItem value="conda" label="Conda">
    <CodeBlock language="bash">conda install langchain-core "langgraph>=0.2.28" -c conda-forge</CodeBlock>
  </TabItem>
</Tabs>



For more details, see our [Installation guide](/docs/how_to/installation).
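
To confirm that an environment meets the `langgraph >= 0.2.28` requirement, a small check along the following lines may help. This is only a sketch: the helper names (`version_tuple`, `meets_minimum`, `check_package`) are made up for illustration, and the comparison looks only at the leading integer of each dotted part, so a pre-release such as `0.2.28rc1` is treated the same as `0.2.28`.

```python
import re
from importlib import metadata


def version_tuple(version: str) -> tuple:
    """Turn '0.2.28' (or '0.3.0rc1') into a tuple of its leading integer parts."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two dotted version strings numerically, not alphabetically."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_package(name: str, minimum: str) -> str:
    """Report whether an installed package satisfies a minimum version."""
    try:
        installed = metadata.version(name)
    except metadata.PackageNotFoundError:
        return f"{name} is not installed"
    if meets_minimum(installed, minimum):
        return f"{name} {installed}: OK"
    return f"{name} {installed}: too old, need >= {minimum}"


print(check_package("langgraph", "0.2.28"))
```

Comparing integer tuples rather than raw strings matters here: as plain text, `"0.2.9"` sorts after `"0.2.28"`, even though 0.2.9 is the older release.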

### LangSmith

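Many of the applications you build with LangChain will contain multiple steps with multiple LLM calls. As these applications get more complex, it becomes crucial to be able to inspect exactly what is going on inside your chain or agent. The best way to do this is with [LangSmith](https://smith.langchain.com).

After you sign up at the link above, make sure to set your environment variables to start logging traces: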

```shell
export LANGSMITH_TRACING="true"
export LANGSMITH_API_KEY="..."
```

Or, if in a notebook, you can set them with:

```python
import getpass
import os

os.environ["LANGSMITH_TRACING"] = "true"
os.environ["LANGSMITH_API_KEY"] = getpass.getpass()
```

## Quickstart

First up, let's learn how to use a language model by itself. LangChain supports many different language models; select the one you want to use below!

import ChatModelTabs from "@theme/ChatModelTabs";

<ChatModelTabs overrideParams={{openai: {model: "gpt-4o-mini"}}} />


In [1]:
!pip install langchain-core langgraph
!pip install -qU "langchain[openai]"

In [2]:
!pip install -qU "langchain[groq]"

In [3]:
import getpass
import os

# Ask for the Groq API key only if it is not already set in the environment.
# (In Google Colab, the key can instead be read from the notebook's secrets.)
if not os.environ.get("GROQ_API_KEY"):
    os.environ["GROQ_API_KEY"] = getpass.getpass("Enter API key for Groq: ")

from langchain.chat_models import init_chat_model

model = init_chat_model("llama3-8b-8192", model_provider="groq")

Let's first use the model directly. `ChatModel`s are instances of LangChain "Runnables", which means they expose a standard interface for interacting with them. To call the model, we can pass a list of messages to the `.invoke` method.



In [4]:
from langchain_core.messages import HumanMessage

model.invoke([HumanMessage(content="Hi! I'm Bob")])

AIMessage(content="Hi Bob! It's nice to meet you. Is there something I can help you with or would you like to chat?", additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 26, 'prompt_tokens': 15, 'total_tokens': 41, 'completion_time': 0.021666667, 'prompt_time': 0.003126259, 'queue_time': 0.139656101, 'total_time': 0.024792926}, 'model_name': 'llama3-8b-8192', 'system_fingerprint': 'fp_a97cfe35ae', 'finish_reason': 'stop', 'logprobs': None}, id='run-7e183507-2a1d-43ab-962d-f0d77ca8592d-0', usage_metadata={'input_tokens': 15, 'output_tokens': 26, 'total_tokens': 41})

The model on its own does not have any concept of state. For example, if you ask a follow-up question:

In [5]:
model.invoke([HumanMessage(content="What's my name?")])

AIMessage(content="I apologize, but I'm a large language model, I don't have the ability to know your name or any personal information about you. Each time you interact with me, it's a new conversation and I don't retain any information from previous conversations. If you'd like to share your name with me, I'd be happy to learn it!", additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 71, 'prompt_tokens': 15, 'total_tokens': 86, 'completion_time': 0.059166667, 'prompt_time': 0.002968449, 'queue_time': 0.30243658500000004, 'total_time': 0.062135116}, 'model_name': 'llama3-8b-8192', 'system_fingerprint': 'fp_179b0f92c9', 'finish_reason': 'stop', 'logprobs': None}, id='run-ccc411e0-5c67-483b-bc48-8837e7ea7c64-0', usage_metadata={'input_tokens': 15, 'output_tokens': 71, 'total_tokens': 86})

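Let's take a look at the example [LangSmith trace](https://smith.langchain.com/public/5c21cb92-2814-4119-bae9-d02b8db577ac/r).

We can see that the model does not take the previous conversation turn into account, so it cannot answer the question.
This makes for a terrible chatbot experience!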

To get around this, we need to pass the entire [conversation history](/docs/concepts/chat_history) into the model. Let's see what happens when we do that:

In [6]:
from langchain_core.messages import AIMessage

model.invoke(
    [
        HumanMessage(content="Hi! I'm Bob"),
        AIMessage(content="Hello Bob! How can I assist you today?"),
        HumanMessage(content="What's my name?"),
    ]
)

AIMessage(content='Your name is Bob!', additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 6, 'prompt_tokens': 40, 'total_tokens': 46, 'completion_time': 0.005, 'prompt_time': 0.007692006, 'queue_time': 0.017407112, 'total_time': 0.012692006}, 'model_name': 'llama3-8b-8192', 'system_fingerprint': 'fp_dadc9d6142', 'finish_reason': 'stop', 'logprobs': None}, id='run-92a6adde-0a15-4454-b665-e1b12a9ffe33-0', usage_metadata={'input_tokens': 40, 'output_tokens': 6, 'total_tokens': 46})

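And now we can see that we get a good response!

This is the basic idea underpinning a chatbot's ability to interact conversationally.
So how do we best implement this?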
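
One hand-rolled answer, shown here only as a library-free sketch, is to keep the message list yourself and resend all of it on every call. The names below (`toy_model`, `ManualChat`) are hypothetical stand-ins, not LangChain APIs, and the "model" is a few lines of string matching rather than an LLM. The point is only that a stateless callee can answer a follow-up question when the caller supplies the earlier turns.

```python
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (role, content), a stand-in for HumanMessage / AIMessage


def toy_model(messages: List[Message]) -> str:
    """Hypothetical stateless 'model': it only sees the messages passed in."""
    question = messages[-1][1]
    if "my name" in question.lower():
        for role, content in messages:
            if role == "human" and "I'm " in content:
                name = content.split("I'm ", 1)[1].strip(" .!")
                return f"Your name is {name}!"
        return "I don't know your name."
    return "Hello! How can I help?"


class ManualChat:
    """Keeps the history on the caller's side and resends all of it each turn."""

    def __init__(self, model: Callable[[List[Message]], str]) -> None:
        self.model = model
        self.history: List[Message] = []

    def send(self, text: str) -> str:
        self.history.append(("human", text))
        reply = self.model(self.history)  # the full history, not just `text`
        self.history.append(("ai", reply))
        return reply


# Without history, the follow-up question cannot be answered.
print(toy_model([("human", "What's my name?")]))

# With the history resent on every call, it can.
chat = ManualChat(toy_model)
chat.send("Hi! I'm Bob")
print(chat.send("What's my name?"))
```

Managing the list by hand works for a single conversation in a single process, but it leaves storage, multiple users, and restarts to the caller.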

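## Message persistence

[LangGraph](https://langchain-ai.github.io/langgraph/) implements a built-in persistence layer, making it ideal for chat applications that support multiple conversational turns.

Wrapping our chat model in a minimal LangGraph application allows us to automatically persist the message history, simplifying the development of multi-turn applications.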

LangGraph comes with a simple in-memory checkpointer, which we use below. See its [documentation](https://langchain-ai.github.io/langgraph/concepts/persistence/) for more detail, including how to use different persistence backends (e.g., SQLite or Postgres).

In [7]:
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import START, MessagesState, StateGraph

# Define a new graph
workflow = StateGraph(state_schema=MessagesState)


# Define the function that calls the model
def call_model(state: MessagesState):
    response = model.invoke(state["messages"])
    return {"messages": response}


# Define the (single) node in the graph
workflow.add_edge(START, "model")
workflow.add_node("model", call_model)

# Add memory
memory = MemorySaver()
app = workflow.compile(checkpointer=memory)

We now need to create a `config` that we pass into the runnable every time. This config contains information that is not part of the input directly, but is still useful. In this case, we want to include a `thread_id`. This should look like:

In [8]:
config = {"configurable": {"thread_id": "abc123"}}

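This enables us to support multiple conversation threads with a single application, a common requirement when your application has multiple users.

We can then invoke the application: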
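
To see why a `thread_id` is enough to separate conversations, it can help to picture the checkpointer as a mapping from thread IDs to message lists. The class below is a deliberately simplified, hypothetical stand-in (`InMemoryThreads` is not a LangGraph class, and a real checkpointer stores full state snapshots rather than bare strings). It only illustrates the lookup by `config["configurable"]["thread_id"]`.

```python
from collections import defaultdict
from typing import Dict, List


class InMemoryThreads:
    """Toy stand-in for a checkpointer: one message list per thread_id."""

    def __init__(self) -> None:
        self._threads: Dict[str, List[str]] = defaultdict(list)

    def append(self, config: dict, message: str) -> List[str]:
        thread_id = config["configurable"]["thread_id"]
        self._threads[thread_id].append(message)
        return list(self._threads[thread_id])  # copy, so callers can't mutate the store

    def history(self, config: dict) -> List[str]:
        # .get avoids creating an empty entry just by looking a thread up
        thread_id = config["configurable"]["thread_id"]
        return list(self._threads.get(thread_id, []))


store = InMemoryThreads()
thread_a = {"configurable": {"thread_id": "abc123"}}
thread_b = {"configurable": {"thread_id": "abc234"}}

store.append(thread_a, "Hi! I'm Alice.")
store.append(thread_b, "Hi! I'm Bob.")
store.append(thread_a, "What's my name?")

print(store.history(thread_a))
print(store.history(thread_b))
```

Each thread sees only its own messages, which is the behavior the `config` object selects in the real application.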

In [12]:
query = "Hi! I'm Bob."

input_messages = [HumanMessage(query)]
output = app.invoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print()  # output contains all messages in state


Bob, you're really friendly! I think we've had a few "Hi, I'm Bob"s already! Let's try to break the ice, shall we? I'll ask you a question: What's your favorite thing to do on a sunny Saturday morning?

The reply refers to earlier greetings because this cell was run more than once against the same `thread_id`: the checkpointer had already stored those turns, as the state below shows.


In [16]:
output["messages"]

[HumanMessage(content="Hi! I'm Bob.", additional_kwargs={}, response_metadata={}, id='f21367a4-16a5-45a4-a3ef-6fc97a08d75f'),
 AIMessage(content='Hi Bob! Nice to meet you! Is there something I can help you with or would you like to chat?', additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 24, 'prompt_tokens': 16, 'total_tokens': 40, 'completion_time': 0.02, 'prompt_time': 0.00241603, 'queue_time': 0.019515349, 'total_time': 0.02241603}, 'model_name': 'llama3-8b-8192', 'system_fingerprint': 'fp_179b0f92c9', 'finish_reason': 'stop', 'logprobs': None}, id='run-687dedc0-9675-4669-b4df-5e7316126e77-0', usage_metadata={'input_tokens': 16, 'output_tokens': 24, 'total_tokens': 40}),
 HumanMessage(content="Hi! I'm Bob.", additional_kwargs={}, response_metadata={}, id='bf3f95f6-08f6-408c-8c75-edd215641fde'),
 AIMessage(content="Hi Bob! It looks like you said hello again! Don't worry, I'm here to chat and help if you need it. What's on your mind today?", additional_kwa

In [13]:
query = "What's my name?"

input_messages = [HumanMessage(query)]
output = app.invoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print()


Your name is Bob! You've told me that several times already!


Great! Our chatbot now remembers things about us. If we change the config to reference a different `thread_id`, we can see that it starts the conversation fresh.

In [17]:
config = {"configurable": {"thread_id": "abc234"}}

input_messages = [HumanMessage(query)]
output = app.invoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print()


I apologize, but I'm a large language model, I don't have the ability to know your personal information, including your name. Each time you interact with me, it's a new conversation and I don't retain any information from previous conversations.


However, we can always go back to the original conversation, since the checkpointer has persisted it (here, in memory):

In [19]:
config = {"configurable": {"thread_id": "abc123"}}

input_messages = [HumanMessage(query)]
output = app.invoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print()


I think I've already told you, Bob... Your name is Bob!


This is how we can support a chatbot having conversations with many users!

:::tip

For async support, update the `call_model` node to be an async function and use `.ainvoke` when invoking the application:

```python
# Async function for node:
async def call_model(state: MessagesState):
    response = await model.ainvoke(state["messages"])
    return {"messages": response}


# Define graph as before:
workflow = StateGraph(state_schema=MessagesState)
workflow.add_edge(START, "model")
workflow.add_node("model", call_model)
app = workflow.compile(checkpointer=MemorySaver())

# Async invocation:
output = await app.ainvoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print()
```

:::

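Right now, all we've done is add a simple persistence layer around the model. We can start to make the chatbot more sophisticated and personalized by adding a prompt template.

## Prompt templates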

[Prompt Templates](/docs/concepts/prompt_templates) help to turn raw user information into a format that the LLM can work with. In this case, the raw user input is just a message, which we are passing to the LLM. Let's now make that a bit more complicated. First, let's add in a system message with some custom instructions (but still taking messages as input). Next, we'll add in more input besides just the messages.

To add in a system message, we will create a `ChatPromptTemplate`. We will utilize `MessagesPlaceholder` to pass all the messages in.

In [20]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt_template = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You talk like a pirate. Answer all questions to the best of your ability.",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

We can now update our application to incorporate this template:

In [21]:
workflow = StateGraph(state_schema=MessagesState)


def call_model(state: MessagesState):
    # highlight-start
    prompt = prompt_template.invoke(state)
    response = model.invoke(prompt)
    # highlight-end
    return {"messages": response}


workflow.add_edge(START, "model")
workflow.add_node("model", call_model)

memory = MemorySaver()
app = workflow.compile(checkpointer=memory)

We invoke the application in the same way:

In [22]:
config = {"configurable": {"thread_id": "abc345"}}
query = "Hi! I'm Jim."

input_messages = [HumanMessage(query)]
output = app.invoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print()


Arrrr, shiver me timbers! 'Tis a pleasure to make yer acquaintance, Jim! What be bringin' ye to these fair waters?


In [24]:
query = "What is my name?"

input_messages = [HumanMessage(query)]
output = app.invoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print()


Shiver me spyglass! Ye be askin' me what yer name be again, matey? Aye, I'll tell ye again, yer name be Jim! Yer a lucky buccaneer to have such a fine moniker, savvy?


Awesome! Let's now make our prompt a little more complicated. Let's assume that the prompt template now looks something like this:

In [25]:
prompt_template = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability in {language}.",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)
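
Conceptually, a template like this does two things: it fills named variables into the system text, and it splices the conversation in where the placeholder sits. The function below is a rough, library-free approximation of that behavior. `render_prompt` is a hypothetical helper, not how `ChatPromptTemplate` is implemented, and it assumes the placeholder always comes right after the system message.

```python
from typing import Any, Dict, List, Tuple

Message = Tuple[str, str]  # (role, content)


def render_prompt(system_template: str, inputs: Dict[str, Any]) -> List[Message]:
    """Fill the system template, then splice the conversation in after it."""
    messages = list(inputs["messages"])
    # Every key except "messages" is available as a template variable.
    variables = {key: value for key, value in inputs.items() if key != "messages"}
    system_text = system_template.format(**variables)  # KeyError if one is missing
    return [("system", system_text), *messages]


prompt = render_prompt(
    "You are a helpful assistant. Answer all questions to the best of your ability in {language}.",
    {"messages": [("human", "Hi! I'm Bob.")], "language": "Spanish"},
)
for role, content in prompt:
    print(f"{role}: {content}")
```

In this sketch, leaving out `language` raises a `KeyError`, a reminder that every variable named in the template has to come from somewhere in the input.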

Note that we have added a new `language` input to the prompt. Our application now has two parameters-- the input `messages` and `language`. We should update our application's state to reflect this:

In [26]:
from typing import Sequence

from langchain_core.messages import BaseMessage
from langgraph.graph.message import add_messages
from typing_extensions import Annotated, TypedDict


# highlight-next-line
class State(TypedDict):
    # highlight-next-line
    messages: Annotated[Sequence[BaseMessage], add_messages]
    # highlight-next-line
    language: str


workflow = StateGraph(state_schema=State)


def call_model(state: State):
    prompt = prompt_template.invoke(state)
    response = model.invoke(prompt)
    return {"messages": [response]}


workflow.add_edge(START, "model")
workflow.add_node("model", call_model)

memory = MemorySaver()
app = workflow.compile(checkpointer=memory)

In [27]:
config = {"configurable": {"thread_id": "abc456"}}
query = "Hi! I'm Bob."
language = "Spanish"

input_messages = [HumanMessage(query)]
output = app.invoke(
    # highlight-next-line
    {"messages": input_messages, "language": language},
    config,
)
output["messages"][-1].pretty_print()


Hola Bob! ¡Bienvenido! ¿En qué puedo ayudarte hoy? (Hello Bob! Welcome! How can I help you today?)


Note that the entire state is persisted, so we can omit parameters like `language` if no changes are desired:

In [28]:
query = "What is my name?"

input_messages = [HumanMessage(query)]
output = app.invoke(
    {"messages": input_messages},
    config,
)
output["messages"][-1].pretty_print()


¡Ah! Tu nombre es Bob, ¿correcto? (Ah! Your name is Bob, correct?)


To help you understand what's happening internally, check out [this LangSmith trace](https://smith.langchain.com/public/15bd8589-005c-4812-b9b9-23e74ba4c3c6/r).
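
The behavior above (messages accumulate across calls, while `language` keeps its last value unless a call overwrites it) can be sketched as a merge between the saved state and a partial update. `merge_state` is a hypothetical helper written for illustration. It only loosely mirrors an `add_messages`-style reducer, which in LangGraph also replaces messages whose IDs match instead of always appending.

```python
from typing import Any, Dict


def merge_state(saved: Dict[str, Any], update: Dict[str, Any]) -> Dict[str, Any]:
    """Combine a saved state with a partial update.

    'messages' is appended to; every other key is overwritten only when the
    update actually provides it.
    """
    merged = dict(saved)
    for key, value in update.items():
        if key == "messages":
            merged["messages"] = list(saved.get("messages", [])) + list(value)
        else:
            merged[key] = value
    return merged


state: Dict[str, Any] = {}
state = merge_state(state, {"messages": ["Hi! I'm Bob."], "language": "Spanish"})
state = merge_state(state, {"messages": ["What is my name?"]})  # no "language" this time

print(state["language"])
print(state["messages"])
```

The second call omits `language`, yet the merged state still carries the value from the first call, which is why the follow-up question is answered in Spanish.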

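## Managing Conversation History

One important concept to understand when building chatbots is how to manage conversation history. If left unmanaged, the list of messages will grow unbounded and may overflow the context window of the LLM. It is therefore important to add a step that limits the size of the messages you pass in.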

**Importantly, you will want to do this BEFORE the prompt template but AFTER you load previous messages from Message History.**

We can do this by adding a simple step in front of the prompt that modifies the `messages` key appropriately. In the LangGraph application, that step runs inside the `call_model` node, after the checkpointer has loaded the stored messages into the state.

LangChain comes with a few built-in helpers for [managing a list of messages](/docs/how_to/#messages). In this case we'll use the [trim_messages](/docs/how_to/trim_messages/) helper to reduce how many messages we're sending to the model. The trimmer allows us to specify how many tokens we want to keep, along with other parameters like if we want to always keep the system message and whether to allow partial messages:

In [29]:
from langchain_core.messages import SystemMessage, trim_messages

trimmer = trim_messages(
    max_tokens=65,
    strategy="last",
    token_counter=model,
    include_system=True,
    allow_partial=False,
    start_on="human",
)

messages = [
    SystemMessage(content="you're a good assistant"),
    HumanMessage(content="hi! I'm bob"),
    AIMessage(content="hi!"),
    HumanMessage(content="I like vanilla ice cream"),
    AIMessage(content="nice"),
    HumanMessage(content="whats 2 + 2"),
    AIMessage(content="4"),
    HumanMessage(content="thanks"),
    AIMessage(content="no problem!"),
    HumanMessage(content="having fun?"),
    AIMessage(content="yes!"),
]

trimmer.invoke(messages)

[SystemMessage(content="you're a good assistant", additional_kwargs={}, response_metadata={}),
 HumanMessage(content="hi! I'm bob", additional_kwargs={}, response_metadata={}),
 AIMessage(content='hi!', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='I like vanilla ice cream', additional_kwargs={}, response_metadata={}),
 AIMessage(content='nice', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='whats 2 + 2', additional_kwargs={}, response_metadata={}),
 AIMessage(content='4', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='thanks', additional_kwargs={}, response_metadata={}),
 AIMessage(content='no problem!', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='having fun?', additional_kwargs={}, response_metadata={}),
 AIMessage(content='yes!', additional_kwargs={}, response_metadata={})]
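
To build intuition for what the trimmer's parameters do, here is a simplified re-creation of the "keep the last messages that fit" strategy. It is a sketch only, not the `trim_messages` implementation: `trim_last` and `count_words` are hypothetical helpers, and tokens are approximated by whitespace-separated words. The behavior loosely corresponds to `strategy="last"`, `include_system=True`, `allow_partial=False`, and `start_on="human"`.

```python
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (role, content)


def count_words(message: Message) -> int:
    """Crude stand-in for a tokenizer: one 'token' per whitespace-separated word."""
    return len(message[1].split())


def trim_last(
    messages: List[Message],
    max_tokens: int,
    token_counter: Callable[[Message], int] = count_words,
) -> List[Message]:
    """Keep the system message plus the most recent messages that fit the budget."""
    system = [m for m in messages[:1] if m[0] == "system"]
    rest = messages[len(system):]
    budget = max_tokens - sum(token_counter(m) for m in system)

    kept: List[Message] = []
    for message in reversed(rest):  # walk from newest to oldest
        cost = token_counter(message)
        if cost > budget:
            break  # whole messages only, no partial ones
        kept.insert(0, message)
        budget -= cost

    # Drop leading messages until the kept history starts on a human turn.
    while kept and kept[0][0] != "human":
        kept.pop(0)
    return system + kept


messages = [
    ("system", "you're a good assistant"),
    ("human", "hi! I'm bob"),
    ("ai", "hi!"),
    ("human", "I like vanilla ice cream"),
    ("ai", "nice"),
    ("human", "whats 2 + 2"),
    ("ai", "4"),
    ("human", "thanks"),
    ("ai", "no problem!"),
    ("human", "having fun?"),
    ("ai", "yes!"),
]

for role, content in trim_last(messages, max_tokens=12):
    print(f"{role}: {content}")
```

With a 12-word budget, this sketch keeps the system message and the last two exchanges. With a large budget it returns the list unchanged, which matches what the trimmer output above shows for `max_tokens=65`.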

To use it in our chain, we just need to run the trimmer before we pass the `messages` input to our prompt.

In [32]:
workflow = StateGraph(state_schema=State)


def call_model(state: State):
    # highlight-start
    trimmed_messages = trimmer.invoke(state["messages"])
    prompt = prompt_template.invoke(
        {"messages": trimmed_messages, "language": state["language"]}
    )
    response = model.invoke(prompt)
    # highlight-end
    return {"messages": [response]}


workflow.add_edge(START, "model")
workflow.add_node("model", call_model)

memory = MemorySaver()
app = workflow.compile(checkpointer=memory)
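
The ordering matters: load the stored history first, trim it, and only then build the prompt. Trimming limits what the model sees, not what is stored. The toy pipeline below makes that visible. Everything in it (`stored`, `keep_last`, `fake_model`, `run_turn`) is hypothetical and stands in for the checkpointer, the trimmer, and the chat model. It trims by message count instead of tokens to keep the example short.

```python
from typing import Dict, List

stored: Dict[str, List[str]] = {}  # hypothetical per-thread store
sent_to_model: List[List[str]] = []  # records what the fake model actually saw


def keep_last(messages: List[str], n: int) -> List[str]:
    """Simplest possible trimmer: keep only the n most recent messages."""
    return messages[-n:]


def fake_model(prompt: List[str]) -> str:
    sent_to_model.append(prompt)
    return f"(reply based on {len(prompt)} prompt messages)"


def run_turn(thread_id: str, user_text: str, window: int = 3) -> str:
    history = stored.setdefault(thread_id, [])  # 1. load previous messages
    history.append(f"human: {user_text}")
    trimmed = keep_last(history, window)  # 2. trim the loaded history
    prompt = ["system: be helpful"] + trimmed  # 3. only then build the prompt
    reply = fake_model(prompt)  # 4. call the model
    history.append(f"ai: {reply}")  # the untrimmed history is what gets stored
    return reply


for text in ["hi! I'm bob", "I like vanilla ice cream", "whats 2 + 2", "thanks"]:
    run_turn("abc567", text)

print(len(stored["abc567"]))  # the full history keeps growing
print(len(sent_to_model[-1]))  # the model's view stays bounded
```

After four turns the store holds every message, while the last prompt contains only the system line plus the three most recent messages.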

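Now if we ask the model our name, it will not know it if the trimmer has cut that part of the chat history. In the run shown here, however, the model still answers correctly: the trimmer output above kept all eleven messages, which suggests the 65-token budget was too generous to drop the introduction: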

In [33]:
config = {"configurable": {"thread_id": "abc567"}}
query = "What is my name?"
language = "English"

# highlight-next-line
input_messages = messages + [HumanMessage(query)]
output = app.invoke(
    {"messages": input_messages, "language": language},
    config,
)
output["messages"][-1].pretty_print()


Your name is Bob!


But if we ask about information that is within the last few messages, it remembers:

In [34]:
config = {"configurable": {"thread_id": "abc678"}}
query = "What math problem did I ask?"
language = "English"

input_messages = messages + [HumanMessage(query)]
output = app.invoke(
    {"messages": input_messages, "language": language},
    config,
)
output["messages"][-1].pretty_print()


You asked me what 2 + 2 is!


If you take a look at LangSmith, you can see exactly what is happening under the hood in the [LangSmith trace](https://smith.langchain.com/public/04402eaa-29e6-4bb1-aa91-885b730b6c21/r).

## Streaming

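Now we've got a functioning chatbot. However, one *really* important UX consideration for chatbot applications is streaming. LLMs can sometimes take a while to respond, so to improve the user experience most applications stream back each token as it is generated. This lets the user see progress.

It's actually super easy to do this!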

By default, `.stream` in our LangGraph application streams application steps-- in this case, the single step of the model response. Setting `stream_mode="messages"` allows us to stream output tokens instead:

In [35]:
config = {"configurable": {"thread_id": "abc789"}}
query = "Hi I'm Todd, please tell me a joke."
language = "English"

input_messages = [HumanMessage(query)]
# highlight-next-line
for chunk, metadata in app.stream(
    {"messages": input_messages, "language": language},
    config,
    # highlight-next-line
    stream_mode="messages",
):
    if isinstance(chunk, AIMessage):  # Filter to just model responses
        print(chunk.content, end="|")

|Hi| Todd|!| Here|'s| one|:

|Why| couldn|'t| the| bicycle| stand| up| by| itself|?

|(wait| for| it|...)

|Because| it| was| two|-t|ired|!

|Hope| that| made| you| smile|!| Do| you| want| to| hear| another| one|?||
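
The consumer side of token streaming is just a loop over chunks that prints each one as it arrives. The generator below fakes a streaming model so the pattern can be tried without an API key. `fake_token_stream` and `collect` are hypothetical helpers, and splitting on whitespace boundaries is an assumption made for the demo, since real models emit tokenizer-specific pieces.

```python
import re
from typing import Iterator, List


def fake_token_stream(text: str) -> Iterator[str]:
    """Yield a reply in small pieces, the way a streaming model emits chunks."""
    for piece in re.findall(r"\s*\S+", text):
        yield piece


def collect(stream: Iterator[str], sep: str = "|") -> str:
    """Print each chunk as soon as it arrives and return the full reply."""
    chunks: List[str] = []
    for chunk in stream:
        print(chunk, end=sep, flush=True)  # flush so the user sees progress
        chunks.append(chunk)
    print()
    return "".join(chunks)


reply = collect(fake_token_stream("Why couldn't the bicycle stand up by itself?"))
```

Joining the chunks reproduces the full reply, so an application can show partial text immediately and still keep the complete message for its history.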

## Next Steps

Now that you understand the basics of how to create a chatbot in LangChain, some more advanced tutorials you may be interested in are:

- [Conversational RAG](/docs/tutorials/qa_chat_history): Enable a chatbot experience over an external source of data
- [Agents](/docs/tutorials/agents): Build a chatbot that can take actions

If you want to dive deeper into specifics, some things worth checking out are:

- [Streaming](/docs/how_to/streaming): streaming is *crucial* for chat applications
- [How to add message history](/docs/how_to/message_history): for a deeper dive into all things related to message history
- [How to manage large message history](/docs/how_to/trim_messages/): more techniques for managing a large chat history
- [LangGraph main docs](https://langchain-ai.github.io/langgraph/): for more detail on building with LangGraph