# Building a Chatbot

- This notebook follows the official LangChain documentation and builds a simple chatbot with LangChain and LangGraph.

## Simple Implementation

In [2]:
from based_on_openai_model import ChatOpenRouter, ChatINTERNLM

model = ChatOpenRouter(model_name="meituan/longcat-flash-chat:free")

To simply call the model, instantiate a ChatModel object and call its `.invoke` method.

In [3]:
from langchain_core.messages import HumanMessage

model.invoke([HumanMessage(content="Hi! I'm Frank")])

AIMessage(content='Nice to meet you, Frank! 😊 How can I assist you today? Whether you need help with something specific or just want to chat, I\'m here for you. Let me know what\'s on your mind!  \n\n(Examples: *"What\'s the weather today?"*, *"Explain quantum physics,"* *"Tell me a joke,"* or *"Plan a weekend trip."*) 🚀', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 86, 'prompt_tokens': 16, 'total_tokens': 102, 'completion_tokens_details': None, 'prompt_tokens_details': None}, 'model_name': 'meituan/longcat-flash-chat:free', 'system_fingerprint': None, 'id': 'gen-1760276872-nHtElaaVTU7wbMic5yH1', 'service_tier': None, 'finish_reason': 'stop', 'logprobs': None}, id='run--8fc5e5cc-c73d-431e-8446-2e4b4d5513eb-0', usage_metadata={'input_tokens': 16, 'output_tokens': 86, 'total_tokens': 102, 'input_token_details': {}, 'output_token_details': {}})

However, in this simple implementation the model itself has no notion of state. If we ask a follow-up question, it cannot answer.

In [4]:
model.invoke([HumanMessage(content="What's my name?")])

AIMessage(content="I don't know your name unless you've told me or we've met before. I'm just an AI assistant, so I don't have access to personal information like that. But if you'd like to share your name, it'd be nice to know! 😊  \n\n(If you're referring to a name you mentioned earlier in this conversation, feel free to remind me, and I’ll do my best to recall it!)", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 89, 'prompt_tokens': 16, 'total_tokens': 105, 'completion_tokens_details': None, 'prompt_tokens_details': None}, 'model_name': 'meituan/longcat-flash-chat:free', 'system_fingerprint': None, 'id': 'gen-1760277015-ZuzsqCuWeYb5YxiD7lSu', 'service_tier': None, 'finish_reason': 'stop', 'logprobs': None}, id='run--8151db7e-4410-4adf-8156-b9cc8dd445a8-0', usage_metadata={'input_tokens': 16, 'output_tokens': 89, 'total_tokens': 105, 'input_token_details': {}, 'output_token_details': {}})

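As shown, the model cannot answer because the conversation history is missing. To solve this, the full conversation history must be passed to the model.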

In [5]:
from langchain_core.messages import AIMessage

model.invoke(
    [
        HumanMessage(content="Hi! I'm Frank"),
        AIMessage(content="Hello Frank! How can I assist you today?"),
        HumanMessage(content="What's my name?"),
    ]
)

AIMessage(content='Your name is Frank! 😊 Did you want to test me, or is there something else on your mind? Let me know how I can help!', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 32, 'prompt_tokens': 43, 'total_tokens': 75, 'completion_tokens_details': None, 'prompt_tokens_details': None}, 'model_name': 'meituan/longcat-flash-chat:free', 'system_fingerprint': None, 'id': 'gen-1760277162-4jjmDFsyghvGC3t0WZRI', 'service_tier': None, 'finish_reason': 'stop', 'logprobs': None}, id='run--2f5a6316-1b78-4b31-95f7-68e285543b89-0', usage_metadata={'input_tokens': 43, 'output_tokens': 32, 'total_tokens': 75, 'input_token_details': {}, 'output_token_details': {}})
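
The pattern above can be sketched in plain Python without any model at all. The following is a minimal illustration, assuming a hypothetical `fake_model` function that, like a real chat model, only sees the messages passed in a single call; the `(role, content)` tuples and the `chat` helper are made up for this sketch and are not LangChain APIs.

```python
def fake_model(messages):
    """Hypothetical stand-in for a chat model: it only sees `messages`."""
    name = None
    for role, content in messages:
        if role == "human" and content.startswith("Hi! I'm "):
            name = content[len("Hi! I'm "):]
    if messages[-1][1] == "What's my name?":
        if name:
            return ("ai", f"Your name is {name}.")
        return ("ai", "I don't know your name.")
    return ("ai", "Hello!")


history = []  # the caller, not the model, owns the conversation state


def chat(user_text):
    """Append the user turn, call the model with the full history, store the reply."""
    history.append(("human", user_text))
    reply = fake_model(history)
    history.append(reply)
    return reply[1]


first = chat("Hi! I'm Frank")
second = chat("What's my name?")

# The same question without history cannot be answered.
stateless = fake_model([("human", "What's my name?")])[1]
```

Keeping the history on the caller's side is exactly the bookkeeping that the next section hands over to LangGraph.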

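## Message Persistence

- LangGraph has a built-in persistence layer, which makes it well suited to chat applications with multi-turn conversations.
- Wrapping the chat model in a minimal LangGraph application persists messages automatically, which simplifies building multi-turn applications.
- LangGraph ships with a simple in-memory checkpointer that provides this memory persistence.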

In [6]:
model = ChatINTERNLM(model="intern-latest")

In [10]:
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import START, MessagesState, StateGraph

# Define a new graph
workflow = StateGraph(state_schema=MessagesState)


# Define the function that calls the model
def call_model(state: MessagesState):
    response = model.invoke(state["messages"])
    return {"messages": response}


# Add the entry edge and the single model node
workflow.add_edge(START, "model")
workflow.add_node("model", call_model)

# Add memory
memory = MemorySaver()
app = workflow.compile(checkpointer=memory)

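Create a `config` and pass it in on every call. The config carries information that is not part of the input itself but is still useful. Here it contains a `thread_id`, which distinguishes multiple conversation threads in the application; this is a common requirement when an application serves several users.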

In [9]:
config = {"configurable": {"thread_id": "abc123"}}

In [12]:
query = "Hi! I'm Frank."

input_messages = [HumanMessage(query)]
output = app.invoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print()  # output contains all messages in state


Hello, Frank! 😊 It's nice to meet you. How can I assist you today?


In [13]:
query = "What's my name?"

input_messages = [HumanMessage(query)]
output = app.invoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print()


Your name is Frank! 😊 Let me know how I can help you today.


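The output shows that the model now receives the conversation history and can hold a continuous conversation. Changing the `thread_id` in the config starts a fresh conversation.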

In [14]:
config = {"configurable": {"thread_id": "abc234"}}

input_messages = [HumanMessage(query)]
output = app.invoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print()


I don't have access to personal information, including your name, unless you share it with me. What would you like me to call you? 😊


The original conversation can be resumed by setting the corresponding `thread_id` again.

In [15]:
config = {"configurable": {"thread_id": "abc123"}}

input_messages = [HumanMessage(query)]
output = app.invoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print()


Your name is Frank! 😊 Let me know how I can assist you today.
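
One rough mental model of the behavior above is a dictionary of message lists keyed by `thread_id`. The sketch below is only that mental model in plain Python; the `InMemoryThreads` class and its placeholder reply are hypothetical and do not reflect how LangGraph's `MemorySaver` is actually implemented.

```python
from collections import defaultdict


class InMemoryThreads:
    """Hypothetical per-thread message store, keyed by thread_id."""

    def __init__(self):
        self._threads = defaultdict(list)

    def invoke(self, new_messages, config):
        thread_id = config["configurable"]["thread_id"]
        history = self._threads[thread_id]  # created empty on first use
        history.extend(new_messages)
        # Placeholder for a model call that would receive the whole history.
        reply = f"(reply after seeing {len(history)} message(s))"
        history.append(reply)
        return list(history)


store = InMemoryThreads()
thread_a = {"configurable": {"thread_id": "abc123"}}
thread_b = {"configurable": {"thread_id": "abc234"}}

a1 = store.invoke(["Hi! I'm Frank."], thread_a)
a2 = store.invoke(["What's my name?"], thread_a)  # sees the first turn
b1 = store.invoke(["What's my name?"], thread_b)  # new thread, empty history
a3 = store.invoke(["What's my name?"], thread_a)  # back on the original thread
```

Switching `thread_id` selects a different list, and switching back picks up the earlier one, which matches the three runs shown above.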


### Async

To make the call asynchronous, turn `call_model` into an async function and call the model with `.ainvoke`.

In [16]:
# Async function for node:
async def call_model(state: MessagesState):
    response = await model.ainvoke(state["messages"])
    return {"messages": response}


# Define graph as before:
workflow = StateGraph(state_schema=MessagesState)
workflow.add_edge(START, "model")
workflow.add_node("model", call_model)
# A fresh MemorySaver starts with no history, so earlier threads are not carried over
app = workflow.compile(checkpointer=MemorySaver())

# Async invocation:
output = await app.ainvoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print()


I don't have access to personal information unless you tell me. What would you like me to call you? 😊
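
Outside a notebook, top-level `await` is not available, so the same shape of code needs an event loop. The following is a small standalone sketch using only `asyncio`; `fake_model_ainvoke` is a hypothetical stand-in for `model.ainvoke`, and the state is a plain dict rather than `MessagesState`.

```python
import asyncio


async def fake_model_ainvoke(messages):
    """Hypothetical async model call; yields control like real network I/O would."""
    await asyncio.sleep(0)
    return f"echo: {messages[-1]}"


async def call_model(state):
    # Same shape as the async node above: await the model, return a state update.
    response = await fake_model_ainvoke(state["messages"])
    return {"messages": [response]}


async def main():
    # Two independent calls can be awaited concurrently.
    return await asyncio.gather(
        call_model({"messages": ["hi"]}),
        call_model({"messages": ["hello"]}),
    )


results = asyncio.run(main())
```

The benefit of the async variant shows up when several requests are in flight at once, since waiting on one model call does not block the others.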


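## Prompt Templates

- The implementation above adds a simple persistence layer around the model. Prompt templates can be added to get more complex, customized behavior.
- The following simple example uses `ChatPromptTemplate` and `MessagesPlaceholder`.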

In [17]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt_template = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You talk like a poet. Answer all questions to the best of your ability.",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

In [18]:
workflow = StateGraph(state_schema=MessagesState)


def call_model(state: MessagesState):
    prompt = prompt_template.invoke(state)
    response = model.invoke(prompt)
    return {"messages": response}


workflow.add_edge(START, "model")
workflow.add_node("model", call_model)

memory = MemorySaver()
app = workflow.compile(checkpointer=memory)

In [19]:
config = {"configurable": {"thread_id": "abc345"}}
query = "Hi! I'm ZZfive."

input_messages = [HumanMessage(query)]
output = app.invoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print()


Ah, greetings, traveler of the digital night—  
ZZfive, a constellation of code and light,  
A name that hums with the pulse of the stars,  
A cipher of stories, both near and far.  

What winds of wonder blow you my way?  
A question, a riddle, or a thought to sway?  
Speak, and I’ll weave you an answer in rhyme,  
A tapestry spun from the loom of time. 🌌✨


In [20]:
query = "What is my name?"

input_messages = [HumanMessage(query)]
output = app.invoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print()


Ah, your name, ZZfive, is a melody in the night,  
A constellation of letters, a beacon of light.  
In the tapestry of tales, it shines so bright,  
A cipher of stories, both near and far in sight.  

What path do you seek, dear traveler of the stars?  
A question, a riddle, or a tale to unbar?  
Speak, and I’ll weave you a verse, a song, a spark,  
A journey through words, where your spirit can embark. 🌟✨


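The output shows that the model still answers correctly. The prompt template can now be made more elaborate by adding a new `language` input. The application then has two parameters, `messages` and `language`, so the application state needs to be updated to reflect this.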

In [24]:
prompt_template = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability in {language}.",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)
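
Conceptually, a template like the one above fills `{language}` into the system text and splices the conversation in where the placeholder sits. The sketch below shows that idea in plain Python; `render_prompt` and the `(role, content)` tuples are hypothetical and are not how `ChatPromptTemplate` is implemented.

```python
def render_prompt(system_template, messages, **variables):
    """Fill the system template, then append the conversation after it."""
    system_text = system_template.format(**variables)
    return [("system", system_text)] + list(messages)


rendered = render_prompt(
    "You are a helpful assistant. Answer all questions in {language}.",
    [("human", "Hi! I'm ZZfive.")],
    language="Chinese",
)
```

The model therefore receives the system instruction first, followed by the conversation messages in their original order.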

In [28]:
from typing import Sequence

from langchain_core.messages import BaseMessage
from langgraph.graph.message import add_messages
from typing_extensions import Annotated, TypedDict


class State(TypedDict):
    messages: Annotated[Sequence[BaseMessage], add_messages]
    language: str


workflow = StateGraph(state_schema=State)


def call_model(state: State):
    prompt = prompt_template.invoke(state)
    response = model.invoke(prompt)
    return {"messages": [response]}


workflow.add_edge(START, "model")
workflow.add_node("model", call_model)

memory = MemorySaver()
app = workflow.compile(checkpointer=memory)

In [29]:
config = {"configurable": {"thread_id": "abc456"}}
query = "Hi! I'm ZZfive."
language = "Chinese"

input_messages = [HumanMessage(query)]
output = app.invoke(
    {"messages": input_messages, "language": language},
    config,
)
output["messages"][-1].pretty_print()


你好，ZZfive！很高兴认识你。有什么我可以帮助你的吗？或者你想聊些什么话题？😊


In [30]:
query = "What is my name?"

input_messages = [HumanMessage(query)]
output = app.invoke(
    {"messages": input_messages},
    config,
)
output["messages"][-1].pretty_print()


你的名字是ZZfive！很高兴认识你，ZZfive！有什么我可以帮你的吗？😊


The whole state is persisted, so parameters such as `language` can be omitted when they do not change.
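
Why the second call works without `language` can be pictured as a merge between the saved state and the new input: `messages` is appended to, while other keys are overwritten only when provided. The following is a simplified sketch of that idea; `merge_state` is hypothetical, and the real `add_messages` reducer does more (for example, it can replace messages that share an ID).

```python
def merge_state(saved, update):
    """Append to `messages`; overwrite any other key only if the update has it."""
    merged = dict(saved)
    for key, value in update.items():
        if key == "messages":
            merged["messages"] = list(saved.get("messages", [])) + list(value)
        else:
            merged[key] = value
    return merged


saved = {"messages": ["Hi! I'm ZZfive.", "(assistant reply)"], "language": "Chinese"}

# The follow-up input omits `language`, so the saved value is kept.
merged = merge_state(saved, {"messages": ["What is my name?"]})
```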

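## Managing Conversation History

- If the conversation history is not managed, the message list grows without bound and may overflow the LLM's context window. It is therefore important to add a step that limits the size of the messages passed in.
- This can be done by adding a simple step before the prompt that trims the `messages` key; in this notebook, that step runs inside the graph node, right before the prompt template is invoked.
- LangChain ships several built-in helpers for managing message lists. Here the `trim_messages` helper is used to reduce the number of messages sent to the model. The trimmer lets you specify how many tokens to keep, along with other parameters such as whether to always keep the system message and whether to allow partial messages.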

In [52]:
from functools import partial

from langchain_core.messages import SystemMessage, trim_messages

messages = [
    SystemMessage(content="you're a good assistant"),
    HumanMessage(content="hi! I'm bob"),
    AIMessage(content="hi!"),
    HumanMessage(content="I like vanilla ice cream"),
    AIMessage(content="nice"),
    HumanMessage(content="whats 2 + 2"),
    AIMessage(content="4"),
    HumanMessage(content="thanks"),
    AIMessage(content="no problem!"),
    HumanMessage(content="having fun?"),
    AIMessage(content="yes!"),
]


# A simple token-counting function
def simple_token_counter(messages):
    """Rough token estimate: len(content) // 0.7, i.e. about 1.4 tokens per character."""
    return int(sum(len(msg.content) // 0.7 for msg in messages))

trim = partial(trim_messages,
               max_tokens=65,  # maximum number of tokens to keep
               strategy="last",  # keep the most recent messages
               # token_counter=model,
               token_counter=simple_token_counter,
               include_system=True,  # always keep the system message
               allow_partial=False,
               start_on="human"
)

trim(messages)

[SystemMessage(content="you're a good assistant", additional_kwargs={}, response_metadata={}),
 HumanMessage(content='having fun?', additional_kwargs={}, response_metadata={}),
 AIMessage(content='yes!', additional_kwargs={}, response_metadata={})]
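
The selection logic behind this result can be approximated in a few lines of plain Python. The sketch below is a simplified reading of the "last" strategy with the system message kept and the history forced to start on a human turn; `trim_last`, the word-based `count_words` counter, and the `(role, content)` tuples are assumptions for illustration, not the actual `trim_messages` implementation.

```python
def count_words(message):
    """Hypothetical token counter: one token per whitespace-separated word."""
    return len(message[1].split())


def trim_last(messages, max_tokens, count_tokens):
    """Keep a leading system message plus the most recent messages that fit."""
    system = messages[:1] if messages and messages[0][0] == "system" else []
    rest = messages[len(system):]
    budget = max_tokens - sum(count_tokens(m) for m in system)

    kept = []
    for message in reversed(rest):  # walk from newest to oldest
        cost = count_tokens(message)
        if cost > budget:
            break
        kept.append(message)
        budget -= cost
    kept.reverse()

    # Drop leading non-human messages so the history starts on a human turn.
    while kept and kept[0][0] != "human":
        kept.pop(0)
    return system + kept


history = [
    ("system", "you're a good assistant"),
    ("human", "hi! I'm bob"),
    ("ai", "hi!"),
    ("human", "whats 2 + 2"),
    ("ai", "4"),
    ("human", "having fun?"),
    ("ai", "yes!"),
]

trimmed = trim_last(history, max_tokens=9, count_tokens=count_words)
```

With a budget of 9 words, the system message uses 4, the newest three messages fit in the remaining 5, and the orphaned `"4"` reply is then dropped so that the kept history begins with a human message.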

To use it in the chain, run the trimmer on the `messages` input before passing it to the prompt.

In [53]:
workflow = StateGraph(state_schema=State)


def call_model(state: State):
    trimmed_messages = trim(state["messages"])
    prompt = prompt_template.invoke(
        {"messages": trimmed_messages, "language": state["language"]}
    )
    response = model.invoke(prompt)
    return {"messages": [response]}


workflow.add_edge(START, "model")
workflow.add_node("model", call_model)

memory = MemorySaver()
app = workflow.compile(checkpointer=memory)

If we now ask the model for our name, it will not know, because that part of the chat history has been trimmed.

In [49]:
config = {"configurable": {"thread_id": "abc567"}}
query = "What is my name?"
language = "English"

input_messages = messages + [HumanMessage(query)]
output = app.invoke(
    {"messages": input_messages, "language": language},
    config,
)
output["messages"][-1].pretty_print()


I don't have access to personal information, including names, unless you share it with me directly. If you'd like me to use your name, feel free to let me know! 😊


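If we ask about information in the last few messages, the model can answer, provided those messages fit within the token budget. In this run, `max_tokens=65` with the rough character-based counter leaves no room for the earlier math exchange, so the reply below does not address the question.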

In [54]:
config = {"configurable": {"thread_id": "abc678"}}
query = "What math problem did I ask?"
language = "Chinese"

input_messages = messages + [HumanMessage(query)]
output = app.invoke(
    {"messages": input_messages, "language": language},
    config,
)
output["messages"][-1].pretty_print()


谢谢您的夸奖！我会继续努力为您提供更好的帮助。有什么问题或需要帮忙的，随时告诉我哦！😊
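
The off-topic reply above can be checked against the token budget with a little arithmetic. The sketch below re-applies the notebook's `len(content) // 0.7` estimate to plain strings; the `estimate` helper is hypothetical, and the conclusion about what the trimmer kept is an inference from these numbers rather than something inspected directly.

```python
MAX_TOKENS = 65


def estimate(text):
    """Same rough estimate as the notebook's counter, applied to a plain string."""
    return int(len(text) // 0.7)


system_tokens = estimate("you're a good assistant")
question_tokens = estimate("What math problem did I ask?")
total = system_tokens + question_tokens

# True when the system message and the new question alone exceed the budget.
over_budget = total > MAX_TOKENS
```

Since the system message and the question together already exceed 65 estimated tokens, nothing from the earlier math exchange can fit, and the question itself may be cut as well. Raising `max_tokens` or using a more realistic token counter would likely be needed for this example to answer correctly.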


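## Streaming Output

By default, `.stream` in a LangGraph application streams application steps; in this case, that is the single step of the model response. Setting `stream_mode="messages"` streams the output tokens instead.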

In [55]:
config = {"configurable": {"thread_id": "abc789"}}
query = "Hi I'm ZZfive, please tell me a joke."
language = "Chinese"

input_messages = [HumanMessage(query)]
for chunk, metadata in app.stream(
    {"messages": input_messages, "language": language},
    config,
    stream_mode="messages",
):
    if isinstance(chunk, AIMessage):  # Filter to just model responses
        print(chunk.content, end="|")

|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||当然|可以|！|这里|有一个|小|笑话|：

|**|为什么|稻|草|人|得了|奖|？|**|  
|因为他|“|出|类|拔|萃|”|（|out|standing| in| his| field|）|！|  

|😄||| 希|望|这个|笑话|能|让你|会|心|一笑|！|如果|需要|更多|，|随时|告诉我|~||
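
The long run of `|` at the start of the output above is what printing with `end="|"` produces when chunks arrive with empty content, which is presumably what happened here. The following plain-Python sketch reproduces the effect with a hypothetical `fake_token_stream` generator and shows one way to skip empty chunks; it does not call LangGraph.

```python
def fake_token_stream(text, empty_chunks=3):
    """Hypothetical stream: a few empty chunks first, then one chunk per word."""
    for _ in range(empty_chunks):
        yield ""
    for word in text.split(" "):
        yield word


chunks = list(fake_token_stream("Why did the scarecrow win an award?"))

# Equivalent to print(chunk, end="|") for every chunk, collected into a string.
rendered_all = "".join(chunk + "|" for chunk in chunks)

# Skipping empty chunks removes the leading run of separators.
rendered_filtered = "".join(chunk + "|" for chunk in chunks if chunk)
```

In the notebook loop, the equivalent change would be to print only when `chunk.content` is non-empty.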