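# Chatbots: Some Related Questions
A chatbot uses an LLM to carry on a conversation. The sections below cover some common questions that come up when building one.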

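## How to manage conversation memory
A defining feature of a chatbot is that it can use the content of earlier turns as context for the current one. We covered several ways to do this earlier, but those did not use LCEL expressions. The implementation below uses LCEL.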


### Setup

In [2]:
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

chat = ChatOpenAI(model="gpt-3.5-turbo-0125")
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "你是一个AI小助手，你要尽可能的回答用户的问题",
        ),
        ("placeholder", "{messages}"),
    ]
)

chain = prompt | chat

ai_msg = chain.invoke(
    {
        "messages": [
            (
                "human",
                "你是什么",
            ),
            ("ai", "我是AI小助手"),
            ("human", "刚才问你什么了?"),
        ],
    }
)
print(ai_msg.content)

你问我"你是什么"
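The `placeholder` entry is where the list passed as `messages` is expanded, so the model sees the system message followed by the whole history. Below is a minimal plain-Python sketch of that expansion. The helper `build_prompt` and the `(role, content)` tuple layout are assumptions made for illustration, not LangChain's implementation.

```python
from typing import List, Tuple

Message = Tuple[str, str]  # (role, content)


def build_prompt(system_text: str, messages: List[Message]) -> List[Message]:
    """Return the system message followed by the history, in order."""
    return [("system", system_text)] + list(messages)


history = [
    ("human", "What are you?"),
    ("ai", "I am an AI assistant."),
    ("human", "What did I just ask you?"),
]
prompt_messages = build_prompt("You are a helpful AI assistant.", history)
for role, content in prompt_messages:
    print(f"{role}: {content}")
```

Because the earlier question is part of the list, the model can answer "What did I just ask you?" from the prompt alone.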


### ChatHistory
LangChain provides `ChatMessageHistory` to record the conversation history. Basic usage looks like this:

In [4]:
from langchain_community.chat_message_histories import ChatMessageHistory

demo_ephemeral_chat_history = ChatMessageHistory()
demo_ephemeral_chat_history.add_user_message(
    "你是一个AI小助手，你要尽可能的回答用户的问题"
)
demo_ephemeral_chat_history.add_user_message(
    "你是什么？"
)
demo_ephemeral_chat_history.add_ai_message("我是AI小助手")
demo_ephemeral_chat_history.messages

[HumanMessage(content='你是一个AI小助手，你要尽可能的回答用户的问题'),
 HumanMessage(content='你是什么？'),
 AIMessage(content='我是AI小助手')]

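When saving an AI response, you do not need to extract the text yourself as above. `add_ai_message` wraps that step, so you can pass the response object to it directly: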

In [5]:
demo_ephemeral_chat_history = ChatMessageHistory()
input1 = "你是一个AI小助手，你要尽可能的回答用户的问题"
demo_ephemeral_chat_history.add_user_message(input1)

response = chain.invoke(
    {
        "messages": demo_ephemeral_chat_history.messages,
    }
)

demo_ephemeral_chat_history.add_ai_message(response)

input2 = "刚才问你什么了？"

demo_ephemeral_chat_history.add_user_message(input2)

chain.invoke(
    {
        "messages": demo_ephemeral_chat_history.messages,
    }
)

AIMessage(content='你问我：“你是一个AI小助手，你要尽可能的回答用户的问题”，我回答说：“没问题！请问你有什么问题需要帮助的吗？”', response_metadata={'token_usage': {'completion_tokens': 52, 'prompt_tokens': 92, 'total_tokens': 144}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-b8985b10-2518-413c-b605-1f005c522983-0', usage_metadata={'input_tokens': 92, 'output_tokens': 52, 'total_tokens': 144})

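### Managing conversation history automatically

LangChain provides the `RunnableWithMessageHistory` class to manage conversation history automatically, so you no longer need to manage it by hand as above.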

In [6]:
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "你是一个AI小助手，你要尽可能的回答用户的问题",
        ),
        ("placeholder", "{chat_history}"),
        ("human", "{input}"),
    ]
)

chain = prompt | chat

In [7]:
from langchain_core.runnables.history import RunnableWithMessageHistory

demo_ephemeral_chat_history_for_chain = ChatMessageHistory()

chain_with_message_history = RunnableWithMessageHistory(
    chain,
    lambda session_id: demo_ephemeral_chat_history_for_chain,
    input_messages_key="input",
    history_messages_key="chat_history",
)

In [8]:
chain_with_message_history.invoke(
    {"input": "我饿了"},
    {"configurable": {"session_id": "unused"}},
)

AIMessage(content='那你可以考虑吃点东西来填饱肚子，可以选择健康的食物，比如水果、蔬菜、坚果或者一些轻食。如果需要我帮你找一些简单的食谱或者外卖平台的推荐，也可以告诉我你的口味偏好。希望你能找到满足的食物！', response_metadata={'token_usage': {'completion_tokens': 115, 'prompt_tokens': 36, 'total_tokens': 151}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-d5c40320-035f-4a0d-97ae-e8cf3354b571-0', usage_metadata={'input_tokens': 36, 'output_tokens': 115, 'total_tokens': 151})

In [9]:
chain_with_message_history.invoke(
    {"input": "刚才问你什么了？"}, {"configurable": {"session_id": "unused"}}
)

AIMessage(content='你说你饿了，我建议你吃点东西来填饱肚子。如果需要我再帮你查找一些食谱或者外卖推荐，随时告诉我哦！', response_metadata={'token_usage': {'completion_tokens': 59, 'prompt_tokens': 170, 'total_tokens': 229}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-c70a506e-9db4-4281-8bc9-31b8db55fb18-0', usage_metadata={'input_tokens': 170, 'output_tokens': 59, 'total_tokens': 229})
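Conceptually, the wrapper loads the stored history before each call and appends the new input and the reply afterwards. Here is a rough plain-Python sketch of that cycle. The helper `with_history` and the stub `echo_model` are hypothetical names standing in for the wrapper and a real LLM; this is not how `RunnableWithMessageHistory` is implemented.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]  # {"role": ..., "content": ...}


def with_history(
    model: Callable[[List[Message]], str], history: List[Message]
) -> Callable[[str], str]:
    """Wrap `model` so every call reads from and writes to `history`."""

    def invoke(user_input: str) -> str:
        user_message = {"role": "human", "content": user_input}
        # The model sees all earlier turns plus the new input.
        reply = model(history + [user_message])
        # Persist both sides of the turn for the next call.
        history.append(user_message)
        history.append({"role": "ai", "content": reply})
        return reply

    return invoke


def echo_model(messages: List[Message]) -> str:
    """Stub model (assumption): reports how many messages it was given."""
    return f"I received {len(messages)} message(s)."


store: List[Message] = []
chat_fn = with_history(echo_model, store)
first = chat_fn("I'm hungry")
second = chat_fn("What did I just say?")
print(first)
print(second)
```

The second call receives three messages (the first turn plus the new question), which is why the real chain above can recall "I'm hungry".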

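### Managing the conversation history
As the number of turns grows, the conversation history gets longer. Left unmanaged, it will eventually exceed the model's context window, and an overly long context can also distract the model.

The article below covers this in detail. Here I show two approaches.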
https://python.langchain.com/v0.2/docs/how_to/trim_messages/

#### Trimming messages

This approach preprocesses the history messages before they are passed to the LLM, using LCEL syntax.

In [23]:
demo_ephemeral_chat_history = ChatMessageHistory()
demo_ephemeral_chat_history.add_user_message("hello，我是Ethan")
demo_ephemeral_chat_history.add_ai_message("Hello!")
demo_ephemeral_chat_history.add_user_message("今儿怎么样？")
demo_ephemeral_chat_history.add_ai_message("好着呢？")

demo_ephemeral_chat_history.messages

[HumanMessage(content='hello，我是Ethan'),
 AIMessage(content='Hello!'),
 HumanMessage(content='今儿怎么样？'),
 AIMessage(content='好着呢？')]

In [25]:
from operator import itemgetter

from langchain_core.messages import trim_messages
from langchain_core.runnables import RunnablePassthrough
from langchain.globals import set_debug
set_debug(True)

trimmer = trim_messages(strategy="last", max_tokens=2, token_counter=len)


chain_with_trimming = (
    RunnablePassthrough.assign(chat_history=itemgetter("chat_history") | trimmer)
    | prompt
    | chat
)

chain_with_trimmed_history = RunnableWithMessageHistory(
    chain_with_trimming,
    lambda session_id: demo_ephemeral_chat_history,
    input_messages_key="input",
    history_messages_key="chat_history",
)

In [26]:
chain_with_trimmed_history.invoke(
    {"input": "俺叫什么"},
    {"configurable": {"session_id": "unused"}},
)

[32;1m[1;3m[chain/start][0m [1m[chain:RunnableWithMessageHistory] Entering Chain run with input:
[0m{
  "input": "俺叫什么"
}
[32;1m[1;3m[chain/start][0m [1m[chain:RunnableWithMessageHistory > chain:insert_history] Entering Chain run with input:
[0m{
  "input": "俺叫什么"
}
[32;1m[1;3m[chain/start][0m [1m[chain:RunnableWithMessageHistory > chain:insert_history > chain:RunnableParallel<chat_history>] Entering Chain run with input:
[0m{
  "input": "俺叫什么"
}
[32;1m[1;3m[chain/start][0m [1m[chain:RunnableWithMessageHistory > chain:insert_history > chain:RunnableParallel<chat_history> > chain:load_history] Entering Chain run with input:
[0m{
  "input": "俺叫什么"
}
[36;1m[1;3m[chain/end][0m [1m[chain:RunnableWithMessageHistory > chain:insert_history > chain:RunnableParallel<chat_history> > chain:load_history] [0ms] Exiting Chain run with output:
[0m[outputs]
[36;1m[1;3m[chain/end][0m [1m[chain:RunnableWithMessageHistory > chain:insert_history > chain:RunnableParallel<chat_hi

AIMessage(content='您可以告诉我您的名字，那我就知道您叫什么啦。', response_metadata={'token_usage': {'completion_tokens': 26, 'prompt_tokens': 62, 'total_tokens': 88}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-7e386bf0-ba26-486e-92b1-622d8abfe329-0', usage_metadata={'input_tokens': 62, 'output_tokens': 26, 'total_tokens': 88})

In [27]:
# The stored messages themselves are not modified; trimming only happens right before they are passed to the model.
demo_ephemeral_chat_history.messages

[HumanMessage(content='hello，我是Ethan'),
 AIMessage(content='Hello!'),
 HumanMessage(content='今儿怎么样？'),
 AIMessage(content='好着呢？'),
 HumanMessage(content='俺叫什么'),
 AIMessage(content='您可以告诉我您的名字，那我就知道您叫什么啦。', response_metadata={'token_usage': {'completion_tokens': 26, 'prompt_tokens': 62, 'total_tokens': 88}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-7e386bf0-ba26-486e-92b1-622d8abfe329-0', usage_metadata={'input_tokens': 62, 'output_tokens': 26, 'total_tokens': 88})]

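The execution log above shows that `trim_messages` processed the messages for us. Because `token_counter=len` counts each message as one token, `max_tokens=2` keeps only the two most recent messages, so the model no longer sees the name and cannot answer.
The docstring of `trim_messages` describes its usage in detail.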
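To make the "keep the last messages that fit the budget" idea concrete, here is a simplified plain-Python stand-in. The function `trim_last` is a hypothetical helper written for this illustration. It mimics only the basic `strategy="last"` behavior and is not the library's implementation.

```python
from typing import Callable, List, Sequence


def trim_last(
    messages: Sequence[str],
    max_tokens: int,
    token_counter: Callable[[List[str]], int],
) -> List[str]:
    """Keep the longest suffix of `messages` whose token count fits the budget."""
    kept: List[str] = []
    for message in reversed(messages):
        candidate = [message] + kept
        if token_counter(candidate) > max_tokens:
            break
        kept = candidate
    return kept


history = ["hello, I am Ethan", "Hello!", "How are you today?", "Doing great!"]
# With token_counter=len, each message counts as one token,
# so max_tokens=2 keeps only the two most recent messages.
trimmed = trim_last(history, max_tokens=2, token_counter=len)
print(trimmed)
```

The original list is left untouched, which matches the behavior seen above: the stored history keeps every message, and only the copy sent to the model is shortened.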

#### Summary memory
Use the LLM to summarize the conversation history, then send that summary to the model in place of the full history.

In [32]:
demo_ephemeral_chat_history = ChatMessageHistory()
demo_ephemeral_chat_history.add_user_message("hello，我是Ethan")
demo_ephemeral_chat_history.add_ai_message("Hello!")
demo_ephemeral_chat_history.add_user_message("今儿怎么样？")
demo_ephemeral_chat_history.add_ai_message("好着呢？")

demo_ephemeral_chat_history.messages

[HumanMessage(content='hello，我是Ethan'),
 AIMessage(content='Hello!'),
 HumanMessage(content='今儿怎么样？'),
 AIMessage(content='好着呢？')]

In [None]:
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "你是一个乐于助人的助手。尽力回答所有问题。提供的聊天记录包含与您交谈的用户相关的事实。",
        ),
        ("placeholder", "{chat_history}"),
        ("user", "{input}"),
    ]
)

chain = prompt | chat

chain_with_message_history = RunnableWithMessageHistory(
    chain,
    lambda session_id: demo_ephemeral_chat_history,
    input_messages_key="input",
    history_messages_key="chat_history",
)

In [33]:
def summarize_messages(chain_input):
    stored_messages = demo_ephemeral_chat_history.messages
    if len(stored_messages) == 0:
        return False
    summarization_prompt = ChatPromptTemplate.from_messages(
        [
            ("placeholder", "{chat_history}"),
            (
                "user",
                "将上述聊天消息提炼成一条总结信息。尽可能包含详细的具体内容。",
            ),
        ]
    )
    summarization_chain = summarization_prompt | chat

    summary_message = summarization_chain.invoke({"chat_history": stored_messages})

    demo_ephemeral_chat_history.clear()

    demo_ephemeral_chat_history.add_message(summary_message)

    return True


chain_with_summarization = (
    RunnablePassthrough.assign(messages_summarized=summarize_messages)
    | chain_with_message_history
)

In [35]:
chain_with_summarization.invoke(
    {"input": "刚才我说我叫什么呀?"},
    {"configurable": {"session_id": "unused"}},
)

[32;1m[1;3m[chain/start][0m [1m[chain:RunnableSequence] Entering Chain run with input:
[0m{
  "input": "刚才我说我叫什么呀?"
}
[32;1m[1;3m[chain/start][0m [1m[chain:RunnableSequence > chain:RunnableAssign<messages_summarized>] Entering Chain run with input:
[0m{
  "input": "刚才我说我叫什么呀?"
}
[32;1m[1;3m[chain/start][0m [1m[chain:RunnableSequence > chain:RunnableAssign<messages_summarized> > chain:RunnableParallel<messages_summarized>] Entering Chain run with input:
[0m{
  "input": "刚才我说我叫什么呀?"
}
[32;1m[1;3m[chain/start][0m [1m[chain:RunnableSequence > chain:RunnableAssign<messages_summarized> > chain:RunnableParallel<messages_summarized> > chain:summarize_messages] Entering Chain run with input:
[0m{
  "input": "刚才我说我叫什么呀?"
}
[32;1m[1;3m[chain/start][0m [1m[chain:RunnableSequence > chain:RunnableAssign<messages_summarized> > chain:RunnableParallel<messages_summarized> > chain:summarize_messages > chain:RunnableSequence] Entering Chain run with input:
[0m[inputs]
[32;1m[1;

AIMessage(content='对不起，我无法保存用户的个人信息或对话历史。请问您可以告诉我您想被称呼的名字吗？我会尽量帮助您的。', response_metadata={'token_usage': {'completion_tokens': 54, 'prompt_tokens': 139, 'total_tokens': 193}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': 'fp_811936bd4f', 'finish_reason': 'stop', 'logprobs': None}, id='run-4eb2840d-9ad8-46ff-b1d2-05ef59bdea1b-0', usage_metadata={'input_tokens': 139, 'output_tokens': 54, 'total_tokens': 193})

Although the model did not answer the question, the debug log shows that the messages were summarized.
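The summarization step boils down to "replace the stored history with one summary message". Below is a small plain-Python sketch of that step. `summarize_history` and `stub_summarizer` are hypothetical names, and the stub simply joins the messages where a real setup would call the LLM.

```python
from typing import Callable, List


def summarize_history(
    history: List[str], summarizer: Callable[[List[str]], str]
) -> bool:
    """Replace `history` in place with a single summary message."""
    if not history:
        return False
    summary = summarizer(history)
    history.clear()
    history.append(summary)
    return True


def stub_summarizer(messages: List[str]) -> str:
    """Stand-in for an LLM call (assumption): joins the messages into one line."""
    return "Summary: " + " | ".join(messages)


history = ["hello, I am Ethan", "Hello!", "How are you today?", "Doing great!"]
changed = summarize_history(history, stub_summarizer)
print(changed, history)
```

Whether the model can still recall a detail such as the user's name depends entirely on whether the summary kept it, which is one plausible reason the run above failed to answer.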


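## How to do retrieval

Retrieval is a very useful capability for a chatbot. It gives the model access to up-to-date information, and the model can then apply its reasoning ability to answer questions about that knowledge.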

### Setup

In [39]:
from langchain_openai import ChatOpenAI

chat = ChatOpenAI(model="gpt-3.5-turbo-1106", temperature=0.2)

### Creating a retriever

In [40]:
from langchain_community.document_loaders import WebBaseLoader

# Load the web page
loader = WebBaseLoader("https://docs.smith.langchain.com/overview")
data = loader.load()
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Split the page into chunks of at most 500 characters with no overlap
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
all_splits = text_splitter.split_documents(data)

# Embed the chunks and store them in a vector store
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings

vectorstore = Chroma.from_documents(documents=all_splits, embedding=OpenAIEmbeddings())
# k is the number of chunks to retrieve
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})

docs = retriever.invoke("Can LangSmith help test my LLM applications?")

docs

[Document(page_content='Skip to main contentGo to API DocsSearchRegionUSEUGo to AppQuick startTutorialsHow-to guidesConceptsReferencePricingSelf-hostingLangGraph CloudQuick startOn this pageGet started with LangSmithLangSmith is a platform for building production-grade LLM applications. It allows you to closely monitor and evaluate your application, so you can ship quickly and with confidence. Use of LangChain is not necessary - LangSmith works on its own!1. Install LangSmith\u200bPythonTypeScriptpip install -U', metadata={'description': 'LangSmith is a platform for building production-grade LLM applications. It allows you to closely monitor and evaluate your application, so you can ship quickly and with confidence. Use of LangChain is not necessary - LangSmith works on its own!', 'language': 'en', 'source': 'https://docs.smith.langchain.com/overview', 'title': 'Get started with LangSmith | 🦜️🛠️ LangSmith'}),
 Document(page_content='Skip to main contentGo to API DocsSearchRegionUSEUGo 

### Document chains
The steps here are very similar to earlier ones. We create a document chain with `create_stuff_documents_chain`, which inserts ("stuffs") all the documents into the prompt.

In [42]:
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

SYSTEM_TEMPLATE = """
Answer the user's questions based on the below context. 
If the context doesn't contain any relevant information to the question, don't make something up and just say "I don't know":

<context>
{context}
</context>
"""

question_answering_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            SYSTEM_TEMPLATE,
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

document_chain = create_stuff_documents_chain(chat, question_answering_prompt)

In [43]:
from langchain_core.messages import HumanMessage

document_chain.invoke(
    {
        "context": docs,
        "messages": [
            HumanMessage(content="Can LangSmith help test my LLM applications?")
        ],
    }
)

[32;1m[1;3m[chain/start][0m [1m[chain:stuff_documents_chain] Entering Chain run with input:
[0m[inputs]
[32;1m[1;3m[chain/start][0m [1m[chain:stuff_documents_chain > chain:format_inputs] Entering Chain run with input:
[0m[inputs]
[32;1m[1;3m[chain/start][0m [1m[chain:stuff_documents_chain > chain:format_inputs > chain:RunnableParallel<context>] Entering Chain run with input:
[0m[inputs]
[32;1m[1;3m[chain/start][0m [1m[chain:stuff_documents_chain > chain:format_inputs > chain:RunnableParallel<context> > chain:format_docs] Entering Chain run with input:
[0m[inputs]
[36;1m[1;3m[chain/end][0m [1m[chain:stuff_documents_chain > chain:format_inputs > chain:RunnableParallel<context> > chain:format_docs] [1ms] Exiting Chain run with output:
[0m{
  "output": "Skip to main contentGo to API DocsSearchRegionUSEUGo to AppQuick startTutorialsHow-to guidesConceptsReferencePricingSelf-hostingLangGraph CloudQuick startOn this pageGet started with LangSmithLangSmith is a platform

'Yes, LangSmith allows you to closely monitor and evaluate your LLM applications, which can help in testing them effectively.'
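"Stuffing" means concatenating every retrieved chunk into a single string and substituting it for `{context}` in the prompt. The plain-Python sketch below shows the idea. The helper `stuff_documents`, the blank-line separator, and the shortened template are assumptions for illustration only.

```python
from typing import List


def stuff_documents(docs: List[str], separator: str = "\n\n") -> str:
    """Join all retrieved chunks into one context string."""
    return separator.join(docs)


# Shortened, hypothetical version of the system template used above.
SYSTEM_TEMPLATE = (
    "Answer the user's questions based on the below context.\n\n"
    "<context>\n{context}\n</context>"
)

docs = [
    "LangSmith is a platform for building production-grade LLM applications.",
    "It allows you to closely monitor and evaluate your application.",
]
system_message = SYSTEM_TEMPLATE.format(context=stuff_documents(docs))
print(system_message)
```

Since every chunk goes into one prompt, this approach only works while the combined text fits in the model's context window.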

### Retrieval chains

Now combine the document chain with the retriever to form a retrieval chain.

In [45]:
from typing import Dict

from langchain_core.runnables import RunnablePassthrough


def parse_retriever_input(params: Dict):
    return params["messages"][-1].content



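# Take the last message from the user's `messages` list and pass its content to the retriever,
# then feed the retrieved documents into the document chain to produce the answer.
retrieval_chain = RunnablePassthrough.assign(
    context=parse_retriever_input | retriever,  # retrieval step
).assign(
    answer=document_chain,  # document chain that generates the answer
)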

In [46]:
retrieval_chain.invoke(
    {
        "messages": [
            HumanMessage(content="Can LangSmith help test my LLM applications?")
        ],
    }
)

[32;1m[1;3m[chain/start][0m [1m[chain:RunnableSequence] Entering Chain run with input:
[0m[inputs]
[32;1m[1;3m[chain/start][0m [1m[chain:RunnableSequence > chain:RunnableAssign<context>] Entering Chain run with input:
[0m[inputs]
[32;1m[1;3m[chain/start][0m [1m[chain:RunnableSequence > chain:RunnableAssign<context> > chain:RunnableParallel<context>] Entering Chain run with input:
[0m[inputs]
[32;1m[1;3m[chain/start][0m [1m[chain:RunnableSequence > chain:RunnableAssign<context> > chain:RunnableParallel<context> > chain:RunnableSequence] Entering Chain run with input:
[0m[inputs]
[32;1m[1;3m[chain/start][0m [1m[chain:RunnableSequence > chain:RunnableAssign<context> > chain:RunnableParallel<context> > chain:RunnableSequence > chain:parse_retriever_input] Entering Chain run with input:
[0m[inputs]
[36;1m[1;3m[chain/end][0m [1m[chain:RunnableSequence > chain:RunnableAssign<context> > chain:RunnableParallel<context> > chain:RunnableSequence > chain:parse_retrieve

{'messages': [HumanMessage(content='Can LangSmith help test my LLM applications?')],
 'context': [Document(page_content='Skip to main contentGo to API DocsSearchRegionUSEUGo to AppQuick startTutorialsHow-to guidesConceptsReferencePricingSelf-hostingLangGraph CloudQuick startOn this pageGet started with LangSmithLangSmith is a platform for building production-grade LLM applications. It allows you to closely monitor and evaluate your application, so you can ship quickly and with confidence. Use of LangChain is not necessary - LangSmith works on its own!1. Install LangSmith\u200bPythonTypeScriptpip install -U', metadata={'description': 'LangSmith is a platform for building production-grade LLM applications. It allows you to closely monitor and evaluate your application, so you can ship quickly and with confidence. Use of LangChain is not necessary - LangSmith works on its own!', 'language': 'en', 'source': 'https://docs.smith.langchain.com/overview', 'title': 'Get started with LangSmith |
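The output keeps the original `messages` key and gains `context` and `answer`, because each `assign` call copies the incoming dict and adds one computed key. A rough plain-Python sketch of that pattern follows. The `assign` helper and the two fake steps are hypothetical stand-ins, not LangChain's implementation.

```python
from typing import Any, Callable, Dict, List

Step = Callable[[Dict[str, Any]], Any]


def assign(**steps: Step) -> Callable[[Dict[str, Any]], Dict[str, Any]]:
    """Return a function that copies its input dict and adds one key per step."""

    def run(data: Dict[str, Any]) -> Dict[str, Any]:
        result = dict(data)
        for key, step in steps.items():
            result[key] = step(data)
        return result

    return run


def fake_retriever(data: Dict[str, Any]) -> List[str]:
    """Stand-in for the retriever (assumption): uses the last message as the query."""
    return [f"doc about: {data['messages'][-1]}"]


def fake_answer(data: Dict[str, Any]) -> str:
    """Stand-in for the document chain (assumption): reads the `context` key."""
    return f"answer using {len(data['context'])} doc(s)"


add_context = assign(context=fake_retriever)
add_answer = assign(answer=fake_answer)
output = add_answer(
    add_context({"messages": ["Can LangSmith help test my LLM applications?"]})
)
print(sorted(output))
```

The second step can read `context` only because the first step already added it, which is why the order of the two `assign` calls matters.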

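### Query transformation

When you use a retriever, the retrieval itself has no conversational context. If the query has no good match, the retriever returns unrelated documents. For example, if you ask "What does LangSmith do?", the retriever can find relevant documents. But if you then follow up with "Explain in more detail", relying on the conversation context, the retriever cannot find anything about what LangSmith does and returns unrelated documents instead.
This is where query transformation comes in.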

In [47]:
retriever.invoke("Tell me more!")  # returns some unrelated documents

[Document(page_content='Get started with LangSmith | 🦜️🛠️ LangSmith', metadata={'description': 'LangSmith is a platform for building production-grade LLM applications. It allows you to closely monitor and evaluate your application, so you can ship quickly and with confidence. Use of LangChain is not necessary - LangSmith works on its own!', 'language': 'en', 'source': 'https://docs.smith.langchain.com/overview', 'title': 'Get started with LangSmith | 🦜️🛠️ LangSmith'}),
 Document(page_content='Get started with LangSmith | 🦜️🛠️ LangSmith', metadata={'description': 'LangSmith is a platform for building production-grade LLM applications. It allows you to closely monitor and evaluate your application, so you can ship quickly and with confidence. Use of LangChain is not necessary - LangSmith works on its own!', 'language': 'en', 'source': 'https://docs.smith.langchain.com/overview', 'title': 'Get started with LangSmith | 🦜️🛠️ LangSmith'}),
 Document(page_content='result.choices[0].message.co

To solve this, we can have the LLM generate a search query from the conversation.

In [48]:
from langchain_core.messages import AIMessage, HumanMessage

query_transform_prompt = ChatPromptTemplate.from_messages(
    [
        MessagesPlaceholder(variable_name="messages"),
        (
            "user",
            "Given the above conversation, generate a search query to look up in order to get information relevant to the conversation. Only respond with the query, nothing else.",
        ),
    ]
)

query_transformation_chain = query_transform_prompt | chat

query_transformation_chain.invoke(
    {
        "messages": [
            HumanMessage(content="Can LangSmith help test my LLM applications?"),
            AIMessage(
                content="Yes, LangSmith can help test and evaluate your LLM applications. It allows you to quickly edit examples and add them to datasets to expand the surface area of your evaluation sets or to fine-tune a model for improved quality or reduced costs. Additionally, LangSmith can be used to monitor your application, log all traces, visualize latency and token usage statistics, and troubleshoot specific issues as they arise."
            ),
            HumanMessage(content="Tell me more!"),
        ],
    }
)

[32;1m[1;3m[chain/start][0m [1m[chain:RunnableSequence] Entering Chain run with input:
[0m[inputs]
[32;1m[1;3m[chain/start][0m [1m[chain:RunnableSequence > prompt:ChatPromptTemplate] Entering Prompt run with input:
[0m[inputs]
[36;1m[1;3m[chain/end][0m [1m[chain:RunnableSequence > prompt:ChatPromptTemplate] [1ms] Exiting Prompt run with output:
[0m[outputs]
[32;1m[1;3m[llm/start][0m [1m[chain:RunnableSequence > llm:ChatOpenAI] Entering LLM run with input:
[0m{
  "prompts": [
    "Human: Can LangSmith help test my LLM applications?\nAI: Yes, LangSmith can help test and evaluate your LLM applications. It allows you to quickly edit examples and add them to datasets to expand the surface area of your evaluation sets or to fine-tune a model for improved quality or reduced costs. Additionally, LangSmith can be used to monitor your application, log all traces, visualize latency and token usage statistics, and troubleshoot specific issues as they arise.\nHuman: Tell me more

AIMessage(content='"LangSmith LLM application testing and evaluation"', response_metadata={'token_usage': {'completion_tokens': 10, 'prompt_tokens': 145, 'total_tokens': 155}, 'model_name': 'gpt-3.5-turbo-1106', 'system_fingerprint': 'fp_5aa43294a1', 'finish_reason': 'stop', 'logprobs': None}, id='run-0c88e751-ac9d-4a49-984d-fa0061bde854-0', usage_metadata={'input_tokens': 145, 'output_tokens': 10, 'total_tokens': 155})

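OK, as you can see, the model generated the query we wanted. Next, we combine it with the earlier chain.

In the example below, `RunnableBranch` represents a conditional branch. It takes one or more (condition, runnable) pairs plus a default runnable. Here, if there is only one message, its content goes straight to the retriever. Otherwise the conversation first goes through the query transformation above, and the generated query is then used for retrieval.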

In [50]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableBranch

query_transforming_retriever_chain = RunnableBranch(
    (
        lambda x: len(x.get("messages", [])) == 1,
        # If only one message, then we just pass that message's content to retriever
        (lambda x: x["messages"][-1].content) | retriever,
    ),
    # If messages, then we pass inputs to LLM chain to transform the query, then pass to retriever
    query_transform_prompt | chat | StrOutputParser() | retriever,
).with_config(run_name="chat_retriever_chain")
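The branching rule is "run the first handler whose condition matches, otherwise run the default". Here is a plain-Python sketch of that rule. The `branch` helper and both query functions are hypothetical, and `rewritten_query` only imitates what the LLM rewrite would do.

```python
from typing import Any, Callable, Dict, List, Tuple

Condition = Callable[[Dict[str, Any]], bool]
Handler = Callable[[Dict[str, Any]], Any]


def branch(cases: List[Tuple[Condition, Handler]], default: Handler) -> Handler:
    """Run the first handler whose condition is true, else the default."""

    def run(data: Dict[str, Any]) -> Any:
        for condition, handler in cases:
            if condition(data):
                return handler(data)
        return default(data)

    return run


def direct_query(data: Dict[str, Any]) -> str:
    """Single message: use its text as the search query unchanged."""
    return data["messages"][-1]


def rewritten_query(data: Dict[str, Any]) -> str:
    """Stand-in for the LLM rewrite (assumption): reuse the first message as topic."""
    return f"{data['messages'][0]} (follow-up: {data['messages'][-1]})"


choose_query = branch(
    [(lambda d: len(d.get("messages", [])) == 1, direct_query)],
    default=rewritten_query,
)
single = choose_query({"messages": ["Can LangSmith help test my LLM applications?"]})
multi = choose_query(
    {
        "messages": [
            "Can LangSmith help test my LLM applications?",
            "Yes, it can.",
            "Tell me more!",
        ]
    }
)
print(single)
print(multi)
```

Skipping the rewrite for a single message saves one LLM call, since the first question already stands on its own as a search query.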

In [51]:
SYSTEM_TEMPLATE = """
Answer the user's questions based on the below context. 
If the context doesn't contain any relevant information to the question, don't make something up and just say "I don't know":

<context>
{context}
</context>
"""

question_answering_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            SYSTEM_TEMPLATE,
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

document_chain = create_stuff_documents_chain(chat, question_answering_prompt)

conversational_retrieval_chain = RunnablePassthrough.assign(
    context=query_transforming_retriever_chain,
).assign(
    answer=document_chain,
)

In [52]:
conversational_retrieval_chain.invoke(
    {
        "messages": [
            HumanMessage(content="Can LangSmith help test my LLM applications?"),
        ]
    }
)

[32;1m[1;3m[chain/start][0m [1m[chain:RunnableSequence] Entering Chain run with input:
[0m[inputs]
[32;1m[1;3m[chain/start][0m [1m[chain:RunnableSequence > chain:RunnableAssign<context>] Entering Chain run with input:
[0m[inputs]
[32;1m[1;3m[chain/start][0m [1m[chain:RunnableSequence > chain:RunnableAssign<context> > chain:RunnableParallel<context>] Entering Chain run with input:
[0m[inputs]
[32;1m[1;3m[chain/start][0m [1m[chain:RunnableSequence > chain:RunnableAssign<context> > chain:RunnableParallel<context> > chain:chat_retriever_chain] Entering Chain run with input:
[0m[inputs]
[32;1m[1;3m[chain/start][0m [1m[chain:RunnableSequence > chain:RunnableAssign<context> > chain:RunnableParallel<context> > chain:chat_retriever_chain > chain:RunnableLambda] Entering Chain run with input:
[0m[inputs]
[36;1m[1;3m[chain/end][0m [1m[chain:RunnableSequence > chain:RunnableAssign<context> > chain:RunnableParallel<context> > chain:chat_retriever_chain > chain:RunnableL

{'messages': [HumanMessage(content='Can LangSmith help test my LLM applications?')],
 'context': [Document(page_content='Skip to main contentGo to API DocsSearchRegionUSEUGo to AppQuick startTutorialsHow-to guidesConceptsReferencePricingSelf-hostingLangGraph CloudQuick startOn this pageGet started with LangSmithLangSmith is a platform for building production-grade LLM applications. It allows you to closely monitor and evaluate your application, so you can ship quickly and with confidence. Use of LangChain is not necessary - LangSmith works on its own!1. Install LangSmith\u200bPythonTypeScriptpip install -U', metadata={'description': 'LangSmith is a platform for building production-grade LLM applications. It allows you to closely monitor and evaluate your application, so you can ship quickly and with confidence. Use of LangChain is not necessary - LangSmith works on its own!', 'language': 'en', 'source': 'https://docs.smith.langchain.com/overview', 'title': 'Get started with LangSmith |

There is only one message above, so no query transformation is performed.

Next, I pass in multiple messages.

In [53]:
conversational_retrieval_chain.invoke(
    {
        "messages": [
            HumanMessage(content="Can LangSmith help test my LLM applications?"),
            AIMessage(
                content="Yes, LangSmith can help test and evaluate your LLM applications. It allows you to quickly edit examples and add them to datasets to expand the surface area of your evaluation sets or to fine-tune a model for improved quality or reduced costs. Additionally, LangSmith can be used to monitor your application, log all traces, visualize latency and token usage statistics, and troubleshoot specific issues as they arise."
            ),
            HumanMessage(content="Tell me more!"),
        ],
    }
)

[32;1m[1;3m[chain/start][0m [1m[chain:RunnableSequence] Entering Chain run with input:
[0m[inputs]
[32;1m[1;3m[chain/start][0m [1m[chain:RunnableSequence > chain:RunnableAssign<context>] Entering Chain run with input:
[0m[inputs]
[32;1m[1;3m[chain/start][0m [1m[chain:RunnableSequence > chain:RunnableAssign<context> > chain:RunnableParallel<context>] Entering Chain run with input:
[0m[inputs]
[32;1m[1;3m[chain/start][0m [1m[chain:RunnableSequence > chain:RunnableAssign<context> > chain:RunnableParallel<context> > chain:chat_retriever_chain] Entering Chain run with input:
[0m[inputs]
[32;1m[1;3m[chain/start][0m [1m[chain:RunnableSequence > chain:RunnableAssign<context> > chain:RunnableParallel<context> > chain:chat_retriever_chain > chain:RunnableLambda] Entering Chain run with input:
[0m[inputs]
[36;1m[1;3m[chain/end][0m [1m[chain:RunnableSequence > chain:RunnableAssign<context> > chain:RunnableParallel<context> > chain:chat_retriever_chain > chain:RunnableL

{'messages': [HumanMessage(content='Can LangSmith help test my LLM applications?'),
  AIMessage(content='Yes, LangSmith can help test and evaluate your LLM applications. It allows you to quickly edit examples and add them to datasets to expand the surface area of your evaluation sets or to fine-tune a model for improved quality or reduced costs. Additionally, LangSmith can be used to monitor your application, log all traces, visualize latency and token usage statistics, and troubleshoot specific issues as they arise.'),
  HumanMessage(content='Tell me more!')],
 'context': [Document(page_content='Skip to main contentGo to API DocsSearchRegionUSEUGo to AppQuick startTutorialsHow-to guidesConceptsReferencePricingSelf-hostingLangGraph CloudQuick startOn this pageGet started with LangSmithLangSmith is a platform for building production-grade LLM applications. It allows you to closely monitor and evaluate your application, so you can ship quickly and with confidence. Use of LangChain is not

OK, we got the answer we wanted.

### Streaming
Chains built with LCEL expose a `stream` method that returns the output incrementally:

In [60]:
set_debug(False)
stream = conversational_retrieval_chain.stream(
    {
        "messages": [
            HumanMessage(content="Can LangSmith help test my LLM applications?"),
            AIMessage(
                content="Yes, LangSmith can help test and evaluate your LLM applications. It allows you to quickly edit examples and add them to datasets to expand the surface area of your evaluation sets or to fine-tune a model for improved quality or reduced costs. Additionally, LangSmith can be used to monitor your application, log all traces, visualize latency and token usage statistics, and troubleshoot specific issues as they arise."
            ),
            HumanMessage(content="Tell me more!"),
        ],
    }
)

for chunk in stream:
    if 'answer' in chunk:
        print(chunk)

{'answer': ''}
{'answer': 'Lang'}
{'answer': 'Smith'}
{'answer': ' is'}
{'answer': ' designed'}
{'answer': ' for'}
{'answer': ' building'}
{'answer': ' production'}
{'answer': '-grade'}
{'answer': ' L'}
{'answer': 'LM'}
{'answer': ' applications'}
{'answer': ','}
{'answer': ' providing'}
{'answer': ' tools'}
{'answer': ' for'}
{'answer': ' monitoring'}
{'answer': ' and'}
{'answer': ' evaluating'}
{'answer': ' your'}
{'answer': ' applications'}
{'answer': ' effectively'}
{'answer': '.'}
{'answer': ' Here'}
{'answer': ' are'}
{'answer': ' some'}
{'answer': ' key'}
{'answer': ' features'}
{'answer': ':\n\n'}
{'answer': '1'}
{'answer': '.'}
{'answer': ' **'}
{'answer': 'Testing'}
{'answer': ' and'}
{'answer': ' Evaluation'}
{'answer': '**'}
{'answer': ':'}
{'answer': ' You'}
{'answer': ' can'}
{'answer': ' closely'}
{'answer': ' monitor'}
{'answer': ' your'}
{'answer': ' application'}
{'answer': ','}
{'answer': ' allowing'}
{'answer': ' for'}
{'answer': ' thorough'}
{'answer': ' testing'}
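Each streamed chunk is a small dict, so rebuilding the full answer is a matter of concatenating the `answer` pieces and skipping chunks that carry other keys. A minimal sketch follows. `fake_stream` is a hypothetical generator standing in for `conversational_retrieval_chain.stream(...)`, and its chunk contents are made up for illustration.

```python
from typing import Dict, Iterable, Iterator


def fake_stream() -> Iterator[Dict[str, str]]:
    """Stand-in for `chain.stream(...)` (assumption): yields partial dict chunks."""
    yield {"messages": "..."}
    yield {"context": "..."}
    for token in ["", "Lang", "Smith", " is", " designed"]:
        yield {"answer": token}


def collect_answer(chunks: Iterable[Dict[str, str]]) -> str:
    """Concatenate the `answer` pieces, ignoring chunks with other keys."""
    return "".join(chunk["answer"] for chunk in chunks if "answer" in chunk)


full_answer = collect_answer(fake_stream())
print(full_answer)
```

In an interactive UI you would print or send each piece as it arrives instead of waiting for the joined string.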


## How to use tools

An earlier section covered how a model calls tools. This section shows how to call tools through an agent.

### Creating an agent
We will use `AgentExecutor` and `create_tool_calling_agent`.

In [63]:
from langchain_core.prompts import ChatPromptTemplate
from langchain.tools import tool

@tool
def current_date():
    """返回当前时间，只有当用户询问当前的时候可以使用此工具"""
    return "2024-08-04 12:12:12"

tools = [current_date]
# Adapted from https://smith.langchain.com/hub/jacob/tool-calling-agent
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. You may not need to use tools for every query - the user may just want to chat!",
        ),
        ("placeholder", "{messages}"),
        ("placeholder", "{agent_scratchpad}"),
    ]
)
from langchain.agents import AgentExecutor, create_tool_calling_agent

agent = create_tool_calling_agent(chat, tools, prompt)

agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

In [64]:
from langchain_core.messages import HumanMessage

agent_executor.invoke({"messages": [HumanMessage(content="I'm Nemo!")]})



> Entering new AgentExecutor chain...
Hi Nemo! How can I assist you today?

> Finished chain.


{'messages': [HumanMessage(content="I'm Nemo!")],
 'output': 'Hi Nemo! How can I assist you today?'}

In [66]:
set_debug(True)
agent_executor.invoke({"messages": [HumanMessage(content="现在几点了？")]})

[32;1m[1;3m[chain/start][0m [1m[chain:AgentExecutor] Entering Chain run with input:
[0m[inputs]
[32;1m[1;3m[chain/start][0m [1m[chain:AgentExecutor > chain:RunnableSequence] Entering Chain run with input:
[0m{
  "input": ""
}
[32;1m[1;3m[chain/start][0m [1m[chain:AgentExecutor > chain:RunnableSequence > chain:RunnableAssign<agent_scratchpad>] Entering Chain run with input:
[0m{
  "input": ""
}
[32;1m[1;3m[chain/start][0m [1m[chain:AgentExecutor > chain:RunnableSequence > chain:RunnableAssign<agent_scratchpad> > chain:RunnableParallel<agent_scratchpad>] Entering Chain run with input:
[0m{
  "input": ""
}
[32;1m[1;3m[chain/start][0m [1m[chain:AgentExecutor > chain:RunnableSequence > chain:RunnableAssign<agent_scratchpad> > chain:RunnableParallel<agent_scratchpad> > chain:RunnableLambda] Entering Chain run with input:
[0m{
  "input": ""
}
[36;1m[1;3m[chain/end][0m [1m[chain:AgentExecutor > chain:RunnableSequence > chain:RunnableAssign<agent_scratchpad> > chain

{'messages': [HumanMessage(content='现在几点了？')],
 'output': '现在是2024年8月4日，12点12分12秒。有什么我可以帮到您的吗？'}

As you can see, the agent called the tool and used its result in the reply.
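The agent loop can be pictured as: the model decides whether a tool is needed, the executor runs the named tool, and the observation goes into the final reply. The plain-Python sketch below illustrates that flow. `stub_model`, `run_agent`, and the keyword check are hypothetical simplifications; a real agent lets the LLM make the decision and phrase the answer.

```python
from typing import Callable, Dict


def current_date() -> str:
    """Return a fixed timestamp, mirroring the tool defined above."""
    return "2024-08-04 12:12:12"


# Registry mapping tool names to callables, as an executor would hold.
TOOLS: Dict[str, Callable[[], str]] = {"current_date": current_date}


def stub_model(question: str) -> Dict[str, str]:
    """Stand-in for the LLM (assumption): requests a tool only for time questions."""
    if "time" in question.lower():
        return {"tool": "current_date"}
    return {"answer": "Happy to chat!"}


def run_agent(question: str) -> str:
    """Run the requested tool if there is one, otherwise answer directly."""
    decision = stub_model(question)
    if "tool" in decision:
        observation = TOOLS[decision["tool"]]()
        return f"The current time is {observation}."
    return decision["answer"]


print(run_agent("What time is it?"))
print(run_agent("I'm Nemo!"))
```

This mirrors the two runs above: the greeting is answered without any tool, while the question about the time triggers `current_date`.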

### Conversational responses
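Because our prompt contains a placeholder for chat history messages, the agent can take previous interactions into account and respond conversationally like a standard chatbot: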

In [67]:
from langchain_core.messages import AIMessage, HumanMessage

agent_executor.invoke(
    {
        "messages": [
            HumanMessage(content="I'm Nemo!"),
            AIMessage(content="Hello Nemo! How can I assist you today?"),
            HumanMessage(content="What is my name?"),
        ],
    }
)

[32;1m[1;3m[chain/start][0m [1m[chain:AgentExecutor] Entering Chain run with input:
[0m[inputs]
[32;1m[1;3m[chain/start][0m [1m[chain:AgentExecutor > chain:RunnableSequence] Entering Chain run with input:
[0m{
  "input": ""
}
[32;1m[1;3m[chain/start][0m [1m[chain:AgentExecutor > chain:RunnableSequence > chain:RunnableAssign<agent_scratchpad>] Entering Chain run with input:
[0m{
  "input": ""
}
[32;1m[1;3m[chain/start][0m [1m[chain:AgentExecutor > chain:RunnableSequence > chain:RunnableAssign<agent_scratchpad> > chain:RunnableParallel<agent_scratchpad>] Entering Chain run with input:
[0m{
  "input": ""
}
[32;1m[1;3m[chain/start][0m [1m[chain:AgentExecutor > chain:RunnableSequence > chain:RunnableAssign<agent_scratchpad> > chain:RunnableParallel<agent_scratchpad> > chain:RunnableLambda] Entering Chain run with input:
[0m{
  "input": ""
}
[36;1m[1;3m[chain/end][0m [1m[chain:AgentExecutor > chain:RunnableSequence > chain:RunnableAssign<agent_scratchpad> > chain

{'messages': [HumanMessage(content="I'm Nemo!"),
  AIMessage(content='Hello Nemo! How can I assist you today?'),
  HumanMessage(content='What is my name?')],
 'output': 'Your name is Nemo!'}

We can also wrap the agent with a message history so that it remembers the conversation on its own:

In [69]:
set_debug(False)
agent = create_tool_calling_agent(chat, tools, prompt)

agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory


demo_ephemeral_chat_history_for_chain = ChatMessageHistory()
conversational_agent_executor = RunnableWithMessageHistory(
    agent_executor,
    lambda session_id: demo_ephemeral_chat_history_for_chain,
    input_messages_key="messages",
    output_messages_key="output",
)

conversational_agent_executor.invoke(
    {"messages": [HumanMessage("I'm Nemo!")]},
    {"configurable": {"session_id": "unused"}},
)



> Entering new AgentExecutor chain...
Hi Nemo! How can I assist you today?

> Finished chain.


{'messages': [HumanMessage(content="I'm Nemo!")],
 'output': 'Hi Nemo! How can I assist you today?'}
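The examples in this document pass a lambda that ignores `session_id` and always returns the same history, which is why the id is set to `"unused"`. With several users you would typically keep one history per session id. The sketch below shows that idea using plain lists. `get_session_history` and the `_histories` store are hypothetical, and a real setup would return a chat history object instead of a list.

```python
from collections import defaultdict
from typing import DefaultDict, List

# One message list per session id, created on first access.
_histories: DefaultDict[str, List[str]] = defaultdict(list)


def get_session_history(session_id: str) -> List[str]:
    """Return the history that belongs to `session_id`."""
    return _histories[session_id]


get_session_history("alice").append("I'm Nemo!")
get_session_history("bob").append("I'm Dory!")
get_session_history("alice").append("What is my name?")

print(get_session_history("alice"))
print(get_session_history("bob"))
```

Keeping the histories separate prevents one user's messages from leaking into another user's context.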