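# Combine agents and vector stores

This notebook covers how to combine agents and vector stores. The use case is that you have ingested your data into a vector store and want to interact with it in an agentic manner.

The recommended approach is to create a `RetrievalQA` chain and then use it as a tool in the overall agent, as shown below. You can do this with several different vector databases and use the agent to route between them. There are two ways to do this: you can let the agent use the vector stores as normal tools, or you can set `return_direct=True` so that the agent acts purely as a router.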
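To make the idea of "a chain used as a tool" concrete, here is a minimal sketch in plain Python. It does not use LangChain: `SimpleTool`, `render_tool_menu`, and the two answer functions are hypothetical stand-ins. It only illustrates that a tool is a name, a callable, and a description, and that the descriptions are what a model would read when deciding where to route a question.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SimpleTool:
    """Hypothetical stand-in for an agent tool (not the LangChain class)."""

    name: str
    func: Callable[[str], str]
    description: str
    return_direct: bool = False


def render_tool_menu(tools: List[SimpleTool]) -> str:
    """Build the text a prompt could use to tell the model which tools exist."""
    return "\n".join(f"{tool.name}: {tool.description}" for tool in tools)


# Stand-ins for two question-answering chains, each backed by its own store.
def answer_from_speech(question: str) -> str:
    return f"[speech store] answer to: {question}"


def answer_from_ruff_docs(question: str) -> str:
    return f"[ruff store] answer to: {question}"


toy_tools = [
    SimpleTool("Speech QA", answer_from_speech, "questions about the speech"),
    SimpleTool("Ruff QA", answer_from_ruff_docs, "questions about ruff", return_direct=True),
]

tool_menu = render_tool_menu(toy_tools)
print(tool_menu)
```

The `return_direct` flag is carried on the tool itself, which is why the choice between "normal tool" and "router" can be made per vector store.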

## Create the vector store

In [16]:

# Import the required libraries
from langchain.chains import RetrievalQA
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAI, OpenAIEmbeddings
from langchain_text_splitters import CharacterTextSplitter

# Create an OpenAI LLM with the temperature set to 0
llm = OpenAI(temperature=0)


In [17]:

from pathlib import Path

# Collect the parts of the current working directory, stopping once the
# path ends in langchain/docs/modules
relevant_parts = []
for p in Path(".").absolute().parts:
    relevant_parts.append(p)
    if relevant_parts[-3:] == ["langchain", "docs", "modules"]:
        break
# Build the path to the example document as a string
doc_path = str(Path(*relevant_parts) / "state_of_the_union.txt")


In [18]:

from langchain_community.document_loaders import TextLoader

# Load the document from disk
loader = TextLoader(doc_path)
documents = loader.load()

# Split the document into chunks of about 1000 characters with no overlap
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_documents(documents)

# Embed the chunks and store them in a Chroma collection
embeddings = OpenAIEmbeddings()
docsearch = Chroma.from_documents(texts, embeddings, collection_name="state-of-union")


Running Chroma using direct local API.
Using DuckDB in-memory for database. Data will be transient.
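As a rough illustration of what the vector store does once the chunks are embedded, the sketch below ranks chunks by cosine similarity to a query. It is a toy under stated assumptions: `toy_embed` is a bag-of-words counter rather than a learned embedding model, and `ToyVectorStore` is a hypothetical class, not Chroma.

```python
import math
from collections import Counter


def toy_embed(text):
    """Toy embedding: word counts (a real embedding model is learned)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Hypothetical in-memory store: keeps (chunk, vector) pairs."""

    def __init__(self, chunks):
        self.entries = [(chunk, toy_embed(chunk)) for chunk in chunks]

    def similarity_search(self, query, k=1):
        """Return the k chunks whose vectors are closest to the query vector."""
        query_vector = toy_embed(query)
        ranked = sorted(
            self.entries,
            key=lambda entry: cosine_similarity(query_vector, entry[1]),
            reverse=True,
        )
        return [chunk for chunk, _ in ranked[:k]]


toy_chunks = [
    "ruff is a fast python linter",
    "the president nominated a judge to the supreme court",
    "chroma stores embeddings for retrieval",
]
toy_store = ToyVectorStore(toy_chunks)
top_chunks = toy_store.similarity_search("which linter is fast", k=1)
print(top_chunks)
```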


In [4]:

# Create a RetrievalQA chain over the State of the Union collection
state_of_union = RetrievalQA.from_chain_type(
    llm=llm,  # the language model created above
    chain_type="stuff",  # put all retrieved chunks into a single prompt
    retriever=docsearch.as_retriever()  # retrieve chunks from the Chroma store
)
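The `"stuff"` chain type can be pictured as follows: retrieve the relevant chunks, concatenate ("stuff") them all into one prompt, and make a single model call. The sketch below is an assumption-laden illustration; the prompt wording, `build_stuff_prompt`, `toy_retrieval_qa`, and the fake retriever and model are all hypothetical and are not LangChain's actual template or API.

```python
def build_stuff_prompt(question, retrieved_chunks):
    """Concatenate every retrieved chunk into one prompt (the "stuff" strategy)."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


def toy_retrieval_qa(question, retrieve, llm_call):
    """Retrieve chunks, stuff them into one prompt, and make one model call."""
    chunks = retrieve(question)
    return llm_call(build_stuff_prompt(question, chunks))


# Hypothetical stand-ins so the sketch runs without a vector store or an API key.
def fake_retrieve(question):
    return ["Ruff is a Python linter.", "Ruff can fix some violations automatically."]


def fake_llm(prompt):
    return "stub answer"


stuff_prompt = build_stuff_prompt("What is Ruff?", fake_retrieve("What is Ruff?"))
toy_answer = toy_retrieval_qa("What is Ruff?", fake_retrieve, fake_llm)
print(stuff_prompt)
print(toy_answer)
```

Because every chunk lands in one prompt, this strategy is simple but only works while the retrieved text fits in the model's context window.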


In [5]:

# Import WebBaseLoader from langchain_community.document_loaders
from langchain_community.document_loaders import WebBaseLoader


In [6]:

# Create a WebBaseLoader for the Ruff FAQ page
loader = WebBaseLoader("https://beta.ruff.rs/docs/faq/")


In [7]:

# Load the page
docs = loader.load()

# Split the page into chunks
ruff_texts = text_splitter.split_documents(docs)

# Embed the chunks and store them in a second Chroma collection
ruff_db = Chroma.from_documents(ruff_texts, embeddings, collection_name="ruff")

# Create a RetrievalQA chain over the Ruff collection
ruff = RetrievalQA.from_chain_type(
    llm=llm, chain_type="stuff", retriever=ruff_db.as_retriever()
)


Running Chroma using direct local API.
Using DuckDB in-memory for database. Data will be transient.


## Create the agent



In [43]:
# Import the agent helpers and the OpenAI LLM wrapper
from langchain.agents import AgentType, Tool, initialize_agent
from langchain_openai import OpenAI

In [44]:

# Define the tools: one per RetrievalQA chain
tools = [
    Tool(
        name="State of Union QA System",
        func=state_of_union.run,  # run the State of the Union RetrievalQA chain
        description="useful for when you need to answer questions about the most recent state of the union address. Input should be a fully formed question.",
    ),
    Tool(
        name="Ruff QA System",
        func=ruff.run,  # run the Ruff RetrievalQA chain
        description="useful for when you need to answer questions about ruff (a python linter). Input should be a fully formed question.",
    ),
]


In [45]:
# Construct the agent. We will use the default agent type here.
# See the documentation for a full list of options.
# verbose=True prints the intermediate steps of each run.
agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)
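An agent of type `ZERO_SHOT_REACT_DESCRIPTION` drives the model through a text protocol: each step is either an `Action:` line followed by an `Action Input:` line, or a `Final Answer:` line. The sketch below shows one way such text could be parsed. `parse_react_step` and its regular expression are illustrative assumptions, not LangChain's own output parser.

```python
import re

# Hypothetical pattern for a single "Action / Action Input" step.
ACTION_PATTERN = re.compile(r"Action:\s*(.*?)\s*\nAction Input:\s*(.*)", re.DOTALL)


def parse_react_step(model_text):
    """Return ("finish", answer) or ("act", tool_name, tool_input)."""
    if "Final Answer:" in model_text:
        return ("finish", model_text.split("Final Answer:", 1)[1].strip())
    match = ACTION_PATTERN.search(model_text)
    if match is None:
        raise ValueError(f"Could not parse model output: {model_text!r}")
    return ("act", match.group(1).strip(), match.group(2).strip())


act_step = parse_react_step(
    " I need to look this up.\n"
    "Action: Ruff QA System\n"
    "Action Input: Why use ruff over flake8?"
)
finish_step = parse_react_step(
    " I now know the final answer\nFinal Answer: Because it is fast."
)
print(act_step)
print(finish_step)
```

The tool name parsed from `Action:` is matched against the `name` of each `Tool`, which is why tool names should be distinct and descriptive.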

In [46]:
agent.run(
    "What did biden say about ketanji brown jackson in the state of the union address?"
)



> Entering new AgentExecutor chain...
I need to find out what Biden said about Ketanji Brown Jackson in the State of the Union address.
Action: State of Union QA System
Action Input: What did Biden say about Ketanji Brown Jackson in the State of the Union address?
Observation: Biden said that Jackson is one of the nation's top legal minds and that she will continue Justice Breyer's legacy of excellence.
Thought: I now know the final answer
Final Answer: Biden said that Jackson is one of the nation's top legal minds and that she will continue Justice Breyer's legacy of excellence.

> Finished chain.


"Biden said that Jackson is one of the nation's top legal minds and that she will continue Justice Breyer's legacy of excellence."

In [47]:

# Ask the agent a question about Ruff
agent.run("Why use ruff over flake8?")




> Entering new AgentExecutor chain...
I need to find out the advantages of using ruff over flake8
Action: Ruff QA System
Action Input: What are the advantages of using ruff over flake8?
Observation: Ruff can be used as a drop-in replacement for Flake8 when used (1) without or with a small number of plugins, (2) alongside Black, and (3) on Python 3 code. It also re-implements some of the most popular Flake8 plugins and related code quality tools natively, including isort, yesqa, eradicate, and most of the rules implemented in pyupgrade. Ruff also supports automatically fixing its own lint violations, which Flake8 does not.
Thought: I now know the final answer
Final Answer: Ruff can be used as a drop-in replacement for Flake8 when used (1) without or with a small number of plugins, (2) alongside Black, and (3) on Python 3 code. It also re-implements some of the most popular Flake8 plugins and related code quality tools natively, 

'Ruff can be used as a drop-in replacement for Flake8 when used (1) without or with a small number of plugins, (2) alongside Black, and (3) on Python 3 code. It also re-implements some of the most popular Flake8 plugins and related code quality tools natively, including isort, yesqa, eradicate, and most of the rules implemented in pyupgrade. Ruff also supports automatically fixing its own lint violations, which Flake8 does not.'

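## Use the agent solely as a router

You can also set `return_direct=True` if you intend to use the agent as a router and just want to return the result of the RetrievalQAChain directly.

Notice that in the examples above the agent did some extra work after querying the RetrievalQAChain: it made another model call to restate the observation as a final answer. You can avoid that and return the tool's result directly.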
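The effect of `return_direct=True` can be sketched with a toy agent loop. This is not LangChain's executor: `run_toy_agent` is hypothetical, and the scripted steps stand in for decisions a model would make. The point it illustrates is that a direct-return tool ends the run as soon as it produces an observation, saving the extra model call.

```python
def run_toy_agent(tools, scripted_steps):
    """Run a scripted agent loop.

    tools maps a tool name to (callable, return_direct).
    scripted_steps stands in for model decisions; each step counts as one model call.
    Returns (answer, number_of_model_calls).
    """
    llm_calls = 0
    last_observation = None
    for step in scripted_steps:
        llm_calls += 1
        if step["kind"] == "finish":
            # The model restates the last observation as its final answer.
            return last_observation, llm_calls
        func, return_direct = tools[step["tool"]]
        last_observation = func(step["tool_input"])
        if return_direct:
            # Router mode: hand the tool output straight back to the caller.
            return last_observation, llm_calls
    raise RuntimeError("script ended before the agent produced an answer")


def toy_ruff_qa(question):
    # Canned reply standing in for a RetrievalQA chain.
    return "stub answer about ruff"


script = [
    {"kind": "act", "tool": "Ruff QA", "tool_input": "Why use ruff over flake8?"},
    {"kind": "finish"},
]

normal_result = run_toy_agent({"Ruff QA": (toy_ruff_qa, False)}, script)
router_result = run_toy_agent({"Ruff QA": (toy_ruff_qa, True)}, script)
print(normal_result)
print(router_result)
```

Both runs return the same text; the router run simply stops one model call earlier.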

In [48]:
tools = [
    Tool(
        name="State of Union QA System",
        func=state_of_union.run,
        description="useful for when you need to answer questions about the most recent state of the union address. Input should be a fully formed question.",
        return_direct=True,
    ),
    Tool(
        name="Ruff QA System",
        func=ruff.run,
        description="useful for when you need to answer questions about ruff (a python linter). Input should be a fully formed question.",
        return_direct=True,
    ),
]

In [49]:

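# Initialize the agent with the ZERO_SHOT_REACT_DESCRIPTION agent type
# Arguments:
# tools: the list of tools
# llm: the language model
# agent: the agent type, here ZERO_SHOT_REACT_DESCRIPTION
# verbose: whether to print the intermediate steps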
agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)


In [50]:
agent.run(
    "What did biden say about ketanji brown jackson in the state of the union address?"
)



> Entering new AgentExecutor chain...
I need to find out what Biden said about Ketanji Brown Jackson in the State of the Union address.
Action: State of Union QA System
Action Input: What did Biden say about Ketanji Brown Jackson in the State of the Union address?
Observation: Biden said that Jackson is one of the nation's top legal minds and that she will continue Justice Breyer's legacy of excellence.

> Finished chain.


" Biden said that Jackson is one of the nation's top legal minds and that she will continue Justice Breyer's legacy of excellence."

In [51]:
# Ask the agent a question about Ruff
agent.run("Why use ruff over flake8?")




> Entering new AgentExecutor chain...
I need to find out the advantages of using ruff over flake8
Action: Ruff QA System
Action Input: What are the advantages of using ruff over flake8?
Observation: Ruff can be used as a drop-in replacement for Flake8 when used (1) without or with a small number of plugins, (2) alongside Black, and (3) on Python 3 code. It also re-implements some of the most popular Flake8 plugins and related code quality tools natively, including isort, yesqa, eradicate, and most of the rules implemented in pyupgrade. Ruff also supports automatically fixing its own lint violations, which Flake8 does not.

> Finished chain.


' Ruff can be used as a drop-in replacement for Flake8 when used (1) without or with a small number of plugins, (2) alongside Black, and (3) on Python 3 code. It also re-implements some of the most popular Flake8 plugins and related code quality tools natively, including isort, yesqa, eradicate, and most of the rules implemented in pyupgrade. Ruff also supports automatically fixing its own lint violations, which Flake8 does not.'

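## Multi-hop vector store reasoning

Because vector stores are easily usable as tools in agents, it is easy to answer multi-hop questions that depend on vector stores using the existing agent framework.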
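A multi-hop question is one where the second lookup depends on the answer to the first. The sketch below shows the shape of that dependency in plain Python. Everything in it is a hypothetical stand-in: the two toy functions return canned data, `TOY_SPEECH_TEXT` is placeholder text, and `answer_two_hop` hard-codes a plan that a real agent would work out step by step.

```python
# Placeholder text standing in for the indexed speech.
TOY_SPEECH_TEXT = "placeholder speech text about the economy and the supreme court"


def toy_ruff_qa(question):
    # Canned reply standing in for a RetrievalQA chain over the Ruff docs.
    return "nbQA"


def toy_speech_qa(question):
    # Canned check: does the quoted term in the question occur in the toy text?
    term = question.split("'")[1]
    return "yes" if term.lower() in TOY_SPEECH_TEXT.lower() else "no"


def answer_two_hop(first_tool, second_tool, first_question, follow_up_template):
    """Ask the first tool, then build a self-contained follow-up from its answer."""
    first_observation = first_tool(first_question)
    # Substitute the resolved name so the second question has no dangling pronoun.
    follow_up = follow_up_template.format(answer=first_observation)
    return follow_up, second_tool(follow_up)


follow_up, final_answer = answer_two_hop(
    toy_ruff_qa,
    toy_speech_qa,
    "What tool does ruff use to run over Jupyter Notebooks?",
    "Did the president mention '{answer}' in the state of the union?",
)
print(follow_up)
print(final_answer)
```

The second tool only sees the follow-up string, so that string has to name the thing being asked about rather than refer to it as "that tool".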

In [57]:
# Define the tools; the descriptions ask for self-contained questions
tools = [
    Tool(
        name="State of Union QA System",
        func=state_of_union.run,  # run the State of the Union RetrievalQA chain
        description="useful for when you need to answer questions about the most recent state of the union address. Input should be a fully formed question, not referencing any obscure pronouns from the conversation before.",
    ),
    Tool(
        name="Ruff QA System",
        func=ruff.run,  # run the Ruff RetrievalQA chain
        description="useful for when you need to answer questions about ruff (a python linter). Input should be a fully formed question, not referencing any obscure pronouns from the conversation before.",
    ),
]

In [58]:
# Construct the agent. We will use the default agent type here.
# See the documentation for a full list of options.
# verbose=True prints the intermediate steps of each run.
agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)

In [59]:

agent.run(
    "What tool does ruff use to run over Jupyter Notebooks? Did the president mention that tool in the state of the union?"
)




> Entering new AgentExecutor chain...
I need to find out what tool ruff uses to run over Jupyter Notebooks, and if the president mentioned it in the state of the union.
Action: Ruff QA System
Action Input: What tool does ruff use to run over Jupyter Notebooks?
Observation: Ruff is integrated into nbQA, a tool for running linters and code formatters over Jupyter Notebooks. After installing ruff and nbqa, you can run Ruff over a notebook like so: > nbqa ruff Untitled.html
Thought: I now need to find out if the president mentioned this tool in the state of the union.
Action: State of Union QA System
Action Input: Did the president mention nbQA in the state of the union?
Observation: No, the president did not mention nbQA in the state of the union.
Thought: I now know the final answer.
Final Answer: No, the president did not mention nbQA in the state of the union.

> Finished chain.


'No, the president did not mention nbQA in the state of the union.'