# Combine agents and vector stores
---

This notebook covers how to combine agents and vector stores. The use case is that you have ingested your data into a vector store and want to interact with it through an agent.

The recommended way to do this is to create a `RetrievalQA` chain and then use it as a tool in the overall agent. Let's take a look at how to do this below. You can do this with several different vector databases and use the agent as a way to route between them. There are two ways to do this: you can let the agent use the vector stores as normal tools, or you can set `return_direct=True` so that the agent acts purely as a router.

In [6]:
from dotenv import load_dotenv, find_dotenv
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.text_splitter import CharacterTextSplitter
from langchain.llms import OpenAI
from langchain.chains import RetrievalQA

load_dotenv(find_dotenv())

True

Initialize the LLM.

In [3]:
llm = OpenAI(temperature=0)

Next, load the data: open the file and prepare it for the vector store.

1. Load the text file.
2. Split the text into chunks with a target size of 1000 characters and no overlap.

In [9]:
from langchain.document_loaders import TextLoader

doc_path = "../../example_data/state_of_the_union.txt"

loader = TextLoader(doc_path)
documents = loader.load()

text_splitter = CharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=0,
)

texts = text_splitter.split_documents(documents)
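
To make the splitting step concrete, here is a minimal pure-Python sketch of separator-based chunking with no overlap. It is a simplified illustration, not LangChain's implementation: the function name `split_text` and the sample text are hypothetical, and it assumes the splitter breaks on blank lines (`"\n\n"`) and then greedily merges pieces up to `chunk_size`. Under that assumption, a single piece longer than the limit is kept whole, which is one way a chunk can end up larger than the requested size.

```python
def split_text(text, chunk_size=1000, separator="\n\n"):
    """Greedily merge separator-delimited pieces into chunks, with no overlap.

    Hypothetical helper for illustration only; not the library's code.
    """
    chunks, current = [], ""
    for piece in text.split(separator):
        if not piece:
            continue
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size:
            # The piece still fits, so keep growing the current chunk.
            current = candidate
        else:
            # Close the current chunk and start a new one with this piece.
            if current:
                chunks.append(current)
            current = piece  # may exceed chunk_size if the piece itself is long
    if current:
        chunks.append(current)
    return chunks


# Made-up sample: three short paragraphs and one that is longer than the limit.
sample = "\n\n".join(["a" * 400, "b" * 400, "c" * 400, "d" * 1300])
chunks = split_text(sample, chunk_size=1000)
chunk_lengths = [len(c) for c in chunks]
print(chunk_lengths)
```

The last chunk in this sample is longer than `chunk_size` because the sketch never cuts inside a paragraph.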


Initialize the OpenAI embeddings.

In [10]:
embeddings = OpenAIEmbeddings()

Create a vector store using Chroma, which is one of the simplest to use.

In [12]:
vectorstore = Chroma.from_documents(
    texts,
    embeddings,
    collection_name="state-of-union",
    persist_directory="../../chromadb/"
)

vectorstore.persist()
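
Conceptually, a vector store turns each chunk into a vector and answers a query by returning the chunks whose vectors are closest to the query vector. The sketch below illustrates that idea in plain Python. Everything in it is an assumption made for illustration: `ToyVectorStore` is a hypothetical class, and `embed` is a word-count stand-in for a real embedding model such as the one behind `OpenAIEmbeddings`. It does not reflect how Chroma stores or indexes data.

```python
import math
from collections import Counter


def embed(text):
    """Hypothetical stand-in for an embedding model: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """In-memory store that ranks texts by cosine similarity to the query."""

    def __init__(self):
        self.entries = []  # list of (vector, text) pairs

    def add_texts(self, texts):
        for text in texts:
            self.entries.append((embed(text), text))

    def similarity_search(self, query, k=1):
        query_vector = embed(query)
        ranked = sorted(
            self.entries,
            key=lambda entry: cosine(query_vector, entry[0]),
            reverse=True,
        )
        return [text for _, text in ranked[:k]]


store = ToyVectorStore()
store.add_texts([
    "ruff is a fast python linter",
    "the president nominated a judge to the supreme court",
])
top = store.similarity_search("which linter is fast", k=1)
print(top)
```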

Build a `RetrievalQA` chain for the "state-of-union" vector store.

In [15]:
state_of_union = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever(),
)
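
With `chain_type="stuff"`, the retrieved chunks are all placed ("stuffed") into a single prompt together with the question, and the LLM is called once. The sketch below shows that flow under stated assumptions: `stuff_qa`, `fake_retrieve`, and `fake_llm` are hypothetical names, and the prompt wording is invented for illustration rather than copied from the library.

```python
def stuff_qa(question, retrieve, llm, k=2):
    """Retrieve k chunks, put them all into one prompt, and call the LLM once."""
    docs = retrieve(question, k)
    context = "\n\n".join(docs)
    # Illustrative prompt wording; the real chain uses its own template.
    prompt = (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
    return llm(prompt)


def fake_retrieve(question, k):
    """Stand-in for a retriever: returns the first k chunks of a tiny corpus."""
    corpus = [
        "Ruff is a Python linter.",
        "Ruff can fix some violations automatically.",
    ]
    return corpus[:k]


seen_prompts = []


def fake_llm(prompt):
    """Stand-in for an LLM: records the prompt and returns a fixed string."""
    seen_prompts.append(prompt)
    return "stub answer"


answer = stuff_qa("What is Ruff?", fake_retrieve, fake_llm)
print(seen_prompts[0])
```

Because every retrieved chunk goes into one prompt, this approach only works while the combined text fits in the model's context window.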

In addition to the text file, we will also load data from a URL. For this example we use another loader, `WebBaseLoader`.

In [13]:
from langchain.document_loaders import WebBaseLoader

loader = WebBaseLoader("https://beta.ruff.rs/docs/faq/")

ruff_documents = loader.load()

ruff_texts = text_splitter.split_documents(ruff_documents)

ruff_vectorstore = Chroma.from_documents(
    ruff_texts,
    embeddings,
    collection_name="ruff",
    persist_directory="../../chromadb/"
)

ruff_vectorstore.persist()

Created a chunk of size 1354, which is longer than the specified 1000
Created a chunk of size 1923, which is longer than the specified 1000
Created a chunk of size 1085, which is longer than the specified 1000
Created a chunk of size 1267, which is longer than the specified 1000


Build a `RetrievalQA` chain for Ruff.

In [16]:
ruff = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=ruff_vectorstore.as_retriever(),
)

# Creating the agent
---

In [14]:
# General-purpose imports for building the agent
from langchain.agents import initialize_agent, Tool, AgentType
from langchain.tools import BaseTool
from langchain.llms import OpenAI
from langchain.chains import LLMMathChain
from langchain.utilities import SerpAPIWrapper

Define the tools the agent will use to retrieve information.

In [17]:
tools = [
    Tool(
        name="State of Union QA System",
        func=state_of_union.run,
        description="useful for when you need to answer questions about the most recetn state of the union address. Input should be a fully formed question.",
    ),
    Tool(
        name="Ruff QA System",
        func=ruff.run,
        description="useful for when you need to answer questions about ruff (a python linter). Input should be a fully formed question.",
    )
]

We will build an agent of type `ZERO_SHOT_REACT_DESCRIPTION`. Recall that this type of agent has a smaller prompt with fewer examples and chooses a tool based on each tool's description.

To see the steps and how the agent works, we set `verbose=True` so that its intermediate "thinking" is printed to the console.

In [19]:
agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
)

Let's look at the prompt generated for this agent.

In [20]:
print(agent.agent.llm_chain.prompt.template)

Answer the following questions as best you can. You have access to the following tools:

State of Union QA System: useful for when you need to answer questions about the most recetn state of the union address. Input should be a fully formed question.
Ruff QA System: useful for when you need to answer questions about ruff (a python linter). Input should be a fully formed question.

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [State of Union QA System, Ruff QA System]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: {input}
Thought:{agent_scratchpad}
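
The prompt above is built from the tools themselves: each tool contributes a `name: description` line, and the tool names are listed as the allowed values for `Action`. The following sketch reproduces that assembly in plain Python. It is an approximation for illustration only: `SimpleTool` and `build_prompt` are hypothetical names, not LangChain APIs, and the template text is copied from the printed output rather than from the library source.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SimpleTool:
    """Hypothetical stand-in for a tool: a name, a callable, and a description."""
    name: str
    func: Callable[[str], str]
    description: str


def build_prompt(tools: List[SimpleTool]) -> str:
    """Render tool descriptions and names into a ReAct-style prompt template."""
    tool_lines = "\n".join(f"{t.name}: {t.description}" for t in tools)
    tool_names = ", ".join(t.name for t in tools)
    return (
        "Answer the following questions as best you can. "
        "You have access to the following tools:\n\n"
        f"{tool_lines}\n\n"
        "Use the following format:\n\n"
        "Question: the input question you must answer\n"
        "Thought: you should always think about what to do\n"
        f"Action: the action to take, should be one of [{tool_names}]\n"
        "Action Input: the input to the action\n"
        "Observation: the result of the action\n"
        "... (this Thought/Action/Action Input/Observation can repeat N times)\n"
        "Thought: I now know the final answer\n"
        "Final Answer: the final answer to the original input question\n\n"
        "Begin!\n\n"
        # Plain (non-f) strings: the placeholders stay literal for later filling.
        "Question: {input}\n"
        "Thought:{agent_scratchpad}"
    )


demo_tools = [
    SimpleTool("State of Union QA System", lambda q: "stub", "answers questions about the address."),
    SimpleTool("Ruff QA System", lambda q: "stub", "answers questions about ruff."),
]
prompt = build_prompt(demo_tools)
print(prompt)
```

Since the descriptions are the only information the model has about each tool, their wording directly affects which tool is chosen.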


Run the agent with a question to see how it works.

In [21]:
agent.run("What did biden say about ketanji brow jackson in the state of the union address")


> Entering new AgentExecutor chain...
 I need to find out what Biden said about Ketanji Brown Jackson in the State of the Union address
Action: State of Union QA System
Action Input: What did Biden say about Ketanji Brown Jackson in the State of the Union address
Observation: Biden said that Ketanji Brown Jackson is one of the nation's top legal minds and that she will continue Justice Breyer's legacy of excellence.
Thought: I now know the final answer
Final Answer: Biden said that Ketanji Brown Jackson is one of the nation's top legal minds and that she will continue Justice Breyer's legacy of excellence.

> Finished chain.


"Biden said that Ketanji Brown Jackson is one of the nation's top legal minds and that she will continue Justice Breyer's legacy of excellence."
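
The trace above follows a Thought / Action / Observation loop: the LLM proposes an action, the executor runs the matching tool, and the observation is appended to the scratchpad until the LLM emits a final answer. Here is a minimal sketch of such a loop. It is not the `AgentExecutor` implementation: `run_agent`, `scripted_llm`, and `ruff_tool` are hypothetical names, the parsing regex is an assumption, and the LLM is replaced by a scripted function so the example runs without an API key.

```python
import re

# Assumed output format: "Action: <tool name>" followed by "Action Input: <text>".
ACTION_RE = re.compile(r"Action:\s*(.*?)\nAction Input:\s*(.*)", re.DOTALL)


def run_agent(question, llm, tools, max_steps=5):
    """Loop: ask the LLM, run the chosen tool, feed the observation back."""
    scratchpad = ""
    for _ in range(max_steps):
        output = llm(question, scratchpad)
        if "Final Answer:" in output:
            return output.split("Final Answer:", 1)[1].strip()
        match = ACTION_RE.search(output)
        if match is None:
            raise ValueError(f"Could not parse LLM output: {output!r}")
        tool_name = match.group(1).strip()
        tool_input = match.group(2).strip()
        observation = tools[tool_name](tool_input)
        # The next LLM call sees everything that has happened so far.
        scratchpad += f"{output}\nObservation: {observation}\nThought:"
    raise RuntimeError("Agent stopped: step limit reached")


def scripted_llm(question, scratchpad):
    """Stand-in for an LLM: first picks a tool, then gives the final answer."""
    if "Observation:" not in scratchpad:
        return (
            " I need to look this up\n"
            "Action: Ruff QA System\n"
            "Action Input: Why use ruff over flake8?"
        )
    return " I now know the final answer\nFinal Answer: Ruff is faster."


tool_calls = []


def ruff_tool(query):
    """Stand-in for a retrieval QA chain: records the query, returns a fixed answer."""
    tool_calls.append(query)
    return "Ruff is faster."


result = run_agent(
    "Why use ruff over flake8",
    scripted_llm,
    {"Ruff QA System": ruff_tool},
)
print(result)
```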

Now let's try an example with a question about Ruff.

In [23]:
agent.run("Why use ruff over flake8")


> Entering new AgentExecutor chain...
 I need to know what ruff is and why it is better than flake8
Action: Ruff QA System
Action Input: What is ruff and why is it better than flake8?
Observation: Ruff is a linter for Python code that is designed to be faster and more reliable than Flake8. It implements 440 rules, of which at least 89 overlap with the Pylint rule set. It also supports automatic fixing of lint violations, and is installable under any Python version from 3.7 onwards.
Thought: I now know the final answer
Final Answer: Ruff is a linter for Python code that is designed to be faster and more reliable than Flake8. It implements 440 rules, of which at least 89 overlap with the Pylint rule set. It also supports automatic fixing of lint violations, and is installable under any Python version from 3.7 onwards.

> Finished chain.


'Ruff is a linter for Python code that is designed to be faster and more reliable than Flake8. It implements 440 rules, of which at least 89 overlap with the Pylint rule set. It also supports automatic fixing of lint violations, and is installable under any Python version from 3.7 onwards.'

# Use the agent solely as a router
---

Set `return_direct=True` if you intend to use the agent as a router and only want to return the result of the `RetrievalQA` chain directly.

Notice that in the examples above the agent did some extra work after querying the `RetrievalQA` chain: it made one more LLM call to restate the observation as the final answer.

You can avoid that and simply return the result directly.

In [24]:
tools = [
    Tool(
        name="State of Union QA System",
        func=state_of_union.run,
        description="useful for when you need to answer questions about the most recent state of the union address. Input should be a fully formed question.",
        return_direct=True,
    ),
    Tool(
        name="Ruff QA System",
        func=ruff.run,
        description="useful for when you need to answer questions about ruff (a python linter). Input should be a fully formed question.",
        return_direct=True,
    ),
]
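
The effect of `return_direct=True` can be sketched as an early exit in the agent loop: once a tool flagged this way returns, its output becomes the final result and the LLM is not called again. The code below illustrates that behavior under stated assumptions. `SimpleTool`, `run`, and `scripted_llm` are hypothetical names, and the LLM's decision is modeled as a tuple rather than parsed text, so this is not how the library represents it internally.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SimpleTool:
    """Hypothetical stand-in for a tool with a return_direct flag."""
    name: str
    func: Callable[[str], str]
    return_direct: bool = False


def run(question, llm, tools, max_steps=5):
    """Agent loop that returns a tool's output immediately if return_direct is set."""
    by_name = {tool.name: tool for tool in tools}
    observations = []
    for _ in range(max_steps):
        decision = llm(question, observations)
        if decision[0] == "final":
            return decision[1]
        _, name, tool_input = decision
        tool = by_name[name]
        observation = tool.func(tool_input)
        if tool.return_direct:
            return observation  # router mode: no further LLM call
        observations.append(observation)
    raise RuntimeError("Agent stopped: step limit reached")


llm_calls = []


def scripted_llm(question, observations):
    """Stand-in for an LLM: picks the Ruff tool first, then summarizes."""
    llm_calls.append(question)
    if not observations:
        return ("tool", "Ruff QA System", question)
    return ("final", f"Summary: {observations[-1]}")


def ruff_qa(query):
    return "Ruff is fast."


normal = run("Why ruff?", scripted_llm, [SimpleTool("Ruff QA System", ruff_qa)])
calls_normal = len(llm_calls)

llm_calls.clear()
direct = run("Why ruff?", scripted_llm, [SimpleTool("Ruff QA System", ruff_qa, return_direct=True)])
calls_direct = len(llm_calls)

print(normal, calls_normal)
print(direct, calls_direct)
```

In this sketch the normal tool costs two LLM calls and the answer passes through the model a second time, while the direct tool costs one call and returns the tool output unchanged.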

Initialize the agent again.

In [25]:
agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)

Run the test again with a question about "state-of-union".

In [26]:
agent.run(
    "What did biden say about ketanji brown jackson in the state of the union address?"
)


> Entering new AgentExecutor chain...
 I need to find out what Biden said about Ketanji Brown Jackson in the State of the Union address.
Action: State of Union QA System
Action Input: What did Biden say about Ketanji Brown Jackson in the State of the Union address?
Observation: Biden said that Jackson is one of the nation's top legal minds and that she will continue Justice Breyer's legacy of excellence.

> Finished chain.


" Biden said that Jackson is one of the nation's top legal minds and that she will continue Justice Breyer's legacy of excellence."

In [27]:
agent.run("Why use ruff over flake8?")


> Entering new AgentExecutor chain...
 I need to find out the advantages of using ruff over flake8
Action: Ruff QA System
Action Input: What are the advantages of using ruff over flake8?
Observation: Ruff can be used as a drop-in replacement for Flake8 when used (1) without or with a small number of plugins, (2) alongside Black, and (3) on Python 3 code. Ruff also re-implements some of the most popular Flake8 plugins and related code quality tools natively, and it can automatically fix its own lint violations. Additionally, Ruff uses different rule codes and prefixes than would be found in the originating Flake8 plugins, which helps minimize conflicts across plugins and allows any individual plugin to be toggled on or off with a single (e.g.) --select TID, as opposed to --select I2.

> Finished chain.


' Ruff can be used as a drop-in replacement for Flake8 when used (1) without or with a small number of plugins, (2) alongside Black, and (3) on Python 3 code. Ruff also re-implements some of the most popular Flake8 plugins and related code quality tools natively, and it can automatically fix its own lint violations. Additionally, Ruff uses different rule codes and prefixes than would be found in the originating Flake8 plugins, which helps minimize conflicts across plugins and allows any individual plugin to be toggled on or off with a single (e.g.) --select TID, as opposed to --select I2.'