# Lab | Agent & Vector store

**Replace the State of the Union dataset with a different one and replicate this lab, updating the prompts accordingly.**

One option is the [sonnets.txt](https://github.com/martin-gorner/tensorflow-rnn-shakespeare/blob/master/shakespeare/sonnets.txt) dataset; any other dataset of your choice from the same repository also works.

# Combine agents and vector stores

This notebook covers how to combine agents and vector stores. The use case for this is that you've ingested your data into a vector store and want to interact with it in an agentic manner.

The recommended method is to create a `RetrievalQA` chain and then use it as a tool in the overall agent. Let's take a look at doing this below. You can do this with multiple vector stores and use the agent to route between them. There are two ways of doing this: you can either let the agent use the vector stores as normal tools, or you can set `return_direct=True` to use the agent purely as a router.
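To make the idea of querying a vector store concrete, here is a minimal sketch that uses only the standard library. It assumes a toy bag-of-words embedding in place of `OpenAIEmbeddings` and a plain Python list in place of Chroma. The names `embed`, `cosine`, and `ToyVectorStore` are hypothetical and exist only for illustration; real embedding models produce dense learned vectors and real stores use an index.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory store: keeps (text, vector) pairs and ranks them by similarity."""

    def __init__(self, texts):
        self.entries = [(t, embed(t)) for t in texts]

    def search(self, query, k=1):
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


store = ToyVectorStore([
    "Shall I compare thee to a summer's day?",
    "Ruff is a Python linter written in Rust.",
])
top = store.search("Which linter is written in Rust?")
print(top)
```

A retrieval chain does roughly this lookup first and then passes the retrieved text to the LLM as context for answering the question.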

## Create the vector store

In [1]:
!pip install chromadb langchain langchain_community langchain_openai



In [2]:
from langchain.chains import RetrievalQA
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAI, OpenAIEmbeddings
from langchain_text_splitters import CharacterTextSplitter
from langchain_community.document_loaders import TextLoader

In [3]:
import os
from dotenv import load_dotenv, find_dotenv


In [4]:
from google.colab import userdata
import os

# Retrieve the API keys
OPENAI_API_KEY = userdata.get('OPENAI_API_KEY')


# Set as environment variables
os.environ['OPENAI_API_KEY'] = OPENAI_API_KEY


print("OpenAI API key loaded and set as environment variable.")


OpenAI API key loaded and set as environment variable.


In [5]:
# If you're using Colab, the key is already set from `userdata` in the previous cell.
# The line below would overwrite it with the literal string "OPENAI_API_KEY", so it stays commented out.
# os.environ['OPENAI_API_KEY'] = "OPENAI_API_KEY"

In [6]:
llm = OpenAI(temperature=0)

In [7]:
from pathlib import Path

relevant_parts = []
for p in Path(".").absolute().parts:
    relevant_parts.append(p)
    if relevant_parts[-3:] == ["langchain", "docs", "modules"]:
        break
doc_path = str(Path(*relevant_parts) / "state_of_the_union.txt")

In [8]:
import os

# Download the sonnets.txt file as suggested in the lab instructions
!wget -q https://raw.githubusercontent.com/martin-gorner/tensorflow-rnn-shakespeare/master/shakespeare/sonnets.txt -O /content/sonnets.txt

# Update doc_path to point to the downloaded sonnets.txt file
doc_path = '/content/sonnets.txt'

loader = TextLoader(doc_path)
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_documents(documents)

embeddings = OpenAIEmbeddings()
docsearch = Chroma.from_documents(texts, embeddings, collection_name="sonnets")
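The `chunk_size=1000, chunk_overlap=0` settings above control how the text is cut before embedding. The following is a simplified sketch of that idea, not the library's implementation: it assumes pieces are separated by a blank line and greedily packs them into chunks of at most `chunk_size` characters with no overlap. The function name `split_text` is hypothetical.

```python
def split_text(text, chunk_size=1000, separator="\n\n"):
    """Greedily pack separator-delimited pieces into chunks of at most chunk_size characters."""
    chunks, current = [], ""
    for piece in text.split(separator):
        if not piece.strip():
            continue  # skip empty pieces produced by repeated separators
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size or not current:
            # Either the piece still fits, or it is a single oversized piece kept whole.
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


sample = "aaaa\n\nbbbb\n\ncccc"
chunks = split_text(sample, chunk_size=10)
print(chunks)
```

Smaller chunks give the retriever more precise matches but less context per match; with zero overlap, a sentence that straddles a chunk boundary is only ever seen in pieces.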

In [9]:
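# Note: the variable keeps its original name, but this chain answers questions
# over the "sonnets" collection built above, not the State of the Union text.
state_of_union = RetrievalQA.from_chain_type(
    llm=llm, chain_type="stuff", retriever=docsearch.as_retriever()
)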

In [10]:
from langchain_community.document_loaders import WebBaseLoader



In [11]:
loader = WebBaseLoader("https://beta.ruff.rs/docs/faq/")

In [12]:
docs = loader.load()
ruff_texts = text_splitter.split_documents(docs)
ruff_db = Chroma.from_documents(ruff_texts, embeddings, collection_name="ruff")
ruff = RetrievalQA.from_chain_type(
    llm=llm, chain_type="stuff", retriever=ruff_db.as_retriever()
)



## Create the Agent

In [13]:
# Import things that are needed generically
from langchain.agents import AgentType, Tool, initialize_agent
from langchain_openai import OpenAI

In [14]:
tools = [
    Tool(
        name="State of Union QA System",
        func=state_of_union.run,
        description="useful for when you need to answer questions about the most recent state of the union address. Input should be a fully formed question.",
    ),
    Tool(
        name="Ruff QA System",
        func=ruff.run,
        description="useful for when you need to answer questions about ruff (a python linter). Input should be a fully formed question.",
    ),
]

In [15]:
# Construct the agent. We will use the default agent type here.
# See documentation for a full list of options.
agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)



In [16]:
agent.invoke(
    "What did biden say about ketanji brown jackson in the state of the union address?"
)



> Entering new AgentExecutor chain...
I should use the State of Union QA System to answer this question.
Action: State of Union QA System
Action Input: "What did Biden say about Ketanji Brown Jackson in the state of the union address?"
Observation: I don't know.
Thought: I should try rephrasing the question.
Action: State of Union QA System
Action Input: "What was Biden's mention of Ketanji Brown Jackson in the state of the union address?"
Observation: I don't know.
Thought: I should try using different keywords.
Action: State of Union QA System
Action Input: "What was Biden's statement about Ketanji Brown Jackson in the state of the union address?"
Observation: I don't know.
Thought: I should try using the full name.
Action: State of Union QA System
Action Input: "What was Biden's statement about Ketanji Brown Jackson in the state of the union address?"


{'input': 'What did biden say about ketanji brown jackson in the state of the union address?',
 'output': 'Agent stopped due to iteration limit or time limit.'}

In [17]:
agent.invoke("Why use ruff over flake8?")



> Entering new AgentExecutor chain...
You should always think about the differences between ruff and flake8 before deciding which one to use.
Action: Ruff QA System
Action Input: "What are the differences between ruff and flake8?"
Observation: Ruff implements more rules than Flake8 and does not support custom lint rules. It also has some minor differences in its rule set and does not include all the 'opinionated' rules from flake8-bugbear. Additionally, Ruff is capable of automatically fixing its own lint violations, while Flake8 supports plugins and allows for custom and third-party rules.
Thought: Now that I know the differences, I can make an informed decision.
Final Answer: Based on the differences between ruff and flake8, it seems that ruff may be a better choice for those looking for a more comprehensive and automated linter, while flake8 may be better for those who prefer more customization and flexibility. Ultimately, t

{'input': 'Why use ruff over flake8?',
 'output': 'Based on the differences between ruff and flake8, it seems that ruff may be a better choice for those looking for a more comprehensive and automated linter, while flake8 may be better for those who prefer more customization and flexibility. Ultimately, the best choice will depend on the specific needs and preferences of the user.'}

## Use the Agent solely as a router

You can also set `return_direct=True` if you intend to use the agent as a router and just want to return the result of the `RetrievalQA` chain directly.

Notice that in the above examples the agent did some extra work after querying the `RetrievalQA` chain. You can avoid that and just return the result directly.
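The difference between the two modes can be sketched without any LLM. The example below is a toy model under stated assumptions: `ToyTool`, `run_once`, `fake_qa`, and `fake_agent_summary` are hypothetical names, the tool returns a canned string, and the "agent" post-processing step is a plain function. It only illustrates the control flow, not how the agent executor is actually implemented.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ToyTool:
    """Hypothetical stand-in for a tool: a name, a function, and a return_direct flag."""
    name: str
    func: Callable[[str], str]
    return_direct: bool = False


def run_once(tool, question, post_process):
    """Call the tool; return its output untouched if return_direct, else let the 'agent' rework it."""
    observation = tool.func(question)
    if tool.return_direct:
        return observation
    return post_process(observation)


def fake_qa(question):
    # Canned answer standing in for a retrieval chain's output.
    return "Ruff implements more rules than Flake8."


def fake_agent_summary(observation):
    # Stand-in for the extra reasoning step the agent performs on an observation.
    return "Final Answer (reworded by the agent): " + observation


routed = run_once(ToyTool("Ruff QA", fake_qa, return_direct=True), "Why ruff?", fake_agent_summary)
reasoned = run_once(ToyTool("Ruff QA", fake_qa), "Why ruff?", fake_agent_summary)
print(routed)
print(reasoned)
```

With `return_direct=True` the caller sees the tool's answer verbatim, which saves one LLM call per query but also means the agent cannot retry or combine results from several tools.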

In [18]:
tools = [
    Tool(
        name="State of Union QA System",
        func=state_of_union.run,
        description="useful for when you need to answer questions about the most recent state of the union address. Input should be a fully formed question.",
        return_direct=True,
    ),
    Tool(
        name="Ruff QA System",
        func=ruff.run,
        description="useful for when you need to answer questions about ruff (a python linter). Input should be a fully formed question.",
        return_direct=True,
    ),
]

In [19]:
agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)

In [20]:
agent.invoke(
    "What did biden say about ketanji brown jackson in the state of the union address?"
)



> Entering new AgentExecutor chain...
I should use the State of Union QA System to answer this question.
Action: State of Union QA System
Action Input: "What did Biden say about Ketanji Brown Jackson in the state of the union address?"
Observation: I don't know.

> Finished chain.


{'input': 'What did biden say about ketanji brown jackson in the state of the union address?',
 'output': " I don't know."}

In [21]:
agent.invoke("Why use ruff over flake8?")



> Entering new AgentExecutor chain...
You should always think about the differences between ruff and flake8 before deciding which one to use.
Action: Ruff QA System
Action Input: "What are the differences between ruff and flake8?"
Observation: Ruff implements more rules than Flake8 and does not support custom lint rules. It also has some minor differences in its rule set and does not include all the 'opinionated' rules from flake8-bugbear. Additionally, Ruff is capable of automatically fixing its own lint violations, while Flake8 supports plugins and allows for custom and third-party rules.

> Finished chain.


{'input': 'Why use ruff over flake8?',
 'output': " Ruff implements more rules than Flake8 and does not support custom lint rules. It also has some minor differences in its rule set and does not include all the 'opinionated' rules from flake8-bugbear. Additionally, Ruff is capable of automatically fixing its own lint violations, while Flake8 supports plugins and allows for custom and third-party rules."}

## Multi-Hop vector store reasoning

Because vector stores are easily usable as tools in agents, it is easy to answer multi-hop questions that depend on vector stores using the existing agent framework.
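A multi-hop question is answered by feeding the result of one lookup into the next question. The sketch below illustrates that chaining with canned functions in place of retrieval chains; `multi_hop`, `first_tool`, `second_tool`, and the placeholder answer `TOOL_X` are all hypothetical. It also shows why the follow-up question should name the entity explicitly instead of relying on a pronoun.

```python
def multi_hop(first_question, follow_up_template, first_tool, second_tool):
    """Hop 1 yields an intermediate answer; hop 2 asks a self-contained question built from it."""
    intermediate = first_tool(first_question)
    follow_up = follow_up_template.format(answer=intermediate)
    return follow_up, second_tool(follow_up)


def first_tool(question):
    # Canned answer standing in for the first retrieval chain.
    return "TOOL_X"


def second_tool(question):
    # This toy tool can only answer when the question names the entity explicitly.
    return "Found a mention." if "TOOL_X" in question else "I don't know."


follow_up, result = multi_hop(
    "Which tool is used?",
    "Does the text mention {answer}?",
    first_tool,
    second_tool,
)
unresolved = second_tool("Does the text mention that tool?")
print(follow_up, "->", result)
print("Unresolved pronoun ->", unresolved)
```

Each tool only sees the input string it is given, so a question such as "that tool" carries no information about what was found in the previous hop.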

In [22]:
tools = [
    Tool(
        name="State of Union QA System",
        func=state_of_union.run,
        description="useful for when you need to answer questions about the most recent state of the union address. Input should be a fully formed question, not referencing any obscure pronouns from the conversation before.",
    ),
    Tool(
        name="Ruff QA System",
        func=ruff.run,
        description="useful for when you need to answer questions about ruff (a python linter). Input should be a fully formed question, not referencing any obscure pronouns from the conversation before.",
    ),
]

In [23]:
# Construct the agent. We will use the default agent type here.
# See documentation for a full list of options.
agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)

In [24]:
agent.invoke(
    "What tool does ruff use to run over Jupyter Notebooks? Did the president mention that tool in the state of the union?"
)



> Entering new AgentExecutor chain...
I should use the Ruff QA System to answer this question.
Action: Ruff QA System
Action Input: What tool does ruff use to run over Jupyter Notebooks?
Observation: Ruff does not support source.organizeImports and source.fixAll code actions in Jupyter Notebooks.
Thought: I should use the State of Union QA System to check if the president mentioned this tool.
Action: State of Union QA System
Action Input: Did the president mention that tool in the state of the union?
Observation: I don't know.
Thought: I should use the Ruff QA System again to see if it has any information on the president mentioning this tool.
Action: Ruff QA System
Action Input: Did the president mention that tool in the state of the union?
Observation: I don't know.
Thought: I should use the State of Union QA System again to check if the president mentioned

{'input': 'What tool does ruff use to run over Jupyter Notebooks? Did the president mention that tool in the state of the union?',
 'output': "I don't know."}

In [25]:
# new questions:
agent.invoke("What themes appear repeatedly in Shakespeare's sonnets?")






> Entering new AgentExecutor chain...
I should use the State of Union QA System to answer this question.
Action: State of Union QA System
Action Input: "What themes appear repeatedly in Shakespeare's sonnets?"
Observation: Love, time, beauty, mortality, and the power of language are all recurring themes in Shakespeare's sonnets.
Thought: I should also use the Ruff QA System to check the accuracy of this answer.
Action: Ruff QA System
Action Input: "Are love, time, beauty, mortality, and the power of language recurring themes in Shakespeare's sonnets?"
Observation: I don't know.
Thought: I should try rephrasing the question to get a better answer.
Action: State of Union QA System
Action Input: "What are the most common themes in Shakespeare's sonnets?"
Observation: The most common themes in Shakespeare's sonnets are love, beauty, time, and mortality. Other common themes includ

{'input': "What themes appear repeatedly in Shakespeare's sonnets?",
 'output': "The most common themes in Shakespeare's sonnets are love, beauty, time, and mortality. Other common themes include jealousy, betrayal, and the power of poetry."}

In [27]:
agent.invoke("Explain the emotional tone of Sonnet 18.")






> Entering new AgentExecutor chain...
I should use the State of Union QA System to answer this question, as it is more likely to have information about literature and poetry.
Action: State of Union QA System
Action Input: "What is the emotional tone of Sonnet 18?"
Observation: The emotional tone of Sonnet 18 is one of admiration and praise. The speaker compares the subject to a summer's day and expresses their belief that the subject is more lovely and eternal than the fleeting beauty of nature. The tone is also one of reassurance and comfort, as the speaker promises that the subject's beauty will never fade and will be immortalized in the poem.
Thought: This answer seems accurate and comprehensive.
Final Answer: The emotional tone of Sonnet 18 is one of admiration, praise, reassurance, and comfort.

> Finished chain.


{'input': 'Explain the emotional tone of Sonnet 18.',
 'output': 'The emotional tone of Sonnet 18 is one of admiration, praise, reassurance, and comfort.'}

In [28]:
agent.invoke("How does Ruff handle Jupyter Notebooks?")





> Entering new AgentExecutor chain...
Ruff is a python linter, so it may not have specific features for Jupyter Notebooks.
Action: Ruff QA System
Action Input: How does Ruff handle Jupyter Notebooks?
Observation: Ruff has limited support for Jupyter Notebooks. It can be used to lint and format code within cells, but it does not support all features such as code actions and imports. It's recommended to use the notebook prefixed code actions for the same such as notebook.source.organizeImports and notebook.source.fixAll respectively. Additionally, Ruff requires a full view of the notebook to provide accurate diagnostics and fixes, so it's important to have all cells visible when using Ruff with Jupyter Notebooks.
Thought: This is helpful information, but it doesn't fully answer the question.
Action: State of Union QA System
Action Input: How does Ruff handle Jupyter Notebooks?
Observation: I don't know.
Thou

{'input': 'How does Ruff handle Jupyter Notebooks?',
 'output': 'Ruff has limited support for Jupyter Notebooks and it is recommended to use notebook prefixed code actions for more accurate results. Additionally, Ruff requires a full view of the notebook to provide accurate diagnostics and fixes.'}

In [29]:


agent.invoke("Why is Ruff faster than flake8?")



> Entering new AgentExecutor chain...
You should always think about the differences between two things before making a comparison
Action: Ruff QA System
Action Input: Why is Ruff faster than flake8?
Observation: Ruff is faster than flake8 because it implements all of its rules natively, while flake8 relies on plugins for some of its functionality. This allows Ruff to run more efficiently and quickly.
Thought: Now that I know the reason for the speed difference, I can make a more informed decision on which linter to use.
Final Answer: Ruff is faster than flake8 because it implements all of its rules natively, making it more efficient and quicker to run.

> Finished chain.


{'input': 'Why is Ruff faster than flake8?',
 'output': 'Ruff is faster than flake8 because it implements all of its rules natively, making it more efficient and quicker to run.'}

In [30]:
agent.invoke(
    "Which tool does Ruff use to lint notebooks, and do Shakespeare's sonnets mention notebook-like writing or tools?"
)




> Entering new AgentExecutor chain...
Ruff is a python linter, so it probably uses python tools to lint notebooks.
Action: Ruff QA System
Action Input: "Which tool does Ruff use to lint notebooks?"
Observation: Ruff does not currently support linting notebooks.
Thought: Ruff is a python linter, so it probably uses python tools to lint notebooks.
Action: State of Union QA System
Action Input: "Do Shakespeare's sonnets mention notebook-like writing or tools?"
Observation: Yes, Shakespeare's sonnets mention notebook-like writing or tools in Sonnets LX, LXXVIII, LXVI, and LXXVII.
Thought: I now know the final answer
Final Answer: Yes, Shakespeare's sonnets mention notebook-like writing or tools.

> Finished chain.


{'input': "Which tool does Ruff use to lint notebooks, and do Shakespeare's sonnets mention notebook-like writing or tools?",
 'output': "Yes, Shakespeare's sonnets mention notebook-like writing or tools."}

In [31]:
agent.invoke("What does the poet say about time and aging?")






> Entering new AgentExecutor chain...
I should use the State of Union QA System to answer this question.
Action: State of Union QA System
Action Input: What does the poet say about time and aging?
Observation: The poet expresses a sense of sadness and inevitability about the passing of time and the effects of aging on beauty and youth. He also acknowledges the power of time to take away the things we love.
Thought: I should use the Ruff QA System to check the grammar and syntax of my answer.
Action: Ruff QA System
Action Input: The poet expresses a sense of sadness and inevitability about the passing of time and the effects of aging on beauty and youth. He also acknowledges the power of time to take away the things we love.
Observation: The poet reflects on the fleeting nature of beauty and youth, and the inevitability of time's effects on them. He also recognizes the power of time to take away the things we h

{'input': 'What does the poet say about time and aging?',
 'output': "The poet expresses a sense of sadness and inevitability about the passing of time and the effects of aging on beauty and youth. He also acknowledges the power of time to take away the things we love. The poet reflects on the fleeting nature of beauty and youth, and the inevitability of time's effects on them. He also recognizes the power of time to take away the things we hold dear."}

In [32]:
agent.invoke("What configuration options does Ruff have?")



> Entering new AgentExecutor chain...
Ruff is a python linter, so it likely has options related to code formatting and style.
Action: Ruff QA System
Action Input: What configuration options does Ruff have?
Observation: Ruff has several configuration options that can be set in a ruff.toml file. These include line-length, which sets the maximum line length for code, and convention, which sets the docstring style convention (such as Google or NumPy). Ruff also supports editor integration and has integrations with popular code editors like VS Code and Sublime Text. Additionally, Ruff has a variety of rules that can be enabled or disabled, and supports versioning and migrations from other linters like flake8.
Thought: This is helpful information, but I wonder if there are any specific options that are commonly used.
Action: Ruff QA System
Action Input: What are some commonly used configuration options for Ruff?
Observation:

{'input': 'What configuration options does Ruff have?',
 'output': 'Ruff has a variety of configuration options, including line-length and convention, and over 800 rules that can be enabled or disabled. Some commonly used options include settings for editor integration and rules, and Ruff also supports versioning and migrations from other linters.'}

In [33]:
from langchain_community.document_loaders import WebBaseLoader

loader = WebBaseLoader("https://docs.python.org/3/tutorial/")
python_docs = loader.load()
python_texts = text_splitter.split_documents(python_docs)
python_db = Chroma.from_documents(python_texts, embeddings, collection_name="python_docs")

python_qa = RetrievalQA.from_chain_type(
    llm=llm, chain_type="stuff", retriever=python_db.as_retriever()
)

# Add as a tool
tools.append(
    Tool(
        name="Python QA System",
        func=python_qa.run,
        description="Use when asking questions about Python documentation."
    )
)

# Update the agent
agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)

# Try the new tool
agent.invoke("How do you write a function in Python?")






> Entering new AgentExecutor chain...
You should always think about the syntax and structure of a function in Python.
Action: Python QA System
Action Input: "How do I write a function in Python?"
Observation: To write a function in Python, you can use the "def" keyword followed by the name of the function and parentheses containing any parameters. Then, use a colon to start a new indented block of code where you define the function's behavior. Finally, use the "return" keyword to specify the value that the function should output. For example:

def add_numbers(x, y):
    return x + y

This function is named "add_numbers" and takes in two parameters, x and y. It then returns the sum of these two numbers. You can call this function by using its name and passing in the desired arguments, like this:

result = add_numbers(3, 5)
print(result) # Output: 8

For more information on defining functions in Python, you can refer to the official Python tu

{'input': 'How do you write a function in Python?',
 'output': 'To write a function in Python, you can use the "def" keyword followed by the name of the function and parentheses containing any parameters. Then, use a colon to start a new indented block of code where you define the function\'s behavior. Finally, use the "return" keyword to specify the value that the function should output.'}

In [34]:
agent.invoke("Does Shakespeare mention Ruff or linters?")





> Entering new AgentExecutor chain...
Shakespeare lived long before the invention of linters, so it is unlikely that he would mention them.
Action: Ruff QA System
Action Input: "Does Shakespeare mention Ruff or linters?"
Observation: No, Shakespeare does not mention Ruff or linters.
Thought: This makes sense, as linters were not invented until much later.
Action: Python QA System
Action Input: "Does Shakespeare mention Python?"
Observation: No, Shakespeare does not mention Python.
Thought: This also makes sense, as Python was not invented until the late 20th century.
Action: State of Union QA System
Action Input: "Does Shakespeare mention the State of the Union?"
Observation: No, Shakespeare does not mention the State of the Union in any of these pieces of context.
Thought: This is expected, as the State of the Union address is a modern political traditio

{'input': 'Does Shakespeare mention Ruff or linters?',
 'output': 'No, Shakespeare does not mention Ruff or linters.'}

In [35]:
agent.invoke("Is Sonnet 130 affected by flake8 rules?")




> Entering new AgentExecutor chain...
I should check the Python documentation for information on Sonnet 130 and flake8 rules
Action: Python QA System
Action Input: "Is Sonnet 130 affected by flake8 rules?"
Observation: I don't know. This context is about the Python Tutorial and documentation, not about poetry or literature.
Thought: I should try asking the Ruff QA System instead
Action: Ruff QA System
Action Input: "Is Sonnet 130 affected by flake8 rules?"
Observation: I don't know.
Thought: I should try asking the State of Union QA System instead
Action: State of Union QA System
Action Input: "Is Sonnet 130 affected by flake8 rules?"
Observation: I don't know.
Thought: I should try asking a human expert for help
Final Answer: I don't know. It seems like this question is not related to any of the available tools.

> Finished chain.


{'input': 'Is Sonnet 130 affected by flake8 rules?',
 'output': "I don't know. It seems like this question is not related to any of the available tools."}

In [36]:

agent.invoke("Does Ruff describe love or beauty?")



> Entering new AgentExecutor chain...
Ruff is a python linter, so it is unlikely to have information about love or beauty.
Action: Ruff QA System
Action Input: Does Ruff describe love or beauty?
Observation: I don't know.
Thought: I should try another QA system.
Action: Python QA System
Action Input: Does Ruff describe love or beauty?
Observation: I don't know.
Thought: I should try another QA system.
Action: State of Union QA System
Action Input: Does Ruff describe love or beauty?
Observation: Ruff describes beauty.
Thought: I now know the final answer.
Final Answer: Ruff describes beauty.

> Finished chain.


{'input': 'Does Ruff describe love or beauty?',
 'output': 'Ruff describes beauty.'}