# Lab | Agent & Vector store

**Replace the State of the Union dataset with a different one and replicate this lab, updating the prompts accordingly.**

One option is [sonnets.txt](https://github.com/martin-gorner/tensorflow-rnn-shakespeare/blob/master/shakespeare/sonnets.txt); any other dataset of your choice from the same repository also works.

# Combine agents and vector stores

This notebook covers how to combine agents and vector stores. The use case for this is that you've ingested your data into a vector store and want to interact with it in an agentic manner.

The recommended approach is to create a `RetrievalQA` chain and then use it as a tool in the overall agent, as shown below. You can do this with several vector stores and let the agent route between them. There are two ways to set this up: let the agent use the vector stores as normal tools, or set `return_direct=True` so that the agent acts purely as a router.

## Create the vector store

In [41]:
!pip install chromadb langchain langchain_community langchain_openai



In [42]:
from langchain.chains import RetrievalQA
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAI, OpenAIEmbeddings
from langchain_text_splitters import CharacterTextSplitter
from langchain_community.document_loaders import TextLoader

In [None]:
import os
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv())

OPENAI_API_KEY  = os.getenv('OPENAI_API_KEY')

In [43]:
# If you're using colab, run this
#os.environ['OPENAI_API_KEY'] = "YOUR_OPENAI_API_KEY"
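
If the key is missing, the problem usually only surfaces later, when the first OpenAI call fails. A minimal sketch of a fail-fast check, assuming a hypothetical helper named `require_env` (it is not part of `dotenv` or LangChain):

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.getenv(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or export it first."
        )
    return value
```

Calling `require_env("OPENAI_API_KEY")` right after `load_dotenv` would stop the notebook early with a readable message instead of an authentication error inside a chain.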



In [44]:
llm = OpenAI(temperature=0)

In [45]:
from pathlib import Path

relevant_parts = []
for p in Path(".").absolute().parts:
    relevant_parts.append(p)
    if relevant_parts[-3:] == ["langchain", "docs", "modules"]:
        break
doc_path = str(Path(*relevant_parts) / "sonnets.txt")

In [46]:
loader = TextLoader(doc_path)
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_documents(documents)

embeddings = OpenAIEmbeddings()
persist_directory = "chroma_db"

docsearch = Chroma.from_documents(
    texts,
    embedding=embeddings,
    collection_name="sonnets_of_shakespeare",
    persist_directory=persist_directory,  # Persists the collection to disk in this directory
)
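
To build intuition for what `chunk_size=1000` does in the cell above, here is a simplified, dependency-free sketch of separator-based chunking. It is an approximation for illustration only, not the actual `CharacterTextSplitter` implementation: the function name `split_text` is hypothetical, and overlap handling is left out because the lab uses `chunk_overlap=0`.

```python
def split_text(text: str, chunk_size: int = 1000, separator: str = "\n\n") -> list[str]:
    """Greedily merge separator-delimited pieces into chunks of at most chunk_size characters."""
    chunks: list[str] = []
    current = ""
    for piece in text.split(separator):
        if not piece.strip():
            continue  # skip empty pieces produced by repeated separators
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size or not current:
            # Either the piece still fits, or it is a single oversized piece
            # that has to become a chunk on its own.
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks
```

A piece longer than `chunk_size` is kept whole in this sketch, so chunks can exceed the limit when the text has few separators.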

In [47]:
sonnets_of_shakespeare = RetrievalQA.from_chain_type(
    llm=llm, chain_type="stuff", retriever=docsearch.as_retriever()
)
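
With `chain_type="stuff"`, the retrieved chunks are placed together in a single prompt for the LLM. The sketch below mimics that flow without any external services. Everything in it is an assumption for illustration: the word-overlap `score` stands in for embedding similarity, and the prompt wording and the value of `k` are made up rather than taken from LangChain.

```python
def score(query: str, chunk: str) -> int:
    """Toy relevance score: number of distinct query words found in the chunk."""
    chunk_words = set(chunk.lower().split())
    return sum(1 for word in set(query.lower().split()) if word in chunk_words)


def retrieve(query: str, chunks: list[str], k: int = 4) -> list[str]:
    """Return the k highest-scoring chunks (stand-in for a vector similarity search)."""
    return sorted(chunks, key=lambda chunk: score(query, chunk), reverse=True)[:k]


def build_stuff_prompt(query: str, chunks: list[str], k: int = 4) -> str:
    """Concatenate ("stuff") the retrieved chunks into one prompt."""
    context = "\n\n".join(retrieve(query, chunks, k))
    return (
        "Use the context to answer the question.\n\n"
        f"{context}\n\nQuestion: {query}\nAnswer:"
    )
```

Because all retrieved chunks share one prompt, the chunk size and the number of retrieved chunks together determine how much of the model's context window each question consumes.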


In [48]:
from langchain_community.document_loaders import WebBaseLoader

In [49]:
loader = WebBaseLoader("https://beta.ruff.rs/docs/faq/")

In [51]:
docs = loader.load()
ruff_texts = text_splitter.split_documents(docs)
ruff_db = Chroma.from_documents(ruff_texts, embeddings, collection_name="ruff", persist_directory="chroma_db")
ruff = RetrievalQA.from_chain_type(
    llm=llm, chain_type="stuff", retriever=ruff_db.as_retriever()
)



## Create the Agent

In [52]:
# Import things that are needed generically
from langchain.agents import AgentType, Tool, initialize_agent
from langchain_openai import OpenAI

In [54]:

tools = [
    Tool(
        name="Shakespeare Sonnets Expert",
        func=sonnets_of_shakespeare.run,
        description=(
            "Use this tool to answer questions about Shakespeare's Sonnets. "
            "Best for literary analysis, themes, line meanings, and sonnet structure. "
            "Input should be a complete natural language question."
        ),
    ),
    Tool(
        name="Ruff Linter Assistant",
        func=ruff.run,
        description=(
            "Use this tool to answer technical questions about Ruff, the Python linter. "
            "Helpful for linting rules, configuration, and usage examples. "
            "Input should be a well-formed technical question."
        ),
    ),
]


In [55]:
# Construct the agent. We will use the default agent type here.
# See documentation for a full list of options.
agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)

In [56]:
agent.invoke(
    "What is the main theme of Sonnet 18?"
)

> Entering new AgentExecutor chain...
I should use the Shakespeare Sonnets Expert tool to analyze the theme of Sonnet 18.
Action: Shakespeare Sonnets Expert
Action Input: "What is the main theme of Sonnet 18?"
Observation: The main theme of Sonnet 18 is the idea of eternal beauty and love, and how it can be preserved through the power of poetry.
Thought: I should also consider the use of imagery and language in the sonnet to further understand the theme.
Action: Shakespeare Sonnets Expert
Action Input: "How does Shakespeare use imagery and language to convey the theme of eternal beauty in Sonnet 18?"
Observation: In Sonnet 18, Shakespeare uses imagery and language to compare the fleeting nature of summer to the eternal beauty of the subject of the poem. He describes how summer is temporary and can be marred by rough winds and hot sun, but the subject's beauty will never fade or be overshadowed by death. He use

{'input': 'What is the main theme of Sonnet 18?',
 'output': 'The main theme of Sonnet 18 is the idea of eternal beauty and love, conveyed through the use of imagery and language.'}
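
The trace above alternates between `Action`, `Action Input`, and `Observation` lines. As a rough illustration of how an executor might turn the LLM's text into a tool call, here is a minimal parsing sketch. The function `parse_step` and its return format are hypothetical; LangChain's own output parser is more thorough.

```python
import re

ACTION_PATTERN = re.compile(
    r"Action\s*:\s*(?P<tool>.*?)\s*\n\s*Action\s*Input\s*:\s*(?P<tool_input>.*)",
    re.DOTALL,
)


def parse_step(llm_text: str) -> dict:
    """Classify one LLM reply as either a final answer or a tool call."""
    if "Final Answer:" in llm_text:
        return {"type": "finish", "output": llm_text.split("Final Answer:")[-1].strip()}
    match = ACTION_PATTERN.search(llm_text)
    if match is None:
        raise ValueError(f"Could not parse LLM output: {llm_text!r}")
    return {
        "type": "action",
        "tool": match.group("tool").strip(),
        "tool_input": match.group("tool_input").strip().strip('"'),
    }
```

The tool name must match one of the registered `Tool` names exactly, which is one reason short, distinctive names and clear descriptions help the agent route correctly.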

In [57]:
agent.invoke("Why use ruff over flake8?")

> Entering new AgentExecutor chain...
You should consider the differences between ruff and flake8 and how they may affect your code.
Action: Ruff Linter Assistant
Action Input: "What are the main differences between ruff and flake8?"
Observation: Ruff implements more rules than Flake8 and does not support custom lint rules. It also has some minor differences in its rule set and does not include all of the 'opinionated' rules from flake8-bugbear. Additionally, Ruff is capable of automatically fixing its own lint violations and does not support custom or third-party rules.
Thought: I now know the main differences between ruff and flake8.
Final Answer: The main differences between ruff and flake8 are that ruff has a larger rule set, does not support custom lint rules, and has some minor differences in its rule set. Ruff also has the ability to automatically fix its own lint violations and does not support custom or third-party rul

{'input': 'Why use ruff over flake8?',
 'output': 'The main differences between ruff and flake8 are that ruff has a larger rule set, does not support custom lint rules, and has some minor differences in its rule set. Ruff also has the ability to automatically fix its own lint violations and does not support custom or third-party rules.'}

## Use the Agent solely as a router

You can also set `return_direct=True` if you intend to use the agent as a router and just want to return the result of the `RetrievalQA` chain directly.

In the examples above, the agent did some extra work after querying the `RetrievalQA` chain. Setting `return_direct=True` skips that step and returns the tool's result as the final answer.
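
The difference between the two modes can be sketched in a few lines of plain Python. `SimpleTool` and `run_step` are hypothetical stand-ins for illustration, not LangChain classes, and `summarize` represents whatever further reasoning the LLM would do with an observation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SimpleTool:
    """Hypothetical stand-in for an agent tool."""
    name: str
    func: Callable[[str], str]
    return_direct: bool = False


def run_step(tool: SimpleTool, tool_input: str, summarize: Callable[[str], str]) -> str:
    """Run one tool call; skip the follow-up LLM pass when return_direct is set."""
    observation = tool.func(tool_input)
    if tool.return_direct:
        return observation  # router mode: the tool output is the final answer
    return summarize(observation)  # normal mode: the LLM keeps reasoning
```

In router mode the agent makes only the routing decision, so the answer quality depends entirely on the selected tool.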

In [58]:
tools = [
    Tool(
        name="Shakespeare Sonnets QA",
        func=sonnets_of_shakespeare.run,
        description=(
            "Use this tool to answer literary questions about Shakespeare's Sonnets. "
            "Ask about themes, famous lines, or interpretations of specific sonnets."
        ),
        return_direct=True,
    ),
    Tool(
        name="Ruff Python Linter QA",
        func=ruff.run,
        description=(
            "Use this tool to answer technical questions about Ruff, the Python linter. "
            "Ask about configuration, supported rules, or how to ignore specific checks."
        ),
        return_direct=True,
    ),
]


In [59]:
agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)

In [60]:
agent.invoke(
    "How does Shakespeare explore the concept of time in his sonnets?"
)

> Entering new AgentExecutor chain...
I should use the Shakespeare Sonnets QA tool to answer this question.
Action: Shakespeare Sonnets QA
Action Input: "time"
Observation: Time is a recurring theme in these sonnets, as the speaker reflects on the passing of time and its effects on beauty and youth. In Sonnet XII, the speaker questions the inevitability of time and its ability to destroy beauty and life. In Sonnet XIII, the speaker compares time to the waves that constantly move towards the shore, symbolizing the unstoppable force of time. In Sonnet LXI, the speaker acknowledges the destructive power of time, but also expresses hope that his words will stand the test of time and continue to praise the worth of the subject.

> Finished chain.


{'input': 'How does Shakespeare explore the concept of time in his sonnets?',
 'output': ' Time is a recurring theme in these sonnets, as the speaker reflects on the passing of time and its effects on beauty and youth. In Sonnet XII, the speaker questions the inevitability of time and its ability to destroy beauty and life. In Sonnet XIII, the speaker compares time to the waves that constantly move towards the shore, symbolizing the unstoppable force of time. In Sonnet LXI, the speaker acknowledges the destructive power of time, but also expresses hope that his words will stand the test of time and continue to praise the worth of the subject.'}

In [61]:
agent.invoke("Why use ruff over flake8?")

> Entering new AgentExecutor chain...
I am not familiar with either tool, so I should use the Ruff Python Linter QA tool to answer this question.
Action: Ruff Python Linter QA
Action Input: "What are the advantages of using Ruff over Flake8?"
Observation: Ruff implements more rules than Flake8, has better compatibility with Black, and can automatically fix its own lint violations. It also supports more plugins and is capable of detecting more errors.

> Finished chain.


{'input': 'Why use ruff over flake8?',
 'output': ' Ruff implements more rules than Flake8, has better compatibility with Black, and can automatically fix its own lint violations. It also supports more plugins and is capable of detecting more errors.'}

## Multi-Hop vector store reasoning

Because vector stores are easily usable as tools in agents, the existing agent framework makes it easy to answer multi-hop questions that depend on one or more vector store lookups.
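
A multi-hop question is one where a later lookup depends on an earlier observation. The loop below is a minimal sketch of that idea under stated assumptions: `run_agent` is hypothetical, and the `plan` callback stands in for the LLM deciding the next tool call from the history so far.

```python
from typing import Callable, Optional

Step = tuple[str, str, str]  # (tool name, tool input, observation)


def run_agent(
    question: str,
    tools: dict[str, Callable[[str], str]],
    plan: Callable[[str, list[Step]], Optional[tuple[str, str]]],
    max_iterations: int = 5,
) -> list[Step]:
    """Call tools until `plan` returns None; return the accumulated history."""
    scratchpad: list[Step] = []
    for _ in range(max_iterations):
        next_call = plan(question, scratchpad)
        if next_call is None:
            break  # the planner has enough information to answer
        tool_name, tool_input = next_call
        observation = tools[tool_name](tool_input)
        scratchpad.append((tool_name, tool_input, observation))
    return scratchpad
```

The `max_iterations` cap keeps a planner that never finishes from looping forever; each hop also adds an LLM call, so multi-hop questions cost more than routed ones.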

In [62]:
tools = [
    Tool(
        name="Sonnets of Shakespeare QA System",
        func=sonnets_of_shakespeare.run,
        description=(
            "Useful for answering questions about Shakespeare's Sonnets. "
            "Ask fully-formed questions related to the themes, structure, lines, or meaning of specific sonnets—"
            "avoid vague or unclear references."
        ),
    ),
    Tool(
        name="Ruff Python Linter QA",
        func=ruff.run,
        description=(
            "Useful for answering technical questions about Ruff, the Python linter. "
            "Ask complete, well-formed questions directly related to Ruff’s functionality, "
            "rules, or integration in Python workflows—do not rely on previous context."
        ),
    ),
]


In [63]:
# Construct the agent. We will use the default agent type here.
# See documentation for a full list of options.
agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)

In [64]:
agent.invoke(
    "Which sonnet talks about the immortality of poetry?"
)

> Entering new AgentExecutor chain...
I should use the Sonnets of Shakespeare QA System to find the sonnet that talks about the immortality of poetry.
Action: Sonnets of Shakespeare QA System
Action Input: "immortality of poetry"
Observation: The context suggests that the speaker believes that poetry has the power to outlive physical monuments and even war, and that it will continue to be praised and remembered by future generations.
Thought: I should read the sonnets that mention poetry and see which one talks about its immortality.
Action: Sonnets of Shakespeare QA System
Action Input: "poetry"
Observation: The context provided is a series of sonnets written by William Shakespeare. The sonnets are a form of poetry, specifically love poetry, in which the speaker expresses his love and admiration for a subject, often a young man or a woman. The sonnets also explore themes of beauty, time, and the power of poet

{'input': 'Which sonnet talks about the immortality of poetry?',
 'output': 'Sonnet 18, also known as "Shall I compare thee to a summer\'s day?", talks about the immortality of poetry.'}