# Lab | Agent & Vector store

**Replace the State of the Union dataset with a different one and replicate this lab, updating the prompts accordingly.**

One such dataset is the [sonnets.txt](https://github.com/martin-gorner/tensorflow-rnn-shakespeare/blob/master/shakespeare/sonnets.txt) dataset or any other data of your choice from the same git.

# Combine agents and vector stores

This notebook covers how to combine agents and vector stores. The use case for this is that you've ingested your data into a vector store and want to interact with it in an agentic manner.

The recommended method is to create a retrieval QA chain (such as `RetrievalQA`, or the equivalent LCEL chain used in this notebook) and then use it as a tool in the overall agent. You can do this with several vector stores and use the agent to route between them. There are two ways to do this: let the agent use the vector stores as normal tools, or set `return_direct=True` so that the agent acts purely as a router.

## Create the vector store

In [1]:
import os
from dotenv import load_dotenv, find_dotenv

load_dotenv(find_dotenv())

# Ensure the key exists
OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]

In [2]:
import requests
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.document_loaders import TextLoader, WebBaseLoader
from langchain_community.vectorstores import Chroma
from langchain_text_splitters import RecursiveCharacterTextSplitter

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

USER_AGENT environment variable not set, consider setting it to identify your requests.


In [3]:
# LLM + embeddings
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
embeddings = OpenAIEmbeddings()

In [4]:
# Download sonnets.txt
url = "https://raw.githubusercontent.com/martin-gorner/tensorflow-rnn-shakespeare/master/shakespeare/sonnets.txt"
local_path = "sonnets.txt"
r = requests.get(url, timeout=30)
r.raise_for_status()
with open(local_path, "w", encoding="utf-8") as f:
    f.write(r.text)

# Load text file
loader = TextLoader(local_path, encoding="utf-8")
documents = loader.load()

In [5]:
# Split
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=150)
splits = text_splitter.split_documents(documents)

# Vector store + retriever
sonnets_db = Chroma.from_documents(
    documents=splits,
    embedding=embeddings,
    collection_name="sonnets"
)
sonnets_retriever = sonnets_db.as_retriever(search_kwargs={"k": 4})
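
To make the `chunk_size` and `chunk_overlap` settings concrete, here is a minimal sketch of overlapping character windows. It is a simplification, not the algorithm `RecursiveCharacterTextSplitter` uses (which also tries to break on separators such as paragraph breaks). The `chunk_text` helper and the sample string are hypothetical, introduced only for illustration.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Simplified illustration only: it does not look for paragraph or
    sentence boundaries the way a recursive splitter does.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = chunk_text(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
# ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Each chunk repeats the last three characters of the previous one, which is how overlap keeps a sentence that straddles a boundary retrievable from either side.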

In [6]:
# LCEL chain
sonnets_prompt = ChatPromptTemplate.from_messages([
    ("system", "You answer questions using ONLY the provided context from Shakespeare's sonnets. "
               "If the answer isn't in the context, say you don't know."),
    ("human", "Question: {question}\n\nContext:\n{context}")
])

def format_docs(docs):
    return "\n\n".join(d.page_content for d in docs)

sonnets_chain = (
    {
        "context": sonnets_retriever | format_docs,
        "question": RunnablePassthrough(),
    }
    | sonnets_prompt
    | llm
    | StrOutputParser()
)

sonnets_chain.invoke("What themes about time appear in these sonnets?")

"The themes about time in these sonnets include the inevitability of aging and decay, the transient nature of beauty, and the struggle against time's destructive force. Time is depicted as a relentless force that causes beauty and life to fade, as seen in the imagery of night overtaking day, flowers past their prime, and barren trees. The sonnets reflect on how everything is subject to time's scythe, which ultimately takes away loved ones and beauty. However, there is also a sense of hope, as the speaker suggests that through poetry, love and beauty can be preserved and celebrated despite time's cruel hand."

In [7]:
# Load Ruff FAQ webpage
web_loader = WebBaseLoader(
    web_paths=["https://beta.ruff.rs/docs/faq/"]
)
ruff_docs = web_loader.load()

# Split using same splitter
ruff_splits = text_splitter.split_documents(ruff_docs)

# Vector store + retriever
ruff_db = Chroma.from_documents(
    documents=ruff_splits,
    embedding=embeddings,
    collection_name="ruff"
)
ruff_retriever = ruff_db.as_retriever(search_kwargs={"k": 4})
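
As a rough mental model of what a retriever configured with `search_kwargs={"k": 4}` does, the sketch below ranks chunks by similarity to the query and keeps the top `k`. It assumes a toy bag-of-words vector and cosine similarity; the real store embeds text with `OpenAIEmbeddings` and searches inside Chroma, so `embed`, `cosine`, and `top_k` are hypothetical names, not library functions.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (hypothetical stand-in)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: str, chunks: list[str], k: int) -> list[str]:
    """Return the k chunks most similar to the query, best first."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


toy_chunks = [
    "ruff can be configured in pyproject.toml",
    "time devours beauty and youth",
    "ruff is a fast python linter",
]
result = top_k("how is ruff configured in pyproject.toml", toy_chunks, k=2)
print(result)
```

Only the `k` best-matching chunks are passed on as context, so a larger `k` trades a longer prompt for a better chance of including the relevant passage.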

In [8]:
# LCEL chain
ruff_prompt = ChatPromptTemplate.from_messages([
    ("system", "You answer questions using ONLY the provided context from the Ruff FAQ. "
               "If the answer isn't in the context, say you don't know."),
    ("human", "Question: {question}\n\nContext:\n{context}")
])

ruff_chain = (
    {
        "context": ruff_retriever | format_docs,
        "question": RunnablePassthrough(),
    }
    | ruff_prompt
    | llm
    | StrOutputParser()
)

ruff_chain.invoke("Does Ruff support pyproject.toml configuration?")

'Yes, Ruff supports pyproject.toml configuration, but if you prefer not to use it, you can use a ruff.toml file for configuration instead. The two files are functionally equivalent.'
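
Both chains above share the same shape: the incoming question is fed to two branches (retriever plus `format_docs`, and a passthrough), and the resulting dict fills the prompt template. The plain-Python sketch below mimics that data flow without any LLM call. `FakeDoc`, `fake_retriever`, and `build_prompt_inputs` are hypothetical stand-ins, not LangChain objects.

```python
from dataclasses import dataclass


@dataclass
class FakeDoc:
    """Hypothetical stand-in for a document with a page_content field."""
    page_content: str


def format_docs(docs):
    # Same helper as in the notebook: join chunk texts with blank lines.
    return "\n\n".join(d.page_content for d in docs)


def fake_retriever(question: str) -> list[FakeDoc]:
    # Hypothetical retriever that ignores the question and returns fixed chunks.
    return [FakeDoc("Chunk one."), FakeDoc("Chunk two.")]


def build_prompt_inputs(question: str) -> dict:
    # Mirrors the dict at the head of the chain: one input feeds both keys.
    return {
        "context": format_docs(fake_retriever(question)),
        "question": question,  # the role RunnablePassthrough() plays
    }


inputs = build_prompt_inputs("What themes about time appear?")
prompt_text = "Question: {question}\n\nContext:\n{context}".format(**inputs)
print(prompt_text)
```

In the real chain the formatted prompt then goes to the model and `StrOutputParser()` extracts the reply as a plain string.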

## Create the Agent

In [9]:
# Import things that are needed generically
from langchain.agents import create_agent
from langchain.tools import tool
from langchain.messages import AIMessage, HumanMessage

In [10]:
model = ChatOpenAI(model="gpt-4o-mini", temperature=0)

@tool
def sonnets_qa(question: str) -> str:
    """Answer questions about Shakespeare's sonnets."""
    return sonnets_chain.invoke(question)

@tool
def ruff_qa(question: str) -> str:
    """Answer questions about Ruff, the Python linter."""
    return ruff_chain.invoke(question)

agent = create_agent(model, tools=[sonnets_qa, ruff_qa])

In [11]:
def final_text(result):
    """Extract the final assistant text from a LangGraph-style agent result."""
    # Most commonly result is a dict with 'messages'
    if isinstance(result, dict) and "messages" in result:
        msgs = result["messages"]
        # walk backwards to find the last AIMessage with content
        for m in reversed(msgs):
            if isinstance(m, AIMessage) and getattr(m, "content", None):
                return m.content
        # fallback: last message content if it exists
        if msgs and hasattr(msgs[-1], "content"):
            return msgs[-1].content

    # If it's already a string (some chains), just return it
    if isinstance(result, str):
        return result

    # Last resort
    return str(result)

def ask(agent, q: str) -> str:
    res = agent.invoke({"messages": [HumanMessage(q)]})
    return final_text(res)

In [12]:
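# Note: neither vector store (sonnets, Ruff FAQ) covers this topic,
# so the answer cannot be grounded in retrieved context.
ask(agent, "What did biden say about ketanji brown jackson in the state of the union address?")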

"In his State of the Union address, President Joe Biden praised Ketanji Brown Jackson, highlighting her historic confirmation as the first Black woman to serve on the U.S. Supreme Court. He emphasized her qualifications, experience, and the importance of her appointment in representing diversity on the Court. Biden's remarks were part of a broader discussion on the significance of judicial appointments and the need for a judiciary that reflects the nation's diversity."

In [13]:
ask(agent, "Why use Ruff over Flake8?")

"Ruff can be used as a drop-in replacement for Flake8, especially when used without or with a small number of plugins, and is designed to work well alongside Black on Python 3 code. Here are some key reasons to consider using Ruff over Flake8:\n\n1. **Different Rule Set**: Ruff implements a different set of rules and uses different rule codes and prefixes than Flake8 plugins, which helps minimize conflicts.\n\n2. **Comprehensive Coverage**: Ruff implements over 800 rules, making it a more comprehensive tool compared to Flake8's total rules.\n\n3. **Performance**: Ruff is designed to be faster than Flake8, which can be beneficial for larger codebases.\n\n4. **Simplicity**: It can serve as a simpler solution when you don't need custom lint rules or the full set of opinionated rules from Flake8 plugins like flake8-bugbear.\n\nHowever, it's worth noting that Ruff does not support custom lint rules and may not include all the specific rules that some Flake8 plugins provide."

## Use the Agent solely as a router

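You can also set `return_direct=True` on a tool (with the `@tool` decorator, `@tool(return_direct=True)`) if you intend to use the agent as a router and just want to return the result of the retrieval chain directly.

Notice that in the examples above the agent did some extra work after querying the retrieval chain. You can avoid that and return the result directly. The cells below do not set `return_direct=True`; they approximate router behavior by instructing the model to return the tool output verbatim, which is a softer guarantee.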
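
The difference between the two modes can be sketched in plain Python. This is a toy model of the control flow, not LangChain's implementation: `run_agent`, `pick`, and `summarize` are hypothetical, and the keyword check stands in for the LLM's tool choice.

```python
from typing import Callable

# Hypothetical tool type: maps a question to an answer string.
Tool = Callable[[str], str]


def run_agent(question: str, pick_tool: Callable[[str], Tool],
              summarize: Callable[[str], str], return_direct: bool) -> str:
    """Make one tool call, then either return its output or post-process it."""
    tool_output = pick_tool(question)(question)
    if return_direct:
        return tool_output         # router mode: tool output is the final answer
    return summarize(tool_output)  # normal mode: the model rewrites the output


def sonnets_tool(q: str) -> str:
    return "sonnets answer"


def ruff_tool(q: str) -> str:
    return "ruff answer"


def pick(q: str) -> Tool:
    # Stand-in for the LLM's tool choice: a simple keyword check.
    return ruff_tool if "ruff" in q.lower() else sonnets_tool


def summarize(text: str) -> str:
    # Stand-in for the extra LLM pass after a tool call.
    return f"Summary: {text}"


direct = run_agent("Why use Ruff?", pick, summarize, return_direct=True)
rewritten = run_agent("Why use Ruff?", pick, summarize, return_direct=False)
print(direct)     # ruff answer
print(rewritten)  # Summary: ruff answer
```

In router mode the answer is exactly what the retrieval chain produced, which saves one model call and avoids paraphrasing.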

In [14]:
router_model = ChatOpenAI(model="gpt-4o-mini", temperature=0)

@tool
def sonnets_qa(question: str) -> str:
    """Answer questions about Shakespeare's sonnets."""
    return sonnets_chain.invoke(question)

@tool
def ruff_qa(question: str) -> str:
    """Answer questions about Ruff, the Python linter."""
    return ruff_chain.invoke(question)

In [15]:
# Create agent (router behavior comes from instruction + tool outputs)
router_agent = create_agent(router_model, tools=[sonnets_qa, ruff_qa])

def route(q: str):
    # force router behavior in-message
    msg = (
        "You are a router. Choose the single best tool and call it once. "
        "Return EXACTLY the tool output with no extra commentary.\n\n"
        f"Question: {q}"
    )
    res = router_agent.invoke({"messages": [HumanMessage(msg)]})
    return res["messages"][-1].content

In [16]:
route("Why use ruff over flake8?")

"Ruff can be used as a drop-in replacement for Flake8 when used without or with a small number of plugins, alongside Black, and on Python 3 code. It implements every rule in Flake8 under those conditions and re-implements some popular Flake8 plugins natively. Additionally, Ruff has a larger rule set, implementing over 800 rules compared to Flake8's ~409 total rules. However, Ruff does not support custom lint rules and does not include all the 'opinionated' rules from flake8-bugbear."

In [17]:
route("What themes about time appear in the sonnets?")

"The themes about time that appear in the sonnets include the inevitability of decay and the passage of time leading to loss. Time is depicted as a force that causes beauty and youth to fade, as seen in the imagery of nature's cycles, such as the day turning to night and plants that hold perfection only for a moment. The sonnets reflect on how everything grows and then diminishes, emphasizing the transient nature of life. Additionally, there is a sense of urgency to preserve beauty and love against the relentless advance of time, suggesting that procreation or creating a legacy is a way to defy time's scythe. Overall, time is portrayed as both a destructive force and a motivator for the speaker to immortalize beauty through verse."

In [18]:
route("What did biden say about ketanji brown jackson in the state of the union address?")

"I don't know."

## Multi-Hop vector store reasoning

Because vector stores are easily usable as tools in agents, it is easy to answer multi-hop questions that depend on several vector stores using the existing agent framework.
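
A multi-hop question is one where a later lookup depends on an earlier answer. The sketch below shows that dependency with two canned lookups. It is only an illustration of the idea: `ruff_lookup`, `sonnets_lookup`, `two_hop`, and the toy sonnet text are hypothetical, and a real agent decides the hops itself rather than following a fixed template.

```python
def ruff_lookup(question: str) -> str:
    # Hypothetical canned answer standing in for the Ruff FAQ chain.
    return "nbQA"


def sonnets_lookup(question: str) -> str:
    # Hypothetical stand-in for the sonnets chain: substring check on toy text.
    toy_sonnet_text = "when i do count the clock that tells the time"
    term = question.rsplit(" ", 1)[-1].strip("?").lower()
    return "yes" if term in toy_sonnet_text else "no"


def two_hop(first_question: str, second_template: str) -> tuple[str, str]:
    """Run hop 1, then build hop 2's question from hop 1's answer."""
    first_answer = ruff_lookup(first_question)
    second_question = second_template.format(answer=first_answer)
    second_answer = sonnets_lookup(second_question)
    return first_answer, second_answer


hops = two_hop(
    "What tool does Ruff use to run over Jupyter Notebooks?",
    "Do the sonnets mention {answer}?",
)
print(hops)  # ('nbQA', 'no')
```

The second question cannot be formed until the first answer is known, which is why the agent has to call tools in sequence.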

In [19]:
model = ChatOpenAI(model="gpt-4o-mini", temperature=0)

@tool
def sonnets_qa(question: str) -> str:
    """Answer questions about Shakespeare's sonnets. Use for literary themes, lines, or context from the sonnets."""
    return sonnets_chain.invoke(question)

@tool
def ruff_qa(question: str) -> str:
    """Answer questions about Ruff, the Python linter. Use for configuration, features, and behavior from the Ruff FAQ."""
    return ruff_chain.invoke(question)

In [20]:
multi_hop_agent = create_agent(model, tools=[sonnets_qa, ruff_qa])

def ask_multi(q: str):
    res = multi_hop_agent.invoke({"messages": [HumanMessage(q)]})
    return res["messages"][-1].content

In [21]:
ask_multi(
    "What tool does Ruff use to run over Jupyter Notebooks? "
    "Do the sonnets mention anything related to that tool or notebooks? "
    "Did the president mention that tool in the state of the union?"
)

"Ruff uses **nbQA** to run over Jupyter Notebooks. You can run Ruff over a notebook using the command: `$ nbqa ruff Untitled.ipynb`.\n\nAs for Shakespeare's sonnets, they do not mention anything related to tools or notebooks.\n\nRegarding whether the president mentioned Ruff in the state of the union, I don't have that information."