# Agentic RAG: A Minimal Implementation

This notebook demonstrates an **Agentic RAG** system. Traditional RAG always retrieves first and then generates. An Agentic RAG system can instead:
1.  **Decide** whether to retrieve information or not.
2.  **Refine** its search queries based on initial findings.
3.  **Synthesize** information from multiple steps.

We will use `LangChain` and `LangGraph` for this demonstration.
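
Before bringing in any library, the decide/refine control flow described above can be sketched in plain Python. This is only an illustration under loud assumptions: `keyword_retrieve`, `needs_retrieval`, and `agentic_retrieve` are hypothetical names, the retriever is a toy keyword matcher, and the decision policy is a hard-coded rule where a real agent would ask an LLM. The synthesis step is left out because it needs a model.

```python
from typing import Callable

# Toy corpus; a stand-in for the vector store built later in the notebook.
CORPUS = [
    "Nostra is a prediction market platform running on Arbitrum.",
    "The native token of Nostra is NST, used for governance and fee rebates.",
]


def tokens(text: str) -> set[str]:
    """Lower-cased words with surrounding punctuation stripped."""
    return {w.strip("?.,!").lower() for w in text.split()}


def keyword_retrieve(query: str) -> list[str]:
    """Return corpus entries sharing a word longer than three letters with the query."""
    wanted = {t for t in tokens(query) if len(t) > 3}
    return [doc for doc in CORPUS if wanted & tokens(doc)]


def needs_retrieval(query: str) -> bool:
    """Hypothetical policy: only questions mentioning Nostra or NST need the corpus."""
    return bool(tokens(query) & {"nostra", "nst"})


def agentic_retrieve(
    query: str, retrieve: Callable[[str], list[str]], max_steps: int = 2
) -> dict:
    """Decide whether to retrieve, then retry with a refined query if nothing is found."""
    if not needs_retrieval(query):
        return {"retrieval_steps": 0, "context": []}
    context: list[str] = []
    current_query = query
    steps = 0
    while steps < max_steps and not context:
        steps += 1
        context = retrieve(current_query)
        # Crude refinement (assumption): fall back to a generic query next time.
        current_query = "Nostra overview"
    return {"retrieval_steps": steps, "context": context}


skipped = agentic_retrieve("What is 2 + 2?", keyword_retrieve)
direct = agentic_retrieve("What is the NST token?", keyword_retrieve)
refined = agentic_retrieve("Tell me NST", keyword_retrieve)
```

Here `skipped` never touches the corpus, `direct` finds a match on the first attempt, and `refined` needs a second attempt with the fallback query.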

In [81]:
# Install necessary packages
%pip install -q -U langchain langchain-openai langchain-community chromadb langgraph python-dotenv




In [82]:
import os

from dotenv import load_dotenv

# Load .env if present and expose helper for mandatory keys
load_dotenv()


def ensure_env_var(key: str) -> str:
    value = os.environ.get(key)
    if value:
        return value
    raise EnvironmentError(f"Missing required environment variable: {key}")


os.environ["OPENAI_API_KEY"] = ensure_env_var("OPENAI_API_KEY")


## 1. Setup Vector Store (The "Knowledge Base")
We'll build a small in-memory Chroma vector store, embedded with OpenAI embeddings, from a few dummy documents about a fictional company, "Nostra".
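
To make the idea of a vector store concrete, here is a toy version in plain Python. It is a sketch only: `embed`, `cosine`, and `ToyVectorStore` are hypothetical names, and the "embedding" is a bag-of-words count rather than the dense vectors a real embedding model produces. What it has in common with a real store is the ranking of documents by cosine similarity to the query.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lower-cased word counts with punctuation stripped."""
    return Counter(w.strip(".,?()\"").lower() for w in text.split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Stores (text, vector) pairs and ranks them against a query vector."""

    def __init__(self, texts: list[str]):
        self._items = [(text, embed(text)) for text in texts]

    def search(self, query: str, k: int = 1) -> list[str]:
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


store = ToyVectorStore([
    "Nostra is a prediction market platform running on Arbitrum.",
    "Nostra allows users to trade on future events like Sports and Politics.",
    "The native token of Nostra is NST, used for governance and fee rebates.",
    "Nostra uses a CTF (Conditional Token Framework) for its market resolution.",
])
top = store.search("What token does Nostra use for governance?", k=1)
```

Because word counts only match exact tokens, this toy store misses paraphrases that a learned embedding would catch, which is why the notebook uses a real embedding model.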

In [83]:
from dataclasses import dataclass, field
from typing import Iterable, Mapping, Sequence

from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_core.documents import Document


@dataclass
class CorpusConfig:
    collection_name: str = "nostra_docs"
    default_metadata: Mapping[str, str] = field(default_factory=dict)


def _normalize_docs(raw_docs: Iterable[Mapping[str, str]], default_meta: Mapping[str, str]) -> Sequence[Document]:
    docs: list[Document] = []
    for item in raw_docs:
        content = item.get("page_content") or item.get("content")
        if not content:
            continue
        metadata = {**default_meta, **item.get("metadata", {})}
        docs.append(Document(page_content=content, metadata=metadata))
    if not docs:
        raise ValueError("At least one document with page_content is required")
    return docs


def build_retriever(raw_docs: Iterable[Mapping[str, str]], config: CorpusConfig | None = None):
    config = config or CorpusConfig()
    docs = _normalize_docs(raw_docs, config.default_metadata)
    embeddings = OpenAIEmbeddings()
    vectorstore = Chroma.from_documents(docs, embeddings, collection_name=config.collection_name)
    return vectorstore.as_retriever()


DEFAULT_DOCS = [
    {"page_content": "Nostra is a prediction market platform running on Arbitrum.", "metadata": {"source": "overview"}},
    {"page_content": "Nostra allows users to trade on future events like Sports and Politics.", "metadata": {"source": "features"}},
    {"page_content": "The native token of Nostra is NST, used for governance and fee rebates.", "metadata": {"source": "tokenomics"}},
    {"page_content": "Nostra uses a CTF (Conditional Token Framework) for its market resolution.", "metadata": {"source": "tech"}},
]

retriever = build_retriever(DEFAULT_DOCS)


## 2. Define Tools
The agent cannot query the vector store directly, so we wrap the retriever in a tool (named `search_docs` by default) that the model can call.
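
Conceptually, a tool is a name, a description the model reads, and a callable. The sketch below mimics that shape without any library, and also shows one way to turn retrieved snippets into a plain string for the model. `Snippet`, `SimpleTool`, `format_snippets`, and `dispatch` are hypothetical names for illustration and are not part of LangChain. The `__arg1` key mirrors how the single string argument appears in the traces later in this notebook.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Snippet:
    """Hypothetical stand-in for a retrieved document."""
    text: str
    source: str


def format_snippets(snippets: list[Snippet]) -> str:
    """Render snippets as one string, tagging each with its source."""
    if not snippets:
        return "No matching documents found."
    return "\n".join(f"[{s.source}] {s.text}" for s in snippets)


@dataclass(frozen=True)
class SimpleTool:
    name: str          # identifier the model uses to call the tool
    description: str   # text the model reads to decide when to call it
    func: Callable[[str], str]


def dispatch(tools: dict[str, SimpleTool], call: dict) -> str:
    """Run the tool named in a call dict shaped like {'name': ..., 'args': {...}}."""
    tool = tools.get(call["name"])
    if tool is None:
        return f"Unknown tool: {call['name']}"
    # Single-input tools receive one positional argument; take the first value.
    query = next(iter(call["args"].values()))
    return tool.func(query)


def fake_search(query: str) -> str:
    # Assumption: a fixed result stands in for a real retriever call.
    hits = [Snippet("Nostra uses a CTF for its market resolution.", "tech")]
    return format_snippets(hits)


registry = {"search_docs": SimpleTool("search_docs", "Searches the Nostra corpus.", fake_search)}
result = dispatch(registry, {"name": "search_docs", "args": {"__arg1": "market resolution"}})
missing = dispatch(registry, {"name": "other_tool", "args": {"__arg1": "x"}})
```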

In [84]:
from langchain_core.tools import Tool

def make_search_tool(retriever, *, name: str = "search_docs", description: str | None = None):
    desc = description or "Retrieves grounded knowledge snippets for the agent."

    def _search(query: str):
        return retriever.invoke(query)

    return Tool(name=name, description=desc, func=_search)


tools = [make_search_tool(retriever, description="Searches the configured Nostra corpus.")]


## 3. Build the Agent (Using LangGraph)
We will use `create_agent` from `langchain.agents`, a pre-built ReAct-style tool-calling agent that runs on LangGraph.

In [85]:
from langchain.agents import create_agent
from langchain_openai import ChatOpenAI


def build_agent(*, model: str = "gpt-4o-mini", temperature: float = 0.0, tools=None):
    llm = ChatOpenAI(model=model, temperature=temperature)
    return create_agent(llm, tools or [])


agent_executor = build_agent(tools=tools)


## 4. Run the Agent
Let's ask a question that requires retrieval.

In [86]:
def run_agent_query(agent, query: str):
    print(f"User: {query}\n")
    for chunk in agent.stream({"messages": [("human", query)]}):
        print(chunk)
        print("----")


run_agent_query(agent_executor, "What framework does Nostra use for market resolution?")


User: What framework does Nostra use for market resolution?

{'model': {'messages': [AIMessage(content='', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 20, 'prompt_tokens': 61, 'total_tokens': 81, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_provider': 'openai', 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_50906f2aac', 'id': 'chatcmpl-CgSdsatyGeS5QMTb2jxp0Ai6bJXpm', 'service_tier': 'default', 'finish_reason': 'tool_calls', 'logprobs': None}, id='lc_run--d3e3e08a-c223-4365-97cb-d09602a48f1d-0', tool_calls=[{'name': 'search_docs', 'args': {'__arg1': 'Nostra market resolution framework'}, 'id': 'call_25XwcTsiVr3M8UkRa77ySoyk', 'type': 'tool_call'}], usage_metadata={'input_tokens': 61, 'output_tokens': 20, 'total_tokens': 81, 'input_token_details': {'audio'

## 5. Advanced: Inspecting the Trace
The streamed output above (truncated here) traces the agent's steps, one chunk per step:
1.  It identifies it needs to search.
2.  It calls `search_docs`.
3.  It receives the context.
4.  It synthesizes the final answer.
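
The raw chunks are hard to read, so a small helper can condense each one to a line per message. The sketch below is an assumption-laden illustration: `summarize_chunk` is a hypothetical helper, and it assumes each chunk maps a node name to a dict with a `messages` list, as in the output above. It reads only the `tool_calls` and `content` attributes, so the example feeds it `SimpleNamespace` stand-ins instead of real message objects.

```python
from types import SimpleNamespace


def summarize_chunk(chunk: dict) -> list[str]:
    """Condense one streamed chunk into short, human-readable lines."""
    lines = []
    for node, update in chunk.items():
        for msg in (update or {}).get("messages", []):
            calls = getattr(msg, "tool_calls", None) or []
            if calls:
                # The model asked for one or more tool invocations.
                for call in calls:
                    lines.append(f"{node}: call {call['name']}({call['args']})")
            else:
                # Plain content: a tool result or the final answer.
                content = str(getattr(msg, "content", ""))
                lines.append(f"{node}: {content[:80]}")
    return lines


# Stand-in chunks shaped like the streamed output (values are illustrative).
tool_call_chunk = {
    "model": {
        "messages": [
            SimpleNamespace(
                content="",
                tool_calls=[{"name": "search_docs", "args": {"__arg1": "NST token purpose"}}],
            )
        ]
    }
}
answer_chunk = {"model": {"messages": [SimpleNamespace(content="Example answer.", tool_calls=[])]}}

call_lines = summarize_chunk(tool_call_chunk)
answer_lines = summarize_chunk(answer_chunk)
```

In `run_agent_query`, the `print(chunk)` call could be swapped for a loop that prints each line of `summarize_chunk(chunk)`.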

In [87]:
# Try another query to demonstrate reuse
run_agent_query(agent_executor, "What is the purpose of the NST token?")


User: What is the purpose of the NST token?

{'model': {'messages': [AIMessage(content='', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 18, 'prompt_tokens': 60, 'total_tokens': 78, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_provider': 'openai', 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_560af6e559', 'id': 'chatcmpl-CgSdxBogSdl0vWB9x5JLkMELiAZ46', 'service_tier': 'default', 'finish_reason': 'tool_calls', 'logprobs': None}, id='lc_run--071eb433-8eb7-4921-84d5-2d2281ef38fa-0', tool_calls=[{'name': 'search_docs', 'args': {'__arg1': 'NST token purpose'}, 'id': 'call_p1IIzMA4HqZ0OETmFtM6rF3z', 'type': 'tool_call'}], usage_metadata={'input_tokens': 60, 'output_tokens': 18, 'total_tokens': 78, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_to

## 6. Bonus Example: Multi-hop question
Ask something broader so the agent may retrieve multiple snippets before answering.
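
When a question spans several topics, one pattern is to retrieve once per sub-query and merge the results without duplicates. The sketch below shows only that merge step, under stated assumptions: `merge_retrievals` is a hypothetical helper, the sub-queries are written by hand (an agent would generate them itself), and `toy_retrieve` is a dictionary lookup standing in for the real retriever.

```python
from typing import Callable


def merge_retrievals(queries: list[str], retrieve: Callable[[str], list[str]]) -> list[str]:
    """Run each sub-query and merge the snippets, keeping first-seen order."""
    seen: set[str] = set()
    merged: list[str] = []
    for query in queries:
        for snippet in retrieve(query):
            if snippet not in seen:
                seen.add(snippet)
                merged.append(snippet)
    return merged


# Assumption: canned results stand in for vector-store lookups.
CANNED = {
    "Nostra overview": [
        "Nostra is a prediction market platform running on Arbitrum.",
    ],
    "Nostra key features": [
        "Nostra allows users to trade on future events like Sports and Politics.",
        "Nostra is a prediction market platform running on Arbitrum.",
    ],
}


def toy_retrieve(query: str) -> list[str]:
    return CANNED.get(query, [])


context = merge_retrievals(["Nostra overview", "Nostra key features"], toy_retrieve)
```

The overview snippet is returned by both sub-queries but appears only once in `context`, so the model is not shown the same evidence twice.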


In [88]:
# Broader query that requires combining multiple documents
run_agent_query(agent_executor, "Give me a quick overview of Nostra and mention its key features.")


User: Give me a quick overview of Nostra and mention its key features.

{'model': {'messages': [AIMessage(content='', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 21, 'prompt_tokens': 65, 'total_tokens': 86, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_provider': 'openai', 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_50906f2aac', 'id': 'chatcmpl-CgSe8ruvKx1F3ajVOt4zhqrPdsimf', 'service_tier': 'default', 'finish_reason': 'tool_calls', 'logprobs': None}, id='lc_run--2852a8db-1aec-49d0-807b-7275bf925f21-0', tool_calls=[{'name': 'search_docs', 'args': {'__arg1': 'Nostra overview and key features'}, 'id': 'call_WFx69fvMBK0V9FewzhMe8NOO', 'type': 'tool_call'}], usage_metadata={'input_tokens': 65, 'output_tokens': 21, 'total_tokens': 86, 'input_token_details':