# Building Agentic RAG Systems

This notebook is part of the [Hugging Face Agents Course](https://www.hf.co/learn/agents-course), a free course that takes you from beginner to expert in building agents.

![Agents course share](https://huggingface.co/datasets/agents-course/course-images/resolve/main/en/communication/share.png)

## Basic Retrieval with WebSearch

Let's build a simple agent that can search the web. This agent will retrieve information and synthesize responses to answer queries. With Agentic RAG, Alfred's agent can:

* Search for the latest superhero party trends
* Refine results to include luxury elements
* Synthesize information into a complete plan

Here's how Alfred's agent can achieve this:

In [3]:
from smolagents import CodeAgent, WebSearchTool, LiteLLMModel

# Initialize the search tool
search_tool = WebSearchTool()

# Initialize the model
model = LiteLLMModel(
    model_id="ollama_chat/qwen2:7b",  # Or try other Ollama-supported models
    api_base="http://127.0.0.1:11434",  # Default Ollama local server
    num_ctx=8192,
)

agent = CodeAgent(
    model=model,
    tools=[search_tool],
)

# Example usage
response = agent.run(
    "Search for Taylor Swift's latest album, including the release date and the song list."
)
print(response)

The agent follows this process:

1. **Analyzes the Request:** Alfred’s agent identifies the key elements of the query: Taylor Swift's latest album, its release date, and its song list.
2. **Performs Retrieval:** The agent uses `WebSearchTool` to search for the most relevant and up-to-date information on those elements.
3. **Synthesizes Information:** After gathering the results, the agent processes them into a cohesive answer. In Alfred’s party-planning scenario, this would be an actionable plan covering all aspects of the party.
4. **Stores for Future Reference:** The agent stores the retrieved information for easy access when planning future events, optimizing efficiency in subsequent tasks.
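
The four steps above can be sketched in plain Python. This is only an illustration of the control flow, not how `CodeAgent` is implemented: in the real agent the language model performs the analysis and synthesis. The names `fake_search`, `answer`, and `memory` are hypothetical and exist only for this sketch.

```python
from typing import Callable, Dict, List


def fake_search(query: str) -> List[str]:
    """Hypothetical stand-in for a web search tool: returns canned snippets."""
    corpus = {
        "album": "Snippet A: the album title and release date would appear here.",
        "songs": "Snippet B: the song list would appear here.",
    }
    # Return every snippet whose key appears in the lowercased query
    return [text for key, text in corpus.items() if key in query.lower()]


def answer(query: str, search: Callable[[str], List[str]], memory: Dict[str, str]) -> str:
    # 1. Analyze the request (here: just normalize whitespace)
    cleaned = " ".join(query.split())
    # 4. Reuse a stored answer if this query was seen before
    if cleaned in memory:
        return memory[cleaned]
    # 2. Perform retrieval
    snippets = search(cleaned)
    # 3. Synthesize the snippets into one response
    if snippets:
        response = f"Answer to '{cleaned}':\n" + "\n".join(f"- {s}" for s in snippets)
    else:
        response = "No results found."
    # 4. Store the response for future reference
    memory[cleaned] = response
    return response


memory: Dict[str, str] = {}
result = answer("Latest album   and songs", fake_search, memory)
print(result)
```

Calling `answer` again with the same query returns the stored response without calling `fake_search` a second time.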

## Custom Knowledge Base Tool

For specialized tasks, a custom knowledge base can be invaluable. Let's create a tool that queries a knowledge base of technical documentation or specialized knowledge, so the agent can find the most relevant information for Alfred's needs. The example in this section uses `BM25Retriever`, which ranks documents by keyword overlap rather than by embedding similarity; a vector database with semantic search could be swapped in behind the same tool interface.

This approach combines predefined knowledge with retrieval to provide context-aware solutions for event planning. With specialized knowledge access, Alfred can perfect every detail of the party.
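
The retriever tool in this section ranks documents with BM25. To make the scoring concrete, here is a minimal pure-Python sketch of it. It assumes whitespace tokenization and a common non-negative IDF variant; the helper name `bm25_scores` and the defaults `k1=1.5`, `b=0.75` are illustrative choices, and the exact formula used inside `BM25Retriever` may differ.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(query: str, docs: List[str], k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Score every document against the query with a basic BM25 formula."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)  # term frequency within this document
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Rare terms get a high IDF, common terms a low one
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            # Length normalization: long documents are penalized slightly
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "a superhero masquerade ball with luxury decor",
    "hire a professional dj for the party",
    "serve a themed dinner for the guests",
]
scores = bm25_scores("a masquerade", docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(docs[best])
```

All three documents contain "a", so that term contributes little to any score. Only the first document contains "masquerade", which is why it ranks highest.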

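Install the dependencies first (`smolagents`, `litellm`, `langchain-community`, `langchain-text-splitters`, and `rank_bm25`), then run the cell.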

In [None]:
from langchain_core.documents import Document  # text plus metadata: Document(page_content, metadata)
from langchain_text_splitters import RecursiveCharacterTextSplitter  # splits long texts into smaller chunks
from langchain_community.retrievers import BM25Retriever  # keyword-based algorithm that scores and ranks docs
from smolagents import CodeAgent, LiteLLMModel, Tool

class PartyPlanningRetrieverTool(Tool):
    name = "party_planning_retriever"
    description = "Uses BM25 keyword search to retrieve relevant party planning ideas for Alfred’s superhero-themed party at Wayne Manor."
    inputs = {
        "query": {
            "type": "string",
            "description": "The query to perform. This should be a query related to party planning or superhero themes.",
        }
    }
    output_type = "string"

    def __init__(self, docs, **kwargs):
        super().__init__(**kwargs)
        # Build an in-memory BM25 index over the list of Documents.
        # BM25 weights each term by its IDF (inverse document frequency):
        # words that appear in many docs (like "the", "a") get a low weight,
        # while words that appear in few docs (like "masquerade") get a high
        # weight because they are more distinctive.
        # k=5 makes the retriever return the top 5 results.
        self.retriever = BM25Retriever.from_documents(docs, k=5)

    # The agent calls this method when it uses the tool
    def forward(self, query: str) -> str:
        assert isinstance(query, str), "Your search query must be a string"

        # 1. Tokenize the query
        # 2. Score each document with BM25 based on how well it matches the query terms
        # 3. Return the top 5 documents
        docs = self.retriever.invoke(query)
        return "\nRetrieved ideas:\n" + "".join(
            f"\n\n===== Idea {i} =====\n{doc.page_content}"
            for i, doc in enumerate(docs)
        )

# Simulate a knowledge base about party planning
party_ideas = [
    {"text": "A superhero-themed masquerade ball with luxury decor, including gold accents and velvet curtains.", "source": "Party Ideas 1"},
    {"text": "Hire a professional DJ who can play themed music for superheroes like Batman and Wonder Woman.", "source": "Entertainment Ideas"},
    {"text": "For catering, serve dishes named after superheroes, like 'The Hulk's Green Smoothie' and 'Iron Man's Power Steak.'", "source": "Catering Ideas"},
    {"text": "Decorate with iconic superhero logos and projections of Gotham and other superhero cities around the venue.", "source": "Decoration Ideas"},
    {"text": "Interactive experiences with VR where guests can engage in superhero simulations or compete in themed games.", "source": "Entertainment Ideas"}
]

# wrap party_ideas into a list of Documents
source_docs = [
    Document(page_content=doc["text"], metadata={"source": doc["source"]})
    for doc in party_ideas
]

# Split the documents
# LLMs have context limits, so long documents are split into shorter chunks
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,     # each chunk has at most 500 characters
    chunk_overlap=50,   # consecutive chunks share up to 50 characters
    add_start_index=True,  # record each chunk's start offset in its metadata
    strip_whitespace=True,
    # try these separators in order, from coarsest to finest
    separators=["\n\n", "\n", ".", " ", ""],
)
docs_processed = text_splitter.split_documents(source_docs)

# Create the retriever tool
party_planning_retriever = PartyPlanningRetrieverTool(docs_processed)

# Initialize the model
model = LiteLLMModel(
    model_id="ollama_chat/qwen2:7b",  # Or try other Ollama-supported models
    api_base="http://127.0.0.1:11434",  # Default Ollama local server
    num_ctx=8192,
)

# Initialize the agent
agent = CodeAgent(tools=[party_planning_retriever], model=model)

# Example usage
response = agent.run(
    "Find ideas for a luxury superhero-themed party, including entertainment, catering, and decoration options."
)

print(response)

This enhanced agent can:

1. First check the documentation for relevant information
2. Combine insights from the knowledge base
3. Maintain conversation context through memory
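
The splitting step in the party-planning cell uses `chunk_size=500` and `chunk_overlap=50`. The sketch below shows what those two numbers mean with a simplified fixed-window splitter. It is not how `RecursiveCharacterTextSplitter` works internally: that class prefers to cut at the listed separators, whereas the hypothetical `chunk_text` helper here cuts at fixed character offsets.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> List[str]:
    """Split text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks


# A 1200-character dummy text made of repeating digits
text = "".join(str(i % 10) for i in range(1200))
chunks = chunk_text(text)
print([len(chunk) for chunk in chunks])
```

Each sample party idea in this notebook is far shorter than 500 characters, so the splitter passes those documents through as single chunks. Splitting only starts to matter with longer source documents.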