In this unit, we will help Alfred, our friendly agent who is hosting the gala, by using Agentic RAG to create a tool that can be used to answer questions about the guests at the gala.

## A Gala to Remember


Now, it’s time to get our hands dirty with an actual use case. Let’s set the stage!

**You decided to host the most extravagant and opulent party of the century**. This means lavish feasts, enchanting dancers, renowned DJs, exquisite drinks, a breathtaking fireworks display, and much more.

Alfred, your friendly neighbourhood agent, is getting ready to watch over all of your needs for this party, and **Alfred is going to manage everything himself**. To do so, he needs to have access to all of the information about the party, including the menu, the guests, the schedule, weather forecasts, and much more!

Not only that, but he also needs to make sure that the party is going to be a success, so **he needs to be able to answer any questions about the party during the party**, whilst handling unexpected situations that may arise.

He can’t do this alone, so we need to make sure that Alfred has access to all of the information and tools he needs.

First, let’s give him a list of hard requirements for the gala.

## The Gala Requirements
A properly educated person in the age of the **Renaissance** needed to have three main traits: a profound **knowledge of sports, culture, and science**. So, we need to make sure we can impress our guests with our knowledge and provide them with a truly unforgettable gala. However, to avoid any conflicts, there are some **topics, like politics and religion, that are to be avoided at a gala**. It needs to be a fun party without conflicts related to beliefs and ideals.

According to etiquette, **a good host should be aware of guests’ backgrounds**, including their interests and endeavours. A good host also gossips and shares stories about the guests with one another.

Lastly, we need to make sure that we’ve got **some general knowledge about the weather**, backed by real-time updates, so we can time the fireworks perfectly and end the gala with a bang! 🎆

As you can see, Alfred needs a lot of information to host the gala. Luckily, we can help and prepare Alfred by giving him some **Retrieval Augmented Generation (RAG) training**!

Let’s start by creating the tools that Alfred needs to be able to host the gala!



## Agentic Retrieval Augmented Generation (RAG)
In this unit, we’ll be taking a look at how we can use Agentic RAG to help Alfred prepare for the amazing gala.

LLMs are trained on enormous bodies of data to learn general knowledge. However, the world knowledge of an LLM may not always be relevant or up to date. **RAG solves this problem by finding and retrieving relevant information from your data and forwarding that to the LLM**.

![](../../image/rag.png)
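
To make the retrieve-then-forward idea concrete, here is a minimal sketch in plain Python. It is not how a production RAG system works: the `DOCUMENTS` list, the word-overlap scoring, and the prompt wording are all assumptions made for illustration, and no LLM is actually called.

```python
import re

# Hypothetical in-memory "knowledge base"; a real setup would index your own data.
DOCUMENTS = [
    "The fireworks display is scheduled for midnight on the terrace.",
    "The menu features a five-course dinner with vegetarian options.",
    "The DJ set starts at ten in the ballroom.",
]

def tokenize(text):
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, documents, k=1):
    """Return the k documents that share the most words with the query."""
    query_tokens = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_tokens & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(query, passages):
    """Place the retrieved passages in front of the question for the LLM."""
    context = "\n".join(f"- {passage}" for passage in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

question = "When is the fireworks display?"
passages = retrieve(question, DOCUMENTS)
prompt = build_prompt(question, passages)
print(prompt)
```

The resulting `prompt` is what would be sent to the LLM, so the model answers from the retrieved passage rather than from its training data alone.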

Now, think about how Alfred works:

1. We’ve asked Alfred to help plan a gala
2. Alfred needs to find the latest news and weather information
3. Alfred needs to structure and search the guest information

Just as Alfred needs to search through your household information to be helpful, any agent needs a way to find and understand relevant data. **Agentic RAG is a powerful way to use agents to answer questions about your data**. We can pass various tools to Alfred to help him answer questions. However, instead of automatically answering every question from the retrieved documents, Alfred can decide to use any other tool or flow to answer it.

![](../../image/agentic-rag.png)
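
The difference from plain RAG is the decision step: the agent picks a tool per question. The sketch below illustrates only that control flow. All three tools are hypothetical stand-ins that return placeholder strings, and `choose_tool` is a toy keyword rule standing in for the decision that the LLM would make in a real agent.

```python
def guest_info_tool(query):
    """Hypothetical stand-in for a retriever over the guest dataset."""
    return f"[guest lookup] {query}"

def weather_tool(query):
    """Hypothetical stand-in for a weather API call."""
    return f"[weather lookup] {query}"

def web_search_tool(query):
    """Hypothetical stand-in for a web search."""
    return f"[web search] {query}"

# Registry of the tools the agent is allowed to use.
TOOLS = {
    "guest_info": guest_info_tool,
    "weather": weather_tool,
    "web_search": web_search_tool,
}

def choose_tool(query):
    """Toy routing rule; in a real agent the LLM makes this choice."""
    lowered = query.lower()
    if "weather" in lowered or "forecast" in lowered:
        return "weather"
    if "guest" in lowered or "invitee" in lowered:
        return "guest_info"
    return "web_search"

def run_agent_step(query):
    """Pick a tool for the query, call it, and return (tool name, result)."""
    tool_name = choose_tool(query)
    return tool_name, TOOLS[tool_name](query)

for q in [
    "What is the weather forecast for tonight?",
    "Which guest works in astronomy?",
    "Who won the match yesterday?",
]:
    print(run_agent_step(q))
```

Because retrieval over the guest data is just one entry in `TOOLS`, adding a new capability means registering another tool rather than changing the retrieval pipeline.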


Let’s start building our agentic RAG workflow!

First, we’ll create a RAG tool to retrieve up-to-date details about the invitees. Next, we’ll develop tools for web search, weather updates, and Hugging Face Hub model download statistics. Finally, we’ll integrate everything to bring our agentic RAG agent to life!

## Creating a RAG Tool for Guest Stories
Alfred, your trusted agent, is preparing for the most extravagant gala of the century. To ensure the event runs smoothly, Alfred needs quick access to up-to-date information about each guest. Let’s help Alfred by creating a custom Retrieval-Augmented Generation (RAG) tool, powered by our custom dataset.


### Why RAG for a Gala?

Imagine Alfred mingling among the guests, needing to recall specific details about each person at a moment’s notice. A traditional LLM might struggle with this task because:

- The guest list is specific to your event and not in the model’s training data
- Guest information may change or be updated frequently
- Alfred needs to retrieve precise details like email addresses

This is where Retrieval Augmented Generation (RAG) shines! By combining a retrieval system with an LLM, Alfred can access accurate, up-to-date information about your guests on demand.

## Dataset Overview
Our dataset [agents-course/unit3-invitees](https://huggingface.co/datasets/agents-course/unit3-invitees) contains the following fields for each guest:

- Name: Guest’s full name
- Relation: How the guest is related to the host
- Description: A brief biography or interesting facts about the guest
- Email Address: Contact information for sending invitations or follow-ups


## Building the Guestbook Tool
We’ll create a custom tool that Alfred can use to quickly retrieve guest information during the gala. Let’s break this down into three manageable steps:

- Load and prepare the dataset
- Create the Retriever Tool
- Integrate the Tool with Alfred

Let’s start with loading and preparing the dataset!

```python
import datasets
from langchain_core.documents import Document

# Load the dataset
guest_dataset = datasets.load_dataset("agents-course/unit3-invitees", split="train")

# Convert dataset entries into Document objects
docs = [
    Document(
        page_content="\n".join([
            f"Name: {guest['name']}",
            f"Relation: {guest['relation']}",
            f"Description: {guest['description']}",
            f"Email: {guest['email']}"
        ]),
        metadata={"name": guest["name"]}
    )
    for guest in guest_dataset
]
```

In the code above, we:

- Load the dataset
- Convert each guest entry into a Document object with formatted content
- Store the Document objects in a list

This means we’ve got all of our data nicely available so we can get started with configuring our retrieval.
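
To see what the text of a single document looks like, the sketch below applies the same field formatting to one record in plain Python, without downloading the dataset. The `format_guest` helper and the `sample_guest` values are hypothetical and invented for illustration; only the four field names come from the loading code.

```python
def format_guest(guest):
    """Join a guest record's fields into one newline-separated text block."""
    return "\n".join([
        f"Name: {guest['name']}",
        f"Relation: {guest['relation']}",
        f"Description: {guest['description']}",
        f"Email: {guest['email']}",
    ])

# Hypothetical record; the values are invented for illustration only.
sample_guest = {
    "name": "Jane Example",
    "relation": "old friend",
    "description": "Amateur astronomer who enjoys chess.",
    "email": "jane@example.com",
}

page_content = format_guest(sample_guest)
print(page_content)
```

Keeping every field on its own labelled line means a retriever can match a query against the name, the relation, or the description, and the text it returns is already readable for the LLM.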