<a href="https://colab.research.google.com/github/radve88/Learning-AI/blob/main/llama_index_RAG.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

We will use the Hugging Face datasets library to load the dataset and convert it into a list of Document objects from the llama_index.core.schema module.

In [None]:
import datasets
from llama_index.core.schema import Document

# Load the dataset
guest_dataset = datasets.load_dataset("agents-course/unit3-invitees", split="train")

# Convert dataset entries into Document objects.
# Iterating over the dataset yields one dict per guest row.
docs = [
    Document(
        text="\n".join([
            f"Name: {guest['name']}",
            f"Relation: {guest['relation']}",
            f"Description: {guest['description']}",
            f"Email: {guest['email']}",
        ]),
        metadata={"name": guest["name"]},
    )
    for guest in guest_dataset
]

In the code above, we:

Load the dataset
Convert each guest entry into a Document object with formatted content
Store the Document objects in a list
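
To make the formatting step concrete, here is a minimal sketch of the text that ends up inside each Document. It uses plain Python only; the helper name format_guest and the sample record are hypothetical and are not taken from the real dataset.

```python
from typing import Dict


def format_guest(guest: Dict[str, str]) -> str:
    """Build the text body for one guest record, one field per line."""
    return "\n".join([
        f"Name: {guest['name']}",
        f"Relation: {guest['relation']}",
        f"Description: {guest['description']}",
        f"Email: {guest['email']}",
    ])


# Hypothetical record for illustration only (not from the dataset).
sample_guest = {
    "name": "Jane Example",
    "relation": "colleague",
    "description": "A made-up guest used to show the format.",
    "email": "jane@example.com",
}

formatted = format_guest(sample_guest)
print(formatted)
```

Keeping every field in the text means a keyword retriever can match on a name, a relation, or a word from the description, while the metadata keeps the name available separately.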

This means all of our data is now available as Document objects, so we can move on to configuring our retrieval.

Step 2: Create the Retriever Tool
Now, let’s create a custom tool that Alfred can use to search through our guest information.

In [None]:
from llama_index.core.tools import FunctionTool
from llama_index.retrievers.bm25 import BM25Retriever

bm25_retriever = BM25Retriever.from_defaults(nodes=docs)

def get_guest_info_retriever(query: str) -> str:
    """Retrieves detailed information about gala guests based on their name or relation."""
    results = bm25_retriever.retrieve(query)
    if results:
        return "\n\n".join([doc.text for doc in results[:3]])
    else:
        return "No matching guest information found."

# Initialize the tool
guest_info_tool = FunctionTool.from_defaults(get_guest_info_retriever)

Let’s understand this tool step-by-step.

The docstring helps the agent understand when and how to use this tool
The type hints define what parameters the tool expects (in this case, a search query string)
We’re using a BM25Retriever, which ranks text with the BM25 keyword-matching algorithm and doesn’t require embeddings
The function runs the query through the retriever and returns the text of up to three of the top-ranked guest entries
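
To show why BM25 needs no embeddings, here is a rough, self-contained sketch of Okapi BM25 scoring in plain Python. It is not the BM25Retriever implementation: the whitespace tokenizer, the k1 and b defaults, the smoothed IDF variant, and the sample guest texts are all assumptions chosen for illustration, and the function expects a non-empty corpus.

```python
import math
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase and split on whitespace (real retrievers tokenize more carefully)."""
    return text.lower().split()


def bm25_scores(query: str, corpus: List[str], k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Score every document in a non-empty corpus against the query with BM25."""
    tokenized = [tokenize(doc) for doc in corpus]
    n_docs = len(tokenized)
    avg_len = sum(len(doc) for doc in tokenized) / n_docs
    # Document frequency: how many documents contain each term.
    doc_freq = Counter(term for doc in tokenized for term in set(doc))

    scores = []
    for doc in tokenized:
        term_freq = Counter(doc)
        score = 0.0
        for term in tokenize(query):
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            # Smoothed inverse document frequency; rare terms weigh more.
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical guest texts for illustration only.
corpus = [
    "Name: Alice Example Relation: best friend Description: mathematician",
    "Name: Bob Sample Relation: colleague Description: physicist",
    "Name: Carol Demo Relation: neighbor Description: painter",
]
scores = bm25_scores("mathematician friend", corpus)
best = max(range(len(corpus)), key=lambda i: scores[i])
print(best, scores)
```

Scores depend only on word counts and document lengths, so no model has to be loaded to build the index. The trade-off is that a query must share actual words with a guest entry to match it.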

Step 3: Integrate the Tool with Alfred

Finally, let’s bring everything together by creating our agent and equipping it with our custom tool:

In [None]:
from llama_index.core.agent.workflow import AgentWorkflow
from llama_index.llms.huggingface_api import HuggingFaceInferenceAPI

# Initialize the Hugging Face model
llm = HuggingFaceInferenceAPI(model_name="Qwen/Qwen2.5-Coder-32B-Instruct")

# Create Alfred, our gala agent, with the guest info tool
alfred = AgentWorkflow.from_tools_or_functions(
    [guest_info_tool],
    llm=llm,
)

# Example query Alfred might receive during the gala
response = await alfred.run("Tell me about our guest named 'Lady Ada Lovelace'.")

print("🎩 Alfred's Response:")
print(response)

What’s happening in this final step:

We initialize a Hugging Face model using the HuggingFaceInferenceAPI class
We create our agent (Alfred) as an AgentWorkflow, including the tool we just created
We ask Alfred to retrieve information about a guest named “Lady Ada Lovelace”
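
The cell above uses a top-level await, which works in a notebook such as Colab but not in a plain Python script. As a hedged sketch, the same call pattern can be wrapped in a coroutine and started with asyncio.run. StubAgent here is a hypothetical stand-in for alfred so the example runs without a model or API token.

```python
import asyncio


class StubAgent:
    """Hypothetical stand-in that exposes an awaitable run(), like the agent above."""

    async def run(self, query: str) -> str:
        await asyncio.sleep(0)  # placeholder for the model and tool calls
        return f"(stub) received: {query}"


async def main() -> str:
    agent = StubAgent()  # in the notebook, this would be `alfred`
    response = await agent.run("Tell me about our guest named 'Lady Ada Lovelace'.")
    return response


# asyncio.run starts an event loop, runs main() to completion, then closes the loop.
result = asyncio.run(main())
print(result)
```

Inside a notebook an event loop is already running, so awaiting the call directly (as in the cell above) is the simpler option there.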