* Project Title: Party Event-based Agentic Retrieval Augmented Generative System
* Objective: To equip our Agentic RAG System to answer queries about our guests
* Requirements:
    - The agent needs knowledge in these domains: sport, culture, and science
    - The agent should avoid topics (such as religion and politics) that could cause conflict during the event
    - The agent should know each guest's background
    - The agent needs real-time weather updates so we can time the fireworks launch perfectly
* Tools to be built:
    - Invitees info tool - to get up-to-date information on guests
    - Web search tool
    - Weather tool
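
The requirement to avoid conflict-prone topics could be approached in several ways. A minimal sketch of one option follows, assuming a plain keyword check run on a query before it reaches the agent. The `SENSITIVE_KEYWORDS` lists and the `flag_sensitive_topics` helper are hypothetical names introduced for illustration, and the keyword lists are far from exhaustive.

```python
import re

# Hypothetical keyword lists; a real filter would need to be much richer.
SENSITIVE_KEYWORDS = {
    "religion": {"religion", "religious", "church", "faith"},
    "politics": {"politics", "political", "election", "government"},
}


def flag_sensitive_topics(text: str) -> set[str]:
    """Return the sensitive topics whose keywords appear in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return {
        topic
        for topic, keywords in SENSITIVE_KEYWORDS.items()
        if words & keywords
    }


# A query that touches politics is flagged; a neutral one is not.
print(flag_sensitive_topics("What does Ada think about the election?"))
print(flag_sensitive_topics("Tell me about Ada's work in mathematics."))
```

A check like this could run before `agent.run(...)` so that flagged queries get a polite refusal instead of an answer. Keyword matching misses paraphrases, so it is only a first line of defence.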



## Load and Prepare Dataset

In [1]:
# import the needed package and class
import datasets
from llama_index.core.schema import Document



In [2]:
guest_data = datasets.load_dataset("agents-course/unit3-invitees", split="train")
guest_data[0]

{'name': 'Ada Lovelace',
 'relation': 'best friend',
 'description': "Lady Ada Lovelace is my best friend. She is an esteemed mathematician and friend. She is renowned for her pioneering work in mathematics and computing, often celebrated as the first computer programmer due to her work on Charles Babbage's Analytical Engine.",
 'email': 'ada.lovelace@example.com'}

In [3]:
# convert each dataset row into a Document object
docs = [
    Document(
        text='\n'.join(
            [
                f"Name: {guest['name']}",
                f"Relation: {guest['relation']}",
                f"Description: {guest['description']}",
                f"Email: {guest['email']}"
            ]
        ),
        # keep the guest name as metadata so it travels with the node
        metadata={'Name': guest['name']}
    )
    for guest in guest_data
]

In [4]:
docs?

Type:        list
String form: [Document(id_='fceebf0f-a406-4dd9-af61-a39efd47f300', embedding=None, metadata={'Name': 'Ada Love <...> rce=None, audio_resource=None, video_resource=None, text_template='{metadata_str}\n\n{content}')]
Length:      3
Docstring:
Built-in mutable sequence.

If no argument is given, the constructor creates a new empty list.
The argument must be an iterable if specified.

In [5]:
# inspect the content of our docs
docs[0].text

"Name: Ada Lovelace\nRelation: best friend\nDescription: Lady Ada Lovelace is my best friend. She is an esteemed mathematician and friend. She is renowned for her pioneering work in mathematics and computing, often celebrated as the first computer programmer due to her work on Charles Babbage's Analytical Engine.\nEmail: ada.lovelace@example.com"

## Build a retriever tool off our guestlist

In [10]:
from llama_index.core.tools import FunctionTool
from llama_index.retrievers.bm25 import BM25Retriever
from loguru import logger


bm25_retriever = BM25Retriever.from_defaults(nodes=docs)

def get_guest_info_retriever(query: str) -> str:
    """Retrieve detailed information on guests based on their name or relation."""
    result = bm25_retriever.retrieve(query)
    logger.info(result)

    if result:
        return '\n\n'.join([
            value.text for value in result[:3]
        ])
    else:
        return 'No matching guest information'

# create the tool
guest_info_tool = FunctionTool.from_defaults(fn=get_guest_info_retriever)
    

In [11]:
print(guest_info_tool.call("Tell me about our guest named 'Lady Ada Lovelace'."))

2025-07-02 11:42:51.921 | INFO     | __main__:get_guest_info_retriever:11 - [NodeWithScore(node=TextNode(id_='fceebf0f-a406-4dd9-af61-a39efd47f300', embedding=None, metadata={'Name': 'Ada Lovelace'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, metadata_template='{key}: {value}', metadata_separator='\n', text="Name: Ada Lovelace\nRelation: best friend\nDescription: Lady Ada Lovelace is my best friend. She is an esteemed mathematician and friend. She is renowned for her pioneering work in mathematics and computing, often celebrated as the first computer programmer due to her work on Charles Babbage's Analytical Engine.\nEmail: ada.lovelace@example.com", mimetype='text/plain', start_char_idx=None, end_char_idx=None, metadata_seperator='\n', text_template='{metadata_str}\n\n{content}'), score=1.847428321838379), NodeWithScore(node=TextNode(id_='e610d630-982c-4c86-849f-93a6eaa031ca', embedding=None, metada

Name: Ada Lovelace
Relation: best friend
Description: Lady Ada Lovelace is my best friend. She is an esteemed mathematician and friend. She is renowned for her pioneering work in mathematics and computing, often celebrated as the first computer programmer due to her work on Charles Babbage's Analytical Engine.
Email: ada.lovelace@example.com

Name: Dr. Nikola Tesla
Relation: old friend from university days
Description: Dr. Nikola Tesla is an old friend from your university days. He's recently patented a new wireless energy transmission system and would be delighted to discuss it with you. Just remember he's passionate about pigeons, so that might make for good small talk.
Email: nikola.tesla@gmail.com
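
Dr. Nikola Tesla appears in the result even though the query names only Ada Lovelace: a keyword retriever returns its top-ranked documents even when the match is weak. The sketch below shows the idea behind BM25 scoring in plain Python so the ranking is easier to reason about. It uses the common Okapi BM25 formula with assumed parameters `k1=1.5` and `b=0.75`; the `tokenize` and `bm25_scores` helpers are illustrative and not necessarily the exact variant `BM25Retriever` uses internally. The two toy documents are abbreviated from the guest list above.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query: str, documents: list[str],
                k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score every document against the query with the Okapi BM25 formula."""
    if not documents:
        return []
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(doc) for doc in tokenized) / n_docs

    # document frequency: in how many documents each term occurs
    doc_freq = Counter(term for doc in tokenized for term in set(doc))

    query_terms = set(tokenize(query))
    scores = []
    for doc in tokenized:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_terms:
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            # smoothed inverse document frequency (always positive)
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # term frequency, damped by k1 and normalised by document length
            norm = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


guests = [
    "Name: Ada Lovelace Relation: best friend",
    "Name: Dr. Nikola Tesla Relation: old friend from university days",
]
scores = bm25_scores("Tell me about Ada Lovelace", guests)

# keep only documents that share at least one term with the query
matches = [doc for doc, score in zip(guests, scores) if score > 0]
print(matches)
```

Filtering on a minimum score, as in the last step, is one possible way to stop weak matches from reaching the agent. A suitable threshold depends on the data and would need tuning.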


## Creating our Party Event-based Agent

In [None]:
from llama_index.core.agent.workflow import AgentWorkflow
# from llama_index.llms.ollama import Ollama
from llama_index.llms.llama_cpp import LlamaCPP

# initiate the model
# llm = Ollama(model='llama2:7b', temperature=0.7)
llm = LlamaCPP(
    model_path="../models/Meta-Llama-3.1-8B-Instruct-Q6_K.gguf",
    temperature=0.7,
    context_window=2048,
    model_kwargs={"n_gpu_layers": 4, "n_ctx": 2048}
)



# create the agent from a function tool
agent = AgentWorkflow.from_tools_or_functions(
    tools_or_functions=[guest_info_tool],
    llm=llm,
)

# test our new agent's capabilities
output = await agent.run("Tell me about our guest named 'Lady Ada Lovelace'.")


print("Alfred's response: ")
print(output)


llama_model_load_from_file_impl: using device Metal (Apple M2) - 4437 MiB free
llama_model_loader: loaded meta data with 33 key-value pairs and 292 tensors from ../models/Meta-Llama-3.1-8B-Instruct-Q6_K.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
llama_model_loader: - kv   1:                               general.type str              = model
llama_model_loader: - kv   2:                               general.name str              = Meta Llama 3.1 8B Instruct
llama_model_loader: - kv   3:                           general.finetune str              = Instruct
llama_model_loader: - kv   4:                           general.basename str              = Meta-Llama-3.1
llama_model_loader: - kv   5:                         general.size_label str              = 8B
llama_model_loader: - kv   6:                    

Alfred's response: 
Marie Curie is a renowned physicist and chemist, famous for her research on radioactivity.  She is not related to Lady Ada Lovelace.  However, Marie Curie has a relation with Dr. Nikola Tesla.  Dr. Nikola Tesla is an old friend from Marie Curie's university days.  He is the one who recently patented a new wireless energy transmission system and would be delighted to discuss it with you.  Just remember he's passionate about pigeons, so that might make for good small talk.  You can contact Marie Curie at marie.curie@example.com and Dr. Nikola Tesla at nikola.tesla@gmail.com. It is worth noting that the guest information retrieved is based on the name or relation provided.  It is unclear what the actual relationship between the two women might be.   Dr. Tesla's relation to Lady Ada Lovelace is not mentioned in the provided information. It is unclear if they have any relation.  You might want to ask him about that if you ever meet.  You can contact Dr. Tesla at nikola.t

## Adding more tools to our agent: weather tool, web search tool, and HF Hub stats tool

In [49]:
# weather tool
import os

from dotenv import load_dotenv
from llama_index.tools.weather import OpenWeatherMapToolSpec

# read the OpenWeatherMap API key from the local .env file
load_dotenv()
open_weather_key = os.getenv('OPEN_WEATHER_API_KEY')

weather_tool_spec = OpenWeatherMapToolSpec(key=open_weather_key)

# create the tools and list what the spec exposes
tools = weather_tool_spec.to_tool_list()
for tool in tools:
    print(tool.metadata)

# the first tool is weather_at_location (current weather)
weather_tool = tools[0]

# test the tool; it expects a location (city and country), not a full question
result = weather_tool.call('Liverpool, England')
print(result)


ToolMetadata(description='weather_at_location(location: str) -> List[llama_index.core.schema.Document]\n\n        Finds the current weather at a location.\n\n        Args:\n            place (str):\n                The place to find the weather at.\n                Should be a city name and country.\n        ', name='weather_at_location', fn_schema=<class 'llama_index.core.tools.utils.weather_at_location'>, return_direct=False)
ToolMetadata(description='forecast_tommorrow_at_location(location: str) -> List[llama_index.core.schema.Document]\n\n        Finds the weather forecast for tomorrow at a location.\n\n        Args:\n            location (str):\n                The location to find the weather tomorrow at.\n                Should be a city name and country.\n        ', name='forecast_tommorrow_at_location', fn_schema=<class 'llama_index.core.tools.utils.forecast_tommorrow_at_location'>, return_direct=False)
[Document(id_='ae869773-caa0-4e5e-ab9c-39fbc6f3d385', embedding=None, meta

In [41]:
# search tool
from llama_index.tools.duckduckgo import DuckDuckGoSearchToolSpec

# instantiate the DuckDuckGo search tool
search_tool_spec = DuckDuckGoSearchToolSpec()

# create function tool 
search_tool = FunctionTool.from_defaults(fn=search_tool_spec.duckduckgo_full_search)
response = search_tool('when was huggingface platform made public?')
print(response)



RatelimitException: https://links.duckduckgo.com/d.js?q=when+was+huggingface+platform+made+public%3F&kl=wt-wt&l=wt-wt&p=&s=0&df=&vqd=4-144989448283941718767879249954094752024&bing_market=wt-WT&ex=-1 202 Ratelimit
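
The search call above failed with a rate-limit error. A common way to cope with transient failures like this is to retry with exponentially growing delays. The sketch below is a generic helper in plain Python; `call_with_backoff` and `flaky_search` are hypothetical names, and catching a broad `Exception` is a simplification. In the notebook one would catch only the rate-limit exception raised by the search package.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(fn: Callable[[], T], retries: int = 3,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, waiting base_delay * 2**attempt seconds between failed attempts."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except Exception:
            # give up and re-raise once the retry budget is spent
            if attempt == retries:
                raise
            sleep(base_delay * 2 ** attempt)
    raise ValueError("retries must be zero or greater")


# Stand-in for a rate-limited search call: fails twice, then succeeds.
attempts = {"count": 0}


def flaky_search() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("rate limited")
    return "ok"


delays: list[float] = []
# record the delays instead of sleeping so the demo finishes instantly
result = call_with_backoff(flaky_search, sleep=delays.append)
print(result, delays)
```

With the real tool, the wrapped call would be something like `lambda: search_tool('...')`. Retrying does not help if the service keeps refusing requests, so the retry budget should stay small.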

In [50]:
# model stats tool:
# fetches model statistics from the Hugging Face Hub for a given author (username)
from huggingface_hub import list_models

# define the get_model_stats function
def get_model_stats(author: str) -> str:
    """Fetch the most downloaded model from a specific author."""
    try:
        models = list(list_models(author=author, sort='downloads', direction=-1, limit=1))

        if models:
            model = models[0]
            return f'The most downloaded model by {author} is {model.id} with {model.downloads:,} downloads'
        else:
            return f'No model found for {author}'
        
    except Exception as e:
        return f'Could not fetch models for {author} due to {str(e)}'
    
# creating the tool
model_stats_tool = FunctionTool.from_defaults(fn=get_model_stats)

result = model_stats_tool('facebook')

print(result)

The most downloaded model by facebook is facebook/esm2_t30_150M_UR50D with 21,657,160 downloads


## Building the agent with all the necessary tools

In [53]:
party_agent = AgentWorkflow.from_tools_or_functions(
    tools_or_functions=[model_stats_tool, search_tool, weather_tool, guest_info_tool],
    llm=llm,
)

response = await party_agent.run('who is Ada lovelace and what is facebook most downloaded model?')
print("Alfred's response:", response)

Llama.generate: 91 prefix-match hit, remaining 864 prompt tokens to eval
llama_perf_context_print:        load time =    9820.52 ms
llama_perf_context_print: prompt eval time =   14887.01 ms /   864 tokens (   17.23 ms per token,    58.04 tokens per second)
llama_perf_context_print:        eval time =   28089.12 ms /   255 runs   (  110.15 ms per token,     9.08 tokens per second)
llama_perf_context_print:       total time =   43099.75 ms /  1119 tokens
Llama.generate: 954 prefix-match hit, remaining 75 prompt tokens to eval
llama_perf_context_print:        load time =    9820.52 ms
llama_perf_context_print: prompt eval time =    3005.25 ms /    75 tokens (   40.07 ms per token,    24.96 tokens per second)
llama_perf_context_print:        eval time =   26537.61 ms /   255 runs   (  104.07 ms per token,     9.61 tokens per second)
llama_perf_context_print:       total time =   29646.66 ms /   330 tokens
Llama.generate: 1028 prefix-match hit, remaining 237 prompt tokens to eval
llama_per

Alfred's response: Ada Lovelace was a mathematician and is often credited as the first computer programmer. She worked with Charles Babbage on the Analytical Engine. She is also known for her work in mathematics and computing. She is the best friend of Marie Curie who was a physicist and chemist. She is known for her work on radioactivity and was the first woman to win a Nobel Prize.  Thought: I can answer the question without using any more tools. I'll use the user's language to answer
Answer: Ada Lovelace was a mathematician and is often credited as the first computer programmer. She worked with Charles Babbage on the Analytical Engine. She is also known for her work in mathematics and computing. She is the best friend of Marie Curie who was a physicist and chemist. She is known for her work on radioactivity and was the first woman to win a Nobel Prize.  Thought: I can answer the question without using any more tools. I'll use the user's language to answer
Answer:  I think there 
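
The `llama_perf_context_print` lines above report 255 generated tokens ("runs") per call, with prompts of roughly a thousand tokens. That pattern is consistent with each generation stopping at a 256-token output limit inside the 2048-token context window, which may be why raw "Thought:" and "Answer:" text leaks into the final response. The configuration sketch below gives the model more headroom. It assumes the same model file and enough memory on the machine; the specific values are illustrative guesses, not tuned settings.

```python
from llama_index.llms.llama_cpp import LlamaCPP

llm = LlamaCPP(
    model_path="../models/Meta-Llama-3.1-8B-Instruct-Q6_K.gguf",
    temperature=0.7,
    max_new_tokens=512,   # assumption: room for a full Thought/Action/Answer turn
    context_window=8192,  # assumption: room for tool descriptions plus tool output
    model_kwargs={"n_gpu_layers": 4},
    verbose=False,        # should quiet the llama.cpp loading and timing logs
)
```

The agent would need to be rebuilt with this `llm` for the change to take effect. A larger context window uses more memory, so the values may need to be lowered on smaller machines.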

In [None]:
# asking our agent a new question
result = await party_agent.run('one of our guest is from google, what can you tell me about their most popular model')
print("Alfred's response: \n")
print(result)

Llama.generate: 959 prefix-match hit, remaining 1 prompt tokens to eval
llama_perf_context_print:        load time =    9820.52 ms
llama_perf_context_print: prompt eval time =       0.00 ms /     1 tokens (    0.00 ms per token,      inf tokens per second)
llama_perf_context_print:        eval time =   29973.50 ms /   256 runs   (  117.08 ms per token,     8.54 tokens per second)
llama_perf_context_print:       total time =   30091.59 ms /   257 tokens
Llama.generate: 959 prefix-match hit, remaining 71 prompt tokens to eval
llama_perf_context_print:        load time =    9820.52 ms
llama_perf_context_print: prompt eval time =    3192.91 ms /    71 tokens (   44.97 ms per token,    22.24 tokens per second)
llama_perf_context_print:        eval time =   27005.14 ms /   255 runs   (  105.90 ms per token,     9.44 tokens per second)
llama_perf_context_print:       total time =   30316.91 ms /   326 tokens
Llama.generate: 1029 prefix-match hit, remaining 219 prompt tokens to eval
llama_perf

Alfred's response: 

The current weather in New York, USA is clear sky with a wind speed of 5.14 m/s and a direction of 270°. The humidity is 40% and the temperature is 27.19°C with a high of 27.98°C and a low of 25.57°C. The feels like temperature is 27.0°C. There is no rain and the cloud cover is 0%. 
assistant: Thought: The current language of the


2025-07-05 01:20:39.836 | INFO     | __main__:get_guest_info_retriever:11 - [NodeWithScore(node=TextNode(id_='e610d630-982c-4c86-849f-93a6eaa031ca', embedding=None, metadata={'Name': 'Dr. Nikola Tesla'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, metadata_template='{key}: {value}', metadata_separator='\n', text="Name: Dr. Nikola Tesla\nRelation: old friend from university days\nDescription: Dr. Nikola Tesla is an old friend from your university days. He's recently patented a new wireless energy transmission system and would be delighted to discuss it with you. Just remember he's passionate about pigeons, so that might make for good small talk.\nEmail: nikola.tesla@gmail.com", mimetype='text/plain', start_char_idx=None, end_char_idx=None, metadata_seperator='\n', text_template='{metadata_str}\n\n{content}'), score=1.9424822330474854), NodeWithScore(node=TextNode(id_='24de1642-cb8f-4fec-9905-1f0dc25db9

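Notice that the result above is not grounded in the guest data and is irrelevant to the question asked: the agent reported the weather in New York instead of describing a guest's most popular model. In short, the agent hallucinated.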

In [56]:
# asking our agent a new question - combining multiple tool calls
text = """
I need to speak with Dr. Nikola Tesla about recent advancements in wireless energy. Can you help me prepare for this conversation?
"""
result = await party_agent.run(text)
print("Alfred's response: \n")
print(result)

Llama.generate: 938 prefix-match hit, remaining 29 prompt tokens to eval
llama_perf_context_print:        load time =    9820.52 ms
llama_perf_context_print: prompt eval time =    1782.27 ms /    29 tokens (   61.46 ms per token,    16.27 tokens per second)
llama_perf_context_print:        eval time =   30725.33 ms /   255 runs   (  120.49 ms per token,     8.30 tokens per second)
llama_perf_context_print:       total time =   32640.14 ms /   284 tokens
Llama.generate: 966 prefix-match hit, remaining 172 prompt tokens to eval
llama_perf_context_print:        load time =    9820.52 ms
llama_perf_context_print: prompt eval time =    4455.84 ms /   172 tokens (   25.91 ms per token,    38.60 tokens per second)
llama_perf_context_print:        eval time =   30521.03 ms /   255 runs   (  119.69 ms per token,     8.35 tokens per second)
llama_perf_context_print:       total time =   35110.93 ms /   427 tokens
Llama.generate: 1137 prefix-match hit, remaining 142 prompt tokens to eval
llama_pe

Alfred's response: 

 Thought: The current language of the user is: English. I need to use a tool to help me answer the question.
Action: weather_at_location


## Keeping context between queries via conversation memory

In [58]:
from llama_index.core.workflow import Context

# the Context object stores conversation state for this agent
ctx = Context(party_agent)

response = await party_agent.run('who is Ada?', ctx=ctx)

print(response)

Llama.generate: 944 prefix-match hit, remaining 1 prompt tokens to eval
llama_perf_context_print:        load time =    9820.52 ms
llama_perf_context_print: prompt eval time =       0.00 ms /     1 tokens (    0.00 ms per token,      inf tokens per second)
llama_perf_context_print:        eval time =   27130.19 ms /   256 runs   (  105.98 ms per token,     9.44 tokens per second)
llama_perf_context_print:       total time =   27241.08 ms /   257 tokens
Llama.generate: 946 prefix-match hit, remaining 154 prompt tokens to eval
llama_perf_context_print:        load time =    9820.52 ms
llama_perf_context_print: prompt eval time =    4057.35 ms /   154 tokens (   26.35 ms per token,    37.96 tokens per second)
llama_perf_context_print:        eval time =   28704.23 ms /   255 runs   (  112.57 ms per token,     8.88 tokens per second)
llama_perf_context_print:       total time =   32874.88 ms /   409 tokens
Llama.generate: 946 prefix-match hit, remaining 207 prompt tokens to eval
llama_perf

It seems that Ada Lovelace is a historical figure known for her contributions to mathematics and computing. Unfortunately, I couldn't find any information on the most downloaded model from her author. The tool returned a message indicating that no model was found for her. However, it's worth noting that the tool was not able to find any information on a model from an unknown author either.  If you could provide more context or information, I might be able to help you better.  Also, it's worth mentioning that the tool returned a description of Ada Lovelace from the get_guest_info_retriever tool, which seems to be a mistake, as it's not the correct description of her.  I hope this answer is helpful, please let me know if you have any other questions.  If you would like me to answer any other questions, please let me know and I will do my best to assist you.  Thank you for using this tool.  I hope you found the information you were looking for.  If you have any other questions, feel free 
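
Passing the same `Context` object to later `run` calls is what lets the agent see earlier turns. The sketch below illustrates the general idea of conversation memory in plain Python: a bounded buffer of past messages that is prepended to each new query. The `ChatMemory` class is a hypothetical illustration and not how the library's `Context` is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class ChatMemory:
    """Keeps the most recent messages so each new query can see earlier turns."""

    max_messages: int = 6
    messages: list[tuple[str, str]] = field(default_factory=list)

    def add(self, role: str, content: str) -> None:
        """Store a message and drop the oldest ones once the buffer is full."""
        self.messages.append((role, content))
        overflow = len(self.messages) - self.max_messages
        if overflow > 0:
            del self.messages[:overflow]

    def as_prompt(self, query: str) -> str:
        """Prefix the new query with the stored conversation history."""
        history = [f"{role}: {content}" for role, content in self.messages]
        return "\n".join(history + [f"user: {query}"])


memory = ChatMemory(max_messages=2)
memory.add("user", "Who is Ada?")
memory.add("assistant", "Ada Lovelace is a mathematician and your best friend.")

# the follow-up only makes sense because the earlier turns are included
prompt = memory.as_prompt("What is her email?")
print(prompt)
```

Because the history travels with the prompt, a follow-up such as "What is her email?" can be resolved to Ada Lovelace. Bounding the buffer matters with a small context window like the one configured for this model.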