# [STARTER] Exercise - Building an Agentic RAG System

In this exercise, you will build an Agentic RAG (Retrieval-Augmented Generation) system that 
combines the power of AI agents with traditional RAG pipelines. You'll create an agent that 
can decide when and how to retrieve information from different sources, including vector 
databases, web search, and other tools.


## Challenge

Your challenge is to build an Agentic RAG system. To do so, you will:

- Wrap a RAG pipeline as a tool that the agent can call
- Create an agent that decides which tool to use based on the query
- Handle different types of queries, routing each to the most relevant source
- Combine information from multiple sources when needed
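
To make the last two goals concrete, here is a minimal sketch of an agent loop that consults more than one source before answering. It does not use the exercise's `lib` package: `run_agent`, `scripted_policy`, and the two `source_*` tools are hypothetical names, and the scripted policy stands in for the decision an LLM would make.

```python
from typing import Callable, Dict, List, Tuple

Action = Tuple[str, str]  # ("call", tool_name) or ("answer", text)


def run_agent(
    query: str,
    decide: Callable[[str, List[str]], Action],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 3,
) -> str:
    """Ask `decide` what to do next until it answers or steps run out."""
    observations: List[str] = []
    for _ in range(max_steps):
        action, value = decide(query, observations)
        if action == "answer":
            return value
        # Otherwise `value` names a tool: run it and record what it returned.
        observations.append(tools[value](query))
    # Step budget exhausted: fall back to whatever was gathered so far.
    return " | ".join(observations)


def scripted_policy(query: str, observations: List[str]) -> Action:
    """Stand-in for the LLM: consult source_a, then source_b, then answer."""
    order = ["source_a", "source_b"]
    if len(observations) < len(order):
        return ("call", order[len(observations)])
    return ("answer", " | ".join(observations))


# Hypothetical tools that just label the query with their source.
tools = {
    "source_a": lambda q: f"A: {q}",
    "source_b": lambda q: f"B: {q}",
}

combined = run_agent("market outlook", scripted_policy, tools)
print(combined)
```

In the exercise itself, the `Agent` class plays the role of this loop and the model, not a script, picks the next tool.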


## Setup
First, let's import the necessary libraries:

In [None]:
# Only needed for Udacity workspace

import importlib.util
import sys

# Check if 'pysqlite3' is available before importing
if importlib.util.find_spec("pysqlite3") is not None:
    import pysqlite3
    sys.modules['sqlite3'] = sys.modules.pop('pysqlite3')

In [None]:
import os
from typing import List
from dotenv import load_dotenv

from lib.agents import Agent
from lib.llm import LLM
from lib.state_machine import Run
from lib.messages import BaseMessage
from lib.tooling import tool
from lib.vector_db import VectorStoreManager, CorpusLoaderService
from lib.rag import RAG

In [None]:
load_dotenv()

In [None]:
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
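
If the key is missing, `os.getenv` silently returns `None` and the failure only surfaces later, when the first API call is made. A small guard like the one below can make that failure explicit. This is an optional sketch; `require_env` is a hypothetical helper, not part of the exercise's `lib` package.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or shell environment."
        )
    return value
```

For example, `OPENAI_API_KEY = require_env("OPENAI_API_KEY")` would stop the notebook at this cell rather than at the first model call.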

## Load data to Vector DB

In [None]:
db = VectorStoreManager(OPENAI_API_KEY)
db

In [None]:
loader_service = CorpusLoaderService(db)

In [None]:
rag_llm = LLM(
    model="gpt-4o-mini",
    temperature=0.3,
)

In [None]:
# TODO: Pass the games report PDF file path (including the .pdf extension)
# and a store name to the load_pdf() method

games_market_rag = RAG(
    llm=rag_llm,
    vector_store=loader_service.load_pdf()
)

In [None]:
result: Run = games_market_rag.invoke(
    "What's the state of virtual reality?"
)
print(result.get_final_state()["answer"])

In [None]:
# TODO: Pass the electric vehicles report PDF file path (including the .pdf extension)
# and a store name to the load_pdf() method

electric_vehicles_rag = RAG(
    llm=rag_llm,
    vector_store=loader_service.load_pdf()
)

In [None]:
result: Run = electric_vehicles_rag.invoke(
    "What was the number of electric car sales and their market share in Brazil in 2024?"
)
print(result.get_final_state()["answer"])

## Tools

In its simplest form, Agentic RAG acts like a router: the agent chooses among multiple external sources to retrieve relevant information. These sources aren't limited to databases; they can also include tools such as web search or APIs for services like Slack or email.

In this exercise, the agent chooses between two collections: the electric vehicles report and the games market report.
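
The routing idea can be illustrated with a toy, keyword-based router. This is only a sketch: in the exercise the LLM makes the choice, and the keyword lists, the `route` function, and the two stand-in search functions below are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple


# Hypothetical stand-ins for the two retrieval tools; each just labels the query.
def search_ev(query: str) -> str:
    return f"[ev collection] {query}"


def search_games(query: str) -> str:
    return f"[games collection] {query}"


NO_MATCH = "No matching collection for this query."

# Illustrative keyword lists (assumptions, not part of the exercise).
ROUTES: Dict[str, Tuple[List[str], Callable[[str], str]]] = {
    "ev": (["electric", "vehicle", "battery"], search_ev),
    "games": (["game", "gaming", "console"], search_games),
}


def route(query: str) -> str:
    """Send the query to the first source whose keywords appear in it."""
    text = query.lower()
    for keywords, handler in ROUTES.values():
        if any(keyword in text for keyword in keywords):
            return handler(query)
    return NO_MATCH
```

A keyword match is brittle, which is the motivation for letting an agent decide: the model can read each tool's description and pick a source even when the query shares no obvious keyword with it.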

In [None]:
# TODO: Define a tool that queries the RAG pipeline and returns
# result.get_final_state()["answer"]
# Do NOT forget to write the tool's docstring
@tool
def search_global_ev_collection(query):
    return

In [None]:
# TODO: Define a tool that queries the RAG pipeline and returns
# result.get_final_state()["answer"]
# Do NOT forget to write the tool's docstring
@tool
def search_games_market_report_collection(query):
    return
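
Docstrings matter because tool-calling frameworks commonly turn a function's name, parameters, and docstring into the description the model reads when choosing a tool. How `lib.tooling.tool` does this is not shown here, so the sketch below uses a hypothetical `describe_tool` helper, built only on the standard `inspect` module, to illustrate the general idea.

```python
import inspect
from typing import Any, Callable, Dict


def describe_tool(func: Callable[..., Any]) -> Dict[str, Any]:
    """Build a plain-dict description of a function from its metadata."""
    return {
        "name": func.__name__,
        # getdoc returns None when there is no docstring, hence the fallback.
        "description": inspect.getdoc(func) or "",
        "parameters": list(inspect.signature(func).parameters),
    }


def search_example_collection(query: str) -> str:
    """Search the example report collection and return a short answer.

    Use for questions about the example report only.
    """
    return f"answer for: {query}"


def undocumented(query):
    return query


spec = describe_tool(search_example_collection)
empty_spec = describe_tool(undocumented)
print(spec["description"])
```

With an empty description, as in `empty_spec`, a model would have only the function name to go on, so a docstring that states what the collection covers and when to use it gives the agent much more to work with.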

In [None]:
# TODO: Add the tools you have defined and the instructions to your agent
agentic_rag = Agent(
    model_name="gpt-4o-mini",
    tools=[],
    instructions=""
)

In [None]:
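def print_messages(messages: List[BaseMessage]):
    """Print the role, content, and any tool calls of each message."""
    for m in messages:
        # Not every message type carries tool calls, so default to None.
        tool_calls = getattr(m, "tool_calls", None)
        print(f" -> (role = {m.role}, content = {m.content}, tool_calls = {tool_calls})")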
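
The `getattr(..., None)` default is what lets the helper handle message types that have no `tool_calls` attribute. The sketch below shows the same formatting on stand-in message classes; `FakeUserMessage`, `FakeAIMessage`, and `format_message` are hypothetical and are not the classes from `lib.messages`.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FakeUserMessage:
    """Stand-in for a message type without a tool_calls attribute."""
    content: str = ""
    role: str = "user"


@dataclass
class FakeAIMessage:
    """Stand-in for a message type that may request tool calls."""
    content: Optional[str] = None
    role: str = "assistant"
    tool_calls: List[dict] = field(default_factory=list)


def format_message(m) -> str:
    """Format one message the same way print_messages prints it."""
    tool_calls = getattr(m, "tool_calls", None)
    return f" -> (role = {m.role}, content = {m.content}, tool_calls = {tool_calls})"


user_line = format_message(FakeUserMessage(content="hi"))
ai_line = format_message(FakeAIMessage(tool_calls=[{"name": "search"}]))
print(user_line)
print(ai_line)
```

When you inspect the runs below, the `tool_calls` field is the quickest way to see which tool, if any, the agent chose for a query.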

## Run

In [None]:
run_1 = agentic_rag.invoke(
    query="Who won the 2025 Oscar for International Movie?", 
    session_id="oscar",
)

print("\nMessages from run 1:")
messages = run_1.get_final_state()["messages"]
print_messages(messages)

In [None]:
run_2 = agentic_rag.invoke(
    query=(
        "Which two countries accounted for most of the electric car exports from " 
        "the Asia Pacific region (excluding China) in 2024?"
    ),
    session_id="electric_car",
)

print("\nMessages from run 2:")
messages = run_2.get_final_state()["messages"]
print_messages(messages)

In [None]:
run_3 = agentic_rag.invoke(
    query=(
        "Why is generative AI seen more as an accelerator than a replacement in game development?"
    ),
    session_id="games",
)

print("\nMessages from run 3:")
messages = run_3.get_final_state()["messages"]
print_messages(messages)