<a href="https://colab.research.google.com/github/Zonavin/GDG_COLLAB/blob/main/agentic_rag_llamaindex.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!pip install llama-index

In [None]:
!pip install llama-index-vector-stores-qdrant

In [None]:
!pip install llama-index-llms-google-genai llama-index-embeddings-fastembed

In [None]:
from llama_index.core import Settings
from llama_index.llms.google_genai import GoogleGenAI
from llama_index.embeddings.fastembed import FastEmbedEmbedding

import nest_asyncio
nest_asyncio.apply()

In [None]:
import os
from google.colab import userdata

os.environ["GOOGLE_API_KEY"] = userdata.get('GOOGLE_API_KEY')

In [None]:
query = "All the key features that were released in Google IO 2025"

In [None]:
llm = GoogleGenAI(
    model="gemini-2.5-flash-preview-05-20",
)

In [None]:
embed_model = FastEmbedEmbedding(model_name="BAAI/bge-small-en-v1.5")

In [None]:
Settings.llm = llm
Settings.embed_model = embed_model

In [None]:
llm.complete(query).text

## Agentic RAG

In [None]:
from llama_index.core import SimpleDirectoryReader

In [None]:
documents = SimpleDirectoryReader(input_files=["/content/luxella.pdf"]).load_data()

In [None]:
from llama_index.core.node_parser import SentenceSplitter

splitter = SentenceSplitter(chunk_size=1024)
nodes = splitter.get_nodes_from_documents(documents)
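
For intuition about what a sentence-aware splitter does, here is a minimal sketch in plain Python. It is not LlamaIndex's implementation: `chunk_sentences` is a hypothetical helper, it counts whitespace-separated words rather than model tokens, and it leaves out the overlap between neighbouring chunks that `SentenceSplitter` can add.

```python
import re


def chunk_sentences(text: str, max_words: int = 50) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_words words.

    Hypothetical helper for illustration only. A single sentence longer than
    max_words is kept whole rather than split further.
    """
    # Split after ., ! or ? followed by whitespace (a deliberately naive rule).
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current: list[str] = []
    current_len = 0
    for sentence in sentences:
        n_words = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and current_len + n_words > max_words:
            chunks.append(" ".join(current))
            current, current_len = [], 0
        current.append(sentence)
        current_len += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks


sample = "LoRA adapts a frozen model. It trains small matrices. The base weights stay fixed."
for chunk in chunk_sentences(sample, max_words=9):
    print(chunk)
```

With a budget of 9 words, the first two sentences share a chunk and the third starts a new one, so no sentence is cut in half.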

In [None]:
import qdrant_client
from llama_index.core import SummaryIndex, VectorStoreIndex
from llama_index.core import StorageContext
from llama_index.vector_stores.qdrant import QdrantVectorStore

In [None]:
client = qdrant_client.QdrantClient(
    location=":memory:"
)

In [None]:
vector_store = QdrantVectorStore(client=client, collection_name="agentic-rag")
storage_context = StorageContext.from_defaults(vector_store=vector_store)

In [None]:
summary_index = SummaryIndex(nodes)

In [None]:
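# Note: from_documents re-splits `documents` with the default node parser,
# so the `nodes` built earlier feed only the summary index.
vector_index = VectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context,
)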

In [None]:
summary_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize",
    use_async=True,
)
vector_query_engine = vector_index.as_query_engine()
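
The `tree_summarize` response mode can be pictured as a bottom-up reduction: chunks are summarized in groups, the group summaries are summarized again, and this repeats until a single text remains. The sketch below shows only that shape. `tree_reduce`, `fake_summarize`, and the `fan_in` parameter are hypothetical names, and the stub summarizer stands in for an LLM call; how LlamaIndex actually groups chunks (for example, by what fits in the context window) is not modelled here.

```python
from typing import Callable


def tree_reduce(
    chunks: list[str],
    summarize: Callable[[list[str]], str],
    fan_in: int = 2,
) -> str:
    """Repeatedly summarize groups of fan_in texts until one text is left."""
    if not chunks:
        raise ValueError("need at least one chunk")
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    level = list(chunks)
    while len(level) > 1:
        # Each pass builds the next, shorter level of the tree.
        level = [summarize(level[i:i + fan_in]) for i in range(0, len(level), fan_in)]
    return level[0]


def fake_summarize(texts: list[str]) -> str:
    """Stand-in for an LLM call: records which texts were merged."""
    return "(" + " + ".join(texts) + ")"


print(tree_reduce(["c1", "c2", "c3", "c4"], fake_summarize))
```

The printed nesting shows the tree: `c1` and `c2` are merged, `c3` and `c4` are merged, and the two intermediate results are merged at the root.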

In [None]:
from llama_index.core.tools import QueryEngineTool
from llama_index.core.query_engine.router_query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector

In [None]:
summary_tool = QueryEngineTool.from_defaults(
    query_engine=summary_query_engine,
    description=(
        "This Agent is useful to summarize the Research paper in a simplified way by giving intuitions"
    ),
)

vector_tool = QueryEngineTool.from_defaults(
    query_engine=vector_query_engine,
    description=(
        "This Agent is useful to answer to the user queries within the paper and retrieve the relevant piece of information"
    ),
)

In [None]:
query_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(),
    query_engine_tools=[
        summary_tool,
        vector_tool,
    ],
    verbose=True
)
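
Conceptually, a single-selector router picks exactly one tool by comparing the query against each tool's description, then forwards the query to that tool. The sketch below mimics that flow in plain Python. It is not how `LLMSingleSelector` works internally: the hypothetical `select_by_overlap` scores descriptions by shared words, whereas the notebook asks the LLM to make the choice. `Tool` and `route` are likewise illustrative names.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Tool:
    description: str
    run: Callable[[str], str]


def select_by_overlap(query: str, tools: list[Tool]) -> int:
    """Return the index of the tool whose description shares the most words with the query."""
    query_words = set(query.lower().split())
    scores = [len(query_words & set(t.description.lower().split())) for t in tools]
    return scores.index(max(scores))  # ties resolve to the earliest tool


def route(query: str, tools: list[Tool]) -> str:
    """Pick one tool and forward the query to it."""
    return tools[select_by_overlap(query, tools)].run(query)


tools = [
    Tool("summarize the whole paper", lambda q: f"[summary engine] {q}"),
    Tool("retrieve a specific fact such as the name of a dataset", lambda q: f"[vector engine] {q}"),
]
print(route("summarize the training approach", tools))
print(route("what is the benchmark dataset name", tools))
```

As in the notebook, the quality of the routing depends on the tool descriptions: the selector only sees those strings, so vague or overlapping descriptions make the choice unreliable.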

In [None]:
response = query_engine.query("summarize the training approach")

Selecting query engine 0: The question 'summarize the training approach' is a request for a summary. Choice (1) explicitly states that the Agent is 'useful to summarize the Research paper in a simplified way', which directly aligns with the need to summarize a specific aspect like the 'training approach' from a paper. Choice (2) is about answering specific queries and retrieving information, not generating a summary.

In [None]:
print(response)

The training approach for LuxLlama involved several key steps, beginning with the meticulous assembly of a specialized training corpus. This corpus, comprising over 400,000 examples, integrated publicly available Alpaca-styled instruction datasets adapted for Luxembourgish, along with articles from magazines and newspapers focused on reasoning tasks.

Before training, the diverse datasets underwent a systematic preprocessing pipeline. This included cleaning, standardizing text formats, and categorizing data by focus (general reasoning, mathematical reasoning, or Luxembourgish language tasks). Distinct prompt templates were applied to each category to guide the model, and all processed data was then amalgamated, shuffled, and partitioned into training and validation sets.

LuxLlama was derived from the Meta-Llama-3.1-8B-Instruct base model. To adapt it efficiently, Parameter-Efficient Fine-Tuning (PEFT) was employed, specifically the Low-Rank Adaptation (LoRA) technique. LoRA adapters w

In [None]:
response2 = query_engine.query("what is the benchmark dataset name?")

Selecting query engine 1: The question 'what is the benchmark dataset name?' is a specific user query seeking to retrieve a relevant piece of information from a paper. Choice (2) explicitly states that the Agent is 'useful to answer to the user queries within the paper and retrieve the relevant piece of information', which directly aligns with the nature of the question. Choice (1) describes an Agent for summarization and providing intuitions, which is less relevant for retrieving a specific factual detail.

In [None]:
print(response2)

The benchmark dataset is named LUXELLA.


## ReAct Agent

Build a simple project: Search Assistant Agent

In [None]:
from llama_index.core.agent import ReActAgent

In [None]:
agent = ReActAgent.from_tools(
    [vector_tool],
    llm=llm,
    verbose=True,
    allow_parallel_tool_calls=True,
)

In [None]:
response = agent.chat("what is the benchmark dataset name?")

> Running step e3e35763-623f-4532-ae60-d65e9010dada. Step input: what is the benchmark dataset name?
Thought: The current language of the user is: english. I need to use a tool to help me answer the question.
Action: query_engine_tool
Action Input: {'input': 'benchmark dataset name'}
Observation: The benchmark dataset is named LUXELLA (Luxembourgish Excellence Language Learning Assessment).
> Running step 0e272b2c-5d5f-4561-9c96-30d356f9b073. Step input: None
Thought: I can answer without using any more tools. I'll use the user's language to answer
Answer: The benchmark dataset is named LUXELLA (Luxembourgish Excellence Language Learning Assessment).

In [None]:
print(response)

The benchmark dataset is named LUXELLA (Luxembourgish Excellence Language Learning Assessment).
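
The trace above follows the ReAct pattern: the model emits a Thought and an Action, the agent runs the named tool and feeds back an Observation, and the loop ends when the model emits an Answer. Below is a minimal sketch of that control loop, with a scripted stand-in for the LLM so it runs offline. `react_loop`, `scripted_llm`, `lookup`, and the line-based reply format are assumptions made for illustration, not the internals of LlamaIndex's `ReActAgent`.

```python
import json
from typing import Callable


def react_loop(
    question: str,
    llm: Callable[[list[str]], str],
    tools: dict[str, Callable[..., str]],
    max_steps: int = 5,
) -> str:
    """Alternate model steps and tool calls until the model gives an Answer."""
    transcript = [f"Question: {question}"]
    for _ in range(max_steps):
        reply = llm(transcript)
        transcript.append(reply)
        # Parse "Key: value" lines from the model's reply.
        fields = dict(line.split(": ", 1) for line in reply.splitlines() if ": " in line)
        if "Answer" in fields:
            return fields["Answer"]
        # Otherwise run the requested tool and record what it returned.
        tool = tools[fields["Action"]]
        observation = tool(**json.loads(fields["Action Input"]))
        transcript.append(f"Observation: {observation}")
    raise RuntimeError("no answer within max_steps")


def scripted_llm(transcript: list[str]) -> str:
    """Stand-in for a real LLM: acts once, then answers from the observation."""
    last = transcript[-1]
    if last.startswith("Observation: "):
        return "Thought: I can answer now.\nAnswer: " + last[len("Observation: "):]
    return 'Thought: I need a tool.\nAction: lookup\nAction Input: {"query": "benchmark dataset name"}'


def lookup(query: str) -> str:
    """Toy tool with one canned fact, taken from the notebook output."""
    return "The benchmark dataset is named LUXELLA."


print(react_loop("what is the benchmark dataset name?", scripted_llm, {"lookup": lookup}))
```

The `max_steps` guard matters in practice: a real model can keep requesting tools without ever producing an Answer, so the loop needs a hard stop.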
