# Generation-aware Retrieval

Generation-aware retrieval consists of these optional components:
* Reranking
* Filtering
* Disambiguation followups
* Query decomposition
* Personalization
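
As a rough illustration of how two of these optional stages (filtering and reranking) could wrap a first-stage retriever, here is a minimal sketch. Everything in it is hypothetical: the `Chunk` dataclass, `filter_by_year`, `rerank_by_keyword_overlap`, and the toy candidate texts are made up for illustration and are not part of LlamaIndex or of the code used later in this notebook.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    year: int     # publication year of the source book (hypothetical metadata)
    score: float  # similarity score from the first-stage retriever (toy value)


def filter_by_year(chunks, max_year):
    """Filtering: keep only chunks whose source was published by max_year."""
    return [c for c in chunks if c.year <= max_year]


def rerank_by_keyword_overlap(chunks, query):
    """Reranking: order by number of query words found in the chunk text,
    breaking ties with the original retriever score."""
    query_words = set(query.lower().split())

    def sort_key(chunk):
        overlap = len(query_words & set(chunk.text.lower().split()))
        return (overlap, chunk.score)

    return sorted(chunks, key=sort_key, reverse=True)


# Toy candidates standing in for first-stage retrieval results.
candidates = [
    Chunk("CONTENTS PART I EXTERNAL GEOLOGICAL AGENCIES", 1905, 0.77),
    Chunk("the grand canyon is cut into stratified rocks", 1905, 0.75),
    Chunk("coal seams underlain by old forest grounds", 1878, 0.70),
]
query = "geology of the grand canyon"

filtered = filter_by_year(candidates, max_year=1905)
reranked = rerank_by_keyword_overlap(filtered, query)
print(reranked[0].text)
```

A real reranker would typically use a cross-encoder or an LLM rather than keyword overlap; the point here is only that each stage takes a list of candidates and returns a (possibly shorter, reordered) list, so stages can be added or removed independently.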

In [1]:
#%pip install --quiet llama-index llama-index-llms-gemini llama-index-embeddings-huggingface pydantic-ai 

In [1]:
MODEL_ID = "gemini-2.0-flash"
EMBED_MODEL_ID = "BAAI/bge-small-en-v1.5"

import os
from dotenv import load_dotenv
load_dotenv("../keys.env")
assert os.environ.get("GEMINI_API_KEY", "")[:2] == "AI",\
       "Please specify the GEMINI_API_KEY access token in keys.env file"
assert os.environ.get("HF_TOKEN", "")[:2] == "hf",\
       "Please specify the HF_TOKEN access token in keys.env file"

In [2]:
import sys
sys.path.append('../basic_rag')
import gutenberg_text_loader as gtl
import logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)

The examples here use two geology texts from Project Gutenberg:
* An 1878 book: The Student's Elements of Geology: https://www.gutenberg.org/cache/epub/3772/pg3772.txt
* A 1905 book: The Elements of Geology: https://www.gutenberg.org/cache/epub/4204/pg4204.txt

## Plain Semantic Indexing to use as a comparison

In [10]:
#!rm -rf .cache vector_index   # uncomment to start afresh

In [12]:
!ls ./.cache vector_index

./.cache:
pg3772_3736454afe.txt  pg4204_81e8e90db3.txt

vector_index:
default__vector_store.json  graph_store.json	      index_store.json
docstore.json		    image__vector_store.json




In [13]:
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import Settings
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core import StorageContext, load_index_from_storage
from llama_index.core import Document
import os
import pathlib

INDEX_DIR="vector_index"
Settings.embed_model = HuggingFaceEmbedding(
    model_name=EMBED_MODEL_ID
)

# these are the defaults in LlamaIndex
Settings.chunk_size = 1024; Settings.chunk_overlap = 20; TOP_K=2
#Settings.chunk_size = 100; Settings.chunk_overlap = 10; TOP_K=4

if os.path.isdir(INDEX_DIR):
    print("Loading in already created index")
    storage_context = StorageContext.from_defaults(persist_dir=INDEX_DIR)
    index = load_index_from_storage(storage_context)
else:
    # downloads into .cache the first time
    gs = gtl.GutenbergSource()
    gs.load_from_url("https://www.gutenberg.org/cache/epub/3772/pg3772.txt")
    gs.load_from_url("https://www.gutenberg.org/cache/epub/4204/pg4204.txt")
    # reads all files in .cache
    documents = SimpleDirectoryReader(input_dir="./.cache", required_exts=[".txt"], exclude_hidden=False).load_data()
    # creates a vector db
    index = VectorStoreIndex.from_documents(documents)
    index.storage_context.persist(persist_dir=INDEX_DIR)

2025-03-27 18:25:35,393 - INFO - Load pretrained SentenceTransformer: BAAI/bge-small-en-v1.5
2025-03-27 18:25:36,622 - INFO - 2 prompts are loaded, with the keys: ['query', 'text']


Loading in already created index


2025-03-27 18:25:38,643 - INFO - Loading all indices.
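
To make the `chunk_size` and `chunk_overlap` settings concrete, here is a minimal sketch of a sliding window with overlap. It is a simplification under stated assumptions: it splits on a plain list of words, whereas LlamaIndex counts tokens and tries to respect sentence boundaries. The function `chunk_words` is hypothetical and is not a LlamaIndex API.

```python
def chunk_words(words, chunk_size, chunk_overlap):
    """Split a list of words into windows of chunk_size words, where each
    window repeats the last chunk_overlap words of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + chunk_size])
        # Stop once a window reaches the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


# Toy input: ten placeholder "words".
words = [f"w{i}" for i in range(10)]
chunks = chunk_words(words, chunk_size=4, chunk_overlap=1)
for chunk in chunks:
    print(chunk)
```

Smaller chunks with a larger `TOP_K` (the commented-out alternative in the cell above) trade broader context per chunk for finer-grained matching.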


In [14]:
from llama_index.llms.gemini import Gemini
from llama_index.core.query_engine import RetrieverQueryEngine

llm = Gemini(model=f"models/{MODEL_ID}", api_key=os.environ["GEMINI_API_KEY"])

query_engine = RetrieverQueryEngine.from_args(
    retriever=index.as_retriever(similarity_top_k=TOP_K), llm=llm,
)

def semantic_rag(question):
    response = query_engine.query(question)
    response = {
        "answer": str(response),
        "source_nodes": response.source_nodes
    }
    print(response['answer'])
    for node in response['source_nodes']:
        print(node)
    return response
 
semantic_rag("Describe the geology of the Grand Canyon");




The Grand Canyon is north of the high plateaus of northern Arizona and southern Utah. The canyon is cut into stratified rocks that are more than ten thousand feet thick with a gentle inclination northward. From the broad platform rises a series of gigantic stairs, often more than one thousand feet high and a score or more miles in breadth. The retreating escarpments and the walls of the ravines are carved into architectural forms by weathering and deflation.

Node ID: 7b635fb9-7b61-4508-ad6a-370f5cd42822
Text: W. M. DAVIS    HARVARD UNIVERSITY, CAMBRIDGE, MASS.    JULY,
1905            CONTENTS    INTRODUCTION.--THE SCOPE AND AIM OF
GEOLOGY    PART I    EXTERNAL GEOLOGICAL AGENCIES         I. THE WORK
OF THE WEATHER      II. THE WORK OF GROUND WATER     III. RIVERS AND
VALLEYS      IV. RIVER DEPOSITS       V. THE WORK OF GLACIERS      VI.
THE WORK OF ...
Score:  0.771

Node ID: 2e6e56ad-1080-4534-9177-6e25d1db23ff
Text: As they are little protected by talus, which  commonly is
removed 

Saved answer:
<pre>
The Grand Canyon is north of the high plateaus of northern Arizona and southern Utah. The canyon is cut into stratified rocks that are more than ten thousand feet thick with a gentle inclination northward. From the broad platform rises a series of gigantic stairs, often more than one thousand feet high and a score or more miles in breadth. The retreating escarpments and the walls of the ravines are carved into architectural forms by weathering and deflation.
</pre>
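
The `Score` printed with each node is the similarity between the query embedding and that chunk's embedding, and only the `TOP_K` highest-scoring chunks are passed to the LLM. The sketch below shows that ranking step on toy data, assuming cosine similarity is the metric. The three-dimensional vectors and chunk names are invented for illustration; real embeddings from the model above have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunk_vecs, k):
    """Return the k (score, name) pairs with the highest similarity."""
    scored = [
        (cosine_similarity(query_vec, vec), name)
        for name, vec in chunk_vecs.items()
    ]
    scored.sort(reverse=True)
    return scored[:k]


# Toy embeddings (hypothetical values).
toy_chunks = {
    "table_of_contents": [0.9, 0.4, 0.1],
    "canyon_passage": [0.8, 0.6, 0.0],
    "coal_seams": [0.1, 0.2, 0.9],
}
toy_query = [1.0, 0.5, 0.0]

results = top_k(toy_query, toy_chunks, k=2)
for score, name in results:
    print(f"{name}: {score:.3f}")
```

Nothing in this ranking step considers whether a chunk actually helps answer the question: a chunk that merely resembles the query can outrank a more useful one.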

In [15]:
# shouldn't work because the park (officially Petrified Forest National Park) didn't exist when either book was written
semantic_rag("Describe the geology of Petrified National Forest");


I'm sorry, but the provided text does not contain information about the geology of Petrified National Forest.

Node ID: 7bfb39c5-9900-465d-8f7b-b9f0d49e6617
Text: In the Nova Scotia  field, out of seventy-six distinct coal
seams, twenty are  underlain by old forest grounds.    The presence of
fire clay beneath a seam points in the same  direction. Such
underclays withstand intense heat and are used in  making fire brick,
because their alkalies have been removed by the  long-continued growth
of vegetation....
Score:  0.707

Node ID: a3556434-f9be-45c2-871f-0cf47f859389
Text: — Purity of the Coal explained. — Conversion of Coal into
Anthracite. — Origin of Clay-ironstone. — Marine and brackish-water
Strata in Coal. — Fossil Insects. — Batrachian Reptiles. —
Labyrinthodont Foot-prints in Coal-measures. — Nova Scotia  Coal-
measures with successive Growths of erect fossil Trees. —  Similarity
of American and European...
Score:  0.703



Saved answer:
<pre>
I'm sorry, but the provided text does not contain information about the geology of Petrified National Forest.
</pre>

## Limitations of plain-vanilla Semantic Indexing

