### Three basic retrieval techniques in RAG
The main ones are listed below.

1) Exact word match
2) Embedding-based match
3) Hybrid (exact match combined with embeddings)

### Advanced retrieval techniques in RAG
Two of them are covered below.

1) Sentence window retrieval
2) Auto Merge Retrieval

### Exact match search - using a simple approach

In [None]:
# -------------------------------
# Imports
# -------------------------------
from typing import List
from langchain.docstore.document import Document
from langchain.schema import BaseRetriever
from pydantic import PrivateAttr

from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# -------------------------------
# 1. Custom Exact Match Retriever
# -------------------------------
class ExactMatchRetriever(BaseRetriever):
    _documents: List[Document] = PrivateAttr()

    def __init__(self, documents: List[Document], **kwargs):
        super().__init__(**kwargs)
        self._documents = documents

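    # Override the underscore-prefixed hook; BaseRetriever's public entry points delegate to it
    def _get_relevant_documents(self, query: str, *, run_manager=None) -> List[Document]: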
        query_lower = query.lower()
        return [
            doc for doc in self._documents
            if any(word in doc.page_content.lower() for word in query_lower.split())
        ]

# -------------------------------
# 2. Prepare Documents
# -------------------------------
docs = [
    Document(page_content="Python is a programming language."),
    Document(page_content="LlamaIndex is a framework for building LLM apps."),
    Document(page_content="FAISS provides similarity search."),
]

retriever = ExactMatchRetriever(docs)

# -------------------------------
# 3. Setup LLM + Prompt
# -------------------------------
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0.0)

system_prompt = (
    "Use the given context to answer the question. "
    "If you don't know the answer, say you don't know. "
    "Use three sentences maximum and keep the answer concise. "
    "Context: {context}"
)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{input}"),
    ]
)

# Create the document combination chain
question_answer_chain = create_stuff_documents_chain(llm, prompt)

# -------------------------------
# 4. Create Retrieval Chain
# -------------------------------
chain = create_retrieval_chain(retriever, question_answer_chain)

# -------------------------------
# 5. Query Example
# -------------------------------
query = "What is LlamaIndex?"
result = chain.invoke({"input": query})

print("Answer:\n", result)
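
The substring test `word in doc.page_content.lower()` used above is quite permissive. For the query "What is LlamaIndex?", the word "is" also occurs inside "faiss", so all three sample documents are returned, while "llamaindex?" (with its question mark) matches nothing. Below is a minimal sketch of a stricter, token-level variant. The helper names `tokenize` and `exact_match` and the tiny `STOPWORDS` set are assumptions made for illustration; they are not part of LangChain.

```python
import re
from typing import List

# Hypothetical, deliberately tiny stopword list used only for this illustration.
STOPWORDS = {"what", "is", "a", "the", "for"}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens (punctuation is dropped)."""
    return re.findall(r"\w+", text.lower())


def exact_match(query: str, texts: List[str]) -> List[str]:
    """Return the texts that share at least one non-stopword token with the query."""
    query_tokens = set(tokenize(query)) - STOPWORDS
    return [t for t in texts if query_tokens & set(tokenize(t))]


texts = [
    "Python is a programming language.",
    "LlamaIndex is a framework for building LLM apps.",
    "FAISS provides similarity search.",
]

matches = exact_match("What is LlamaIndex?", texts)
print(matches)
```

Matching on whole tokens rather than substrings keeps short words from matching inside longer ones, at the cost of missing inflected forms unless stemming is added.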


### Exact match search - using BM25

In [None]:
from typing import List
from rank_bm25 import BM25Okapi
from langchain.docstore.document import Document
from langchain.schema import BaseRetriever
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from pydantic import PrivateAttr

# -------------------------------
# 3. Prepare Documents
# -------------------------------
docs = [
    Document(page_content="Python is a programming language."),
    Document(page_content="LlamaIndex is a framework for building LLM apps."),
    Document(page_content="FAISS provides similarity search."),
]

# -------------------------------
# 4. BM25 Retriever with PrivateAttr
# -------------------------------
class BM25Retriever(BaseRetriever):
    _documents: List[Document] = PrivateAttr()
    _tokenized_docs: List[List[str]] = PrivateAttr()
    _bm25: BM25Okapi = PrivateAttr()

    def __init__(self, documents: List[Document], **kwargs):
        super().__init__(**kwargs)
        self._documents = documents
        self._tokenized_docs = [doc.page_content.lower().split() for doc in documents]
        self._bm25 = BM25Okapi(self._tokenized_docs)

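    def _get_relevant_documents(self, query: str, *, run_manager=None) -> List[Document]:
        query_tokens = query.lower().split()
        scores = self._bm25.get_scores(query_tokens)
        # Sort on the score only, so tied scores never fall back to comparing Document objects
        ranked = sorted(zip(scores, self._documents), key=lambda pair: pair[0], reverse=True)
        ranked_docs = [doc for _, doc in ranked]
        return ranked_docs[:3]  # top 3 documents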

retriever = BM25Retriever(docs)

# -------------------------------
# 5. Setup LLM + Prompt
# -------------------------------
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0.0)

system_prompt = (
    "Use the given context to answer the question. "
    "If you don't know the answer, say you don't know. "
    "Use three sentences maximum and keep the answer concise. "
    "Context: {context}"
)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{input}"),
    ]
)

question_answer_chain = create_stuff_documents_chain(llm, prompt)

# -------------------------------
# 6. Create Retrieval Chain
# -------------------------------
chain = create_retrieval_chain(retriever, question_answer_chain)

# -------------------------------
# 7. Query Example
# -------------------------------
query = "What is LlamaIndex?"
result = chain.invoke({"input": query})

print("Answer:\n", result)
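
To make the ranking step less of a black box, here is a small from-scratch sketch of BM25 scoring. It uses a commonly cited form of the formula with a non-negative IDF; the `rank_bm25` library may differ in details such as IDF smoothing, so the absolute scores should not be expected to match. The function names and the defaults `k1=1.5` and `b=0.75` are assumptions for illustration.

```python
import math
import re
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"\w+", text.lower())


def bm25_scores(query: str, texts: List[str], k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Score every text against the query with a textbook BM25 formula."""
    corpus = [tokenize(t) for t in texts]
    n_docs = len(corpus)
    avg_len = sum(len(doc) for doc in corpus) / n_docs
    scores = []
    for doc in corpus:
        score = 0.0
        for term in tokenize(query):
            doc_freq = sum(1 for d in corpus if term in d)  # documents containing the term
            idf = math.log((n_docs - doc_freq + 0.5) / (doc_freq + 0.5) + 1.0)
            term_freq = doc.count(term)
            # Length normalisation: longer documents need more occurrences for the same score
            denom = term_freq + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * term_freq * (k1 + 1) / denom
        scores.append(score)
    return scores


texts = [
    "Python is a programming language.",
    "LlamaIndex is a framework for building LLM apps.",
    "FAISS provides similarity search.",
]

scores = bm25_scores("What is LlamaIndex?", texts)
ranking = sorted(range(len(texts)), key=lambda i: scores[i], reverse=True)
print(ranking, scores)
```

The rare term "llamaindex" carries a higher IDF than the common term "is", which is why the LlamaIndex sentence comes out on top even though it is the longest document.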

### Embedding-based search

In [None]:
import os
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.chains import create_retrieval_chain
from langchain.docstore.document import Document
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from getpass import getpass

# -------------------------------
# 3. Set OpenAI API Key
# -------------------------------
if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass("Enter your OpenAI API key: ")

# -------------------------------
# 4. Prepare Documents
# -------------------------------
docs = [
    Document(page_content="Python is a programming language."),
    Document(page_content="LlamaIndex is a framework for building LLM apps."),
    Document(page_content="FAISS provides similarity search."),
]

# -------------------------------
# 5. Create embeddings and FAISS vector store
# -------------------------------
embeddings = OpenAIEmbeddings()
vectorstore = FAISS.from_documents(docs, embeddings)

# -------------------------------
# 6. Setup Retriever
# -------------------------------
retriever = vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 3}  # top 3 similar documents
)

# -------------------------------
# 7. Setup LLM and Prompt
# -------------------------------
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0.0)

system_prompt = (
    "Use the given context to answer the question. "
    "If you don't know the answer, say you don't know. "
    "Use three sentences maximum and keep the answer concise. "
    "Context: {context}"
)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{input}"),
    ]
)

question_answer_chain = create_stuff_documents_chain(llm, prompt)

# -------------------------------
# 8. Create Retrieval Chain
# -------------------------------
chain = create_retrieval_chain(retriever, question_answer_chain)

# -------------------------------
# 9. Query Example
# -------------------------------
query = "What is LlamaIndex?"
response = chain.invoke({"input": query})

print("Answer:\n", response)
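
The idea behind embedding-based retrieval can be sketched without any API calls: every text becomes a vector, and documents are ranked by how close their vectors are to the query vector. The sketch below uses cosine similarity, which is one common choice; the FAISS store above may use a different metric, such as Euclidean distance, by default. The three-dimensional vectors are hand-written, hypothetical stand-ins for real embeddings, and `cosine_similarity` and `top_k` are illustrative helper names.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query: List[float], vectors: Dict[str, List[float]], k: int = 3) -> List[str]:
    """Return the names of the k vectors most similar to the query."""
    ranked = sorted(
        vectors,
        key=lambda name: cosine_similarity(query, vectors[name]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy "embeddings"; real models produce far more dimensions.
doc_vectors = {
    "python": [0.9, 0.1, 0.0],
    "llamaindex": [0.1, 0.9, 0.2],
    "faiss": [0.0, 0.3, 0.9],
}
query_vector = [0.2, 0.8, 0.1]  # stands in for the embedded question

best = top_k(query_vector, doc_vectors, k=2)
print(best)
```

Unlike exact matching, nothing here depends on shared words: two texts with no tokens in common can still be close if the embedding model places them near each other.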

### Hybrid search

In [None]:
import os
from getpass import getpass
from typing import List

from rank_bm25 import BM25Okapi
from langchain.docstore.document import Document
from langchain.schema import BaseRetriever
from pydantic import PrivateAttr
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from langchain.chains import create_retrieval_chain

# -------------------------------
# 3. Set OpenAI API Key
# -------------------------------
if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass("Enter your OpenAI API key: ")

# -------------------------------
# 4. Prepare Documents
# -------------------------------
docs = [
    Document(page_content="Python is a programming language."),
    Document(page_content="LlamaIndex is a framework for building LLM apps."),
    Document(page_content="FAISS provides similarity search."),
]

# -------------------------------
# 5. BM25 Retriever
# -------------------------------
class BM25Retriever:
    def __init__(self, documents: List[Document], top_k: int = 3):
        self._documents = documents
        self.top_k = top_k
        self.tokenized_docs = [doc.page_content.lower().split() for doc in documents]
        self.bm25 = BM25Okapi(self.tokenized_docs)

    def get_relevant_documents(self, query: str) -> List[Document]:
        query_tokens = query.lower().split()
        scores = self.bm25.get_scores(query_tokens)
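        # Sort on the score only, so tied scores never fall back to comparing Document objects
        ranked_docs = [
            doc
            for _, doc in sorted(
                zip(scores, self._documents), key=lambda pair: pair[0], reverse=True
            )
        ]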
        return ranked_docs[:self.top_k]

# -------------------------------
# 6. Embedding Retriever (FAISS)
# -------------------------------
embeddings = OpenAIEmbeddings()
vectorstore = FAISS.from_documents(docs, embeddings)
embedding_retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 3})

# -------------------------------
# 7. Hybrid Retriever
# -------------------------------
class HybridRetriever(BaseRetriever):
    _bm25_retriever: BM25Retriever = PrivateAttr()
    _embedding_retriever: BaseRetriever = PrivateAttr()

    def __init__(self, bm25_retriever: BM25Retriever, embedding_retriever: BaseRetriever, **kwargs):
        super().__init__(**kwargs)
        self._bm25_retriever = bm25_retriever
        self._embedding_retriever = embedding_retriever

    def _get_relevant_documents(self, query: str, *, run_manager=None) -> List[Document]:
        bm25_docs = self._bm25_retriever.get_relevant_documents(query)
        embedding_docs = self._embedding_retriever.invoke(query)
        # Merge and remove duplicates
        seen = set()
        merged_docs = []
        for doc in bm25_docs + embedding_docs:
            if doc.page_content not in seen:
                merged_docs.append(doc)
                seen.add(doc.page_content)
        return merged_docs

# Instantiate hybrid retriever
bm25_retriever = BM25Retriever(docs, top_k=2)
retriever = HybridRetriever(bm25_retriever, embedding_retriever)

# -------------------------------
# 8. LLM and Prompt Setup
# -------------------------------
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0.0)

system_prompt = (
    "Use the given context to answer the question. "
    "If you don't know the answer, say you don't know. "
    "Use three sentences maximum and keep the answer concise. "
    "Context: {context}"
)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{input}"),
    ]
)

question_answer_chain = create_stuff_documents_chain(llm, prompt)

# -------------------------------
# 9. Create Retrieval Chain
# -------------------------------
chain = create_retrieval_chain(retriever, question_answer_chain)

# -------------------------------
# 10. Query Example
# -------------------------------
query = "What is LlamaIndex?"
response = chain.invoke({"input": query})

print("Answer:\n", response)
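
The hybrid retriever above merges the two result lists by concatenation and de-duplication, so BM25 hits always come first. A frequently used alternative is reciprocal rank fusion (RRF), which rewards documents that rank well in both lists. The sketch below is not what the code above does; it is a different merging technique shown for comparison. The function name, the document ids, and the constant `k=60` (a conventional default) are assumptions.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document ids into one list, best first."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank); high ranks contribute more
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical rankings, as a keyword retriever and an embedding retriever might return them
bm25_ranking = ["llamaindex", "python"]
embedding_ranking = ["llamaindex", "faiss", "python"]

fused = reciprocal_rank_fusion([bm25_ranking, embedding_ranking])
print(fused)
```

Because RRF works on ranks rather than raw scores, it avoids having to put BM25 scores and vector similarities on a common scale.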

### Sentence Window Retrieval

In [29]:
from llama_index.core import SimpleDirectoryReader, Document

# load document
documents = SimpleDirectoryReader(
    input_dir="documents/"
).load_data(show_progress=True)

# merge pages into one
document = Document(text="\n\n".join([doc.text for doc in documents]))
print(document.text)

Loading files: 100%|██████████| 3/3 [00:00<00:00, 123.10it/s]

In the legal system, documentation is regarded as an essential element. Extending the risk management dimension, failure to document relevant data is itself considered a significant breach of and deviation from the standard of care.1–3 Of course, protection from legal jeopardy is far from the only reason for documentation in clinical care. The patient's record provides the only enduring version of the care as it evolves over time and a reference work of value in emergency care, research, and quality assurance. This discussion will outline some basic principles of sound documentation with an emphasis on those aspects that serve the goals of risk management and liability prevention.

Basic Principles of Documentation
A significant portion of risk management advice regarding documentation unfortunately boils down to the injunction, “You physicians ought to write more.” From my years in the medicolegal field, I have found that this advice not only fails to be useful, but is actually 




In [30]:
from llama_index.core.node_parser import SentenceWindowNodeParser
from llama_index.core import Document

# create the sentence window node parser
node_parser = SentenceWindowNodeParser.from_defaults(
    window_size=2,
    window_metadata_key="window",
    original_text_metadata_key="original_text",
)

# Get nodes
nodes = node_parser.get_nodes_from_documents([Document(text=document.text)])

# Print out individual nodes
print([x.text for x in nodes])

# Print out the window around the second node
print(nodes[1].metadata["window"])

['In the legal system, documentation is regarded as an essential element. ', 'Extending the risk management dimension, failure to document relevant data is itself considered a significant breach of and deviation from the standard of care.1–3 Of course, protection from legal jeopardy is far from the only reason for documentation in clinical care. ', "The patient's record provides the only enduring version of the care as it evolves over time and a reference work of value in emergency care, research, and quality assurance. ", 'This discussion will outline some basic principles of sound documentation with an emphasis on those aspects that serve the goals of risk management and liability prevention.\r\n\r\n', 'Basic Principles of Documentation\r\nA significant portion of risk management advice regarding documentation unfortunately boils down to the injunction, “You physicians ought to write more.” From my years in the medicolegal field, I have found that this advice not only fails to 

In [35]:
# Create the OpenAI gpt-3.5-turbo LLM and the OpenAI embedding model
import os, getpass
from llama_index.core import (
    Document,
    Settings,
    StorageContext,
    VectorStoreIndex,
    load_index_from_storage,
)
from llama_index.core.node_parser import SentenceWindowNodeParser
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI

if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter API key for OpenAI: ")


llm = OpenAI(model="gpt-3.5-turbo", temperature=0.1)
embed_model = OpenAIEmbedding()

# Register the models globally; Settings replaces the deprecated ServiceContext
Settings.llm = llm
Settings.embed_model = embed_model

# Initialize node parser
node_parser = SentenceWindowNodeParser.from_defaults(window_size=2)

In [36]:
if not os.path.exists("./sentence_window_storage"):
    # Create the vector store index, splitting the document with the sentence window parser
    index = VectorStoreIndex.from_documents(
        [document], transformations=[node_parser]
    )

    # Persist the index so later runs can reload it
    index.storage_context.persist(persist_dir="./sentence_window_storage")
else:
    # Load the persisted index if it already exists
    index = load_index_from_storage(
        StorageContext.from_defaults(persist_dir="./sentence_window_storage")
    )

Loading llama_index.core.storage.kvstore.simple_kvstore from ./sentence_window_storage\docstore.json.
Loading llama_index.core.storage.kvstore.simple_kvstore from ./sentence_window_storage\index_store.json.


2025-10-13 12:07:21,455 - INFO - Loading all indices.


In [None]:
from llama_index.core.postprocessor import MetadataReplacementPostProcessor

# Add metadata replacement post processor
postproc = MetadataReplacementPostProcessor(
    target_metadata_key="window"
)
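
Conceptually, sentence window retrieval matches on single sentences but hands the LLM a wider window of neighbouring sentences. The `MetadataReplacementPostProcessor` configured above swaps each retrieved node's text for the value stored under its `window` metadata key. The plain-Python sketch below imitates both steps under simplifying assumptions: sentences are split with a naive regular expression, and `build_sentence_windows` and `replace_with_window` are hypothetical helper names rather than LlamaIndex functions.

```python
import re
from typing import Dict, List


def build_sentence_windows(text: str, window_size: int = 2) -> List[Dict[str, str]]:
    """Create one node per sentence, storing its neighbours under the 'window' key."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    nodes = []
    for i, sentence in enumerate(sentences):
        start = max(0, i - window_size)
        end = min(len(sentences), i + window_size + 1)
        nodes.append({"text": sentence, "window": " ".join(sentences[start:end])})
    return nodes


def replace_with_window(nodes: List[Dict[str, str]]) -> List[str]:
    """Mimic metadata replacement: return the window instead of the single sentence."""
    return [node["window"] for node in nodes]


sample = "One. Two. Three. Four. Five. Six."
sample_nodes = build_sentence_windows(sample, window_size=1)

print(sample_nodes[2]["text"])                     # the sentence that would be embedded
print(replace_with_window(sample_nodes[2:3])[0])   # the text that would reach the LLM
```

Embedding single sentences keeps the match precise, while the replacement step restores enough surrounding context for the answer to be grounded.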

In [None]:
import os
from typing import Any, List, Optional

from langchain_cohere import CohereRerank
from langchain.retrievers.contextual_compression import ContextualCompressionRetriever
from langchain.docstore.document import Document as LC_Document
from langchain.schema import BaseRetriever

os.environ["CO_API_KEY"] = ""  # replace with your key

# Wrap a LlamaIndex retriever so LangChain can call it, applying an optional postprocessor
class LlamaIndexRetrieverWrapper(BaseRetriever):
    # BaseRetriever is already a pydantic model, so it is not combined with BaseModel here
    retriever: Any
    postprocessor: Optional[Any] = None  # Optional postprocessor

    def _get_relevant_documents(self, query: str, *, run_manager=None) -> List[LC_Document]:
        nodes = self.retriever.retrieve(query)

        # Apply metadata replacement if provided
        if self.postprocessor:
            nodes = self.postprocessor.postprocess_nodes(nodes)

        return [LC_Document(page_content=node.text, metadata=node.metadata) for node in nodes]

# Wrap retriever and attach postprocessor
wrapped_retriever = LlamaIndexRetrieverWrapper(
    retriever=index.as_retriever(similarity_top_k=5),
    postprocessor=postproc
)

# Rerank the retrieved windows with Cohere and keep the top 2
cohere_reranker = CohereRerank(model="rerank-english-v3.0", top_n=2)
compression_retriever = ContextualCompressionRetriever(
    base_compressor=cohere_reranker,
    base_retriever=wrapped_retriever
)

# Query
query = "Explain Sports Authority of India"
compressed_docs = compression_retriever.invoke(query)

for i, doc in enumerate(compressed_docs, start=1):
    print("*"*100)
    print(f"{i}. {doc.page_content} | metadata: {doc.metadata}")

2025-10-13 12:11:54,871 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-10-13 12:11:55,388 - INFO - HTTP Request: POST https://api.cohere.com/v2/rerank "HTTP/1.1 200 OK"


****************************************************************************************************
1. Parallelization: Unlike traditional RNNs (Recurrent Neural Networks), which process data sequentially, Transformers can process multiple words at once, making them faster and more efficient.
Versatility: Transformers are not limited to language tasks; they can be applied to any problem involving sequential data, including tasks like image recognition and time-series forecasting.
LLMs vs. Transformers: A Comparative Analysis
1. Purpose of LLMs and Transformer
LLMs: Primarily focused on generating and understanding natural language, LLMs are built on various architectures, including Transformers.
Transformers: A neural network architecture used for various tasks, including but not limited to language modeling.
2. Architecture Design
LLMs: Can be based on different architectures, but many modern LLMs utilize the Transformer architecture to achieve state-of-the-art performance.
Transform

## Auto Merge Retrieval

In [50]:
# !pip install llama-index-readers-file pymupdf
# !pip install llama-index-llms-openai

In [52]:
from pathlib import Path

from llama_index.readers.file import PDFReader
from llama_index.readers.file import PyMuPDFReader

In [51]:
loader = PyMuPDFReader()
# docs0 = loader.load_data(file=Path("./data/llama2.pdf"))
docs0 = loader.load(file_path=Path("documents/sport.txt"))

In [53]:
from llama_index.core import Document

doc_text = "\n\n".join([d.get_content() for d in docs0])
docs = [Document(text=doc_text)]

### Parse Chunk Hierarchy from Text


In [54]:
from llama_index.core.node_parser import (
    HierarchicalNodeParser,
    SentenceSplitter,
)

node_parser = HierarchicalNodeParser.from_defaults()
nodes = node_parser.get_nodes_from_documents(docs)


In [56]:
from llama_index.core.node_parser import get_leaf_nodes, get_root_nodes
leaf_nodes = get_leaf_nodes(nodes)
print(len(leaf_nodes))

root_nodes = get_root_nodes(nodes)
print(len(root_nodes))

81
4
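
The two counts above come from a tree of chunks: `HierarchicalNodeParser` splits the text at several chunk sizes, the root nodes being the largest chunks and the leaf nodes the smallest. The sketch below builds a comparable tree in plain Python. It splits on whitespace-separated words and uses tiny, hypothetical chunk sizes, whereas the real parser works on tokens and tries to respect sentence boundaries. `TreeNode`, `split_words` and `build_hierarchy` are illustrative names.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TreeNode:
    node_id: int
    text: str
    parent_id: Optional[int] = None
    child_ids: List[int] = field(default_factory=list)


def split_words(text: str, size: int) -> List[str]:
    """Split text into consecutive chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def build_hierarchy(text: str, chunk_sizes: List[int]) -> List[TreeNode]:
    """Recursively chunk the text, linking every chunk to the larger chunk it came from."""
    tree: List[TreeNode] = []

    def recurse(chunk: str, level: int, parent: Optional[TreeNode]) -> None:
        for piece in split_words(chunk, chunk_sizes[level]):
            node = TreeNode(
                node_id=len(tree),
                text=piece,
                parent_id=parent.node_id if parent else None,
            )
            tree.append(node)
            if parent is not None:
                parent.child_ids.append(node.node_id)
            if level + 1 < len(chunk_sizes):
                recurse(piece, level + 1, node)

    recurse(text, 0, None)
    return tree


sample_text = " ".join(f"w{i}" for i in range(16))  # 16 placeholder words
tree = build_hierarchy(sample_text, chunk_sizes=[8, 4, 2])

roots = [n for n in tree if n.parent_id is None]
leaves = [n for n in tree if not n.child_ids]
print(len(leaves), len(roots))
```

Only the leaves are embedded later on; the parent links are what allow several matching leaves to be swapped for their larger parent chunk at query time.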


### Load into Storage


In [57]:
# define storage context
from llama_index.core.storage.docstore import SimpleDocumentStore
from llama_index.core import StorageContext
from llama_index.llms.openai import OpenAI

docstore = SimpleDocumentStore()

# insert nodes into docstore
docstore.add_documents(nodes)

# define storage context (will include vector store by default too)
storage_context = StorageContext.from_defaults(docstore=docstore)

llm = OpenAI(model="gpt-3.5-turbo")

# Build a vector index over the leaf nodes
from llama_index.core import VectorStoreIndex

base_index = VectorStoreIndex(
    leaf_nodes,
    storage_context=storage_context,
)

2025-10-13 12:53:33,265 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


### Define Retriever


In [58]:
from llama_index.core.retrievers import AutoMergingRetriever

base_retriever = base_index.as_retriever(similarity_top_k=6)
retriever = AutoMergingRetriever(base_retriever, storage_context, verbose=True)

# query_str = "What were some lessons learned from red-teaming?"
# query_str = "Can you tell me about the key concepts for safety finetuning"
query_str = (
    "Sports Authority of India"
)

nodes = retriever.retrieve(query_str)
base_nodes = base_retriever.retrieve(query_str)

print(base_nodes)

2025-10-13 12:54:47,099 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-10-13 12:54:47,889 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


[NodeWithScore(node=TextNode(id_='8fd0c6f7-58db-4c59-9f66-588786e67691', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='7fc5fdc5-a9e5-4e54-b2e8-cc4a0c008c11', node_type='4', metadata={}, hash='6ace29ac338c28eb58c9a9dd676515eee83bed2e93bc510ad3bf1fa45a206b2f'), <NodeRelationship.PARENT: '4'>: RelatedNodeInfo(node_id='b2f92704-27f7-49f6-a780-e7f9dae89e8b', node_type='1', metadata={}, hash='d5dc1254033eb23233a7ce69b34017c6c86f7814dd14feda6ba56f7422dea5da')}, metadata_template='{key}: {value}', metadata_separator='\n', text='Sports Authority of India (SAI) has presence across\nIndia. It provides training\nfacilities to athletes right from the grass-root\nlevel through to the elite level. It is observed that\nthere is a tendency of creating sports infrastructure\nfacilities of the highest international\nstandards irrespective of the level of athlete /\ntraining for which it 

In [60]:
all_texts = [node_score.node.text for node_score in base_nodes]

# Print result
for i, text in enumerate(all_texts, 1):
    print(f"Node {i}:\n{text}\n{'-'*50}\n")

Node 1:
Sports Authority of India (SAI) has presence across
India. It provides training
facilities to athletes right from the grass-root
level through to the elite level. It is observed that
there is a tendency of creating sports infrastructure
facilities of the highest international
standards irrespective of the level of athlete /
training for which it is intended. This not only
increases the initial cost but also leads to higher
operations and maintenance cost.
2.
--------------------------------------------------

Node 2:
e-mail at infradivisionsai@gmail.com.
6. Director / SE (Infrastructure) of Sports Authority
of India has compiled the proposed
specifications for SAI Regional Centres, Academic
Institutions, STCs and SAGs after studying
the norms prescribed by respective International
Sports Federation. I sincerely hope that the
publication serves the purpose for which it has been
prepared.
INJETI SRINIVAS, IAS
 Director General
 Sports Authority of India
- 5 -
PREFACE
SAI has pres

In [61]:
from llama_index.core.response.notebook_utils import display_source_node

for node in nodes:
    display_source_node(node, source_length=10000)

2025-10-13 12:56:59,835 - INFO - generated new fontManager


**Node ID:** 8fd0c6f7-58db-4c59-9f66-588786e67691<br>**Similarity:** 0.9002276593426858<br>**Text:** Sports Authority of India (SAI) has presence across
India. It provides training
facilities to athletes right from the grass-root
level through to the elite level. It is observed that
there is a tendency of creating sports infrastructure
facilities of the highest international
standards irrespective of the level of athlete /
training for which it is intended. This not only
increases the initial cost but also leads to higher
operations and maintenance cost.
2.<br>

**Node ID:** a7dedb64-fe50-45b4-b6b3-8b9d701e256e<br>**Similarity:** 0.8695449197151273<br>**Text:** e-mail at infradivisionsai@gmail.com.
6. Director / SE (Infrastructure) of Sports Authority
of India has compiled the proposed
specifications for SAI Regional Centres, Academic
Institutions, STCs and SAGs after studying
the norms prescribed by respective International
Sports Federation. I sincerely hope that the
publication serves the purpose for which it has been
prepared.
INJETI SRINIVAS, IAS
 Director General
 Sports Authority of India
- 5 -
PREFACE
SAI has presence PAN India.<br>

**Node ID:** d98bade3-b738-456d-aae6-ca35fba69563<br>**Similarity:** 0.8287309128816476<br>**Text:** Ltd.
E-23, Milan Cinema Road, Karampura,
Opposite Karampura Post Office, New
Delhi, Delhi 110015
011 2543 0429
shivnareshsports@shivnareshsports.com
Green HF
2<br>

**Node ID:** 8c60c89a-f71c-4b42-83e6-29e0716b3136<br>**Similarity:** 0.8231431849832169<br>**Text:** Desso Sports Systems BV (NV)
Robert Ramlotstraat 89, 9200 Dendermonde
BELGIUM
Tel: +32 52 262 660
Email: pvreijen@desso.com
Sportina Exim Pvt. Ltd.<br>

**Node ID:** 2914fefa-4c60-422b-8644-5be7dbdcdc0a<br>**Similarity:** 0.820765367452457<br>**Text:** Ltd.
218 Champaklal Estate, Sion Circle, Sion,
Mumbai - 400022, Near Cinemax Cinema
+(91)-22-38566057
DD Sportilux SL
3
FIELDTURF TARKETT
2 rue de l’Egalite, 92748 Nanterre Cedex, France,
Tel: 33 1 4120 4382
E-mail:- benjamin.chardon@tarkett.com
Great Sports Infra
Flat No. 101, Plot No.52, Street Number 2,
Chikoti Gardens,<br>

**Node ID:** 99c118be-e758-40ce-aa69-70124a55248d<br>**Similarity:** 0.8199909139932822<br>**Text:** However these
additional margins are not required for SAI Training
Centres. The minimum safe area for each
standard of sportsactivities undertaken by the young
trainees at our SAI Training Centers is
much less than what is required for international
competitions.
In view of the above, it is considered necessary to
streamline the specifications to be provided
for infrastructure facilities and compile guidelines
on field of plays notified by respective
international federations for various SAI centers/
STC/SAG, all over India.<br>

In [62]:
for node in base_nodes:
    display_source_node(node, source_length=10000)

**Node ID:** 8fd0c6f7-58db-4c59-9f66-588786e67691<br>**Similarity:** 0.8989964953146579<br>**Text:** Sports Authority of India (SAI) has presence across
India. It provides training
facilities to athletes right from the grass-root
level through to the elite level. It is observed that
there is a tendency of creating sports infrastructure
facilities of the highest international
standards irrespective of the level of athlete /
training for which it is intended. This not only
increases the initial cost but also leads to higher
operations and maintenance cost.
2.<br>

**Node ID:** a7dedb64-fe50-45b4-b6b3-8b9d701e256e<br>**Similarity:** 0.8677232442396761<br>**Text:** e-mail at infradivisionsai@gmail.com.
6. Director / SE (Infrastructure) of Sports Authority
of India has compiled the proposed
specifications for SAI Regional Centres, Academic
Institutions, STCs and SAGs after studying
the norms prescribed by respective International
Sports Federation. I sincerely hope that the
publication serves the purpose for which it has been
prepared.
INJETI SRINIVAS, IAS
 Director General
 Sports Authority of India
- 5 -
PREFACE
SAI has presence PAN India.<br>

**Node ID:** d98bade3-b738-456d-aae6-ca35fba69563<br>**Similarity:** 0.8289097745301957<br>**Text:** Ltd.
E-23, Milan Cinema Road, Karampura,
Opposite Karampura Post Office, New
Delhi, Delhi 110015
011 2543 0429
shivnareshsports@shivnareshsports.com
Green HF
2<br>

**Node ID:** 8c60c89a-f71c-4b42-83e6-29e0716b3136<br>**Similarity:** 0.8241106746322506<br>**Text:** Desso Sports Systems BV (NV)
Robert Ramlotstraat 89, 9200 Dendermonde
BELGIUM
Tel: +32 52 262 660
Email: pvreijen@desso.com
Sportina Exim Pvt. Ltd.<br>

**Node ID:** 2914fefa-4c60-422b-8644-5be7dbdcdc0a<br>**Similarity:** 0.821383558223403<br>**Text:** Ltd.
218 Champaklal Estate, Sion Circle, Sion,
Mumbai - 400022, Near Cinemax Cinema
+(91)-22-38566057
DD Sportilux SL
3
FIELDTURF TARKETT
2 rue de l’Egalite, 92748 Nanterre Cedex, France,
Tel: 33 1 4120 4382
E-mail:- benjamin.chardon@tarkett.com
Great Sports Infra
Flat No. 101, Plot No.52, Street Number 2,
Chikoti Gardens,<br>

**Node ID:** 99c118be-e758-40ce-aa69-70124a55248d<br>**Similarity:** 0.817819131282206<br>**Text:** However these
additional margins are not required for SAI Training
Centres. The minimum safe area for each
standard of sportsactivities undertaken by the young
trainees at our SAI Training Centers is
much less than what is required for international
competitions.
In view of the above, it is considered necessary to
streamline the specifications to be provided
for infrastructure facilities and compile guidelines
on field of plays notified by respective
international federations for various SAI centers/
STC/SAG, all over India.<br>
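
In the two listings above, the auto-merging retriever and the base retriever return the same six node IDs, which suggests that no merge was triggered for this query. The merge rule itself can be sketched as follows: when a large enough share of a parent's children is retrieved, those children are replaced by the parent. This is a simplified, single-level illustration; the function `auto_merge`, the ids, and the `ratio_threshold` parameter are hypothetical and not the library's implementation.

```python
from collections import defaultdict
from typing import Dict, List


def auto_merge(
    retrieved: List[str],
    parent_of: Dict[str, str],
    children_of: Dict[str, List[str]],
    ratio_threshold: float = 0.5,
) -> List[str]:
    """Replace retrieved leaves by their parent when enough siblings were retrieved."""
    hits_per_parent: Dict[str, List[str]] = defaultdict(list)
    for leaf in retrieved:
        hits_per_parent[parent_of[leaf]].append(leaf)

    merged: List[str] = []
    for leaf in retrieved:
        parent = parent_of[leaf]
        ratio = len(hits_per_parent[parent]) / len(children_of[parent])
        if ratio > ratio_threshold:
            # Enough siblings matched: keep the parent once instead of each leaf
            if parent not in merged:
                merged.append(parent)
        else:
            merged.append(leaf)
    return merged


# Hypothetical two-parent hierarchy with four leaves each
children_of = {"P1": ["a", "b", "c", "d"], "P2": ["e", "f", "g", "h"]}
parent_of = {child: parent for parent, kids in children_of.items() for child in kids}

merged_ids = auto_merge(["a", "b", "c", "e"], parent_of, children_of)
print(merged_ids)
```

With six leaves retrieved out of 81, as in the run above, it is plausible that no parent had enough of its children among the hits; retrieving more leaves or using a lower threshold would make merges more likely.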

### Plug it into Query Engine

In [63]:
from llama_index.core.query_engine import RetrieverQueryEngine

query_engine = RetrieverQueryEngine.from_args(retriever)
base_query_engine = RetrieverQueryEngine.from_args(base_retriever)

response = query_engine.query(query_str)

print(str(response))

2025-10-13 13:03:54,269 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-10-13 13:03:56,715 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


Sports Authority of India provides training facilities to athletes at various levels, from grassroots to elite. The organization aims to create sports infrastructure facilities of high international standards, regardless of the athlete's level or training. This approach can lead to increased initial costs and higher operations and maintenance expenses. The Director General of Sports Authority of India has compiled proposed specifications for SAI Regional Centres, Academic Institutions, STCs, and SAGs based on the norms prescribed by respective International Sports Federations.


In [64]:
base_response = base_query_engine.query(query_str)
print(str(base_response))

2025-10-13 13:04:06,712 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-10-13 13:04:08,953 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


The Sports Authority of India provides training facilities for athletes at all levels, from grassroots to elite. They aim to create sports infrastructure facilities of high international standards, which can lead to increased initial and operational costs. The Director General of SAI has compiled proposed specifications for various SAI centers based on norms prescribed by international sports federations.


In [65]:
print(len(str(response)))
print(len(str(base_response)))

583
408
