<a href="https://mng.bz/8wdg" target="_blank">
    <img src="../../Assets/Images/NewMEAPHeader.png" alt="New MEAP" style="width: 100%;" />
</a>

# Chapter 06 - Progression of RAG Systems: Naïve to Advanced, and Modular RAG

We have familiarized ourselves with the utility of RAG, along with the development and evaluation of a basic RAG system. The basic, or Naïve, RAG approach that we have seen so far is generally inadequate for production-grade systems.

<a href="https://mng.bz/8wdg" target="_blank">
    <img src="../../Assets/Images/6.1.png" alt="Naive RAG Challenges" style="width: 100%;" />
</a>


In this chapter, we will focus on more advanced RAG concepts that make RAG viable in production. Let's begin by installing dependencies.

## Installing Dependencies

All the necessary libraries for running this notebook, along with their versions, can be found in the __requirements.txt__ file in the root directory of this repository.

You should go to the root directory and run the following command to install the libraries:

```
pip install -r requirements.txt
```

This is the recommended method of installing the dependencies.

___
Alternatively, you can run the command from this notebook too. The relative path may vary.

In [1]:
%pip install -r ../../requirements.txt --quiet

Note: you may need to restart the kernel to use updated packages.


## Advanced RAG Techniques

Advanced techniques in RAG have continued to emerge since the earliest experiments with Naïve RAG. We can discuss these techniques in three stages:
1. Pre-retrieval Stage: As the name suggests, certain interventions can be employed before the retriever comes into action. This broadly covers two aspects:
    - Index Optimization – The way documents are stored in the knowledge base.
    - Query Optimization – Optimizing the user query so that it aligns better with the retrieval and generation tasks.
2. Retrieval Stage: Certain strategies can improve the recall and precision of the retrieval process. This goes beyond the capability of the underlying retrieval algorithms that we discussed in Chapter 4.
3. Post-retrieval Stage: Once the information has been retrieved, the context can be further optimized to better align with the generation task and the downstream LLM.


<a href="https://mng.bz/8wdg" target="_blank">
    <img src="../../Assets/Images/6.2.png" alt="Naive RAG Challenges" style="width: 50%;" />
</a>

We will explore these techniques one by one.

To initialize the __OpenAI client__, we need to pass the API key.

The recommended approach is to store the API key in a .env file and load it from there.

Install the __dotenv__ library (published on PyPI as `python-dotenv`).

_The dotenv library is a popular tool used in various programming languages, including Python and Node.js, to manage environment variables in development and deployment environments. It allows developers to load environment variables from a .env file into their application's environment._

- Create a file named .env in the root directory of your project.
- Inside the .env file, define environment variables in the format VARIABLE_NAME=value.

e.g.

OPENAI_API_KEY=YOUR API KEY

In [4]:
from dotenv import load_dotenv
import os

if load_dotenv():
    print("Success: .env file found with some environment variables")
else:
    print("Caution: No environment variables found. Please create .env file in the root directory or add environment variables in the .env file")

Success: .env file found with some environment variables


In [5]:
import openai
from openai import OpenAI

# Read the key from the environment; .get avoids a KeyError when it is missing
api_key = os.environ.get("OPENAI_API_KEY")

if api_key:
    client = OpenAI()
    try:
        # A lightweight call to confirm that the key is accepted
        client.models.list()
        print("OPENAI_API_KEY is set and is valid")
    # Catch the specific errors first; openai.APIError is their base class
    except openai.APIConnectionError as e:
        print(f"Failed to connect to OpenAI API: {e}")
    except openai.RateLimitError as e:
        print(f"OpenAI API request exceeded rate limit: {e}")
    except openai.APIError as e:
        print(f"OpenAI API returned an API Error: {e}")
else:
    print("Please set your OpenAI API key as an environment variable OPENAI_API_KEY")



OPENAI_API_KEY is set and is valid


## 1. Pre-retrieval Stage

The primary objective of employing pre-retrieval techniques is to facilitate better retrieval. Retrieval failures can happen for two reasons:

- The knowledge base is not suited for retrieval.
- The retriever doesn’t completely understand the input query.

### 1.1 INDEX OPTIMIZATION

The objective of index optimization is to set up the knowledge base for better retrieval.

#### Context Enriched Chunking

This method adds a summary of the larger document to each chunk, enriching the context available to the smaller chunk.

In [35]:
# Import FAISS class from vectorstore library
from langchain_community.vectorstores import FAISS

# Import OpenAIEmbeddings from the library
from langchain_openai import OpenAIEmbeddings

# Instantiate the embeddings object
embeddings=OpenAIEmbeddings(model="text-embedding-3-large")


In [36]:
from langchain_community.document_loaders import AsyncHtmlLoader
from langchain_community.document_transformers import Html2TextTransformer

In [37]:
url="https://en.wikipedia.org/wiki/2023_Cricket_World_Cup"

In [38]:
loader = AsyncHtmlLoader(url)
data = loader.load()
html2text = Html2TextTransformer()
data_transformed = html2text.transform_documents(data)

Fetching pages: 100%|##########| 1/1 [00:00<00:00,  2.10it/s]


In [39]:
document_text=data_transformed[0].page_content

In [40]:
summary_prompt = f"Summarize the given document in a single paragraph\ndocument: {document_text}"

In [56]:
# Importing the OpenAI library
from openai import OpenAI

# Instantiate the OpenAI client
client = OpenAI()

# Make the API call passing the augmented prompt to the LLM
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": summary_prompt}
    ]
)

# Extract the answer from the response object
answer=response.choices[0].message.content

import textwrap
print(textwrap.fill(answer, width=80))

The 2023 ICC Men's Cricket World Cup, held from October 5 to November 19, 2023,
in India, was the tournament's 13th edition, featuring ten teams competing in a
round-robin and knockout format. Australia emerged victorious, claiming their
sixth title by defeating India in the final. Virat Kohli was crowned Player of
the Tournament and top scorer with 765 runs, while Mohammed Shami led in wickets
taken with 24. The tournament saw a record attendance of 1,250,307 fans, with
the final viewed by 518 million people in India. This World Cup was noted for
being the first to be solely hosted by India and included new rules for slow
over-rates and enhanced broadcast features.


In [42]:
from langchain_text_splitters import RecursiveCharacterTextSplitter
# Set the RecursiveCharacterTextSplitter parameters
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,  # Number of characters in each chunk
    chunk_overlap=200,  # Number of overlapping characters between chunks
)
# Create chunks
chunks = text_splitter.split_text(data_transformed[0].page_content)

In [43]:
context_enriched_chunks = [answer + "\n" + chunk for chunk in chunks]

In [44]:
embedding = OpenAIEmbeddings(openai_api_key=api_key)
vector_store = FAISS.from_texts(context_enriched_chunks, embedding)


In [45]:
query = "What records did Virat Kohli make?"
retrieved_docs = vector_store.similarity_search(query, k=2)


In [46]:
print(retrieved_docs[0].page_content)

The 2023 ICC Men's Cricket World Cup, hosted solely by India from October 5 to November 19, was the 13th edition of this prestigious tournament, featuring ten national teams in a One Day International format. Australia emerged victorious, claiming their sixth title by defeating India by six wickets in the final held at the Narendra Modi Stadium in Ahmedabad. Virat Kohli was named Player of the Tournament, scoring the most runs (765) and Mohammed Shami led in wickets taken (24). The tournament featured 48 matches across ten venues and set attendance and viewership records, with over 1.25 million people attending, and 518 million viewers tuning into the final match in India. The event marked a significant moment in cricket, showcasing competitive matches and new broadcasting innovations, contributing to a successful and highly viewed World Cup.
The tournament was contested by ten national teams, maintaining the same
format used in 2019. After six weeks of round-robin matches, India, Sout

---
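
One side effect of prepending the summary is visible in the output above: the retrieved text begins with the summary rather than with the chunk itself. A minimal sketch, assuming we want to embed the enriched text but pass only the original chunk to the LLM, is to keep the two side by side. The helper name `enrich_chunks` and the dictionary keys are illustrative and are not part of the notebook or of any library.

```python
def enrich_chunks(summary, chunks):
    """Pair each chunk with its summary-enriched version (hypothetical helper)."""
    return [
        {
            "text_for_embedding": f"{summary}\n{chunk}",  # what gets embedded and indexed
            "original_chunk": chunk,                      # what can be passed to the LLM later
        }
        for chunk in chunks
    ]

# Toy inputs standing in for the LLM summary and the splitter output
toy_summary = "Summary of the 2023 Cricket World Cup."
toy_chunks = ["Chunk about the final.", "Chunk about the venues."]

enriched = enrich_chunks(toy_summary, toy_chunks)
print(enriched[0]["text_for_embedding"])
```

With LangChain's FAISS wrapper, one way to carry the original chunk along would be the `metadatas` argument of `FAISS.from_texts`; whether that is worthwhile depends on how much prompt space the repeated summary takes up.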


#### Metadata Enhancement

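This method attaches structured metadata to each chunk (here, the players, teams and keywords that an LLM extracts from the chunk text). The same metadata is extracted from the query, so that retrieval can be filtered on metadata in addition to vector similarity.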
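
Before the full LLM-based implementation, the filtering idea can be sketched without any API calls: a chunk is kept only if it shares at least one player, team or keyword with the query. This is a minimal sketch under two assumptions that differ from the notebook code that follows: values are compared case-insensitively, and missing keys are tolerated. The function names `collect_values` and `metadata_overlaps` are hypothetical.

```python
def collect_values(metadata, prefix, n=5):
    """Gather non-empty values for keys like player_1..player_5, stripped and lower-cased."""
    values = set()
    for i in range(1, n + 1):
        value = metadata.get(f"{prefix}_{i}", "")  # .get tolerates keys the LLM left out
        if isinstance(value, str) and value.strip():
            values.add(value.strip().lower())
    return values

def metadata_overlaps(query_metadata, doc_metadata, prefixes=("player", "team", "keyword")):
    """Return True if query and document share at least one value in any field group."""
    return any(
        collect_values(query_metadata, p) & collect_values(doc_metadata, p)
        for p in prefixes
    )

# Toy metadata standing in for the LLM-extracted dictionaries
query_md = {"player_1": "virat kohli", "keyword_1": "records"}
doc_md = {"player_1": "Virat Kohli", "player_2": "Mohammed Shami", "team_1": "India"}
other_md = {"team_1": "England", "team_2": "New Zealand"}

print(metadata_overlaps(query_md, doc_md))
print(metadata_overlaps(query_md, other_md))
```

The first chunk passes the filter because the player matches once case is ignored; the second shares nothing with the query and is dropped. Exact string comparison, as used in the notebook cell, would treat "virat kohli" and "Virat Kohli" as different values.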

In [125]:
import faiss
from langchain_community.vectorstores import FAISS
from langchain_community.docstore.in_memory import InMemoryDocstore
from langchain_openai import OpenAIEmbeddings
from langchain_core.documents import Document
from langchain_community.document_loaders import AsyncHtmlLoader
from langchain_community.document_transformers import Html2TextTransformer
from langchain_text_splitters import RecursiveCharacterTextSplitter
from openai import OpenAI

# Initialize the OpenAI client
client = OpenAI()

# Function to extract fixed metadata using GPT-4o-mini with JSON response
def extract_fixed_metadata_from_chunk(chunk_text):
    prompt = f"""
    Extract the following fixed metadata in JSON format from the given text:
    {{
      "player_1": "",
      "player_2": "",
      "player_3": "",
      "player_4": "",
      "player_5": "",
      "team_1": "",
      "team_2": "",
      "team_3": "",
      "team_4": "",
      "team_5": "",
      "keyword_1": "",
      "keyword_2": "",
      "keyword_3": "",
      "keyword_4": "",
      "keyword_5": ""
    }}
    Here's the text:
    {chunk_text}
    """
    
    # Call GPT-4o-mini to extract structured metadata in JSON format
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        response_format={ "type": "json_object" }
    )
    
    # Extract the response in JSON format
    metadata_response = response.choices[0].message.content
    print(metadata_response)
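    try:
        # Parse the JSON string into a dictionary; json.loads is safer than eval
        # and also handles JSON literals such as true, false and null
        import json  # local import so this function does not depend on an earlier cell
        metadata = json.loads(metadata_response)
    except Exception as e:
        print(f"Error parsing metadata: {e}")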
        metadata = {
            "player_1": "", "player_2": "", "player_3": "", "player_4": "", "player_5": "",
            "team_1": "", "team_2": "", "team_3": "", "team_4": "", "team_5": "",
            "keyword_1": "", "keyword_2": "", "keyword_3": "", "keyword_4": "", "keyword_5": ""
        }
    return metadata

# Step 1: Load data from a URL (Wikipedia page)
url = "https://en.wikipedia.org/wiki/2023_Cricket_World_Cup"
loader = AsyncHtmlLoader(url)
data = loader.load()

# Step 2: Transform the HTML content to plain text
html2text = Html2TextTransformer()
data_transformed = html2text.transform_documents(data)

# Step 3: Split the text into smaller chunks using RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=10000,  # Number of characters in each chunk
    chunk_overlap=200  # Number of overlapping characters between chunks
)
chunks = text_splitter.split_text(data_transformed[0].page_content)

# Step 4: Initialize OpenAI Embeddings model
embedding_model = OpenAIEmbeddings(model="text-embedding-ada-002")

# Step 5: Initialize FAISS index for L2 (Euclidean) distance
embedding_dim = len(embedding_model.embed_query("hello world"))
index = faiss.IndexFlatL2(embedding_dim)

# Step 6: Initialize the InMemoryDocstore to store documents and metadata in memory
docstore = InMemoryDocstore()

# Step 7: Create FAISS vector store using the embedding function, FAISS index, and docstore
vector_store = FAISS(
    embedding_function=embedding_model,
    index=index,
    docstore=docstore,
    index_to_docstore_id={}
)

# Step 8: Add chunks (documents) with extracted metadata and embeddings to FAISS vector store
documents = []
for i, chunk in enumerate(chunks):
    # Extract fixed metadata using the LLM
    extracted_metadata = extract_fixed_metadata_from_chunk(chunk)
    
    # Create a document object with both the chunk content and the extracted metadata
    document = Document(
        page_content=chunk, 
        metadata={
            "source": url, 
            "category": "cricket world cup",
            "extracted_metadata": extracted_metadata  # Store the structured metadata
        }
    )
    
    # Append the document to the list
    documents.append(document)

# Create unique IDs for each chunk
ids = [f"chunk_{i}" for i in range(len(chunks))]

# Add the documents and their embeddings to the FAISS vector store
vector_store.add_documents(documents=documents, ids=ids)

# Step 9: Define a function to extract metadata from a query
def extract_fixed_metadata_from_query(query_text):
    prompt = f"""
    Extract the following fixed metadata in JSON format from the query:
    {{
      "player_1": "",
      "player_2": "",
      "player_3": "",
      "player_4": "",
      "player_5": "",
      "team_1": "",
      "team_2": "",
      "team_3": "",
      "team_4": "",
      "team_5": "",
      "keyword_1": "",
      "keyword_2": "",
      "keyword_3": "",
      "keyword_4": "",
      "keyword_5": ""
    }}
    Here's the query:
    {query_text}
    """
    
    # Call GPT-4o-mini to extract structured metadata from the query
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        response_format={ "type": "json_object" }
    )
    
    # Extract the response in JSON format
    metadata_response = response.choices[0].message.content
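    try:
        # Parse the JSON string into a dictionary; json.loads is safer than eval
        # and also handles JSON literals such as true, false and null
        import json  # local import so this function does not depend on an earlier cell
        metadata = json.loads(metadata_response)
    except Exception as e:
        print(f"Error parsing metadata: {e}")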
        metadata = {
            "player_1": "", "player_2": "", "player_3": "", "player_4": "", "player_5": "",
            "team_1": "", "team_2": "", "team_3": "", "team_4": "", "team_5": "",
            "keyword_1": "", "keyword_2": "", "keyword_3": "", "keyword_4": "", "keyword_5": ""
        }
    return metadata

# Step 10: Extract metadata from the query
query = "Virat Kohli records in 2023 Cricket World Cup"
query_metadata = extract_fixed_metadata_from_query(query)

# Step 11: Define a metadata filter based on the query's extracted metadata
def metadata_filter(doc_metadata):
    query_players = {query_metadata[f"player_{i}"] for i in range(1, 6) if query_metadata[f"player_{i}"]}
    query_teams = {query_metadata[f"team_{i}"] for i in range(1, 6) if query_metadata[f"team_{i}"]}
    query_keywords = {query_metadata[f"keyword_{i}"] for i in range(1, 6) if query_metadata[f"keyword_{i}"]}
    doc_players = {doc_metadata["extracted_metadata"][f"player_{i}"] for i in range(1, 6) if doc_metadata["extracted_metadata"][f"player_{i}"]}
    doc_teams = {doc_metadata["extracted_metadata"][f"team_{i}"] for i in range(1, 6) if doc_metadata["extracted_metadata"][f"team_{i}"]}
    doc_keywords = {doc_metadata["extracted_metadata"][f"keyword_{i}"] for i in range(1, 6) if doc_metadata["extracted_metadata"][f"keyword_{i}"]}
    
    # Check if there's any overlap between the query metadata and document metadata
    return bool(query_players & doc_players or query_teams & doc_teams or query_keywords & doc_keywords)

# Step 12: Perform a similarity search on the stored chunks with the metadata filter
results = vector_store.similarity_search(query=query, k=3, filter=metadata_filter)

# Step 13: Display the results with metadata
for doc in results:
    print(f"Document: {doc.page_content}")
    print(f"Metadata: {doc.metadata}")


Fetching pages: 100%|##########| 1/1 [00:00<00:00,  1.45it/s]


{
  "player_1": "Virat Kohli",
  "player_2": "Mohammed Shami",
  "player_3": "Wesley Barresi",
  "player_4": "Noor Ahmad",
  "player_5": "",
  "team_1": "India",
  "team_2": "Australia",
  "team_3": "New Zealand",
  "team_4": "South Africa",
  "team_5": "Afghanistan",
  "keyword_1": "ICC",
  "keyword_2": "Cricket World Cup",
  "keyword_3": "One Day International",
  "keyword_4": "2023",
  "keyword_5": "hosts"
}
{
  "player_1": "Pathum Nissanka",
  "player_2": "Tanzid Hasan",
  "player_3": "Mohammad Rizwan",
  "player_4": "Rachin Ravindra",
  "player_5": "Kusal Mendis",
  "team_1": "Sri Lanka",
  "team_2": "Bangladesh",
  "team_3": "Pakistan",
  "team_4": "New Zealand",
  "team_5": "Afghanistan",
  "keyword_1": "Warm-up matches",
  "keyword_2": "2023",
  "keyword_3": "Cricket World Cup",
  "keyword_4": "Hyderabad",
  "keyword_5": "Guwahati"
}
{
  "player_1": "",
  "player_2": "",
  "player_3": "",
  "player_4": "",
  "player_5": "",
  "team_1": "England",
  "team_2": "New Zealand",
  "t

### 1.2 QUERY OPTIMIZATION

The objective of this stage is to optimize the user query so that it is better suited to the retrieval task.

#### Query Expansion

In query expansion, the original user query is enriched with the aim of retrieving more relevant information. This helps in increasing the recall of the system and overcomes the challenge of incomplete or very brief user queries.

In [16]:
original_query="How does climate change affect polar bears?"
num=5

In [83]:
response_structure='''
{
    "queries": [
        {
            "query": "query"
        },
        ...
    ]
}
'''

In [84]:
expansion_prompt=f"Generate {num} variations of the following query: {original_query}. Respond in JSON format.\nStick to this structure:\n{response_structure}"

In [18]:
step_back_expansion_prompt = f"Given the query: '{original_query}', generate a more abstract, higher-level conceptual query."

In [19]:
sub_query_expansion_prompt=f"Break down the following query into {num} sub-queries targeting different aspects of the query: '{original_query}'. Respond in JSON format."


In [85]:
# Importing the OpenAI library
from openai import OpenAI

# Instantiate the OpenAI client
client = OpenAI()

# Make the API call passing the augmented prompt to the LLM
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": expansion_prompt}
    ],
    response_format={ "type": "json_object" }
)

# Extract the answer from the response object
answer=response.choices[0].message.content

In [86]:
print(answer)

{
    "queries": [
        {
            "query": "What is the impact of climate change on polar bear populations?"
        },
        {
            "query": "In what ways does climate change threaten polar bears?"
        },
        {
            "query": "How are polar bears affected by changing climate conditions?"
        },
        {
            "query": "What are the consequences of climate change for polar bear habitats?"
        },
        {
            "query": "How does global warming influence the survival of polar bears?"
        }
    ]
}


In [25]:


# Make the API call passing the augmented prompt to the LLM
response = client.chat.completions.create(
  model="gpt-4o-mini",
  messages=[
    {"role": "user", "content": step_back_expansion_prompt}
  ]
)

# Extract the answer from the response object
answer=response.choices[0].message.content

In [26]:
print(answer)

'What are the broader ecological impacts of climate change on Arctic biodiversity?'


In [27]:

# Make the API call passing the augmented prompt to the LLM
response = client.chat.completions.create(
  model="gpt-4o-mini",
  messages=[
    {"role": "user", "content": sub_query_expansion_prompt}
  ],
  response_format={ "type": "json_object" }
)

# Extract the answer from the response object
answer=response.choices[0].message.content

In [28]:
print(answer)

{
  "sub_queries": [
    {
      "id": 1,
      "aspect": "Habitat Loss",
      "query": "What specific changes in sea ice patterns are impacting polar bear habitats due to climate change?"
    },
    {
      "id": 2,
      "aspect": "Food Availability",
      "query": "How does climate change affect the availability of seals, the primary food source for polar bears?"
    },
    {
      "id": 3,
      "aspect": "Reproductive Challenges",
      "query": "In what ways does climate change influence the reproductive success and behavior of polar bears?"
    },
    {
      "id": 4,
      "aspect": "Health Issues",
      "query": "What health issues are polar bears facing as a result of climate change and altered environmental conditions?"
    },
    {
      "id": 5,
      "aspect": "Conservation Efforts",
      "query": "What conservation strategies are being implemented to protect polar bears from the effects of climate change?"
    }
  ]
}
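
The expanded queries only improve recall once each of them is actually sent to the retriever and the results are merged. A minimal sketch of that step follows, assuming the LLM response is a JSON object whose values are lists of items with a `"query"` field, as in the outputs above. The word-overlap `toy_retrieve` function and the three-sentence corpus are stand-ins for a real retriever and knowledge base, and the helper names are hypothetical.

```python
import json

def parse_queries(llm_json):
    """Pull every "query" string out of the response, whatever the top-level key is called."""
    payload = json.loads(llm_json)
    queries = []
    for items in payload.values():
        if isinstance(items, list):
            queries.extend(
                item["query"] for item in items
                if isinstance(item, dict) and "query" in item
            )
    return queries

def retrieve_for_all(queries, retrieve, k=2):
    """Call retrieve(query, k) for each query and merge results, dropping duplicates in order."""
    seen, merged = set(), []
    for q in queries:
        for doc in retrieve(q, k):
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged

# Toy corpus and retriever: rank documents by the number of words shared with the query
corpus = [
    "sea ice loss reduces polar bear hunting grounds",
    "seals are the primary food source of polar bears",
    "conservation strategies protect arctic habitats",
]

def toy_retrieve(query, k):
    words = set(query.lower().split())
    ranked = sorted(corpus, key=lambda d: len(words & set(d.split())), reverse=True)
    return ranked[:k]

sample = '{"queries": [{"query": "polar bear sea ice"}, {"query": "polar bear food seals"}]}'
expanded = parse_queries(sample)
merged_docs = retrieve_for_all(expanded, toy_retrieve, k=1)
print(merged_docs)
```

With `k=1`, each query on its own returns a single document, while the merged result contains both, which is the recall gain that query expansion is after.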


#### Query Transformation

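Unlike query expansion, query transformation replaces the original user query: retrieval happens on a transformed query that is better suited to the retriever. The example here generates a hypothetical answer to the question and embeds that answer instead of the question itself.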

In [29]:
original_query="How does climate change affect polar bears?"

In [30]:
system_prompt="You are an expert in climate change and arctic life."
hyde_prompt=f"Generate an answer to the question: {original_query}"

In [31]:

# Make the API call passing the augmented prompt to the LLM
response = client.chat.completions.create(
  model="gpt-4o-mini",
  messages=[
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": hyde_prompt}
  ]
)

# Extract the answer from the response object
answer=response.choices[0].message.content

In [32]:
print(answer)

Climate change significantly affects polar bears, primarily through its impact on their sea ice habitat. As global temperatures rise, Arctic sea ice is melting at an alarming rate, which is critical for polar bears for several reasons:

1. **Habitat Loss**: Polar bears rely on sea ice as a platform for hunting seals, their primary food source. As the ice melts, bears have to travel greater distances to find food, which can lead to malnutrition or starvation.

2. **Breeding and Denning**: Female polar bears typically build maternity dens on stable sea ice or on land where conditions are suitable. Warmer temperatures and irregular ice patterns can disrupt the timing of den construction and affect cub survival rates.

3. **Increased Competition**: As ice diminishes, polar bears may encounter increased competition for food, particularly in areas where their ranges overlap with other bear populations or where seals are less available.

4. **Health and Reproduction**: The stress of longer fa

In [33]:
# Initialize the OpenAIEmbeddings model
embeddings = OpenAIEmbeddings(model="text-embedding-3-large")

# Create embedding for the hypothetical answer
hyde_embedding = embeddings.embed_query(answer)

# Check and print the dimension of the embedding
embedding_dimension = len(hyde_embedding)
print(f"The embedding dimension is: {embedding_dimension}")

# Optionally, you can add error handling
if embedding_dimension != 3072:  # 3072 is the expected dimension for text-embedding-3-large
    print(f"Warning: Unexpected embedding dimension. Expected 3072, got {embedding_dimension}")

The embedding dimension is: 3072
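
The embedding of the hypothetical answer is then used in place of the query embedding when searching the vector store. The sketch below illustrates that step with toy three-dimensional vectors and a plain cosine similarity, so it runs without any API calls; the vectors and the `search_by_vector` helper are illustrative assumptions. In the notebook, the equivalent would likely be a call such as `vector_store.similarity_search_by_vector(hyde_embedding, k=2)` on the LangChain FAISS store, provided the store was built with the same embedding model as `hyde_embedding`.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search_by_vector(query_vector, doc_vectors, k=2):
    """Return the indices of the k document vectors most similar to query_vector."""
    scored = [(cosine_similarity(query_vector, v), i) for i, v in enumerate(doc_vectors)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]

# Toy vectors standing in for real 3072-dimensional embeddings
toy_doc_vectors = [
    [0.9, 0.1, 0.0],  # chunk about sea ice loss
    [0.0, 1.0, 0.0],  # unrelated chunk
    [0.7, 0.0, 0.7],  # chunk about seal hunting
]
toy_hyde_vector = [1.0, 0.0, 0.2]  # stands in for hyde_embedding

top_indices = search_by_vector(toy_hyde_vector, toy_doc_vectors, k=2)
print(top_indices)
```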


## 2. Retrieval Strategies

Interventions in the pre-retrieval stage can bring significant improvements in the performance of the RAG system if the query and the knowledge base become well aligned with the retrieval algorithm.

#### Hybrid Retrieval

In [137]:
import faiss
from langchain_community.vectorstores import FAISS
from langchain_community.docstore.in_memory import InMemoryDocstore
from langchain_openai import OpenAIEmbeddings
from langchain_core.documents import Document
from langchain_community.document_loaders import AsyncHtmlLoader
from langchain_community.document_transformers import Html2TextTransformer
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.retrievers import BM25Retriever

# Step 1: Load data from a URL (Wikipedia page)
url = "https://en.wikipedia.org/wiki/2023_Cricket_World_Cup"
loader = AsyncHtmlLoader(url)
data = loader.load()

# Step 2: Transform the HTML content to plain text
html2text = Html2TextTransformer()
data_transformed = html2text.transform_documents(data)

# Step 3: Split the text into smaller chunks using RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,  # Number of characters in each chunk
    chunk_overlap=200  # Number of overlapping characters between chunks
)
chunks = text_splitter.split_text(data_transformed[0].page_content)

# Step 4: Dense Retrieval (FAISS + OpenAI Embeddings)

# Initialize OpenAI Embeddings model for dense retrieval
embedding_model = OpenAIEmbeddings(model="text-embedding-ada-002")

# Initialize FAISS index for dense retrieval
embedding_dim = len(embedding_model.embed_query("hello world"))
index = faiss.IndexFlatL2(embedding_dim)

# Create an in-memory document store to support adding documents
docstore = InMemoryDocstore()

# Initialize FAISS vector store
vector_store = FAISS(embedding_function=embedding_model, index=index, docstore=docstore, index_to_docstore_id={})

# Add chunks to FAISS vector store
documents = [Document(page_content=chunk) for chunk in chunks]
vector_store.add_documents(documents)

# Step 5: Sparse Retrieval (BM25 using LangChain's BM25Retriever)

# Initialize BM25Retriever
bm25_retriever = BM25Retriever.from_documents(documents)

# Step 6: Hybrid Retrieval Strategy

def hybrid_search(query, k=5):
    # Step 6.1: Perform dense retrieval using FAISS
    dense_results = vector_store.similarity_search(query=query, k=k)
    
    # Step 6.2: Perform sparse retrieval using BM25Retriever
    # (invoke replaces the deprecated get_relevant_documents method)
    sparse_results = bm25_retriever.invoke(query)
    
    # Limit sparse results to top-k
    sparse_results = sparse_results[:k]
    
    # Step 6.3: Combine dense and sparse results
    combined_results = []
    for dense_doc in dense_results:
        combined_results.append(("dense", dense_doc.page_content))

    for sparse_doc in sparse_results:
        combined_results.append(("sparse", sparse_doc.page_content))

    # Optionally, re-rank or further process combined results
    return combined_results

# Step 7: Perform a hybrid search
query = "Virat Kohli records in 2023 Cricket World Cup"
hybrid_results = hybrid_search(query)

# Step 8: Display the results
for retrieval_type, result in hybrid_results:
    print(f"Retrieval Type: {retrieval_type}")
    print(f"Result: {result}\n")


Fetching pages: 100%|##########| 1/1 [00:00<00:00,  1.65it/s]


Retrieval Type: dense
Result: 35. **^** "It's official! India set up 2023 World Cup semi-final against New Zealand in 2019 rematch; Pakistan knocked out". _Hindustan Times_. 11 November 2023. Archived from the original on 14 November 2023. Retrieved 12 November 2023.
  36. **^** "2023 World Cup Cricket Batting Records & Stats runs". _ESPNcricinfo_. Archived from the original on 18 October 2023. Retrieved 19 October 2023.

Retrieval Type: dense
Result: 37. **^** "2023 World Cup Cricket bowling Records & Stats wickets". _ESPNcricinfo_. Archived from the original on 9 October 2023. Retrieved 10 October 2023.
  38. **^** "India star named Player of the Tournament at ICC Men's Cricket World Cup". _Cricket World Cup_. Archived from the original on 19 November 2023. Retrieved 19 November 2023.

Retrieval Type: dense
Result: Main article: 2023 Cricket World Cup final

19 November 2023  
14:00 (D/N)  
Scorecard  
---  
**India **  
240 (50 overs) | **v** | **Australia**  
241/4 (43 overs)  
---
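
The `hybrid_search` function above concatenates the dense and sparse results and leaves re-ranking as an optional step. One common way to fuse the two lists into a single ranking is Reciprocal Rank Fusion (RRF), which is not part of the notebook and is sketched here as one possible option. Each document receives a score of 1 / (k + rank) from every list it appears in, and the scores are summed. The constant `k=60` is a commonly used default and is an assumption, as are the toy rankings.

```python
def reciprocal_rank_fusion(ranked_lists, k=60, top_n=5):
    """Fuse several ranked lists of document texts into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1.
    """
    scores = {}
    for ranked in ranked_lists:
        for rank, doc in enumerate(ranked, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    fused = sorted(scores, key=scores.get, reverse=True)
    return fused[:top_n]

# Toy rankings standing in for the dense (FAISS) and sparse (BM25) results
dense_ranking = ["chunk A", "chunk B", "chunk C"]
sparse_ranking = ["chunk B", "chunk D", "chunk A"]

fused_ranking = reciprocal_rank_fusion([dense_ranking, sparse_ranking])
print(fused_ranking)
```

Documents that appear in both lists rise to the top, and no score normalization between the dense and sparse retrievers is needed because only ranks are used.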

## 3. Post Retrieval Stage

At the post-retrieval stage, reranking and compression help provide better context to the LLM for generation.
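
Reranking re-orders the retrieved documents with a second, usually more precise, relevance score before they are passed to the LLM. The sketch below shows only the mechanics. The `overlap_score` function is a deliberately crude stand-in for a real reranking model such as a cross-encoder, and both function names and the sample documents are hypothetical.

```python
def overlap_score(query, document):
    """Fraction of query words that appear in the document (a crude relevance proxy)."""
    query_words = set(query.lower().split())
    doc_words = set(document.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & doc_words) / len(query_words)

def rerank(query, documents, score_fn=overlap_score, top_n=2):
    """Re-order retrieved documents by score_fn(query, document), highest first."""
    ranked = sorted(documents, key=lambda d: score_fn(query, d), reverse=True)
    return ranked[:top_n]

# Toy retrieval results in their original order
retrieved = [
    "References and archived links for the tournament",
    "Virat Kohli scored the most runs in the tournament",
    "The final was played in Ahmedabad",
]

reranked = rerank("Virat Kohli runs", retrieved)
print(reranked)
```

Swapping in a stronger scorer only requires passing a different `score_fn`; the rest of the pipeline stays the same.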

#### Compression

In prompt compression, language models are used to detect and remove unimportant and irrelevant tokens.

In [49]:
document_to_compress=retrieved_docs[0].page_content

In [68]:
compress_prompt = f"Compress the following document into very short sentences, retaining only the extremely essential information:\n\n{document_to_compress}"

In [69]:
# Make the API call passing the augmented prompt to the LLM
response = client.chat.completions.create(
  model="gpt-4o-mini",
  messages=[
    {"role": "user", "content": compress_prompt}
  ]
)

# Extract the answer from the response object
answer=response.choices[0].message.content

In [70]:
print(textwrap.fill(answer, width=80))

The 2023 ICC Men's Cricket World Cup took place in India from October 5 to
November 19. Australia won, defeating India by six wickets in the final. Virat
Kohli was Player of the Tournament with 765 runs; Mohammed Shami led with 24
wickets. The event included 48 matches at ten venues with over 1.25 million
attendees and 518 million viewers for the final, setting records. The top four
teams were India, South Africa, Australia, and New Zealand. Australia claimed
their sixth title.


In [71]:
print(textwrap.fill(document_to_compress, width=80))

The 2023 ICC Men's Cricket World Cup, hosted solely by India from October 5 to
November 19, was the 13th edition of this prestigious tournament, featuring ten
national teams in a One Day International format. Australia emerged victorious,
claiming their sixth title by defeating India by six wickets in the final held
at the Narendra Modi Stadium in Ahmedabad. Virat Kohli was named Player of the
Tournament, scoring the most runs (765) and Mohammed Shami led in wickets taken
(24). The tournament featured 48 matches across ten venues and set attendance
and viewership records, with over 1.25 million people attending, and 518 million
viewers tuning into the final match in India. The event marked a significant
moment in cricket, showcasing competitive matches and new broadcasting
innovations, contributing to a successful and highly viewed World Cup. The
tournament was contested by ten national teams, maintaining the same format used
in 2019. After six weeks of round-robin matches, India, Sout

---

<img src="../../Assets/Images/profile_s.png" width=100> 

Hi! I'm Abhinav! I am an entrepreneur and Vice President of Artificial Intelligence at Yarnit. I have spent over 15 years in consulting and leadership roles in data science, machine learning and AI. My current focus is on applied Generative AI, solving enterprise needs through contextual intelligence. I'm passionate about AI advancements and constantly explore emerging technologies to push the boundaries and create positive impacts in the world. Let’s build the future, together!

[If you haven't already, please subscribe to the MEAP of A Simple Guide to Retrieval Augmented Generation here](https://mng.bz/8wdg)

<a href="https://mng.bz/8wdg" target="_blank">
    <img src="../../Assets/Images/NewMEAPFooter.png" alt="New MEAP" style="width: 100%;" />
</a>

#### If you'd like to chat, I'd be very happy to connect

[![GitHub followers](https://img.shields.io/badge/Github-000000?style=for-the-badge&logo=github&logoColor=black&color=orange)](https://github.com/abhinav-kimothi)
[![LinkedIn](https://img.shields.io/badge/LinkedIn-000000?style=for-the-badge&logo=linkedin&logoColor=orange&color=black)](https://www.linkedin.com/comm/mynetwork/discovery-see-all?usecase=PEOPLE_FOLLOWS&followMember=abhinav-kimothi)
[![Medium](https://img.shields.io/badge/Medium-000000?style=for-the-badge&logo=medium&logoColor=black&color=orange)](https://medium.com/@abhinavkimothi)
[![Insta](https://img.shields.io/badge/Instagram-000000?style=for-the-badge&logo=instagram&logoColor=orange&color=black)](https://www.instagram.com/akaiworks/)
[![Mail](https://img.shields.io/badge/email-000000?style=for-the-badge&logo=gmail&logoColor=black&color=orange)](mailto:abhinav.kimothi.ds@gmail.com)
[![X](https://img.shields.io/badge/Follow-000000?style=for-the-badge&logo=X&logoColor=orange&color=black)](https://twitter.com/abhinav_kimothi)
[![Linktree](https://img.shields.io/badge/Linktree-000000?style=for-the-badge&logo=linktree&logoColor=black&color=orange)](https://linktr.ee/abhinavkimothi)
[![Gumroad](https://img.shields.io/badge/Gumroad-000000?style=for-the-badge&logo=gumroad&logoColor=orange&color=black)](https://abhinavkimothi.gumroad.com/)

---