<a href="https://colab.research.google.com/github/saikumarpochireddygari/Denis-Sai-RAG-Driven-Generative-AI/blob/main/Chapter01/RAG_Overview.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Introducing Naive, Advanced, and Modular RAG

Copyright 2024, Denis Rothman

This notebook introduces Naïve, Advanced, and Modular RAG through basic educational examples.

The Naïve, Advanced, and Modular RAG techniques offer flexibility in selecting retrieval strategies, allowing adaptation to various tasks and data characteristics.

**Summary**

**Part 1: Foundations and Basic Implementation**

1.Environment setup for OpenAI API integration  
2.Generator function using GPT models    
3.Data setup with a list of documents (db_records)  
4.Query (user request)  

**Part 2: Advanced Techniques and Evaluation**

1.Retrieval metrics  
2.Naive RAG  
3.Advanced RAG  
4.Modular RAG Retriever  

# Part 1: Foundations and Basic Implementation

# 1.The Environment

In [13]:
!pip install --upgrade "openai==0.28.0"




In [14]:
#API Key
#Store your key in a file and read it (you can type it directly in the notebook, but it will be visible to anyone next to you)
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [15]:
# Read the key from the first line of the file; the context manager closes the file
with open("/content/drive/MyDrive/api_key.txt", "r") as f:
    API_KEY = f.readline().strip()

#The OpenAI Key
import os
import openai
os.environ['OPENAI_API_KEY'] =API_KEY
openai.api_key = os.getenv("OPENAI_API_KEY")
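
A minimal sketch of an alternative way to load the key, assuming you may also run the notebook outside Colab: check the `OPENAI_API_KEY` environment variable first and fall back to a key file. The helper name `load_api_key` and its parameters are hypothetical; they are not part of this notebook or of the OpenAI library.

```python
import os
from pathlib import Path

def load_api_key(key_path, env_var="OPENAI_API_KEY"):
    """Return the key from the environment if set, else from the first line of key_path.

    Hypothetical helper for illustration only.
    """
    key = os.environ.get(env_var, "").strip()
    if key:
        return key
    path = Path(key_path)
    if not path.is_file():
        raise FileNotFoundError(f"No key in ${env_var} and no file at {path}")
    # Only the first line is read, matching the cell above
    with path.open("r") as f:
        return f.readline().strip()
```

With this helper, the assignment above could be written as `openai.api_key = load_api_key("/content/drive/MyDrive/api_key.txt")`.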

# 2.The Generator


In [16]:
# import openai
# from openai import OpenAI

# client = OpenAI()
# gptmodel="gpt-4o"

# def call_llm_with_full_text(itext):
#     # Join all lines to form a single string
#     text_input = '\n'.join(itext)
#     prompt = f"Please elaborate on the following content:\n{text_input}"

#     try:
#       response = client.chat.completions.create(
#          model=gptmodel,
#          messages=[
#             {"role": "system", "content": "You are an expert Natural Language Processing exercise expert."},
#             {"role": "assistant", "content": "1.You can explain read the input and answer in detail"},
#             {"role": "user", "content": prompt}
#          ],
#          temperature=0.1  # Add the temperature parameter here and other parameters you need
#         )
#       return response.choices[0].message.content.strip()
#     except Exception as e:
#         return str(e)

TypeError: Client.__init__() got an unexpected keyword argument 'proxies'

In [36]:
import openai

#openai.api_key = "YOUR_API_KEY"  # Make sure your API key is set

gptmodel = "gpt-4o"  # or "gpt-3.5-turbo", etc.

def call_llm_with_full_text(itext):
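    # Accept a single string or a list of lines; joining a plain string
    # would insert a newline between every character
    text_input = itext if isinstance(itext, str) else '\n'.join(itext)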
    prompt = f"Please elaborate on the following content:\n{text_input}"

    try:
        response = openai.ChatCompletion.create(
            model=gptmodel,
            messages=[
                {"role": "system", "content": "You are an expert Natural Language Processing exercise expert."},
                {"role": "assistant", "content": "1. You can explain, read the input, and answer in detail."},
                {"role": "user", "content": prompt}
            ],
            temperature=0.1
        )
        return response.choices[0].message.content.strip()
    except Exception as e:
        return str(e)

## Formatted response

In [17]:
import textwrap

def print_formatted_response(response):
    # Define the width for wrapping the text
    wrapper = textwrap.TextWrapper(width=80)  # Set to 80 columns wide, but adjust as needed
    wrapped_text = wrapper.fill(text=response)

    # Print the formatted response with a header and footer
    print("Response:")
    print("---------------")
    print(wrapped_text)
    print("---------------\n")
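
`TextWrapper.fill` treats the whole response as one paragraph: by default it replaces every newline with a space, which is why the Markdown headings and list items in the responses printed in this notebook run together. Below is a minimal sketch of a variant that wraps each line separately and keeps the original line breaks. The names `wrap_keep_breaks` and `print_formatted_response_keep_breaks` are hypothetical.

```python
import textwrap

def wrap_keep_breaks(response, width=80):
    """Wrap each line of the response on its own so existing newlines survive."""
    wrapped_lines = [textwrap.fill(line, width=width) for line in response.split("\n")]
    return "\n".join(wrapped_lines)

def print_formatted_response_keep_breaks(response, width=80):
    # Same header and footer as print_formatted_response above
    print("Response:")
    print("---------------")
    print(wrap_keep_breaks(response, width))
    print("---------------\n")
```

The trade-off is that a response whose source lines are already hard-wrapped keeps those breaks too, so the output may look more ragged than with a single `fill` call.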

# 3.The Data

In [18]:
db_records = [
    "Retrieval Augmented Generation (RAG) represents a sophisticated hybrid approach in the field of artificial intelligence, particularly within the realm of natural language processing (NLP).",
    "It innovatively combines the capabilities of neural network-based language models with retrieval systems to enhance the generation of text, making it more accurate, informative, and contextually relevant.",
    "This methodology leverages the strengths of both generative and retrieval architectures to tackle complex tasks that require not only linguistic fluency but also factual correctness and depth of knowledge.",
    "At the core of Retrieval Augmented Generation (RAG) is a generative model, typically a transformer-based neural network, similar to those used in models like GPT (Generative Pre-trained Transformer) or BERT (Bidirectional Encoder Representations from Transformers).",
    "This component is responsible for producing coherent and contextually appropriate language outputs based on a mixture of input prompts and additional information fetched by the retrieval component.",
    "Complementing the language model is the retrieval system, which is usually built on a database of documents or a corpus of texts.",
    "This system uses techniques from information retrieval to find and fetch documents that are relevant to the input query or prompt.",
    "The mechanism of relevance determination can range from simple keyword matching to more complex semantic search algorithms which interpret the meaning behind the query to find the best matches.",
    "This component merges the outputs from the language model and the retrieval system.",
    "It effectively synthesizes the raw data fetched by the retrieval system into the generative process of the language model.",
    "The integrator ensures that the information from the retrieval system is seamlessly incorporated into the final text output, enhancing the model's ability to generate responses that are not only fluent and grammatically correct but also rich in factual details and context-specific nuances.",
    "When a query or prompt is received, the system first processes it to understand the requirement or the context.",
    "Based on the processed query, the retrieval system searches through its database to find relevant documents or information snippets.",
    "This retrieval is guided by the similarity of content in the documents to the query, which can be determined through various techniques like vector embeddings or semantic similarity measures.",
    "The retrieved documents are then fed into the language model.",
    "In some implementations, this integration happens at the token level, where the model can access and incorporate specific pieces of information from the retrieved texts dynamically as it generates each part of the response.",
    "The language model, now augmented with direct access to retrieved information, generates a response.",
    "This response is not only influenced by the training of the model but also by the specific facts and details contained in the retrieved documents, making it more tailored and accurate.",
    "By directly incorporating information from external sources, Retrieval Augmented Generation (RAG) models can produce responses that are more factual and relevant to the given query.",
    "This is particularly useful in domains like medical advice, technical support, and other areas where precision and up-to-date knowledge are crucial.",
    "Retrieval Augmented Generation (RAG) systems can dynamically adapt to new information since they retrieve data in real-time from their databases.",
    "This allows them to remain current with the latest knowledge and trends without needing frequent retraining.",
    "With access to a wide range of documents, Retrieval Augmented Generation (RAG) systems can provide detailed and nuanced answers that a standalone language model might not be capable of generating based solely on its pre-trained knowledge.",
    "While Retrieval Augmented Generation (RAG) offers substantial benefits, it also comes with its challenges.",
    "These include the complexity of integrating retrieval and generation systems, the computational overhead associated with real-time data retrieval, and the need for maintaining a large, up-to-date, and high-quality database of retrievable texts.",
    "Furthermore, ensuring the relevance and accuracy of the retrieved information remains a significant challenge, as does managing the potential for introducing biases or errors from the external sources.",
    "In summary, Retrieval Augmented Generation represents a significant advancement in the field of artificial intelligence, merging the best of retrieval-based and generative technologies to create systems that not only understand and generate natural language but also deeply comprehend and utilize the vast amounts of information available in textual form.",
    "A RAG vector store is a database or dataset that contains vectorized data points."
]

In [19]:
' '.join(db_records)

"Retrieval Augmented Generation (RAG) represents a sophisticated hybrid approach in the field of artificial intelligence, particularly within the realm of natural language processing (NLP). It innovatively combines the capabilities of neural network-based language models with retrieval systems to enhance the generation of text, making it more accurate, informative, and contextually relevant. This methodology leverages the strengths of both generative and retrieval architectures to tackle complex tasks that require not only linguistic fluency but also factual correctness and depth of knowledge. At the core of Retrieval Augmented Generation (RAG) is a generative model, typically a transformer-based neural network, similar to those used in models like GPT (Generative Pre-trained Transformer) or BERT (Bidirectional Encoder Representations from Transformers). This component is responsible for producing coherent and contextually appropriate language outputs based on a mixture of input prompt

In [20]:
import textwrap
paragraph = ' '.join(db_records)
wrapped_text = textwrap.fill(paragraph, width=80)
print(wrapped_text)

Retrieval Augmented Generation (RAG) represents a sophisticated hybrid approach
in the field of artificial intelligence, particularly within the realm of
natural language processing (NLP). It innovatively combines the capabilities of
neural network-based language models with retrieval systems to enhance the
generation of text, making it more accurate, informative, and contextually
relevant. This methodology leverages the strengths of both generative and
retrieval architectures to tackle complex tasks that require not only linguistic
fluency but also factual correctness and depth of knowledge. At the core of
Retrieval Augmented Generation (RAG) is a generative model, typically a
transformer-based neural network, similar to those used in models like GPT
(Generative Pre-trained Transformer) or BERT (Bidirectional Encoder
Representations from Transformers). This component is responsible for producing
coherent and contextually appropriate language outputs based on a mixture of
input prompts

# 4.The Query

In [83]:
query = "define a rag store"

In [44]:
#%shell openai migrate

## Generation without augmentation

In [47]:
# Call the function and print the result
llm_response = call_llm_with_full_text(query)
print_formatted_response(llm_response)

Response:
---------------
Certainly! The term "RAG" in the context of Natural Language Processing (NLP)
and machine learning typically refers to "Retrieval-Augmented Generation." This
is a framework or approach used to enhance the capabilities of language models
by combining retrieval mechanisms with generative models. Here's a detailed
explanation:  ### Retrieval-Augmented Generation (RAG)  1. **Concept Overview**:
- RAG is designed to improve the performance of language models, especially in
tasks that require access to external knowledge or large datasets.    - It
combines two main components: a retrieval system and a generative model.  2.
**Components**:    - **Retrieval System**: This component is responsible for
searching and retrieving relevant information from a large corpus or database.
It acts like a search engine that finds documents or passages that are most
relevant to the input query.    - **Generative Model**: Once the relevant
information is retrieved, the generative mo

# Part 2: Advanced Techniques and Evaluation

# 1.Retrieval Metrics

## Cosine Similarity

In [48]:
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def calculate_cosine_similarity(text1, text2):
    vectorizer = TfidfVectorizer(
        stop_words='english',
        use_idf=True,
        norm='l2',
        ngram_range=(1, 2),  # Use unigrams and bigrams
        sublinear_tf=True,   # Apply sublinear TF scaling
        analyzer='word'      # You could also experiment with 'char' or 'char_wb' for character-level features
    )
    tfidf = vectorizer.fit_transform([text1, text2])
    similarity = cosine_similarity(tfidf[0:1], tfidf[1:2])
    return similarity[0][0]
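
For reference, the score returned above is the standard cosine of the angle between two vectors $A$ and $B$, here the TF-IDF vectors of `text1` and `text2`:

$$
\cos(A, B) = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}
= \frac{\sum_i A_i B_i}{\sqrt{\sum_i A_i^2}\,\sqrt{\sum_i B_i^2}}
$$

Because the vectorizer is configured with `norm='l2'`, each nonzero TF-IDF row already has unit length, so the score reduces to the dot product $A \cdot B$. TF-IDF weights are non-negative, so the score lies in $[0, 1]$.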

## Enhanced Similarity

In [49]:
import spacy
import nltk
nltk.download('wordnet')
from nltk.corpus import wordnet
from collections import Counter
import numpy as np

# Load spaCy model
nlp = spacy.load("en_core_web_sm")

def get_synonyms(word):
    synonyms = set()
    for syn in wordnet.synsets(word):
        for lemma in syn.lemmas():
            synonyms.add(lemma.name())
    return synonyms

def preprocess_text(text):
    doc = nlp(text.lower())
    lemmatized_words = []
    for token in doc:
        if token.is_stop or token.is_punct:
            continue
        lemmatized_words.append(token.lemma_)
    return lemmatized_words

def expand_with_synonyms(words):
    expanded_words = words.copy()
    for word in words:
        expanded_words.extend(get_synonyms(word))
    return expanded_words

def calculate_enhanced_similarity(text1, text2):
    # Preprocess and tokenize texts
    words1 = preprocess_text(text1)
    words2 = preprocess_text(text2)

    # Expand with synonyms
    words1_expanded = expand_with_synonyms(words1)
    words2_expanded = expand_with_synonyms(words2)

    # Count word frequencies
    freq1 = Counter(words1_expanded)
    freq2 = Counter(words2_expanded)

    # Create a set of all unique words
    unique_words = set(freq1.keys()).union(set(freq2.keys()))

    # Create frequency vectors
    vector1 = [freq1[word] for word in unique_words]
    vector2 = [freq2[word] for word in unique_words]

    # Convert lists to numpy arrays
    vector1 = np.array(vector1)
    vector2 = np.array(vector2)

    # Calculate cosine similarity; return 0.0 when either text has no usable tokens
    norm_product = np.linalg.norm(vector1) * np.linalg.norm(vector2)
    if norm_product == 0:
        return 0.0
    return np.dot(vector1, vector2) / norm_product

[nltk_data] Downloading package wordnet to /root/nltk_data...


# 2.Naive RAG

## Keyword search and matching

In [50]:
def find_best_match_keyword_search(query, db_records):
    best_score = 0
    best_record = None

    # Split the query into individual keywords
    query_keywords = set(query.lower().split())

    # Iterate through each record in db_records
    for record in db_records:
        # Split the record into keywords
        record_keywords = set(record.lower().split())

        # Calculate the number of common keywords
        common_keywords = query_keywords.intersection(record_keywords)
        current_score = len(common_keywords)

        # Update the best score and record if the current score is higher
        if current_score > best_score:
            best_score = current_score
            best_record = record

    return best_score, best_record

# Assuming 'query' and 'db_records' are defined in previous cells in your Colab notebook
best_keyword_score, best_matching_record = find_best_match_keyword_search(query, db_records)

print(f"Best Keyword Score: {best_keyword_score}")
print_formatted_response(best_matching_record)

Best Keyword Score: 1
Response:
---------------
A RAG vector store is a database or dataset that contains vectorized data
points.
---------------
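
One limitation of splitting on whitespace is that punctuation stays attached to the words: a record containing `(RAG)` yields the token `(rag)`, which does not match the query keyword `rag`. Below is a minimal sketch of a punctuation-aware tokenizer, assuming a simple regular expression is adequate for these English records. `tokenize` and `keyword_overlap` are hypothetical names.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def keyword_overlap(query, record):
    """Count the distinct keywords shared by the query and the record."""
    return len(tokenize(query) & tokenize(record))

record = "Retrieval Augmented Generation (RAG) represents a sophisticated hybrid approach."

# Whitespace split: '(rag)' keeps its parentheses, so nothing is shared with the query
print(set("define rag".split()) & set(record.lower().split()))

# Regex tokens: 'rag' is extracted without punctuation and matches the query
print(keyword_overlap("define rag", record))
```

Note that this pattern also splits hyphenated and possessive forms, so `up-to-date` becomes three tokens and `model's` becomes `model` and `s`.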



## Metrics

In [51]:
# Cosine Similarity
score = calculate_cosine_similarity(query, best_matching_record)
print(f"Best Cosine Similarity Score: {score:.3f}")

Best Cosine Similarity Score: 0.079


In [52]:
# Enhanced Similarity
response = best_matching_record
print(query,": ", response)
similarity_score = calculate_enhanced_similarity(query, response)
print(f"Enhanced Similarity: {similarity_score:.3f}")

Define RAG  :  A RAG vector store is a database or dataset that contains vectorized data points.
Enhanced Similarity: 0.539


## Augmented input

In [53]:
augmented_input=query+ ": "+ best_matching_record

In [54]:
print_formatted_response(augmented_input)

Response:
---------------
Define RAG : A RAG vector store is a database or dataset that contains
vectorized data points.
---------------



## Generation

In [None]:
# Certainly! The term "RAG" in the context of Natural Language Processing (NLP)
# and machine learning typically refers to "Retrieval-Augmented Generation." This
# is a framework or approach used to enhance the capabilities of language models
# by combining retrieval mechanisms with generative models. Here's a detailed
# explanation:  ### Retrieval-Augmented Generation (RAG)  1. **Concept Overview**:
# - RAG is designed to improve the performance of language models, especially in
# tasks that require access to external knowledge or large datasets.    - It
# combines two main components: a retrieval system and a generative model.  2.
# **Components**:    - **Retrieval System**: This component is responsible for
# searching and retrieving relevant information from a large corpus or database.
# It acts like a search engine that finds documents or passages that are most
# relevant to the input query.    - **Generative Model**: Once the relevant
# information is retrieved, the generative model uses this information to produce
# a coherent and contextually appropriate response. This model is typically a
# neural network-based language model, such as GPT (Generative Pre-trained
# Transformer).  3. **How RAG Works**:    - **Input Query**: The process begins
# with an input query or prompt that needs a response.    - **Retrieval Step**:
# The retrieval system searches a large corpus to find documents or snippets that
# are relevant to the query.    - **Augmentation**: The retrieved information is
# then used to augment the input query, providing additional context or knowledge.
# - **Generation Step**: The generative model takes the augmented input and
# generates a response that is informed by both the original query and the
# retrieved information.  4. **Advantages**:    - **Enhanced Knowledge**: By
# accessing external information, RAG models can provide more accurate and
# informed responses, especially for queries that require specific or up-to-date
# knowledge.    - **Scalability**: The retrieval component allows the model to
# scale its knowledge base without needing to retrain the entire generative model.
# - **Flexibility**: RAG can be applied to various tasks, including question
# answering, dialogue systems, and content generation.  5. **Applications**:    -
# **Question Answering**: RAG can be used to answer complex questions by
# retrieving relevant documents and generating precise answers.    - **Customer
# Support**: In automated customer service, RAG can provide detailed and accurate
# responses by accessing a database of product information or FAQs.    - **Content
# Creation**: Writers and content creators can use RAG to generate articles or
# reports that are informed by the latest data or research.  Overall, Retrieval-
# Augmented Generation represents a powerful approach to leveraging both the vast
# knowledge available in external databases and the sophisticated language
# capabilities of generative models, resulting in more informed and contextually
# relevant outputs.

## Old Response

In [55]:
# Call the function and print the result
llm_response = call_llm_with_full_text(augmented_input)
print_formatted_response(llm_response)

Response:
---------------
Certainly! Let's break down the concept of RAG and vector stores:  ### RAG
(Retrieval-Augmented Generation)  RAG stands for Retrieval-Augmented Generation,
a technique used in natural language processing (NLP) that combines the
strengths of information retrieval and text generation. The main idea behind RAG
is to enhance the capabilities of generative models by incorporating external
knowledge from a retrieval system. Here's how it works:  1. **Retrieval**: In
the first step, the system retrieves relevant information from a large corpus or
database. This is typically done using a retrieval model that searches for
documents or data points that are most relevant to the input query.  2.
**Augmentation**: The retrieved information is then used to augment the input to
a generative model. This means that the generative model has access to
additional context or knowledge that it can use to produce more accurate and
informative responses.  3. **Generation**: Finally, 

# 3.Advanced RAG

## 3.1.Vector search

### Search function

In [56]:
def find_best_match(text_input, records):
    best_score = 0
    best_record = None
    for record in records:
        current_score = calculate_cosine_similarity(text_input, record)
        if current_score > best_score:
            best_score = current_score
            best_record = record
    return best_score, best_record

In [57]:
best_similarity_score, best_matching_record = find_best_match(query, db_records)

In [59]:
query

'Define RAG '

In [58]:
print_formatted_response(best_matching_record)

Response:
---------------
While Retrieval Augmented Generation (RAG) offers substantial benefits, it also
comes with its challenges.
---------------



### Metrics

In [60]:
print(f"Best Cosine Similarity Score: {best_similarity_score:.3f}")

Best Cosine Similarity Score: 0.079


In [61]:
# Enhanced Similarity
response = best_matching_record
print(query,": ", response)
similarity_score = calculate_enhanced_similarity(query, best_matching_record)
print(f"Enhanced Similarity: {similarity_score:.3f}")

Define RAG  :  While Retrieval Augmented Generation (RAG) offers substantial benefits, it also comes with its challenges.
Enhanced Similarity: 0.564


### Augmented input

In [62]:
augmented_input=query+": "+best_matching_record

In [63]:
print_formatted_response(augmented_input)

Response:
---------------
Define RAG : While Retrieval Augmented Generation (RAG) offers substantial
benefits, it also comes with its challenges.
---------------



### Generation

In [64]:
# Call the function and print the result
llm_response = call_llm_with_full_text(augmented_input)
print_formatted_response(llm_response)

Response:
---------------
Retrieval Augmented Generation (RAG) is a sophisticated approach in the field of
Natural Language Processing (NLP) that combines the strengths of retrieval-based
and generation-based models to enhance the quality and relevance of generated
text. Here's a detailed explanation of RAG, its benefits, and the challenges it
faces:  ### What is RAG?  **Retrieval Augmented Generation (RAG)** is a hybrid
model that integrates two main components:  1. **Retrieval Component**: This
part of the model is responsible for fetching relevant information from a large
corpus or database. It uses techniques similar to those found in search engines
to identify documents or pieces of text that are most relevant to the input
query or context.  2. **Generation Component**: Once the relevant information is
retrieved, the generation component uses this information to produce coherent
and contextually appropriate responses or text. This is typically done using
advanced language models l

## 3.2.Index-based search

### Search Function

In [66]:
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def setup_vectorizer(records):
    vectorizer = TfidfVectorizer()
    tfidf_matrix = vectorizer.fit_transform(records)
    return vectorizer, tfidf_matrix

def find_best_match(query, vectorizer, tfidf_matrix):
    query_tfidf = vectorizer.transform([query])
    similarities = cosine_similarity(query_tfidf, tfidf_matrix)
    best_index = similarities.argmax()  # Get the index of the highest similarity score
    best_score = similarities[0, best_index]
    return best_score, best_index

vectorizer, tfidf_matrix = setup_vectorizer(db_records)

best_similarity_score, best_index = find_best_match(query, vectorizer, tfidf_matrix)
best_matching_record = db_records[best_index]

print_formatted_response(best_matching_record)

Response:
---------------
A RAG vector store is a database or dataset that contains vectorized data
points.
---------------



### Metrics

In [73]:
# Cosine Similarity
print(f"Best Cosine Similarity Score: {best_similarity_score:.3f}")
print_formatted_response(best_matching_record)

Best Cosine Similarity Score: 0.215
Response:
---------------
A RAG vector store is a database or dataset that contains vectorized data
points.
---------------



In [78]:
# Enhanced Similarity
response = best_matching_record
print(query,": ", response)
similarity_score = calculate_enhanced_similarity(query, response)
print(f"Enhanced Similarity: {similarity_score:.3f}")

Define RAG  :  A RAG vector store is a database or dataset that contains vectorized data points.
Enhanced Similarity: 0.539


### Feature extraction

In [79]:
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer

def setup_vectorizer(records):
    vectorizer = TfidfVectorizer()
    tfidf_matrix = vectorizer.fit_transform(records)

    # Convert the TF-IDF matrix to a DataFrame for display purposes
    tfidf_df = pd.DataFrame(tfidf_matrix.toarray(), columns=vectorizer.get_feature_names_out())

    # Display the DataFrame
    print(tfidf_df)

    return vectorizer, tfidf_matrix

vectorizer, tfidf_matrix = setup_vectorizer(db_records)

     ability    access  accuracy  accurate     adapt  additional  advancement  \
0   0.000000  0.000000  0.000000  0.000000  0.000000    0.000000     0.000000   
1   0.000000  0.000000  0.000000  0.216364  0.000000    0.000000     0.000000   
2   0.000000  0.000000  0.000000  0.000000  0.000000    0.000000     0.000000   
3   0.000000  0.000000  0.000000  0.000000  0.000000    0.000000     0.000000   
4   0.000000  0.000000  0.000000  0.000000  0.000000    0.236479     0.000000   
5   0.000000  0.000000  0.000000  0.000000  0.000000    0.000000     0.000000   
6   0.000000  0.000000  0.000000  0.000000  0.000000    0.000000     0.000000   
7   0.000000  0.000000  0.000000  0.000000  0.000000    0.000000     0.000000   
8   0.000000  0.000000  0.000000  0.000000  0.000000    0.000000     0.000000   
9   0.000000  0.000000  0.000000  0.000000  0.000000    0.000000     0.000000   
10  0.186734  0.000000  0.000000  0.000000  0.000000    0.000000     0.000000   
11  0.000000  0.000000  0.00

### Augmented input

In [80]:
augmented_input=query+": "+best_matching_record

In [81]:
print_formatted_response(augmented_input)

Response:
---------------
Define RAG : A RAG vector store is a database or dataset that contains
vectorized data points.
---------------



### Generation

In [82]:
# Call the function and print the result
llm_response = call_llm_with_full_text(augmented_input)
print_formatted_response(llm_response)

Response:
---------------
Certainly! Let's break down the concept of a RAG vector store:  ### RAG
(Retrieval-Augmented Generation)  RAG stands for Retrieval-Augmented Generation,
a framework that combines retrieval-based and generation-based approaches in
natural language processing (NLP). It is designed to enhance the capabilities of
language models by integrating external knowledge sources. Here's how it works:
1. **Retrieval**: In the RAG framework, a retrieval component is used to search
for relevant information from a large corpus or database. This component
identifies and retrieves data points that are most relevant to the input query
or context.  2. **Augmentation**: The retrieved information is then used to
augment the input to a generative model. This means that the generative model,
which is typically a neural network-based language model, uses the additional
context provided by the retrieved data to produce more accurate and contextually
relevant responses.  3. **Generation*

# 4.Modular RAG

Modular RAG can combine methods. For example:

**keyword search**: Searches through each document to find the one that best matches the keyword(s).

**vector search**: Searches through each document and calculates similarity.

**indexed search**: Uses a precomputed index (TF-IDF matrix) to compute cosine similarities.

**October 25, 2025 update**

`self.documents` is initialized in the fit method to hold the records used for searching and enable the `keyword_search` function to access them without error.

**Note on Vector search**

In this case, `vector_search(self, query)` reuses the precomputed `tfidf_matrix` to speed up the vector search.

Alternatively, `vector_search(self, query)` could use the brute-force method implemented in Section 3.1, Vector search, of this notebook.
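
To make the contrast in the note above concrete, here is a minimal sketch of the two strategies using only the standard library. It scores records with cosine similarity over raw term counts rather than TF-IDF, so its scores will differ from the notebook's, and all names in it are hypothetical. The brute-force version vectorizes every record again on each query, while the indexed version vectorizes the records once and reuses the result.

```python
import math
from collections import Counter

def vectorize(text):
    """Turn text into a bag-of-words term-count vector (a Counter)."""
    return Counter(text.lower().split())

def cosine(v1, v2):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * v2[term] for term, count in v1.items())
    norm = math.sqrt(sum(c * c for c in v1.values())) * math.sqrt(sum(c * c for c in v2.values()))
    return dot / norm if norm else 0.0

def brute_force_search(query, records):
    """Vectorize the query and every record again on each call."""
    return max(records, key=lambda record: cosine(vectorize(query), vectorize(record)))

class CountIndex:
    """Vectorize the records once, then reuse the vectors for every query."""
    def __init__(self, records):
        self.records = list(records)
        self.vectors = [vectorize(record) for record in self.records]

    def search(self, query):
        query_vector = vectorize(query)
        scores = [cosine(query_vector, vector) for vector in self.vectors]
        return self.records[scores.index(max(scores))]

records = [
    "A vector store contains vectorized data points.",
    "The retrieval system fetches relevant documents.",
]
index = CountIndex(records)
print(brute_force_search("vector store", records))
print(index.search("vector store"))
```

Both calls return the same record; the difference is only in how much work is repeated per query, which matters once the record list grows.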

In [None]:
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

class RetrievalComponent:
    def __init__(self, method='vector'):
        self.method = method
        if self.method == 'vector' or self.method == 'indexed':
            self.vectorizer = TfidfVectorizer()
            self.tfidf_matrix = None

    def fit(self, records):
        self.documents = records  # Initialize self.documents here
        if self.method == 'vector' or self.method == 'indexed':
            self.tfidf_matrix = self.vectorizer.fit_transform(records)

    def retrieve(self, query):
        if self.method == 'keyword':
            return self.keyword_search(query)
        elif self.method == 'vector':
            return self.vector_search(query)
        elif self.method == 'indexed':
            return self.indexed_search(query)

    def keyword_search(self, query):
        best_score = 0
        best_record = None
        query_keywords = set(query.lower().split())
        for index, doc in enumerate(self.documents):
            doc_keywords = set(doc.lower().split())
            common_keywords = query_keywords.intersection(doc_keywords)
            score = len(common_keywords)
            if score > best_score:
                best_score = score
                best_record = self.documents[index]
        return best_record

    def vector_search(self, query):
        query_tfidf = self.vectorizer.transform([query])
        similarities = cosine_similarity(query_tfidf, self.tfidf_matrix)
        best_index = similarities.argmax()
        # Return from the records passed to fit(), not the global db_records
        return self.documents[best_index]

    def indexed_search(self, query):
        # Assuming the tfidf_matrix is precomputed and stored
        query_tfidf = self.vectorizer.transform([query])
        similarities = cosine_similarity(query_tfidf, self.tfidf_matrix)
        best_index = similarities.argmax()
        # Return from the records passed to fit(), not the global db_records
        return self.documents[best_index]

### Modular RAG Strategies

In [None]:
# Usage example
retrieval = RetrievalComponent(method='vector')  # Choose from 'keyword', 'vector', 'indexed'
retrieval.fit(db_records)
best_matching_record = retrieval.retrieve(query)

print_formatted_response(best_matching_record)

Response:
---------------
A RAG vector store is a database or dataset that contains vectorized data
points.
---------------



### Metrics

In [None]:
# Cosine Similarity
print(f"Best Cosine Similarity Score: {best_similarity_score:.3f}")
print_formatted_response(best_matching_record)

Best Cosine Similarity Score: 0.407
Response:
---------------
A RAG vector store is a database or dataset that contains vectorized data
points.
---------------



In [None]:
# Enhanced Similarity
response = best_matching_record
print(query,": ", response)
similarity_score = calculate_enhanced_similarity(query, response)
print("Enhanced Similarity:", similarity_score)

define a rag store :  A RAG vector store is a database or dataset that contains vectorized data points.
Enhanced Similarity: 0.641582812483307


### Augmented Input

In [None]:
augmented_input=query+ " "+ best_matching_record

In [None]:
print_formatted_response(augmented_input)

Response:
---------------
define a rag store A RAG vector store is a database or dataset that contains
vectorized data points.
---------------



### Generation

In [None]:
# Call the function and print the result
llm_response = call_llm_with_full_text(augmented_input)
print_formatted_response(llm_response)

Response:
---------------
A vector store, often referred to as a vector database or vector dataset, is a
specialized type of database designed to store and manage data in the form of
vectors. Vectors are mathematical representations of data points, typically in
the form of arrays or lists of numbers. These vectors can represent a wide range
of data types, including text, images, audio, and more, by capturing their
essential features in a numerical format.  The primary purpose of a vector store
is to facilitate efficient storage, retrieval, and manipulation of vectorized
data. This is particularly useful in applications involving machine learning,
artificial intelligence, and data analysis, where operations such as similarity
search, clustering, and classification are common.  Key characteristics and
functionalities of a vector store include:  1. **High-Dimensional Data
Handling**: Vector stores are optimized to handle high-dimensional data, which
is common in applications like natural 