# **Retrieval-Augmented Generation (RAG) Model for QA Bot Using Pinecone and OpenAI**

In [None]:
import os
import getpass
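# Prompt for each key separately so it is clear which one is being requested
os.environ["PINECONE_API_KEY"] = getpass.getpass("Pinecone API key: ")
os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API key: ")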

**Pipeline Demonstration: Data Loading to Question Answering:**

  We start by installing the necessary libraries. These include tools for document loading, text splitting, embedding generation, and integration with Pinecone and OpenAI.

In [14]:
# Library installation (pypdf is required by PyPDFLoader)
!pip install \
  langchain_community \
  langchain_pinecone \
  langchain_openai \
  unstructured \
  langchain-text-splitters \
  langchain \
  pypdf




In [15]:
from langchain_pinecone import PineconeVectorStore
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_text_splitters import RecursiveCharacterTextSplitter
import os
from langchain_community.document_loaders import PyPDFLoader
from langchain.prompts import PromptTemplate
from langchain_core.runnables import RunnablePassthrough



**Loading the Document:**

  We mount Google Drive to access the PDF file and load the document using PyPDFLoader. The loader returns the PDF as a list of page-level documents, each carrying its source path and page number as metadata, ready for embedding generation.

In [16]:
from google.colab import drive
drive.mount('/content/drive')


Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


**PDF** file loading

In [17]:
# Path to the PDF on the mounted Google Drive
location = '/content/drive/My Drive/iesc111.pdf'
pdf_loader = PyPDFLoader(location)

# Load the PDF and split it into page-level documents
pages = pdf_loader.load_and_split()


**Splitting the Document into Chunks:**

  To keep each embedded passage focused on a single topic, the document is split into smaller chunks. Here, we use the RecursiveCharacterTextSplitter with chunks of up to 1000 characters and a 100-character overlap, so that text cut at a chunk boundary keeps some of its surrounding context.

In [None]:
# Split the pages into chunks of up to 1000 characters with a 100-character overlap
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
splits = text_splitter.split_documents(pages)
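
To make the effect of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of fixed-window chunking with overlap. It is a simplification and not what `RecursiveCharacterTextSplitter` does internally, since the real splitter prefers to break on separators such as paragraph breaks and newlines. The function name `chunk_with_overlap` and the sample text are hypothetical.

```python
def chunk_with_overlap(text, chunk_size=1000, chunk_overlap=100):
    """Split text into fixed-size windows that share `chunk_overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Illustrative 2500-character sample text
sample = "".join(str(i % 10) for i in range(2500))
chunks = chunk_with_overlap(sample)
print([len(c) for c in chunks])             # [1000, 1000, 700]
print(chunks[0][-100:] == chunks[1][:100])  # True
```

The last 100 characters of each chunk reappear at the start of the next one, which is what lets a sentence cut at a boundary still appear whole in one of the two chunks.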

In [18]:
splits[0]

Document(metadata={'source': '/content/drive/My Drive/iesc111.pdf', 'page': 0}, page_content='Everyday we hear sounds from various\nsources like humans, bir ds, bells, machines,\nvehicles, televisions, radios etc. Sound is a\nform of ener gy which pr oduces a sensation\nof hearing in our ears. Ther e are also other\nforms of energy like mechanical energy, light\nenergy, etc. W e have talked about mechanical\nenergy in the pr evious chapters. Y ou have\nbeen taught about conservation of energy,\nwhich states that we can neither create nor\ndestr oy ener gy. W e  can just change it fr om\none for m to another . When you clap, a sound\nis produced. Can you produce sound without\nutilising your energy? Which form of energy\ndid you use to produce sound? In this\nchapter we are going to learn how sound is\nproduced and how it is transmitted through\na medium and received by our ears.\n11.1 Production of Sound\nActivity _____________ 11.1\n•Take a tuning fork and set it vibrating\nby strikin

## Vector Store Initialization

**Embeddings Generation and Vector Store Initialization:**

  Using OpenAI's embedding model (`OpenAIEmbeddings`), we generate document embeddings. These embeddings are stored in Pinecone's vector database for similarity search.

In [19]:
# Define the embedding model used for the vector store
embeddings = OpenAIEmbeddings()

In [20]:
use_serverless = True

In [21]:
!pip install protoc-gen-openapiv2
# Install the missing 'protoc-gen-openapiv2' module.




**Configuring Pinecone Vector Store:**

We initialize and configure a Pinecone vector store to hold our document embeddings. The index is created with 1536 dimensions and the cosine metric. 1536 is the output size of `text-embedding-ada-002`, so the index dimension must match the embedding model in use.

In outline, the setup imports `PineconeGRPC` as `Pinecone`, creates a client with `pc = Pinecone()`, creates the index `sarvam-chat` with `dimension=1536` and `metric='cosine'`, and opens a handle to it with `pc.Index(index_name)`.



In [22]:
from pinecone.grpc import PineconeGRPC as Pinecone
from pinecone import ServerlessSpec, PodSpec
import time
# configure client
pc = Pinecone()
if use_serverless:
    spec = ServerlessSpec(cloud='aws', region='us-east-1')
else:
    # if not using a starter index, you should specify a pod_type too
    spec = PodSpec()
# check for and delete index if already exists
index_name = "sarvam-chat"
items = pc.list_indexes().indexes
existing_indexes = [item['name'] for item in items]
if index_name in existing_indexes:
    pc.delete_index(index_name)
# create a new index
pc.create_index(
    index_name,
    dimension=1536,  # dimensionality of text-embedding-ada-002
    metric='cosine',
    spec=spec
)
# wait for index to be initialized
while not pc.describe_index(index_name).status['ready']:
    time.sleep(1)
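
The wait loop above polls indefinitely. As a minimal sketch of the same polling pattern with an upper bound, the helper below raises an error if the resource is not ready in time. The name `wait_until`, its parameters, and the stand-in readiness check are hypothetical and are not part of the Pinecone client.

```python
import time


def wait_until(is_ready, timeout_s=120.0, poll_s=1.0):
    """Poll `is_ready()` until it returns True or `timeout_s` seconds elapse."""
    deadline = time.monotonic() + timeout_s
    while not is_ready():
        if time.monotonic() >= deadline:
            raise TimeoutError(f"not ready after {timeout_s} seconds")
        time.sleep(poll_s)


# Stand-in readiness check that succeeds on the third call
calls = {"n": 0}

def fake_ready():
    calls["n"] += 1
    return calls["n"] >= 3


wait_until(fake_ready, timeout_s=5.0, poll_s=0.01)
print(calls["n"])  # 3
```

In this notebook the readiness check would be the same expression used in the loop above, for example `wait_until(lambda: pc.describe_index(index_name).status['ready'])`.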





In [24]:
index = pc.Index(index_name)
index.describe_index_stats()

# Response:
# {'dimension': 1536,
# 'index_fullness': 0.0,
# 'namespaces': {},
# 'total_vector_count': 0}


{'dimension': 1536,
 'index_fullness': 0.0,
 'namespaces': {'': {'vector_count': 0}},
 'total_vector_count': 0}

**Adding Documents to the Vector Store:**

The documents are embedded and stored in Pinecone's vector store for retrieval. The cell below passes `pages` (the page-level documents); to index the 1000-character chunks instead, pass `splits`.

In [25]:
from langchain_pinecone import PineconeVectorStore
text_field = "text"
vectorstore = PineconeVectorStore(
    index, embeddings, text_field
)


In [26]:
# async_req=False makes the upsert synchronous, so the call returns only after the vectors have been sent
vectorstore.add_documents(documents=pages, async_req=False)

['222acefc-910b-4e24-bc34-db2126e4d8de',
 '0c84250d-d266-4b14-8dd2-c4508c6248d7',
 'ecd1766f-b557-459e-b166-be6e13b01204',
 'c13c4e7b-b4fb-428f-838b-e9c0b5b7b6a1',
 'a9bc041c-5c95-4a0f-8413-093c13028b43',
 'b8e1708c-d958-4892-ac77-ac1f79e4b498',
 'dbc8a4ea-ec22-4849-a835-3023eaffd8ce',
 'b9ad3ceb-92ea-417f-90c7-005319f25ad2',
 '060b9eaf-8581-4251-b592-cc44a4d1289c',
 '5e5e714d-53ef-44b1-b7b3-7b51366e613c',
 '742fda1b-811d-4124-9fc7-8fef66ef4f67',
 'd651a772-1bd0-4108-904d-c48cff9844f8',
 'c7c123e9-bfb3-4e85-b047-e7eb85f5d884',
 '3c662703-7763-4b62-b303-5847cbd17898']

**Testing Document Similarity Search:**

We test the system by performing a similarity search. Given a user query, we retrieve the most relevant document chunks from the vector store.

In [27]:
# Testing the similarity search
query = "What is the book about?"
similar_docs = vectorstore.similarity_search(query)

In [28]:
similar_docs

[]

In [29]:
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)
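
`format_docs` passes only `page_content` to the prompt, so the model never sees the `source` and `page` metadata that the prompt asks it to cite. One possible variant, sketched below under stated assumptions, prefixes each chunk with that metadata. The names `Doc` and `format_docs_with_sources` are hypothetical; `Doc` is a stand-in for a LangChain `Document` so the example runs offline.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for a LangChain Document: text plus a metadata dict."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs_with_sources(docs):
    """Join documents, prefixing each with its source file and page number."""
    blocks = []
    for doc in docs:
        source = doc.metadata.get("source", "unknown")
        page = doc.metadata.get("page", "unknown")
        blocks.append(f"[source: {source}, page: {page}]\n{doc.page_content}")
    return "\n\n".join(blocks)


# Illustrative sample data; the page numbers are made up
docs = [
    Doc("Sound is a form of energy.", {"source": "iesc111.pdf", "page": 0}),
    Doc("Sound travels as a longitudinal wave.", {"source": "iesc111.pdf", "page": 3}),
]
print(format_docs_with_sources(docs))
```

Because it reads only `page_content` and `metadata`, the same function should be usable in place of `format_docs` in the chain.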

**Generating Answers with Retrieval-Augmented Generation (RAG):**

We use a combination of retrieval and language model generation to answer user questions. The relevant document chunks are retrieved from Pinecone and passed as context to a generative language model (`gpt-4o-mini`, accessed through `ChatOpenAI`) to create answers.

In [30]:
# Build a question-answering chain with LCEL (LangChain Expression Language)
question = "What is the book about?"
retriever = vectorstore.as_retriever()
prompt_rag = (
    PromptTemplate.from_template("Generate answers for given question: {question} based on the context {context}. In generated response provide reference to metadata from which context used in generating response")
)
llm = ChatOpenAI(model_name="gpt-4o-mini", temperature=0)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt_rag
    | llm
)

res = rag_chain.invoke(question)

In [31]:
print(res.content)

The book appears to be a comprehensive educational resource focused on the science of sound, covering various fundamental concepts related to acoustics. It discusses the nature of sound waves, including their speed, frequency, wavelength, and the physiological aspects of how sound is perceived, such as loudness and pitch. The text also delves into practical applications of sound, such as the use of ultrasound in cleaning and detecting defects in materials, as well as the principles of sound propagation and reflection.

Key topics include:

1. **Sound Wave Properties**: The book explains how sound waves travel through different media, the relationship between frequency and pitch, and how amplitude affects loudness. It also provides mathematical relationships, such as the calculation of frequency based on speed and wavelength.

2. **Audibility and Sound Characteristics**: It defines the audible range for humans and distinguishes between infrasonic and ultrasonic sounds, emphasizing the p

In [32]:
# Reuse the retriever, prompt, model, and chain defined above; only the question changes
question = "What are the characteristics of a sound wave?"

res = rag_chain.invoke(question)

In [33]:
print(res.content)

The characteristics of a sound wave can be summarized as follows:

1. **Nature of Sound**: Sound is produced by the vibration of different objects, creating disturbances in a medium (solid, liquid, or gas).

2. **Wave Type**: Sound travels as a longitudinal wave, meaning that the oscillations of the particles in the medium occur parallel to the direction of wave propagation. This results in regions of compression (high pressure) and rarefaction (low pressure) as the wave moves through the medium.

3. **Energy Propagation**: In sound propagation, it is the energy of the sound that travels through the medium, not the particles themselves. The particles oscillate around their equilibrium positions but do not move along with the wave.

4. **Wavelength and Frequency**: The distance between two consecutive compressions or rarefactions is called the wavelength (λ). The time taken for one complete oscillation is the time period (T), and the number of oscillations per unit time is the frequency



**Model Architecture:**

The architecture is designed around two primary components:

**Vector Store (Retrieval)**

**Document Loader:** Loads the document in the form of pages.

**Text Splitter:** Splits the document into smaller chunks (e.g., 1000 characters per chunk with a 100-character overlap).

**Embeddings Generation:** OpenAI's embedding model converts each document into a 1536-dimensional vector that represents its content.

**Pinecone Vector Store:** The documents and their embeddings are stored in Pinecone for retrieval based on semantic similarity.

**Generative Model (Answer Generation):**

**Retriever:** Fetches the top-k most relevant document chunks from Pinecone based on the user query.

**Language Model (`gpt-4o-mini`):** Receives the retrieved context together with the question and generates an answer grounded in that context.

**Prompt Template:** A structured prompt is created using a template that combines the query and the retrieved context, guiding the generative model.
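
The way these components fit together can be sketched with plain functions. This is an illustration of the data flow only, not the LCEL implementation used in the notebook; `make_rag_chain`, `fake_retriever`, `fake_llm`, and the template text are hypothetical stand-ins so the example runs without API keys.

```python
def make_rag_chain(retriever, format_docs, prompt_template, llm):
    """Compose retrieval, formatting, prompting, and generation into one callable."""
    def chain(question):
        context = format_docs(retriever(question))  # retrieve, then join chunks
        prompt = prompt_template.format(question=question, context=context)
        return llm(prompt)                          # generate from the filled prompt
    return chain


# Stand-ins so the sketch runs offline
def fake_retriever(question):
    return ["Sound is a form of energy.", "Sound travels through a medium."]

def fake_llm(prompt):
    return prompt  # echo the prompt so the wiring can be inspected


template = "Answer the question: {question}\nUse only this context:\n{context}"
chain = make_rag_chain(fake_retriever, "\n\n".join, template, fake_llm)
print(chain("What is sound?"))
```

The question is used twice: once as the retrieval query and once, unchanged, inside the prompt. That matches the roles of `retriever | format_docs` and `RunnablePassthrough()` in the notebook's chain.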






**Retrieval Approach:**

The system follows a retrieval-augmented approach:

**Document Retrieval:** When a query is submitted, the query is embedded and the vector store is searched for the most similar chunks (cosine similarity). Only those chunks, rather than the whole document, are considered when answering.

**Context and Question Prompting:** The retrieved chunks are then passed as context to the generative language model, which answers based on both the retrieved context and the query itself.
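
As a minimal sketch of the similarity search step, the example below ranks stored vectors by cosine similarity to a query vector and keeps the top k. It is a brute-force illustration with made-up three-dimensional vectors; Pinecone performs this search server-side over 1536-dimensional embeddings, and the names `cosine_similarity`, `top_k`, and the chunk ids are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query_vec, stored, k=2):
    """Return the k (id, score) pairs with the highest cosine similarity."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Made-up vectors standing in for chunk embeddings
stored = {
    "chunk-a": [1.0, 0.0, 0.0],
    "chunk-b": [0.0, 1.0, 0.0],
    "chunk-c": [1.0, 1.0, 0.0],
}
print(top_k([1.0, 0.0, 0.0], stored, k=2))
```

Here `chunk-a` points in the same direction as the query and ranks first, `chunk-c` shares one direction with it and ranks second, and `chunk-b` is orthogonal to the query and is dropped.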


**Generative Response Creation:**

The generative responses are created using the following process:

**Contextual Answering:** The retrieved document chunks act as the basis for the language model's answer to the user's question. Grounding the answer in the content of the document reduces the risk of hallucinated responses, though it does not eliminate it.

**Metadata Inclusion:** The prompt template asks the model to reference the source of the information it used, for transparency and credibility. Because `format_docs` passes only `page_content` into the prompt, the source metadata (file path and page number) has to be included in the context for those references to be reliable.

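**Example Queries and Outputs:**

Here are the two queries run above, with the start of each response:

Query 1: "What is the book about?" Answer: "The book appears to be a comprehensive educational resource focused on the science of sound, covering various fundamental concepts related to acoustics. ..."

Query 2: "What are the characteristics of a sound wave?" Answer: "The characteristics of a sound wave can be summarized as follows: 1. **Nature of Sound**: Sound is produced by the vibration of different objects, creating disturbances in a medium (solid, liquid, or gas). ..."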

