<a href="https://colab.research.google.com/github/Aimy99/Project_Langchain_RAG_System/blob/main/Langchain_RAG_Project.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## **PIAIC Quarter 3 - Project 02: 'Langchain_Retrieval_Augmented_Generation_System'**
### *Developed by Aiman Baquar*

### **Overview:**

The LangChain RAG (Retrieval-Augmented Generation) System combines LangChain, Pinecone, and Google Generative AI to enable efficient document-based question answering. It embeds document chunks, stores them in Pinecone, retrieves the chunks most similar to a question, and passes them to Gemini to generate an answer grounded in that context.
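
To make the retrieve-then-generate flow concrete, here is a minimal sketch in plain Python. It is only an illustration: the bag-of-words "embedding", the helper names (`bow_vector`, `retrieve`, `build_prompt`), and the sample chunks are all assumptions made for this example, standing in for the Gemini embeddings, the Pinecone index, and the prompt that the chain assembles later in the notebook.

```python
import math
from collections import Counter

def bow_vector(text):
    """Toy 'embedding': word counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = bow_vector(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, bow_vector(c)), reverse=True)
    return ranked[:k]

def build_prompt(query, context_chunks):
    """Place the retrieved chunks in front of the question."""
    context = "\n\n".join(context_chunks)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"

# Hypothetical sample data for the illustration
chunks = [
    "RAG combines retrieval with text generation.",
    "Pinecone stores document embeddings as vectors.",
    "Bananas are rich in potassium.",
]
question = "what does pinecone store"
top_chunks = retrieve(question, chunks, k=1)
prompt = build_prompt(question, top_chunks)
print(prompt)
```

In the real system the prompt would be sent to the language model; here it is only printed so the retrieved context is visible.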

### **Technology Stack:**

- **LangChain** (Integration): Streamlines integration between AI models and various data sources.
- **Pinecone** (Vector Database): Provides fast, scalable storage and retrieval of document embeddings.
- **Google Generative AI** (Gemini): Delivers precise, context-aware natural language responses.

### **Outcome:**

A robust, scalable AI-powered solution for delivering real-time, accurate question answering.

### **Use Cases:**

- **Customer Support:** Automates query resolution with context-aware answers.
- **Legal Review:** Retrieves and answers questions on legal documents.
- **Healthcare:** Provides medical professionals with evidence-based answers.
- **E-learning:** Enhances learning by answering course-related queries.
- **Business Intelligence:** Extracts insights from large datasets.
- **Research:** Assists in retrieving and summarizing research data.

In [1]:
# Installing Langchain, Pinecone Client, tqdm, Google GenAI
!pip install -q -U langchain
!pip install pinecone-client
!pip install -q tqdm
!pip install -q langchain-google-genai openai

[?25l   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.0/1.0 MB[0m [31m?[0m eta [36m-:--:--[0m[2K   [91m━━━━━━[0m[91m╸[0m[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.2/1.0 MB[0m [31m5.5 MB/s[0m eta [36m0:00:01[0m[2K   [91m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[91m╸[0m [32m1.0/1.0 MB[0m [31m15.0 MB/s[0m eta [36m0:00:01[0m[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m1.0/1.0 MB[0m [31m10.7 MB/s[0m eta [36m0:00:00[0m
[?25h[?25l   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.0/411.6 kB[0m [31m?[0m eta [36m-:--:--[0m[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m411.6/411.6 kB[0m [31m16.0 MB/s[0m eta [36m0:00:00[0m
[?25hCollecting pinecone-client
  Downloading pinecone_client-5.0.1-py3-none-any.whl.metadata (19 kB)
Collecting pinecone-plugin-inference<2.0.0,>=1.0.3 (from pinecone-client)
  Downloading pinecone_plugin_inference-1.1.0-py3-none-any.whl.metadata (2.2 kB)
Collecting pin

In [2]:
# Load API keys from Google Colab's user secrets and expose them as environment variables
from google.colab import userdata
import os
PINECONE_API_KEY = userdata.get('PINECONE_API_KEY')
os.environ['PINECONE_API_KEY'] = PINECONE_API_KEY

GOOGLE_API_KEY = userdata.get('GEMINI_API_KEY')
os.environ['GOOGLE_API_KEY'] = GOOGLE_API_KEY
PINECONE_ENVIRONMENT = 'us-east-1'   # region used when creating the serverless index

In [3]:
# Import the Pinecone client and the serverless index spec
from pinecone import Pinecone, ServerlessSpec

# Initialize the Pinecone client
pc = Pinecone(api_key=PINECONE_API_KEY)

# Check whether the index exists; if not, create it
index_name = "gemini-rag-index"
if index_name not in pc.list_indexes().names():
    pc.create_index(
        name=index_name,
        dimension=768,        # vector size; must match the embedding model's output size
        metric="cosine",      # metric used to compare vectors: cosine, euclidean, or dotproduct
        spec=ServerlessSpec(
            cloud="aws",                   # cloud provider (AWS in this case)
            region=PINECONE_ENVIRONMENT,   # use your environment's region
        ),
    )

# Connect to the index
index = pc.Index(name=index_name)                         # index interaction (store & retrieve data)
print(f"Successfully connected to index: {index_name}")   # console confirmation

Successfully connected to index: gemini-rag-index
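
The `metric` argument above selects how two vectors are compared. As a rough illustration of the math behind the three options (plain Python written for this example, not Pinecone's internal implementation), the sketch below compares two vectors that point in the same direction but differ in length:

```python
import math

def dot(a, b):
    """Dot product: grows with both alignment and vector length."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Cosine similarity: depends only on the angle between the vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclidean_distance(a, b):
    """Euclidean distance: straight-line distance between the two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]   # same direction as a, twice the length

print("cosine:", cosine_similarity(a, b))
print("dot product:", dot(a, b))
print("euclidean:", euclidean_distance(a, b))
```

Cosine similarity treats `a` and `b` as essentially identical because it ignores length, while the other two measures are affected by the difference in magnitude.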


In [4]:
# Initializing Google Gemini embeddings to vectorize documents
from langchain_google_genai.embeddings import GoogleGenerativeAIEmbeddings
embeddings = GoogleGenerativeAIEmbeddings(
    model="models/embedding-001",             # model parameter specified
    api_key=GOOGLE_API_KEY
)

In [5]:
# Install langchain-community, then import the document loader and text splitter
!pip install -q -U langchain-community
from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m2.5/2.5 MB[0m [31m27.1 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m58.4/58.4 kB[0m [31m3.8 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m48.9/48.9 kB[0m [31m2.5 MB/s[0m eta [36m0:00:00[0m
[?25h

In [6]:
# Upload the source document from the local machine into the Colab session
from google.colab import files
uploaded = files.upload()

Saving rag_document.txt to rag_document.txt


In [7]:
# Load document
loader = TextLoader("rag_document.txt")  # Replace with your file
documents = loader.load()

# Split document into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
docs = text_splitter.split_documents(documents)
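
To show what `chunk_size` and `chunk_overlap` mean, here is a simplified fixed-width sketch. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on separators such as paragraph breaks and newlines); the function name `sliding_window_chunks` and the sample text are assumptions made for this illustration.

```python
def sliding_window_chunks(text, chunk_size=500, chunk_overlap=50):
    """Cut text into windows of chunk_size characters, each sharing
    chunk_overlap characters with the previous window."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break   # the last window already reached the end of the text
    return chunks

sample = "abcdefghijklmnopqrstuvwxyz"
pieces = sliding_window_chunks(sample, chunk_size=10, chunk_overlap=3)
print(pieces)
# ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.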

In [8]:
# Embed and store documents in Pinecone
from tqdm import tqdm     # for progress bar

# Create one embedding per chunk and upload it to Pinecone
for i, doc in enumerate(tqdm(docs)):
    # embed_documents is the document-side counterpart of embed_query
    vector = embeddings.embed_documents([doc.page_content])[0]
    index.upsert([{
        # Every chunk shares the same source path, so using the source alone as the ID
        # would overwrite the previous vector; append the chunk number to keep IDs unique
        "id": f"{doc.metadata['source']}-{i}",
        "values": vector,                      # the embedding vector
        "metadata": {
            "text": doc.page_content,          # chunk text, stored under the "text" key
            "source": doc.metadata["source"]   # originating file
        }
    }])

100%|██████████| 8/8 [00:05<00:00,  1.37it/s]
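
The loop above sends one upsert request per chunk. For larger documents, one possible refinement is to group records into batches first. The sketch below only builds and batches the records; the helper names (`build_records`, `batched`), the placeholder vectors, and the batch size are assumptions made for this example, and no request is sent to Pinecone.

```python
from itertools import islice

def build_records(texts, vectors, source):
    """Pair each chunk with its vector under a distinct ID."""
    return [
        {
            "id": f"{source}-chunk-{i}",
            "values": vector,
            "metadata": {"text": text, "source": source},
        }
        for i, (text, vector) in enumerate(zip(texts, vectors))
    ]

def batched(items, batch_size):
    """Yield successive lists of at most batch_size items."""
    it = iter(items)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch

# Placeholder data standing in for real chunks and embeddings
texts = [f"chunk text {i}" for i in range(5)]
vectors = [[0.1 * i, 0.2 * i] for i in range(5)]
records = build_records(texts, vectors, source="rag_document.txt")
batches = list(batched(records, batch_size=2))
print([len(b) for b in batches])

# In the notebook, each batch could then be sent with: index.upsert(vectors=batch)
```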


In [9]:
# Connect LangChain to the existing Pinecone index
!pip install -q -U langchain-pinecone
from langchain_pinecone import PineconeVectorStore

# text_key names the metadata field that holds each chunk's text
vectorstore = PineconeVectorStore.from_existing_index(index_name=index_name, embedding=embeddings, text_key="text")

[?25l   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.0/1.2 MB[0m [31m?[0m eta [36m-:--:--[0m[2K   [91m━━━━[0m[90m╺[0m[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.1/1.2 MB[0m [31m4.1 MB/s[0m eta [36m0:00:01[0m[2K   [91m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[90m╺[0m[90m━━━━[0m [32m1.1/1.2 MB[0m [31m16.0 MB/s[0m eta [36m0:00:01[0m[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m1.2/1.2 MB[0m [31m12.9 MB/s[0m eta [36m0:00:00[0m
[?25h

In [10]:
# Initialize the model for RAG generation
from langchain_google_genai import ChatGoogleGenerativeAI

gemini_model = ChatGoogleGenerativeAI(api_key=GOOGLE_API_KEY, model="gemini-2.0-flash-exp", temperature=0.7)

In [11]:
from langchain_pinecone import PineconeVectorStore
from langchain.chains import RetrievalQA

# Create a vector store using the Pinecone index
vectorstore = PineconeVectorStore(
    index_name=index_name,
    embedding=embeddings
)

# Create the retriever
retriever = vectorstore.as_retriever(search_kwargs={"k": 2})  # Retrieve top 2 most similar documents

# Create the QA chain
qa_chain = RetrievalQA.from_chain_type(
    llm=gemini_model,
    chain_type="stuff",
    retriever=retriever,  # Pass the retriever here
    return_source_documents=True  # Optional: to get the source documents used in the response
)

In [13]:
# Query the RAG system
query = "What is a RAG system?"
response = qa_chain.invoke(query)

# Print the answer
print("Agent Message:", response['result'])  # Access the 'result' key from the response dictionary

Agent Message: RAG systems are a powerful tool that combines retrieval with generation to provide more informed, context-aware, and personalized interactions. They are used for knowledge-intensive tasks such as question answering, chatbots, document summarization, and other forms of AI-driven content generation.
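
Because the chain was created with `return_source_documents=True`, the response dictionary also carries the retrieved chunks under the `source_documents` key. The sketch below shows one way to list them. The helper `summarize_sources` and the `FakeDocument` stand-in are assumptions made so the example runs without the live chain; with the real chain, `response` from the cell above could be passed in directly.

```python
from dataclasses import dataclass, field

@dataclass
class FakeDocument:
    """Stand-in exposing the two attributes used below: page_content and metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def summarize_sources(response, max_chars=80):
    """Return one short line per retrieved document."""
    lines = []
    for rank, doc in enumerate(response.get("source_documents", []), start=1):
        source = doc.metadata.get("source", "unknown")
        snippet = " ".join(doc.page_content.split())[:max_chars]  # collapse whitespace, then truncate
        lines.append(f"[{rank}] {source}: {snippet}")
    return lines

# Hypothetical response shaped like the chain's output
fake_response = {
    "result": "example answer",
    "source_documents": [
        FakeDocument("RAG combines retrieval\nwith generation.", {"source": "rag_document.txt"}),
        FakeDocument("No metadata here."),
    ],
}
for line in summarize_sources(fake_response):
    print(line)
```

Checking the sources next to the answer makes it easier to tell whether a response is grounded in the uploaded document.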

