# Project 2: LangChain RAG Project
In this project, we create a simple LangChain RAG notebook in Google Colab that uses the Google Gemini Flash model to answer user questions from a provided document. The example below is meant to help you get started and assumes you have access to the Gemini API, Pinecone, and a basic Python environment. Note, however, that the project must be developed and submitted using Google Colab.

## Step 1: Install Required Libraries

In [36]:
# Install everything once; -U upgrades to the latest versions and -q keeps the output short
!pip install -q -U langchain langchain-community langchain-google-genai langchain-google-vertexai
!pip install -q -U pinecone-client google-generativeai openai tqdm python-dotenv
!pip install -q google-cloud google-cloud-language google-auth google-auth-oauthlib google-auth-httplib2

### Steps to Create a .env File in Colab
Write the .env file to the current directory. Use the following code snippet to create and save it:

In [33]:
# Write API keys to a .env file, one KEY=value pair per line.
# Paste each key after its '=' sign before running this cell.
with open(".env", "w") as f:
    f.write("PINECONE_API_KEY=\n")
    f.write("PINECONE_ENVIRONMENT=us-east-1\n")
    f.write("GOOGLE_API_KEY=\n")
print(".env file created successfully!")


.env file created successfully!
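Because every key must sit on its own line, it can help to sanity-check the file contents before loading them. The following is a minimal sketch that uses only the standard library. The helper names `parse_env_text` and `missing_keys` are hypothetical, and the parser is a simplified stand-in, not the one python-dotenv uses.

```python
REQUIRED_KEYS = ["PINECONE_API_KEY", "PINECONE_ENVIRONMENT", "GOOGLE_API_KEY"]


def parse_env_text(text):
    """Parse simple KEY=value lines, skipping blank lines and comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(text, required=REQUIRED_KEYS):
    """Return the required keys that do not appear as their own line."""
    parsed = parse_env_text(text)
    return [key for key in required if key not in parsed]


# Each pair ends with a newline, so all three keys are found.
good = "PINECONE_API_KEY=abc\nPINECONE_ENVIRONMENT=us-east-1\nGOOGLE_API_KEY=xyz\n"
# A forgotten newline glues the second key onto the first key's value.
bad = "PINECONE_API_KEY=abcPINECONE_ENVIRONMENT=us-east-1\nGOOGLE_API_KEY=xyz\n"

print(missing_keys(good))
print(missing_keys(bad))
```

To check the real file, pass its contents instead of the sample strings, for example `missing_keys(open(".env").read())`.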


## Step 2: Set Up Environment Variables

- Create API key variables for secure access.
- Use the python-dotenv package to manage the API keys.

In [34]:
import os
from dotenv import load_dotenv

load_dotenv()
PINECONE_API_KEY = os.getenv("PINECONE_API_KEY")
PINECONE_ENVIRONMENT = os.getenv("PINECONE_ENVIRONMENT")
GOOGLE_GEMINI_API_KEY = os.getenv("GOOGLE_API_KEY")

In [8]:
# Verify keys
#print("Pinecone API Key:", os.environ["PINECONE_API_KEY"])
#print("Google Gemini API Key:", os.environ["GOOGLE_API_KEY"])
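Printing full API keys into notebook output is easy to leak when the notebook is shared. As an alternative, here is a minimal sketch that confirms the keys are loaded while showing only their last few characters. The helpers `mask_secret` and `report_keys` are hypothetical names introduced for this illustration.

```python
import os


def mask_secret(value, visible=4):
    """Hide all but the last `visible` characters of a secret."""
    if not value:
        return "<missing>"
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def report_keys(names, env=None):
    """Map each variable name to a masked version of its value."""
    env = os.environ if env is None else env
    return {name: mask_secret(env.get(name)) for name in names}


for name, shown in report_keys(["PINECONE_API_KEY", "GOOGLE_API_KEY"]).items():
    print(f"{name}: {shown}")
```

A value shown as `<missing>` means the variable was not loaded, which usually points to a problem in the .env file.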

## Step 3: Initialize Pinecone
Pinecone stores and retrieves the vectorized documents. The index has to be created on the first run, but creating it again on a later run raises an error because the index already exists. For that reason, the code below creates the index only if it is missing and otherwise connects to the existing one.

In [37]:
# Initialize Pinecone
#from pinecone import Pinecone, ServerlessSpec

#pc = Pinecone(api_key=PINECONE_API_KEY, environment = PINECONE_ENVIRONMENT)

# Create a new index or connect to an existing one
#index_name = "gemini-rag-index"
#if index_name not in pc.list_indexes():
    # Define the index specification using ServerlessSpec
    #spec = ServerlessSpec(cloud="aws", region="us-east-1")
    #pc.create_index(index_name, dimension=768, spec=spec) # Provide the spec argument

# Connect to the index
#index = pc.Index(index_name)

In [38]:
from pinecone import Pinecone, ServerlessSpec

# Create a Pinecone client instance (this replaces the older pinecone.init() call)
pc = Pinecone(api_key=PINECONE_API_KEY, environment=PINECONE_ENVIRONMENT)

index_name = "gemini-rag-index"
# Create the index only if it does not exist yet, so the notebook can be re-run safely
if index_name not in pc.list_indexes().names():
    pc.create_index(
        name=index_name,
        dimension=768,  # must match the size of the embedding vectors
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    )

# Connect to the index and list the indexes in this project
index = pc.Index(index_name)
print(pc.list_indexes())

{'indexes': [{'deletion_protection': 'disabled',
              'dimension': 768,
              'host': 'gemini-rag-index-ppwml7v.svc.aped-4627-b74a.pinecone.io',
              'metric': 'cosine',
              'name': 'gemini-rag-index',
              'spec': {'serverless': {'cloud': 'aws', 'region': 'us-east-1'}},
              'status': {'ready': True, 'state': 'Ready'}}]}
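The `status` field in the output above reports whether the index is ready to accept requests. A freshly created index may need a short time to reach that state, so one option is to poll before upserting. The sketch below assumes the client exposes `describe_index(name).status["ready"]`, as the output suggests. `wait_until_ready` and `StubClient` are hypothetical names, and the stub exists only so the example runs without a Pinecone account.

```python
import time


def wait_until_ready(client, name, timeout_s=60.0, poll_s=1.0, sleep=time.sleep):
    """Poll the index status until it reports ready or the timeout passes."""
    waited = 0.0
    while waited <= timeout_s:
        if client.describe_index(name).status["ready"]:
            return True
        sleep(poll_s)
        waited += poll_s
    return False


class StubClient:
    """Hypothetical stand-in for the Pinecone client, used only in this demo."""

    def __init__(self, polls_until_ready):
        self.remaining = polls_until_ready

    def describe_index(self, name):
        self.remaining -= 1
        status = {"ready": self.remaining <= 0}
        return type("IndexDescription", (), {"status": status})()


# The stub becomes ready on the third poll; sleeping is skipped to keep the demo fast.
print(wait_until_ready(StubClient(3), "gemini-rag-index", sleep=lambda s: None))
```

In the notebook, the call would be `wait_until_ready(pc, index_name)` right after the index is created.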


## Step 4: Use LangChain for RAG Workflow
Load the source text file and split it into overlapping chunks that are small enough to embed individually.

In [18]:
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load the document with an explicit encoding (if 'latin-1' fails, try 'cp1252')
loader = TextLoader("MACHINE LEARNING MODULE 3.txt", encoding='latin-1')
documents = loader.load()

# Split the documents into smaller chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
docs = text_splitter.split_documents(documents)

print(f"Split into {len(docs)} chunks.")

Split into 253 chunks.
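To see what `chunk_size` and `chunk_overlap` mean, here is a minimal sketch of a fixed-size sliding window over characters. It is a deliberately simplified stand-in: `RecursiveCharacterTextSplitter` also tries to break on separators such as paragraphs and sentences, which this hypothetical `chunk_text` helper does not do.

```python
def chunk_text(text, chunk_size=500, chunk_overlap=50):
    """Split text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghij"
# Each chunk repeats the last character of the previous one (overlap of 1).
print(chunk_text(sample, chunk_size=4, chunk_overlap=1))
```

The overlap means a sentence that straddles a chunk boundary still appears intact in at least one chunk, at the cost of storing some text twice.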


## Step 5: Embed and Store Documents in Pinecone
Use Google Gemini embeddings to vectorize the document chunks and upload to Pinecone.

In [19]:
import time

from langchain_google_genai import GoogleGenerativeAIEmbeddings
from tqdm import tqdm

# Initialize the embedding model
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001", google_api_key=GOOGLE_GEMINI_API_KEY)

# Embed each chunk and upsert it into Pinecone as an (id, vector, metadata) tuple
for i, doc in enumerate(tqdm(docs)):
    vector = embeddings.embed_query(doc.page_content)
    index.upsert([(str(i), vector, {"text": doc.page_content})])
    time.sleep(2)  # Pause 2 seconds between requests

print("Documents uploaded to Pinecone.")


100%|██████████| 253/253 [10:14<00:00,  2.43s/it]

Documents uploaded to Pinecone.
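The loop above sends one embedding request and one upsert per chunk, which took about ten minutes for 253 chunks. A possible alternative is to work in batches. The sketch below assumes the embedding object offers `embed_documents(texts)` (part of LangChain's embeddings interface) and that `index.upsert` accepts a list of `(id, vector, metadata)` tuples, as in the cell above. `batched` and `upsert_in_batches` are hypothetical helper names, and the batch size of 50 is an arbitrary choice.

```python
def batched(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def upsert_in_batches(index, embeddings, texts, batch_size=50):
    """Embed and upsert texts batch by batch; return the number of records sent."""
    total = 0
    for batch_no, batch in enumerate(batched(texts, batch_size)):
        vectors = embeddings.embed_documents(batch)  # one request per batch
        offset = batch_no * batch_size  # keeps ids identical to the per-chunk loop
        records = [
            (str(offset + i), vector, {"text": text})
            for i, (vector, text) in enumerate(zip(vectors, batch))
        ]
        index.upsert(records)
        total += len(records)
    return total


# Show how five items are grouped with a batch size of two.
print(list(batched(list(range(5)), 2)))
```

In the notebook this would be called as `upsert_in_batches(index, embeddings, [d.page_content for d in docs])`. A delay between batches may still be needed if the API enforces rate limits.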





## Step 7: Set Up Retriever
Connect LangChain’s Pinecone Retriever to fetch relevant documents.

In [20]:
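from langchain_community.vectorstores import Pinecone as PineconeVectorStore

# Step 5 already uploaded the vectors, so connect to the existing index instead of
# calling from_documents(), which would embed and upload every chunk a second time.
# Despite its name, `retriever` holds a vector store; Step 9 calls .as_retriever() on it.
retriever = PineconeVectorStore.from_existing_index(
    index_name="gemini-rag-index", embedding=embeddings, text_key="text"
)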
print("Retriever setup complete.")

Retriever setup complete.
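Conceptually, retrieval embeds the question and returns the stored chunks whose vectors are most similar to it. The index in this notebook uses the cosine metric, so the idea can be sketched in plain Python as follows. This is an illustration only: Pinecone performs the search on its servers with its own indexing, and `cosine_similarity` and `top_k` are hypothetical helpers with toy two-dimensional vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vector, records, k=2):
    """Return the k (score, text) pairs most similar to the query vector."""
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in records]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy records: (chunk text, embedding vector)
records = [
    ("supervised learning", [1.0, 0.0]),
    ("cooking recipes", [0.0, 1.0]),
    ("machine learning overview", [1.0, 1.0]),
]
for score, text in top_k([1.0, 0.0], records, k=2):
    print(f"{score:.3f}  {text}")
```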


## Step 8: Initialize Google Gemini Flash Model
Use Google Gemini Flash as the LLM for answering questions.

In [21]:
# Set up the Google Gemini Flash model
from langchain_google_genai import ChatGoogleGenerativeAI

# Read the API key from the environment (loaded from .env in Step 2)
gemini_model = ChatGoogleGenerativeAI(
    model="gemini-1.5-flash",
    api_key=os.environ["GOOGLE_API_KEY"],
    temperature=0.7
)
print("Google Gemini Flash model initialized.")

Google Gemini Flash model initialized.


## Step 9: Combine Retriever and LLM with LangChain
Use the RetrievalQA chain to combine document retrieval with the Gemini model.

In [22]:
from langchain.chains import RetrievalQA

# Set up the RetrievalQA chain. The "stuff" chain type places all retrieved chunks into a single prompt.
qa_chain = RetrievalQA.from_chain_type(
    llm=gemini_model,
    chain_type="stuff",
    retriever=retriever.as_retriever()  # Convert the vector store into a retriever
)
print("RetrievalQA chain ready.")

RetrievalQA chain ready.
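The `"stuff"` chain type takes all retrieved chunks and inserts them into one prompt together with the question. The sketch below shows that idea in plain Python. The prompt wording and the `build_stuff_prompt` helper are hypothetical and are not the template LangChain actually uses.

```python
def build_stuff_prompt(question, chunks, separator="\n\n"):
    """Join the retrieved chunks into one context block and append the question."""
    context = separator.join(chunks)
    return (
        "Use the following context to answer the question. "
        "If the answer is not in the context, say that you do not know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


prompt = build_stuff_prompt(
    "What is machine learning?",
    ["Chunk one about machine learning.", "Chunk two about learning from data."],
)
print(prompt)
```

Because every chunk goes into a single prompt, this approach works while the retrieved text fits in the model's context window; with many or very large chunks, other chain types may be a better fit.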


## Step 10: Query the RAG System
Now, you can test your RAG system with questions.

In [23]:
query = "What is machine learning?"
response = qa_chain.invoke({"query": query})["result"]  # invoke() replaces the deprecated run()

print("Question:", query)
print("Response:", response)


Question: What is machine learning?
Response: Machine learning (ML) is a discipline of artificial intelligence (AI) that provides machines with the ability to automatically learn from data and past experiences while identifying patterns to make predictions with minimal human intervention.  It's a method of training a machine to learn from data, rather than being explicitly programmed.  This involves feeding a machine a large dataset and allowing it to learn and improve performance over time.



In [24]:
query = "What are the types of machine learning?"
response = qa_chain.invoke({"query": query})["result"]  # invoke() replaces the deprecated run()


print("Question:", query)
print("Response:", response)

Question: What are the types of machine learning?
Response: Machine learning algorithms can be broadly categorized into supervised, unsupervised, semi-supervised, and reinforcement learning.



In [25]:
query = "explain types of machine learning?"
response = qa_chain.invoke({"query": query})["result"]  # invoke() replaces the deprecated run()


print("Question:", query)
print("Response:", response)

Question: explain types of machine learning?
Response: Machine learning algorithms can be broadly categorized into supervised, unsupervised, semi-supervised, and reinforcement learning.  Each category has specific types of algorithms designed for particular kinds of tasks.



------------

## Step 11: Deploy and Iterate
Deploy the RAG system as an API using FastAPI or Flask.
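Whichever framework is chosen, the core of the endpoint is the same: validate the request, run the chain, and return the answer. Here is a minimal, framework-agnostic sketch of that logic which could be wrapped in a FastAPI or Flask route. The `handle_question` function, the `"question"` field name, and `StubChain` are hypothetical. The stub only mimics the `invoke({"query": ...})` call and `"result"` key of a RetrievalQA chain so the example runs on its own.

```python
import json


def handle_question(raw_body, chain):
    """Validate a JSON request body and return (status_code, response_dict)."""
    try:
        payload = json.loads(raw_body)
    except json.JSONDecodeError:
        return 400, {"error": "body must be valid JSON"}
    question = payload.get("question") if isinstance(payload, dict) else None
    if not isinstance(question, str) or not question.strip():
        return 400, {"error": "'question' must be a non-empty string"}
    question = question.strip()
    result = chain.invoke({"query": question})
    return 200, {"question": question, "answer": result["result"]}


class StubChain:
    """Hypothetical stand-in for qa_chain, used only in this demo."""

    def invoke(self, inputs):
        return {"query": inputs["query"], "result": "stub answer"}


print(handle_question('{"question": "What is machine learning?"}', StubChain()))
print(handle_question("not json", StubChain()))
```

In a real deployment, `StubChain()` would be replaced by the `qa_chain` object built earlier, and the API keys would come from the server's environment rather than a notebook cell.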

