
# **What is Retrieval-Augmented Generation (RAG)?**
**Retrieval-Augmented Generation (RAG)** combines information retrieval with a generative model. Relevant documents are first retrieved from a database or other knowledge source, then passed to the model as context for its answer. Grounding the response in retrieved text makes it more accurate and domain-specific, which is especially useful for tasks that need detailed or up-to-date information.

---

## **Use cases:**
RAG is ideal for building intelligent Q&A systems, streamlining knowledge retrieval, and generating personalized insights from vast datasets. It enhances applications in customer support, education, and enterprise knowledge management.

# **Generative AI with Retrieval-Augmented Generation (RAG)**

This project demonstrates a **Retrieval-Augmented Generation (RAG)** pipeline for question-answering using LangChain, Pinecone, and Google Generative AI (Gemini). The workflow involves embedding documents, storing them in a Pinecone vector database, and querying them with a generative AI model.

---

## **Prerequisites**

Before running the notebook, ensure you have the following:

1. A **Pinecone** account with an API key and the cloud and region to use for a serverless index.
2. A **Google Generative AI** API key.
3. A text or PDF file containing the documents you want to process (e.g., `fusionenergy1.txt`).

---

## **Installation**

Run the following command to install the required libraries:


In [None]:
!pip install langchain pinecone-client google-generativeai openai tqdm langchain_google_genai langchain_community langchain-pinecone

## **1. Environment Setup**
Retrieve and set API keys for Pinecone and Google Generative AI using Colab's userdata:

In [62]:
from google.colab import userdata
import os

PINECONE_API_KEY = userdata.get("PINECONE_API_KEY")
PINECONE_ENVIRONMENT = userdata.get("PINECONE_ENVIRONMENT")
GOOGLE_API_KEY = userdata.get("GOOGLE_API_KEY")

os.environ["PINECONE_API_KEY"] = PINECONE_API_KEY
os.environ["PINECONE_ENVIRONMENT"] = PINECONE_ENVIRONMENT
os.environ["GOOGLE_API_KEY"] = GOOGLE_API_KEY
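
An empty key usually surfaces only later, as an authentication error from Pinecone or Google, so it can help to check the environment right after setting it. The following is a minimal sketch, not part of Colab or LangChain: `missing_env_vars` and `REQUIRED_VARS` are hypothetical names, and the list of required variables is an assumption based on the keys this notebook actually uses.

```python
import os

# Assumption: these are the two keys the rest of the notebook depends on.
REQUIRED_VARS = ("PINECONE_API_KEY", "GOOGLE_API_KEY")


def missing_env_vars(required=REQUIRED_VARS, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Example with an explicit mapping so the check is easy to try out.
example_env = {"PINECONE_API_KEY": "placeholder-key", "GOOGLE_API_KEY": ""}
print(missing_env_vars(env=example_env))  # ['GOOGLE_API_KEY']
```

In the notebook, calling `missing_env_vars()` with no arguments checks `os.environ` directly; an empty list means both keys are present.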

## **2. Initialize Pinecone**
Set up the Pinecone client and create an index:

In [None]:
import os, time
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(
    api_key=os.environ.get("PINECONE_API_KEY")
)

cloud = os.environ.get('PINECONE_CLOUD') or 'aws'
region = os.environ.get('PINECONE_REGION') or 'us-east-1'
spec = ServerlessSpec(cloud=cloud, region=region)

index_name = "first-rag-project"

if index_name not in pc.list_indexes().names():
    pc.create_index(
        name=index_name,
        dimension=768,
        metric="cosine",
        spec=spec
    )
    # Wait for index to be ready
    while not pc.describe_index(index_name).status['ready']:
        time.sleep(1)

# See that it is empty
print("Index before upsert:")
print(pc.Index(index_name).describe_index_stats())
print("\n")
index = pc.Index(index_name)
print('index\n', index)
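
The index above is created with `metric="cosine"`, so Pinecone ranks stored vectors by the cosine of the angle between them and the query vector. The sketch below shows that computation in plain Python on tiny vectors; `cosine_similarity` is an illustrative helper, not a Pinecone function, and real vectors in this notebook have 768 dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))  # same direction: 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 3.0]))  # orthogonal: 0.0
```

Because the score depends only on direction, not length, two chunks with similar meaning score close to 1 even if their embedding magnitudes differ.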

## **3. Generate Embeddings**
Use Google Generative AI Embeddings to create vector embeddings for your documents:

In [63]:
from langchain_google_genai import GoogleGenerativeAIEmbeddings

embeddings = GoogleGenerativeAIEmbeddings(
    model="models/embedding-001",
    api_key=GOOGLE_API_KEY
)

## **4. Document Preparation**
Load and split your document(s) into smaller chunks for better retrieval performance:

In [87]:
# Loaders live in langchain_community; the splitter lives in langchain_text_splitters
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

loader = TextLoader("/content/fusionenergy1.txt")  # Replace with your document file
documents = loader.load()

text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
docs = text_splitter.split_documents(documents)
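
To see what `chunk_size` and `chunk_overlap` control, here is a deliberately simplified sketch of overlapping chunking. It cuts at fixed character offsets, whereas `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraphs and sentences, so treat `chunk_text` as a hypothetical illustration rather than the library's algorithm.

```python
def chunk_text(text, chunk_size=500, chunk_overlap=50):
    """Split `text` into fixed-width chunks that share `chunk_overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1))
# ['abcd', 'defg', 'ghij']
```

Each chunk repeats the last character of the previous one, which is the role the 50-character overlap plays above: a sentence that straddles a boundary still appears whole in at least one chunk more often.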

**Optional (for PDFs):**

In [None]:
### FOR PDFs
# !pip install -q pypdf
# from langchain_community.document_loaders import PyPDFLoader

# # Use PyPDFLoader specifically for PDF files
# loader = PyPDFLoader("/content/fusionenergy.pdf")
# documents = loader.load()

# text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
# docs = text_splitter.split_documents(documents)

## **5. Insert Data into Pinecone**
Embed document chunks into vectors and upsert them into Pinecone:

In [None]:
from tqdm import tqdm  # Shows a progress bar

for i, doc in enumerate(tqdm(docs)):  # Loop through document chunks
    # Embed the chunk as a document (embed_query is intended for search queries)
    vector = embeddings.embed_documents([doc.page_content])[0]
    index.upsert([{
        # IDs must be unique per chunk; reusing the source path alone would
        # make every upsert overwrite the previous one
        "id": f"{doc.metadata['source']}-{i}",
        "values": vector,  # The embedding vector
        "metadata": {
            "text": doc.page_content,  # Chunk text, read back by the retriever
            "source": doc.metadata["source"]  # Originating file
        }
    }])
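
Sending one request per chunk becomes slow for larger documents. Since `index.upsert` accepts a list of records, the records can be built first and sent in groups. The sketch below uses toy vectors instead of real embeddings; `build_records` and `batched` are hypothetical helpers, and the batch size is an assumption to tune, not a Pinecone requirement.

```python
def build_records(chunks, vectors, source):
    """Pair each chunk with its vector under an ID that is unique per chunk."""
    if len(chunks) != len(vectors):
        raise ValueError("chunks and vectors must have the same length")
    return [
        {
            "id": f"{source}-{i}",
            "values": vector,
            "metadata": {"text": chunk, "source": source},
        }
        for i, (chunk, vector) in enumerate(zip(chunks, vectors))
    ]


def batched(items, batch_size=100):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Toy data standing in for real chunks and 768-dimensional embeddings.
chunks = ["chunk a", "chunk b", "chunk c"]
vectors = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
records = build_records(chunks, vectors, source="fusionenergy1.txt")

for batch in batched(records, batch_size=2):
    # In the notebook this would be: index.upsert(vectors=batch)
    print([record["id"] for record in batch])
```

Keeping the chunk text under the `"text"` metadata key matters later: the vector store reads that key back when it returns documents to the QA chain.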

## **6. Configure the Generative AI Model**
Set up the Google Generative AI model for generating answers:

In [72]:
from langchain_google_genai import ChatGoogleGenerativeAI

gemini_model = ChatGoogleGenerativeAI(model="gemini-1.5-flash", api_key=GOOGLE_API_KEY, temperature=0.7)

## **7. Set Up Retrieval-Based QA**
Create a retriever using the Pinecone vector store and set up a QA chain:

In [79]:
from langchain_pinecone import PineconeVectorStore
from langchain.chains import RetrievalQA

# Create a vector store using the Pinecone index
vectorstore = PineconeVectorStore(
    index_name=index_name,
    embedding=embeddings
)

# Create the retriever
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})  # Retrieve top 4 most similar documents

# Create the QA chain
qa_chain = RetrievalQA.from_chain_type(
    llm=gemini_model,
    chain_type="stuff",
    retriever=retriever,  # Pass the retriever here
    return_source_documents=True  # Optional: to get the source documents used in the response
)
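
With `chain_type="stuff"`, all retrieved chunks are placed ("stuffed") into a single prompt together with the question, and the model is called once. The sketch below illustrates the idea only; `build_stuff_prompt` is a hypothetical helper, and the prompt wording is an assumption rather than LangChain's actual template.

```python
def build_stuff_prompt(question, retrieved_chunks):
    """Join every retrieved chunk into one context block ahead of the question."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Placeholder chunks and question, standing in for real retriever output.
prompt = build_stuff_prompt(
    "Example question?",
    ["Chunk one text.", "Chunk two text."],
)
print(prompt)
```

This is why `k` and `chunk_size` interact: four chunks of up to 500 characters each must fit in the model's context window alongside the question.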


## **8. Query the QA System**
Ask a question and receive an answer from the RAG system:

In [None]:
query = "What is the current "
response = qa_chain.invoke(query)
print(response['result'])
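
Because the chain was built with `return_source_documents=True`, the response also carries the retrieved chunks under `source_documents`, which is useful for checking what the answer was based on. The sketch below formats them for display; `format_sources` is a hypothetical helper, and the stand-in objects only mimic the `page_content` and `metadata` attributes of LangChain documents.

```python
from types import SimpleNamespace


def format_sources(response, preview_chars=80):
    """Return one summary line per source document in a chain response."""
    lines = []
    for i, doc in enumerate(response.get("source_documents", []), start=1):
        source = doc.metadata.get("source", "unknown")
        preview = doc.page_content[:preview_chars].replace("\n", " ")
        lines.append(f"[{i}] {source}: {preview}")
    return lines


# Stand-in for a real chain response, so the helper can run without API keys.
fake_response = {
    "result": "placeholder answer",
    "source_documents": [
        SimpleNamespace(page_content="First chunk\nof text.", metadata={"source": "a.txt"}),
        SimpleNamespace(page_content="Second chunk.", metadata={}),
    ],
}
for line in format_sources(fake_response):
    print(line)
```

In the notebook, `format_sources(response)` can be called on the result of `qa_chain.invoke(query)` in the same way.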