<a href="https://colab.research.google.com/github/MishaJavaid787/PIAIC-Batch-62/blob/main/Project_02_LangChain_RAG_Project.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Install Packages**

In [2]:
!pip install -qU langchain-pinecone langchain-google-genai


**API Keys**

In [3]:
from google.colab import userdata
from pinecone import Pinecone, ServerlessSpec

pinecone_api_key = userdata.get("PINECONE_API_KEY")

pc = Pinecone(api_key=pinecone_api_key)

**Initialize Pinecone**

In [4]:
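import time

index_name = "gemini-rag-project"

# Create the index only if it does not already exist, so the cell can be re-run
existing_indexes = [index_info["name"] for index_info in pc.list_indexes()]
if index_name not in existing_indexes:
    pc.create_index(
        name=index_name,
        dimension=768,  # must match the embedding model's output size
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    )
    # Wait until the new index is ready to accept requests
    while not pc.describe_index(index_name).status["ready"]:
        time.sleep(1)

index = pc.Index(index_name)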
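
The index is created with `metric="cosine"`, so stored vectors are ranked by the cosine of the angle between them and the query vector, regardless of their length. A minimal sketch in plain Python of what that score computes; the function name `cosine_similarity` and the toy 3-dimensional vectors are illustrative assumptions (the real index holds 768-dimensional embeddings):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy vectors, not real embeddings.
query = [1.0, 0.0, 0.0]
same_direction = [2.0, 0.0, 0.0]  # longer, but points the same way
orthogonal = [0.0, 3.0, 0.0]

print(cosine_similarity(query, same_direction))  # 1.0
print(cosine_similarity(query, orthogonal))      # 0.0
```

The dimension check mirrors why `dimension=768` has to agree with the embedding model: vectors of different sizes cannot be compared.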

**Use LangChain for RAG Workflow**

*Use Google Gemini embeddings to vectorize documents*

In [6]:
import os

from langchain_google_genai import GoogleGenerativeAIEmbeddings

# langchain-google-genai reads the key from the GOOGLE_API_KEY environment variable
os.environ["GOOGLE_API_KEY"] = userdata.get("GOOGLE_API_KEY")

# embedding-001 returns 768-dimensional vectors, matching the index dimension
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")

*Load and preprocess the documents*

In [8]:
!pip install -qU langchain-community docx2txt
from langchain_community.document_loaders import Docx2txtLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Docx2txtLoader extracts the text from a .docx file (TextLoader only handles plain text)
loader = Docx2txtLoader("/content/DeepSeek_txt.docx")
documents = loader.load()

# Split the documents into chunks of up to 500 characters with a 50-character overlap
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = text_splitter.split_documents(documents)
for i, chunk in enumerate(chunks):
    print(f"\nChunk {i+1}:\n{chunk.page_content}\n")


Chunk 1:
What is DeepSeek?

DeepSeek is the name of a free AI-powered chatbot, which looks, feels and works very much like ChatGPT.

That means it's used for many of the same tasks, though exactly how well it works compared to its rivals is up for debate.

It is reportedly as powerful as OpenAI's o1 model - released at the end of last year - in tasks including mathematics and coding.


Chunk 2:
Like o1, R1 is a "reasoning" model. These models produce responses incrementally, simulating how humans reason through problems or ideas.

Deepseek says it has been able to do this cheaply - researchers behind it claim it cost $6m (£4.8m) to train, a fraction of the "over $100m" alluded to by OpenAI boss Sam Altman when discussing GPT-4.

It has also seemingly be able to minimise the impact of US restrictions on the most powerful chips reaching China.


Chunk 3:
DeepSeek's founder reportedly built up a store of Nvidia A100 chips, which have been banned from export to China since September 2022.
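
To make `chunk_size` and `chunk_overlap` concrete, here is a simplified sketch of overlapping chunking in plain Python. It is not how `RecursiveCharacterTextSplitter` works internally: that class prefers to break at separators such as paragraph and line boundaries, so its chunks are usually shorter than `chunk_size`, as in the output above. The helper name `sliding_window_chunks` is hypothetical and uses fixed-width windows only to show how the two parameters interact:

```python
def sliding_window_chunks(text, chunk_size, chunk_overlap):
    """Split text into fixed-width windows that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks

print(sliding_window_chunks("abcdefghij", chunk_size=4, chunk_overlap=1))
# ['abcd', 'defg', 'ghij']
```

In this sketch, `chunk_size=500` with `chunk_overlap=50` would advance the window by 450 characters each time, so the last 50 characters of one chunk reappear at the start of the next.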

*Embed and Store Documents in Pinecone*

In [9]:
from tqdm import tqdm

batch_size = 100
batch = []

# Embed each chunk and upload to Pinecone in batches
for i, chunk in tqdm(enumerate(chunks), total=len(chunks), desc="Uploading to Pinecone"):
    vector = embeddings.embed_query(chunk.page_content)

    # Unique ID per chunk; "source" alone is the same for every chunk of a file
    source = chunk.metadata.get("source", "doc")
    chunk_id = f"{source}-chunk-{i}"

    # Store the text under "text", the metadata key PineconeVectorStore reads by default
    batch.append((chunk_id, vector, {"text": chunk.page_content}))

    # Upload once the batch is full
    if len(batch) >= batch_size:
        index.upsert(batch)
        batch = []  # Clear the batch

# Upload any remaining chunks
if batch:
    index.upsert(batch)

print("Successfully stored vectorized chunks in Pinecone!")


Successfully stored vectorized chunks in Pinecone!
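
The batching logic in the upload cell can also be factored into a small generator. This is a sketch under assumptions: `batched` is a hypothetical helper, and the records use placeholder IDs and two-element vectors instead of real embeddings. In the notebook, each yielded group would be what gets passed to `index.upsert`:

```python
def batched(items, batch_size):
    """Yield consecutive slices of items with at most batch_size elements each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# Placeholder (id, vector, metadata) records standing in for embedded chunks.
records = [(f"chunk-{i}", [0.0, 0.0], {"text": f"text {i}"}) for i in range(5)]

sizes = [len(group) for group in batched(records, 2)]
print(sizes)  # [2, 2, 1]
```

The final, shorter group is yielded by the same loop, so no separate step is needed for leftovers.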


**Use LangChain’s Pinecone integration to retrieve relevant documents**

In [10]:
from langchain_pinecone import PineconeVectorStore

vector_store = PineconeVectorStore(index=index, embedding=embeddings)

**Initialize the Google Gemini Flash Model for RAG generation.**

In [11]:
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(
    model="gemini-1.5-flash",
    temperature=0,
    max_tokens=None,
    timeout=None,
    max_retries=2,
    # other params...
)

**Use LangChain to integrate the retriever and the LLM.**

In [12]:
from langchain.chains import RetrievalQA
retriever = vector_store.as_retriever()

qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever
)
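
With `chain_type="stuff"`, all retrieved documents are placed ("stuffed") into a single prompt together with the question, and the model is called once. A rough sketch of that idea in plain Python; the function `build_stuff_prompt` and the template wording are assumptions for illustration, not LangChain's actual prompt:

```python
def build_stuff_prompt(question, passages):
    """Join every retrieved passage into one context block ahead of the question."""
    context = "\n\n".join(passages)
    return (
        "Use the following context to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

# Hypothetical retrieved passages.
passages = [
    "DeepSeek is the name of a free AI-powered chatbot.",
    "Like o1, R1 is a \"reasoning\" model.",
]
prompt = build_stuff_prompt("What is DeepSeek?", passages)
print(prompt)
```

Because everything goes into one prompt, this approach only suits cases where the retrieved chunks fit within the model's context window.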

**Query the RAG System**

In [13]:
query = "How is DeepSeek different from other AI Models?"
response = qa_chain.invoke({"query": query})
print(response["result"])


DeepSeek distinguishes itself from other AI models in several key ways:

* **Lower cost:**  It was reportedly trained for $6 million, a fraction of the cost of models like GPT-4.  This was achieved through efficient use of hardware and potentially by combining high-end and lower-end chips.

* **Resource efficiency:** DeepSeek uses less memory than its rivals, leading to lower operational costs.

* **Performance:**  It's claimed to be comparable in performance to OpenAI's o1 model in tasks like mathematics and coding.  It's a "reasoning" model, similar to o1, producing responses incrementally.

* **Circumvention of US restrictions:**  Its development seemingly circumvented US restrictions on exporting high-performance chips to China.

* **Market impact:** Its release caused significant disruption in the AI market, leading to a drop in the stock prices of companies like Nvidia.  This highlights its potential to challenge the established dominance of US AI companies.

* **Political sensit

In [14]:
def answer_to_user(query: str):
    # Vector search: retrieve the two chunks most similar to the query, with scores
    vector_results = vector_store.similarity_search_with_score(query, k=2)
    # Pass the query and the retrieved references to the model
    final_answer = llm.invoke(f"ANSWER THIS QUERY: {query}, Here are some references to the answer {vector_results}")
    return final_answer
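
The call to `similarity_search_with_score(query, k=2)` returns the two best matches along with their scores. The selection step can be sketched with an in-memory store; the names `top_k` and `store` and the toy 2-dimensional vectors are assumptions, and the vectors are taken to be unit length so that the dot product equals the cosine similarity:

```python
import heapq

def top_k(query_vec, store, k=2):
    """Return the k (text, score) pairs with the highest dot-product score."""
    scored = (
        (text, sum(q * v for q, v in zip(query_vec, vec)))
        for text, vec in store.items()
    )
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])

# Toy unit-length vectors standing in for chunk embeddings.
store = {
    "about cost": [1.0, 0.0],
    "about chips": [0.0, 1.0],
    "about both": [0.6, 0.8],
}

results = top_k([1.0, 0.0], store, k=2)
print(results)  # [('about cost', 1.0), ('about both', 0.6)]
```

A hosted index such as Pinecone typically uses approximate nearest-neighbour search instead of scoring every vector, but the result has the same shape: the best `k` items, highest score first.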

In [15]:
answer = answer_to_user("Is DeepSeek an AI model?")
answer.content

'Based on the provided text, DeepSeek is an AI-powered chatbot.  The document explicitly states that it is "a free AI-powered chatbot" and describes its capabilities in tasks like mathematics and coding, comparing it to other AI models like OpenAI\'s.'

In [16]:
answer = answer_to_user("How has China reacted to DeepSeek's impact?")
answer.content



**Playing with parameters**

In [None]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=500)
chunks = text_splitter.split_documents(documents)
for i, chunk in enumerate(chunks):
    print(f"\nChunk {i+1}:\n{chunk.page_content}\n")


Chunk 1:
What is DeepSeek?

DeepSeek is the name of a free AI-powered chatbot, which looks, feels and works very much like ChatGPT.

That means it's used for many of the same tasks, though exactly how well it works compared to its rivals is up for debate.

It is reportedly as powerful as OpenAI's o1 model - released at the end of last year - in tasks including mathematics and coding.

Like o1, R1 is a "reasoning" model. These models produce responses incrementally, simulating how humans reason through problems or ideas.

Deepseek says it has been able to do this cheaply - researchers behind it claim it cost $6m (£4.8m) to train, a fraction of the "over $100m" alluded to by OpenAI boss Sam Altman when discussing GPT-4.

It has also seemingly be able to minimise the impact of US restrictions on the most powerful chips reaching China.


Chunk 2:
Like o1, R1 is a "reasoning" model. These models produce responses incrementally, simulating how humans reason through problems or ideas.

Deeps