<a href="https://colab.research.google.com/github/MishaJavaid787/PIAIC-Batch-62/blob/main/Project_02_LangChain_RAG_Project.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Install Packages**

In [1]:
!pip install -qU langchain-pinecone langchain-google-genai

In [2]:
from google.colab import userdata
from pinecone import Pinecone, ServerlessSpec

pinecone_api_key = userdata.get("PINECONE_API_KEY")

pc = Pinecone(api_key=pinecone_api_key)

**Initialize Pinecone**

In [4]:
import time
index_name = "gemini-rag-project-1"

pc.create_index (
        name=index_name,
        dimension=768,
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    )

index = pc.Index(index_name)

In [5]:
!pip install -U langchain-community



In [6]:
!pip install docx2txt
from langchain.document_loaders import Docx2txtLoader

# Use Docx2txtLoader instead of TextLoader
loader = Docx2txtLoader("/content/DeepSeek_txt.docx")
documents = loader.load()

# Split the documents into chunks
from langchain_text_splitters import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = text_splitter.split_documents(documents)
for i, chunk in enumerate(chunks):
    print(f"\nChunk {i+1}:\n{chunk.page_content}\n")


Chunk 1:
What is DeepSeek?

DeepSeek is the name of a free AI-powered chatbot, which looks, feels and works very much like ChatGPT.

That means it's used for many of the same tasks, though exactly how well it works compared to its rivals is up for debate.

It is reportedly as powerful as OpenAI's o1 model - released at the end of last year - in tasks including mathematics and coding.


Chunk 2:
Like o1, R1 is a "reasoning" model. These models produce responses incrementally, simulating how humans reason through problems or ideas.

Deepseek says it has been able to do this cheaply - researchers behind it claim it cost $6m (£4.8m) to train, a fraction of the "over $100m" alluded to by OpenAI boss Sam Altman when discussing GPT-4.

It has also seemingly be able to minimise the impact of US restrictions on the most powerful chips reaching China.


Chunk 3:
DeepSeek's founder reportedly built up a store of Nvidia A100 chips, which have been banned from export to China since September 2022.

**Using LangChain for RAG Workflow**

*Use Google Gemini embeddings to vectorize* *documents*

In [7]:
!pip install -qU langchain-google-genai

In [8]:
from langchain_google_genai import GoogleGenerativeAIEmbeddings
import os

os.environ["GOOGLE_API_KEY"] = userdata.get("GOOGLE_API_KEY")

embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")

In [11]:
from tqdm import tqdm

batch_size = 100
batch = []

# Create embeddings and upload to Pinecone in batches
for i, doc in tqdm(enumerate(documents), total=len(documents), desc="Uploading to Pinecone"):
    vector = embeddings.embed_query(doc.page_content)

    # Unique ID for each document (fallback if "source" is missing)
    doc_id = doc.metadata.get("source", f"doc_{i}")

    # Append data to batch
    batch.append((doc_id, vector, {"text": doc.page_content}))

    # Upload in batches
    if len(batch) >= batch_size:
        index.upsert(batch)  # Perform batch upload
        batch = []  # Clear batch

# Upload any remaining chunks
if batch:
    index.upsert(batch)

print("Successfully stored vectorized chunks in Pinecone!")

Uploading to Pinecone: 100%|██████████| 1/1 [00:00<00:00,  6.83it/s]


Successfully stored vectorized chunks in Pinecone!


In [12]:
from langchain_pinecone import PineconeVectorStore

vector_store = PineconeVectorStore(index=index, embedding=embeddings)

In [13]:
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(
    model="gemini-1.5-flash",
    temperature=0,
    max_tokens=None,
    timeout=None,
    max_retries=2,
    # other params...
)

In [14]:
from langchain.chains import RetrievalQA
retriever = vector_store.as_retriever()

qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever
)

In [15]:
query = "How is DeepSeek different from other AI Models?"
response = qa_chain.run(query)
print(response)

  response = qa_chain.run(query)


DeepSeek distinguishes itself from other AI models in several key ways:

* **Lower cost:**  It was reportedly trained for $6 million, a fraction of the cost of models like GPT-4.  This was achieved through efficient use of hardware and potentially by combining high-end and lower-end chips.

* **Lower memory usage:** DeepSeek uses less memory than its rivals, leading to reduced operational costs.

* **Performance:**  It's claimed to be comparable in performance to OpenAI's o1 model in tasks like mathematics and coding.  It's a "reasoning" model, similar to o1, producing responses incrementally.

* **Circumvention of US chip restrictions:**  DeepSeek's development seemingly circumvented US restrictions on exporting high-performance chips to China, raising questions about the reliance on top-tier hardware for AI advancement.

* **Market impact:** Its relatively low cost and high performance caused significant disruption in the AI market, leading to a sell-off in the stock prices of compan

In [16]:
#Vector search
def answer_to_user(query: str):
#Vector search
  vector_results = vector_store.similarity_search_with_score(query, k=2)
# Pass Model vector search + Query
  final_answer = llm.invoke(f"ANSWER THIS QUERY: {query}, Here are some references to the answer {vector_results}")
  return final_answer

In [17]:
answer = answer_to_user("Is DeepSeek an AI model?")
answer.content

'Based on the provided text, DeepSeek is an AI-powered chatbot.  The document explicitly states that it is "a free AI-powered chatbot" and describes its capabilities as comparable to other large language models like ChatGPT and OpenAI\'s o1 model.  Therefore, the answer is **yes**.'

In [18]:
answer = answer_to_user("How has China reacted to DeepSeek's impact?")
answer.content



Deploy as an API with FastAPI

In [19]:
!pip install fastapi # Installs the fastapi library
!pip install python-multipart # uvicorn dependency
!pip install uvicorn # ASGI server

from fastapi import FastAPI, UploadFile, File
import shutil
import docx2txt

app = FastAPI()

@app.post("/upload/")
async def upload_docx(file: UploadFile = File(...)):
    file_location = f"./{file.filename}"
    with open(file_location, "wb") as buffer:
        shutil.copyfileobj(file.file, buffer)

    text = docx2txt.process(file_location)

    text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
    docs = text_splitter.create_documents([text])

    batch = []
    for i, doc in enumerate(docs):
        vector = embeddings.embed_query(doc.page_content)
        doc_id = f"doc_{i}"
        batch.append((doc_id, vector, {"text": doc.page_content}))

    index.upsert(batch)

    return {"message": "Document uploaded and indexed successfully!"}

@app.post("/query/")
def query_rag(question: str):
    query_vector = embeddings.embed_query(question)
    results = index.query(vector=query_vector, top_k=1, include_metadata=True)

    context = "\n".join([match["metadata"]["text"] for match in results["matches"]])
    return {"question": question, "answer": context}



In [None]:
!uvicorn rag_api.ipynb:app --reload --host 0.0.0.0 --port 8000

[32mINFO[0m:     Will watch for changes in these directories: ['/content']
[32mINFO[0m:     Uvicorn running on [1mhttp://0.0.0.0:8000[0m (Press CTRL+C to quit)
[32mINFO[0m:     Started reloader process [[36m[1m59104[0m] using [36m[1mStatReload[0m
Process SpawnProcess-1:
Traceback (most recent call last):
  File "/usr/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.11/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.11/dist-packages/uvicorn/_subprocess.py", line 80, in subprocess_started
    target(sockets=sockets)
  File "/usr/local/lib/python3.11/dist-packages/uvicorn/server.py", line 66, in run
    return asyncio.run(self.serve(sockets=sockets))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/usr/lib/pytho

In [36]:
!curl -X POST -F "file=/content/DeepSeek_txt.docx" http://127.0.0.1:8000/upload/

curl: (7) Failed to connect to 127.0.0.1 port 8000 after 0 ms: Connection refused


In [37]:
!curl -X POST "http://127.0.0.1:8000/query/?question=What are the DeepSeek challenges?"

curl: (3) URL using bad/illegal format or missing URL
