<a href="https://colab.research.google.com/github/Saifullah785/langchain-generative-ai-journey/blob/main/Lecture_05_RAG_based_Application_using_Langchain_Deepseek/Lecture_05_RAG_based_Application_using_Langchain_Deepseek.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Document Loading and Processing:

The project loads content from a PDF document using PyPDFLoader and splits it into smaller, manageable chunks using RecursiveCharacterTextSplitter.

# Vector Embeddings Creation:

It utilizes HuggingFaceEmbeddings to convert the text chunks into numerical vector representations.

# Vector Store Implementation:

The project builds a FAISS vector store from the document embeddings, enabling efficient similarity searches.

# Hugging Face Model Integration:

It connects to a Hugging Face text generation model (specifically "deepseek-ai/DeepSeek-R1-0528") via HuggingFaceEndpoint to serve as the Large Language Model (LLM).

# Retrieval-Augmented Generation (RAG) QA Chain:

It sets up a RetrievalQA chain that retrieves relevant document chunks based on a user query and then uses the integrated Hugging Face LLM to generate an answer based on the retrieved information.




# Installing necessary libraries using pip.

# langchain-huggingface: For integrating Hugging Face models with LangChain.

# langchain: The core LangChain library.

# pypdf: To load PDF documents.

# faiss-cpu: A library for efficient similarity search and clustering (CPU version).

In [2]:

!pip install langchain-huggingface langchain pypdf faiss-cpu

# Upgrade the langchain-community library to the latest version.
# This package provides the third-party integrations (document loaders, vector stores) used below.

!pip install -U langchain-community

Collecting langchain-huggingface
  Downloading langchain_huggingface-0.2.0-py3-none-any.whl.metadata (941 bytes)
Collecting pypdf
  Downloading pypdf-5.6.0-py3-none-any.whl.metadata (7.2 kB)
Collecting faiss-cpu
  Downloading faiss_cpu-1.11.0-cp311-cp311-manylinux_2_28_x86_64.whl.metadata (4.8 kB)

# Import the libraries needed for document loading, processing, embeddings, vector stores, and chain creation from LangChain and other modules.

In [14]:

# HuggingFaceEmbeddings, PyPDFLoader and FAISS are imported from their current
# packages; the older langchain.* import paths are deprecated.
from langchain_huggingface import ChatHuggingFace, HuggingFaceEmbeddings, HuggingFaceEndpoint
from langchain_community.document_loaders import PyPDFLoader  # Or use other loaders like TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain.chains import RetrievalQA
import os
from google.colab import userdata

In [15]:
# Loading the document

loader = PyPDFLoader('/content/interview question.pdf')
documents = loader.load()

# Split the loaded document into smaller chunks.

# chunk_size: The maximum number of characters in each chunk.

# chunk_overlap: The number of characters to overlap between consecutive chunks.

# This helps maintain context across splits.

In [16]:

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
texts = text_splitter.split_documents(documents)
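
To make the effect of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of a fixed-width sliding window. It is a simplification and not how `RecursiveCharacterTextSplitter` works internally (the real splitter prefers to break on separators such as paragraphs and newlines); the function name `sliding_window_chunks` is hypothetical.

```python
def sliding_window_chunks(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Split text into fixed-size windows that share `chunk_overlap` characters.

    Simplified illustration only: no separator awareness.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
demo_chunks = sliding_window_chunks(sample, chunk_size=10, chunk_overlap=3)
print(demo_chunks)
```

Each chunk starts with the last three characters of the previous one, which is the shared context that the overlap is meant to preserve.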

# Create embeddings for the text chunks using a Hugging Face model.

# Embeddings convert text into numerical vectors, capturing semantic meaning.


In [17]:

embeddings = HuggingFaceEmbeddings()



# Create a FAISS vector store from the text chunks and their embeddings.

# FAISS allows for fast similarity search on the embeddings.

In [18]:

vectorstore = FAISS.from_documents(texts, embeddings)
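
Conceptually, a similarity search ranks the stored vectors by their distance to the query vector and returns the closest ones. The sketch below does this by brute force in plain Python with toy 2-D vectors. It assumes Euclidean (L2) distance as the metric, and the helper names `l2_distance` and `top_k` are hypothetical; FAISS itself uses optimized index structures rather than this loop.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query_vec, doc_vecs, k=2):
    """Return the indices of the k stored vectors closest to the query."""
    ranked = sorted(enumerate(doc_vecs), key=lambda item: l2_distance(query_vec, item[1]))
    return [index for index, _ in ranked[:k]]


# Toy stand-ins for chunk embeddings (real embeddings have hundreds of dimensions).
toy_vectors = [[0.0, 1.0], [1.0, 0.0], [0.9, 0.1]]
nearest = top_k([1.0, 0.05], toy_vectors, k=2)
print(nearest)
```

The retriever built later from `vectorstore` performs the same kind of lookup, with the query embedded by the same `embeddings` object.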

# Get the Hugging Face API key from Colab secrets

In [19]:

hf_api_key = userdata.get("HUGGINGFACEHUB_API_TOKEN")

In [20]:
# Check if the API key is available
if not hf_api_key:
    raise ValueError("Hugging Face API token not found in Colab Secrets. Please store your API token under the key 'HUGGINGFACEHUB_API_TOKEN'.")
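
Outside Colab, `userdata` is not available, so one option is to fall back to an environment variable. The sketch below is an assumption about how that could look, not part of the original notebook: `resolve_hf_token` is a hypothetical helper, and it reuses the same key name, `HUGGINGFACEHUB_API_TOKEN`.

```python
import os


def resolve_hf_token(colab_value=None, env=None):
    """Prefer the Colab secret; otherwise fall back to an environment variable."""
    env = os.environ if env is None else env
    token = colab_value or env.get("HUGGINGFACEHUB_API_TOKEN")
    if not token:
        raise ValueError(
            "Hugging Face API token not found. Set 'HUGGINGFACEHUB_API_TOKEN' "
            "in Colab Secrets or as an environment variable."
        )
    return token


# Example with an explicit mapping instead of the real environment.
demo_token = resolve_hf_token(None, {"HUGGINGFACEHUB_API_TOKEN": "hf_example"})
```

In the notebook this would be called as `resolve_hf_token(userdata.get("HUGGINGFACEHUB_API_TOKEN"))`.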

# Initialize the HuggingFaceEndpoint to interact with a specific Hugging Face model.

# repo_id: The identifier of the model on the Hugging Face Hub.

# task: Specifies the task the model is used for (e.g., "text-generation").

# huggingfacehub_api_token: Your API key for authentication.

In [21]:

llm = HuggingFaceEndpoint(
    repo_id="deepseek-ai/DeepSeek-R1-0528",
    task="text-generation",
    huggingfacehub_api_token=hf_api_key,
)

# Wrap the HuggingFaceEndpoint LLM with ChatHuggingFace.

# This allows using the model within LangChain's chat-based interfaces if needed, although here it's used within a retrieval chain.

In [22]:

model = ChatHuggingFace(llm=llm)

# Create a Retrieval-Augmented Generation (RAG) chain.

# This chain combines document retrieval with the language model.

# llm: The language model to use for generating answers.

# chain_type="stuff": A simple chain type that stuffs all retrieved documents into the prompt.

# retriever: The component responsible for retrieving relevant documents from the vector store.

In [23]:

qa_chain = RetrievalQA.from_chain_type(
    llm=model,
    chain_type="stuff",
    retriever=vectorstore.as_retriever()
)

# Other chain types include "map_reduce", "refine", etc.
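
The idea behind the "stuff" chain type can be sketched in a few lines: concatenate every retrieved chunk into one context block and place it in a single prompt. The template wording and the name `build_stuff_prompt` are hypothetical; LangChain ships its own prompt template, which this does not reproduce.

```python
def build_stuff_prompt(question, chunks):
    """Place all retrieved chunks into one prompt (the 'stuff' strategy)."""
    context = "\n\n".join(chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


demo_prompt = build_stuff_prompt(
    "What is the main topic of the document?",
    ["First retrieved chunk.", "Second retrieved chunk."],
)
print(demo_prompt)
```

Because everything goes into one prompt, this strategy only works while the retrieved chunks fit in the model's context window, which is one reason the alternative chain types exist.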

# Now you can ask questions related to your document

In [25]:

query = "What is the main topic of the document?"

response = qa_chain.invoke(query)

print(response)

# You can ask more questions by changing the 'query' variable and rerunning this cell

{'query': 'What is the main topic of the document?', 'result': '<think>\nHmm, the user is asking about the main topic of a document. Let me look at the context provided. The document sections mention investigations, reactive and proactive monitoring, maintenance and safety inspections, safety surveys, peer group influence, workplace groups, communication methods, safety standards like ANSI codes, method statements, JSA, and gas flammability limits. \n\nThe recurring themes are safety procedures, monitoring methods, communication tools, and organizational culture related to workplace safety. Terms like "accidents," "preventing recurrence," "compliance with standards," and safety equipment standards (ANSI) appear throughout. \n\nThe document seems structured as a training or policy guide, with numbered sections covering various aspects of safety management. The mention of "induction training" in the header confirms this is likely a company safety manual or orientation document. \n\nThe c
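
As the output above shows, the `result` string begins with the model's reasoning inside a `<think>` block. If only the final answer is wanted, a small post-processing step can drop that block. This is a sketch under the assumption that the reasoning is always wrapped in a matching `<think>...</think>` pair; `strip_reasoning` is a hypothetical helper and the sample string is made up for illustration.

```python
import re

# Non-greedy match across newlines so only the reasoning block is removed.
THINK_BLOCK = re.compile(r"<think>.*?</think>", flags=re.DOTALL)


def strip_reasoning(result: str) -> str:
    """Remove <think>...</think> sections and surrounding whitespace."""
    return THINK_BLOCK.sub("", result).strip()


sample_result = "<think>\nHmm, the user is asking...\n</think>\n\nFinal answer text."
clean = strip_reasoning(sample_result)
print(clean)
```

In the notebook this would be applied as `strip_reasoning(response["result"])`. If the closing `</think>` tag is missing, for example because generation was cut off, the text is returned unchanged apart from whitespace trimming.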