This Retrieval-Augmented Generation (RAG) chatbot answers questions about an uploaded PDF, using Hugging Face models for text embeddings and generation. The solution combines PDF text extraction, FAISS indexing, and a language model into a single question-answering workflow.

Building a Retrieval-Augmented Generation (RAG) chatbot with Hugging Face models relies on the transformers, sentence-transformers, and datasets libraries, together with the FAISS library for similarity search.

Step 1: Install Required Libraries


In [1]:
pip install transformers sentence-transformers faiss-cpu datasets gradio


Step 2: Import Libraries

In [2]:
from transformers import pipeline, AutoTokenizer, AutoModelForSeq2SeqLM
from sentence_transformers import SentenceTransformer
from datasets import Dataset
import faiss
import gradio as gr

In [4]:
!pip install PyPDF2 # Install the PyPDF2 library using pip

Collecting PyPDF2
  Downloading pypdf2-3.0.1-py3-none-any.whl.metadata (6.8 kB)
Downloading pypdf2-3.0.1-py3-none-any.whl (232 kB)
Installing collected packages: PyPDF2
Successfully installed PyPDF2-3.0.1


Step 3: Define Functions for Each Component
1. Load and Preprocess PDF

In [5]:
from PyPDF2 import PdfReader

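def extract_text_from_pdf(pdf_path):
    # Read every page and concatenate the extracted text
    reader = PdfReader(pdf_path)
    text = ""
    for page in reader.pages:
        # A page may yield no text (e.g. scanned images); fall back to ""
        # and separate pages with a newline so words are not glued together
        text += (page.extract_text() or "") + "\n"
    return text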

2. Create FAISS Index for Retrieval

In [6]:
def create_faiss_index(texts, embedding_model_name="sentence-transformers/all-MiniLM-L6-v2"):
    # Load the embedding model
    model = SentenceTransformer(embedding_model_name)

    # Generate embeddings for text chunks
    embeddings = model.encode(texts)

    # Create FAISS index
    dimension = embeddings.shape[1]
    faiss_index = faiss.IndexFlatL2(dimension)
    faiss_index.add(embeddings)

    return faiss_index, model
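
To make clear what a flat L2 index such as `faiss.IndexFlatL2` computes, here is a minimal pure-Python sketch of exact (brute-force) nearest-neighbour search by squared Euclidean distance. The name `flat_l2_search` and the toy vectors are illustrative assumptions, not part of the notebook; FAISS performs the same comparison far more efficiently on NumPy arrays.

```python
def flat_l2_search(vectors, query, k):
    """Return (squared_distances, indices) of the k vectors closest to query."""
    scored = []
    for idx, vec in enumerate(vectors):
        # Squared Euclidean distance between the stored vector and the query
        dist = sum((a - b) ** 2 for a, b in zip(vec, query))
        scored.append((dist, idx))
    scored.sort()  # smallest distance first
    top = scored[:k]
    return [d for d, _ in top], [i for _, i in top]


# Toy 2-dimensional "embeddings" (hypothetical values for illustration)
toy_vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0]]
distances, indices = flat_l2_search(toy_vectors, [0.9, 0.1], k=2)
print(indices)
```

Every stored vector is compared with the query, so the result is exact but the cost grows linearly with the number of chunks. That is usually acceptable for a single PDF.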

3. Retrieve Relevant Chunks


In [7]:
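def retrieve_chunks(query, faiss_index, texts, embedding_model, k=3):
    # Embed the query with the same model used for the chunks
    query_embedding = embedding_model.encode([query])
    # Never ask for more neighbours than there are chunks
    k = min(k, len(texts))
    distances, indices = faiss_index.search(query_embedding, k)
    # FAISS pads missing results with -1, which would silently select texts[-1]
    return [texts[idx] for idx in indices[0] if idx >= 0]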


4. Load a Hugging Face Generative Model


In [8]:
def load_generator(model_name="google/flan-t5-large"):
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    return tokenizer, model


5. Generate Answers
Combine the retrieved chunks with the query to generate an answer.

In [9]:
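def generate_answer(query, retrieved_texts, tokenizer, model):
    # Combine the retrieved chunks into one context string
    context = "\n".join(retrieved_texts)
    # Put the question before the context: truncation removes tokens from the
    # end, so a long context would otherwise cut off the question itself
    input_text = f"Answer the question using the context.\n\nQuestion: {query}\n\nContext: {context}"

    # Tokenize input, keeping at most 512 tokens
    inputs = tokenizer(input_text, return_tensors="pt", max_length=512, truncation=True)

    # Generate the answer with beam search (at most 100 output tokens)
    outputs = model.generate(**inputs, max_length=100, num_beams=4, early_stopping=True)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)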
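
Because the tokenizer call truncates the input at 512 tokens, anything near the end of a long prompt can be cut off. One possible way to keep the prompt within bounds is to trim the context before building it. The sketch below is only an illustration: `build_prompt` and `max_context_chars` are hypothetical names, the value 1200 is arbitrary, and a character budget is a rough stand-in for a real token count.

```python
def build_prompt(query, retrieved_texts, max_context_chars=1200):
    """Put the question first and trim the context to a character budget."""
    context_parts = []
    remaining = max_context_chars
    for chunk in retrieved_texts:
        if remaining <= 0:
            break  # budget used up; drop the remaining (less relevant) chunks
        piece = chunk[:remaining]
        context_parts.append(piece)
        remaining -= len(piece)
    context = "\n".join(context_parts)
    return f"Question: {query}\n\nContext: {context}\n\nAnswer:"


# Three 1000-character chunks, as produced by the fixed-size split in the workflow
prompt = build_prompt("What is FAISS?", ["a" * 1000, "b" * 1000, "c" * 1000])
print(len(prompt))
```

Since the retrieved chunks arrive ordered by distance, trimming from the end discards the least relevant text first. Counting tokens with the tokenizer itself would be more precise than counting characters.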


Step 4: Build the Complete Workflow
Integrate all components into a single function.

In [10]:
def chat_with_pdf_hf(pdf_path, query):
    # Step 1: Extract text from the PDF
    text = extract_text_from_pdf(pdf_path)

    # Step 2: Split text into chunks
    chunks = [text[i:i+1000] for i in range(0, len(text), 1000)]

    # Step 3: Create FAISS index
    faiss_index, embedding_model = create_faiss_index(chunks)

    # Step 4: Retrieve relevant chunks
    retrieved_texts = retrieve_chunks(query, faiss_index, chunks, embedding_model)

    # Step 5: Load generative model
    tokenizer, model = load_generator()

    # Step 6: Generate an answer
    answer = generate_answer(query, retrieved_texts, tokenizer, model)

    return answer
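
The workflow splits the text into fixed 1000-character chunks, so a sentence that straddles a boundary ends up divided between two chunks. A common variation is to let neighbouring chunks overlap. The sketch below shows one way to do that; `chunk_text` and the 200-character overlap are assumptions chosen for illustration, not values from the notebook.

```python
def chunk_text(text, chunk_size=1000, overlap=200):
    """Split text into chunks of chunk_size characters that overlap by `overlap`."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk.strip():  # skip empty or whitespace-only chunks
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
    return chunks


# Synthetic 2500-character text so the sketch runs without a PDF
sample = "".join(str(i % 10) for i in range(2500))
chunks = chunk_text(sample)
print(len(chunks), [len(c) for c in chunks])
```

With `overlap=0` the function reproduces the plain fixed-size split, so it could stand in for the list comprehension in `chat_with_pdf_hf` without changing the rest of the pipeline.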


Step 5: Create a Gradio Interface


In [11]:
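def chatbot_interface(pdf, question):
    # Guard against a missing upload or an empty question
    if pdf is None or not (question or "").strip():
        return "Please upload a PDF and enter a question."
    # Depending on the Gradio version, gr.File passes a path string
    # or a file-like object whose path is in .name
    pdf_path = pdf if isinstance(pdf, str) else pdf.name
    answer = chat_with_pdf_hf(pdf_path, question)
    return answer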

interface = gr.Interface(
    fn=chatbot_interface,
    inputs=[gr.File(label="Upload PDF"), gr.Textbox(label="Ask a question")],
    outputs="text",
    title="Hugging Face PDF RAG Chatbot",
)
interface.launch()
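
As written, every question re-runs the whole workflow: the PDF is re-extracted, the chunks are re-embedded, and the generator is reloaded. A simple cache can avoid the repeated work when several questions target the same file. The sketch below is a hypothetical illustration: `get_or_build`, `_index_cache`, and `fake_build` are invented names, and keying the cache by file path is an assumption (a content hash would be more robust).

```python
_index_cache = {}


def get_or_build(pdf_path, build_fn):
    """Return the cached result for pdf_path, calling build_fn only on first use."""
    if pdf_path not in _index_cache:
        _index_cache[pdf_path] = build_fn(pdf_path)
    return _index_cache[pdf_path]


# Stand-in builder so the sketch runs without a PDF or any models
build_calls = []


def fake_build(pdf_path):
    build_calls.append(pdf_path)
    return {"chunks": [f"text of {pdf_path}"]}


first = get_or_build("report.pdf", fake_build)
second = get_or_build("report.pdf", fake_build)
print(len(build_calls))
```

In the notebook, `build_fn` could wrap the extraction, chunking, and `create_faiss_index` steps, and the generator returned by `load_generator` could likewise be loaded once and reused.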



