# QA Bot Web App

In this project, we build a complete Question‑Answering (QA) bot that uses LangChain and a Large Language Model (LLM) to answer user queries directly from the content of uploaded PDF documents.

Instead of relying on predefined rules or static responses, the bot uses Retrieval‑Augmented Generation (RAG), which combines two steps:

- **Retrieval:** The bot searches through the uploaded documents to find the most relevant pieces of information.

- **Generation:** The bot passes the retrieved context to a Large Language Model (LLM), which then generates a natural‑language answer.

The result is a working RAG pipeline that turns raw PDF files into an interactive, document‑aware assistant.
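
The two steps above can be sketched with the standard library alone. This is a toy illustration, not the pipeline built in this notebook: word overlap stands in for embedding search, and `fake_llm` is a hypothetical placeholder for a real model call. All function names here are assumptions made for the example.

```python
import re
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, passages: List[str], k: int = 2) -> List[str]:
    """Retrieval: rank passages by how many words they share with the query."""
    query_words = tokenize(query)
    ranked = sorted(
        passages,
        key=lambda p: len(query_words & tokenize(p)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, context: List[str]) -> str:
    """Generation, part 1: put the retrieved passages in front of the question."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}\nAnswer:"


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; it only reports the prompt size."""
    return f"(an LLM would answer here, given a {len(prompt)}-character prompt)"


passages = [
    "Chroma stores embedding vectors for document chunks.",
    "Gradio builds simple web interfaces in Python.",
    "The capital of France is Paris.",
]
question = "What is the capital of France?"
top = retrieve(question, passages, k=1)
prompt = build_prompt(question, top)
print(prompt)
print(fake_llm(prompt))
```

In the real pipeline, the ranking is done by vector similarity over embedded chunks and the prompt is sent to an actual LLM, but the flow of data is the same: query, retrieved context, prompt, answer.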

## Importing the Required Libraries

In [1]:
# Use this section to suppress warnings generated by your code:
import warnings

def warn(*args, **kwargs):
    pass

warnings.warn = warn
warnings.filterwarnings('ignore')

In [2]:
import gradio as gr

from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain_community.document_loaders import PyPDFLoader
from langchain_classic.chains import RetrievalQA
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_huggingface import HuggingFacePipeline

from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

from typing import List

## Document Loader

In [3]:
####
# Document loader for multiple files
####

def document_loader(file_paths: List[str]):
    """
    Load every PDF in file_paths with PyPDFLoader.

    file_paths: list of local file paths (strings), as passed by Gradio when
        type='filepath' and file_count='multiple'
    returns: list of LangChain Document objects (one per page), each with
        metadata['source_file'] set to the path of the file it came from
    """
    all_docs = []
    for path in file_paths:
        loader = PyPDFLoader(path)
        docs = loader.load()
        # Record the originating file so answers can be traced back to it
        for d in docs:
            # PyPDFLoader normally sets metadata['source'] already; copy the
            # metadata defensively and add an explicit 'source_file' key
            d.metadata = dict(d.metadata or {})
            d.metadata["source_file"] = path
        all_docs.extend(docs)
    return all_docs
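
The loader stores the full path under `source_file`, which for uploaded files may be a long temporary path. If a shorter label is preferred when citing sources, the file name can be extracted with `pathlib`. The sketch below uses plain dictionaries as a stand-in for LangChain `Document` objects; `tag_pages` and the `page` key are hypothetical names chosen for the illustration.

```python
from pathlib import Path
from typing import Dict, List


def tag_pages(path: str, page_texts: List[str]) -> List[Dict[str, object]]:
    """Attach the bare file name and a 1-based page number to each page's text."""
    file_name = Path(path).name  # keeps only the final path component
    return [
        {"text": text, "source_file": file_name, "page": number}
        for number, text in enumerate(page_texts, start=1)
    ]


records = tag_pages("uploads/report.pdf", ["First page.", "Second page."])
print(records[1])
```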

## Text Splitter

In [4]:
####
# Text splitter
####

def text_splitter(documents):
    splitter = RecursiveCharacterTextSplitter(
        chunk_size = 1000,
        chunk_overlap = 100,
        length_function = len,
    )
    chunks = splitter.split_documents(documents)
    return chunks
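
To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of fixed-window splitting with overlap. It is a simplification: `RecursiveCharacterTextSplitter` first tries to split on separators such as paragraph breaks, line breaks, and spaces, so its chunk boundaries will generally differ. The function name `split_with_overlap` is an assumption for this example.

```python
from typing import List


def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Cut text into fixed-size windows where consecutive windows share characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
# The last 3 characters of one chunk are the first 3 of the next
print(chunks[0][-3:], chunks[1][:3])
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.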

## Embeddings and Vector DB

In [5]:
####
# Embeddings and vector db
####

def bge_embedding():
    return HuggingFaceEmbeddings(
        model_name = "BAAI/bge-base-en-v1.5",
        # Change device to "cuda" if you have GPU:
        model_kwargs = {"device": "cpu"},
        # To improve cosine similarity and retrieval quality:
        encode_kwargs = {"normalize_embeddings": True},
    )
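
The `normalize_embeddings` option scales every embedding to unit length. One consequence, shown in the sketch below with hand-picked 2-D vectors, is that the dot product of two normalized vectors equals their cosine similarity, so similarity scores depend only on direction and not on vector magnitude. The helper names are assumptions for this example, and the vectors are assumed to be non-zero.

```python
import math
from typing import List


def dot(a: List[float], b: List[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(vector: List[float]) -> List[float]:
    """Scale a non-zero vector so that its Euclidean length is 1."""
    norm = math.sqrt(dot(vector, vector))
    return [x / norm for x in vector]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity computed from the unnormalized vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a = [3.0, 4.0]
b = [4.0, 3.0]
cos_ab = cosine(a, b)
dot_normalized = dot(l2_normalize(a), l2_normalize(b))
print(cos_ab, dot_normalized)
```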

In [6]:
####
# Vector DB
####

def vector_database(chunks):
    embedding_model = bge_embedding()
    vectordb = Chroma.from_documents(
        documents = chunks,
        embedding = embedding_model
    )
    return vectordb

## Retriever

In [7]:
####
# Retriever pipeline
####

def retriever_from_files(file_paths: List[str]):
    docs = document_loader(file_paths)
    chunks = text_splitter(docs)
    vectordb = vector_database(chunks)
    retriever = vectordb.as_retriever()
    return retriever

## Initialize the LLM

In [8]:

####
# LLM
####

def get_llm():
    """ 
    Initializes and returns a local Large Language Model (LLM) wrapped 
    as a LangChain-compatible pipeline. 
    
    This function loads a pretrained language model and tokenizer 
    from Hugging Face (in this case, 'google/gemma-2b'), configures it 
    for text‑generation, and wraps the resulting pipeline so it can be used 
    inside LangChain chains such as RetrievalQA. 
    """

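    # Model ID from Hugging Face. Gemma is a gated model: accept its license
    # on the Hub and authenticate before the first download.
    model_id = "google/gemma-2b"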
    
    # Load tokenizer and model
    tokenizer = AutoTokenizer.from_pretrained(
        model_id,
        # trust_remote_code= True,
    )
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        # trust_remote_code= True,
        device_map = "cpu",    # "auto" if GPU is available
        torch_dtype = None,
    )

    # Create a text-generation pipeline
    pipe = pipeline(
        "text-generation",
        model = model,
        tokenizer = tokenizer,
        # Control the maximum number of tokens in the generated output:
        max_new_tokens = 256,
        # Sampling temperature: lower values make the output more deterministic:
        temperature = 0.5,
        # Sample the next token from the predicted distribution instead of
        # always taking the most likely one:
        do_sample = True,
        # return_full_text= False,
        eos_token_id = tokenizer.eos_token_id,
        pad_token_id = tokenizer.eos_token_id,
    )

    return HuggingFacePipeline(pipeline = pipe)
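
To illustrate what the `temperature` setting does when `do_sample` is enabled, the sketch below applies a temperature-scaled softmax to made-up scores for three candidate tokens. It is a simplified picture of the sampling step, not the library's implementation; `softmax_with_temperature` and the logits are assumptions for this example.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
probs_default = softmax_with_temperature(logits, 1.0)
probs_cooler = softmax_with_temperature(logits, 0.5)
print("T=1.0:", [round(p, 3) for p in probs_default])
print("T=0.5:", [round(p, 3) for p in probs_cooler])
```

At the lower temperature, more probability mass moves to the highest-scoring token, so sampled answers vary less from run to run.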

## QA Chain

In [9]:
####
# QA chain
####

def retriever_qa(file_paths: List[str], query: str):
    # file_paths is a list of strings when Gradio passes multiple files
    if not file_paths:
        return "Please upload at least one PDF file."

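    # Both the LLM and the vector index are rebuilt on every call, which is
    # slow but keeps the example simple
    llm = get_llm()
    retriever_obj = retriever_from_files(file_paths)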

    qa = RetrievalQA.from_chain_type(
        llm = llm,
        chain_type = "stuff",
        retriever = retriever_obj,
        return_source_documents = False
    )

    response = qa.invoke(query)
    return response['result']
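
Because `retriever_qa` calls `get_llm()` on every query, the model is loaded again each time a question is submitted. One possible optimisation, assuming the loaded model can be safely reused across requests, is to cache the loader's result with `functools.lru_cache`. The sketch below uses a hypothetical placeholder loader (`get_cached_model`) rather than the real `get_llm()` so that it runs without downloading a model.

```python
from functools import lru_cache

load_calls = []  # records each time the expensive loader actually runs


@lru_cache(maxsize=1)
def get_cached_model() -> dict:
    """Placeholder for an expensive loader; the body runs once, then is reused."""
    load_calls.append("loaded")
    return {"name": "placeholder-model"}


def answer(query: str) -> str:
    """Hypothetical query handler that reuses the cached model."""
    model = get_cached_model()  # cheap after the first call
    return f"{model['name']} received: {query}"


print(answer("first question"))
print(answer("second question"))
print("loader ran", len(load_calls), "time(s)")
```

Applying the same decorator to `get_llm()` would follow this pattern. Caching the retriever is less straightforward, since it depends on which files were uploaded.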

## Gradio Interface

In [10]:
####
# Gradio interface
####

rag_app = gr.Interface(
    fn = retriever_qa,
    flagging_mode = "never",
    inputs = [
        # Drag and drop file upload:
        gr.File(
            label = "Upload PDF Files",
            file_count = "multiple",
            file_types = [".pdf"],
            type = "filepath"
        ),
        gr.Textbox(
            label = "Input Query",
            lines = 3,
            placeholder = "Type your question here..."
        )
    ],
    outputs = gr.Textbox(
        label = "Output",
        lines = 5,
    ),
    title = "RAG Chatbot",
    description = "Upload one or more PDF documents and ask a question. The chatbot retrieves the relevant passages and answers from them."
)

In [11]:
if __name__ == "__main__":
    rag_app.launch(
        server_name = '0.0.0.0',
        server_port = 7860,
    )

* Running on local URL:  http://0.0.0.0:7860
* To create a public link, set `share=True` in `launch()`.


# Exporting environment.yml and a PDF

In [1]:
!conda env export -n rag > environment.yml

In [2]:
!jupyter nbconvert --to webpdf NB.ipynb

[NbConvertApp] Converting notebook NB.ipynb to webpdf
[NbConvertApp] Building PDF
[NbConvertApp] PDF successfully created
[NbConvertApp] Writing 158624 bytes to NB.pdf
