# Building a Basic RAG App with Gradio UI - A Guide
This guide outlines the components of a basic Retrieval-Augmented Generation (RAG) application built using the Gradio framework.

# Understanding the Application Flow:
The application interacts with the user through functions triggered by specific actions. Let's break down the key functions and their roles:

# process_pdf:
This function runs when you upload one or more PDF files and click the "Analyze PDF" button. It takes the uploaded documents and performs the following tasks:

    Extraction: Extracts the text content from each PDF file.
    Splitting: Divides the extracted text into small overlapping chunks (100 characters with a 20-character overlap in this notebook) for further processing.
    Storage: Embeds the chunks and stores the vectors in a Chroma collection for later retrieval.
    Chain creation: Builds two LangChain "chains" on top of the stored vectors. These chains drive the interaction between the user and the system (explained further in the PDFprocessor section).
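
To make the splitting step concrete, here is a minimal sketch of fixed-size chunking with overlap. It is a simplified stand-in, not the `RecursiveCharacterTextSplitter` used later in this notebook (which also tries to break on separators such as paragraph breaks and spaces). The function name `split_with_overlap` is hypothetical.

```python
def split_with_overlap(text: str, chunk_size: int = 100, chunk_overlap: int = 20) -> list[str]:
    """Cut text into windows of chunk_size characters.

    Consecutive windows share chunk_overlap characters, so a sentence that
    straddles a boundary still appears whole in at least one chunk.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
```

With a chunk size of 10 and an overlap of 3, each chunk starts 7 characters after the previous one, so the last 3 characters of one chunk are repeated at the start of the next.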



# PDFprocessor Class:
This class handles the PDF input. It implements the steps that process_pdf relies on: extracting, splitting, and storing the PDF content. It also creates the "chains" mentioned above. A chain here is a LangChain pipeline that retrieves the stored chunks most similar to the user's question, inserts them into a prompt, and sends that prompt to the LLM. The class builds two of them: a RetrievalQA chain for one-off questions and a ConversationalRetrievalChain that also keeps chat history.
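
The retrieve-then-prompt idea behind these chains can be sketched without any external service. The example below is a toy illustration under stated assumptions: `embed` is a bag-of-words counter standing in for the OpenAI embeddings, the ranking is plain cosine similarity instead of a Chroma lookup, and all function names are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': word counts (assumption; the notebook uses OpenAI embeddings)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Insert the retrieved chunks into a prompt for the LLM."""
    context = "\n".join(context_chunks)
    return f"Relevant information:{context}\nQuestion: {question}\nAnswer:"


chunks = [
    "The policy can be cancelled with thirty days notice",
    "Premiums are due on the first day of each month",
    "Renewal of the policy is automatic unless cancelled",
]
question = "When can the policy be cancelled"
top = retrieve(question, chunks, k=2)
prompt = build_prompt(question, top)
print(prompt)
```

Only the two chunks that share the most words with the question reach the prompt; the chunk about premiums is left out. A real chain does the same with dense embeddings and then sends the prompt to the LLM.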

# QandA_response_handler:
This function handles user queries in a question-answering format. It performs the following steps:

    Interaction: Takes the user's question as input.
    Response Generation: Calls the dynamic_llm_query function, which passes the question to the RetrievalQA chain and returns the answer generated by the LLM. If no document has been processed yet, it returns a message asking the user to upload one.
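
The handler pattern described above (use the chain when it exists, otherwise return a helpful message) can be sketched as follows. `StubChain` and `answer_question` are hypothetical names; the stub only mimics the shape of a RetrievalQA result so the example runs without an API key.

```python
import asyncio


class StubChain:
    """Hypothetical stand-in for a RetrievalQA chain; returns a canned result."""

    def invoke(self, inputs: dict) -> dict:
        return {"result": f"(answer to: {inputs['query']})"}


chain = None  # becomes a real chain once a document has been processed


async def answer_question(message: str, history: list) -> str:
    """Answer from the chain if available, otherwise ask for a document."""
    if chain is None:
        return "Please upload a document and wait for processing to finish."
    try:
        return chain.invoke({"query": message})["result"]
    except Exception as exc:
        return f"An error occurred while querying the chain: {exc}"


before = asyncio.run(answer_question("What is covered?", []))
chain = StubChain()  # simulate a finished analysis
after = asyncio.run(answer_question("What is covered?", []))
print(before)
print(after)
```

Inside a Jupyter notebook an event loop is already running, so `await answer_question(...)` would be used there instead of `asyncio.run(...)`.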

# chatbot_response_handler:
This function handles user input in a conversational format. It operates as follows:

    Input Acquisition: Receives the user's message as input.
    LLM Interaction: Calls the dynamic_llm_chat_query function, which passes the message to the ConversationalRetrievalChain.
    Response Retrieval: Returns the response generated by the LLM from the message, the retrieved context, and the conversation so far.
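
A conversational handler usually limits how much history it forwards. The sketch below assumes the history arrives as a list of [user, assistant] pairs and keeps only the last few turns; the helper names `recent_turns` and `format_history` are hypothetical.

```python
def recent_turns(history: list, max_turns: int = 5) -> list[tuple[str, str]]:
    """Keep only the last max_turns exchanges, as (user, assistant) tuples."""
    return [(user, bot) for user, bot in history[-max_turns:]]


def format_history(turns: list[tuple[str, str]]) -> str:
    """Render the turns as plain text that can be placed in a prompt."""
    lines = []
    for user, bot in turns:
        lines.append(f"Human: {user}")
        lines.append(f"Assistant: {bot}")
    return "\n".join(lines)


history = [[f"question {i}", f"answer {i}"] for i in range(1, 8)]
turns = recent_turns(history)
print(format_history(turns[:1]))
```

Trimming the history keeps the prompt short and bounds the cost per request, at the price of forgetting older turns.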






In [1]:
"""
This code installs several Python packages that are required for the chatbot application:

- `gradio`: A Python library for creating interactive web applications.
- `openai`: The official Python client library for the OpenAI API, which is used for language modeling and generation.
- `PyMuPDF`: A Python binding for the MuPDF library, which is used for reading and manipulating PDF files.
- `chromadb`: A Python library for storing and querying vector embeddings, which is used for the chatbot's knowledge base.
- `langchain_openai`: Provides integration between LangChain and OpenAI models.
- `langchain`: A framework for building applications with large language models.
- `langchain_community`: Additional community-contributed functionality for LangChain.
These packages are installed using the `pip install` command, which is a package installer for Python. The `-q` flag is used to suppress the output of the installation process.
"""

!pip install gradio -q
!pip install openai -q
!pip install PyMUPDF -q
!pip install chromadb -q
!pip install langchain_openai -q
!pip install langchain -q
!pip install langchain_community -q



In [2]:
"""
This code imports the libraries and modules used by the application:

- `gradio`: A library for creating interactive web applications.
- `os`: A module for interacting with the operating system.
- `openai`: A library for accessing the OpenAI API.
- `fitz`: The import name of PyMuPDF, a library for working with PDF documents.
- `langchain`, `langchain_community`, `langchain_openai`: A framework and its integrations for building applications with large language models (text splitter, Chroma vector store, chat model, chains, prompts, PDF loader, embeddings).
- `chromadb`: A library for storing and querying vector embeddings.
- `time` and `datetime`: Modules for delays and for working with dates and times.

These libraries handle text processing, document analysis, and model integration in the rest of the notebook.
"""
import gradio as gr
import os
import openai
import fitz  # import name of the PyMuPDF package
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain_openai.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate
from langchain_community.document_loaders import PyMuPDFLoader
from langchain_openai.embeddings import OpenAIEmbeddings
import time
import chromadb #vectorstore
from datetime import datetime



In [5]:
"""
Sets the OpenAI API key as an environment variable.

This code sets the OPENAI_API_KEY environment variable to the provided API key. This allows the OpenAI API to be used throughout the application without needing to explicitly pass the API key.
"""
import getpass
# Set OpenAI API key
OPENAI_API_KEY = getpass.getpass()
os.environ['OPENAI_API_KEY'] = OPENAI_API_KEY

In [6]:
"""
Initialize the ChatOpenAI language model with the specified parameters.

Args:
    model (str): The name of the GPT-3.5 model to use, e.g. 'gpt-3.5-turbo-0125'.
    temperature (float): The temperature parameter to control the randomness of the model's output, typically between 0 and 1.

Returns:
    ChatOpenAI: An instance of the ChatOpenAI language model.
"""
llm = ChatOpenAI(model='gpt-3.5-turbo-0125', temperature=0.1)

In [7]:
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain.prompts import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    SystemMessagePromptTemplate,
)

"""
Loads PDF documents, extracts text, splits text into chunks,
vectorizes chunks for semantic search, and sets up the retrieval
chains.

load_and_split extracts text from PDF documents and splits it into chunks.

vectorstore_and_chain vectorizes the text chunks and sets up a
RetrievalQA chain and a ConversationalRetrievalChain using Chroma and OpenAI.
"""

class PDFprocessor:


    def load_and_split(self,filepaths):
      """
      Loads and splits PDF documents into a list of Document objects.

      Args:
          filepaths (list): A list of file paths to PDF documents.

      Returns:
          list: A list of Document objects, where each Document contains a text chunk and its associated metadata.
      """
      documents = []
      for file_path in filepaths:
          if file_path.endswith(".pdf"):
              loader = PyMuPDFLoader(file_path)
              documents.extend(loader.load())
      text_splitter = RecursiveCharacterTextSplitter(
          # Set a really small chunk size, just to show.
          chunk_size=100,
          chunk_overlap=20,
          length_function=len,
          is_separator_regex=False,
      )
      docs = text_splitter.split_documents(documents)

      return docs


    def vectorstore_and_chain(self, docs):
        """
        Generates a Chroma vector database from a list of documents and creates two chains on top of it.

        Args:
            docs (list): A list of Document objects, where each Document contains a text chunk and its associated metadata.

        Returns:
            tuple: A RetrievalQA chain for single questions and a ConversationalRetrievalChain for chat, both backed by the same documents.
        """
        # OpenAIEmbeddings generates vector embeddings for text. "text-embedding-ada-002"
        # is a general-purpose OpenAI embedding model.
        embeddings = OpenAIEmbeddings(model="text-embedding-ada-002")
        current_time = datetime.now()
        formatted_time = current_time.strftime("%Y-%m-%d%H%M%S")
        print("Current Time:", formatted_time)
        # Use a unique collection name per run so that vectors from a
        # previously analysed PDF are not retrieved for the new one.
        collection_name = formatted_time
        if not should_stop_analysis:
            try:
                vectordb = Chroma.from_documents(docs, embeddings, collection_name=collection_name)
            except openai.RateLimitError:
                print("Rate limit exceeded, waiting before retrying...")
                time.sleep(60)
                vectordb = Chroma.from_documents(docs, embeddings, collection_name=collection_name)
        else:
            # Analysis was cancelled: create an empty collection instead of embedding the chunks.
            vectordb = Chroma(collection_name=collection_name, embedding_function=embeddings)
        template = """Imagine you are a good question answering system and you answer questions based on the Relevant information.
        Don't make up any information\n
        Also, include all the relevant numerical figures\n
        Recheck your answer so that it is more coherent with what user is asking\n
        Relevant information:{context}
        Question: For the company,{question}
        Answer:"""
        global chain
        QA_CHAIN_PROMPT = PromptTemplate(input_variables=["context", "question"], template=template)
        retriever=vectordb.as_retriever(search_kwargs={"k": 5})
        # Initialize a RetrievalQA chain from the language model (llm) and the retriever.
        # It returns the source documents along with the answer and uses
        # QA_CHAIN_PROMPT (a PromptTemplate) as its prompt.
        chain = RetrievalQA.from_chain_type(
            llm,
            retriever=retriever,
            return_source_documents=True,
            chain_type_kwargs={"prompt": QA_CHAIN_PROMPT}
        )
        memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
        system_message = SystemMessagePromptTemplate.from_template(
        """Imagine you are a good question answering system and you answer questions based on the Relevant information.
        Don't make up any information\n
        Also, include all the relevant numerical figures\n
        Recheck your answer so that it is more coherent with what user is asking\n
        Relevant information:{context}
        Question: For the company,{question}
        Answer:
        """
    )
        human_message = HumanMessagePromptTemplate.from_template("{question}")
        conv_chain = ConversationalRetrievalChain.from_llm(
        llm=llm,
        retriever=vectordb.as_retriever(search_kwargs={"k": 5}),
        memory=memory,
        combine_docs_chain_kwargs={
            "prompt": ChatPromptTemplate.from_messages(
                [
                    system_message,
                    human_message,
                ]
            ),
        },
    )

        return chain,conv_chain

In [8]:
"""
Processes a list of PDF file paths and builds the retrieval chains.

Args:
    filepaths (list): A list of file paths to PDF files.

Returns:
    str: A status message shown in the UI.
"""

# Module-level state shared between the Gradio callbacks.
should_stop_analysis = False
chain = None
conv_chain = None

def process_pdf(filepaths):
    global should_stop_analysis, chain, conv_chain
    should_stop_analysis = False

    if not filepaths or not isinstance(filepaths, list) or not filepaths[0]:
        return "Please upload a valid PDF file."

    pdf_processing = PDFprocessor()
    documents = pdf_processing.load_and_split(filepaths)
    chain, conv_chain = pdf_processing.vectorstore_and_chain(documents)
    if should_stop_analysis:
        # "Clear" was clicked while the documents were being processed.
        chain, conv_chain = None, None
        return "Analysis stopped by user."
    return "Processed pdf"


In [14]:
"""
Handles incoming chat messages by preparing document context, querying the LLM,
and returning the response.

Parameters:
  message (str): The incoming chat message.
  history (list): The chat history so far.

Returns:
  str: The generated response for the given message and context.
"""
async def QandA_response_handler(message, history):

    print("Processing chat response...")
    response = await dynamic_llm_query(message,history)
    print(f"Message: {message}, Response: {response}")

    return response


In [9]:
"""
Handles querying the RetrievalQA chain to generate a response for an incoming chat message.

Uses the RetrievalQA chain if one has been built from an uploaded document.
Otherwise returns a message asking the user to upload a file or wait.

Parameters:
  message (str): The incoming chat message.
  history (list): The chat history so far (not used by the stateless RetrievalQA chain).

Returns:
  str: The generated response.
"""

async def dynamic_llm_query(message,history):
    # Retrieve the QA chain, if available
    global chain
    if chain:
        try:
            # Use the QA chain for detailed document-related queries.
            # RetrievalQA only reads the "query" key. To cap the answer length,
            # set max_tokens on the ChatOpenAI model rather than on invoke().
            response = chain.invoke({"query": message})
            answer = response["result"]

        except Exception as e:
            print(f"Error querying RetrievalQA chain: {e}")
            answer = "An error occurred while querying the RetrievalQA chain."
    else:
        # No chain yet: ask the user to upload a document or wait
        answer = "If document is uploaded please wait until the document processes or if it is not uploaded please upload a file"

    return answer

In [10]:
async def chatbot_response_handler(message, history):

    print("Processing chat response...")
    response = await dynamic_llm_chat_query(message,history)
    print(f"Message: {message}, Response: {response}")

    return response

In [11]:

async def dynamic_llm_chat_query(message,history):
    # Retrieve the conversational chain, if available
    global conv_chain
    if conv_chain:
        try:
            # Use the conversational chain for follow-up questions. The chain's
            # ConversationBufferMemory supplies the chat history, so only the
            # new question is passed in.
            response = conv_chain.invoke({"question": message})
            answer = response["answer"]
        except Exception as e:
            print(f"Error querying ConversationalRetrievalChain: {e}")
            answer = "An error occurred while querying the conversational chain."
    else:
        # No chain yet: ask the user to upload a document or wait
        answer = "If document is uploaded please wait until the document processes or if it is not uploaded please upload a file"

    return answer

In [12]:
general_questions=[
    "Under what circumstances can the policy be canceled?",
    "Are there any provisions for renewal or cancellation?"
]

In [15]:

"""
Builds and launches the Gradio UI.

The UI has a PDF upload panel with "Analyze PDF" and "Clear" buttons, and two chat tabs ("Question Answering" and "Chatbot").
"""
def main():
    custom_css = """
    .header-text h1, .header-text h2 {
        text-align: center;    }
    .instruction-text {
        text-align: justify;
        margin: 20px;
        font-size: 18px;
    }
    .output-container {
        max-height: 500px; /* Adjust based on your preference */
        overflow-y: auto;
    }
    """
    theme = gr.themes.Soft(
        primary_hue="rose",
        secondary_hue="rose",
        font=[gr.themes.GoogleFont('Poppins'), 'ui-sans-serif', 'system-ui', 'sans-serif'],
    ).set()
    # def clear_all():
    #             file_upload.value = []  # Clear selected files
    #             output_container.value = ""
    with gr.Blocks(css=custom_css, theme=theme) as demo:
        # with gr.Row():
        #    with gr.Column():
        #          Header texts
        #         gr.HTML("<div class='header-text'><h1>Risk Analyser Tool</h1></div>")
        #         gr.HTML("<div class='header-text'><h2>Analysing Reports Made Easy</h2></div>")
        #         gr.HTML("<div class='header-text'><h3>Get insights from your PDF document instantly</h3></div>")

        with gr.Row():
            with gr.Column():
                gr.Markdown("## Upload and Analyze PDF")
                with gr.Group():
                    file_upload = gr.Files(show_label=False,file_count="multiple", file_types=[".pdf","pdf"])
                    analyze_button = gr.Button("Analyze PDF")
                    analyze_button.click(
                    fn=process_pdf,
                    inputs=file_upload,
                    outputs=gr.Text(show_label=False)
                )

                clear=gr.ClearButton(components=[file_upload],value="Clear")

                def on_file_clear():
                    """
                    Stops the ongoing analysis and resets the chains.

                    This function is called when the "Clear" button is clicked. It sets the global flag `should_stop_analysis` to `True` to signal that the analysis should be stopped, and sets the `chain` and `conv_chain` globals to `None` so that neither tab answers from the cleared document.

                    A warning message is displayed to the user, advising them to wait for approximately 20 seconds if they have interrupted the processing.
                    """
                    global should_stop_analysis, chain, conv_chain
                    # Set the flag to stop the analysis
                    should_stop_analysis = True
                    chain = None
                    conv_chain = None
                    gr.Warning("If you have interrupted the processing please wait for 20s")

                    #demo.close()
                clear.click(
                    fn=on_file_clear
                )
        with gr.Tab("Question Answering"):
          with gr.Column():
            gr.ChatInterface(
                fn=QandA_response_handler,
                examples=general_questions,
                retry_btn=None,
                undo_btn=None,
                clear_btn="Clear",
              )
        with gr.Tab("Chatbot"):
          with gr.Column():
            gr.ChatInterface(
                fn=chatbot_response_handler,
                examples=general_questions,
                retry_btn=None,
                undo_btn=None,
                clear_btn="Clear",
              )
        with gr.Row():
             # Instructions
             gr.HTML("<div class='instruction-text'>"
                     "<h2><strong>Instructions:</strong></h2><br>"
                     "1. Upload the file and click on the 'Analyze PDF' button.<br>"
                     "2. Please hold on until the processing is complete and you see the text confirmation 'Processed pdf'.<br>"
                     "3. Click on the 'Clear' button to clear the file and the output.<br>"
                     "Note: If you want to stop the processing in the middle, click on the 'Clear' button and wait for approx. 20s to clear the file and the output.<br>"
                     "</div>")


    demo.launch(debug=True)


if __name__ == "__main__":
    main()


Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.


Current Time: 2024-09-23172049
Processing chat response...
Message: What is the employee agreement about, Response: The employee agreement outlines the salary, rights, and any other benefits of the Executive under this Agreement or as an employee. It also specifies that during the Employment Term, the Agreement may terminate without further compensation obligations.
Processing chat response...
Message: what is the compensation , Response: The compensation for the company includes a combination of base salary, bonuses, and long-term incentives. The long-term incentives are payable for the achievement of performance goals established by the Compensation Committee. The amount of long-term incentives can be up to a certain amount determined by the committee. Additionally, other benefits and compensation under existing plans are also provided based on the terms and rules of those plans.
Keyboard interruption in main thread... closing server.
