<a href="https://colab.research.google.com/github/Sagnik-Nandi/PDFQueryBot---Chatbot-over-PDFs-using-RAG/blob/main/final%20project/llama_notebook.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Smart Assistant Bot for Document Queries Using the Llama 3 8B Model and the Groq API

References:
- <https://colab.research.google.com/github/groq/groq-api-cookbook/blob/main/tutorials/benchmarking-rag-langchain/benchmarking_rag.ipynb#scrollTo=x4Mkw0lQ7v7I>
- <https://jayant017.medium.com/rag-q-a-chatbot-using-openai-langchain-chromadb-and-gradio-536945dd92f9>
- [RAG chatbot with Gradio](https://www.youtube.com/watch?v=MUJUXmz2i6U)

## Dependencies

In [1]:
!pip install groq -q
!pip install langchain -q
!pip install langchain_chroma -q
!pip install langchain_community -q
!pip install langchain_groq -q
# !pip install grandalf -q
# !pip install numpy -q
# !pip install pandas -q
!pip install pypdf -q
# !pip install sentence-transformers -q #takes 2 min to exec
!pip install groq-gradio -q

In [2]:
# import nest_asyncio
# nest_asyncio.apply()

In [3]:
from google.colab import userdata
import os
os.environ["GROQ_API_KEY"] = userdata.get("GROQ_API_KEY")
os.environ["TOKENIZERS_PARALLELISM"] = "false" # To suppress huggingface warnings

import warnings
warnings.filterwarnings("ignore")

from groq import Groq

In [4]:
from langchain_groq import ChatGroq
from langchain_community.embeddings.huggingface import HuggingFaceEmbeddings

# Embedding model (indexes and queries the chunks) and chat model (writes the answers)
embed_model = HuggingFaceEmbeddings(model_name="BAAI/bge-small-en-v1.5")
rag_llm = ChatGroq(
    model="llama3-8b-8192",
    temperature=0.1,
)


## Processing the PDF
The UG rulebook (`ugrulebook.pdf`) is used as the example document in this notebook.

In [5]:
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Loading
pdf_path = "./ugrulebook.pdf"
loader = PyPDFLoader(pdf_path)
docs = loader.load() # list of pages

In [6]:
# Splitting
# Split the long document into smaller chunks that fit in the model's context window.
# Two hyperparameters control this: chunk size and chunk overlap (both measured in characters).
text_splitter = RecursiveCharacterTextSplitter(
    separators = ["\n \n",
        " \n",
        " ",
        "",],
    keep_separator=False,
    chunk_size=2000,
    chunk_overlap=200
  )
docs_spl = text_splitter.split_documents(docs)
for doc in docs_spl:
  doc.page_content = doc.page_content.replace("\n","") # strip newlines left over from PDF extraction
len(docs_spl)
docs_spl[10].page_content

'their final year. At various stages of the pr ogramme, students are initiated into research methodologies, reading and interpreting research papers, use of engineering and scientific equipments/ instruments, mod-ern computational techniques, writing technical and scientific reports and effective communication. Apart from the minimum credit requirements for the award of the degree, opportunities exist for supplementing the learning experience by crediting additional courses, in diverse areas. These ad-ditional credits, when they are in focused areas, can earn the students’ credentials like Minor/ Hon-ours. The requirements for degree programmes run by the Institute are broadly classified as: a) Institute Requirements  (further divided into Compulsory courses, Elective courses and other requirements). b) Departmental Requirements  (further divided into Compulsory courses, Elective courses and other requirements). The curriculum for various programmes are available on the Institute websi
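To make the two hyperparameters concrete, here is a minimal sketch of fixed-size chunking with overlap. It is a simplified stand-in, not what `RecursiveCharacterTextSplitter` does internally: the real splitter first tries to break on the listed separators and only then merges pieces up to `chunk_size`. The helper name `chunk_text` and the toy input are assumptions made for illustration.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into character windows of at most chunk_size,
    where each window repeats the last chunk_overlap characters of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reached the end of the text
    return chunks


# Toy input (hypothetical): 26 characters, windows of 10 with an overlap of 3
sample = "abcdefghijklmnopqrstuvwxyz"
chunks = chunk_text(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
```

The overlap means a sentence cut at a chunk boundary still appears whole in at least one of the neighbouring chunks, at the cost of storing some text twice.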

In [7]:
from langchain_chroma import Chroma

# Storing: embed each chunk and index it in a Chroma collection (takes about a minute to run)
vectorstore = Chroma.from_documents(docs_spl, embedding=embed_model, collection_name="groq_rag")
retriever = vectorstore.as_retriever()

In [8]:
res = await retriever.ainvoke("What are the eligibility criteria for applying for a change of branch/ programme?")
# res[0].page_content.replace("\n","")
res[0].page_content = "".join(res[0].page_content.split('\n'))
res[0].page_content

'there are valid requests. E) All changes of branch can b e effected only once at the beginning of the second academic year. No application for change of branch during the subsequent academic years will be entertained. F) Branch change decisions will be final and will not be reversed. G) To run the LASE programme, the minimum student strength for the LASE programme should be 10. If less than 10 students are allotted the LASE programme after branch change then the result will be considered as null and void.'
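The retriever call above hides the ranking step. The sketch below shows the general idea with toy 2-dimensional vectors: embed the query, score every stored chunk against it, and return the best matches. Cosine similarity is used purely for illustration; the metric Chroma actually applies depends on how the collection is configured, and the names `cosine_similarity`, `top_k`, and the toy vectors are assumptions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], doc_vecs: list[list[float]], k: int = 2) -> list[int]:
    """Return the indices of the k stored vectors most similar to the query."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine_similarity(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


# Toy embeddings (hypothetical): chunks 0 and 2 point roughly the same way as the query
toy_vectors = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
toy_query = [1.0, 0.0]
best = top_k(toy_query, toy_vectors, k=2)
print(best)
```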

## Defining the RAG chain

In [9]:
from langchain_core.documents import Document
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser
from typing import List, Dict

RAG_SYSTEM_PROMPT = """\
You are an assistant for question-answering tasks. \
Use the following pieces of retrieved context given within delimiters to answer the human's questions.
```
{context}
```
If you don't know the answer, just say that you don't know.\
""" # adapted from https://smith.langchain.com/hub/rlm/rag-prompt-llama3

RAG_HUMAN_PROMPT = "{input}"

RAG_PROMPT = ChatPromptTemplate.from_messages([
    ("system", RAG_SYSTEM_PROMPT),
    ("human", RAG_HUMAN_PROMPT)
])

def format_docs(docs: List[Document]):
    """Format the retrieved documents"""
    return "\n".join(doc.page_content for doc in docs)

rag_chain = (
    {
        "context": retriever | format_docs,  # retrieve docs from the vectorstore, then format them into one string
        "input": RunnablePassthrough(),  # pass the user's question through unchanged as 'input'
    }
    | RAG_PROMPT  # format the prompt with the 'context' and 'input' variables
    | rag_llm  # get a response from the LLM using the formatted prompt
    | StrOutputParser()  # reduce the LLM response to its plain string content

)
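The data flow of the chain can be followed in plain Python without any LangChain objects. This is only a sketch of the four stages (retrieve, format, build prompt, call model): `fake_retriever`, `fake_llm`, `build_prompt`, and `run_chain` are hypothetical stand-ins, and the prompt wording is shortened from the real template.

```python
from typing import List


def fake_retriever(question: str) -> List[str]:
    """Hypothetical stand-in for the vector-store retriever: returns fixed toy chunks."""
    return [
        "Lectures are held between 8:30 am and 5:30 pm.",
        "Lectures are held only on working days.",
    ]


def format_chunks(chunks: List[str]) -> str:
    """Join retrieved chunks into one context string, as format_docs does."""
    return "\n".join(chunks)


def build_prompt(context: str, question: str) -> str:
    """Fill a (shortened) prompt with the 'context' and 'input' values."""
    return f"Use this context to answer.\n{context}\n\nQuestion: {question}"


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for the chat model: reports the prompt length only."""
    return f"(model reply to a {len(prompt)}-character prompt)"


def run_chain(question: str) -> str:
    # Stage 1: the dict step computes both prompt variables from the same question
    inputs = {"context": format_chunks(fake_retriever(question)), "input": question}
    # Stage 2: the prompt template consumes the dict
    prompt = build_prompt(inputs["context"], inputs["input"])
    # Stages 3 and 4: the model answers and the result is returned as a plain string
    return fake_llm(prompt)


question = "When are lectures held?"
prompt = build_prompt(format_chunks(fake_retriever(question)), question)
answer = run_chain(question)
print(answer)
```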

In [10]:
res = await rag_chain.ainvoke("What are the time for holding lectures for first year UG students ?")
res = "".join(res.split('\n'))
res

'According to the given context, the lectures for first-year UG students are to be held ONLY between 8:30 am and 5:30 pm and only on working days.'

In [11]:
res = await rag_chain.ainvoke("How many candidates does IIT Bombay take in annually ?")
res = "".join(res.split('\n'))
res

'According to the provided context, IIT Bombay on an average annually admits:* More than 1000 candidates for undergraduate programmes (B.Tech./Dual Degree and B.S.) through the Joint Entrance Examination (JEE)* More than 30 candidates for B.Des. Programme through the Undergraduate Common Entrance Exam for Design (UCEED)* Around 300 candidates for M.Sc. and M.Sc. Ph.D. Dual Degree programmes* More than 1000 candidates for postgraduate programmes* Around 300 candidates for Ph.D. programmes'

In [12]:
res = await rag_chain.ainvoke("What are the requirements of getting a degree ?")
res = "".join(res.split('\n'))
res

'According to the provided context, the requirements for degree programmes at the Institute are broadly classified as:a) Institute Requirements:\t* Compulsory courses\t* Elective courses\t* Other requirementsb) Departmental Requirements:\t* Compulsory courses\t* Elective courses\t* Other requirementsAdditionally, the student must:* Take and pass all the courses prescribed for the degree under the general institutional and departmental requirements* Satisfactorily fulfill other academic requirements such as practical training, NSS/NSO/NCC, work visits, seminar, and projects, as specified for the discipline/programme* Pay all the Institute duesFor Dual Degree students, they must also complete the requirements for Honours, as prescribed by the department, which may be different from those prescribed for a B.Tech. student.'

In [13]:
res = await rag_chain.ainvoke("Explain the organisational structure for academic matters.")
res = "".join(res.split('\n'))
res

'According to the provided context, the organisational structure for academic matters at the Institute is as follows:* The Senate is the supreme body that governs all academic matters of the Institute. It is a statutory body that approves rules and regulations for academic programmes.* The Senate has two Institute-level sub-committees:\t+ Undergraduate Programmes Committee (UGPC) for undergraduate programmes\t+ Post-Graduate Programmes Committee (PGPC) for post-graduate programmes* The Dean of Academic Programmes (Dean, AP) and the Associate Dean of Academic Programmes (Associate Dean, AP) are the Conveners & Co-conveners respectively of these committees.* The Senate also has two Institute-level committees for performance and evaluation:\t+ Undergraduate Academic Performance Evaluation Committee (UGAPEC)\t+ Postgraduate Academic Performance Evaluation Committee (PGAPEC)* Conveners for these committees are designated from among Senate members.* Each department has two department-level c

In [14]:
res = await rag_chain.ainvoke("What is SPI and what is CPI ? Refer to the Glossary")
res = "".join(res.split('\n'))
res

'According to the provided context, SPI stands for Semester Performance Index, which is a number that indicates the performance of a student in a semester. It is the weighted average of the grade points obtained in all the courses registered by the student during the semester.CPI stands for Cumulative Performance Index, which is an up-to-date assessment of the overall performance of a student from the time they entered the Institute. It considers all the courses registered by the student, towards the minimum requirement of the degree they have enrolled for, since they entered the Institute. The CPI is calculated at the end of every semester to two decimal places.'

## Final function calls and Gradio

In [15]:
def add_docs(path):
    loader = PyPDFLoader(file_path=path)
    docs = loader.load_and_split(text_splitter=RecursiveCharacterTextSplitter(chunk_size = 1000,
                                                                                chunk_overlap = 200,
                                                                                length_function = len,
                                                                                is_separator_regex=False))
    vectorstore = Chroma.from_documents(documents=docs,embedding= embed_model, persist_directory="output/general_knowledge")
    return vectorstore

def answer_query(message, chat_history):
    vectorstore = Chroma(persist_directory="output/general_knowledge", embedding_function=embed_model)
    retriever = vectorstore.as_retriever()

    rag_chain = (
        {
            "context": retriever | format_docs,  # retrieve docs from the vectorstore, then format them into one string
            "input": RunnablePassthrough(),  # pass the user's question through unchanged as 'input'
        }
        | RAG_PROMPT  # format the prompt with the 'context' and 'input' variables
        | rag_llm  # get a response from the LLM using the formatted prompt
        | StrOutputParser()  # reduce the LLM response to its plain string content

    )

    response = rag_chain.invoke(message)
    chat_history.append((message, response))
    return "", chat_history
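`answer_query` returns two values because it is wired to two Gradio outputs, `[msg, chatbot]`: the empty string clears the textbox and the updated history redraws the chat. The sketch below isolates that contract with a stubbed responder in place of the RAG chain, so it runs without a model or a vector store. The names `answer_query_stub` and `respond` are assumptions made for illustration.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, str]]


def answer_query_stub(
    message: str,
    chat_history: History,
    respond: Callable[[str], str] = lambda q: f"echo: {q}",  # hypothetical stand-in for rag_chain.invoke
) -> Tuple[str, History]:
    """Append the (question, answer) pair and return the values for [msg, chatbot]."""
    response = respond(message)
    chat_history.append((message, response))
    return "", chat_history  # "" clears the textbox; the history refreshes the chat window


history: History = []
cleared, history = answer_query_stub("What is CPI?", history)
cleared, history = answer_query_stub("What is SPI?", history)
print(history)
```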


In [16]:
import gradio as gr
import groq_gradio

with gr.Blocks() as demo:
    gr.HTML("<h1 align = 'center'>Smart Assistant</h1>")

    with gr.Row():

        upload_files = gr.File(label = 'Upload a PDF',file_types=['.pdf'],file_count='single')

    chatbot = gr.Chatbot()
    msg = gr.Textbox(label = "Enter your question here")
    upload_files.upload(add_docs,upload_files)
    msg.submit(answer_query,[msg,chatbot],[msg,chatbot])

demo.launch()

Running Gradio in a Colab notebook requires sharing enabled. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://d706eef2d600fb6b2b.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)


