# Building RAG Chatbots for Technical Documentation

## Table of contents

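- [Introduction](#introduction)
- [Environment Setup](#environment-setup)
- [Embeddings](#embeddings)
- [Indexing](#indexing)
- [Retriever](#retriever)
- [Prompt](#prompt)
- [LLM](#llm)
- [RAG chain](#rag-chain)
- [Evaluation metrics and comparison](#evaluation-metrics-and-comparison)
- [Chat history](#chat-history)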

## Introduction 

This project implements retrieval-augmented generation (RAG) with *LangChain* to build a chatbot that
answers questions about technical documentation. The document chosen for this assignment is **The European Union Medical Device Regulation - Regulation (EU) 2017/745 (EU MDR)**.

## Environment Setup

Install the packages and dependencies to be used:

In [None]:
# Install required libraries
%pip install -qU langchain langchain-community langchain-chroma langchain-text-splitters unstructured sentence_transformers langchain-huggingface huggingface_hub pdfplumber langchain-google-genai python-dotenv

Because the pipeline calls Google's generative AI models, store the ``GOOGLE_API_KEY`` in a ``.env`` file and keep that file out of version control. The cell below loads it into the environment.

In [1]:
from dotenv import load_dotenv

load_dotenv()

True
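
A `True` return value from `load_dotenv()` does not by itself confirm that `GOOGLE_API_KEY` is among the loaded variables. The following is a minimal sketch of an explicit check; the helper name `require_env` and the placeholder variable `EXAMPLE_KEY` are hypothetical and are not part of the original notebook.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is missing or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to the .env file")
    return value


# Placeholder variable so the sketch runs without a real API key.
os.environ["EXAMPLE_KEY"] = "placeholder"
print(require_env("EXAMPLE_KEY"))
```

In the notebook, the equivalent call would be `require_env("GOOGLE_API_KEY")` right after `load_dotenv()`, so that a missing key fails early instead of at the first model request.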

## Embeddings

We start by connecting to Google's generative AI embeddings model. Gemini's **Text Embeddings 004** model generates the embeddings, with `task_type` set to *retrieval_document* so that the embeddings are optimized for retrieval tasks.

In [None]:
from langchain_chroma import Chroma
from langchain_google_genai import GoogleGenerativeAIEmbeddings

embeddings = GoogleGenerativeAIEmbeddings(model="models/text-embedding-004", task_type="retrieval_document")

## Indexing

In the indexing stage, we load the PDF document and split it into manageable chunks. To avoid re-indexing the document on every run, the vector store is persisted locally in a folder named "db" and reloaded from there whenever that folder already exists.

In [3]:
import os
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PDFPlumberLoader

if os.path.exists("db"): 
    vectorstore = Chroma(persist_directory="db", embedding_function=embeddings)
else:
    loader = PDFPlumberLoader("document.pdf")
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=1000,
        chunk_overlap=200,
        add_start_index=True,
        separators=["\n\n", "\n", " ", ""],
    )
    pages = loader.load_and_split(text_splitter)
    vectorstore = Chroma.from_documents(
        documents=pages, embedding=embeddings, persist_directory="db"
    )
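
To illustrate what `chunk_size`, `chunk_overlap`, and the recorded start index mean, here is a simplified sketch that cuts a string into fixed-size windows. It is not how `RecursiveCharacterTextSplitter` works internally: that splitter prefers to break at the configured separators, so real chunks are usually shorter than `chunk_size`. The function name `sliding_chunks` is hypothetical.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[tuple[int, str]]:
    """Split text into fixed-size windows; each window repeats the tail of the previous one."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        # Keep the start offset alongside the chunk, similar in spirit to add_start_index.
        chunks.append((start, text[start:start + chunk_size]))
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
for start, chunk in sliding_chunks(sample, chunk_size=10, chunk_overlap=4):
    print(start, chunk)
```

With a window of 10 and an overlap of 4, each chunk starts 6 characters after the previous one, so a sentence cut at a chunk boundary still appears whole in at least one neighbouring chunk when it is shorter than the overlap.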

## Retriever

From the vector store, a retriever is created, configured to perform similarity searches and return the top 5 most relevant results:


In [4]:
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 5})
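
Conceptually, a top-k similarity search scores every stored vector against the query vector and keeps the k best. The sketch below shows this with cosine similarity on made-up 2-D vectors. It is an illustration only: the metric Chroma actually uses depends on how the collection is configured, and production vector stores typically rely on approximate nearest-neighbour indexes rather than a full scan. The names `top_k` and `toy_index` are hypothetical.

```python
import heapq
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query: list[float], index: dict[str, list[float]], k: int) -> list[str]:
    """Return the ids of the k stored vectors most similar to the query, best first."""
    return heapq.nlargest(
        k, index, key=lambda doc_id: cosine_similarity(query, index[doc_id])
    )


# Made-up 2-D vectors standing in for chunk embeddings.
toy_index = {
    "chunk_a": [1.0, 0.0],
    "chunk_b": [0.8, 0.6],
    "chunk_c": [0.0, 1.0],
    "chunk_d": [-1.0, 0.0],
}
print(top_k([1.0, 0.1], toy_index, k=2))
```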

### Usage example

In [5]:
retrieved_docs = retriever.invoke("Describe the use of harmonised standards")

print("Retrieved document number : " + str(len(retrieved_docs)))

for doc in retrieved_docs:
    print("page " + str(doc.metadata["page"] + 1) + ":", doc.page_content[:300])

Retrieved document number : 5
page 169: of those staff, in order to ensure that personnel who carry out and perform
assessment and verification operations are competent to fulfil the tasks
required of them.
page 127: ated with each hazard as well as the overall residual risk is judged
acceptable. In selecting the most appropriate solutions, manufacturers
shall, in the following order of priority:
(a) eliminate or reduce risks as far as possible through safe design and
manufacture;
page 196: conformity assessment procedures,
— identification of applicable general safety and performance
requirements and solutions to fulfil those requirements, taking
applicable CS and, where opted for, harmonised standards or
other adequate solutions into account,
— risk management as referred to in Secti
page 147: purpose, and shall include a justification, validation and verification of the
solutions adopted to meet those requirements. The demonstration of
conformity shall include:
(a) the general safet

## Prompt

We establish a structured format for the prompts sent to the LLM. This prompt format conveys the context while instructing the LLM to refrain from answering when it lacks confidence, thereby minimizing the risk of hallucinations.

In [None]:
from langchain_core.prompts import PromptTemplate

template = """Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
Use three sentences maximum and keep the answer as concise as possible. Mention in which pages the answer is found.

Context: {context}

Question: {question}

Helpful Answer:"""
prompt = PromptTemplate.from_template(template)

### Usage example

In [11]:
example_messages = prompt.invoke(
    {"context": "filler context", "question": "filler question"}
).to_messages()
example_messages

print(example_messages[0].content)

Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
Use three sentences maximum and keep the answer as concise as possible. Mention in which pages the answer is found.

Context: filler context

Question: filler question

Helpful Answer:


## LLM

The LLM used in this project is **Gemini 1.5 Flash**, a fast multimodal model from Google's Gemini family. Its context window of 1 million tokens allows it to process extensive inputs, such as many retrieved chunks in a single prompt.

In [14]:
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash")

### Usage example

In [8]:
from IPython.display import Markdown

result = llm.invoke("What is an LLM?")

print(result.__dict__.keys())
Markdown(result.content)

dict_keys(['content', 'additional_kwargs', 'response_metadata', 'type', 'name', 'id', 'example', 'tool_calls', 'invalid_tool_calls', 'usage_metadata'])


LLM stands for **Large Language Model**. 

Here's a breakdown:

**What is a Large Language Model?**

* **A type of artificial intelligence (AI):** LLMs are a specific type of AI that excels at understanding and generating human-like text.
* **Trained on massive datasets:**  They are trained on enormous amounts of text data, like books, articles, code, and websites. This training allows them to learn patterns and relationships in language.
* **Capable of many language tasks:** LLMs can perform a wide range of tasks, including:
    * **Text generation:** Writing stories, poems, articles, and even code.
    * **Translation:** Converting text from one language to another.
    * **Summarization:** Condensing large amounts of text into concise summaries.
    * **Question answering:** Providing answers to complex questions.
    * **Code generation:** Writing code in various programming languages.
    * **Conversation:** Engaging in natural-sounding conversations.

**Examples of LLMs:**

* **GPT-3 (Generative Pre-trained Transformer 3):** Developed by OpenAI, known for its impressive text generation capabilities.
* **LaMDA (Language Model for Dialogue Applications):** Developed by Google, specifically designed for conversational AI.
* **BERT (Bidirectional Encoder Representations from Transformers):** Developed by Google, excels at understanding the meaning of text.

**Key Features of LLMs:**

* **Deep learning:** They use complex neural networks to process information.
* **Contextual understanding:** They can understand the meaning of words based on their context in a sentence or paragraph.
* **Generative capabilities:** They can create new text that is coherent and grammatically correct.
* **Scalability:** They can be trained on massive datasets and deployed on powerful hardware.

**Implications of LLMs:**

LLMs are revolutionizing various fields, including:

* **Content creation:** Automating writing tasks, improving content quality, and making content more accessible.
* **Customer service:** Providing automated support, answering questions, and resolving issues.
* **Education:** Personalizing learning experiences and providing real-time feedback.
* **Research:** Analyzing large amounts of text data and generating insights.

**Important Considerations:**

* **Bias:** LLMs can reflect biases present in the training data, so it's crucial to address potential biases.
* **Ethical implications:** The potential for misuse, such as generating fake news or impersonating individuals, needs careful consideration.

Overall, LLMs are powerful tools with the potential to change the way we interact with language and information. Their capabilities and implications are constantly evolving, making them a fascinating and rapidly developing area of AI research. 


## RAG chain

Putting it all together, we can now define a RAG chain that takes a question, retrieves relevant documents, constructs a prompt, passes it into a model, and parses the output.

In [15]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

def format_docs(docs):
    formatted_docs = []
    for doc in docs:
        page_number = doc.metadata["page"] + 1 
        content_with_page = f"Page {page_number}:\n{doc.page_content}"
        formatted_docs.append(content_with_page)
    return "\n\n".join(formatted_docs)
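
The `+ 1` in `format_docs` suggests that the loader stores a zero-based page index in `metadata["page"]`. The sketch below exercises the same formatting logic on stand-in objects so the output format can be inspected without loading the PDF. `FakeDoc` is a hypothetical stand-in that exposes only the two attributes the function reads, and the function body is a condensed copy of the cell above.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDoc:
    """Hypothetical stand-in exposing the two attributes format_docs reads."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs):
    # Same logic as the cell above: label each chunk with its one-based page number.
    return "\n\n".join(
        f"Page {doc.metadata['page'] + 1}:\n{doc.page_content}" for doc in docs
    )


docs = [FakeDoc("First chunk.", {"page": 0}), FakeDoc("Second chunk.", {"page": 15})]
print(format_docs(docs))
```

Prefixing each chunk with its page number is what lets the prompt ask the model to mention the pages on which the answer is found.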

In [16]:
rag_chain = (
    {
        "context": retriever | format_docs,
        "question": RunnablePassthrough(),
    }
    | prompt
    | llm
    | StrOutputParser()
)
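
The pipe syntax can be read as ordinary function composition: the dictionary sends the same input to each of its keys, and every later stage consumes the output of the previous one. The following plain-Python sketch mirrors that data flow with stubs. It is not LangChain's implementation, and `fake_retriever`, `fake_llm`, `build_prompt`, and `run_chain` are hypothetical names.

```python
def fake_retriever(question: str) -> list[str]:
    # Hypothetical stub: a real retriever would query the vector store.
    return ["Page 1:\nstub context"]


def fake_llm(prompt_text: str) -> str:
    # Hypothetical stub: a real model call would go here.
    return f"(answer based on {len(prompt_text)} prompt characters)"


def build_prompt(inputs: dict) -> str:
    # Stands in for the prompt template: fills the two placeholders.
    return (
        f"Context: {inputs['context']}\n\n"
        f"Question: {inputs['question']}\n\n"
        "Helpful Answer:"
    )


def run_chain(question: str) -> str:
    # Step 1: the dict fans the same input out to both keys.
    inputs = {
        "context": "\n\n".join(fake_retriever(question)),
        "question": question,  # passed through unchanged
    }
    # Steps 2 to 4: each stage consumes the previous stage's output.
    prompt_text = build_prompt(inputs)
    raw = fake_llm(prompt_text)
    return str(raw)  # stands in for the output parser


print(run_chain("What is a harmonised standard?"))
```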

### Usage example

In [18]:
query = "What is the medical devices regulation?"
Markdown(rag_chain.invoke(query))

The Medical Devices Regulation (MDR) is a regulation of the European Union that governs the safety and performance of medical devices. It was adopted in 2017 and came into full effect on May 26, 2021. The MDR is a comprehensive piece of legislation that sets out requirements for the design, manufacture, and marketing of medical devices, including the use of a unique device identification (UDI) system. The MDR can be found in its entirety in Regulation (EU) 2017/745, which is referenced throughout the provided text. 


## Evaluation metrics and comparison

### Tuning parameters

In [None]:
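temperatures = [0.1, 0.5, 1.0]
top_ps = [0.1, 0.5, 1.0]
query = "What is harmonised standards?"

results = "| Temperature | Top P | Response |\n" + "|-------------|-------|----------|\n"

for temperature in temperatures:
    for top_p in top_ps:
        # Build a fresh chain for each setting so the configured model is the one invoked
        tuned_llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=temperature, top_p=top_p)
        tuned_chain = (
            {"context": retriever | format_docs, "question": RunnablePassthrough()}
            | prompt
            | tuned_llm
            | StrOutputParser()
        )
        response = tuned_chain.invoke(query).strip()

        # One table row per parameter combination
        results += f"| {temperature} | {top_p} | {response} |\n"

Markdown(results)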

| Temperature | Top P | Response |
|-------------|-------|----------|
| 0.1 | 0.1  | Harmonised standards are standards published in the Official Journal of the European Union. These standards are presumed to be in conformity with the requirements of the Regulation. This information is found on page 16. 
| 0.1 | 0.5  | Harmonized standards are standards published in the Official Journal of the European Union. These standards are presumed to be in conformity with the requirements of the Regulation. This information is found on page 16. 
| 0.1 | 1.0  | Harmonised standards are standards published in the Official Journal of the European Union. These standards are presumed to be in conformity with the requirements of the Regulation. The answer can be found on page 16. 
| 0.5 | 0.1  | Harmonised standards are standards published in the Official Journal of the European Union. These standards are presumed to be in conformity with the requirements of the regulation. This information is found on page 16. 
| 0.5 | 0.5  | Harmonized standards are standards published in the Official Journal of the European Union. They are presumed to be in conformity with the requirements of the Regulation. This information is found on page 16. 
| 0.5 | 1.0  | Harmonized standards are standards published in the Official Journal of the European Union. These standards are presumed to be in conformity with the requirements of the Regulation. This information is found on page 16. 
| 1.0 | 0.1  | Harmonised standards are standards that have been published in the Official Journal of the European Union. These standards are presumed to be in conformity with the requirements of the Regulation. This information is found on page 16. 
| 1.0 | 0.5  | Harmonised standards are standards that have been published in the Official Journal of the European Union. These standards are presumed to be in conformity with the requirements of the Regulation. This information is found on page 16. 
| 1.0 | 1.0  | Harmonised standards are standards that have been published in the Official Journal of the European Union. These standards are presumed to be in conformity with the requirements of the Regulation. This information is found on page 16 of the document. 
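
For intuition about the two parameters varied above, the sketch below applies the commonly described forms of temperature scaling and top-p (nucleus) filtering to a made-up score vector. How Gemini applies these settings internally is not covered by this notebook, so treat this as an assumption-laden illustration; `apply_temperature`, `nucleus`, and `toy_logits` are hypothetical names.

```python
import math


def apply_temperature(logits: list[float], temperature: float) -> list[float]:
    """Softmax over logits divided by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def nucleus(probs: list[float], top_p: float) -> list[int]:
    """Indices of the smallest set of tokens whose cumulative probability reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return kept


toy_logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
for t in (0.1, 1.0):
    probs = apply_temperature(toy_logits, t)
    print(t, [round(p, 3) for p in probs], nucleus(probs, top_p=0.5))
```

A low temperature concentrates almost all probability on the top-scoring token, and a low `top_p` discards the tail of the distribution, so both settings push generation toward the same few tokens.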


### Gemini vs GPT-2

In [31]:
from langchain_huggingface.llms import HuggingFacePipeline
from transformers import pipeline

generator = pipeline('text-generation', model='gpt2', max_length=1000, pad_token_id=50256, return_full_text=False)
gpt2 = HuggingFacePipeline(pipeline=generator)

In [33]:
rag_chain_gpt2 = (
    {
        "context": retriever | format_docs,
        "question": RunnablePassthrough(),
    }
    | prompt
    | gpt2
    | StrOutputParser()
)

query = "What is the medical devices regulation?"
Markdown(rag_chain_gpt2.invoke(query))



An individual or institution is not a person or entity and is not subject to either the Medical

Device Regulation or the medical device regulations. It is essential to

list all the necessary devices and procedures in the medical device

registry, to ensure that the necessary devices and procedures are

considered in relation to the specific individual who qualifies for the licensing

of such medical devices under the Regulation and to maintain their functionality and

accuracy. When the Medical Device Regulation applies to an individual, his

health information must be clearly identified at the time he receives his

license or a copy of the license must be provided to the patient.

Question: What is the'medical devices policy', "medical device

licensing policy", which applies to consumers seeking

medical devices license?

Find the right answer of yes or no for the Medical Device, Medical

Device Regulation, and Medical Device licensing policy under your choice of

the'medical device policy' under your choice of

the'medical device licensing policy' at the end of your order of

licensing.



### Prompt tuning

In [50]:
template = """
{context}

Question: {question}

Helpful Answer:"""
prompt = PromptTemplate.from_template(template)

In [53]:
query = "What is a LLM?"
rag_chain_prompt_tuning = (
    {
        "context": retriever | format_docs,
        "question": RunnablePassthrough(),
    }
    | prompt
    | llm
    | StrOutputParser()
)
Markdown(rag_chain_prompt_tuning.invoke(query))

The provided text snippets don't contain any direct mention or definition of what an LLM is.  These snippets are excerpts from various European Union Regulations and Directives. 

**LLM stands for Large Language Model.**  It is a type of artificial intelligence that is trained on massive amounts of text data and can generate human-like text, translate languages, write different kinds of creative content, and answer your questions in an informative way.

The text you provided likely discusses various regulations regarding data protection, technical and organizational measures, and compliance with EU directives. While LLMs are capable of analyzing text and generating outputs, the provided text doesn't provide any information on this specific technology. 


## Chat history

In [67]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables import RunnableLambda
from langchain_community.chat_message_histories import ChatMessageHistory

chat_history = ChatMessageHistory()

system_template = """
You are a Q&A chatbot that helps to answer the user's questions about a given document. Always follow these rules to answer the question:

Use the following pieces of context to answer the questions.
If the question is not related to the context, just say it is not related.
If you don't know the answer to any of the questions, just say that you don't know, don't try to make up an answer.
Always mention on which pages the information you give is found.

<context>
{context}
</context>
"""

question_answering_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_template),
        # Earlier turns of the conversation, oldest first
        MessagesPlaceholder(variable_name="chat_history"),
        # The current question always comes last
        ("human", "{question}"),
    ]
)

question_runnable = RunnableLambda(lambda input: input["question"])
chat_history_runnable = RunnableLambda(lambda input: input["chat_history"])


rag_chain = (
    {
        "context": question_runnable | retriever | format_docs,
        "question": question_runnable,
        "chat_history": chat_history_runnable,
    }
    | question_answering_prompt
    | llm
    | StrOutputParser()
)
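
A chat prompt of this kind is typically assembled as an ordered list of messages: the system instructions first, then the earlier turns, then the new question. The sketch below shows that ordering with plain tuples. It is a simplified illustration rather than LangChain's message classes, and `assemble_messages` and the sample turns are hypothetical.

```python
def assemble_messages(
    system_text: str, history: list[tuple[str, str]], question: str
) -> list[tuple[str, str]]:
    """Order matters: instructions first, then earlier turns, then the new question."""
    return [("system", system_text), *history, ("human", question)]


# Hypothetical earlier exchange and follow-up question.
history = [
    ("human", "What is MDR?"),
    ("ai", "Regulation (EU) 2017/745 on medical devices."),
]
messages = assemble_messages("Answer from the context only.", history, "When was it adopted?")
for role, text in messages:
    print(f"{role}: {text}")
```

Keeping both the user's and the model's turns in the history is what allows a follow-up such as "When was it adopted?" to be resolved against the earlier question.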

In [70]:
def QuestionAnswerLoop():
    print("Enter your question (type 'quit' to exit): ")
    while True:
        user_input = input("Enter your question (type 'quit' to exit): ")
        if user_input.lower() == 'quit':
            print("Exiting Q&A chat. Goodbye!")
            break
        else:
            response = rag_chain.invoke(
                {
                    "question": user_input, 
                    "chat_history": chat_history.messages
                }
            )

            # Record both sides of the exchange in the chat history
            chat_history.add_user_message(user_input)
            chat_history.add_ai_message(response)

            # Print the response
            print("Question: " + user_input)
            print("Answer: " + response)
        

In [73]:
QuestionAnswerLoop()

Enter your question (type 'quit' to exit): 
Question: what is MDR
Answer: MDR stands for **Medical Devices Regulation**. 

Specifically, it refers to **Regulation (EU) 2017/745 of the European Parliament and of the Council of 5 April 2017 on medical devices**, which replaced the older Directives for medical devices (90/385/EEC and 93/42/EEC) in Europe. This Regulation sets a new framework for the safety and performance standards of medical devices in the European Union.

You can see this abbreviation mentioned on the pages you provided. For example, on Page 3, you can read "The requirements of **Regulation (EU) 2017/746** shall apply to the in vitro diagnostic medical device part of the device."

The text you provided appears to be excerpts from documents related to MDR. 

Question: what is MDR
Answer: MDR stands for **Medical Device Regulation**. It refers to **Regulation (EU) 2017/745** of the European Parliament and of the Council of 5 April 2017 on medical devices. 

It's the curre