# Building RAG Chatbots for Technical Documentation

## Table of contents

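- [Introduction](#introduction)
- [Environment Setup](#environment-setup)
- [Load and split the document](#load-and-split-the-document)
- [Generate and store the embeddings](#generate-and-store-the-embeddings)
- [Retrieve](#retrieve)
- [LLM](#llm)
- [Better LLM](#better-llm)
- [Generate](#generate)
- [Chat Memory](#chat-memory)
- [Chat loop](#chat-loop)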

## Introduction

This project implements retrieval-augmented generation (RAG) with `LangChain` to build a chatbot that answers
questions about technical documentation. The document chosen for this assignment is the European Union Medical Device Regulation, Regulation (EU) 2017/745 (EU MDR).

## Environment Setup

Install the packages and dependencies to be used:

In [1]:
# Install required libraries (python-dotenv and ipywidgets are imported in later cells)
%pip install -qU langchain langchain-community langchain-chroma langchain-text-splitters unstructured sentence_transformers langchain-huggingface huggingface_hub pdfplumber langchain-google-genai python-dotenv ipywidgets

Note: you may need to restart the kernel to use updated packages.


## Load and split the document

In [1]:
# Using PDF document

from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PDFPlumberLoader

loader = PDFPlumberLoader("document.pdf")
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=200, add_start_index=True
)
pages = loader.load_and_split(text_splitter)

print(pages[0])


page_content='02017R0745 — EN — 09.07.2024 — 004.001 — 1
This text is meant purely as a documentation tool and has no legal effect. The Union's institutions do not assume any liability
for its contents. The authentic versions of the relevant acts, including their preambles, are those published in the Official
Journal of the European Union and available in EUR-Lex. Those official texts are directly accessible through the links
embedded in this document
►B REGULATION (EU) 2017/745 OF THE EUROPEAN PARLIAMENT AND OF THE COUNCIL
of 5 April 2017
on medical devices, amending Directive 2001/83/EC, Regulation (EC) No 178/2002 and
Regulation (EC) No 1223/2009 and repealing Council Directives 90/385/EEC and 93/42/EEC
(Text with EEA relevance)
(OJ L 117, 5.5.2017, p. 1)
Amended by:
Official Journal
No page date
►M1 Regulation (EU) 2020/561 of the European Parliament and of the L 130 18 24.4.2020
Council of 23 April 2020
►M2 Commission Delegated Regulation (EU) 2023/502 of 1 December 2022 L 70 1 8.
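
To make the `chunk_size` and `chunk_overlap` parameters concrete, here is a minimal sketch of fixed-size chunking with overlap in plain Python. It is a simplified stand-in, not how `RecursiveCharacterTextSplitter` works internally: the real splitter prefers to break on separators such as paragraph and line breaks, whereas the hypothetical `chunk_with_overlap` helper below slices purely by character count. The `start_index` key mirrors the value that `add_start_index=True` records in each chunk's metadata.

```python
def chunk_with_overlap(text, chunk_size, chunk_overlap):
    """Slice text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters. Hypothetical helper
    for illustration only; it ignores separators entirely.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append({"start_index": start, "text": text[start:start + chunk_size]})
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = chunk_with_overlap(sample, chunk_size=10, chunk_overlap=4)
for chunk in chunks:
    print(chunk["start_index"], chunk["text"])
```

With `chunk_size=10` and `chunk_overlap=4`, each window starts 6 characters after the previous one, so the last 4 characters of one chunk reappear at the start of the next. The overlap reduces the chance that a sentence relevant to a question is cut in half at a chunk boundary.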

## Generate and store the embeddings

In [3]:
# Generate and store the embeddings
from langchain_chroma import Chroma
from langchain_huggingface.embeddings import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")

vectorstore = Chroma.from_documents(documents=pages, embedding=embeddings, persist_directory="db")

## Retrieve


In [4]:
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 3})

retrieved_docs = retriever.invoke("Describe the use of harmonised standards")

print(len(retrieved_docs))

for doc in retrieved_docs:
    print("page " + str(doc.metadata["page"] + 1) + ":", doc.page_content[:300])

3
page 16: which have been published in the Official Journal of the European
Union, shall be presumed to be in conformity with the requirements
of this Regulation covered by those standards or parts thereof.
The first subparagraph shall also apply to system or process
requirements to be fulfilled in accordance
page 197: used, particularly as regards sterilisation and the relevant documents;
and
(e) the appropriate tests and trials which are to be carried out before,
during and after manufacture, the frequency with which they are to
take place, and the test equipment to be used; it shall be possible to
trace back ad
page 168: an initial start-up phase.
1.6. Participation in coordination activities
1.6.1. The notified body shall participate in, or ensure that its assessment
personnel is informed of, any relevant standardisation activities and in
the activities of the notified body coordination group referred to in
Article
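
The retriever returns the `k` chunks whose embeddings are closest to the embedding of the query. The sketch below illustrates that idea with cosine similarity over toy three-dimensional vectors. The vectors, the chunk labels, and the `top_k` helper are made up for illustration; the real embeddings come from `all-mpnet-base-v2`, and the distance metric Chroma actually applies depends on how the collection is configured.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vector, indexed_chunks, k):
    """Return the labels of the k chunks most similar to the query."""
    scored = [(cosine_similarity(query_vector, vec), label) for label, vec in indexed_chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [label for _, label in scored[:k]]


# Toy "vector store": (label, embedding) pairs with invented values.
toy_index = [
    ("chunk about harmonised standards", [0.9, 0.1, 0.0]),
    ("chunk about notified bodies", [0.6, 0.6, 0.1]),
    ("chunk about clinical investigations", [0.0, 0.2, 0.9]),
    ("chunk about labelling", [0.1, 0.9, 0.2]),
]
toy_query = [1.0, 0.2, 0.0]  # pretend embedding of the user's question

best = top_k(toy_query, toy_index, k=3)
print(best)
```

Note that the retriever always returns `k` chunks, even when only one or two are truly relevant, which is consistent with the mixed relevance of the three pages printed above.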


## LLM

In [7]:
from langchain_huggingface.llms import HuggingFacePipeline
from transformers import pipeline, set_seed

generator = pipeline('text-generation', model='gpt2', max_length=1000, pad_token_id=50256, return_full_text=False)

gpt2 = HuggingFacePipeline(pipeline=generator)

set_seed(42)
generator("Describe the use of harmonised standards", max_length=30, num_return_sequences=5)


Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.


[{'generated_text': ':\n\nThis is an extremely important question—how can a large proportion of people who may not speak the'},
 {'generated_text': ' to encourage good use of the same product\n\n(2)Where there is a strong preference among experts in'},
 {'generated_text': ", including the EU's 'Duke of Limbo' regulations as a 'guarantee that the countries"},
 {'generated_text': ', one based on international practice of the EU.\n\n1. A national harmonised standard: whether under'},
 {'generated_text': '. I find that most of the information I have for a particular type of document has nothing to do with the'}]

## Better LLM

In [8]:
from langchain_google_genai import ChatGoogleGenerativeAI
from IPython.display import Markdown

from dotenv import load_dotenv
load_dotenv()

llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash")
result = llm.invoke("What is an LLM?")

print(result.__dict__.keys())
Markdown(result.content)

dict_keys(['content', 'additional_kwargs', 'response_metadata', 'type', 'name', 'id', 'example', 'tool_calls', 'invalid_tool_calls', 'usage_metadata'])


LLM stands for **Large Language Model**. It's a type of artificial intelligence (AI) that excels at understanding and generating human-like text. 

Here's a breakdown:

**What it is:**

* **A complex neural network:** LLMs are built using deep learning techniques, specifically neural networks with many layers.
* **Trained on massive text datasets:** These models are trained on enormous amounts of text data, like books, articles, code, and online conversations. 
* **Capable of various language tasks:** LLMs can perform a wide range of tasks, including:
    * **Text generation:** Writing stories, poems, articles, emails, and more.
    * **Translation:** Converting text from one language to another.
    * **Summarization:** Condensing large amounts of text into concise summaries.
    * **Question answering:** Providing answers to questions based on given text.
    * **Code generation:** Writing code in various programming languages.
    * **Dialogue generation:** Engaging in natural-sounding conversations.

**Examples:**

* **GPT-3 (Generative Pre-trained Transformer 3):** Developed by OpenAI, it's known for its impressive text generation capabilities.
* **BERT (Bidirectional Encoder Representations from Transformers):** Developed by Google, it excels at understanding the context of words in a sentence.
* **LaMDA (Language Model for Dialogue Applications):** Developed by Google, it's designed for natural and engaging conversation.

**Key features:**

* **Generative:** LLMs can create new text based on the patterns they learned during training.
* **Contextual:** They can understand the meaning of words based on their surrounding context.
* **Versatile:** LLMs can be adapted to various language tasks.

**Limitations:**

* **Bias and misinformation:** LLMs can reflect the biases present in their training data, leading to potentially harmful or misleading outputs.
* **Lack of true understanding:** Despite their fluency, LLMs don't truly understand the meaning of the text they generate.
* **Computational demands:** Training and running LLMs require significant computational resources.

**Overall, LLMs are powerful tools with the potential to revolutionize how we interact with language. However, it's important to be aware of their limitations and use them responsibly.** 


In [None]:
result_gpt2 = gpt2.invoke("What is an LLM?")
Markdown(result_gpt2)

## Generate

In [9]:
from langchain_core.prompts import PromptTemplate

template = """Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
Use three sentences maximum and keep the answer as concise as possible. Mention in which pages the answer is found.

{context}

Question: {question}

Helpful Answer:"""
prompt = PromptTemplate.from_template(template)


example_messages = prompt.invoke(
    {"context": "filler context", "question": "filler question"}
).to_messages()
example_messages

print(example_messages[0].content)

Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
Use three sentences maximum and keep the answer as concise as possible. Mention in which pages the answer is found.

filler context

Question: filler question

Helpful Answer:
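
As a rough mental model, filling this template is plain string substitution of the `{context}` and `{question}` placeholders. The sketch below reproduces the printed prompt with Python's built-in `str.format`. It only illustrates the substitution step and is not a replacement for `PromptTemplate`, which also tracks the input variables and plugs into a chain. The `plain_template` name is introduced here for illustration.

```python
plain_template = """Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
Use three sentences maximum and keep the answer as concise as possible. Mention in which pages the answer is found.

{context}

Question: {question}

Helpful Answer:"""

# Substitute both placeholders, as the notebook does with filler values.
filled = plain_template.format(context="filler context", question="filler question")
print(filled)
```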


In [10]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

def format_docs(docs):
    formatted_docs = []
    for doc in docs:
        page_number = doc.metadata["page"] + 1 
        content_with_page = f"Page {page_number}:\n{doc.page_content}"
        formatted_docs.append(content_with_page)
    return "\n\n".join(formatted_docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

Markdown(rag_chain.invoke("Describe the use of harmonised standards"))

Harmonized standards, whose references are published in the Official Journal of the European Union, are presumed to comply with the requirements of the regulation. This applies to both product requirements and system or process requirements, such as quality management systems. Notified bodies are required to assess conformity with harmonized standards when manufacturers use them. (Pages 16 and 197) 


In [11]:
Markdown(rag_chain.invoke("What is an LLM?"))

I'm sorry, but the provided text does not contain any information about LLMs (Large Language Models). 
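
The pipe syntax hides a simple data flow: retrieve chunks, format them with page numbers, fill the prompt, call the model, and return a string. The following sketch spells out those steps as ordinary function calls. Everything in it is a stand-in: `FakeDoc`, `fake_retriever`, `fake_llm`, and `run_rag` are hypothetical stubs so the example runs without a vector store or an API key, the prompt is shortened, and the canned chunk text is abbreviated from the retrieval output shown earlier.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDoc:
    """Stand-in for a LangChain Document (hypothetical, for illustration)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def fake_retriever(question):
    # A real retriever would run a similarity search; this returns a canned chunk.
    return [FakeDoc("shall be presumed to be in conformity with the requirements", {"page": 15})]


def format_docs(docs):
    # Same logic as the notebook: prefix each chunk with its 1-based page number.
    return "\n\n".join(f"Page {d.metadata['page'] + 1}:\n{d.page_content}" for d in docs)


def fake_llm(prompt_text):
    # A real model would generate an answer; this stub only reports the prompt size.
    return f"(stub answer based on a prompt of {len(prompt_text)} characters)"


def run_rag(question):
    # Step 1: retriever | format_docs
    context = format_docs(fake_retriever(question))
    # Step 2: prompt (shortened version of the template)
    prompt_text = f"{context}\n\nQuestion: {question}\n\nHelpful Answer:"
    # Step 3: llm | StrOutputParser()
    return fake_llm(prompt_text), prompt_text


answer, prompt_text = run_rag("Describe the use of harmonised standards")
print(prompt_text)
print(answer)
```

Because the model only sees what the retriever supplies, a question unrelated to the document (such as "What is an LLM?") leaves it with no usable context, and the prompt's instruction not to make up an answer leads to the refusal shown above.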


In [None]:
import ipywidgets as widgets
from IPython.display import display

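# Function that processes the user's question
def process_input(user_input):
    # Build the complete prompt (retrieved context + question) so it can be displayed
    context = (retriever | format_docs).invoke(user_input)
    complete_prompt = prompt.format(context=context, question=user_input)
    # Send the already-built prompt to the model instead of retrieving a second time
    answer = (llm | StrOutputParser()).invoke(complete_prompt)
    return answer, complete_prompt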

# Text input widget (for user question)
text_input = widgets.Text(
    description='Input:',
    placeholder='Type something here...',
    style={'description_width': 'initial'},
    layout=widgets.Layout(width='500px')
)

# Primary output widget to display LLM's answer
primary_output = widgets.Output()

# Secondary output widget for the complete generated prompt (collapsible)
secondary_output = widgets.Output()

# Progress indicator (shown while processing)
progress_indicator = widgets.Output()

# Event handler for when Enter is pressed in the text input
def on_text_submit(change):
    if change['type'] == 'change' and change['name'] == 'value':
        input_value = change.new.strip()
        if input_value == "":
            return

        # Show the progress indicator
        with progress_indicator:
            progress_indicator.clear_output()
            print("Processing... Please wait.")

        answer, complete_prompt = process_input(input_value)  # Processes the input

        with primary_output:
            primary_output.clear_output()       # Clears the previous output
            print(f"User Question: {input_value}")
            print("LLM Answer:")
            print(answer)

        with secondary_output:
            secondary_output.clear_output()     # Clears the previous output
            print("Complete prompt:")
            print(complete_prompt)

        # Hide the progress indicator after processing is complete
        with progress_indicator:
            progress_indicator.clear_output()

        # Clear the input box; the resulting empty-value change is ignored by the check above
        text_input.value = ""

# Attach the event handler to the text input widget for Enter key submission
text_input.continuous_update = False
text_input.observe(on_text_submit, names='value', type="change")

# Make the complete generated prompt collapsible
accordion = widgets.Accordion(children=[secondary_output])
accordion.set_title(0, 'Complete generated prompt')

# Display all the fields: text input, LLM's answer, complete prompt
display(text_input, primary_output, accordion, progress_indicator)

Text(value='', continuous_update=False, description='Input:', layout=Layout(width='500px'), placeholder='Type …

Output()

Accordion(children=(Output(),), titles=('Complete generated prompt',))

Output()

## Chat Memory

In [None]:
from langchain_core.messages import HumanMessage
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

system_template = """
You are a Q&A chatbot that helps to answer the user's questions about a given document. Always follow these rules to answer the question:

Use the following pieces of context to answer the questions.
If the question is not related to the context, just say it is not related.
If you don't know the answer to any of the questions, just say that you don't know, don't try to make up an answer.
Always mention in which pages the information you give are found.

<context>
{context}
</context>
"""

question_answering_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            system_template,
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

def format_docs(docs):
    formatted_docs = []
    for doc in docs:
        page_number = doc.metadata["page"] + 1 
        content_with_page = f"Page {page_number}:\n{doc.page_content}"
        formatted_docs.append(content_with_page)
    return "\n\n".join(formatted_docs)

docs = (retriever | format_docs).invoke("Describe the use of harmonised standards")

chain = question_answering_prompt | llm


chain.invoke(
    {
        "context" : docs,
        "messages": [
            HumanMessage(
                content="Describe the use of harmonised standards"
            ),
        ],
    }
)


AIMessage(content='Harmonised standards are standards that are recognized and published by the Official Journal of the European Union (page 16). These standards are presumed to be in conformity with the requirements of the regulation covered by those standards (page 16). The notified body will assess conformity with those standards when a manufacturer uses them (page 197). \n', additional_kwargs={}, response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'safety_ratings': [{'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HATE_SPEECH', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HARASSMENT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability': 'NEGLIGIBLE', 'blocked': False}]}, id='run-ee53edca-2e80-4a8c-bd08-20eaef1570a8-0', usage_metadata={'input_tokens': 677, 'output_token

In [20]:
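# ChatMessageHistory lives in langchain-community; the langchain.memory path is a deprecated alias
from langchain_community.chat_message_histories import ChatMessageHistory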

chat_history = ChatMessageHistory()
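
Conceptually, the chat history is an ordered list of messages that grows by two entries per turn (the user's question and the model's answer), and the whole list is sent to the model on every call. The sketch below mimics that with plain dictionaries. `SimpleHistory` is a hypothetical stand-in written for this example, not the `ChatMessageHistory` class itself, and the message texts are placeholders.

```python
class SimpleHistory:
    """Hypothetical stand-in that stores messages as role/content dicts."""

    def __init__(self):
        self.messages = []

    def add_user_message(self, content):
        self.messages.append({"role": "human", "content": content})

    def add_ai_message(self, content):
        self.messages.append({"role": "ai", "content": content})


history = SimpleHistory()
history.add_user_message("What does Article 39 talk about?")
history.add_ai_message("(model answer would go here)")
history.add_user_message("Tell me more about that.")

# On the second turn the model receives all three messages, which is what
# lets it resolve "that" to the topic of the first question.
for message in history.messages:
    print(f"{message['role']}: {message['content']}")
```

The history only affects what the model sees, not what the retriever fetches. If retrieval is driven by the latest user input alone, a follow-up question can pull in different chunks than the question it refers to.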

## Chat loop

In [None]:

def QuestionAnswerLoop():
    # input() already displays the prompt, so no separate print is needed
    while True:
        user_input = input("Enter your question (type 'quit' to exit): ")
        if user_input.lower() == 'quit':
            print("Exiting Q&A chat. Goodbye!")
            break
        else:
            # get document blocks from retriever
            retrieved_docs = (retriever | format_docs).invoke(user_input)
            
            # Add the user's question to the chat history
            chat_history.add_user_message(user_input)

            # Generate the response with context from the retrieved documents
            response = chain.invoke(
                {
                    "messages": chat_history.messages,
                    "context": retrieved_docs,
                }
            )

            # Add the AI's response to the chat history
            chat_history.add_ai_message(response)

            # Print the response
            print("Question: " + user_input)
            print("Answer: " + response.content)
        


In [23]:
QuestionAnswerLoop()

Question: what are some retrictions on medical devices in the EU?
Answer: The document states that national laws concerning the organization, delivery, or financing of health services can impose restrictions on medical devices. For example, Page 5 mentions that certain devices might require a medical prescription, be dispensed only by specific professionals, or require professional counseling. 

Question: tell me more about the need for medical prescriptions
Answer: This document does not provide specific information about the need for medical prescriptions for medical devices. 

Question: what does article 39 talk about?
Answer: Article 39, found on Page 52, discusses the re-assessment of notified bodies in the EU.  It states that the authority responsible for notified bodies can conduct a complete re-assessment before the dates mentioned in the first subparagraph, either upon request by the notified body or if they have concerns about the body's continued compliance with the requirem