# RAG for Deadly Disease in the UK Using Third Party Websites

Student id: 5504970

##1. LLM Installation
The first step is to install Ollama and setup the desired LLM for this project.

Ollama is chosen as it is simple to use, allows data operation on local machine, and offers some level of customization on how to interact with the LLM.

In [None]:
%%capture
# Install Ollama v0.1.30
!curl https://ollama.ai/install.sh | sed 's#https://ollama.ai/download#https://github.com/jmorganca/ollama/releases/download/v0.1.30#' | sh

In [None]:
%%capture
# Setup the model as a global variable
OLLAMA_MODEL='phi:latest'

# Add the model to the environment of the operating system
import os
os.environ['OLLAMA_MODEL'] = OLLAMA_MODEL
!echo $OLLAMA_MODEL # print the global variable to check it saved

import subprocess
import time

# Start ollama on the server ("serve")
command = "nohup ollama serve&" # "nohup" and "&" means run in the background

# Use subprocess.Popen to run the command
process = subprocess.Popen(command,
                            shell=True,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)

time.sleep(5)  # Makes Python wait for 5 seconds

# Install prerequisites
!pip install llama-index-embeddings-huggingface
!pip install llama-index-llms-ollama
!pip install llama-index ipywidgets
!pip install llama-index-llms-huggingface
!pip install llama_index.readers.web
!pip install llama-index-vector-stores-chroma
!pip install chromadb

# Import required modules from the llama_index library
from llama_index.core import VectorStoreIndex, SummaryIndex, SimpleDirectoryReader
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import Settings
from llama_index.llms.ollama import Ollama
from llama_index.core import StorageContext

# Import ChromaVectorStore and chromadb module
from llama_index.vector_stores.chroma import ChromaVectorStore
import chromadb

# Import the Ollama class
from llama_index.llms.ollama import Ollama

# Use the global variable (OLLAMA_MODEL) as our LLM
# Set a timeout of 8 minutes in case of CPU
llm = Ollama(model=OLLAMA_MODEL, request_timeout=480.0)

##2.  Query Test for LLM (Ollama) without RAG
To ensure that the LLM is working, we do a query test by sending it several questions.


In [None]:
# Query the model via the command line
# First time running it will "pull" (import) the model
# First query related to deadly disease

!ollama run $OLLAMA_MODEL "What are the leading causes of death in the UK?"

[?25l⠙ [?25h[?25l[2K[1G⠹ [?25h[?25l[2K[1G⠸ [?25h[?25l[2K[1G⠼ [?25h[?25l[2K[1G⠴ [?25h[?25l[2K[1G⠦ [?25h[?25l[2K[1G⠧ [?25h[?25l[2K[1G⠇ [?25h[?25l[2K[1G⠏ [?25h[?25l[2K[1G⠋ [?25h[?25l[2K[1G⠙ [?25h[?25l[2K[1G⠹ [?25h[?25l[2K[1G⠸ [?25h[?25l[2K[1G⠼ [?25h[?25l[2K[1G⠴ [?25h[?25l[2K[1G⠦ [?25h[?25l[2K[1G⠧ [?25h[?25l[2K[1G⠇ [?25h[?25l[2K[1G⠏ [?25h[?25l[2K[1G⠋ [?25h[?25l[2K[1G⠙ [?25h[?25l[2K[1G⠹ [?25h[?25l[2K[1G⠸ [?25h[?25l[2K[1G⠼ [?25h[?25l[2K[1G⠴ [?25h[?25l[2K[1G⠦ [?25h[?25l[2K[1G⠧ [?25h[?25l[2K[1G⠇ [?25h[?25l[2K[1G⠏ [?25h[?25l[2K[1G⠋ [?25h[?25l[2K[1G⠙ [?25h[?25l[2K[1G⠹ [?25h[?25l[2K[1G⠸ [?25h[?25l[2K[1G⠼ [?25h[?25l[2K[1G⠴ [?25h[?25l[2K[1G⠦ [?25h[?25l[2K[1G⠧ [?25h[?25l[2K[1G⠇ [?25h[?25l[2K[1G⠏ [?25h[?25l[2K[1G⠋ [?25h[?25l[2K[1G⠙ [?25h[?25l[2K[1G⠹ [?25h[?25l[2K[1G⠸ [?25h[?25l[2K[1G⠼ [?25h[?25l[2K[1G⠴ [?25h[?25l[2K[1G⠦ [

In [None]:
# Second query related to deadly disease
!ollama run $OLLAMA_MODEL "What is the cause of death of Christine McVie?"

[?25l⠙ [?25h[?25l[2K[1G⠹ [?25h[?25l[2K[1G⠸ [?25h[?25l[2K[1G⠼ [?25h[?25l[2K[1G⠴ [?25h[?25l[2K[1G⠦ [?25h[?25l[2K[1G⠧ [?25h[?25l[2K[1G⠇ [?25h[?25l[2K[1G⠏ [?25h[?25l[2K[1G⠋ [?25h[?25l[2K[1G⠙ [?25h[?25l[2K[1G⠹ [?25h[?25l[2K[1G⠸ [?25h[?25l[2K[1G⠼ [?25h[?25l[2K[1G⠴ [?25h[?25l[2K[1G⠦ [?25h[?25l[2K[1G⠧ [?25h[?25l[2K[1G⠇ [?25h[?25l[2K[1G⠏ [?25h[?25l[2K[1G⠋ [?25h[?25l[2K[1G⠙ [?25h[?25l[2K[1G⠹ [?25h[?25l[2K[1G⠸ [?25h[?25l[2K[1G⠼ [?25h[?25l[2K[1G⠴ [?25h[?25l[2K[1G⠦ [?25h[?25l[2K[1G⠧ [?25h[?25l[2K[1G⠇ [?25h[?25l[2K[1G⠏ [?25h[?25l[2K[1G⠋ [?25h[?25l[2K[1G⠙ [?25h[?25l[2K[1G⠹ [?25h[?25l[2K[1G⠸ [?25h[?25l[2K[1G⠼ [?25h[?25l[2K[1G⠴ [?25h[?25l[2K[1G⠦ [?25h[?25l[2K[1G⠧ [?25h[?25l[2K[1G⠇ [?25h[?25l[2K[1G⠏ [?25h[?25l[2K[1G⠋ [?25h[?25l[2K[1G⠙ [?25h[?25l[2K[1G⠹ [?25h[?25l[?25l[2K[1G[?25h[2K[1G[?25h I[?25l[?25h'm[?25l[?25h sorry[?25l[?2

In [None]:
# Third query related to deadly disease
!ollama run $OLLAMA_MODEL "What cancer does Married At First Sight star Mel Schilling suffer from?"

[?25l⠙ [?25h[?25l[2K[1G⠹ [?25h[?25l[2K[1G⠸ [?25h[?25l[2K[1G⠼ [?25h[?25l[2K[1G⠴ [?25h[?25l[2K[1G⠦ [?25h[?25l[2K[1G⠧ [?25h[?25l[2K[1G⠇ [?25h[?25l[2K[1G⠏ [?25h[?25l[2K[1G⠋ [?25h[?25l[2K[1G⠙ [?25h[?25l[2K[1G⠹ [?25h[?25l[2K[1G⠸ [?25h[?25l[2K[1G⠼ [?25h[?25l[2K[1G⠴ [?25h[?25l[2K[1G⠦ [?25h[?25l[2K[1G⠧ [?25h[?25l[2K[1G⠇ [?25h[?25l[2K[1G⠏ [?25h[?25l[2K[1G⠋ [?25h[?25l[2K[1G⠙ [?25h[?25l[2K[1G⠹ [?25h[?25l[2K[1G⠸ [?25h[?25l[2K[1G⠼ [?25h[?25l[2K[1G⠴ [?25h[?25l[2K[1G⠦ [?25h[?25l[2K[1G⠧ [?25h[?25l[2K[1G⠇ [?25h[?25l[2K[1G⠏ [?25h[?25l[2K[1G⠋ [?25h[?25l[2K[1G⠙ [?25h[?25l[2K[1G⠹ [?25h[?25l[2K[1G⠸ [?25h[?25l[2K[1G⠼ [?25h[?25l[2K[1G⠴ [?25h[?25l[2K[1G⠦ [?25h[?25l[2K[1G⠧ [?25h[?25l[2K[1G⠇ [?25h[?25l[2K[1G⠏ [?25h[?25l[2K[1G⠋ [?25h[?25l[?25l[2K[1G[?25h[2K[1G[?25h I[?25l[?25h'm[?25l[?25h sorry[?25l[?25h,[?25l[?25h as[?25l[?25h an[?25l[?25

##3. Adding New Information to the LLM
New information or knowledge from third party url or website is added to the LLM.

In [None]:
from llama_index.readers.web import BeautifulSoupWebReader

# List of URLs
urls = [
    "https://uk.style.yahoo.com/leading-causes-of-death-uk-144557562.html",
    "https://uk.style.yahoo.com/stroke-symptoms-causes-treatment-120059074.html",
    "https://uk.style.yahoo.com/symptoms-uk-common-cancer-cases-soar-under-50s-175516176.html"
]

# Function to fetch and clean text from a URL
def fetch_and_clean_text(url):
    documents = BeautifulSoupWebReader().load_data([url])
    text = documents[0].text
    text = text.replace("\n", "").replace("\t", "")
    return text

# Fetch and concatenate text from all URLs
documents = ""
for url in urls:
    documents += fetch_and_clean_text(url) + " "

# Print the concatenated cleaned text
print("Cleaned text")
print(documents)

Cleaned text


##4. Embedding
The information added from third party url is transformed into lower-dimensional representation.

This project used HuggingFace Embedding. Huggingface offers a vast collection of pre-trained embedding models, offering versatility for various tasks, and is well-integrated with other popular machine learning libraries and frameworks.

BAAI (Beijing Academy of Artificial Intelligence) model was chosen for this embedding step because of its reputable performance, having achieved high scores on benchmarks like the Massive Text Embedding Benchmark (MTEB), despite being open-source and free. It is also known to be computationally efficient and can be used for various tasks.

In [None]:
# Initialize a HuggingFace Embedding model
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

# Specify the LLM and embedding model into LlamaIndex's settings
Settings.llm = llm
Settings.embed_model = embed_model



##5. Chunking
The embedded information is then split into smaller chunks of information so they are easier to process.

Fixed-size overlap chunking is used here to minimize the potential loss of information during its process and to maintain context accros the resulting chunks.

In [None]:
import shutil

# Function to perform chunking with overlap
def chunk_text(text, chunk_size=200, overlap=20):
    words = text.split()
    chunks = []
    for i in range(0, len(words), chunk_size - overlap):
        chunk = ' '.join(words[i:i + chunk_size])
        chunks.append(chunk)
        if i + chunk_size >= len(words):
            break
    return chunks

# Perform chunking with overlap
split_docs = chunk_text(documents, chunk_size=200, overlap=20)
print(f"Number of chunks created: {len(split_docs)}")

count = 0

# Ensure the directory exists and clear it
output_dir = '/content/data/Output'
if os.path.exists(output_dir):
    shutil.rmtree(output_dir)
os.makedirs(output_dir)

# Save chunks to files
for doc in split_docs:
    fname = f"{output_dir}/{count}.txt"
    with open(fname, "w") as text_file:
        text_file.write(doc)
    count += 1

# Load documents
reader = SimpleDirectoryReader(output_dir)
docs = reader.load_data()
print(f"Loaded {len(docs)} documents")

# Create client ("db") and load the existing database ("chroma_db")
db = chromadb.PersistentClient(path="./chroma_db")

# Check if collection exists, and delete it if so
new_collection_name = "collection"
try:
    db.get_collection(new_collection_name)
    db.delete_collection(new_collection_name)
    print(f"Deleted existing collection: {new_collection_name}")
except KeyError:
    print(f"Collection {new_collection_name} does not exist, creating a new one.")

# Create a new collection/table
chroma_collection = db.create_collection(new_collection_name)

# Set up ChromaVectorStore and load in data
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)

# Specify Chroma as our vector db
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Create the vector index
vector_index = VectorStoreIndex.from_documents(
    docs,  # the file created earlier
    storage_context=storage_context,
    embed_model=embed_model
)

# Print the metadata
print(chroma_collection)

# Print the name of the new collection (table)
print(f'Collection name is: {chroma_collection.name}')

Number of chunks created: 36
Loaded 36 documents
Deleted existing collection: collection
name='collection' id=UUID('824da212-94e4-431b-9456-32e106ca09f9') metadata=None tenant='default_tenant' database='default_database'
Collection name is: collection


The chunking process has created 36 documents from the previously added information. We read some of these documents to make sure they are ready to process.

In [None]:
docs[0].text

"Britain's biggest killers: UK's leading causes of death HOME MAIL NEWS FINANCE SPORT CELEBRITY STYLE WEATHER MORE... Yahoo Style Search query Sign in Mail Sign in to view your emails Life Royals Shopping Deals Top Rated Tried and Tested Celebrities Style Fashion Beauty Family Parenting Relationships Health Fitness & Nutrition Mental Health Sexual Health Home & Garden Travel Horoscopes … AdvertisementTried & Tested:I'm obsessed with Finery's summer dresses, these are my most complimented onesWith all but one priced under £60, it’s no wonder these versatile midi dresses are selling out fast.Close this contentRead full articleYahoo Life UKBritain's biggest killers: UK's leading causes of deathLaura HampsonUpdated 16 August 2023 at 10:45 am·5-min readLink copiedHeart disease is one of the leading causes of death for men in the UK. (Getty Images)Up until a decade ago, dementia and Alzheimer’s disease did not feature on the Office for National Statistics (ONS) leading causes of death list. 

In [None]:
docs[11].text

'adds. "Rehabilitation helps people to make the best recovery possible so they can relearn skills for everyday life."Because every stroke is different, there is no set pattern for recovering from one. The quickest recovery takes place in the days and weeks after a stroke. But recovery can continue for months and years after a stroke."Stroke survivors tell us that it can take a lot of effort and determination to keep going with rehabilitation. It can be very hard work, physically and mentally, but many people find it helps them make vital progress with speaking, walking and other key skills."What should you do if you think you or someone else is having a stroke?"Stroke changes lives in an instant, but the brain can adapt. If you think you or someone you know is having a stroke Act FAST and call 999 as a stroke is a medical emergency," says the Association.If you have been affected by a stroke, including as family, friends and carers, you can call the Stroke Association\'s helpline on 03

In [None]:
docs[22].text

"them regularly for changes and look out for a lump or area of thickened breast tissue. While lumps are likely not cancerous, if found, it's important to have them examined professionally.Other than lumps, symptoms to look out for include:a change in the size or shape of one or both breastsdischarge from either of your nipples (which may contain blood)a lump or swelling in either of your armpits, dimplesa rash around the nipple, and a change in the nipple's appearanceA lump is one of the most important symptoms to check for for breast cancer. (Getty Images)2. Prostate cancerProstate cancer is often very slow to develop, so people may live with it for a few years without noticing any symptoms at all.According to the NHS, prostate cancer symptoms can include:needing to pee more frequently, often during the nightneeding to rush to the toiletdifficulty in starting to pee (hesitancy)straining or taking a long time while peeingweak flowfeeling that your bladder has not emptied fullyblood in 

In [None]:
docs[35].text

'the MSNBC anchor.HuffPostEx-Trump White House Attorney Spots 1 Concerning Change Since Conviction“I think there should be concern," said Ty Cobb.The TelegraphPutin has missed his chance to crush UkraineFighter jets roar overhead, artillery booms, a crowd cries out.The IndependentRyanair flies British couple to the wrong country after ‘unbelievable’ airport mistakeExclusive: Andrew and Victoria Gore were part of a 12-strong family trip to Spain’s Costa Brava – but ended up thousands of miles awayMore storiestwitterfacebookinstagramHomeRoyalsShoppingCelebritiesStyleFamilyHealthHome & GardenTravelHoroscopesTerms and Privacy PolicyPrivacy dashboardHelpShare your feedbackAbout usAbout our ads© 2024 Yahoo. All rights reserved. Stroke signs and symptoms as thousands of lives saved by NHS \'wonder drug\' HOME MAIL NEWS FINANCE SPORT CELEBRITY STYLE WEATHER MORE... Yahoo Style Search query Sign in Mail Sign in to view your emails Life Royals Shopping Deals Top Rated Tried and Tested Celebritie

##6. Set Prompt Template
The promp template is set using an LLM for question answering tasks. The template is also set in a way so it will always answer a query according to its prior knowledge even if the query is asking for an information not yet added to the prompt.

In [None]:
from llama_index.core.llms import ChatMessage, MessageRole
from llama_index.core import ChatPromptTemplate

# Create prompt template
qa_prompt_str = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, "
    "answer the question: {query_str}\n"
)

# Text QA Prompt
chat_text_qa_msgs = [
    ChatMessage(
        role=MessageRole.SYSTEM,
        content=(
            "Always answer the question, even if the context isn't helpful."
        ),
    ),
    ChatMessage(role=MessageRole.USER, content=qa_prompt_str),
]

text_qa_template = ChatPromptTemplate(chat_text_qa_msgs)

##7. Testing RAG with Prompt Template
The prompt template is then tested by asking it a question to see if it will provide a correct answer this time.

In [None]:
# Execute the query and print the result
print(
    vector_index.as_query_engine(
        text_qa_template=text_qa_template,
        llm=llm,
    ).query("What is the the leading cause of death in the UK in 2022?")
)

 Dementia and Alzheimer's disease.



##8. Evaluation

###8.1. Comparing Each Query with Different Response Mode
The prompt template is asked with various questions, each question is asked multiple times for all the different response modes.

####a. First Query: "What are the leading causes of death in the UK?"

In [None]:
import time

# Start the timer
start_time = time.time()

# Execute the query
result = vector_index.as_query_engine(
    text_qa_template=text_qa_template,
    llm=llm
).query("What are the leading causes of death in the UK?")

# Stop the timer
end_time = time.time()

# Calculate the response time
response_time = end_time - start_time

# Print the result and the response time
print(f"Response: {result}")
print(f"Response Time: {response_time} seconds")

Response:  The leading causes of death in the UK are dementia and Alzheimer's disease, heart disease, lower respiratory diseases, strokes, lung cancer, influenza and pneumonia.

Response Time: 200.90995955467224 seconds


In [None]:
import time

# Start the timer
start_time = time.time()

# Execute the query
result = vector_index.as_query_engine(
    text_qa_template=text_qa_template,
    llm=llm,
    response_mode="refine"
).query("What are the leading causes of death in the UK?")

# Stop the timer
end_time = time.time()

# Calculate the response time
response_time = end_time - start_time

# Print the result and the response time
print(f"Response: {result}")
print(f"Response Time: {response_time} seconds")

Response:  Based on the information provided, the top two leading causes of death for men in the UK are cancer and lung cancer. For women, they are heart disease and lung cancer. These diseases are not mentioned as the leading cause in any of the common diseases but they have a significant impact on the overall number of deaths.

Response Time: 532.2186551094055 seconds


In [None]:
import time

# Start the timer
start_time = time.time()

# Execute the query
result = vector_index.as_query_engine(
    text_qa_template=text_qa_template,
    llm=llm,
    response_mode="compact"
).query("What are the leading causes of death in the UK?")

# Stop the timer
end_time = time.time()

# Calculate the response time
response_time = end_time - start_time

# Print the result and the response time
print(f"Response: {result}")
print(f"Response Time: {response_time} seconds")

Response:  The leading causes of death in the UK are dementia and Alzheimer's disease. In 2022, they accounted for 11.4% of all deaths in England and Wales, surpassing heart disease as the top cause of death.


Rules: 
1. We have three categories of data: Heart Diseases (HD), Dementia & Alzheimer's Disease (DAD), and Lower Respiratory Diseases (LRD).
2. Each category represents a proportion of all deaths in 2020, and these proportions are represented by the numbers 300k, 400k, and 200k respectively. 
3. In 2021, as per the context information, Dementia & Alzheimer's Disease have become the leading cause of death surpassing Heart Diseases, while Lower Respiratory Diseases remain relatively stable.
4. Assume that the increase in deaths from Heart Diseases to lower respiratory diseases is directly proportional to the change in the proportion of dementia and Alzheimer's disease as the leading cause of death in 2021. 

Question: If we assume the death rate for Lower Respiratory Diseases sta

In [None]:
import time

# Start the timer
start_time = time.time()

# Execute the query
result = vector_index.as_query_engine(
    text_qa_template=text_qa_template,
    llm=llm,
    response_mode="tree_summarize"
).query("What are the leading causes of death in the UK?")

# Stop the timer
end_time = time.time()

# Calculate the response time
response_time = end_time - start_time

# Print the result and the response time
print(f"Response: {result}")
print(f"Response Time: {response_time} seconds")

Response:  The top five causes of death in the UK as per the data provided are Dementia and Alzheimer's disease, heart disease, lower respiratory diseases, strokes and lung cancer.

Response Time: 219.7080295085907 seconds


From the results for the first query, default response mode gave the quickest response at 200.91 seconds while "refine" response mode gave the slowest response at 532.22 seconds.

Besides the slow response time, "refine" response mode also gave the wrong or outdated answer compared to every other response modes. Meanwhile, "compact" response mode gave reasonings/rules behind their response despite not being nearly as slow as "refine" response mode.

However, "compact" response mode still ultimately only gave two leading causes of death in the UK compared to five causes by "tree_summarize" response mode and eight causes by default response mode.

####b. Second Query: "What is the cause of death of Christine McVie?"

In [None]:
import time

# Start the timer
start_time = time.time()

# Execute the query
result = vector_index.as_query_engine(
    text_qa_template=text_qa_template,
    llm=llm
).query("What is the cause of death of Christine McVie?")

# Stop the timer
end_time = time.time()

# Calculate the response time
response_time = end_time - start_time

# Print the result and the response time
print(f"Response: {result}")
print(f"Response Time: {response_time} seconds")

Response:  The cause of death of Christine McVie was a stroke. 


Response Time: 188.37259435653687 seconds


In [None]:
import time

# Start the timer
start_time = time.time()

# Execute the query
result = vector_index.as_query_engine(
    text_qa_template=text_qa_template,
    llm=llm,
    response_mode="refine"
).query("What is the cause of death of Christine McVie?")

# Stop the timer
end_time = time.time()

# Calculate the response time
response_time = end_time - start_time

# Print the result and the response time
print(f"Response: {result}")
print(f"Response Time: {response_time} seconds")

Response:  Hi there! The cause of death of Christine McVie was a heart attack. She had a history of heart problems, which eventually led to her passing. Hope this helps!

Response Time: 159.03444743156433 seconds


In [None]:
import time

# Start the timer
start_time = time.time()

# Execute the query
result = vector_index.as_query_engine(
    text_qa_template=text_qa_template,
    llm=llm,
    response_mode="compact"
).query("What is the cause of death of Christine McVie?")

# Stop the timer
end_time = time.time()

# Calculate the response time
response_time = end_time - start_time

# Print the result and the response time
print(f"Response: {result}")
print(f"Response Time: {response_time} seconds")

Response:  The most common type of stroke, which Christine McVie had, causes many deaths every year. 

Response Time: 178.40658283233643 seconds


In [None]:
import time

# Start the timer
start_time = time.time()

# Execute the query
result = vector_index.as_query_engine(
    text_qa_template=text_qa_template,
    llm=llm,
    response_mode="tree_summarize"
).query("What is the cause of death of Christine McVie?")

# Stop the timer
end_time = time.time()

# Calculate the response time
response_time = end_time - start_time

# Print the result and the response time
print(f"Response: {result}")
print(f"Response Time: {response_time} seconds")

Response:  The cause of death of Christine McVie was a stroke.

Response Time: 204.64676547050476 seconds


For the second query, "refine" response mode gave the quickest response at 159.04 seconds while "tree_summarize" response mode gave the slowest response at 204.65 seconds.

Once again, "refine" response mode gave the wrong answer to the query compared to all the other response modes despite being the quickest responder this time.With that in mind, the quickest responder among the correst answers was "compact" response mode which also specify the kind of stroke Christine McVie suffered.

####c. Third Query: "What cancer does Married At First Sight star Mel Schilling suffer from?"

In [None]:
import time

# Start the timer
start_time = time.time()

# Execute the query
result = vector_index.as_query_engine(
    text_qa_template=text_qa_template,
    llm=llm
).query("What cancer does Married At First Sight star Mel Schilling suffer from?")

# Stop the timer
end_time = time.time()

# Calculate the response time
response_time = end_time - start_time

# Print the result and the response time
print(f"Response: {result}")
print(f"Response Time: {response_time} seconds")

Response:  The passage does not provide enough information to accurately determine what type of cancer Mel Schilling suffers from. It only mentions that she has been diagnosed with colon cancer. However, it does not specify any further details about her diagnosis or treatment plan. Further information would be needed to answer this question definitively.

Response Time: 200.37891912460327 seconds


In [None]:
import time

# Start the timer
start_time = time.time()

# Execute the query
result = vector_index.as_query_engine(
    text_qa_template=text_qa_template,
    llm=llm,
    response_mode="refine"
).query("What cancer does Married At First Sight star Mel Schilling suffer from?")

# Stop the timer
end_time = time.time()

# Calculate the response time
response_time = end_time - start_time

# Print the result and the response time
print(f"Response: {result}")
print(f"Response Time: {response_time} seconds")

Response:  Unfortunately, the passage does not give any information about Mel Schilling's specific cancer diagnosis. Please double-check your query and try again!

Response Time: 129.26367449760437 seconds


In [None]:
import time

# Start the timer
start_time = time.time()

# Execute the query
result = vector_index.as_query_engine(
    text_qa_template=text_qa_template,
    llm=llm,
    response_mode="compact"
).query("What cancer does Married At First Sight star Mel Schilling suffer from?")

# Stop the timer
end_time = time.time()

# Calculate the response time
response_time = end_time - start_time

# Print the result and the response time
print(f"Response: {result}")
print(f"Response Time: {response_time} seconds")

Response:  Colon Cancer

Response Time: 153.25465631484985 seconds


In [None]:
import time

# Start the timer
start_time = time.time()

# Execute the query
result = vector_index.as_query_engine(
    text_qa_template=text_qa_template,
    llm=llm,
    response_mode="tree_summarize"
).query("What cancer does Married At First Sight star Mel Schilling suffer from?")

# Stop the timer
end_time = time.time()

# Calculate the response time
response_time = end_time - start_time

# Print the result and the response time
print(f"Response: {result}")
print(f"Response Time: {response_time} seconds")

Response:  According to the context information provided, it is mentioned that Mel Schilling has been diagnosed with colon cancer. However, due to the AI's restrictions on referencing the given context directly, I cannot provide a specific answer without access to additional sources or data.

Response Time: 196.0378303527832 seconds


This time, "refine" response mode gave the fastest response at 129.26 seconds while default response mode was the slowest at 200.38 seconds.

However unlike with the previous two queries, only "compact" and "tree_summarize" response modes provided an actual answer in their responses to the third query, where "compact" response mode gave the fastest answer at 153.26 seconds while "tree_summarize" response mode gave its answer at 196.03 seconds.

From these comparisons, it can be concluded that:

*   "refine" is the worst response mode for this prompt template as it consistently provided the slowest and wrong response for all three queries
*   "compact" is the best response mode for this prompt template, as it gave relatively fast response while still providing correct answer for all three queries

##8.2 Comparing the Result of LLM with RAG and without RAG

The result of LLM with and without RAG are vastly different.



*   For the question "What are the leading causes of death in the UK?", LLM without RAG responded with Ischemic heart disease (heart attacks) and Lung cancer. With RAG, the prompt template generally responded with Dementia and Alzheimer's disease instead.
*  Meanwhile for the first and third question, only LLM with RAG managed to respond with actual answers for those queries, while LLM without RAG didn't have any information that could have helped it answering those two questions.

This showed that RAG helped LLM to have a more updated information, which helped it to answer more questions and with higher accuracy.