# ***Retrieval Augmented Generation on Medical Data***

> This project builds a Retrieval-Augmented Generation (RAG) pipeline to assist users with medical queries. It uses Wikipedia's medical articles as a knowledge base to answer questions on a variety of topics within the medical domain, including:
> - **Symptoms**: Understanding signs of different medical conditions.
> - **Treatments**: Recommended therapies, medications, and procedures for various health issues.
> - **Precautions**: Preventative measures to minimize risks associated with illnesses or health conditions.
> - **Causes**: Factors or conditions that contribute to the onset of diseases.
> - **Suggestions**: General advice or guidelines for managing health and wellness.
>
> The RAG pipeline combines information retrieval and language generation: passages relevant to a question are retrieved from the Wikipedia-based knowledge base, and a language model is instructed to answer using only those passages. Grounding responses in retrieved text is intended to make them more reliable than unguided generation; answer quality still depends on what the retriever returns.

---






# Installations

In [1]:
!pip install langchain



In [2]:
!pip install pypdf



In [3]:
!pip install sentence-transformers



In [4]:
!pip install chromadb



In [5]:
!pip install groq



In [6]:
!pip install umap-learn



In [7]:
!pip install datasets



# Medical Data

In [8]:
from datasets import load_dataset

ds = load_dataset("gamino/wiki_medical_terms")



In [9]:
ds['train'][0].keys()

dict_keys(['page_title', 'page_text', '__index_level_0__'])

In [10]:
import chromadb

from langchain.text_splitter import RecursiveCharacterTextSplitter, SentenceTransformersTokenTextSplitter
import numpy as np
from pypdf import PdfReader
from tqdm import tqdm

from chromadb.utils.embedding_functions import SentenceTransformerEmbeddingFunction

In [11]:
texts = []
for i in range(10):
    texts.append(ds['train'][i]['page_text'])

In [12]:
def recursive_chunker(texts):
  """ Chunk texts to ensure that all the chunks are relatively of same length for better embeddings."""
  character_splitter = RecursiveCharacterTextSplitter(
      separators=["\n\n", "\n", ". ", " ", ""],
      chunk_size=1000,
      chunk_overlap=0
  )
  character_split_texts = character_splitter.split_text('\n\n'.join(texts))

  token_splitter = SentenceTransformersTokenTextSplitter(chunk_overlap=0, tokens_per_chunk=256)

  token_split_texts = []
  for text in character_split_texts:
      token_split_texts += token_splitter.split_text(text)

  return token_split_texts
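
The idea behind recursive character splitting can be illustrated with a small, dependency-free sketch. This is a simplified illustration, not LangChain's implementation: `recursive_split` is a hypothetical helper that tries the coarsest separator first and only falls back to finer separators for pieces that are still too long.

```python
def recursive_split(text, separators, chunk_size):
    """Simplified recursive splitter (illustration only, not LangChain's code)."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators or separators[0] == "":
        # No usable separator left: fall back to fixed-width slices.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate  # keep packing pieces into the current chunk
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too long: retry with the finer separators.
            chunks.extend(recursive_split(piece, finer, chunk_size))
    if current:
        chunks.append(current)
    return chunks


demo = "aaaa bbbb\n\ncccc dddd eeee"
print(recursive_split(demo, ["\n\n", " ", ""], chunk_size=9))
# ['aaaa bbbb', 'cccc dddd', 'eeee']
```

Paragraph breaks are preferred as split points, so a chunk only crosses a finer boundary (a space, or a raw character offset) when a paragraph does not fit on its own.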

In [13]:
sample_chunks = recursive_chunker(texts)
for c in sample_chunks:
    print(len(c))



902
969
510
18
872
895
5
966
726
675
437
15
889
321
9
809
637
681
37
988
207
344
672
982
25
33
833
888
438
14
801
925
997
1008
881
950
997
12
817
534
149
910
1001
104
856
405
24
899
894
960
183
761
381
844
178
953
485
971
618
7
823
775
478
671
22
621
402
9
979
67
54
977
915
643
939
931
477
274
768
432
944
146
394
830
327
100
1011
423
776
6
901
524
21
1014
848
731
871
448
920
766
884
906
529
1012
297
535
689
781
18
816
308
15
361
936
955
596
1004
1005
705
590
848
528
572
14
882
182
855
685
373
926
769
783
696
728
728
877
819
8
823
368
529
29
827
191
233
824
983
658
599
544
567
574
9
773
919
880
506
949
55
117
860
538
970
420
718
471
924
18
920
831
15
918
482
501
754
718
397
9
799
393
674
981
921
1003
455
11
929
286
21
900
522
15
976
537
8
966
929
760
804
378
15
768
915
429
606
976
946
678
633
994
50
18
845
503
6
951
516
349
74
949
831
9
959
881
572
9
960
141
971
740
952
837
207
18
610
422
437
293
990
192
297
715
855
956
496
993
407
936
748
24
1011
619
112
11
904
779
704
318
934
17
958
5

In [14]:
def load_chroma(chunks, collection_name, embedding_function):
    chroma_client = chromadb.Client()
    chroma_collection = chroma_client.create_collection(name=collection_name, embedding_function=embedding_function)

    ids = [str(i) for i in range(len(chunks))]

    chroma_collection.add(ids=ids, documents=chunks)

    return chroma_collection

In [15]:
def word_wrap(string, n_chars=72):
    # Wrap a string at the last space before n_chars (existing newlines are not treated as line breaks)
    if len(string) < n_chars:
        return string
    else:
        return string[:n_chars].rsplit(' ', 1)[0] + '\n' + word_wrap(string[len(string[:n_chars].rsplit(' ', 1)[0])+1:], n_chars)
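
Because `word_wrap` splits on spaces only, newlines already present in the text are counted as ordinary characters, which is why some wrapped outputs in this notebook break in the middle of a numbered item. A possible alternative, sketched here with the standard library's `textwrap` module, wraps each existing line separately. `wrap_preserving_newlines` is a hypothetical name and is not used elsewhere in the notebook.

```python
import textwrap


def wrap_preserving_newlines(text, width=72):
    """Wrap each existing line separately so original line breaks survive."""
    wrapped_lines = []
    for line in text.splitlines():
        # textwrap.fill returns "" for an empty line, which keeps blank lines.
        wrapped_lines.append(textwrap.fill(line, width=width))
    return "\n".join(wrapped_lines)


sample = "1. first question?\n2. second question that is long enough to wrap"
print(wrap_preserving_newlines(sample, width=30))
```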

In [16]:
collection_name = 'medical-data'
embedding_function = SentenceTransformerEmbeddingFunction()
chroma_collection = load_chroma(sample_chunks, collection_name, embedding_function)

In [17]:
#chroma_cliet.delete_collection(name=collection_name)
chroma_collection.count()

315

In [18]:
similar_docs = chroma_collection.query(query_texts = "dizziness, tiredness, vomiting and confusion",
                        n_results=5, include=['documents', 'embeddings','metadatas'])

In [19]:
retrieved_documents = similar_docs['documents'][0]

for doc in retrieved_documents:
    print(word_wrap(doc))
    print('')

pulmonary edema ( fluid in the lungs ) symptoms similar to bronchitis
persistent dry cough fever shortness of breath even when
restingcerebral edema ( swelling of the brain ) headache that does not
respond to analgesics unsteady gait gradual loss of consciousness
increased nausea and vomiting

altitude sickness, the mildest form being acute mountain sickness ( ams
), is the harmful effect of high altitude, caused by rapid exposure to
low amounts of oxygen at high elevation. people can respond to high
altitude in different ways. symptoms may include headaches, vomiting,
tiredness, confusion, trouble sleeping, and dizziness. acute mountain
sickness can progress to high - altitude pulmonary edema ( hape ) with
associated shortness of breath or high - altitude cerebral edema ( hace
) with associated confusion. chronic mountain sickness may occur after
long - term exposure to high altitude. altitude sickness typically
occurs only above 2, 500 metres ( 8, 000 ft ), though some are affected
a

# Analyse Retrieved Documents

Take a close look at the retrieved documents. Although the query symptoms match "Altitude Sickness" most closely, the top result is "Pulmonary Edema". Let's look at the similarity scores to see why this happens.

In [20]:
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np

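# Note: SentenceTransformerEmbeddingFunction() above was created without a model name,
# so the collection may embed with a different model than this one. Treat the scores
# below as an approximation of the retrieval ranking.
model = SentenceTransformer('paraphrase-MiniLM-L6-v2')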

def get_text_embeddings(texts):
    embeddings = model.encode(texts)
    return embeddings

def compute_cosine_similarity(embedding_a, embedding_b):
    similarity = cosine_similarity([embedding_a], [embedding_b])[0][0]
    return similarity

query = "dizziness, tiredness, vomiting and confusion"
text_1 = retrieved_documents[0] # pulmonary edema
text_2 = retrieved_documents[1] # altitude sickness

embeddings = get_text_embeddings([query, text_1, text_2])

similarity_1 = compute_cosine_similarity(embeddings[0], embeddings[1])
similarity_2 = compute_cosine_similarity(embeddings[0], embeddings[2])

print(f"Cosine Similarity : {similarity_1} for pulmonary edema")
print(f"Cosine Similarity : {similarity_2} for altitude sickness")


Cosine Similarity : 0.5803083181381226 for pulmonary edema
Cosine Similarity : 0.5016669631004333 for altitude sickness
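
For reference, the score printed above is the cosine of the angle between two embedding vectors. The sketch below computes it in plain Python on made-up 3-D vectors (the values are purely illustrative) and also shows that, for unit-length vectors, squared Euclidean distance equals `2 - 2 * cosine`. That matters if the vector store ranks by a distance rather than by cosine similarity: assuming the embeddings are normalized, both orderings agree.

```python
import math


def cosine(a, b):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Hypothetical toy vectors standing in for real embeddings.
q_vec = normalize([1.0, 2.0, 0.5])
doc_a = normalize([0.9, 1.8, 0.7])
doc_b = normalize([2.0, 0.1, 1.0])

for name, doc in [("doc_a", doc_a), ("doc_b", doc_b)]:
    cos = cosine(q_vec, doc)
    # For unit vectors: ||a - b||^2 = 2 - 2 * cos(a, b)
    print(name, round(cos, 4), round(squared_l2(q_vec, doc), 4), round(2 - 2 * cos, 4))
```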


# Impact of Length of Chunk

The similarity score for 'altitude sickness' is lower than the score for 'pulmonary edema' (about 0.50 vs. 0.58), which would explain why it is not the top match. What could be causing the lower score? Does the length of the text play a role?

In [21]:
print(len(text_1), len(text_2))
# Truncate text_2 to a length comparable to text_1, keeping the query terms inside the truncated text.
text_21 = text_2[:335]

embeddings = get_text_embeddings([query, text_1, text_21])

similarity_1 = compute_cosine_similarity(embeddings[0], embeddings[1])
similarity_2 = compute_cosine_similarity(embeddings[0], embeddings[2])

print(f"Cosine Similarity : {similarity_1} for pulmonary edema")
print(f"Cosine Similarity : {similarity_2} for altitude sickness")

293 952
Cosine Similarity : 0.5803083181381226 for pulmonary edema
Cosine Similarity : 0.5235142707824707 for altitude sickness
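
One plausible reason length matters: MiniLM-style sentence-transformers models build the sentence embedding by mean-pooling token embeddings, so tokens unrelated to the query pull the pooled vector away from it. The toy sketch below uses hand-made 2-D "token vectors" (an assumption for illustration; real token embeddings are contextual and high-dimensional) to show the dilution effect.

```python
import math


def mean_pool(token_vectors):
    """Average the token vectors dimension by dimension."""
    dim = len(token_vectors[0])
    return [sum(v[d] for v in token_vectors) / len(token_vectors) for d in range(dim)]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


symptom = [1.0, 0.0]  # hypothetical direction for query-related tokens
other = [0.0, 1.0]    # hypothetical direction for unrelated tokens
query_vec = mean_pool([symptom] * 4)

for n_other in (0, 4, 12):
    chunk_vec = mean_pool([symptom] * 4 + [other] * n_other)
    print(n_other, round(cosine(query_vec, chunk_vec), 3))
# 0 1.0
# 4 0.707
# 12 0.316
```

The same four query-related tokens score lower as more unrelated tokens share the chunk, which is consistent with the small gain seen after truncating `text_2`.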


# Lost in the Middle

The similarity score increased only slightly, even after trimming the texts to a similar length. Does the placement of information affect the embedding, and thus the similarity?

In [22]:
# Slice text_2 so that the query-related words appear at a similar position as in text_1.
text_21 = text_2[240:500]

embeddings = get_text_embeddings([query, text_1, text_21])

similarity_1 = compute_cosine_similarity(embeddings[0], embeddings[1])
similarity_2 = compute_cosine_similarity(embeddings[0], embeddings[2])

print(f"Cosine Similarity : {similarity_1} for pulmonary edema")
print(f"Cosine Similarity : {similarity_2} for altitude sickness")

Cosine Similarity : 0.5803083181381226 for pulmonary edema
Cosine Similarity : 0.6456915736198425 for altitude sickness


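**In this example, when the information matching the query appears near the start of the chunk, the chunk receives a higher similarity score than a chunk with the same information placed elsewhere.** Note that the slice is also shorter and contains less unrelated text, so the effects of position and length are not cleanly separated here.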

# Handling length by merging chunks

In [23]:
def merge_chunk_texts(texts):
    """Chunk texts to ensure that all the chunks are relatively of the same length for better embeddings."""
    character_splitter = RecursiveCharacterTextSplitter(
        separators=["\n\n", "\n", ". ", " ", ""],
        chunk_size=1000,
        chunk_overlap=0
    )
    character_split_texts = character_splitter.split_text('\n\n'.join(texts))
    character_split_texts_new = []

    i = 0
    max_ij = len(character_split_texts)

    while i < max_ij:
        current_chunk = character_split_texts[i]
        i += 1

        while i < max_ij and len(current_chunk + character_split_texts[i]) < 1000:
            current_chunk += character_split_texts[i]
            i += 1

        character_split_texts_new.append(current_chunk)

    token_splitter = SentenceTransformersTokenTextSplitter(chunk_overlap=0, tokens_per_chunk=256)

    token_split_texts = []
    for text in character_split_texts_new:
        token_split_texts += token_splitter.split_text(text)

    return token_split_texts
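
The greedy merge at the heart of this function can be shown in isolation. The sketch below is a variant, not a copy: it joins pieces with an explicit separator (the function above concatenates them directly, so words at a boundary can end up fused). `greedy_merge` and its `sep` parameter are hypothetical names used only for this illustration.

```python
def greedy_merge(chunks, max_len, sep=" "):
    """Merge consecutive chunks while the merged text stays within max_len."""
    merged, current = [], ""
    for chunk in chunks:
        candidate = chunk if not current else current + sep + chunk
        if current and len(candidate) > max_len:
            merged.append(current)  # current chunk is full, start a new one
            current = chunk
        else:
            current = candidate
    if current:
        merged.append(current)
    return merged


pieces = ["a" * 900, "b" * 18, "c" * 870, "d" * 5, "e" * 100]
print([len(m) for m in greedy_merge(pieces, max_len=1000)])
# [919, 977]
```

Very short fragments (section titles, stray words) are absorbed into a neighbouring chunk instead of being embedded on their own.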


In [24]:
eql_chunks = merge_chunk_texts(texts)
for c in eql_chunks:
    print(len(c))



902
969
529
872
901
966
726
675
453
889
331
809
637
719
988
552
672
982
25
866
888
453
801
925
997
1008
881
950
1010
817
684
910
1001
961
429
899
894
960
945
381
844
178
953
485
971
626
823
775
478
694
621
412
979
122
977
915
643
939
931
752
768
432
944
541
830
428
1011
423
783
901
546
1014
848
731
871
448
920
766
884
906
529
1012
833
689
800
816
685
936
955
596
1004
1005
705
590
848
528
587
882
182
855
685
373
926
769
783
696
728
728
877
828
823
928
827
425
824
983
658
599
544
567
584
773
919
880
506
949
173
860
538
970
420
718
471
943
920
847
918
984
754
718
406
799
393
674
981
921
1003
467
929
308
900
538
976
546
966
929
760
804
394
768
915
429
606
976
946
678
633
994
914
510
951
940
949
841
959
881
582
960
141
971
740
952
837
836
860
293
990
490
715
855
956
496
993
407
936
773
1011
744
904
779
704
318
952
958
521
623
895
785
985
790
978
792
521
859
476
867
396
843
803
591
637
962
908
865
711
960
996
507
837
899
787
933
413
1001
926
965
598
514
998
349
924
546
710
986
643


To optimize both retrieval accuracy and generation quality in a Retrieval-Augmented Generation (RAG) system, we balance chunk size against contextual linking:

1. **For Retrieval:** Smaller text chunks tend to improve retrieval relevance. In the experiments above, a shorter, more focused segment scored higher against the query, which helps the retriever fetch precise and relevant information.

2. **For Generation:** Large language models (LLMs) benefit from more extensive context. To support this, we add the indices of adjacent (previous and next) chunks to each chunk's metadata. This linking allows neighboring chunks to be returned alongside the primary result, enriching the context provided to the generator without requiring additional embeddings.

3. **Customizable Neighbor Retrieval:** The number of neighboring chunks retrieved can be customized based on the data domain and the retrieval needs.

This approach aims to keep retrieval precise while giving the generator enough surrounding context.

# Metadata for Sentence Window Retrieval

In [36]:
def load_chroma_with_metadata(chunks, collection_name, embedding_function, n=1):
    """
    Load chunks into a Chroma collection with metadata that includes up to `n` nearest neighbors for each chunk.

    Parameters:
    - chunks: List of text chunks to be added to the collection.
    - collection_name: Name of the Chroma collection.
    - embedding_function: Embedding function to be used.
    - n: Number of neighbors to include in the metadata (default is 1).
    """
    chroma_client = chromadb.Client()
    chroma_collection = chroma_client.create_collection(name=collection_name, embedding_function=embedding_function)

    ids = [str(i) for i in range(len(chunks))]
    metadatas = []

    for i in range(len(chunks)):
        neighbors = [j for j in range(max(0, i - n), min(len(chunks), i + n + 1)) if j != i]
        neighbor_metadata = "_".join(map(str, neighbors))
        metadatas.append({str(i): neighbor_metadata})

    chroma_collection.add(ids=ids, documents=chunks, metadatas=metadatas)

    return chroma_collection
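
At query time, the neighbor string stored in the metadata can be turned back into a context window. The sketch below shows one way to do that in plain Python, using the same underscore-joined format as `load_chroma_with_metadata`. `build_neighbor_metadata` and `expand_with_neighbors` are hypothetical helpers; the notebook itself does not yet use the metadata during retrieval.

```python
def build_neighbor_metadata(n_chunks, n=1):
    """Positions of up to n chunks on each side, joined by underscores."""
    metadata = []
    for i in range(n_chunks):
        neighbors = [j for j in range(max(0, i - n), min(n_chunks, i + n + 1)) if j != i]
        metadata.append("_".join(map(str, neighbors)))
    return metadata


def expand_with_neighbors(hit_index, chunks, neighbor_metadata):
    """Return the retrieved chunk plus its neighbors, in document order."""
    neighbor_ids = [int(j) for j in neighbor_metadata[hit_index].split("_") if j]
    window = sorted(set(neighbor_ids + [hit_index]))
    return [chunks[j] for j in window]


toy_chunks = ["intro", "symptoms", "causes", "treatment"]
meta = build_neighbor_metadata(len(toy_chunks), n=1)
print(meta)                                        # ['1', '0_2', '1_3', '2']
print(expand_with_neighbors(1, toy_chunks, meta))  # ['intro', 'symptoms', 'causes']
```

With a real collection, the neighbor ids would be fetched by id from the vector store and passed to the generator together with the primary hit.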


In [38]:
chroma_client = chromadb.Client()
collection_name = 'medical-data-metadata'
embedding_function = SentenceTransformerEmbeddingFunction()
chroma_collection_metadata = load_chroma_with_metadata(eql_chunks, collection_name, embedding_function)

In [39]:
#chroma_client.delete_collection(name=collection_name)
chroma_collection_metadata.count()

245

In [40]:
similar_metadata_docs = chroma_collection_metadata.query(query_texts = "dizziness, tiredness, vomiting and confusion",
                        n_results=5, include=['documents', 'embeddings','metadatas'])

In [41]:
retrieved_metadata_documents = similar_metadata_docs['documents'][0]

for doc in retrieved_metadata_documents:
    print(word_wrap(doc))
    print('')

pulmonary edema ( fluid in the lungs ) symptoms similar to bronchitis
persistent dry cough fever shortness of breath even when
restingcerebral edema ( swelling of the brain ) headache that does not
respond to analgesics unsteady gait gradual loss of consciousness
increased nausea and vomiting

altitude sickness, the mildest form being acute mountain sickness ( ams
), is the harmful effect of high altitude, caused by rapid exposure to
low amounts of oxygen at high elevation. people can respond to high
altitude in different ways. symptoms may include headaches, vomiting,
tiredness, confusion, trouble sleeping, and dizziness. acute mountain
sickness can progress to high - altitude pulmonary edema ( hape ) with
associated shortness of breath or high - altitude cerebral edema ( hace
) with associated confusion. chronic mountain sickness may occur after
long - term exposure to high altitude. altitude sickness typically
occurs only above 2, 500 metres ( 8, 000 ft ), though some are affected
a

In [32]:
# def project_embeddings(embeddings, umap_transform):
#     umap_embeddings = np.empty((len(embeddings),2))
#     for i, embedding in enumerate(tqdm(embeddings)):
#         umap_embeddings[i] = umap_transform.transform([embedding])
#     return umap_embeddings

In [33]:
import os
from groq import Groq

# Do not hard-code API keys in a notebook. Set GROQ_API_KEY in the environment
# (or in your notebook's secrets store) before running this cell.
groq_api_key = os.getenv('GROQ_API_KEY')
client = Groq(api_key=groq_api_key)

In [34]:
#chroma_collection.get(ids='184')

In [35]:
# import umap
# embeddings = chroma_collection.get(include=['embeddings'])['embeddings']
# umap_transform = umap.UMAP(random_state=0, transform_seed=0).fit(embeddings)
# projected_dataset_embeddings = project_embeddings(embeddings, umap_transform)

# Query Classification

In [42]:
model = "mixtral-8x7b-32768"
def classify_query(query):
    prompt = """
                You are a smart assistant. Classify whether the given information belongs to medical domain or not.
                Respond with "Yes" if it belongs to medical domain else "No".
    """
    messages = [
        {
            "role": "system",
            "content": prompt
        },
        {   "role": "user",
            "content": query}
    ]

    response = client.chat.completions.create(
        model=model,
        messages=messages,
        max_tokens=20
    )
    content = response.choices[0].message.content
    return content
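
Chat models do not always reply with the bare label: answers such as `Yes.` or `"Yes"` are common. A small normalisation step makes the decision more robust than an exact string comparison. `is_medical` is a hypothetical helper, sketched under the assumption that any reply starting with "yes" counts as a positive classification.

```python
def is_medical(label):
    """Interpret a free-text classifier reply as a boolean."""
    normalized = label.strip().strip('"\'.!').lower()
    return normalized.startswith("yes")


for reply in ["Yes", " yes.", '"Yes"', "No", "No, this is about cooking."]:
    print(repr(reply), is_medical(reply))
```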

# Generate Questions

In [43]:
model = "mixtral-8x7b-32768"
def generate_questions(topic_info):

    messages = [
        {
            "role": "system",
            "content": "You are a helpful medical researcher."
            "Suggest five questions that can be asked from the information provided. "
            "Suggest only short questions without compound sentences. Suggest a variety of questions that cover different aspects of the topic."
            "Make sure they are complete questions, and that they are related to the given information."
        },
        {
            "role": "user",
            "content": f"Topic Information : {topic_info}",
        }
    ]
    response = client.chat.completions.create(
        model= model,
        messages=messages,
        max_tokens=4096
    )
    response_message = response.choices[0].message

    return response_message.content

In [44]:
topic_info = """Anticholinergics (anticholinergic agents) are substances that block the action of the neurotransmitter called acetylcholine (ACh) at synapses in the central and peripheral nervous system.These agents inhibit the parasympathetic nervous system by selectively blocking the binding of ACh to its receptor in nerve cells. The nerve fibers of the parasympathetic system are responsible for the involuntary movement of smooth muscles present in the gastrointestinal tract, urinary tract, lungs, sweat glands, and many other parts of the body.In broad terms, anticholinergics are divided into two categories in accordance with their specific targets in the central and peripheral nervous system and at the neuromuscular junction: antimuscarinic agents, and antinicotinic agents (ganglionic blockers, neuromuscular blockers).The term "anticholinergic" is typically used to refer to antimuscarinics which competitively inhibit the binding of ACh to muscarinic acetylcholine receptors; such agents do not antagonize the binding at nicotinic acetylcholine receptors at the neuromuscular junction, although the term is sometimes used to refer to agents which do so. 
Medical uses Anticholinergic drugs are used to treat a variety of conditions: Dizziness (including vertigo and motion sickness-related symptoms) Extrapyramidal symptoms, a potential side-effect of antipsychotic medications Gastrointestinal disorders (e.g., peptic ulcers, diarrhea, pylorospasm, diverticulitis, ulcerative colitis, nausea, and vomiting) Genitourinary disorders (e.g., cystitis, urethritis, and prostatitis) Insomnia, although usually only on a short-term basis Respiratory disorders (e.g., asthma, chronic bronchitis, and chronic obstructive pulmonary disease [COPD]) Sinus bradycardia due to a hypersensitive vagus nerve Organophosphate based nerve agent poisoning, such as VX, sarin, tabun, and soman (atropine is favoured in conjunction with an oxime, usually pralidoxime)Anticholinergics generally have antisialagogue effects (decreasing saliva production), and most produce some level of sedation, both being advantageous in surgical procedures.Until the beginning of the 20th century anticholinergic drugs were widely used to treat psychiatric disorders. Physiological effects Delirium (often with hallucinations and delusions indistinguishable from reality) Ocular symptoms (from eye drops): mydriasis, pupil dilation, and acute angle-closure glaucoma in those with shallow anterior chamber Anhidrosis, dry mouth, dry skin Fever Constipation Tachycardia Urinary retention Cutaneous vasodilationClinically the most significant feature is delirium, particularly in the elderly, who are most likely to be affected by the toxidrome. Side effects Long-term use may increase the risk of both cognitive and physical decline. It is unclear whether they affect the risk of death generally. However, in older adults they do appear to increase the risk of death.Possible effects of anticholinergics include: Possible effects in the central nervous system resemble those associated with delirium, and may include: Older patients are at a higher risk of experiencing CNS side effects. 
Toxicity An acute anticholinergic syndrome is reversible and subsides once all of the causative agents have been excreted. Reversible acetylcholinesterase inhibitor agents such as physostigmine can be used as an antidote in life-threatening cases. Wider use is discouraged due to the significant side effects related to cholinergic excess including seizures, muscle weakness, bradycardia, bronchoconstriction, lacrimation, salivation, bronchorrhea, vomiting, and diarrhea. Even in documented cases of anticholinergic toxicity, seizures have been reported after the rapid administration of physostigmine. Asystole has occurred after physostigmine administration for tricyclic antidepressant overdose, so a conduction delay (QRS > 0.10 second) or suggestion of tricyclic antidepressant ingestion is generally considered a contraindication to physostigmine administration. Pharmacology Anticholinergics are classified according to the receptors that are affected: Antimuscarinic agents operate on the muscarinic acetylcholine receptors. The majority of anticholinergic drugs are antimuscarinics. Antinicotinic agents operate on the nicotinic acetylcholine receptors. The majority of these are non-depolarising skeletal muscle relaxants for surgical use that are structurally related to curare. Several are depolarizing agents. Examples Examples of common anticholinergics: Plants of the family Solanaceae contain various anticholinergic tropane alkaloids, such as scopolamine, atropine, and hyoscyamine. Physostigmine is one of only a few drugs that can be used as an antidote for anticholinergic poisoning. Nicotine also counteracts anticholinergics by activating nicotinic acetylcholine receptors. Caffeine (although an adenosine receptor antagonist) can counteract the anticholinergic symptoms by reducing sedation and increasing acetylcholine activity, thereby causing alertness and arousal. 
Psychoactive uses When a significant amount of an anticholinergic is taken into the body, a toxic reaction known as acute anticholinergic syndrome may result. This may happen accidentally or intentionally as a consequence of either recreational or entheogenic drug use, though many users find the side effects to be exceedingly unpleasant and not worth the recreational effects they experience. In the context of recreational use, anticholinergics are often called deliriants. Plant sources The most common plants containing anticholinergic alkaloids (including atropine, scopolamine, and hyoscyamine among others) are: Atropa belladonna (deadly nightshade) Brugmansia species Datura species Garrya species Hyoscyamus niger (henbane) Mandragora officinarum (mandrake) Use as a deterrent Several narcotic and opiate-containing drug preparations, such as those containing hydrocodone and codeine are combined with an anticholinergic agent to deter intentional misuse. Examples include Hydromet/Hycodan (hydrocodone/homatropine), Lomotil (diphenoxylate/atropine) and Tussionex (hydrocodone polistirex/chlorpheniramine). However, it is noted that opioid/antihistamine combinations are used clinically for their synergistic effect in the management of pain and maintenance of dissociative anesthesia (sedation) in such preparations as Meprozine (meperidine/promethazine) and Diconal (dipipanone/cyclizine), which act as strong anticholinergic agents. == References ==
"""

In [45]:
print(word_wrap(topic_info))

Anticholinergics (anticholinergic agents) are substances that block the
action of the neurotransmitter called acetylcholine (ACh) at synapses
in the central and peripheral nervous system.These agents inhibit the
parasympathetic nervous system by selectively blocking the binding of
ACh to its receptor in nerve cells. The nerve fibers of the
parasympathetic system are responsible for the involuntary movement of
smooth muscles present in the gastrointestinal tract, urinary tract,
lungs, sweat glands, and many other parts of the body.In broad terms,
anticholinergics are divided into two categories in accordance with
their specific targets in the central and peripheral nervous system and
at the neuromuscular junction: antimuscarinic agents, and antinicotinic
agents (ganglionic blockers, neuromuscular blockers).The term
"anticholinergic" is typically used to refer to antimuscarinics which
competitively inhibit the binding of ACh to muscarinic acetylcholine
receptors; such agents do not antag

In [46]:
print(word_wrap(generate_questions(topic_info)))

1. What is the primary neurotransmitter that anticholinergics block in
the nervous system?
2. How do anticholinergics influence the
parasympathetic nervous system?
3. What are the two main categories of
anticholinergics based on their specific targets?
4. In which medical
conditions are anticholinergics commonly used to treat?
5. What is the
primary side effect of anticholinergics that is clinically significant,
particularly in the elderly?
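
If the generated questions are to be reused programmatically (for example as extra retrieval queries, which is an assumption since the notebook only prints them), the numbered output needs to be split back into individual strings. The sketch below is one way to do that; `parse_numbered_questions` is a hypothetical helper, and it would mis-split a question that itself contains a pattern like ` 2. `.

```python
import re


def parse_numbered_questions(llm_output):
    """Split '1. ... 2. ...' style output into a list of question strings."""
    # Collapse all whitespace so line breaks inside a question do not matter.
    flat = " ".join(llm_output.split())
    # Split at each "<number>. " marker that starts an item.
    parts = re.split(r"(?:^|\s)\d+\.\s+", flat)
    return [p.strip() for p in parts if p.strip()]


sample = """1. What is blocked in
the nervous system?
2. How do they influence the
parasympathetic system?"""
print(parse_numbered_questions(sample))
# ['What is blocked in the nervous system?', 'How do they influence the parasympathetic system?']
```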


# Testing Responses

In [47]:
model = "mixtral-8x7b-32768"
def generate_response(query,relevant_info):
    relevant_info = '\n\n---------------------------------\n\n'.join(relevant_info)
    prompt = """
                You are a reliable Medical Advisor. Answer patients' queries
                using only the provided information. Follow these rules:

                - Use only the explicit information given.
                - If the answer is not found, respond with "Don't Know."
                - Do not make assumptions or add details not provided.
                - Be concise, clear, and professional.
                - When needed, add this note at the end of the response:
                  "For informational purposes only. Consult your local medical authority for advice."
                - If the query doesn't seem to be relating to medical information, respond with
                  "Sorry, I can only answer questions related to medical data."

                Your goal is to provide accurate, trustworthy, and empathetic responses.
    """
    messages = [
        {
            "role": "system",
            "content": prompt
        },
        {"role": "user", "content": query + "Relevant information : " + relevant_info}
    ]

    response = client.chat.completions.create(
        model=model,
        messages=messages,
    )
    content = response.choices[0].message.content
    return content

In [48]:
query = "dizziness, tiredness, vomiting and confusion"
#"What are some potential side effects of long-term use of anticholinergics, particularly in older adults?"

In [50]:
print(word_wrap(generate_response(query,retrieved_documents)))

The symptoms you described include dizziness, tiredness, vomiting, and
confusion. These symptoms are associated with both pulmonary edema
(fluid in the lungs) and cerebral edema (swelling of the brain).
Additionally, you mentioned a persistent dry cough, fever, and
shortness of breath even when resting, which are symptoms of pulmonary
edema. Headache that does not respond to analgesics, unsteady gait,
increased nausea and vomiting, and gradual loss of consciousness are
symptoms of cerebral edema.

Based on the provided information, it is
possible that you are experiencing symptoms of altitude sickness, which
can progress to high-altitude pulmonary edema (HAPE) or high-altitude
cerebral edema (HACE). Altitude sickness typically occurs above 2,500
meters (8,000 ft).

It is important to consult with a local medical
authority for advice regarding these symptoms.

For informational
purposes only. Consult your local medical authority for advice.

Sorry,
I can only answer questions related to

In [51]:
print(word_wrap(generate_response(query,retrieved_metadata_documents)))

The provided information relates to altitude sickness, which includes
pulmonary edema (fluid in the lungs) and cerebral edema (swelling of
the brain) as symptoms. Considering the patient's symptoms of
dizziness, tiredness, vomiting, and confusion, these could be
consistent with altitude sickness. The symptom of a persistent dry
cough could indicate the early stages of pulmonary edema, while the
shortness of breath even when resting might suggest a more progressed
stage. The headache that does not respond to analgesics, unsteady gait,
and gradual loss of consciousness could indicate cerebral edema.
However, it is essential to consult local medical authorities for
advice.

For informational purposes only. Consult your local medical
authority for advice.


In [None]:
query = "What are some potential side effects of long-term use of anticholinergics, particularly in older adults?"

In [None]:
if classify_query(query) == "Yes":
    results = chroma_collection_metadata.query(query_texts=query, n_results=5, include=['documents', 'embeddings'])
    retrieved_documents = results['documents'][0]
    print(word_wrap(generate_response(query,retrieved_documents)))
else:
    print(word_wrap(generate_response(query,[])))

Anticholinergics are a type of medication that can block the action of
acetylcholine, a neurotransmitter in the body. These drugs are used to
treat various conditions such as chronic obstructive pulmonary disease
(COPD), asthma, allergies, and overactive bladder. 

Long-term use of
anticholinergics, especially in older adults, can potentially lead to
several side effects, including:

1. Cognitive impairment: Prolonged
use of anticholinergics may increase the risk of cognitive decline,
including dementia and confusion.
2. Increased risk of falls:
Anticholinergics may cause dizziness, drowsiness, or unsteadiness,
leading to an increased risk of falls and fractures in older adults.
3.
Constipation: Anticholinergics can slow down gut motility, potentially
leading to constipation.
4. Dry mouth: Anticholinergics can reduce
saliva production, causing dry mouth and, in some cases, difficulty
swallowing.
5. Blurred vision: These medications can affect the muscles
in the eyes, leading to blurred