![JohnSnowLabs](https://nlp.johnsnowlabs.com/assets/images/logo.png)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp-workshop/blob/master/healthcare-nlp/36.0.Loading_Medical_and_Open_Source_LLMs.ipynb)

# Loading Medical and Open Source LLMs



`LLMLoader` loads and runs LLMs that have been converted to GGUF format. It gives access to John Snow Labs' licensed LLMs, which come in several sizes and are fine-tuned on medical text for specific tasks. It provides methods for loading models, generating text, retrieving metadata, and setting generation parameters such as input prefix and suffix, prompt caching, number of tokens to predict, sampling technique, temperature, and penalties. `LLMLoader` can be used from Python and Scala environments, via PySpark and Java.

## Colab Setup

In [None]:
# Install the johnsnowlabs library to access Spark-OCR and Spark-NLP for Healthcare, Finance, and Legal.
! pip install -q johnsnowlabs

In [None]:
from google.colab import files
print('Please Upload your John Snow Labs License using the button below')
license_keys = files.upload()

In [None]:
from johnsnowlabs import nlp, medical

# After uploading your license, run this to install all licensed Python wheels and pre-download the JARs for the Spark session JVM
nlp.settings.enforce_versions=True
nlp.install(refresh_install=True)

In [None]:
from johnsnowlabs import nlp, medical
import pandas as pd

# Automatically load license data and start a session with all jars user has access to
spark = nlp.start(
   # hardware_target="gpu" # if you have GPU
)

👌 Detected license file /content/spark_nlp_for_healthcare_spark_ocr_9596 (2).json
👌 Launched cpu optimized session with with: 🚀Spark-NLP==5.5.0, 💊Spark-Healthcare==5.5.0, running on ⚡ PySpark==3.4.0


In [None]:
spark

# Medical LLMs



| Model Name              | Description |
|-------------------------|-------------|
|[JSL_MedS_q16_v1](https://nlp.johnsnowlabs.com/2024/07/12/jsl_meds_q16_v1_en.html)      | Summarization and Q&A  |
|[JSL_MedS_q8_v1](https://nlp.johnsnowlabs.com/2024/07/12/jsl_meds_q8_v1_en.html)       | Summarization and Q&A |
|[JSL_MedS_q4_v1](https://nlp.johnsnowlabs.com/2024/07/12/jsl_meds_q4_v1_en.html)       | Summarization and Q&A  |
|[JSL_MedM_q16_v1](https://nlp.johnsnowlabs.com/2024/07/12/jsl_medm_q16_v1_en.html)      |  Summarization, Q&A, RAG, and Chat |
|[JSL_MedM_q8_v1](https://nlp.johnsnowlabs.com/2024/07/12/jsl_medm_q8_v1_en.html)       | Summarization, Q&A, RAG, and Chat |
|[JSL_MedM_q4_v1](https://nlp.johnsnowlabs.com/2024/07/12/jsl_medm_q4_v1_en.html)       | Summarization, Q&A, RAG, and Chat |
|[JSL_MedSNer_ZS_q16_v1](https://nlp.johnsnowlabs.com/2024/07/12/jsl_medsner_zs_q16_v1_en.html)| Extract and link medical named entities |
|[JSL_MedSNer_ZS_q8_v1](https://nlp.johnsnowlabs.com/2024/07/12/jsl_medsner_zs_q8_v1_en.html) | Extract and link medical named entities |
|[JSL_MedSNer_ZS_q4_v1](https://nlp.johnsnowlabs.com/2024/07/12/jsl_medsner_zs_q4_v1_en.html) | Extract and link medical named entities |


**We recommend the 8-bit quantized (q8) versions of the models, as the qualitative performance difference between the q16 and q8 versions is negligible.**

## JSL_MedS


This LLM is trained to perform summarization and Q&A based on a given context.

In [None]:
jsl_meds_llm = medical.LLMLoader(spark).pretrained("jsl_meds_q8_v1", "en", "clinical/models")

In [None]:
prompt = """
Based on the following text, what age group is most susceptible to breast cancer?

## Text:
The exact cause of breast cancer is unknown. However, several risk factors can increase your likelihood of developing breast cancer, such as:
- A personal or family history of breast cancer
- A genetic mutation, such as BRCA1 or BRCA2
- Exposure to radiation
- Age (most commonly occurring in women over 50)
- Early onset of menstruation or late menopause
- Obesity
- Hormonal factors, such as taking hormone replacement therapy
"""

response = jsl_meds_llm.generate(prompt)

In [None]:
response

' The age group most susceptible to breast cancer, as mentioned in the text, is women over the age of 50.'
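The prompt above puts the question first and the supporting passage under a `## Text:` header. A minimal sketch of building such prompts programmatically is shown here; `build_qa_prompt` is a hypothetical helper, not part of the library, and the exact layout is copied from the example rather than taken from any documented prompt specification.

```python
def build_qa_prompt(question: str, context: str) -> str:
    """Return a prompt with the question first and the context under '## Text:'."""
    return f"\n{question.strip()}\n\n## Text:\n{context.strip()}\n"


# Illustrative inputs taken from the example above (context shortened).
prompt = build_qa_prompt(
    "Based on the following text, what age group is most susceptible to breast cancer?",
    "- Age (most commonly occurring in women over 50)",
)
print(prompt)
```

The resulting string can be passed to `jsl_meds_llm.generate(prompt)` in the same way as the hand-written prompt.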

## JSL_MedM


This LLM is trained to perform Q&A, summarization, RAG, and chat.

In [None]:
jsl_medm_llm = medical.LLMLoader(spark).pretrained("jsl_medm_q8_v1", "en", "clinical/models")

In [None]:
prompt = """
A 23-year-old pregnant woman at 22 weeks gestation presents with burning upon urination. She states it started 1 day ago and has been worsening despite drinking more water and taking cranberry extract. She otherwise feels well and is followed by a doctor for her pregnancy. Her temperature is 97.7°F (36.5°C), blood pressure is 122/77 mmHg, pulse is 80/min, respirations are 19/min, and oxygen saturation is 98% on room air. Physical exam is notable for an absence of costovertebral angle tenderness and a gravid uterus.
Which of the following is the best treatment for this patient?
A: Ampicillin
B: Ceftriaxone
C: Ciprofloxacin
D: Doxycycline
E: Nitrofurantoin
"""

response = jsl_medm_llm.generate(prompt)

In [None]:
print(response)

The correct answer is E: Nitrofurantoin.

The patient is 22 weeks pregnant and has symptoms of burning upon urination, which is a common symptom of urinary tract infection (UTI). Nitrofurantoin is a first-line antibiotic for uncomplicated UTI in pregnant women.
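For multiple-choice prompts like this one, the free-text answer usually has to be reduced to an option letter before it can be scored. The sketch below assumes the response contains a phrase of the form "answer is X", as it does in the output above; `extract_choice` is a hypothetical helper, and other phrasings would need their own patterns.

```python
import re
from typing import Optional


def extract_choice(answer: str, options: str = "ABCDE") -> Optional[str]:
    """Return the option letter that follows 'answer is', or None if absent."""
    # The phrase is matched case-insensitively; the letter itself must be upper case
    # so that ordinary words such as "a" are not mistaken for option A.
    match = re.search(rf"(?i:answer is)\s*\(?([{options}])\b", answer)
    return match.group(1) if match else None


sample = "The correct answer is E: Nitrofurantoin."
print(extract_choice(sample))
```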


In [None]:
### Output:
"""The correct answer is E: Nitrofurantoin.

The patient is presenting with symptoms of urinary tract infection (UTI), which is common during pregnancy. Nitrofurantoin is a first-line antibiotic for uncomplicated UTI during pregnancy. It is safe and effective in treating UTI during pregnancy and has been used for many years without any adverse effects on the fetus.
"""

## JSL_MedSNer


This LLM is trained to extract and link entities in a document.
Users need to define an input schema, as shown in the example below.
In that schema, "drugs" is defined as a list, which tells the model that the document may mention multiple drugs and that it has to extract all of them.
Each drug has properties such as "name" and "reactions". Since a drug has only one name, "name" is a string; since there can be multiple reactions, "reactions" is a list.
Similarly, users can define any schema for any type of entity.
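Because the schema is plain JSON, it can be written as a Python dictionary and serialized into the prompt. The following is a minimal sketch under that assumption; `build_ner_prompt` is a hypothetical helper, and the `### Template:` / `### Text:` layout simply mirrors the example in this notebook.

```python
import json


def build_ner_prompt(template: dict, text: str) -> str:
    """Serialize the schema and place it before the text to be analyzed."""
    schema = json.dumps(template, indent=4)
    return f"\n### Template:\n{schema}\n### Text:\n{text.strip()}\n"


# A list value means "extract every occurrence"; a string value means "exactly one".
template = {"drugs": [{"name": "", "reactions": []}]}
prompt = build_ner_prompt(template, "I 've been on Arthrotec 50 for over 10 years on and off .")
print(prompt)
```

Defining the schema as a dictionary keeps it valid JSON by construction and makes it easy to add further entity types.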

In [None]:
jsl_medner_llm = medical.LLMLoader(spark).pretrained("jsl_medsner_zs_q16_v1", "en", "clinical/models")

In [None]:
prompt = """
### Template:
{
    "drugs": [
        {
            "name": "",
            "reactions": []
        }
    ]
}
### Text:
I feel a bit drowsy & have a little blurred vision , and some gastric problems .
I 've been on Arthrotec 50 for over 10 years on and off , only taking it when I needed it .
Due to my arthritis getting progressively worse , to the point where I am in tears with the agony.
Gp 's started me on 75 twice a day and I have to take it every day for the next month to see how I get on , here goes .
So far its been very good , pains almost gone , but I feel a bit weird , did n't have that when on 50.
"""

response = jsl_medner_llm.generate(prompt)

In [None]:
response

' {\n    "drugs": [\n        {\n            "name": "Arthrotec",\n            "reactions": [\n                "drowsiness",\n                "blurred vision",\n                "gastric problems"\n            ]\n        }\n    ]\n}\n {\n    "drugs": [\n        {\n            "name": "Arthrotec",\n            "reactions": [\n                "drowsiness",\n                "blurred vision",\n                "gastric problems"\n            ]\n        }\n    ]\n}\n {\n    "drugs": [\n        {\n            "name": "Arthrotec",\n            "reactions": [\n                "drowsiness",\n                "blurred vision",\n                "gastric problems"\n            ]\n        }\n    ]\n}\n {\n    "drugs": [\n        {\n            "name": "Arthrotec",\n            "reactions": [\n                "drowsiness",\n                "blurred vision",\n                "gastric problems"\n            ]\n        }\n    ]\n}\n {\n    "drugs": [\n        {\n            "name": "Arthrotec",\n          
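The raw response above repeats the same JSON object several times and ends mid-object, so `json.loads` on the whole string would fail. A minimal sketch, assuming only the first complete object is wanted, uses `json.JSONDecoder.raw_decode` to parse one object and ignore the rest; `first_json_object` is a hypothetical helper and the sample string is a shortened stand-in for the real response.

```python
import json


def first_json_object(raw: str) -> dict:
    """Parse the first complete JSON object in raw and ignore anything after it."""
    start = raw.index("{")  # raises ValueError if there is no object at all
    obj, _end = json.JSONDecoder().raw_decode(raw, start)
    return obj


# Shortened stand-in: one complete object followed by a truncated repeat.
raw = ' {"drugs": [{"name": "Arthrotec", "reactions": ["drowsiness"]}]}\n {"drugs": [{"name": "Arthro'
parsed = first_json_object(raw)
print(parsed["drugs"][0]["name"])
```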

In [None]:
print(response)

 {
    "drugs": [
        {
            "name": "Arthrotec",
            "reactions": [
                "drowsiness",
                "blurred vision",
                "gastric problems"
            ]
        }
    ]
}
 {
    "drugs": [
        {
            "name": "Arthrotec",
            "reactions": [
                "drowsiness",
                "blurred vision",
                "gastric problems"
            ]
        }
    ]
}
 {
    "drugs": [
        {
            "name": "Arthrotec",
            "reactions": [
                "drowsiness",
                "blurred vision",
                "gastric problems"
            ]
        }
    ]
}
 {
    "drugs": [
        {
            "name": "Arthrotec",
            "reactions": [
                "drowsiness",
                "blurred vision",
                "gastric problems"
            ]
        }
    ]
}
 {
    "drugs": [
        {
            "name": "Arthrotec",
            "reactions": [
                "drowsiness",
      

In [None]:
####### Model: JSL_MedSNer_ZS_q16_v1
### Output:
"""
{
    "drugs": [
        {
            "name": "Arthrotec",
            "reactions": [
                "drowsy",
                "blurred vision",
                "gastric problems"
            ]
        }
    ]
}
"""

# Open Source LLM

## mistral-7b

In [None]:
# ! pip install huggingface-hub

!huggingface-cli download TheBloke/Mistral-7B-v0.1-GGUF mistral-7b-v0.1.Q4_K_M.gguf --local-dir . --local-dir-use-symlinks False


Downloading 'mistral-7b-v0.1.Q4_K_M.gguf' to '.huggingface/download/mistral-7b-v0.1.Q4_K_M.gguf.ce6253d2e91adea0c35924b38411b0434fa18fcb90c52980ce68187dbcbbe40c.incomplete'
mistral-7b-v0.1.Q4_K_M.gguf: 100% 4.37G/4.37G [01:17<00:00, 56.4MB/s]
Download complete. Moving file to mistral-7b-v0.1.Q4_K_M.gguf
mistral-7b-v0.1.Q4_K_M.gguf


In [None]:
llm_loader = medical.LLMLoader(spark)

In [None]:
%%time

llm_loader\
    .setUseChatTemplate(True)\
    .setTemperature(0.0)\
    .setStopStrings(["<|im_end|>"])\
    .encodeModel(
        "./mistral-7b-v0.1.Q4_K_M.gguf",
        "./mistral-7b-v0.1.Q4_K_M/",
        metadata = {
          "licensed": "false"
        })

CPU times: user 135 ms, sys: 18.8 ms, total: 154 ms
Wall time: 24.3 s


In [None]:
!ls -l ./mistral-7b-v0.1.Q4_K_M/

total 4270088
-rw-r--r-- 1 root root 4372561920 Jul 22 11:19 gguf
-rw-r--r-- 1 root root        120 Jul 22 11:19 metadata.json


In [None]:
%%time
llm = llm_loader.load("./mistral-7b-v0.1.Q4_K_M")

CPU times: user 6.2 ms, sys: 2.33 ms, total: 8.53 ms
Wall time: 763 ms


In [None]:
%%time
prompt = "What is the indication for the drug Methadone?"
response = llm.generate(prompt)

CPU times: user 79.4 ms, sys: 8.88 ms, total: 88.3 ms
Wall time: 12.3 s


In [None]:
response

'Methadone is used to treat opioid addiction. It is also used to treat severe pain.\n'

# MedicalLLM Annotator (AutoGGUFModel)

MedicalLLM is designed to load and run large language models (LLMs) in GGUF format at scale. It targets clinical and healthcare applications and supports tasks such as medical entity extraction, summarization, Q&A, Retrieval Augmented Generation (RAG), and conversational AI. It integrates into Spark NLP pipelines as an annotator, with configurable batch size, prediction settings, and chat templates, and it can use a GPU when one is available. MedicalLLM is accessed through `AutoGGUFModel`.

## JSL_MedS_NER

This LLM is trained to extract and link entities in a document. Users need to define an input schema, as shown in the example below. In that schema, "drugs" is defined as a list, which tells the model that the document may mention multiple drugs and that it has to extract all of them. Each drug has properties such as "name" and "reactions". Since a drug has only one name, "name" is a string; since there can be multiple reactions, "reactions" is a list. Similarly, users can define any schema for any type of entity.

In [None]:
document_assembler = nlp.DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

medical_llm = (
    medical.AutoGGUFModel.pretrained("jsl_meds_ner_q4_v2", "en", "clinical/models")
    .setInputCols("document")
    .setOutputCol("completions")
    .setBatchSize(1)
    .setNPredict(100)
    .setUseChatTemplate(True)
    .setTemperature(0)
    # .setNGpuLayers(100)  # uncomment if you have a GPU
)


pipeline = nlp.Pipeline(
    stages = [
        document_assembler,
        medical_llm
])

jsl_meds_ner_q4_v2 download started this may take some time.
[OK!]


In [None]:
med_ner_prompt = """
### Template:
{
    "drugs": [
        {
            "name": "",
            "reactions": []
        }
    ]
}
### Text:
I feel a bit drowsy & have a little blurred vision , and some gastric problems .
I 've been on Arthrotec 50 for over 10 years on and off , only taking it when I needed it .
Due to my arthritis getting progressively worse , to the point where I am in tears with the agony.
Gp 's started me on 75 twice a day and I have to take it every day for the next month to see how I get on , here goes .
So far its been very good , pains almost gone , but I feel a bit weird , did n't have that when on 50.
"""

data = spark.createDataFrame([[med_ner_prompt]]).toDF("text")
data.show(truncate=100)

+----------------------------------------------------------------------------------------------------+
|                                                                                                text|
+----------------------------------------------------------------------------------------------------+
|\n### Template:\n{\n    "drugs": [\n        {\n            "name": "",\n            "reactions": ...|
+----------------------------------------------------------------------------------------------------+



In [None]:
%%time
results = pipeline.fit(data).transform(data).cache()
results.select("completions").show(truncate=False)

+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|completions                                                                                                                                                                                                                                                                                                                                                                        |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

In [None]:
print(results.select("completions").collect()[0].completions[0].result)


{
    "drugs": [
        {
            "name": "Arthrotec",
            "reactions": [
                "drowsy",
                "blurred vision",
                "gastric problems"
            ]
        }
    ]
}
</s> #### Template:
{"drugs": [{"name": "", "reaction": []}]}
#### Text:
The patient is a 65-year


## JSL_MedM

This LLM is trained to perform summarization, Q&A, RAG, and chat.

In [None]:
document_assembler = nlp.DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

medical_llm = (
    medical.AutoGGUFModel.pretrained("jsl_medm_q8_v1", "en", "clinical/models")
    .setInputCols("document")
    .setOutputCol("completions")
    .setBatchSize(1)
    .setNPredict(100)
    .setUseChatTemplate(True)
    .setTemperature(0)
    # .setNGpuLayers(100)  # uncomment if you have a GPU
)

pipeline = nlp.Pipeline(
    stages = [
        document_assembler,
        medical_llm
])

jsl_medm_q8_v1 download started this may take some time.
[OK!]


In [None]:
medm_prompt = """
summarize the following content.

 content:
 ---------------------------- INDICATIONS AND USAGE ---------------------------
 KISUNLA is an amyloid beta-directed antibody indicated for the
 treatment of Alzheimer’s disease. Treatment with KISUNLA should be
 initiated in patients with mild cognitive impairment or mild dementia
 stage of disease, the population in which treatment was initiated in the
 clinical trials. (1)
 ------------------------DOSAGE AND ADMINISTRATION-----------------------
 • Confirm the presence of amyloid beta pathology prior to initiating
 treatment. (2.1)
 • The recommended dosage of KISUNLA is 700 mg administered as
 an intravenous infusion over approximately 30 minutes every four
 weeks for the first three doses, followed by 1400 mg every four
 weeks. (2.2)
 • Consider stopping dosing with KISUNLA based on reduction of
 amyloid plaques to minimal levels on amyloid PET imaging. (2.2)
 • Obtain a recent baseline brain MRI prior to initiating treatment.
 (2.3, 5.1)
 • Obtain an MRI prior to the 2nd, 3rd, 4th, and 7th infusions. If
 radiographically observed ARIA occurs, treatment
 recommendations are based on type, severity, and presence of
 symptoms. (2.3, 5.1)
 • Dilution to a final concentration of 4 mg/mL to 10 mg/mL with 0.9%
 Sodium Chloride Injection, is required prior to administration. (2.4)
 ----------------------DOSAGE FORMS AND STRENGTHS---------------------
 Injection: 350 mg/20 mL (17.5 mg/mL) in a single-dose vial. (3)
 ------------------------------- CONTRAINDICATIONS ------------------------------
 KISUNLA is contraindicated in patients with known serious
 hypersensitivity to donanemab-azbt or to any of the excipients. (4, 5.2)
 ------------------------WARNINGS AND PRECAUTIONS-----------------------
 • Amyloid Related Imaging Abnormalities (ARIA): Enhanced clinical
 vigilance for ARIA is recommended during the first 24 weeks of
 treatment with KISUNLA. Risk of ARIA, including symptomatic
 ARIA, was increased in apolipoprotein E ε4 (ApoE ε4)
 homozygotes compared to heterozygotes and noncarriers. The risk
 of ARIA-E and ARIA-H is increased in KISUNLA-treated patients
 with pretreatment microhemorrhages and/or superficial siderosis. If
 a patient experiences symptoms suggestive of ARIA, clinical
 evaluation should be performed, including MRI scanning if
 indicated. (2.3, 5.1)
 • Infusion-Related Reactions: The infusion rate may be reduced, or
 the infusion may be discontinued, and appropriate therapy initiated
 as clinically indicated. Consider pre-treatment with antihistamines,
 acetaminophen, or corticosteroids prior to subsequent dosing. (5.3)
 -------------------------------ADVERSE REACTIONS------------------------------
 Most common adverse reactions (at least 10% and higher incidence
 compared to placebo): ARIA-E, ARIA-H microhemorrhage, ARIA-H
 superficial siderosis, and headache. (6.1)
"""

data = spark.createDataFrame([[medm_prompt]]).toDF("text")
data.show(truncate=100)

+----------------------------------------------------------------------------------------------------+
|                                                                                                text|
+----------------------------------------------------------------------------------------------------+
|\nsummarize the following content.\n\n content:\n ---------------------------- INDICATIONS AND US...|
+----------------------------------------------------------------------------------------------------+



In [None]:
%%time
results = pipeline.fit(data).transform(data).cache()
results.select("completions").show(truncate=False)

+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|completions                                                                                                                                                                                                                                                                                                                                                                                                                                           

In [None]:
print(results.select("completions").collect()[0].completions[0].result)

KISUNLA is an amyloid beta-directed antibody indicated for the treatment of Alzheimer's disease. It is recommended to initiate treatment in patients with mild cognitive impairment or mild dementia stage of disease. The recommended dosage is 700 mg administered as an intravenous infusion over approximately 30 minutes every four weeks for the first three doses, followed by 1400 mg every four weeks. Patients should have a recent baseline brain MRI prior to initiating treatment and obtain an MRI prior to the 2nd, 
