![JohnSnowLabs](https://nlp.johnsnowlabs.com/assets/images/logo.png)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp-workshop/blob/master/tutorials/Certification_Trainings/Healthcare/46.Loading_Medical_and_Open_Source_LLMs.ipynb)

# Loading Medical and Open Source LLMs



## Colab Setup

In [None]:
import json, os
from google.colab import files

if 'spark_jsl.json' not in os.listdir():
  license_keys = files.upload()
  os.rename(list(license_keys.keys())[0], 'spark_jsl.json')

with open('spark_jsl.json') as f:
    license_keys = json.load(f)

# Defining license key-value pairs as local variables
locals().update(license_keys)
os.environ.update(license_keys)

In [None]:
# Installing pyspark and spark-nlp
! pip install --upgrade -q pyspark==3.5.1 spark-nlp==$PUBLIC_VERSION

# Installing Spark NLP Healthcare
! pip install --upgrade -q spark-nlp-jsl==$JSL_VERSION  --extra-index-url https://pypi.johnsnowlabs.com/$SECRET

# Installing Spark NLP Display Library for visualization
#! pip install -q spark-nlp-display

In [3]:
import json
import os

import sparknlp
import sparknlp_jsl

from sparknlp.base import *
from sparknlp.annotator import *
from sparknlp_jsl.annotator import *

from pyspark.ml import Pipeline,PipelineModel
from pyspark.sql import SparkSession

import warnings
warnings.filterwarnings('ignore')

params = {
    "spark.driver.memory":"100G",
    "spark.kryoserializer.buffer.max":"2000M",
    "spark.driver.maxResultSize":"2000M",
}

spark = sparknlp_jsl.start(license_keys['SECRET'],
                           params=params,
                           #gpu=True # if you have a GPU
                           )

print("Spark NLP Version :", sparknlp.version())
print("Spark NLP_JSL Version :", sparknlp_jsl.version())

spark

Spark NLP Version : 6.1.3
Spark NLP_JSL Version : 6.1.1


# Medical LLMs

| **Model Name**             | **Quantization Options**   | **Description**   |
| -------------------------- | -------------------------- | ----------------- |
| JSL_MedM_v1                | [q4](https://nlp.johnsnowlabs.com/2024/07/12/jsl_medm_q4_v1_en.html), [q8](https://nlp.johnsnowlabs.com/2024/07/12/jsl_medm_q8_v1_en.html), [q16](https://nlp.johnsnowlabs.com/2024/07/12/jsl_medm_q16_v1_en.html) | Summarization, Q&A, RAG, and Chat      |
| JSL_MedM_v2                | [q4](https://nlp.johnsnowlabs.com/2024/10/06/jsl_medm_q4_v2_en.html), [q8](https://nlp.johnsnowlabs.com/2024/10/06/jsl_medm_q8_v2_en.html), [q16](https://nlp.johnsnowlabs.com/2024/10/08/jsl_medm_q16_v2_en.html) | Summarization, Q&A, RAG, and Chat      |
| JSL_MedM_v3                | [q4](https://nlp.johnsnowlabs.com/2024/10/06/jsl_medm_q4_v3_en.html), [q8](https://nlp.johnsnowlabs.com/2024/10/08/jsl_medm_q8_v3_en.html), [q16](https://nlp.johnsnowlabs.com/2024/10/23/jsl_medm_q16_v3_en.html) | Summarization, Q&A, RAG, and Chat      |
| JSL_MedS_v1                | [q4](https://nlp.johnsnowlabs.com/2024/10/05/jsl_meds_q4_v1_en.html), [q8](https://nlp.johnsnowlabs.com/2024/10/05/jsl_meds_q8_v1_en.html), [q16](https://nlp.johnsnowlabs.com/2024/10/05/jsl_meds_q16_v1_en.html) | Summarization, Q&A, RAG |
| JSL_MedS_v2                | [q4](https://nlp.johnsnowlabs.com/2024/10/05/jsl_meds_q4_v2_en.html), [q8](https://nlp.johnsnowlabs.com/2024/10/05/jsl_meds_q8_v2_en.html), [q16](https://nlp.johnsnowlabs.com/2024/10/05/jsl_meds_q16_v2_en.html) | Summarization, Q&A, RAG |
| JSL_MedS_v3                | [q4](https://nlp.johnsnowlabs.com/2024/10/05/jsl_meds_q4_v3_en.html), [q8](https://nlp.johnsnowlabs.com/2024/10/05/jsl_meds_q8_v3_en.html), [q16](https://nlp.johnsnowlabs.com/2024/10/05/jsl_meds_q16_v3_en.html) | Summarization, Q&A, RAG |
| JSL_MedS_8B_v4             | [q4](https://nlp.johnsnowlabs.com/2025/08/05/jsl_meds_8b_q4_v4_en.html), [q8](https://nlp.johnsnowlabs.com/2025/08/05/jsl_meds_8b_q8_v4_en.html), [q16](https://nlp.johnsnowlabs.com/2025/08/05/jsl_meds_8b_q16_v4_en.html) | Summarization, Q&A, RAG |
| JSL_MedS_4B_v4             | [q4](https://nlp.johnsnowlabs.com/2025/08/05/jsl_meds_4b_q4_v4_en.html), [q8](https://nlp.johnsnowlabs.com/2025/08/05/jsl_meds_4b_q8_v4_en.html), [q16](https://nlp.johnsnowlabs.com/2025/08/05/jsl_meds_4b_q16_v4_en.html) | Summarization, Q&A, RAG |
| JSL_MedS_4B_v5             | [q4](https://nlp.johnsnowlabs.com/2025/08/05/jsl_meds_4b_q4_v5_en.html), [q8](), [q16](https://nlp.johnsnowlabs.com/2025/08/05/jsl_meds_4b_q16_v5_en.html) | Summarization, Q&A, RAG |
| JSL_MedS_NER_ZS_v1         | [q4](https://nlp.johnsnowlabs.com/2024/10/04/jsl_meds_ner_zs_q4_v1_en.html), [q8](https://nlp.johnsnowlabs.com/2024/10/04/jsl_meds_ner_zs_q8_v1_en.html), [q16](https://nlp.johnsnowlabs.com/2024/10/04/jsl_meds_ner_zs_q16_v1_en.html) | Extract and link medical named entities |
| JSL_MedS_NER_v2            | [q4](https://nlp.johnsnowlabs.com/2024/10/01/jsl_meds_ner_q4_v2_en.html), [q8](https://nlp.johnsnowlabs.com/2024/10/04/jsl_meds_ner_q8_v2_en.html), [q16](https://nlp.johnsnowlabs.com/2024/10/04/jsl_meds_ner_q16_v2_en.html) | Extract and link medical named entities |
| JSL_MedS_NER_v3            | [q4](https://nlp.johnsnowlabs.com/2025/06/20/jsl_meds_ner_q4_v3_en.html), [q8](https://nlp.johnsnowlabs.com/2025/06/20/jsl_meds_ner_q8_v3_en.html), [q16](https://nlp.johnsnowlabs.com/2025/06/20/jsl_meds_ner_q16_v3_en.html) | Extract and link medical named entities |
| JSL_MedS_NER_v4            | [q4](https://nlp.johnsnowlabs.com/2025/07/01/jsl_meds_ner_q4_v4_en.html), [q8](https://nlp.johnsnowlabs.com/2025/07/01/jsl_meds_ner_q8_v4_en.html) | Extract and link medical named entities |
| JSL_MedS_NER_OpenVINO_v4   | [q4](https://nlp.johnsnowlabs.com/2025/06/30/jsl_meds_ner_openvino_q4_v4_en.html), [q8](https://nlp.johnsnowlabs.com/2025/07/01/jsl_meds_ner_openvino_q8_v4_en.html), [q16](https://nlp.johnsnowlabs.com/2025/07/01/jsl_meds_ner_openvino_q16_v4_en.html) | Extract and link medical named entities |
| JSL_MedS_RAG_v1            | [q4](https://nlp.johnsnowlabs.com/2024/10/05/jsl_meds_rag_q4_v1_en.html), [q8](https://nlp.johnsnowlabs.com/2024/10/05/jsl_meds_rag_q8_v1_en.html), [q16](https://nlp.johnsnowlabs.com/2024/10/05/jsl_meds_rag_q16_v1_en.html) | LLM component of Retrieval Augmented Generation (RAG) |
| JSL_MedS_Text2SOAP_v1      | [base](https://nlp.johnsnowlabs.com/2025/04/09/jsl_meds_text2soap_v1_en.html) | Generate structured SOAP (Subjective, Objective, Assessment, Plan) summaries |
| JSL_MedS_Text2SQL_1B_v1    | [q16](https://nlp.johnsnowlabs.com/2025/08/04/jsl_meds_text2sql_1b_q16_v1_en.html) | Transform natural language queries into SQL |
| JSL_MedS_VLM_3B_v1         | [q4](https://nlp.johnsnowlabs.com/2025/08/08/jsl_meds_vlm_3b_q4_v1_en.html), [q8](https://nlp.johnsnowlabs.com/2025/08/08/jsl_meds_vlm_3b_q8_v1_en.html), [q16](https://nlp.johnsnowlabs.com/2025/08/08/jsl_meds_vlm_3b_q16_v1_en.html) | Extract and link structured medical named entities |
| JSL_MedS_NER_VLM_2B_v1         | [q4](https://nlp.johnsnowlabs.com/2025/08/08/jsl_meds_ner_vlm_2b_q4_v1_en.html), [q8](https://nlp.johnsnowlabs.com/2025/08/10/jsl_meds_ner_vlm_2b_q8_v1_en.html), [q16](https://nlp.johnsnowlabs.com/2025/08/10/jsl_meds_ner_vlm_2b_q16_v1_en.html) | Extract and link structured medical named entities |
| JSL_MedS_NER_VLM_2B_v2         | [q4](https://nlp.johnsnowlabs.com/2025/08/10/jsl_meds_ner_vlm_2b_q4_v2_en.html), [q8](https://nlp.johnsnowlabs.com/2025/08/10/jsl_meds_ner_vlm_2b_q8_v2_en.html), [q16](https://nlp.johnsnowlabs.com/2025/08/10/jsl_meds_ner_vlm_2b_q16_v2_en.html) | Extract and link structured medical named entities |



**We recommend the q8 (8-bit) quantized versions of the models, as the qualitative performance difference between the q16 and q8 versions is negligible.**


# Medical Small LLMs

| Model Name             | Disk Size | Model Size | Modality   | Available quantizations | GPU memory required | Tokens/sec | Max Context Window **\*** | Tasks                                                         |
|-------------------------|-----------|-------------|------------|-------------------------|---------------------|-----------|--------------------|---------------------------------------------------------------|
| JSL_MedM_v3            | 8.2G      | 14B         | text-only  | [q4](https://nlp.johnsnowlabs.com/2024/10/06/jsl_medm_q4_v3_en.html) | 24GB | 79  | 32,768  | Summarization, Q&A, RAG, and Chat |
|                         | 14G       | 14B         | text-only  | [q8](https://nlp.johnsnowlabs.com/2024/10/08/jsl_medm_q8_v3_en.html) | 24GB | 84  | 32,768  | Summarization, Q&A, RAG, and Chat |
|                         | 21.9G     | 14B         | text-only  | [q16](https://nlp.johnsnowlabs.com/2024/10/23/jsl_medm_q16_v3_en.html) | 24GB | 253 | 32,768  | Summarization, Q&A, RAG, and Chat |
| JSL_MedS_v3            | 2.2G      | 3.5B        | text-only  | [q4](https://nlp.johnsnowlabs.com/2024/10/05/jsl_meds_q4_v3_en.html) | 10GB | 28.5 | 131,072 | Summarization, Q&A, RAG |
|                         | 3.7G      | 3.5B        | text-only  | [q8](https://nlp.johnsnowlabs.com/2024/10/05/jsl_meds_q8_v3_en.html) | 10GB | 18.7 | 131,072 | Summarization, Q&A, RAG |
|                         | 5.6G      | 3.5B        | text-only  | [q16](https://nlp.johnsnowlabs.com/2024/10/05/jsl_meds_q16_v3_en.html) | 10GB | 50.2 | 131,072 | Summarization, Q&A, RAG |
| JSL_MedS_8B_v4         | 4.6G      | 8B          | text-only  | [q4](https://nlp.johnsnowlabs.com/2025/08/05/jsl_meds_8b_q4_v4_en.html) | 16GB | 83  | 32,768  | Summarization, Q&A, RAG |
|                         | 7.8G      | 8B          | text-only  | [q8](https://nlp.johnsnowlabs.com/2025/08/05/jsl_meds_8b_q8_v4_en.html) | 16GB | 84  | 32,768  | Summarization, Q&A, RAG |
|                         | 12.2G     | 8B          | text-only  | [q16](https://nlp.johnsnowlabs.com/2025/08/05/jsl_meds_8b_q16_v4_en.html) | 16GB | 272 | 32,768  | Summarization, Q&A, RAG |
| JSL_MedS_NER_v4        | 2.2G      | 3.5B        | text-only  | [q4](https://nlp.johnsnowlabs.com/2025/07/01/jsl_meds_ner_q4_v4_en.html) | 10GB | 28.5 | 131,072 | Extract and link medical named entities |
|                         | 3.7G      | 3.5B        | text-only  | [q8](https://nlp.johnsnowlabs.com/2025/07/01/jsl_meds_ner_q8_v4_en.html) | 10GB | 18.7 | 131,072 | Extract and link medical named entities |
| JSL_MedS_RAG_v1        | 2.2G      | 3B          | text-only  | [q4](https://nlp.johnsnowlabs.com/2024/10/05/jsl_meds_rag_q4_v1_en.html) | 10GB | 30  | 32,768  | LLM component of Retrieval Augmented Generation (RAG) |
|                         | 3.7G      | 3B          | text-only  | [q8](https://nlp.johnsnowlabs.com/2024/10/05/jsl_meds_rag_q8_v1_en.html) | 10GB | 20  | 32,768  | LLM component of Retrieval Augmented Generation (RAG) |
|                         | 5.6G      | 3B          | text-only  | [q16](https://nlp.johnsnowlabs.com/2024/10/05/jsl_meds_rag_q16_v1_en.html) | 10GB | 53  | 32,768  | LLM component of Retrieval Augmented Generation (RAG) |
| JSL_MedS_Text2SOAP_v1  | 2.2G      | 3B          | text-only  | [base](https://nlp.johnsnowlabs.com/2025/04/09/jsl_meds_text2soap_v1_en.html) | 10GB | 53  | 32,768  | Generate structured SOAP (Subjective, Objective, Assessment, Plan) summaries |
| JSL_MedS_VLM_3B_v1     | 2.5G      | 3B          | multimodal | [q4](https://nlp.johnsnowlabs.com/2025/08/08/jsl_meds_vlm_3b_q4_v1_en.html) | 10GB | 8   | 128,000 | Extract and link structured medical named entities |
|                         | 3.6G      | 3B          | multimodal | [q8](https://nlp.johnsnowlabs.com/2025/08/08/jsl_meds_vlm_3b_q8_v1_en.html) | 10GB | 11  | 128,000 | Extract and link structured medical named entities |
|                         | 5.6G      | 3B          | multimodal | [q16](https://nlp.johnsnowlabs.com/2025/08/08/jsl_meds_vlm_3b_q16_v1_en.html) | 10GB | 40.1| 128,000 | Extract and link structured medical named entities |
| JSL_MedS_NER_VLM_2B_v2 | 1.5G      | 2B          | multimodal | [q4](https://nlp.johnsnowlabs.com/2025/08/10/jsl_meds_ner_vlm_2b_q4_v2_en.html) | 10GB | 25.5| 32,768  | Extract and link structured medical named entities |
|                         | 2.1G      | 2B          | multimodal | [q8](https://nlp.johnsnowlabs.com/2025/08/10/jsl_meds_ner_vlm_2b_q8_v2_en.html) | 10GB | 13.7| 32,768  | Extract and link structured medical named entities |
|                         | 3.3G      | 2B          | multimodal | [q16](https://nlp.johnsnowlabs.com/2025/08/10/jsl_meds_ner_vlm_2b_q16_v2_en.html) | 10GB | 48.9| 32,768  | Extract and link structured medical named entities |

<br>

> **\*** A larger context window requires more GPU memory.
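
As a rough aid for reading the table above, the sketch below filters a few of its rows by a GPU memory budget. The rows are copied from the table (a subset only). Treating the "GPU memory required" column as a hard threshold is an assumption, and the footnote still applies: longer contexts need more memory than the listed figure. `MODELS` and `models_that_fit` are hypothetical names used only for illustration.

```python
# A subset of rows copied from the table above:
# (model name, quantization, GPU memory required in GB, tokens/sec)
MODELS = [
    ("JSL_MedM_v3", "q4", 24, 79.0),
    ("JSL_MedM_v3", "q8", 24, 84.0),
    ("JSL_MedM_v3", "q16", 24, 253.0),
    ("JSL_MedS_v3", "q4", 10, 28.5),
    ("JSL_MedS_v3", "q8", 10, 18.7),
    ("JSL_MedS_v3", "q16", 10, 50.2),
    ("JSL_MedS_8B_v4", "q4", 16, 83.0),
    ("JSL_MedS_8B_v4", "q8", 16, 84.0),
    ("JSL_MedS_8B_v4", "q16", 16, 272.0),
]


def models_that_fit(gpu_memory_gb, quantization="q8"):
    """Return model names whose listed GPU requirement fits the given budget.

    Assumption: the table's "GPU memory required" value is used as a simple
    threshold; it does not account for larger context windows.
    """
    return [
        name
        for name, quant, required_gb, _tokens_per_sec in MODELS
        if quant == quantization and required_gb <= gpu_memory_gb
    ]


print(models_that_fit(16))          # q8 models that fit a 16 GB GPU
print(models_that_fit(24, "q16"))   # q16 models that fit a 24 GB GPU
```

The default of `q8` follows the recommendation stated earlier in this notebook.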


# MedicalLLM Annotator

`MedicalLLM` loads and runs large language models (LLMs) in GGUF format inside Spark NLP pipelines. It targets clinical and healthcare tasks such as medical entity extraction and linking, summarization, Q&A, Retrieval Augmented Generation (RAG), and conversational AI. Batch size, prediction settings, and chat templates are configurable, and GPU acceleration is available for high-performance environments.

## JSL_MedS


In [4]:
document_assembler = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

medical_llm = MedicalLLM.pretrained("jsl_meds_4b_q16_v5", "en", "clinical/models")\
    .setInputCols("document")\
    .setOutputCol("completions")\
    .setBatchSize(1)\
    .setNPredict(100)\
    .setUseChatTemplate(True)\
    .setTemperature(0)

pipeline = Pipeline(stages=[
    document_assembler,
    medical_llm
])

jsl_meds_4b_q16_v5 download started this may take some time.
Approximate size to download 5.7 GB
[OK!]


**Close Ended Question Answering**

In [5]:
prompt = """
A 23-year-old pregnant woman at 22 weeks gestation presents with burning upon urination. She states it started 1 day ago and has been worsening despite drinking more water and taking cranberry extract. She otherwise feels well and is followed by a doctor for her pregnancy. Her temperature is 97.7°F (36.5°C), blood pressure is 122/77 mmHg, pulse is 80/min, respirations are 19/min, and oxygen saturation is 98% on room air. Physical exam is notable for an absence of costovertebral angle tenderness and a gravid uterus.
Which of the following is the best treatment for this patient?
A: Ampicillin
B: Ceftriaxone
C: Ciprofloxacin
D: Doxycycline
E: Nitrofurantoin
"""

data = spark.createDataFrame([[prompt]]).toDF("text")

In [6]:
results = pipeline.fit(data).transform(data).cache()

In [7]:
results.select("completions").show(truncate=False)


In [8]:
print(results.select("completions").collect()[0].completions[0].result)

The patient presents with symptoms suggestive of a urinary tract infection (UTI) during pregnancy. Given the gestational age of 22 weeks, the most appropriate treatment option is **E: Nitrofurantoin**.

Here's why:

*   **Nitrofurantoin:** Nitrofurantoin is a commonly used and generally safe antibiotic for UTIs in pregnancy, especially in the first and second trimesters. It concentrates well in the urine and has a relatively low risk of systemic side effects compared to
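
For multiple-choice prompts like the one above, the free-text completion usually has to be reduced to a single option letter before it can be scored. The sketch below is one minimal way to do that with a regular expression. It assumes the model names its choice in the `X: option` form seen in the output above, which is not guaranteed. `extract_choice` is a hypothetical helper, not part of Spark NLP.

```python
import re


def extract_choice(completion, choices="ABCDE"):
    """Return the first option letter written as 'X: ...' in the completion.

    Returns None when no such pattern is present. Assumes the answer is
    phrased like the option list in the prompt (letter, colon, space).
    """
    match = re.search(rf"\b([{choices}]):\s", completion)
    return match.group(1) if match else None


# Sentence taken from the completion printed above.
sample = "the most appropriate treatment option is **E: Nitrofurantoin**."
print(extract_choice(sample))
```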


**Open-Ended Question**

In [9]:
prompt = """### Instruction:
You will be given a health-related question. Provide a clear, accurate, and concise answer in no more than 3 sentences.
Base your answer only on medically accepted and verifiable information.
Do not add personal opinions or unverified claims.

### Question:
What is hypertension?

### Response:
"""

data = spark.createDataFrame([[prompt]]).toDF("text")

In [10]:
result = pipeline.fit(data).transform(data).cache()
result.select("completions").show(truncate=False)


In [11]:
print(result.select("completions").collect()[0].completions[0].result)

Hypertension, or high blood pressure, is a condition in which the force of your blood against your artery walls is consistently too high. This can damage your heart, blood vessels, kidneys, and other organs. It is often called the "silent killer" because it usually has no symptoms until it causes serious health problems.



**Summarization**

In [12]:
prompt = """### Instruction:
You will be given a long piece of text. Summarize it into a **short version that is no more than 30–40% of the original length**.
Keep only the main ideas and key facts, removing all unnecessary details or examples.
Do not add new information that is not in the text.

### Text:
Climate change refers to long-term shifts in temperatures and weather patterns. These shifts may be natural,
but since the 1800s, human activities have been the main driver, primarily due to burning fossil fuels like coal,
oil, and gas. Burning these materials releases greenhouse gases, which trap the sun’s heat and raise global temperatures.
Consequences include rising sea levels, more extreme weather events, and disruptions to food and water supply.

### Response:
"""


data = spark.createDataFrame([[prompt]]).toDF("text")

In [13]:
results = pipeline.fit(data).transform(data).cache()
results.select("completions").show(truncate=False)


In [14]:
print(results.select("completions").collect()[0].completions[0].result)

Climate change is a long-term shift in temperatures and weather patterns, primarily driven by human activities since the 1800s. Burning fossil fuels releases greenhouse gases, trapping heat and raising global temperatures. This leads to rising sea levels, more extreme weather, and disruptions to food and water supplies.
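
The prompt above asks for a summary of no more than 30–40% of the original length, but a model will not necessarily comply, so it can be worth measuring. The sketch below computes a word-count ratio. Whitespace tokenization and the inclusive 0.30 to 0.40 band are simplifying assumptions, and `length_ratio` and `within_target` are hypothetical helper names.

```python
def length_ratio(original, summary):
    """Word-count ratio of summary to original, using whitespace tokenization."""
    original_words = original.split()
    if not original_words:
        raise ValueError("original text is empty")
    return len(summary.split()) / len(original_words)


def within_target(ratio, low=0.30, high=0.40):
    """True when the ratio falls inside the band requested in the prompt."""
    return low <= ratio <= high


# Toy strings so the arithmetic is easy to follow: 3 words out of 10.
ratio = length_ratio("a b c d e f g h i j", "a b c")
print(ratio, within_target(ratio))
```

The same two calls can be applied to the text under `### Text:` in the prompt and to the completion returned by the pipeline.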



**Named Entity Recognition**

In [15]:
med_ner_prompt = """
### Template:
{
    "drugs": [
        {
            "name": "",
            "reactions": []
        }
    ],
    "chunks_with_labels": [
        {
            "text": "",
            "label": ""
        }
    ]
}

### Instructions:
Extract entities from the Text and return:
1) The required JSON under the "drugs" key EXACTLY in the schema above:
   - "name": the drug name or brand as written in the text.
   - "reactions": a list of adverse effects mentioned for that drug (strings only, no extra fields).
2) Additionally, return a flat list of all extracted chunks and their labels under "chunks_with_labels":
   - Each item must have "text" (the exact span from the text) and "label" (e.g., DRUG, REACTION, DOSAGE, FREQUENCY, DURATION).
   - Do not invent information not present in the text.
   - Use the exact casing and wording from the text.
   - If something is not present, return an empty list for that section.

Only use information from the Text. Output valid JSON matching the Template.

### Text:
I feel a bit drowsy & have a little blurred vision , and some gastric problems .
I 've been on Arthrotec 50 for over 10 years on and off , only taking it when I needed it .
Due to my arthritis getting progressively worse , to the point where I am in tears with the agony.
Gp 's started me on 75 twice a day and I have to take it every day for the next month to see how I get on , here goes .
So far its been very good , pains almost gone , but I feel a bit weird , did n't have that when on 50.
"""


data = spark.createDataFrame([[med_ner_prompt]]).toDF("text")
data.show(truncate=100)

+----------------------------------------------------------------------------------------------------+
|                                                                                                text|
+----------------------------------------------------------------------------------------------------+
|\n### Template:\n{\n    "drugs": [\n        {\n            "name": "",\n            "reactions": ...|
+----------------------------------------------------------------------------------------------------+



In [16]:
%%time
results = pipeline.fit(data).transform(data).cache()
results.select("completions").show(truncate=False)


In [17]:
print(results.select("completions").collect()[0].completions[0].result)

```json
{
    "drugs": [
        {
            "name": "Arthrotec 50",
            "reactions": [
                "drowsy",
                "blurred vision",
                "gastric problems",
                "weird"
            ]
        },
        {
            "name": "Arthrotec 75",
            "reactions": [
                "weird"
            ]
        }
    ],



**Retrieval Augmented Generation**

In [18]:
prompt = """
### Template:
Use the following pieces of context to answer the user's question. If you return an answer, end with 'It's my pleasure'.
If you don't know the answer, just say that you don't know, don't try to make up an answer.

### Context:
'Hypertension is a chronic medical condition in which the blood pressure in the arteries is persistently elevated. Long-term high blood pressure is a major risk factor for stroke, heart disease, and kidney failure.',
'Several studies have shown that dietary habits, physical inactivity, and high sodium intake contribute significantly to the development of hypertension. Lifestyle modifications such as reducing salt intake, exercising regularly, and maintaining a healthy weight can help prevent and manage the condition.',

### Questions:
impact of high sodium intake on hypertension?
"""

data = spark.createDataFrame([[prompt]]).toDF("text")
data.show(truncate=100)

+----------------------------------------------------------------------------------------------------+
|                                                                                                text|
+----------------------------------------------------------------------------------------------------+
|\n### Template:\nUse the following pieces of context to answer the user's question. If you return...|
+----------------------------------------------------------------------------------------------------+



In [19]:
%%time
results = pipeline.fit(data).transform(data).cache()
results.select("completions").show(truncate=False)

+----------------------------------------------------------------------------------------------------------------------------------------------+
|completions                                                                                                                                   |
+----------------------------------------------------------------------------------------------------------------------------------------------+
|[{document, 0, 98, High sodium intake contributes significantly to the development of hypertension. It's my pleasure.\n, {sentence -> 0}, []}]|
+----------------------------------------------------------------------------------------------------------------------------------------------+

CPU times: user 10.5 ms, sys: 5.86 ms, total: 16.4 ms
Wall time: 13.3 s


In [20]:
print(results.select("completions").collect()[0].completions[0].result)

High sodium intake contributes significantly to the development of hypertension. It's my pleasure.



**Text to SQL**

In [21]:
prompt = """### Instruction:
Table: HospitalVisits
- visit_id (INT)
- patient_id (INT)
- visit_date (DATE)
- department (VARCHAR)
- doctor_name (VARCHAR)
- diagnosis (VARCHAR)

Retrieve the visit dates and doctor names for patients diagnosed with pneumonia.

### Response:
"""

data = spark.createDataFrame([[prompt]]).toDF("text")


In [22]:
result = pipeline.fit(data).transform(data).cache()
result.select("completions").show(truncate=False)

+----------------------------------------------------------------------------------------------------------------------------------------------------+
|completions                                                                                                                                         |
+----------------------------------------------------------------------------------------------------------------------------------------------------+
|[{document, 0, 100, ```sql\nSELECT DISTINCT visit_date, doctor_name\nFROM HospitalVisits\nWHERE diagnosis = 'Pneumonia';\n```, {sentence -> 0}, []}]|
+----------------------------------------------------------------------------------------------------------------------------------------------------+



In [23]:
example_output  = result.select("completions").collect()[0].completions[0].result
print(example_output)

```sql
SELECT DISTINCT visit_date, doctor_name
FROM HospitalVisits
WHERE diagnosis = 'Pneumonia';
```
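
The completion above wraps the query in a Markdown code fence, so the fence has to be stripped before the SQL can be run. The sketch below shows one way to do that and to sanity-check the query against an in-memory SQLite table that mirrors the schema in the prompt. The two inserted rows and the doctor names are made-up test data, and `extract_sql` is a hypothetical helper.

```python
import re
import sqlite3

# Built programmatically so this example can itself sit inside a fenced block.
FENCE = "`" * 3


def extract_sql(completion):
    """Return the SQL inside a fenced block, or the stripped text if unfenced."""
    pattern = re.escape(FENCE) + r"(?:sql)?\s*(.*?)" + re.escape(FENCE)
    match = re.search(pattern, completion, flags=re.DOTALL | re.IGNORECASE)
    return (match.group(1) if match else completion).strip()


# Same text as the completion printed above.
completion = (
    FENCE + "sql\n"
    "SELECT DISTINCT visit_date, doctor_name\n"
    "FROM HospitalVisits\n"
    "WHERE diagnosis = 'Pneumonia';\n" + FENCE
)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE HospitalVisits ("
    "visit_id INT, patient_id INT, visit_date DATE, "
    "department VARCHAR, doctor_name VARCHAR, diagnosis VARCHAR)"
)
# Made-up rows, for illustration only.
conn.executemany(
    "INSERT INTO HospitalVisits VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 101, "2024-01-05", "Pulmonology", "Dr. Hypothetical A", "Pneumonia"),
        (2, 102, "2024-01-06", "Cardiology", "Dr. Hypothetical B", "Hypertension"),
    ],
)

rows = conn.execute(extract_sql(completion)).fetchall()
print(rows)
```

The generated filter `diagnosis = 'Pneumonia'` is an exact, case-sensitive match in SQLite, so rows stored as `pneumonia` would not be returned. Generated SQL is best reviewed before it is run against a real database.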


## JSL_MedS_NER

This LLM is trained to extract and link entities in a document. Users need to define an input schema, as shown in the example below. In that schema, "drugs" is a list, which tells the model that the document may mention several drugs and that all of them must be extracted. Each drug has the properties "name" and "reactions": "name" is a string because a drug has a single name, while "reactions" is a list because one drug can have several reactions. Users can define a schema for any type of entity in the same way.
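
To make the schema idea concrete, the sketch below builds the `### Template:` / `### Text:` prompt from a Python dictionary rather than a hand-written string. It assumes the model only needs the JSON skeleton in this layout, as in the notebook's own prompts. `build_ner_prompt` is a hypothetical helper, not part of Spark NLP.

```python
import json


def build_ner_prompt(schema, text):
    """Render a schema dict and an input text into the Template/Text layout."""
    template = json.dumps(schema, indent=4)
    return f"\n### Template:\n{template}\n### Text:\n{text}\n"


# A list means "possibly many"; an empty string means "a single value".
schema = {
    "drugs": [
        {
            "name": "",
            "reactions": [],
        }
    ]
}

prompt = build_ner_prompt(schema, "I 've been on Arthrotec 50 for over 10 years .")
print(prompt)
```

Keeping the schema as a dictionary makes it easy to add fields or entity types without re-indenting a multi-line string by hand.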

In [24]:
document_assembler = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

medical_llm = MedicalLLM.pretrained("jsl_meds_ner_q8_v4", "en", "clinical/models")\
    .setInputCols("document")\
    .setOutputCol("completions")\
    .setBatchSize(1)\
    .setNPredict(100)\
    .setUseChatTemplate(True)\
    .setTemperature(0)\
    #.setNGpuLayers(100) # if you have GPU


pipeline = Pipeline(
    stages = [
        document_assembler,
        medical_llm
])

jsl_meds_ner_q8_v4 download started this may take some time.
Approximate size to download 3.7 GB
[OK!]


In [25]:
med_ner_prompt = """
### Template:
{
    "drugs": [
        {
            "name": "",
            "reactions": []
        }
    ]
}
### Text:
I feel a bit drowsy & have a little blurred vision , and some gastric problems .
I 've been on Arthrotec 50 for over 10 years on and off , only taking it when I needed it .
Due to my arthritis getting progressively worse , to the point where I am in tears with the agony.
Gp 's started me on 75 twice a day and I have to take it every day for the next month to see how I get on , here goes .
So far its been very good , pains almost gone , but I feel a bit weird , did n't have that when on 50.
"""

data = spark.createDataFrame([[med_ner_prompt]]).toDF("text")
data.show(truncate=100)

+----------------------------------------------------------------------------------------------------+
|                                                                                                text|
+----------------------------------------------------------------------------------------------------+
|\n### Template:\n{\n    "drugs": [\n        {\n            "name": "",\n            "reactions": ...|
+----------------------------------------------------------------------------------------------------+



In [26]:
%%time
results = pipeline.fit(data).transform(data).cache()
results.select("completions").show(truncate=False)


In [27]:
print(results.select("completions").collect()[0].completions[0].result)

{
    "drugs": [
        {
            "name": "Arthrotec 50",
            "reactions": [
                "drowsy",
                "blurred vision",
                "gastric problems"
            ]
        },
        {
            "name": "75",
            "reactions": [
                "weird"
            ]
        }
    ]
}
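
Because the completion follows the JSON template from the prompt, it can be parsed with the standard library and flattened for downstream use. The sketch below is a minimal illustration that hard-codes the output printed above as `completion`; in the notebook you would pass the `result` string instead. The helper name `drug_reaction_pairs` is hypothetical and not part of Spark NLP.

```python
import json

# Completion text copied from the cell output above.
completion = """
{
    "drugs": [
        {"name": "Arthrotec 50", "reactions": ["drowsy", "blurred vision", "gastric problems"]},
        {"name": "75", "reactions": ["weird"]}
    ]
}
"""

def drug_reaction_pairs(raw_json):
    """Flatten the template-shaped JSON into (drug, reaction) tuples."""
    parsed = json.loads(raw_json)
    return [
        (drug.get("name", ""), reaction)
        for drug in parsed.get("drugs", [])
        for reaction in drug.get("reactions", [])
    ]

pairs = drug_reaction_pairs(completion)
for name, reaction in pairs:
    print(f"{name}: {reaction}")
```

Using `.get` with defaults keeps the helper from failing if the model omits a key, although malformed JSON would still raise `json.JSONDecodeError`.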



## JSL_MedS_RAG

In [28]:
document_assembler = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

medical_llm = MedicalLLM.pretrained("jsl_meds_rag_q8_v1", "en", "clinical/models")\
    .setInputCols("document")\
    .setOutputCol("completions")\
    .setBatchSize(1)\
    .setNPredict(100)\
    .setUseChatTemplate(True)\
    .setTemperature(0)


pipeline = Pipeline(
    stages = [
        document_assembler,
        medical_llm
])

jsl_meds_rag_q8_v1 download started this may take some time.
Approximate size to download 3.7 GB
[OK!]


In [29]:
prompt = """
### Template:
Use the following pieces of context to answer the user's question. If you return an answer, end with 'It's my pleasure'.
If you don't know the answer, just say that you don't know, don't try to make up an answer .


### Context:
'Background: Diabetes is referred to a group of diseases characterized by high glucose levels in blood. It is caused by a deficiency in the production or function of insulin or both, which can occur because of different reasons, resulting in protein and lipid metabolic disorders. The aim of this study was to systematically review the prevalence and incidence of type 1 diabetes in the world.',
'A higher prevalence of diabetes mellitus was observed in Addis Ababa public health institutions. Factors such as age, alcohol drinking, HDL, triglycerides, and vagarious physical activity were associated with diabetes mellitus. Concerned bodies need to work over the ever-increasing diabetes mellitus in Addis Ababa.',

### Questions:
relationship between diabetes and obesity?
"""

data = spark.createDataFrame([[prompt]]).toDF("text")
data.show(truncate=100)

+----------------------------------------------------------------------------------------------------+
|                                                                                                text|
+----------------------------------------------------------------------------------------------------+
|\n### Template:\nUse the following pieces of context to answer the user's question. If you return...|
+----------------------------------------------------------------------------------------------------+



In [30]:
%%time
results = pipeline.fit(data).transform(data).cache()
results.select("completions").show(truncate=False)

+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|completions                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
+-----------------------------------------------------------------------------

In [31]:
print(results.select("completions").collect()[0].completions[0].result)


Diabetes and obesity are closely related conditions. Obesity is a significant risk factor for the development of type 2 diabetes. Excess body fat, particularly around the abdomen, can increase the body's resistance to insulin, leading to higher blood glucose levels. This insulin resistance can eventually result in type 2 diabetes. Additionally, obesity can exacerbate the complications associated with diabetes,
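
When the context passages come from a retriever rather than being typed by hand, the same prompt layout can be assembled programmatically. The following is a minimal sketch, assuming the `### Template` / `### Context` / `### Questions` layout used in the cell above; `build_rag_prompt` is a hypothetical helper, and the passages are shortened versions of the ones in the notebook.

```python
def build_rag_prompt(contexts, question):
    """Assemble a prompt with the Template / Context / Questions layout used above."""
    instructions = (
        "Use the following pieces of context to answer the user's question. "
        "If you return an answer, end with 'It's my pleasure'.\n"
        "If you don't know the answer, just say that you don't know, "
        "don't try to make up an answer."
    )
    # One quoted passage per line, mirroring the hand-written prompt.
    context_block = "\n".join(f"'{passage}'," for passage in contexts)
    return (
        f"\n### Template:\n{instructions}\n\n"
        f"### Context:\n{context_block}\n\n"
        f"### Questions:\n{question}\n"
    )

passages = [
    "Diabetes is a group of diseases characterized by high glucose levels in blood.",
    "A higher prevalence of diabetes mellitus was observed in Addis Ababa public health institutions.",
]
rag_prompt = build_rag_prompt(passages, "relationship between diabetes and obesity?")
print(rag_prompt)
```

The resulting string can be placed in the `text` column exactly like the hand-written prompt. The answer printed above stops mid-sentence, which may be a consequence of the `setNPredict(100)` limit set on the model.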


## Text2SOAP

In [32]:
document_assembler = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

medical_llm = MedicalLLM.pretrained("jsl_meds_text2soap_v1", "en", "clinical/models")\
    .setInputCols("document")\
    .setOutputCol("completions")\
    .setBatchSize(1)\
    .setNPredict(1024)\
    .setUseChatTemplate(True)\
    .setTemperature(0)\
    .setNGpuLayers(1024)

pipeline = Pipeline(
    stages = [
        document_assembler,
        medical_llm
])

jsl_meds_text2soap_v1 download started this may take some time.
Approximate size to download 4.6 GB
[OK!]


In [33]:
text = '''
A 28-year-old female with a history of gestational diabetes mellitus diagnosed eight years prior to presentation and subsequent type two diabetes mellitus ( T2DM ), one prior episode of HTG-induced pancreatitis three years prior to presentation , and associated with an acute hepatitis , presented with a one-week history of polyuria , poor appetite , and vomiting .
She was on metformin , glipizide , and dapagliflozin for T2DM and atorvastatin and gemfibrozil for HTG . She had been on dapagliflozin for six months at the time of presentation .
Physical examination on presentation was significant for dry oral mucosa ; significantly , her abdominal examination was benign with no tenderness , guarding , or rigidity . Pertinent laboratory findings on admission were : serum glucose 111 mg/dl ,  creatinine 0.4 mg/dL , triglycerides 508 mg/dL , total cholesterol 122 mg/dL , and venous pH 7.27 .
'''

In [34]:
text_with_prompt = f"""
<|im_start|>user
You are an expert medical professor assisting in the creation of medically accurate SOAP summaries.
Please ensure the response follows the structured format: S:, O:, A:, P: without using markdown or special formatting.
Create a Medical SOAP note summary from the dialogue, following these guidelines:
    S (Subjective): Summarize the patient's reported symptoms, including chief complaint and relevant history. Rely on the patient's statements as the primary source and ensure standardized terminology.
    O (Objective): Highlight critical findings such as vital signs, lab results, and imaging, emphasizing important details like the side of the body affected and specific dosages. Include normal ranges where relevant.
    A (Assessment): Offer a concise assessment combining subjective and objective data. State the primary diagnosis and any differential diagnoses, noting potential complications and the prognostic outlook.
    P (Plan): Outline the management plan, covering medication, diet, consultations, and education. Ensure to mention necessary referrals to other specialties and address compliance challenges.
    Considerations: Compile the report based solely on the transcript provided. Maintain confidentiality and document sensitively. Use concise medical jargon and abbreviations for effective doctor communication.
    Please format the summary in a clean, simple list format without using markdown or bullet points. Use 'S:', 'O:', 'A:', 'P:' directly followed by the text. Avoid any styling or special characters.
### Dialogue:
{text}
<|im_end|>
<|im_start|>assistant
"""

data = spark.createDataFrame([[text_with_prompt]]).toDF("text")
data.show(truncate=100)

+----------------------------------------------------------------------------------------------------+
|                                                                                                text|
+----------------------------------------------------------------------------------------------------+
|\n<|im_start|>user\nYou are an expert medical professor assisting in the creation of medically ac...|
+----------------------------------------------------------------------------------------------------+
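
The prompt above wraps the instructions and the dialogue in a single user turn delimited by `<|im_start|>` and `<|im_end|>`, then leaves the assistant turn open so that the model continues from there. A minimal sketch of that wrapping as a reusable function is shown below; `wrap_chatml_user_turn` is a hypothetical name, and the short strings stand in for the full instructions and clinical text.

```python
def wrap_chatml_user_turn(instructions, dialogue):
    """Wrap instructions and a dialogue in one user turn and open the assistant turn."""
    return (
        "\n<|im_start|>user\n"
        f"{instructions}\n"
        "### Dialogue:\n"
        f"{dialogue}\n"
        "<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

demo_prompt = wrap_chatml_user_turn(
    "Create a Medical SOAP note summary from the dialogue.",
    "Patient reports a one-week history of polyuria.",
)
print(demo_prompt)
```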



In [35]:
results = pipeline.fit(data).transform(data)

results.select("completions").show(truncate=False)


In [36]:
example_output  = results.select("completions").collect()[0].completions[0].result
print(example_output)

S: The patient is a 28-year-old female with a history of gestational diabetes mellitus diagnosed eight years ago and type two diabetes mellitus (T2DM), with a prior episode of HTG-induced pancreatitis three years ago. She presented with a one-week history of polyuria, poor appetite, and vomiting. She has been on metformin, glipizide, dapagliflozin, atorvastatin, and gemfibrozil for her conditions.
O: Physical examination showed dry oral mucosa but was otherwise benign with no tenderness, guarding, or rigidity. Laboratory findings revealed serum glucose at 111 mg/dL, creatinine at 0.4 mg/dL, triglycerides at 508 mg/dL, total cholesterol at 122 mg/dL, and venous pH at 7.27. She was on dapagliflozin for six months prior to presentation.
A: The primary diagnosis is diabetic ketoacidosis (DKA) based on the patient's symptoms of polyuria, poor appetite, vomiting, and elevated triglycerides and total cholesterol. Differential diagnoses include complications from her underlying diabetes and HT
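
Since the prompt asks for plain `S:`, `O:`, `A:`, `P:` labels, the note can be split into sections with a regular expression, which also makes it easy to detect a section that is absent (the output above ends before a `P:` section appears). This is a minimal sketch; `split_soap` is a hypothetical helper and `sample_note` is an abbreviated version of the output above.

```python
import re

def split_soap(note):
    """Return a dict mapping 'S', 'O', 'A', 'P' to the text that follows each label."""
    sections = {}
    # A label is one of S/O/A/P at the start of a line, followed by a colon.
    matches = list(re.finditer(r"^([SOAP]):[ \t]*", note, flags=re.MULTILINE))
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(note)
        sections[match.group(1)] = note[match.end():end].strip()
    return sections

sample_note = (
    "S: 28-year-old female with a one-week history of polyuria, poor appetite, and vomiting.\n"
    "O: Serum glucose 111 mg/dL, triglycerides 508 mg/dL, venous pH 7.27.\n"
    "A: The primary diagnosis is diabetic ketoacidosis (DKA)."
)
soap = split_soap(sample_note)
missing = [label for label in "SOAP" if label not in soap]
print(soap["O"])
print("Missing sections:", missing)
```

A non-empty `missing` list is one possible signal that the generation was cut short and that a larger `setNPredict` value is worth trying.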

## JSL_MedM

This model is trained to perform summarization and Q&A based on a given context.

In [37]:
document_assembler = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

medical_llm = MedicalLLM.pretrained("jsl_medm_q8_v3", "en", "clinical/models")\
    .setInputCols("document")\
    .setOutputCol("completions")\
    .setBatchSize(1)\
    .setNPredict(100)\
    .setUseChatTemplate(True)\
    .setTemperature(0)\
    #.setNGpuLayers(100) # if you have GPU

pipeline = Pipeline(
    stages = [
        document_assembler,
        medical_llm
])

jsl_medm_q8_v3 download started this may take some time.
Approximate size to download 14 GB
[OK!]


In [38]:
medm_prompt = """
summarize the following content.

 content:
 ---------------------------- INDICATIONS AND USAGE ---------------------------
 KISUNLA is an amyloid beta-directed antibody indicated for the
 treatment of Alzheimer’s disease. Treatment with KISUNLA should be
 initiated in patients with mild cognitive impairment or mild dementia
 stage of disease, the population in which treatment was initiated in the
 clinical trials. (1)
 ------------------------DOSAGE AND ADMINISTRATION-----------------------
 • Confirm the presence of amyloid beta pathology prior to initiating
 treatment. (2.1)
 • The recommended dosage of KISUNLA is 700 mg administered as
 an intravenous infusion over approximately 30 minutes every four
 weeks for the first three doses, followed by 1400 mg every four
 weeks. (2.2)
 • Consider stopping dosing with KISUNLA based on reduction of
 amyloid plaques to minimal levels on amyloid PET imaging. (2.2)
 • Obtain a recent baseline brain MRI prior to initiating treatment.
 (2.3, 5.1)
 • Obtain an MRI prior to the 2nd, 3rd, 4th, and 7th infusions. If
 radiographically observed ARIA occurs, treatment
 recommendations are based on type, severity, and presence of
 symptoms. (2.3, 5.1)
 • Dilution to a final concentration of 4 mg/mL to 10 mg/mL with 0.9%
 Sodium Chloride Injection, is required prior to administration. (2.4)
 ----------------------DOSAGE FORMS AND STRENGTHS---------------------
 Injection: 350 mg/20 mL (17.5 mg/mL) in a single-dose vial. (3)
 ------------------------------- CONTRAINDICATIONS ------------------------------
 KISUNLA is contraindicated in patients with known serious
 hypersensitivity to donanemab-azbt or to any of the excipients. (4, 5.2)
 ------------------------WARNINGS AND PRECAUTIONS-----------------------
 • Amyloid Related Imaging Abnormalities (ARIA): Enhanced clinical
 vigilance for ARIA is recommended during the first 24 weeks of
 treatment with KISUNLA. Risk of ARIA, including symptomatic
 ARIA, was increased in apolipoprotein E ε4 (ApoE ε4)
 homozygotes compared to heterozygotes and noncarriers. The risk
 of ARIA-E and ARIA-H is increased in KISUNLA-treated patients
 with pretreatment microhemorrhages and/or superficial siderosis. If
 a patient experiences symptoms suggestive of ARIA, clinical
 evaluation should be performed, including MRI scanning if
 indicated. (2.3, 5.1)
 • Infusion-Related Reactions: The infusion rate may be reduced, or
 the infusion may be discontinued, and appropriate therapy initiated
 as clinically indicated. Consider pre-treatment with antihistamines,
 acetaminophen, or corticosteroids prior to subsequent dosing. (5.3)
 -------------------------------ADVERSE REACTIONS------------------------------
 Most common adverse reactions (at least 10% and higher incidence
 compared to placebo): ARIA-E, ARIA-H microhemorrhage, ARIA-H
 superficial siderosis, and headache. (6.1)
"""

data = spark.createDataFrame([[medm_prompt]]).toDF("text")
data.show(truncate=100)

+----------------------------------------------------------------------------------------------------+
|                                                                                                text|
+----------------------------------------------------------------------------------------------------+
|\nsummarize the following content.\n\n content:\n ---------------------------- INDICATIONS AND US...|
+----------------------------------------------------------------------------------------------------+



In [39]:
%%time
results = pipeline.fit(data).transform(data).cache()
results.select("completions").show(truncate=False)

+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|completions                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          

In [40]:
print(results.select("completions").collect()[0].completions[0].result)

KISUNLA is an amyloid beta-directed antibody used to treat Alzheimer's disease, specifically in patients with mild cognitive impairment or mild dementia. The recommended dosage is 700 mg administered intravenously over 30 minutes every four weeks for the first three doses, followed by 1400 mg every four weeks. Treatment should be initiated after confirming the presence of amyloid beta pathology and obtaining a baseline brain MRI. The drug is available as a 350 mg/


## Multimodal Process

The LLMs can also process PDF files: the `Reader2Doc` annotator extracts the text from each PDF, and the extracted text is then passed to the model.

In [41]:
!mkdir -p pdf_files

!wget -O pdf_files/mt_sample.pdf "https://raw.githubusercontent.com/JohnSnowLabs/spark-nlp-workshop/master/healthcare-nlp/data/mt_sample_01.pdf"

--2025-10-23 16:07:59--  https://raw.githubusercontent.com/JohnSnowLabs/spark-nlp-workshop/master/healthcare-nlp/data/mt_sample_01.pdf
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.109.133, 185.199.108.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.109.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 41382 (40K) [application/octet-stream]
Saving to: ‘pdf_files/mt_sample.pdf’


2025-10-23 16:07:59 (3.95 MB/s) - ‘pdf_files/mt_sample.pdf’ saved [41382/41382]



In [42]:
from sparknlp.reader.reader2doc import Reader2Doc

reader2doc = Reader2Doc() \
    .setContentType("application/pdf")\
    .setContentPath("pdf_files/*pdf")\
    .setOutputCol("raw_text")\
    .setExplodeDocs(False)\
    .setFlattenOutput(True)\
    .setOutputFormat("plain-text")

pipeline = Pipeline(stages=[reader2doc])

empty_df = spark.createDataFrame([], "string").toDF("text")

model = pipeline.fit(empty_df)

result_df = model.transform(empty_df)

result_df.show(truncate=False)


In [43]:
prompt = """
### Template:
{
  "clinical":
    [
        "disease":[],
        "sympthom":[]
    ],

  "demographic":
    [
        "name":[],
        "date":[],
        "id":[]
    ]
}
### Text:
"""

medical_llm = MedicalLLM.pretrained("jsl_meds_ner_vlm_2b_q16_v1", "en", "clinical/models")\
    .setInputCols("document")\
    .setOutputCol("completions")\
    .setBatchSize(1)\
    .setNPredict(10000)\
    .setUseChatTemplate(True)\
    .setTemperature(0)\
    .setNGpuLayers(100)

jsl_meds_ner_vlm_2b_q16_v1 download started this may take some time.
Approximate size to download 3.3 GB
[OK!]


**`custom_llm_preprocessor`** converts the document annotations produced by `Reader2Doc` into a single prompt for the LLM. It joins the text of all incoming annotations and prepends the NER prompt defined above, so the model receives the template followed by the PDF text.

In [44]:
def custom_llm_preprocessor(annotations):
    # Join the text of all incoming document annotations, one per line
    flattened_docs = ""
    for annotation in annotations:
        document_text = annotation.result
        if document_text:
            flattened_docs += "\n" + document_text

    # Prepend the NER prompt so the LLM sees the template followed by the PDF text
    result = prompt + flattened_docs

    # Return one merged annotation; metadata and embeddings are copied
    # from the last input annotation
    return [
        sparknlp.base.Annotation(
            annotatorType="document",
            begin=0,
            end=len(result) - 1,
            result=result,
            metadata=annotation.metadata,
            embeddings=annotation.embeddings,
        )
    ]
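
To see what the preprocessor does without starting Spark, the same logic can be run on plain Python objects. This is only a sketch: `FakeAnnotation` is a hypothetical stand-in for the Spark NLP `Annotation` class, and the prompt, page texts, and metadata are made up for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for the Spark NLP Annotation class (illustration only).
FakeAnnotation = namedtuple(
    "FakeAnnotation", "annotatorType begin end result metadata embeddings"
)

demo_prompt = "### Template:\n{}\n### Text:\n"

def merge_with_prompt(annotations, prefix):
    """Join non-empty document texts on separate lines and prepend the prompt."""
    body = "".join("\n" + a.result for a in annotations if a.result)
    merged = prefix + body
    last = annotations[-1]  # metadata and embeddings come from the last annotation
    return FakeAnnotation("document", 0, len(merged) - 1, merged, last.metadata, last.embeddings)

pages = [
    FakeAnnotation("document", 0, 12, "Page one text", {"page": "1"}, []),
    FakeAnnotation("document", 0, -1, "", {"page": "2"}, []),  # empty text is skipped
    FakeAnnotation("document", 0, 14, "Page three text", {"page": "3"}, []),
]
merged = merge_with_prompt(pages, demo_prompt)
print(repr(merged.result))
print(merged.begin, merged.end)
```

The single merged annotation spans the whole string, so `end` is the index of its last character.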

In [45]:
custom_llm_preprocessor_converter = AnnotationConverter(f=custom_llm_preprocessor)\
    .setInputCol("raw_text")\
    .setOutputCol("document")\
    .setOutputAnnotatorType("document") # the merged prompt is emitted as a 'document' annotation

pipeline = Pipeline(
    stages=[
        reader2doc,
        custom_llm_preprocessor_converter,
        medical_llm
])




In [46]:
empty_df = spark.createDataFrame([], "string").toDF("text")

result_df = pipeline.fit(empty_df).transform(empty_df).cache()

In [47]:
%%time
collected_result = result_df.collect()

CPU times: user 9.18 ms, sys: 3.71 ms, total: 12.9 ms
Wall time: 41 s


In [48]:
print(json.loads(collected_result[0].completions[0].result))

{'clinical': {'disease': ['Mesothelioma', 'pleural eﬀusion', 'atrial ﬁbrillaIon', 'anemia', 'ascites', 'esophageal reﬂux', 'deep venous thrombosis'], 'sympthom': ['nonproductive cough', 'right-sided chest pain', 'fever', 'right-sided pleural eﬀusion', 'cough with right-sided chest pain', 'pericardiIs', 'pericardectomy', 'atrial ﬁbrillaIon', 'RNCA with intracranial thrombolyIc treatment', 'PTA of MCA', 'Mesenteric venous thrombosis', 'pericardial window', 'cholecystectomy', 'LeZ thoracentesis']}, 'demographic': {'name': ['Hendrickson, Ora MR.'], 'date': ['2007-08-24', '2007-08-20', '2007-08-31'], 'id': ['7194334']}}
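
Several extracted terms contain ligature characters carried over from the PDF text, such as the single-character `ﬀ` in "eﬀusion". A minimal sketch of one way to clean them is Unicode NFKC normalization, which expands compatibility ligatures into plain letters. It does not repair other extraction damage, such as the `I` that replaces "ti" in "ﬁbrillaIon". The helper name `normalize_ligatures` is hypothetical.

```python
import unicodedata

def normalize_ligatures(text):
    """Expand compatibility ligatures such as U+FB00 (ff), U+FB01 (fi), U+FB02 (fl)."""
    return unicodedata.normalize("NFKC", text)

# Terms taken from the output above, with the ligatures written as escapes.
extracted = ["pleural e\ufb00usion", "esophageal re\ufb02ux", "atrial \ufb01brillaIon"]
cleaned = [normalize_ligatures(term) for term in extracted]
print(cleaned)
```

Applying the same normalization to the PDF text before it reaches the model is another option, though whether that changes the extraction quality is not something this notebook tests.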


In [49]:
import json

raw_json = collected_result[0].completions[0].result

data = json.loads(raw_json)

print(json.dumps(data, indent=4, ensure_ascii=False))

{
    "clinical": {
        "disease": [
            "Mesothelioma",
            "pleural eﬀusion",
            "atrial ﬁbrillaIon",
            "anemia",
            "ascites",
            "esophageal reﬂux",
            "deep venous thrombosis"
        ],
        "sympthom": [
            "nonproductive cough",
            "right-sided chest pain",
            "fever",
            "right-sided pleural eﬀusion",
            "cough with right-sided chest pain",
            "pericardiIs",
            "pericardectomy",
            "atrial ﬁbrillaIon",
            "RNCA with intracranial thrombolyIc treatment",
            "PTA of MCA",
            "Mesenteric venous thrombosis",
            "pericardial window",
            "cholecystectomy",
            "LeZ thoracentesis"
        ]
    },
    "demographic": {
        "name": [
            "Hendrickson, Ora MR."
        ],
        "date": [
            "2007-08-24",
            "2007-08-20",
            "2007-08-31"
        ],
        "

## Pretrained Pipeline

| Model Name                                                            |      Description            |
|-----------------------------------------------------------------------|-----------------------------|
| [`jsl_meds_4b_q16_v4_pipeline`](https://nlp.johnsnowlabs.com/2025/08/16/jsl_meds_4b_q16_v4_pipeline_en.html) |  Q&A, NER, Summarization, RAG, and Chat. |
| [`jsl_meds_8b_q8_v4_pipeline`](https://nlp.johnsnowlabs.com/2025/08/16/jsl_meds_8b_q8_v4_pipeline_en.html) |  Q&A, NER, Summarization, RAG, and Chat. |
| [`jsl_meds_ner_2b_q16_v2_pipeline`](https://nlp.johnsnowlabs.com/2025/08/16/jsl_meds_ner_2b_q16_v2_pipeline_en.html) |  Q&A, NER, Summarization, RAG, and Chat. |
| [`jsl_meds_ner_q16_v4_pipeline`](https://nlp.johnsnowlabs.com/2025/08/16/jsl_meds_ner_q16_v4_pipeline_en.html) |  Q&A, NER |
| [`jsl_meds_ner_vlm_2b_q16_v2_pipeline`](https://nlp.johnsnowlabs.com/2025/08/16/jsl_meds_ner_vlm_2b_q16_v2_pipeline_en.html) |  Q&A, NER |

In [50]:
from sparknlp.pretrained import PretrainedPipeline

In [51]:
pipeline = PretrainedPipeline("jsl_meds_ner_2b_q16_v2_pipeline", "en", "clinical/models")

jsl_meds_ner_2b_q16_v2_pipeline download started this may take some time.
Approx size to download 2.3 GB
[OK!]


In [52]:
text = """
# Template:
{
  "Patient Name": "string",
  "Patient Age": "integer",
  "Patient Gender": "string",
  "Hospital Number": "string",
  "Episode Number": "string",
  "Episode Date": "date-time"
}
# Context:
The patient, Johnathan Miller, is a 54-year-old male admitted under hospital number HN382914.
His most recent episode number is EP2024-1178, recorded on 2025-08-10.
The patient presented with chronic knee pain and swelling.
Past medical history includes hypertension and type 2 diabetes.
"""

data = spark.createDataFrame([[text]]).toDF("text")
data.show(truncate=100)

+----------------------------------------------------------------------------------------------------+
|                                                                                                text|
+----------------------------------------------------------------------------------------------------+
|\n# Template:\n{\n  "Patient Name": "string",\n  "Patient Age": "integer",\n  "Patient Gender": "...|
+----------------------------------------------------------------------------------------------------+



In [53]:
%%time
results = pipeline.transform(data).cache()
results.select("completions").show(truncate=False)

+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|completions                                                                                                                                                                                                               |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|[{document, 0, 174, {"Patient Name": "Johnathan Miller", "Patient Age": 54, "Patient Gender": "male", "Hospital Number": "HN382914", "Episode Number": "EP2024-1178", "Episode Date": "2025-08-10"}, {sentence -> 0}, []}]|
+-------------------------------------------------------------------------------------------------------------------
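
The template in this example labels each field with a type, so the completion can be checked against it before it is stored. The sketch below is one possible validation, assuming that `"date-time"` values arrive as `YYYY-MM-DD` strings as in the output above; `find_problems`, `is_iso_date`, and `CHECKS` are hypothetical names.

```python
import json
from datetime import date

# Template and completion copied from the cells above.
template = {
    "Patient Name": "string",
    "Patient Age": "integer",
    "Patient Gender": "string",
    "Hospital Number": "string",
    "Episode Number": "string",
    "Episode Date": "date-time",
}
completion = (
    '{"Patient Name": "Johnathan Miller", "Patient Age": 54, "Patient Gender": "male", '
    '"Hospital Number": "HN382914", "Episode Number": "EP2024-1178", "Episode Date": "2025-08-10"}'
)

def is_iso_date(value):
    """Return True when value is a string in YYYY-MM-DD form."""
    try:
        date.fromisoformat(value)
        return True
    except (TypeError, ValueError):
        return False

# Illustrative checks for the three type labels that appear in the template.
CHECKS = {
    "string": lambda v: isinstance(v, str),
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "date-time": is_iso_date,
}

def find_problems(template, record):
    """List fields that are missing or whose value does not match the template label."""
    problems = []
    for field, label in template.items():
        if field not in record:
            problems.append(f"missing: {field}")
        elif not CHECKS[label](record[field]):
            problems.append(f"bad {label}: {field}")
    return problems

record = json.loads(completion)
problems = find_problems(template, record)
print(problems)
```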

# LLMLoader

`LLMLoader` is designed to interact with LLMs that have been converted to the GGUF format. It gives access to John Snow Labs' licensed LLMs, which come in various sizes and are fine-tuned on medical context for specific tasks. It provides methods for loading models, generating text, retrieving metadata, and setting generation parameters such as the input prefix and suffix, prompt caching, the number of tokens to predict, sampling techniques, temperature, and penalties. Overall, `LLMLoader` provides a flexible and extensible framework for interacting with language models from Python and Scala environments using PySpark and Java.

The `LLMLoader`, now based on the llama.cpp dependency, can load models (either `AutoGGUFModel` or `MedicalLLM`) through the `pretrained` and `load` methods, and it automatically detects whether a model is licensed. It can also load any GGUF model from outside the Models Hub directly from a file with the `loadGGUF` method. This streamlines model loading and usage without requiring users to add the llama.cpp dependency themselves.

## JSL_MedS


This model is trained to perform summarization and Q&A based on a given context.

In [54]:
from sparknlp_jsl.llm import LLMLoader

jsl_meds_llm = LLMLoader(spark).pretrained("jsl_meds_q8_v3", "en", "clinical/models")

In [55]:
prompt = """
Based on the following text, what age group is most susceptible to breast cancer?

## Text:
The exact cause of breast cancer is unknown. However, several risk factors can increase your likelihood of developing breast cancer, such as:
- A personal or family history of breast cancer
- A genetic mutation, such as BRCA1 or BRCA2
- Exposure to radiation
- Age (most commonly occurring in women over 50)
- Early onset of menstruation or late menopause
- Obesity
- Hormonal factors, such as taking hormone replacement therapy
"""

response = jsl_meds_llm.generate(prompt)

In [56]:
response

'\nBased on the provided text, the age group most susceptible to breast cancer is women over the age of 50. This is because the text explicitly states that breast cancer is most commonly occurring in women over 50. While other risk factors such as genetic mutations, exposure to radiation, early onset of menstruation, late menopause, obesity, and hormonal factors can increase the likelihood of developing breast cancer, age is specifically mentioned as a significant risk factor. Therefore, women in the age group of 50 and above are at a higher risk of developing breast cancer. It is important to note that while age is a significant risk factor, it does not guarantee the development of breast cancer, and other factors should also be considered. Regular screenings and early detection can help in managing and treating breast cancer effectively.'

## JSL_MedM


This model is trained to perform Q&A, summarization, RAG, and chat.

In [57]:
from sparknlp_jsl.llm import LLMLoader

jsl_medm_llm = LLMLoader(spark).pretrained("jsl_medm_q8_v3", "en", "clinical/models")

In [58]:
prompt = """
A 23-year-old pregnant woman at 22 weeks gestation presents with burning upon urination. She states it started 1 day ago and has been worsening despite drinking more water and taking cranberry extract. She otherwise feels well and is followed by a doctor for her pregnancy. Her temperature is 97.7°F (36.5°C), blood pressure is 122/77 mmHg, pulse is 80/min, respirations are 19/min, and oxygen saturation is 98% on room air. Physical exam is notable for an absence of costovertebral angle tenderness and a gravid uterus.
Which of the following is the best treatment for this patient?
A: Ampicillin
B: Ceftriaxone
C: Ciprofloxacin
D: Doxycycline
E: Nitrofurantoin
"""

response = jsl_medm_llm.generate(prompt)

In [59]:
print(response)

The best treatment for this patient is E: Nitrofurantoin.

Explanation:
The patient presents with symptoms of a urinary tract infection (UTI), specifically cystitis, which is characterized by burning upon urination. The absence of costovertebral angle tenderness suggests that the infection is not extending into the kidneys, which would be indicated by flank pain and fever. The patient is pregnant, which is a significant factor in choosing an appropriate antibiotic.

Ampicillin (A) is not the best choice because it is not typically used for uncomplicated cystitis in pregnant women due to its potential for causing diarrhea and other gastrointestinal side effects.

Ceftriaxone (B) is a broad-spectrum antibiotic that is generally reserved for more severe infections or when there is a high suspicion of pyelonephritis. It is not the first-line treatment for uncomplicated cystitis.

Ciprofloxacin (C) is contraindicated in pregnancy due to its potential to cause harm to the developing fetus, p
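
For multiple-choice prompts like this one, it is often useful to pull the selected option out of the free-text answer, for example to score the model against an answer key. A minimal sketch follows, assuming the answer is stated in the `X: name` form seen in the first line of the response above; `extract_choice` is a hypothetical helper.

```python
import re

def extract_choice(response, options="ABCDE"):
    """Return the first option letter written as 'X:' in the response, or None."""
    match = re.search(rf"\b([{options}]):", response)
    return match.group(1) if match else None

sample_response = "The best treatment for this patient is E: Nitrofurantoin."
choice = extract_choice(sample_response)
print(choice)
```

Mentions written as "Ampicillin (A)" in the explanation are not matched, because the pattern requires a colon directly after the letter.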

## Open-Source LLM

### mistral-7b

In [60]:
# ! pip install huggingface-hub

!huggingface-cli download TheBloke/Mistral-7B-v0.1-GGUF mistral-7b-v0.1.Q4_K_M.gguf --local-dir . --local-dir-use-symlinks False


Downloading 'mistral-7b-v0.1.Q4_K_M.gguf' to '.cache/huggingface/download/rAdFtzyS6dBYqo0ywrkBjjqajgw=.ce6253d2e91adea0c35924b38411b0434fa18fcb90c52980ce68187dbcbbe40c.incomplete'
mistral-7b-v0.1.Q4_K_M.gguf: 100% 4.37G/4.37G [00:31<00:00, 140MB/s] 
Download complete. Moving file to mistral-7b-v0.1.Q4_K_M.gguf
mistral-7b-v0.1.Q4_K_M.gguf


In [61]:
llm_loader = LLMLoader(spark)

In [62]:
%%time

llm_loader\
    .setUseChatTemplate(True)\
    .setTemperature(0.0)\
    .setStopStrings(["<|im_end|>"])\
    .loadGGUF("./mistral-7b-v0.1.Q4_K_M.gguf")

CPU times: user 3.29 ms, sys: 1 ms, total: 4.29 ms
Wall time: 16.3 s


com.johnsnowlabs.ml.gguf.LLMLoader@c7c4c97
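
The `setStopStrings` call above tells the loader where generation should end. As a purely conceptual illustration of what a stop string does, the sketch below cuts a piece of text at the first stop string it contains. It is not how the llama.cpp backend implements stopping (that happens during generation), and `truncate_at_stop` and the sample text are hypothetical.

```python
def truncate_at_stop(text, stop_strings):
    """Cut text at the earliest occurrence of any stop string."""
    cut = len(text)
    for stop in stop_strings:
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]

raw = "Methadone is used to treat opioid addiction.<|im_end|>\n<|im_start|>user\nNext question"
trimmed = truncate_at_stop(raw, ["<|im_end|>"])
print(trimmed)
```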

In [63]:
%%time
prompt = "What is the indication for the drug Methadone?"
response = llm_loader.generate(prompt)

CPU times: user 2.95 ms, sys: 2.02 ms, total: 4.96 ms
Wall time: 20.4 s


In [64]:
response

'Methadone is used to treat opioid addiction. It is a long-acting opioid that can help reduce cravings and withdrawal symptoms in people who are addicted to other opioids, such as heroin or prescription painkillers. Methadone is also used to treat chronic pain, but it is not as commonly used for this purpose as other opioids.'