![JohnSnowLabs](https://nlp.johnsnowlabs.com/assets/images/logo.png)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp-workshop/blob/master/tutorials/streamlit_notebooks/healthcare_jsl/ASSERTION_RADIOLOGY.ipynb)

# `assertion_dl_radiology` **Models**

This notebook extracts radiology entities with the `ner_radiology` NER model and assigns an assertion status to each of them with the pretrained `assertion_dl_radiology` model.

## 1. Colab Setup

In [None]:
# Install the johnsnowlabs library to access Spark-OCR and Spark-NLP for Healthcare, Finance, and Legal.
! pip install -q johnsnowlabs

In [None]:
from google.colab import files
print("Please Upload your John Snow Labs License using the button below")
license_keys = files.upload()

In [None]:
from johnsnowlabs import *

# After uploading your license, run this to install all licensed Python wheels and pre-download the JARs for the Spark session JVM.
# Restart your notebook afterwards for the changes to take effect.

jsl.install()

## 2. Start Session

In [None]:
from johnsnowlabs import *
# Automatically load the license data and start a session with all JARs the user has access to
spark = jsl.start()

## 3. Select the model and construct the pipeline

In [None]:
NER_MODEL_NAME = "ner_radiology"
ASSERTION_MODEL_NAME = "assertion_dl_radiology"

**Create the pipeline**

In [None]:
document_assembler = nlp.DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

sentence_detector = nlp.SentenceDetector() \
    .setInputCols(['document'])\
    .setOutputCol('sentence')

tokenizer = nlp.Tokenizer()\
    .setInputCols(["sentence"])\
    .setOutputCol("token")

embeddings = nlp.WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models") \
    .setInputCols(["sentence", 'token'])\
    .setOutputCol("embeddings")

ner = medical.NerModel.pretrained(NER_MODEL_NAME, "en", "clinical/models") \
    .setInputCols(["sentence", "token", "embeddings"]) \
    .setOutputCol("ner")

ner_converter = medical.NerConverterInternal() \
    .setInputCols(["sentence", "token", "ner"]) \
    .setOutputCol("ner_chunk")\
    .setWhiteList(["ImagingFindings"])

radiology_assertion = medical.AssertionDLModel.pretrained(ASSERTION_MODEL_NAME, "en", "clinical/models") \
    .setInputCols(["sentence", "ner_chunk", "embeddings"]) \
    .setOutputCol("assertion")


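# Explicit import so this cell does not depend on what the star import exposes
from pyspark.ml import Pipeline

nlp_pipeline = Pipeline(
    stages = [
        document_assembler,
        sentence_detector,
        tokenizer,
        embeddings,
        ner,
        ner_converter,
        radiology_assertion
    ])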


embeddings_clinical download started this may take some time.
Approximate size to download 1.6 GB
[OK!]
ner_radiology download started this may take some time.
[OK!]
assertion_dl_radiology download started this may take some time.
[OK!]
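
The `ner_converter` stage turns token-level NER tags into entity chunks and, through `setWhiteList(["ImagingFindings"])`, keeps only chunks of that type, so only imaging findings reach the assertion model. The idea can be sketched in plain Python. This is a simplified illustration, not the library's implementation: the function `iob_to_chunks`, the IOB tags in the example, and the `OtherEntity` label are all hypothetical.

```python
def iob_to_chunks(tokens, tags, whitelist=None):
    """Group IOB-tagged tokens into (chunk_text, entity) pairs.

    Simplified illustration only: chunks whose entity is not in
    `whitelist` are dropped (whitelist=None keeps everything).
    """
    chunks = []
    current_tokens, current_entity = [], None

    def flush():
        # Close the chunk being built, keeping it only if it is whitelisted
        nonlocal current_tokens, current_entity
        if current_tokens and (whitelist is None or current_entity in whitelist):
            chunks.append((" ".join(current_tokens), current_entity))
        current_tokens, current_entity = [], None

    for token, tag in zip(tokens, tags):
        prefix, _, entity = tag.partition("-")
        if prefix == "I" and entity == current_entity:
            # Continuation of the chunk that is currently open
            current_tokens.append(token)
            continue
        flush()
        if prefix in ("B", "I"):
            # A new chunk starts here (a stray I- tag is treated like B-)
            current_tokens, current_entity = [token], entity
    flush()
    return chunks


# Hypothetical tags for a fragment of the first sample note
tokens = ["Mild", "aortic", "stenosis", ",", "widely", "calcified"]
tags = ["O", "B-OtherEntity", "B-ImagingFindings", "O", "O", "B-ImagingFindings"]

kept = iob_to_chunks(tokens, tags, whitelist=["ImagingFindings"])
print(kept)
```

With the whitelist in place, the `OtherEntity` chunk is discarded and only the two `ImagingFindings` chunks remain.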


## 4. Create example inputs

In [None]:
sample_text = [
"""HSample Name: 2-D Doppler. Description: Normal left ventricle, moderate biatrial enlargement, and likely mild tricuspid regurgitation, but only mild increase in right heart pressures. 2-D STUDY 1. Mild aortic stenosis, widely calcified, minimally restricted. 2. Likely mild left ventricular hypertrophy but normal systolic function. 3. Moderate biatrial enlargement. 4. Normal right ventricle. 5. Normal appearance of the tricuspid and mitral valves. 6. Normal left ventricle and left ventricular systolic function. DOPPLER 1. There is 1 to 2+ aortic regurgitation likely seen, but no aortic stenosis. 2. Mild tricuspid regurgitation with only mild increase in right heart pressures, 30-35 mmHg maximum. SUMMARY 1. Normal left ventricle. 2. Moderate biatrial enlargement. 3. Mild tricuspid regurgitation, but only mild increase in right heart pressures.""",

"""Description: 2-D M-Mode. Doppler. 2-D M-MODE: 1. Left atrial enlargement with left atrial diameter of 4.7 cm. 2. Normal size right and left ventricle. 3. Normal LV systolic function with left ventricular ejection fraction of 51%. 4. Likely normal LV diastolic function. 5. No pericardial effusion. 6. Normal morphology of aortic valve, mitral valve, tricuspid valve, and pulmonary valve. 7. PA systolic pressure is 36 mmHg. DOPPLER: 1. Mild mitral and tricuspid regurgitation. 2. Trace aortic and pulmonary regurgitation.""",

"""Description: 2-D Echocardiogram. COMMENTS: 1. The left ventricular cavity size and wall thickness appear normal. The wall motion and left ventricular systolic function appears hyperdynamic with estimated ejection fraction of 70% to 75%. There is near-cavity obliteration seen. There also appears to be increased left ventricular outflow tract gradient at the mid cavity level consistent with hyperdynamic left ventricular systolic function. There is abnormal left ventricular relaxation pattern seen as well as elevated left atrial pressures seen by Doppler examination. 2. The left atrium likely appears mildly dilated. 3. The right atrium and right ventricle appear normal. 4. The aortic root appears normal. 5. The aortic valve appears calcified with mild aortic valve stenosis, likely calculated aortic valve area is 1.3 cm square with a maximum instantaneous gradient of 34 and a mean gradient of 19 mm. 6. There is mitral annular calcification extending to leaflets and supportive structures with thickening of mitral valve leaflets with mild mitral regurgitation. 7. The tricuspid valve appears normal with trace tricuspid regurgitation with moderate pulmonary artery hypertension. Estimated pulmonary artery systolic pressure is 49 mmHg. Estimated right atrial pressure of 10 mmHg. 8. The pulmonary valve appears normal with trace pulmonary insufficiency. 9. There is no pericardial effusion or intracardiac mass seen. 10. There is a color Doppler suggestive of a patent foramen ovale with lipomatous hypertrophy of the interatrial septum. 11. The study was somewhat technically limited and hence subtle abnormalities could be missed from the study.""",

"""Description: 2-D Echocardiogram. 2-D ECHOCARDIOGRAM Multiple views of the heart and great vessels reveal normal intracardiac and great vessel relationships. Cardiac function is normal. There is no significant chamber enlargement or hypertrophy. There is no pericardial effusion or vegetations seen. Doppler interrogation, including color flow imaging, reveals systemic venous return to the right atrium with normal tricuspid inflow. Pulmonary outflow is normal at the valve. Pulmonary venous return is to the left atrium. The interatrial septum is intact. Mitral inflow and ascending aorta flow are normal. The aortic valve is trileaflet. The coronary arteries likely appear to be normal in their origins. The aortic arch is left-sided and patent with likely normal descending aorta pulsatility.""",

"""Description: Echocardiogram and Doppler. DESCRIPTION: 1. Likely normal cardiac chambers size. 2. Normal left ventricular size. 3. Normal LV systolic function. Ejection fraction estimated around 60%. 4. Aortic valve seen with good motion. 5. Mitral valve seen with good motion. 6. Tricuspid valve seen with good motion. 7. No pericardial effusion or intracardiac masses. DOPPLER: 1. Likely trace mitral regurgitation. 2. Trace tricuspid regurgitation. IMPRESSION: 1. Normal LV systolic function. 2. Ejection fraction estimated around 60%."""
]


In [None]:
from pyspark.sql.types import StringType

df = spark.createDataFrame(sample_text, StringType()).toDF('text')

df.show(truncate = 100)

+----------------------------------------------------------------------------------------------------+
|                                                                                                text|
+----------------------------------------------------------------------------------------------------+
|HSample Name: 2-D Doppler. Description: Normal left ventricle, moderate biatrial enlargement, and...|
|Description: 2-D M-Mode. Doppler. 2-D M-MODE: 1. Left atrial enlargement with left atrial diamete...|
|Description: 2-D Echocardiogram. COMMENTS: 1. The left ventricular cavity size and wall thickness...|
|Description: 2-D Echocardiogram. 2-D ECHOCARDIOGRAM Multiple views of the heart and great vessels...|
|Description: Echocardiogram and Doppler. DESCRIPTION: 1. Likely normal cardiac chambers size. 2. ...|
+----------------------------------------------------------------------------------------------------+
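
Each cell above ends in `...` because `show(truncate = 100)` shortens long strings; the full notes are still stored in the DataFrame. The sketch below reproduces that shortening in plain Python, assuming Spark keeps the first `truncate - 3` characters and appends `...`, which matches the output shown. The helper name `truncate_cell` is hypothetical.

```python
def truncate_cell(value: str, truncate: int = 20) -> str:
    """Shorten a string the way the table output above appears to be shortened."""
    if truncate > 0 and len(value) > truncate:
        if truncate < 4:
            # Too short to fit an ellipsis, so just cut
            return value[:truncate]
        return value[: truncate - 3] + "..."
    return value


# Beginning of the first sample note (long enough to be truncated at 100)
first_note_start = (
    "HSample Name: 2-D Doppler. Description: Normal left ventricle, "
    "moderate biatrial enlargement, and likely mild tricuspid regurgitation"
)

shown = truncate_cell(first_note_start, truncate=100)
print(shown)
print(len(shown))
```

Passing `truncate = False` to `show` prints the complete text instead.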



## 5. Use the pipeline to create outputs

In [None]:
from pyspark.sql import functions as F

result = nlp_pipeline.fit(df).transform(df)

# Zip the parallel annotation arrays position by position, then emit one row per chunk
result.select(F.explode(F.arrays_zip(result.ner_chunk.result,
                                     result.ner_chunk.begin,
                                     result.ner_chunk.end,
                                     result.ner_chunk.metadata,
                                     result.assertion.result)).alias("cols"))\
        .select(F.expr("cols['0']").alias("chunk"),
                F.expr("cols['1']").alias("begin"),
                F.expr("cols['2']").alias("end"),
                F.expr("cols['3']['entity']").alias("entity"),
                F.expr("cols['4']").alias("assertion")).show(truncate=30)

+------------------------------+-----+---+---------------+---------+
|                         chunk|begin|end|         entity|assertion|
+------------------------------+-----+---+---------------+---------+
|         Normal left ventricle|   40| 60|ImagingFindings|Confirmed|
| moderate biatrial enlargement|   63| 91|ImagingFindings|Confirmed|
|  mild tricuspid regurgitation|  105|132|ImagingFindings|Confirmed|
|mild increase in right hear...|  144|181|ImagingFindings|Confirmed|
|                      stenosis|  209|216|ImagingFindings|Confirmed|
|                     calcified|  226|234|ImagingFindings|Confirmed|
|          minimally restricted|  237|256|ImagingFindings|Confirmed|
|                   hypertrophy|  291|301|ImagingFindings|Suspected|
|      normal systolic function|  307|330|ImagingFindings|Confirmed|
|                   enlargement|  354|364|ImagingFindings|Confirmed|
|        Normal right ventricle|  370|391|ImagingFindings|Confirmed|
|             Normal appearance|  
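
Two details help when reading this table: `arrays_zip` followed by `explode` pairs the parallel annotation arrays position by position and emits one row per chunk, and `begin`/`end` are character offsets into the input text with an inclusive `end`. The plain-Python sketch below illustrates both using the first three rows of the output. It is not Spark code, and the variable names are chosen for illustration only.

```python
# Beginning of the first sample note, copied from section 4
text = (
    "HSample Name: 2-D Doppler. Description: Normal left ventricle, "
    "moderate biatrial enlargement, and likely mild tricuspid regurgitation"
)

# Parallel arrays, as they sit in a single row of the annotation columns
chunks = [
    "Normal left ventricle",
    "moderate biatrial enlargement",
    "mild tricuspid regurgitation",
]
begins = [40, 63, 105]
ends = [60, 91, 132]
assertions = ["Confirmed", "Confirmed", "Confirmed"]

# arrays_zip pairs elements by position; explode then yields one row per tuple
rows = list(zip(chunks, begins, ends, assertions))

for chunk, begin, end, assertion in rows:
    # `end` is inclusive, so the Python slice needs end + 1
    assert text[begin:end + 1] == chunk
    print(f"{chunk!r:35} [{begin}, {end}] -> {assertion}")
```

The offsets count the leading `H` of `HSample`, so they only line up with the text exactly as it was passed to the pipeline.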

## 6. Visualize results

In [None]:
from sparknlp_display import AssertionVisualizer

viz = AssertionVisualizer()

# Collect the rows once instead of re-running the Spark job on every iteration
rows = result.collect()

for row in rows:
    viz.display(result = row, label_col = "ner_chunk", assertion_col = "assertion")
    print("\n\n")
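
Alongside the visual rendering, a quick tally of assertion labels gives a compact summary. The sketch below counts the labels of the eleven complete rows printed in section 5 using plain Python; the list `visible_assertions` is transcribed by hand from that output and is not produced by the pipeline here.

```python
from collections import Counter

# Assertion labels of the eleven complete rows shown in section 5, in order
visible_assertions = ["Confirmed"] * 7 + ["Suspected"] + ["Confirmed"] * 3

counts = Counter(visible_assertions)
for label, count in counts.most_common():
    print(f"{label}: {count}")
```

On the full result, grouping the exploded DataFrame from section 5 by its `assertion` column and counting should give the equivalent summary for every chunk.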
























