![JohnSnowLabs](https://nlp.johnsnowlabs.com/assets/images/logo.png)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/named_entity_recognition_NER/NLU_explain_clinical_doc_vop_pipeline.ipynb)

# Explain Clinical Document - Voice Of Patient (VOP)

This pipeline processes documents drawn from the patient's own sentences. It is designed to:

- extract all healthcare-related entities

- assign assertion status to the extracted entities

- establish relations between the extracted entities

To accomplish these tasks, the pipeline uses six NER models, one assertion model, and one relation extraction model.

In [None]:
! pip install nlu pyspark==3.1.2

In [None]:
! pip install johnsnowlabs

In [1]:
import json, os
from google.colab import files

if 'spark_jsl.json' not in os.listdir():
    license_keys = files.upload()
    os.rename(list(license_keys.keys())[0], 'spark_jsl.json')

with open('spark_jsl.json') as f:
    license_keys = json.load(f)

# Defining license key-value pairs as local variables
locals().update(license_keys)
os.environ.update(license_keys)
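
The later cells read `PUBLIC_VERSION`, `JSL_VERSION`, and `SECRET` from the license file, so a missing key would only surface as a failed install or a failed session start. A minimal sketch of an early check, assuming those three keys are the only ones this notebook needs; `missing_license_keys` is a hypothetical helper and not part of any John Snow Labs library:

```python
# Keys this notebook reads later: $PUBLIC_VERSION and $JSL_VERSION in the pip
# commands, and SECRET for the package index URL and the Spark session.
REQUIRED_KEYS = ("SECRET", "PUBLIC_VERSION", "JSL_VERSION")


def missing_license_keys(keys, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in `keys`."""
    return [name for name in required if not keys.get(name)]


# Example with a made-up, incomplete license dictionary.
example = {"SECRET": "xxxx", "PUBLIC_VERSION": "5.3.1"}
print(missing_license_keys(example))  # prints ['JSL_VERSION']
```

In the notebook this could be called as `missing_license_keys(license_keys)` right after the JSON file is loaded; an empty list means all three keys are present.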

In [3]:
# Installing pyspark and spark-nlp
! pip install --upgrade -q pyspark==3.1.2 spark-nlp==$PUBLIC_VERSION

# Installing NLU
! pip install --upgrade -q nlu --no-dependencies

# Installing Spark NLP Healthcare
! pip install --upgrade -q spark-nlp-jsl==$JSL_VERSION  --extra-index-url https://pypi.johnsnowlabs.com/$SECRET

# Installing Spark NLP Display Library for visualization
! pip install -q spark-nlp-display

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
johnsnowlabs 5.3.2 requires pyspark==3.4.0, but you have pyspark 3.1.2 which is incompatible.

In [2]:
import json
import os

import sparknlp
import sparknlp_jsl
import nlu

from sparknlp.base import *
from sparknlp.annotator import *
from sparknlp_jsl.annotator import *

from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.ml import Pipeline,PipelineModel

import pandas as pd
pd.set_option('display.max_colwidth', 200)

import warnings
warnings.filterwarnings('ignore')

# Spark session settings: driver memory, Kryo serializer buffer limit, and maximum result size
params = {"spark.driver.memory":"16G",
          "spark.kryoserializer.buffer.max":"2000M",
          "spark.driver.maxResultSize":"2000M"}

print("Spark NLP Version :", sparknlp.version())
print("Spark NLP_JSL Version :", sparknlp_jsl.version())

spark = sparknlp_jsl.start(license_keys['SECRET'],params=params)

spark

Spark NLP Version : 5.3.1
Spark NLP_JSL Version : 5.3.1
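
The pip output earlier in this notebook reports that `johnsnowlabs 5.3.2` expects `pyspark==3.4.0` while `pyspark 3.1.2` is installed. A minimal sketch for listing the installed versions in one place, using only the standard library; the `installed_versions` helper and the package list are illustrative assumptions:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Distribution names as they appear in the pip commands of this notebook.
for name, ver in installed_versions(
    ["pyspark", "spark-nlp", "spark-nlp-jsl", "johnsnowlabs", "nlu"]
).items():
    print(f"{name}: {ver or 'not installed'}")
```

This only reports what is installed; it does not decide which combination of versions is compatible.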


In [3]:
pipe = nlu.load("en.explain_doc.pipeline_vop")

explain_clinical_doc_radiology download started this may take some time.
Approx size to download 1.7 GB
[OK!]


In [4]:
text = ["""I had been feeling really tired all the time and was losing weight without even trying. My doctor checked my sugar levels and they came out to be high. So, I have type 2 diabetes.
He put me on two medications - I take metformin 500 mg twice a day, and glipizide 5 mg before breakfast and dinner. I also have to watch what I eat and try to exercise more.
Now, I also have chronic acid reflux disease or GERD. Now I take daily omeprazole 20 mg to control the heartburn symptoms."""]

In [5]:
df = pipe.predict(text)

Your Spark-Healthcare is outdated, installed==5.3.1 but latest version==5.3.0
You can run nlp.install() to update Spark-Healthcare


In [6]:
df

| Column | Value (row 0) |
|---|---|
| `assertion` | [Suspected, Confirmed, Confirmed, Confirmed, Suspected, Confirmed] |
| `document` | I had been feeling really tired all the time and was losing weight without even trying. My doctor checked my sugar levels and they came out to be high. So, I have type 2 diabetes.\nHe put me on tw... |
| `entities_jsl_ner_chunk` | [feeling really tired, losing weight, sugar levels, high, He, metformin, twice a day, glipizide, before breakfast and dinner, acid reflux disease, GERD, daily, omeprazole, heartburn symptoms] |
| `entities_jsl_ner_chunk_class` | [Symptom, Symptom, Test, Test_Result, Gender, Drug, Frequency, Drug, Frequency, Disease_Syndrome_Disorder, Disease_Syndrome_Disorder, Frequency, Drug, Symptom] |
| `entities_jsl_ner_chunk_confidence` | [0.42706665, 0.7063, 0.81455, 0.8967, 1.0, 0.978, 0.86443335, 0.999, 0.866575, 0.5604667, 0.9963, 0.9646, 0.9959, 0.70985] |
| `entities_jsl_ner_chunk_origin_chunk` | [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13] |
| `entities_jsl_ner_chunk_origin_sentence` | [0, 0, 1, 1, 3, 3, 3, 3, 3, 5, 5, 6, 6, 6] |
| `entities_ner_chexpert_chunk` | [] |
| `entities_ner_oncology_chunk` | [He] |
| `entities_ner_oncology_chunk_class` | [Gender] |
| ... | ... |
| `entities_radiology_ner_chunk_class` | [Symptom, Test, Disease_Syndrome_Disorder, Measurements, Units, Measurements, Units, Disease_Syndrome_Disorder, Disease_Syndrome_Disorder, Measurements, Units, Symptom] |
| `entities_radiology_ner_chunk_confidence` | [0.6786, 0.40135002, 0.4851, 0.9707, 0.9833, 0.9051, 0.9769, 0.643025, 0.9902, 0.8762, 0.9376, 0.7716] |
| `entities_radiology_ner_chunk_origin_chunk` | [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11] |
| `entities_radiology_ner_chunk_origin_sentence` | [0, 1, 2, 3, 3, 3, 3, 5, 5, 6, 6, 6] |
| `matched_pos` | [feeling really tired, losing weight, sugar levels, high, diabetes, He, two medications, metformin, 500, mg, twice a day, glipizide, 5, mg, before breakfast and dinner, chronic acid reflux disease... |
| `pos` | [MC, VHD, VBN, VVG, RR, VVNJ, DB, DD, NN, CC, VBD, VVGJ, NN, II, RR, VVGJ, NN, NN, NN, VVD, NN, NN, NNS, CC, PN, VVB, II, TO, VBI, JJ, NN, NN, NN, MC, VHB, NN, MC, NN, NN, NN, NN, NN, II, MC, NNS,... |
| `relations` | |
| `sentence_dl` | [I had been feeling really tired all the time and was losing weight without even trying., My doctor checked my sugar levels and they came out to be high., So, I have type 2 diabetes., He put me on... |
| `unlabeled_dependency` | [feeling, feeling, feeling, ROOT, tired, feeling, time, time, tired, losing, losing, weight, tired, trying, trying, feeling, feeling, doctor, checked, ROOT, levels, levels, checked, came, came, ch... |
| `word_embedding_embeddings` | [[0.09337666630744934, 0.031265825033187866, 0.152923122048378, -0.24998794496059418, 0.49187055230140686, -0.44001245498657227, 0.14361239969730377, -0.3373923599720001, 0.1620967984199524, 0.066... |
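
The entity columns in this output are parallel lists: position `i` of `entities_jsl_ner_chunk`, `entities_jsl_ner_chunk_class`, and `entities_jsl_ner_chunk_confidence` all describe the same entity. A minimal sketch of turning them into one record per entity, with an optional confidence cutoff; `entities_to_rows` is a hypothetical helper, and the sample lists are the first five values copied from the row shown above:

```python
def entities_to_rows(chunks, classes, confidences, min_confidence=0.0):
    """Zip the parallel list columns into one dict per entity."""
    if not (len(chunks) == len(classes) == len(confidences)):
        raise ValueError("entity columns must have the same length")
    return [
        {"chunk": chunk, "class": label, "confidence": score}
        for chunk, label, score in zip(chunks, classes, confidences)
        if score >= min_confidence
    ]


# First five entities copied from the output row above.
chunks = ["feeling really tired", "losing weight", "sugar levels", "high", "He"]
classes = ["Symptom", "Symptom", "Test", "Test_Result", "Gender"]
confidences = [0.42706665, 0.7063, 0.81455, 0.8967, 1.0]

# Keep only entities the model scored at 0.8 or higher.
for row in entities_to_rows(chunks, classes, confidences, min_confidence=0.8):
    print(row)
```

With a real result, the three lists would come from row 0 of `df` under the column names above, assuming the cells hold Python lists as displayed.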
