![JohnSnowLabs](https://nlp.johnsnowlabs.com/assets/images/logo.png)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp-workshop/blob/master/tutorials/streamlit_notebooks/healthcare/NER_OPIOID.ipynb)

## Setup

In [None]:
import json
import os

from google.colab import files

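# Upload the license JSON file through the Colab file picker
uploaded = files.upload()

# Parse the first uploaded file into a dict of license keys
with open(list(uploaded.keys())[0]) as f:
    license_keys = json.load(f)

# Expose the keys as notebook variables and as environment variables so that
# $PUBLIC_VERSION, $JSL_VERSION and $SECRET resolve in the install cell below
locals().update(license_keys)
os.environ.update(license_keys)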
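
Before running the install cell, it can help to confirm that the uploaded license file contains the keys the rest of this notebook refers to. The following is a minimal sketch, not part of the Spark NLP API: the helper name `missing_license_keys` is hypothetical, the list of required keys is inferred only from the cells in this notebook (`SECRET`, `PUBLIC_VERSION`, `JSL_VERSION`), and the sample values are placeholders. A real license file may contain additional keys.

```python
import json

# Keys referenced later in this notebook (assumption: inferred from the
# install and session-start cells, not from any official schema).
REQUIRED_KEYS = ("SECRET", "PUBLIC_VERSION", "JSL_VERSION")


def missing_license_keys(keys):
    """Return the required key names that are absent or empty in `keys`."""
    return [name for name in REQUIRED_KEYS if not keys.get(name)]


# Placeholder content standing in for an uploaded license file.
example_keys = json.loads('{"SECRET": "placeholder", "PUBLIC_VERSION": "placeholder"}')

missing = missing_license_keys(example_keys)
if missing:
    print("License file is missing:", ", ".join(missing))
```

In the notebook itself, the same check could be run as `missing_license_keys(license_keys)` right after the file is parsed.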

In [None]:
# Installing pyspark and spark-nlp
! pip install --upgrade -q pyspark==3.1.2 spark-nlp==$PUBLIC_VERSION

# Installing Spark NLP Healthcare
! pip install --upgrade -q spark-nlp-jsl==$JSL_VERSION  --extra-index-url https://pypi.johnsnowlabs.com/$SECRET

# Installing Spark NLP Display Library for visualization
! pip install -q spark-nlp-display

In [3]:
import sparknlp
import sparknlp_jsl

from sparknlp.base import *
from sparknlp.annotator import *
from sparknlp_jsl.annotator import *
from sparknlp_jsl.pretrained import InternalResourceDownloader

from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.ml import Pipeline, PipelineModel

import pandas as pd
pd.set_option('display.max_colwidth', 200)

import warnings
warnings.filterwarnings('ignore')

params = {"spark.driver.memory":"16G",
          "spark.kryoserializer.buffer.max":"2000M",
          "spark.driver.maxResultSize":"2000M"}

print("Spark NLP Version :", sparknlp.version())
print("Spark NLP_JSL Version :", sparknlp_jsl.version())

spark = sparknlp_jsl.start(license_keys['SECRET'], params=params)

spark

Spark NLP Version : 5.2.0
Spark NLP_JSL Version : 5.2.0


# ner_opioid_small_wip

In [4]:
document_assembler = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl_healthcare","en","clinical/models")\
    .setInputCols(["document"])\
    .setOutputCol("sentence")

tokenizer = Tokenizer()\
    .setInputCols(["sentence"])\
    .setOutputCol("token")

clinical_embeddings = WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models")\
    .setInputCols(["sentence", "token"])\
    .setOutputCol("embeddings")

ner_model = MedicalNerModel.pretrained("ner_opioid_small_wip", "en", "clinical/models")\
    .setInputCols(["sentence", "token","embeddings"])\
    .setOutputCol("ner")

ner_converter = NerConverterInternal()\
    .setInputCols(["sentence", "token", "ner"])\
    .setOutputCol("ner_chunk")

pipeline = Pipeline(stages=[
    document_assembler,
    sentenceDetector,
    tokenizer,
    clinical_embeddings,
    ner_model,
    ner_converter
    ])

sentence_detector_dl_healthcare download started this may take some time.
Approximate size to download 367.3 KB
[OK!]
embeddings_clinical download started this may take some time.
Approximate size to download 1.6 GB
[OK!]
ner_opioid_small_wip download started this may take some time.
[OK!]


In [5]:
text_list = [

"""  20 year old male transferred from [**Hospital1 112**] for liver transplant evaluation after percocet overdose.
On Sunday [**3-27**] had a stressful day and pt took approximately 20 percocet (5/325) throughout the day after
a series of family arguments. Denies trying to hurt himself. Parents confirm to suicidal attempts in the past.
On Monday, the patient felt the effects of 'Percocet withdrawal' and took an additional 5 Percocet.
Pt was admitted to the SICU and followed by Liver, Transplant, Toxicology, and [**Month/Year (2) **].
He was started on NAC q4hr with gradual decline in LFT's and INR.  His recovery was c/b hypertension,
for which he was started on clonidine.  Pt was  transferred to the floor on [**4-1**].

Past Medical History:
Bipolar D/o (s/p suicide attempts in the past)
ADHD
S/p head injury [**2160**]: s/p MVA with large L3 transverse process
fx, small right frontal epidural hemorrhage-- with
post-traumatic seizures (was previously on dilantin, now dc'd)

Social History:
Father is HCP, student in [**Name (NI) 108**], Biology major, parents and brother live in
[**Name (NI) 86**], single without children, lived in a group home for 3 years as a teenager,
drinks alcohol 1 night a week, denies illict drug use, pt in [**Location (un) 86**] for neuro eval
""",

"""Discharge Disposition:
Home

Discharge Diagnosis:
Primary:
Methadone overdose

Secondary:
1. HCV/Alcoholic cirrhosis status post TIPS in ___
2. Type 2 diabetes mellitus
3. Chronic abdominal pain


Discharge Condition:
Afebrile, vital signs stable.


Discharge Instructions:
You were admitted to the hospital after an overdose of
methadone.  You should take your methadone only as prescribed.
Do not take extra pills no matter what.  Call your primary care
doctor if your pain is not controlled on the dose of methadone
you were prescribed.  You were only discharged with 5 pills of
methadone 10mg.  Take only one pill every 12 hours.  Do not take
extra no matter what.

You were only given enough methadone to cover you until your
appointment with Dr. ___ on ___.  If you take more than
10mg twice per day you will run out early and will not be given
extra methadone.""",

"""History of Present Illness:
___ with EtOH cirrhosis, h/o SIs, HTN brought to ED after
excessive EtOH followed by ?seroquel and tramadol intoxication,
intubated for AMS.

Pt has longstanding history of EtOH abuse. Has been excessively
drinking the day PTA in ___. His family brought him
back to ___ at 3AM. At about 5AM, he took a large amount of
pills out of a bag, likely seroquel and/or ultram per family
report (unkown quantity). He was driven to the ED, had
difficulties getting out of the car.
In the ED, his VS were stable (T96.7, 80, 91/65, 19, 100%RA) but
his pupils were 1mm b/l and he did not respond to painful
stimuli. EtOH level of 225. Serum tox positive for BZD. He was
given Narcan 0.4, followed by 2 mg with no improvement. EKG was
unremarkable. Head CT without acute findings. A nasal airway was
placed, pt did not have a gag. He was intubated without
complications. He also received 50 gm of activated charcoal
100mg of IV thiamine. Toxicology was consulted. His meds in his
bag were reviewed and were identified as seroquel, tramadol,
klonopin, antabuse and clonidin. Recommendations were supportive
care only. Post-intubation CXR with signs of lingular PNA. Pt
received levaquin and was admitted to the ICU for further care.

On arrival to the ICU, pt was intubated, sedated on propofol
gtt.

ROS could not be obtained.""",

"""Opioid abuse: Although the patient claims to be clean since
___, track marks on her arms and the history from ___ suggest
more recent use. We continued treatment with 20mg methadone TID
and transitioned her 30mg BID, ultimately to be on 60mg daily.
She was referred to a ___ clinic for follow-up.  Her QTc
on ___ on a stable amount of methadone was 462.

TRANSITIONAL ISSUES:
# CODE: Full
# CONTACT: Husband, ___ - does not have a phone

[ ] MEDICATION CHANGES:
- Added: Methadone 60mg PO daily, metoprolol succinate 25mg
daily, ASA 81mg daily
- Stopped: PO hydromorphone, metoprolol tartrate

[ ] METHADONE TREATMENT:
- Pt will be followed by the Habit ___ clinic on ___.
She will have her next-day dosing on ___.
- Her last dose of methadone was 60mg PO. It was given at 0952
on ___.
- QTc on ___ was 426 by ECG."""
]

In [6]:
empty_df = spark.createDataFrame([['']]).toDF("text")

pipelineModel = pipeline.fit(empty_df)

In [7]:
df = spark.createDataFrame(pd.DataFrame({"text": text_list}))
result = pipelineModel.transform(df)
# Zip the parallel chunk arrays, explode to one row per chunk, then name the columns
result.select(
    F.explode(F.arrays_zip(result.ner_chunk.result,
                           result.ner_chunk.begin,
                           result.ner_chunk.end,
                           result.ner_chunk.metadata)).alias("cols")
).select(
    F.expr("cols['0']").alias("ner_chunk"),
    F.expr("cols['1']").alias("begin"),
    F.expr("cols['2']").alias("end"),
    F.expr("cols['3']['entity']").alias("ner_label")
).show(60, truncate=False)

result

+----------------------+-----+----+----------------------+
|ner_chunk             |begin|end |ner_label             |
+----------------------+-----+----+----------------------+
|percocet              |94   |101 |opioid_drug           |
|20                    |181  |182 |drug_quantity         |
|percocet              |184  |191 |opioid_drug           |
|suicidal attempts     |307  |323 |psychiatric_issue     |
|Percocet              |383  |390 |opioid_drug           |
|withdrawal            |392  |401 |general_symptoms      |
|5                     |427  |427 |drug_quantity         |
|Percocet              |429  |436 |opioid_drug           |
|NAC                   |563  |565 |other_drug            |
|q4hr                  |567  |570 |drug_frequency        |
|decline in LFT's      |585  |600 |general_symptoms      |
|clonidine             |676  |684 |other_drug            |
|Bipolar               |758  |764 |psychiatric_issue     |
|suicide attempts      |775  |790 |psychiatric_issue     |

DataFrame[text: string, document: array<struct<annotatorType:string,begin:int,end:int,result:string,metadata:map<string,string>,embeddings:array<float>>>, sentence: array<struct<annotatorType:string,begin:int,end:int,result:string,metadata:map<string,string>,embeddings:array<float>>>, token: array<struct<annotatorType:string,begin:int,end:int,result:string,metadata:map<string,string>,embeddings:array<float>>>, embeddings: array<struct<annotatorType:string,begin:int,end:int,result:string,metadata:map<string,string>,embeddings:array<float>>>, ner: array<struct<annotatorType:string,begin:int,end:int,result:string,metadata:map<string,string>,embeddings:array<float>>>, ner_chunk: array<struct<annotatorType:string,begin:int,end:int,result:string,metadata:map<string,string>,embeddings:array<float>>>]
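
The table above is easier to read once two details are clear: `arrays_zip` plus `explode` turns the parallel arrays inside `ner_chunk` into one row per chunk, and `begin`/`end` are character offsets into the original text with `end` inclusive. The sketch below is a plain-Python analogue that needs no Spark session. The helper `chunk_text` is a hypothetical name; the chunk values are copied from the first three rows of the table, and only the first chunk is checked against the text because only the first sentence of the note is reproduced here.

```python
# Parallel arrays as held in the `ner_chunk` column for one document
# (values copied from the first three rows of the table above).
chunks = ["percocet", "20", "percocet"]
begins = [94, 181, 184]
ends = [101, 182, 191]
labels = ["opioid_drug", "drug_quantity", "opioid_drug"]

# Plain-Python analogue of arrays_zip + explode: one tuple per chunk.
rows = list(zip(chunks, begins, ends, labels))

# First sentence of the first sample note, two leading spaces included.
text = ("  20 year old male transferred from [**Hospital1 112**] "
        "for liver transplant evaluation after percocet overdose.")


def chunk_text(text, begin, end):
    """Slice a chunk out of `text`; `end` is inclusive, hence the + 1."""
    return text[begin:end + 1]


first_chunk, first_begin, first_end, first_label = rows[0]
recovered = chunk_text(text, first_begin, first_end)
print(first_label, repr(recovered), recovered == first_chunk)
```

Slicing with `text[begin:end]` instead would drop the last character of every chunk, which is a common source of off-by-one errors when mapping entities back to the source note.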

In [8]:
from sparknlp_display import NerVisualizer

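df = spark.createDataFrame(pd.DataFrame({'text': text_list}))
# Reuse the already fitted pipelineModel instead of refitting the pipeline
result = pipelineModel.transform(df)

visualiser = NerVisualizer()

# Collect once; calling collect() inside the loop would rerun the pipeline per document
rows = result.collect()

for row in rows:
    visualiser.display(result=row, label_col='ner_chunk', document_col='document')
    print("\n\n")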
