![JohnSnowLabs](https://nlp.johnsnowlabs.com/assets/images/logo.png)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp-workshop/blob/master/tutorials/streamlit_notebooks/healthcare/NORMALIZED_SECTION_HEADER_MAPPER.ipynb)

# **`normalized_section_header_mapper` model**

# **Colab Setup**

In [None]:
import json, os
from google.colab import files

if 'spark_jsl.json' not in os.listdir():
  license_keys = files.upload()
  os.rename(list(license_keys.keys())[0], 'spark_jsl.json')

with open('spark_jsl.json') as f:
    license_keys = json.load(f)

# Defining license key-value pairs as local variables
locals().update(license_keys)
os.environ.update(license_keys)

In [None]:
# Installing pyspark and spark-nlp
! pip install --upgrade -q pyspark==3.1.2 spark-nlp==$PUBLIC_VERSION

# Installing Spark NLP Healthcare
! pip install --upgrade -q spark-nlp-jsl==$JSL_VERSION  --extra-index-url https://pypi.johnsnowlabs.com/$SECRET

# Installing Spark NLP Display Library for visualization
! pip install -q spark-nlp-display

In [3]:
import sparknlp
import sparknlp_jsl

from sparknlp.base import *
from sparknlp.annotator import *
from sparknlp_jsl.annotator import *

from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.ml import Pipeline,PipelineModel
from pyspark.sql.types import StringType, IntegerType

import pandas as pd
pd.set_option('display.max_colwidth', 200)

import warnings
warnings.filterwarnings('ignore')

params = {"spark.driver.memory":"16G",
          "spark.kryoserializer.buffer.max":"2000M",
          "spark.driver.maxResultSize":"2000M"}

spark = sparknlp_jsl.start(license_keys['SECRET'],params=params)

print("Spark NLP Version :", sparknlp.version())
print("Spark NLP_JSL Version :", sparknlp_jsl.version())

spark

Spark NLP Version : 5.0.1
Spark NLP_JSL Version : 5.0.1


# **🔎 About the Models**

### 📌 **normalized_section_header_mapper**

This pretrained chunk mapper model normalizes section headers in clinical notes. It returns two levels of normalization, `level_1` and `level_2`.
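As a rough mental model of the two levels, the mapper can be pictured as a lookup from a raw header to one normalized value per level. The sketch below is plain Python and does not call Spark NLP. `TOY_HEADER_MAP` and `normalize_header` are hypothetical names, the entries are illustrative rather than the model's actual table, and both the assignment of values to `level_1` or `level_2` and the `"NONE"` fallback are assumptions.

```python
# Toy two-level lookup. The entries are illustrative only and are NOT the
# model's real mapping table; which value belongs to which level is assumed.
TOY_HEADER_MAP = {
    "ADMISSION DIAGNOSIS": {"level_1": "DIAGNOSIS", "level_2": "ADMISSION DIAGNOSIS"},
    "HOSPITAL COURSE": {"level_1": "COURSE TYPE", "level_2": "HOSPITAL COURSE"},
}


def normalize_header(header, level="level_1", default="NONE"):
    """Return the normalized form of a header at the requested level."""
    if level not in ("level_1", "level_2"):
        raise ValueError("level must be 'level_1' or 'level_2'")
    entry = TOY_HEADER_MAP.get(header.strip().upper())
    # Unknown headers fall back to a default value (assumed behaviour).
    return entry[level] if entry else default


print(normalize_header("Admission Diagnosis"))
print(normalize_header("Admission Diagnosis", level="level_2"))
```

In the real pipeline the level is chosen on the mapper stage itself, so this helper only illustrates the idea of one raw header having two normalized forms.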

# **🔎Define Spark NLP pipeline**

In [None]:
document_assembler = DocumentAssembler()\
      .setInputCol('text')\
      .setOutputCol('document')

sentence_detector = SentenceDetector()\
      .setInputCols(["document"])\
      .setOutputCol("sentence")

tokenizer = Tokenizer()\
      .setInputCols("sentence")\
      .setOutputCol("token")

embeddings = WordEmbeddingsModel.pretrained("embeddings_clinical", "en","clinical/models")\
      .setInputCols(["sentence", "token"])\
      .setOutputCol("word_embeddings")

clinical_ner = MedicalNerModel.pretrained("ner_jsl_slim", "en", "clinical/models")\
      .setInputCols(["sentence","token", "word_embeddings"])\
      .setOutputCol("ner")

ner_converter = NerConverterInternal()\
      .setInputCols(["sentence", "token", "ner"])\
      .setOutputCol("ner_chunk")\
      .setWhiteList(["Header"])

chunkerMapper = ChunkMapperModel.pretrained("normalized_section_header_mapper", "en", "clinical/models") \
      .setInputCols("ner_chunk")\
      .setOutputCol("mappings")\
      .setRel("level_1")  # or "level_2"

pipeline = Pipeline().setStages([
    document_assembler,
    sentence_detector,
    tokenizer,
    embeddings,
    clinical_ner,
    ner_converter,
    chunkerMapper])



embeddings_clinical download started this may take some time.
Approximate size to download 1.6 GB
[OK!]
ner_jsl_slim download started this may take some time.
[OK!]
normalized_section_header_mapper download started this may take some time.
[OK!]


# **🔎Sample Text**

In [None]:
sample_text = """ADMISSION DIAGNOSIS Upper respiratory illness with apnea, possible pertussis.
DISCHARGE DIAGNOSIS Upper respiratory illness with apnea, possible pertussis.
PATIENT HISTORY This is a one plus-month-old female with respiratory symptoms for approximately a week prior to admission.  This involved cough, post-tussive emesis, questionable fever, but only 99.7.  Their usual doctor prescribed amoxicillin over the phone.  The coughing persisted and worsened.  She went to the ER, where sats were normal at baseline, but dropped into the 80s with coughing spells.  They did witness some apnea.  They gave some Rocephin, did some labs, and the patient was transferred to hospital.
GENERAL HISTORY AND PHYSICAL On admission. there was some nasal discharge. Remainder of the HEENT was normal.Had few rhonchi,  No retractions,  No significant coughing or apnea during the admission physical. abdomen  benign.
RADIOGRAPHIC STUDIES She had a CBC done Garberville, which showed a white count of 12.4, with a differential of 10 segs, 82 lymphs, 8 monos, hemoglobin of 15, hematocrit 42, platelets 296,000, and a normal BMP.  An x-ray was done and I do not have an official interpretation, but to the admitting physician, Dr. X it showed no significant infiltrate.  Well at hospital, she had a rapid influenza swab done, which was negative.  She had a rapid RSV done, which is still not in the chart, but I believe I was told that it was negative.  She also had a pertussis PCR swab done and a pertussis culture done, neither of which has result in the chart.  I do know that the pertussis culture proved to be negative.
HOSPITAL COURSE The baby was afebrile.  Required no oxygen in the hospital.  Actually fed reasonably well.  Did have one episode of coughing with slight emesis.  Appeared basically quite well between episodes.  Had no apnea witnessed and after overnight observation, the parents were anxious to go home.  The patient was started on Zithromax in the hospital.
DISCHARGE CONDITION The patient was in stable condition and good condition on exam at the time and was discharged home on Zithromax to be followed up in the office within a week.
DISCHARGE INSTRUCTIONS Include usual diet and to follow up within a week, but certainly sooner if the coughing is worse and there is cyanosis or apnea again."""


df = spark.createDataFrame([[sample_text]]).toDF('text')
df.show(truncate = False)


# **🔎Run the pipeline**

In [None]:
result = pipeline.fit(df).transform(df)

In [None]:
result.select(F.explode(F.arrays_zip(result.ner_chunk.result,
                                     result.mappings.result)).alias("col"))\
      .select(F.expr("col['0']").alias("ner_chunk"),
              F.expr("col['1']").alias("normalized_headers")).show(truncate=False)

+----------------------+-----------------------------+
|ner_chunk             |normalized_headers           |
+----------------------+-----------------------------+
|ADMISSION DIAGNOSIS   |DIAGNOSIS                    |
|DISCHARGE DIAGNOSIS   |ADMISSION DIAGNOSIS          |
|PATIENT HISTORY       |DIAGNOSIS                    |
|GENERAL HISTORY AND   |DISCHARGE DIAGNOSIS          |
|RADIOGRAPHIC STUDIES  |HISTORY                      |
|HOSPITAL COURSE       |EXPOSURE HISTORY             |
|DISCHARGE CONDITION   |NONE                         |
|DISCHARGE INSTRUCTIONS|LABORATORY AND RADIOLOGY DATA|
|null                  |MAGNETIC RESONANCE IMAGING   |
|null                  |COURSE TYPE                  |
|null                  |HOSPITAL COURSE              |
|null                  |DISCHARGE RELATED            |
|null                  |DISCHARGE CONDITION          |
|null                  |DISCHARGE RELATED            |
|null                  |DISCHARGE INSTRUCTIONS       |
+----------------------+-----------------------------+
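Note that the table above lists 8 header chunks but 15 mapped values. `arrays_zip` pairs elements by position and pads the shorter array with nulls, so a row does not necessarily show a chunk next to its own mapping. The plain-Python sketch below reproduces the effect with `itertools.zip_longest` and then groups mapped values by the chunk text carried with each mapping. The annotation shape and the `entity` metadata key are assumptions for illustration; check `result.mappings.metadata` for the keys the mapper actually emits. `group_mappings_by_chunk` and the toy data are hypothetical.

```python
from itertools import zip_longest

# Toy data: two chunks, three mapped values (one chunk produced two values).
chunks = ["ADMISSION DIAGNOSIS", "GENERAL HISTORY AND"]
toy_mappings = [
    {"result": "DIAGNOSIS", "metadata": {"entity": "ADMISSION DIAGNOSIS"}},
    {"result": "ADMISSION DIAGNOSIS", "metadata": {"entity": "ADMISSION DIAGNOSIS"}},
    {"result": "NONE", "metadata": {"entity": "GENERAL HISTORY AND"}},
]

# Positional zipping pads the shorter list with None and misaligns the rows.
positional = list(zip_longest(chunks, [m["result"] for m in toy_mappings]))


def group_mappings_by_chunk(mapping_annotations):
    """Group mapped values under the chunk text stored in each annotation."""
    grouped = {}  # dicts keep insertion order, so chunk order is preserved
    for ann in mapping_annotations:
        chunk_text = ann["metadata"]["entity"]  # assumed metadata key
        grouped.setdefault(chunk_text, []).append(ann["result"])
    return list(grouped.items())


aligned = group_mappings_by_chunk(toy_mappings)
for chunk, values in aligned:
    print(chunk, "->", values)
```

Grouping on information stored with each mapping avoids relying on the two arrays having the same length.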

# 📌 ner_section_header_diagnosis

# **🔎Sample Text**

In [4]:
final_input_list = ['''
Patient Name: Samantha Johnson
Age: 52
Gender: Female
Patient Info:
Name: Samantha Johnson
Age: 52
Gender: Female
Medical History:
Patient has a history of Chronic respiratory disease.
Clinical History:
Patient presented with shortness of breath and chest pain.
Chief Complaint:
Patient complained of chest pain and difficulty breathing.
History of Present Illness:
Patient has been experiencing chest pain and shortness of breath for the past week. Symptoms were relieved by medication at first but became worse over time.
Past Medical History:
Patient has a history of Asthma and was previously diagnosed with Bronchitis.
Medications:
Patient is currently taking Albuterol, Singulair, and Advair for respiratory issues.
Allergies:
Patient has a documented allergy to Penicillin.
Physical Examination:
Patient had diffuse wheezing and decreased breath sounds on lung auscultation. Heart rate and rhythm were regular.
Laboratory Results:
Pulmonary function test results showed a decrease in Forced Expiratory Volume in one second (FEV1).
Imaging Studies:
Chest x-ray showed bilateral infiltrates consistent with Chronic obstructive pulmonary disease (COPD).
Diagnosis:
The patient was diagnosed with COPD exacerbation.
Treatment Plan:
The patient was managed with nebulized bronchodilators, steroid therapy, and oxygen as needed. The patient was discharged with instructions to continue bronchodilator and steroid therapy and to follow up with primary care physician in two weeks.
''',
"""ADMISSION DIAGNOSIS Upper respiratory illness with apnea, possible pertussis.
DISCHARGE DIAGNOSIS Upper respiratory illness with apnea, possible pertussis.
PATIENT HISTORY This is a one plus-month-old female with respiratory symptoms for approximately a week prior to admission.  This involved cough, post-tussive emesis, questionable fever, but only 99.7.  Their usual doctor prescribed amoxicillin over the phone.  The coughing persisted and worsened.  She went to the ER, where sats were normal at baseline, but dropped into the 80s with coughing spells.  They did witness some apnea.  They gave some Rocephin, did some labs, and the patient was transferred to hospital.
GENERAL HISTORY AND PHYSICAL On admission. there was some nasal discharge. Remainder of the HEENT was normal.Had few rhonchi,  No retractions,  No significant coughing or apnea during the admission physical. abdomen  benign.
RADIOGRAPHIC STUDIES She had a CBC done Garberville, which showed a white count of 12.4, with a differential of 10 segs, 82 lymphs, 8 monos, hemoglobin of 15, hematocrit 42, platelets 296,000, and a normal BMP.  An x-ray was done and I do not have an official interpretation, but to the admitting physician, Dr. X it showed no significant infiltrate.  Well at hospital, she had a rapid influenza swab done, which was negative.  She had a rapid RSV done, which is still not in the chart, but I believe I was told that it was negative.  She also had a pertussis PCR swab done and a pertussis culture done, neither of which has result in the chart.  I do know that the pertussis culture proved to be negative.
HOSPITAL COURSE The baby was afebrile.  Required no oxygen in the hospital.  Actually fed reasonably well.  Did have one episode of coughing with slight emesis.  Appeared basically quite well between episodes.  Had no apnea witnessed and after overnight observation, the parents were anxious to go home.  The patient was started on Zithromax in the hospital.
DISCHARGE CONDITION The patient was in stable condition and good condition on exam at the time and was discharged home on Zithromax to be followed up in the office within a week.
DISCHARGE INSTRUCTIONS Include usual diet and to follow up within a week, but certainly sooner if the coughing is worse and there is cyanosis or apnea again.""",
"""PATIENT HISTORY The patient is a 56-year-old female with a history of systemic lupus erythematosus, who was last seen in rheumatology clinic approximately 4 months ago for bilateral hand discomfort, left greater than right.  The patient was seen on 10/30/07.  She had the same complaint.  She was given a trial of Elavil at bedtime because the thought was to see that represented ulnar or radial neuropathy.  She was also given a prescription for Zostrix cream but was unable to get it filled because of insurance coverage.  The patient reports some worsening of the symptoms especially involving at the dorsum of the left hand, and she points to the area that actually involves the dorsal aspect of the second, third, and fourth digits.  The patient recently has developed what sounds like an upper respiratory problem with a nonproductive cough for 3 days, although she reports that she has had subjective fevers for the past 3 or 4 days, but has not actually taken the temperature.  She has not had any night sweats or chills.  She has had no recent problems with chest pain, chest discomfort, shortness of breath or problems with GU or GI complaints.  She is returning today for routine followup evaluation.
REVIEW OF SYSTEMS Noncontributory except for what was noted in the HPI and the remainder or complete review of systems is unremarkable.
VITAL SIGNS  blood pressure 155/84, pulse 87, weight 223 pounds, and temperature 99.2. She is a well-developed, well-nourished female appearing her staged age.  She is alert, oriented, and cooperative. Normocephalic and atraumatic.  There is no facial rash.  No oral lesions.  lungs;  Clear to auscultation.
LABORATORY DATA wbc 5100, hemoglobin 11.1, hematocrit 32.8, and platelets 200,000.  Westergren sedimentation rate of 47.  Urinalysis is negative for protein and blood.  Lupus serology is pending.
DISCHARGE INSTRUCTIONS The patient will have a trial of a resting wrist splint at night for the next 4 to 6 weeks.  If there is no improvement, the patient will return for corticosteroid injection of her carpal tunnel.,2.  Azithromycin 5-day dose pack.,3.  Robitussin Cough and Cold Flu to be taken twice a day.,4.  Atarax 25 mg at bedtime for sleep.,5.  The patient will return to the rheumatology clinic for a routine followup evaluation in 4 months.""",
"""ADMISSION DIAGNOSIS  Intractable migraine with aura.
FINAL DIAGNOSIS  Bipolar disorder.,iron deficiency anemia., anxiety disorder., history of tubal ligation.
PATIENT HISTORY   The patient is a 25-year-old right-handed Caucasian female who presented to the emergency department with sudden onset of headache occurring at approximately 11 a.m. on the morning of the July 31, 2008.  She described the headache as worse in her life and it was also accompanied by blurry vision and scotoma.  The patient also perceived some swelling in her face.  Once in the Emergency Department, the patient underwent a very thorough evaluation and examination.  She was given the migraine cocktail.  Also was given morphine a total of 8 mg while in the Emergency Department.  For full details on the history of present illness, please see the previous history and physical.
HOSPITAL COURSE   The patient was admitted to the neurological service after her headache felt to be removed with the headache cocktail.  The patient was brought up to 4 or more early in the a.m. on the August 1, 2008 and was given the dihydroergotamine IV, which did allow some minimal resolution in her headache immediately.  At the time of examination this morning, the patient was feeling better and desired going home.  She states the headache had for the most part resolved though she continues to have some diffuse trigger point pain.""",
"""PATIENT HISTORY A 14-day-old was seen by private doctor because of blister.  On Friday, she was noted to have a small blister near her umbilicus.  They went to their doctor on Saturday, culture was drawn.  It came back today, growing MRSA.  She has been doing well.  They put her on bacitracin ointment near the umbilicus.  That has about healed up.  However today, they noticed a small blister on her left temporal area.  They called the private doctor.  They direct called the Infectious Disease doctor here and was asked that they come into the hospital.  Mom states she has been diagnosed with MRSA on her buttocks as well and is on some medications.  The child has not had any fever.  She has not been lethargic or irritable.  She has been eating well up to 2 ounces every feed.  Eating well and sleeping well.  No other changes have been noted.
PAST MEDICAL HISTORY She was born full term.  No complications.  Home with mom.  No hospitalization, surgeries, allergies.
FAMILY AND SOCIAL HISTORY Negative, No ill contacts.  No travel or changes in living condition.
REVIEW OF SYSTEMS Ten systems were asked, all of them were negative except as noted above.
EMERGENCY DEPARTMENT COURSE  I spoke with Infectious Disease, Dr. X.  He states, we should treat for MRSA with Bactrim p.o.  There has been no evidence of jaundice with this little girl.  Hibiclens and Bactroban.  I spoke with Dr. X's associate to call back after Dr. X recommended a Herpes culture be done, just for completeness and that was done.  Blood culture was done here to make sure she did not have MRSA in her blood, which clinically, she does not appear to have.  She was discharged in stable condition.""",
"""ADMISSION DIAGNOSIS Symptomatic thyroid goiter.
DISCHARGE DIAGNOSIS Symptomatic thyroid goiter.
PRINCIPAL PROCEDURES  Total thyroidectomy.
HOSPITAL COURSE The patient underwent total thyroidectomy on 09/22/08, which she tolerated very well and remained stable in the postoperative period.  On postoperative day #1, she was tolerating her diet, began on thyroid hormone replacement, and remained afebrile with stable vital signs.  She required intravenous narcotics for pain control.  She was judged stable for discharge home on 09/25/08, tolerating a diet well, having no fever, stable vital signs, and good pain control.  The wound was clean and dry.  The drain was removed.  She was instructed to follow up in the surgical office within one week after discharge.  She was given prescription for Vicodin for pain and Synthroid thyroid hormone, and otherwise the appropriate wound care instructions per my routine wound care sheet."""
]

In [5]:
documentAssembler = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

sentenceDetector = SentenceDetector()\
    .setInputCols(["document"])\
    .setOutputCol("sentence")

tokenizer = Tokenizer()\
    .setInputCols(["sentence"])\
    .setOutputCol("token")

word_embeddings = WordEmbeddingsModel.pretrained("embeddings_clinical","en","clinical/models")\
    .setInputCols(["sentence","token"])\
    .setOutputCol("embeddings")

clinical_ner = MedicalNerModel.pretrained("ner_section_header_diagnosis", "en","clinical/models")\
    .setInputCols(["sentence","token","embeddings"])\
    .setOutputCol("ner")\
    .setLabelCasing("upper")  # return the tags in upper case; use "lower" for lower case

ner_converter = NerConverterInternal()\
    .setInputCols(["sentence","token","ner"])\
    .setOutputCol("ner_chunk")

nlpPipeline = Pipeline(stages=[
        documentAssembler,
        sentenceDetector,
        tokenizer,
        word_embeddings,
        clinical_ner,
        ner_converter])

model = nlpPipeline.fit(spark.createDataFrame([['']]).toDF("text"))

embeddings_clinical download started this may take some time.
Approximate size to download 1.6 GB
[OK!]
ner_section_header_diagnosis download started this may take some time.
[OK!]


# **🔎Run the pipeline**

In [6]:
df = spark.createDataFrame(pd.DataFrame({"text": final_input_list}))
result = model.transform(df)

In [7]:
# show() only prints the rows and returns None, so its return value is not assigned
result.select(F.explode(F.arrays_zip(result.ner_chunk.result, result.ner_chunk.metadata)).alias("cols")) \
    .select(F.expr("cols['0']").alias("chunk"),
            F.expr("cols['1']['entity']").alias("ner_label")).show(truncate=False)

+-------------------------------------+--------------------------+
|chunk                                |ner_label                 |
+-------------------------------------+--------------------------+
|Patient Info                         |PATIENT_INFO_HEADER       |
|Medical History                      |MEDICAL_HISTORY_HEADER    |
|Chronic respiratory disease          |RESPIRATORY_DISEASE       |
|Clinical History                     |CLINICAL_HISTORY_HEADER   |
|Chief Complaint                      |CHIEF_COMPLAINT_HEADER    |
|History of Present Illness           |HISTORY_PRES_ILNESS_HEADER|
|Past Medical History                 |MEDICAL_HISTORY_HEADER    |
|Asthma                               |RESPIRATORY_DISEASE       |
|Bronchitis                           |RESPIRATORY_DISEASE       |
|Medications                          |MEDICATIONS_HEADER        |
|Allergies                            |ALLERGIES_HEADER          |
|Laboratory Results                   |LAB_RESULTS_HEADER     