# Patient Cohort Building with Unstructured Data: Entity Extraction

In this notebook, we extract clinical and medical entities from free text with the pre-trained [Spark NLP relation extraction models for healthcare](https://www.johnsnowlabs.com/databricks/?utm_term=sparknlp&utm_campaign=Search+%7C+Spark+NLP&utm_source=adwords&utm_medium=ppc&hsa_acc=7272492311&hsa_cam=12543136013&hsa_grp=121056973604&hsa_ad=605485254464&hsa_src=g&hsa_tgt=kwd-1243265465686&hsa_kw=sparknlp&hsa_mt=p&hsa_net=adwords&hsa_ver=3&gclid=Cj0KCQiAmaibBhCAARIsAKUlaKRjPen9d1iGLcnRo3Ep10euMmW8dd5HuwERjTbbgyaOcNYrwaAeu8caAvmmEALw_wcB), so that they can be used to build a Knowledge Graph (KG).
The workflow has two steps: first we use the pre-trained models to extract the entities and the relationships between them, and then we create the KG from those results.

In [0]:
import json
import os

from pyspark.sql import SparkSession
from pyspark.ml import PipelineModel,Pipeline
from pyspark.sql import functions as F
from pyspark.sql.types import *

from sparknlp.annotator import *
from sparknlp_jsl.annotator import *
from sparknlp.base import *
import sparknlp_jsl
import sparknlp

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import warnings

warnings.filterwarnings("ignore")
pd.set_option("display.max_colwidth",100)

print('sparknlp.version : ',sparknlp.version())
print('sparknlp_jsl.version : ',sparknlp_jsl.version())

spark

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<command-713863765819899> in <cell line: 9>()
      7 from pyspark.sql.types import *
      8 
----> 9 from sparknlp.annotator import *
     10 from sparknlp_jsl.annotator import *
     11 from sparknlp.base import *

/databricks/python_shell/dbruntime/PythonPackageImportsInstrumentation/__init__.py in import_patch(name, globals, 
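
The traceback above stops at the `sparknlp.annotator` import, which usually means the library is not installed on the attached cluster. A minimal pre-flight sketch, using only the standard library, can report which packages are missing before the import cell runs. The helper name `missing_modules` is hypothetical; the package names are the ones imported in the cell above.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Top-level packages that the import cell depends on.
required = ["pyspark", "sparknlp", "sparknlp_jsl"]

missing = missing_modules(required)
if missing:
    print("Not installed in this environment:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

If the check reports `sparknlp` or `sparknlp_jsl`, the libraries need to be installed on the cluster before the rest of the notebook can run.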

In [0]:
spark._jvm.com.johnsnowlabs.util.start.registerListenerAndStartRefresh()

## Download Medical Dataset

In this notebook, we use synthetic medical records in CSV format.

In [0]:
notes_path='/FileStore/HLS/jsl_kg/data/'
delta_path='/FileStore/HLS/jsl_kg/delta/jsl/'

dbutils.fs.mkdirs(notes_path)
os.environ['notes_path']=f'/dbfs{notes_path}'
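
The cell above relies on the fact that DBFS paths are reachable from shell commands and local Python file APIs under the `/dbfs` mount, which is why `notes_path` is exported with a `/dbfs` prefix. Because `notes_path` ends in `/`, concatenating it with `/data.csv` later produces a harmless double slash. The sketch below shows one way to build the local path cleanly; `dbfs_local_path` is a hypothetical helper, not part of `dbutils`.

```python
import posixpath


def dbfs_local_path(dbfs_path, *parts):
    """Map a DBFS path such as '/FileStore/x/' to its local '/dbfs/...' form."""
    # Accept both 'dbfs:/FileStore/...' and '/FileStore/...' spellings.
    if dbfs_path.startswith("dbfs:"):
        dbfs_path = dbfs_path[len("dbfs:"):]
    # normpath collapses duplicate and trailing slashes in the joined path.
    joined = posixpath.join("/dbfs", dbfs_path.lstrip("/"), *parts)
    return posixpath.normpath(joined)


notes_path = "/FileStore/HLS/jsl_kg/data/"
print(dbfs_local_path(notes_path, "data.csv"))
```

The same helper accepts the `dbfs:/...` form that `dbutils.fs.ls` returns, so listing results can be passed to pandas or shell commands without manual string edits.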

In [0]:
%sh
cd $notes_path
wget -q https://raw.githubusercontent.com/JohnSnowLabs/spark-nlp-workshop/master/databricks/python/healthcare_case_studies/data/data.csv

In [0]:
display(dbutils.fs.ls(f'{notes_path}/'))

path,name,size
dbfs:/FileStore/HLS/jsl_kg/data/data.csv,data.csv,1454309
dbfs:/FileStore/HLS/jsl_kg/data/data.csv.1,data.csv.1,1454309
dbfs:/FileStore/HLS/jsl_kg/data/data.csv.10,data.csv.10,1454309
dbfs:/FileStore/HLS/jsl_kg/data/data.csv.11,data.csv.11,1454309
dbfs:/FileStore/HLS/jsl_kg/data/data.csv.12,data.csv.12,1454309
dbfs:/FileStore/HLS/jsl_kg/data/data.csv.13,data.csv.13,1454309
dbfs:/FileStore/HLS/jsl_kg/data/data.csv.2,data.csv.2,1454309
dbfs:/FileStore/HLS/jsl_kg/data/data.csv.3,data.csv.3,1454309
dbfs:/FileStore/HLS/jsl_kg/data/data.csv.4,data.csv.4,1454309
dbfs:/FileStore/HLS/jsl_kg/data/data.csv.5,data.csv.5,1454309
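
The `data.csv.1`, `data.csv.2`, ... entries in the listing are identical copies: when the target file already exists, `wget` saves each new download under a numbered name, so every re-run of the download cell adds another one. Passing `-nc` (no-clobber) to `wget` avoids this in the shell cell. The same idea can be sketched in Python as below; `download_once` and its `fetch` parameter are hypothetical names introduced for illustration.

```python
import os
from urllib.request import urlretrieve


def download_once(url, dest_dir, fetch=urlretrieve):
    """Download url into dest_dir unless a file with that name already exists."""
    os.makedirs(dest_dir, exist_ok=True)
    # Use the last URL segment as the file name, as wget does by default.
    target = os.path.join(dest_dir, url.rsplit("/", 1)[-1])
    if not os.path.exists(target):
        fetch(url, target)
    return target


# In the notebook this could be called as (not executed here):
# download_once(
#     "https://raw.githubusercontent.com/JohnSnowLabs/spark-nlp-workshop/master/"
#     "databricks/python/healthcare_case_studies/data/data.csv",
#     os.environ["notes_path"],
# )
```

The `fetch` argument defaults to `urllib.request.urlretrieve` and exists only so the function can be exercised without network access.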


## Read Data and Write to Bronze Delta Layer

The dataset contains 965 clinical records. We read them from the downloaded CSV file into a Spark DataFrame and write them to the bronze Delta layer.

In [0]:
df = spark.createDataFrame(pd.read_csv(f'/dbfs{notes_path}/data.csv', sep=';'))
df.limit(20).display()

subject_id,date,text,gender,dateOfBirth
19823,2167-02-25,"Admission Date: [**2167-2-16**] Discharge Date: [**2167-2-24**] Date of Birth: [**2099-5-5**] Sex: F Service: [**Hospital Unit Name 196**] CHIEF COMPLAINT: Shortness of breath, cough, and fever. HISTORY OF PRESENT ILLNESS: The patient is a 67 -year-old Russian speaking female with a past medical history significant for diabetes type II, congestive heart failure of unknown etiology, and hypertension. The patient presents with a three day history of progressively worsening shortness of breath and dyspnea on exertion, wheezing, nonproductive cough, and fever to 102 F on the day prior to admission. Per patient's husband, she denies any nausea or vomiting, chills at night, night sweats, or chest pain. She denies diarrhea. She has been constipated. The patient does have paroxysmal nocturnal dyspnea and two pillow orthopnea. The patient denies dysuria. Fingersticks at home have been running approximately 200 to 270's. The patient denies any sick contacts. On the morning of presentation, the patient was noted to be more lethargic by her husband. [**Name (NI) **] report, her oxygen saturation upon arrival of the EMS, was in the 80's. The patient was placed on 100% nonrebreather and arrived at [**Hospital3 **] - [**Hospital **] [**First Name (Titles) **] [**Last Name (Titles) **] where she was noted to be wheezing on examination. She was given Albuterol and Atrovent nebulizers with improvement of her oxygen saturation from 90% to 94%, also on 100% nonrebreather. She was also administered 40 mg of IV Lasix times two with diuresis of approximately one liter. The patient denied any chest pain throughout her entire presentation. PAST MEDICAL HISTORY: 1. Type II diabetes mellitus. 2. Morbid obesity. 3. Hypertension. 4. Congestive heart failure of unclear etiology with normal [**Name (NI) 20679**] systolic function. Question of left ventricular hypertrophy on prior echocardiogram. 5. Stasis dermatitis in bilateral lower extremities. 6. 
No history of coronary artery disease. 7. Restrictive lung disease, believed to be secondary to morbid obesity. The patient does have a home O2 requirement of approximately 2.0 to 2.5 liters during the day time. 8. Presumptive obstructive sleep apnea. ALLERGIES: The patient has no known drug allergies. ADMITTING MEDICATIONS: 1) Amaryl 2.0 mg po bid, 2) Glucophage 1,000 mg po bid, 3) Singulair 10 mg po q day, 4) Zocor 10 mg po q day, 5) Hyzaar 50/12.5 q day, 6) Atenolol 25 mg po q day, 7) Avandia 8.0 mg po bid. SOCIAL HISTORY: The patient denies any tobacco use. She lives with her husband, no alcohol use. She is gravida II, para II. FAMILY HISTORY: Negative for cancers. Paternal grandmother with diabetes mellitus and maternal aunt with coronary artery disease. PRIOR STUDIES: Echocardiogram of 12/99 revealed mild left axis deviation, mild mitral regurgitation, and mild pulmonary artery systolic hypertension. Exercise tolerance test MIBI was without angina, no ischemic changes, left ventricular ejection fraction was estimated at 64% with normal wall motion. PHYSICAL EXAMINATION: Temperature on presentation was 99.1 F, pulse is 84, blood pressure was 116/50, breathing at a rate of 19, 96% on 100% nonrebreather. Weight was approximately 286 pounds. In general, the patient was alert and oriented times three, on 100% nonrebreather, in moderate respiratory distress. Head, eyes, ears, nose, and throat: pupils are equal, round, and reactive to light and accommodation bilaterally, oropharynx is clear. Neck is without lymphadenopathy, there is not any assessable jugular venous pulse. Chest examination reveals diffuse inspiratory and expiratory wheezes and crackles with a prolonged expiratory phase. Cardiovascular examination is of regular rate and rhythm without evidence of murmurs, rubs, or gallops. Abdomen is obese without tenderness, guarding, or distention, there are normoactive bowel sounds. There is no suprapubic tenderness, no costovertebral angle tenderness. 
Extremities reveal trace bilateral upper extremity edema with 2+ to 3+ bilateral lower extremity edema, 1+ bilateral dorsalis pedis pulses, and 2+ bilateral radial pulses with stage I stasis dermatitis of the bilateral ankles. Neurologic examination: there are no focal motor deficits and the patient denies any sensory changes. ADMISSION LABORATORY DATA: Include a white blood cell count of 8.2, hematocrit of 39.6, platelets are 310,000. Chemistries showed a sodium of 138, potassium of 5.0, chloride 96, bicarbonate is 34, BUN is 17, creatinine 0.6, serum glucose is 231. Urinalysis was negative for blood, nitrates, with 30 protein, greater than 1,000 glucose, negative ketones, and no cells. Troponin on admission was 1.4, CK on admission was 176 with an MB fraction of 2.0. Second CK was 153 with an MB fraction of 3.0. Calcium was 8.3, albumin 3.5, magnesium 4.9, phosphate 1.9. Chest x-ray on admission showed florid congestive heart failure with no evidence of pleural effusion, but perihilar haziness and more confluent of opacification of the lung bases and retrocardiac region. Electrocardiogram showed a left bundle branch block, stable from comparison electrocardiogram of [**2165-10-30**], with no acute ST-T wave changes indicative of ongoing ischemia. HOSPITAL COURSE: The patient was admitted and started on a regimen for treatment of acute pulmonary edema. The patient was aggressively diuresed with 80 mg of IV Lasix [**Hospital1 **] and oxygenated with face mask O2 as needed to keep oxygen saturations greater than 92%. Levaquin 500 mg po q day was also initiated for treatment of presumptive pneumonia. The patient remained initially stable overnight; however, on hospital day two was noted to be increasingly somnolent. An arterial blood gas was measured at this time which revealed a pH of 7.2, pO2 of 42, pCO2 of 104. The patient was sent for a stat CT scan angiogram to rule out pulmonary embolism. 
This study, although grossly limited, did not find any major perfusion deficits in the pulmonary vascular system. The patient was continued to be aggressively diuresed. She was also placed on BiPAP, however, the patient did not tolerate the BiPAP apparatus. The patient was transferred to the Medical Intensive Care Unit for closer monitoring and optimization of respiratory status in the setting of hypercarbic respiratory failure of unclear etiology with underlying congestive heart failure and pneumonia, both of which were being treated. While in the Intensive Care Unit, the patient's oxygenation was maximized with CPAP at bed time at 20 cm of water which the patient intermittently tolerated and five liters of oxygen via nasal cannula during the day. Repeat arterial blood gas prior to discharge from the Medical Intensive Care Unit, was pH of 7.36 with CO2 of 87 and a pO2 of 61. The patient was transferred to [**Hospital Unit Name 196**] team after her Medical Intensive Care Unit course. While on the floor, the patient continued to improve on CPAP of 5.0 cm to 10 cm of water at bedtime. She was unable to tolerate any further increase beyond this point. The patient's oxygenation remained stable on four to five liters of oxygen via nasal cannula during the day time. The patient continued to be aggressively diuresed. It was noted that intermittently throughout her hospital course, the patient was experiencing gross hematuria through her Foley catheter. At the time her subcutaneous heparin was discontinued. In addition, this was believed to be secondary to the fact that the patient was intermittently on heparin around the time of admission as empiric therapy for a possible pulmonary embolus. Upon returning to the Medical Floor, the patient was noted also to have intermittent alarms on telemetry of paroxysmal multi-focal atrial tachycardia and frequent runs of supraventricular tachycardia. 
The patient's Albuterol nebulizers were believed to be contributing to this and these were placed on a prn basis. In addition, secondary to the patient's persistent wheezing, her beta blocker was stopped and she was switched to Diltiazem. For further characterization of the patient's congestive heart failure, a transthoracic echocardiogram was obtained to assess the patient's systolic function for any further causes for possible diastolic dysfunction. Echocardiogram windows were severely limited secondary to the patient's body habitus; however, preliminary [**Location (un) 1131**] was of a sustained systolic function. Over the following day of her hospitalization, the patient was evaluated by Physical Therapy, was ambulating well, and was switched to po regimens for her diuretics. Fingerstick blood sugars throughout the hospitalization remained well controlled and the patient was covered with insulin sliding scale. DISPOSITION: The patient was discharged to a rehabilitation facility in stable condition. DISCHARGE STATUS: Stable to rehabilitation. DISCHARGE INSTRUCTIONS: The patient is to follow up with Dr. [**First Name8 (NamePattern2) **] [**Name (STitle) 19512**] in one to two weeks. At this time she will also be scheduled for further outpatient pulmonary follow up for characterization of her restrictive lung disease and for further evaluation of possible obstructive lung disease. 
DISCHARGE MEDICATIONS: 1) Lasix 40 mg po bid, 2) potassium chloride 40 mEq po q day, 3) Diltiazem SR 16 mg po bid, 4) Losartan 75 mg po q day, 5) Zocor 10 mg po q day, 6) Levofloxacin 500 mg po q day (discontinue on [**3-2**] after completion of a fourteen day course), 7) Avandia 4.0 mg po bid, 8) Prilosec 20 mg po q day, 9) enteric coated aspirin 325 mg po q day, 10) Glucophage 1,000 mg po bid, 11) Amaryl 2.0 mg po qid, 12) Colace 200 mg po bid, 13) Ocean Spray nasal spray two to four puffs in each naris q two to four hours prn, 14) Robitussin DM 10 cc to 15 cc po q four hours prn, 15) Lactulose 30 cc q four hours prn, 16) mineral oil 30 cc po bid prn, 17) Fleet's enema one per rectum q other day prn, 17) Dulcolax suppositories 10 mg per rectum q day prn, 18) Tylenol 650 mg po q four to six hours prn, 19) Albuterol metered dose inhaler two to four puffs q four to six hours prn shortness of breath or wheezing. DISCHARGE DIAGNOSES: 1. Congestive heart failure. 2. Pneumonia. 3. Restrictive lung disease. 4. Presumed to obesity hypoventilation syndrome. 5. Type II diabetes mellitus. 6. Morbid obesity.  [**First Name8 (NamePattern2) **] [**Name8 (MD) **], M.D. [**MD Number(1) 19513**] Dictated By:[**Name8 (MD) 5469**] MEDQUIST36 D: [**2167-2-24**] 09:11 T: [**2167-2-24**] 09:27 JOB#: [**Job Number 20680**]",F,2099-05-05
19823,2167-11-27,"Admission Date: [**2167-11-27**] Discharge Date: [**2167-12-9**] Date of Birth: [**2099-5-5**] Sex: F HISTORY OF PRESENT ILLNESS: The patient is a 68 year old female with a history of morbid obesity, history of sleep apnea, obesity hypoventilation syndrome, congestive heart failure with diastolic dysfunction, restrictive lung disease, the Emergency Department with a chief complaint of increasing shortness of breath for four to five days. The patient had mild productive cough but no fevers or chills. Prior to her admission on [**2167-11-27**], her O2 saturation dropped from the low-90s which is her baseline, down to 78 to 80% at home with a heart rate in the 130s to 140s. of 99.8 F.; blood pressure 123/70; heart rate 130 to 140; respirations 30; 02 saturation 80% on room air, up to 95% on four to five liters. She was found to be atrial fibrillation with rapid ventricular response to the 140s and started on heparin and Diltiazem for rate control. After admission to the Medical Floor on [**2167-11-27**], she required more oxygen to the point of 100% face mask. Arterial blood gas was initially 7.23, 98, 117, 100% non-rebreather mask; 7.25, 95, 58, on 5 liters nasal cannula, to 7.25, 99, 75, on Bi-PAP, then to 7.26, 96, 72, on C-PAP. She was transferred to the Cardiac Care Unit on [**2167-11-28**], for cardiac and respiratory failure. Her issues in the Cardiac Care Unit included: 1. Pulmonary: Respiratory failure secondary to hypoventilation secondary to morbid obesity in the setting of restrictive lung disease and congestive heart failure. The patient did not require intubation and continued on Bi-PAP and C-PAP. She was started on potassium chloride and progesterone to increase her respiratory drive. On [**2167-12-1**], the patient was started on nasal cannula O2 during the day with continued Bi-PAP at night. Her O2 saturation on nasal cannula of four to five liters was as low as 90%. 2. 
Cardiovascular: Congestive heart failure; the patient was aggressively diuresed with intravenous Lasix, for a total of nine liters negative while in the Cardiac Care Unit. Paroxysmal atrial fibrillation: Started on anti-coagulation on Diltiazem and digoxin. 3. Gastrointestinal: Constipation; aggressive bowel regimen including Lactulose and GoLYTELY. 4. Decreased mental status: Believed to be secondary to hypoxia and hypercarbia, resolved with improved oxygenation. PAST MEDICAL HISTORY: 1. Restrictive lung disease with PFTs, FEV of 1.38, 77%, FVC  of 1.78, 71%, mild decrease in DLCO, on home O2 two to  three liters at night. 2. Morbid obesity. 3. Presumed obstructive sleep apnea. 4. Obesity hypoventilation syndrome. 5. Congestive heart failure with diastolic dysfunction,  echocardiogram in [**2166**] very limited, ejection fraction  was 67% on stress test in [**2164**]. History of positive  stress in 09 of [**2165**]; with reversible defects laterally  and inferior laterally which was never worked up. 6. Hypertension. 7. Type 2 diabetes mellitus. 8. Stasis dermatitis. ALLERGIES: No known drug allergies. MEDICATIONS at time of transfer to floor 1. Simvastatin 10 mg p.o. q. h.s. 2. Glucophage 1000 mg p.o. twice a day. 3. Avandia 4 mg p.o. twice a day. 4. Provera 5 mg p.o. q. day. 5. Diltiazem 90 mg p.o. four times a day. 6. Digoxin 0.125 mg p.o. q. day. 7. Lasix 60 mg intravenously three times a day. 8. Colace 100 mg p.o. three times a day. 9. Coumadin 2.5 mg p.o. q. h.s. 10. Fleets Enema p.r.n. 11. Atrovent nebulizers p.r.n. 12. Dulcolax p.r.n. 13. Milk of Magnesia p.r.n. 14. Tylenol p.r.n. 15. Ioconasol Powder. PHYSICAL EXAMINATION: Temperature 98.9 F.; pulse 85 to 112; blood pressure 93/60; respirations 23; O2 saturation 90% on five liters nasal cannula, 95% on non-rebreather face mask. In general, the patient alert, in no acute distress sitting in a chair. HEENT: Oropharynx is clear. Moist mucous membranes. Neck supple. 
Unable to assess jugular venous pressure secondary to morbid obesity. Cardiovascular: Irregularly irregular rhythm, normal S1 and S2. Grade II/VI systolic ejection murmur at right upper sternal border. Lungs: Rare crackles bilateral bases. Abdomen obese, soft, nontender, nondistended. Positive bowel sounds. Extremities: Three plus non-pitting edema bilaterally to the thighs. Right upper extremity edema near PICC line. Decreased range of motion of right shoulder secondary to pain. Erythema of bilateral calves but not warm to touch. LABORATORY: CBC within normal limits. INR 2.8. Sodium 143, chloride 93, potassium 4.4, bicarbonate 41, BUN 12, creatinine 0.4. Chest x-ray unchanged from previous studies, shows congestive heart failure and bilateral pleural effusions. HOSPITAL COURSE: 1. Pulmonary: Hypercarbic respiratory failure secondary to morbid obesity/hypoventilatioin and CHF. She was continued on nasal cannula during the day. Her O2 saturations improved to 93 to 97% on 2 liters via nasal cannula. An attempt to continue Bi-PAP at night was tried, however, the patient's O2 saturations dropped to the high 80s on bi-PAP. Instead, she was kept on her nasal cannula at night. The patient was tried on CPAP but did not tolerate this either. 2. Cardiovascular: Congestive heart failure; the patient continued to be diuresed with intravenous Lasix. Total diuresis to date negative 12 to 15 liters. Paroxysmal atrial fibrillation/atrial flutter: Cardiology was consulted for a possible transesophageal echocardiogram and cardioversion. They did not recommend cardioversion now as she cannot tolerate TEE due to need to lie flat.Plan to f/u with cardiology as outpatient after several weeks anticoagualtion and consider cardioversion at that time. The patient was continued on Diltiazem and digoxin. Her rate was still poorly controlled. A beta blocker was added. 
There is a questionable history of beta blocker intolerance with wheezing reported on prior admit however pt's OMR med list indicates that she was on atenolol at the time of admisison. She had no wheezing while on beta blockers here ofr last 2 days of admit. Coronary artery disease: The patient has a history of positive stress tests with reversible defects. Cardiology was consulted for possible cardiac catheterization. The patient refused cardiac catheterization since she is unable to lie flat. The patient was continued on Statin and aspirin.Repeat TTE was done that showed decraesd EF of 35-40% and multiple regional wall motion abnormalities. beta blocker and ACE-I have been added to help with CAD and CHF managament. Aortic stenosis murmur on examination with mild AS confirmed on echo this admit. 3. Hematology: The patient on anti-coagulation for paroxysmal atrial fibrillation. 4. Right shoulder pain: Right shoulder films were normal. The patient's pain improved spontaneously without intervention.She appears to have right rotator cuff tendonitis vs bursitis. [**Month (only) 116**] need PT to help increase use and could use tylenol or low dose nsaids for increased pain. 5. GI: severe Constipation; the patient needs to be maintained standing regimen of lactulose and titrate as needed to maintain 1 BM per day 6. Diabetes mellitus: The patient was continued on home medications, Glucophage and Avandia as well as a Regular insulin sliding scale.Her glucotrol and amaryl were added back the day prior to discharge. Code Status: The patient is ""DO NOT RESUSCITATE"", ""DO NOT INTUBATE"", however, patient is agreeable to Medical Intensive Care Unit transfer for pressors as necessary. DISCHARGE STATUS: Discharge patient to Rehabilitation. DISCHARGE CONDITION: Stable.Of note, pt says her breathing is better than it has been in 2 years. She uses home O2. DISCHARGE MEDICATIONS: 1. Diltiazem 120 mg p.o. four times a day. 2. Digoxin 0.125 mg p.o. q. day. 3. 
Lasix 120 [**Hospital1 **] PO 4. Colace 100 mg p.o. three times a day. 5. Lactulose 3o cc PO qid 6.Amaryl 2 mg PO bid 7.glucotrol XL 10 mg PO qd 8. Dulcolax 10 mg p.o. prn 9. Aspirin 325 mg p.o. q. day. 10. Simvastatin 10 mg p.o. q. h.s. 11. Provera 5 mg p.o. q. day. 12. Protonix 40 mg p.o. q. day. 13. Avandia 4 mg p.o. twice a day. 14. Glucophage 1000 mg p.o. twice a day. 15. Tylenol 325 mg p.o. q. four to six hours p.r.n. 16. Aldactone 25 mg p.o. [**Hospital1 **] 17. Albuterol MDI two puffs q. two to four hours p.r.n. 18. Coumadin: 1.5 mg po qd- needs to be monitored an dad[**Name (NI) 20681**] for INR [**1-3**] 19. Lopressor 25 po bid PO 20. O2 2L NP 21. Regular insulin sliding scale. 22. Lisinopril 5 mg po qd FOLLOW-UP INSTRUCTIONS: The patient has a Cardiology appointment on [**2168-1-7**], at 4 p.m. at [**Hospital Ward Name 23**] Center, [**Location (un) 20682**], with Dr. [**Last Name (STitle) 20683**]. She also needs f/u with Dr. [**First Name8 (NamePattern2) **] [**Last Name (NamePattern1) 1022**] in [**Company 191**] at [**Hospital1 18**] in [**1-4**] weeks (Dr. [**First Name (STitle) 1022**] is covering for her PCP- [**Last Name (NamePattern4) **]. [**First Name8 (NamePattern2) **] [**Name (STitle) 19512**] who is on maternity leave). Pt will also need pulmonary f/u with Dr. [**First Name4 (NamePattern1) **] [**Last Name (NamePattern1) **] (I think he is her outpatient pulmonary doctor) DISCHARGE DIAGNOSES: 1. hypercapnic respiratory failure-resolved with crhonic CO2 retention. 2. Obstructive sleep apnea/obesity hypoventialtion syndrome 3. Congestive heart failure 4. Hypertension. 5. Diabetes mellitus, type 2. 6. Restrictive lung disease. 7. Atrial fibrillation/atrial flutter. 8. CAD with echo evidence of prior MI and reduced EF 9. mild AS 10. constipation  [**First Name11 (Name Pattern1) **] [**Last Name (NamePattern4) 3022**], M.D. 
[**MD Number(1) 3023**] Dictated By:[**Name8 (MD) 7112**] MEDQUIST36 D: [**2167-12-7**] 10:29 T: [**2167-12-7**] 11:19 JOB#: [**Job Number 20684**]",F,2099-05-05
19823,2170-10-12,"Admission Date: [**2170-9-19**] Discharge Date: [**2170-10-12**] Date of Birth: [**2099-5-5**] Sex: F Service: [**Hospital Unit Name 196**] Allergies: Patient recorded as having No Known Allergies to Drugs Attending:[**First Name3 (LF) 9554**] Chief Complaint: Weight gain, weakness Major Surgical or Invasive Procedure: Colonoscopy-no apparent bleeding lesion. History of Present Illness: 71 y.o Russian speaking female with extensive PMH including CAD, CHF, afib and chronic anemia. She was recently admitted in [**6-3**] for anemia work-up and found to have a bleeding gastric ectasia on EGD which was removed. Colonoscopy revealed a benign polyp. Pt presents today after feeling increased fatigue at home. Denies CP or increasing SOB. On home O2 at 2L and has not required increased amounts. Pt also notes that she has been unable to walk around her apartment as much, but is limited by weakness vs shortness of breath. She does not feel that her breathig has changed. Her symptoms began approx 3 weeks ago. Denies, cough, cold symptoms, fever, chills, nausea, vomting, change in diet or medication. Pt reports that she was told by her PCP that she had gained a lot of weight due to fluid and needed to come into the hospital for diuresis. Past Medical History: CAD h/o CHF AFib on coumadin anemia Restrictive lung disease Social History: married, no alcohol or tobacco Family History: non-contributory Physical Exam: VS: 97.5, 97/50, 57,16, 95 on 2l NC GEn: Morbidly obese, pale, pleasant, speaking in full sentences. HEENT: Ophx clear, MMM, PERRLA, conjinctiva pale, no icterus CV: distant HS, reg [**Last Name (LF) 20687**], [**First Name3 (LF) **], III/VI SEM radiating to carotids. Pulm: Distant BS, good inspiratory effort, bibasilar crackles 1/3 up, no rhonchi or wheezing. Abd: obese, NT, ND, +BS Ext:4+ woody edema to the knee bilat, warm, erythematous, non-tender Neuro: occ resting tremor which is not new. No focal deficits. 
A&O x3 Pertinent Results: ECHO: Left Atrium - Long Axis Dimension: 3.7 cm (nl <= 4.0 cm) Aortic Valve - Peak Velocity: *4.1 m/sec (nl <= 2.0 m/sec) Aortic Valve - Peak Gradient: 64 mm Hg Aortic Valve - Mean Gradient: 40 mm Hg Mitral Valve - E Wave: 1.1 m/sec Mitral Valve - A Wave: 1.2 m/sec Mitral Valve - E/A Ratio: 0.92 Mitral Valve - E Wave Deceleration Time: 270 msec LEFT ATRIUM: Normal LA size. RIGHT ATRIUM/INTERATRIAL SEPTUM: Normal RA size. LEFT VENTRICLE: Normal LV cavity size. Overall normal LVEF (>55%). RIGHT VENTRICLE: Normal RV chamber size and free wall motion. AORTA: Normal aortic root diameter. AORTIC VALVE: Severely thickened/deformed aortic valve leaflets. Moderate AS. MITRAL VALVE: Mildly thickened mitral valve leaflets. Moderate mitral annular calcification. PERICARDIUM: No pericardial effusion. Conclusions: 1. The left ventricular cavity size is normal. Overall left ventricular systolic function is very difficult to assess but it may be normal (LVEF>55%). 2. The aortic valve leaflets are severely thickened/deformed. There is moderate aortic valve stenosis. 3. The mitral valve leaflets are mildly thickened. 4. Compared with the findings of the prior study (tape reviewed) of [**2167-12-7**], LV function may have improved. COLONOSCOPY: (Rectal polyp, polypectomy): Distorted fragment of benign colonic mucosa with melanosis coli; no adenomatous change seen (multiple levels examined). Brief Hospital Course: 71 yo Russian speaking female with extensive PMH presents with weight gain and increased fatigue over the past 3-4 weeks. 1)Anemia: Pt was recently admitted in [**2170-5-31**] for anemia work-up and found to have a bleeding gastric ectasia on EGD which was removed. Colonoscopy at that time revealed a benign polyp. Pt was found to have Hct of 18 on this admission. Pt was transferred to the CCU for monitoring and received 8 units of PRBC with appropriate increase from 18 to 33. 
The anemia was thought to be subacute since she was never hemodynamically unstable. GI was consulted. Coumadin was held for suspected GI bleed. Colonoscopy was scheduled but held for persistent high INR which was reversed with vitamin K. Pt was a difficult prep and required almost 4-5 days of prepping with Golytely and other laxative. Pt finally underwent colonoscopy which revealed no source of bleed. Since pt's Hct was stable 25 34-35, no further diagnostic procedure was done. If pt were to develop another acute/subacute anemia, capsule study was recommended. 2) CHF: Pt has a long hx of CHF per old records. Last echo before admission was from [**2168**] which showed EF of 35-40%. She got an echo on [**9-20**] which showed EF>55%. Pt was initially started on niseritide and lasix for diuresis for suspected CHF exacerbation before her initial Hct of 18 came back. Pt received lasix between transfusions. Lisinopril was held for increased creatinine. Pt's wt was stable and CHF status was stable initially. However, after 5 days of prep for the colonoscopy, pt started to gain weight everyday and was net positive daily. Pt was refractory to standing IV Lasix and Diuril. She got PICC line placed under IR and Natrecor gtt was started with still net positive daily. Lasix gtt was added and was titrated up to 10-15mg/hr which gave some reponse initially but again became refractory to it. Dopamine gtt was tried but showed no improvement in UOP. Pt lost PICC access. However one day, she started to respond extremely well with lasix gtt at 10mg/hr and IV Diuril 250 mg [**Hospital1 **] only (without Natrecor). Pt's admission weight was 130 kg (128 kg in a clinic note) and has gotten up as high as 139 kg. However, she was able to diuresis 1-2L/day and her weight came down to 130kg which is her baseline. The diuretics were changed to po form (Lasix po 120 mg [**Hospital1 **] and Diuril po 125 mg [**Hospital1 **]) and pt continued to diuresis with net negative daily. 
Pt's CHF was thought to be possibly from AS. If that is the case, valve replacement could improve her symtoms. Review of the aortic valve orifice and consideration of valve replacement should be discussed as outpatient. Pt needs to follow up with a [**Hospital 1902**] clinic within 1 week. 3) Afib: Pt with hx of atrial fibrillation but now in sinus rhythm. Rate is bradycardic. Pt noted to have pauses on tele up to 2 seconds. Pt was continued on amiodarone 200 mg po qd. Coumadin was held in a setting of GI bleed and also for high INR prior to colonoscopy. Coumadin was restarted with goal INR of [**1-2**]. Pt needs to be seen by her PCP to check her INR level. 4) COPD/restrictive lung dz: Pt was continued on 2 L of oxygen which is her baseline. Pt was getting nebulizer prn for wheezing and SOB. Pt is on home O2. 5) DM: Pt was initially continued on home meds of avandia and glyburide and was cover with RISS. However, avandia was held while she was NPO. She will be discharged with her home regimen. 9) CODE: DNR/ DNI- this was re-discussed with patient and husband to determine if pt still wants to be DNI/DNR as she has been DNR/DNI on prior admissions. Medications on Admission: avandia 4 [**Hospital1 **] amaryl 2 mg prn FS > 250 protonix 50 qd coumadin 2 qhs- on HOLD amiodorone 200 qd lasix 160 qam, 40 qpm zaroxyln 2.5 qd 30 minute before am lasix lipitor 40 qd iron 325 tid- don't give w/ protonix vit c tid with iron lisinopril 5 qd levoxyl 0.050 mg qd albuterol/atrovent MDI epogen 3000 units 2x per week. Discharge Medications: 1. Amiodarone HCl 200 mg Tablet Sig: One (1) Tablet PO QD (). Disp:*30 Tablet(s)* Refills:*2* 2. Ascorbic Acid 500 mg Tablet Sig: One (1) Tablet PO TID (3 times a day). Disp:*90 Tablet(s)* Refills:*2* 3. Levothyroxine Sodium 50 mcg Tablet Sig: One (1) Tablet PO QD (). Disp:*30 Tablet(s)* Refills:*2* 4. Lisinopril 5 mg Tablet Sig: One (1) Tablet PO QD (). Disp:*30 Tablet(s)* Refills:*2* 5. 
Epoetin Alfa 4,000 unit/mL Solution Sig: Two (2) Injection QMOWEFR (Monday -Wednesday-Friday). Disp:*qs * Refills:*2* 6. Albuterol Sulfate 0.083 % Solution Sig: [**12-1**] Inhalation Q6H (every 6 hours) as needed. 7. Triamcinolone Acetonide 0.1 % Cream Sig: One (1) Appl Topical  HS (at bedtime). Disp:*1 tube* Refills:*2* 8. Pantoprazole Sodium 40 mg Tablet, Delayed Release (E.C.) Sig: One (1) Tablet, Delayed Release (E.C.) PO Q12H (every 12 hours). Disp:*60 Tablet, Delayed Release (E.C.)(s)* Refills:*2* 9. Avandia 4 mg Tablet Sig: One (1) Tablet PO twice a day. Disp:*60 Tablet(s)* Refills:*2* 10. Amaryl 2 mg Tablet Sig: One (1) Tablet PO as needed as needed for FS>200. Disp:*30 Tablet(s)* Refills:*0* 11. Atorvastatin Calcium 40 mg Tablet Sig: One (1) Tablet PO once a day. Disp:*30 Tablet(s)* Refills:*2* 12. Iron 325 (65) mg Capsule, Sustained Release Sig: One (1) Capsule, Sustained Release PO three times a day. Disp:*90 Capsule, Sustained Release(s)* Refills:*2* 13. Metoprolol Succinate 25 mg Tablet Sustained Release 24HR Sig: One (1) Tablet Sustained Release 24HR PO DAILY (Daily). Disp:*30 Tablet Sustained Release 24HR(s)* Refills:*2* 14. Chlorothiazide 250 mg Tablet Sig: 0.5 Tablet PO BID (2 times a day). Disp:*60 Tablet(s)* Refills:*2* 15. Furosemide 80 mg Tablet Sig: 1.5 Tablets PO BID (2 times a day). Disp:*60 Tablet(s)* Refills:*2* 16. Pramoxine-Zinc Oxide in MO 1-12.5 % Ointment Sig: One (1) Appl Rectal Q4-6H (every 4 to 6 hours) as needed. Disp:*qs qs* Refills:*0* 17. Coumadin 1 mg Tablet Sig: Three (3) Tablet PO once a day. Disp:*90 Tablet(s)* Refills:*2* Discharge Disposition: Home With Service Facility: [**Hospital 20688**] Home Health Discharge Diagnosis: Acute anemia from GI bleed CHF Discharge Condition: Hemodynamically stable, stable Hct, no chest pain, no symptoms of dizziness. Discharge Instructions: Patient was instructed to take all of the medications as instructed. 
Pt was instructed to seek medical attention if shed develops fatigue, dizziness, SOB, Chest pain, bloody stool, melena, bloody emesis. Pt should see her PCP [**Last Name (NamePattern4) **] [**12-1**] weeks after the discharge. Followup Instructions: Provider: [**First Name8 (NamePattern2) **] [**Last Name (NamePattern1) **] [**Name8 (MD) **], MD Where: [**Hospital6 29**] [**Hospital3 249**] Phone:[**Telephone/Fax (1) 250**] Date/Time:[**2170-10-30**] 1:30  [**First Name8 (NamePattern2) 2064**] [**Last Name (NamePattern1) **] MD [**MD Number(2) 2139**] Completed by:[**2170-10-12**]",F,2099-05-05
19823,2172-06-22,"Admission Date: [**2172-6-13**] Discharge Date: [**2172-6-22**] Date of Birth: [**2099-5-5**] Sex: F Service: MEDICINE Allergies: Patient recorded as having No Known Allergies to Drugs Attending:[**First Name3 (LF) 8487**] Chief Complaint: diarrhea/hypotension Major Surgical or Invasive Procedure: None History of Present Illness: Pt is a 73 yo female with MMP including CRI, DM, HTN, CHF requiring admissions, and a recent admission for cellulitis who presents with seven days of diarrhea and found to be hypotensive, meeting code sepsis criteria. Pt was recently admitted to [**Hospital1 **] from [**Date range (3) 20690**] with a left lower extremity cellulitis treated with unasyn transitioned to augmentin as an outpt. She took the augmentin for 11 days post-discharge with last being ~[**2172-6-9**]. Pt says that for the last seven days she has had profuse diarrhea (two days per husband), last today with 3 episodes. No blood or melena noted. She denies any lightheadedness/ fever/ chills/ nausea/ vomiting or chest pain. She has had decreased PO intake for many days (could not quantify). In the ED, VS on admission were: T: 99.6; HR: 112; BP 88/42-->70/20; RR: 22; O2: 93% RA. An abdominal CT was done which showed mild diffuse colonic wall thickening without distention. She was given levaquin 500 mg IV and flagyl 500 mg IV x 1. 
She was also started on norepinephrine gtt prior to transport via ambulance Past Medical History: 1) Chronic renal insufficiency baseline Cr 2.6 on [**8-4**] 2) Restrictive lung disease presumed to be secondary to obesity with PFTS in [**2165**] 3) Hyperlipidemia 4) NIDDM x 10 years 5) Obesity 6) HTN 7) CHF, EF >55% with an echo in [**9-2**] 8) Moderate AS (10'[**69**] echo) with AV gradient of 64 9) Chronic atrial fibrillation on coumadin and amiodarone 10) Hypothyroidism TSH 6.7 in [**6-3**] 11) Iron deficiency anemia Hct 34 at baseline 0n [**2171-6-7**] with gastritis and ectasias on recent EGD/colonoscopy 12) B12 deficiency on supplements 13) Venous insufficiency 14) h/o Left lower extremity cellulitis treated with full course of Augmentin in [**2171**] 15) Glaucoma; s/p surgery in [**11-3**] 16) h/o left hand cellulitis/gout flare [**10-4**] Social History: Lives with her husband in [**Name (NI) 583**]. She denies any smoking or alcohol use. Family History: NC Physical Exam: VS: T: 99.5;HR: 75; BP: 103/61; RR: 21; O2: 95 7L; CVP:3 Gen: Speaking in full sentences in mild distress HEENT: PERRL; EOMI; sclera anicteric; OP clear Neck: JVD difficult to see [**1-2**] neck girth CV: RRR S1S2 III/VI crescendo-descrendo murmur at RUSB with radiation to carotids. Lungs: scattered crackles 1/3 up without wheezes. Abd: NABS. Soft, obese, NT, ND Back: unable to assess Ext: Brown venous stasis changes ankle--> below knee b/l, L>R. No open sores, erthema, or warmth. DP 1+. 2+ edema, non-pitting. Neuro: A&O x 3. MS [**First Name (Titles) 20691**] [**Last Name (Titles) 5235**]. 
Pertinent Results: Labs on Admission: CBC ([**2172-6-13**] 12:10A) WBC-31.6*# RBC-3.88* HGB-11.9* HCT-35.1* MCV-91 MCH-30.7 MCHC-33.9 RDW-16.4* NEUTS-88* BANDS-5 LYMPHS-2* MONOS-4 EOS-0 BASOS-0 ATYPS-1* METAS-0 MYELOS-0 Chemistires ([**2172-6-13**] 12:10AM) GLUCOSE-189* UREA N-94* CREAT-4.4*# SODIUM-128* POTASSIUM-5.3* CHLORIDE-93* TOTAL CO2-19* ANION GAP-21* MAGNESIUM-2.0 Coags: ([**2172-6-13**] 12:56AM) PT-37.0* PTT-33.6 INR(PT)-4.1* Lactate: ([**2172-6-13**] 12:57AM) LACTATE-3.5* UA: ([**2172-6-13**] 03:40AM) URINE COLOR-Yellow APPEAR-Clear SP [**Last Name (un) 155**]-1.012 BLOOD-NEG NITRITE-NEG PROTEIN-NEG GLUCOSE-NEG KETONE-NEG BILIRUBIN-NEG UROBILNGN-NEG PH-5.0 LEUK-NEG VBG ([**2172-6-13**] 01:14PM) TYPE-MIX TEMP-37.8 PO2-53* PCO2-47* PH-7.22* TOTAL CO2-20* BASE XS--8 INTUBATED-NOT INTUBA [**Last Name (un) **] Stim: [**2172-6-13**] 01:37PM CORTISOL-33.4* [**2172-6-13**] 02:46PM CORTISOL-46.8* [**2172-6-13**] 03:25PM CORTISOL-52.1* Imaging: CHEST (PORTABLE AP) [**2172-6-13**] 2:04 PM IMPRESSION: Compared with earlier the same day, the right IJ central line has been retracted. The tip now overlies the SVC/RA junction. There has been interval progression of left lower lobe collapse and/or consolidation with interval obscuration of left hemidiaphragm. A small left and also a small right pleural effusion cannot be excluded. No pneumothorax is detected.  RADIOLOGY Final Report CT ABDOMEN W/O CONTRAST [**2172-6-13**] 5:29 AM IMPRESSION: 1. There is colonic wall thickening extending along the entire course of the colon, with associated pericolonic inflammatory stranding. This appearance is consistent with mild pancolitis, of inflammatory or infectious etiologies. No pericolonic fluid collections or free intraperitoneal air or fluid is identified. 2. Cholelithiasis without evidence of acute cholecystitis. 
EKG ([**2172-6-13**]) Sinus rhythm; Borderline first degree A-V block; Left bundle branch block Lateral ST-T changes may be due to myocardial ischemia; Generalized low QRS voltages No change from previous Echo ([**2172-6-15**]) IMPRESSION: Suboptimal study. At least moderate (may be severe) calcific aortic stenosis. LVH. Normal LVEF. If clincally indicated, a repeat study with definity contrast may improve spectral doppler fidelity to assess morte accurately the aortic valve gradients/area. Compared to the prior report dated [**2170-9-20**], an aortic valve area change cannot be excluded on the basis of the current study. LVEF is probably similar. Brief Hospital Course: Pt is a 73 yo Ukranian female with MMP who presents with hypotension, despite fluid resuscitation, and with diarrhea. She initially required pressors (epinephrine). After more aggressive IVF use, she was able to be weaned off pressors. During this time, she was also changed from flagyl to PO vancomycin (for positive c. diff colitis), given her initial lack of progress. During this time, her SBPs were in the 90s, often dropping to the 70s systolic. Her initial acute on chronic renal failure improved over the first few days. After this initial improvement, her course began to worsen again. Her blood pressures again required pressor support (despite IVF), her WBC began to increase (with 14% bands) and her blood gas showed a worsening acidemia. Her urine culture grew enterococcus. Treatment with vanc and flagyl for c. diff and gent/cefepine for UTI were begun. Despite this, she required more pressor support and her respirations became less strong. She expired at 5:59 pm on [**2172-6-22**]. 
Medications on Admission: Albuterol prn Allopurinol 200 mg [**Hospital1 **] Amiodarone 200 mg qday Bisacodyl 5 mg qday Colace 100 mg [**Hospital1 **] Colchicine 0.6 mg po qod Glipizide SR 2.5 mg qday Ipratropium 2 puffs QID ferrous sulfate 325 one po tid Levothyroxine 125 mcg qday Atorvastatin 20 mg qday Lisinopril 5 mg qday Pantoprazole 40 mg qday Cyanocobalamin 1000 mg qday Furosemide 40 mg po bid Toprol XL 25 mg qday Warfarin 1 mg po qhs Epoetin 6000 units [**Hospital1 **] Amoxicillin-Claulanate 500-125 mg q12--ENDED [**2172-6-9**] Discharge Medications: None Discharge Disposition: Expired Discharge Diagnosis: Primary: Sepsis C. Diff Colitis UTI Cardiopulmonary arrest Secondary: Diabetes Mellitus CHF CRI Discharge Condition: Expired Discharge Instructions: None Followup Instructions: None",F,2099-05-05
19823,2167-12-07,"PATIENT/TEST INFORMATION: Indication: Aortic valve disease. Shortness of breath. Height: (in) 63 Weight (lb): 290 BSA (m2): 2.27 m2 BP (mm Hg): 130/70 Status: Inpatient Date/Time: [**2167-12-7**] at 13:21 Test: TTE(Complete) Doppler: Complete pulse and color flow Contrast: None Technical Quality: Adequate INTERPRETATION: Findings: LEFT VENTRICLE: Left ventricular wall thicknesses are normal. The left ventricular cavity size is normal. Overall left ventricular systolic function is moderately depressed. LV WALL MOTION: The following resting regional left ventricular wall motion abnormalities are seen: basal anterior - hypokinetic; mid anterior - hypokinetic; basal anteroseptal - hypokinetic; mid anteroseptal - hypokinetic; anterior apex - hypokinetic; septal apex - hypokinetic; AORTIC VALVE: There is mild aortic valve stenosis. MITRAL VALVE: The mitral valve leaflets are mildly thickened. There is moderate mitral annular calcification. Physiologic mitral regurgitation is seen (within normal limits). TRICUSPID VALVE: Physiologic tricuspid regurgitation is seen. There is borderline pulmonary artery systolic hypertension. PERICARDIUM: There is no pericardial effusion. GENERAL COMMENTS: Suboptimal image quality due to body habitus. Conclusions: Left ventricular wall thicknesses are normal. The left ventricular cavity size is normal. Overall left ventricular systolic function is hard to assess but is probably moderately depressed. Resting regional wall motion abnormalities include mid and distal septal hypokinesis to akinesis. There is mild aortic valve stenosis. The mitral valve leaflets are mildly thickened. There is borderline pulmonary artery systolic hypertension. There is no pericardial effusion. Compared to the previous study of [**1-29**], there is a marked decrease in LV function present.",F,2099-05-05
19823,2167-02-20,PATIENT/TEST INFORMATION: Indication: Congestive heart failure. Height: (in) 65 Weight (lb): 280 BSA (m2): 2.28 m2 BP (mm Hg): 132/76 Status: Inpatient Date/Time: [**2167-2-20**] at 15:19 Test: Portable TTE(Complete) Doppler: Complete pulse and color flow Contrast: None Technical Quality: Suboptimal INTERPRETATION: Findings: LEFT VENTRICLE: The left ventricle is not well seen. RIGHT VENTRICLE: The right ventricle is not well seen. AORTIC VALVE: The aortic valve leaflets are mildly thickened. MITRAL VALVE: The mitral valve leaflets are mildly thickened. There is mild mitral annular calcification. PERICARDIUM: There is no pericardial effusion. GENERAL COMMENTS: Suboptimal image quality due to poor echo windows. Conclusions: Study was extremely limited. The left ventricle is not well seen but systolic function appears grossly normal. The aortic valve leaflets are mildly thickened. The mitral valve leaflets are mildly thickened. There is no pericardial effusion.,F,2099-05-05
19823,2172-06-15,"PATIENT/TEST INFORMATION: Indication: h/o AS. Hypotensive. Height: (in) 62 Weight (lb): 258 BSA (m2): 2.13 m2 BP (mm Hg): 108/41 HR (bpm): 83 Status: Inpatient Date/Time: [**2172-6-15**] at 15:14 Test: Portable TTE (Complete) Doppler: Full Doppler and color Doppler Contrast: None Technical Quality: Suboptimal INTERPRETATION: Findings: LEFT ATRIUM: Mild LA enlargement. LEFT VENTRICLE: Mild symmetric LVH with normal cavity size and systolic function (LVEF>55%). Suboptimal technical quality, a focal LV wall motion abnormality cannot be fully excluded. RIGHT VENTRICLE: Paradoxic septal motion consistent with conduction abnormality/ventricular pacing. AORTA: Normal aortic root diameter. Mildly dilated ascending aorta. AORTIC VALVE: Severely thickened/deformed aortic valve leaflets. Moderate AS. MITRAL VALVE: Mildly thickened mitral valve leaflets. No MVP. Mild mitral annular calcification. Mild thickening of mitral valve chordae. TRICUSPID VALVE: Borderline PA systolic hypertension. PERICARDIUM: Small pericardial effusion. No echocardiographic signs of tamponade. GENERAL COMMENTS: Suboptimal image quality - poor echo windows. Conclusions: The left atrium is mildly dilated. There is mild symmetric left ventricular hypertrophy with normal cavity size and systolic function (LVEF>55%). Due to suboptimal technical quality, a focal wall motion abnormality cannot be fully excluded. The ascending aorta is mildly dilated. The aortic valve leaflets are severely thickened/deformed. There is at least moderate aortic valve stenosis (severe aortic stenosis may be present but cannot be excluded by this study). The mitral valve leaflets are mildly thickened. There is no mitral valve prolapse. There is borderline pulmonary artery systolic hypertension. There is a small pericardial effusion. There are no echocardiographic signs of tamponade. IMPRESSIOn: Suboptimal study. At least moderate (may be severe) calcific aortic stenosis. LVH. Normal LVEF. 
If clincally indicated, a repeat study with definity contrast may improve spectral doppler fidelity to assess morte accurately the aortic valve gradients/area. Compared to the prior report dated [**2170-9-20**], an aortic valve area change cannot be excluded on the basis of the current study. LVEF is probably similar.",F,2099-05-05
19823,2170-09-20,"PATIENT/TEST INFORMATION: Indication: Aortic valve disease. Congestive heart failure. Left ventricular function. Height: (in) 63 Weight (lb): 299 BSA (m2): 2.30 m2 BP (mm Hg): 101/41 HR (bpm): 90 Status: Inpatient Date/Time: [**2170-9-20**] at 15:34 Test: Portable TTE (Complete) Doppler: Full doppler and color doppler Contrast: Definity Technical Quality: Adequate INTERPRETATION: Findings: LEFT ATRIUM: Normal LA size. RIGHT ATRIUM/INTERATRIAL SEPTUM: Normal RA size. LEFT VENTRICLE: Normal LV cavity size. Overall normal LVEF (>55%). RIGHT VENTRICLE: Normal RV chamber size and free wall motion. AORTA: Normal aortic root diameter. AORTIC VALVE: Severely thickened/deformed aortic valve leaflets. Moderate AS. MITRAL VALVE: Mildly thickened mitral valve leaflets. Moderate mitral annular calcification. PERICARDIUM: No pericardial effusion. Conclusions: 1. The left ventricular cavity size is normal. Overall left ventricular systolic function is very difficult to assess but it may be normal (LVEF>55%). 2. The aortic valve leaflets are severely thickened/deformed. There is moderate aortic valve stenosis. 3. The mitral valve leaflets are mildly thickened. 4. Compared with the findings of the prior study (tape reviewed) of [**2167-12-7**], LV function may have improved.",F,2099-05-05
19823,2167-12-04,Atrial fibrillation with a controlled ventricular response. Left bundle-branch block. Ventricular ectopy. Compared to the previous tracing of [**2167-12-3**] ventricular ectopy is now present. TRACING #2,F,2099-05-05
19823,2167-12-03,Atrial fibrillation with a controlled ventricular response. Left bundle-branch block. Compared to the previous tracing of [**2167-11-27**] the ventricular rate is now controlled. TRACING #1,F,2099-05-05


In [0]:
df.write.format('delta').mode('overwrite').save(f'{delta_path}/bronze/dataset')
display(dbutils.fs.ls(f'{delta_path}/bronze/dataset'))

path,name,size
dbfs:/FileStore/HLS/jsl_kg/delta/jsl/bronze/dataset/_delta_log/,_delta_log/,0
dbfs:/FileStore/HLS/jsl_kg/delta/jsl/bronze/dataset/part-00000-00239749-0900-4b3e-b923-8994b2c8d947-c000.snappy.parquet,part-00000-00239749-0900-4b3e-b923-8994b2c8d947-c000.snappy.parquet,28005
dbfs:/FileStore/HLS/jsl_kg/delta/jsl/bronze/dataset/part-00000-166696aa-db83-4dba-9108-71e9a2854429-c000.snappy.parquet,part-00000-166696aa-db83-4dba-9108-71e9a2854429-c000.snappy.parquet,28005
dbfs:/FileStore/HLS/jsl_kg/delta/jsl/bronze/dataset/part-00000-18110c06-c341-4a39-b349-3214d8616db7-c000.snappy.parquet,part-00000-18110c06-c341-4a39-b349-3214d8616db7-c000.snappy.parquet,28005
dbfs:/FileStore/HLS/jsl_kg/delta/jsl/bronze/dataset/part-00000-31ac54a2-4565-472e-87db-d09d5225d6d9-c000.snappy.parquet,part-00000-31ac54a2-4565-472e-87db-d09d5225d6d9-c000.snappy.parquet,28005
dbfs:/FileStore/HLS/jsl_kg/delta/jsl/bronze/dataset/part-00000-37f7f523-7de7-4959-b052-0628124c2d29-c000.snappy.parquet,part-00000-37f7f523-7de7-4959-b052-0628124c2d29-c000.snappy.parquet,28005
dbfs:/FileStore/HLS/jsl_kg/delta/jsl/bronze/dataset/part-00000-58d36867-3aab-47f6-a494-b6d3395e3609-c000.snappy.parquet,part-00000-58d36867-3aab-47f6-a494-b6d3395e3609-c000.snappy.parquet,28005
dbfs:/FileStore/HLS/jsl_kg/delta/jsl/bronze/dataset/part-00000-63dcf0f8-f1f7-4701-82f6-25fd8c2cec40-c000.snappy.parquet,part-00000-63dcf0f8-f1f7-4701-82f6-25fd8c2cec40-c000.snappy.parquet,28005
dbfs:/FileStore/HLS/jsl_kg/delta/jsl/bronze/dataset/part-00000-a771aedc-137d-4f48-9617-7ace1ab38906-c000.snappy.parquet,part-00000-a771aedc-137d-4f48-9617-7ace1ab38906-c000.snappy.parquet,28005
dbfs:/FileStore/HLS/jsl_kg/delta/jsl/bronze/dataset/part-00000-c6d0dcb4-2abe-46dd-a39f-4e4f9ca01fa9-c000.snappy.parquet,part-00000-c6d0dcb4-2abe-46dd-a39f-4e4f9ca01fa9-c000.snappy.parquet,28005


## Posology RE Pipeline

### Posology Relation Extraction

The pretrained posology relation extraction model supports the following relations:

DRUG-DOSAGE  
DRUG-FREQUENCY  
DRUG-ADE (Adverse Drug Events)  
DRUG-FORM  
DRUG-ROUTE  
DRUG-DURATION  
DRUG-REASON  
DRUG-STRENGTH  

The model has been validated against the posology dataset described in (Magge, Scotch, & Gonzalez-Hernandez, 2018).

| Relation | Recall | Precision | F1 | F1 (Magge, Scotch, & Gonzalez-Hernandez, 2018) |
| --- | --- | --- | --- | --- |
| DRUG-ADE | 0.66 | 1.00 | **0.80** | 0.76 |
| DRUG-DOSAGE | 0.89 | 1.00 | **0.94** | 0.91 |
| DRUG-DURATION | 0.75 | 1.00 | **0.85** | 0.92 |
| DRUG-FORM | 0.88 | 1.00 | **0.94** | 0.95* |
| DRUG-FREQUENCY | 0.79 | 1.00 | **0.88** | 0.90 |
| DRUG-REASON | 0.60 | 1.00 | **0.75** | 0.70 |
| DRUG-ROUTE | 0.79 | 1.00 | **0.88** | 0.95* |
| DRUG-STRENGTH | 0.95 | 1.00 | **0.98** | 0.97 |


*Magge, Scotch, & Gonzalez-Hernandez (2018) collapsed DRUG-FORM and DRUG-ROUTE into a single relation.
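As a quick consistency check on the table above: F1 is the harmonic mean of precision and recall, so the F1 column can be recomputed from the other two. The sketch below does this in plain Python with the rounded values from the table. The helper name `f1_score` is ours, not part of Spark NLP, and differences of up to 0.01 are expected because the published recall and precision are themselves rounded.

```python
# Recompute F1 from the rounded recall/precision values reported in the table.
# `f1_score` is an illustrative helper, not a Spark NLP function.

def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# relation -> (recall, precision, reported F1), copied from the table
reported = {
    "DRUG-ADE":       (0.66, 1.00, 0.80),
    "DRUG-DOSAGE":    (0.89, 1.00, 0.94),
    "DRUG-DURATION":  (0.75, 1.00, 0.85),
    "DRUG-FORM":      (0.88, 1.00, 0.94),
    "DRUG-FREQUENCY": (0.79, 1.00, 0.88),
    "DRUG-REASON":    (0.60, 1.00, 0.75),
    "DRUG-ROUTE":     (0.79, 1.00, 0.88),
    "DRUG-STRENGTH":  (0.95, 1.00, 0.98),
}

recomputed = {rel: f1_score(p, r) for rel, (r, p, _) in reported.items()}

for rel, (_, _, f1) in reported.items():
    print(f"{rel:15s} reported={f1:.2f} recomputed={recomputed[rel]:.3f}")
```

With precision fixed at 1.00, F1 is driven entirely by recall, which is why DRUG-REASON and DRUG-ADE score lowest.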

In [0]:
documenter = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("documents")

sentencer = SentenceDetector()\
    .setInputCols(["documents"])\
    .setOutputCol("sentences")

tokenizer = Tokenizer()\
    .setInputCols(["sentences"])\
    .setOutputCol("tokens")

words_embedder = WordEmbeddingsModel()\
    .pretrained("embeddings_clinical", "en", "clinical/models")\
    .setInputCols(["sentences", "tokens"])\
    .setOutputCol("embeddings")

pos_tagger = PerceptronModel()\
    .pretrained("pos_clinical", "en", "clinical/models") \
    .setInputCols(["sentences", "tokens"])\
    .setOutputCol("pos_tags")

posology_ner = MedicalNerModel()\
    .pretrained("ner_posology", "en", "clinical/models")\
    .setInputCols("sentences", "tokens", "embeddings")\
    .setOutputCol("ners")   

posology_ner_converter = NerConverterInternal() \
    .setInputCols(["sentences", "tokens", "ners"]) \
    .setOutputCol("ner_chunks")

dependency_parser = DependencyParserModel()\
    .pretrained("dependency_conllu", "en")\
    .setInputCols(["sentences", "pos_tags", "tokens"])\
    .setOutputCol("dependencies")

reModel = RelationExtractionModel()\
    .pretrained("posology_re")\
    .setInputCols(["embeddings", "pos_tags", "ner_chunks", "dependencies"])\
    .setOutputCol("posology_relations")\
    .setMaxSyntacticDistance(4)

pipeline = Pipeline(stages=[
    documenter,
    sentencer,
    tokenizer, 
    words_embedder, 
    pos_tagger, 
    posology_ner,
    posology_ner_converter,
    dependency_parser,
    reModel
])

empty_data = spark.createDataFrame([[""]]).toDF("text")

model = pipeline.fit(empty_data)
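Each annotator in this pipeline reads columns produced by earlier stages, so a misspelled column name only surfaces when the pipeline is fitted or run. The plain-Python sketch below mirrors the input and output columns used above and checks that every input is available by the time a stage needs it. The `stages` list and the `check_wiring` helper are illustrative only and do not call Spark NLP.

```python
# Illustrative wiring check; column names are copied from the pipeline above.
# Each entry is (stage name, input columns, output column).
stages = [
    ("documenter", ["text"], "documents"),
    ("sentencer", ["documents"], "sentences"),
    ("tokenizer", ["sentences"], "tokens"),
    ("words_embedder", ["sentences", "tokens"], "embeddings"),
    ("pos_tagger", ["sentences", "tokens"], "pos_tags"),
    ("posology_ner", ["sentences", "tokens", "embeddings"], "ners"),
    ("posology_ner_converter", ["sentences", "tokens", "ners"], "ner_chunks"),
    ("dependency_parser", ["sentences", "pos_tags", "tokens"], "dependencies"),
    ("reModel", ["embeddings", "pos_tags", "ner_chunks", "dependencies"],
     "posology_relations"),
]

def check_wiring(stages, source_columns):
    """Return (stage, missing_columns) pairs for inputs no earlier stage produced."""
    available = set(source_columns)
    problems = []
    for name, inputs, output in stages:
        missing = [c for c in inputs if c not in available]
        if missing:
            problems.append((name, missing))
        available.add(output)
    return problems

problems = check_wiring(stages, source_columns=["text"])
print(problems)
```

An empty list means the stage order is consistent with the column names. Dropping or reordering a stage makes the first broken dependency show up in the result.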

In [0]:
results = model.transform(df)
results.printSchema()

In [0]:
result_df = results.select('subject_id','date',F.explode(F.arrays_zip(results.posology_relations.result, results.posology_relations.metadata)).alias("cols")) \
                   .select('subject_id','date',F.expr("cols['0']").alias("relation"),
                                               F.expr("cols['1']['entity1']").alias("entity1"),
                                               F.expr("cols['1']['entity1_begin']").alias("entity1_begin"),
                                               F.expr("cols['1']['entity1_end']").alias("entity1_end"),
                                               F.expr("cols['1']['chunk1']").alias("chunk1"),
                                               F.expr("cols['1']['entity2']").alias("entity2"),
                                               F.expr("cols['1']['entity2_begin']").alias("entity2_begin"),
                                               F.expr("cols['1']['entity2_end']").alias("entity2_end"),
                                               F.expr("cols['1']['chunk2']").alias("chunk2"),
                                               F.expr("cols['1']['confidence']").alias("confidence"))
result_df.limit(20).display()

subject_id,date,relation,entity1,entity1_begin,entity1_end,chunk1,entity2,entity2_begin,entity2_end,chunk2,confidence
19823,2167-02-25,DRUG-FORM,DRUG,1391,1399,Albuterol,FORM,1414,1423,nebulizers,1.0
19823,2167-02-25,DRUG-FORM,DRUG,1405,1412,Atrovent,FORM,1414,1423,nebulizers,1.0
19823,2167-02-25,STRENGTH-DRUG,STRENGTH,1539,1543,40 mg,DRUG,1551,1555,Lasix,1.0
19823,2167-02-25,ROUTE-DRUG,ROUTE,1548,1549,IV,DRUG,1551,1555,Lasix,1.0
19823,2167-02-25,DRUG-STRENGTH,DRUG,2336,2341,Amaryl,STRENGTH,2343,2348,2.0 mg,1.0
19823,2167-02-25,DRUG-ROUTE,DRUG,2336,2341,Amaryl,ROUTE,2350,2351,po,1.0
19823,2167-02-25,DRUG-FREQUENCY,DRUG,2336,2341,Amaryl,FREQUENCY,2353,2355,bid,1.0
19823,2167-02-25,DRUG-STRENGTH,DRUG,2336,2341,Amaryl,STRENGTH,2372,2379,"1,000 mg",1.0
19823,2167-02-25,DRUG-FREQUENCY,DRUG,2336,2341,Amaryl,FREQUENCY,2384,2386,bid,1.0
19823,2167-02-25,STRENGTH-DRUG,STRENGTH,2343,2348,2.0 mg,DRUG,2361,2370,Glucophage,1.0
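The `explode(arrays_zip(...))` step above pairs the i-th relation label with the i-th metadata map and emits one row per pair. Here is a minimal plain-Python sketch of the same idea, using two relations from the output shown above. The dictionary layout is a simplification of the annotation structure, and `flatten_relations` is a hypothetical helper.

```python
# One document's relations as parallel lists (simplified annotation layout).
doc = {
    "subject_id": 19823,
    "date": "2167-02-25",
    "result": ["DRUG-FORM", "STRENGTH-DRUG"],
    "metadata": [
        {"entity1": "DRUG", "chunk1": "Albuterol",
         "entity2": "FORM", "chunk2": "nebulizers", "confidence": "1.0"},
        {"entity1": "STRENGTH", "chunk1": "40 mg",
         "entity2": "DRUG", "chunk2": "Lasix", "confidence": "1.0"},
    ],
}

def flatten_relations(doc):
    """Emit one flat record per (relation label, metadata map) pair."""
    rows = []
    for relation, meta in zip(doc["result"], doc["metadata"]):
        rows.append({
            "subject_id": doc["subject_id"],
            "date": doc["date"],
            "relation": relation,
            **meta,
        })
    return rows

rows = flatten_relations(doc)
for row in rows:
    print(row["relation"], row["chunk1"], "->", row["chunk2"])
```

In the output above, the drug appears as either `entity1` or `entity2` (for example `DRUG-FORM` versus `STRENGTH-DRUG`), so downstream code has to handle both orders.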


## RxNorm Code Extraction from RE Results

In [0]:
# Build the text sent to the RxNorm resolver: the drug name plus its form,
# strength, or dosage when the relation carries one; otherwise the drug name alone.
from pyspark.sql.functions import when, col

qualifiers = ['FORM', 'STRENGTH', 'DOSAGE']
e1, e2 = F.col('entity1'), F.col('entity2')
c1, c2 = F.col('chunk1'), F.col('chunk2')

result_df = result_df.withColumn(
    'rx_text',
    when((e1 == 'DRUG') & e2.isin(qualifiers), F.concat(c1, F.lit(' '), c2))
    .when(e1.isin(qualifiers) & (e2 == 'DRUG'), F.concat(c2, F.lit(' '), c1))
    .when((e1 == 'DRUG') & ~e2.isin(qualifiers), c1)
    .when((e2 == 'DRUG') & ~e1.isin(qualifiers), c2)
    .otherwise(F.lit(' '))  # neither entity is a drug
)

result_df.limit(20).display()

subject_id,date,relation,entity1,entity1_begin,entity1_end,chunk1,entity2,entity2_begin,entity2_end,chunk2,confidence,rx_text
19823,2167-02-25,DRUG-FORM,DRUG,1391,1399,Albuterol,FORM,1414,1423,nebulizers,1.0,Albuterol nebulizers
19823,2167-02-25,DRUG-FORM,DRUG,1405,1412,Atrovent,FORM,1414,1423,nebulizers,1.0,Atrovent nebulizers
19823,2167-02-25,STRENGTH-DRUG,STRENGTH,1539,1543,40 mg,DRUG,1551,1555,Lasix,1.0,Lasix 40 mg
19823,2167-02-25,ROUTE-DRUG,ROUTE,1548,1549,IV,DRUG,1551,1555,Lasix,1.0,Lasix
19823,2167-02-25,DRUG-STRENGTH,DRUG,2336,2341,Amaryl,STRENGTH,2343,2348,2.0 mg,1.0,Amaryl 2.0 mg
19823,2167-02-25,DRUG-ROUTE,DRUG,2336,2341,Amaryl,ROUTE,2350,2351,po,1.0,Amaryl
19823,2167-02-25,DRUG-FREQUENCY,DRUG,2336,2341,Amaryl,FREQUENCY,2353,2355,bid,1.0,Amaryl
19823,2167-02-25,DRUG-STRENGTH,DRUG,2336,2341,Amaryl,STRENGTH,2372,2379,"1,000 mg",1.0,"Amaryl 1,000 mg"
19823,2167-02-25,DRUG-FREQUENCY,DRUG,2336,2341,Amaryl,FREQUENCY,2384,2386,bid,1.0,Amaryl
19823,2167-02-25,STRENGTH-DRUG,STRENGTH,2343,2348,2.0 mg,DRUG,2361,2370,Glucophage,1.0,Glucophage 2.0 mg
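The `rx_text` rule is easier to verify as an ordinary function. The sketch below restates the same four cases in plain Python and runs them on rows from the output above. `build_rx_text` is an illustrative helper, and unlike the Spark expression it does not model null values.

```python
# Plain-Python restatement of the rx_text rule, for checking individual rows.
QUALIFIERS = {"FORM", "STRENGTH", "DOSAGE"}

def build_rx_text(entity1, chunk1, entity2, chunk2):
    """Return '<drug> <qualifier>' when a qualifier is present, else the drug alone."""
    if entity1 == "DRUG" and entity2 in QUALIFIERS:
        return f"{chunk1} {chunk2}"
    if entity1 in QUALIFIERS and entity2 == "DRUG":
        return f"{chunk2} {chunk1}"
    if entity1 == "DRUG":
        return chunk1
    if entity2 == "DRUG":
        return chunk2
    return " "  # neither entity is a drug

examples = [
    ("DRUG", "Albuterol", "FORM", "nebulizers"),
    ("STRENGTH", "40 mg", "DRUG", "Lasix"),
    ("ROUTE", "IV", "DRUG", "Lasix"),
    ("DRUG", "Amaryl", "FREQUENCY", "bid"),
]
rx_texts = [build_rx_text(*row) for row in examples]
print(rx_texts)
```

The rule trusts the extracted relation as given. A mispaired relation such as `Amaryl` with `1,000 mg` in the output above still produces an `rx_text`, so it passes through to the resolver unchanged.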


In [0]:
documentAssembler = DocumentAssembler()\
      .setInputCol("rx_text")\
      .setOutputCol("ner_chunk")

sbert_embedder = BertSentenceEmbeddings.pretrained('sbiobert_base_cased_mli', 'en','clinical/models')\
      .setInputCols(["ner_chunk"])\
      .setOutputCol("sentence_embeddings")
    
rxnorm_resolver = SentenceEntityResolverModel.pretrained("sbiobertresolve_rxnorm_augmented","en", "clinical/models") \
      .setInputCols(["ner_chunk", "sentence_embeddings"]) \
      .setOutputCol("rxnorm_code")\
      .setDistanceFunction("EUCLIDEAN")

rxnorm_pipelineModel = PipelineModel(
    stages = [
        documentAssembler,
        sbert_embedder,
        rxnorm_resolver])

In [0]:
rxnorm_results = rxnorm_pipelineModel.transform(result_df)
rxnorm_result = rxnorm_results.select('subject_id','date', 'relation', 'entity1', 'entity1_begin','entity1_end',  'chunk1', 'entity2', 'entity2_begin', 'entity2_end', 
                                         'chunk2', 'confidence', 'rx_text', 
                                         F.explode(F.arrays_zip(rxnorm_results.ner_chunk.result, 
                                                                rxnorm_results.ner_chunk.metadata, 
                                                                rxnorm_results.rxnorm_code.result, 
                                                                rxnorm_results.rxnorm_code.metadata)).alias("cols")) \
                                     .select('subject_id','date', 'relation', 'entity1', 'entity1_begin','entity1_end',  'chunk1', 'entity2', 'entity2_begin', 'entity2_end',
                                             'chunk2', 'confidence', 'rx_text',
                                             F.expr("cols['1']['sentence']").alias("sent_id"),
                                             F.expr("cols['0']").alias("ner_chunk"),
                                             F.expr("cols['1']['entity']").alias("entity"), 
                                             F.expr("cols['2']").alias('rxnorm_code'),
                                             F.expr("cols['3']['all_k_results']").alias("all_codes"),
                                             F.expr("cols['3']['all_k_resolutions']").alias("resolutions"))
rxnorm_result.limit(20).display()

subject_id,date,relation,entity1,entity1_begin,entity1_end,chunk1,entity2,entity2_begin,entity2_end,chunk2,confidence,rx_text,sent_id,ner_chunk,entity,rxnorm_code,all_codes,resolutions
19823,2167-02-25,DRUG-FORM,DRUG,1391,1399,Albuterol,FORM,1414,1423,nebulizers,1.0,Albuterol nebulizers,0,Albuterol nebulizers,,2108226,2108226:::1154602:::370790:::1154603:::2108233:::2108255:::2108276:::745678:::1163444:::2108246:::2108507:::1154987:::2108228:::437534:::1008406:::2108280:::2108463:::1154598:::2108451:::2108291:::2108552:::377207:::1154604:::1156070,albuterol Inhalation Solution:::albuterol Inhalant Product:::albuterol Injectable Solution:::albuterol Injectable Product:::albuterol Inhalation Powder:::metaproterenol Inhalation Solution [Alupent]:::arformoterol Inhalation Solution:::albuterol Metered Dose Inhaler:::levalbuterol Inhalant Product:::albuterol Inhalation Solution [Airet]:::sodium chloride Inhalation Solution [Nebusal]:::atropine Inhalant Product:::albuterol Inhalation Solution [Accuneb]:::albuterol / ambroxol Oral Solution:::Albuterol / Ambroxol:::atropine Inhalation Solution:::isoflurane Inhalation Solution [Attane]:::albuterol / ipratropium Inhalant Product:::ipratropium Inhalation Solution [Atrovent]:::bitolterol Inhalation Solution:::tobramycin Inhalation Solution [Tobi]:::acebutolol Injectable Solution:::albuterol Oral Liquid Product:::budesonide Inhalant Product
19823,2167-02-25,DRUG-FORM,DRUG,1405,1412,Atrovent,FORM,1414,1423,nebulizers,1.0,Atrovent nebulizers,0,Atrovent nebulizers,,2108451,2108451:::1173573:::379767:::1173576:::2463732:::1945043:::1172634:::1171309:::363357:::1184866:::1170108:::1593875:::1941535:::1547449:::1547653:::2108255:::1367507:::2056699:::1795589:::2108507:::1170006:::1170113:::1176096:::2108246,ipratropium Inhalation Solution [Atrovent]:::Atrovent Inhalant Product:::Atrovent Autohaler:::Atrovent Nasal Product:::epoetin alfa Injectable Solution [Retacrit]:::Trelegy Inhalant Product:::Fortical Inhalant Product:::Advair Inhalant Product:::alefacept Injectable Solution [Amevive]:::Suprane Inhalant Product:::Arduan Injectable Product:::Altamist Inhalant Product:::Armonair Inhalant Product:::Nutrilyte Injectable Product:::Arnuity Inhalant Product:::metaproterenol Inhalation Solution [Alupent]:::Adasuve Inhalant Product:::Ajovy Injectable Product:::Nasoflow Inhalant Product:::sodium chloride Inhalation Solution [Nebusal]:::Alupent Inhalant Product:::Arfonad Injectable Product:::Haldol Injectable Product:::albuterol Inhalation Solution [Airet]
19823,2167-02-25,STRENGTH-DRUG,STRENGTH,1539,1543,40 mg,DRUG,1551,1555,Lasix,1.0,Lasix 40 mg,0,Lasix 40 mg,,200809,200809:::617319:::103919:::1871459:::201286:::2556796:::1927858:::1648194:::977916:::352320:::208458:::1738528:::2548764:::2100009:::104437:::152923:::1040034:::1307305:::2167570:::152944:::1734953:::200802:::1312404:::1652065:::897742,furosemide 40 MG Oral Tablet [Lasix]:::atorvastatin 40 MG [Lipitor]:::fluvastatin 40 MG Oral Capsule [Lescol]:::lisdexamfetamine dimesylate 40 MG Chewable Tablet [Vyvanse]:::verapamil hydrochloride 40 MG Oral Tablet [Cordilox]:::relugolix 40 MG:::betrixaban 40 MG [Bevyxxa]:::methylphenidate hydrochloride 40 MG [Aptensio]:::oxymorphone hydrochloride 40 MG [Opana]:::atomoxetine 40 MG Oral Capsule [Strattera]:::tacrine 40 MG Oral Capsule [Cognex]:::paroxetine mesylate 40 MG [Pexeva]:::selinexor 40 MG [Xpovio]:::baloxavir marboxil 40 MG [Xofluza]:::moxisylyte 40 MG Oral Tablet [Opilon]:::simvastatin 40 MG Oral Tablet [Zocor]:::lurasidone hydrochloride 40 MG [Latuda]:::enzalutamide 40 MG [Xtandi]:::rosuvastatin 40 MG [Ezallor]:::stavudine 40 MG Oral Capsule [Zerit]:::24 HR methylphenidate 40 MG Chewable Extended Release Oral Tablet [QuilliChew]:::furosemide 40 MG Oral Tablet [Aluzine]:::regorafenib 40 MG [Stivarga]:::duloxetine 40 MG [Irenka]:::verapamil hydrochloride 40 MG [Calan]
19823,2167-02-25,ROUTE-DRUG,ROUTE,1548,1549,IV,DRUG,1551,1555,Lasix,1.0,Lasix,0,Lasix,,202991,202991:::151963:::2256936:::2256930:::1043720:::224946:::217961:::203783:::261550:::1013021:::606658:::218019:::1364397:::196483,Lasix:::Lasma:::lasmiditan Oral Tablet:::lasmiditan:::LidoWorx:::Lidex:::Laniroif:::Lanoxicaps:::Lanabiotic:::Lidoject:::LidaMantle:::Lidomar:::lasalocid:::Laxoberal
19823,2167-02-25,DRUG-STRENGTH,DRUG,2336,2341,Amaryl,STRENGTH,2343,2348,2.0 mg,1.0,Amaryl 2.0 mg,0,Amaryl 2.0 mg,,901295,901295:::153591:::1310138:::213799:::2399657:::1036818:::998190:::1439900:::905270:::540140:::202295:::104384:::206241:::541719:::105562:::1369727:::1486694:::102530:::104049:::1044591:::999493:::562891:::102577:::885210:::208493,sodium fluoride 2.2 MG [Ludent]:::glimepiride 2 MG Oral Tablet [Amaryl]:::everolimus 2 MG Tablet for Oral Suspension [Afinitor]:::albuterol 2 MG Oral Tablet [Proventil]:::hydrocortisone 2 MG [Alkindi]:::loperamide hydrochloride 2 MG [Arret]:::everolimus 2.5 MG [Afinitor]:::riociguat 2 MG [Adempas]:::trihexyphenidyl hydrochloride 2 MG [Artane]:::sodium fluoride 2.2 MG [Luride]:::estradiol 2 MG Oral Tablet [Climaval]:::ramipril 2.5 MG Oral Capsule [Altace]:::estradiol 2 MG Oral Tablet [Estrace]:::isosorbide dinitrate 2.5 MG [Wesorbide]:::melphalan 2 MG Oral Tablet [Alkeran]:::pomalidomide 2 MG [Pomalyst]:::selegiline hydrochloride 2 MG [Anipryl]:::diazepam 2 MG Oral Capsule [Valium]:::poldine 2 MG Oral Tablet [Nacton]:::tesamorelin 2 MG Injection [Egrifta]:::sodium fluoride 2.2 MG [Epiflur]:::isosorbide dinitrate 2.5 MG [Isordil]:::mazindol 2 MG Oral Tablet [Teronac]:::benztropine mesylate 2 MG [Cogentin]:::thiothixene 2 MG Oral Capsule [Navane]
19823,2167-02-25,DRUG-ROUTE,DRUG,2336,2341,Amaryl,ROUTE,2350,2351,po,1.0,Amaryl,0,Amaryl,,215221,215221:::135820:::151348:::215203:::153592:::152800:::215200:::151345:::131725:::215206:::831421:::204292:::202795,Amilac:::Aventyl:::Amytal:::Amcort:::Amaryl:::Amilamont:::Ambenyl:::Amoram:::Ambien:::Americet:::Amoclan:::Amicar:::Accutane
19823,2167-02-25,DRUG-FREQUENCY,DRUG,2336,2341,Amaryl,FREQUENCY,2353,2355,bid,1.0,Amaryl,0,Amaryl,,215221,215221:::135820:::151348:::215203:::153592:::152800:::215200:::151345:::131725:::215206:::831421:::204292:::202795,Amilac:::Aventyl:::Amytal:::Amcort:::Amaryl:::Amilamont:::Ambenyl:::Amoram:::Ambien:::Americet:::Amoclan:::Amicar:::Accutane
19823,2167-02-25,DRUG-STRENGTH,DRUG,2336,2341,Amaryl,STRENGTH,2372,2379,"1,000 mg",1.0,"Amaryl 1,000 mg",0,"Amaryl 1,000 mg",,106248,106248:::1549223:::1654725:::1298448:::282828:::885214:::1312717:::417424:::409160:::1293504:::1362937:::283160:::1049535:::2279417:::103358:::1484805:::259249:::141895:::1439895:::1235742:::1051005:::1373234:::992531:::452837,hydrocortisone 1 MG/ML Topical Cream:::lidocaine 10 MG/ML Topical Spray:::glycerin 250 MG/ML Topical Cream:::butenafine hydrochloride 10 MG/ML Topical Cream:::zinc pyrithione 10 MG/ML Topical Lotion:::benztropine mesylate 1 MG [Cogentin]:::Phellinus linteus mycelium extract:::salicylic acid 10 MG/ML Topical Cream:::triclocarban 10 MG/ML Topical Foam:::caffeine 100 MG / ergotamine tartrate 1 MG Oral Tablet:::tolnaftate 10 MG/ML Topical Oil:::hydrocortisone 0.1 MG/MG Topical Ointment:::clotrimazole 0.01 MG/MG Topical Gel:::peanut (Arachis hypogaea) allergen-dnfp 1 MG [Palforzia]:::dimethicone 100 MG/ML Topical Cream:::sodium fluoride 18.1 MG/ML Oral Foam:::menthol 0.1 MG/MG Topical Gel:::prazosin 1 MG Oral Tablet [Alphavase]:::riociguat 1 MG [Adempas]:::hydrocortisone acetate 0.01 MG/MG Topical Gel:::camphor 31.8 MG/ML Topical Cream:::colloidal oatmeal 1 MG/ML Topical Cream:::terbinafine hydrochloride 10 MG/ML Topical Solution:::topotecan 1 MG
19823,2167-02-25,DRUG-FREQUENCY,DRUG,2336,2341,Amaryl,FREQUENCY,2384,2386,bid,1.0,Amaryl,0,Amaryl,,215221,215221:::135820:::151348:::215203:::153592:::152800:::215200:::151345:::131725:::215206:::831421:::204292:::202795,Amilac:::Aventyl:::Amytal:::Amcort:::Amaryl:::Amilamont:::Ambenyl:::Amoram:::Ambien:::Americet:::Amoclan:::Amicar:::Accutane
19823,2167-02-25,STRENGTH-DRUG,STRENGTH,2343,2348,2.0 mg,DRUG,2361,2370,Glucophage,1.0,Glucophage 2.0 mg,0,Glucophage 2.0 mg,,865570,865570:::201058:::1855336:::2001263:::205490:::808502:::996825:::199176:::999493:::315321:::308055:::669984:::1485769:::2475853:::2475854:::213220:::153353:::1093331:::800547:::422784:::439908:::540140:::892796:::1044591:::1999009,glipizide 2.5 MG [Glucotrol]:::glyburide 2.5 MG Oral Tablet [Euglucon]:::omeprazole 2.5 MG [Prilosec]:::pitavastatin magnesium 2 MG [Zypitamag]:::bumetanide 2 MG Oral Tablet [Bumex]:::dimethicone 12.5 MG/ML Topical Solution:::methylergonovine maleate 0.2 MG [Methergine]:::hydroquinone 0.02 MG/MG Topical Gel:::sodium fluoride 2.2 MG [Epiflur]:::alseroxylon 2 MG:::alseroxylon 2 MG Oral Tablet:::glyburide 2.5 MG [Glyburase]:::treprostinil 2.5 MG [Orenitram]:::vericiguat 2.5 MG:::vericiguat 2.5 MG Oral Tablet:::repaglinide 2 MG Oral Tablet [Prandin]:::zolmitriptan 2.5 MG Oral Tablet [Zomig]:::niacinamide 20 MG/ML Topical Cream:::glycine 2.2 MG/ML:::guaiacol 2.5 MG/ML Oral Suspension:::guaiacol 2.5 MG/ML:::sodium fluoride 2.2 MG [Luride]:::benzocaine 0.2 MG/MG [Orabase]:::tesamorelin 2 MG Injection [Egrifta]:::angiotensin II 2.5 MG/ML [Giapreza]


In [0]:
rxnorm_result = rxnorm_result.withColumn('all_codes', F.split(F.col('all_codes'), ':::'))\
                             .withColumn('resolutions', F.split(F.col('resolutions'), ':::'))
rxnorm_result.limit(20).display()

subject_id,date,relation,entity1,entity1_begin,entity1_end,chunk1,entity2,entity2_begin,entity2_end,chunk2,confidence,rx_text,sent_id,ner_chunk,entity,rxnorm_code,all_codes,resolutions
19823,2167-02-25,DRUG-FORM,DRUG,1391,1399,Albuterol,FORM,1414,1423,nebulizers,1.0,Albuterol nebulizers,0,Albuterol nebulizers,,2108226,"List(2108226, 1154602, 370790, 1154603, 2108233, 2108255, 2108276, 745678, 1163444, 2108246, 2108507, 1154987, 2108228, 437534, 1008406, 2108280, 2108463, 1154598, 2108451, 2108291, 2108552, 377207, 1154604, 1156070)","List(albuterol Inhalation Solution, albuterol Inhalant Product, albuterol Injectable Solution, albuterol Injectable Product, albuterol Inhalation Powder, metaproterenol Inhalation Solution [Alupent], arformoterol Inhalation Solution, albuterol Metered Dose Inhaler, levalbuterol Inhalant Product, albuterol Inhalation Solution [Airet], sodium chloride Inhalation Solution [Nebusal], atropine Inhalant Product, albuterol Inhalation Solution [Accuneb], albuterol / ambroxol Oral Solution, Albuterol / Ambroxol, atropine Inhalation Solution, isoflurane Inhalation Solution [Attane], albuterol / ipratropium Inhalant Product, ipratropium Inhalation Solution [Atrovent], bitolterol Inhalation Solution, tobramycin Inhalation Solution [Tobi], acebutolol Injectable Solution, albuterol Oral Liquid Product, budesonide Inhalant Product)"
19823,2167-02-25,DRUG-FORM,DRUG,1405,1412,Atrovent,FORM,1414,1423,nebulizers,1.0,Atrovent nebulizers,0,Atrovent nebulizers,,2108451,"List(2108451, 1173573, 379767, 1173576, 2463732, 1945043, 1172634, 1171309, 363357, 1184866, 1170108, 1593875, 1941535, 1547449, 1547653, 2108255, 1367507, 2056699, 1795589, 2108507, 1170006, 1170113, 1176096, 2108246)","List(ipratropium Inhalation Solution [Atrovent], Atrovent Inhalant Product, Atrovent Autohaler, Atrovent Nasal Product, epoetin alfa Injectable Solution [Retacrit], Trelegy Inhalant Product, Fortical Inhalant Product, Advair Inhalant Product, alefacept Injectable Solution [Amevive], Suprane Inhalant Product, Arduan Injectable Product, Altamist Inhalant Product, Armonair Inhalant Product, Nutrilyte Injectable Product, Arnuity Inhalant Product, metaproterenol Inhalation Solution [Alupent], Adasuve Inhalant Product, Ajovy Injectable Product, Nasoflow Inhalant Product, sodium chloride Inhalation Solution [Nebusal], Alupent Inhalant Product, Arfonad Injectable Product, Haldol Injectable Product, albuterol Inhalation Solution [Airet])"
19823,2167-02-25,STRENGTH-DRUG,STRENGTH,1539,1543,40 mg,DRUG,1551,1555,Lasix,1.0,Lasix 40 mg,0,Lasix 40 mg,,200809,"List(200809, 617319, 103919, 1871459, 201286, 2556796, 1927858, 1648194, 977916, 352320, 208458, 1738528, 2548764, 2100009, 104437, 152923, 1040034, 1307305, 2167570, 152944, 1734953, 200802, 1312404, 1652065, 897742)","List(furosemide 40 MG Oral Tablet [Lasix], atorvastatin 40 MG [Lipitor], fluvastatin 40 MG Oral Capsule [Lescol], lisdexamfetamine dimesylate 40 MG Chewable Tablet [Vyvanse], verapamil hydrochloride 40 MG Oral Tablet [Cordilox], relugolix 40 MG, betrixaban 40 MG [Bevyxxa], methylphenidate hydrochloride 40 MG [Aptensio], oxymorphone hydrochloride 40 MG [Opana], atomoxetine 40 MG Oral Capsule [Strattera], tacrine 40 MG Oral Capsule [Cognex], paroxetine mesylate 40 MG [Pexeva], selinexor 40 MG [Xpovio], baloxavir marboxil 40 MG [Xofluza], moxisylyte 40 MG Oral Tablet [Opilon], simvastatin 40 MG Oral Tablet [Zocor], lurasidone hydrochloride 40 MG [Latuda], enzalutamide 40 MG [Xtandi], rosuvastatin 40 MG [Ezallor], stavudine 40 MG Oral Capsule [Zerit], 24 HR methylphenidate 40 MG Chewable Extended Release Oral Tablet [QuilliChew], furosemide 40 MG Oral Tablet [Aluzine], regorafenib 40 MG [Stivarga], duloxetine 40 MG [Irenka], verapamil hydrochloride 40 MG [Calan])"
19823,2167-02-25,ROUTE-DRUG,ROUTE,1548,1549,IV,DRUG,1551,1555,Lasix,1.0,Lasix,0,Lasix,,202991,"List(202991, 151963, 2256936, 2256930, 1043720, 224946, 217961, 203783, 261550, 1013021, 606658, 218019, 1364397, 196483)","List(Lasix, Lasma, lasmiditan Oral Tablet, lasmiditan, LidoWorx, Lidex, Laniroif, Lanoxicaps, Lanabiotic, Lidoject, LidaMantle, Lidomar, lasalocid, Laxoberal)"
19823,2167-02-25,DRUG-STRENGTH,DRUG,2336,2341,Amaryl,STRENGTH,2343,2348,2.0 mg,1.0,Amaryl 2.0 mg,0,Amaryl 2.0 mg,,901295,"List(901295, 153591, 1310138, 213799, 2399657, 1036818, 998190, 1439900, 905270, 540140, 202295, 104384, 206241, 541719, 105562, 1369727, 1486694, 102530, 104049, 1044591, 999493, 562891, 102577, 885210, 208493)","List(sodium fluoride 2.2 MG [Ludent], glimepiride 2 MG Oral Tablet [Amaryl], everolimus 2 MG Tablet for Oral Suspension [Afinitor], albuterol 2 MG Oral Tablet [Proventil], hydrocortisone 2 MG [Alkindi], loperamide hydrochloride 2 MG [Arret], everolimus 2.5 MG [Afinitor], riociguat 2 MG [Adempas], trihexyphenidyl hydrochloride 2 MG [Artane], sodium fluoride 2.2 MG [Luride], estradiol 2 MG Oral Tablet [Climaval], ramipril 2.5 MG Oral Capsule [Altace], estradiol 2 MG Oral Tablet [Estrace], isosorbide dinitrate 2.5 MG [Wesorbide], melphalan 2 MG Oral Tablet [Alkeran], pomalidomide 2 MG [Pomalyst], selegiline hydrochloride 2 MG [Anipryl], diazepam 2 MG Oral Capsule [Valium], poldine 2 MG Oral Tablet [Nacton], tesamorelin 2 MG Injection [Egrifta], sodium fluoride 2.2 MG [Epiflur], isosorbide dinitrate 2.5 MG [Isordil], mazindol 2 MG Oral Tablet [Teronac], benztropine mesylate 2 MG [Cogentin], thiothixene 2 MG Oral Capsule [Navane])"
19823,2167-02-25,DRUG-ROUTE,DRUG,2336,2341,Amaryl,ROUTE,2350,2351,po,1.0,Amaryl,0,Amaryl,,215221,"List(215221, 135820, 151348, 215203, 153592, 152800, 215200, 151345, 131725, 215206, 831421, 204292, 202795)","List(Amilac, Aventyl, Amytal, Amcort, Amaryl, Amilamont, Ambenyl, Amoram, Ambien, Americet, Amoclan, Amicar, Accutane)"
19823,2167-02-25,DRUG-FREQUENCY,DRUG,2336,2341,Amaryl,FREQUENCY,2353,2355,bid,1.0,Amaryl,0,Amaryl,,215221,"List(215221, 135820, 151348, 215203, 153592, 152800, 215200, 151345, 131725, 215206, 831421, 204292, 202795)","List(Amilac, Aventyl, Amytal, Amcort, Amaryl, Amilamont, Ambenyl, Amoram, Ambien, Americet, Amoclan, Amicar, Accutane)"
19823,2167-02-25,DRUG-STRENGTH,DRUG,2336,2341,Amaryl,STRENGTH,2372,2379,"1,000 mg",1.0,"Amaryl 1,000 mg",0,"Amaryl 1,000 mg",,106248,"List(106248, 1549223, 1654725, 1298448, 282828, 885214, 1312717, 417424, 409160, 1293504, 1362937, 283160, 1049535, 2279417, 103358, 1484805, 259249, 141895, 1439895, 1235742, 1051005, 1373234, 992531, 452837)","List(hydrocortisone 1 MG/ML Topical Cream, lidocaine 10 MG/ML Topical Spray, glycerin 250 MG/ML Topical Cream, butenafine hydrochloride 10 MG/ML Topical Cream, zinc pyrithione 10 MG/ML Topical Lotion, benztropine mesylate 1 MG [Cogentin], Phellinus linteus mycelium extract, salicylic acid 10 MG/ML Topical Cream, triclocarban 10 MG/ML Topical Foam, caffeine 100 MG / ergotamine tartrate 1 MG Oral Tablet, tolnaftate 10 MG/ML Topical Oil, hydrocortisone 0.1 MG/MG Topical Ointment, clotrimazole 0.01 MG/MG Topical Gel, peanut (Arachis hypogaea) allergen-dnfp 1 MG [Palforzia], dimethicone 100 MG/ML Topical Cream, sodium fluoride 18.1 MG/ML Oral Foam, menthol 0.1 MG/MG Topical Gel, prazosin 1 MG Oral Tablet [Alphavase], riociguat 1 MG [Adempas], hydrocortisone acetate 0.01 MG/MG Topical Gel, camphor 31.8 MG/ML Topical Cream, colloidal oatmeal 1 MG/ML Topical Cream, terbinafine hydrochloride 10 MG/ML Topical Solution, topotecan 1 MG)"
19823,2167-02-25,DRUG-FREQUENCY,DRUG,2336,2341,Amaryl,FREQUENCY,2384,2386,bid,1.0,Amaryl,0,Amaryl,,215221,"List(215221, 135820, 151348, 215203, 153592, 152800, 215200, 151345, 131725, 215206, 831421, 204292, 202795)","List(Amilac, Aventyl, Amytal, Amcort, Amaryl, Amilamont, Ambenyl, Amoram, Ambien, Americet, Amoclan, Amicar, Accutane)"
19823,2167-02-25,STRENGTH-DRUG,STRENGTH,2343,2348,2.0 mg,DRUG,2361,2370,Glucophage,1.0,Glucophage 2.0 mg,0,Glucophage 2.0 mg,,865570,"List(865570, 201058, 1855336, 2001263, 205490, 808502, 996825, 199176, 999493, 315321, 308055, 669984, 1485769, 2475853, 2475854, 213220, 153353, 1093331, 800547, 422784, 439908, 540140, 892796, 1044591, 1999009)","List(glipizide 2.5 MG [Glucotrol], glyburide 2.5 MG Oral Tablet [Euglucon], omeprazole 2.5 MG [Prilosec], pitavastatin magnesium 2 MG [Zypitamag], bumetanide 2 MG Oral Tablet [Bumex], dimethicone 12.5 MG/ML Topical Solution, methylergonovine maleate 0.2 MG [Methergine], hydroquinone 0.02 MG/MG Topical Gel, sodium fluoride 2.2 MG [Epiflur], alseroxylon 2 MG, alseroxylon 2 MG Oral Tablet, glyburide 2.5 MG [Glyburase], treprostinil 2.5 MG [Orenitram], vericiguat 2.5 MG, vericiguat 2.5 MG Oral Tablet, repaglinide 2 MG Oral Tablet [Prandin], zolmitriptan 2.5 MG Oral Tablet [Zomig], niacinamide 20 MG/ML Topical Cream, glycine 2.2 MG/ML, guaiacol 2.5 MG/ML Oral Suspension, guaiacol 2.5 MG/ML, sodium fluoride 2.2 MG [Luride], benzocaine 0.2 MG/MG [Orabase], tesamorelin 2 MG Injection [Egrifta], angiotensin II 2.5 MG/ML [Giapreza])"
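
The split above turns each `:::`-delimited string into an array, so position *i* of `all_codes` lines up with position *i* of `resolutions`. The following is a minimal plain-Python sketch of that alignment, not part of the original pipeline. The helper name `pair_candidates` is hypothetical, and the sample strings are shortened from the Lasix `ROUTE-DRUG` row shown above.

```python
SEPARATOR = ":::"  # delimiter used in the all_codes / resolutions columns


def pair_candidates(all_codes, resolutions, sep=SEPARATOR):
    """Split both delimited strings and pair each code with its resolution."""
    codes = all_codes.split(sep)
    names = resolutions.split(sep)
    if len(codes) != len(names):
        raise ValueError("all_codes and resolutions have different lengths")
    return list(zip(codes, names))


# Shortened sample taken from the ROUTE-DRUG / Lasix row.
pairs = pair_candidates(
    "202991:::151963:::2256936",
    "Lasix:::Lasma:::lasmiditan Oral Tablet",
)
for code, name in pairs:
    print(code, name)
```

Note that `F.split` interprets its pattern as a regular expression. `:::` contains no regex metacharacters, so it behaves as a literal delimiter here.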


### Split Resolutions to Resolution Drug and Write Results to Golden Delta Layer

In [0]:
pd_rxnorm_result = rxnorm_result.toPandas()
pd_rxnorm_result

Unnamed: 0,subject_id,date,relation,entity1,entity1_begin,entity1_end,chunk1,entity2,entity2_begin,entity2_end,chunk2,confidence,rx_text,sent_id,ner_chunk,entity,rxnorm_code,all_codes,resolutions
0,19823,2167-02-25,DRUG-FORM,DRUG,1391,1399,Albuterol,FORM,1414,1423,nebulizers,1.0,Albuterol nebulizers,0,Albuterol nebulizers,,2108226,"[2108226, 1154602, 370790, 1154603, 2108233, 2108255, 2108276, 745678, 1163444, 2108246, 2108507...","[albuterol Inhalation Solution, albuterol Inhalant Product, albuterol Injectable Solution, albut..."
1,19823,2167-02-25,DRUG-FORM,DRUG,1405,1412,Atrovent,FORM,1414,1423,nebulizers,1.0,Atrovent nebulizers,0,Atrovent nebulizers,,2108451,"[2108451, 1173573, 379767, 1173576, 2463732, 1945043, 1172634, 1171309, 363357, 1184866, 1170108...","[ipratropium Inhalation Solution [Atrovent], Atrovent Inhalant Product, Atrovent Autohaler, Atro..."
2,19823,2167-02-25,STRENGTH-DRUG,STRENGTH,1539,1543,40 mg,DRUG,1551,1555,Lasix,1.0,Lasix 40 mg,0,Lasix 40 mg,,200809,"[200809, 617319, 103919, 1871459, 201286, 2556796, 1927858, 1648194, 977916, 352320, 208458, 173...","[furosemide 40 MG Oral Tablet [Lasix], atorvastatin 40 MG [Lipitor], fluvastatin 40 MG Oral Caps..."
3,19823,2167-02-25,ROUTE-DRUG,ROUTE,1548,1549,IV,DRUG,1551,1555,Lasix,1.0,Lasix,0,Lasix,,202991,"[202991, 151963, 2256936, 2256930, 1043720, 224946, 217961, 203783, 261550, 1013021, 606658, 218...","[Lasix, Lasma, lasmiditan Oral Tablet, lasmiditan, LidoWorx, Lidex, Laniroif, Lanoxicaps, Lanabi..."
4,19823,2167-02-25,DRUG-STRENGTH,DRUG,2336,2341,Amaryl,STRENGTH,2343,2348,2.0 mg,1.0,Amaryl 2.0 mg,0,Amaryl 2.0 mg,,901295,"[901295, 153591, 1310138, 213799, 2399657, 1036818, 998190, 1439900, 905270, 540140, 202295, 104...","[sodium fluoride 2.2 MG [Ludent], glimepiride 2 MG Oral Tablet [Amaryl], everolimus 2 MG Tablet ..."
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2777,70004,2182-08-05,ROUTE-DRUG,ROUTE,545,546,IV,DRUG,548,555,contrast,1.0,contrast,0,contrast,,1592743,"[1592743, 202939, 23202, 66977, 2262255, 543455, 705766, 1436150, 744843, 216795, 1946584, 65874...","[Ofev, Dixarit, Dilor, Vascor, Scenesse, Durad, Appearex, Visco-Gel, Isentress, Duratest, Xhance..."
2778,70004,2182-08-05,DOSAGE-DRUG,DOSAGE,942,946,20 cc,DRUG,951,959,Magnevist,1.0,Magnevist 20 cc,0,Magnevist 20 cc,,208456,"[208456, 152893, 2286257, 617317, 664142, 1119558, 596927, 429343, 440810, 571777, 351387, 79386...","[tacrine 20 MG Oral Capsule [Cognex], sertindole 20 MG Oral Tablet [Serdolect], dexamethasone 20..."
2779,70004,2182-08-05,DRUG-ROUTE,DRUG,951,959,Magnevist,ROUTE,961,971,intravenous,1.0,Magnevist,0,Magnevist,,196214,"[196214, 2475179, 406156, 991881, 6574, 218204, 218250, 218167, 797858, 1043619, 152000, 218245,...","[Magnesiocard, magnesite, MagneBind, Maracyn Plus, magnesium, Maoson, Maxaquin, Magagel Plus, Ma..."
2780,70004,2182-08-16,ROUTE-DRUG,ROUTE,475,476,IV,DRUG,478,485,CONTRAST,1.0,CONTRAST,0,CONTRAST,,799044,"[799044, 153381, 385716, 216281, 668395, 1013644, 216253, 284702, 1188463, 2264346, 323984, 2158...","[Cotab A, Cozaar-Comp, Cesamet, Crolom, Certuss, Cidaflex, Cosopt, Colocort, Citravet, belladonn..."


In [0]:
# Keep the top-ranked RxNorm resolution (first list element) as the resolved drug name.
pd_rxnorm_result['drug_resolution'] = pd_rxnorm_result['resolutions'].apply(lambda x: x[0])
# Normalize the resolved name and both relation chunks to lowercase.
pd_rxnorm_result['drug_resolution'] = pd_rxnorm_result['drug_resolution'].str.lower()
pd_rxnorm_result['chunk1']          = pd_rxnorm_result['chunk1'].str.lower()
pd_rxnorm_result['chunk2']          = pd_rxnorm_result['chunk2'].str.lower()
pd_rxnorm_result.head(4)

Unnamed: 0,subject_id,date,relation,entity1,entity1_begin,entity1_end,chunk1,entity2,entity2_begin,entity2_end,chunk2,confidence,rx_text,sent_id,ner_chunk,entity,rxnorm_code,all_codes,resolutions,drug_resolution
0,19823,2167-02-25,DRUG-FORM,DRUG,1391,1399,albuterol,FORM,1414,1423,nebulizers,1.0,Albuterol nebulizers,0,Albuterol nebulizers,,2108226,"[2108226, 1154602, 370790, 1154603, 2108233, 2108255, 2108276, 745678, 1163444, 2108246, 2108507...","[albuterol Inhalation Solution, albuterol Inhalant Product, albuterol Injectable Solution, albut...",albuterol inhalation solution
1,19823,2167-02-25,DRUG-FORM,DRUG,1405,1412,atrovent,FORM,1414,1423,nebulizers,1.0,Atrovent nebulizers,0,Atrovent nebulizers,,2108451,"[2108451, 1173573, 379767, 1173576, 2463732, 1945043, 1172634, 1171309, 363357, 1184866, 1170108...","[ipratropium Inhalation Solution [Atrovent], Atrovent Inhalant Product, Atrovent Autohaler, Atro...",ipratropium inhalation solution [atrovent]
2,19823,2167-02-25,STRENGTH-DRUG,STRENGTH,1539,1543,40 mg,DRUG,1551,1555,lasix,1.0,Lasix 40 mg,0,Lasix 40 mg,,200809,"[200809, 617319, 103919, 1871459, 201286, 2556796, 1927858, 1648194, 977916, 352320, 208458, 173...","[furosemide 40 MG Oral Tablet [Lasix], atorvastatin 40 MG [Lipitor], fluvastatin 40 MG Oral Caps...",furosemide 40 mg oral tablet [lasix]
3,19823,2167-02-25,ROUTE-DRUG,ROUTE,1548,1549,iv,DRUG,1551,1555,lasix,1.0,Lasix,0,Lasix,,202991,"[202991, 151963, 2256936, 2256930, 1043720, 224946, 217961, 203783, 261550, 1013021, 606658, 218...","[Lasix, Lasma, lasmiditan Oral Tablet, lasmiditan, LidoWorx, Lidex, Laniroif, Lanoxicaps, Lanabi...",lasix
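
Taking `x[0]` assumes every row has at least one candidate and that the top-ranked candidate is the intended drug. In the sample output, the `Amaryl 2.0 mg` row lists `sodium fluoride 2.2 MG [Ludent]` first and `glimepiride 2 MG Oral Tablet [Amaryl]` second. The sketch below is one possible guard, not the notebook's method: the helper `pick_resolution` and its brand-matching heuristic are assumptions made for illustration.

```python
def pick_resolution(resolutions, drug_chunk=None):
    """Return a lowercased resolution, or None when there are no candidates.

    Hypothetical heuristic: if drug_chunk is given, prefer the first candidate
    that carries it as a bracketed brand name; otherwise fall back to the
    top-ranked candidate, which is what the notebook does with x[0].
    """
    if resolutions is None or len(resolutions) == 0:
        return None
    if drug_chunk:
        needle = f"[{drug_chunk.lower()}]"
        for candidate in resolutions:
            if needle in candidate.lower():
                return candidate.lower()
    return resolutions[0].lower()


# First three candidates from the "Amaryl 2.0 mg" row.
candidates = [
    "sodium fluoride 2.2 MG [Ludent]",
    "glimepiride 2 MG Oral Tablet [Amaryl]",
    "everolimus 2 MG Tablet for Oral Suspension [Afinitor]",
]
print(pick_resolution(candidates))            # top-ranked candidate
print(pick_resolution(candidates, "Amaryl"))  # brand-matched candidate
print(pick_resolution([]))                    # no candidates
```

Whether a brand match should override the resolver's ranking depends on the cohort question; the fallback keeps the original behaviour when no match is found.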


In [0]:
outname = 'posology_RE_rxnorm_w_drug_resolutions.csv'
outdir = '/FileStore/HLS/jsl_kg/data/'

In [0]:
pd_rxnorm_result.to_csv(f'/dbfs{outdir+outname}', index=False, encoding="utf-8")
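
The output path is built by concatenating `/dbfs`, `outdir`, and `outname`, which only works while `outdir` keeps its leading and trailing slashes. As a small sketch of a more forgiving alternative (the helper name `dbfs_fuse_path` and its `mount` parameter are hypothetical), the pieces can be joined explicitly:

```python
import posixpath


def dbfs_fuse_path(outdir, outname, mount="/dbfs"):
    """Join the local DBFS mount, a DBFS directory, and a file name."""
    # Strip slashes so the result is the same with or without them on outdir.
    return posixpath.join(mount, outdir.strip("/"), outname)


path = dbfs_fuse_path("/FileStore/HLS/jsl_kg/data/",
                      "posology_RE_rxnorm_w_drug_resolutions.csv")
print(path)
```

The returned string can be passed to `to_csv` in place of the f-string; it yields the same path as the cell above for the current values of `outdir` and `outname`.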

## NER JSL Slim

The model card for `ner_jsl_slim` is available [here](https://nlp.johnsnowlabs.com/2021/08/13/ner_jsl_slim_en.html).

In [0]:
documentAssembler = DocumentAssembler()\
      .setInputCol("text")\
      .setOutputCol("document")

sentenceDetector = SentenceDetector()\
      .setInputCols(["document"])\
      .setOutputCol("sentence")\
      .setCustomBounds([r"\|"])

tokenizer = Tokenizer()\
      .setInputCols(["sentence"])\
      .setOutputCol("token")

word_embeddings = WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models")\
      .setInputCols(["sentence", "token"])\
      .setOutputCol("embeddings")

jsl_ner = MedicalNerModel.pretrained("ner_jsl_slim", "en", "clinical/models") \
      .setInputCols(["sentence", "token", "embeddings"]) \
      .setOutputCol("ner")

jsl_converter = NerConverter() \
      .setInputCols(["sentence", "token", "ner"]) \
      .setOutputCol("ner_chunk")\
      .setWhiteList(['Symptom','Body_Part', 'Procedure', 'Disease_Syndrome_Disorder', 'Test'])

ner_pipeline = Pipeline(
    stages = [
        documentAssembler,
        sentenceDetector,
        tokenizer,
        word_embeddings,
        jsl_ner,
        jsl_converter
        ])

data_ner = spark.createDataFrame([[""]]).toDF("text")
model = ner_pipeline.fit(data_ner)
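
In the pipeline above, `NerConverter` merges token-level IOB tags into entity chunks, and `setWhiteList` keeps only the chunks whose label is listed. The following plain-Python sketch illustrates that idea only; it is not Spark NLP's implementation. The function `iob_to_chunks` is hypothetical, and the tokens and tags are made up for the example.

```python
def iob_to_chunks(tokens, tags, whitelist=None):
    """Merge IOB-tagged tokens into (text, label) chunks, keeping whitelisted labels."""
    chunks = []               # finished (text, label) pairs
    words, label = [], None   # chunk currently being built
    # A trailing "O" sentinel flushes the last open chunk.
    for token, tag in zip(list(tokens) + [""], list(tags) + ["O"]):
        prefix, _, entity = tag.partition("-")
        if prefix == "I" and entity == label:
            words.append(token)  # continue the open chunk
            continue
        if words and (whitelist is None or label in whitelist):
            chunks.append((" ".join(words), label))
        # Only a B- tag opens a new chunk; a stray I- tag is dropped.
        words, label = ([token], entity) if prefix == "B" else ([], None)
    return chunks


tokens = ["Shortness", "of", "breath", ",", "cough", "on", "Lasix"]
tags = ["B-Symptom", "I-Symptom", "I-Symptom", "O", "B-Symptom", "O", "B-Drug"]
whitelist = {"Symptom", "Body_Part", "Procedure", "Disease_Syndrome_Disorder", "Test"}
chunks = iob_to_chunks(tokens, tags, whitelist)
print(chunks)
```

With this whitelist the `Drug` chunk is discarded, which mirrors why the NER output in this section contains no drug entities.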

In [0]:
results = model.transform(df)
results.printSchema()

In [0]:
result_df = results.select('subject_id','date',
                           F.explode(F.arrays_zip(results.ner_chunk.result, results.ner_chunk.begin, results.ner_chunk.end, results.ner_chunk.metadata)).alias("cols")) \
                    .select('subject_id','date',
                            F.expr("cols['3']['sentence']").alias("sentence_id"),
                            F.expr("cols['0']").alias("chunk"),
                            F.expr("cols['1']").alias("begin"),
                            F.expr("cols['2']").alias("end"),
                            F.expr("cols['3']['entity']").alias("ner_label"))\
                    .filter("ner_label!='O'")
result_df.limit(20).display()

subject_id,date,sentence_id,chunk,begin,end,ner_label
19823,2167-02-25,0,Shortness of breath,178,196,Symptom
19823,2167-02-25,0,cough,199,203,Symptom
19823,2167-02-25,1,diabetes type II,345,360,Disease_Syndrome_Disorder
19823,2167-02-25,1,congestive heart failure,363,386,Disease_Syndrome_Disorder
19823,2167-02-25,1,hypertension,413,424,Disease_Syndrome_Disorder
19823,2167-02-25,2,progressively worsening shortness of breath,477,519,Symptom
19823,2167-02-25,2,dyspnea on exertion,525,543,Symptom
19823,2167-02-25,2,wheezing,546,553,Symptom
19823,2167-02-25,2,nonproductive cough,556,574,Symptom
19823,2167-02-25,3,nausea,666,671,Symptom
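
The `arrays_zip` and `explode` step turns the parallel annotation arrays (`result`, `begin`, `end`, `metadata`) into one row per chunk. A plain-Python sketch of the same reshaping is shown below, assuming a note is represented as a dict of scalars and equal-position lists. The helper `explode_zipped` is hypothetical, and the sample values come from the first two rows of the output above.

```python
from itertools import zip_longest


def explode_zipped(row, array_cols):
    """Yield one flat dict per position across the parallel array columns."""
    scalars = {k: v for k, v in row.items() if k not in array_cols}
    columns = [row[c] for c in array_cols]
    # zip_longest pads shorter arrays with None, similar to how Spark's
    # arrays_zip fills missing elements with null.
    for values in zip_longest(*columns):
        yield {**scalars, **dict(zip(array_cols, values))}


note = {
    "subject_id": 19823,
    "date": "2167-02-25",
    "chunk": ["Shortness of breath", "cough"],
    "begin": [178, 199],
    "end": [196, 203],
    "ner_label": ["Symptom", "Symptom"],
}
rows = list(explode_zipped(note, ["chunk", "begin", "end", "ner_label"]))
for r in rows:
    print(r)
```

Each output dict repeats the scalar columns (`subject_id`, `date`) and carries one element from every array, which is the shape of `result_df`.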


In [0]:
pd_result = result_df.toPandas()
pd_result

Unnamed: 0,subject_id,date,sentence_id,chunk,begin,end,ner_label
0,19823,2167-02-25,0,Shortness of breath,178,196,Symptom
1,19823,2167-02-25,0,cough,199,203,Symptom
2,19823,2167-02-25,1,diabetes type II,345,360,Disease_Syndrome_Disorder
3,19823,2167-02-25,1,congestive heart failure,363,386,Disease_Syndrome_Disorder
4,19823,2167-02-25,1,hypertension,413,424,Disease_Syndrome_Disorder
...,...,...,...,...,...,...,...
16516,70004,2182-08-16,12,cerebellum,1812,1821,Body_Part
16517,70004,2182-08-16,13,Multilevel degenerative changes,1860,1890,Symptom
16518,70004,2182-08-16,13,uncovertebral joint hypertrophy,1897,1927,Disease_Syndrome_Disorder
16519,70004,2182-08-16,14,cervical spine,1992,2005,Body_Part


In [0]:
outname = 'ner_jsl_slim_results.csv'
outdir = '/FileStore/HLS/jsl_kg/data/'
pd_result.to_csv(f'/dbfs{outdir+outname}', index=False, encoding="utf-8")

## License
Copyright [2021] the Notebook Authors. The source in this notebook is provided subject to the [Apache 2.0 License](https://spdx.org/licenses/Apache-2.0.html). All included or referenced third-party libraries are subject to the licenses set forth below.

|Library Name|Library License|Library License URL|Library Source URL|
| :-: | :-:| :-: | :-:|
|Pandas |BSD 3-Clause License| https://github.com/pandas-dev/pandas/blob/master/LICENSE | https://github.com/pandas-dev/pandas|
|Numpy |BSD 3-Clause License| https://github.com/numpy/numpy/blob/main/LICENSE.txt | https://github.com/numpy/numpy|
|Apache Spark |Apache License 2.0| https://github.com/apache/spark/blob/master/LICENSE | https://github.com/apache/spark/tree/master/python/pyspark|
|BeautifulSoup|MIT License|https://www.crummy.com/software/BeautifulSoup/#Download|https://www.crummy.com/software/BeautifulSoup/bs4/download/|
|Requests|Apache License 2.0|https://github.com/psf/requests/blob/main/LICENSE|https://github.com/psf/requests|
|Spark NLP Display|Apache License 2.0|https://github.com/JohnSnowLabs/spark-nlp-display/blob/main/LICENSE|https://github.com/JohnSnowLabs/spark-nlp-display|
|Spark NLP |Apache License 2.0| https://github.com/JohnSnowLabs/spark-nlp/blob/master/LICENSE | https://github.com/JohnSnowLabs/spark-nlp|
|Spark NLP for Healthcare|[Proprietary license - John Snow Labs Inc.](https://www.johnsnowlabs.com/spark-nlp-health/) |NA|NA|




|Author|
|-|
|Databricks Inc.|
|John Snow Labs Inc.|

## Disclaimers
Databricks Inc. (“Databricks”) does not dispense medical, diagnosis, or treatment advice. This Solution Accelerator (“tool”) is for informational purposes only and may not be used as a substitute for professional medical advice, treatment, or diagnosis. This tool may not be used within Databricks to process Protected Health Information (“PHI”) as defined in the Health Insurance Portability and Accountability Act of 1996, unless you have executed with Databricks a contract that allows for processing PHI, an accompanying Business Associate Agreement (BAA), and are running this notebook within a HIPAA Account.  Please note that if you run this notebook within Azure Databricks, your contract with Microsoft applies.