## Extract Diagnoses, Symptoms, and Procedures using GPT-4o

**Goal of this notebook:**  
Use AI to extract only the important components of the MIMIC-III NOTEEVENTS `TEXT` field. Removing details such as admission dates and medication dosages reduces the token count and simplifies the text, while keeping the relevant context and the original sentences. Both would be lost with a parser such as [Azure Text Analytics for Health](./01_az_text_analytics.ipynb) or the [mednlp](https://github.com/plandes/mednlp) Python package.

The extracted data will be used to:  
- Simplify the note text before attempting to assign medical codes, which is expected to improve classification results.

**Requirements**  
- Set up an Azure OpenAI resource with a GPT-4o deployment, and make sure the [.env](./.env.sample) file is populated and up to date.
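
As a quick sanity check on the `.env` requirement, the sketch below lists which settings are still unset. The three variable names are the ones read by the cells in this notebook; the helper `missing_env_vars` is a hypothetical name introduced here for illustration, and it assumes `load_dotenv` has already been called.

```python
import os

# Variable names read elsewhere in this notebook via os.getenv(...).
REQUIRED_VARS = (
    "AZURE_OPENAI_BASE",
    "AZURE_OPENAI_KEY",
    "AOAI_MAIN_DEPLOYMENT_NAME",
)


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the required names that are unset or blank (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in required if not (env.get(name) or "").strip()]


# Report anything that still needs to be filled in.
missing = missing_env_vars()
if missing:
    print(f"Missing settings in .env: {', '.join(missing)}")
```

Running this right after `load_dotenv(find_dotenv(), override=True)` surfaces a missing key before the first API call fails.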

In [35]:
from src.prepare_mimic_iii import transform_data
from dotenv import load_dotenv, find_dotenv
from openai import AzureOpenAI
from textwrap import dedent

import pandas as pd
import os
import tiktoken

load_dotenv(find_dotenv(), override=True)

pd.set_option('display.max_colwidth', None)

#### Prepare Data

In [27]:
# Get medical coding data, take a small subsample

# df = transform_data("data/") # Only re-run if change in preparation logic
df = pd.read_csv("data/joined/dataset_single_001_088.csv.gz").sample(5, replace=False, random_state=1234)
print(df.shape)
display(df.dtypes)
display(df.head(5))

(5, 3)


HADM_ID       int64
TEXT         object
ICD9_CODE    object
dtype: object

Unnamed: 0,HADM_ID,TEXT,ICD9_CODE
4808,199105,"['Admission Date Discharge Date Date Birth Sex Service SURGERY Allergies Sulfonamides / Lipitor / Naprosyn / Penicillins / Amoxicillin / Chocolate Flavor / Crestor / Morphine / Ativan AttendingFirst Name LF Chief Complaint Respiratory distress Major Surgical Invasive Procedure None History Present Illness Mr . Known lastname year old male complex medical history originally admitted hospital Month strangulated ventral hernia . underwent small bowel resections x , prolonged recovery included placement tracheostomy tube , discharged long term rehabiliation facility . subsequently transferred back Hospital secondary respiratory distress/ ? pneumonia well management volume overload . Past Medical History . CAD s/p LAD OM stent , s/p RCA PTCA . DM . HTN . PVD s/p bilat LE bypass surgeries Dr.Last Name STitle . CRI baseline Cr . . . cataracts . gout . BPH . Abd hernia . s/p CCY , ex lap w/abd hernia resulting . Incarcerated ventral hernia containing strangulated small bowel requiring small bowel resection . complicated leak leading operation . Social History Worked head Doctor Last Name . Hx Etoh abuse x yrs , quit . ppy tob . Multiple family memebrs live nearby Family History Fa died secondary colon ca Mo died secondary PNA Siblings Etoh abuse , HTN Physical Exam . HR BP / RR SpO % Alert , apparent distress Regular rate & rhythm Breath sounds course bilaterally , wheezes rhonchi Tracheostomy intact , + air leak Soft , non tender , & non distended . VAC place . + upper & lower extremity edema Pertinent Results PM BLOOD WBC . RBC . Hgb . Hct . MCV MCH . MCHC . RDW . Plt Ct PM BLOOD WBC . RBC . Hgb . Hct . MCV MCH . MCHC . RDW . Plt Ct BLOOD WBC . RBC . Hgb . Hct . MCV MCH . MCHC . RDW . Plt Ct PM BLOOD Glucose UreaN Creat . Na K . Cl HCO AnGap BLOOD Glucose UreaN Creat . Na K . Cl HCO AnGap PM BLOOD ALT AST AlkPhos Amylase TotBili . PM BLOOD Amylase pm SPUTUM Source Endotracheal . GRAM STAIN Final PMNs < epithelial cells/X field . 
+ < per X FIELD GRAM POSITIVE COCCI . PAIRS CLUSTERS . + < per X FIELD GRAM POSITIVE RODS . RESPIRATORY CULTURE Final SPARSE GROWTH OROPHARYNGEAL FLORA . STAPH AUREUS COAG + . SPARSE GROWTH . TWO COLONIAL MORPHOLOGIES . Oxacillin RESISTANT Staphylococci MUST reported also RESISTANT penicillins , cephalosporins , carbacephems , carbapenems , beta lactamase inhibitor combinations . Rifampin used alone therapy . SENSITIVITIES MIC expressed MCG/ML _________________________________________________________ STAPH AUREUS COAG + | CLINDAMYCIN = > R ERYTHROMYCIN = > R GENTAMICIN < = . LEVOFLOXACIN = > R OXACILLIN = > R PENICILLIN = > . R RIFAMPIN < = . TETRACYCLINE VANCOMYCIN < = FUNGAL CULTURE Preliminary FUNGUS ISOLATED . pm BLOOD CULTURE Source Venipuncture . FINAL REPORT AEROBIC BOTTLE Final GROWTH . ANAEROBIC BOTTLE Final REPORTED PHONE First Name NamePattern Last Name NamePattern CCB Numeric Identifier . STAPH AUREUS COAG + . FINAL SENSITIVITIES . Oxacillin RESISTANT Staphylococci MUST reported also RESISTANT penicillins , cephalosporins , carbacephems , carbapenems , beta lactamase inhibitor combinations . Rifampin used alone therapy . VANCOMYCIN SENSITIVITY DONE E TEST .. SENSITIVITIES MIC expressed MCG/ML _________________________________________________________ STAPH AUREUS COAG + | CLINDAMYCIN = > R ERYTHROMYCIN = > R GENTAMICIN < = . LEVOFLOXACIN = > R OXACILLIN = > R PENICILLIN = > . R RIFAMPIN < = . TETRACYCLINE VANCOMYCIN Brief Hospital Course Mr . Known lastname readmitted surgical intensive care unit Hospital rehabilitation facility . alert responsive , looked well . Diuresis initiated good response , weaned ventilator placed tracheostomy mask ventilation . initially well , noted large cuff leak tracheostomy , became somewhat tachypneic . placed back ventilator responded well . noted fever blood/urine cultures sent . found staphylococcus sputum blood , pseudomonas urine . 
Appropriate antibiotic coverage initiated linezolid cefepime , defervesced , corresponding drop WBC . addition , tracheostomy tube removed replaced similarly sized flexible Last Name un tracheostomy , angled specifically longer arm . air leak resolved maneuver . continued improve , decided ready transfer rehabiliation facility . Medications Admission . Lopressor mg PO TID . ASA mg PO DAILY . Lasix mg PO BID . Ativan PRN . Albuterol IH . Lansoprazole mg PO DAILY . Regular Insulin Sliding Scale . Epogen . Seroquel . Heparin U SC TID Discharge Medications . Heparin Porcine , unit/mL Solution Last Name un Units Injection TID times day . . Albuterol Sulfate . % Solution Last Name un One Neb Inhalation QH every hours needed . . Epoetin Alfa , unit/mL Solution Last Name un Units Injection QMOWEFR Monday Wednesday Friday . . Zinc Oxide Cod Liver Oil % Ointment Last Name un One Appl Topical PRN needed . . Miconazole Nitrate % Powder Last Name un One Appl Topical TID times day needed . . Acetaminophen mg/ mL Solution Last Name un Six Age mg PO Q H every hours needed . . Docusate Sodium mg/ mL Liquid Age One Hundred mg PO BID times day . . Lorazepam . mg Tablet Age Tablets PO Q H every hours needed . . Clonidine . mg/ hr Patch Weekly Age One Patch Weekly Transdermal QFRI every Friday . . Aspirin mg Tablet , Chewable Age One Tablet , Chewable PO DAILY Daily . . Metoprolol Tartrate mg Tablet Age One Tablet PO TID times day . . Quetiapine mg Tablet Age One Tablet PO QHS day bedtime . . Ipratropium Bromide mcg/Actuation Aerosol Age Puffs Inhalation QH every hours needed vent . . Albuterol mcg/Actuation Aerosol Age Puffs Inhalation QH every hours needed vent . . Therapeutic Multivitamin Liquid Age Five ML PO DAILY Daily . . Linezolid mg Tablet Age One Tablet PO QH every hours . . Papain Urea , unit/g % Ointment Age One Appl Topical DAILY Daily . . Furosemide mg Tablet Age . Tablets PO BID times day . . Cefepime g Recon Soln Age One Recon Soln Intravenous QH every hours . . 
Insulin NPH Human Recomb unit/mL Suspension Age Ten Units Subcutaneous twice day breakfast dinner . . Regular Insulin Sliding Scale Discharge Disposition Extended Care Facility Hospital Discharge Diagnosis . Respiratory distress . Congestive heart failure . Air leak tracheostomy . Pancreatitis . Pneumonia . Incarcerated strangulated ventral hernia , Small bowel resection primary reanastomosis , multiple abdominal abscesses , respiratory failure , myocardial infarction . CAD s/p LAD OM stent , s/p RCA PTCA . DM . HTN . PVD . CRI Cr . . . cataracts . gout . BPH , . h/o EtOH abuse quit yrs ago , h/o heavy tobacco use Discharge Condition Stable Discharge Instructions Take medications directed . seen doctors Name PTitle rehab . Call doctor go ED chest pain shortness breath fever > significant drainage blood wound Followup Instructions Please follow Dr. First Name Name Pattern Last Name NamePattern , M.D . need repeat echo months , please see Dr. First Name STitle arrange . Please follow Dr. Last Name STitle call Telephone/Fax make appointment Completed']",['041']
836,116802,['Atrial fibrillation . Right bundle branch block . Non specific ST wave changes . Compared previous tracing significant change .'],['041']
4510,192358,"['Admission Date Discharge Date Date Birth Sex F Service MEDICINE Allergies Lasix / Diuril / Keflex / Iodine AttendingFirst Name LF Chief Complaint AMS Major Surgical Invasive Procedure none . History Present Illness yo F IPF , COPD L chronic prednisone , CHF , mechanical mitral valve , s/p pacemaker placement , known high grade colonic adenoma GIB resected , gastric varix history liver disease recently admitted nocardia pneuomina , presented AMS . Patient transferred MICU given + melena likely need endoscopy may need intubation airway protection . note , per EMS , baseline EMS arrival rehab . hypoxic high RA . ED inital vitals signs / % L Non Rebreather . Pt denies confusion & self , place . Patient denies complaints shortness breath , typical . Pt declined NG lavage . Rectal heme postive dark stool . Labs notable WBC count , Hct , creatinine . baseline . , metabolic alkylosis VBG chronic . crossed four units . . , AV paced , / , , % arrival floor small melanotic BM visualized , also dried blood right nare . repeat Hct prior transfusion . Given melena , patient completed uit pRBC getting nd FFP prior transfer . also getting x dose Bumex given triggerred required NRB OSat mid , weaned L low % . ROS reported wanting sleep . Increased frequency BMs recently describes dark , bloody . recall sticky . Denies fevers , chills , chest pain , shortness breath , abdominal pain , nausea , vomiting , anything else . Past Medical History s/p mechanical mitral valve sinus node dysfunction s/p DDD pacemaker placement atrial flutter s/p ablation / cardioversion congestive heart failure , Last Echo , mildly depressed LVEF= % systolic function chronic obstructive pulmonary disease LO trach home rest idiopathic pulmonary fibrosis chronic prednisone chronic kidney disease baseline creatinine . . anemia due mechanical valve chronic kidney disease hypertension hypercholesterolemia hypothyroidism meniere ? ? ? ? ? ? 
disease HOH spinal arthritis breast cancer radical mastectomy right breast . Partial left . s/p hysterectomy s/p nasal embolization refractory epistaxis lower GI bleed secondary high grade colonic adenoma s/p biopsy resection / Social History Recently Hospital MACU . Lived husband , suddenly passed away patient intubated . Patient aware . requires assistance ADLs IADLs tobacco smoked years , quit . alcohol social drugs IVDU . Family History Father polymyositis coronary artery disease mother metastatic bone cancer . several cousins breast cancer . Physical Exam Admission Exam Vitals Tmax . ? ? ? ? ? ? C . ? ? ? ? ? ? F Tcurrent . ? ? ? ? ? ? C . ? ? ? ? ? ? F HR bpm BP / / / mmHg RR insp/min SpO % General lethargic , oriented person place , trying pull things HEENT Sclera anicteric , mucous membrane dry , oropharynx clear Neck supple , JVP elevated , LAD Lungs Clear auscultation bilaterally anteriorly , wheezes ronchi CV Regular rate rhythm , normal + , + mechanical click Abdomen soft , non tender , non distended , bowel sounds present , rebound tenderness guarding , organomegaly Ext cool , + pulses , clubbing , cyanosis edema . . Discharge PEx Vitals / . / % L General alert , aao , sitting bed HEENT Sclera anicteric , mucous membrane moist , oropharynx clear Lungs improved wheezing . CV RRR , normal + , + mechanical click Abdomen soft , non tender , non distended , bowel sounds present , rebound tenderness guarding , organomegaly Ext warm , + pulses , clubbing , cyanosis edema Pertinent Results Admission Labs PM BLOOD WBC . RBC . Hgb . Hct . MCV MCH . MCHC . RDW . Plt Ct PM BLOOD Neuts . Lymphs . Monos . Eos . Baso . PM BLOOD PT . PTT . INRPT . PM BLOOD Glucose UreaN Creat . Na K . Cl HCO AnGap PM BLOOD Calcium . Phos . Mg . PM BLOOD Type Last Name un pO pCO pH . calTCO Base XS Comment GREEN TOP PM BLOOD Glucose Lactate . K . PM BLOOD Hgb . calcHCT Sat COHgb MetHgb BLOOD Hct . Urine PM URINE Color Straw Appear Clear Sp Last Name un . 
PM URINE Blood MOD Nitrite NEG Protein Glucose NEG Ketone NEG Bilirub NEG Urobiln pH . Leuks NEG PM URINE RBC WBC Bacteri NONE Yeast NONE Epi Pertinent Labs Micro BCx Negative x . Studies CXR Single AP upright portable view chest obtained . patient rotated left . patients chin partially obscures left lung apex . Dual lead left sided pacemaker seen , unchanged position . , pacer wires seen traverse stent , presumably SVC . patient status post median sternotomy cardiac valve replacement . Abandoned epicardial leads noted left lower hemithorax/left upper quadrant , stable . Surgical chain sutures seen right lung apex . Evidence basilar fibrosis seen . persistent blunting costophrenic angles trace effusions would difficult exclude . new focal consolidation evidence pneumothorax seen . Discharge Labs BLOOD WBC . RBC . Hgb . Hct . MCV MCH . MCHC . RDW . Plt Ct BLOOD PT . PTT . INRPT . BLOOD Glucose UreaN Creat . Na K . Cl HCO AnGap BLOOD LDLDH BLOOD Hapto BLOOD Calcium . Phos . Mg . Brief Hospital Course yo female extensive medical including gastric varix colonoic adenoma Hct drop small melena , s/p transfusions appropriate Hct bump , likely chronic bleeding colonic adenoma . . Goals care patient made DNR/DNI , confirmed daughter/HCP multiple family meetings . Patient escalation care , treat acute blood loss transfusion supportive care . Overall plan eventually move towards comfort , patient family would still like time deliberate . Patient eating regular diet knows may increase risk bleeding . Also , lab draws limited daily , even setting transfusion decreased Hematocrit . Also , lengthy discussion risks benefits taking anti coagulation given mechanical mitral valve , patient family decided stop anticoagulation . willing accept risks stroke . currently continuing offer transfusions well antibiotics . . Melena/Dysplastic Adenoma ED , initial hct , rose unit PRBC . Started pantoprazole octreotide gtt . Given therapeutic INR , patient also given units FFP . 
Slightly hypotensive arrival SBP mid received Bumex . daughter HCP , Name NI stated mother would want another colonoscopy endoscopy . Pressures improved cc fluid bolus . Patient known malignant colonic mass gastric varices . colonic mass proven malignant biopsy , GI felt endoscopic resection carried high risk perforation documented d/c summary . GI consulted , intervention done . drop Hct large melanotic BM . concern thrombocytopenia , octreotide stopped despite continued slow bleeding . Patient transferred medicine floor . Patient one episode point hematocrit drop , bumped two units packed red cells . Thrombocytopenia stable range may secondary medications ? meropenem . . Last Name un patient worsening creatinine stay . FeNA > % , suggesting intrinsic cause . Urine eosinophils negative peripheral eosinophilia . Meropenem stopped Bactrim continued discussion ID service , may elevate Cr falsly without changes GFR . time discharge , Cr improved somewhat . peak . . Thrombocytopenia patient worsening thrombocytopenia beginning admission . Fibrinogen normal schistocytes . Heparin products stopped HIT antibody sent , negative . Octreotide stopped case reports octreotide associated thrombocytopenia . . Nocardia PNA Diagnosed BAL previous admission . Treated imipenem bactrim outpatient two weeks based suspicion possible dissminated Nocardia . plan Bactrim additional extended course per ID recommendations . Patient follow ID outpatient future management . . Altered mental status Mental status waxed waned , sometimes confused , generally oriented person , place year . Often recall events day day . Neurologic exam non focal . felt decisions decision making capabilities . . Systolic congestive heart failure , chronic Per Echo , mildly depressed LVEF= % systolic function . appear volume overloaded exam presently . Echo shows dilated right heart , severe TR , moderate MR w/ functional mechanical prosthesis . 
Given transfusions hemodynamically stable , patient restarted bumex @ home dose mg daily . Metolazone held restarted per acute care facility/nursing facility . . Mechanical MVR Valve Anticoagulation held indefinitely lengthy discussion patient HCP , noted , despite risks annual stroke given mechanical mitral valve . . Bullous upper extremity rashContinued hydrocortisone % cream . . COPD/IPF continued nebs steroids . . HYPOTHYROIDISM continued home levothyroxine . HYPERTENSION restarted bumex , holding metolazone . . . . Transitional Issues Please evaluate need rectal tube foley daily remove asap Please check CBC every day days every day determined physician acute care facility . Please d/c PICC days discretion MACU . Please continue Bactrim , double strength , two tablets Hospital evaluated ID team outpatient . need reinitiate metolazone mg every day outpatient pending volume status lung exam . . Medications Admission albuterol sulfate . mg / mL . % Solution Nebulization qh prn cholecalciferol , unit daily ferrous sulfate mg daily fluticasone mcg/Actuation Aerosol , puffs Hospital ipratropium bromide . % Solution QH prn levothyroxine mcg daily multivitamin daily nadolol mg daily warfarin mg daily Goal INR . .. prednisone mg daily cortisone % Cream qid Bactrim DS mg Tablet , tabs tid X days nystatin , unit/g Cream daily zinc oxide daily MS Contin mg qhs oxycodone mg , . tabs q hours prn imipenem cilastatin mg , qh X weeks bumetanide mg daily metolazone mg qod omeprazole Hospital Discharge Medications . albuterol sulfate . mg / mL . % Solution Nebulization Sig One treatment Inhalation QH every hours needed sob/wheeze . . fluticasone mcg/Actuation Aerosol Sig Two Puff Inhalation Hospital times day . . levothyroxine mcg Tablet Sig One Tablet PO DAILY Daily . . prednisone mg Tablet Sig One Tablet PO DAILY Daily . . ipratropium bromide . % Solution Sig One treatment Inhalation every six hours needed shortness breath wheezing . . 
omeprazole mg Capsule , Delayed ReleaseE.C . Sig One Capsule , Delayed ReleaseE.C . PO bedtime . . cholecalciferol vitamin , unit Tablet Sig One Tablet PO day . . cortisone % Cream Sig One Appl Topical QID times day . . ferrous sulfate mg mg iron Tablet Sig One Tablet PO day . . nadolol mg Tablet Sig One Tablet PO DAILY Daily . . sulfamethoxazole trimethoprim mg Tablet Sig Two Tablet PO BID times day . . nystatin , unit/g Cream Sig One application affected areas Topical day . . MS Contin mg Tablet Extended Release Sig One Tablet Extended Release PO bedtime . . oxycodone mg Tablet Sig . Tablet PO every six hours needed pain . . acetaminophen mg Tablet Sig Two Tablet PO QH every hours needed pain . . guaifenesin mg/ mL Syrup Sig MLs PO QH every hours needed cough . . trazodone mg Tablet Sig . Tablet PO HS bedtime needed insomnia . . docusate sodium mg Capsule Sig One Capsule PO BID times day . . senna . mg Tablet Sig One Tablet PO BID times day needed constipation . . bumetanide mg Tablet Sig . Tablets PO DAILY Daily Please hold SBP < . . Labs Please check CBC daily least days every day determined physician acute care facility . Discharge Disposition Extended Care Facility Hospital Aged MACU Discharge Diagnosis gastrointestinal bleeding colonic mass nocardia pneumonia acute chronic kidney failure Discharge Condition Mental Status Confused sometimes . Level Consciousness Lethargic arousable . Activity Status Bed assistance chair wheelchair . Discharge Instructions Dear Ms . Known lastname , pleasure taking care Hospital . admitted hospital acute gastrointestinal bleed . able stabilize blood transfusions . likely bleeding originated colon , known mass . Also , history esophageal varices may bleeding well . multiple discussions MICU/medical GI teams conjunction daughter law , decided pursue diagnostic interventional procedures . declined EGD/colonoscopy . transferred MACU , able receive supportive care blood product transfusion necessary . 
also decided change code status resuscitate intubate . . also treated pneumonia antibiotics , continue follow infectious disease physician outpatient . . hope able regain strength rehab feel better soon . . STOP imipenem cilastatin mg , qh X weeks currently holding metolazone mg every day physician Hospital Name PRE evaluate regards reiniation medication outpatient based vital signs breathing . . Please follow appointments listed . Followup Instructions following appointments . Please follow primary care physician , Name NameIs , Name NameIs Telephone/Fax , within one week discharge rehabilitation facility . help make appointment upon discharge . . Department PULMONARY FUNCTION LAB WEDNESDAY PULMONARY FUNCTION LAB Telephone/Fax Building Hospital Location un Campus EAST Best Parking Hospital Ward Name Garage Department Hospital Ward Name WEDNESDAY Department MEDICAL SPECIALTIES WEDNESDAY DR. Last Name STitle & DR. Last Name STitle Telephone/Fax Building SC Hospital Ward Name Clinical Ctr Location un Campus EAST Best Parking Hospital Ward Name Garage Department INFECTIOUS DISEASE MONDAY Name MD Name MD , MD Telephone/Fax Building LM Hospital Unit Name Hospital Campus WEST Best Parking Hospital Ward Name Garage']",['039']
1574,131932,"['PM CHEST PORT . LINE PLACEMENT PHYSICIAN Name Initial PRE Clip Number Radiology Reason post op film contact Name NI NP Numeric Identifier abnormal OPEN C Admitting Diagnosis UNSTABLE ANGINA ______________________________________________________________________________ Hospital MEDICAL CONDITION year old man s/p emergency AVR/MVrepair/cabg x/repl . asc . aorta/ fem . Last Name un Last Name un bypass/cardiogenic shock REASON EXAMINATION post op film contact Name NI NP Numeric Identifier abnormal OPEN CHEST ! CVICU approx . PM please call first ______________________________________________________________________________ WET READ KYg FRI PM LOW LUNG VOLUMES . ETT TUBE TERMINATES .CM CARINA . PROSTHETIC VALVES , BILATERAL CHEST TUBES , NG TUBE PRESENT . RIGHT IJ CVL TERMINATES PROXIMAL RIGHT ATRIUM . TIP LEFT IJ CVL LIKELY WITHIN LEFT SUBCLAVIAN . PTX . EXTENT PULMONARY EDEMA UNCHANGED . SUGGESTION OPEN CHEST WALL . D/Initials NamePattern Last Name NamePattern . Doctor Last Name Numeric Identifier ______________________________________________________________________________ FINAL REPORT AP CHEST , P.M. HISTORY Emergency valve replacements . IMPRESSION AP chest compared intraoperative study a.m . Moderately severe pulmonary edema improved , though , though lung volumes remain quite low . Mediastinum normal postoperative appearance . Given delayed sternal closure persistently low lung volumes . Tip intraaortic balloon pump lies least cm level carina , standard placement . Right internal jugular line ends right atrium . lines tubes standard placements . pneumothorax pleural effusion , , small . Dr. First Name STitle paged report findings subsequent chest radiograph , p.m. , reported separately also showing IABP placement lower standard .']","['042', '070']"
4533,192984,"['Admission Date Discharge Date Date Birth Sex F Service MEDICINE Allergies Patient recorded Known Allergies Drugs AttendingFirst Name LF Chief Complaint Respiratory Distress Major Surgical Invasive Procedure PEG tube exchange History Present Illness y/o F w/ dementia , non verbal p/w respiratory distress diarrhea . Pt . recently treated recurrent respiratory infection w/ augmentin developed diarrhea . diarrhea several days today son noted respiratory distress brought Location un ED . Location un VS initially / , , % . Labs came back w/ bicarb lactate . given L NS , levo flagyl . U/A > WBCs . WBC w/ % bands . Hct . intubated discussion w/ son goals care transferred Hospital ED started levophed . SBP . got L NS , cefepime vanc IV . Location un hyperkalemia w/o EKG changes got insulin , D. Past Medical History Dementia , nonverbal baseline Left hip decubitus ulcer Sacral decubitus ulcer diabetes urinary retention CVA years ago Recurrent pulmonary infections Old necrotic left great toe Social History Living home son . Immigrated Country years ago . Dependent ADLs . smoking , alcohol known . Family History NC Physical Exam Vitals BP / P R % FiO % , CMV TV , PEEP General Unresponsive stimuli , vent levophed drip .mcg/min . Extremities contracted . HEENT Sclera anicteric , MMM , oropharynx clear , TMs impacted w/ wax . Pupils equal reactive light . Neck Left triple lumen Lungs Clear auscultation bilaterally , wheezes , rales , ronchi CV Regular rate rhythm , normal + , murmurs , rubs , gallops Abdomen soft , non tender , non distended , bowel sounds present , rebound tenderness guarding , organomegaly GU foley present w/ cloudy urine Ext warm , well perfused , + pulses , clubbing , cyanosis edema , large decubitus ulcer/burn wound L hip . L foot w/ necrotic great toe . Pertinent Results URINE Source Catheter . FINAL REPORT URINE CULTURE Final PSEUDOMONAS AERUGINOSA . , , ORGANISMS/ML .. GRAM NEGATIVE RODS . ~OOO/ML . 
SENSITIVITIES MIC expressed MCG/ML _________________________________________________________ PSEUDOMONAS AERUGINOSA | CEFEPIME CEFTAZIDIME CIPROFLOXACIN GENTAMICIN MEROPENEM . PIPERACILLIN/TAZO TOBRAMYCIN < = SPUTUM Source Endotracheal . FINAL REPORT GRAM STAIN Final > PMNs < epithelial cells/X field . + > per X FIELD BUDDING YEAST PSEUDOHYPHAE . SMEAR REVIEWED RESULTS CONFIRMED . RESPIRATORY CULTURE Final Commensal Respiratory Flora Absent . PSEUDOMONAS AERUGINOSA . MODERATE GROWTH . THREE COLONIAL MORPHOLOGIES . YEAST . MODERATE GROWTH . SENSITIVITIES MIC expressed MCG/ML _________________________________________________________ PSEUDOMONAS AERUGINOSA | CEFEPIME CEFTAZIDIME CIPROFLOXACIN . GENTAMICIN MEROPENEM PIPERACILLIN/TAZO TOBRAMYCIN < = MRSA SCREEN Source Nasal swab . FINAL REPORT MRSA SCREEN Final POSITIVE METHICILLIN RESISTANT STAPH AUREUS . BLOOD WBC . RBC . Hgb . Hct . MCV MCH . MCHC . RDW . Plt Ct BLOOD Neuts . Lymphs . Monos . Eos . Baso . BLOOD Glucose UreaN Creat . Na K . Cl HCO AnGap BLOOD Calcium . Phos . Mg . BLOOD Type ART Temp . Rates / Flow pO pCO pH . calTCO Base XS Intubat INTUBA BLOOD Lactate . BLOOD freeCa . Brief Hospital Course Sepsis Ms Known lastname hypotensive hypoxic admission intubated required pressors . Lactate .. treated empirically hospital acquired pneumonia cefepime vancomycin narrowed cefepime sputum cultures returned positive pseudomonas aeruginosa . UA positive culture also grew pseudomonas . day course cefepime planned . concern c. diff colitis initially treated PO vancomycin , discontinued toxin assays returned negative . sources infection considered included various decubitus ulcers , burn related ulcer , necrotic great toe . Pressors weaned hours BP remained stable throughout remainder hospitalization . ventilator settings gradually weaned extubated HD , tolerated . discharged plan continue IV antibiotics via PICC day course . Left upper extremity DVT Ms . 
Known lastname noted edematous left arm hospitalization setting left internal jugular catheter . Ultrasound demonstrated cephalic vein DVT . Heparin begun changed lovanox HD , decided risks anticoagulation exceed benefits , discontinued . Nutrition Ms . Known lastname gastric tube clogged son administered tube feeding . tube fell flushed . replaced gastrojejunal tube , selected decrease aspiration risk , sutured place . Last Name un Ms . Known lastname originally creatinine . admission attributed prerenal factors . Creatinine decreased . blood pressure support fluids . Anemia Hct trended , stabilized , hospitalization . DM Ms . Known lastname managed ISS , decreased HD following episode hypoglycemia . discharge , returned home regimen metformin . Goals care Ms . Known lastname eventually made DNR/DNI discussions sons , discharged home hospice services . Medications Admission Metformin mg QD Jenuvia tube feed MVI Vit C oral Name NI MOM Artificial Tears Tylenol Discharge Medications . cefepime gram Recon Soln Sig One Recon Soln Injection QH every hours days continue days . Disp Recon Solns Refills . sodium chloride . % . % Parenteral Solution Sig Ten ML Intravenous PRN needed needed line flush non heparin dependent PICC . Disp MLs Refills . acetaminophen mg Tablet Sig Two Tablet PO QH every hours needed pain/fever . . metformin mg Tablet Sig One Tablet PO day . . Januvia continuing previous home dose dose known us time Discharge Disposition Home Service Facility VistaCare Discharge Diagnosis . Sepsis . Respiratory failure . Dementia . Acute renal failure Discharge Condition Mental Status interactive . Level Consciousness Lethargic arousable . Activity Status Bedbound . Discharge Instructions Dear Mr . Known lastname , pleasure taking care mother ICU . extubated going home hospice services . receive IV antibiotics home PICC line nutrition PEG tube . 
Hospital hospice care , important concerned symptoms , call Vistacare , hospice company , assistance , rather bringing hospital . Followup Instructions Home hospice services VistaCare .']","['041', '038']"


#### Use AOAI to Simplify Note Text

In [28]:
aoai_client = AzureOpenAI(
    azure_endpoint = os.getenv("AZURE_OPENAI_BASE"), 
    api_key=os.getenv("AZURE_OPENAI_KEY"),
    api_version="2024-02-01"
)

In [33]:
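# Function to get token counts

def get_token_counts(text):
    # NOTE: GPT-4o tokenizes with 'o200k_base'. 'cl100k_base' is kept here so the
    # counts match the outputs shown below; treat them as approximate for GPT-4o.
    encoding = tiktoken.get_encoding('cl100k_base')
    return len(encoding.encode(text))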

In [37]:
# Function to build the prompt

def build_prompt(note):
    sys = """
    Parse the following medical note. Return any sentences that relate to a diagnosis or symptom. Ignore other information such as patient name, dates, medicine types, dosage amounts, etc.
    DO NOT ADD ANY INFORMATION TO THE NOTE. ONLY RETURN THE RELEVANT SENTENCES. 
    """
    prompt = f"{note}"

    return (sys, prompt)

In [38]:
def aoai_extract(note):
    sys, prompt = build_prompt(note)
    response = aoai_client.chat.completions.create(
        model=os.getenv("AOAI_MAIN_DEPLOYMENT_NAME"),  # Azure deployment name, not the underlying model name
        messages=[
            {"role": "system", "content": dedent(sys)},
            {"role": "user", "content": dedent(prompt)}
        ],
    )

    return response.choices[0].message.content


In [39]:
df["AOAI_EX"] = df["TEXT"].apply(aoai_extract)

INFO:httpx:HTTP Request: POST https://medcode-aoai-useast.openai.azure.com//openai/deployments/gpt-4o-deploy/chat/completions?api-version=2024-02-01 "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://medcode-aoai-useast.openai.azure.com//openai/deployments/gpt-4o-deploy/chat/completions?api-version=2024-02-01 "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://medcode-aoai-useast.openai.azure.com//openai/deployments/gpt-4o-deploy/chat/completions?api-version=2024-02-01 "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://medcode-aoai-useast.openai.azure.com//openai/deployments/gpt-4o-deploy/chat/completions?api-version=2024-02-01 "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://medcode-aoai-useast.openai.azure.com//openai/deployments/gpt-4o-deploy/chat/completions?api-version=2024-02-01 "HTTP/1.1 200 OK"


In [74]:
df["AOAI_TOKENS"] = df["AOAI_EX"].apply(get_token_counts)
df["TEXT_TOKENS"] = df["TEXT"].apply(get_token_counts)

#### Examine Results

In [44]:
# Avg token count difference - How much efficiency is gained?
print(f"Average Original Text tokens: {df['TEXT_TOKENS'].mean()}")
print(f"Average AOAI Extracted tokens: {df['AOAI_TOKENS'].mean()}")
print(f"Average token count difference: {(df['TEXT_TOKENS'] - df['AOAI_TOKENS']).mean()}")

Average Original Text tokens: 1332.2
Average AOAI Extracted tokens: 457.8
Average token count difference: 874.4
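
The absolute difference hides how unevenly the reduction is spread across notes. As a worked example, the sketch below recomputes the reduction as a fraction of the original length, using the per-note `TEXT_TOKENS` and `AOAI_TOKENS` values shown in the example outputs of this notebook. The helper `relative_reduction` is a hypothetical name introduced here for illustration.

```python
# (TEXT_TOKENS, AOAI_TOKENS) for the five sampled notes, copied from the example outputs
token_counts = [(1832, 861), (24, 22), (2963, 726), (379, 148), (1463, 532)]


def relative_reduction(original: int, extracted: int) -> float:
    """Fraction of the original tokens removed by the extraction."""
    return (original - extracted) / original


total_original = sum(o for o, _ in token_counts)
total_extracted = sum(e for _, e in token_counts)
overall = relative_reduction(total_original, total_extracted)
per_note = [relative_reduction(o, e) for o, e in token_counts]

print(f"Overall reduction: {overall:.1%}")
for (o, e), r in zip(token_counts, per_note):
    print(f"{o:>5} -> {e:>4} tokens ({r:.1%} removed)")
```

Across the sample roughly two thirds of the tokens are removed, but the 24-token note is barely shortened, so the average is driven by the long discharge summaries.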


In [66]:
# Text analysis - Is any information lost?
sub_df = df[['ICD9_CODE', 'AOAI_EX', 'AOAI_TOKENS', 'TEXT_TOKENS']]

In [67]:
# Example 0
display(sub_df.iloc[[0]])

Unnamed: 0,ICD9_CODE,AOAI_EX,AOAI_TOKENS,TEXT_TOKENS
4808,['041'],"- Chief Complaint Respiratory distress\n- History Present Illness Mr . Known lastname year old male complex medical history originally admitted hospital Month strangulated ventral hernia .\n- Secondary respiratory distress/ ? pneumonia well management volume overload .\n- CAD s/p LAD OM stent , s/p RCA PTCA .\n- DM .\n- HTN .\n- PVD s/p bilat LE bypass surgeries Dr.Last Name STitle .\n- CRI baseline Cr . .\n- Cataracts .\n- Gout .\n- BPH .\n- Abd hernia .\n- s/p CCY , ex lap w/abd hernia resulting .\n- Incarcerated ventral hernia containing strangulated small bowel requiring small bowel resection .\n- Complicated leak leading operation .\n- Hx Etoh abuse x yrs , quit .\n- Fa died secondary colon ca Mo died secondary PNA Siblings Etoh abuse , HTN \n- Breath sounds course bilaterally , wheezes rhonchi\n- + upper & lower extremity edema\n- BLOOD WBC . RBC . Hgb . Hct . MCV MCH . MCHC . RDW . Plt Ct\n- SPARSE GROWTH OROPHARYNGEAL FLORA .\n- STAPH AUREUS COAG + .\n- Oxacillin RESISTANT Staphylococci MUST reported also RESISTANT penicillins , cephalosporins , carbacephems , carbapenems , beta lactamase inhibitor combinations . Rifampin used alone therapy .\n- STAPH AUREUS COAG + | CLINDAMYCIN = > R ERYTHROMYCIN = > R GENTAMICIN < = . LEVOFLOXACIN = > R OXACILLIN = > R PENICILLIN = > . R RIFAMPIN < = . TETRACYCLINE VANCOMYCIN < =\n- FUNGUS ISOLATED .\n- FINAL REPORT AEROBIC BOTTLE Final GROWTH .\n- STAPH AUREUS COAG + .\n- FINAL SENSITIVITIES .\n- Oxacillin RESISTANT Staphylococci MUST reported also RESISTANT penicillins , cephalosporins , carbacephems , carbapenems , beta lactamase inhibitor combinations . Rifampin used alone therapy .\n- VANCOMYCIN SENSITIVITY DONE E TEST ..\n- STAPH AUREUS COAG + | CLINDAMYCIN = > R ERYTHROMYCIN = > R GENTAMICIN < = . LEVOFLOXACIN = > R OXACILLIN = > R PENICILLIN = > . R RIFAMPIN < = . TETRACYCLINE VANCOMYCIN \n- Mr . 
Known lastname readmitted surgical intensive care unit Hospital rehabilitation facility .\n- Diuresis initiated good response , weaned ventilator placed tracheostomy mask ventilation .\n- Initially well , noted large cuff leak tracheostomy , became somewhat tachypneic .\n- Placed back ventilator responded well .\n- Noted fever blood/urine cultures sent .\n- Found staphylococcus sputum blood , pseudomonas urine .\n- Appropriate antibiotic coverage initiated linezolid cefepime , defervesced , corresponding drop WBC .\n- Tracheostomy tube removed replaced similarly sized flexible Last Name un tracheostomy , angled specifically longer arm .\n- Air leak resolved maneuver .\n- Respiratory distress .\n- Congestive heart failure .\n- Air leak tracheostomy .\n- Pancreatitis .\n- Pneumonia .\n- Incarcerated strangulated ventral hernia , Small bowel resection primary reanastomosis , multiple abdominal abscesses , respiratory failure , myocardial infarction .\n- CAD s/p LAD OM stent , s/p RCA PTCA .\n- DM .\n- HTN .\n- PVD .\n- CRI Cr . .\n- cataracts .\n- gout .\n- BPH .\n- h/o EtOH abuse quit yrs ago , h/o heavy tobacco use",861,1832


Analysis:  
- Code 041 = "Bacterial infection in conditions classified elsewhere"
- The key evidence for this code is the mention of 'staphylococcus', which is retained despite a reduction of nearly 1,000 tokens (1832 → 861)
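
The manual check above (is the code-relevant term still present after extraction?) can be partly automated. This is a minimal sketch, assuming a hand-written list of terms per ICD-9 code; the function `retained_terms` and the term list are hypothetical and would need clinical review before being relied on.

```python
def retained_terms(original: str, extracted: str, terms: list[str]) -> dict[str, bool]:
    """For each term present in the original note, report whether the extraction kept it."""
    original_lower = original.lower()
    extracted_lower = extracted.lower()
    return {
        term: term.lower() in extracted_lower
        for term in terms
        if term.lower() in original_lower  # only judge terms the note actually contained
    }


# Hypothetical terms a reviewer might associate with code 041
terms_041 = ["staphylococcus", "staph aureus", "pseudomonas", "streptococcus"]

# Abbreviated, illustrative strings (not the full note)
original = "Admission Date Discharge Date . Found staphylococcus sputum blood , pseudomonas urine ."
extracted = "- Found staphylococcus sputum blood , pseudomonas urine ."

print(retained_terms(original, extracted, terms_041))
```

Terms that never appeared in the original note are omitted from the result, so a `False` value always means the extraction dropped something that was there.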

---


In [71]:
# Example 1
display(sub_df.iloc[[1]])

Unnamed: 0,ICD9_CODE,AOAI_EX,AOAI_TOKENS,TEXT_TOKENS
836,['041'],Atrial fibrillation. Right bundle branch block. Non specific ST wave changes. Compared previous tracing significant change.,22,24


Analysis:  
- Code 041 = "Bacterial infection in conditions classified elsewhere"
- This note was essentially unchanged (24 → 22 tokens)

---

In [72]:
# Example 2
display(sub_df.iloc[[2]])

Unnamed: 0,ICD9_CODE,AOAI_EX,AOAI_TOKENS,TEXT_TOKENS
4510,['039'],"- Chief Complaint AMS \n- History Present Illness yo F IPF, COPD L chronic prednisone, CHF, mechanical mitral valve, s/p pacemaker placement, known high grade colonic adenoma GIB resected, gastric varix history liver disease recently admitted nocardia pneuomina, presented AMS.\n- Patient transferred MICU given + melena likely need endoscopy may need intubation airway protection.\n- Patient denies complaints shortness breath, typical.\n- Rectal heme postive dark stool.\n- Labs notable WBC count, Hct, creatinine.\n- baseline, metabolic alkylosis VBG chronic.\n- crossed four units. \n- ROS reported wanting sleep.\n- Increased frequency BMs recently describes dark, bloody. recall sticky. \n- Denies fevers, chills, chest pain, shortness breath, abdominal pain, nausea, vomiting, anything else.\n- s/p mechanical mitral valve sinus node dysfunction s/p DDD pacemaker placement atrial flutter s/p ablation / cardioversion congestive heart failure, Last Echo, mildly depressed LVEF= % systolic function chronic obstructive pulmonary disease LO trach home rest idiopathic pulmonary fibrosis chronic prednisone chronic kidney disease baseline creatinine . . anemia due mechanical valve chronic kidney disease hypertension hypercholesterolemia hypothyroidism meniere ? ? ? ? ? ? disease HOH spinal arthritis breast cancer radical mastectomy right breast. Partial left. s/p hysterectomy s/p nasal embolization refractory epistaxis lower GI bleed secondary high grade colonic adenoma s/p biopsy resection\n- General lethargic, oriented person place, trying pull things\n- Lungs Clear auscultation bilaterally anteriorly, wheezes ronchi\n- Vitals / . 
/ % L General alert, aao, sitting bed\n- Lungs improved wheezing.\n- colonic adenoma Hct drop small melena, s/p transfusions appropriate Hct bump, likely chronic bleeding colonic adenoma.\n- Patient escalation care, treat acute blood loss transfusion supportive care.\n- patient family decided stop anticoagulation.\n- Melena/Dysplastic Adenoma ED, initial hct, rose unit PRBC.\n- Patient known malignant colonic mass gastric varices.\n- colonic mass proven malignant biopsy \n- drop Hct large melanotic BM.\n- concern thrombocytopenia, octreotide stopped despite continued slow bleeding.\n- Patient transferred medicine floor.\n- Patient one episode point hematocrit drop, bumped two units packed red cells.\n- Nocardia PNA Diagnosed BAL previous admission.\n- Altered mental status Mental status waxed waned, sometimes confused, generally oriented person, place year.\n- Systolic congestive heart failure, chronic Per Echo, mildly depressed LVEF= % systolic function.\n- appear volume overloaded exam presently.\n- Mechanical MVR Valve Anticoagulation held indefinitely lengthy discussion patient HCP\n- COPD/IPF continued nebs steroids.\n- HYPOTHYROIDISM continued home levothyroxine.\n- HYPERTENSION restarted bumex, holding metolazone.\n- gastrointestinal bleeding colonic mass nocardia pneumonia acute chronic kidney failure\n- Mental Status Confused sometimes.\n- admitted hospital acute gastrointestinal bleed. \n- likely bleeding originated colon, known mass.\n- Also, history esophageal varices may bleeding well.\n- also treated pneumonia antibiotics, continue follow infectious disease physician outpatient.",726,2963


Analysis:  
- Code 039 = "Actinomycotic infections"
- TODO...

---

In [70]:
# Example 3
display(sub_df.iloc[[3]])

Unnamed: 0,ICD9_CODE,AOAI_EX,AOAI_TOKENS,TEXT_TOKENS
1574,"['042', '070']","- Admitting Diagnosis UNSTABLE ANGINA\n- Hospital MEDICAL CONDITION year old man s/p emergency AVR/MV repair/cabg x/repl . asc . aorta/ fem . bypass/cardiogenic shock\n- LOW LUNG VOLUMES\n- EXTENT PULMONARY EDEMA UNCHANGED\n- SUGGESTION OPEN CHEST WALL\n- HISTORY Emergency valve replacements\n- IMPRESSION AP chest compared intraoperative study a.m . Moderately severe pulmonary edema improved , though , though lung volumes remain quite low\n- Mediastinum normal postoperative appearance\n- Given delayed sternal closure persistently low lung volumes\n- pneumothorax pleural effusion , , small",148,379


Analysis:  
- Code 042 = "HIV" ; Code 070 = "Viral Hepatitis"
- TODO

---

In [73]:
# Example 4
display(sub_df.iloc[[4]])

Unnamed: 0,ICD9_CODE,AOAI_EX,AOAI_TOKENS,TEXT_TOKENS
4533,"['041', '038']","- Chief Complaint Respiratory Distress\n- Major Surgical Invasive Procedure PEG tube exchange\n- History Present Illness y/o F w/ dementia, non verbal p/w respiratory distress diarrhea.\n- Pt. recently treated recurrent respiratory infection w/ augmentin developed diarrhea.\n- diarrhea several days today son noted respiratory distress brought Location un ED.\n- intubated discussion w/ son goals care transferred Hospital ED started levophed.\n- SBP. got L NS, cefepime vanc IV.\n- Location un hyperkalemia w/o EKG changes got insulin, D.\n- Past Medical History Dementia, nonverbal baseline Left hip decubitus ulcer Sacral decubitus ulcer diabetes urinary retention CVA years ago Recurrent pulmonary infections Old necrotic left great toe\n- Unresponsive stimuli, vent levophed drip. mcg/min.\n- large decubitus ulcer/burn wound L hip. L foot w/ necrotic great toe.\n- Pertinent Results URINE Source Catheter. FINAL REPORT URINE CULTURE Final PSEUDOMONAS AERUGINOSA.\n- PSEUDOMONAS AERUGINOSA MODERATE GROWTH. THREE COLONIAL MORPHOLOGIES. YEAST. MODERATE GROWTH.\n- MRSA SCREEN Source Nasal swab. FINAL REPORT MRSA SCREEN Final POSITIVE METHICILLIN RESISTANT STAPH AUREUS.\n- BLOOD WBC.\n- BLOOD Lactate.\n- Brief Hospital Course Sepsis Ms Known lastname hypotensive hypoxic admission intubated required pressors.\n- Lactate treated empirically hospital acquired pneumonia cefepime vancomycin narrowed cefepime sputum cultures returned positive pseudomonas aeruginosa.\n- UA positive culture also grew pseudomonas.\n- concern c. diff colitis initially treated PO vancomycin, discontinued toxin assays returned negative.\n- sources infection considered included various decubitus ulcers, burn related ulcer, necrotic great toe.\n- Left upper extremity DVT Ms. Known lastname noted edematous left arm hospitalization setting left internal jugular catheter.\n- Ultrasound demonstrated cephalic vein DVT.\n- Anemia Hct trended, stabilized, hospitalization.\n- DM Ms. 
Known lastname managed ISS, decreased HD following episode hypoglycemia.\n- Goals care Ms. Known lastname eventually made DNR/DNI discussions sons, discharged home hospice services.\n- Sepsis\n- Respiratory failure\n- Dementia\n- Acute renal failure",532,1463


Analysis:  
- Code 041 = "Bacterial infection in conditions classified elsewhere" ; Code 038 = "Septicaemia"
- TODO

---