# Reducing Bias in Machine Learning Models - Data Cleaning

#### Kristen Lo - BrainStation
---

### Table of Contents
- [Introduction](#intro)
- [Part 0: Cleaning the Data](#clean)
    - 1.1: [Housekeeping](#housekeeping)
    - 1.2: [Deleting Columns](#col)
    - 1.3: [Deleting Rows](#rows)
    - 1.4: [Handling Duplicates](#duplicates)
    - 1.5: [Data Type Conversion](#type)
    - 1.6: [Handling Missing Data](#missing) 
    - 1.7: [Categorizing Columns](#cat) 
- [Conclusion](#conc)


---
### <a id = 'intro'></a> Introduction

In this notebook, we clean the raw data so that it is ready for further analysis.

There is a need to predict hospital admission rates for diabetic patients. However, traditional machine learning models can perpetuate health disparities when they are trained on biased data, and that bias is often tied to demographic variables (e.g. race, age, income, insurance). These biases need to be addressed before modelling so that they are not carried into the model. Building on the work of Raza, S., who aimed to predict, diagnose, and mitigate health disparities in hospital re-admission, my aim is to replicate that study and create my own model that can screen for biases and predict admission rates for diabetics visiting the ER.


Data was sourced from all adult Emergency Department visits from March 2014 to July 2017 at one academic and two community emergency rooms that are part of the Yale New Haven Health system. Each visit resulted in either admission to the respective hospital or discharge.

A total of 972 variables were extracted per patient visit, across 560,486 patient visits.

Courtesy of:
 "Hong WS, Haimovich AD, Taylor RA (2018) Predicting hospital admission at emergency department triage using machine learning. PLoS ONE 13(7): e0201016." (https://doi.org/10.1371/journal.pone.0201016)




-----

## <a id = 'clean'></a> Part 0: Cleaning the Data

---
#### <a id = 'housekeeping'></a> 1.1 Housekeeping

For housekeeping, we will import all necessary libraries and load the data. In addition, we will look at the dataset from a very high level. We want to know generally what we are working with and to familiarize ourselves with the data before we start cleaning.

Plan of attack: 
- Import libraries needed and import the file 
- View the data and find out its shape
- Understand unique entries to the columns 
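
The plan above can be sketched on a small stand-in DataFrame. The toy frame and the helper name `summarize_frame` are hypothetical and used only for illustration; the pandas calls themselves (`shape`, `dtypes`, `nunique`, `isna`) should apply unchanged to the real data once it is loaded.

```python
import pandas as pd


def summarize_frame(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column with its dtype, unique-value count, and missing count."""
    return pd.DataFrame(
        {
            "dtype": df.dtypes.astype(str),
            "n_unique": df.nunique(dropna=True),
            "n_missing": df.isna().sum(),
        }
    )


# Hypothetical stand-in for the real dataset (values are illustrative only).
toy = pd.DataFrame(
    {
        "gender": ["Male", "Female", "Female", None],
        "age": [40, 66, 84, 86],
        "disposition": ["Discharge", "Admit", "Discharge", "Admit"],
    }
)

print(toy.shape)  # (number of rows, number of columns)
summary = summarize_frame(toy)
print(summary)
```

A per-column summary like this is usually easier to scan than the raw rows when a dataset has hundreds of columns.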

Importing all necessary libraries

In [1]:
!pip install pyreadr -q
import pandas as pd
import pyreadr
import numpy as np
import matplotlib.pyplot as plt

The dataset is stored as an .rdata file, so we read it with `pyreadr` before cleaning can begin. The code below uses a relative path; if it does not resolve in your environment, use the full path instead.

In [2]:
result = pyreadr.read_r('Data/5v_cleandf.rdata')
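
If the relative path fails, one option is to resolve it explicitly and fail early with a clear message. This is a minimal sketch using only the standard library; the helper name `resolve_data_path` and its `base_dir` parameter are hypothetical and not part of the original notebook.

```python
from pathlib import Path
from typing import Optional


def resolve_data_path(relative_path: str, base_dir: Optional[Path] = None) -> Path:
    """Turn a relative path into an absolute one and raise if the file is missing."""
    # Default to the current working directory, which is where a notebook
    # usually resolves relative paths from.
    base = Path.cwd() if base_dir is None else Path(base_dir)
    full_path = (base / relative_path).resolve()
    if not full_path.is_file():
        raise FileNotFoundError(f"Data file not found: {full_path}")
    return full_path
```

Assuming the file exists under the working directory, the resolved path could then be passed on as `pyreadr.read_r(str(resolve_data_path('Data/5v_cleandf.rdata')))`. The object returned by `read_r` is dictionary-like, so `result.keys()` lists the R objects stored in the file.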

Let's take a quick look at our data to see what we are working with.

In [3]:
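# read_r returns a dict-like object keyed by R object name; 'df' holds the DataFrame
raw_data = result['df']

pd.set_option('display.max_columns', None)  # Show all columns
pd.set_option('display.max_rows', None)  # Show all rows

raw_data.head(10)  # Preview the first 10 rows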

Unnamed: 0,dep_name,esi,age,gender,ethnicity,race,lang,religion,maritalstatus,employstatus,insurance_status,disposition,arrivalmode,arrivalmonth,arrivalday,arrivalhour_bin,previousdispo,2ndarymalig,abdomhernia,abdomnlpain,abortcompl,acqfootdef,acrenlfail,acutecvd,acutemi,acutphanm,adjustmentdisorders,adltrespfl,alcoholrelateddisorders,allergy,amniosdx,analrectal,anemia,aneurysm,anxietydisorders,appendicitis,artembolism,asppneumon,asthma,attentiondeficitconductdisruptivebeha,backproblem,biliarydx,birthasphyx,birthtrauma,bladdercncr,blindness,bnignutneo,bonectcncr,bph,brainnscan,breastcancr,breastdx,brnchlngca,bronchitis,burns,cardiaarrst,cardiacanom,carditis,cataract,cervixcancr,chestpain,chfnonhp,chrkidneydisease,coaghemrdx,coloncancer,comabrndmg,complicdevi,complicproc,conduction,contraceptiv,copd,coronathero,crushinjury,cysticfibro,deliriumdementiaamnesticothercognitiv,developmentaldisorders,diabmelnoc,diabmelwcm,disordersusuallydiagnosedininfancych,diverticulos,dizziness,dminpreg,dysrhythmia,earlylabor,ecodesadverseeffectsofmedicalcare,ecodesadverseeffectsofmedicaldrugs,ecodescutpierce,ecodesdrowningsubmersion,ecodesfall,ecodesfirearm,ecodesfireburn,ecodesmachinery,ecodesmotorvehicletrafficmvt,ecodesnaturalenvironment,ecodesotherspecifiedandclassifiable,ecodesotherspecifiednec,ecodespedalcyclistnotmvt,ecodesplaceofoccurrence,ecodespoisoning,ecodesstruckbyagainst,ecodessuffocation,ecodestransportnotmvt,ecodesunspecified,ectopicpreg,encephalitis,endometrios,epilepsycnv,esophcancer,esophgealdx,exameval,eyeinfectn,fatigue,femgenitca,feminfertil,fetaldistrs,fluidelcdx,fuo,fxarm,fxhip,fxleg,fxskullfac,gangrene,gasduoulcer,gastritis,gastroent,giconganom,gihemorrhag,giperitcan,glaucoma,goutotcrys,guconganom,hdnckcancr,headachemig,hemmorhoids,hemorrpreg,hepatitis,hivinfectn,hodgkinsds,hrtvalvedx,htn,htncomplicn,htninpreg,hyperlipidem,immunitydx,immunizscrn,impulsecontroldisordersnec,inducabortn,infectarth,influenza,infmalegen,intestinfct,intobstruct,intracrninj,jointinjur
y,kidnyrnlca,lateeffcvd,leukemias,liveborn,liveribdca,longpregncy,lowbirthwt,lungexternl,lymphenlarg,maintchemr,malgenitca,maligneopls,malposition,meningitis,menopausldx,menstrualdx,miscellaneousmentalhealthdisorders,mooddisorders,mouthdx,ms,multmyeloma,mycoses,nauseavomit,neoplsmunsp,nephritis,nervcongan,nonepithca,nonhodglym,nutritdefic,obrelatedperintrauma,opnwndextr,opnwndhead,osteoarthros,osteoporosis,otacqdefor,otaftercare,otbnignneo,otbonedx,otcirculdx,otcomplbir,otconganom,otconntiss,otdxbladdr,otdxkidney,otdxstomch,otendodsor,otfemalgen,othbactinf,othcnsinfx,othematldx,othercvd,othereardx,otheredcns,othereyedx,othergidx,othergudx,otherinjury,otherpregnancyanddeliveryincludingnormal,otherscreen,othfracture,othheartdx,othinfectns,othliverdx,othlowresp,othmalegen,othnervdx,othskindx,othveindx,otinflskin,otitismedia,otjointdx,otnutritdx,otperintdx,otpregcomp,otprimryca,otrespirca,otupprresp,otuprspin,ovariancyst,ovarycancer,pancreascan,pancreasdx,paralysis,parkinsons,pathologfx,pelvicobstr,perintjaund,peripathero,peritonitis,personalitydisorders,phlebitis,pid,pleurisy,pneumonia,poisnnonmed,poisnotmed,poisonpsych,precereoccl,prevcsectn,prolapse,prostatecan,pulmhartdx,rctmanusca,rehab,respdistres,retinaldx,rheumarth,schizophreniaandotherpsychoticdisorde,screeningandhistoryofmentalhealthan,septicemia,septicemiaexceptinlabor,sexualinfxs,shock,sicklecell,skininfectn,skinmelanom,sle,socialadmin,spincorinj,spontabortn,sprain,stomchcancr,substancerelateddisorders,suicideandintentionalselfinflictedin,superficinj,syncope,teethdx,testiscancr,thyroidcncr,thyroiddsor,tia,tonsillitis,tuberculosis,ulceratcol,ulcerskin,umbilcord,unclassified,urinstone,urinyorgca,uteruscancr,uti,varicosevn,viralinfect,whtblooddx,n_edvisits,n_admissions,absolutelymphocytecount_last,acetonebld_last,alanineaminotransferase(alt)_last,albumin_last,alkphos_last,anc(absneutrophilcount)_last,aniongap_last,aspartateaminotransferase(ast)_last,"b-typenatriureticpeptide,pro(probnp)_last",baseexcess(poc)_la
st,"baseexcess,venous(poc)_last",basos_last,basosabs_last,"benzodiazepinesscreen,urine,noconf._last",bilirubindirect_last,bilirubintotal_last,bun_last,bun/creatratio_last,calcium_last,calculatedco2(poc)_last,calculatedhco3(poc)i_last,calculatedo2saturation(poc)_last,chloride_last,cktotal_last,co2_last,"co2calculated,venous(poc)_last","co2,poc_last",creatinine_last,d-dimer_last,egfr_last,egfr(nonafricanamerican)_last,egfr(aframer)_last,eos_last,eosinoabs_last,epithelialcells_last,globulin_last,glucose_last,"glucose,meter_last","hco3calculated,venous(poc)_last",hematocrit_last,hemoglobin_last,immaturegrans(abs)_last,immaturegranulocytes_last,inr_last,"lactate,poc_last",lipase_last,lymphs_last,magnesium_last,mch_last,mchc_last,mcv_last,monocytes_last,monosabs_last,mpv_last,neutrophils_last,nrbc_last,nrbcabsolute_last,"o2satcalculated,venous(poc)_last",pco2(poc)_last,"pco2,venous(poc)_last","ph,venous(poc)_last","phencyclidine(pcp)screen,urine,noconf._last",phosphorus_last,platelets_last,po2(poc)_last,"po2,venous(poc)_last",pocbun_last,poccreatinine_last,pocglucose_last,pochematocrit_last,pocionizedcalcium_last,pocph_last,pocpotassium_last,pocsodium_last,poctroponini._last,potassium_last,proteintotal_last,prothrombintime_last,ptt_last,rbc_last,rbc/hpf_last,rdw_last,sodium_last,troponini(poc)_last,troponint_last,tsh_last,wbc_last,wbc/hpf_last,absolutelymphocytecount_min,acetonebld_min,alanineaminotransferase(alt)_min,albumin_min,alkphos_min,anc(absneutrophilcount)_min,aniongap_min,aspartateaminotransferase(ast)_min,"b-typenatriureticpeptide,pro(probnp)_min",baseexcess(poc)_min,"baseexcess,venous(poc)_min",basos_min,basosabs_min,"benzodiazepinesscreen,urine,noconf._min",bilirubindirect_min,bilirubintotal_min,bun_min,bun/creatratio_min,calcium_min,calculatedco2(poc)_min,calculatedhco3(poc)i_min,calculatedo2saturation(poc)_min,chloride_min,cktotal_min,co2_min,"co2calculated,venous(poc)_min","co2,poc_min",creatinine_min,d-dimer_min,egfr_min,egfr(nonafricanamerican)_min,egfr(
aframer)_min,eos_min,eosinoabs_min,epithelialcells_min,globulin_min,glucose_min,"glucose,meter_min","hco3calculated,venous(poc)_min",hematocrit_min,hemoglobin_min,immaturegrans(abs)_min,immaturegranulocytes_min,inr_min,"lactate,poc_min",lipase_min,lymphs_min,magnesium_min,mch_min,mchc_min,mcv_min,monocytes_min,monosabs_min,mpv_min,neutrophils_min,nrbc_min,nrbcabsolute_min,"o2satcalculated,venous(poc)_min",pco2(poc)_min,"pco2,venous(poc)_min","ph,venous(poc)_min","phencyclidine(pcp)screen,urine,noconf._min",phosphorus_min,platelets_min,po2(poc)_min,"po2,venous(poc)_min",pocbun_min,poccreatinine_min,pocglucose_min,pochematocrit_min,pocionizedcalcium_min,pocph_min,pocpotassium_min,pocsodium_min,poctroponini._min,potassium_min,proteintotal_min,prothrombintime_min,ptt_min,rbc_min,rbc/hpf_min,rdw_min,sodium_min,troponini(poc)_min,troponint_min,tsh_min,wbc_min,wbc/hpf_min,absolutelymphocytecount_max,acetonebld_max,alanineaminotransferase(alt)_max,albumin_max,alkphos_max,anc(absneutrophilcount)_max,aniongap_max,aspartateaminotransferase(ast)_max,"b-typenatriureticpeptide,pro(probnp)_max",baseexcess(poc)_max,"baseexcess,venous(poc)_max",basos_max,basosabs_max,"benzodiazepinesscreen,urine,noconf._max",bilirubindirect_max,bilirubintotal_max,bun_max,bun/creatratio_max,calcium_max,calculatedco2(poc)_max,calculatedhco3(poc)i_max,calculatedo2saturation(poc)_max,chloride_max,cktotal_max,co2_max,"co2calculated,venous(poc)_max","co2,poc_max",creatinine_max,d-dimer_max,egfr_max,egfr(nonafricanamerican)_max,egfr(aframer)_max,eos_max,eosinoabs_max,epithelialcells_max,globulin_max,glucose_max,"glucose,meter_max","hco3calculated,venous(poc)_max",hematocrit_max,hemoglobin_max,immaturegrans(abs)_max,immaturegranulocytes_max,inr_max,"lactate,poc_max",lipase_max,lymphs_max,magnesium_max,mch_max,mchc_max,mcv_max,monocytes_max,monosabs_max,mpv_max,neutrophils_max,nrbc_max,nrbcabsolute_max,"o2satcalculated,venous(poc)_max",pco2(poc)_max,"pco2,venous(poc)_max","ph,venous(poc)_max","phencyclidine(
pcp)screen,urine,noconf._max",phosphorus_max,platelets_max,po2(poc)_max,"po2,venous(poc)_max",pocbun_max,poccreatinine_max,pocglucose_max,pochematocrit_max,pocionizedcalcium_max,pocph_max,pocpotassium_max,pocsodium_max,poctroponini._max,potassium_max,proteintotal_max,prothrombintime_max,ptt_max,rbc_max,rbc/hpf_max,rdw_max,sodium_max,troponini(poc)_max,troponint_max,tsh_max,wbc_max,wbc/hpf_max,absolutelymphocytecount_median,acetonebld_median,alanineaminotransferase(alt)_median,albumin_median,alkphos_median,anc(absneutrophilcount)_median,aniongap_median,aspartateaminotransferase(ast)_median,"b-typenatriureticpeptide,pro(probnp)_median",baseexcess(poc)_median,"baseexcess,venous(poc)_median",basos_median,basosabs_median,"benzodiazepinesscreen,urine,noconf._median",bilirubindirect_median,bilirubintotal_median,bun_median,bun/creatratio_median,calcium_median,calculatedco2(poc)_median,calculatedhco3(poc)i_median,calculatedo2saturation(poc)_median,chloride_median,cktotal_median,co2_median,"co2calculated,venous(poc)_median","co2,poc_median",creatinine_median,d-dimer_median,egfr_median,egfr(nonafricanamerican)_median,egfr(aframer)_median,eos_median,eosinoabs_median,epithelialcells_median,globulin_median,glucose_median,"glucose,meter_median","hco3calculated,venous(poc)_median",hematocrit_median,hemoglobin_median,immaturegrans(abs)_median,immaturegranulocytes_median,inr_median,"lactate,poc_median",lipase_median,lymphs_median,magnesium_median,mch_median,mchc_median,mcv_median,monocytes_median,monosabs_median,mpv_median,neutrophils_median,nrbc_median,nrbcabsolute_median,"o2satcalculated,venous(poc)_median",pco2(poc)_median,"pco2,venous(poc)_median","ph,venous(poc)_median","phencyclidine(pcp)screen,urine,noconf._median",phosphorus_median,platelets_median,po2(poc)_median,"po2,venous(poc)_median",pocbun_median,poccreatinine_median,pocglucose_median,pochematocrit_median,pocionizedcalcium_median,pocph_median,pocpotassium_median,pocsodium_median,poctroponini._median,potassium_median,pro
teintotal_median,prothrombintime_median,ptt_median,rbc_median,rbc/hpf_median,rdw_median,sodium_median,troponini(poc)_median,troponint_median,tsh_median,wbc_median,wbc/hpf_median,bloodua_last,glucoseua_last,ketonesua_last,leukocytesua_last,nitriteua_last,pregtestur_last,proteinua_last,"bloodculture,routine_last","urineculture,routine_last",bloodua_npos,glucoseua_npos,ketonesua_npos,leukocytesua_npos,nitriteua_npos,pregtestur_npos,proteinua_npos,"bloodculture,routine_npos","urineculture,routine_npos",bloodua_count,glucoseua_count,ketonesua_count,leukocytesua_count,nitriteua_count,pregtestur_count,proteinua_count,"bloodculture,routine_count","urineculture,routine_count",triage_vital_hr,triage_vital_sbp,triage_vital_dbp,triage_vital_rr,triage_vital_o2,triage_vital_o2_device,triage_vital_temp,pulse_last,resp_last,spo2_last,temp_last,sbp_last,dbp_last,o2_device_last,pulse_min,resp_min,spo2_min,temp_min,sbp_min,dbp_min,o2_device_min,pulse_max,resp_max,spo2_max,temp_max,sbp_max,dbp_max,o2_device_max,pulse_median,resp_median,spo2_median,temp_median,sbp_median,dbp_median,o2_device_median,cxr_count,echo_count,ekg_count,headct_count,mri_count,otherct_count,otherimg_count,otherus_count,otherxr_count,meds_analgesicandantihistaminecombination,meds_analgesics,meds_anesthetics,meds_anti-obesitydrugs,meds_antiallergy,meds_antiarthritics,meds_antiasthmatics,meds_antibiotics,meds_anticoagulants,meds_antidotes,meds_antifungals,meds_antihistamineanddecongestantcombination,meds_antihistamines,meds_antihyperglycemics,meds_antiinfectives,meds_antiinfectives/miscellaneous,meds_antineoplastics,meds_antiparkinsondrugs,meds_antiplateletdrugs,meds_antivirals,meds_autonomicdrugs,meds_biologicals,meds_blood,meds_cardiacdrugs,meds_cardiovascular,meds_cnsdrugs,meds_colonystimulatingfactors,meds_contraceptives,meds_cough/coldpreparations,meds_diagnostic,meds_diuretics,meds_eentpreps,meds_elect/caloric/h2o,meds_gastrointestinal,meds_herbals,meds_hormones,meds_immunosuppressants,meds_investigational,"m
eds_miscellaneousmedicalsupplies,devices,non-drug",meds_musclerelaxants,meds_pre-natalvitamins,meds_psychotherapeuticdrugs,meds_sedative/hypnotics,meds_skinpreps,meds_smokingdeterrents,meds_thyroidpreps,meds_unclassifieddrugproducts,meds_vitamins,n_surgeries,cc_abdominalcramping,cc_abdominaldistention,cc_abdominalpain,cc_abdominalpainpregnant,cc_abnormallab,cc_abscess,cc_addictionproblem,cc_agitation,cc_alcoholintoxication,cc_alcoholproblem,cc_allergicreaction,cc_alteredmentalstatus,cc_animalbite,cc_ankleinjury,cc_anklepain,cc_anxiety,cc_arminjury,cc_armpain,cc_armswelling,cc_assaultvictim,cc_asthma,cc_backpain,cc_bleeding/bruising,cc_blurredvision,cc_bodyfluidexposure,cc_breastpain,cc_breathingdifficulty,cc_breathingproblem,cc_burn,cc_cardiacarrest,cc_cellulitis,cc_chestpain,cc_chesttightness,cc_chills,cc_coldlikesymptoms,cc_confusion,cc_conjunctivitis,cc_constipation,cc_cough,cc_cyst,cc_decreasedbloodsugar-symptomatic,cc_dehydration,cc_dentalpain,cc_depression,cc_detoxevaluation,cc_diarrhea,cc_dizziness,cc_drug/alcoholassessment,cc_drugproblem,cc_dyspnea,cc_dysuria,cc_earpain,cc_earproblem,cc_edema,cc_elbowpain,cc_elevatedbloodsugar-nosymptoms,cc_elevatedbloodsugar-symptomatic,cc_emesis,cc_epigastricpain,cc_epistaxis,cc_exposuretostd,cc_extremitylaceration,cc_extremityweakness,cc_eyeinjury,cc_eyepain,cc_eyeproblem,cc_eyeredness,cc_facialinjury,cc_faciallaceration,cc_facialpain,cc_facialswelling,cc_fall,cc_fall>65,cc_fatigue,cc_femaleguproblem,cc_fever,cc_fever-75yearsorolder,cc_fever-9weeksto74years,cc_feverimmunocompromised,cc_fingerinjury,cc_fingerpain,cc_fingerswelling,cc_flankpain,cc_follow-upcellulitis,cc_footinjury,cc_footpain,cc_footswelling,cc_foreignbodyineye,cc_fulltrauma,cc_generalizedbodyaches,cc_gibleeding,cc_giproblem,cc_groinpain,cc_hallucinations,cc_handinjury,cc_handpain,cc_headache,cc_headache-newonsetornewsymptoms,cc_headache-recurrentorknowndxmigraines,cc_headachere-evaluation,cc_headinjury,cc_headlaceration,cc_hematuria,cc_hemoptysis,cc_hippai
n,cc_homicidal,cc_hyperglycemia,cc_hypertension,cc_hypotension,cc_influenza,cc_ingestion,cc_insectbite,cc_irregularheartbeat,cc_jawpain,cc_jointswelling,cc_kneeinjury,cc_kneepain,cc_laceration,cc_leginjury,cc_legpain,cc_legswelling,cc_lethargy,cc_lossofconsciousness,cc_maleguproblem,cc_mass,cc_medicalproblem,cc_medicalscreening,cc_medicationproblem,cc_medicationrefill,cc_migraine,cc_modifiedtrauma,cc_motorcyclecrash,cc_motorvehiclecrash,cc_multiplefalls,cc_nasalcongestion,cc_nausea,cc_nearsyncope,cc_neckpain,cc_neurologicproblem,cc_numbness,cc_oralswelling,cc_otalgia,cc_other,cc_overdose-accidental,cc_overdose-intentional,cc_pain,cc_palpitations,cc_panicattack,cc_pelvicpain,cc_poisoning,cc_post-opproblem,cc_psychiatricevaluation,cc_psychoticsymptoms,cc_rapidheartrate,cc_rash,cc_rectalbleeding,cc_rectalpain,cc_respiratorydistress,cc_ribinjury,cc_ribpain,cc_seizure-newonset,cc_seizure-priorhxof,cc_seizures,cc_shortnessofbreath,cc_shoulderinjury,cc_shoulderpain,cc_sicklecellpain,cc_sinusproblem,cc_skinirritation,cc_skinproblem,cc_sorethroat,cc_stdcheck,cc_strokealert,cc_suicidal,cc_suture/stapleremoval,cc_swallowedforeignbody,cc_syncope,cc_tachycardia,cc_testiclepain,cc_thumbinjury,cc_tickremoval,cc_toeinjury,cc_toepain,cc_trauma,cc_unresponsive,cc_uri,cc_urinaryfrequency,cc_urinaryretention,cc_urinarytractinfection,cc_vaginalbleeding,cc_vaginaldischarge,cc_vaginalpain,cc_weakness,cc_wheezing,cc_withdrawal-alcohol,cc_woundcheck,cc_woundinfection,cc_woundre-evaluation,cc_wristinjury,cc_wristpain
0,B,4,40,Male,Hispanic or Latino,White or Caucasian,English,,Single,Full Time,Other,Discharge,Walk-in,June,Tuesday,23-02,No previous dispo,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,63.0,146.0,85.0,18.0,97.0,0.0,97.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.
0,0.0,0.0,0.0,0.0,1.0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,B,4,66,Male,Hispanic or Latino,Native Hawaiian or Other Pacific Islander,English,Pentecostal,Married,Not Employed,Commercial,Discharge,Car,January,Tuesday,15-18,No previous dispo,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,83.0,125.0,77.0,16.0,,0.0,98.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2.0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,B,2,66,Male,Hispanic or Latino,Native Hawaiian or Other Pacific Islander,English,Pentecostal,Married,Not Employed,Commercial,Discharge,Walk-in,July,Thursday,11-14,Discharge,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,78.0,134.0,78.0,16.0,97.0,,97.8,83.0,16.0,,98.0,125.0,77.0,0.0,83.0,16.0,,98.0,125.0,77.0,0.0,83.0,16.0,,98.0,125.0,77.0,0.0,83.0,16.0,,98.0,125.0,77.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0
.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2.0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,A,2,66,Male,Hispanic or Latino,Native Hawaiian or Other Pacific Islander,English,Pentecostal,Married,Not Employed,Commercial,Discharge,Car,July,Saturday,11-14,Discharge,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,2,0,1.9,,12.0,,71.0,9.2,,19.0,,-1.0,,2.0,,,0.1,0.4,,,,26.0,24.8,50.0,,,,,25.0,,,,,,2.0,,,,,,,42.3,14.9,,,1.0,0.6,,16.0,,31.1,35.2,88.0,5.0,,10.3,76.0,,,,44.0,,,,,238.0,28.0,,10.0,1.0,109.0,44.0,4.5,7.36,4.2,136.0,,,,10.3,,4.8,,12.6,,0.0,,,12.2,,1.9,,12.0,,71.0,9.2,,19.0,,-1.0,,2.0,,,0.1,0.4,,,,26.0,24.8,50.0,,,,,25.0,,,,,,2.0,,,,,,,42.3,14.9,,,1.0,0.6,,16.0,,31.1,35.2,88.0,5.0,,10.3,76.0,,,,44.0,,,,,238.0,28.0,,10.0,1.0,109.0,44.0,4.5,7.36,4.2,136.0,,,,10.3,,4.8,,12.6,,0.0,,,12.2,,1.9,,12.0,,71.0,9.2,,19.0,,-1.0,,2.0,,,0.1,0.4,,,,26.0,24.8,50.0,,,,,25.0,,,,,,2.0,,,,,,,42.3,14.9,,,1.0,0.6,,16.0,,31.1,35.2,88.0,5.0,,10.3,76.0,,,,44.0,,,,,238.0,28.0,,10.0,1.0,109.0,44.0,4.5,7.36,4.2,136.0,,,,10.3,,4
.8,,12.6,,0.0,,,12.2,,1.9,,12.0,,71.0,9.2,,19.0,,-1.0,,2.0,,,0.1,0.4,,,,26.0,24.8,50.0,,,,,25.0,,,,,,2.0,,,,,,,42.3,14.9,,,1.0,0.6,,16.0,,31.1,35.2,88.0,5.0,,10.3,76.0,,,,44.0,,,,,238.0,28.0,,10.0,1.0,109.0,44.0,4.5,7.36,4.2,136.0,,,,10.3,,4.8,,12.6,,0.0,,,12.2,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,98.0,171.0,92.0,18.0,98.0,0.0,,61.0,16.0,98.0,97.6,135.0,83.0,0.0,61.0,16.0,97.0,97.6,125.0,67.0,0.0,83.0,16.0,98.0,98.0,135.0,83.0,0.0,74.5,16.0,98.0,97.85,132.5,77.5,0.0,1.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2.0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,A,3,84,Female,Hispanic or Latino,Other,Other,Pentecostal,Widowed,Retired,Medicare,Admit,Walk-in,November,Tuesday,07-10,Discharge,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,101.0,133.0,72.0,18.0,97.0,0.0,98.4,75.0,18.0,97.0,98.0,140.0,58.0,0.0,70.0,18.0,97.0,98.0,96.0,58.0,0.0,76.0,18.0,98.0,98.4,140.0,76.0,0.0,75.0,18.0,97.0,98.2,132.0,70.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.
0,0.0,0.0,0.0,0.0,0.0,0.0,2.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,2.0,2.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,5.0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
5,A,3,86,Female,Hispanic or Latino,Other,Other,Pentecostal,Widowed,Retired,Medicare,Discharge,Walk-in,April,Monday,15-18,Admit,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,76.0,143.0,87.0,18.0,98.0,0.0,98.5,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.
0,0.0,10.0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
6,A,3,86,Female,Hispanic or Latino,Other,Other,Pentecostal,Widowed,Retired,Medicare,Admit,Car,September,Wednesday,11-14,Discharge,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1,0,2.3,,,,,6.8,15.0,,,,,0.0,,,,,12.0,,9.6,,,,96.0,,28.7,,,0.7,,,,,1.0,,,,93.0,,,37.6,12.5,,,,,,23.0,2.0,31.8,33.3,96.0,8.0,,8.2,68.0,,,,,,,,3.4,326.0,,,,,,,,,,,,3.9,,,,3.9,,13.2,140.0,,,,9.9,,2.3,,,,,6.8,15.0,,,,,0.0,,,,,12.0,,9.6,,,,96.0,,28.7,,,0.7,,,,,1.0,,,,93.0,,,37.6,12.5,,,,,,23.0,2.0,31.8,33.3,96.0,8.0,,8.2,68.0,,,,,,,,3.4,326.0,,,,,,,,,,,,3.9,,,,3.9,,13.2,140.0,,,,9.9,,2.3,,,,,6.8,15.0,,,,,0.0,,,,,12.0,,9.6,,,,96.0,,28.7,,,0.7,,,,,1.0,,,,93.0,,,37.6,12.5,,,,,,23.0,2.0,31.8,33.3,96.0,8.0,,8.2,68.0,,,,,,,,3.4,326.0,,,,,,,,,,,,3.9,,,,3.9,,13.2,140.0,,,,9.9,,2.3,,,,,6.8,15.0,,,,,0.0,,,,,12.0,,9.6,,,,96.0,,28.7,,,0.7,,,,,1.0,,,,93.0,,,37.6,12.5,,,,,,23.0,2.0,31.8,33.3,96.0,8.0,,8.2,68.0,,,,,,,,3.4,326.0,,,,,,,,,,,,3.9,,,,3.9,,13.2,
140.0,,,,9.9,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,0.0,1.0,0.0,0.0,77.0,154.0,69.0,16.0,,0.0,98.0,66.0,20.0,95.0,98.4,129.0,69.0,0.0,66.0,18.0,95.0,98.4,129.0,69.0,0.0,76.0,20.0,98.0,98.5,143.0,87.0,0.0,71.0,19.0,96.5,98.45,136.0,78.0,0.0,1.0,0.0,1.0,1.0,0.0,2.0,0.0,0.0,2.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,10.0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
7,A,4,87,Female,Hispanic or Latino,Other,Other,Pentecostal,Widowed,Retired,Medicare,Discharge,Car,March,Saturday,11-14,Admit,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2,1,1.9,,14.0,,57.0,5.2,12.0,21.0,556.9,,,0.4,0.0,,0.1,0.3,11.0,16.2,9.4,,,,99.0,,28.0,,,0.68,,,,,0.4,0.0,,,87.0,,,33.9,11.2,0.0,0.3,,,33.0,24.7,,30.9,33.0,93.6,8.3,0.7,11.5,65.9,0.0,0.0,,,,,,,266.0,,,,,,,,,,,,4.6,,,,3.6,,14.3,139.0,0.18,0.0,,7.8,,1.9,,14.0,,57.0,5.2,12.0,21.0,556.9,,,0.0,0.0,,0.1,0.3,11.0,16.2,9.4,,,,96.0,,28.0,,,0.68,,,,,0.4,0.0,,,87.0,,,33.9,11.2,0.0,0.3,,,33.0,23.0,2.0,30.9,33.0,93.6,8.0,0.7,8.2,65.9,0.0,0.0,,,,,,3.4,266.0,,,,,,,,,,,,3.9,,,,3.6,,13.2,139.0,0.18,0.0,,7.8,,2.3,,14.0,,57.0,6.8,15.0,21.0,556.9,,,0.4,0.0,,0.1,0.3,12.0,16.2,9.6,,,,99.0,,28.7,,,0.7,,,,,1.0,0.0,,,93.0,,,37.6,12.5,0.0,0.3,,,33.0,24.7,2.0,31.8,33.3,96.0,8.3,0.7,11.5,68.0,0.0,0.0,,,,,,3.4,326.0,,,,,,,,,,,,4.6,,,,3.9,,14.3,140.0,0.18,0.0,,9.9,,2.1,,
14.0,,57.0,6.0,13.5,21.0,556.9,,,0.2,0.0,,0.1,0.3,11.5,16.2,9.5,,,,97.5,,28.35,,,0.69,,,,,0.7,0.0,,,90.0,,,35.75,11.85,0.0,0.3,,,33.0,23.85,2.0,31.35,33.15,94.8,8.15,0.7,9.85,66.95,0.0,0.0,,,,,,3.4,296.0,,,,,,,,,,,,4.25,,,,3.75,,13.75,139.5,0.18,0.0,,8.85,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,0.0,1.0,0.0,0.0,88.0,155.0,75.0,17.0,98.0,0.0,97.8,68.0,13.0,98.0,,,,,66.0,13.0,95.0,97.5,117.0,51.0,0.0,79.0,20.0,98.0,98.5,154.0,87.0,0.0,76.0,17.0,98.0,98.0,131.5,62.5,0.0,2.0,1.0,3.0,1.0,0.0,2.0,0.0,0.0,2.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,11.0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
8,B,2,75,Male,Non-Hispanic,White or Caucasian,English,,Married,Retired,Medicare,Admit,ambulance,March,Sunday,03-06,Admit,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1,1,0.8,,31.0,5.1,94.0,14.6,18.0,26.0,,,,0.0,,,0.3,0.6,31.0,16.3,10.2,,,,108.0,31.0,17.0,,,1.9,,37.0,,,0.0,,,3.0,213.0,,,48.0,16.1,,,1.13,,18.0,5.0,1.9,32.8,33.6,98.0,2.0,,9.6,93.0,,,,,,,,3.6,239.0,,,,,,,,5.0,,,,4.7,8.1,12.8,28.2,4.9,,12.7,143.0,,0.02,0.78,15.7,,0.8,,31.0,5.1,94.0,14.6,18.0,26.0,,,,0.0,,,0.3,0.6,31.0,16.3,10.2,,,,108.0,31.0,17.0,,,1.9,,37.0,,,0.0,,,3.0,213.0,,,48.0,16.1,,,1.13,,18.0,5.0,1.9,32.8,33.6,98.0,2.0,,9.6,93.0,,,,,,,,3.6,239.0,,,,,,,,5.0,,,,4.7,8.1,12.8,28.2,4.9,,12.7,143.0,,0.02,0.78,15.7,,0.8,,31.0,5.1,94.0,14.6,18.0,26.0,,,,0.0,,,0.3,0.6,31.0,16.3,10.2,,,,108.0,31.0,17.0,,,1.9,,37.0,,,0.0,,,3.0,213.0,,,48.0,16.1,,,1.13,,18.0,5.0,1.9,32.8,33.6,98.0,2.0,,9.6,93.0,,,,,,,,3.6,239.0,,,,,,,,5.0,,,,4.7,8.1,12.8,28.2,4.9,,12
.7,143.0,,0.02,0.78,15.7,,0.8,,31.0,5.1,94.0,14.6,18.0,26.0,,,,0.0,,,0.3,0.6,31.0,16.3,10.2,,,,108.0,31.0,17.0,,,1.9,,37.0,,,0.0,,,3.0,213.0,,,48.0,16.1,,,1.13,,18.0,5.0,1.9,32.8,33.6,98.0,2.0,,9.6,93.0,,,,,,,,3.6,239.0,,,,,,,,5.0,,,,4.7,8.1,12.8,28.2,4.9,,12.7,143.0,,0.02,0.78,15.7,,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,0.0,1.0,1.0,1.0,,,,,,,,84.0,16.0,97.0,,151.0,91.0,0.0,84.0,16.0,97.0,98.6,132.0,80.0,0.0,103.0,18.0,97.0,100.0,184.0,95.0,0.0,100.0,18.0,97.0,100.0,150.0,89.5,0.0,1.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,4.0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
9,B,2,75,Male,Non-Hispanic,White or Caucasian,English,,Married,Retired,Medicare,Admit,ambulance,October,Sunday,15-18,Admit,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1,1,,,53.0,4.4,81.0,,17.0,39.0,,,,,,,,0.7,107.0,28.2,9.7,,,,110.0,,21.0,,,3.8,,16.0,,,,,,3.1,123.0,,,50.2,17.3,,,1.23,,46.0,,,33.7,34.5,98.0,,,9.6,,,,,,,,,,196.0,,,,,,,,,,,,5.8,7.5,13.4,30.2,5.1,,13.5,148.0,,,,22.6,,,,53.0,4.4,81.0,,17.0,39.0,,,,,,,,0.7,107.0,28.2,9.7,,,,110.0,,21.0,,,3.8,,16.0,,,,,,3.1,123.0,,,50.2,17.3,,,1.23,,46.0,,,33.7,34.5,98.0,,,9.6,,,,,,,,,,196.0,,,,,,,,,,,,5.8,7.5,13.4,30.2,5.1,,13.5,148.0,,,,22.6,,,,53.0,4.4,81.0,,17.0,39.0,,,,,,,,0.7,107.0,28.2,9.7,,,,110.0,,21.0,,,3.8,,16.0,,,,,,3.1,123.0,,,50.2,17.3,,,1.23,,46.0,,,33.7,34.5,98.0,,,9.6,,,,,,,,,,196.0,,,,,,,,,,,,5.8,7.5,13.4,30.2,5.1,,13.5,148.0,,,,22.6,,,,53.0,4.4,81.0,,17.0,39.0,,,,,,,,0.7,107.0,28.2,9.7,,,,110.0,,21.0,,,3.8,,16.0,,,,,,3.1,123.0,,,50.2,17.3,,,1.23
,,46.0,,,33.7,34.5,98.0,,,9.6,,,,,,,,,,196.0,,,,,,,,,,,,5.8,7.5,13.4,30.2,5.1,,13.5,148.0,,,,22.6,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,86.0,18.0,94.0,,106.0,75.0,0.0,82.0,16.0,93.0,97.5,92.0,53.0,0.0,104.0,18.0,98.0,99.0,106.0,75.0,0.0,86.0,18.0,97.0,98.25,101.0,73.0,0.0,1.0,0.0,1.0,1.0,0.0,0.0,0.0,2.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,4.0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


Looking at the data dictionary, we can see that columns are grouped into the following distinct categories: 
- `Demographics`: All demographic data related to patient
- `Triage Variables`: All vital signs recorded by the triage nurse
- `Hospital Usage`: How often the patient used the hospital, prior surgeries or hospitalizations
- `Chief Complaint`: What they visited the ER for
- `Past Medical History`: Pre-existing conditions of patients
- `Outpatient Medications`: Number of outpatient medications taken, divided up by type
- `Imaging`: Number of imaging procedures within one year of the current visit
- `Historical Labs (Numeric)`: The last, min, max, and median recorded values of all vitals taken in the ED within one year of the current visit
- `Historical Labs (Categorical)`: The last value, the number of positive results, and the total number of tests ordered for selected categorical labs ordered in the ED within one year of the current visit
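
Some of these categories can be recognised from the column names alone. The sketch below is a rough illustration rather than part of the original analysis: the prefix-to-category mapping and the `guess_category` helper are assumptions inferred from the column names, and the data dictionary remains the authoritative source for every column that no prefix matches.

```python
from collections import Counter

# Assumed mapping, inferred from column names only (not from the data dictionary).
PREFIX_TO_CATEGORY = {
    "triage_vital_": "Triage Variables",
    "cc_": "Chief Complaint",
    "meds_": "Outpatient Medications",
}


def guess_category(column: str) -> str:
    """Return a best-guess category for a column name, or 'Unassigned'."""
    for prefix, category in PREFIX_TO_CATEGORY.items():
        if column.startswith(prefix):
            return category
    return "Unassigned"


# A handful of real column names from the dataset header.
sample_columns = [
    "age", "gender", "triage_vital_hr", "triage_vital_sbp",
    "cc_hyperglycemia", "cc_chestpain", "meds_antihyperglycemics",
    "diabmelnoc", "glucose_last",
]

# Count how many of the sample columns fall into each guessed category.
category_counts = Counter(guess_category(c) for c in sample_columns)
print(category_counts)
```

On the full data frame, the same count could be obtained by iterating over `raw_data.columns` instead of `sample_columns`.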

In [4]:
raw_data.size #total number of elements (rows * columns), not the number of rows

544792392

In [5]:
raw_data.shape[1] #to find out the number of columns

972

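As we can see, our data frame has 972 columns and 560,486 rows (544,792,392 elements divided by 972 columns), which matches the number of patient visits described in the introduction.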
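
`DataFrame.size` counts every cell, while `DataFrame.shape` returns the row and column counts separately. A minimal sketch on a made-up frame (the `toy` frame and its values are illustrative only) shows the difference and how the row count follows from the two numbers printed above:

```python
import pandas as pd

# Illustrative three-row frame; not part of the real dataset.
toy = pd.DataFrame({"esi": [3, 3, 4], "age": [86, 86, 87]})

n_rows, n_cols = toy.shape   # (rows, columns)
total_cells = toy.size       # rows * columns
print(n_rows, n_cols, total_cells)

# Applying the same relationship to the outputs reported in this notebook:
rows_from_size = 544_792_392 // 972
print(rows_from_size)
```

With the real data, `raw_data.shape[0]` returns the row count directly.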


---
#### <a id = 'col'></a> 1.2 Deleting Columns

To build a usable data set for predicting outcomes for diabetic patients, we first need to isolate those patients. According to the data dictionary, patients flagged in `diabmelnoc` or `diabmelwcm` have a prior medical history of diabetes. Patients visiting with a chief complaint of hyperglycemia (`cc_hyperglycemia`) could possibly be diabetic as well.

Here is the proposed plan of attack: 
- Create a new data frame that only contains the data of those with diabetes

First, let's isolate the visits that are positive for at least one of these three flags.

In [6]:
#Filter rows that are positive for the condition
filtered_rows = raw_data[(raw_data['cc_hyperglycemia'] == 1) | (raw_data['diabmelnoc'] == 1) | (raw_data['diabmelwcm'] == 1)]

In [7]:
#Copy the filtered rows into a new, independent DataFrame
health_data = filtered_rows.copy()
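
The filter above keeps a visit when any of the three diabetes-related flags equals 1. As a hedged sketch of the same idea, the example below applies it to a small made-up frame (`toy_visits` and its values are illustrative only) and uses `.eq(1).any(axis=1)`, which scales more easily if further flag columns are added later:

```python
import pandas as pd

# Illustrative data only; the column names match the real dataset.
toy_visits = pd.DataFrame({
    "cc_hyperglycemia": [0, 1, 0, 0],
    "diabmelnoc": [0.0, 0.0, 1.0, 0.0],
    "diabmelwcm": [0.0, 0.0, 0.0, 0.0],
})

diabetes_flags = ["cc_hyperglycemia", "diabmelnoc", "diabmelwcm"]

# True for every row where at least one flag column equals 1.
is_diabetic = toy_visits[diabetes_flags].eq(1).any(axis=1)

# .copy() gives an independent frame, so later edits do not touch toy_visits.
toy_diabetic = toy_visits[is_diabetic].copy()
print(toy_diabetic.index.tolist())
```

Only the second and third rows carry a positive flag, so those are the rows that are kept.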

In [8]:
health_data.head(5)

Unnamed: 0,dep_name,esi,age,gender,ethnicity,race,lang,religion,maritalstatus,employstatus,insurance_status,disposition,arrivalmode,arrivalmonth,arrivalday,arrivalhour_bin,previousdispo,2ndarymalig,abdomhernia,abdomnlpain,abortcompl,acqfootdef,acrenlfail,acutecvd,acutemi,acutphanm,adjustmentdisorders,adltrespfl,alcoholrelateddisorders,allergy,amniosdx,analrectal,anemia,aneurysm,anxietydisorders,appendicitis,artembolism,asppneumon,asthma,attentiondeficitconductdisruptivebeha,backproblem,biliarydx,birthasphyx,birthtrauma,bladdercncr,blindness,bnignutneo,bonectcncr,bph,brainnscan,breastcancr,breastdx,brnchlngca,bronchitis,burns,cardiaarrst,cardiacanom,carditis,cataract,cervixcancr,chestpain,chfnonhp,chrkidneydisease,coaghemrdx,coloncancer,comabrndmg,complicdevi,complicproc,conduction,contraceptiv,copd,coronathero,crushinjury,cysticfibro,deliriumdementiaamnesticothercognitiv,developmentaldisorders,diabmelnoc,diabmelwcm,disordersusuallydiagnosedininfancych,diverticulos,dizziness,dminpreg,dysrhythmia,earlylabor,ecodesadverseeffectsofmedicalcare,ecodesadverseeffectsofmedicaldrugs,ecodescutpierce,ecodesdrowningsubmersion,ecodesfall,ecodesfirearm,ecodesfireburn,ecodesmachinery,ecodesmotorvehicletrafficmvt,ecodesnaturalenvironment,ecodesotherspecifiedandclassifiable,ecodesotherspecifiednec,ecodespedalcyclistnotmvt,ecodesplaceofoccurrence,ecodespoisoning,ecodesstruckbyagainst,ecodessuffocation,ecodestransportnotmvt,ecodesunspecified,ectopicpreg,encephalitis,endometrios,epilepsycnv,esophcancer,esophgealdx,exameval,eyeinfectn,fatigue,femgenitca,feminfertil,fetaldistrs,fluidelcdx,fuo,fxarm,fxhip,fxleg,fxskullfac,gangrene,gasduoulcer,gastritis,gastroent,giconganom,gihemorrhag,giperitcan,glaucoma,goutotcrys,guconganom,hdnckcancr,headachemig,hemmorhoids,hemorrpreg,hepatitis,hivinfectn,hodgkinsds,hrtvalvedx,htn,htncomplicn,htninpreg,hyperlipidem,immunitydx,immunizscrn,impulsecontroldisordersnec,inducabortn,infectarth,influenza,infmalegen,intestinfct,intobstruct,intracrninj,jointinjur
y,kidnyrnlca,lateeffcvd,leukemias,liveborn,liveribdca,longpregncy,lowbirthwt,lungexternl,lymphenlarg,maintchemr,malgenitca,maligneopls,malposition,meningitis,menopausldx,menstrualdx,miscellaneousmentalhealthdisorders,mooddisorders,mouthdx,ms,multmyeloma,mycoses,nauseavomit,neoplsmunsp,nephritis,nervcongan,nonepithca,nonhodglym,nutritdefic,obrelatedperintrauma,opnwndextr,opnwndhead,osteoarthros,osteoporosis,otacqdefor,otaftercare,otbnignneo,otbonedx,otcirculdx,otcomplbir,otconganom,otconntiss,otdxbladdr,otdxkidney,otdxstomch,otendodsor,otfemalgen,othbactinf,othcnsinfx,othematldx,othercvd,othereardx,otheredcns,othereyedx,othergidx,othergudx,otherinjury,otherpregnancyanddeliveryincludingnormal,otherscreen,othfracture,othheartdx,othinfectns,othliverdx,othlowresp,othmalegen,othnervdx,othskindx,othveindx,otinflskin,otitismedia,otjointdx,otnutritdx,otperintdx,otpregcomp,otprimryca,otrespirca,otupprresp,otuprspin,ovariancyst,ovarycancer,pancreascan,pancreasdx,paralysis,parkinsons,pathologfx,pelvicobstr,perintjaund,peripathero,peritonitis,personalitydisorders,phlebitis,pid,pleurisy,pneumonia,poisnnonmed,poisnotmed,poisonpsych,precereoccl,prevcsectn,prolapse,prostatecan,pulmhartdx,rctmanusca,rehab,respdistres,retinaldx,rheumarth,schizophreniaandotherpsychoticdisorde,screeningandhistoryofmentalhealthan,septicemia,septicemiaexceptinlabor,sexualinfxs,shock,sicklecell,skininfectn,skinmelanom,sle,socialadmin,spincorinj,spontabortn,sprain,stomchcancr,substancerelateddisorders,suicideandintentionalselfinflictedin,superficinj,syncope,teethdx,testiscancr,thyroidcncr,thyroiddsor,tia,tonsillitis,tuberculosis,ulceratcol,ulcerskin,umbilcord,unclassified,urinstone,urinyorgca,uteruscancr,uti,varicosevn,viralinfect,whtblooddx,n_edvisits,n_admissions,absolutelymphocytecount_last,acetonebld_last,alanineaminotransferase(alt)_last,albumin_last,alkphos_last,anc(absneutrophilcount)_last,aniongap_last,aspartateaminotransferase(ast)_last,"b-typenatriureticpeptide,pro(probnp)_last",baseexcess(poc)_la
st,"baseexcess,venous(poc)_last",basos_last,basosabs_last,"benzodiazepinesscreen,urine,noconf._last",bilirubindirect_last,bilirubintotal_last,bun_last,bun/creatratio_last,calcium_last,calculatedco2(poc)_last,calculatedhco3(poc)i_last,calculatedo2saturation(poc)_last,chloride_last,cktotal_last,co2_last,"co2calculated,venous(poc)_last","co2,poc_last",creatinine_last,d-dimer_last,egfr_last,egfr(nonafricanamerican)_last,egfr(aframer)_last,eos_last,eosinoabs_last,epithelialcells_last,globulin_last,glucose_last,"glucose,meter_last","hco3calculated,venous(poc)_last",hematocrit_last,hemoglobin_last,immaturegrans(abs)_last,immaturegranulocytes_last,inr_last,"lactate,poc_last",lipase_last,lymphs_last,magnesium_last,mch_last,mchc_last,mcv_last,monocytes_last,monosabs_last,mpv_last,neutrophils_last,nrbc_last,nrbcabsolute_last,"o2satcalculated,venous(poc)_last",pco2(poc)_last,"pco2,venous(poc)_last","ph,venous(poc)_last","phencyclidine(pcp)screen,urine,noconf._last",phosphorus_last,platelets_last,po2(poc)_last,"po2,venous(poc)_last",pocbun_last,poccreatinine_last,pocglucose_last,pochematocrit_last,pocionizedcalcium_last,pocph_last,pocpotassium_last,pocsodium_last,poctroponini._last,potassium_last,proteintotal_last,prothrombintime_last,ptt_last,rbc_last,rbc/hpf_last,rdw_last,sodium_last,troponini(poc)_last,troponint_last,tsh_last,wbc_last,wbc/hpf_last,absolutelymphocytecount_min,acetonebld_min,alanineaminotransferase(alt)_min,albumin_min,alkphos_min,anc(absneutrophilcount)_min,aniongap_min,aspartateaminotransferase(ast)_min,"b-typenatriureticpeptide,pro(probnp)_min",baseexcess(poc)_min,"baseexcess,venous(poc)_min",basos_min,basosabs_min,"benzodiazepinesscreen,urine,noconf._min",bilirubindirect_min,bilirubintotal_min,bun_min,bun/creatratio_min,calcium_min,calculatedco2(poc)_min,calculatedhco3(poc)i_min,calculatedo2saturation(poc)_min,chloride_min,cktotal_min,co2_min,"co2calculated,venous(poc)_min","co2,poc_min",creatinine_min,d-dimer_min,egfr_min,egfr(nonafricanamerican)_min,egfr(
aframer)_min,eos_min,eosinoabs_min,epithelialcells_min,globulin_min,glucose_min,"glucose,meter_min","hco3calculated,venous(poc)_min",hematocrit_min,hemoglobin_min,immaturegrans(abs)_min,immaturegranulocytes_min,inr_min,"lactate,poc_min",lipase_min,lymphs_min,magnesium_min,mch_min,mchc_min,mcv_min,monocytes_min,monosabs_min,mpv_min,neutrophils_min,nrbc_min,nrbcabsolute_min,"o2satcalculated,venous(poc)_min",pco2(poc)_min,"pco2,venous(poc)_min","ph,venous(poc)_min","phencyclidine(pcp)screen,urine,noconf._min",phosphorus_min,platelets_min,po2(poc)_min,"po2,venous(poc)_min",pocbun_min,poccreatinine_min,pocglucose_min,pochematocrit_min,pocionizedcalcium_min,pocph_min,pocpotassium_min,pocsodium_min,poctroponini._min,potassium_min,proteintotal_min,prothrombintime_min,ptt_min,rbc_min,rbc/hpf_min,rdw_min,sodium_min,troponini(poc)_min,troponint_min,tsh_min,wbc_min,wbc/hpf_min,absolutelymphocytecount_max,acetonebld_max,alanineaminotransferase(alt)_max,albumin_max,alkphos_max,anc(absneutrophilcount)_max,aniongap_max,aspartateaminotransferase(ast)_max,"b-typenatriureticpeptide,pro(probnp)_max",baseexcess(poc)_max,"baseexcess,venous(poc)_max",basos_max,basosabs_max,"benzodiazepinesscreen,urine,noconf._max",bilirubindirect_max,bilirubintotal_max,bun_max,bun/creatratio_max,calcium_max,calculatedco2(poc)_max,calculatedhco3(poc)i_max,calculatedo2saturation(poc)_max,chloride_max,cktotal_max,co2_max,"co2calculated,venous(poc)_max","co2,poc_max",creatinine_max,d-dimer_max,egfr_max,egfr(nonafricanamerican)_max,egfr(aframer)_max,eos_max,eosinoabs_max,epithelialcells_max,globulin_max,glucose_max,"glucose,meter_max","hco3calculated,venous(poc)_max",hematocrit_max,hemoglobin_max,immaturegrans(abs)_max,immaturegranulocytes_max,inr_max,"lactate,poc_max",lipase_max,lymphs_max,magnesium_max,mch_max,mchc_max,mcv_max,monocytes_max,monosabs_max,mpv_max,neutrophils_max,nrbc_max,nrbcabsolute_max,"o2satcalculated,venous(poc)_max",pco2(poc)_max,"pco2,venous(poc)_max","ph,venous(poc)_max","phencyclidine(
pcp)screen,urine,noconf._max",phosphorus_max,platelets_max,po2(poc)_max,"po2,venous(poc)_max",pocbun_max,poccreatinine_max,pocglucose_max,pochematocrit_max,pocionizedcalcium_max,pocph_max,pocpotassium_max,pocsodium_max,poctroponini._max,potassium_max,proteintotal_max,prothrombintime_max,ptt_max,rbc_max,rbc/hpf_max,rdw_max,sodium_max,troponini(poc)_max,troponint_max,tsh_max,wbc_max,wbc/hpf_max,absolutelymphocytecount_median,acetonebld_median,alanineaminotransferase(alt)_median,albumin_median,alkphos_median,anc(absneutrophilcount)_median,aniongap_median,aspartateaminotransferase(ast)_median,"b-typenatriureticpeptide,pro(probnp)_median",baseexcess(poc)_median,"baseexcess,venous(poc)_median",basos_median,basosabs_median,"benzodiazepinesscreen,urine,noconf._median",bilirubindirect_median,bilirubintotal_median,bun_median,bun/creatratio_median,calcium_median,calculatedco2(poc)_median,calculatedhco3(poc)i_median,calculatedo2saturation(poc)_median,chloride_median,cktotal_median,co2_median,"co2calculated,venous(poc)_median","co2,poc_median",creatinine_median,d-dimer_median,egfr_median,egfr(nonafricanamerican)_median,egfr(aframer)_median,eos_median,eosinoabs_median,epithelialcells_median,globulin_median,glucose_median,"glucose,meter_median","hco3calculated,venous(poc)_median",hematocrit_median,hemoglobin_median,immaturegrans(abs)_median,immaturegranulocytes_median,inr_median,"lactate,poc_median",lipase_median,lymphs_median,magnesium_median,mch_median,mchc_median,mcv_median,monocytes_median,monosabs_median,mpv_median,neutrophils_median,nrbc_median,nrbcabsolute_median,"o2satcalculated,venous(poc)_median",pco2(poc)_median,"pco2,venous(poc)_median","ph,venous(poc)_median","phencyclidine(pcp)screen,urine,noconf._median",phosphorus_median,platelets_median,po2(poc)_median,"po2,venous(poc)_median",pocbun_median,poccreatinine_median,pocglucose_median,pochematocrit_median,pocionizedcalcium_median,pocph_median,pocpotassium_median,pocsodium_median,poctroponini._median,potassium_median,pro
teintotal_median,prothrombintime_median,ptt_median,rbc_median,rbc/hpf_median,rdw_median,sodium_median,troponini(poc)_median,troponint_median,tsh_median,wbc_median,wbc/hpf_median,bloodua_last,glucoseua_last,ketonesua_last,leukocytesua_last,nitriteua_last,pregtestur_last,proteinua_last,"bloodculture,routine_last","urineculture,routine_last",bloodua_npos,glucoseua_npos,ketonesua_npos,leukocytesua_npos,nitriteua_npos,pregtestur_npos,proteinua_npos,"bloodculture,routine_npos","urineculture,routine_npos",bloodua_count,glucoseua_count,ketonesua_count,leukocytesua_count,nitriteua_count,pregtestur_count,proteinua_count,"bloodculture,routine_count","urineculture,routine_count",triage_vital_hr,triage_vital_sbp,triage_vital_dbp,triage_vital_rr,triage_vital_o2,triage_vital_o2_device,triage_vital_temp,pulse_last,resp_last,spo2_last,temp_last,sbp_last,dbp_last,o2_device_last,pulse_min,resp_min,spo2_min,temp_min,sbp_min,dbp_min,o2_device_min,pulse_max,resp_max,spo2_max,temp_max,sbp_max,dbp_max,o2_device_max,pulse_median,resp_median,spo2_median,temp_median,sbp_median,dbp_median,o2_device_median,cxr_count,echo_count,ekg_count,headct_count,mri_count,otherct_count,otherimg_count,otherus_count,otherxr_count,meds_analgesicandantihistaminecombination,meds_analgesics,meds_anesthetics,meds_anti-obesitydrugs,meds_antiallergy,meds_antiarthritics,meds_antiasthmatics,meds_antibiotics,meds_anticoagulants,meds_antidotes,meds_antifungals,meds_antihistamineanddecongestantcombination,meds_antihistamines,meds_antihyperglycemics,meds_antiinfectives,meds_antiinfectives/miscellaneous,meds_antineoplastics,meds_antiparkinsondrugs,meds_antiplateletdrugs,meds_antivirals,meds_autonomicdrugs,meds_biologicals,meds_blood,meds_cardiacdrugs,meds_cardiovascular,meds_cnsdrugs,meds_colonystimulatingfactors,meds_contraceptives,meds_cough/coldpreparations,meds_diagnostic,meds_diuretics,meds_eentpreps,meds_elect/caloric/h2o,meds_gastrointestinal,meds_herbals,meds_hormones,meds_immunosuppressants,meds_investigational,"m
eds_miscellaneousmedicalsupplies,devices,non-drug",meds_musclerelaxants,meds_pre-natalvitamins,meds_psychotherapeuticdrugs,meds_sedative/hypnotics,meds_skinpreps,meds_smokingdeterrents,meds_thyroidpreps,meds_unclassifieddrugproducts,meds_vitamins,n_surgeries,cc_abdominalcramping,cc_abdominaldistention,cc_abdominalpain,cc_abdominalpainpregnant,cc_abnormallab,cc_abscess,cc_addictionproblem,cc_agitation,cc_alcoholintoxication,cc_alcoholproblem,cc_allergicreaction,cc_alteredmentalstatus,cc_animalbite,cc_ankleinjury,cc_anklepain,cc_anxiety,cc_arminjury,cc_armpain,cc_armswelling,cc_assaultvictim,cc_asthma,cc_backpain,cc_bleeding/bruising,cc_blurredvision,cc_bodyfluidexposure,cc_breastpain,cc_breathingdifficulty,cc_breathingproblem,cc_burn,cc_cardiacarrest,cc_cellulitis,cc_chestpain,cc_chesttightness,cc_chills,cc_coldlikesymptoms,cc_confusion,cc_conjunctivitis,cc_constipation,cc_cough,cc_cyst,cc_decreasedbloodsugar-symptomatic,cc_dehydration,cc_dentalpain,cc_depression,cc_detoxevaluation,cc_diarrhea,cc_dizziness,cc_drug/alcoholassessment,cc_drugproblem,cc_dyspnea,cc_dysuria,cc_earpain,cc_earproblem,cc_edema,cc_elbowpain,cc_elevatedbloodsugar-nosymptoms,cc_elevatedbloodsugar-symptomatic,cc_emesis,cc_epigastricpain,cc_epistaxis,cc_exposuretostd,cc_extremitylaceration,cc_extremityweakness,cc_eyeinjury,cc_eyepain,cc_eyeproblem,cc_eyeredness,cc_facialinjury,cc_faciallaceration,cc_facialpain,cc_facialswelling,cc_fall,cc_fall>65,cc_fatigue,cc_femaleguproblem,cc_fever,cc_fever-75yearsorolder,cc_fever-9weeksto74years,cc_feverimmunocompromised,cc_fingerinjury,cc_fingerpain,cc_fingerswelling,cc_flankpain,cc_follow-upcellulitis,cc_footinjury,cc_footpain,cc_footswelling,cc_foreignbodyineye,cc_fulltrauma,cc_generalizedbodyaches,cc_gibleeding,cc_giproblem,cc_groinpain,cc_hallucinations,cc_handinjury,cc_handpain,cc_headache,cc_headache-newonsetornewsymptoms,cc_headache-recurrentorknowndxmigraines,cc_headachere-evaluation,cc_headinjury,cc_headlaceration,cc_hematuria,cc_hemoptysis,cc_hippai
n,cc_homicidal,cc_hyperglycemia,cc_hypertension,cc_hypotension,cc_influenza,cc_ingestion,cc_insectbite,cc_irregularheartbeat,cc_jawpain,cc_jointswelling,cc_kneeinjury,cc_kneepain,cc_laceration,cc_leginjury,cc_legpain,cc_legswelling,cc_lethargy,cc_lossofconsciousness,cc_maleguproblem,cc_mass,cc_medicalproblem,cc_medicalscreening,cc_medicationproblem,cc_medicationrefill,cc_migraine,cc_modifiedtrauma,cc_motorcyclecrash,cc_motorvehiclecrash,cc_multiplefalls,cc_nasalcongestion,cc_nausea,cc_nearsyncope,cc_neckpain,cc_neurologicproblem,cc_numbness,cc_oralswelling,cc_otalgia,cc_other,cc_overdose-accidental,cc_overdose-intentional,cc_pain,cc_palpitations,cc_panicattack,cc_pelvicpain,cc_poisoning,cc_post-opproblem,cc_psychiatricevaluation,cc_psychoticsymptoms,cc_rapidheartrate,cc_rash,cc_rectalbleeding,cc_rectalpain,cc_respiratorydistress,cc_ribinjury,cc_ribpain,cc_seizure-newonset,cc_seizure-priorhxof,cc_seizures,cc_shortnessofbreath,cc_shoulderinjury,cc_shoulderpain,cc_sicklecellpain,cc_sinusproblem,cc_skinirritation,cc_skinproblem,cc_sorethroat,cc_stdcheck,cc_strokealert,cc_suicidal,cc_suture/stapleremoval,cc_swallowedforeignbody,cc_syncope,cc_tachycardia,cc_testiclepain,cc_thumbinjury,cc_tickremoval,cc_toeinjury,cc_toepain,cc_trauma,cc_unresponsive,cc_uri,cc_urinaryfrequency,cc_urinaryretention,cc_urinarytractinfection,cc_vaginalbleeding,cc_vaginaldischarge,cc_vaginalpain,cc_weakness,cc_wheezing,cc_withdrawal-alcohol,cc_woundcheck,cc_woundinfection,cc_woundre-evaluation,cc_wristinjury,cc_wristpain
5,A,3,86,Female,Hispanic or Latino,Other,Other,Pentecostal,Widowed,Retired,Medicare,Discharge,Walk-in,April,Monday,15-18,Admit,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,76.0,143.0,87.0,18.0,98.0,0.0,98.5,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.
0,0.0,10.0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
6,A,3,86,Female,Hispanic or Latino,Other,Other,Pentecostal,Widowed,Retired,Medicare,Admit,Car,September,Wednesday,11-14,Discharge,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1,0,2.3,,,,,6.8,15.0,,,,,0.0,,,,,12.0,,9.6,,,,96.0,,28.7,,,0.7,,,,,1.0,,,,93.0,,,37.6,12.5,,,,,,23.0,2.0,31.8,33.3,96.0,8.0,,8.2,68.0,,,,,,,,3.4,326.0,,,,,,,,,,,,3.9,,,,3.9,,13.2,140.0,,,,9.9,,2.3,,,,,6.8,15.0,,,,,0.0,,,,,12.0,,9.6,,,,96.0,,28.7,,,0.7,,,,,1.0,,,,93.0,,,37.6,12.5,,,,,,23.0,2.0,31.8,33.3,96.0,8.0,,8.2,68.0,,,,,,,,3.4,326.0,,,,,,,,,,,,3.9,,,,3.9,,13.2,140.0,,,,9.9,,2.3,,,,,6.8,15.0,,,,,0.0,,,,,12.0,,9.6,,,,96.0,,28.7,,,0.7,,,,,1.0,,,,93.0,,,37.6,12.5,,,,,,23.0,2.0,31.8,33.3,96.0,8.0,,8.2,68.0,,,,,,,,3.4,326.0,,,,,,,,,,,,3.9,,,,3.9,,13.2,140.0,,,,9.9,,2.3,,,,,6.8,15.0,,,,,0.0,,,,,12.0,,9.6,,,,96.0,,28.7,,,0.7,,,,,1.0,,,,93.0,,,37.6,12.5,,,,,,23.0,2.0,31.8,33.3,96.0,8.0,,8.2,68.0,,,,,,,,3.4,326.0,,,,,,,,,,,,3.9,,,,3.9,,13.2,
140.0,,,,9.9,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,0.0,1.0,0.0,0.0,77.0,154.0,69.0,16.0,,0.0,98.0,66.0,20.0,95.0,98.4,129.0,69.0,0.0,66.0,18.0,95.0,98.4,129.0,69.0,0.0,76.0,20.0,98.0,98.5,143.0,87.0,0.0,71.0,19.0,96.5,98.45,136.0,78.0,0.0,1.0,0.0,1.0,1.0,0.0,2.0,0.0,0.0,2.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,10.0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
7,A,4,87,Female,Hispanic or Latino,Other,Other,Pentecostal,Widowed,Retired,Medicare,Discharge,Car,March,Saturday,11-14,Admit,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2,1,1.9,,14.0,,57.0,5.2,12.0,21.0,556.9,,,0.4,0.0,,0.1,0.3,11.0,16.2,9.4,,,,99.0,,28.0,,,0.68,,,,,0.4,0.0,,,87.0,,,33.9,11.2,0.0,0.3,,,33.0,24.7,,30.9,33.0,93.6,8.3,0.7,11.5,65.9,0.0,0.0,,,,,,,266.0,,,,,,,,,,,,4.6,,,,3.6,,14.3,139.0,0.18,0.0,,7.8,,1.9,,14.0,,57.0,5.2,12.0,21.0,556.9,,,0.0,0.0,,0.1,0.3,11.0,16.2,9.4,,,,96.0,,28.0,,,0.68,,,,,0.4,0.0,,,87.0,,,33.9,11.2,0.0,0.3,,,33.0,23.0,2.0,30.9,33.0,93.6,8.0,0.7,8.2,65.9,0.0,0.0,,,,,,3.4,266.0,,,,,,,,,,,,3.9,,,,3.6,,13.2,139.0,0.18,0.0,,7.8,,2.3,,14.0,,57.0,6.8,15.0,21.0,556.9,,,0.4,0.0,,0.1,0.3,12.0,16.2,9.6,,,,99.0,,28.7,,,0.7,,,,,1.0,0.0,,,93.0,,,37.6,12.5,0.0,0.3,,,33.0,24.7,2.0,31.8,33.3,96.0,8.3,0.7,11.5,68.0,0.0,0.0,,,,,,3.4,326.0,,,,,,,,,,,,4.6,,,,3.9,,14.3,140.0,0.18,0.0,,9.9,,2.1,,
14.0,,57.0,6.0,13.5,21.0,556.9,,,0.2,0.0,,0.1,0.3,11.5,16.2,9.5,,,,97.5,,28.35,,,0.69,,,,,0.7,0.0,,,90.0,,,35.75,11.85,0.0,0.3,,,33.0,23.85,2.0,31.35,33.15,94.8,8.15,0.7,9.85,66.95,0.0,0.0,,,,,,3.4,296.0,,,,,,,,,,,,4.25,,,,3.75,,13.75,139.5,0.18,0.0,,8.85,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,0.0,1.0,0.0,0.0,88.0,155.0,75.0,17.0,98.0,0.0,97.8,68.0,13.0,98.0,,,,,66.0,13.0,95.0,97.5,117.0,51.0,0.0,79.0,20.0,98.0,98.5,154.0,87.0,0.0,76.0,17.0,98.0,98.0,131.5,62.5,0.0,2.0,1.0,3.0,1.0,0.0,2.0,0.0,0.0,2.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,11.0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
13,B,3,53,Male,Hispanic or Latino,Other,English,Catholic,Significant Other,Disabled,Medicare,Admit,ambulance,March,Wednesday,23-02,No previous dispo,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
1.0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
14,B,3,53,Male,Hispanic or Latino,Other,English,Catholic,Significant Other,Disabled,Medicare,Admit,ambulance,May,Sunday,19-22,Admit,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1,1,1.1,,9.0,,92.0,1.4,,9.0,,,,1.0,,,0.4,0.9,,,,,,,,,,,27.0,,,,,,0.0,,,,,,,39.5,13.0,,,,,,43.0,,27.3,33.0,83.0,4.0,,8.6,53.0,,,,,,,,,189.0,,,16.0,1.0,154.0,41.0,4.4,,3.8,139.0,,,,,,4.8,,15.2,,,,,2.7,,1.1,,9.0,,92.0,1.4,,9.0,,,,1.0,,,0.4,0.9,,,,,,,,,,,27.0,,,,,,0.0,,,,,,,39.5,13.0,,,,,,43.0,,27.3,33.0,83.0,4.0,,8.6,53.0,,,,,,,,,189.0,,,16.0,1.0,154.0,41.0,4.4,,3.8,139.0,,,,,,4.8,,15.2,,,,,2.7,,1.1,,9.0,,92.0,1.4,,9.0,,,,1.0,,,0.4,0.9,,,,,,,,,,,27.0,,,,,,0.0,,,,,,,39.5,13.0,,,,,,43.0,,27.3,33.0,83.0,4.0,,8.6,53.0,,,,,,,,,189.0,,,16.0,1.0,154.0,41.0,4.4,,3.8,139.0,,,,,,4.8,,15.2,,,,,2.7,,1.1,,9.0,,92.0,1.4,,9.0,,,,1.0,,,0.4,0.9,,,,,,,,,,,27.0,,,,,,0.0,,,,,,,39.5,13.0,,,,,,43.0,,27.3,33.0,83.0,4.0,,8.6,53.0,,,,,,,,,189.0,,,16.0,1.0,154.0,
41.0,4.4,,3.8,139.0,,,,,,4.8,,15.2,,,,,2.7,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,104.0,18.0,96.0,98.7,137.0,83.0,0.0,104.0,18.0,96.0,98.0,108.0,79.0,0.0,115.0,20.0,98.0,98.7,137.0,86.0,0.0,105.0,20.0,97.0,98.0,124.0,83.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2.0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


Let's now look at the size of this data set. 

In [9]:
health_data.shape

(108567, 972)

We have about 110k rows left, which is still a sizeable data set to use. Let's quickly take a look at the nulls that exist so that we can better inform our column-dropping strategy.

In [10]:
health_data.isna().sum()/health_data.shape[0]*100

dep_name                                                0.000000
esi                                                     0.369357
age                                                     0.000000
gender                                                  0.000000
ethnicity                                               0.000000
race                                                    0.005527
lang                                                    0.000000
religion                                                0.000000
maritalstatus                                           0.000000
employstatus                                            0.000000
insurance_status                                        0.000000
disposition                                             0.000000
arrivalmode                                             2.013503
arrivalmonth                                            0.000000
arrivalday                                              0.000000
arrivalhour_bin          

As we can see, nulls exist throughout the data. In particular, a large percentage of values is missing in the `Triage Vitals` super category and in the `Historical Labs`. Of the triage variables, I am only going to keep the ESI and the arrival month.

As for the `Historical Labs`, I believe that it's necessary to keep the median glucose reading as this will be our gauge. I will impute the missing values based on the average reading for patients in the same age group who also have diabetes. I will also include a column to indicate whether or not these values were imputed.
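The imputation plan described above could look roughly like the sketch below. It is only an illustration on made-up numbers: the age bins, the `age_group` helper column, and the `hist_glucose_median_imputed` flag name are assumptions, and the column names borrow the prefixed scheme introduced later in this notebook.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real data (values are invented for illustration)
toy = pd.DataFrame({
    "demo_age": [25, 28, 47, 52, 71, 78],
    "hist_glucose_median": [110.0, np.nan, 150.0, np.nan, 180.0, 200.0],
})

# Flag the rows that will be imputed BEFORE filling them in
toy["hist_glucose_median_imputed"] = toy["hist_glucose_median"].isna().astype(int)

# Hypothetical age groups; the real bin edges are a modelling choice
toy["age_group"] = pd.cut(toy["demo_age"], bins=[17, 39, 64, 120],
                          labels=["18-39", "40-64", "65+"])

# Mean glucose reading per age group, broadcast back onto every row
group_mean = (toy.groupby("age_group", observed=True)["hist_glucose_median"]
                 .transform("mean"))

# Fill only the missing readings with their group's mean
toy["hist_glucose_median"] = toy["hist_glucose_median"].fillna(group_mean)
print(toy)
```

Creating the indicator column first matters: once the gaps are filled there is no way to tell observed readings from imputed ones.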

----
As indicated earlier, these columns belong to super categories. Some of them are labelled accordingly and some aren't. For instance, `chief complaint` columns are preceded by cc_. I will do the same for the `Demographic` (demo_), `Triage Variables` (the categorical ones, with triage_cat_), `Hospital Usage` (huse_), and `Past Medical History` (pmh_) columns. This way it will be easier to group them down the road.

Let's start with altering the `Demographic` columns.

In [11]:
demo_col = ['age','gender','ethnicity','race','lang','religion','maritalstatus','employstatus','insurance_status']

In [12]:
#Create a dictionary to map the new column names
column_name_mapping = {col: 'demo_' + col for col in demo_col}

In [13]:
# Rename the columns in the DataFrame
health_data = health_data.rename(columns=column_name_mapping)

Now, let's alter the `Triage Variables` that are categorical in nature by adding triage_cat_ to the beginning.

In [14]:
tri_cat = ['dep_name','arrivalmode', 'arrivalmonth', 'arrivalday', 'arrivalhour_bin', 'esi']

In [15]:
#Create a dictionary to map the new column names
column_name_mapping = {col: 'triage_cat_' + col for col in tri_cat}

In [16]:
# Rename the columns in the DataFrame
health_data = health_data.rename(columns=column_name_mapping)

Now, let's alter the `Hospital Usage` columns by adding huse_ to the beginning.

In [17]:
hos_use = ['previousdispo', 'n_edvisits', 'n_admissions', 'n_surgeries']

In [18]:
#Create a dictionary to map the new column names
column_name_mapping = {col: 'huse_' + col for col in hos_use}

In [19]:
# Rename the columns in the DataFrame
health_data = health_data.rename(columns=column_name_mapping)

Now, let's alter the `Past Medical History` columns by adding pmh_ to the beginning.

In [20]:
pmh_col = [
    '2ndarymalig', 'abdomhernia', 'abdomnlpain', 'abortcompl', 'acqfootdef',
    'acrenlfail', 'acutecvd', 'acutemi', 'acutphanm', 'adjustmentdisorders',
    'adltrespfl', 'alcoholrelateddisorders', 'allergy', 'amniosdx',
    'analrectal', 'anemia', 'aneurysm', 'anxietydisorders', 'appendicitis',
    'artembolism', 'asppneumon', 'asthma', 'attentiondeficitconductdisruptivebeha',
    'backproblem', 'biliarydx', 'birthasphyx', 'birthtrauma', 'bladdercncr',
    'blindness', 'bnignutneo', 'bonectcncr', 'bph', 'brainnscan',
    'breastcancr', 'breastdx', 'brnchlngca', 'bronchitis', 'burns',
    'cardiaarrst', 'cardiacanom', 'carditis', 'cataract', 'cervixcancr',
    'chestpain', 'chfnonhp', 'chrkidneydisease', 'coaghemrdx', 'coloncancer',
    'comabrndmg', 'complicdevi', 'complicproc', 'conduction', 'contraceptiv',
    'copd', 'coronathero', 'crushinjury', 'cysticfibro', 'deliriumdementiaamnesticothercognitiv',
    'developmentaldisorders', 'diabmelnoc', 'diabmelwcm', 'disordersusuallydiagnosedininfancych',
    'diverticulos', 'dizziness', 'dminpreg', 'dysrhythmia', 'earlylabor',
    'ecodesadverseeffectsofmedicalcare', 'ecodesadverseeffectsofmedicaldrugs', 'ecodescutpierce',
    'ecodesdrowningsubmersion', 'ecodesfall', 'ecodesfirearm', 'ecodesfireburn', 'ecodesmachinery',
    'ecodesmotorvehicletrafficmvt', 'ecodesnaturalenvironment', 'ecodesotherspecifiedandclassifiable',
    'ecodesotherspecifiednec', 'ecodespedalcyclistnotmvt', 'ecodesplaceofoccurrence', 'ecodespoisoning',
    'ecodesstruckbyagainst', 'ecodessuffocation', 'ecodestransportnotmvt', 'ecodesunspecified', 'ectopicpreg',
    'encephalitis', 'endometrios', 'epilepsycnv', 'esophcancer', 'esophgealdx',
    'exameval', 'eyeinfectn', 'fatigue', 'femgenitca', 'feminfertil', 'fetaldistrs',
    'fluidelcdx', 'fuo', 'fxarm', 'fxhip', 'fxleg', 'fxskullfac', 'gangrene',
    'gasduoulcer', 'gastritis', 'gastroent', 'giconganom', 'gihemorrhag', 'giperitcan',
    'glaucoma', 'goutotcrys', 'guconganom', 'hdnckcancr', 'headachemig', 'hemmorhoids',
    'hemorrpreg', 'hepatitis', 'hivinfectn', 'hodgkinsds', 'hrtvalvedx', 'htn',
    'htncomplicn', 'htninpreg', 'hyperlipidem', 'immunitydx', 'immunizscrn',
    'impulsecontroldisordersnec', 'inducabortn', 'infectarth', 'influenza', 'infmalegen',
    'intestinfct', 'intobstruct', 'intracrninj', 'jointinjury', 'kidnyrnlca',
    'lateeffcvd', 'leukemias', 'liveborn', 'liveribdca', 'longpregncy', 'lowbirthwt',
    'lungexternl', 'lymphenlarg', 'maintchemr', 'malgenitca', 'maligneopls', 'malposition',
    'meningitis', 'menopausldx', 'menstrualdx', 'miscellaneousmentalhealthdisorders',
    'mooddisorders', 'mouthdx', 'ms', 'multmyeloma', 'mycoses', 'nauseavomit',
    'neoplsmunsp', 'nephritis', 'nervcongan', 'nonepithca', 'nonhodglym', 'nutritdefic',
    'obrelatedperintrauma', 'opnwndextr', 'opnwndhead', 'osteoarthros', 'osteoporosis',
    'otacqdefor', 'otaftercare', 'otbnignneo', 'otbonedx', 'otcirculdx', 'otcomplbir',
    'otconganom', 'otconntiss', 'otdxbladdr', 'otdxkidney', 'otdxstomch', 'otendodsor',
    'otfemalgen', 'othbactinf', 'othcnsinfx', 'othematldx', 'othercvd', 'othereardx',
    'otherecns', 'othereyedx', 'othergidx', 'othergudx', 'otherinjury', 'otherpregnancyanddeliveryincludingnormal',
    'otherscreen', 'othfracture', 'othheartdx', 'othinfectns', 'othliverdx', 'othlowresp',
    'othmalegen', 'othnervdx', 'othskindx', 'othveindx', 'otinflskin', 'otitismedia',
    'otjointdx', 'otnutritdx', 'otperintdx', 'otpregcomp', 'otprimryca', 'otrespirca',
    'otupprresp', 'otuprspin', 'ovariancyst', 'ovarycancer', 'pancreascan', 'pancreasdx',
    'paralysis', 'parkinsons', 'pathologfx', 'pelvicobstr', 'perintjaund', 'peripathero',
    'peritonitis', 'personalitydisorders', 'phlebitis', 'pid', 'pleurisy', 'pneumonia',
    'poisnnonmed', 'poisnotmed', 'poisonpsych', 'precereoccl', 'prevcsectn', 'prolapse',
    'prostatecan', 'pulmhartdx', 'rctmanusca', 'rehab', 'respdistres', 'retinaldx', 'rheumarth',
    'schizophreniaandotherpsychoticdisorde', 'screeningandhistoryofmentalhealthan',
    'septicemia', 'septicemiaexceptinlabor', 'sexualinfxs', 'shock', 'sicklecell',
    'skininfectn', 'skinmelanom', 'sle', 'socialadmin', 'spincorinj', 'spontabortn',
    'sprain', 'stomchcancr', 'substancerelateddisorders', 'suicideandintentionalselfinflictedin',
    'superficinj', 'syncope', 'teethdx', 'testiscancr', 'thyroidcncr', 'thyroiddsor',
    'tia', 'tonsillitis', 'tuberculosis', 'ulceratcol', 'ulcerskin', 'umbilcord',
    'unclassified', 'urinstone', 'urinyorgca', 'uteruscancr', 'uti', 'varicosevn',
    'viralinfect', 'whtblooddx', 'otheredcns'
]


In [21]:
#Create a dictionary to map the new column names
column_name_mapping = {col: 'pmh_' + col for col in pmh_col}

In [22]:
# Rename the columns in the DataFrame
health_data = health_data.rename(columns=column_name_mapping)
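The four renaming steps above repeat the same pattern, so they could be folded into one small helper. The sketch below is a possible refactor, not the code that was run here; `add_prefix_to_columns` and the toy frame are hypothetical. One reason to consider it: `DataFrame.rename` silently ignores names that are not present, so a typo in one of the long lists would go unnoticed, whereas the helper raises.

```python
import pandas as pd

def add_prefix_to_columns(df, columns, prefix):
    """Return a copy of df with `prefix` prepended to each listed column name."""
    # rename() skips unknown names without complaint, so check explicitly
    missing = [col for col in columns if col not in df.columns]
    if missing:
        raise KeyError(f"Columns not found: {missing}")
    return df.rename(columns={col: prefix + col for col in columns})

# Toy frame with one or two columns per super category
toy = pd.DataFrame({"age": [86], "gender": ["Female"], "esi": [3], "n_edvisits": [0]})

groups = {
    "demo_": ["age", "gender"],
    "triage_cat_": ["esi"],
    "huse_": ["n_edvisits"],
}
for prefix, cols in groups.items():
    toy = add_prefix_to_columns(toy, cols, prefix)

print(toy.columns.tolist())
# ['demo_age', 'demo_gender', 'triage_cat_esi', 'huse_n_edvisits']
```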

Though I eventually plan to drop many columns from the `Historical Labs (Numerical)` and `Historical Labs (Categorical)` super categories, I still plan to group them together. This is because I need to isolate the columns of interest for glucose labs down the road and only want to make alterations to this data rather than the entire data frame. By using '_count', I will also be capturing the `imaging` super category in this union. This is fine as I intend to drop that super category due to lack of domain expertise.

In [23]:
#Create a filter to isolate the columns of interest
hist_labs = health_data.filter(like='_max').columns.union(
    health_data.filter(like='_last').columns).union(
    health_data.filter(like='_min').columns).union(
    health_data.filter(like='_npos').columns).union(
    health_data.filter(like='_count').columns).union(
    health_data.filter(like='_median').columns)
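One thing to keep in mind with `filter(like=...)` is that it matches the substring anywhere in the column name, not only at the end. The sketch below contrasts it with an anchored regular expression on a toy set of column names. The name `cc_maxillofacialpain` is made up purely to show the effect; whether the real data contains a name like it is not checked here.

```python
import pandas as pd

# Empty toy frame: only the column names matter for this illustration
toy = pd.DataFrame(columns=["glucose_median", "glucose_max", "cxr_count",
                            "n_edvisits", "cc_maxillofacialpain"])

# `like` matches '_max' anywhere in the name
by_substring = toy.filter(like="_max").columns.tolist()
print(by_substring)   # ['glucose_max', 'cc_maxillofacialpain']

# An anchored regex only matches names that END with one of the suffixes
suffix_pattern = r"_(max|last|min|npos|count|median)$"
by_suffix = toy.filter(regex=suffix_pattern).columns.tolist()
print(by_suffix)      # ['glucose_median', 'glucose_max', 'cxr_count']
```

The regex version also replaces the chain of `union` calls with a single filter.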

In [24]:
#Create a dictionary to map the new column names
column_name_mapping = {col: 'hist_' + col for col in hist_labs}

Let's check to make sure all the right alterations have been made. 

In [25]:
# Rename the columns in the DataFrame
health_data = health_data.rename(columns=column_name_mapping)

In [26]:
health_data.columns.tolist()

['triage_cat_dep_name',
 'triage_cat_esi',
 'demo_age',
 'demo_gender',
 'demo_ethnicity',
 'demo_race',
 'demo_lang',
 'demo_religion',
 'demo_maritalstatus',
 'demo_employstatus',
 'demo_insurance_status',
 'disposition',
 'triage_cat_arrivalmode',
 'triage_cat_arrivalmonth',
 'triage_cat_arrivalday',
 'triage_cat_arrivalhour_bin',
 'huse_previousdispo',
 'pmh_2ndarymalig',
 'pmh_abdomhernia',
 'pmh_abdomnlpain',
 'pmh_abortcompl',
 'pmh_acqfootdef',
 'pmh_acrenlfail',
 'pmh_acutecvd',
 'pmh_acutemi',
 'pmh_acutphanm',
 'pmh_adjustmentdisorders',
 'pmh_adltrespfl',
 'pmh_alcoholrelateddisorders',
 'pmh_allergy',
 'pmh_amniosdx',
 'pmh_analrectal',
 'pmh_anemia',
 'pmh_aneurysm',
 'pmh_anxietydisorders',
 'pmh_appendicitis',
 'pmh_artembolism',
 'pmh_asppneumon',
 'pmh_asthma',
 'pmh_attentiondeficitconductdisruptivebeha',
 'pmh_backproblem',
 'pmh_biliarydx',
 'pmh_birthasphyx',
 'pmh_birthtrauma',
 'pmh_bladdercncr',
 'pmh_blindness',
 'pmh_bnignutneo',
 'pmh_bonectcncr',
 'pmh_bph'

---
I want to keep as much information as I can from the original data set, as the diabetic patients can have other prior medical history and chief complaints as well. I feel that these are important to keep intact as some conditions could be comorbid with diabetes. I will search through the `chief complaint`, `prior medical history`, and `medication` columns, look for any columns that have 0% occurrence, and drop those.

In [27]:
# Identify the columns in the chief complaint, prior medical history, and medication super categories
chief_complaint = health_data.filter(like='cc_').columns
prior_medical_history =  health_data.filter(like='pmh_').columns
medication = health_data.filter(like='meds_').columns

In [28]:
# Combine the chief complaint and prior medical history columns
# (the medication columns are filtered separately further down)
all_columns = chief_complaint.union(prior_medical_history)

# Calculate the percentage of occurrence for each column
occurrence_percentages = (health_data[all_columns].sum() / len(health_data)) * 100

# Filter columns with 0% occurrence
columns_with_zero_occurrence = occurrence_percentages[occurrence_percentages == 0].index

# Display the columns with 0% occurrence
print("Columns with 0% occurrence:")
print(columns_with_zero_occurrence)

Columns with 0% occurrence:
Index(['pmh_abortcompl', 'pmh_birthasphyx', 'pmh_ecodesdrowningsubmersion',
       'pmh_ecodesfireburn', 'pmh_ecodesmachinery',
       'pmh_ecodesnaturalenvironment', 'pmh_ecodespedalcyclistnotmvt',
       'pmh_ecodesplaceofoccurrence', 'pmh_ecodespoisoning',
       'pmh_ecodessuffocation', 'pmh_longpregncy', 'pmh_malposition',
       'pmh_perintjaund', 'pmh_septicemiaexceptinlabor'],
      dtype='object')


As we can see, the above columns have 0% occurrence, so they will be removed. I am also curious which column occurs most often. The occurrence should be very high for the columns that indicate diabetes.

In [29]:
# Find the column with the maximum occurrence
max_occurrence_column = occurrence_percentages.idxmax()

# Display the column with the maximum occurrence
print("Column with the Maximum Occurrence:")
print(max_occurrence_column)

Column with the Maximum Occurrence:
pmh_diabmelnoc


In [30]:
#Isolate the columns with 0% occurrences. 
col_to_drop = ['pmh_abortcompl', 'pmh_birthasphyx', 'pmh_ecodesdrowningsubmersion',
       'pmh_ecodesfireburn', 'pmh_ecodesmachinery',
       'pmh_ecodesnaturalenvironment', 'pmh_ecodespedalcyclistnotmvt',
       'pmh_ecodesplaceofoccurrence', 'pmh_ecodespoisoning',
       'pmh_ecodessuffocation', 'pmh_longpregncy', 'pmh_malposition',
       'pmh_perintjaund', 'pmh_septicemiaexceptinlabor']

In [31]:
# Drop the columns 
health_data = health_data.drop(columns=col_to_drop)
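The list above was copied by hand from the printed output. As a sketch of an alternative, the columns could be dropped straight from the computed index, which avoids transcription slips. The toy frame below is invented, and including the `meds_` prefix in the pattern is an assumption about how the medication columns might be folded into the same check.

```python
import pandas as pd

# Toy indicator data (invented values)
toy = pd.DataFrame({
    "cc_chestpain": [1, 0, 0],
    "pmh_abortcompl": [0, 0, 0],
    "pmh_diabmelnoc": [1, 1, 0],
    "meds_herbals": [0, 0, 0],
    "demo_age": [86, 53, 33],
})

# Anchor on the prefix so only the intended super categories are selected
indicator_cols = toy.filter(regex=r"^(cc_|pmh_|meds_)").columns

# Percentage of rows where each indicator is greater than zero
occurrence = toy[indicator_cols].gt(0).mean() * 100

# Drop every indicator column that never occurs
zero_cols = occurrence[occurrence == 0].index
toy = toy.drop(columns=zero_cols)

print(toy.columns.tolist())
# ['cc_chestpain', 'pmh_diabmelnoc', 'demo_age']
```

Using `.gt(0).mean()` gives the share of rows with an occurrence even if a column stores counts rather than 0/1 flags.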

----
The columns `ethnicity` and `race` can be combined into the `race` column. We will do this before dropping the `ethnicity` column.

I noticed that if a patient identifies as 'Hispanic or Latino' in the `ethnicity` column, they are labelled as 'Other' in the `race` column. I will write a lambda function that sets the `race` column to 'Hispanic or Latino' whenever the `ethnicity` column reads 'Hispanic or Latino', and leaves it unchanged otherwise.


In [32]:
health_data['demo_race'] = health_data.apply(lambda row: 'Hispanic or Latino' if row['demo_ethnicity'] == 'Hispanic or Latino' else row['demo_race'], axis=1)
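Row-wise `apply` works, but on roughly 110k rows a boolean mask is usually faster and reads more directly. The sketch below shows the same replacement on a toy frame; the 'Non-Hispanic' label is an assumed placeholder for the other ethnicity values. It also assumes `demo_race` is stored as plain strings; if it were a categorical column, the new label would have to be added as a category first.

```python
import pandas as pd

# Toy frame (invented rows)
toy = pd.DataFrame({
    "demo_ethnicity": ["Hispanic or Latino", "Non-Hispanic", "Hispanic or Latino"],
    "demo_race": ["Other", "Asian", "White or Caucasian"],
})

# Boolean mask of the rows to overwrite
is_hispanic = toy["demo_ethnicity"] == "Hispanic or Latino"

# Overwrite race for those rows only; all other rows are left untouched
toy.loc[is_hispanic, "demo_race"] = "Hispanic or Latino"

print(toy["demo_race"].tolist())
# ['Hispanic or Latino', 'Asian', 'Hispanic or Latino']
```

Like the lambda, this overwrites whatever race was recorded for those rows, not only the 'Other' label.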

Let's make sure that all of the data from the `ethnicity` column has been ported into the `race` column. I'll do this by running value_counts on the race column.

In [33]:
health_data['demo_race'].value_counts()

demo_race
White or Caucasian                           51384
Black or African American                    34501
Hispanic or Latino                           19526
Other                                         1822
Asian                                          662
Patient Refused                                327
Unknown                                        176
American Indian or Alaska Native               104
Native Hawaiian or Other Pacific Islander       59
Name: count, dtype: int64

The two columns have been successfully merged.  Now, let's drop the redundant `ethnicity` column. 

In [34]:
#Drop the column
health_data = health_data.drop(columns='demo_ethnicity')

----
The only demographic data that I am interested in is the following:
- Age, gender, race, employment status, and insurance status

The columns that aren't included are demo_maritalstatus, demo_lang, and demo_religion, so I will drop these columns.

In [35]:
col_to_drop = ['demo_maritalstatus','demo_lang','demo_religion']

In [36]:
#Drop the column
health_data = health_data.drop(columns=col_to_drop)

---
The columns `triage_vital_o2_device` and `triage_vital_o2` will be dropped. The former indicates whether or not the ED had a supplementary oxygen device and the latter indicates oxygen saturation. I am dropping these columns as they have a high rate of nulls (40% - 48%) and their values cannot be easily imputed.

The columns `arrivalhour_bin` and `arrivalday` will also be dropped: we do not need the arrival time for our analysis, and the arrival day only indicates the day of the week and not the actual date. We can keep the analysis high level with `arrivalmonth` if needed.

In addition, we will remove `triage_cat_arrivalmode`, `triage_cat_dep_name`, and `triage_cat_esi`. Arrival mode and department name aren't relevant to the modelling that I intend to do later down the road. I am removing ESI as it would be a possible form of leakage, since it indicates the severity index of the patient upon entry to triage. This could lead to skewed results or leakage between my train/test sets in my model down the road, so I will be leaving it out.

In [37]:
# Define the columns to be dropped
col_drop = [
    'triage_vital_o2_device', 'triage_vital_o2',
    'triage_cat_arrivalhour_bin', 'triage_cat_arrivalday', 'triage_cat_arrivalmode',
    'triage_cat_dep_name', 'triage_cat_esi'
]
# Drop the columns 
health_data = health_data.drop(columns=col_drop)
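The null rates quoted above can be checked programmatically before deciding what to drop. Below is a minimal sketch on invented data; the 40% threshold and the toy values are assumptions, and `triage_vital_hr` is included only as a low-null contrast.

```python
import numpy as np
import pandas as pd

# Toy frame (invented values)
toy = pd.DataFrame({
    "triage_vital_o2": [98.0, np.nan, np.nan, 97.0, np.nan],
    "triage_vital_hr": [76.0, 88.0, np.nan, 104.0, 71.0],
    "demo_age": [86, 87, 53, 53, 33],
})

# Percentage of nulls per column (same idea as isna().sum() / shape[0] * 100)
null_pct = toy.isna().mean() * 100

# Hypothetical cut-off for "too many nulls to impute"
threshold = 40
high_null_cols = null_pct[null_pct >= threshold].index.tolist()

print(null_pct.round(1).to_dict())
print(high_null_cols)   # ['triage_vital_o2']
```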

---
In addition, I will drop the `Historical Labs (Numerical)` and `Historical Labs (Categorical)` columns that are not needed. The columns that are needed are the following:
- glucose_median - glucose levels in blood work. 

In [38]:
#Isolate the historical labs
historical_labs = health_data.filter(like='hist_').columns

In [39]:
# Specify the columns to keep
columns_to_keep = [
    'hist_glucose_median'

]

# Create a separate data frame to store this data
health_data_filtered = health_data[columns_to_keep]

I will now drop historical_labs from the original data frame, then merge back the health_data_filtered. 

In [40]:
#Drop the columns isolated
health_data = health_data.drop(columns=historical_labs)

I will concatenate the filtered data to the original data frame.

In [41]:
# Concatenate the filtered data with the original data along the columns axis
health_data = pd.concat([health_data, health_data_filtered], axis=1)
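The copy, drop, and concatenate sequence above can also be written as a single drop of everything in the super category except the columns to keep. This is a sketch of that alternative on a toy frame (the `hist_glucose_max` and `hist_sodium_last` names and all values are illustrative), not the approach that was run here.

```python
import pandas as pd

# Toy frame (invented values)
toy = pd.DataFrame({
    "demo_age": [86, 53],
    "hist_glucose_median": [96.0, 139.0],
    "hist_glucose_max": [99.0, 154.0],
    "hist_sodium_last": [140.0, 139.0],
})

columns_to_keep = ["hist_glucose_median"]

# All columns in the super category, minus the ones we want to keep
hist_cols = toy.filter(like="hist_").columns
toy = toy.drop(columns=hist_cols.difference(columns_to_keep))

print(toy.columns.tolist())
# ['demo_age', 'hist_glucose_median']
```

One small difference: this keeps the retained column in its original position, whereas the concatenation moves it to the end of the data frame.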

In [42]:
historical_labs = health_data.filter(like='hist_').columns
print(f'The columns that make up the super category are : {historical_labs}')

The columns that make up the super category are : Index(['hist_glucose_median'], dtype='object')


---
In addition, I will drop the `Meds` columns that are not needed. The columns that are needed are the following:
 - meds_antihyperglycemics - this will indicate whether or not the patient is currently medicated for their diabetes
 - meds_anti-obesitydrugs - sometimes medications like Ozempic could be classified as an anti-obesity drug
 - meds_hormones - insulin is a peptide hormone which is used to regulate blood sugar levels in diabetics.

In [43]:
#Isolate the medication columns
meds = health_data.filter(like='meds_').columns

In [44]:
# Specify the columns to keep
columns_to_keep = [
    'meds_antihyperglycemics', 'meds_anti-obesitydrugs', 'meds_hormones'

]

# Create a separate data frame to store this data
meds_data_filtered = health_data[columns_to_keep]

I will now drop the meds columns from the original data frame, then merge back the meds_data_filtered.

In [45]:
#Drop the columns isolated
health_data = health_data.drop(columns=meds)

I will concatenate the filtered data to the original data frame.

In [46]:
# Concatenate the filtered data with the original data along the columns axis
health_data = pd.concat([health_data, meds_data_filtered], axis=1)

In [47]:
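# Note: `meds` still holds the column index captured in cell 43, before the drop,
# so this prints the original medication columns rather than the three that remain
print(f'The columns that make up the super category are : {meds}')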

The columns that make up the super category are : Index(['meds_analgesicandantihistaminecombination', 'meds_analgesics',
       'meds_anesthetics', 'meds_anti-obesitydrugs', 'meds_antiallergy',
       'meds_antiarthritics', 'meds_antiasthmatics', 'meds_antibiotics',
       'meds_anticoagulants', 'meds_antidotes', 'meds_antifungals',
       'meds_antihistamineanddecongestantcombination', 'meds_antihistamines',
       'meds_antihyperglycemics', 'meds_antiinfectives',
       'meds_antiinfectives/miscellaneous', 'meds_antineoplastics',
       'meds_antiparkinsondrugs', 'meds_antiplateletdrugs', 'meds_antivirals',
       'meds_autonomicdrugs', 'meds_biologicals', 'meds_blood',
       'meds_cardiacdrugs', 'meds_cardiovascular', 'meds_cnsdrugs',
       'meds_colonystimulatingfactors', 'meds_contraceptives',
       'meds_cough/coldpreparations', 'meds_diagnostic', 'meds_diuretics',
       'meds_eentpreps', 'meds_elect/caloric/h2o', 'meds_gastrointestinal',
       'meds_herbals', 'meds_hormones

Now let's look at what our current data frame looks like.

In [48]:
health_data.shape[1]

487

---
#### <a id = 'rows'></a> 1.3 Deleting Rows

Before we start, let's look at the values inside the remaining columns.

In [49]:
#Writing a loop to retrieve all unique values for all of the columns
for column in health_data.columns:
    unique_vals = health_data[column].unique()
    print(f'Unique values in {column}: {unique_vals}')

Unique values in demo_age: [86 87 53 33 49 50 51 45 46 35 56 55 48 64 65 66 67 47 42 63 61 62 70 71
 72 39 41 75 73 60 44 58 88 84 85 59 27 76 74 80 81 40 77 69 52 83 89 57
 36 37 23 68 54 96 82 95 79 38 78 93 97 91 92 94 19 34 26 28 25 90 43 99
 98 100 29 31 32 22 18 24 30 20 21 102 101 103 104 106 105]
Unique values in demo_gender: ['Female', 'Male']
Categories (2, object): ['Female', 'Male']
Unique values in demo_race: ['Hispanic or Latino' 'White or Caucasian' 'Black or African American'
 'Other' 'Patient Refused' 'Asian'
 'Native Hawaiian or Other Pacific Islander' 'Unknown'
 'American Indian or Alaska Native' nan]
Unique values in demo_employstatus: ['Retired', 'Disabled', 'Full Time', 'Not Employed', 'Part Time', 'Self Employed', 'Student - Full Time', 'Unknown', 'On Active Military Duty', 'Student - Part Time']
Categories (10, object): ['Disabled', 'Full Time', 'Not Employed', 'On Active Military Duty', ..., 'Self Employed', 'Student - Full Time', 'Student - Part Time', 'Unknow

The output already shows null values in the data set (for example, `nan` in `demo_race`). Let's look at the percentage of nulls per column.

In [50]:
#This will show the percentage of null values per column
health_data.isna().sum()/health_data.shape[0]*100

demo_age                                         0.000000
demo_gender                                      0.000000
demo_race                                        0.005527
demo_employstatus                                0.000000
demo_insurance_status                            0.000000
disposition                                      0.000000
triage_cat_arrivalmonth                          0.000000
huse_previousdispo                               0.000000
pmh_2ndarymalig                                  0.000000
pmh_abdomhernia                                  0.000000
pmh_abdomnlpain                                  0.000000
pmh_acqfootdef                                   0.000000
pmh_acrenlfail                                   0.000000
pmh_acutecvd                                     0.000000
pmh_acutemi                                      0.000000
pmh_acutphanm                                    0.000000
pmh_adjustmentdisorders                          0.000000
pmh_adltrespfl

Plan of attack:
- Drop rows in the `demo_race` column with the value 'Unknown', 'Patient Refused', or 'Other', as these do not provide the demographic data that's needed.
- Drop rows in the `demo_employstatus` column with the value 'Unknown', for the same reason.


Let's first define the values to be dropped in each column, then create a loop to drop rows with the defined values.

In [51]:
health_data['demo_race'].value_counts()

demo_race
White or Caucasian                           51384
Black or African American                    34501
Hispanic or Latino                           19526
Other                                         1822
Asian                                          662
Patient Refused                                327
Unknown                                        176
American Indian or Alaska Native               104
Native Hawaiian or Other Pacific Islander       59
Name: count, dtype: int64

In [52]:
# Define the values to be dropped in each column
values_to_drop = {
    'demo_race': ['Unknown', 'Patient Refused', 'Other'],
    'demo_employstatus': ['Unknown']
    }
total_rows_before = len(health_data)
print(f"Total Rows Before Dropping: {total_rows_before}")

Total Rows Before Dropping: 108567


In [53]:
# A for loop to iterate through columns and drop rows with specified values
for column, drop_values in values_to_drop.items():
    health_data = health_data[~health_data[column].isin(drop_values)]

health_data.reset_index(drop=True, inplace=True) #reset the index as needed

In [54]:
total_rows_after = len(health_data)
print(f"Total Rows After Dropping: {total_rows_after}")


Total Rows After Dropping: 106126


In [55]:
columns_of_interest = ['demo_race','demo_employstatus']
for column in columns_of_interest:
    counts = health_data[column].value_counts()
    print(f"Value Counts for '{column}':\n{counts}")

Value Counts for 'demo_race':
demo_race
White or Caucasian                           51317
Black or African American                    34482
Hispanic or Latino                           19497
Asian                                          661
American Indian or Alaska Native               104
Native Hawaiian or Other Pacific Islander       59
Name: count, dtype: int64
Value Counts for 'demo_employstatus':
demo_employstatus
Retired                    42418
Not Employed               22542
Disabled                   21221
Full Time                  13361
Part Time                   4561
Self Employed               1549
Student - Full Time          404
Student - Part Time           66
On Active Military Duty        4
Unknown                        0
Name: count, dtype: int64


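As seen above, the rows were removed properly. `Unknown` still appears under `demo_employstatus` with a count of 0 because the column is categorical and the category itself remains defined.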
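
If an empty category is not wanted in later summaries, it can be removed explicitly. The following is a minimal sketch on a made-up categorical series (`status` is a hypothetical stand-in for `demo_employstatus`), showing why a filtered-out value can still be listed with a count of 0 and how `cat.remove_unused_categories()` clears it.

```python
import pandas as pd

# Hypothetical toy column with a categorical dtype, like demo_employstatus
status = pd.Series(["Retired", "Unknown", "Full Time", "Retired"], dtype="category")

# Filter out 'Unknown' the same way the loop above does
kept = status[~status.isin(["Unknown"])]

# The category still exists, so value_counts() reports it with a count of 0
print(kept.value_counts())

# Dropping unused categories removes the empty 'Unknown' entry
kept = kept.cat.remove_unused_categories()
print(kept.cat.categories.tolist())
```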

---
#### <a id = 'duplicates'></a> 1.4 Handling Duplicates

Let's look and see how many duplicated rows there are.

In [56]:
# Detecting duplicates
duplicates = health_data[health_data.duplicated()]

In [57]:
#Percentage of duplicated rows
(duplicates.shape[0])/health_data.shape[0] * 100

0.013191866272167048

The duplicated rows represent about 0.013% of the total data set. Because every column matches, I am treating these as true duplicates: even for a repeat visit by the same patient, at least one variable should differ. Since they make up such a small share of the data, I will drop them.
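
Before dropping, it can help to look at the duplicated rows side by side. The sketch below uses a small invented frame (`toy`) to show two things: `duplicated().mean()` as a shortcut for the duplicate percentage, and `keep=False`, which flags every copy of a duplicated row rather than only the later ones.

```python
import pandas as pd

# Hypothetical toy frame; the first two rows are exact copies of each other
toy = pd.DataFrame({
    "demo_age": [40, 40, 65, 70],
    "disposition": ["Admit", "Admit", "Discharge", "Admit"],
})

# duplicated() marks the second and later copies; mean() gives the share of True values
dup_pct = toy.duplicated().mean() * 100
print(f"{dup_pct:.3f}% of rows are exact duplicates")

# keep=False marks every copy, which is useful for inspecting duplicates side by side
all_copies = toy[toy.duplicated(keep=False)]
print(all_copies)
```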

In [58]:
health_data.drop_duplicates(inplace=True) #dropping the duplicates 

In [59]:
health_data.duplicated().sum() #confirming that the duplicates have been dropped

0

In [60]:
health_data = health_data.reset_index(drop=True) #reset the index

Now that we have dropped all duplicated rows and reset the index, let's move on to the next part of our cleaning.

---
#### <a id = 'type'></a> 1.5 Data Type Conversion

Disposition is a binary column, but it's still in text form. I will binarize the column as a step to clean the data. 

In [61]:
(health_data['disposition'].value_counts()/health_data['disposition'].count())*100

disposition
Discharge    54.031589
Admit        45.968411
Name: count, dtype: float64

As we can see, about 54% of all patients were discharged and 46% of all patients were admitted. Now let's binarize them. 

In [62]:
health_data['disposition'] = health_data['disposition'].map({'Admit': 1, 'Discharge': 0})
health_data['disposition'].value_counts()

disposition
0    57334
1    48778
Name: count, dtype: int64

This confirms that the `disposition` column has been successfully binarized. 
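
One thing to keep in mind with `Series.map` is that any value missing from the mapping dictionary silently becomes `NaN`. That is not an issue here, since the column only contains 'Admit' and 'Discharge', but a quick check along the following lines can catch it. The series and the 'Transfer' label are invented for illustration.

```python
import pandas as pd

# Hypothetical toy column; 'Transfer' is an invented label used to show the failure mode
disposition = pd.Series(["Admit", "Discharge", "Admit", "Transfer"])

mapping = {"Admit": 1, "Discharge": 0}
binarized = disposition.map(mapping)

# Series.map returns NaN for any value missing from the mapping
unmapped = disposition[binarized.isna()].unique().tolist()
print(unmapped)
```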
___

To make analysis a little easier down the road, I will convert the age column from numeric to categorical.

Find the min and max age and determine the size of 'age bins' we will use. 

In [63]:
# Find the minimum, maximum in age
min_age = health_data['demo_age'].min()
max_age = health_data['demo_age'].max()

# Display the results
print(f"Minimum age: {min_age}")
print(f"Maximum age: {max_age}")

Minimum age: 18
Maximum age: 106


As we can see, age ranges from 18 - 106. We will use 9 bins of roughly 10 years each to classify age: '18-29', '30-39', '40-49', '50-59', '60-69', '70-79', '80-89', '90-99', '100-109'. 


In [64]:
age_bins = [18, 29, 39, 49, 59, 69, 79, 89, 99, 109]
age_labels = ['18-29', '30-39', '40-49', '50-59', '60-69', '70-79', '80-89', '90-99', '100-109']

Now let's update the age column with these categories, effectively converting the numeric column to a categorical one. 

In [65]:
#Classify the age column
health_data['demo_age'] = pd.cut(health_data['demo_age'], bins=age_bins, labels=age_labels, right=False)
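
A note on bin edges: because `right=False` makes each bin include its left edge and exclude its right edge, the edges `[18, 29, 39, ...]` place a 29-year-old in the bin labelled '30-39'. The sketch below, on a handful of made-up ages, compares those edges with an alternative set (`bins_alt`, which is my assumption and not what this notebook uses) whose edges line up with the labels. Adopting the alternative would shift the counts reported in this notebook slightly, so it is shown only for comparison.

```python
import pandas as pd

# Made-up ages chosen to sit on the bin boundaries
ages = pd.Series([18, 29, 30, 39, 40, 106])
age_labels = ['18-29', '30-39', '40-49', '50-59', '60-69', '70-79', '80-89', '90-99', '100-109']

# Edges used in the cell above: with right=False the bins are [18, 29), [29, 39), ...
bins_above = [18, 29, 39, 49, 59, 69, 79, 89, 99, 109]
as_above = pd.cut(ages, bins=bins_above, labels=age_labels, right=False)

# Alternative edges (an assumption, not the notebook's choice): [18, 30), [30, 40), ...
bins_alt = [18, 30, 40, 50, 60, 70, 80, 90, 100, 110]
as_alt = pd.cut(ages, bins=bins_alt, labels=age_labels, right=False)

print(pd.DataFrame({"age": ages, "edges_above": as_above, "edges_alt": as_alt}))
```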

Let's confirm that the column has been classified correctly.

In [66]:
health_data['demo_age'].value_counts()

demo_age
50-59      24001
60-69      22333
70-79      18015
40-49      13156
80-89      13090
30-39       7421
90-99       4749
18-29       3211
100-109      136
Name: count, dtype: int64

---
In addition, I noticed that some of the `Chief Complaint` columns contain a 2 or 3. These instances are few, so I am comfortable recoding them: I assume they were meant to be a 1 and will map them accordingly so that all of the `chief_complaint` columns are binary. 

In [67]:
# List of columns to update (columns that start with 'cc_')
columns_to_update = [col for col in health_data.columns if col.startswith('cc_')]

In [68]:
health_data[columns_to_update].max().value_counts()#let's see what the different values are.

1    179
2     20
3      1
Name: count, dtype: int64

In [69]:
# Loop through the selected columns and replace 2 and 3 with 1
for column in columns_to_update:
    health_data[column] = health_data[column].replace(2, 1).replace(3, 1)

In [70]:
health_data[columns_to_update].min().sum()

0.0

In [71]:
health_data[columns_to_update].max().value_counts()

1    200
Name: count, dtype: int64

The 2 and 3 values have been removed from the columns. 
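
The same recoding can be done for all `cc_` columns at once, without a loop. The following is a rough sketch on an invented frame (`toy`, with hypothetical column names and values) using `DataFrame.clip`. Note that `clip(upper=1)` caps any value above 1, not only 2 and 3, so it is slightly broader than the `replace` calls above.

```python
import pandas as pd

# Hypothetical toy frame with two chief-complaint columns and one unrelated column
toy = pd.DataFrame({
    "cc_chestpain": [0, 1, 2],
    "cc_fever": [0, 3, 1],
    "demo_age": [40, 65, 70],
})

cc_cols = [col for col in toy.columns if col.startswith("cc_")]

# clip(upper=1) caps every value above 1 at 1 and leaves 0, 1 and NaN untouched
toy[cc_cols] = toy[cc_cols].clip(upper=1)
print(toy[cc_cols].max())
```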

---
#### <a id = 'missing'></a> 1.6 Handling Missing Data


Let's look at the percentage of nulls per column.

In [72]:
health_data.isna().sum()/health_data.shape[0]*100

demo_age                                         0.000000
demo_gender                                      0.000000
demo_race                                        0.005654
demo_employstatus                                0.000000
demo_insurance_status                            0.000000
disposition                                      0.000000
triage_cat_arrivalmonth                          0.000000
huse_previousdispo                               0.000000
pmh_2ndarymalig                                  0.000000
pmh_abdomhernia                                  0.000000
pmh_abdomnlpain                                  0.000000
pmh_acqfootdef                                   0.000000
pmh_acrenlfail                                   0.000000
pmh_acutecvd                                     0.000000
pmh_acutemi                                      0.000000
pmh_acutphanm                                    0.000000
pmh_adjustmentdisorders                          0.000000
pmh_adltrespfl

In general, the null percentages range from 0.44% to 48%. The highest percentages are in the triage vital columns, which hold the vital signs recorded at intake. 

Plan of attack: 
  - The `Chief complaint` columns have a null percentage of about 0.46%. I will drop these rows as there is no way to impute the values: a patient either has a chief complaint or they don't. These columns are also meant to be binary; the occasional 2 or 3 was already mapped to 1 in section 1.5.
  - The `race` and `arrivalmode` columns have null percentages ranging from 0.002% to 3.9%. I will drop these rows as there is no way to impute these values.
  ---

I will start with the null values that can be dropped simply.

In [73]:
health_data.dropna(subset='demo_race',inplace = True)

---
Let's now remove the null values from the `Chief complaint` columns. These columns all start with `cc_`, so I will select them by that prefix.

In [74]:
cc_rows = health_data.filter(like='cc_').columns #this isolates the chief complaint columns

health_data.dropna(subset = cc_rows, inplace=True)

---

Now let's look at the state of our data after removing these nulls. 

In [75]:
missing_percentage = health_data.isna().sum() / health_data.shape[0] * 100
with pd.option_context('display.max_rows', None): #I was having display issues only using the above code. This is my work around
    print(missing_percentage)

demo_age                                         0.000000
demo_gender                                      0.000000
demo_race                                        0.000000
demo_employstatus                                0.000000
demo_insurance_status                            0.000000
disposition                                      0.000000
triage_cat_arrivalmonth                          0.000000
huse_previousdispo                               0.000000
pmh_2ndarymalig                                  0.000000
pmh_abdomhernia                                  0.000000
pmh_abdomnlpain                                  0.000000
pmh_acqfootdef                                   0.000000
pmh_acrenlfail                                   0.000000
pmh_acutecvd                                     0.000000
pmh_acutemi                                      0.000000
pmh_acutphanm                                    0.000000
pmh_adjustmentdisorders                          0.000000
pmh_adltrespfl

All of the remaining null values are due to there not being a reading available. I will fill these in with a "no reading recorded" label (for example, `no_hr_recorded`) when I convert these numeric columns into categorical columns below. 
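
With several hundred columns, printing every percentage makes the remaining nulls hard to spot. As a small sketch (on a made-up frame `toy`, with hypothetical values), the output can be narrowed to the columns that still contain nulls, sorted from most to least missing.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with nulls in two of the three columns
toy = pd.DataFrame({
    "demo_age": [40, 65, 70, 55],
    "triage_vital_hr": [80.0, np.nan, 95.0, np.nan],
    "triage_vital_temp": [98.6, 98.1, np.nan, 97.9],
})

# isna().mean() is the fraction of nulls per column; multiply by 100 for a percentage
missing_pct = toy.isna().mean() * 100

# Keep only the columns that still have nulls, largest share first
missing_pct = missing_pct[missing_pct > 0].sort_values(ascending=False)
print(missing_pct)
```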

---
#### <a id = 'cat'></a> 1.7: Categorizing Columns

When looking at different columns, I want to convert some of the numerical columns into categorical columns. From what I have observed, there is an eventual ceiling for these values. It would be better to sort them into ranges that can be put into certain categories. I will then drop the original columns.

I will group them based on max/min values, but also reference medical information that is widely available. 

In [76]:
# Select all numeric columns
numeric_columns = health_data.select_dtypes(include=['number'])

# Exclude columns that start with 'pmh_' or 'cc_', or that end with '_flag'
columns_to_exclude = [col for col in numeric_columns.columns if col.startswith('pmh_') or col.startswith('cc_') or col.endswith('_flag')]
filtered_numeric_columns = numeric_columns.drop(columns=columns_to_exclude)

In [77]:
filtered_numeric_columns.describe()

| | huse_n_edvisits | huse_n_admissions | triage_vital_hr | triage_vital_sbp | triage_vital_dbp | triage_vital_rr | triage_vital_temp | huse_n_surgeries | hist_glucose_median | meds_antihyperglycemics | meds_anti-obesitydrugs | meds_hormones |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 105627.0 | 105627.0 | 67521.0 | 67127.0 | 67093.0 | 66750.0 | 63973.0 | 105627.0 | 66513.0 | 105627.0 | 105627.0 | 105627.0 |
| mean | 4.892471 | 1.790868 | 86.172901 | 139.010868 | 80.279362 | 17.821657 | 98.055874 | 3.47147 | 182.134545 | 0.466481 | 0.000331 | 0.055857 |
| std | 10.282794 | 3.312581 | 16.916307 | 24.060948 | 15.038267 | 2.061627 | 0.811444 | 3.119137 | 97.39737 | 0.889094 | 0.0182 | 0.257845 |
| min | 0.0 | 0.0 | 30.0 | 51.0 | 25.0 | 8.0 | 90.0 | 0.0 | 16.0 | 0.0 | 0.0 | 0.0 |
| 25% | 0.0 | 0.0 | 74.0 | 122.0 | 70.0 | 16.0 | 97.6 | 1.0 | 113.5 | 0.0 | 0.0 | 0.0 |
| 50% | 2.0 | 1.0 | 85.0 | 137.0 | 80.0 | 18.0 | 98.0 | 3.0 | 152.0 | 0.0 | 0.0 | 0.0 |
| 75% | 5.0 | 2.0 | 97.0 | 153.0 | 90.0 | 18.0 | 98.4 | 5.0 | 221.5 | 1.0 | 0.0 | 0.0 |
| max | 155.0 | 46.0 | 237.0 | 266.0 | 214.0 | 66.0 | 105.3 | 47.0 | 1364.0 | 8.0 | 1.0 | 4.0 |


The summary above gives the following ranges for each column. Note that some of these columns contain nulls, which will be handled in the functions defined below.

- triage_vital_hr
    - ranges from 30 - 237 beats per minute (bpm)
- triage_vital_sbp
    - ranges from 51 - 266 mmHg for sbp
- triage_vital_dbp
    - ranges from 25 - 214 mmHg for dbp
- triage_vital_rr
    - ranges from 8 - 66 breaths per minute
- triage_vital_temp
    - ranges from 90 - 105.3 Fahrenheit 
- huse_n_edvisits
    - ranges from 0 - 155 prior visits
- huse_n_admissions
    - ranges from 0 - 46 prior admissions to the hospital
- huse_n_surgeries
    - ranges from 0 - 47 prior surgeries at the time of admission 
- hist_glucose_median
    - ranges from 16 - 1364 mg/dL in glucose readings for bloodwork 
- meds_antihyperglycemics
    - ranges from 0 - 8 medications prescribed for outpatient use
- meds_anti-obesitydrugs
    - ranges from 0 - 1 medications prescribed for outpatient use
- meds_hormones
    - ranges from 0 - 4 medications prescribed for outpatient use

I will segment these columns as shown below. These values for the various conditions were pulled from the Mayo Clinic.  

triage_vital_hr (heart rate):
- Normal: 60 - 100 bpm
- Tachycardia (high): >100 bpm
- Bradycardia (low): <60 bpm
- Critical (extreme values): <30 bpm or >160 bpm

triage_vital_sbp (systolic blood pressure):
- Normal: 90 - 120 mmHg
- Hypotension (low): <90 mmHg
- Pre-hypertension: 120 - 140 mmHg
- Hypertension (high): >140 mmHg
- Critical (extreme values): <70 mmHg or >180 mmHg

triage_vital_dbp (diastolic blood pressure):
- Normal: 60 - 80 mmHg
- Hypotension (low): <60 mmHg
- Pre-hypertension: 80 - 90 mmHg
- Hypertension (high): >90 mmHg
- Critical (extreme values): <40 mmHg or >120 mmHg

triage_vital_rr (respiratory rate):
- Normal: 12 - 20 breaths per minute
- Tachypnea (high): >20 breaths per minute
- Bradypnea (low): <12 breaths per minute
- Critical (extreme values): <8 breaths per minute or >30 breaths per minute

triage_vital_temp (temperature):
- Normal body temperature: 97 - 99.5 degrees Fahrenheit
- Hypothermia (low): <97 degrees Fahrenheit
- Fever (high): >99.5 degrees Fahrenheit
- Critical (extreme values): <90 degrees Fahrenheit or >106 degrees Fahrenheit

huse_n_edvisits (prior hospital visits):
- No prior visits: 0
- Low prior visits: 1 - 10
- Moderate prior visits: 11 - 50
- High prior visits: >50

huse_n_admissions (prior hospital admissions):
- No prior admissions: 0
- Low prior admissions: 1 - 10
- Moderate prior admissions: 11 - 20
- High prior admissions: 21 - 30
- Very high prior admissions: >30

huse_n_surgeries (prior surgeries):
- No prior surgeries: 0
- Low prior surgeries: 1 - 10
- Moderate prior surgeries: 11 - 20
- High prior surgeries: >20

hist_glucose_median
- "Normal" if <= 200 mg/dL
- ">200(high)" if > 200 and <= 300 mg/dL
- ">300(very high)" if > 300 mg/dL
- **for this data, we will be following the guide put forth by Raza

meds_antihyperglycemics
- if 0 = no_antihyperglycemics
- if 1-2 = 1-to-2_antihyperglycemics
- if 3-6 = 3-to-6_antihyperglycemics
- if 7 or more = 7-plus_antihyperglycemics

meds_anti-obesitydrugs
- Since it's 0 or 1, there is no need to do anything 

meds_hormones
- ranges from 0 - 4 medications prescribed for outpatient use
- if 0 = no_hormones
- if 1-2 = 1-to-2_hormones
- if 3+ = 3-plus_hormones 


Let's start with replacing the numeric values in the columns with the above values. We will refer back to this replacement to understand what the true values are. 

In [78]:
# Define a function to classify heart rate
def classify_heart_rate(value):
    if pd.isnull(value):
        return 'no_hr_recorded'
    elif value >= 30 and value <= 160:
        if value >= 60 and value <= 100:
            return 'normal_hr'
        elif value > 100:
            return 'tachycardia(high)_hr'
        else:
            return 'bradycardia(low)_hr'
    else:
        return 'critical_hr'

# Apply the classification function to the DataFrame
health_data['triage_vital_hr'] = health_data['triage_vital_hr'].apply(classify_heart_rate)

# Let's make sure it worked properly
health_data['triage_vital_hr'].value_counts()


triage_vital_hr
normal_hr               52670
no_hr_recorded          38106
tachycardia(high)_hr    12336
bradycardia(low)_hr      2472
critical_hr                43
Name: count, dtype: int64
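
The row-by-row `apply` above works well at this data size. As an alternative sketch (not what this notebook uses), the same heart-rate rules can be applied to the whole column at once with `numpy.select`, which checks the conditions in order and takes the first match. The series `hr` is made up to cover each category.

```python
import numpy as np
import pandas as pd

# Made-up heart rates, one per category plus a missing reading
hr = pd.Series([25, 45, 60, 100, 120, 200, np.nan])

# Conditions are checked in order, so nulls and critical values are claimed first
conditions = [
    hr.isna().to_numpy(),
    ((hr < 30) | (hr > 160)).to_numpy(),
    (hr < 60).to_numpy(),
    (hr <= 100).to_numpy(),
]
choices = ["no_hr_recorded", "critical_hr", "bradycardia(low)_hr", "normal_hr"]

# Anything left over is above 100 and at most 160
hr_class = pd.Series(
    np.select(conditions, choices, default="tachycardia(high)_hr"),
    index=hr.index,
)
print(hr_class.tolist())
```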

In [79]:
# Define a function to classify systolic blood pressure
def classify_systolic_sbp(value):
    if pd.isnull(value):
        return 'no_sbp_recorded'
    elif value >= 70 and value <= 180:
        if value >= 90 and value <= 120:
            return 'normal_sbp'
        elif value < 90:
            return 'hypotension(low)_sbp'
        elif value <= 140:
            return 'pre-hypertension_sbp'
        elif value <= 180:
            return 'hypertension(high)_sbp'
    else:
        return 'critical_sbp'

# Apply the classification function to the DataFrame
health_data['triage_vital_sbp'] = health_data['triage_vital_sbp'].apply(classify_systolic_sbp)

# Let's make sure it worked properly
health_data['triage_vital_sbp'].value_counts()

triage_vital_sbp
no_sbp_recorded           38500
hypertension(high)_sbp    25970
pre-hypertension_sbp      22520
normal_sbp                14252
critical_sbp               3754
hypotension(low)_sbp        631
Name: count, dtype: int64

In [80]:
# Define a function to classify diastolic blood pressure
def classify_diastolic_dbp(value):
    if pd.isnull(value):
        return 'no_dbp_recorded'
    elif value >= 40 and value <= 120:
        if value >= 60 and value <= 80:
            return 'normal_dbp'
        elif value < 60:
            return 'hypotension(low)_dbp'
        elif value <= 90:
            return 'pre-hypertension_dbp'
        elif value <= 120:
            return 'hypertension(high)_dbp'
    else:
        return 'critical_dbp'

# Apply the classification function to the DataFrame
health_data['triage_vital_dbp'] = health_data['triage_vital_dbp'].apply(classify_diastolic_dbp)

# Let's make sure it worked properly
health_data['triage_vital_dbp'].value_counts()

triage_vital_dbp
no_dbp_recorded           38534
normal_dbp                29487
pre-hypertension_dbp      17345
hypertension(high)_dbp    14610
hypotension(low)_dbp       4863
critical_dbp                788
Name: count, dtype: int64

In [81]:
# Define a function to classify respiratory rate
def classify_respiratory_rate(value):
    if pd.isnull(value):
        return 'no_rr_recorded'
    elif value >= 8 and value <= 30:
        if value >= 12 and value <= 20:
            return 'normal_rr'
        elif value > 20:
            return 'tachypnea(high)_rr'
        else:
            return 'bradypnea(low)_rr'
    else:
        return 'Critical_rr'
    
# Apply the classification function to the DataFrame
health_data['triage_vital_rr'] = health_data['triage_vital_rr'].apply(classify_respiratory_rate)

#Let's Make sure it worked properly
health_data['triage_vital_rr'].value_counts()

triage_vital_rr
normal_rr             63958
no_rr_recorded        38877
tachypnea(high)_rr     2620
Critical_rr             146
bradypnea(low)_rr        26
Name: count, dtype: int64

In [82]:
# Define a function to classify body temperature
def classify_temperature(value):
    if pd.isnull(value):
        return 'no_temp_recorded'
    elif value >= 90 and value <= 106:
        if value >= 97 and value <= 99.5:
            return 'normal_temp'
        elif value < 97:
            return 'hypothermia(low)_temp'
        elif value > 99.5:
            return 'fever(high_temp)'
    else:
        return 'critical_temp'

# Apply the classification function to the DataFrame
health_data['triage_vital_temp'] = health_data['triage_vital_temp'].apply(classify_temperature)


#Let's Make sure it worked properly
health_data['triage_vital_temp'].value_counts()

triage_vital_temp
normal_temp              58890
no_temp_recorded         41654
hypothermia(low)_temp     2966
fever(high_temp)          2117
Name: count, dtype: int64

In [83]:
# Define a function to classify prior hospital visits
def classify_prior_visits(value):
    if pd.isnull(value):
        return 'no_prior_recorded'
    if value == 0:
        return 'no_prior_visits'
    elif value >= 1 and value <= 10:
        return 'low_prior_visit'
    elif value >= 11 and value <= 50:
        return 'moderate_prior_visit'
    else:
        return 'high_prior_visit'

# Apply the classification function to the DataFrame
health_data['huse_n_edvisits'] = health_data['huse_n_edvisits'].apply(classify_prior_visits)

#Let's Make sure it worked properly
health_data['huse_n_edvisits'].value_counts()



huse_n_edvisits
low_prior_visit         66680
no_prior_visits         27409
moderate_prior_visit    10470
high_prior_visit         1068
Name: count, dtype: int64

In [84]:

# Define a function to classify prior hospital admissions
def classify_prior_admissions(value):
    if value == 0:
        return 'No_prior_admis'
    elif value >= 1 and value <= 10:
        return 'low_prior_admis'
    elif value >= 11 and value <= 20:
        return 'moderate_prior_admis'
    elif value >= 21 and value <= 30:
        return 'high_prior_admis'
    else:
        return 'vhigh_prior_admis'
    
# Apply the classification function to the DataFrame
health_data['huse_n_admissions'] = health_data['huse_n_admissions'].apply(classify_prior_admissions)

#Let's Make sure it worked properly
health_data['huse_n_admissions'].value_counts()

huse_n_admissions
low_prior_admis         52083
No_prior_admis          50670
moderate_prior_admis     2446
high_prior_admis          282
vhigh_prior_admis         146
Name: count, dtype: int64

In [85]:
# Define a function to classify prior surgeries
def classify_prior_surgeries(value):
    if value == 0:
        return 'no_prior_surg'
    elif value >= 1 and value <= 10:
        return 'low_Surg'
    elif value >= 11 and value <= 20:
        return 'moderate_Surg'
    else:
        return 'high_Surg'

# Apply the classification function to the DataFrame
health_data['huse_n_surgeries'] = health_data['huse_n_surgeries'].apply(classify_prior_surgeries)

#Let's Make sure it worked properly
health_data['huse_n_surgeries'].value_counts()

huse_n_surgeries
low_Surg         87142
no_prior_surg    14866
moderate_Surg     3553
high_Surg           66
Name: count, dtype: int64

In [86]:
# Function to categorize blood glucose levels
def categorize_blood_glucose(value):
    if pd.isnull(value):
        return 'no_glucose_level_recorded'
    elif value < 200:
        return 'Normal'
    elif value <= 300:
        return '>200(high)'
    else:
        return '>300(very high)'
    
# Apply the classification function to the DataFrame
health_data['hist_glucose_median'] = health_data['hist_glucose_median'].apply(categorize_blood_glucose)

#Let's Make sure it worked properly
health_data['hist_glucose_median'].value_counts()

hist_glucose_median
Normal                       46062
no_glucose_level_recorded    39114
>200(high)                   12856
>300(very high)               7595
Name: count, dtype: int64
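
One boundary detail: the function above uses `value < 200`, so a median of exactly 200 mg/dL is labelled '>200(high)', whereas the segmentation list earlier treats `<= 200` as "Normal". The sketch below, on made-up readings, shows one way to express the listed thresholds with `pd.cut`. It is an illustration of the stated rule, not the code that produced the counts above.

```python
import numpy as np
import pandas as pd

# Made-up glucose medians, including the two boundary values and a missing reading
glucose = pd.Series([150.0, 200.0, 250.0, 300.0, 450.0, np.nan])

# right=True (the default) gives the intervals (-inf, 200], (200, 300], (300, inf)
glucose_cat = pd.cut(
    glucose,
    bins=[-np.inf, 200, 300, np.inf],
    labels=["Normal", ">200(high)", ">300(very high)"],
)

# Nulls stay NaN after pd.cut, so add an explicit category for them
glucose_cat = glucose_cat.cat.add_categories("no_glucose_level_recorded")
glucose_cat = glucose_cat.fillna("no_glucose_level_recorded")
print(glucose_cat.tolist())
```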

In [87]:
# Define a function to classify antihyperglycemics medications
def classify_antihyperglycemics(value):
    if value == 0:
        return 'no_antihyperglycemics'
    elif value <= 2:
        return '1-to-2_antihyperglycemics'
    elif value <= 6:
        return '3-to-6_antihyperglycemics'
    else:
        return '7-plus_antihyperglycemics'

# Apply the classification function to the DataFrame
health_data['meds_antihyperglycemics'] = health_data['meds_antihyperglycemics'].apply(classify_antihyperglycemics)

#Let's Make sure it worked properly
health_data['meds_antihyperglycemics'].value_counts()

meds_antihyperglycemics
no_antihyperglycemics        77286
1-to-2_antihyperglycemics    23534
3-to-6_antihyperglycemics     4803
7-plus_antihyperglycemics        4
Name: count, dtype: int64

In [88]:
# Define a function to classify hormones medications
def classify_hormones(value):
    if value == 0:
        return 'no_hormones'
    elif value <= 2:
        return '1-to-2_hormones'
    else:
        return '3-plus_hormones'

# Apply the classification function to the DataFrame
health_data['meds_hormones'] = health_data['meds_hormones'].apply(classify_hormones)

#Let's Make sure it worked properly
health_data['meds_hormones'].value_counts()

meds_hormones
no_hormones        100391
1-to-2_hormones      5186
3-plus_hormones        50
Name: count, dtype: int64

This will make it easier to understand the distribution of these values in a meaningful way while also pre-processing these columns for future one-hot encoding and modelling. 
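
To illustrate what the later one-hot encoding step could look like, here is a minimal sketch using `pd.get_dummies` on an invented frame (`toy`). The actual encoding is left for the modelling notebooks, so the column choice and the `dtype=int` setting here are assumptions.

```python
import pandas as pd

# Hypothetical toy frame with one categorized vital-sign column
toy = pd.DataFrame({
    "triage_vital_hr": ["normal_hr", "no_hr_recorded", "critical_hr"],
    "disposition": [1, 0, 1],
})

# get_dummies expands each category into its own indicator column
encoded = pd.get_dummies(toy, columns=["triage_vital_hr"], dtype=int)
print(encoded.columns.tolist())
```

Each label, including the "no reading recorded" one, becomes its own 0/1 column, which is why the missing readings were given an explicit label rather than left as nulls.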

---
### <a id = 'conc'></a> Conclusion

In [89]:
health_data.columns.tolist()

['demo_age',
 'demo_gender',
 'demo_race',
 'demo_employstatus',
 'demo_insurance_status',
 'disposition',
 'triage_cat_arrivalmonth',
 'huse_previousdispo',
 'pmh_2ndarymalig',
 'pmh_abdomhernia',
 'pmh_abdomnlpain',
 'pmh_acqfootdef',
 'pmh_acrenlfail',
 'pmh_acutecvd',
 'pmh_acutemi',
 'pmh_acutphanm',
 'pmh_adjustmentdisorders',
 'pmh_adltrespfl',
 'pmh_alcoholrelateddisorders',
 'pmh_allergy',
 'pmh_amniosdx',
 'pmh_analrectal',
 'pmh_anemia',
 'pmh_aneurysm',
 'pmh_anxietydisorders',
 'pmh_appendicitis',
 'pmh_artembolism',
 'pmh_asppneumon',
 'pmh_asthma',
 'pmh_attentiondeficitconductdisruptivebeha',
 'pmh_backproblem',
 'pmh_biliarydx',
 'pmh_birthtrauma',
 'pmh_bladdercncr',
 'pmh_blindness',
 'pmh_bnignutneo',
 'pmh_bonectcncr',
 'pmh_bph',
 'pmh_brainnscan',
 'pmh_breastcancr',
 'pmh_breastdx',
 'pmh_brnchlngca',
 'pmh_bronchitis',
 'pmh_burns',
 'pmh_cardiaarrst',
 'pmh_cardiacanom',
 'pmh_carditis',
 'pmh_cataract',
 'pmh_cervixcancr',
 'pmh_chestpain',
 'pmh_chfnonhp',
 

In [90]:
health_data.isna().sum().sum()

0

There are no null values. 

In [91]:
health_data.shape

(105627, 487)

After cleaning, we are left with 487 columns and 105,627 rows. There is still a sizeable data set left over. We will now export the clean data as its own CSV to be used for further EDA. 

In [92]:
clean_health_data = health_data

In [93]:
clean_health_data.to_csv('clean_health_data.csv', index=False) 
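
One caveat when reading the file back: a CSV does not store pandas dtypes, so categorical columns such as `demo_age` come back as plain strings unless the dtype is requested again. The sketch below demonstrates this on an invented frame and an in-memory buffer instead of the real `clean_health_data.csv`.

```python
import io
import pandas as pd

# Hypothetical toy frame with one categorical column
toy = pd.DataFrame({
    "demo_age": pd.Categorical(["18-29", "30-39", "18-29"]),
    "disposition": [1, 0, 1],
})

# Write to an in-memory buffer instead of a file on disk
buffer = io.StringIO()
toy.to_csv(buffer, index=False)
buffer.seek(0)

# The categorical dtype is not stored in a CSV, so it has to be requested again
reloaded = pd.read_csv(buffer, dtype={"demo_age": "category"})
print(reloaded.dtypes)
```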

---