## EDA ADMISSIONS TABLE

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
pd.options.display.float_format = '{:,}'.format

In [None]:
df_admissions = pd.read_csv('data/ADMISSIONS.csv.gz', compression='gzip')

In [None]:
df_admissions.head()

Unnamed: 0,ROW_ID,SUBJECT_ID,HADM_ID,ADMITTIME,DISCHTIME,DEATHTIME,ADMISSION_TYPE,ADMISSION_LOCATION,DISCHARGE_LOCATION,INSURANCE,LANGUAGE,RELIGION,MARITAL_STATUS,ETHNICITY,EDREGTIME,EDOUTTIME,DIAGNOSIS,HOSPITAL_EXPIRE_FLAG,HAS_CHARTEVENTS_DATA
0,21,22,165315,2196-04-09 12:26:00,2196-04-10 15:54:00,,EMERGENCY,EMERGENCY ROOM ADMIT,DISC-TRAN CANCER/CHLDRN H,Private,,UNOBTAINABLE,MARRIED,WHITE,2196-04-09 10:06:00,2196-04-09 13:24:00,BENZODIAZEPINE OVERDOSE,0,1
1,22,23,152223,2153-09-03 07:15:00,2153-09-08 19:10:00,,ELECTIVE,PHYS REFERRAL/NORMAL DELI,HOME HEALTH CARE,Medicare,,CATHOLIC,MARRIED,WHITE,,,CORONARY ARTERY DISEASE\CORONARY ARTERY BYPASS...,0,1
2,23,23,124321,2157-10-18 19:34:00,2157-10-25 14:00:00,,EMERGENCY,TRANSFER FROM HOSP/EXTRAM,HOME HEALTH CARE,Medicare,ENGL,CATHOLIC,MARRIED,WHITE,,,BRAIN MASS,0,1
3,24,24,161859,2139-06-06 16:14:00,2139-06-09 12:48:00,,EMERGENCY,TRANSFER FROM HOSP/EXTRAM,HOME,Private,,PROTESTANT QUAKER,SINGLE,WHITE,,,INTERIOR MYOCARDIAL INFARCTION,0,1
4,25,25,129635,2160-11-02 02:06:00,2160-11-05 14:55:00,,EMERGENCY,EMERGENCY ROOM ADMIT,HOME,Private,,UNOBTAINABLE,MARRIED,WHITE,2160-11-02 01:01:00,2160-11-02 04:27:00,ACUTE CORONARY SYNDROME,0,1


In [None]:
df_admissions.shape

(58976, 19)

In [None]:
df_admissions.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 58976 entries, 0 to 58975
Data columns (total 19 columns):
ROW_ID                  58976 non-null int64
SUBJECT_ID              58976 non-null int64
HADM_ID                 58976 non-null int64
ADMITTIME               58976 non-null object
DISCHTIME               58976 non-null object
DEATHTIME               5854 non-null object
ADMISSION_TYPE          58976 non-null object
ADMISSION_LOCATION      58976 non-null object
DISCHARGE_LOCATION      58976 non-null object
INSURANCE               58976 non-null object
LANGUAGE                33644 non-null object
RELIGION                58518 non-null object
MARITAL_STATUS          48848 non-null object
ETHNICITY               58976 non-null object
EDREGTIME               30877 non-null object
EDOUTTIME               30877 non-null object
DIAGNOSIS               58951 non-null object
HOSPITAL_EXPIRE_FLAG    58976 non-null int64
HAS_CHARTEVENTS_DATA    58976 non-null int64
dtypes: int64(5), objec

  
The ADMISSIONS table gives information regarding a patient’s admission to the hospital. Since each unique hospital visit for a patient is assigned a unique HADM_ID, the ADMISSIONS table can be considered as a definition table for HADM_ID. Information available includes timing information for admission and discharge, demographic information, the source of the admission, and so on.  
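If ADMISSIONS really acts as the definition table for HADM_ID, every HADM_ID should occur exactly once, while a SUBJECT_ID may repeat. A minimal sketch of that check, using a toy frame built from the first rows shown above (`toy_admissions` is a stand-in name, not part of the notebook):

```python
import pandas as pd

# Toy stand-in for df_admissions, using IDs from the head() output above
toy_admissions = pd.DataFrame({
    "SUBJECT_ID": [22, 23, 23, 24],
    "HADM_ID": [165315, 152223, 124321, 161859],
})

# Each hospital admission should appear exactly once
hadm_id_is_unique = toy_admissions["HADM_ID"].is_unique

# A patient may appear in several rows, one per admission
subject_id_is_unique = toy_admissions["SUBJECT_ID"].is_unique

print(hadm_id_is_unique, subject_id_is_unique)
```

On the real table the same check would be `df_admissions.HADM_ID.is_unique`.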

In [None]:
df_admissions.SUBJECT_ID.nunique()

46520

The dataset contains 58976 admissions of 46520 distinct patients.
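Since there are more admissions than patients (58976 / 46520 is roughly 1.27 admissions per patient), some patients were admitted more than once. One possible way to count them is sketched below on toy data; the frame and variable names are illustrative only:

```python
import pandas as pd

# Toy stand-in for df_admissions, using IDs from the head() output above
toy_admissions = pd.DataFrame({
    "SUBJECT_ID": [22, 23, 23, 24, 25],
    "HADM_ID": [165315, 152223, 124321, 161859, 129635],
})

# Number of distinct admissions per patient
admissions_per_patient = toy_admissions.groupby("SUBJECT_ID")["HADM_ID"].nunique()

# Patients with more than one admission
n_readmitted = int((admissions_per_patient > 1).sum())

# Average number of admissions per patient
mean_admissions = len(toy_admissions) / toy_admissions["SUBJECT_ID"].nunique()

print(n_readmitted, mean_admissions)
```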

|Column|Description|Type|
|:-----|:-----------|:----|
|ROW_ID|row index of the table; not important -> **DROP**|int|
|SUBJECT_ID|identifier of an individual patient|int|
|HADM_ID|identifier of a single hospital admission of a patient (range 100000 - 199999)|int|
|ADMITTIME|date and time the patient was admitted to the hospital|TIMESTAMP|
|DISCHTIME|date and time the patient was discharged from the hospital -> **DROP**|TIMESTAMP|
|DEATHTIME|time of in-hospital death, if applicable. Only present if the patient died in hospital, and almost always the same as the patient’s DISCHTIME.|TIMESTAMP|
|ADMISSION_TYPE| describes the type of the admission: ‘ELECTIVE’, ‘URGENT’, ‘NEWBORN’ or ‘EMERGENCY’. Emergency/urgent indicate unplanned medical care, and are often collapsed into a single category in studies. Elective indicates a previously planned hospital admission. Newborn indicates that the HADM_ID pertains to the patient’s birth.|string|    
|ADMISSION_LOCATION|provides information about the previous location of the patient prior to arriving at the hospital. There are 9 possible values: EMERGENCY ROOM ADMIT, TRANSFER FROM HOSP/EXTRAM, TRANSFER FROM OTHER HEALT, CLINIC REFERRAL/PREMATURE, INFO NOT AVAILABLE, TRANSFER FROM SKILLED NUR, TRSF WITHIN THIS FACILITY, HMO REFERRAL/SICK, PHYS REFERRAL/NORMAL DELI|string| 
|DISCHARGE_LOCATION|location the patient is discharged to -> **DROP**|string|
|INSURANCE|health insurance of the patient; not important -> **DROP**|string|
|LANGUAGE|native language; not important -> **DROP**|string|
|RELIGION|religious affiliation; not important -> **DROP**|string|
|MARITAL_STATUS|marital status; not important -> **DROP**|string|
|ETHNICITY|ethnicity; not important -> **DROP**|string|
|EDREGTIME|time the patient was registered in the emergency department; not important -> **DROP**|TIMESTAMP|
|EDOUTTIME|time the patient was discharged from the emergency department; not important -> **DROP**|TIMESTAMP|
|DIAGNOSIS|preliminary, free-text diagnosis for the patient on hospital admission. The diagnosis is usually assigned by the admitting clinician and does not use a systematic ontology.|string|
|HOSPITAL_EXPIRE_FLAG|indicates whether the patient died within the given hospitalization. 1 indicates death in the hospital, and 0 indicates survival to hospital discharge.|integer|
|HAS_CHARTEVENTS_DATA|indicates whether the admission has entries in the CHARTEVENTS table. 1 indicates that chart events data exists, and 0 that it does not.|integer|
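The table lists the time columns as timestamps, but `info()` reports them as `object`, i.e. they are still strings after `read_csv`. A minimal sketch of converting them, assuming `pd.to_datetime` with `errors="coerce"` is acceptable (unparseable or missing values become `NaT`); the toy frame reuses two rows from the head() output:

```python
import pandas as pd

# Toy stand-in for df_admissions with string timestamps, as read from the CSV
toy_admissions = pd.DataFrame({
    "HADM_ID": [165315, 152223],
    "ADMITTIME": ["2196-04-09 12:26:00", "2153-09-03 07:15:00"],
    "DEATHTIME": [None, None],  # both patients survived, so no death time
})

# Convert the string columns to real datetimes; missing values become NaT
for col in ["ADMITTIME", "DEATHTIME"]:
    toy_admissions[col] = pd.to_datetime(toy_admissions[col], errors="coerce")

print(toy_admissions.dtypes)
```

Alternatively, the columns could be parsed at load time via the `parse_dates` argument of `pd.read_csv`.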

In [None]:
# columns marked as "not important" in the table above
to_drop_admissions = ['ROW_ID', 'INSURANCE', 'LANGUAGE', 'RELIGION', 'MARITAL_STATUS',
                      'ETHNICITY', 'EDREGTIME', 'EDOUTTIME']
df_admissions.drop(to_drop_admissions, axis=1, inplace=True)

In [None]:
# drop the newborn admissions so that only adult admissions remain in the dataframe
df_admissions_adults = df_admissions[df_admissions.ADMISSION_TYPE != "NEWBORN"]

In [None]:
df_admissions_adults.shape

(51113, 19)

In [None]:
# How many adult admissions have no entries in CHARTEVENTS?
len(df_admissions_adults[df_admissions_adults.HAS_CHARTEVENTS_DATA == 0])

1492

In [None]:
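# Check how many of these admissions belong to patients with ARDS.
# Note: list(DataFrame) returns the column labels, so this assumes the
# HADM_IDs are stored in the header row of liste_patienten.csv.
liste_patienten = list(pd.read_csv('data/liste_patienten.csv'))
liste_patienten = [int(x) for x in liste_patienten]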

In [None]:
df_admissions_adults[df_admissions_adults.HAS_CHARTEVENTS_DATA == 0].HADM_ID.isin(liste_patienten).sum()


53

There are 1492 adult admissions without any entry in the CHARTEVENTS table. Only 53 of them belong to patients with ARDS. Because the data we need is stored in the CHARTEVENTS table, these admissions have to be excluded.
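The exclusion itself could look like the following sketch. It runs on toy data: the HADM_IDs and the `ards_hadm_ids` list are made up for illustration, and `toy_adults` stands in for `df_admissions_adults`.

```python
import pandas as pd

# Toy stand-in for df_admissions_adults (IDs are made up)
toy_adults = pd.DataFrame({
    "HADM_ID": [100001, 100002, 100003, 100004],
    "HAS_CHARTEVENTS_DATA": [1, 0, 1, 0],
})
# Hypothetical list of ARDS admissions, playing the role of liste_patienten
ards_hadm_ids = [100002, 100003]

# Admissions without any chart events
no_chart = toy_adults["HAS_CHARTEVENTS_DATA"] == 0
n_excluded = int(no_chart.sum())

# How many of the excluded admissions are ARDS admissions
n_excluded_ards = int(toy_adults.loc[no_chart, "HADM_ID"].isin(ards_hadm_ids).sum())

# Keep only admissions that have chart events; copy to avoid chained-assignment issues
toy_with_chart = toy_adults.loc[~no_chart].copy()

print(n_excluded, n_excluded_ards, len(toy_with_chart))
```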

**Overall mortality**

In [None]:
len(df_admissions[df_admissions.HOSPITAL_EXPIRE_FLAG == 1])

5854

Of all 58976 admissions, 5854 ended with the patient dying in hospital (about 10%).

**Adult mortality**

In [None]:
len(df_admissions_adults[df_admissions_adults.HOSPITAL_EXPIRE_FLAG == 1])

5792

Of all 51113 adult admissions, 5792 ended with the patient dying in hospital (about 11%).

The mortality rate of the newborns is 62 / 7863 ≈ 0.8%, far lower than among the adults.
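Because HOSPITAL_EXPIRE_FLAG is 0 or 1, its mean within a group is that group's in-hospital mortality rate. The sketch below first recomputes the rates from the counts reported above and then shows the group-wise variant on made-up toy rows; the `GROUP` column and the `toy` frame are illustrative names, not part of the notebook.

```python
import pandas as pd

# Rates recomputed from the counts reported above
n_all, n_adults = 58976, 51113
deaths_all, deaths_adults = 5854, 5792
rate_all = deaths_all / n_all
rate_adults = deaths_adults / n_adults
rate_newborn = (deaths_all - deaths_adults) / (n_all - n_adults)

# Group-wise variant on toy data (rows are made up)
toy = pd.DataFrame({
    "ADMISSION_TYPE": ["EMERGENCY", "ELECTIVE", "NEWBORN", "NEWBORN", "URGENT", "EMERGENCY"],
    "HOSPITAL_EXPIRE_FLAG": [1, 0, 0, 0, 0, 1],
})
is_newborn = toy["ADMISSION_TYPE"] == "NEWBORN"
toy["GROUP"] = is_newborn.map({True: "newborn", False: "non-newborn"})

# Mean of a 0/1 flag = share of admissions that ended in death
mortality_by_group = toy.groupby("GROUP")["HOSPITAL_EXPIRE_FLAG"].mean()

print(round(100 * rate_all, 1), round(100 * rate_adults, 1), round(100 * rate_newborn, 1))
print(mortality_by_group)
```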