# Analysis of Measurement Values and Their Thresholds
## Comparison of Original Values and Values in 'alarm_violations.csv'

In [None]:
import pandas as pd

CHUNK_SIZE = 1000
NROWS = 1000000
DATA_PATH = './mimic-iii-clinical-database-1.4/CHARTEVENTS.csv'

# Read only the two columns needed for the analysis, in chunks of CHUNK_SIZE rows
csv_iter = pd.read_csv(DATA_PATH, iterator=True, chunksize=CHUNK_SIZE, nrows=NROWS, usecols=['ITEMID', 'VALUENUM'])

# Concatenate the chunks directly; a preallocated list would keep placeholder
# entries (and make pd.concat fail) if the file had fewer than NROWS rows
df = pd.concat(csv_iter, axis=0)
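
Each section below filters `df` by a single `ITEMID` and calls `describe()`. The same ranges can also be collected in one pass. The following is a minimal sketch on a small synthetic DataFrame (the values are made up for illustration and are not taken from MIMIC-III); the helper name `range_per_item` is hypothetical.

```python
import pandas as pd

def range_per_item(df, item_ids):
    """Return count, min and max of VALUENUM for each requested ITEMID."""
    subset = df[df['ITEMID'].isin(item_ids)]
    return subset.groupby('ITEMID')['VALUENUM'].agg(['count', 'min', 'max'])

# Synthetic stand-in for the CHARTEVENTS extract (illustrative values only)
demo = pd.DataFrame({
    'ITEMID':   [220045, 220045, 220045, 220047, 220046, 220277],
    'VALUENUM': [72.0,   -88.0,  140.0,  50.0,   120.0,  98.0],
})

# Heart rate measurements plus their HIGH and LOW threshold item IDs
summary = range_per_item(demo, [220045, 220046, 220047])
print(summary)
```

On the real data, `range_per_item(df, [...])` would give one row per item ID, which makes it easier to put the ranges side by side.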

### Analysis of Heart Rate

In [None]:
hr_values = df[(df['ITEMID'] == 220045)]
hr_values.VALUENUM.describe() # measurements range from -88 to 6,632

The lower limit of the heart rate measurements coincides with the one in 'alarm_violations.csv', but the upper limit in that CSV is much higher (86,101). Additionally, a measured heart rate should lie between 0 and at most 480, or more conservatively 350.
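
To quantify how many heart rate measurements fall outside the plausible range mentioned above, one could count them explicitly. This is a minimal sketch, assuming the bounds 0 and 480 from the text and using made-up values instead of the real data; `count_out_of_range` is a hypothetical helper.

```python
import pandas as pd

def count_out_of_range(values, lower, upper):
    """Count non-missing values below `lower` or above `upper`."""
    values = values.dropna()
    too_low = int((values < lower).sum())
    too_high = int((values > upper).sum())
    return too_low, too_high

# Illustrative values only; -88 and 6632 mirror the extremes reported above
demo_hr = pd.Series([-88.0, 0.0, 72.0, 110.0, 480.0, 6632.0, float('nan')])

too_low, too_high = count_out_of_range(demo_hr, lower=0, upper=480)
print(f'{too_low} below range, {too_high} above range')
```

Applied to the notebook's data, the call would be `count_out_of_range(hr_values.VALUENUM, 0, 480)`.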

In [None]:
hr_low = df[(df['ITEMID'] == 220047)]
hr_low.VALUENUM.describe() # LOW thresholds range from 8 to 50,120

These LOW thresholds of the heart rate are broadly in line with the ones (10 to 85,160) generated in 'alarm_violations.csv', but both upper limits are implausibly high.

In [None]:
hr_high = df[(df['ITEMID'] == 220046)]
hr_high.VALUENUM.describe() # HIGH thresholds range from 10 to 1,230

These HIGH thresholds of the heart rate overlap with the ones (0 to 175) generated in 'alarm_violations.csv', but the upper limit of 1,230 is definitely too high.

### Analysis of Systolic Blood Pressure

Tbc.

### Analysis of O2 Saturation

In [None]:
o2sat_values = df[(df['ITEMID'] == 220277)]
o2sat_values.VALUENUM.describe() # measurements range from 0 to 100

In contrast to these values, the O2 saturation measurements in 'alarm_violations.csv' go up to 1,000.

In [None]:
o2sat_low = df[(df['ITEMID'] == 223770)]
o2sat_low.VALUENUM.describe() # LOW thresholds range from 2 to 90,100

These LOW thresholds of the O2 saturation coincide with the ones (50 to 90,100) generated in 'alarm_violations.csv'. Nevertheless, the maximum LOW threshold should be at most 99 and thus much lower than 90,100.

In [None]:
o2sat_high = df[(df['ITEMID'] == 223769)]
o2sat_high.VALUENUM.describe() # HIGH thresholds range from 10 to 1,000

In contrast to these values, the HIGH thresholds of the O2 saturation in 'alarm_violations.csv' only go up to 100 and thus look plausible, since a HIGH threshold should be at most 100. The maximum of 1,000 found in 'CHARTEVENTS.csv' is therefore implausible.
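
One way to keep such implausible thresholds from distorting the summary statistics is to mask them before calling `describe()`. This is a minimal sketch, assuming the upper bounds stated above (99 for LOW and 100 for HIGH thresholds) and, as an additional assumption, a lower bound of 0. The values are made up, and `mask_implausible` is a hypothetical helper.

```python
import pandas as pd

def mask_implausible(values, lower, upper):
    """Replace values outside [lower, upper] with NaN, keeping the index."""
    # Series.between is inclusive on both ends by default
    return values.where(values.between(lower, upper))

# Illustrative values only, not taken from CHARTEVENTS.csv
demo_low = pd.Series([2.0, 88.0, 90.0, 92.0, 90100.0])
demo_high = pd.Series([10.0, 100.0, 100.0, 1000.0])

# Lower bound 0 is an assumption; upper bounds follow the text
clean_low = mask_implausible(demo_low, lower=0, upper=99)
clean_high = mask_implausible(demo_high, lower=0, upper=100)

print(clean_low.describe())
print(clean_high.describe())
```

Because `describe()` ignores missing values, the masked entries no longer affect the reported minimum, maximum, or mean.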

### Analysis of Respiratory Rate

In [None]:
rr_values = df[(df['ITEMID'] == 220210)]
rr_values.VALUENUM.describe() # measurements range from 0 to 200

The respiratory rate measurements in 'alarm_violations.csv' range from 0 to 2.35 million, which completely contradicts the original values.

In [None]:
rr_low = df[(df['ITEMID'] == 224162)]
rr_low.VALUENUM.describe() # LOW thresholds range from 0 to 93

The maximum LOW threshold of the respiratory rate is 93, which contradicts the values in the millions found in 'alarm_violations.csv'.

In [None]:
rr_high = df[(df['ITEMID'] == 224161)]
rr_high.VALUENUM.describe() # HIGH thresholds range from 0 to 160

The HIGH thresholds of the respiratory rate in 'alarm_violations.csv' range from 0 to 55, which is plausible given the range found in 'CHARTEVENTS.csv'.
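
The plausibility argument used for the respiratory rate amounts to a containment check: a range from 'alarm_violations.csv' is considered plausible if it lies inside the corresponding range from 'CHARTEVENTS.csv'. A minimal sketch of that check, using the minima and maxima quoted in this section; `is_within` is a hypothetical helper, and containment is only one possible criterion.

```python
def is_within(inner, outer):
    """Return True if the (min, max) range `inner` lies inside `outer`."""
    inner_min, inner_max = inner
    outer_min, outer_max = outer
    return outer_min <= inner_min and inner_max <= outer_max

# (min, max) pairs as quoted in the text above
rr_values_chartevents = (0, 200)
rr_values_alarms = (0, 2_350_000)
rr_high_chartevents = (0, 160)
rr_high_alarms = (0, 55)

values_ok = is_within(rr_values_alarms, rr_values_chartevents)
high_ok = is_within(rr_high_alarms, rr_high_chartevents)
print(f'measurements plausible: {values_ok}, HIGH thresholds plausible: {high_ok}')
```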

### Analysis of Minute Volume

Tbc.