## General Insights into 'alarm_violations.csv'

In [None]:
import pandas as pd

PATH = ''
ALARM_VIOLATIONS = pd.read_csv(PATH + 'alarm_violations.csv')

ALARM_VIOLATIONS.head()

In [None]:
ALARM_VIOLATIONS['ITEMID'] = ALARM_VIOLATIONS['ITEMID'].astype(str)
ALARM_VIOLATIONS['ICUSTAY_ID'] = ALARM_VIOLATIONS['ICUSTAY_ID'].astype(str)
ALARM_VIOLATIONS.describe()

### How many unique ICU stays exist in the MIMIC-III data set?
There are 19,968 unique ICU stays in 'alarm_violations.csv', that is, stays with at least one alarm violation.

In [None]:
unique_ICU_stays = ALARM_VIOLATIONS["ICUSTAY_ID"].value_counts()
len(unique_ICU_stays)

### How often is an alarm raised? (= row count)
There are 388,209 triggered alarms.

In [None]:
len(ALARM_VIOLATIONS)

## Stratify Alarms by ITEM ID and TYPE (HIGH/LOW)

### Are alarms triggered more by falling below or exceeding thresholds?
Approximately 11,000 more alarms were triggered by exceeding a threshold than by falling below one.

In [None]:
ALARM_VIOLATIONS_HIGH = ALARM_VIOLATIONS[(ALARM_VIOLATIONS['THRESHOLD_TYPE'] == 'HIGH')]
print("HIGH Alarms:",len(ALARM_VIOLATIONS_HIGH))

ALARM_VIOLATIONS_LOW = ALARM_VIOLATIONS[(ALARM_VIOLATIONS['THRESHOLD_TYPE'] == 'LOW')]
print("LOW Alarms:",len(ALARM_VIOLATIONS_LOW))
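
The comparison of HIGH and LOW alarms can also be obtained in a single pass with `value_counts()`. The following is a minimal sketch on made-up rows; the `toy` frame and the variable names are assumptions for illustration, not part of the original notebook.

```python
import pandas as pd

# Toy data standing in for ALARM_VIOLATIONS (values are made up).
toy = pd.DataFrame({"THRESHOLD_TYPE": ["HIGH", "LOW", "HIGH", "HIGH", "LOW"]})

# One pass over the column yields the count for every threshold type.
type_counts = toy["THRESHOLD_TYPE"].value_counts()

# Difference between exceeded (HIGH) and undershot (LOW) thresholds.
high_minus_low = type_counts.get("HIGH", 0) - type_counts.get("LOW", 0)

print(type_counts.to_dict())
print("HIGH minus LOW:", high_minus_low)
```

On the real data, the same difference should correspond to the roughly 11,000 alarms mentioned above.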

### How often are the respective ITEM IDs affected by an alarm being triggered?
Most ITEM IDs triggered tens of thousands of alarm violations. Only the thresholds of the Minute Volume parameter were exceeded (534) or undershot (1,860) significantly less often.

In [None]:
ALARM_VIOLATIONS_STRATIFIED = ALARM_VIOLATIONS\
    .groupby(['ITEMID','THRESHOLD_TYPE'])\
    .size()\
    .reset_index(name='count')
print(ALARM_VIOLATIONS_STRATIFIED)
ALARM_VIOLATIONS_STRATIFIED.dtypes
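
A long-format count table lists only the ITEM ID / type combinations that actually occur. As a hedged sketch on made-up rows (the `toy` frame is an assumption), `unstack(fill_value=0)` produces a wide table in which missing combinations appear as explicit zeros:

```python
import pandas as pd

# Made-up rows; the real notebook uses ALARM_VIOLATIONS instead.
toy = pd.DataFrame({
    "ITEMID": ["220045", "220045", "220045", "224687"],
    "THRESHOLD_TYPE": ["HIGH", "LOW", "HIGH", "LOW"],
})

# Count rows per (ITEMID, THRESHOLD_TYPE) and spread the types into columns.
# fill_value=0 makes combinations without any alarm show up as 0, not NaN.
counts = (
    toy.groupby(["ITEMID", "THRESHOLD_TYPE"])
    .size()
    .unstack(fill_value=0)
)

print(counts)
```

Explicit zeros keep the bars aligned if an ITEM ID has alarms of only one type.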

### Bar Chart Visualization

In [None]:
import numpy as np

ALARM_VIOLATIONS_STRATIFIED_T= ALARM_VIOLATIONS_STRATIFIED.pivot(index='ITEMID', columns='THRESHOLD_TYPE', values='count')
ALARM_VIOLATIONS_STRATIFIED_T

In [None]:
import matplotlib.pyplot as plt

# define figure
fig, ax = plt.subplots(1, figsize=(16, 6))
# numerical x
x = np.arange(0, len(ALARM_VIOLATIONS_STRATIFIED_T.index))
# plot bars
plt.bar(x - 0.1, ALARM_VIOLATIONS_STRATIFIED_T['LOW'], width = 0.2, color = '#1D2F6F')
plt.bar(x + 0.1, ALARM_VIOLATIONS_STRATIFIED_T['HIGH'], width = 0.2, color = '#8390FA')

# x and y details
plt.xlabel('ITEM ID',fontsize=16)
plt.ylabel('Alarm Counts',fontsize=16)
plt.xticks(x, ALARM_VIOLATIONS_STRATIFIED_T.index)

# title and legend
plt.title('Alarm Counts by ITEM ID and TYPE', fontsize=18)
plt.legend(['LOW','HIGH'], loc='upper left', ncol = 2)

plt.show()

## How many alarm violations exist per ICU stay?

In [None]:
unique_ICU_stays = ALARM_VIOLATIONS["ICUSTAY_ID"].value_counts()
df_unique_ICU_stays = pd.DataFrame(unique_ICU_stays)

df_unique_ICU_stays = df_unique_ICU_stays.reset_index()
df_unique_ICU_stays.columns = ['ICUSTAY_ID','AlarmCount']
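# Remove only a literal trailing '.0'; str.rstrip('.0') would also strip trailing zeros of the ID itself
df_unique_ICU_stays['ICUSTAY_ID'] = df_unique_ICU_stays['ICUSTAY_ID'].str.replace(r'\.0$', '', regex=True)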
df_unique_ICU_stays.describe()
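
A caveat when removing a trailing `.0` from ID strings: `str.rstrip` interprets its argument as a set of characters, not as a suffix. The sketch below uses made-up IDs (not taken from the data set) to contrast it with an anchored regular expression.

```python
import pandas as pd

# Made-up IDs in the string form that a float column produces.
ids = pd.Series(["200010.0", "200100.0", "234567.0"])

# rstrip removes every trailing '.' or '0', so IDs ending in 0 get shortened.
stripped_chars = ids.str.rstrip(".0")

# The anchored regex removes exactly one trailing ".0" and nothing else.
stripped_suffix = ids.str.replace(r"\.0$", "", regex=True)

print(stripped_chars.tolist())   # ['20001', '2001', '234567']
print(stripped_suffix.tolist())  # ['200010', '200100', '234567']
```

Shortened IDs can collide with other stays, so the suffix-only variant is the safer choice when the IDs are used for lookups later.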

In [None]:
mean_alarms_per_stay = df_unique_ICU_stays['AlarmCount'].mean()
median_alarms_per_stay = df_unique_ICU_stays['AlarmCount'].median()
min_alarms_per_stay = df_unique_ICU_stays['AlarmCount'].min()
max_alarms_per_stay = df_unique_ICU_stays['AlarmCount'].max()
print('Mean Alarms per Stay:',mean_alarms_per_stay)
print('Median Alarms per Stay:',median_alarms_per_stay)
print('Min Alarms per Stay:',min_alarms_per_stay)
print('Max Alarms per Stay:',max_alarms_per_stay)

### Strip Plot Visualization
There were 2,490 ICU stays with only one alarm.

In [None]:
count_alarm_numbers = df_unique_ICU_stays['AlarmCount'].value_counts()
count_alarm_numbers

In [None]:
import seaborn as sns
import matplotlib.pyplot as plt

sns.set_style("whitegrid")

fig, ax = plt.subplots(1, figsize=(16, 6))

sns.stripplot(data=df_unique_ICU_stays,x='AlarmCount')
plt.title("Alarm Count - Strip Plot",fontsize=18)
plt.xlabel("Alarm Count",fontsize=16)

plt.show()

## Analysis of the Alarm Counts per ICU Stay by ITEM ID

In [None]:
# Create dataframe
unique_ICU_stays_by_ItemId = ALARM_VIOLATIONS\
    .groupby(['ITEMID','ICUSTAY_ID'])\
    .size()\
    .reset_index(name='AlarmCount')
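# Remove only a literal trailing '.0' (str.rstrip('.0') would also strip trailing zeros of the ID)
unique_ICU_stays_by_ItemId['ICUSTAY_ID'] = unique_ICU_stays_by_ItemId['ICUSTAY_ID'].str.replace(r'\.0$', '', regex=True)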
unique_ICU_stays_by_ItemId.sort_values(by=['AlarmCount'], inplace=True)
unique_ICU_stays_by_ItemId
# ICUSTAY_ID with highest alarm count for ITEM ID 220277 equals the one above in the strip plot with approx. 14,000 alarms

In [None]:
unique_ICU_stays_by_220045= unique_ICU_stays_by_ItemId[(unique_ICU_stays_by_ItemId["ITEMID"] =="220045")]
unique_ICU_stays_by_220045.describe()

In [None]:
unique_ICU_stays_by_220179= unique_ICU_stays_by_ItemId[(unique_ICU_stays_by_ItemId["ITEMID"] =="220179")]
unique_ICU_stays_by_220179.describe()

In [None]:
unique_ICU_stays_by_220210= unique_ICU_stays_by_ItemId[(unique_ICU_stays_by_ItemId["ITEMID"] =="220210")]
unique_ICU_stays_by_220210.describe()

In [None]:
unique_ICU_stays_by_220277= unique_ICU_stays_by_ItemId[(unique_ICU_stays_by_ItemId["ITEMID"] =="220277")]
unique_ICU_stays_by_220277.describe()

In [None]:
unique_ICU_stays_by_224687= unique_ICU_stays_by_ItemId[(unique_ICU_stays_by_ItemId["ITEMID"] =="224687")]
unique_ICU_stays_by_224687.describe()
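
The per-ITEM-ID summaries can also be produced in one table with `groupby(...).describe()`. A minimal sketch on made-up counts (the `toy` frame is an assumption):

```python
import pandas as pd

# Made-up alarm counts per (ITEMID, ICUSTAY_ID).
toy = pd.DataFrame({
    "ITEMID": ["220045", "220045", "220277", "220277", "220277"],
    "ICUSTAY_ID": ["1", "2", "1", "2", "3"],
    "AlarmCount": [4, 10, 1, 2, 9],
})

# One row of summary statistics per ITEMID instead of one cell per ITEMID.
per_item = toy.groupby("ITEMID")["AlarmCount"].describe()

print(per_item[["count", "mean", "min", "max"]])
```

This avoids one filtered data frame per ITEM ID when only the summary statistics are needed.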

In [None]:
import seaborn as sns
import matplotlib.pyplot as plt

sns.set_style("whitegrid")

fig, ax = plt.subplots(1, figsize=(16, 6))

sns.stripplot(x="ITEMID", y="AlarmCount", data=unique_ICU_stays_by_ItemId)
plt.title("Alarm Count by ITEM ID - Strip Plot",fontsize=18)
plt.xlabel("ITEM ID",fontsize=16)
plt.ylabel("Alarm Count",fontsize=16)
plt.gca().set_ylim(bottom=0)
plt.show()

## Analysis of Difference Between Actual Values and Thresholds

In [None]:
# create new column that shows the dif between actual and threshold
ALARM_VIOLATIONS['DIF_ACTUAL_TH'] = ALARM_VIOLATIONS['VALUENUM'] - ALARM_VIOLATIONS['THRESHOLD_VALUE']
ALARM_VIOLATIONS.head()

In [None]:
# analyze dif by item id
dif_analysis = ALARM_VIOLATIONS.groupby('ITEMID').describe()
dif_analysis = dif_analysis["DIF_ACTUAL_TH"].round(2)
dif_analysis
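
Because the difference is computed as actual value minus threshold, HIGH violations are expected to be positive and LOW violations negative. To compare both types on a common scale, the absolute value can be used. The sketch below works on made-up measurements; the column name `VIOLATION_MAGNITUDE` is hypothetical.

```python
import pandas as pd

# Made-up measurements; the real notebook uses ALARM_VIOLATIONS.
toy = pd.DataFrame({
    "THRESHOLD_TYPE": ["HIGH", "LOW", "HIGH"],
    "VALUENUM": [130.0, 45.0, 121.0],
    "THRESHOLD_VALUE": [120.0, 50.0, 120.0],
})

# Signed difference: positive above the threshold, negative below it.
toy["DIF_ACTUAL_TH"] = toy["VALUENUM"] - toy["THRESHOLD_VALUE"]

# Hypothetical helper column: distance from the threshold regardless of direction.
toy["VIOLATION_MAGNITUDE"] = toy["DIF_ACTUAL_TH"].abs()

median_by_type = toy.groupby("THRESHOLD_TYPE")["VIOLATION_MAGNITUDE"].median()
print(median_by_type)
```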

Analyze difference for each Item ID:

### 220045 - Heart Rate 

In [None]:
ALARM_VIOLATIONS_220045 = ALARM_VIOLATIONS[(ALARM_VIOLATIONS['ITEMID'] == '220045')]
ALARM_VIOLATIONS_220045

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns

sns.set_style("whitegrid")
ALARM_VIOLATIONS_220045 = ALARM_VIOLATIONS[(ALARM_VIOLATIONS['ITEMID'] == '220045')]

fig, axs = plt.subplots(1, 2, figsize=(25, 5))
fig.suptitle("Difference Between Actual and Threshold - 220045", fontsize=18)

sns.stripplot(data=ALARM_VIOLATIONS_220045,x='THRESHOLD_TYPE', y='DIF_ACTUAL_TH', ax=axs[0])
#axs[0].set_title("Difference Between Actual and Threshold - 220045 Scatter Plot",fontsize=14)
axs[0].set_ylabel("Difference Between Actual and Threshold",fontsize=14)
axs[0].set_xlabel("Threshold Type",fontsize=14)

sns.boxplot(data=ALARM_VIOLATIONS_220045,x='THRESHOLD_TYPE', y='DIF_ACTUAL_TH', ax=axs[1])
#axs[1].set_title("Difference Between Actual and Threshold - 220045 Boxplot")
axs[1].set_ylabel("Difference Between Actual and Threshold", fontsize=14)
axs[1].set_xlabel("Threshold Type",fontsize=14)

#sns.histplot(data=ALARM_VIOLATIONS_220045, x='DIF_ACTUAL_TH', ax=axs[2])
#axs[2].set_title("HR_violations_clean histogram")
#axs[2].set_xlabel("HR_violations_clean VALUENUM")

plt.show()

In [None]:
ALARM_VIOLATIONS_220045.describe()
ALARM_VIOLATIONS_220045_H = ALARM_VIOLATIONS_220045[(ALARM_VIOLATIONS_220045['THRESHOLD_TYPE'] == 'HIGH')]
ALARM_VIOLATIONS_220045_H.describe()

In [None]:
ALARM_VIOLATIONS_220045_L = ALARM_VIOLATIONS_220045[(ALARM_VIOLATIONS_220045['THRESHOLD_TYPE'] == 'LOW')]
ALARM_VIOLATIONS_220045_L.describe()

### 220179 - Non-Invasive Blood Pressure (Systolic)

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns

sns.set_style("whitegrid")
ALARM_VIOLATIONS_220179 = ALARM_VIOLATIONS[(ALARM_VIOLATIONS['ITEMID'] == '220179')]

fig, axs = plt.subplots(1, 2, figsize=(25, 5))
fig.suptitle("Difference Between Actual and Threshold - 220179", fontsize=18)

sns.stripplot(data=ALARM_VIOLATIONS_220179,x='THRESHOLD_TYPE', y='DIF_ACTUAL_TH', ax=axs[0])
#axs[0].set_title("Difference Between Actual and Threshold - 220045 Scatter Plot",fontsize=14)
axs[0].set_ylabel("Difference Between Actual and Threshold",fontsize=14)
axs[0].set_xlabel("Threshold Type",fontsize=14)

sns.boxplot(data=ALARM_VIOLATIONS_220179,x='THRESHOLD_TYPE', y='DIF_ACTUAL_TH', ax=axs[1])
#axs[1].set_title("Difference Between Actual and Threshold - 220045 Boxplot")
axs[1].set_ylabel("Difference Between Actual and Threshold", fontsize=14)
axs[1].set_xlabel("Threshold Type",fontsize=14)

#sns.histplot(data=ALARM_VIOLATIONS_220179, x='DIF_ACTUAL_TH', ax=axs[2])
#axs[2].set_title("HR_violations_clean histogram")
#axs[2].set_xlabel("HR_violations_clean VALUENUM")

plt.show()

### 220210 - Respiratory Rate

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns

sns.set_style("whitegrid")
ALARM_VIOLATIONS_220210 = ALARM_VIOLATIONS[(ALARM_VIOLATIONS['ITEMID'] == '220210')]

fig, axs = plt.subplots(1, 2, figsize=(25, 5))
fig.suptitle("Difference Between Actual and Threshold - 220210", fontsize=18)

sns.stripplot(data=ALARM_VIOLATIONS_220210,x='THRESHOLD_TYPE', y='DIF_ACTUAL_TH', ax=axs[0])
#axs[0].set_title("Difference Between Actual and Threshold - 220045 Scatter Plot",fontsize=14)
axs[0].set_ylabel("Difference Between Actual and Threshold",fontsize=14)
axs[0].set_xlabel("Threshold Type",fontsize=14)

sns.boxplot(data=ALARM_VIOLATIONS_220210,x='THRESHOLD_TYPE', y='DIF_ACTUAL_TH', ax=axs[1])
#axs[1].set_title("Difference Between Actual and Threshold - 220045 Boxplot")
axs[1].set_ylabel("Difference Between Actual and Threshold", fontsize=14)
axs[1].set_xlabel("Threshold Type",fontsize=14)

#sns.histplot(data=ALARM_VIOLATIONS_220210, x='DIF_ACTUAL_TH', ax=axs[2])
#axs[2].set_title("HR_violations_clean histogram")
#axs[2].set_xlabel("HR_violations_clean VALUENUM")

plt.show()

### 220277 - O2 Saturation Pulseoxymetry

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns

sns.set_style("whitegrid")
ALARM_VIOLATIONS_220277 = ALARM_VIOLATIONS[(ALARM_VIOLATIONS['ITEMID'] == '220277')]

fig, axs = plt.subplots(1, 2, figsize=(25, 5))
fig.suptitle("Difference Between Actual and Threshold - 220277", fontsize=18)

sns.stripplot(data=ALARM_VIOLATIONS_220277,x='THRESHOLD_TYPE', y='DIF_ACTUAL_TH', ax=axs[0])
#axs[0].set_title("Difference Between Actual and Threshold - 220045 Scatter Plot",fontsize=14)
axs[0].set_ylabel("Difference Between Actual and Threshold",fontsize=14)
axs[0].set_xlabel("Threshold Type",fontsize=14)

sns.boxplot(data=ALARM_VIOLATIONS_220277,x='THRESHOLD_TYPE', y='DIF_ACTUAL_TH', ax=axs[1])
#axs[1].set_title("Difference Between Actual and Threshold - 220045 Boxplot")
axs[1].set_ylabel("Difference Between Actual and Threshold", fontsize=14)
axs[1].set_xlabel("Threshold Type",fontsize=14)

#sns.histplot(data=ALARM_VIOLATIONS_220277, x='DIF_ACTUAL_TH', ax=axs[2])
#axs[2].set_title("HR_violations_clean histogram")
#axs[2].set_xlabel("HR_violations_clean VALUENUM")

plt.show()

### 224687 - Minute Volume

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style("whitegrid")
ALARM_VIOLATIONS_224687 = ALARM_VIOLATIONS[(ALARM_VIOLATIONS['ITEMID'] == '224687')]

fig, axs = plt.subplots(1, 2, figsize=(25, 5))
fig.suptitle("Difference Between Actual and Threshold - 224687", fontsize=18)

sns.stripplot(data=ALARM_VIOLATIONS_224687,x='THRESHOLD_TYPE', y='DIF_ACTUAL_TH', ax=axs[0])
#axs[0].set_title("Difference Between Actual and Threshold - 220045 Scatter Plot",fontsize=14)
axs[0].set_ylabel("Difference Between Actual and Threshold",fontsize=14)
axs[0].set_xlabel("Threshold Type",fontsize=14)

sns.boxplot(data=ALARM_VIOLATIONS_224687,x='THRESHOLD_TYPE', y='DIF_ACTUAL_TH', ax=axs[1])
#axs[1].set_title("Difference Between Actual and Threshold - 220045 Boxplot")
axs[1].set_ylabel("Difference Between Actual and Threshold", fontsize=14)
axs[1].set_xlabel("Threshold Type",fontsize=14)

#sns.histplot(data=ALARM_VIOLATIONS_224687, x='DIF_ACTUAL_TH', ax=axs[2])
#axs[2].set_title("HR_violations_clean histogram")
#axs[2].set_xlabel("HR_violations_clean VALUENUM")

plt.show()

## Analysis of Time Between Setting a Threshold and Raising an Alarm

### Creation of Additional Columns

In [None]:
import pandas as pd

set_threshold = pd.to_datetime(ALARM_VIOLATIONS['THRESHOLD_CHARTTIME'])
raised_alarm = pd.to_datetime(ALARM_VIOLATIONS['CHARTTIME'])

ALARM_VIOLATIONS['TIME_UNTIL_ALARM'] = pd.to_timedelta(raised_alarm - set_threshold)

ALARM_VIOLATIONS['SEC_UNTIL_ALARM'] = ALARM_VIOLATIONS['TIME_UNTIL_ALARM']\
    .dt\
    .total_seconds()\
    .astype(int)

ALARM_VIOLATIONS.head()

In [None]:
time_with_sec_info = ALARM_VIOLATIONS[ALARM_VIOLATIONS['SEC_UNTIL_ALARM'] % 60 != 0]
time_with_sec_info.SEC_UNTIL_ALARM.describe()

Since 'alarm_violations.csv' contains no information at the level of seconds, we can only examine the time difference with minute accuracy.
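
The check for sub-minute information and the conversion to whole minutes can be sketched on made-up timestamps as follows. The `toy` frame and its dates are assumptions; only the column names come from the notebook.

```python
import pandas as pd

# Made-up timestamps with the same column names as in the notebook.
toy = pd.DataFrame({
    "THRESHOLD_CHARTTIME": ["2130-01-01 08:00:00", "2130-01-01 08:00:00"],
    "CHARTTIME": ["2130-01-01 08:00:00", "2130-01-01 09:30:00"],
})

# Time between setting the threshold and the alarm.
delta = pd.to_datetime(toy["CHARTTIME"]) - pd.to_datetime(toy["THRESHOLD_CHARTTIME"])
seconds = delta.dt.total_seconds()

# True if at least one difference is not a whole number of minutes.
has_sub_minute_info = (seconds % 60 != 0).any()

# Whole minutes via floor division.
minutes = (seconds // 60).astype(int)

print(has_sub_minute_info)
print(minutes.tolist())
```

If `has_sub_minute_info` is `False`, nothing is lost by working in minutes.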

In [None]:
ALARM_VIOLATIONS['MIN_UNTIL_ALARM'] = ALARM_VIOLATIONS['SEC_UNTIL_ALARM']\
    .divide(60)\
    .astype(int)

del ALARM_VIOLATIONS['SEC_UNTIL_ALARM']

ALARM_VIOLATIONS.head()

### Passed Time of All Triggered Alarms

In [None]:
import seaborn as sns

sns.set_style("whitegrid")
fig, axs = plt.subplots(
    2,
    1,
    figsize = (10, 15),
    sharex = True,
    dpi = 72)
fig.suptitle('Minutes Until Alarm is Triggered', fontweight='bold', color= 'black', fontsize=14, y=0.9)
fig.subplots_adjust(hspace = 0.1)

sns.stripplot(
    ax = axs[0],
    data = ALARM_VIOLATIONS,
    x = 'MIN_UNTIL_ALARM',
    palette = sns.color_palette("colorblind")
    )
axs[0].set_xlabel("")
axs[0].grid(True, which='both')
axs[0].margins(.1)

sns.boxplot(
    ax = axs[1],
    data = ALARM_VIOLATIONS,
    x = 'MIN_UNTIL_ALARM',
    palette = sns.color_palette("colorblind")
    )
axs[1].set_xlabel("Minutes")
axs[1].grid(True, which='both')
axs[1].margins(.1)

ALARM_VIOLATIONS.MIN_UNTIL_ALARM.describe()

As expected, most of these approx. 390,000 alarms were triggered within 0 to approx. 10,000 minutes after the threshold was set, with a descending trend. The longest time until an alarm was triggered is approximately 22 days.

In [None]:
instant_alarms = ALARM_VIOLATIONS[ALARM_VIOLATIONS['MIN_UNTIL_ALARM'] < 1]
len(instant_alarms)

Among the approx. 390,000 triggered alarms, 13,963 were triggered within the first minute after the threshold was set. These alarms should be removed, as we assume that most of them are followed by a threshold correction within a few seconds.

In [None]:
cleaned_alarms = ALARM_VIOLATIONS[ALARM_VIOLATIONS['MIN_UNTIL_ALARM'] >= 1]
cleaned_alarms.MIN_UNTIL_ALARM.describe()

### First 15 Minutes After Setting a Threshold

In [None]:
alarms_within_15min = ALARM_VIOLATIONS[(ALARM_VIOLATIONS['MIN_UNTIL_ALARM'] >= 1) & (ALARM_VIOLATIONS['MIN_UNTIL_ALARM'] <= 15)]

In [None]:
import seaborn as sns

sns.histplot(
    data=alarms_within_15min,
    x='MIN_UNTIL_ALARM',
    kde=True,
    bins=np.arange(1, 17) - 0.5)

plt.title(
    'Alarms Triggered After 1 to 15 Minutes',
    fontsize=12,
    fontweight='bold')
plt.xticks(range(1, 16, 2))
plt.xlabel('Minutes', fontsize=12)
plt.ylabel('Count', fontsize=12)

alarms_within_15min.MIN_UNTIL_ALARM.describe()

Between one and two minutes after setting the threshold, 1,482 alarms are triggered. Perhaps these alarms should also be removed, as they can occur very shortly after the first minute has passed. It may be that a nurse noticed only then that an incorrect value had been entered.

The numbers of alarms that occur more than two minutes after setting a threshold seem plausible.
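
The per-minute counts behind such statements can be read from a frequency table instead of the histogram. A minimal sketch on made-up values (the `toy` frame is an assumption):

```python
import pandas as pd

# Made-up minute values; 0 and 20 fall outside the 1 to 15 minute window.
toy = pd.DataFrame({"MIN_UNTIL_ALARM": [0, 0, 1, 1, 1, 2, 5, 5, 20]})

# Series.between is inclusive on both ends by default.
window = toy[toy["MIN_UNTIL_ALARM"].between(1, 15)]

# Number of alarms per minute, ordered by minute.
per_minute = window["MIN_UNTIL_ALARM"].value_counts().sort_index()

print(per_minute.to_dict())  # {1: 3, 2: 1, 5: 2}
```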

In [None]:
time_until_alarm_stratified_by_itemid = alarms_within_15min\
    .groupby(['ITEMID', 'MIN_UNTIL_ALARM'])\
    .size()\
    .reset_index(name='Count')

time_until_alarm_stratified_by_itemid.head()

In [None]:
sns.histplot(
    data=alarms_within_15min,
    x='MIN_UNTIL_ALARM',
    hue='ITEMID',
    multiple='stack',
    palette=sns.color_palette('colorblind', n_colors=5),
    bins=np.arange(1, 17) - 0.5)

plt.title(
    'Alarms Triggered After 1 to 15 Minutes (Stratified by ITEM ID)',
    fontsize=12,
    fontweight='bold')
plt.xticks(range(1, 16, 2))
plt.xlabel('Minutes', fontsize=12)
plt.ylabel('Count', fontsize=12)

In the stratified view, you can see that most alarms were triggered by a systolic blood pressure that was too low or too high, regardless of how many minutes have passed. The other parameter thresholds are exceeded or undershot with approximately the same frequency.
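
Whether one ITEM ID dominates in every minute can be checked numerically with a row-normalized cross table. The sketch below uses made-up rows (the `toy` frame is an assumption); each row of `share` sums to 1.

```python
import pandas as pd

# Made-up alarms; the real notebook would use alarms_within_15min.
toy = pd.DataFrame({
    "ITEMID": ["220179", "220179", "220045", "220179", "220210", "220179"],
    "MIN_UNTIL_ALARM": [1, 1, 1, 2, 2, 2],
})

# Share of each ITEMID among the alarms of the same minute.
share = pd.crosstab(toy["MIN_UNTIL_ALARM"], toy["ITEMID"], normalize="index")

print(share.round(2))
```

An ITEM ID whose share stays high across all rows dominates regardless of the elapsed time.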

## Open Questions to Answer with the Whole CHARTEVENTS Table

* Can something be deduced from the time of day? E.g., do more alarms occur at night than during the day?
* After how many minutes, on average, is an alarm triggered after setting a threshold (stratified by ITEM ID)?
* Hypothesis: "The more frequently per time unit a nurse adjusts the threshold, the more likely there is a violation."
* We need an extended data set to predict violations (see alarm_violations_extended_by_normal_measurements.png)
* We would also need to join the patient data again in order to get the patient demographics
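
As a starting point for the first two questions, the following sketch labels made-up alarms as day or night and averages the minutes until an alarm per ITEM ID. The `toy` frame, the `IS_NIGHT` column, and the night window of 22:00 to 05:59 are assumptions for illustration, not definitions from the data set.

```python
import pandas as pd

# Made-up alarms with the column names used in this notebook.
toy = pd.DataFrame({
    "ITEMID": ["220045", "220045", "220277"],
    "CHARTTIME": ["2130-01-01 03:15:00", "2130-01-01 14:00:00", "2130-01-02 23:30:00"],
    "MIN_UNTIL_ALARM": [10, 30, 50],
})

hour = pd.to_datetime(toy["CHARTTIME"]).dt.hour

# Hypothetical definition: night covers 22:00 to 05:59.
toy["IS_NIGHT"] = (hour >= 22) | (hour < 6)

# Number of alarms per period of the day.
alarms_by_period = toy["IS_NIGHT"].map({True: "night", False: "day"}).value_counts()

# Average waiting time until an alarm, stratified by ITEMID.
mean_minutes_by_item = toy.groupby("ITEMID")["MIN_UNTIL_ALARM"].mean()

print(alarms_by_period.to_dict())
print(mean_minutes_by_item.to_dict())
```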