## SOFA / SAPS-I EVALUATION FOR COMPARISON PURPOSES

#### Introduction


<U>**SAPS-I**</U> (Simplified Acute Physiology Score I) evaluates parameters such as age, heart rate, systolic blood pressure, body temperature, respiratory rate, urine output, and neurologic status within the first 24 hours of ICU admission, focusing on acute physiology.

In contrast, <U>**SOFA**</U> (Sequential Organ Failure Assessment) assesses organ dysfunction by scoring parameters related to six organ systems: respiratory, coagulation, liver, cardiovascular, central nervous system, and renal, over the past 24 hours, providing a broader assessment of organ dysfunction over time.


To compare the results of the models developed in this project, we need to calculate the recall and precision of both methods in predicting death.
Since both scores map onto a numerical risk scale with low, medium and high values, we binarize them as follows:
 - low probability = 0
 - high probability = 1
 - medium probability = 0 (in order to give the scoring model the highest possible score)

Based on this we will calculate the recall and precision of both prediction models.
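
As a reminder of what the two metrics measure, here is a minimal sketch on made-up labels. The lists `y_true` and `y_pred` are hypothetical toy data, not values from the outcomes files; `precision_score` and `recall_score` from scikit-learn compute the same quantities for binary labels.

```python
# Toy labels: 1 = died in hospital, 0 = survived (hypothetical data).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
# Toy binary predictions, e.g. a score thresholded into high risk (1) or not (0).
y_pred = [1, 0, 0, 1, 1, 0, 0, 0]

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # deaths flagged as high risk
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # survivors flagged as high risk
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # deaths not flagged

# Precision: share of high-risk predictions that were actual deaths.
precision = tp / (tp + fp)
# Recall: share of actual deaths that were flagged as high risk.
recall = tp / (tp + fn)

print(f"precision = {precision:.3f}, recall = {recall:.3f}")
```

Mapping the medium band to 0 shrinks the set of positive predictions, which tends to favor precision at the cost of recall.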

STEP 1: Load data

In [49]:
import pandas as pd
import numpy as np
from sklearn.metrics import recall_score, precision_score

In [50]:
df_target_a = pd.read_csv("data/outcomes/Outcomes-a.txt")
df_target_b = pd.read_csv("data/outcomes/Outcomes-b.txt")
df_target_c = pd.read_csv("data/outcomes/Outcomes-c.txt")

In [51]:
df = pd.concat([df_target_a, df_target_b, df_target_c], ignore_index=True)
df.head()

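| | RecordID | SAPS-I | SOFA | Length_of_stay | Survival | In-hospital_death |
|---|---|---|---|---|---|---|
| 0 | 142675 | 27 | 14 | 9 | 7 | 1 |
| 1 | 142676 | 12 | 1 | 31 | 468 | 0 |
| 2 | 142680 | 12 | 7 | 17 | 16 | 1 |
| 3 | 142683 | 19 | 15 | 17 | -1 | 0 |
| 4 | 142688 | 3 | 0 | 9 | -1 | 0 |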


In [52]:
df.shape

(12000, 6)

STEP 2: Replace the actual score values of each prediction model with binary labels, following the rule defined in the introduction:
 - SOFA: 
    - Low risk: SOFA score ≤ 5 --> will be set to 0
    - Moderate risk: SOFA score 6-11 --> will be set to 0
    - High risk: SOFA score ≥ 12 --> will be set to 1
 - SAPS-I: 
    - Low risk: SAPS-I score ≤ 10 --> will be set to 0
    - Moderate risk: SAPS-I score 11-20 --> will be set to 0
    - High risk: SAPS-I score ≥ 21 --> will be set to 1
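
The banding above can be sketched with `pd.cut`, keeping the three risk levels visible before they are collapsed to a binary label. This is only an illustration under assumptions: the helper `to_binary_risk` and the toy scores are hypothetical and not part of the notebook, and the thresholds are the ones listed above.

```python
import numpy as np
import pandas as pd


def to_binary_risk(scores: pd.Series, low_max: int, moderate_max: int) -> pd.DataFrame:
    """Hypothetical helper: band integer scores, then flag only the high band."""
    # Right-closed bins: (-inf, low_max], (low_max, moderate_max], (moderate_max, inf)
    bands = pd.cut(
        scores,
        bins=[-np.inf, low_max, moderate_max, np.inf],
        labels=["low", "moderate", "high"],
    )
    # Low and moderate both map to 0; only high maps to 1.
    binary = (bands == "high").astype(int)
    return pd.DataFrame({"score": scores, "band": bands, "binary": binary})


# Toy SOFA scores around the band edges (made-up values).
toy_sofa = pd.Series([0, 5, 6, 11, 12, 15])
sofa_table = to_binary_risk(toy_sofa, low_max=5, moderate_max=11)
print(sofa_table)

# The same helper with the SAPS-I thresholds.
toy_saps = pd.Series([3, 10, 11, 20, 21, 27])
saps_table = to_binary_risk(toy_saps, low_max=10, moderate_max=20)
print(saps_table)
```

Keeping the intermediate `band` column makes it easy to check how many records fall into the moderate group that is being counted as 0.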

In [53]:
# Binarize the scores: only the high-risk band (SOFA >= 12, SAPS-I >= 21) maps to 1
df['SOFA_EV'] = (df['SOFA'] >= 12).astype(int)
df['SAPS-I_EV'] = (df['SAPS-I'] >= 21).astype(int)

In [54]:
df.head()

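| | RecordID | SAPS-I | SOFA | Length_of_stay | Survival | In-hospital_death | SOFA_EV | SAPS-I_EV |
|---|---|---|---|---|---|---|---|---|
| 0 | 142675 | 27 | 14 | 9 | 7 | 1 | 1 | 1 |
| 1 | 142676 | 12 | 1 | 31 | 468 | 0 | 0 | 0 |
| 2 | 142680 | 12 | 7 | 17 | 16 | 1 | 0 | 0 |
| 3 | 142683 | 19 | 15 | 17 | -1 | 0 | 1 | 0 |
| 4 | 142688 | 3 | 0 | 9 | -1 | 0 | 0 | 0 |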


STEP 3: Precision and recall evaluation

In [55]:
sofa_precision = precision_score(df['In-hospital_death'], df['SOFA_EV'])
saps_precision = precision_score(df['In-hospital_death'], df['SAPS-I_EV'])
sofa_recall = recall_score(df['In-hospital_death'], df['SOFA_EV'])
saps_recall = recall_score(df['In-hospital_death'], df['SAPS-I_EV'])

In [56]:
print(f"SOFA:  precision = {sofa_precision} , recall = {sofa_recall} \n"
      f"SAPS-I  precision = {saps_precision} , recall = {saps_recall}"
      )

SOFA:  precision = 0.2923898531375167 , recall = 0.2545031958163858 
SAPS-I  precision = 0.27563722584469474 , recall = 0.27019174898314935


## LENGTH_OF_STAY RANDOM PROBABILITY EVALUATION FOR COMPARISON PURPOSES

To find out whether our models perform better than a random Length_of_stay guess, we take the following steps:
 - introduce 3 levels of length_of_stay:
    - less than 8 days = short (assigned the number 1)
    - 8 to 30 days = medium (assigned the number 2)
    - more than 30 days = long (assigned the number 3)

Based on this we will evaluate the random probability of each length_of_stay group, i.e. the share of all records that falls into that group.
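
One possible way to compute these shares in a single pass is sketched below with `np.digitize` and `value_counts(normalize=True)`. The toy stays in `toy_los` are made up for illustration and are not taken from the outcomes files.

```python
import numpy as np
import pandas as pd

# Toy lengths of stay in days (hypothetical values around the group edges).
toy_los = pd.Series([3, 7, 8, 30, 31, 45, 2, 10])

# np.digitize returns 0 for x < 8, 1 for 8 <= x < 31, and 2 for x >= 31;
# adding 1 gives the levels 1 (short), 2 (medium), 3 (long).
levels = pd.Series(np.digitize(toy_los, bins=[8, 31]) + 1, index=toy_los.index)

# Share of records per level, in percent, ordered by level.
shares = levels.value_counts(normalize=True).sort_index() * 100

for level, share in shares.items():
    print(f"level {level}: {share:.2f}%")
```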

In [73]:
df['Length_of_stay_EV'] = df['Length_of_stay'].apply(lambda x: 1 if x < 8 else (2 if x < 31 else 3))

In [74]:
random_lenght_of_stay_1 = (df[df['Length_of_stay_EV'] == 1].shape[0]/df.shape[0])*100
random_lenght_of_stay_2 = (df[df['Length_of_stay_EV'] == 2].shape[0]/df.shape[0])*100
random_lenght_of_stay_3 = (df[df['Length_of_stay_EV'] == 3].shape[0]/df.shape[0])*100

In [75]:
print(f" random length of stay = \n"
      f" for less or equal to 1 week = {random_lenght_of_stay_1:.2f}%\n"
      f" for less or equal to 1 month = {random_lenght_of_stay_2:.2f}%\n"
      f" for more than 1 month = {random_lenght_of_stay_3:.2f}%"
      )


 random length of stay = 
 for less or equal to 1 week = 35.25%
 for less or equal to 1 month = 57.37%
 for more than 1 month = 7.38%
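
To turn these shares into a number a classifier can be compared against, one option is to compute the accuracy of two naive guessers. This is a sketch under assumptions, not the evaluation used elsewhere in the project: the dictionary `shares` simply restates the percentages printed above, and the choice of these two baselines is ours.

```python
# Class shares printed above, expressed as fractions of all records.
shares = {"short": 0.3525, "medium": 0.5737, "long": 0.0738}

# Baseline 1: always predict the most frequent class.
# It is correct exactly as often as that class occurs.
majority_accuracy = max(shares.values())

# Baseline 2: guess a class at random with the same frequencies as the data.
# The guess matches the true class with probability sum(p_i ** 2).
matched_random_accuracy = sum(p ** 2 for p in shares.values())

print(f"majority-class baseline: {majority_accuracy:.2%}")
print(f"frequency-matched random baseline: {matched_random_accuracy:.2%}")
```

A length-of-stay model would need to beat the higher of the two figures to show that it has learned more than the class distribution.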
