# id5059_p2_group_project_2024.ipynb

In [72]:
## Install libraries
# !{sys.executable} -m pip install numpy pandas matplotlib scikit-learn | grep -v 'already satisfied'#pip-env
# %conda install numpy pandas matplotlib scikit-learn# -y#conda-env

In [73]:
import numpy# as np
import pandas# as pd
import matplotlib# as plt
import sklearn

print("numpy version:", numpy.__version__)
print("pandas version:", pandas.__version__)
print("matplotlib version:", matplotlib.__version__)
print("scikit-learn version:", sklearn.__version__)

numpy version: 1.26.4
pandas version: 2.2.1
matplotlib version: 3.8.0
scikit-learn version: 1.2.2


# id5059 Predicting Liver Cirrhosis Outcomes (Group Project 2024)

## Dataset from Kaggle (fully labelled synthetic dataset)

### Web links

#### From the Specification
- Our synthetic dataset (for id5059 2024 and Kaggle competition 2023) https://www.kaggle.com/competitions/playground-series-s3e26/
- The real dataset (Mayo Clinic 1974 to 1984) https://www.kaggle.com/datasets/joebeachcapital/cirrhosis-patient-survival-prediction
#### Additional links
- Generic web search (kaggle + liver + cirrhosis) https://duckduckgo.com/?q=kaggle+liver+cirrhosis
- Various research papers on ML approaches to this task

Clinicians collect data on these markers because medical domain knowledge already links these features to disease outcome.

### Dataset background

Our synthetic dataset is based on the real dataset from the Mayo Clinic collected across 1974 to 1984, which has the following description from Kaggle: 


"About Dataset

Utilize 17 clinical features for predicting survival state of patients with liver cirrhosis. The survival states include 0 = D (death), 1 = C (censored), 2 = CL (censored due to liver transplantation).

For what purpose was the dataset created?

Cirrhosis results from prolonged liver damage, leading to extensive scarring, often due to conditions like hepatitis or chronic alcohol consumption. The data provided is sourced from a Mayo Clinic study on primary biliary cirrhosis (PBC) of the liver carried out from 1974 to 1984.

Who funded the creation of the dataset?

Mayo Clinic

What do the instances in this dataset represent?

People

Does the dataset contain data that might be considered sensitive in any way?

Sex, Age

Was there any data preprocessing performed?

    Drop all the rows where miss value (NA) were present in the Drug column
    Impute missing values with mean results
    One-hot encoding for all category attributes

Additional Information

During 1974 to 1984, 424 PBC patients referred to the Mayo Clinic qualified for the randomized placebo-controlled trial testing the drug D-penicillamine. Of these, the initial 312 patients took part in the trial and have mostly comprehensive data. The remaining 112 patients didn't join the clinical trial but agreed to record basic metrics and undergo survival tracking. Six of these patients were soon untraceable after their diagnosis, leaving data for 106 of these individuals in addition to the 312 who were part of the randomized trial."

![Alt text](image.png)

### Load the synthetic dataset from Kaggle

In [74]:
import os
import pandas as pd

os.chdir('/Users/user/Documents/cscrawl/cs/studres/ID5059/Coursework/Coursework-2/data')

df = pd.read_csv('train.csv')
df_train = pd.read_csv('train.csv')
df_test = pd.read_csv('test.csv')

# Data exploration
    # How sparse is our data? Quantify its sparseness so we can move on.
    # Our data is especially sparse with respect to transplants.

# List the unique values in the 'Status' column; expected to be {D, C, CL}
unique_statuses = df_train['Status'].unique()

# Count the occurrences of each value in the 'Status' column
status_counts = df_train['Status'].value_counts()

# Print the counts
print('Unique `Status` values:\n',unique_statuses,'\n')
print('Counts of each `Status` value in `train.csv`:\n',df_train['Status'].value_counts(),'\n','\nLength of `train.csv`\n',len(df_train),'\n')
print('Count of unknown `Status` values to be predicted across `test.csv`:\n',len(df_test))

print('\nTotal data entries across `train.csv` and `test.csv`:\n',len(df_train)+len(df_test))
print('\nPercentage share of total data entries across `train.csv` and `test.csv`:\n',
      'train.csv',100*len(df_train)/(len(df_train)+len(df_test)),'%\n',
      'test.csv',100*len(df_test)/(len(df_train)+len(df_test)),'%')

Unique `Status` values:
 ['D' 'C' 'CL'] 

Counts of each `Status` value in `train.csv`:
 Status
C     4965
D     2665
CL     275
Name: count, dtype: int64 
 
Length of `train.csv`
 7905 

Count of unknown `Status` values to be predicted across `test.csv`:
 5271

Total data entries across `train.csv` and `test.csv`:
 13176

Percentage share of total data entries across `train.csv` and `test.csv`:
 train.csv 59.99544626593807 %
 test.csv 40.00455373406193 %


In [75]:
# Load the full training data csv
full_data_path = 'train.csv'
full_data = pd.read_csv(full_data_path)
# Verify basic information like length of the full training data csv
full_data_info = full_data.info()
# full_data_head = full_data.head()

# full_data_info, full_data_head

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7905 entries, 0 to 7904
Data columns (total 20 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   id             7905 non-null   int64  
 1   N_Days         7905 non-null   int64  
 2   Drug           7905 non-null   object 
 3   Age            7905 non-null   int64  
 4   Sex            7905 non-null   object 
 5   Ascites        7905 non-null   object 
 6   Hepatomegaly   7905 non-null   object 
 7   Spiders        7905 non-null   object 
 8   Edema          7905 non-null   object 
 9   Bilirubin      7905 non-null   float64
 10  Cholesterol    7905 non-null   float64
 11  Albumin        7905 non-null   float64
 12  Copper         7905 non-null   float64
 13  Alk_Phos       7905 non-null   float64
 14  SGOT           7905 non-null   float64
 15  Tryglicerides  7905 non-null   float64
 16  Platelets      7905 non-null   float64
 17  Prothrombin    7905 non-null   float64
 18  Stage   

The full `train.csv` dataset is loaded. It contains 7,905 entries and 20 columns: the `id`, 18 predictor features, and the target `Status`.

Next steps:
1. **Splitting the Data:** We'll perform an 80/20 split of the `train.csv` data into a training set and a validation set, using stratified sampling to maintain the distribution of the `Status` classes.
    
2. **Splitting Imbalanced Data:** Given the class imbalance, particularly the small proportion of the 'CL' class, we'll apply a stratified sampling technique during the split to avoid distorting the class distributions.
    
[ToDo] 3. **Deletion Scheme:** To explore the impact of data reduction while respecting the class imbalance, we'll initially remove 10% of the data using a stratified approach and later consider varying this percentage.
    
4. **Pre-processing:** Ahead of the selected models (e.g. gradient boosting, random forest), we'll apply necessary pre-processing steps, including scaling and normalization. This step will also facilitate more meaningful visualizations and analyses.
    
[ToDo] 5. **Alternative Feature Encoding:** Based on Pamela's suggestion, we can identify the alternative encoding technique to one-hot encoding she mentioned for use with categorical variables [???].
    
I'll begin with the stratified 80/20 data split.

### Stratified 80/20 Split into Train/Validation datasets
Stratified split: 20% of each class is drawn to construct the validation set, so that the initial class distribution is maintained in both the Train and Validation datasets.

In [76]:
# Splitting data 80/20 with stratification to maintain the class proportions
# The style for sklearn use seems to be to import specific functions from the documentation
from sklearn.model_selection import train_test_split

# Splitting full_data into `X` (just the features) and `y` (just the target variable `Status`)
X = full_data.drop('Status', axis=1)
y = full_data['Status']

# Splitting the data while maintaining the distribution of the 'Status' classes
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.20, stratify=y, random_state=42)

# Verifying the stratification by comparing class distributions in the full dataset and splits
distribution_full = y.value_counts(normalize=True)
distribution_train = y_train.value_counts(normalize=True)
distribution_val = y_val.value_counts(normalize=True)

# distribution_full, distribution_train, distribution_val
print('Same exact proportions C,D,CL 62.8%,33.7%,3.5% are maintained upon an 80/20 split as the original count C,D,CL 4965,2665,275 is divisible by 5 into 4:1 ratio train:val sets.')
print(distribution_full, distribution_train, distribution_val)
print(y.value_counts(), y_train.value_counts(), y_val.value_counts())

Same exact proportions C,D,CL 62.8%,33.7%,3.5% are maintained upon an 80/20 split as the original count C,D,CL 4965,2665,275 is divisible by 5 into 4:1 ratio train:val sets.
Status
C     0.628083
D     0.337128
CL    0.034788
Name: proportion, dtype: float64 Status
C     0.628083
D     0.337128
CL    0.034788
Name: proportion, dtype: float64 Status
C     0.628083
D     0.337128
CL    0.034788
Name: proportion, dtype: float64
Status
C     4965
D     2665
CL     275
Name: count, dtype: int64 Status
C     3972
D     2132
CL     220
Name: count, dtype: int64 Status
C     993
D     533
CL     55
Name: count, dtype: int64


The stratified 80/20 split worked, maintaining the distribution of the `Status` classes across the training and validation sets. The class proportions in both subsets match the original dataset:

    * C: Censored - Approximately 62.81%
    * D: Death - Approximately 33.71%
    * CL: Censored due to liver transplantation - Approximately 3.48%
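
As a quick check on why the proportions come out identical rather than merely close, the arithmetic can be sketched directly from the class counts printed above. This is only an illustration of the counting argument; it does not call `train_test_split`, and the variable names are my own.

```python
from fractions import Fraction

# Class counts reported for train.csv earlier in this notebook
class_counts = {"C": 4965, "D": 2665, "CL": 275}
total = sum(class_counts.values())

# A 20% validation share is one fifth of each class
remainders = {label: count % 5 for label, count in class_counts.items()}
val_counts = {label: count // 5 for label, count in class_counts.items()}
train_counts = {label: count - val_counts[label] for label, count in class_counts.items()}
val_total = sum(val_counts.values())

for label, count in class_counts.items():
    same_share = Fraction(count, total) == Fraction(val_counts[label], val_total)
    print(label, train_counts[label], val_counts[label], "remainder:", remainders[label],
          "identical share:", same_share)
```

Because every class count happens to be divisible by 5, no rounding is needed; with other counts the stratified proportions would only be approximately equal.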

### Pre-Processing (Categorical features -> OneHot, Numerical features -> zero mean and unit variance)

In [77]:
## Pre-processing
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.compose import ColumnTransformer

# Identifying categorical and numerical features
categorical_features = ['Drug', 'Sex', 'Ascites', 'Hepatomegaly', 'Spiders', 'Edema']
numerical_features = X_train.columns.difference(categorical_features + ['id'])  # Exclude 'id' from features

# Creating a column transformer to apply different preprocessing to categorical and numerical features
preprocessor = ColumnTransformer(
    transformers=[
        ('num', StandardScaler(), numerical_features),
        ('cat', OneHotEncoder(), categorical_features)
    ])

# Fitting the preprocessor to the training data and transforming both training and validation sets
X_train_processed = preprocessor.fit_transform(X_train)
X_val_processed = preprocessor.transform(X_val)

# Converting processed data back to DataFrame for better readability (optional step)
X_train_processed_df = pd.DataFrame(X_train_processed, columns = preprocessor.get_feature_names_out())
X_val_processed_df = pd.DataFrame(X_val_processed, columns = preprocessor.get_feature_names_out())

X_train_processed_df.head()

Unnamed: 0,num__Age,num__Albumin,num__Alk_Phos,num__Bilirubin,num__Cholesterol,num__Copper,num__N_Days,num__Platelets,num__Prothrombin,num__SGOT,...,cat__Sex_M,cat__Ascites_N,cat__Ascites_Y,cat__Hepatomegaly_N,cat__Hepatomegaly_Y,cat__Spiders_N,cat__Spiders_Y,cat__Edema_N,cat__Edema_S,cat__Edema_Y
0,2.786334,0.735312,-0.437535,1.212277,-0.552543,3.906874,-0.246369,-1.943384,0.48032,0.914594,...,1.0,0.0,1.0,0.0,1.0,0.0,1.0,1.0,0.0,0.0
1,0.281587,0.618653,-0.423642,-0.150199,-0.609257,-0.443949,-0.07725,-0.398753,-0.92969,-0.315621,...,0.0,1.0,0.0,0.0,1.0,1.0,0.0,1.0,0.0,0.0
2,0.795451,0.443664,-0.193355,-0.176914,0.566258,-0.193447,0.572859,-0.122103,-0.92969,0.578321,...,0.0,1.0,0.0,1.0,0.0,1.0,0.0,1.0,0.0,0.0
3,1.144781,-1.510372,-0.326932,2.681614,0.117706,0.742639,-1.654788,-0.871364,1.121234,0.687975,...,0.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,1.0
4,-0.753434,0.939465,-0.066723,-0.444067,1.174638,-0.087973,0.19916,0.062331,-1.186055,0.040494,...,0.0,1.0,0.0,1.0,0.0,1.0,0.0,1.0,0.0,0.0


The pre-processing steps have been successfully applied:

1. **Numerical Features:** Standardized to have zero mean and unit variance.
2. **Categorical Features:** One-hot encoded to ensure proper representation without implying ordinal relationships.

The transformed training data now has 25 features, because the one-hot encoding expanded the categorical variables into multiple binary features.
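
One possible refinement, sketched here under assumptions rather than taken from the cells above, is to wrap the preprocessor and the classifier in a single scikit-learn `Pipeline`, so that the scaler and encoder are always fitted on the training rows only. The toy frame, its values, and the names `toy_train`, `toy_val` and `model` are made up for illustration; `handle_unknown="ignore"` is an optional setting that encodes an unseen category as all zeros instead of raising an error.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Tiny made-up frame standing in for X_train / X_val (illustrative values only)
toy_train = pd.DataFrame({
    "Age": [50, 61, 47, 70, 55, 66],
    "Bilirubin": [0.8, 3.2, 1.1, 6.5, 0.9, 2.4],
    "Edema": ["N", "S", "N", "Y", "N", "S"],
})
toy_labels = ["C", "D", "C", "D", "C", "D"]
toy_val = pd.DataFrame({"Age": [58], "Bilirubin": [1.5], "Edema": ["N"]})

preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["Age", "Bilirubin"]),
    # Unseen categories at prediction time become all-zero one-hot rows
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["Edema"]),
])

model = Pipeline([
    ("preprocess", preprocess),
    ("classifier", RandomForestClassifier(n_estimators=10, random_state=42)),
])

# fit() learns the scaling/encoding from the training rows only, then trains the forest
model.fit(toy_train, toy_labels)
toy_pred = model.predict(toy_val)

# 2 scaled numeric columns + 3 one-hot columns for Edema (N, S, Y)
n_encoded = model.named_steps["preprocess"].transform(toy_val).shape[1]
print(toy_pred, n_encoded)
```

The same pattern should carry over to the real `categorical_features` and `numerical_features` lists, and it makes later cross-validation or hyperparameter tuning less prone to leakage.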

### Models (Gradient Boost, Random Forest, ???)

#### Gradient Boosting model

In [78]:
# Gradient boosting model
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import classification_report, accuracy_score

# Initialize the Gradient Boosting Classifier
gb_classifier = GradientBoostingClassifier(random_state=42)  # fixed seed for reproducibility

# Train the classifier
gb_classifier.fit(X_train_processed, y_train)

# Predict on the validation set
y_val_pred = gb_classifier.predict(X_val_processed)

# Calculate accuracy and generate a classification report
accuracy_val = accuracy_score(y_val, y_val_pred)
classification_report_val = classification_report(y_val, y_val_pred)

# accuracy_val, classification_report_val
print('accuracy_val:\n',accuracy_val,'\n\n','classification_report_val:\n', classification_report_val)

accuracy_val:
 0.8266919671094244 

 classification_report_val:
               precision    recall  f1-score   support

           C       0.84      0.91      0.88       993
          CL       0.62      0.18      0.28        55
           D       0.79      0.74      0.77       533

    accuracy                           0.83      1581
   macro avg       0.75      0.61      0.64      1581
weighted avg       0.82      0.83      0.82      1581



The gradient boosting classifier achieved an accuracy of approximately 82.67% on the validation set.

Breakdown of the model's performance:

* **Precision:**
    * Class C (Censored): 84%
    * Class CL (Transplant): 62%
    * Class D (Death): 79%
* **Recall:**
    * Class C (Censored): 91%
    * Class CL (Transplant): 18%
    * Class D (Death): 74%
* **F1-Score:**
    * Class C (Censored): 0.88
    * Class CL (Transplant): 0.28
    * Class D (Death): 0.77

The model performs well on the C and D classes (F1-scores of 0.88 and 0.77) but struggles with the CL class: although CL precision is 62%, CL recall is only 18% (F1-score 0.28), so most CL cases are missed. This is likely due to CL's small presence in the dataset (about 3.5% of rows). It is a weakness of the model that might be improved by addressing the class imbalance more directly and/or by exploring other models.
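
To make the relationship between the three metrics concrete, here is a small sketch of how precision, recall and F1 follow from raw counts. The counts used (10 correct CL predictions, 6 false alarms, 45 missed CL rows) are an assumption: they are consistent with the rounded CL row of the report above and its support of 55, but they were not read from an actual confusion matrix.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Assumed CL counts for the gradient boosting model (hypothetical, see lead-in)
cl_precision, cl_recall, cl_f1 = precision_recall_f1(tp=10, fp=6, fn=45)
print(f"CL precision={cl_precision:.3f} recall={cl_recall:.3f} f1={cl_f1:.3f}")
```

Note that true negatives do not appear anywhere in these formulas, which is why a class can show respectable precision while most of its cases are still missed.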

Possible next steps:

    1. Try more sophisticated models or ensemble techniques.
    2. Techniques to handle imbalanced data, like SMOTE or Class Weights Adjustment.
    3. Hyperparameter tuning of the gradient boosting model.
    4. Comparing these results with a neural network model, considering its potential if we could achieve appropriate scaling and normalization, although I expect we will not achieve that to a high enough standard.

#### Random Forest model

In [79]:
# RandomForest model
from sklearn.ensemble import RandomForestClassifier

# Initialize the Random Forest Classifier
rf_classifier = RandomForestClassifier(random_state=42)

# Train the classifier
rf_classifier.fit(X_train_processed, y_train)

# Predict on the validation set
y_val_pred_rf = rf_classifier.predict(X_val_processed)

# Calculate accuracy and generate a classification report for Random Forest
accuracy_val_rf = accuracy_score(y_val, y_val_pred_rf)
classification_report_val_rf = classification_report(y_val, y_val_pred_rf)

# accuracy_val_rf, classification_report_val_rf
print('accuracy_val_rf:\n',accuracy_val_rf,'\n\n','classification_report_val_rf:\n', classification_report_val_rf)

accuracy_val_rf:
 0.8298545224541429 

 classification_report_val_rf:
               precision    recall  f1-score   support

           C       0.85      0.91      0.88       993
          CL       0.80      0.07      0.13        55
           D       0.80      0.75      0.77       533

    accuracy                           0.83      1581
   macro avg       0.81      0.58      0.59      1581
weighted avg       0.83      0.83      0.82      1581



The Random Forest classifier achieved an accuracy of approximately 82.99% on the validation set, which is slightly better than the Gradient Boosting model.

Performance breakdown for the Random Forest classifier:

* **Precision:**
    *   Class C (Censored): 85%
    *   Class CL (Transplant): 80% (treat with caution: with 7% recall on 55 CL rows, this rests on about 4 correct out of about 5 CL predictions)
    *   Class D (Death): 80%
* **Recall:**
    *   Class C (Censored): 91%
    *   Class CL (Transplant): 7%
    *   Class D (Death): 75%
* **F1-Score:**
    *   Class C (Censored): 0.88
    *   Class CL (Transplant): 0.13
    *   Class D (Death): 0.77

Precision for the CL class is higher with Random Forest than with Gradient Boosting (80% versus 62%). But Random Forest's CL recall is very low (7%) and worse than Gradient Boosting's (18%): the model rarely predicts CL at all, so it misses most CL instances.
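
Since accuracy is dominated by the C and D classes, it may help to compare the two models on macro-averaged F1 as well, which weights each class equally. The sketch below simply re-averages the rounded per-class F1-scores from the two classification reports above; the dictionary names are my own.

```python
from statistics import mean

# Per-class F1-scores copied from the two classification reports (rounded to 2 d.p.)
per_class_f1 = {
    "gradient_boosting": {"C": 0.88, "CL": 0.28, "D": 0.77},
    "random_forest": {"C": 0.88, "CL": 0.13, "D": 0.77},
}
# Validation accuracies printed above, rounded to 4 d.p.
accuracy = {"gradient_boosting": 0.8267, "random_forest": 0.8299}

# Macro F1 is the unweighted mean of the per-class F1-scores
macro_f1 = {name: mean(scores.values()) for name, scores in per_class_f1.items()}

for name in per_class_f1:
    print(f"{name}: accuracy={accuracy[name]:.4f} macro_f1={macro_f1[name]:.2f}")
```

On these figures Random Forest is marginally ahead on accuracy while Gradient Boosting is ahead on macro F1, so which model counts as "better" depends on how much the CL class matters.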

### Class Imbalance Techniques (SMOTE and Class Weights Adjustment)

#### Class Imbalance via SMOTE 

To address the low recall for the CL class, I will apply the Synthetic Minority Over-sampling Technique (SMOTE). This technique generates synthetic samples in the feature space for the minority class, which can help improve classifier performance on imbalanced datasets.

I'll apply SMOTE to the training data and then retrain the Random Forest classifier to observe any improvements in handling the CL class:

1. Apply SMOTE to the training data.
2. Retrain the Random Forest classifier using the SMOTE-enhanced data.
3. Evaluate the model performance on the original (non-SMOTE) validation set.
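
Before running the library version, the core idea of SMOTE can be sketched in a few lines: pick a minority-class row, pick one of its nearest minority neighbours, and create a new point somewhere on the line segment between them. This is a simplified illustration, not the `imblearn` implementation; the function name `smote_like_sample`, the toy rows and the choice of `k=2` are all assumptions.

```python
import random

def smote_like_sample(minority_rows, k=2, rng=None):
    """Create one synthetic row by interpolating between a minority row and a near neighbour."""
    rng = rng or random.Random(0)
    base = rng.choice(minority_rows)
    others = [row for row in minority_rows if row is not base]
    # k nearest neighbours of the base row by squared Euclidean distance
    neighbours = sorted(
        others, key=lambda row: sum((a - b) ** 2 for a, b in zip(base, row))
    )[:k]
    neighbour = rng.choice(neighbours)
    gap = rng.random()  # position along the segment, in [0, 1)
    return [a + gap * (b - a) for a, b in zip(base, neighbour)]

# Made-up minority-class rows with two already-scaled numerical features
minority_rows = [[1.0, 2.0], [1.5, 1.8], [3.0, 4.0], [2.5, 3.5]]
synthetic = smote_like_sample(minority_rows, k=2, rng=random.Random(1))
print(synthetic)
```

One consequence of this interpolation is that one-hot encoded columns can end up with fractional values in the synthetic rows. imbalanced-learn also provides `SMOTENC` for data that mixes categorical and numerical features, which may be worth comparing.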

In [80]:
# Try imblearn library for SMOTE
    # search: https://duckduckgo.com/?q=imblearn+smote
    # documentation: https://imbalanced-learn.org/dev/over_sampling.html#smote-adasyn
from imblearn.over_sampling import SMOTE

# Initialize SMOTE
smote = SMOTE(random_state=42)

# Apply SMOTE to the training data
X_train_smote, y_train_smote = smote.fit_resample(X_train_processed, y_train)

# Retrain the classifier using the SMOTE-enhanced data
rf_classifier_smote = RandomForestClassifier(random_state=42)
rf_classifier_smote.fit(X_train_smote, y_train_smote)

# Predict on the validation set
y_val_pred_rf_smote = rf_classifier_smote.predict(X_val_processed)

# Calculate accuracy and generate a classification report for Random Forest after SMOTE
accuracy_val_rf_smote = accuracy_score(y_val, y_val_pred_rf_smote)
classification_report_val_rf_smote = classification_report(y_val, y_val_pred_rf_smote)

# accuracy_val_rf_smote, classification_report_val_rf_smote
print('accuracy_val_rf_smote:\n',accuracy_val_rf_smote,'\n\n','classification_report_val_rf_smote:\n', classification_report_val_rf_smote)
# # Compare with the RandomForest pre-SMOTE
# print('original accuracy_val_rf:\n',accuracy_val_rf,'\n\n','original classification_report_val_rf:\n', classification_report_val_rf)

accuracy_val_rf_smote:
 0.8102466793168881 

 classification_report_val_rf_smote:
               precision    recall  f1-score   support

           C       0.87      0.86      0.86       993
          CL       0.34      0.29      0.31        55
           D       0.75      0.77      0.76       533

    accuracy                           0.81      1581
   macro avg       0.65      0.64      0.65      1581
weighted avg       0.81      0.81      0.81      1581



Adding SMOTE to RandomForest trades some overall accuracy (81.02% down from 82.99%) in exchange for improved CL recall (29% up from 7%) and f1-score (0.31 up from 0.13).

Note also that precision fell, including for CL (34% down from 80%). The earlier 80% figure is not very informative, though: with 7% recall on 55 CL rows, it rests on only about five CL predictions.

Breakdown of the SMOTE + RandomForest model performance:

* **Precision:**
    *   Class C (Censored): 87% up from 85%
    *   Class CL (Transplant): 34% down from 80% (the 80% rested on very few CL predictions)
    *   Class D (Death): 75% down from 80%
* **Recall:**
    *   Class C (Censored): 86% down from 91%
    *   Class CL (Transplant): 29% up from 7%
    *   Class D (Death): 77% up from 75%
* **F1-Score:**
    *   Class C (Censored): 0.86 down from 0.88
    *   Class CL (Transplant): 0.31 up from 0.13
    *   Class D (Death): 0.76 down from 0.77

Analysis of SMOTE results:

* **Accuracy cost:**
Overall accuracy decreased slightly, from 82.99% to 81.02%. This could be attributed to the model now paying more attention to the minority class (CL) at the expense of the majority classes.

* **Improved Recall for CL:**
Recall for the CL class improved from 7% to 29%, so the SMOTE-enhanced model identifies more of the true CL cases than the original model. Recall is crucial in medical contexts like this, as missing a true positive case can be more detrimental than raising a false positive.

* **Decreased Precision for CL:**
Precision for the CL class dropped from 80% to 34%. Precision is TP / (TP + FP), so it depends only on the rows predicted as CL. The pre-SMOTE model almost never predicted CL (7% recall on 55 CL rows implies about 4 correct out of about 5 CL predictions), so its 80% figure rests on a handful of cases and is misleading. After SMOTE the model predicts CL far more often, which produces more false positives (a penalized error) but also more true positives (clinically more important).

* **F1-Score Increase for CL:**
The F1-score, the harmonic mean of precision and recall, improved from 0.13 to 0.31 for the CL class. On balance, the recall gained outweighs the precision lost for CL.

* **Performance on Other Classes:**
We set out to improve performance on the CL class because our data is imbalanced, but this came at a small cost on the other classes: F1-scores for C and D fell slightly (0.88 to 0.86 and 0.77 to 0.76). Whether that trade is desirable depends on the clinical use case. Class imbalance techniques redistribute the classifier's focus across the classes, and this has costs.

**Summary:**
Applying SMOTE has made our Random Forest model more sensitive to the minority class (CL). This is evident from the improved recall, and may be particularly desirable in our medical use case where detecting positive cases is critical, even at the expense of more false positive predictions. But this came at the cost of precision, per the typical trade-off between recall and precision in imbalanced datasets. An improved F1-score for the CL class confirms that SMOTE has been effective in enhancing the model's ability to classify this minority class more accurately, despite the slight decrease in overall accuracy.
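
To see how few predictions lie behind the CL precision figures, one can back out integer counts that are consistent with a rounded report row. The sketch below searches for (true positives, predicted positives) pairs that reproduce the reported CL precision and recall given the CL support of 55. It assumes the report rounds to two decimals in the same way as Python's `round`, and the helper name `consistent_counts` is my own.

```python
def consistent_counts(precision, recall, support, max_predicted=200):
    """Return (true_positives, predicted_positives) pairs matching a rounded report row."""
    matches = []
    for tp in range(1, support + 1):
        # Recall = tp / support must round to the reported value
        if abs(round(tp / support, 2) - recall) > 1e-9:
            continue
        for predicted in range(tp, max_predicted + 1):
            # Precision = tp / predicted must round to the reported value
            if abs(round(tp / predicted, 2) - precision) < 1e-9:
                matches.append((tp, predicted))
    return matches

# (precision, recall) for the CL class, copied from the Random Forest reports above
cl_rows = {
    "random_forest": (0.80, 0.07),
    "random_forest_smote": (0.34, 0.29),
}
implied = {name: consistent_counts(p, r, support=55) for name, (p, r) in cl_rows.items()}
for name, pairs in implied.items():
    print(name, pairs)
```

Under these assumptions the plain Random Forest row is only consistent with 4 correct CL predictions out of 5, and the SMOTE row with 16 correct out of 47, which illustrates why the earlier 80% precision should not be over-interpreted.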

#### Class Imbalance via Class Weights Adjustment instead of SMOTE

In [81]:
# Adjusting class weights for the Random Forest classifier
class_weights = {
    "C": 1,
    "D": y_train.value_counts()["C"] / y_train.value_counts()["D"],
    "CL": y_train.value_counts()["C"] / y_train.value_counts()["CL"]
}
# C is assigned a baseline weight = 1, note that this does not prioritize the C class over the others
# CL is a minority class, so its ratio is expected to be large, significantly increasing the penalty for misclassifying CL class instances compared to C or D

rf_classifier_weighted = RandomForestClassifier(random_state=42, class_weight=class_weights)

# Retrain the classifier using the training data with adjusted class weights
rf_classifier_weighted.fit(X_train_processed, y_train)

# Predict on the validation set
y_val_pred_rf_weighted = rf_classifier_weighted.predict(X_val_processed)

# Calculate accuracy and generate a classification report for the weighted Random Forest
accuracy_val_rf_weighted = accuracy_score(y_val, y_val_pred_rf_weighted)
classification_report_val_rf_weighted = classification_report(y_val, y_val_pred_rf_weighted)

# accuracy_val_rf_weighted, classification_report_val_rf_weighted
print('accuracy_val_rf_weighted:\n',accuracy_val_rf_weighted,'\n\n','classification_report_val_rf_weighted:\n', classification_report_val_rf_weighted)
# # Compare with the RandomForest pre-SMOTE
# print('original accuracy_val_rf:\n',accuracy_val_rf,'\n\n','original classification_report_val_rf:\n', classification_report_val_rf)

accuracy_val_rf_weighted:
 0.8298545224541429 

 classification_report_val_rf_weighted:
               precision    recall  f1-score   support

           C       0.84      0.92      0.88       993
          CL       0.86      0.11      0.19        55
           D       0.80      0.74      0.77       533

    accuracy                           0.83      1581
   macro avg       0.83      0.59      0.61      1581
weighted avg       0.83      0.83      0.82      1581



Class Weights Adjustment influence on RandomForest model performance:

* **Precision:**
    *   Class C (Censored): 84% down from 85% (versus 87% SMOTE)
    *   Class CL (Transplant): 86% up from 80% (versus 34% SMOTE)
    *   Class D (Death): 80% same as 80% (versus 75% SMOTE)
* **Recall:**
    *   Class C (Censored): 92% up from 91% (versus 86% SMOTE)
    *   Class CL (Transplant): 11% up from 7% (versus 29% SMOTE)
    *   Class D (Death): 74% down from 75% (versus 77% SMOTE)
* **F1-Score:**
    *   Class C (Censored): 0.88 same as 0.88 (versus 0.86 SMOTE)
    *   Class CL (Transplant): 0.19 up from 0.13 (versus 0.31 SMOTE)
    *   Class D (Death): 0.77 same as 0.77 (versus 0.76 SMOTE)
    

Hence, Class Weights Adjustment was less effective than SMOTE at addressing the class imbalance here: the CL F1-score rose to 0.19, versus 0.31 with SMOTE, from a 0.13 baseline. Other weighting schemes may behave differently; this implementation simply scales the weights of D and CL by the ratio of C's count to theirs, C being the most prevalent class. On the other hand, SMOTE cost more in overall accuracy (81.02% versus 82.99%) and slightly lowered the F1-scores for C and D, whereas Class Weights Adjustment left both essentially unchanged. So the weaker correction for class imbalance also better preserved model performance on C and D.
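
For reference, the hand-built ratio weights can be compared with the formula behind scikit-learn's `class_weight='balanced'` option, `n_samples / (n_classes * class_count)`. The sketch below uses the training-split counts printed earlier and plain Python only; it does not retrain any model, and the variable names are my own.

```python
# Training-split class counts printed after the stratified split
train_counts = {"C": 3972, "D": 2132, "CL": 220}
n_samples = sum(train_counts.values())
n_classes = len(train_counts)

# Formula used by scikit-learn's class_weight='balanced'
balanced_weights = {
    label: n_samples / (n_classes * count) for label, count in train_counts.items()
}

# Ratio weights as built in the cell above: each class relative to C
ratio_weights = {label: train_counts["C"] / count for label, count in train_counts.items()}

# Rescale the balanced weights so that C has weight 1, for a like-for-like comparison
rescaled = {label: w / balanced_weights["C"] for label, w in balanced_weights.items()}

for label in train_counts:
    print(label, round(balanced_weights[label], 3), round(ratio_weights[label], 3),
          round(rescaled[label], 3))
```

The two schemes differ only by a constant factor, so passing `class_weight='balanced'` would be expected to give similar behaviour to the ratio weights, though that has not been verified here.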