<center><img src="https://keras.io/img/logo-small.png" alt="Keras logo" width="100"><br/>
This starter notebook is provided by the Keras team.</center>

# HMS - Harmful Brain Activity Classification with [KerasCV](https://github.com/keras-team/keras-cv) and [Keras](https://github.com/keras-team/keras)

> The objective of this competition is to classify seizures and other patterns of harmful brain activity in critically ill patients

This notebook guides you through the process of training and inferring a Deep Learning model, specifically EfficientNetV2, using KerasCV on the competition dataset. Specificaclly, this notebook uses spectrogram of the eeg data to classify the patterns.

Fun fact: This notebook is backend-agnostic, supporting TensorFlow, PyTorch, and JAX. Utilizing KerasCV and Keras allows us to choose our preferred backend. Explore more details on [Keras](https://keras.io/keras_core/announcement/).

In this notebook, you will learn:

* Loading the data efficiently using [`tf.data`](https://www.tensorflow.org/guide/data).
* Creating the model using KerasCV presets.
* Training the model.
* Inference and Submission on test data.

**Note**: For a more in-depth understanding of KerasCV, refer to the [KerasCV guides](https://keras.io/guides/keras_cv/).

# 🛠 | Install Libraries  

Since internet access is **disabled** during inference, we cannot install libraries in the usual `!pip install <lib_name>` manner. Instead, we need to install libraries from local files. In the following cell, we will install libraries from our local files. The installation code stays very similar - we just use the `filepath` instead of the `filename` of the library. So now the code is `!pip install <local_filepath>`. 

> The `filepath` of these local libraries look quite complicated, but don't be intimidated! Also `--no-deps` argument ensures that we are not installing any additional libraries.

In [1]:
!pip install -q /kaggle/input/kerasv3-lib-ds/keras_cv-0.8.2-py3-none-any.whl --no-deps
!pip install -q /kaggle/input/kerasv3-lib-ds/tensorflow-2.15.0.post1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl --no-deps
!pip install -q /kaggle/input/kerasv3-lib-ds/keras-3.0.4-py3-none-any.whl --no-deps

# 📚 | Import Libraries 

In [2]:
import os
os.environ["KERAS_BACKEND"] = "jax" # you can also use tensorflow or torch

import keras_cv
import keras
from keras import ops
import tensorflow as tf

import cv2
import pandas as pd
import numpy as np
from glob import glob
from tqdm.notebook import tqdm
import joblib

import matplotlib.pyplot as plt 

2024-04-01 11:47:22.779700: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-04-01 11:47:22.779760: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-04-01 11:47:22.781111: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered


## Library Versions

In [3]:
print("TensorFlow:", tf.__version__)
print("Keras:", keras.__version__)
print("KerasCV:", keras_cv.__version__)

TensorFlow: 2.15.0
Keras: 3.0.4
KerasCV: 0.8.2


# ⚙️ | Configuration

In [4]:
class CFG:
    verbose = 1  # Verbosity
    seed = 42  # Random seed
    preset = "efficientnetv2_b2_imagenet"  # Name of pretrained classifier
    image_size = [400, 401]  # Input image size
    epochs = 13 # Training epochs
    batch_size = 32  # Batch size
    lr_mode = "cos" # LR scheduler mode from one of "cos", "step", "exp"
    drop_remainder = True  # Drop incomplete batches
    num_classes = 6 # Number of classes in the dataset
    fold = 0 # Which fold to set as validation data
    class_names = ['Seizure', 'LPD', 'GPD', 'LRDA','GRDA', 'Other']
    label2name = dict(enumerate(class_names))
    name2label = {v:k for k, v in label2name.items()}

# ♻️ | Reproducibility 
Sets value for random seed to produce similar result in each run.

In [5]:
keras.utils.set_random_seed(CFG.seed)

# 📁 | Dataset Path 

In [6]:
def build_augmenter(dim=CFG.image_size):
    augmenters = [
        keras_cv.layers.MixUp(alpha=2.0),
        keras_cv.layers.RandomCutout(height_factor=(1.0, 1.0),
                                     width_factor=(0.06, 0.1)), # freq-masking
        keras_cv.layers.RandomCutout(height_factor=(0.06, 0.1),
                                     width_factor=(1.0, 1.0)), # time-masking
    ]
    
    def augment(img, label):
        data = {"images":img, "labels":label}
        for augmenter in augmenters:
            if tf.random.uniform([]) < 0.5:
                data = augmenter(data, training=True)
        return data["images"], data["labels"]
    
    return augment


def build_decoder(with_labels=True, target_size=CFG.image_size, dtype=32):
    def decode_signal(sig, offset=None):
        # Read .npy files and process the signal        
        
        # Log spectrogram 
#         sig = tf.clip_by_value(sig, tf.math.exp(-4.0), tf.math.exp(8.0)) # avoid 0 in log
#         sig = tf.math.log(sig)
        
#         # Normalize spectrogram
#         sig -= tf.math.reduce_mean(sig)
#         sig /= tf.math.reduce_std(sig) + 1e-6
        
        # Mono channel to 3 channels to use "ImageNet" weights
        sig = tf.tile(sig[..., None], [1, 1, 3])
        return sig
    
    def decode_label(label):
        label = tf.one_hot(label, CFG.num_classes)
        label = tf.cast(label, tf.float32)
        label = tf.reshape(label, [CFG.num_classes])
        return label
    
    def decode_with_labels(path, offset=None, label=None):
        sig = decode_signal(path, offset)
        label = decode_label(label)
        return (sig, label)
    
    return decode_with_labels if with_labels else decode_signal


def build_dataset(signals, offsets=None, labels=None, batch_size=32, cache=True,
                  decode_fn=None, augment_fn=None,
                  augment=False, repeat=True, shuffle=1024, 
                  cache_dir="", drop_remainder=False):
    
    
    if cache_dir != "" and cache is True:
        os.makedirs(cache_dir, exist_ok=True)
    
    if decode_fn is None:
        decode_fn = build_decoder(labels is not None)
    
    if augment_fn is None:
        augment_fn = build_augmenter()
    
    AUTO = tf.data.experimental.AUTOTUNE
    slices = (signals, offsets) if labels is None else (signals, offsets, labels)
    
    ds = tf.data.Dataset.from_tensor_slices(slices)
    ds = ds.map(decode_fn, num_parallel_calls=AUTO)
    # ds = ds.cache(cache_dir) if cache else ds
    # ds = ds.repeat() if repeat else ds
    if shuffle: 
        ds = ds.shuffle(shuffle, seed=CFG.seed)
        opt = tf.data.Options()
        opt.experimental_deterministic = False
        ds = ds.with_options(opt)
    ds = ds.batch(batch_size, drop_remainder=drop_remainder)
    # ds = ds.map(augment_fn, num_parallel_calls=AUTO) if augment else ds
    ds = ds.prefetch(AUTO)
    return ds

# 🔪 | Data Split

In the following code snippet, the data is divided into `5` folds. Note that, the `groups` argument is used to prevent any overlap of patients between the training and validation sets, thus avoiding potential **data leakage** issues. Additionally, each split is stratified based on the `class_label`, ensuring a uniform distribution of class labels in each fold.

## Build Train & Valid Dataset

Only first sample for each `spectrogram_id` is used in order to keep the dataset size managable. Feel free to train on full data.

In [7]:
spec_paths = [
    "/kaggle/input/spec-and-labels-one-sixth/spec_sig_2000_1908.npy",
    "/kaggle/input/spec-and-labels-one-sixth/spec_sig_4000_1780.npy",
    "/kaggle/input/spec-and-labels-one-sixth/spec_sig_6000_1868.npy",
    "/kaggle/input/spec-and-labels-one-sixth/spec_sig_8000_1857.npy",
    "/kaggle/input/spec-and-labels-one-sixth/spec_sig_10000_1854.npy",
    "/kaggle/input/spec-and-labels-one-sixth/spec_sig_12000_1669.npy",
    "/kaggle/input/spec-and-labels-one-sixth/spec_sig_14000_1832.npy",
    "/kaggle/input/spec-and-labels-one-sixth/spec_sig_16000_1822.npy",
]

targets_paths = [
    "/kaggle/input/spec-and-labels-one-sixth/spec_og_targets_2000_1908.npy",
    "/kaggle/input/spec-and-labels-one-sixth/spec_og_targets_4000_1780.npy",
    "/kaggle/input/spec-and-labels-one-sixth/spec_og_targets_6000_1868.npy",
    "/kaggle/input/spec-and-labels-one-sixth/spec_og_targets_8000_1857.npy",
    "/kaggle/input/spec-and-labels-one-sixth/spec_og_targets_10000_1854.npy",
    "/kaggle/input/spec-and-labels-one-sixth/spec_og_targets_12000_1669.npy",
    "/kaggle/input/spec-and-labels-one-sixth/spec_og_targets_14000_1832.npy",
    "/kaggle/input/spec-and-labels-one-sixth/spec_og_targets_16000_1822.npy"
    
]

In [8]:
# Sample from full data
import gc

def create_ds_outer(path_spec, path_targ, index):
    path_split_last = path_spec.split('_')[-1]
    example_len = int(path_split_last.split('.')[0])
    
    if index == 0:
        first_half = np.load(path_spec, mmap_mode='r')
        first_half = first_half[:example_len//2]
        new_length = example_len//2
        
        first_spec = tf.convert_to_tensor(first_half, dtype=tf.float32)
        
        first_targ = np.load(path_targ, mmap_mode='r')
        first_targ = first_targ[:new_length]
        first_targ = tf.convert_to_tensor(first_targ, dtype=tf.int32)
        

        train_ds = build_dataset(first_spec[:int(0.8*new_length)], labels=first_targ[:int(0.8*new_length)], 
                             batch_size=CFG.batch_size, repeat=True, shuffle=True, augment=True, cache=True)

        valid_ds = build_dataset(first_spec[int(0.8*new_length):], labels=first_targ[int(0.8*new_length):], batch_size=CFG.batch_size,
                             repeat=False, shuffle=False, augment=False, cache=True)

        return train_ds, valid_ds, int(0.8*new_length)
    
    else:
        first_half = np.load(path_spec, mmap_mode='r')
        first_half = first_half[example_len//2:]
        new_length = example_len//2
        
        first_spec = tf.convert_to_tensor(first_half, dtype=tf.float32)
        
        first_targ = np.load(path_targ, mmap_mode='r')
        first_targ = first_targ[:new_length]
        first_targ = tf.convert_to_tensor(first_targ, dtype=tf.int32)
        

        train_ds = build_dataset(first_spec[:int(0.8*new_length)], labels=first_targ[:int(0.8*new_length)], 
                             batch_size=CFG.batch_size, repeat=True, shuffle=True, augment=True, cache=True)

        valid_ds = build_dataset(first_spec[int(0.8*new_length):], labels=first_targ[int(0.8*new_length):], batch_size=CFG.batch_size,
                             repeat=False, shuffle=False, augment=False, cache=True)
        
        first_spec, first_targ = None, None
        gc.collect()

        return train_ds, valid_ds, int(0.8*new_length)
        
        
    
    
    

def create_ds(path_spec, path_targ, upper_index):
    
    first_spec = tf.convert_to_tensor(np.load(path_spec), dtype=tf.float32)

    first_targ = tf.convert_to_tensor(np.load(path_targ), dtype=tf.int32)



    train_ds = build_dataset(first_spec[:int(0.8*len(first_spec))], labels=first_targ[:int(0.8*len(first_spec))], 
                         batch_size=CFG.batch_size, repeat=True, shuffle=True, augment=True, cache=True)

    valid_ds = build_dataset(first_spec[int(0.8*len(first_spec)):], labels=first_targ[int(0.8*len(first_spec)):], batch_size=CFG.batch_size,
                         repeat=False, shuffle=False, augment=False, cache=True)

    return train_ds, valid_ds, int(0.8*len(first_spec))




# 🔍 | Loss & Metric

The evaluation metric in this competition is **KL Divergence**, defined as,

$$
D_{\text{KL}}(P \parallel Q) = \sum_{i} P(i) \log\left(\frac{P(i)}{Q(i)}\right)
$$

Where:
- $P$ is the true distribution.
- $Q$ is the predicted distribution.

Interestingly, as KL Divergence is differentiable, we can directly use it as our loss function. Thus, we don't need to use a third-party metric like **Accuracy** to evaluate our model. Therefore, `valid_loss` can stand alone as an indicator for our evaluation. In keras, we already have impelementation for KL Divergence loss so we only need to import it.

In [9]:
LOSS = keras.losses.KLDivergence()

# 🤖 | Modeling

This notebook uses the `EfficientNetV2 B2` from KerasCV's collection of pretrained models. To explore other models, simply modify the `preset` in the `CFG` (config). Check the [KerasCV website](https://keras.io/api/keras_cv/models/tasks/image_classifier/) for a list of available pretrained models.

In [17]:
# Build Classifier

model=None

model = keras_cv.models.ImageClassifier.from_preset(
    CFG.preset, num_classes=CFG.num_classes
)

# Compile the model  
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-4),
              loss=LOSS)

# Model Sumamry
model.summary()

Attaching 'config.json' from model 'keras/efficientnetv2/keras/efficientnetv2_b2_imagenet/2' to your Kaggle notebook...
Attaching 'config.json' from model 'keras/efficientnetv2/keras/efficientnetv2_b2_imagenet/2' to your Kaggle notebook...
Attaching 'model.weights.h5' from model 'keras/efficientnetv2/keras/efficientnetv2_b2_imagenet/2' to your Kaggle notebook...


# ⚓ | LR Schedule

A well-structured learning rate schedule is essential for efficient model training, ensuring optimal convergence and avoiding issues such as overshooting or stagnation.

In [20]:
import math

def get_lr_callback(batch_size=8, mode='cos', epochs=10, plot=False):
    lr_start, lr_max, lr_min = 5e-5, 6e-6 * batch_size, 1e-5
    lr_ramp_ep, lr_sus_ep, lr_decay = 3, 0, 0.75

    def lrfn(epoch):  # Learning rate update function
        if epoch < lr_ramp_ep: lr = (lr_max - lr_start) / lr_ramp_ep * epoch + lr_start
        elif epoch < lr_ramp_ep + lr_sus_ep: lr = lr_max
        elif mode == 'exp': lr = (lr_max - lr_min) * lr_decay**(epoch - lr_ramp_ep - lr_sus_ep) + lr_min
        elif mode == 'step': lr = lr_max * lr_decay**((epoch - lr_ramp_ep - lr_sus_ep) // 2)
        elif mode == 'cos':
            decay_total_epochs, decay_epoch_index = epochs - lr_ramp_ep - lr_sus_ep + 3, epoch - lr_ramp_ep - lr_sus_ep
            phase = math.pi * decay_epoch_index / decay_total_epochs
            lr = (lr_max - lr_min) * 0.5 * (1 + math.cos(phase)) + lr_min
        return lr

    if plot:  # Plot lr curve if plot is True
        plt.figure(figsize=(10, 5))
        plt.plot(np.arange(epochs), [lrfn(epoch) for epoch in np.arange(epochs)], marker='o')
        plt.xlabel('epoch'); plt.ylabel('lr')
        plt.title('LR Scheduler')
        plt.show()

    return keras.callbacks.LearningRateScheduler(lrfn, verbose=False)  # Create lr callback

In [21]:
lr_cb = get_lr_callback(CFG.batch_size, mode=CFG.lr_mode, plot=False)

# 💾 | Model Checkpointing

In [22]:
ckpt_cb = keras.callbacks.ModelCheckpoint("best_model.keras",
                                         monitor='val_loss',
                                         save_best_only=True,
                                         save_weights_only=False,
                                         mode='min')

# 🚂 | Training

In [23]:
tf.keras.backend.clear_session()
# del train_ds, valid_ds

In [24]:
# history = model.fit(
#     train_ds, 
#     epochs=CFG.epochs,
#     callbacks=[lr_cb, ckpt_cb], 
#     steps_per_epoch=len(train_df)//CFG.batch_size,
#     #validation_data=valid_ds, 
#     verbose=CFG.verbose
# )

for epoch in range(1, 10):
    print(f"epoch {epoch}/10")
    for s_path, t_path in zip(spec_paths,targets_paths) :
        for index in range(0,2):
            print(f"Training on dataset at: {s_path}")
            # Create the dataset for the current path
            # Optionally, create a validation dataset if you have validation data
            # valid_ds = create_dataset_from_path(validation_path)

            # Train the model on the current dataset
#             train_ds, valid_ds, train_ds_len = create_ds_outer(s_path, t_path, index=index)
#             history = model.fit(
#                     train_ds, 
#                     epochs=1,
#                     callbacks=[lr_cb, ckpt_cb], 
#                     steps_per_epoch=train_ds_len//CFG.batch_size,
#                     validation_data=valid_ds,  # Uncomment if you have a validation dataset
#                     verbose=CFG.verbose
#                 )
            
            try:
                train_ds, valid_ds, train_ds_len = create_ds_outer(s_path, t_path, index=index)
                history = model.fit(
                    train_ds, 
                    epochs=1,
                    callbacks=[lr_cb, ckpt_cb], 
                    steps_per_epoch=train_ds_len//CFG.batch_size,
                    validation_data=valid_ds,  # Uncomment if you have a validation dataset
                    verbose=CFG.verbose
                )
            
            except Exception:
                
                try:
                    history = model.fit(
                        train_ds, 
                        epochs=1,
                        callbacks=[lr_cb, ckpt_cb], 
                        steps_per_epoch=train_ds_len//CFG.batch_size,
                        verbose=CFG.verbose
                    )
                    
                    print(f'Able to train {s_path} but not able to validate')
            
                except Exception:
                    print(f'Unable to train {s_path}')
            
            train_ds, valid_ds = None, None
            gc.collect()

epoch 1/10
Training on dataset at: /kaggle/input/spec-and-labels-one-sixth/spec_sig_2000_1908.npy
[1m23/23[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m101s[0m 3s/step - loss: 1.7945 - val_loss: 1.6787 - learning_rate: 5.0000e-05
Training on dataset at: /kaggle/input/spec-and-labels-one-sixth/spec_sig_2000_1908.npy
[1m23/23[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m10s[0m 424ms/step - loss: 1.7290 - val_loss: 1.3982 - learning_rate: 5.0000e-05
Training on dataset at: /kaggle/input/spec-and-labels-one-sixth/spec_sig_4000_1780.npy
[1m22/22[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m61s[0m 3s/step - loss: 1.8045 - val_loss: 2.0003 - learning_rate: 5.0000e-05
Training on dataset at: /kaggle/input/spec-and-labels-one-sixth/spec_sig_4000_1780.npy
[1m22/22[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m8s[0m 367ms/step - loss: 1.7360 - val_loss: 1.9780 - learning_rate: 5.0000e-05
Training on dataset at: /kaggle/input/spec-and-labels-one-sixth/spec_sig_6000_1868.npy
[1m23/2

KeyboardInterrupt: 

# 🧪 | Prediction

## Load Best Model

In [None]:
#model.load_weights("best_model.keras")

## Build Test Dataset

In [None]:
# test_paths = test_df.spec2_path.values
# test_ds = build_dataset(test_paths, batch_size=min(CFG.batch_size, len(test_df)),
#                          repeat=False, shuffle=False, cache=False, augment=False)

## Inference

In [None]:
# preds = model.predict(test_ds)

# 📩 | Submission

In [None]:
# pred_df = test_df[["eeg_id"]].copy()
# target_cols = [x.lower()+'_vote' for x in CFG.class_names]
# pred_df[target_cols] = preds.tolist()
# sub_df = pd.read_csv(f'{BASE_PATH}/sample_submission.csv')
# sub_df = sub_df[["eeg_id"]].copy()
# sub_df = sub_df.merge(pred_df, on="eeg_id", how="left")
# sub_df.to_csv("submission.csv", index=False)
# sub_df.head()

# 📌 | Reference
* [HMS-HBAC: ResNet34d Baseline [Training]](https://www.kaggle.com/code/ttahara/hms-hbac-resnet34d-baseline-training) 
* [EfficientNetB2 Starter - [LB 0.57]](https://www.kaggle.com/code/cdeotte/efficientnetb2-starter-lb-0-57)