# ISIC 2024 - Skin Lesion Detection with 3D-TB using KerasCV and Keras

> The goal of this competition is to detect skin cancers in lesions cropped from 3D total body photographs.

<div align="center">
  <img src="https://i.ibb.co/PYLN6QR/isic2024.jpg">
</div>

This notebook walks through training and deploying a deep learning model for skin cancer detection using lesion crops from 3D total body photographs. Specifically, we'll use the EfficientNetV2 backbone from KerasCV on the competition dataset. The notebook combines image data with tabular features (e.g., age, sex) to improve skin cancer detection.

**Fun fact:** This notebook is backend-agnostic, supporting TensorFlow, PyTorch, and JAX. Leveraging KerasCV and Keras allows flexibility in choosing the preferred backend. Explore more details on [Keras](https://keras.io/keras_core/announcement/).

This notebook covers the following topics:

- Designing a data pipeline for a multi-input model.
- Creating a random augmentation pipeline with KerasCV.
- Efficiently loading data using [`tf.data`](https://www.tensorflow.org/guide/data).
- Utilizing KerasCV presets to build the model.
- Training the model.
- Performing inference and generating submissions on testing data.

**Note:** For a deeper understanding of KerasCV, refer to the [KerasCV guides](https://keras.io/guides/keras_cv/).

# 📚 | Import Libraries

In [None]:
import os
os.environ["KERAS_BACKEND"] = "tensorflow"  # other options: "jax" or "torch"

import keras_cv
import keras
from keras import ops
import tensorflow as tf

import cv2
import pandas as pd
import numpy as np
from glob import glob
from tqdm.notebook import tqdm
import joblib

import matplotlib.pyplot as plt

## Library Versions

In [None]:
print("TensorFlow:", tf.__version__)
print("Keras:", keras.__version__)
print("KerasCV:", keras_cv.__version__)

# ⚙️ | Configuration

In [None]:
class CFG:
    verbose = 1  # Verbosity
    seed = 42  # Random seed
    neg_sample = 0.01 # Downsample negative class
    pos_sample = 5.0  # Upsample positive class
    preset = "efficientnetv2_b2_imagenet"  # Name of pretrained classifier
    image_size = [128, 128]  # Input image size
    epochs = 8 # Training epochs
    batch_size = 8  # Batch size
    lr_mode = "cos" # LR scheduler mode from one of "cos", "step", "exp"
    class_names = ['target']
    num_classes = 1

# ♻️ | Reproducibility 
Setting the random seed makes results similar from run to run.

In [None]:
keras.utils.set_random_seed(CFG.seed)

# 📁 | Dataset Path

In [None]:
BASE_PATH = "/kaggle/input/isic-2024-challenge"

# 📖 | Meta Data

The dataset provides the following files:

- **train-image/**: Contains image files for the training set (provided for train only).
- **train-image.hdf5**: Training image data stored in a single HDF5 file, where each image is indexed by its `isic_id`.
- **train-metadata.csv**: Metadata corresponding to the training set, including:
  - `isic_id`: Unique image ID used to query images in the HDF5 file.
  - `patient_id`: Unique patient ID.
  - `sex`: Gender of the patient.
  - `age_approx`: Approximate age of the patient.
  - `anatom_site_general`: Location of the lesion.
  - Other relevant metadata fields.
- **test-image.hdf5**: Testing image data stored in a single HDF5 file, initially containing 3 testing examples to validate the inference pipeline. Upon notebook submission, this file is replaced with the full hidden testing set, which contains approximately 500,000 images.
- **test-metadata.csv**: Metadata corresponding to the testing subset.

In [None]:
# Train + Valid
df = pd.read_csv(f'{BASE_PATH}/train-metadata.csv')
df = df.ffill()
display(df.head(2))

# Testing
testing_df = pd.read_csv(f'{BASE_PATH}/test-metadata.csv')
testing_df = testing_df.ffill()
display(testing_df.head(2))

In [None]:
df.columns

# ⚖️ | Handle Class Imbalance

## Sample Data

The dataset is heavily imbalanced, with far more negative samples than positive ones. To address this, the negative class is downsampled and the positive class is upsampled. To experiment with the full dataset, adjust the `pos_sample` and `neg_sample` settings in `CFG`.

In [None]:
print("Class Distribution Before Sampling (%):")
display(df.target.value_counts(normalize=True)*100)

# Sampling
negative_df = df.query("target==0").sample(frac=CFG.neg_sample, random_state=CFG.seed)  # downsample negatives
positive_df = df.query("target==1").sample(frac=CFG.pos_sample, replace=True, random_state=CFG.seed)  # upsample positives
df = pd.concat([negative_df, positive_df], axis=0).sample(frac=1.0, random_state=CFG.seed)  # shuffle rows

print("\nClass Distribution After Sampling (%):")
display(df.target.value_counts(normalize=True)*100)
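
To get a feel for how `neg_sample` and `pos_sample` change the class balance, the arithmetic can be sketched in plain Python. The class counts below are hypothetical placeholders rather than the real dataset counts, and `resampled_counts` is an illustrative helper that is not part of the notebook.

```python
def resampled_counts(n_neg, n_pos, neg_frac, pos_frac):
    """Return (negatives, positives) kept after fractional sampling.

    Follows the rounding of `DataFrame.sample(frac=...)`: round(frac * n).
    """
    return round(n_neg * neg_frac), round(n_pos * pos_frac)


# Hypothetical counts, chosen only to illustrate the effect.
neg, pos = resampled_counts(n_neg=400_000, n_pos=400, neg_frac=0.01, pos_frac=5.0)
pos_share = 100 * pos / (neg + pos)
print(f"negatives={neg}, positives={pos}, positive share={pos_share:.1f}%")
```

With these made-up counts the positive share rises from about 0.1% to roughly a third, which is why the two fractions are worth tuning together.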

## Class Weight

Even after downsampling the negative class and upsampling the positive class, there remains a significant class imbalance. To further address this imbalance during training, loss weighting will be used. This technique ensures that the model weights are updated more heavily for the positive samples, thereby reducing the bias towards the negative class. The following code computes the class weights for the loss:

In [None]:
from sklearn.utils.class_weight import compute_class_weight

# Assume df is your DataFrame and 'target' is the column with class labels
class_weights = compute_class_weight('balanced', classes=np.unique(df['target']), y=df['target'])
class_weights = dict(enumerate(class_weights))
print("Class Weights:", class_weights)
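
For intuition, the `'balanced'` heuristic weights each class by `n_samples / (n_classes * class_count)`. The NumPy sketch below reproduces that formula on toy labels; `balanced_class_weights` is an illustrative name, not a scikit-learn function.

```python
import numpy as np


def balanced_class_weights(y):
    """Weights n_samples / (n_classes * count_per_class)."""
    classes, counts = np.unique(y, return_counts=True)
    weights = len(y) / (len(classes) * counts)
    return {int(c): float(w) for c, w in zip(classes, weights)}


# Toy labels: 8 negatives and 2 positives.
y_toy = np.array([0] * 8 + [1] * 2)
toy_weights = balanced_class_weights(y_toy)
print(toy_weights)  # {0: 0.625, 1: 2.5}
```

The rarer class receives the larger weight, so each positive sample counts four times as much as a negative one in this toy case.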

# 🖼️ | Load Image Byte String

In this competition, images are provided as byte strings. The following code shows how to load them into memory. One might wonder why the JPEG images in the `train-image/` folder aren't used for training. The reason is that the testing images are not provided as JPEG files; they are only available as byte strings. Why use byte strings? They occupy significantly less memory than decoded `np.array` representations.

In [None]:
import h5py

training_validation_hdf5 = h5py.File(f"{BASE_PATH}/train-image.hdf5", 'r')
testing_hdf5 = h5py.File(f"{BASE_PATH}/test-image.hdf5", 'r')

## Check Image

Examining a sample image from the provided data is important. This step allows for a closer inspection of the image quality and content, ensuring it meets the requirements for further processing.

In [None]:
isic_id = df.isic_id.iloc[0]

# Image as Byte String
byte_string = training_validation_hdf5[isic_id][()]
print(f"Byte String: {byte_string[:20]}....")

# Convert byte string to numpy array
nparr = np.frombuffer(byte_string, np.uint8)

print("Image:")
image = cv2.imdecode(nparr, cv2.IMREAD_COLOR)[...,::-1] # reverse last axis for bgr -> rgb
plt.imshow(image);

# 🔪 | Data Split

In the following code, the data is split into `5` stratified folds; the first fold is held out for validation and the remaining folds are used for training. Note that `StratifiedGroupKFold` is used so that `patient_id`s do not overlap between the training and validation datasets. This prevents data leakage, where the model could peek at data it should not have access to.

> **Note**: Data leakage can lead to artificially high validation scores that do not reflect real-world performance.

In [None]:
from sklearn.model_selection import StratifiedGroupKFold

df = df.reset_index(drop=True) # ensure continuous index
df["fold"] = -1
sgkf = StratifiedGroupKFold(n_splits=5, shuffle=True, random_state=CFG.seed)
for i, (training_idx, validation_idx) in enumerate(sgkf.split(df, y=df.target, groups=df.patient_id)):
    df.loc[validation_idx, "fold"] = int(i)

# Use first fold for training and validation
training_df = df.query("fold!=0")
validation_df = df.query("fold==0")
print(f"# Num Train: {len(training_df)} | Num Valid: {len(validation_df)}")
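
A quick sanity check for the grouping guarantee is to confirm that no patient appears in both splits. The sketch below uses toy, hypothetical patient IDs; `shared_groups` is an illustrative helper.

```python
def shared_groups(train_groups, valid_groups):
    """Return the set of group IDs that appear in both splits."""
    return set(train_groups) & set(valid_groups)


# Toy patient IDs (hypothetical values).
train_patients = ["P1", "P1", "P2", "P4"]
valid_patients = ["P3", "P3", "P5"]
overlap = shared_groups(train_patients, valid_patients)
print("Overlapping patients:", overlap)
```

In the notebook this could be called as `shared_groups(training_df.patient_id, validation_df.patient_id)`; an empty set indicates that the split keeps patients separate.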

## Class Distribution in Training

In [None]:
training_df.target.value_counts()

## Class Distribution in Validation

In [None]:
validation_df.target.value_counts()

# 📊 | Tabular Features

In this competition, alongside image data, tabular features such as age, sex, and the location of the lesion are available. Previous competitions, like [ISIC 2020](https://www.kaggle.com/c/siim-isic-melanoma-classification/overview), have demonstrated that incorporating these tabular features can significantly enhance model performance. A similar improvement is anticipated here.

The following code snippet provides a method for selecting which tabular features to include. It is encouraged to experiment with various combinations to determine the most effective set.

In [None]:
# Categorical features which will be one hot encoded
CATEGORICAL_COLUMNS = ["sex", "anatom_site_general",
            "tbp_tile_type","tbp_lv_location", ]

# Numerical features which will be normalized
NUMERIC_COLUMNS = ["age_approx", "tbp_lv_nevi_confidence", "clin_size_long_diam_mm",
           "tbp_lv_areaMM2", "tbp_lv_area_perim_ratio", "tbp_lv_color_std_mean",
           "tbp_lv_deltaLBnorm", "tbp_lv_minorAxisMM", ]

# Tabular feature columns
FEAT_COLS = CATEGORICAL_COLUMNS + NUMERIC_COLUMNS

In [None]:
print(df[FEAT_COLS])

# Text Generation

# Conditional VAE

In [None]:
# ===========================
# 1. IMPORTS
# ===========================
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense, Lambda, Concatenate
from tensorflow.keras.models import Model
from tensorflow.keras import backend as K

# ===========================
# 2. DATA PREPROCESSING
# ===========================

# Assume your original df is loaded
CATEGORICAL_COLUMNS = ["sex", "anatom_site_general", "tbp_tile_type", "tbp_lv_location"]
NUMERIC_COLUMNS = ["age_approx", "tbp_lv_nevi_confidence", "clin_size_long_diam_mm",
                   "tbp_lv_areaMM2", "tbp_lv_area_perim_ratio", "tbp_lv_color_std_mean",
                   "tbp_lv_deltaLBnorm", "tbp_lv_minorAxisMM"]

df_original = df[CATEGORICAL_COLUMNS + NUMERIC_COLUMNS + ['target']]

# One-hot encode categorical data
ohe = OneHotEncoder(sparse_output=False)  # `sparse` was renamed to `sparse_output` in scikit-learn 1.2
cat_data = ohe.fit_transform(df_original[CATEGORICAL_COLUMNS])

# MinMax scale numerical data
scaler = MinMaxScaler()
num_data = scaler.fit_transform(df_original[NUMERIC_COLUMNS])

# Get target
target_data = df_original['target'].values.reshape(-1, 1)

# Combine preprocessed features
X = np.concatenate([cat_data, num_data], axis=1)

input_dim = X.shape[1]
latent_dim = 10  # Size of latent space

# ===========================
# 3. CVAE MODEL BUILDING
# ===========================

# ----- Encoder -----
features_input = Input(shape=(input_dim,))
target_input = Input(shape=(1,))  # 0 or 1

# Concatenate features + target
x = Concatenate()([features_input, target_input])
h = Dense(64, activation="relu")(x)
z_mean = Dense(latent_dim)(h)
z_log_var = Dense(latent_dim)(h)

# Sampling
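def sampling(args):
    z_mean, z_log_var = args
    # Reparameterization trick. tf.random is used here because the legacy
    # backend helper K.random_normal is not available in Keras 3
    epsilon = tf.random.normal(shape=tf.shape(z_mean))
    return z_mean + tf.exp(0.5 * z_log_var) * epsilon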

z = Lambda(sampling)([z_mean, z_log_var])

# ----- Decoder -----
z_cond = Concatenate()([z, target_input])
decoder_h = Dense(64, activation="relu")
decoder_cat_out = Dense(cat_data.shape[1], activation="softmax")  # Categorical
decoder_num_out = Dense(num_data.shape[1], activation="sigmoid")  # Numerical

h_decoded = decoder_h(z_cond)
out_cat = decoder_cat_out(h_decoded)
out_num = decoder_num_out(h_decoded)
outputs = Concatenate()([out_cat, out_num])

# Full CVAE model
cvae = Model([features_input, target_input], outputs)

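# Loss: reconstruction (MSE) only. No KL-divergence term is added, so the
# latent space is not explicitly regularized towards the N(0, I) prior
cvae.compile(optimizer="adam", loss="mse")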

# Summary
cvae.summary()

# ===========================
# 4. TRAIN THE CVAE
# ===========================

cvae.fit([X, target_data], X,
         epochs=50,
         batch_size=32,
         validation_split=0.2,
         verbose=1)

# ===========================
# 5. BUILD DECODER SEPARATELY
# ===========================

latent_input = Input(shape=(latent_dim,))
target_input_decoder = Input(shape=(1,))

z_cond_dec = Concatenate()([latent_input, target_input_decoder])
h_dec = decoder_h(z_cond_dec)
out_cat_dec = decoder_cat_out(h_dec)
out_num_dec = decoder_num_out(h_dec)
decoded_output = Concatenate()([out_cat_dec, out_num_dec])

decoder = Model([latent_input, target_input_decoder], decoded_output)

# ===========================
# 6. GENERATE SYNTHETIC DATA
# ===========================

# Generate for benign and malignant
num_samples = 1000  # Total synthetic samples

# Half benign, half malignant
z_sample_benign = np.random.normal(size=(num_samples // 2, latent_dim))
z_sample_malignant = np.random.normal(size=(num_samples // 2, latent_dim))

target_benign = np.zeros((num_samples // 2, 1))
target_malignant = np.ones((num_samples // 2, 1))

# Decode
generated_benign = decoder.predict([z_sample_benign, target_benign])
generated_malignant = decoder.predict([z_sample_malignant, target_malignant])

# Combine
generated = np.vstack([generated_benign, generated_malignant])
targets_generated = np.vstack([target_benign, target_malignant]).flatten()

print("\nGenerated synthetic feature shape:", generated.shape)

# ===========================
# 7. POSTPROCESS GENERATED DATA
# ===========================

# Separate categorical and numerical parts
generated_cat = generated[:, :cat_data.shape[1]]
generated_num = generated[:, cat_data.shape[1]:]

# Harden categorical outputs correctly (group-wise)
generated_cat_hardened = np.zeros_like(generated_cat)
category_sizes = [len(c) for c in ohe.categories_]

start_idx = 0
for size in category_sizes:
    end_idx = start_idx + size
    
    group = generated_cat[:, start_idx:end_idx]
    group_hardened = np.zeros_like(group)
    max_indices = np.argmax(group, axis=1)
    group_hardened[np.arange(len(group)), max_indices] = 1
    
    generated_cat_hardened[:, start_idx:end_idx] = group_hardened
    
    start_idx = end_idx

# Inverse transforms
original_cat_labels = ohe.inverse_transform(generated_cat_hardened)
original_num_data = scaler.inverse_transform(generated_num)

# ===========================
# 8. COMBINE INTO FINAL SYNTHETIC DATAFRAME
# ===========================

df_cat = pd.DataFrame(original_cat_labels, columns=CATEGORICAL_COLUMNS)
df_num = pd.DataFrame(original_num_data, columns=NUMERIC_COLUMNS)
df_target = pd.DataFrame(targets_generated, columns=["target"])

synthetic_df = pd.concat([df_cat, df_num, df_target], axis=1)

# Save final synthetic dataset
synthetic_df.to_csv("synthetic_data_cvae_final.csv", index=False)

print("\n✅ Final synthetic dataset created successfully: 'synthetic_data_cvae_final.csv'")
print(synthetic_df.head())
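
The cell above trains with a reconstruction (MSE) loss only. A standard VAE objective also adds a KL-divergence term that pulls the encoder's latent distribution towards the unit Gaussian prior that the generation step samples from. The NumPy sketch below shows the reparameterization trick and that KL term outside of Keras; `reparameterize` and `kl_to_standard_normal` are illustrative names, not functions used by the notebook.

```python
import numpy as np


def reparameterize(z_mean, z_log_var, rng):
    """Sample z = mean + std * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(z_mean.shape)
    return z_mean + np.exp(0.5 * z_log_var) * eps


def kl_to_standard_normal(z_mean, z_log_var):
    """Per-sample KL( N(mean, var) || N(0, I) ), summed over latent dims."""
    return -0.5 * np.sum(1.0 + z_log_var - z_mean**2 - np.exp(z_log_var), axis=1)


rng = np.random.default_rng(42)
z_mean = np.zeros((4, 10))
z_log_var = np.zeros((4, 10))
z = reparameterize(z_mean, z_log_var, rng)
kl = kl_to_standard_normal(z_mean, z_log_var)
print(z.shape, kl)
```

When the encoder already outputs a zero mean and unit variance, the KL term is zero; any deviation from the prior is penalized.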

# 🍚 | DataLoader

This DataLoader is designed to process both `images` and tabular `features` simultaneously as inputs. It applies augmentations like `flip` and `cutout`, with additional options available such as random brightness, contrast, zoom, and rotation. Experimentation with different augmentations is encouraged. More details on the available augmentations in KerasCV can be found [here](https://keras.io/api/keras_cv/layers/preprocessing/).

> Note: Unlike standard augmentations, these augmentations are applied to a batch, enhancing training speed and reducing CPU bottlenecks.

In [None]:
def build_augmenter():
    # Define augmentations
    aug_layers = [
        keras_cv.layers.RandomCutout(height_factor=(0.02, 0.06), width_factor=(0.02, 0.06)),
        keras_cv.layers.RandomFlip(mode="horizontal"),
    ]
    
    # Apply augmentations to random samples
    aug_layers = [keras_cv.layers.RandomApply(x, rate=0.5) for x in aug_layers]
    
    # Build augmentation layer
    augmenter = keras_cv.layers.Augmenter(aug_layers)

    # Apply augmentations
    def augment(inp, label):
        images = inp["images"]
        aug_data = {"images": images}
        aug_data = augmenter(aug_data)
        inp["images"] = aug_data["images"]
        return inp, label
    return augment


def build_decoder(with_labels=True, target_size=CFG.image_size):
    def decode_image(inp):
        # Read jpeg image
        file_bytes = inp["images"]
        image = tf.io.decode_jpeg(file_bytes)
        
        # Resize
        image = tf.image.resize(image, size=target_size, method="area")
        
        # Rescale image
        image = tf.cast(image, tf.float32)
        image /= 255.0
        
        # Reshape
        image = tf.reshape(image, [*target_size, 3])
        
        inp["images"] = image
        return inp

    def decode_label(label, num_classes):
        label = tf.cast(label, tf.float32)
        label = tf.reshape(label, [num_classes])
        return label

    def decode_with_labels(inp, label=None):
        inp = decode_image(inp)
        label = decode_label(label, CFG.num_classes)
        return (inp, label)

    return decode_with_labels if with_labels else decode_image


def build_dataset(
    isic_ids,
    hdf5,
    features,
    labels=None,
    batch_size=32,
    decode_fn=None,
    augment_fn=None,
    augment=False,
    shuffle=1024,
    cache=True,
    drop_remainder=False,
):
    if decode_fn is None:
        decode_fn = build_decoder(labels is not None)

    if augment_fn is None:
        augment_fn = build_augmenter()

    AUTO = tf.data.AUTOTUNE

    images = [None]*len(isic_ids)
    for i, isic_id in enumerate(tqdm(isic_ids, desc="Loading Images ")):
        images[i] = hdf5[isic_id][()]
        
    inp = {"images": images, "features": features}
    slices = (inp, labels) if labels is not None else inp

    ds = tf.data.Dataset.from_tensor_slices(slices)
    ds = ds.cache() if cache else ds
    ds = ds.map(decode_fn, num_parallel_calls=AUTO)
    if shuffle:
        ds = ds.shuffle(shuffle, seed=CFG.seed)
        opt = tf.data.Options()
        opt.deterministic = False
        ds = ds.with_options(opt)
    ds = ds.batch(batch_size, drop_remainder=drop_remainder)
    ds = ds.map(augment_fn, num_parallel_calls=AUTO) if augment else ds
    ds = ds.prefetch(AUTO)
    return ds
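
To illustrate what applying an augmentation to random samples of a batch means, here is a NumPy stand-in for a horizontal flip applied with probability `rate`. It is only a sketch of the idea and does not reproduce KerasCV's layers; `random_horizontal_flip` is a hypothetical helper.

```python
import numpy as np


def random_horizontal_flip(batch, rate, rng):
    """Flip each image in a (N, H, W, C) batch left-right with probability `rate`."""
    flip_mask = rng.random(batch.shape[0]) < rate
    flipped = batch[:, :, ::-1, :]
    # Broadcast the per-sample mask over the H, W, C axes.
    return np.where(flip_mask[:, None, None, None], flipped, batch)


rng = np.random.default_rng(0)
batch = np.arange(2 * 2 * 3 * 1, dtype=np.float32).reshape(2, 2, 3, 1)
always = random_horizontal_flip(batch, rate=1.0, rng=rng)
never = random_horizontal_flip(batch, rate=0.0, rng=rng)
print(always[0, 0, :, 0], never[0, 0, :, 0])
```

Because the mask is drawn once per sample and the flip is a single vectorized operation, the whole batch is processed in one pass.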

## Build Training & Validation Dataset

In the following code, **training** and **validation** data loaders will be created.

In [None]:
## Train
print("# Training:")
training_features = dict(training_df[FEAT_COLS])
training_ids = training_df.isic_id.values
training_labels = training_df.target.values
training_ds = build_dataset(training_ids, training_validation_hdf5, training_features, 
                         training_labels, batch_size=CFG.batch_size,
                         shuffle=1024, augment=True)  # shuffle expects a buffer size, not a flag

# Valid
print("# Validation:")
validation_features = dict(validation_df[FEAT_COLS])
validation_ids = validation_df.isic_id.values
validation_labels = validation_df.target.values
validation_ds = build_dataset(validation_ids, training_validation_hdf5, validation_features,
                         validation_labels, batch_size=CFG.batch_size,
                         shuffle=False, augment=False)

In [None]:
print(training_ds)

In [None]:
feature_space = keras.utils.FeatureSpace(
    features={
        # Categorical features (string-valued)
        "sex": "string_categorical",
        "anatom_site_general": "string_categorical",
        "tbp_tile_type": "string_categorical",
        "tbp_lv_location": "string_categorical",
        # Numerical features to discretize
        "age_approx": "float_discretized",
        # Numerical features to normalize
        "tbp_lv_nevi_confidence": "float_normalized",
        "clin_size_long_diam_mm": "float_normalized",
        "tbp_lv_areaMM2": "float_normalized",
        "tbp_lv_area_perim_ratio": "float_normalized",
        "tbp_lv_color_std_mean": "float_normalized",
        "tbp_lv_deltaLBnorm": "float_normalized",
        "tbp_lv_minorAxisMM": "float_normalized",
    },
    output_mode="concat",
)

## Configuring a `FeatureSpace`

`keras.utils.FeatureSpace` is used to define how each tabular feature should be preprocessed. It is passed a dictionary that maps each feature name to a string describing its type.

- **String Categorical Features**: 
  - Examples: `sex`, `anatom_site_general`
- **Numerical Features**:
  - Examples: `tbp_lv_nevi_confidence`, `clin_size_long_diam_mm`
  - Note: These features will be normalized.
- **Numerical Discrete Features**:
  - `age_approx`: This feature is discretized into a number of bins.

## Adapt Tabular Features

Before the `FeatureSpace` is used to build a model, it needs to be adapted to the training data. During `adapt()`, the `FeatureSpace` will:

- Index the set of possible values for **categorical features**.
- Compute the mean and variance for **numerical features** to normalize.
- Compute the value boundaries for the different bins for **numerical features** to discretize.

Note: `adapt()` should be called on a `tf.data.Dataset` that yields dictionaries of feature values – no labels.

In [None]:
training_ds_with_no_labels = training_ds.map(lambda x, _: x["features"])
feature_space.adapt(training_ds_with_no_labels)
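
The three kinds of statistics gathered during `adapt()` can be illustrated with plain NumPy. This is a sketch on toy, hypothetical values; the exact statistics and binning strategy that Keras uses internally may differ.

```python
import numpy as np

# Toy feature columns (hypothetical values).
sex = np.array(["male", "female", "female", "male"])
size_mm = np.array([2.0, 4.0, 6.0, 8.0])
age = np.array([30.0, 45.0, 60.0, 75.0])

# Categorical: index the set of observed values.
vocabulary = sorted(set(sex.tolist()))

# Normalized numeric: mean and variance used for (x - mean) / sqrt(var).
mean, var = size_mm.mean(), size_mm.var()
size_normalized = (size_mm - mean) / np.sqrt(var)

# Discretized numeric: bin boundaries (here, quartile edges) and bin indices.
boundaries = np.quantile(age, [0.25, 0.5, 0.75])
age_bins = np.digitize(age, boundaries)

print(vocabulary, mean, var, boundaries, age_bins)
```

These statistics come from the training data only, which is why `adapt()` is called on the training dataset and then reused unchanged for validation and testing.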

At this point, the `FeatureSpace` can be called on a dictionary of raw feature values. It will return a single concatenated vector for each sample, combining encoded features.

In the code below, note that even though $12$ raw (tabular) features are used, `FeatureSpace` produces a vector of size $71$. This is because operations such as one-hot encoding expand each categorical or discretized feature into several columns.

In [None]:
for x, _ in training_ds.take(1):
    preprocessed_x = feature_space(x["features"])
    print("preprocessed_x.shape:", preprocessed_x.shape)
    print("preprocessed_x.dtype:", preprocessed_x.dtype)
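
The growth in width can be reasoned about with simple arithmetic: each one-hot encoded feature contributes as many columns as it has categories or bins, while each normalized feature contributes one. The sizes below are hypothetical and are not meant to reproduce the exact figure above, since the real widths depend on the adapted vocabularies, bins, and any out-of-vocabulary slots.

```python
def encoded_width(categorical_sizes, discretized_bins, num_normalized):
    """Width of the concatenated vector: one-hot blocks plus one slot per normalized feature."""
    return sum(categorical_sizes) + sum(discretized_bins) + num_normalized


# Hypothetical encoded sizes, for illustration only.
width = encoded_width(
    categorical_sizes=[3, 6],  # two categorical features
    discretized_bins=[10],     # one discretized feature
    num_normalized=7,          # normalized floats contribute one value each
)
print(width)  # 26
```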

## Apply Feature Processing

Integrating feature space processing into the data pipeline before the model is crucial. This approach enables asynchronous, parallel preprocessing of data on the CPU, ensuring it is optimized before being fed into the model.

In [None]:
training_ds = training_ds.map(
    lambda x, y: ({"images": x["images"],
                   "features": feature_space(x["features"])}, y), num_parallel_calls=tf.data.AUTOTUNE)

validation_ds = validation_ds.map(
    lambda x, y: ({"images": x["images"],
                   "features": feature_space(x["features"])}, y), num_parallel_calls=tf.data.AUTOTUNE)

## Output Shape of a Batch

Verifying the shape of a batch sample is essential. This step ensures that the dataloader is generating inputs with the correct dimensions, which is critical for the model's performance.

In [None]:
batch = next(iter(validation_ds))

print("Images:",batch[0]["images"].shape)
print("Features:", batch[0]["features"].shape)
print("Targets:", batch[1].shape)

## Dataset Check

Visualizing samples along with their associated labels from the dataset is a vital step. This process helps in understanding the data distribution and ensures that the dataset is correctly labeled.

# 🔍 | Loss & Metric

This competition utilizes the Partial Area Under the Curve (pAUC) metric for evaluation. In Keras, a similar metric is ROC AUC, which can be used for approximate evaluation. For optimizing our model, binary cross entropy (BCE) loss from Keras will be used.

Note that BCE loss is sensitive to class imbalance. To mitigate this, the class weights computed earlier are used. They scale the loss for the minority (positive) class more strongly, so its samples contribute larger updates to the model weights.

In [None]:
# AUC
auc = keras.metrics.AUC()

# Loss
loss = keras.losses.BinaryCrossentropy(label_smoothing=0.02)
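
As a rough illustration of how label smoothing and class weights enter the loss, the NumPy sketch below computes a weighted binary cross-entropy by hand. It assumes labels are smoothed towards 0.5 and that the weighted per-sample losses are averaged over the batch; `weighted_bce` is an illustrative helper and Keras's internal reduction may differ in detail.

```python
import numpy as np


def weighted_bce(y_true, y_pred, class_weight, label_smoothing=0.0, eps=1e-7):
    """Mean binary cross-entropy with label smoothing and per-class weights."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.clip(np.asarray(y_pred, dtype=np.float64), eps, 1.0 - eps)
    # Per-sample weight looked up from the hard (unsmoothed) label.
    weights = np.where(y_true == 1, class_weight[1], class_weight[0])
    # Smooth labels towards 0.5.
    y_smooth = y_true * (1.0 - label_smoothing) + 0.5 * label_smoothing
    per_sample = -(y_smooth * np.log(y_pred) + (1.0 - y_smooth) * np.log(1.0 - y_pred))
    return float(np.mean(weights * per_sample))


loss_plain = weighted_bce([1, 0], [0.9, 0.1], {0: 1.0, 1: 1.0})
loss_weighted = weighted_bce([1, 0], [0.9, 0.1], {0: 1.0, 1: 3.0})
loss_smoothed = weighted_bce([1, 0], [0.9, 0.1], {0: 1.0, 1: 1.0}, label_smoothing=0.02)
print(loss_plain, loss_weighted, loss_smoothed)
```

Tripling the positive-class weight doubles the batch loss in this toy case, because the positive sample's error now counts three times while the negative sample's error is unchanged.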

# 🤖 | Modeling

In this notebook, we're using the `EfficientNetV2 B2` backbone from KerasCV's pretrained models to extract features from images. For processing tabular data, `Dense` layers are used. The final layer (head) is a `Dense` layer with a `sigmoid` activation function. The `sigmoid` activation is chosen because the target is binary; `softmax` would've been selected if the target was multiclass.

To explore other backbones, one can simply modify the `preset` in the `CFG` (config). A list of available pretrained backbones can be found on the [KerasCV website](https://keras.io/api/keras_cv/models/).

> **Note**: Since the size of the tabular feature vector is likely to change after feature space processing, `feature_space.get_encoded_features()` is used to determine its final size when building the model.

In [None]:
# Define input layers
image_input = keras.Input(shape=(*CFG.image_size, 3), name="images")
feat_input = keras.Input(shape=(feature_space.get_encoded_features().shape[1],), name="features")
inp = {"images":image_input, "features":feat_input}

# Branch for image input
backbone = keras_cv.models.EfficientNetV2Backbone.from_preset(CFG.preset)
x1 = backbone(image_input)
x1 = keras.layers.GlobalAveragePooling2D()(x1)
x1 = keras.layers.Dropout(0.2)(x1)

# Branch for tabular/feature input
x2 = keras.layers.Dense(96, activation="selu")(feat_input)
x2 = keras.layers.Dense(128, activation="selu")(x2)
x2 = keras.layers.Dropout(0.1)(x2)

# Concatenate both branches
concat = keras.layers.Concatenate()([x1, x2])

# Output layer
out = keras.layers.Dense(1, activation="sigmoid", dtype="float32")(concat)

# Build model
model = keras.models.Model(inp, out)

# Compile the model
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-4),
    loss=loss,
    metrics=[auc],
)

# Model Summary
model.summary()
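
The fusion step can be pictured without any deep learning framework: the two branch outputs are concatenated along the feature axis and passed through a single sigmoid unit. The NumPy sketch below uses random stand-ins for the branch outputs and hypothetical embedding sizes, so it only illustrates the shapes involved.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


rng = np.random.default_rng(0)
batch_size = 4
image_embedding = rng.standard_normal((batch_size, 16))   # stand-in for the pooled backbone output
tabular_embedding = rng.standard_normal((batch_size, 8))  # stand-in for the Dense branch output

# Concatenate both branches, then apply a one-unit sigmoid head.
fused = np.concatenate([image_embedding, tabular_embedding], axis=1)
head_weights = rng.standard_normal((fused.shape[1], 1)) * 0.1
head_bias = np.zeros(1)
probabilities = sigmoid(fused @ head_weights + head_bias)
print(fused.shape, probabilities.shape)
```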

## Plot Model

Because the model is multi-input, it can be hard to see what is going on inside the architecture. `plot_model` from **Keras** is handy here: it draws the overall architecture, making it easier to design or recheck.

In [None]:
keras.utils.plot_model(model, show_shapes=True, show_layer_names=True, dpi=60)

# ⚓ | LR Schedule

A well-structured learning rate schedule is essential for efficient model training, ensuring optimal convergence and avoiding issues such as overshooting or stagnation.

In [None]:
import math

def get_lr_callback(batch_size=8, mode='cos', epochs=10, plot=False):
    lr_start, lr_max, lr_min = 2.5e-5, 5e-6 * batch_size, 0.8e-5
    lr_ramp_ep, lr_sus_ep, lr_decay = 3, 0, 0.75

    def lrfn(epoch):  # Learning rate update function
        if epoch < lr_ramp_ep: lr = (lr_max - lr_start) / lr_ramp_ep * epoch + lr_start
        elif epoch < lr_ramp_ep + lr_sus_ep: lr = lr_max
        elif mode == 'exp': lr = (lr_max - lr_min) * lr_decay**(epoch - lr_ramp_ep - lr_sus_ep) + lr_min
        elif mode == 'step': lr = lr_max * lr_decay**((epoch - lr_ramp_ep - lr_sus_ep) // 2)
        elif mode == 'cos':
            decay_total_epochs, decay_epoch_index = epochs - lr_ramp_ep - lr_sus_ep + 3, epoch - lr_ramp_ep - lr_sus_ep
            phase = math.pi * decay_epoch_index / decay_total_epochs
            lr = (lr_max - lr_min) * 0.5 * (1 + math.cos(phase)) + lr_min
        return lr

    if plot:  # Plot lr curve if plot is True
        plt.figure(figsize=(10, 5))
        plt.plot(np.arange(epochs), [lrfn(epoch) for epoch in np.arange(epochs)], marker='o')
        plt.xlabel('epoch'); plt.ylabel('lr')
        
        plt.title('LR Scheduler')
        plt.show()

    return keras.callbacks.LearningRateScheduler(lrfn, verbose=False)  # Create lr callback
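
To inspect the schedule without Keras or Matplotlib, the warm-up and decay logic can be evaluated directly. The sketch below copies the constants from the cell above but, for brevity, covers only the `exp` and `step` modes and omits the sustain phase (which is zero epochs long there); `lr_at_epoch` is an illustrative name.

```python
def lr_at_epoch(epoch, batch_size=8, mode="exp"):
    """Linear warm-up followed by exponential or step decay."""
    lr_start, lr_max, lr_min = 2.5e-5, 5e-6 * batch_size, 0.8e-5
    ramp_epochs, decay = 3, 0.75
    if epoch < ramp_epochs:
        return (lr_max - lr_start) / ramp_epochs * epoch + lr_start
    if mode == "exp":
        return (lr_max - lr_min) * decay ** (epoch - ramp_epochs) + lr_min
    if mode == "step":
        return lr_max * decay ** ((epoch - ramp_epochs) // 2)
    raise ValueError(f"unsupported mode: {mode}")


schedule = [lr_at_epoch(e) for e in range(6)]
for epoch, lr in enumerate(schedule):
    print(f"epoch {epoch}: lr = {lr:.2e}")
```

With a batch size of 8, the rate climbs from 2.5e-5 to its peak of 4e-5 at epoch 3 and then decays towards the 8e-6 floor.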

In [None]:
inputs, targets = next(iter(training_ds))
images = inputs["images"]
num_images, num_cols = 8, 4  # separate name so the NUMERIC_COLUMNS feature list is not overwritten

plt.figure(figsize=(4 * num_cols, num_images // num_cols * 4))
for i, (image, target) in enumerate(zip(images[:num_images], targets[:num_images])):
    plt.subplot(num_images // num_cols, num_cols, i + 1)
    image = image.numpy().astype("float32")
    target= target.numpy().astype("int32")[0]
    
    image = (image - image.min()) / (image.max() + 1e-4)

    plt.imshow(image)
    plt.title(f"Target: {target}")
    plt.axis("off")

plt.tight_layout()
plt.show()

In [None]:
lr_cb = get_lr_callback(CFG.batch_size, mode="exp", plot=True)

# 💾 | Model Checkpoint

The `ModelCheckpoint` callback is used to save the model during training. It monitors the performance of the model on the validation data and saves the model with the best performance based on a specified metric.

The following setup ensures that the model with the highest validation AUC (`val_auc`) during training is saved to the file `best_model.keras`. Only the best model is kept, so it is never overwritten by a worse-performing one.

In [None]:
ckpt_cb = keras.callbacks.ModelCheckpoint(
    "best_model.keras",   # Filepath where the model will be saved.
    monitor="val_auc",    # Metric to monitor (validation AUC in this case).
    save_best_only=True,  # Save only the model with the best performance.
    save_weights_only=False,  # Save the entire model (not just the weights).
    mode="max",           # The model with the maximum 'val_auc' will be saved.
)
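
The save-best-only policy boils down to remembering the best value seen so far and writing a checkpoint only when it improves. A minimal pure-Python sketch, using hypothetical validation AUC values and an illustrative helper name:

```python
def epochs_that_save(val_metric_per_epoch, mode="max"):
    """Return the 1-based epochs at which a save-best-only policy would write a checkpoint."""
    best = None
    saved = []
    for epoch, value in enumerate(val_metric_per_epoch, start=1):
        improved = best is None or (value > best if mode == "max" else value < best)
        if improved:
            best = value
            saved.append(epoch)
    return saved


# Hypothetical validation AUC values.
saved_epochs = epochs_that_save([0.80, 0.85, 0.83, 0.90, 0.90])
print(saved_epochs)  # [1, 2, 4]
```

A tie with the current best does not trigger a new save in this sketch, so the earliest epoch that reached the best score is the one kept.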

# 🚂 | Training

The following cell trains the prepared model. Note that `class_weight` is used during training.

In [None]:
history = model.fit(
    training_ds,
    epochs=CFG.epochs,
    callbacks=[lr_cb, ckpt_cb],
    validation_data=validation_ds,
    verbose=CFG.verbose,
    class_weight=class_weights,
)

## Visualize AUC with Epochs

In [None]:
# Extract AUC and validation AUC from history
auc = history.history['auc']
val_auc = history.history['val_auc']
epochs = range(1, len(auc) + 1)

# Find the epoch with the maximum val_auc
max_val_auc_epoch = np.argmax(val_auc)
max_val_auc = val_auc[max_val_auc_epoch]

# Plotting
plt.figure(figsize=(10, 6))
plt.plot(epochs, auc, 'o-', label='Training AUC', markersize=5, color='tab:blue')
plt.plot(epochs, val_auc, 's-', label='Validation AUC', markersize=5, color='tab:orange')

# Highlight the max val_auc
plt.scatter(max_val_auc_epoch + 1, max_val_auc, color='red', s=100, label=f'Max Val AUC: {max_val_auc:.4f}')
plt.annotate(f'Max Val AUC: {max_val_auc:.4f}', 
             xy=(max_val_auc_epoch + 1, max_val_auc), 
             xytext=(max_val_auc_epoch + 1 + 0.5, max_val_auc - 0.05),
             arrowprops=dict(facecolor='black', arrowstyle="->"),
             fontsize=12,
             color='tab:red')

# Enhancing the plot
plt.title('AUC over Epochs', fontsize=14)
plt.xlabel('Epoch', fontsize=12)
plt.ylabel('AUC', fontsize=12)
plt.legend(loc='lower right', fontsize=12)
plt.grid(True)
plt.xticks(epochs)

# Show the plot
plt.show()

# 📋 | Result

In [None]:
# Best Result
best_score = max(history.history['val_auc'])
best_epoch = np.argmax(history.history['val_auc']) + 1
print("#" * 10 + " Result " + "#" * 10)
print(f"Best AUC: {best_score:.5f}")
print(f"Best Epoch: {best_epoch}")
print("#" * 28)

# 🧪 | Prediction

## Load Best Model

In [None]:
model.load_weights("best_model.keras")

## Build Testing Dataset

Remember to apply the same feature preprocessing to the testing data as well.

In [None]:
# Testing
print("# Testing:")
testing_features = dict(testing_df[FEAT_COLS])
testing_ids = testing_df.isic_id.values
testing_ds = build_dataset(testing_ids, testing_hdf5,
                        testing_features, batch_size=CFG.batch_size,
                         shuffle=False, augment=False, cache=False)
# Apply feature space processing
testing_ds = testing_ds.map(
    lambda x: {"images": x["images"],
               "features": feature_space(x["features"])}, num_parallel_calls=tf.data.AUTOTUNE)

## Inference

In [None]:
preds = model.predict(testing_ds).squeeze()
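
Predictions are typically written to a CSV file for submission. The sketch below assumes a two-column layout with `isic_id` and `target` headers; both the column names and the `write_submission` helper are assumptions for illustration and should be checked against the competition's sample submission.

```python
import csv
import io


def write_submission(isic_ids, predictions, file_obj):
    """Write one `isic_id,target` row per test image."""
    writer = csv.writer(file_obj, lineterminator="\n")
    writer.writerow(["isic_id", "target"])
    for isic_id, pred in zip(isic_ids, predictions):
        writer.writerow([isic_id, float(pred)])


# Hypothetical IDs and scores; in the notebook these would be `testing_ids` and `preds`.
buffer = io.StringIO()
write_submission(["ISIC_0000001", "ISIC_0000002"], [0.12, 0.87], buffer)
print(buffer.getvalue())
```

To write an actual file, the same helper can be given an open file handle, for example `open("submission.csv", "w", newline="")`, where the file name is again an assumption.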

## Check Prediction

In [None]:
inputs = next(iter(testing_ds))
images = inputs["images"]

# Plotting
plt.figure(figsize=(10, 4))

for i in range(3):
    plt.subplot(1, 3, i+1)  # 1 row, 3 columns, i+1th subplot
    plt.imshow(images[i])  # Show image
    plt.title(f'Prediction: {preds[i]:.2f}')  # Set title with prediction
    plt.axis('off')  # Hide axis

plt.suptitle('Model Predictions on Testing Images', fontsize=16)
plt.tight_layout()
plt.show()

## EfficientNetB0

In [None]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.applications import EfficientNetB0
from tensorflow.keras.metrics import Precision, Recall, AUC, Accuracy  # Import the necessary metrics

# Define input layers for images and tabular metadata
image_input = keras.Input(shape=(128, 128, 3), name="images")
feat_input = keras.Input(shape=(feature_space.get_encoded_features().shape[1],), name="features")
inp = {"images": image_input, "features": feat_input}

# Branch for image input using EfficientNet backbone
backbone = EfficientNetB0(weights=None, include_top=False, input_shape=(128, 128, 3))
x1 = backbone(image_input)
x1 = keras.layers.GlobalAveragePooling2D()(x1)
x1 = keras.layers.BatchNormalization()(x1)  # Adding Batch Normalization

# Branch for tabular/feature input
x2 = keras.layers.Dense(128, activation="relu")(feat_input)  # Increased units and changed activation
x2 = keras.layers.Dense(256, activation="relu")(x2)  # Increased units
x2 = keras.layers.BatchNormalization()(x2)  # Adding Batch Normalization

# Concatenate both branches
concat = keras.layers.Concatenate()([x1, x2])

# Output layer for binary classification (benign vs malignant)
out = keras.layers.Dense(1, activation="sigmoid", dtype="float32")(concat)

# Build the multi-modal model
model = keras.models.Model(inputs=inp, outputs=out)

# Compile the model
auc = AUC(name="auc")
precision = Precision(name="precision")  # Precision at the default 0.5 threshold
recall = Recall(name="recall")  # Recall at the default 0.5 threshold
loss = keras.losses.BinaryCrossentropy(from_logits=False)

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-4),
    loss=loss,
    metrics=[auc, precision, recall],  # Track AUC, precision, and recall
)

# Model summary
model.summary()

# Training the model
history = model.fit(
    training_ds,
    epochs=30,
    callbacks=[lr_cb, ckpt_cb],
    validation_data=validation_ds,
    verbose=CFG.verbose,
    class_weight=class_weights,
)

In [None]:
# Step 1: Find the epoch with the best validation AUC (zero-based index)
best_epoch = int(np.argmax(history.history['val_auc']))
best_val_auc = history.history['val_auc'][best_epoch]  # Best AUC at that epoch

# Step 2: Extract the training precision and recall logged at that epoch
best_precision = history.history['precision'][best_epoch]
best_recall = history.history['recall'][best_epoch]

# Step 3: Print the results
print(f"Best Precision: {best_precision:.5f}")
print(f"Best Recall: {best_recall:.5f}")
print(f"Best AUC: {best_val_auc:.5f}")

## ResNet50

In [None]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.metrics import Precision, Recall, AUC

# Define input layers for images and tabular metadata
image_input = keras.Input(shape=(128, 128, 3), name="images")
feat_input = keras.Input(shape=(feature_space.get_encoded_features().shape[1],), name="features")
inp = {"images": image_input, "features": feat_input}

# Branch for image input using ResNet50
backbone = ResNet50(weights=None, include_top=False, input_shape=(128, 128, 3))
x1 = backbone(image_input)
x1 = keras.layers.GlobalAveragePooling2D()(x1)
x1 = keras.layers.BatchNormalization()(x1)
x1 = keras.layers.Dropout(0.3)(x1)

# Branch for tabular/feature input
x2 = keras.layers.Dense(128, activation="relu")(feat_input)
x2 = keras.layers.BatchNormalization()(x2)
x2 = keras.layers.Dense(64, activation="relu")(x2)
x2 = keras.layers.Dropout(0.2)(x2)

# Concatenate both branches
concat = keras.layers.Concatenate()([x1, x2])

# Additional Dense layers after concatenation for deeper learning
x = keras.layers.Dense(64, activation="relu")(concat)
x = keras.layers.BatchNormalization()(x)
out = keras.layers.Dense(1, activation="sigmoid", dtype="float32")(x)

# Build the multi-modal model
model = keras.models.Model(inputs=inp, outputs=out)

# Compile the model with additional metrics (precision, recall, AUC)
auc = AUC(name="auc")
precision = Precision(name="precision")
recall = Recall(name="recall")
accuracy = keras.metrics.BinaryAccuracy(name="accuracy")

loss = keras.losses.BinaryCrossentropy(from_logits=False)

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-4),
    loss=loss,
    metrics=[auc, precision, recall],  # Track AUC, precision, and recall
)

# Model summary
model.summary()

# Training the model
history = model.fit(
    training_ds,
    epochs=30,
    callbacks=[lr_cb, ckpt_cb],
    validation_data=validation_ds,
    verbose=CFG.verbose,
    class_weight=class_weights,
)

In [None]:
# Step 1: Find the epoch with the best validation AUC (zero-based index)
best_epoch = int(np.argmax(history.history['val_auc']))
best_val_auc = history.history['val_auc'][best_epoch]  # Best AUC at that epoch

# Step 2: Extract the training precision and recall logged at that epoch
best_precision = history.history['precision'][best_epoch]
best_recall = history.history['recall'][best_epoch]

# Step 3: Print the results
print(f"Best Precision: {best_precision:.5f}")
print(f"Best Recall: {best_recall:.5f}")
print(f"Best AUC: {best_val_auc:.5f}")

## InceptionV3

In [None]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.applications import InceptionV3

# Define input layers for images and tabular metadata
image_input = keras.Input(shape=(128, 128, 3), name="images")
feat_input = keras.Input(shape=(feature_space.get_encoded_features().shape[1],), name="features")
inp = {"images": image_input, "features": feat_input}

# Branch for image input using InceptionV3
backbone = InceptionV3(weights=None, include_top=False, input_shape=(128, 128, 3))
x1 = backbone(image_input)
x1 = keras.layers.GlobalAveragePooling2D()(x1)
x1 = keras.layers.BatchNormalization()(x1)  # Added BatchNormalization
x1 = keras.layers.Dropout(0.3)(x1)  # Increased Dropout rate

# Branch for tabular/feature input
x2 = keras.layers.Dense(128, activation="relu")(feat_input)  # Increased the first layer size
x2 = keras.layers.BatchNormalization()(x2)  # Added BatchNormalization
x2 = keras.layers.Dense(64, activation="relu")(x2)  # Changed second layer size
x2 = keras.layers.Dropout(0.2)(x2)  # Modified Dropout rate

# Concatenate both branches
concat = keras.layers.Concatenate()([x1, x2])

# Additional Dense layers after concatenation for deeper learning
x = keras.layers.Dense(64, activation="relu")(concat)  # New dense layer
x = keras.layers.BatchNormalization()(x)  # Added BatchNormalization
x = keras.layers.Dropout(0.3)(x)  # Increased Dropout rate
out = keras.layers.Dense(1, activation="sigmoid", dtype="float32")(x)

# Build the multi-modal model
model = keras.models.Model(inputs=inp, outputs=out)

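# Compile the model
# Create fresh metric instances here instead of reusing objects from an earlier cell
auc = keras.metrics.AUC(name="auc")
precision = keras.metrics.Precision(name="precision")
recall = keras.metrics.Recall(name="recall")
loss = keras.losses.BinaryCrossentropy(from_logits=False)

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-4),
    loss=loss,
    metrics=[auc, precision, recall],  # Track AUC, precision, and recall
)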

# Model summary
model.summary()

# Training the model
history = model.fit(
    training_ds,
    epochs=30,
    callbacks=[lr_cb, ckpt_cb],
    validation_data=validation_ds,
    verbose=CFG.verbose,
    class_weight=class_weights,
)

In [None]:
# Step 1: Find the epoch with the best validation AUC (zero-based index)
best_epoch = int(np.argmax(history.history['val_auc']))
best_val_auc = history.history['val_auc'][best_epoch]  # Best AUC at that epoch

# Step 2: Extract the training precision and recall logged at that epoch
best_precision = history.history['precision'][best_epoch]
best_recall = history.history['recall'][best_epoch]

# Step 3: Print the results
print(f"Best Precision: {best_precision:.5f}")
print(f"Best Recall: {best_recall:.5f}")
print(f"Best AUC: {best_val_auc:.5f}")

## Hierarchical Multi-Scale Attention Fusion Network

## Updated one

In [None]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.applications import EfficientNetB0
from tensorflow.keras import regularizers
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
from tensorflow.keras.metrics import AUC, Precision, Recall, Accuracy

# Define input layers for images and tabular metadata
image_input = keras.Input(shape=(128, 128, 3), name="images")
feat_input = keras.Input(shape=(feature_space.get_encoded_features().shape[1],), name="features")

# ----------------- Image Branch -----------------
# EfficientNet backbone trained from scratch (weights=None, no pretrained weights)
backbone = EfficientNetB0(weights=None, include_top=False, input_shape=(128, 128, 3))
x1 = backbone(image_input)
x1 = keras.layers.GlobalAveragePooling2D()(x1)
x1 = keras.layers.BatchNormalization()(x1)
x1 = keras.layers.Dropout(0.5)(x1)  # Increased dropout in the image branch

# First latent projection for image embeddings
x1_latent = keras.layers.Dense(512, activation="relu", name="image_latent_projection")(x1)
x1_latent = keras.layers.BatchNormalization()(x1_latent)

# ----------------- Feature Branch -----------------
# Tabular Data Feature Processing with Increased Dense Layer Sizes
x2 = keras.layers.Dense(512, activation="relu", kernel_regularizer=regularizers.l2(1e-4))(feat_input)
x2 = keras.layers.BatchNormalization()(x2)
x2 = keras.layers.Dropout(0.4)(x2)  # Increased dropout in the feature branch

# First latent projection for feature embeddings
x2_latent = keras.layers.Dense(512, activation="relu", name="feature_latent_projection")(x2)
x2_latent = keras.layers.BatchNormalization()(x2_latent)

# ----------------- Latent Space Alignment -----------------
# Discriminator for Alignment with Increased Network Capacity
def make_discriminator():
    d_input = keras.Input(shape=(512,))
    d_x = keras.layers.Dense(256, activation="relu")(d_input)
    d_x = keras.layers.BatchNormalization()(d_x)
    d_x = keras.layers.Dense(128, activation="relu")(d_x)
    d_x = keras.layers.BatchNormalization()(d_x)
    d_x = keras.layers.Dense(64, activation="relu")(d_x)
    d_output = keras.layers.Dense(1, activation="sigmoid", name="discriminator_output")(d_x)
    return keras.models.Model(inputs=d_input, outputs=d_output, name="Discriminator")

discriminator = make_discriminator()

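# Latent space alignment
# Note: d_image, d_feature, and adversarial_loss are not connected to the model
# outputs or to the compiled loss below, so they have no effect on training as written
d_image = discriminator(x1_latent)
d_feature = discriminator(x2_latent)

# Loss for discriminator alignment
adversarial_loss = keras.losses.BinaryCrossentropy(from_logits=False)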

# ----------------- Feature Aggregation -----------------
# Concatenate Latent Spaces
concat = keras.layers.Concatenate()([x1_latent, x2_latent])

# Hierarchical Feature Aggregation with Larger Dense Layer
agg = keras.layers.Dense(2048, activation="relu", name="aggregated_features")(concat)
agg = keras.layers.BatchNormalization()(agg)
agg = keras.layers.Dropout(0.5)(agg)  # Increased dropout

# ----------------- Output Layer -----------------
# Final Binary Classification
out = keras.layers.Dense(1, activation="sigmoid", dtype="float32", name="output")(agg)

# Build the MedBlendNet Model
medblendnet = keras.models.Model(inputs={"images": image_input, "features": feat_input}, outputs=out)

# ----------------- Compile the Model -----------------
# Early stopping and learning rate scheduler, both monitoring validation AUC
early_stopping_cb = EarlyStopping(monitor="val_auc", patience=10, restore_best_weights=True, verbose=1, mode='max')
lr_scheduler = ReduceLROnPlateau(monitor='val_auc', factor=0.5, patience=5, verbose=1, mode='max')

# Compile the model
medblendnet.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-4),
    loss="binary_crossentropy",
    # Name the metrics explicitly so the history keys ("val_auc", ...) match the
    # callback monitors instead of auto-suffixed names such as "val_auc_2"
    metrics=[AUC(name="auc"), Precision(name="precision"), Recall(name="recall")],
)

# Model summary
medblendnet.summary()

# ----------------- Train the Model -----------------
history = medblendnet.fit(
    training_ds,
    epochs=10,
    callbacks=[lr_scheduler, early_stopping_cb, ckpt_cb],  # Add checkpoints if needed
    validation_data=validation_ds,
    verbose=CFG.verbose,
    class_weight=class_weights,
)

In [None]:
# Step 1: Find the epoch with the best validation AUC
best_epoch = int(np.argmax(history.history['val_auc']))  # Zero-based index of the best epoch

# Step 2: Extract the corresponding metrics at that epoch
best_precision = history.history['precision'][best_epoch]
best_recall = history.history['recall'][best_epoch]
best_val_auc = history.history['val_auc'][best_epoch]

# Step 3: Print the results
print(f"Best Precision at epoch {best_epoch + 1}: {best_precision:.5f}")
print(f"Best Recall at epoch {best_epoch + 1}: {best_recall:.5f}")
print(f"Best Validation AUC at epoch {best_epoch + 1}: {best_val_auc:.5f}")

In [None]:
import tensorflow as tf
import numpy as np
import time
from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2

def benchmark_model(model, model_name="Model", img_shape=(128, 128, 3), tabular_shape=(10,), runs=100):
    print(f"\n📊 Benchmarking: {model_name}")
    print("-" * 50)

    # ------------------ Parameters ------------------
    total_params = model.count_params()
    print(f"🔢 Total Parameters: {total_params / 1e6:.2f} Million")

    # ------------------ FLOPs ------------------
    try:
        func = tf.function(lambda images, features: model({"images": images, "features": features}))
        concrete_func = func.get_concrete_function(
            images=tf.TensorSpec([1, *img_shape], tf.float32),
            features=tf.TensorSpec([1, *tabular_shape], tf.float32)
        )

        frozen_func = convert_variables_to_constants_v2(concrete_func)
        graph_def = frozen_func.graph.as_graph_def()

        with tf.Graph().as_default() as graph:
            tf.import_graph_def(graph_def, name='')
            run_meta = tf.compat.v1.RunMetadata()
            opts = tf.compat.v1.profiler.ProfileOptionBuilder.float_operation()
            flops = tf.compat.v1.profiler.profile(graph=graph, run_meta=run_meta, options=opts)
            print(f"⚙️ FLOPs: {flops.total_float_ops / 1e9:.4f} GFLOPs")
    except Exception as e:
        print(f"⚠️ FLOPs calculation failed: {e}")

    # ------------------ Inference Time ------------------
    try:
        img_input = np.random.rand(1, *img_shape).astype(np.float32)
        tab_input = np.random.rand(1, *tabular_shape).astype(np.float32)

        # Warm-up
        for _ in range(5):
            _ = model.predict({"images": img_input, "features": tab_input}, verbose=0)

        # Timed runs
        start = time.time()
        for _ in range(runs):
            _ = model.predict({"images": img_input, "features": tab_input}, verbose=0)
        end = time.time()

        avg_time = (end - start) / runs
        print(f"⏱️ Avg Inference Time (over {runs} runs): {avg_time:.6f} seconds")
    except Exception as e:
        print(f"⚠️ Inference timing failed: {e}")

    print("-" * 50)

In [None]:
# Benchmark MedBlendNet
benchmark_model(medblendnet, model_name="MedBlendNet", img_shape=(128,128,3), tabular_shape=(feature_space.get_encoded_features().shape[1],))
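The timing part of the benchmark follows a common pattern: a few warm-up calls, then an average over many timed calls. A minimal, framework-free sketch of that pattern is shown below. `time_callable` is a hypothetical helper, and the lambda is a dummy workload standing in for a model forward pass. It uses `time.perf_counter`, a monotonic high-resolution clock, rather than `time.time`.

```python
import time


def time_callable(fn, runs=100, warmup=5):
    """Average wall-clock seconds per call of `fn` after a short warm-up."""
    for _ in range(warmup):
        fn()  # warm-up calls are not timed
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs


# Dummy workload standing in for a model forward pass
avg = time_callable(lambda: sum(i * i for i in range(1000)), runs=20)
print(f"Avg time per call: {avg:.6f} s")
```

For single-sample latency, the Keras documentation for `predict` recommends calling the model directly on small inputs, so numbers measured through `model.predict` likely include per-call overhead on top of the forward pass.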

## CMMANN

In [None]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, regularizers
from tensorflow.keras.applications import EfficientNetB4
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.callbacks import LearningRateScheduler, EarlyStopping, ReduceLROnPlateau

# Memory-Augmented Network
class MemoryAugmentedNetwork(keras.layers.Layer):
    def __init__(self, memory_size=512, memory_dim=256, temperature=0.1, **kwargs):
        super(MemoryAugmentedNetwork, self).__init__(**kwargs)
        self.memory_size = memory_size
        self.memory_dim = memory_dim
        self.temperature = temperature
        self.memory = self.add_weight(
            name="memory",
            shape=(memory_size, memory_dim),
            initializer="glorot_uniform",
            trainable=True,
        )
        
    def call(self, inputs):
        # Scaled dot-product attention with temperature
        similarity = tf.matmul(inputs, self.memory, transpose_b=True) / self.temperature
        attention_weights = tf.nn.softmax(similarity, axis=-1)
        memory_read = tf.matmul(attention_weights, self.memory)
        # Residual connection
        return memory_read + inputs

# Image and Tabular Data Inputs
image_input = keras.Input(shape=(128, 128, 3), name="images")
feat_input = keras.Input(shape=(71,), name="features")

# ----------------- Image Branch -----------------
# Using EfficientNetB4 for better feature extraction without pre-trained weights
backbone = EfficientNetB4(weights=None, include_top=False, input_shape=(128, 128, 3))

# Freeze the first 100 backbone layers (with weights=None they stay at their random initialization)
for layer in backbone.layers[:100]:
    layer.trainable = False

x1 = backbone(image_input)
x1 = layers.GlobalAveragePooling2D()(x1)
x1 = layers.BatchNormalization()(x1)
x1 = layers.Dense(512, activation="relu")(x1)
x1 = layers.Dropout(0.5)(x1)  # Increased dropout

# Latent Representation for Image Features
x1_latent = layers.Dense(512, activation="relu", 
                        kernel_regularizer=regularizers.l2(1e-4),
                        name="image_latent_projection")(x1)

# ----------------- Feature Branch -----------------
# Enhanced Tabular Data Processing
x2 = layers.Dense(256, activation="relu")(feat_input)
x2 = layers.BatchNormalization()(x2)
x2 = layers.Dense(512, activation="relu")(x2)
x2 = layers.BatchNormalization()(x2)
x2 = layers.Dropout(0.5)(x2)  # Increased dropout

# Latent Representation for Tabular Features
x2_latent = layers.Dense(512, activation="relu",
                        kernel_regularizer=regularizers.l2(1e-4),
                        name="feature_latent_projection")(x2)

# ----------------- Memory-Augmented Module -----------------
memory_module = MemoryAugmentedNetwork(memory_size=512, memory_dim=256, temperature=0.1)
x1_latent_projected = layers.Dense(256, activation="relu")(x1_latent)
x2_latent_projected = layers.Dense(256, activation="relu")(x2_latent)

x1_mem = memory_module(x1_latent_projected)
x2_mem = memory_module(x2_latent_projected)

# ----------------- Enhanced Contrastive Loss -----------------
def improved_contrastive_loss(y_true, y_pred, margin=1.5, alpha=0.6):
    y_true = tf.cast(y_true, y_pred.dtype)
    bce = tf.keras.losses.binary_crossentropy(y_true, y_pred)  # Shape: (batch,)

    # Margin-based component computed on the squared prediction error
    pos_pair_distance = tf.reduce_sum(tf.square(y_pred - y_true), axis=-1)
    neg_pair_distance = tf.maximum(margin - pos_pair_distance, 0.0)

    # Focal loss component; reduce the weight over the last axis so it has
    # shape (batch,) like bce and does not broadcast to (batch, batch)
    gamma = 2.0
    focal_weight = tf.pow(1. - y_pred, gamma) * y_true + tf.pow(y_pred, gamma) * (1. - y_true)
    focal = tf.reduce_mean(focal_weight, axis=-1) * bce

    return alpha * (pos_pair_distance + neg_pair_distance) + (1 - alpha) * focal

# ----------------- Feature Aggregation -----------------
concat = layers.Concatenate(axis=-1)([x1_mem, x2_mem])

# Enhanced Cross-Modal Feature Aggregation
agg = layers.Dense(1024, activation="relu", kernel_regularizer=regularizers.l2(1e-4))(concat)
agg = layers.BatchNormalization()(agg)
agg = layers.Dropout(0.5)(agg)  # Increased dropout
agg = layers.Dense(512, activation="relu")(agg)
agg = layers.BatchNormalization()(agg)
agg = layers.Dropout(0.5)(agg)  # Increased dropout

# ----------------- Output Layer -----------------
out = layers.Dense(1, activation="sigmoid", dtype="float32")(agg)

# Build Model
cmmann = keras.models.Model(inputs={"images": image_input, "features": feat_input}, outputs=out)

# Compile with named metrics so the history keys match the callback monitors
cmmann.compile(
    optimizer=keras.optimizers.Adam(learning_rate=5e-5),
    loss=improved_contrastive_loss,
    metrics=[
        keras.metrics.AUC(name="auc"),
        keras.metrics.Precision(name="precision"),
        keras.metrics.Recall(name="recall"),
    ],
)

# ----------------- Enhanced Callbacks -----------------
reduce_lr = ReduceLROnPlateau(
    monitor='val_auc',
    factor=0.4,  # Lower factor
    patience=6,  # Adjusted patience
    min_lr=1e-6,
    mode='max'
)

early_stopping = EarlyStopping(
    monitor='val_auc',
    patience=20,  # Increased patience
    restore_best_weights=True,
    mode='max'
)

# ----------------- Enhanced Data Augmentation -----------------
# Note: train_datagen is defined here but is not passed to fit() below
train_datagen = ImageDataGenerator(
    rotation_range=30,
    width_shift_range=0.3,
    height_shift_range=0.3,
    shear_range=0.3,
    zoom_range=0.3,
    horizontal_flip=True,
    vertical_flip=False,
    fill_mode='nearest',
    brightness_range=[0.7, 1.3],
    channel_shift_range=50.0
)
keras.utils.plot_model(cmmann, show_shapes=True, show_layer_names=True, dpi=60)

# ----------------- Training -----------------
history = cmmann.fit(
    training_ds,
    epochs=20,  # Increased epochs
    callbacks=[early_stopping, reduce_lr],
    validation_data=validation_ds,
    verbose=CFG.verbose,
    class_weight=class_weights
)

In [None]:
# Step 1: Find the epoch with the best validation AUC
best_epoch = int(np.argmax(history.history['val_auc']))  # Zero-based index of the best epoch

# Step 2: Extract the corresponding metrics at that epoch
best_precision = history.history['precision'][best_epoch]
best_recall = history.history['recall'][best_epoch]
best_val_auc = history.history['val_auc'][best_epoch]

# Step 3: Print the results
print(f"Best Precision at epoch {best_epoch + 1}: {best_precision:.5f}")
print(f"Best Recall at epoch {best_epoch + 1}: {best_recall:.5f}")
print(f"Best Validation AUC at epoch {best_epoch + 1}: {best_val_auc:.5f}")
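The memory read in `MemoryAugmentedNetwork` is a temperature-scaled softmax over memory slots followed by a residual addition. The sketch below reproduces that computation for a single query in plain Python so the effect of the temperature is easy to inspect. The helper names and the two-slot toy memory are illustrative assumptions, not part of the model above.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def memory_read(query, memory, temperature=0.1):
    """Attend over memory rows and add the read vector back to the query."""
    # Dot-product similarity between the query and each slot, divided by temperature
    scores = [sum(q * m for q, m in zip(query, row)) / temperature for row in memory]
    weights = softmax(scores)
    # Weighted sum of memory rows
    read = [sum(w * row[j] for w, row in zip(weights, memory)) for j in range(len(query))]
    # Residual connection
    return weights, [r + q for r, q in zip(read, query)]


memory = [[1.0, 0.0], [0.0, 1.0]]  # two slots of dimension 2 (toy values)
weights, out = memory_read([1.0, 0.0], memory)
print(weights)  # the first slot receives almost all of the attention
print(out)
```

With a temperature of 0.1 the scores are multiplied by 10 before the softmax, so the read behaves close to a hard lookup of the best-matching slot; a larger temperature would blend slots more evenly.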

## Attention Model

In [None]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.applications import EfficientNetB0
from tensorflow.keras.layers import Attention, Add, MultiHeadAttention, Dense, Flatten, BatchNormalization, Concatenate, Dropout, GlobalAveragePooling2D, Reshape
from tensorflow.keras.metrics import Precision, Recall, AUC, Accuracy
from tensorflow.keras import regularizers

# Define input layers for images and tabular metadata
image_input = keras.Input(shape=(128, 128, 3), name="images")
feat_input = keras.Input(shape=(feature_space.get_encoded_features().shape[1],), name="features")

# EfficientNetB0 Backbone with Batch Normalization
backbone = EfficientNetB0(weights=None, include_top=False, input_shape=(128, 128, 3))
x1 = backbone(image_input)
x1 = GlobalAveragePooling2D()(x1)
x1 = BatchNormalization()(x1)  # Adding Batch Normalization
x1 = Dropout(0.3)(x1)  # Dropout to prevent overfitting

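# Attention mechanism (the sequence length is 1 here, so every attention weight is 1
# and the block acts as a learned projection with a residual connection)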
x1_reshaped = Reshape((1, 1280))(x1)
x1_attention = MultiHeadAttention(num_heads=8, key_dim=128)(x1_reshaped, x1_reshaped)
x1 = Add()([x1_reshaped, x1_attention])
x1 = BatchNormalization()(x1)  # Batch normalization after attention
x1 = Flatten()(x1)

# Tabular branch: two dense layers with L2 regularization
x2 = Dense(128, activation="relu", kernel_regularizer=regularizers.l2(0.01))(feat_input)
x2 = Dense(256, activation="relu", kernel_regularizer=regularizers.l2(0.01))(x2)
x2 = BatchNormalization()(x2)
x2 = Dropout(0.3)(x2)  # Dropout for regularization

# Attention mechanism for tabular features
x2_reshaped = Reshape((1, 256))(x2)
x2_attention = MultiHeadAttention(num_heads=8, key_dim=128)(x2_reshaped, x2_reshaped)
x2 = Add()([x2_reshaped, x2_attention])
x2 = BatchNormalization()(x2)  # Batch normalization after attention
x2 = Flatten()(x2)

# Combine Image and Tabular Branches
concat = Concatenate()([x1, x2])

# Output Layer
out = Dense(1, activation="sigmoid")(concat)

# Build Model (dict inputs keyed by name, matching the dataset structure)
model = keras.models.Model(inputs={"images": image_input, "features": feat_input}, outputs=out)

# Compile the model with named metrics so the history keys are predictable
auc = AUC(name="auc")
precision = Precision(name="precision")
recall = Recall(name="recall")

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-4),
    loss='binary_crossentropy',
    metrics=[auc, precision, recall],
)

# Model summary
model.summary()

keras.utils.plot_model(model, show_shapes=True, show_layer_names=True, dpi=60)

from tensorflow.keras.utils import plot_model
from IPython.display import FileLink

# Save the architecture diagram and display a download link
plot_model(model, to_file='model_architecture.png', show_shapes=True, show_layer_names=True, dpi=150)
FileLink('model_architecture.png')

In [None]:
# Assumes `history` is the result of calling model.fit(...) on the model above

# Step 1: Find the epoch with the best validation AUC
best_epoch = int(np.argmax(history.history['val_auc']))  # Zero-based index of the best epoch

# Step 2: Extract the corresponding metrics at that epoch
best_precision = history.history['precision'][best_epoch]
best_recall = history.history['recall'][best_epoch]
best_val_auc = history.history['val_auc'][best_epoch]

# Step 3: Print the results
print(f"Best Precision at epoch {best_epoch + 1}: {best_precision:.5f}")
print(f"Best Recall at epoch {best_epoch + 1}: {best_recall:.5f}")
print(f"Best Validation AUC at epoch {best_epoch + 1}: {best_val_auc:.5f}")

# **FourierFusionNet**

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import cv2
import tensorflow as tf

def visualize_fft(image):
    """Compute and visualize the FFT magnitude spectrum of a skin lesion image loaded with cv2.imread."""
    gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # cv2.imread returns BGR, so convert from BGR
    f_transform = np.fft.fft2(gray_image)  # Apply 2D FFT
    f_shift = np.fft.fftshift(f_transform)  # Shift zero frequency component to center
    magnitude_spectrum = np.log(np.abs(f_shift) + 1)  # Compute magnitude and apply log scaling
    
    # Plot the grayscale image and its FFT spectrum
    plt.figure(figsize=(10, 5))
    plt.subplot(1, 2, 1), plt.imshow(gray_image, cmap="gray")
    plt.title("Grayscale Image"), plt.axis("off")

    plt.subplot(1, 2, 2), plt.imshow(magnitude_spectrum, cmap="inferno")
    plt.title("FFT Magnitude Spectrum"), plt.axis("off")

    plt.show()

# Load example images of benign and malignant lesions
benign_image = cv2.imread("benign_example.jpg")  # Replace with actual dataset image
malignant_image = cv2.imread("malignant_example.jpg")

# Visualize FFT for both cases
visualize_fft(benign_image)
visualize_fft(malignant_image)
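To see what the centered spectrum looks like without needing image files, the same transform can be applied to a synthetic array. The sketch below assumes only NumPy; `log_magnitude_spectrum` is a hypothetical helper mirroring the FFT steps in `visualize_fft`, and the constant 8x8 image is a made-up test input. A constant image has only a zero-frequency (DC) term, so after `fftshift` the single peak sits at the center of the spectrum.

```python
import numpy as np


def log_magnitude_spectrum(gray):
    """Centered log-magnitude spectrum of a 2D grayscale array."""
    shifted = np.fft.fftshift(np.fft.fft2(gray))  # move the DC term to the center
    return np.log1p(np.abs(shifted))  # log(1 + |F|) compresses the dynamic range


# Synthetic 8x8 image: a constant patch has only a zero-frequency component
flat = np.ones((8, 8), dtype=np.float32)
spectrum = log_magnitude_spectrum(flat)
peak = np.unravel_index(np.argmax(spectrum), spectrum.shape)
print(peak)  # location of the strongest frequency component
```

Images with more texture or sharp borders spread energy away from the center toward higher frequencies, which is the kind of difference the side-by-side plots are meant to reveal.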

In [None]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.applications import EfficientNetB0
from tensorflow.keras.metrics import Precision, Recall, AUC
import tensorflow.keras.backend as K

def fourier_transform_layer(image):
    image_gray = tf.image.rgb_to_grayscale(image)
    image_gray = tf.cast(image_gray, tf.float32)
    fft = tf.signal.fft2d(tf.complex(image_gray[..., 0], tf.zeros_like(image_gray[..., 0])))
    fft_magnitude = tf.abs(fft)
    fft_magnitude = tf.math.log1p(fft_magnitude)
    fft_magnitude = tf.expand_dims(fft_magnitude, axis=-1)
    return tf.image.resize(fft_magnitude, (32, 32))

class AdaptiveFrequencyFiltering(keras.layers.Layer):
    def __init__(self):
        super(AdaptiveFrequencyFiltering, self).__init__()

    def build(self, input_shape):
        input_dim = input_shape[-1]
        self.filter_weights = self.add_weight(shape=(input_dim,), initializer="random_normal", trainable=True)
        self.bias = self.add_weight(shape=(input_dim,), initializer="zeros", trainable=True)

    def call(self, inputs):
        return inputs * tf.sigmoid(self.filter_weights) + self.bias

image_input = keras.Input(shape=(128, 128, 3), name="images")
feat_input = keras.Input(shape=(71,), name="features")

backbone = EfficientNetB0(weights="imagenet", include_top=False, input_shape=(128, 128, 3))

for layer in backbone.layers[:100]:
    layer.trainable = False

x1 = backbone(image_input)
x1 = keras.layers.GlobalAveragePooling2D()(x1)
x1 = keras.layers.BatchNormalization()(x1)

fourier_features = keras.layers.Lambda(fourier_transform_layer)(image_input)
fourier_features = keras.layers.Conv2D(16, kernel_size=3, activation="relu", padding="same")(fourier_features)
fourier_features = keras.layers.Conv2D(32, kernel_size=3, activation="relu", padding="same")(fourier_features)
fourier_features = keras.layers.GlobalAveragePooling2D()(fourier_features)
fourier_features = AdaptiveFrequencyFiltering()(fourier_features)
fourier_features = keras.layers.Dense(128, activation="relu")(fourier_features)

x2 = keras.layers.Dense(128, activation="relu")(feat_input)
x2 = keras.layers.Dense(256, activation="relu")(x2)
x2 = keras.layers.BatchNormalization()(x2)

concat = keras.layers.Concatenate()([x1, fourier_features, x2])

out = keras.layers.Dense(1, activation="sigmoid", dtype="float32")(concat)

model = keras.models.Model(inputs={"images": image_input, "features": feat_input}, outputs=out)

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-4),
    loss=keras.losses.BinaryCrossentropy(from_logits=False),
    metrics=[
        AUC(name="auc"),
        Precision(name="precision"),
        Recall(name="recall"),
    ],
)

model.summary()

history = model.fit(
    training_ds,
    epochs=10,
    callbacks=[lr_cb, ckpt_cb],
    validation_data=validation_ds,
    verbose=1,
    class_weight=class_weights,
)

In [None]:
benchmark_model(model, model_name="FFT", img_shape=(128,128,3), tabular_shape=(feature_space.get_encoded_features().shape[1],))

In [None]:
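# The History object is not subscriptable; the per-epoch logs live in history.history
best_epoch = int(np.argmax(history.history['val_auc']))

best_precision = history.history['val_precision'][best_epoch]
best_recall = history.history['val_recall'][best_epoch]
best_val_auc = history.history['val_auc'][best_epoch]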

print(f"Best Precision: {best_precision:.5f}")
print(f"Best Recall: {best_recall:.5f}")
print(f"Best Validation AUC: {best_val_auc:.5f}")

# Symbolic Reasoning Model

In [None]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.applications import EfficientNetB0
from tensorflow.keras.metrics import Precision, Recall, AUC
import tensorflow.keras.backend as K

class SymbolicReasoningLayer(keras.layers.Layer):
    def __init__(self, **kwargs):
        super(SymbolicReasoningLayer, self).__init__(**kwargs)

    def call(self, inputs):
        image_features, metadata_features = inputs
        # As wired below, metadata_features is the 256-d tabular embedding (x2),
        # so columns 0 and 1 are learned activations, not raw asymmetry/border values
        asymmetry_score = metadata_features[:, 0]
        border_irregularity = metadata_features[:, 1]
        rule_based_score = (asymmetry_score + border_irregularity) / 2
        rule_based_score = tf.expand_dims(rule_based_score, axis=-1)
        return tf.concat([image_features, rule_based_score], axis=-1)

image_input = keras.Input(shape=(128, 128, 3), name="images")
feat_input = keras.Input(shape=(71,), name="features")

backbone = EfficientNetB0(weights='imagenet', include_top=False, input_shape=(128, 128, 3))
x1 = backbone(image_input)
x1 = keras.layers.GlobalAveragePooling2D()(x1)
x1 = keras.layers.BatchNormalization()(x1)
x1 = keras.layers.Dense(256, activation="relu")(x1)

x2 = keras.layers.Dense(128, activation="relu")(feat_input)
x2 = keras.layers.Dense(256, activation="relu")(x2)
x2 = keras.layers.BatchNormalization()(x2)

reasoning_out = SymbolicReasoningLayer()([x1, x2])

out = keras.layers.Dense(1, activation="sigmoid", dtype="float32")(reasoning_out)

model = keras.models.Model(inputs={"images": image_input, "features": feat_input}, outputs=out)

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-4),
    loss=keras.losses.BinaryCrossentropy(from_logits=False),
    metrics=[AUC(name="auc"), Precision(name="precision"), Recall(name="recall")],
)

model.summary()
# Train the Model
history = model.fit(
    training_ds,
    epochs=30,
    callbacks=[lr_cb, ckpt_cb],
    validation_data=validation_ds,
    verbose=2,  
    class_weight=class_weights
)

In [None]:
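# The History object is not subscriptable; the per-epoch logs live in history.history
epoch = int(np.argmax(history.history['val_auc']))

precision = history.history['val_precision'][epoch]
recall = history.history['val_recall'][epoch]
val_auc = history.history['val_auc'][epoch]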

print(f"Precision: {precision:.5f}")
print(f"Recall: {recall:.5f}")
print(f"Validation AUC: {val_auc:.5f}")
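The rule inside `SymbolicReasoningLayer` is a plain average of two scores. A hedged sketch of the same rule applied to raw metadata values is shown below, which can help when checking what the rule would produce before any dense layers are involved. The column names `asymmetry` and `border_irregularity` and the sample values are hypothetical and are not taken from the competition data.

```python
def rule_based_score(rows, first="asymmetry", second="border_irregularity"):
    """Average two per-lesion scores, mirroring the rule in the layer above."""
    return [(row[first] + row[second]) / 2 for row in rows]


# Hypothetical, already-scaled metadata for two lesions
lesions = [
    {"asymmetry": 0.2, "border_irregularity": 0.4},
    {"asymmetry": 0.9, "border_irregularity": 0.7},
]
scores = rule_based_score(lesions)
print(scores)
```

Because the rule is a fixed average, both inputs should be on a comparable scale for the score to be meaningful.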

# Wavelet Model

In [None]:
import tensorflow as tf
import numpy as np
from tensorflow import keras

class LearnableWaveletTransform(keras.layers.Layer):
    def __init__(self, filters=32):
        super(LearnableWaveletTransform, self).__init__()
        self.filters = filters
    def build(self, input_shape):
        in_channels = input_shape[-1]
        kernel_shape = (3, 3, in_channels, self.filters)

        # 3x3 binomial (Gaussian-like) blur as the low-pass starting point.
        low_pass_init = np.array([[1, 2, 1],
                                  [2, 4, 2],
                                  [1, 2, 1]], dtype="float32") / 16.0

        # Horizontal Sobel operator as the high-pass starting point.
        high_pass_init = np.array([[-1, 0, 1],
                                   [-2, 0, 2],
                                   [-1, 0, 1]], dtype="float32")

        # Repeat each 3x3 kernel across input channels and output filters:
        # (3, 3) -> (3, 3, in_channels, filters), the layout `tf.nn.conv2d` expects.
        low_pass_init = np.tile(low_pass_init[:, :, None, None], (1, 1, in_channels, self.filters))
        high_pass_init = np.tile(high_pass_init[:, :, None, None], (1, 1, in_channels, self.filters))

        # Both kernel banks start from the fixed filters above and are then trained.
        self.low_pass_kernels = self.add_weight(
            shape=kernel_shape,
            initializer=tf.keras.initializers.Constant(low_pass_init),
            trainable=True,
            name="low_pass_kernels",
        )

        self.high_pass_kernels = self.add_weight(
            shape=kernel_shape,
            initializer=tf.keras.initializers.Constant(high_pass_init),
            trainable=True,
            name="high_pass_kernels",
        )

    def call(self, inputs):
        low_freq = tf.nn.conv2d(inputs, self.low_pass_kernels, strides=1, padding="SAME")
        high_freq = tf.nn.conv2d(inputs, self.high_pass_kernels, strides=1, padding="SAME")

        low_freq_downsampled = tf.nn.avg_pool(low_freq, ksize=2, strides=2, padding="SAME")
        high_freq_downsampled = tf.nn.avg_pool(high_freq, ksize=2, strides=2, padding="SAME")

        return tf.concat([low_freq_downsampled, high_freq_downsampled], axis=-1)

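def squeeze_excitation_block(inputs, ratio=16):
    """Reweight the channels of `inputs` with a squeeze-and-excitation gate."""
    filters = inputs.shape[-1]
    se = keras.layers.GlobalAveragePooling2D()(inputs)  # squeeze: (B, H, W, C) -> (B, C)
    se = keras.layers.Dense(filters // ratio, activation="relu")(se)
    se = keras.layers.Dense(filters, activation="sigmoid")(se)  # per-channel gate in (0, 1)
    se = keras.layers.Reshape((1, 1, filters))(se)  # make the broadcast over H and W explicit
    return keras.layers.Multiply()([inputs, se])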

image_input = keras.Input(shape=(128, 128, 3), name="images")
feat_input = keras.Input(shape=(71,), name="features")

backbone = EfficientNetB0(weights="imagenet", include_top=False, input_shape=(128, 128, 3))
for layer in backbone.layers[:100]:
    layer.trainable = False

x1 = backbone(image_input)
x1 = squeeze_excitation_block(x1)
x1 = keras.layers.GlobalAveragePooling2D()(x1)
x1 = keras.layers.BatchNormalization()(x1)

wavelet_features = LearnableWaveletTransform(filters=32)(image_input)
wavelet_features = keras.layers.Conv2D(16, kernel_size=3, activation="relu", padding="same")(wavelet_features)
wavelet_features = keras.layers.Conv2D(32, kernel_size=3, activation="relu", padding="same")(wavelet_features)
wavelet_features = keras.layers.GlobalAveragePooling2D()(wavelet_features)
wavelet_features = keras.layers.Dense(128, activation="relu")(wavelet_features)

x2 = keras.layers.Dense(128, activation="relu")(feat_input)
x2 = keras.layers.Dense(256, activation="relu")(x2)
x2 = keras.layers.BatchNormalization()(x2)

concat = keras.layers.Concatenate()([x1, wavelet_features, x2])

out = keras.layers.Dense(1, activation="sigmoid", dtype="float32")(concat)

model = keras.models.Model(inputs={"images": image_input, "features": feat_input}, outputs=out)

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-4),
    loss=keras.losses.BinaryCrossentropy(from_logits=False),
    metrics=[
        AUC(name="auc"),
        Precision(name="precision"),
        Recall(name="recall"),
    ],
)

model.summary()

history = model.fit(
    training_ds,
    epochs=30,
    callbacks=[lr_cb, ckpt_cb],
    validation_data=validation_ds,
    verbose=2,
    class_weight=class_weights,
)
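
To see what the constant initializers in `LearnableWaveletTransform.build` produce, the NumPy-only sketch below rebuilds the two kernel banks and inspects them. It is an illustration rather than part of the model: `tile_kernel` is a hypothetical helper, and `in_channels=3` and `filters=32` are taken from how the layer is used above.

```python
import numpy as np


def tile_kernel(kernel_2d, in_channels, filters):
    """Repeat a (3, 3) kernel into the (3, 3, in_channels, filters) conv layout."""
    return np.tile(kernel_2d[:, :, None, None], (1, 1, in_channels, filters))


low_pass = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], dtype="float32") / 16.0
high_pass = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype="float32")

low_kernels = tile_kernel(low_pass, in_channels=3, filters=32)
high_kernels = tile_kernel(high_pass, in_channels=3, filters=32)

# A convolution sums over the kernel window and over all input channels, so the
# response of one filter to a constant image of ones equals the sum of its weights.
low_gain = float(low_kernels[..., 0].sum())
high_gain = float(high_kernels[..., 0].sum())

print("kernel shape:", low_kernels.shape)
print("low-pass gain:", low_gain, "| high-pass gain:", high_gain)
```

Two things follow from this. At initialization all 32 filters in each bank are identical, and because the 3x3 kernel is copied to every input channel without rescaling, the low-pass branch responds to a flat RGB region with the sum of the three channels rather than their average. The high-pass weights sum to zero, so flat regions give no response away from the image border, where `SAME` padding introduces zeros.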

In [None]:

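# `model.fit` returns a `History` object; the per-epoch metrics live in its `.history` dict.
hist = history.history
epoch = int(np.argmax(hist["val_auc"]))  # 0-based index of the epoch with the best validation AUC

precision = hist["val_precision"][epoch]
recall = hist["val_recall"][epoch]
val_auc = hist["val_auc"][epoch]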

print(f"Precision: {precision:.5f}")
print(f"Recall: {recall:.5f}")
print(f"Validation AUC: {val_auc:.5f}")
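
The channel gating performed by `squeeze_excitation_block` in the wavelet model can be traced by hand with NumPy. The sketch below is an illustration under stated assumptions: the weights are random stand-ins for the two `Dense` layers (biases omitted), `se_gate` is a hypothetical name, and the tensor sizes are kept small for readability.

```python
import numpy as np


def se_gate(x, w1, w2):
    """Apply a squeeze-and-excitation gate to `x` of shape (B, H, W, C)."""
    squeezed = x.mean(axis=(1, 2))               # squeeze: (B, C)
    hidden = np.maximum(squeezed @ w1, 0.0)      # ReLU bottleneck: (B, C // ratio)
    gate = 1.0 / (1.0 + np.exp(-(hidden @ w2)))  # sigmoid gate: (B, C)
    return x * gate[:, None, None, :], gate      # broadcast the gate over H and W


rng = np.random.default_rng(0)
batch, height, width, channels, ratio = 2, 4, 4, 32, 16
x = rng.normal(size=(batch, height, width, channels))
w1 = rng.normal(size=(channels, channels // ratio))
w2 = rng.normal(size=(channels // ratio, channels))

y, gate = se_gate(x, w1, w2)
print("output shape:", y.shape, "| gate shape:", gate.shape)
```

The output keeps the input's shape; each channel is scaled by a single value between 0 and 1 that depends only on the spatially averaged activations of that sample.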