# ResNet50 Multi-label Xray Model 
1.  Architecture:
- Uses ResNet50 (ImageNet-pretrained) as the feature extractor with a custom head:
-  Global Average Pooling → Dense(256, ReLU) → Dropout(0.4) → Dense(14, Sigmoid).
2. Data Loading:
- Uses a custom XRayDataGenerator (OpenCV-based) for efficient image loading, resizing, and preprocessing.
3. Optimization:
- XLA compilation for faster GPU performance.
- CosineDecayRestarts learning-rate schedule.
- Adam optimizer with binary cross-entropy loss.
4. Training Strategy:
- Multilabel Stratified K-Fold (MSKF) CV for robust validation. This version runs full training and validation on fold 0 of the 5-fold MSKF split (ResNet.ipynb previously trained only part of a fold).
5. Two-stage training:
- Train top layers (frozen base).
- Fine-tune last 20 ResNet layers.
6. Callbacks:
- ModelCheckpoint (saves the best model by val_auc) and EarlyStopping (patience=5).
7. Output:
- Saves each fold’s best/final models.
- Exports final submission.csv for Kaggle.
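
To make the CosineDecayRestarts item in the list above concrete, here is a minimal pure-Python sketch of the learning-rate curve, following the formula documented for `tf.keras.optimizers.schedules.CosineDecayRestarts`. The function name `cosine_decay_restarts` is hypothetical. The defaults mirror the values passed to the schedule later in this notebook, and `first_decay_steps=5426` assumes 2713 steps per epoch, as shown in the training log.

```python
import math

def cosine_decay_restarts(step, initial_lr=1e-4, first_decay_steps=5426,
                          t_mul=2.0, m_mul=0.9, alpha=1e-6):
    """Learning rate at `step` for a cosine schedule with warm restarts."""
    completed = step / first_decay_steps
    if t_mul == 1.0:
        i_restart = math.floor(completed)
        completed -= i_restart
    else:
        # Index of the current cycle when each cycle is t_mul times longer than the last
        i_restart = math.floor(
            math.log(1.0 - completed * (1.0 - t_mul)) / math.log(t_mul)
        )
        cycles_done = (1.0 - t_mul ** i_restart) / (1.0 - t_mul)
        completed = (completed - cycles_done) / t_mul ** i_restart
    peak = m_mul ** i_restart  # each restart starts from a slightly lower peak
    cosine = 0.5 * peak * (1.0 + math.cos(math.pi * completed))
    return initial_lr * ((1.0 - alpha) * cosine + alpha)

steps_per_epoch = 2713  # assumed from the training log
for epoch in range(4):
    print(epoch, cosine_decay_restarts(epoch * steps_per_epoch))
```

Under these assumptions the first cycle lasts two epochs, so the third frozen epoch begins right after a restart, at roughly 9e-5 instead of near zero.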

**Public score = 0.83. This "optimized" version likely over-regularized or distorted label balance.**

Difficulties:
- Setting the global float precision to 16-bit before training corrupted the trained model, forcing me to retrain and predict again.
- Runtime is CPU-bound because XRayDataGenerator loads and resizes images with OpenCV on the CPU (a next improvement would be a faster image-loading pipeline).
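
One possible direction for the CPU bottleneck noted above is to load the images of a batch in parallel threads. The sketch below is only an illustration: `load_batch_parallel` and `fake_load` are hypothetical names, and any speed-up assumes the OpenCV calls release the GIL while decoding and resizing, which has not been measured in this notebook.

```python
from concurrent.futures import ThreadPoolExecutor

def load_batch_parallel(paths, load_fn, max_workers=4):
    """Apply `load_fn` to every path using a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(load_fn, paths))

# Stand-in loader so the sketch runs without image files. In the notebook this
# would wrap cv2.imread + cv2.resize + preprocess_input for a single file.
def fake_load(path):
    return len(path)

print(load_batch_parallel(["a.jpg", "bb.jpg", "ccc.jpg"], fake_load))
```

Separately, if the notebook runs on Keras 3, the base `Sequence` constructor accepts `workers`, `use_multiprocessing`, and `max_queue_size`, which the generator's `**kwargs` would pass through. That route is untested here.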

In [2]:
!pip install iterative-stratification

Collecting iterative-stratification
  Downloading iterative_stratification-0.1.9-py3-none-any.whl.metadata (1.3 kB)
Downloading iterative_stratification-0.1.9-py3-none-any.whl (8.5 kB)
Installing collected packages: iterative-stratification
Successfully installed iterative-stratification-0.1.9


In [3]:
import tensorflow as tf
print("GPUs Available:", len(tf.config.list_physical_devices('GPU')))


GPUs Available: 1


In [4]:
# Imports
import gc
import os
import random
import sys
import warnings
from io import StringIO

import cv2
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
import tensorflow as tf
import tensorflow.keras.applications.resnet50 as resnet
from iterstrat.ml_stratifiers import MultilabelStratifiedKFold
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from tensorflow.keras import layers, mixed_precision, models
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from tensorflow.keras.layers import (
    Conv2D, Dense, Dropout, Flatten, GlobalAveragePooling2D, MaxPooling2D
)
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.optimizers.schedules import CosineDecayRestarts
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.utils import Sequence

warnings.filterwarnings('ignore')

In [5]:
print(os.listdir("/kaggle/input"))

# Path to competition dataset
data_dir = "/kaggle/input/grand-xray-slam-division-b"
# Check what files are inside
print('Filenames of the data', os.listdir(data_dir))

['grand-xray-slam-division-b']
Filenames of the data ['test2', 'sample_submission_2.csv', 'train2.csv', 'train2']


In [6]:
# Load the training CSV metadata with labels
train = pd.read_csv("/kaggle/input/grand-xray-slam-division-b/train2.csv")

print('Metadata shape:',train.shape)
train.head()

Metadata shape: (108494, 21)


Unnamed: 0,Image_name,Patient_ID,Study,Sex,Age,ViewCategory,ViewPosition,Atelectasis,Cardiomegaly,Consolidation,...,Enlarged Cardiomediastinum,Fracture,Lung Lesion,Lung Opacity,No Finding,Pleural Effusion,Pleural Other,Pneumonia,Pneumothorax,Support Devices
0,00000003_001_001.jpg,3,1,Male,41.0,Frontal,AP,0,1,0,...,1,0,0,1,0,0,0,0,0,0
1,00000004_001_001.jpg,4,1,Female,20.0,Frontal,PA,0,0,0,...,0,0,0,0,1,0,0,0,0,0
2,00000004_001_002.jpg,4,1,Female,20.0,Lateral,Lateral,0,0,0,...,0,0,0,0,1,0,0,0,0,0
3,00000006_001_001.jpg,6,1,Female,42.0,Frontal,AP,0,0,0,...,0,0,0,0,1,0,0,0,0,0
4,00000010_001_001.jpg,10,1,Female,50.0,Frontal,PA,0,0,0,...,0,0,0,0,1,0,0,0,0,0


In [7]:
# 1. Feature & Target Preparation
# Define labels
conditions = [
    'Atelectasis', 'Cardiomegaly', 'Consolidation', 'Edema', 'Enlarged Cardiomediastinum',
    'Fracture', 'Lung Lesion', 'Lung Opacity', 'No Finding', 'Pleural Effusion',
    'Pleural Other', 'Pneumonia', 'Pneumothorax', 'Support Devices'
]
# Features you want
features = ["ViewCategory", "ViewPosition", "Age", "Sex"]

# Encode categorical features
from sklearn.preprocessing import LabelEncoder

train_enc = train.copy()   # train data encoded
for col in ["ViewCategory", "ViewPosition", "Sex"]:  # features that can be encoded
    le = LabelEncoder()
    train_enc[col] = le.fit_transform(train[col].astype(str))

X = train_enc[features].values
y = train[conditions].values
print(X.shape) # 4 features (ViewCategory, ViewPosition, Age, Sex)
print(y.shape)  # 14 conditions

(108494, 4)
(108494, 14)


In [8]:
# 2. Add view balancing to the stratification: ViewCategory (Frontal, Lateral) is imbalanced

# One-hot encode ViewCategory and append it to y
view_onehot = pd.get_dummies(train["ViewCategory"], prefix="view").values

y_aug = np.hstack([y, view_onehot])  # augmented target matrix (ViewCategory added to y so the split is also stratified by view)
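
A quick way to check that stratifying on `y_aug` does what is intended is to compare per-column positive rates inside a validation fold with the rates over the whole matrix. The helper below is a hypothetical sketch (`fold_prevalence_gap` is not part of the notebook), run here on toy data with an arbitrary every-fifth-row fold rather than the real split.

```python
import numpy as np

def fold_prevalence_gap(y, val_idx):
    """Largest absolute difference between per-column positive rates in the
    validation rows and in the full matrix."""
    overall = y.mean(axis=0)
    in_fold = y[val_idx].mean(axis=0)
    return float(np.abs(in_fold - overall).max())

# Toy data standing in for y_aug: 3 label columns plus 2 one-hot view columns
rng = np.random.default_rng(0)
labels = (rng.random((1000, 3)) < [0.3, 0.1, 0.5]).astype(int)
view = (rng.random(1000) < 0.8).astype(int)
toy_y_aug = np.hstack([labels, np.column_stack([view, 1 - view])])

val_idx = np.arange(0, 1000, 5)  # every 5th row as a stand-in validation fold
print(fold_prevalence_gap(toy_y_aug, val_idx))
```

In the notebook, the same function could be called with `y_aug` and each `val_idx` produced by `mskf.split(X, y_aug)`. A small gap would suggest that both the labels and the view columns are balanced across folds.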

# Data Generator

In [9]:
# Data Generator (OpenCV based)
class XRayDataGenerator(Sequence):
    def __init__(self, dataframe, batch_size=32, img_size=(224, 224), is_test=False, **kwargs):
        super().__init__(**kwargs)
        self.dataframe = dataframe.reset_index(drop=True)
        self.batch_size = batch_size
        self.img_size = img_size
        self.is_test = is_test
        self.image_dir = '/kaggle/input/grand-xray-slam-division-b/train2/' if not is_test else '/kaggle/input/grand-xray-slam-division-b/test2/'
        self.conditions = conditions
        
        if not os.path.exists(self.image_dir):
            raise FileNotFoundError(f"Directory {self.image_dir} not found.")

    
    def __len__(self):
        return (len(self.dataframe) + self.batch_size - 1) // self.batch_size
    
    def __getitem__(self, idx):
        start = idx * self.batch_size
        end = min(start + self.batch_size, len(self.dataframe))
        batch_data = self.dataframe.iloc[start:end]
        
        images, labels = [], []
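        for _, row in batch_data.iterrows():
            # Load the image; OpenCV returns channels in BGR order
            img_path = os.path.join(self.image_dir, row['Image_name'])
            img = cv2.imread(img_path, cv2.IMREAD_COLOR)

            if img is None:
                if not self.is_test:
                    continue  # skip unreadable training images
                # Keep a blank placeholder so test predictions stay aligned with the dataframe rows
                img = np.zeros((*self.img_size, 3), dtype=np.uint8)

            # Preprocessing (not augmentation): convert to RGB, resize, then apply
            # ResNet50 preprocessing, which expects RGB input
            img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
            img = cv2.resize(img, self.img_size)
            img = resnet.preprocess_input(img)
            images.append(img)
            if not self.is_test:
                labels.append(row[self.conditions].values.astype(np.float32))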
        
        if not images:
            images.append(np.zeros((*self.img_size, 3), dtype=np.float32))
            if not self.is_test:
                labels.append(np.zeros(len(self.conditions), dtype=np.float32))
        
        return (np.array(images), np.array(labels)) if not self.is_test else np.array(images)
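
The batch arithmetic in `__len__` and `__getitem__` can be checked by hand against the step counts in the logs further down (2713 training steps, 2996 prediction steps). The helper names below are hypothetical, and the training fold size of 86,795 rows is an approximation (four fifths of 108,494).

```python
def num_batches(n_rows, batch_size):
    """Ceiling division, the same arithmetic as XRayDataGenerator.__len__."""
    return (n_rows + batch_size - 1) // batch_size

def batch_bounds(idx, n_rows, batch_size):
    """Row range [start, end) covered by batch `idx`; the last batch may be short."""
    start = idx * batch_size
    return start, min(start + batch_size, n_rows)

print(num_batches(86795, 32))          # approximate training fold, batch size 32
print(num_batches(47927, 16))          # test set rows, batch size 16
print(batch_bounds(2995, 47927, 16))   # final, partial test batch
```

The last test batch holds only 7 rows, which is why `__getitem__` clamps `end` to the dataframe length.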


# ResNet 
**Multilabel-stratified fold training with a cosine LR schedule and fine-tuning of the top layers, using the custom OpenCV generator. The cell below trains fold 0 of the 5-fold split only; it does not enable mixed precision or ensemble multiple folds.**

1. Fold 1: 3 epochs frozen, then 3 epochs with the last 20 layers trainable; saved as **fold_0_final.h5**, AUC = 0.9003.
2. Fold 2: 6 epochs frozen, 0 epochs with the last 20 layers trainable.

In [10]:
print('**********Building ResNet Model******************')
#  Build ResNet50 model
# ============================================================
def build_resnet_model(num_classes=14, unfreeze_layers=None):
    base = ResNet50(weights="imagenet", include_top=False, input_shape=(224,224,3))
    # Freeze the whole backbone first, then optionally unfreeze only the last N layers
    base.trainable = False
    if unfreeze_layers:
        for layer in base.layers[-unfreeze_layers:]:
            layer.trainable = True

    x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
    x = tf.keras.layers.Dense(256, activation="relu")(x)
    x = tf.keras.layers.Dropout(0.4)(x)
    out = tf.keras.layers.Dense(num_classes, activation="sigmoid")(x)
    return tf.keras.Model(inputs=base.input, outputs=out)

# ============================================================
# ⚙️ Callbacks: automatically save the best weights as fold_*_best.h5 in case training is interrupted
# ============================================================
def get_callbacks(fold):
    return [
        ModelCheckpoint(f"fold_{fold}_best.h5", monitor="val_auc", mode="max", save_best_only=True, verbose=1),
        EarlyStopping(monitor="val_auc", mode="max", patience=5, restore_best_weights=True, verbose=1),
    ]

# ============================================================
#  Cross-validation training loop Full mskf CV fold
# ============================================================
mskf = MultilabelStratifiedKFold(n_splits=5, shuffle=True, random_state=42)
# Try to load existing AUCs if ran some folds
try:
    fold_aucs
except NameError:
    fold_aucs = []
print('**********Starting MSKF (k-fold) and CV training******************')
IMG_SIZE = (224, 224)
BATCH_SIZE = 32
# UNFREEZE_LAYERS = 20
# EPOCHS_FROZEN = 3
# EPOCHS_FINE = 3

# ********* Train fold 0 only: the loop breaks when it reaches fold 1 *********
for fold, (train_idx, val_idx) in enumerate(mskf.split(X, y_aug)):
    # skip already completed folds
    if os.path.exists(f"fold_{fold}_finalv2.h5"):
        print(f"✅ Skipping fold {fold} (already completed)")
        continue
    if fold == 1:  # stop before training fold 1, so only fold 0 is trained
        break

    print(f"\n================ FOLD {fold+1} ================")
    train_df = train.iloc[train_idx].reset_index(drop=True)
    val_df   = train.iloc[val_idx].reset_index(drop=True)

    # Use generators
    train_generator = XRayDataGenerator(train_df, batch_size=BATCH_SIZE, img_size=IMG_SIZE)
    val_generator   = XRayDataGenerator(val_df, batch_size=BATCH_SIZE, img_size=IMG_SIZE)

    # Learning rate schedule
    lr_schedule = CosineDecayRestarts(
        initial_learning_rate=1e-4,
        first_decay_steps=len(train_generator)*2,
        t_mul=2.0, m_mul=0.9, alpha=1e-6
    )
    # building resnet architecture 
    model = build_resnet_model() 
    model.compile(optimizer=tf.keras.optimizers.Adam(lr_schedule), loss="binary_crossentropy",
        metrics=[tf.keras.metrics.AUC(name="auc")]
    )
    history = model.fit(
        train_generator,validation_data=val_generator,
        epochs=3, callbacks=get_callbacks(fold), verbose=1) # decrease epochs for less time AUC=0.9003

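    # Fine-tuning: unfreeze the last 20 layers of the full model.
    # Note: model.layers[-20:] includes the 4 head layers, so this covers the last 16 ResNet50 layers.
    for layer in model.layers[-20:]:
        layer.trainable = True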
    model.compile(
        optimizer=tf.keras.optimizers.Adam(1e-5), loss="binary_crossentropy",
        metrics=[tf.keras.metrics.AUC(name="auc")]
    )
    history_ft = model.fit(
        train_generator, validation_data=val_generator,
        epochs=3, callbacks=get_callbacks(fold), verbose=1) # decrease epochs for less time

    model.save(f"fold_{fold}_finalv2.h5")    # save model after each CV fold
    key = 'val_auc' if 'val_auc' in history.history else 'val_AUC'
    best_auc = max(history.history[key] + history_ft.history[key])
    fold_aucs.append(best_auc)
    print(f"✅ Fold {fold+1} Best AUC: {best_auc:.4f}")
    gc.collect()

print('*************************MSKF K-FOld COMPLETE*************')

# ============================================================
# 🧾 CV Summary
# ============================================================
print(f"\n📊 Cross-validation AUCs: {fold_aucs}")
print(f"🏆 Mean CV AUC: {np.mean(fold_aucs):.4f}")

**********Building ResNet Model******************
**********Starting MSKF (k-fold) and CV training******************



I0000 00:00:1759764219.727739      36 gpu_device.cc:2022] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 15513 MB memory:  -> device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0


Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5
94765736/94765736 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step
Epoch 1/3


I0000 00:00:1759764238.007081      85 service.cc:148] XLA service 0x798b74003460 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
I0000 00:00:1759764238.007848      85 service.cc:156]   StreamExecutor device (0): Tesla P100-PCIE-16GB, Compute Capability 6.0
I0000 00:00:1759764239.658251      85 cuda_dnn.cc:529] Loaded cuDNN version 90300


   1/2713 ━━━━━━━━━━━━━━━━━━━━ 13:12:33 18s/step - auc: 0.3438 - loss: 1.1298

I0000 00:00:1759764243.809269      85 device_compiler.h:188] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.


Epoch 1: val_auc improved from -inf to 0.89243, saving model to fold_0_best.h5
2713/2713 ━━━━━━━━━━━━━━━━━━━━ 4545s 2s/step - auc: 0.8169 - loss: 0.4330 - val_auc: 0.8924 - val_loss: 0.3456
Epoch 2/3
Epoch 2: val_auc improved from 0.89243 to 0.89567, saving model to fold_0_best.h5
2713/2713 ━━━━━━━━━━━━━━━━━━━━ 4486s 2s/step - auc: 0.8835 - loss: 0.3569 - val_auc: 0.8957 - val_loss: 0.3400
Epoch 3/3
Epoch 3: val_auc improved from 0.89567 to 0.90034, saving model to fold_0_best.h5
2713/2713 ━━━━━━━━━━━━━━━━━━━━ 4468s 2s/step - auc: 0.8857 - loss: 0.3548 - val_auc: 0.9003 - val_

KeyboardInterrupt: 

In [11]:
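best_auc = max(history.history['val_auc'] + history_ft.history['val_auc'])
# fold_aucs.append(best_auc)
# Note: `fold` is 1 here because the loop above broke at fold 1,
# so this line prints "Fold 2" for the result of fold 0.
print(f"✅ Fold {fold+1} Best AUC: {best_auc:.4f}")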

# ============================================================
# 🧾 CV Summary
# ============================================================
print(f"\n📊 Cross-validation AUCs: {fold_aucs}")
print(f"🏆 Mean CV AUC: {np.mean(fold_aucs):.4f}")

✅ Fold 2 Best AUC: 0.9003

📊 Cross-validation AUCs: [0.9003377556800842]
🏆 Mean CV AUC: 0.9003
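
The `val_auc` reported above comes from `tf.keras.metrics.AUC` with default arguments, which pools all 14 label columns into one set of predictions before computing the curve (`multi_label=False`). If the leaderboard metric averages AUC per label, which is an assumption since the notebook does not state it, the two numbers can differ. The sketch below uses a hypothetical rank-based `binary_auc` helper in plain Python to show the gap on toy data. It computes the exact value, whereas Keras approximates the curve with thresholds.

```python
def binary_auc(y_true, y_score):
    """Rank-based ROC AUC (Mann-Whitney U); tied scores share an average rank."""
    pairs = sorted(zip(y_score, y_true))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1  # 1-based average rank of the tie group
        i = j + 1
    n_pos = sum(1 for _, t in pairs if t == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        return None  # AUC is undefined when only one class is present
    pos_rank_sum = sum(r for r, (_, t) in zip(ranks, pairs) if t == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def macro_auc(y_true_cols, y_score_cols):
    """Mean of per-label AUCs, skipping labels where AUC is undefined."""
    aucs = [binary_auc(t, s) for t, s in zip(y_true_cols, y_score_cols)]
    aucs = [a for a in aucs if a is not None]
    return sum(aucs) / len(aucs)

def pooled_auc(y_true_cols, y_score_cols):
    """AUC after flattening all labels into one binary problem."""
    flat_t = [v for col in y_true_cols for v in col]
    flat_s = [v for col in y_score_cols for v in col]
    return binary_auc(flat_t, flat_s)

# Two toy labels, each perfectly ranked but on different score scales
y_true = [[0, 0, 1, 1], [0, 0, 1, 1]]
y_score = [[0.1, 0.2, 0.3, 0.4], [0.6, 0.7, 0.8, 0.9]]
print(macro_auc(y_true, y_score), pooled_auc(y_true, y_score))
```

On this toy input the per-label average is 1.0 while the pooled value is 0.75, so it may be worth tracking both during validation.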


In [12]:
print(history.history)   # 3 epochs frozen
history_ft.history       # 3 epochs trained layers

{'auc': [0.8570416569709778, 0.8851416110992432, 0.888046145439148], 'loss': [0.3901161253452301, 0.3543761968612671, 0.3503364324569702], 'val_auc': [0.8924337029457092, 0.8956747651100159, 0.9003377556800842], 'val_loss': [0.3455859124660492, 0.3400290310382843, 0.33285823464393616]}


{'auc': [0.8947617411613464, 0.9070575833320618, 0.9143781661987305],
 'loss': [0.34042632579803467, 0.3212393522262573, 0.3090875446796417],
 'val_auc': [0.8999371528625488, 0.8923774361610413, 0.8894196152687073],
 'val_loss': [0.3383289575576782, 0.3508062958717346, 0.35240888595581055]}
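
The two history dictionaries above can be compared directly to see which stage produced the best validation AUC. `best_epoch` is a hypothetical helper, and the values are copied from the printed output.

```python
def best_epoch(histories, key="val_auc"):
    """Return (stage, 1-based epoch, value) for the highest `key` across stages."""
    best = None
    for stage, hist in histories.items():
        for epoch, value in enumerate(hist[key], start=1):
            if best is None or value > best[2]:
                best = (stage, epoch, value)
    return best

# val_auc values copied from the two history dictionaries printed above
histories = {
    "frozen": {"val_auc": [0.8924337029457092, 0.8956747651100159, 0.9003377556800842]},
    "fine_tune": {"val_auc": [0.8999371528625488, 0.8923774361610413, 0.8894196152687073]},
}
print(best_epoch(histories))
```

Here the best validation AUC comes from the last frozen epoch, while fine-tuning raises training AUC (0.895 to 0.914) and lowers validation AUC (0.900 to 0.889). Both stages also create a fresh `ModelCheckpoint` that writes to the same `fold_{fold}_best.h5`, so the fine-tuning stage replaces the frozen-stage checkpoint after its first epoch. Giving each stage its own filename would keep both.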

In [14]:
# Predictions
test_df = pd.read_csv("/kaggle/input/grand-xray-slam-division-b/sample_submission_2.csv")
test_df["Image_name"] = test_df["Image_name"].astype(str)
test_generator = XRayDataGenerator(test_df, batch_size=16, img_size=(224, 224), is_test=True)

# Alternative: load a saved checkpoint instead of using the in-memory model, e.g.
# m = tf.keras.models.load_model("fold_0_best.h5", compile=False)   # or fold_0_finalv2.h5
preds = model.predict(test_generator, verbose=1)

submission = pd.DataFrame(preds, columns=conditions)
submission.insert(0, "Image_name", test_df["Image_name"].values)
submission.to_csv("submission_fold0_v2.csv", index=False)
print("✅ submission_fold0_v2.csv created successfully!")


[1m2996/2996[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2021s[0m 673ms/step
✅ submission_fold0_v2.csv created successfully!


In [16]:
print(submission.shape)
submission.head()


(47927, 15)


Unnamed: 0,Image_name,Atelectasis,Cardiomegaly,Consolidation,Edema,Enlarged Cardiomediastinum,Fracture,Lung Lesion,Lung Opacity,No Finding,Pleural Effusion,Pleural Other,Pneumonia,Pneumothorax,Support Devices
0,00000002_002_001.jpg,0.679127,0.752686,0.653812,0.546733,0.743701,0.461579,0.172953,0.787016,0.082552,0.584544,0.152063,0.332639,0.183815,0.647162
1,00000002_001_001.jpg,0.602071,0.372287,0.445286,0.309077,0.44798,0.379972,0.2299,0.657152,0.255947,0.484489,0.188278,0.237635,0.223552,0.385622
2,00000002_001_002.jpg,0.647785,0.635625,0.496112,0.429408,0.599729,0.422771,0.188679,0.638194,0.146228,0.701488,0.164266,0.384874,0.124461,0.446724
3,00000008_001_001.jpg,0.777989,0.765465,0.707251,0.69223,0.917911,0.196626,0.129664,0.834337,0.035341,0.806898,0.080835,0.272634,0.142398,0.873306
4,00000008_002_001.jpg,0.836159,0.681813,0.648309,0.643027,0.826598,0.257827,0.203145,0.773741,0.084744,0.484107,0.076545,0.238657,0.112889,0.794926


In [None]:
# Public score = 0.83, so we'll fall back to the baseline ResNet model.
# This "optimized" version likely over-regularized or distorted label balance.