# `tensorflow.keras` fine-tuning approach

This notebook demonstrates a fine-tuning approach. Although the final models did not use this approach, it may further improve model performance.

The steps in this notebook, together with tuning of the fine-tuning hyperparameters (e.g., fine-tuning depth), can be run with the `sweeps` demonstrated in another notebook of this project.

Notes:
1. The fine-tuning step comes after a transfer learning workflow: starting from pretrained weights, we train a classification head on a new task, which may differ from the task the model was pretrained on. Hence, we first train the new classification head on our task (i.e., unhealthy-healthy glottis prediction) while freezing the convolutional base (the model with pretrained weights).

2. After the initial training of the dense head (classifier), we unfreeze the convolutional layers and continue training with a very low learning rate. The low learning rate roughly preserves the low-level feature maps learned during pretraining; otherwise, we would almost be discarding the pretrained weights of the earlier convolutional layers. Usually, the goal is to fine-tune the weights that produce the high-level feature maps (i.e., concepts, such as the glottis shape).

Given these notes, the first few steps are similar to those in the transfer learning notebooks in this repository.
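The two phases described in the notes can be sketched on a toy model. This is a minimal illustration, not the notebook's actual model: `toy_base` is a hypothetical stand-in for the pretrained convolutional base, and the input shape and layer sizes are arbitrary assumptions.

```python
from tensorflow import keras

# Hypothetical stand-in for a pretrained convolutional base.
base_in = keras.Input(shape=(32, 32, 3))
x = keras.layers.Conv2D(4, 3, activation="relu")(base_in)
x = keras.layers.GlobalAveragePooling2D()(x)
toy_base = keras.Model(base_in, x, name="toy_base")

# New classification head on top of the base.
inputs = keras.Input(shape=(32, 32, 3))
features = toy_base(inputs)
outputs = keras.layers.Dense(2, activation="softmax", name="head")(features)
toy_model = keras.Model(inputs, outputs)

# Phase 1: freeze the base so only the head is updated.
toy_base.trainable = False
toy_model.compile(optimizer=keras.optimizers.Adam(),
                  loss="categorical_crossentropy")
phase1_trainable = len(toy_model.trainable_weights)  # head kernel + bias

# Phase 2: unfreeze the base and recompile with a very low learning rate.
toy_base.trainable = True
toy_model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-6),
                  loss="categorical_crossentropy")
phase2_trainable = len(toy_model.trainable_weights)  # conv + head weights

print(phase1_trainable, phase2_trainable)
```

Recompiling after changing `trainable` matters: the change only takes effect for training once the model is compiled again.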

References:

* The pretrained weights published by [Mei et al. (2022)](https://pubs.rsna.org/doi/10.1148/ryai.210315) in their GitHub [repository](https://github.com/BMEII-AI/RadImageNet).

* The preprocessing pipeline by the BAGLS team, available in their GitHub [repository](https://github.com/anki-xyz/bagls/blob/master/Utils/DataGenerator.py#L109).

In [1]:
# dev convenience
%load_ext autoreload
%autoreload 2

In [2]:
import sys
sys.path.append("..")
import PATHS

import os
import numpy as np

os.environ["CUDA_VISIBLE_DEVICES"] = "3"

In [3]:
# tf loader
import tensorflow as tf
from tensorflow.keras.models import load_model
from tensorflow import keras

In [4]:
from tensorflow.python.client import device_lib

print(device_lib.list_local_devices())

[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 17548735796084712585
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 10925703168
locality {
  bus_id: 2
  numa_node: 1
  links {
  }
}
incarnation: 1219952536450970898
physical_device_desc: "device: 0, name: NVIDIA GeForce GTX 1080 Ti, pci bus id: 0000:85:00.0, compute capability: 6.1"
]


2022-11-27 16:49:46.797918: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-11-27 16:49:47.789387: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /device:GPU:0 with 10419 MB memory:  -> device: 0, name: NVIDIA GeForce GTX 1080 Ti, pci bus id: 0000:85:00.0, compute capability: 6.1


In [5]:
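# NOTE: TensorFlow has already initialized its GPU in the cells above, so
# changing CUDA_VISIBLE_DEVICES at this point likely has no effect on this session
os.environ["CUDA_VISIBLE_DEVICES"] = "4"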

In [6]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.applications.imagenet_utils import preprocess_input

# initialize data generator
train_data_generator = ImageDataGenerator(
    preprocessing_function=preprocess_input,
    validation_split=0.05,
    rescale=1./255,
    width_shift_range=0.1,
    height_shift_range=0.1,
    shear_range=0.1,
    zoom_range=0.1,
    fill_mode='nearest',
#     horizontal_flip=True,
#     rotation_range=10,
)

validation_data_generator = ImageDataGenerator(
    preprocessing_function=preprocess_input, 
    rescale=1./255,
    validation_split=0.05,
)

test_data_generator = ImageDataGenerator(
    preprocessing_function=preprocess_input, 
    rescale=1./255
)
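To make the per-pixel transform of these generators concrete, here is a NumPy-only sketch. It assumes that `preprocess_input` from `imagenet_utils` runs in its default `caffe` mode (RGB to BGR, then ImageNet mean subtraction) and that `ImageDataGenerator` applies `preprocessing_function` before `rescale`; the helper name `caffe_then_rescale` is hypothetical.

```python
import numpy as np

# ImageNet channel means in BGR order, as used by the "caffe" preprocessing mode.
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def caffe_then_rescale(rgb_batch, scale=1.0 / 255):
    """Mimic preprocessing_function followed by rescale (assumed order)."""
    x = np.asarray(rgb_batch, dtype=np.float32)
    x = x[..., ::-1]           # RGB -> BGR
    x = x - IMAGENET_BGR_MEAN  # zero-center each channel
    return x * scale           # rescale applied last

# A batch with one 2x2 white RGB image.
white = np.full((1, 2, 2, 3), 255.0)
out = caffe_then_rescale(white)
print(out[0, 0, 0])
```

Under these assumptions the network sees mean-centered values divided by 255, not values in the range 0 to 1.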

In [7]:
# source directories
# train_dir = '../sample-dataset/train'
# test_dir = '../sample-dataset/test'

train_dir = '../dataset/train'
test_dir = '../dataset/test'

batch_size = 64
class_names = ["healthy", "unhealthy"]
interpol = "bilinear"
cmap = "rgb"
label_mode = "categorical"  # one-hot labels, matching the 2-unit softmax head
labels = "inferred"
image_size = (224, 224)


train_dataset = train_data_generator.flow_from_directory(
    directory=train_dir,
    target_size=image_size,
    color_mode=cmap,
    classes=class_names,
    class_mode=label_mode,
    batch_size=batch_size,
    interpolation=interpol,
    subset="training",
)

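# validation uses the non-augmenting generator; it shares validation_split=0.05,
# so the split should select the same held-out files
val_dataset = validation_data_generator.flow_from_directory(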
    directory=train_dir,
    target_size=image_size,
    color_mode=cmap,
    classes=class_names,
    class_mode=label_mode,
    batch_size=batch_size,
    interpolation=interpol,
    subset="validation",
)

test_dataset = test_data_generator.flow_from_directory(
    directory=test_dir,
    target_size=image_size,
    color_mode=cmap,
    classes=class_names,
    class_mode=label_mode,
    batch_size=batch_size,
    interpolation=interpol,
    shuffle=False, # do not shuffle for later evaluation, alphanum sort
)

Found 52393 images belonging to 2 classes.
Found 2757 images belonging to 2 classes.
Found 3300 images belonging to 2 classes.
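As a quick sanity check on these counts, the arithmetic below derives the effective validation fraction and the number of batches per epoch for each split. It only uses the image counts printed above and the `batch_size` of 64 set earlier.

```python
import math

batch_size = 64
counts = {"train": 52393, "val": 2757, "test": 3300}

# Fraction of the training directory held out for validation.
val_fraction = counts["val"] / (counts["train"] + counts["val"])

# Batches per epoch; the last batch of each split may be smaller.
steps = {name: math.ceil(n / batch_size) for name, n in counts.items()}

print(round(val_fraction, 4), steps)
```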


In [8]:
dropout_rate = 0.2
def define_model(pretrained):
    conv_base = pretrained(
        include_top=False,
        weights=PATHS.resnet50_weights,
        input_shape=(*image_size, 3),
        pooling='avg',
    )
    print("Num trainable at load:", len(conv_base.trainable_weights)) 
    conv_base.trainable = False
    print("Num trainable:", len(conv_base.trainable_weights)) 
    
    x = conv_base.output
    
    # layers at this stage are arbitrary
    # can be subjected to hyperparam tuning
    x = keras.layers.Flatten()(x)
    x = keras.layers.Dropout(dropout_rate)(x)
    x = keras.layers.Dense(units=512, activation='relu')(x)
    x = keras.layers.Dropout(dropout_rate)(x)
    
    # multiclass
    outputs = keras.layers.Dense(2, activation="softmax")(x)
    model = keras.Model(inputs=conv_base.input, outputs=outputs)
    return model
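While the base is frozen, only the head is trained, so it helps to know how large the head is. The sketch below counts its parameters, assuming the ResNet50 base with `pooling='avg'` outputs a 2048-dimensional feature vector; `dense_params` is a hypothetical helper, and the dropout and flatten layers add no parameters.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

feature_dim = 2048                         # assumed ResNet50 pooled feature size
hidden = dense_params(feature_dim, 512)    # Dense(512, relu)
output = dense_params(512, 2)              # Dense(2, softmax)
head_total = hidden + output

print(hidden, output, head_total)
```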

In [9]:
pretrained = keras.applications.ResNet50
model = define_model(pretrained)

2022-11-27 16:49:51.046438: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10419 MB memory:  -> device: 0, name: NVIDIA GeForce GTX 1080 Ti, pci bus id: 0000:85:00.0, compute capability: 6.1


Num trainable at load: 212
Num trainable: 0


In [10]:
# Note on batchnorm layers

In [11]:
def print_layer_trainable(conv_model, only_true=False):
    """Print each layer's trainable flag; with only_true=True, skip frozen layers."""
    for layer in conv_model.layers:
        if only_true and not layer.trainable:
            continue
        print("{0}:\t{1}".format(layer.trainable, layer.name))
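Because the per-layer listing is long, a compact count can be easier to read. The helper below is a hypothetical addition, not part of the original notebook; it works on any object exposing `.layers` whose items have a `.trainable` attribute, so it is demonstrated here on simple stand-in objects instead of a Keras model.

```python
from collections import Counter
from types import SimpleNamespace

def count_trainable_layers(conv_model):
    """Return how many layers are trainable and how many are frozen."""
    counts = Counter(bool(layer.trainable) for layer in conv_model.layers)
    return {"trainable": counts[True], "frozen": counts[False]}

# Stand-in for a model: three frozen base layers and one trainable head layer.
toy = SimpleNamespace(layers=[
    SimpleNamespace(name="conv1_conv", trainable=False),
    SimpleNamespace(name="conv1_bn", trainable=False),
    SimpleNamespace(name="conv1_relu", trainable=False),
    SimpleNamespace(name="dense", trainable=True),
])
summary = count_trainable_layers(toy)
print(summary)
```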

In [12]:
print_layer_trainable(model)

False:	input_1
False:	conv1_pad
False:	conv1_conv
False:	conv1_bn
False:	conv1_relu
False:	pool1_pad
False:	pool1_pool
False:	conv2_block1_1_conv
False:	conv2_block1_1_bn
False:	conv2_block1_1_relu
False:	conv2_block1_2_conv
False:	conv2_block1_2_bn
False:	conv2_block1_2_relu
False:	conv2_block1_0_conv
False:	conv2_block1_3_conv
False:	conv2_block1_0_bn
False:	conv2_block1_3_bn
False:	conv2_block1_add
False:	conv2_block1_out
False:	conv2_block2_1_conv
False:	conv2_block2_1_bn
False:	conv2_block2_1_relu
False:	conv2_block2_2_conv
False:	conv2_block2_2_bn
False:	conv2_block2_2_relu
False:	conv2_block2_3_conv
False:	conv2_block2_3_bn
False:	conv2_block2_add
False:	conv2_block2_out
False:	conv2_block3_1_conv
False:	conv2_block3_1_bn
False:	conv2_block3_1_relu
False:	conv2_block3_2_conv
False:	conv2_block3_2_bn
False:	conv2_block3_2_relu
False:	conv2_block3_3_conv
False:	conv2_block3_3_bn
False:	conv2_block3_add
False:	conv2_block3_out
False:	conv3_block1_1_conv
False:	conv3_block1_1_bn
[output truncated]

In [13]:
from tensorflow.keras import metrics
thresh = 0.5
metrics_dict = {
    "ACC":  metrics.BinaryAccuracy(name="ACC", threshold=thresh),
    "AUC-ROC": metrics.AUC(name='ROC', curve='ROC'),
    "AUC-PR": metrics.AUC(name='PR', curve='PR'),
    "TP": metrics.TruePositives(name="TP", thresholds=thresh),
    "TN": metrics.TrueNegatives(name="TN", thresholds=thresh),
    "FP": metrics.FalsePositives(name="FP", thresholds=thresh),
    "FN": metrics.FalseNegatives(name="FN", thresholds=thresh),
}
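One caveat when reading these counts: to my understanding, the Keras confusion metrics compare every element of `y_true` and `y_pred`, so with one-hot labels each image contributes two entries (one per class column). The NumPy sketch below contrasts that element-wise counting with counting on the positive-class column only. The labels and probabilities are made up, and treating column 1 (`unhealthy`) as the positive class is an assumption.

```python
import numpy as np

def confusion_counts(y_true, y_prob, threshold=0.5):
    """Count TP/TN/FP/FN over every element of the given arrays."""
    t = np.asarray(y_true).astype(bool).ravel()
    p = (np.asarray(y_prob) > threshold).ravel()
    return {
        "TP": int(np.sum(t & p)),
        "TN": int(np.sum(~t & ~p)),
        "FP": int(np.sum(~t & p)),
        "FN": int(np.sum(t & ~p)),
    }

# Made-up one-hot labels and softmax outputs for three images.
y_true = np.array([[1, 0], [0, 1], [0, 1]])
y_prob = np.array([[0.8, 0.2], [0.3, 0.7], [0.6, 0.4]])

elementwise = confusion_counts(y_true, y_prob)              # both columns
positive_only = confusion_counts(y_true[:, 1], y_prob[:, 1])  # column 1 only

print(elementwise)
print(positive_only)
```

The element-wise counts sum to twice the number of images, so per-class confusion counts are better read from a single column.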

In [14]:
from tensorflow.keras import optimizers
from tensorflow.keras import losses

# opt = optimizers.Adam(learning_rate=1e-06)
opt = optimizers.Adam()
met = list(metrics_dict.values())

model.compile(
    loss=losses.CategoricalCrossentropy(), # multiclass
    optimizer=opt,
    metrics=met,
)

In [15]:
# verify arch
# base predictions with untrained classif head
base_preds = model.predict(test_dataset)
base_preds

2022-11-27 16:49:54.277314: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
2022-11-27 16:49:56.926646: I tensorflow/stream_executor/cuda/cuda_dnn.cc:369] Loaded cuDNN version 8400


array([[0.45578143, 0.5442186 ],
       [0.47979867, 0.5202014 ],
       [0.46865255, 0.5313474 ],
       ...,
       [0.46448022, 0.5355198 ],
       [0.47043678, 0.5295632 ],
       [0.5413459 , 0.45865414]], dtype=float32)
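With an untrained head, the predicted probabilities sit close to 0.5, so the categorical cross-entropy should be close to ln 2 (about 0.693), the loss of a coin flip. The sketch below checks this on three of the prediction rows printed above; the one-hot labels paired with them are hypothetical, chosen only for illustration.

```python
import numpy as np

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -sum(y_true * log(y_prob)) over samples."""
    y_prob = np.clip(np.asarray(y_prob, dtype=np.float64), eps, 1.0)
    return float(np.mean(-np.sum(np.asarray(y_true) * np.log(y_prob), axis=1)))

# Rows copied from the untrained-head predictions above.
probs = np.array([
    [0.45578143, 0.5442186],
    [0.47979867, 0.5202014],
    [0.5413459, 0.45865414],
])
# Hypothetical one-hot labels (healthy, unhealthy), for illustration only.
labels = np.array([[1, 0], [0, 1], [1, 0]])

loss = categorical_crossentropy(labels, probs)
print(loss, np.log(2))
```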

In [16]:
# no save callback, not yet done training
history = model.fit(
    train_dataset,
    validation_data=val_dataset,
    epochs=20, # fewer epochs
    shuffle=True,
)

Epoch 1/20
Epoch 2/20

KeyboardInterrupt: 

In [None]:
# set base layers to trainable  
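def set_trainable_base(model):
    """Unfreeze every layer except those whose name contains 'preact_bn'.

    The 'preact_bn' check targets ResNet50V2 layer names. The ResNet50
    layers listed above have no such names, so here every layer,
    including the batch-normalization layers, becomes trainable.
    """
    for layer in model.layers:
        if "preact_bn" in layer.name:
            layer.trainable = False
        else:
            layer.trainable = True
    return model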
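The function above unfreezes the whole network. Since the fine-tuning depth is mentioned as a tunable hyperparameter, here is a hedged sketch of a partial unfreeze: only the last `n` layers become trainable, and layers whose names end in `_bn` stay frozen. The helper `unfreeze_last_n` is hypothetical and not part of this project; it is shown on stand-in objects that mimic the layer names printed earlier, and it relies only on each layer having `.name` and `.trainable`.

```python
from types import SimpleNamespace

def unfreeze_last_n(layers, n, keep_frozen_suffix="_bn"):
    """Make the last n layers trainable, keeping batch-norm-like layers frozen."""
    layers = list(layers)
    cutoff = max(len(layers) - n, 0)
    for index, layer in enumerate(layers):
        in_top_block = index >= cutoff
        is_batchnorm = layer.name.endswith(keep_frozen_suffix)
        layer.trainable = in_top_block and not is_batchnorm
    return layers

# Stand-in layers using the naming pattern of the ResNet50 listing.
names = ["conv1_conv", "conv1_bn", "conv2_block1_1_conv",
         "conv2_block1_1_bn", "dense"]
toy_layers = [SimpleNamespace(name=name, trainable=False) for name in names]

unfreeze_last_n(toy_layers, n=3)
trainable_names = [layer.name for layer in toy_layers if layer.trainable]
print(trainable_names)
```

With a real Keras model one would pass `model.layers` and recompile afterwards; whether to keep batch-normalization layers frozen is a design choice worth testing in a sweep.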

In [None]:
# make other layers trainable
model = set_trainable_base(model)
print_layer_trainable(model) # verify that all is trainable

In [None]:
opt = optimizers.Adam(learning_rate=1e-06) # use small LR
met = list(metrics_dict.values())

model.compile(
    loss=losses.CategoricalCrossentropy(), # multiclass
    optimizer=opt,
    metrics=met,
)

In [None]:
def get_save_path(model_name):
    model_dir = PATHS.models_dir
    if not os.path.exists(model_dir):
        os.makedirs(model_dir)
    model_filepath = os.path.join(model_dir, model_name)
    return model_filepath

In [None]:
model_name = "resnet_ft_radimagenet_weights.keras"
model_filepath = get_save_path(model_name)

callbacks = [
    keras.callbacks.ModelCheckpoint(
        filepath=model_filepath,
        save_best_only=True,
        monitor="val_ROC",
        mode="max",  # ROC AUC: higher is better
    ),
]

# set verbose to 0 to limit notebook size
history = model.fit(
    train_dataset,
    validation_data=val_dataset,
    epochs=30, # more epochs
    shuffle=True,
    callbacks=callbacks,
    verbose=0,
)
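With `save_best_only=True`, the checkpoint on disk comes from the epoch with the best monitored value, and "best" depends on whether the callback maximizes or minimizes the metric. For an AUC such as `val_ROC`, higher is better. The pure-Python sketch below mimics that selection to show how the direction changes the saved epoch; the `val_roc` values are made up and `best_epoch` is a hypothetical helper, not a Keras function.

```python
def best_epoch(values, mode="max"):
    """Return (epoch_index, value) of the best entry under the given mode."""
    if mode not in ("max", "min"):
        raise ValueError("mode must be 'max' or 'min'")
    best_index, best_value = None, None
    for index, value in enumerate(values):
        improved = (
            best_value is None
            or (mode == "max" and value > best_value)
            or (mode == "min" and value < best_value)
        )
        if improved:
            best_index, best_value = index, value
    return best_index, best_value

# Made-up validation ROC AUC per epoch.
val_roc = [0.71, 0.78, 0.76, 0.80, 0.79]

picked_max = best_epoch(val_roc, mode="max")  # keeps the highest AUC
picked_min = best_epoch(val_roc, mode="min")  # would keep the lowest AUC
print(picked_max, picked_min)
```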

In [None]:
# load model again and inspect which layers are trainable
best_model = keras.models.load_model(model_filepath)
print_layer_trainable(best_model)

# freeze every layer of the loaded model before re-saving it
for layer in best_model.layers:
    layer.trainable = False
print_layer_trainable(best_model)

# save after editing trainable layers
best_model.save(model_filepath)

In [None]:
best_model = keras.models.load_model(model_filepath)

# test metrics (return_dict=True pairs each value with its metric name, loss included)
metrics_results = best_model.evaluate(test_dataset, return_dict=True)
print("Metrics:\n", metrics_results)

# get preds
trained_preds = best_model.predict(test_dataset)
trained_preds
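The prediction array holds one softmax row per test image. A minimal sketch of turning such rows into class labels follows, using the `class_names` order defined earlier (`healthy`, `unhealthy`). The example rows are placeholders, not real outputs of the fine-tuned model.

```python
import numpy as np

class_names = ["healthy", "unhealthy"]

# Placeholder softmax rows standing in for model predictions.
example_preds = np.array([
    [0.45578143, 0.5442186],
    [0.5413459, 0.45865414],
])

# Index of the most probable class per row, mapped to its name.
pred_indices = np.argmax(example_preds, axis=1)
pred_labels = [class_names[i] for i in pred_indices]
print(pred_labels)
```

Because the test generator was created with `shuffle=False`, the rows should line up with `test_dataset.classes` for later comparison against the true labels.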

## End