# Model Training and Transfer Notebook

## Description:
This Colab notebook provides a comprehensive exploration of model training and transfer scenarios using various initialization strategies. The notebook showcases four distinct scenarios, each shedding light on the impact of different weight initialization techniques on model performance. Through a step-by-step guide, users gain insights into how models are trained, initialized, and transferred between different locations.

## Scenarios:

### 1. Random Initialization:
In this scenario, the notebook demonstrates the training process of a model from scratch. The model is initialized with random weights, and the notebook elucidates how this initialization strategy affects the learning process and eventual convergence. Users will grasp the nuances of training a model with no prior knowledge.

### 2. Pre-trained ImageNet Initialization:
This section delves into the utilization of pre-trained weights from the ImageNet dataset. The notebook elucidates the advantages of leveraging pre-trained weights as initializations. Users witness the accelerated convergence and improved performance achieved when the model starts with learned features.

### 3. Location Transfer with Random Initialization:
The notebook examines the scenario of transferring a model trained at one location to another. It employs the model from Scenario 1 as the source and demonstrates how to adapt it to a new task or dataset. This illustrates the adaptability of models to novel contexts while using random initialization.

### 4. Location Transfer with ImageNet-initialized Models:
Expanding on the third scenario, this section focuses on transferring a model across locations while using the model from Scenario 2 as the source. Users witness how pre-trained weights enhance transfer learning, enabling the model to quickly adapt to new environments and tasks.

Through clear explanations, code examples, and visualizations, this Colab notebook equips users with the knowledge and skills to effectively initialize, train, and transfer models. Whether experimenting with random weights or harnessing the power of pre-trained networks, users will gain a deeper understanding of the intricate dynamics that influence model behavior in various scenarios.


## Initialize the environment

In [None]:
%pip install segmentation-models &> /dev/null
%load_ext tensorboard

# Workaround for https://stackoverflow.com/questions/75433717/module-keras-utils-generic-utils-has-no-attribute-get-custom-objects-when-im
# In efficientnet/keras.py, change every 'init_keras_custom_objects' to 'init_tfkeras_custom_objects'.
# The error message shows where keras.py is installed; the command below overwrites that file with a patched copy.
!cp 'libskeras.py' '/usr/local/lib/python3.10/dist-packages/efficientnet/keras.py'

import os
import shutil

import numpy as np
import segmentation_models as sm
import tensorflow as tf
from keras import backend as K
from tensorflow.keras.callbacks import (EarlyStopping, ModelCheckpoint,
                                        ReduceLROnPlateau, TensorBoard)
from tensorflow.keras.layers import Conv2D, Input
from tensorflow.keras.models import Model, load_model
from tensorflow.keras.optimizers import SGD, Adam, RMSprop

from unet_util import (UNET_224, Residual_CNN_block,
                       attention_up_and_concatenate,
                       attention_up_and_concatenate2, dice_coef,
                       dice_coef_loss, evaluate_prediction_result, jacard_coef,
                       multiplication, multiplication2)

sm.set_framework('tf.keras')
sm.framework()

In [None]:
name_id = '06302023'  # Change the ID for each run so that all models and stats are saved separately.
input_data = './samples/'
prediction_path = './predicts_rerun_covington_'+name_id+'/'
log_path = './logs_rerun_covington_'+name_id+'/'
model_path = './models_rerun_covington_'+name_id+'/'
save_model_path = './model_rerun_covington_'+name_id+'/'

# Create the folders if they do not exist
os.makedirs(input_data, exist_ok=True)
os.makedirs(log_path, exist_ok=True)
os.makedirs(model_path, exist_ok=True)
os.makedirs(save_model_path, exist_ok=True)
os.makedirs(prediction_path, exist_ok=True)

## Select the models and choose the location

The available backbones are listed in the cell below; copy the ones you want into the backends list. The available sample locations are listed there as well; choose one location to train the model.
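
A misspelled backbone or location name only fails once training starts, so it can help to check the selection first. The following is a minimal sketch, not part of the original notebook: the helper `check_selection` and the two constant sets are hypothetical, and the names in them are copied from the lists in this notebook.

```python
# Backbone and location names copied from the comment lists in this notebook.
AVAILABLE_BACKBONES = {
    'vgg16', 'vgg19', 'resnet18', 'resnet34', 'resnet50', 'resnet101',
    'resnet152', 'inceptionv3', 'inceptionresnetv2', 'densenet121',
    'densenet169', 'densenet201', 'seresnet18', 'seresnet34', 'seresnet50',
    'seresnet101', 'seresnet152', 'attentionUnet',
}
AVAILABLE_LOCATIONS = {'covington', 'rowancreek'}


def check_selection(backends, location):
    """Raise ValueError for names outside the sets above (hypothetical helper)."""
    unknown = sorted(set(backends) - AVAILABLE_BACKBONES)
    if unknown:
        raise ValueError(f'Unknown backbone(s): {unknown}')
    if location not in AVAILABLE_LOCATIONS:
        raise ValueError(f'Unknown location: {location!r}')
    return list(backends), location


selected_backends, selected_location = check_selection(
    ['resnet50', 'attentionUnet', 'densenet121', 'densenet169'], 'covington')
```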

In [None]:
# Available backbones for the U-Net architecture
# 'vgg16' 'vgg19' 'resnet18' 'resnet34' 'resnet50' 'resnet101' 'resnet152' 'inceptionv3'
# 'inceptionresnetv2' 'densenet121' 'densenet169' 'densenet201' 'seresnet18' 'seresnet34'
# 'seresnet50' 'seresnet101' 'seresnet152', and 'attentionUnet'
backends = [
    'resnet50', 'attentionUnet', 'densenet121', 'densenet169'
]

# Data location
# 'covington' 'rowancreek'
location = 'covington'


## Prepare data

In [None]:
X_train = np.load(input_data+location+'/train_data.npy').astype(np.float32)
Y_train = np.load(input_data+location+'/train_label.npy').astype(np.float32)
X_validation = np.load(input_data+location+'/vali_data.npy').astype(np.float32)
Y_validation = np.load(input_data+location+'/vali_label.npy').astype(np.float32)
# load the test data
X_test = np.load(input_data+location+'/bottom_half_test_data.npy').astype(np.float32)
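
Before training, it may be worth confirming that each data array has the 8 channels the model input expects and that the labels match it tile for tile. The sketch below uses synthetic arrays in place of the .npy files. The helper `check_split`, the 224x224 tile size, and the label layout are assumptions made for illustration.

```python
import numpy as np


def check_split(data, label, channels=8):
    """Check that a data/label pair is consistent (hypothetical helper)."""
    if data.ndim != 4 or data.shape[-1] != channels:
        raise ValueError(f'expected (N, H, W, {channels}) data, got {data.shape}')
    if label.shape[:3] != data.shape[:3]:
        raise ValueError(f'label shape {label.shape} does not match data shape {data.shape}')
    return data.shape[0]


# Synthetic stand-ins for the loaded arrays; the tile size is an assumption.
X_demo = np.zeros((4, 224, 224, 8), dtype=np.float32)
Y_demo = np.zeros((4, 224, 224, 1), dtype=np.float32)
n_samples = check_split(X_demo, Y_demo)
```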

## Scenario 1: Random Initialization Training

This scenario demonstrates the training process of a model with randomly initialized weights. It serves as a baseline for understanding how models evolve and adapt to a specific task when starting from scratch.
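
The training cells use dice_coef and dice_coef_loss from unet_util, whose source is not shown in this notebook. As a rough illustration only, here is one common formulation of the Dice coefficient in NumPy. The function names, the smoothing term, and the choice of 1 - Dice as the loss are assumptions and may differ from what unet_util implements.

```python
import numpy as np


def dice_coef_np(y_true, y_pred, smooth=1.0):
    """Dice coefficient of two masks, with a smoothing term to avoid 0/0."""
    y_true = np.asarray(y_true, dtype=np.float64).ravel()
    y_pred = np.asarray(y_pred, dtype=np.float64).ravel()
    intersection = np.sum(y_true * y_pred)
    return (2.0 * intersection + smooth) / (np.sum(y_true) + np.sum(y_pred) + smooth)


def dice_loss_np(y_true, y_pred, smooth=1.0):
    """Loss that decreases as the overlap grows (assumed form: 1 - Dice)."""
    return 1.0 - dice_coef_np(y_true, y_pred, smooth)


perfect = dice_coef_np([1, 1, 0], [1, 1, 0])   # identical masks
disjoint = dice_coef_np([1, 0], [0, 1])        # no overlapping pixels
```

With identical masks the coefficient is 1, and it falls toward 0 as the overlap shrinks, which is why minimizing the loss pushes predictions toward the labels.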


In [None]:
# Fine-tuning flag
# Set to False to initialize the model with random weights
finetune = False

for backend in backends:

  name = location+'-Unet-'+ backend + ('-tf' if(finetune) else '')

  logdir = log_path + name
  if(os.path.isdir(logdir)):
    shutil.rmtree(logdir)
  os.makedirs(logdir, exist_ok=True)
  tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=logdir)

  print('model location: '+ model_path+name+'.h5')

  # Create U-net model with the chosen backbone
  if (backend=="attentionUnet"):
    # Attention U-net model
    learning_rate = 0.0000359
    model = UNET_224()
    model.compile(optimizer = Adam(learning_rate=learning_rate),
                  loss = dice_coef_loss,
                  metrics = [dice_coef,'accuracy'])
  else:
    if (not finetune):
      # Unet without ImageNet backends
      base_model = sm.Unet(backend, classes = 1, encoder_weights=None, input_shape=(None, None, 3))

    else:
      # Unet with ImageNet backends
      base_model = sm.Unet(backend, classes = 1, encoder_weights = 'imagenet', encoder_freeze = finetune)

    # The backbones are trained on RGB images, so we add a new input with 8 channels
    # Conv2D converts the 8-channel input to the 3-channel input the backbones expect
    inp = Input(shape=(None, None, 8))
    l1 = Conv2D(3, (1, 1))(inp) # map N channels data to 3 channels
    out = base_model(l1)
    model = Model(inp, out, name = base_model.name)

    # Compile the model with 'Adam' optimizer (0.001 is the default learning rate) and define the loss and metrics
    model.compile(optimizer = Adam(),
                  loss = dice_coef_loss,
                  metrics=[dice_coef,'accuracy'])

  # define hyperparameters and callback modules
  patience = 10
  maxepoch = 500
  callbacks = [ReduceLROnPlateau(monitor='val_loss', factor=0.7, patience=patience, min_lr=1e-9, verbose=1, mode='min'),
              EarlyStopping(monitor='val_loss', patience=patience, verbose=0),
              ModelCheckpoint(model_path+name+'.h5', monitor='val_loss', save_best_only=True, verbose=0),
              TensorBoard(log_dir=logdir)]

  train_history = model.fit(x = X_train,y = Y_train,
                            validation_data = (X_validation, Y_validation),
                            batch_size = 16, epochs = maxepoch, verbose=0, callbacks = callbacks)

  if(finetune and backend != "attentionUnet"):

    # For fine-tuning we need to set the trainable flag to True for the whole model
    model.trainable = True

    # Recompile the model with the smaller learning rate at the optimizer (Adam(1e-5))
    model.compile(optimizer = Adam(1e-5), loss = dice_coef_loss, metrics=[dice_coef,'accuracy'])

    # train the model again
    train_history_2 = model.fit(x = X_train, y = Y_train,
                                validation_data=(X_validation, Y_validation),
                                batch_size=16,epochs=maxepoch,
                                initial_epoch = len(train_history.history['val_loss'])-1,
                                verbose=0 ,callbacks=callbacks)

  # Load the best model saved by the callback module
  if(backend != "attentionUnet"):
    best_model = load_model(model_path+name+'.h5',
                            custom_objects={'dice_coef_loss':dice_coef_loss,
                                            'dice_coef':dice_coef,})
  else:
    best_model = load_model(model_path+name+'.h5',
                            custom_objects={'multiplication': multiplication,
                                            'multiplication2': multiplication2,
                                            'dice_coef_loss':dice_coef_loss,
                                            'dice_coef':dice_coef,})

  # predict the test data using the loaded model
  test_predicted = best_model.predict(X_test)

  # convert the prediction probability to 0 or 1 with a threshold of 0.5
  test_predicted_threshold = (test_predicted > 0.5).astype(np.uint8)

  # save the prediction results
  np.save(prediction_path+name+'_predict.npy', test_predicted_threshold)
  print('Prediction results saved: ' + prediction_path+name+'_predict.npy')

  pred_npy = prediction_path+name+'_predict.npy'
  mask_npy = input_data+location+'/bottom_half_test_mask.npy'
  label_npy = input_data+location+'/bottom_half_test_label.npy'
  model = model_path + name + '.h5'
  text_path = prediction_path+'prediction_results.txt'

  evaluate_prediction_result(location, pred_npy, mask_npy, label_npy, model, text_path)
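
The ReduceLROnPlateau callback in the cell above multiplies the learning rate by factor=0.7 each time it triggers, and never goes below min_lr=1e-9. The arithmetic can be sketched in plain Python. The two helper functions are hypothetical and only model the multiplication, not the callback's plateau detection.

```python
def reduced_lr(initial_lr, factor, min_lr, n_reductions):
    """Learning rate after n plateau-triggered reductions, floored at min_lr."""
    return max(initial_lr * factor ** n_reductions, min_lr)


def reductions_needed(initial_lr, factor, target_lr):
    """Number of reductions until the rate is at or below target_lr."""
    n = 0
    lr = initial_lr
    while lr > target_lr:
        lr *= factor
        n += 1
    return n


# Adam's default rate (1e-3) with the notebook's factor=0.7 and min_lr=1e-9.
lr_after_three = reduced_lr(1e-3, 0.7, 1e-9, 3)
n_to_finetune_rate = reductions_needed(1e-3, 0.7, 1e-5)
```

Under these settings it takes 13 reductions to bring the default rate down to the 1e-5 used in the fine-tuning stage, so the schedule decays slowly.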

## Scenario 2: Pre-trained ImageNet Initialization Training

The second scenario showcases the training process of a model initialized with pre-trained weights from the ImageNet dataset. This highlights the benefits of leveraging transfer learning and how pre-existing knowledge can expedite convergence and enhance overall performance.
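
The ImageNet backbones expect 3-channel input, while the samples here have 8 channels, so the training cells place a Conv2D(3, (1, 1)) layer in front of the backbone. A 1x1 convolution applies the same linear map to every pixel independently. The NumPy sketch below illustrates this with random stand-in weights and synthetic tiles; it is not the trained layer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two synthetic tiles of 4x4 pixels with 8 input bands.
tiles = rng.normal(size=(2, 4, 4, 8))

# A 1x1 convolution from 8 to 3 channels has one 8x3 weight matrix and 3 biases.
weights = rng.normal(size=(8, 3))
bias = np.zeros(3)

# Apply the same linear map to every pixel independently.
mapped = np.einsum('nhwc,cd->nhwd', tiles, weights) + bias

# Each output pixel is that pixel's 8-band vector times the weight matrix.
one_pixel = tiles[0, 1, 2] @ weights + bias
```

The spatial size is unchanged and only the channel dimension goes from 8 to 3, so the layer adds just 8*3 + 3 = 27 trainable parameters.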

In [None]:
# Fine-tuning flag
# the flag is set to True to allow the model to be initialized by ImageNet pre-trained weights
finetune = True

for backend in backends:

  name = location+'-Unet-'+ backend + ('-tf' if(finetune) else '')

  logdir = log_path + name
  if(os.path.isdir(logdir)):
    shutil.rmtree(logdir)
  os.makedirs(logdir, exist_ok=True)
  tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=logdir)

  print('model location: '+ model_path+name+'.h5')

  # Create U-net model with the chosen backbone

  if (backend=="attentionUnet"):
    # Attention U-net model
    learning_rate = 0.0000359
    model = UNET_224()
    model.compile(optimizer = Adam(learning_rate=learning_rate),
                  loss = dice_coef_loss,
                  metrics = [dice_coef,'accuracy'])
  else:
    if (not finetune):
      # Unet without ImageNet backends
      base_model = sm.Unet(backend, classes = 1, encoder_weights=None, input_shape=(None, None, 3))
      # name = name + '-addedlayer'
    else:
      # Unet with ImageNet backends
      base_model = sm.Unet(backend, classes = 1, encoder_weights = 'imagenet', encoder_freeze = finetune)

    # The backbones are trained on RGB images, so we add a new input with 8 channels
    # Conv2D converts the 8-channel input to the 3-channel input the backbones expect
    inp = Input(shape=(None, None, 8))
    l1 = Conv2D(3, (1, 1))(inp) # map N channels data to 3 channels
    out = base_model(l1)
    model = Model(inp, out, name = base_model.name)

    # Compile the model with 'Adam' optimizer (0.001 is the default learning rate) and define the loss and metrics
    model.compile(optimizer = Adam(),
                  loss = dice_coef_loss,
                  metrics=[dice_coef,'accuracy'])

  # define hyperparameters and callback modules
  patience = 10
  maxepoch = 500
  callbacks = [ReduceLROnPlateau(monitor='val_loss', factor=0.7, patience=patience, min_lr=1e-9, verbose=1, mode='min'),
              EarlyStopping(monitor='val_loss', patience=patience, verbose=0),
              ModelCheckpoint(model_path+name+'.h5', monitor='val_loss', save_best_only=True, verbose=0),
              TensorBoard(log_dir=logdir)]

  train_history = model.fit(x = X_train,y = Y_train,
                            validation_data = (X_validation, Y_validation),
                            batch_size = 16, epochs = maxepoch, verbose=0, callbacks = callbacks)

  if(finetune and backend != "attentionUnet"):

    # For fine-tuning we need to set the trainable flag to True for the whole model
    model.trainable = True

    # Recompile the model with the smaller learning rate at the optimizer (Adam(1e-5))
    model.compile(optimizer = Adam(1e-5), loss = dice_coef_loss, metrics=[dice_coef,'accuracy'])

    # train the model again
    train_history_2 = model.fit(x = X_train, y = Y_train,
                                validation_data=(X_validation, Y_validation),
                                batch_size=16,epochs=maxepoch,
                                initial_epoch = len(train_history.history['val_loss'])-1,
                                verbose=0 ,callbacks=callbacks)

  # Load the best model saved by the callback module
  if(backend != "attentionUnet"):
    best_model = load_model(model_path+name+'.h5',
                            custom_objects={'dice_coef_loss':dice_coef_loss,
                                            'dice_coef':dice_coef,})
  else:
    best_model = load_model(model_path+name+'.h5',
                            custom_objects={'multiplication': multiplication,
                                            'multiplication2': multiplication2,
                                            'dice_coef_loss':dice_coef_loss,
                                            'dice_coef':dice_coef,})

  # predict the test data using the loaded model
  test_predicted = best_model.predict(X_test)

  # convert the prediction probability to 0 or 1 with a threshold of 0.5
  test_predicted_threshold = (test_predicted > 0.5).astype(np.uint8)

  # save the prediction results
  np.save(prediction_path+name+'_predict.npy', test_predicted_threshold)
  print('Prediction results saved: ' + prediction_path+name+'_predict.npy')

  pred_npy = prediction_path+name+'_predict.npy'
  mask_npy = input_data+location+'/bottom_half_test_mask.npy'
  label_npy = input_data+location+'/bottom_half_test_label.npy'
  model = model_path + name + '.h5'
  text_path = prediction_path+'prediction_results.txt'

  evaluate_prediction_result(location, pred_npy, mask_npy, label_npy, model, text_path)
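
The second fit call in the cell above passes initial_epoch = len(train_history.history['val_loss']) - 1 together with the same epochs value. In Keras, epochs is the index at which training stops, not a count of additional epochs, so the second stage has at most epochs - initial_epoch epochs left. The sketch below reproduces that bookkeeping. The helper name and the 42-epoch first stage are made up for illustration.

```python
def second_stage_range(stage_one_val_losses, max_epoch):
    """Return (initial_epoch, epochs_left) for the second fit call."""
    initial_epoch = len(stage_one_val_losses) - 1
    epochs_left = max(max_epoch - initial_epoch, 0)
    return initial_epoch, epochs_left


# Suppose early stopping ended the first stage after 42 epochs (made-up number).
initial_epoch, epochs_left = second_stage_range([0.5] * 42, 500)
```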

## Scenario 3: Location Transfer with Random Initialization

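This scenario employs the model from Scenario 1 as the source and demonstrates how to adapt it to a new task or dataset. This illustrates the adaptability of models to novel contexts while using random initialization. The source model must already exist, so run Scenario 1 with location set to the source location before running this cell.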
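
The transfer cells locate models purely by file name, so it helps to see the naming convention in one place. The sketch below restates the string concatenation used in this notebook as two hypothetical helper functions. The suffixes '-tf', '-noImgN', and '-ImgN' come from the code cells.

```python
def trained_model_name(location, backend, finetune):
    """Name used by Scenarios 1 and 2; '-tf' marks the ImageNet/fine-tuned run."""
    return f'{location}-Unet-{backend}' + ('-tf' if finetune else '')


def transfer_model_name(source, target, backend, no_imagenet):
    """Name used by Scenarios 3 and 4 for the model adapted to the target."""
    suffix = '-noImgN' if no_imagenet else '-ImgN'
    return f'{source}-to-{target}-Unet-{backend}{suffix}'


scenario1_name = trained_model_name('rowancreek', 'resnet50', finetune=False)
scenario2_name = trained_model_name('rowancreek', 'resnet50', finetune=True)
scenario3_name = transfer_model_name('rowancreek', 'covington', 'resnet50', no_imagenet=True)
```

Each name gets '.h5' appended when the checkpoint is written or loaded.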

In [None]:
# Source Location
# 'covington' 'rowancreek'
source_location = 'rowancreek'

# Target Location
# 'covington' 'rowancreek'
target_location = 'covington'

# Fine-tuning flag
# True/False
finetune = True

# Use the source model trained without ImageNet-initialized weights (Scenario 1)
noImagenet = True

for backend in backends:

  # Construct the source model file name
  # Scenario 1 saves models without the '-tf' suffix; Scenario 2 saves them with it
  source_name = source_location+'-Unet-'+backend + ('' if(noImagenet) else '-tf')


  # Construct the target model file name
  tareget_name = source_location+'-to-'+target_location+'-Unet-'+backend + ('-noImgN' if(noImagenet) else '-ImgN')

  logdir = log_path + tareget_name
  if(os.path.isdir(logdir)):
    shutil.rmtree(logdir)
  os.makedirs(logdir, exist_ok=True)
  # tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=logdir)

  print('source model location: '+ model_path+source_name+'.h5')
  print('target model location: '+ save_model_path+tareget_name+'.h5')

  # define hyperparameters and callback modules
  patience = 10
  maxepoch = 500
  callbacks = [ReduceLROnPlateau(monitor='val_loss', factor=0.7, patience=patience, min_lr=1e-9, verbose=1, mode='min'),
              EarlyStopping(monitor='val_loss', patience=patience, verbose=0),
              ModelCheckpoint(save_model_path+tareget_name+'.h5', monitor='val_loss', save_best_only=True, verbose=0),
              TensorBoard(log_dir=logdir)]

  # Load the source model trained at the source location
  if(backend != "attentionUnet"):
    model = load_model(model_path+source_name+'.h5',
                          custom_objects={'dice_coef_loss':dice_coef_loss,
                                          'dice_coef':dice_coef,})
  else:
    model = load_model(model_path+source_name+'.h5',
                          custom_objects={'multiplication': multiplication,
                                          'multiplication2': multiplication2,
                                          'dice_coef_loss':dice_coef_loss,
                                          'dice_coef':dice_coef,})


  train_history = model.fit(x = X_train,y = Y_train,
                            validation_data = (X_validation, Y_validation),
                            batch_size = 16, epochs = maxepoch, verbose=0, callbacks = callbacks)

  if(finetune or noImagenet):

    # For fine-tuning we need to set the trainable flag to True for the whole model
    model.trainable = True

    # Recompile the model with the smaller learning rate at the optimizer (Adam(1e-5))
    model.compile(optimizer = Adam(1e-5), loss = dice_coef_loss, metrics=[dice_coef,'accuracy'])

    # train the model again
    train_history_2 = model.fit(x = X_train, y = Y_train,
                                validation_data=(X_validation, Y_validation),
                                batch_size=16,epochs=maxepoch,
                                initial_epoch = len(train_history.history['val_loss'])-1,
                                verbose=0 ,callbacks=callbacks)

  # Load the best model saved by the callback module
  if(backend != "attentionUnet"):
    best_model = load_model(save_model_path+tareget_name+'.h5',
                          custom_objects={'dice_coef_loss':dice_coef_loss,
                                          'dice_coef':dice_coef,})
  else:
    best_model = load_model(save_model_path+tareget_name+'.h5',
                          custom_objects={'multiplication': multiplication,
                                          'multiplication2': multiplication2,
                                          'dice_coef_loss':dice_coef_loss,
                                          'dice_coef':dice_coef,})

  # predict the test data using the loaded model
  test_predicted = best_model.predict(X_test)

  # convert the prediction probability to 0 or 1 with a threshold of 0.5
  test_predicted_threshold = (test_predicted > 0.5).astype(np.uint8)

  # save the prediction results
  np.save(prediction_path+tareget_name+'_predict.npy', test_predicted_threshold)
  print('Prediction results saved: ' + prediction_path+tareget_name+'_predict.npy')

  # pred_npy = prediction_path+tareget_name+'_predict.npy'
  # mask_npy = input_data+target_location+'/bottom_half_test_mask.npy'
  # label_npy = input_data+target_location+'/bottom_half_test_label.npy'
  # model = save_model_path + tareget_name + '.h5'
  # text_path = prediction_path+'prediction_results.txt'

  # evaluate_prediction_result(target_location, pred_npy, mask_npy, label_npy, model, text_path)

## Scenario 4: Location Transfer with ImageNet-initialized Models

Expanding on the third scenario, this section focuses on transferring a model across locations while using the model from Scenario 2 as the source. Users witness how pre-trained weights enhance transfer learning, enabling the model to quickly adapt to new environments and tasks.
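
This scenario loads checkpoints that Scenario 2 must already have written for the source location, under names ending in '-tf.h5'. A quick check for missing files can save a failed run. The sketch below is a hypothetical helper demonstrated on a temporary folder, not on the notebook's real model_path.

```python
import os
import tempfile


def missing_source_models(model_dir, source_location, backends, suffix='-tf'):
    """List expected source checkpoints that are not on disk (hypothetical helper)."""
    expected = [os.path.join(model_dir, f'{source_location}-Unet-{b}{suffix}.h5')
                for b in backends]
    return [path for path in expected if not os.path.isfile(path)]


# Demo in a temporary folder where only the resnet50 checkpoint exists.
with tempfile.TemporaryDirectory() as demo_dir:
    open(os.path.join(demo_dir, 'rowancreek-Unet-resnet50-tf.h5'), 'w').close()
    missing = missing_source_models(demo_dir, 'rowancreek', ['resnet50', 'densenet121'])
    missing_names = [os.path.basename(path) for path in missing]
```

In the notebook, the same call with model_path, source_location, and backends would report which source models still need to be trained.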

In [None]:
# Source Location
# 'covington' 'rowancreek'
source_location = 'rowancreek'

# Target Location
# 'covington' 'rowancreek'
target_location = 'covington'

# Fine-tuning flag
# True/False
finetune = True

# Use the source model trained with ImageNet-initialized weights (Scenario 2)
noImagenet = False


for backend in backends:

  # Construct the source model file name
  # Scenario 2 saves models with the '-tf' suffix; Scenario 1 saves them without it
  source_name = source_location+'-Unet-'+backend + ('' if(noImagenet) else '-tf')


  # Construct the target model file name
  tareget_name = source_location+'-to-'+target_location+'-Unet-'+backend + ('-noImgN' if(noImagenet) else '-ImgN')

  logdir = log_path + tareget_name
  if(os.path.isdir(logdir)):
    shutil.rmtree(logdir)
  os.makedirs(logdir, exist_ok=True)
  # tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=logdir)

  print('source model location: '+ model_path+source_name+'.h5')
  print('target model location: '+ save_model_path+tareget_name+'.h5')

  # define hyperparameters and callback modules
  patience = 10
  maxepoch = 500
  callbacks = [ReduceLROnPlateau(monitor='val_loss', factor=0.7, patience=patience, min_lr=1e-9, verbose=1, mode='min'),
              EarlyStopping(monitor='val_loss', patience=patience, verbose=0),
              ModelCheckpoint(save_model_path+tareget_name+'.h5', monitor='val_loss', save_best_only=True, verbose=0),
              TensorBoard(log_dir=logdir)]

  # Load the source model trained at the source location
  if(backend != "attentionUnet"):
    model = load_model(model_path+source_name+'.h5',
                          custom_objects={'dice_coef_loss':dice_coef_loss,
                                          'dice_coef':dice_coef,})
  else:
    model = load_model(model_path+source_name+'.h5',
                          custom_objects={'multiplication': multiplication,
                                          'multiplication2': multiplication2,
                                          'dice_coef_loss':dice_coef_loss,
                                          'dice_coef':dice_coef,})


  train_history = model.fit(x = X_train,y = Y_train,
                            validation_data = (X_validation, Y_validation),
                            batch_size = 16, epochs = maxepoch, verbose=0, callbacks = callbacks)

  if(finetune or noImagenet):

    # For fine-tuning we need to set the trainable flag to True for the whole model
    model.trainable = True

    # Recompile the model with the smaller learning rate at the optimizer (Adam(1e-5))
    model.compile(optimizer = Adam(1e-5), loss = dice_coef_loss, metrics=[dice_coef,'accuracy'])

    # train the model again
    train_history_2 = model.fit(x = X_train, y = Y_train,
                                validation_data=(X_validation, Y_validation),
                                batch_size=16,epochs=maxepoch,
                                initial_epoch = len(train_history.history['val_loss'])-1,
                                verbose=0 ,callbacks=callbacks)

  # Load the best model saved by the callback module
  if(backend != "attentionUnet"):
    best_model = load_model(save_model_path+tareget_name+'.h5',
                          custom_objects={'dice_coef_loss':dice_coef_loss,
                                          'dice_coef':dice_coef,})
  else:
    best_model = load_model(save_model_path+tareget_name+'.h5',
                          custom_objects={'multiplication': multiplication,
                                          'multiplication2': multiplication2,
                                          'dice_coef_loss':dice_coef_loss,
                                          'dice_coef':dice_coef,})

  # predict the test data using the loaded model
  test_predicted = best_model.predict(X_test)

  # convert the prediction probability to 0 or 1 with a threshold of 0.5
  test_predicted_threshold = (test_predicted > 0.5).astype(np.uint8)

  # save the prediction results
  np.save(prediction_path+tareget_name+'_predict.npy', test_predicted_threshold)
  print('Prediction results saved: ' + prediction_path+tareget_name+'_predict.npy')

  pred_npy = prediction_path+tareget_name+'_predict.npy'
  mask_npy = input_data+target_location+'/bottom_half_test_mask.npy'
  label_npy = input_data+target_location+'/bottom_half_test_label.npy'
  model = save_model_path + tareget_name + '.h5'
  text_path = prediction_path+'prediction_results.txt'

  evaluate_prediction_result(target_location, pred_npy, mask_npy, label_npy, model, text_path)
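
The evaluation above is delegated to evaluate_prediction_result from unet_util, whose internals are not shown in this notebook. As a hedged illustration of the kind of overlap metric involved, the sketch below thresholds probabilities at 0.5, as the cells do, and computes the Jaccard index (intersection over union) in NumPy. The function names and the toy arrays are assumptions, and the real evaluation may use the mask file and other metrics.

```python
import numpy as np


def binarize(prob, threshold=0.5):
    """Turn predicted probabilities into a 0/1 mask."""
    return (np.asarray(prob) > threshold).astype(np.uint8)


def jaccard(pred, label):
    """Intersection over union of two binary masks; 1.0 if both are empty."""
    pred = np.asarray(pred).astype(bool)
    label = np.asarray(label).astype(bool)
    union = np.logical_or(pred, label).sum()
    if union == 0:
        return 1.0
    return float(np.logical_and(pred, label).sum() / union)


# Toy 2x2 example: two pixels predicted positive, one of them correct.
toy_prob = np.array([[0.9, 0.2], [0.6, 0.4]])
toy_label = np.array([[1, 0], [0, 0]])
toy_pred = binarize(toy_prob)
toy_iou = jaccard(toy_pred, toy_label)
```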