# Notebook showing how transfer learning can be implemented in Keras (TensorFlow backend)

When models are created or used, they become part of the Keras/TensorFlow graph. Layers that are shared between models are not duplicated in the graph. Models left over from earlier steps can interfere with later training, so it is important to start from a clean graph.

It is therefore useful to call `keras.backend.clear_session()` whenever moving between training schemes or models. It is also important to transfer layer information between models through their shared model objects (backbones), using `model.save_weights()` and `model.load_weights()` as shown in the cells below.
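The pattern described above can be sketched with a toy model. This is a minimal sketch, not the notebook's code: `make_backbone` and `make_head_model` are hypothetical stand-ins for `ResNet` and the models built on top of it, and the `.weights.h5` file name is an assumption chosen because HDF5 weight files should be accepted by both older and newer Keras releases.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


def make_backbone():
    # Hypothetical stand-in for the notebook's ResNet backbone.
    return tf.keras.Sequential(
        [tf.keras.layers.Dense(8, activation="relu")], name="backbone"
    )


def make_head_model(backbone, units, name):
    # Hypothetical stand-in for Classifier / Discriminator: backbone plus one Dense head.
    return tf.keras.Sequential([backbone, tf.keras.layers.Dense(units)], name=name)


x = np.random.normal(size=(4, 5)).astype("float32")
weights_path = os.path.join(tempfile.mkdtemp(), "backbone.weights.h5")

# Scheme 1: build a model around a backbone and save the backbone's weights.
backbone_a = make_backbone()
model_a = make_head_model(backbone_a, units=3, name="model_a")
_ = model_a(x)  # call once so that all variables are created
features_a = backbone_a(x).numpy()
backbone_a.save_weights(weights_path)

tf.keras.backend.clear_session()  # start the next scheme from a clean state

# Scheme 2: a fresh backbone inside a different model, restored from disk.
backbone_b = make_backbone()
model_b = make_head_model(backbone_b, units=1, name="model_b")
_ = model_b(x)  # variables must exist before load_weights can fill them
backbone_b.load_weights(weights_path)
features_b = backbone_b(x).numpy()

# The restored backbone should reproduce the features of the saved one.
features_match = np.allclose(features_a, features_b)
print(features_match)
```

Each model is called once before its weights are saved or loaded, which mirrors the `_ = model(...)` calls in the cells below.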

In [1]:
# GAN training script
from models import ResNet, Discriminator, Classifier, ResGen, Unet, mnist_data
import keras
from keras.datasets import mnist
import numpy as np 
import tensorflow as tf
import os


Using TensorFlow backend.


In [2]:
data = mnist_data()


In [3]:
keras.backend.clear_session()

# Create models and test outputs

## Initialise Discriminator model

In [4]:
backbone = ResNet()
discriminator = Discriminator(backbone)
discriminator_predicitons_1 = discriminator(data.get_test()[0])
discriminator.save_weights('discriminator_weights.npy')
backbone.save_weights('backbone_pretrained_weights.npy')

keras.backend.clear_session()

W0417 12:12:53.746738 4551929280 base_layer.py:1790] Layer Discriminator is casting an input tensor from dtype float64 to the layer's dtype of float32, which is new behavior in TensorFlow 2.  The layer has dtype float32 because it's dtype defaults to floatx.


To change all layers to have dtype float64 by default, call `tf.keras.backend.set_floatx('float64')`. To change just this layer, pass dtype='float64' to the layer constructor. If you are the author of this layer, you can disable autocasting by passing autocast=False to the base Layer constructor.
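The warning above is raised because the input array is float64 while the layer uses float32. One way to avoid the implicit cast, sketched here under the assumption that the data is a plain NumPy array, is to convert it once before calling the model. The `images` array is a hypothetical stand-in for `data.get_test()[0]`, and its shape is illustrative only.

```python
import numpy as np

# Hypothetical stand-in for data.get_test()[0]; NumPy produces float64 by default.
images = np.random.normal(size=(50, 28, 28, 1))
print(images.dtype)

# Cast once, up front, to the dtype that the layers use.
images32 = images.astype(np.float32)
print(images32.dtype)
```

The same applies to noise drawn with `np.random.normal`, which also returns float64.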



In [5]:
discriminator_predicitons_1

<tf.Tensor: shape=(10000, 1), dtype=float32, numpy=
array([[0.75502723],
       [0.70718175],
       [0.6879473 ],
       ...,
       [0.707958  ],
       [0.6520088 ],
       [0.79798913]], dtype=float32)>

## Initialise Generator model

In [6]:
backbone = ResNet()
backbone(data.get_test()[0])
generator = ResGen(backbone)
# _ = generator.predict(data.get_test()[0])
backbone.load_weights('backbone_pretrained_weights.npy')
generator.save_weights('generator_weights.npy')
# 50 random noise samples of shape 7x7x1
rand_data_shape = (50, 7, 7, 1)
random_noise_data = np.random.normal(size=rand_data_shape)
generator_predicitons_1 = generator(random_noise_data)

keras.backend.clear_session()

W0417 12:13:04.407965 4551929280 base_layer.py:1790] Layer ResNet is casting an input tensor from dtype float64 to the layer's dtype of float32, which is new behavior in TensorFlow 2.  The layer has dtype float32 because it's dtype defaults to floatx.



W0417 12:13:14.763929 4551929280 base_layer.py:1790] Layer ResGen is casting an input tensor from dtype float64 to the layer's dtype of float32, which is new behavior in TensorFlow 2.  The layer has dtype float32 because it's dtype defaults to floatx.



## Initialise Classifier model

In [7]:
keras.backend.clear_session()
backbone = ResNet()
classifier = Classifier(backbone,10)
classifier_predicitons_0 = classifier.predict(data.get_test()[0])
backbone.load_weights('backbone_pretrained_weights.npy')




# Train Classifier model

In [8]:
# sparse_categorical_crossentropy expects integer class labels
classifier.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
classifier.summary()
# Quick run on the small debug split
classifier.fit(x=data.get_debug()[0], y=data.get_debug()[1], batch_size=6000, epochs=20)
# Full training run, left disabled:
# classifier.fit(x=data.get_train()[0], y=data.get_train()[1], batch_size=6000, epochs=1, validation_data=data.get_vali())


Model: "Classifier"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
ResNet (ResNet)              multiple                  208080    
_________________________________________________________________
batch_normalization (BatchNo multiple                  25088     
_________________________________________________________________
flatten (Flatten)            multiple                  0         
_________________________________________________________________
dense (Dense)                multiple                  62730     
=================================================================
Total params: 295,898
Trainable params: 282,010
Non-trainable params: 13,888
_________________________________________________________________
Train on 10 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoc

<tensorflow.python.keras.callbacks.History at 0x141722828>

## Save classifier

In [9]:
backbone = classifier.get_backbone()
backbone.save_weights('backbone_posttrained_weights.npy')
classifier.save_weights('classifier_weights.h5')
classifier_predicitons_1 = classifier.predict(data.get_test()[0])

keras.backend.clear_session()

## Generate predictions after training

In [10]:
keras.backend.clear_session()
backbone = ResNet()
discriminator = Discriminator(backbone)
_ = discriminator(data.get_test()[0])
discriminator.load_weights('discriminator_weights.npy')
backbone.load_weights('backbone_posttrained_weights.npy')
discriminator_predicitons_2 = discriminator(data.get_test()[0])



W0417 12:13:30.551861 4551929280 base_layer.py:1790] Layer Discriminator is casting an input tensor from dtype float64 to the layer's dtype of float32, which is new behavior in TensorFlow 2.  The layer has dtype float32 because it's dtype defaults to floatx.





In [11]:
keras.backend.clear_session()
backbone = ResNet()
classifier = Classifier(backbone,10)
_ = classifier.predict(data.get_test()[0])
classifier.load_weights('classifier_weights.h5')
backbone.load_weights('backbone_posttrained_weights.npy')
classifier_predicitons_2 = classifier.predict(data.get_test()[0])

keras.backend.clear_session()

In [12]:
keras.backend.clear_session()
backbone = ResNet()
backbone(data.get_test()[0])
generator = ResGen(backbone)
generator.load_weights('generator_weights.npy')
backbone.load_weights('backbone_posttrained_weights.npy')
generator_predicitons_2 = generator(random_noise_data)

keras.backend.clear_session()

W0417 12:14:05.293900 4551929280 base_layer.py:1790] Layer ResNet is casting an input tensor from dtype float64 to the layer's dtype of float32, which is new behavior in TensorFlow 2.  The layer has dtype float32 because it's dtype defaults to floatx.



W0417 12:14:16.730381 4551929280 base_layer.py:1790] Layer ResGen is casting an input tensor from dtype float64 to the layer's dtype of float32, which is new behavior in TensorFlow 2.  The layer has dtype float32 because it's dtype defaults to floatx.



## Compare with previous predictions

In [13]:
discriminator_diff = discriminator_predicitons_1 - discriminator_predicitons_2
classifier_diff = classifier_predicitons_1 - classifier_predicitons_2
generator_diff = generator_predicitons_1 - generator_predicitons_2

Difference between the generator's outputs with the untrained backbone and with the trained backbone.

This should be non-zero.

In [14]:
np.sum(generator_diff)

-2747.7236

Difference between the classifier's predictions straight after training and its predictions after the saved weights are reloaded.

This should be zero.

In [15]:
np.sum(classifier_diff)

0.0

Difference between the discriminator's outputs with the untrained backbone and with the trained backbone.

This should be non-zero.

In [16]:
np.sum(discriminator_diff)

-522.9665
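The checks above sum signed differences, so positive and negative changes can cancel each other. A possible complement, sketched here with toy arrays and not with the notebook's predictions, is to look at the largest absolute difference. `max_abs_diff` is a hypothetical helper name.

```python
import numpy as np


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two prediction arrays."""
    return float(np.max(np.abs(np.asarray(a) - np.asarray(b))))


# Toy arrays: the element-wise differences are -0.5 and +0.5.
before = np.array([[0.25], [0.75]], dtype=np.float32)
after = np.array([[0.75], [0.25]], dtype=np.float32)

signed_sum = float(np.sum(before - after))  # the two differences cancel
largest_gap = max_abs_diff(before, after)   # still reports the change

print(signed_sum, largest_gap)
```

A zero signed sum is therefore weaker evidence of identical predictions than a zero maximum absolute difference.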

## Test whether the backbone and other submodels are also loaded correctly with load_weights

In [17]:
keras.backend.clear_session()
backbone = ResNet()
classifier = Classifier(backbone,10)
_ = classifier.predict(data.get_test()[0])
classifier.load_weights('classifier_weights.h5')
classifier_predicitons_3 = classifier.predict(data.get_test()[0])

keras.backend.clear_session()

This should be zero if `classifier.load_weights()` also restores the weights of the submodels, including the backbone.

In [18]:
np.sum(classifier_predicitons_2 - classifier_predicitons_3)

0.0
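Matching predictions are indirect evidence that the submodel weights were restored. A more direct check, sketched here on the assumption that the weights are read with `get_weights()` (which returns a list of NumPy arrays), compares the arrays one by one. `weights_equal` is a hypothetical helper and the weight lists are toy data.

```python
import numpy as np


def weights_equal(weights_a, weights_b):
    """True if two lists of arrays (e.g. from get_weights()) match exactly."""
    if len(weights_a) != len(weights_b):
        return False
    return all(np.array_equal(a, b) for a, b in zip(weights_a, weights_b))


# Toy weight lists standing in for backbone.get_weights() at different points.
saved = [np.ones((3, 2), dtype=np.float32), np.zeros(2, dtype=np.float32)]
reloaded = [w.copy() for w in saved]
reinitialised = [w + 0.1 for w in saved]

same = weights_equal(saved, reloaded)
different = weights_equal(saved, reinitialised)
print(same, different)
```

In the notebook this could be applied to the backbone's weights captured after training and again after `classifier.load_weights()`.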

## Changing backbone weights after the model is loaded

Loading different backbone weights after the classifier weights have been loaded should change the classifier predictions.

In [23]:
keras.backend.clear_session()
backbone = ResNet()
# backbone.load_weights('backbone_pretrained_weights.npy')
classifier = Classifier(backbone,10)
_ = classifier.predict(data.get_test()[0])
classifier.load_weights('classifier_weights.h5')
backbone.load_weights('backbone_pretrained_weights.npy')
classifier_predicitons_4 = classifier.predict(data.get_test()[0])

keras.backend.clear_session()

In [24]:
np.sum(classifier_predicitons_3 - classifier_predicitons_4)

3.4868717e-06
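A value this close to zero does not by itself show that the predictions stayed the same. If the classifier ends in a softmax layer (an assumption; the model definition is not shown here), every row of predictions sums to 1, so the signed sum of any two prediction arrays is close to zero even when the predictions differ. The sketch below illustrates this with random toy data and a hand-written `softmax`.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)
# Two unrelated sets of toy predictions for 1000 samples and 10 classes.
probs_a = softmax(rng.normal(size=(1000, 10)))
probs_b = softmax(rng.normal(size=(1000, 10)))

signed_sum = np.sum(probs_a - probs_b)              # close to zero regardless
mean_abs_diff = np.mean(np.abs(probs_a - probs_b))  # clearly non-zero
changed_labels = np.mean(probs_a.argmax(axis=1) != probs_b.argmax(axis=1))

print(signed_sum, mean_abs_diff, changed_labels)
```

Under that assumption, the mean absolute difference or the fraction of changed predicted labels is a more informative comparison for this cell.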