# Transfer learning with Keras using ResNet50

**Training a convolutional neural network to classify the CIFAR-10 dataset**

**Abstract**

This tutorial is a guide to transfer learning: the main aspects to take into account, some tips, and an example implementation in Keras that uses ResNet50 as the pretrained model. The task is to transfer what a ResNet50 trained on ImageNet has learned to a model that classifies images from the CIFAR-10 dataset. We tested several methods to improve accuracy and describe them to show the variety of options available for training. The final model in this post reaches an accuracy of 94% on the test set.

**Introduction**

Learning something new takes time and practice, but we find it easy to do tasks similar to ones we already know. This is thanks to the association involved in human learning: we can identify patterns from previous knowledge and apply them to new learning.

When we meet a person who is faster or better than us at something, such as a video game or coding, it is almost certain that they have done it before or can draw on a similar previous activity.

If we know how to ride a bike, we don’t need to learn from zero how to ride a motorbike. If we know how to play football, we don’t need to learn from zero how to play futsal. If we know how to play the piano, we don’t need to learn from zero how to play another instrument.

The same applies to machines: if we train a model on one dataset, we do not need to retrain the whole model from scratch to adapt it to a new, similar dataset. Both ImageNet and CIFAR-10 contain images that can be used to train an image classifier. It is therefore very promising to save training time (which can be really long) by starting from the weights of a previously trained model. We will go through this concept of transfer learning with everything you need to build a model of your own.

**Materials and Methods**

*Setting our environment*

We are going to use Keras, an open source neural network library written in Python. We run it on top of TensorFlow in Google Colab, a Jupyter notebook environment that runs in the cloud.

The first step is to import the required libraries with the code below. Running TensorFlow 1.x is optional; without that first line, Colab uses its latest TensorFlow version. We also use numpy and a TensorFlow function, but depending on how you build your own model you may not need to import them.


NOTE: To avoid the error "AttributeError: 'str' object has no attribute 'decode'", I downgraded the h5py package with the command shown in the cells below (`pip install 'h5py==2.10.0' --force-reinstall`), then restarted the IPython kernel, and it worked.
[SO: Post 53740577](https://stackoverflow.com/questions/53740577/does-any-one-got-attributeerror-str-object-has-no-attribute-decode-whi)

In [1]:
#from google.colab import drive
#drive.mount('/content/drive')
path_output = "./data/11-08-2021/"

In [2]:
#!pip install 'h5py==2.10.0' --force-reinstall

In [3]:
#!pip uninstall tensorflow-gpu -y
#!pip uninstall tensorflow -y

In [4]:
#!conda install tensorflow-gpu -y
#!pip install tensorflow
#!conda update keras

In [1]:
import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
from tensorflow.python.client import device_lib 
print(device_lib.list_local_devices())

Num GPUs Available:  1
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 11713273532148542727
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 10562532800
locality {
  bus_id: 2
  numa_node: 1
  links {
  }
}
incarnation: 1882367395494068085
physical_device_desc: "device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:af:00.0, compute capability: 7.5"
]


In [2]:
import tensorflow as tf
tf.test.is_built_with_cuda()

True

In [3]:
tf.__version__

'2.4.1'

In [4]:
#!pip install tensorflow
!pip install onnx



In [4]:
#% tensorflow_version 1.x
import tensorflow.keras as K
import tensorflow as tf

#physical_devices = tf.config.experimental.list_physical_devices('GPU')
#assert len(physical_devices) > 0, "Not enough GPU hardware devices available"
#config = tf.config.experimental.set_memory_growth(physical_devices[0], True)

Training a model uses a lot of resources, so we recommend using a GPU runtime in Colab. This speeds up the process and allows more testing. We will talk about some other ways to improve computation soon.

**Database**

CIFAR-10 is a dataset of 60,000 32x32 colour images grouped into 10 classes, with 6,000 images per class. It is split into 50,000 training images and 10,000 test images.
The categories are airplane, automobile, bird, cat, deer, dog, frog, horse, ship and truck. We can take advantage of the fact that these categories, and many more, are also present in the ImageNet collection.
To load the dataset with Keras, we use:



```
tf.keras.datasets.cifar10.load_data()
```
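
The integer labels returned by `load_data()` index the ten classes in a fixed order. A minimal sketch of a lookup helper follows; the names `CIFAR10_CLASSES` and `label_to_name` are hypothetical and introduced only for illustration.

```python
# CIFAR-10 class names in label order (index 0 to 9).
CIFAR10_CLASSES = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]

def label_to_name(label):
    """Map an integer CIFAR-10 label (0-9) to its class name."""
    return CIFAR10_CLASSES[int(label)]

print(label_to_name(3))
```

With the arrays loaded by Keras, a single label can be decoded with something like `label_to_name(y_train[0][0])`, as long as this is done before the labels are one-hot encoded.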


**Preprocess**


Now that the data is loaded, we build a preprocessing function for it. X is a numpy array of shape (m, 32, 32, 3), where m is the number of images, 32 and 32 are the height and width, and 3 is the number of colour channels (RGB).

We have one set of X for training and one for validation. Y is a numpy array of integer labels, which `load_data()` returns with shape (m, 1). Since we work with 10 categories, we one-hot encode the labels with a Keras function, which turns Y into an array of shape (m, 10). The same applies to the validation set.

As we said before, we are going to use ResNet50, but many other models are available with pretrained weights, such as VGG16, ResNet101, InceptionV3 and DenseNet121. Each one has its own preprocessing function for the inputs.


In [5]:
def preprocess_data(X, Y):
    """
    Preprocesses the CIFAR 10 images and labels so they can be fed to a
    ResNet50-based model
    :param X: X is a numpy.ndarray of shape (m, 32, 32, 3) containing the
    CIFAR 10 data, where m is the number of data points
    :param Y: Y is a numpy.ndarray of shape (m,) containing the CIFAR 10
    labels for X
    :return: X_p, Y_p
        X_p is a numpy.ndarray containing the preprocessed X
        Y_p is a numpy.ndarray containing the preprocessed Y
    """
    X_p = K.applications.resnet50.preprocess_input(X)
    Y_p = K.utils.to_categorical(Y, 10)
    return X_p, Y_p
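
To make the two calls inside `preprocess_data` less opaque, here is a NumPy-only sketch of what they compute. It assumes the "caffe"-style preprocessing that Keras applies for ResNet50 (RGB to BGR, then subtraction of the ImageNet per-channel means, with no scaling). `resnet50_preprocess_sketch` and `one_hot_sketch` are hypothetical names, not Keras functions.

```python
import numpy as np

# ImageNet per-channel means in BGR order, as used by caffe-style preprocessing.
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def resnet50_preprocess_sketch(x):
    """Approximate resnet50.preprocess_input for channels-last RGB images."""
    x = x.astype(np.float32)
    x = x[..., ::-1]               # RGB -> BGR
    return x - IMAGENET_BGR_MEAN   # zero-center each channel, no scaling

def one_hot_sketch(y, num_classes=10):
    """Approximate to_categorical: (m,) or (m, 1) integer labels -> (m, num_classes)."""
    y = np.asarray(y, dtype=np.int64).reshape(-1)
    return np.eye(num_classes, dtype=np.float32)[y]

# Random stand-in data with the same layout as CIFAR-10.
X = np.random.randint(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)
Y = np.array([[3], [0], [9], [3]])
X_p, Y_p = resnet50_preprocess_sketch(X), one_hot_sketch(Y)
print(X_p.shape, Y_p.shape)
```

The image shape is unchanged by preprocessing, while the labels gain a class dimension.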

Next, we are going to call our function with the parameters loaded from the CIFAR10 database. It’s important to get to know your data to monitor the steps and know how to build your model. Let’s print the shapes of our x_train and y_train before and after the preprocessing.

In [6]:
(x_train, y_train), (x_test, y_test) = K.datasets.cifar10.load_data()
print((x_train.shape, y_train.shape))
x_train, y_train = preprocess_data(x_train, y_train)
x_test, y_test = preprocess_data(x_test, y_test)
print((x_train.shape, y_train.shape))

((50000, 32, 32, 3), (50000, 1))
((50000, 32, 32, 3), (50000, 10))


**Using weights of a trained neural network**

A pretrained model from Keras Applications has the advantage of letting you use weights that are already calibrated to make predictions. In this case, the network is a ResNet50 and the weights come from ImageNet. The option include_top=False enables feature extraction by removing the final dense layers, which lets us control the input and output of the model.

In [7]:
#input_t = K.Input(shape=(32, 32, 3))
input_t = K.Input(shape=(224, 224, 3))
res_model = K.applications.ResNet50(include_top=False,
                                        weights="imagenet",
                                        input_tensor=input_t)
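
The input size matters because ResNet50 shrinks the spatial resolution by a factor of 32 before its last feature map. The sketch below tracks the side length through the stem and the downsampling stages. It assumes the usual ResNet50 v1 arrangement (a 7x7 stride-2 convolution with padding 3, a 3x3 stride-2 max pool with padding 1, then three stride-2 1x1 convolutions); the function names are hypothetical.

```python
def conv_out(size, kernel, stride, pad):
    """Output length of a convolution or pooling op along one spatial axis."""
    return (size + 2 * pad - kernel) // stride + 1

def resnet50_feature_size(size):
    """Side length of the last ResNet50 feature map for a square input."""
    size = conv_out(size, kernel=7, stride=2, pad=3)   # conv1
    size = conv_out(size, kernel=3, stride=2, pad=1)   # pool1
    for _ in range(3):                                 # conv3, conv4, conv5 downsample
        size = conv_out(size, kernel=1, stride=2, pad=0)
    return size

for side in (32, 224):
    out = resnet50_feature_size(side)
    print(side, "->", (out, out, 2048))
```

Under these assumptions a raw 32x32 CIFAR-10 image leaves only a 1x1 feature map, while a 224x224 input leaves 7x7, which is one reason to feed the network images at 224x224.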

In [9]:
#model = tf.keras.models.load_model('./data/11-08-2021/restored_keras_imagenet_resnet50.h5')

From this point on, it all comes down to testing and a bit of creativity. The starting point is very advantageous, since the weights already work for image classification, but because we are using them on a new dataset, adjustments are needed. Our objective is to build a model that classifies with high accuracy: if an image of a dog is presented, the model should identify it as a dog and not as a train, for example.

Let’s say we want to achieve an accuracy of more than 88% on the training data without overfitting. How do we get there? This is the point where different models may diverge from one another, and where we test which tools serve that objective. The important thing here is to learn about transfer learning and how to build robust models. We follow one example, but other approaches are possible and we discuss them as well.
The two approaches you can take in transfer learning are:

*   Feature extraction
*   Fine tuning


This refers to how you use the layers of the pretrained model. ResNet50 has a very large number of parameters because of its depth, but its weights are already calibrated. We can choose to ‘freeze’ those layers (as many as you can) so that their values do not change, which saves time and computational cost. However, since the dataset is quite different, training the whole model is not a bad idea either.

In this case, we ‘freeze’ all layers except for the last block of the ResNet50. The way to do this in Keras is with:

In [8]:
from tensorflow import keras
#res_model = tf.keras.models.load_model('./data/11-08-2021/restored_keras_imagenet_resnet50.h5')

In [11]:
#!pip3 install keras

In [9]:
from tensorflow.keras.models import load_model
import tensorflow as tf

In [13]:
tf.__version__

'2.4.1'

In [10]:
import time
import os
import copy
import csv
import pandas as pd
from datetime import datetime

In [11]:
for layer in res_model.layers[:143]:
    layer.trainable = False
# Check that the layers were frozen correctly
for i, layer in enumerate(res_model.layers):
    print(i, layer.name, "-", layer.trainable)
#to_res = (224, 224)

0 input_1 - False
1 conv1_pad - False
2 conv1_conv - False
3 conv1_bn - False
4 conv1_relu - False
5 pool1_pad - False
6 pool1_pool - False
7 conv2_block1_1_conv - False
8 conv2_block1_1_bn - False
9 conv2_block1_1_relu - False
10 conv2_block1_2_conv - False
11 conv2_block1_2_bn - False
12 conv2_block1_2_relu - False
13 conv2_block1_0_conv - False
14 conv2_block1_3_conv - False
15 conv2_block1_0_bn - False
16 conv2_block1_3_bn - False
17 conv2_block1_add - False
18 conv2_block1_out - False
19 conv2_block2_1_conv - False
20 conv2_block2_1_bn - False
21 conv2_block2_1_relu - False
22 conv2_block2_2_conv - False
23 conv2_block2_2_bn - False
24 conv2_block2_2_relu - False
25 conv2_block2_3_conv - False
26 conv2_block2_3_bn - False
27 conv2_block2_add - False
28 conv2_block2_out - False
29 conv2_block3_1_conv - False
30 conv2_block3_1_bn - False
31 conv2_block3_1_relu - False
32 conv2_block3_2_conv - False
33 conv2_block3_2_bn - False
34 conv2_block3_2_relu - False
35 conv2_block3_3_conv - 

In [12]:
len(res_model.layers)

175
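
Freezing by index works, but the cut-off (143) is easy to get wrong if the layer list changes. A name-based alternative is sketched below. It assumes that the layers of the last ResNet50 stage carry the prefix `conv5_`, following the naming pattern in the listing above. The helper `freeze_except_prefix` is hypothetical and works on any objects that have `name` and `trainable` attributes.

```python
from types import SimpleNamespace

def freeze_except_prefix(layers, trainable_prefix="conv5_"):
    """Freeze every layer whose name does not start with trainable_prefix.

    Returns the number of layers left trainable.
    """
    n_trainable = 0
    for layer in layers:
        layer.trainable = layer.name.startswith(trainable_prefix)
        n_trainable += int(layer.trainable)
    return n_trainable

# Stand-in objects for demonstration; with Keras, pass res_model.layers instead.
demo_layers = [
    SimpleNamespace(name=n, trainable=True)
    for n in ("conv1_conv", "conv4_block6_out", "conv5_block1_1_conv", "conv5_block3_out")
]
print(freeze_except_prefix(demo_layers))
```

Printing `layer.name` and `layer.trainable` afterwards, as in the cell above, remains a good way to confirm the result.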

Next, we need to connect the pretrained model to the new layers of our model. We can use a global pooling layer or a flatten layer to match the output dimensions of the pretrained layers to the new layers. With just a flatten layer and a dense layer with softmax, we can close the model and start classifying.



```
model = K.models.Sequential()
model.add(res_model)
model.add(K.layers.Flatten())
model.add(K.layers.Dense(10, activation='softmax'))
```
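
The choice between a flatten layer and global pooling changes how many weights the first dense layer needs. The sketch below counts them for a 10-unit output layer, assuming a 7x7x2048 feature map (the ResNet50 output for a 224x224 input); `dense_params` is a hypothetical helper.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units

feature_map = (7, 7, 2048)   # assumed ResNet50 output for a 224x224 input
flat_features = feature_map[0] * feature_map[1] * feature_map[2]
pooled_features = feature_map[2]   # global average pooling keeps one value per channel

flatten_head = dense_params(flat_features, 10)
pooled_head = dense_params(pooled_features, 10)
print(flatten_head, pooled_head)
```

Flattening keeps the spatial layout at the cost of a much larger dense layer; global pooling gives a far smaller head but averages the spatial information away.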

The final layers we used are shown below. After them, we explain the main aspects we took into account when building the model to improve it and get a good classification.


In [13]:
model = K.models.Sequential()
model.add(K.layers.Lambda(lambda image: tf.image.resize(image, (224, 224))))
model.add(res_model)
model.add(K.layers.Flatten())
model.add(K.layers.BatchNormalization())
model.add(K.layers.Dense(256, activation='relu'))
model.add(K.layers.Dropout(0.5))
model.add(K.layers.BatchNormalization())
model.add(K.layers.Dense(128, activation='relu'))
model.add(K.layers.Dropout(0.5))
model.add(K.layers.BatchNormalization())
model.add(K.layers.Dense(64, activation='relu'))
model.add(K.layers.Dropout(0.5))
model.add(K.layers.BatchNormalization())
model.add(K.layers.Dense(10, activation='softmax'))

In [14]:
model = K.models.Sequential()
model.add(K.layers.Lambda(lambda image: tf.image.resize(image, (224, 224))))
model.add(res_model)
model.add(K.layers.Flatten())
model.add(K.layers.BatchNormalization())
model.add(K.layers.Dense(256, tf.keras.layers.Activation('relu')))
model.add(K.layers.Dropout(0.5))
model.add(K.layers.BatchNormalization())
model.add(K.layers.Dense(128, tf.keras.layers.Activation('relu')))
model.add(K.layers.Dropout(0.5))
model.add(K.layers.BatchNormalization())
model.add(K.layers.Dense(64, tf.keras.layers.Activation('relu')))
model.add(K.layers.Dropout(0.5))
model.add(K.layers.BatchNormalization())
model.add(K.layers.Dense(10, tf.keras.layers.Activation('softmax')))

We have regularizers to help us avoid overfitting and optimizers to get a faster result. Each of them can also affect our accuracy, so we present what to take into account. The most important are:



*   Batch size: Powers of 2 (8, 16, 32, 64, 128, …) are commonly recommended because they tend to align well with how memory is organized on the hardware.
*   Learning rate: A very low learning rate is recommended for transfer learning because we do not want to change too much what was previously learned.
*   Number of layers: This depends on how much you rely on the layers of the pretrained model. We found that if the whole model is left trainable, a flatten layer and a dense layer with softmax are enough, but once we froze layers for feature extraction, more layers were required at the end.
*   Optimization methods: We tested SGD and RMSprop. SGD with a very low learning rate required more epochs (30) to complete a reasonable training. We used RMSprop with 5 epochs to get our result.
*   Regularization methods: To avoid overfitting we used batch normalization and dropout between the dense layers.
*   Callbacks: In Keras, we can use callbacks to perform certain actions during training, such as saving weights.
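
Batch size also determines how many weight updates happen per epoch, which matters when combined with a very low learning rate. A small sketch of that arithmetic for the 50,000 CIFAR-10 training images follows; `steps_per_epoch` is a hypothetical helper.

```python
import math

def steps_per_epoch(n_samples, batch_size):
    """Number of weight updates in one pass over the data (the last batch may be partial)."""
    return math.ceil(n_samples / batch_size)

n_train = 50000   # CIFAR-10 training images
for bs in (16, 32, 64, 128):
    print(bs, steps_per_epoch(n_train, bs))
```

Halving the batch size roughly doubles the number of updates per epoch, so the batch size and the number of epochs are best tuned together.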


In [15]:
date = datetime.today().strftime('%Y-%m-%d-%H:%M:%S')
check_point = K.callbacks.ModelCheckpoint(filepath="./data/19-08-2021/resnet50-cifar10_{}.h5".format(date),
                                              monitor="val_accuracy",
                                              mode="max",
                                              save_best_only=True,
                                              )

In [16]:
model.compile(loss='categorical_crossentropy',
                  optimizer=K.optimizers.RMSprop(learning_rate=2e-5),
                  metrics=['accuracy'])

In [17]:
batch_size = 32
epochs=50

In [22]:
#physical_devices = tf.config.experimental.list_physical_devices('GPU')
#assert len(physical_devices) > 0, "Not enough GPU hardware devices available"
#config = tf.config.experimental.set_memory_growth(physical_devices[0], True)

In [18]:
history = model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, verbose=1,
                        validation_data=(x_test, y_test),
                        callbacks=[check_point])

Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50
Epoch 33/50
Epoch 34/50
Epoch 35/50
Epoch 36/50
Epoch 37/50


Epoch 38/50
Epoch 39/50
Epoch 40/50
Epoch 41/50
Epoch 42/50
Epoch 43/50
Epoch 44/50
Epoch 45/50
Epoch 46/50
Epoch 47/50
Epoch 48/50
Epoch 49/50
Epoch 50/50


In [19]:
def export_history_csv(history_, model_name):
  """Write the per-epoch metrics of a Keras History object to a CSV file."""
  date = datetime.today().strftime('%Y-%m-%d-%H:%M:%S')
  file_name = './data/19-08-2021/tf_exp_train_{}_{}.csv'.format(model_name, date)
  # The context manager closes the file even if writing a row fails
  with open(file_name, mode='w+', newline='', encoding='utf-8') as data_file:
    data_writer = csv.writer(data_file, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    data_writer.writerow(['Model','type', 'Dataset', 'Epoch', 'criterion', 'optimizer', 'scheduler','Train_loss', 'Train_acc', "val_loss", "Val_acc", 'time','Elapse_time','date'])
    # One row per epoch
    for epoch_ in history_.epoch:
      data_writer.writerow([history_.model,'tensorflow', 'cifar10', epoch_, '',
                            history_.model.optimizer, '',history_.history['loss'][epoch_], history_.history['accuracy'][epoch_],
                            history_.history['val_loss'][epoch_], history_.history['val_accuracy'][epoch_], '','',date])


In [20]:
#history.history
model_name = 'resnet50-cifar10'

In [21]:
export_history_csv(history, model_name)
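
Besides exporting the history, it is often useful to know which epoch gave the best validation accuracy. A minimal sketch follows, assuming the same `val_accuracy` key used in the export function above; `best_epoch` is a hypothetical helper and the numbers are made up for illustration.

```python
def best_epoch(history_dict, metric="val_accuracy"):
    """Return (epoch_index, value) for the highest recorded value of metric."""
    values = history_dict[metric]
    idx = max(range(len(values)), key=values.__getitem__)
    return idx, values[idx]

# Made-up numbers for illustration only; with Keras, pass history.history.
fake_history = {
    "accuracy": [0.70, 0.85, 0.93],
    "val_accuracy": [0.72, 0.88, 0.86],
}
print(best_epoch(fake_history))
```

A training accuracy that keeps rising while the validation accuracy drops, as in the last made-up epoch, is the usual sign of overfitting.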

In [None]:
date = datetime.today().strftime('%Y-%m-%d-%H:%M:%S')
model.save("./data/19-08-2021/tf_resnet_cifar10_{}.h5".format(date))
model.summary()

In [14]:
# serialize model to JSON
model_name = 'resnet50'
model_type = 'keras'
def save_keras(model, model_type='direct'):
  model_json = model.to_json()
  with open("k_model_{}_{}.json".format(model_name, model_type), "w") as json_file:
    json_file.write(model_json)
  # serialize weights to HDF5
  model.save_weights("k_model_{}_{}.h5".format(model_name, model_type))
  print("Saved model to disk")

In [39]:
save_keras(model, 'trained_')

Saved model to disk


In [18]:
!pip install -U tf2onnx

Collecting tf2onnx
  Using cached tf2onnx-1.9.1-py3-none-any.whl (398 kB)
Collecting flatbuffers~=1.12
  Using cached flatbuffers-1.12-py2.py3-none-any.whl (15 kB)
Installing collected packages: flatbuffers, tf2onnx
  Attempting uninstall: flatbuffers
    Found existing installation: flatbuffers 2.0
    Uninstalling flatbuffers-2.0:
      Successfully uninstalled flatbuffers-2.0
Successfully installed flatbuffers-1.12 tf2onnx-1.9.1


In [12]:
!pip install onnx==1.8.1
!pip install onnx_tf
!pip install onnx_pytorch
!pip install pytorch2keras
#%tensorflow_version 1.x
from tensorflow import keras
import tensorflow as tf
print(tf.__version__)

Collecting onnxruntime
  Using cached onnxruntime-1.8.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.5 MB)
Installing collected packages: onnxruntime
Successfully installed onnxruntime-1.8.1


2.2.0


In [13]:
#Import needed packages
from __future__ import print_function, division
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
#from onnx_tf.backend import prepare
from torch.optim import lr_scheduler
import numpy as np
import torchvision
from torchvision import datasets, models, transforms
from torch.autograd import Variable
#from pytorch2keras.converter import pytorch_to_keras
import matplotlib.pyplot as plt
import time
import os
import copy
import csv
import pandas as pd
from datetime import datetime

In [14]:
from __future__ import print_function
from __future__ import division
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
import torchvision
from torchvision import datasets, models, transforms
import matplotlib.pyplot as plt
import onnx

In [16]:
import tf2onnx
model = tf.keras.models.load_model('./data/resnet_cifar10-v2.h5')
model_proto, external_tensor_storage = tf2onnx.convert.from_keras(model,
                input_signature=None, opset=None, custom_ops=None,
                custom_op_handlers=None, custom_rewriter=None,
                inputs_as_nchw=None, extra_opset=None, shape_override=None,
                 target=None, large_model=False, output_path='keras-{}.onnx'.format(model_name))

ValueError: Unknown activation function: Activation

In [22]:
onnx_model_keras = onnx.load('keras-{}.onnx'.format(model_name))
onnx.checker.check_model(onnx_model_keras)

In [24]:
!pip uninstall onnxruntime -y

Found existing installation: onnxruntime 1.8.1
Uninstalling onnxruntime-1.8.1:
  Successfully uninstalled onnxruntime-1.8.1


In [25]:
!pip install onnxruntime-gpu

Collecting onnxruntime-gpu
  Downloading onnxruntime_gpu-1.8.1-cp38-cp38-manylinux2014_x86_64.whl (31.3 MB)
Installing collected packages: onnxruntime-gpu
Successfully installed onnxruntime-gpu-1.8.1


In [27]:
import timeit
import onnxruntime as ort  # the onnxruntime-gpu package is imported as `onnxruntime`
print(ort.get_device())

sess_options = ort.SessionOptions()
session = ort.InferenceSession(onnx_model_keras.SerializeToString(), sess_options)
# Assumption: time inference on a small batch of the preprocessed CIFAR-10 test images
test_data = x_test[:32].astype("float32")
onnx_time = timeit.timeit("session.run( [session.get_outputs()[0].name], {session.get_inputs()[0].name: test_data} )", number=7, setup="from __main__ import session, test_data")
print("Keras->ONNX (GPU): {}".format(onnx_time))

In [None]:
import onnxruntime

ort_session = onnxruntime.InferenceSession('keras-{}.onnx'.format(model_name))
# compute ONNX Runtime output prediction
ort_inputs = {ort_session.get_inputs()[0].name: x_test}
ort_outs = ort_session.run(None, ort_inputs)

In [47]:
try:
    # Run the prediction on the first GPU
    with tf.device('/device:GPU:0'):
        k_predict = model.predict(x_test)
except RuntimeError as e:
    print(e)

In [48]:
k_predict

array([[7.99335539e-05, 1.47037703e-04, 1.41063472e-04, ...,
        1.54273861e-04, 1.37313955e-05, 9.70138281e-05],
       [5.53062091e-05, 1.23364716e-05, 1.70311505e-05, ...,
        2.31776539e-05, 9.99850631e-01, 6.55657595e-06],
       [4.76072673e-05, 2.57226147e-05, 2.46316613e-05, ...,
        3.05302528e-05, 9.99828339e-01, 9.95226219e-06],
       ...,
       [7.68784012e-05, 4.31562694e-05, 1.49180996e-04, ...,
        5.48096199e-04, 3.66769673e-04, 5.86517308e-05],
       [8.87418792e-05, 9.99115407e-01, 1.22546626e-04, ...,
        5.05060525e-05, 4.29669053e-05, 1.53407236e-04],
       [5.24530733e-05, 4.97290603e-05, 8.08762779e-05, ...,
        9.98181343e-01, 1.90178092e-04, 1.80298433e-04]], dtype=float32)
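
Each row of `k_predict` holds the ten class probabilities for one test image. The sketch below turns such probabilities into a top-1 accuracy against one-hot labels; `top1_accuracy` is a hypothetical helper and the arrays are toy values, not results from this notebook.

```python
import numpy as np

def top1_accuracy(probs, one_hot_labels):
    """Fraction of rows where the most probable class matches the one-hot label."""
    predicted = np.argmax(probs, axis=1)
    expected = np.argmax(one_hot_labels, axis=1)
    return float(np.mean(predicted == expected))

# Toy example with two classes and three samples.
toy_probs = np.array([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]])
toy_labels = np.array([[0, 1], [1, 0], [1, 0]])
print(top1_accuracy(toy_probs, toy_labels))
```

With the notebook variables this would be called as `top1_accuracy(k_predict, y_test)`, since `y_test` was one-hot encoded during preprocessing. The same function can compare the Keras and ONNX Runtime predictions.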

In [58]:
!pip install onnx2pytorch

Collecting onnx2pytorch
  Downloading onnx2pytorch-0.3.0-py3-none-any.whl (29 kB)
Installing collected packages: onnx2pytorch
Successfully installed onnx2pytorch-0.3.0


In [60]:
#onnx_model = onnx.load(path_to_onnx_model)
from onnx2pytorch import ConvertModel
pytorch_model = ConvertModel(onnx_model_keras)

  layer.weight.data = torch.from_numpy(numpy_helper.to_array(weight))


In [61]:
pytorch_model

ConvertModel(
  (Upsample_Upsample__14:0): Upsample()
  (Conv_sequential_1/resnet50/conv1_bn/FusedBatchNormV3:0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3))
  (Relu_sequential_1/resnet50/conv1_relu/Relu:0): ReLU(inplace=True)
  (Pad_sequential_1/resnet50/pool1_pad/Pad:0): Pad()
  (MaxPool_sequential_1/resnet50/pool1_pool/MaxPool:0): MaxPool2d(kernel_size=(3, 3), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
  (Conv_sequential_1/resnet50/conv2_block1_1_bn/FusedBatchNormV3:0): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
  (Relu_sequential_1/resnet50/conv2_block1_1_relu/Relu:0): ReLU(inplace=True)
  (Conv_sequential_1/resnet50/conv2_block1_2_bn/FusedBatchNormV3:0): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (Relu_sequential_1/resnet50/conv2_block1_2_relu/Relu:0): ReLU(inplace=True)
  (Conv_sequential_1/resnet50/conv2_block1_3_bn/FusedBatchNormV3:0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1))
  (Conv_sequential_1/resnet5

In [49]:
!pip install -U git+https://github.com/Microsoft/MMdnn.git@master

Collecting git+https://github.com/Microsoft/MMdnn.git@master
  Cloning https://github.com/Microsoft/MMdnn.git (to revision master) to /tmp/pip-req-build-pv5j2iim
  Running command git clone -q https://github.com/Microsoft/MMdnn.git /tmp/pip-req-build-pv5j2iim
  Resolved https://github.com/Microsoft/MMdnn.git to commit 19562a381c27545984a216eda7591430e274e518
Building wheels for collected packages: mmdnn
  Building wheel for mmdnn (setup.py) ... done
  Created wheel for mmdnn: filename=mmdnn-0.3.1-py2.py3-none-any.whl size=319222 sha256=2dba62263c88e4c80f713448013b1552a122311c9873f1b830cc42d0cfba6bfd
  Stored in directory: /tmp/pip-ephem-wheel-cache-d93tqrfn/wheels/b5/9e/aa/a165e269d33fa3c6b45bd8f5577d9df11e6c785333cc476628
Successfully built mmdnn
Installing collected packages: mmdnn
Successfully installed mmdnn-0.3.1


In [None]:
!mmdownload -f keras -n resnet50 -o ./

Using TensorFlow backend.






2021-08-07 07:50:20.578531: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2021-08-07 07:50:20.583016: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2000194999 Hz
2021-08-07 07:50:20.583242: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55971ef72a00 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-08-07 07:50:20.583273: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2021-08-07 07:50:20.586469: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2021-08-07 07:50:20.762713: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUM

In [50]:
!pip install cntk

ERROR: Could not find a version that satisfies the requirement cntk (from versions: none)
ERROR: No matching distribution found for cntk


In [18]:
#!pip install tensorflow --upgrade --force-reinstall
!pip3 uninstall keras-nightly -y 
!pip3 uninstall -y tensorflow -y
!pip3 install keras==2.1.6
!pip3 install tensorflow==1.15.0
!pip3 install h5py==2.10.0
!pip install tensorflow-gpu==1.15


Found existing installation: keras-nightly 2.5.0.dev2021032900
Uninstalling keras-nightly-2.5.0.dev2021032900:
  Successfully uninstalled keras-nightly-2.5.0.dev2021032900
Collecting keras==2.1.6
  Using cached Keras-2.1.6-py2.py3-none-any.whl (339 kB)
Installing collected packages: keras
  Attempting uninstall: keras
    Found existing installation: keras 2.6.0
    Uninstalling keras-2.6.0:
      Successfully uninstalled keras-2.6.0
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
pytorch2keras 0.2.4 requires tensorflow, which is not installed.
Successfully installed keras-2.1.6
ERROR: Could not find a version that satisfies the requirement tensorflow==1.15.0 (from versions: 2.5.0rc0, 2.5.0rc1, 2.5.0rc2, 2.5.0rc3, 2.5.0, 2.5.1, 2.6.0rc0, 2.6.0rc1, 2.6.0rc2, 2.6.0)
ERROR: No matching distribution found for tensorflow==1.15.0
Collecting

In [None]:
!mmdownload -f keras -n resnet50 -o ./

/bin/bash: mmdownload: command not found


In [51]:
#!mmtoir -f keras -d imagenet_densenet.h5 -n imagenet_densenet.json -w imagenet_densenet.h5
!mmconvert -sf keras -iw './data/retrain_resnet50-cifar10-v2.h5' -df pytorch -om retrain_resnet50-cifar10.pb

Traceback (most recent call last):
  File "/store/travail/opmos/conda/envs/tf-gpu/bin/mmconvert", line 8, in <module>
    sys.exit(_main())
  File "/store/travail/opmos/conda/envs/tf-gpu/lib/python3.9/site-packages/mmdnn/conversion/_script/convert.py", line 102, in _main
    ret = convertToIR._convert(ir_args)
  File "/store/travail/opmos/conda/envs/tf-gpu/lib/python3.9/site-packages/mmdnn/conversion/_script/convertToIR.py", line 45, in _convert
    from mmdnn.conversion.keras.keras2_parser import Keras2Parser
  File "/store/travail/opmos/conda/envs/tf-gpu/lib/python3.9/site-packages/mmdnn/conversion/keras/keras2_parser.py", line 8, in <module>
    import keras as _keras
  File "/store/travail/opmos/conda/envs/tf-gpu/lib/python3.9/site-packages/keras/__init__.py", line 25, in <module>
    from keras import models
  File "/store/travail/opmos/conda/envs/tf-gpu/lib/python3.9/site-packages/keras/models.py", line 19, in <module>
    from keras import backend
  File "/store/tra

In [9]:
tf.keras.__version__

'2.3.0-tf'

In [None]:
import torch
import torchvision
import torchvision.transforms as transforms

In [None]:
transform_train = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
])

# Normalize the test set same as training set without augmentation
transform_test = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
])

In [None]:
trainset = torchvision.datasets.CIFAR10(
    root=opt.dataroot, train=True, download=True, transform=transform_train)
trainloader = torch.utils.data.DataLoader(
    trainset, batch_size=opt.batch_size_train, shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(
    root=opt.dataroot, train=False, download=True, transform=transform_test)
testloader = torch.utils.data.DataLoader(
    testset, batch_size=opt.batch_size_test, shuffle=False, num_workers=2)

NameError: ignored

In [None]:
def set_parameter_requires_grad(model, feature_extracting):
    if feature_extracting:
        for param in model.parameters():
            param.requires_grad = False

In [None]:
from google.colab import files
src = list(files.upload().values())[0]
open('/content/drive/MyDrive/Colab Notebooks/07-08-2021/imagenet_resnet50.py','wb').write(src)
import imagenet_resnet50

Saving imagenet_resnet50.py to imagenet_resnet50.py


In [62]:
import imp
import numpy as np
#MainModel = imp.load_source('MainModel', "/content/drive/MyDrive/Colab Notebooks/07-08-2021/imagenet_resnet50.py")
#resnet_pytorch_model = torch.load("/content/drive/MyDrive/Colab Notebooks/07-08-2021/imagenet_resnet50.pb")

In [63]:
transform_train = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
])

# Normalize the test set same as training set without augmentation
transform_test = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
])
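
For reference, here is a NumPy-only sketch of what the `ToTensor` and `Normalize` steps compute for a single image: scaling to [0, 1], moving channels first, then standardizing each channel with the mean and standard deviation values from the cell above. `to_tensor_and_normalize` is a hypothetical name, and the random-crop and flip augmentations are left out.

```python
import numpy as np

MEAN = np.array([0.4914, 0.4822, 0.4465], dtype=np.float32)
STD = np.array([0.2023, 0.1994, 0.2010], dtype=np.float32)

def to_tensor_and_normalize(img_hwc_uint8):
    """Mimic ToTensor followed by Normalize for one HWC uint8 image."""
    x = img_hwc_uint8.astype(np.float32) / 255.0   # scale to [0, 1]
    x = np.transpose(x, (2, 0, 1))                 # HWC -> CHW
    return (x - MEAN[:, None, None]) / STD[:, None, None]

img = np.full((32, 32, 3), 255, dtype=np.uint8)    # an all-white test image
out = to_tensor_and_normalize(img)
print(out.shape, out[:, 0, 0])
```

This differs from the Keras `preprocess_input` used earlier in both value range and channel layout, so a model converted from Keras may not receive inputs in the form it was trained on unless the preprocessing is matched.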

In [64]:
trainset = torchvision.datasets.CIFAR10(
    root='./', train=True, download=True, transform=transform_train)
trainloader = torch.utils.data.DataLoader(
    trainset, batch_size=32, shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(
    root='./', train=False, download=True, transform=transform_test)
testloader = torch.utils.data.DataLoader(
    testset, batch_size=32, shuffle=False, num_workers=2)

0.1%

Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to ./cifar-10-python.tar.gz


100.0%

Extracting ./cifar-10-python.tar.gz to ./
Files already downloaded and verified


In [65]:
resnet_pytorch_model = pytorch_model

In [66]:
import matplotlib.pyplot as plt
import time
import os
import copy
import csv
import pandas as pd
from datetime import datetime

In [67]:
dataloaders = {'train': trainloader, 'val':testloader}
dataset_sizes = {'train': len(trainloader.dataset), 'val':len(testloader.dataset) }
class_names = trainloader.dataset.classes

In [68]:
len(trainloader.dataset)

50000

In [76]:
def train_model(model, dataloaders, criterion, optimizer, scheduler, num_epochs=25, is_inception=False):
    since = time.time()
    date = datetime.today().strftime('%Y-%m-%d-%H:%M:%S')

    data_file = open('./data/experiment_train_{}.csv'.format(date), mode='w+', newline='', encoding='utf-8')
    data_writer = csv.writer(data_file, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    # One header column per value appended to `rows` in the epoch loop
    data_writer.writerow(['Model', 'type', 'Dataset', 'Epoch', 'criterion', 'optimizer', 'scheduler',
                          'Train_phase', 'Train_loss', 'Train_acc', 'Val_phase', 'Val_loss', 'Val_acc',
                          'time', 'Elapse_time'])

    val_acc_history = []
    best_model_wts = copy.deepcopy(model.state_dict())
    best_acc = 0.0
    
    for epoch in range(num_epochs):
        print('Epoch {}/{}'.format(epoch, num_epochs - 1))
        print('-' * 10)
        since_1 = time.time()

        # Each epoch has a training and a validation phase; one CSV row is collected per epoch.
        # Log class names rather than full object reprs to keep the CSV readable.
        rows = [type(model).__name__, 'pytorch', 'cifar10', '{}/{}'.format(epoch, num_epochs - 1),
                type(criterion).__name__, type(optimizer).__name__, type(scheduler).__name__]
        for phase in ['train', 'val']:
            if phase == 'train':
                model.train()  # Set model to training mode
            else:
                model.eval()   # Set model to evaluate mode

            running_loss = 0.0
            running_corrects = 0

            # Iterate over data. Tensors track gradients directly, so the
            # deprecated Variable wrapper is not needed.
            for inputs, labels in dataloaders[phase]:
                inputs = inputs.to(device)
                labels = labels.to(device)

                # zero the parameter gradients
                optimizer.zero_grad()

                # forward
                # track history only if in train
                with torch.set_grad_enabled(phase == 'train'):
                    # Get model outputs and calculate loss.
                    # Special case for Inception: in training it has an auxiliary output, so the
                    #   loss is the final-output loss plus a weighted auxiliary-output loss;
                    #   in testing only the final output is considered.
                    if is_inception and phase == 'train':
                        # From https://discuss.pytorch.org/t/how-to-optimize-inception-model-with-auxiliary-classifiers/7958
                        outputs, aux_outputs = model(inputs)
                        loss1 = criterion(outputs, labels)
                        loss2 = criterion(aux_outputs, labels)
                        loss = loss1 + 0.4*loss2
                    else:
                        outputs = model(inputs)
                        loss = criterion(outputs, labels)
                    _, preds = torch.max(outputs, 1)

                    # backward + optimize only if in training phase
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()

                # statistics
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)
            if phase == 'train':
                scheduler.step()

            epoch_loss = running_loss / dataset_sizes[phase]
            epoch_acc = running_corrects.double() / dataset_sizes[phase]

            print('{} Loss: {:.4f} Acc: {:.4f}'.format(
                phase, epoch_loss, epoch_acc))
            rows.append(phase)
            rows.append('Loss: {:.4f}'.format(epoch_loss))
            rows.append('Acc: {:.4f}'.format(epoch_acc))

            # deep copy the model
            if phase == 'val' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())
            if phase == 'val':
                val_acc_history.append(epoch_acc)
        time_elapsed_1 = time.time() - since_1
        print()
        rows.append(time.time())
        rows.append('{:.0f}m {:.0f}s'.format(time_elapsed_1 // 60, time_elapsed_1 % 60))
        data_writer.writerow(rows)

    time_elapsed = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(
        time_elapsed // 60, time_elapsed % 60))
    print('Best val Acc: {:.4f}'.format(best_acc))

    # load best model weights
    model.load_state_dict(best_model_wts)
    data_writer.writerow(['','', '', '', '', '', "", 'Best val Acc: {:.4f}'.format(best_acc), time.time(),'Training complete in {:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60),''])

    data_file.close()
    return model, val_acc_history
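One detail of `train_model` is easy to miss: `running_loss` accumulates `loss.item() * inputs.size(0)` rather than `loss.item()` alone. With the default `reduction='mean'`, `nn.CrossEntropyLoss` returns the mean over the batch, so multiplying by the batch size recovers the batch sum, and dividing by the dataset size at the end gives a true per-sample average even when the last batch is smaller. The plain-Python sketch below, with made-up loss values, shows how this differs from naively averaging the per-batch means. `epoch_average` is a hypothetical helper.

```python
def epoch_average(batch_means, batch_sizes):
    """Per-sample average from per-batch mean losses, weighted by batch size."""
    total = sum(mean * size for mean, size in zip(batch_means, batch_sizes))
    return total / sum(batch_sizes)


# Made-up numbers: two full batches and one smaller final batch with a higher loss.
batch_means = [0.5, 0.5, 2.0]
batch_sizes = [32, 32, 16]

weighted = epoch_average(batch_means, batch_sizes)
naive = sum(batch_means) / len(batch_means)

print("weighted:", weighted)
print("naive:   ", naive)
```

The naive mean over-weights the small final batch, which is why the loop tracks the weighted sum instead.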

In [77]:
# CIFAR-10 has 10 classes, so the final layer outputs len(class_names) logits
model.last_linear = nn.Sequential(
    nn.BatchNorm1d(2048),
    nn.Dropout(p=0.25),
    nn.Linear(in_features=2048, out_features=2048),
    nn.ReLU(),
    nn.BatchNorm1d(2048, eps=1e-05, momentum=0.1),
    nn.Dropout(p=0.5),
    nn.Linear(in_features=2048, out_features=len(class_names)),
)
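To get a feel for how much a classification head like this adds to the model, the sketch below counts its learnable parameters for a 10-class output such as CIFAR-10. It assumes the layer layout shown above (BatchNorm, Linear 2048 to 2048, BatchNorm, Linear 2048 to the number of classes). Dropout and ReLU contribute no parameters. The helper names are hypothetical.

```python
def linear_params(in_features: int, out_features: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features


def batchnorm1d_params(num_features: int) -> int:
    """Learnable scale and shift; running statistics are buffers, not parameters."""
    return 2 * num_features


def head_params(num_classes: int, width: int = 2048) -> int:
    """Total learnable parameters of the head layout sketched above."""
    return (batchnorm1d_params(width)
            + linear_params(width, width)
            + batchnorm1d_params(width)
            + linear_params(width, num_classes))


cifar10_head = head_params(10)
print(cifar10_head)
```

Almost all of the count comes from the hidden 2048 x 2048 layer, so a narrower hidden layer is the first thing to try if the head turns out to be too large.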

In [78]:
# Send the model to GPU
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

In [79]:

# Move the model to the same device that train_model sends the inputs to
resnet_pytorch_model = resnet_pytorch_model.to(device)

# Gather the parameters to be optimized/updated in this run. If we are
#  fine-tuning, we update all parameters. If we are using the
#  feature-extraction method, we only update the parameters that we
#  have just initialized, i.e. the parameters whose requires_grad
#  is True.
params_to_update = resnet_pytorch_model.parameters()
feature_extract = False
print("Params to learn:")
if feature_extract:
    params_to_update = []
    for name, param in resnet_pytorch_model.named_parameters():
        if param.requires_grad:
            params_to_update.append(param)
            print("\t", name)
else:
    for name, param in resnet_pytorch_model.named_parameters():
        if param.requires_grad:
            print("\t", name)

# Observe that all parameters are being optimized
optimizer_ft = optim.SGD(params_to_update, lr=0.001, momentum=0.9)

Params to learn:
	 Conv_sequential_1/resnet50/conv1_bn/FusedBatchNormV3:0.weight
	 Conv_sequential_1/resnet50/conv1_bn/FusedBatchNormV3:0.bias
	 Conv_sequential_1/resnet50/conv2_block1_1_bn/FusedBatchNormV3:0.weight
	 Conv_sequential_1/resnet50/conv2_block1_1_bn/FusedBatchNormV3:0.bias
	 Conv_sequential_1/resnet50/conv2_block1_2_bn/FusedBatchNormV3:0.weight
	 Conv_sequential_1/resnet50/conv2_block1_2_bn/FusedBatchNormV3:0.bias
	 Conv_sequential_1/resnet50/conv2_block1_3_bn/FusedBatchNormV3:0.weight
	 Conv_sequential_1/resnet50/conv2_block1_3_bn/FusedBatchNormV3:0.bias
	 Conv_sequential_1/resnet50/conv2_block1_0_bn/FusedBatchNormV3:0.weight
	 Conv_sequential_1/resnet50/conv2_block1_0_bn/FusedBatchNormV3:0.bias
	 Conv_sequential_1/resnet50/conv2_block2_1_bn/FusedBatchNormV3:0.weight
	 Conv_sequential_1/resnet50/conv2_block2_1_bn/FusedBatchNormV3:0.bias
	 Conv_sequential_1/resnet50/conv2_block2_2_bn/FusedBatchNormV3:0.weight
	 Conv_sequential_1/resnet50/conv2_block2_2_bn/FusedBatchNormV3:
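The cell above runs with `feature_extract = False`, so every parameter that requires gradients is handed to the optimizer. For the feature-extraction variant, the backbone has to be frozen first, otherwise the `requires_grad` filter selects everything. The sketch below shows one way to do that on a toy stand-in model. The `freeze_backbone` helper, the `head_prefix` argument, and the two-layer `toy` network are all hypothetical and are not the converted ResNet50 used in this notebook.

```python
import torch.nn as nn
import torch.optim as optim


def freeze_backbone(model: nn.Module, head_prefix: str) -> list:
    """Disable gradients for every parameter whose name does not start with head_prefix."""
    trainable = []
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith(head_prefix)
        if param.requires_grad:
            trainable.append(name)
    return trainable


# Hypothetical stand-in for a pretrained network: a "backbone" plus a new "head".
toy = nn.Sequential()
toy.add_module("backbone", nn.Linear(8, 4))
toy.add_module("head", nn.Linear(4, 10))

trainable_names = freeze_backbone(toy, head_prefix="head")
print("Params to learn:", trainable_names)

# Only the parameters that still require gradients go to the optimizer.
params_to_update = [p for p in toy.parameters() if p.requires_grad]
optimizer = optim.SGD(params_to_update, lr=0.001, momentum=0.9)
```

Freezing the backbone also reduces memory use during training, because no gradients are stored for the frozen parameters.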

In [82]:
num_epochs = 10
model_name = 'resnet50'
# Set up the loss function
criterion = nn.CrossEntropyLoss()
# Decay LR by a factor of 0.1 every 7 epochs
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)
# Train and evaluate
model_ft, hist = train_model(resnet_pytorch_model, dataloaders, criterion, optimizer_ft, exp_lr_scheduler, num_epochs=num_epochs, is_inception=(model_name=="inception"))

Epoch 0/9
----------


RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
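The run above stops with a CUDA out-of-memory error before the first epoch finishes. The message does not say which allocation failed, but a common first remedy is to lower the batch size. The sketch below shows a simple fallback pattern in plain Python: try a step at the current batch size and halve it whenever an out-of-memory `RuntimeError` is raised. Both `largest_batch_size_that_fits` and `fake_step` are hypothetical; `fake_step` merely simulates a GPU that cannot hold more than 8 samples.

```python
def largest_batch_size_that_fits(run_one_step, start=32, minimum=1):
    """Halve the batch size until run_one_step stops raising an out-of-memory error."""
    batch_size = start
    while batch_size >= minimum:
        try:
            run_one_step(batch_size)
            return batch_size
        except RuntimeError as err:
            if "out of memory" not in str(err):
                raise  # unrelated failure: do not hide it
            batch_size //= 2
    raise RuntimeError("no batch size >= {} fits in memory".format(minimum))


def fake_step(batch_size):
    # Hypothetical stand-in: pretends anything above 8 samples exhausts the GPU.
    if batch_size > 8:
        raise RuntimeError("CUDA error: out of memory")


chosen = largest_batch_size_that_fits(fake_step)
print("batch size that fits:", chosen)
```

In a real run, `run_one_step` would rebuild the `DataLoader` with the given batch size and push one batch through the forward and backward pass. Calling `torch.cuda.empty_cache()` between attempts, or restarting the runtime to release memory held by earlier cells, may also help.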

In [83]:
torch.save(model_ft.state_dict(), 'resnet_pytorch_model_trained.pb')

NameError: name 'model_ft' is not defined