# Introduction

In this notebook we analyze how much the initial weights (the weights a CNN starts training with) matter.

**Key question: Do the initial weights have a small or a large influence on the final classification performance of a CNN?**

To answer it, we train the same CNN several times, each time from different initial weights, until a given classification performance is reached. For each run we then observe:

- Do the training curves differ regarding their form?
- How long does it take to reach the final classification performance threshold?
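The second observation can be made concrete with a small helper. This is only a sketch: the function name `epochs_to_threshold` is hypothetical and not part of `cnn_toolbox`, it assumes that entry `i` of a curve is the classification rate after `i` training epochs, and the two curves are made-up illustrative values, not measured results.

```python
from typing import Optional, Sequence


def epochs_to_threshold(rates: Sequence[float], threshold: float) -> Optional[int]:
    """Return the index of the first entry in `rates` that reaches `threshold`.

    Assumption: rates[i] is the classification rate after i epochs of
    training (index 0 = before any training). Returns None if the
    threshold is never reached.
    """
    for epoch, rate in enumerate(rates):
        if rate >= threshold:
            return epoch
    return None


# Illustrative, made-up curves for two different seeds (NOT measured results).
curves = {
    "seed=1": [0.10, 0.35, 0.52, 0.61],
    "seed=2": [0.10, 0.22, 0.41, 0.58, 0.63],
}
results = {name: epochs_to_threshold(c, threshold=0.60) for name, c in curves.items()}
print(results)
```

Comparing these epoch counts across many seeds gives a simple number for how strongly the initial weights affect training time.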

# Datasets used for the experiments

The experiments need two image datasets:
- imagenette2: 10 very different object classes
- imagewoof  : 10 similar object classes (10 dog breeds)

Both are available at

[https://github.com/fastai/imagenette](https://github.com/fastai/imagenette)

They are much smaller subsets of the original ImageNet dataset with only 10 object classes each. I use the 320px versions of both.

You have to download the images manually before starting the experiments!

# Prepare a training and a test dataset

In [1]:
img_shape = (224,224,3)

import sys
sys.path.append("../../cnn_toolbox")
from cnn_toolbox import image_dataset

root_folder = "/media/juebrauer/Seagate Expansion Drive/datasets/01_images/18_imagenette2/320px/"
root_folder_train = root_folder + "train/"
root_folder_test  = root_folder + "val/"

ds_train = image_dataset(name="imagenette2-train",
                         root_folder=root_folder_train,
                         img_size=(img_shape[0],img_shape[1]))

ds_test = image_dataset(name="imagenette2-test",
                        root_folder=root_folder_test,
                        img_size=(img_shape[0],img_shape[1]))

Welcome to image_dataset class v1.0 by Juergen Brauer
Under root folder
	/media/juebrauer/Seagate Expansion Drive/datasets/01_images/18_imagenette2/320px/train/
I have found the following 10 subfolders/classes:

['cassette_player', 'chain_saw', 'church', 'dog_english_springer', 'fish_tench', 'french_horn', 'garbage_truck', 'gas_pump', 'golf_ball', 'parachute']
993 files in subfolder /media/juebrauer/Seagate Expansion Drive/datasets/01_images/18_imagenette2/320px/train//cassette_player/
858 files in subfolder /media/juebrauer/Seagate Expansion Drive/datasets/01_images/18_imagenette2/320px/train//chain_saw/
941 files in subfolder /media/juebrauer/Seagate Expansion Drive/datasets/01_images/18_imagenette2/320px/train//church/
955 files in subfolder /media/juebrauer/Seagate Expansion Drive/datasets/01_images/18_imagenette2/320px/train//dog_english_springer/
963 files in subfolder /media/juebrauer/Seagate Expansion Drive/datasets/01_images/18_imagenette2/320px/train//fish_tench/
956 files in
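The per-class file counts printed above can be cross-checked independently of the toolbox. The following is a minimal sketch using only the standard library. `count_files_per_class` is a hypothetical helper, not a function of `cnn_toolbox`, and it assumes the layout shown in the output: one subfolder per class directly under the root folder.

```python
from pathlib import Path
from typing import Dict


def count_files_per_class(root_folder: str) -> Dict[str, int]:
    """Map each immediate subfolder name (= class) to its number of files."""
    root = Path(root_folder)
    counts = {}
    # Sort the class folders so the result has a stable, alphabetical order.
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(1 for f in class_dir.iterdir() if f.is_file())
    return counts
```

Calling `count_files_per_class(root_folder_train)` should then list the same classes as the output above, provided the dataset was unpacked in that layout.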

# Test helper function to build CNN models

In [2]:
from cnn_toolbox import create_cnn_model

model1 = create_cnn_model(model_name = "same_nr_filters",
                         input_shape = img_shape,
                         nr_outputs = ds_train.nr_classes)
model1.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 222, 222, 256)     7168      
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 111, 111, 256)     0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 109, 109, 256)     590080    
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 54, 54, 256)       0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 52, 52, 256)       590080    
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 26, 26, 256)       0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 24, 24, 256)       5

In [3]:
model2 = create_cnn_model(model_name = "inc_nr_filters",
                         input_shape = img_shape,
                         nr_outputs = ds_train.nr_classes)
model2.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 222, 222, 32)      896       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 111, 111, 32)      0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 109, 109, 64)      18496     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 54, 54, 64)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 52, 52, 128)       73856     
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 26, 26, 128)       0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 24, 24, 256)       2
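The output shapes and parameter counts in both summaries follow from simple arithmetic. The sketch below reproduces them under assumptions inferred from the printed shapes: 3x3 kernels, stride 1, no padding, and 2x2 max pooling. The helper names are hypothetical.

```python
def conv2d_params(kernel: int, in_channels: int, filters: int) -> int:
    """Weights plus one bias per filter for a square-kernel Conv2D layer."""
    return kernel * kernel * in_channels * filters + filters


def conv2d_valid_size(size: int, kernel: int = 3) -> int:
    """Spatial output size of a stride-1 convolution without padding."""
    return size - kernel + 1


def max_pool_size(size: int, pool: int = 2) -> int:
    """Spatial output size of non-overlapping pooling (remainder is dropped)."""
    return size // pool


# Follow the spatial size through three conv + pool stages, starting at 224.
size = 224
sizes = []
for _ in range(3):
    size = conv2d_valid_size(size)
    sizes.append(size)
    size = max_pool_size(size)
    sizes.append(size)
print(sizes)  # [222, 111, 109, 54, 52, 26]

# "same_nr_filters": 3 -> 256 -> 256 channels
print(conv2d_params(3, 3, 256), conv2d_params(3, 256, 256))  # 7168 590080
# "inc_nr_filters": 3 -> 32 -> 64 -> 128 channels
print(conv2d_params(3, 3, 32), conv2d_params(3, 32, 64), conv2d_params(3, 64, 128))  # 896 18496 73856
```

The comparison shows why the model with the same number of filters per layer is much larger: from the second layer on, the parameter count grows with the product of input and output channels.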

# Test helper function to train a CNN for one epoch

In [4]:
from cnn_toolbox import train_cnn_one_epoch    

In [5]:
ds_train.set_mini_batch_size(128)
#train_cnn_one_epoch(model1, ds_train)

# Test helper function to test a CNN with a test dataset

In [6]:
from cnn_toolbox import test_cnn

In [7]:
#test_cnn(model1, ds_test)

# Check availability of GPUs

In [8]:
from cnn_toolbox import gpu_check
gpu_check()

The following GPUs are available: []
Nr of GPUs available: 0


Note: to check whether the NVIDIA GPUs of a computer are actually used during training, run:

    watch -n1.0 nvidia-smi   
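The GPU list can also be queried directly from Python. The sketch below assumes the toolbox is built on TensorFlow/Keras, as the model summaries suggest, and is not necessarily how `gpu_check` is implemented. `list_gpus` is a hypothetical helper that returns an empty list when TensorFlow is not installed.

```python
def list_gpus():
    """Return the names of the GPUs TensorFlow can see, or [] if TensorFlow is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return []
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = list_gpus()
print("The following GPUs are available:", gpus)
print("Nr of GPUs available:", len(gpus))
```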

# Check controllability of the initial weights

In [9]:
from cnn_toolbox import initialize_pseudo_random_number_generators

In [10]:
initialize_pseudo_random_number_generators(1)

In [11]:
model1 = create_cnn_model(model_name = "same_nr_filters",
                          input_shape = img_shape,
                          nr_outputs = ds_train.nr_classes)

In [12]:
model1.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 222, 222, 256)     7168      
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 111, 111, 256)     0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 109, 109, 256)     590080    
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 54, 54, 256)       0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 52, 52, 256)       590080    
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 26, 26, 256)       0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 24, 24, 256)       5

In [13]:
from cnn_toolbox import get_weights_from_conv_layer

filter_weights_1, bias_weights_1 = get_weights_from_conv_layer(model1, "conv2d", show_info=True)

filter_weights has shape: (3, 3, 3, 256)
bias_weights has shape: (256,)
filter_weights has type: <class 'numpy.ndarray'>
bias_weights has type: <class 'numpy.ndarray'>


In [14]:
f1 = filter_weights_1[:,:,:,0]

In [15]:
f1

array([[[-0.03397892, -0.02063269, -0.03835453],
        [ 0.02216351, -0.01557614, -0.04525996],
        [ 0.02965986,  0.04531969,  0.03822213]],

       [[ 0.01318451,  0.00451758, -0.02655915],
        [ 0.04403731,  0.00551137,  0.03676816],
        [ 0.00215926, -0.02074016,  0.00352234]],

       [[ 0.02060153, -0.01723271, -0.01694306],
        [ 0.02090051, -0.02011791,  0.02081885],
        [-0.01326452, -0.04043265,  0.04245097]]], dtype=float32)

In [16]:
model2 = create_cnn_model(model_name = "same_nr_filters",
                          input_shape = img_shape,
                          nr_outputs = ds_train.nr_classes)

In [17]:
filter_weights_2, bias_weights_2 = get_weights_from_conv_layer(model2, "conv2d", show_info=True)
f2 = filter_weights_2[:,:,:,0]
f2

filter_weights has shape: (3, 3, 3, 256)
bias_weights has shape: (256,)
filter_weights has type: <class 'numpy.ndarray'>
bias_weights has type: <class 'numpy.ndarray'>


array([[[ 0.00225292, -0.02375937, -0.04370089],
        [ 0.01732371, -0.04943231,  0.02387507],
        [ 0.03352253, -0.01579722,  0.04878741]],

       [[-0.0136955 , -0.02521811, -0.04715899],
        [ 0.04738381,  0.05001703, -0.03145186],
        [ 0.04049386,  0.02348245,  0.01627743]],

       [[ 0.02415584,  0.03840971, -0.01027263],
        [ 0.03119494, -0.02915212, -0.03134566],
        [ 0.02180289,  0.04078788,  0.01119361]]], dtype=float32)

In [18]:
# re-initialize the random number generators
# with the same random seed that we used when
# we generated model1
initialize_pseudo_random_number_generators(1)

In [19]:
model3 = create_cnn_model(model_name = "same_nr_filters",
                          input_shape = img_shape,
                          nr_outputs = ds_train.nr_classes)
filter_weights_3, bias_weights_3 = get_weights_from_conv_layer(model3, "conv2d", show_info=True)
f3 = filter_weights_3[:,:,:,0]

filter_weights has shape: (3, 3, 3, 256)
bias_weights has shape: (256,)
filter_weights has type: <class 'numpy.ndarray'>
bias_weights has type: <class 'numpy.ndarray'>


In [20]:
f1

array([[[-0.03397892, -0.02063269, -0.03835453],
        [ 0.02216351, -0.01557614, -0.04525996],
        [ 0.02965986,  0.04531969,  0.03822213]],

       [[ 0.01318451,  0.00451758, -0.02655915],
        [ 0.04403731,  0.00551137,  0.03676816],
        [ 0.00215926, -0.02074016,  0.00352234]],

       [[ 0.02060153, -0.01723271, -0.01694306],
        [ 0.02090051, -0.02011791,  0.02081885],
        [-0.01326452, -0.04043265,  0.04245097]]], dtype=float32)

In [21]:
f3

array([[[-0.03397892, -0.02063269, -0.03835453],
        [ 0.02216351, -0.01557614, -0.04525996],
        [ 0.02965986,  0.04531969,  0.03822213]],

       [[ 0.01318451,  0.00451758, -0.02655915],
        [ 0.04403731,  0.00551137,  0.03676816],
        [ 0.00215926, -0.02074016,  0.00352234]],

       [[ 0.02060153, -0.01723271, -0.01694306],
        [ 0.02090051, -0.02011791,  0.02081885],
        [-0.01326452, -0.04043265,  0.04245097]]], dtype=float32)

In [22]:
f2

array([[[ 0.00225292, -0.02375937, -0.04370089],
        [ 0.01732371, -0.04943231,  0.02387507],
        [ 0.03352253, -0.01579722,  0.04878741]],

       [[-0.0136955 , -0.02521811, -0.04715899],
        [ 0.04738381,  0.05001703, -0.03145186],
        [ 0.04049386,  0.02348245,  0.01627743]],

       [[ 0.02415584,  0.03840971, -0.01027263],
        [ 0.03119494, -0.02915212, -0.03134566],
        [ 0.02180289,  0.04078788,  0.01119361]]], dtype=float32)
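The outputs above show that `f3` equals `f1` (same seed, first model created after seeding), while `f2` differs (second model created without re-seeding). The same effect can be reproduced with plain NumPy. This is only a sketch: `seed_everything` and `draw_filter` are hypothetical stand-ins and do not show how `initialize_pseudo_random_number_generators` or the Keras initializers actually work.

```python
import random

import numpy as np


def seed_everything(seed: int) -> None:
    """Seed the stdlib and NumPy generators (hypothetical helper).

    Assumption: a TensorFlow-based toolbox would additionally need
    tf.random.set_seed(seed) to control the Keras weight initializers.
    """
    random.seed(seed)
    np.random.seed(seed)


def draw_filter(shape=(3, 3, 3)) -> np.ndarray:
    """Stand-in for weight initialization: small uniform random values."""
    return np.random.uniform(-0.05, 0.05, size=shape).astype(np.float32)


seed_everything(1)
f1 = draw_filter()  # first draw after seeding
f2 = draw_filter()  # second draw: the generator state has advanced
seed_everything(1)
f3 = draw_filter()  # first draw after re-seeding with the same seed

print(np.array_equal(f1, f3))  # True
print(np.array_equal(f1, f2))  # False
```

Comparing the arrays with `np.array_equal` avoids having to check the printed values by eye.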

# Training a complete model

In [23]:
from cnn_toolbox import train_cnn_complete
from cnn_toolbox import save_history

model = create_cnn_model(model_name = "same_nr_filters",
                         input_shape = img_shape,
                         nr_outputs = ds_train.nr_classes)

history = train_cnn_complete(model,
                             ds_train,
                             ds_test,
                             stop_epochnr=2)

fname = "model01.history"
save_history(history, fname)




-----------------------------------------------
train_cnn_complete: starting to train the model
-----------------------------------------------
test_cnn: there are 50 testing images. So for a batch size of 128 we have to test 1 batches.
test_cnn: tested mini batch 1 of 1. Tested images so far: 50
test_cnn: correctly classified: 5 of 50 images of dataset 'imagenette2-train': --> classification rate: 0.10
test_cnn: there are 50 testing images. So for a batch size of 32 we have to test 2 batches.
test_cnn: tested mini batch 1 of 2. Tested images so far: 32
test_cnn: tested mini batch 2 of 2. Tested images so far: 50
test_cnn: correctly classified: 5 of 50 images of dataset 'imagenette2-test': --> classification rate: 0.10


********************************************************
train_cnn_complete: starting training epoch 1
train_cnn_one_epoch: finished training batch 1 of 1. Trained images so far: 50
train_cnn_one_epoch: time needed for training this epoch: 0:00:11.595730
***********

In [25]:
history

{'cl_rate_train': [0.1, 0.1, 0.12], 'cl_rate_test': [0.1, 0.1, 0.12]}