Oskar Lundqvist; osauli-0@student.ltu.se

Filip Renberg; filren-0@student.ltu.se

In this lab we train a simple model using Keras.

We use the MNIST dataset of handwritten digits, imported from Keras as we did in lab 1 and lab 4.

In [1]:
import logging
logging.getLogger('tensorflow').disabled = True # Disable warnings from tensorflow
import numpy as np
import keras
from keras import layers
from keras.datasets import mnist
input_shape = (28,28,1)
num_classes = 10

We preprocess our data

In [2]:
(train_data, train_labels), (test_data, test_labels) = mnist.load_data()
#Normalize input values
train_data = train_data.astype("float32")/255
test_data = test_data.astype("float32")/255
#Make sure images have the correct shape(28,28,1)
train_data = np.expand_dims(train_data, -1)
test_data = np.expand_dims(test_data, -1)

#Converts class vectors to binary class matrices
train_labels = keras.utils.to_categorical(train_labels, num_classes)
test_labels = keras.utils.to_categorical(test_labels, num_classes)

print("training shape: ", train_data.shape)
print(train_data.shape[0], "number of training samples")
print(test_data.shape[0], "number of testing samples")

training shape:  (60000, 28, 28, 1)
60000 number of training samples
10000 number of testing samples
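To make the preprocessing steps concrete, here is a minimal NumPy-only sketch of what scaling, `np.expand_dims` and one-hot encoding do. The `one_hot` helper and the random `fake_images` array are our own illustrative stand-ins (not part of the lab code); `one_hot` is assumed to behave like `keras.utils.to_categorical` for integer labels.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Hypothetical helper: one row per label, with a single 1 in the label's column."""
    encoded = np.zeros((len(labels), num_classes), dtype="float32")
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

# Stand-in for two 28x28 grayscale images with pixel values in 0..255
fake_images = np.random.default_rng(0).integers(0, 256, size=(2, 28, 28), dtype=np.uint8)

# Scale to [0, 1] and append a channel axis: (2, 28, 28) -> (2, 28, 28, 1)
scaled = np.expand_dims(fake_images.astype("float32") / 255, -1)

fake_labels = np.array([3, 7])
encoded = one_hot(fake_labels, 10)

print(scaled.shape)  # (2, 28, 28, 1)
print(encoded[0])    # 1.0 at index 3, zeros elsewhere
```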


We create our convolutional neural network using Keras

In [3]:
def create_model(nr_conv2d = 1, nr_filter = 32, kernel_size = (3, 3), model_summary = False):
    # Name the model based on the input values,
    # Model_nr_conv2d-nr_filter-kernel_size
    model_name=f"sequential_{nr_conv2d}-{nr_filter}-{kernel_size[0]}"
    
    hidden_layers = []

    hidden_layers.append(keras.Input(shape=input_shape)) # input layer

    for i in range(nr_conv2d):
        hidden_layers.append(layers.Conv2D(nr_filter, kernel_size=kernel_size, activation="relu"))
        hidden_layers.append(layers.MaxPooling2D(pool_size=[2,2]))

        nr_filter*=2

    hidden_layers.append(layers.Flatten())
    hidden_layers.append(layers.Dropout(0.5)) #Prevents overfitting
    hidden_layers.append(layers.Dense(num_classes, activation="softmax"))

    model = keras.Sequential(hidden_layers, name=model_name)

    if model_summary:
        model.summary()
    else:
        print(f'Model: {model_name}')
    
    return model
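The layer sizes produced by `create_model` can be checked by hand. The sketch below is a hypothetical helper of our own, `conv_pool_shapes`, that traces the spatial size, channel count and parameter count through each Conv2D + MaxPooling2D block. It assumes the Keras defaults used above: "valid" padding, stride 1 for the convolution, and a 2x2 pooling window.

```python
def conv_pool_shapes(nr_conv2d, nr_filter, kernel, size=28, channels=1):
    """Trace Conv2D ('valid' padding, stride 1) + MaxPooling2D(2x2) blocks.

    Returns a list of (layer, height/width, channels, params) rows and the
    length of the flattened vector that feeds the Dense layer.
    """
    rows = []
    for _ in range(nr_conv2d):
        # Each filter has kernel*kernel*channels weights plus one bias
        params = (kernel * kernel * channels + 1) * nr_filter
        size = size - kernel + 1      # 'valid' convolution shrinks the image
        rows.append(("conv", size, nr_filter, params))
        size = size // 2              # 2x2 max pooling halves it (rounded down)
        rows.append(("pool", size, nr_filter, 0))
        channels = nr_filter
        nr_filter *= 2                # create_model doubles the filters per block
    return rows, size * size * channels

rows, flat = conv_pool_shapes(nr_conv2d=2, nr_filter=32, kernel=3)
for row in rows:
    print(row)
print("flatten:", flat)
```

For two blocks starting at 32 filters with a 3x3 kernel, this gives 320 and 18496 convolution parameters and a flattened size of 1600, which agrees with the `sequential_2-32-3` summary printed at the end of the notebook.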

Next we train the model. Several loss functions can be used here; for this project we use "categorical_crossentropy", "poisson" and "binary_crossentropy".

In [4]:
n_epochs = 10
size_batch = 128

def model_training(pick, model):
    # Map the pick (1, 2 or 3) to the name of the Keras loss function
    losses = {1: "categorical_crossentropy", 2: "poisson", 3: "binary_crossentropy"}
    loss = losses[pick]

    model.compile(loss=loss, optimizer="adam", metrics=['categorical_accuracy'])
    print(f"Loss Function: {loss}")
    # Hold out 10% of the training data for validation and return the training history
    return model.fit(train_data, train_labels, batch_size=size_batch, epochs=n_epochs, validation_split=0.1, verbose=0)
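To see why the three loss functions report values on different scales, here is a rough NumPy sketch of each one for one-hot targets. These are simplified re-implementations based on the documented Keras formulas, not the Keras code itself; the `EPS` constant and the example prediction are our own assumptions.

```python
import numpy as np

EPS = 1e-7  # assumed small constant that keeps log() finite

def categorical_crossentropy(y_true, y_pred):
    # Sum over classes, then average over samples
    y_pred = np.clip(y_pred, EPS, 1 - EPS)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=-1)))

def poisson(y_true, y_pred):
    # Average of y_pred - y_true * log(y_pred) over classes and samples
    return float(np.mean(y_pred - y_true * np.log(y_pred + EPS)))

def binary_crossentropy(y_true, y_pred):
    # Every class is treated as its own yes/no problem, then averaged
    y_pred = np.clip(y_pred, EPS, 1 - EPS)
    per_class = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    return float(np.mean(per_class))

# One sample, true class 2, prediction puts 0.91 on the true class
y_true = np.zeros((1, 10))
y_true[0, 2] = 1.0
y_pred = np.full((1, 10), 0.01)
y_pred[0, 2] = 0.91

cce = categorical_crossentropy(y_true, y_pred)
poi = poisson(y_true, y_pred)
bce = binary_crossentropy(y_true, y_pred)
print(cce, poi, bce)
```

The same prediction gives three different numbers, because the poisson and binary variants average over the 10 classes while the categorical one sums over them. In this sketch, for predictions that sum to 1, the poisson value works out to roughly (1 + categorical crossentropy) / 10.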


We define the values that we want to test:

In [5]:
nr_conv2d_list = [1, 2, 3]
nr_filter_list = [8, 16, 32]
kernel_size_list = [1, 2, 3]

# We use this list to save the values for the best model
best_model = [[0, 1]]
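Combined with the three loss functions, these lists define a full grid of configurations. As a rough sketch, the grid can be enumerated with `itertools.product` to see how many models will be trained; the `loss_picks` list is our own name for the `pick` values 1 to 3 accepted by `model_training`.

```python
from itertools import product

nr_conv2d_list = [1, 2, 3]
nr_filter_list = [8, 16, 32]
kernel_size_list = [1, 2, 3]
loss_picks = [1, 2, 3]  # assumed name; same numbering as model_training's `pick`

# Every combination of (conv2d layers, starting filters, kernel size, loss pick)
grid = list(product(nr_conv2d_list, nr_filter_list, kernel_size_list, loss_picks))

print(len(grid))          # 81 models, each trained for n_epochs epochs
print(grid[0], grid[-1])  # (1, 8, 1, 1) (3, 32, 3, 3)
```

`product` varies the last list fastest, so the order matches four nested `for` loops written in the same order.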

In [6]:
def best_performance(best_model):
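    # Rebuild the best model so that its summary is printed.
    # Note: best_model does not store the kernel size, so create_model's default (3, 3) is used here.
    create_model(nr_conv2d=best_model[0][2], nr_filter=best_model[0][3], model_summary=True)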
    match best_model[0][4]:
        case 1:
            print("Using Categorical crossentropy loss function resulted in the \nfollowing accuracy and loss:")
        case 2:
            print("Using Poisson loss function resulted in the \nfollowing accuracy and loss:")
        case 3:
            print("Using binary crossentropy loss function resulted in the \nfollowing accuracy and loss:")
    print("|\taccuracy: \t", best_model[0][0], "\t|")
    print("|\tloss: \t\t", best_model[0][1], "\t|")
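`best_model` holds one positional tuple, which is why `best_performance` reads fields such as `best_model[0][4]`. As a hedged alternative, the sketch below keeps every result in a named record and picks the best one by accuracy alone. The `Result` type and `pick_best` function are hypothetical names of our own; the numbers are copied from the test output of this notebook. Selecting on accuracy only is our assumption, motivated by the fact that losses from different loss functions are not on a common scale.

```python
from typing import NamedTuple

class Result(NamedTuple):
    accuracy: float
    loss: float
    nr_conv2d: int
    nr_filter: int
    kernel: int
    loss_pick: int  # 1 = categorical, 2 = poisson, 3 = binary

def pick_best(results):
    """Return the result with the highest test accuracy (first one wins on ties)."""
    return max(results, key=lambda r: r.accuracy)

results = [
    Result(0.9101999998092651, 0.31385284662246704, 1, 8, 1, 1),
    Result(0.9101999998092651, 0.1323578655719757, 1, 8, 1, 2),
    Result(0.8991000056266785, 0.07752678543329239, 1, 8, 1, 3),
    Result(0.9623000025749207, 0.14030881226062775, 1, 8, 2, 1),
]

best = pick_best(results)
print(best)
```

The first two rows show the scale issue: the accuracy is identical, but the poisson loss is less than half of the categorical crossentropy loss.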

The model summary is muted during the test runs, but the model name is still printed, for example:

`sequential_x1-x2-x3`

`x1` is the number of conv2d layers, `x2` is the number of filters in the first conv2d layer (doubled for each additional layer), and `x3` is the kernel size.

In [7]:
for i in nr_conv2d_list: # The nr conv2d layers
    for j in nr_filter_list: # The starting size of the conv2d filters
        for k in kernel_size_list: # Kernel size for the filters
            for n in range(3): # The loss function

                model = create_model(nr_conv2d=i, nr_filter=j, kernel_size=(k, k))
                model_training(n+1, model)

                #Evaluate
                score = model.evaluate(test_data, test_labels, verbose=0)
                print("|\taccuracy: \t", score[1], "\t|")
                print("|\tloss: \t\t", score[0], "\t|")

                if (score[1] > best_model[0][0]) and (score[0] < best_model[0][1]):
                    best_model.pop()
                    best_model.append((score[1], score[0], i, j, n+1))

                print("\n")

print("Finally done :)")

The model summary is muted during the test runs, but the model name is still printed:
the first value is the number of conv2d layers, the second value is the starting number of filters,
and the third value is the kernel size.

Model: sequential_1-8-1
Loss Function: categorical_crossentropy
|	accuracy: 	 0.9101999998092651 	|
|	loss: 		 0.31385284662246704 	|


Model: sequential_1-8-1
Loss Function: poisson
|	accuracy: 	 0.9101999998092651 	|
|	loss: 		 0.1323578655719757 	|


Model: sequential_1-8-1
Loss Function: binary_crossentropy
|	accuracy: 	 0.8991000056266785 	|
|	loss: 		 0.07752678543329239 	|


Model: sequential_1-8-2
Loss Function: categorical_crossentropy
|	accuracy: 	 0.9623000025749207 	|
|	loss: 		 0.14030881226062775 	|


Model: sequential_1-8-2
Loss Function: poisson
|	accuracy: 	 0.9603000283241272 	|
|	loss: 		 0.1140991598367691 	|


Model: sequential_1-8-2
Loss Function: binary_crossentropy
|	accuracy: 	 0.9627000093460083 	|
|	loss: 		 0.0324176438

The following is the best performing model from our tests:

In [8]:
best_performance(best_model)

Model: "sequential_2-32-3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_162 (Conv2D)         (None, 26, 26, 32)        320       
                                                                 
 max_pooling2d_162 (MaxPool  (None, 13, 13, 32)        0         
 ing2D)                                                          
                                                                 
 conv2d_163 (Conv2D)         (None, 11, 11, 64)        18496     
                                                                 
 max_pooling2d_163 (MaxPool  (None, 5, 5, 64)          0         
 ing2D)                                                          
                                                                 
 flatten_81 (Flatten)        (None, 1600)              0         
                                                                 
 dropout_81 (Dropout)        (None, 1600)        