# G Neural Networks Overfitting
_6 points_

- Train a neural net and prevent overfitting by regularization. 
- You can use any combination of regularizers we saw in class.
- Use the train and test splits in the data to evaluate the model.

In [1]:
import keras
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense, Flatten, BatchNormalization
from keras.layers import Conv2D, MaxPooling2D, Dropout, Activation
from keras import backend as K
from keras.utils import to_categorical
from keras import regularizers

Using TensorFlow backend.


In [10]:
# change batch_size and epochs for fine tuning
# image_classes MUST remain at 10!!!

batch_size = 128
image_classes = 10
epochs = 20

In [3]:
# load CIFAR-10 (already split into train and test) and one-hot encode the labels
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
y_train = to_categorical(y_train, image_classes)
y_test = to_categorical(y_test, image_classes)
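
As a rough illustration of what `to_categorical` does to the labels, the sketch below one-hot encodes integer class ids in plain Python. The helper `one_hot` is hypothetical and only mirrors the idea; it is not the Keras implementation, and it assumes a flat list of labels rather than the `(N, 1)` array returned by `cifar10.load_data()`.

```python
def one_hot(labels, num_classes):
    """Return one row per label, with 1.0 at the label's index and 0.0 elsewhere."""
    rows = []
    for label in labels:
        if not 0 <= label < num_classes:
            raise ValueError("label {} is outside [0, {})".format(label, num_classes))
        row = [0.0] * num_classes
        row[label] = 1.0
        rows.append(row)
    return rows


# Made-up labels for illustration; CIFAR-10 uses class ids 0 to 9.
encoded = one_hot([3, 0, 9], num_classes=10)
print(encoded[0])
```

Each encoded row has length `image_classes`, which is why the final `Dense` layer must have exactly 10 outputs.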

In [7]:
# create the model
model = Sequential()
decay = 1e-4  # L2 weight-decay factor shared by all conv layers

# block 1: two conv layers with batch normalization, then pooling and dropout
# note: softmax is used as the hidden activation throughout; ReLU is the more common choice
model.add(Conv2D(32, kernel_size=(5, 5), activation="softmax", input_shape=(32, 32, 3), kernel_regularizer=regularizers.l2(decay)))
model.add(BatchNormalization())
model.add(Conv2D(32, kernel_size=(3, 3), activation="softmax", kernel_regularizer=regularizers.l2(decay)))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.3))

# block 2: same pattern with more filters and stronger dropout
model.add(Conv2D(64, kernel_size=(3, 3), activation="softmax", kernel_regularizer=regularizers.l2(decay)))
model.add(BatchNormalization())
model.add(Conv2D(128, kernel_size=(3, 3), activation="softmax", kernel_regularizer=regularizers.l2(decay)))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))

# classifier head; Dense precedes Flatten here, so it acts on the channel axis at each spatial position
model.add(Dense(128, activation="softmax"))
model.add(Flatten())
model.add(Dense(image_classes, activation="softmax"))

model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
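
The `kernel_regularizer=regularizers.l2(decay)` argument adds a penalty proportional to the sum of squared kernel weights to the training loss, so the loss Keras reports contains this term in addition to the cross-entropy. A minimal plain-Python sketch of that term follows; `l2_penalty` is a hypothetical helper and the weights are made-up values, not taken from the trained model.

```python
def l2_penalty(weights, decay=1e-4):
    """L2 weight penalty: decay times the sum of squared weights."""
    return decay * sum(w * w for w in weights)


# Made-up kernel weights for illustration only.
example_weights = [0.5, -1.0, 2.0]
print(l2_penalty(example_weights))
```

With `decay = 1e-4` the penalty stays small unless the weights grow large, which is the intended pressure toward small weights.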

In [8]:
# inspect the architecture (the model was already compiled in the previous cell)
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_5 (Conv2D)            (None, 28, 28, 32)        2432      
_________________________________________________________________
batch_normalization_5 (Batch (None, 28, 28, 32)        128       
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 26, 26, 32)        9248      
_________________________________________________________________
batch_normalization_6 (Batch (None, 26, 26, 32)        128       
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 13, 13, 32)        0         
_________________________________________________________________
dropout_3 (Dropout)          (None, 13, 13, 32)        0         
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 11, 11, 64)        18496     
__________
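
The output shapes and parameter counts in the summary can be checked by hand. The sketch below assumes stride 1 and the default "valid" padding, as in the model above; `conv2d_output` and `batchnorm_params` are hypothetical helpers written for this check.

```python
def conv2d_output(size, kernel, filters, in_channels):
    """Side length and parameter count of a square conv layer (stride 1, 'valid' padding)."""
    out_size = size - kernel + 1
    params = kernel * kernel * in_channels * filters + filters  # weights + biases
    return out_size, params


def batchnorm_params(channels):
    """Batch norm keeps gamma, beta, moving mean and moving variance per channel."""
    return 4 * channels


size, channels = 32, 3  # CIFAR-10 images are 32x32 RGB
for kernel, filters in [(5, 32), (3, 32)]:
    size, params = conv2d_output(size, kernel, filters, channels)
    channels = filters
    print("conv", (size, size, channels), params,
          "batchnorm", batchnorm_params(channels))

size //= 2  # 2x2 max pooling halves each spatial side
size, params = conv2d_output(size, 3, 64, channels)
print("conv", (size, size, 64), params)
```

The printed values should match the rows for `conv2d_5`, `conv2d_6`, `conv2d_7` and the batch normalization layers in the summary.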

In [11]:
train_size = 50000
test_size = 10000
if y_test.shape == (10000, 10):
    model.fit(x_train[:train_size], y_train[:train_size],
              batch_size=batch_size,
              epochs=epochs,
              shuffle=True,
              validation_data=(x_test[:test_size], y_test[:test_size]))
else:
    raise AttributeError("y_test.shape must be (10000, 10) but is {}".format(y_test.shape))

Train on 50000 samples, validate on 10000 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


In [12]:
# evaluate on the full test split
score = model.evaluate(x_test, y_test, verbose=0)
print("Test loss:", score[0])
print("Test accuracy:", score[1])

Test loss: 1.1692674964904786
Test accuracy: 0.5982
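
One way to quantify overfitting is the gap between training and validation accuracy in the last epoch. The sketch below assumes the `History` object returned by `model.fit` was kept and that its `history` dict uses the keys `"accuracy"` and `"val_accuracy"` (older Keras versions use `"acc"` and `"val_acc"`). The helper `accuracy_gap` and the numbers are hypothetical, not results from this notebook.

```python
def accuracy_gap(history, train_key="accuracy", val_key="val_accuracy"):
    """Training accuracy minus validation accuracy in the final epoch."""
    return history[train_key][-1] - history[val_key][-1]


# Made-up per-epoch values for illustration only.
example_history = {
    "accuracy": [0.50, 0.62],
    "val_accuracy": [0.48, 0.60],
}
print("gap: {:.3f}".format(accuracy_gap(example_history)))
```

A gap near zero suggests the model generalizes about as well as it fits the training data; it says nothing about whether the accuracy itself is high.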


# Answer

In this version we modified the model and made it a little more complex. By adding layers and running the convolution, pooling, dropout sequence twice, we expected better results. Accuracy did not improve directly; however, the gap between the training accuracy in the final epochs and the accuracy at evaluation is remarkably small: the two values differ by only about 0.7 %, which is very good. Even though the final accuracy is "only" about 59 % (0.5982), we believe this implementation shows that combining several regularization techniques can prevent overfitting, but does not always lead to a better model in terms of accuracy, or to a faster runtime.
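
A possible follow-up, not part of the experiment above, would be to keep the same regularizers but use ReLU in the hidden layers and place `Flatten` before the dense layers, since softmax is normally reserved for the output layer. The sketch below is a hypothetical variant under those assumptions; `build_relu_variant` is an illustrative name, and no accuracy is claimed for it because it was not run here.

```python
def build_relu_variant(image_classes=10, decay=1e-4):
    """Hypothetical variant: ReLU hidden activations and Flatten before Dense."""
    # Keras is imported inside the function so that defining it does not require Keras.
    from keras import regularizers
    from keras.layers import (BatchNormalization, Conv2D, Dense, Dropout,
                              Flatten, MaxPooling2D)
    from keras.models import Sequential

    model = Sequential()
    # block 1: same layout as the original model, ReLU instead of softmax
    model.add(Conv2D(32, kernel_size=(5, 5), activation="relu",
                     input_shape=(32, 32, 3),
                     kernel_regularizer=regularizers.l2(decay)))
    model.add(BatchNormalization())
    model.add(Conv2D(32, kernel_size=(3, 3), activation="relu",
                     kernel_regularizer=regularizers.l2(decay)))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.3))

    # block 2
    model.add(Conv2D(64, kernel_size=(3, 3), activation="relu",
                     kernel_regularizer=regularizers.l2(decay)))
    model.add(BatchNormalization())
    model.add(Conv2D(128, kernel_size=(3, 3), activation="relu",
                     kernel_regularizer=regularizers.l2(decay)))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.5))

    # classifier head: flatten first, then dense layers; softmax only at the output
    model.add(Flatten())
    model.add(Dense(128, activation="relu"))
    model.add(Dense(image_classes, activation="softmax"))

    model.compile(loss="categorical_crossentropy", optimizer="adam",
                  metrics=["accuracy"])
    return model
```

Calling `build_relu_variant()` and training it with the same `model.fit` arguments would allow a direct comparison of both the accuracy and the train/test gap.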