# Homework - 26.11.2018
## State of the Art Neural Network Architectures

The purpose of this homework is to implement and evaluate the state-of-the-art (SOTA) architectures presented in the lecture.
However, you are encouraged to try your own layer module ideas.
Feel free to consult the [Keras source code](https://github.com/keras-team/keras-applications).




1. Based on the CNN modules presented in the lecture (e.g. VGG16, Inception, ResNet, Xception, DenseNet), come up with your own CNN module and write a short text discussing your idea and the motivation behind the module.

2. Evaluate your module using the Keras CIFAR10 dataset splits (The model with the best test accuracy will present their solution to the class).

In [1]:
from tensorflow.keras.datasets import cifar10

(x_train, y_train), (x_test, y_test) = cifar10.load_data()

In [2]:
import numpy as np
print(np.shape(x_train))
print(np.shape(y_train))
print(np.shape(x_test))
print(np.shape(y_test))
#print(x_train[0,:,:,:])#,RGB = 3
#print(y_train)


(50000, 32, 32, 3)
(50000, 1)
(10000, 32, 32, 3)
(10000, 1)
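
The arrays returned by `cifar10.load_data()` hold raw `uint8` pixel values in the range 0 to 255, and the cells below pass them to the networks unscaled. A common preprocessing step, which this notebook does not apply, is to rescale the pixels to [0, 1] first. The following is only a sketch on synthetic data; `scale_pixels` and `fake_batch` are hypothetical names introduced for illustration.

```python
import numpy as np

def scale_pixels(images):
    """Convert uint8 pixel values in [0, 255] to float32 values in [0, 1]."""
    return images.astype("float32") / 255.0

# Synthetic stand-in for a few CIFAR-10 images (assumption: same shape and dtype).
rng = np.random.default_rng(0)
fake_batch = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)

scaled = scale_pixels(fake_batch)
print(scaled.dtype, scaled.shape)
```

Whether this changes the results reported below was not tested here; it would have to be applied to `x_train` and `x_test` alike.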


In [3]:
# One-hot encode the labels so they match the 10-way softmax output
from tensorflow.keras import utils
y_train_categorical = utils.to_categorical(y_train, 10)
y_test_categorical = utils.to_categorical(y_test, 10)
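
To make the effect of `to_categorical` concrete, the sketch below reproduces the one-hot encoding with plain NumPy. It is an illustration of the idea, not the Keras implementation; `one_hot` and `example_labels` are hypothetical names.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Turn integer labels of shape (N,) or (N, 1) into one-hot rows of shape (N, num_classes)."""
    flat = np.asarray(labels, dtype=int).reshape(-1)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype="float32")[flat]

# Labels shaped like y_train, i.e. one column of class indices.
example_labels = np.array([[3], [0], [9]])
encoded = one_hot(example_labels, 10)
print(encoded.shape)  # (3, 10)
print(encoded[0])     # 1.0 at index 3, 0.0 elsewhere
```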

In [68]:
#Hyperparameters
img_shape = (32,32,3)
classes_number = 10

In [22]:
#MODEL: AlexNet
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation, Dropout, Flatten, Conv2D, MaxPooling2D, BatchNormalization

alexnet = Sequential()

# Layer 1
# 96 filters with an 11x11 convolution: too big for a 32x32 image?
alexnet.add(Conv2D(96, (11, 11), input_shape=img_shape, padding='same'))
alexnet.add(BatchNormalization())
alexnet.add(Activation('relu'))
alexnet.add(MaxPooling2D(pool_size=(2, 2)))

# Layer 2
alexnet.add(Conv2D(256, (5, 5), padding='same'))
alexnet.add(BatchNormalization())
alexnet.add(Activation('relu'))
alexnet.add(MaxPooling2D(pool_size=(2, 2)))

# Layer 3
alexnet.add(Conv2D(384, (3, 3), padding='same'))
alexnet.add(BatchNormalization())
alexnet.add(Activation('relu'))
alexnet.add(MaxPooling2D(pool_size=(2, 2)))

# Layer 4
alexnet.add(Conv2D(384, (3, 3), padding='same'))
alexnet.add(BatchNormalization())
alexnet.add(Activation('relu'))

# Layer 5
alexnet.add(Conv2D(256, (3, 3), padding='same'))
alexnet.add(BatchNormalization())
alexnet.add(Activation('relu'))
alexnet.add(MaxPooling2D(pool_size=(2, 2)))

alexnet.add(Flatten())

# Layer 6 - fully connected layer
alexnet.add(Dense(4096))
alexnet.add(BatchNormalization())
alexnet.add(Activation('relu'))
alexnet.add(Dropout(0.5))

# Layer 7
alexnet.add(Dense(4096))
alexnet.add(BatchNormalization())
alexnet.add(Activation('relu'))
alexnet.add(Dropout(0.5))

# Layer 8
alexnet.add(Dense(classes_number))
alexnet.add(BatchNormalization())
alexnet.add(Activation('softmax'))

alexnet.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_12 (Conv2D)           (None, 32, 32, 96)        34944     
_________________________________________________________________
batch_normalization_18 (Batc (None, 32, 32, 96)        384       
_________________________________________________________________
activation_18 (Activation)   (None, 32, 32, 96)        0         
_________________________________________________________________
max_pooling2d_9 (MaxPooling2 (None, 16, 16, 96)        0         
_________________________________________________________________
conv2d_13 (Conv2D)           (None, 16, 16, 256)       614656    
_________________________________________________________________
batch_normalization_19 (Batc (None, 16, 16, 256)       1024      
_________________________________________________________________
activation_19 (Activation)   (None, 16, 16, 256)       0         
__________
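
The parameter counts in the summary can be checked by hand. A standard 2D convolution has one weight per kernel position and input channel plus one bias, for each filter; batch normalization stores four values per channel (gamma, beta, moving mean, moving variance). The helper names below are hypothetical and exist only for this check.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter for a standard 2D convolution."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

def batchnorm_params(channels):
    """Gamma, beta, moving mean and moving variance: four values per channel."""
    return 4 * channels

print(conv2d_params(11, 11, 3, 96))   # 34944, matches conv2d_12
print(batchnorm_params(96))           # 384, matches batch_normalization_18
print(conv2d_params(5, 5, 96, 256))   # 614656, matches conv2d_13
print(batchnorm_params(256))          # 1024, matches batch_normalization_19
```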

In [23]:
#Compile 
alexnet.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

#Train
alexnet.fit(x_train, y_train_categorical, validation_data=(x_test,y_test_categorical), batch_size=1000, epochs=1, verbose=1)

Train on 50000 samples, validate on 10000 samples
Epoch 1/1
50000/50000 [==============================] - 1188s 24ms/step - loss: 1.5467 - acc: 0.4583 - val_loss: 2.8624 - val_acc: 0.2014

<tensorflow.python.keras.callbacks.History at 0x1cf45b67f60>

3. Evaluate your module using the FERPlus dataset (The model with the best test accuracy will present their solution to the class).

    3.1 Download the [FER2013 dataset](https://www.kaggle.com/c/challenges-in-representation-learning-facial-expression-recognition-challenge/data) (images_path).
    
    3.2 Download the [FERPlus labels](https://github.com/Microsoft/FERPlus/blob/master/fer2013new.csv) (labels_path).
    
    3.3 Use the following code snippet to load the dataset, giving the appropriate paths to the csv files downloaded in 3.1 and 3.2:

In [69]:
import pandas as pd
import numpy as np
import cv2

In [11]:
class FERPlus(object):
    """Class for loading FER2013 [1] emotion classification dataset with
    the FERPlus labels [2]:
    [1] kaggle.com/c/challenges-in-representation-learning-facial-\
            expression-recognition-challenge
    [2] github.com/Microsoft/FERPlus"""

    def __init__(self, images_path, labels_path, split='train', image_size=(48, 48),
                 dataset_name='FERPlus'):

        self.split = split
        self.image_size = image_size
        self.dataset_name = dataset_name
        self.images_path = images_path
        self.labels_path = labels_path
        self.class_names = ['neutral', 'happiness', 'surprise', 'sadness',
                            'anger', 'disgust', 'fear', 'contempt']
        self.num_classes = len(self.class_names)
        self.arg_to_name = dict(zip(range(self.num_classes), self.class_names))
        self.name_to_arg = dict(zip(self.class_names, range(self.num_classes)))
        self._split_to_filter = {
            'train': 'Training', 'val': 'PublicTest', 'test': 'PrivateTest'}

    def load_data(self):
        filter_name = self._split_to_filter[self.split]
        pixel_sequences = pd.read_csv(self.images_path)
        pixel_sequences = pixel_sequences[pixel_sequences.Usage == filter_name]
        pixel_sequences = pixel_sequences['pixels'].tolist()
        faces = []
        for pixel_sequence in pixel_sequences:
            face = [float(pixel) for pixel in pixel_sequence.split(' ')]
            face = np.asarray(face).reshape(48, 48)
            faces.append(cv2.resize(face, self.image_size))
        faces = np.asarray(faces)
        faces = np.expand_dims(faces, -1)

        emotions = pd.read_csv(self.labels_path)
        emotions = emotions[emotions.Usage == filter_name]
        emotions = emotions.iloc[:, 2:10].values
        N = np.sum(emotions, axis=1)
        mask = N != 0
        N, faces, emotions = N[mask], faces[mask], emotions[mask]
        emotions = emotions / np.expand_dims(N, 1)
        return faces, emotions
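
The last lines of `load_data` turn the per-image vote counts into probability distributions and drop images that received no votes in the eight emotion columns. The sketch below replays that step on a tiny, made-up vote matrix; the numbers are hypothetical and not taken from the FERPlus file.

```python
import numpy as np

# Hypothetical vote counts: one row per image, one column per emotion class.
votes = np.array([
    [8, 2, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],   # no usable votes: this row gets masked out
    [1, 1, 1, 1, 0, 0, 0, 0],
])

totals = np.sum(votes, axis=1)
mask = totals != 0                      # keep only images with at least one vote
probs = votes[mask] / np.expand_dims(totals[mask], 1)

print(probs.shape)  # (2, 8)
print(probs[0])     # 0.8 for class 0, 0.2 for class 1, 0.0 elsewhere
```

Because the targets are soft distributions rather than one-hot vectors, `categorical_crossentropy` remains a usable loss for them.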

In [19]:
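# Note: split defaults to 'train', so this loads the training split despite the variable name;
# pass split='val' or split='test' to load a held-out split.
validation_data = FERPlus("fer2013\\fer2013.csv", "fer2013new.csv")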
faces, emotions = validation_data.load_data()


In [8]:
#MODEL: own network (AlexNet-inspired), first try
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation, Dropout, Flatten, Conv2D, MaxPooling2D, BatchNormalization

# About CIFAR10:
# The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, 
# with 6000 images per class. There are 50000 training images and 10000 test images. 

#Hyperparameters
img_shape = (32,32,3)
classes_number = 10

ownnet = Sequential()

# Layer 1
# 8x8 kernel; with padding='same' the conv keeps 32x32 and the 2x2 pooling below halves it to 16x16
# How many input neurons? The lecture showed that fewer parameters and more layers are more useful
ownnet.add(Conv2D(classes_number * 3, (8, 8), input_shape=img_shape, padding='same'))
ownnet.add(BatchNormalization())
ownnet.add(Activation('relu'))
ownnet.add(MaxPooling2D(pool_size=(2, 2)))

# Another layer with double the number of filters
ownnet.add(Conv2D(classes_number * 6, (8, 8), padding='same'))
ownnet.add(BatchNormalization())
ownnet.add(Activation('relu'))
ownnet.add(MaxPooling2D(pool_size=(2, 2)))

# Layer 2
ownnet.add(Conv2D(128, (4, 4), padding='same'))
ownnet.add(BatchNormalization())
ownnet.add(Activation('relu'))
ownnet.add(MaxPooling2D(pool_size=(2, 2)))

# Layer 3
ownnet.add(Conv2D(128, (3, 3), padding='same'))
ownnet.add(BatchNormalization())
ownnet.add(Activation('relu'))
ownnet.add(MaxPooling2D(pool_size=(2, 2)))

# Layer 4
ownnet.add(Conv2D(128, (3, 3), padding='same'))
ownnet.add(BatchNormalization())
ownnet.add(Activation('relu'))

# Layer 5
ownnet.add(Conv2D(256, (3, 3), padding='same'))
ownnet.add(BatchNormalization())
ownnet.add(Activation('relu'))
ownnet.add(MaxPooling2D(pool_size=(2, 2)))

# Another layer
ownnet.add(Conv2D(64, (2, 2), padding='same'))
ownnet.add(BatchNormalization())
ownnet.add(Activation('relu'))

ownnet.add(Flatten())

# Layer 6 - fully connected layer
ownnet.add(Dense(1024))
ownnet.add(BatchNormalization())
ownnet.add(Activation('relu'))
ownnet.add(Dropout(0.5))

# Layer 7
ownnet.add(Dense(512))
ownnet.add(BatchNormalization())
ownnet.add(Activation('relu'))
ownnet.add(Dropout(0.5))

# Layer 8
ownnet.add(Dense(classes_number))
ownnet.add(BatchNormalization())
ownnet.add(Activation('softmax'))

ownnet.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_5 (Conv2D)            (None, 32, 32, 30)        5790      
_________________________________________________________________
batch_normalization_8 (Batch (None, 32, 32, 30)        120       
_________________________________________________________________
activation_8 (Activation)    (None, 32, 32, 30)        0         
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 16, 16, 30)        0         
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 16, 16, 60)        115260    
_________________________________________________________________
batch_normalization_9 (Batch (None, 16, 16, 60)        240       
_________________________________________________________________
activation_9 (Activation)    (None, 16, 16, 60)        0         
__________

In [9]:
#Compile 
ownnet.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

#Train
ownnet.fit(x_train, y_train_categorical, validation_data=(x_test,y_test_categorical), batch_size=1000, epochs=1, verbose=1)

Train on 50000 samples, validate on 10000 samples
Epoch 1/1
50000/50000 [==============================] - 287s 6ms/step - loss: 1.7042 - acc: 0.3946 - val_loss: 2.1963 - val_acc: 0.1489

<tensorflow.python.keras.callbacks.History at 0x1cf372bd898>

In [81]:
# Results of the first try were mixed, therefore some adjustments:
# Use Leaky ReLU instead of plain ReLU to prevent dead ReLUs
# Add more filters to the first layer, fewer to the second
# Add another dense layer
# Use fewer pooling layers

#MODEL: own network, second try
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation, Dropout, Flatten, Conv2D, MaxPooling2D, BatchNormalization, LeakyReLU

# About CIFAR10:
# The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, 
# with 6000 images per class. There are 50000 training images and 10000 test images. 

#Hyperparameters
img_shape = (32,32,3)
classes_number = 10

ownnet = Sequential()

# Use a 10x10 kernel combined with a stride of 2 (no padding)
ownnet.add(Conv2D(classes_number * 10, (10, 10), input_shape=img_shape, padding='valid', strides=2))
ownnet.add(BatchNormalization())
ownnet.add(LeakyReLU(alpha=0.01))

# Another layer (80 filters)
ownnet.add(Conv2D(classes_number * 8, (8, 8), padding='same'))
ownnet.add(BatchNormalization())
ownnet.add(LeakyReLU(alpha=0.01))
ownnet.add(MaxPooling2D(pool_size=(2, 2)))

# Layer 2
ownnet.add(Conv2D(256, (6, 6), padding='same'))
ownnet.add(BatchNormalization())
ownnet.add(LeakyReLU(alpha=0.01))
ownnet.add(MaxPooling2D(pool_size=(2, 2)))

# Layer 3
ownnet.add(Conv2D(128, (5, 5), padding='same'))
ownnet.add(BatchNormalization())
ownnet.add(LeakyReLU(alpha=0.01))
ownnet.add(MaxPooling2D(pool_size=(2, 2)))

# Layer 4
ownnet.add(Conv2D(128, (4, 4), padding='same'))
ownnet.add(BatchNormalization())
ownnet.add(LeakyReLU(alpha=0.01))

# Layer 5
ownnet.add(Conv2D(64, (3, 3), padding='same'))
ownnet.add(BatchNormalization())
ownnet.add(LeakyReLU(alpha=0.01))

# Another layer
ownnet.add(Conv2D(64, (2, 2), padding='same'))
ownnet.add(BatchNormalization())
ownnet.add(LeakyReLU(alpha=0.01))
ownnet.add(Flatten())

# Layer 6 - fully connected layer
ownnet.add(Dense(512))
ownnet.add(BatchNormalization())
ownnet.add(LeakyReLU(alpha=0.01))
ownnet.add(Dropout(0.5))

# Layer 7
ownnet.add(Dense(256))
ownnet.add(BatchNormalization())
ownnet.add(LeakyReLU(alpha=0.01))
ownnet.add(Dropout(0.5))

# Layer 8
ownnet.add(Dense(128))
ownnet.add(BatchNormalization())
ownnet.add(LeakyReLU(alpha=0.01))
ownnet.add(Dropout(0.5))

# Layer 9 - output layer
ownnet.add(Dense(classes_number))
ownnet.add(BatchNormalization())
ownnet.add(Activation('softmax'))

ownnet.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_224 (Conv2D)          (None, 12, 12, 100)       30100     
_________________________________________________________________
batch_normalization_301 (Bat (None, 12, 12, 100)       400       
_________________________________________________________________
leaky_re_lu_231 (LeakyReLU)  (None, 12, 12, 100)       0         
_________________________________________________________________
conv2d_225 (Conv2D)          (None, 12, 12, 80)        512080    
_________________________________________________________________
batch_normalization_302 (Bat (None, 12, 12, 80)        320       
_________________________________________________________________
leaky_re_lu_232 (LeakyReLU)  (None, 12, 12, 80)        0         
_________________________________________________________________
max_pooling2d_163 (MaxPoolin (None, 6, 6, 80)          0         
__________
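
The spatial sizes in this summary follow from the usual output-size rules: with `padding='valid'` a convolution yields `floor((n - k) / s) + 1` positions, with `padding='same'` it yields `ceil(n / s)`, and a 2x2 max pooling with its default stride halves the size. The helpers below are hypothetical names written only to check the numbers above.

```python
import math

def conv_output_size(size, kernel, stride=1, padding="valid"):
    """Spatial output size of a convolution along one dimension."""
    if padding == "same":
        return math.ceil(size / stride)
    return (size - kernel) // stride + 1

def pool_output_size(size, pool=2):
    """Non-overlapping max pooling (stride equal to the pool size, no padding)."""
    return (size - pool) // pool + 1

size = conv_output_size(32, 10, stride=2, padding="valid")
print(size)  # 12, matches conv2d_224
size = conv_output_size(size, 8, padding="same")
print(size)  # 12, matches conv2d_225
size = pool_output_size(size)
print(size)  # 6, matches max_pooling2d_163
```

Applying `pool_output_size` twice more gives 3 and then 1, so after the three pooling layers of this network the remaining convolutions operate on a 1x1 feature map.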

In [None]:
#Compile 
ownnet.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

#Train
ownnet.fit(x_train, y_train_categorical, validation_data=(x_test,y_test_categorical), batch_size=1000, epochs=1, verbose=1)

Train on 50000 samples, validate on 10000 samples
Epoch 1/1
 5000/50000 [==>...........................] - ETA: 7:47 - loss: 2.6747 - acc: 0.1000

Train on 50000 samples, validate on 10000 samples
Epoch 1/1
50000/50000 [==============================] - 461s 9ms/step - loss: 2.1342 - acc: 0.2192 - val_loss: 2.3551 - val_acc: 0.1136
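
The second attempt replaces ReLU with Leaky ReLU, and the third attempt further down mixes in ELU. The NumPy sketch below shows how the three functions differ for negative inputs; it illustrates the formulas only and is not the Keras code. `alpha=0.01` is the value used in the notebook for Leaky ReLU, while `alpha=1.0` for ELU is assumed to be the Keras default.

```python
import numpy as np

def relu(x):
    """max(0, x): negative inputs are cut to zero, so their gradient is zero."""
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    """Negative inputs are scaled by a small slope instead of being zeroed."""
    return np.where(x > 0, x, alpha * x)

def elu(x, alpha=1.0):
    """Negative inputs follow alpha * (exp(x) - 1), saturating at -alpha."""
    return np.where(x > 0, x, alpha * np.expm1(x))

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
print(relu(x))
print(leaky_relu(x))
print(elu(x))
```

Only ReLU maps every negative input to exactly zero, which is the "dead ReLU" concern mentioned in the cell comments.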

In [4]:
# Second try results were worse. Changing the first layer: no strides and smaller kernels (to enable edge detection)
# Using ELU instead of (Leaky) ReLU at a few arbitrary points
# Reduced the number of parameters drastically (especially fewer filters in the first few layers)
# Added another filter layer at the start and another dense layer at the end, therefore reduced the size of the dense layers.

#MODEL: own network, third try
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation, Dropout, Flatten, Conv2D, MaxPooling2D, BatchNormalization, LeakyReLU

# About CIFAR10:
# The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, 
# with 6000 images per class. There are 50000 training images and 10000 test images. 

#Hyperparameters
img_shape = (32,32,3)
classes_number = 10

ownnet = Sequential()

# Use smaller kernel to improve potential edge detection 
ownnet.add(Conv2D(classes_number * 10, (3, 3), input_shape=img_shape, padding='same'))
ownnet.add(BatchNormalization())
ownnet.add(LeakyReLU(alpha=0.01))

# Another layer (80 filters)
ownnet.add(Conv2D(classes_number * 8, (5, 5), padding='same'))
ownnet.add(BatchNormalization())
ownnet.add(LeakyReLU(alpha=0.01))
ownnet.add(MaxPooling2D(pool_size=(2, 2)))

# Layer 2
ownnet.add(Conv2D(128, (6, 6), padding='same'))
ownnet.add(BatchNormalization())
ownnet.add(Activation('elu'))
ownnet.add(MaxPooling2D(pool_size=(2, 2)))

# Layer 3
ownnet.add(Conv2D(96, (5, 5), padding='same'))
ownnet.add(BatchNormalization())
ownnet.add(LeakyReLU(alpha=0.01))
ownnet.add(MaxPooling2D(pool_size=(2, 2)))

# Layer 4
ownnet.add(Conv2D(64, (4, 4), padding='same'))
ownnet.add(BatchNormalization())
ownnet.add(LeakyReLU(alpha=0.01))

# Layer 5
ownnet.add(Conv2D(58, (3, 3), padding='same'))
ownnet.add(BatchNormalization())
ownnet.add(Activation('elu'))

# Another layer
ownnet.add(Conv2D(42, (2, 2), padding='same'))
ownnet.add(BatchNormalization())
ownnet.add(LeakyReLU(alpha=0.01))
ownnet.add(Flatten())

# Layer 6 - fully connected layer
ownnet.add(Dense(256))
ownnet.add(BatchNormalization())
ownnet.add(LeakyReLU(alpha=0.01))
ownnet.add(Dropout(0.5))

# Layer 7
ownnet.add(Dense(128))
ownnet.add(BatchNormalization())
ownnet.add(Activation('elu'))
ownnet.add(Dropout(0.5))

# Layer 8
ownnet.add(Dense(64))
ownnet.add(BatchNormalization())
ownnet.add(LeakyReLU(alpha=0.01))
ownnet.add(Dropout(0.5))

# Layer 9
ownnet.add(Dense(32))
ownnet.add(BatchNormalization())
ownnet.add(LeakyReLU(alpha=0.01))
ownnet.add(Dropout(0.5))


# Layer 10 - output layer
ownnet.add(Dense(classes_number))
ownnet.add(BatchNormalization())
ownnet.add(Activation('softmax'))

ownnet.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 32, 32, 100)       2800      
_________________________________________________________________
batch_normalization (BatchNo (None, 32, 32, 100)       400       
_________________________________________________________________
leaky_re_lu (LeakyReLU)      (None, 32, 32, 100)       0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 32, 32, 80)        200080    
_________________________________________________________________
batch_normalization_1 (Batch (None, 32, 32, 80)        320       
_________________________________________________________________
leaky_re_lu_1 (LeakyReLU)    (None, 32, 32, 80)        0         
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 16, 16, 80)        0         
__________

In [5]:
#Compile 
ownnet.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

#Train
ownnet.fit(x_train, y_train_categorical, validation_data=(x_test,y_test_categorical), batch_size=1000, epochs=1, verbose=1)

Train on 50000 samples, validate on 10000 samples
Epoch 1/1
50000/50000 [==============================] - 26171s 523ms/step - loss: 2.2899 - acc: 0.1791 - val_loss: 12.4215 - val_acc: 0.1001

<tensorflow.python.keras.callbacks.History at 0x15dca319ba8>
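
For reading the numbers reported in this notebook, it helps to know the chance level. The CIFAR-10 test set is balanced over 10 classes, so a model that always predicts the same class reaches an accuracy of 0.10, and a model that outputs a uniform distribution has a cross-entropy of ln(10). The short calculation below is a reference point, not a result from the notebook.

```python
import math

num_classes = 10

# Accuracy of a constant prediction on a balanced 10-class test set.
chance_accuracy = 1.0 / num_classes

# Cross-entropy of a uniform prediction: -ln(1 / num_classes).
chance_loss = -math.log(1.0 / num_classes)

print(chance_accuracy)        # 0.1
print(round(chance_loss, 4))  # 2.3026
```

Against this baseline, the validation accuracies of 0.1136 and 0.1001 from the second and third attempts are at or near chance, and the third attempt's training loss of 2.2899 is close to the uniform-prediction value.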

In [20]:
loss = ownnet.evaluate(faces, emotions)
print(loss)

ValueError: Error when checking input: expected conv2d_5_input to have shape (32, 32, 3) but got array with shape (48, 48, 1)
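
The error arises because `ownnet` was built for CIFAR-10 inputs of shape (32, 32, 3), while the `FERPlus` loader returns grayscale faces of shape (48, 48, 1). Since the class accepts an `image_size` argument, passing `image_size=(32, 32)` should take care of the spatial size; the channel mismatch can then be bridged by repeating the single channel three times. The sketch below shows only that channel step on a synthetic array; `gray_to_rgb` and `fake_faces` are hypothetical names.

```python
import numpy as np

def gray_to_rgb(faces):
    """Repeat the single grayscale channel: (N, H, W, 1) -> (N, H, W, 3)."""
    return np.repeat(faces, 3, axis=-1)

# Stand-in for faces loaded with image_size=(32, 32) (assumption, not real data).
fake_faces = np.zeros((5, 32, 32, 1), dtype="float32")
rgb_faces = gray_to_rgb(fake_faces)
print(rgb_faces.shape)  # (5, 32, 32, 3)
```

Matching the input shape is not sufficient on its own: the FERPlus labels have 8 classes, whereas the network ends in a `Dense(classes_number)` layer with 10 outputs, so the model would also need an 8-unit output layer and training on FERPlus before `evaluate` gives a meaningful result.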

In [None]:
Shift + Tab --> Documentation
Tab --> Code completion