# Erasmus Neural Networks
http://michalbereta.pl/nn
## TensorFlow and Keras for MNIST


Keras: https://keras.io/

TensorFlow: https://www.tensorflow.org/

### Note!

Training the example neural networks in this notebook is computationally demanding. If you run into problems, use the attached pretrained models (files \*.hdf5).

### Check your configuration

In [0]:
import tensorflow as tf
import keras as krs

print(tf.__version__)
print(krs.__version__)

1.15.0
2.2.5


Using TensorFlow backend.


## MNIST database


The MNIST database contains a training set of 60,000 scanned images of handwritten digits from 0 to 9 (a classification problem with 10 classes).

The test set contains 10,000 examples.

Each image is 28x28 pixels, which gives 28 * 28 = 784 inputs to the network.

In the machine learning and image recognition community, the MNIST database serves as a kind of `Hello world` problem.


Read more about the MNIST database:

http://yann.lecun.com/exdb/mnist/



## Getting MNIST

The MNIST database can be downloaded in a binary version directly from the website:

http://yann.lecun.com/exdb/mnist/


The MNIST database is also available as CSV files on the website:

https://pjreddie.com/projects/mnist-in-csv/


The most convenient way, however, is to load the MNIST database through the Keras library. On the first import, the database is downloaded automatically (about 12MB) and placed in `~/.keras/datasets/mnist.npz`.

The code below reads the training and test data and displays several sample images.

In [0]:
%matplotlib notebook
import tensorflow as tf
import keras as krs
import numpy as np

from keras.datasets import mnist
import matplotlib.pyplot as plt

(xtrain, ytrain), (xtest, ytest) = mnist.load_data()
print('xtrain.shape',xtrain.shape)
print('ytrain.shape',ytrain.shape)
print('xtest.shape',xtest.shape)
print('ytest.shape',ytest.shape)

#display the first example
plt.imshow(xtest[0,:,:], cmap=plt.get_cmap('gray'))

#display the first few examples
rows = 8
cols = 10
counter = 0

images = None

for i in range(rows):
    current_row = None
    for j in range(cols):
        im = xtest[counter,:,:]
        counter = counter + 1
        if current_row is None:
            current_row = im
        else:
            current_row = np.hstack((current_row, im))
    if images is None:
        images = current_row
    else:
        images = np.vstack((images, current_row))
        
plt.figure()
plt.imshow(images, cmap=plt.get_cmap('gray'))

plt.show()

Downloading data from https://s3.amazonaws.com/img-datasets/mnist.npz
xtrain.shape (60000, 28, 28)
ytrain.shape (60000,)
xtest.shape (10000, 28, 28)
ytest.shape (10000,)



## MLP network with one hidden layer for the MNIST problem

We will check how an MLP neural network with one hidden layer copes with the MNIST problem.

### Imports

In [0]:
import tensorflow as tf
import numpy
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.models import load_model
from keras.utils import np_utils

#If necessary, change the current directory
#import os
#path = '.'
#os.chdir(path)

print(tf.__version__)
print(keras.__version__)

1.15.0
2.2.5


### Loading data

Please note that the data is stored as a 3-dimensional tensor.

In [0]:
(xtrain, ytrain), (xtest, ytest) = mnist.load_data()
print('xtrain.shape',xtrain.shape)
print('ytrain.shape',ytrain.shape)
print('xtest.shape',xtest.shape)
print('ytest.shape',ytest.shape)

xtrain.shape (60000, 28, 28)
ytrain.shape (60000,)
xtest.shape (10000, 28, 28)
ytest.shape (10000,)


### Seed initialization (to allow for repeatability of calculations)

In [0]:
seed = 12345
numpy.random.seed(seed)

### Preparation of input data

The original 28x28 pixel images will be fed into the network input layer as vectors with a length of 784.

In addition, normalizing the pixel values from the interval [0, 255] to the interval [0, 1] has a positive impact on the network learning process.


In [0]:
inputs_num = xtrain.shape[1] * xtrain.shape[2] #number of pixels = number of network inputs
xtrain = xtrain.reshape(xtrain.shape[0], inputs_num).astype('float32')
xtest = xtest.reshape(xtest.shape[0], inputs_num).astype('float32')

# normalize inputs from 0-255 to 0-1
xtrain = xtrain / 255
xtest = xtest / 255

print(xtrain.shape)
print(xtest.shape)

(60000, 784)
(10000, 784)


### Coding of class information (desired responses of the 10 output neurons)

In [0]:
ytrain = np_utils.to_categorical(ytrain)
ytest = np_utils.to_categorical(ytest)
num_classes = ytest.shape[1]
print(ytest.shape)

(10000, 10)
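
The class coding used above is one-hot: digit `k` becomes a vector of length 10 with a 1 at position `k` and zeros elsewhere. The following is a minimal NumPy-only sketch of that idea, not the Keras implementation; the helper name `one_hot` is made up for illustration.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) array with a single 1 per row."""
    labels = np.asarray(labels, dtype=int).ravel()
    encoded = np.zeros((labels.size, num_classes), dtype='float32')
    # put a 1 in the column given by each label
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

example = one_hot([7, 2, 1], num_classes=10)
print(example.shape)          # (3, 10)
print(example[0].argmax())    # 7
```

Each row matches one desired response of the 10 output neurons, which is the form expected by the `categorical_crossentropy` loss.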


### Defining and compiling the model

In [0]:
model = Sequential()
model.add(Dense(500, input_dim=inputs_num, kernel_initializer='normal', activation='relu'))
model.add(Dense(num_classes, kernel_initializer='normal', activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
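
To get a feeling for the size of this model, the number of trainable parameters can be worked out by hand. The sketch below assumes the layer sizes used above (784 inputs, 500 hidden units, 10 outputs); the helper `dense_params` is hypothetical and only does the arithmetic.

```python
def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit in a fully connected layer."""
    return n_inputs * n_units + n_units

hidden = dense_params(784, 500)   # 784 inputs -> 500 hidden units
output = dense_params(500, 10)    # 500 hidden units -> 10 classes
total = hidden + output
print(hidden, output, total)      # 392500 5010 397510
```

Calling `model.summary()` on the compiled model prints per-layer parameter counts that can be checked against these numbers.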








### Saving the best model to a file

During training, we can monitor selected metrics and save the current network model to a file. This is useful for large problems, where training takes a very long time and losing the results after a failure would be costly.

In the following example, we monitor the classification accuracy on the validation set and save the model to a file whenever it is better than in all earlier epochs. Note that the test set doubles as the validation set here, so the error of the best saved model is a somewhat optimistic estimate.

In [0]:
logger = keras.callbacks.ModelCheckpoint('mnist_model_MLP.hdf5', monitor='val_acc', verbose=0, save_best_only=True)


In [0]:
model.fit(xtrain, ytrain, validation_data=(xtest, ytest), epochs=20, batch_size=200, verbose=2, callbacks=[logger])




Train on 60000 samples, validate on 10000 samples
Epoch 1/20





 - 11s - loss: 0.3159 - acc: 0.9122 - val_loss: 0.1622 - val_acc: 0.9517
Epoch 2/20
 - 1s - loss: 0.1295 - acc: 0.9628 - val_loss: 0.1041 - val_acc: 0.9695
Epoch 3/20
 - 1s - loss: 0.0864 - acc: 0.9751 - val_loss: 0.0871 - val_acc: 0.9742
Epoch 4/20
 - 1s - loss: 0.0624 - acc: 0.9823 - val_loss: 0.0732 - val_acc: 0.9789
Epoch 5/20
 - 1s - loss: 0.0460 - acc: 0.9869 - val_loss: 0.0663 - val_acc: 0.9794
Epoch 6/20
 - 1s - loss: 0.0355 - acc: 0.9905 - val_loss: 0.0682 - val_acc: 0.9786
Epoch 7/20
 - 1s - loss: 0.0271 - acc: 0.9926 - val_loss: 0.0609 - val_acc: 0.9792
Epoch 8/20
 - 1s - loss: 0.0212 - acc: 0.9948 - val_loss: 0.0596 - val_acc: 0.9819
Epoch 9/20
 - 1s - loss: 0.0160 - acc: 0.9965 - val_loss: 0.0583 - val_acc: 0.9818
Epoch 10/20
 - 1s - loss: 0.0129 - acc: 0.9973 - val_loss: 0.0613 - val_acc: 0.9809
Epoch 11/20
 - 1

<keras.callbacks.History at 0x7f161dd18e10>
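
The effect of `save_best_only=True` can be illustrated in plain Python. This is only a sketch of the selection rule, not the code of the Keras callback; the accuracies are the `val_acc` values of epochs 1 to 10 copied from the log above (later epochs are not shown there).

```python
# Validation accuracies of epochs 1-10, copied from the training log.
val_acc = [0.9517, 0.9695, 0.9742, 0.9789, 0.9794,
           0.9786, 0.9792, 0.9819, 0.9818, 0.9809]

best = float('-inf')
saved_epochs = []
for epoch, acc in enumerate(val_acc, start=1):
    if acc > best:                    # strictly better than every earlier epoch
        best = acc
        saved_epochs.append(epoch)    # a checkpoint would be written here

print(saved_epochs)   # [1, 2, 3, 4, 5, 8]
print(best)           # 0.9819
```

The file is overwritten each time, so after training it holds the model from the best epoch, which may differ from the model left in memory after the last epoch.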

### Evaluation of the final and best model

In [0]:
scores = model.evaluate(xtest, ytest, verbose=0)
print("Test error: %.2f%%" % (100-scores[1]*100))

#reading the best model from file
model2 = load_model('mnist_model_MLP.hdf5')
scores2 = model2.evaluate(xtest, ytest, batch_size=200)
print('network from file:')
print("Test error: %.2f%%" % (100-scores2[1]*100))

Test error: 1.97%
network from file:
Test error: 1.68%


## The best results for MNIST 

How does our result compare to the best ones?

http://rodrigob.github.io/are_we_there_yet/build/classification_datasets_results.html#4d4e495354

## Convolutional networks

Currently, some of the best models for image analysis are convolutional networks.

They are built from the following elements:

- convolutional layers (sharing weights between neurons)
- MaxPooling layers (reduction of the dimensionality of the problem)
- ReLU type activation functions
- regularization, e.g. by the Dropout method

A popular design is a convolutional network in which convolutional layers with ReLU activation functions alternate with MaxPooling layers, followed by one or more MLP-type layers (designated FC, for `Fully Connected`).

Dropout layers are often added as a regularization strategy to counteract overfitting of the model.

The following pictures are from http://cs231n.github.io/convolutional-networks/

![image.png](attachment:image.png)

###  ReLU activation function

https://en.wikipedia.org/wiki/Rectifier_(neural_networks)

![image.png](attachment:image.png)

![image.png](attachment:image.png)
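
The ReLU function itself is simple: negative inputs are replaced by zero and positive inputs pass through unchanged. A minimal NumPy sketch (the function name `relu` is chosen here for illustration):

```python
import numpy as np

def relu(x):
    """Rectified linear unit: max(0, x), applied element-wise."""
    return np.maximum(0.0, x)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x).tolist())   # [0.0, 0.0, 0.0, 0.5, 2.0]
```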

### Convolutional layers

The convolutional layer is a set of filters (neurons) that scan the input image across all of its channels (three in the example below). Scanning means that the weights of the same neuron are reused at every image position, which reduces the number of weights needed.

The following example has two 3x3 filters, so the output of this convolutional layer will be a new image with two channels (Output Volume).

![image.png](attachment:image.png)

![image.png](attachment:image.png)
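
The scanning described above can be written out as a naive NumPy loop. This is a teaching sketch under simplifying assumptions (no padding, stride 1, random weights), not how Keras implements the layer; the function name `conv2d_valid` is hypothetical. Like most deep learning libraries, it does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
import numpy as np

def conv2d_valid(image, filters, biases):
    """Naive convolutional layer: no padding, stride 1.

    image:   (H, W, C) input volume
    filters: (kh, kw, C, F) weights, one (kh, kw, C) kernel per filter
    biases:  (F,) one bias per filter
    returns: (H - kh + 1, W - kw + 1, F) output volume
    """
    H, W, C = image.shape
    kh, kw, _, F = filters.shape
    out = np.zeros((H - kh + 1, W - kw + 1, F))
    for f in range(F):
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                patch = image[i:i + kh, j:j + kw, :]
                # the same kernel (shared weights) is applied at every position
                out[i, j, f] = np.sum(patch * filters[:, :, :, f]) + biases[f]
    return out

rng = np.random.RandomState(0)
image = rng.rand(5, 5, 3)          # 5x5 input with three channels
filters = rng.randn(3, 3, 3, 2)    # two 3x3 filters spanning all three channels
biases = np.zeros(2)

output = conv2d_valid(image, filters, biases)
print(output.shape)                 # (3, 3, 2)
print(filters.size + biases.size)   # 56 trainable parameters
```

Two filters give an output with two channels, and the parameter count depends only on the filter size and channel counts, not on the image size.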

###  Max Pooling Layers

The Max Pooling layers are designed to reduce the dimensionality of the data passed to subsequent layers. From each selected area (e.g. 2x2 pixels), the maximum value is taken.

Please note that pooling does not change the "depth" (number of channels).

For example:

![image.png](attachment:image.png)
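
As a sketch of the same operation in NumPy, the following function applies 2x2 max pooling with stride 2 to a small volume. It assumes even height and width; the name `max_pool_2x2` is made up for this example.

```python
import numpy as np

def max_pool_2x2(volume):
    """2x2 max pooling with stride 2 on an (H, W, C) volume; H and W must be even."""
    H, W, C = volume.shape
    # split rows and columns into pairs, then take the maximum over each pair
    blocks = volume.reshape(H // 2, 2, W // 2, 2, C)
    return blocks.max(axis=(1, 3))

x = np.arange(16, dtype='float32').reshape(4, 4, 1)
pooled = max_pool_2x2(x)
print(pooled.shape)                 # (2, 2, 1)
print(pooled[:, :, 0].tolist())     # [[5.0, 7.0], [13.0, 15.0]]
```

The width and height are halved, while the number of channels stays the same.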

## Implementation of a simple convolutional network with Keras 

The designed network will have only one convolutional layer, followed by one hidden MLP layer. The output of the entire network will be a further dense layer with `softmax` activation.
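
The `softmax` output turns the raw scores of the last layer into values that are positive and sum to one, so they can be read as class probabilities. A minimal NumPy sketch, with example scores invented for illustration:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    shifted = z - np.max(z, axis=-1, keepdims=True)   # avoids overflow in exp
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

scores = np.array([2.0, 1.0, 0.1])
probs = softmax(scores)
print(probs.sum())      # 1 up to floating-point rounding
print(probs.argmax())   # 0, the largest score gets the largest probability
```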


### Imports and data loading

In [0]:
import tensorflow as tf
import numpy as np
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Flatten
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.models import load_model
from keras.utils import np_utils

#import os
#path = '.'
#os.chdir(path)

print(tf.__version__)
print(keras.__version__)

(xtrain, ytrain), (xtest, ytest) = mnist.load_data()
print('xtrain.shape',xtrain.shape)
print('ytrain.shape',ytrain.shape)
print('xtest.shape',xtest.shape)
print('ytest.shape',ytest.shape)


seed = 12345
np.random.seed(seed)

1.15.0
2.2.5
xtrain.shape (60000, 28, 28)
ytrain.shape (60000,)
xtest.shape (10000, 28, 28)
ytest.shape (10000,)


### Data preparation

Please note that the network inputs are now images, not one-dimensional vectors. The convolutional layer expects data with the dimensions `(height, width, number of channels)`. In our MNIST example, the images are 28x28x1 (one channel, monochrome images).

Please note the appropriate use of the `reshape` function.

It is also possible to supply the input data in the form `(number of channels, height, width)` by setting `data_format='channels_first'`.
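
The two layouts differ only in where the channel axis sits. The sketch below uses a zero-filled stand-in array instead of the real MNIST data and shows both shapes, including the leading batch dimension.

```python
import numpy as np

batch = np.zeros((4, 28, 28), dtype='float32')   # stand-in for four MNIST images

channels_last = np.expand_dims(batch, axis=-1)        # channel axis at the end
channels_first = np.moveaxis(channels_last, -1, 1)    # channel axis after the batch axis

print(channels_last.shape)    # (4, 28, 28, 1)
print(channels_first.shape)   # (4, 1, 28, 28)
```

Whichever layout is used for the data has to match the `data_format` argument of the convolutional and pooling layers.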

In [0]:
xtrain = xtrain.reshape(xtrain.shape[0], 28, 28, 1).astype('float32')
xtest = xtest.reshape(xtest.shape[0], 28, 28, 1).astype('float32')

# normalize inputs from 0-255 to 0-1
xtrain = xtrain / 255
xtest = xtest / 255

# one hot encode outputs
ytrain = np_utils.to_categorical(ytrain)
ytest = np_utils.to_categorical(ytest)
num_classes = ytest.shape[1]

### Defining and compiling the model

The first layer is a convolution layer consisting of 32 filters with a size of 5x5 and a ReLU activation function.

Note that the data format is specified explicitly as

`data_format='channels_last'`

that is, the data should have the format `(height, width, number of channels)`, in this example 28x28x1.

The `Flatten` layer converts multidimensional data into a one-dimensional vector, so that it can be fed into the following MLP layers.

The `Dropout(0.2)` layer means that in each training step a random 20% of its inputs are set to zero. This is intended to prevent the network from overfitting.
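
The idea behind dropout can be sketched in a few lines of NumPy. This is the common "inverted dropout" formulation, written here as an illustration rather than as the Keras code: the surviving values are scaled up so that the expected sum stays the same. The function name and the example data are assumptions.

```python
import numpy as np

def dropout(activations, rate, rng):
    """Inverted dropout: zero a random fraction `rate`, rescale the rest."""
    keep_mask = rng.rand(*activations.shape) >= rate
    return activations * keep_mask / (1.0 - rate)

rng = np.random.RandomState(12345)
activations = np.ones(1000)
dropped = dropout(activations, rate=0.2, rng=rng)

print(np.unique(dropped))       # two values: 0.0 and approximately 1.25
print((dropped == 0).mean())    # close to 0.2
```

Dropout is only active during training; when the model is evaluated or used for prediction, all values are passed through unchanged.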

In [0]:
model = Sequential()
model.add(Conv2D(32, (5, 5), input_shape=(28, 28, 1), activation='relu', data_format='channels_last'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])






### Training


In [0]:
logger = keras.callbacks.ModelCheckpoint('mnist_model_CONV_SIMPLE.hdf5', monitor='val_acc', verbose=0, save_best_only=True)

# Fit the model
model.fit(xtrain, ytrain, validation_data=(xtest, ytest), epochs=20, batch_size=200, verbose=2, callbacks=[logger])

Train on 60000 samples, validate on 10000 samples
Epoch 1/20
 - 7s - loss: 0.2493 - acc: 0.9276 - val_loss: 0.0864 - val_acc: 0.9739
Epoch 2/20
 - 2s - loss: 0.0769 - acc: 0.9774 - val_loss: 0.0471 - val_acc: 0.9850
Epoch 3/20
 - 2s - loss: 0.0541 - acc: 0.9836 - val_loss: 0.0453 - val_acc: 0.9854
Epoch 4/20
 - 2s - loss: 0.0436 - acc: 0.9867 - val_loss: 0.0384 - val_acc: 0.9873
Epoch 5/20
 - 2s - loss: 0.0348 - acc: 0.9888 - val_loss: 0.0333 - val_acc: 0.9891
Epoch 6/20
 - 2s - loss: 0.0283 - acc: 0.9917 - val_loss: 0.0316 - val_acc: 0.9897
Epoch 7/20
 - 2s - loss: 0.0231 - acc: 0.9923 - val_loss: 0.0367 - val_acc: 0.9874
Epoch 8/20
 - 2s - loss: 0.0197 - acc: 0.9940 - val_loss: 0.0357 - val_acc: 0.9886
Epoch 9/20
 - 2s - loss: 0.0158 - acc: 0.9951 - val_loss: 0.0296 - val_acc: 0.9910
Epoch 10/20
 - 2s - loss: 0.0147 - acc: 0.9952 - val_loss: 0.0308 - val_acc: 0.9911
Epoch 11/20
 - 2s - loss: 0.0135 - acc: 0.9956 - val_loss: 0.0307 - val_acc: 0.9899
Epoch 12/20
 - 2s - loss: 0.0112 - 

<keras.callbacks.History at 0x7f15fe7cd3c8>

### Testing

In [0]:
scores = model.evaluate(xtest, ytest, verbose=0)
print("Test error: %.2f%%" % (100-scores[1]*100))

#Best model
model2 = load_model('mnist_model_CONV_SIMPLE.hdf5')
scores2 = model2.evaluate(xtest, ytest, batch_size=200)
print('The best network from file:')
print("Test error: %.2f%%" % (100-scores2[1]*100))

Test error: 0.91%
The best network from file:
Test error: 0.89%


## A bigger convolutional network

Compare the previous results with those of the following network. Study its architecture.

In [0]:
import tensorflow as tf
import numpy as np
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Flatten
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.models import load_model
from keras.utils import np_utils

#import os
#path = '.'
#os.chdir(path)

print(tf.__version__)
print(keras.__version__)

(xtrain, ytrain), (xtest, ytest) = mnist.load_data()
print('xtrain.shape',xtrain.shape)
print('ytrain.shape',ytrain.shape)
print('xtest.shape',xtest.shape)
print('ytest.shape',ytest.shape)

# fix random seed
seed = 12345
np.random.seed(seed)


xtrain = xtrain.reshape(xtrain.shape[0], 28, 28, 1).astype('float32')
xtest = xtest.reshape(xtest.shape[0], 28, 28, 1).astype('float32')

# normalize inputs from 0-255 to 0-1
xtrain = xtrain / 255
xtest = xtest / 255

# one hot encode outputs
ytrain = np_utils.to_categorical(ytrain)
ytest = np_utils.to_categorical(ytest)
num_classes = ytest.shape[1]

# define model
model = Sequential()
model.add(Conv2D(30, (5, 5), input_shape=(28, 28, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(15, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))


model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

logger = keras.callbacks.ModelCheckpoint('mnist_model_CONV_BIGGER.hdf5', monitor='val_acc', verbose=0, save_best_only=True)

# Fit the model
model.fit(xtrain, ytrain, validation_data=(xtest, ytest), epochs=20, batch_size=200, verbose=2, callbacks=[logger])

# Final evaluation of the model
scores = model.evaluate(xtest, ytest, verbose=0)
print("Test error: %.2f%%" % (100-scores[1]*100))

#Best model
model2 = load_model('mnist_model_CONV_BIGGER.hdf5')
scores2 = model2.evaluate(xtest, ytest, batch_size=200)
print('The best network from the file:')
print("Test error: %.2f%%" % (100-scores2[1]*100))

print('end')

1.15.0
2.2.5
xtrain.shape (60000, 28, 28)
ytrain.shape (60000,)
xtest.shape (10000, 28, 28)
ytest.shape (10000,)
Train on 60000 samples, validate on 10000 samples
Epoch 1/20
 - 3s - loss: 0.3860 - acc: 0.8786 - val_loss: 0.0824 - val_acc: 0.9735
Epoch 2/20
 - 2s - loss: 0.0972 - acc: 0.9702 - val_loss: 0.0504 - val_acc: 0.9844
Epoch 3/20
 - 2s - loss: 0.0675 - acc: 0.9789 - val_loss: 0.0376 - val_acc: 0.9889
Epoch 4/20
 - 2s - loss: 0.0564 - acc: 0.9827 - val_loss: 0.0302 - val_acc: 0.9911
Epoch 5/20
 - 2s - loss: 0.0462 - acc: 0.9851 - val_loss: 0.0275 - val_acc: 0.9919
Epoch 6/20
 - 2s - loss: 0.0410 - acc: 0.9870 - val_loss: 0.0233 - val_acc: 0.9927
Epoch 7/20
 - 2s - loss: 0.0353 - acc: 0.9890 - val_loss: 0.0292 - val_acc: 0.9908
Epoch 8/20
 - 2s - loss: 0.0323 - acc: 0.9897 - val_loss: 0.0255 - val_acc: 0.9909
Epoch 9/20
 - 2s - loss: 0.0304 - acc: 0.9899 - val_loss: 0.0228 - val_acc: 0.9929
Epoch 10/20
 - 2s - loss: 0.0283 - acc: 0.9907 - val_loss: 0.0233 - val_acc: 0.9922
Epoch 
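
When comparing the two convolutional architectures, it helps to trace the tensor sizes and count the trainable parameters by hand. The sketch below assumes the Keras defaults used in this notebook (no padding, stride 1 for convolutions, pooling stride equal to the pool size); the helper functions are hypothetical and only do the arithmetic. `Dropout` and `Flatten` have no parameters.

```python
def conv_out(size, kernel):
    """Spatial size after a convolution with no padding and stride 1."""
    return size - kernel + 1

def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit."""
    return n_inputs * n_units + n_units

# Simple network: Conv2D(32, 5x5) -> MaxPooling 2x2 -> Dense(128) -> Dense(10)
side = conv_out(28, 5) // 2                  # 24 after conv, 12 after pooling
simple = (conv_params(5, 1, 32)
          + dense_params(side * side * 32, 128)
          + dense_params(128, 10))

# Bigger network: Conv2D(30, 5x5) -> pool -> Conv2D(15, 3x3) -> pool
#                 -> Dense(128) -> Dense(50) -> Dense(10)
side = conv_out(28, 5) // 2                  # 12
side = conv_out(side, 3) // 2                # 10 after conv, 5 after pooling
bigger = (conv_params(5, 1, 30)
          + conv_params(3, 30, 15)
          + dense_params(side * side * 15, 128)
          + dense_params(128, 50)
          + dense_params(50, 10))

print(simple)   # 592074
print(bigger)   # 59933
```

Under these assumptions the "bigger" network is deeper but has roughly ten times fewer parameters, because the second pooling stage shrinks the input of the first dense layer from 4608 to 375 values.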

## Comparing the models

In case of problems with training, the attached files

- `_mnist_model_MLP.hdf5`
- `_mnist_model_CONV_SIMPLE.hdf5`
- `_mnist_model_CONV_BIGGER.hdf5`

contain previously trained models.

Use the following script to compare their performance. How far are these models from the best known results?

In [0]:
import tensorflow as tf
import numpy as np
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Flatten
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.models import load_model
from keras.utils import np_utils

#import os
#path = '.'
#os.chdir(path)

print(tf.__version__)
print(keras.__version__)

(xtrain, ytrain), (xtest, ytest) = mnist.load_data()
print('xtrain.shape',xtrain.shape)
print('ytrain.shape',ytrain.shape)
print('xtest.shape',xtest.shape)
print('ytest.shape',ytest.shape)

xtrain = xtrain.reshape(xtrain.shape[0], 784).astype('float32')
xtest = xtest.reshape(xtest.shape[0], 784).astype('float32')

# normalize inputs from 0-255 to 0-1
xtrain = xtrain / 255
xtest = xtest / 255

# one hot encode outputs
ytrain = np_utils.to_categorical(ytrain)
ytest = np_utils.to_categorical(ytest)

#Compare the networks

#MLP
model = load_model('mnist_model_MLP.hdf5')
scores = model.evaluate(xtest, ytest, batch_size=200)
print("Test error (Model MLP): %.2f%%" % (100-scores[1]*100))


xtrain = xtrain.reshape(xtrain.shape[0], 28, 28, 1).astype('float32')
xtest = xtest.reshape(xtest.shape[0], 28, 28, 1).astype('float32')

#Smaller conv net
model = load_model('mnist_model_CONV_SIMPLE.hdf5')
scores = model.evaluate(xtest, ytest, batch_size=200)
print("Test error (Model CONV SIMPLE): %.2f%%" % (100-scores[1]*100))

#Bigger conv net
model = load_model('mnist_model_CONV_BIGGER.hdf5')
scores = model.evaluate(xtest, ytest, batch_size=200)
print("Test error (Model CONV BIGGER): %.2f%%" % (100-scores[1]*100))

print('end')

1.15.0
2.2.5
xtrain.shape (60000, 28, 28)
ytrain.shape (60000,)
xtest.shape (10000, 28, 28)
ytest.shape (10000,)
Test error (Model MLP): 1.68%
Test error (Model CONV SIMPLE): 0.89%
Test error (Model CONV BIGGER): 0.67%
end
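
Percentages this small are easier to judge as counts. The following sketch converts the test errors printed above into the number of misclassified images, given that the MNIST test set has 10,000 examples.

```python
test_size = 10000
errors_percent = {'MLP': 1.68, 'CONV SIMPLE': 0.89, 'CONV BIGGER': 0.67}

counts = {}
for name, err in errors_percent.items():
    counts[name] = round(err / 100 * test_size)   # misclassified test images
    print(name, counts[name])
```

So the step from the MLP to the bigger convolutional network corresponds to about a hundred fewer misclassified digits out of 10,000.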


### Task 1

- Prepare and train a convolutional neural network on the CIFAR-10 database.
- Try different network architectures.
- Compare and report the results.


#### YOUR DESCRIPTION AND COMMENTS

### **Getting CIFAR-10**

In [0]:
%matplotlib notebook
import tensorflow as tf
import keras as krs
import numpy as np

from keras.datasets import cifar10
import matplotlib.pyplot as plt

(xtrain, ytrain), (xtest, ytest) = cifar10.load_data()
print('xtrain.shape',xtrain.shape)
print('ytrain.shape',ytrain.shape)
print('xtest.shape',xtest.shape)
print('ytest.shape',ytest.shape)

#display the first example
plt.imshow(xtest[0,:,:], cmap=plt.get_cmap('gray'))

#display the first few examples
rows = 8
cols = 10
counter = 0

images = None

for i in range(rows):
    current_row = None
    for j in range(cols):
        im = xtest[counter,:,:]
        counter = counter + 1
        if current_row is None:
            current_row = im
        else:
            current_row = np.hstack((current_row, im))
    if images is None:
        images = current_row
    else:
        images = np.vstack((images, current_row))
        
plt.figure()
plt.imshow(images, cmap=plt.get_cmap('gray'))

plt.show()

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
xtrain.shape (50000, 32, 32, 3)
ytrain.shape (50000, 1)
xtest.shape (10000, 32, 32, 3)
ytest.shape (10000, 1)
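
The nested loops that build the image grid can also be replaced by a single reshape and transpose. This is an alternative sketch, not the code used above; it is shown on a random stand-in array with the CIFAR-10 image shape, and the helper name `image_grid` is made up.

```python
import numpy as np

def image_grid(images, rows, cols):
    """Tile the first rows*cols images of shape (H, W, C) into one (rows*H, cols*W, C) array."""
    n, h, w, c = images.shape
    tiles = images[:rows * cols].reshape(rows, cols, h, w, c)
    # bring the two row-like axes together, then the two column-like axes
    return tiles.transpose(0, 2, 1, 3, 4).reshape(rows * h, cols * w, c)

# Random stand-in with the CIFAR-10 test image shape; replace with xtest.
fake = np.random.RandomState(0).randint(0, 256, size=(80, 32, 32, 3), dtype='uint8')
grid = image_grid(fake, rows=8, cols=10)
print(grid.shape)   # (256, 320, 3)
```

The result can be passed to `plt.imshow(grid)` directly. For RGB arrays matplotlib ignores the `cmap` argument, so the `gray` colormap has no effect on CIFAR-10 images.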



### **MLP network with one hidden layer for the CIFAR-10 problem**

In [0]:
import tensorflow as tf
import numpy
import keras
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.models import load_model
from keras.utils import np_utils

#If necessary, change the current directory
#import os
#path = '.'
#os.chdir(path)

print(tf.__version__)
print(keras.__version__)

(xtrain, ytrain), (xtest, ytest) = cifar10.load_data()
print('xtrain.shape',xtrain.shape)
print('ytrain.shape',ytrain.shape)
print('xtest.shape',xtest.shape)
print('ytest.shape',ytest.shape)

seed = 12345
numpy.random.seed(seed)

inputs_num = xtrain.shape[1] * xtrain.shape[2] * xtrain.shape[3] #number of pixel values (32*32*3) = number of network inputs
xtrain = xtrain.reshape(xtrain.shape[0], inputs_num).astype('float32')
xtest = xtest.reshape(xtest.shape[0], inputs_num).astype('float32')

# normalize inputs from 0-255 to 0-1
xtrain = xtrain / 255
xtest = xtest / 255

print(xtrain.shape)
print(xtest.shape)

ytrain = np_utils.to_categorical(ytrain)
ytest = np_utils.to_categorical(ytest)
num_classes = ytest.shape[1]
print(ytest.shape)

model = Sequential()
model.add(Dense(500, input_dim=inputs_num, kernel_initializer='normal', activation='relu'))
model.add(Dense(num_classes, kernel_initializer='normal', activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

logger = keras.callbacks.ModelCheckpoint('cifar10_model_MLP.hdf5', monitor='val_acc', verbose=0, save_best_only=True)

model.fit(xtrain, ytrain, validation_data=(xtest, ytest), epochs=20, batch_size=200, verbose=2, callbacks=[logger])

scores = model.evaluate(xtest, ytest, verbose=0)
print("Test error: %.2f%%" % (100-scores[1]*100))

#reading the best model from file
model2 = load_model('cifar10_model_MLP.hdf5')
scores2 = model2.evaluate(xtest, ytest, batch_size=200)
print('network from file:')
print("Test error: %.2f%%" % (100-scores2[1]*100))

1.15.0
2.2.5
xtrain.shape (50000, 32, 32, 3)
ytrain.shape (50000, 1)
xtest.shape (10000, 32, 32, 3)
ytest.shape (10000, 1)
(50000, 3072)
(10000, 3072)
(10000, 10)
Train on 50000 samples, validate on 10000 samples
Epoch 1/20
 - 2s - loss: 1.9138 - acc: 0.3307 - val_loss: 1.7884 - val_acc: 0.3607
Epoch 2/20
 - 2s - loss: 1.6905 - acc: 0.4029 - val_loss: 1.6439 - val_acc: 0.4093
Epoch 3/20
 - 2s - loss: 1.6220 - acc: 0.4290 - val_loss: 1.6145 - val_acc: 0.4197
Epoch 4/20
 - 2s - loss: 1.5745 - acc: 0.4462 - val_loss: 1.5741 - val_acc: 0.4360
Epoch 5/20
 - 2s - loss: 1.5303 - acc: 0.4595 - val_loss: 1.5159 - val_acc: 0.4633
Epoch 6/20
 - 1s - loss: 1.4898 - acc: 0.4736 - val_loss: 1.5189 - val_acc: 0.4551
Epoch 7/20
 - 2s - loss: 1.4707 - acc: 0.4800 - val_loss: 1.4809 - val_acc: 0.4745
Epoch 8/20
 - 2s - loss: 1.4438 - acc: 0.4919 - val_loss: 1.4757 - val_acc: 0.4767
Epoch 9/20
 - 2s - loss: 1.4221 - acc: 0.4993 - val_loss: 1.4596 - val_acc: 0.4797
Epoch 10/20
 - 1s - loss: 1.3995 - acc: 

### **Simple CNN in the CIFAR-10 problem**

In [0]:
import tensorflow as tf
import numpy as np
import keras
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Flatten
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.models import load_model
from keras.utils import np_utils

#import os
#path = '.'
#os.chdir(path)

print(tf.__version__)
print(keras.__version__)

(xtrain, ytrain), (xtest, ytest) = cifar10.load_data()
print('xtrain.shape',xtrain.shape)
print('ytrain.shape',ytrain.shape)
print('xtest.shape',xtest.shape)
print('ytest.shape',ytest.shape)

seed = 12345
np.random.seed(seed)

xtrain = xtrain.reshape(xtrain.shape[0], 32, 32, 3).astype('float32')
xtest = xtest.reshape(xtest.shape[0], 32, 32, 3).astype('float32')

# normalize inputs from 0-255 to 0-1
xtrain = xtrain / 255
xtest = xtest / 255

# one hot encode outputs
ytrain = np_utils.to_categorical(ytrain)
ytest = np_utils.to_categorical(ytest)
num_classes = ytest.shape[1]

model = Sequential()
model.add(Conv2D(32, (5, 5), input_shape=(32, 32, 3), activation='relu', data_format='channels_last'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

logger = keras.callbacks.ModelCheckpoint('cifar10_model_CONV_SIMPLE.hdf5', monitor='val_acc', verbose=0, save_best_only=True)

# Fit the model
model.fit(xtrain, ytrain, validation_data=(xtest, ytest), epochs=20, batch_size=200, verbose=2, callbacks=[logger])

scores = model.evaluate(xtest, ytest, verbose=0)
print("Test error: %.2f%%" % (100-scores[1]*100))

#Best model
model2 = load_model('cifar10_model_CONV_SIMPLE.hdf5')
scores2 = model2.evaluate(xtest, ytest, batch_size=200)
print('The best network from file:')
print("Test error: %.2f%%" % (100-scores2[1]*100))

1.15.0
2.2.5
xtrain.shape (50000, 32, 32, 3)
ytrain.shape (50000, 1)
xtest.shape (10000, 32, 32, 3)
ytest.shape (10000, 1)
Train on 50000 samples, validate on 10000 samples
Epoch 1/20
 - 4s - loss: 1.6285 - acc: 0.4218 - val_loss: 1.3700 - val_acc: 0.5143
Epoch 2/20
 - 2s - loss: 1.2834 - acc: 0.5492 - val_loss: 1.2258 - val_acc: 0.5629
Epoch 3/20
 - 2s - loss: 1.1647 - acc: 0.5918 - val_loss: 1.1481 - val_acc: 0.5995
Epoch 4/20
 - 3s - loss: 1.0884 - acc: 0.6216 - val_loss: 1.1075 - val_acc: 0.6129
Epoch 5/20
 - 2s - loss: 1.0262 - acc: 0.6407 - val_loss: 1.0636 - val_acc: 0.6300
Epoch 6/20
 - 2s - loss: 0.9899 - acc: 0.6550 - val_loss: 1.0420 - val_acc: 0.6349
Epoch 7/20
 - 2s - loss: 0.9387 - acc: 0.6735 - val_loss: 1.0135 - val_acc: 0.6480
Epoch 8/20
 - 2s - loss: 0.9042 - acc: 0.6870 - val_loss: 1.0166 - val_acc: 0.6456
Epoch 9/20
 - 3s - loss: 0.8634 - acc: 0.7007 - val_loss: 0.9932 - val_acc: 0.6600
Epoch 10/20
 - 2s - loss: 0.8408 - acc: 0.7068 - val_loss: 0.9917 - val_acc: 0.6

### **Bigger CNN in the CIFAR-10 problem**

In [0]:
import tensorflow as tf
import numpy as np
import keras
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Flatten
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.models import load_model
from keras.utils import np_utils

#import os
#path = '.'
#os.chdir(path)

print(tf.__version__)
print(keras.__version__)

(xtrain, ytrain), (xtest, ytest) = cifar10.load_data()
print('xtrain.shape',xtrain.shape)
print('ytrain.shape',ytrain.shape)
print('xtest.shape',xtest.shape)
print('ytest.shape',ytest.shape)

# fix random seed
seed = 12345
np.random.seed(seed)


xtrain = xtrain.reshape(xtrain.shape[0], 32, 32, 3).astype('float32')
xtest = xtest.reshape(xtest.shape[0], 32, 32, 3).astype('float32')

# normalize inputs from 0-255 to 0-1
xtrain = xtrain / 255
xtest = xtest / 255

# one hot encode outputs
ytrain = np_utils.to_categorical(ytrain)
ytest = np_utils.to_categorical(ytest)
num_classes = ytest.shape[1]

# define model
model = Sequential()
model.add(Conv2D(30, (5, 5), input_shape=(32, 32, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(15, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))


model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

logger = keras.callbacks.ModelCheckpoint('cifar10_model_CONV_BIGGER.hdf5', monitor='val_acc', verbose=0, save_best_only=True)

# Fit the model
model.fit(xtrain, ytrain, validation_data=(xtest, ytest), epochs=20, batch_size=200, verbose=2, callbacks=[logger])

# Final evaluation of the model
scores = model.evaluate(xtest, ytest, verbose=0)
print("Test error: %.2f%%" % (100-scores[1]*100))

#Best model
model2 = load_model('cifar10_model_CONV_BIGGER.hdf5')
scores2 = model2.evaluate(xtest, ytest, batch_size=200)
print('The best network from the file:')
print("Test error: %.2f%%" % (100-scores2[1]*100))

print('end')

1.15.0
2.2.5
xtrain.shape (50000, 32, 32, 3)
ytrain.shape (50000, 1)
xtest.shape (10000, 32, 32, 3)
ytest.shape (10000, 1)
Train on 50000 samples, validate on 10000 samples
Epoch 1/20
 - 4s - loss: 1.7604 - acc: 0.3539 - val_loss: 1.4921 - val_acc: 0.4536
Epoch 2/20
 - 3s - loss: 1.4500 - acc: 0.4720 - val_loss: 1.3755 - val_acc: 0.5070
Epoch 3/20
 - 3s - loss: 1.3407 - acc: 0.5138 - val_loss: 1.2641 - val_acc: 0.5451
Epoch 4/20
 - 2s - loss: 1.2554 - acc: 0.5517 - val_loss: 1.1919 - val_acc: 0.5727
Epoch 5/20
 - 3s - loss: 1.2007 - acc: 0.5706 - val_loss: 1.1702 - val_acc: 0.5853
Epoch 6/20
 - 3s - loss: 1.1415 - acc: 0.5929 - val_loss: 1.1076 - val_acc: 0.6115
Epoch 7/20
 - 3s - loss: 1.0953 - acc: 0.6112 - val_loss: 1.0822 - val_acc: 0.6143
Epoch 8/20
 - 3s - loss: 1.0622 - acc: 0.6257 - val_loss: 1.0316 - val_acc: 0.6357
Epoch 9/20
 - 2s - loss: 1.0274 - acc: 0.6359 - val_loss: 1.0229 - val_acc: 0.6400
Epoch 10/20
 - 2s - loss: 0.9953 - acc: 0.6457 - val_loss: 1.0141 - val_acc: 0.6

**Comparing the results**

In [0]:
import tensorflow as tf
import numpy as np
import keras
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Flatten
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.models import load_model
from keras.utils import np_utils

#import os
#path = '.'
#os.chdir(path)

print(tf.__version__)
print(keras.__version__)

(xtrain, ytrain), (xtest, ytest) = cifar10.load_data()
print('xtrain.shape',xtrain.shape)
print('ytrain.shape',ytrain.shape)
print('xtest.shape',xtest.shape)
print('ytest.shape',ytest.shape)

inputs_num = xtrain.shape[1] * xtrain.shape[2] * xtrain.shape[3] #number of pixel values (32*32*3) = number of network inputs
xtrain = xtrain.reshape(xtrain.shape[0], inputs_num).astype('float32')
xtest = xtest.reshape(xtest.shape[0], inputs_num).astype('float32')

# normalize inputs from 0-255 to 0-1
xtrain = xtrain / 255
xtest = xtest / 255

# one hot encode outputs
ytrain = np_utils.to_categorical(ytrain)
ytest = np_utils.to_categorical(ytest)

#Compare the networks

#MLP
model = load_model('cifar10_model_MLP.hdf5')
scores = model.evaluate(xtest, ytest, batch_size=200)
print("Test error (Model MLP): %.2f%%" % (100-scores[1]*100))


xtrain = xtrain.reshape(xtrain.shape[0], 32, 32, 3).astype('float32')
xtest = xtest.reshape(xtest.shape[0], 32, 32, 3).astype('float32')

#Smaller conv net
model = load_model('cifar10_model_CONV_SIMPLE.hdf5')
scores = model.evaluate(xtest, ytest, batch_size=200)
print("Test error (Model CONV SIMPLE): %.2f%%" % (100-scores[1]*100))

#Bigger conv net
model = load_model('cifar10_model_CONV_BIGGER.hdf5')
scores = model.evaluate(xtest, ytest, batch_size=200)
print("Test error (Model CONV BIGGER): %.2f%%" % (100-scores[1]*100))

print('end')

1.15.0
2.2.5
xtrain.shape (50000, 32, 32, 3)
ytrain.shape (50000, 1)
xtest.shape (10000, 32, 32, 3)
ytest.shape (10000, 1)
Test error (Model MLP): 48.55%
Test error (Model CONV SIMPLE): 32.68%
Test error (Model CONV BIGGER): 30.70%
end


### Task 2

NOT OBLIGATORY, DO IT ONLY IF YOU WANT!

- Collect your own images from different categories and train different networks to recognize them.

- Provide the best models saved in files together with a script to load and test them.

#### YOUR DESCRIPTION AND COMMENTS

In [0]:
#YOUR CODE HERE