# Cifar10 classification with and without normalization

In this notebook you will download the Cifar10 dataset, which contains quite small color images (32x32x3) of 10 classes. The data is from the Canadian Institute For Advanced Research. You will plot examples of the images with their class labels. Note that because the images are so small, it is not always easy to recognise which of the ten classes is shown, even for a human. After loading the dataset you will train a Convolutional Neural Network (CNN) to predict the classes of the test dataset. You will train the network twice: once with normalized data and once without.


**Dataset:**  You work with the Cifar10 dataset. You have 60'000 32x32 pixel color images of 10 classes ("airplane","automobile","bird","cat","deer","dog","frog","horse","ship","truck")

**Content:**
* load the original cifar10 data and create a train, validation and test dataset
* visualize samples of cifar10 dataset
* use keras to train a CNN with the normalized and the unnormalized version of the data
* check if the normalization has an impact on the test performance of the model



#### Imports


In [None]:
# load required libraries:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
plt.style.use('default')
from sklearn.metrics import confusion_matrix

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Convolution2D, MaxPooling2D, Flatten , Activation
from tensorflow.keras.utils import to_categorical 
from tensorflow.keras import optimizers



### Load and plot the data

In the next cell you will load the Cifar10 dataset; 50'000 images are in the training set and 10'000 are in the test set. From the training set you will use 10'000 images for training and another 10'000 for validation.
You will plot one random example of each label and will see
that the images are really small. You will also one-hot encode the labels.
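
To make the one-hot encoding step concrete, here is a minimal NumPy sketch of the idea behind `to_categorical`. The array `toy_labels` is a made-up example (not part of the dataset), shaped `(n, 1)` like the label arrays returned by `cifar10.load_data()`.

```python
import numpy as np

# hypothetical toy labels, shaped (n, 1) like the Cifar10 label arrays
toy_labels = np.array([[3], [0], [9]])
num_classes = 10

# row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(num_classes, dtype="float32")[toy_labels.ravel()]

print(one_hot.shape)                # one row per label, one column per class
print(np.argmax(one_hot, axis=1))   # argmax recovers the integer labels
```

Taking `np.argmax` along axis 1 turns the one-hot rows back into integer class labels, which is what the plotting cell relies on.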


In [None]:
from tensorflow.keras.datasets import cifar10
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

In [None]:
# separate train val and test dataset
X_train=x_train[0:10000] 
Y_train=to_categorical(y_train[0:10000],10) # one-hot encoding

X_val=x_train[20000:30000] 
Y_val=to_categorical(y_train[20000:30000],10)

X_test=x_test 
Y_test=to_categorical(y_test,10)

del x_train, y_train, x_test, y_test


print(X_train.shape)
print(X_val.shape)
print(X_test.shape)

In [None]:
labels=np.array(["airplane","automobile","bird","cat","deer","dog","frog","horse","ship","truck"])
# plot one random training image of each label
class_ids=np.argmax(Y_train,axis=1) # integer labels recovered from the one-hot encoding
plt.figure(figsize=(15,15))
for i in range(len(labels)):
    rmd=np.random.choice(np.where(class_ids==i)[0]) # random index of an image with label i
    plt.subplot(1,10,i+1)
    plt.imshow(X_train[rmd])
    plt.title(labels[i] + " " + str(class_ids[rmd]))

In [None]:
# check the shape of the data
X_train.shape,Y_train.shape,X_val.shape,Y_val.shape

# CNN as classification model for the Cifar10 dataset
Now it's your turn: train two CNNs with the same architecture.
* One CNN should be trained with the original image data (no normalization)
* One CNN should be trained with the normalized image data
* Use the train data to fit the model, the validation data to monitor the training, and the test dataset to estimate the performance on new, unseen data.

Use the following hyperparameters 

- the relu activation function
- kernel size of 3x3
- pooling size of 2x2
- use 2 convolutional layers with 8 filters and then a maxpooling layer, followed by another 2 convolutional layers with 16 filters and then a maxpooling layer
- then flatten the output and use a fully connected layer with 40 nodes, followed by an output layer with 10 nodes and the softmax activation.
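
As a sanity check for your own model, the sketch below works out the flattened feature size and the number of trainable parameters for this architecture in plain Python. It assumes `'same'` padding for all convolutional layers, which the exercise does not specify; with `'valid'` padding the numbers will differ. The helper `conv_params` is a hypothetical name introduced only for this illustration.

```python
def conv_params(kernel, c_in, c_out):
    """Weights plus biases of one convolutional layer."""
    return kernel * kernel * c_in * c_out + c_out

height = width = 32   # Cifar10 image size
channels = 3          # RGB input
total = 0

# two blocks: 2 conv layers (8, then 16 filters) followed by 2x2 max pooling
for filters in (8, 16):
    for _ in range(2):
        total += conv_params(3, channels, filters)
        channels = filters   # 'same' padding (assumption): height/width unchanged
    height //= 2             # 2x2 pooling halves both spatial dimensions
    width //= 2

flat = height * width * channels   # size of the flattened feature vector
total += flat * 40 + 40            # fully connected layer with 40 nodes
total += 40 * 10 + 10              # output layer with 10 nodes

print(flat, total)   # 1024 45706
```

Under this assumption, the totals should match what `model.summary()` reports for your Keras model.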

Compare the performance of the network with and without normalization on the test dataset. What do you observe?
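
One way to make the comparison is to compute the accuracy and a confusion matrix from the predicted class probabilities. The NumPy sketch below shows the idea on made-up values (`probs` and `targets` are hypothetical stand-ins for the model output and the one-hot test labels, with only 3 classes); in the notebook you could use the imported `confusion_matrix` from scikit-learn for the same purpose.

```python
import numpy as np

# hypothetical predicted probabilities for 4 samples and 3 classes
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.3, 0.4],
                  [0.6, 0.3, 0.1]])
# hypothetical one-hot encoded true labels
targets = np.array([[1, 0, 0],
                    [0, 1, 0],
                    [0, 1, 0],
                    [1, 0, 0]])

pred = probs.argmax(axis=1)     # predicted class per sample
true = targets.argmax(axis=1)   # true class per sample

accuracy = np.mean(pred == true)

# confusion matrix: rows are true classes, columns are predicted classes
cm = np.zeros((3, 3), dtype=int)
np.add.at(cm, (true, pred), 1)

print(accuracy)
print(cm)
```

Computing both numbers for the two models on the same test set gives a like-for-like comparison.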

### Without normalization

In [None]:
### Your code here ####




### With normalization

This time, reload the original images and normalize them by dividing the pixel values by 255, so that they lie in the range [0, 1].
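
A minimal sketch of the scaling step on its own, using a randomly generated `uint8` array (`toy_images`, a hypothetical stand-in for the image data) so that it runs without downloading anything:

```python
import numpy as np

# hypothetical stand-in for a small batch of uint8 images
rng = np.random.default_rng(0)
toy_images = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)

# convert to float first, then scale the pixel values to [0, 1]
toy_norm = toy_images.astype("float32") / 255.0

print(toy_norm.dtype, toy_norm.min(), toy_norm.max())
```

Remember to apply the same scaling to the train, validation and test images.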

In [None]:
### Your code here ####


