<a href="https://colab.research.google.com/github/hikmatfarhat-ndu/veronica-thesis/blob/master/keras_cifar10.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Convolution Network

Convolutional Neural Networks (CNNs) have been very successful, especially for modeling images. In this notebook we introduce CNNs and use Keras to train a classifier on the CIFAR10 data set.

### Packages

In [None]:
import tensorflow as tf 
import numpy as np 
import matplotlib.pyplot as plt
from tensorflow.keras import models,layers
from tensorflow.keras.utils import Sequence
#from tensorflow.python.keras.utils import data_utils
import math
import os


### Convolution Operations
We start with a simple example. Let $I$ be an input image. Typically $I$ is represented by a tensor of shape $(H,W,C)$ where $H$, $W$, and $C$ are the height, width, and number of color channels, respectively. Therefore, $I[h,w,c]$ refers to the value of channel $c$ at pixel $(h,w)$. Let $K$ be a filter of shape $(M,N)$. Then the convolution operation produces the tensor $T$ with entries
\begin{align*}
T_{i,j}=\sum_c\sum_{m=0}^{M-1}\sum_{n=0}^{N-1}I_{i+m,j+n,c}\,K_{m,n}
\end{align*}
The above operation is illustrated in the example below. Click on the figures to see the sequence of operations.
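As a concrete check of the formula, here is a minimal NumPy sketch that evaluates $T_{i,j}$ directly with loops. The function name `conv2d_valid` and the toy inputs are illustrative assumptions, not part of the notebook. Like the formula, it uses a single 2D filter shared by all channels, a stride of 1, and no padding.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Evaluate T[i, j] = sum_c sum_{m,n} image[i+m, j+n, c] * kernel[m, n]."""
    H, W, C = image.shape
    M, N = kernel.shape
    out = np.zeros((H - M + 1, W - N + 1))
    for i in range(H - M + 1):
        for j in range(W - N + 1):
            # Patch of shape (M, N, C) whose top-left corner is pixel (i, j)
            patch = image[i:i + M, j:j + N, :]
            # Broadcast the 2D kernel over the channel axis, then sum everything
            out[i, j] = np.sum(patch * kernel[:, :, None])
    return out

# Toy 4x4 image with 3 channels and a 3x3 filter of ones (illustrative values)
image = np.arange(4 * 4 * 3, dtype=float).reshape(4, 4, 3)
kernel = np.ones((3, 3))
T = conv2d_valid(image, kernel)
print(T.shape)  # (2, 2)
```

With a filter of ones, each output entry is simply the sum of all values in the corresponding 3x3 patch across the three channels.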



In [None]:
(img_train,label_train),(img_test,label_test)=tf.keras.datasets.cifar10.load_data()
img_train=img_train/255.0
img_test=img_test/255.0

In [None]:
print("img_train shape={},label_train shape={}".format(img_train.shape,label_train.shape))
print("img_test shape={},label_test shape={}".format(img_test.shape,label_test.shape))


We have dealt with this dataset before, but it is helpful to recall some of its properties. As the output above shows, the training data contains 50000 samples and the test data contains 10000 samples. Each image has shape (32, 32, 3) and the labels are integers from 0 to 9. Below we plot the first 10 images and their corresponding labels.

In [None]:
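fig=plt.figure()

for i in range(0,10):
    img=img_train[i]
    t=fig.add_subplot(2,5,i+1)
    # label_train has shape (N,1); take the scalar so the title reads "6", not "[6]"
    t.set_title(str(label_train[i][0]))
    t.get_xaxis().set_visible(False)
    t.get_yaxis().set_visible(False)
    # draw on this subplot explicitly instead of relying on the current axes
    t.imshow(img)

# adjust the spacing once all subplots exist
plt.subplots_adjust(wspace=1, hspace=1)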
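
The numeric labels are easier to read once mapped to class names. The sketch below uses the standard CIFAR-10 class order. The list `CLASS_NAMES` and the helper `label_name` are names introduced here for illustration only.

```python
# Standard CIFAR-10 class order (list index = label value)
CLASS_NAMES = ["airplane", "automobile", "bird", "cat", "deer",
               "dog", "frog", "horse", "ship", "truck"]

def label_name(label):
    """Map a label (an int or a length-1 sequence such as label_train[i]) to its class name."""
    try:
        index = int(label[0])      # length-1 sequence or array
    except (TypeError, IndexError):
        index = int(label)         # plain int or NumPy scalar
    return CLASS_NAMES[index]

print(label_name([6]))  # frog
print(label_name(9))    # truck
```

In the plotting loop, `t.set_title(label_name(label_train[i]))` would then show the class name instead of the number.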

# Model

One can think of the input and output of convolution layers as **boxes** of the form (height, width, depth). For the input, the depth is usually the number of color channels, e.g. 3. The output of a Conv2D layer with nfilters filters of size (x, y) has shape (height-x+1, width-y+1, nfilters), assuming a stride of 1 and no padding (the Keras defaults).
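
To see how this formula plays out over several layers, the following sketch traces a 32x32x3 input through three 3x3 convolutions (with 32, 64, and 64 filters), each followed by 2x2 max pooling, which is the layer sequence used in the model defined in the next cell. The helper names `conv_out` and `pool_out` are hypothetical, and the pooling rule (integer division by the pool size) assumes non-overlapping pooling without padding.

```python
def conv_out(shape, nfilters, kernel=(3, 3)):
    """Output shape of a stride-1, unpadded convolution."""
    h, w, _ = shape
    return (h - kernel[0] + 1, w - kernel[1] + 1, nfilters)

def pool_out(shape, pool=(2, 2)):
    """Output shape of non-overlapping max pooling without padding."""
    h, w, d = shape
    return (h // pool[0], w // pool[1], d)

shape = (32, 32, 3)
for nfilters in (32, 64, 64):
    shape = conv_out(shape, nfilters)
    print("after conv:", shape)
    shape = pool_out(shape)
    print("after pool:", shape)

# Flatten turns the final box into a vector of this length
flat = shape[0] * shape[1] * shape[2]
print("flattened size:", flat)  # 256
```

The shapes printed here should agree with the output shapes reported by model.summary().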

In [None]:
def createModel():
    model = models.Sequential()
    model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
    model.add(layers.MaxPooling2D(pool_size=(2, 2)))

    model.add(layers.Conv2D(64, (3, 3),  activation='relu'))
    model.add(layers.MaxPooling2D(pool_size=(2, 2)))

    model.add(layers.Conv2D(64, (3, 3), activation='relu'))
    model.add(layers.MaxPooling2D(pool_size=(2, 2)))

    model.add(layers.Flatten())

    model.add(layers.Dense(64, activation='relu'))
    model.add(layers.Dense(10,activation='softmax'))
    
    return model

In [None]:
model=createModel()
#tf.keras.utils.plot_model(model,show_shapes=True)
model.summary()

## Optimization

Keras supports many optimization methods. In this notebook we use the __Adam__ method, which can be described loosely as __adaptive__ gradient descent.
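
To make "adaptive" concrete, here is a minimal sketch of a single Adam update for one scalar parameter. It follows the standard Adam update rule and is not Keras's internal implementation. The function `adam_step` is hypothetical, and the default hyperparameter values shown (learning rate 0.001, beta1 0.9, beta2 0.999, eps 1e-7) are assumptions chosen to mirror commonly used settings.

```python
import math

def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    """One Adam update for a scalar parameter; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction for the zero start
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

# Toy problem: minimize f(theta) = theta^2, whose gradient is 2 * theta
theta, m, v = 1.0, 0.0, 0.0
for t in range(1, 1001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t, lr=0.01)
print(theta)
```

Because the step divides by the square root of the running mean of squared gradients, the very first update has magnitude close to lr regardless of how large the gradient is. This per-parameter rescaling is what distinguishes Adam from plain gradient descent.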

Also, since the labels are __NOT__ one-hot encoded, we use the "sparse" version of the cross-entropy loss: __SparseCategoricalCrossentropy__. Finally, the from_logits argument tells the loss function what the model outputs. With from_logits=True the loss function applies softmax to the model outputs before computing the loss. Since our model already computes softmax in its last layer, we turn this step off by specifying from_logits=False (which is also the Keras default).
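
The role of from_logits can be checked by hand. The sketch below is a plain-Python illustration for a single sample, not the Keras implementation, and `softmax`, `sparse_ce_from_probs`, and `sparse_ce_from_logits` are hypothetical helper names. It shows that softmax in the model plus a loss on probabilities gives the same value as raw logits plus a loss that applies softmax itself, while applying softmax twice gives a different (incorrect) loss.

```python
import math

def softmax(logits):
    """Numerically stable softmax for a list of scores."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_ce_from_probs(probs, label):
    """Cross-entropy for an integer label when the inputs are probabilities."""
    return -math.log(probs[label])

def sparse_ce_from_logits(logits, label):
    """Cross-entropy for an integer label when the inputs are raw scores."""
    return sparse_ce_from_probs(softmax(logits), label)

logits = [2.0, 1.0, 0.1]   # illustrative raw scores for 3 classes
label = 0                  # integer label, not one-hot
probs = softmax(logits)

loss_a = sparse_ce_from_probs(probs, label)        # model applies softmax
loss_b = sparse_ce_from_logits(logits, label)      # loss applies softmax
loss_double = sparse_ce_from_logits(probs, label)  # softmax applied twice
print(loss_a, loss_b, loss_double)
```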

In [None]:
# if we don't use softmax in the last layer, i.e. if the output of the
# model is NOT probabilities then use from_logits=True
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['accuracy'])

history = model.fit(img_train,label_train, batch_size=128,epochs=10, 
                   validation_data=(img_test, label_test))
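
The object returned by fit records per-epoch metrics in its history attribute, a dictionary of lists. With metrics=['accuracy'] and validation data, recent tf.keras versions use the keys 'accuracy' and 'val_accuracy'; treat those key names as an assumption if you run an older version. The helper `best_epoch` below is hypothetical and is demonstrated on made-up numbers, not on results from this notebook.

```python
def best_epoch(history_dict, key="val_accuracy"):
    """Return (1-based epoch, value) at which the chosen metric was highest."""
    values = history_dict[key]
    best = max(range(len(values)), key=lambda e: values[e])
    return best + 1, values[best]

# Made-up values standing in for history.history
fake_history = {
    "accuracy":     [0.40, 0.55, 0.62, 0.66],
    "val_accuracy": [0.48, 0.57, 0.61, 0.60],
}
epoch, value = best_epoch(fake_history)
print("best epoch:", epoch, "val_accuracy:", value)  # best epoch: 3 val_accuracy: 0.61
```

After training, the same call would be `best_epoch(history.history)`. A validation accuracy that peaks and then drops while the training accuracy keeps rising is a common sign of overfitting.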



### Testing the Accuracy

In [None]:
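# evaluate returns [loss, accuracy] because we compiled with metrics=['accuracy']
_,test_accuracy=model.evaluate(img_test,label_test)
print("test accuracy={:.4f}".format(test_accuracy))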
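
The accuracy reported here is the fraction of samples whose highest-probability class equals the label. Below is a minimal NumPy sketch of that computation, using made-up probabilities in place of the output of model.predict(img_test). The function `accuracy_from_probs` is a hypothetical helper.

```python
import numpy as np

def accuracy_from_probs(probs, labels):
    """Fraction of rows whose argmax matches the integer label."""
    predicted = np.argmax(probs, axis=1)
    # labels may have shape (N, 1) as in CIFAR10, so flatten before comparing
    return float(np.mean(predicted == np.ravel(labels)))

# Made-up probabilities for 4 samples and 3 classes
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.3, 0.4],
                  [0.6, 0.3, 0.1]])
labels = np.array([[0], [1], [2], [1]])
print(accuracy_from_probs(probs, labels))  # 0.75
```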