# Computer Vision Lab

## 1. Introduction

This hands-on lab introduces convolutional neural networks (ConvNets, or CNNs) for image classification. Our goal is to build a ConvNet with a high level of accuracy. Classical image processing relies on hand-crafted filters derived from explicit mathematical models. Convolutional networks instead learn their filters directly from the data, and in real-world tasks learned filters have been found to outperform hand-crafted ones. With convolutional networks, the focus is on learning the weights of a small set of shared filters rather than a separate weight for every input-output pair, as in a fully connected layer.

The MNIST dataset, which consists of images of handwritten digits from 0 to 9, will be used for training, validation and testing. A sample of the dataset is shown below.

<br>
<img src="http://petr-marek.com/wp-content/uploads/2017/07/mnist.png" width="480">
<br><br>

The Keras API with TensorFlow or CNTK as a backend will be used to build the CNN. Before building and training a network, you will analyse, visualise and preprocess the MNIST data. After the data is prepared, you will design, build and evaluate the network. We will experiment with different models by changing their parameters.

In [4]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
#import matplotlib.image as mpimg
import seaborn as sns

#np.random.seed(2)

from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
#import itertools

from keras.utils import to_categorical
from keras.models import Sequential
from keras.layers import Conv2D,Dense, MaxPool2D, Dropout, Flatten
from keras.optimizers import RMSprop
from keras.datasets import mnist
#from keras.preprocessing.image import ImageDataGenerator
#from keras.callbacks import ReduceLROnPlateau




## 2. Data download

The MNIST database contains standard handwritten digits that have been widely used for training and testing of machine learning algorithms. It has a training set of 60,000 images and a test set of 10,000 images, with each image being 28 x 28 pixels. This set is easy to use, visualize and train on with any computer. The MNIST dataset is available through the Keras library, which downloads it on first use.


In [None]:
#download MNIST data and split into train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()

For comparison, the image below shows an example of a CNTK-based function for downloading the MNIST dataset. Comparing it with the single Keras call above shows how much simpler and more convenient the Keras module is.

<img src="images/mnistdownload.png">

## Explore the train and test data

In [None]:
#check the shape, size and dtype of loaded dataset
print(f'The shape of training set is: {X_train.shape}')
print(f'The shape of test set is: {X_test.shape}\n')

# nbytes is the memory footprint in bytes (size alone is only the element count)
print(f'The size of training set is: {X_train.nbytes/1024} kB')
print(f'The size of test set is: {X_test.nbytes/1024} kB\n')


print(f'The training set is: {X_train.ndim} - dimensional array')
print(f'The test set is: {X_test.ndim} - dimensional array\n')

print(f'The training set type is: {X_train.dtype}')
print(f'The test set type is: {X_test.dtype}\n')

Explore the label dataset

In [None]:
#check the shape, size, number of dimensions and dtype of loaded labels dataset
#insert your code below



## Visualize training data from MNIST dataset

In [None]:
# Pick a random image from the training set
# np.random.randint(low, high) returns a random integer from low (inclusive) to high (exclusive)
sampleNo = np.random.randint(0, X_train.shape[0])
print(f'Sample number is {sampleNo}\n')
sampleImg=X_train[sampleNo]
print(f'Image number {sampleNo} in MNIST dataset has the following shape: {sampleImg.shape}\n')
#print(sampleImg,'\n')
print(f'Shape of the image is {sampleImg.shape} and its type is {type(sampleImg)}\n')

It can be helpful to plot and visualise the data before proceeding with further data handling and model design. This can help us avoid issues later during the model training. 

In [None]:
#plt.imshow(X_train[sampleNo,:,:].reshape(28,28), )
#plt.imshow(sampleImg, cmap='gray')
#plt.axis('off')
#print("Image label: ", y_train[sampleNo])

# Show the first six training images with their labels as titles
fig,axes=plt.subplots(nrows=1,ncols=6)
for i,ax in enumerate(axes):
    ax.imshow(X_train[i], cmap="gray")
    ax.set_title(str(y_train[i]))
    ax.axis('off')
plt.show()

Each image in the MNIST dataset is 28 x 28 pixels with a single grayscale channel, so the convolutional neural network can process each image in our dataset quickly.

## Visualize labels from training data set

To visualise the label data, we'll convert the labels to a Pandas Series.

In [None]:
y_trains=pd.Series(y_train)

In [None]:
# show the frequency distribution of label values 
print(y_trains.value_counts())

In [None]:
#show statistics 
print(y_trains.describe())

## Plot the frequency distribution of training label values

In [None]:
# Pass the labels as x explicitly; recent seaborn versions no longer accept them positionally
g = sns.countplot(x=y_trains)

## Plot the frequency distribution of test label values

In [None]:
# Use the test labels here (y_test), not the training labels
y_tests=pd.Series(y_test)
g = sns.countplot(x=y_tests)

## Preprocessing of training and test data
This step requires declaring a dimension for the depth of the input image. As shown previously, the images in our dataset have the shape (60000, 28, 28). The first value refers to the number of images in the dataset, and the other two represent the image height and image width. Hence, the depth (channel) dimension is missing. In this case the depth is 1 because we are working with grayscale images. For color (RGB) images the depth would be 3.

In [None]:
# Reshape the data to (samples, height, width, channels)
x_train = X_train.reshape(X_train.shape[0], 28, 28, 1)
x_test = X_test.reshape(X_test.shape[0], 28, 28, 1)

The final preprocessing step for the input data is to convert it to float32 and normalize the pixel values from [0, 255] to the range [0, 1].

In [None]:
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('Original training data shape:', X_train.shape)
print('New training data shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

## Preprocessing of labels data

Before proceeding with preprocessing, let us check the attributes of our labels once again.

In [None]:
#check the shape, size and dtype of labels in dataset
print(f'The shape of training labels is: {y_train.shape}')
print(f'The shape of test labels is: {y_test.shape}\n')

print(f'The size of training labels is: {y_train.nbytes/1024} kB')
print(f'The size of test labels is: {y_test.nbytes/1024} kB\n')


print(f'The training labels are: {y_train.ndim} - dimensional array')
print(f'The test labels are: {y_test.ndim} - dimensional array\n')

print(f'The training labels type is: {y_train.dtype}')
print(f'The test labels type is: {y_test.dtype}\n')

The first ten labels of our data are:

In [None]:
#labels in the training dataset using pandas data series
print(y_trains.head(10))

#or numpy arrays
print(y_train[:10])


The y_train and y_test data are not split into 10 distinct class labels, but rather are represented as a single array with the class values. Thus, we need to preprocess the class labels by converting each 1-dimensional array of class values into a 2-dimensional array with one row per sample and 10 columns, one per class.

The labels are encoded with [one-hot](https://en.wikipedia.org/wiki/One-hot) encoding: with 10 digits, a label of 3 becomes `0001000000`, where the first index corresponds to digit `0` and the last one corresponds to digit `9`.

![](https://www.cntk.ai/jup/cntk103a_onehot.png)
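
The encoding described above can also be reproduced by hand. The following is a minimal NumPy sketch, assuming a small hypothetical `labels` array in place of `y_train`; it is not how `to_categorical` is implemented, only an illustration of the resulting layout.

```python
import numpy as np

# Hypothetical toy labels, standing in for y_train
labels = np.array([3, 0, 9])
num_classes = 10

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(num_classes, dtype="float32")[labels]

print(one_hot.shape)   # (3, 10)
print(one_hot[0])      # 1.0 at index 3, zeros elsewhere

# argmax along the class axis recovers the original integer labels
recovered = one_hot.argmax(axis=1)
print(recovered)
```

The `argmax` step is the inverse operation, and it is the same step that turns a row of predicted class probabilities back into a digit.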

In [None]:
Y_train = to_categorical(y_train, 10)
Y_test = to_categorical(y_test, 10)

Check the dimensionality of the train and test labels after one-hot encoding.

In [None]:
print(Y_train.shape)
print(Y_test.shape)

## Convolutional network for image classification
Convolutional networks for classification are constructed from a sequence of convolutional layers (for image processing) and fully connected (Dense) layers (for readout). In this exercise, you will construct a small convolutional network for classification of MNIST digits. We will use the Keras library to create neural networks and to train these neural networks to classify images. These models will all be of the Sequential type, meaning that the outputs of one layer are provided as inputs only to the next layer.

Add a Conv2D layer to construct the input layer of the network. Use a kernel size of 3 by 3 and set the input_shape of this layer to (28, 28, 1), the shape of one preprocessed image.
Add a Flatten layer to translate between the image processing and classification parts of your network.
Add a Dense layer with a softmax activation to classify the 10 digit categories in the dataset.

<img src="https://cdn-images-1.medium.com/max/1600/1*uAeANQIOQPqWZnnuH-VEyw.jpeg" width='480'>

In [None]:
# Initialize the model object
model = Sequential()

# Add a convolutional layer
model.add(Conv2D(3, kernel_size=3, activation='relu',input_shape=(28,28,1)))
#model.add(Dense(10, activation='relu',input_shape=(784,)))

# Add pooling layer
model.add(MaxPool2D(2))

# Add a dropout layer
model.add(Dropout(0.25))

# Add a convolutional layer
model.add(Conv2D(2, kernel_size=5, activation='relu'))

# Flatten the output of the convolutional layer
model.add(Flatten())
          
# Add a hidden dense layer
model.add(Dense(128, activation='relu'))
          
# Add an output layer for the 10 categories
model.add(Dense(10, activation='softmax'))

## Compile a neural network
Once you have constructed a model in Keras, the model needs to be compiled before you can fit it to data. This means that you need to specify the optimizer that will be used to fit the model and the loss function that will be used in optimization. Optionally, you can also specify a list of metrics that the model will keep track of. For example, if you want to know the classification accuracy, you will provide the list ['accuracy'] to the metrics keyword argument.

In [None]:
# Compile the model
model.compile(optimizer='adam',loss='categorical_crossentropy',metrics=['accuracy'])
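
To make the loss and metric named in `compile` more concrete, here is a minimal NumPy sketch of softmax, categorical cross-entropy and accuracy. The helper functions and the toy `logits` and `y_true` arrays are assumptions introduced for illustration; Keras computes these quantities internally and its implementation differs in detail.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    # Mean over samples of -sum(y_true * log(y_prob))
    y_prob = np.clip(y_prob, eps, 1.0)
    return float(-(y_true * np.log(y_prob)).sum(axis=1).mean())

# Hypothetical raw scores for two samples and three classes
logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 0.5, 0.5]])
# One-hot encoded true classes (class 0 and class 1)
y_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])

probs = softmax(logits)
loss = categorical_crossentropy(y_true, probs)
# Accuracy: fraction of samples whose most probable class is the true class
accuracy = float((probs.argmax(axis=1) == y_true.argmax(axis=1)).mean())

print(probs.sum(axis=1))  # each row of probabilities sums to 1
print(loss, accuracy)
```

The loss is small when the probability assigned to the true class is close to 1, which is why minimising it pushes the softmax output towards the one-hot labels.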

## Inspect model

In [None]:
model.output_shape

Model summary representation

In [None]:
model.summary()

Model layers are listed in the attribute layers

In [None]:
model.layers

Extracting first convolutional layer in the list

In [None]:
C1=model.layers[0]
print(C1)

Extracting weights from the first layer

In [None]:
w1=C1.get_weights()
print(len(w1))
print(w1)

In [None]:
k1=w1[0]
print(k1)

Shape of kernels

In [None]:
k1.shape

The first two dimensions denote the height and width of the kernel (filter). The third dimension is the number of input channels: 1 for grayscale images, 3 for color images. The last dimension is the number of filters in the layer. For the first layer of the model above, the shape is therefore (3, 3, 1, 3).
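
As an illustration of how weights with this layout are applied, the sketch below slides each filter over a synthetic image without padding. The function `correlate2d_valid`, the random `weights` and the synthetic `image` are assumptions made for this example; a real Conv2D layer additionally adds a bias and applies the activation.

```python
import numpy as np

def correlate2d_valid(image, kernel):
    """Slide kernel over image without padding ('valid' mode), stride 1."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype="float32")
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

# Hypothetical weights with the same layout as k1: (height, width, channels, filters)
rng = np.random.default_rng(0)
weights = rng.normal(size=(3, 3, 1, 3)).astype("float32")

# Synthetic 28 x 28 image: dark left half, bright right half
image = np.zeros((28, 28), dtype="float32")
image[:, 14:] = 1.0

# One feature map per filter, as in the first Conv2D layer (before bias and ReLU)
feature_maps = np.stack(
    [correlate2d_valid(image, weights[:, :, 0, f]) for f in range(weights.shape[-1])],
    axis=-1,
)
print(feature_maps.shape)  # (26, 26, 3)
```

Each 3 x 3 filter removes one pixel on every side of the 28 x 28 input, which is why the feature maps are 26 x 26.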

In [None]:
kernel1=k1[:,:,0,0]
print(kernel1)

In [None]:
kernel1=k1[:,:,0,0]
plt.imshow(kernel1, cmap="gray")
#plt.axis('off')
plt.show()

In [None]:
# Second filter of the first layer
kernel2=k1[:,:,0,1]
plt.imshow(kernel2, cmap="gray")
#plt.axis('off')
plt.show()

In [None]:
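import cv2
sampleNo = np.random.randint(0, X_train.shape[0])
print(f'Sample number is {sampleNo}\n')

# Use the randomly selected image; convert to float so negative responses are not clipped
sampleImg=X_train[sampleNo].astype('float32')
# cv2.filter2D computes a correlation, the same operation a Conv2D layer applies
filteredImage = cv2.filter2D(sampleImg,-1,kernel1)

# Original image, kernel and filtered image side by side
fig,ax=plt.subplots(nrows=1,ncols=3)
ax[0].imshow(sampleImg, cmap="gray")
ax[1].imshow(kernel1, cmap="gray")
ax[2].imshow(filteredImage, cmap="gray")

plt.show()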

Model configuration

In [None]:
model.get_config()

List all weight tensors in the model

In [None]:
weg=model.get_weights()

In [None]:
print(type(weg))

In [None]:
print(model.get_weights())

## Fitting a neural network model to MNIST data
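Fit the model on the training data and training labels. The data has already been transformed into the input shape the network expects, so model fitting only requires the training images together with their one-hot encoded labels. The general form of the call is `model.fit(train_data, train_labels, validation_split=0.2, epochs=3, batch_size=10)`, where `validation_split=0.2` holds out the last 20% of the training samples for validation. The cell below applies it to `x_train` and `Y_train` with 8 epochs and the default batch size.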

In [None]:
# Fit the model
training=model.fit(x_train, Y_train, validation_split=0.2, epochs=8)

To evaluate the model, we use a separate test dataset. As with the training data, the test images must be in the shape the network expects, (28, 28, 1) per image, and scaled to [0, 1]. Both steps were already done in the preprocessing section above.

In [None]:
# Evaluate the model
score=model.evaluate(x_test, Y_test)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

## Evaluating a CNN with test data
To evaluate a trained neural network, you should provide a separate testing dataset of labeled images. The model you fit in the previous section is still available in your workspace.

Evaluate the model on the separate test set: x_test and Y_test.
The batch_size argument (here 10 images per batch) only controls how many images are processed at a time; it should not change the resulting loss or accuracy.

In [None]:
# Evaluate the model on separate test data
model.evaluate(x_test, Y_test, batch_size=10)

## Plot the learning curves
During learning, the model stores the value of the loss function evaluated in each epoch. Looking at the learning curves can tell us quite a bit about the learning process. In this part we will plot the training and validation loss curves for the model you have just trained.

In [None]:
#import matplotlib.pyplot as plt

# Extract the history from the training object
history = training.history

# Plot the training loss 
plt.plot(history['loss'])
plt.ylabel('Loss')
plt.xlabel('Epochs')

# Show the figure
plt.show()

In [None]:
#import matplotlib.pyplot as plt

# Extract the history from the training object
history = training.history

# Plot the validation loss
plt.plot(history['val_loss'])
plt.ylabel('Validation loss')
plt.xlabel('Epochs')

# Show the figure
plt.show()

In [None]:
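# Older Keras versions store accuracy under 'acc', newer ones under 'accuracy'
acc_key = 'acc' if 'acc' in training.history else 'accuracy'
plt.plot(training.history[acc_key])
plt.plot(training.history['val_' + acc_key])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
# The second curve comes from the validation split, not from the test set
plt.legend(['Train', 'Validation'], loc='upper left')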
plt.show()

In [None]:
plt.plot(training.history['loss'])
plt.plot(training.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()

In [None]:
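# predict returns class probabilities; argmax picks the most likely digit
# (predict_classes is not available in newer Keras versions)
prediction = np.argmax(model.predict(x_test, verbose=0), axis=1)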
submission = pd.DataFrame({"ImageId": list(range(1,len(prediction)+1)),
                         "Label": prediction})
submission.to_csv("submission.csv", index=False, header=True)

In [None]:
# Plot a random image from the test set
# np.random.randint returns a random integer from 0 to 9999 (the upper bound is exclusive)
sampleNo = np.random.randint(0,10000)
print(f'Sample number is {sampleNo}\n')
sampleImg=X_test[sampleNo]
print(f'Image number {sampleNo} in MNIST test set has the following shape: {sampleImg.shape}\n')
#print(sampleImg,'\n')
print(f'Shape of the image is {sampleImg.shape} and image is {type(sampleImg)} array\n')
plt.imshow(sampleImg, cmap="gray")
plt.show()

# Predict the class of the single selected image (argmax over the 10 probabilities)
prediction = np.argmax(model.predict(x_test[[sampleNo],:,:,:], verbose=0), axis=1)
print(prediction)
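
Beyond a single accuracy number, predictions are often compared with the true labels class by class. The sketch below builds a confusion matrix by hand on small hypothetical arrays that stand in for `y_test` and `prediction`; the helper `confusion_counts` is an assumption for illustration. The `confusion_matrix` function imported from scikit-learn at the top of this lab follows the same convention (rows are true classes, columns are predicted classes).

```python
import numpy as np

def confusion_counts(true_labels, predicted_labels, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    for t, p in zip(true_labels, predicted_labels):
        matrix[t, p] += 1
    return matrix

# Hypothetical labels for a 3-class problem, standing in for y_test and prediction
true_labels = np.array([0, 1, 2, 2, 1, 0, 2])
predicted_labels = np.array([0, 2, 2, 2, 1, 0, 1])

matrix = confusion_counts(true_labels, predicted_labels, num_classes=3)
# Correct predictions lie on the diagonal
accuracy = np.trace(matrix) / matrix.sum()
print(matrix)
print(accuracy)
```

Off-diagonal entries show which pairs of classes the model confuses, which for MNIST typically highlights visually similar digits.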

# Exercise

### Add strides to a convolutional network
The size of the strides of the convolution kernel determines whether the kernel will skip over some of the pixels as it slides along the image. This affects the size of the output because when strides are larger than one, the kernel will be centered on only some of the pixels. 

###  Calculate the size of convolutional layer output
Zero padding and strides affect the size of the output of a convolution. What is the size of the output for an input of size 256 by 256, with 2 kernels of size 3 by 3, padding of 1 and strides of 2?
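
The output size along one spatial axis is commonly computed as floor((n + 2p - k) / s) + 1, assuming the same padding p on both sides. The helper `conv_output_size` below is a hypothetical name introduced for this sketch, and it is applied to the model built earlier in this lab rather than to the exercise values.

```python
def conv_output_size(input_size, kernel_size, padding=0, stride=1):
    """Output length along one spatial axis: floor((n + 2p - k) / s) + 1."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

# Shapes through the model built earlier in this lab (28 x 28 input)
after_conv1 = conv_output_size(28, kernel_size=3)                    # Conv2D, 3 x 3 kernel
after_pool = conv_output_size(after_conv1, kernel_size=2, stride=2)  # MaxPool2D(2)
after_conv2 = conv_output_size(after_pool, kernel_size=5)            # Conv2D, 5 x 5 kernel
print(after_conv1, after_pool, after_conv2)  # 26 13 9
```

The same function can be used to check your answer to the question above by passing the exercise's input size, kernel size, padding and stride.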

###  How many parameters in a CNN?
We need to know how many parameters a CNN has, so we can adjust the model architecture to reduce this number or shift parameters from one part of the network to another. How many parameters would a network have if its inputs are images with 28-by-28 pixels, there is one convolutional layer with 10 kernels of 3-by-3 pixels, using zero padding (so that the output has the same size as the input), and one densely connected layer with 10 units?
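
As a worked illustration of the counting rules, the sketch below tallies the parameters of the model built earlier in this lab (not the network in the question above). The helper names `conv2d_params` and `dense_params` are assumptions for this example; the totals can be compared with the output of `model.summary()`.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    # Each filter has kernel_h * kernel_w * in_channels weights plus one bias
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(inputs, units):
    # One weight per input-output pair plus one bias per unit
    return (inputs + 1) * units

conv1 = conv2d_params(3, 3, 1, 3)   # 28x28x1 -> 26x26x3
conv2 = conv2d_params(5, 5, 3, 2)   # 13x13x3 -> 9x9x2 (after pooling)
flat = 9 * 9 * 2                    # length of the Flatten output
dense1 = dense_params(flat, 128)
dense2 = dense_params(128, 10)
total = conv1 + conv2 + dense1 + dense2
print(conv1, conv2, dense1, dense2, total)  # 30 152 20864 1290 22336
```

Pooling, dropout and flatten layers have no trainable parameters, so almost all of the parameters in this small model sit in the first dense layer.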

### How many parameters in a deep CNN?
In this exercise, you will use Keras to calculate the total number of parameters along with the number of parameters in each layer of the network.
We have already provided code that builds a deep CNN. 