<a href="https://colab.research.google.com/github/Arthur-Barreto/Machine-Vision/blob/main/VisComp_Class_02.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Class 2: Neural Networks

## Preliminaries

Run the cell below to download the class pack.

In [None]:
import gdown

gdown.download(id='1HyzfGz13lLENHkZER2sbi4mqcRN2XHuM')

!unzip -o '02.zip'
!rm '02.zip'

Run the cell below to import the class modules.

If you get import warnings, try using **Ctrl+m .** (notice there is a dot there) to restart the kernel.

In [None]:
import numpy as np

from sdx import *
from tensorflow import keras

## Loading the MNIST dataset

The [MNIST dataset](http://yann.lecun.com/exdb/mnist/) is a famous database of handwritten digits.

In [None]:
(train_images, train_labels), (test_images, test_labels) = keras.datasets.mnist.load_data()

The variable `train_images` is an array of 60000 images that should be used as training data. Below, we see the tenth image.

In [None]:
cv_imshow(train_images[9])

The variable `train_labels` is an array of 60000 integers, which are the respective labels of these images. These labels were assigned manually, so they are reliable enough to use as ground truth. Below, we see the label of the tenth image.

In [None]:
train_labels[9]

The variable `test_images` is an array of 10000 images that should be used as test data.

In [None]:
cv_imshow(test_images[-1])

Finally, the variable `test_labels` is an array of 10000 integers, which are the respective labels.

In [None]:
test_labels[-1]

To show more than one image at once, we can call the `cv_gridshow` function, passing the array and the parameters of a slice. Below, we show the 25 images that correspond to the `train_images[10:35]` slice.

In [None]:
cv_gridshow(train_images, start=10, stop=35)

It is possible to show the labels alongside the images.

In [None]:
cv_gridshow(train_images, start=10, stop=35, labels=train_labels)

The default value for `start` is `0` and the default value for `stop` is `9`. This is important because the processing happens on Google servers but the rendering still happens on your machine. If you accidentally tried to show 60000 images at once, your browser would probably die horribly.

In [None]:
cv_gridshow(train_images)

## Building neural networks

Let's build a first neural network. More accurately, let's *demonstrate the foundations of building a neural network.* The actual model we will build is so trivial that we can't really call it an actual neural network, since it will have no hidden layers.

First, let's reshape the training and test images. The dataset stores each image as a 2D array, but the model expects 1D arrays, so we must concatenate the rows of each image to turn the 28x28 arrays into 784-element arrays.

In [None]:
train_images = train_images.reshape(60000, 784)
test_images = test_images.reshape(10000, 784)
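To see what this reshape does to each image, here is a minimal sketch on two tiny 2x3 arrays that stand in for the 28x28 digits. The array names are made up for the illustration.

```python
import numpy as np

# Two tiny 2x3 "images" standing in for the 28x28 MNIST digits.
images = np.array([
    [[1, 2, 3],
     [4, 5, 6]],
    [[7, 8, 9],
     [10, 11, 12]],
])

# Flatten each image by placing its rows one after another.
flat = images.reshape(2, 6)

# Passing -1 lets NumPy infer the row length from the array size.
flat_inferred = images.reshape(len(images), -1)

print(flat[0])  # [1 2 3 4 5 6]
```

The `-1` form avoids hard-coding `784`, which helps if the image size ever changes.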

Then, let's build the *input layer,* calling the `keras.Input` constructor. It obviously needs to know the size of the inputs.

In [None]:
inputs = keras.Input(shape=(784,))

Finally, let's build the *output layer,* which will always be a *dense layer* in our networks. We do this by calling the `keras.layers.Dense` constructor. Since we are building a *categorical* classifier, it needs to know the number of categories, which is the number of possible digits.

In [None]:
layer = keras.layers.Dense(10)

This constructor returns a layer object that can be called like a function. To connect the layer to the network, we call it on the inputs, which gives us the outputs.

In [None]:
outputs = layer(inputs)
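As a rough picture of what a dense layer with no activation computes, the NumPy sketch below multiplies each input by a weight matrix and adds a bias, producing one score per category. The weights here are random placeholders, not the values Keras would initialize or learn, and the function name `dense` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

n_inputs, n_classes = 784, 10

# Placeholder weights; Keras initializes and trains its own.
kernel = rng.normal(size=(n_inputs, n_classes))
bias = np.zeros(n_classes)

def dense(x, kernel, bias):
    """Affine map computed by a dense layer with no activation."""
    return x @ kernel + bias

# A batch of 5 fake flattened images with values in [0, 1).
batch = rng.random(size=(5, n_inputs))
scores = dense(batch, kernel, bias)

print(scores.shape)  # (5, 10)
```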

We now have what we need to compile the model. We won't delve into the details today, but the parameters below are necessary to ensure that an adequate model for our problem is compiled.

In [None]:
def compile_model(inputs, outputs):
    model = keras.Model(inputs, outputs)
    model.compile(
        loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=[keras.metrics.SparseCategoricalAccuracy()],
    )
    return model

In [None]:
model = compile_model(inputs, outputs)
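For intuition about the two settings passed to `model.compile`, here is a simplified NumPy sketch of a sparse categorical cross-entropy computed from raw scores (logits) and of the matching accuracy. It assumes the loss is averaged over the batch and is only meant to illustrate the idea, not to reproduce Keras internals.

```python
import numpy as np

def sparse_categorical_crossentropy(logits, labels):
    """Mean cross-entropy between integer labels and softmax(logits)."""
    # Subtract the row maximum for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability assigned to the correct class of each row.
    return -log_probs[np.arange(len(labels)), labels].mean()

def sparse_categorical_accuracy(logits, labels):
    """Fraction of rows whose highest score is at the correct class."""
    return (logits.argmax(axis=1) == labels).mean()

logits = np.array([[2.0, 0.5, -1.0],
                   [0.0, 0.0, 0.0]])
labels = np.array([0, 2])

print(sparse_categorical_accuracy(logits, labels))  # 0.5
```

"Sparse" means the labels are plain integers such as `7`, rather than one-hot vectors.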

To confirm that the model is structured as expected, we can call the `model.summary` method.

In [None]:
model.summary()
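The parameter count reported by the summary can be checked by hand: a dense layer has one weight per input-output pair plus one bias per output. The helper name below is hypothetical.

```python
def dense_params(n_inputs, n_units):
    """Number of trainable parameters in a dense layer: weights plus biases."""
    return n_inputs * n_units + n_units

# 784 inputs connected to 10 outputs, plus 10 biases.
print(dense_params(784, 10))  # 7850
```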

Now we are ready to train our model, calling `model.fit`...

In [None]:
model.fit(train_images, train_labels);

...and test it, calling `model.evaluate`.

In [None]:
model.evaluate(test_images, test_labels);

A more detailed evaluation of the model can be seen via the *confusion matrix.*

In [None]:
plot_confusion(model, test_images, test_labels)
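The internals of `plot_confusion` are not shown here, but the underlying idea can be sketched in NumPy: count how often each true label is paired with each predicted label. The sketch assumes rows are true labels and columns are predicted labels, which may differ from the convention used in the plot.

```python
import numpy as np

def confusion_matrix(true_labels, predicted_labels, n_classes):
    """Count (true, predicted) pairs; rows are true, columns are predicted."""
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    # np.add.at accumulates correctly even when index pairs repeat.
    np.add.at(matrix, (true_labels, predicted_labels), 1)
    return matrix

true_labels = np.array([0, 1, 2, 2, 1, 0])
predicted_labels = np.array([0, 2, 2, 2, 1, 1])

matrix = confusion_matrix(true_labels, predicted_labels, n_classes=3)
print(matrix)
# [[1 1 0]
#  [0 1 1]
#  [0 0 2]]
```

Correct predictions fall on the diagonal, so off-diagonal cells show which digits get mistaken for which.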

**STOP HERE! WAIT FOR THE TEACHER BEFORE PROCEEDING!**

## Building a neural network with a single hidden layer

In [None]:
layer = keras.layers.Dense(397)
outputs = layer(inputs)

layer = keras.layers.Dense(10)
outputs = layer(outputs)

model = compile_model(inputs, outputs)

model.summary()

In [None]:
model.fit(train_images, train_labels)

model.evaluate(test_images, test_labels)

plot_confusion(model, test_images, test_labels)
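The model above stacks two dense layers with no activation between them. The NumPy sketch below, using random placeholder weights rather than the trained ones, shows that such a stack computes the same function as a single dense layer, which is why an activation is worth trying next.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder weights for a 784 -> 397 -> 10 stack with no activation.
w1, b1 = rng.normal(size=(784, 397)), rng.normal(size=397)
w2, b2 = rng.normal(size=(397, 10)), rng.normal(size=10)

x = rng.random(size=(4, 784))

# Two affine layers applied in sequence.
stacked = (x @ w1 + b1) @ w2 + b2

# The same map written as a single 784 -> 10 affine layer.
w, b = w1 @ w2, b1 @ w2 + b2
single = x @ w + b

print(np.allclose(stacked, single))  # True
```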

## Extra

Using ReLU as the activation of the first (hidden) layer

In [None]:
layer = keras.layers.Dense(397, activation='relu')
outputs = layer(inputs)

layer = keras.layers.Dense(10)
outputs = layer(outputs)

model = compile_model(inputs, outputs)

model.summary()

In [None]:
model.fit(train_images, train_labels)

model.evaluate(test_images, test_labels)

plot_confusion(model, test_images, test_labels)

Using ReLU as the activation of the last (output) layer

In [None]:
layer = keras.layers.Dense(397)
outputs = layer(inputs)

layer = keras.layers.Dense(10, activation='relu')
outputs = layer(outputs)

model = compile_model(inputs, outputs)

model.summary()

In [None]:
model.fit(train_images, train_labels)

model.evaluate(test_images, test_labels)

plot_confusion(model, test_images, test_labels)

## Conclusion

The best model uses ReLU as the activation function of the hidden layer. However, when we applied ReLU to the last layer instead, we got less than 10% accuracy.
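One plausible contributor to the poor result with ReLU on the last layer, offered as an illustration rather than a diagnosis of this particular run, is that the loss treats the outputs as raw scores, while ReLU clamps every negative score to zero. The sketch below uses made-up scores to show how a clear winner can be erased.

```python
import numpy as np

def relu(x):
    """Replace negative values with zero."""
    return np.maximum(x, 0.0)

def softmax(logits):
    """Turn raw scores into probabilities, row by row."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exps = np.exp(shifted)
    return exps / exps.sum(axis=1, keepdims=True)

# Made-up scores for one sample: all negative, with class 1 the highest.
scores = np.array([[-3.0, -0.5, -2.0]])

# Without ReLU, class 1 still wins.
print(softmax(scores).argmax(axis=1))  # [1]

# With ReLU, every score becomes 0, so all classes get the same probability.
clamped = softmax(relu(scores))
```

When every score is clamped, ReLU also passes no gradient back, so training has little to correct that sample with.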

You can click on the table of contents tab to the left to browse by section.