# Intro to Neural Networks

In this notebook we will use the same dataset and pre-processing as the previous notebook, but build a simple Neural Network and go into more detail about how it works and how we might change the structure.

But first, we'll reload the dataset and import required libraries.

In [None]:
import tensorflow as tf
from tensorflow.keras.datasets import cifar10

# Standard imports
import numpy as np
import pandas as pd

%matplotlib inline
import matplotlib as mpl
import matplotlib.pyplot as plt

(train_images, train_labels), (test_images, test_labels) = cifar10.load_data()

# Reshape the labels.
train_labels = train_labels[:,0]
test_labels = test_labels[:,0]

# And scale.
train_images = train_images / 255.0
test_images = test_images / 255.0

# Index to name mapping.
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']


## Build and train a simple neural network

Start building and training the network now; while it trains, we'll discuss what is actually going on.

In [None]:
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Flatten, Dense

model = Sequential([
    Flatten(input_shape=(32, 32, 3)),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')
])

In [None]:
optimizer = tf.keras.optimizers.Adam()
loss = tf.keras.losses.SparseCategoricalCrossentropy()
accuracy = tf.keras.metrics.SparseCategoricalAccuracy()

model.compile(optimizer=optimizer,
              loss=loss,
              metrics=[accuracy])

In [None]:
model.fit(train_images, train_labels, epochs=10)

In [None]:
test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=2)

## What is going on?

A typical structure is an input layer, a number of hidden layers, and an output layer.

    model = Sequential([
        Flatten(input_shape=(32, 32, 3)),
        Dense(128, activation='relu'),
        Dense(10, activation='softmax')
    ])

So what is each layer doing?

#### Flatten

The input images are `32 x 32 x 3`, but our simple network expects a one-dimensional array of inputs. This layer simply flattens each image into a 1D array of `32 * 32 * 3 = 3072` values.
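
To make the flattening step concrete, here is a minimal NumPy sketch. It uses a made-up array in place of a real image and assumes the values are laid out in row-major order, which is what `reshape` does by default.

```python
import numpy as np

# A stand-in for one CIFAR-10 image: 32 x 32 pixels, 3 colour channels.
image = np.arange(32 * 32 * 3, dtype=np.float32).reshape(32, 32, 3)

# Flattening keeps every value and only changes the shape.
flat = image.reshape(-1)

print(image.shape)  # (32, 32, 3)
print(flat.shape)   # (3072,)
```

No values are added or removed; the network just sees the same numbers as one long row.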

#### Dense Layer 1

The first dense layer comprises 128 neurons, each connected by a weighted link to every one of the flattened inputs. Each neuron's output is calculated by multiplying each input by its weight, summing the results (plus a bias term), and then applying the ReLU activation function to that sum.
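
As a rough illustration of what one dense layer computes, the sketch below does the same calculation by hand in NumPy. The input, weights and biases are random placeholders (not the trained values), and the names `x`, `W`, `b`, `z` and `a` are chosen for this example only. Keras `Dense` layers also add a bias per neuron by default, so one is included here.

```python
import numpy as np

rng = np.random.default_rng(0)

n_inputs, n_neurons = 3072, 128

x = rng.random(n_inputs)                          # one flattened image, values in [0, 1)
W = rng.normal(0.0, 0.01, (n_inputs, n_neurons))  # one weight per (input, neuron) pair
b = np.zeros(n_neurons)                           # one bias per neuron

z = x @ W + b            # weighted sum of the inputs, plus bias, for each neuron
a = np.maximum(z, 0.0)   # ReLU: negative sums become 0, positive sums pass through

print(a.shape)  # (128,)
```

Training is the process of adjusting `W` and `b` so that these outputs become useful for the final classification layer.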

#### Dense Layer 2

This is the final layer! As you'll remember, there are 10 categories that we are trying to classify, and this layer has one neuron to represent each of them. Ideally, when an image of a truck is fed into the network, the "truck" neuron will output a very high value while the others will be very low. The softmax activation normalises the outputs so that they sum to one, giving a relative confidence for each category.
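
The softmax step can be sketched in a few lines of NumPy. The `scores` array below is invented for illustration (it is not output from the model), with the largest value placed at index 9, the "truck" position.

```python
import numpy as np

def softmax(z):
    """Turn raw scores into positive values that sum to 1."""
    shifted = z - np.max(z)   # subtracting the max avoids overflow without changing the result
    exp = np.exp(shifted)
    return exp / np.sum(exp)

# Hypothetical raw outputs of the 10 final neurons; index 9 ("truck") is the largest.
scores = np.array([0.5, 1.0, -0.3, 0.2, 0.0, 0.1, -1.2, 0.4, 0.9, 3.0])
probs = softmax(scores)

print(probs.round(3))
print(probs.sum())
print(np.argmax(probs))
```

The ordering of the scores is preserved, so the neuron with the highest raw output is still the one with the highest confidence.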


In [None]:
model.summary()
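
The parameter counts in the summary can be checked by hand. The sketch below assumes each dense layer has one weight per input per neuron plus one bias per neuron (the Keras default); `dense_params` is a helper name made up for this example.

```python
def dense_params(n_inputs, n_units):
    """Weights (one per input per unit) plus one bias per unit."""
    return n_inputs * n_units + n_units

flatten_outputs = 32 * 32 * 3                       # 3072 values; Flatten has no parameters
hidden_params = dense_params(flatten_outputs, 128)  # 393344
output_params = dense_params(128, 10)               # 1290
total_params = hidden_params + output_params        # 394634

print(hidden_params, output_params, total_params)
```

Almost all of the parameters sit in the first dense layer, because every one of its 128 neurons is connected to all 3072 inputs.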

## Assess the performance

In [None]:
print('Test accuracy:', test_acc)

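Pretty poor performance... but given that the model only took a couple of minutes to build, it is quite impressive. For comparison, random guessing across 10 equally likely categories would be right about 10% of the time.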

As mentioned, the final layer of the network has 10 neurons, with a softmax activation. What this means is that it will provide 10 outputs, each representing a measure of how confident the network is that an image belongs to a particular category. (The softmax makes these outputs add to 1 to show relative confidence between the outputs.)

Let's step through some predictions to understand this in more detail.

In [None]:
pred = model.predict(np.array([test_images[0]]))
pred

As you can see, there are 10 floating point numbers giving a relative confidence of each of the 10 categories.

### Exercises

Calculate the sum of the prediction array.

Which index has the highest value? And what category does that correspond to?

**Optional:** Try replacing the "softmax" activation with a "relu" one. What happens to the outputs? Is that what you expected? Do they still give category predictions?
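
If you get stuck on the first two exercises, here is one possible approach, shown on a made-up array rather than the real model output. `example_pred` is a hypothetical stand-in with the same `(1, 10)` shape that `model.predict` returns for a single image; swap in `pred` to answer the exercises.

```python
import numpy as np

class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

# Made-up prediction with the same shape as the model output for one image: (1, 10).
example_pred = np.array([[0.02, 0.01, 0.10, 0.40, 0.05,
                          0.20, 0.15, 0.03, 0.03, 0.01]])

total = example_pred.sum()                    # softmax outputs should add up to 1
best_index = int(np.argmax(example_pred[0]))  # position of the largest value
best_name = class_names[best_index]           # map the index back to a category name

print(total, best_index, best_name)
```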

### Exploring the output

Now we'll look at the performance on a group of images. In each bar chart, the blue bar marks the correct category, and a red bar marks the predicted category when the prediction is wrong.

In [None]:
###
#
# Don't spend too much time understanding this - it is used to make pretty plots.
#
###

def plot_image_predictions(img, predictions, true_label, class_names):
  plt.figure(figsize=(6, 3))
  plt.subplot(1, 2, 1)
  plot_image(img, predictions, true_label, class_names)
  plt.subplot(1, 2, 2)
  plot_value_array(predictions, true_label, class_names)
  plt.show()

def plot_image(img, predictions, true_label, class_names):
  plt.grid(False)
  plt.xticks([])
  plt.yticks([])

  plt.imshow(img)  # RGB image, so no colormap is needed

  predicted_label = np.argmax(predictions)

  if predicted_label == true_label:
    color = 'blue'
  else:
    color = 'red'

  plt.xlabel("{} {:2.0f}% ({})".format(class_names[predicted_label],
                                100*np.max(predictions),
                                class_names[true_label]),
                                color=color)

def plot_value_array(predictions, true_label, class_names):
  plt.grid(False)
  plt.xticks(range(10))
  plt.yticks([])
  thisplot = plt.bar(range(10), predictions, color="#777777")
  plt.ylim([0, 1])
  predicted_label = np.argmax(predictions)

  thisplot[predicted_label].set_color('red')
  thisplot[true_label].set_color('blue')

In [None]:
def predict_and_plot(index):
    pred = model.predict(np.array([test_images[index]]))
    plot_image_predictions(test_images[index], pred[0], test_labels[index], class_names)

In [None]:
for i in range(0,5):
    predict_and_plot(i)

## Try adding an extra dense layer

As a sample exercise, let's add some more layers to the neural network from the previous notebook and see how it performs.

Other possible improvements include adjusting the learning rate, changing the activation functions, or using more advanced techniques such as dropout, image augmentation, or convolutional layers.

In [None]:
# Add an extra layer somewhere here:
deep_model = Sequential([
    Flatten(input_shape=(32, 32, 3)),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')
])

In [None]:
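# Use the same optimizer/loss/metric settings, but create fresh instances:
# the optimizer used to train `model` holds state tied to that model's weights.
deep_model.compile(optimizer=tf.keras.optimizers.Adam(),
                   loss=tf.keras.losses.SparseCategoricalCrossentropy(),
                   metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])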

In [None]:
deep_model.fit(train_images, train_labels, epochs=10)

In [None]:
test_loss, test_acc = deep_model.evaluate(test_images, test_labels, verbose=2)

print('Test accuracy:', test_acc)

### Exercise

How many parameters require training in the new model? (Try printing the model summary with `deep_model.summary()`.)

Other questions:
* What was the impact on training time?
* How much did accuracy improve?