## ML Week 5 - Artificial Neural Networks
---
[Top](#ML-Week-5---Artificial-Neural-Networks) | [Previous](#ML-Week-5---Artificial-Neural-Networks) | [Next](#Part-0:-Imports-and-Setup) | [Bottom](#Cheers)

Today's lesson will be on Artificial Neural Networks (ANNs). I'll be trying to cover an extremely broad and complicated topic with the following parts:

* [Part 0: Imports and Setup](#Part-0:-Imports-and-Setup)
* [Part 1: Biological Intuition](#Part-1:-Biological-Intuition-(Optional))
* [Part 2: Neural Network Basics](#Part-2:-Neural-Network-Basics)
* [Part 3: Perceptrons](#Part-3:-Perceptrons)
* [Part 4: From Logistic Regression to ANNs](#Part-4:-From-Logistic-Regression-to-ANNs)
* [Part 5: Deeper Networks](#Part-5:-Deeper-Networks)
* [Part 6: Intro to Computer Vision](#Part-6:-Intro-to-Computer-Vision)
* [Part 7: Bonus Content](#Part-7:-Bonus-Content)

## Part 0: Imports and Setup

---
[Top](#ML-Week-5---Artificial-Neural-Networks) | [Previous](#ML-Week-5---Artificial-Neural-Networks) | [Next](#Part-1:-Biological-Intuition-(Optional)) | [Bottom](#Cheers)

Before we can jump into learning, there are quite a few things to import and some code to set up for visualising our data. Here's a helpful table on these libraries in case you want to dig into them a little bit more.

| Library | Description |
|- | - |
| [matplotlib](https://matplotlib.org/) | Plotting Python functions and code |
| [seaborn](https://seaborn.pydata.org/) | Statistical visualisations in Python |
| [pandas](https://pandas.pydata.org/) | Python library for working with tabular data |
| [numpy](https://www.numpy.org/) | Mathematical manipulations in Python |
| [warnings](https://docs.python.org/3/library/warnings.html) | Regulates warnings printed in Python |
| [datetime](https://docs.python.org/3/library/datetime.html) | Working with date objects in Python |
| [sklearn](https://scikit-learn.org/) | Python library for prepping, building and assessing many machine learning models |
| [keras](https://keras.io/) | Interface for building neural networks in Python |

### Imports and Configs

In [None]:
# Plotting
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
import seaborn as sns

# Tabular data
import pandas as pd
import numpy as np
import warnings
warnings.filterwarnings('ignore')
pd.options.display.float_format = '{:,.2f}'.format
pd.set_option('display.max_rows', 100)
pd.set_option('display.max_columns', 200)

# Dates
from datetime import datetime

# Machine learning
from sklearn.datasets import make_classification, make_moons, make_circles
from sklearn.metrics import confusion_matrix, classification_report, mean_squared_error, mean_absolute_error, r2_score
from sklearn.linear_model import LogisticRegression
from sklearn.utils import shuffle
from sklearn.preprocessing import StandardScaler, LabelEncoder, OneHotEncoder, MinMaxScaler
from sklearn.model_selection import train_test_split, cross_val_score, StratifiedKFold, KFold

# Neural networks
!pip install keras 
from keras.models import Sequential
from keras.layers import Dense, Dropout, BatchNormalization, Activation
from keras.optimizers import Adam
from keras.callbacks import EarlyStopping
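from keras.utils import to_categorical  # keras.utils.np_utils is not importable in recent Keras releases
import keras.backend as K
# The scikit-learn wrapper was removed from recent Keras releases (the separate
# scikeras package provides a replacement); it is not used below, so fall back gracefully
try:
    from keras.wrappers.scikit_learn import KerasClassifier
except ImportError:
    KerasClassifier = None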

### Utility Functions

This cell will house a bunch of utility functions used throughout the notebook. I wouldn't waste too much time trying to understand what's going on here as most of it is just for visualisation or creating quick datasets. A lot of these functions and code come from [this blogpost](https://towardsdatascience.com/applied-deep-learning-part-1-artificial-neural-networks-d7834f67a4f6), which actually does a really good job explaining the basics of neural networks.

We'll be providing a little bit more of a high-level explanation, and move to a more fun example later.

In [None]:
def plot_decision_boundary(func, X, y, figsize=(9, 6)):
    """
    Plots decision boundaries for a classifier
    
    :param func: <function> Prediction function
    :param X: <2d-array like> input data
    :param y: <1d-array like> target data
    :param figsize: <tuple> size of figure
    """
    amin, bmin = X.min(axis=0) - 0.1
    amax, bmax = X.max(axis=0) + 0.1
    hticks = np.linspace(amin, amax, 101)
    vticks = np.linspace(bmin, bmax, 101)
    
    aa, bb = np.meshgrid(hticks, vticks)
    ab = np.c_[aa.ravel(), bb.ravel()]
    c = func(ab)
    cc = c.reshape(aa.shape)

    cm = plt.cm.RdBu
    cm_bright = ListedColormap(['#cc2529', '#396ab1'])
    
    fig, ax = plt.subplots(figsize=figsize)
    contour = plt.contourf(aa, bb, cc, cmap=cm, alpha=0.8)
    
    ax_c = fig.colorbar(contour)
    ax_c.set_label("$P(y = 1)$")
    ax_c.set_ticks([0, 0.25, 0.5, 0.75, 1])
    
    plt.scatter(X[:, 0], X[:, 1], c=y, cmap=cm_bright)
    plt.xlim(amin, amax)
    plt.ylim(bmin, bmax)
    
    
def plot_data(X, y, figsize=None):
    """
    Plots 2-dimensional data
    
    :param X: <2d-array like> input data
    :param y: <1d-array like> target data
    :param figsize: <tuple> size of figure
    """
    if not figsize:
        figsize = (8, 6)
    plt.figure(figsize=figsize)
    plt.plot(X[y==0, 0], X[y==0, 1], 'or', alpha=0.5, label=0)
    plt.plot(X[y==1, 0], X[y==1, 1], 'ob', alpha=0.5, label=1)
    plt.xlim((min(X[:, 0])-0.1, max(X[:, 0])+0.1))
    plt.ylim((min(X[:, 1])-0.1, max(X[:, 1])+0.1))
    plt.legend()

    
def plot_loss_accuracy(history):
    """
    Plot training history (loss and accuracy)
    
    :param history: <object>, model fit output with history and loss
    """
    historydf = pd.DataFrame(history.history, index=history.epoch)
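    # DataFrame.plot opens its own figure, so pass the size to it directly
    historydf.plot(figsize=(8, 6), ylim=(0, max(1, historydf.values.max())))
    loss = history.history['loss'][-1]
    # Older Keras releases log accuracy as 'acc', newer ones as 'accuracy'
    acc_key = 'acc' if 'acc' in history.history else 'accuracy'
    acc = history.history[acc_key][-1]
    plt.title('Loss: %.3f, Accuracy: %.3f' % (loss, acc))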


#Nice function to show a confusion matrix
def plot_confusion_matrix(model, X, y):
    """
    Plot confusion matrix
    
    :param model: <ML model>, model fit
    :param X: <2d-array like> input data
    :param y: <1d-array like> target data
    """
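    # predict_classes was removed from recent Keras releases; thresholding the
    # sigmoid output at 0.5 gives the same labels for a binary classifier
    y_pred = (model.predict(X, verbose=0) > 0.5).astype(int).ravel()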
    plt.figure(figsize=(8, 6))
    sns.heatmap(pd.DataFrame(confusion_matrix(y, y_pred)), annot=True, fmt='d', cmap='YlGnBu', alpha=0.8, vmin=0)


# Simple sinewave generator function
def make_sine_wave():
    """
    Makes a basic sine wave
    """
    c = 3
    num = 2400
    step = num/(c*4)
    np.random.seed(0)
    x0 = np.linspace(-c*np.pi, c*np.pi, num)
    x1 = np.sin(x0)
    noise = np.random.normal(0, 0.1, num) + 0.1
    noise = np.sign(x1) * np.abs(noise)
    x1  = x1 + noise
    x0 = x0 + (np.asarray(range(num)) / step) * 0.3
    X = np.column_stack((x0, x1))
    y = np.asarray([int((i/step)%2 > 1) for i in range(len(x0))])
    return X, y

## Part 1: Biological Intuition (Optional)

---
[Top](#ML-Week-5---Artificial-Neural-Networks) | [Previous](#Part-0:-Imports-and-Setup) | [Next](#Part-2:-Neural-Network-Basics) | [Bottom](#Cheers)

When first learning about neural networks it is common to start with some sort of biological intuition. I personally don't find it useful since a biological network is far more complex than an ANN, but nonetheless here is a picture showing a biological synapse.

<img src="img/bnn.png" width="500">

Vertebrates' brains consist of neurons, and those neurons communicate by exchanging electrical signals. The signals flow through axons that connect to other cells. This process is incredibly complex and not well understood, but it is the basis of how organic creatures learn. If you want to learn more about it, a good start is the [Wikipedia page on synapses](https://en.wikipedia.org/wiki/Synapse) or this [Khan Academy video](https://www.youtube.com/watch?v=Tbq-KZaXiL4).


## Part 2: Neural Network Basics ##

---
[Top](#ML-Week-5---Artificial-Neural-Networks) | [Previous](#Part-1:-Biological-Intuition-(Optional)) | [Next](#Part-3:-Perceptrons) | [Bottom](#Cheers)

### What is a neural network? ###

A neural network is a supervised learning architecture that can approximate a very wide range of mathematical functions. It consists of building blocks we call neurons or perceptrons. These building blocks connect to each other and form distinct layers. A shallow neural network consists of only 3 basic layers:

* Input layer
* Hidden layer
* Output layer

The input layer takes whatever data we want to process and passes it to the hidden layer, which does the computation and forwards the result to the output layer, which produces some sort of prediction. A deep neural network merely has more hidden layers, and we will discuss the advantages of that later.

### What does a neural network look like? ###

A neural network looks like a graph. The graph consists of three types of nodes arranged in layers, which are all interconnected like so:

<img src="img/ann_tom.png" width="661">


### Neural Network Glossary ###

Before diving any deeper into neural networks let's first get a basic understanding of some important terminology. Even if you understand nothing about neural networks by the end of today, if you at least start to grasp the terminology it will help in further study!
            
<div class="row">
<h3 align="left">Inputs</h3>
<img align="left" src="img/glossary/inputs.png" >
<p style="text-align:left;">
This is the data we feed into the net. It could be images, audio recordings or stock prices. Before we feed it into the network we may have to preprocess it into a form that is more readable by the network.
</p>
</div>

<div class="row">
<h3 align="left">Outputs</h3>
<img align="left" src="img/glossary/outputs.png" >
<p style="text-align:left;">
This is the data we get back from the net. It will be formulated from all the output neurons. An example might be a classification of an image being either a dog or a cat.
</p>
</div>

<div class="row">
<h3 align="left">Neurons/Perceptrons</h3>
<img align="left" src="img/glossary/neuron.png" >
<p>
The building block of a neural network. It takes a series of inputs and produces an output.
</p>
</div>

<div class="row">
<h3 align="left">Activation Function</h3>
<img align="left" src="img/glossary/activationfunction.png" >
<p>
All neuron outputs are fed through an activation function. This function helps to introduce non-linearity into the network and can also help keep output values manageable. 
</p>
</div>

<div class="row">
<h3 align="left">Weight Space</h3>
<img align="left" src="img/glossary/weightspace.png" >
<p>
Every connection between neurons has a weight. Weights, biases and the activation function define a neuron's output. The goal of training is to find a set of weights at every layer that lets the network solve our problem effectively. </p>
</div>

<div class="row">
<h3 align="left">Forward Pass</h3>
<img align="left" src="img/glossary/forwardpass.png" >
<p>
A forward pass is when we send an input signal all the way to the output. Each perceptron fires and passes its output to the next layer. We use a forward pass to produce our final outputs.
</p>
</div>

<div class="row">
<h3 align="left">Error</h3>
<img align="left" src="img/glossary/error.png" >
<p>
We use the error between the predicted outputs and the known outputs to evaluate our model. As we train the network we aim to minimise this error.
 </p>
</div>

<div class="row">
<h3 align="left">Backpropagation</h3>
<img align="left" src="img/glossary/backpropagation.png" >
<p>
During the training process we employ the backpropagation algorithm. This algorithm moves from the output layer back to the input layer, working out how much each weight and bias contributed to the error so that all the neuron parameters can be updated. As training goes on the network loss should decrease. This is quite a complicated mathematical concept and requires calculus. More information on the mathematical aspect, called gradient descent, can be found <a href="https://en.wikipedia.org/wiki/Gradient_descent">on Wikipedia</a>.

</p>
</div>

<div class="row">
<h3 align="left">Hyperparameters</h3>
<img align="left" src="img/glossary/hypers.png" >
<p>
Hyperparameters are configuration values that control how the network is built and trained. Examples include the number of layers, the number of neurons, how many iterations of training we do and what activation functions we use.</p>
</div>
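
To tie a few of these glossary terms together, here is a minimal NumPy sketch of a forward pass through a tiny shallow network, followed by an error calculation. The layer sizes, the made-up inputs, the randomly drawn weights and the choice of mean squared error are all assumptions made for illustration; this is not how Keras implements things internally.

```python
import numpy as np

def sigmoid(z):
    """Activation function: squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Inputs: 4 examples with 2 features each (made-up numbers)
X = np.array([[0.0, 0.0],
              [0.0, 1.0],
              [1.0, 0.0],
              [1.0, 1.0]])
y_true = np.array([0.0, 1.0, 1.0, 0.0])

# Weights and biases: 2 inputs -> 3 hidden neurons -> 1 output neuron
rng = np.random.default_rng(0)
W_hidden = rng.normal(size=(2, 3))
b_hidden = np.zeros(3)
W_out = rng.normal(size=(3, 1))
b_out = np.zeros(1)

# Forward pass: each layer multiplies by its weights, adds its bias,
# then applies the activation function
hidden = sigmoid(X @ W_hidden + b_hidden)          # shape (4, 3)
y_pred = sigmoid(hidden @ W_out + b_out).ravel()   # shape (4,)

# Error: mean squared difference between predictions and known outputs
error = np.mean((y_pred - y_true) ** 2)

print("hidden layer output shape:", hidden.shape)
print("predictions:", np.round(y_pred, 3))
print("error:", round(float(error), 3))
```

The weights here are random, so the predictions are meaningless; training is what adjusts them to shrink the error.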


## Part 3: Perceptrons

---
[Top](#ML-Week-5---Artificial-Neural-Networks) | [Previous](#Part-2:-Neural-Network-Basics) | [Next](#Part-4:-From-Logistic-Regression-to-ANNs) | [Bottom](#Cheers)

Perceptrons are the basic building block of our neural networks. Their purpose is simple: transform an input into an output.

<img src="img/perceptron.png">

The process can be summed up in English as:

1. Multiply each input by its weight and sum the results
2. Add 1 times some bias weight so we can shift the function
3. Feed this result through the activation function
4. Produce the output

And for those of us who are more mathematically and visually inclined:

<img src="img/matrix.png">

This process is actually the same as logistic regression with multiple features and training examples (when the activation function is a sigmoid). We did logistic regression a couple of weeks ago.
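
The four steps above can be sketched in a few lines of NumPy. The input values, weights and bias below are made up for illustration, and a sigmoid is assumed as the activation function.

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def perceptron(inputs, weights, bias):
    weighted_sum = np.dot(inputs, weights)   # step 1: multiply inputs by weights and sum
    shifted = weighted_sum + 1.0 * bias      # step 2: add 1 times the bias weight
    return sigmoid(shifted)                  # steps 3 and 4: activate and output

# Hypothetical values for a perceptron with three inputs
inputs = np.array([0.5, -1.0, 2.0])
weights = np.array([0.4, 0.3, -0.2])
bias = 0.1

output = perceptron(inputs, weights, bias)
print(round(float(output), 3))  # weighted sum is -0.5, plus bias gives -0.4, sigmoid gives about 0.401
```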

## Part 4: From Logistic Regression to ANNs

---
[Top](#ML-Week-5---Artificial-Neural-Networks) | [Previous](#Part-3:-Perceptrons) | [Next](#Part-5:-Deeper-Networks) | [Bottom](#Cheers)

Last week you learned how you can use logistic regression to solve linearly separable binary classification problems. Below we have a recap example showing how effective this is. We first make a scatter plot of two classes of points and then look at the decision boundary the classifier finds. Let's pretend we have data from two species of baby snakes that we have caught to research. Each snake has two distinguishing features...

1. The length of the snake (in cm)
2. The number of stripes on the snake's body

The goal of the classifier would be to predict the species of snake based upon the length and the number of stripes.

In [None]:
# Make the data
X, y = make_classification(n_samples=1000, n_features=2, n_redundant=0, 
                           n_informative=2, random_state=7, n_clusters_per_class=1)

# Shift and scale the data so the values look like plausible measurements
X = (X + 2) * 4 + 4

plot_data(X, y)
plt.title("Caught Baby Snakes")
plt.ylabel("Length (cm)")
plt.xlabel("Number of Stripes")
plt.gca().legend(('Tiger Snake','Eastern Brown Snake'))

lr = LogisticRegression()
lr.fit(X, y)
print('LR coefficients:', lr.coef_)
print('LR intercept:', lr.intercept_)

plot_data(X, y)

limits = np.array([-2, 40])
boundary = -(lr.coef_[0][0] * limits + lr.intercept_[0]) / lr.coef_[0][1]
plt.title("Caught Baby Snakes")
plt.ylabel("Length (cm)")
plt.xlabel("Number of Stripes")
plt.gca().legend(('Tiger Snake','Eastern Brown Snake'))
plt.plot(limits, boundary, "g-", linewidth=2)

#### Mini Exercise

This data is called **linearly separable**. Why do you think it's called that?

#### Mini Exercise (Beastmode)

Tweak the above code to produce a dataset which isn't linearly separable and observe the results.

### Neural Network Replacement

We can solve the same problem that logistic regression solves using a neural network. We use Keras to construct a network with a single dense layer with one neuron in it. The neuron takes an input array of size 2, one value per feature.

In [None]:
# Construct a sequential network
model = Sequential()

# Add a dense layer
model.add(Dense(units=1, input_shape=(2,), activation='sigmoid'))

# Compile the network and define our optimisation algorithm, loss function and metrics
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Train the model for 50 epochs (full passes over the data)
history = model.fit(x=X, y=y, verbose=0, epochs=50)

# Visualise the training
plot_loss_accuracy(history)

# Visualise predicted boundary
plot_decision_boundary(lambda x: model.predict(x), X, y)
plt.title("Caught Baby Snakes")
plt.ylabel("Length (cm)")
plt.xlabel("Number of Stripes")
print()

Remember our old steps in fitting a model? Something that looked like...


```python
classifier = LogisticRegression()
classifier.fit(X, y)
```


This code is doing the same thing using keras, which is a different library than sklearn. The above code has been replaced by...


```python
# Add a dense layer
model.add(Dense(units=1, input_shape=(2,), activation='sigmoid'))

# Compile the network and define our optimisation algorithm, loss function and metrics
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Train the model for 50 epochs (full passes over the data)
history = model.fit(x=X, y=y, verbose=0, epochs=50)
```

It's a little bit more complicated, but the syntax from keras will actually let us make a more complicated model. More on this later...
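
To give a rough feel for what the `loss='binary_crossentropy'` and `fit` calls are doing, here is a NumPy sketch that trains a single sigmoid neuron by hand. It makes several assumptions: the toy data is invented (it is not the snake dataset), the learning rate is arbitrary, and it uses plain gradient descent rather than the Adam optimiser, so treat it as a conceptual illustration rather than a description of Keras internals.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Average negative log-likelihood of the true labels."""
    y_prob = np.clip(y_prob, eps, 1 - eps)  # avoid log(0)
    return -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))

# Hypothetical linearly separable data: two blobs of 100 points each
rng = np.random.default_rng(0)
X_toy = np.vstack([rng.normal(-2, 1, size=(100, 2)),
                   rng.normal(2, 1, size=(100, 2))])
y_toy = np.concatenate([np.zeros(100), np.ones(100)])

# One neuron: two weights and a bias, all starting at zero
w = np.zeros(2)
b = 0.0
learning_rate = 0.1  # arbitrary choice for this sketch

losses = []
for epoch in range(50):
    y_prob = sigmoid(X_toy @ w + b)                # forward pass
    losses.append(binary_crossentropy(y_toy, y_prob))
    grad = y_prob - y_toy                          # gradient of the loss w.r.t. the pre-activation
    w -= learning_rate * (X_toy.T @ grad) / len(y_toy)
    b -= learning_rate * grad.mean()

accuracy = np.mean((sigmoid(X_toy @ w + b) > 0.5) == y_toy)
print("first loss: %.3f, last loss: %.3f, accuracy: %.3f" % (losses[0], losses[-1], accuracy))
```

With all parameters at zero every prediction starts at 0.5, so the first loss is ln(2), roughly 0.693, and it falls as the weights are updated.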

### Ponder this

1. Why are we able to learn this classifier so quickly?

2. Why are we able to learn at all given we only have 1 neuron?

3. What happens to the decision boundary if we run the same code multiple times?

4. What happens when we vary the number of epochs? Do more epochs help, or can we get away with a lot fewer?

Try experimenting with some of the other hyperparameters and read up about the layers [on the Keras website](https://keras.io/layers/core/).

### Slightly More Complex Problems

As we saw, the simple neural network is fantastic at solving the linearly separable classification problem, but how does it fare on a more complex problem?

In [None]:
X, y = make_circles(n_samples=1000, noise=0.05, factor=0.3, random_state=0)
plot_data(X, y)

Now let's try to train a model using a single layer.

In [None]:
# Same model as before
model = Sequential()
model.add(Dense(1, input_shape=(2,), activation='sigmoid'))

model.compile('adam', 'binary_crossentropy', metrics=['accuracy'])

# Train model for 100 epochs
history = model.fit(X, y, verbose=0, epochs=100)
plot_loss_accuracy(history)

plot_decision_boundary(lambda x: model.predict(x), X, y)


As we can see, the model performs very poorly. A single neuron can only produce a linear boundary, so the network still tries to separate the circles with a straight line. Clearly our network is not complex enough to learn how to solve this problem!
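
To see why no amount of training can rescue the single neuron here, the sketch below tries a couple of thousand random straight-line boundaries on two concentric circles and keeps the best accuracy. The data is a noise-free stand-in built by hand (outer radius 1, inner radius 0.3), not the output of `make_circles`, and random search is used purely as an illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hand-built concentric circles: outer ring is class 0, inner ring is class 1
angles_outer = rng.uniform(0, 2 * np.pi, n)
angles_inner = rng.uniform(0, 2 * np.pi, n)
outer = np.column_stack([np.cos(angles_outer), np.sin(angles_outer)])
inner = 0.3 * np.column_stack([np.cos(angles_inner), np.sin(angles_inner)])
X_c = np.vstack([outer, inner])
y_c = np.concatenate([np.zeros(n), np.ones(n)])

# Try many random linear boundaries w.x + b = 0 and record the best accuracy
best = 0.0
for _ in range(2000):
    w = rng.normal(size=2)
    b = rng.normal()
    pred = (X_c @ w + b > 0).astype(float)
    best = max(best, np.mean(pred == y_c))

print("best accuracy from 2000 random lines: %.3f" % best)
```

However the line is placed, it either cuts through the inner circle or leaves most of the outer ring on the wrong side, so the best accuracy stays well short of 100%.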

## Part 5: Deeper Networks

---
[Top](#ML-Week-5---Artificial-Neural-Networks) | [Previous](#Part-4:-From-Logistic-Regression-to-ANNs) | [Next](#Part-6:-Intro-to-Computer-Vision) | [Bottom](#Cheers)

A **deep** network adds **hidden layers** to the model. The deeper the network, the _more layers_ it has. This branch of machine learning is often referred to as "Deep Learning". As we saw before, our shallow networks weren't able to separate points which aren't linearly separable.

### Exercise 

Try to solve the previous circle problem by adding another layer (or layers) to the network. Then try changing some of the hyperparameters.



In [None]:
# Same model as before
model = Sequential()

# ADD MORE LAYERS HERE. Here's a clue:
# model.add(Dense(10, input_shape=(2,), activation='sigmoid'))

# Output layer
model.add(Dense(1, input_shape=(2,), activation='sigmoid'))

model.compile('adam', 'binary_crossentropy', metrics=['accuracy'])

# Train model for 100 epochs
history = model.fit(X, y, verbose=0, epochs=100)
plot_loss_accuracy(history)

plot_decision_boundary(lambda x: model.predict(x), X, y)

So we've solved one non-linear problem, but it still seems relatively trivial.

### Optional: More complex models (sine curve)

Let's construct an even more complex problem. A [sine curve](https://en.wikipedia.org/wiki/Sine_wave) shows a periodic oscillation, for example the oscillating voltage along a single-phase AC power line. Can we use a neural network to classify different sections of a sine curve? The red class denotes the power-up part of the cycle and the blue class represents the power-down part with respect to time.

In [None]:
# Create sine wave
X, y = make_sine_wave()

plot_data(X, y, figsize=(10, 8))
plt.xlabel("Time in milliseconds")
plt.ylabel("Voltage (V)")
print("")

Let's now create a model to recognise these oscillation bends.

In [None]:
# Construct standard sequential network
model = Sequential()
model.add(Dense(64, input_shape=(2,), activation='tanh'))
model.add(Dense(64, activation='tanh'))
model.add(Dense(64, activation='tanh'))
model.add(Dense(1, activation='sigmoid'))

# Compile model
model.compile('adam', 'binary_crossentropy', metrics=['accuracy'])

# Fit
history = model.fit(X, y, verbose=0, epochs=50)

# Plot decision boundary
plot_decision_boundary(lambda x: model.predict(x), X, y, figsize=(12, 9))

# Plot accuracy
plot_loss_accuracy(history)

# Plot metrics and confusion matrix
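# predict_classes was removed from recent Keras releases; threshold the sigmoid output instead
y_pred = (model.predict(X, verbose=0) > 0.5).astype(int).ravel()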
print(classification_report(y, y_pred))
plot_confusion_matrix(model, X, y)

### Recap and extra resources

So far we've used nets to very accurately solve some mathematical problems. You may be wondering about some of the intricacies of how the network works.

Some main things to look into in your own time are:

 * [Why are activations useful, what is relu, and why is relu so powerful?](https://towardsdatascience.com/activation-functions-and-its-types-which-is-better-a9a5310cc8f)
 * [Why do we use the adam optimiser?](https://towardsdatascience.com/adam-latest-trends-in-deep-learning-optimization-6be9a291375c) If you're feeling really smart you might try the [paper](https://arxiv.org/pdf/1412.6980.pdf).
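
As a starting point for the first question, here is a small NumPy sketch of the three activation functions that appear in this notebook (`sigmoid`, `tanh` and `relu`), applied to a handful of made-up input values so you can compare how each one reshapes its input.

```python
import numpy as np

def sigmoid(z):
    """Squashes values into (0, 1); used on our output layers."""
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    """Passes positive values through unchanged and zeroes out negatives."""
    return np.maximum(0.0, z)

# A few sample pre-activation values
z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])

print("input:  ", z)
print("sigmoid:", np.round(sigmoid(z), 3))
print("tanh:   ", np.round(np.tanh(z), 3))   # squashes values into (-1, 1)
print("relu:   ", relu(z))
```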

## Part 6: Intro to Computer Vision

---
[Top](#ML-Week-5---Artificial-Neural-Networks) | [Previous](#Part-5:-Deeper-Networks) | [Next](#Part-7:-Bonus-Content) | [Bottom](#Cheers)

One of the strongest applications of deep learning is in computer vision. To illustrate one use case we'll employ the [Fashion-MNIST dataset](https://github.com/zalandoresearch/fashion-mnist) to try and classify what kind of clothing item an image shows when it is fed into the network.

In [None]:
from __future__ import absolute_import, division, print_function

# TensorFlow and tf.keras
import tensorflow as tf
from tensorflow import keras

# Helper libraries
import numpy as np
import matplotlib.pyplot as plt

print(tf.__version__)

#Load the prebaked fashion mnist dataset
fashion_mnist = keras.datasets.fashion_mnist

(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat', 
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

Let's have a little investigation into the prepackaged data. A useful NumPy array attribute is `shape`; let's use it to tell us how much data we are working with.

In [None]:
print("The shape of train image dataset is {}".format(train_images.shape))
print("There are {} labels".format(len(train_labels)))
print("The shape of test image dataset is {}".format(test_images.shape))

### Inspecting the data

In [None]:
plt.figure()
plt.imshow(train_images[0])
plt.colorbar()
plt.grid(False)
plt.show()

We'll need to do some pre-processing on our data to make it easier to use within our neural network. The following will scale the pixel values from 0-255 to 0-1 so it's easier for the network to process.

In [None]:
train_images = train_images / 255.0
test_images = test_images / 255.0
plt.figure()
plt.imshow(train_images[0])
plt.colorbar()
plt.grid(False)
plt.show()

Let's visualise some of the data to see what we're working with. As you can see we have greyscale images with a resolution of 28 by 28 pixels. There are 10 different classes within this dataset, specifying the type of clothing within the picture.

In [None]:
plt.figure(figsize=(10,10))
for i in range(25):
    plt.subplot(5,5,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(train_images[i], cmap=plt.cm.binary)
    plt.xlabel(class_names[train_labels[i]])
plt.show()

Similar to before, we make a simple neural network and use the adam optimiser to train for 5 epochs to classify the type of clothing based upon the image.

In [None]:
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation=tf.nn.relu),
    keras.layers.Dense(10, activation=tf.nn.softmax)
])

#Compile model
model.compile(optimizer='adam', 
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
#Train the model for 5 epochs
model.fit(train_images, train_labels, epochs=5)
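
To unpack what the layers and loss above are doing, here is a NumPy sketch of the main pieces: flattening a 28 by 28 image, counting the trainable parameters in the two dense layers, turning 10 raw output scores into probabilities with softmax, and computing the cross-entropy for one integer label. The blank image and the score values are made up for illustration, and the functions are simplified stand-ins rather than the Keras implementations.

```python
import numpy as np

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = scores - np.max(scores)   # subtract the max for numerical stability
    exps = np.exp(shifted)
    return exps / exps.sum()

# Flatten: a 28x28 image becomes a vector of 784 pixel values
image = np.zeros((28, 28))              # placeholder image
flat = image.reshape(-1)
print("flattened length:", flat.shape[0])

# Parameter count: weights plus biases for each dense layer
n_params = (784 * 128 + 128) + (128 * 10 + 10)
print("trainable parameters:", n_params)

# Softmax over 10 made-up output scores, one per clothing class
scores = np.array([2.0, 1.0, 0.1, 0.0, -1.0, 0.5, 0.3, -0.5, 1.5, 0.2])
probs = softmax(scores)
print("probabilities sum to:", round(float(probs.sum()), 6))
print("most likely class index:", int(np.argmax(probs)))

# Sparse categorical cross-entropy for one example: the label is an integer,
# and the loss is the negative log of the probability given to that class
true_label = 0
loss = -np.log(probs[true_label])
print("loss for this example: %.3f" % loss)
```

The "sparse" in `sparse_categorical_crossentropy` refers to the labels being plain integers like `true_label` here, rather than one-hot encoded vectors.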

Let's see how well our model did on the test dataset.

In [None]:
test_loss, test_acc = model.evaluate(test_images, test_labels)
print('Test accuracy:', test_acc)

We can inspect the predictions visually, along with the predicted probability for each class, using the following functions.

In [None]:
def plot_image(i, predictions_array, true_label, img):
    """
    Plot an image
    
    :param i: <int>, the index of the image
    :param predictions_array: <np.array>, predicted class probabilities for each image
    :param true_label: list<int> the actual image labels
    :param img: <np.array>, the actual image
    """
    predictions_array, true_label, img = predictions_array[i], true_label[i], img[i]
    plt.grid(False)
    plt.xticks([])
    plt.yticks([])

    plt.imshow(img, cmap=plt.cm.binary)

    predicted_label = np.argmax(predictions_array)
    if predicted_label == true_label:
        color = 'blue'
    else:
        color = 'red'

    plt.xlabel("{} {:2.0f}% ({})".format(class_names[predicted_label],
                                100*np.max(predictions_array),
                                class_names[true_label]),
                                color=color)

def plot_value_array(i, predictions_array, true_label):
    """
    Plot a bar chart of likely classes.
    
    :param i: <int>, the index of the image
    :param predictions_array: <np.array>, predicted class probabilities for each image
    :param true_label: list<int> the actual image labels
    """
    predictions_array, true_label = predictions_array[i], true_label[i]
    plt.grid(False)
    plt.xticks([])
    plt.yticks([])
    thisplot = plt.bar(range(10), predictions_array, color="#777777")
    plt.ylim([0, 1]) 
    predicted_label = np.argmax(predictions_array)

    thisplot[predicted_label].set_color('red')
    thisplot[true_label].set_color('blue')

Now let's try to predict an image's clothing type from the test dataset, and visualise the output.

In [None]:
# Grab an image from the test dataset
test_image_num = 1

img = test_images[test_image_num]
print(img.shape)
# Add the image to a batch where it's the only member.
img = (np.expand_dims(img,0))

# Make predictions on all images
predictions = model.predict(test_images)

plt.figure(figsize=(6,3))
plot_image(test_image_num, predictions, test_labels, test_images)
plt.show()

print(img.shape)
predictions_single = model.predict(img)

# Print the raw prediction matrix
print(predictions_single)

# Display all the class predictions (the batch holds one image, so slice out its own true label)
plot_value_array(0, predictions_single, test_labels[test_image_num:test_image_num + 1])
_ = plt.xticks(range(10), class_names, rotation=45)

## Part 7: Bonus Content

---
[Top](#ML-Week-5---Artificial-Neural-Networks) | [Previous](#Part-6:-Intro-to-Computer-Vision) | [Next](#Cheers) | [Bottom](#Cheers)

Here's some more about computer vision processes.

<img src="img/vision.png" width="661">


## Cheers

---
[Top](#ML-Week-5---Artificial-Neural-Networks) | [Previous](#Part-7:-Bonus-Content) | [Next](#Cheers) | [Bottom](#Cheers)

Any questions, hit me up. My AI-relevant experience is particularly in:
* Deep neural networks and their varieties
* Computer vision, classical and deep
* Robotics and self-driving vehicles

**Please complete the week 5 survey here**: [bit.ly/ml_week_5_survey](https://docs.google.com/forms/d/e/1FAIpQLSeW4SELSovBB0iIo0gi42pFQEDSJxZB46gJN4NaSCNuVVeJLw/viewform?usp=sf_link)