# CS-344: Homework 04 - Classification

by Joseph Jinn

### For kicks and giggles:

<img style="float:center; transform: rotate(0deg); margin: 0 10px 10px 0" src="ml_meme.png" />

# Part 1: Deep Neural Networks - Bust or Breakthrough?

I expect deep neural networks to remain relevant for at least my own lifetime.  Google, which we all know and love, uses deep neural networks in standard features of its search products, such as Google Images.  If giant tech corporations are applying machine learning in their commercial products, I doubt it is going away anytime soon, at least as long as it remains profitable.

There is a steady stream of ongoing research involving machine learning.  One example is my summer 2019 research with Professor VanderLinden.  Another is AlphaStar, the StarCraft II AI that is capable of defeating professional players.  Perusing the deepmind.com blog, I saw articles describing research on breast cancer screening via machine learning conducted by Google DeepMind Health.  (Google acquired DeepMind a while back, which I was not aware of.)  Another post described how machine learning was applied to Google’s wind farms to predict optimal hourly delivery commitments to a power grid a full day in advance.  The list goes on.

With continuing improvements in technology, we will have access to more powerful CPUs and GPUs, more RAM/VRAM, etc.  This will only increase the viability of machine learning as an established field with theoretical and practical applications.  It will become more computationally feasible to train models with many (deep) layers and a large number of nodes.  At Calvin, we have the “Borg” supercomputer with 4 Nvidia Titans.  TensorFlow and other machine learning APIs are highly parallelizable and scalable.  Keras, a more user-friendly API capable of running on top of TensorFlow, allows for fast prototyping, so even a layperson could become involved after a simple tutorial or two.

I believe that machine learning is still in its adolescent phase and has yet to mature.  Anyone with a system with decent specifications could install the necessary software and train a machine learning model to predict something.  This isn’t some prohibitively inaccessible field where you need millions of dollars and a Ph.D. from UC-Berkeley in order to get started.  You don’t necessarily need a degree in Statistics to understand the results either.  Google’s Machine Learning Crash Course will do just fine.  In conclusion, I expect deep neural networks to be a breakthrough that lasts throughout the 21st century, if not beyond.



# Part 2: Back-Propagation Cycle - Hand Calculation

### Note:  The images are cut off at the top and bottom.  I have the .png's for these in the Homework04 directory of my repository.  I will also probably have turned in a hard-copy to you before the due date. =)

### Page 1:

<img style="float:center; transform: rotate(90deg); margin: 0 10px 10px 0" src="hw04-part2-page1.jpg" />

### Page 2:

<img style="float:center; transform: rotate(90deg); margin: 0 10px 10px 0" src="hw04-part2-page2.jpg" />

### Page 3:

<img style="float:center; transform: rotate(90deg); margin: 0 10px 10px 0" src="hw04-part2-page3.jpg" />
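As a companion to the hand calculation on the pages above, the following is a minimal sketch of one back-propagation cycle in code.  It assumes a hypothetical chain of one input, one sigmoid hidden unit, and one sigmoid output with a squared-error loss.  The weights, input, target, and learning rate are illustrative values only and are not the ones from the handwritten pages.

```python
import math


def sigmoid(z):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w1, w2):
    """Forward pass: input -> sigmoid hidden unit -> sigmoid output."""
    h = sigmoid(w1 * x)
    y = sigmoid(w2 * h)
    return h, y


def loss(y, target):
    """Squared-error loss for a single example."""
    return 0.5 * (y - target) ** 2


def backprop_step(x, target, w1, w2, learning_rate):
    """Run one forward/backward cycle; return updated weights and gradients."""
    h, y = forward(x, w1, w2)
    # Output layer: dL/dz2 = (y - target) * sigmoid'(z2), where sigmoid' = y * (1 - y).
    delta_out = (y - target) * y * (1.0 - y)
    grad_w2 = delta_out * h
    # Hidden layer: propagate the error back through w2 and the hidden sigmoid.
    delta_hidden = delta_out * w2 * h * (1.0 - h)
    grad_w1 = delta_hidden * x
    # Gradient descent update.
    new_w1 = w1 - learning_rate * grad_w1
    new_w2 = w2 - learning_rate * grad_w2
    return new_w1, new_w2, grad_w1, grad_w2


# Hypothetical values, chosen only for illustration.
x, target, w1, w2 = 1.0, 1.0, 0.5, 0.5
loss_before = loss(forward(x, w1, w2)[1], target)
new_w1, new_w2, grad_w1, grad_w2 = backprop_step(x, target, w1, w2, learning_rate=0.5)
loss_after = loss(forward(x, new_w1, new_w2)[1], target)
print("loss before:", loss_before)
print("loss after: ", loss_after)
```

Comparing the analytic gradients against finite differences is a quick way to check a hand calculation of this kind.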

# Part 3: Keras Fashion MNIST Dataset - Keras-based Convolutional Neural Network

## Information concerning the Fashion MNIST Dataset:
    
Similar to the MNIST digit dataset, the Fashion MNIST dataset includes:

60,000 training examples<br>
10,000 testing examples<br>
10 classes<br>
28×28 grayscale/single channel images<br>

The ten fashion class labels are:

T-shirt/top<br>
Trouser/pants<br>
Pullover shirt<br>
Dress<br>
Coat<br>
Sandal<br>
Shirt<br>
Sneaker<br>
Bag<br>
Ankle boot<br>
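The code cell below scales the 8-bit pixel values and one-hot encodes the integer labels before training.  The following is a minimal pure-Python sketch of those two steps for a single value; `scale_pixel` and `one_hot` are hypothetical helper names used only for illustration (the notebook itself uses NumPy array division and `np_utils.to_categorical`).

```python
# Class names in label order, as used by the notebook's class_names list.
CLASS_NAMES = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']


def scale_pixel(value):
    """Map an 8-bit pixel intensity (0-255) to the range [0, 1]."""
    return value / 255.0


def one_hot(label, num_classes=len(CLASS_NAMES)):
    """Return a list with a 1 at the label's index and 0 everywhere else."""
    if not 0 <= label < num_classes:
        raise ValueError("label out of range")
    return [1 if index == label else 0 for index in range(num_classes)]


encoded = one_hot(5)
print(encoded, "->", CLASS_NAMES[encoded.index(1)])
print(scale_pixel(0), scale_pixel(255))
```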

In [1]:
"""
Course: CS 344 - Artificial Intelligence
Instructor: Professor VanderLinden
Name: Joseph Jinn
Date: 4-1-19

Homework 4 - Classification
Keras Fashion MNIST Dataset - Keras-based Convolutional Neural Network

Notes:

I took material from two tutorials and sort of meshed them together.

I don't pretend to understand everything that's happening, but it seems to be a working CNN.

############################################

Resources Used:

URL: https://www.tensorflow.org/tutorials/keras/basic_classification
(Keras Tensorflow tutorial)

URL: https://developers.google.com/machine-learning/practica/image-classification/
(Google Crash Course)

URL: https://www.markdownguide.org/basic-syntax/
(Markdown syntax for Jupyter Notebook)

URL: https://www.pyimagesearch.com/2019/02/11/fashion-mnist-with-keras-and-deep-learning/
(Keras Fashion MNIST Tutorial)

URL: https://www.pyimagesearch.com/2018/12/31/keras-conv2d-and-convolutional-layers/
(Keras Convolutional Layers)

############################################

Assignment Instructions:

Build a Keras-based ConvNet for Keras’s Fashion MNIST dataset (fashion_mnist). Experiment with different network
architectures, submit your most performant network, and report the results.

"""
############################################################################################

from __future__ import absolute_import, division, print_function

# TensorFlow and tf.keras
import tensorflow as tf
from tensorflow import keras

# import the necessary packages
from keras.models import Sequential
from keras.layers.normalization import BatchNormalization
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.layers.core import Activation
from keras.layers.core import Flatten
from keras.layers.core import Dropout
from keras.layers.core import Dense
from keras import backend as K
from keras.optimizers import SGD
from keras.utils import np_utils

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import classification_report

# Initialize the number of epochs to train for, base learning rate, and batch size.
NUM_EPOCHS = 25
LEARNING_RATE = 1e-2
BATCH_SIZE = 32

print(tf.__version__)

############################################################################################
"""
Label	Class
0	T-shirt/top
1	Trouser
2	Pullover
3	Dress
4	Coat
5	Sandal
6	Shirt
7	Sneaker
8	Bag
9	Ankle boot
"""
# Load the dataset.
fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

# Class names, indexed by integer label.
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

print("Training data shape:")
print(train_images.shape)
print("Training targets shape:")
print(train_labels.shape)
print("Testing data shape:")
print(test_images.shape)
print("Testing targets shape:")
print(test_labels.shape)
print()

# Display 1st image in training set.
print("Training images[0]:")
plt.figure()
plt.imshow(train_images[0])
plt.colorbar()
plt.grid(False)
plt.show()

############################################################################################
"""
Pre-process dataset.
"""

# Scale data to the range of [0, 1].
train_images = train_images.astype("float32") / 255.0
test_images = test_images.astype("float32") / 255.0

# Display first 25 images in training set.
plt.figure(figsize=(10, 10))
for i in range(25):
    plt.subplot(5, 5, i + 1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(train_images[i], cmap=plt.cm.binary)
    plt.xlabel(class_names[train_labels[i]])
plt.show()

# If we are using "channels first" ordering, then reshape the design matrix such that the matrix is:
# num_samples x depth x rows x columns
if K.image_data_format() == "channels_first":
    train_images_reshaped = train_images.reshape((train_images.shape[0], 1, 28, 28))
    test_images_reshaped = test_images.reshape((test_images.shape[0], 1, 28, 28))

# Otherwise, we are using "channels last" ordering, so the design matrix shape should be:
# num_samples x rows x columns x depth
else:
    train_images_reshaped = train_images.reshape((train_images.shape[0], 28, 28, 1))
    test_images_reshaped = test_images.reshape((test_images.shape[0], 28, 28, 1))

print("Training data shape after reshape:")
print(train_images_reshaped.shape)
print("Testing data shape after reshape:")
print(test_images_reshaped.shape)
print()

# One-hot encode the training and testing labels.
train_labels_one_hot = np_utils.to_categorical(train_labels, 10)
test_labels_one_hot = np_utils.to_categorical(test_labels, 10)

print("Training targets shape after one-hot encoding:")
print(train_labels_one_hot.shape)
print("Testing targets shape after one-hot encoding:")
print(test_labels_one_hot.shape)
print()

############################################################################################
"""
tf.keras.layers.Flatten - Flattens the input. Does not affect the batch size. (2-d to 1-d array)

tf.keras.layers.Dense - densely-connected, or fully-connected, neural layers.

10-node softmax layer—this returns an array of 10 probability scores that sum to 1.

Pooling layers help to progressively reduce the spatial dimensions of the input volume.

Batch normalization seeks to normalize the activations of a given input volume before passing it into the next layer. 
It has been shown to be effective at reducing the number of epochs required to train a CNN at the expense of an 
increase in per-epoch time.

Dropout is a form of regularization that aims to prevent overfitting. 
Random connections are dropped to ensure that no single node in the network is responsible for activating 
when presented with a given pattern.

The network ends with a fully-connected layer followed by a softmax classifier.
The softmax classifier is used to obtain output classification probabilities.
"""


def build(width, height, depth, classes):
    """
    Builds the convolutional neural network model.

    :param width:  width of the image in pixels.
    :param height:  height of the image in pixels.
    :param depth:  number of image channels (1 for the grayscale Fashion MNIST images).
    :param classes: number of possible labels (output classes).
    :return: the constructed Keras model.
    """

    # Initialize the model, with the input shape defaulting to "channels last" and the channels dimension itself.
    my_model = Sequential()
    input_shape = (height, width, depth)
    channel_dimensions = -1

    # If we are using "channels first", update the input shape and channels dimension.
    if K.image_data_format() == "channels_first":
        input_shape = (depth, height, width)
        channel_dimensions = 1

    # First (CONV => RELU => BN) * 2 => POOL => DROPOUT layer set.
    my_model.add(Conv2D(32, (3, 3), padding="same",
                        input_shape=input_shape))
    my_model.add(Activation("relu"))
    my_model.add(BatchNormalization(axis=channel_dimensions))
    my_model.add(Conv2D(32, (3, 3), padding="same"))
    my_model.add(Activation("relu"))
    my_model.add(BatchNormalization(axis=channel_dimensions))
    my_model.add(MaxPooling2D(pool_size=(2, 2)))
    my_model.add(Dropout(0.25))

    # Second (CONV => RELU => BN) * 2 => POOL => DROPOUT layer set.
    my_model.add(Conv2D(64, (3, 3), padding="same"))
    my_model.add(Activation("relu"))
    my_model.add(BatchNormalization(axis=channel_dimensions))
    my_model.add(Conv2D(64, (3, 3), padding="same"))
    my_model.add(Activation("relu"))
    my_model.add(BatchNormalization(axis=channel_dimensions))
    my_model.add(MaxPooling2D(pool_size=(2, 2)))
    my_model.add(Dropout(0.25))

    # First (and only) FC => RELU => BN => DROPOUT layer set.
    my_model.add(Flatten())
    my_model.add(Dense(512))
    my_model.add(Activation("relu"))
    my_model.add(BatchNormalization())
    my_model.add(Dropout(0.5))

    # Softmax classifier.
    my_model.add(Dense(classes))
    my_model.add(Activation("softmax"))

    # Return the constructed network architecture.
    return my_model


"""
Loss function —This measures how accurate the model is during training. 
We want to minimize this function to "steer" the model in the right direction.

Optimizer —This is how the model is updated based on the data it sees and its loss function.

Metrics —Used to monitor the training and testing steps. The following example uses accuracy, 
the fraction of the images that are correctly classified
"""

# Initialize the optimizer and the model.
opt = SGD(lr=LEARNING_RATE, momentum=0.9, decay=LEARNING_RATE / NUM_EPOCHS)
model = build(width=28, height=28, depth=1, classes=10)
model.compile(loss="categorical_crossentropy", optimizer=opt, metrics=["accuracy"])

"""
Feed the training data to the model—in this example, the train_images and train_labels arrays.

The model learns to associate images and labels.

We ask the model to make predictions about a test set—in this example, the test_images array. 
We verify that the predictions match the labels from the test_labels array.
"""

# Train the model.
train_model = model.fit(train_images_reshaped, train_labels_one_hot,
                        validation_data=(test_images_reshaped, test_labels_one_hot),
                        batch_size=BATCH_SIZE, epochs=NUM_EPOCHS)

############################################################################################
"""
Evaluate accuracy.
"""

test_loss, test_acc = model.evaluate(test_images_reshaped, test_labels_one_hot)

print()
print('Test dataset accuracy:', test_acc)
print('Test dataset loss:', test_loss)
print()

############################################################################################
"""
Make predictions.
"""
predictions = model.predict(test_images_reshaped)


############################################################################################

def plot_image(i, predictions_array, true_label, img):
    """
    Function outputs a graph of the image.

    :param i: counter variable.
    :param predictions_array: array of all predictions made.
    :param true_label: actual identity of apparel in image.
    :param img: fashion apparel image.
    :return: nothing.
    """

    predictions_array, true_label, img = predictions_array[i], true_label[i], img[i]
    plt.grid(False)
    plt.xticks([])
    plt.yticks([])

    plt.imshow(img, cmap=plt.cm.binary)

    predicted_label = np.argmax(predictions_array)
    if predicted_label == true_label:
        color = 'blue'
    else:
        color = 'red'

    plt.xlabel("{} {:2.0f}% ({})".format(class_names[predicted_label],
                                         100 * np.max(predictions_array),
                                         class_names[true_label]), color=color)


############################################################################################

def plot_value_array(i, predictions_array, true_label):
    """
    Function outputs the prediction array with set of confidence values associated with each class.

    :param i: counter variable.
    :param predictions_array: array of all predictions made.
    :param true_label: actual identity of apparel in image.
    :return: nothing.
    """
    predictions_array, true_label = predictions_array[i], true_label[i]
    plt.grid(False)
    plt.xticks([])
    plt.yticks([])

    this_plot = plt.bar(range(10), predictions_array, color="#777777")
    plt.ylim([0, 1])
    predicted_label = np.argmax(predictions_array)

    this_plot[predicted_label].set_color('red')
    this_plot[true_label].set_color('blue')


############################################################################################

def visualize_training_results(predictions_array):
    """
    Function visualizes the results of training the model as a summary of statistics and a plot
    of training loss and accuracy on the dataset.

    :param predictions_array: array of all predictions made on the test set.
    :return: Nothing.
    """
    # Show a nicely formatted classification report.
    print("[INFO] evaluating network...")
    print(classification_report(test_labels_one_hot.argmax(axis=1), predictions_array.argmax(axis=1),
                                target_names=class_names))

    # Plot the training loss and accuracy (for each epoch).
    plt.style.use("ggplot")
    plt.figure()
    plt.plot(np.arange(0, NUM_EPOCHS), train_model.history["loss"], label="train_loss")
    plt.plot(np.arange(0, NUM_EPOCHS), train_model.history["val_loss"], label="val_loss")
    plt.plot(np.arange(0, NUM_EPOCHS), train_model.history["acc"], label="train_acc")
    plt.plot(np.arange(0, NUM_EPOCHS), train_model.history["val_acc"], label="val_acc")
    plt.title("Training Loss and Accuracy on Dataset")
    plt.xlabel("Epoch #")
    plt.ylabel("Loss/Accuracy")
    plt.legend(loc="lower left")
    plt.savefig("training_loss_accuracy_plot.png")

    # Plot the first X test images, their predicted label, and the true label.
    # Color correct predictions in blue, incorrect predictions in red.
    num_rows = 10
    num_cols = 5
    num_images = num_rows * num_cols
    plt.figure(figsize=(2 * 2 * num_cols, 2 * num_rows))
    for i in range(num_images):
        plt.subplot(num_rows, 2 * num_cols, 2 * i + 1)
        # Use the function's argument rather than the global predictions variable.
        plot_image(i, predictions_array, test_labels, test_images)
        plt.subplot(num_rows, 2 * num_cols, 2 * i + 2)
        plot_value_array(i, predictions_array, test_labels)
    plt.show()


############################################################################################

def single_image_prediction_results():
    """
    Predict probabilities for a single image.

    :return: Nothing.
    """

    import random
    # randint is inclusive at both ends, so stay within the valid index range.
    random_image = random.randint(0, len(test_images) - 1)

    # Prediction sample.
    print()
    print("Prediction for the randomly chosen image:")
    print(predictions[random_image])
    print()

    # Get the index of the class with the highest confidence value in the prediction array.
    print("Highest class confidence value for the image:")
    print(np.argmax(predictions[random_image]))

    # Confirm against associated test label.
    print("Associated test label value for the image:")
    print(test_labels[random_image])
    print()

    # Grab an image from the test dataset.
    image = test_images_reshaped[random_image]
    print("Randomly chosen image's shape: ")
    print(image.shape)
    print()
    # Add the image to a batch where it's the only member.
    image = (np.expand_dims(image, 0))
    print("Randomly chosen image's shape in batch it was added to: ")
    print(image.shape)
    print()
    # Predict the images in the batch.
    predictions_single = model.predict(image)
    print("Randomly chosen image's batch prediction results: ")
    print(predictions_single)
    print()

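    # Visualize the results of the batch image predictions.
    # Pass only this image's label so the true class is highlighted correctly.
    plot_value_array(0, predictions_single, test_labels[random_image:random_image + 1])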
    _ = plt.xticks(range(10), class_names, rotation=45)
    plt.show()

    # Get the index of the highest-confidence class, i.e. the predicted label.
    print("Highest class confidence value for the image in the batch:")
    print(np.argmax(predictions_single[0]))

    # Confirm against associated test label.
    print("Associated test label value for the image in the batch:")
    print(test_labels[random_image])
    print()


############################################################################################
"""
Main function.  Execute the program.
"""
# Debug variable.
debug = 0

if __name__ == '__main__':
    print()

    if debug:
        # Visualize 0th image, predictions, and prediction array.
        i = 0
        plt.figure(figsize=(6, 3))
        plt.subplot(1, 2, 1)
        plot_image(i, predictions, test_labels, test_images)
        plt.subplot(1, 2, 2)
        plot_value_array(i, predictions, test_labels)
        plt.show()

        # Visualize the 12th image, predictions, and prediction array.
        i = 12
        plt.figure(figsize=(6, 3))
        plt.subplot(1, 2, 1)
        plot_image(i, predictions, test_labels, test_images)
        plt.subplot(1, 2, 2)
        plot_value_array(i, predictions, test_labels)
        plt.show()

    ##############################################

    # Predict for a single image.
    single_image_prediction_results()

    ##############################################

    # Visualize the training and prediction results.
    visualize_training_results(predictions)

############################################################################################


Using TensorFlow backend.


1.13.1
Training data shape:
(60000, 28, 28)
Training targets shape:
(60000,)
Testing data shape:
(10000, 28, 28)
Testing targets shape:
(10000,)

Training images[0]:


<Figure size 640x480 with 2 Axes>

<Figure size 1000x1000 with 25 Axes>

Training data shape after reshape:
(60000, 28, 28, 1)
Testing data shape after reshape:
(10000, 28, 28, 1)

Training targets shape after one-hot encoding:
(60000, 10)
Testing targets shape after one-hot encoding:
(10000, 10)

Instructions for updating:
Colocations handled automatically by placer.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
Instructions for updating:
Use tf.cast instead.
Train on 60000 samples, validate on 10000 samples
Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25

Test dataset accuracy: 0.8822
Test dataset loss: 0.327498046875



Prediction for the randomly chosen image:
[5.88418730e-03 2.46823241e-04 5.17997444e-02 8.74542654e-01
 4.56983149e-02 1.45562335e

<Figure size 640x480 with 1 Axes>

Highest class confidence value for the image in the batch:
3
Associated test label value for the image in the batch:
2

[INFO] evaluating network...
              precision    recall  f1-score   support

 T-shirt/top       0.86      0.81      0.84      1000
     Trouser       0.99      0.97      0.98      1000
    Pullover       0.79      0.79      0.79      1000
       Dress       0.89      0.89      0.89      1000
        Coat       0.78      0.82      0.80      1000
      Sandal       0.98      0.95      0.96      1000
       Shirt       0.69      0.70      0.69      1000
     Sneaker       0.93      0.96      0.94      1000
         Bag       0.97      0.97      0.97      1000
  Ankle boot       0.96      0.95      0.96      1000

   micro avg       0.88      0.88      0.88     10000
   macro avg       0.88      0.88      0.88     10000
weighted avg       0.88      0.88      0.88     10000



<Figure size 640x480 with 1 Axes>

<Figure size 2000x2000 with 100 Axes>

#### Note 1:  The above results are from fast training with Conv2D() layers disabled.

#### Note 2: I am leaving the cell below as "code comments" because I haven't figured out how to format it in Markdown so that it looks exactly like the PyCharm console output.  The results below are from training with 2 Conv2D() layers enabled, run directly from within PyCharm in the "homework4_part3.py" file.



In [2]:
# Results from overnight training:

NUM_EPOCHS = 25
LEARNING_RATE = 1e-2
BATCH_SIZE = 32

Test dataset accuracy: 0.9336
Test dataset loss: 0.1849521788418293



Prediction for the randomly chosen image:
[3.9291081e-10 3.1192984e-10 1.3035477e-10 5.3126011e-12 4.1941251e-12
 9.9999607e-01 3.2498469e-11 2.7543354e-07 1.3105337e-09 3.7321797e-06]

Highest class confidence value for the image:
5
Associated test label value for the image:
5

Randomly chosen image's shape: 
(28, 28, 1)

Randomly chosen image's shape in batch it was added to: 
(1, 28, 28, 1)

Randomly chosen image's batch prediction results: 
[[3.9291231e-10 3.1193104e-10 1.3035527e-10 5.3126219e-12 4.1941494e-12
  9.9999607e-01 3.2498594e-11 2.7543430e-07 1.3105362e-09 3.7321938e-06]]

Highest class confidence value for the image in the batch:
5
Associated test label value for the image in the batch:
5

[INFO] evaluating network...
              precision    recall  f1-score   support

 T-shirt/top       0.87      0.90      0.89      1000
     Trouser       0.99      0.99      0.99      1000
    Pullover       0.89      0.91      0.90      1000
       Dress       0.93      0.94      0.93      1000
        Coat       0.89      0.91      0.90      1000
      Sandal       0.99      0.99      0.99      1000
       Shirt       0.83      0.75      0.79      1000
     Sneaker       0.97      0.98      0.98      1000
         Bag       0.99      0.99      0.99      1000
  Ankle boot       0.98      0.97      0.97      1000

   micro avg       0.93      0.93      0.93     10000
   macro avg       0.93      0.93      0.93     10000
weighted avg       0.93      0.93      0.93     10000


Process finished with exit code 0



I let the model that produced the results in the cell above train overnight: I hit "run", went to sleep, and woke up in the morning to these results.  My older laptop, with Nvidia GeForce 780M in SLI, only supports CUDA compute capability 3.0, which does not satisfy the requirements for using GPUs with TensorFlow.  TensorFlow currently requires CUDA compute capability 3.5 or above.  So training takes a while using just the 6-core i7.  I would need to run this on my newer laptop with an Nvidia GeForce 1050 Ti if I want GPU support for TensorFlow and Keras.
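For a rough sense of where the CPU time goes, the sketch below traces the layer shapes of the `build()` architecture from the code cell above and counts parameters and multiply-accumulate operations (MACs) per image.  It is an estimate under stated assumptions: it covers only the Conv2D and Dense layers, ignores batch normalization, activations, pooling, and dropout, and treats MACs as a crude proxy for runtime.  The helper names are hypothetical.

```python
def conv_same(height, width, in_channels, filters, kernel=3):
    """Shape, parameter count, and MACs for a 'same'-padded, stride-1 convolution."""
    params = kernel * kernel * in_channels * filters + filters  # weights + biases
    macs = height * width * filters * kernel * kernel * in_channels
    return (height, width, filters), params, macs


def dense(inputs, units):
    """Output size, parameter count, and MACs for a fully-connected layer."""
    params = inputs * units + units  # weights + biases
    macs = inputs * units
    return units, params, macs


def trace_network(height=28, width=28, channels=1, classes=10):
    """Trace the two conv blocks (32 and 64 filters) and the two dense layers."""
    totals = {"conv_params": 0, "conv_macs": 0, "dense_params": 0, "dense_macs": 0}
    for filters in (32, 64):
        for _ in range(2):  # two Conv2D layers per block
            (height, width, channels), params, macs = conv_same(height, width, channels, filters)
            totals["conv_params"] += params
            totals["conv_macs"] += macs
        height, width = height // 2, width // 2  # 2x2 max pooling halves each side
    totals["flattened"] = height * width * channels
    size = totals["flattened"]
    for units in (512, classes):
        size, params, macs = dense(size, units)
        totals["dense_params"] += params
        totals["dense_macs"] += macs
    return totals


summary = trace_network()
for name, value in summary.items():
    print(name, value)
```

Under these assumptions, most of the parameters sit in the first dense layer, while most of the per-image arithmetic sits in the convolutional layers.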

The fast training run with the Conv2D() layers disabled took only a few minutes, versus a few hours with both Conv2D() layers enabled.  Without the CNN layers, I managed 88% accuracy with my current hyperparameters.  With the CNN layers, I managed 93% accuracy with the same hyperparameters.  So it seems the CNN does better on Fashion MNIST than a non-CNN.
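To see where the improvement comes from, the sketch below compares the per-class f1-scores from the two classification reports above.  The scores are copied from those reports and expressed as whole percentages to avoid floating-point noise; the function and variable names are hypothetical.

```python
CLASS_NAMES = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

# f1-scores (in percent) copied from the two classification reports.
F1_WITHOUT_CONV = [84, 98, 79, 89, 80, 96, 69, 94, 97, 96]
F1_WITH_CONV = [89, 99, 90, 93, 90, 99, 79, 98, 99, 97]


def f1_gains(before, after, names):
    """Return the per-class change in f1-score, in percentage points."""
    return {name: new - old for name, old, new in zip(names, before, after)}


gains = f1_gains(F1_WITHOUT_CONV, F1_WITH_CONV, CLASS_NAMES)
largest_gain = max(gains, key=gains.get)
macro_without = sum(F1_WITHOUT_CONV) / len(F1_WITHOUT_CONV)
macro_with = sum(F1_WITH_CONV) / len(F1_WITH_CONV)

for name, gain in sorted(gains.items(), key=lambda item: item[1], reverse=True):
    print("{:<12} +{} points".format(name, gain))
print("macro f1 without conv:", macro_without)
print("macro f1 with conv:   ", macro_with)
```

In these reports the largest gains are for the upper-body garments (Pullover, Coat, Shirt), which were also the weakest classes in the run without convolutional layers.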

Below are the visualized results of training the model.  Each has a title indicating what it shows.

### Fashion MNIST Sample Training Image


<img style="float:center; transform: rotate(0deg); margin: 0 10px 10px 0" src="fashion_mnist_training_image[0].png" />

### Fashion MNIST Image Preprocessing

<img style="float:center; transform: rotate(0deg); margin: 0 10px 10px 0" src="fashion_mnist_training_images_preprocess.png" />

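The images are already single-channel grayscale; pre-processing scales the pixel values to the range [0, 1], and the plot renders them with a binary (grayscale) colormap.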

### Fashion MNIST Image Prediction Array (probabilities for each class for a single example)

<img style="float:center; transform: rotate(0deg); margin: 0 10px 10px 0" src="fashion_mnist_image_prediction_array.png" />
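A prediction array like the one plotted above can be read by taking the index of its largest entry.  The sketch below does this for the ten probabilities printed in the overnight-run output earlier in this notebook; `top_class` is a hypothetical helper name.

```python
CLASS_NAMES = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

# Softmax output copied from the overnight-run results in this notebook.
prediction = [3.9291081e-10, 3.1192984e-10, 1.3035477e-10, 5.3126011e-12, 4.1941251e-12,
              9.9999607e-01, 3.2498469e-11, 2.7543354e-07, 1.3105337e-09, 3.7321797e-06]


def top_class(probabilities):
    """Return (index, probability) of the most likely class."""
    best_index = max(range(len(probabilities)), key=probabilities.__getitem__)
    return best_index, probabilities[best_index]


index, confidence = top_class(prediction)
print("Predicted class:", index, CLASS_NAMES[index])
print("Confidence:", confidence)
print("Sum of probabilities:", sum(prediction))
```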

### Fashion MNIST Loss and Accuracy Plot

<img style="float:center; transform: rotate(0deg); margin: 0 10px 10px 0" src="fashion_mnist_image_loss_accuracy.png" />

Based on the above plot of accuracy and loss, validation loss seems to begin exceeding training loss around epoch 12 or 13.  Training accuracy tracked validation accuracy very tightly after the 5th epoch.  After the 12th or 13th epoch, training accuracy exceeded validation accuracy by a small margin, which may indicate the onset of mild overfitting.
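Rather than reading the crossover point off the plot, it could be computed from the `"loss"` and `"val_loss"` lists in the training history.  The sketch below finds the first epoch from which validation loss stays above training loss; the function name is hypothetical and the loss values are made-up placeholders, not the numbers from this training run.

```python
def persistent_crossover(train_loss, val_loss):
    """Return the first (1-based) epoch from which val_loss stays above train_loss.

    Returns None if validation loss is not above training loss at the final epoch.
    """
    crossover = None
    for epoch, (train, val) in enumerate(zip(train_loss, val_loss), start=1):
        if val > train:
            if crossover is None:
                crossover = epoch  # start of a run where validation loss is higher
        else:
            crossover = None  # validation loss dipped back below; reset
    return crossover


# Placeholder values for illustration only (not from the actual run).
example_train_loss = [0.60, 0.45, 0.38, 0.33, 0.29, 0.25]
example_val_loss = [0.50, 0.40, 0.36, 0.34, 0.31, 0.30]
print("Crossover epoch:", persistent_crossover(example_train_loss, example_val_loss))
```

With the notebook's variables, the call would presumably look like `persistent_crossover(train_model.history["loss"], train_model.history["val_loss"])`.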

### Fashion MNIST Image Prediction Results (array of examples)

Note: Blue = correct prediction, Red = incorrect prediction<br>

<img style="float:center; transform: rotate(0deg); margin: 0 10px 10px 0" src="fashion_mnist_test_images_prediction_results.png" />

<img style="float:center; transform: rotate(0deg); margin: 0 10px 10px 0" src="fashion_mnist_test_images_prediction_results2.png" />

<img style="float:center; transform: rotate(0deg); margin: 0 10px 10px 0" src="fashion_mnist_test_images_prediction_results3.png" />

The images above are a sequence of plots showing arrays of examples and their prediction results: first a 5x5, then a 10x10, then a 20x20.  I think it stopped outputting to SciView in PyCharm once I tried to go even higher.  Otherwise, the 20x20 contains everything in the 10x10, and the 10x10 contains everything in the 5x5, plus additional examples (the smaller set appears in the upper left quadrant of the larger set).