### General Instructions
In this assignment, you will need to complete the code samples where indicated to accomplish the given objectives. **Be sure to run all cells** and export this notebook as an HTML file with results included. Upload the exported HTML file to Canvas by the assignment deadline.

#### Assignment
Unlike previous exercises, you will not be provided any sample code from which to work.  You will be given some very high-level instructions and are expected to figure out a solution from there.

The Fashion-MNIST dataset is a collection of 60,000 training and 10,000 test images of fashion products in 10 categories:

| Label | Description |
| ----- |:-----------:|
| 0 | T-shirt/top |
| 1 | Trouser |
| 2 | Pullover |
| 3 | Dress |
| 4 | Coat |
| 5 | Sandal |
| 6 | Shirt |
| 7 | Sneaker |
| 8 | Bag |
| 9 | Ankle boot |

Build a CNN consisting of the following layers to classify images to a particular category:

1. 2D convolutional layer with 64 filters and a 3x3 kernel
2. MaxPooling layer with a 2x2 kernel
3. Flattening layer 
4. Dense layer consisting of 96 nodes
5. Dense output layer with 10 nodes

Use standard activation functions with each convolutional and dense layer (for example, ReLU for the hidden layers and softmax for the output layer). Be sure to apply a 25% dropout after the pooling layer and again after the first dense layer.
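
As a sanity check on the layer stack listed above, the tensor shapes and parameter counts can be worked out by hand. The sketch below assumes 28x28 single-channel inputs, a stride-1 convolution without padding, and non-overlapping pooling; the helper names `conv_output` and `pool_output` are hypothetical and used only for illustration.

```python
# Shape and parameter arithmetic for the five-layer stack.
# Assumptions: 28x28x1 inputs, no padding, stride-1 convolution,
# pooling stride equal to the pool size.

def conv_output(size, kernel):
    """Spatial size after a stride-1 convolution without padding."""
    return size - kernel + 1

def pool_output(size, pool):
    """Spatial size after non-overlapping pooling."""
    return size // pool

filters = 64
side = conv_output(28, 3)                 # 28 -> 26
conv_params = (3 * 3 * 1 + 1) * filters   # kernel weights plus one bias per filter
side = pool_output(side, 2)               # 26 -> 13
flat = side * side * filters              # length of the flattened vector
dense_params = (flat + 1) * 96            # weights plus biases of the 96-node layer
output_params = (96 + 1) * 10             # weights plus biases of the output layer
total_params = conv_params + dense_params + output_params

print('flattened length:', flat)
print('trainable parameters:', total_params)
```

Dropout layers add no parameters, so they do not appear in the count. Nearly all of the weights sit in the first dense layer, which is why dropout after that layer matters for regularization.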

Retrieve your data using the [built-in fashion_mnist dataset](https://keras.io/datasets/#fashion-mnist-database-of-fashion-articles) found within Keras.  Be sure to apply appropriate data transformations. 
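
The usual transformations for image data of this kind are adding a channel axis, scaling pixel values to the range 0 to 1, and one-hot encoding the labels. The following is a minimal NumPy sketch on synthetic data (the `images` and `labels` arrays are made-up stand-ins, not the real dataset), so it runs without downloading anything.

```python
import numpy as np

# Synthetic stand-in for the image arrays: 4 grayscale 28x28 images (uint8).
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)
labels = np.array([9, 0, 3, 3])

# Add a trailing channel axis so each image is (rows, cols, channels).
x = images.reshape(images.shape[0], 28, 28, 1)

# Scale pixel intensities from [0, 255] to [0, 1].
x = x.astype('float32') / 255.0

# One-hot encode the integer labels into 10 columns (one per category).
y = np.eye(10, dtype='float32')[labels]

print(x.shape, y.shape)
```

Indexing the identity matrix by label is one way to build one-hot rows by hand; a library helper that does the same job is equally acceptable.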

Adjust the number of epochs to achieve an accuracy above 90%. When you compile, use a loss function of categorical_crossentropy and try both the Adadelta and the Stochastic Gradient Descent optimization algorithms (with a learning rate of 0.01 as your starting point). Stop as soon as one of them gives you the required result.
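
To make the loss and the accuracy target concrete, here is a small NumPy sketch of how categorical cross-entropy and accuracy are computed from one-hot labels and predicted probabilities. It uses three classes and hand-picked probabilities purely for illustration; the function below is a simplified stand-in written for this example, not the library implementation.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-sample cross-entropy between one-hot labels and probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return -np.sum(y_true * np.log(y_pred), axis=-1)

# Two samples, three classes (illustrative values only).
y_true = np.array([[0, 0, 1], [1, 0, 0]], dtype='float32')
y_pred = np.array([[0.1, 0.1, 0.8], [0.3, 0.4, 0.3]], dtype='float32')

losses = categorical_crossentropy(y_true, y_pred)

# Accuracy: fraction of samples whose most probable class matches the label.
accuracy = np.mean(np.argmax(y_pred, axis=1) == np.argmax(y_true, axis=1))

print('losses:', losses)
print('accuracy:', accuracy)
```

The first sample is classified correctly with high confidence and gets a small loss; the second is misclassified and gets a larger one, so the accuracy here is one out of two.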

Do not use Horovod. Be sure to run the last cell of this assignment so that predictions are presented alongside sample images.

HINT: Review the first lab on CNNs for the patterns to follow. You will want to go with a lower learning rate than used in that lab or your model can get stuck. Ideally, it should take you between 10 and 20 epochs to reach the required accuracy. With the Stochastic Gradient Descent optimizer, you might try bumping up your momentum to 0.9 to speed up its learning.
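
To see why momentum can speed up learning at a small learning rate, the sketch below applies the classical momentum update to a toy one-dimensional problem, f(w) = 0.5 * w**2. The function `sgd_minimize` is a hypothetical helper written for this illustration, and the toy objective is an assumption; it is not a substitute for training the CNN.

```python
def sgd_minimize(lr, momentum, steps=100, w=1.0):
    """Minimize f(w) = 0.5 * w**2 (gradient: w) with SGD plus momentum."""
    velocity = 0.0
    for _ in range(steps):
        grad = w                                    # derivative of 0.5 * w**2
        velocity = momentum * velocity - lr * grad  # accumulate past gradients
        w = w + velocity                            # take the step
    return w

plain = sgd_minimize(lr=0.01, momentum=0.0)
with_momentum = sgd_minimize(lr=0.01, momentum=0.9)

# Distance from the minimum at w = 0 after the same number of steps.
print('no momentum :', abs(plain))
print('momentum 0.9:', abs(with_momentum))
```

With the same learning rate and step count, the run with momentum ends much closer to the minimum, because consecutive gradients pointing the same way add up in the velocity term.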

In [4]:
import tensorflow as tf
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.optimizers import Adadelta

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

In [5]:
# retrieve dataset and apply transformations
from keras.datasets import fashion_mnist

(X_train, y_train), (X_test, y_test) = fashion_mnist.load_data()

In [6]:
print('The X_train dataset is organized as a {0} with a shape of {1}'.format(type(X_train), X_train.shape))
print('The X_test dataset is organized as a {0} with a shape of {1}'.format(type(X_test), X_test.shape))
print('The y_train dataset is organized as a {0} with a shape of {1}'.format(type(y_train), y_train.shape))
print('The y_test dataset is organized as a {0} with a shape of {1}'.format(type(y_test), y_test.shape))

In [7]:
X_train.shape[0]

In [8]:
# get shape of original images
img_rows, img_cols = X_train[0].shape[:2]

# restructure arrays for (rows, columns, channels)
X_train = X_train.reshape(X_train.shape[0], img_rows, img_cols, 1)
X_test = X_test.reshape(X_test.shape[0], img_rows, img_cols, 1)

# record new input shape for layers
input_shape = (img_rows, img_cols, 1)

# normalize the data
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')

X_train /= 255
X_test /= 255

# turn y into categorical 
y_train_cat = keras.utils.to_categorical(y_train, 10)
y_test_cat = keras.utils.to_categorical(y_test, 10)

In [9]:
# construct & compile CNN
model = Sequential()

# Convolution layer: 64 filters with a 3x3 kernel, as the assignment specifies
model.add(
  Conv2D(
    64,
    kernel_size=(3, 3),
    activation='relu',
    input_shape=input_shape
    )
  )


# Max pooling over a 2x2 window (the stride defaults to the pool size)
model.add(
  MaxPooling2D(
    pool_size=(2, 2)
    )
  )
# 25% dropout
model.add(Dropout(0.25))

# flatten out result
model.add(Flatten())

# Dense layer with 96 nodes, followed by the required 25% dropout
model.add(Dense(96, activation='relu'))
model.add(Dropout(0.25))

# Apply Softmax for the classes 0 through 9 (10 classes total)
model.add(Dense(10, activation='softmax'))

# Loss function (categorical cross-entropy) and optimizer (Adadelta)
model.compile(
  loss=keras.losses.categorical_crossentropy,
  optimizer=keras.optimizers.Adadelta(),
  metrics=['accuracy']
  )


In [10]:
# train the CNN
model.fit(
      X_train,
      y_train_cat,
      batch_size=128,
      epochs=15,
      verbose=1,
      validation_data=(X_test, y_test_cat)
      )



In [11]:
# score the CNN
score = model.evaluate(X_test, y_test_cat, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

In [12]:
# visualize results
items = {
  0:'T-Shirt/Top',
  1:'Trouser',
  2:'Pullover', 
  3:'Dress',
  4:'Coat',
  5:'Sandal', 
  6:'Shirt',
  7:'Sneaker', 
  8:'Bag',
  9:'Ankle Boot'  
  }

plt.figure(figsize=(25,5))

for i in range(5):

  # predict class probabilities for a single image (a batch of one)
  yhat = model.predict(X_test[i].reshape(1, 28, 28, 1))

  # title each image with its most probable class and that class's probability
  s = plt.subplot(1, 5, i + 1)
  s.set_title('item = {0} (prob = {1:.2%})'.format(items[int(np.argmax(yhat))], np.amax(yhat)))
  plt.imshow(X_test[i][:, :, 0], cmap='gray')
  
display()