### Instructions
 * The research question: can we use algorithms and compute to identify clothing items? Specifically, which algorithm and compute methodology gives the most efficient approach for classifying simple fashion images?
 * Using the base samples available from Zalando Research:
  * https://github.com/zalandoresearch/fashion-mnist
  * Review the data -- clean as appropriate
  * Provide an initial data analysis
 * Implement at least two approaches for classifying the images -- examples below:
  * Naïve Bayes
  * Neural Networks
  * Keras
  * Azure ML
  * IBM DSX
  * Boosted trees
  * Linear classification
  * Your choice

* Answer the following questions:
  * What is the accuracy of each method?
  * What are the trade-offs of each approach?
  * What is the compute performance of each approach?

# TensorFlow Keras Model

## Importing Libraries

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import random

# -----------------------------------
import tensorflow as tf
from tensorflow import keras  # the standalone `import keras` was shadowed by this import, so it is dropped
import keras_tuner as kt

In [None]:
import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

In [None]:
# !pip install -q -U keras-tuner

### Reading in the data, Splitting data into Training/Testing datasets, and reshaping the images

In [None]:
fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

In [None]:
train_images  = train_images / 255.0
test_images = test_images / 255.0

### Printing some examples of the clothing data

In [None]:
plt.figure(figsize=(10,10))
for i in range(16):
    plt.subplot(4,4,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(train_images[i], cmap=plt.cm.binary)
    plt.title("Image label is: {}".format(train_labels[i]))
plt.show()
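
The plot titles above show only numeric labels. As a small sketch for the initial data analysis, the labels can be mapped to the class names listed in the Fashion-MNIST README and counted per class. `CLASS_NAMES`, `label_to_name`, `class_counts`, and the toy labels are names and data introduced here for illustration; in the notebook, `train_labels` would take the place of `toy_labels`.

```python
import numpy as np

# Class names in label order, as listed in the Fashion-MNIST README
CLASS_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]

def label_to_name(label):
    """Return the class name for an integer label in 0..9."""
    return CLASS_NAMES[int(label)]

def class_counts(labels):
    """Count how many samples fall into each class (hypothetical helper)."""
    counts = np.bincount(np.asarray(labels), minlength=len(CLASS_NAMES))
    return dict(zip(CLASS_NAMES, counts.tolist()))

# Toy labels standing in for train_labels
toy_labels = np.array([9, 0, 0, 3, 9, 9])
summary = class_counts(toy_labels)
print(label_to_name(toy_labels[0]), summary)
```

A roughly uniform count per class would suggest that plain accuracy is a reasonable headline metric.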

In [None]:
x_train = train_images.reshape(-1,28,28,1)
x_test = test_images.reshape(-1,28,28,1)
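
To make the two preprocessing steps concrete, here is a minimal NumPy sketch on a toy batch: dividing by 255 maps the raw `uint8` pixel values into the range [0, 1], and the reshape appends the single channel axis that `Conv2D` expects. The array `raw` is random data standing in for the real images.

```python
import numpy as np

# Toy batch standing in for the raw images: 4 grayscale images of 28x28 pixels
raw = np.random.default_rng(0).integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

scaled = raw / 255.0                       # float values in [0, 1]
batched = scaled.reshape(-1, 28, 28, 1)    # append a trailing channel axis

print(raw.dtype, scaled.dtype, batched.shape)
```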

### Hyperparameter Tuning and Building Keras Model

To tune the Keras model I use Keras Tuner, a library that helps pick an optimal set of hyperparameters for a TensorFlow model. Additional information and documentation on Keras Tuner is available [here](https://www.tensorflow.org/tutorials/keras/keras_tuner).

In [None]:
## Defining the model and the hyperparameter values the tuner may choose from
def build_model(hp):
    model = keras.Sequential([

    # First conv_block
    keras.layers.Conv2D(
        filters = hp.Choice('conv_1_filter', values=[16, 32, 64, 128]),
        kernel_size=hp.Choice('conv_1_kernel', values = [3,4]),
        activation='relu',
        input_shape=(28,28,1)),
    keras.layers.MaxPooling2D((2,2)),

    # Second conv_block
    keras.layers.Conv2D(
        filters = hp.Choice('conv_2_filter', values=[16, 32, 64, 128]),
        kernel_size=hp.Choice('conv_2_kernel', values = [3,4]),
        activation='relu'),
    keras.layers.MaxPooling2D((2,2)),

    # --------------------------------
    keras.layers.Flatten(),
    keras.layers.Dense(units = hp.Choice('units', values=[16, 32, 64, 128, 256]),
                       activation='relu'),
    keras.layers.Dropout(hp.Float('dropout', 0, 0.5, step=0.1, default=0.5)),

    # --------------------------------
    keras.layers.Dense(10)
    ])

    model.compile(optimizer=keras.optimizers.Adam(hp.Choice('learning_rate',
                                                            values=[1e-1, 1e-2, 1e-3, 1e-4])),
              loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
    return model

### Using the [Hyperband](https://www.tensorflow.org/tutorials/keras/keras_tuner#:~:text=The%20Hyperband%20tuning%20algorithm%20uses%20adaptive%20resource%20allocation%20and%20early%2Dstopping%20to%20quickly%20converge%20on%20a%20high%2Dperforming%20model.) tuning algorithm to find the optimal parameters.

In [None]:
tuner = kt.Hyperband(build_model,
                     objective="val_accuracy",
                     max_epochs=5,
                     factor=3,
                     hyperband_iterations=3)
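
The tutorial linked above describes Hyperband as using adaptive resource allocation and early stopping. The core idea behind that, successive halving, can be sketched in plain Python. This is a generic illustration with a made-up scoring function, not Keras Tuner's actual implementation; `successive_halving`, `toy_score`, and the candidate list are all assumptions introduced here.

```python
import random

def successive_halving(configs, score_fn, min_epochs=1, max_epochs=5, factor=3):
    """Keep roughly the top 1/factor of configs each round while growing the epoch budget."""
    survivors = list(configs)
    epochs = min_epochs
    history = []  # (epochs used, number of configs evaluated) per round
    while True:
        scored = sorted(survivors, key=lambda c: score_fn(c, epochs), reverse=True)
        history.append((epochs, len(scored)))
        if len(scored) == 1 or epochs >= max_epochs:
            return scored[0], history
        keep = max(1, len(scored) // factor)
        survivors = scored[:keep]
        epochs = min(max_epochs, epochs * factor)

# Toy objective (an assumption for illustration): configurations with a learning
# rate close to 1e-3 score higher, and a larger epoch budget helps slightly.
def toy_score(config, epochs):
    return -abs(config["learning_rate"] - 1e-3) + 0.01 * epochs

rng = random.Random(0)
candidates = [{"learning_rate": rng.choice([1e-1, 1e-2, 1e-3, 1e-4])} for _ in range(9)]
candidates.append({"learning_rate": 1e-3})  # make sure the best value is present

best, rounds = successive_halving(candidates, toy_score)
print(best, rounds)
```

Most candidates only ever receive the smallest budget, which is where the time savings over training every configuration in full come from.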

In [None]:
## Summary of the hyperparameter search space that the Hyperband tuning algorithm draws from
tuner.search_space_summary()
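
For a sense of scale, the sketch below counts the distinct combinations in the search space defined in `build_model`. The candidate lists are copied from that function; the six dropout values assume that `hp.Float('dropout', 0, 0.5, step=0.1)` includes both end points.

```python
from math import prod

# Candidate values copied from build_model above
search_space = {
    "conv_1_filter": [16, 32, 64, 128],
    "conv_1_kernel": [3, 4],
    "conv_2_filter": [16, 32, 64, 128],
    "conv_2_kernel": [3, 4],
    "units": [16, 32, 64, 128, 256],
    # Assumption: the stepped float range yields these six values
    "dropout": [0.0, 0.1, 0.2, 0.3, 0.4, 0.5],
    "learning_rate": [1e-1, 1e-2, 1e-3, 1e-4],
}

total_combinations = prod(len(values) for values in search_space.values())
print(total_combinations)  # 7680
```

With several thousand combinations, training each one fully would be impractical, which is the motivation for a tuner that samples configurations and discards weak ones early.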

[EarlyStopping](https://keras.io/api/callbacks/early_stopping/) in Keras allows us to stop training the model early if the model stops improving.

In [None]:
early_stop = keras.callbacks.EarlyStopping(monitor='val_loss', patience=3)
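
The patience rule can be illustrated without Keras. The sketch below is a simplified model of the behaviour (no `min_delta`, no weight restoring): training stops once the monitored loss has failed to improve for `patience` consecutive epochs. The function name `early_stopping_epoch` and the loss values are made up for illustration.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return the 1-based epoch at which training would stop, or None if it runs to the end."""
    best = float("inf")
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Made-up validation losses: the best value is reached at epoch 3
losses = [0.50, 0.42, 0.40, 0.41, 0.43, 0.40, 0.45, 0.39]
print(early_stopping_epoch(losses))  # 6
```

Note that the later improvement at epoch 8 is never seen, which is the trade-off of a small patience value.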

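The following code runs the Hyperband search over the hyperparameter space specified in the `build_model` function above. Hyperband does not train every possible combination in full; it uses adaptive resource allocation and early stopping to converge on a high-performing model.

**Note:** The search took over 2 hours to run. That run left out the `early_stop` argument, which might have allowed it to finish sooner.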

In [None]:
tuner.search(x_train,train_labels, epochs=3, validation_split=0.2, callbacks=[early_stop])

### Optimal Hyperparameters generated from the Keras Search

In [None]:
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]
print(f"""conv_1_filter is {best_hps.get('conv_1_filter')}""")
print(f"""conv_1_kernel is {best_hps.get('conv_1_kernel')}""")
print(f"""conv_2_filter is {best_hps.get('conv_2_filter')}""")
print(f"""conv_2_kernel is {best_hps.get('conv_2_kernel')}""")
print("-------------------------------------------------")
print(f"""units is {best_hps.get('units')}""")
print(f"""learning_rate is {best_hps.get('learning_rate')}""")
print(f"""dropout is {best_hps.get('dropout')}""")
print("-------------------------------------------------")
print(f"""The hyperparameter search is complete. The optimal number of units in the first densely-connected layer
is {best_hps.get('units')} and the optimal learning rate for the optimizer is {best_hps.get('learning_rate')}.""")

## Training Keras Model

In [None]:
model = tuner.hypermodel.build(best_hps)
model.summary()

In [None]:
## Now that we have the optimal hyperparameters for the model and data, we can use them to build and train the model.
model = tuner.hypermodel.build(best_hps)
history = model.fit(x_train, train_labels, epochs=50, validation_split=0.2)

val_acc_per_epoch = history.history['val_accuracy']
best_epoch = val_acc_per_epoch.index(max(val_acc_per_epoch)) + 1
print('Best epoch: %d' % (best_epoch,))

In [None]:
##Running the model with the optimal hyperparameters
hypermodel = tuner.hypermodel.build(best_hps)

history = hypermodel.fit(x_train, train_labels,
                         epochs=best_epoch,
                         validation_split=0.2,
                         callbacks=[early_stop])

### Model Structure and Summary

In [None]:
hypermodel.summary()

In [None]:
##Plot of the layers from the optimal hypermodel
keras.utils.plot_model(hypermodel, show_shapes=True)

## Keras Model Evaluation and Performance

In [None]:
pred = hypermodel.predict(x_test)

print("Prediction is -> {}".format(pred[12]))
print("Actual value is -> {}".format(test_labels[12]))
print("The highest value for label is {}".format(np.argmax(pred[12])))
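
Because the final layer is `Dense(10)` with no activation and the loss uses `from_logits=True`, each row of `pred` holds raw logits rather than probabilities. A minimal NumPy sketch of converting logits to probabilities with a softmax follows; the logit values are made up and stand in for one row of `pred`.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

# Toy logits standing in for one row of `pred`
toy_logits = np.array([-3.1, -5.0, 0.2, -1.4, 1.1, -8.3, 4.7, -9.0, -2.2, -7.5])
probs = softmax(toy_logits)

print(np.argmax(probs), probs.round(3))
```

The softmax does not change which class has the highest score, so `np.argmax` gives the same label on logits and on probabilities.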

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(len(acc))

plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()

plt.figure()

plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()

plt.show()

The validation accuracy is not as high as the training accuracy. However, for the sake of time I reduced the number of epochs from 50 to 10, which may have limited how much further the model could improve.

In [None]:
import matplotlib.pyplot as plt

# Plotting training and validation accuracy values
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()

The Keras model with hyperparameter tuning produced fairly good results, with an accuracy of 90.24% on the test images. However, the Keras Tuner search took over 2 hours to find the optimal hyperparameters, which is computationally heavy for the accuracy achieved.

In [None]:
eval_result = hypermodel.evaluate(x_test, test_labels)
print("test loss:", f"{eval_result[0]:.4f}")  # the loss is a cross-entropy value, not a percentage
print("test accuracy:", f"{eval_result[1]:.2%}")
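
A single accuracy figure hides which clothing classes are confused with each other. As a hedged sketch, per-class accuracy can be derived from a confusion matrix built with NumPy. The helper names and the toy labels are assumptions; in the notebook, `test_labels` and `np.argmax(pred, axis=1)` would be the natural inputs.

```python
import numpy as np

def confusion_matrix(true_labels, predicted_labels, num_classes=10):
    """Rows are true classes, columns are predicted classes."""
    matrix = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(matrix, (true_labels, predicted_labels), 1)
    return matrix

def per_class_accuracy(matrix):
    """Fraction of each true class that was predicted correctly."""
    totals = matrix.sum(axis=1)
    return np.divide(np.diag(matrix), totals,
                     out=np.zeros(len(totals)), where=totals > 0)

# Toy example with 3 classes
toy_true = np.array([0, 0, 1, 1, 2])
toy_pred = np.array([0, 1, 1, 1, 2])
cm = confusion_matrix(toy_true, toy_pred, num_classes=3)
class_acc = per_class_accuracy(cm)
print(cm)
print(class_acc)
```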

# Autoencoder Model

## Importing Libraries

In [None]:

import numpy as np
import pandas as pd
import os, time, re
import pickle, gzip
import seaborn as sns
color = sns.color_palette()
import lightgbm as lgb

%matplotlib inline
import matplotlib as mpl
import matplotlib.pyplot as plt

import tensorflow as tf
from tensorflow.keras import layers, losses
from tensorflow.keras.datasets import fashion_mnist
from tensorflow.keras.models import Model

from sklearn import preprocessing as pp
from sklearn.model_selection import train_test_split, StratifiedKFold
from sklearn.metrics import log_loss, precision_recall_curve, average_precision_score, roc_curve, auc, roc_auc_score, accuracy_score, precision_score, recall_score

import keras
from keras import backend as K, regularizers
from keras.models import Sequential, Model
from keras.layers import Activation, Dense, Dropout, BatchNormalization, Input, Lambda
from keras.losses import mse, binary_crossentropy

## Building Autoencoder Model

### Reading in the data, Splitting data into Training/Testing datasets, and reshaping the images

In [None]:
train = pd.read_csv('fashion-mnist_train.csv')
test = pd.read_csv('fashion-mnist_test.csv')

In [None]:
(x_train, _), (x_test, _) = fashion_mnist.load_data()

In [None]:
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.

x_train = x_train[..., tf.newaxis]
x_test = x_test[..., tf.newaxis]

print(x_train.shape)

### Adding in random noise to the image data

In [None]:
noise_factor = 0.2
x_train_noisy = x_train + noise_factor * tf.random.normal(shape=x_train.shape)
x_test_noisy = x_test + noise_factor * tf.random.normal(shape=x_test.shape)

x_train_noisy = tf.clip_by_value(x_train_noisy, clip_value_min=0., clip_value_max=1.)
x_test_noisy = tf.clip_by_value(x_test_noisy, clip_value_min=0., clip_value_max=1.)
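
The same corruption step can be sketched in NumPy to show what it does to pixel values: Gaussian noise scaled by `noise_factor` is added, then the result is clipped back into the valid range [0, 1]. The function name `add_gaussian_noise`, the fixed seed, and the toy images are assumptions for illustration.

```python
import numpy as np

def add_gaussian_noise(images, noise_factor=0.2, seed=0):
    """Add scaled standard-normal noise and clip the result to [0, 1]."""
    rng = np.random.default_rng(seed)
    noisy = images + noise_factor * rng.standard_normal(images.shape)
    return np.clip(noisy, 0.0, 1.0).astype(np.float32)

# Toy batch: two uniformly gray images
toy_images = np.full((2, 28, 28, 1), 0.5, dtype=np.float32)
toy_noisy = add_gaussian_noise(toy_images)

print(toy_noisy.shape, float(toy_noisy.min()), float(toy_noisy.max()))
```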

### Printing some examples of the clothing data with the noise added

In [None]:
n = 10
plt.figure(figsize=(20, 2))
for i in range(n):
    ax = plt.subplot(1, n, i + 1)
    plt.title("original + noise")
    plt.imshow(tf.squeeze(x_test_noisy[i]))
    plt.gray()
plt.show()

Creating a class called "Denoise" that takes the images with random noise added and tries to reconstruct the original, noise-free images. The model outputs an image, not a clothing class label.

In [None]:
class Denoise(Model):
  def __init__(self):
    super(Denoise, self).__init__()
    self.encoder = tf.keras.Sequential([
      layers.Input(shape=(28, 28, 1)),
      layers.Conv2D(16, (3, 3), activation='relu', padding='same', strides=2),
      layers.Conv2D(8, (3, 3), activation='relu', padding='same', strides=2)])

    self.decoder = tf.keras.Sequential([
      layers.Conv2DTranspose(8, kernel_size=3, strides=2, activation='relu', padding='same'),
      layers.Conv2DTranspose(16, kernel_size=3, strides=2, activation='relu', padding='same'),
      layers.Conv2D(1, kernel_size=(3, 3), activation='sigmoid', padding='same')])

  def call(self, x):
    encoded = self.encoder(x)
    decoded = self.decoder(encoded)
    return decoded

autoencoder = Denoise()
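
To see how the image size changes through the network, the spatial dimensions can be traced by hand. With `padding='same'`, a strided convolution produces `ceil(size / stride)` outputs per axis and a strided transposed convolution produces `size * stride`. The helper names below are introduced for illustration.

```python
import math

def conv_same_out(size, stride):
    """Output length of a Conv2D axis with padding='same'."""
    return math.ceil(size / stride)

def conv_transpose_same_out(size, stride):
    """Output length of a Conv2DTranspose axis with padding='same'."""
    return size * stride

size = 28
trace = [size]
for stride in (2, 2):      # the two encoder Conv2D layers
    size = conv_same_out(size, stride)
    trace.append(size)
bottleneck_values = trace[-1] * trace[-1] * 8   # 8 filters in the last encoder layer
for stride in (2, 2):      # the two decoder Conv2DTranspose layers
    size = conv_transpose_same_out(size, stride)
    trace.append(size)

print(trace)              # [28, 14, 7, 14, 28]
print(bottleneck_values)  # 392
```

The encoder compresses each 784-pixel image into 392 values, and the decoder expands it back to 28x28.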

In [None]:
autoencoder.compile(optimizer='adam', loss=losses.MeanSquaredError())

In [None]:
# early_stop = keras.callbacks.EarlyStopping(monitor='val_loss', patience=3)

## Training the Autoencoder Model

In [None]:
history = autoencoder.fit(x_train_noisy, x_train,
                epochs=10,
                shuffle=True,
                verbose=1,
                validation_data=(x_test_noisy, x_test))
score = autoencoder.evaluate(x_test_noisy, x_test, verbose=0)

### Model Structure and Summary

In [None]:
autoencoder.encoder.summary()

In [None]:
autoencoder.decoder.summary()

In [None]:
encoded_imgs = autoencoder.encoder(x_test_noisy).numpy()
decoded_imgs = autoencoder.decoder(encoded_imgs).numpy()

### Plotting the noise-added images with the reconstructed images
The reconstructed images are generated images from the random noise-added images.

In [None]:
n = 10
plt.figure(figsize=(20, 4))
for i in range(n):

    # display original + noise
    ax = plt.subplot(2, n, i + 1)
    plt.title("original + noise")
    plt.imshow(tf.squeeze(x_test_noisy[i]))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

    # display reconstruction
    bx = plt.subplot(2, n, i + n + 1)
    plt.title("reconstructed")
    plt.imshow(tf.squeeze(decoded_imgs[i]))
    plt.gray()
    bx.get_xaxis().set_visible(False)
    bx.get_yaxis().set_visible(False)
plt.show()

## Autoencoder Model Evaluation and Performance

In [None]:
# The autoencoder outputs reconstructed images of shape (N, 28, 28, 1), not class probabilities
reconstructed = autoencoder.predict(x_test)

# For every image column, the row index of the brightest pixel in the reconstruction
predicted_classes = np.argmax(reconstructed, axis=1)

# The same brightest-pixel row index for the input images (these are not class labels)
y_true = np.argmax(x_test, axis=1)

# Image indices where the two agree or disagree, repeated once per matching column,
# so this measures pixel-position agreement rather than classification accuracy
correct = np.nonzero(predicted_classes == y_true)[0]
incorrect = np.nonzero(predicted_classes != y_true)[0]

In [None]:
# Fraction of image columns whose brightest-pixel row matches (not a classification accuracy)
accuracy = len(correct) / (len(correct) + len(incorrect))
print("Accuracy: {:.2%}".format(accuracy))

# Print loss
loss = autoencoder.evaluate(x_test_noisy, x_test)
print("Loss:", loss)

# Create a loss graph
plt.plot(history.history['loss'], label='train_loss')
plt.plot(history.history['val_loss'], label='val_loss')
plt.legend()
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.title('Training and Validation Loss')
plt.show()

In [None]:
print('Test loss:', score)
# print('Test accuracy:', max(history.history["val_loss"]))
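
Since the autoencoder is trained on mean squared error, reconstruction quality is a more natural measure for it than accuracy. The sketch below computes a per-image MSE and the corresponding peak signal-to-noise ratio (PSNR) in NumPy. The helper names and the toy arrays are assumptions; in the notebook, `x_test` and `decoded_imgs` would be the natural inputs.

```python
import numpy as np

def per_image_mse(clean, reconstructed):
    """Mean squared error of each image, averaged over all of its pixels."""
    diff = np.asarray(clean, dtype=np.float64) - np.asarray(reconstructed, dtype=np.float64)
    return np.mean(diff.reshape(diff.shape[0], -1) ** 2, axis=1)

def psnr(mse, max_value=1.0):
    """Peak signal-to-noise ratio in decibels for pixel values in [0, max_value]."""
    return 10.0 * np.log10(max_value ** 2 / mse)

# Toy data: every pixel of the reconstruction is off by 0.1
toy_clean = np.zeros((2, 28, 28, 1))
toy_recon = np.full((2, 28, 28, 1), 0.1)

toy_mse = per_image_mse(toy_clean, toy_recon)
print(toy_mse, psnr(toy_mse))
```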

The Autoencoder model performs only slightly worse on the test/validation data than on the training data, with a test loss of 0.665% and a training loss of 0.660%.

Training the Autoencoder model was substantially faster than training the Keras model. However, the Keras model had a much higher accuracy, so I would choose it over the Autoencoder model.
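
To back the compute-performance comparison with measured numbers rather than rough estimates, wall-clock time can be recorded around each training call. The sketch below uses a small context manager; `timed`, `timings`, and the toy workload are hypothetical names, and in the notebook a call such as `autoencoder.fit(...)` or `tuner.search(...)` would go inside the `with` block.

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label, results):
    """Store the elapsed wall-clock seconds of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start

timings = {}
with timed("toy_workload", timings):
    # Stand-in for a training call
    total = sum(i * i for i in range(100_000))

for label, seconds in timings.items():
    print(f"{label}: {seconds:.4f} s")
```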

### Sources:
* [TensorFlow Website](https://www.tensorflow.org/tutorials/generative/autoencoder)
* [Kaggle Repo](https://www.kaggle.com/code/aksahaha/autoencoders)