# Introduction

Author: Joseph Muhle
Date:   10/7/2020

This notebook explores the implementation of two neural network models. The first model is trained on the classic MNIST dataset and aims to identify handwritten digits. The second model is trained on the Fashion MNIST dataset and aims to identify different articles of clothing.

Running the notebook should automatically load the data (with slight formatting), train each model, and output the results of each model.



# **Classic MNIST NN**

In [None]:
# loading the required packages
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
from keras.models import Sequential
from keras.layers import Dense # for fully connected layers


In [None]:
# load in training data
df_train = pd.read_csv("../input/mnist-in-csv/mnist_train.csv") # read the CSV into a DataFrame
exp = df_train.loc[: ,"label"] # copy the labels out of the data
data = df_train.loc[:, df_train.columns != "label"] # keep every column except the labels (the pixels)
dummies = pd.get_dummies(exp) # one-hot encode the labels (e.g. 2 becomes 0010000000)

# load in testing data
df_test = pd.read_csv("../input/mnist-in-csv/mnist_test.csv") # read the CSV into a DataFrame
exp_test = df_test.loc[: ,"label"] # copy the labels out of the data
data_test = df_test.loc[:, df_test.columns != "label"] # keep every column except the labels (the pixels)
dummies_test = pd.get_dummies(exp_test) # one-hot encode the labels (e.g. 2 becomes 0010000000)
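
To make the label formatting concrete, here is a minimal sketch of what `pd.get_dummies` does, using a small made-up label column rather than the real MNIST CSV (the values in `labels` are an assumption for illustration only).

```python
import numpy as np
import pandas as pd

# Toy label column standing in for the "label" column of the MNIST CSV (hypothetical values).
labels = pd.Series([2, 0, 1, 2], name="label")

# One column per distinct label; each row has a single "hot" entry.
one_hot = pd.get_dummies(labels).astype(int)
print(one_hot)

# Taking the argmax of each row recovers the original integer labels.
# This works here because the columns are ordered 0, 1, 2.
recovered = one_hot.to_numpy().argmax(axis=1)
print(recovered)
```

Note that `get_dummies` only creates columns for labels that actually occur in the data, so this approach relies on every class appearing in both the training and testing files.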

The following code block builds the model and specifies how many layers there are, the number of neurons per layer, and their activation functions.

In [None]:
#create model
model = Sequential()
model.add(Dense(10, activation = "relu", input_dim = 784)) # first layer
model.add(Dense(9, activation = "relu")) #second layer
model.add(Dense(8, activation = "relu")) #third layer
model.add(Dense(10, activation = "softmax")) #output layer (originally used sigmoid, but that caused errors in later epochs)
model.compile(
    optimizer = "adam",
    loss = "categorical_crossentropy", 
    metrics = ["accuracy"]
) # compile model
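
The notebook does not record what errors the sigmoid output layer caused, so the following is only a sketch of one relevant difference between the two activations: softmax outputs form a probability distribution over the classes, which is what `categorical_crossentropy` expects, while independent sigmoid outputs do not. The `logits` values are made up for illustration.

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability; the result is unchanged.
    e = np.exp(z - np.max(z))
    return e / e.sum()

def sigmoid(z):
    # Squashes each value independently into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical raw outputs of the last layer for one sample.
logits = np.array([2.0, 1.0, 0.1, -1.0])

p_softmax = softmax(logits)
p_sigmoid = sigmoid(logits)

print(p_softmax, p_softmax.sum())  # entries sum to 1 (up to rounding)
print(p_sigmoid, p_sigmoid.sum())  # entries are independent and need not sum to 1
```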

The following code block trains the model. To change the number of epochs, set n in "epochs = n" on the second line of the code to the desired number.

In [None]:
# train the model
model.fit(data, dummies, 
          epochs = 40, 
          verbose = 0 
         ) 
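
As a rough sketch of what the epoch count means in terms of work done: `fit()` is called here without a `batch_size`, so Keras uses its default of 32 samples per gradient update. Assuming the training CSV holds 60,000 rows (the usual MNIST training set size, not verified in this notebook), the number of weight updates can be estimated as follows.

```python
import math

n_samples = 60_000   # assumed number of rows in mnist_train.csv
batch_size = 32      # Keras default for fit() when batch_size is not given
epochs = 40          # value used in the training cell above

# Each epoch passes over the whole training set once, one batch at a time.
steps_per_epoch = math.ceil(n_samples / batch_size)
total_updates = steps_per_epoch * epochs
print(steps_per_epoch, total_updates)
```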

The following code block will evaluate how well the model performs on both the training data and the testing data.

In [None]:
#checking model performance
scores = model.evaluate(data, dummies)
print("Training Accuracy: %.2f%%\n" % (scores[1]*100))

scores = model.evaluate(data_test, dummies_test)
print("Testing Accuracy: %.2f%%\n" % (scores[1]*100))
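
`model.evaluate` returns the loss followed by the compiled metrics, so `scores[1]` is the accuracy. For one-hot targets, that accuracy is the fraction of samples whose highest-probability class matches the true class. A minimal sketch with made-up predictions (the `probs` and `targets` arrays are hypothetical, not model output):

```python
import numpy as np

# Hypothetical softmax outputs for 4 samples and 3 classes.
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.4, 0.3],
    [0.2, 0.2, 0.6],
])

# One-hot true labels for the same 4 samples (classes 0, 1, 0, 2).
targets = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
])

predicted = probs.argmax(axis=1)    # most likely class per sample
actual = targets.argmax(axis=1)     # true class per sample
accuracy = (predicted == actual).mean()
print("Accuracy: %.2f%%" % (accuracy * 100))
```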

# **Fashion MNIST NN**

The code for the model in this section closely mimics the code of the previous model.

In [None]:
# load in training data
df_train = pd.read_csv("../input/fashionmnist/fashion-mnist_train.csv") # read the CSV into a DataFrame
exp = df_train.loc[: ,"label"] # copy the labels out of the data
data = df_train.loc[:, df_train.columns != "label"] # keep every column except the labels (the pixels)
dummies = pd.get_dummies(exp) # one-hot encode the labels (e.g. 2 becomes 0010000000)

# load in testing data
df_test = pd.read_csv("../input/fashionmnist/fashion-mnist_test.csv") # read the CSV into a DataFrame
exp_test = df_test.loc[: ,"label"] # copy the labels out of the data
data_test = df_test.loc[:, df_test.columns != "label"] # keep every column except the labels (the pixels)
dummies_test = pd.get_dummies(exp_test) # one-hot encode the labels (e.g. 2 becomes 0010000000)

In [None]:
#create model
fashion_model = Sequential()
fashion_model.add(Dense(10, activation = "relu", input_dim = 784)) # first layer
fashion_model.add(Dense(9, activation = "relu")) #second layer
fashion_model.add(Dense(8, activation = "relu")) #third layer
fashion_model.add(Dense(10, activation = "softmax")) #output layer (originally used sigmoid, but that caused errors in later epochs)
fashion_model.compile(
    optimizer = "adam",
    loss = "categorical_crossentropy", 
    metrics = ["accuracy"]
) # compile model



In [None]:
# training the model
fashion_model.fit(data, dummies, 
          epochs = 60,
          verbose = 0
         ) 

In [None]:
#checking model performance
scores = fashion_model.evaluate(data, dummies)
print("Training Accuracy: %.2f%%\n" % (scores[1]*100))

scores = fashion_model.evaluate(data_test, dummies_test)
print("Testing Accuracy: %.2f%%\n" % (scores[1]*100))

# Analysis of Results

Pandas, a Python data analysis library, was used to load and format the data. Keras, a Python API for deep learning, was used to create, train, and evaluate the models. Originally, this project started with TensorFlow, but a friend who works with lots of neural networks recommended using Keras and pandas instead.

The biggest difference observed between the two models is how quickly they learn. The MNIST model performs much better than the Fashion MNIST model with the same number of epochs. Over several different trained models, the MNIST model seems to reach about 80-90% accuracy with 40 epochs, whereas the Fashion MNIST model reaches about 40-60% accuracy with 40 epochs. I believe this is due in part to the Fashion MNIST dataset having harder images to classify than the MNIST dataset (where some images can be identified based on a single pixel in the image).

The decision for each model to have 3 hidden layers with 10, 9, and 8 neurons respectively was arbitrary. Originally they both had 7, 6, and 5 neurons, but they didn't perform nearly as well as desired.
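
To give a sense of how small both configurations are, the sketch below counts the trainable parameters (weights plus biases) of a stack of fully connected layers. The helper `dense_param_count` is a hypothetical name introduced here for illustration; it is not part of Keras.

```python
def dense_param_count(input_dim, layer_sizes):
    """Count weights and biases for a stack of fully connected layers."""
    total = 0
    prev = input_dim
    for units in layer_sizes:
        # Each layer has prev * units weights and one bias per unit.
        total += prev * units + units
        prev = units
    return total

# 784 input pixels, three hidden layers, and a 10-way output layer.
current = dense_param_count(784, [10, 9, 8, 10])
earlier = dense_param_count(784, [7, 6, 5, 10])
print(current, earlier)
```

In both cases nearly all of the parameters sit in the first layer, since it connects all 784 pixels to the first hidden layer.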