**Fashion MNIST - A Multilayer Perceptron**

In this practical exercise, we'll build a simple neural network that gets ~90% accuracy on the Fashion MNIST dataset (a ten-class, 28x28 image classification problem).

**Resources:**
For details about the dataset, check the following link:
[Fashion MNIST](https://github.com/zalandoresearch/fashion-mnist)


We will use Keras to implement the MLP. Check the following link for documentation:
[Keras](https://keras.io)

### Importing The Packages

First up: importing modules. This model just feeds forward, so we can use the `Sequential` class. As for the layers themselves, we're only using `Dense` and `Activation`. Nothing fancy.

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation
from tensorflow.keras.utils import to_categorical
import tensorflow as tf  # `tf` is the conventional alias for TensorFlow itself, not for Keras

import pandas as pd
import numpy as np

### Data Loading and Preparation

Next, some constants. `INPUT_SHAPE` is 784 (28 x 28, the flattened form of the image), and `NUM_CATEGORIES` is 10. All fairly self-explanatory. At the bottom, we use `pd.read_csv` to pull in our data, and we grab the `values` property, which is a NumPy array version of the `DataFrame` we just read in.
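To see what `.values` gives us, here is a minimal sketch on a tiny hand-made `DataFrame`. The column names (`label`, `pixel1`, ...) and the numbers are assumptions chosen for illustration, and only 4 pixel columns are used instead of 784.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV: a label column followed by pixel columns.
# The real files have 784 pixel columns; 4 are used here to keep the sketch small.
toy_df = pd.DataFrame({
    "label":  [9, 0, 3],
    "pixel1": [0, 12, 255],
    "pixel2": [34, 0, 128],
    "pixel3": [0, 0, 64],
    "pixel4": [200, 7, 0],
})

toy_raw = toy_df.values  # NumPy array, one row per image

print(type(toy_raw).__name__)  # ndarray
print(toy_raw.shape)           # (3, 5)
print(toy_raw[:, 0])           # first column: the labels
print(toy_raw[:, 1:])          # remaining columns: pixel intensities
```

The same column layout (label first, pixels after) is what the slicing in the next step relies on.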


#### Note: Since the data files are large, you need to download and unzip them locally from: https://www.kaggle.com/datasets/zalando-research/fashionmnist/data

In [None]:
#  DEFINE CONSTANTS

INPUT_SHAPE = 784
NUM_CATEGORIES = 10

LABEL_DICT = {
 0: "T-shirt/top",
 1: "Trouser",
 2: "Pullover",
 3: "Dress",
 4: "Coat",
 5: "Sandal",
 6: "Shirt",
 7: "Sneaker",
 8: "Bag",
 9: "Ankle boot"
}

# LOAD THE RAW DATA
train_raw = pd.read_csv('./data/mnist_fashion/fashion-mnist_train.csv').values
test_raw = pd.read_csv('./data/mnist_fashion/fashion-mnist_test.csv').values

Next, we split the imported training and testing data into X and Y. Any "x" variable is an input, while "y" is the expected output. We set train and test x to everything but the first column of our input data (hence the slice), and use Keras' `to_categorical` to one-hot encode the label in the first column into a vector of length `NUM_CATEGORIES` (10). We then normalize the X data, changing the range from 0 - 255 to 0 - 1 by dividing by 255.
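If one-hot encoding is new to you, the sketch below reproduces the idea in plain NumPy on three made-up rows. It is only an illustration: `np.eye` indexing stands in for `to_categorical` here, and the toy values are assumptions.

```python
import numpy as np

NUM_CATEGORIES = 10

# Toy rows: label in column 0, then 4 pixel values (real rows have 784).
toy_raw = np.array([
    [9,   0,  34,  0, 200],
    [0,  12,   0,  0,   7],
    [3, 255, 128, 64,   0],
])

labels = toy_raw[:, 0]
pixels = toy_raw[:, 1:]

# One-hot encode: row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(NUM_CATEGORIES)[labels]

# Scale pixel intensities from the range 0 - 255 to the range 0 - 1.
scaled = pixels / 255

print(one_hot[0])                  # a single 1 at index 9, zeros elsewhere
print(scaled.min(), scaled.max())  # 0.0 1.0
```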

In [None]:
# split into X and Y, after one-hot encoding
train_x, train_y = (train_raw[:,1:], to_categorical(train_raw[:,0], num_classes = NUM_CATEGORIES))
test_x, test_y = (test_raw[:,1:], to_categorical(test_raw[:,0], num_classes = NUM_CATEGORIES))

# normalize the x data
train_x = train_x / 255
test_x = test_x / 255

### Model Creation

#### Creating the model

Now for the fun part - defining our model! In this case it's a simple four-layer network: an input of size `INPUT_SHAPE` (784), three 512-neuron hidden layers, and an output layer with `NUM_CATEGORIES` (10) neurons. We use categorical crossentropy as our loss, since this is a multi-class classification problem. For the activation function, we use ReLU everywhere except the output layer, which uses softmax.
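To build some intuition before writing the Keras version, here is a rough NumPy sketch of the forward pass such a network performs. This is not the Keras implementation: the weights are random stand-ins for trained values, and the helper names (`relu`, `softmax`, `forward`) are made up for this illustration.

```python
import numpy as np

INPUT_SHAPE = 784
NUM_CATEGORIES = 10
layer_sizes = [INPUT_SHAPE, 512, 512, 512, NUM_CATEGORIES]

rng = np.random.default_rng(0)

# One (weights, bias) pair per dense layer; random values stand in for trained ones.
params = [
    (rng.normal(scale=0.01, size=(n_in, n_out)), np.zeros(n_out))
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z):
    shifted = z - z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def forward(x):
    # Hidden layers: affine transform followed by ReLU.
    for weights, bias in params[:-1]:
        x = relu(x @ weights + bias)
    # Output layer: affine transform followed by softmax.
    weights, bias = params[-1]
    return softmax(x @ weights + bias)

batch = rng.random((2, INPUT_SHAPE))  # two fake flattened images with values in [0, 1)
probs = forward(batch)
n_params = sum(w.size + b.size for w, b in params)

print(probs.shape)        # (2, 10)
print(probs.sum(axis=1))  # each row sums to 1, up to floating-point error
print(n_params)           # 932362
```

The parameter count is just the sum of `inputs * outputs + outputs` over the four layers.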

In [None]:
# BUILD THE MODEL
model = Sequential()

# Add a Dense layer to the model with 512 neurons
# Add a relu activation

# <write your code here>
# <write your code here>

# Repeat the same process two more times:
# Dense of 512 followed by relu activation +
# Dense of 512 followed by relu activation

# <write your code here>
# ...

# In the last layer, we use NUM_CATEGORIES as the output size.
model.add(Dense(NUM_CATEGORIES))
# We still need to add a softmax activation!
# <write your code here>

#### Compiling the model - categorical crossentropy is for multi-class classification
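As a rough illustration of what this loss measures, the NumPy sketch below computes categorical crossentropy for one made-up prediction. The function name, the `eps` clipping value, and the numbers are assumptions for this example, not Keras internals.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip predictions away from 0 so the log stays finite.
    y_pred = np.clip(y_pred, eps, 1.0)
    # With one-hot targets, this picks out -log(probability of the true class).
    return -np.sum(y_true * np.log(y_pred), axis=-1)

y_true = np.array([[1.0, 0.0, 0.0]])          # true class is 0
confident = np.array([[0.9, 0.05, 0.05]])     # high probability on the true class
unsure = np.array([[0.4, 0.3, 0.3]])          # lower probability on the true class

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)

print(loss_confident, loss_unsure)  # the confident prediction has the smaller loss
```

The loss shrinks toward 0 as the predicted probability of the true class approaches 1.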

In [None]:
# Compile the model. Use rmsprop as the optimizer, categorical_crossentropy as the loss, and accuracy as the metric
# You can check the Keras documentation for samples

# <write your code here>

#### Training the model

Finally, the training. We tell it to use `train_x` and `train_y` as our training data and `test_x` and `test_y` for validation, with 32 samples per batch, and to run through the whole dataset 8 times (8 epochs).
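A quick back-of-the-envelope sketch of what those settings mean in terms of weight updates. It assumes the training CSV holds 60,000 rows, which is the standard Fashion MNIST training set size; adjust `n_train` if your file differs.

```python
import math

n_train = 60_000   # assumption: number of rows in the training CSV
batch_size = 32
epochs = 8

# Each batch produces one weight update; the last batch may be smaller.
steps_per_epoch = math.ceil(n_train / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch)  # 1875
print(total_updates)    # 15000
```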

In [None]:
# train the model!
model.fit(train_x,
          train_y,
          epochs = 8,
          batch_size = 32,
          validation_data = (test_x, test_y))

In [None]:
# how'd the model do?
# Evaluate the model on the train and test data

# <write your code here>
# <write your code here>

Nice! The first value returned is the loss, while the second is the accuracy, which should be almost 90%.
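For a sense of how an accuracy figure like that comes about, here is a small NumPy sketch on made-up predictions: the predicted class is the index of the largest probability, and accuracy is the fraction of rows where it matches the true class. The arrays below are invented for illustration and use 3 classes instead of 10.

```python
import numpy as np

# One-hot true labels for four samples.
y_true = np.array([
    [0, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
], dtype=float)

# Made-up softmax outputs for the same four samples.
y_pred = np.array([
    [0.1, 0.2, 0.7],   # predicts class 2, correct
    [0.6, 0.3, 0.1],   # predicts class 0, correct
    [0.5, 0.4, 0.1],   # predicts class 0, but the true class is 1
    [0.2, 0.2, 0.6],   # predicts class 2, correct
])

accuracy = np.mean(y_pred.argmax(axis=1) == y_true.argmax(axis=1))
print(accuracy)  # 0.75
```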