<p align="center"><img width="50%" src="https://aimodelsharecontent.s3.amazonaws.com/aimodshare_banner.jpg" /></p>


---
**Source:**<br>
*Dataset adapted from: Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms. Han Xiao, Kashif Rasul, Roland Vollgraf. arXiv:1708.07747*


# Fashion-MNIST Image Classification Competition Model Submission Guide

---

Let's share our models on a centralized leaderboard so that we can collaborate and learn from each other's model experiments.

**Instructions:**
1. Get data in and set up X_train / X_test / y_train
2. Preprocess data / Write and Save Preprocessor function
3. Fit model on preprocessed data and save preprocessor function and model
4. Generate predictions from X_test data and submit model to competition
5. Repeat submission process to improve place on leaderboard



## 1. Get data in and set up X_train, X_test, y_train objects

In [None]:
#install aimodelshare library
! pip install aimodelshare --upgrade

In [2]:
# Get competition data
from aimodelshare import download_data
download_data('public.ecr.aws/y2e2a1d6/fashion_mnist_competition_data-repository:latest') 


Data downloaded successfully.


## 2. Preprocess data / Write and Save Preprocessor function


In [None]:
# Here is a pre-designed preprocessor, but you could also build your own to prepare the data differently

def preprocessor(image_filepath, shape=(28, 28)):
        """
        This function reads in images, resizes them to a fixed shape and
        min/max transforms them before converting feature values to float32 numeric values
        required by onnx files.
        
        params:
            image_filepath
                full filepath of a particular image
            shape
                target (width, height) passed to cv2.resize; defaults to (28, 28)
                      
        returns:
            X
                numpy array of preprocessed image data
        """
           
        import cv2
        import numpy as np

        # Read a color image, resize it, and min-max scale its pixel values.
        img = cv2.imread(image_filepath) # Read in image from filepath.
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) # cv2 reads channels in blue-green-red order; convert to RGB for ML.
        img = cv2.resize(img, shape) # Change height and width of image.
        img = img / 255.0 # Min-max transform: scale 8-bit pixel values to [0, 1].

        # Add a batch dimension and cast to float32.
        X = np.array(img)
        X = np.expand_dims(X, axis=0) # Expand dims to add "1" to object shape [1, h, w, channels] for keras model.
        X = np.array(X, dtype=np.float32) # Final dtype (float32) for onnx runtime.
        return X
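
The array steps inside the preprocessor can be checked without any image files. The sketch below is only an illustration: it replaces the `cv2` read and resize with a randomly generated 28x28 RGB array (`fake_img` is a hypothetical stand-in, not competition data) and then applies the same scaling, batch-axis, and float32 steps.

```python
import numpy as np

# Hypothetical stand-in for a decoded and resized 28x28 RGB image (8-bit pixels).
rng = np.random.default_rng(0)
fake_img = rng.integers(0, 256, size=(28, 28, 3), dtype=np.uint8)

scaled = fake_img / 255.0                  # pixel values now lie in [0, 1]
batched = np.expand_dims(scaled, axis=0)   # add a leading batch axis
X = batched.astype(np.float32)             # cast to float32

# Stacking per-image arrays along the batch axis gives one array per dataset,
# which is what np.vstack does for X_train and X_test in this guide.
stacked = np.vstack([X, X])

print(X.shape, X.dtype)
print(stacked.shape)
```

Each preprocessed image carries its own batch axis of length 1, which is why stacking the per-image results produces a single `(n_images, 28, 28, 3)` array.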

In [None]:
#  Create training data objects

# Preprocess X_train image data so it can be used to fit models
import numpy as np

file_names = [('fashion_mnist_competition_data/training_data/train_' + str(i) + '.jpeg') for i in range(60000)]
preprocessed_image_data = [preprocessor(x) for x in file_names]

# Create single X_train array from preprocessed images
X_train = np.vstack(preprocessed_image_data) 

# Load y_train labels 
import pickle
with open("fashion_mnist_competition_data/y_train_labels.pkl", "rb") as fp:  
    y_train_labels = pickle.load(fp)

# One-hot encode y_train labels (y_train.columns used to generate prediction labels below)
import pandas as pd
y_train = pd.get_dummies(y_train_labels)
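
To see what the one-hot encoding step produces, here is a small sketch on made-up labels. The `toy_labels` list is hypothetical; the real labels come from the competition pickle file and may look different. The point is that `pd.get_dummies` creates one column per distinct label, and the column order is what later maps predicted indices back to labels.

```python
import pandas as pd

# Hypothetical labels, used only for illustration.
toy_labels = ["Sneaker", "Bag", "Coat", "Bag"]

# One column per distinct label, one row per sample.
y_toy = pd.get_dummies(toy_labels)

# The column order defines which index corresponds to which class.
class_order = list(y_toy.columns)

# Every row should have exactly one active column.
row_sums = y_toy.astype(int).sum(axis=1).tolist()

print(class_order)
print(y_toy.shape)
print(row_sums)
```

With the real data the same call yields the `(60000, 10)` array of one-hot targets used for training.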

In [None]:
# Create test data objects

# Preprocess X_test image data to generate predictions from models 
import numpy as np

file_names = [('fashion_mnist_competition_data/test_data/test_' + str(i) + '.jpeg') for i in range(10000)]
preprocessed_image_data = [preprocessor(x) for x in file_names]

#Create single X_test array from preprocessed images
X_test = np.vstack(preprocessed_image_data) 

In [None]:
# Check shape 

print(X_train.shape)
print(X_test.shape)
print(y_train.shape)

(60000, 28, 28, 3)
(10000, 28, 28, 3)
(60000, 10)


## 3. Fit model on preprocessed data and save preprocessor function and model


In [None]:
# Let's build a convnet model...
import tensorflow as tf 
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation, Dropout, Flatten
from tensorflow.keras.layers import Conv2D, MaxPooling2D
from tensorflow.keras import regularizers

with tf.device('/device:GPU:0'): # "/GPU:0": Short-hand notation for the first GPU of your machine that is visible to TensorFlow.
                                 
        cnn1 = Sequential()
        cnn1.add(Conv2D(filters=16, kernel_size=3, padding='same', activation='relu', input_shape=[28, 28, 3]))
        cnn1.add(Conv2D(filters=16, kernel_size=3, padding='same', activation='relu', kernel_regularizer=regularizers.l2(l=0.01)))
        cnn1.add(MaxPooling2D(pool_size=2)) 
        cnn1.add(Conv2D(filters=16, kernel_size=3, padding='same', activation='relu', kernel_regularizer=regularizers.l2(l=0.01)))
        cnn1.add(Conv2D(filters=16, kernel_size=3, padding='same', activation='relu'))
        cnn1.add(MaxPooling2D(pool_size=2))
        cnn1.add(Dropout(0.3))
        cnn1.add(Flatten())
        cnn1.add(Dense(50, activation='relu'))
        cnn1.add(Dropout(0.2))
        cnn1.add(Dense(10, activation='softmax'))

        # Compile model...
        cnn1.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])

        # Fitting the NN to the Training set...
        hist = cnn1.fit(X_train, y_train, epochs=1, verbose=1, validation_split=0.2)
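
The `validation_split=0.2` argument holds out part of the training data. Keras documents that the held-out samples are taken from the end of the arrays, before any shuffling. The sketch below only reproduces that index arithmetic for this dataset size; it is not Keras code, and `split_at` is a name chosen here for illustration.

```python
import math

n_samples = 60000        # number of training images in this guide
validation_split = 0.2   # value passed to fit() above

# Index where the training portion ends and the validation portion begins.
split_at = int(math.floor(n_samples * (1.0 - validation_split)))

train_idx = range(0, split_at)          # first 80% of rows
val_idx = range(split_at, n_samples)    # last 20% of rows

print(len(train_idx), len(val_idx))
```

Because the split is positional rather than random, the validation score depends on how the rows are ordered; a shuffled split such as `train_test_split` is an alternative when ordering is a concern.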



#### Save preprocessor function to local "preprocessor.zip" file

In [None]:
import aimodelshare as ai
ai.export_preprocessor(preprocessor,"") 

Your preprocessor is now saved to 'preprocessor.zip'


#### Save model to local ".onnx" file

In [None]:
# Save keras model to local ONNX file
from aimodelshare.aimsonnx import model_to_onnx

onnx_model = model_to_onnx(cnn1, framework='keras',
                          transfer_learning=False,
                          deep_learning=True)

with open("model.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

## 4. Generate predictions from X_test data and submit model to competition


In [None]:
#Set credentials using modelshare.org username/password

from aimodelshare.aws import set_credentials
    
apiurl="https://v0c5jgfkx6.execute-api.us-east-1.amazonaws.com/prod/m" # This is the unique REST API that powers this Fashion-MNIST Classification Model Playground

set_credentials(apiurl=apiurl)

AI Modelshare Username:··········
AI Modelshare Password:··········
AI Model Share login credentials set successfully.


In [None]:
#Instantiate Competition

mycompetition= ai.Competition(apiurl)

In [None]:
#Submit Model 1: 

#-- Generate predicted y values (Model 1)
#Note: Keras predict returns class probabilities for classification models; argmax(axis=1) converts them to the predicted column index
prediction_column_index=cnn1.predict(X_test).argmax(axis=1)

# extract correct prediction labels 
prediction_labels = [y_train.columns[i] for i in prediction_column_index]

# Submit Model 1 to Competition Leaderboard
mycompetition.submit_model(model_filepath = "model.onnx",
                                 preprocessor_filepath="preprocessor.zip",
                                 prediction_submission=prediction_labels)

Insert search tags to help users find your model (optional): 
Provide any useful notes about your model (optional): 

Your model has been submitted as model version 3

To submit code used to create this model or to view current leaderboard navigate to Model Playground: 

 https://www.modelshare.org/detail/model:1034
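
The prediction step turns a matrix of class probabilities into a list of labels. A minimal sketch of that conversion follows, using made-up probabilities and hypothetical class names (`class_names` stands in for `y_train.columns`).

```python
import numpy as np

# Hypothetical class names, in the same order as the one-hot columns.
class_names = ["Bag", "Coat", "Sneaker"]

# Made-up softmax outputs: one row per image, one column per class.
probs = np.array(
    [[0.1, 0.2, 0.7],
     [0.8, 0.1, 0.1],
     [0.3, 0.4, 0.3]],
    dtype=np.float32,
)

# Column index of the highest probability in each row.
predicted_index = probs.argmax(axis=1)

# Map each index back to its label.
predicted_labels = [class_names[i] for i in predicted_index]

print(predicted_labels)
```

The submission expects labels rather than indices, so the label order used here has to match the column order of the one-hot targets the model was trained on.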


In [None]:
# Get leaderboard to explore current best model architectures

# Get raw data in pandas data frame
data = mycompetition.get_leaderboard()

# Stylize leaderboard data
mycompetition.stylize_leaderboard(data)

| | accuracy | f1_score | precision | recall | ml_framework | transfer_learning | deep_learning | model_type | depth | num_params | dropout_layers | maxpooling2d_layers | flatten_layers | dense_layers | relu_act | softmax_act | loss | optimizer | model_config | memory_size | username | version |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 85.94% | 85.84% | 85.91% | 85.94% | keras | False | True | Sequential | 11 | 47168 | 2.0 | 2.0 | 1 | 2 | 5 | 1.0 | str | RMSprop | {'name': 'sequential', 'layers... | 2691496 | gstreett | 3 |
| 2 | 84.49% | 84.42% | 85.64% | 84.49% | keras | False | True | Sequential | 3 | 302474 | | | 1 | 2 | 1 | | CategoricalCrossentropy | Adam | {'name': 'sequential', 'layers... | 166064 | AIModelShare | 1 |
| 3 | 82.18% | 82.13% | 84.44% | 82.18% | keras | False | True | Sequential | 4 | 506186 | | | 1 | 3 | 2 | 1.0 | function | RMSprop | {'name': 'sequential_9', 'laye... | 180376 | AIModelShare | 2 |


## 5. Repeat submission process to improve place on leaderboard


In [None]:
# Train and submit model 2 using same preprocessor (note that you could save a new preprocessor, but we will use the same one for this example).

with tf.device('/device:GPU:0'): # "/GPU:0": Short-hand notation for the first GPU of your machine that is visible to TensorFlow.
      densemodel = Sequential()
      densemodel.add(Dense(128,  input_shape=(28, 28, 3), activation='relu'))
      densemodel.add(Dense(64, activation='relu'))
      densemodel.add(Dense(64, activation='relu'))
      densemodel.add(Flatten())
      densemodel.add(Dense(10, activation='softmax')) 
                                                  
      # Compile model
      densemodel.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])

      # Fitting the NN to the Training set
      densemodel.fit(X_train, y_train, 
                    epochs = 1, verbose=1, validation_split=.2)



In [None]:
# Save keras model to local ONNX file
from aimodelshare.aimsonnx import model_to_onnx

onnx_model = model_to_onnx(densemodel, framework='keras',
                          transfer_learning=False,
                          deep_learning=True)

with open("model2.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

In [None]:
#Submit Model 2: 

#-- Generate predicted y values (Model 2)
prediction_column_index=densemodel.predict(X_test).argmax(axis=1)

# extract correct prediction labels 
prediction_labels = [y_train.columns[i] for i in prediction_column_index]

# Submit Model 2 to Competition Leaderboard
mycompetition.submit_model(model_filepath = "model2.onnx",
                                 preprocessor_filepath="preprocessor.zip",
                                 prediction_submission=prediction_labels)

Insert search tags to help users find your model (optional): 
Provide any useful notes about your model (optional): 

Your model has been submitted as model version 4

To submit code used to create this model or to view current leaderboard navigate to Model Playground: 

 https://www.modelshare.org/detail/model:1034


In [None]:
# Compare two or more models 
data=mycompetition.compare_models([3,4], verbose=1)
mycompetition.stylize_compare(data)

| | Model_3_Layer | Model_3_Shape | Model_3_Params | Model_4_Layer | Model_4_Shape | Model_4_Params |
| --- | --- | --- | --- | --- | --- | --- |
| 0 | Conv2D | [None, 28, 28, 16] | 448 | Dense | [None, 28, 28, 128] | 512.0 |
| 1 | Conv2D | [None, 28, 28, 16] | 2320 | Dense | [None, 28, 28, 64] | 8256.0 |
| 2 | MaxPooling2D | [None, 14, 14, 16] | 0 | Dense | [None, 28, 28, 64] | 4160.0 |
| 3 | Conv2D | [None, 14, 14, 16] | 2320 | Flatten | [None, 50176] | 0.0 |
| 4 | Conv2D | [None, 14, 14, 16] | 2320 | Dense | [None, 10] | 501770.0 |
| 5 | MaxPooling2D | [None, 7, 7, 16] | 0 | | | |
| 6 | Dropout | [None, 7, 7, 16] | 0 | | | |
| 7 | Flatten | [None, 784] | 0 | | | |
| 8 | Dense | [None, 50] | 39250 | | | |
| 9 | Dropout | [None, 50] | 0 | | | |
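
The parameter counts in the comparison above follow from the standard formulas for convolutional and dense layers with a bias term. The sketch below recomputes them by hand; the helper names `conv2d_params` and `dense_params` are introduced here for illustration and are not part of aimodelshare or Keras.

```python
def conv2d_params(kernel_size, in_channels, filters):
    """Weights per filter (k * k * in_channels) plus one bias, times the number of filters."""
    return (kernel_size * kernel_size * in_channels + 1) * filters


def dense_params(in_features, units):
    """One weight per input feature plus one bias, times the number of units."""
    return (in_features + 1) * units


# Convolutional model: four 3x3 Conv2D layers, then Dense(50) and Dense(10).
cnn_counts = [
    conv2d_params(3, 3, 16),       # first Conv2D sees 3 color channels
    conv2d_params(3, 16, 16),
    conv2d_params(3, 16, 16),
    conv2d_params(3, 16, 16),
    dense_params(7 * 7 * 16, 50),  # Flatten output of shape [None, 784]
    dense_params(50, 10),          # final softmax layer
]

# Dense model: Dense layers act on the last axis (3 channels) until Flatten.
dense_counts = [
    dense_params(3, 128),
    dense_params(128, 64),
    dense_params(64, 64),
    dense_params(28 * 28 * 64, 10),  # Flatten output of shape [None, 50176]
]

print(cnn_counts, sum(cnn_counts))
print(dense_counts, sum(dense_counts))
```

The dense model ends up with far more parameters because its final layer connects all 50176 flattened values to the 10 outputs, whereas pooling shrinks the convolutional model's flattened output to 784 values.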


## Optional: Tune model within range of hyperparameters with Keras Tuner

*Simple example shown below. Consult [documentation](https://keras.io/guides/keras_tuner/getting_started/) to see full functionality.*

In [None]:
! pip install keras_tuner

In [None]:
#Separate validation data 
from sklearn.model_selection import train_test_split
x_train_split, x_val, y_train_split, y_val = train_test_split(
     X_train, y_train, test_size=0.2, random_state=42)

In [None]:
import keras_tuner as kt
import tensorflow as tf 
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation, Dropout, Flatten
from tensorflow.keras.layers import Conv2D, MaxPooling2D
from tensorflow.keras import regularizers

#Define model structure & parameter search space with function
def build_model(hp):
    with tf.device('/device:GPU:0'): # "/GPU:0": Short-hand notation for the first GPU of your machine that is visible to TensorFlow.
                                 
        model = Sequential()
        model.add(Conv2D(filters=16, kernel_size=3, padding='same', activation='relu', input_shape=[28, 28, 3]))
        model.add(Conv2D(filters=16, kernel_size=3, padding='same', activation='relu', kernel_regularizer=regularizers.l2(l=0.01)))
        model.add(MaxPooling2D(pool_size=2)) 
        model.add(Conv2D(filters=16, kernel_size=3, padding='same', activation='relu', kernel_regularizer=regularizers.l2(l=0.01)))
        model.add(Conv2D(filters=hp.Int("filters", min_value=8, max_value=32, step=8), #range 8-32 inclusive, minimum step between tested values is 8
                         kernel_size=3, padding='same', activation='relu'))
        model.add(MaxPooling2D(pool_size=2))
        model.add(Dropout(0.3))
        model.add(Flatten())
        model.add(Dense(50, activation='relu'))
        model.add(Dropout(0.2))
        model.add(Dense(10, activation='softmax'))

        # Compile model...
        model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])

        return model 

#initialize the tuner (which will search through parameters)
tuner = kt.RandomSearch(
    hypermodel=build_model, 
    objective="val_accuracy", # objective to optimize
    max_trials=2, #max number of trials to run during search
    executions_per_trial=1, #higher number reduces variance of results; gauges model performance more accurately 
    overwrite=True,
    directory="tuning_model",
    project_name="tuning_units",
)

tuner.search(x_train_split, y_train_split, epochs=1, validation_data=(x_val, y_val))


Trial 2 Complete [00h 02m 23s]
val_accuracy: 0.8656666874885559

Best val_accuracy So Far: 0.8656666874885559
Total elapsed time: 00h 04m 46s
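
The search space defined by `hp.Int("filters", min_value=8, max_value=32, step=8)` is small enough to list by hand. The sketch below enumerates the candidate values and draws two of them at random to mimic `max_trials=2`. It is a simplified stand-in using Python's `random` module, not Keras Tuner's actual sampling logic.

```python
import random

min_value, max_value, step = 8, 32, 8
max_trials = 2

# max_value is inclusive, so extend the range end by one.
candidates = list(range(min_value, max_value + 1, step))

# Draw distinct candidate values, one per trial (illustrative only).
rng = random.Random(0)
trial_values = rng.sample(candidates, k=max_trials)

print(candidates)
print(trial_values)
```

With four candidates and two trials, half of the search space goes untested, so raising `max_trials` is a cheap way to make the search more thorough in this example.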


In [None]:
# Build model with best hyperparameters

# Get up to the 5 best hyperparameter sets, ranked best first.
best_hps = tuner.get_best_hyperparameters(5)
# Build the model with the best hp.
tuned_model = build_model(best_hps[0])
# Fit with the entire dataset.
tuned_model.fit(x=X_train, y=y_train, epochs=1)




<keras.callbacks.History at 0x7fde6b751d50>

In [None]:
# Save keras model to local ONNX file
from aimodelshare.aimsonnx import model_to_onnx

onnx_model = model_to_onnx(tuned_model, framework='keras',
                          transfer_learning=False,
                          deep_learning=True)

with open("tuned_model.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

In [None]:
#Submit Model 3: 

#-- Generate predicted y values (Model 3)
prediction_column_index=tuned_model.predict(X_test).argmax(axis=1)

# extract correct prediction labels 
prediction_labels = [y_train.columns[i] for i in prediction_column_index]

# Submit Model 3 to Competition Leaderboard
mycompetition.submit_model(model_filepath = "tuned_model.onnx",
                                 preprocessor_filepath="preprocessor.zip",
                                 prediction_submission=prediction_labels)

Insert search tags to help users find your model (optional): 
Provide any useful notes about your model (optional): 

Your model has been submitted as model version 5

To submit code used to create this model or to view current leaderboard navigate to Model Playground: 

 https://www.modelshare.org/detail/model:1034


In [None]:
# Get leaderboard

data = mycompetition.get_leaderboard()
mycompetition.stylize_leaderboard(data)

| | accuracy | f1_score | precision | recall | ml_framework | transfer_learning | deep_learning | model_type | depth | num_params | dropout_layers | maxpooling2d_layers | flatten_layers | dense_layers | relu_act | softmax_act | loss | optimizer | model_config | memory_size | username | version |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 86.43% | 86.32% | 86.49% | 86.43% | keras | False | True | Sequential | 11 | 47168 | 2.0 | 2.0 | 1 | 2 | 5 | 1.0 | str | RMSprop | {'name': 'sequential_1', 'laye... | 2363616 | gstreett | 5 |
| 1 | 85.94% | 85.84% | 85.91% | 85.94% | keras | False | True | Sequential | 11 | 47168 | 2.0 | 2.0 | 1 | 2 | 5 | 1.0 | str | RMSprop | {'name': 'sequential', 'layers... | 2691496 | gstreett | 3 |
| 2 | 85.03% | 84.83% | 85.05% | 85.03% | keras | False | True | Sequential | 5 | 514698 | | | 1 | 4 | 3 | 1.0 | str | RMSprop | {'name': 'sequential_2', 'laye... | 2044352 | gstreett | 4 |
| 4 | 84.49% | 84.42% | 85.64% | 84.49% | keras | False | True | Sequential | 3 | 302474 | | | 1 | 2 | 1 | | CategoricalCrossentropy | Adam | {'name': 'sequential', 'layers... | 166064 | AIModelShare | 1 |
| 5 | 82.18% | 82.13% | 84.44% | 82.18% | keras | False | True | Sequential | 4 | 506186 | | | 1 | 3 | 2 | 1.0 | function | RMSprop | {'name': 'sequential_9', 'laye... | 180376 | AIModelShare | 2 |


In [None]:
# Compare two or more models 
data=mycompetition.compare_models([4, 5], verbose=1)
mycompetition.stylize_compare(data)

| | Model_4_Layer | Model_4_Shape | Model_4_Params | Model_5_Layer | Model_5_Shape | Model_5_Params |
| --- | --- | --- | --- | --- | --- | --- |
| 0 | Dense | [None, 28, 28, 128] | 512.0 | Conv2D | [None, 28, 28, 16] | 448 |
| 1 | Dense | [None, 28, 28, 64] | 8256.0 | Conv2D | [None, 28, 28, 16] | 2320 |
| 2 | Dense | [None, 28, 28, 64] | 4160.0 | MaxPooling2D | [None, 14, 14, 16] | 0 |
| 3 | Flatten | [None, 50176] | 0.0 | Conv2D | [None, 14, 14, 16] | 2320 |
| 4 | Dense | [None, 10] | 501770.0 | Conv2D | [None, 14, 14, 16] | 2320 |
| 5 | | | | MaxPooling2D | [None, 7, 7, 16] | 0 |
| 6 | | | | Dropout | [None, 7, 7, 16] | 0 |
| 7 | | | | Flatten | [None, 784] | 0 |
| 8 | | | | Dense | [None, 50] | 39250 |
| 9 | | | | Dropout | [None, 50] | 0 |
