<p align="center"><img width="50%" src="https://aimodelsharecontent.s3.amazonaws.com/aimodshare_banner.jpg" /></p>


# Image Classification Competition Model Submission Guide

---

Let's share our models to a centralized leaderboard, so that we can collaborate and learn from the model experimentation process...

**Instructions:**
1.   Get data in and set up X_train / X_test / y_train
2.   Preprocess data / Write and Save Preprocessor function
3. Fit model on preprocessed data and save preprocessor function and model 
4. Generate predictions from X_test data and submit model to competition
5. Repeat submission process to improve place on leaderboard



## 1. Get data in and set up X_train, X_test, y_train objects

In [1]:
#install aimodelshare library
! pip install aimodelshare --upgrade

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting aimodelshare
  Downloading aimodelshare-0.0.159-py3-none-any.whl (952 kB)
[K     |████████████████████████████████| 952 kB 4.3 MB/s 
[?25hCollecting PyJWT>=2.4.0
  Downloading PyJWT-2.6.0-py3-none-any.whl (20 kB)
Collecting keras2onnx>=1.7.0
  Downloading keras2onnx-1.7.0-py3-none-any.whl (96 kB)
[K     |████████████████████████████████| 96 kB 5.2 MB/s 
[?25hCollecting wget==3.2
  Downloading wget-3.2.zip (10 kB)
Collecting onnxmltools>=1.6.1
  Downloading onnxmltools-1.11.1-py3-none-any.whl (308 kB)
[K     |████████████████████████████████| 308 kB 55.0 MB/s 
Collecting shortuuid>=1.0.8
  Downloading shortuuid-1.0.11-py3-none-any.whl (10 kB)
Collecting boto3==1.21.20
  Downloading boto3-1.21.20-py3-none-any.whl (132 kB)
[K     |████████████████████████████████| 132 kB 61.7 MB/s 
Collecting scikit-learn==0.24.2
  Downloading scikit_learn-0.24.2-cp38-cp38-manylinux2010_x86_

In [2]:
# Get competition data
from aimodelshare import download_data
download_data('public.ecr.aws/y2e2a1d6/neuralnet_competition_data-repository:latest') 


Data downloaded successfully.


In [3]:
# Create training data objects 

# Extract images
!unzip "neuralnet_competition_data/X_train.zip"
!unzip "neuralnet_competition_data/X_test.zip" 

# Create ordered list of filepaths 
train_filepaths = [('/content/train_shuffle/' + str(i) + '.jpg') for i in range(0, 6472)]
test_filepaths = [('/content/test_shuffle/' + str(i) + '.jpg') for i in range(0, 9127)]

# Read in y_train data 
import pandas as pd 
y_train = pd.read_csv("neuralnet_competition_data/y_train.csv")

[1;30;43mStreaming output truncated to the last 5000 lines.[0m
  inflating: test_shuffle/8701.jpg   
  inflating: test_shuffle/1057.jpg   
  inflating: test_shuffle/6738.jpg   
  inflating: test_shuffle/7426.jpg   
  inflating: test_shuffle/8715.jpg   
  inflating: test_shuffle/3640.jpg   
  inflating: test_shuffle/3898.jpg   
  inflating: test_shuffle/5231.jpg   
  inflating: test_shuffle/2238.jpg   
  inflating: test_shuffle/4891.jpg   
  inflating: test_shuffle/5557.jpg   
  inflating: test_shuffle/4649.jpg   
  inflating: test_shuffle/3126.jpg   
  inflating: test_shuffle/7340.jpg   
  inflating: test_shuffle/8073.jpg   
  inflating: test_shuffle/1731.jpg   
  inflating: test_shuffle/2210.jpg   
  inflating: test_shuffle/4661.jpg   
  inflating: test_shuffle/7368.jpg   
  inflating: test_shuffle/6076.jpg   
  inflating: test_shuffle/1719.jpg   
  inflating: test_shuffle/6710.jpg   
  inflating: test_shuffle/3668.jpg   
  inflating: test_shuffle/4107.jpg   
  inflating: test_shuff

##2.   Preprocess data / Write and Save Preprocessor function


In [7]:
# Here is a pre-designed preprocessor, but you could also build your own to prepare the data differently
import albumentations as A
def preprocessor(image_filepath, shape=(10, 10)):
        """
        This function preprocesses reads in images, resizes them to a fixed shape and
        min/max transforms them before converting feature values to float32 numeric values
        required by onnx files.
        
        params:
            image_filepath
                full filepath of a particular image
                      
        returns:
            X
                numpy array of preprocessed image data
                  
        """
           
        import cv2
        import numpy as np

        "Resize a color image and min/max transform the image"
        img = cv2.imread(image_filepath) # Read in image from filepath.
        img = cv2.resize(img, (64, 64), interpolation = cv2.INTER_CUBIC)
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) # cv2 reads in images in order of blue green and red, we reverse the order for ML.
        #img = cv2.resize(img, shape) # Change height and width of image.
        
        transform_data = A.Compose([A.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225), max_pixel_value=255.0, p=1)])
        img = transform_data(image = img)['image']  #custom transform
        #img = img / 255.0 # Min-max transform.


        # Resize all the images...
        X = np.array(img)
        X = np.expand_dims(X, axis=0) # Expand dims to add "1" to object shape [1, h, w, channels].
        X = np.array(X, dtype=np.float32) # Final shape for onnx runtime.

        # transpose image to pytorch format
        X = np.transpose(X, (0, 3, 1, 2))

        return X

In [8]:
# Use preprocessor to create X_train object 

# Import image, load to array of shape height, width, channels, then min/max transform...
# Read in all images from filenames...
import numpy as np 
import os 

preprocessed_image_data = [preprocessor(x) for x in train_filepaths]

# Object needs to be an array rather than a list for Keras. (vstack converts above list to array object.)
X_train = np.vstack(preprocessed_image_data)
# Assigning to X_train to highlight that this represents feature input data for our model.

In [9]:
# Preprocess X_test image data to generate predictions from models 
import numpy as np

preprocessed_image_data = [preprocessor(x) for x in test_filepaths]

#Create single X_test array from preprocessed images
X_test = np.vstack(preprocessed_image_data) 

##3. Fit model on preprocessed data and save preprocessor function and model 


In [11]:
#Let us build a baseline pytorch model and load the trained weights
import torch
import torch.nn as nn
import torchvision
from torchsummary import summary

# Load the EfficientNetB3 model trained on ImageNet
backbone_model = torchvision.models.efficientnet_b3(weights='IMAGENET1K_V1')
backbone_model.classifier = nn.Identity()

# input = torch.rand((1, 3, 64, 64))
# output = backbone_model(input)
# print(output.shape)

class BaselineModel(nn.Module):
    def __init__(self, backbone) -> None:
        super().__init__()
        self.backbone = backbone
        # 3 super classes and 89 + 1 (novel) sub classes
        self.superclass = nn.Linear(in_features = 1536, out_features = 3)
        #self.subclass = nn.Linear(in_features = 1536, out_features = 90)
    
    def forward(self, x):
        out = self.backbone(x)
        super_class_out = self.superclass(out)
        #sub_class_out = self.subclass(out)
        return super_class_out

# Create the baseline EfficientNetB6 model
baseline_model = BaselineModel(backbone_model)

# input = torch.rand((1, 3, 64, 64))
# output = backbone_model(input)
# print(output.shape)
# print(baseline_model)

baseline_model.load_state_dict(torch.load('./Bestmodel_SuperClass.pth', map_location=torch.device('cpu'))['model_state_dict'], strict = True)

<All keys matched successfully>

#### Save preprocessor function to local "preprocessor.zip" file

In [12]:
import aimodelshare as ai
ai.export_preprocessor(preprocessor,"") 

Your preprocessor is now saved to 'preprocessor.zip'


#### Save model to local ".onnx" file

In [13]:
# Save pytorch model to local ONNX file
from aimodelshare.aimsonnx import model_to_onnx
example_input = torch.randn(1, 3, 64, 64, requires_grad=True)

onnx_model = model_to_onnx(baseline_model, framework='pytorch',
                           model_input=example_input,
                          transfer_learning=True,
                          deep_learning=True)

with open("baseline_model.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

## 4. Generate predictions from X_test data and submit model to competition


In [14]:
#Set credentials using modelshare.org username/password

from aimodelshare.aws import set_credentials

# Note -- This is the unique rest api that powers this specific image classification Model Plaground
# ... Update the apiurl if submitting to a new competition

apiurl="https://8vhobca1n7.execute-api.us-east-1.amazonaws.com/prod/m"
set_credentials(apiurl=apiurl)

AI Modelshare Username:··········
AI Modelshare Password:··········
AI Model Share login credentials set successfully.


In [15]:
#Instantiate Competition

mycompetition= ai.Competition(apiurl)

In [16]:
#Submit Model 1: 

#-- Generate predicted y values (Model 1)
#Note: Keras predict returns the predicted column index location for classification models
# prediction_column_index=cnn1.predict(X_test).argmax(axis=1)
import torch
import numpy as np

X_test_tensor = torch.from_numpy(X_test)
#print(X_test_tensor.shape)

prediction_column_index = np.zeros((X_test_tensor.shape[0]))

for idx in range(0, len(X_test_tensor), 64):
    last_idx = min(idx + 64, len(X_test_tensor))
    if idx % (1280) == 0:
        print(idx)
    
    with torch.no_grad():
        prediction_column_index[idx:last_idx] = (torch.argmax(baseline_model(X_test_tensor[idx:last_idx]), dim = 1)).detach().cpu().numpy()

prediction_column_index = prediction_column_index.astype(np.int32)


# extract correct prediction labels 
prediction_labels = [y_train.columns[i] for i in prediction_column_index]

# print(len(prediction_labels))
# print(prediction_labels[:10])

0
1280
2560
3840
5120
6400
7680
8960


In [17]:
# Submit Model 1 to Competition Leaderboard
mycompetition.submit_model(model_filepath = "baseline_model.onnx",
                                 preprocessor_filepath="preprocessor.zip",
                                 prediction_submission=prediction_labels)

Insert search tags to help users find your model (optional): None
Provide any useful notes about your model (optional): None

Your model has been submitted as model version 99

To submit code used to create this model or to view current leaderboard navigate to Model Playground: 

 https://www.modelshare.org/detail/model:2651


In [18]:
# Get leaderboard to explore current best model architectures

# Get raw data in pandas data frame
data = mycompetition.get_leaderboard()

# Stylize leaderboard data
mycompetition.stylize_leaderboard(data)

Unnamed: 0,accuracy,f1_score,precision,recall,ml_framework,transfer_learning,deep_learning,model_type,depth,num_params,dense_layers,identity_layers,adaptiveavgpool2d_layers,batchnorm2d_layers,dropout_layers,conv2d_layers,flatten_layers,maxpooling2d_layers,relu_act,softmax_act,silu_act,sigmoid_act,loss,optimizer,memory_size,username,version
0,73.77%,73.42%,72.92%,75.65%,unknown,,,unknown,,,0.0,,,,,,,,,,,,,,,cp3016,24
1,73.42%,73.08%,73.39%,76.98%,unknown,,,unknown,,,0.0,,,,,,,,,,,,,,,cp3016,41
2,72.02%,71.58%,72.15%,75.58%,unknown,,,unknown,,,0.0,,,,,,,,,,,,,,,cp3016,81
3,69.57%,69.21%,70.80%,73.78%,pytorch,True,True,BaselineModel(),237.0,10700843.0,1.0,1.0,27.0,78.0,,130.0,,,,,78.0,26.0,,,1944992.0,ug2146,99
4,69.22%,69.19%,70.56%,73.39%,unknown,,,unknown,,,0.0,,,,,,,,,,,,,,,cp3016,56
5,68.16%,67.74%,68.44%,71.15%,unknown,,,unknown,,,0.0,,,,,,,,,,,,,,,vs2778,38
6,68.27%,68.27%,67.65%,71.05%,unknown,,,unknown,,,0.0,,,,,,,,,,,,,,,ar4180,93
7,68.27%,68.27%,67.65%,71.05%,unknown,,,unknown,,,0.0,,,,,,,,,,,,,,,ar4180,47
8,68.14%,67.74%,67.69%,70.80%,unknown,,,unknown,,,0.0,,,,,,,,,,,,,,,cp3016,83
9,67.75%,67.44%,67.73%,70.62%,unknown,,,unknown,,,0.0,,,,,,,,,,,,,,,sp4013,90


## 5. Repeat submission process to improve place on leaderboard


In [None]:
# Train and submit model 2 using same preprocessor (note that you could save a new preprocessor, but we will use the same one for this example).

with tf.device('/device:GPU:0'): # "/GPU:0": Short-hand notation for the first GPU of your machine that is visible to TensorFlow.
      cnn2 = Sequential()
      cnn2.add(Dense(64,  input_shape=(10, 10, 3), activation='relu'))
      cnn2.add(Dense(64, activation='relu'))
      cnn2.add(Dense(64, activation='relu'))
      cnn2.add(Flatten())
      cnn2.add(Dense(3, activation='softmax')) 
                                                  
      # Compile model
      cnn2.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])

      # Fitting the NN to the Training set
      cnn2.fit(X_train, y_train, 
                    epochs = 1, verbose=1, validation_split=.2)



In [None]:
# Save keras model to local ONNX file
from aimodelshare.aimsonnx import model_to_onnx

onnx_model = model_to_onnx(cnn2, framework='keras',
                          transfer_learning=False,
                          deep_learning=True)

with open("model2.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

In [None]:
#Submit Model 2: 

#-- Generate predicted y values (Model 2)
prediction_column_index=cnn2.predict(X_test).argmax(axis=1)

# extract correct prediction labels 
prediction_labels = [y_train.columns[i] for i in prediction_column_index]

# Submit Model 2 to Competition Leaderboard
mycompetition.submit_model(model_filepath = "model2.onnx",
                                 preprocessor_filepath="preprocessor.zip",
                                 prediction_submission=prediction_labels)

Insert search tags to help users find your model (optional): 
Provide any useful notes about your model (optional): 

Your model has been submitted as model version 2

To submit code used to create this model or to view current leaderboard navigate to Model Playground: 

 https://www.modelshare.org/detail/model:2651


In [None]:
# Compare two or more models
data=mycompetition.compare_models([1,2], verbose=1)
mycompetition.stylize_compare(data)

Unnamed: 0,Model_1_Layer,Model_1_Shape,Model_1_Params,Model_2_Layer,Model_2_Shape,Model_2_Params
0,Conv2D,"[None, 10, 10, 16]",448,Dense,"[None, 10, 10, 64]",256.0
1,Conv2D,"[None, 10, 10, 16]",2320,Dense,"[None, 10, 10, 64]",4160.0
2,MaxPooling2D,"[None, 5, 5, 16]",0,Dense,"[None, 10, 10, 64]",4160.0
3,Conv2D,"[None, 5, 5, 16]",2320,Flatten,"[None, 6400]",0.0
4,Conv2D,"[None, 5, 5, 16]",2320,Dense,"[None, 3]",19203.0
5,MaxPooling2D,"[None, 2, 2, 16]",0,,,
6,Dropout,"[None, 2, 2, 16]",0,,,
7,Flatten,"[None, 64]",0,,,
8,Dense,"[None, 50]",3250,,,
9,Dropout,"[None, 50]",0,,,


## Optional: Tune model within range of hyperparameters with Keras Tuner

*Simple example shown below. Consult [documentation](https://keras.io/guides/keras_tuner/getting_started/) to see full functionality.*

In [None]:
! pip install keras_tuner

In [None]:
#Separate validation data 
from sklearn.model_selection import train_test_split
x_train_split, x_val, y_train_split, y_val = train_test_split(
     X_train, y_train, test_size=0.2, random_state=42)

In [None]:
import keras_tuner as kt
import tensorflow as tf 
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation, Dropout, Flatten
from tensorflow.keras.layers import Conv2D, MaxPooling2D
from tensorflow.keras import regularizers

#Define model structure & parameter search space with function
def build_model(hp):
    with tf.device('/device:GPU:0'): # "/GPU:0": Short-hand notation for the first GPU of your machine that is visible to TensorFlow.
                                 
        model = Sequential()
        model.add(Conv2D(filters=16, kernel_size=3, padding='same', activation='relu', input_shape=[10, 10, 3]))
        model.add(Conv2D(filters=16, kernel_size=3, padding='same', activation='relu', kernel_regularizer=regularizers.l2(l=0.01)))
        model.add(MaxPooling2D(pool_size=2)) 
        model.add(Conv2D(filters=16, kernel_size=3, padding='same', activation='relu', kernel_regularizer=regularizers.l2(l=0.01)))
        model.add(Conv2D(filters=hp.Int("filters", min_value=8, max_value=32, step=8), #range 8-32 inclusive, minimum step between tested values is 8
                         kernel_size=3, padding='same', activation='relu'))
        model.add(MaxPooling2D(pool_size=2))
        model.add(Dropout(0.3))
        model.add(Flatten())
        model.add(Dense(50, activation='relu'))
        model.add(Dropout(0.2))
        model.add(Dense(3, activation='softmax'))

        # Compile model...
        model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])

        return model 

#initialize the tuner (which will search through parameters)
tuner = kt.RandomSearch(
    hypermodel=build_model, 
    objective="val_accuracy", # objective to optimize
    max_trials=2, #max number of trials to run during search
    executions_per_trial=1, #higher number reduces variance of results; guages model performance more accurately 
    overwrite=True,
    directory="tuning_model",
    project_name="tuning_units",
)

tuner.search(x_train_split, y_train_split, epochs=1, validation_data=(x_val, y_val))


Trial 2 Complete [00h 00m 04s]
val_accuracy: 0.6046332120895386

Best val_accuracy So Far: 0.6046332120895386
Total elapsed time: 00h 00m 11s


In [None]:
# Build model with best hyperparameters

# Get the top 2 hyperparameters.
best_hps = tuner.get_best_hyperparameters(5)
# Build the model with the best hp.
tuned_model = build_model(best_hps[0])
# Fit with the entire dataset.
tuned_model.fit(x=X_train, y=y_train, epochs=1)




<keras.callbacks.History at 0x7f0838ddda50>

In [None]:
# Save keras model to local ONNX file
from aimodelshare.aimsonnx import model_to_onnx

onnx_model = model_to_onnx(tuned_model, framework='keras',
                          transfer_learning=False,
                          deep_learning=True)

with open("tuned_model.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

In [None]:
#Submit Model 3: 

#-- Generate predicted y values (Model 3)
prediction_column_index=tuned_model.predict(X_test).argmax(axis=1)

# extract correct prediction labels 
prediction_labels = [y_train.columns[i] for i in prediction_column_index]

# Submit Model 3 to Competition Leaderboard
mycompetition.submit_model(model_filepath = "tuned_model.onnx",
                                 preprocessor_filepath="preprocessor.zip",
                                 prediction_submission=prediction_labels)

Insert search tags to help users find your model (optional): 
Provide any useful notes about your model (optional): 

Your model has been submitted as model version 3

To submit code used to create this model or to view current leaderboard navigate to Model Playground: 

 https://www.modelshare.org/detail/model:2651


In [None]:
# Get leaderboard

data = mycompetition.get_leaderboard()
mycompetition.stylize_leaderboard(data)

Unnamed: 0,accuracy,f1_score,precision,recall,ml_framework,transfer_learning,deep_learning,model_type,depth,num_params,dense_layers,dropout_layers,conv2d_layers,flatten_layers,maxpooling2d_layers,relu_act,softmax_act,loss,optimizer,memory_size,username,version
0,53.86%,54.04%,55.52%,59.17%,keras,False,True,Sequential,11,10811,2,2.0,4.0,1,2.0,5,1,str,RMSprop,44936,COMS_NNDL,1
1,51.16%,50.74%,52.75%,54.38%,keras,False,True,Sequential,5,27779,4,,,1,,3,1,str,RMSprop,112168,COMS_NNDL,2
2,45.25%,44.57%,56.87%,53.68%,keras,False,True,Sequential,11,13571,2,2.0,4.0,1,2.0,5,1,str,RMSprop,55976,COMS_NNDL,3


In [None]:
# Compare two or more models 
data=mycompetition.compare_models([1, 2, 3], verbose=1)
mycompetition.stylize_compare(data)

Unnamed: 0,Model_1_Layer,Model_1_Shape,Model_1_Params,Model_2_Layer,Model_2_Shape,Model_2_Params,Model_3_Layer,Model_3_Shape,Model_3_Params
0,Conv2D,"[None, 10, 10, 16]",448,Dense,"[None, 10, 10, 64]",256.0,Conv2D,"[None, 10, 10, 16]",448
1,Conv2D,"[None, 10, 10, 16]",2320,Dense,"[None, 10, 10, 64]",4160.0,Conv2D,"[None, 10, 10, 16]",2320
2,MaxPooling2D,"[None, 5, 5, 16]",0,Dense,"[None, 10, 10, 64]",4160.0,MaxPooling2D,"[None, 5, 5, 16]",0
3,Conv2D,"[None, 5, 5, 16]",2320,Flatten,"[None, 6400]",0.0,Conv2D,"[None, 5, 5, 16]",2320
4,Conv2D,"[None, 5, 5, 16]",2320,Dense,"[None, 3]",19203.0,Conv2D,"[None, 5, 5, 24]",3480
5,MaxPooling2D,"[None, 2, 2, 16]",0,,,,MaxPooling2D,"[None, 2, 2, 24]",0
6,Dropout,"[None, 2, 2, 16]",0,,,,Dropout,"[None, 2, 2, 24]",0
7,Flatten,"[None, 64]",0,,,,Flatten,"[None, 96]",0
8,Dense,"[None, 50]",3250,,,,Dense,"[None, 50]",4850
9,Dropout,"[None, 50]",0,,,,Dropout,"[None, 50]",0
