<p align="center"><img width="50%" src="https://aimodelsharecontent.s3.amazonaws.com/aimodshare_banner.jpg" /></p>


# Image Classification Competition Model Submission Guide

---

Let's share our models to a centralized leaderboard, so that we can collaborate and learn from the model experimentation process...

**Instructions:**
1.   Get data in and set up X_train / X_test / y_train
2.   Preprocess data / Write and Save Preprocessor function
3. Fit model on preprocessed data and save preprocessor function and model 
4. Generate predictions from X_test data and submit model to competition
5. Repeat submission process to improve place on leaderboard



## 1. Get data in and set up X_train, X_test, y_train objects

In [None]:
#install aimodelshare library
! pip install aimodelshare --upgrade

In [3]:
# Get competition data
from aimodelshare import download_data
download_data('public.ecr.aws/y2e2a1d6/neuralnet_subclasses_competition_data-repository:latest') 


Data downloaded successfully.


In [None]:
# Create training data objects 

# Extract images
!unzip "neuralnet_subclasses_competition_data/X_train.zip"
!unzip "neuralnet_subclasses_competition_data/X_test.zip" 

# Create ordered list of filepaths 
train_filepaths = [('/content/train_shuffle/' + str(i) + '.jpg') for i in range(0, 6472)]
test_filepaths = [('/content/test_shuffle/' + str(i) + '.jpg') for i in range(0, 9127)]

# Read in y_train data 
import pandas as pd 
y_train = pd.read_csv("neuralnet_subclasses_competition_data/y_train.csv")

##2.   Preprocess data / Write and Save Preprocessor function


In [32]:
# Here is a pre-designed preprocessor, but you could also build your own to prepare the data differently
import albumentations as A
def preprocessor(image_filepath, shape=(10, 10)):
        """
        This function preprocesses reads in images, resizes them to a fixed shape and
        min/max transforms them before converting feature values to float32 numeric values
        required by onnx files.
        
        params:
            image_filepath
                full filepath of a particular image
                      
        returns:
            X
                numpy array of preprocessed image data
                  
        """
           
        import cv2
        import numpy as np

        "Resize a color image and min/max transform the image"
        img = cv2.imread(image_filepath) # Read in image from filepath.
        img = cv2.resize(img, (64, 64), interpolation = cv2.INTER_CUBIC)
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) # cv2 reads in images in order of blue green and red, we reverse the order for ML.
        #img = cv2.resize(img, shape) # Change height and width of image.
        
        transform_data = A.Compose([A.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225), max_pixel_value=255.0, p=1)])
        img = transform_data(image = img)['image']  #custom transform
        #img = img / 255.0 # Min-max transform.


        # Resize all the images...
        X = np.array(img)
        X = np.expand_dims(X, axis=0) # Expand dims to add "1" to object shape [1, h, w, channels].
        X = np.array(X, dtype=np.float32) # Final shape for onnx runtime.

        # transpose image to pytorch format
        X = np.transpose(X, (0, 3, 1, 2))
        
        return X

In [33]:
# Use preprocessor to create X_train object 

# Import image, load to array of shape height, width, channels, then min/max transform...
# Read in all images from filenames...
import numpy as np 
import os 

preprocessed_image_data = [preprocessor(x) for x in train_filepaths]

# Object needs to be an array rather than a list for Keras. (vstack converts above list to array object.)
X_train = np.vstack(preprocessed_image_data)
# Assigning to X_train to highlight that this represents feature input data for our model.

In [34]:
# Preprocess X_test image data to generate predictions from models 
import numpy as np

preprocessed_image_data = [preprocessor(x) for x in test_filepaths]

#Create single X_test array from preprocessed images
X_test = np.vstack(preprocessed_image_data)
#Print the shape for verification 
print(X_test.shape)

(9127, 3, 64, 64)


##3. Fit model on preprocessed data and save preprocessor function and model 


In [35]:
#Let us build a baseline pytorch model and load the trained weights
import torch
import torch.nn as nn
import torchvision
from torchsummary import summary

# Load the EfficientNetB3 model trained on ImageNet
backbone_model = torchvision.models.efficientnet_b3(weights='IMAGENET1K_V1')
backbone_model.classifier = nn.Identity()

# input = torch.rand((1, 3, 64, 64))
# output = backbone_model(input)
# print(output.shape)

class BaselineModel(nn.Module):
    def __init__(self, backbone) -> None:
        super().__init__()
        self.backbone = backbone
        # 3 super classes and 89 + 1 (novel) sub classes
        # self.superclass = nn.Linear(in_features = 1536, out_features = 3)
        self.subclass = nn.Linear(in_features = 1536, out_features = 90)
    
    def forward(self, x):
        out = self.backbone(x)
        #super_class_out = self.superclass(out)
        sub_class_out = self.subclass(out)
        return sub_class_out

# Create the baseline EfficientNetB6 model
baseline_model = BaselineModel(backbone_model)

# input = torch.rand((1, 3, 64, 64))
# output = backbone_model(input)
# print(output.shape)
# print(baseline_model)

baseline_model.load_state_dict(torch.load('./Bestmodel_SubClass.pth', map_location=torch.device('cpu'))['model_state_dict'], strict = True)

<All keys matched successfully>

#### Save preprocessor function to local "preprocessor.zip" file

In [36]:
import aimodelshare as ai
ai.export_preprocessor(preprocessor,"") 

Your preprocessor is now saved to 'preprocessor.zip'


#### Save model to local ".onnx" file

In [51]:
# Save pytorch model to local ONNX file
from aimodelshare.aimsonnx import model_to_onnx
example_input = torch.randn(1, 3, 64, 64, requires_grad=True)

onnx_model = model_to_onnx(baseline_model, framework='pytorch',
                           model_input=example_input,
                          transfer_learning=True,
                          deep_learning=True)

with open("baseline_model.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

## 4. Generate predictions from X_test data and submit model to competition


In [29]:
#Set credentials using modelshare.org username/password

from aimodelshare.aws import set_credentials

# Note -- This is the unique rest api that powers this specific image classification Model Plaground
# ... Update the apiurl if submitting to a new competition

apiurl= "https://arj1w1ffm6.execute-api.us-east-1.amazonaws.com/prod/m"
set_credentials(apiurl=apiurl)

AI Modelshare Username:··········
AI Modelshare Password:··········
AI Model Share login credentials set successfully.


In [38]:
#Instantiate Competition

mycompetition= ai.Competition(apiurl)

In [50]:
#Submit Model 1: 

#-- Generate predicted y values (Model 1)
#Note: Keras predict returns the predicted column index location for classification models
# prediction_column_index=cnn1.predict(X_test).argmax(axis=1)
import torch
import numpy as np

X_test_tensor = torch.from_numpy(X_test)
#print(X_test_tensor.shape)

prediction_column_index = np.zeros((X_test_tensor.shape[0]))

for idx in range(0, len(X_test_tensor), 64):
    last_idx = min(idx + 64, len(X_test_tensor))
    if idx % (1280) == 0:
        print(idx)
    
    with torch.no_grad():
        prediction_column_index[idx:last_idx] = (torch.argmax(baseline_model(X_test_tensor[idx:last_idx]), dim = 1)).detach().cpu().numpy()

prediction_column_index = prediction_column_index.astype(np.int32)


# extract correct prediction labels 
# prediction_labels = [y_train.columns[i] for i in prediction_column_index]
prediction_labels = []
for i in prediction_column_index:
    if i != 89:
        prediction_labels.append(y_train.columns[i])  
    else:
        prediction_labels.append('novel')

# print(len(prediction_labels))
# print(prediction_labels[:10])

0
1280
2560
3840
5120
6400
7680
8960
9127
['common newt, Triturus vulgaris', 'pelican', 'redshank, Tringa totanus', 'silky terrier, Sydney silky', 'American coot, marsh hen, mud hen, water hen, Fulica americana', 'goldfinch, Carduelis carduelis', 'ostrich, Struthio camelus', 'common iguana, iguana, Iguana iguana', 'common iguana, iguana, Iguana iguana', 'Boston bull, Boston terrier']
(9127,)


In [52]:
# Submit Model 1 to Competition Leaderboard
mycompetition.submit_model(model_filepath = "baseline_model.onnx",
                                 preprocessor_filepath="preprocessor.zip",
                                 prediction_submission=prediction_labels)

Insert search tags to help users find your model (optional): deeplearning, subclass
Provide any useful notes about your model (optional): None

Your model has been submitted as model version 76

To submit code used to create this model or to view current leaderboard navigate to Model Playground: 

 https://www.modelshare.org/detail/model:2653


In [53]:
# Get leaderboard to explore current best model architectures

# Get raw data in pandas data frame
data = mycompetition.get_leaderboard()

# Stylize leaderboard data
mycompetition.stylize_leaderboard(data)

Unnamed: 0,accuracy,f1_score,precision,recall,ml_framework,transfer_learning,deep_learning,model_type,depth,num_params,dropout_layers,dense_layers,flatten_layers,maxpooling2d_layers,conv2d_layers,adaptiveavgpool2d_layers,identity_layers,batchnorm2d_layers,silu_act,relu_act,sigmoid_act,softmax_act,loss,optimizer,memory_size,username,version
0,100.00%,100.00%,100.00%,100.00%,unknown,,,unknown,,,,0.0,,,,,,,,,,,,,,ttwzc,10
1,26.07%,3.60%,4.36%,3.92%,unknown,,,unknown,,,,0.0,,,,,,,,,,,,,,gabeguo,34
2,43.40%,3.09%,5.56%,2.84%,unknown,,,unknown,,,,0.0,,,,,,,,,,,,,,ar4180,64
3,2.10%,3.36%,4.06%,4.13%,unknown,,,unknown,,,,0.0,,,,,,,,,,,,,,vs2778,27
4,2.10%,3.36%,4.06%,4.13%,unknown,,,unknown,,,,0.0,,,,,,,,,,,,,,vs2778,28
5,2.10%,3.37%,3.52%,4.16%,unknown,,,unknown,,,,0.0,,,,,,,,,,,,,,vs2778,22
6,23.18%,3.04%,4.05%,2.96%,unknown,,,unknown,,,,0.0,,,,,,,,,,,,,,vs2778,42
7,2.04%,3.28%,3.61%,4.00%,unknown,,,unknown,,,,0.0,,,,,,,,,,,,,,vs2778,20
8,47.68%,2.73%,4.29%,2.21%,unknown,,,unknown,,,,0.0,,,,,,,,,,,,,,gabeguo,37
9,2.08%,3.21%,2.91%,4.10%,keras,,True,Sequential,5.0,578265.0,,4.0,1.0,,,,,,,3.0,,1.0,str,RMSprop,2314112.0,COMS_NNDL,2


## 5. Repeat submission process to improve place on leaderboard


In [None]:
# Train and submit model 2 using same preprocessor (note that you could save a new preprocessor, but we will use the same one for this example).

with tf.device('/device:GPU:0'): # "/GPU:0": Short-hand notation for the first GPU of your machine that is visible to TensorFlow.
      cnn2 = Sequential()
      cnn2.add(Dense(64,  input_shape=(10, 10, 3), activation='relu'))
      cnn2.add(Dense(64, activation='relu'))
      cnn2.add(Dense(64, activation='relu'))
      cnn2.add(Flatten())
      cnn2.add(Dense(89, activation='softmax')) 
                                                  
      # Compile model
      cnn2.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])

      # Fitting the NN to the Training set
      cnn2.fit(X_train, y_train, 
                    epochs = 50, verbose=1, validation_split=.2)

In [None]:
# Save keras model to local ONNX file
from aimodelshare.aimsonnx import model_to_onnx

onnx_model = model_to_onnx(cnn2, framework='keras',
                          transfer_learning=False,
                          deep_learning=True)

with open("model2.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

In [None]:
#Submit Model 2: 

#-- Generate predicted y values (Model 2)
prediction_column_index=cnn2.predict(X_test).argmax(axis=1)

# extract correct prediction labels 
prediction_labels = [y_train.columns[i] for i in prediction_column_index]

# Submit Model 2 to Competition Leaderboard
mycompetition.submit_model(model_filepath = "model2.onnx",
                                 preprocessor_filepath="preprocessor.zip",
                                 prediction_submission=prediction_labels)

Insert search tags to help users find your model (optional): 
Provide any useful notes about your model (optional): 

Your model has been submitted as model version 2

To submit code used to create this model or to view current leaderboard navigate to Model Playground: 

 https://www.modelshare.org/detail/model:2653


In [None]:
# Compare two or more models
data=mycompetition.compare_models([1, 2], verbose=1)
mycompetition.stylize_compare(data)

Unnamed: 0,Model_1_Layer,Model_1_Shape,Model_1_Params,Model_2_Layer,Model_2_Shape,Model_2_Params
0,Conv2D,"[None, 10, 10, 16]",448,Dense,"[None, 10, 10, 64]",256.0
1,Conv2D,"[None, 10, 10, 16]",2320,Dense,"[None, 10, 10, 64]",4160.0
2,MaxPooling2D,"[None, 5, 5, 16]",0,Dense,"[None, 10, 10, 64]",4160.0
3,Conv2D,"[None, 5, 5, 16]",2320,Flatten,"[None, 6400]",0.0
4,Conv2D,"[None, 5, 5, 16]",2320,Dense,"[None, 89]",569689.0
5,MaxPooling2D,"[None, 2, 2, 16]",0,,,
6,Dropout,"[None, 2, 2, 16]",0,,,
7,Flatten,"[None, 64]",0,,,
8,Dense,"[None, 50]",3250,,,
9,Dropout,"[None, 50]",0,,,
