<p align="center"><img width="50%" src="https://aimodelsharecontent.s3.amazonaws.com/aimodshare_banner.jpg" /></p>


---
**Source:**<br>
*Dataset adapted from: Soomro, K., Zamir, A. R., & Shah, M. (2012). UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild. Center for Research in Computer Vision, University of Central Florida. https://arxiv.org/pdf/1212.0402v1.pdf*


# Sports Clips Video Classification Competition Model Submission Guide

---

Let's share our models on a centralized leaderboard so that we can collaborate and learn from the model experimentation process.

**Instructions:**
1. Get data in and set up X_train / X_test / y_train
2. Preprocess data / write and save preprocessor function
3. Fit model on preprocessed data and save preprocessor function and model
4. Generate predictions from X_test data and submit model to competition
5. Repeat submission process to improve place on leaderboard



## 1. Get data in and set up X_train, X_test, y_train objects

In [None]:
#install aimodelshare library
! pip install aimodelshare --upgrade

In [2]:
# Get competition data
from aimodelshare import download_data
download_data('public.ecr.aws/y2e2a1d6/sports_clips_competition_data-repository:latest') 


Data downloaded successfully.


## 2. Preprocess data / Write and Save Preprocessor function


In [3]:
def preprocessor(video, num_frames=60, gap_frames=3, **kwargs):

    """
      Extracts frames from a single video file, resizes each frame to a fixed
      size of (128x128) pixels, scales pixel values to [0, 1], and flattens each
      frame so that it acts as the feature vector for one time step.
      
      params:
          video
              path to the video file to be processed

          num_frames
              the number of frames to be extracted from the video. If there aren't
              sufficient frames, the sequence is padded with all-zero frames
              
          gap_frames
              the number of frames after which we extract the next frame. If =1,
              contiguous frames are extracted
              
      returns:
          X
              float32 array of shape (1, num_frames, 128*128*3) holding the
              flattened frames of the input video
      
    """

    import cv2
    import numpy as np

    vidcap = cv2.VideoCapture(video)
    
    frames = []

    success, frame = vidcap.read()
    idx = 0

    while success:
        # for each frame, do if we satisfy the gap_frames parameter
        if idx % gap_frames == 0:
            # convert to RGB (default cv2 is BGR)
            # so that the channel order matches the RGB convention most image tooling expects
            frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            frame = cv2.resize(frame, (128, 128))

            # scale pixel values from [0, 255] to [0, 1]
            frame = frame / 255.0

            # flatten the features and append to a list of features for this video
            frame = frame.reshape(-1)

            frames.append(frame)

            if len(frames) >= num_frames:
                break

        idx += 1
        success, frame = vidcap.read()

    # if number of timesteps or frames < num_frames, pad with all-zero frames
    # (a fixed shape is used so that padding also works when no frame could be read)
    while len(frames) < num_frames:
        frames.append(np.zeros(128 * 128 * 3))

    X = np.array(frames)
    X = np.expand_dims(X, axis=0)
    X = np.array(X, dtype=np.float32)

    return X
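
The sampling and padding logic of `preprocessor` can be traced without OpenCV. The sketch below is only an illustration: it treats a clip as a count of readable frames, and `sampled_indices` and `zero_frames_needed` are hypothetical helper names that are not part of the notebook or of aimodelshare.

```python
def sampled_indices(total_frames, num_frames=60, gap_frames=3):
    """Indices of the frames the preprocessor would keep (hypothetical helper)."""
    kept = []
    for idx in range(total_frames):
        # keep every gap_frames-th frame, starting with frame 0
        if idx % gap_frames == 0:
            kept.append(idx)
            # stop as soon as enough frames have been collected
            if len(kept) >= num_frames:
                break
    return kept


def zero_frames_needed(total_frames, num_frames=60, gap_frames=3):
    """Number of all-zero frames appended to reach num_frames."""
    return num_frames - len(sampled_indices(total_frames, num_frames, gap_frames))


# each kept frame is resized to 128x128 RGB and flattened
features_per_frame = 128 * 128 * 3

print(features_per_frame)          # 49152
print(sampled_indices(10))         # [0, 3, 6, 9]
print(zero_frames_needed(10))      # 56
print(sampled_indices(500)[-1])    # 177
```

With the default arguments, the last kept index is 177, so a clip needs at least 178 readable frames to fill all 60 time steps without zero padding.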

In [4]:
## Prepare data: 
# Unzip video clips
import zipfile
with zipfile.ZipFile('sports_clips_competition_data/clips_test.zip', 'r') as zip_ref:
    zip_ref.extractall('X_test_clips')

with zipfile.ZipFile('sports_clips_competition_data/clips_train.zip', 'r') as zip_ref:
    zip_ref.extractall('X_train_clips')

# Preprocess clips
import numpy as np
file_names_test = [('X_test_clips/clips_test/' + str(i) + '.avi') for i in range(1, 46)]
preprocessed_test_data = [preprocessor(x) for x in file_names_test]

file_names_train = [('X_train_clips/clips_train/' + str(i) + '.avi') for i in range(1, 106)]
preprocessed_train_data = [preprocessor(x) for x in file_names_train]

#Create arrays from preprocessed videos
X_test = np.vstack(preprocessed_test_data) 
X_train = np.vstack(preprocessed_train_data) 
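
Because the file names are built from a fixed numbering scheme, it can help to confirm that every clip exists before preprocessing: a path that cannot be opened typically yields no frames rather than an error. The check below is a minimal sketch using only the standard library; `missing_clips` is a hypothetical helper, not an aimodelshare function.

```python
from pathlib import Path


def missing_clips(file_names):
    """Return the paths in file_names that do not exist on disk (hypothetical helper)."""
    return [name for name in file_names if not Path(name).is_file()]


# same naming scheme as the notebook: 1.avi ... 45.avi
file_names_test = ['X_test_clips/clips_test/' + str(i) + '.avi' for i in range(1, 46)]

missing = missing_clips(file_names_test)
if missing:
    print(f"{len(missing)} of {len(file_names_test)} clips not found, e.g. {missing[0]}")
else:
    print("All clips found.")
```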

In [5]:
# read in y_train data object
import pandas as pd

y_train = pd.read_csv("sports_clips_competition_data/y_train.csv")

In [6]:
# Check shape 

print(X_train.shape)
print(X_test.shape)
print(y_train.shape)

(105, 60, 49152)
(45, 60, 49152)
(105, 3)
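
These shapes imply a sizeable memory footprint, since every clip is stored as 60 flattened 128x128 RGB frames in float32. The arithmetic below is a rough sketch that counts only the raw array values (4 bytes each) and ignores any overhead; `array_size_bytes` is a hypothetical helper.

```python
def array_size_bytes(shape, bytes_per_value=4):
    """Bytes needed for a dense array of the given shape (float32 = 4 bytes per value)."""
    n_values = 1
    for dim in shape:
        n_values *= dim
    return n_values * bytes_per_value


x_train_bytes = array_size_bytes((105, 60, 49152))
x_test_bytes = array_size_bytes((45, 60, 49152))

print(x_train_bytes)                              # 1238630400
print(x_test_bytes)                               # 530841600
print(round(x_train_bytes / 1024**3, 2), "GiB")   # 1.15 GiB
```

Lowering the frame resolution, `num_frames`, or both in the preprocessor is the most direct way to shrink these arrays.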


## 3. Fit model on preprocessed data and save preprocessor function and model 


In [7]:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation, Dropout
from tensorflow.keras.layers import LSTM  # import from tensorflow.keras to avoid mixing two Keras packages

# defining the model
hidden_size = 50
dense_out_1 = 20
dense_out_2 = 3

video_1 = Sequential()
video_1.add(LSTM(hidden_size))
video_1.add(Dense(dense_out_1, activation="relu"))
video_1.add(Dense(dense_out_2, activation="softmax"))

video_1.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])

with tf.device('/device:GPU:0'): # first GPU of your machine that is visible to TensorFlow ("/GPU:0" is the short-hand form)

  history = video_1.fit(
      X_train,
      y_train,
      epochs=1,
      batch_size=1,
      validation_split=0.2,
      verbose=2)

84/84 - 129s - loss: 1.0299 - accuracy: 0.5476 - val_loss: 0.8983 - val_accuracy: 0.5714 - 129s/epoch - 2s/step
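
The `84/84` in the training log follows from the split and the batch size. The sketch below reproduces that arithmetic under the assumption that the split point is `floor(n_samples * (1 - validation_split))` and that the validation part is taken from the end of the data; `keras_style_split` is a hypothetical helper, not a Keras function.

```python
import math


def keras_style_split(n_samples, validation_split):
    """Sizes of the training and validation parts, assuming the split point is
    floor(n_samples * (1 - validation_split)) and the validation part is the tail."""
    split_at = math.floor(n_samples * (1.0 - validation_split))
    return split_at, n_samples - split_at


n_train, n_val = keras_style_split(105, 0.2)
batch_size = 1
steps_per_epoch = math.ceil(n_train / batch_size)

print(n_train, n_val)     # 84 21
print(steps_per_epoch)    # 84
```

With only 21 validation clips, each clip moves the validation accuracy by almost five percentage points; the reported `val_accuracy` of 0.5714 is consistent with 12 of 21 clips classified correctly.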


#### Save preprocessor function to local "preprocessor.zip" file

In [8]:
import aimodelshare as ai
ai.export_preprocessor(preprocessor,"") 

Your preprocessor is now saved to 'preprocessor.zip'


#### Save model to local ".onnx" file

In [9]:
# Save keras model to local ONNX file
from aimodelshare.aimsonnx import model_to_onnx

onnx_model = model_to_onnx(video_1, framework='keras',
                          transfer_learning=False,
                          deep_learning=True)

with open("model.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

## 4. Generate predictions from X_test data and submit model to competition


In [10]:
#Set credentials using modelshare.org username/password

from aimodelshare.aws import set_credentials
    
apiurl="https://bd5gfx9wj3.execute-api.us-east-1.amazonaws.com/prod/m" #This is the unique rest api that powers this Sports Clips Classification Model Playground

set_credentials(apiurl=apiurl)

AI Modelshare Username:··········
AI Modelshare Password:··········
AI Model Share login credentials set successfully.


In [11]:
#Instantiate Competition

mycompetition= ai.Competition(apiurl)

In [12]:
#Submit Model 1: 

#-- Generate predicted y values (Model 1)
#Note: Keras predict returns class probabilities; argmax(axis=1) converts them to the predicted column index
prediction_column_index=video_1.predict(X_test).argmax(axis=1)

# extract correct prediction labels 
prediction_labels = [y_train.columns[i] for i in prediction_column_index]

# Submit Model 1 to Competition Leaderboard
mycompetition.submit_model(model_filepath = "model.onnx",
                                 preprocessor_filepath="preprocessor.zip",
                                 prediction_submission=prediction_labels)

Insert search tags to help users find your model (optional): 
Provide any useful notes about your model (optional): 

Your model has been submitted as model version 3

To submit code used to create this model or to view current leaderboard navigate to Model Playground: 

 https://www.modelshare.org/detail/model:1658
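
The step from model output to submitted labels can be looked at in isolation. The sketch below uses plain Python lists in place of NumPy arrays; the class names and probabilities are made-up stand-ins (the real names come from `y_train.columns`), and `argmax` is a small hypothetical helper.

```python
def argmax(values):
    """Index of the largest value in a sequence."""
    return max(range(len(values)), key=lambda i: values[i])


# hypothetical stand-ins: the real names come from y_train.columns
label_columns = ["class_a", "class_b", "class_c"]

# hypothetical softmax output for three clips (each row sums to 1)
probabilities = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
]

prediction_column_index = [argmax(row) for row in probabilities]
prediction_labels = [label_columns[i] for i in prediction_column_index]

print(prediction_column_index)   # [0, 2, 1]
print(prediction_labels)         # ['class_a', 'class_c', 'class_b']
```

The mapping only works if the column order of `y_train` matches the order of the model's output units, which holds here because the model was trained on `y_train` directly.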


In [13]:
# Get leaderboard to explore current best model architectures

# Get raw data in pandas data frame
data = mycompetition.get_leaderboard()

# Stylize leaderboard data
mycompetition.stylize_leaderboard(data)

| | accuracy | f1_score | precision | recall | ml_framework | transfer_learning | deep_learning | model_type | depth | num_params | lstm_layers | dense_layers | softmax_act | tanh_act | relu_act | loss | optimizer | memory_size | username | version |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 69.57% | 55.46% | 52.94% | 61.90% | keras | False | True | Sequential | 3 | 983398 | 2 | 1 | 1 | 2 | | function | Adam | 2218656 | AIModelShare | 2 |
| 1 | 60.87% | 60.93% | 62.63% | 59.68% | keras | False | True | Sequential | 3 | 9841683 | 1 | 2 | 1 | 1 | 1.0 | str | Adam | 5202352 | AIModelShare | 3 |
| 2 | 52.17% | 48.16% | 45.10% | 61.90% | keras | False | True | Sequential | 3 | 9841683 | 1 | 2 | 1 | 1 | 1.0 | function | Adam | 219360 | AIModelShare | 1 |


## 5. Repeat submission process to improve place on leaderboard


In [14]:
# Train and submit model 2 using same preprocessor (note that you could save a new preprocessor, but we will use the same one for this example).

# defining the model
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation, Dropout
from tensorflow.keras.layers import LSTM  # import from tensorflow.keras to avoid mixing two Keras packages

hidden_size = 5
dense_out = 3

video_2 = Sequential()
video_2.add(LSTM(hidden_size, return_sequences=True, dropout=.2, input_shape=(60, 49152)))
video_2.add(LSTM(hidden_size, dropout=.2))
video_2.add(Dense(dense_out, activation="softmax"))

video_2.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])

with tf.device('/device:GPU:0'):

  history = video_2.fit(
      X_train,
      y_train,
      epochs=1,
      batch_size=1,
      validation_split=0.2,
      verbose=2
  )

84/84 - 21s - loss: 1.0551 - accuracy: 0.3929 - val_loss: 1.0618 - val_accuracy: 0.5238 - 21s/epoch - 247ms/step


In [15]:
# Save keras model to local ONNX file
from aimodelshare.aimsonnx import model_to_onnx

onnx_model = model_to_onnx(video_2, framework='keras',
                          transfer_learning=False,
                          deep_learning=True)

with open("model2.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

In [16]:
#Submit Model 2: 

#-- Generate predicted y values (Model 2)
prediction_column_index=video_2.predict(X_test).argmax(axis=1)

# extract correct prediction labels 
prediction_labels = [y_train.columns[i] for i in prediction_column_index]

# Submit Model 2 to Competition Leaderboard
mycompetition.submit_model(model_filepath = "model2.onnx",
                                 preprocessor_filepath="preprocessor.zip",
                                 prediction_submission=prediction_labels)

Insert search tags to help users find your model (optional): 
Provide any useful notes about your model (optional): 

Your model has been submitted as model version 4

To submit code used to create this model or to view current leaderboard navigate to Model Playground: 

 https://www.modelshare.org/detail/model:1658


In [17]:
# Compare two or more models 
data=mycompetition.compare_models([3,4], verbose=1)
mycompetition.stylize_compare(data)

| | Model_3_Layer | Model_3_Shape | Model_3_Params | Model_4_Layer | Model_4_Shape | Model_4_Params |
|---|---|---|---|---|---|---|
| 0 | LSTM | [None, 50] | 9840600 | LSTM | [None, 60, 5] | 983160 |
| 1 | Dense | [None, 20] | 1020 | LSTM | [None, 5] | 220 |
| 2 | Dense | [None, 3] | 63 | Dense | [None, 3] | 18 |
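
The parameter counts in this comparison can be reproduced by hand, which shows why the first LSTM layer dominates both models. The sketch below uses the standard parameter formulas for LSTM and fully connected layers; `lstm_params` and `dense_params` are hypothetical helper names.

```python
def lstm_params(input_dim, units):
    """Trainable parameters of a standard LSTM layer: four gates, each with
    an input kernel, a recurrent kernel and a bias."""
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim, units):
    """Trainable parameters of a fully connected layer (weights plus biases)."""
    return input_dim * units + units


features = 128 * 128 * 3  # 49152 values per frame

# model version 3: LSTM(50) -> Dense(20) -> Dense(3)
model_3 = [lstm_params(features, 50), dense_params(50, 20), dense_params(20, 3)]
# model version 4: LSTM(5) -> LSTM(5) -> Dense(3)
model_4 = [lstm_params(features, 5), lstm_params(5, 5), dense_params(5, 3)]

print(model_3, sum(model_3))   # [9840600, 1020, 63] 9841683
print(model_4, sum(model_4))   # [983160, 220, 18] 983398
```

Nearly all parameters sit in the weights that connect the 49152 input features to the first LSTM layer, so shrinking the per-frame feature vector reduces model size far more than changing the later layers.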


## Optional: Tune model within range of hyperparameters with Keras Tuner

*Simple example shown below. Consult [documentation](https://keras.io/guides/keras_tuner/getting_started/) to see full functionality.*

In [None]:
! pip install keras_tuner

In [19]:
#Separate validation data 
from sklearn.model_selection import train_test_split
x_train_split, x_val, y_train_split, y_val = train_test_split(
     X_train, y_train, test_size=0.2, random_state=42)

In [20]:
import keras_tuner as kt
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation, Dropout
from tensorflow.keras.layers import LSTM  # import from tensorflow.keras to avoid mixing two Keras packages

#Define model structure & parameter search space with function
def build_model(hp):
    model = Sequential()
    model.add(LSTM(5, return_sequences=True, dropout=.2, input_shape=(60, 49152)))
    model.add(LSTM(units=hp.Int("units", min_value=5, max_value=32, step=3), #range 5-32 inclusive, minimum step between tested values is 3
                     return_sequences=True, dropout=0.2))
    model.add(LSTM(3, dropout=.2))
    model.add(Dense(3, activation="softmax"))

    model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
    return model 

#initialize the tuner (which will search through parameters)
tuner = kt.RandomSearch(
    hypermodel=build_model, 
    objective="val_accuracy", # objective to optimize
    max_trials=2, #max number of trials to run during search
    executions_per_trial=1, #higher number reduces variance of results; gauges model performance more accurately 
    overwrite=True,
    directory="tuning_model",
    project_name="tuning_units",
)

tuner.search(x_train_split, y_train_split, epochs=1, validation_data=(x_val, y_val))

Trial 2 Complete [00h 00m 17s]
val_accuracy: 0.1428571492433548

Best val_accuracy So Far: 0.4761904776096344
Total elapsed time: 00h 00m 52s
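
The search space defined by `hp.Int("units", min_value=5, max_value=32, step=3)` is small enough to list. The sketch below enumerates the candidate values and then draws two of them at random as an illustrative stand-in for what a random search with `max_trials=2` explores; it is not how Keras Tuner is implemented internally.

```python
import random

# values described by hp.Int("units", min_value=5, max_value=32, step=3)
candidate_units = list(range(5, 32 + 1, 3))
print(candidate_units)        # [5, 8, 11, 14, 17, 20, 23, 26, 29, 32]
print(len(candidate_units))   # 10

# illustrative stand-in for random search: draw max_trials distinct candidates
rng = random.Random(0)        # fixed seed only to make the sketch repeatable
max_trials = 2
trial_units = rng.sample(candidate_units, k=max_trials)
print(trial_units)
```

Two trials cover only a fifth of the ten candidates, so raising `max_trials` (and `epochs`) is the obvious next step once the pipeline runs end to end.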


In [21]:
# Build model with best hyperparameters

# Get the hyperparameters of the top 2 trials (the search above ran max_trials=2).
best_hps = tuner.get_best_hyperparameters(2)
# Build the model with the best hp.
tuned_model = build_model(best_hps[0])
# Fit with the entire dataset.
tuned_model.fit(x=X_train, y=y_train, epochs=1)




<keras.callbacks.History at 0x7f66ba32dfd0>

In [22]:
# Save keras model to local ONNX file
from aimodelshare.aimsonnx import model_to_onnx

onnx_model = model_to_onnx(tuned_model, framework='keras',
                          transfer_learning=False,
                          deep_learning=True)

with open("tuned_model.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

In [23]:
#Submit Model 3: 

#-- Generate predicted y values (Model 3)
prediction_column_index=tuned_model.predict(X_test).argmax(axis=1)

# extract correct prediction labels 
prediction_labels = [y_train.columns[i] for i in prediction_column_index]

# Submit Model 3 to Competition Leaderboard
mycompetition.submit_model(model_filepath = "tuned_model.onnx",
                                 preprocessor_filepath="preprocessor.zip",
                                 prediction_submission=prediction_labels)

Insert search tags to help users find your model (optional): 
Provide any useful notes about your model (optional): 

Your model has been submitted as model version 5

To submit code used to create this model or to view current leaderboard navigate to Model Playground: 

 https://www.modelshare.org/detail/model:1658


In [24]:
# Get leaderboard

data = mycompetition.get_leaderboard()
mycompetition.stylize_leaderboard(data)

| | accuracy | f1_score | precision | recall | ml_framework | transfer_learning | deep_learning | model_type | depth | num_params | lstm_layers | dense_layers | softmax_act | tanh_act | relu_act | loss | optimizer | memory_size | username | version |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 60.87% | 60.93% | 62.63% | 59.68% | keras | False | True | Sequential | 3 | 9841683 | 1 | 2 | 1 | 1 | 1.0 | str | Adam | 5202352 | AIModelShare | 3 |
| 1 | 69.57% | 55.46% | 52.94% | 61.90% | keras | False | True | Sequential | 3 | 983398 | 2 | 1 | 1 | 2 | | function | Adam | 2218656 | AIModelShare | 2 |
| 2 | 52.17% | 48.16% | 45.10% | 61.90% | keras | False | True | Sequential | 3 | 9841683 | 1 | 2 | 1 | 1 | 1.0 | function | Adam | 219360 | AIModelShare | 1 |
| 3 | 47.83% | 44.44% | 44.44% | 57.14% | keras | False | True | Sequential | 3 | 983398 | 2 | 1 | 1 | 2 | | str | Adam | 11272336 | AIModelShare | 4 |
| 4 | 39.13% | 39.32% | 54.76% | 43.97% | keras | False | True | Sequential | 4 | 985540 | 3 | 1 | 1 | 3 | | str | Adam | 16002608 | AIModelShare | 5 |


In [25]:
# Compare two or more models 
data=mycompetition.compare_models([3, 4, 5], verbose=1)
mycompetition.stylize_compare(data)

| | Model_3_Layer | Model_3_Shape | Model_3_Params | Model_4_Layer | Model_4_Shape | Model_4_Params | Model_5_Layer | Model_5_Shape | Model_5_Params |
|---|---|---|---|---|---|---|---|---|---|
| 0 | LSTM | [None, 50] | 9840600 | LSTM | [None, 60, 5] | 983160 | LSTM | [None, 60, 5] | 983160 |
| 1 | Dense | [None, 20] | 1020 | LSTM | [None, 5] | 220 | LSTM | [None, 60, 20] | 2080 |
| 2 | Dense | [None, 3] | 63 | Dense | [None, 3] | 18 | LSTM | [None, 3] | 288 |
| 3 | | | | | | | Dense | [None, 3] | 12 |
