<p align="center"><img width="50%" src="https://aimodelsharecontent.s3.amazonaws.com/aimodshare_banner.jpg" /></p>


---




<p align="center"><h1 align="center">Quick Start: IMDB Review Text Classification Tutorial</h1> 

##### <p align="center">*Dataset Adapted From: Andrew L. Maas, Raymond E. Daly, Peter T. Pham, Dan Huang, Andrew Y. Ng, and Christopher Potts. (2011). Learning Word Vectors for Sentiment Analysis. The 49th Annual Meeting of the Association for Computational Linguistics (ACL 2011).* 
---

<h3 align="center">(Deploy model to an AI Model Share Model Playground REST API<br> and Web Dashboard in five easy steps...)</h3></p>
<p align="center"><img width="100%" src="https://aimodelsharecontent.s3.amazonaws.com/aimstutorialsteps.gif" /></p>


---



## **Credential Configuration**

In order to deploy an AI Model Share Model Playground, you will need a credentials text file. 

Generating your credentials file requires two sets of information: 
1. Your AI Model Share username and password (create them [HERE](https://www.modelshare.org/login)). 
2. Your AWS (Amazon Web Services) access keys (follow the tutorial [HERE](https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html)). 

You only need to generate your credentials file once. After running the configure function below, save the generated file and reuse it for all future Model Playground deployments and competition submissions. 

*Note: Treat your credentials file with the same care as your passwords. Do not share it with anyone, send it by email, or upload it to GitHub.*
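
One way to lower the risk of committing the file by accident is to list it in `.gitignore`. The helper below is a minimal sketch, not part of aimodelshare: the function name `add_to_gitignore` is hypothetical, and `credentials.txt` is assumed to be the filename you saved. The demonstration writes to a temporary directory so that it does not touch a real repository.

```python
from pathlib import Path
import tempfile


def add_to_gitignore(entry, gitignore_path=".gitignore"):
    """Append `entry` to a .gitignore file unless it is already listed.

    Returns True if the entry was added, False if it was already present.
    """
    path = Path(gitignore_path)
    existing = path.read_text().splitlines() if path.exists() else []
    if entry in existing:
        return False
    # Rewrite the file with the new entry on its own line
    path.write_text("\n".join(existing + [entry]) + "\n")
    return True


# Demonstration in a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / ".gitignore"
    print(add_to_gitignore("credentials.txt", target))  # True
    print(add_to_gitignore("credentials.txt", target))  # False
```

In a real project you would call `add_to_gitignore("credentials.txt")` from the repository root.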


In [None]:
#install aimodelshare library
! pip install aimodelshare --upgrade

In [None]:
# Generate credentials file
import aimodelshare as ai 
from aimodelshare.aws import configure_credentials 

configure_credentials()

## **Set up Environment**

Use your credentials file to set your credentials for all aimodelshare functions. 

In [2]:
# Set credentials 
import aimodelshare as ai
from aimodelshare.aws import set_credentials

set_credentials(credential_file="credentials.txt", type="deploy_model")

AI Model Share login credentials set successfully.
AWS credentials set successfully.


In [3]:
# Get materials for tutorial: Determine if a movie review is positive or negative

import aimodelshare as ai
X_train, X_test, y_train_labels, y_test_labels, example_data, model_1, model_2 = ai.import_quickstart_data("imdb")


Data downloaded successfully.

Preparing downloaded files for use...

Success! Your Quick Start materials have been downloaded. 
You are now ready to run the tutorial.


## **(1) Preprocessor Function & Setup**

### **Write a Preprocessor Function**


> ###   Preprocessor functions transform raw data into the exact format your model requires to generate predictions.  

*  *Preprocessor functions should always be named "preprocessor".*
*  *You can use any Python library in a preprocessor function, but all libraries should be imported inside your preprocessor function.*  

In [4]:
from tensorflow.keras.preprocessing.text import Tokenizer
import numpy as np

# Build vocabulary from training text data
tokenizer = Tokenizer(num_words=5000)
tokenizer.fit_on_texts(X_train)

# preprocessor tokenizes words and makes sure all documents have the same length
def preprocessor(data, maxlen=100, max_words=5000):
    # Import inside the function, as recommended above, so the saved preprocessor carries its own imports
    from tensorflow.keras.preprocessing.sequence import pad_sequences

    # Note: max_words is accepted for the calls below, but the vocabulary size is set by the tokenizer above
    sequences = tokenizer.texts_to_sequences(data)
    X = pad_sequences(sequences, maxlen=maxlen)

    return X

print(preprocessor(X_train).shape)
print(preprocessor(X_test).shape)

(37500, 100)
(12500, 100)
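
Every review ends up with exactly 100 columns because `pad_sequences` pads short sequences and truncates long ones. With its default settings it pads with zeros at the front and drops tokens from the front. The pure-Python sketch below imitates that default behavior for a single sequence. The helper name `pad_or_truncate` and the token ids are made up for illustration.

```python
def pad_or_truncate(sequence, maxlen=100, pad_value=0):
    """Imitate pad_sequences defaults for one sequence (maxlen must be positive)."""
    # 'pre' truncation: keep only the last `maxlen` tokens
    trimmed = sequence[-maxlen:]
    # 'pre' padding: fill the front with zeros up to `maxlen`
    padding = [pad_value] * (maxlen - len(trimmed))
    return padding + trimmed


short_review = [12, 7, 256]        # 3 hypothetical token ids
long_review = list(range(1, 9))    # 8 hypothetical token ids

print(pad_or_truncate(short_review, maxlen=5))  # [0, 0, 12, 7, 256]
print(pad_or_truncate(long_review, maxlen=5))   # [4, 5, 6, 7, 8]
```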


In [5]:
# Create one hot encoded data from list of y_train category labels
#...to allow modeltoapi() to extract correct labels for predictions in your deployed API
import pandas as pd 

y_train = pd.get_dummies(y_train_labels)
y_test = pd.get_dummies(y_test_labels)

#ensure column names are correct in one hot encoded target for correct label extraction
list(y_train.columns)

['negative', 'positive']
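
The column order matters because predicted column indices are later mapped back to label names. As far as I know, `pd.get_dummies` orders the columns by sorted label value, which matches the `['negative', 'positive']` output above. The sketch below reproduces that idea in plain Python. The helper `one_hot` is hypothetical and is not how pandas implements it.

```python
def one_hot(labels):
    """Return (columns, rows): sorted unique labels and one indicator row per label."""
    columns = sorted(set(labels))
    rows = [[1 if label == column else 0 for column in columns] for label in labels]
    return columns, rows


columns, rows = one_hot(["positive", "negative", "positive"])
print(columns)  # ['negative', 'positive']
print(rows)     # [[0, 1], [1, 0], [0, 1]]
```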

In [6]:
# Check shape of data preprocessed using your new preprocessor() function
print(preprocessor(X_train, maxlen=100, max_words=5000).shape)
print(preprocessor(X_test, maxlen=100, max_words=5000).shape)

(37500, 100)
(12500, 100)


## **(2) Build Model Using your Preferred ML Library**

### **We already loaded a Keras Sequential Model in the quickstart function above**

In [7]:
# Here is a pre-trained model, but you could train your own model after preprocessing data with your preprocessor function.

model_1.summary()

Model: "sequential_13"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding_13 (Embedding)    (None, 100, 16)           80000     
                                                                 
 flatten_13 (Flatten)        (None, 1600)              0         
                                                                 
 dense_13 (Dense)            (None, 2)                 3202      
                                                                 
Total params: 83,202
Trainable params: 83,202
Non-trainable params: 0
_________________________________________________________________
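
The parameter counts in this summary can be checked by hand. The sketch below assumes a vocabulary size of 5000 (inferred from `num_words=5000` and from 80,000 / 16) and a Dense layer with a bias term. The function names are illustrative only.

```python
def embedding_params(vocab_size, embedding_dim):
    """One embedding vector of length `embedding_dim` per vocabulary entry."""
    return vocab_size * embedding_dim


def dense_params(inputs, units):
    """One weight per input-unit pair, plus one bias per unit."""
    return inputs * units + units


embedding = embedding_params(5000, 16)   # Embedding layer
flattened = 100 * 16                     # Flatten: 100 tokens x 16 dimensions
dense = dense_params(flattened, 2)       # Dense layer with 2 output classes

print(embedding, dense, embedding + dense)  # 80000 3202 83202
```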


## **(3) Save Preprocessor**
### Save the preprocessor function to a "preprocessor.zip" file

In [8]:
import aimodelshare as ai
ai.export_preprocessor(preprocessor,"") # Second argument is the directory you want to use to save your function

Your preprocessor is now saved to 'preprocessor.zip'


In [9]:
#  Now let's import and test the preprocessor function to see if it is working...
prep = ai.import_preprocessor("preprocessor.zip")
prep(example_data).shape

(5, 100)

## **(4) Save Keras Model to ONNX File Format**


In [10]:
# Save Keras model to local ONNX file
from aimodelshare.aimsonnx import model_to_onnx

onnx_model = model_to_onnx(model_1, framework='keras',
                          transfer_learning=False,
                          deep_learning=True)

with open("model.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

## **(5) Create your Model Playground**

In [11]:
#Set up arguments for Model Playground deployment
import pandas as pd 

model_filepath="model.onnx"
preprocessor_filepath="preprocessor.zip"

In [12]:
from aimodelshare import ModelPlayground

#Instantiate ModelPlayground() Class

myplayground=ModelPlayground(model_type="text", classification=True, private=False)

# Create Model Playground (generates a live REST API and web app for your model/preprocessor)

myplayground.deploy(model_filepath, preprocessor_filepath, y_train, example_data) 

We need some information about your model before we can build your REST API and interactive Model Playground.
   
Model Name (for AI Model Share Website):IMDB Movie Review Classifier
Model Description (Explain what your model does and 
 why end-users would find your model useful):IMDb, also known as the Internet Movie Database, is an online database of movies, TV shows, celebrities, and awards. Registered users can write reviews and rate content that they've seen. Use this dataset to classify 50,000 'highly polarized' movie reviews as positive or negative.
Model Key Words (Search categories that describe your model, separated with commas):IMDB, movies, text, binary classification
   
Creating your prediction API. (This process may take several minutes.)


Success! Your Model Playground was created in 79 seconds. 
 Playground Url: "https://2hewh9dkmh.execute-api.us-east-1.amazonaws.com/prod/m"

You can now use your Model Playground.

Follow this link to explore your Model Playground's fu

## **Use your new Model Playground!**

Follow the link in the output above to:
- Generate predictions with your interactive web dashboard
- Access example code in Python, R, and Curl

Or, follow the rest of the tutorial to create a competition for your Model Playground and: 
- Access verified model performance metrics 
- Upload multiple models to a leaderboard 
- Easily compare model performance & structure 

## **Part 2: Create a Competition**

-------

After deploying your Model Playground, you can now create a competition. 

Creating a competition allows you to:
1. Verify the model performance metrics on aimodelshare.org
2. Submit models to a leaderboard
3. Grant access to other users to submit models to the leaderboard
4. Easily compare model performance and structure 

In [None]:
# Create list of authorized participants for competition
# Note that participants must create their modelshare.org account with the same email address listed here

emaillist=["emailaddress1@email.com", "emailaddress2@email.com", "emailaddress3@email.com"]

In [13]:
# Create Competition
# Note -- Make competition public (allow any AI Model Share user to submit models) 
# .... by excluding the email_list argument and including the 'public=True' argument 

myplayground.create_competition(data_directory='imdb_competition_data', 
                                y_test = y_test_labels, 
                                email_list=emaillist)
                               # public=True)


--INPUT COMPETITION DETAILS--

Enter competition name:IMDB Movie Review Classification Competition
Enter competition description:Read movie reviews from IMDB and determine if they are positive or negative.

--INPUT DATA DETAILS--

Note: (optional) Save an optional LICENSE.txt file in your competition data directory to make users aware of any restrictions on data sharing/usage.

Enter data description (i.e.- filenames denoting training and test data, file types, and any subfolders where files are stored):Data competition folder contains labeled X_test, X_train, and y_train_labels files
Enter optional data license descriptive name (e.g.- 'MIT, Apache 2.0, CC0, Other, etc.'):
Uploading your data. Please wait for a confirmation message.

 Success! Model competition created. 

You may now update your prediction API runtime model and verify evaluation metrics with the update_runtime_model() function.

To upload new models and/or preprocessors to this API, team members should use 
the follow

In [14]:
#Instantiate Competition
#--Note: If you start a new session, the first argument should be the Model Playground url in quotes. 
#--e.g.- mycompetition= ai.Competition("https://2121212.execute-api.us-east-1.amazonaws.com/prod/m")
#See Model Playground "Compete" tab for example model submission code.

mycompetition= ai.Competition(myplayground.playground_url)

In [None]:
# Add, remove, or completely update authorized participants for competition later
emaillist=["emailaddress4@email.com"]

mycompetition.update_access_list(email_list=emaillist,update_type="Add")

### **Submit Models**

In [15]:
#Authorized users can submit new models after setting credentials using modelshare.org username/password
from aimodelshare.aws import set_credentials

apiurl=myplayground.playground_url # example url from deployed playground: apiurl= "https://123456.execute-api.us-east-1.amazonaws.com/prod/m"
set_credentials(apiurl=apiurl)


AI Modelshare Username:··········
AI Modelshare Password:··········
AI Model Share login credentials set successfully.


In [16]:
#Submit Model 1: 

#-- Generate predicted values for Model 1
prediction_column_index = model_1.predict(preprocessor(X_test)).argmax(axis=1)
prediction_labels = [y_test.columns[i] for i in prediction_column_index]

# Submit Model 1 to Competition Leaderboard
mycompetition.submit_model(model_filepath = "model.onnx",
                                 preprocessor_filepath="preprocessor.zip",
                                 prediction_submission=prediction_labels)

Insert search tags to help users find your model (optional): 
Provide any useful notes about your model (optional): 

Your model has been submitted as model version 1

To submit code used to create this model or to view current leaderboard navigate to Model Playground: 

 https://www.modelshare.org/detail/model:1319
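
The submission cell turns each row of predicted class probabilities into a label by taking the index of the largest value and looking it up in the one-hot column names. The sketch below shows the same mapping in plain Python with made-up probabilities. The helper name `labels_from_probabilities` is hypothetical.

```python
def labels_from_probabilities(probabilities, class_names):
    """Map each row of class scores to the name of its highest-scoring class."""
    labels = []
    for row in probabilities:
        # Index of the largest score; ties resolve to the first index, like argmax
        best_index = max(range(len(row)), key=row.__getitem__)
        labels.append(class_names[best_index])
    return labels


class_names = ["negative", "positive"]  # same order as the one-hot columns
example_probabilities = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]

print(labels_from_probabilities(example_probabilities, class_names))
# ['negative', 'positive', 'negative']
```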


### **Model 2:**

In [17]:
# Inspect and submit Model 2 using the same preprocessor (you could save a new preprocessor, but this example reuses the first one).
model_2.summary()

Model: "sequential_14"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding_14 (Embedding)    (None, 100, 16)           80000     
                                                                 
 lstm_6 (LSTM)               (None, 32)                6272      
                                                                 
 flatten_14 (Flatten)        (None, 32)                0         
                                                                 
 dense_14 (Dense)            (None, 2)                 66        
                                                                 
Total params: 86,338
Trainable params: 86,338
Non-trainable params: 0
_________________________________________________________________
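
The LSTM row can be verified with the standard parameter formula: four gates, each with input weights, recurrent weights, and a bias. The sketch below assumes the default Keras LSTM layout with a bias term. The function name is illustrative only.

```python
def lstm_params(input_dim, units):
    """Four gates, each with (input_dim + units) * units weights and `units` biases."""
    return 4 * ((input_dim + units) * units + units)


embedding = 5000 * 16            # 80000, vocabulary size inferred from num_words=5000
lstm = lstm_params(16, 32)       # 16-dimensional embeddings into 32 LSTM units
dense = 32 * 2 + 2               # 32 inputs, 2 output classes, plus biases

print(lstm, dense, embedding + lstm + dense)  # 6272 66 86338
```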


In [18]:
# Save Model 2 to .onnx file
from aimodelshare.aimsonnx import model_to_onnx

onnx_model = model_to_onnx(model_2, framework='keras',
                          transfer_learning=False,
                          deep_learning=True)

# Save model to local .onnx file
with open("model_2.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString()) 

In [19]:
# Submit Model 2

#-- Generate predicted y values (Model 2)
prediction_column_index = model_2.predict(preprocessor(X_test)).argmax(axis=1)
prediction_labels = [y_test.columns[i] for i in prediction_column_index]

# Submit Model 2 to Competition Leaderboard
mycompetition.submit_model(model_filepath = "model_2.onnx",
                                 prediction_submission=prediction_labels,
                                 preprocessor_filepath="preprocessor.zip")

Insert search tags to help users find your model (optional): 
Provide any useful notes about your model (optional): 

Your model has been submitted as model version 2

To submit code used to create this model or to view current leaderboard navigate to Model Playground: 

 https://www.modelshare.org/detail/model:1319


### **Get Leaderboard**

In [20]:
data = mycompetition.get_leaderboard()
mycompetition.stylize_leaderboard(data)

| | accuracy | f1_score | precision | recall | ml_framework | transfer_learning | deep_learning | model_type | depth | num_params | dense_layers | embedding_layers | flatten_layers | lstm_layers | softmax_act | tanh_act | loss | optimizer | model_config | memory_size | username | version |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 84.72% | 84.69% | 85.02% | 84.72% | keras | False | True | Sequential | 3 | 83202 | 1 | 1 | 1 | | 1 | | function | RMSprop | {'name': 'sequential_13', 'lay... | 146208 | AIModelShare | 1 |
| 1 | 84.33% | 84.30% | 84.62% | 84.33% | keras | False | True | Sequential | 4 | 86338 | 1 | 1 | 1 | 1.0 | 1 | 1.0 | function | RMSprop | {'name': 'sequential_14', 'lay... | 1207288 | AIModelShare | 2 |
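
For intuition about the four metric columns, the sketch below computes accuracy and macro-averaged precision, recall, and F1 from two label lists. Macro averaging is an assumption here; the tutorial does not say how the leaderboard averages across classes. The labels are toy data, not competition results.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall and F1 (averaging scheme assumed)."""
    pairs = list(zip(y_true, y_pred))
    classes = sorted(set(y_true))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1_scores = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)  # true positives for class c
        fp = sum(t != c and p == c for t, p in pairs)  # false positives
        fn = sum(t == c and p != c for t, p in pairs)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)

    n = len(classes)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1_score": sum(f1_scores) / n,
    }


y_true = ["negative", "negative", "positive", "positive"]
y_pred = ["negative", "positive", "positive", "positive"]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

With balanced classes, macro-averaged recall equals accuracy, which is consistent with the matching accuracy and recall values in the leaderboard above.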


### **Compare Models**

In [21]:
# Compare two or more models 
data=mycompetition.compare_models([1,2], verbose=1)
mycompetition.stylize_compare(data)

| | Model_1_Layer | Model_1_Shape | Model_1_Params | Model_2_Layer | Model_2_Shape | Model_2_Params |
|---|---|---|---|---|---|---|
| 0 | Embedding | [None, 100, 16] | 80000.0 | Embedding | [None, 100, 16] | 80000 |
| 1 | Flatten | [None, 1600] | 0.0 | LSTM | [None, 32] | 6272 |
| 2 | Dense | [None, 2] | 3202.0 | Flatten | [None, 32] | 0 |
| 3 | | | | Dense | [None, 2] | 66 |


#### Check structure of y test data 
(This helps users understand how to submit predicted values to leaderboard)

In [22]:
mycompetition.inspect_y_test()

{'class_balance': {'negative': 6250, 'positive': 6250},
 'class_labels': ['positive', 'negative'],
 'label_dtypes': {"<class 'str'>": 12500},
 'y_length': 12500,
 'ytest_example': ['negative', 'negative', 'positive', 'positive', 'negative']}
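
Participants could use this structure to sanity-check a prediction list before submitting. The sketch below is one possible check, assuming a dictionary with the same `class_labels` and `y_length` keys shown above. The helper `check_predictions` is hypothetical and not part of aimodelshare, and the example dictionary is shortened to three rows.

```python
def check_predictions(predictions, y_test_info):
    """Return a list of problems; an empty list means the predictions look submittable."""
    problems = []

    # The number of predictions must match the number of test rows
    expected_length = y_test_info["y_length"]
    if len(predictions) != expected_length:
        problems.append(f"expected {expected_length} predictions, got {len(predictions)}")

    # Every prediction must be one of the known class labels
    allowed = set(y_test_info["class_labels"])
    unexpected = sorted({repr(p) for p in predictions if p not in allowed})
    if unexpected:
        problems.append("unexpected labels: " + ", ".join(unexpected))

    return problems


# Shortened stand-in for the inspect_y_test() output
y_test_info = {"class_labels": ["positive", "negative"], "y_length": 3}

print(check_predictions(["negative", "positive", "negative"], y_test_info))  # []
print(check_predictions(["negative", 1], y_test_info))
# ['expected 3 predictions, got 2', 'unexpected labels: 1']
```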

## **Part 3: Maintaining your Model Playground**

-------

### **Update Runtime Model**

*Use this function to 1) update the prediction API behind your Model Playground with a new model, chosen from the leaderboard and 2) verify the model performance metrics in your Model Playground*

In [23]:
myplayground.update_runtime_model(model_version=1)

Runtime model & preprocessor for api: https://2hewh9dkmh.execute-api.us-east-1.amazonaws.com/prod/m updated to model version 1.

Model metrics are now updated and verified for this model playground.


### **Delete Deployment**

*Use this function to delete the entire Model Playground, including the REST API, web dashboard, competition, and all submitted models*

In [None]:
myplayground.delete_deployment()

Running this function will permanently delete all resources tied to this deployment, 
 including the eval lambda and all models submitted to the model competition.

To confirm, type 'permanently delete':permanently delete


'Deployment deleted successfully.'