<p align="center"><img width="50%" src="https://aimodelsharecontent.s3.amazonaws.com/aimodshare_banner.jpg" /></p>


---




<p align="center"><h1 align="center">Quick Start: Clickbait Detection Text Classification Tutorial</h1> 

##### <p align="center">*Dataset Adapted From: Abhijnan Chakraborty, Bhargavi Paranjape, Sourya Kakarla, and Niloy Ganguly. "Stop Clickbait: Detecting and Preventing Clickbaits in Online News Media". In Proceedings of the 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), San Francisco, US, August 2016.* 
---

<h3 align="center">(Deploy model to an AI Model Share Model Playground REST API<br> and Web Dashboard in five easy steps...)</h3></p>
<p align="center"><img width="100%" src="https://aimodelsharecontent.s3.amazonaws.com/aimstutorialsteps.gif" /></p>


---



## **Credential Configuration**

In order to deploy an AI Model Share Model Playground, you will need a credentials text file. 

Generating your credentials file requires two sets of information: 
1. Your AI Model Share username and password (create them [HERE](https://www.modelshare.org/login)). 
2. Your AWS (Amazon Web Services) access keys (follow the tutorial [HERE](https://aimodelshare.readthedocs.io/en/latest/create_credentials.html)). 

You only need to generate your credentials file once. After running the configure function below, save the generated file for all your future Model Playground deployments and competition submissions. 

*Note: Handle your credentials file with the same level of security as your passwords. Do not share the file with anyone, send it via email, or upload it to GitHub.*
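
One way to lower the risk of committing the file by accident is to list it in `.gitignore`. The sketch below assumes the file is named `credentials.txt` (the name used later in this tutorial) and that you are working inside a Git repository. The helper `ignore_credentials_file` is hypothetical and is not part of aimodelshare.

```python
from pathlib import Path

def ignore_credentials_file(repo_dir=".", filename="credentials.txt"):
    """Append `filename` to .gitignore in `repo_dir` unless it is already listed."""
    gitignore = Path(repo_dir) / ".gitignore"
    text = gitignore.read_text() if gitignore.exists() else ""

    # Nothing to do if the file name is already on its own line
    if filename in (line.strip() for line in text.splitlines()):
        return False

    # Start on a fresh line if the existing file does not end with a newline
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    with gitignore.open("a") as f:
        f.write(prefix + filename + "\n")
    return True
```

Calling `ignore_credentials_file()` from the repository root returns `True` the first time and `False` once the entry exists.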


In [None]:
#install aimodelshare library
! pip install aimodelshare --upgrade


In [None]:
# Generate credentials file
import aimodelshare as ai 
from aimodelshare.aws import configure_credentials 

configure_credentials()

## **Set up Environment**

Use your credentials file to set your credentials for all aimodelshare functions. 

In [2]:
# Set credentials 
import aimodelshare as ai
from aimodelshare.aws import set_credentials

set_credentials(credential_file="credentials.txt", type="deploy_model")

AI Model Share login credentials set successfully.
AWS credentials set successfully.


In [4]:
# Get materials for tutorial: Predicting whether or not a headline is clickbait

import aimodelshare as ai
X_train, X_test, y_train_labels, y_test_labels, example_data, lstm_model, lstm_model2 = ai.import_quickstart_data("clickbait")


Data downloaded successfully.

Preparing downloaded files for use...

Success! Your Quick Start materials have been downloaded. 
You are now ready to run the tutorial.


## **(1) Preprocessor Function & Setup**

### **Write a Preprocessor Function**


> ###   Preprocessor functions transform raw data into the exact format your model requires to generate predictions.  

*  *Preprocessor functions should always be named "preprocessor".*
*  *You can use any Python library in a preprocessor function, but all libraries should be imported inside your preprocessor function.*  
*  *For tabular prediction models, the preprocessor should at minimum accept an unpreprocessed pandas dataframe as input.*  
*  *Any categorical features should be preprocessed to one-hot encoded numeric values.* 
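
As a minimal illustration of the first two conventions (the function name and imports placed inside the function), here is a small text-cleaning preprocessor that uses only the standard library. It is an illustrative sketch, not the tokenizer-based preprocessor used in this tutorial, and the cleaning rules are assumptions chosen for brevity.

```python
def preprocessor(data):
    # Imports live inside the function so the saved function carries its own dependencies
    import re
    import string

    punctuation = "[" + re.escape(string.punctuation) + "]"
    cleaned = []
    for text in data:
        text = text.lower()                      # normalize case
        text = re.sub(punctuation, " ", text)    # replace punctuation with spaces
        cleaned.append(" ".join(text.split()))   # collapse repeated whitespace
    return cleaned

print(preprocessor(["You Won't BELIEVE This!"]))  # ['you won t believe this']
```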


In [5]:
# This preprocessor function makes use of the tf.keras tokenizer

from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
import numpy as np

# Build vocabulary from training text data
tokenizer = Tokenizer(num_words=10000)
tokenizer.fit_on_texts(X_train)

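# preprocessor tokenizes words and pads or truncates every document to the same length (maxlen)
def preprocessor(data, maxlen=40, max_words=10000):
    # Import inside the function so the exported preprocessor is self-contained
    from tensorflow.keras.preprocessing.sequence import pad_sequences

    # Convert each text to a sequence of integer word indices
    sequences = tokenizer.texts_to_sequences(data)

    # Pad (or truncate) every sequence to exactly maxlen tokens.
    # max_words is kept in the signature for compatibility with the calls below;
    # the vocabulary size itself is set by Tokenizer(num_words=...) above.
    X = pad_sequences(sequences, maxlen=maxlen)

    return X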

In [6]:
# One-hot encode your Y data
import pandas as pd 

y_train = pd.get_dummies(y_train_labels)
y_test = pd.get_dummies(y_test_labels)
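
To make the one-hot step concrete, the sketch below reproduces in plain Python the layout that `pd.get_dummies` produces for a column of string labels: one column per distinct label, in sorted order. The helper `one_hot` and the sample labels are illustrative assumptions.

```python
def one_hot(labels):
    """Return (columns, rows): sorted distinct labels and one 0/1 row per input label."""
    columns = sorted(set(labels))
    index = {label: i for i, label in enumerate(columns)}
    rows = []
    for label in labels:
        row = [0] * len(columns)
        row[index[label]] = 1   # mark the column that matches this label
        rows.append(row)
    return columns, rows

columns, rows = one_hot(["not clickbait", "clickbait", "clickbait"])
print(columns)  # ['clickbait', 'not clickbait']
print(rows)     # [[0, 1], [1, 0], [1, 0]]
```

The column order matters later: predicted column indices are mapped back to labels through this same ordering.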

In [7]:
# Check shape of data preprocessed using your new preprocessor() function
print(preprocessor(X_train, maxlen=40, max_words=10000).shape)
print(preprocessor(X_test, maxlen=40, max_words=10000).shape)

(24979, 40)
(6245, 40)
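
Both shapes have 40 columns because every headline is padded or truncated to `maxlen` tokens. The simplified, pure-Python sketch below mimics that behavior. It assumes the Keras defaults for `pad_sequences` (zeros added at the front, extra tokens dropped from the front) and uses a cruder tokenization (lowercase plus whitespace split) than the Keras `Tokenizer`. The helpers `build_vocab`, `texts_to_ids`, and `pad` are hypothetical names.

```python
from collections import Counter

def build_vocab(texts, num_words=10000):
    """Map the most frequent words to integer ids starting at 1 (0 is reserved for padding)."""
    counts = Counter(word for text in texts for word in text.lower().split())
    # Cap the vocabulary so that every id stays below num_words
    most_common = counts.most_common(num_words - 1)
    return {word: i + 1 for i, (word, _) in enumerate(most_common)}

def texts_to_ids(texts, vocab):
    """Replace each known word with its id; words outside the vocabulary are dropped."""
    return [[vocab[w] for w in text.lower().split() if w in vocab] for text in texts]

def pad(sequences, maxlen=40):
    """Left-pad with zeros, or keep only the last maxlen ids, so every row has length maxlen."""
    return [[0] * (maxlen - len(s)) + s[-maxlen:] for s in sequences]

train = ["you will not believe this", "markets fall as rates rise"]
vocab = build_vocab(train)
padded = pad(texts_to_ids(["you will believe", "rates rise again"], vocab), maxlen=5)
print(len(padded), len(padded[0]))  # 2 5
```

The word "again" never appears in the training texts, so it is dropped before padding, which is also what happens to out-of-vocabulary words when no OOV token is configured.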


## **(2) Build Model Using your Preferred ML Library**

### **We already loaded a Keras Sequential Model with LSTM layers in the quickstart function above**

In [8]:
# Here is a pre-trained LSTM model, but you could train your own model after preprocessing data with your preprocessor function.

lstm_model.summary()

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_2 (Embedding)      (None, 40, 16)            160000    
_________________________________________________________________
lstm_4 (LSTM)                (None, 40, 32)            6272      
_________________________________________________________________
lstm_5 (LSTM)                (None, 40, 32)            8320      
_________________________________________________________________
flatten_2 (Flatten)          (None, 1280)              0         
_________________________________________________________________
dense_2 (Dense)              (None, 2)                 2562      
Total params: 177,154
Trainable params: 177,154
Non-trainable params: 0
_________________________________________________________________
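
The parameter counts in the summary can be checked by hand. The sketch below uses the standard formulas for embedding, LSTM, and dense layers; the vocabulary size of 10,000 matches `num_words` in the tokenizer, and the helper names are illustrative.

```python
def embedding_params(vocab_size, dim):
    # One dim-sized vector per vocabulary entry
    return vocab_size * dim

def lstm_params(input_dim, units):
    # 4 gates, each with input weights, recurrent weights, and a bias vector
    return 4 * ((input_dim + units + 1) * units)

def dense_params(input_dim, units):
    # Weight matrix plus one bias per output unit
    return input_dim * units + units

layers = {
    "embedding_2": embedding_params(10000, 16),
    "lstm_4": lstm_params(16, 32),
    "lstm_5": lstm_params(32, 32),
    "flatten_2": 0,                       # 40 timesteps * 32 units = 1280 features, no weights
    "dense_2": dense_params(40 * 32, 2),
}
total = sum(layers.values())
print(layers)
print(total)  # 177154
```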


## **(3) Save Preprocessor**
### Saves preprocessor function to "preprocessor.zip" file

In [9]:
import aimodelshare as ai
ai.export_preprocessor(preprocessor,"") # Second argument is the directory you want to use to save your function

Your preprocessor is now saved to 'preprocessor.zip'


In [10]:
#  Now let's import and test the preprocessor function to see if it is working...

import aimodelshare as ai
prep=ai.import_preprocessor("preprocessor.zip")

prep(X_test, maxlen=40, max_words=10000).shape

(6245, 40)

## **(4) Save Keras Model to ONNX File Format**


In [11]:
# Save keras model to local ONNX file
from aimodelshare.aimsonnx import model_to_onnx

onnx_model = model_to_onnx(lstm_model, framework='keras',
                          transfer_learning=False,
                          deep_learning=True)

with open("model.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

## **(5) Create your Model Playground and Deploy REST API / Live Web Application**

In [12]:
#Set up arguments for Model Playground deployment
import pandas as pd 

model_filepath="model.onnx"
preprocessor_filepath="preprocessor.zip"

In [13]:
from aimodelshare import ModelPlayground

#Instantiate ModelPlayground() Class

myplayground=ModelPlayground(model_type="text", classification=True, private=False)

# Create Model Playground (generates live rest api and web-app for your model/preprocessor)

myplayground.deploy(model_filepath, preprocessor_filepath, y_train, example_data) 

We need some information about your model before we can build your REST API and interactive Model Playground.
   
Model Name (for AI Model Share Website):Clickbait Text Classifier
Model Description (Explain what your model does and 
 why end-users would find your model useful):Takes text input to predict if text is clickbait or not.
Model Key Words (Search categories that describe your model, separated with commas):clickbait, text, classification
   
Creating your prediction API. (This process may take several minutes.)


Success! Your Model Playground was created in 72 seconds. 
 Playground Url: "https://1pj0k29ak0.execute-api.us-east-1.amazonaws.com/prod/m"

You can now use your Model Playground.

Follow this link to explore your Model Playground's functionality
You can make predictions with the Dashboard and access example code from the Programmatic tab.
https://www.modelshare.org/detail/model:759


## **Use your new Model Playground!**

Follow the link in the output above to:
- Generate predictions with your interactive web dashboard
- Access example code in Python, R, and Curl

Or, follow the rest of the tutorial to create a competition for your Model Playground and: 
- Access verified model performance metrics 
- Upload multiple models to a leaderboard 
- Easily compare model performance & structure 

## **Part 2: Create a Competition**

-------

After deploying your Model Playground, you can now create a competition. 

Creating a competition allows you to:
1. Verify the model performance metrics on aimodelshare.org
2. Submit models to a leaderboard
3. Grant access to other users to submit models to the leaderboard
4. Easily compare model performance and structure 

In [14]:
# Create list of authorized participants for competition
# Note that participants should use the same email address when creating their modelshare.org account

emaillist=["emailaddress1@email.com", "emailaddress2@email.com", "emailaddress3@email.com"]

In [None]:
# Create Competition
myplayground.create_competition(data_directory='clickbait_competition_data', 
                                y_test = y_test_labels, 
                                email_list=emaillist)

In [16]:
#Instantiate Competition
#--Note: If you start a new session, the first argument should be the Model Playground url in quotes. 
#--e.g.- mycompetition= ai.Competition("https://2121212.execute-api.us-east-1.amazonaws.com/prod/m")
#See Model Playground "Compete" tab for example model submission code.

mycompetition= ai.Competition(myplayground.playground_url)

In [None]:
# Add, remove, or completely update authorized participants for competition later
emaillist=["emailaddress4@email.com"]

mycompetition.update_access_list(email_list=emaillist,update_type="Add")

Submit Models

In [18]:
#Authorized users can submit new models after setting credentials using modelshare.org username/password

apiurl=myplayground.playground_url # example url from deployed playground: apiurl= "https://123456.execute-api.us-east-1.amazonaws.com/prod/m"

from aimodelshare.aws import set_credentials
set_credentials(apiurl=apiurl)


AI Modelshare Username:··········
AI Modelshare Password:··········
AI Model Share login credentials set successfully.


In [None]:
#Submit Model 1: 

#-- Generate predicted values for Model 1
prediction_column_index = lstm_model.predict(preprocessor(X_test)).argmax(axis=1)
prediction_labels = [y_test.columns[i] for i in prediction_column_index]

# Submit Model 1 to Competition Leaderboard
mycompetition.submit_model(model_filepath = "model.onnx",
                                 preprocessor_filepath="preprocessor.zip",
                                 prediction_submission=prediction_labels)
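
The two prediction lines above turn class probabilities into string labels: `argmax` picks the highest-scoring column for each row, and the column names of the one-hot encoded labels translate that index back to a label. A plain-Python sketch of the same idea follows; `decode_predictions` and the sample probabilities are illustrative assumptions.

```python
def decode_predictions(probabilities, class_names):
    """Pick the highest-scoring class for each row and return its label."""
    labels = []
    for row in probabilities:
        best = max(range(len(row)), key=row.__getitem__)  # index of the largest score
        labels.append(class_names[best])
    return labels

probs = [[0.9, 0.1], [0.2, 0.8]]
print(decode_predictions(probs, ["clickbait", "not clickbait"]))
# ['clickbait', 'not clickbait']
```

The class names must be listed in the same order as the columns the model was trained on, otherwise every label is swapped.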

### **Model 2: Bidirectional LSTM**

In [20]:
# Here is a second pre-trained model that adds a bidirectional LSTM layer; you could also train your own model after preprocessing data with your preprocessor function.

lstm_model2.summary()

Model: "sequential_11"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_11 (Embedding)     (None, 40, 16)            160000    
_________________________________________________________________
lstm_28 (LSTM)               (None, 40, 32)            6272      
_________________________________________________________________
lstm_29 (LSTM)               (None, 40, 32)            8320      
_________________________________________________________________
bidirectional_6 (Bidirection (None, 64)                16640     
_________________________________________________________________
dense_11 (Dense)             (None, 2)                 130       
Total params: 191,362
Trainable params: 191,362
Non-trainable params: 0
_________________________________________________________________
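
The bidirectional layer accounts for most of the difference in size. The arithmetic below assumes the wrapped LSTM has 32 units, which is inferred from the `(None, 64)` output shape and the 16,640 parameter count rather than stated in the summary.

```python
def lstm_params(input_dim, units):
    # 4 gates, each with input weights, recurrent weights, and a bias vector
    return 4 * ((input_dim + units + 1) * units)

# A Bidirectional wrapper runs one LSTM forward and one backward, doubling the parameters
bidirectional = 2 * lstm_params(input_dim=32, units=32)

# Only the final state of each direction is returned, so Dense sees 2 * 32 = 64 features
dense = 64 * 2 + 2

total = 10000 * 16 + lstm_params(16, 32) + lstm_params(32, 32) + bidirectional + dense
print(bidirectional, dense, total)  # 16640 130 191362
```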


In [21]:
# Save Model 2 to .onnx file (This model has about 200k parameters, so it takes about a minute to save)

onnx_model = model_to_onnx(lstm_model2, framework='keras',
                          transfer_learning=False,
                          deep_learning=True)

# Save model to local .onnx file
with open("model_2.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString()) 

In [22]:
# Submit Model 2

#-- Generate predicted y values (Model 2)
prediction_column_index = lstm_model2.predict(preprocessor(X_test)).argmax(axis=1)
prediction_labels = [y_test.columns[i] for i in prediction_column_index]

# Submit Model 2 to Competition Leaderboard
mycompetition.submit_model(model_filepath = "model_2.onnx",
                                 prediction_submission=prediction_labels,
                                 preprocessor_filepath="preprocessor.zip")

Insert search tags to help users find your model (optional): lstm, bidirectional lstm, embedding, clickbait
Provide any useful notes about your model (optional): lstm, bidirectional lstm, embedding

Your model has been submitted as model version 2

To submit code used to create this model or to view current leaderboard navigate to Model Playground: 

 https://www.modelshare.org/detail/model:759


Get Leaderboard

In [23]:
data = mycompetition.get_leaderboard()
mycompetition.stylize_leaderboard(data)

| | accuracy | f1_score | precision | recall | ml_framework | transfer_learning | deep_learning | model_type | depth | num_params | bidirectional_layers | dense_layers | embedding_layers | flatten_layers | lstm_layers | softmax_act | tanh_act | loss | optimizer | model_config | memory_size | username | version |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 78.57% | 78.57% | 78.66% | 78.66% | keras | False | True | Sequential | 5 | 177154 | | 1 | 1 | 1.0 | 2 | 1 | 2 | function | RMSprop | `{'name': 'sequential_2', 'laye...` | 2751400 | mikedparrott | 1 |
| 1 | 78.46% | 78.38% | 79.41% | 78.74% | keras | False | True | Sequential | 5 | 191362 | 1.0 | 1 | 1 | | 2 | 1 | 2 | function | RMSprop | `{'name': 'sequential_11', 'lay...` | 3353848 | mikedparrott | 2 |
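
For reference, the four leaderboard metrics can be computed from label lists as sketched below. The leaderboard output does not say how per-class scores are averaged, so macro averaging is an assumption here, and `classification_metrics` is a hypothetical helper rather than an aimodelshare function.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1 for two label lists."""
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)  # true positives
        fp = sum(t != label and p == label for t, p in pairs)  # false positives
        fn = sum(t == label and p != label for t, p in pairs)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        per_class.append((precision, recall, f1))
    n = len(labels)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": sum(c[0] for c in per_class) / n,
        "recall": sum(c[1] for c in per_class) / n,
        "f1_score": sum(c[2] for c in per_class) / n,
    }

y_true = ["clickbait", "clickbait", "not clickbait", "not clickbait"]
y_pred = ["clickbait", "not clickbait", "not clickbait", "not clickbait"]
print(classification_metrics(y_true, y_pred)["accuracy"])  # 0.75
```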


Compare Models

In [24]:
# Compare two or more models (Experimental, Git-like Diffs for Model Architectures)
data=mycompetition.compare_models([1,2],verbose=1)
mycompetition.stylize_compare(data)

| | Model_1_Layer | Model_1_Shape | Model_1_Params | Model_2_Layer | Model_2_Shape | Model_2_Params |
|---|---|---|---|---|---|---|
| 0 | Embedding | [None, 40, 16] | 160000 | Embedding | [None, 40, 16] | 160000 |
| 1 | LSTM | [None, 40, 32] | 6272 | LSTM | [None, 40, 32] | 6272 |
| 2 | LSTM | [None, 40, 32] | 8320 | LSTM | [None, 40, 32] | 8320 |
| 3 | Flatten | [None, 1280] | 0 | Bidirectional | [None, 64] | 16640 |
| 4 | Dense | [None, 2] | 2562 | Dense | [None, 2] | 130 |
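
The idea behind a layer-by-layer comparison can be sketched in a few lines: pair the layers by position and flag every position where the layer type, output shape, or parameter count differs. This is an illustration of the concept using the values from the output above, not a description of how `compare_models` is implemented; `compare_layers` is a hypothetical helper.

```python
from itertools import zip_longest

def compare_layers(model_a, model_b):
    """Pair layers by position and flag rows where type, shape, or parameter count differs."""
    rows = []
    for i, (a, b) in enumerate(zip_longest(model_a, model_b)):
        rows.append({"position": i, "model_1": a, "model_2": b, "changed": a != b})
    return rows

shared = [
    ("Embedding", (None, 40, 16), 160000),
    ("LSTM", (None, 40, 32), 6272),
    ("LSTM", (None, 40, 32), 8320),
]
model_1 = shared + [("Flatten", (None, 1280), 0), ("Dense", (None, 2), 2562)]
model_2 = shared + [("Bidirectional", (None, 64), 16640), ("Dense", (None, 2), 130)]

changed = [row["position"] for row in compare_layers(model_1, model_2) if row["changed"]]
print(changed)  # [3, 4]
```

`zip_longest` keeps the comparison usable when the two models have different depths; the shorter model simply contributes `None` for the missing positions.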


#### Check structure of y test data 
(This helps users understand how to submit predicted values to leaderboard)

In [25]:
mycompetition.inspect_y_test()

{'class_balance': {'clickbait': 3022, 'not clickbait': 3223},
 'class_labels': ['not clickbait', 'clickbait'],
 'label_dtypes': {"<class 'str'>": 6245},
 'y_length': 6245,
 'ytest_example': ['not clickbait',
  'not clickbait',
  'not clickbait',
  'not clickbait',
  'clickbait']}
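
A summary like the one above is useful for checking a prediction list before submitting it. The sketch below builds a similar summary from a plain list of labels and validates predictions against it. Both helpers are hypothetical and only mirror the structure of the output shown here.

```python
from collections import Counter

def summarize_labels(labels, n_examples=5):
    """Build a summary dict shaped like the inspect_y_test() output above."""
    return {
        "class_balance": dict(Counter(labels)),
        "class_labels": list(dict.fromkeys(labels)),  # distinct labels, first-seen order
        "label_dtypes": dict(Counter(str(type(label)) for label in labels)),
        "y_length": len(labels),
        "ytest_example": labels[:n_examples],
    }

def check_submission(predictions, summary):
    """Raise ValueError if predictions cannot line up with the test labels."""
    if len(predictions) != summary["y_length"]:
        raise ValueError(
            f"expected {summary['y_length']} predictions, got {len(predictions)}"
        )
    unknown = set(predictions) - set(summary["class_labels"])
    if unknown:
        raise ValueError(f"unknown labels in predictions: {sorted(unknown)}")

summary = summarize_labels(["not clickbait", "not clickbait", "clickbait"])
check_submission(["clickbait", "not clickbait", "clickbait"], summary)  # passes silently
print(summary["class_balance"])  # {'not clickbait': 2, 'clickbait': 1}
```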

## **Part 3: Maintaining your Model Playground**

-------

Update Runtime model

*Use this function to 1) update the prediction API behind your Model Playground with a new model, chosen from the leaderboard and 2) verify the model performance metrics in your Model Playground*

In [26]:
myplayground.update_runtime_model(model_version=2)

Runtime model & preprocessor for api: https://1pj0k29ak0.execute-api.us-east-1.amazonaws.com/prod/m updated to model version 2.

Model metrics are now updated and verified for this model playground.


Delete Deployment 

*Use this function to delete the entire Model Playground, including the REST API, web dashboard, competition, and all submitted models*

In [None]:
myplayground.delete_deployment()

Running this function will permanently delete all resources tied to this deployment, 
 including the eval lambda and all models submitted to the model competition.

To confirm, type 'permanently delete':permanently delete


'API deleted successfully.'