<p align="center"><img width="50%" src="https://aimodelsharecontent.s3.amazonaws.com/aimodshare_banner.jpg" /></p>


---




<p align="center"><h1 align="center">Quick Start: Titanic Tabular Classification Tutorial</h1> 

---

<h3 align="center">(Deploy model to an AI Model Share Model Playground REST API<br> and Web Dashboard in five easy steps...)</h3></p>
<p align="center"><img width="100%" src="https://aimodelsharecontent.s3.amazonaws.com/aimstutorialsteps.gif" /></p>


---



## **Credential Configuration**

In order to deploy an AI Model Share Model Playground, you will need a credentials text file. 

Generating your credentials file requires two sets of information: 
1. Your AI Model Share username and password (create them [HERE](https://www.modelshare.org/login)). 
2. Your AWS (Amazon Web Services) access keys (follow the tutorial [HERE](https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html)). 

You only need to generate your credentials file once. After running the configure function below, save the generated file and reuse it for all future Model Playground deployments and competition submissions. 

*Note: Treat your credentials file with the same level of security as your passwords. Do not share the file with anyone, send it via email, or upload it to GitHub.*
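
One way to act on the note above is to restrict the file's permissions and make sure Git ignores it. The sketch below is not part of aimodelshare: `protect_credentials_file` is a hypothetical helper, and it assumes a POSIX system and a credentials file named `credentials.txt`, as used later in this tutorial.

```python
import os
import stat
from pathlib import Path


def protect_credentials_file(credentials_path, gitignore_path=".gitignore"):
    """Hypothetical helper: make the file owner-only and list it in .gitignore."""
    credentials = Path(credentials_path)
    # Owner read/write only (0o600). This has limited effect on Windows.
    os.chmod(credentials, stat.S_IRUSR | stat.S_IWUSR)

    # Add the file name to .gitignore unless it is already listed.
    gitignore = Path(gitignore_path)
    existing = gitignore.read_text().splitlines() if gitignore.exists() else []
    if credentials.name not in existing:
        existing.append(credentials.name)
        gitignore.write_text("\n".join(existing) + "\n")

    # Return the resulting permission bits so the caller can verify them.
    return stat.S_IMODE(credentials.stat().st_mode)
```

After the credentials file exists, calling `protect_credentials_file("credentials.txt")` from the project directory would apply both safeguards.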


In [None]:
#install aimodelshare library
#! pip install aimodelshare
! pip install --extra-index-url https://test.pypi.org/simple/ --upgrade aimodelsharedev==656.618.838



In [None]:
# Generate credentials file
import aimodelshare as ai 
from aimodelshare.aws import configure_credentials 

configure_credentials()

## **Set up Environment**

Use your credentials file to set your credentials for all aimodelshare functions. 

In [3]:
# Set credentials 
import aimodelshare as ai
from aimodelshare.aws import set_credentials

set_credentials(credential_file="credentials.txt", type="deploy_model")

AI Model Share login credentials set successfully.
AWS credentials set successfully.


In [4]:
# Get materials for tutorial
import aimodelshare as ai
X_train, X_test, y_train, y_test, example_data, y_test_labels = ai.import_quickstart_data("titanic")

Downloading [====>                                            ]

Data downloaded successfully.

Preparing downloaded files for use...

Success! Your Quick Start materials have been downloaded. 
You are now ready to run the tutorial.


## **(1) Preprocessor Function & Setup**

### **Write a Preprocessor Function**


> ###   Preprocessor functions transform raw data into the exact format your model requires to generate predictions.  

*  *Preprocessor functions should always be named "preprocessor".*
*  *You can use any Python library in a preprocessor function, but all libraries should be imported inside your preprocessor function.*  
*  *For tabular prediction models, the function should accept, at minimum, an unpreprocessed pandas DataFrame as input.*  
*  *Any categorical features should be one-hot encoded as numeric values.* 
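
To illustrate these guidelines, here is a minimal sketch of a preprocessor that follows them without sklearn's ColumnTransformer. The column names `age` and `sex`, the fill value of 30.0, and the fixed category list are assumptions chosen for illustration. This is not the preprocessor used in the rest of this tutorial.

```python
def preprocessor(data):
    # Imports live inside the function, as the guidelines above require.
    import numpy as np
    import pandas as pd

    df = pd.DataFrame(data).copy()
    # Assumed fill value for missing ages (illustrative, not computed from the Titanic data).
    age = df["age"].fillna(30.0).to_numpy(dtype="float32").reshape(-1, 1)
    # One-hot encode against a fixed category list so the column order never changes.
    sex = pd.Categorical(df["sex"], categories=["female", "male"])
    onehot = pd.get_dummies(sex).to_numpy(dtype="float32")
    # Result: one numeric column followed by two one-hot columns.
    return np.hstack([age, onehot])
```

Fixing the category list up front keeps the output width constant even when a batch of new data happens to contain only one category.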


In [5]:
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler, OneHotEncoder

#Preprocess data using sklearn's Column Transformer approach

# We create the preprocessing pipelines for both numeric and categorical data.
numeric_features = ['age', 'fare']
numeric_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='median')), #'imputer' names the step
    ('scaler', StandardScaler())])

categorical_features = ['embarked', 'sex', 'pclass']

# Replace missing values with the modal (most frequent) value, then one-hot encode.
categorical_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='most_frequent')),
    ('onehot', OneHotEncoder(handle_unknown='ignore'))])

# Final preprocessor object set up with ColumnTransformer...

preprocess = ColumnTransformer(
    transformers=[
        ('num', numeric_transformer, numeric_features),
        ('cat', categorical_transformer, categorical_features)])

# fit preprocessor to your data
preprocess = preprocess.fit(X_train)

In [6]:
# Write function to transform data with preprocessor 
# In this case we use sklearn's Column transformer in our preprocessor function

def preprocessor(data):
    preprocessed_data=preprocess.transform(data)
    return preprocessed_data

In [7]:
# check shape of X data 
preprocessor(X_train).shape

(1047, 10)

## **(2) Build Model Using sklearn (or Your Preferred ML Library)**

### **Penalized Logit**

In [8]:
from sklearn.linear_model import LogisticRegression

model = LogisticRegression(C=10, penalty='l2')
model.fit(preprocessor(X_train), y_train) # Fitting to the training set.
model.score(preprocessor(X_train), y_train) # Accuracy on the training set, 0-1 scale.

0.7793696275071633
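
The score above is measured on the training data, so it can overstate how well the model generalizes. The following is a minimal sketch of comparing training and held-out accuracy. It uses synthetic data from `make_classification` rather than the Titanic data, and `max_iter=1000` is an added setting, not one from this tutorial.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data with 10 features, matching the preprocessed width above.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

clf = LogisticRegression(C=10, penalty="l2", max_iter=1000)
clf.fit(X_tr, y_tr)

train_accuracy = clf.score(X_tr, y_tr)  # accuracy on data the model has seen
test_accuracy = clf.score(X_te, y_te)   # accuracy on held-out data
print(f"train accuracy: {train_accuracy:.3f}, held-out accuracy: {test_accuracy:.3f}")
```

With this tutorial's variables, the analogous check would be `model.score(preprocessor(X_test), y_test)`, assuming `y_test` uses the same label format as `y_train`.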

## **(3) Save Preprocessor**
### Saves preprocessor function to "preprocessor.zip" file

In [9]:
import aimodelshare as ai
ai.export_preprocessor(preprocessor,"") 

Your preprocessor is now saved to 'preprocessor.zip'


In [10]:
#  Now let's import and test the preprocessor function to see if it is working...

import aimodelshare as ai
prep=ai.import_preprocessor("preprocessor.zip")

prep(X_test).shape

(262, 10)

## **(4) Save sklearn Model to ONNX File Format**


In [11]:
# Save sklearn model to local ONNX file
from aimodelshare.aimsonnx import model_to_onnx

# The ONNX input shape must match the number of preprocessed input features
from skl2onnx.common.data_types import FloatTensorType
initial_type = [('float_input', FloatTensorType([None, 10]))]  # 10 = number of features in the preprocessed data

onnx_model = model_to_onnx(model, framework='sklearn',
                          initial_types=initial_type,
                          transfer_learning=False,
                          deep_learning=False)

with open("model.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())
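
Rather than hard-coding the feature count, it can be read from the shape of the preprocessed data. The sketch below uses a NumPy array of zeros as a stand-in for `preprocessor(X_train)`. `count_features` is a hypothetical helper, not an aimodelshare function.

```python
import numpy as np


def count_features(preprocessed):
    """Return the number of columns in a 2-D preprocessed array or sparse matrix."""
    n_rows, n_cols = preprocessed.shape
    return n_cols


# Stand-in for preprocessor(X_train), which has shape (1047, 10) in this tutorial.
preprocessed_example = np.zeros((1047, 10), dtype=np.float32)
n_features = count_features(preprocessed_example)
print(n_features)  # 10
```

The result could then be passed as `FloatTensorType([None, n_features])`, so the ONNX input shape stays correct if the preprocessor changes.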

## **(5) Create your Model Playground**

In [12]:
#Set up arguments for Model Playground deployment
import pandas as pd 

model_filepath="model.onnx"
preprocessor_filepath="preprocessor.zip"
exampledata = example_data

In [13]:
from aimodelshare import ModelPlayground

#Instantiate ModelPlayground() Class

myplayground=ModelPlayground(model_type="tabular", classification="TRUE", private="FALSE")

# Create Model Playground (generates live rest api and web-app for your model/preprocessor)

myplayground.deploy(model_filepath, preprocessor_filepath, y_train, exampledata) 

We need some information about your model before we can build your REST API and interactive Model Playground.
   
Model Name (for AI Model Share Website):Titanic survival classifier
Model Description (Explain what your model does and 
 why end-users would find your model useful):This model takes passenger attributes such as gender, ticket class, and age to predict titanic survival.
Model Key Words (Search categories that describe your model, separated with commas):tabular, classification, titanic
   
Creating your prediction API. (This process may take several minutes.)


Success! Your Model Playground was created in 65 seconds. 
 Playground Url: "https://evxjcyzox2.execute-api.us-east-1.amazonaws.com/prod/m"

You can now use your Model Playground.

Follow this link to explore your Model Playground's functionality
You can make predictions with the Dashboard and access example code from the Programmatic tab.
https://www.modelshare.org/detail/model:494


## **Use your new Model Playground!**

Follow the link in the output above to:
- Generate predictions with your interactive web dashboard
- Access example code in Python, R, and Curl

Or, follow the rest of the tutorial to create a competition for your Model Playground and: 
- Access verified model performance metrics 
- Upload multiple models to a leaderboard 
- Easily compare model performance & structure 

## **Part 2: Create a Competition**

-------

After deploying your Model Playground, you can now create a competition. 

Creating a competition allows you to:
1. Verify the model performance metrics on aimodelshare.org
2. Submit models to a leaderboard
3. Grant access to other users to submit models to the leaderboard
4. Easily compare model performance and structure 

In [14]:
# Create Competition
myplayground.create_competition(data_directory='titanic_competition_data',
                                y_test=y_test_labels,
                                generate_credentials_file=True)

Enter competition name:Titanic Survival Competition
Enter competition description:Submit models to predict titanic survival.
Enter data description (i.e.- filenames denoting training and test data, file types, and any subfolders where files are stored):training data includes target variable.  Use test data to generate predictions of labels "survived" or "died" to make submissions to this competition. 
Uploading your data. Please wait for a confirmation message.

 Success! Model competition created. 

Your team members can now make use of the following functions: 
submit_model() to submit new models to the competition leaderboard. 
download_data('public.ecr.aws/y2e2a1d6/titanic_competition_data-repository:latest') to download your competition data.  

You may update your prediction API runtime model with the update_runtime_model() function.

To upload new models and/or preprocessors to this API, team members should use 
the following credentials:

#Credentials for Competition: evxjcyzox

In [15]:
#Instantiate Competition
#--Note: If you start a new session, the first argument should be the Model Playground url in quotes. 
#--e.g.- mycompetition= ai.Competition("https://2121212.execute-api.us-east-1.amazonaws.com/prod/m")
#See Model Playground "Compete" tab for example model submission code.

mycompetition= ai.Competition(myplayground.playground_url)

Submit Models

In [16]:
#Submit Model 1: 

#-- Generate predicted values (a list of predicted labels "survived" or "died") (Model 1)
prediction_labels = model.predict(preprocessor(X_test))

# Submit Model 1 to Competition Leaderboard
mycompetition.submit_model(model_filepath = "model.onnx",
                                 preprocessor_filepath="preprocessor.zip",
                                 prediction_submission=prediction_labels)

Insert search tags to help users find your model (optional): penalized logistic regression
Provide any useful notes about your model (optional): did not tune C

Your model has been submitted as model version 1

To submit code used to create this model or to view current leaderboard navigate to Model Playground: 

 https://www.modelshare.org/detail/model:494


In [17]:
# Create model 2 (Gradient Boosting Classifier)
from sklearn.ensemble import GradientBoostingClassifier

model_2 = GradientBoostingClassifier()
model_2.fit(preprocessor(X_train), y_train)
model_2.score(preprocessor(X_train), y_train)

0.8701050620821394

In [18]:
# Save Model 2 to .onnx file

# The ONNX input shape must match the number of preprocessed input features
from skl2onnx.common.data_types import FloatTensorType
initial_type = [('float_input', FloatTensorType([None, 10]))]  # 10 = number of features in the preprocessed data

onnx_model = model_to_onnx(model_2, framework='sklearn',
                          initial_types=initial_type,
                          transfer_learning=False,
                          deep_learning=False)

# Save model to local .onnx file
with open("model_2.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString()) 

In [19]:
# Submit Model 2

#-- Generate predicted y values (Model 2)
prediction_labels = model_2.predict(preprocessor(X_test))

# Submit Model 2 to Competition Leaderboard
mycompetition.submit_model(model_filepath = "model_2.onnx",
                                 prediction_submission=prediction_labels,
                                 preprocessor_filepath="preprocessor.zip")

Insert search tags to help users find your model (optional): gb classifier, untuned
Provide any useful notes about your model (optional): gb classifier set to default arguments

Your model has been submitted as model version 2

To submit code used to create this model or to view current leaderboard navigate to Model Playground: 

 https://www.modelshare.org/detail/model:494


Get Leaderboard

In [20]:
data = mycompetition.get_leaderboard()
mycompetition.stylize_leaderboard(data)

| | accuracy | f1_score | precision | recall | ml_framework | transfer_learning | deep_learning | model_type | num_params | optimizer | model_config | username | version |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 81.68% | 80.09% | 81.15% | 79.44% | sklearn | False | False | LogisticRegression | 10.0 | lbfgs | {'C': 10, 'class_weight': None... | mikedparrott | 1 |
| 1 | 82.06% | 80.06% | 82.48% | 78.99% | sklearn | False | False | GradientBoostingClassifier | | | {'ccp_alpha': 0.0, 'criterion'... | mikedparrott | 2 |
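
For readers who want to sanity-check leaderboard-style metrics locally, the sketch below computes accuracy together with macro-averaged precision, recall, and F1 in plain Python. Macro averaging is an assumption here, since the output above does not state how the leaderboard averages across classes. The labels in the example are made up.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1 (averaging is an assumption)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1_score": sum(f1s) / n,
    }


# Made-up labels for illustration only.
metrics = classification_metrics(
    ["died", "died", "survived", "survived"],
    ["died", "survived", "survived", "survived"],
)
print(metrics)
```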


Compare Models

In [None]:
# Compare two or more models (Experimental, Git-like Diffs for Model Architectures)
mycompetition.compare_models([1,2])

Import a model from the leaderboard

In [24]:
mymodel = mycompetition.instantiate_model(2, trained=True)

#Generate new predictions to test model
mymodel.predict(preprocessor(X_test))

array(['died', 'survived', 'died', 'died', 'died', 'died', 'survived',
       'died', 'died', 'died', 'died', 'survived', 'died', 'survived',
       'died', 'died', 'died', 'died', 'died', 'died', 'died', 'died',
       'died', 'died', 'died', 'died', 'survived', 'died', 'survived',
       'survived', 'died', 'died', 'died', 'died', 'died', 'survived',
       'survived', 'died', 'died', 'survived', 'died', 'died', 'died',
       'died', 'died', 'survived', 'died', 'died', 'survived', 'survived',
       'died', 'died', 'survived', 'survived', 'survived', 'died', 'died',
       'survived', 'survived', 'died', 'died', 'died', 'died', 'died',
       'survived', 'died', 'died', 'died', 'died', 'died', 'died', 'died',
       'survived', 'survived', 'survived', 'survived', 'survived',
       'survived', 'died', 'died', 'died', 'died', 'survived', 'survived',
       'died', 'died', 'survived', 'survived', 'survived', 'survived',
       'died', 'died', 'survived', 'died', 'survived', 'died', 'd

#### Check structure of y test data 
(This helps users understand how to submit predicted values to leaderboard)

In [25]:
mycompetition.inspect_y_test()

{'class_balance': {'died': 162, 'survived': 100},
 'class_labels': ['died', 'survived'],
 'label_dtypes': {"<class 'str'>": 262},
 'y_length': 262,
 'ytest_example': ['survived', 'survived', 'survived', 'died', 'died']}
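
The structure above suggests a simple check to run before submitting: the prediction list should have the same length as `y_length` and contain only values from `class_labels`. The sketch below is a hypothetical helper, not an aimodelshare function, and the example predictions are made up.

```python
def check_prediction_labels(predictions, expected_length, class_labels):
    """Hypothetical pre-submission check; returns a list of problems (empty if none)."""
    predictions = list(predictions)
    allowed = set(class_labels)
    problems = []
    # The number of predictions must match the length of the y test data.
    if len(predictions) != expected_length:
        problems.append(f"expected {expected_length} predictions, got {len(predictions)}")
    # Every prediction must be one of the known class labels (case-sensitive).
    unexpected = sorted({repr(p) for p in predictions if p not in allowed})
    if unexpected:
        problems.append("unexpected labels: " + ", ".join(unexpected))
    return problems


# Made-up predictions: too short, and 'Survived' has the wrong capitalization.
example_predictions = ["survived", "died", "Survived"]
print(check_prediction_labels(example_predictions, 262, ["died", "survived"]))
```

With this tutorial's variables, the values for `expected_length` and `class_labels` could be taken from the `y_length` and `class_labels` entries shown above.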

## **Part 3: Maintaining your Model Playground**

-------

Update Runtime model

*Use this function to 1) update the prediction API behind your Model Playground with a new model chosen from the leaderboard, and 2) verify the model performance metrics in your Model Playground.*

In [26]:
myplayground.update_runtime_model(model_version=2)

Runtime model & preprocessor for api: https://evxjcyzox2.execute-api.us-east-1.amazonaws.com/prod/m updated to model version 2.

Model metrics are now updated and verified for this model playground.


Delete Deployment 

*Use this function to delete the entire Model Playground, including the REST API, web dashboard, competition, and all submitted models*

In [None]:
myplayground.delete_deployment()

Running this function will permanently delete all resources tied to this deployment, 
 including the eval lambda and all models submitted to the model competition.

To confirm, type 'permanently delete':permanently delete


'API deleted successfully.'