## Chassis Example Notebooks
Welcome to the examples section for [Chassis](https://chassis.ml), which contains notebooks that auto-containerize models built using the most common machine learning (ML) frameworks. 

#### What is Chassis?
Chassis allows you to automatically create a Docker container from your model code and push that container image to a Docker registry. All you need is your model loaded into memory and a few lines of Chassis code! Our example bank is here to provide reference examples for many common ML frameworks.  

Can't find the framework you are looking for or need help? Fork this repository and open a PR, or list the desired framework in a new issue. We're always interested in growing this example bank! 

The primary maintainers of Chassis also actively monitor our [Discord Server](https://discord.gg/cHpzY9yCcM), so feel free to join and ask any questions you might have. We'll be there to respond and help out promptly.

In [2]:
import time
import os
import chassisml
import numpy as np
import getpass
import json
import pandas as pd
from io import StringIO
from fastai.tabular.all import TabularDataLoaders, RandomSplitter, TabularPandas, tabular_learner, Categorify, FillMissing, Normalize, range_of, accuracy

## Enter credentials
Enter your Docker Hub username and password. They are used later to push the container image.

In [3]:
dockerhub_user = getpass.getpass('docker hub username')
dockerhub_pass = getpass.getpass('docker hub password')

docker hub username········
docker hub password········


## Download and process data

In [4]:
df = pd.read_csv("./data/adult_sample/adult.csv")
df.head()

Unnamed: 0,age,workclass,fnlwgt,education,education-num,marital-status,occupation,relationship,race,sex,capital-gain,capital-loss,hours-per-week,native-country,salary
0,49,Private,101320,Assoc-acdm,12.0,Married-civ-spouse,,Wife,White,Female,0,1902,40,United-States,>=50k
1,44,Private,236746,Masters,14.0,Divorced,Exec-managerial,Not-in-family,White,Male,10520,0,45,United-States,>=50k
2,38,Private,96185,HS-grad,,Divorced,,Unmarried,Black,Female,0,0,32,United-States,<50k
3,38,Self-emp-inc,112847,Prof-school,15.0,Married-civ-spouse,Prof-specialty,Husband,Asian-Pac-Islander,Male,0,0,40,United-States,>=50k
4,42,Self-emp-not-inc,82297,7th-8th,,Married-civ-spouse,Other-service,Wife,Black,Female,0,0,50,United-States,<50k


In [5]:
dls = TabularDataLoaders.from_csv("./data/adult_sample/adult.csv", path=os.path.join(os.getcwd(), "data/adult_sample"), y_names="salary",
    cat_names = ['workclass', 'education', 'marital-status', 'occupation', 'relationship', 'race'],
    cont_names = ['age', 'fnlwgt', 'education-num'],
    procs = [Categorify, FillMissing, Normalize])

In [6]:
splits = RandomSplitter(valid_pct=0.2)(range_of(df))
to = TabularPandas(df, procs=[Categorify, FillMissing,Normalize],
                   cat_names = ['workclass', 'education', 'marital-status', 'occupation', 'relationship', 'race'],
                   cont_names = ['age', 'fnlwgt', 'education-num'],
                   y_names='salary',
                   splits=splits)

In [7]:
# save test subset
test_df = df.copy()
test_df.drop(['salary'], axis=1, inplace=True)
test_df[:20].to_csv("./data/sample_adult_data.csv", index=False)

In [8]:
# print out a sample of the fully preprocessed data
to.xs.iloc[:2]

Unnamed: 0,workclass,education,marital-status,occupation,relationship,race,education-num_na,age,fnlwgt,education-num
31480,5,9,3,4,1,5,1,-1.070514,0.578132,0.362212
7605,5,6,7,12,2,5,1,2.741258,0.181321,-2.378319


In [9]:
# rebuild dataloaders with preprocessed data
dls = to.dataloaders(bs=64)

## Train model

In [10]:
# build and train model
learn = tabular_learner(dls, metrics=accuracy)
learn.fit_one_cycle(1)

epoch,train_loss,valid_loss,accuracy,time
0,0.382037,0.358385,0.842752,00:08


In [11]:
# view results
learn.show_results()

Unnamed: 0,workclass,education,marital-status,occupation,relationship,race,education-num_na,age,fnlwgt,education-num,salary,salary_pred
0,5.0,12.0,3.0,5.0,1.0,5.0,1.0,0.102339,0.478246,-0.420797,1.0,1.0
1,5.0,1.0,5.0,7.0,4.0,5.0,1.0,-1.510334,-0.834797,-1.59531,0.0,0.0
2,5.0,12.0,1.0,9.0,2.0,5.0,1.0,-0.850604,-0.274844,-0.420797,0.0,0.0
3,5.0,13.0,5.0,11.0,4.0,5.0,1.0,-0.997211,0.11723,1.536725,0.0,0.0
4,7.0,12.0,3.0,4.0,1.0,5.0,1.0,1.201888,1.233909,-0.420797,0.0,0.0
5,5.0,10.0,7.0,2.0,2.0,5.0,1.0,1.788315,0.253177,1.145221,0.0,0.0
6,7.0,16.0,7.0,13.0,2.0,5.0,1.0,2.741258,0.343837,-0.029293,0.0,0.0
7,8.0,16.0,5.0,2.0,4.0,5.0,1.0,-1.290424,-0.048057,-0.029293,0.0,0.0
8,5.0,10.0,5.0,13.0,2.0,3.0,1.0,-0.557391,0.39704,1.145221,0.0,0.0


In [16]:
labels = ['<50k', '>50k']

## Write process function

* Must take bytes as input
* Preprocess bytes, run inference, postprocess model output, return results

In [17]:
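def process(input_bytes):
    # decode the raw bytes as UTF-8 CSV text and load it into a DataFrame
    inputs = pd.read_csv(StringIO(str(input_bytes, "utf-8")))
    # build a test dataloader that applies the training-time preprocessing
    dl = learn.dls.test_dl(inputs)
    # get_preds returns (predictions, targets); keep the class probabilities
    preds = learn.get_preds(dl=dl)[0].numpy()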
    
    inference_result = {
        "classPredictions": [
            {
                "row": i+1,
                "predictions": [
                    {"class": labels[j], "score": round(pred[j], 4)} for j in range(2)
                ]
            } for i, pred in enumerate(preds)
        ]
    }
    
    structured_output = {
        "data": {
            "result": inference_result,
            "explanation": None,
            "drift": None,
        }
    }
    
    return structured_output
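
The result structure returned by `process` can be illustrated without the model. The sketch below is not part of Chassis: `build_result` is a hypothetical helper name, and the scores are made-up values. It casts each score to a built-in `float` before rounding, which is one way (an assumption, not the notebook's approach) to keep the rounded values short when the output is serialized.

```python
import json


def build_result(scores, labels):
    """Wrap per-row class scores in the structure that process() returns.

    Hypothetical helper for illustration only, not a Chassis API.
    scores: iterable of rows, each holding one score per label.
    """
    class_predictions = [
        {
            "row": i + 1,
            "predictions": [
                # cast to a built-in float so rounding and JSON output stay clean
                {"class": label, "score": round(float(score), 4)}
                for label, score in zip(labels, row)
            ],
        }
        for i, row in enumerate(scores)
    ]
    return {
        "data": {
            "result": {"classPredictions": class_predictions},
            "explanation": None,
            "drift": None,
        }
    }


# made-up scores for a single row
example = build_result([[0.40481, 0.59519]], ["<50k", ">50k"])
print(json.dumps(example))
```

The `explanation` and `drift` keys are kept as `None` to match the structure used in the cell above.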

## Initialize Chassis Client
We'll use this client to interact with the Chassis service.

In [18]:
chassis_client = chassisml.ChassisClient("https://chassis.app.modzy.com")

## Create and test Chassis model
* Requires the `process` function defined above, passed as the `process_fn` argument

In [19]:
# create Chassis model
chassis_model = chassis_client.create_model(process_fn=process)

# test Chassis model locally (can pass filepath, bufferedreader, bytes, or text here):
sample_filepath = './data/sample_adult_data.csv'
results = chassis_model.test(sample_filepath)
print(results)

b'{"data":{"result":{"classPredictions":[{"row":1,"predictions":[{"class":"<50k","score":0.4047999978065491},{"class":">50k","score":0.5952000021934509}]},{"row":2,"predictions":[{"class":"<50k","score":0.4226999878883362},{"class":">50k","score":0.5773000121116638}]},{"row":3,"predictions":[{"class":"<50k","score":0.9347000122070312},{"class":">50k","score":0.06530000269412994}]},{"row":4,"predictions":[{"class":"<50k","score":0.07410000264644623},{"class":">50k","score":0.9258999824523926}]},{"row":5,"predictions":[{"class":"<50k","score":0.8180999755859375},{"class":">50k","score":0.1818999946117401}]},{"row":6,"predictions":[{"class":"<50k","score":0.9790999889373779},{"class":">50k","score":0.020899999886751175}]},{"row":7,"predictions":[{"class":"<50k","score":0.8970000147819519},{"class":">50k","score":0.10300000011920929}]},{"row":8,"predictions":[{"class":"<50k","score":0.7594000101089478},{"class":">50k","score":0.24060000479221344}]},{"row":9,"predictions":[{"class":"<50k","
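
Since the test call returns JSON-encoded bytes, the result can be parsed with the standard library. The following is a minimal sketch: `top_classes` is a hypothetical helper name, and `sample` is a hand-written two-row payload in the same shape as the output above, not real model output.

```python
import json


def top_classes(result_bytes):
    """Return (row, class, score) for the highest-scoring class of each row."""
    payload = json.loads(result_bytes)  # json.loads accepts bytes directly
    rows = payload["data"]["result"]["classPredictions"]
    summary = []
    for row in rows:
        best = max(row["predictions"], key=lambda p: p["score"])
        summary.append((row["row"], best["class"], best["score"]))
    return summary


# hand-written payload for illustration, shaped like the test output
sample = (
    b'{"data":{"result":{"classPredictions":['
    b'{"row":1,"predictions":[{"class":"<50k","score":0.4048},'
    b'{"class":">50k","score":0.5952}]},'
    b'{"row":2,"predictions":[{"class":"<50k","score":0.9347},'
    b'{"class":">50k","score":0.0653}]}'
    b']},"explanation":null,"drift":null}}'
)
print(top_classes(sample))
```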

In [21]:
# manually construct conda environment to pass to Chassis job

# NOTE: if you define your environment manually like this, the "chassisml" package must be included in the pip dependencies
env = {
    "name": "sklearn-chassis",
    "channels": ['conda-forge'],
    "dependencies": [
        "python=3.8.5",
        {
            "pip": [
                "fastai",
                "numpy",
                "chassisml",
                "pandas"
            ] 
        }
    ]
}
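
Because a manually defined environment must list `chassisml` among its pip dependencies, a quick local check before publishing can catch an omission. This is a sketch under stated assumptions: `pip_dependencies` and `has_pip_package` are hypothetical helpers, not Chassis functions, and `example_env` simply mirrors the dictionary above.

```python
import re


def pip_dependencies(env):
    """Collect the pip package specs from a conda-style environment dict."""
    pip_specs = []
    for dep in env.get("dependencies", []):
        if isinstance(dep, dict):
            pip_specs.extend(dep.get("pip", []))
    return pip_specs


def has_pip_package(env, package):
    """True if `package` appears among the pip specs, ignoring version pins."""
    for spec in pip_dependencies(env):
        # keep only the name before any version specifier, extra, or marker
        name = re.split(r"[=<>!~\[; ]", spec, maxsplit=1)[0]
        if name.strip().lower() == package.lower():
            return True
    return False


# mirrors the environment defined in the cell above
example_env = {
    "name": "sklearn-chassis",
    "channels": ["conda-forge"],
    "dependencies": [
        "python=3.8.5",
        {"pip": ["fastai", "numpy", "chassisml", "pandas"]},
    ],
}

if not has_pip_package(example_env, "chassisml"):
    raise ValueError('"chassisml" is missing from the pip dependencies')
```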

## Publish model to Docker
Publishing requires a model name, a model version, and your Docker Hub credentials.

In [22]:
MODEL_NAME = "Fast AI Salary Prediction"
start_time = time.time()
response = chassis_model.publish(
    model_name=MODEL_NAME,
    model_version="0.0.1",
    registry_user=dockerhub_user,
    registry_pass=dockerhub_pass,
    conda_env=env
)

Starting build job... Ok!


In [23]:
job_id = response.get('job_id')
final_status = chassis_client.block_until_complete(job_id)
end_time = time.time()
if final_status['status']['succeeded'] == 1:
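    # derive the repository name from the model name, e.g. "Fast AI" -> "fast-ai"
    image_name = "-".join(MODEL_NAME.lower().split(" "))
    elapsed_minutes = round((end_time - start_time) / 60, 5)
    print("Job Completed in {} minutes.\n\nView your new container image here: https://hub.docker.com/repository/docker/{}/{}".format(elapsed_minutes, dockerhub_user, image_name))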
else:
    print("Job Failed. See logs below:\n\n{}".format(final_status['logs']))

Job Completed in 11.83846 minutes.

View your new container image here: https://hub.docker.com/repository/docker/bmunday131/fast-ai-salary-prediction
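
The repository URL printed above is derived from the model name by lowercasing it and joining the words with hyphens. The sketch below isolates that step. `image_url` is a hypothetical helper that mirrors the notebook's own string formatting, and `someuser` is a placeholder username. How the Chassis service itself names the image is assumed from the output shown, not confirmed here.

```python
def image_url(registry_user, model_name):
    """Build the Docker Hub URL the same way the publishing cell formats it."""
    # lowercase the model name and replace spaces with hyphens
    repo = "-".join(model_name.lower().split(" "))
    return "https://hub.docker.com/repository/docker/{}/{}".format(registry_user, repo)


# "someuser" is a placeholder, not a real account
print(image_url("someuser", "Fast AI Salary Prediction"))
```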
