## Chassis Example Notebooks
Welcome to the examples section for [Chassis](https://chassis.ml), which contains notebooks that auto-containerize models built using the most common machine learning (ML) frameworks. 

#### What is Chassis?
Chassis allows you to automatically create a Docker container from your model code and push that container image to a Docker registry. All you need is your model loaded into memory and a few lines of Chassis code! Our example bank is here to provide reference examples for many common ML frameworks.  

Can't find the framework you are looking for or need help? Fork this repository and open a PR, or list the desired framework in a new issue. We're always interested in growing this example bank! 

The primary maintainers of Chassis also actively monitor our [Discord Server](https://discord.gg/cHpzY9yCcM), so feel free to join and ask any questions you might have. We'll be there to respond and help out promptly.

In [1]:
import getpass
import tempfile
from shutil import rmtree

import chassisml
import numpy as np
import onnx
import onnxruntime as ort
import torch
from torch.nn import functional as F
from transformers import GPT2Tokenizer

## Enter credentials
Enter your Docker Hub username and password. They are used later to push the container image.

In [None]:
dockerhub_user = getpass.getpass('docker hub username')
dockerhub_pass = getpass.getpass('docker hub password')

## Load ONNX Model and Test Locally
This model was downloaded from the [ONNX Model Zoo](https://github.com/onnx/models/tree/master/text/machine_comprehension/gpt-2), which contains several pre-trained models saved in the ONNX open standard format.

In [3]:
# load the GPT-2 LM head model from the ONNX file
head_model = onnx.load("models/head_model.onnx")

# check that the ONNX file is a valid model
onnx.checker.check_model(head_model)

# load the tokenizer used to encode the input text
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

In [4]:
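# helper functions needed for inference
def flatten(inputs):
    # wraps the input in a list; a single tensor becomes a one-element list
    return [[flatten(i) for i in inputs] if isinstance(inputs, (list, tuple)) else inputs]

def to_numpy(x):
    # convert a torch tensor to an int64 numpy array (numpy arrays pass through unchanged)
    if type(x) is not np.ndarray:
        x = x.detach().cpu().numpy().astype(np.int64) if x.requires_grad else x.cpu().numpy().astype(np.int64)
    return x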
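
Despite its name, `flatten` does not flatten a single tensor; it wraps it in a one-element list, which is then paired with the model's input names. The sketch below illustrates this with plain Python values. `FakeInput`, the input name `"input1"`, and the string placeholder are assumptions made for illustration, not part of the notebook or of onnxruntime.

```python
# flatten is copied from the cell above; everything else is illustrative.
def flatten(inputs):
    return [[flatten(i) for i in inputs] if isinstance(inputs, (list, tuple)) else inputs]


class FakeInput:
    """Hypothetical stand-in for one entry of session.get_inputs()."""

    def __init__(self, name):
        self.name = name


fake_inputs = [FakeInput("input1")]  # assumed input name
prev = "token-tensor"                # placeholder for the real torch tensor

# a non-list value is wrapped in a one-element list
wrapped = flatten(prev)

# pair each wrapped value with the input name at the same position
feed = {fake_inputs[i].name: value for i, value in enumerate(wrapped)}

print(wrapped)
print(feed)
```

With a single model input, the resulting dictionary has exactly one entry, which is the shape of argument that `session.run` receives in the cells below.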

#### Local Test

In [12]:
batch_size = 1
length = 10  # number of tokens to generate

text = "Today is a great day to learn about transformers. And Chassisml makes it really easy to package models"
tokens = np.array(tokenizer.encode(text, add_special_tokens=True))
tensors = torch.tensor([[tokens]])  # shape: (batch, 1, sequence)
prev = tensors
output = tensors.squeeze(0)

# create the inference session once and reuse it for every step
session = ort.InferenceSession("models/head_model.onnx")

for _ in range(length):
    if len(prev.shape) == 2:
        prev = prev.unsqueeze(0)
    ort_inputs = {session.get_inputs()[j].name: to_numpy(x) for j, x in enumerate(flatten(prev))}
    outputs = session.run(None, ort_inputs)
    logits = torch.tensor(outputs[0]).squeeze(0)
    logits = logits[:, -1, :]
    probs = F.softmax(logits, dim=-1)
    # greedy decoding: keep only the most likely next token
    _, prev = torch.topk(probs, k=1, dim=-1)
    output = torch.cat((output, prev), dim=1)

# drop the prompt tokens and decode only the generated ones
output = output[:, len(tokens):].tolist()
output_text = []
for i in range(batch_size):
    text = tokenizer.decode(output[i])
    output_text.append(text)
    print(text)

 in the the the the the the the the the
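
The loop above is a form of greedy decoding: at each step it applies a softmax to the logits and keeps the single most likely token. After the first step, only the newly chosen token is fed back to the model, which may contribute to repetitive output like the line above. The sketch below contrasts feeding the full running sequence with feeding only the last token. `toy_next_token_logits`, `VOCAB`, and the `full_context` flag are hypothetical and stand in for the real ONNX model; this is an illustration under those assumptions, not a drop-in replacement.

```python
import math

VOCAB = ["the", "cat", "sat", "down", "."]  # toy vocabulary (assumption)


def toy_next_token_logits(context_ids):
    """Hypothetical model: the favored token depends on how much context it sees."""
    target = len(context_ids) % len(VOCAB)
    return [5.0 if i == target else 0.0 for i in range(len(VOCAB))]


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_generate(prompt_ids, steps, full_context=True):
    sequence = list(prompt_ids)
    for step in range(steps):
        # first step always sees the whole prompt; later steps depend on the flag
        context = sequence if (full_context or step == 0) else sequence[-1:]
        probs = softmax(toy_next_token_logits(context))
        next_id = max(range(len(probs)), key=probs.__getitem__)  # greedy pick
        sequence.append(next_id)
    return sequence[len(prompt_ids):]


with_context = greedy_generate([0, 1], steps=3, full_context=True)
last_only = greedy_generate([0, 1], steps=3, full_context=False)

print(" ".join(VOCAB[i] for i in with_context))
print(" ".join(VOCAB[i] for i in last_only))
```

In this toy setup the last-token-only variant gets stuck repeating one token, while the full-context variant keeps moving. Whether the same change helps with the real model would need to be checked against its expected inputs.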


## Write process function

* Must take bytes as input
* Preprocess bytes, run inference, postprocess model output, return results
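
The two requirements above can be sketched with a minimal, self-contained function. `fake_model` and the `"length"` field are hypothetical placeholders used only to show the bytes-in, dictionary-out shape; the real `process` function for this model follows in the next cell.

```python
import json


def fake_model(words):
    """Hypothetical stand-in for real inference: returns the length of each word."""
    return [len(word) for word in words]


def process(input_bytes):
    # preprocess: decode the raw bytes into text and split into words
    text = input_bytes.decode("utf-8")
    words = text.split()

    # run inference with the placeholder model
    scores = fake_model(words)

    # postprocess: build a JSON-serializable result
    return {"data": {"result": [{"word": w, "length": n} for w, n in zip(words, scores)]}}


result = process(b"hello chassis")
print(json.dumps(result))
```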

In [14]:
def process(input_bytes):
    # fixed generation setting
    length = 10  # number of tokens to generate

    # save the in-memory model to a temporary file so onnxruntime can load it
    tmp_dir = tempfile.mkdtemp()
    try:
        model_path = "{}/head_model.onnx".format(tmp_dir)
        onnx.save(head_model, model_path)
        # create the inference session once and reuse it for every step
        session = ort.InferenceSession(model_path)

        # preprocess: decode bytes to text and tokenize
        text_input = input_bytes.decode()
        tokens = np.array(tokenizer.encode(text_input, add_special_tokens=True))
        tensors = torch.tensor([[tokens]])

        # run inference: greedily pick the most likely next token at each step
        prev = tensors
        output = tensors.squeeze(0)
        for _ in range(length):
            if len(prev.shape) == 2:
                prev = prev.unsqueeze(0)
            ort_inputs = {session.get_inputs()[j].name: to_numpy(x) for j, x in enumerate(flatten(prev))}
            outputs = session.run(None, ort_inputs)
            logits = torch.tensor(outputs[0]).squeeze(0)
            logits = logits[:, -1, :]
            probs = F.softmax(logits, dim=-1)
            _, prev = torch.topk(probs, k=1, dim=-1)
            output = torch.cat((output, prev), dim=1)
    finally:
        # remove the temp directory even if inference fails
        rmtree(tmp_dir)

    # postprocess: drop the prompt tokens and decode the generated ones (batch size is 1)
    output = output[:, len(tokens):].tolist()
    text_full = tokenizer.decode(output[0])

    # split into words, dropping the first element
    # (an empty string when the generated text starts with a space)
    output_text = text_full.split(" ")[1:]

    # format results
    structured_result = {
        "data": {
            "result": {"nextWordPredictions": [{"word_{}".format(i): text_pred} for i, text_pred in enumerate(output_text)]},
            "combined": text_input + text_full
        }
    }
    return structured_result

## Initialize Chassis Client
We'll use this client to interact with the Chassis service.

In [15]:
chassis_client = chassisml.ChassisClient("http://localhost:5000")

## Create and test Chassis model
* Requires the `process` function defined above, passed as `process_fn`

In [16]:
# create Chassis model
chassis_model = chassis_client.create_model(process_fn=process)

# test Chassis model locally (can pass a filepath, BufferedReader, bytes, or text here)
sample_filepath = 'data/sample_text.txt'
results = chassis_model.test(sample_filepath)
print(results)

b'{"data":{"result":{"nextWordPredictions":[]},"combined":"Today is a great day to learn about transformers. And Chassisml makes it really easy to package models.\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n"}}'
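
The empty `nextWordPredictions` list in this output is consistent with how `process` formats its result: it calls `text_full.split(" ")[1:]`, which drops the first element and leaves nothing when the generated text contains no spaces, as with the run of newlines seen in `combined`. The sketch below reproduces only that formatting step; `to_predictions` is a hypothetical helper name, not part of the notebook.

```python
def to_predictions(text_full):
    # same split-and-drop-first logic as the formatting step in process
    words = text_full.split(" ")[1:]
    return [{"word_{}".format(i): word} for i, word in enumerate(words)]


# text that starts with a space: the dropped first element is an empty string
spaced = to_predictions(" in the the")

# text with no spaces at all: the only element is dropped, leaving an empty list
newlines_only = to_predictions("\n" * 10)

print(spaced)
print(newlines_only)
```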


In [None]:
# test environment and model within Chassis service, must pass filepath here:

# dry run before build
test_env_result = chassis_model.test_env(sample_filepath)
print(test_env_result)

## Publish model to Docker
Provide a model name, a model version, and your Docker Hub credentials.

In [17]:
response = chassis_model.publish(
    model_name="ONNX GPT2 Word Prediction",
    model_version="0.0.1",
    registry_user=dockerhub_user,
    registry_pass=dockerhub_pass
)

job_id = response.get('job_id')
final_status = chassis_client.block_until_complete(job_id)

Starting build job... Ok!


In [18]:
# fetch the job status once and reuse it
job_status = chassis_client.get_job_status(job_id)
if job_status["result"] is not None:
    print("New model URL: {}".format(job_status["result"]["container_url"]))
else:
    print("Chassis job failed \n\n {}".format(job_status))

New model URL: https://integration.modzy.engineering/models/txje9pp5co/0.0.1
