# CentML Yoxall Demo

In this demo we will take a <b>ResNet-50</b> model in ONNX format and optimize it with the CentML APIs. Once the ONNX model is optimized, we will compare the performance of the optimized model against the original PyTorch model.

## Export PyTorch Model to ONNX

We will use the open-source PyTorch <b>ResNet-50</b> model for this demo and export it to ONNX.
We have also created a `params.json` file, which contains the following data about the input shape and data type:
```
[
    {
        "input_shape":"1,3,224,224",
        "dtype":"float16"
    }
]
```
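One way to produce this file from Python is sketched below. The file name `params.json` matches the path passed to `optimize` later in this demo; the shape check at the end is only an illustration and is not part of the CentML API.

```python
import json

# Input description copied from the JSON shown above
params = [{"input_shape": "1,3,224,224", "dtype": "float16"}]

# Write it next to the notebook; later cells read "./params.json"
with open("params.json", "w") as f:
    json.dump(params, f, indent=4)

# Parse the comma-separated shape string back into integers as a sanity check
shape = tuple(int(dim) for dim in params[0]["input_shape"].split(","))
print(shape)
```

The parsed shape should match the dummy input used for the ONNX export in the next cell.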

In [None]:
import torch
import onnx
from torchvision.models import resnet50, ResNet50_Weights

model = resnet50(weights=ResNet50_Weights.IMAGENET1K_V2).eval().half().cuda()
dummy_input = torch.randn(1, 3, 224, 224).to(torch.float16).cuda()

input_names = [ "actual_input" ]
output_names = [ "output" ]
torch.onnx.export(model,
                 dummy_input,
                 "./model.onnx",
                 verbose=False,
                 input_names=input_names,
                 output_names=output_names,
                 export_params=True,
                 )

## Optimize model

In [None]:
from erin_server.optimize import optimize

optimize(onnx_path="./model.onnx",
         params_file="./params.json",
         outdir="./output")

## Wait for optimization to finish

<b>An optimization task can take up to several hours.</b>
We can check the status of an optimization job with the status API, using the optimization task ID from the step above.
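Since a job can run for hours, it may be convenient to poll for completion rather than check by hand. The sketch below is a generic polling loop, not the CentML status API: `get_status` is a hypothetical callable standing in for whatever the status API provides, and the state names `"done"` and `"failed"` are assumptions as well.

```python
import time

def wait_for_task(get_status, task_id, poll_seconds=60.0, timeout_seconds=6 * 3600):
    """Poll get_status(task_id) until it reports a terminal state.

    get_status is a hypothetical stand-in for the status API, and the
    terminal state names below are assumed for illustration.
    """
    terminal_states = {"done", "failed"}
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status(task_id)
        if status in terminal_states:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still {status!r} after {timeout_seconds}s")
        time.sleep(poll_seconds)

# Stand-in status function for illustration: reports "running" twice, then "done"
fake_states = iter(["running", "running", "done"])
result = wait_for_task(lambda task_id: next(fake_states), "task-123", poll_seconds=0.0)
print(result)
```

In real use, `get_status` would wrap the actual status call and `poll_seconds` would stay at a minute or more so the service is not queried too often.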

## Load optimized model and run CentML benchmark

In [None]:
%%time

import erin
import torch
import time
import numpy as np
from torchvision import transforms

# Set file paths
params = "./params.json"
onnx_path = "./model.onnx"

# Set the model in Hidet/Erin
model = erin.create_model(onnx_path, params, './model')

np_payload = np.random.rand(1,3,224, 224).astype("float16")
hidet_tensor = erin.from_numpy(np_payload).cuda()

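# Configure number of iterations to run here
NUM_ITERATIONS = 100

hidet_time_durations = []
for i in range(NUM_ITERATIONS):
    # Start time (perf_counter is a monotonic, high-resolution clock)
    start_time = time.perf_counter()

    # Prediction tensor
    # Note: this assumes predict() returns only after the GPU work has finished;
    # if it returns earlier, the measured time undercounts the real latency
    output = model.predict(hidet_tensor)

    # End time
    end_time = time.perf_counter()

    duration = end_time - start_time
    hidet_time_durations.append(duration)

print("Average time: {:0.4f}s".format(sum(hidet_time_durations)/len(hidet_time_durations)))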
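
The first few calls on a GPU often include one-time setup costs, so it can help to run some untimed warm-up iterations before recording. The helper below is a minimal sketch of that idea; `time_calls` and its parameters are names introduced here for illustration and are not part of `erin`.

```python
import time

def time_calls(fn, iterations=100, warmup=10):
    """Call fn() repeatedly and return per-call durations in seconds.

    The first `warmup` calls are executed but not recorded.
    """
    for _ in range(warmup):
        fn()
    durations = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations

# Stand-in workload for illustration; in the notebook this would be
# lambda: model.predict(hidet_tensor)
durations = time_calls(lambda: sum(range(1000)), iterations=5, warmup=2)
print(len(durations))
```

Using the same helper for both the CentML and the PyTorch runs keeps the two measurements comparable.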

## Run PyTorch benchmark

In [None]:
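%%time

import time

import torch
from torchvision.models import resnet50, ResNet50_Weights

# Reuses np_payload and NUM_ITERATIONS from the CentML benchmark cell above
model = resnet50(weights=ResNet50_Weights.IMAGENET1K_V2).eval().half().cuda()
pytorch_tensor = torch.from_numpy(np_payload).cuda()

pytorch_time_durations = []

with torch.no_grad():  # inference only, so skip autograd bookkeeping
    for i in range(NUM_ITERATIONS):
        # CUDA kernels launch asynchronously, so synchronize before reading the clock
        torch.cuda.synchronize()
        start_time = time.perf_counter()

        # Prediction tensor
        output = model(pytorch_tensor)

        # Wait for the GPU to finish before taking the end time
        torch.cuda.synchronize()
        end_time = time.perf_counter()

        duration = end_time - start_time
        pytorch_time_durations.append(duration)

print("Average time: {:0.4f}s".format(sum(pytorch_time_durations)/len(pytorch_time_durations)))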
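
To put the two runs side by side, the per-iteration timings can be reduced to a summary and a speedup ratio. This is a minimal sketch: `summarize` and `speedup` are helper names introduced here, and the numbers in the example are placeholders, not measured results.

```python
import statistics

def summarize(durations):
    """Return mean and median latency in seconds."""
    return {
        "mean": statistics.mean(durations),
        "median": statistics.median(durations),
    }

def speedup(baseline_durations, optimized_durations):
    """How many times faster the optimized run is, based on median latency."""
    return statistics.median(baseline_durations) / statistics.median(optimized_durations)

# Placeholder timings in seconds, NOT measured results; in the notebook, pass
# pytorch_time_durations and hidet_time_durations instead
example_pytorch = [0.008, 0.010, 0.012]
example_centml = [0.004, 0.005, 0.006]

print(summarize(example_pytorch))
print("Speedup: {:.2f}x".format(speedup(example_pytorch, example_centml)))
```

The median is used for the ratio because it is less sensitive than the mean to a few slow outlier iterations.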