# Verify the Correctness of Exported Model and Compare the Performance

We export an ONNX model from PyTorch and use Caffe2 and TensorFlow as backends.
We then compare the outputs and the performance of the three models.

The ONNX tutorial "Verify the Correctness of Exported Model and Compare the Performance" uses only Caffe2 as a backend, but it fails when running the Caffe2 model. This notebook uses a workaround to correct that issue. There are also some drawbacks of 

Reference: https://github.com/onnx/tutorials/blob/master/tutorials/CorrectnessVerificationAndPerformanceComparison.ipynb

In [3]:
import warnings
warnings.filterwarnings('ignore')

import tensorflow.python.util.deprecation as deprecation
deprecation._PRINT_DEPRECATION_WARNINGS = False

import numpy as np
import torch
import onnx
from onnx_tf.backend import prepare
import torch.nn as nn
import torch.nn.functional as F
import time

from caffe2.proto import caffe2_pb2
from caffe2.python import core
from torch.autograd import Variable
from caffe2.python.onnx.backend import Caffe2Backend
from caffe2.python.onnx.helper import c2_native_run_net, save_caffe2_net, load_caffe2_net

## Build MNIST Model

In [4]:
class MNIST(nn.Module):

    def __init__(self):
        super(MNIST, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv2_drop = nn.Dropout2d()
        self.fc1 = nn.Linear(320, 50)
        self.fc2 = nn.Linear(50, 10)

    def forward(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        x = x.view(-1, 320)
        x = F.relu(self.fc1(x))
        x = F.dropout(x, training=self.training)
        x = self.fc2(x)
        # Pass dim explicitly; relying on the implicit dimension is deprecated.
        return F.log_softmax(x, dim=1)
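
The `320` used in `x.view(-1, 320)` and `nn.Linear(320, 50)` follows from the layer shapes for a 28x28 input. The sketch below reproduces that arithmetic in plain Python. The helper names `conv_out` and `pool_out` are hypothetical and not part of PyTorch; they assume stride 1 and no padding for the convolutions (the `nn.Conv2d` defaults) and a pooling stride equal to the kernel size (the `F.max_pool2d` default).

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel):
    """Spatial output size of max pooling with stride == kernel, no padding."""
    return (size - kernel) // kernel + 1


size = 28                  # MNIST images are 28x28
size = conv_out(size, 5)   # conv1, kernel_size=5 -> 24
size = pool_out(size, 2)   # max_pool2d(..., 2)   -> 12
size = conv_out(size, 5)   # conv2, kernel_size=5 -> 8
size = pool_out(size, 2)   # max_pool2d(..., 2)   -> 4

out_channels = 20          # output channels of conv2
flat_features = out_channels * size * size
print(flat_features)       # 320
```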

## Generate ONNX model from PyTorch

In [5]:
# Create a pytorch model.
pytorch_model = MNIST()
pytorch_model.train(False)

# Generate dummy inputs.

inputs = (Variable(torch.randn(3, 1, 28, 28), requires_grad=True), )

# Run the PyTorch exporter to generate an ONNX model.
torch.onnx.export(pytorch_model, inputs, "mnist_pytorch.onnx")

## Load ONNX Model

In [6]:
# Load the onnx model
onnx_model = onnx.load('mnist_pytorch.onnx')

# Check whether the onnx_model is valid or not.
print("Check the ONNX model.")
onnx.checker.check_model(onnx_model)

Check the ONNX model.


## Build Caffe2 Model from ONNX

Now that we have an ONNX model, let's turn it into a Caffe2 one.

In [7]:
# Convert the ONNX model to a Caffe2 model.
print("Convert the model to a Caffe2 model.")
init_net, predict_net = Caffe2Backend.onnx_graph_to_caffe2_net(onnx_model, device="CPU")

Convert the model to a Caffe2 model.


Caffe2 takes a list of NumPy arrays as inputs, so we need to convert the inputs.

In [8]:
# Prepare the inputs for Caffe2.
caffe2_inputs = [var.data.numpy() for var in inputs]

## Build TensorFlow Model from ONNX

In [9]:
# Prepare inputs for TensorFlow
tensorflow_inputs = [var.detach().numpy() for var in inputs]

In [10]:
# Convert the ONNX model to a TensorFlow model.
tf_rep = prepare(onnx_model)

## Running different models

Run the PyTorch, Caffe2 and TensorFlow models separately and collect the results.

In [11]:
# Compute the results using the PyTorch model.
pytorch_results = pytorch_model(*inputs)

# Compute the results using the Caffe2 model.
# c2_native_run_net returns two objects: the workspace and the results.
# ws is a Caffe2 workspace, an instance of the Workspace class from onnx_caffe2.workspace.
# This class makes it possible to work with workspaces more locally, without forgetting
# to deallocate everything at the end.
ws, caffe2_results = c2_native_run_net(init_net, predict_net, caffe2_inputs)

# Compute the results using the TensorFlow model.
tensorflow_results = tf_rep.run(tensorflow_inputs)

## Correctness Check of different models

Now that we have the results, let's check the correctness of the exported model.
If no assertion fails, the exported model matches the PyTorch output to the expected precision.

In [12]:
# Check that the Caffe2 and TensorFlow outputs match PyTorch (and each other) to the expected decimal precision.
expected_decimal = 5
for p, c, t in zip([pytorch_results], caffe2_results, tensorflow_results):
    np.testing.assert_almost_equal(p.data.cpu().numpy(), c, decimal=expected_decimal)
    np.testing.assert_almost_equal(p.data.cpu().numpy(), t, decimal=expected_decimal)
    np.testing.assert_almost_equal(c, t, decimal=expected_decimal)
print("The exported model achieves {}-decimal precision.".format(expected_decimal))

The exported model achieves 5-decimal precision.
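
To make "5-decimal precision" concrete: NumPy documents the criterion of `assert_almost_equal` as `abs(desired - actual) < 1.5 * 10**(-decimal)`. The sketch below applies the same criterion in plain Python to flat lists. The helper names and the sample values are made up for illustration and are not outputs of the models above.

```python
def max_abs_diff(actual, desired):
    """Largest element-wise absolute difference between two flat sequences."""
    return max(abs(a - d) for a, d in zip(actual, desired))


def agrees_to_decimal(actual, desired, decimal):
    """Apply the tolerance documented for numpy.testing.assert_almost_equal."""
    return max_abs_diff(actual, desired) < 1.5 * 10 ** (-decimal)


# Illustrative values only (assumption): a reference output and a backend
# output that differ by a few units in the sixth decimal place.
reference = [-2.302585, -2.291034, -2.310772]
backend = [-2.302586, -2.291030, -2.310779]

print(agrees_to_decimal(backend, reference, decimal=5))  # tolerance 1.5e-5
print(agrees_to_decimal(backend, reference, decimal=6))  # tolerance 1.5e-6
```

Reporting the maximum absolute difference alongside the pass/fail result can help when choosing a tolerance, since it shows how much headroom a given `decimal` leaves.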


## Performance Check of different models

The following code measures the performance of the PyTorch, Caffe2 and TensorFlow models.
We report:
- Execution time per iteration
- Iterations per second

### Definition of a function that measures PyTorch model performance (Baseline)

In [13]:
def performance_pytorch_model(model, inputs, warmup_iters=3, main_iters=10):
    '''
     Run the model several times, and measure the execution time.
     Print the execution time per iteration (millisecond) and the number of iterations per second.
    '''
    for _i in range(warmup_iters):
        model(*inputs)
        
    total_time = 0.0
    
    for _i in range(main_iters):
        ts = time.time()
        model(*inputs)
        te = time.time()
        total_time += te - ts
    
    print("The PyTorch model execution time per iter is {} milliseconds, "
          "{} iters per second.".format(total_time / main_iters * 1000,
                                        main_iters / total_time))

### Definition of a function that measures Caffe2 model performance

We use this method instead of the built-in function benchmark_caffe2_model in caffe2.python.onnx.helper so that the same benchmarking procedure is applied to all frameworks.
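
One way to guarantee an identical procedure across frameworks is to time an arbitrary callable with a single helper. The following is a minimal sketch of that idea, not the method used in this notebook: `benchmark` is a hypothetical name, it returns the two metrics instead of printing them, and it uses `time.perf_counter()` rather than `time.time()` because the former is intended for measuring short intervals.

```python
import time


def benchmark(run_once, warmup_iters=3, main_iters=10):
    """Time a zero-argument callable.

    Returns (milliseconds per iteration, iterations per second).
    """
    # Warm-up runs are executed but not timed.
    for _ in range(warmup_iters):
        run_once()

    total_time = 0.0
    for _ in range(main_iters):
        ts = time.perf_counter()
        run_once()
        total_time += time.perf_counter() - ts

    return total_time / main_iters * 1000, main_iters / total_time


# Stand-in workload (assumption) so the sketch runs without any framework.
ms_per_iter, iters_per_second = benchmark(
    lambda: sum(i * i for i in range(10000)), main_iters=20)
print("{:.3f} ms per iter, {:.1f} iters per second".format(
    ms_per_iter, iters_per_second))
```

With the objects defined in this notebook, the callables would be along the lines of `lambda: pytorch_model(*inputs)`, `lambda: c2_native_run_net(init_net, predict_net, caffe2_inputs)` and `lambda: tf_rep.run(tensorflow_inputs)`.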

In [14]:
def performance_caffe2_model(init_net, predict_net, inputs, warmup_iters=3, main_iters=10):
    '''
     Run the model several times, and measure the execution time.
     Print the execution time per iteration (millisecond) and the number of iterations per second.
    '''
    for _i in range(warmup_iters):
        ws, caffe2_results = c2_native_run_net(init_net, predict_net, inputs)    
    
    total_time = 0.0
    for _i in range(main_iters):
        ts = time.time()
        ws, caffe2_results = c2_native_run_net(init_net, predict_net, inputs)
        te = time.time()
        total_time += te - ts
    print("The Caffe2 model execution time per iter is {} milliseconds, "
          "{} iters per second.".format(total_time / main_iters * 1000,
                                        main_iters / total_time))

### Definition of a function that measures TensorFlow model performance

In [15]:
def performance_tensorflow_model(model, inputs, warmup_iters=3, main_iters=10):
    '''
     Run the model several times, and measure the execution time.
     Print the execution time per iteration (millisecond) and the number of iterations per second.
     model is the backend representation returned by onnx_tf's prepare().
    '''

    for _i in range(warmup_iters):
        output = model.run(inputs)
    
    total_time = 0.0
    for _i in range(main_iters):
        ts = time.time()
        output = model.run(inputs)
        te = time.time()
        total_time += te - ts
    print("The Tensorflow model execution time per iter is {} milliseconds, "
          "{} iters per second.".format(total_time / main_iters * 1000,
                                        main_iters / total_time))

## Evaluation of results

In [16]:
performance_pytorch_model(pytorch_model, inputs, warmup_iters=3, main_iters=100)

The PyTorch model execution time per iter is 0.6457304954528809 milliseconds, 1548.633690125868 iters per second.


In [17]:
performance_caffe2_model(init_net, predict_net, caffe2_inputs, warmup_iters=3, main_iters=100)

The Caffe2 model execution time per iter is 2.544734477996826 milliseconds, 392.96830716389076 iters per second.


In [18]:
performance_tensorflow_model(tf_rep, tensorflow_inputs, warmup_iters=3, main_iters=100)

The Tensorflow model execution time per iter is 18.066928386688232 milliseconds, 55.3497516897672 iters per second.
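
To compare the three measurements directly, the reported times can be expressed relative to the PyTorch baseline. The sketch below only does arithmetic on the numbers printed above. These come from the run recorded in this notebook with a batch of 3 inputs, so the ratios are indicative rather than a general statement about the frameworks.

```python
# Milliseconds per iteration, copied from the outputs above.
ms_per_iter = {
    "PyTorch": 0.6457304954528809,
    "Caffe2": 2.544734477996826,
    "TensorFlow": 18.066928386688232,
}

baseline = ms_per_iter["PyTorch"]
slowdown = {name: ms / baseline for name, ms in ms_per_iter.items()}

for name, factor in slowdown.items():
    print("{}: {:.2f}x the PyTorch time per iteration".format(name, factor))
```

In this run, Caffe2 takes roughly 4 times and TensorFlow roughly 28 times as long per iteration as PyTorch.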
