
In [None]:
!pip install onnx2pytorch onnx onnxruntime==1.4.0 onnxruntime-gpu onnx2keras 

Collecting onnx2pytorch
  Downloading onnx2pytorch-0.4.1-py3-none-any.whl (44 kB)
Collecting onnx
  Downloading onnx-1.10.2-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (12.7 MB)
Collecting onnxruntime==1.4.0
  Downloading onnxruntime-1.4.0-cp37-cp37m-manylinux2010_x86_64.whl (4.4 MB)
Collecting onnxruntime-gpu
  Downloading onnxruntime_gpu-1.10.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (104.8 MB)
Collecting onnx2keras
  Downloading onnx2keras-0.0.24.tar.gz (20 kB)
Building wheels for collected packages: onnx2keras
  Building wheel for onnx2keras (setup.py) ... done
  Created wheel for onnx2keras: filename=onnx2keras-0.0.24-py3-none-any.whl size=24593 sha256=7a754a4ca9ce8a11a986ea96

In [None]:
!pip install gdown



In [None]:
# This is a copy of supercombo.onnx that I downloaded earlier from
# https://github.com/commaai/openpilot/blob/72a736f90e57a7d5845891ea34b17360b6f684d0/models/supercombo.onnx
# I couldn't fetch it with git today: Comma moved the models to LFS, and when I try
# to download the files through LFS, GitHub says the repository has run out of its quota.
!gdown https://drive.google.com/uc?id=14RmJCLQq8IdjC5KiNH0XL5tZXfYEsIzP

Downloading...
From: https://drive.google.com/uc?id=14RmJCLQq8IdjC5KiNH0XL5tZXfYEsIzP
To: /content/supercombo.onnx
100% 56.7M/56.7M [00:00<00:00, 122MB/s] 


In [None]:
import os, sys
import numpy as np
import torch
import matplotlib.pyplot as plt
from matplotlib.cm import get_cmap

import onnxruntime as rt
import onnx
import onnx2pytorch
import onnx2keras

## Convert ONNX model

In [None]:
path_to_onnx_model = 'supercombo.onnx'

model = onnx.load(path_to_onnx_model)

input_names = [node.name for node in model.graph.input]
output_names = [node.name for node in model.graph.output]

# onnxruntime
providers = ['CUDAExecutionProvider', 'CPUExecutionProvider']
onnxruntime_model = rt.InferenceSession(path_to_onnx_model, providers=providers)

# pytorch
# fall back to the CPU when no GPU runtime is attached
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
pytorch_model = onnx2pytorch.ConvertModel(model).to(device)
pytorch_model.requires_grad_(False)
pytorch_model.eval()

# keras
keras_model = onnx2keras.onnx_to_keras(model, input_names, verbose=False)

  layer.weight.data = torch.from_numpy(numpy_helper.to_array(weight))
Trying to convert multi-output node
Trying to convert multi-output node


## Run models

In [None]:
torch_inputs = {
    'input_imgs': torch.ones((1, 12, 128, 256), dtype=torch.float32).to(device),
    'desire': torch.zeros((1, 8), dtype=torch.float32).to(device),
    'traffic_convention': torch.tensor([0, 1], dtype=torch.float32).reshape(1, 2).to(device),
    'initial_state': torch.zeros((1, 512), dtype=torch.float32).to(device),
}

onnx_inputs = {
    'input_imgs': np.ones((1, 12, 128, 256), dtype=np.float32),
    'desire': np.zeros((1, 8), dtype=np.float32),
    'traffic_convention': np.array([0, 1], dtype=np.float32).reshape(1, 2),
    'initial_state': np.zeros((1, 512), dtype=np.float32),
}

keras_inputs = {
    'input_imgs': np.ones((1, 12, 128, 256), dtype=np.float32),
    'desire': np.zeros((1, 8), dtype=np.float32),
    'traffic_convention': np.array([0, 1], dtype=np.float32).reshape(1, 2),
    'initial_state': np.zeros((1, 512), dtype=np.float32),
}

# verify inputs are identical
for key in torch_inputs.keys():
    torch_val = torch_inputs[key].detach().cpu().numpy()
    onnx_val = onnx_inputs[key]
    keras_val = keras_inputs[key]

    np.testing.assert_equal(torch_val, onnx_val)
    np.testing.assert_equal(torch_val, keras_val)


# run inference
keras_outs = keras_model(keras_inputs)
torch_outs = pytorch_model(**torch_inputs)
onnxruntime_outs = onnxruntime_model.run(output_names, onnx_inputs)[0]

keras_outs = keras_outs.numpy()
torch_outs = torch_outs.detach().cpu().numpy()

print('Torch outs:', torch_outs.shape)
print('Keras outs:', keras_outs.shape)
print('onnxruntime outs:', onnxruntime_outs.shape)

Torch outs: (1, 6472)
Keras outs: (1, 6472)
onnxruntime outs: (1, 6472)


### 🛑 The issue:

In [None]:
torch_keras_diff = np.sum(np.abs(torch_outs - keras_outs))
torch_onnx_diff = np.sum(np.abs(torch_outs - onnxruntime_outs))
onnx_keras_diff = np.sum(np.abs(onnxruntime_outs - keras_outs))

# print diffs
print(f'Torch vs Keras: {torch_keras_diff:.3e}')
print(f'Torch vs ONNX: {torch_onnx_diff:.3e}')
print(f'ONNX vs Keras: {onnx_keras_diff:.3e}')

Torch vs Keras: 3.441e+01
Torch vs ONNX: 3.439e+01
ONNX vs Keras: 1.411e-02


## Fix found! 🎉

In [None]:
# the only change: make this ELU out-of-place so it no longer overwrites its input
pytorch_model.Elu_907.inplace = False

In [None]:
# run inference
keras_outs = keras_model(keras_inputs)
torch_outs = pytorch_model(**torch_inputs)
onnxruntime_outs = onnxruntime_model.run(output_names, onnx_inputs)[0]

keras_outs = keras_outs.numpy()
torch_outs = torch_outs.detach().cpu().numpy()

print('Torch outs:', torch_outs.shape)
print('Keras outs:', keras_outs.shape)
print('onnxruntime outs:', onnxruntime_outs.shape)

Torch outs: (1, 6472)
Keras outs: (1, 6472)
onnxruntime outs: (1, 6472)


In [None]:
torch_keras_diff = np.sum(np.abs(torch_outs - keras_outs))
torch_onnx_diff = np.sum(np.abs(torch_outs - onnxruntime_outs))
onnx_keras_diff = np.sum(np.abs(onnxruntime_outs - keras_outs))

# print diffs
print(f'Torch vs Keras: {torch_keras_diff:.3e}')
print(f'Torch vs ONNX: {torch_onnx_diff:.3e}')
print(f'ONNX vs Keras: {onnx_keras_diff:.3e}')

Torch vs Keras: 1.408e-02
Torch vs ONNX: 8.621e-03
ONNX vs Keras: 1.411e-02
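
The figures above are absolute differences summed over all 6472 output values, so a sum of about 1.4e-02 corresponds to a mean per-element difference of roughly 2e-06. A minimal sketch of a helper that reports the sum, maximum, and mean together is shown here; `summarize_diff` is a hypothetical name that is not part of the notebook, and the arrays are toy data rather than model outputs.

```python
import numpy as np

def summarize_diff(a: np.ndarray, b: np.ndarray) -> dict:
    """Return the summed, maximum, and mean absolute difference of two arrays."""
    abs_diff = np.abs(a - b)
    return {
        'sum': float(abs_diff.sum()),
        'max': float(abs_diff.max()),
        'mean': float(abs_diff.mean()),
    }

# Toy arrays standing in for two framework outputs (not real model data).
a = np.array([[1.0, 2.0, 3.0, 4.0]], dtype=np.float32)
b = np.array([[1.0, 2.0, 3.5, 4.0]], dtype=np.float32)

stats = summarize_diff(a, b)
print(stats)
```

In the notebook this could be called as `summarize_diff(torch_outs, onnxruntime_outs)`. The maximum is useful because a single badly wrong element can hide inside a small-looking sum.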


### Fix explanation

The cause becomes clear when you visualize `supercombo.onnx` in Netron (https://netron.app/):

1. `onnx2pytorch` executes the model sequentially, in the order in which the layers are defined in the ONNX graph. That order is reflected in the ONNX layer names (e.g. `ELU_203` is computed right after `Flatten_202`).
2. At the layer `Flatten_202`, the conv head of the network splits into two branches: one starts with `ELU_203`, the other with `Concat_230`.
3. `ELU_203` overwrites the output of `Flatten_202` in place, because `onnx2pytorch` sets `inplace=True` for ELU by default.
4. `Concat_230` is then computed from the mutated `Flatten_202` output instead of the original values.
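
The four steps above can be reproduced in isolation with plain PyTorch. This is a minimal sketch, not the supercombo graph: the tensor `flat` only stands in for the output of `Flatten_202`, its values are arbitrary, and the zero tensor `extra` is an assumed placeholder for the other concat input.

```python
import torch
import torch.nn as nn

# Stand-in for the output of Flatten_202 (arbitrary values, assumed for illustration).
flat = torch.tensor([[-1.0, 0.5, -2.0]])
original = flat.clone()

# Branch 1: an in-place ELU writes its result into `flat` itself.
branch_elu = nn.ELU(inplace=True)(flat)
flat_was_mutated = not torch.equal(flat, original)

# Branch 2: the concat runs afterwards and sees the mutated tensor.
extra = torch.zeros(1, 1)
branch_concat_bad = torch.cat([flat, extra], dim=1)

# Same two branches with an out-of-place ELU: the shared tensor is untouched.
flat_ok = original.clone()
branch_elu_ok = nn.ELU(inplace=False)(flat_ok)
branch_concat_ok = torch.cat([flat_ok, extra], dim=1)

print('shared tensor mutated:', flat_was_mutated)
print('concat sees original (inplace=True): ', torch.equal(branch_concat_bad[:, :3], original))
print('concat sees original (inplace=False):', torch.equal(branch_concat_ok[:, :3], original))
```

Both ELU results are identical. Only the second branch differs, which is why the error shows up only when a later layer reads the same tensor.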

**Solution**: ensure that operations immediately following a split in the network (any multi-output layer) are **NOT in-place**.
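
When it is not obvious which layers follow a split, a blunter option is to switch off `inplace` everywhere. The sketch below assumes the converted model exposes its activations as ordinary `torch.nn` modules with an `inplace` attribute, as the `pytorch_model.Elu_907.inplace = False` fix above suggests. `disable_inplace` is a hypothetical helper, not part of `onnx2pytorch`, and it is demonstrated on a toy `nn.Sequential` rather than on supercombo.

```python
import torch.nn as nn

def disable_inplace(model: nn.Module) -> int:
    """Set inplace=False on every submodule that has it enabled; return how many changed."""
    changed = 0
    for module in model.modules():
        if getattr(module, 'inplace', False) is True:
            module.inplace = False
            changed += 1
    return changed

# Toy model used only to show the helper (assumed, not the converted network).
demo = nn.Sequential(
    nn.Linear(4, 4),
    nn.ELU(inplace=True),
    nn.ReLU(inplace=True),
)

n_changed = disable_inplace(demo)
print('modules switched to out-of-place:', n_changed)
```

This changes more layers than the single-layer fix above and costs some extra memory for the intermediate tensors, but it does not depend on knowing where the graph branches.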