[web] ~100 seconds to load model/InferenceSession #11217

Open · josephrocca opened this issue Apr 14, 2022 · 5 comments
Labels
core runtime (issues related to core runtime) · platform:web (issues related to ONNX Runtime web; typically submitted using template)

Comments

josephrocca (Contributor) commented Apr 14, 2022

Describe the bug
I'm using this ONNX file for this browser-based SwinIR super-resolution demo, and it works great. The only problem is that it takes about 100 seconds to load the ONNX model. I think this is because the model file has a very large number of nodes (about 28k according to Netron).
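For reference, the node count can be verified directly with the onnx Python package (a minimal sketch; it assumes the exported file from the command below is available locally):

import onnx

# Load the exported model and count the nodes in its graph
model = onnx.load("003_realSR_BSRGAN_DFO_s64w8_SwinIR-M_x4_GAN.onnx")
print(f"nodes in graph: {len(model.graph.node)}")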

I converted this .pth model using the following code (as documented here):

torch.onnx.export(
    model,
    img_lq,
    "003_realSR_BSRGAN_DFO_s64w8_SwinIR-M_x4_GAN.onnx",
    export_params=True,
    opset_version=12,
    do_constant_folding=True,
    verbose=True,
    input_names=['input'],
    output_names=['output'],
    dynamic_axes={'input': {2: 'h', 3: 'w'}, 'output': {2: 'h', 3: 'w'}},
)

Urgency
None

System information

To Reproduce
Run the following code:

<script src="https://cdn.jsdelivr.net/npm/onnxruntime-web@1.11.0/dist/ort.js"></script>
<script type=module>
  ort.env.wasm.proxy = true;
  let onnxSession = await ort.InferenceSession.create('https://huggingface.co/rocca/swin-ir-onnx/resolve/main/003_realSR_BSRGAN_DFO_s64w8_SwinIR-M_x4_GAN.onnx', { executionProviders: ["wasm"] });
</script>

Or you can just open this page, and the model will start loading (see console): https://josephrocca.github.io/super-resolution-js/
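For comparison, session creation can be timed with the native Python runtime (a sketch; it assumes the .onnx file has been downloaded locally):

import time
import onnxruntime as ort

t0 = time.perf_counter()
session = ort.InferenceSession(
    "003_realSR_BSRGAN_DFO_s64w8_SwinIR-M_x4_GAN.onnx",
    providers=["CPUExecutionProvider"],
)
print(f"session created in {time.perf_counter() - t0:.1f}s")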

Expected behavior
I'm not sure what to expect here, because a model with this many nodes may just inherently take this long to load, but 100 seconds does seem excessive given how quickly the .pth loads in Python (a few seconds at most). The root of the issue could also lie in the behavior of torch.onnx.export. One way to probe the first possibility is sketched below.
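If load-time graph optimization is the bottleneck, it can be run once offline with the onnxruntime Python API and the result serialized, so the optimized file is loaded instead (a sketch; the output filename is illustrative):

import onnxruntime as ort

# Run graph optimizations once, offline, and write out the optimized model
sess_options = ort.SessionOptions()
sess_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL
sess_options.optimized_model_filepath = "swinir_optimized.onnx"  # illustrative name

# Creating the session triggers optimization and saves the optimized graph
session = ort.InferenceSession(
    "003_realSR_BSRGAN_DFO_s64w8_SwinIR-M_x4_GAN.onnx", sess_options
)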

jdluzen commented May 8, 2022

When I try to load this model with the C# wrapper on Win10 x64, I get this:
Microsoft.ML.OnnxRuntime.OnnxRuntimeException: '[ErrorCode:RuntimeException] Exception during initialization: D:\a\_work\1\s\onnxruntime\core\graph\function.cc:462 onnxruntime::FunctionImpl::FunctionImpl status.IsOK() was false. This is an invalid model. Error: two nodes with same node name (Unsqueeze_1122).'
Maybe that is related to the root cause.
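If it helps narrow this down, duplicate node names can be checked for directly with the onnx package (a minimal sketch):

import onnx
from collections import Counter

model = onnx.load("003_realSR_BSRGAN_DFO_s64w8_SwinIR-M_x4_GAN.onnx")
# Count occurrences of each non-empty node name in the graph
counts = Counter(n.name for n in model.graph.node if n.name)
duplicates = [name for name, c in counts.items() if c > 1]
print("duplicate node names:", duplicates)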

josephrocca (Contributor, Author) commented

@jdluzen Hmm that's weird - from what I can see it doesn't appear to be an invalid file. It loads correctly and works fine with the web runtime and on https://netron.app

Searching for Unsqueeze_1122 on Netron only gives one search result:

[screenshot: Netron search results showing a single Unsqueeze_1122 node]

Note that it also takes a very long time to render in Netron due to the large node count. Netron says that there are about 28k nodes (I'll update the first post to mention this).

PINTO0309 commented Jul 6, 2022

There are 41,085 OPs.

$ ssc4onnx --input_onnx_file_path 003_realSR_BSRGAN_DFO_s64w8_SwinIR-M_x4_GAN.onnx
┏━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ OP Type                ┃ OPs        ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Add                    │ 1452       │
│ Cast                   │ 1982       │
│ Concat                 │ 1277       │
│ Constant               │ 13284      │
│ ConstantOfShape        │ 1657       │
│ Conv                   │ 13         │
│ Div                    │ 403        │
│ Equal                  │ 1368       │
│ Erf                    │ 36         │
│ Expand                 │ 1620       │
│ Gather                 │ 1922       │
│ LeakyRelu              │ 4          │
│ MatMul                 │ 216        │
│ Mul                    │ 1521       │
│ Not                    │ 36         │
│ Pad                    │ 1          │
│ Pow                    │ 74         │
│ Range                  │ 1296       │
│ ReduceMean             │ 148        │
│ Reshape                │ 1852       │
│ Resize                 │ 2          │
│ ScatterND              │ 324        │
│ Shape                  │ 4413       │
│ Slice                  │ 1558       │
│ Softmax                │ 36         │
│ Sqrt                   │ 74         │
│ Sub                    │ 118        │
│ Transpose              │ 231        │
│ Unsqueeze              │ 2799       │
│ Where                  │ 1368       │
│ ---------------------- │ ---------- │
│ Total number of OPs    │ 41085      │
│ ====================== │ ========== │
│ Model Size             │ 58.5MiB    │
└────────────────────────┴────────────┘
INFO: file: 003_realSR_BSRGAN_DFO_s64w8_SwinIR-M_x4_GAN.onnx
INFO: producer: pytorch 1.10
INFO: opset: 12
INFO: input_name.1: input shape: [1, 3, 'h', 'w'] dtype: float32
INFO: output_name.1: output shape: ['Sliceoutput_dim_0', 'Sliceoutput_dim_1', 'h', 'w'] dtype: float32
INFO: Finish!

josephrocca (Contributor, Author) commented Jul 6, 2022

Thanks @PINTO0309! I've just done a few tests of load/init times for different runtimes:

I wonder if this is a useful test case for finding bottlenecks in the init. Given that PyTorch handles the init much more quickly, and given that SwinIR is a fairly popular model, it seems worth looking into. Edit: But it could also be a "problem" with the conversion itself, with single ops getting inefficiently expanded into many ops; a simplifier pass (sketched below) might confirm that.
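As a rough check on the conversion theory, onnx-simplifier can fold much of the exported shape-computation subgraph (the Shape/Gather/Unsqueeze/Concat chains that dominate the op counts above) into constants. A sketch, assuming onnxsim is installed and noting that the dynamic h/w axes may limit how much it can fold:

import onnx
from onnxsim import simplify

model = onnx.load("003_realSR_BSRGAN_DFO_s64w8_SwinIR-M_x4_GAN.onnx")
# simplify() folds constant subgraphs and removes redundant ops;
# `check` is True if the simplified model's outputs matched the original's
model_simplified, check = simplify(model)
assert check, "simplified model failed the output check"
onnx.save(model_simplified, "swinir_simplified.onnx")  # illustrative name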

Not sure who to ping here, or if it's appropriate to ping, so apologies if not! @snnn @edgchen1

PINTO0309 commented
The input size is not fixed (the h/w axes are dynamic), so the graph cannot be fully optimized.

It is also true that, even with optimization, loading a protobuf model into onnxruntime is several times slower than loading the weights in PyTorch.

If it were me, I would use TorchScript.
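For illustration, both suggestions might look roughly like this (a sketch; the fixed 1x3x64x64 input and the filenames are assumptions, not from this thread):

import torch

# Option 1: export with a fixed input size (no dynamic_axes), which lets the
# exporter fold most of the shape arithmetic into constants at export time
img_fixed = torch.randn(1, 3, 64, 64)  # assumed tile size
torch.onnx.export(
    model, img_fixed, "swinir_fixed.onnx",
    export_params=True, opset_version=12, do_constant_folding=True,
    input_names=['input'], output_names=['output'],
)

# Option 2: skip ONNX entirely and serialize a traced TorchScript module
traced = torch.jit.trace(model, img_fixed)
traced.save("swinir_traced.pt")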
