
Dynamic ONNX Importer #6351

Merged
merged 25 commits into apache:master from mbrookhart/dynamic_onnx on Oct 3, 2020

Conversation

mbrookhart
Contributor

@mbrookhart mbrookhart commented Aug 27, 2020

Hello Friends,

Over the last couple of months, @electriclilies and I have been working to add more dynamic support to relay ops, to separate the dynamic implementations into a dyn namespace, and to provide a pass for converting ops back to static forms when possible.

The culmination of that work is this PR, which refactors the ONNX importer to directly create dynamic relay graphs instead of using infer_value to make them static in the importer. Longer term, this will allow us to import dynamic models that we can't currently use.

We don't want to cause regressions for anyone, so this PR enables the dynamic_to_static pass by default in the graph runtime. We tested the PR against the ONNX model zoo (https://github.com/onnx/models) and fixed a number of issues in ops that apparently hadn't been tested with dynamic shapes to date.

An added benefit of this PR is that it removes a severe bottleneck in the infer_value calls: models with many dynamic ops import and compile much faster than before. BERT Squad from the ONNX model zoo imports and compiles in ~170s on master vs ~15s with this change.
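As a toy illustration of the dynamic_to_static idea mentioned above (plain Python, not TVM's actual API or IR), an op imported in its dynamic form can be rewritten back to its static form when its shape argument turns out to be a compile-time constant:

```python
# Toy sketch of the dynamic_to_static idea (plain Python, not TVM's API):
# an op imported in its dynamic form is rewritten back to its static form
# when its shape argument turns out to be a compile-time constant.

def dyn_reshape(data, shape):
    """Dynamic form: `shape` may be a runtime value or a constant tuple."""
    return ("dyn.reshape", data, shape)

def to_static(op):
    """Rewrite a dynamic op to its static equivalent when possible."""
    kind, data, shape = op
    if kind == "dyn.reshape" and isinstance(shape, tuple):
        return ("reshape", data, shape)  # constant shape: fold to static op
    return op  # truly dynamic: leave as-is

print(to_static(dyn_reshape("x", (2, 3))))  # ('reshape', 'x', (2, 3))
```

The real pass works on Relay IR and uses structural equality checks, but the control flow is the same: constant arguments fold to static ops, everything else stays dynamic.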

This PR is not yet complete; we're working on adding dynamic strided slice (#6316) to remove the last infer_value calls.

Since we don't want to introduce regressions for anyone, I'd appreciate it if you could test any models you are currently running against this branch and let us know if you run into issues.

Thanks!

cc @masahi @jwfromm @soiferj @siju-samuel Please tag anyone else you think might be interested

@mbrookhart
Contributor Author

cc @zhiics @icemelon9

@mbrookhart mbrookhart force-pushed the mbrookhart/dynamic_onnx branch 2 times, most recently from 4e3bb37 to 8b5899b Compare September 3, 2020 17:45
@mbrookhart
Contributor Author

Thanks to @tmoreau89 for testing some custom models against this branch and finding a regression. Anyone else using ONNX, I'd really appreciate it if you could do the same.

@tqchen
Member

tqchen commented Sep 4, 2020

cc @zhiics @yzhliu

@tmoreau89
Contributor

> Thanks to @tmoreau89 for testing some custom models against this branch and finding a regression. Anyone else using ONNX, I'd really appreciate it if you could do the same.

Happy to! On the custom model I tested, compilation time went from 95.8s down to 1.2s. Nice work!

@zhiics
Member

zhiics commented Sep 4, 2020

@mbrookhart ping me please when it is ready for review

Lily Orth-Smith and others added 6 commits September 11, 2020 10:52
Make OneHot dynamic

Support BatchMatMul with dynamically shaped inputs

fix dynamic broadcast

Add null checks to broadcast_to rel functions

fail more isolated broadcast_to test

use StructuralEqual instead of pointer comparisons in dynamic_to_static pass

add an optional weight freeze argument to onnx importer

convert onnx resize to dynamic op

add dynamic expand to onnx importer

add a shape_func for power

fix BERTSquad, lint

handle onnx graph initializer parameters more intelligently
fix lint

fix Call reference

fix a type issue with expand

fix a bad test refactor

respond to review comments, fix batch matmul tests
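The "add a shape_func for power" commit above is about computing output shapes at runtime for dynamically shaped inputs. As a rough, non-TVM sketch (names and structure are illustrative only), a broadcast shape function for an elementwise op like power follows NumPy-style broadcasting rules:

```python
from itertools import zip_longest

# Rough, non-TVM sketch of what a broadcast shape function computes for an
# elementwise op (like power): the output shape from two input shapes,
# following NumPy-style broadcasting rules.

def broadcast_shape(a, b):
    out = []
    # Walk dimensions right-to-left, padding the shorter shape with 1s.
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x != y and 1 not in (x, y):
            raise ValueError(f"incompatible dimensions {x} and {y}")
        out.append(max(x, y))
    return tuple(reversed(out))

print(broadcast_shape((2, 1, 4), (3, 1)))  # (2, 3, 4)
```

In TVM the real shape_func is compiled and runs on the target at execution time, since the input shapes are only known then.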
@mbrookhart mbrookhart marked this pull request as ready for review September 11, 2020 20:19
@zhiics (Member) left a comment

Thanks for the great effort. Only left some minor comments.

plevel=10,
)
if is_dynamic(out_type):
strategy.add_implementation(
Member

shouldn't this one be in generic.py?

Contributor Author

Hmm, something very similar to this is already in generic.py, what I'm trying to do here is short-circuit the schedule if we have dynamic shapes. The x86 schedule, as written, assumes static shapes and breaks during schedule construction if I give it a dynamic input. Is there a cleaner way to do that short-circuit in generic.py?

Member

I am not quite sure what a better way to do this would be. @icemelon9, thoughts?

Contributor

What would the behavior be if instead we only had "if not is_dynamic(out_type)" to register the x86 schedule? I would think that the generic strategy would be used even if we don't re-add it here.

Contributor Author

I'll give that a try and report back shortly!

Contributor Author

Unfortunately, it seems like the compile engine can't find any schedules if I do this:

E             File "/home/mbrookhart/repos/mbrookhart_tvm/python/tvm/relay/backend/compile_engine.py", line 289, in lower_call
E               op, call.attrs, inputs, ret_type, target, use_autotvm=False
E             File "/home/mbrookhart/repos/mbrookhart_tvm/python/tvm/relay/backend/compile_engine.py", line 188, in select_implementation
E               best_plevel_impl = max(all_impls, key=lambda x: x.plevel)
E           ValueError: max() arg is an empty sequence
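The empty-sequence error above comes from implementation selection: each op strategy collects candidate implementations with priority levels (plevel), and max() over an empty list fails. A toy sketch (illustrative names, not TVM's actual strategy API) of why the dynamic case needs an explicit fallback registration:

```python
# Toy sketch (not TVM's actual strategy API) of why removing the x86
# registration without registering a fallback breaks selection: if nothing
# is registered for the dynamic case, max() gets an empty list.

def select_implementation(impls):
    """Pick the highest-priority implementation, as the compile engine does."""
    if not impls:
        raise ValueError("max() arg is an empty sequence")
    return max(impls, key=lambda impl: impl["plevel"])

def build_strategy(out_is_dynamic, register_fallback):
    impls = []
    if not out_is_dynamic:
        impls.append({"name": "x86", "plevel": 10})  # assumes static shapes
    elif register_fallback:
        impls.append({"name": "generic", "plevel": 5})  # shape-agnostic
    return impls

print(select_implementation(build_strategy(True, True))["name"])  # generic
```

This mirrors the trade-off in the thread: either the x86 strategy short-circuits to a generic implementation for dynamic output types, or no implementation is registered at all and select_implementation fails.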

@zhiics
Member

zhiics commented Sep 15, 2020

@jwfromm @electriclilies please take another look

@masahi
Member

masahi commented Sep 15, 2020

@mbrookhart Does this PR enable compiling one model and running it with input data of different shapes?

@mbrookhart
Contributor Author

@jwfromm @electriclilies @zhiics @csullivan Could you take another look?


tvm.testing.assert_allclose(out_np, tvm_out, rtol=1e-5, atol=1e-5)


# TODO(mbrookhart): enable cuda once VM supports heterogenous execution
Contributor

PR #6337 is now merged, should we enable GPU here or are there still issues?

Contributor Author

I'm hitting issues on dynamic strided slice and topk; I was going to wait until I had those fixed to enable them in the ONNX frontend.

@mbrookhart
Contributor Author

Ping?

@zhiics
Member

zhiics commented Oct 1, 2020

@electriclilies (Contributor) left a comment

Overall, this looks good to me!

One thing I did notice is that in the ONNX importer itself, you put warnings in the comments telling people that they will need to run the dynamic_to_static pass because some operators do not support dynamic shapes yet.

We should probably add a note / warning to the importer documentation and tutorials; I'm not sure whether that should be part of this PR or a separate one, though.

@zhiics
Member

zhiics commented Oct 2, 2020

@mbrookhart please see Lily's last comment.

@mbrookhart
Contributor Author

@zhiics @electriclilies Added some doc strings

@electriclilies
Contributor

@mbrookhart Thanks! LGTM

@zhiics
Member

zhiics commented Oct 2, 2020

@mbrookhart cool. I will merge once CI passes

@zhiics zhiics merged commit 2658ebe into apache:master Oct 3, 2020
TusharKanekiDey pushed a commit to TusharKanekiDey/tvm that referenced this pull request Oct 13, 2020
* Change onnx importer to use dynamic upsampling3d (neo-ai#3)

fix pylint

* Refactor ONNX frontend to be dynamic

Make OneHot dynamic

Support BatchMatMul with dynamically shaped inputs

fix dynamic broadcast

Add null checks to broadcast_to rel functions

fail more isolated broadcast_to test

use StructuralEqual instead of pointer comparisons in dynamic_to_static pass

add an optional weight freeze argument to onnx importer

convert onnx resize to dynamic op

add dynamic expand to onnx importer

add a shape_func for power

fix BERTSquad, lint

handle onnx graph initializer parameters more intelligently

* Dynamic ONNX importer: Upsampling and Pad (neo-ai#2)

fix lint

fix Call reference

fix a type issue with expand

fix a bad test refactor

respond to review comments, fix batch matmul tests

* black format

* fix batch matmul test

* add dynamic strided slice to the onnx importer

* fix clip importer

* fix qnn tutorial

* fix bad merge, respond to review comments

* add a simple dynamic model test

* Add dynamic-shaped autopadding to convolution and pooling ops

* fix dynamic issues in a few ops

* fix pylint

* disable tests onnxrt doesn't support

* fix pytorch test

* respond to review comments

* add documentation about partially supporting dynamic shapes

Co-authored-by: Lily Orth-Smith <lorthsmith@octoml.ai>
TusharKanekiDey pushed a commit to TusharKanekiDey/tvm that referenced this pull request Oct 14, 2020
TusharKanekiDey pushed a commit to TusharKanekiDey/tvm that referenced this pull request Oct 15, 2020
TusharKanekiDey pushed a commit to TusharKanekiDey/tvm that referenced this pull request Oct 15, 2020
TusharKanekiDey pushed a commit to TusharKanekiDey/tvm that referenced this pull request Oct 16, 2020
TusharKanekiDey pushed a commit to TusharKanekiDey/tvm that referenced this pull request Oct 16, 2020
trevor-m pushed a commit to neo-ai/tvm that referenced this pull request Oct 19, 2020
trevor-m pushed a commit to neo-ai/tvm that referenced this pull request Oct 19, 2020
(each with the same commit message as above)