Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ONNX] Remove Aten ops from ONNX export #37239

Closed
wants to merge 51 commits into from

Conversation

@neginraoof
Copy link
Collaborator

neginraoof commented Apr 24, 2020

This PR adds a new operator export type to exporter: ONNX_FALLTHROUGH
This new type allows ops that are not supported to pass through.
This PR also removes all aten ops in ONNX operator export type mode.

@dr-ci
Copy link

dr-ci bot commented Apr 24, 2020

💊 CI failures summary and remediations

As of commit 04c1cb8 (more details on the Dr. CI page):


  • 2/2 failures possibly* introduced in this PR
    • 1/2 non-CircleCI failure(s)

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_linux_bionic_py3_8_gcc9_test (1/1)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun) <confirmed not flaky by 5 failures>

May 29 18:48:33 ERROR:sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp: In function ‘int main()’:\n/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp:2:22: error: expected ‘;’ before ‘}’ token\n 2 | int main() { return 0 }\n | ^~\n | ;\n" }
May 29 18:48:32     raise RuntimeError(message) 
May 29 18:48:32 RuntimeError: test_cuda failed! 
May 29 18:48:33  
May 29 18:48:33 real	5m13.298s 
May 29 18:48:33 user	5m19.383s 
May 29 18:48:33 sys	0m17.898s 
May 29 18:48:33 + cleanup 
May 29 18:48:33 + retcode=1 
May 29 18:48:33 + set +x 
May 29 18:48:33 =================== sccache compilation log =================== 
May 29 18:48:33 ERROR:sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp: In function ‘int main()’:\n/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp:2:22: error: expected ‘;’ before ‘}’ token\n    2 | int main() { return 0 }\n      |                      ^~\n      |                      ;\n" } 
May 29 18:48:33  
May 29 18:48:33 =========== If your build fails, please take a look at the log above for possible reasons =========== 
May 29 18:48:33 Compile requests                 58 
May 29 18:48:33 Compile requests executed        33 
May 29 18:48:33 Cache hits                       25 
May 29 18:48:33 Cache misses                      7 
May 29 18:48:33 Cache timeouts                    0 
May 29 18:48:33 Cache read errors                 0 
May 29 18:48:33 Forced recaches                   0 
May 29 18:48:33 Cache write errors                0 

ci.pytorch.org: 1 failed


This comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.

Please report bugs/suggestions on the GitHub issue tracker or post in the (internal) Dr. CI Users group.

See how this bot performed.

This comment has been revised 130 times.

@vadimkantorov
Copy link
Contributor

vadimkantorov commented Apr 24, 2020

There are a lot of almost duplicate error messages 'Unsupported: ONNX export of cumsum in ' 'opset 9. Please try higher opset versions.'. Should a function / format string be put in a variable? (e.g. error_opset9_fmt = 'Unsupported: ONNX export of {} in opset 9. Please try higher opset versions.'

@neginraoof
Copy link
Collaborator Author

neginraoof commented Apr 24, 2020

@vadimkantorov Yeah makes sense. This is mainly WIP now to check CI failures due to removal of ATen ops in ONNX. I'll refactor and clean up export and symbolic conversions a bit.

@neginraoof neginraoof changed the title [ONNX] Remove Aten ops from ONNX models [WIP][ONNX] Remove Aten ops from ONNX models Apr 24, 2020
neginraoof added 3 commits Apr 24, 2020
@neginraoof neginraoof changed the title [WIP][ONNX] Remove Aten ops from ONNX models [WIP][ONNX] Remove Aten ops from ONNX export Apr 28, 2020
@neginraoof neginraoof requested a review from apaszke as a code owner Apr 28, 2020
neginraoof added 7 commits Apr 28, 2020
…ch into neraoof/onnxFallthrough
@neginraoof
Copy link
Collaborator Author

neginraoof commented Apr 29, 2020

@houseroad I'm seeing ONNX checker failures. Would it be possible to update ONNX submodule?

neginraoof added 4 commits Apr 30, 2020
…oof/onnxFallthrough
…oof/onnxFallthrough
@neginraoof neginraoof changed the title [WIP][ONNX] Remove Aten ops from ONNX export [ONNX] Remove Aten ops from ONNX export May 1, 2020
neginraoof added 2 commits May 12, 2020
torch/onnx/__init__.py Show resolved Hide resolved
torch/onnx/__init__.py Outdated Show resolved Hide resolved
torch/onnx/symbolic_opset9.py Outdated Show resolved Hide resolved
torch/onnx/symbolic_opset9.py Outdated Show resolved Hide resolved
torch/onnx/utils.py Show resolved Hide resolved
neginraoof added 4 commits May 15, 2020
…oof/onnxFallthrough
…oof/onnxFallthrough

# Conflicts:
#	test/onnx/test_utility_funs.py
torch/onnx/__init__.py Outdated Show resolved Hide resolved
torch/onnx/__init__.py Outdated Show resolved Hide resolved
neginraoof added 5 commits May 19, 2020
…oof/onnxFallthrough
@neginraoof
Copy link
Collaborator Author

neginraoof commented May 19, 2020

cc @houseroad
Please review this PR. Thanks!

Copy link
Contributor

facebook-github-bot left a comment

@houseroad has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@neginraoof
Copy link
Collaborator Author

neginraoof commented May 27, 2020

@houseroad Thanks for looking into this. Could you please give me more details about the internal failures?

Copy link
Member

houseroad left a comment

Could you rebase the PR to resolve the conflict?

is exported as:
graph(%x.1 : Long(1:1)):
%1 : Tensor = onnx::ReduceSum[keepdims=0](%x.1)
%y.1 : Long() = prim::ListConstruct(%1)

This comment has been minimized.

Copy link
@houseroad

houseroad May 27, 2020

Member

After reading the comments, I am still not clear how ONNX_FALLTHROUGH works, could you also provide an ONNX graph as well?

This comment has been minimized.

Copy link
@neginraoof

neginraoof May 27, 2020

Author Collaborator

The example graph is at line 103:

Example graph:
                graph(%x.1 : Long(1:1)):
                  %1 : None = prim::Constant()
                  %2 : Tensor = aten::sum(%x.1, %1)
                  %y.1 : Tensor[] = prim::ListConstruct(%2)
                  return (%y.1)
is exported as:
                graph(%x.1 : Long(1:1)):
                  %1 : Tensor = onnx::ReduceSum[keepdims=0](%x.1)
                  %y.1 : Long() = prim::ListConstruct(%1)
                return (%y.1)

This mode is converting the registered onnx ops into ONNX domain, and fallthrough the ops not registered.
In this case, prim:ListConstruct is not supported, and so it's exported as is.
With ONNX_ATEN_FALLBACK mode, only unsupported aten ops are exported. The new ONNX_FALLTHROUGH mode exports any unsupported op.

This comment has been minimized.

Copy link
@neginraoof

neginraoof May 28, 2020

Author Collaborator

This comment has been minimized.

Copy link
@houseroad

houseroad May 29, 2020

Member

Yeah, I think you should add the above explanation to the comments as well, such as in the example, which op is not supported by onnx, and how it falls through, and the major difference between ATEN_FALLBACK and ONNX_FALLTHROUGH

This comment has been minimized.

Copy link
@neginraoof

neginraoof May 29, 2020

Author Collaborator

Sure, I'll update it.

neginraoof added 2 commits May 27, 2020
Copy link
Contributor

facebook-github-bot left a comment

@houseroad has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Copy link
Contributor

facebook-github-bot left a comment

@houseroad has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

facebook-github-bot commented May 30, 2020

@houseroad merged this pull request in b7b99ab.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

7 participants
You can’t perform that action at this time.