Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add amd migraphx execution provider to onnx runtime #2929

Merged
merged 27 commits into from
May 26, 2020

Conversation

pfultz2
Copy link
Contributor

@pfultz2 pfultz2 commented Jan 28, 2020

Description:

This adds an initial execution provider using MIGraphX which supports the AMD ROCm stack.

As a first step, we have hardcoded the operators supported. In the future, we plan to update this to use an API from MIGraphX directly.

This also uses the C/C++ API from MIGraphX, as this will be more stable. As this hasn't been released as part of the ROCm distribution yet we build this version in the dockerfile. In the future we can switch to just pulling down the debian package from ROCm.

Motivation and Context

To support AMD GPUs in onnx runtime and take advantage of MIGraphX's graph optimizations.

@pfultz2 pfultz2 requested a review from a team as a code owner January 28, 2020 23:12
@msftclas
Copy link

msftclas commented Jan 28, 2020

CLA assistant check
All CLA requirements met.

@jywu-msft
Copy link
Member

Thanks for the contribution!
We will take a look.

@jywu-msft
Copy link
Member

Would you be able to test against the models contained in the zip file?
https://github.com/microsoft/onnxruntime/blob/master/tools/ci_build/github/azure-pipelines/templates/set-test-data-variables-step.yml#L4

Those are models we test against in our CI pipelines using onnx_test_runner.

@jywu-msft
Copy link
Member

why are updates showing up in the tvm and onnx-tensorrt modules?

@scxiao
Copy link
Contributor

scxiao commented Feb 4, 2020

why are updates showing up in the tvm and onnx-tensorrt modules?

This a mismatch of the commit used by these submodules between our branch and the master branch. Fixed that.

@@ -743,6 +744,14 @@ if (onnxruntime_USE_TENSORRT)
endif()
endif()

if (onnxruntime_USE_MIGRAPHX)
if (WIN32)
message(FATAL_ERROR "MIGraphX does not support build in Windows!")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are there plans to eventually support in Windows?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We currently dont have any plans.

@scxiao
Copy link
Contributor

scxiao commented Mar 3, 2020

Would you be able to test against the models contained in the zip file?
https://github.com/microsoft/onnxruntime/blob/master/tools/ci_build/github/azure-pipelines/templates/set-test-data-variables-step.yml#L4

Those are models we test against in our CI pipelines using onnx_test_runner.

After trying all models, MIGraphX EP can run all cases by either falling back to CPU or run correctly on MIGraphX.

@jywu-msft
Copy link
Member

/azp run Linux CPU CI Pipeline, Linux CPU x64 NoContribops CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, MacOS CI Pipeline, MacOS NoContribops CI Pipeline, Win CPU x64 NoContribops CI Pipeline, Win CPU x86 CI Pipeline, Windows CPU CI Pipeline

@azure-pipelines
Copy link

Azure Pipelines successfully started running 9 pipeline(s).

@jywu-msft
Copy link
Member

/azp run Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@jywu-msft
Copy link
Member

/azp run MacOS CI Pipeline

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@azure-pipelines
Copy link

Azure Pipelines successfully started running 5 pipeline(s).

@jywu-msft
Copy link
Member

windows build is failing because of failing python pep8 check (which is new)
see https://github.com/microsoft/onnxruntime/blob/master/docs/Coding_Conventions_and_Standards.md#python-code-style
can you ensure any modified python files are following pep8 conventions? thanks!

@scxiao
Copy link
Contributor

scxiao commented May 22, 2020

windows build is failing because of failing python pep8 check (which is new)
see https://github.com/microsoft/onnxruntime/blob/master/docs/Coding_Conventions_and_Standards.md#python-code-style
can you ensure any modified python files are following pep8 conventions? thanks!

Thanks. Just fixed it, hope it can pass this time.

@jywu-msft
Copy link
Member

/azp run Linux CPU CI Pipeline, Linux CPU x64 NoContribops CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, MacOS CI Pipeline, MacOS NoContribops CI Pipeline, Win CPU x64 NoContribops CI Pipeline, Win CPU x86 CI Pipeline, Windows CPU CI Pipeline

@jywu-msft
Copy link
Member

/azp run Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-mac-ci-pipeline

@azure-pipelines
Copy link

Azure Pipelines successfully started running 7 pipeline(s).

@azure-pipelines
Copy link

Azure Pipelines successfully started running 5 pipeline(s).

@jywu-msft
Copy link
Member

/azp run Windows CPU CI Pipeline

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@jywu-msft
Copy link
Member

/azp run Linux CPU CI Pipeline, Linux CPU x64 NoContribops CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, MacOS CI Pipeline, MacOS NoContribops CI Pipeline, Win CPU x64 NoContribops CI Pipeline, Win CPU x86 CI Pipeline

@jywu-msft
Copy link
Member

/azp run Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-mac-ci-pipeline

@azure-pipelines
Copy link

Azure Pipelines successfully started running 6 pipeline(s).

@azure-pipelines
Copy link

Azure Pipelines successfully started running 5 pipeline(s).

@jywu-msft
Copy link
Member

@snnn is the static analysis failure a known issue? i don't think it's due to changes in this PR

@snnn
Copy link
Member

snnn commented May 26, 2020

Yes. Please update the branch with the latest master

@jywu-msft
Copy link
Member

@scxiao please merge latest master to fix the windows cpu CI build failure (due to static analysis failure)
I will kick off tests again. everything should pass afterwards. thanks!

@scxiao
Copy link
Contributor

scxiao commented May 26, 2020

@scxiao please merge latest master to fix the windows cpu CI build failure (due to static analysis failure)
I will kick off tests again. everything should pass afterwards. thanks!

Thanks for the info. I just merged that. Please try again.

@jywu-msft
Copy link
Member

/azp run Linux CPU CI Pipeline, Linux CPU x64 NoContribops CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, MacOS CI Pipeline, MacOS NoContribops CI Pipeline, Win CPU x64 NoContribops CI Pipeline, Win CPU x86 CI Pipeline

@jywu-msft
Copy link
Member

/azp run Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-mac-ci-pipeline

@azure-pipelines
Copy link

Azure Pipelines successfully started running 6 pipeline(s).

@azure-pipelines
Copy link

Azure Pipelines successfully started running 5 pipeline(s).

@jywu-msft
Copy link
Member

/azp run Windows CPU CI Pipeline

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@jywu-msft jywu-msft merged commit 7759136 into microsoft:master May 26, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants