Add a PyTorch to Relay parser #63
Conversation
Having issues installing torch and torchvision in CI/CD so that the tests can run. Will continue looking into it.
WIP for now; looks like there may be some bugs.
@kevinthesun @zhiics @yongwww Could you guys take a look at this sometime? Also, please ignore the changes to the CI scripts. I had some issues trying to get the CI to pass, but they may have been unrelated (PT is installed here: https://github.com/neo-ai/tvm/blob/dev/docker/install/ubuntu_install_onnx.sh).
LGTM
LGTM
@zhiics Please help take a look.
Thanks for the effort. I have a general question about the handling of dynamic models. I think the current structure for handling ops would have the same or a similar problem as the TF frontend parser when handling dynamic models -- we need to infer value/shape everywhere. I agree we don't see this in most image classification models, but we probably need to be aware of it when extending to models involving dynamic shapes.
@yongwww @kevinthesun what do you think?
Any more feedback? I responded to everything except the question regarding dynamic shapes, which I'm not sure about. Anywhere we need a shape, we tend to use the infer_shape pass. The only thing I'm slightly concerned about is that VGG and AlexNet are kind of flaky in terms of accuracy.
@alexwong Dynamic shape would be a separate issue that we will work on incrementally. I brought it up only because I wanted to make sure the design would not change much when we handle dynamic information in the future. Could you please elaborate a little more on the accuracy problem? What is flaky? And can you please dig a bit more into the details, because I think flaky accuracy on a real model might cause problems for the service in the end.
There have been some issues with VGG and AlexNet failing due to bad accuracy. I can take another look, but it's kind of hard to debug at the moment since it doesn't happen when running locally on CPU and doesn't happen consistently when running on the CI.
How frequently does it happen in the CI? We can check out the Docker image to reproduce it if it never happens on a local machine. Let's discuss offline.
@alexwong It's possible to check out the CI container to reproduce the flakiness on your own EC2 instance. The flaky issue should be fixed if it occurs frequently. I will take a look at this PR again later today.
Okay, trying to reproduce on an EC2 instance with the Docker image. I would say it happens less than half the time (rough guess). It is odd that this only started happening recently, and I don't think there have been too many functional changes in the code. Part of the reason could be pulling in the upstream merge here.
I'm able to reproduce the failures with the ci-gpu container but am having trouble diagnosing the issue. Comparing the initial PT graph and the produced Relay module between passing and failing runs, there is no difference, so my assumption is that the problem is most likely at the operator-implementation level. I will compare the output after each operator/layer between PyTorch and TVM to figure out where it's going wrong. This will require some rework on the PT side to return the output at each layer in the forward function.
I'm still looking into the VGG issue, but @zxy844288792 talked to Vin and he said we can merge this without the fix and track it as a separate issue, since it only occurs with the settings in ci-gpu and doesn't occur in our service. It may be good to temporarily comment out VGG in the test so it doesn't cause other PRs to fail CI occasionally, though. Can we finalize the review and merge this? @yongwww
@alexwong I am okay with commenting out VGG temporarily; I filed issue #80 to track it. I agree with @zhiics about infer_shape for dynamic inputs. Considering we don't have dynamic requirements/customer needs/tests for PT models currently, it is okay with me to skip this part. Overall, LGTM. @alexwong please address all comments.
Thanks for contributing to TVM! Please refer to the guideline https://docs.tvm.ai/contribute/ for useful information and tips. After the pull request is submitted, please request code reviews from Reviewers by @-ing them in the pull request thread.
Support PyTorch natively in TVM by providing a Relay parser. Adapted from #23; this currently only supports traced models (no control flow) and probably only image classification models (the provided test compares against the torchvision implementations).
Like the other frontends, grab the Relay module and parameters to build via:
mod, params = relay.frontend.from_pytorch(trace, input_shapes)
Tested against torchvision models in the included test_forward.py. I will write a discussion post on discuss.tvm.ai to see if we want this upstream.