
Add OpInfo based meta tensor tests [RELAND] #77008


Closed
wants to merge 3 commits

Conversation

Contributor

@ezyang ezyang commented May 6, 2022

Stack from ghstack:

PR #75994 was taking too long to ship so I extracted out the CrossRef gadget and
had it run on a simple OpInfo invocation only.

Signed-off-by: Edward Z. Yang ezyang@fb.com

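For context, a minimal sketch of the kind of cross-reference such a test performs: run each OpInfo sample on real tensors, rerun it on the meta device, and check that the meta result reports the same shape and dtype. The helper name, the restriction to plain tensor inputs, and the broad exception handling below are illustrative assumptions, not the actual CrossRef implementation from #75994.

import torch
from torch.testing._internal.common_methods_invocations import op_db

def check_meta_against_real(op, device="cpu", dtype=torch.float32):
    # Hypothetical helper: compare meta-device results against real results
    # for every sample input an OpInfo provides. Simplified on purpose; the
    # real CrossRef mode handles nested inputs, views, strides, etc.
    for sample in op.sample_inputs(device, dtype):
        if not isinstance(sample.input, torch.Tensor):
            continue  # keep the sketch to plain tensor inputs
        real_out = op(sample.input, *sample.args, **sample.kwargs)
        if not isinstance(real_out, torch.Tensor):
            continue
        try:
            meta_out = op(sample.input.to("meta"), *sample.args, **sample.kwargs)
        except (NotImplementedError, RuntimeError):
            continue  # no meta kernel yet, or mixed meta/real arguments
        assert meta_out.is_meta
        assert meta_out.shape == real_out.shape
        assert meta_out.dtype == real_out.dtype

for op in op_db:  # drive the check over the whole OpInfo database
    check_meta_against_real(op)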
Contributor

facebook-github-bot commented May 6, 2022


❌ 3 New Failures, 1 Base Failure

As of commit ce24296 (more details on the Dr. CI page):

  • 3/4 failures introduced in this PR
  • 1/4 broken upstream at merge base c174dbe on May 06 from 1:12pm to 9:08pm

🕵️ 3 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages

See GitHub Actions build trunk / linux-bionic-cuda10.2-py3.9-gcc7 / test (slow, 1, 1, linux.4xlarge.nvidia.gpu) (1/3)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-05-07T05:11:18.2062573Z FAIL [0.248s]: tes...nary_cuda_uint8 (__main__.TestCudaFuserOpInfoCUDA)
2022-05-07T05:11:18.2060413Z     self.assertExportImportModule(m, inputs)
2022-05-07T05:11:18.2060798Z   File "/opt/conda/lib/python3.9/site-packages/torch/testing/_internal/common_jit.py", line 154, in assertExportImportModule
2022-05-07T05:11:18.2060974Z     self.assertEqual(a, b, "Results of original model and "
2022-05-07T05:11:18.2061374Z   File "/opt/conda/lib/python3.9/site-packages/torch/testing/_internal/common_utils.py", line 2207, in assertEqual
2022-05-07T05:11:18.2061491Z     assert_equal(
2022-05-07T05:11:18.2061817Z   File "/opt/conda/lib/python3.9/site-packages/torch/testing/_comparison.py", line 1074, in assert_equal
2022-05-07T05:11:18.2061945Z     raise error_metas[0].to_error()
2022-05-07T05:11:18.2062174Z AssertionError: Results of original model and exported/imported version of model differed
2022-05-07T05:11:18.2062193Z 
2022-05-07T05:11:18.2062330Z ======================================================================
2022-05-07T05:11:18.2062573Z FAIL [0.248s]: test_nvfuser_correctness_jiterator_unary_cuda_uint8 (__main__.TestCudaFuserOpInfoCUDA)
2022-05-07T05:11:18.2062844Z ----------------------------------------------------------------------
2022-05-07T05:11:18.2062978Z Traceback (most recent call last):
2022-05-07T05:11:18.2063327Z   File "/opt/conda/lib/python3.9/site-packages/torch/testing/_internal/common_utils.py", line 1800, in wrapper
2022-05-07T05:11:18.2063432Z     method(*args, **kwargs)
2022-05-07T05:11:18.2063781Z   File "/opt/conda/lib/python3.9/site-packages/torch/testing/_internal/common_utils.py", line 1800, in wrapper
2022-05-07T05:11:18.2063954Z     method(*args, **kwargs)
2022-05-07T05:11:18.2064338Z   File "/opt/conda/lib/python3.9/site-packages/torch/testing/_internal/common_device_type.py", line 376, in instantiated_test
2022-05-07T05:11:18.2064477Z     result = test(self, **param_kwargs)
2022-05-07T05:11:18.2064829Z   File "/opt/conda/lib/python3.9/site-packages/torch/testing/_internal/common_device_type.py", line 808, in dep_fn
2022-05-07T05:11:18.2064955Z     return fn(slf, *args, **kwargs)

See GitHub Actions build periodic / linux-bionic-cuda11.6-py3.7-gcc7 / test (default, 2, 2, linux.4xlarge.nvidia.gpu) (2/3)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-05-07T04:16:08.7815153Z RuntimeError: test_meta failed!
2022-05-07T04:16:07.5095892Z 
2022-05-07T04:16:07.5096066Z FAILED (errors=4, skipped=198, expected failures=23)
2022-05-07T04:16:07.5096281Z 
2022-05-07T04:16:07.5096414Z Generating XML reports...
2022-05-07T04:16:08.0998110Z Generated XML report: test-reports/python-unittest/test_meta/TEST-TestMetaCUDA-20220507041042.xml
2022-05-07T04:16:08.7809594Z Traceback (most recent call last):
2022-05-07T04:16:08.7810161Z   File "test/run_test.py", line 1072, in <module>
2022-05-07T04:16:08.7811895Z     main()
2022-05-07T04:16:08.7812192Z   File "test/run_test.py", line 1050, in main
2022-05-07T04:16:08.7814822Z     raise RuntimeError(err_message)
2022-05-07T04:16:08.7815153Z RuntimeError: test_meta failed!
2022-05-07T04:16:09.2921054Z 
2022-05-07T04:16:09.2921544Z real	5m34.440s
2022-05-07T04:16:09.2921872Z user	5m30.417s
2022-05-07T04:16:09.2922125Z sys	0m6.878s
2022-05-07T04:16:09.2922705Z + cleanup
2022-05-07T04:16:09.2922945Z + retcode=1
2022-05-07T04:16:09.2923185Z + set +x
2022-05-07T04:16:09.2976598Z ##[error]Process completed with exit code 1.
2022-05-07T04:16:09.3025389Z ##[group]Run pytorch/pytorch/.github/actions/get-workflow-job-id@master
2022-05-07T04:16:09.3025765Z with:

See GitHub Actions build periodic / linux-bionic-cuda11.6-py3.7-gcc7 / test (default, 1, 2, linux.4xlarge.nvidia.gpu) (3/3)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-05-07T04:37:56.4515232Z RuntimeError: test_ops failed!
2022-05-07T04:37:54.5614154Z 
2022-05-07T04:37:54.5614278Z Generating XML reports...
2022-05-07T04:37:55.3692172Z Generated XML report: test-reports/python-unittest/test_ops/TEST-TestCommonCUDA-20220507041255.xml
2022-05-07T04:37:55.5342661Z Generated XML report: test-reports/python-unittest/test_ops/TEST-TestCompositeComplianceCUDA-20220507041255.xml
2022-05-07T04:37:55.6556333Z Generated XML report: test-reports/python-unittest/test_ops/TEST-TestMathBitsCUDA-20220507041255.xml
2022-05-07T04:37:56.4505950Z Traceback (most recent call last):
2022-05-07T04:37:56.4506634Z   File "test/run_test.py", line 1072, in <module>
2022-05-07T04:37:56.4510551Z     main()
2022-05-07T04:37:56.4511088Z   File "test/run_test.py", line 1050, in main
2022-05-07T04:37:56.4514658Z     raise RuntimeError(err_message)
2022-05-07T04:37:56.4515232Z RuntimeError: test_ops failed!
2022-05-07T04:37:56.9470571Z 
2022-05-07T04:37:56.9470887Z real	27m51.924s
2022-05-07T04:37:56.9471404Z user	27m45.353s
2022-05-07T04:37:56.9471895Z sys	0m24.759s
2022-05-07T04:37:56.9472358Z + cleanup
2022-05-07T04:37:56.9472722Z + retcode=1
2022-05-07T04:37:56.9472954Z + set +x
2022-05-07T04:37:56.9517501Z ##[error]Process completed with exit code 1.
2022-05-07T04:37:56.9561461Z ##[group]Run pytorch/pytorch/.github/actions/get-workflow-job-id@master
2022-05-07T04:37:56.9561828Z with:

🚧 1 fixed upstream failure:

This was probably caused by an upstream breakage that has already been fixed.

Please rebase on the viable/strict branch.

If your commit is older than viable/strict, run these commands:

git fetch https://github.com/pytorch/pytorch viable/strict
git rebase FETCH_HEAD

This comment was automatically generated by Dr. CI.

Please report bugs/suggestions to the (internal) Dr. CI Users group.


@ezyang ezyang requested review from albanD and anjali411 May 7, 2022 01:04
ezyang added a commit that referenced this pull request May 7, 2022
PR #75994 was taking too long to ship so I extracted out the CrossRef gadget and
had it run on a simple OpInfo invocation only.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

ghstack-source-id: 5488fed
Pull Request resolved: #77008
Contributor Author

ezyang commented May 7, 2022

@pytorchbot merge this

Contributor

github-actions bot commented May 7, 2022

Hey @ezyang.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.

@ezyang ezyang added release notes: composability release notes category topic: not user facing topic category labels May 7, 2022
@facebook-github-bot facebook-github-bot deleted the gh/ezyang/1146/head branch May 10, 2022 14:16
facebook-github-bot pushed a commit that referenced this pull request May 13, 2022
Summary:
PR #75994 was taking too long to ship so I extracted out the CrossRef gadget and
had it run on a simple OpInfo invocation only.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: #77008

Approved by: https://github.com/ngimel

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/60f131fb6c2e3f4a23e64096a3e718a1e669215b

Reviewed By: malfet

Differential Revision: D36250515

fbshipit-source-id: 93cdc3cb9bf4c3375bd679aea8d5f59a09f65585