Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Port min kernel to structured kernels. #61450

Closed
wants to merge 15 commits into from

Conversation

ysiraichi
Copy link
Collaborator

@ysiraichi ysiraichi commented Jul 9, 2021

Tracking issue: #55070

[ghstack-poisoned]
@facebook-github-bot
Copy link
Contributor

facebook-github-bot commented Jul 9, 2021

🔗 Helpful links

💊 CI failures summary and remediations

As of commit 91507a1 (more details on the Dr. CI page):


  • 1/1 failures introduced in this PR

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See GitHub Actions build linux-xenial-cuda11.3-py3.6-gcc7 / test (default, 2, 2, linux.8xlarge.nvidia.gpu) (1/1)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2021-09-21T17:08:02.1986802Z RuntimeError: CUDA error: device-side assert triggered
2021-09-21T17:07:59.0109996Z   File "/opt/conda/lib/python3.6/site-packages/torch/cuda/__init__.py", line 493, in synchronize
2021-09-21T17:07:59.0110835Z     return torch._C._cuda_synchronize()
2021-09-21T17:07:59.0111704Z RuntimeError: CUDA error: device-side assert triggered
2021-09-21T17:07:59.0112763Z CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
2021-09-21T17:07:59.0113775Z For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
2021-09-21T17:08:02.1977308Z /var/lib/jenkins/workspace/aten/src/ATen/native/cuda/TensorCompare.cu:161: _assert_async_cuda_kernel: block: [0,0,0], thread: [0,0,0] Assertion `input[0] != c10::complex<float>(0, 0)` failed.
2021-09-21T17:08:02.1982070Z Traceback (most recent call last):
2021-09-21T17:08:02.1982666Z   File "<string>", line 4, in <module>
2021-09-21T17:08:02.1984296Z   File "/opt/conda/lib/python3.6/site-packages/torch/cuda/__init__.py", line 493, in synchronize
2021-09-21T17:08:02.1985877Z     return torch._C._cuda_synchronize()
2021-09-21T17:08:02.1986802Z RuntimeError: CUDA error: device-side assert triggered
2021-09-21T17:08:02.1987851Z CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
2021-09-21T17:08:02.1988845Z For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
2021-09-21T17:08:02.4127859Z ok (12.692s)
2021-09-21T17:08:02.4199281Z   test_gather_bool (__main__.TestCuda) ... ok (0.007s)
2021-09-21T17:08:02.4243492Z   test_get_device_index (__main__.TestCuda) ... ok (0.004s)
2021-09-21T17:08:02.4304689Z   test_get_set_rng_state_all (__main__.TestCuda) ... ok (0.006s)
2021-09-21T17:08:02.4544704Z   test_grad_scaling_accumulation (__main__.TestCuda) ... ok (0.024s)
2021-09-21T17:08:02.5017497Z   test_grad_scaling_autocast (__main__.TestCuda) ... ok (0.047s)
2021-09-21T17:08:02.5291978Z   test_grad_scaling_clipping (__main__.TestCuda) ... ok (0.027s)
2021-09-21T17:08:02.5560886Z   test_grad_scaling_clipping_separate_unscale (__main__.TestCuda) ... ok (0.027s)

This comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.

Please report bugs/suggestions to the (internal) Dr. CI Users group.

Click here to manually regenerate this comment.

ysiraichi added a commit that referenced this pull request Jul 9, 2021
Tracking issue: #55070

ghstack-source-id: 4b22dd5db24567ec1e8faccdb63ba17acd881158
Pull Request resolved: #61450
ysiraichi added a commit that referenced this pull request Jul 9, 2021
Tracking issue: #55070

ghstack-source-id: aa16ba15f10264fba0b9d3f6d5d857a61d821745
Pull Request resolved: #61450
@ysiraichi ysiraichi marked this pull request as ready for review July 10, 2021 00:35
@ysiraichi ysiraichi requested a review from ezyang as a code owner July 10, 2021 00:35
@ysiraichi ysiraichi requested a review from bdhirsh July 10, 2021 00:35
ysiraichi added a commit that referenced this pull request Jul 10, 2021
Tracking issue: #55070

ghstack-source-id: 175efea674ecc7cd429b8d1414972fdd97013dfd
Pull Request resolved: #61450
ysiraichi added a commit that referenced this pull request Jul 14, 2021
Tracking issue: #55070

ghstack-source-id: 02fbbbc3a3d025f0320b91955f614449513e0aa2
Pull Request resolved: #61450
ysiraichi added a commit that referenced this pull request Jul 14, 2021
Tracking issue: #55070

ghstack-source-id: 3334511e5378826cebaed6bffdd00bd0e71d5058
Pull Request resolved: #61450
@ysiraichi
Copy link
Collaborator Author

The CI failure is not related to this PR.

@bdhirsh
Copy link
Contributor

bdhirsh commented Jul 16, 2021

@bdhirsh has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

ysiraichi added a commit that referenced this pull request Jul 20, 2021
Tracking issue: #55070

ghstack-source-id: 1e4365a409b3e4bdc29a1fcac75e882c28065a8f
Pull Request resolved: #61450
ysiraichi added a commit that referenced this pull request Jul 21, 2021
Tracking issue: #55070

ghstack-source-id: 1e4365a409b3e4bdc29a1fcac75e882c28065a8f
Pull Request resolved: #61450
bdhirsh added a commit that referenced this pull request Aug 30, 2021
Tracking issue: #55070

ghstack-source-id: b0c0e5e9a1276e36b7a804a576aea7d8afd564c4
Pull Request resolved: #61450
@bdhirsh
Copy link
Contributor

bdhirsh commented Aug 31, 2021

@bdhirsh has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@bdhirsh
Copy link
Contributor

bdhirsh commented Aug 31, 2021

Actually, is that test failure unrelated? It's a test suite for meta (TestSparseMETA), looks meta tensor related. The failing check lives here:

storage_initialized(),
- meta tensors have storage (so has_storage() returns true), but it's nullptr so storage_initialized() returns false).

This sounds like a meta coverage thing where enabling the meta API for min may have gotten some tests further along, so you might need to mark a test in test_sparse.py with skip_meta?

@ezyang
Copy link
Contributor

ezyang commented Sep 1, 2021

Hum, we probably shouldn't be running any of the sparse tests under meta tensors. Oops!

@ysiraichi
Copy link
Collaborator Author

@bdhirsh Yeah. That does seem to be the case. I thought I had @skipMeta all of them, though.
@ezyang Does adding "meta" as the except_for parameter when instantiating tests for both TestSparse and TestSparseUnaryUfuncs sounds reasonable?

@ezyang
Copy link
Contributor

ezyang commented Sep 2, 2021

Does adding "meta" as the except_for parameter when instantiating tests for both TestSparse and TestSparseUnaryUfuncs sounds reasonable?

sgtm

ysiraichi added a commit that referenced this pull request Sep 5, 2021
Tracking issue: #55070

ghstack-source-id: 767fa8595647fea4b459638b6126075418b71f4f
Pull Request resolved: #61450
@ezyang
Copy link
Contributor

ezyang commented Sep 7, 2021

@bdhirsh is on vacation this week; if you want this imported faster than that give a holler

@ysiraichi
Copy link
Collaborator Author

Don't worry. I still want to do some refactoring in this PR.

ysiraichi added a commit that referenced this pull request Sep 7, 2021
Tracking issue: #55070

ghstack-source-id: 6112d1ad175ff744f97680ed11e60d9afb5f95b3
Pull Request resolved: #61450
@codecov
Copy link

codecov bot commented Sep 7, 2021

Codecov Report

Merging #61450 (91507a1) into gh/ysiraichi/16/base (1b02641) will increase coverage by 6.29%.
The diff coverage is n/a.

@@                   Coverage Diff                    @@
##           gh/ysiraichi/16/base   #61450      +/-   ##
========================================================
+ Coverage                 60.08%   66.38%   +6.29%     
========================================================
  Files                       657      738      +81     
  Lines                     85345    94175    +8830     
========================================================
+ Hits                      51277    62514   +11237     
+ Misses                    34068    31661    -2407     

bdhirsh added a commit that referenced this pull request Sep 13, 2021
Tracking issue: #55070

ghstack-source-id: aea71dffa3888c8131c9881c32295af32f2fa2ae
Pull Request resolved: #61450
@bdhirsh
Copy link
Contributor

bdhirsh commented Sep 13, 2021

@bdhirsh has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@bdhirsh
Copy link
Contributor

bdhirsh commented Sep 15, 2021

@ysiraichi let me know if you're still doing any refactoring. Otherwise I'll go ahead and try to land!

@ysiraichi
Copy link
Collaborator Author

@bdhirsh Sorry for the delay. I'm done refactoring this PR.

bdhirsh added a commit that referenced this pull request Sep 21, 2021
Tracking issue: #55070

ghstack-source-id: 548bb9586ed260e1019fa8a5214210918819d959
Pull Request resolved: #61450
@bdhirsh
Copy link
Contributor

bdhirsh commented Sep 21, 2021

@bdhirsh has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot facebook-github-bot deleted the gh/ysiraichi/16/head branch October 2, 2021 14:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants