Fix DLPack CUDA stream convention #67618

Closed
wants to merge 3 commits from the fix-dlpack branch

Conversation

@emcastillo (Collaborator) commented Nov 1, 2021

Apparently, for the array API, the CUDA default stream and the per-thread default stream should be encoded as 1 and 2 instead of 0 and 1:

https://data-apis.org/array-api/latest/API_specification/array_object.html?dlpack-self-stream-none#dlpack-self-stream-none.

This caused an interoperability problem with CuPy: cupy/cupy#5970 (comment).
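
For reference, here is a minimal sketch of the stream-number mapping the array API expects on CUDA devices; the helper name `to_dlpack_stream` is hypothetical and is not part of `torch.utils.dlpack`:

```python
# Sketch of the array API stream-number convention for CUDA devices
# (see the array_object.__dlpack__ spec linked above); to_dlpack_stream is a
# hypothetical helper, not an actual torch.utils.dlpack function.
import torch

def to_dlpack_stream(stream: torch.cuda.Stream) -> int:
    """Map a torch.cuda.Stream to the integer passed to __dlpack__(stream=...)."""
    if stream.cuda_stream == 0:
        # The legacy default stream must be encoded as 1 (2 would denote the
        # per-thread default stream); 0 is disallowed because it is ambiguous.
        return 1
    # Any other stream is passed as its raw cudaStream_t handle.
    return stream.cuda_stream

# Usage: the consumer hands its current stream to the producer's __dlpack__ so
# the producer can synchronize before the memory is shared, e.g.
# capsule = ext_tensor.__dlpack__(stream=to_dlpack_stream(torch.cuda.current_stream()))
```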

cc @rgommers @leofang @mruberry

@pytorch-probot bot commented Nov 1, 2021

⚛️ CI Flow Status

Ruleset - Version: v1
Ruleset - File: https://github.com/emcastillo/pytorch/blob/c4494d0d2d2c180b047f461f5ec6c833b2db8220/.github/generated-ciflow-ruleset.json
PR ciflow labels: ciflow/default

Workflow | Labels (bold = enabled) | Status
Triggered Workflows
linux-bionic-py3.6-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/noarch, ciflow/xla ✅ triggered
linux-vulkan-bionic-py3.6-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/vulkan ✅ triggered
linux-xenial-cuda11.3-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/default, ciflow/linux ✅ triggered
linux-xenial-py3-clang5-mobile-build ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile ✅ triggered
linux-xenial-py3-clang5-mobile-custom-build-static ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile ✅ triggered
linux-xenial-py3.6-clang7-asan ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/sanitizers ✅ triggered
linux-xenial-py3.6-clang7-onnx ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/onnx ✅ triggered
linux-xenial-py3.6-gcc5.4 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux ✅ triggered
linux-xenial-py3.6-gcc7 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux ✅ triggered
linux-xenial-py3.6-gcc7-bazel-test ciflow/all, ciflow/bazel, ciflow/cpu, ciflow/default, ciflow/linux ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single-full-jit ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux ✅ triggered
win-vs2019-cpu-py3 ciflow/all, ciflow/cpu, ciflow/default, ciflow/win ✅ triggered
win-vs2019-cuda11.3-py3 ciflow/all, ciflow/cuda, ciflow/default, ciflow/win ✅ triggered
Skipped Workflows
caffe2-linux-xenial-py3.6-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux 🚫 skipped
docker-builds ciflow/all 🚫 skipped
ios-12-5-1-arm64 ciflow/all, ciflow/ios, ciflow/macos 🚫 skipped
ios-12-5-1-arm64-coreml ciflow/all, ciflow/ios, ciflow/macos 🚫 skipped
ios-12-5-1-arm64-custom-ops ciflow/all, ciflow/ios, ciflow/macos 🚫 skipped
ios-12-5-1-arm64-full-jit ciflow/all, ciflow/ios, ciflow/macos 🚫 skipped
ios-12-5-1-arm64-metal ciflow/all, ciflow/ios, ciflow/macos 🚫 skipped
ios-12-5-1-x86-64 ciflow/all, ciflow/ios, ciflow/macos 🚫 skipped
ios-12-5-1-x86-64-coreml ciflow/all, ciflow/ios, ciflow/macos 🚫 skipped
ios-12-5-1-x86-64-full-jit ciflow/all, ciflow/ios, ciflow/macos 🚫 skipped
libtorch-linux-xenial-cuda10.2-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux 🚫 skipped
libtorch-linux-xenial-cuda11.3-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux 🚫 skipped
linux-bionic-cuda10.2-py3.9-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/slow 🚫 skipped
macos-10-15-py3-arm64 ciflow/all, ciflow/macos 🚫 skipped
macos-10-15-py3-lite-interpreter-x86-64 ciflow/all, ciflow/macos 🚫 skipped
macos-10-15-py3-x86-64 ciflow/all, ciflow/macos 🚫 skipped
parallelnative-linux-xenial-py3.6-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux 🚫 skipped
periodic-libtorch-linux-xenial-cuda11.1-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-linux-xenial-cuda10.2-py3-gcc7-slow-gradcheck ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled, ciflow/slow, ciflow/slow-gradcheck 🚫 skipped
periodic-linux-xenial-cuda11.1-py3.6-gcc7-debug ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-win-vs2019-cuda11.1-py3 ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win 🚫 skipped

You can add a comment to the PR and tag @pytorchbot with the following commands:
# ciflow rerun, "ciflow/default" will always be added automatically
@pytorchbot ciflow rerun

# ciflow rerun with additional labels "-l <ciflow/label_name>", which is equivalent to adding these labels manually and triggering the rerun
@pytorchbot ciflow rerun -l ciflow/scheduled -l ciflow/slow

For more information, please take a look at the CI Flow Wiki.

@facebook-github-bot (Contributor) commented Nov 1, 2021


💊 CI failures summary and remediations

As of commit c4494d0 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI.


@leofang (Contributor) left a comment


Thanks, Emilio! Looks like we missed stream_ptr here?

Review thread on torch/utils/dlpack.py (outdated, resolved)
@leofang (Contributor) commented Nov 1, 2021

Question for @rgommers @mruberry: will there be a PyTorch 1.10.1? If so, we should backport this patch.

@emcastillo force-pushed the fix-dlpack branch 2 times, most recently from 06a8985 to 8114b16 on November 2, 2021 at 04:57
@mruberry added the triaged label (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module) on Nov 2, 2021
@rgommers (Collaborator) left a comment


This LGTM, thanks @emcastillo!

@mruberry (Collaborator) commented

Hey @emcastillo!

Thanks for this fix and your patience (I was moving from the West Coast to the East Coast)!

It looks like the ROCm failure is real, however:

08:18:53 ======================================================================
08:18:53 FAIL [0.075s]: test_dlpack_default_stream_cuda (__main__.TestTorchDeviceTypeCUDA)
08:18:53 ----------------------------------------------------------------------
08:18:53 Traceback (most recent call last):
08:18:53   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/common_utils.py", line 1422, in wrapper
08:18:53     method(*args, **kwargs)
08:18:53   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/common_utils.py", line 1422, in wrapper
08:18:53     method(*args, **kwargs)
08:18:53   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/common_device_type.py", line 371, in instantiated_test
08:18:53     result = test(self, **param_kwargs)
08:18:53   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/common_device_type.py", line 772, in dep_fn
08:18:53     return fn(slf, *args, **kwargs)
08:18:53   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/common_device_type.py", line 891, in only_fn
08:18:53     return fn(slf, *args, **kwargs)
08:18:53   File "test_torch.py", line 7336, in test_dlpack_default_stream
08:18:53     from_dlpack(x)
08:18:53   File "/opt/conda/lib/python3.6/site-packages/torch/utils/dlpack.py", line 68, in from_dlpack
08:18:53     dlpack = ext_tensor.__dlpack__(stream=stream_ptr)
08:18:53   File "test_torch.py", line 7328, in __dlpack__
08:18:53     assert stream == 1
08:18:53 AssertionError

Any idea what's going on there? One option would be to file a follow-up issue and skip the test on ROCm for now (with a link to the issue).
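
For illustration, the skip-on-ROCm option could look roughly like the sketch below, using the existing skipIfRocm decorator; this is not the change the PR ultimately makes, and the follow-up issue referenced in the comment is hypothetical.

```python
# Sketch of the "skip on ROCm" option: guard the failing test with skipIfRocm
# and point to a follow-up issue. Requires a CUDA device; not the actual
# test_torch.py code.
import torch
from torch.utils.dlpack import from_dlpack
from torch.testing._internal.common_utils import TestCase, run_tests, skipIfRocm

class TestDLPackStream(TestCase):
    @skipIfRocm  # TODO: re-enable once the ROCm default-stream value is handled (follow-up issue)
    def test_dlpack_default_stream(self):
        x = torch.zeros(4, device="cuda")
        capsule = x.__dlpack__(stream=1)  # 1 == CUDA legacy default stream
        self.assertEqual(from_dlpack(capsule), x)

if __name__ == "__main__":
    run_tests()
```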

@rgommers (Collaborator) commented

> It looks like the ROCm failure is real,

Stream numbering differs between CUDA and ROCm: the default stream is 1 for CUDA and 0 for ROCm. See https://data-apis.org/array-api/latest/API_specification/array_object.html#dlpack-self-stream-none.

I have a vague memory of us talking about this before, and of there being an issue with detecting whether we're running on ROCm, but I can't find it.
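
In other words, a producer stub that validates what PyTorch passes would need a backend-dependent expected value, along the lines of this hypothetical wrapper (a sketch, not the actual test code):

```python
# Hypothetical wrapper illustrating the backend-dependent default-stream value;
# per the array API spec, PyTorch's from_dlpack should pass 1 on CUDA builds
# and 0 on ROCm builds when the current stream is the default stream.
import torch
from torch.utils.dlpack import from_dlpack

EXPECTED_DEFAULT = 0 if torch.version.hip else 1

class StreamCheckingProducer:
    """Wraps a CUDA/ROCm tensor and checks the stream value passed to __dlpack__."""
    def __init__(self, tensor):
        self.tensor = tensor

    def __dlpack_device__(self):
        return self.tensor.__dlpack_device__()

    def __dlpack__(self, stream=None):
        assert stream == EXPECTED_DEFAULT, f"got {stream}, expected {EXPECTED_DEFAULT}"
        return self.tensor.__dlpack__(stream=stream)

# Usage (on a machine with a CUDA or ROCm device):
# x = torch.zeros(4, device="cuda")
# y = from_dlpack(StreamCheckingProducer(x))
```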

@emcastillo (Collaborator, Author) commented

Thanks @mruberry, let me take a closer look and fix it!

@emcastillo (Collaborator, Author) commented

@mruberry all tests passed :)

@facebook-github-bot (Contributor) commented

@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot (Contributor) commented

@mruberry merged this pull request in 533e72e.

Labels: cla signed, Merged, open source, triaged