Skip to content

Conversation

@ezyang
Copy link
Contributor

@ezyang ezyang commented Feb 11, 2019

Previously, we used the templated class directly to provide
implementations. However, there is a subtle difference
between this, and CUDAStreamGuard: CUDAStreamGuard has refined types
for the Streams it returns. This lead to a compilation failure
of HIPified ddp.cpp. This commit lines them up more closely,
at the cost of copy-paste.

A possible alternate strategy would have been to extend the
InlineDeviceGuard templates to optionally accept refinements
for Stream. I leave this for future work.

Signed-off-by: Edward Z. Yang ezyang@fb.com

…sCUDA

Previously, we used the templated class directly to provide
implementations.  However, there is a subtle difference
between this, and CUDAStream: CUDAStream has refined types
for the Streams it returns.  This lead to a compilation failure
of HIPified ddp.cpp.  This commit lines them up more closely,
at the cost of copy-paste.

A possible alternate strategy would have been to extend the
InlineDeviceGuard templates to optionally accept refinements
for Stream.  I leave this for future work.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
@ezyang
Copy link
Contributor Author

ezyang commented Feb 11, 2019

cc @jithunnair-amd

@jithunnair-amd
Copy link
Collaborator

@ezyang Looks good, or do we need to rerun CI?

@jithunnair-amd
Copy link
Collaborator

@pytorchbot retest this please

@ezyang
Copy link
Contributor Author

ezyang commented Feb 12, 2019

CI coverage looks sufficient

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ezyang is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

zdevito pushed a commit to zdevito/ATen that referenced this pull request Feb 12, 2019
…sCUDA (#16978)

Summary:
Previously, we used the templated class directly to provide
implementations.  However, there is a subtle difference
between this, and CUDAStreamGuard: CUDAStreamGuard has refined types
for the Streams it returns.  This lead to a compilation failure
of HIPified ddp.cpp.  This commit lines them up more closely,
at the cost of copy-paste.

A possible alternate strategy would have been to extend the
InlineDeviceGuard templates to optionally accept refinements
for Stream.  I leave this for future work.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: pytorch/pytorch#16978

Differential Revision: D14045346

Pulled By: ezyang

fbshipit-source-id: 2b101606e62e4db588027c57902ea739a2119410
@bddppq bddppq added the module: rocm AMD GPU support for Pytorch label Feb 13, 2019
@ezyang ezyang added the merged label Jun 25, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

module: rocm AMD GPU support for Pytorch

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants