[Caffe2] Enabling AMD GPU Backend for Caffe2 #7566
…e2_core_hip * 'caffe2_core_hip' of github.com:petrex/pytorch: caffe2 PB update for AMD/ROCM HIP device
@@ -59,6 +59,7 @@ cmake_dependent_option(
USE_GLOO "Use Gloo" ON
"BUILD_CAFFE2" OFF)
option(USE_GLOO_IBVERBS "Use Gloo IB verbs for distributed support" OFF) # New option
option(USE_HIP "Use HIP" ON)
FIND_PACKAGE(HIP 1.0 REQUIRED)
FIND_PACKAGE(HIP 1.0)

IF(HIP_FOUND)
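The hunk above appears to drop `REQUIRED` from the HIP lookup and then branch on `HIP_FOUND`, which matches the commit note "Guard builds without HIP". A minimal sketch of that pattern follows, assuming a `FindHIP.cmake` module is reachable through `CMAKE_MODULE_PATH`. The status messages and the fallback that turns `USE_HIP` off are illustrative assumptions, not the PR's actual code.

```cmake
# Sketch only: treat HIP as an optional dependency.
option(USE_HIP "Use HIP" ON)

if(USE_HIP)
  # Without REQUIRED, configuration continues when HIP is missing
  # and HIP_FOUND is simply left false.
  find_package(HIP 1.0)

  if(HIP_FOUND)
    message(STATUS "HIP found; enabling the HIP backend")
  else()
    # Hypothetical fallback: disable the backend instead of failing.
    message(WARNING "HIP not found; building without the HIP backend")
    set(USE_HIP OFF)
  endif()
endif()
```

With `REQUIRED`, the same lookup would stop configuration on machines that have no ROCm installation, so the optional form is what lets CPU-only and CUDA builds keep working.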
@@ -1,42 +1,86 @@
set(PYTORCH_FOUND_HIP FALSE)
set(Caffe2_HIP_INCLUDES
${hip_INCLUDE_DIRS} ${rocrand_INCLUDE_DIRS} ${hiprand_INCLUDE_DIRS} ${rocblas_INCLUDE_DIRS} ${miopen_INCLUDE_DIRS} ${Caffe2_HIP_INCLUDES} ${thrust_INCLUDE_DIRS})
set(Caffe2_HIP_DEPENDENCY_LIBS
${rocrand_LIBRARIES} ${hiprand_LIBRARIES} ${PYTORCH_HIP_HCC_LIBRARIES} ${PYTORCH_MIOPEN_LIBRARIES})
@@ -121,6 +121,10 @@ ENDIF()
# Find the HIP package, set the HIP paths, load the HIP CMake.
IF(WITH_ROCM)
include(LoadHIP)
if (NOT PYTORCH_FOUND_HIP)
…e2_core_hip * 'caffe2_core_hip' of github.com:petrex/pytorch: (40 commits) [auto] Update onnx to 52f7528 - add more shape inference tests (onnx/onnx#971) onnx/onnx@52f7528 JIT cleanup (pytorch#7631) fix to build sleef when using cmake 3.11.1 (pytorch#7679) Fix typo in document (pytorch#7725) [auto] Update onnx to 6f4b1b1 - Tests for Gemm operator (onnx/onnx#885) onnx/onnx@6f4b1b1 [auto] Update onnx to c6c6aad - Enhance the 1-element broadcast case (onnx/onnx#902) onnx/onnx@c6c6aad serialization for torch.device (pytorch#7713) Fix compile flags for MSVC (pytorch#7703) Fix exporting Sum to onnx (pytorch#7685) Renanme ZFNet to ZFNet512 (pytorch#7723) Implement __reduce__ for torch.dtype (pytorch#7699) Remove unnecessary include in vec256_float.h (pytorch#7711) Update from facebook (pytorch#7696) fix for cuda 9.2 builds (pytorch#7709) make BatchSampler subclass of Sampler, and expose (pytorch#7707) Dont emit warning for ABI incompatibility when PyTorch was built from source (pytorch#7681) remove index from python bindings (fixes: pytorch#7639) (pytorch#7690) Update _torch_docs.py (pytorch#7700) Fix the wrong usage of environment variables detection in cmake Changes from D7881937 and D7963936 plus an edit (pytorch#7605) ...
If we're working around a bug in the upstream HIP files, we should say so in the code that is implementing the workaround, so that when HIP fixes their cmake we know what to eliminate.
@ezyang @Jorghi12 Ok let me explain here again,
stamping approval
@petrex Let's first get this initial version in so we can parallelize the work of polishing the core and adding HIP ops.
…e2_core_hip * 'caffe2_core_hip' of github.com:petrex/pytorch: (24 commits) Allow empty storage for the 'Edge' class. (pytorch#7595) Process group base class and Gloo implementation (pytorch#7628) _LRSchedulers getstate include optimizer info (pytorch#7757) [PyTorch] [gradcheck] change backward() to grad() (pytorch#7710) Update test_nn.py (pytorch#7787) Define general default scheduler for TBB and fix ppc64le bug (pytorch#7761) Add support for accepting Tensor as input in clip_grad_* functions. (pytorch#7769) [Easy] Remove unused code (pytorch#7782) Update tbb (pytorch#7734) Add @generated annotation (pytorch#7780) fix legacy comment after variable tensor merge (pytorch#7771) Revert pytorch#7750 and pytorch#7762 to fix Windows CI on master (pytorch#7772) Temporarily disable build env check (pytorch#7768) Add missing brace (pytorch#7762) [C++ API] Add backward() to Tensor and Variable (pytorch#7750) [auto] Update onnx to d43b550 - Fix .gitignore and add missing files (onnx/onnx#1005) onnx/onnx@d43b550 [auto] Update onnx to ea1aa13 - add tests for reduce ops (onnx/onnx#675) onnx/onnx@ea1aa13 include cudnn_h (pytorch#7749) [C++ API] Using new registration mechanism (pytorch#7663) [auto] Update onnx to 5dd68e6 - Add a util function: polish_model (onnx/onnx#1000) onnx/onnx@5dd68e6 ...
This reverts commit 6e89ad4.
@bddppq Just reverted change for the operators. Let's keep this PR for Caffe2 core and CI only.
* Revert "[auto] Update onnx to 4898c9e - Added TensorDenotation and metadata_props for images (onnx/onnx#879) onnx/onnx@4898c9e" This reverts commit 9c679da. * Revert "Add BiasCHW fallback for GPU (#7738)" This reverts commit 14ad2e7. * Revert "[Caffe2] Enabling AMD GPU Backend for Caffe2 (#7566)" This reverts commit 2ebcf4b.
* origin: [Caffe2] Enabling AMD GPU Backend for Caffe2 (pytorch#7566) Call grad_mode.py context managers as decorators (pytorch#7737) catch CPU tensors in checkSameGPU (fixes pytorch#7689) (pytorch#7767) Mark stack as non-executable in NNPACK (pytorch#7752) small fixes in fusion_compiler (pytorch#7776) Run clang-format on c10d (pytorch#7791)
* Add hip support for caffe2 core
* Add MIOPEN header/wrapper to caffe2 core
* Add HIP device into caffe2 PB
* top level makefile change for rocm/hip
* makefile scaffolding for AMD/RocM/HIP
* Makefile scafodding for AMD/RocM/HIP; add makefile/utility for HIP files
* caffe2 PB update for AMD/ROCM HIP device
* Add AMD/RocM/Thrust dependency
* HIP threadpool update
* Fix makefile macro
* makefile fix: duplicate test/binary name
* makefile clean-up
* makefile clean-up
* add HIP operator registry
* add utilities for hip device
* Add USE_HIP to config summary
* makefile fix for BUILD_TEST
* merge latest
* Fix indentation
* code clean-up
* Guard builds without HIP and use the same cmake script as PyTorch to find HIP
* Setup rocm environment variables in build.sh (ideally should be done in the docker images)
* setup locale
* set HIP_PLATFORM
* Revert "set HIP_PLATFORM" This reverts commit 8ec58db.
* continue the build script environment variables mess
* HCC_AMDGPU_TARGET
* Cleanup the mess, has been fixed in the lastest docker images
* Assign protobuf field hip_gpu_id a new field number for backward compatibility
* change name to avoid conflict
* Fix duplicated thread pool flag
* Refactor cmake files to not add hip includes and libs globally
* Fix the wrong usage of environment variables detection in cmake
* Add MIOPEN CNN operators
* Revert "Add MIOPEN CNN operators" This reverts commit 6e89ad4.
* Revert "[auto] Update onnx to 4898c9e - Added TensorDenotation and metadata_props for images (onnx/onnx#879) onnx/onnx@4898c9e" This reverts commit 9c679da. * Revert "Add BiasCHW fallback for GPU (pytorch#7738)" This reverts commit 14ad2e7. * Revert "[Caffe2] Enabling AMD GPU Backend for Caffe2 (pytorch#7566)" This reverts commit 2ebcf4b.
The goal of this PR is to enable the AMD GPU backend for Caffe2.
Major changes include: