Compatibility patch for CUDA Toolkit 11.0 and PyTorch 1.8 #41
The changes in CMakeLists.txt are a compatibility patch for CUDA Toolkit 11.0. Since 11.0, `cub` is already included in the CUDA Toolkit, so using the custom `cub` triggers an error. This may fix #34. For the usage of `${CUDA_INCLUDE_DIRS}`, `${CUDA_VERSION_MAJOR}`, and `VERSION_LESS`, refer to https://cmake.org/cmake/help/latest/module/FindCUDA.html.
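The PR performs this check in CMake, but the same idea can be sketched at the source level. A minimal illustration (not the PR's actual code), assuming only the standard `CUDA_VERSION` macro from `cuda.h`:

```cpp
// Sketch only: source-level equivalent of the CMake version guard.
// CUDA_VERSION is defined by cuda.h, e.g. 11000 for CUDA Toolkit 11.0.
#include <cuda.h>

#if CUDA_VERSION >= 11000
#include <cub/cub.cuh>   // CUDA 11.0+ ships cub with the toolkit
#else
#include "cub/cub.cuh"   // older toolkits: fall back to the vendored copy
#endif
```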
The changes in dreamplace/ops are a compatibility patch for PyTorch 1.8.

For the changes in torch.h: regarding the definitions of the functions `rfft()` and `irfft()`, PyTorch 1.7 and 1.8 implement an `at::fft` namespace instead of `at`, to be consistent with the style of NumPy (see https://github.com/pytorch/pytorch/wiki/The-torch.fft-module-in-PyTorch-1.7, which recommends using `at::view_as_complex`), though neither the documentation of previous versions nor that of PyTorch 1.7/1.8 clearly states the differences between the two generations of FFTs and inverse FFTs. Note that in PyTorch 1.7 the old FFT functions still work but are deprecated; in PyTorch 1.8 they are removed entirely. Newer versions of PyTorch also warn that `AT_ERROR` is deprecated, so `TORCH_CHECK_VALUE` is used here instead. Due to the limitations of `std::optional::value_or()` (or `c10::optional::value_or()`), an `if`/`else` block is needed when `signal_ndim == 1`.
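As a rough illustration of the version-guarded wrapper described above (a minimal sketch, not the PR's exact code; it assumes the build defines `TORCH_MAJOR_VERSION`/`TORCH_MINOR_VERSION`, which DREAMPlace supplies at configure time):

```cpp
#include <torch/torch.h>

// Sketch only: forward the legacy rfft() call to whichever API the linked
// PyTorch provides. Macro names are the ones DREAMPlace defines in CMake.
at::Tensor rfft_compat(const at::Tensor& x, int signal_ndim, bool normalized) {
#if TORCH_MAJOR_VERSION > 1 || TORCH_MINOR_VERSION >= 8
  // New torch.fft-style API: returns a complex tensor; view_as_real
  // restores the legacy (..., 2) real layout that callers expect.
  const std::string norm = normalized ? "ortho" : "backward";
  if (signal_ndim == 1) {
    // The new API has separate 1-D and 2-D entry points, hence the if/else.
    return at::view_as_real(at::fft_rfft(x, c10::nullopt, -1, norm));
  } else {  // signal_ndim == 2 assumed here
    return at::view_as_real(at::fft_rfft2(x, c10::nullopt, {-2, -1}, norm));
  }
#else
  // Legacy API: deprecated in PyTorch 1.7, removed in 1.8.
  return at::rfft(x, signal_ndim, normalized, /*onesided=*/true);
#endif
}
```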
However, in dct2_fft2.cpp and dct2_fft2_cuda.cpp, `at::irfft` receives a brace-enclosed initializer list as its last argument. This is fine in previous versions, because the parameter is declared as `IntArrayRef signal_sizes` (see torch/include/ATen/Functions.h); in newer versions, however, it is declared as `c10::optional<c10::IntArrayRef>` (`IntArrayRef` is just `ArrayRef<int64_t>`, i.e. `ArrayRef<long int>`), whose constructors fail to match a brace-enclosed initializer list (see pytorch/pytorch#43545). A brace-enclosed initializer list has no type, so it cannot participate in template argument deduction (see https://en.cppreference.com/w/cpp/language/list_initialization). Using double braces resolves this (see http://gavinchou.github.io/experience/summary/syntax/initializer_list, https://stackoverflow.com/questions/49261221/call-constructor-of-class-with-brace-enclosed-initilizer-list, and Item 7 of Scott Meyers' "Effective Modern C++").
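A self-contained illustration of the mismatch (the helper functions are hypothetical, simplified from the two `at::irfft` signatures):

```cpp
#include <ATen/ATen.h>

void takes_plain(c10::IntArrayRef sizes) { /* old-style parameter */ }
void takes_optional(c10::optional<c10::IntArrayRef> sizes) { /* new-style */ }

void demo(int64_t m, int64_t n) {
  takes_plain({m, n});        // OK: ArrayRef has an initializer_list ctor
  // takes_optional({m, n});  // error: the braced list has no type, so no
                              // constructor of c10::optional is matched
  takes_optional({{m, n}});   // OK: inner braces build the IntArrayRef,
                              // outer braces initialize the optional
}
```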
In torch.h, global_swap_cuda.cpp, and k_reorder_cuda.cpp, the changes to `DREAMPLACE_DISPATCH_FLOATING_TYPES(TYPE, NAME, ...)` and `DISPATCH_CUSTOM_TYPES(TYPE, NAME, ...)` track an interface change introduced in PyTorch 1.8.0, which can be checked here: pytorch/pytorch@v1.7.1...v1.8.0.
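For context, a minimal dispatch macro in the `AT_DISPATCH_FLOATING_TYPES` style that builds against PyTorch 1.8 might look like the sketch below (illustrative only; the real DREAMPlace macros cover more types and both API generations):

```cpp
#include <ATen/ATen.h>

// Illustrative macro, not the PR's: binds `scalar_t` for the lambda body
// and uses TORCH_CHECK instead of the deprecated AT_ERROR.
#define DISPATCH_FLOATING_TYPES_SKETCH(TYPE, NAME, ...)            \
  [&] {                                                            \
    const at::ScalarType st = TYPE;                                \
    switch (st) {                                                  \
      case at::ScalarType::Float: {                                \
        using scalar_t = float;                                    \
        return __VA_ARGS__();                                      \
      }                                                            \
      case at::ScalarType::Double: {                               \
        using scalar_t = double;                                   \
        return __VA_ARGS__();                                      \
      }                                                            \
      default:                                                     \
        TORCH_CHECK(false, NAME, " not implemented for '",         \
                    toString(st), "'");                            \
    }                                                              \
  }()

// Usage: because the lambda text is expanded inside each case, it sees
// scalar_t bound to the matched C++ type.
// DISPATCH_FLOATING_TYPES_SKETCH(t.scalar_type(), "my_op", [&] {
//   my_kernel<scalar_t>(t.data_ptr<scalar_t>(), t.numel());
// });
```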