
Compatibility patch for CUDA Toolkit 11.0 and PyTorch 1.8 #41

Merged (1 commit) on Jul 2, 2021

Conversation

@yihuajack (Contributor) commented Jun 12, 2021

The changes in CMakeLists.txt are a compatibility patch for CUDA Toolkit 11.0. Since 11.0, cub has been bundled with the CUDA Toolkit, so using the custom cub triggers an error. This may fix #34.
For the usage of ${CUDA_INCLUDE_DIRS}, ${CUDA_VERSION_MAJOR}, and VERSION_LESS, refer to https://cmake.org/cmake/help/latest/module/FindCUDA.html
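The version guard described above can be sketched roughly as follows (the thirdparty/cub path and target layout are assumptions here; adapt them to the actual project tree):

```cmake
find_package(CUDA REQUIRED)
# CUDA 11.0+ ships its own cub; only fall back to a vendored copy on older toolkits.
if(${CUDA_VERSION_MAJOR} VERSION_LESS 11)
  include_directories(${PROJECT_SOURCE_DIR}/thirdparty/cub)  # assumed location of the bundled cub
endif()
include_directories(${CUDA_INCLUDE_DIRS})
```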

The changes in dreamplace/ops are a compatibility patch for PyTorch 1.8.
Regarding the changes in torch.h to the definitions of rfft() and irfft(): PyTorch 1.7 and 1.8 implement these functions in an at::fft namespace instead of at, to be consistent with the style of NumPy (see https://github.com/pytorch/pytorch/wiki/The-torch.fft-module-in-PyTorch-1.7, which recommends using at::view_as_complex), although neither the documentation of previous versions nor that of PyTorch 1.7/1.8 clearly states the differences between the two versions of the FFTs and inverse FFTs. Note that in PyTorch 1.7 the old ffts and iffts still work but are deprecated, and they are removed entirely in PyTorch 1.8. Newer PyTorch also warns that AT_ERROR is deprecated, so TORCH_CHECK_VALUE is used here instead. Due to the limitations of std::value_or() (or c10::value_or()), an if/else block is needed when signal_ndim == 1.
However, in dct2_fft2.cpp and dct2_fft2_cuda.cpp, at::irfft receives its last argument as a <brace-enclosed initializer list>. This was fine in previous versions, where the argument was declared as IntArrayRef signal_sizes (see torch/include/ATen/Functions.h), but in newer versions it is declared as c10::optional<c10::IntArrayRef> (IntArrayRef is just ArrayRef<int64_t>, i.e. ArrayRef<long int>), whose constructors fail to match the <brace-enclosed initializer list> (see pytorch/pytorch#43545). A <brace-enclosed initializer list> has no type, so it cannot participate in template argument deduction (see https://en.cppreference.com/w/cpp/language/list_initialization). Using double braces resolves this (see http://gavinchou.github.io/experience/summary/syntax/initializer_list, https://stackoverflow.com/questions/49261221/call-constructor-of-class-with-brace-enclosed-initilizer-list, and Item 7 of Scott Meyers' "Effective Modern C++").

In torch.h, global_swap_cuda.cpp, and k_reorder_cuda.cpp, the changes to DREAMPLACE_DISPATCH_FLOATING_TYPES(TYPE, NAME, ...) and DISPATCH_CUSTOM_TYPES(TYPE, NAME, ...) follow a change introduced in PyTorch 1.8.0, which can be checked here: pytorch/pytorch@v1.7.1...v1.8.0.

@limbo018 (Owner)

Hi Yihua, I appreciate your efforts. Could you make sure the patch passes unittest/ops/dct_unittest.py? I tried it and it failed. The patch is required to pass this unit test before it can be committed.

@yihuajack (Contributor, Author)

I'll work on this

@limbo018 (Owner)

> I'll work on this

Hi Yihua, I spent some time fixing the problems in your patch and merged it into the develop branch. It now passes the unit test and global placement, but I still hit a problem in detailed placement (a runtime failure with CUDA 11; it works fine with older CUDA versions).

You can check my fix in the develop branch. If no other issues are reported, I will merge into the master branch this week.

@yihuajack (Contributor, Author)

Appreciate your effort! But on the upstream develop branch I hit a weird problem: when generating dct.stamp, it reported ModuleNotFoundError: No module named 'torch'. This never happened before. I checked that CMake does use /usr/bin/python, and I can import torch with /usr/bin/python in a terminal. I can also successfully run python ${SETUP_PY} build manually under dct. What could be the reason for this? It's really troublesome...

@limbo018 (Owner)

Hi, I have replaced all setup.py.in files with a pure CMake build system in the latest commits (521844) to the develop branch. (This had been done internally for a while, but had not been pushed to the public version.) It no longer relies on the stamping scheme. You can give it a try and see whether you still encounter the problem.

@yihuajack (Contributor, Author)

Glad to find that this version builds successfully.

@limbo018 merged commit c43958f into limbo018:master on Jul 2, 2021
@yihuajack deleted the yihuajack/patch branch on Jul 9, 2021
Merging this pull request may close: Questions on updating to Latest CUDA/cub/thrust