
Migrate kernels with Tensor? to C10 full dispatcher #54263

Closed
wenleix wants to merge 7 commits

wenleix commented Mar 18, 2021

Stack from ghstack:

Codemod commands generated by #54223

The signatures of the following 8 methods in LegacyTHFunctionsCUDA.h were changed manually:

_thnn_multi_margin_loss_forward
_thnn_multi_margin_loss_backward
_thnn_nll_loss_forward
_thnn_nll_loss_backward
_thnn_nll_loss2d_forward
_thnn_nll_loss2d_backward
_thnn_conv2d_forward
_thnn_conv_depthwise2d_forward

Differential Revision: D27164092

facebook-github-bot commented Mar 18, 2021

💊 CI failures summary and remediations

As of commit d2bb75b (more details on the Dr. CI page):


  • 5/5 failures possibly* introduced in this PR
    • 1/5 non-scanned failure(s)

🕵️ 2 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_windows_vs2019_py36_cuda10.1_build (1/2)

Step: "Build"

Error generating file

C:/Users/circleci/project/aten/src\ATen/NumericUtils.h(80): error: calling a __host__ function("__ceilf") from a __device__ function(" const") is not allowed

C:/Users/circleci/project/aten/src\ATen/NumericUtils.h(80): error: identifier "__ceilf" is undefined in device code

16 errors detected in the compilation of "C:/Users/circleci/project/build/win_tmp/bin/.tmp8CHCjN/tmpxft_00002474_00000000-7_DistributionGeometricKernel.cpp1.ii".
DistributionGeometricKernel.cu
-- Removing C:/Users/circleci/project/build/caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/./torch_cuda_generated_DistributionGeometricKernel.cu.obj
C:/Jenkins/Miniconda3/Library/bin/cmake.exe -E remove C:/Users/circleci/project/build/caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/./torch_cuda_generated_DistributionGeometricKernel.cu.obj
CMake Error at torch_cuda_generated_DistributionGeometricKernel.cu.obj.Release.cmake:281 (message):
  Error generating file
  C:/Users/circleci/project/build/caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/./torch_cuda_generated_DistributionGeometricKernel.cu.obj


[4485/5556] cmd.exe /C "cd /D C:\Users\circleci\project\build\caffe2\CMakeFiles\torch_cuda.dir\__\aten\src\ATen\native\cuda && C:\Jenkins\Miniconda3\Library\bin\cmake.exe -E make_directory C:/Users/circleci/project/build/caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/. && C:\Jenkins\Miniconda3\Library\bin\cmake.exe -D verbose:BOOL=ON -D build_configuration:STRING=Release -D generated_file:STRING=C:/Users/circleci/project/build/caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/./torch_cuda_generated_DistributionNormal.cu.obj -D generated_cubin_file:STRING=C:/Users/circleci/project/build/caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/./torch_cuda_generated_DistributionNormal.cu.obj.cubin.txt -P C:/Users/circleci/project/build/caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/torch_cuda_generated_DistributionNormal.cu.obj.Release.cmake"
-- Removing C:/Users/circleci/project/build/caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/./torch_cuda_generated_DistributionNormal.cu.obj
C:/Jenkins/Miniconda3/Library/bin/cmake.exe -E remove C:/Users/circleci/project/build/caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/./torch_cuda_generated_DistributionNormal.cu.obj
-- Generating dependency file: C:/Users/circleci/project/build/caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/torch_cuda_generated_DistributionNormal.cu.obj.NVCC-depend
C:/Users/circleci/project/build/win_tmp/bin/randomtemp.exe -M -D__CUDACC__ C:/Users/circleci/project/aten/src/ATen/native/cuda/DistributionNormal.cu -o C:/Users/circleci/project/build/caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/torch_cuda_generated_DistributionNormal.cu.obj.NVCC-depend -ccbin cl.exe -m64 -Dtorch_cuda_EXPORTS -DUSE_CUDA -DTORCH_CUDA_BUILD_MAIN_LIB -DWIN32_LEAN_AND_MEAN -DTH_BLAS_MKL -D_OPENMP_NOFORCE_MANIFEST -DONNX_ML=1 -DONNXIFI_ENABLE_EXT=1 -DONNX_NAMESPACE=onnx_torch -D_CRT_SECURE_NO_DEPRECATE=1 -DMAGMA_V2 -DIDEEP_USE_MKL -DUSE_EXTERNAL_MZCRC -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -Xcompiler ,\"/DWIN32\",\"/D_WINDOWS\",\"/GR\",\"/EHsc\",\"/w\",\"/bigobj\",\"-DUSE_PTHREADPOOL\",\"-openmp:experimental\",\"-IC:/Users/circleci/project/build/win_tmp/mkl/include\",\"-DNDEBUG\",\"-DUSE_FBGEMM\",\"-DUSE_XNNPACK\",\"-DHAVE_AVX_CPU_DEFINITION\",\"-DHAVE_AVX2_CPU_DEFINITION\",\"/MD\",\"/O2\",\"/Ob2\",\"/DNDEBUG\",\"/w\",\"/bigobj\",\"-DNDEBUG\" -Xcompiler /w -w -Xfatbin -compress-all -DONNX_NAMESPACE=onnx_torch --use-local-env -gencode arch=compute_75,code=sm_75 -Xcudafe --diag_suppress=cc_clobber_ignored,--diag_suppress=integer_sign_change,--diag_suppress=useless_using_declaration,--diag_suppress=set_but_not_used,--diag_suppress=field_without_dll_interface,--diag_suppress=base_class_has_different_dll_interface,--diag_suppress=dll_interface_conflict_none_assumed,--diag_suppress=dll_interface_conflict_dllexport_assumed,--diag_suppress=implicit_return_from_non_void_function,--diag_suppress=unsigned_compare_with_zero,--diag_suppress=declared_but_not_referenced,--diag_suppress=bad_friend_decl --Werror cross-execution-space-call --no-host-device-move-forward -Xcompiler -MD --expt-relaxed-constexpr --expt-extended-lambda -Xcompiler=/wd4819,/wd4503,/wd4190,/wd4244,/wd4251,/wd4275,/wd4522 -Wno-deprecated-gpu-targets --expt-extended-lambda -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ 
-D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DNVCC "-IC:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1/include" -IC:/Users/circleci/project/build/aten/src -IC:/Users/circleci/project/aten/src -IC:/Users/circleci/project/build -IC:/Users/circleci/project -IC:/Users/circleci/project/build/third_party/gloo -IC:/Users/circleci/project/cmake/../third_party/gloo -IC:/Users/circleci/project/cmake/../third_party/googletest/googlemock/include -IC:/Users/circleci/project/cmake/../third_party/googletest/googletest/include -IC:/Users/circleci/project/third_party/protobuf/src -IC:/Users/circleci/project/build/win_tmp/mkl/include -IC:/Users/circleci/project/third_party/XNNPACK/include -IC:/Users/circleci/project/cmake/../third_party/benchmark/include -IC:/Users/circleci/project/third_party -IC:/Users/circleci/project/cmake/../third_party/eigen -IC:/Jenkins/Miniconda3/include -IC:/Jenkins/Miniconda3/lib/site-packages/numpy/core/include -IC:/Users/circleci/project/cmake/../third_party/pybind11/include -IC:/Users/circleci/project/cmake/../third_party/cub -IC:/Users/circleci/project/build/caffe2/contrib/aten -IC:/Users/circleci/project/third_party/onnx -IC:/Users/circleci/project/build/third_party/onnx -IC:/Users/circleci/project/third_party/foxi -IC:/Users/circleci/project/build/third_party/foxi -IC:/Users/circleci/project/build/win_tmp/magma/include -IC:/Users/circleci/project/third_party/ideep/mkl-dnn/include -IC:/Users/circleci/project/third_party/ideep/include -IC:/Users/circleci/project/build/include -IC:/Users/circleci/project/build/caffe2/aten/src/TH -IC:/Users/circleci/project/aten/src/TH -IC:/Users/circleci/project/build/caffe2/aten/src/THC -IC:/Users/circleci/project/aten/src/THC -IC:/Users/circleci/project/aten/src/THCUNN -IC:/Users/circleci/project/aten/src/ATen/cuda -IC:/Users/circleci/project/build/caffe2/aten/src -IC:/Users/circleci/project/aten/../third_party/catch/single_include -IC:/Users/circleci/project/aten/src/ATen/.. 
-IC:/Users/circleci/project/build/caffe2/aten/src/ATen -IC:/Users/circleci/project/c10/cuda/../.. -IC:/Users/circleci/project/c10/../ "-IC:/Program Files/NVIDIA Corporation/NvToolsExt/include" -IC:/Users/circleci/project/torch/csrc/api -IC:/Users/circleci/project/torch/csrc/api/include -IC:/Users/circleci/project/build/third_party/ideep/mkl-dnn/include -IC:/Users/circleci/project/third_party/ideep/mkl-dnn/src/../include
DistributionNormal.cu
-- Generating temporary cmake readable file: C:/Users/circleci/project/build/caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/torch_cuda_generated_DistributionNormal.cu.obj.depend.tmp

See CircleCI build pytorch_windows_vs2019_py36_cuda11.1_build (2/2)

Step: "Build"

Error generating file

C:/Users/circleci/project/caffe2/operators/deform_conv_op.cu(82): error: calling a __host__ function("__floorf") from a __device__ function("caffe2::deformable_im2col_bilinear<float> ") is not allowed

C:/Users/circleci/project/caffe2/operators/deform_conv_op.cu(82): error: identifier "__floorf" is undefined in device code

4 errors detected in the compilation of "C:/Users/circleci/project/caffe2/operators/deform_conv_op.cu".
deform_conv_op.cu
-- Removing C:/Users/circleci/project/build/caffe2/CMakeFiles/torch_cuda_cpp.dir/operators/./torch_cuda_cpp_generated_deform_conv_op.cu.obj
C:/Jenkins/Miniconda3/Library/bin/cmake.exe -E remove C:/Users/circleci/project/build/caffe2/CMakeFiles/torch_cuda_cpp.dir/operators/./torch_cuda_cpp_generated_deform_conv_op.cu.obj
CMake Error at torch_cuda_cpp_generated_deform_conv_op.cu.obj.Release.cmake:281 (message):
  Error generating file
  C:/Users/circleci/project/build/caffe2/CMakeFiles/torch_cuda_cpp.dir/operators/./torch_cuda_cpp_generated_deform_conv_op.cu.obj


ninja: build stopped: subcommand failed.
-- Building version 1.9.0a0+gitd2bb75b
 --- Trying to initialize submodules
 --- Submodule initialization took 136.91 sec
cmake -GNinja -DBUILD_ENVIRONMENT=pytorch-win-vs2019-cuda11-cudnn8-py3 -DBUILD_PYTHON=True -DBUILD_SPLIT_CUDA=ON -DBUILD_TEST=True -DBUILD_TYPE=release -DCMAKE_BUILD_TYPE=Release -DCMAKE_GENERATOR=Ninja -DCMAKE_INCLUDE_PATH=C:\Users\circleci\project\build\win_tmp\mkl\include -DCMAKE_INSTALL_PREFIX=C:\Users\circleci\project\torch -DCMAKE_PREFIX_PATH=C:\Jenkins\Miniconda3\Lib\site-packages -DCMAKE_VERBOSE_MAKEFILE=1 -DCUDA_NVCC_EXECUTABLE=C:\Users\circleci\project\build\win_tmp\bin\randomtemp.exe -DCUDNN_LIBRARY=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\lib\x64 -DJAVA_HOME=C:\Program Files\OpenJDK\jdk-12.0.2 -DNUMPY_INCLUDE_DIR=C:\Jenkins\Miniconda3\lib\site-packages\numpy\core\include -DPYTHON_EXECUTABLE=C:\Jenkins\Miniconda3\python.exe -DPYTHON_INCLUDE_DIR=C:\Jenkins\Miniconda3\include -DPYTHON_LIBRARY=C:\Jenkins\Miniconda3/libs/python36.lib -DTORCH_BUILD_VERSION=1.9.0a0+gitd2bb75b -DUSE_CUDA=1 -DUSE_NUMPY=True C:\Users\circleci\project
cmake --build . --target install --config Release -- -j 16
Traceback (most recent call last):

2 failures not recognized by patterns:

| Job | Step |
| --- | --- |
| CircleCI pytorch_linux_bionic_py3_8_gcc9_coverage_test1 | Run tests |
| CircleCI pytorch_linux_bionic_py3_8_gcc9_coverage_test2 | Run tests |

This comment was automatically generated by Dr. CI.

IntArrayRef padding, IntArrayRef output_padding, IntArrayRef stride, IntArrayRef dilation,
int64_t groups, bool benchmark, bool deterministic) {
const Tensor& bias = c10::value_or_else(bias_opt, [] {return Tensor();});
wenleix (Contributor Author) commented:

I will clean up such cases (i.e., kernels whose implementation just raises an error) in a separate pass.
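For readers unfamiliar with the unwrapping line in the snippet above, here is a minimal sketch of what it does. It does not use PyTorch: `Tensor`, `value_or_else`, and `conv_kernel` below are simplified stand-ins written for illustration, on the assumption that `c10::value_or_else` returns the contained value when the optional is set and otherwise calls the fallback.

```cpp
#include <optional>
#include <string>
#include <utility>

// Stand-in for at::Tensor (assumption): a default-constructed instance is "undefined".
struct Tensor {
  bool is_defined = false;
  bool defined() const { return is_defined; }
};

// Simplified stand-in for c10::value_or_else (assumption): return the contained
// value, or the result of calling `fallback` when the optional is empty.
template <typename T, typename F>
T value_or_else(const std::optional<T>& opt, F&& fallback) {
  return opt.has_value() ? *opt : std::forward<F>(fallback)();
}

// Hypothetical kernel taking an optional bias, the way `Tensor?` arguments
// arrive once the kernel is registered without the legacy wrapper.
std::string conv_kernel(const std::optional<Tensor>& bias_opt) {
  // A missing bias becomes a default-constructed (undefined) Tensor,
  // so the rest of the kernel body can stay unchanged.
  const Tensor bias = value_or_else(bias_opt, [] { return Tensor(); });
  return bias.defined() ? "conv with bias" : "conv without bias";
}
```

Calling `conv_kernel(std::nullopt)` takes the "without bias" path, while passing a defined tensor takes the "with bias" path.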

wenleix pushed a commit that referenced this pull request Mar 18, 2021
Pull Request resolved: #54263

Codemod commands generated by #54223
ghstack-source-id: 124314500

wenleix pushed a commit that referenced this pull request Mar 19, 2021
Pull Request resolved: #54263

Codemod commands generated by #54223
ghstack-source-id: 124360597

wenleix requested a review from smessmer March 19, 2021 05:55

smessmer left a comment

Looks great :)

Maybe we should also add a small comment marker to the kernels, so that if we ever want to clean them up further (i.e., have the kernel handle the optional directly instead of creating a default-constructed tensor), we know how to find them.
Maybe something like:

Tensor my_op(...) {
  // See [Note: hacky_wrapper_removal]
  Tensor weight = value_or_else(...);
  ...
}

and then add a [Note: hacky_wrapper_removal] somewhere in a central location.
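To make the suggested cleanup concrete, the sketch below contrasts the marked, codemod-style kernel with a version that inspects the optional directly. It is only an illustration under stated assumptions: `Tensor`, `value_or_else`, and both `uses_weight_*` functions are hypothetical stand-ins, not PyTorch code.

```cpp
#include <optional>
#include <utility>

// Stand-in for at::Tensor (assumption): default-constructed means "undefined".
struct Tensor {
  bool is_defined = false;
  bool defined() const { return is_defined; }
};

// Simplified stand-in for c10::value_or_else (assumption).
template <typename T, typename F>
T value_or_else(const std::optional<T>& opt, F&& fallback) {
  return opt.has_value() ? *opt : std::forward<F>(fallback)();
}

// Codemod style: unwrap into a possibly undefined Tensor, tagged with the marker
// so such kernels can be found later.
bool uses_weight_wrapped(const std::optional<Tensor>& weight_opt) {
  // See [Note: hacky_wrapper_removal]
  const Tensor weight = value_or_else(weight_opt, [] { return Tensor(); });
  return weight.defined();
}

// Possible later cleanup: handle the optional directly, without creating a
// default-constructed Tensor. An optional that holds an undefined Tensor must
// still be treated as "no weight" to keep the behaviour identical.
bool uses_weight_direct(const std::optional<Tensor>& weight_opt) {
  return weight_opt.has_value() && weight_opt->defined();
}
```

Both functions agree for an empty optional, an optional holding an undefined tensor, and an optional holding a defined tensor; the marker comment is what lets a later pass locate kernels still using the first style.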

wenleix pushed a commit that referenced this pull request Mar 20, 2021
Pull Request resolved: #54263

Codemod commands generated by #54223

ghstack-source-id: 124460153

wenleix pushed a commit that referenced this pull request Mar 22, 2021
Pull Request resolved: #54263

Codemod commands generated by #54223

ghstack-source-id: 124535551

wenleix pushed a commit that referenced this pull request Mar 22, 2021
Pull Request resolved: #54263

Codemod commands generated by #54223

ghstack-source-id: 124539990

facebook-github-bot commented

This pull request has been merged in edfc787.

facebook-github-bot deleted the gh/wenleix/5/head branch March 26, 2021 14:16