Skip to content

slow_conv3d grad_weight: call gemm directly #65759

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 7 commits into from

Conversation

@pytorch-probot
Copy link

pytorch-probot bot commented Sep 28, 2021

CI Flow Status

⚛️ CI Flow

Ruleset - Version: v1
Ruleset - File: https://github.com/pytorch/pytorch/blob/5785ca055af445164733c823de147f057093d822/.github/generated-ciflow-ruleset.json
PR ciflow labels: ciflow/default

Workflows Labels (bold enabled) Status
Triggered Workflows
linux-bionic-py3.6-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/noarch, ciflow/xla ✅ triggered
linux-xenial-cuda11.3-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/default, ciflow/linux ✅ triggered
linux-xenial-py3.6-clang7-asan ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/sanitizers ✅ triggered
linux-xenial-py3.6-gcc5.4 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux ✅ triggered
linux-xenial-py3.6-gcc7-bazel-test ciflow/all, ciflow/bazel, ciflow/cpu, ciflow/default, ciflow/linux ✅ triggered
win-vs2019-cpu-py3 ciflow/all, ciflow/cpu, ciflow/default, ciflow/win ✅ triggered
win-vs2019-cuda11.3-py3 ciflow/all, ciflow/cuda, ciflow/default, ciflow/win ✅ triggered
Skipped Workflows
libtorch-linux-xenial-cuda10.2-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux 🚫 skipped
libtorch-linux-xenial-cuda11.3-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux 🚫 skipped
linux-bionic-cuda10.2-py3.9-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/slow 🚫 skipped
linux-xenial-cuda10.2-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/slow 🚫 skipped
parallelnative-linux-xenial-py3.6-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux 🚫 skipped
periodic-libtorch-linux-xenial-cuda11.1-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-linux-xenial-cuda11.1-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-win-vs2019-cuda11.1-py3 ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win 🚫 skipped
puretorch-linux-xenial-py3.6-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux 🚫 skipped

You can add a comment to the PR and tag @pytorchbot with the following commands:
# ciflow rerun, "ciflow/default" will always be added automatically
@pytorchbot ciflow rerun

# ciflow rerun with additional labels "-l <ciflow/label_name>", which is equivalent to adding these labels manually and trigger the rerun
@pytorchbot ciflow rerun -l ciflow/scheduled -l ciflow/slow

For more information, please take a look at the CI Flow Wiki.

@facebook-github-bot
Copy link
Contributor

facebook-github-bot commented Sep 28, 2021

🔗 Helpful links

💊 CI failures summary and remediations

As of commit 5785ca0 (more details on the Dr. CI page):


  • 1/1 failures introduced in this PR

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See GitHub Actions build win-vs2019-cuda11.3-py3 / build (1/1)

Step: "Build" (full log | diagnosis details | 🔁 rerun)

2021-10-05T03:33:31.1048787Z FAILED: bin/ProcessGroupGlooAsyncTest.exe
2021-10-05T03:33:30.5815843Z 
2021-10-05T03:33:30.6243324Z [6162/6338] C:\actions-runner\_work\pytorch\pytorch\build\win_tmp\bin\sccache-cl.exe   /TP -DIDEEP_USE_MKL -DMAGMA_V2 -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DTH_BLAS_MKL -DUSE_C10D_GLOO -DUSE_CUDA -DUSE_DISTRIBUTED -DUSE_EXTERNAL_MZCRC -DWIN32_LEAN_AND_MEAN -D_CRT_SECURE_NO_DEPRECATE=1 -D_OPENMP_NOFORCE_MANIFEST -IC:\actions-runner\_work\pytorch\pytorch\build\aten\src -IC:\actions-runner\_work\pytorch\pytorch\aten\src -IC:\actions-runner\_work\pytorch\pytorch\build -IC:\actions-runner\_work\pytorch\pytorch -IC:\actions-runner\_work\pytorch\pytorch\cmake\..\third_party\benchmark\include -IC:\actions-runner\_work\pytorch\pytorch\cmake\..\third_party\cudnn_frontend\include -IC:\actions-runner\_work\pytorch\pytorch\build\caffe2\contrib\aten -IC:\actions-runner\_work\pytorch\pytorch\third_party\onnx -IC:\actions-runner\_work\pytorch\pytorch\build\third_party\onnx -IC:\actions-runner\_work\pytorch\pytorch\third_party\foxi -IC:\actions-runner\_work\pytorch\pytorch\build\third_party\foxi -IC:\actions-runner\_work\pytorch\pytorch\build\caffe2\..\aten\src -IC:\actions-runner\_work\pytorch\pytorch\build\caffe2\..\aten\src\ATen -IC:\actions-runner\_work\pytorch\pytorch\torch\csrc\api -IC:\actions-runner\_work\pytorch\pytorch\torch\csrc\api\include -IC:\actions-runner\_work\pytorch\pytorch\c10\.. -IC:\actions-runner\_work\pytorch\pytorch\build\third_party\ideep\mkl-dnn\include -IC:\actions-runner\_work\pytorch\pytorch\third_party\ideep\mkl-dnn\src\..\include -IC:\actions-runner\_work\pytorch\pytorch\c10\cuda\..\.. -IC:\actions-runner\_work\pytorch\pytorch\build\third_party\gloo -IC:\actions-runner\_work\pytorch\pytorch\cmake\..\third_party\gloo -IC:\actions-runner\_work\pytorch\pytorch\cmake\..\third_party\googletest\googlemock\include -IC:\actions-runner\_work\pytorch\pytorch\cmake\..\third_party\googletest\googletest\include -IC:\actions-runner\_work\pytorch\pytorch\third_party\protobuf\src -IC:\actions-runner\_work\pytorch\pytorch\build\win_tmp\mkl\include -IC:\actions-runner\_work\pytorch\pytorch\third_party\XNNPACK\include -IC:\actions-runner\_work\pytorch\pytorch\third_party -IC:\actions-runner\_work\pytorch\pytorch\cmake\..\third_party\eigen -IC:\Jenkins\Miniconda3\include -IC:\Jenkins\Miniconda3\lib\site-packages\numpy\core\include -IC:\actions-runner\_work\pytorch\pytorch\cmake\..\third_party\pybind11\include -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\include" -IC:\actions-runner\_work\pytorch\pytorch\build\win_tmp\magma\include -IC:\actions-runner\_work\pytorch\pytorch\third_party\ideep\mkl-dnn\include -IC:\actions-runner\_work\pytorch\pytorch\third_party\ideep\include -I"C:\Program Files\NVIDIA Corporation\NvToolsExt\include" -IC:\actions-runner\_work\pytorch\pytorch\third_party\googletest\googletest\include -IC:\actions-runner\_work\pytorch\pytorch\third_party\googletest\googletest /DWIN32 /D_WINDOWS /GR /EHsc /w /bigobj -DUSE_PTHREADPOOL -openmp:experimental -IC:/actions-runner/_work/pytorch/pytorch/build/win_tmp/mkl/include -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -DHAVE_AVX512_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION /MD /O2 /Ob2 /DNDEBUG /w /bigobj -DNDEBUG -DCAFFE2_USE_GLOO -DCUDA_HAS_FP16=1 -DUSE_GCC_GET_CPUID -DUSE_AVX -DUSE_AVX2 -DTH_HAVE_THREAD /EHsc /DNOMINMAX /wd4267 /wd4251 /wd4522 /wd4838 /wd4305 /wd4244 /wd4190 /wd4101 /wd4996 /wd4275 /bigobj -std:c++14 /showIncludes /Fotest_api\CMakeFiles\test_api.dir\tensor_options_cuda.cpp.obj /Fdtest_api\CMakeFiles\test_api.dir\ /FS -c C:\actions-runner\_work\pytorch\pytorch\test\cpp\api\tensor_options_cuda.cpp
2021-10-05T03:33:30.6256403Z Microsoft (R) C/C++ Optimizing Compiler Version 19.28.29337 for x64
2021-10-05T03:33:30.6256975Z Copyright (C) Microsoft Corporation.  All rights reserved.
2021-10-05T03:33:30.6257319Z 
2021-10-05T03:33:30.7764539Z [6163/6338] C:\actions-runner\_work\pytorch\pytorch\build\win_tmp\bin\sccache-cl.exe   /TP -DIDEEP_USE_MKL -DMAGMA_V2 -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DTH_BLAS_MKL -DUSE_C10D_GLOO -DUSE_CUDA -DUSE_DISTRIBUTED -DUSE_EXTERNAL_MZCRC -DWIN32_LEAN_AND_MEAN -D_CRT_SECURE_NO_DEPRECATE=1 -D_OPENMP_NOFORCE_MANIFEST -IC:\actions-runner\_work\pytorch\pytorch\build\aten\src -IC:\actions-runner\_work\pytorch\pytorch\aten\src -IC:\actions-runner\_work\pytorch\pytorch\build -IC:\actions-runner\_work\pytorch\pytorch -IC:\actions-runner\_work\pytorch\pytorch\cmake\..\third_party\benchmark\include -IC:\actions-runner\_work\pytorch\pytorch\cmake\..\third_party\cudnn_frontend\include -IC:\actions-runner\_work\pytorch\pytorch\build\caffe2\contrib\aten -IC:\actions-runner\_work\pytorch\pytorch\third_party\onnx -IC:\actions-runner\_work\pytorch\pytorch\build\third_party\onnx -IC:\actions-runner\_work\pytorch\pytorch\third_party\foxi -IC:\actions-runner\_work\pytorch\pytorch\build\third_party\foxi -IC:\actions-runner\_work\pytorch\pytorch\build\caffe2\..\aten\src -IC:\actions-runner\_work\pytorch\pytorch\build\caffe2\..\aten\src\ATen -IC:\actions-runner\_work\pytorch\pytorch\torch\csrc\api -IC:\actions-runner\_work\pytorch\pytorch\torch\csrc\api\include -IC:\actions-runner\_work\pytorch\pytorch\c10\.. -IC:\actions-runner\_work\pytorch\pytorch\build\third_party\ideep\mkl-dnn\include -IC:\actions-runner\_work\pytorch\pytorch\third_party\ideep\mkl-dnn\src\..\include -IC:\actions-runner\_work\pytorch\pytorch\c10\cuda\..\.. -IC:\actions-runner\_work\pytorch\pytorch\build\third_party\gloo -IC:\actions-runner\_work\pytorch\pytorch\cmake\..\third_party\gloo -IC:\actions-runner\_work\pytorch\pytorch\cmake\..\third_party\googletest\googlemock\include -IC:\actions-runner\_work\pytorch\pytorch\cmake\..\third_party\googletest\googletest\include -IC:\actions-runner\_work\pytorch\pytorch\third_party\protobuf\src -IC:\actions-runner\_work\pytorch\pytorch\build\win_tmp\mkl\include -IC:\actions-runner\_work\pytorch\pytorch\third_party\XNNPACK\include -IC:\actions-runner\_work\pytorch\pytorch\third_party -IC:\actions-runner\_work\pytorch\pytorch\cmake\..\third_party\eigen -IC:\Jenkins\Miniconda3\include -IC:\Jenkins\Miniconda3\lib\site-packages\numpy\core\include -IC:\actions-runner\_work\pytorch\pytorch\cmake\..\third_party\pybind11\include -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\include" -IC:\actions-runner\_work\pytorch\pytorch\build\win_tmp\magma\include -IC:\actions-runner\_work\pytorch\pytorch\third_party\ideep\mkl-dnn\include -IC:\actions-runner\_work\pytorch\pytorch\third_party\ideep\include -I"C:\Program Files\NVIDIA Corporation\NvToolsExt\include" -IC:\actions-runner\_work\pytorch\pytorch\third_party\googletest\googletest\include -IC:\actions-runner\_work\pytorch\pytorch\third_party\googletest\googletest /DWIN32 /D_WINDOWS /GR /EHsc /w /bigobj -DUSE_PTHREADPOOL -openmp:experimental -IC:/actions-runner/_work/pytorch/pytorch/build/win_tmp/mkl/include -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -DHAVE_AVX512_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION /MD /O2 /Ob2 /DNDEBUG /w /bigobj -DNDEBUG -DCAFFE2_USE_GLOO -DCUDA_HAS_FP16=1 -DUSE_GCC_GET_CPUID -DUSE_AVX -DUSE_AVX2 -DTH_HAVE_THREAD /EHsc /DNOMINMAX /wd4267 /wd4251 /wd4522 /wd4838 /wd4305 /wd4244 /wd4190 /wd4101 /wd4996 /wd4275 /bigobj -std:c++14 /showIncludes /Fotest_api\CMakeFiles\test_api.dir\tensor_options.cpp.obj /Fdtest_api\CMakeFiles\test_api.dir\ /FS -c C:\actions-runner\_work\pytorch\pytorch\test\cpp\api\tensor_options.cpp
2021-10-05T03:33:30.7775465Z Microsoft (R) C/C++ Optimizing Compiler Version 19.28.29337 for x64
2021-10-05T03:33:30.7776038Z Copyright (C) Microsoft Corporation.  All rights reserved.
2021-10-05T03:33:30.7776382Z 
2021-10-05T03:33:31.1043323Z [6164/6338] cmd.exe /C "cd . && C:\Jenkins\Miniconda3\Library\bin\cmake.exe -E vs_link_exe --intdir=test_cpp_c10d\CMakeFiles\ProcessGroupGlooAsyncTest.dir --rc=C:\PROGRA~2\WI3CF2~1\10\bin\100190~1.0\x64\rc.exe --mt=C:\PROGRA~2\WI3CF2~1\10\bin\100190~1.0\x64\mt.exe --manifests  -- C:\PROGRA~2\MICROS~2\2019\BUILDT~1\VC\Tools\MSVC\1428~1.293\bin\Hostx64\x64\link.exe  test_cpp_c10d\CMakeFiles\ProcessGroupGlooAsyncTest.dir\ProcessGroupGlooAsyncTest.cpp.obj  /out:bin\ProcessGroupGlooAsyncTest.exe /implib:lib\ProcessGroupGlooAsyncTest.lib /pdb:bin\ProcessGroupGlooAsyncTest.pdb /version:0.0 /machine:x64 /ignore:4049 /ignore:4217 /ignore:4099 /INCREMENTAL:NO /subsystem:console  lib\c10d_cuda_test.lib  lib\gtest_main.lib  lib\torch_cuda.lib  lib\torch_cuda_cu.lib  lib\torch_cuda_cpp.lib  -INCLUDE:?warp_size@cuda@at@@YAHXZ  lib\c10_cuda.lib  "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\lib\x64\cudart_static.lib"  "C:\Program Files\NVIDIA Corporation\NvToolsExt\lib\x64\nvToolsExt64_1.lib"  lib\torch_cpu.lib  lib\libprotobuf.lib  lib\c10.lib  win_tmp\mkl\lib\mkl_intel_lp64.lib  win_tmp\mkl\lib\mkl_intel_thread.lib  win_tmp\mkl\lib\mkl_core.lib  win_tmp\mkl\lib\libiomp5md.lib  lib\dnnl.lib  "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\lib\x64\cufft.lib"  "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\lib\x64\curand.lib"  "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\lib\x64\cublas.lib"  "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\lib\x64\cudnn.lib"  -INCLUDE:?searchsorted_cuda@native@at@@YA?AVTensor@2@AEBV32@0_N1@Z  lib\gtest.lib  kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib && cd ."
2021-10-05T03:33:31.1048787Z FAILED: bin/ProcessGroupGlooAsyncTest.exe 
2021-10-05T03:33:31.1054315Z cmd.exe /C "cd . && C:\Jenkins\Miniconda3\Library\bin\cmake.exe -E vs_link_exe --intdir=test_cpp_c10d\CMakeFiles\ProcessGroupGlooAsyncTest.dir --rc=C:\PROGRA~2\WI3CF2~1\10\bin\100190~1.0\x64\rc.exe --mt=C:\PROGRA~2\WI3CF2~1\10\bin\100190~1.0\x64\mt.exe --manifests  -- C:\PROGRA~2\MICROS~2\2019\BUILDT~1\VC\Tools\MSVC\1428~1.293\bin\Hostx64\x64\link.exe  test_cpp_c10d\CMakeFiles\ProcessGroupGlooAsyncTest.dir\ProcessGroupGlooAsyncTest.cpp.obj  /out:bin\ProcessGroupGlooAsyncTest.exe /implib:lib\ProcessGroupGlooAsyncTest.lib /pdb:bin\ProcessGroupGlooAsyncTest.pdb /version:0.0 /machine:x64 /ignore:4049 /ignore:4217 /ignore:4099 /INCREMENTAL:NO /subsystem:console  lib\c10d_cuda_test.lib  lib\gtest_main.lib  lib\torch_cuda.lib  lib\torch_cuda_cu.lib  lib\torch_cuda_cpp.lib  -INCLUDE:?warp_size@cuda@at@@YAHXZ  lib\c10_cuda.lib  "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\lib\x64\cudart_static.lib"  "C:\Program Files\NVIDIA Corporation\NvToolsExt\lib\x64\nvToolsExt64_1.lib"  lib\torch_cpu.lib  lib\libprotobuf.lib  lib\c10.lib  win_tmp\mkl\lib\mkl_intel_lp64.lib  win_tmp\mkl\lib\mkl_intel_thread.lib  win_tmp\mkl\lib\mkl_core.lib  win_tmp\mkl\lib\libiomp5md.lib  lib\dnnl.lib  "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\lib\x64\cufft.lib"  "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\lib\x64\curand.lib"  "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\lib\x64\cublas.lib"  "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\lib\x64\cudnn.lib"  -INCLUDE:?searchsorted_cuda@native@at@@YA?AVTensor@2@AEBV32@0_N1@Z  lib\gtest.lib  kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib && cd ."
2021-10-05T03:33:31.1060536Z MT: command "C:\PROGRA~2\WI3CF2~1\10\bin\100190~1.0\x64\mt.exe /nologo /manifest bin\ProcessGroupGlooAsyncTest.exe.manifest /outputresource:bin\ProcessGroupGlooAsyncTest.exe;#1" failed (exit code 0x1f) with the following output:
2021-10-05T03:33:31.1061611Z 
2021-10-05T03:33:31.1062348Z mt.exe : general error c101008d: Failed to write the updated manifest to the resource of file "bin\ProcessGroupGlooAsyncTest.exe". Access is denied.
2021-10-05T03:33:31.1063005Z 
2021-10-05T03:33:31.4112647Z [6165/6338] C:\actions-runner\_work\pytorch\pytorch\build\win_tmp\bin\sccache-cl.exe   /TP -DIDEEP_USE_MKL -DMAGMA_V2 -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DTH_BLAS_MKL -DUSE_C10D_GLOO -DUSE_CUDA -DUSE_DISTRIBUTED -DUSE_EXTERNAL_MZCRC -DWIN32_LEAN_AND_MEAN -D_CRT_SECURE_NO_DEPRECATE=1 -D_OPENMP_NOFORCE_MANIFEST -IC:\actions-runner\_work\pytorch\pytorch\build\aten\src -IC:\actions-runner\_work\pytorch\pytorch\aten\src -IC:\actions-runner\_work\pytorch\pytorch\build -IC:\actions-runner\_work\pytorch\pytorch -IC:\actions-runner\_work\pytorch\pytorch\cmake\..\third_party\benchmark\include -IC:\actions-runner\_work\pytorch\pytorch\cmake\..\third_party\cudnn_frontend\include -IC:\actions-runner\_work\pytorch\pytorch\build\caffe2\contrib\aten -IC:\actions-runner\_work\pytorch\pytorch\third_party\onnx -IC:\actions-runner\_work\pytorch\pytorch\build\third_party\onnx -IC:\actions-runner\_work\pytorch\pytorch\third_party\foxi -IC:\actions-runner\_work\pytorch\pytorch\build\third_party\foxi -IC:\actions-runner\_work\pytorch\pytorch\build\caffe2\..\aten\src -IC:\actions-runner\_work\pytorch\pytorch\build\caffe2\..\aten\src\ATen -IC:\actions-runner\_work\pytorch\pytorch\torch\csrc\api -IC:\actions-runner\_work\pytorch\pytorch\torch\csrc\api\include -IC:\actions-runner\_work\pytorch\pytorch\c10\.. -IC:\actions-runner\_work\pytorch\pytorch\build\third_party\ideep\mkl-dnn\include -IC:\actions-runner\_work\pytorch\pytorch\third_party\ideep\mkl-dnn\src\..\include -IC:\actions-runner\_work\pytorch\pytorch\c10\cuda\..\.. -IC:\actions-runner\_work\pytorch\pytorch\build\third_party\gloo -IC:\actions-runner\_work\pytorch\pytorch\cmake\..\third_party\gloo -IC:\actions-runner\_work\pytorch\pytorch\cmake\..\third_party\googletest\googlemock\include -IC:\actions-runner\_work\pytorch\pytorch\cmake\..\third_party\googletest\googletest\include -IC:\actions-runner\_work\pytorch\pytorch\third_party\protobuf\src -IC:\actions-runner\_work\pytorch\pytorch\build\win_tmp\mkl\include -IC:\actions-runner\_work\pytorch\pytorch\third_party\XNNPACK\include -IC:\actions-runner\_work\pytorch\pytorch\third_party -IC:\actions-runner\_work\pytorch\pytorch\cmake\..\third_party\eigen -IC:\Jenkins\Miniconda3\include -IC:\Jenkins\Miniconda3\lib\site-packages\numpy\core\include -IC:\actions-runner\_work\pytorch\pytorch\cmake\..\third_party\pybind11\include -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\include" -IC:\actions-runner\_work\pytorch\pytorch\build\win_tmp\magma\include -IC:\actions-runner\_work\pytorch\pytorch\third_party\ideep\mkl-dnn\include -IC:\actions-runner\_work\pytorch\pytorch\third_party\ideep\include -I"C:\Program Files\NVIDIA Corporation\NvToolsExt\include" -IC:\actions-runner\_work\pytorch\pytorch\third_party\googletest\googletest\include -IC:\actions-runner\_work\pytorch\pytorch\third_party\googletest\googletest /DWIN32 /D_WINDOWS /GR /EHsc /w /bigobj -DUSE_PTHREADPOOL -openmp:experimental -IC:/actions-runner/_work/pytorch/pytorch/build/win_tmp/mkl/include -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -DHAVE_AVX512_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION /MD /O2 /Ob2 /DNDEBUG /w /bigobj -DNDEBUG -DCAFFE2_USE_GLOO -DCUDA_HAS_FP16=1 -DUSE_GCC_GET_CPUID -DUSE_AVX -DUSE_AVX2 -DTH_HAVE_THREAD /EHsc /DNOMINMAX /wd4267 /wd4251 /wd4522 /wd4838 /wd4305 /wd4244 /wd4190 /wd4101 /wd4996 /wd4275 /bigobj -std:c++14 /showIncludes /Fotest_api\CMakeFiles\test_api.dir\tensor_indexing.cpp.obj /Fdtest_api\CMakeFiles\test_api.dir\ /FS -c C:\actions-runner\_work\pytorch\pytorch\test\cpp\api\tensor_indexing.cpp
2021-10-05T03:33:31.4123645Z Microsoft (R) C/C++ Optimizing Compiler Version 19.28.29337 for x64
2021-10-05T03:33:31.4124220Z Copyright (C) Microsoft Corporation.  All rights reserved.
2021-10-05T03:33:31.4124565Z 
2021-10-05T03:33:31.7499039Z [6166/6338] C:\actions-runner\_work\pytorch\pytorch\build\win_tmp\bin\sccache-cl.exe   /TP -DIDEEP_USE_MKL -DMAGMA_V2 -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DTH_BLAS_MKL -DUSE_C10D_GLOO -DUSE_CUDA -DUSE_DISTRIBUTED -DUSE_EXTERNAL_MZCRC -DWIN32_LEAN_AND_MEAN -D_CRT_SECURE_NO_DEPRECATE=1 -D_OPENMP_NOFORCE_MANIFEST -IC:\actions-runner\_work\pytorch\pytorch\build\aten\src -IC:\actions-runner\_work\pytorch\pytorch\aten\src -IC:\actions-runner\_work\pytorch\pytorch\build -IC:\actions-runner\_work\pytorch\pytorch -IC:\actions-runner\_work\pytorch\pytorch\cmake\..\third_party\benchmark\include -IC:\actions-runner\_work\pytorch\pytorch\cmake\..\third_party\cudnn_frontend\include -IC:\actions-runner\_work\pytorch\pytorch\build\caffe2\contrib\aten -IC:\actions-runner\_work\pytorch\pytorch\third_party\onnx -IC:\actions-runner\_work\pytorch\pytorch\build\third_party\onnx -IC:\actions-runner\_work\pytorch\pytorch\third_party\foxi -IC:\actions-runner\_work\pytorch\pytorch\build\third_party\foxi -IC:\actions-runner\_work\pytorch\pytorch\build\caffe2\..\aten\src -IC:\actions-runner\_work\pytorch\pytorch\build\caffe2\..\aten\src\ATen -IC:\actions-runner\_work\pytorch\pytorch\torch\csrc\api -IC:\actions-runner\_work\pytorch\pytorch\torch\csrc\api\include -IC:\actions-runner\_work\pytorch\pytorch\c10\.. -IC:\actions-runner\_work\pytorch\pytorch\build\third_party\ideep\mkl-dnn\include -IC:\actions-runner\_work\pytorch\pytorch\third_party\ideep\mkl-dnn\src\..\include -IC:\actions-runner\_work\pytorch\pytorch\c10\cuda\..\.. -IC:\actions-runner\_work\pytorch\pytorch\build\third_party\gloo -IC:\actions-runner\_work\pytorch\pytorch\cmake\..\third_party\gloo -IC:\actions-runner\_work\pytorch\pytorch\cmake\..\third_party\googletest\googlemock\include -IC:\actions-runner\_work\pytorch\pytorch\cmake\..\third_party\googletest\googletest\include -IC:\actions-runner\_work\pytorch\pytorch\third_party\protobuf\src -IC:\actions-runner\_work\pytorch\pytorch\build\win_tmp\mkl\include -IC:\actions-runner\_work\pytorch\pytorch\third_party\XNNPACK\include -IC:\actions-runner\_work\pytorch\pytorch\third_party -IC:\actions-runner\_work\pytorch\pytorch\cmake\..\third_party\eigen -IC:\Jenkins\Miniconda3\include -IC:\Jenkins\Miniconda3\lib\site-packages\numpy\core\include -IC:\actions-runner\_work\pytorch\pytorch\cmake\..\third_party\pybind11\include -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\include" -IC:\actions-runner\_work\pytorch\pytorch\build\win_tmp\magma\include -IC:\actions-runner\_work\pytorch\pytorch\third_party\ideep\mkl-dnn\include -IC:\actions-runner\_work\pytorch\pytorch\third_party\ideep\include -I"C:\Program Files\NVIDIA Corporation\NvToolsExt\include" -IC:\actions-runner\_work\pytorch\pytorch\third_party\googletest\googletest\include -IC:\actions-runner\_work\pytorch\pytorch\third_party\googletest\googletest /DWIN32 /D_WINDOWS /GR /EHsc /w /bigobj -DUSE_PTHREADPOOL -openmp:experimental -IC:/actions-runner/_work/pytorch/pytorch/build/win_tmp/mkl/include -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -DHAVE_AVX512_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION /MD /O2 /Ob2 /DNDEBUG /w /bigobj -DNDEBUG -DCAFFE2_USE_GLOO -DCUDA_HAS_FP16=1 -DUSE_GCC_GET_CPUID -DUSE_AVX -DUSE_AVX2 -DTH_HAVE_THREAD /EHsc /DNOMINMAX /wd4267 /wd4251 /wd4522 /wd4838 /wd4305 /wd4244 /wd4190 /wd4101 /wd4996 /wd4275 /bigobj -std:c++14 /showIncludes /Fotest_api\CMakeFiles\test_api.dir\tensor.cpp.obj /Fdtest_api\CMakeFiles\test_api.dir\ /FS -c C:\actions-runner\_work\pytorch\pytorch\test\cpp\api\tensor.cpp

This comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.

Please report bugs/suggestions to the (internal) Dr. CI Users group.

Click here to manually regenerate this comment.

peterbell10 added a commit that referenced this pull request Sep 28, 2021
ghstack-source-id: 9a96825
Pull Request resolved: #65759
@ngimel
Copy link
Collaborator

ngimel commented Sep 29, 2021

@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

peterbell10 added a commit that referenced this pull request Sep 29, 2021
ghstack-source-id: 4f0fa6e
Pull Request resolved: #65759
@ngimel
Copy link
Collaborator

ngimel commented Sep 30, 2021

@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

peterbell10 added a commit that referenced this pull request Oct 2, 2021
ghstack-source-id: f5e9255
Pull Request resolved: #65759
peterbell10 added a commit that referenced this pull request Oct 2, 2021
ghstack-source-id: d29c103
Pull Request resolved: #65759
const int64_t ldc = m;

for (int64_t group = 0; group < groups; ++group) {
at::native::cpublas::gemm(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, we probably need bmm path.

peterbell10 added a commit that referenced this pull request Oct 5, 2021
ghstack-source-id: d3a2887
Pull Request resolved: #65759
@ngimel
Copy link
Collaborator

ngimel commented Oct 5, 2021

@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot facebook-github-bot deleted the gh/peterbell10/156/head branch October 12, 2021 14:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants