
Implementation of Wishart distribution #68588

Closed
wants to merge 188 commits

Conversation

Contributor

@nonconvexopt commented Nov 18, 2021

Fixes #68050

TODO:
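To make the scope of the change concrete, here is a minimal usage sketch of the distribution this PR adds, modeled on the existing torch.distributions API (e.g. MultivariateNormal). The import path and the keyword names df and covariance_matrix are assumptions for illustration, not confirmed by this thread:

import torch
from torch.distributions import Wishart  # import path assumed for this sketch

# Degrees of freedom should exceed p - 1 for a p x p scale matrix.
df = torch.tensor(4.0)
covariance_matrix = torch.eye(2)  # positive-definite 2 x 2 scale matrix

w = Wishart(df=df, covariance_matrix=covariance_matrix)  # keyword names assumed

sample = w.rsample()             # draws a positive-definite 2 x 2 matrix
log_prob = w.log_prob(sample)    # log-density of that draw
print(sample, log_prob, w.mean)  # the mean of Wishart(df, V) is df * V

If the sampler follows the Bartlett decomposition, rsample() would also support reparameterized gradients through the drawn matrix, as with the other continuous distributions.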


pytorch-probot bot commented Nov 18, 2021

CI Flow Status

⚛️ CI Flow

Ruleset - Version: v1
Ruleset - File: https://github.com/nonconvexopt/pytorch/blob/d08a1fabaea9405327f6a3135b65bd385bd19470/.github/generated-ciflow-ruleset.json
PR ciflow labels: ciflow/default

Workflow | Labels (bold = enabled) | Status
Triggered Workflows
linux-bionic-py3.6-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/noarch, ciflow/trunk, ciflow/xla ✅ triggered
linux-docs ciflow/all, ciflow/cpu, ciflow/default, ciflow/docs, ciflow/linux, ciflow/trunk ✅ triggered
linux-vulkan-bionic-py3.6-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk, ciflow/vulkan ✅ triggered
linux-xenial-cuda11.3-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-cuda11.3-py3.6-gcc7-bazel-test ciflow/all, ciflow/bazel, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3-clang5-mobile-build ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3-clang5-mobile-custom-build-static ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3.6-clang7-asan ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/sanitizers, ciflow/trunk ✅ triggered
linux-xenial-py3.6-clang7-onnx ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/onnx, ciflow/trunk ✅ triggered
linux-xenial-py3.6-gcc5.4 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3.6-gcc7 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single-full-jit ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
win-vs2019-cpu-py3 ciflow/all, ciflow/cpu, ciflow/default, ciflow/trunk, ciflow/win ✅ triggered
win-vs2019-cuda11.3-py3 ciflow/all, ciflow/cuda, ciflow/default, ciflow/trunk, ciflow/win ✅ triggered
Skipped Workflows
caffe2-linux-xenial-py3.6-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
docker-builds ciflow/all, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64 ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64-coreml ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64-custom-ops ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64-full-jit ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64-metal ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-x86-64 ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-x86-64-coreml ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-x86-64-full-jit ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
libtorch-linux-xenial-cuda10.2-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk 🚫 skipped
libtorch-linux-xenial-cuda11.3-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk 🚫 skipped
linux-bionic-cuda10.2-py3.9-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/slow, ciflow/trunk 🚫 skipped
linux-docs-push ciflow/all, ciflow/cpu, ciflow/linux, ciflow/scheduled 🚫 skipped
macos-10-15-py3-arm64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
macos-10-15-py3-lite-interpreter-x86-64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
macos-11-py3-x86-64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
parallelnative-linux-xenial-py3.6-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
periodic-libtorch-linux-bionic-cuda11.5-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-libtorch-linux-xenial-cuda11.1-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-linux-bionic-cuda11.5-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-linux-xenial-cuda10.2-py3-gcc7-slow-gradcheck ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled, ciflow/slow, ciflow/slow-gradcheck 🚫 skipped
periodic-linux-xenial-cuda11.1-py3.6-gcc7-debug ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-win-vs2019-cuda11.1-py3 ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win 🚫 skipped
periodic-win-vs2019-cuda11.5-py3 ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win 🚫 skipped

You can add a comment to the PR and tag @pytorchbot with the following commands:
# ciflow rerun, "ciflow/default" will always be added automatically
@pytorchbot ciflow rerun

# ciflow rerun with additional labels "-l <ciflow/label_name>", which is equivalent to adding these labels manually and triggering the rerun
@pytorchbot ciflow rerun -l ciflow/scheduled -l ciflow/slow

For more information, please take a look at the CI Flow Wiki.

Contributor

facebook-github-bot commented Nov 18, 2021

🔗 Helpful links

💊 CI failures summary and remediations

As of commit d08a1fa (more details on the Dr. CI page):



🕵️ 18 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See GitHub Actions build periodic-win-vs2019-cuda11.1-py3 / test (default, 1, 2, windows.8xlarge.nvidia.gpu) (1/18)

Step: "Unknown" (full log | diagnosis details | 🔁 rerun)

2021-12-24T02:01:23.5392334Z RuntimeError: test_dataloader failed!
2021-12-24T02:01:23.2142423Z   File "C:\Jenkins\Miniconda3\lib\multiprocessing\connection.py", line 879, in wait
2021-12-24T02:01:23.2143316Z     ready_handles = _exhaustive_wait(waithandle_to_obj.keys(), timeout)
2021-12-24T02:01:23.2144284Z   File "C:\Jenkins\Miniconda3\lib\multiprocessing\connection.py", line 811, in _exhaustive_wait
2021-12-24T02:01:23.2145327Z     res = _winapi.WaitForMultipleObjects(L, False, timeout)
2021-12-24T02:01:23.2146057Z KeyboardInterrupt
2021-12-24T02:01:23.5388113Z Traceback (most recent call last):
2021-12-24T02:01:23.5389036Z   File "run_test.py", line 1097, in <module>
2021-12-24T02:01:23.5389741Z     main()
2021-12-24T02:01:23.5390268Z   File "run_test.py", line 1075, in main
2021-12-24T02:01:23.5391669Z     raise RuntimeError(err_message)
2021-12-24T02:01:23.5392334Z RuntimeError: test_dataloader failed!
2021-12-24T02:01:23.9307932Z Terminate batch job (Y/N)? 
2021-12-24T02:01:23.9309799Z 
2021-12-24T02:01:23.9310726Z (base) C:\actions-runner\_work\pytorch\pytorch\test>if ERRORLEVEL 1 exit /b 1 
2021-12-24T02:01:23.9348283Z + cleanup
2021-12-24T02:01:23.9348971Z + retcode=1
2021-12-24T02:01:23.9349441Z + set +x
2021-12-24T02:01:24.0294976Z ##[error]The operation was canceled.
2021-12-24T02:01:24.0898113Z ##[group]Run # -ir => recursive include all files in pattern
2021-12-24T02:01:24.0899172Z # -ir => recursive include all files in pattern
2021-12-24T02:01:24.0899867Z 7z a "test-jsons-$Env:FILE_SUFFIX.zip" -ir'!test\*.json'

See GitHub Actions build win-vs2019-cpu-py3 / test (default, 2, 2, windows.4xlarge) (2/18)

Step: "Unknown" (full log | diagnosis details | 🔁 rerun)

2021-12-24T01:25:58.7207439Z ls: cannot access ...d/win_tmp/ci_scripts/*': No such file or directory
2021-12-24T01:25:58.6126013Z + PYTORCH_FINAL_PACKAGE_DIR=/c/1617581009/build-results/
2021-12-24T01:25:58.6188639Z ++ cygpath -w /c/1617581009/build-results/
2021-12-24T01:25:58.6286741Z + PYTORCH_FINAL_PACKAGE_DIR_WIN='C:\1617581009\build-results\'
2021-12-24T01:25:58.6287293Z + export PYTORCH_FINAL_PACKAGE_DIR_WIN
2021-12-24T01:25:58.6287719Z + export PYTORCH_TEST_SKIP_NOARCH=1
2021-12-24T01:25:58.6288112Z + PYTORCH_TEST_SKIP_NOARCH=1
2021-12-24T01:25:58.6288627Z + mkdir -p /c/actions-runner/_work/pytorch/pytorch/build/win_tmp/build/torch
2021-12-24T01:25:58.6672210Z + CI_SCRIPTS_DIR=/c/actions-runner/_work/pytorch/pytorch/build/win_tmp/ci_scripts
2021-12-24T01:25:58.6672974Z + mkdir -p /c/actions-runner/_work/pytorch/pytorch/build/win_tmp/ci_scripts
2021-12-24T01:25:58.6863235Z ++ ls '/c/actions-runner/_work/pytorch/pytorch/build/win_tmp/ci_scripts/*'
2021-12-24T01:25:58.7207439Z ls: cannot access '/c/actions-runner/_work/pytorch/pytorch/build/win_tmp/ci_scripts/*': No such file or directory
2021-12-24T01:25:58.7210370Z + '[' -n '' ']'
2021-12-24T01:25:58.7211088Z + export SCRIPT_HELPERS_DIR=/c/actions-runner/_work/pytorch/pytorch/.jenkins/pytorch/win-test-helpers
2021-12-24T01:25:58.7211958Z + SCRIPT_HELPERS_DIR=/c/actions-runner/_work/pytorch/pytorch/.jenkins/pytorch/win-test-helpers
2021-12-24T01:25:58.7212576Z + [[ win-vs2019-cpu-py3 == *cuda11* ]]
2021-12-24T01:25:58.7213497Z + [[ default = \f\o\r\c\e\_\o\n\_\c\p\u ]]
2021-12-24T01:25:58.7213871Z + [[ default == \s\m\o\k\e\_\t\e\s\t\s ]]
2021-12-24T01:25:58.7214181Z + run_tests
2021-12-24T01:25:58.7214765Z + for path in '/c/Program Files/NVIDIA Corporation/NVSMI/nvidia-smi.exe' /c/Windows/System32/nvidia-smi.exe
2021-12-24T01:25:58.7215492Z + [[ -x /c/Program Files/NVIDIA Corporation/NVSMI/nvidia-smi.exe ]]
2021-12-24T01:25:58.7222907Z + '/c/Program Files/NVIDIA Corporation/NVSMI/nvidia-smi.exe'

See GitHub Actions build linux-bionic-cuda10.2-py3.9-gcc7 / test (nogpu_NO_AVX2, 1, 1, linux.2xlarge) (3/18)

Step: "Unknown" (full log | diagnosis details | 🔁 rerun)

2021-12-24T01:58:30.0299171Z test_add_done_ca...arg() takes 0 positional arguments but 1 was given
2021-12-24T01:58:30.0289897Z   /opt/conda/lib/python3.9/unittest/suite.py(122): run
2021-12-24T01:58:30.0290404Z   /opt/conda/lib/python3.9/unittest/suite.py(84): __call__
2021-12-24T01:58:30.0291107Z   /opt/conda/lib/python3.9/site-packages/xmlrunner/runner.py(66): run
2021-12-24T01:58:30.0291737Z   /opt/conda/lib/python3.9/unittest/main.py(271): runTests
2021-12-24T01:58:30.0292256Z   /opt/conda/lib/python3.9/unittest/main.py(101): __init__
2021-12-24T01:58:30.0293180Z   /opt/conda/lib/python3.9/site-packages/torch/testing/_internal/common_utils.py(660): run_tests
2021-12-24T01:58:30.0294310Z   /var/lib/jenkins/workspace/test/test_futures.py(331): <module>
2021-12-24T01:58:30.0294661Z 
2021-12-24T01:58:30.0294917Z ok (0.001s)
2021-12-24T01:58:30.0295464Z   test_add_done_callback_maintains_callback_order (__main__.TestFuture) ... ok (0.001s)
2021-12-24T01:58:30.0299171Z   test_add_done_callback_no_arg_error_is_ignored (__main__.TestFuture) ... [E pybind_utils.h:201] Got the following error when running the callback: TypeError: no_arg() takes 0 positional arguments but 1 was given
2021-12-24T01:58:30.0300007Z ok (0.001s)
2021-12-24T01:58:30.0308423Z   test_add_done_callback_simple (__main__.TestFuture) ... ok (0.001s)
2021-12-24T01:58:30.0332043Z   test_chained_then (__main__.TestFuture) ... ok (0.002s)
2021-12-24T01:58:30.1348350Z   test_collect_all (__main__.TestFuture) ... ok (0.101s)
2021-12-24T01:58:30.1356090Z   test_done (__main__.TestFuture) ... ok (0.001s)
2021-12-24T01:58:30.1367685Z   test_done_exception (__main__.TestFuture) ... ok (0.001s)
2021-12-24T01:58:30.1381967Z   test_interleaving_then_and_add_done_callback_maintains_callback_order (__main__.TestFuture) ... ok (0.001s)
2021-12-24T01:58:30.1391588Z   test_interleaving_then_and_add_done_callback_propagates_error (__main__.TestFuture) ... [E pybind_utils.h:201] Got the following error when running the callback: ValueError: Expected error
2021-12-24T01:58:30.1392869Z 
2021-12-24T01:58:30.1393459Z At:

See GitHub Actions build linux-xenial-py3.6-clang7-asan / test (default, 2, 3, linux.2xlarge) (4/18)

Step: "Unknown" (full log | diagnosis details | 🔁 rerun)

2021-12-24T01:07:03.4287285Z SUMMARY: Undefined.../jenkins/workspace/aten/src/ATen/Utils.cpp:20:3 in
2021-12-24T01:07:03.3794118Z     #9 0x5579926498f2 in PyEval_EvalCode /home/builder/ktietz/cos6/ci_cos6/python_1622833237666/work/Python/ceval.c:731
2021-12-24T01:07:03.3794847Z     #10 0x5579926b1cd5 in run_mod /home/builder/ktietz/cos6/ci_cos6/python_1622833237666/work/Python/pythonrun.c:1025
2021-12-24T01:07:03.3795606Z     #11 0x5579926b3d5d in PyRun_StringFlags /home/builder/ktietz/cos6/ci_cos6/python_1622833237666/work/Python/pythonrun.c:949
2021-12-24T01:07:03.3796463Z     #12 0x5579926b3dbb in PyRun_SimpleStringFlags /home/builder/ktietz/cos6/ci_cos6/python_1622833237666/work/Python/pythonrun.c:445
2021-12-24T01:07:03.3797250Z     #13 0x5579926b4926 in run_command /home/builder/ktietz/cos6/ci_cos6/python_1622833237666/work/Modules/main.c:301
2021-12-24T01:07:03.3797931Z     #14 0x5579926b4926 in Py_Main /home/builder/ktietz/cos6/ci_cos6/python_1622833237666/work/Modules/main.c:749
2021-12-24T01:07:03.3798740Z     #15 0x5579925ee196 in main /home/builder/ktietz/cos6/ci_cos6/python_1622833237666/work/Programs/python.c:69
2021-12-24T01:07:03.4285303Z     #16 0x7fc38674583f in __libc_start_main /build/glibc-S7Ft5T/glibc-2.23/csu/../csu/libc-start.c:291
2021-12-24T01:07:03.4286103Z     #17 0x55799267e33d in _start (/opt/conda/bin/python3.6+0x1a733d)
2021-12-24T01:07:03.4286431Z 
2021-12-24T01:07:03.4287285Z SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /var/lib/jenkins/workspace/aten/src/ATen/Utils.cpp:20:3 in 
2021-12-24T01:07:03.4468317Z + retcode=1
2021-12-24T01:07:03.4469162Z + set -e
2021-12-24T01:07:03.4469466Z + return 1
2021-12-24T01:07:03.4472416Z + [[ linux-xenial-py3.6-clang7-asan-default == *-NO_AVX-* ]]
2021-12-24T01:07:03.4473638Z + [[ default == \n\o\g\p\u\_\N\O\_\A\V\X ]]
2021-12-24T01:07:03.4475096Z + [[ linux-xenial-py3.6-clang7-asan-default == *-NO_AVX2-* ]]
2021-12-24T01:07:03.4476293Z + [[ default == \n\o\g\p\u\_\N\O\_\A\V\X\2 ]]
2021-12-24T01:07:03.4477728Z + [[ linux-xenial-py3.6-clang7-asan-default == *-NO_AVX512-* ]]
2021-12-24T01:07:03.4478939Z + [[ default == \n\o\g\p\u\_\N\O\_\A\V\X\5\1\2 ]]
2021-12-24T01:07:03.4480325Z + [[ linux-xenial-py3.6-clang7-asan-default == *tbb* ]]

See GitHub Actions build periodic-linux-xenial-cuda11.1-py3.6-gcc7-debug / test (distributed, 1, 1, linux.8xlarge.nvidia.gpu) (5/18)

Step: "Unknown" (full log | diagnosis details | 🔁 rerun)

2021-12-24T01:39:56.9882648Z RuntimeError: Expe...e, but found at least two devices, cuda:0 and cpu!
2021-12-24T01:39:56.9827812Z frame #40: clone + 0x6d (0x7f47a4a1951d in /lib/x86_64-linux-gnu/libc.so.6)
2021-12-24T01:39:56.9828264Z 
2021-12-24T01:39:56.9828480Z 
2021-12-24T01:39:56.9829000Z On WorkerInfo(id=2, name=worker2):
2021-12-24T01:39:56.9855912Z RuntimeError('Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!\nException raised from compute_types at /var/lib/jenkins/workspace/aten/src/ATen/TensorIterator.cpp:476 (most recent call first):\nframe #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x6b (0x7f715cf005bb in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)\nframe #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0xce (0x7f715cefc20e in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)\nframe #2: at::TensorIteratorBase::compute_types(at::TensorIteratorConfig const&) + 0x9a6 (0x7f716fca5da6 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)\nframe #3: at::TensorIteratorBase::build(at::TensorIteratorConfig&) + 0x7f (0x7f716fca811f in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)\nframe #4: at::TensorIteratorBase::build_borrowing_binary_op(at::TensorBase const&, at::TensorBase const&, at::TensorBase const&) + 0xf8 (0x7f716fca9c18 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)\nframe #5: at::meta::structured_add_Tensor::meta(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x2f (0x7f716fe67d8f in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)\nframe #6: <unknown function> + 0xb23fd6 (0x7f715dc5efd6 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #7: <unknown function> + 0xb240f6 (0x7f715dc5f0f6 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #8: at::_ops::add_Tensor::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0xd6 (0x7f7170644606 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)\nframe #9: <unknown function> + 0x283e125 (0x7f71716fa125 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)\nframe #10: <unknown function> + 0x283e8b9 (0x7f71716fa8b9 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)\nframe #11: at::_ops::add_Tensor::call(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x167 (0x7f7170675527 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)\nframe #12: <unknown function> + 0x297df1 (0x7f717ce9cdf1 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so)\nframe #13: <unknown function> + 0x2980d6 (0x7f717ce9d0d6 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so)\nframe #14: PyCFunction_Call + 0xd6 (0x56010634f666 in /opt/conda/bin/python)\nframe #15: PyObject_Call + 0x3e (0x56010632a9fe in /opt/conda/bin/python)\nframe #16: <unknown function> + 0x162a56 (0x56010635aa56 in /opt/conda/bin/python)\nframe #17: <unknown function> + 0x1c5f98 (0x5601063bdf98 in /opt/conda/bin/python)\nframe #18: PyNumber_Add + 0x3e (0x56010633a85e in /opt/conda/bin/python)\nframe #19: _PyEval_EvalFrameDefault + 0x1090 (0x560106399080 in /opt/conda/bin/python)\nframe #20: <unknown function> + 0x130160 (0x560106328160 in /opt/conda/bin/python)\nframe #21: <unknown function> + 0x172a4b (0x56010636aa4b in /opt/conda/bin/python)\nframe #22: PyObject_Call + 0x3e (0x56010632a9fe in /opt/conda/bin/python)\nframe #23: _PyEval_EvalFrameDefault + 0x4b87 (0x56010639cb77 in /opt/conda/bin/python)\nframe #24: <unknown function> + 0x130160 (0x560106328160 in 
/opt/conda/bin/python)\nframe #25: <unknown function> + 0x17296b (0x56010636a96b in /opt/conda/bin/python)\nframe #26: PyObject_Call + 0x3e (0x56010632a9fe in /opt/conda/bin/python)\nframe #27: <unknown function> + 0x8c8a04 (0x7f717d4cda04 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so)\nframe #28: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7f717d4cc3ad in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so)\nframe #29: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector<c10::Stream, std::allocator<c10::Stream> >, bool) const + 0x83 (0x7f717d4cede3 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so)\nframe #30: torch::distributed::rpc::RequestCallbackImpl::processPythonCall(torch::distributed::rpc::RpcCommandBase&, std::vector<c10::Stream, std::allocator<c10::Stream> >) const + 0x96 (0x7f717d4d3016 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so)\nframe #31: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector<c10::Stream, std::allocator<c10::Stream> >) const + 0x10c (0x7f717258e58c in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)\nframe #32: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed
2021-12-24T01:39:56.9877123Z Traceback (most recent call last):
2021-12-24T01:39:56.9878795Z   File "/opt/conda/lib/python3.6/site-packages/torch/distributed/rpc/internal.py", line 204, in _run_function
2021-12-24T01:39:56.9879779Z     result = python_udf.func(*python_udf.args, **python_udf.kwargs)
2021-12-24T01:39:56.9881007Z   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/distributed/rpc/rpc_test.py", line 5956, in _gpu_add_wrong_gpus
2021-12-24T01:39:56.9881871Z     return x.cpu() + y.cuda()
2021-12-24T01:39:56.9882648Z RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
2021-12-24T01:39:56.9883846Z Exception raised from compute_types at /var/lib/jenkins/workspace/aten/src/ATen/TensorIterator.cpp:476 (most recent call first):
2021-12-24T01:39:56.9885856Z frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x6b (0x7f715cf005bb in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)
2021-12-24T01:39:56.9888074Z frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0xce (0x7f715cefc20e in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)
2021-12-24T01:39:56.9890210Z frame #2: at::TensorIteratorBase::compute_types(at::TensorIteratorConfig const&) + 0x9a6 (0x7f716fca5da6 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
2021-12-24T01:39:56.9892352Z frame #3: at::TensorIteratorBase::build(at::TensorIteratorConfig&) + 0x7f (0x7f716fca811f in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
2021-12-24T01:39:56.9894479Z frame #4: at::TensorIteratorBase::build_borrowing_binary_op(at::TensorBase const&, at::TensorBase const&, at::TensorBase const&) + 0xf8 (0x7f716fca9c18 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
2021-12-24T01:39:56.9896540Z frame #5: at::meta::structured_add_Tensor::meta(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x2f (0x7f716fe67d8f in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
2021-12-24T01:39:56.9898104Z frame #6: <unknown function> + 0xb23fd6 (0x7f715dc5efd6 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cuda_cu.so)
2021-12-24T01:39:56.9899462Z frame #7: <unknown function> + 0xb240f6 (0x7f715dc5f0f6 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cuda_cu.so)
2021-12-24T01:39:56.9901233Z frame #8: at::_ops::add_Tensor::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0xd6 (0x7f7170644606 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)

See GitHub Actions build win-vs2019-cuda11.3-py3 / test (force_on_cpu, 1, 1, windows.4xlarge) (6/18)

Step: "Unknown" (full log | diagnosis details | 🔁 rerun)

2021-12-24T01:32:14.1259641Z ls: cannot access ...d/win_tmp/ci_scripts/*': No such file or directory
2021-12-24T01:32:14.0199846Z + PYTORCH_FINAL_PACKAGE_DIR=/c/1617581010/build-results/
2021-12-24T01:32:14.0260614Z ++ cygpath -w /c/1617581010/build-results/
2021-12-24T01:32:14.0356452Z + PYTORCH_FINAL_PACKAGE_DIR_WIN='C:\1617581010\build-results\'
2021-12-24T01:32:14.0356995Z + export PYTORCH_FINAL_PACKAGE_DIR_WIN
2021-12-24T01:32:14.0357434Z + export PYTORCH_TEST_SKIP_NOARCH=1
2021-12-24T01:32:14.0357813Z + PYTORCH_TEST_SKIP_NOARCH=1
2021-12-24T01:32:14.0358332Z + mkdir -p /c/actions-runner/_work/pytorch/pytorch/build/win_tmp/build/torch
2021-12-24T01:32:14.0738469Z + CI_SCRIPTS_DIR=/c/actions-runner/_work/pytorch/pytorch/build/win_tmp/ci_scripts
2021-12-24T01:32:14.0739199Z + mkdir -p /c/actions-runner/_work/pytorch/pytorch/build/win_tmp/ci_scripts
2021-12-24T01:32:14.0928877Z ++ ls '/c/actions-runner/_work/pytorch/pytorch/build/win_tmp/ci_scripts/*'
2021-12-24T01:32:14.1259641Z ls: cannot access '/c/actions-runner/_work/pytorch/pytorch/build/win_tmp/ci_scripts/*': No such file or directory
2021-12-24T01:32:14.1262946Z + '[' -n '' ']'
2021-12-24T01:32:14.1263610Z + export SCRIPT_HELPERS_DIR=/c/actions-runner/_work/pytorch/pytorch/.jenkins/pytorch/win-test-helpers
2021-12-24T01:32:14.1265263Z + SCRIPT_HELPERS_DIR=/c/actions-runner/_work/pytorch/pytorch/.jenkins/pytorch/win-test-helpers
2021-12-24T01:32:14.1265948Z + [[ win-vs2019-cuda11.3-py3 == *cuda11* ]]
2021-12-24T01:32:14.1266371Z + export BUILD_SPLIT_CUDA=ON
2021-12-24T01:32:14.1266726Z + BUILD_SPLIT_CUDA=ON
2021-12-24T01:32:14.1267087Z + [[ force_on_cpu = \f\o\r\c\e\_\o\n\_\c\p\u ]]
2021-12-24T01:32:14.1267423Z + export USE_CUDA=0
2021-12-24T01:32:14.1267721Z + USE_CUDA=0
2021-12-24T01:32:14.1267999Z + run_tests

See GitHub Actions build linux-xenial-cuda11.3-py3.6-gcc7 / test (default, 2, 2, linux.4xlarge.nvidia.gpu) (7/18)

Step: "Unknown" (full log | diagnosis details | 🔁 rerun)

2021-12-24T01:46:20.9775753Z RuntimeError: CUDA error: device-side assert triggered
2021-12-24T01:46:18.1735586Z   File "/opt/conda/lib/python3.6/site-packages/torch/cuda/__init__.py", line 495, in synchronize
2021-12-24T01:46:18.1736788Z     return torch._C._cuda_synchronize()
2021-12-24T01:46:18.1738001Z RuntimeError: CUDA error: device-side assert triggered
2021-12-24T01:46:18.1739072Z CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
2021-12-24T01:46:18.1740073Z For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
2021-12-24T01:46:20.9764161Z /var/lib/jenkins/workspace/aten/src/ATen/native/cuda/TensorCompare.cu:148: _assert_async_cuda_kernel: block: [0,0,0], thread: [0,0,0] Assertion `input[0] != c10::complex<float>(0, 0)` failed.
2021-12-24T01:46:20.9770354Z Traceback (most recent call last):
2021-12-24T01:46:20.9771387Z   File "<string>", line 4, in <module>
2021-12-24T01:46:20.9773239Z   File "/opt/conda/lib/python3.6/site-packages/torch/cuda/__init__.py", line 495, in synchronize
2021-12-24T01:46:20.9774823Z     return torch._C._cuda_synchronize()
2021-12-24T01:46:20.9775753Z RuntimeError: CUDA error: device-side assert triggered
2021-12-24T01:46:20.9776803Z CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
2021-12-24T01:46:20.9777823Z For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
2021-12-24T01:46:21.1544559Z ok (11.262s)
2021-12-24T01:46:21.1618170Z   test_gather_bool (__main__.TestCuda) ... ok (0.007s)
2021-12-24T01:46:21.1661989Z   test_get_device_index (__main__.TestCuda) ... ok (0.004s)
2021-12-24T01:46:21.1672565Z   test_get_set_rng_state_all (__main__.TestCuda) ... skip (0.001s)
2021-12-24T01:46:21.1919960Z   test_grad_scaling_accumulation (__main__.TestCuda) ... ok (0.025s)
2021-12-24T01:46:21.2406436Z   test_grad_scaling_autocast (__main__.TestCuda) ... ok (0.049s)
2021-12-24T01:46:21.2697941Z   test_grad_scaling_clipping (__main__.TestCuda) ... ok (0.029s)
2021-12-24T01:46:21.2983196Z   test_grad_scaling_clipping_separate_unscale (__main__.TestCuda) ... ok (0.028s)

See GitHub Actions build periodic-win-vs2019-cuda11.5-py3 / test (default, 1, 2, windows.8xlarge.nvidia.gpu) (8/18)

Step: "Unknown" (full log | diagnosis details | 🔁 rerun)

2021-12-24T01:41:33.3706152Z ls: cannot access ...d/win_tmp/ci_scripts/*': No such file or directory
2021-12-24T01:41:33.2319267Z + PYTORCH_FINAL_PACKAGE_DIR=/c/1617581006/build-results/
2021-12-24T01:41:33.2400657Z ++ cygpath -w /c/1617581006/build-results/
2021-12-24T01:41:33.2537216Z + PYTORCH_FINAL_PACKAGE_DIR_WIN='C:\1617581006\build-results\'
2021-12-24T01:41:33.2538465Z + export PYTORCH_FINAL_PACKAGE_DIR_WIN
2021-12-24T01:41:33.2539499Z + export PYTORCH_TEST_SKIP_NOARCH=1
2021-12-24T01:41:33.2540438Z + PYTORCH_TEST_SKIP_NOARCH=1
2021-12-24T01:41:33.2541678Z + mkdir -p /c/actions-runner/_work/pytorch/pytorch/build/win_tmp/build/torch
2021-12-24T01:41:33.3011640Z + CI_SCRIPTS_DIR=/c/actions-runner/_work/pytorch/pytorch/build/win_tmp/ci_scripts
2021-12-24T01:41:33.3012569Z + mkdir -p /c/actions-runner/_work/pytorch/pytorch/build/win_tmp/ci_scripts
2021-12-24T01:41:33.3267286Z ++ ls '/c/actions-runner/_work/pytorch/pytorch/build/win_tmp/ci_scripts/*'
2021-12-24T01:41:33.3706152Z ls: cannot access '/c/actions-runner/_work/pytorch/pytorch/build/win_tmp/ci_scripts/*': No such file or directory
2021-12-24T01:41:33.3710594Z + '[' -n '' ']'
2021-12-24T01:41:33.3712069Z + export SCRIPT_HELPERS_DIR=/c/actions-runner/_work/pytorch/pytorch/.jenkins/pytorch/win-test-helpers
2021-12-24T01:41:33.3715214Z + SCRIPT_HELPERS_DIR=/c/actions-runner/_work/pytorch/pytorch/.jenkins/pytorch/win-test-helpers
2021-12-24T01:41:33.3717058Z + [[ periodic-win-vs2019-cuda11.5-py3 == *cuda11* ]]
2021-12-24T01:41:33.3718590Z + export BUILD_SPLIT_CUDA=ON
2021-12-24T01:41:33.3719427Z + BUILD_SPLIT_CUDA=ON
2021-12-24T01:41:33.3720261Z + [[ default = \f\o\r\c\e\_\o\n\_\c\p\u ]]
2021-12-24T01:41:33.3721131Z + [[ default == \s\m\o\k\e\_\t\e\s\t\s ]]
2021-12-24T01:41:33.3721926Z + run_tests
2021-12-24T01:41:33.3723283Z + for path in '/c/Program Files/NVIDIA Corporation/NVSMI/nvidia-smi.exe' /c/Windows/System32/nvidia-smi.exe

See GitHub Actions build linux-xenial-cuda11.3-py3.6-gcc7 / test (distributed, 1, 1, linux.8xlarge.nvidia.gpu) (9/18)

Step: "Unknown" (full log | diagnosis details | 🔁 rerun)

2021-12-24T01:38:14.3540486Z RuntimeError: Expe...e, but found at least two devices, cuda:0 and cpu!
2021-12-24T01:38:14.3485704Z frame #40: clone + 0x6d (0x7fb2c309651d in /lib/x86_64-linux-gnu/libc.so.6)
2021-12-24T01:38:14.3486175Z 
2021-12-24T01:38:14.3486416Z 
2021-12-24T01:38:14.3486927Z On WorkerInfo(id=2, name=worker2):
2021-12-24T01:38:14.3514034Z RuntimeError('Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!\nException raised from compute_types at /var/lib/jenkins/workspace/aten/src/ATen/TensorIterator.cpp:476 (most recent call first):\nframe #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x6b (0x7f166a6cc5bb in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)\nframe #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0xce (0x7f166a6c820e in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)\nframe #2: at::TensorIteratorBase::compute_types(at::TensorIteratorConfig const&) + 0x9a6 (0x7f167a281826 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)\nframe #3: at::TensorIteratorBase::build(at::TensorIteratorConfig&) + 0x7f (0x7f167a283b9f in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)\nframe #4: at::TensorIteratorBase::build_borrowing_binary_op(at::TensorBase const&, at::TensorBase const&, at::TensorBase const&) + 0xf8 (0x7f167a285698 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)\nframe #5: at::meta::structured_add_Tensor::meta(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x2f (0x7f167a44380f in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)\nframe #6: <unknown function> + 0xb10de6 (0x7f166b417de6 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #7: <unknown function> + 0xb10f06 (0x7f166b417f06 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cuda_cu.so)\nframe #8: at::_ops::add_Tensor::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0xd6 (0x7f167ac20086 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)\nframe #9: <unknown function> + 0x283bba5 (0x7f167bcd5ba5 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)\nframe #10: <unknown function> + 0x283c339 (0x7f167bcd6339 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)\nframe #11: at::_ops::add_Tensor::call(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x167 (0x7f167ac50fa7 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)\nframe #12: <unknown function> + 0x297df1 (0x7f1686e71df1 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so)\nframe #13: <unknown function> + 0x2980d6 (0x7f1686e720d6 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so)\nframe #14: PyCFunction_Call + 0xd6 (0x55f459608666 in /opt/conda/bin/python)\nframe #15: PyObject_Call + 0x3e (0x55f4595e39fe in /opt/conda/bin/python)\nframe #16: <unknown function> + 0x162a56 (0x55f459613a56 in /opt/conda/bin/python)\nframe #17: <unknown function> + 0x1c5f98 (0x55f459676f98 in /opt/conda/bin/python)\nframe #18: PyNumber_Add + 0x3e (0x55f4595f385e in /opt/conda/bin/python)\nframe #19: _PyEval_EvalFrameDefault + 0x1090 (0x55f459652080 in /opt/conda/bin/python)\nframe #20: <unknown function> + 0x130160 (0x55f4595e1160 in /opt/conda/bin/python)\nframe #21: <unknown function> + 0x172a4b (0x55f459623a4b in /opt/conda/bin/python)\nframe #22: PyObject_Call + 0x3e (0x55f4595e39fe in /opt/conda/bin/python)\nframe #23: _PyEval_EvalFrameDefault + 0x4b87 (0x55f459655b77 in /opt/conda/bin/python)\nframe #24: <unknown function> + 0x130160 (0x55f4595e1160 in 
/opt/conda/bin/python)\nframe #25: <unknown function> + 0x17296b (0x55f45962396b in /opt/conda/bin/python)\nframe #26: PyObject_Call + 0x3e (0x55f4595e39fe in /opt/conda/bin/python)\nframe #27: <unknown function> + 0x8c8a04 (0x7f16874a2a04 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so)\nframe #28: torch::distributed::rpc::PythonRpcHandler::runPythonUdf(pybind11::object const&) + 0x7d (0x7f16874a13ad in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so)\nframe #29: torch::distributed::rpc::RequestCallbackImpl::runPythonFunction(pybind11::object const&, std::vector<c10::Stream, std::allocator<c10::Stream> >, bool) const + 0x83 (0x7f16874a3de3 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so)\nframe #30: torch::distributed::rpc::RequestCallbackImpl::processPythonCall(torch::distributed::rpc::RpcCommandBase&, std::vector<c10::Stream, std::allocator<c10::Stream> >) const + 0x96 (0x7f16874a8016 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so)\nframe #31: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector<c10::Stream, std::allocator<c10::Stream> >) const + 0x10c (0x7f167cb6a00c in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)\nframe #32: torch::distributed::rpc::RequestCallbackImpl::processRpcWithErrors(torch::distributed
2021-12-24T01:38:14.3535476Z Traceback (most recent call last):
2021-12-24T01:38:14.3536589Z   File "/opt/conda/lib/python3.6/site-packages/torch/distributed/rpc/internal.py", line 204, in _run_function
2021-12-24T01:38:14.3537575Z     result = python_udf.func(*python_udf.args, **python_udf.kwargs)
2021-12-24T01:38:14.3538830Z   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/distributed/rpc/rpc_test.py", line 5956, in _gpu_add_wrong_gpus
2021-12-24T01:38:14.3539692Z     return x.cpu() + y.cuda()
2021-12-24T01:38:14.3540486Z RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
2021-12-24T01:38:14.3541714Z Exception raised from compute_types at /var/lib/jenkins/workspace/aten/src/ATen/TensorIterator.cpp:476 (most recent call first):
2021-12-24T01:38:14.3543636Z frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x6b (0x7f166a6cc5bb in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)
2021-12-24T01:38:14.3545836Z frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0xce (0x7f166a6c820e in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)
2021-12-24T01:38:14.3547984Z frame #2: at::TensorIteratorBase::compute_types(at::TensorIteratorConfig const&) + 0x9a6 (0x7f167a281826 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
2021-12-24T01:38:14.3549969Z frame #3: at::TensorIteratorBase::build(at::TensorIteratorConfig&) + 0x7f (0x7f167a283b9f in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
2021-12-24T01:38:14.3552193Z frame #4: at::TensorIteratorBase::build_borrowing_binary_op(at::TensorBase const&, at::TensorBase const&, at::TensorBase const&) + 0xf8 (0x7f167a285698 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
2021-12-24T01:38:14.3554892Z frame #5: at::meta::structured_add_Tensor::meta(at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0x2f (0x7f167a44380f in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
2021-12-24T01:38:14.3556524Z frame #6: <unknown function> + 0xb10de6 (0x7f166b417de6 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cuda_cu.so)
2021-12-24T01:38:14.3557878Z frame #7: <unknown function> + 0xb10f06 (0x7f166b417f06 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cuda_cu.so)
2021-12-24T01:38:14.3559699Z frame #8: at::_ops::add_Tensor::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, c10::Scalar const&) + 0xd6 (0x7f167ac20086 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)

See GitHub Actions build win-vs2019-cpu-py3 / test (default, 1, 2, windows.4xlarge) (10/18)

Step: "Unknown" (full log | diagnosis details | 🔁 rerun)

2021-12-24T01:40:59.9081811Z test_add_done_ca...arg() takes 0 positional arguments but 1 was given
2021-12-24T01:40:59.9060182Z   C:\Jenkins\Miniconda3\lib\unittest\suite.py(122): run
2021-12-24T01:40:59.9060705Z   C:\Jenkins\Miniconda3\lib\unittest\suite.py(84): __call__
2021-12-24T01:40:59.9061284Z   C:\Jenkins\Miniconda3\lib\site-packages\xmlrunner\runner.py(66): run
2021-12-24T01:40:59.9061887Z   C:\Jenkins\Miniconda3\lib\unittest\main.py(271): runTests
2021-12-24T01:40:59.9062416Z   C:\Jenkins\Miniconda3\lib\unittest\main.py(101): __init__
2021-12-24T01:40:59.9063180Z   C:\actions-runner\_work\pytorch\pytorch\build\win_tmp\build\torch\testing\_internal\common_utils.py(660): run_tests
2021-12-24T01:40:59.9063807Z   test_futures.py(331): <module>
2021-12-24T01:40:59.9064050Z 
2021-12-24T01:40:59.9064287Z ok (0.002s)
2021-12-24T01:40:59.9072029Z   test_add_done_callback_maintains_callback_order (__main__.TestFuture) ... ok (0.002s)
2021-12-24T01:40:59.9081811Z   test_add_done_callback_no_arg_error_is_ignored (__main__.TestFuture) ... [E pybind_utils.h:201] Got the following error when running the callback: TypeError: no_arg() takes 0 positional arguments but 1 was given
2021-12-24T01:40:59.9082548Z ok (0.001s)
2021-12-24T01:40:59.9096598Z   test_add_done_callback_simple (__main__.TestFuture) ... ok (0.001s)
2021-12-24T01:40:59.9133871Z   test_chained_then (__main__.TestFuture) ... ok (0.004s)
2021-12-24T01:41:00.0226793Z   test_collect_all (__main__.TestFuture) ... ok (0.109s)
2021-12-24T01:41:00.0237905Z   test_done (__main__.TestFuture) ... ok (0.000s)
2021-12-24T01:41:00.0254825Z   test_done_exception (__main__.TestFuture) ... ok (0.000s)
2021-12-24T01:41:00.0275319Z   test_interleaving_then_and_add_done_callback_maintains_callback_order (__main__.TestFuture) ... ok (0.000s)
2021-12-24T01:41:00.0289059Z   test_interleaving_then_and_add_done_callback_propagates_error (__main__.TestFuture) ... [E pybind_utils.h:201] Got the following error when running the callback: ValueError: Expected error
2021-12-24T01:41:00.0289823Z 
2021-12-24T01:41:00.0296468Z At:

See GitHub Actions build periodic-win-vs2019-cuda11.5-py3 / test (force_on_cpu, 1, 1, windows.4xlarge) (11/18)

Step: "Unknown" (full log | diagnosis details | 🔁 rerun)

2021-12-24T01:37:18.7110693Z ls: cannot access ...d/win_tmp/ci_scripts/*': No such file or directory
2021-12-24T01:37:18.6088779Z + PYTORCH_FINAL_PACKAGE_DIR=/c/1617581006/build-results/
2021-12-24T01:37:18.6153201Z ++ cygpath -w /c/1617581006/build-results/
2021-12-24T01:37:18.6251991Z + PYTORCH_FINAL_PACKAGE_DIR_WIN='C:\1617581006\build-results\'
2021-12-24T01:37:18.6252546Z + export PYTORCH_FINAL_PACKAGE_DIR_WIN
2021-12-24T01:37:18.6252959Z + export PYTORCH_TEST_SKIP_NOARCH=1
2021-12-24T01:37:18.6253343Z + PYTORCH_TEST_SKIP_NOARCH=1
2021-12-24T01:37:18.6253848Z + mkdir -p /c/actions-runner/_work/pytorch/pytorch/build/win_tmp/build/torch
2021-12-24T01:37:18.6612975Z + CI_SCRIPTS_DIR=/c/actions-runner/_work/pytorch/pytorch/build/win_tmp/ci_scripts
2021-12-24T01:37:18.6613742Z + mkdir -p /c/actions-runner/_work/pytorch/pytorch/build/win_tmp/ci_scripts
2021-12-24T01:37:18.6808732Z ++ ls '/c/actions-runner/_work/pytorch/pytorch/build/win_tmp/ci_scripts/*'
2021-12-24T01:37:18.7110693Z ls: cannot access '/c/actions-runner/_work/pytorch/pytorch/build/win_tmp/ci_scripts/*': No such file or directory
2021-12-24T01:37:18.7113354Z + '[' -n '' ']'
2021-12-24T01:37:18.7114378Z + export SCRIPT_HELPERS_DIR=/c/actions-runner/_work/pytorch/pytorch/.jenkins/pytorch/win-test-helpers
2021-12-24T01:37:18.7116106Z + SCRIPT_HELPERS_DIR=/c/actions-runner/_work/pytorch/pytorch/.jenkins/pytorch/win-test-helpers
2021-12-24T01:37:18.7117151Z + [[ periodic-win-vs2019-cuda11.5-py3 == *cuda11* ]]
2021-12-24T01:37:18.7117696Z + export BUILD_SPLIT_CUDA=ON
2021-12-24T01:37:18.7118026Z + BUILD_SPLIT_CUDA=ON
2021-12-24T01:37:18.7118374Z + [[ force_on_cpu = \f\o\r\c\e\_\o\n\_\c\p\u ]]
2021-12-24T01:37:18.7118699Z + export USE_CUDA=0
2021-12-24T01:37:18.7118985Z + USE_CUDA=0
2021-12-24T01:37:18.7119276Z + run_tests

See GitHub Actions build pytorch-linux-xenial-py3-clang5-android-ndk-r19c-build / build-and-test (12/18)

Step: "Unknown" (full log | diagnosis details | 🔁 rerun)

2021-12-24T00:49:17.5956606Z  echo "ERR...t available for the merge-base of your branch"
2021-12-24T00:49:17.5950753Z fi
2021-12-24T00:49:17.5951201Z # Covers the case where a previous tag doesn't exist for the tree
2021-12-24T00:49:17.5951895Z # this is only really applicable on trees that don't have `.circleci/docker` at its merge base, i.e. nightly
2021-12-24T00:49:17.5952574Z if ! git rev-parse "$MERGE_BASE:.circleci/docker"; then
2021-12-24T00:49:17.5953292Z   echo "Directory '.circleci/docker' not found in commit $MERGE_BASE, you should probably rebase onto a more recent commit"
2021-12-24T00:49:17.5953949Z   exit 1
2021-12-24T00:49:17.5954228Z fi
2021-12-24T00:49:17.5954666Z PREVIOUS_DOCKER_TAG=$(git rev-parse "$MERGE_BASE:.circleci/docker")
2021-12-24T00:49:17.5955342Z # If no image exists but the hash is the same as the previous hash then we should error out here
2021-12-24T00:49:17.5955949Z if [[ "${PREVIOUS_DOCKER_TAG}" = "${DOCKER_TAG}" ]]; then
2021-12-24T00:49:17.5956606Z   echo "ERROR: Something has gone wrong and the previous image isn't available for the merge-base of your branch"
2021-12-24T00:49:17.5957334Z   echo "       contact the PyTorch team to restore the original images"
2021-12-24T00:49:17.5957769Z   exit 1
2021-12-24T00:49:17.5958049Z fi
2021-12-24T00:49:17.5958411Z echo ::set-output name=rebuild::yes
2021-12-24T00:49:17.5968323Z shell: /usr/bin/bash -e {0}
2021-12-24T00:49:17.5968638Z env:
2021-12-24T00:49:17.5969648Z   BUILD_ENVIRONMENT: pytorch-linux-xenial-py3-clang5-android-ndk-r19c-build
2021-12-24T00:49:17.5971209Z   DOCKER_IMAGE_BASE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-py3-clang5-android-ndk-r19c
2021-12-24T00:49:17.5972422Z   SCCACHE_BUCKET: ossci-compiler-cache-circleci-v2
2021-12-24T00:49:17.5973316Z   XLA_CLANG_CACHE_S3_BUCKET_NAME: ossci-compiler-clang-cache-circleci-xla

See GitHub Actions build linux-bionic-cuda10.2-py3.9-gcc7 / test (nogpu_NO_AVX, 1, 1, linux.2xlarge) (13/18)

Step: "Unknown" (full log | diagnosis details | 🔁 rerun)

2021-12-24T01:58:26.9452679Z test_add_done_ca...arg() takes 0 positional arguments but 1 was given
2021-12-24T01:58:26.9443469Z   /opt/conda/lib/python3.9/unittest/suite.py(122): run
2021-12-24T01:58:26.9443979Z   /opt/conda/lib/python3.9/unittest/suite.py(84): __call__
2021-12-24T01:58:26.9444653Z   /opt/conda/lib/python3.9/site-packages/xmlrunner/runner.py(66): run
2021-12-24T01:58:26.9445222Z   /opt/conda/lib/python3.9/unittest/main.py(271): runTests
2021-12-24T01:58:26.9445732Z   /opt/conda/lib/python3.9/unittest/main.py(101): __init__
2021-12-24T01:58:26.9446466Z   /opt/conda/lib/python3.9/site-packages/torch/testing/_internal/common_utils.py(660): run_tests
2021-12-24T01:58:26.9447613Z   /var/lib/jenkins/workspace/test/test_futures.py(331): <module>
2021-12-24T01:58:26.9448011Z 
2021-12-24T01:58:26.9448245Z ok (0.001s)
2021-12-24T01:58:26.9448718Z   test_add_done_callback_maintains_callback_order (__main__.TestFuture) ... ok (0.001s)
2021-12-24T01:58:26.9452679Z   test_add_done_callback_no_arg_error_is_ignored (__main__.TestFuture) ... [E pybind_utils.h:201] Got the following error when running the callback: TypeError: no_arg() takes 0 positional arguments but 1 was given
2021-12-24T01:58:26.9453517Z ok (0.001s)
2021-12-24T01:58:26.9461779Z   test_add_done_callback_simple (__main__.TestFuture) ... ok (0.001s)
2021-12-24T01:58:26.9485620Z   test_chained_then (__main__.TestFuture) ... ok (0.002s)
2021-12-24T01:58:27.0500872Z   test_collect_all (__main__.TestFuture) ... ok (0.101s)
2021-12-24T01:58:27.0506946Z   test_done (__main__.TestFuture) ... ok (0.001s)
2021-12-24T01:58:27.0517480Z   test_done_exception (__main__.TestFuture) ... ok (0.001s)
2021-12-24T01:58:27.0530580Z   test_interleaving_then_and_add_done_callback_maintains_callback_order (__main__.TestFuture) ... ok (0.001s)
2021-12-24T01:58:27.0538558Z   test_interleaving_then_and_add_done_callback_propagates_error (__main__.TestFuture) ... [E pybind_utils.h:201] Got the following error when running the callback: ValueError: Expected error
2021-12-24T01:58:27.0539245Z 
2021-12-24T01:58:27.0539664Z At:

See GitHub Actions build periodic-linux-bionic-cuda11.5-py3.6-gcc7 / test (distributed, 1, 1, linux.8xlarge.nvidia.gpu) (14/18)

Step: "Unknown" (full log | diagnosis details | 🔁 rerun)

2021-12-24T01:59:09.1321938Z RuntimeError: hello
2021-12-24T01:59:09.1312790Z -- Process 0 terminated with the following error:
2021-12-24T01:59:09.1313433Z Traceback (most recent call last):
2021-12-24T01:59:09.1314421Z   File "/opt/conda/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
2021-12-24T01:59:09.1315156Z     fn(i, *args)
2021-12-24T01:59:09.1316196Z   File "/opt/conda/lib/python3.6/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 382, in _wrap
2021-12-24T01:59:09.1317457Z     ret = record(fn)(*args_)
2021-12-24T01:59:09.1318613Z   File "/opt/conda/lib/python3.6/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 345, in wrapper
2021-12-24T01:59:09.1319748Z     return f(*args, **kwargs)
2021-12-24T01:59:09.1320604Z   File "/var/lib/jenkins/workspace/test/distributed/elastic/multiprocessing/api_test.py", line 137, in echo2
2021-12-24T01:59:09.1321448Z     raise RuntimeError(msg)
2021-12-24T01:59:09.1321938Z RuntimeError: hello
2021-12-24T01:59:09.1322257Z 
2021-12-24T01:59:09.1322603Z ok (1.222s)
2021-12-24T01:59:09.1348490Z   test_function_with_tensor (__main__.StartProcessesTest) ... ok (0.003s)
2021-12-24T01:59:09.1365797Z   test_invalid_log_dir (__main__.StartProcessesTest) ... ok (0.002s)
2021-12-24T01:59:09.1424178Z   test_multiprocess_context_close (__main__.StartProcessesTest) ... Closing process 10963 via signal SIGTERM
2021-12-24T01:59:09.1439475Z ok (0.007s)
2021-12-24T01:59:09.1476475Z   test_multiprocessing_context_poll_raises_exception (__main__.StartProcessesTest) ... failed (exitcode: -1) local_rank: 0 (pid: 123) of fn: echo0 (start_method: spawn)
2021-12-24T01:59:09.1478087Z Traceback (most recent call last):
2021-12-24T01:59:09.1479269Z   File "/opt/conda/lib/python3.6/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 466, in _poll
2021-12-24T01:59:09.1480236Z     self._pc.join(-1)

See GitHub Actions build linux-bionic-cuda10.2-py3.9-gcc7 / test (distributed, 1, 1, linux.8xlarge.nvidia.gpu) (15/18)

Step: "Unknown" (full log | diagnosis details | 🔁 rerun)

2021-12-24T01:57:26.8406677Z RuntimeError: hello
2021-12-24T01:57:26.8397046Z -- Process 0 terminated with the following error:
2021-12-24T01:57:26.8397681Z Traceback (most recent call last):
2021-12-24T01:57:26.8398688Z   File "/opt/conda/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
2021-12-24T01:57:26.8399431Z     fn(i, *args)
2021-12-24T01:57:26.8401328Z   File "/opt/conda/lib/python3.9/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 382, in _wrap
2021-12-24T01:57:26.8402231Z     ret = record(fn)(*args_)
2021-12-24T01:57:26.8403402Z   File "/opt/conda/lib/python3.9/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 345, in wrapper
2021-12-24T01:57:26.8404288Z     return f(*args, **kwargs)
2021-12-24T01:57:26.8405104Z   File "/var/lib/jenkins/workspace/test/distributed/elastic/multiprocessing/api_test.py", line 137, in echo2
2021-12-24T01:57:26.8406157Z     raise RuntimeError(msg)
2021-12-24T01:57:26.8406677Z RuntimeError: hello
2021-12-24T01:57:26.8407223Z 
2021-12-24T01:57:26.8407593Z ok (1.121s)
2021-12-24T01:57:26.8432646Z   test_function_with_tensor (__main__.StartProcessesTest) ... ok (0.002s)
2021-12-24T01:57:26.8448954Z   test_invalid_log_dir (__main__.StartProcessesTest) ... ok (0.002s)
2021-12-24T01:57:26.8499528Z   test_multiprocess_context_close (__main__.StartProcessesTest) ... Closing process 9931 via signal SIGTERM
2021-12-24T01:57:26.8512972Z ok (0.006s)
2021-12-24T01:57:26.8552550Z   test_multiprocessing_context_poll_raises_exception (__main__.StartProcessesTest) ... failed (exitcode: -1) local_rank: 0 (pid: 123) of fn: echo0 (start_method: spawn)
2021-12-24T01:57:26.8553663Z Traceback (most recent call last):
2021-12-24T01:57:26.8554836Z   File "/opt/conda/lib/python3.9/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 466, in _poll
2021-12-24T01:57:26.8555795Z     self._pc.join(-1)

See GitHub Actions build periodic-win-vs2019-cuda11.1-py3 / test (default, 2, 2, windows.8xlarge.nvidia.gpu) (16/18)

Step: "Unknown" (full log | diagnosis details | 🔁 rerun)

2021-12-24T02:01:23.0495674Z RuntimeError: KeyboardInterrupt:
2021-12-24T02:01:23.0488531Z   File "C:\Jenkins\Miniconda3\lib\runpy.py", line 265, in run_path
2021-12-24T02:01:23.0489230Z     return _run_module_code(code, init_globals, run_name,
2021-12-24T02:01:23.0489971Z   File "C:\Jenkins\Miniconda3\lib\runpy.py", line 97, in _run_module_code
2021-12-24T02:01:23.0490663Z     _run_code(code, mod_globals, init_globals,
2021-12-24T02:01:23.0491349Z   File "C:\Jenkins\Miniconda3\lib\runpy.py", line 87, in _run_code
2021-12-24T02:01:23.0491929Z     exec(code, run_globals)
2021-12-24T02:01:23.0492797Z   File "C:\actions-runner\_work\pytorch\pytorch\test\distributed\test_distributed_spawn.py", line 6, in <module>
2021-12-24T02:01:23.0493589Z     import torch
2021-12-24T02:01:23.0494329Z   File "C:\actions-runner\_work\pytorch\pytorch\build\win_tmp\build\torch\__init__.py", line 199, in <module>
2021-12-24T02:01:23.0495095Z     from torch._C import *  # noqa: F403
2021-12-24T02:01:23.0495674Z RuntimeError: KeyboardInterrupt: 
2021-12-24T02:01:23.2713847Z Test exited with non-zero exitcode 3221225786. Command to reproduce: BACKEND=gloo WORLD_SIZE=3 C:\Jenkins\Miniconda3\python.exe distributed/test_distributed_spawn.py --import-disabled-tests --import-slow-tests -v TestDistBackendWithSpawn.test_sparse_all_reduce_sum
2021-12-24T02:01:25.0839847Z 
2021-12-24T02:01:25.0840533Z Running tests...
2021-12-24T02:01:25.0841096Z ----------------------------------------------------------------------
2021-12-24T02:01:25.0842000Z Test results will be stored in test-reports\dist-gloo\distributed.test_distributed_spawn
2021-12-24T02:01:27.4136303Z   test_sparse_all_reduce_sum_cuda (__main__.TestDistBackendWithSpawn) ... skip (2.343s)
2021-12-24T02:01:27.4215973Z 
2021-12-24T02:01:27.4217078Z ----------------------------------------------------------------------
2021-12-24T02:01:27.4217761Z Ran 1 test in 2.343s
2021-12-24T02:01:27.4218021Z 

See GitHub Actions build win-vs2019-cuda11.3-py3 / test (default, 1, 2, windows.8xlarge.nvidia.gpu) (17/18)

Step: "Unknown" (full log | diagnosis details | 🔁 rerun)

2021-12-24T02:01:21.5514251Z RuntimeError: test_dataloader failed!
2021-12-24T02:01:21.2956767Z   File "C:\Jenkins\Miniconda3\lib\multiprocessing\process.py", line 149, in join
2021-12-24T02:01:21.2957511Z     res = self._popen.wait(timeout)
2021-12-24T02:01:21.2958324Z   File "C:\Jenkins\Miniconda3\lib\multiprocessing\popen_spawn_win32.py", line 108, in wait
2021-12-24T02:01:21.2959264Z     res = _winapi.WaitForSingleObject(int(self._handle), msecs)
2021-12-24T02:01:21.2959980Z KeyboardInterrupt
2021-12-24T02:01:21.5511085Z Traceback (most recent call last):
2021-12-24T02:01:21.5512160Z   File "run_test.py", line 1097, in <module>
2021-12-24T02:01:21.5512618Z     main()
2021-12-24T02:01:21.5513093Z   File "run_test.py", line 1075, in main
2021-12-24T02:01:21.5513669Z     raise RuntimeError(err_message)
2021-12-24T02:01:21.5514251Z RuntimeError: test_dataloader failed!
2021-12-24T02:01:21.8861294Z Terminate batch job (Y/N)? 
2021-12-24T02:01:21.8863058Z 
2021-12-24T02:01:21.8863922Z (base) C:\actions-runner\_work\pytorch\pytorch\test>if ERRORLEVEL 1 exit /b 1 
2021-12-24T02:01:21.8900104Z + cleanup
2021-12-24T02:01:21.8900744Z + retcode=1
2021-12-24T02:01:21.8901163Z + set +x
2021-12-24T02:01:21.9774570Z ##[error]The operation was canceled.
2021-12-24T02:01:22.0362839Z ##[group]Run # -ir => recursive include all files in pattern
2021-12-24T02:01:22.0363711Z # -ir => recursive include all files in pattern
2021-12-24T02:01:22.0364388Z 7z a "test-jsons-$Env:FILE_SUFFIX.zip" -ir'!test\*.json'

See GitHub Actions build linux-bionic-py3.6-clang9 / test (xla, 1, 1, linux.2xlarge) (18/18)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2021-12-24T01:07:53.9362373Z unlink: cannot unlink '/usr/bin/bazel': No such file or directory
2021-12-24T01:07:53.9158737Z /usr/local/bin/bazelisk -> /usr/local/lib/node_modules/@bazel/bazelisk/bazelisk.js
2021-12-24T01:07:53.9161996Z /usr/local/bin/bazel -> /usr/local/lib/node_modules/@bazel/bazelisk/bazelisk.js
2021-12-24T01:07:53.9240100Z /usr/local/lib
2021-12-24T01:07:53.9240784Z └── @bazel/bazelisk@1.11.0
2021-12-24T01:07:53.9241084Z 
2021-12-24T01:07:53.9307988Z + sudo unlink /usr/bin/bazel
2021-12-24T01:07:53.9362373Z unlink: cannot unlink '/usr/bin/bazel': No such file or directory
2021-12-24T01:07:53.9367401Z + cleanup
2021-12-24T01:07:53.9367757Z + retcode=1
2021-12-24T01:07:53.9368100Z + set +x
2021-12-24T01:07:53.9406125Z ##[error]Process completed with exit code 1.
2021-12-24T01:07:53.9449995Z ##[group]Run # Ensure the working directory gets chowned back to the current user
2021-12-24T01:07:53.9450757Z # Ensure the working directory gets chowned back to the current user
2021-12-24T01:07:53.9451394Z docker run --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
2021-12-24T01:07:53.9469750Z shell: /usr/bin/bash -e {0}
2021-12-24T01:07:53.9470085Z env:
2021-12-24T01:07:53.9470558Z   BUILD_ENVIRONMENT: linux-bionic-py3.6-clang9

9 failures not recognized by patterns:

Job Step Action
GitHub Actions periodic-linux-bionic-cuda11.5-py3.6-gcc7 / test (default, 2, 2, linux.4xlarge.nvidia.gpu) Unknown 🔁 rerun
GitHub Actions macos-11-py3-x86-64 / build Unknown 🔁 rerun
GitHub Actions linux-bionic-cuda10.2-py3.9-gcc7 / test (slow, 1, 1, linux.4xlarge.nvidia.gpu) Unknown 🔁 rerun
GitHub Actions macos-10-15-py3-arm64 / build Unknown 🔁 rerun
GitHub Actions periodic-linux-bionic-cuda11.5-py3.6-gcc7 / test (default, 1, 2, linux.4xlarge.nvidia.gpu) Unknown 🔁 rerun
GitHub Actions linux-bionic-cuda10.2-py3.9-gcc7 / test (multigpu, 1, 1, linux.16xlarge.nvidia.gpu) Unknown 🔁 rerun
GitHub Actions linux-bionic-cuda10.2-py3.9-gcc7 / test (default, 2, 2, linux.4xlarge.nvidia.gpu) Unknown 🔁 rerun
GitHub Actions linux-bionic-cuda10.2-py3.9-gcc7 / test (default, 1, 2, linux.4xlarge.nvidia.gpu) Unknown 🔁 rerun
GitHub Actions periodic-linux-xenial-cuda10.2-py3-gcc7-slow-gradcheck / test (default, 1, 2, linux.4xlarge.nvidia.gpu) Unknown 🔁 rerun

❄️ 1 failure tentatively classified as flaky

but reruns have not yet been triggered to confirm:

See GitHub Actions build periodic-linux-xenial-cuda10.2-py3-gcc7-slow-gradcheck / test (default, 2, 2, linux.4xlarge.nvidia.gpu) (1/1)

Step: "Unknown" (full log | diagnosis details | 🔁 rerun) ❄️

2021-12-24T01:22:49.0857291Z ConnectionResetError: [Errno 104] Connection reset by peer
2021-12-24T01:22:49.0848104Z   File "/opt/conda/lib/python3.6/multiprocessing/connection.py", line 455, in accept
2021-12-24T01:22:49.0849099Z     deliver_challenge(c, self._authkey)
2021-12-24T01:22:49.0850086Z   File "/opt/conda/lib/python3.6/multiprocessing/connection.py", line 722, in deliver_challenge
2021-12-24T01:22:49.0851292Z     response = connection.recv_bytes(256)        # reject large message
2021-12-24T01:22:49.0852209Z   File "/opt/conda/lib/python3.6/multiprocessing/connection.py", line 216, in recv_bytes
2021-12-24T01:22:49.0853206Z     buf = self._recv_bytes(maxlength)
2021-12-24T01:22:49.0853993Z   File "/opt/conda/lib/python3.6/multiprocessing/connection.py", line 407, in _recv_bytes
2021-12-24T01:22:49.0854915Z     buf = self._recv(4)
2021-12-24T01:22:49.0855620Z   File "/opt/conda/lib/python3.6/multiprocessing/connection.py", line 379, in _recv
2021-12-24T01:22:49.0856576Z     chunk = read(handle, remaining)
2021-12-24T01:22:49.0857291Z ConnectionResetError: [Errno 104] Connection reset by peer
2021-12-24T01:22:54.1845269Z /opt/conda/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 14 leaked semaphores to clean up at shutdown
2021-12-24T01:22:54.1846341Z   len(cache))
2021-12-24T01:22:57.8440000Z Process ErrorTrackingProcess-88:
2021-12-24T01:22:57.8440816Z Traceback (most recent call last):
2021-12-24T01:22:57.8441584Z   File "/opt/conda/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
2021-12-24T01:22:57.8442608Z     self.run()
2021-12-24T01:22:57.8443697Z   File "/var/lib/jenkins/workspace/test/test_dataloader.py", line 406, in run
2021-12-24T01:22:57.8444514Z     super(ErrorTrackingProcess, self).run()
2021-12-24T01:22:57.8445347Z   File "/opt/conda/lib/python3.6/multiprocessing/process.py", line 93, in run
2021-12-24T01:22:57.8446076Z     self._target(*self._args, **self._kwargs)

🚧 5 fixed upstream failures:

These were probably caused by upstream breakages that were already fixed.

Please rebase on the viable/strict branch (expand for instructions)

If your commit is older than viable/strict, run these commands:

git fetch https://github.com/pytorch/pytorch viable/strict
git rebase FETCH_HEAD

This comment was automatically generated by Dr. CI (expand for details).

Please report bugs/suggestions to the (internal) Dr. CI Users group.

@ngimel ngimel added the triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module label Nov 18, 2021
Copy link
Contributor

@neerajprad neerajprad left a comment

Thanks for contributing this really useful distribution!

@facebook-github-bot
Copy link
Contributor

@neerajprad has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Copy link
Contributor

@neerajprad neerajprad left a comment

I didn't notice that you merged your df changes into this PR; I just had some small comments regarding that. We should also add a float df value to the unit tests.

torch/distributions/wishart.py Show resolved Hide resolved
torch/distributions/wishart.py Outdated Show resolved Hide resolved
@nonconvexopt
Copy link
Contributor Author

I didn't notice that you merged your df changes into this PR; I just had some small comments regarding that. We should also add a float df value to the unit tests.

Thank you for your feedback. I will add a unit test for a float df value.
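
As a follow-up to the review comment above, here is a minimal sketch of what a float-df unit test could look like (a sketch only, assuming the torch.distributions.Wishart API as merged; the dimension 3, the value 4.5, and the sample count are illustrative and not taken from the actual test suite):

import torch
from torch.distributions import Wishart

# Non-integer degrees of freedom; the Wishart only requires df > p - 1 for p x p matrices.
df = torch.tensor(4.5)
scale = torch.eye(3)  # 3 x 3 identity used as the covariance (scale) matrix

dist = Wishart(df=df, covariance_matrix=scale)

samples = dist.sample((10,))  # ten 3 x 3 positive definite draws
assert samples.shape == (10, 3, 3)
assert torch.isfinite(dist.log_prob(samples)).all()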

@facebook-github-bot
Copy link
Contributor

@neerajprad has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

This pull request has been reverted by 6217fee. To re-land this change, follow these steps.

@nonconvexopt nonconvexopt restored the wishart_distribution branch December 22, 2021 04:27
@neerajprad
Copy link
Contributor

@nonconvexopt - Could you reopen the original PR? This was reverted by mistake, for an unrelated reason.

@nonconvexopt
Copy link
Contributor Author

nonconvexopt commented Dec 23, 2021

@nonconvexopt - Could you reopen the original PR? This was reverted by mistake, for an unrelated reason.

Sure, I will reopen it.

*EDIT:
I think I am not allowed to reopen the PR since I opened another PR from this branch, #70290. I am looking for a solution. Sorry for the delay.

@nonconvexopt
Copy link
Contributor Author

@neerajprad May I open a new PR with the same branch?
I am afraid I cannot find a way to reopen this PR; it was my mistake to open a new PR from this branch.
Neither the GitHub web interface nor the GitHub CLI works.

@neerajprad
Copy link
Contributor

I am afraid I cannot find a way to reopen this PR; it was my mistake to open a new PR from this branch.
Neither the GitHub web interface nor the GitHub CLI works.

That's fine, we can open a new one instead.

@nonconvexopt
Copy link
Contributor Author

That's fine, we can open a new one instead.

Thanks for letting me know. I will open a new one.

facebook-github-bot pushed a commit that referenced this pull request Dec 30, 2021
Summary:
Implement #68050
Reopened the previously merged and reverted PR #68588; worked on with neerajprad
cc neerajprad

Sorry for the confusion.

TODO:

- [x] Unit Test
- [x] Documentation
- [x] Change the constraint of matrix variables to 'torch.distributions.constraints.symmetric' once it is reviewed and merged (Debug positive definite constraints #68720)

Pull Request resolved: #70377

Reviewed By: mikaylagawarecki

Differential Revision: D33355132

Pulled By: neerajprad

fbshipit-source-id: e968c0d9a3061fb2855564b96074235e46a57b6c
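
For readers following the constraint item in the TODO above, the sketch below shows how the relevant matrix constraints can be checked (a sketch under the assumption that torch.distributions.constraints.positive_definite is available; constraints.symmetric is the addition proposed in #68720 and may be absent in builds that predate it):

import torch
from torch.distributions import constraints

# Build a symmetric positive definite matrix to exercise the constraints.
A = torch.randn(3, 3)
spd = A @ A.T + 3 * torch.eye(3)

print(constraints.positive_definite.check(spd))   # tensor(True)
print(constraints.positive_definite.check(-spd))  # tensor(False)

# The symmetric constraint from #68720, guarded because older builds may lack it.
if hasattr(constraints, "symmetric"):
    print(constraints.symmetric.check(spd))       # tensor(True)
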
wconstab pushed a commit that referenced this pull request Jan 5, 2022
Summary:
Implement #68050
Reopened the previously merged and reverted PR #68588; worked on with neerajprad
cc neerajprad

Sorry for the confusion.

TODO:

- [x] Unit Test
- [x] Documentation
- [x] Change the constraint of matrix variables to 'torch.distributions.constraints.symmetric' once it is reviewed and merged (Debug positive definite constraints #68720)

Pull Request resolved: #70377

Reviewed By: mikaylagawarecki

Differential Revision: D33355132

Pulled By: neerajprad

fbshipit-source-id: e968c0d9a3061fb2855564b96074235e46a57b6c
@facebook-github-bot
Copy link
Contributor

This pull request has been reverted by 6217fee. To re-land this change, follow these steps.

Labels
cla signed open source Reverted triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Distributions for Symmetric Matrices.
6 participants