Several tests and the mkldnn Windows builds keep failing in the v1.9.x branch. Some example failures:
[2021-10-07T01:27:03.835Z] ======================================================================
[2021-10-07T01:27:03.835Z] ERROR: test_gluon.test_hybrid_static_memory_switching
[2021-10-07T01:27:03.835Z] ----------------------------------------------------------------------
[2021-10-07T01:27:03.835Z] Traceback (most recent call last):
[2021-10-07T01:27:03.835Z] File "/usr/local/lib/python3.7/dist-packages/nose/case.py", line 198, in runTest
[2021-10-07T01:27:03.835Z] self.test(*self.arg)
[2021-10-07T01:27:03.835Z] File "/work/mxnet/tests/python/unittest/common.py", line 218, in test_new
[2021-10-07T01:27:03.835Z] orig_test(*args, **kwargs)
[2021-10-07T01:27:03.835Z] File "/work/mxnet/tests/python/unittest/test_gluon.py", line 1760, in test_hybrid_static_memory_switching
[2021-10-07T01:27:03.835Z] check_hybrid_static_memory_switching(static_alloc=True)
[2021-10-07T01:27:03.835Z] File "/work/mxnet/tests/python/unittest/test_gluon.py", line 1755, in check_hybrid_static_memory_switching
[2021-10-07T01:27:03.835Z] mx.nd.waitall()
[2021-10-07T01:27:03.835Z] File "/work/mxnet/python/mxnet/ndarray/ndarray.py", line 211, in waitall
[2021-10-07T01:27:03.835Z] check_call(_LIB.MXNDArrayWaitAll())
[2021-10-07T01:27:03.835Z] File "/work/mxnet/python/mxnet/base.py", line 246, in check_call
[2021-10-07T01:27:03.835Z] raise get_last_ffi_error()
[2021-10-07T01:27:03.835Z] mxnet.base.MXNetError: Traceback (most recent call last):
[2021-10-07T01:27:03.835Z] [bt] (9) /work/mxnet/python/mxnet/../../lib/libmxnet.so(std::_Function_handler<void (std::shared_ptr<dmlc::ManualEvent>), mxnet::engine::ThreadedEnginePerDevice::PushToExecute(mxnet::engine::OprBlock*, bool)::{lambda()#1}::operator()() const::{lambda(std::shared_ptr<dmlc::ManualEvent>)#1}>::_M_invoke(std::_Any_data const&, std::shared_ptr<dmlc::ManualEvent>&&)+0x147) [0x7f9183955ee7]
[2021-10-07T01:27:03.835Z] [bt] (8) /work/mxnet/python/mxnet/../../lib/libmxnet.so(mxnet::engine::ThreadedEngine::ExecuteOprBlock(mxnet::RunContext, mxnet::engine::OprBlock*)+0x2d8) [0x7f9183942738]
[2021-10-07T01:27:03.835Z] [bt] (7) /work/mxnet/python/mxnet/../../lib/libmxnet.so(std::_Function_handler<void (mxnet::RunContext, mxnet::engine::CallbackOnComplete), mxnet::engine::ThreadedEngine::BulkFlush()::{lambda(mxnet::RunContext, mxnet::engine::CallbackOnComplete)#1}>::_M_invoke(std::_Any_data const&, mxnet::RunContext&&, mxnet::engine::CallbackOnComplete&&)+0x1c6) [0x7f9183940056]
[2021-10-07T01:27:03.835Z] [bt] (6) /work/mxnet/python/mxnet/../../lib/libmxnet.so(std::_Function_handler<void (mxnet::RunContext), mxnet::imperative::PushFComputeEx(std::function<void (nnvm::NodeAttrs const&, mxnet::OpContext const&, std::vector<mxnet::NDArray, std::allocator<mxnet::NDArray> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::NDArray, std::allocator<mxnet::NDArray> > const&)> const&, nnvm::Op const*, nnvm::NodeAttrs const&, mxnet::Context const&, std::vector<mxnet::engine::Var*, std::allocator<mxnet::engine::Var*> > const&, std::vector<mxnet::engine::Var*, std::allocator<mxnet::engine::Var*> > const&, std::vector<mxnet::Resource, std::allocator<mxnet::Resource> > const&, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&)::{lambda(mxnet::RunContext)#1}>::_M_invoke(std::_Any_data const&, mxnet::RunContext&&)+0x17) [0x7f91838680d7]
[2021-10-07T01:27:03.835Z] [bt] (5) /work/mxnet/python/mxnet/../../lib/libmxnet.so(mxnet::imperative::PushFComputeEx(std::function<void (nnvm::NodeAttrs const&, mxnet::OpContext const&, std::vector<mxnet::NDArray, std::allocator<mxnet::NDArray> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::NDArray, std::allocator<mxnet::NDArray> > const&)> const&, nnvm::Op const*, nnvm::NodeAttrs const&, mxnet::Context const&, std::vector<mxnet::engine::Var*, std::allocator<mxnet::engine::Var*> > const&, std::vector<mxnet::engine::Var*, std::allocator<mxnet::engine::Var*> > const&, std::vector<mxnet::Resource, std::allocator<mxnet::Resource> > const&, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&)::{lambda(mxnet::RunContext)#1}::operator()(mxnet::RunContext) const+0x293) [0x7f9183867f43]
[2021-10-07T01:27:03.835Z] [bt] (4) /work/mxnet/python/mxnet/../../lib/libmxnet.so(+0x5658020) [0x7f91835bb020]
[2021-10-07T01:27:03.835Z] [bt] (3) /work/mxnet/python/mxnet/../../lib/libmxnet.so(mxnet::MKLDNNRun(std::function<void (nnvm::NodeAttrs const&, mxnet::OpContext const&, std::vector<mxnet::NDArray, std::allocator<mxnet::NDArray> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::NDArray, std::allocator<mxnet::NDArray> > const&)>, nnvm::NodeAttrs const&, mxnet::OpContext const&, std::vector<mxnet::NDArray, std::allocator<mxnet::NDArray> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::NDArray, std::allocator<mxnet::NDArray> > const&)+0x264) [0x7f917f3eacf4]
[2021-10-07T01:27:03.835Z] [bt] (2) /work/mxnet/python/mxnet/../../lib/libmxnet.so(mxnet::op::MKLDNNConvolutionForward(nnvm::NodeAttrs const&, mxnet::OpContext const&, std::vector<mxnet::NDArray, std::allocator<mxnet::NDArray> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::NDArray, std::allocator<mxnet::NDArray> > const&)+0x4f0) [0x7f917f3d6280]
[2021-10-07T01:27:03.835Z] [bt] (1) /work/mxnet/python/mxnet/../../lib/libmxnet.so(mxnet::op::MKLDNNConvolutionForwardFullFeature(mxnet::op::MKLDNNConvFullParam const&, mxnet::OpContext const&, mxnet::op::MKLDNNConvForward*, std::vector<mxnet::NDArray, std::allocator<mxnet::NDArray> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::NDArray, std::allocator<mxnet::NDArray> > const&)+0x580) [0x7f917f3d5540]
[2021-10-07T01:27:03.835Z] [bt] (0) /work/mxnet/python/mxnet/../../lib/libmxnet.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x72) [0x7f917ea45852]
[2021-10-07T01:27:03.835Z] File "src/operator/nn/mkldnn/mkldnn_convolution.cc", line 434
[2021-10-07T01:27:03.835Z] MXNetError: Check failed: weight_mem->get_desc() == fwd->GetPd().weights_desc():
[2021-10-07T01:27:03.835Z] -------------------- >> begin captured logging << --------------------
[2021-10-07T01:27:03.835Z] common: WARNING: Error seen with seeded test, use MXNET_TEST_SEED=1188622132 to reproduce.
[2021-10-07T01:27:03.835Z] --------------------- >> end captured logging << ---------------------
[2021-10-07T02:39:35.601Z] [----------] 2 tests from MKLDNN_UTIL_FUNC
[2021-10-07T02:39:35.601Z] [ RUN ] MKLDNN_UTIL_FUNC.AlignMem
[2021-10-07T02:39:35.601Z] [ OK ] MKLDNN_UTIL_FUNC.AlignMem (1 ms)
[2021-10-07T02:39:35.601Z] [ RUN ] MKLDNN_UTIL_FUNC.MemFormat
[2021-10-07T02:39:35.601Z] unknown file: Failure
[2021-10-07T02:39:35.601Z] C++ exception with description "[02:39:59] /work/mxnet/tests/cpp/operator/mkldnn_test.cc:103: Check failed: (dnnl_format_tag_last) == (222)
[2021-10-07T02:39:35.601Z]
[2021-10-07T02:39:35.601Z] " thrown in the test body.
[2021-10-07T02:39:35.601Z] [ FAILED ] MKLDNN_UTIL_FUNC.MemFormat (0 ms)
[2021-10-07T02:39:35.601Z] [----------] 2 tests from MKLDNN_UTIL_FUNC (1 ms total)
[2021-10-07T02:32:31.459Z] [ RUN ] ThreadSafety.CachedOpFullModel
[2021-10-07T02:32:31.459Z] [02:32:53] src/nnvm/legacy_json_util.cc:208: Loading symbol saved by previous version v0.8.0. Attempting to upgrade...
[2021-10-07T02:32:31.459Z] [02:32:53] src/nnvm/legacy_json_util.cc:216: Symbol successfully upgraded!
[2021-10-07T02:32:34.725Z] terminate called after throwing an instance of 'dmlc::Error'
[2021-10-07T02:32:34.725Z] what(): [02:32:57] tests/cpp/thread_safety/thread_safety_test.cc:314: MXNetError: Check failed: weight_mem->get_desc() == fwd->GetPd().weights_desc():
[2021-10-07T02:32:34.725Z] Stack trace:
[2021-10-07T02:32:34.725Z] File "src/operator/nn/mkldnn/mkldnn_convolution.cc", line 434
[2021-10-07T02:32:34.725Z]
[2021-10-07T02:32:34.725Z]
[2021-10-07T02:32:34.725Z]
[2021-10-07T02:32:34.725Z] /work/runtime_functions.sh: line 1306: 1730 Aborted (core dumped) build/tests/cpp/mxnet_unit_tests --gtest_filter="ThreadSafety.*"
[2021-10-07T01:49:05.521Z] [748/749] Linking CXX shared library libmxnet.dll
[2021-10-07T01:49:05.521Z] FAILED: libmxnet.dll libmxnet.lib
[2021-10-07T01:49:05.521Z] cmd.exe /C "cd . && "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\Common7\IDE\CommonExtensions\Microsoft\CMake\CMake\bin\cmake.exe" -E vs_link_dll --intdir=CMakeFiles\mxnet.dir --rc=C:\PROGRA~2\WI3CF2~1\10\bin\100162~1.0\x64\rc.exe --mt=C:\PROGRA~2\WI3CF2~1\10\bin\100162~1.0\x64\mt.exe --manifests -- C:\PROGRA~2\MICROS~1\2019\COMMUN~1\VC\Tools\MSVC\1428~1.293\bin\Hostx64\x64\link.exe /nologo @CMakeFiles\mxnet.rsp /out:libmxnet.dll /implib:libmxnet.lib /pdb:libmxnet.pdb /dll /version:0.0 /machine:x64 /INCREMENTAL:NO /OPT:REF /OPT:ICF && cmd.exe /C "cd /D C:\jenkins_slave\workspace\build-cpu-mkldnn\build && "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\Common7\IDE\CommonExtensions\Microsoft\CMake\CMake\bin\cmake.exe" -E copy C:/jenkins_slave/workspace/build-cpu-mkldnn/build/3rdparty/mkldnn/include/oneapi/dnnl/dnnl_config.h C:/jenkins_slave/workspace/build-cpu-mkldnn/include/mkldnn/oneapi/dnnl/ && "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\Common7\IDE\CommonExtensions\Microsoft\CMake\CMake\bin\cmake.exe" -E copy C:/jenkins_slave/workspace/build-cpu-mkldnn/build/3rdparty/mkldnn/include/oneapi/dnnl/dnnl_version.h C:/jenkins_slave/workspace/build-cpu-mkldnn/include/mkldnn/oneapi/dnnl/""
[2021-10-07T01:49:05.521Z] LINK: command "C:\PROGRA~2\MICROS~1\2019\COMMUN~1\VC\Tools\MSVC\1428~1.293\bin\Hostx64\x64\link.exe /nologo @CMakeFiles\mxnet.rsp /out:libmxnet.dll /implib:libmxnet.lib /pdb:libmxnet.pdb /dll /version:0.0 /machine:x64 /INCREMENTAL:NO /OPT:REF /OPT:ICF /MANIFEST /MANIFESTFILE:libmxnet.dll.manifest" failed (exit code 1120) with the following output:
[2021-10-07T01:49:05.521Z] Creating library libmxnet.lib and object libmxnet.exp
[2021-10-07T01:49:05.521Z] LINK : warning LNK4098: defaultlib 'MSVCRT' conflicts with use of other libs; use /NODEFAULTLIB:library
[2021-10-07T01:49:05.521Z] LINK : warning LNK4217: symbol '_wcsdup' defined in 'libucrt.lib(wcsdup.obj)' is imported by 'dnnl.lib(ittnotify_static.c.obj)' in function '__itt_domain_createW_init_3_0'
[2021-10-07T01:49:05.521Z] LINK : warning LNK4217: symbol 'strncpy_s' defined in 'libucrt.lib(strncpy_s.obj)' is imported by 'dnnl.lib(ittnotify_static.c.obj)' in function '__itt_get_groups'
[2021-10-07T01:49:05.521Z] LINK : warning LNK4217: symbol 'malloc' defined in 'libucrt.lib(malloc.obj)' is imported by 'dnnl.lib(ittnotify_static.c.obj)' in function '__itt_domain_createA_init_3_0'
[2021-10-07T01:49:05.521Z] dnnl.lib(ittnotify_static.c.obj) : error LNK2019: unresolved external symbol __imp__strdup referenced in function __itt_domain_createA_init_3_0
[2021-10-07T01:49:05.521Z] libmxnet.dll : fatal error LNK1120: 1 unresolved externals
[2021-10-07T01:49:05.521Z] ninja: build stopped: subcommand failed.
[2021-10-07T01:49:05.521Z] 2021-10-07 01:49:29,320 5 build(s) have failed
[2021-10-07T01:49:05.521Z] 2021-10-07 01:49:29,320 Build failed
Description
The following tests fail consistently in the v1.9.x branch:
Also, the mkldnn Windows builds keep failing in the v1.9.x branch. Some example failures:
See the error messages posted in the Test/Build Failure Log Output section.
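The captured logging for the test_gluon failure suggests rerunning with `MXNET_TEST_SEED=1188622132`. A minimal sketch of how that rerun could be scripted is below. It assumes an MKLDNN-enabled build of the v1.9.x branch, that `nosetests` is on the PATH, and that the script is run from the root of the MXNet checkout. The helper names `build_repro` and `run_repro` are hypothetical and only used for illustration.

```python
import os
import subprocess


def build_repro(seed, test_file, test_name, runner="nosetests"):
    """Return (argv, env) for rerunning a single seeded test.

    The seed is passed through MXNET_TEST_SEED, as the log message suggests.
    """
    env = dict(os.environ, MXNET_TEST_SEED=str(seed))
    # nose selects a single test function with the "file.py:function" syntax.
    argv = [runner, "--verbose", f"{test_file}:{test_name}"]
    return argv, env


def run_repro(argv, env):
    """Run the test; assumes the current directory is the MXNet source root."""
    return subprocess.run(argv, env=env, check=False).returncode


# Values taken from the traceback and captured logging in this report.
argv, env = build_repro(
    1188622132,
    "tests/python/unittest/test_gluon.py",
    "test_hybrid_static_memory_switching",
)
print("MXNET_TEST_SEED=" + env["MXNET_TEST_SEED"], " ".join(argv))
```

Calling `run_repro(argv, env)` would then execute the test. Whether the seed alone is enough to trigger the `weights_desc()` check failure outside CI has not been confirmed here.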
Occurrences
Test/Build Failure Log Output
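The log excerpts in this report mix three kinds of failures: nose `ERROR:` lines, gtest `FAILED` lines, and MSVC linker errors, plus the `Check failed:` messages from MXNet itself. As a rough triage aid, the sketch below pulls just those lines out of a console log. The regular expressions are assumptions derived from the excerpts in this report, not from any documented Jenkins, nose, or gtest log format, and the function name `summarize` is hypothetical.

```python
import re

# Leading Jenkins timestamp, e.g. "[2021-10-07T01:27:03.835Z] ".
TIMESTAMP = re.compile(r"^\[\d{4}-\d{2}-\d{2}T[\d:.]+Z\]\s*")

# One pattern per failure kind seen in this report (assumed formats).
PATTERNS = {
    "nose": re.compile(r"^(?:ERROR|FAIL): (\S+)"),
    "gtest": re.compile(r"^\[\s*FAILED\s*\]\s+(\S+)"),
    "linker": re.compile(r"(?:fatal )?error (LNK\d+): (.*)$"),
    "check": re.compile(r"Check failed: (.*)$"),
}


def summarize(log_text):
    """Return a list of (kind, detail) tuples for recognized failure lines."""
    found = []
    for raw in log_text.splitlines():
        line = TIMESTAMP.sub("", raw)
        for kind, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                found.append((kind, " ".join(match.groups())))
                break  # report each log line at most once
    return found


# Small sample built from lines in this report.
sample = "\n".join([
    "[2021-10-07T01:27:03.835Z] ERROR: test_gluon.test_hybrid_static_memory_switching",
    "[2021-10-07T02:39:35.601Z] [ FAILED ] MKLDNN_UTIL_FUNC.MemFormat (0 ms)",
    "[2021-10-07T01:49:05.521Z] libmxnet.dll : fatal error LNK1120: 1 unresolved externals",
    "[2021-10-07T01:49:05.521Z] ninja: build stopped: subcommand failed.",
])
for kind, detail in summarize(sample):
    print(kind, detail)
```

Lines that match none of the patterns, such as the ninja summary line in the sample, are skipped.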