When running the Python tests on the buildbot server, MPI initialization hangs inside OpenMPI when launching non-distributed executions. See: http://ktau.nic.uoregon.edu:8020/#/builders/7/builds/198. The following tests fail:
60 - tests.regressions.python.python.empty_list_509 (Timeout)
61 - tests.regressions.python.python.empty_list_510 (Timeout)
62 - tests.regressions.python.python.exception_swallowed_369 (Timeout)
63 - tests.regressions.python.python.for_map_516 (Timeout)
64 - tests.regressions.python.python.lambda_492 (Timeout)
65 - tests.regressions.python.python.list_iteration_524 (Timeout)
66 - tests.regressions.python.python.list_iter_space_429 (Timeout)
67 - tests.regressions.python.python.list_slice_assign_528 (Timeout)
70 - tests.regressions.python.python.np_sum_489 (Timeout)
71 - tests.regressions.python.python.passing_compiler_state_453 (Timeout)
72 - tests.regressions.python.python.reassign_512 (Timeout)
73 - tests.regressions.python.python.zero_dimensional_array_502 (Timeout)
208 - tests.unit.python.ast.generate_ast (Timeout)
209 - tests.unit.python.ast.node (Timeout)
210 - tests.unit.python.ast.python_builds_ast (Timeout)
211 - tests.unit.python.ast.traverse_ast (Failed)
212 - tests.unit.python.execution_tree.dictionary (Failed)
213 - tests.unit.python.execution_tree.config_hpx (Timeout)
214 - tests.unit.python.execution_tree.dynamic_init (Timeout)
215 - tests.unit.python.execution_tree.for (Timeout)
216 - tests.unit.python.execution_tree.eval (Timeout)
217 - tests.unit.python.execution_tree.lazy_eval (Timeout)
218 - tests.unit.python.execution_tree.make_array (Timeout)
219 - tests.unit.python.execution_tree.map_numpy (Timeout)
220 - tests.unit.python.execution_tree.map_numpy_constants (Timeout)
221 - tests.unit.python.execution_tree.multi_init (Timeout)
223 - tests.unit.python.execution_tree.parallel (Timeout)
224 - tests.unit.python.execution_tree.set_operation (Timeout)
225 - tests.unit.python.execution_tree.slice (Timeout)
226 - tests.unit.python.primitives.lambda (Timeout)
227 - tests.unit.python.primitives.make_list (Timeout)
228 - tests.unit.python.primitives.make_vector (Timeout)
229 - tests.unit.python.primitives.numpy_dtype (Timeout)
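Most entries in the list are timeouts, but two are reported as `Failed`, which may indicate a different failure mode from the MPI hang. As a rough aid for triage, the sketch below tallies the statuses from a pasted test summary. The `SUMMARY` excerpt, the `ENTRY` pattern, and the `tally` helper are illustrative names, not part of the test suite; the pattern assumes entries of the form `<number> - <name> (<status>)` as they appear above.

```python
import re
from collections import Counter

# Short excerpt of the failure list; paste the full summary here instead.
SUMMARY = """
60 - tests.regressions.python.python.empty_list_509 (Timeout)
211 - tests.unit.python.ast.traverse_ast (Failed)
212 - tests.unit.python.execution_tree.dictionary (Failed)
213 - tests.unit.python.execution_tree.config_hpx (Timeout)
"""

# Assumed entry shape: "<number> - <test name> (<status>)".
ENTRY = re.compile(r"(\d+) - (\S+) \((\w+)\)")


def tally(summary):
    """Return a Counter of statuses and the names of tests that did not time out."""
    entries = ENTRY.findall(summary)
    counts = Counter(status for _, _, status in entries)
    non_timeouts = [name for _, name, status in entries if status != "Timeout"]
    return counts, non_timeouts


counts, non_timeouts = tally(SUMMARY)
print(dict(counts))
print(non_timeouts)
```

Because the pattern does not depend on line breaks, the same helper works whether the summary is pasted one test per line or as a single run of text.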
An example backtrace (from test 60):
(gdb) bt
#0  0x00007f9469d6e6fd in read () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f94554aa246 in rte_init () from /packages/openmpi/2.0.4_gcc-6.4/lib/libopen-rte.so.20
#2  0x00007f9455468a35 in orte_init () from /packages/openmpi/2.0.4_gcc-6.4/lib/libopen-rte.so.20
#3  0x00007f9458a576a6 in ompi_mpi_init () from /packages/openmpi/2.0.4_gcc-6.4/lib/libmpi.so.20
#4  0x00007f9458a768f3 in PMPI_Init_thread () from /packages/openmpi/2.0.4_gcc-6.4/lib/libmpi.so.20
#5  0x00007f945b973468 in hpx::util::mpi_environment::init (argc=0x7ffda64d1bcc, argv=0x7ffda64d1bc0, cfg=...) at /var/lib/buildbot/slaves/phylanx/x86_64-gcc7-debug/build/tools/buildbot/src/hpx/plugins/parcelport/mpi/mpi_environment.cpp:133
#6  0x00007f945b976efb in hpx::traits::plugin_config_data<hpx::parcelset::policies::mpi::parcelport, void>::init (argc=0x7ffda64d1bcc, argv=0x7ffda64d1bc0, cfg=...) at /var/lib/buildbot/slaves/phylanx/x86_64-gcc7-debug/build/tools/buildbot/src/hpx/plugins/parcelport/mpi/parcelport_mpi.cpp:272
#7  0x00007f945b977bcf in hpx::plugins::parcelport_factory<hpx::parcelset::policies::mpi::parcelport>::init (this=0x7f945c506080 <parcelport_mpi_factory_init(std::vector<hpx::plugins::parcelport_factory_base*, std::allocator<hpx::plugins::parcelport_factory_base*> >&)::factory>, argc=0x7ffda64d1bcc, argv=0x7ffda64d1bc0, cfg=...) at /var/lib/buildbot/slaves/phylanx/x86_64-gcc7-debug/build/tools/buildbot/src/hpx/hpx/plugins/parcelport_factory.hpp:125
#8  0x00007f945b43659f in hpx::parcelset::parcelhandler::init (argc=0x7ffda64d1bcc, argv=0x7ffda64d1bc0, cfg=...) at /var/lib/buildbot/slaves/phylanx/x86_64-gcc7-debug/build/tools/buildbot/src/hpx/src/runtime/parcelset/parcelhandler.cpp:1559
#9  0x00007f945b773eae in hpx::util::command_line_handling::call (this=0x22ae000, desc_cmdline=..., argc=2, argv=0x7ffda64d4138) at /var/lib/buildbot/slaves/phylanx/x86_64-gcc7-debug/build/tools/buildbot/src/hpx/src/util/command_line_handling.cpp:1365
#10 0x00007f945b44cff5 in hpx::resource::detail::partitioner::parse(hpx::util::function<int (boost::program_options::variables_map&), false> const&, boost::program_options::options_description, int, char**, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, hpx::resource::partitioner_mode, hpx::runtime_mode, bool) (this=0x22ae000, f=..., desc_cmdline=..., argc=2, argv=0x7ffda64d4138, ini_config=..., rpmode=hpx::resource::mode_default, mode=hpx::runtime_mode_console, fill_internal_topology=true) at /var/lib/buildbot/slaves/phylanx/x86_64-gcc7-debug/build/tools/buildbot/src/hpx/src/runtime/resource/detail/detail_partitioner.cpp:916
#11 0x00007f945b457ff6 in hpx::resource::detail::create_partitioner(hpx::util::function<int (boost::program_options::variables_map&), false> const&, boost::program_options::options_description const&, int, char**, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, hpx::resource::partitioner_mode, hpx::runtime_mode, bool) (f=..., desc_cmdline=..., argc=2, argv=0x7ffda64d4138, ini_config=..., rpmode=hpx::resource::mode_default, mode=hpx::runtime_mode_console, check=false) at /var/lib/buildbot/slaves/phylanx/x86_64-gcc7-debug/build/tools/buildbot/src/hpx/src/runtime/resource/partitioner.cpp:236
#12 0x00007f945aedd97a in hpx::detail::run_or_start(hpx::util::function<int (boost::program_options::variables_map&), false> const&, boost::program_options::options_description const&, int, char**, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&&, hpx::util::unique_function<void (), false>, hpx::util::unique_function<void (), false>, hpx::runtime_mode, bool) (f=..., desc_cmdline=..., argc=2, argv=0x7ffda64d4138, ini_config=<unknown type in /var/lib/buildbot/slaves/phylanx/x86_64-gcc7-debug/build/tools/buildbot/build-delphi-x86_64-Linux-gcc/hpx-Debug/lib/libhpxd.so.1, CU 0xb91d5, DIE 0x21b9cc>, startup=..., shutdown=..., mode=hpx::runtime_mode_console, blocking=false) at /var/lib/buildbot/slaves/phylanx/x86_64-gcc7-debug/build/tools/buildbot/src/hpx/src/hpx_init.cpp:626
#13 0x00007f94603b6038 in hpx::start(hpx::util::function<int (boost::program_options::variables_map&), false> const&, boost::program_options::options_description const&, int, char**, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, hpx::util::unique_function<void (), false>, hpx::util::unique_function<void (), false>, hpx::runtime_mode) (f=..., desc_cmdline=..., argc=2, argv=0x7ffda64d4138, cfg=..., startup=..., shutdown=..., mode=hpx::runtime_mode_console) at /var/lib/buildbot/slaves/phylanx/x86_64-gcc7-debug/build/tools/buildbot/src/hpx/hpx/hpx_start_impl.hpp:77
#14 0x00007f94603b627a in hpx::start(hpx::util::function<int (int, char**), false> const&, int, char**, std::v
---Type <return> to continue, or q <return> to quit---
@khuck this should be fine now as the HPX PR was merged.
@hkaiser no - the build after that PR was merged (three hours ago?) failed the same way....