Closed
Description
Running the command
legate --launcher mpirun --ranks-per-node 2 --module pytest legateboost/test -svx
either hangs or fails with one of the errors below.
These runs included the fix from nv-legate/legate#778.
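Because the failure shows up either as a hang or as an abort/segfault, it can help to re-run the command under a timeout and record which outcome occurred. The following is only a minimal sketch, not part of legateboost or legate: the helper name `classify_run`, the constant `LEGATE_CMD`, and the choice of timeout are assumptions made for illustration. It relies only on the standard-library `subprocess` module.

```python
import subprocess
import sys

# The command from this report, split into arguments (not executed on import).
LEGATE_CMD = [
    "legate", "--launcher", "mpirun", "--ranks-per-node", "2",
    "--module", "pytest", "legateboost/test", "-svx",
]


def classify_run(cmd, timeout_s):
    """Run cmd and return (status, returncode).

    status is "hang" if the timeout expired, "crash" for a non-zero exit
    (a negative returncode means the process died from a signal, e.g. -11
    for SIGSEGV or -6 for SIGABRT on Linux), and "ok" otherwise.

    Caveat: on timeout only the direct child is killed; ranks spawned by a
    launcher such as mpirun may need to be cleaned up separately.
    """
    try:
        proc = subprocess.run(
            cmd, timeout=timeout_s, capture_output=True, text=True
        )
    except subprocess.TimeoutExpired:
        return ("hang", None)
    status = "ok" if proc.returncode == 0 else "crash"
    return (status, proc.returncode)
```

A possible use is to call `classify_run(LEGATE_CMD, timeout_s=600)` several times and tally the statuses, which gives a rough idea of how often the run hangs versus crashes; the 600-second limit is an arbitrary placeholder.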
legion_python: /home/nfs/rorym/legate.core/_skbuild/linux-x86_64-3.10/cmake-build/_deps/legion-src/runtime/realm/runtime_impl.cc:2539: Realm::GenEventImpl* Realm::RuntimeImpl::get_genevent_impl(Realm::Event): Assertion `id.is_event()' failed.
Fatal Python error: Aborted
or
Fatal Python error: Segmentation fault
Thread 0x00007fa6b45d8000 (most recent call first):
File "/home/nfs/rorym/legate.core/legate/core/_legion/future.py", line 157 in get_buffer
File "/home/nfs/rorym/cunumeric/cunumeric/deferred.py", line 382 in get_scalar_array
File "/home/nfs/rorym/cunumeric/cunumeric/deferred.py", line 291 in __numpy_array__
File "/home/nfs/rorym/cunumeric/cunumeric/array.py", line 839 in __array__
File "/home/nfs/rorym/cunumeric/cunumeric/coverage.py", line 119 in wrapper
File "/home/nfs/rorym/legate.core/legate/core/runtime.py", line 2086 in wrapper
File "/home/nfs/rorym/cunumeric/cunumeric/array.py", line 176 in maybe_convert_to_np_ndarray
File "/home/nfs/rorym/cunumeric/cunumeric/utils.py", line 230 in deep_apply
File "/home/nfs/rorym/cunumeric/cunumeric/utils.py", line 226 in <genexpr>
File "/home/nfs/rorym/cunumeric/cunumeric/utils.py", line 226 in deep_apply
File "/home/nfs/rorym/cunumeric/cunumeric/array.py", line 431 in __array_function__
File "/home/nfs/rorym/cunumeric/cunumeric/coverage.py", line 119 in wrapper
File "/home/nfs/rorym/legate.core/legate/core/runtime.py", line 2086 in wrapper
File "<__array_function__ internals>", line 180 in result_type
File "/home/nfs/rorym/cunumeric/cunumeric/_ufunc/ufunc.py", line 569 in _find_common_type
File "/home/nfs/rorym/cunumeric/cunumeric/_ufunc/ufunc.py", line 581 in _resolve_dtype
File "/home/nfs/rorym/cunumeric/cunumeric/_ufunc/ufunc.py", line 669 in __call__
File "/home/nfs/rorym/cunumeric/cunumeric/array.py", line 1256 in __itruediv__
File "/home/nfs/rorym/cunumeric/cunumeric/coverage.py", line 119 in wrapper
File "/home/nfs/rorym/legate.core/legate/core/runtime.py", line 2086 in wrapper
File "/home/nfs/rorym/cunumeric/cunumeric/array.py", line 3142 in mean
File "/home/nfs/rorym/cunumeric/cunumeric/array.py", line 142 in wrapper
File "/home/nfs/rorym/cunumeric/cunumeric/coverage.py", line 119 in wrapper
File "/home/nfs/rorym/legate.core/legate/core/runtime.py", line 2083 in wrapper
File "/home/nfs/rorym/LegateGBM/legateboost/metrics.py", line 19 in metric
File "/home/nfs/rorym/LegateGBM/legateboost/legateboost.py", line 321 in fit
File "/home/nfs/rorym/LegateGBM/legateboost/legateboost.py", line 389 in fit
File "/home/nfs/rorym/LegateGBM/legateboost/test/test_estimator.py", line 56 in test_regressor_improving_with_depth
File "/home/nfs/rorym/anaconda3/envs/legate-test/lib/python3.10/site-packages/_pytest/python.py", line 194 in pytest_pyfunc_call
File "/home/nfs/rorym/anaconda3/envs/legate-test/lib/python3.10/site-packages/pluggy/_callers.py", line 39 in _multicall
File "/home/nfs/rorym/anaconda3/envs/legate-test/lib/python3.10/site-packages/pluggy/_manager.py", line 80 in _hookexec
File "/home/nfs/rorym/anaconda3/envs/legate-test/lib/python3.10/site-packages/pluggy/_hooks.py", line 265 in __call__
File "/home/nfs/rorym/anaconda3/envs/legate-test/lib/python3.10/site-packages/_pytest/python.py", line 1799 in runtest
File "/home/nfs/rorym/anaconda3/envs/legate-test/lib/python3.10/site-packages/_pytest/runner.py", line 169 in pytest_runtest_call
File "/home/nfs/rorym/anaconda3/envs/legate-test/lib/python3.10/site-packages/pluggy/_callers.py", line 39 in _multicall
File "/home/nfs/rorym/anaconda3/envs/legate-test/lib/python3.10/site-packages/pluggy/_manager.py", line 80 in _hookexec
File "/home/nfs/rorym/anaconda3/envs/legate-test/lib/python3.10/site-packages/pluggy/_hooks.py", line 265 in __call__
File "/home/nfs/rorym/anaconda3/envs/legate-test/lib/python3.10/site-packages/_pytest/runner.py", line 262 in <lambda>
File "/home/nfs/rorym/anaconda3/envs/legate-test/lib/python3.10/site-packages/_pytest/runner.py", line 341 in from_call
File "/home/nfs/rorym/anaconda3/envs/legate-test/lib/python3.10/site-packages/_pytest/runner.py", line 261 in call_runtest_hook
File "/home/nfs/rorym/anaconda3/envs/legate-test/lib/python3.10/site-packages/_pytest/runner.py", line 222 in call_and_report
File "/home/nfs/rorym/anaconda3/envs/legate-test/lib/python3.10/site-packages/_pytest/runner.py", line 133 in runtestprotocol
File "/home/nfs/rorym/anaconda3/envs/legate-test/lib/python3.10/site-packages/_pytest/runner.py", line 114 in pytest_runtest_protocol
File "/home/nfs/rorym/anaconda3/envs/legate-test/lib/python3.10/site-packages/pluggy/_callers.py", line 39 in _multicall
File "/home/nfs/rorym/anaconda3/envs/legate-test/lib/python3.10/site-packages/pluggy/_manager.py", line 80 in _hookexec
File "/home/nfs/rorym/anaconda3/envs/legate-test/lib/python3.10/site-packages/pluggy/_hooks.py", line 265 in __call__
File "/home/nfs/rorym/anaconda3/envs/legate-test/lib/python3.10/site-packages/_pytest/main.py", line 348 in pytest_runtestloop
File "/home/nfs/rorym/anaconda3/envs/legate-test/lib/python3.10/site-packages/pluggy/_callers.py", line 39 in _multicall
File "/home/nfs/rorym/anaconda3/envs/legate-test/lib/python3.10/site-packages/pluggy/_manager.py", line 80 in _hookexec
File "/home/nfs/rorym/anaconda3/envs/legate-test/lib/python3.10/site-packages/pluggy/_hooks.py", line 265 in __call__
File "/home/nfs/rorym/anaconda3/envs/legate-test/lib/python3.10/site-packages/_pytest/main.py", line 323 in _main
File "/home/nfs/rorym/anaconda3/envs/legate-test/lib/python3.10/site-packages/_pytest/main.py", line 269 in wrap_session
File "/home/nfs/rorym/anaconda3/envs/legate-test/lib/python3.10/site-packages/_pytest/main.py", line 316 in pytest_cmdline_main
File "/home/nfs/rorym/anaconda3/envs/legate-test/lib/python3.10/site-packages/pluggy/_callers.py", line 39 in _multicall
File "/home/nfs/rorym/anaconda3/envs/legate-test/lib/python3.10/site-packages/pluggy/_manager.py", line 80 in _hookexec
File "/home/nfs/rorym/anaconda3/envs/legate-test/lib/python3.10/site-packages/pluggy/_hooks.py", line 265 in __call__
File "/home/nfs/rorym/anaconda3/envs/legate-test/lib/python3.10/site-packages/_pytest/config/__init__.py", line 166 in main
File "/home/nfs/rorym/anaconda3/envs/legate-test/lib/python3.10/site-packages/_pytest/config/__init__.py", line 189 in console_main
File "/home/nfs/rorym/anaconda3/envs/legate-test/lib/python3.10/site-packages/pytest/__main__.py", line 5 in <module>
File "/home/nfs/rorym/legate.core/_skbuild/linux-x86_64-3.10/cmake-build/_deps/legion-src/bindings/python/build/lib/legion_top.py", line 295 in run_path
File "/home/nfs/rorym/legate.core/_skbuild/linux-x86_64-3.10/cmake-build/_deps/legion-src/bindings/python/build/lib/legion_top.py", line 463 in legion_python_main
Extension modules: _cffi_backend, numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, sklearn.__check_build._check_build, scipy._lib._ccallback_c, scipy.sparse._sparsetools, scipy.sparse._csparsetools, scipy.sparse.linalg._isolve._iterative, scipy.linalg._fblas, scipy.linalg._flapack, scipy.linalg._cythonized_array_utils, scipy.linalg._flinalg, scipy.linalg._solve_toeplitz, scipy.linalg._matfuncs_sqrtm_triu, scipy.linalg.cython_lapack, scipy.linalg.cython_blas, scipy.linalg._matfuncs_expm, scipy.linalg._decomp_update, scipy.sparse.linalg._dsolve._superlu, scipy.sparse.linalg._eigen.arpack._arpack, scipy.sparse.csgraph._tools, scipy.sparse.csgraph._shortest_path, scipy.sparse.csgraph._traversal, scipy.sparse.csgraph._min_spanning_tree, scipy.sparse.csgraph._flow, scipy.sparse.csgraph._matching, scipy.sparse.csgraph._reordering, sklearn.utils.murmurhash, psutil._psutil_linux, psutil._psutil_posix, numpy.linalg.lapack_lite, scipy.spatial._ckdtree, scipy._lib.messagestream, scipy.spatial._qhull, scipy.spatial._voronoi, scipy.spatial._distance_wrap, scipy.spatial._hausdorff, scipy.special._ufuncs_cxx, scipy.special._ufuncs, scipy.special._specfun, scipy.special._comb, scipy.special._ellip_harm_2, scipy.spatial.transform._rotation, scipy.ndimage._nd_image, _ni_label, scipy.ndimage._ni_label, scipy.optimize._minpack2, scipy.optimize._group_columns, scipy.optimize._trlib._trlib, scipy.optimize._lbfgsb, _moduleTNC, scipy.optimize._moduleTNC, scipy.optimize._cobyla, scipy.optimize._slsqp, scipy.optimize._minpack, scipy.optimize._lsq.givens_elimination, scipy.optimize._zeros, scipy.optimize.__nnls, scipy.optimize._highs.cython.src._highs_wrapper, scipy.optimize._highs._highs_wrapper, 
scipy.optimize._highs.cython.src._highs_constants, scipy.optimize._highs._highs_constants, scipy.linalg._interpolative, scipy.optimize._bglu_dense, scipy.optimize._lsap, scipy.optimize._direct, scipy.integrate._odepack, scipy.integrate._quadpack, scipy.integrate._vode, scipy.integrate._dop, scipy.integrate._lsoda, scipy.special.cython_special, scipy.stats._stats, beta_ufunc, scipy.stats._boost.beta_ufunc, binom_ufunc, scipy.stats._boost.binom_ufunc, nbinom_ufunc, scipy.stats._boost.nbinom_ufunc, hypergeom_ufunc, scipy.stats._boost.hypergeom_ufunc, ncf_ufunc, scipy.stats._boost.ncf_ufunc, ncx2_ufunc, scipy.stats._boost.ncx2_ufunc, nct_ufunc, scipy.stats._boost.nct_ufunc, skewnorm_ufunc, scipy.stats._boost.skewnorm_ufunc, invgauss_ufunc, scipy.stats._boost.invgauss_ufunc, scipy.interpolate._fitpack, scipy.interpolate.dfitpack, scipy.interpolate._bspl, scipy.interpolate._ppoly, scipy.interpolate.interpnd, scipy.interpolate._rbfinterp_pythran, scipy.interpolate._rgi_cython, scipy.stats._biasedurn, scipy.stats._levy_stable.levyst, scipy.stats._stats_pythran, scipy._lib._uarray._uarray, scipy.stats._statlib, scipy.stats._mvn, scipy.stats._sobol, scipy.stats._qmc_cy, scipy.stats._rcont.rcont, sklearn.utils._isfinite, sklearn.utils._openmp_helpers, sklearn.utils._logistic_sigmoid, sklearn.utils.sparsefuncs_fast, sklearn.preprocessing._csr_polynomial_expansion, sklearn.utils._random, sklearn.utils._seq_dataset, sklearn.utils._cython_blas, sklearn.utils.arrayfuncs, sklearn.utils._typedefs, sklearn.utils._readonly_array_wrapper, sklearn.metrics._dist_metrics, sklearn.metrics.cluster._expected_mutual_info_fast, sklearn.metrics._pairwise_distances_reduction._datasets_pair, sklearn.metrics._pairwise_distances_reduction._base, sklearn.metrics._pairwise_distances_reduction._middle_term_computer, sklearn.utils._heap, sklearn.utils._sorting, sklearn.metrics._pairwise_distances_reduction._argkmin, sklearn.utils._vector_sentinel, 
sklearn.metrics._pairwise_distances_reduction._radius_neighbors, sklearn.metrics._pairwise_fast, sklearn.linear_model._cd_fast, sklearn._loss._loss, sklearn.utils._weight_vector, sklearn.linear_model._sgd_fast, sklearn.linear_model._sag_fast, sklearn.svm._libsvm, sklearn.svm._liblinear, sklearn.svm._libsvm_sparse, sklearn.neighbors._partition_nodes, sklearn.neighbors._ball_tree, sklearn.neighbors._kd_tree, sklearn.decomposition._cdnmf_fast, sklearn.decomposition._online_lda_fast, sklearn.feature_extraction._hashing_fast, sklearn.datasets._svmlight_format_fast, scipy.io.matlab._mio_utils, scipy.io.matlab._streams, scipy.io.matlab._mio5_utils, legate.core._lib.types, context, legate.core._lib.context (total: 161)
Signal 11 received by node 0, process 472669 (thread 7fa6d00af000) - obtaining backtrace
Signal 11 received by process 472669 (thread 7fa6d00af000) at: stack trace: 17 frames
[0] = /lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb) [0x7fa6d5d7300b]
[1] = /lib/x86_64-linux-gnu/libc.so.6(+0x4308f) [0x7fa6d5d7308f]
[2] = /home/nfs/rorym/legate.core/_skbuild/linux-x86_64-3.10/cmake-build/_deps/legion-build/lib/liblegion.so.1(Legion::Internal::CollectiveViewRendezvous::unpack_collective(Legion::Deserializer&)+0x548) [0x7fa6d7b247f8]
[3] = /home/nfs/rorym/legate.core/_skbuild/linux-x86_64-3.10/cmake-build/_deps/legion-build/lib/liblegion.so.1(Legion::Internal::GatherCollective::handle_collective_message(Legion::Deserializer&)+0x43) [0x7fa6d7af1a43]
[4] = /home/nfs/rorym/legate.core/_skbuild/linux-x86_64-3.10/cmake-build/_deps/legion-build/lib/liblegion.so.1(Legion::Internal::ReplicateContext::register_collective(Legion::Internal::ShardCollective*)+0x266) [0x7fa6d79c43e6]
[5] = /home/nfs/rorym/legate.core/_skbuild/linux-x86_64-3.10/cmake-build/_deps/legion-build/lib/liblegion.so.1(Legion::Internal::GatherCollective::perform_collective_async(Legion::Internal::RtEvent)+0x3d) [0x7fa6d7af186d]
[6] = /home/nfs/rorym/legate.core/_skbuild/linux-x86_64-3.10/cmake-build/_deps/legion-build/lib/liblegion.so.1(Legion::Internal::CollectiveViewCreator<Legion::Internal::AttachOp>::rendezvous_collective_mapping(unsigned int, unsigned int, Legion::LogicalRegion, Legion::Internal::CollectiveViewCreatorBase::RendezvousResult*, unsigned int, std::vector<std::pair<unsigned long long, AVXBitMask<256u> >, Legion::Internal::LegionAllocator<std::pair<unsigned long long, AVXBitMask<256u> >, (Legion::Internal::AllocationType)106> > const&)+0x349) [0x7fa6d7a9c4c9]
[7] = /home/nfs/rorym/legate.core/_skbuild/linux-x86_64-3.10/cmake-build/_deps/legion-build/lib/liblegion.so.1(Legion::Internal::CollectiveViewCreator<Legion::Internal::AttachOp>::convert_collective_views(unsigned int, unsigned int, Legion::LogicalRegion, Legion::Internal::InstanceSet const&, Legion::Internal::InnerContext*, Legion::Internal::CollectiveMapping*&, bool&, std::vector<Legion::Internal::FieldMaskSet<Legion::Internal::InstanceView, (Legion::Internal::AllocationType)106, false>, Legion::Internal::LegionAllocator<Legion::Internal::FieldMaskSet<Legion::Internal::InstanceView, (Legion::Internal::AllocationType)106, false>, (Legion::Internal::AllocationType)106> >&, std::map<Legion::Internal::InstanceView*, unsigned long, std::less<Legion::Internal::InstanceView*>, std::allocator<std::pair<Legion::Internal::InstanceView* const, unsigned long> > >&)+0xf9) [0x7fa6d7aa2ff9]
[8] = /home/nfs/rorym/legate.core/_skbuild/linux-x86_64-3.10/cmake-build/_deps/legion-build/lib/liblegion.so.1(Legion::Internal::OverwriteAnalysis::convert_views(Legion::LogicalRegion, Legion::Internal::InstanceSet const&, unsigned int)+0x2c5) [0x7fa6d78b5695]
[9] = /home/nfs/rorym/legate.core/_skbuild/linux-x86_64-3.10/cmake-build/_deps/legion-build/lib/liblegion.so.1(Legion::Internal::RegionTreeForest::attach_external(Legion::Internal::AttachOp*, unsigned int, Legion::RegionRequirement const&, Legion::Internal::InstanceSet const&, Legion::Internal::VersionInfo const&, Legion::Internal::ApEvent, Legion::Internal::PhysicalTraceInfo const&, std::set<Legion::Internal::RtEvent, std::less<Legion::Internal::RtEvent>, std::allocator<Legion::Internal::RtEvent> >&, bool)+0x13b) [0x7fa6d7c5863b]
[10] = /home/nfs/rorym/legate.core/_skbuild/linux-x86_64-3.10/cmake-build/_deps/legion-build/lib/liblegion.so.1(Legion::Internal::AttachOp::trigger_mapping()+0xa9) [0x7fa6d7a4e4a9]
[11] = /home/nfs/rorym/legate.core/_skbuild/linux-x86_64-3.10/cmake-build/_deps/legion-build/lib/liblegion.so.1(Legion::Internal::Runtime::legion_runtime_task(void const*, unsigned long, void const*, unsigned long, Realm::Processor)+0x771) [0x7fa6d7d846b1]
[12] = /home/nfs/rorym/legate.core/_skbuild/linux-x86_64-3.10/cmake-build/_deps/legion-build/lib/librealm.so.1(+0x510a01) [0x7fa6d660fa01]
[13] = /home/nfs/rorym/legate.core/_skbuild/linux-x86_64-3.10/cmake-build/_deps/legion-build/lib/librealm.so.1(+0x510ab5) [0x7fa6d660fab5]
[14] = /home/nfs/rorym/legate.core/_skbuild/linux-x86_64-3.10/cmake-build/_deps/legion-build/lib/librealm.so.1(+0x514595) [0x7fa6d6613595]
[15] = /home/nfs/rorym/legate.core/_skbuild/linux-x86_64-3.10/cmake-build/_deps/legion-build/lib/librealm.so.1(+0x518d22) [0x7fa6d6617d22]
[16] = /lib/x86_64-linux-gnu/libc.so.6(+0x5b4df) [0x7fa6d5d8b4df]