CloseByPGun Samples Failure #36800

Closed
mdmorris opened this issue Jan 25, 2022 · 14 comments

Comments

@mdmorris

The CloseByPGun samples are failing in CMSSW_12_3_0_pre2 (2026D76). I am opening this issue to find the cause of the error, which is linked below. All other requested HGCAL samples completed without failing.

Error: https://cms-unified.web.cern.ch/cms-unified/joblogs/pdmvserv_RVCMSSW_12_3_0_pre2CloseByPGun_CE_H_Coarse_300um__2026D77noPU_220117_164641_2516/139/GenSimFull/68f22295-aa09-40d3-9e92-514fad5f7040-0-3-logArchive/job/WMTaskSpace/cmsRun1/cmsRun1-stdout.log

All CMSSW_12_3_0_pre2 (2026D76) relval samples: https://cms-pdmv.cern.ch/relval/tickets?prepid=*12_3_0_pre2*2026D*&shown=2047&page=0&limit=50

@cms-sw/pdmv-l2 @cms-sw/core-l2

@cmsbuild
Contributor

A new Issue was created by @mdmorris .

@Dr15Jones, @perrotta, @dpiparo, @makortel, @smuzaffar, @qliphy can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@makortel
Contributor

assign generators

@cmsbuild
Contributor

New categories assigned: generators

@mkirsano,@alberto-sanchez,@SiewYan,@GurpreetSinghChahal,@Saptaparna you have been requested to review this Pull request/Issue and eventually sign? Thanks

@Dr15Jones
Contributor

There doesn't appear to be much useful information in the traceback:

#4  <signal handler called>
#5  0x00002b8e65919bf8 in edm::CloseByParticleGunProducer::produce(edm::Event&, edm::EventSetup const&) () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_0_pre2/lib/slc7_amd64_gcc10/pluginIOMCParticleGuns.so
#6  0x00002b8e380abedb in edm::one::EDProducerBase::doEvent(edm::EventTransitionInfo const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_0_pre2/lib/slc7_amd64_gcc10/libFWCoreFramework.so

@makortel
Contributor

Stack trace is

Thread 18 (Thread 0x2b8e88e02700 (LWP 586) "cmsRun"):
#2  0x00002b8e3ff7bc20 in sig_pause_for_stacktrace () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_0_pre2/lib/slc7_amd64_gcc10/pluginFWCoreServicesPlugins.so
#3  <signal handler called>
#4  0x00002b8e397df84a in tbb::detail::d0::machine_pause (delay=0) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_12_3_0_pre2-slc7_amd64_gcc10/build/CMSSW_12_3_0_pre2-build/BUILD/slc7_amd64_gcc10/external/tbb/v2021.4.0-948cf178e0e7563ed46ef2d4709d6819/tbb-v2021.4.0/src/tbb/../../include/oneapi/tbb/detail/_machine.h:104
#5  tbb::detail::d0::atomic_backoff::bounded_pause (this=<synthetic pointer>) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_12_3_0_pre2-slc7_amd64_gcc10/build/CMSSW_12_3_0_pre2-build/BUILD/slc7_amd64_gcc10/external/tbb/v2021.4.0-948cf178e0e7563ed46ef2d4709d6819/tbb-v2021.4.0/src/tbb/../../include/oneapi/tbb/detail/_utils.h:77
#6  tbb::detail::r1::prolonged_pause_impl () at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_12_3_0_pre2-slc7_amd64_gcc10/build/CMSSW_12_3_0_pre2-build/BUILD/slc7_amd64_gcc10/external/tbb/v2021.4.0-948cf178e0e7563ed46ef2d4709d6819/tbb-v2021.4.0/src/tbb/scheduler_common.h:202
#7  tbb::detail::r1::prolonged_pause () at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_12_3_0_pre2-slc7_amd64_gcc10/build/CMSSW_12_3_0_pre2-build/BUILD/slc7_amd64_gcc10/external/tbb/v2021.4.0-948cf178e0e7563ed46ef2d4709d6819/tbb-v2021.4.0/src/tbb/scheduler_common.h:234
#8  tbb::detail::r1::stealing_loop_backoff::pause (this=0x2b8e88dfb058) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_12_3_0_pre2-slc7_amd64_gcc10/build/CMSSW_12_3_0_pre2-build/BUILD/slc7_amd64_gcc10/external/tbb/v2021.4.0-948cf178e0e7563ed46ef2d4709d6819/tbb-v2021.4.0/src/tbb/scheduler_common.h:263
#9  tbb::detail::r1::waiter_base::pause (this=0x2b8e88dfb050) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_12_3_0_pre2-slc7_amd64_gcc10/build/CMSSW_12_3_0_pre2-build/BUILD/slc7_amd64_gcc10/external/tbb/v2021.4.0-948cf178e0e7563ed46ef2d4709d6819/tbb-v2021.4.0/src/tbb/waiters.h:35
#10 tbb::detail::r1::outermost_worker_waiter::pause (this=0x2b8e88dfb050) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_12_3_0_pre2-slc7_amd64_gcc10/build/CMSSW_12_3_0_pre2-build/BUILD/slc7_amd64_gcc10/external/tbb/v2021.4.0-948cf178e0e7563ed46ef2d4709d6819/tbb-v2021.4.0/src/tbb/waiters.h:69

Thread 17 (Thread 0x2b8e88401700 (LWP 585) "cmsRun"):
#2  0x00002b8e3ff7bc20 in sig_pause_for_stacktrace () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_0_pre2/lib/slc7_amd64_gcc10/pluginFWCoreServicesPlugins.so
#3  <signal handler called>
#4  0x00002b8e3a6e1917 in sched_yield () from /lib64/libc.so.6
#5  0x00002b8e397df8b0 in __gthread_yield () at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_12_3_0_pre2-slc7_amd64_gcc10/build/CMSSW_12_3_0_pre2-build/slc7_amd64_gcc10/external/gcc/10.3.0-84898dea653199466402e67d73657f10/include/c++/10.3.0/x86_64-unknown-linux-gnu/bits/gthr-default.h:693
#6  std::this_thread::yield () at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_12_3_0_pre2-slc7_amd64_gcc10/build/CMSSW_12_3_0_pre2-build/slc7_amd64_gcc10/external/gcc/10.3.0-84898dea653199466402e67d73657f10/include/c++/10.3.0/thread:379
#7  tbb::detail::r1::stealing_loop_backoff::pause (this=0x2b8e883fa058) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_12_3_0_pre2-slc7_amd64_gcc10/build/CMSSW_12_3_0_pre2-build/BUILD/slc7_amd64_gcc10/external/tbb/v2021.4.0-948cf178e0e7563ed46ef2d4709d6819/tbb-v2021.4.0/src/tbb/scheduler_common.h:266
#8  tbb::detail::r1::waiter_base::pause (this=0x2b8e883fa050) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_12_3_0_pre2-slc7_amd64_gcc10/build/CMSSW_12_3_0_pre2-build/BUILD/slc7_amd64_gcc10/external/tbb/v2021.4.0-948cf178e0e7563ed46ef2d4709d6819/tbb-v2021.4.0/src/tbb/waiters.h:35
#9  tbb::detail::r1::outermost_worker_waiter::pause (this=0x2b8e883fa050) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_12_3_0_pre2-slc7_amd64_gcc10/build/CMSSW_12_3_0_pre2-build/BUILD/slc7_amd64_gcc10/external/tbb/v2021.4.0-948cf178e0e7563ed46ef2d4709d6819/tbb-v2021.4.0/src/tbb/waiters.h:69

Thread 16 (Thread 0x2b8e87800700 (LWP 584) "cmsRun"):
#2  0x00002b8e3ff7bc20 in sig_pause_for_stacktrace () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_0_pre2/lib/slc7_amd64_gcc10/pluginFWCoreServicesPlugins.so
#3  <signal handler called>
#4  0x00002b8e3a6e1917 in sched_yield () from /lib64/libc.so.6
#5  0x00002b8e397df8b0 in __gthread_yield () at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_12_3_0_pre2-slc7_amd64_gcc10/build/CMSSW_12_3_0_pre2-build/slc7_amd64_gcc10/external/gcc/10.3.0-84898dea653199466402e67d73657f10/include/c++/10.3.0/x86_64-unknown-linux-gnu/bits/gthr-default.h:693
#6  std::this_thread::yield () at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_12_3_0_pre2-slc7_amd64_gcc10/build/CMSSW_12_3_0_pre2-build/slc7_amd64_gcc10/external/gcc/10.3.0-84898dea653199466402e67d73657f10/include/c++/10.3.0/thread:379
#7  tbb::detail::r1::stealing_loop_backoff::pause (this=0x2b8e877f9058) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_12_3_0_pre2-slc7_amd64_gcc10/build/CMSSW_12_3_0_pre2-build/BUILD/slc7_amd64_gcc10/external/tbb/v2021.4.0-948cf178e0e7563ed46ef2d4709d6819/tbb-v2021.4.0/src/tbb/scheduler_common.h:266
#8  tbb::detail::r1::waiter_base::pause (this=0x2b8e877f9050) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_12_3_0_pre2-slc7_amd64_gcc10/build/CMSSW_12_3_0_pre2-build/BUILD/slc7_amd64_gcc10/external/tbb/v2021.4.0-948cf178e0e7563ed46ef2d4709d6819/tbb-v2021.4.0/src/tbb/waiters.h:35
#9  tbb::detail::r1::outermost_worker_waiter::pause (this=0x2b8e877f9050) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_12_3_0_pre2-slc7_amd64_gcc10/build/CMSSW_12_3_0_pre2-build/BUILD/slc7_amd64_gcc10/external/tbb/v2021.4.0-948cf178e0e7563ed46ef2d4709d6819/tbb-v2021.4.0/src/tbb/waiters.h:69

Thread 15 (Thread 0x2b8e86601700 (LWP 583) "cmsRun"):
#2  0x00002b8e3ff7bc20 in sig_pause_for_stacktrace () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_0_pre2/lib/slc7_amd64_gcc10/pluginFWCoreServicesPlugins.so
#3  <signal handler called>
#4  0x00002b8e3a6e1917 in sched_yield () from /lib64/libc.so.6
#5  0x00002b8e397df8b0 in __gthread_yield () at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_12_3_0_pre2-slc7_amd64_gcc10/build/CMSSW_12_3_0_pre2-build/slc7_amd64_gcc10/external/gcc/10.3.0-84898dea653199466402e67d73657f10/include/c++/10.3.0/x86_64-unknown-linux-gnu/bits/gthr-default.h:693
#6  std::this_thread::yield () at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_12_3_0_pre2-slc7_amd64_gcc10/build/CMSSW_12_3_0_pre2-build/slc7_amd64_gcc10/external/gcc/10.3.0-84898dea653199466402e67d73657f10/include/c++/10.3.0/thread:379
#7  tbb::detail::r1::stealing_loop_backoff::pause (this=0x2b8e865fa058) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_12_3_0_pre2-slc7_amd64_gcc10/build/CMSSW_12_3_0_pre2-build/BUILD/slc7_amd64_gcc10/external/tbb/v2021.4.0-948cf178e0e7563ed46ef2d4709d6819/tbb-v2021.4.0/src/tbb/scheduler_common.h:266
#8  tbb::detail::r1::waiter_base::pause (this=0x2b8e865fa050) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_12_3_0_pre2-slc7_amd64_gcc10/build/CMSSW_12_3_0_pre2-build/BUILD/slc7_amd64_gcc10/external/tbb/v2021.4.0-948cf178e0e7563ed46ef2d4709d6819/tbb-v2021.4.0/src/tbb/waiters.h:35
#9  tbb::detail::r1::outermost_worker_waiter::pause (this=0x2b8e865fa050) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_12_3_0_pre2-slc7_amd64_gcc10/build/CMSSW_12_3_0_pre2-build/BUILD/slc7_amd64_gcc10/external/tbb/v2021.4.0-948cf178e0e7563ed46ef2d4709d6819/tbb-v2021.4.0/src/tbb/waiters.h:69

Thread 14 (Thread 0x2b8e85c00700 (LWP 582) "cmsRun"):
#3  0x00002b8e3ff7f36b in sig_dostack_then_abort () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_0_pre2/lib/slc7_amd64_gcc10/pluginFWCoreServicesPlugins.so
#4  <signal handler called>
#5  0x00002b8e65919bf8 in edm::CloseByParticleGunProducer::produce(edm::Event&, edm::EventSetup const&) () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_0_pre2/lib/slc7_amd64_gcc10/pluginIOMCParticleGuns.so
#6  0x00002b8e380abedb in edm::one::EDProducerBase::doEvent(edm::EventTransitionInfo const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_0_pre2/lib/slc7_amd64_gcc10/libFWCoreFramework.so
#7  0x00002b8e3809bd6f in edm::WorkerT<edm::one::EDProducerBase>::implDo(edm::EventTransitionInfo const&, edm::ModuleCallingContext const*) () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_0_pre2/lib/slc7_amd64_gcc10/libFWCoreFramework.so
#8  0x00002b8e37ff6d45 in decltype ({parm#1}()) edm::convertException::wrap<edm::Worker::runModule<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::TransitionInfoType const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*)::{lambda()#1}>(edm::Worker::runModule<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::TransitionInfoType const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*)::{lambda()#1}) () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_0_pre2/lib/slc7_amd64_gcc10/libFWCoreFramework.so
#9  0x00002b8e37ff703b in std::__exception_ptr::exception_ptr edm::Worker::runModuleAfterAsyncPrefetch<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(std::__exception_ptr::exception_ptr const*, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::TransitionInfoType const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*) () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_0_pre2/lib/slc7_amd64_gcc10/libFWCoreFramework.so
#10 0x00002b8e37ff9b99 in void edm::SerialTaskQueueChain::actionToRun<edm::Worker::RunModuleTask<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >::execute()::{lambda()#1}&>(edm::Worker::RunModuleTask<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >::execute()::{lambda()#1}&) () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_0_pre2/lib/slc7_amd64_gcc10/libFWCoreFramework.so
#11 0x00002b8e37ff9df1 in edm::SerialTaskQueue::QueuedTask<edm::SerialTaskQueueChain::push<edm::Worker::RunModuleTask<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >::execute()::{lambda()#1}&>(tbb::detail::d1::task_group&, edm::Worker::RunModuleTask<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >::execute()::{lambda()#1}&)::{lambda()#1}>::execute() () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_0_pre2/lib/slc7_amd64_gcc10/libFWCoreFramework.so
#12 0x00002b8e37d8b075 in tbb::detail::d1::function_task<edm::SerialTaskQueue::spawn(edm::SerialTaskQueue::TaskBase&)::{lambda()#1}>::execute(tbb::detail::d1::execution_data&) () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_0_pre2/lib/slc7_amd64_gcc10/libFWCoreConcurrency.so

Thread 13 (Thread 0x2b8e84c01700 (LWP 581) "cmsRun"):
#1  0x00002b8e3a6c3894 in sleep () from /lib64/libc.so.6
#2  0x00002b8e3ff7bc20 in sig_pause_for_stacktrace () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_0_pre2/lib/slc7_amd64_gcc10/pluginFWCoreServicesPlugins.so
#3  <signal handler called>
#4  0x00002b8e3a6e1917 in sched_yield () from /lib64/libc.so.6
#5  0x00002b8e397df8b0 in __gthread_yield () at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_12_3_0_pre2-slc7_amd64_gcc10/build/CMSSW_12_3_0_pre2-build/slc7_amd64_gcc10/external/gcc/10.3.0-84898dea653199466402e67d73657f10/include/c++/10.3.0/x86_64-unknown-linux-gnu/bits/gthr-default.h:693
#6  std::this_thread::yield () at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_12_3_0_pre2-slc7_amd64_gcc10/build/CMSSW_12_3_0_pre2-build/slc7_amd64_gcc10/external/gcc/10.3.0-84898dea653199466402e67d73657f10/include/c++/10.3.0/thread:379
#7  tbb::detail::r1::stealing_loop_backoff::pause (this=0x2b8e84bfa058) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_12_3_0_pre2-slc7_amd64_gcc10/build/CMSSW_12_3_0_pre2-build/BUILD/slc7_amd64_gcc10/external/tbb/v2021.4.0-948cf178e0e7563ed46ef2d4709d6819/tbb-v2021.4.0/src/tbb/scheduler_common.h:266
#8  tbb::detail::r1::waiter_base::pause (this=0x2b8e84bfa050) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_12_3_0_pre2-slc7_amd64_gcc10/build/CMSSW_12_3_0_pre2-build/BUILD/slc7_amd64_gcc10/external/tbb/v2021.4.0-948cf178e0e7563ed46ef2d4709d6819/tbb-v2021.4.0/src/tbb/waiters.h:35
#9  tbb::detail::r1::outermost_worker_waiter::pause (this=0x2b8e84bfa050) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_12_3_0_pre2-slc7_amd64_gcc10/build/CMSSW_12_3_0_pre2-build/BUILD/slc7_amd64_gcc10/external/tbb/v2021.4.0-948cf178e0e7563ed46ef2d4709d6819/tbb-v2021.4.0/src/tbb/waiters.h:69

Thread 12 (Thread 0x2b8e84200700 (LWP 580) "cmsRun"):
#2  0x00002b8e3ff7bc20 in sig_pause_for_stacktrace () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_0_pre2/lib/slc7_amd64_gcc10/pluginFWCoreServicesPlugins.so
#3  <signal handler called>
#4  0x00002b8e391cc28f in je_extent_heap_remove (ph=<optimized out>, phn=phn@entry=0x2b8e84ebd880) at src/extent.c:283
#5  0x00002b8e391ccc4f in extents_remove_locked (extents=extents@entry=0x2b8e84e031a8, extent=extent@entry=0x2b8e84ebd880, tsdn=0x2b8e841f9a28) at src/extent.c:378
#6  0x00002b8e391cd4fd in je_extents_evict (tsdn=tsdn@entry=0x2b8e841f9a28, arena=arena@entry=0x2b8e84e008c0, r_extent_hooks=r_extent_hooks@entry=0x2b8e841f8508, extents=extents@entry=0x2b8e84e031a8, npages_min=npages_min@entry=257) at src/extent.c:592
#7  0x00002b8e39192598 in arena_stash_decayed (decay_extents=<synthetic pointer>, npages_decay_max=3203, npages_limit=257, extents=0x2b8e84e031a8, r_extent_hooks=0x2b8e841f8508, arena=0x2b8e84e008c0, tsdn=0x2b8e841f9a28) at src/arena.c:834
#8  arena_decay_to_limit (tsdn=tsdn@entry=0x2b8e841f9a28, arena=arena@entry=0x2b8e84e008c0, decay=decay@entry=0x2b8e84e06be0, extents=extents@entry=0x2b8e84e031a8, all=all@entry=false, npages_limit=<optimized out>, npages_decay_max=<optimized out>, is_background_thread=<optimized out>) at src/arena.c:934
#9  0x00002b8e3919299b in arena_decay_try_purge (is_background_thread=<optimized out>, npages_limit=<optimized out>, current_npages=<optimized out>, extents=<optimized out>, decay=<optimized out>, arena=<optimized out>, tsdn=<optimized out>) at src/arena.c:615
#10 arena_decay_try_purge (is_background_thread=<optimized out>, npages_limit=<optimized out>, current_npages=<optimized out>, extents=<optimized out>, decay=<optimized out>, arena=<optimized out>, tsdn=<optimized out>) at src/arena.c:611
#11 arena_maybe_decay (is_background_thread=<optimized out>, extents=<optimized out>, decay=<optimized out>, arena=<optimized out>, tsdn=<optimized out>) at src/arena.c:762
#12 arena_maybe_decay (tsdn=0x2b8e841f9a28, arena=0x2b8e84e008c0, decay=0x2b8e84e06be0, extents=0x2b8e84e031a8, is_background_thread=<optimized out>) at src/arena.c:714
#13 0x00002b8e39194f74 in arena_decay_impl (all=<optimized out>, is_background_thread=false, extents=0x2b8e84e031a8, decay=0x2b8e84e06be0, arena=0x2b8e84e008c0, tsdn=0x2b8e841f9a28) at src/arena.c:964
#14 arena_decay_dirty (all=<optimized out>, is_background_thread=false, arena=0x2b8e84e008c0, tsdn=0x2b8e841f9a28) at src/arena.c:985
#15 je_arena_decay (tsdn=0x2b8e841f9a28, arena=0x2b8e84e008c0, is_background_thread=false, all=<optimized out>) at src/arena.c:998
#16 0x00002b8e39195df2 in arena_decay_ticks (nticks=1, arena=0x2b8e84e031a8, tsdn=0x0) at include/jemalloc/internal/arena_inlines_b.h:126
#17 0x00002b8e391f5e24 in je_tcache_alloc_small_hard (tsdn=tsdn@entry=0x2b8e841f9a28, arena=arena@entry=0x2b8e84e008c0, tcache=tcache@entry=0x2b8e841f9c28, tbin=tbin@entry=0x2b8e841f9ce0, binind=binind@entry=7, tcache_success=tcache_success@entry=0x2b8e841f8670) at src/tcache.c:94
#18 0x00002b8e3918aa42 in tcache_alloc_small (slow_path=false, zero=false, binind=7, size=<optimized out>, tcache=0x2b8e841f9c28, arena=0x2b8e84e008c0, tsd=0x2b8e841f9a28) at include/jemalloc/internal/tcache_inlines.h:60
#19 arena_malloc (slow_path=false, tcache=0x2b8e841f9c28, zero=false, ind=7, size=<optimized out>, arena=0x0, tsdn=<optimized out>) at include/jemalloc/internal/arena_inlines_b.h:165
#20 iallocztm (slow_path=false, arena=0x0, is_internal=false, tcache=0x2b8e841f9c28, zero=false, ind=7, size=<optimized out>, tsdn=<optimized out>) at include/jemalloc/internal/jemalloc_internal_inlines_c.h:53
#21 imalloc_no_sample (ind=7, usize=112, size=<optimized out>, tsd=<optimized out>, dopts=<synthetic pointer>, sopts=<synthetic pointer>) at src/jemalloc.c:1949
#22 imalloc_body (tsd=<optimized out>, dopts=<synthetic pointer>, sopts=<synthetic pointer>) at src/jemalloc.c:2149
#23 imalloc (dopts=<synthetic pointer>, sopts=<synthetic pointer>) at src/jemalloc.c:2260
#24 je_malloc_default (size=<optimized out>) at src/jemalloc.c:2291
#25 0x00002b8e3918b86a in malloc (size=size@entry=104) at src/jemalloc.c:2390
#26 0x00002b8e391fa7d9 in newImpl<false> (size=104) at src/jemalloc_cpp.cpp:77
#27 operator new (size=104) at src/jemalloc_cpp.cpp:87
#28 0x00002b8e37ff6623 in void edm::Worker::doWorkAsync<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(edm::WaitingTaskHolder, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::TransitionInfoType const&, edm::ServiceToken const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*) () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_0_pre2/lib/slc7_amd64_gcc10/libFWCoreFramework.so
#29 0x00002b8e37ff370a in edm::Path::runNextWorkerAsync(unsigned int, edm::EventTransitionInfo const&, edm::ServiceToken const&, edm::StreamID const&, edm::StreamContext const*, tbb::detail::d1::task_group&) () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_0_pre2/lib/slc7_amd64_gcc10/libFWCoreFramework.so
#30 0x00002b8e37ff3b8e in edm::Path::processOneOccurrenceAsync(edm::WaitingTaskHolder, edm::EventTransitionInfo const&, edm::ServiceToken const&, edm::StreamID const&, edm::StreamContext const*) () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_0_pre2/lib/slc7_amd64_gcc10/libFWCoreFramework.so
#31 0x00002b8e38069b2a in edm::StreamSchedule::processOneEventAsync(edm::WaitingTaskHolder, edm::EventTransitionInfo&, edm::ServiceToken const&, std::vector<edm::propagate_const<std::shared_ptr<edm::PathStatusInserter> >, std::allocator<edm::propagate_const<std::shared_ptr<edm::PathStatusInserter> > > >&) () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_0_pre2/lib/slc7_amd64_gcc10/libFWCoreFramework.so
#32 0x00002b8e3803d292 in edm::Schedule::processOneEventAsync(edm::WaitingTaskHolder, unsigned int, edm::EventTransitionInfo&, edm::ServiceToken const&) () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_0_pre2/lib/slc7_amd64_gcc10/libFWCoreFramework.so
#33 0x00002b8e37f5caf7 in edm::waiting_task::detail::WaitingTaskChain<edm::waiting_task::detail::Conditional<edm::waiting_task::detail::AutoExceptionHandler<edm::EventProcessor::processEventAsyncImpl(edm::WaitingTaskHolder, unsigned int)::{lambda(auto:1)#2}> >, edm::waiting_task::detail::AutoExceptionHandler<edm::EventProcessor::processEventAsyncImpl(edm::WaitingTaskHolder, unsigned int)::{lambda(auto:1)#1}> >::runLast(edm::WaitingTaskHolder) () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_0_pre2/lib/slc7_amd64_gcc10/libFWCoreFramework.so
#34 0x00002b8e37f73e7b in edm::EventProcessor::processEventAsyncImpl(edm::WaitingTaskHolder, unsigned int) () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_0_pre2/lib/slc7_amd64_gcc10/libFWCoreFramework.so
#35 0x00002b8e37f73feb in tbb::detail::d1::function_task<edm::EventProcessor::processEventAsync(edm::WaitingTaskHolder, unsigned int)::{lambda()#1}>::execute(tbb::detail::d1::execution_data&) () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_0_pre2/lib/slc7_amd64_gcc10/libFWCoreFramework.so

Thread 1 (Thread 0x2b8e3c771140 (LWP 547) "cmsRun"):
#2  0x00002b8e3ff7bc20 in sig_pause_for_stacktrace () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_0_pre2/lib/slc7_amd64_gcc10/pluginFWCoreServicesPlugins.so
#3  <signal handler called>
#4  free_fastpath (size_hint=true, size=104, ptr=0x2b8e893f6440) at src/jemalloc.c:2832
#5  je_je_sdallocx_noflags (ptr=0x2b8e893f6440, size=104) at src/jemalloc.c:3618
#6  0x00002b8e391fa89e in operator delete (ptr=<optimized out>, size=<optimized out>) at src/jemalloc_cpp.cpp:131
#7  0x00002b8e37f3cd79 in tbb::detail::d1::function_task<edm::WaitingTaskHolder::doneWaiting(std::__exception_ptr::exception_ptr)::{lambda()#1}>::execute(tbb::detail::d1::execution_data&) () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_0_pre2/lib/slc7_amd64_gcc10/libFWCoreFramework.so
#8  0x00002b8e397f2b8c in tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::external_waiter> (waiter=..., t=0x2b8e3d29c500, this=0x2b8e3d417900) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_12_3_0_pre2-slc7_amd64_gcc10/build/CMSSW_12_3_0_pre2-build/BUILD/slc7_amd64_gcc10/external/tbb/v2021.4.0-948cf178e0e7563ed46ef2d4709d6819/tbb-v2021.4.0/src/tbb/task_dispatcher.h:322
#9  tbb::detail::r1::task_dispatcher::local_wait_for_all<tbb::detail::r1::external_waiter> (waiter=..., t=<optimized out>, this=0x2b8e3d417900) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_12_3_0_pre2-slc7_amd64_gcc10/build/CMSSW_12_3_0_pre2-build/BUILD/slc7_amd64_gcc10/external/tbb/v2021.4.0-948cf178e0e7563ed46ef2d4709d6819/tbb-v2021.4.0/src/tbb/task_dispatcher.h:463
#10 tbb::detail::r1::task_dispatcher::execute_and_wait (t=<optimized out>, wait_ctx=..., w_ctx=...) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_12_3_0_pre2-slc7_amd64_gcc10/build/CMSSW_12_3_0_pre2-build/BUILD/slc7_amd64_gcc10/external/tbb/v2021.4.0-948cf178e0e7563ed46ef2d4709d6819/tbb-v2021.4.0/src/tbb/task_dispatcher.cpp:168
#11 0x00002b8e37f67d38 in edm::EventProcessor::processLumis(std::shared_ptr<void> const&) () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_0_pre2/lib/slc7_amd64_gcc10/libFWCoreFramework.so
#12 0x00002b8e37f72a8b in edm::EventProcessor::runToCompletion() () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_0_pre2/lib/slc7_amd64_gcc10/libFWCoreFramework.so
#13 0x000000000040a266 in tbb::detail::d1::task_arena_function<main::{lambda()#1}::operator()() const::{lambda()#1}, void>::operator()() const ()
#14 0x00002b8e397e115b in tbb::detail::r1::task_arena_impl::execute (ta=..., d=...) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_12_3_0_pre2-slc7_amd64_gcc10/build/CMSSW_12_3_0_pre2-build/BUILD/slc7_amd64_gcc10/external/tbb/v2021.4.0-948cf178e0e7563ed46ef2d4709d6819/tbb-v2021.4.0/src/tbb/arena.cpp:698
#15 0x000000000040b094 in main::{lambda()#1}::operator()() const ()
#16 0x000000000040971c in main ()

Current Modules:

Module: CloseByParticleGunProducer:generator (crashed)
Module: none
Module: none
Module: none
Module: none
Module: none
Module: none
Module: none

@makortel
Contributor

I was able to reproduce the crash locally; compiling with -g gave the line number:

#4  <signal handler called>
#5  0x00007f0e6a99cbf8 in HepPDT::Measurement::Measurement (m=..., this=<optimized out>) at /cvmfs/cms.cern.ch/slc7_amd64_gcc10/external/heppdt/3.04.01-86902b9f3ac7badd9ed7e1d9eeffb425/include/HepPDT/Measurement.icc:36
#6  HepPDT::ParticleData::mass (this=0x0) at /cvmfs/cms.cern.ch/slc7_amd64_gcc10/external/heppdt/3.04.01-86902b9f3ac7badd9ed7e1d9eeffb425/include/HepPDT/ParticleData.hh:69
#7  edm::CloseByParticleGunProducer::produce (this=0x7f0e6ce32c00, e=..., es=...) at /build/mkortela/debug/CMSSW_12_3_0_pre2/src/IOMC/ParticleGuns/src/CloseByParticleGunProducer.cc:86
#8  0x00007f0e991e7edb in edm::one::EDProducerBase::doEvent(edm::EventTransitionInfo const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_0_pre2/lib/slc7_amd64_gcc10/libFWCoreFramework.so
#9  0x00007f0e991d7d6f in edm::WorkerT<edm::one::EDProducerBase>::implDo(edm::EventTransitionInfo const&, edm::ModuleCallingContext const*) () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_0_pre2/lib/slc7_amd64_gcc10/libFWCoreFramework.so

which points to

int PartID = CLHEP::RandFlat::shoot(engine, 0, fPartIDs.size());
const HepPDT::ParticleData* PData = fPDGTable->particle(HepPDT::ParticleID(abs(PartID)));
double mass = PData->mass().value();

where PData is nullptr.
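
A minimal sketch of guarding such a lookup, so that an unknown ID produces a clear error rather than a null dereference. `ParticleRecord`, `ParticleTable`, and `massOf` are hypothetical stand-ins written for illustration, not the HepPDT or CMSSW API; inside the producer one would presumably throw a framework exception rather than `std::out_of_range`.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for HepPDT::ParticleData; only the mass is modelled.
struct ParticleRecord {
  double mass;
};

// Hypothetical stand-in for the particle table. particle() returns nullptr
// for an unknown ID, mirroring the null result seen in the trace above.
class ParticleTable {
public:
  void add(int id, double mass) { records_[id] = ParticleRecord{mass}; }

  const ParticleRecord* particle(int id) const {
    auto it = records_.find(id);
    return it == records_.end() ? nullptr : &it->second;
  }

private:
  std::map<int, ParticleRecord> records_;
};

// Look up the mass, checking the pointer before dereferencing it.
double massOf(const ParticleTable& table, int id) {
  const ParticleRecord* data = table.particle(id);
  if (data == nullptr) {
    throw std::out_of_range("unknown particle ID " + std::to_string(id));
  }
  return data->mass;
}
```

With a table that only contains the photon (ID 22), `massOf(table, 22)` returns the stored mass, while `massOf(table, 0)` throws instead of crashing.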

@makortel
Contributor

fPartIDs comes from the configuration, where it is set as follows:

process.generator = cms.EDProducer("CloseByParticleGunProducer",
    PGunParameters = cms.PSet(
        PartID = cms.vint32(22),

Adding a printout shows that the failing case has PartID = 0. I'm wondering if the code should have something along the lines of

int index = CLHEP::RandFlat::shoot(engine, 0, fPartIDs.size());
int PartID = fPartIDs[index];

I see this part was changed recently in #36460. @lecriste @rovere
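
To illustrate the difference, here is a self-contained sketch. It assumes the flat draw returns a floating-point value in [0, size) that is truncated on conversion to an integer. `shootFlat`, `drawAsId`, and `drawAsIndex` are hypothetical names, and `std::mt19937` stands in for the CLHEP engine.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical stand-in for a flat draw in [a, b) such as RandFlat::shoot(engine, a, b).
double shootFlat(std::mt19937& engine, double a, double b) {
  std::uniform_real_distribution<double> dist(a, b);
  return dist(engine);
}

// Pattern from the failing code: the truncated draw is used directly as the particle ID.
int drawAsId(std::mt19937& engine, const std::vector<int>& partIDs) {
  return static_cast<int>(shootFlat(engine, 0.0, static_cast<double>(partIDs.size())));
}

// Suggested pattern: the truncated draw is an index into the configured ID list.
// Assumes partIDs is not empty.
int drawAsIndex(std::mt19937& engine, const std::vector<int>& partIDs) {
  auto index = static_cast<std::size_t>(
      shootFlat(engine, 0.0, static_cast<double>(partIDs.size())));
  // Clamp defensively in case the draw lands exactly on the upper bound.
  index = std::min(index, partIDs.size() - 1);
  return partIDs[index];
}
```

With a single configured ID (22), the draw lies in [0, 1), so `drawAsId` truncates it to 0, which matches the PartID = 0 seen in the printout, whereas `drawAsIndex` returns 22.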

@makortel
Copy link
Contributor

Is this particle gun generator tested in any runTheMatrix workflow or other test in IBs? @cms-sw/pdmv-l2

(If not, would it make sense to establish a policy, if there isn't one yet, that generators run in RelVals must also be tested in IBs?)

@rovere
Contributor

rovere commented Jan 25, 2022

Ciao @makortel @Dr15Jones thanks for having looked into that. We will review and submit a fix asap.
@makortel there is no such workflow tested at PR integration. At the same time, it would be nice to have the freedom and the possibility to selectively test specific workflows depending on the content of the proposed changes.
My recent experience on this front has not been stellar, which could be due to my ignorance. The possibility of triggering any workflow from any 'matrix collection' is not completely intuitive or well documented.
Having said that, I agree that having one such workflow regularly tested may be useful.

Ciao and thanks,
Marco.

@Dr15Jones
Contributor

It is also possible to write tests for a specific module using the TestProcessor framework

https://github.com/cms-sw/cmssw/tree/master/FWCore/TestProcessor

@rovere
Contributor

rovere commented Jan 25, 2022

@Dr15Jones thanks for the useful pointer. I was not aware of this functionality.
On a side note, #36554 includes the fix for this problem but has been pending for 1+ weeks now, with exactly the same request I made here: how to trigger a generic runTheMatrix workflow from any of its collections.

@SiewYan
Contributor

SiewYan commented Jan 27, 2022

@rovere sorry about that. Could you use the TestProcessor framework to trigger a specific test for your PR? I will sign it off if the PR-specific test succeeds.

@rovere
Contributor

rovere commented Feb 4, 2022

I believe the fix has been merged and is effective since CloseByPGun is tested at every IB and PR.
Can we close this issue?

@perrotta
Contributor

perrotta commented Feb 8, 2022

@mdmorris @cms-sw/generators-l2 I'm closing this issue.
Please reopen it if you think anything else remains to be done.

@perrotta perrotta closed this as completed Feb 8, 2022