SIGSEGV in HGCalImagingAlgo present in RelVals for slc7_aarch64_gcc530 & slc7_aarch64_gcc700 (aarch64 only) #19179

Closed
mrodozov opened this issue Jun 9, 2017 · 28 comments

Comments

@mrodozov
Contributor

mrodozov commented Jun 9, 2017

We have been tracking release validation (RelVal) errors that appear only in aarch64 builds (here http://goo.gl/bhxlJE and here http://goo.gl/wPUz5C; workflows 270* and 274* fail with SIGSEGV) and found that they started after PR #18236. Before that, we manually ran the first test, 27034.0, which failed with the following:

(gdb) where
#0  0x000003ff87439bc8 in _int_free () from /lib64/libc.so.6
#1  0x000003ff8907ebf4 in std::vector<double, std::allocator<double> >::~vector (this=<optimized out>, __in_chrg=<optimized out>)
    at /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc530/external/gcc/5.3.0/include/c++/5.3.0/bits/stl_vector.h:425
#2  std::_Destroy<std::vector<double, std::allocator<double> > > (__pointer=<optimized out>) at /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc530/external/gcc/5.3.0/include/c++/5.3.0/bits/stl_construct.h:93
#3  std::_Destroy_aux<false>::__destroy<std::vector<double, std::allocator<double> >*> (__first=0x5a13e108, __last=0x5a13e2a0)
    at /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc530/external/gcc/5.3.0/include/c++/5.3.0/bits/stl_construct.h:103
#4  0x000003ff8907ec20 in std::_Destroy<std::vector<double, std::allocator<double> >*> (__last=<optimized out>, __first=<optimized out>)
    at /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc530/external/gcc/5.3.0/include/c++/5.3.0/bits/stl_construct.h:126
#5  std::_Destroy<std::vector<double, std::allocator<double> >*, std::vector<double, std::allocator<double> > > (__last=<optimized out>, __first=<optimized out>)
    at /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc530/external/gcc/5.3.0/include/c++/5.3.0/bits/stl_construct.h:151
#6  std::vector<std::vector<double, std::allocator<double> >, std::allocator<std::vector<double, std::allocator<double> > > >::~vector (this=0x2f972db0, __in_chrg=<optimized out>)
    at /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc530/external/gcc/5.3.0/include/c++/5.3.0/bits/stl_vector.h:424
#7  0x000003ff61095e00 in HGCalImagingAlgo::~HGCalImagingAlgo (this=0x2f972cc0, __in_chrg=<optimized out>)
    at /cvmfs/cms-ib.cern.ch/week1/slc7_aarch64_gcc530/cms/cmssw/CMSSW_9_2_X_2017-06-06-2300/src/RecoLocalCalo/HGCalRecAlgos/interface/HGCalImagingAlgo.h:123
#8  0x000003ff61095e94 in HGCalImagingAlgo::~HGCalImagingAlgo (this=0x2f972cc0, __in_chrg=<optimized out>)
    at /cvmfs/cms-ib.cern.ch/week1/slc7_aarch64_gcc530/cms/cmssw/CMSSW_9_2_X_2017-06-06-2300/src/RecoLocalCalo/HGCalRecAlgos/interface/HGCalImagingAlgo.h:124
#9  0x000003ff6109a274 in std::default_delete<HGCalImagingAlgo>::operator() (this=0x2f9725d8, __ptr=0x2f972cc0)
    at /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc530/external/gcc/5.3.0/include/c++/5.3.0/bits/unique_ptr.h:76
#10 0x000003ff610975cc in std::unique_ptr<HGCalImagingAlgo, std::default_delete<HGCalImagingAlgo> >::~unique_ptr (this=0x2f9725d8, __in_chrg=<optimized out>)
    at /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc530/external/gcc/5.3.0/include/c++/5.3.0/bits/unique_ptr.h:236
#11 0x000003ff61096354 in HGCalClusterProducer::~HGCalClusterProducer (this=0x2f972480, __in_chrg=<optimized out>)
    at /build/cmsbld/x/CMSSW_9_2_X_2017-06-06-2300/src/RecoLocalCalo/HGCalRecProducers/plugins/HGCalClusterProducer.cc:35
#12 0x000003ff61096398 in HGCalClusterProducer::~HGCalClusterProducer (this=0x2f972480, __in_chrg=<optimized out>)
    at /build/cmsbld/x/CMSSW_9_2_X_2017-06-06-2300/src/RecoLocalCalo/HGCalRecProducers/plugins/HGCalClusterProducer.cc:35
#13 0x000003ff896cacec in edm::stream::ProducingModuleAdaptorBase<edm::stream::EDProducerBase>::~ProducingModuleAdaptorBase() ()
   from /cvmfs/cms-ib.cern.ch/week1/slc7_aarch64_gcc530/cms/cmssw/CMSSW_9_2_X_2017-06-06-2300/lib/slc7_aarch64_gcc530/libFWCoreFramework.so
#14 0x000003ff610912f8 in edm::stream::EDProducerAdaptorBase::~EDProducerAdaptorBase (this=0x2f96b5e0, __in_chrg=<optimized out>)
    at /cvmfs/cms-ib.cern.ch/week1/slc7_aarch64_gcc530/cms/cmssw/CMSSW_9_2_X_2017-06-06-2300/src/FWCore/Framework/interface/stream/EDProducerAdaptorBase.h:47
#15 0x000003ff610a2178 in edm::stream::ProducingModuleAdaptor<HGCalClusterProducer, edm::stream::EDProducerBase, edm::stream::EDProducerAdaptorBase>::~ProducingModuleAdaptor (this=0x2f96b5e0, __in_chrg=<optimized out>)
    at /cvmfs/cms-ib.cern.ch/week1/slc7_aarch64_gcc530/cms/cmssw/CMSSW_9_2_X_2017-06-06-2300/src/FWCore/Framework/interface/stream/ProducingModuleAdaptor.h:53
#16 0x000003ff610a21ac in edm::stream::ProducingModuleAdaptor<HGCalClusterProducer, edm::stream::EDProducerBase, edm::stream::EDProducerAdaptorBase>::~ProducingModuleAdaptor (this=0x2f96b5e0, __in_chrg=<optimized out>)
    at /cvmfs/cms-ib.cern.ch/week1/slc7_aarch64_gcc530/cms/cmssw/CMSSW_9_2_X_2017-06-06-2300/src/FWCore/Framework/interface/stream/ProducingModuleAdaptor.h:53
#17 0x000003ff717bbfc0 in std::_Sp_counted_ptr_inplace<edm::maker::ModuleHolderT<edm::stream::EDProducerAdaptorBase>, std::allocator<edm::maker::ModuleHolderT<edm::stream::EDProducerAdaptorBase> >, (__gnu_cxx::_Lock_policy)2>::_M_dispose() () from /cvmfs/cms-ib.cern.ch/week1/slc7_aarch64_gcc530/cms/cmssw/CMSSW_9_2_X_2017-06-06-2300/lib/slc7_aarch64_gcc530/pluginRecoBTagCombinedPlugins.so
#18 0x000003ff895c1250 in std::_Rb_tree<std::string, std::pair<std::string const, edm::propagate_const<std::shared_ptr<edm::maker::ModuleHolder> > >, std::_Select1st<std::pair<std::string const, edm::propagate_const<std::shared_ptr<edm::maker::ModuleHolder> > > >, std::less<std::string>, std::allocator<std::pair<std::string const, edm::propagate_const<std::shared_ptr<edm::maker::ModuleHolder> > > > >::_M_erase(std::_Rb_tree_node<std::pair<std::string const, edm::propagate_const<std::shared_ptr<edm::maker::ModuleHolder> > > >*) ()
   from /cvmfs/cms-ib.cern.ch/week1/slc7_aarch64_gcc530/cms/cmssw/CMSSW_9_2_X_2017-06-06-2300/lib/slc7_aarch64_gcc530/libFWCoreFramework.so
#19 0x000003ff895c117c in std::_Rb_tree<std::string, std::pair<std::string const, edm::propagate_const<std::shared_ptr<edm::maker::ModuleHolder> > >, std::_Select1st<std::pair<std::string const, edm::propagate_const<std::shared_ptr<edm::maker::ModuleHolder> > > >, std::less<std::string>, std::allocator<std::pair<std::string const, edm::propagate_const<std::shared_ptr<edm::maker::ModuleHolder> > > > >::_M_erase(std::_Rb_tree_node<std::pair<std::string const, edm::propagate_const<std::shared_ptr<edm::maker::ModuleHolder> > > >*) ()
   from /cvmfs/cms-ib.cern.ch/week1/slc7_aarch64_gcc530/cms/cmssw/CMSSW_9_2_X_2017-06-06-2300/lib/slc7_aarch64_gcc530/libFWCoreFramework.so
#20 0x000003ff895c12b8 in std::_Sp_counted_ptr<edm::ModuleRegistry*, (__gnu_cxx::_Lock_policy)2>::_M_dispose() ()
   from /cvmfs/cms-ib.cern.ch/week1/slc7_aarch64_gcc530/cms/cmssw/CMSSW_9_2_X_2017-06-06-2300/lib/slc7_aarch64_gcc530/libFWCoreFramework.so
#21 0x00000000004120c8 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() ()
#22 0x000003ff896623b8 in std::default_delete<edm::Schedule>::operator()(edm::Schedule*) const [clone .isra.814] ()
   from /cvmfs/cms-ib.cern.ch/week1/slc7_aarch64_gcc530/cms/cmssw/CMSSW_9_2_X_2017-06-06-2300/lib/slc7_aarch64_gcc530/libFWCoreFramework.so
#23 0x000003ff89669e28 in edm::EventProcessor::~EventProcessor() () from /cvmfs/cms-ib.cern.ch/week1/slc7_aarch64_gcc530/cms/cmssw/CMSSW_9_2_X_2017-06-06-2300/lib/slc7_aarch64_gcc530/libFWCoreFramework.so
#24 0x000003ff8966a27c in edm::EventProcessor::~EventProcessor() () from /cvmfs/cms-ib.cern.ch/week1/slc7_aarch64_gcc530/cms/cmssw/CMSSW_9_2_X_2017-06-06-2300/lib/slc7_aarch64_gcc530/libFWCoreFramework.so
#25 0x000000000040d3a8 in (anonymous namespace)::EventProcessorWithSentry::~EventProcessorWithSentry() () 

The failure appears to start in the destructor of HGCalClusterProducer (which is empty), but
following the trace further, it points to a problem with the disposal of
https://github.com/cms-sw/cmssw/blob/master/RecoLocalCalo/HGCalRecProducers/plugins/HGCalClusterProducer.cc#L47
where a nested vector structure is not deleted properly.
@Dr15Jones @clelange

@cmsbuild
Contributor

cmsbuild commented Jun 9, 2017

A new Issue was created by @mrodozov .

@davidlange6, @Dr15Jones, @smuzaffar can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@Dr15Jones
Contributor

assign reconstruction, upgrade

@cmsbuild
Contributor

New categories assigned: upgrade,reconstruction

@kpedro88,@slava77,@perrotta you have been requested to review this Pull request/Issue and eventually sign? Thanks

@Dr15Jones
Contributor

I'm running valgrind on step 3 of workflow 27034.0. The job hasn't finished yet, but it has already found:

==808920== Invalid read of size 8
==808920==    at 0x6479DA60: HGCalImagingAlgo::populate(edm::SortedCollection<HGCRecHit, edm::StrictWeakOrdering<HGCRecHit> > const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc530/cms/cmssw/CMSSW_9_2_X_2017-06-06-2300/lib/slc6_amd64_gcc530/libRecoLocalCaloHGCalRecAlgos.so)
==808920==    by 0x64776093: HGCalClusterProducer::produce(edm::Event&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc530/cms/cmssw/CMSSW_9_2_X_2017-06-06-2300/lib/slc6_amd64_gcc530/pluginRecoLocalCaloHGCalRecProducersPlugins.so)

==808920==  Address 0x13c7cc900 is 0 bytes after a block of size 6,352 alloc'd
==808920==    at 0x40271C6: operator new(unsigned long) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc530/external/valgrind/3.12.0-oenich/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==808920==    by 0x5E0F5D50: std::vector<std::vector<double, std::allocator<double> >, std::allocator<std::vector<double, std::allocator<double> > > >::_M_fill_insert(__gnu_cxx::__normal_iterator<std::vector<double, std::allocator<double> >*, std::vector<std::vector<double, std::allocator<double> >, std::allocator<std::vector<double, std::allocator<double> > > > >, unsigned long, std::vector<double, std::allocator<double> > const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc530/cms/cmssw/CMSSW_9_2_X_2017-06-06-2300/lib/slc6_amd64_gcc530/pluginDQMOfflineMuon.so)
==808920==    by 0x6479D60A: HGCalImagingAlgo::computeThreshold() (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc530/cms/cmssw/CMSSW_9_2_X_2017-06-06-2300/lib/slc6_amd64_gcc530/libRecoLocalCaloHGCalRecAlgos.so)
==808920==    by 0x6479DAEC: HGCalImagingAlgo::populate(edm::SortedCollection<HGCRecHit, edm::StrictWeakOrdering<HGCRecHit> > const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc530/cms/cmssw/CMSSW_9_2_X_2017-06-06-2300/lib/slc6_amd64_gcc530/libRecoLocalCaloHGCalRecAlgos.so)
==808920==    by 0x64776060: HGCalClusterProducer::produce(edm::Event&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc530/cms/cmssw/CMSSW_9_2_X_2017-06-06-2300/lib/slc6_amd64_gcc530/pluginRecoLocalCaloHGCalRecProducersPlugins.so)

@mrodozov
Contributor Author

If this helps: makeClusters was replaced with populate here https://github.com/cms-sw/cmssw/pull/18236/files#diff-b09c179fedcd894f76956c03c04d943cR132. It seems that the call to populate inside produce is where something goes wrong.

@slava77
Contributor

slava77 commented Jun 11, 2017

@rovere @felicepantaleo @clelange please follow up for this HGCal issue.
Thank you.

@mrodozov please change the title of this issue to be more descriptive of the problem (e.g. "SIGSEGV in HGCalImagingAlgo").

@slava77
Contributor

slava77 commented Jun 11, 2017

Also, for the record, we need some instructions to reproduce.
I suspect that the shortened links to IBs will go dead in a week or so.

@Dr15Jones
Contributor

The valgrind log has

==808920== Invalid write of size 8
==808920==    at 0x6479D36F: HGCalImagingAlgo::computeThreshold() (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc530/cms/cmssw/CMSSW_9_2_X_2017-06-06-2300/lib/slc6_amd64_gcc530/libRecoLocalCaloHGCalRecAlgos.so)
==808920==    by 0x6479DAEC: HGCalImagingAlgo::populate(edm::SortedCollection<HGCRecHit, edm::StrictWeakOrdering<HGCRecHit> > const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc530/cms/cmssw/CMSSW_9_2_X_2017-06-06-2300/lib/slc6_amd64_gcc530/libRecoLocalCaloHGCalRecAlgos.so)
==808920==    by 0x64776060: HGCalClusterProducer::produce(edm::Event&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc530/cms/cmssw/CMSSW_9_2_X_2017-06-06-2300/lib/slc6_amd64_gcc530/pluginRecoLocalCaloHGCalRecProducersPlugins.so)

==808920==  Address 0x13656fa20 is 0 bytes after a block of size 6,352 alloc'd
==808920==    at 0x40271C6: operator new(unsigned long) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc530/external/valgrind/3.12.0-oenich/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==808920==    by 0x5E0F5D50: std::vector<std::vector<double, std::allocator<double> >, std::allocator<std::vector<double, std::allocator<double> > > >::_M_fill_insert(__gnu_cxx::__normal_iterator<std::vector<double, std::allocator<double> >*, std::vector<std::vector<double, std::allocator<double> >, std::allocator<std::vector<double, std::allocator<double> > > > >, unsigned long, std::vector<double, std::allocator<double> > const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc530/cms/cmssw/CMSSW_9_2_X_2017-06-06-2300/lib/slc6_amd64_gcc530/pluginDQMOfflineMuon.so)
==808920==    by 0x6479D60A: HGCalImagingAlgo::computeThreshold() (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc530/cms/cmssw/CMSSW_9_2_X_2017-06-06-2300/lib/slc6_amd64_gcc530/libRecoLocalCaloHGCalRecAlgos.so)
==808920==    by 0x6479DAEC: HGCalImagingAlgo::populate(edm::SortedCollection<HGCRecHit, edm::StrictWeakOrdering<HGCRecHit> > const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc530/cms/cmssw/CMSSW_9_2_X_2017-06-06-2300/lib/slc6_amd64_gcc530/libRecoLocalCaloHGCalRecAlgos.so)
==808920==    by 0x64776060: HGCalClusterProducer::produce(edm::Event&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc530/cms/cmssw/CMSSW_9_2_X_2017-06-06-2300/lib/slc6_amd64_gcc530/pluginRecoLocalCaloHGCalRecProducersPlugins.so)

and

==808920== Invalid write of size 8
==808920==    at 0x6479D380: HGCalImagingAlgo::computeThreshold() (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc530/cms/cmssw/CMSSW_9_2_X_2017-06-06-2300/lib/slc6_amd64_gcc530/libRecoLocalCaloHGCalRecAlgos.so)
==808920==    by 0x6479DAEC: HGCalImagingAlgo::populate(edm::SortedCollection<HGCRecHit, edm::StrictWeakOrdering<HGCRecHit> > const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc530/cms/cmssw/CMSSW_9_2_X_2017-06-06-2300/lib/slc6_amd64_gcc530/libRecoLocalCaloHGCalRecAlgos.so)
==808920==    by 0x64776060: HGCalClusterProducer::produce(edm::Event&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc530/cms/cmssw/CMSSW_9_2_X_2017-06-06-2300/lib/slc6_amd64_gcc530/pluginRecoLocalCaloHGCalRecProducersPlugins.so)

==808920==  Address 0x136a64e10 is 0 bytes after a block of size 6,352 alloc'd
==808920==    at 0x40271C6: operator new(unsigned long) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc530/external/valgrind/3.12.0-oenich/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==808920==    by 0x5E0F5D50: std::vector<std::vector<double, std::allocator<double> >, std::allocator<std::vector<double, std::allocator<double> > > >::_M_fill_insert(__gnu_cxx::__normal_iterator<std::vector<double, std::allocator<double> >*, std::vector<std::vector<double, std::allocator<double> >, std::allocator<std::vector<double, std::allocator<double> > > > >, unsigned long, std::vector<double, std::allocator<double> > const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc530/cms/cmssw/CMSSW_9_2_X_2017-06-06-2300/lib/slc6_amd64_gcc530/pluginDQMOfflineMuon.so)
==808920==    by 0x6479D5EA: HGCalImagingAlgo::computeThreshold() (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc530/cms/cmssw/CMSSW_9_2_X_2017-06-06-2300/lib/slc6_amd64_gcc530/libRecoLocalCaloHGCalRecAlgos.so)
==808920==    by 0x6479DAEC: HGCalImagingAlgo::populate(edm::SortedCollection<HGCRecHit, edm::StrictWeakOrdering<HGCRecHit> > const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc530/cms/cmssw/CMSSW_9_2_X_2017-06-06-2300/lib/slc6_amd64_gcc530/libRecoLocalCaloHGCalRecAlgos.so)
==808920==    by 0x64776060: HGCalClusterProducer::produce(edm::Event&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc530/cms/cmssw/CMSSW_9_2_X_2017-06-06-2300/lib/slc6_amd64_gcc530/pluginRecoLocalCaloHGCalRecProducersPlugins.so)

@Dr15Jones
Contributor

To reproduce, I created a work area for CMSSW_9_2_X_2017-06-06-2300 on a standard amd64 machine (slc6_amd64_gcc530) and then ran step 3 of workflow 27034.0. This does not crash, but valgrind does show problems.

@Dr15Jones
Contributor

My first guess is that the problem is on line 571 and/or 572:

thresholds[layer-1][wafer]=sigmaNoise*ecut;

probably because of an off-by-one error in the wafer numbering.

@Dr15Jones
Contributor

As a test I added the following to HGCalImagingAlgo.cc

assert(layer > 0);
assert( (layer -1) < static_cast<long>(thresholds.size()));
assert(layer -1 < static_cast<long>(v_sigmaNoise.size()));
assert(wafer < static_cast<long>(thresholds[layer-1].size()));
assert(wafer < static_cast<long>(v_sigmaNoise[layer-1].size()));

I then ran the job and it failed with

cmsRun: /uscms_data/d2/cdj/build/temp/crash/CMSSW_9_2_X_2017-06-06-2300/src/RecoLocalCalo/HGCalRecAlgos/src/HGCalImagingAlgo.cc:567: void HGCalImagingAlgo::computeThreshold(): Assertion `wafer < static_cast<long>(thresholds[layer-1].size())' failed.

So it does look like an off-by-one error in wafer.
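The same checks can be written so that they report which index is at fault instead of aborting. This is only a sketch: `checkThresholdIndex` and `requireThresholdIndex` are hypothetical helpers, not CMSSW code, and they assume the table is indexed as `table[layer - 1][wafer]` with `layer` starting at 1, as in the line quoted above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of CMSSW): names the first index that would be
// out of range for an access of the form table[layer - 1][wafer].
std::string checkThresholdIndex(const std::vector<std::vector<double>>& table,
                                long layer, long wafer) {
  if (layer <= 0) return "layer must be >= 1";
  if (layer - 1 >= static_cast<long>(table.size())) return "layer out of range";
  if (wafer < 0) return "wafer must be >= 0";
  if (wafer >= static_cast<long>(table[layer - 1].size())) return "wafer out of range";
  return "ok";
}

// Debug-build guard, equivalent in spirit to the asserts listed above.
inline void requireThresholdIndex(const std::vector<std::vector<double>>& table,
                                  long layer, long wafer) {
  assert(checkThresholdIndex(table, layer, wafer) == "ok");
}
```

With inner vectors of 794 entries, `wafer == 793` passes and `wafer == 794` is reported as out of range.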

@clelange
Contributor

Hi @Dr15Jones @mrodozov - sorry about the hassle. If it's just an off-by-one error for wafer, then changing

dummy.resize(maxNumberOfWafersPerLayer, 0);
to
dummy.resize(maxNumberOfWafersPerLayer+1, 0);
should fix it. If not, then the magic number in
static const unsigned int maxNumberOfWafersPerLayer = 794;
is wrong and we need to ask @bsunanda if anything changed.
I can have a look at that tomorrow; too many other things are going on today.
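Whether the `+1` is enough depends on the largest wafer index that actually occurs, which has not been established at this point in the thread. As a rough sketch of the reasoning (the helper names `requiredSize` and `fits` are made up for illustration): a vector of size n accepts indices 0 to n-1, so one extra slot only covers the case where the largest index equals the old size.

```cpp
#include <cstddef>

// Hypothetical helpers for illustration; not CMSSW code.
// The minimum vector size needed to hold a given largest index.
constexpr std::size_t requiredSize(std::size_t largestIndex) { return largestIndex + 1; }

// True if `index` is a valid position in a vector of `size` elements.
constexpr bool fits(std::size_t size, std::size_t index) { return index < size; }

static_assert(fits(794, 793), "last valid index for 794 slots");
static_assert(!fits(794, 794), "index 794 is one past the end of 794 slots");
static_assert(fits(794 + 1, 794), "one extra slot covers index 794 only");
static_assert(requiredSize(794) == 795, "index 794 needs 795 slots");
```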

@kpedro88
Contributor

I ran valgrind on 27434.0 after compiling RecoLocalCalo/HGCalRecAlgos with debug symbols, and got this:

==3155149== Invalid write of size 8
==3155149==    at 0x64D3B32F: HGCalImagingAlgo::computeThreshold() (HGCalImagingAlgo.cc:571)
==3155149==    by 0x64D3BAAC: HGCalImagingAlgo::populate(edm::SortedCollection<HGCRecHit, edm::StrictWeakOrdering<HGCRecHit> > const&) (HGCalImagingAlgo.cc:18)
==3155149==  Address 0xcda84350 is 0 bytes after a block of size 6,352 alloc'd
==3155149==    at 0x40271C6: operator new(unsigned long) (in /cvmfs/cms.cern.ch/slc6_amd64_gcc530/external/valgrind/3.12.0-oenich/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==3155149==    by 0x5E690970: std::vector<std::vector<double, std::allocator<double> >, std::allocator<std::vector<double, std::allocator<double> > > >::_M_fill_insert(__gnu_cxx::__normal_iterator<std::vector<double, std::allocator<double> >*, std::vector<std::vector<double, std::allocator<double> >, std::allocator<st
==3155149==    by 0x64D3B5CA: insert (stl_vector.h:1054)
==3155149==    by 0x64D3B5CA: resize (stl_vector.h:696)
==3155149==    by 0x64D3B5CA: HGCalImagingAlgo::computeThreshold() (HGCalImagingAlgo.cc:548)
==3155149==    by 0x64D3BAAC: HGCalImagingAlgo::populate(edm::SortedCollection<HGCRecHit, edm::StrictWeakOrdering<HGCRecHit> > const&) (HGCalImagingAlgo.cc:18)

==3155149== Invalid write of size 8
==3155149==    at 0x64D3B340: HGCalImagingAlgo::computeThreshold() (HGCalImagingAlgo.cc:572)
==3155149==    by 0x64D3BAAC: HGCalImagingAlgo::populate(edm::SortedCollection<HGCRecHit, edm::StrictWeakOrdering<HGCRecHit> > const&) (HGCalImagingAlgo.cc:18)
==3155149==  Address 0xcffc30d0 is 0 bytes after a block of size 6,352 alloc'd
==3155149==    at 0x40271C6: operator new(unsigned long) (in /cvmfs/cms.cern.ch/slc6_amd64_gcc530/external/valgrind/3.12.0-oenich/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==3155149==    by 0x5E690970: std::vector<std::vector<double, std::allocator<double> >, std::allocator<std::vector<double, std::allocator<double> > > >::_M_fill_insert(__gnu_cxx::__normal_iterator<std::vector<double, std::allocator<double> >*, std::vector<std::vector<double, std::allocator<double> >, std::allocator<st
==3155149==    by 0x64D3B5AA: insert (stl_vector.h:1054)
==3155149==    by 0x64D3B5AA: resize (stl_vector.h:696)
==3155149==    by 0x64D3B5AA: HGCalImagingAlgo::computeThreshold() (HGCalImagingAlgo.cc:549)
==3155149==    by 0x64D3BAAC: HGCalImagingAlgo::populate(edm::SortedCollection<HGCRecHit, edm::StrictWeakOrdering<HGCRecHit> > const&) (HGCalImagingAlgo.cc:18)

So it looks like @Dr15Jones had the right idea.
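The block size in these reports can be checked against the constant quoted earlier in the thread. A small sketch, assuming an 8-byte `double` (the names `kWafersPerLayer`, `kBlockBytes` and `firstIndexPastBlock` are illustrative): 794 doubles occupy exactly 6,352 bytes, so an 8-byte write "0 bytes after" such a block is consistent with an access at index 794 of a 794-element inner vector.

```cpp
#include <cstddef>

// Value quoted in the thread: maxNumberOfWafersPerLayer = 794.
constexpr std::size_t kWafersPerLayer = 794;
// Block size reported by valgrind ("a block of size 6,352").
constexpr std::size_t kBlockBytes = 6352;

static_assert(sizeof(double) == 8, "sketch assumes an 8-byte double");
static_assert(kWafersPerLayer * sizeof(double) == kBlockBytes,
              "794 doubles fill the reported block exactly");

// Index of the first element past the block, i.e. the element touched by an
// access that valgrind reports as "0 bytes after" the block.
constexpr std::size_t firstIndexPastBlock() { return kBlockBytes / sizeof(double); }
```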

@smuzaffar
Contributor

@clelange, using dummy.resize(maxNumberOfWafersPerLayer+1, 0) did not work. It still fails with the same core dump.

@smuzaffar
Contributor

@slava77, the crash is only visible on aarch64. To reproduce it, you need to log in to one of our arm64 build machines, create a 92X dev area, and run workflow 27034.0. I have created a cmsuser account on moonshot-arm64-13.cern.ch (I can send you the password by email).

@mrodozov changed the title from "SIGSEGV in RelVals for slc7_aarch64_gcc530 & slc7_aarch64_gcc700 (aarch64 only)" to "SIGSEGV in HGCalImagingAlgo present in RelVals for slc7_aarch64_gcc530 & slc7_aarch64_gcc700 (aarch64 only)" on Jun 11, 2017

@kpedro88
Contributor

@clelange I think you're correct that we need to ask @bsunanda for the correct "magic number" maxNumberOfWafersPerLayer*. I replaced the vector operator[] calls with .at(), so that an out-of-range access throws an exception that gdb can catch. When I initialize the vectors to a size of maxNumberOfWafersPerLayer+1, the exception still gets thrown with wafer = maxNumberOfWafersPerLayer + 1 = 795.

* It would be great to be able to get this number directly from the HGCal geometry/topology in a way that would enforce its correctness...
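The `.at()` substitution described above can be sketched as follows. `setThresholdChecked` is a hypothetical stand-in for the write in computeThreshold(), with an assumed signature; the only library behaviour relied on is that `std::vector::at` throws `std::out_of_range` for an invalid index, whereas `operator[]` with a bad index is undefined behaviour.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the threshold write; names are illustrative only.
// Returns true if the write succeeded, false if either index was out of range.
bool setThresholdChecked(std::vector<std::vector<double>>& thresholds,
                         std::size_t layer, std::size_t wafer, double value) {
  try {
    // .at() validates both indices. layer == 0 wraps around to a huge
    // unsigned value in layer - 1 and is rejected as well.
    thresholds.at(layer - 1).at(wafer) = value;
    return true;
  } catch (const std::out_of_range&) {
    return false;
  }
}
```

Under gdb, `catch throw` stops at the point where the exception is raised, which makes it possible to inspect the offending `layer` and `wafer` values.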

@davidlt
Contributor

davidlt commented Jun 12, 2017

I looked at workflow 27034.0 in CMSSW_9_2_ROOT6_X_2017-06-08-2300 (slc6_amd64_gcc700):

==30338== Invalid read of size 8
==30338== Invalid read of size 8
==30338== Invalid write of size 8
==30338== Invalid write of size 8
==30338== Invalid read of size 8
==30338==    at 0x717432D: __dynamic_cast (dyncast.cc:50)
==30338==    by 0x1B1EA38E: TMVA::Reader::~Reader() (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/lcg/root/6.09.04-opkfni/lib/libTMVA.so)
==30338==    by 0x60E49B75: PhotonMVAEstimatorRun2Spring16NonTrig::createSingleReader(int, edm::FileInPath const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/pluginRecoEgammaPhotonIdentificationPlugins.so)
==30338==    by 0x60E49FC6: PhotonMVAEstimatorRun2Spring16NonTrig::PhotonMVAEstimatorRun2Spring16NonTrig(edm::ParameterSet const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/pluginRecoEgammaPhotonIdentificationPlugins.so)
==30338==    by 0x60E4A3B0: edmplugin::PluginFactory<AnyMVAEstimatorRun2Base* (edm::ParameterSet const&)>::PMaker<PhotonMVAEstimatorRun2Spring16NonTrig>::create(edm::ParameterSet const&) const (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/pluginRecoEgammaPhotonIdentificationPlugins.so)
==30338==    by 0x2CFAE740: egamma::MVAObjectCache::MVAObjectCache(edm::ParameterSet const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/libRecoEgammaEgammaTools.so)
==30338==    by 0x60E446CE: edm::WorkerMaker<MVAValueMapProducer<reco::Photon> >::makeModule(edm::ParameterSet const&) const (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/pluginRecoEgammaPhotonIdentificationPlugins.so)
==30338==    by 0x4BABFA6: edm::Maker::makeModule(edm::MakeModuleParams const&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&) const (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/libFWCoreFramework.so)
==30338==    by 0x4B3DD06: edm::Factory::makeModule(edm::MakeModuleParams const&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&) const (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/libFWCoreFramework.so)
==30338==    by 0x4B52D8C: edm::ModuleRegistry::getModule(edm::MakeModuleParams const&, std::string const&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/libFWCoreFramework.so)
==30338==    by 0x4C08624: edm::WorkerRegistry::getWorker(edm::WorkerParams const&, std::string const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/libFWCoreFramework.so)
==30338==    by 0x4BC7605: edm::WorkerManager::getWorker(edm::ParameterSet&, edm::ProductRegistry&, edm::PreallocationConfiguration const*, std::shared_ptr<edm::ProcessConfiguration const>, std::string const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/libFWCoreFramework.so)

==30338== Invalid read of size 8
==30338==    at 0x7174363: __dynamic_cast (dyncast.cc:68)
==30338==    by 0x1B1EA38E: TMVA::Reader::~Reader() (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/lcg/root/6.09.04-opkfni/lib/libTMVA.so)
==30338==    by 0x60E49B75: PhotonMVAEstimatorRun2Spring16NonTrig::createSingleReader(int, edm::FileInPath const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/pluginRecoEgammaPhotonIdentificationPlugins.so)
==30338==    by 0x60E49FC6: PhotonMVAEstimatorRun2Spring16NonTrig::PhotonMVAEstimatorRun2Spring16NonTrig(edm::ParameterSet const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/pluginRecoEgammaPhotonIdentificationPlugins.so)
==30338==    by 0x60E4A3B0: edmplugin::PluginFactory<AnyMVAEstimatorRun2Base* (edm::ParameterSet const&)>::PMaker<PhotonMVAEstimatorRun2Spring16NonTrig>::create(edm::ParameterSet const&) const (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/pluginRecoEgammaPhotonIdentificationPlugins.so)
==30338==    by 0x2CFAE740: egamma::MVAObjectCache::MVAObjectCache(edm::ParameterSet const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/libRecoEgammaEgammaTools.so)
==30338==    by 0x60E446CE: edm::WorkerMaker<MVAValueMapProducer<reco::Photon> >::makeModule(edm::ParameterSet const&) const (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/pluginRecoEgammaPhotonIdentificationPlugins.so)
==30338==    by 0x4BABFA6: edm::Maker::makeModule(edm::MakeModuleParams const&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&) const (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/libFWCoreFramework.so)
==30338==    by 0x4B3DD06: edm::Factory::makeModule(edm::MakeModuleParams const&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&) const (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/libFWCoreFramework.so)
==30338==    by 0x4B52D8C: edm::ModuleRegistry::getModule(edm::MakeModuleParams const&, std::string const&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/libFWCoreFramework.so)
==30338==    by 0x4C08624: edm::WorkerRegistry::getWorker(edm::WorkerParams const&, std::string const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/libFWCoreFramework.so)
==30338==    by 0x4BC7605: edm::WorkerManager::getWorker(edm::ParameterSet&, edm::ProductRegistry&, edm::PreallocationConfiguration const*, std::shared_ptr<edm::ProcessConfiguration const>, std::string const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/libFWCoreFramework.so)

==30338== Invalid write of size 8
==30338==    at 0x60B454CA: HGCalImagingAlgo::computeThreshold() (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/libRecoLocalCaloHGCalRecAlgos.so)
==30338==    by 0x60B45E9C: HGCalImagingAlgo::populate(edm::SortedCollection<HGCRecHit, edm::StrictWeakOrdering<HGCRecHit> > const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/libRecoLocalCaloHGCalRecAlgos.so)
==30338==    by 0x60B1AD0D: HGCalClusterProducer::produce(edm::Event&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/pluginRecoLocalCaloHGCalRecProducersPlugins.so)
==30338==    by 0x4C5C202: edm::stream::EDProducerAdaptorBase::doEvent(edm::EventPrincipal const&, edm::EventSetup const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/libFWCoreFramework.so)
==30338==    by 0x4B90021: edm::WorkerT<edm::stream::EDProducerAdaptorBase>::implDo(edm::EventPrincipal const&, edm::EventSetup const&, edm::ModuleCallingContext const*) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/libFWCoreFramework.so)
==30338==    by 0x4B356C6: decltype ({parm#1}()) edm::convertException::wrap<bool edm::Worker::runModule<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::MyPrincipal const&, edm::EventSetup const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*)::{lambda()#1}>(bool edm::Worker::runModule<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::MyPrincipal const&, edm::EventSetup const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*)::{lambda()#1}) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/libFWCoreFramework.so)
==30338==    by 0x4B3587C: bool edm::Worker::runModule<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::MyPrincipal const&, edm::EventSetup const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/libFWCoreFramework.so)
==30338==    by 0x4B37265: void edm::Worker::runModuleAfterAsyncPrefetch<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(std::__exception_ptr::exception_ptr const*, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::MyPrincipal const&, edm::EventSetup const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/libFWCoreFramework.so)
==30338==    by 0x4B37730: edm::Worker::RunModuleTask<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >::execute() (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/libFWCoreFramework.so)
==30338==    by 0x64508AD: tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::local_wait_for_all(tbb::task&, tbb::task*) (custom_scheduler.h:501)
==30338==    by 0x4BECA03: edm::EventProcessor::readAndProcessEvent() (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/libFWCoreFramework.so)
==30338==    by 0x4B406EC: statemachine::HandleEvent::readAndProcessEvent() (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/libFWCoreFramework.so)

==30338== Invalid write of size 8
==30338==    at 0x60B454DA: HGCalImagingAlgo::computeThreshold() (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/libRecoLocalCaloHGCalRecAlgos.so)
==30338==    by 0x60B45E9C: HGCalImagingAlgo::populate(edm::SortedCollection<HGCRecHit, edm::StrictWeakOrdering<HGCRecHit> > const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/libRecoLocalCaloHGCalRecAlgos.so)
==30338==    by 0x60B1AD0D: HGCalClusterProducer::produce(edm::Event&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/pluginRecoLocalCaloHGCalRecProducersPlugins.so)
==30338==    by 0x4C5C202: edm::stream::EDProducerAdaptorBase::doEvent(edm::EventPrincipal const&, edm::EventSetup const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/libFWCoreFramework.so)
==30338==    by 0x4B90021: edm::WorkerT<edm::stream::EDProducerAdaptorBase>::implDo(edm::EventPrincipal const&, edm::EventSetup const&, edm::ModuleCallingContext const*) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/libFWCoreFramework.so)
==30338==    by 0x4B356C6: decltype ({parm#1}()) edm::convertException::wrap<bool edm::Worker::runModule<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::MyPrincipal const&, edm::EventSetup const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*)::{lambda()#1}>(bool edm::Worker::runModule<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::MyPrincipal const&, edm::EventSetup const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*)::{lambda()#1}) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/libFWCoreFramework.so)
==30338==    by 0x4B3587C: bool edm::Worker::runModule<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::MyPrincipal const&, edm::EventSetup const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/libFWCoreFramework.so)
==30338==    by 0x4B37265: void edm::Worker::runModuleAfterAsyncPrefetch<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(std::__exception_ptr::exception_ptr const*, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::MyPrincipal const&, edm::EventSetup const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/libFWCoreFramework.so)
==30338==    by 0x4B37730: edm::Worker::RunModuleTask<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >::execute() (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/libFWCoreFramework.so)
==30338==    by 0x64508AD: tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::local_wait_for_all(tbb::task&, tbb::task*) (custom_scheduler.h:501)
==30338==    by 0x4BECA03: edm::EventProcessor::readAndProcessEvent() (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/libFWCoreFramework.so)
==30338==    by 0x4B406EC: statemachine::HandleEvent::readAndProcessEvent() (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc6_amd64_gcc700/cms/cmssw-patch/CMSSW_9_2_ROOT6_X_2017-06-08-2300/lib/slc6_amd64_gcc700/libFWCoreFramework.so)

The same workflow on AArch64 with CMSSW_9_2_ROOT6_X_2017-06-05-2300 produces 40 invalid reads/writes. Some of them are shown here:

==19069== Invalid read of size 8
==19069==    at 0x6DDF0FC: __dynamic_cast (dyncast.cc:50)
==19069==    by 0x1A2CB053: TMVA::Reader::~Reader() (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/lcg/root/6.09.04-opkfni/lib/libTMVA.so)
==19069==    by 0x619562EF: PhotonMVAEstimatorRun2Spring16NonTrig::createSingleReader(int, edm::FileInPath const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/pluginRecoEgammaPhotonIdentificationPlugins.so)
==19069==    by 0x61956757: PhotonMVAEstimatorRun2Spring16NonTrig::PhotonMVAEstimatorRun2Spring16NonTrig(edm::ParameterSet const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/pluginRecoEgammaPhotonIdentificationPlugins.so)
==19069==    by 0x61956B9B: edmplugin::PluginFactory<AnyMVAEstimatorRun2Base* (edm::ParameterSet const&)>::PMaker<PhotonMVAEstimatorRun2Spring16NonTrig>::create(edm::ParameterSet const&) const (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/pluginRecoEgammaPhotonIdentificationPlugins.so)
==19069==    by 0x2BAE79AB: egamma::MVAObjectCache::MVAObjectCache(edm::ParameterSet const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libRecoEgammaEgammaTools.so)
==19069==    by 0x6195187F: edm::WorkerMaker<MVAValueMapProducer<reco::Photon> >::makeModule(edm::ParameterSet const&) const (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/pluginRecoEgammaPhotonIdentificationPlugins.so)
==19069==    by 0x4A0F20F: edm::Maker::makeModule(edm::MakeModuleParams const&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&) const (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x49BE85F: edm::Factory::makeModule(edm::MakeModuleParams const&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&) const (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x4AA560B: edm::ModuleRegistry::getModule(edm::MakeModuleParams const&, std::string const&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x4A6783B: edm::WorkerRegistry::getWorker(edm::WorkerParams const&, std::string const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x4A2B447: edm::WorkerManager::getWorker(edm::ParameterSet&, edm::ProductRegistry&, edm::PreallocationConfiguration const*, std::shared_ptr<edm::ProcessConfiguration const>, std::string const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)

==19069== Invalid read of size 8
==19069==    at 0x6DDF114: __dynamic_cast (dyncast.cc:68)
==19069==    by 0x1A2CB053: TMVA::Reader::~Reader() (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/lcg/root/6.09.04-opkfni/lib/libTMVA.so)
==19069==    by 0x619562EF: PhotonMVAEstimatorRun2Spring16NonTrig::createSingleReader(int, edm::FileInPath const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/pluginRecoEgammaPhotonIdentificationPlugins.so)
==19069==    by 0x61956757: PhotonMVAEstimatorRun2Spring16NonTrig::PhotonMVAEstimatorRun2Spring16NonTrig(edm::ParameterSet const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/pluginRecoEgammaPhotonIdentificationPlugins.so)
==19069==    by 0x61956B9B: edmplugin::PluginFactory<AnyMVAEstimatorRun2Base* (edm::ParameterSet const&)>::PMaker<PhotonMVAEstimatorRun2Spring16NonTrig>::create(edm::ParameterSet const&) const (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/pluginRecoEgammaPhotonIdentificationPlugins.so)
==19069==    by 0x2BAE79AB: egamma::MVAObjectCache::MVAObjectCache(edm::ParameterSet const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libRecoEgammaEgammaTools.so)
==19069==    by 0x6195187F: edm::WorkerMaker<MVAValueMapProducer<reco::Photon> >::makeModule(edm::ParameterSet const&) const (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/pluginRecoEgammaPhotonIdentificationPlugins.so)
==19069==    by 0x4A0F20F: edm::Maker::makeModule(edm::MakeModuleParams const&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&) const (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x49BE85F: edm::Factory::makeModule(edm::MakeModuleParams const&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&) const (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x4AA560B: edm::ModuleRegistry::getModule(edm::MakeModuleParams const&, std::string const&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x4A6783B: edm::WorkerRegistry::getWorker(edm::WorkerParams const&, std::string const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x4A2B447: edm::WorkerManager::getWorker(edm::ParameterSet&, edm::ProductRegistry&, edm::PreallocationConfiguration const*, std::shared_ptr<edm::ProcessConfiguration const>, std::string const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)

==19069== Invalid write of size 8
==19069==    at 0x5A95CC90: ROOT::Math::SVector<double, 6u>& ROOT::Math::SVector<double, 6u>::operator=<ROOT::Math::VectorMatrixRowOp<ROOT::Math::SMatrix<double, 6u, 6u, ROOT::Math::MatRepSym<double, 6u> >, ROOT::Math::SVector<double, 6u>, 6u> >(ROOT::Math::VecExpr<ROOT::Math::VectorMatrixRowOp<ROOT::Math::SMatrix<double, 6u, 6u, ROOT::Math::MatRepSym<double, 6u> >, ROOT::Math::SVector<double, 6u>, 6u>, double, 6u> const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libRecoEgammaEgammaPhotonAlgos.so)
==19069==    by 0x5A96188B: KinematicConstrainedVertexUpdatorT<2, 2>::update(ROOT::Math::SVector<double, 17u> const&, ROOT::Math::SMatrix<double, 17u, 17u, ROOT::Math::MatRepSym<double, 17u> >&, std::vector<KinematicState, std::allocator<KinematicState> >&, Point3DBase<float, GlobalTag> const&, Vector3DBase<float, GlobalTag> const&, MultiTrackKinematicConstraintT<2, 2>*) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libRecoEgammaEgammaPhotonAlgos.so)
==19069==    by 0x5A96397F: KinematicConstrainedVertexFitterT<2, 2>::fit(std::vector<ReferenceCountingPointer<KinematicParticle>, std::allocator<ReferenceCountingPointer<KinematicParticle> > > const&, MultiTrackKinematicConstraintT<2, 2>*, Point3DBase<float, GlobalTag>*) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libRecoEgammaEgammaPhotonAlgos.so)
==19069==    by 0x5A959FAB: ConversionVertexFinder::run(std::vector<reco::TransientTrack, std::allocator<reco::TransientTrack> > const&, reco::Vertex&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libRecoEgammaEgammaPhotonAlgos.so)
==19069==    by 0x61DF460B: ConversionProducer::checkVertex(reco::TransientTrack const&, reco::TransientTrack const&, MagneticField const*, reco::Vertex&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/pluginRecoEgammaEgammaPhotonProducers.so)
==19069==    by 0x61DF6A37: ConversionProducer::buildCollection(edm::Event&, edm::EventSetup const&, std::multimap<float, edm::Ptr<reco::ConversionTrack>, std::less<float>, std::allocator<std::pair<float const, edm::Ptr<reco::ConversionTrack> > > > const&, std::multimap<double, edm::Ptr<reco::CaloCluster>, std::less<double>, std::allocator<std::pair<double const, edm::Ptr<reco::CaloCluster> > > > const&, std::multimap<double, edm::Ptr<reco::CaloCluster>, std::less<double>, std::allocator<std::pair<double const, edm::Ptr<reco::CaloCluster> > > > const&, reco::Vertex const&, std::vector<reco::Conversion, std::allocator<reco::Conversion> >&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/pluginRecoEgammaEgammaPhotonProducers.so)
==19069==    by 0x61DF8DE7: ConversionProducer::produce(edm::Event&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/pluginRecoEgammaEgammaPhotonProducers.so)
==19069==    by 0x4ABA123: edm::stream::EDProducerAdaptorBase::doEvent(edm::EventPrincipal const&, edm::EventSetup const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x4A9F063: edm::WorkerT<edm::stream::EDProducerAdaptorBase>::implDo(edm::EventPrincipal const&, edm::EventSetup const&, edm::ModuleCallingContext const*) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x498B8F7: decltype ({parm#1}()) edm::convertException::wrap<bool edm::Worker::runModule<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::MyPrincipal const&, edm::EventSetup const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*)::{lambda()#1}>(bool edm::Worker::runModule<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::MyPrincipal const&, edm::EventSetup const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*)::{lambda()#1}) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x498BAE3: bool edm::Worker::runModule<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::MyPrincipal const&, edm::EventSetup const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x498BEBB: void edm::Worker::runModuleAfterAsyncPrefetch<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(std::__exception_ptr::exception_ptr const*, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::MyPrincipal const&, edm::EventSetup const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==  Address 0x1fff000ee0 is on thread 1's stack
==19069==  64 bytes below stack pointer
==19069==
==19069== Invalid write of size 8
==19069==    at 0x7DE2006C: ???
==19069==  Address 0x1fff002d10 is on thread 1's stack
==19069==  32 bytes below stack pointer
==19069==
==19069== Invalid write of size 8
==19069==    at 0x76C400E0: ???
==19069==  Address 0x1fff002d10 is on thread 1's stack
==19069==  32 bytes below stack pointer
==19069==
==19069== Invalid write of size 8
==19069==    at 0x7DE0006C: ???
==19069==  Address 0x1fff002d10 is on thread 1's stack
==19069==  32 bytes below stack pointer
==19069==

// The stack may have been corrupted here

==19069== Invalid write of size 8
==19069==    at 0x5BA9D5F0: HcalSimHitStudy::analyzeHits(std::vector<PCaloHit, std::allocator<PCaloHit> >&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/biglib/slc7_aarch64_gcc700/pluginSimulation.so)
==19069==    by 0x5BA9E45B: HcalSimHitStudy::analyze(edm::Event const&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/biglib/slc7_aarch64_gcc700/pluginSimulation.so)
==19069==    by 0x4ABDCEB: edm::stream::EDAnalyzerAdaptorBase::doEvent(edm::EventPrincipal const&, edm::EventSetup const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x4A9F333: edm::WorkerT<edm::stream::EDAnalyzerAdaptorBase>::implDo(edm::EventPrincipal const&, edm::EventSetup const&, edm::ModuleCallingContext const*) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x498B8F7: decltype ({parm#1}()) edm::convertException::wrap<bool edm::Worker::runModule<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::MyPrincipal const&, edm::EventSetup const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*)::{lambda()#1}>(bool edm::Worker::runModule<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::MyPrincipal const&, edm::EventSetup const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*)::{lambda()#1}) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x498BAE3: bool edm::Worker::runModule<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::MyPrincipal const&, edm::EventSetup const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x498BEBB: void edm::Worker::runModuleAfterAsyncPrefetch<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(std::__exception_ptr::exception_ptr const*, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::MyPrincipal const&, edm::EventSetup const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x498C3DB: edm::Worker::RunModuleTask<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >::execute() (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x67AF6B7: tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::local_wait_for_all(tbb::task&, tbb::task*) (custom_scheduler.h:501)
==19069==    by 0x4A541C3: edm::EventProcessor::readAndProcessEvent() (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x49A760F: statemachine::HandleEvent::readAndProcessEvent() (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x49AA243: statemachine::HandleEvent::HandleEvent(boost::statechart::state<statemachine::HandleEvent, statemachine::HandleLumis, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::my_context) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)

==19069== Invalid write of size 8
==19069==    at 0x5BA9D684: HcalSimHitStudy::analyzeHits(std::vector<PCaloHit, std::allocator<PCaloHit> >&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/biglib/slc7_aarch64_gcc700/pluginSimulation.so)
==19069==    by 0x5BA9E45B: HcalSimHitStudy::analyze(edm::Event const&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/biglib/slc7_aarch64_gcc700/pluginSimulation.so)
==19069==    by 0x4ABDCEB: edm::stream::EDAnalyzerAdaptorBase::doEvent(edm::EventPrincipal const&, edm::EventSetup const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x4A9F333: edm::WorkerT<edm::stream::EDAnalyzerAdaptorBase>::implDo(edm::EventPrincipal const&, edm::EventSetup const&, edm::ModuleCallingContext const*) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x498B8F7: decltype ({parm#1}()) edm::convertException::wrap<bool edm::Worker::runModule<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::MyPrincipal const&, edm::EventSetup const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*)::{lambda()#1}>(bool edm::Worker::runModule<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::MyPrincipal const&, edm::EventSetup const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*)::{lambda()#1}) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x498BAE3: bool edm::Worker::runModule<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::MyPrincipal const&, edm::EventSetup const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x498BEBB: void edm::Worker::runModuleAfterAsyncPrefetch<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(std::__exception_ptr::exception_ptr const*, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::MyPrincipal const&, edm::EventSetup const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x498C3DB: edm::Worker::RunModuleTask<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >::execute() (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x67AF6B7: tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::local_wait_for_all(tbb::task&, tbb::task*) (custom_scheduler.h:501)
==19069==    by 0x4A541C3: edm::EventProcessor::readAndProcessEvent() (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x49A760F: statemachine::HandleEvent::readAndProcessEvent() (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x49AA243: statemachine::HandleEvent::HandleEvent(boost::statechart::state<statemachine::HandleEvent, statemachine::HandleLumis, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::my_context) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)

// HcalSimHitStudy::analyzeHits appears repeatedly in the log (for both reads and writes)

==19069== Invalid read of size 8
==19069==    at 0x5BA9DB2C: HcalSimHitStudy::analyzeHits(std::vector<PCaloHit, std::allocator<PCaloHit> >&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/biglib/slc7_aarch64_gcc700/pluginSimulation.so)
==19069==    by 0x5BA9E45B: HcalSimHitStudy::analyze(edm::Event const&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/biglib/slc7_aarch64_gcc700/pluginSimulation.so)
==19069==    by 0x4ABDCEB: edm::stream::EDAnalyzerAdaptorBase::doEvent(edm::EventPrincipal const&, edm::EventSetup const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x4A9F333: edm::WorkerT<edm::stream::EDAnalyzerAdaptorBase>::implDo(edm::EventPrincipal const&, edm::EventSetup const&, edm::ModuleCallingContext const*) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x498B8F7: decltype ({parm#1}()) edm::convertException::wrap<bool edm::Worker::runModule<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::MyPrincipal const&, edm::EventSetup const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*)::{lambda()#1}>(bool edm::Worker::runModule<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::MyPrincipal const&, edm::EventSetup const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*)::{lambda()#1}) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x498BAE3: bool edm::Worker::runModule<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::MyPrincipal const&, edm::EventSetup const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x498BEBB: void edm::Worker::runModuleAfterAsyncPrefetch<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(std::__exception_ptr::exception_ptr const*, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::MyPrincipal const&, edm::EventSetup const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x498C3DB: edm::Worker::RunModuleTask<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >::execute() (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x67AF6B7: tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::local_wait_for_all(tbb::task&, tbb::task*) (custom_scheduler.h:501)
==19069==    by 0x4A541C3: edm::EventProcessor::readAndProcessEvent() (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x49A760F: statemachine::HandleEvent::readAndProcessEvent() (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)
==19069==    by 0x49AA243: statemachine::HandleEvent::HandleEvent(boost::statechart::state<statemachine::HandleEvent, statemachine::HandleLumis, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::my_context) (in /cvmfs/cms-ib.cern.ch/nweek-02475/slc7_aarch64_gcc700/cms/cmssw/CMSSW_9_2_ROOT6_X_2017-06-05-2300/lib/slc7_aarch64_gcc700/libFWCoreFramework.so)

@bsunanda
Contributor

Wafer numbers 0..795 are present for FH, so setting the maximum to 796 should help. I am trying to add a helper function in the geometry that can provide the maximum wafer number for a given configuration.
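
For illustration only (this is not the code from the PR): a minimal sketch of why the maximum has to be 796 when wafer numbers run from 0 to 795. A container indexed directly by wafer number needs one more slot than its highest index. The names `kMaxWaferIndexFH`, `slotsNeeded` and `makePerWaferStorage` are hypothetical and do not exist in CMSSW.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical constant: the highest FH wafer number quoted above.
constexpr std::size_t kMaxWaferIndexFH = 795;

// A container indexed directly by wafer number needs (highest index + 1) slots.
inline std::size_t slotsNeeded(std::size_t maxIndex) { return maxIndex + 1; }

// Hypothetical per-wafer storage, zero-initialised, one slot per wafer number.
inline std::vector<double> makePerWaferStorage(std::size_t maxIndex) {
  std::vector<double> storage(slotsNeeded(maxIndex), 0.0);
  assert(storage.size() == maxIndex + 1);  // last valid index is maxIndex
  return storage;
}
```

Sizing the storage with the highest index itself (795) instead of `slotsNeeded(795)` would leave wafer 795 one element past the end, which is the kind of overwrite that can corrupt the heap without failing on every platform.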

@bsunanda
Contributor

Submitted a PR with a hardwired number, soon to be replaced by a number derived from the geometry.

@kpedro88
Contributor

I confirm that #19198 from @bsunanda does not have any out-of-range exceptions when I run workflow 27434.0 replacing []s with .at()s.
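
As a rough sketch of the check described here (not the actual HGCalImagingAlgo code): `std::vector::operator[]` does no bounds checking, so an out-of-range index is undefined behaviour, while `at()` throws `std::out_of_range`. The helper names `uncheckedRead` and `readIsOutOfRange` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// operator[] performs no bounds check; the caller must guarantee i < v.size().
inline double uncheckedRead(const std::vector<double>& v, std::size_t i) {
  assert(i < v.size());  // precondition, only checked when NDEBUG is not defined
  return v[i];
}

// at() checks the index and throws std::out_of_range when i >= v.size().
inline bool readIsOutOfRange(const std::vector<double>& v, std::size_t i) {
  try {
    (void)v.at(i);
    return false;
  } catch (const std::out_of_range&) {
    return true;
  }
}
```

Temporarily swapping `[]` for `.at()` turns a silent out-of-bounds access into an exception that points at the offending index, which makes it a cheap way to confirm a fix in a private work area.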

@rovere
Contributor

rovere commented Jun 13, 2017 via email

@smuzaffar
Contributor

I confirm that the 4 failing workflows on aarch64 run without crashing with #19198:

27034.0_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2023D16_GenSimHLBeamSpotFull14+DigiFullTrigger_2023D16+RecoFullGlobal_2023D16+HARVESTFullGlobal_2023D16 Step0-PASSED Step1-PASSED Step2-PASSED Step3-PASSED  - time date Tue Jun 13 08:51:06 2017-date Tue Jun 13 08:03:32 2017; exit: 0 0 0 0
27034.2_TTbar_14TeV_Timing+TTbar_14TeV_TuneCUETP8M1_2023D16_GenSimHLBeamSpotFull14_Timing+DigiFullTrigger_Timing_2023D16+RecoFullGlobal_Timing_2023D16+HARVESTFullGlobal_Timing_2023D16 Step0-PASSED Step1-PASSED Step2-PASSED Step3-PASSED  - time date Tue Jun 13 08:51:06 2017-date Tue Jun 13 08:03:37 2017; exit: 0 0 0 0
27434.0_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2023D17_GenSimHLBeamSpotFull14+DigiFullTrigger_2023D17+RecoFullGlobal_2023D17+HARVESTFullGlobal_2023D17 Step0-PASSED Step1-PASSED Step2-PASSED Step3-PASSED  - time date Tue Jun 13 08:51:05 2017-date Tue Jun 13 08:03:40 2017; exit: 0 0 0 0
3 3 3 3 tests passed, 0 0 0 0 failed

@davidlt
Contributor

davidlt commented Jun 13, 2017 via email

@kpedro88
Contributor

@rovere the official code still uses []s - the change to .at()s was just in my work area.

@davidlt if it solves the observed crash, I think it satisfies this issue - another issue could be opened for the other potential problems you found.

@smuzaffar
Contributor

PR #19207 should fix the TMVA::Reader invalid read issue we have seen here #19179 (comment)

cmsbuild added a commit that referenced this issue Jun 15, 2017
Phase2-hgx85 Correct for overwriting (as reported in PR #19179)
@kpedro88
Contributor

+1
Resolved by #19198 as previously stated
Other valgrind issues that haven't been causing segfaults should get their own issue(s)

@slava77
Contributor

slava77 commented Jun 28, 2017

+1

fixed in #19198 as noted above already

@cmsbuild
Contributor

This issue is fully signed and ready to be closed.
