Memory problem in CSCTriggerPrimitivesProducer #25200

Closed
Dr15Jones opened this issue Nov 12, 2018 · 17 comments

Comments

@Dr15Jones
Contributor

The CMSSW_10_4_ASAN_X_2018-11-09-2300 release shows a 'read after delete' problem in CSCTriggerPrimitivesProducer. However, I think that is a misclassification and that it is probably an out-of-bounds memory access instead.

@cmsbuild
Contributor

A new Issue was created by @Dr15Jones Chris Jones.

@davidlange6, @Dr15Jones, @smuzaffar, @fabiocos, @kpedro88 can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@Dr15Jones
Contributor Author

I rebuilt the code in debug mode and reran the ASAN job for workflow 11626.0:

==20525==ERROR: AddressSanitizer: heap-use-after-free on address 0x619005d50d08 at pc 0x7f12aaae03fb bp 0x7ffec3489540 sp 0x7ffec3489538
READ of size 4 at 0x619005d50d08 thread T0
    #0 0x7f12aaae03fa in void CSCGEMMotherboard::matchingPads<GEMPadDigi>(CSCCLCTDigi const&, std::vector<std::pair<unsigned int, GEMPadDigi>, std::allocator<std::pair<unsigned int, GEMPadDigi> > >&) const /build/chrjones/temp/asan/CMSSW_10_4_ASAN_X_2018-11-09-2300/src/L1Trigger/CSCTriggerPrimitives/src/CSCGEMMotherboard.h:297
    #1 0x7f12aaadb783 in void CSCGEMMotherboard::matchingPads<CSCCLCTDigi, GEMPadDigi>(CSCCLCTDigi const&, CSCCLCTDigi const&, std::vector<std::pair<unsigned int, GEMPadDigi>, std::allocator<std::pair<unsigned int, GEMPadDigi> > >&) const (/build/chrjones/temp/asan/CMSSW_10_4_ASAN_X_2018-11-09-2300/lib/slc6_amd64_gcc700/libL1TriggerCSCTriggerPrimitives.so+0x131783)
    #2 0x7f12aaad7d63 in void CSCGEMMotherboard::matchingPads<GEMPadDigi>(CSCCLCTDigi const&, CSCCLCTDigi const&, CSCALCTDigi const&, CSCALCTDigi const&, std::vector<std::pair<unsigned int, GEMPadDigi>, std::allocator<std::pair<unsigned int, GEMPadDigi> > >&) const (/build/chrjones/temp/asan/CMSSW_10_4_ASAN_X_2018-11-09-2300/lib/slc6_amd64_gcc700/libL1TriggerCSCTriggerPrimitives.so+0x12dd63)
    #3 0x7f12aab1d267 in CSCGEMMotherboardME11::run(MuonDigiCollection<CSCDetId, CSCWireDigi> const*, MuonDigiCollection<CSCDetId, CSCComparatorDigi> const*, MuonDigiCollection<GEMDetId, GEMPadDigi> const*) /build/chrjones/temp/asan/CMSSW_10_4_ASAN_X_2018-11-09-2300/src/L1Trigger/CSCTriggerPrimitives/src/CSCGEMMotherboardME11.cc:145
    #4 0x7f12aaaf1b35 in CSCTriggerPrimitivesBuilder::build(CSCBadChambers const*, MuonDigiCollection<CSCDetId, CSCWireDigi> const*, MuonDigiCollection<CSCDetId, CSCComparatorDigi> const*, MuonDigiCollection<GEMDetId, GEMPadDigi> const*, MuonDigiCollection<GEMDetId, GEMPadDigiCluster> const*, MuonDigiCollection<CSCDetId, CSCALCTDigi>&, MuonDigiCollection<CSCDetId, CSCCLCTDigi>&, MuonDigiCollection<CSCDetId, CSCCLCTPreTriggerDigi>&, MuonDigiCollection<CSCDetId, int>&, MuonDigiCollection<CSCDetId, CSCCorrelatedLCTDigi>&, MuonDigiCollection<CSCDetId, CSCCorrelatedLCTDigi>&, MuonDigiCollection<GEMDetId, GEMCoPadDigi>&) /build/chrjones/temp/asan/CMSSW_10_4_ASAN_X_2018-11-09-2300/src/L1Trigger/CSCTriggerPrimitives/src/CSCTriggerPrimitivesBuilder.cc:270
    #5 0x7f12aac8bc3e in CSCTriggerPrimitivesProducer::produce(edm::StreamID, edm::Event&, edm::EventSetup const&) const /build/chrjones/temp/asan/CMSSW_10_4_ASAN_X_2018-11-09-2300/src/L1Trigger/CSCTriggerPrimitives/plugins/CSCTriggerPrimitivesProducer.cc:162

ASAN says the memory was deleted here:

0x619005d50d08 is located 136 bytes inside of 1025-byte region [0x619005d50c80,0x619005d51081)
freed by thread T0 here:
    #0 0x7f12dc48eee0 in operator delete(void*) ../../../../libsanitizer/asan/asan_new_delete.cc:124
    #1 0x7f12d980df45 in __gnu_cxx::new_allocator<char>::deallocate(char*, unsigned long) /build/cmsbld/auto-builds/CMSSW_10_1_0_pre2-slc6_amd64_gcc700/build/CMSSW_10_1_0_pre2-build/BUILD/slc6_amd64_gcc700/external/gcc/7.0.0-omkpbe2/gcc-branches_gcc-7-branch-257125/obj/x86_64-unknown-linux-gnu/libstdc++-v3/include/ext/new_allocator.h:125
    #2 0x7f12d980df45 in std::allocator_traits<std::allocator<char> >::deallocate(std::allocator<char>&, char*, unsigned long) /build/cmsbld/auto-builds/CMSSW_10_1_0_pre2-slc6_amd64_gcc700/build/CMSSW_10_1_0_pre2-build/BUILD/slc6_amd64_gcc700/external/gcc/7.0.0-omkpbe2/gcc-branches_gcc-7-branch-257125/obj/x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/alloc_traits.h:462
    #3 0x7f12d980df45 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_destroy(unsigned long) /build/cmsbld/auto-builds/CMSSW_10_1_0_pre2-slc6_amd64_gcc700/build/CMSSW_10_1_0_pre2-build/BUILD/slc6_amd64_gcc700/external/gcc/7.0.0-omkpbe2/gcc-branches_gcc-7-branch-257125/obj/x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/basic_string.h:226
    #4 0x7f12d980df45 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_dispose() /build/cmsbld/auto-builds/CMSSW_10_1_0_pre2-slc6_amd64_gcc700/build/CMSSW_10_1_0_pre2-build/BUILD/slc6_amd64_gcc700/external/gcc/7.0.0-omkpbe2/gcc-branches_gcc-7-branch-257125/obj/x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/basic_string.h:221
    #5 0x7f12d980df45 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string() /build/cmsbld/auto-builds/CMSSW_10_1_0_pre2-slc6_amd64_gcc700/build/CMSSW_10_1_0_pre2-build/BUILD/slc6_amd64_gcc700/external/gcc/7.0.0-omkpbe2/gcc-branches_gcc-7-branch-257125/obj/x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/basic_string.h:647
    #6 0x7f12d980df45 in std::__cxx11::basic_stringbuf<char, std::char_traits<char>, std::allocator<char> >::~basic_stringbuf() /build/cmsbld/auto-builds/CMSSW_10_1_0_pre2-slc6_amd64_gcc700/build/CMSSW_10_1_0_pre2-build/BUILD/slc6_amd64_gcc700/external/gcc/7.0.0-omkpbe2/gcc-branches_gcc-7-branch-257125/obj/x86_64-unknown-linux-gnu/libstdc++-v3/include/sstream:65
    #7 0x7f12d980df45 in std::__cxx11::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >::~basic_stringstream() /build/cmsbld/auto-builds/CMSSW_10_1_0_pre2-slc6_amd64_gcc700/build/CMSSW_10_1_0_pre2-build/BUILD/slc6_amd64_gcc700/external/gcc/7.0.0-omkpbe2/gcc-branches_gcc-7-branch-257125/obj/x86_64-unknown-linux-gnu/libstdc++-v3/include/sstream:731
    #8 0x7f12abf283c7 in RPCDigitizer::doAction(MixCollection<PSimHit>&, MuonDigiCollection<RPCDetId, RPCDigi>&, edm::DetSetVector<RPCDigiSimLink>&, CLHEP::HepRandomEngine*) /build/chrjones/temp/asan/CMSSW_10_4_ASAN_X_2018-11-09-2300/src/SimMuon/RPCDigitizer/src/RPCDigitizer.cc:58
    #9 0x7f12abf713d9 in RPCDigiProducer::produce(edm::Event&, edm::EventSetup const&) /build/chrjones/temp/asan/CMSSW_10_4_ASAN_X_2018-11-09-2300/src/SimMuon/RPCDigitizer/src/RPCDigiProducer.cc:111
    #10 0x7f12dc26276b in edm::stream::EDProducerAdaptorBase::doEvent(edm::EventPrincipal const&, edm::EventSetup const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) (/cvmfs/cms-ib.cern.ch/week1/slc6_amd64_gcc700/cms/cmssw/CMSSW_10_4_ASAN_X_2018-11-09-2300/lib/slc6_amd64_gcc700/libFWCoreFramework.so+0x75f76b)

which, as far as I can tell, has no connection whatsoever to the other code.

@Dr15Jones
Contributor Author

The problem happens here:

const int lowPad(mymap[clct.getKeyStrip()].first);

where mymap is a std::vector returned by
virtual std::vector<std::pair<int,int> > get_csc_hs_to_gem_pad(Parity par, enum CSCPart) const=0;

So I'm guessing the value of clct.getKeyStrip() is far beyond the size of the std::vector.
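
A minimal sketch of the kind of guard this suggests, assuming the lookup table is a `std::vector<std::pair<int,int>>` indexed by key half-strip as in the snippet above. The helper name `lookupPadRange` is hypothetical and not part of CMSSW; it only illustrates checking the index first, since `operator[]` on a `std::vector` performs no bounds check.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not CMSSW API): copy the pad range for a key
// half-strip into `out` and return true, or return false when the index
// falls outside the table.
bool lookupPadRange(const std::vector<std::pair<int, int>>& table,
                    int keyStrip, std::pair<int, int>& out) {
  if (keyStrip < 0 || static_cast<std::size_t>(keyStrip) >= table.size()) {
    return false;  // table[keyStrip] would be undefined behaviour here
  }
  out = table[static_cast<std::size_t>(keyStrip)];
  return true;
}
```

A caller could log the offending half-strip when this returns false instead of reading past the end of the table. Note that the 4-byte read in the ASAN report is consistent with reading the `int` member `.first` of one such pair.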

@Dr15Jones
Contributor Author

assign l1

@cmsbuild
Contributor

New categories assigned: l1

@rekovic,@nsmith-,@thomreis you have been requested to review this Pull request/Issue and eventually sign? Thanks

@kpedro88
Contributor

@dildick can you take a look?

@dildick
Contributor

dildick commented Nov 12, 2018

@kpedro88 @Dr15Jones I'll look into it this week.

@dildick
Contributor

dildick commented Nov 14, 2018

My colleague Tao (@tahuang1991) may have a solution: tahuang1991@20194b6. The problem seems to stem from recent pull requests in which we undid the decoupling of ME1a and ME1b in the CSC local trigger. In the Trigger & DAQ systems, ME1a is seen as an extension of ME1b (and this will not change in the future), so the ME1a half-strip numbering starts where ME1b ends. A correction is therefore needed when looking up the corresponding GEM pad number for half-strips in ME1a. The fix has yet to be tested, though.
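
The correction described above can be sketched roughly as follows. This is not the code from the linked commit: the function name and the `nHalfStripsME1b` parameter are hypothetical, and the value 128 in the note below is an assumption used only for illustration.

```cpp
// Hypothetical sketch (not the actual fix): convert a combined ME1/1
// half-strip number, where ME1a numbering continues after ME1b, into an
// index local to an ME1a-only lookup table.
// nHalfStripsME1b is the number of half-strips in ME1b (assumed input).
int localHalfStripME1a(int combinedHalfStrip, int nHalfStripsME1b) {
  // Half-strips below the ME1b count belong to ME1b; signal that with -1.
  if (combinedHalfStrip < nHalfStripsME1b) {
    return -1;
  }
  return combinedHalfStrip - nHalfStripsME1b;
}
```

With an assumed ME1b size of 128 half-strips, combined half-strip 130 would map to local ME1a index 2 rather than indexing position 130 in a table sized for ME1a alone, which is the kind of overrun ASAN reported.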

@Dr15Jones
Contributor Author

Ping

@tahuang1991
Contributor

As mentioned in Sven's earlier comment (tahuang1991@20194b6), this commit should already be integrated into our pull request, and it solves the problem.

@Dr15Jones
Contributor Author

We are waiting for the l2s for l1 to sign off on the issue.

@Dr15Jones
Contributor Author

There still seem to be problems with the code. See
https://cmssdt.cern.ch/SDT/cgi-bin/buildlogs/raw/slc7_amd64_gcc700/CMSSW_10_4_ASAN_X_2018-12-07-2300/pyRelValMatrixLogs/run/21634.0_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2023D22_GenSimHLBeamSpotFull14+DigiFullTrigger_2023D22+RecoFullGlobal_2023D22+HARVESTFullGlobal_2023D22/step2_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2023D22_GenSimHLBeamSpotFull14+DigiFullTrigger_2023D22+RecoFullGlobal_2023D22+HARVESTFullGlobal_2023D22.log

With

==7955==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x61900e29c810 at pc 0x7fb89f9727b9 bp 0x7ffebe74b450 sp 0x7ffebe74b448
READ of size 4 at 0x61900e29c810 thread T0
    #0 0x7fb89f9727b8 in void CSCGEMMotherboard::matchingPads<GEMPadDigi>(CSCCLCTDigi const&, std::vector<std::pair<unsigned int, GEMPadDigi>, std::allocator<std::pair<unsigned int, GEMPadDigi> > >&) const (/cvmfs/cms-ib.cern.ch/nweek-02553/slc7_amd64_gcc700/cms/cmssw/CMSSW_10_4_ASAN_X_2018-12-07-2300/lib/slc7_amd64_gcc700/libL1TriggerCSCTriggerPrimitives.so+0xaa7b8)
    #1 0x7fb89f972e2f in void CSCGEMMotherboard::matchingPads<CSCCLCTDigi, GEMPadDigi>(CSCCLCTDigi const&, CSCCLCTDigi const&, std::vector<std::pair<unsigned int, GEMPadDigi>, std::allocator<std::pair<unsigned int, GEMPadDigi> > >&) const (/cvmfs/cms-ib.cern.ch/nweek-02553/slc7_amd64_gcc700/cms/cmssw/CMSSW_10_4_ASAN_X_2018-12-07-2300/lib/slc7_amd64_gcc700/libL1TriggerCSCTriggerPrimitives.so+0xaae2f)
    #2 0x7fb89f975955 in void CSCGEMMotherboard::matchingPads<GEMPadDigi>(CSCCLCTDigi const&, CSCCLCTDigi const&, CSCALCTDigi const&, CSCALCTDigi const&, std::vector<std::pair<unsigned int, GEMPadDigi>, std::allocator<std::pair<unsigned int, GEMPadDigi> > >&) const (/cvmfs/cms-ib.cern.ch/nweek-02553/slc7_amd64_gcc700/cms/cmssw/CMSSW_10_4_ASAN_X_2018-12-07-2300/lib/slc7_amd64_gcc700/libL1TriggerCSCTriggerPrimitives.so+0xad955)
    #3 0x7fb89f9aa414 in CSCGEMMotherboardME11::run(MuonDigiCollection<CSCDetId, CSCWireDigi> const*, MuonDigiCollection<CSCDetId, CSCComparatorDigi> const*, MuonDigiCollection<GEMDetId, GEMPadDigi> const*) (/cvmfs/cms-ib.cern.ch/nweek-02553/slc7_amd64_gcc700/cms/cmssw/CMSSW_10_4_ASAN_X_2018-12-07-2300/lib/slc7_amd64_gcc700/libL1TriggerCSCTriggerPrimitives.so+0xe2414)
    #4 0x7fb89f98d7e2 in CSCTriggerPrimitivesBuilder::build(CSCBadChambers const*, MuonDigiCollection<CSCDetId, CSCWireDigi> const*, MuonDigiCollection<CSCDetId, CSCComparatorDigi> const*, MuonDigiCollection<GEMDetId, GEMPadDigi> const*, MuonDigiCollection<GEMDetId, GEMPadDigiCluster> const*, MuonDigiCollection<CSCDetId, CSCALCTDigi>&, MuonDigiCollection<CSCDetId, CSCCLCTDigi>&, MuonDigiCollection<CSCDetId, CSCCLCTPreTriggerDigi>&, MuonDigiCollection<CSCDetId, int>&, MuonDigiCollection<CSCDetId, CSCCorrelatedLCTDigi>&, MuonDigiCollection<CSCDetId, CSCCorrelatedLCTDigi>&, MuonDigiCollection<GEMDetId, GEMCoPadDigi>&) (/cvmfs/cms-ib.cern.ch/nweek-02553/slc7_amd64_gcc700/cms/cmssw/CMSSW_10_4_ASAN_X_2018-12-07-2300/lib/slc7_amd64_gcc700/libL1TriggerCSCTriggerPrimitives.so+0xc57e2)
    #5 0x7fb89fab7776 in CSCTriggerPrimitivesProducer::produce(edm::StreamID, edm::Event&, edm::EventSetup const&) const (/cvmfs/cms-ib.cern.ch/nweek-02553/slc7_amd64_gcc700/cms/cmssw/CMSSW_10_4_ASAN_X_2018-12-07-2300/lib/slc7_amd64_gcc700/pluginCSCTriggerPrimitivesPlugins.so+0x1c776)

and

Address 0x61900e29c810 is a wild pointer.
SUMMARY: AddressSanitizer: heap-buffer-overflow (/cvmfs/cms-ib.cern.ch/nweek-02553/slc7_amd64_gcc700/cms/cmssw/CMSSW_10_4_ASAN_X_2018-12-07-2300/lib/slc7_amd64_gcc700/libL1TriggerCSCTriggerPrimitives.so+0xaa7b8) in void CSCGEMMotherboard::matchingPads<GEMPadDigi>(CSCCLCTDigi const&, std::vector<std::pair<unsigned int, GEMPadDigi>, std::allocator<std::pair<unsigned int, GEMPadDigi> > >&) const
Shadow bytes around the buggy address:
  0x0c3281c4b8b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c3281c4b8c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c3281c4b8d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c3281c4b8e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c3281c4b8f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0c3281c4b900: fa fa[fa]fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c3281c4b910: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c3281c4b920: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c3281c4b930: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c3281c4b940: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c3281c4b950: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
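
For readers unfamiliar with the shadow dump above: with ASan's default mapping on x86_64 Linux, each 8 application bytes correspond to one shadow byte at `(addr >> 3) + 0x7fff8000`. A small sketch (the function and constant names are illustrative, not an ASan API) that reproduces the position of the bracketed shadow byte from the faulting address in this report:

```cpp
#include <cstdint>

// Default ASan shadow mapping on x86_64 Linux: one shadow byte covers
// 8 application bytes, offset by 0x7fff8000.
constexpr std::uint64_t kShadowOffset = 0x7fff8000ULL;

constexpr std::uint64_t shadowAddress(std::uint64_t appAddress) {
  return (appAddress >> 3) + kShadowOffset;
}
```

`shadowAddress(0x61900e29c810)` evaluates to `0x0c3281c4b902`, i.e. the third byte of the row starting at `0x0c3281c4b900`, which is the bracketed `fa` (heap redzone) in the dump. The read therefore landed in a redzone rather than in a live allocation.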

@tahuang1991
Contributor

I will look at this one during the weekend.
Tao

@tahuang1991
Contributor

I found another bug in L1Trigger/CSCTriggerPrimitives/src/CSCUpgradeMotherboard.cc. After fixing it, I repeated the test by running "runTheMatrix.py -l 21634.0" and confirmed that the problem was solved. Either Sven or I will make a pull request to cmssw later today.

@tahuang1991
Contributor

tahuang1991 commented Dec 11, 2018

The commit fixing this bug has been pushed to cmssw:
#25467

@thomreis
Contributor

+1

@cmsbuild
Contributor

This issue is fully signed and ready to be closed.
