assert in debug builds from GsfTrajectorySmoother #34097

Closed
Dr15Jones opened this issue Jun 11, 2021 · 17 comments

Comments

@Dr15Jones
Contributor

The DBG IB RelVals are failing with

cmsRun: /data/cmsbld/jenkins/workspace/build-any-ib/w/tmp/BUILDROOT/ff4d68f07cafd4cd27ed5a12537c66fd/opt/cmssw/slc7_amd64_gcc900/cms/cmssw/CMSSW_12_0_DBG_X_2021-06-10-2300/src/TrackingTools/TrajectoryState/src/BasicTrajectoryState.cc:252: virtual const Components& BasicSingleTrajectoryState::components() const: Assertion `false' failed.

with the backtrace

#8  0x00002b6799f2c252 in __assert_fail () from /lib64/libc.so.6
#9  0x00002b67bcf73ee8 in BasicSingleTrajectoryState::components (this=<optimized out>) at /data/cmsbld/jenkins/workspace/build-any-ib/w/tmp/BUILDROOT/ff4d68f07cafd4cd27ed5a12537c66fd/opt/cmssw/slc7_amd64_gcc900/cms/cmssw/CMSSW_12_0_DBG_X_2021-06-10-2300/src/TrackingTools/TrajectoryState/src/BasicTrajectoryState.cc:252
#10 0x00002b67c97eb668 in TrajectoryStateOnSurface::components (this=0x2b67e0bf7f50) at /data/cmsbld/jenkins/workspace/build-any-ib/w/slc7_amd64_gcc900/external/gcc/9.3.0/include/c++/9.3.0/bits/shared_ptr_base.h:1020
#11 (anonymous namespace)::dump (tsos=..., header=0x2b67c9802573 "smooTsos", msgCat=...) at /data/cmsbld/jenkins/workspace/build-any-ib/w/tmp/BUILDROOT/ff4d68f07cafd4cd27ed5a12537c66fd/opt/cmssw/slc7_amd64_gcc900/cms/cmssw/CMSSW_12_0_DBG_X_2021-06-10-2300/src/TrackingTools/TrackFitters/interface/DebugHelpers.h:84
#12 0x00002b67c97ed2cd in GsfTrajectorySmoother::trajectory (this=0x2b6849ea85a0, aTraj=...) at /data/cmsbld/jenkins/workspace/build-any-ib/w/slc7_amd64_gcc900/external/gcc/9.3.0/include/c++/9.3.0/ext/new_allocator.h:80
#13 0x00002b67f9ad16d8 in LowPtGsfElectronSeedProducer::lightGsfTracking (this=<optimized out>, preId=..., trackRef=..., seed=...) at /data/cmsbld/jenkins/workspace/build-any-ib/w/slc7_amd64_gcc900/external/gcc/9.3.0/include/c++/9.3.0/bits/unique_ptr.h:360
#14 0x00002b67f9ae8d32 in LowPtGsfElectronSeedProducer::loop<reco::PFRecTrack> (this=this@entry=0x2b67fc0d0c00, handle=..., hcalClusters=..., seeds=..., ecalPreIds=..., hcalPreIds=..., trksToPreIdIndx=..., event=..., setup=...) at /data/cmsbld/jenkins/workspace/build-any-ib/w/tmp/BUILDROOT/ff4d68f07cafd4cd27ed5a12537c66fd/opt/cmssw/slc7_amd64_gcc900/cms/cmssw/CMSSW_12_0_DBG_X_2021-06-10-2300/src/DataFormats/Common/interface/RefCoreWithIndex.h:57
#15 0x00002b67f9ad5d85 in LowPtGsfElectronSeedProducer::produce (this=0x2b67fc0d0c00, event=..., setup=...) at /data/cmsbld/jenkins/workspace/build-any-ib/w/tmp/BUILDROOT/ff4d68f07cafd4cd27ed5a12537c66fd/opt/cmssw/slc7_amd64_gcc900/cms/cmssw/CMSSW_12_0_DBG_X_2021-06-10-2300/src/RecoEgamma/EgammaElectronProducers/plugins/LowPtGsfElectronSeedProducer.cc:243
@cmsbuild
Contributor

A new Issue was created by @Dr15Jones Chris Jones.

@Dr15Jones, @dpiparo, @silviodonato, @smuzaffar, @makortel, @qliphy can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@Dr15Jones
Contributor Author

The assert is here

BasicSingleTrajectoryState::Components const& BasicSingleTrajectoryState::components() const {
  edm::LogError("BasicSingleTrajectoryState") << "asking for componenets to a SingleTrajectoryState" << std::endl;
  assert(false);

which is being called from

inline void dump(TrajectoryStateOnSurface const& tsos, const char* header, const std::string& msgCat) {
  std::ostringstream ss;
  ss << " weights ";
  for (auto const& c : tsos.components())

which comes from

dump(*hit, hitCounter, "TrackFitters");

The dump function has a conditional implementation that depends on whether the macro EDM_ML_DEBUG is defined (which it is for DBG builds).

@Dr15Jones
Contributor Author

assign reconstruction

@cmsbuild
Contributor

New categories assigned: reconstruction

@slava77,@perrotta,@jpata you have been requested to review this Pull request/Issue and eventually sign? Thanks

@perrotta
Contributor

@slava77
Contributor

slava77 commented Jun 11, 2021

@Dr15Jones
Please provide some details on how to reproduce this, at least the actual IB and a workflow.

Do you know if this is a new problem, or was it perhaps shadowed in the same test before by an earlier crash?

@Dr15Jones
Contributor Author

See https://cmssdt.cern.ch/SDT/html/cmssdt-ib/#/ib/CMSSW_12_0_X with the DBG_X build of 12_0_X_2021-06-10-2300. 748 of the RelVals failed for this reason.

@Dr15Jones
Contributor Author

The DBG_X build is the only IB where we turn on EDM_ML_DEBUG for all of CMSSW, so it uncovers these problems in conditionally compiled code.

@dan131riley

CMSSW_12_0_DBG_X_2021-06-10-2300, pretty much any wf, only happens if EDM_ML_DEBUG is defined, previously hidden because the DBG builds were completely borked.

@slava77
Contributor

slava77 commented Jun 11, 2021

> CMSSW_12_0_DBG_X_2021-06-10-2300, pretty much any wf, only happens if EDM_ML_DEBUG is defined, previously hidden because the DBG builds were completely borked.

This clarifies it.
I picked some other random cases, and the stack trace before DebugHelpers.h is different.
E.g. in wf 5.1 it's GsfTrajectorySmoother::trajectory -> dump

@slava77
Contributor

slava77 commented Jun 11, 2021

Both cases are still related to Gsf.
I wonder if this has been broken for ages.
@VinInn

@slava77
Contributor

slava77 commented Sep 30, 2021

@vmariani @mmusich
@wrtabb @SohamBhattacharya

Please clarify if somebody in TRK or EGM is available to check this issue.
Thank you.

@swagata87
Contributor

Hi,
I tried to address this problem here
#35524

This is still a draft PR because, after fixing this issue, debug mode uncovers more issues that need to be solved before the workflows can run successfully in debug mode. In the meantime, tracking experts might want to check whether my changes are okay for addressing the Gsf tracking assert issue.

@slava77
Contributor

slava77 commented Oct 19, 2021

+reconstruction

Looking at the first IB after #35524 was merged, CMSSW_12_1_X_2021-10-14-2300: in DBG_X the logs for e.g. wf 5.1 no longer show the assert.
There is still an exception, but it was already noticed in the tests of #35524: https://cmssdt.cern.ch/SDT/cgi-bin/buildlogs/raw/slc7_amd64_gcc900/CMSSW_12_1_DBG_X_2021-10-14-2300/pyRelValMatrixLogs/run/5.1_TTbar+TTbarFS+HARVESTFS/step1_TTbar+TTbarFS+HARVESTFS.log

This will be a topic for a follow-up issue.

@cmsbuild
Contributor

This issue is fully signed and ready to be closed.

@qliphy qliphy closed this as completed Oct 19, 2021
@smuzaffar
Contributor

smuzaffar commented Jan 7, 2022

@swagata87 many workflows in DBG_X IBs are failing with error [a]. Was there any follow-up issue/PR to discuss/fix this?

[a]

----- Begin Fatal Exception 07-Jan-2022 06:37:42 UTC-----------------------
An exception of category 'PFECALSuperClusterAlgo::buildSuperCluster' occurred while
   [0] Processing  Event run: 1 lumi: 1 event: 3 stream: 2
   [1] Running path 'prevalidation_step'
   [2] Prefetching for module MultiTrackValidator/'trackValidator'
   [3] Prefetching for module JetTracksAssociationToTrackRefs/'cutsRecoTracksAK4PFJets'
   [4] Prefetching for module JetTracksAssociatorExplicit/'ak4JetTracksAssociatorExplicitAll'
   [5] Prefetching for module FastjetJetProducer/'ak4PFJets'
   [6] Prefetching for module PFLinker/'particleFlow'
   [7] Prefetching for module PFProducer/'particleFlowTmp'
   [8] Prefetching for module PFBlockProducer/'particleFlowBlock'
   [9] Prefetching for module PFElecTkProducer/'pfTrackElec'
   [10] Prefetching for module GsfTrackProducer/'electronGsfTracks'
   [11] Prefetching for module TrackCandidateProducer/'fastElectronCkfTrackCandidates'
   [12] Prefetching for module ElectronSeedMerger/'electronMergedSeeds'
   [13] Prefetching for module ElectronSeedProducer/'ecalDrivenElectronSeeds'
   [14] Calling method for module PFECALSuperClusterProducer/'particleFlowSuperClusterECAL'
Exception Message:
Found a PS cluster matched to more than one EE cluster!
0x2ba8a5cfc5b8 == 0x2ba8a5cfc500
----- End Fatal Exception -------------------------------------------------

@swagata87
Contributor

Hello @smuzaffar,
As far as I recall, there was no separate GitHub issue to follow it up.
However, I have now looked into it a bit more, and I propose a solution here:
#36655
Hopefully it will solve the problem. Let's see from the tests.
