Crash in HCAL DQM for Data relvals in 9-4-14-UL #27456

Closed
fabiocos opened this issue Jul 8, 2019 · 7 comments

Comments

@fabiocos
Contributor

fabiocos commented Jul 8, 2019

From @prebello

PdmV would like to report a crash in the 9-4-14-UL release (observed only after the jobs were injected into the system, not locally) related to Module: DigiPhase1Task:digiPhase1Task in the HCAL DQM. It is the same module as in the issue reported for 10-2-0-pre3 below:

#23259

Could you please tell us whether the fix #23262 proposed in that issue was backported to 9-4-X?

Is there any other suggestion for fixing it, apart from removing @hcal from the DQM sequence?

Here is a link to the log of the recent crash (all new Data relvals in 9-4-14-UL are affected):

https://cms-unified.web.cern.ch/cms-unified/report/prebello_RVCMSSW_9_4_14_ULRunDoubleMuon2017C_UL2017__RelVal_2017C_190702_233947_1928

@cmsbuild
Contributor

cmsbuild commented Jul 8, 2019

A new Issue was created by @fabiocos Fabio Cossutti.

@davidlange6, @Dr15Jones, @smuzaffar, @fabiocos, @kpedro88 can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@fabiocos
Contributor Author

fabiocos commented Jul 8, 2019

assign dqm

@DryRun FYI

@cmsbuild
Contributor

cmsbuild commented Jul 8, 2019

New categories assigned: dqm

@jfernan2,@andrius-k,@schneiml,@fioriNTU,@kmaeshima you have been requested to review this Pull request/Issue and eventually sign? Thanks

@fabiocos
Contributor Author

fabiocos commented Jul 8, 2019

@prebello @zhenhu do you have a tarball that allows us to reproduce the crash? I wonder why this crash is seen only now, as that code seems to have been stable for a long time. I would imagine no data RelVals have been injected since last spring?

@prebello
Contributor

prebello commented Jul 8, 2019

@fabiocos @fioriNTU

The 2017B era has not crashed, but the eras from 2017C to 2017F have:
https://dmytro.web.cern.ch/dmytro/cmsprodmon/workflows.php?prep_id=CMSSW_9_4_14_UL__dataUL2017B-1562097746-RunDoubleMuon2017B_UL2017

The latest data rereco and relvals were done with 9-4-0.

The relvals were injected based on the new implementation from prebello@2769090,
using the standard commands runTheMatrix.py --what standard -l .... accordingly.

My local test for 2017C is at /afs/cern.ch/user/p/prebello/prebello/public/PDMV/UL2017Tracker/CMSSW_9_4_14_UL/src/2017C
Actually, the local test is fine for all eras, apart from a few HLT error messages. Therefore I don't know how we could reproduce the crash.

@DryRun
Contributor

DryRun commented Jul 9, 2019

Hi all,

I created a PR to solve the crash: #27470. I was able to run the cmsDriver command from @franzoni on JetHT 2017C [1], but as @prebello says, a successful local test may not be representative of the injected jobs that crash.

cmsDriver.py Configuration/GenProduction/python/superfragment_cfi.py --fileout file:HiggsLLP_test.GENSIM.root --mc --eventcontent RAWSIM --datatier GEN-SIM --conditions 102X_upgrade2018_realistic_v11 --beamspot Realistic25ns13TeVEarly2018Collision --customise_commands="process.source.numberEventsInLuminosityBlock = cms.untracked.uint32(10)" --step GEN,SIM --nThreads 8 --geometry DB:Extended --era Run2_2018 --python_filename superfragment_cfg.py --no_exec --customise Configuration/DataProcessing/Utils.addMonitoring -n 100

@fabiocos
Contributor Author

fabiocos commented Aug 9, 2019

#27470 has been merged

fabiocos closed this as completed Aug 9, 2019