add check validity of edm::Handle in HLTDQM modules #24445

Merged: 3 commits into cms-sw:CMSSW_10_2_X, Sep 7, 2018

mtosi commented Sep 3, 2018

As pointed out in #24155 (comment)
and also seen in https://hypernews.cern.ch/HyperNews/CMS/get/tier0-Ops/1978/1/1.html,

some modules make the job crash (!!!).
I've added a check of the handle validity.
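
For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of a handle-validity guard. It does not use CMSSW: `MockHandle` is a hypothetical stand-in that mimics only `isValid()` and `operator*` of `edm::Handle`, and `MockMonitor` is an invented monitor, not one of the modules touched by this PR.

```cpp
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for edm::Handle<T> (assumption: only isValid()
// and operator* are mimicked; this is not the CMSSW class).
template <typename T>
class MockHandle {
public:
  MockHandle() = default;  // invalid handle: no product attached
  explicit MockHandle(T product) : product_(std::move(product)) {}

  bool isValid() const { return product_.has_value(); }

  // Throws std::bad_optional_access when the handle is invalid.
  const T& operator*() const { return product_.value(); }

private:
  std::optional<T> product_;
};

// Hypothetical monitor: `filled` counts the events that reached the plot.
struct MockMonitor {
  int filled = 0;

  void analyze(const MockHandle<std::vector<double>>& metHandle) {
    // Guard first: never dereference a handle that has no product.
    if (!metHandle.isValid()) return;  // skip the event
    if ((*metHandle).empty()) return;  // nothing to plot
    ++filled;
  }
};
```

With this guard, an event whose collection is missing is skipped instead of aborting the job.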

cmsbuild commented Sep 3, 2018

A new Pull Request was created by @mtosi (mia tosi) for CMSSW_10_2_X.

It involves the following packages:

DQMOffline/Trigger

@kmaeshima, @cmsbuild, @andrius-k, @jfernan2, @schneiml can you please review it and eventually sign? Thanks.
@battibass, @jhgoh, @calderona, @HuguesBrun, @trocino, @folguera, @rociovilar this is something you requested to watch as well.
@davidlange6, @slava77, @fabiocos you are the release manager for this.

cms-bot commands are listed here

fabiocos commented Sep 3, 2018

please test

cmsbuild commented Sep 3, 2018

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/30222/console Started: 2018/09/03 15:40

fabiocos commented Sep 3, 2018

@mtosi as we have discussed directly:

  1. the fix prevents the crash, verified with Configuration/dataProcessing

python RunExpressProcessing.py --scenario ppEra_Run2_2018 --fevt --dqmio --global-tag 102X_dataRun2_Express_v2 --lfn /store/t0streamer/Data/Express/000/321/879/run321879_ls0315_streamExpress_StorageManager.dat

using as a source

process.source = cms.Source("NewEventStreamFileReader",
    fileNames = cms.untracked.vstring('/store/t0streamer/Data/Express/000/321/879/run321879_ls0315_streamExpress_StorageManager.dat')
)

  2. I suggest adding LogWarnings for a limited number of checks, so as to avoid silent failures without filling the log files with failure messages;

  3. please forward-port to master;

  4. again, this is a workaround, not the ultimate solution (proper management of the DQM express sequences, which has been under discussion for a long time).
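
The "limited number of LogWarnings" suggestion can be illustrated with a simple counter. This is a sketch only: `LimitedWarning` is a hypothetical helper, `std::cerr` stands in for `edm::LogWarning`, and the category and message strings are invented. It is not the code that was merged.

```cpp
#include <iostream>
#include <string>

// Hypothetical helper, not a CMSSW class: prints at most `maxMessages`
// warnings and only counts the rest, so the failure is neither silent
// nor repeated for every event.
class LimitedWarning {
public:
  explicit LimitedWarning(unsigned maxMessages) : max_(maxMessages) {}

  // Returns true if the message was actually printed.
  bool warn(const std::string& category, const std::string& message) {
    ++seen_;
    if (seen_ > max_) return false;  // limit reached: stay quiet
    std::cerr << "[warning] " << category << ": " << message;
    if (seen_ == max_) std::cerr << " (further warnings suppressed)";
    std::cerr << '\n';
    return true;
  }

  // Total number of warnings requested, printed or not.
  unsigned seen() const { return seen_; }

private:
  unsigned max_;
  unsigned seen_ = 0;
};
```

A module would hold one such object per check and call `warn(...)` just before skipping the event. The counter is not synchronized, so the sketch assumes each instance is used from a single thread.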

cmsbuild commented Sep 3, 2018

Comparison job queued.

cmsbuild commented Sep 3, 2018

Pull request #24445 was updated. @kmaeshima, @cmsbuild, @andrius-k, @jfernan2, @schneiml can you please check and sign again.

mtosi commented Sep 3, 2018

@cmsbuild, please test

cmsbuild commented Sep 3, 2018

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/30228/console Started: 2018/09/03 18:09

schneiml commented Sep 3, 2018

@fabiocos as you mentioned, this does not really fix the problem, but a proper fix will be much harder/riskier.

IIRC @mtosi said that they don't need any HLT DQM in StreamExpress anyways, so we could just disable that entirely. Except we can't currently, since StreamExpress is using the default relval sequence...

Another thing that would be interesting is if we can reproduce this crash using runTheMatrix. Fixing WF 1001.2 to use the "proper", default sequence as used in real Express did not do it, but it is based on 2017 data. I wonder what else changed that causes this problem to appear in T0 express.

cmsbuild commented Sep 3, 2018

Comparison job queued.

cmsbuild commented Sep 5, 2018

Pull request #24445 was updated. @kmaeshima, @cmsbuild, @andrius-k, @jfernan2, @schneiml can you please check and sign again.

Update BPHMonitor.cc

Update BPHMonitor.cc

Update HTMonitor.cc

Update METMonitor.cc

cmsbuild commented Sep 5, 2018

Pull request #24445 was updated. @kmaeshima, @cmsbuild, @andrius-k, @jfernan2, @schneiml can you please check and sign again.

mtosi commented Sep 5, 2018

I've implemented the logic discussed with Fabio:

  • check whether the InputTag is empty
  • only if it is not empty, skip the event (i.e. do not fill the plot)

and then I rebased to squash the commits and limit their number
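
One possible reading of the InputTag logic described in this comment, written as a standalone predicate. The interpretation is an assumption: an empty InputTag label is taken to mean the collection was never requested, so an invalid handle is expected in that case and is not a reason to skip the event. `shouldSkipEvent` and its parameters are hypothetical names, not taken from the PR.

```cpp
#include <string>

// Hypothetical helper (assumed reading of the comment, not the merged code):
// skip the event only when a collection was actually requested
// (non-empty label) and its handle turned out to be invalid.
inline bool shouldSkipEvent(const std::string& inputTagLabel,
                            bool handleIsValid) {
  const bool collectionRequested = !inputTagLabel.empty();
  return collectionRequested && !handleIsValid;
}
```

Under this reading, a module configured without an optional collection keeps filling its plots, while a configured but missing collection causes the event to be skipped.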

mtosi commented Sep 6, 2018

@cmsbuild, please test

cmsbuild commented Sep 6, 2018

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/30282/console Started: 2018/09/06 11:52

mtosi commented Sep 6, 2018

@fabiocos could this PR be integrated and a patch release including it be built?
TSG asked PdmV to produce samples for the 102X validation with data, and we need this PR in.

Please let me know if there is anything I need to do,
and thanks!

cmsbuild commented Sep 6, 2018

Comparison job queued.

cmsbuild commented Sep 6, 2018

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-24445/30282/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 31
  • DQMHistoTests: Total histograms compared: 2985378
  • DQMHistoTests: Total failures: 1
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2985187
  • DQMHistoTests: Total skipped: 190
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 6.536 KiB( 30 files compared)
  • DQMHistoSizes: changed ( 150.0,... ): 3.268 KiB HLT/B2G
  • Checked 129 log files, 14 edm output root files, 31 DQM output files

fabiocos commented Sep 7, 2018

@jfernan2 @schneiml @andrius-k could you please check it and, if everything is fine, sign it?

@andrius-k

+1

cmsbuild commented Sep 7, 2018

This pull request is fully signed and it will be integrated in one of the next CMSSW_10_2_X IBs (tests are also fine) and once validation in the development release cycle CMSSW_10_3_X is complete. This pull request will now be reviewed by the release team before it's merged. @davidlange6, @slava77, @smuzaffar, @fabiocos (and backports should be raised in the release meeting by the corresponding L2)

cmsbuild added a commit that referenced this pull request Sep 7, 2018
add check validity of edm::Handle in HLTDQM modules [forwardport of #24445]

fabiocos commented Sep 7, 2018

+1

cmsbuild merged commit 82b4b30 into cms-sw:CMSSW_10_2_X on Sep 7, 2018
5 participants