guard HLTRecHitInAllL1RegionsProducer<T> against empty collection of L1T candidates #41466

missirol
Contributor

PR description:

This PR adds a check to HLTRecHitInAllL1RegionsProducer<T> so that it gracefully handles events where the input collection of L1T candidates is empty (either completely empty, or empty only for BX=0).

For such events, the plugin could crash here before this PR. This type of crash has been observed online multiple times in the last few days. The root cause of the problem (i.e. missing L1T candidates) is likely related to issues with the L1T menu deployed a couple of weeks ago (CMSLITOPS-411).
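The idea of the guard can be illustrated with a minimal Python sketch. It is not the actual C++ change in this PR: `Candidate`, `BXCollection`, `l1_regions` and the `half_width` parameter are hypothetical names introduced only for illustration, the region definition is an assumption (phi wrap-around is ignored), and the `KeyError` merely stands in for the invalid access that crashes the real plugin.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """Hypothetical stand-in for an L1T candidate (only eta/phi are modelled)."""
    eta: float
    phi: float


@dataclass
class BXCollection:
    """Hypothetical stand-in for a BX-indexed collection of L1T candidates."""
    by_bx: dict = field(default_factory=dict)  # maps BX number -> list of Candidate

    def is_empty(self, bx: int) -> bool:
        # True if the BX is missing altogether or holds no candidates.
        return not self.by_bx.get(bx)

    def candidates(self, bx: int) -> list:
        # Unguarded access: raises KeyError if the BX is missing.
        # This only stands in for the failure mode of the real plugin.
        return self.by_bx[bx]


def l1_regions(collection: BXCollection, half_width: float = 0.4) -> list:
    """Return (eta_min, eta_max, phi_min, phi_max) regions around BX=0 candidates."""
    if collection.is_empty(0):
        # The guard: nothing in BX=0, so there are no regions to build.
        return []
    return [
        (c.eta - half_width, c.eta + half_width,
         c.phi - half_width, c.phi + half_width)
        for c in collection.candidates(0)
    ]
```

In this sketch, a collection with no entries at all and a collection that only has candidates in another BX (for example BX=-1) both yield an empty list of regions instead of an error, which mirrors the two cases named in the PR description.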

A reproducer is in [1], and it produces the stack trace attached here (which matches the error log seen online).

FYI: @Sam-Harper @swagata87 (as this module is used by E/gamma triggers)

[1]

#!/bin/bash

# cmsrel CMSSW_13_0_3
# cd CMSSW_13_0_3/src
# cmsenv

hltGetConfiguration run:366727 \
  --globaltag 130X_dataRun3_HLT_v2 --data \
  --no-prescale --no-output \
  --max-events -1 \
  --input dummy.root \
  > hlt.py

cat <<@EOF >> hlt.py
del process.DQMOutput

process.options.numberOfThreads = 1
process.options.numberOfStreams = 0

process.hltOnlineBeamSpotESProducer.timeThreshold = int(1e6)

del process.MessageLogger
process.load('FWCore.MessageService.MessageLogger_cfi')
process.MessageLogger.cerr.FwkReport.reportEvery = 1
process.MessageLogger.cerr.enableStatistics = False
process.MessageLogger.cerr.threshold = 'INFO'

from EventFilter.Utilities.FedRawDataInputSource_cfi import source as _source
process.source = _source.clone(
    eventChunkSize = 200,
    eventChunkBlock = 200,
    numBuffers = 4,
    maxBufferedFiles = 4,
    fileListMode = True,
    fileNames = [
      "/eos/cms/store/group/dpg_trigger/comm_trigger/TriggerStudiesGroup/FOG/error_stream/run366727/run366727_ls0136_index000137_fu-c2b04-34-01_pid3305175.raw",
    ]
)

from EventFilter.Utilities.EvFDaqDirector_cfi import EvFDaqDirector as _EvFDaqDirector
process.EvFDaqDirector = _EvFDaqDirector.clone(buBaseDir = '.', runNumber = 0)
@EOF

cmsRun hlt.py &> hlt.log

PR validation:

With the changes in this PR, the reproducer does not crash.

If this PR is a backport, please specify the original PR and why you need to backport that PR. If this PR will be backported, please specify to which release cycle the backport is meant for:

CMSSW_13_0_X

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-41466/35330

  • This PR adds an extra 16KB to repository

@cmsbuild
Contributor

A new Pull Request was created by @missirol (Marino Missiroli) for master.

It involves the following packages:

  • RecoEgamma/EgammaHLTProducers (hlt)

@cmsbuild, @missirol, @Martin-Grunewald can you please review it and eventually sign? Thanks.
@Sam-Harper, @HuguesBrun, @silviodonato, @jainshilpi, @sameasy, @valsdav, @Fedespring, @lgray, @sobhatta, @afiqaize, @wrtabb, @a-kapoor, @Prasant1993, @varuns23, @cericeci, @ram1123 this is something you requested to watch as well.
@perrotta, @dpiparo, @rappoccio you are the release manager for this.

cms-bot commands are listed here

@missirol
Contributor Author

type bugfix

@missirol
Contributor Author

urgent

This PR can reduce the number of HLT crashes seen online in recent days, so it should be integrated quickly.

@missirol
Contributor Author

please test

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-2cea4d/32260/summary.html
COMMIT: 980aeed
CMSSW: CMSSW_13_1_X_2023-04-30-0000/el8_amd64_gcc11
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/41466/32260/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially removed 17 lines from the logs
  • Reco comparison results: 8 differences found in the comparisons
  • DQMHistoTests: Total files compared: 48
  • DQMHistoTests: Total histograms compared: 3460877
  • DQMHistoTests: Total failures: 6
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3460849
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 47 files compared)
  • Checked 207 log files, 159 edm output root files, 48 DQM output files
  • TriggerResults: no differences found

@missirol
Contributor Author

+hlt

No changes, as expected. There is a discussion on whether LogDebug is appropriate in such cases (#41467 (comment)), but I suggest addressing that, if needed, in a follow-up PR (cc: @dinyar).

@cms-sw/orp-l2 , please consider this PR for the upcoming IB.

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@rappoccio
Contributor

+1

@cmsbuild merged commit d008462 into cms-sw:master on Apr 30, 2023
11 checks passed
@cmsbuild mentioned this pull request on May 1, 2023
@missirol deleted the devel_sanityCheckfixHLTRecHitInAllL1RegionsProducer branch on May 6, 2023 13:01