Exception from HGCalBackendLayer2Producer #41451

Closed
makortel opened this issue Apr 27, 2023 · 24 comments

Comments

@makortel
Contributor

Step 2 of workflow 25234.911 occasionally fails with an exception:

----- Begin Fatal Exception 27-Apr-2023 17:14:00 CEST-----------------------
An exception of category 'OutOfBound' occurred while
   [0] Processing  Event run: 1 lumi: 1 event: 3 stream: 1
   [1] Running path 'FEVTDEBUGHLToutput_step'
   [2] Prefetching for module PoolOutputModule/'FEVTDEBUGHLToutput'
   [3] Prefetching for module L1TEGMultiMerger/'l1tLayer1EG'
   [4] Prefetching for module L1TCorrelatorLayer1Producer/'l1tLayer1HGCal'
   [5] Prefetching for module PFClusterProducerFromHGC3DClusters/'l1tPFClustersFromHGC3DClusters'
   [6] Calling method for module HGCalBackendLayer2Producer/'l1tHGCalBackEndLayer2Producer'
Exception Message:
TC X1 = 2.77857e+07 out of the seeding histogram bounds 0.076 - 0.58
----- End Fatal Exception -------------------------------------------------

https://cmssdt.cern.ch/SDT/cgi-bin/logreader/el8_amd64_gcc11/CMSSW_13_1_NONLTO_X_2023-04-27-1100/pyRelValMatrixLogs/run/25234.911_TTbar_14TeV+2026D99_DD4hep/step2_TTbar_14TeV+2026D99_DD4hep.log#/

@makortel
Contributor Author

assign upgrade, l1, geometry

@cmsbuild
Contributor

New categories assigned: geometry,upgrade,l1

@mdhildreth,@epalencia,@AdrianoDee,@Dr15Jones,@srimanob,@aloeliger,@makortel,@bsunanda,@cecilecaillol,@civanch you have been requested to review this Pull request/Issue and eventually sign? Thanks

@cmsbuild
Contributor

cmsbuild commented Apr 27, 2023

A new Issue was created by @makortel Matti Kortelainen.

@Dr15Jones, @perrotta, @dpiparo, @rappoccio, @makortel, @smuzaffar can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@makortel
Contributor Author

Earlier occurrences have been reported in #41376 (comment) and #40404 (comment)

@makortel
Contributor Author

Let's tag @cms-sw/hgcal-dpg-l2 as well

@aandvalenzuela
Contributor

New occurrence of this exception in CLANG IBs (CMSSW_13_3_CLANG_X_2023-08-04-2300):

An exception of category 'OutOfBound' occurred while
   [0] Processing  Event run: 1 lumi: 1 event: 4 stream: 1
   [1] Running path 'HLTriggerFinalPath'
   [2] Prefetching for module TriggerSummaryProducerAOD/'hltTriggerSummaryAOD'
   [3] Prefetching for module EgammaHLTRecoEcalCandidateProducers/'hltEgammaCandidatesL1Seeded'
   [4] Prefetching for module PFECALSuperClusterProducer/'particleFlowSuperClusterHGCalFromTICLL1Seeded'
   [5] Prefetching for module PFClusterProducer/'particleFlowClusterHGCalFromTICLL1Seeded'
   [6] Prefetching for module PFRecHitProducer/'particleFlowRecHitHGCL1Seeded'
   [7] Prefetching for module HGCalRecHitProducer/'HGCalRecHitL1Seeded'
   [8] Prefetching for module HGCalUncalibRecHitProducer/'HGCalUncalibRecHitL1Seeded'
   [9] Prefetching for module HLTHGCalDigisInRegionsProducer/'hgcalDigisL1Seeded'
   [10] Prefetching for module L1TEGammaFilteredCollectionProducer/'hltL1TEGammaHGCFilteredCollectionProducer'
   [11] Prefetching for module L1TEGMultiMerger/'l1tLayer1EG'
   [12] Prefetching for module L1TCorrelatorLayer1Producer/'l1tLayer1HGCal'
   [13] Prefetching for module PFClusterProducerFromHGC3DClusters/'l1tPFClustersFromHGC3DClusters'
   [14] Calling method for module HGCalBackendLayer2Producer/'l1tHGCalBackEndLayer2Producer'
Exception Message:
TC X1 = inf out of the seeding histogram bounds 0.076 - 0.58

@iarspider
Contributor

Another occurrence of this exception - in CMSSW_13_3_X_2023-08-21-2300

----- Begin Fatal Exception 22-Aug-2023 08:29:10 CEST-----------------------
An exception of category 'OutOfBound' occurred while
   [0] Processing  Event run: 1 lumi: 1 event: 2 stream: 3
   [1] Running path 'HLTriggerFinalPath'
   [2] Prefetching for module TriggerSummaryProducerAOD/'hltTriggerSummaryAOD'
   [3] Prefetching for module EgammaHLTRecoEcalCandidateProducers/'hltEgammaCandidatesL1Seeded'
   [4] Prefetching for module PFECALSuperClusterProducer/'particleFlowSuperClusterHGCalFromTICLL1Seeded'
   [5] Prefetching for module PFClusterProducer/'particleFlowClusterHGCalFromTICLL1Seeded'
   [6] Prefetching for module PFRecHitProducer/'particleFlowRecHitHGCL1Seeded'
   [7] Prefetching for module HGCalRecHitProducer/'HGCalRecHitL1Seeded'
   [8] Prefetching for module HGCalUncalibRecHitProducer/'HGCalUncalibRecHitL1Seeded'
   [9] Prefetching for module HLTHGCalDigisInRegionsProducer/'hgcalDigisL1Seeded'
   [10] Prefetching for module L1TEGammaFilteredCollectionProducer/'hltL1TEGammaHGCFilteredCollectionProducer'
   [11] Prefetching for module L1TEGMultiMerger/'l1tLayer1EG'
   [12] Prefetching for module L1TCorrelatorLayer1Producer/'l1tLayer1HGCal'
   [13] Prefetching for module PFClusterProducerFromHGC3DClusters/'l1tPFClustersFromHGC3DClusters'
   [14] Calling method for module HGCalBackendLayer2Producer/'l1tHGCalBackEndLayer2Producer'
Exception Message:
TC X1 = inf out of the seeding histogram bounds 0.076 - 0.58
----- End Fatal Exception -------------------------------------------------

@perrotta
Contributor

@cms-sw/hgcal-dpg-l2 instead of throwing at lines
https://github.com/cms-sw/cmssw/blob/master/L1Trigger/L1THGCal/src/backend/HGCalHistoSeedingImpl.cc#L91-L92
and
https://github.com/cms-sw/cmssw/blob/master/L1Trigger/L1THGCal/src/backend/HGCalHistoSeedingImpl.cc#L95-L96
couldn't you just issue a warning and move on to the next cluster (i.e. continue)?
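
A minimal sketch of what "warn and continue" could look like, written as standalone C++ rather than as a patch to HGCalHistoSeedingImpl.cc. Everything named here is hypothetical: `TriggerCell`, `fillSeedingHistogram`, and the use of `std::cerr` in place of the framework's message logger. Computing X1 as r/|z| is also an assumption made for illustration. Only the bounds 0.076 and 0.58 come from the exception message above.

```cpp
#include <cassert>
#include <cmath>
#include <iostream>
#include <vector>

// Hypothetical stand-in for a trigger cell; not the CMSSW class.
struct TriggerCell {
  double x, y, z;  // position
  double mipPt;    // energy-like weight
};

// Bounds quoted in the exception message.
constexpr double kX1Min = 0.076;
constexpr double kX1Max = 0.58;

// Sums the weights of the cells whose X1 is inside the bounds.
// Out-of-range cells are reported and skipped instead of throwing;
// 'skipped' receives the number of rejected cells.
double fillSeedingHistogram(const std::vector<TriggerCell>& cells, unsigned& skipped) {
  double sum = 0.;
  skipped = 0;
  for (const auto& tc : cells) {
    // Assumption: X1 is r/|z|. With z == 0 this yields inf or NaN.
    const double x1 = std::hypot(tc.x, tc.y) / std::abs(tc.z);
    // Negated in-range test so that NaN is rejected as well.
    if (!(x1 >= kX1Min && x1 <= kX1Max)) {
      std::cerr << "Warning: TC X1 = " << x1 << " out of the seeding histogram bounds "
                << kX1Min << " - " << kX1Max << ", skipping\n";
      ++skipped;
      continue;
    }
    sum += tc.mipPt;
  }
  return sum;
}
```

Note that skipping would only hide the symptom: values such as inf or 2.77857e+07 suggest the input itself is corrupted, so the source of the bad values would still need to be found.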

@iarspider
Contributor

Happened again in CMSSW_13_3_DBG_X_2023-10-26-2300. @cms-sw/hgcal-dpg-l2 could you please reply to the suggestion from @perrotta above?

@civanch
Contributor

civanch commented Oct 27, 2023

@bsunanda, could this issue somehow be connected with the D99 geometry?

@iarspider
Contributor

Happened again in CMSSW_14_0_X_2023-11-23-2300:

----- Begin Fatal Exception 24-Nov-2023 07:18:03 CET-----------------------
An exception of category 'OutOfBound' occurred while
   [0] Processing  Event run: 1 lumi: 1 event: 5 stream: 3
   [1] Running path 'HLTriggerFinalPath'
   [2] Prefetching for module TriggerSummaryProducerAOD/'hltTriggerSummaryAOD'
   [3] Prefetching for module EgammaHLTRecoEcalCandidateProducers/'hltEgammaCandidatesL1Seeded'
   [4] Prefetching for module PFECALSuperClusterProducer/'particleFlowSuperClusterHGCalFromTICLL1Seeded'
   [5] Prefetching for module PFClusterProducer/'particleFlowClusterHGCalFromTICLL1Seeded'
   [6] Prefetching for module PFRecHitProducer/'particleFlowRecHitHGCL1Seeded'
   [7] Prefetching for module HGCalRecHitProducer/'HGCalRecHitL1Seeded'
   [8] Prefetching for module HGCalUncalibRecHitProducer/'HGCalUncalibRecHitL1Seeded'
   [9] Prefetching for module HLTHGCalDigisInRegionsProducer/'hgcalDigisL1Seeded'
   [10] Prefetching for module L1TEGammaFilteredCollectionProducer/'hltL1TEGammaHGCFilteredCollectionProducer'
   [11] Prefetching for module L1TEGMultiMerger/'l1tLayer1EG'
   [12] Prefetching for module L1TCorrelatorLayer1Producer/'l1tLayer1HGCal'
   [13] Prefetching for module PFClusterProducerFromHGC3DClusters/'l1tPFClustersFromHGC3DClusters'
   [14] Calling method for module HGCalBackendLayer2Producer/'l1tHGCalBackEndLayer2Producer'
Exception Message:
TC X1 = inf out of the seeding histogram bounds 0.076 - 0.58
----- End Fatal Exception -------------------------------------------------
----- Begin Fatal Exception 24-Nov-2023 07:18:03 CET-----------------------
An exception of category 'OutOfBound' occurred while
   [0] Processing  Event run: 1 lumi: 1 event: 2 stream: 2
   [1] Running path 'HLTriggerFinalPath'
   [2] Prefetching for module TriggerSummaryProducerAOD/'hltTriggerSummaryAOD'
   [3] Prefetching for module EgammaHLTRecoEcalCandidateProducers/'hltEgammaCandidatesL1Seeded'
   [4] Prefetching for module PFECALSuperClusterProducer/'particleFlowSuperClusterHGCalFromTICLL1Seeded'
   [5] Prefetching for module PFClusterProducer/'particleFlowClusterHGCalFromTICLL1Seeded'
   [6] Prefetching for module PFRecHitProducer/'particleFlowRecHitHGCL1Seeded'
   [7] Prefetching for module HGCalRecHitProducer/'HGCalRecHitL1Seeded'
   [8] Prefetching for module HGCalUncalibRecHitProducer/'HGCalUncalibRecHitL1Seeded'
   [9] Prefetching for module HLTHGCalDigisInRegionsProducer/'hgcalDigisL1Seeded'
   [10] Prefetching for module L1TEGammaFilteredCollectionProducer/'hltL1TEGammaHGCFilteredCollectionProducer'
   [11] Prefetching for module L1TEGMultiMerger/'l1tLayer1EG'
   [12] Prefetching for module L1TCorrelatorLayer1Producer/'l1tLayer1HGCal'
   [13] Prefetching for module PFClusterProducerFromHGC3DClusters/'l1tPFClustersFromHGC3DClusters'
   [14] Calling method for module HGCalBackendLayer2Producer/'l1tHGCalBackEndLayer2Producer'
Exception Message:
TC X1 = 1.79347 out of the seeding histogram bounds 0.076 - 0.58
----- End Fatal Exception -------------------------------------------------

@iarspider
Contributor

@bsunanda @cms-sw/hgcal-dpg-l2 gentle ping.

@srimanob
Contributor

Can this topic be closed?

@makortel
Contributor Author

Has the problem been fixed?

@srimanob
Contributor

Ah, the PR I saw is not a fix for this issue. So this is still an open issue.
Sorry for the noise.

@bsunanda
Contributor

bsunanda commented Jan 18, 2024 via email

@makortel
Contributor Author

Workflow 25234.911 (step2 of it), any build. It likely requires multiple threads (at least I don't recall seeing the exception in PR tests). The exception occurs randomly, so reproducing it likely takes multiple attempts, possibly on a loaded machine.

@dan131riley

We are still seeing this exception, and there's some circumstantial evidence that this issue and #42025 could be related.

@aandvalenzuela
Contributor

New occurrence of this failure in RelVal 25234.911 for el8_amd64_gcc12 in CMSSW_14_1_X_2024-05-19-2300 IBs:

----- Begin Fatal Exception 20-May-2024 01:32:28 CEST-----------------------
An exception of category 'OutOfBound' occurred while
   [0] Processing  Event run: 1 lumi: 1 event: 1 stream: 3
   [1] Running path 'HLTriggerFinalPath'
   [2] Prefetching for module TriggerSummaryProducerAOD/'hltTriggerSummaryAOD'
   [3] Prefetching for module L1HPSPFTauProducer/'l1tHPSPFTauProducer'
   [4] Prefetching for module L1TPFCandMultiMerger/'l1tLayer1'
   [5] Prefetching for module L1TCorrelatorLayer1Producer/'l1tLayer1HGCal'
   [6] Prefetching for module PFClusterProducerFromHGC3DClusters/'l1tPFClustersFromHGC3DClusters'
   [7] Calling method for module HGCalBackendLayer2Producer/'l1tHGCalBackEndLayer2Producer'
Exception Message:
TC X1 = inf out of the seeding histogram bounds 0.076 - 0.58
----- End Fatal Exception -------------------------------------------------

@srimanob
Contributor

The same issue appears in D110 DD4hep, #45175 (comment)

@srimanob
Contributor

srimanob commented Jul 6, 2024

A strange way to reproduce the issue:
#41927 (comment)

@srimanob
Contributor

#41927 (comment)

@makortel
Contributor Author

This one got fixed with #45442 as well.
