Fix crash of ECAL GPU reco when ECAL is out of the run - 11_3_X #34768
A new Pull Request was created by @thomreis (Thomas Reis) for CMSSW_11_3_X. It involves the following packages:
@perrotta, @jpata, @cmsbuild, @slava77 can you please review it and eventually sign? Thanks. cms-bot commands are listed here
@@ -138,6 +138,10 @@ void EcalRawToDigiGPU::acquire(edm::Event const& event,
  if (counter > 0) {
    ecal::raw::entryPoint(
        inputCPU, inputGPU, outputGPU_, scratchGPU, outputCPU_, conditions, ctx.stream(), counter, currentCummOffset);
  } else {
I think it would be cleaner to set these to 0 unconditionally, before calling (or not) ecal::raw::entryPoint.
Can be done as well, but zeroing seems an unnecessary operation when ecal::raw::entryPoint is called.
The idea is to make sure the variables are in a clean (initialised) state, independently of the future use.
Especially because they are only set in the subsequent call to produce(), and anybody looking at them inside ecal::raw::entryPoint (for example, to debug the next problem in a few weeks/months/years time) will likely be confused by seeing random values.
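The unconditional reset suggested in this thread could look roughly like the sketch below. It is only an illustration: `OutputCounts`, `unpackFeds` and `acquireStep` are hypothetical stand-ins, not the actual CMSSW types or the signature of ecal::raw::entryPoint.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the per-event channel counters held by the module.
struct OutputCounts {
  std::uint32_t nChannelsEB;
  std::uint32_t nChannelsEE;
};

// Hypothetical stand-in for the unpacking entry point: it fills the counters
// from the FED payloads. Here every byte is counted as one barrel channel,
// which is purely illustrative.
void unpackFeds(std::vector<std::vector<std::uint8_t>> const& feds, OutputCounts& counts) {
  for (auto const& fed : feds) {
    counts.nChannelsEB += static_cast<std::uint32_t>(fed.size());
  }
}

void acquireStep(std::vector<std::vector<std::uint8_t>> const& feds, OutputCounts& counts) {
  // Reset unconditionally, so the counters are in a defined state on every
  // event, whether or not the unpacker runs.
  counts.nChannelsEB = 0;
  counts.nChannelsEE = 0;

  if (!feds.empty()) {
    unpackFeds(feds, counts);
  }
}
```

With this layout there is no `else` branch to forget, and anyone inspecting the counters inside the unpacker sees zeros rather than leftovers from a previous event. The cost is two extra stores per event.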
type bug-fix
please test
No enable gpu needed for this one @fwyzard?
It's not like the GPU tests were finding the crash anyway... but sure, let's run that as well.
enable gpu
please test
urgent
+1 Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-6de7ff/17525/summary.html
+reconstruction
This pull request is fully signed and it will be integrated in one of the next CMSSW_11_3_X IBs (tests are also fine) and once validation in the development release cycle CMSSW_12_1_X is complete. This pull request will now be reviewed by the release team before it's merged. @silviodonato, @dpiparo, @qliphy, @perrotta (and backports should be raised in the release meeting by the corresponding L2)
+1
PR description:
Fixes a crash of the EcalUncalibRecHitProducerGPU module when the ECAL is not in the run. The reason for the crash is that the number of channels is not properly initialised to zero in the EcalRawToDigiGPU when there are no FEDs to unpack.
Also improves the error message that helped to find the origin of the problem.
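To illustrate why an uninitialised channel count is dangerous downstream, here is a minimal sketch of a consumer-side check that reports the offending value. The names `kMaxChannels` and `checkChannelCount`, the limit value and the message wording are all assumptions made for this example. None of them are taken from EcalUncalibRecHitProducerGPU.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical upper bound on the number of channels; the value is illustrative only.
constexpr std::uint32_t kMaxChannels = 1000;

// Hypothetical check a consumer could run before sizing buffers or launching kernels.
// Returns true if there are channels to process and false for an empty event.
// Throws if the count is implausibly large, e.g. because it was never initialised.
bool checkChannelCount(std::uint32_t nChannels, std::string const& label) {
  if (nChannels > kMaxChannels) {
    throw std::runtime_error("Number of channels for " + label + " is " + std::to_string(nChannels) +
                             ", which exceeds the maximum of " + std::to_string(kMaxChannels) +
                             "; the count may be uninitialised");
  }
  return nChannels != 0;
}
```

Printing the actual count alongside the limit is what makes such a message useful. A random, very large number points directly at a missing initialisation upstream.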
PR validation:
HLT configuration detailed in #34197 (comment) runs to completion now.
Backport of #34765