Fix assert failure in readLuminosityBlock #27158
We are having a rare, nonreproducible failure in EventProcessor::readLuminosityBlock in the IBs. The symptom is an assert because the pointer to the LuminosityBlockPrincipal is null. If the LuminosityBlockProcessingStatus object was being destroyed before calling resumeGlobalLumiQueue then such a failure could occur because the order of the member variables in LuminosityBlockProcessingStatus is wrong. We should free the LuminosityBlockPrincipal before resuming the lumi queue. I am not sure this will actually help though, because I can see no way to get into this state. The existing code tries to always call resumeGlobalLumiQueue, but this should eliminate at least one possible source for the assert we have seen.
The code-checks are being triggered in jenkins.
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-27158/10304
please test
The tests are being triggered in jenkins.
A new Pull Request was created by @wddgit (W. David Dagenhart) for master. It involves the following packages: FWCore/Framework @cmsbuild, @smuzaffar, @Dr15Jones can you please review it and eventually sign? Thanks. cms-bot commands are listed here
Comparison job queued.
Comparison is ready Comparison Summary:
+1
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @davidlange6, @slava77, @smuzaffar, @fabiocos (and backports should be raised in the release meeting by the corresponding L2)
+1
PR description:
We are having a rare, nonreproducible failure in
EventProcessor::readLuminosityBlock in the IBs.
The symptom is an assert because the pointer to
the LuminosityBlockPrincipal is null. If the
LuminosityBlockProcessingStatus object was being
destroyed before calling resumeGlobalLumiQueue
then such a failure could occur because the order
of the member variables in LuminosityBlockProcessingStatus
is wrong. We should free the LuminosityBlockPrincipal
before resuming the lumi queue. I am not sure this will
actually help though, because I can see no way to get
into this state. The existing code tries to always call
resumeGlobalLumiQueue, but this should eliminate at least
one possible source for the assert we have seen.
This should not change any behavior except maybe in some
complex case where we are already dealing with an unrelated
exception.
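The reasoning above relies on the C++ rule that non-static data members are destroyed in the reverse of their declaration order. The following is only a minimal sketch of that rule, not the actual LuminosityBlockProcessingStatus code: the types QueueResumer, Principal, StatusWrongOrder and StatusFixedOrder are hypothetical stand-ins, and the real class may hold its members differently.

```cpp
#include <memory>
#include <string>
#include <vector>

// Records the order in which cleanup steps run.
using Log = std::vector<std::string>;

// Hypothetical stand-in for whatever resumes the global lumi queue on destruction.
struct QueueResumer {
  explicit QueueResumer(Log& log) : log_(log) {}
  ~QueueResumer() { log_.push_back("resume queue"); }
  Log& log_;
};

// Hypothetical stand-in for the LuminosityBlockPrincipal.
struct Principal {
  explicit Principal(Log& log) : log_(log) {}
  ~Principal() { log_.push_back("free principal"); }
  Log& log_;
};

// Problematic layout: the resumer is declared last, so it is destroyed first
// and the queue resumes while the principal is still held.
struct StatusWrongOrder {
  explicit StatusWrongOrder(Log& log)
      : principal_(std::make_shared<Principal>(log)), resumer_(log) {}
  std::shared_ptr<Principal> principal_;
  QueueResumer resumer_;
};

// Intended layout: the principal is declared last, so it is released
// before the resumer's destructor runs.
struct StatusFixedOrder {
  explicit StatusFixedOrder(Log& log)
      : resumer_(log), principal_(std::make_shared<Principal>(log)) {}
  QueueResumer resumer_;
  std::shared_ptr<Principal> principal_;
};

// Builds and destroys a Status object, returning the observed cleanup order.
template <typename Status>
Log destructionOrder() {
  Log log;
  { Status status(log); }  // destructor runs at the end of this scope
  return log;
}
```

Calling destructionOrder with each of the two sketch types shows the only difference between them: which cleanup step runs first when the status object goes away.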
PR validation:
Run existing unit tests.