more robust protection against infinite loops in Ecal multifit (7_4_12_patchX) #11368
A new Pull Request was created by @bendavid (Josh Bendavid) for CMSSW_7_4_12_patchX.
more robust protection against infinite loops in Ecal multifit
It involves the following packages:
RecoLocalCalo/EcalRecAlgos
@cmsbuild, @cvuosalo, @slava77 can you please review it and eventually sign? Thanks.
Note that this branch is designed for requested bug fixes specific to the CMSSW_7_4_12 release.
For some more background, the first protection catches the cases where the loop returns to an identical state within a single iteration. The further observed cases have the loop cycling through repeating states with a period of 3 or 4 iterations. Cutting this sequence off at an arbitrary place in the middle is undesirable, because one state in the repeating sequence represents de-facto convergence while the others do not. Gradually loosening the convergence threshold should take care of this. (There is also a hard cutoff at the very end.)
@cmsbuild please test
The tests are being triggered in jenkins.
why CMSSW_7_4_12_patchX as a target?
@smuzaffar you probably know as well if CMSSW_7_4_12_patchX is usable by general code integration
@davidlange6 asked me to make the pull request here in addition to elsewhere. Let me know if you want me to immediately open all three in parallel.
all 4 in parallel. There is no IB to test on this branch.
@slava77 - this is just a patches branch for prompt and hlt since I had started integrating requests for the next full build. You can test in 74x.
I've confirmed identity (with respect to 7_4_12) for all rechits in 10 events from the Zee skim.
I will merge this and start a build while tests happen in 74x.
ugh, wait, the final protection doesn't actually have a break statement with it
On 9/18/15 1:22 PM, David Lange wrote:
I'm not really familiar with this machinery/logic. David, could you please write up something on a twiki or send to
(the missing break statement is added to this branch here. It's already included in the still-open 74x pull request.)
my goal is indeed to use the latest 74x, yes. But the concept of patches on the non-latest full build is not new. For documentation, you can look at my spring offline/computing week talk on releases for production (which is more complicated than what we actually have implemented, but the idea is the same)
Previous protection only caught a subset of possible cases.
Even rarer infinite loops still observed in HLT. A more refined solution for this may be introduced later, but this should be very safe. (And an absolute cutoff has been added for the worst case scenario)
Can reproduce in 7_4_12_patch1 with
/afs/cern.ch/user/a/avetisya/public/StuckEvent/hlt.py
Still running some tests myself to make doubly sure results are unchanged for other rechits.