Fix automated pixel pair mitigation to include "Fed25" information (10_1_X) #23064
A new Pull Request was created by @makortel (Matti Kortelainen) for CMSSW_10_1_X. It involves the following packages: RecoTracker/TkTrackingRegions @perrotta, @cmsbuild, @slava77 can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
@cmsbuild, please test |
The tests are being triggered in jenkins. |
type bugfix |
backport of #23052 |
(there are no code-checks for non-master PRs?) |
I already commented in #23052 that I find this update more of an improvement, not a fully proper bugfix after taking into account the validation of the auto-mitigation feature in 10X. |
Sorry Slava, I have to strongly disagree. It was supposed to be there and is mostly valuable at HLT. This fix does not affect MC, so it is not a problem. @fabiocos @davidlange6 @venturia @veszpv @boudoul @mtosi |
On 4/26/18 6:15 AM, Vincenzo Innocente wrote:
and in Express (where it is currently off). In prompt the PixelQuality PCL
should create a proper IOV and the very same regions will then be "mitigated".
So w/o this bug fix Express and Prompt will differ for no reason (and
Express and HLT as well).
Please elaborate on the impact of this for PCL.
Is the pixel-less iteration used in PCL for marking IOVs and defining
payloads for the tracker components?
|
sorry, I meant pixel pair |
Maybe @tvami could comment on the procedure. |
Sorry, I did not express it well. So w/o this fix, Express and Prompt will differ for the same dead region. Mitigation was supposed to trigger on FED25. It is part of the core strategy and must be enabled as planned asap. |
On 4/26/18 6:32 AM, Vincenzo Innocente wrote:
Sorry, I did not express it well.
In PCL, PixelQuality is supposed to be re-evaluated (using pixel-DPG
info) and a new IOV created.
In most cases the region identified by FED25 errors will then
become a "dead ROC", and pixel pair mitigation will therefore trigger for
those regions in Prompt.
So w/o this fix, Express and Prompt will differ for the same dead region.
(Or, if PixelQuality is not updated in PCL, we get inconsistent behavior
between runs with updated PixelQuality and runs where it was missed, as
happened in 2017.) Triggering mitigation on FED25 errors is safer.
Mitigation was supposed to trigger on FED25. It is part of the core
strategy and must be enabled as planned asap.
Thank you for the clarification.
I understand now that changes in this PR will predominantly affect only
the Express stream reco.
What is the fraction of ROCs that would still show up as FED25 and not a
part of bad components for the purpose of the prompt reco (assuming the
PCL is working as expected)?
|
There is a crucial feature which is forgotten in this discussion: the FED25 information is per event, while the PCL information will be, at best, per lumisection. Not using FED25 is a downgrade of the system. |
On 26 Apr, 2018, at 4:09 PM, Slava Krutelyov ***@***.***> wrote:
What is the fraction of ROCs that would still show up as FED25 and not a
part of bad components for the purpose of the prompt reco (assuming the
PCL is working as expected)?
Good question: no clue.
Maybe some stuck TBM. I have no idea of the granularity of the PCL, or of what decision is taken
for ROCs that die in the middle of a run.
|
-1 Tested at: 2a73834 You can see the results of the tests here: I found the following errors while testing this PR Failed tests: RelVals
The relvals timed out after 2 hours. |
Comparison not run due to runTheMatrix errors (RelVals and Igprof tests were also skipped) |
The tests are being triggered in jenkins. |
Comparison job queued. |
Comparison is ready Comparison Summary:
|
To the best of my knowledge, FED25 does not exist in simulation (modules are marked inactive only based on the SiPixelQuality payload in the GT). I'm not sure if the recent #22465 changes the picture, but it is disabled by default. Adding more people who could confirm for sure: @tsusa @tvami @jkarancs @veszpv |
I can confirm that FED25 does not exist in simulation. We are actually working on how to include all the failures in the detector simulation; Deborah will have a status report about this at the next pixel offline meeting. Regarding the PCL workflow, three payloads are produced: (1) the permanently bad ROCs, (2) ROCs with FED25 errors, excluding the permanently bad ones, and (3) everything else. From these three we merge (1) + (3) to get the payload for prompt. (2) is not included, to avoid double counting; it is used for monitoring purposes. |
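To make the payload bookkeeping above concrete, here is a minimal Python sketch. The ROC identifiers, the set names, and the `prompt_payload` helper are all hypothetical and are not taken from CMSSW or the PCL code; the sketch only illustrates that merging (1) + (3) keeps the FED25-only ROCs out of the prompt payload.

```python
# Hypothetical sketch: ROC ids and function names are illustrative only,
# they do not correspond to any CMSSW or PCL interface.

def prompt_payload(permanently_bad, fed25_only, everything_else):
    """Merge payloads (1) and (3); payload (2) is kept aside for monitoring."""
    # fed25_only is deliberately unused: ROCs flagged event by event via
    # FED25 must not be counted a second time through the quality payload.
    return set(permanently_bad) | set(everything_else)

permanently_bad = {"roc_a", "roc_b"}  # payload (1)
fed25_only = {"roc_c"}                # payload (2), monitoring only
everything_else = {"roc_d"}           # payload (3)

merged = prompt_payload(permanently_bad, fed25_only, everything_else)
print(sorted(merged))
```

Under these assumptions the FED25-only ROC never enters the merged set, which is the "no double counting" property described in the comment.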
Simulation: Indeed, FED-signaled errors are not simulated at all. The effect of bad channels (all ROCs in a channel delivering no data) will probably be added by a mechanism similar to the dynamic inefficiency. PR #22465, instead, is something totally different: it makes the random dropping of single pixels configurable individually for every ROC (the mechanism existed in the past, but was only configurable per layer).
Collision: The PCL bad component feature is not yet fully ready. When ready, its first consumer will be prompt reco, as it derives the information by processing express data. In contrast, the FED25 error is available both in HLT and Express. Also, as @venturia pointed out, the FED25 error is delivered event by event based on knowledge of the state of the system, not from a statistical measurement per lumisection. In fact, this is why the PCL mechanism is (supposed to be) constructed such that FED25 bad components are not marked bad by the payload, in order to avoid double counting. I agree with @VinInn: this should be used whenever possible. |
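The granularity argument can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the event records, the 50% threshold, and both function names are hypothetical, and the per-lumisection criterion is only a stand-in for whatever statistical measurement the PCL actually performs.

```python
# Hypothetical sketch: event layout, threshold, and names are assumptions.

def per_event_mask(events, roc):
    """Mask the ROC only in events where a FED25 error was reported for it."""
    return [roc in ev["fed25_rocs"] for ev in events]

def per_lumi_mask(events, roc, threshold=0.5):
    """Mask the ROC for the whole lumisection if it failed in enough events."""
    n_bad = sum(roc in ev["fed25_rocs"] for ev in events)
    bad = n_bad / len(events) >= threshold
    return [bad] * len(events)

# One lumisection in which the ROC starts failing at the third event.
lumi = [
    {"fed25_rocs": set()},
    {"fed25_rocs": set()},
    {"fed25_rocs": {"roc_c"}},
    {"fed25_rocs": {"roc_c"}},
    {"fed25_rocs": {"roc_c"}},
]

event_mask = per_event_mask(lumi, "roc_c")
lumi_mask = per_lumi_mask(lumi, "roc_c")
print(event_mask)
print(lumi_mask)
```

In this toy case the per-event mask follows the ROC state exactly, while the per-lumisection decision also masks the first two events in which the ROC was still working.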
Just to make sure, is the information of such ROCs planned to be delivered to the reconstruction with the same mechanism as the FED25? |
+1
This is expected to be merged as soon as we get feedback from the 10_2_0_pre2 relvals. @fabiocos |
This pull request is fully signed and it will be integrated in one of the next CMSSW_10_1_X IBs (tests are also fine) and once validation in the development release cycle CMSSW_10_2_X is complete. This pull request will now be reviewed by the release team before it's merged. @davidlange6, @slava77, @smuzaffar, @fabiocos (and backports should be raised in the release meeting by the corresponding L2) |
@slava77 I agree, in any case we said that we are waiting for the CMSSW_10_2_0_pre2 validation to back-port this update/fix into 10_1_X |
+1 |
Backport of #23052. Original description
@VinInn