[HGCal] Run HGCalValidator only on the hard scatterer event and correcting the mess from merging #27478
The code-checks are being triggered in jenkins.
-code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-27478/10808
Code check has found code style and quality issues which could be resolved by applying following patch(s)
The code-checks are being triggered in jenkins.
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-27478/10810
A new Pull Request was created by @apsallid for master. It involves the following packages: Validation/HGCalValidation @andrius-k, @kmaeshima, @schneiml, @cmsbuild, @jfernan2, @fioriNTU can you please review it and eventually sign? Thanks. cms-bot commands are listed here
please test
The tests are being triggered in jenkins.
@@ -192,6 +192,18 @@ void HGCalValidator::dqmAnalyze(const edm::Event& event,
histoProducerAlgo_->fill_info_histos(histograms.histoProducerAlgo, totallayers_to_monitor_);
}

//Consider CaloParticles coming from the hard scatterer, excluding the PU contribution.
std::vector<CaloParticle> caloParticlesFromHardScat;
for (auto& it_caloPart : caloParticles) {
Could this be const auto&?
<< " with BX: " << it_caloPart.g4Tracks()[0].eventId().bunchCrossing() << std::endl;
continue;
}
caloParticlesFromHardScat.push_back(it_caloPart);
This is making a copy of each hard-scatter caloparticle. Some alternatives:
- store pointers to hard-scatter caloparticles
- store vector<size_t> with a list of hard-scatter indices, and then select only those indices from the original vector
Both of these alternatives have worse memory locality, but reduce memory usage and copying overhead. It would be interesting to see if these improve the performance any further.
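The two alternatives listed above can be sketched as follows. This is only an illustration under stated assumptions, not code from the PR: `ToyParticle` is a hypothetical stand-in for `CaloParticle`, its `event` and `bunchCrossing` fields stand in for what `g4Tracks()[0].eventId()` provides, and the criterion in `isFromHardScatter` (both equal to 0) is an assumption about the selection.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for CaloParticle: only the fields used by the selection.
struct ToyParticle {
  int event;          // assumption: 0 marks the hard-scatter event
  int bunchCrossing;  // assumption: 0 marks the in-time bunch crossing
  double energy;
};

// Assumed selection criterion, for illustration only.
inline bool isFromHardScatter(const ToyParticle& p) {
  return p.event == 0 && p.bunchCrossing == 0;
}

// Alternative 1: store pointers into the original collection (no copies).
// The pointers stay valid only while `all` is neither resized nor destroyed.
std::vector<const ToyParticle*> selectPointers(const std::vector<ToyParticle>& all) {
  std::vector<const ToyParticle*> selected;
  for (const auto& p : all) {
    if (isFromHardScatter(p)) {
      selected.push_back(&p);
    }
  }
  return selected;
}

// Alternative 2: store indices into the original collection.
std::vector<std::size_t> selectIndices(const std::vector<ToyParticle>& all) {
  std::vector<std::size_t> selected;
  for (std::size_t i = 0; i < all.size(); ++i) {
    if (isFromHardScatter(all[i])) {
      selected.push_back(i);
    }
  }
  return selected;
}
```

Either container holds one machine word per selected particle instead of a full copy; the price is the extra indirection on every access.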
Hi @kpedro88,
Using pointers or the selected indices has far-reaching consequences for the whole package, since the inputs to the vast
majority of the methods in Validation/HGCalValidation/src/HGVHistoProducerAlgo.cc would have to change, and with them
a lot of code. Could we please go ahead with the emplace_back?
It's fine for now, but if the memory usage becomes concerning, these alternatives should be revisited.
@apsallid move the filter on the CaloParticles to where you actually use them, i.e. one level down the stack.
That way, you do not have to copy anything.
@rovere I tested that, but for some reason it is really slow. Maybe that is because we need the check in 5 different places, and the code in each of those places still loops over all CaloParticles, which number on the order of 10000. By reducing the collection to ~200 CaloParticles once at the beginning, we gain speed. I do not know what else to think.
@apsallid that doesn't sound like a good explanation. Are you sure you did not, by any chance, leave some part of the code without the selection?
@apsallid otherwise use the vector of indices down the stack, so that you don't need to loop over and over and make the very same selection.
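A minimal sketch of that suggestion: build the index list once and hand it to each consumer further down the stack, so the selection is not repeated in every method. All names here are hypothetical (`ToyCaloParticle`, `hardScatterIndices`, `totalEnergy`, `countAboveThreshold`); they are not functions of HGVHistoProducerAlgo, and the selection criterion (event and bunch crossing both 0) is an assumption.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for CaloParticle.
struct ToyCaloParticle {
  int event;          // assumption: 0 marks the hard-scatter event
  int bunchCrossing;  // assumption: 0 marks the in-time bunch crossing
  double energy;
};

// Run the selection once, at the top of the call stack.
std::vector<std::size_t> hardScatterIndices(const std::vector<ToyCaloParticle>& all) {
  std::vector<std::size_t> selected;
  for (std::size_t i = 0; i < all.size(); ++i) {
    if (all[i].event == 0 && all[i].bunchCrossing == 0) {
      selected.push_back(i);
    }
  }
  return selected;
}

// Hypothetical consumer: loops over the selected indices only, not the full collection.
double totalEnergy(const std::vector<ToyCaloParticle>& all,
                   const std::vector<std::size_t>& selected) {
  double sum = 0.0;
  for (std::size_t i : selected) {
    sum += all[i].energy;
  }
  return sum;
}

// A second hypothetical consumer reusing the same index list.
std::size_t countAboveThreshold(const std::vector<ToyCaloParticle>& all,
                                const std::vector<std::size_t>& selected,
                                double threshold) {
  std::size_t n = 0;
  for (std::size_t i : selected) {
    if (all[i].energy > threshold) {
      ++n;
    }
  }
  return n;
}
```

With roughly 10000 CaloParticles and about 200 selected, each consumer then iterates over about 200 entries, and nothing is copied.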
Comparison job queued.
The code-checks are being triggered in jenkins.
@jfernan2 We understood the cause of the NaNs and it should be fine now. I will update the description to make clear what was causing the problem, and to include the new average timing of the module, which is a little faster.
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-27478/10872
Pull request #27478 was updated. @andrius-k, @kmaeshima, @schneiml, @cmsbuild, @jfernan2, @fioriNTU can you please check and sign again.
please test
The tests are being triggered in jenkins.
Comparison job queued.
Comparison is ready Comparison Summary:
+1
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @davidlange6, @slava77, @smuzaffar, @fabiocos (and backports should be raised in the release meeting by the corresponding L2)
In a private test on wf 20634.99 I see in step4:
So I would say that the original problem is fixed.
+1
PR description:
This PR addresses issue #27417. We decided (@rovere, @felicepantaleo, @amartelli)
to follow PR #26915 and consider only CaloParticles coming from the hard scatterer,
excluding the PU contribution. Running the Timing Service on 10 events of workflow 20234.99,
we see that the hgcalValidator module takes 2.88 s/event on average.
We should also note that the hgcalValidator module should be even faster when run
with TICL multiclusters, since at the moment the number of multiclusters produced is huge
(~10^4 multiclusters per event for wf 20234.99).
Furthermore, we noted:
PR validation:
We tested the PR with wf 20234.99 (TTbar with 200PU).