Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improving performance of PuppiProducer #31164

Merged
merged 6 commits into from Aug 25, 2020

Conversation

missirol
Copy link
Contributor

PR description:

This PR is an attempt to improve the performance (in terms of execution time) of the PuppiProducer plugin.

No changes in the outputs are expected.

Most of these technical changes are not in the producer itself, but in the PuppiContainer class; all changes are related (directly or indirectly) to the function var_within_R (the "alpha" variable on which the PUPPI weight is based), which appears to be the most time-consuming part of the algorithm.

The table below summarizes a few timing estimates for the puppi module in the RECO step for a Run-3 and Phase-2 workflow.

wf time pre-PR [ms] time post-PR [ms]
Run-3 RECO (QCD MC) 74.2 33.2
Phase-2 RECO (QCD MC) 620.1 255.9

The numbers are taken from the FastTimerService, running on 100 events; they suggest that the overall reduction in timing is approx. 55-60%.

Here is a short summary of the changes in the PR (below, goodVar and var_within_R are used interchangeably, since the former function is simply a call to the latter):

  • PuppiCandidate is converted to a simple struct, removing its inheritance from fastjet::PseudoJet (looking at the PUPPI implementation as a whole, the usage of fastjet does not seem necessary); this change accounts for most of the speedup (approx. 35-40% speedup);

  • skip the calculation of goodVar when not needed, e.g. for candidates that are assigned a default weight (0, or 1); this seems to bring roughly a 10-15% improvement in performance.

  • another bit of speedup is obtained by taking one condition (puppi_id != 3) out of var_within_R, and using it just once in PuppiContainer::initialize to fill the collections later used as inputs to goodVar; this leads to adding one more data member (a vector of PuppiCandidates) to the PuppiContainer class.

In this PR, there are two spots where things could certainly be improved (but improving them will lead to small changes in the outputs):

  • in PuppiContainer::initialize, an instance of fastjet::PseudoJet (here) is used to obtain the 4-momenta of the candidates used by PUPPI; this was the only way I could find to maintain the same numbers as pre-PR (this may be due to the fact that the inputs used for reset_PtYPhiM are floats, and then fastjet recomputes and stores the p4 using doubles); one way to simplify this (assuming the differences in the outputs are ultimately not significant) would be to remove PuppiCandidate altogether and simply use the RecoObj class already in use in the PuppiProducer (maybe, changing the floats in RecoObj to doubles).

  • in goodVar, the DeltaR2 between two given candidates is calculated (at most) twice (here), first with rapidity, then with pseudo-rapidity; this is not modified by this PR (otherwise, the outputs would inevitably change), but maybe it’s something that could be simplified.

FYI: @ahinzmann @lathomas @kirschen

PR validation:

Checked that the PUPPI weights of all PF candidates are unchanged for 200 events (100 for a Run-3 workflow, and 100 for a Phase-2 workflow), for both puppi (used for jets) and puppiNoLep (used for MET).
Standard workflows, i.e. runTheMatrix.py -l limited -i all --ibeos, ran successfully.

if this PR is a backport please specify the original PR and why you need to backport that PR:

N/A

@slava77
Copy link
Contributor

slava77 commented Aug 18, 2020

@mrodozov @smuzaffar
do we have some problems with webhooks?
this one was posted for 14 minutes already (at the time of my message)

@mrodozov
Copy link
Contributor

mrodozov commented Aug 18, 2020

@slava77 the remote calls to jenkins hanged with proxy error 502 and I'm trying to restart the service but I don't have sudo rights to do it. I'm trying to do it from the java CLI using the keys for cmsbuild

@davidlange6
Copy link
Contributor

davidlange6 commented Aug 18, 2020 via email

@mrodozov
Copy link
Contributor

@davidlange6 do you by any chance happen to have root permissions for the jenkins03 machine ?

@davidlange6
Copy link
Contributor

davidlange6 commented Aug 18, 2020 via email

@cmsbuild
Copy link
Contributor

The code-checks are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-31164/17796

  • This PR adds an extra 24KB to repository

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @missirol (Marino Missiroli) for master.

It involves the following packages:

CommonTools/PileupAlgos

@perrotta, @jpata, @cmsbuild, @santocch, @slava77 can you please review it and eventually sign? Thanks.
@rappoccio, @ahinzmann, @riga, @jdolen, @gkasieczka, @hatakeyamak, @clelange this is something you requested to watch as well.
@silviodonato, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@slava77
Copy link
Contributor

slava77 commented Aug 18, 2020

@cmsbuild please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Aug 18, 2020

The tests are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

+1
Tested at: 0c3b885
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-63bdad/8799/summary.html
CMSSW: CMSSW_11_2_X_2020-08-17-2300
SCRAM_ARCH: slc7_amd64_gcc820

@cmsbuild
Copy link
Contributor

Comparison job queued.

@cmsbuild
Copy link
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-63bdad/8799/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 35
  • DQMHistoTests: Total histograms compared: 2608246
  • DQMHistoTests: Total failures: 1
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2608197
  • DQMHistoTests: Total skipped: 48
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 34 files compared)
  • Checked 149 log files, 22 edm output root files, 35 DQM output files

@cmsbuild
Copy link
Contributor

Comparison job queued.

@cmsbuild
Copy link
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-63bdad/8853/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 145 differences found in the comparisons
  • DQMHistoTests: Total files compared: 35
  • DQMHistoTests: Total histograms compared: 2608222
  • DQMHistoTests: Total failures: 294
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2607906
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 34 files compared)
  • Checked 149 log files, 22 edm output root files, 35 DQM output files

@santocch
Copy link

+1

@slava77
Copy link
Contributor

slava77 commented Aug 24, 2020

+1

for #31164 e0f02d3

curiously enough, there was no actual overlap in the context lines with #31174

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @silviodonato, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

@missirol
Copy link
Contributor Author

the last commits removed dependence on fastjet, but introduced a numerical difference

@slava77

In case it helps: numerical changes in the outputs are introduced only by the very last commit of the PR, i.e. e0f02d3
(we didn't verify this explicitly in the PR review, but I had checked it locally).

@silviodonato
Copy link
Contributor

+1

@cmsbuild cmsbuild merged commit 25a40ac into cms-sw:master Aug 25, 2020
@missirol
Copy link
Contributor Author

@slava77 , one late question from my side: would it be okay to backport this (except for the last commit) to 11_1_X, so that it can be used for the Phase-2 HLT TDR studies?

@slava77
Copy link
Contributor

slava77 commented Aug 25, 2020

@slava77 , one late question from my side: would it be okay to backport this (except for the last commit) to 11_1_X, so that it can be used for the Phase-2 HLT TDR studies?

sure, but please limit it to the commit range that does not change the results.

@kpedro88
Copy link
Contributor

@missirol thanks for this PR! It's a worthy followup to my initial effort in #23270.

Could you make the same limited backport to 10_6_X? It would be nice to propagate this speedup to users for ultra-legacy analysis.

@missirol
Copy link
Contributor Author

Could you make the same limited backport to 10_6_X? It would be nice to propagate this speedup to users for ultra-legacy analysis.

I'm looking into a backport of this PR to 10_6_X (taking into account the no-change policy). It would not be entirely trivial, because the Puppi implementation in earlier releases (incl. 10_6) is significantly different; nevertheless, I'll give it a try.

@slava77
Copy link
Contributor

slava77 commented Aug 27, 2020

I'm looking into a backport of this PR to 10_6_X (taking into account the no-change policy). It would not be entirely trivial, because the Puppi implementation in earlier releases (incl. 10_6) is significantly different; nevertheless, I'll give it a try.

Thank you.

@missirol
Copy link
Contributor Author

Backport to 10_6_X tried in #31290.

@missirol missirol deleted the devel_112X_puppi_speedup2 branch September 16, 2020 17:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants