Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[10_6_X] improving performance of PuppiProducer #31290

Merged
merged 2 commits into from Sep 4, 2020

Conversation

missirol
Copy link
Contributor

@missirol missirol commented Aug 29, 2020

PR description:

This PR is an attempt to backport the changes in #31164.

This backport follows in large part the original PR, but it is not entirely trivial; this is due to the significant differences in the PuppiProducer's implementation in 10_6_X (wrt later releases), and the need to not introduce any numerical changes in the outputs (no-change policy).

These requirements led to adding more members to the PuppiCandidate struct (compared to the original PR):

  • an int named puppi_register, due to the fact that in this release a given Puppi candidate has two IDs associated to it (named puppi "register" and puppi "id" in the source code), while in 11_X only one is used (currently called id);

  • four floats (px, py, pz, e) for the candidate's p4; this is used in the PuppiProducer when storing the candidates kinematics (see puppiP4s in the PuppiProducer; in 11_X, things are done differently and this change was not needed); recomputing (px,py,pz,e) on-the-fly based on (pt,eta,phi,m) would introduce small numerical changes in the outputs, and the only way I found to avoid this was to add more members to the PuppiCandidate struct.

The improvement in computational performance is approx. 45% (from ~27ms to ~15ms for one instance of the PuppiProducer in the PAT step of a 2017 MC workflow); this is slighly less than the speedup in the original PR (which was closer to 60%), maybe partly due to the fact that the PuppiCandidate struct here includes more data members.

A couple of minor changes from #31164 in PuppiContainer.h (renaming of a data member, and removal of an unused getter) are avoided here for simplicity.

No changes in the outputs are expected.

FYI: @ahinzmann @lathomas @kirschen

PR validation:

Checked that the PUPPI weights of all PF candidates are unchanged for 500 events of a 2017 MC workflow (PAT step), for both puppi (used for jets) and puppiNoLep (used for MET).
Standard workflows, i.e. runTheMatrix.py -l limited -i all --ibeos, ran successfully.

if this PR is a backport please specify the original PR and why you need to backport that PR:

#31164

(with limited modifications needed to enforce the no-change policy)

This speeedup could be useful for UL re-miniAOD and analysis.

@cmsbuild
Copy link
Contributor

cmsbuild commented Aug 29, 2020

A new Pull Request was created by @missirol (Marino Missiroli) for CMSSW_10_6_X.

It involves the following packages:

CommonTools/PileupAlgos

@perrotta, @jpata, @cmsbuild, @santocch, @slava77 can you please review it and eventually sign? Thanks.
@rappoccio, @ahinzmann, @riga, @jdolen, @gkasieczka, @hatakeyamak, @clelange this is something you requested to watch as well.
@silviodonato, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@perrotta
Copy link
Contributor

backport of #31164

@perrotta
Copy link
Contributor

@cmsbuild please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Aug 29, 2020

The tests are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

+1
Tested at: a27570a
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-d0150e/9000/summary.html
CMSSW: CMSSW_10_6_X_2020-08-28-2300
SCRAM_ARCH: slc7_amd64_gcc700

@cmsbuild
Copy link
Contributor

Comparison job queued.

@cmsbuild
Copy link
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-d0150e/9000/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 34
  • DQMHistoTests: Total histograms compared: 3214618
  • DQMHistoTests: Total failures: 1
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3214283
  • DQMHistoTests: Total skipped: 334
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 33 files compared)
  • Checked 140 log files, 14 edm output root files, 34 DQM output files

@silviodonato
Copy link
Contributor

In https://github.com/cms-sw/cmssw/pull/31164/files#diff-1b7e1647edfc1e35ddf8671132acf8f4L2 there are also changes in CommonTools/PileupAlgos/interface/PuppiContainer.h that are not backported here. Could you confirm you did not backported on purpose? If yes, please add a line in the PR description

@missirol
Copy link
Contributor Author

Hi @silviodonato ,

I didn't backport those changes on purpose because in 10_6_X there is no need to introduce what in 11_X is named fPFParticlesForVar, and the variable fPVFrac is still used in 10_6 (was not used anymore in 11_X, so it was removed).

The only remaining change would have been to rename fChargedPV to fPFParticlesForVarChargedPV and removing the getter pvParticles() (which is never used anywhere); I decided to avoid those changes b/c inconsequential; if it's better to be consistent with the original PR, I can add those, of course.

I added the following tentative comment to the PR description:

A couple of minor changes from #31164 in PuppiContainer.h (renaming of a data member, and removal of an unused getter) are avoided here for simplicity.

@kpedro88
Copy link
Contributor

kpedro88 commented Sep 2, 2020

thanks @missirol !

@santocch
Copy link

santocch commented Sep 2, 2020

+1

// or (2) are required for computations inside puppi-algos (see call to PuppiAlgo::add below)
double pVal = -1;
bool const getsDefaultWgtIfApplyCHS = iConstits[i0].id == 1 or iConstits[i0].id == 2;
if (not(fApplyCHS and getsDefaultWgtIfApplyCHS) or (std::abs(iConstits[i0].eta) < fPuppiAlgo[pPupId].etaMaxExtrap() and iConstits[i0].puppi_register != 0)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The master version uses and getsDefaultWgtIfApplyCHS in the second conditional instead of and iConstits[i0].puppi_register != 0. Are these equivalent? If yes, a redefinition of getsDefaultWgtIfApplyCHS may be in order.
If no, please elaborate
(in this case, the separate definition of getsDefaultWgtIfApplyCHS seems unnecessary because it is not reused anywhere else).

also a linebreak at/after or would be nice

Copy link
Contributor Author

@missirol missirol Sep 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In practice, I think the condition id == 1 or id == 2 corresponds to register != 0 (looking at how puppi_register is filled here), but in principle the two things can differ (or at least, I wanted to make no assumptions).

The logic behind how these changes are implemented is the following: goodVar needs to be calculated when the final weight of the candidate is non-trivial (this is the first condition, which corresponds to the logical NOT of these two cases), or when the goodVar of a given candidate is needed internally by PuppiAlgo::add, here; in 10_6_X, the latter condition translates to (.. < etaMaxExtrap) and puppi_register != 0, while in 11_X_Y the criterion is (.. < etaMaxExtrap) and (id == 1 or id == 2) (see here).

So, the logic is the same here and in the original PR; the end result looks different just as a consequence of the fact that the Puppi{Container,Algo} implementation is different in the two releases.

I added a commit to remove the temporary variable getsDefaultWgtIfApplyCHS, which was indeed only used once, and took the opportunity to add the linebreak.

@cmsbuild
Copy link
Contributor

cmsbuild commented Sep 3, 2020

Pull request #31290 was updated. @perrotta, @jpata, @cmsbuild, @santocch, @slava77 can you please check and sign again.

@slava77
Copy link
Contributor

slava77 commented Sep 3, 2020

@cmsbuild please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Sep 3, 2020

The tests are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

cmsbuild commented Sep 3, 2020

+1
Tested at: 9ed0cf6
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-d0150e/9106/summary.html
CMSSW: CMSSW_10_6_X_2020-09-03-1100
SCRAM_ARCH: slc7_amd64_gcc700

@cmsbuild
Copy link
Contributor

cmsbuild commented Sep 3, 2020

Comparison job queued.

@cmsbuild
Copy link
Contributor

cmsbuild commented Sep 3, 2020

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-d0150e/9106/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 34
  • DQMHistoTests: Total histograms compared: 3214631
  • DQMHistoTests: Total failures: 2
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3214295
  • DQMHistoTests: Total skipped: 334
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 33 files compared)
  • Checked 140 log files, 14 edm output root files, 34 DQM output files

@slava77
Copy link
Contributor

slava77 commented Sep 3, 2020

+1

for #31290 9ed0cf6

  • code changes are in line with the PR description and the follow up review. This is a backport of improving performance of PuppiProducer #31164 somewhat in spirit, due to substantial evolution of this code.
  • jenkins tests pass and comparisons with the baseline show no (relevant) differences as expected.

@silviodonato
Copy link
Contributor

merge

@cmsbuild cmsbuild merged commit 1464697 into cms-sw:CMSSW_10_6_X Sep 4, 2020
@missirol missirol deleted the devel_106X_puppi_speedup5 branch September 4, 2020 16:52
@santocch
Copy link

santocch commented Sep 5, 2020

+1

@cmsbuild
Copy link
Contributor

cmsbuild commented Sep 5, 2020

This pull request is fully signed and it will be integrated in one of the next CMSSW_10_6_X IBs (tests are also fine) and once validation in the development release cycle CMSSW_11_2_X is complete. This pull request will be automatically merged.

@slava77 slava77 mentioned this pull request Sep 17, 2020
41 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants