speedup: switch EgammaTools EnergyScaleCorrection scales and smearings to use std::map #32053

slava77
Contributor

@slava77 slava77 commented Nov 6, 2020

The technical regression became more obvious after a recent update in the scale/smearing introduced in #31936. E.g. the start time of the 2018 UL re-miniAOD workflow 136.88811 went up to about 10 minutes.

Apparently the EnergyScaleCorrection used an equivalent of a std::map implemented via a vector with repeated calls to sort during insertion, which led to the startup time explosion.
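For illustration, here is a minimal sketch of the two insertion patterns described above. It is not the actual EnergyScaleCorrection code: `Category`, `Correction`, `addSorted`, `addMapped` and `sameOrderedContent` are hypothetical names, and the key is assumed to be a plain integer rather than the real category type.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the real category and correction types.
using Category = int;
using Correction = double;
using Entry = std::pair<Category, Correction>;
using CorrectionMap = std::map<Category, Correction>;

// Slow pattern: keep a vector ordered by re-sorting after every insertion.
// Each call costs O(n log n), so filling n entries costs O(n^2 log n).
void addSorted(std::vector<Entry>& v, Category c, Correction x) {
  v.emplace_back(c, x);
  std::sort(v.begin(), v.end());
}

// Replacement pattern: std::map keeps keys ordered at O(log n) per insertion.
void addMapped(CorrectionMap& m, Category c, Correction x) {
  m.emplace(c, x);
}

// Fills both containers with the same n entries (keys in descending order)
// and checks that they end up with the same ordered content.
bool sameOrderedContent(std::size_t n) {
  std::vector<Entry> v;
  CorrectionMap m;
  for (std::size_t i = n; i > 0; --i) {
    addSorted(v, static_cast<Category>(i), 0.001 * i);
    addMapped(m, static_cast<Category>(i), 0.001 * i);
  }
  assert(std::is_sorted(v.begin(), v.end()));
  return v.size() == m.size() &&
         std::equal(v.begin(), v.end(), m.begin(),
                    [](const Entry& a, const CorrectionMap::value_type& b) {
                      return a.first == b.first && a.second == b.second;
                    });
}
```

Both containers hold the same ordered entries at the end; only the cost of getting there differs, which is consistent with the startup-only nature of the slowdown.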

With this PR, the job start time for the 2018 UL re-miniAOD workflow 136.88811 went down by about a factor of 10
from https://slava77sk.web.cern.ch/slava77sk/reco/cgi-bin/igprof-navigator/CMSSW_11_2_X_2020-11-04-1100-orig.136.88811.step2.1.pp/9
to https://slava77sk.web.cern.ch/slava77sk/reco/cgi-bin/igprof-navigator/CMSSW_11_2_X_2020-11-04-1100-sign1111.136.88811.step2.1.pp/26
(the profile links are for 1 event processed).

the time cost of calling EnergyScaleCorrection::addScale went down by almost 4 orders of magnitude
https://slava77sk.web.cern.ch/slava77sk/reco/cgi-bin/igprof-navigator/CMSSW_11_2_X_2020-11-04-1100-orig.136.88811.step2.1.pp/21 =>
https://slava77sk.web.cern.ch/slava77sk/reco/cgi-bin/igprof-navigator/CMSSW_11_2_X_2020-11-04-1100-sign1111.136.88811.step2.1.pp/1011

@cmsbuild
Contributor

cmsbuild commented Nov 6, 2020

The code-checks are being triggered in jenkins.

@cmsbuild
Contributor

cmsbuild commented Nov 6, 2020

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-32053/19646

  • This PR adds an extra 20KB to repository

@cmsbuild
Contributor

cmsbuild commented Nov 6, 2020

A new Pull Request was created by @slava77 (Slava Krutelyov) for master.

It involves the following packages:

RecoEgamma/EgammaTools

@perrotta, @jpata, @cmsbuild, @slava77 can you please review it and eventually sign? Thanks.
@Sam-Harper, @jainshilpi, @lgray, @sobhatta, @afiqaize, @varuns23 this is something you requested to watch as well.
@silviodonato, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@slava77
Contributor Author

slava77 commented Nov 6, 2020

@cmsbuild please test

@cmsbuild
Contributor

cmsbuild commented Nov 6, 2020

The tests are being triggered in jenkins.

@cmsbuild
Contributor

cmsbuild commented Nov 7, 2020

+1
Tested at: 373d326
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-320949/10565/summary.html
CMSSW: CMSSW_11_2_X_2020-11-06-1100
SCRAM_ARCH: slc7_amd64_gcc820

@cmsbuild
Contributor

cmsbuild commented Nov 7, 2020

Comparison job queued.

@cmsbuild
Contributor

cmsbuild commented Nov 7, 2020

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-320949/10565/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 35
  • DQMHistoTests: Total histograms compared: 2544144
  • DQMHistoTests: Total failures: 7
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2544115
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 34 files compared)
  • Checked 149 log files, 22 edm output root files, 35 DQM output files

@slava77
Contributor Author

slava77 commented Nov 7, 2020

+1

for #32053 373d326

  • full tests confirm no differences
  • logs for 136.88811 show a decrease in the startup time, in agreement with the local pre-submission tests

@cmsbuild
Contributor

cmsbuild commented Nov 7, 2020

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @silviodonato, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

@slava77
Contributor Author

slava77 commented Nov 7, 2020

@bainbrid
you may want to use this to speed up the tests

@silviodonato
Contributor

+1

@cmsbuild cmsbuild merged commit 0e39160 into cms-sw:master Nov 7, 2020
@slava77
Contributor Author

slava77 commented Nov 10, 2020

@Sam-Harper @shervin86
please check that the update in this PR is OK and that I didn't miss some subtleties.
(I hope I wasn't too trigger-happy in signing off after seeing no differences in the workflows where the smearing is used).
Looking back at #22531, apparently the question about using a vector vs a map did not come up.
Thank you.

@slava77
Contributor Author

slava77 commented Nov 10, 2020

curiously, the earlier version in #22308 used std::map

correction_map_t scales, scales_not_defined;
correction_map_t smearings, smearings_not_defined;

@Sam-Harper
Contributor

So the standard vector should be faster in the event loop, though I understand it's slow at the start. It could probably be fixed to be made faster in the event loop. It's also got significantly slower since it was first written, due to many more categories being used.

It may well be that we improve this in the future, but I think that shouldn't stop improvements now.
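As a rough sketch of what such a future improvement might look like (an assumption, not something proposed in this PR): a flat vector that is filled unsorted at startup, sorted exactly once, and then searched with a binary search in the event loop. `FlatTable` and its members are hypothetical names, and a unique integer key stands in for the real category.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical flat lookup table; not the EnergyScaleCorrection interface.
class FlatTable {
public:
  using Entry = std::pair<int, double>;  // (category key, correction value)

  // Startup: append without sorting, amortized O(1) per entry.
  void add(int key, double value) {
    entries_.emplace_back(key, value);
    sorted_ = false;
  }

  // Call once after all entries are added: a single O(n log n) sort.
  void finalize() {
    std::sort(entries_.begin(), entries_.end());
    sorted_ = true;
  }

  // Event loop: O(log n) binary search over contiguous memory.
  // Returns nullptr when the key is absent.
  const double* find(int key) const {
    assert(sorted_ && "finalize() must be called before find()");
    auto it = std::lower_bound(
        entries_.begin(), entries_.end(), key,
        [](const Entry& e, int k) { return e.first < k; });
    return (it != entries_.end() && it->first == key) ? &it->second : nullptr;
  }

private:
  std::vector<Entry> entries_;
  bool sorted_ = true;  // an empty table is trivially sorted
};
```

The single `finalize()` call avoids the repeated sorts at startup while keeping contiguous storage for lookups; whether it would actually beat std::map in the event loop would need to be measured.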

cmsbuild added a commit that referenced this pull request Nov 11, 2020
…caleCorrection

speedup: switch EgammaTools EnergyScaleCorrection scales and smearings to use std::map (backport of #32053)