Speed up Puppi #35799
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-35799/26164
|
A new Pull Request was created by @kpedro88 (Kevin Pedro) for master. It involves the following packages:
@jpata, @cmsbuild, @slava77 can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
please test |
not anymore in recent gcc |
and in clang since ever. |
Good to know that newer gcc versions have picked up the better optimization. Nevertheless, it's still worthwhile to have the optimization in 10_6_X with its older gcc version. I leave it up to release managers whether they still want the change to enter master, even if it no longer has an effect. |
The switch code is cleaner anyhow: strange that gcc was not able to optimize. |
enable profiling |
@cmsbuild please test |
+1 Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-ccffe7/19865/summary.html Comparison Summary:
|
@gartung the "Igprof Comparison cpu usage RECO produce methods." page in the profiling results is blank. Is there a bug in the script that needs to be fixed? |
Strangely enough, this still shows up as ... Ah, I guess that, in response to the last comment, somebody restarted the profiling tests some 20 minutes ago. |
Reverting to the older script revealed the real issue
|
@gartung if this script relies on the CMSSW environment, maybe it needs to be changed to |
Yes. I just pushed that fix. |
I resubmitted the pr-profiling Jenkins job which will pick up this fix. |
Is any measurable effect expected or observed here for phase2? |
I see a similar difference in 23424.21 workflow for a test pull request |
The Jenkins job log shows that a new RES_CPU_compare_11834.21.txt was created but it was not copied. Triggering the build again should create a new results directory and allow the file to be uploaded. |
please test |
PuppiProducer isn't listed in that CPU comparison. I guess it isn't high enough on the list. If I drill down into the full igprof reports: The % total does decrease, though the overall time usage seems to be higher for the PR (for everything, not just this module). My use case was rerunning Puppi on top of miniAOD, where it's a larger % of the total. |
The increase in time is probably caused by enabling the IgprofService in the job which takes some time to write out the profile after the first, middle and next to last event. |
+1 Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-ccffe7/19957/summary.html Comparison Summary:
|
+reconstruction
|
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2) |
+1
|
PR description:
A switch statement tends to be faster than an else-if chain when more than a couple of (simple) cases exist, because the compiler can optimize it better. This change in particular made Puppi 23% faster (tested on a ttbar UL sample in 10_6_X).
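For readers who have not seen the diff, here is a minimal sketch of the kind of rewrite described above. It is not the code changed in this PR: the `Category` enum, the function names `classifyElseIf` and `classifySwitch`, and the particular set of PDG IDs are illustrative assumptions. Both functions return the same result; only the control-flow form differs. As noted in the conversation, recent gcc and clang may compile the two forms equally well, so any speedup depends on the compiler version.

```cpp
#include <cstdlib>

// Hypothetical particle categories, chosen only for illustration.
enum class Category { Other, ChargedHadron, Electron, Muon, Photon, NeutralHadron };

// Else-if chain: each comparison is written as a separate sequential test.
Category classifyElseIf(int pdgId) {
  const int id = std::abs(pdgId);
  if (id == 211) {
    return Category::ChargedHadron;
  } else if (id == 11) {
    return Category::Electron;
  } else if (id == 13) {
    return Category::Muon;
  } else if (id == 22) {
    return Category::Photon;
  } else if (id == 130) {
    return Category::NeutralHadron;
  } else {
    return Category::Other;
  }
}

// Equivalent switch: the compiler sees all cases at once and is free to
// choose a jump table or a comparison tree instead of a linear chain.
Category classifySwitch(int pdgId) {
  switch (std::abs(pdgId)) {
    case 211:
      return Category::ChargedHadron;
    case 11:
      return Category::Electron;
    case 13:
      return Category::Muon;
    case 22:
      return Category::Photon;
    case 130:
      return Category::NeutralHadron;
    default:
      return Category::Other;
  }
}
```

Because the two functions are behaviorally identical, a change of this form should leave physics output untouched, which matches the "no changes expected or observed" statement in the validation section.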
PR validation:
Code compiles and runs successfully; no changes are expected or observed.
This PR will be backported to 10_6_X to speed up ultra-legacy analysis processing. Backports to other releases can be made upon request.