Run3-hcx300 Update code for new shower library (Lev & Salavat) #34432

Merged
merged 6 commits into from Jul 14, 2021

bsunanda
Contributor

@bsunanda bsunanda commented Jul 9, 2021

PR description:

Update the code for the new shower library (Lev & Salavat). The modifier run3_HFSL activates the new library; it can be activated only once the corresponding file has been saved in the data repository.

PR validation:

Tested with the runTheMatrix test workflows and a new script, SimG4CMS/Calo/test/python/runHF6_cfg.py.

if this PR is a backport please specify the original PR and why you need to backport that PR:

Nothing special

@cmsbuild
Contributor

cmsbuild commented Jul 9, 2021

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-34432/23848

  • This PR adds an extra 44KB to repository

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@cmsbuild
Contributor

cmsbuild commented Jul 9, 2021

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-34432/23849

  • This PR adds an extra 44KB to repository

@cmsbuild
Contributor

cmsbuild commented Jul 9, 2021

A new Pull Request was created by @bsunanda (Sunanda Banerjee) for master.

It involves the following packages:

  • Configuration/Eras (operations)
  • Geometry/HcalSimData (geometry)
  • SimG4CMS/Calo (simulation)
  • SimG4Core/Application (simulation)

@civanch, @Dr15Jones, @makortel, @cvuosalo, @ianna, @mdhildreth, @cmsbuild, @silviodonato, @qliphy, @fabiocos, @davidlange6 can you please review it and eventually sign? Thanks.
@makortel, @cvuosalo, @rovere, @Martin-Grunewald, @thomreis, @simonepigazzini, @fabiocos, @slomeo this is something you requested to watch as well.
@silviodonato, @dpiparo, @qliphy, @perrotta you are the release manager for this.

cms-bot commands are listed here

@bsunanda
Contributor Author

bsunanda commented Jul 9, 2021

@cmsbuild Please test

@cmsbuild
Contributor

cmsbuild commented Jul 9, 2021

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9a3f37/16669/summary.html
COMMIT: 0627027
CMSSW: CMSSW_12_0_X_2021-07-09-1100/slc7_amd64_gcc900
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/34432/16669/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 3 differences found in the comparisons
  • DQMHistoTests: Total files compared: 38
  • DQMHistoTests: Total histograms compared: 0
  • DQMHistoTests: Total failures: 0
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 0
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.004 KiB( 37 files compared)
  • DQMHistoSizes: changed ( 312.0 ): 0.004 KiB MessageLogger/Warnings
  • Checked 160 log files, 37 edm output root files, 38 DQM output files
  • TriggerResults: no differences found

@cvuosalo
Contributor

cvuosalo commented Jul 9, 2021

+1

@bsunanda
Contributor Author

@civanch Please approve

@civanch
Contributor

civanch commented Jul 13, 2021

+1

@qliphy
Contributor

qliphy commented Jul 14, 2021

+1

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will be automatically merged.

@cmsbuild cmsbuild merged commit a2b431c into cms-sw:master Jul 14, 2021
@davidlange6
Contributor

belatedly - has someone assessed the performance impact of this new library in the context of the parameters with which the ROOT file was created?

@abdoulline

abdoulline commented Jul 14, 2021

@davidlange6
No, not yet (the new HF ShowerLibrary is not yet the baseline).
We have been assuming that the HF ShowerLibrary cost is overshadowed by the full GEANT simulation of the rest (the bulk) of the CMS detector.
But an expert's help would be appreciated. I recall the optimization you did 5 years ago in
#16049
Not sure whether it still matters nowadays; I don't remember what the CPU gain was back then...

@davidlange6
Contributor

davidlange6 commented Jul 14, 2021 via email

@civanch
Contributor

civanch commented Jul 14, 2021

@davidlange6, we recently profiled with the Run2 shower library in 12_0_0_pre1, and it did not contribute to CPU time in a significant way. We should repeat the exercise once this new SL is integrated.

@davidlange6
Contributor

davidlange6 commented Jul 14, 2021 via email

@abdoulline

abdoulline commented Jul 15, 2021

@davidlange6 David, the situation is as follows:

(1) you're right, there seems to be a sizable impact of using the new SL (HF Shower Library) on SIM performance:
~25% slowdown of the entire GENSIM step for wf 11634.0 [*];

(2) the relevant files (slightly updated versions of them) used for merging the many "FullSim" (full simulation of Cherenkov light) components, one for each particle species (e, pi) and energy value, into the new SL were, alas, not committed to CMSSW (unlike all the other code related to the new SL) and were used privately by Lev Kheyn. Please find them in his public area:
/afs/cern.ch/user/k/kheyn/public/For_Sunanda

  • HFShowerLibraryAnalyzer.cc - code reading "FullSim"-produced root files and creating the library
  • writelibraryfile_cfg.py - config
  • fileList.txt - used list of FullSim files

NB: the new SL is not compressed or optimized in any way, so the ROOT file parameters are the defaults (whatever they are).
We'd appreciate your expert insight on whether something can be done to improve situation (1)...


[*]
Quick and (very) dirty comparison of CPU performance: two jobs, with and without the new SL, run simultaneously (in parallel) on the same lxplus PC for wf 11634.0 (noPU TTbar 14 TeV 2021, 100 events in STEP1=GENSIM). The numbers come from cms.Service("Timing").
Repeated twice on different PCs, with very much the same results.

(1) regular/default SL :
Event Throughput: 0.0417111 ev/s
CPU Summary:
Total loop: 2371.84

(2) new SL:
Event Throughput: 0.0327577 ev/s
CPU Summary:
Total loop: 2997.68
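
As a sanity check on the ~25% figure, the Timing numbers quoted above can be turned into relative changes directly. This is only a minimal sketch: it assumes "Total loop" is a CPU time in seconds (the unit is not stated above), and the dictionary keys and the helper `relative_change` are illustrative names, not part of CMSSW.

```python
# Numbers copied verbatim from the Timing output quoted above.
# Assumption: "Total loop" is CPU time in seconds (the unit is not stated).
default_sl = {"throughput": 0.0417111, "cpu_total_loop": 2371.84}
new_sl = {"throughput": 0.0327577, "cpu_total_loop": 2997.68}


def relative_change(new, old):
    """Fractional change of `new` with respect to `old`."""
    return (new - old) / old


cpu_increase = relative_change(new_sl["cpu_total_loop"], default_sl["cpu_total_loop"])
throughput_change = relative_change(new_sl["throughput"], default_sl["throughput"])

# Wall-clock seconds per event is the inverse of the event throughput.
wall_per_event_default = 1.0 / default_sl["throughput"]
wall_per_event_new = 1.0 / new_sl["throughput"]

print(f"CPU total loop: {cpu_increase:+.1%}")
print(f"Throughput:     {throughput_change:+.1%}")
print(f"Wall time per event: {wall_per_event_default:.1f} s -> {wall_per_event_new:.1f} s")
```

With these inputs the CPU total grows by roughly a quarter while the throughput drops by roughly a fifth, which is consistent with the "~25% slowdown" quoted in (1).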

@davidlange6
Contributor

davidlange6 commented Jul 15, 2021 via email

@abdoulline

@davidlange6
could you please translate your considerations into a recipe
for the files mentioned in (2) above, in /afs/cern.ch/user/k/kheyn/public/For_Sunanda/ ?

@davidlange6
Contributor

davidlange6 commented Jul 15, 2021 via email

@abdoulline

abdoulline commented Jul 22, 2021

@davidlange6
Hi David, sorry to bug you once again. Unfortunately, we were misled by the files that Lev, who produced/assembled the new SL (HF ShowerLibrary) [1], put in a rush into the aforementioned public directory.
He actually didn't use PoolOutputModule, but TFileService...

I've published (a somewhat modified version of) both the code and the job config for assembling the SL from 32 input components in #34561 and, after many attempts to improve the performance of the new SL(s) in the TTbar 2021 MC SIM step, produced yet another version of the SL.
NB: the new SL contains twice as many showers per energy point, so its size is inevitably bigger than that of the current default [2].

Finally, the best updated new SL was produced with buffersize=1 and splitlevel=2 for Tree->Branches.
This updated version of the new SL still makes the TTbar 2021 MC SIM step ~5% slower than with the current default SL, but no longer ~25% slower as with the initial version [1], which Lev produced with the ROOT defaults for Tree->Branch, buffersize=32000 and splitlevel=99.
Do you think any further optimization is possible?

Unfortunately, we have little idea what you did 5 years ago (2016), when you somehow reformatted the initial version of HFShowerLibrary_npmt_noatt_eta4_16en_xxx.root (produced by Lev back in 2015) so that it supposedly became faster (no numbers available, though...).


[1]
HFShowerLibrary_run3_v5.root
put by Shahzad into
https://cmssdt.cern.ch/SDT/data/CMSSW/SimG4CMS/Calo/data/

[2]
HFShowerLibrary_npmt_noatt_eta4_16en_v4.root
in
https://cmssdt.cern.ch/SDT/data/CMSSW/SimG4CMS/Calo/data/

@abdoulline

abdoulline commented Jul 22, 2021

@davidlange6
I hope the following explicit settings may help further:
fs->file().SetCompressionAlgorithm(4); // LZ4
fs->file().SetCompressionLevel(4); // recommended level=4
when added here
https://cmssdt.cern.ch/lxr/source/SimG4CMS/ShowerLibraryProducer/plugins/HcalForwardLibWriter.cc#0020

->
Re-assembled this way, the SL became even bigger (1198820016 bytes), but at first glance it delivers CPU performance that is no worse, even marginally better, than with the default SL (in my TTbar 2021 MC "parallel" jobs on the same PC with TimeService).
I hope Lev will confirm this with his OscarMTProducer-specific CPU "estimator"...
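
For reference, ROOT also accepts the algorithm and level as one combined integer, computed as 100 * algorithm + level, so the two calls quoted above correspond to a combined setting of 404. A small sketch of that packing in plain Python (no ROOT needed): the function names are hypothetical, and the algorithm codes other than 4 = LZ4 (which is quoted above) are assumed from ROOT's compression-algorithm enum and should be double-checked against the ROOT version in use.

```python
# Algorithm codes: 4 = LZ4 is quoted in the comment above; the others are
# assumed from ROOT's compression-algorithm enum and should be double-checked.
ALGORITHMS = {"ZLIB": 1, "LZMA": 2, "LZ4": 4, "ZSTD": 5}


def pack_compression(algorithm, level):
    """Combine algorithm and level as 100 * algorithm + level."""
    # Two decimal digits are available for the level in the combined value.
    if not 0 <= level <= 99:
        raise ValueError("compression level must be between 0 and 99")
    return 100 * algorithm + level


def unpack_compression(setting):
    """Split a combined setting back into (algorithm, level)."""
    return divmod(setting, 100)


lz4_level4 = pack_compression(ALGORITHMS["LZ4"], 4)
print(lz4_level4)                      # 404
print(unpack_compression(lz4_level4))  # (4, 4)
```

The combined value is what `TFile::SetCompressionSettings` takes, as an alternative to the two separate calls.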

@abdoulline

abdoulline commented Jul 22, 2021

Update:

Lev has reported (using a single-particle gun shot at HF) that the latest/biggest SL is the best of the three new SLs tried (as evaluated from OscarMTProducer TimeService fragments collected over 100 events): ~60-70 times faster (per single particle) than the aforementioned one residing on cmssdt [1], but still not at the level of the current default one...

@davidlange6
Contributor

davidlange6 commented Jul 22, 2021 via email

@abdoulline

abdoulline commented Jul 23, 2021

Finally, Lev has redone his OscarMTProducer CPU time evaluations with single 100 GeV pions, and it seems that the "latest-greatest" (even though the biggest...) version of the new SL is the best of all those compared:

[Figure: CPU_time_OSCAR_pi100_SL_versions]

Left (best) to right (worst):

(1) green: new SL with the LZ4 compression algorithm and compression level=4 added (wrt (3) below),
now in /afs/cern.ch/user/a/abdullin/public/updated_HF_SL_July22_2021/HFShowerLibrary_run3_v5.root

(2) black: actual default HFShowerLibrary_npmt_noatt_eta4_16en_v4.root

(3) blue: my intermediate attempt to re-assemble new SL only using buffersize=1 and splitlevel=2 for Tree->Branches
#34432 (comment)

(4) red: HFShowerLibrary_run3_v5(_210713).root - the initial version of new SL committed to CMSSDT
