Slimming Strip Calibration Trees output #23045

Merged
18 commits merged into cms-sw:master on May 2, 2018

@mmusich mmusich commented Apr 24, 2018

Greetings,
this PR packages several updates to the CalibTracker code to reduce the disk space taken by the Strip Calibration ntuples on eos.
Main features added:

  • possibility to prescale events entering any of the shallow tree producers;
  • added compression of TFileService for the ShallowTree class;
  • replaced double precision with float where double was unnecessary;
  • unused variables masked behind a preprocessor flag, ExtendedCALIBTree;
  • chargeoverpath variable (the single largest offender) masked, in favor of building it as the ratio of the existing charge and path variables;
  • updated unit tests.
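
The prescale and the chargeoverpath replacement listed above can be illustrated with a minimal Python sketch. It is not the CalibTracker implementation: the function names `keep_event` and `charge_over_path`, the keep-every-Nth-event rule, and the sample numbers are assumptions made purely for illustration. The `array` lines only show why storing float instead of double halves the raw size per value.

```python
from array import array


def keep_event(event_index, prescale):
    """Hypothetical prescale rule: keep one event out of every `prescale`."""
    if prescale < 1:
        raise ValueError("prescale must be >= 1")
    return event_index % prescale == 0


def charge_over_path(charge, path):
    """Rebuild charge/path per cluster from the two stored variables."""
    return [c / p for c, p in zip(charge, path)]


# Illustrative values only (not taken from any ntuple).
charge = [300.0, 450.0, 120.0]
path = [0.03, 0.045, 0.06]

# With prescale=5, only every fifth event index is kept.
kept = [i for i in range(10) if keep_event(i, prescale=5)]

# The masked variable is recomputed on demand instead of being stored.
ratios = charge_over_path(charge, path)

# Raw per-value size: type code 'd' is a C double, 'f' is a C float.
double_size = array("d", charge).itemsize
float_size = array("f", charge).itemsize
```

Recomputing the ratio when reading the tree trades a small amount of CPU time for not storing a third variable that is fully determined by the other two.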

Changes have been proposed and revised by @mdelcourt, @clacaputo, @echabert and @jlagram.

Testing this branch with O(1k) events from a 2017C ALCARECO file, we get a net size reduction of ~28%.
Going tree by tree:

gainCalibrationTree tree:

gaincalibrationtreestdbunch_tree

anEff tree:

aneff_traj

EventInfo tree:

eventinfo_tree

As the code touched here is also used by the SiStripGains PCL algorithm, a dedicated test with O(100k) events was carried out using the following commands:

cmsDriver.py step3 --datatier ALCARECO --conditions auto:run2_data -s ALCA:PromptCalibProdSiStripGains --eventcontent ALCARECO -n -1 --dasquery='file dataset=/ZeroBias/Run2016C-SiStripCalMinBias-18Apr2017-v1/ALCARECO run=276097' 

followed by:

cmsDriver.py stepMultiHarvest --data --conditions auto:run2_data --scenario pp -s ALCAHARVEST:SiStripGains --filein file:PromptCalibProdSiStripGains.root -n -1 --fileout file:calib.root --customise_command "process.DQMStore.collateHistograms = cms.untracked.bool(True)\nprocess.dqmSaver.saveByRun=cms.untracked.int32(-1)\n process.dqmSaver.saveAtJobEnd=cms.untracked.bool(True)\nprocess.dqmSaver.forceRunNumber=cms.untracked.int32(999999)" --no_exec

to emulate the Multi-Run Harvesting.
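
For readability, the statements packed into `--customise_command` in the harvesting command above can be listed one per line. The sketch below is plain Python and needs no CMSSW installation; it only splits the string on its literal `\n` separators. Treating the stray space before the third statement as insignificant is an assumption.

```python
# The --customise_command argument as given on the command line.
# In these raw strings "\n" is the two-character sequence backslash + n,
# not a real newline.
customise_command = (
    r"process.DQMStore.collateHistograms = cms.untracked.bool(True)\n"
    r"process.dqmSaver.saveByRun=cms.untracked.int32(-1)\n"
    r" process.dqmSaver.saveAtJobEnd=cms.untracked.bool(True)\n"
    r"process.dqmSaver.forceRunNumber=cms.untracked.int32(999999)"
)

# Split on the literal separator and drop surrounding whitespace
# (assumption: the leading space on the third statement is not meaningful).
statements = [s.strip() for s in customise_command.split(r"\n")]

for statement in statements:
    print(statement)
```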
No difference is found in the output (the complete histogram comparison is available here).
Two examples:

image

image

Martin Delcourt and others added 17 commits April 20, 2018 16:04
modified:   CalibTracker/SiStripCommon/plugins/ShallowTracksProducer.cc
- double to float

modified:   CalibTracker/SiStripCommon/plugins/ShallowTracksProducer.cc
- std::bitset introduced
- mods encapsulated in a preprocessor flag, CALIBTreeDEV

Note: std::bitset type still not accepted by ShallowTree
modified:   CalibTracker/SiStripCommon/plugins/ShallowEventDataProducer.cc
modified:   CalibTracker/SiStripCommon/plugins/ShallowGainCalibration.cc
modified:   CalibTracker/SiStripCommon/plugins/ShallowTracksProducer.cc
modified:   CalibTracker/SiStripHitEfficiency/interface/HitEff.h
modified:   CalibTracker/SiStripHitEfficiency/src/HitEff.cc
- Unused variables masked behind a preprocessor flag, CALIBTreeDEV
- chargeoverpath variable masked

modified:   CalibTracker/SiStripChannelGain/src/SiStripGainsPCLWorker.cc
- chargeoverpath dependencies removed

runTheMatrix.py -l 1001.0 successfully passed
- Preprocessor flag changed to ExtendedCALIBTree
- Minor indentation mods
@cmsbuild

The code-checks are being triggered in jenkins.

@cmsbuild

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-23045/4443

Code check has found code style and quality issues which could be resolved by applying a patch in https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-23045/4443/git-diff.patch
e.g. curl https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-23045/4443/git-diff.patch | patch -p1

You can run scram build code-checks to apply code checks directly

@cmsbuild

The code-checks are being triggered in jenkins.

@cmsbuild

A new Pull Request was created by @mmusich (Marco Musich) for master.

It involves the following packages:

CalibTracker/Configuration
CalibTracker/SiStripChannelGain
CalibTracker/SiStripCommon
CalibTracker/SiStripHitEfficiency

@cmsbuild, @franzoni, @arunhep, @cerminar, @lpernie can you please review it and eventually sign? Thanks.
@echabert, @gbenelli, @tocheng, @mverzett, @OlivierBondu, @mmusich this is something you requested to watch as well.
@davidlange6, @slava77, @fabiocos you are the release manager for this.

cms-bot commands are listed here

@lpernie

lpernie commented Apr 24, 2018

please test

@cmsbuild

cmsbuild commented Apr 24, 2018

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/27634/console Started: 2018/04/24 16:11

@cmsbuild

Comparison job queued.

@cmsbuild

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-23045/27634/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 29
  • DQMHistoTests: Total histograms compared: 2492830
  • DQMHistoTests: Total failures: 1
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2492653
  • DQMHistoTests: Total skipped: 176
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 28 files compared)
  • Checked 119 log files, 9 edm output root files, 29 DQM output files

@mmusich

mmusich commented Apr 26, 2018

@arunhep @lpernie
When looking for changes in the Comparison Summary of this PR, I realized that the only workflow comparison relevant to the changes proposed here (i.e. 1001.0) actually produced an empty relmon:

https://cmssdt.cern.ch/SDT/jenkins-artifacts/baseLineComparisons/CMSSW_10_2_X_2018-04-23-2300+23045/26271/1001.0_RunMinBias2011A+RunMinBias2011A+TIER0EXP+ALCAEXP+ALCAHARVD1+ALCAHARVD2+ALCAHARVD3+ALCAHARVD4+ALCAHARVD5+ALCAHARVDSIPIXELCALRUN1/

The log file claims there is a division by 0:

https://cmssdt.cern.ch/SDT/jenkins-artifacts/baseLineComparisons/CMSSW_10_2_X_2018-04-23-2300+23045/26271/1001.0_RunMinBias2011A+RunMinBias2011A+TIER0EXP+ALCAEXP+ALCAHARVD1+ALCAHARVD2+ALCAHARVD3+ALCAHARVD4+ALCAHARVD5+ALCAHARVDSIPIXELCALRUN1RelMonComp-1001.0.log

Traceback (most recent call last):
  File "/cvmfs/cms-ib.cern.ch/nweek-02521/slc6_amd64_gcc630/cms/cmssw/CMSSW_10_2_X_2018-04-22-1100/bin/slc6_amd64_gcc630/compare_using_files.py", line 345, in <module>
    directory2html(directory, options.hash_name, options.standalone)
  File "/cvmfs/cms-ib.cern.ch/week1/slc6_amd64_gcc630/cms/cmssw-patch/CMSSW_10_2_X_2018-04-23-2300/python/Utilities/RelMon/directories2html.py", line 464, in directory2html
    page_html+=get_rank_section(directory)
  File "/cvmfs/cms-ib.cern.ch/week1/slc6_amd64_gcc630/cms/cmssw-patch/CMSSW_10_2_X_2018-04-23-2300/python/Utilities/RelMon/directories2html.py", line 390, in get_rank_section
    scale = gPad.GetUymax()/rightmax
ZeroDivisionError: float division by zero

N.B. This failure is common to all other recent PRs and is not due to the changes proposed here.
Upon manual inspection I realized that, for some reason, when the last harvesting step of 1001.0 is run (i.e. ALCAHARVDSIPIXELCALRUN1) the harvested DQM file becomes empty.
This makes the comparison with the baseline useless for PRs such as this one.
I took the liberty to "fix" this in #23063.
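
The traceback above ends in an unguarded division, `gPad.GetUymax()/rightmax`. A defensive pattern for that kind of scaling is sketched below in plain Python. The helper `safe_scale` and its fallback value are hypothetical, and this is not a description of the change made in #23063.

```python
def safe_scale(axis_max, right_max, fallback=1.0):
    """Return axis_max / right_max, or `fallback` when right_max is zero.

    Hypothetical helper: it mirrors the `gPad.GetUymax()/rightmax`
    expression from the traceback without depending on ROOT.
    """
    if right_max == 0:
        return fallback
    return axis_max / right_max


scale_ok = safe_scale(10.0, 4.0)     # normal case: plain division
scale_empty = safe_scale(10.0, 0.0)  # empty-input case: fallback is used
```

A guard like this would only hide the symptom; the empty harvested DQM file described above remains the underlying cause.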

@lpernie

lpernie commented Apr 26, 2018

Very nice

@lpernie

lpernie commented Apr 26, 2018

+1

@cmsbuild

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @davidlange6, @slava77, @smuzaffar, @fabiocos (and backports should be raised in the release meeting by the corresponding L2)

@fabiocos

fabiocos commented May 2, 2018

+1

@cmsbuild cmsbuild merged commit 7cba396 into cms-sw:master May 2, 2018