EGamma Scale & Smearing For Re-miniAOD #22531

Sam-Harper · 2018-03-08T15:18:29Z

Dear All,

This is mostly a complete re-write of the E/gamma Scale & Smearing code to bring it up to minimal acceptable standards. This code is not perfect, many things could be done better but perfect is the enemy of good. It is functionally equivalent to #22308

This is step one in an evolution of E/gamma energy correctors. You will notice significant ground work has been done in energy regression management. It is a future goal to remove the badness that is "EGExtraInfoModifierFromDB" and instead have the regression all calculated in simple classes which can be wrapped in producers or modifiers as we see fit.

Guide to changes:
ReducedEGProducer has been modified to allow the energy to be changed inside it. In hindsight, it would be have been better to have a general modification framework rather than such a specific hardcoded option. It is the easiest option here to modify the electrons rather an introduce another intermediate step. A reminder, VID re-run on the miniAOD output must give identical results hence ReducedEGProducer needs to already have any modifiers applied to it. Note, I am still "abusing" the PFClusterIsolation maps, this will be fixed by just renaming the parameters to something more neutral

All the classes in EgammaAnalysis/ElectronTools needed for scale & smearing have been put into RecoEgamma/EgammaTools. The classes have been substantially re-written, although the original design is clearly recognizable. They are as follows

EnergyScaleCorrection
The class which reads the scale & smearing corrections from a file and then returns the ones appropriate for an electron/photon. The information is now encapsulated in sub classes ScaleCorrection and SmearCorrection which contain the information for a given e/gamma object. In general, I dislike the reading mechanism, its all kinds of awful. However it is what we have. In the future, this should be either in the database or at the very least some sort of nice json/xml format.
PhotonEnergyCalibrator
Uses the energy scale correction class to calculate the new corrected energies and systmatic variations. Uses a deterministic random number based on run/event/lumi/supercluster seed id/#crystals in SC for the smearing. Note the smearing does effect the data, it enters in the error of the energy.
ElectronEnergyCalibrator
The electron equivalent class of PhotonEnergyCalibrator. Diffs because electrons energy is ECAL-Trk combined and the corrections apply solely to the ECAL energy. This complicates things, and is why I had to start to overhaul the regression code as it needs to re-apply the ecal-track regression starting from the new corrected energy. Obviously having two parallel codes doing exactly the same thing is unacceptable, this is why the class EpCombinationTool which does this is designed to as a drop in replacement for the code in "EGExtraInfoModifierFromDB". This hasnt been swapped over yet but will be in a future PR. Note I'm about 90% of the way to being able to fully template this with PhotonEnergyCalibrator but it wont happen for this PR.
CalibratedElectronProducerT, CalibratedPhotonProducerT
The produces which write the output of PhotonEnergyCalibrator , ElectronEnergyCalibrator to a file. Optionally writes out a new collection of calibrated eles/phos

Proto Regression Changes:
As mentioned the ElectronEnergyCalibrator needs to reapply the ecal-track regression. The following classes facilitate this

EpCombinationTool
applies the ECAL-Track regression. Intended to replace the code : https://github.com/cms-sw/cmssw/blob/CMSSW_9_4_0/RecoEgamma/EgammaTools/plugins/EGExtraInfoModifierFromDB.cc#L481-L542 and instead wrap it up in a nice little function
EgammaRegressionContainer
e/gamma regressions are binned in et, eta and saturation. This simply contains the the 4 regressions and allows the correct value to be returned based on et/eta/saturation. Also contains the range values rather than hardcoding them in the .cc.
Allows code such as this : https://github.com/cms-sw/cmssw/blob/CMSSW_9_4_0/RecoEgamma/EgammaTools/plugins/EGExtraInfoModifierFromDB.cc#L438-L462 to be replaced

Future evolution:

all actual regression functionality from EGExtraInfoModifierFromDB is removed and it will be done via classes such as EpCombinationTool
the entire EgammaAnalysis package will be deleted from the release
CalibratedElectron and CalibratedPhoton producers will ultimately be a templated instance of each other
CalibratedElectron and CalibratedPhoton producers will likely be able to apply the ECAL energy regression as well

I am now re-running my longer validations these are

miniAOD -> miniAOD of the entire DoubleEG dataset, this allows us to check the preformance in every bin, also done for a DY sample
re-miniAOD on a smaller sample size of the DoubleEG dataset + DY sample to validate the results are the same as the re-miniAOD
the checks Slava suggested (profiles, size increases , etc)

cmsbuild · 2018-03-08T15:18:49Z

The code-checks are being triggered in jenkins.

cmsbuild · 2018-03-08T15:21:13Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-22531/3844

cmsbuild · 2018-03-08T15:21:35Z

A new Pull Request was created by @Sam-Harper for master.

It involves the following packages:

EgammaAnalysis/ElectronTools
PhysicsTools/PatAlgos
RecoEgamma/EgammaPhotonProducers
RecoEgamma/EgammaTools
RecoEgamma/ElectronIdentification
RecoEgamma/PhotonIdentification

@perrotta, @monttj, @cmsbuild, @slava77, @gpetruc, @arizzi can you please review it and eventually sign? Thanks.
@TaiSakuma, @gouskos, @jainshilpi, @rappoccio, @varuns23, @seemasharmafnal, @mmarionncern, @imarches, @ahinzmann, @acaudron, @lgray, @jdolen, @ferencek, @rovere, @jdamgov, @nhanvtran, @gkasieczka, @schoef, @JyothsnaKomaragiri, @mverzett, @gpetruc, @mariadalfonso, @pvmulder this is something you requested to watch as well.
@davidlange6, @slava77, @fabiocos you are the release manager for this.

cms-bot commands are listed here

davidlange6 · 2018-03-08T15:42:50Z

please test

cmsbuild · 2018-03-08T15:43:07Z

The tests are being triggered in jenkins.

slava77 · 2018-03-08T15:51:49Z

@Sam-Harper
please clarify the status of smearing in this PR.
The description is not clear enough to me and I'm also puzzled by what looks like photons are smeared in data as well ("Note the smearing does effect the data").

My rough estimate is that this will take until about Wednesday (Mar 14) to review and sign off for the master.

Sam-Harper · 2018-03-08T16:05:41Z

The smearing is currently enabled. It can be disabled with a bool switch.

The smearing will effect the data as it enters the energy error. @shervin86 can give a full explanation this more eloquently than me

Sam-Harper · 2018-03-08T16:20:10Z

On the time estimate, the miniAOD production can start without this, this is an ongoing discussion with PC and PPD. What remains is that it is mandatory for this to be applied for all CMS analyses using this data. Therefore I want this in the release regardless whether it makes it to the re-miniAOD. And back ported to 9_4 (with it all disabled if needs be).

This is a simple consequence of having high standards for code in the release now, things take longer, we had a working ready to go PR submitted. It got rejected due to code standards last week and it takes time to re-write and understand things. This will ultimately benefit us in the future. If it means that either the reminiAOD takes a bit longer or analysers have to suffer a bit with recipes, well that was the choice made.

slava77 · 2018-03-08T16:21:10Z

On 3/8/18 8:05 AM, Sam-Harper wrote: The smearing is currently enabled. It can be disabled with a bool switch. The smearing will effect the data as it enters the energy error. @shervin86 <https://github.com/shervin86> can give a full explanation this more eloquently than me

I was expecting the following is done: a . having smeared energies accessible in miniAOD MC is of value, keeping the un-smeared ones from the default accessors

Sam-Harper · 2018-03-08T16:34:49Z

Okay, will turn off the MC for now. The final decision on this is not taken though.

cmsbuild · 2018-03-08T16:59:33Z

The code-checks are being triggered in jenkins.

Sam-Harper · 2018-03-08T17:01:21Z

done. It just occurred to me that to save a little bit of space, I didnt store the corrected ECAL energy as its trivally a function of other stored numbers. But given its not applied by default now I think its better to store another 4 floats which this information. If you agree, I'll update the PR later to do this as I have other issues to attend to

cmsbuild · 2018-03-08T17:03:40Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-22531/3846

cmsbuild · 2018-03-15T09:35:05Z

Pull request #22531 was updated. @perrotta, @monttj, @cmsbuild, @slava77, @gpetruc, @arizzi can you please check and sign again.

davidlange6 · 2018-03-15T09:42:51Z

please test

cmsbuild · 2018-03-15T09:43:16Z

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/26867/console Started: 2018/03/15 10:43

cmsbuild · 2018-03-15T11:40:24Z

+1
Tested at: e0525c8
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-22531/26867/summary.html

cmsbuild · 2018-03-15T11:40:29Z

Comparison job queued.

slava77 · 2018-03-15T12:32:01Z

unhold

slava77 · 2018-03-15T12:36:34Z

+1

for #22531 e0525c8

bugfix in the input tags was added for cases of running on miniAOD inputs ; otherwise this is the same as the previously signed EGamma Scale & Smearing For Re-miniAOD #22531 2656d1e
jenkins tests pass (the jobs run confirming that everything else is still not affected, but the test of running calibrations on miniAOD is not a part of the tests)

cmsbuild · 2018-03-15T13:14:28Z

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-22531/26867/summary.html

Comparison Summary:

No significant changes to the logs found
Reco comparison results: 53 differences found in the comparisons
DQMHistoTests: Total files compared: 29
DQMHistoTests: Total histograms compared: 2477528
DQMHistoTests: Total failures: 1
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 2477351
DQMHistoTests: Total skipped: 176
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.869999999886 KiB( 22 files compared)
Checked 118 log files, 9 edm output root files, 29 DQM output files

slava77 · 2018-03-15T13:16:27Z

@smuzaffar
[since I looked]
I was wondering about the health of the comparison stage in/out
https://cmssdt.cern.ch/jenkins/job/compare-root-files-short-matrix/25660/console
for this PR I see

13:39:32 + rsync -az -e
14:14:22 + echo JA_STATUS=OK

It seems a bit crazy these days to have a copy out take 35 mins.
Are the worker nodes on dialup or a similarly performant connection?

smuzaffar · 2018-03-15T13:27:36Z

@slava77 , last time I check it was still on high bandwidth connection :-)
I think slowdown is due to large number (over 200K) of small files. We can try to archive them first and then upload.

slava77 · 2018-03-15T13:53:23Z

I think slowdown is due to large number (over 200K) of small files. We can try to archive them first and then upload.

Archiving should be a good thing to do.
Is this really the case here that we have 200K files?
Can you please copy the output somewhere easily accessible.
I wanted to inspect what's there.
I guess RelMon output is the issue where it has a file for every single folder of DQM even if that folder is empty.
Perhaps its output can be pruned so that only folders with plots are created

fabiocos · 2018-03-15T18:04:31Z

+1

basis for 94X reminiAODv2, not affecting 2018

smuzaffar · 2018-03-15T18:04:44Z

@slava77 , comparison files are avialable here /afs/cern.ch/work/m/muzaffar/public/PR22531.tar.gz

fabiocos · 2018-03-15T18:04:47Z

merge

cmsbuild added this to the CMSSW_10_1_X milestone Mar 8, 2018

cmsbuild added analysis-pending code-checks-pending comparison-pending orp-pending pending-signatures reconstruction-pending tests-pending labels Mar 8, 2018

cmsbuild added code-checks-approved and removed code-checks-pending labels Mar 8, 2018

cmsbuild added tests-started and removed tests-pending labels Mar 8, 2018

cmsbuild added code-checks-pending tests-pending and removed code-checks-approved tests-started labels Mar 8, 2018

cmsbuild removed the code-checks-pending label Mar 8, 2018

cmsbuild added code-checks-approved and removed code-checks-pending labels Mar 15, 2018

cmsbuild added tests-started and removed tests-pending labels Mar 15, 2018

cmsbuild added tests-approved and removed tests-started labels Mar 15, 2018

cmsbuild removed the hold label Mar 15, 2018

cmsbuild added reconstruction-approved and removed reconstruction-pending labels Mar 15, 2018

cmsbuild added comparison-available and removed comparison-pending labels Mar 15, 2018

cmsbuild added orp-approved and removed orp-pending labels Mar 15, 2018

cmsbuild merged commit 06dde77 into cms-sw:master Mar 15, 2018

slava77 mentioned this pull request Mar 15, 2018

relmon comparisons directory cleanup cms-sw/cms-bot#964

Closed

slava77 mentioned this pull request Nov 10, 2020

speedup: switch EgammaTools EnergyScaleCorrection scales and smearings to use std::map #32053

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

EGamma Scale & Smearing For Re-miniAOD #22531

EGamma Scale & Smearing For Re-miniAOD #22531

Sam-Harper commented Mar 8, 2018

cmsbuild commented Mar 8, 2018

cmsbuild commented Mar 8, 2018

cmsbuild commented Mar 8, 2018

davidlange6 commented Mar 8, 2018

cmsbuild commented Mar 8, 2018

slava77 commented Mar 8, 2018

Sam-Harper commented Mar 8, 2018

Sam-Harper commented Mar 8, 2018

slava77 commented Mar 8, 2018 via email

Sam-Harper commented Mar 8, 2018

cmsbuild commented Mar 8, 2018

Sam-Harper commented Mar 8, 2018

cmsbuild commented Mar 8, 2018

cmsbuild commented Mar 15, 2018

davidlange6 commented Mar 15, 2018

cmsbuild commented Mar 15, 2018 •

edited

cmsbuild commented Mar 15, 2018

cmsbuild commented Mar 15, 2018

slava77 commented Mar 15, 2018

slava77 commented Mar 15, 2018

cmsbuild commented Mar 15, 2018

slava77 commented Mar 15, 2018

smuzaffar commented Mar 15, 2018

slava77 commented Mar 15, 2018

fabiocos commented Mar 15, 2018

smuzaffar commented Mar 15, 2018

fabiocos commented Mar 15, 2018

Navigation Menu

EGamma Scale & Smearing For Re-miniAOD #22531

EGamma Scale & Smearing For Re-miniAOD #22531

Conversation

Sam-Harper commented Mar 8, 2018

cmsbuild commented Mar 8, 2018

cmsbuild commented Mar 8, 2018

cmsbuild commented Mar 8, 2018

davidlange6 commented Mar 8, 2018

cmsbuild commented Mar 8, 2018

slava77 commented Mar 8, 2018

Sam-Harper commented Mar 8, 2018

Sam-Harper commented Mar 8, 2018

slava77 commented Mar 8, 2018 via email

Sam-Harper commented Mar 8, 2018

cmsbuild commented Mar 8, 2018

Sam-Harper commented Mar 8, 2018

cmsbuild commented Mar 8, 2018

cmsbuild commented Mar 15, 2018

davidlange6 commented Mar 15, 2018

cmsbuild commented Mar 15, 2018 • edited

cmsbuild commented Mar 15, 2018

cmsbuild commented Mar 15, 2018

slava77 commented Mar 15, 2018

slava77 commented Mar 15, 2018

cmsbuild commented Mar 15, 2018

slava77 commented Mar 15, 2018

smuzaffar commented Mar 15, 2018

slava77 commented Mar 15, 2018

fabiocos commented Mar 15, 2018

smuzaffar commented Mar 15, 2018

fabiocos commented Mar 15, 2018

cmsbuild commented Mar 15, 2018 •

edited