JMEnano oversized #549

Closed
mariadalfonso opened this issue Oct 20, 2020 · 6 comments

Comments

@mariadalfonso

I checked the JMEnano size using the output of workflow 11024.15, which is already in production.
11024.15 uses MC ttbar events from 2018 with PU.
The original goal of JME nano was to derive JEC. It was implicitly redesigned to derive most of the SF for JME and BTV.
[Note: this does not contain any of the new PF candidates or jet constituents desired by JME, or anything else needed for Run 3 developments.]

Test done on CMSSW_11_2_0_pre7 with 500 events
http://dalfonso.web.cern.ch/dalfonso/XPOG/11_2_0_pre7/jme-11024.15_size_report.html
We now have 13.69 kb/event.


In comparison, the central nano in similar high-PU conditions is < 2 kb/event.
Test done with
/eos/cms/store/relval/CMSSW_11_2_0_pre7/RelValTTbar_13/NANOAODSIM/PU25ns_112X_upgrade2018_realistic_v3-v1/20000/638FCD01-8C5E-0B41-8240-65FF1A67B342.root

http://dalfonso.web.cern.ch/dalfonso/XPOG/11_2_0_pre7/centralNano_size_report.html

@mariadalfonso
Author

JME experts reported a similar size on TTbar events:

cms-sw#31714 (comment)
cms-sw#31831 (comment)

@mariadalfonso
Author

mariadalfonso commented Oct 22, 2020

Summary of the review as of Monday 19 October.

JME stores 3 AK4 jet collections (PF, CHS, Puppi) plus another set of AK4 CHS jets, CorrT1METJet, which duplicates jets in the main collection.
Those 4 collections account for about 80% of the JMEnano size in 11024.15.

  1. On average there are 80 items/evt for PF and CHS but only 11 items/evt for Puppi.
    What is the reason for the asymmetry? Is the PF/CHS size too large, or the Puppi size too small?

  2. From JMAR: "PFjets with the same content is usually a good check". Does this mean that the PF jets are not really necessary? Suggested to drop them.

  3. Discriminators such as btagging/ParticleNet are also stored at very low pT but will only be used from 20-30 GeV. Suggested to tailor the content to what is needed to derive the SF.

@mariadalfonso
Author

Some small suggestions:

  1. Drop CorrT1METJet: for type-1 MET, all possible jets in the event are already available in the other collections.

  2. Jet eta/phi are stored with higher precision than the pt and are by far the most offending floats.
    Jet eta/phi are saved with high precision (~12, as for leptons),
    while the pt is redefined with precision 10.

  3. Drop the regression variables currently stored in the CHS jet table.
    Something like this should work:
    getattr(proc,jetTable).externalVariables = cms.PSet()
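
To make the precision numbers in item 2 concrete, here is a minimal sketch. It assumes that "precision N" means keeping N bits of the float32 mantissa, with rounding. `reduce_mantissa` is a hypothetical helper written for this illustration and is not the CMSSW code.

```python
import struct


def reduce_mantissa(value: float, bits: int) -> float:
    """Round a float32 to `bits` mantissa bits (illustrative only)."""
    if not 0 < bits < 23:
        raise ValueError("bits must be between 1 and 22")
    # Reinterpret the float32 bit pattern as an unsigned 32-bit integer.
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    shift = 23 - bits                      # low mantissa bits to discard
    mask = (0xFFFFFFFF >> shift) << shift  # keeps sign, exponent, top bits
    half = 1 << (shift - 1)                # adding this rounds to nearest
    rounded = ((raw + half) & mask) & 0xFFFFFFFF
    (out,) = struct.unpack("<f", struct.pack("<I", rounded))
    return out


if __name__ == "__main__":
    eta = 2.3456  # arbitrary example value, not taken from the samples above
    for bits in (10, 12):
        approx = reduce_mantissa(eta, bits)
        rel_err = abs(approx - eta) / abs(eta)
        print(f"precision {bits}: {approx!r}, relative error {rel_err:.2e}")
```

Under this assumption, the relative rounding error is bounded by roughly 2^-(N+1), so going from 12 to 10 bits loosens that bound by about a factor of 4. The effect on file size depends on how well the zeroed low bits compress, which this sketch does not measure.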

@nurfikri89

PR cms-sw#32722 will make changes to reduce JMEnano size. The event size is now reduced to 6.42 kb/event from 9.32 kb/event. The comparison was made using 10K events from a TTJets RunIISummer19UL17MiniAOD sample. The changes were discussed in the 13/01/2021 XPOG meeting [1]. One change discussed during the meeting but not included in the PR is reducing eta and phi precision from 12 to 10. It was found that the size reduction is negligible so it was decided to not reduce the eta and phi precision.

[1] https://indico.cern.ch/event/978436/
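
As a quick check of the numbers quoted above, this small sketch computes the fractional reduction and the total size of the 10K-event comparison sample. The function names are hypothetical, and "kb" is taken at face value from the size report, with 1000 kb per MB assumed.

```python
def size_reduction(before_kb: float, after_kb: float) -> float:
    """Fraction of the per-event size removed."""
    return (before_kb - after_kb) / before_kb


def sample_size_mb(kb_per_event: float, n_events: int) -> float:
    """Total sample size in MB, assuming 1 MB = 1000 kb."""
    return kb_per_event * n_events / 1000.0


if __name__ == "__main__":
    before, after, n_events = 9.32, 6.42, 10_000  # values quoted above
    print(f"reduction: {size_reduction(before, after):.1%}")
    print(f"before: {sample_size_mb(before, n_events):.1f} MB")
    print(f"after:  {sample_size_mb(after, n_events):.1f} MB")
```

This gives a reduction of about 31% per event for the TTJets comparison sample.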

@nurfikri89

PR cms-sw#32722 and its backports (cms-sw#32759 for 10_6_X and cms-sw#32760 for 11_2_X) have been merged

@mariadalfonso
Author

Closing this issue, since the PRs mentioned above effectively reduce the size.
