New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updates to MiniAOD to satisfy JME/SMP/B2G requests for NanoAOD #20638
Conversation
The code-checks are being triggered in jenkins. |
-code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/PR-20638/933 Code check has found code style and quality issues which could be resolved by applying a patch in https://cmssdt.cern.ch/SDT/code-checks/PR-20638/933/git-diff.patch You can run |
-code-checks
passing the code-checks is now required.
Please update before expecting further review
|
Did I just draw the short straw to take care of the code checks from the entire DataFormats/Candidate, /HepMCCandidate and /PatCandidate because I added dictionaries? Ha! |
On 9/23/17 7:07 PM, rappoccio wrote:
Did I just draw the short straw to take care of the code checks from the
entire DataFormats/Candidate, /HepMCCandidate and /PatCandidate because
I added dictionaries?
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#20638 (comment)>, or
mute the thread
<https://github.com/notifications/unsubscribe-auth/AEdcbiZhiMy3y1eA_o7fe_o1edXw9olZks5slblsgaJpZM4PhrzD>.
I only see
PhysicsTools/PatAlgos and RecoJets/JetProducers
for this PR
|
Oops, I have a few open. So I don't really have to do all the code checks for the DataFormats? :) Glee! |
It depends on #20637, where the same code checks have failed for the dataformats. |
The code-checks are being triggered in jenkins. |
Nice, it looks like the patch figures out which files you have actually modified and only patches those. That being said, perhaps a central campaign to get those "overrides" implemented is in order. |
+code-checks |
A new Pull Request was created by @rappoccio for master. It involves the following packages: CommonTools/RecoAlgos @perrotta, @efeyazgan, @perrozzi, @thuer, @monttj, @cmsbuild, @slava77, @govoni can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
please test |
Any other ideas on this? Or are we basically good to go for merging? |
total size in crease is ~500bytes/ev right? |
For the AK8 jets, right. The total is 800 bytes/ev (other additions for the soft drop subjets for reco and genjets). |
I start thinking if for the SMP request to have always at least 3 jets, but no details at all on them, isn't better to go for a separate collection with just the needed information as we do for calo jets. |
The added information is (with splitlevel=0 for AK8 jets) RECO Jets:
Gen Jets:
Previous configuration: New configuration, all additions implemented, splitlevel=0: |
(Also sent this in email to you and Mikko, but here for the overall record). Having a separate collection is a possibility, but we are already dropping most / all of the information for low pt jets (we keep JECs, a few b-tagging discriminators, and nothing else). The remaining question is whether they compress better in their own collection, but there is also the operational difficulty to consider. From my perspective, we definitely need the AK8 Gen Jets to correctly do substructure analyses, if we adopt the nanoAOD (otherwise, we are currently running them all ourselves, reducing the overall buy-in of nanoAOD). The additional AK8 RECO jets are worth adding if we can get buy-in from SMPJ and the JME groups (or, at least, JERC). Overall this will hopefully give us the JECs faster. I realize this is something like $40kCHF per year extra, but it is probably worth that to shave 2-3 weeks off the JECs every time (accumulates probably to 1.5-2 months time saving over the year.... this is a competitive advantage). |
On 9/29/17 10:29 AM, rappoccio wrote:
The added information is:
/RECO Jets/:
* <= 3 AK8 jets with pt > 30 GeV (minimal information)
Are the jets ordered appropriately and uniquely for relevant use cases
for this "first 3" to work?
… * All AK8 jets with 100 < pt < 170 GeV (minimal information) :
* All AK8 jets with pt > 170 (grooming, subjet information, N2)
* All AK8 jets with pt > 230 (adding N3 variable)
===> These add 0.4 kb/event
/Gen Jets/:
* <= 3 AK8 jets with pt > 10 GeV
* All AK8 jets with 80 < pt < 150 GeV (no grooming)
* All AK8 jets with pt > 150 GeV (adding grooming AND subjets)
===> These add 0.3 k/bevent
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#20638 (comment)>, or
mute the thread
<https://github.com/notifications/unsubscribe-auth/AEdcbjQLPScFUQBuzPDZUgHgSHqotzdTks5snSjugaJpZM4PhrzD>.
|
@slava77 they are sorted upstream when the pat::Jets are made. |
I also computed CPU with 50 evts of wf 10224 (TTbar with PU: is it the same you are looking at, @rappoccio?), and I can confirm that there is no increase in timing. Perhaps also some gain, from the removal of btagging modules (in particular pfCombinedCvsLJetTagsAK8Puppi):
In my test the additional event size is a bit bigger than what you report above (miniAOD only, as reco output is unchanged)
However, if the workflow you use is the same, I would rather thrust your 9k events than my current 50 ones for it. |
It's odd that the btaggers are gone. I thought that the idea was to save
less at low pt but still keep at high pt.
…On Sep 30, 2017 3:23 PM, "perrotta" ***@***.***> wrote:
I also computed CPU with 50 evts of wf 10224 (TTbar with PU: is it the
same you are looking at, @rappoccio <https://github.com/rappoccio>?), and
I can confirm that there is no increase in timing. Perhaps also some gain,
from the removal of btagging modules (in particular
pfCombinedCvsLJetTagsAK8Puppi):
removed -0.00% 0.04 ms/ev -> 0.00 ms/ev pfJetBProbabilityBJetTagsAK8Puppi
removed -0.00% 0.12 ms/ev -> 0.00 ms/ev pfInclusiveSecondaryVertexFinderCvsLTagInfosAK8Puppi
added +0.00% 0.00 ms/ev -> 0.03 ms/ev slimmedGenJetsAK8SoftDropSubJets
removed -0.00% 0.30 ms/ev -> 0.00 ms/ev pfCombinedCvsBJetTagsAK8Puppi
added +0.00% 0.00 ms/ev -> 0.05 ms/ev ak8GenJetsNoNuConstituents
removed -0.00% 0.02 ms/ev -> 0.00 ms/ev pfTrackCountingHighEffBJetTagsAK8Puppi
removed -0.00% 0.05 ms/ev -> 0.00 ms/ev softPFElectronBJetTagsAK8Puppi
removed -0.00% 0.02 ms/ev -> 0.00 ms/ev pfJetProbabilityBJetTagsAK8Puppi
removed -0.00% 0.02 ms/ev -> 0.00 ms/ev pfSimpleInclusiveSecondaryVertexHighEffBJetTagsAK8Puppi
removed -0.00% 0.04 ms/ev -> 0.00 ms/ev softPFMuonBJetTagsAK8Puppi
removed -0.00% 0.02 ms/ev -> 0.00 ms/ev pfSimpleSecondaryVertexHighEffBJetTagsAK8Puppi
removed -0.01% 0.46 ms/ev -> 0.00 ms/ev pfCombinedCvsLJetTagsAK8Puppi
added +0.00% 0.00 ms/ev -> 0.21 ms/ev ak8GenJetsNoNuSoftDrop
In my test the additional event size is a bit bigger than what you report
above (miniAOD only, as reco output is unchanged)
272.6 -> 1090.8 818 120.0 1.12 patJets_slimmedJetsAK8__RECO.
0.0 -> 180.0 180 NEWO 0.25 recoGenJets_slimmedGenJetsAK8SoftDropSubJets__RECO.
102.2 -> 335.1 233 106.5 0.32 recoGenJets_slimmedGenJetsAK8__RECO.
------------------------------
73168 -> 74403 1235 1.7 ALL BRANCHES
However, if the workflow you use is the same, I would rather thrust your
9k events than my current 50 ones for it.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#20638 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AEdcbu82SEN0NAuqQFOuULCgtsgKxNE-ks5snr91gaJpZM4PhrzD>
.
|
@slava77 see #20638 (comment) and #20638 (comment) |
@perrotta I am using WF 25402.0 (TTbar_13+FS_TTbar_13_UP15_PU25+HARVESTUP15FS+MINIAODMCUP15FS) because it had relvals available already. |
Trying to summarize: the overall event size increase measured in 25402 (by Sal) and wf 10224 (by me) is similar in relative terms: 1.8% in 25402 and 1.7% in 10224. Code-wise this PR is already fine for reco. @rappoccio @arizzi : if I understand correctly, you still need to finalize your discussion. Once you have reached any conclusion, could you please post it here, so that we can decide how to go on with this? |
from XPOG perspective it is ok to add this, on longer term we may look for strategy to reduce the size but for now it is ok. |
+1
|
+1 |
merge |
@@ -263,6 +263,8 @@ namespace pat { | |||
if (specificJPT_.empty()) throw cms::Exception("Type Mismatch") << "This PAT jet was not made from a JPTJet.\n"; | |||
return specificJPT_[0]; | |||
} | |||
/// check to see if the PFSpecific object is stored | |||
bool hasPFSpecific() const { return !specificPF_.empty(); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why was this needed?
The implementation appears to be the same as bool isPFJet()
cutLoose = cms.string(""), | ||
nLoose = cms.uint32(0), | ||
clearDaughters = cms.bool(False), #False means rekeying | ||
dropSpecific = cms.bool(True), # Save space |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it looks like extra care now needs to be taken to access things like
patJets_slimmedJetsAK8__reRECO.obj.neutralHadronEnergy()
JME / SMP / B2G preparation for the nanoAOD.
This is for nanoAOD to be fully usable for substructure analyses. The CPU time added is negligible (<1 ms/event) because I am not computing the areas for the gen jets (not needed). I also preclustered them with AK8. There is expected to be a small increase in size.
The preclustering required more code than I intended, since, to first order, this is merely a configuration change. However many dictionaries were missing to do the preclustering on GenJets, and the VirtualJetProducer could not understand the
vector<FwdPtr<GenParticle>>
.This is expected to add 3 AK8 jets and 3 AK8 genjets per event.
@arizzi, @gpetruc, @clelange also watching.