Add boosted taus to NanoAOD #33528

swozniewski · 2021-04-26T16:36:35Z

PR description:

Adds quantities for boosted taus to nanoAOD.
For this, a new cff file is added, similar to the one for standard taus but with less content.
The sequences are modified so that boosted taus are included by default, except under the modifiers that correspond to past productions.
Histograms are added to the NanoAOD DQM.

Using 1000 events of a RelValTTbar sample:

CPU time of nanoAOD step increases by ~1%
nanoAOD filesize increases by 1.2%

To be backported to 10_6_X targeting nanoAOD v9.

PR validation:

checked that new quantities are written out
code/format checks passed, unit tests passed, limited matrix tests passed

* Merge pull request cms-sw#33150 from cms-tau-pog/CMSSW_11_3_X_tau-pog_tauIDtoolsDev Updates to tauID python tool
* Initial working commit of boosted taus in CMSSW_11_3 NanoAOD
* Remove 2015 anti-E for 10_6v2 era compatibility in boosted taus
* Initial working commit of boosted taus in CMSSW_11_3 NanoAOD
* Remove 2015 anti-E for 10_6v2 era compatibility in boosted taus
* Remove commented boosted tau nanoAOD code
* Remove main nanoAOD config comments
* Remove leading charged hadronic candidate dxy and dz
* Update boosted tau configuration to remove excess ID variables
* Fix removal of boosted tau vars base
* Remove boosted tau sequences from previous eras
* Remove redundant decay mode information
* some polishing
* change Gen information for boosted taus
* Update nanoDQM for boosted taus

Co-authored-by: cmsbuild <cmsbuild@cern.ch>
Co-authored-by: Andrew Loeliger <aloelige@cern.ch>
Co-authored-by: Andrew David Loeliger <andrew.loeliger@cern.ch>

cmsbuild · 2021-04-26T16:43:41Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-33528/22296

This PR adds an extra 32KB to repository
There are other open Pull requests which might conflict with changes you have proposed:
- File PhysicsTools/NanoAOD/python/nanoDQM_cfi.py modified in PR(s): Add ParticleNet mass regression #33483
- File PhysicsTools/NanoAOD/python/nano_cff.py modified in PR(s): Add ParticleNet mass regression #33483, Refactoring gen weight storage in EDM + Nano integration #32167

gouskos · 2021-04-26T17:20:51Z

Hi @swozniewski can you please add the increase in size/evt and processing time/evt?

cmsbuild · 2021-04-26T17:21:15Z

A new Pull Request was created by @swozniewski for master.

It involves the following packages:

PhysicsTools/NanoAOD

@cmsbuild, @mariadalfonso, @gouskos, @fgolf can you please review it and eventually sign? Thanks.
@gpetruc, @swertz this is something you requested to watch as well.
@silviodonato, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

gouskos · 2021-04-26T17:58:44Z

please test

gouskos · 2021-04-26T19:48:34Z

PhysicsTools/NanoAOD/python/nano_cff.py

@@ -371,6 +395,7 @@ def nanoAOD_customizeCommon(process):
                                     addParticleNet=nanoAOD_addDeepInfoAK8_switch.nanoAOD_addParticleNet_switch,
                                     jecPayload=nanoAOD_addDeepInfoAK8_switch.jecPayload)
    (run2_nanoAOD_94XMiniAODv1 | run2_nanoAOD_94X2016 | run2_nanoAOD_94XMiniAODv2 | run2_nanoAOD_102Xv1 | run2_nanoAOD_106Xv1).toModify(process, lambda p : nanoAOD_addTauIds(p))
+    (~(run2_nanoAOD_94XMiniAODv1 | run2_nanoAOD_94X2016 | run2_nanoAOD_94XMiniAODv2 | run2_nanoAOD_102Xv1 | run2_nanoAOD_106Xv1)).toModify(process, lambda p : nanoAOD_addBoostedTauIds(p))


@swozniewski in the CMSSW master we put the latest and greatest developments, i.e., we do not protect against previous CMSSW versions. Please remove the modifiers here and elsewhere in this file. The modifiers would be needed when you make the PR in CMSSW_106X.

@gouskos I'm not sure how much we care in master, but as outlined during the meeting, previous miniAOD versions do not provide the correct input for boosted taus. Assuming that 11_X is not used with those, shall I still remove the modifiers? For the backport to 10_6_X, I should keep them, right?

@swagata87 After the discussion in the meeting I understood the situation, which, for the sake of completeness and bookkeeping, let me briefly summarise here. The first miniAOD version that provides the correct inputs for BoostedTau is the UL miniAOD sent for production in Dec 2020.
As it is now, you do not want to include boostedTaus in previous campaigns, right? If yes, then I think it is safer to keep this part of the code as it is, also for the master.

For 106X yes - you should keep them and add "run2_nanoAOD_devel". [Effectively, you want to always recalculate the inputs if not in "run2_nanoAOD_devel"] So I think it should look like:
(~(run2_nanoAOD_94XMiniAODv1 | run2_nanoAOD_94X2016 | run2_nanoAOD_94XMiniAODv2 | run2_nanoAOD_102Xv1 | run2_nanoAOD_106Xv1) & run2_nanoAOD_devel).toModify(process, lambda p : nanoAOD_addBoostedTauIds(p))
[but needs to be tested]

Right, we think it's only safe to include them from the latest campaign on. Ok, so I'll keep it. Is there something left for this PR to master (in case I missed something)?
It's actually not 're'-calculating the inputs. If boosted taus are written out, we also need to run this ID. My feeling is that boosted taus should be produced independently from run2_nanoAOD_devel (but maybe I'm wrong about the purpose of this modifier). Could you please confirm what dependency is envisaged? Then I'll add it to the backport and could also open the PR for it.

> Right, we think it's only safe to include them from the latest campaign on. Ok, so I'll keep it. Is there something left for this PR to master (in case I missed something)?

It looks good to me - we just need to run the standard nanoAOD tests, but there is an issue we are trying to solve before testing the PR. I would suggest to move to the 106X PR in the meantime

> It's actually not 're'-calculating the inputs. If boosted taus are written out, we also need to run this ID. My feeling is that boosted taus should be produced independently from run2_nanoAOD_devel (but maybe I'm wrong about the purpose of this modifier). Could you please confirm what dependency is envisaged? Then I'll add it to the backport and could also open the PR for it.

Thanks for the clarification! So then what you have there should work.

I got a bit delayed by an incompatibility in 10_6_X and I had to include a fix. I'll create the backport PR now.
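
The era logic discussed in this thread can be illustrated with a small toy model. This is only a sketch and not the CMSSW `cms.Modifier` class: `ToyModifier`, `is_chosen` and `configure` are hypothetical names, the list of old eras is shortened to two, and the only assumptions are that `|`, `&` and `~` combine modifiers as boolean conditions and that `toModify` runs its function when the combined condition holds.

```python
class ToyModifier:
    """Toy stand-in for an era modifier; NOT the real CMSSW cms.Modifier."""

    def __init__(self, chosen=False, rule=None):
        self._chosen = chosen
        self._rule = rule  # optional zero-argument callable for combined modifiers

    def is_chosen(self):
        return self._rule() if self._rule is not None else self._chosen

    def __or__(self, other):
        return ToyModifier(rule=lambda: self.is_chosen() or other.is_chosen())

    def __and__(self, other):
        return ToyModifier(rule=lambda: self.is_chosen() and other.is_chosen())

    def __invert__(self):
        return ToyModifier(rule=lambda: not self.is_chosen())

    def toModify(self, obj, func):
        # Apply func only when this (possibly combined) modifier is active.
        if self.is_chosen():
            func(obj)


def configure(active_eras, require_devel=False):
    """Return the ID steps that would run for a given set of active eras."""
    # Shortened era list, for illustration only.
    names = ("run2_nanoAOD_102Xv1", "run2_nanoAOD_106Xv1", "run2_nanoAOD_devel")
    era = {name: ToyModifier(name in active_eras) for name in names}

    old = era["run2_nanoAOD_102Xv1"] | era["run2_nanoAOD_106Xv1"]
    # '~' binds tighter than '&', so this reads as (~old) & devel.
    new = ~old & era["run2_nanoAOD_devel"] if require_devel else ~old

    steps = []
    old.toModify(steps, lambda s: s.append("addTauIds"))
    new.toModify(steps, lambda s: s.append("addBoostedTauIds"))
    return steps
```

In this toy, `configure(set())` schedules only the boosted-tau step, while activating one of the old eras schedules only the standard-tau ID step. With `require_devel=True`, the boosted-tau step additionally needs `run2_nanoAOD_devel` to be active, which is the dependency questioned above.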

gouskos · 2021-04-26T19:49:09Z

PhysicsTools/NanoAOD/python/nano_cff.py

@@ -371,6 +395,7 @@ def nanoAOD_customizeCommon(process):
                                     addParticleNet=nanoAOD_addDeepInfoAK8_switch.nanoAOD_addParticleNet_switch,
                                     jecPayload=nanoAOD_addDeepInfoAK8_switch.jecPayload)
    (run2_nanoAOD_94XMiniAODv1 | run2_nanoAOD_94X2016 | run2_nanoAOD_94XMiniAODv2 | run2_nanoAOD_102Xv1 | run2_nanoAOD_106Xv1).toModify(process, lambda p : nanoAOD_addTauIds(p))


This is not introduced by you, but why are these modifiers needed here?

As far as I know, this is to produce IDs on top of miniAOD input, which were not yet part of miniAOD at that time but requested for nanoAOD. Meanwhile, those are obtained from miniAOD directly.

gouskos · 2021-04-26T19:49:36Z

PhysicsTools/NanoAOD/python/nano_cff.py


 (run2_nanoAOD_92X | run2_miniAOD_80XLegacy | run2_nanoAOD_94X2016 | run2_nanoAOD_94X2016 | \
    run2_nanoAOD_94XMiniAODv1 | run2_nanoAOD_94XMiniAODv2 | \
    run2_nanoAOD_102Xv1).toReplaceWith(nanoSequenceFS, nanoSequenceFS.copyAndExclude([genVertexTable, genVertexT0Table]))

+#remove boosted tau from previous eras
+(run2_miniAOD_80XLegacy | run2_nanoAOD_92X | run2_nanoAOD_94XMiniAODv1 | run2_nanoAOD_94X2016 | run2_nanoAOD_94XMiniAODv2 | run2_nanoAOD_102Xv1 | run2_nanoAOD_106Xv1).toReplaceWith(nanoSequenceFS, nanoSequenceFS.copyAndExclude([boostedTauMC]))


Same as my other comments about the modifiers [see L. 398]

cmsbuild · 2021-04-26T20:36:17Z

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-b9b754/14587/summary.html
COMMIT: 96962a9
CMSSW: CMSSW_12_0_X_2021-04-26-1100/slc7_amd64_gcc900
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/33528/14587/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

No significant changes to the logs found
ROOTFileChecks: Some differences in event products or their sizes found
Reco comparison results: 0 differences found in the comparisons
DQMHistoTests: Total files compared: 38
DQMHistoTests: Total histograms compared: 2877605
DQMHistoTests: Total failures: 1
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 2877582
DQMHistoTests: Total skipped: 22
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 8.928 KiB( 37 files compared)
DQMHistoSizes: changed ( 1325.81,... ): 4.464 KiB Physics/NanoAODDQM
Checked 160 log files, 37 edm output root files, 38 DQM output files
TriggerResults: found differences in 7 / 37 workflows

swozniewski · 2021-04-28T10:34:34Z

> Hi @swozniewski can you please add the increase in size/evt and processing time/evt?

I've added the numbers to the PR description.

mariadalfonso · 2021-05-03T08:08:27Z

PhysicsTools/NanoAOD/python/boostedTaus_cff.py

+boostedTausMCMatchLepTauForTable = cms.EDProducer("MCMatcher",  # cut on deltaR, deltaPt/Pt; pick best by deltaR
+    src         = boostedTauTable.src,                 # final reco collection
+    matched     = cms.InputTag("finalGenParticles"), # final mc-truth particle collection
+    mcPdgId     = cms.vint32(11,13),            # one or more PDG ID (11 = electron, 13 = muon); absolute values (see below)
+    checkCharge = cms.bool(False),              # True = require RECO and MC objects to have the same charge
+    mcStatus    = cms.vint32(),                 # PYTHIA status code (1 = stable, 2 = shower, 3 = hard scattering)
+    maxDeltaR   = cms.double(0.3),              # Maximum deltaR for the match
+    maxDPtRel   = cms.double(0.5),              # Maximum deltaPt/Pt for the match
+    resolveAmbiguities    = cms.bool(True),     # Forbid two RECO objects to match to the same GEN object
+    resolveByMatchQuality = cms.bool(True),     # False = just match input in order; True = pick lowest deltaR pair first
+)
+
+#This requires genVisTaus in taus_cff.py
+boostedTausMCMatchHadTauForTable = cms.EDProducer("MCMatcher",  # cut on deltaR, deltaPt/Pt; pick best by deltaR
+    src         = boostedTauTable.src,                 # final reco collection
+    matched     = cms.InputTag("genVisTaus"),   # generator level hadronic tau decays
+    mcPdgId     = cms.vint32(15),               # one or more PDG ID (15 = tau); absolute values (see below)
+    checkCharge = cms.bool(False),              # True = require RECO and MC objects to have the same charge
+    mcStatus    = cms.vint32(),                 # CV: do *not* require certain status code for matching (status code corresponds to decay mode for hadronic tau decays)
+    maxDeltaR   = cms.double(0.3),              # Maximum deltaR for the match
+    maxDPtRel   = cms.double(1.),               # Maximum deltaPt/Pt for the match
+    resolveAmbiguities    = cms.bool(True),     # Forbid two RECO objects to match to the same GEN object
+    resolveByMatchQuality = cms.bool(True),     # False = just match input in order; True = pick lowest deltaR pair first
+)
+
+boostedTauMCTable = cms.EDProducer("CandMCMatchTableProducer",
+    src = boostedTauTable.src,
+    mcMap = cms.InputTag("boostedTausMCMatchLepTauForTable"),
+    mcMapVisTau = cms.InputTag("boostedTausMCMatchHadTauForTable"),                         
+    objName = boostedTauTable.name,
+    objType = cms.string("Tau"),
+    branchName = cms.string("genPart"),
+    docString = cms.string("MC matching to status==2 taus"),
+)


These are all copied from PhysicsTools/NanoAOD/python/taus_cff.py;
the only change is the src.

It would be better to clone them rather than reimplement them.

ok, sounds reasonable. I will change that.

With the last commit I also reuse the WP masks from the standard taus.
An import that I found to be unnecessary was removed.
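
The clone-instead-of-copy idea requested in this review can be sketched in plain Python. The class below is a toy stand-in, not `cms.EDProducer`; `ToyModule` and the two collection names are hypothetical, while the remaining parameter values are taken from the diff above. In a CMSSW configuration the equivalent step would be the module's own `clone` method with `src` overridden.

```python
import copy


class ToyModule:
    """Toy stand-in for a configurable module; NOT the real cms.EDProducer."""

    def __init__(self, plugin, **params):
        self.plugin = plugin
        self.params = params

    def clone(self, **overrides):
        # Deep-copy so the clone never shares mutable parameters with the original.
        new = copy.deepcopy(self)
        new.params.update(overrides)
        return new


# Parameter values follow the diff above; the two collection names are
# hypothetical placeholders.
std_lep_match = ToyModule(
    "MCMatcher",
    src="stdTauCollection",
    matched="finalGenParticles",
    mcPdgId=[11, 13],
    maxDeltaR=0.3,
    maxDPtRel=0.5,
)

# Only the input collection changes; every other setting is inherited.
boosted_lep_match = std_lep_match.clone(src="boostedTauCollection")
```

Cloning keeps the matching thresholds defined in one place, so a later change to the standard-tau matcher carries over to the boosted-tau one without a second edit.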

mariadalfonso · 2021-05-03T08:16:29Z

PhysicsTools/NanoAOD/python/boostedTaus_cff.py

+       rawMVAoldDM2017v2=Var("tauID('byIsolationMVArun2017v2DBoldDMwLTraw2017')",float, doc="byIsolationMVArun2017v2DBoldDMwLT raw output discriminator (2017v2)",precision=10),
+       rawMVAnewDM2017v2 = Var("tauID('byIsolationMVArun2017v2DBnewDMwLTraw2017')",float,doc='byIsolationMVArun2017v2DBnewDMwLT raw output discriminator (2017v2)',precision=10),
+       rawMVAoldDMdR032017v2 = Var("tauID('byIsolationMVArun2017v2DBoldDMdR0p3wLTraw2017')",float,doc='byIsolationMVArun2017v2DBoldDMdR0p3wLT raw output discriminator (2017v2)'),    
+       idMVAnewDM2017v2 = _tauId7WPMask("by%sIsolationMVArun2017v2DBnewDMwLT2017", doc="IsolationMVArun2017v2DBnewDMwLT ID working point (2017v2)"),
+       idMVAoldDM2017v2=_tauId7WPMask("by%sIsolationMVArun2017v2DBoldDMwLT2017",doc="IsolationMVArun2017v2DBoldDMwLT ID working point (2017v2)"),
+       idMVAoldDMdR032017v2 = _tauId7WPMask("by%sIsolationMVArun2017v2DBoldDMdR0p3wLT2017",doc="IsolationMVArun2017v2DBoldDMdR0p3wLT ID working point (2017v2)"),
+       rawAntiEle2018 = Var("tauID('againstElectronMVA6Raw')", float, doc= "Anti-electron MVA discriminator V6 raw output discriminator (2018)", precision=10),
+       rawAntiEleCat2018 = Var("tauID('againstElectronMVA6category')", int, doc="Anti-electron MVA discriminator V6 category (2018)"),
+       idAntiEle2018 = _tauId5WPMask("againstElectron%sMVA6", doc= "Anti-electron MVA discriminator V6 (2018)"),


For the taus as redefined in
https://cms-nanoaod-integration.web.cern.ch/integration/33513/mc106Xul17v2_size_report.html#Tau

the v2 variables are associated with tauID("...raw2017v2"):
https://github.com/cms-sw/cmssw/blob/master/PhysicsTools/NanoAOD/python/taus_cff.py#L101
while here you only have 2017. Would it be better to rename rawMVAoldDM2017v2 --> rawMVAoldDM2017 to avoid any confusion?

The input names differ from the standard taus because the toolchain for getting the IDs is not the same. But in fact this is v2 as well (see middle of the name (...MVArun2017v2...)). Therefore the name of the nanoAOD branch is correct.
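
To make the naming concrete, the sketch below shows how a `%s` pattern such as the ones in the diff could expand into one discriminator name per working point and be packed into a bitmask. It is an illustration only: `wp_mask_expression` and `passed_wps` are hypothetical names, and the working-point order and bit assignment are assumptions modelled on the standard-tau helpers, so the real `_tauId7WPMask` may differ in signature and output.

```python
# Working-point order is an assumption modelled on the standard-tau helpers.
SEVEN_WPS = ("VVLoose", "VLoose", "Loose", "Medium", "Tight", "VTight", "VVTight")


def wp_mask_expression(pattern, choices=SEVEN_WPS):
    """Build a string summing 2**i times the i-th working-point discriminator."""
    return " + ".join(
        "%d * tauID('%s')" % (2 ** i, pattern % wp) for i, wp in enumerate(choices)
    )


def passed_wps(mask, choices=SEVEN_WPS):
    """List the working points whose bit is set in an integer mask."""
    return [wp for i, wp in enumerate(choices) if mask & (1 << i)]


# The 'v2' tag sits in the middle of every expanded name, as noted above.
expr = wp_mask_expression("by%sIsolationMVArun2017v2DBnewDMwLT2017")
```

Under these assumptions a mask value of 7 would mean that the three loosest working points passed, which `passed_wps(7)` returns as a list of names.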

cmsbuild · 2021-05-03T13:05:53Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-33528/22431

This PR adds an extra 32KB to repository
There are other open Pull requests which might conflict with changes you have proposed:
- File PhysicsTools/NanoAOD/python/nanoDQM_cfi.py modified in PR(s): Add ParticleNet mass regression #33483
- File PhysicsTools/NanoAOD/python/nano_cff.py modified in PR(s): Refactoring gen weight storage in EDM + Nano integration #32167, Add ParticleNet mass regression #33483

cmsbuild · 2021-05-03T13:06:13Z

Pull request #33528 was updated. @cmsbuild, @mariadalfonso, @gouskos, @fgolf can you please check and sign again.

mariadalfonso · 2021-05-04T08:52:14Z

please test

cmsbuild · 2021-05-04T15:27:34Z

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-b9b754/14834/summary.html
COMMIT: 05ec716
CMSSW: CMSSW_12_0_X_2021-05-03-2300/slc7_amd64_gcc900
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/33528/14834/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

No significant changes to the logs found
ROOTFileChecks: Some differences in event products or their sizes found
Reco comparison results: 7 differences found in the comparisons
DQMHistoTests: Total files compared: 37
DQMHistoTests: Total histograms compared: 2662646
DQMHistoTests: Total failures: 12
DQMHistoTests: Total nulls: 1
DQMHistoTests: Total successes: 2662611
DQMHistoTests: Total skipped: 22
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 8.932 KiB( 36 files compared)
DQMHistoSizes: changed ( 1325.81,... ): 4.464 KiB Physics/NanoAODDQM
DQMHistoSizes: changed ( 312.0 ): 0.004 KiB MessageLogger/Warnings
Checked 155 log files, 37 edm output root files, 37 DQM output files
TriggerResults: found differences in 7 / 36 workflows

gouskos · 2021-05-06T16:22:49Z

+xpog

The PR adds the boostedTau collection in nanoAOD.
NanoAOD-integration tests passed: https://gitlab.cern.ch/cms-nanoAOD/nanoAOD-integration/-/issues/77#note_4458840
increase in size and timing in sync with PR description [~1% for both]

cmsbuild · 2021-05-06T16:23:10Z

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @silviodonato, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

qliphy · 2021-05-07T00:34:15Z

+1

cmsbuild added this to the CMSSW_12_0_X milestone Apr 26, 2021

cmsbuild added code-checks-pending orp-pending pending-signatures tests-pending xpog-pending labels Apr 26, 2021

cmsbuild added code-checks-approved and removed code-checks-pending labels Apr 26, 2021

cmsbuild added tests-started and removed tests-pending labels Apr 26, 2021

mariadalfonso mentioned this pull request Apr 26, 2021

features for V9 nanoAOD cms-nanoAOD/cmssw#555

Closed

55 tasks

gouskos reviewed Apr 26, 2021

cmsbuild added tests-approved and removed tests-started labels Apr 26, 2021

swozniewski mentioned this pull request Apr 30, 2021

Add boosted taus to nanoAOD (backport of #33528) #33587

Merged

mariadalfonso reviewed May 3, 2021

reuse objects from std taus and remove unnecessary import

05ec716

cmsbuild added code-checks-pending tests-pending and removed code-checks-approved tests-approved labels May 3, 2021

cmsbuild added code-checks-approved and removed code-checks-pending labels May 3, 2021

cmsbuild mentioned this pull request May 3, 2021

Clean up tau related nanoAOD content #33513

Merged

cmsbuild added tests-started and removed tests-pending labels May 4, 2021

cmsbuild added tests-approved and removed tests-started labels May 4, 2021

cmsbuild added fully-signed xpog-approved and removed pending-signatures xpog-pending labels May 6, 2021

cmsbuild added orp-approved and removed orp-pending labels May 7, 2021

cmsbuild merged commit a9c4b93 into cms-sw:master May 7, 2021

cmsbuild mentioned this pull request May 7, 2021

MTD geometry and reconstruction: store ETL layout logic into MTDTopology, use for ETL layers navigation in reconstruction #33598

Merged

cmsbuild added a commit that referenced this pull request May 12, 2021

Merge pull request #33587 from cms-tau-pog/CMSSW_10_6_X_backport_CMSS…

2d2c898

…W_11_3_X_tau-pog_boostedTaus Add boosted taus to nanoAOD (backport of #33528)

mbluj deleted the CMSSW_11_3_X_tau-pog_boostedTaus branch October 10, 2023 10:18
