Adding HZZ electron MVA ID #38355

Merged
8 commits merged into cms-sw:master on Jul 18, 2022

@asculac asculac commented Jun 13, 2022

PR description:

  • Adds the HZZ training for electron ID. This is an EGamma-approved ID (latest talk at the EGamma POG here) used for H->4l, and it cannot otherwise be recomputed on top of nanoAODs.

  • It is based on the same electronMVAValueMapProducer used for existing mvaIds, but with an updated and separate training for 2016UL, 2017UL and 2018UL.

  • A single variable mvaHZZIso is added to electrons. The appropriate training to be used is selected automatically. The idea is that the variable name will stay the same in the future, selecting a new training when appropriate.

  • No need to store working points as bools since these can be easily derived from the MVA value.

  • Also in this PR: the EGamma MVA ID variables are renamed to be more generic.
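
As a rough illustration of the point about working points: a pass/fail flag can be recomputed downstream from the stored score, so it does not need its own branch. The cut value and the function name below are hypothetical placeholders, not the official HZZ working point (which, in practice, may depend on the electron kinematics).

```python
# Hypothetical cut, for illustration only. The real working point is
# defined by the EGamma POG and is NOT taken from this PR.
HYPOTHETICAL_WP_CUT = 0.5


def passes_wp(mva_value: float, cut: float = HYPOTHETICAL_WP_CUT) -> bool:
    """Return True if the stored MVA score is above the given cut."""
    return mva_value > cut


# Example scores standing in for the per-electron mvaHZZIso values.
electron_mva_scores = [-1.0, 0.2, 0.93]
flags = [passes_wp(score) for score in electron_mva_scores]
```

Only the float is stored per electron; the boolean is derived on demand, which is why no extra bool columns are added.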

PR validation:

  • Tested in CMSSW_12_4_0_pre1; verified that the new variable is added correctly.
  • Adds 1 float per electron.

@namapane @swagata87 pls follow

*This PR is a clean resubmission of the previous PR (#37429).

Adding mvaHZZ variable with HZZ electron ID training.
Introducing generic names for egamma IDs
mvaFall17V2Iso --> mvaIso
mvaFall17V2noIso --> mvanoIso
@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-38355/30541

  • This PR adds an extra 32KB to repository

  • There are other open Pull requests which might conflict with changes you have proposed:

@cmsbuild
Contributor

A new Pull Request was created by @asculac (Ana Sculac) for master.

It involves the following packages:

  • PhysicsTools/NanoAOD (xpog)
  • PhysicsTools/PatAlgos (reconstruction)

@gouskos, @clacaputo, @cmsbuild, @fgolf, @slava77, @jpata, @mariadalfonso can you please review it and eventually sign? Thanks.
@AlexDeMoor, @rappoccio, @gouskos, @jdolen, @swertz, @JyothsnaKomaragiri, @ahinzmann, @schoef, @emilbols, @jdamgov, @mbluj, @nhanvtran, @gkasieczka, @hatakeyamak, @gpetruc, @azotz, @mariadalfonso, @demuller, @andrzejnovak, @seemasharmafnal, @mmarionncern this is something you requested to watch as well.
@perrotta, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@mariadalfonso
Contributor

please test

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-0e8389/25484/summary.html
COMMIT: e0f0c1c
CMSSW: CMSSW_12_5_X_2022-06-13-1100/el8_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/38355/25484/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 85 differences found in the comparisons
  • DQMHistoTests: Total files compared: 50
  • DQMHistoTests: Total histograms compared: 3659074
  • DQMHistoTests: Total failures: 48
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3659004
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 49 files compared)
  • Checked 208 log files, 45 edm output root files, 50 DQM output files
  • TriggerResults: no differences found

@mariadalfonso
Contributor

@asculac
can you prepare the backport to the 10_4_X in the meantime?

@@ -305,6 +309,15 @@ def _get_bitmapVIDForEle_docstring(modules,WorkingPoints):
VIDNestedWPBitmapSpring15 = cms.InputTag("bitmapVIDForEleSpring15"),
VIDNestedWPBitmapSum16 = cms.InputTag("bitmapVIDForEleSum16"),
)
(~run2_nanoAOD_preUL & run2_egamma_2016).toModify(slimmedElectronsWithUserData.userFloats,
mvaHZZIso = cms.InputTag("electronMVAValueMapProducer:ElectronMVAEstimatorRun2Summer16ULIdIsoValues"),

Suggested change
- mvaHZZIso = cms.InputTag("electronMVAValueMapProducer:ElectronMVAEstimatorRun2Summer16ULIdIsoValues"),
+ mvaHZZIso = "electronMVAValueMapProducer:ElectronMVAEstimatorRun2Summer16ULIdIsoValues",

mvaHZZIso = cms.InputTag("electronMVAValueMapProducer:ElectronMVAEstimatorRun2Summer16ULIdIsoValues"),
)
(~run2_nanoAOD_preUL & run2_egamma_2017).toModify(slimmedElectronsWithUserData.userFloats,
mvaHZZIso = cms.InputTag("electronMVAValueMapProducer:ElectronMVAEstimatorRun2Summer17ULIdIsoValues") ,

Suggested change
- mvaHZZIso = cms.InputTag("electronMVAValueMapProducer:ElectronMVAEstimatorRun2Summer17ULIdIsoValues") ,
+ mvaHZZIso = "electronMVAValueMapProducer:ElectronMVAEstimatorRun2Summer17ULIdIsoValues" ,

mvaHZZIso = cms.InputTag("electronMVAValueMapProducer:ElectronMVAEstimatorRun2Summer17ULIdIsoValues") ,
)
(~run2_nanoAOD_preUL & run2_egamma_2018).toModify(slimmedElectronsWithUserData.userFloats,
mvaHZZIso = cms.InputTag("electronMVAValueMapProducer:ElectronMVAEstimatorRun2Summer18ULIdIsoValues"),

Suggested change
- mvaHZZIso = cms.InputTag("electronMVAValueMapProducer:ElectronMVAEstimatorRun2Summer18ULIdIsoValues"),
+ mvaHZZIso = "electronMVAValueMapProducer:ElectronMVAEstimatorRun2Summer18ULIdIsoValues",
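
The three era-specific blocks above all fill the same mvaHZZIso user float, each from a different training. A plain-Python stand-in for that selection is sketched below; the dictionary, the era keys and the function name are assumptions made for illustration, and this is not the CMSSW Modifier API. Only the three value-map labels are taken from the diff.

```python
# Value-map labels as they appear in the diff above; the dictionary keys
# ("2016", "2017", "2018") are illustrative and not CMSSW identifiers.
PRODUCER = "electronMVAValueMapProducer"
TRAINING_BY_ERA = {
    "2016": PRODUCER + ":ElectronMVAEstimatorRun2Summer16ULIdIsoValues",
    "2017": PRODUCER + ":ElectronMVAEstimatorRun2Summer17ULIdIsoValues",
    "2018": PRODUCER + ":ElectronMVAEstimatorRun2Summer18ULIdIsoValues",
}


def mva_hzz_iso_source(era: str) -> str:
    """Return the value-map label that would fill mvaHZZIso for an era."""
    try:
        return TRAINING_BY_ERA[era]
    except KeyError:
        raise ValueError(f"no HZZ training registered for era {era!r}")


# The output variable name stays fixed; only the source changes per era.
user_floats = {"mvaHZZIso": mva_hzz_iso_source("2017")}
```

Keeping one output name and switching only the source is what lets a future training be added without changing the variable seen by analysts.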

@mariadalfonso
Contributor

Similarly to what @perrotta pointed out, there are more places where the same cleanup is useful

modifier.toModify(slimmedElectronsUpdated, src = cms.InputTag("slimmedElectronsTo106X"))

(run2_egamma_2016 & tracker_apv_vfp30_2016).toModify(calibratedPatElectronsNano,
    correctionFile = cms.string("EgammaAnalysis/ElectronTools/data/ScalesSmearings/Run2016_UltraLegacy_preVFP_RunFineEtaR9Gain")
)
(run2_egamma_2016 & ~tracker_apv_vfp30_2016).toModify(calibratedPatElectronsNano,
    correctionFile = cms.string("EgammaAnalysis/ElectronTools/data/ScalesSmearings/Run2016_UltraLegacy_postVFP_RunFineEtaR9Gain")
)
run2_egamma_2017.toModify(calibratedPatElectronsNano,
    correctionFile = cms.string("EgammaAnalysis/ElectronTools/data/ScalesSmearings/Run2017_24Feb2020_runEtaR9Gain_v2")
)
run2_egamma_2018.toModify(calibratedPatElectronsNano,
    correctionFile = cms.string("EgammaAnalysis/ElectronTools/data/ScalesSmearings/Run2018_29Sep2020_RunFineEtaR9Gain")
)
run2_miniAOD_80XLegacy.toModify(calibratedPatElectronsNano,
    correctionFile = cms.string("EgammaAnalysis/ElectronTools/data/ScalesSmearings/Legacy2016_07Aug2017_FineEtaR9_v3_ele_unc")
)
for modifier in run2_nanoAOD_94XMiniAODv1,run2_nanoAOD_94XMiniAODv2:
    modifier.toModify(calibratedPatElectronsNano,
        correctionFile = cms.string("EgammaAnalysis/ElectronTools/data/ScalesSmearings/Run2017_17Nov2017_v1_ele_unc")
    )
run2_nanoAOD_102Xv1.toModify(calibratedPatElectronsNano,
    correctionFile = cms.string("EgammaAnalysis/ElectronTools/data/ScalesSmearings/Run2018_Step2Closure_CoarseEtaR9Gain_v2")
)

run2_nanoAOD_94X2016.toModify(slimmedElectronsWithUserData.userIntFromBools,
    # MVAs and HEEP are already pre-computed. Cut-based too (except V2), but we re-add it for consistency with the nested bitmap
    cutbasedID_Sum16_veto = cms.InputTag("egmGsfElectronIDs:cutBasedElectronID-Summer16-80X-V1-veto"),
    cutbasedID_Sum16_loose = cms.InputTag("egmGsfElectronIDs:cutBasedElectronID-Summer16-80X-V1-loose"),
    cutbasedID_Sum16_medium = cms.InputTag("egmGsfElectronIDs:cutBasedElectronID-Summer16-80X-V1-medium"),
    cutbasedID_Sum16_tight = cms.InputTag("egmGsfElectronIDs:cutBasedElectronID-Summer16-80X-V1-tight"),
    cutbasedID_HLT = cms.InputTag("egmGsfElectronIDs:cutBasedElectronHLTPreselection-Summer16-V1"),
    cutbasedID_Spring15_veto = cms.InputTag("egmGsfElectronIDs:cutBasedElectronID-Spring15-25ns-V1-standalone-veto"),
    cutbasedID_Spring15_loose = cms.InputTag("egmGsfElectronIDs:cutBasedElectronID-Spring15-25ns-V1-standalone-loose"),
    cutbasedID_Spring15_medium = cms.InputTag("egmGsfElectronIDs:cutBasedElectronID-Spring15-25ns-V1-standalone-medium"),
    cutbasedID_Spring15_tight = cms.InputTag("egmGsfElectronIDs:cutBasedElectronID-Spring15-25ns-V1-standalone-tight"),
    cutbasedID_Fall17_V2_veto = cms.InputTag("egmGsfElectronIDs:cutBasedElectronID-Fall17-94X-V2-veto"),
    cutbasedID_Fall17_V2_loose = cms.InputTag("egmGsfElectronIDs:cutBasedElectronID-Fall17-94X-V2-loose"),
    cutbasedID_Fall17_V2_medium = cms.InputTag("egmGsfElectronIDs:cutBasedElectronID-Fall17-94X-V2-medium"),
    cutbasedID_Fall17_V2_tight = cms.InputTag("egmGsfElectronIDs:cutBasedElectronID-Fall17-94X-V2-tight"),
)
run2_miniAOD_80XLegacy.toModify(slimmedElectronsWithUserData.userFloats,
    mvaSpring16GP = cms.InputTag("electronMVAValueMapProducer:ElectronMVAEstimatorRun2Spring16GeneralPurposeV1Values"),
    mvaSpring16HZZ = cms.InputTag("electronMVAValueMapProducer:ElectronMVAEstimatorRun2Spring16HZZV1Values"),
)
run2_miniAOD_80XLegacy.toModify(slimmedElectronsWithUserData.userIntFromBools,
    mvaSpring16GP_WP90 = cms.InputTag("egmGsfElectronIDs:mvaEleID-Spring16-GeneralPurpose-V1-wp90"),
    mvaSpring16GP_WP80 = cms.InputTag("egmGsfElectronIDs:mvaEleID-Spring16-GeneralPurpose-V1-wp80"),
    mvaSpring16HZZ_WPL = cms.InputTag("egmGsfElectronIDs:mvaEleID-Spring16-HZZ-V1-wpLoose"),
    cutbasedID_Sum16_veto = cms.InputTag("egmGsfElectronIDs:cutBasedElectronID-Summer16-80X-V1-veto"),
    cutbasedID_Sum16_loose = cms.InputTag("egmGsfElectronIDs:cutBasedElectronID-Summer16-80X-V1-loose"),
    cutbasedID_Sum16_medium = cms.InputTag("egmGsfElectronIDs:cutBasedElectronID-Summer16-80X-V1-medium"),
    cutbasedID_Sum16_tight = cms.InputTag("egmGsfElectronIDs:cutBasedElectronID-Summer16-80X-V1-tight"),
    cutbasedID_HLT = cms.InputTag("egmGsfElectronIDs:cutBasedElectronHLTPreselection-Summer16-V1"),
    cutbasedID_Spring15_veto = cms.InputTag("egmGsfElectronIDs:cutBasedElectronID-Spring15-25ns-V1-standalone-veto"),
    cutbasedID_Spring15_loose = cms.InputTag("egmGsfElectronIDs:cutBasedElectronID-Spring15-25ns-V1-standalone-loose"),
    cutbasedID_Spring15_medium = cms.InputTag("egmGsfElectronIDs:cutBasedElectronID-Spring15-25ns-V1-standalone-medium"),
    cutbasedID_Spring15_tight = cms.InputTag("egmGsfElectronIDs:cutBasedElectronID-Spring15-25ns-V1-standalone-tight"),
)
for modifier in run2_miniAOD_80XLegacy, run2_nanoAOD_94X2016:
    modifier.toModify(slimmedElectronsWithUserData.userInts,
        VIDNestedWPBitmapSpring15 = cms.InputTag("bitmapVIDForEleSpring15"),
        VIDNestedWPBitmapSum16 = cms.InputTag("bitmapVIDForEleSum16"),

@mariadalfonso
Contributor

@asculac
one more request: the renaming of the Fall17V2 ID should also be done in the DQM
https://github.com/cms-sw/cmssw/blob/master/PhysicsTools/NanoAOD/python/nanoDQM_cfi.py#L86-L93

@clacaputo
Contributor

@cmsbuild please test

Just refreshing the test results

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-0e8389/25831/summary.html
COMMIT: e0f0c1c
CMSSW: CMSSW_12_5_X_2022-06-26-2300/el8_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/38355/25831/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 83 differences found in the comparisons
  • DQMHistoTests: Total files compared: 50
  • DQMHistoTests: Total histograms compared: 3659995
  • DQMHistoTests: Total failures: 42
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3659931
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 49 files compared)
  • Checked 208 log files, 45 edm output root files, 50 DQM output files
  • TriggerResults: no differences found

@asculac
Contributor Author

asculac commented Jul 11, 2022

94X2016 is not run in runTheMatrix, which is why you do not see it here but here: https://gitlab.cern.ch/cms-nanoAOD/nanoAOD-integration/-/issues/153

Anyhow, you need to roll back the InputTag changes for the 94X2016 modifier, as done previously for the 8X.

For the backport, we need it in 12_4 [sorry for the typo creating confusion here https://github.com//pull/38355#issuecomment-1158576889]

Thank you for clearing it up. If this passes tests I'll immediately do the backport for 12_4_X

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-38355/30985

  • This PR adds an extra 48KB to repository

  • There are other open Pull requests which might conflict with changes you have proposed:

@cmsbuild
Contributor

Pull request #38355 was updated. @gouskos, @clacaputo, @cmsbuild, @fgolf, @jpata, @mariadalfonso can you please check and sign again.

@clacaputo
Contributor

@cmsbuild please test

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-0e8389/26139/summary.html
COMMIT: 4999e15
CMSSW: CMSSW_12_5_X_2022-07-11-1100/el8_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/38355/26139/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 87 differences found in the comparisons
  • DQMHistoTests: Total files compared: 50
  • DQMHistoTests: Total histograms compared: 3655930
  • DQMHistoTests: Total failures: 7
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 3655900
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: -0.317 KiB( 49 files compared)
  • DQMHistoSizes: changed ( 1325.81 ): -0.188 KiB Physics/NanoAODDQM
  • DQMHistoSizes: changed ( 136.8523 ): -0.125 KiB Physics/NanoAODDQM
  • DQMHistoSizes: changed ( 312.0 ): -0.004 KiB MessageLogger/Warnings
  • Checked 208 log files, 45 edm output root files, 50 DQM output files
  • TriggerResults: no differences found

@gouskos
Contributor

gouskos commented Jul 17, 2022

+xpog
Modifications in line with PR description

@clacaputo
Contributor

+reconstruction

  • The RECO differences are related to the new variable "mvaHZZIso", which is not present in the reference release. The new variable shifts some electronTable columns, which introduces artificial differences in a position-by-position comparison. The table below shows that the values of the already existing variables are the same, despite the change in name.

| This PR (name) | This PR (value) | Base ref (name) | Base ref (value) |
| miniPFRelIso_chg | 1.8108803033828735 | miniPFRelIso_chg | 1.8108803033828735 |
| mvaHZZIso | -1.0 | mvaFall17V2Iso | -0.999984622001648 |
| mvaIso | -0.999984622001648 | mvaFall17V2Iso_WP80 | 0.0 |
| mvaIso_WP80 | 0.0 | mvaFall17V2Iso_WP90 | 0.0 |
| mvaIso_WP90 | 0.0 | mvaFall17V2Iso_WPL | 0.0 |
| mvaIso_WPL | 0.0 | mvaFall17V2noIso | -0.9999901056289673 |
| mvaNoIso | -0.9999901056289673 | mvaFall17V2noIso_WP80 | 0.0 |
| mvaNoIso_WP80 | 0.0 | mvaFall17V2noIso_WP90 | 0.0 |
| mvaNoIso_WP90 | 0.0 | mvaFall17V2noIso_WPL | 0.0 |
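
To make the point about shifted columns concrete, here is a small sketch that compares the two sets of values by name instead of by position. The numbers are copied from the table above; the renaming rule in `legacy_name` is an assumption inferred from the column names in that table, and the helper itself is hypothetical.

```python
# Values for one electron, copied from the comparison table above.
this_pr = {
    "miniPFRelIso_chg": 1.8108803033828735,
    "mvaHZZIso": -1.0,
    "mvaIso": -0.999984622001648,
    "mvaIso_WP80": 0.0,
    "mvaIso_WP90": 0.0,
    "mvaIso_WPL": 0.0,
    "mvaNoIso": -0.9999901056289673,
    "mvaNoIso_WP80": 0.0,
    "mvaNoIso_WP90": 0.0,
}
base_ref = {
    "miniPFRelIso_chg": 1.8108803033828735,
    "mvaFall17V2Iso": -0.999984622001648,
    "mvaFall17V2Iso_WP80": 0.0,
    "mvaFall17V2Iso_WP90": 0.0,
    "mvaFall17V2Iso_WPL": 0.0,
    "mvaFall17V2noIso": -0.9999901056289673,
    "mvaFall17V2noIso_WP80": 0.0,
    "mvaFall17V2noIso_WP90": 0.0,
    "mvaFall17V2noIso_WPL": 0.0,
}


def legacy_name(new_name: str) -> str:
    """Map a renamed column back to its Fall17V2 name (assumed rule)."""
    if new_name.startswith("mvaNoIso"):
        return new_name.replace("mvaNoIso", "mvaFall17V2noIso", 1)
    if new_name.startswith("mvaIso"):
        return new_name.replace("mvaIso", "mvaFall17V2Iso", 1)
    return new_name


# Columns that have no counterpart in the reference release.
new_only = [name for name in this_pr if legacy_name(name) not in base_ref]

# Columns present in both whose values differ once matched by name.
mismatches = {
    name: (value, base_ref[legacy_name(name)])
    for name, value in this_pr.items()
    if legacy_name(name) in base_ref and value != base_ref[legacy_name(name)]
}
```

Matched by name, only mvaHZZIso is new and no existing value changes, which is consistent with the reviewer's conclusion that the reported differences come from the column shift.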

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@perrotta
Contributor

+1

@cmsbuild cmsbuild merged commit 72051cc into cms-sw:master Jul 18, 2022
6 participants