Removal of depreciated tauIDs #37091

mbluj · 2022-02-28T20:50:57Z

PR description:

This PR is to remove depreciated BDT-based tauIDs superseded by deepTauIDs. The removal is preformed at AOD and miniAOD steps, nanoAOD is adjusted. Subset of old IDs is, however, kept for boosted taus as there is not viable replacement for them, yet.
Expected changes:

Removal of "byIsolationMVArun2" and "againstElectronMVA6" tau discriminators from AOD and RECO event content, and from slimmedTaus in miniAOD.

PR validation:

Matrix tests (`runTheMatrix.py -l limited -i all --ibeos) successful

cmsbuild · 2022-02-28T20:59:06Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-37091/28584

This PR adds an extra 40KB to repository
There are other open Pull requests which might conflict with changes you have proposed:
- File PhysicsTools/NanoAOD/python/nano_cff.py modified in PR(s): Refactoring gen weight storage in EDM + Nano integration #32167

cmsbuild · 2022-02-28T20:59:24Z

A new Pull Request was created by @mbluj for master.

It involves the following packages:

PhysicsTools/NanoAOD (xpog)
PhysicsTools/PatAlgos (reconstruction)
RecoTauTag/Configuration (reconstruction)
Validation/RecoTau (dqm)

@gouskos, @emanueleusai, @ahmad3213, @cmsbuild, @jfernan2, @fgolf, @clacaputo, @slava77, @jpata, @mariadalfonso, @pmandrik, @pbo0, @rvenditti can you please review it and eventually sign? Thanks.
@AlexDeMoor, @rappoccio, @gouskos, @jdolen, @swertz, @JyothsnaKomaragiri, @ahinzmann, @schoef, @emilbols, @jdamgov, @mbluj, @nhanvtran, @gkasieczka, @hatakeyamak, @gpetruc, @azotz, @mariadalfonso, @demuller, @andrzejnovak, @seemasharmafnal, @mmarionncern this is something you requested to watch as well.
@perrotta, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

mariadalfonso · 2022-03-01T07:09:58Z

please test

cmsbuild · 2022-03-01T11:02:32Z

-1

Failed Tests: UnitTests
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-50d770/22732/summary.html
COMMIT: 89ba780
CMSSW: CMSSW_12_3_X_2022-02-28-2300/slc7_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/37091/22732/install.sh to create a dev area with all the needed externals and cmssw changes.

Unit Tests

I found errors in the following unit tests:

---> test runtestPhysicsToolsNanoAOD had ERRORS

Comparison Summary

Summary:

No significant changes to the logs found
ROOTFileChecks: Some differences in event products or their sizes found
Reco comparison results: 3534 differences found in the comparisons
DQMHistoTests: Total files compared: 49
DQMHistoTests: Total histograms compared: 3944604
DQMHistoTests: Total failures: 299
DQMHistoTests: Total nulls: 263
DQMHistoTests: Total successes: 3944020
DQMHistoTests: Total skipped: 22
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: -20312.91 KiB( 48 files compared)
DQMHistoSizes: changed ( 10024.0,... ): -16.692 KiB RecoTauV/hpsPFTauProducerQCD_hpsPFTauDiscriminationByVVTightIsolationMVArun2v1DBnewDMwLT
DQMHistoSizes: changed ( 10024.0,... ): -16.692 KiB RecoTauV/hpsPFTauProducerQCD_hpsPFTauDiscriminationByVVTightIsolationMVArun2v1DBoldDMwLT
DQMHistoSizes: changed ( 10024.0,... ): -16.692 KiB RecoTauV/hpsPFTauProducerZTT_hpsPFTauDiscriminationByVVTightIsolationMVArun2v1DBnewDMwLT
DQMHistoSizes: changed ( 10024.0,... ): -16.692 KiB RecoTauV/hpsPFTauProducerZTT_hpsPFTauDiscriminationByVVTightIsolationMVArun2v1DBoldDMwLT
DQMHistoSizes: changed ( 10024.0,... ): -16.612 KiB RecoTauV/hpsPFTauProducerQCD_hpsPFTauDiscriminationByMediumIsolationMVArun2v1DBnewDMwLT
DQMHistoSizes: changed ( 10024.0,... ): -16.612 KiB RecoTauV/hpsPFTauProducerQCD_hpsPFTauDiscriminationByMediumIsolationMVArun2v1DBoldDMwLT
DQMHistoSizes: changed ( 10024.0,... ): -16.612 KiB RecoTauV/hpsPFTauProducerQCD_hpsPFTauDiscriminationByVLooseIsolationMVArun2v1DBnewDMwLT
DQMHistoSizes: changed ( 10024.0,... ): -16.612 KiB RecoTauV/hpsPFTauProducerQCD_hpsPFTauDiscriminationByVLooseIsolationMVArun2v1DBoldDMwLT
DQMHistoSizes: changed ( 10024.0,... ): -16.612 KiB RecoTauV/hpsPFTauProducerQCD_hpsPFTauDiscriminationByVTightIsolationMVArun2v1DBnewDMwLT
DQMHistoSizes: changed ( 10024.0,... ): -16.612 KiB RecoTauV/hpsPFTauProducerQCD_hpsPFTauDiscriminationByVTightIsolationMVArun2v1DBoldDMwLT
DQMHistoSizes: changed ( 10024.0 ): ...
Checked 204 log files, 45 edm output root files, 49 DQM output files
TriggerResults: no differences found

mbluj · 2022-03-01T11:26:08Z

The unit test fails due to problem with accessing input file - I checked and indeed the file is available only at tape. I can fix it, but need guidance which replacement should be used (preferably some file stored "permanently" for matrix tests or so).
What concerns differences in DQM they refer to removed tauIDs.

jfernan2 · 2022-03-01T11:56:03Z

Thanks @mbluj
The input file looks a bit old (94X) but I let @cms-sw/xpog-l2 to comment
About the DQM changes, with the removal of TauIDs, there are now (summary) plots which now get null entries, like:
https://tinyurl.com/ybjwzurd
and others where the number of labels in x-axis are reduced (since those IDs are not monitorized now):
https://tinyurl.com/ybtk3zx5

I wonder if this could be easily improved to avoid misunderstandings in the validation.
Thanks

mbluj · 2022-03-01T12:17:56Z

Thanks @mbluj The input file looks a bit old (94X) but I let @cms-sw/xpog-l2 to comment About the DQM changes, with the removal of TauIDs, there are now (summary) plots which now get null entries, like: https://tinyurl.com/ybjwzurd and others where the number of labels in x-axis are reduced (since those IDs are not monitorized now): https://tinyurl.com/ybtk3zx5

I wonder if this could be easily improved to avoid misunderstandings in the validation. Thanks

Fixes to DQM other than purely technical, i.e. preventing it from failing, are beyond scope of this PR. The idea is to provide functional fixes, i.e. use of deepTauIDs in case of standard taus and MVAIso new/old DMs in case of boosted one, in other PR. This DQM PR is expected soon.

jfernan2 · 2022-03-01T12:20:36Z

+1

cmsbuild · 2022-03-05T13:08:27Z

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-50d770/22864/summary.html
COMMIT: 0e58eec
CMSSW: CMSSW_12_3_X_2022-03-04-2300/slc7_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/37091/22864/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

No significant changes to the logs found
ROOTFileChecks: Some differences in event products or their sizes found
Reco comparison results: 3541 differences found in the comparisons
DQMHistoTests: Total files compared: 49
DQMHistoTests: Total histograms compared: 3931448
DQMHistoTests: Total failures: 300
DQMHistoTests: Total nulls: 263
DQMHistoTests: Total successes: 3930863
DQMHistoTests: Total skipped: 22
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: -20322.154 KiB( 48 files compared)
DQMHistoSizes: changed ( 10024.0,... ): -16.692 KiB RecoTauV/hpsPFTauProducerQCD_hpsPFTauDiscriminationByVVTightIsolationMVArun2v1DBnewDMwLT
DQMHistoSizes: changed ( 10024.0,... ): -16.692 KiB RecoTauV/hpsPFTauProducerQCD_hpsPFTauDiscriminationByVVTightIsolationMVArun2v1DBoldDMwLT
DQMHistoSizes: changed ( 10024.0,... ): -16.692 KiB RecoTauV/hpsPFTauProducerZTT_hpsPFTauDiscriminationByVVTightIsolationMVArun2v1DBnewDMwLT
DQMHistoSizes: changed ( 10024.0,... ): -16.692 KiB RecoTauV/hpsPFTauProducerZTT_hpsPFTauDiscriminationByVVTightIsolationMVArun2v1DBoldDMwLT
DQMHistoSizes: changed ( 10024.0,... ): -16.612 KiB RecoTauV/hpsPFTauProducerQCD_hpsPFTauDiscriminationByMediumIsolationMVArun2v1DBnewDMwLT
DQMHistoSizes: changed ( 10024.0,... ): -16.612 KiB RecoTauV/hpsPFTauProducerQCD_hpsPFTauDiscriminationByMediumIsolationMVArun2v1DBoldDMwLT
DQMHistoSizes: changed ( 10024.0,... ): -16.612 KiB RecoTauV/hpsPFTauProducerQCD_hpsPFTauDiscriminationByVLooseIsolationMVArun2v1DBnewDMwLT
DQMHistoSizes: changed ( 10024.0,... ): -16.612 KiB RecoTauV/hpsPFTauProducerQCD_hpsPFTauDiscriminationByVLooseIsolationMVArun2v1DBoldDMwLT
DQMHistoSizes: changed ( 10024.0,... ): -16.612 KiB RecoTauV/hpsPFTauProducerQCD_hpsPFTauDiscriminationByVTightIsolationMVArun2v1DBnewDMwLT
DQMHistoSizes: changed ( 10024.0,... ): -16.612 KiB RecoTauV/hpsPFTauProducerQCD_hpsPFTauDiscriminationByVTightIsolationMVArun2v1DBoldDMwLT
DQMHistoSizes: changed ( 10024.0 ): ...
Checked 204 log files, 45 edm output root files, 49 DQM output files
TriggerResults: no differences found

jfernan2 · 2022-03-05T20:49:59Z

+1

clacaputo · 2022-03-07T15:52:26Z

Reco changes in AOD are in line with the PR, while differences in miniAOD can be related to a known issue in the validate script #36179 (comment).
I'm checking them locally.

clacaputo · 2022-03-07T18:38:35Z

I've checked for wf 10024.0 the differences in miniAOD for tauIDs. Let's take the slimmedTausBoosted__RECO_obj___tauIDs__19 shown in the picture below

Comparing pT, eta, tauIDs[19], tauIDs[3] for PR+CMSSW_12_3_X_2022-03-04-2300 and CMSSW_12_3_X_2022-03-04-2300, we found that tauIDs[19] point to a different ID ( byIsolationMVArun2DBnewDMwLTraw Vs byIsolationMVArun2v1DBdR03oldDMwLTraw, see results below), while pT, eta, tauIDs[3] are the same

PR+CMSSW_12_3_X_2022-03-04-2300

tau pT   tau eta         tauIDs[19]                                                                            tauIDs[3]
77.55    -1.73   { "byIsolationMVArun2DBnewDMwLTraw", -0.990194f }       { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 24.7089f }
38.81    -1.18   { "byIsolationMVArun2DBnewDMwLTraw", -0.947184f }       { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 39.1803f }
29.63    -0.18   { "byIsolationMVArun2DBnewDMwLTraw", -0.884993f }       { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 26.3342f }
19.34    -0.65   { "byIsolationMVArun2DBnewDMwLTraw", -0.980453f }       { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 60.0140f }
36.12    -0.96   { "byIsolationMVArun2DBnewDMwLTraw", -0.991541f }       { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 33.4097f }
36.42    -1.40   { "byIsolationMVArun2DBnewDMwLTraw", -0.568650f }       { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 1.94070f }
105.37   0.57    { "byIsolationMVArun2DBnewDMwLTraw", -0.768153f }       { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 13.4821f }
63.61    -0.18   { "byIsolationMVArun2DBnewDMwLTraw", -0.940415f }       { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 22.8685f }
50.33    1.27    { "byIsolationMVArun2DBnewDMwLTraw", -0.0154546f }      { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 7.33329f }
32.36    1.25    { "byIsolationMVArun2DBnewDMwLTraw", -0.962710f }       { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 10.9408f }
18.27    -1.33   { "byIsolationMVArun2DBnewDMwLTraw", -0.985566f }       { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 46.5323f }
53.44    -0.92   { "byIsolationMVArun2DBnewDMwLTraw", -0.956164f }       { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 27.0370f }
28.85    -0.90   { "byIsolationMVArun2DBnewDMwLTraw", -0.911898f }       { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 22.0325f }

CMSSW_12_3_X_2022-03-04-2300

tau pT   tau eta         tauIDs[19]                                                                              tauIDs[3]
77.55    -1.73   { "byIsolationMVArun2v1DBdR03oldDMwLTraw", -1.00000f }  { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 24.7089f }
38.81    -1.18   { "byIsolationMVArun2v1DBdR03oldDMwLTraw", -1.00000f }  { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 39.1803f }
29.63    -0.18   { "byIsolationMVArun2v1DBdR03oldDMwLTraw", -1.00000f }  { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 26.3342f }
19.34    -0.65   { "byIsolationMVArun2v1DBdR03oldDMwLTraw", -1.00000f }  { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 60.0140f }
36.12    -0.96   { "byIsolationMVArun2v1DBdR03oldDMwLTraw", -1.00000f }  { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 33.4097f }
36.42    -1.40   { "byIsolationMVArun2v1DBdR03oldDMwLTraw", -0.642849f }         { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 1.94070f }
105.37   0.57    { "byIsolationMVArun2v1DBdR03oldDMwLTraw", -0.968865f }         { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 13.4821f }
63.61    -0.18   { "byIsolationMVArun2v1DBdR03oldDMwLTraw", -0.974198f }         { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 22.8685f }
50.33    1.27    { "byIsolationMVArun2v1DBdR03oldDMwLTraw", -0.436733f }         { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 7.33329f }
32.36    1.25    { "byIsolationMVArun2v1DBdR03oldDMwLTraw", -1.00000f }  { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 10.9408f }
18.27    -1.33   { "byIsolationMVArun2v1DBdR03oldDMwLTraw", -0.868038f }         { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 46.5323f }
53.44    -0.92   { "byIsolationMVArun2v1DBdR03oldDMwLTraw", -1.00000f }  { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 27.0370f }
28.85    -0.90   { "byIsolationMVArun2v1DBdR03oldDMwLTraw", -0.956641f }         { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 22.0325f }

Summing up, the differences in miniAOD are in line with the PR

clacaputo · 2022-03-07T18:42:21Z

+reconstruction

the PR remove depreciated BDT-based tauIDs superseded by deepTauIDs. A subset of old IDs is, however, kept for boosted taus
reco changes are in line with the PR, see Removal of depreciated tauIDs #37091 (comment)

mariadalfonso · 2022-03-07T22:25:18Z

@mbluj

there is some change in the idAntiEle2018 for the modifier run2_nanoAOD_106Xv2
is this intentional ?

mbluj · 2022-03-08T09:27:04Z

@mbluj

there is some change in the idAntiEle2018 for the modifier run2_nanoAOD_106Xv2 is this intentional ?

In the previous version anti-e tauID was rerunon top of miniAOD, but not used. Instead anti-e from miniAOD was read. I fixed it, i.e. now the rerun ID is used which, I think, is a correct implementation. However, the difference between the two IDs should be small and I have not strong opinion what is better read from mini or rerun.

mariadalfonso · 2022-03-08T09:46:16Z

@mbluj
there is some change in the idAntiEle2018 for the modifier run2_nanoAOD_106Xv2 is this intentional ?

In the previous version anti-e tauID was rerunon top of miniAOD, but not used. Instead anti-e from miniAOD was read. I fixed it, i.e. now the rerun ID is used which, I think, is a correct implementation. However, the difference between the two IDs should be small and I have not strong opinion what is better read from mini or rerun.

ok, then let's consider it as bug fix for the future Run2 nano (that will read the same mini as specified by the run2_nanoAOD_106Xv2 )

mariadalfonso · 2022-03-08T09:49:01Z

+xpog

updates in view of Run3 + fix as in #37091 (comment)

cmsbuild · 2022-03-08T09:49:25Z

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

perrotta · 2022-03-08T12:51:24Z

+1

mbluj added 6 commits February 28, 2022 17:12

remove depreciated BDT-based tauIDs RECO/mini

ce40731

Add BDT-based tauIDs to boosted-taus + cleaning

a83b3b9

Not use removed tauIDs in DQM

a3e1178

Adjust boostedTaus to changes in miniAOD for Run3

1f4ea7e

fixes in tau nanoAOD sequences

f00b216

Not use removed tauIDs in mini-based DQM

89ba780

cmsbuild added this to the CMSSW_12_3_X milestone Feb 28, 2022

cmsbuild added code-checks-pending dqm-pending orp-pending pending-signatures reconstruction-pending tests-pending xpog-pending labels Feb 28, 2022

cmsbuild added code-checks-approved and removed code-checks-pending labels Feb 28, 2022

cmsbuild added tests-started and removed tests-pending labels Mar 1, 2022

cmsbuild added tests-rejected and removed tests-started labels Mar 1, 2022

cmsbuild added dqm-approved and removed dqm-pending labels Mar 1, 2022

cmsbuild added the tests-started label Mar 5, 2022

cmsbuild added tests-approved and removed tests-started labels Mar 5, 2022

cmsbuild added dqm-approved and removed dqm-pending labels Mar 5, 2022

cmsbuild added reconstruction-approved and removed reconstruction-pending labels Mar 7, 2022

mariadalfonso mentioned this pull request Mar 8, 2022

new features for V10 nanoAOD cms-nanoAOD/cmssw#580

Closed

39 tasks

cmsbuild added fully-signed xpog-approved and removed pending-signatures xpog-pending labels Mar 8, 2022

cmsbuild added orp-approved and removed orp-pending labels Mar 8, 2022

cmsbuild merged commit 4ff9e43 into cms-sw:master Mar 8, 2022

cmsbuild mentioned this pull request Mar 9, 2022

[G4 VECGEOM] Geant4 update to 11.0.1 patch cms-sw/cmsdist#7678

Merged

mbluj mentioned this pull request Apr 4, 2022

Fixes to the tauID configuration tool #37456

Merged

mbluj deleted the CMSSW_12_3_X_tau-pog_tauIDcleaning branch October 10, 2023 10:02

mbluj mentioned this pull request Jan 26, 2024

Developers overriding production location for retrieving conditions in configurations #27393

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Removal of depreciated tauIDs #37091

Removal of depreciated tauIDs #37091

mbluj commented Feb 28, 2022

cmsbuild commented Feb 28, 2022

cmsbuild commented Feb 28, 2022

mariadalfonso commented Mar 1, 2022

cmsbuild commented Mar 1, 2022

mbluj commented Mar 1, 2022

jfernan2 commented Mar 1, 2022

mbluj commented Mar 1, 2022

jfernan2 commented Mar 1, 2022

cmsbuild commented Mar 5, 2022

jfernan2 commented Mar 5, 2022

clacaputo commented Mar 7, 2022

clacaputo commented Mar 7, 2022

clacaputo commented Mar 7, 2022

mariadalfonso commented Mar 7, 2022

mbluj commented Mar 8, 2022 •

edited

mariadalfonso commented Mar 8, 2022

mariadalfonso commented Mar 8, 2022

cmsbuild commented Mar 8, 2022

perrotta commented Mar 8, 2022

Removal of depreciated tauIDs #37091

Removal of depreciated tauIDs #37091

Conversation

mbluj commented Feb 28, 2022

PR description:

PR validation:

cmsbuild commented Feb 28, 2022

cmsbuild commented Feb 28, 2022

mariadalfonso commented Mar 1, 2022

cmsbuild commented Mar 1, 2022

Unit Tests

Comparison Summary

mbluj commented Mar 1, 2022

jfernan2 commented Mar 1, 2022

mbluj commented Mar 1, 2022

jfernan2 commented Mar 1, 2022

cmsbuild commented Mar 5, 2022

Comparison Summary

jfernan2 commented Mar 5, 2022

clacaputo commented Mar 7, 2022

clacaputo commented Mar 7, 2022

clacaputo commented Mar 7, 2022

mariadalfonso commented Mar 7, 2022

mbluj commented Mar 8, 2022 • edited

mariadalfonso commented Mar 8, 2022

mariadalfonso commented Mar 8, 2022

cmsbuild commented Mar 8, 2022

perrotta commented Mar 8, 2022

mbluj commented Mar 8, 2022 •

edited