Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removal of depreciated tauIDs #37091

Merged

Conversation

mbluj
Copy link
Contributor

@mbluj mbluj commented Feb 28, 2022

PR description:

This PR is to remove depreciated BDT-based tauIDs superseded by deepTauIDs. The removal is preformed at AOD and miniAOD steps, nanoAOD is adjusted. Subset of old IDs is, however, kept for boosted taus as there is not viable replacement for them, yet.
Expected changes:

  • Removal of "byIsolationMVArun2" and "againstElectronMVA6" tau discriminators from AOD and RECO event content, and from slimmedTaus in miniAOD.

PR validation:

Matrix tests (`runTheMatrix.py -l limited -i all --ibeos) successful

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-37091/28584

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @mbluj for master.

It involves the following packages:

  • PhysicsTools/NanoAOD (xpog)
  • PhysicsTools/PatAlgos (reconstruction)
  • RecoTauTag/Configuration (reconstruction)
  • Validation/RecoTau (dqm)

@gouskos, @emanueleusai, @ahmad3213, @cmsbuild, @jfernan2, @fgolf, @clacaputo, @slava77, @jpata, @mariadalfonso, @pmandrik, @pbo0, @rvenditti can you please review it and eventually sign? Thanks.
@AlexDeMoor, @rappoccio, @gouskos, @jdolen, @swertz, @JyothsnaKomaragiri, @ahinzmann, @schoef, @emilbols, @jdamgov, @mbluj, @nhanvtran, @gkasieczka, @hatakeyamak, @gpetruc, @azotz, @mariadalfonso, @demuller, @andrzejnovak, @seemasharmafnal, @mmarionncern this is something you requested to watch as well.
@perrotta, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@mariadalfonso
Copy link
Contributor

please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Mar 1, 2022

-1

Failed Tests: UnitTests
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-50d770/22732/summary.html
COMMIT: 89ba780
CMSSW: CMSSW_12_3_X_2022-02-28-2300/slc7_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/37091/22732/install.sh to create a dev area with all the needed externals and cmssw changes.

Unit Tests

I found errors in the following unit tests:

---> test runtestPhysicsToolsNanoAOD had ERRORS

Comparison Summary

Summary:

  • No significant changes to the logs found
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 3534 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3944604
  • DQMHistoTests: Total failures: 299
  • DQMHistoTests: Total nulls: 263
  • DQMHistoTests: Total successes: 3944020
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: -20312.91 KiB( 48 files compared)
  • DQMHistoSizes: changed ( 10024.0,... ): -16.692 KiB RecoTauV/hpsPFTauProducerQCD_hpsPFTauDiscriminationByVVTightIsolationMVArun2v1DBnewDMwLT
  • DQMHistoSizes: changed ( 10024.0,... ): -16.692 KiB RecoTauV/hpsPFTauProducerQCD_hpsPFTauDiscriminationByVVTightIsolationMVArun2v1DBoldDMwLT
  • DQMHistoSizes: changed ( 10024.0,... ): -16.692 KiB RecoTauV/hpsPFTauProducerZTT_hpsPFTauDiscriminationByVVTightIsolationMVArun2v1DBnewDMwLT
  • DQMHistoSizes: changed ( 10024.0,... ): -16.692 KiB RecoTauV/hpsPFTauProducerZTT_hpsPFTauDiscriminationByVVTightIsolationMVArun2v1DBoldDMwLT
  • DQMHistoSizes: changed ( 10024.0,... ): -16.612 KiB RecoTauV/hpsPFTauProducerQCD_hpsPFTauDiscriminationByMediumIsolationMVArun2v1DBnewDMwLT
  • DQMHistoSizes: changed ( 10024.0,... ): -16.612 KiB RecoTauV/hpsPFTauProducerQCD_hpsPFTauDiscriminationByMediumIsolationMVArun2v1DBoldDMwLT
  • DQMHistoSizes: changed ( 10024.0,... ): -16.612 KiB RecoTauV/hpsPFTauProducerQCD_hpsPFTauDiscriminationByVLooseIsolationMVArun2v1DBnewDMwLT
  • DQMHistoSizes: changed ( 10024.0,... ): -16.612 KiB RecoTauV/hpsPFTauProducerQCD_hpsPFTauDiscriminationByVLooseIsolationMVArun2v1DBoldDMwLT
  • DQMHistoSizes: changed ( 10024.0,... ): -16.612 KiB RecoTauV/hpsPFTauProducerQCD_hpsPFTauDiscriminationByVTightIsolationMVArun2v1DBnewDMwLT
  • DQMHistoSizes: changed ( 10024.0,... ): -16.612 KiB RecoTauV/hpsPFTauProducerQCD_hpsPFTauDiscriminationByVTightIsolationMVArun2v1DBoldDMwLT
  • DQMHistoSizes: changed ( 10024.0 ): ...
  • Checked 204 log files, 45 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

@mbluj
Copy link
Contributor Author

mbluj commented Mar 1, 2022

The unit test fails due to problem with accessing input file - I checked and indeed the file is available only at tape. I can fix it, but need guidance which replacement should be used (preferably some file stored "permanently" for matrix tests or so).
What concerns differences in DQM they refer to removed tauIDs.

@jfernan2
Copy link
Contributor

jfernan2 commented Mar 1, 2022

Thanks @mbluj
The input file looks a bit old (94X) but I let @cms-sw/xpog-l2 to comment
About the DQM changes, with the removal of TauIDs, there are now (summary) plots which now get null entries, like:
https://tinyurl.com/ybjwzurd
and others where the number of labels in x-axis are reduced (since those IDs are not monitorized now):
https://tinyurl.com/ybtk3zx5

I wonder if this could be easily improved to avoid misunderstandings in the validation.
Thanks

@mbluj
Copy link
Contributor Author

mbluj commented Mar 1, 2022

Thanks @mbluj The input file looks a bit old (94X) but I let @cms-sw/xpog-l2 to comment About the DQM changes, with the removal of TauIDs, there are now (summary) plots which now get null entries, like: https://tinyurl.com/ybjwzurd and others where the number of labels in x-axis are reduced (since those IDs are not monitorized now): https://tinyurl.com/ybtk3zx5

I wonder if this could be easily improved to avoid misunderstandings in the validation. Thanks

Fixes to DQM other than purely technical, i.e. preventing it from failing, are beyond scope of this PR. The idea is to provide functional fixes, i.e. use of deepTauIDs in case of standard taus and MVAIso new/old DMs in case of boosted one, in other PR. This DQM PR is expected soon.

@jfernan2
Copy link
Contributor

jfernan2 commented Mar 1, 2022

+1

@cmsbuild
Copy link
Contributor

cmsbuild commented Mar 5, 2022

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-50d770/22864/summary.html
COMMIT: 0e58eec
CMSSW: CMSSW_12_3_X_2022-03-04-2300/slc7_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/37091/22864/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 3541 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3931448
  • DQMHistoTests: Total failures: 300
  • DQMHistoTests: Total nulls: 263
  • DQMHistoTests: Total successes: 3930863
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: -20322.154 KiB( 48 files compared)
  • DQMHistoSizes: changed ( 10024.0,... ): -16.692 KiB RecoTauV/hpsPFTauProducerQCD_hpsPFTauDiscriminationByVVTightIsolationMVArun2v1DBnewDMwLT
  • DQMHistoSizes: changed ( 10024.0,... ): -16.692 KiB RecoTauV/hpsPFTauProducerQCD_hpsPFTauDiscriminationByVVTightIsolationMVArun2v1DBoldDMwLT
  • DQMHistoSizes: changed ( 10024.0,... ): -16.692 KiB RecoTauV/hpsPFTauProducerZTT_hpsPFTauDiscriminationByVVTightIsolationMVArun2v1DBnewDMwLT
  • DQMHistoSizes: changed ( 10024.0,... ): -16.692 KiB RecoTauV/hpsPFTauProducerZTT_hpsPFTauDiscriminationByVVTightIsolationMVArun2v1DBoldDMwLT
  • DQMHistoSizes: changed ( 10024.0,... ): -16.612 KiB RecoTauV/hpsPFTauProducerQCD_hpsPFTauDiscriminationByMediumIsolationMVArun2v1DBnewDMwLT
  • DQMHistoSizes: changed ( 10024.0,... ): -16.612 KiB RecoTauV/hpsPFTauProducerQCD_hpsPFTauDiscriminationByMediumIsolationMVArun2v1DBoldDMwLT
  • DQMHistoSizes: changed ( 10024.0,... ): -16.612 KiB RecoTauV/hpsPFTauProducerQCD_hpsPFTauDiscriminationByVLooseIsolationMVArun2v1DBnewDMwLT
  • DQMHistoSizes: changed ( 10024.0,... ): -16.612 KiB RecoTauV/hpsPFTauProducerQCD_hpsPFTauDiscriminationByVLooseIsolationMVArun2v1DBoldDMwLT
  • DQMHistoSizes: changed ( 10024.0,... ): -16.612 KiB RecoTauV/hpsPFTauProducerQCD_hpsPFTauDiscriminationByVTightIsolationMVArun2v1DBnewDMwLT
  • DQMHistoSizes: changed ( 10024.0,... ): -16.612 KiB RecoTauV/hpsPFTauProducerQCD_hpsPFTauDiscriminationByVTightIsolationMVArun2v1DBoldDMwLT
  • DQMHistoSizes: changed ( 10024.0 ): ...
  • Checked 204 log files, 45 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

@jfernan2
Copy link
Contributor

jfernan2 commented Mar 5, 2022

+1

@clacaputo
Copy link
Contributor

Reco changes in AOD are in line with the PR, while differences in miniAOD can be related to a known issue in the validate script #36179 (comment).
I'm checking them locally.

@clacaputo
Copy link
Contributor

I've checked for wf 10024.0 the differences in miniAOD for tauIDs. Let's take the slimmedTausBoosted__RECO_obj___tauIDs__19 shown in the picture below

all_mini_OldVSNew_TTbar13TeV2017wf10024p0c_patTaus_slimmedTausBoosted__RECO_obj___tauIDs__19__second

Comparing pT, eta, tauIDs[19], tauIDs[3] for PR+CMSSW_12_3_X_2022-03-04-2300 and CMSSW_12_3_X_2022-03-04-2300, we found that tauIDs[19] point to a different ID ( byIsolationMVArun2DBnewDMwLTraw Vs byIsolationMVArun2v1DBdR03oldDMwLTraw, see results below), while pT, eta, tauIDs[3] are the same

PR+CMSSW_12_3_X_2022-03-04-2300

tau pT   tau eta         tauIDs[19]                                                                            tauIDs[3]
77.55    -1.73   { "byIsolationMVArun2DBnewDMwLTraw", -0.990194f }       { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 24.7089f }
38.81    -1.18   { "byIsolationMVArun2DBnewDMwLTraw", -0.947184f }       { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 39.1803f }
29.63    -0.18   { "byIsolationMVArun2DBnewDMwLTraw", -0.884993f }       { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 26.3342f }
19.34    -0.65   { "byIsolationMVArun2DBnewDMwLTraw", -0.980453f }       { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 60.0140f }
36.12    -0.96   { "byIsolationMVArun2DBnewDMwLTraw", -0.991541f }       { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 33.4097f }
36.42    -1.40   { "byIsolationMVArun2DBnewDMwLTraw", -0.568650f }       { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 1.94070f }
105.37   0.57    { "byIsolationMVArun2DBnewDMwLTraw", -0.768153f }       { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 13.4821f }
63.61    -0.18   { "byIsolationMVArun2DBnewDMwLTraw", -0.940415f }       { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 22.8685f }
50.33    1.27    { "byIsolationMVArun2DBnewDMwLTraw", -0.0154546f }      { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 7.33329f }
32.36    1.25    { "byIsolationMVArun2DBnewDMwLTraw", -0.962710f }       { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 10.9408f }
18.27    -1.33   { "byIsolationMVArun2DBnewDMwLTraw", -0.985566f }       { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 46.5323f }
53.44    -0.92   { "byIsolationMVArun2DBnewDMwLTraw", -0.956164f }       { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 27.0370f }
28.85    -0.90   { "byIsolationMVArun2DBnewDMwLTraw", -0.911898f }       { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 22.0325f }

CMSSW_12_3_X_2022-03-04-2300

tau pT   tau eta         tauIDs[19]                                                                              tauIDs[3]
77.55    -1.73   { "byIsolationMVArun2v1DBdR03oldDMwLTraw", -1.00000f }  { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 24.7089f }
38.81    -1.18   { "byIsolationMVArun2v1DBdR03oldDMwLTraw", -1.00000f }  { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 39.1803f }
29.63    -0.18   { "byIsolationMVArun2v1DBdR03oldDMwLTraw", -1.00000f }  { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 26.3342f }
19.34    -0.65   { "byIsolationMVArun2v1DBdR03oldDMwLTraw", -1.00000f }  { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 60.0140f }
36.12    -0.96   { "byIsolationMVArun2v1DBdR03oldDMwLTraw", -1.00000f }  { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 33.4097f }
36.42    -1.40   { "byIsolationMVArun2v1DBdR03oldDMwLTraw", -0.642849f }         { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 1.94070f }
105.37   0.57    { "byIsolationMVArun2v1DBdR03oldDMwLTraw", -0.968865f }         { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 13.4821f }
63.61    -0.18   { "byIsolationMVArun2v1DBdR03oldDMwLTraw", -0.974198f }         { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 22.8685f }
50.33    1.27    { "byIsolationMVArun2v1DBdR03oldDMwLTraw", -0.436733f }         { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 7.33329f }
32.36    1.25    { "byIsolationMVArun2v1DBdR03oldDMwLTraw", -1.00000f }  { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 10.9408f }
18.27    -1.33   { "byIsolationMVArun2v1DBdR03oldDMwLTraw", -0.868038f }         { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 46.5323f }
53.44    -0.92   { "byIsolationMVArun2v1DBdR03oldDMwLTraw", -1.00000f }  { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 27.0370f }
28.85    -0.90   { "byIsolationMVArun2v1DBdR03oldDMwLTraw", -0.956641f }         { "byCombinedIsolationDeltaBetaCorrRaw3Hits", 22.0325f }

Summing up, the differences in miniAOD are in line with the PR

@clacaputo
Copy link
Contributor

+reconstruction

@mariadalfonso
Copy link
Contributor

@mbluj

there is some change in the idAntiEle2018 for the modifier run2_nanoAOD_106Xv2
is this intentional ?

Screen Shot 2022-03-07 at 23 21 57

@mbluj
Copy link
Contributor Author

mbluj commented Mar 8, 2022

@mbluj

there is some change in the idAntiEle2018 for the modifier run2_nanoAOD_106Xv2 is this intentional ?

Screen Shot 2022-03-07 at 23 21 57

In the previous version anti-e tauID was rerunon top of miniAOD, but not used. Instead anti-e from miniAOD was read. I fixed it, i.e. now the rerun ID is used which, I think, is a correct implementation. However, the difference between the two IDs should be small and I have not strong opinion what is better read from mini or rerun.

@mariadalfonso
Copy link
Contributor

@mbluj
there is some change in the idAntiEle2018 for the modifier run2_nanoAOD_106Xv2 is this intentional ?
Screen Shot 2022-03-07 at 23 21 57

In the previous version anti-e tauID was rerunon top of miniAOD, but not used. Instead anti-e from miniAOD was read. I fixed it, i.e. now the rerun ID is used which, I think, is a correct implementation. However, the difference between the two IDs should be small and I have not strong opinion what is better read from mini or rerun.

ok, then let's consider it as bug fix for the future Run2 nano (that will read the same mini as specified by the run2_nanoAOD_106Xv2 )

@mariadalfonso
Copy link
Contributor

+xpog

updates in view of Run3 + fix as in #37091 (comment)

@cmsbuild
Copy link
Contributor

cmsbuild commented Mar 8, 2022

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

@perrotta
Copy link
Contributor

perrotta commented Mar 8, 2022

+1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants