Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Egamma PFID based on Tensorflow DNNs #35403

Merged
merged 24 commits into from Oct 18, 2021

Conversation

valsdav
Copy link
Contributor

@valsdav valsdav commented Sep 24, 2021

PR description:

This PR introduces the DNN based Egamma particle flow ID evaluation based on the Tensorflow internal support.
Two helper classes, ElectronDNNEstimator and PhotonDNNEstimator, are included to handle the loading and evaluation of the DNN models for GsfElectron and GEDPhotons candidates.

Multiple models are evaluated for electrons and photons in different detector and energy regions. Batch evaluation is implemented to call each Tensorflow session only 1 time for event. The evaluation of the DNN models is configurable from python config and txt files: the list of variables to be used, their scaling parameters, the model files can be changed without changes to the code.

The DNNs are evaluated before the electron and photons candidates pass through the PFEgammaFilters, where the DNN values are available in the candidates to be used for the selection. Thresholds configuration for the PFid have been added to the PFEGammaFilter producer.

In the PR, the DNN based PFid has been activated by default and default models files have been included for the testing. Final model file will be provided in a separate PR after the finalization on the CMSSW_12_0 samples.

Presentations and documentation:

PR validation:

Standard PF validation: https://akanugan.web.cern.ch/akanugan/PF_Validation_PFEGammaID_PR35403/
List of MINIAOD files and a few summary slides looking at the PF electrons & gedGsfElectrons in test vs reference are at:
https://cernbox.cern.ch/index.php/s/ki533KAwuWW1k3T

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-35403/25508

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @valsdav (Davide Valsecchi) for master.

It involves the following packages:

  • DataFormats/EgammaCandidates (reconstruction)
  • DataFormats/ParticleFlowCandidate (reconstruction)
  • DataFormats/PatCandidates (reconstruction)
  • RecoEgamma/EgammaElectronAlgos (reconstruction)
  • RecoEgamma/EgammaElectronProducers (reconstruction)
  • RecoEgamma/EgammaIsolationAlgos (reconstruction)
  • RecoEgamma/EgammaPhotonProducers (reconstruction)
  • RecoEgamma/ElectronIdentification (reconstruction)
  • RecoEgamma/PhotonIdentification (reconstruction)
  • RecoParticleFlow/PFProducer (reconstruction)

@jpata, @cmsbuild, @slava77 can you please review it and eventually sign? Thanks.
@mmarionncern, @Sam-Harper, @sobhatta, @afiqaize, @jainshilpi, @cbernet, @rovere, @lgray, @gouskos, @lecriste, @hatakeyamak, @gpetruc, @wrtabb, @varuns23, @seemasharmafnal, @ram1123 this is something you requested to watch as well.
@perrotta, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@slava77
Copy link
Contributor

slava77 commented Sep 24, 2021

it would be nice to squash a few minor commits in the history.
42 is a bit large to start.

@slava77
Copy link
Contributor

slava77 commented Sep 24, 2021

Files in RecoEgamma/ElectronIdentification/data/Ele_PFID_dnn/ should be removed from the commit history and instead proposed as a PR to https://github.com/cms-data/RecoEgamma-ElectronIdentification

I notice that the model files are as small as 15 kB. Is the model really so small/simple or are these "dummy" files?

@slava77
Copy link
Contributor

slava77 commented Sep 24, 2021

@cmsbuild please test

@slava77
Copy link
Contributor

slava77 commented Sep 24, 2021

abort test

@slava77
Copy link
Contributor

slava77 commented Sep 24, 2021

enable profiling

@slava77
Copy link
Contributor

slava77 commented Sep 24, 2021

@cmsbuild please test

@valsdav
Copy link
Contributor Author

valsdav commented Sep 24, 2021

HI @slava77

it would be nice to squash a few minor commits in the history.
42 is a bit large to start.

Of course, this can be done. Should I wait the end of the tests?

Files in RecoEgamma/ElectronIdentification/data/Ele_PFID_dnn/ should be removed from the commit history and instead proposed as a PR to https://github.com/cms-data/RecoEgamma-ElectronIdentification

I see, we can move them there.

I notice that the model files are as small as 15 kB. Is the model really so small/simple or are these "dummy" files?

They are so small because the models are quite simple (DNN with 128-64-64-32-16-1 nodes) and they have been saved in tensorflow graph format with the optimization performed by this nice tool from the CMS ML group: https://cms-ml.github.io/documentation/inference/tensorflow2.html#saving-your-model

@slava77
Copy link
Contributor

slava77 commented Sep 24, 2021

it would be nice to squash a few minor commits in the history.
42 is a bit large to start.

Of course, this can be done. Should I wait the end of the tests?

the build was already done, so the commits can be updated any time now.

@cmsbuild
Copy link
Contributor

-1

Failed Tests: UnitTests RelVals AddOn
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-a0d08a/18898/summary.html
COMMIT: 035207b
CMSSW: CMSSW_12_1_X_2021-09-24-1100/slc7_amd64_gcc900
Additional Tests: PROFILING
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/35403/18898/install.sh to create a dev area with all the needed externals and cmssw changes.

Unit Tests

I found errors in the following unit tests:

---> test testRecoMETMETProducers had ERRORS
---> test testTauEmbeddingProducers had ERRORS

RelVals

----- Begin Fatal Exception 24-Sep-2021 17:43:19 CEST-----------------------
An exception of category 'InvalidGraphDef' occurred while
   [0] Constructing the EventProcessor
   [1] Constructing module: class=GsfElectronProducer label='gedGsfElectronsTmp'
Exception Message:
error while loading graphDef from 'RecoEgamma/ElectronIdentification/data/Ele_PFID_dnn/lowpT/lowpT_modelDNN.pb': Not found: RecoEgamma/ElectronIdentification/data/Ele_PFID_dnn/lowpT/lowpT_modelDNN.pb; No such file or directory
----- End Fatal Exception -------------------------------------------------
----- Begin Fatal Exception 24-Sep-2021 17:43:19 CEST-----------------------
An exception of category 'InvalidGraphDef' occurred while
   [0] Constructing the EventProcessor
   [1] Constructing module: class=GsfElectronProducer label='gedGsfElectronsTmp'
Exception Message:
error while loading graphDef from 'RecoEgamma/ElectronIdentification/data/Ele_PFID_dnn/lowpT/lowpT_modelDNN.pb': Not found: RecoEgamma/ElectronIdentification/data/Ele_PFID_dnn/lowpT/lowpT_modelDNN.pb; No such file or directory
----- End Fatal Exception -------------------------------------------------
----- Begin Fatal Exception 24-Sep-2021 17:43:19 CEST-----------------------
An exception of category 'InvalidGraphDef' occurred while
   [0] Constructing the EventProcessor
   [1] Constructing module: class=GsfElectronProducer label='gedGsfElectronsTmp'
Exception Message:
error while loading graphDef from 'RecoEgamma/ElectronIdentification/data/Ele_PFID_dnn/lowpT/lowpT_modelDNN.pb': Not found: RecoEgamma/ElectronIdentification/data/Ele_PFID_dnn/lowpT/lowpT_modelDNN.pb; No such file or directory
----- End Fatal Exception -------------------------------------------------
Expand to see more relval errors ...

AddOn Tests

----- Begin Fatal Exception 24-Sep-2021 17:43:14 CEST-----------------------
An exception of category 'InvalidGraphDef' occurred while
   [0] Constructing the EventProcessor
   [1] Constructing module: class=GsfElectronProducer label='gedGsfElectronsTmp'
Exception Message:
error while loading graphDef from 'RecoEgamma/ElectronIdentification/data/Ele_PFID_dnn/lowpT/lowpT_modelDNN.pb': Not found: RecoEgamma/ElectronIdentification/data/Ele_PFID_dnn/lowpT/lowpT_modelDNN.pb; No such file or directory
----- End Fatal Exception -------------------------------------------------
----- Begin Fatal Exception 24-Sep-2021 17:43:14 CEST-----------------------
An exception of category 'InvalidGraphDef' occurred while
   [0] Constructing the EventProcessor
   [1] Constructing module: class=GsfElectronProducer label='gedGsfElectronsTmp'
Exception Message:
error while loading graphDef from 'RecoEgamma/ElectronIdentification/data/Ele_PFID_dnn/lowpT/lowpT_modelDNN.pb': Not found: RecoEgamma/ElectronIdentification/data/Ele_PFID_dnn/lowpT/lowpT_modelDNN.pb; No such file or directory
----- End Fatal Exception -------------------------------------------------
----- Begin Fatal Exception 24-Sep-2021 17:43:14 CEST-----------------------
An exception of category 'InvalidGraphDef' occurred while
   [0] Constructing the EventProcessor
   [1] Constructing module: class=GsfElectronProducer label='gedGsfElectronsTmp'
Exception Message:
error while loading graphDef from 'RecoEgamma/ElectronIdentification/data/Ele_PFID_dnn/lowpT/lowpT_modelDNN.pb': Not found: RecoEgamma/ElectronIdentification/data/Ele_PFID_dnn/lowpT/lowpT_modelDNN.pb; No such file or directory
----- End Fatal Exception -------------------------------------------------
Expand to see more addon errors ...

tensorflow::setLogging(cfg_.log_level);
// load the graph definition
LogDebug("EleDNNPFid") << "Loading " << nModels_ << " graphs";
for (const auto& model_file : cfg_.models_files) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that these need to be FileInPath instead of strings; apparently the tests are not running from /src directory.

@hatakeyamak
Copy link
Contributor

Just tagging @akanugan here who is running some check of this PR in addition to jenkins.
BTW, he is seeing the same error as reported above from jenkins..

@cmsbuild
Copy link
Contributor

Pull request #35403 was updated. @jpata, @cmsbuild, @slava77 can you please check and sign again.

@jpata
Copy link
Contributor

jpata commented Oct 15, 2021

@cmsbuild please test

@jpata
Copy link
Contributor

jpata commented Oct 15, 2021

@valsdav please update the PR description to make sure that all the slides (latest PF validations etc) are linked there

@cmsbuild
Copy link
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-a0d08a/19675/summary.html
COMMIT: f45b48a
CMSSW: CMSSW_12_1_X_2021-10-15-2300/slc7_amd64_gcc900
Additional Tests: PROFILING
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/35403/19675/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 5285 differences found in the comparisons
  • DQMHistoTests: Total files compared: 40
  • DQMHistoTests: Total histograms compared: 2751113
  • DQMHistoTests: Total failures: 3647
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2747444
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 39 files compared)
  • Checked 170 log files, 37 edm output root files, 40 DQM output files
  • TriggerResults: no differences found

@jpata
Copy link
Contributor

jpata commented Oct 18, 2021

+reconstruction

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

@perrotta
Copy link
Contributor

+1

  • Needed externals for electrons and photons already merged

checkHcalStatus = cms.bool(True)
checkHcalStatus = cms.bool(True),
PhotonDNNPFid = cms.PSet(
enabled = cms.bool(False),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was this meant to be enabled ? ref #42949

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its enabled via era, as it is meant to work only for run 3.

# Activate the Egamma PFID dnn only for Run3
from Configuration.Eras.Modifier_run3_common_cff import run3_common
run3_common.toModify(gedPhotonsTmp.PhotonDNNPFid,
enabled = True
)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet