Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DNN-based Tau-Id discrimians (94X) - Squashed version of PR 25385 #25621

Merged
merged 1 commit into from Jan 11, 2019

Conversation

fabiocos
Copy link
Contributor

@fabiocos fabiocos commented Jan 10, 2019

This PR is the squashed version of #25385 , kept to preserve the review history.
The unwanted addition/deletion of very large files is here removed.
A direct comparison with the merge of the original code on top of CMSSW_9_4_X shows no difference.

Building dpf isolation module

Adding in v1

Adding in runTauIDMVA for other users

making things fully reproducible

Reorganisation of configuration files: cff split to cfi and cff

Some code cleaning

adapt to cfi/cff reorganization

Review of DPF and DeepTauId code.

- Defined base class for deep tau discriminators.
- Removed weight files from home cms repository. Now using weights from cms-data.
- Defined WP for both discriminators. Now all discriminators return the corresponding WP results.
- Removed cfi files. Using fillDescriptions instead.
- General code review and cleaning.

Added example of a python configuration file to produce pat::Tau collection with the new Tau-Ids

requested changes on runDeepTauIDsOnMiniAOD.py

Clean runTauIdMVA.py tool and test config to run tauIDs

Made DeepTauId and DPFIsolation thread-safe

Finish implement thread-safe requirements on DPFIsolation

Disable DPFTau_2016_v1 and issue some warnings

- Implemented on runTauIdMVA the option to work with new training files quantized
- Added a new parameter 'version' on runTauIdMVA, used on DPFIsolation
- Changes on DeepTauId to reduce memory consumption

- Implementation of global cache to avoid reloading graph for each thread and reduce the memory consuption
- Creation of class DeepTauCache in DeepTauBase, in which now is created graph and session
- Implementation of two new static methods inside the class DeepTauBase: initializeGlobalCache and globalEndJob. The graph and DeepTauCache object are created now inside initializeGlobalCache

Applied changes on DeepTauBase to allow load new training files using memory mapping

Implemented TauWPThreshold class.

TauWPThreshold class parses WP cut string (or value) provided in the
python configuration. It is needed because the use of the standard
StringObjectFunction class to parse complex expression results in an
extensive memory usage (> 100 MB per expression).

Remove the qm.pb input files and leaving just the quantized and the original files

-Overall, changes to improve memory usage, among these are:
	- Implementation of global cache to avoid reloading graph for each thread
	- Creation of two new static methods inside the class DeepTauBase: initializeGlobalCache and globalEndJob. The graph and DeepTauCache object are created now inside initializeGlobalCache. The memory consumption of initializeGlobalCache for the original, quantized and files that are load using memory mapping method are in the memory_usage.pdf file
	- Implemented configuration to use new training files quantized, and set them as default
	- Implementation of configuration for load files using memory mapping. In our case there wasn't any improvement, respect at the memory consumption of this method, respect the quantized files, so this is not used, but set for future training files
- General code review and cleaning.

Applied style comments

Applied style comments

Applied comments

Change to be by default the original training file for deepTau, instead of the quantized

Applied commets of previus PR

cleaning code

Modification in the config to work with new label in files

Applied comment about the expected format of name of training file

Fix in last commit

Applied last comments

Remove assigning value of variable to itself

Applied @perrotta comments on 104X

Fix error

Applied comments

Applied comments

Fix merge problem

Fix cherry-pick issue

Applied a few commets

Applied more changes

Applied a few small followups

 Fixed error on DPFIsolation

Added clamp function, that only is called it only when the std::clamp is not defined

Update DPFIsolation.cc

- RecoTauTag/RecoTau/test/runDeepTauIDsOnMiniAOD.py: Added changes made in the commit 194a1d5 from the PR cms-sw#25016
- RecoTauTag/RecoTau/plugins/DeepTauId.cc: code cleaning
@cmsbuild
Copy link
Contributor

A new Pull Request was created by @fabiocos (Fabio Cossutti) for CMSSW_9_4_X.

It involves the following packages:

RecoTauTag/RecoTau

@perrotta, @cmsbuild, @slava77 can you please review it and eventually sign? Thanks.
@davidlange6, @slava77, @fabiocos you are the release manager for this.

cms-bot commands are listed here

@fabiocos
Copy link
Contributor Author

code-checks

@cmsbuild
Copy link
Contributor

The code-checks are being triggered in jenkins.

@fabiocos
Copy link
Contributor Author

@mbluj FYI

@cmsbuild
Copy link
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-25621/7926

  • This PR adds an extra 24KB to repository

Code check has found code style and quality issues which could be resolved by applying a patch in https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-25621/7926/git-diff.patch
e.g. curl https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-25621/7926/git-diff.patch | patch -p1

You can run scram build code-checks to apply code checks directly

@perrotta
Copy link
Contributor

please test
(external cms-sw/cmsdist#4580 is already merged now)

@cmsbuild
Copy link
Contributor

cmsbuild commented Jan 11, 2019

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/32526/console Started: 2019/01/11 11:16

@perrotta
Copy link
Contributor

@fabiocos : if there are no other counter-indications, could you please merge this PR in 94 at your earliest convenience, so that @mbluj can go on with the update of #25488? Thx

@fabiocos
Copy link
Contributor Author

see my comment #25385 (comment) the problem does not appear in the code-checks of #25385 even if the code is identical (tested comparing the produced patches, and comparing branches after merging separately the PRs).

@cmsbuild
Copy link
Contributor

@cmsbuild
Copy link
Contributor

Comparison job queued.

@cmsbuild
Copy link
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-25621/32526/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 27
  • DQMHistoTests: Total histograms compared: 2721493
  • DQMHistoTests: Total failures: 104
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2721227
  • DQMHistoTests: Total skipped: 162
  • DQMHistoTests: Total Missing objects: 0

@perrotta
Copy link
Contributor

+1

  • This is identical to DNN-based Tau-Id discrimians (94X) #25385, which was already signed by reco in DNN-based Tau-Id discrimians (94X) #25385 (comment):
    • New DNN-based Tau-Id discriminants are made available for analyses: they don't run in official central workflows
    • (Same comment as in DNN-based Tau-Id discrimians #25016) Memory footprint is still rather large (O(100MB/evt)) , but nonetheless deemed acceptable for the usage this code is intended to: if and when it will be requested to be included in some central sequence meant to production some further assessment and study should be performed, as well as some deeper investigation of the timing performance (which from the first results does not seem to be a concern, anyhow)
    • Since central sequences used in production are not touched by this PR, its possible backport has no counterindications
    • Jenkins tests pass and show no differences
  • Signing this one just in case we want to follow the suggestion of @fabiocos in DNN-based Tau-Id discrimians (94X) #25385 (comment) (which I endorse)

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next CMSSW_9_4_X IBs (tests are also fine) and once validation in the development release cycle CMSSW_10_5_X is complete. This pull request will now be reviewed by the release team before it's merged. @davidlange6, @slava77, @smuzaffar, @fabiocos (and backports should be raised in the release meeting by the corresponding L2)

@fabiocos
Copy link
Contributor Author

+1

@cmsbuild cmsbuild merged commit 71c018f into cms-sw:CMSSW_9_4_X Jan 11, 2019
mbluj added a commit to mbluj/cmssw that referenced this pull request Jan 14, 2019
mbluj added a commit to mbluj/cmssw that referenced this pull request Jan 14, 2019
mbluj added a commit to mbluj/cmssw that referenced this pull request Jan 14, 2019
@fabiocos fabiocos deleted the fc-squash25385 branch January 24, 2019 15:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants