
DNN-based Tau-Id discriminants (102X) #25386

Merged

Conversation

@mbluj (Contributor) commented Nov 30, 2018

This pull request provides two new DNN-based Tau-Ids, DeepTau and DPFTau, to be produced for pat::Taus with MiniAOD.
It is a backport of #25016 to 102X for analyses based on 2018 data; a detailed description can be found therein.

kandrosov and others added 30 commits October 27, 2018 21:23
- Defined base class for deep tau discriminators.
- Removed weight files from home cms repository. Now using weights from cms-data.
- Defined WP for both discriminators. Now all discriminators return the corresponding WP results.
- Removed cfi files. Using fillDescriptions instead.
- General code review and cleaning.
…es quantized

- Added a new parameter 'version' in runTauIdMVA, used in DPFIsolation
- Changes in DeepTauId to reduce memory consumption
…read and reduce the memory consumption

- Creation of the class DeepTauCache in DeepTauBase, in which the graph and session are now created
- Implementation of two new static methods inside the class DeepTauBase: initializeGlobalCache and globalEndJob. The graph and the DeepTauCache object are now created inside initializeGlobalCache
The TauWPThreshold class parses the WP cut string (or value) provided in the
Python configuration. It is needed because using the standard
StringObjectFunction class to parse a complex expression results in
extensive memory usage (> 100 MB per expression).
	- Implementation of a global cache to avoid reloading the graph for each thread
	- Creation of two new static methods inside the class DeepTauBase: initializeGlobalCache and globalEndJob. The graph and the DeepTauCache object are now created inside initializeGlobalCache. The memory consumption of initializeGlobalCache for the original files, the quantized files, and the files loaded with memory mapping is reported in the memory_usage.pdf file
	- Implemented configuration to use the new quantized training files, and set them as default
	- Implementation of configuration to load files using memory mapping. In our case this method brought no improvement in memory consumption with respect to the quantized files, so it is not used, but it is kept available for future training files
- General code review and cleaning.
@smuzaffar (Contributor)

hold
This PR's history contains large data files.

@cmsbuild (Contributor) commented Dec 9, 2018

Pull request has been put on hold by @smuzaffar
They need to issue an unhold command to remove the hold state or L1 can unhold it for all

@perrotta (Contributor)

@smuzaffar, if I understand it correctly, your request in practice is to rebase these pull requests, correct?

@kandrosov (Contributor)

@smuzaffar The PR #25016 has already been merged. It should include the same big files in its history as this PR; therefore, merging this PR should not consume more space, since the big files are already part of the history in the central repository.
To solve the problem, one can merge this PR (and the similar backport PR in 94X) and then manually remove the big files from the history in the central repository.

@smuzaffar (Contributor)

@kandrosov, it was a mistake on our part that #25016 was merged with those big files in the history, and now we get warnings like these [a]. We are going to rebase the master branch to clean the history and remove the references to these big files.

For these backport PRs I would suggest rebasing (and rewriting the history; see the details at https://git-scm.com/book/en/v2/Git-Tools-Rewriting-History) and editing the commits which add these files.

[a]

remote: warning: GH001: Large files detected. You may want to try Git Large File Storage - https://git-lfs.github.com.
remote: warning: See http://git.io/iEPt8g for more information.
remote: warning: File RecoTauTag/RecoTau/data/DPFIsolation_2017v0.pb is 50.10 MB; this is larger than GitHub's recommended maximum file size of 50.00 MB
remote: warning: File RecoTauTag/RecoTau/data/deepTau_2017v1_20L1024N.pb is 80.98 MB; this is larger than GitHub's recommended maximum file size of 50.00 MB
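
For illustration, a minimal sketch of the edit-a-past-commit recipe from the linked git-scm chapter, applied to one of the flagged files; the parent hash is a placeholder, and which commits need editing depends on the actual branch history:

# Sketch only: rewrite the commit that added a large file, assuming
# <sha-of-parent> is the parent of that commit on the backport branch.
git rebase -i <sha-of-parent>        # mark the offending commit as "edit"
# ...when the rebase stops at the marked commit...
git rm --cached RecoTauTag/RecoTau/data/deepTau_2017v1_20L1024N.pb
git commit --amend --no-edit         # re-record the commit without the file
git rebase --continue                # replay the remaining commits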

@smuzaffar (Contributor)

I would suggest cleaning the history and removing the references to these files [a]; they were added and later deleted in the PR.

ERROR: RecoTauTag/RecoTau/data/DPFIsolation_2017v1.pb ['Added', 'Deleted']
ERROR: RecoTauTag/RecoTau/test/runTauIdMVA.py ['Added', 'Deleted']
ERROR: RecoTauTag/RecoTau/data/deepTau_2017v1_20L1024N.pb ['Added', 'Deleted']
ERROR: RecoTauTag/RecoTau/python/DPFIsolation_cfi.py ['Added', 'Deleted']
ERROR: RecoTauTag/RecoTau/python/DeepTauId_cfi.py ['Added', 'Deleted']
ERROR: RecoTauTag/RecoTau/data/DPFIsolation_2017v0.pb ['Added', 'Deleted']
ERROR: RecoTauTag/RecoTau/python/runTauIdMVA.py ['Added', 'Modified', 'Deleted']
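
As an alternative to editing commits one by one, a hedged sketch of a single filter-branch pass that drops every revision of the flagged data files; branch-name is a placeholder, and --ignore-unmatch keeps the filter from failing on commits that do not contain the files:

# Sketch: purge the flagged files from all commits on the branch.
# --index-filter is faster than --tree-filter because it never
# checks out the working tree.
git filter-branch --index-filter '
  git rm -r --cached --ignore-unmatch \
    RecoTauTag/RecoTau/data/DPFIsolation_2017v0.pb \
    RecoTauTag/RecoTau/data/DPFIsolation_2017v1.pb \
    RecoTauTag/RecoTau/data/deepTau_2017v1_20L1024N.pb
' branch-name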

@mbluj (Contributor, Author) commented Dec 11, 2018

Hello,
as I have never tried to rewrite git history (which sounds dangerous), I would like to ask for confirmation that I understood correctly what should be done. My understanding is the following:

  1. I check out the PR'ed branch from the repository from which I made the PR to official CMSSW, e.g. like this:
git cms-init -X repo-name
git checkout branch-name
  2. Then, I clean the history of the branch as follows:
git filter-branch --tree-filter 'rm -f file1-to-remove file2-to-remove ...' branch-name
  3. Finally, I push the branch with the modified history to the remote (will it be possible? see the sketch below).

Is that what you mean?
This should also be repeated for the 94X backport.
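
On the question in step 3: pushing a branch with rewritten history to the personal fork the PR was opened from is possible, but only as a force push, which also updates the open PR in place. A minimal sketch, assuming the fork remote is called my-cmssw as in the usual git cms-init setup:

# Sketch: the remote name is an assumption based on the standard
# git cms-init configuration, not taken from this thread.
git push my-cmssw branch-name --force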

@fabiocos (Contributor)

@mbluj as this and the 9_4_X PRs are backports with no special review in the middle of the history, unless keeping the long list of commits has value for you, I think it might be simpler to follow the advice in https://cms-sw.github.io/faq.html#how-do-i-collapse-multiple-commits-into-one

@smuzaffar this is what we were discussing as a possibility to trim commit histories, right?
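
A hedged sketch of the collapse-into-one-commit approach, assuming the branch is based on the official CMSSW_10_2_X branch and the remotes follow the usual git cms-init naming (official-cmssw for the central repository, my-cmssw for the fork); this illustrates the general technique rather than quoting the FAQ:

# Sketch: replace the branch history with a single commit on top of the base.
git checkout branch-name
git fetch official-cmssw
git reset --soft official-cmssw/CMSSW_10_2_X  # keep all changes staged, drop the commits
git commit -m 'DNN-based Tau-Id discriminants (102X)'
git push my-cmssw branch-name --force         # update the PR in place

Because the squashed commit contains only the final tree, the large files that were added and later deleted disappear from the pushed history, which is exactly what resolves the warnings above.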

@mbluj (Contributor, Author) commented Dec 12, 2018

@fabiocos @smuzaffar I have no problem with rewriting the full history - it takes quite some time, but I can do it in the background. However, I would like to understand whether the procedure I described is the correct one (so as not to mess things up).

@smuzaffar (Contributor)

unhold
I am happy with the squash.

@cmsbuild (Contributor)

This pull request is fully signed and it will be integrated in one of the next CMSSW_10_2_X IBs (tests are also fine) and once validation in the development release cycle CMSSW_10_4_X is complete. This pull request will now be reviewed by the release team before it's merged. @davidlange6, @slava77, @smuzaffar, @fabiocos (and backports should be raised in the release meeting by the corresponding L2)

@fabiocos (Contributor)

@mbluj I can just squash the history myself at merge time, if that is OK for you

@mbluj (Contributor, Author) commented Dec 12, 2018

> @mbluj I can just squash the history myself at merge time, if that is OK for you

Yes, great! Thank you.

@fabiocos merged commit afd5e8e into cms-sw:CMSSW_10_2_X on Dec 12, 2018
@fabiocos (Contributor)

@smuzaffar if I add "+1", will this confuse the bot or trigger any further action on its side? Just to update the label status for the record.

@smuzaffar (Contributor)

It should be fine to do +1 now. The bot should only update the labels.

@fabiocos (Contributor)

+1

@smuzaffar OK, thanks
