HEEP V7.0 in 90X #16913

Sam-Harper · 2016-12-08T17:24:00Z

Dear All,

It was decided by E/gamma to submit the HEEP V7.0 separately. So here is the pull request.

It adds the needed VID classes to calculate the cut, adds a function to EcalClusterTools too calculate the number of saturated crystals in the 5x5 region of the detector. It also adds a new class to make isolation from pat::PackedCandidates and a producer to make the isolation and nrSat crystal value maps to be picked up by the VID cuts.

This is not yet run in any default sequence (and nor is it intended to be just yet) and the goal of this is for it to get in 80X to ensure that analysers can use it easily. So this is a strict addition of things and does not change anything already running.

It has a weak dependence on #16777 in that it will crash when running on AOD without it. But its not run in standard sequences, only by analysers who will know to ensure that they have that PR. It will work fine for the miniAOD.

Best,
Sam

…to pass

cmsbuild · 2016-12-08T17:24:21Z

A new Pull Request was created by @Sam-Harper for CMSSW_9_0_X.

It involves the following packages:

PhysicsTools/SelectorUtils
RecoEcal/EgammaCoreTools
RecoEgamma/EgammaIsolationAlgos
RecoEgamma/ElectronIdentification

@cmsbuild, @cvuosalo, @slava77, @monttj, @davidlange6 can you please review it and eventually sign? Thanks.
@argiro, @rafaellopesdesa, @lgray this is something you requested to watch as well.
@slava77, @smuzaffar you are the release manager for this.

cms-bot commands are listed here #13028

cvuosalo · 2016-12-08T17:51:37Z

@cmsbuild please test

cmsbuild · 2016-12-08T17:51:54Z

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/16865/console Started: 2016/12/08 18:52

cmsbuild · 2016-12-08T19:22:19Z

-1

Tested at: 933ff5b

You can see the results of the tests here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-16913/16865/summary.html

I found follow errors while testing this PR

Failed tests: UnitTests

Unit Tests:

I found errors in the following unit tests:

---> test runtestRecoEgammaElectronIdentification had ERRORS

cmsbuild · 2016-12-08T19:22:21Z

Comparison job queued.

Sam-Harper · 2016-12-08T19:38:11Z

Ah damn, sorry about this. The issue is that when re-making the tracker isolation for HEEP V70, it needs the IdealMagnetic field to make the PackedCandidates (specifically for the vertex info), ie this sequence needs to be in the config file:
process.load('Configuration.StandardSequences.MagneticField_cff')

Now we can fix this in two ways, if in any CMS workflow process.load('Configuration.StandardSequences.MagneticField_cff') will be already available we can simply fix the unit test by adding this into the unit tests's config.

The other way, we dont actually need this module to run for HEEP V6.0 (the value map producer is keyed off the word "HEEP" in the ID to run) that is currently in the release. Mainly as what is currently in the release is broken and should be ignored. So I could just tell it not to run for HEEPV6.0. I prefer this solution and will implement it as the default option if I dont hear anything.

cmsbuild · 2016-12-08T21:13:32Z

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-16913/16865/summary.html

The workflows 1003.0, 1001.0, 1000.0, 140.53, 136.731, 4.22 have different files in step1_dasquery.log than the ones found in the baseline. You may want to check and retrigger the tests if necessary. You can check it in the "files" directory in the results of the comparisons

cmsbuild · 2016-12-13T03:30:04Z

+1
Tested at: f11ac4a
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-16913/16982/summary.html

cmsbuild · 2016-12-13T03:30:07Z

Comparison job queued.

cmsbuild · 2016-12-13T05:13:00Z

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-16913/16982/summary.html

Dr15Jones · 2016-12-13T14:40:46Z

@Dr15Jones if there is a cyclic product dependency shouldn't the framework now stop the job or not even start running it?
The framework is meant to catch the problem and not run the job. Unfortunately, there is a known weakness in the algorithm I'm using. I have ideas on how to make it complete, but haven't had the time. Is there an example configuration someone can point me to which I could use for future testing?

cvuosalo · 2016-12-13T22:26:24Z

@Dr15Jones: The version of this PR corresponding to the Dec. 8 commit runs indefinitely in the RECO step of a standard workflow like 25202.0 with the change described here: #16913 (comment)

cvuosalo · 2016-12-15T21:02:24Z

+1

For #16913 f11ac4a

Adding capability to run new HEEP V7.0. By default, the old version is run as before. There should be no change in monitored quantities.

The code changes are satisfactory, and Jenkins tests against baseline CMSSW_9_0_X_2016-12-12-1100 show no significant differences, as expected. A test of workflow 25202.0_TTbar_13 for 200 events with the new HEEP V7.0 shows no significant differences and no change in the size of Reco event content. CPU time increases by the slightest amount, and memory usage increases marginally.

  delta/mean delta/orJob     original                   new       module name
  ---------- ------------     --------                  ----       ------------
       added      +0.00%         0.00 ms/ev ->         0.68 ms/ev heepIDVarValueMaps
  ---------- ------------     --------                  ----       ------------

Total time per event with DQM and Validation is about 20 s.

Memory:

Baseline
Max VSIZ 5029.42 on evt 193 ; max RSS 3479.64 on evt 137

PR
Max VSIZ 5090.72 on evt 199 ; max RSS 4003.46 on evt 174

davidlange6 · 2016-12-22T12:04:30Z

Hi @cvuosalo this 500MB increase in RSS is real? and if so, for what application?

slava77 · 2016-12-22T13:03:18Z

uhm, a marginal increase of 500MB.
With a few marginal increases like this our computing costs can double.

cvuosalo · 2016-12-22T15:44:11Z

@davidlange6: This memory measurement was for step 3 (RAW2DIGI_L1Reco_RECO_EI_PAT_VALIDATION_DQM) of workflow 25202.0 with HEEP V7.0 manually enabled. This result was from a single test, so there might be fluctuations with repeated tests.

Please note that this PR does not enable HEEP V7.0 in standard workflows. It only allows analysts to run it separately. The memory usage issues with HEEP V7.0 could be wrung out now before it is allowed in to CMSSW at all, or they could be addressed later at the time HEEP V7.0 would be proposed to be enabled in standard workflows.

Sam-Harper · 2016-12-22T15:49:42Z

HI all. I'm completely baffed by it taking 500mb of extra memory, it just calculates a few floats. How do I reproduce these studies for my own testing.

I had a look and I dont see a memory leak (and I've been running this offline a lot so I would have noticed). I also dont see me copying large collections. The only think I can think of is when the pat::PackedCandidate makes the pseudo track.

Dr15Jones · 2016-12-22T15:51:24Z

@cvuosalo how many threads were you running for the memory test? If it was more than one then that is another source of 'randomness' in the results.

cvuosalo · 2016-12-22T16:52:14Z

@Dr15Jones: I used the default number of threads, which should be one. I ran it in unscheduled mode, which is the default for runTheMatrix.py

cvuosalo · 2016-12-22T16:57:32Z

@Sam-Harper: Instructions for time and memory measurements are here:

https://twiki.cern.ch/twiki/bin/viewauth/CMS/RecoIntegration#pp_MC_Benchmark_setup_for_timing

More instructions are here:
https://twiki.cern.ch/twiki/bin/viewauth/CMS/RecoIntegration#Compare_timing

The memory usage measurement is explained in the "Using TimeEvent output" section.

Dr15Jones · 2016-12-29T17:35:01Z

#17102 now catches the circular data dependency and throws an exception which explains the problem.

Sam-Harper · 2017-01-02T13:44:11Z

Dear All,

I followed all the instructions in https://twiki.cern.ch/twiki/bin/viewauth/CMS/RecoIntegration#pp_MC_Benchmark_setup_for_timing and re-ran it for 200 events on file /store/relval/CMSSW_9_0_0_pre2/RelValTTbar_13/GEN-SIM-DIGI-RAW-HLTDEBUG/PU25ns_90X_mcRun2_asymptotic_v0-v1/10000/043953B3-A7C2-E611-A0DA-0CC47A4D75EC.root of /RelValTTbar_13/CMSSW_9_0_0_pre2-PU25ns_90X_mcRun2_asymptotic_v0-v1/GEN-SIM-DIGI-RAW-HLTDEBUG

I ran it in release CMSSW_9_0_X_2016-12-29-1100. I ran it 3 ways, the base release only, the base release + this PR and the base release + this PR + HEEP V7.0 enabled

I can see no different in the memory used, its within the precision of the method.

base: MAX VSIZE: 5167
base + PR : MAX VSIZE: 5169
base + PR + HEEP V7.0 enabled: MAX VSIZE: 5177

I measured the base release memory 6 times, VSIZE = 5177, 5171,5180,5175,5167,5171 so I consider the HEEP V7.0 marginal and within the bounds of uncertainty.

The log files are here for your own scrutiny:
www.cern.ch/sharper/cms/memtestForHEEPV70/baseRelease.log
www.cern.ch/sharper/cms/memtestForHEEPV70/heepV70PresentButNotEnabled.log
www.cern.ch/sharper/cms/memtestForHEEPV70/heepV70Enabled.log

davidlange6 · 2017-01-05T07:48:55Z

+1

davidlange6 · 2017-01-05T07:49:19Z

thanks @Sam-Harper

Sam-Harper · 2017-01-05T10:07:23Z

and thank you @davidlange6 and everybody else for scrutinising this. The absolute last thing I would have wanted to do is cause problems for the reco.

Now ultimately I would like this in 80X to make things easier for users. I recall the procedure is to wait to get this into a pre release and then back port it to 80X? Thanks.

davidlange6 · 2017-01-06T07:38:38Z

Right - we'll organize an 80x release discussion once people are back...

Sam-Harper added 12 commits November 30, 2016 20:27

removing jetcoreregional tracks from tracker isolation sum

5c3f873

making EleTrk isolation have the option to set which algos to reject

b572914

adding function to calculate number of saturated crystals in 5x5

c858424

add value map producer for HEEP ID

6c63e26

adding in HEEP V6.1 patch

c6bf880

changing track isol config name

644ffbd

adding 1st version of HEEP V6.0 in 80X

4e20840

adding 1st version of HEEP V6.0 in 80X

2cd9f45

adding in parameter set descriptions

333f7c9

fixing bug where it the saturation was causing it to auto fail not au…

3de6992

…to pass

VID Implimentation of HEEP V7.0

e8023cd

resetting the ElectronTkIsolation class to its default 80X state

933ff5b

cmsbuild added this to the Next CMSSW_9_0_X milestone Dec 8, 2016

cmsbuild added analysis-pending comparison-pending orp-pending pending-signatures reconstruction-pending tests-pending labels Dec 8, 2016

cmsbuild added tests-started and removed tests-pending labels Dec 8, 2016

cmsbuild added tests-rejected and removed tests-started labels Dec 8, 2016

cmsbuild added tests-started and removed tests-pending labels Dec 12, 2016

cmsbuild added tests-approved and removed tests-started labels Dec 13, 2016

cmsbuild added comparison-available and removed comparison-pending labels Dec 13, 2016

cmsbuild added reconstruction-approved and removed reconstruction-pending labels Dec 15, 2016

cmsbuild added orp-approved and removed orp-pending labels Jan 5, 2017

davidlange6 merged commit c803208 into cms-sw:CMSSW_9_0_X Jan 6, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

HEEP V7.0 in 90X #16913

HEEP V7.0 in 90X #16913

Sam-Harper commented Dec 8, 2016

cmsbuild commented Dec 8, 2016

cvuosalo commented Dec 8, 2016

cmsbuild commented Dec 8, 2016 •

edited

cmsbuild commented Dec 8, 2016

cmsbuild commented Dec 8, 2016

Sam-Harper commented Dec 8, 2016

cmsbuild commented Dec 8, 2016

cmsbuild commented Dec 13, 2016

cmsbuild commented Dec 13, 2016

cmsbuild commented Dec 13, 2016

Dr15Jones commented Dec 13, 2016

cvuosalo commented Dec 13, 2016

cvuosalo commented Dec 15, 2016

davidlange6 commented Dec 22, 2016

slava77 commented Dec 22, 2016

cvuosalo commented Dec 22, 2016

Sam-Harper commented Dec 22, 2016

Dr15Jones commented Dec 22, 2016

cvuosalo commented Dec 22, 2016

cvuosalo commented Dec 22, 2016

Dr15Jones commented Dec 29, 2016

Sam-Harper commented Jan 2, 2017

davidlange6 commented Jan 5, 2017

davidlange6 commented Jan 5, 2017

Sam-Harper commented Jan 5, 2017

davidlange6 commented Jan 6, 2017

HEEP V7.0 in 90X #16913

HEEP V7.0 in 90X #16913

Conversation

Sam-Harper commented Dec 8, 2016

cmsbuild commented Dec 8, 2016

cvuosalo commented Dec 8, 2016

cmsbuild commented Dec 8, 2016 • edited

cmsbuild commented Dec 8, 2016

cmsbuild commented Dec 8, 2016

Sam-Harper commented Dec 8, 2016

cmsbuild commented Dec 8, 2016

cmsbuild commented Dec 13, 2016

cmsbuild commented Dec 13, 2016

cmsbuild commented Dec 13, 2016

Dr15Jones commented Dec 13, 2016

cvuosalo commented Dec 13, 2016

cvuosalo commented Dec 15, 2016

davidlange6 commented Dec 22, 2016

slava77 commented Dec 22, 2016

cvuosalo commented Dec 22, 2016

Sam-Harper commented Dec 22, 2016

Dr15Jones commented Dec 22, 2016

cvuosalo commented Dec 22, 2016

cvuosalo commented Dec 22, 2016

Dr15Jones commented Dec 29, 2016

Sam-Harper commented Jan 2, 2017

davidlange6 commented Jan 5, 2017

davidlange6 commented Jan 5, 2017

Sam-Harper commented Jan 5, 2017

davidlange6 commented Jan 6, 2017

cmsbuild commented Dec 8, 2016 •

edited