Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make workspaces in HGCDigitizer only depend on signal size #16336

Merged
merged 1 commit into from Oct 27, 2016

Conversation

lgray
Copy link
Contributor

@lgray lgray commented Oct 24, 2016

Change the HGCal digitizers such that only bx-vectors that contain signal are stored.
In the case of an empty detid, a single zeroed out bx-vector is used.

This reduces memory in 0 PU from avg 6.4 GB in 4-thread jobs to 2.5 GB.
This reduces memory in 200PU from peak 9.2GB in 4-thread jobs to 7.2 GB.

@slava77 would you mind running your usual tests on this PR to make sure things are OK, despite it being a SIM-only PR. I've checked to make sure everything looks sane but would like some deeper analysis.

@lgray
Copy link
Contributor Author

lgray commented Oct 24, 2016

@cmsbuild please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Oct 24, 2016

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/15935/console

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @lgray (Lindsey Gray) for CMSSW_8_1_X.

It involves the following packages:

SimCalorimetry/HGCalSimProducers

@cmsbuild, @civanch, @mdhildreth, @davidlange6 can you please review it and eventually sign? Thanks.
@cseez, @vandreev11, @sethzenz, @pfs, @kpedro88 this is something you requested to watch as well.
@slava77, @smuzaffar you are the release manager for this.

cms-bot commands are listed here #13028

@lgray
Copy link
Contributor Author

lgray commented Oct 24, 2016

assign reco

@lgray
Copy link
Contributor Author

lgray commented Oct 24, 2016

oh well, I guess I am not special enough

@slava77
Copy link
Contributor

slava77 commented Oct 24, 2016

assign reconstruction

@cmsbuild
Copy link
Contributor

@cmsbuild
Copy link
Contributor

Comparison job queued.

@cmsbuild
Copy link
Contributor

@civanch
Copy link
Contributor

civanch commented Oct 25, 2016

+1

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next CMSSW_8_1_X IBs (tests are also fine). This pull request requires discussion in the ORP meeting before it's merged. @slava77, @davidlange6, @smuzaffar

@lgray
Copy link
Contributor Author

lgray commented Oct 25, 2016

@davidlange6 please wait for Slava to sign off too. Thanks!

@slava77
Copy link
Contributor

slava77 commented Oct 25, 2016

On 10/25/16 9:33 AM, Lindsey Gray wrote:

@davidlange6 https://github.com/davidlange6 please wait for Slava to
sign off too. Thanks!

I'm traveling today.
So, my check may come only tomorrow.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#16336 (comment), or
mute the thread
https://github.com/notifications/unsubscribe-auth/AEdcbnLQtO6oskeR_9Z5kKl2fbZMfFHHks5q3i9LgaJpZM4KfTmE.

@lgray
Copy link
Contributor Author

lgray commented Oct 25, 2016

This one isn't incredibly urgent, so that is fine.

On Tue, Oct 25, 2016 at 1:47 PM, Slava Krutelyov notifications@github.com
wrote:

On 10/25/16 9:33 AM, Lindsey Gray wrote:

@davidlange6 https://github.com/davidlange6 please wait for Slava to
sign off too. Thanks!

I'm traveling today.
So, my check may come only tomorrow.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#16336 (comment), or
mute the thread
<https://github.com/notifications/unsubscribe-auth/AEdcbnLQtO6oskeR_
9Z5kKl2fbZMfFHHks5q3i9LgaJpZM4KfTmE>.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#16336 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/ABBMOe8iiLvu3i2ZRgWrzUsjHL_214EEks5q3k7GgaJpZM4KfTmE
.

@davidlange6
Copy link
Contributor

assign reconstruction

@cmsbuild
Copy link
Contributor

New categories assigned: reconstruction

@cvuosalo,@slava77 you have been requested to review this Pull request/Issue and eventually sign? Thanks

@slava77
Copy link
Contributor

slava77 commented Oct 26, 2016

This reduces memory in 200PU from peak 9.2GB in 4-thread jobs to 7.2 GB.
my pre15 baseline step3 from 22624 takes about 62 GB.
What did you use to estimate the 200PU mem?

My guess is that the size of the input MinBias matters (my test had 4 input minbias files from relval, with net total of about 30K events).

@lgray
Copy link
Contributor Author

lgray commented Oct 26, 2016

Hi Slava, the quoted decrease is for step2! Where in step3 is using most of the 62GB? How many threads?

I was using the full list of minbias files from pre15 D3Timing.

@slava77
Copy link
Contributor

slava77 commented Oct 26, 2016

On 10/26/16 6:39 AM, Lindsey Gray wrote:

Hi Slava, the quoted decrease is for step2! Where in step3 is using most
of the 64GB? How many threads?

16 threads

step2 was up to 24GB, which is somewhat consistent with your "before"
numbers.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#16336 (comment), or
mute the thread
https://github.com/notifications/unsubscribe-auth/AEdcboiwehELf6X8QuHDa3Yl-dXtyg-Eks5q3zwOgaJpZM4KfTmE.

@slava77
Copy link
Contributor

slava77 commented Oct 26, 2016

step2 PU200 in 22624.0 memory based on MemoryCheck (post-event memory), using 16-thread job (event number reference is weak, given that it's detected by "Begin processing" and not actual event end):

  • baseline 810pre15: Max VSIZ 55141 on evt 85 ; max RSS 23375.5 on evt 58
  • this PR: Max VSIZ 47484.1 on evt 90 ; max RSS 24330.4 on evt 1

So, whatever was done here to minimize memory does not show up in this metric.
I can imagine that if you used per-module check of MemoryCheck (process.SimpleMemoryCheck.oncePerEventMode=False, the default value) maybe at that level the peak is what's reported in the PR description.

@slava77
Copy link
Contributor

slava77 commented Oct 27, 2016

+1

for #16336 8728bd1

  • here to check effects on reco quantities

looking at the same 22624 with PU200 100 events, I do not see any significant changes, beyond random variations.

... there is some decrease in the number of hits in BH, FH and EE, in the range of 1-2%, primarily at low energy, but it also matches generalTracks yield change. This is apparently random. The same signal events were used (from the logs), but minbias mix sequence was apparently different (speculation, I didn't check it, but what else).

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next CMSSW_8_1_X IBs (tests are also fine). This pull request requires discussion in the ORP meeting before it's merged. @slava77, @davidlange6, @smuzaffar

@slava77
Copy link
Contributor

slava77 commented Oct 27, 2016

about step3 memory, looking with single core:

  • mix: starts event with ~3.4GB, finishes processing event with 6.1 GB
  • .. some "minor" increases continue
  • FEVTDEBUGHLToutput: starts at 6.6 GB and ends at 7.3GB

So, mix replay appears to be the most costly. Writing FEVTDEBUGHLT is the next in line. The latter has cost impact both on run time memory and output on disk.

@davidlange6
Copy link
Contributor

+1

@cmsbuild cmsbuild merged commit 5f43103 into cms-sw:CMSSW_8_1_X Oct 27, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants