Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add GlobalCache to DeepFlavourJetTagsProducer to improve startup perf… #22886

Merged

Conversation

wddgit
Copy link
Contributor

@wddgit wddgit commented Apr 6, 2018

…ormance

igprof profiles showed this module was taking significant
time at startup, mostly as it reads the configuration file
used to initialize its neural network (lots of memory
churn also, around 150 MB per stream). Without the
GlobalCache, the startup time is proportional to the number
of streams. This change moves the LightweightNeuralNetwork
into the global cache so there is only one and the initialization
time is no longer proportional to the number of streams.
LightweightNeuralNetwork is supposed to be thread safe
to use in this manner.

Some other minor cleanup also.

The intent is that after merging this PR the module will give
identical results as before. I tested simply by putting in print
statements for the output of the neural network and also all
the parameter values I moved around. Then I compared the
output before and after in a runTheMatrix test that runs it (25.0).
All was identical. I also ran under igprof and verified the initialization
time is now constant and not proportional to the number of streams.

…ormance

igprof profiles showed this module was taking significant
time at startup, most as it reads the configuration file
used to initialize its neural network (lots of memory
churn also). Without the GlobalCache the startup time
is proportional to the number of streams. This change
moves the LightweightNeuralNetwork into the global cache
so there is only one and the initialization time is
no longer proportional to the number of streams.
LightweightNeuralNetwork is supposed to be thread safe
to use in this manner.
@cmsbuild
Copy link
Contributor

cmsbuild commented Apr 6, 2018

The code-checks are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

cmsbuild commented Apr 6, 2018

@cmsbuild
Copy link
Contributor

cmsbuild commented Apr 6, 2018

A new Pull Request was created by @wddgit (W. David Dagenhart) for master.

It involves the following packages:

RecoBTag/Combined

@perrotta, @cmsbuild, @slava77 can you please review it and eventually sign? Thanks.
@imarches, @acaudron, @JyothsnaKomaragiri, @mverzett, @ferencek, @pvmulder this is something you requested to watch as well.
@davidlange6, @slava77, @fabiocos you are the release manager for this.

cms-bot commands are listed here

@wddgit
Copy link
Contributor Author

wddgit commented Apr 6, 2018

please test

FYI @Dr15Jones

@cmsbuild
Copy link
Contributor

cmsbuild commented Apr 6, 2018

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/27361/console Started: 2018/04/06 22:29

@cmsbuild
Copy link
Contributor

cmsbuild commented Apr 6, 2018

@cmsbuild
Copy link
Contributor

cmsbuild commented Apr 6, 2018

Comparison job queued.

@cmsbuild
Copy link
Contributor

cmsbuild commented Apr 6, 2018

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-22886/27361/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 29
  • DQMHistoTests: Total histograms compared: 2504254
  • DQMHistoTests: Total failures: 1
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2504077
  • DQMHistoTests: Total skipped: 176
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 1.04000000002 KiB( 22 files compared)
  • Checked 119 log files, 9 edm output root files, 29 DQM output files

@slava77
Copy link
Contributor

slava77 commented Apr 7, 2018

+1

for #22886 292f7ac

  • mostly a technical update. This is nStreams better than what we had, but it is still a factor of 2.5 worse than where it could be with the NN provided via EventSetup.
    • there are 5 instances running in full reco+miniAOD jobs (4 of these have the same NN now)
  • jenkins tests pass and comparisons with the baseline show no differences

Based on somewhat old memory performance tests, I confirm the memory churn size and note that the total RSS savings are not very large, expected to go down from 1.9 MB *nStreams to just 1.9 MB.
https://slava77sk.web.cern.ch/slava77sk/reco/cgi-bin/igprof-navigator/CMSSW_10_0_X_2017-12-17-2300-orig.step3.136.831.100.IgProf.1.MEM_LIVE/2572
https://slava77sk.web.cern.ch/slava77sk/reco/cgi-bin/igprof-navigator/CMSSW_10_0_X_2017-12-17-2300-orig.step3.136.831.100.IgProf.99.MEM_TOTAL/1528

@cmsbuild
Copy link
Contributor

cmsbuild commented Apr 7, 2018

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @davidlange6, @slava77, @smuzaffar, @fabiocos (and backports should be raised in the release meeting by the corresponding L2)

@fabiocos
Copy link
Contributor

+1

@cmsbuild cmsbuild merged commit f38bc64 into cms-sw:master Apr 10, 2018
@wddgit wddgit deleted the addGlobalCacheDeepFlavorJetTagsProducer branch July 31, 2018 20:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants