Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

elide newCombinedSeeds in electron seeding #25199

Conversation

slava77
Copy link
Contributor

@slava77 slava77 commented Nov 11, 2018

instead of making a trivial merge-by-copy done in newCombinedSeeds, pass the inputs directly to the ElectronSeedProducer.
This PR avoids extra work spent in copying and reduces the memory utilization.

  • ElectronSeedProducer and its utility methods are updated to be able to take a vector of seed collections as input
  • newCombinedSeeds remains in the sequence definitions as a placeholder (it is not running in unscheduled mode because it is not consumed). I did not completely remove newCombinedSeeds yet pending some better understanding if the concept of combining seeds is useful in general (perhaps it is): it will be very easy to introduce a non-trivial seed combiner (e.g. with hit filter or other manipulation).

Inspired by observations of large memory use in HIHardProbes processing
https://slava77sk.web.cern.ch/slava77sk/reco/cgi-bin/igprof-navigator/CMSSW_10_3_1_patch1-orig.cmsRunGlibC.PromptReco_Run326383_HIHardProbes.IgProf.1.MEM_LIVE/20
This is on event 326383:1:51371 (at the end of the first event in the job).

Tested in CMSSW_10_3_1_patch1

  • Using the profile for one event as a reference of a bad case of memory use, as much as 5GB reduction in the peak can be expected. Given stochastic nature of getting a peak in multi-threaded execution, the reduction is probably smaller.
  • comparison of peaks in 8-thread execution shows 2.1GB reduction in the peak after 2h of running and 3.0 GiB after 21h (the job is still running as of submission time [numbers to be updated]).
  • short matrix tests pass and comparisons show no differences

@cmsbuild
Copy link
Contributor

The code-checks are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @slava77 (Slava Krutelyov) for master.

It involves the following packages:

RecoEgamma/EgammaElectronAlgos
RecoEgamma/EgammaElectronProducers

@perrotta, @cmsbuild, @slava77 can you please review it and eventually sign? Thanks.
@jainshilpi, @Sam-Harper, @varuns23, @rovere, @lgray this is something you requested to watch as well.
@davidlange6, @slava77, @fabiocos you are the release manager for this.

cms-bot commands are listed here

@slava77
Copy link
Contributor Author

slava77 commented Nov 11, 2018

@cmsbuild please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Nov 11, 2018

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/31592/console Started: 2018/11/11 22:57

@cmsbuild
Copy link
Contributor

-1

Tested at: 2e345f4

You can see the results of the tests here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-25199/31592/summary.html

I found follow errors while testing this PR

Failed tests: RelVals

  • RelVals:

The relvals timed out after 2 hours.
When I ran the RelVals I found an error in the following worklfows:
140.53 step2

runTheMatrix-results/140.53_RunHI2011+RunHI2011+RECOHID11+HARVESTDHI/step2_RunHI2011+RunHI2011+RECOHID11+HARVESTDHI.log

150.0 step1
runTheMatrix-results/150.0_HydjetQ_B12_5020GeV_2018+HydjetQ_B12_5020GeV_2018+DIGIHI2018+RECOHI2018+HARVESTHI2018/step1_HydjetQ_B12_5020GeV_2018+HydjetQ_B12_5020GeV_2018+DIGIHI2018+RECOHI2018+HARVESTHI2018.log

136.8311 step2
runTheMatrix-results/136.8311_RunJetHT2017F_reminiaod+RunJetHT2017F_reminiaod+REMINIAOD_data2017+HARVEST2017_REMINIAOD_data2017/step2_RunJetHT2017F_reminiaod+RunJetHT2017F_reminiaod+REMINIAOD_data2017+HARVEST2017_REMINIAOD_data2017.log

136.731 step4
runTheMatrix-results/136.731_RunSinglePh2016B+RunSinglePh2016B+HLTDR2_2016+RECODR2_2016reHLT_skimSinglePh_HIPM+HARVESTDR2/step4_RunSinglePh2016B+RunSinglePh2016B+HLTDR2_2016+RECODR2_2016reHLT_skimSinglePh_HIPM+HARVESTDR2.log

@cmsbuild
Copy link
Contributor

Comparison not run due to runTheMatrix errors (RelVals and Igprof tests were also skipped)

@slava77
Copy link
Contributor Author

slava77 commented Nov 12, 2018

@cmsbuild please test

try again

A std::exception was thrown.
Can not get data (Additional Information: [frontier.c:1124]: No more proxies.

looks transient

@cmsbuild
Copy link
Contributor

cmsbuild commented Nov 12, 2018

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/31595/console Started: 2018/11/12 07:09

@cmsbuild
Copy link
Contributor

@cmsbuild
Copy link
Contributor

Comparison job queued.

@cmsbuild
Copy link
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-25199/31595/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 32
  • DQMHistoTests: Total histograms compared: 3013827
  • DQMHistoTests: Total failures: 1
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3013629
  • DQMHistoTests: Total skipped: 197
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 31 files compared)
  • Checked 134 log files, 14 edm output root files, 32 DQM output files

@slava77
Copy link
Contributor Author

slava77 commented Nov 12, 2018

+1

for #25199 2e345f4

  • jenkins tests pass and comparisons show no differences as expected

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @davidlange6, @slava77, @smuzaffar, @fabiocos (and backports should be raised in the release meeting by the corresponding L2)

@fabiocos
Copy link
Contributor

+1

@cmsbuild cmsbuild merged commit 524e287 into cms-sw:master Nov 12, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants