Reduced radom number calls in the runSimple Method of HGCDigitizerBase for Potential Speedup. #27980

Merged
merged 7 commits into from Sep 24, 2019

Conversation

adas1994
Contributor

@adas1994 adas1994 commented Sep 11, 2019

PR description:

The runSimple member function of HGCDigitizerBase.cc contains SIMD-type code that runs over millions of channels, which slows down the whole digitization step. I identified several conditional branches (if statements) and replaced them with boolean variables to limit the amount of branching. This helps the compiler autovectorize the code for a potential speedup.
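As an illustration of the idea only (not the code in this PR), the sketch below shows a per-channel `if` rewritten as a boolean used as a 0/1 multiplier. The function names, the zero-suppression use case, and the threshold parameter are all hypothetical, and the two forms agree only for finite inputs.

```cpp
#include <cstddef>
#include <vector>

// Branchy form: the data-dependent `if` inside the loop body can keep
// the compiler from autovectorizing the loop.
std::vector<float> suppressBranchy(const std::vector<float>& charge, float threshold) {
  std::vector<float> out(charge.size());
  for (std::size_t i = 0; i < charge.size(); ++i) {
    if (charge[i] > threshold) {
      out[i] = charge[i];
    } else {
      out[i] = 0.f;
    }
  }
  return out;
}

// Branch-free form: the comparison result is stored in a bool and used
// as a 0/1 multiplier, so every iteration executes the same instructions.
std::vector<float> suppressBranchless(const std::vector<float>& charge, float threshold) {
  std::vector<float> out(charge.size());
  for (std::size_t i = 0; i < charge.size(); ++i) {
    const bool pass = charge[i] > threshold;
    out[i] = static_cast<float>(pass) * charge[i];
  }
  return out;
}
```

Whether the second loop is actually vectorized depends on the compiler and its flags, so any speedup would need to be measured.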

PR validation:

I used workflow number 21234.0_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2026D44_GenSimHLBeamSpotFull14+DigiFullTrigger_2026D44+RecoFullGlobal_2026D44+HARVESTFullGlobal_2026D44

if this PR is a backport please specify the original PR:

Before submitting your pull requests, make sure you followed this checklist:

@cmsbuild
Contributor

The code-checks are being triggered in jenkins.

@adas1994 adas1994 changed the title Reduced radomnum calls Reduced radom number calls in the runSimple Method of HGCDigitizerBase for Potential Speedup. Sep 11, 2019
@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-27980/11861

  • This PR adds an extra 20KB to repository

@cmsbuild
Contributor

A new Pull Request was created by @adas1994 for master.

It involves the following packages:

SimCalorimetry/HGCalSimProducers

@cmsbuild, @civanch, @kpedro88, @mdhildreth can you please review it and eventually sign? Thanks.
@vandreev11, @sethzenz, @makortel, @kpedro88, @lgray, @cseez, @apsallid, @pfs, @deguio this is something you requested to watch as well.
@davidlange6, @slava77, @fabiocos you are the release manager for this.

cms-bot commands are listed here

@adas1994
Contributor Author

please test

@kpedro88
Contributor

please test

@cmsbuild
Contributor

cmsbuild commented Sep 11, 2019

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-run-pr-tests/2486/console Started: 2019/09/11 19:58

@cmsbuild
Contributor

Comparison job queued.

@cmsbuild
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-6919f9/2486/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 9762 differences found in the comparisons
  • DQMHistoTests: Total files compared: 34
  • DQMHistoTests: Total histograms compared: 2957224
  • DQMHistoTests: Total failures: 7907
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2948976
  • DQMHistoTests: Total skipped: 341
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 33 files compared)
  • Checked 145 log files, 15 edm output root files, 34 DQM output files

@@ -22,7 +22,7 @@
#include "Geometry/HcalTowerAlgo/interface/HcalGeometry.h"

#include "SimCalorimetry/HGCalSimAlgos/interface/HGCalSiNoiseMap.h"

#include "TRandom.h"
Contributor

@adas1994 why aren't you using the framework random number generator service?

void HGCDigitizerBase<DFr>::GenerateGaussianNoise(const double NoiseMean, const double NoiseStd) {
unsigned int seed = 123456;
seed = seed + SeedOffset_;
TRandom trandom(seed);
Contributor

@adas1994 why not use the framework service with the defined engine, i.e. move CLHEP::RandGaussQ::shoot(engine, 0.0, noiseWidth) here? Furthermore, according to the ROOT documentation, TRandom appears to be the poorest-quality generator in the ROOT family: "The generator provided in TRandom itself is a LCG (Linear Congruential Generator), the BSD rand generator, that it should not be used because its period is only 2**31, i.e. approximately 2 billion events, that can be generated in just few seconds."
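The period argument can be illustrated with a minimal sketch of a BSD-rand-style LCG. The multiplier and increment below are the classic BSD `rand()` constants and are assumed here for illustration; this is not ROOT's source, and `lcgNext` and `kMaxPeriod` are hypothetical names.

```cpp
#include <cstdint>

// One step of a BSD-rand-style linear congruential generator:
//   x_{n+1} = (a * x_n + c) mod 2^31
// The constants are the classic BSD rand() values (an assumption here).
std::uint32_t lcgNext(std::uint32_t state) {
  const std::uint64_t a = 1103515245ULL;
  const std::uint64_t c = 12345ULL;
  // Masking with 0x7fffffff keeps only the low 31 bits (mod 2^31).
  return static_cast<std::uint32_t>((a * state + c) & 0x7fffffffULL);
}

// The state holds only 31 bits, so the sequence must repeat after at
// most 2^31 steps, i.e. roughly 2 billion draws.
constexpr std::uint64_t kMaxPeriod = 1ULL << 31;
```

Because the next value depends only on the current 31-bit state, no choice of seed can extend the sequence beyond `kMaxPeriod` draws.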

Contributor Author

@adas1994 adas1994 Sep 12, 2019

(@fabiocos) We were trying to see whether HGCal can simulate similar noise by drawing from a much smaller pool of Gaussian random numbers, rather than generating a whole new set of them for every channel in every event.

Consider that, at the end of the day, these are all just Normal/Gaussian random numbers, and we are not generating billions of them, so the generator's limited period should not cause a problem.

Contributor

@adas1994 that has nothing to do with which pRNG should be used. Similar comments had been made on previous versions of this PR. Please use the CMSSW random number service exclusively.

Contributor

@kpedro88 @fabiocos The problem is that the noise collection needs to be generated in the constructor, before there are any streams to refer to. Here, the noise itself is generated with a fixed seed, so it is completely reproducible. The random number service is used later to choose noise values from this fixed array, so that step is thread safe.

Contributor

@mdhildreth I imagine that what you need here is to have a predefined noise collection before the first event noise is produced, which does not necessarily mean within the constructor. I understand the rationale behind this strategy, but I also see risks:

  • Is this collection big enough to avoid reusing the same noise too frequently and creating artificial correlations?

  • Although, strictly speaking, your noise is reproducible when reusing the same configuration, the variation of random numbers among production jobs is managed through the framework service. Nobody will manually randomize the starting seed offset in your configuration, so every single job will end up with exactly the same noise, which I doubt is what we want. The configurable offsets will be practically useless, and you could probably structure things so that the noise array is static, since every instance will end up using the same one.

Provided this is really the way you want to go, I can imagine a strategy to initialise the noise array within runSimple at the first function call, done with care in a thread-safe way. That would at least be a clean way to rely on the random number service, although still reusing the same library.
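A first-call initialisation of this kind might look roughly like the sketch below. It uses only the C++ standard library: the `NoiseLibrary` class and its `get` method are hypothetical, and the `Engine` template parameter stands in for whatever engine the framework service would hand to runSimple. It is not CMSSW code.

```cpp
#include <cstddef>
#include <mutex>
#include <random>
#include <vector>

// Hypothetical holder for a shared noise array that is filled lazily.
class NoiseLibrary {
public:
  explicit NoiseLibrary(std::size_t size) : size_(size) {}

  // The first caller fills the array from the engine it passes in;
  // later callers, from any thread, get the already-filled array.
  template <typename Engine>
  const std::vector<float>& get(Engine& engine, float mean, float sigma) {
    std::lock_guard<std::mutex> lock(mutex_);
    if (values_.empty()) {
      std::normal_distribution<float> gauss(mean, sigma);
      values_.resize(size_);
      for (auto& v : values_) {
        v = gauss(engine);
      }
    }
    return values_;
  }

private:
  std::size_t size_;
  std::mutex mutex_;
  std::vector<float> values_;
};
```

This version takes the lock on every call for simplicity; a production version would probably avoid that cost on the hot path, for example with `std::call_once`.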

Contributor

@fabiocos The idea is to make the library big enough so that it doesn't matter. It's just generating low-level single-channel noise, over millions of channels, using a gaussian distribution. We should make it big enough so it can't be seen in physics validation plots, of course. In terms of the random seed variation: it doesn't matter if this is randomized if the collection is big enough. Each event gets an appropriate random entry into the library in a thread-safe way. This should be sufficient.

We can look at runSimple to see if this is a better way. Thanks.
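The library approach discussed in this thread can be sketched as follows, under stated assumptions: the pool is built once from a fixed seed, and each event draws a single random entry point and then reads consecutive values. Standard-library generators stand in for TRandom and the framework service, all function names are hypothetical, and the sequential wrap-around indexing is an assumption rather than the scheme used in the PR.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Build a fixed pool of Gaussian noise once. The fixed seed makes the
// pool itself reproducible from job to job.
std::vector<float> makeNoisePool(std::size_t n, float mean, float sigma, unsigned seed) {
  std::mt19937 gen(seed);
  std::normal_distribution<float> gauss(mean, sigma);
  std::vector<float> pool(n);
  for (auto& v : pool) {
    v = gauss(gen);
  }
  return pool;
}

// Per event: draw one random entry point into the pool from the
// per-event engine (a stand-in for the framework's engine).
template <typename Engine>
std::size_t drawOffset(Engine& eventEngine, std::size_t poolSize) {
  std::uniform_int_distribution<std::size_t> pick(0, poolSize - 1);
  return pick(eventEngine);
}

// Per channel: read from the pool starting at the event's offset,
// wrapping around at the end. No random number call is made here.
inline float noiseForChannel(const std::vector<float>& pool, std::size_t offset, std::size_t channel) {
  return pool[(offset + channel) % pool.size()];
}
```

With this layout the per-channel cost is one array read, and the only per-event random draw is the offset. The pool size then determines how often the same noise values recur across channels and events.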

TRandom trandom(seed);
for (size_t i = 0; i < NoiseArrayLength_; i++) {
for (size_t j = 0; j < samplesize_; j++) {
GaussianNoiseArray_[i][j] = trandom.Gaus(NoiseMean, NoiseStd);
Contributor

in principle there is also the fireArray method that can be used

Contributor Author

I stored the random numbers generated by the old method and those hashed by our new method right before they are used in runShaperToT, and plotted a random sample of each to visualize any possible bias. I found these:
[Figures: HGCDigisEE, HGCDigisHEfront, HFNoseDigis]

@adas1994
Contributor Author

adas1994 commented Sep 13, 2019 via email

@adas1994
Contributor Author

This commit adds a Python-configurable switch that selects which random noise generation method to use: the old method or the modified one.

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-27980/11937

  • This PR adds an extra 24KB to repository

@cmsbuild
Contributor

Pull request #27980 was updated. @cmsbuild, @civanch, @kpedro88, @mdhildreth can you please check and sign again.

@fabiocos
Contributor

please test

@cmsbuild
Contributor

cmsbuild commented Sep 19, 2019

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-run-pr-tests/2578/console Started: 2019/09/19 15:40

@cmsbuild
Contributor

Comparison job queued.

@cmsbuild
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-6919f9/2578/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 9545 differences found in the comparisons
  • DQMHistoTests: Total files compared: 34
  • DQMHistoTests: Total histograms compared: 2957336
  • DQMHistoTests: Total failures: 9354
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2947641
  • DQMHistoTests: Total skipped: 341
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 33 files compared)
  • Checked 145 log files, 15 edm output root files, 34 DQM output files

@adas1994
Contributor Author

adas1994 commented Sep 23, 2019 via email

@civanch
Contributor

civanch commented Sep 23, 2019

+1

@kpedro88
Contributor

+upgrade

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @davidlange6, @slava77, @smuzaffar, @fabiocos (and backports should be raised in the release meeting by the corresponding L2)

@fabiocos
Contributor

+1

@cmsbuild cmsbuild merged commit c7962f7 into cms-sw:master Sep 24, 2019