option to use deterministic event-based seed in SmearedJetProducer (for MT) #20240

Merged: 4 commits, Sep 6, 2017

kpedro88
Contributor

@kpedro88 kpedro88 commented Aug 22, 2017

While upgrading my ntuple production code to use multithreading, I noticed that the output jet collections differed for MC samples. I found the cause to be the use of random numbers in SmearedPATJetProducer. The solution is to use a deterministic seed based on the event number (thanks to @Dr15Jones for the suggestion). I followed a simplified version of the approach in https://github.com/cms-sw/cmssw/blob/CMSSW_9_3_0_pre4/RecoJets/JetProducers/plugins/VirtualJetProducer.cc#L272. This resolved the regression I observed in my ntuple code when running with multiple threads.
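To illustrate the idea (not the actual C++ change in this PR), here is a minimal Python sketch of event-based seeding. The helper names `event_seed`, `smear_event`, and `process`, the bit-shift mixing of run, lumi, and event numbers, and the 10% Gaussian resolution are all assumptions made for this example; only the default seed value 37428479 is taken from the configuration. Because every event builds its own generator from its own identifiers, the smeared values do not depend on the order in which events are processed.

```python
import random

BASE_SEED = 37428479  # default `seed` value from the producer configuration


def event_seed(run, lumi, event, base_seed=BASE_SEED):
    """Hypothetical mixing of event identifiers into a 32-bit seed.

    This is an illustrative formula, not the one used in the PR.
    """
    return (base_seed + (run << 20) + (lumi << 10) + event) & 0xFFFFFFFF


def smear_event(run, lumi, event, jet_pts, resolution=0.1):
    """Smear each jet pT with a generator seeded from the event identifiers."""
    rng = random.Random(event_seed(run, lumi, event))
    return [pt * max(0.0, 1.0 + rng.gauss(0.0, resolution)) for pt in jet_pts]


def process(events):
    """Process events in the given order and key the results by event id."""
    return {(r, l, e): smear_event(r, l, e, pts) for (r, l, e, pts) in events}


events = [
    (1, 1, 101, [50.0, 30.0]),
    (1, 1, 102, [80.0]),
    (1, 2, 205, [45.0, 25.0, 20.0]),
]

forward = process(events)
backward = process(list(reversed(events)))  # mimics a different scheduling order

# Per-event results are identical regardless of processing order.
assert forward == backward
```

With multiple threads the scheduling order of events is not fixed, so a generator shared across events would hand out different random numbers to the same event from run to run; seeding per event avoids that dependence.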

This PR will be backported to 80X. Questions for jet people (e.g. @rappoccio, @blinkseb, @zdemirag):

  1. Currently I have the useDeterministicSeed option disabled by default in the fillDescriptions(), so users would have to activate it manually. Should I enable it by default instead? This would change the default output. Edit: it was changed to be enabled by default.
  2. Are other backports desired? (e.g. 92X)

@cmsbuild
Contributor

The code-checks are being triggered in jenkins.

@cmsbuild
Contributor

A new Pull Request was created by @kpedro88 (Kevin Pedro) for master.

It involves the following packages:

PhysicsTools/PatUtils

@perrotta, @cmsbuild, @monttj, @slava77 can you please review it and eventually sign? Thanks.
@TaiSakuma, @gouskos, @imarches, @ahinzmann, @acaudron, @mmarionncern, @rappoccio, @jdolen, @nhanvtran, @gpetruc, @gkasieczka, @schoef, @ferencek, @mverzett, @mariadalfonso, @pvmulder, @seemasharmafnal, @JyothsnaKomaragiri this is something you requested to watch as well.
@davidlange6, @slava77 you are the release manager for this.

cms-bot commands are listed here

@kpedro88
Contributor Author

please test

@cmsbuild
Contributor

cmsbuild commented Aug 22, 2017

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/22435/console Started: 2017/08/23 01:12

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/PR-20240/257

@cmsbuild
Contributor

Comparison job queued.

@cmsbuild
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-20240/22435/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 26
  • DQMHistoTests: Total histograms compared: 2653934
  • DQMHistoTests: Total failures: 242
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2653503
  • DQMHistoTests: Total skipped: 189
  • DQMHistoTests: Total Missing objects: 0
  • Checked 107 log files, 14 edm output root files, 26 DQM output files

@slava77
Contributor

slava77 commented Aug 24, 2017

@kpedro88
isn't this also needed in 92X?

@kpedro88
Contributor Author

@slava77 I wasn't sure, which is why I asked the jet experts in the PR description. I can make a backport now if you think it's worthwhile.

@kpedro88
Contributor Author

Also, it was pointed out to me that this change solves a similar problem with reproducibility in CRAB if a job is split differently each time it's run, so maybe it should be enabled by default. I would still like to hear from jet experts on this (@rappoccio, @blinkseb, @zdemirag, ...).
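A rough Python sketch of the job-splitting point, under stated assumptions: `split`, `run_job_fixed_seed`, `run_job_event_seed`, and `run_all` are hypothetical helpers, events are reduced to bare event numbers, and the seed arithmetic is illustrative rather than what the producer does. With a generator that restarts from a fixed seed in every job, the number an event receives depends on its position within the job; with a per-event seed it does not.

```python
import random

FIXED_SEED = 37428479  # default `seed` value from the producer configuration


def split(events, size):
    """Split the event list into consecutive jobs of at most `size` events."""
    return [events[i:i + size] for i in range(0, len(events), size)]


def run_job_fixed_seed(chunk):
    """One generator per job, restarted from the same fixed seed."""
    rng = random.Random(FIXED_SEED)
    return {ev: rng.gauss(1.0, 0.1) for ev in chunk}


def run_job_event_seed(chunk):
    """A fresh generator per event, seeded from the event number (illustrative)."""
    return {ev: random.Random(FIXED_SEED + ev).gauss(1.0, 0.1) for ev in chunk}


def run_all(events, size, run_job):
    """Run every job of the given size and merge the per-event results."""
    out = {}
    for chunk in split(events, size):
        out.update(run_job(chunk))
    return out


events = list(range(1, 11))

# Event-based seeding: the same factors whether jobs hold 2 or 5 events.
print(run_all(events, 2, run_job_event_seed) == run_all(events, 5, run_job_event_seed))

# Fixed per-job seed: the factors change when the splitting changes.
print(run_all(events, 2, run_job_fixed_seed) == run_all(events, 5, run_job_fixed_seed))
```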

@kpedro88
Contributor Author

kpedro88 commented Sep 1, 2017

@slava77 is the latest commit okay? Let me know and then I'll update the backports

@cmsbuild
Contributor

cmsbuild commented Sep 1, 2017

The code-checks are being triggered in jenkins.

@cmsbuild
Contributor

cmsbuild commented Sep 1, 2017

Pull request #20240 was updated. @perrotta, @cmsbuild, @monttj, @slava77 can you please check and sign again.

@cmsbuild
Contributor

cmsbuild commented Sep 1, 2017

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/PR-20240/450

@slava77
Contributor

slava77 commented Sep 1, 2017

> @slava77 is the latest commit okay? Let me know and then I'll update the backports

yes, it covers the concerns mentioned so far.

@slava77
Contributor

slava77 commented Sep 1, 2017

@cmsbuild please test

@cmsbuild
Contributor

cmsbuild commented Sep 1, 2017

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/22656/console Started: 2017/09/01 19:17

@cmsbuild
Contributor

cmsbuild commented Sep 1, 2017

Comparison job queued.

@cmsbuild
Contributor

cmsbuild commented Sep 1, 2017

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-20240/22656/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 26
  • DQMHistoTests: Total histograms compared: 2656222
  • DQMHistoTests: Total failures: 220
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2655813
  • DQMHistoTests: Total skipped: 189
  • DQMHistoTests: Total Missing objects: 0
  • Checked 107 log files, 14 edm output root files, 26 DQM output files

@@ -138,6 +138,7 @@
 variation = cms.int32(0), # If not specified, default to 0

 seed = cms.uint32(37428479), # If not specified, default to 37428479
+useDeterministicSeed = cms.bool(True),
Contributor

This does not match the PR description:

> Currently I have the useDeterministicSeed option disabled by default

I suppose the description should be updated.

@slava77
Contributor

slava77 commented Sep 5, 2017

+1

for #20240 e4d33c5

  • deterministic random numbers are implemented and enabled by default
  • Jenkins tests pass, and comparisons with the baseline show some differences in the METUnc_JetRes{Up,Down} plots for MC workflows, as expected

@kpedro88
Contributor Author

kpedro88 commented Sep 6, 2017

@slava77 updated PR description

@davidlange6
Contributor

merge
