introduce MINIGEN step and output to replace GenOnly workflows #19853

Merged
merged 20 commits into from Sep 12, 2017

perrozzi
Contributor

Created a new event content
--eventcontent MINIAODGEN
and a new step
--step PATGEN
to allow producing MINIAODSIM from GenOnly datasets.
This is useful for GEN validation and for creating alternative samples with small event size in central production.

@cmsbuild cmsbuild added this to the CMSSW_9_3_X milestone Jul 21, 2017
@perrozzi perrozzi changed the title from "allow running MINIAOSIM from GenOnly from cmsDriver" to "allow running MINIAOSIM from GenOnly using cmsDriver" Jul 21, 2017
@cmsbuild
Contributor

A new Pull Request was created by @perrozzi for master.

It involves the following packages:

Configuration/Applications
Configuration/EventContent
PhysicsTools/PatAlgos

@perrotta, @monttj, @cmsbuild, @franzoni, @slava77, @davidlange6 can you please review it and eventually sign? Thanks.
@ghellwig, @mmarionncern, @gouskos, @rappoccio, @imarches, @ahinzmann, @acaudron, @gpetruc, @TaiSakuma, @Martin-Grunewald, @jdolen, @nhanvtran, @JyothsnaKomaragiri, @gkasieczka, @schoef, @ferencek, @mverzett, @mariadalfonso, @pvmulder, @seemasharmafnal this is something you requested to watch as well.
@davidlange6 you are the release manager for this.

cms-bot commands are listed here

@perrozzi perrozzi changed the title from "allow running MINIAOSIM from GenOnly using cmsDriver" to "allow running MINIAODSIM from GenOnly using cmsDriver" Jul 21, 2017
@perrozzi
Contributor Author

please test

@cmsbuild
Contributor

cmsbuild commented Jul 21, 2017

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/21658/console Started: 2017/07/21 11:57

@cmsbuild
Contributor

Comparison job queued.

@slava77
Contributor

slava77 commented Jul 21, 2017

@perrozzi
please add a test configuration for this new cmsDriver step.
A matrix workflow seems more appropriate, but it can also be added as a unit test in PhysicsTools/PatAlgos/test/runtests.sh

@perrozzi
Contributor Author

@slava77 I am not sure where to find a cmsDriver example for master/93X.
I tested it in 80X using the following (I then ported the change and removed the cfg from the commit):

# with command line options: step1 --fileout file:out.root --filein /store/mc/RunIISummer15GS/DYToMuMu_M_50_TuneAZ_8TeV_pythia8/GEN-SIM/GenOnly_MCRUN2_71_V1-v3/100000/006A257F-D140-E711-93E2-0025901C1A92.root --mc --eventcontent MINIAODGEN --runUnscheduled --datatier MINIAODSIM --conditions 80X_mcRun2_asymptotic_2016_TrancheIV_v6 --step PATGEN --nThreads 1 --era Run2_2016 --python_filename test_MINIAODGEN_from_GEN.py --no_exec -n 5
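For illustration only, the option list above could be assembled programmatically, for instance when scripting a test configuration. The sketch below uses a hypothetical helper, `build_cmsdriver_args`, which is not part of CMSSW. It takes the options exactly as quoted above, builds the command string, and prints it; it does not run cmsDriver.py.

```python
import shlex


def build_cmsdriver_args(name, options, flags, n_events):
    """Hypothetical helper: turn option/flag collections into an argument list."""
    args = ["cmsDriver.py", name]
    for key, value in options.items():
        args += [f"--{key}", str(value)]   # options that take a value
    for flag in flags:
        args.append(f"--{flag}")           # boolean switches
    args += ["-n", str(n_events)]          # number of events
    return args


# Values copied from the command line quoted above.
options = {
    "fileout": "file:out.root",
    "filein": "/store/mc/RunIISummer15GS/DYToMuMu_M_50_TuneAZ_8TeV_pythia8/"
              "GEN-SIM/GenOnly_MCRUN2_71_V1-v3/100000/"
              "006A257F-D140-E711-93E2-0025901C1A92.root",
    "eventcontent": "MINIAODGEN",
    "datatier": "MINIAODSIM",
    "conditions": "80X_mcRun2_asymptotic_2016_TrancheIV_v6",
    "step": "PATGEN",
    "nThreads": 1,
    "era": "Run2_2016",
    "python_filename": "test_MINIAODGEN_from_GEN.py",
}
flags = ["mc", "runUnscheduled", "no_exec"]

args = build_cmsdriver_args("step1", options, flags, 5)
command = " ".join(shlex.quote(a) for a in args)
print(command)
```

Keeping the arguments as a list (rather than a single string) means the same list could later be handed to a process launcher without any shell quoting issues.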

do you have suggestions?

@slava77
Contributor

slava77 commented Jul 21, 2017

@fabozzi
please take a look at the new steps in cmsDriver and advise @perrozzi on how to add a matrix workflow for it.
Thank you.

@cmsbuild
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-19853/21658/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 340 differences found in the comparisons
  • DQMHistoTests: Total files compared: 23
  • DQMHistoTests: Total histograms compared: 2347494
  • DQMHistoTests: Total failures: 30599
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2316729
  • DQMHistoTests: Total skipped: 166
  • DQMHistoTests: Total Missing objects: 0
  • Checked 93 log files, 14 edm output root files, 23 DQM output files

@davidlange6
Contributor

davidlange6 commented Jul 21, 2017 via email

@slava77
Contributor

slava77 commented Jul 21, 2017

Reco comparison results: 340 differences found in the comparisons

Unless we semi-blindly ignore the diffs, this needs to be rerun after another IB.
The PR was slightly ahead of the IB at the time of testing.

@perrozzi
Contributor Author

@davidlange6 sure, I can try to summarize here, and then we can discuss at the ORP meeting or elsewhere.

In GEN we often use GenOnly workflows (i.e. the GEN step without SIM) to provide high-statistics alternative samples, for instance to compare unfolded data with MC or to validate new versions of the MC.
In most cases we use Rivet for this purpose, which relies on the HepMC product.
However, GEN(-SIM) datasets have a very short (i.e. zero) lifetime on disk [*], which complicates things, especially for quick validation turnaround.
Thanks to developments over the last few months, Rivet can now be reliably run starting from MINIAOD [**].

Since we foresee one such quick validation round very soon, to validate the latest GEN updates before launching the 2017 v2 MC campaign, being able to rely on central-production GenOnly+MINIAODSIM workflows (saving only the reduced MINIAODSIM output) would be very beneficial, I think.

The validation can be carried out using the current 2016 MC campaign (71X+80X), so this feature is somewhat more urgent in 80X than in master/93X.

Let me know if you want to discuss more or would like to have further input.
Thanks
Luca

[*] https://hypernews.cern.ch/HyperNews/CMS/get/comp-ops/3623/1/1/1.html

[**] The HepMC collection can be produced on the fly starting from the GenParticle collection, but in MINIAOD that collection is split and slimmed into PackedGenParticles and PrunedGenParticles to reduce disk space.
A "merger" has been written to recombine the two collections and restore the missing links, so that a complete HepMC can be produced. The Rivet results obtained with this HepMC collection have been compared to those obtained with the original collection and successfully validated.
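
As a rough illustration of the merging idea in [**], here is a toy sketch in plain Python. It is not the CMSSW merger: the `Particle` class, the `merge_gen_particles` function, and the `duplicate_of` bookkeeping are all hypothetical, and the data layout (packed particles storing their mother as an index into the pruned list) is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Particle:
    pdg_id: int
    status: int
    mother: Optional[int] = None            # index into the pruned list (assumed layout)
    daughters: List[int] = field(default_factory=list)


def merge_gen_particles(pruned, packed, duplicate_of):
    """Toy merge: duplicate_of[i] is the pruned index that packed[i] repeats, or None."""
    # Copy the pruned particles first so that their indices stay valid.
    merged = [Particle(p.pdg_id, p.status, p.mother) for p in pruned]
    # Append only the packed particles that are not already in the pruned list.
    for p, dup in zip(packed, duplicate_of):
        if dup is None:
            merged.append(Particle(p.pdg_id, p.status, p.mother))
    # Rebuild the mother -> daughter links over the merged list.
    for index, p in enumerate(merged):
        if p.mother is not None:
            merged[p.mother].daughters.append(index)
    return merged


# Toy event: a Z decaying to two muons, plus one photon kept only in the packed list.
pruned = [Particle(23, 62), Particle(13, 1, mother=0), Particle(-13, 1, mother=0)]
packed = [Particle(13, 1, mother=0), Particle(-13, 1, mother=0), Particle(22, 1, mother=0)]
duplicate_of = [1, 2, None]

merged = merge_gen_particles(pruned, packed, duplicate_of)
print(len(merged), merged[0].daughters)
```

Copying the pruned list first is a deliberate choice in this sketch: the mother indices stored on the packed particles then remain valid in the merged list without any remapping.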

@slava77
Contributor

slava77 commented Jul 24, 2017

@perrozzi
Can all of this simply be added to the GEN step?

The output data tier had better be named MINIGEN or something similar, because "AOD" in the name seems inappropriate.

@perrotta
Contributor

perrotta commented Sep 4, 2017

please test

@cmsbuild
Contributor

cmsbuild commented Sep 4, 2017

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/22676/console Started: 2017/09/04 06:56

@cmsbuild
Contributor

cmsbuild commented Sep 4, 2017

Comparison job queued.

@cmsbuild
Contributor

cmsbuild commented Sep 4, 2017

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-19853/22676/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 26
  • DQMHistoTests: Total histograms compared: 2656522
  • DQMHistoTests: Total failures: 207
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2656126
  • DQMHistoTests: Total skipped: 189
  • DQMHistoTests: Total Missing objects: 0
  • Checked 107 log files, 14 edm output root files, 26 DQM output files

@perrozzi
Contributor Author

perrozzi commented Sep 4, 2017

Hello, may I kindly ask for this to be reviewed in time for tomorrow's ORP? I would really like to close this...

@slava77
Contributor

slava77 commented Sep 5, 2017

+1

for #19853 3781f11

@smuzaffar smuzaffar modified the milestones: CMSSW_9_4_X, CMSSW_9_3_X Sep 6, 2017
@perrozzi perrozzi changed the title from "allow running MINIAODSIM from GenOnly using cmsDriver" to "introduce MINIGEN step and output to replace GenOnly workflows" Sep 6, 2017
@perrozzi
Contributor Author

perrozzi commented Sep 6, 2017

@perrotta, @monttj, can you please review and sign this PR, unless there are other suggestions?

@perrozzi
Contributor Author

Dear @perrotta, @monttj, I would like to kindly renew my request that you review and sign this PR, unless there are other suggestions.

@perrotta
Contributor

@perrozzi: this PR is only missing the "analysis" signature; I can only sign for reco, and @slava77 already did so six days ago.
@davidlange6: this PR is otherwise fully signed and has only been missing the analysis signature for six days already; since GEN apparently urgently needs it merged in the release, if @monttj cannot sign, maybe you can bypass and at least start evaluating the PR.

@davidlange6
Contributor

merge
