Fix the MadGraph bug for multithreading #30444

Merged: 2 commits merged into cms-sw:master on Jun 29, 2020

@colizz colizz commented Jun 29, 2020

PR description:

Fix two scripts related to the MadGraph multithreaded implementation.

  • A MadGraph bug was recently observed in multithread mode that affects the calculated cross-section for multi-jet processes: https://bugs.launchpad.net/mg5amcnlo/+bug/1884085
    To fix this, we apply the corresponding patch to the MadGraph code when producing events from earlier gridpacks.
  • Simplify the runcmsgrid patch to make it compatible with the current and future versions of the runcmsgrid script.
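
As a rough illustration of applying a patch to an unpacked tree without failing on a second run, here is a minimal sketch. It is not the code from this PR: `apply_patch_once`, the `run.cfg` file and the temporary directory layout are all hypothetical, and GNU `patch` and `diff` are assumed to be available.

```shell
#!/bin/bash
# Hypothetical helper (not from this PR): apply a patch inside a directory
# only if it still applies cleanly, so a re-run on a patched tree is a no-op.
apply_patch_once() {
    local target_dir="$1" patch_file="$2"
    # Dry run first: --forward makes patch refuse already-applied hunks.
    if patch -d "$target_dir" -p1 --dry-run --forward --silent \
            < "$patch_file" >/dev/null 2>&1; then
        patch -d "$target_dir" -p1 --forward --silent < "$patch_file" >/dev/null
        echo "applied"
    else
        echo "skipped"
    fi
}

# Toy demonstration with made-up files in a temporary directory.
workdir=$(mktemp -d)
mkdir -p "$workdir/a" "$workdir/b" "$workdir/tree"
echo "nthreads=1" > "$workdir/a/run.cfg"
echo "nthreads=4" > "$workdir/b/run.cfg"
cp "$workdir/a/run.cfg" "$workdir/tree/run.cfg"
# diff exits with status 1 when the files differ, hence the "|| true".
(cd "$workdir" && diff -u a/run.cfg b/run.cfg > fix.patch) || true

apply_patch_once "$workdir/tree" "$workdir/fix.patch"   # first call patches the file
apply_patch_once "$workdir/tree" "$workdir/fix.patch"   # second call leaves it alone
```

The dry run keeps the check and the modification separate, which is one simple way to make a wrapper script safe to run on gridpacks that may or may not already contain the fix.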

PR validation:

The fix only touches shell scripts, so it does not affect the CMSSW framework.
We tested the scripts on the W+01234j and DY+01234j multi-jet processes to confirm that the MG bug is fixed.

(A quick test for this PR on SMP-MultiValidationUL17wmLHEGEN-00001)

cmsrel CMSSW_11_2_X_2020-06-28-0000
cd CMSSW_11_2_X_2020-06-28-0000/src/
cmsenv
git cms-merge-topic colizz:dev-mt-112X
curl -s --insecure https://cms-pdmv.cern.ch/mcm/public/restapi/requests/get_fragment/SMP-MultiValidationUL17wmLHEGEN-00001 --retry 2 --create-dirs -o Configuration/GenProduction/python/SMP-MultiValidationUL17wmLHEGEN-00001-fragment.py 
scram b -j 8
cd ../..
cmsDriver.py Configuration/GenProduction/python/SMP-MultiValidationUL17wmLHEGEN-00001-fragment.py --fileout file:SMP-MultiValidationUL17wmLHEGEN-00001.root --mc --eventcontent RAWSIM,LHE --datatier GEN,LHE --conditions 106X_mc2017_realistic_v6 --beamspot Realistic25ns13TeVEarly2017Collision --step LHE,GEN --geometry DB:Extended --era Run2_2017 --python_filename SMP-MultiValidationUL17wmLHEGEN-00001_1_cfg.py --no_exec --customise Configuration/DataProcessing/Utils.addMonitoring -n 400 --nThreads 4

cmsRun -e -j SMP-MultiValidationUL17wmLHEGEN-00001_rt.xml SMP-MultiValidationUL17wmLHEGEN-00001_1_cfg.py

A bug was recently observed in the MadGraph multithread mode that influences the calculated xsec: https://bugs.launchpad.net/mg5amcnlo/+bug/1884085
To fix this, we add additional patches to the MadGraph code when producing events from earlier gridpacks.
@cmsbuild
Contributor

The code-checks are being triggered in jenkins.

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-30444/16565

  • This PR adds an extra 16KB to repository

@cmsbuild
Contributor

A new Pull Request was created by @colizz (Congqiao Li) for master.

It involves the following packages:

GeneratorInterface/LHEInterface

@SiewYan, @mkirsano, @cmsbuild, @GurpreetSinghChahal, @agrohsje, @alberto-sanchez, @qliphy can you please review it and eventually sign? Thanks.
@alberto-sanchez, @agrohsje, @mkirsano this is something you requested to watch as well.
@silviodonato, @dpiparo you are the release manager for this.

cms-bot commands are listed here

@qliphy
Contributor

qliphy commented Jun 29, 2020

please test

@cmsbuild
Contributor

cmsbuild commented Jun 29, 2020

The tests are being triggered in jenkins.

@qliphy
Contributor

qliphy commented Jun 29, 2020

thanks @colizz
Do you think the patch should also be included in https://github.com/cms-sw/genproductions ?

@colizz
Contributor Author

colizz commented Jun 29, 2020

> thanks @colizz
> Do you think the patch should also be included in https://github.com/cms-sw/genproductions ?

Thanks @qliphy.
I would suggest doing so when we officially move to MG>=2.7.3.
The reason is that, for the MG version currently in use (<=2.6.5), this patch must be used together with the MG bug-fix patches implemented in run_generic_tarball_cvmfs_madgraphLO_multithread.sh, the wrapper of runcmsgrid.sh. Those fixes depend on the gridpack version and are somewhat complicated to handle.
Therefore it is safer to keep all of this in CMSSW and let CMSSW trigger the multithread feature.

With MG>=2.7.3, we can include this patch in genproductions (or apply it directly to runcmsgrid in genproductions), since all of these bugs are fixed.
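
The version-dependent logic described above could be sketched roughly as follows. This is only an illustration, not the wrapper's actual code: `version_ge` and `needs_mg_patches` are hypothetical names, GNU `sort -V` is assumed, and treating every version below 2.7.3 as needing the patches is an assumption made for the example.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when version $1 >= version $2.
version_ge() {
    # sort -V orders version strings; if $2 comes first (or they are equal),
    # then $1 is at least $2.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical check: succeeds when the MG bug-fix patches are still needed.
# Assumption for this sketch: everything below 2.7.3 needs them.
needs_mg_patches() {
    local mg_version="$1"
    if version_ge "$mg_version" "2.7.3"; then
        return 1   # fixed upstream, nothing to patch
    fi
    return 0
}

for v in 2.6.5 2.7.3; do
    if needs_mg_patches "$v"; then
        echo "MG $v: apply bug-fix patches"
    else
        echo "MG $v: no patches needed"
    fi
done
```

Using `sort -V` avoids comparing version strings lexically, which would misorder cases such as 2.10.0 and 2.7.3.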

@cmsbuild
Contributor

+1
Tested at: 6dab9fb
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-3b45fa/7471/summary.html
CMSSW: CMSSW_11_2_X_2020-06-28-0000
SCRAM_ARCH: slc7_amd64_gcc820

@cmsbuild
Contributor

Comparison job queued.

@cmsbuild
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-3b45fa/7471/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 36
  • DQMHistoTests: Total histograms compared: 2778915
  • DQMHistoTests: Total failures: 4
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2778861
  • DQMHistoTests: Total skipped: 50
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 35 files compared)
  • Checked 152 log files, 16 edm output root files, 36 DQM output files

@qliphy
Contributor

qliphy commented Jun 29, 2020

+1

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @silviodonato, @dpiparo (and backports should be raised in the release meeting by the corresponding L2)

@silviodonato
Contributor

please test workflow 551.0

@cmsbuild
Contributor

cmsbuild commented Jun 29, 2020

The tests are being triggered in jenkins.
Test Parameters:

@cmsbuild
Contributor

+1
Tested at: 6dab9fb
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-3b45fa/7487/summary.html
CMSSW: CMSSW_11_2_X_2020-06-29-1100
SCRAM_ARCH: slc7_amd64_gcc820

@cmsbuild
Contributor

Comparison job queued.

@silviodonato
Contributor

+1

@cmsbuild cmsbuild merged commit d63e62e into cms-sw:master Jun 29, 2020
@cmsbuild
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-3b45fa/7487/summary.html

@slava77 comparisons for the following workflows were not done due to missing matrix map:

  • /data/cmsbld/jenkins/workspace/compare-root-files-short-matrix/data/PR-3b45fa/551.0_TTbar012Jets_NLO_Mad_13TeV_py8+TTbar012Jets_5f_NLO_FXFX_Madgraph_LHE_13TeV+Hadronizer_TuneCP5_13TeV_aMCatNLO_FXFX_5f_max2j_max1p_LHE_pythia8+HARVESTGEN2

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 2 differences found in the comparisons
  • DQMHistoTests: Total files compared: 36
  • DQMHistoTests: Total histograms compared: 2778915
  • DQMHistoTests: Total failures: 5
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 2778859
  • DQMHistoTests: Total skipped: 50
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.004 KiB( 35 files compared)
  • DQMHistoSizes: changed ( 10224.0 ): 0.004 KiB MessageLogger/Warnings
  • Checked 152 log files, 16 edm output root files, 36 DQM output files

cmsbuild added a commit that referenced this pull request Jun 30, 2020
Fix the MadGraph bug for multithreading (backport #30444)
@colizz colizz deleted the dev-mt-112X branch June 30, 2020 10:15
@colizz colizz restored the dev-mt-112X branch June 30, 2020 10:16