ConcurrentHadronizerFilter #28913

Dr15Jones · 2020-02-10T19:59:10Z

PR description:

Created the ConcurrentHadronizerFilter. This templated class is similar to HadronizerFilter except it is thread-safe and can run the Hadronizer concurrently for different events.

The only hadronizer that has been instantiated is Pythia8 using a dummy decayer class, ConcurrentExternalDecayDriver.

PR validation:

The code was tested using a production workflow snippet where Pythia8HadronizerFilter was replaced with Pythia8ConcurrentHadronizerFilter. In the snippet, no external decayer was specified.

This global module replicates the Hadronizer for each stream in order to run them concurrently. This only works for thread-friendly hadronizers, decayers and filters.
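
As a rough illustration of the per-stream replication described above, here is a small Python sketch. It is not CMSSW code: `ToyHadronizer`, `StreamReplicatedFilter` and `run` are hypothetical names, and dealing events round-robin to streams is an assumption of the toy model. The point is only that each stream owns its own replica, so no mutable state is shared between events processed concurrently.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyHadronizer:
    """Stand-in for a stateful hadronizer (hypothetical, not a CMSSW class)."""

    def __init__(self):
        self.events_seen = 0  # mutable state: unsafe to share across threads

    def hadronize(self, event_id):
        self.events_seen += 1
        return event_id * 2  # placeholder for the real work


class StreamReplicatedFilter:
    """Holds one hadronizer per stream; a stream handles one event at a time."""

    def __init__(self, n_streams):
        self.replicas = [ToyHadronizer() for _ in range(n_streams)]

    def filter(self, stream_id, event_id):
        # Each stream only ever touches its own replica, so no lock is needed.
        return self.replicas[stream_id].hadronize(event_id)


def run(n_streams, n_events):
    module = StreamReplicatedFilter(n_streams)

    def stream_loop(stream_id):
        # Toy assumption: events are dealt round-robin to the streams.
        events = range(stream_id, n_events, n_streams)
        return [module.filter(stream_id, e) for e in events]

    with ThreadPoolExecutor(max_workers=n_streams) as pool:
        per_stream = list(pool.map(stream_loop, range(n_streams)))

    results = sorted(r for chunk in per_stream for r in chunk)
    return results, [h.events_seen for h in module.replicas]
```

Calling `run(4, 100)` processes 100 toy events on 4 streams, with each replica seeing only the events of its own stream.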

ConcurrentExternalDecayDriver is meant to give access to thread-friendly decayers. At the moment none exist, so the object will throw an exception if used.

Pythia8ConcurrentHadronizerFilter makes use of the ConcurrentHadronizerFilter.

Use identical clones of the random engine in order to set up the hadronizer and decayer on each LuminosityBlock boundary.
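
A minimal sketch of the "identical clones" idea, using Python's `random.Random` as a stand-in for the framework's random engine (an assumption for illustration; `clone_engine` and `setup_draws_per_stream` are hypothetical helpers). Every stream receives a copy in exactly the same state, so the setup step of each replica draws the same numbers.

```python
import random


def clone_engine(engine):
    """Return an independent engine in exactly the same state as `engine`."""
    clone = random.Random()
    clone.setstate(engine.getstate())
    return clone


def setup_draws_per_stream(seed, n_streams, n_draws=3):
    """Simulate each stream's setup step consuming numbers from its own clone."""
    master = random.Random(seed)
    clones = [clone_engine(master) for _ in range(n_streams)]
    # Each clone advances independently, but all start from the same state.
    return [[c.random() for _ in range(n_draws)] for c in clones]
```

Because the clones are independent objects, one stream consuming numbers during setup does not shift the sequence seen by any other stream.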

cmsbuild · 2020-02-10T19:59:37Z

The code-checks are being triggered in jenkins.

cmsbuild · 2020-02-10T20:04:31Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-28913/13703

This PR adds an extra 28KB to repository

cmsbuild · 2020-02-10T20:04:56Z

A new Pull Request was created by @Dr15Jones (Chris Jones) for master.

It involves the following packages:

GeneratorInterface/Core
GeneratorInterface/ExternalDecays
GeneratorInterface/Pythia8Interface

@SiewYan, @efeyazgan, @mkirsano, @cmsbuild, @agrohsje, @alberto-sanchez, @qliphy can you please review it and eventually sign? Thanks.
@alberto-sanchez, @agrohsje, @mkirsano this is something you requested to watch as well.
@davidlange6, @silviodonato, @fabiocos you are the release manager for this.

cms-bot commands are listed here

Dr15Jones · 2020-02-10T20:05:47Z

please test

cmsbuild · 2020-02-10T20:06:07Z

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-run-pr-tests/4584/console Started: 2020/02/10 21:06

cmsbuild · 2020-02-10T20:09:26Z

-1

Tested at: d49909b

CMSSW: CMSSW_11_1_X_2020-02-10-1100
SCRAM_ARCH: slc7_amd64_gcc820
You can see the results of the tests here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-98755c/4584/summary.html

I found the following errors while testing this PR

Failed tests: ClangBuild

Clang:

I found a compilation error while trying to compile with clang. Command used:

USER_CUDA_FLAGS='--expt-relaxed-constexpr' USER_CXXFLAGS='-Wno-register -fsyntax-only' scram build -k -j 32 COMPILER='llvm compile'

                 ^
/cvmfs/cms-ib.cern.ch/nweek-02615/slc7_amd64_gcc820/external/pythia8/243/include/Pythia8/SpaceShower.h:146:18: note: hidden overloaded virtual function 'Pythia8::SpaceShower::getSplittingProb' declared here: different number of parameters (5 vs 7)
  virtual double getSplittingProb( const Event& , int , int , int , string )
                 ^
In file included from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_11_1_X_2020-02-10-1100/src/GeneratorInterface/Pythia8Interface/plugins/Pythia8Hadronizer.cc:61:
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_11_1_X_2020-02-10-1100/src/GeneratorInterface/Core/interface/ConcurrentHadronizerFilter.h:139:20: error: 'callWhenNewProductsRegistered' following the 'template' keyword does not refer to a template
    this->template callWhenNewProductsRegistered([ptrThis](BranchDescription const& iBD) {
                   ^
/cvmfs/cms-ib.cern.ch/nweek-02615/slc7_amd64_gcc820/external/gcc/8.2.0-pafccj/lib/gcc/x86_64-unknown-linux-gnu/8.3.1/../../../../include/c++/8.3.1/bits/unique_ptr.h:831:34: note: in instantiation of member function 'edm::ConcurrentHadronizerFilter::ConcurrentHadronizerFilter' requested here
    { return unique_ptr<_Tp>(new _Tp(std::forward<_Args>(__args)...)); }
                                 ^

cmsbuild · 2020-02-10T20:09:28Z

Comparison not run due to Build errors/Fireworks only changes/No short matrix requested (RelVals and Igprof tests were also skipped)

Dr15Jones · 2020-02-10T20:10:30Z

The test used the following

gen_cff.py

import FWCore.ParameterSet.Config as cms

externalLHEProducer = cms.EDProducer("ExternalLHEProducer",
    args = cms.vstring('/cvmfs/cms.cern.ch/phys_generator/gridpacks/2017/13TeV/powheg/V2/TT_hvq/patched/TT_hdamp_NNPDF31_NNLO_inclusive_patched_reducedPDFWeights.tgz'),
    nEvents = cms.untracked.uint32(5000),
    numberOfParameters = cms.uint32(1),
    outputFile = cms.string('cmsgrid_final.lhe'),
    scriptName = cms.FileInPath('GeneratorInterface/LHEInterface/data/run_generic_tarball_cvmfs.sh')
)

#Link to datacards:
#https://github.com/cms-sw/genproductions/blob/master/bin/Powheg/production/2017/13TeV/TT_hvq/TT_hdamp_NNPDF31_NNLO_inclusive.input

import FWCore.ParameterSet.Config as cms
from Configuration.Generator.Pythia8CommonSettings_cfi import *
from Configuration.Generator.MCTunes2017.PythiaCP5Settings_cfi import *
from Configuration.Generator.Pythia8PowhegEmissionVetoSettings_cfi import *
from Configuration.Generator.PSweightsPythia.PythiaPSweightsSettings_cfi import *

#generator = cms.EDFilter("Pythia8HadronizerFilter",
generator = cms.EDFilter("Pythia8ConcurrentHadronizerFilter",
    maxEventsToPrint = cms.untracked.int32(1),
    pythiaPylistVerbosity = cms.untracked.int32(1),
    filterEfficiency = cms.untracked.double(1.0),
    pythiaHepMCVerbosity = cms.untracked.bool(False),
    comEnergy = cms.double(13000.),
    PythiaParameters = cms.PSet(
        pythia8CommonSettingsBlock,
        pythia8CP5SettingsBlock,
        pythia8PowhegEmissionVetoSettingsBlock,
        pythia8PSweightsSettingsBlock,
        processParameters = cms.vstring(
            'POWHEG:nFinal = 2',           # Number of final state particles
                                           # (BEFORE THE DECAYS) in the LHE
                                           # other than emitted extra parton
            'TimeShower:mMaxGamma = 1.0',  # cutting off lepton-pair production
                                           # in the electromagnetic shower
                                           # to not overlap with ttZ/gamma* samples
            '6:m0 = 172.5',                # top mass
        ),
        parameterSets = cms.vstring(
            'pythia8CommonSettings',
            'pythia8CP5Settings',
            'pythia8PowhegEmissionVetoSettings',
            'pythia8PSweightsSettings',
            'processParameters'
        )
    )
)

genParticlesForFilter = cms.EDProducer("GenParticleProducer",
    abortOnUnknownPDGCode = cms.untracked.bool(False),
    saveBarCodes = cms.untracked.bool(True),
    src = cms.InputTag("generator", "unsmeared")
)

genParticlesForjetsForFilter = cms.EDProducer("InputGenJetsParticleSelector",
    excludeFromResonancePids = cms.vuint32(12, 13, 14, 16),
    excludeResonances = cms.bool(False),
    ignoreParticleIDs = cms.vuint32(1000022, 1000012, 1000014, 1000016, 2000012, 
        2000014, 2000016, 1000039, 5100039, 4000012, 
        4000014, 4000016, 9900012, 9900014, 9900016, 
        39),
    partonicFinalState = cms.bool(False),
    src = cms.InputTag("genParticlesForFilter"),
    tausAsJets = cms.bool(False)
)

ak8GenJetsForFilter = cms.EDProducer("FastjetJetProducer",
    Active_Area_Repeats = cms.int32(5),
    GhostArea = cms.double(0.01),
    Ghost_EtaMax = cms.double(6.0),
    Rho_EtaMax = cms.double(4.5),
    doAreaFastjet = cms.bool(False),
    doPUOffsetCorr = cms.bool(False),
    doPVCorrection = cms.bool(False),
    doRhoFastjet = cms.bool(False),
    inputEMin = cms.double(0.0),
    inputEtMin = cms.double(0.0),
    jetAlgorithm = cms.string('AntiKt'),
    jetPtMin = cms.double(3.0),
    jetType = cms.string('GenJet'),
    maxBadEcalCells = cms.uint32(9999999),
    maxBadHcalCells = cms.uint32(9999999),
    maxProblematicEcalCells = cms.uint32(9999999),
    maxProblematicHcalCells = cms.uint32(9999999),
    maxRecoveredEcalCells = cms.uint32(9999999),
    maxRecoveredHcalCells = cms.uint32(9999999),
    minSeed = cms.uint32(14327),
    nSigmaPU = cms.double(1.0),
    rParam = cms.double(0.8),
    radiusPU = cms.double(0.5),
    src = cms.InputTag("genParticlesForjetsForFilter"),
    srcPVs = cms.InputTag(""),
    useDeterministicSeed = cms.bool(True)
)

genHTFilter = cms.EDFilter("GenHTFilter",
    genHTcut = cms.double(649.0),
    jetEtaCut = cms.double(1000.0),
    jetPtCut = cms.double(650.0),
    src = cms.InputTag("ak8GenJetsForFilter")
)

LHEJetFilter = cms.EDFilter("LHEJetFilter",
    jetPtMin = cms.double(350.0),
    jetR = cms.double(0.8),
    src = cms.InputTag("externalLHEProducer")
)

ProductionFilterSequence = cms.Sequence(LHEJetFilter*generator*genParticlesForFilter*genParticlesForjetsForFilter*ak8GenJetsForFilter*genHTFilter)

test_cfg.py

import FWCore.ParameterSet.Config as cms

process = cms.Process("GEN")

process.load("gen_cff")
process.load('IOMC.RandomEngine.IOMC_cff')
process.load('SimGeneral.HepPDTESSource.pythiapdt_cfi')

process.source = cms.Source("EmptySource")

process.maxEvents.input = 5000

process.options.wantSummary = True
process.options.numberOfThreads = 4

process.p = cms.Path(process.ProductionFilterSequence, cms.Task(process.externalLHEProducer))

Dr15Jones · 2020-02-10T20:27:59Z

It would be possible to allow our present decayers to work with this class. I can see two ways to do it:

1. We use the code from FWCore/SharedMemory to run the decayers in a separate process. The downside is that we would have to serialize/deserialize all the data needed and created by the decayers. Given that overhead, this approach would quite likely only be beneficial at very high (say 100+) thread counts.
2. We use the ExternalWork facility to execute the decayers in a separate TBB task, where serialization of access to the decayer is handled by the appropriate SerialTaskQueue. (This is how the one:: module HadronizerFilter works internally.)

Of course the absolute best would be to have access to thread-friendly decayers.
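
The second approach can be pictured with a small Python sketch. It is only an analogy under stated assumptions: a single-worker executor plays the role of the serial queue, and `ToyDecayer`, `SerializedDecayer` and `run_streams` are hypothetical names, not CMSSW classes. Many streams submit work concurrently, but the non-thread-safe decayer only ever runs one call at a time.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyDecayer:
    """Stand-in for a decayer that is not thread-safe (hypothetical)."""

    def __init__(self):
        self.busy = False
        self.overlaps = 0  # counts calls that started while another was running
        self.calls = 0

    def decay(self, event_id):
        if self.busy:
            self.overlaps += 1
        self.busy = True
        self.calls += 1
        result = f"decayed-{event_id}"
        self.busy = False
        return result


class SerializedDecayer:
    """Funnels all calls through one worker thread, so they never overlap."""

    def __init__(self, decayer):
        self._decayer = decayer
        self._queue = ThreadPoolExecutor(max_workers=1)

    def decay_async(self, event_id):
        # Returns a future; the submitting thread is free to do other work.
        return self._queue.submit(self._decayer.decay, event_id)

    def shutdown(self):
        self._queue.shutdown(wait=True)


def run_streams(n_streams, events_per_stream):
    decayer = ToyDecayer()
    serial = SerializedDecayer(decayer)

    def stream_loop(stream_id):
        futures = [serial.decay_async((stream_id, i)) for i in range(events_per_stream)]
        return [f.result() for f in futures]

    with ThreadPoolExecutor(max_workers=n_streams) as streams:
        outputs = list(streams.map(stream_loop, range(n_streams)))
    serial.shutdown()
    return decayer, outputs
```

The trade-off is the one implied above: the decayer itself remains a serial bottleneck, so only the work outside it benefits from running on several streams.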

Dr15Jones · 2020-02-10T20:28:22Z

@makortel FYI

cmsbuild · 2020-02-10T21:43:58Z

+1
Tested at: 31cfb66
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-98755c/4585/summary.html
CMSSW: CMSSW_11_1_X_2020-02-10-1100
SCRAM_ARCH: slc7_amd64_gcc820

cmsbuild · 2020-02-10T21:44:01Z

Comparison job queued.

cmsbuild · 2020-02-10T23:41:29Z

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-98755c/4585/summary.html

Comparison Summary:

No significant changes to the logs found
Reco comparison results: 0 differences found in the comparisons
DQMHistoTests: Total files compared: 34
DQMHistoTests: Total histograms compared: 2694005
DQMHistoTests: Total failures: 1
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 2693658
DQMHistoTests: Total skipped: 346
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 33 files compared)
Checked 147 log files, 16 edm output root files, 34 DQM output files

silviodonato · 2020-02-18T14:52:30Z

We need generators' review @alberto-sanchez @agrohsje @efeyazgan @mkirsano @qliphy @SiewYan

agrohsje · 2020-02-18T20:15:58Z

+1
@Dr15Jones : Do you plan to work on the decays as mentioned above? (Following option 2?)
Would you mind giving an extended talk in GEN, starting with a more general introduction and then discussing your modifications for parallelizing?

cmsbuild · 2020-02-18T20:16:25Z

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @davidlange6, @silviodonato, @fabiocos (and backports should be raised in the release meeting by the corresponding L2)

Dr15Jones · 2020-02-18T20:48:06Z

> Do you plan to work on the decays as mentioned above? (Following option 2?)

I am willing to if that is seen as useful. The question I have is which configurations of Pythia8HadronizerFilter are actually used for production?

> Would you mind an extended talk in GEN starting with a more general introduction and then discussing your modifications for parallelizing?

That would be fine as well.

silviodonato · 2020-02-19T08:37:40Z

+1

jordan-martins · 2020-03-12T23:23:16Z

Hi @Dr15Jones, BPH has some low-filter-efficiency requests for which any gain in time/evt would be great. I share one cfg of interest [1]. Currently, this request is producing ~35 events/lumisection in an 8 hr condor job. The cfg uses Pythia8 initially, but then EvtGen (not multi-process capable) takes over to decay some particular particles. Another thing to note is that we do not use the Pythia8HadronizerFilter but rather the Pythia8GeneratorFilter.

Do you think we could have some gain here as well?

Many Thanks in advance,
Jordan.
@alberto-sanchez @qliphy @agrohsje

[1]
/afs/cern.ch/work/j/jordanm/public/any/CMSSW_10_2_20/src/BPH-RunIIFall18GS-00219_1_cfg.py

Dr15Jones · 2020-03-13T16:16:52Z

The problem is that EvtGen is not thread safe (as you pointed out) and therefore not amenable to the code I originally wrote. To really handle this case, I would have to write the code I had originally thought would be necessary, which uses FWCore/SharedMemory to run all the generator code in a different process. Doing so would probably take about 2-4 weeks of work by me.

Dr15Jones · 2020-04-11T16:47:56Z

@jordan-martins wrote

> BPH has some low-filter-efficiency requests for which any gain in time/evt would be great. I share one cfg of interest. Currently, this request is producing ~35 events/lumisection in an 8 hr condor job. The cfg uses Pythia8 initially, but then EvtGen (not multi-process capable) takes over to decay some particular particles. Another thing to note is that we do not use the Pythia8HadronizerFilter but rather the Pythia8GeneratorFilter.

So I tried out the code in pull request #29445 using the configuration to which you pointed. My first observation is that the configuration did not seem all that slow: it could do 2.98472 ev/s using 4 threads (although its CPU efficiency was very low). Using #29445 and 4 threads I got a 2.3x speedup (6.93303 ev/s) and very good CPU efficiency (based on watching top). I was stuck at 4 threads since the VM I am using only has 4 threads. This should scale well to 8 threads as well.
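
For reference, the quoted speedup is simply the ratio of the two throughputs. The short Python check below uses only the two rates given in the comment; the function and constant names are chosen here for illustration.

```python
def speedup(baseline_rate, new_rate):
    """Ratio of event throughputs (events per second)."""
    if baseline_rate <= 0:
        raise ValueError("baseline rate must be positive")
    return new_rate / baseline_rate


# Rates quoted in the comment, both measured with 4 threads.
BASELINE_EV_PER_S = 2.98472  # original configuration
NEW_EV_PER_S = 6.93303       # same configuration using #29445

ratio = speedup(BASELINE_EV_PER_S, NEW_EV_PER_S)
print(f"speedup: {ratio:.1f}x")  # prints "speedup: 2.3x"
```

Since both rates were measured with 4 threads, the ratio reflects the improved use of those threads rather than scaling with thread count.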

Dr15Jones and others added 6 commits February 10, 2020 17:21

Removed unused decayRandomEngine global variable

1d645e1

Added ConcurrentHadronizerFilter

8ef0ca7

This global module replicates the Hadronizer for each stream in order to run them concurrently. This only works for thread-friendly hadronizers, decayers and filters.

Added ConcurrentExternalDecayDriver

6a940c1

This is meant to give access to thread-friendly decayers. At the moment none exist, so the object will throw an exception if used.

Added declaration of Pythia8ConcurrentHadronizerFilter

2971162

This makes use of the ConcurrentHadronizerFilter.

code format

c408913

Use equivalent random engine for all stream begin lumis

d49909b

Use identical clones of the random engine in order to set up the hadronizer and decayer on each LuminosityBlock boundary.

cmsbuild added this to the CMSSW_11_1_X milestone Feb 10, 2020

cmsbuild added code-checks-pending comparison-pending generators-pending orp-pending pending-signatures tests-pending labels Feb 10, 2020

cmsbuild added code-checks-approved and removed code-checks-pending labels Feb 10, 2020

cmsbuild added tests-started and removed tests-pending labels Feb 10, 2020

cmsbuild added comparison-notrun tests-rejected and removed comparison-pending tests-started labels Feb 10, 2020

clang syntax fix

31cfb66

cmsbuild added tests-started and removed tests-pending labels Feb 10, 2020

cmsbuild added tests-approved and removed tests-started labels Feb 10, 2020

cmsbuild added comparison-available and removed comparison-pending labels Feb 10, 2020

cmsbuild added fully-signed generators-approved and removed generators-pending pending-signatures labels Feb 18, 2020

cmsbuild added orp-approved and removed orp-pending labels Feb 19, 2020

cmsbuild merged commit cd9d8d7 into cms-sw:master Feb 19, 2020

Dr15Jones deleted the concurrentPythiaHadronizer branch February 24, 2020 20:12

makortel mentioned this pull request Apr 17, 2020

Access edm::event from GeneratorInterface #28636

Closed

qliphy mentioned this pull request May 29, 2020

multi-thread support for MadGraph LO process (106X) #30030

Merged

colizz mentioned this pull request May 29, 2020

ConcurrentHadronizerFilter (106X) #30053

Merged
