Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate convertToUnscheduled() customize function and --runUnscheduled argument of cmsDriver #29630

Merged
merged 13 commits into from
Jun 1, 2020

Conversation

makortel
Copy link
Contributor

@makortel makortel commented May 4, 2020

PR description:

This PR proposes to deprecate the convertToUnscheduled() and cleanUnscheduled() customize functions that were first introduced in 2012 as a "temporary aid" to help in moving to "unscheduled execution", and the --runUnscheduled command line argument to cmsDriver.py. In both cases the function/argument are left there, but they only print a deprecation message (to minimize breaking

The current behavior is to take all EDProducers and ignored EDFilters in Paths (either directly or via Sequences) and move those modules into Tasks associated to the same Paths. This transformation was useful to expose the module-level intra-event concurrency to the framework in the maximal way.

Since then, the framework has evolved, and with #29024 all modules (except EDFilters) in Paths are now run concurrently (still according to the filtering logic). Therefore the convertToUnscheduled() is no longer strictly needed to maximize the intra-event concurrency between modules. In addition, #29553 makes the distinction between EDProducers in Tasks and in Paths more important:

  • currently EDProducers in Tasks whose Event products are not consumed are not run for Events (unless they use Accumulator extension), they are, however, run for lumi, run, and job transitions
  • with the PR EDProducers in Tasks whose any products are not consumed are deleted before beginJob transition

Therefore I'm starting to feel uneasy to move EDProducers from Paths to Tasks behind people's back (ok, in principle the use of convertToUnscheduled() can be argued to be a "deliberate choice"), and am proposing to remove that functionality.

It may be that some more modules get run now in the RECO step (e.g. DQM, VALIDATION, ALCA are still mostly using Sequences, whereas e.g. RAW2DIGI and RECO use Tasks).

Moving producers from Paths/Sequences to Tasks is recommended to be continued (a bit less overhead, and unnecessary producers do not get run).

I had to modify VALIDATION sequences in two places to avoid circular dependencies (modules in Paths/Sequences still need to be declared in the order of their data dependencies). I decided to just move modules to Tasks to make it explicit that their execution order does not matter. It may be that further testing (e.g. full matrix) reveals more such cases.

Resolves #20363. Resolves #20203.

PR validation:

Limited matrix and (relevant) unit tests run.

@cmsbuild
Copy link
Contributor

cmsbuild commented May 4, 2020

The code-checks are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

cmsbuild commented May 4, 2020

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-29630/15008

@cmsbuild
Copy link
Contributor

cmsbuild commented May 4, 2020

A new Pull Request was created by @makortel (Matti Kortelainen) for master.

It involves the following packages:

Configuration/Applications
Configuration/DataProcessing
Configuration/PyReleaseValidation
Configuration/Skimming
Configuration/StandardSequences
DQM/RPCMonitorDigi
DQMOffline/Configuration
DQMServices/Demo
FWCore/ParameterSet
FastSimulation/Validation
HLTrigger/Configuration
RecoMuon/MuonIdentification
RecoPixelVertexing/PixelLowPtUtilities
SLHCUpgradeSimulations/Geometry
TauAnalysis/MCEmbeddingTools
Validation/Configuration
Validation/Geometry
Validation/RecoEgamma
Validation/RecoMuon
Validation/RecoParticleFlow
Validation/RecoTrack
Validation/TrackerConfiguration

@andrius-k, @lveldere, @chayanit, @wajidalikhan, @sbein, @schneiml, @ianna, @kpedro88, @Martin-Grunewald, @fioriNTU, @santocch, @perrotta, @civanch, @makortel, @cmsbuild, @fwyzard, @davidlange6, @smuzaffar, @Dr15Jones, @cvuosalo, @mdhildreth, @jfernan2, @slava77, @fabiocos, @ssekmen, @kmaeshima, @pgunnell, @silviodonato, @franzoni can you please review it and eventually sign? Thanks.
@felicepantaleo, @abbiendi, @echapon, @wddgit, @rishabhCMS, @threus, @slomeo, @acimmino, @vargasa, @jhgoh, @dgulhan, @apsallid, @HuguesBrun, @cericeci, @trocino, @rociovilar, @barvic, @GiacomoSguazzoni, @rovere, @VinInn, @bellan, @wmtford, @ebrondol, @mtosi, @fabiocos, @rbartek, @Martin-Grunewald, @Fedespring, @calderona, @hatakeyamak, @lecriste, @matt-komm, @folguera this is something you requested to watch as well.
@silviodonato, @dpiparo you are the release manager for this.

cms-bot commands are listed here

@makortel
Copy link
Contributor Author

makortel commented May 4, 2020

@cmsbuild, please test

@cmsbuild
Copy link
Contributor

cmsbuild commented May 4, 2020

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-run-pr-tests/5981/console Started: 2020/05/04 20:15

# there are customize functions given by users (in our unit
# tests) that need to be run before the "unscheduled customise
# functions"
self.pythonCfgCode += self.addCustomise(1)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried first to remove the customisation_file_unsch altogether, but then found out that (at least) this test
https://github.com/cms-sw/cmssw/blob/master/TauAnalysis/MCEmbeddingTools/test/runtests.sh
depending on customize functions being called before the customize function added with PAT step.

@@ -1645,8 +1645,6 @@ def prepare_PAT(self, sequence = "miniAOD"):
self.prepare_PATFILTER(self)
self.loadDefaultOrSpecifiedCFF(sequence,self.PATDefaultCFF)
self.labelsToAssociate.append('patTask')
if not self._options.runUnscheduled:
raise Exception("MiniAOD production can only run in unscheduled mode, please run cmsDriver with --runUnscheduled")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be curious to learn more details on why MiniAOD requires "unscheduled mode" (to understand if not moving all EDProducers to Tasks could cause any problems).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PAT was moved to unscheduled-only support a long time ago (was it 2013 or so), hence this check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PAT was moved to unscheduled-only support a long time ago (was it 2013 or so), hence this check.

Something along that was my recollection as well. Do I understand correctly that it means that as long as PAT is run unscheduled, all is fine (i.e. PAT does not require anything else in the job to be run unscheduled)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PAT does not require anything else in the job to be run unscheduled)?

yes

@cmsbuild
Copy link
Contributor

cmsbuild commented May 4, 2020

-1

Tested at: 3f47162

CMSSW: CMSSW_11_1_X_2020-05-04-1100
SCRAM_ARCH: slc7_amd64_gcc820
You can see the results of the tests here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-1c0b98/5981/summary.html

I found follow errors while testing this PR

Failed tests: UnitTests

  • Unit Tests:

I found errors in the following unit tests:

---> test TestDQMOnlineClient-dt4ml_dqm_sourceclient had ERRORS

@cmsbuild
Copy link
Contributor

cmsbuild commented May 4, 2020

Comparison job queued.

@silviodonato
Copy link
Contributor

+operations

@silviodonato
Copy link
Contributor

Do you have any comments?
pdmv: @chayanit @pgunnell @wajidalikhan
hlt: @Martin-Grunewald @fwyzard

@Martin-Grunewald
Copy link
Contributor

+1

@chayanit
Copy link

chayanit commented Jun 1, 2020

+1

@cmsbuild
Copy link
Contributor

cmsbuild commented Jun 1, 2020

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @silviodonato, @dpiparo (and backports should be raised in the release meeting by the corresponding L2)

@silviodonato
Copy link
Contributor

+1

@cmsbuild cmsbuild merged commit afe34f6 into cms-sw:master Jun 1, 2020
@makortel makortel deleted the deprecateConvertToUnscheduled branch June 1, 2020 21:21
@Dr15Jones
Copy link
Contributor

This broke the latest IB. It appears that we now have a case of a module in a Path depending on a module in the EndPath. From the log

L1Reco_step after l1extraParticles [path L1Reco_step], l1extraParticles consumes caloStage1LegacyFormatDigis, caloStage1LegacyFormatDigis after caloStage1Digis [path dqmoffline_step]

In theory that could work, except we still have the assumption that EndPaths run after Paths in the dependency checker

dqmProvInfo after @EndPathStart [path dqmoffline_step], @EndPathStart consumes TriggerResults

@makortel
Copy link
Contributor Author

makortel commented Jun 2, 2020

Being followed-up in an issue #30074

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment