Allow specifying split level on a per branch basis #19408
The new parameter overrideBranchesSplitLevel can be used to specify branch by branch what split level to use. This only works for data products made in this job, not products read from the input.
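To illustrate the idea of a per-branch override, here is a minimal pure-Python sketch of how a branch name could be resolved against a list of patterns. It is not the PoolOutputModule implementation: the shell-style wildcard matching, the first-match-wins rule, the helper name `resolve_split_level` and the default value of 99 are all assumptions made for the example.

```python
from fnmatch import fnmatchcase

# Assumed split level for branches without an override (illustrative value only).
DEFAULT_SPLIT_LEVEL = 99


def resolve_split_level(branch_name, overrides, default=DEFAULT_SPLIT_LEVEL):
    """Return the split level for branch_name.

    overrides is a list of (pattern, split_level) pairs; the first pattern
    that matches wins (an assumption of this sketch, not a documented rule).
    """
    for pattern, split_level in overrides:
        if fnmatchcase(branch_name, pattern):
            return split_level
    return default


# Hypothetical overrides, using branch patterns of the kind quoted in this thread.
overrides = [
    ("patMuons_slimmedMuons__*", 0),
    ("patMETs_slimmedMETs__*", 0),
]

print(resolve_split_level("patMuons_slimmedMuons__PAT", overrides))
print(resolve_split_level("patJets_slimmedJets__PAT", overrides))
```

A branch that matches no pattern simply keeps the module-wide setting, which is the behaviour the parameter name suggests.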
A new Pull Request was created by @Dr15Jones (Chris Jones) for master. It involves the following packages: IOPool/Output @cmsbuild, @smuzaffar, @Dr15Jones, @davidlange6 can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
please test |
+1 |
The tests are being triggered in jenkins. |
This pull request is fully signed and it will be integrated in one of the next master IBs after it passes the integration tests. This pull request requires discussion in the ORP meeting before it's merged. @davidlange6, @smuzaffar |
This feature was requested from #19205 |
Even though this feature is relatively minor, I am dubious that the loss in development time for other projects plus the extra maintenance burden incurred by this code change is worth the very small benefit this achieves for #19205. I hope I'm wrong. |
+1 The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic: |
Comparison job queued. |
Comparison is ready Comparison Summary:
|
+1 |
For the record, thanks to this PR we could selectively [1] decide which branches to split based on how they compress best (full split vs no split). The saving is quite large (and I expect it to be even larger on the typical production samples that are merged from smaller chunks), so unless someone has objections we could deploy this [1] in production.

[1]
process.MINIAODSIMoutput.overrideBranchesSplitLevel = cms.untracked.VPSet( [
    cms.untracked.PSet(branch = cms.untracked.string("patMuons_slimmedMuons__*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("patElectrons_slimmedElectrons__*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("patTaus_slimmedTaus__*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("patPhotons_slimmedPhotons__*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("patTaus_slimmedTausBoosted__*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("patCompositeCandidates_oniaPhotonCandidates_conversions_*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("recoSuperClusters_reducedEgamma_reducedSuperClusters_*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("recoConversions_reducedEgamma_reducedConversions_*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("patPackedCandidates_lostTracks__*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("patMETs_slimmedMETs__*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("patMETs_slimmedMETsPuppi__*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("patMETs_slimmedMETsNoHF__*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("recoVertexCompositePtrCandidates_slimmedKshortVertices__*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("patJets_slimmedJetsAK8__*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("HcalNoiseSummary_hcalnoise__*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("patJets_slimmedJetsAK8PFPuppiSoftDropPacked_SubJets_*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("patIsolatedTracks_isolatedTracks__*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("GenEventInfoProduct_generator__*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("l1tEGammaBXVector_caloStage2Digis_EGamma_*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("l1tEtSumBXVector_caloStage2Digis_EtSum_*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("recoGenJets_slimmedGenJetsAK8__*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("recoVertexCompositePtrCandidates_slimmedLambdaVertices__*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("GlobalAlgBlkBXVector_gtStage2Digis__*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("l1tMuonBXVector_gmtStage2Digis_Muon_*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("patPackedCandidates_lostTracks_eleTracks_*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("recoConversions_reducedEgamma_reducedSingleLegConversions_*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("recoGsfElectronCores_reducedEgamma_reducedGedGsfElectronCores_*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("recoPhotonCores_reducedEgamma_reducedGedPhotonCores_*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("recoCSCHaloData_CSCHaloData__*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("recoBeamHaloSummary_BeamHaloSummary__*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("GlobalExtBlkBXVector_gtStage2Digis__*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("recoBeamSpot_offlineBeamSpot__*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("l1extraL1EtMissParticles_l1extraParticles_MET_*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("l1extraL1EtMissParticles_l1extraParticles_MHT_*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("l1extraL1HFRingss_l1extraParticles__*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("l1extraL1EmParticles_l1extraParticles_NonIsolated_*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("l1extraL1JetParticles_l1extraParticles_IsoTau_*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("l1extraL1JetParticles_l1extraParticles_Forward_*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("l1extraL1JetParticles_l1extraParticles_Central_*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("l1extraL1EmParticles_l1extraParticles_Isolated_*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("l1extraL1MuonParticles_l1extraParticles__*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("l1extraL1JetParticles_l1extraParticles_Tau_*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("LumiScalerss_scalersRawToDigi__*"),splitLevel=cms.untracked.int32(0)),
    cms.untracked.PSet(branch = cms.untracked.string("patPhotons_slimmedOOTPhotons__*"),splitLevel=cms.untracked.int32(0)),
] ) |
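Since every entry in [1] has the same shape, the list could plausibly be generated from the branch patterns alone. The sketch below is pure Python so it runs outside CMSSW; `make_overrides` and `render_pset` are hypothetical helper names, and the rendered text only mirrors the per-entry syntax quoted above. In a real configuration one would construct the `cms.untracked.PSet` objects directly instead of strings.

```python
def make_overrides(branch_patterns, split_level=0):
    """Pair each branch pattern with one split level, skipping repeated patterns."""
    seen = set()
    entries = []
    for pattern in branch_patterns:
        if pattern in seen:
            continue
        seen.add(pattern)
        entries.append({"branch": pattern, "splitLevel": split_level})
    return entries


def render_pset(entry):
    """Render one entry as config text in the same form as the entries in [1]."""
    return (
        'cms.untracked.PSet(branch = cms.untracked.string("{branch}"),'
        "splitLevel=cms.untracked.int32({splitLevel}))"
    ).format(**entry)


# A short excerpt of the patterns from the list above.
patterns = [
    "patMuons_slimmedMuons__*",
    "patElectrons_slimmedElectrons__*",
    "patMuons_slimmedMuons__*",  # repeated on purpose; it is dropped
]

entries = make_overrides(patterns)
for entry in entries:
    print(render_pset(entry))
```

Keeping the patterns as a plain list makes it easier to see which branches are affected and to change the split level for all of them at once.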
On 7/3/17 9:15 AM, arizzi wrote:
> process.MINIAODSIMoutput.overrideBranchesSplitLevel = cms.untracked.VPSet( [
the name of the output module is not always known.
The change should be made in some other way.
the currently used variety I can think of is
MINIAODSIMoutput
MINIAODoutput
write_MINIAOD
the settings should probably come with a hook in the ConfigBuilder
and the overrides probably set in the EventContent
|
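One way to avoid hard-coding a single output module name is to try each of the names listed in the comment above and set the parameter on whichever modules exist. This is only a sketch of that lookup, not the ConfigBuilder hook suggested there: `apply_split_overrides` is a hypothetical helper, and `SimpleNamespace` stands in for a real `cms.Process` and its output modules so that the example runs without CMSSW.

```python
from types import SimpleNamespace

# Output module names mentioned in the comment above; there may be others.
MINIAOD_OUTPUT_NAMES = ("MINIAODSIMoutput", "MINIAODoutput", "write_MINIAOD")


def apply_split_overrides(process, overrides, module_names=MINIAOD_OUTPUT_NAMES):
    """Set overrideBranchesSplitLevel on every listed module found on process.

    Returns the names of the modules that were modified.
    """
    touched = []
    for name in module_names:
        module = getattr(process, name, None)
        if module is None:
            continue  # this job does not define that output module
        module.overrideBranchesSplitLevel = overrides
        touched.append(name)
    return touched


# Stand-in process with a single output module (a real job would use cms.Process).
process = SimpleNamespace(MINIAODoutput=SimpleNamespace())
touched = apply_split_overrides(process, [("patMuons_slimmedMuons__*", 0)])
print(touched)
```

With a real process, `overrides` would be the `cms.untracked.VPSet` shown earlier rather than a list of tuples.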
@slava77 yes, this was just a top-level replace to test. In fact I think the functions in miniAOD_tools.py are no longer called to customize the output modules (unless I made a mistake, I'll cross-check tomorrow)... if that's the case we should remove them (and in that case how can we be sure the right settings are used?) |
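@davidlange6 @slava77 is there a reason why https://github.com/cms-sw/cmssw/blob/master/Configuration/Applications/python/ConfigBuilder.py#L644-L647 couldn't simply call https://github.com/cms-sw/cmssw/blob/master/PhysicsTools/PatAlgos/python/slimming/miniAOD_tools.py#L368 ? ConfigBuilder is a beast of >2k lines so off-loading some configuration steps would be a better option... isn't it? |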
no reason that I see against this - both existing implementations were done by the same person around the same time..
On Jul 10, 2017, at 9:19 AM, arizzi wrote:
> @davidlange6 @slava77 is there a reason why
> https://github.com/cms-sw/cmssw/blob/master/Configuration/Applications/python/ConfigBuilder.py#L644-L647
> couldn't simply call
> https://github.com/cms-sw/cmssw/blob/master/PhysicsTools/PatAlgos/python/slimming/miniAOD_tools.py#L368
> ConfigBuilder is a beast of >2k lines so off-loading some configuration steps would be a better option... isn't it?
|
On 7/10/17 9:19 AM, arizzi wrote:
> @davidlange6 @slava77 is there a reason why
> https://github.com/cms-sw/cmssw/blob/master/Configuration/Applications/python/ConfigBuilder.py#L644-L647
If I'm not mistaken, this part is called before convertToUnscheduled()
> couldn't simply call
> https://github.com/cms-sw/cmssw/blob/master/PhysicsTools/PatAlgos/python/slimming/miniAOD_tools.py#L368
this has to be called after convertToUnscheduled
> ConfigBuilder is a beast of >2k lines so off-loading some configuration
> steps would be a better option... isn't it?
|
why? I just want to set one more outputmodule option, if L644-L647 are the
lines where outmodule options are set, this is where I should replace with
my function... how does convertToUnscheduled matter?
On Mon, Jul 10, 2017 at 3:36 PM, Slava Krutelyov wrote:
On 7/10/17 9:19 AM, arizzi wrote:
> @davidlange6 @slava77 is there a reason why
> https://github.com/cms-sw/cmssw/blob/master/Configuration/Applications/python/ConfigBuilder.py#L644-L647
If I'm not mistaken, this part is called before convertToUnscheduled()
> couldn't simply call
> https://github.com/cms-sw/cmssw/blob/master/PhysicsTools/PatAlgos/python/slimming/miniAOD_tools.py#L368
this has to be called after convertToUnscheduled
> ConfigBuilder is a beast of >2k lines so off-loading some configuration
> steps would be a better option... isn't it?
|
On 7/10/17 9:39 AM, arizzi wrote:
> why? I just want to set one more outputmodule option, if L644-L647 are the
> lines where outmodule options are set, this is where I should replace with
> my function... how does convertToUnscheduled matter?
sorry, you are right,
it can be together with the output def.
|
@slava77 with the change to using the Task system in CMSSW_9_2, it is no longer a problem calling before or after |