Allow specifying split level on a per branch basis #19408

Merged
1 commit merged into cms-sw:master from the perBranchSettingOfSplitLevel branch on Jun 23, 2017

Conversation

Dr15Jones
Contributor

The new parameter overrideBranchesSplitLevel can be used to specify branch by branch what split level to use. This only works for data products made in this job, not products read from the input.

@cmsbuild
Contributor

A new Pull Request was created by @Dr15Jones (Chris Jones) for master.

It involves the following packages:

IOPool/Output

@cmsbuild, @smuzaffar, @Dr15Jones, @davidlange6 can you please review it and eventually sign? Thanks.
@wddgit this is something you requested to watch as well.
@davidlange6 you are the release manager for this.

cms-bot commands are listed here

@Dr15Jones
Contributor Author

please test

@Dr15Jones
Contributor Author

+1

@cmsbuild
Contributor

cmsbuild commented Jun 22, 2017

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/20853/console Started: 2017/06/22 18:37

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs after it passes the integration tests. This pull request requires discussion in the ORP meeting before it's merged. @davidlange6, @smuzaffar

@Dr15Jones
Contributor Author

This feature was requested in #19205

@Dr15Jones
Contributor Author

Even though this feature is relatively minor, I am dubious that the loss in development time for other projects plus the extra maintenance burden incurred by this code change is worth the very small benefit this achieves for #19205. I hope I'm wrong.

@cmsbuild
Contributor

Comparison job queued.

@cmsbuild
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-19408/20853/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 22
  • DQMHistoTests: Total histograms compared: 1802848
  • DQMHistoTests: Total failures: 15030
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 1787652
  • DQMHistoTests: Total skipped: 166
  • DQMHistoTests: Total Missing objects: 0
  • Checked 90 log files, 14 edm output root files, 22 DQM output files

@davidlange6
Contributor

+1

@cmsbuild merged commit 0ec3143 into cms-sw:master on Jun 23, 2017
@Dr15Jones deleted the perBranchSettingOfSplitLevel branch on June 27, 2017 13:43
@arizzi
Contributor

arizzi commented Jul 3, 2017

@gpetruc

For the record, thanks to this PR we can selectively [1] choose, branch by branch, between full split and no split according to which compresses better.
A little optimization led to the following event sizes:
http://arizzi.web.cern.ch/arizzi/miniaod/default-split.html 47.1 Kb/ev
http://arizzi.web.cern.ch/arizzi/miniaod/split-nosplit.html 44.5 Kb/ev

The saving is quite large (and I expect it to be even larger on typical production samples, which are merged from smaller chunks), so unless someone has objections we could deploy this [1] in production.
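
The relative saving implied by the two per-event sizes quoted above can be checked with a short calculation. This is only an arithmetic sketch: the variable names are illustrative, and the numbers are taken as given in the comment, in the units stated there.

```python
# Per-event sizes quoted in the comment above (units as given there).
default_split = 47.1  # default split level for every branch
mixed_split = 44.5    # selected branches overridden to splitLevel=0

saved = default_split - mixed_split
saved_percent = 100.0 * saved / default_split

print(f"saved {saved:.1f} per event ({saved_percent:.1f}% of the default size)")
# prints: saved 2.6 per event (5.5% of the default size)
```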

[1]

process.MINIAODSIMoutput.overrideBranchesSplitLevel = cms.untracked.VPSet( [
cms.untracked.PSet(branch = cms.untracked.string("patMuons_slimmedMuons__*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("patElectrons_slimmedElectrons__*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("patTaus_slimmedTaus__*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("patPhotons_slimmedPhotons__*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("patTaus_slimmedTausBoosted__*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("patCompositeCandidates_oniaPhotonCandidates_conversions_*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("recoSuperClusters_reducedEgamma_reducedSuperClusters_*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("recoConversions_reducedEgamma_reducedConversions_*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("patPackedCandidates_lostTracks__*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("patMETs_slimmedMETs__*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("patMETs_slimmedMETsPuppi__*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("patMETs_slimmedMETsNoHF__*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("recoVertexCompositePtrCandidates_slimmedKshortVertices__*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("patJets_slimmedJetsAK8__*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("HcalNoiseSummary_hcalnoise__*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("patJets_slimmedJetsAK8PFPuppiSoftDropPacked_SubJets_*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("patIsolatedTracks_isolatedTracks__*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("GenEventInfoProduct_generator__*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("l1tEGammaBXVector_caloStage2Digis_EGamma_*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("l1tEtSumBXVector_caloStage2Digis_EtSum_*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("recoGenJets_slimmedGenJetsAK8__*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("recoVertexCompositePtrCandidates_slimmedLambdaVertices__*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("GlobalAlgBlkBXVector_gtStage2Digis__*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("l1tMuonBXVector_gmtStage2Digis_Muon_*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("patPackedCandidates_lostTracks_eleTracks_*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("recoConversions_reducedEgamma_reducedSingleLegConversions_*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("recoGsfElectronCores_reducedEgamma_reducedGedGsfElectronCores_*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("recoPhotonCores_reducedEgamma_reducedGedPhotonCores_*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("recoCSCHaloData_CSCHaloData__*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("recoBeamHaloSummary_BeamHaloSummary__*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("GlobalExtBlkBXVector_gtStage2Digis__*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("recoBeamSpot_offlineBeamSpot__*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("l1extraL1EtMissParticles_l1extraParticles_MET_*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("l1extraL1EtMissParticles_l1extraParticles_MHT_*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("l1extraL1HFRingss_l1extraParticles__*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("l1extraL1EmParticles_l1extraParticles_NonIsolated_*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("l1extraL1JetParticles_l1extraParticles_IsoTau_*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("l1extraL1JetParticles_l1extraParticles_Forward_*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("l1extraL1JetParticles_l1extraParticles_Central_*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("l1extraL1EmParticles_l1extraParticles_Isolated_*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("l1extraL1MuonParticles_l1extraParticles__*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("l1extraL1JetParticles_l1extraParticles_Tau_*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("LumiScalerss_scalersRawToDigi__*"),splitLevel=cms.untracked.int32(0)),
cms.untracked.PSet(branch = cms.untracked.string("patPhotons_slimmedOOTPhotons__*"),splitLevel=cms.untracked.int32(0)),

        ]
    )
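
The list in [1] repeats the same PSet for every branch, so it could be generated from a plain list of branch patterns instead. The sketch below shows one way to do that. `make_split_overrides` and `UNSPLIT_BRANCHES` are hypothetical names, not part of CMSSW, and the `cms` module is passed in as an argument only so the helper can be exercised outside a CMSSW environment. It assumes `cms.untracked.VPSet` accepts its entries as positional arguments.

```python
def make_split_overrides(cms, branch_patterns, split_level=0):
    """Build the value for overrideBranchesSplitLevel.

    cms             -- the FWCore.ParameterSet.Config module (passed in explicitly)
    branch_patterns -- branch name patterns, as used in the list above
    split_level     -- split level applied to every listed branch
    """
    return cms.untracked.VPSet(*[
        cms.untracked.PSet(
            branch=cms.untracked.string(pattern),
            splitLevel=cms.untracked.int32(split_level),
        )
        for pattern in branch_patterns
    ])


# A few of the patterns from the list above; extend as needed.
UNSPLIT_BRANCHES = [
    "patMuons_slimmedMuons__*",
    "patElectrons_slimmedElectrons__*",
    "patTaus_slimmedTaus__*",
]

# Inside a cmsRun configuration (assumes process.MINIAODSIMoutput exists):
#   import FWCore.ParameterSet.Config as cms
#   process.MINIAODSIMoutput.overrideBranchesSplitLevel = make_split_overrides(
#       cms, UNSPLIT_BRANCHES)
```

Keeping the patterns in a single list makes it easier to review which branches are left unsplit and to change the split level for all of them at once.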

@slava77
Contributor

slava77 commented Jul 3, 2017 via email

@arizzi
Contributor

arizzi commented Jul 3, 2017

@slava77 yes, this was just a top-level replace for testing.

In fact, I think the functions in miniAOD_tools.py are no longer called to customize the output modules (unless I made a mistake; I'll cross-check tomorrow). If that's the case, we should remove them (and then how can we be sure the right settings are used?)

@arizzi
Contributor

arizzi commented Jul 10, 2017

@davidlange6 @slava77 is there a reason why
https://github.com/cms-sw/cmssw/blob/master/Configuration/Applications/python/ConfigBuilder.py#L644-L647
couldn't simply call
https://github.com/cms-sw/cmssw/blob/master/PhysicsTools/PatAlgos/python/slimming/miniAOD_tools.py#L368

ConfigBuilder is a beast of >2k lines, so off-loading some configuration steps would be a better option, wouldn't it?

@davidlange6
Contributor

davidlange6 commented Jul 10, 2017 via email

@slava77
Contributor

slava77 commented Jul 10, 2017 via email

@arizzi
Contributor

arizzi commented Jul 10, 2017 via email

@slava77
Contributor

slava77 commented Jul 10, 2017 via email

@Dr15Jones
Contributor Author

@slava77 since the change to the Task system in CMSSW_9_2, calling before or after convertToUnscheduled is no longer a problem. You can even call convertToUnscheduled multiple times now.
