Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

forward compression settings if fast cloning is disabled on first file of a merge #36287

Merged
merged 1 commit into from Dec 8, 2021

Conversation

dan131riley
Copy link

PR description:

This PR addresses some issues found wrt memory usage at the tier-0 merge jobs:

https://hypernews.cern.ch/HyperNews/CMS/get/tier0-Ops/2304.html

It was observed that in merge jobs where fast cloning was disabled, the output file was being written with the default (zlib) compression and sub-optimal basket sizes. In this PR, if this is a merge job and fast cloning is disabled for this first file, then compression and basket sizes are forwarded to the output file. Also, the basket size for the EventAuxiliary branch is bumped in order to improve the data layout for sites with large block sizes.

This PR will actually make the merge job memory consumption worse, but will improve the compression and layout of the output files.

PR validation:

Tests are ok, changes are purely technical in the setting of various ROOT IO parameters.

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-36287/26989

  • This PR adds an extra 40KB to repository

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @dan131riley (Dan Riley) for master.

It involves the following packages:

  • Configuration/DataProcessing (operations)
  • IOPool/Output (core)

@perrotta, @smuzaffar, @Dr15Jones, @makortel, @cmsbuild, @qliphy, @fabiocos, @davidlange6 can you please review it and eventually sign? Thanks.
@Martin-Grunewald, @missirol, @fabiocos, @wddgit, @makortel this is something you requested to watch as well.
@perrotta, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@makortel
Copy link
Contributor

@cmsbuild, please test

@@ -74,6 +74,8 @@ def mergeProcess(*inputFiles, **options):
process.add_(Service("InitRootHandlers", EnableIMT = CfgTypes.untracked.bool(False)))
else:
outMod = OutputModule("PoolOutputModule")
outMod.mergeJob = CfgTypes.untracked.bool(True)
outMod.eventAuxiliaryBasketSize = CfgTypes.untracked.int32(2*1024*1024)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the basket size for the EventAuxiliary branch is bumped in order to improve the data layout for sites with large block sizes.

Should we think of increasing this by default too? Or are the benefits more or less specific to outputs of merge jobs?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is mostly specific to the merge jobs, we generally don't have single jobs producing 20GB files, and we don't fast-clone the EventAuxiliary branch, so it should be sufficient to do this just for merges.

desc.addUntracked("mergeJob", false)
->setComment(
"If set to true and fast copying is disabled, copy input file compression and basket sizes to the output "
"file.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking if more descriptive name would be clearer (e.g. along monstrous useCompressionBasketSizeDefaultsFromInput)?

Or do you see the usage more of "now we know the job is a merge job, and can do various optimizations based on that"? (the code hints a bit towards that)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, I think there's more that could be done to optimize merge jobs where fast-cloning isn't possible, so I'd rather make this a hint to the output module, not a specific directive.

@cmsbuild
Copy link
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-251054/20840/summary.html
COMMIT: 8879595
CMSSW: CMSSW_12_2_X_2021-11-29-1100/slc7_amd64_gcc900
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/36287/20840/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 42
  • DQMHistoTests: Total histograms compared: 3247873
  • DQMHistoTests: Total failures: 5
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 3247845
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: -0.004 KiB( 41 files compared)
  • DQMHistoSizes: changed ( 312.0 ): -0.004 KiB MessageLogger/Warnings
  • Checked 177 log files, 37 edm output root files, 42 DQM output files
  • TriggerResults: no differences found

@makortel
Copy link
Contributor

makortel commented Dec 6, 2021

+1

@qliphy
Copy link
Contributor

qliphy commented Dec 8, 2021

+1

@cmsbuild
Copy link
Contributor

cmsbuild commented Dec 8, 2021

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will be automatically merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants