Use lzma/l4 for raw data #28109

Merged
merged 1 commit into from
Oct 9, 2019
Conversation

davidlange6

PR description:

Proposal to use LZMA level 4 to compress the RAW data tier. Tests on a double-muon RAW file from run 324970 (containing the highest-luminosity part of the run) show a 15% reduction in file size, at the cost of an extra 0.25 seconds/event when writing (repacking is noticeably slower) and an extra 0.05 seconds/event when reading (i.e., about 5x more read-back overhead).

It will also be interesting to test Zstandard once it is part of ROOT (6.20 or 6.22, it seems), but LZMA appears to be a much better trade-off between storage and write/read resources given our usual usage of RAW data.

Searches of GitHub suggest we have only rediscovered this change as a way to reduce RAW data size at minimal CPU cost, so perhaps there is a good reason not to do it.
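
For readers who want to get a feel for the size versus CPU trade-off described above, here is a minimal sketch using only the Python standard library. It is not the CMSSW/ROOT measurement from this PR: the `compare_codecs` helper, the zlib level 6 baseline, and the synthetic payload are all assumptions made for illustration, and `preset=4` in Python's `lzma` module is only an analogue of ROOT's LZMA level 4. Real RAW data will behave differently.

```python
import lzma
import time
import zlib


def compare_codecs(payload: bytes) -> dict:
    """Compress and decompress payload with zlib and LZMA.

    Returns, per codec, the compressed size in bytes and the wall-clock
    time spent compressing ("write_s") and decompressing ("read_s").
    Hypothetical helper, not part of CMSSW or ROOT.
    """
    codecs = {
        # Baseline chosen for illustration only (assumption).
        "zlib-6": (lambda b: zlib.compress(b, 6), zlib.decompress),
        # LZMA with preset 4, as a stand-in for "LZMA level 4".
        "lzma-4": (lambda b: lzma.compress(b, preset=4), lzma.decompress),
    }
    results = {}
    for name, (compress, decompress) in codecs.items():
        t0 = time.perf_counter()
        packed = compress(payload)
        t1 = time.perf_counter()
        restored = decompress(packed)
        t2 = time.perf_counter()
        if restored != payload:
            raise ValueError(f"{name}: round trip did not reproduce the input")
        results[name] = {
            "size": len(packed),
            "write_s": t1 - t0,
            "read_s": t2 - t1,
        }
    return results


if __name__ == "__main__":
    # Synthetic, highly repetitive payload (assumption, not detector data).
    sample = bytes(range(256)) * 2048
    for codec, stats in compare_codecs(sample).items():
        print(codec, stats)
```

Running the same helper on a representative chunk of real data, rather than the synthetic sample, would be the way to check whether the trade-off reported here holds for a given workload.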

cmsbuild commented Oct 2, 2019

The code-checks are being triggered in jenkins.

cmsbuild commented Oct 2, 2019

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-28109/12116

  • This PR adds an extra 16KB to repository

cmsbuild commented Oct 2, 2019

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-run-pr-tests/2762/console Started: 2019/10/02 20:32

cmsbuild commented Oct 2, 2019

A new Pull Request was created by @davidlange6 (David Lange) for master.

It involves the following packages:

Configuration/EventContent

@cmsbuild, @franzoni, @fabiocos, @kpedro88, @davidlange6 can you please review it and eventually sign? Thanks.
@Martin-Grunewald this is something you requested to watch as well.
@davidlange6, @slava77, @fabiocos you are the release manager for this.

cms-bot commands are listed here

cmsbuild commented Oct 2, 2019

Comparison job queued.

cmsbuild commented Oct 2, 2019

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-8f9ffa/2762/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 34
  • DQMHistoTests: Total histograms compared: 2956833
  • DQMHistoTests: Total failures: 1
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2956491
  • DQMHistoTests: Total skipped: 341
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 33 files compared)
  • Checked 147 log files, 16 edm output root files, 34 DQM output files

fabiocos commented Oct 9, 2019

+operations

The information about the compression algorithm used for the data tier is untracked, but it is effectively stored in the TFile through https://cmssdt.cern.ch/lxr/source/IOPool/Output/src/RootOutputFile.cc#0120
which activates
https://root.cern.ch/doc/v608/src_2TFile_8cxx_source.html#l02136
with the status word defined in the description record
https://root.cern.ch/doc/v608/src_2TFile_8cxx_source.html#l00063
So, as I understand it, the update is transparent to the input system.
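
As a rough illustration of why a reader does not need to be told which algorithm was used: ROOT packs the algorithm and the level into a single integer compression setting (100 * algorithm + level), so a file written with LZMA level 4 carries the value 204. The sketch below mirrors that convention in plain Python; the function names `encode_setting` and `decode_setting` are hypothetical, and this is not the CMSSW or ROOT code linked above.

```python
# Algorithm codes as defined by ROOT's compression algorithm enum.
ROOT_ALGORITHMS = {
    0: "default",  # use the global/inherited setting
    1: "ZLIB",
    2: "LZMA",
    3: "old",      # legacy ROOT algorithm
    4: "LZ4",
    5: "ZSTD",
}


def encode_setting(algorithm: int, level: int) -> int:
    """Pack algorithm and level into one ROOT-style compression setting."""
    if algorithm not in ROOT_ALGORITHMS:
        raise ValueError(f"unknown algorithm code {algorithm}")
    if not 0 <= level <= 99:
        raise ValueError("level must be between 0 and 99")
    return algorithm * 100 + level


def decode_setting(setting: int) -> tuple:
    """Split a ROOT-style compression setting into (algorithm name, level)."""
    algorithm, level = divmod(setting, 100)
    return ROOT_ALGORITHMS.get(algorithm, "unknown"), level


if __name__ == "__main__":
    setting = encode_setting(2, 4)  # LZMA, level 4
    print(setting, decode_setting(setting))
```

Because the setting travels with the file, switching the writer's algorithm does not require a matching change on the reading side, which is consistent with the observation above.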

cmsbuild commented Oct 9, 2019

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @davidlange6, @slava77, @smuzaffar, @fabiocos (and backports should be raised in the release meeting by the corresponding L2)

fabiocos commented Oct 9, 2019

+1

cmsbuild merged commit 12f2f04 into cms-sw:master on Oct 9, 2019