decrease event size by storing PFCandidates with half precision float… #29513

jsalfeld · 2020-04-19T11:48:02Z

… point format

PR description:

As mentioned in a TSG on Jan. 15th:
https://indico.cern.ch/event/878772/contributions/3703563/attachments/1969842/3276766/ScoutingGroup.pdf

The PFScouting event size is reduced by about 40% when the PFCandidate 4-vectors are stored with half-precision floating-point accuracy. I use the libminifloat class in CMSSW to convert each value to float16 and back to float32 (the type returned by float32to16 is uint16_t), and let the compression reduce the amount of information being stored.

PR validation:

I ran with and without the change and compared the output, which looks the same. Checking with edmEventSize confirms that the per-event size of the corresponding collection is indeed reduced:

File outputScoutingPF.root Events 10
Format | Branch Name | Average Uncompressed Size (Bytes/Event) | Average Compressed Size (Bytes/Event)
float16 | ScoutingParticles_hltScoutingPFPacker__HLT2018. | 24530.5 | 7804.4
float32 | ScoutingParticles_hltScoutingPFPacker__HLT2018. | 24530.5 | 12444.5
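
For reference, the compressed sizes quoted above can be turned into a fractional reduction with a few lines of C++. This is only a sketch: the helper name `fractionalReduction` is made up for illustration and is not part of CMSSW or edmEventSize.

```cpp
// Hypothetical helper (not part of CMSSW): fraction by which the compressed
// size shrinks when going from `before` to `after` bytes per event.
inline double fractionalReduction(double before, double after) {
  return 1.0 - after / before;
}

// Compressed sizes from the edmEventSize output above (Bytes/Event).
inline double scoutingParticlesReduction() {
  const double float32Size = 12444.5;
  const double float16Size = 7804.4;
  return fractionalReduction(float32Size, float16Size);
}
```

With the numbers from this 10-event test, the result is a little above 0.37, i.e. a reduction of roughly 37% for this collection.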

if this PR is a backport please specify the original PR and why you need to backport that PR:

Before submitting your pull requests, make sure you followed this checklist:

verify that the PR is really intended for the chosen branch
verify that changes follow CMS Naming, Coding, And Style Rules
verify that the PR passes the basic test procedure suggested in the CMSSW PR instructions


cmsbuild · 2020-04-19T11:48:27Z

The code-checks are being triggered in jenkins.

cmsbuild · 2020-04-19T11:56:54Z

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-29513/14767

This PR adds an extra 16KB to repository

Code check has found code style and quality issues which could be resolved by applying following patch(s)

code-format:
https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-29513/14767/code-format.patch
e.g. curl https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-29513/14767/code-format.patch | patch -p1
You can also run scram build code-format to apply code format directly

cmsbuild · 2020-04-19T12:08:23Z

The code-checks are being triggered in jenkins.

cmsbuild · 2020-04-19T12:14:53Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-29513/14768

This PR adds an extra 16KB to repository

cmsbuild · 2020-04-19T12:16:20Z

A new Pull Request was created by @jsalfeld (Jakob Salfeld-Nebgen) for master.

It involves the following packages:

HLTrigger/JetMET

@cmsbuild, @Martin-Grunewald, @fwyzard can you please review it and eventually sign? Thanks.
@Martin-Grunewald this is something you requested to watch as well.
@silviodonato, @dpiparo you are the release manager for this.

cms-bot commands are listed here

Martin-Grunewald · 2020-04-19T13:22:31Z

please test

cmsbuild · 2020-04-19T13:22:53Z

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-run-pr-tests/5772/console Started: 2020/04/19 15:40

cmsbuild · 2020-04-19T14:50:01Z

+1
Tested at: be726b0
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-788168/5772/summary.html
CMSSW: CMSSW_11_1_X_2020-04-19-0000
SCRAM_ARCH: slc7_amd64_gcc820

cmsbuild · 2020-04-19T14:50:04Z

Comparison job queued.

fwyzard · 2020-04-19T15:55:20Z

HLTrigger/JetMET/plugins/HLTScoutingPFProducer.cc

+          outPFCandidates->emplace_back(MiniFloatConverter::float16to32(MiniFloatConverter::float32to16(cand.pt())),
+                                        MiniFloatConverter::float16to32(MiniFloatConverter::float32to16(cand.eta())),
+                                        MiniFloatConverter::float16to32(MiniFloatConverter::float32to16(cand.phi())),
+                                        MiniFloatConverter::float16to32(MiniFloatConverter::float32to16(cand.mass())),


fwyzard: Is there no way to just truncate the precision, without converting to 16 bits and back to 32 bits?

VinInn: Of course there is: it is called bfloat16 (instead of half float).
https://en.wikipedia.org/wiki/Bfloat16_floating-point_format

BUT it really has low precision.

bfloat16 is mostly used for AI.

On Intel machines (AVX2 and more) one could use intrinsics to perform the conversion to/from half float (worth it only if the data are contiguous, as in an SoA).

fwyzard: Mhm, no, that doesn't seem like a good candidate here.

I just meant to ask if there is a way to do MiniFloatConverter::float16to32(MiniFloatConverter::float32to16(x)) in a single call.

VinInn: Under the assumption that the first call did not over/underflow, it is trivial, yes
(well, one must properly round, not truncate; still pretty trivial).

But at this point I do not understand what is going on: I was thinking the data were stored in hf16 (aka int16), not in truncated float32
(ok, I missed "let the compression reduce the information being stored.")

fwyzard:
> but at this point I do not understand what is going on, I was thinking data were stored in hf16 (aka int16), not in truncated float32

That was my assumption as well.

On the other hand, storing the truncated precision back in a float32 has the advantage of not affecting the data format, and apparently the compression is still good enough to reduce the size by ~37%.
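
To make the bfloat16 remark in this thread concrete, here is a minimal sketch of what "just truncating" a float32 to the bfloat16 layout would look like. The function name `truncateToBfloat16Precision` is an assumption introduced for illustration; it is not part of MiniFloatConverter or of this PR.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper for illustration only; not part of MiniFloatConverter.
// Keeps the sign bit, the 8 exponent bits and the top 7 mantissa bits of a
// float32 (the bfloat16 bit layout) and zeroes the remaining 16 bits.
// The result is still stored as a float32.
inline float truncateToBfloat16Precision(float x) {
  std::uint32_t bits;
  std::memcpy(&bits, &x, sizeof(bits));  // well-defined way to read the bit pattern
  bits &= 0xFFFF0000u;                   // drop the low 16 mantissa bits (truncation, no rounding)
  float out;
  std::memcpy(&out, &bits, sizeof(out));
  return out;
}
```

With only 7 explicit mantissa bits, neighbouring representable values are spaced by a relative step of 2^-7 (about 0.8%), compared with 2^-10 (about 0.1%) for half precision. That is the "low precision" caveat raised above. Note that this sketch truncates rather than rounds, and it does not treat NaN specially.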

cmsbuild · 2020-04-19T16:16:42Z

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-788168/5772/summary.html

Comparison Summary:

No significant changes to the logs found
Reco comparison results: 2 differences found in the comparisons
DQMHistoTests: Total files compared: 34
DQMHistoTests: Total histograms compared: 2696435
DQMHistoTests: Total failures: 39
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 2696077
DQMHistoTests: Total skipped: 319
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 33 files compared)
Checked 147 log files, 16 edm output root files, 34 DQM output files

jsalfeld · 2020-05-01T14:38:30Z

Sorry for the delay. I switched to MiniFloatConverter::reduceMantissaToNbitsRounding and checked via Events->Scan() that, with a 10-bit mantissa, the output matches the one from "MiniFloatConverter::float16to32(MiniFloatConverter::float32to16())". The size is similar:
File outputScoutingPF.root Events 120
Method | Branch Name | Average Uncompressed Size (Bytes/Event) | Average Compressed Size (Bytes/Event)
MantissaRounding | ScoutingParticles_hltScoutingPFPacker__HLT2018. | 25165.4 | 7747.8
ConversionReConversion | ScoutingParticles_hltScoutingPFPacker__HLT2018. | 25165.4 | 7487.35

If this is good, then we can adjust the precision as we want.
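
As a rough illustration of the idea behind rounding the mantissa in a single call, the sketch below rounds a float32 to a chosen number of mantissa bits while keeping it a float32. It is a standalone stand-in written for this note: the name `roundMantissaToNBits` and its template interface are assumptions, and it is not the actual MiniFloatConverter::reduceMantissaToNbitsRounding implementation, whose signature is not shown in this thread.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for illustration; NOT the CMSSW implementation.
// Rounds the 23-bit float32 mantissa to its top `Bits` bits (ties round away
// from zero) and zeroes the remaining low bits. The value stays a float32.
template <int Bits>
inline float roundMantissaToNBits(float x) {
  static_assert(Bits > 0 && Bits < 23, "Bits must be in [1, 22]");
  if (!std::isfinite(x))
    return x;  // leave NaN and +-inf untouched
  constexpr std::uint32_t shift = 23 - Bits;
  constexpr std::uint32_t mask = ~std::uint32_t{0} << shift;       // keeps sign, exponent, top mantissa bits
  constexpr std::uint32_t half = std::uint32_t{1} << (shift - 1);  // half of the last kept bit
  std::uint32_t bits;
  std::memcpy(&bits, &x, sizeof(bits));
  // Adding `half` before masking rounds instead of truncating; a carry out of
  // the mantissa correctly bumps the exponent (e.g. just below 2.0 becomes 2.0).
  bits = (bits + half) & mask;
  float out;
  std::memcpy(&out, &bits, sizeof(out));
  return out;
}
```

For example, `roundMantissaToNBits<10>(cand.pt())` would leave the low 13 mantissa bits at zero, which is what lets the file compression shrink the branch without changing the data format. Unlike a real float32 to float16 round trip, this sketch keeps the full float32 exponent range, so values outside the half-precision range are not clamped or flushed, and its tie-breaking may differ from the library's.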

cmsbuild · 2020-05-01T14:38:40Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-29513/14954

This PR adds an extra 16KB to repository

cmsbuild · 2020-05-01T14:39:07Z

Pull request #29513 was updated. @cmsbuild, @Martin-Grunewald, @fwyzard can you please check and sign again.

silviodonato · 2020-05-05T12:22:00Z

please test

cmsbuild · 2020-05-05T12:22:21Z

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-run-pr-tests/6023/console Started: 2020/05/05 14:22

cmsbuild · 2020-05-05T13:33:08Z

+1
Tested at: 70f4902
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-788168/6023/summary.html
CMSSW: CMSSW_11_1_X_2020-05-05-1100
SCRAM_ARCH: slc7_amd64_gcc820

cmsbuild · 2020-05-05T13:33:11Z

Comparison job queued.

cmsbuild · 2020-05-05T15:15:05Z

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-788168/6023/summary.html

Comparison Summary:

No significant changes to the logs found
Reco comparison results: 0 differences found in the comparisons
DQMHistoTests: Total files compared: 34
DQMHistoTests: Total histograms compared: 2696239
DQMHistoTests: Total failures: 1
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 2695919
DQMHistoTests: Total skipped: 319
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 33 files compared)
Checked 147 log files, 16 edm output root files, 34 DQM output files

Martin-Grunewald · 2020-05-05T15:33:57Z

+1

cmsbuild · 2020-05-05T15:34:23Z

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @silviodonato, @dpiparo (and backports should be raised in the release meeting by the corresponding L2)

silviodonato · 2020-05-05T16:10:42Z

+1

decrease event size by storing PFCandidates with half precision float…

435cb20

… point format

cmsbuild added this to the CMSSW_11_1_X milestone Apr 19, 2020

cmsbuild added code-checks-pending comparison-pending hlt-pending orp-pending pending-signatures tests-pending labels Apr 19, 2020

cmsbuild added code-checks-rejected and removed code-checks-pending labels Apr 19, 2020

clang format

be726b0

cmsbuild added code-checks-pending and removed code-checks-rejected labels Apr 19, 2020

cmsbuild added code-checks-approved and removed code-checks-pending labels Apr 19, 2020

cmsbuild added tests-started and removed tests-pending labels Apr 19, 2020

cmsbuild added tests-approved and removed tests-started labels Apr 19, 2020

fwyzard reviewed Apr 19, 2020


cmsbuild added code-checks-approved and removed code-checks-pending labels May 1, 2020

cmsbuild added tests-started and removed tests-pending labels May 5, 2020

cmsbuild added tests-approved and removed tests-started labels May 5, 2020

cmsbuild added comparison-available and removed comparison-pending labels May 5, 2020

cmsbuild added fully-signed hlt-approved and removed hlt-pending pending-signatures labels May 5, 2020

cmsbuild added orp-approved and removed orp-pending labels May 5, 2020

cmsbuild merged commit 743e9ac into cms-sw:master May 5, 2020

This was referenced May 5, 2020

Clean up BuildFiles under RecoLocalTracker/ #29705

Merged

Clean up BuildFiles under RecoTracker/ #29716

Merged

Clean up BuildFiles under L1Trigger/ #29685

Merged

Clean up BuildFiles under Geometry/ #29675

Merged
