Study additional decrease in SIM AOD size #39449

Open · Dr15Jones opened this issue Sep 19, 2022 · 51 comments

@Dr15Jones

Studying a Run 3 pileup-based workflow (11834.21) with 1000 events shows that the following branches take the bulk of the space on disk:

| Branch | Relative % |
|---|---|
| recoTracks_generalTracks__RECO | 32.6% |
| recoPFCandidates_particleFlow__RECO | 16.1% |
| recoGenParticles_genParticles__HLT | 4.6% |

Applying different branch structures, object thinning, and lossy compression strategies (described below) to those branches could decrease the SIM AOD size by more than 15%, depending on how much loss of information is acceptable.

@cmsbuild

A new Issue was created by @Dr15Jones Chris Jones.

@Dr15Jones, @perrotta, @dpiparo, @rappoccio, @makortel, @smuzaffar can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@Dr15Jones commented Sep 19, 2022

Change to branch structure

The default for the AOD event content is to preserve the branch splitting from the input file in the output file. This was probably done initially to allow fast cloning of those branches from input to output. But with our use of multi-threading we already cannot use fast cloning (since using multiple threads can cause the event order in the output file to differ from the event order in the input file).

By allowing the branches read from the input source to be fully split when written, the size of the 1000-event SIM AOD file decreased by 4.7%.

@makortel

assign core,reconstruction,generators

@cmsbuild

New categories assigned: core,generators,reconstruction

@mkirsano,@menglu21,@mandrenguyen,@Dr15Jones,@smuzaffar,@clacaputo,@alberto-sanchez,@SiewYan,@makortel,@GurpreetSinghChahal,@Saptaparna you have been requested to review this Pull request/Issue and eventually sign? Thanks

@Dr15Jones commented Sep 19, 2022

Lossy Compression Overview

ROOT offers several lossy compression options, all based on the ROOT-defined typedefs Double32_t and Float16_t.

double to float

Use of the typedef Double32_t signifies to ROOT that, although the in-memory representation of a floating-point value is a double, when the data are written out the value should be converted to a float and stored as such. Essentially this truncates the mantissa and exponent of the floating-point value.

Double32_t is already used extensively in the branches mentioned in the opening description.

fixed precision conversion

If a specially structured comment is placed on the same line as the declaration of a member variable of type Double32_t (or Float16_t), then ROOT will convert the floating-point number into a fixed-point number and store the result. The comment specifies the minimum value, the maximum value, and the number of bits to store.

This form is quite useful for storing values which have a fixed precision, such as position measurements or possibly angular measurements.

relative precision conversion

Similar to fixed-precision conversion, by adding a special comment after the declaration of a member variable one can specify relative-precision conversion. In this case the full exponent range of a float is used (8 bits), but the number of mantissa bits is rounded down to the number of bits requested (with a maximum of 24 storable bits). One bit for the sign of the value is also stored. A Float16_t always uses this form of lossy compression, with a default of 12 stored bits.

This is useful for storing values where the precision of the value is proportional to the value itself, such as the case for the momentum measurement.

I created a test where I looked at values of the GenParticles' momentum stored with relative precision using different bit settings. The largest relative deviations measured were:

| bits of precision | relative deviation |
|---|---|
| 9 | 9.8E-4 |
| 10 | 4.9E-4 |

Extrapolating to 12 bits I would expect a relative deviation of 1.2 E-4.

@Dr15Jones commented Sep 19, 2022

reco::Track lossy compression

The largest contributions to the stored `std::vector<reco::Track>` come from the following sub-TBranches:

| Branch | Relative |
|---|---|
| covariance_[15] | 60.6% |
| vertex_.fCoordinates.fX | 3.9% |
| vertex_.fCoordinates.fY | 3.6% |
| vertex_.fCoordinates.fZ | 3.9% |
| momentum_.fCoordinates.fX | 3.9% |
| momentum_.fCoordinates.fY | 3.9% |
| momentum_.fCoordinates.fZ | 3.9% |

I tried various relative compression settings for covariance_ and momentum_.

For vertex_ I noted that X and Y were bounded between -150 and 150 while Z did not have an obvious bound. Assuming we'd want to keep an absolute resolution of around 1 µm, I used fixed-precision compression on just X and Y, keeping 21 bits. At the suggestion of @slava77, I also tried bounding between -1100 and 1100 with 24 bits.

The code for this change can be found here: Dr15Jones#7

The results for different compression settings can be found below

| compression description | reduction of AOD size | reduction of generalTracks branch only |
|---|---|---|
| covariance & Pxyz 9 bits, vertex 21 bits | 13.6% | 38.4% |
| covariance & Pxyz 9 bits, vertex 24 bits | 13.6% | 38.5% |
| covariance & Pxyz 9 bits | 12.1% | 34.0% |
| covariance & Pxyz 10 bits, vertex 24 bits | 12.8% | 36.0% |
| covariance & Pxyz 10 bits | 11.2% | 31.6% |
| covariance & Pxyz 12 bits, vertex 24 bits | 10.9% | 30.7% |
| covariance & Pxyz 12 bits | 9% | 26.2% |

@Dr15Jones commented Sep 19, 2022

reco::LeafCandidate lossy compression

The reco::PFCandidate, reco::GenParticle and reco::PFJet classes all obtain their base information by inheriting from reco::LeafCandidate, so applying lossy compression to that class will decrease storage for all of them.

Here are the relative sizes of the shared components for PFCandidate and GenParticle, as they are the largest on file:

| Branch | relative fraction of PFCandidate | relative fraction of GenParticle |
|---|---|---|
| vertex_.fCoordinates.fX | 6.6% | 2.1% |
| vertex_.fCoordinates.fY | 6.2% | 1.9% |
| vertex_.fCoordinates.fZ | 6.6% | 1.8% |
| p4Polar_.fCoordinates.fPt | 10.4% | 19.2% |
| p4Polar_.fCoordinates.fEta | 9.7% | 19.1% |
| p4Polar_.fCoordinates.fPhi | 9.7% | 18.8% |
| p4Polar_.fCoordinates.fM | 1.5% | 9.6% |

The way ROOT encodes data into fEta makes it hard to apply lossy compression: very large values of eta are truncated and then the value of Pz is added to the internal storage of fEta. This means that for low values of Pz one needs very high absolute precision. Because of that, no attempt was made to apply lossy compression to fEta. Different relative lossy precision settings were applied to fPt, while fixed-precision lossy compression was used on fPhi. The code testing this can be found here:
https://github.com/Dr15Jones/cmssw/pull/9/files

Given the difficulties with the polar representation, I created storage-only variables for Px, Py and Pz and then applied lossy compression to them. The code testing this can be found here:
Dr15Jones#8

The relative savings on AOD size are as follows:

| compression description | reduction of AOD size |
|---|---|
| Pt 9 bits & Phi 16 bits | 1.2% |
| Pt 9 bits & Phi 20 bits | 0.8% |
| Pt 9 bits & Phi 24 bits | 0.4% |
| Pxyz 9 bits | 2.5% |
| Pxyz 10 bits | 2.2% |
| Pxyz 12 bits | 1.5% |

Applying fixed-precision lossy compression to Eta on just the PFCandidate objects yields the following savings:

| compression description | reduction of AOD size |
|---|---|
| Pt 9 bits & Phi&Eta 16 bits | 0.7% |
| Pt 9 bits & Phi&Eta 20 bits | 0.2% |
| Pt 9 bits & Phi&Eta 24 bits | -0.2% (it got bigger) |

In theory, adding the savings on just PFCandidate to the savings for all Candidates would give an upper limit for applying all the changes together. Even then, the savings are not as great as just using the Pxyz representation.

@slava77 commented Sep 19, 2022

@Dr15Jones please add compression details w.r.t. the reco::Track generalTracks branch; the other kinds of tracks (electrons or muons) are less frequent and would not affect the total compression as much.
A related question: can we have this compression per product instead of per class?

I don't think this compression strategy for reco::Track would be appropriate in the default global coordinates (esp. p{x,y,z}).
A guide could be the quality of e.g. conversion (track pair) reconstruction, as well as the frequency of getting a non-positive-definite covariance after this compression.

A more appropriate representation for the momentum would be q/p (or 1/pt), theta, phi, or even the pt, eta, phi used in the Particle representation; the compression has to preserve the angle better than the absolute value.

@Dr15Jones commented Sep 19, 2022

please add compression details wrt reco::Track generalTracks branch;

This is just using standard AOD settings: fully split with LZMA 4.

A related question is: can we have this compression per product instead of per class?

I'm afraid not.

I don't think this compression strategy for reco::Track would be appropriate in the default global coordinates (esp. p{x,y,z}).
A guide could be the quality of e.g. conversion (track pair) reconstruction, as well as the frequency of getting a non-positive-definite covariance after this compression.

I think we really need further work to be pursued by experts, not a meddling amateur like myself :).

A more appropriate representation for the momentum would be q/p (or 1/pt), theta, phi, or even the pt, eta, phi used in the Particle representation; the compression has to preserve the angle better than the absolute value.

I actually found the opposite when I did the work on PFCandidate and GenParticle. The polar notation is substantially worse when compressed and seemed to me to have worse precision behavior. But again, those judgements would be better made by professionals.

@slava77 commented Sep 19, 2022

A more appropriate representation for the momentum would be q/p (or 1/pt), theta, phi, or even the pt, eta, phi used in the Particle representation; the compression has to preserve the angle better than the absolute value.

I actually found the opposite when I did the work on PFCandidate and GenParticle. The polar notation is substantially worse when compressed and seemed to me to have worse precision behavior. But again, those judgements would be better made by professionals.

Did you really see that with the PFCandidate?
I could understand the problem for GenParticle saving pz=0 tracks using 5-7-digit overflow for eta, but the PFCands should run out by eta of 5.

Please remind me if this functionality is configurable by product (instead of by class)

@mmusich commented Sep 20, 2022

I tried various relative compression settings for covariance_ and momentum_.

Compression of the covariance matrix elements leading to non-positive-definite matrices is already a limiting factor for analyses using MiniAOD. If the plan is to do the same for AOD, it needs to be done with care and carefully cross-validated.

@davidlange6 commented Sep 20, 2022 via email

@mmusich commented Sep 20, 2022

Is there a summary of the effect in miniaod somewhere?

https://indico.cern.ch/event/1155820/#7-track-covariance-matrices-in

(Or eg, what is the rate seen there?)

60% of the tracks used for BPH analysis have this problem.

@slava77 commented Sep 20, 2022

60% of the tracks used for BPH analysis have this problem.

simple rounding will likely have a smaller impact than what's done in miniAOD with the cov parameterization

@mmusich commented Sep 20, 2022

simple rounding will likely have a smaller impact than what's done in miniAOD with the cov parameterization

not objecting to that, but one would have to see the impact and have it agreed by the relevant parties.

@Dr15Jones

@mmusich

not objecting to that, but one would have to see the impact and have it agreed by the relevant parties.

A major reason I made this issue is to start to get experts to begin looking at these options.

@Dr15Jones

@slava77

Please remind me if this functionality is configurable by product (instead of by class)

It is only by class. (Looks like my previous reply was accidentally included in the quote I copied).

But it is possible for us to have different classes that work like the LeafCandidate (I actually did that as part of my testing) so that different inheriting classes could have different storage options.

@Dr15Jones

Did you really see that with the PFCandidate?
I could understand the problem for GenParticle saving pz=0 tracks using 5-7-digit overflow for eta, but the PFCands should run out by eta of 5.

It is a good point, as I only studied GenParticle and have not looked at the details of PFCandidate. Matti pointed me to the PackedCandidate used in MiniAOD storage, where eta is truncated at ±6.

@Dr15Jones commented Sep 20, 2022

As a test of the loss of precision, I ran RECO jobs on the same 1000 events using just one thread in order to keep the event ordering identical. I then wrote the momentum-related values from the GenParticles to a test structure that stored them using different compression values/methods, and measured the deviation of the compressed values from the fully stored values. The results are below.

| Max Deviation | Original | Pt 9 bit, Phi 16 bit | Pt 9 bit, Phi 20 bit | Pt 9 bit, Phi 24 bit | Pxyz 9 bit | Pxyz 12 bit |
|---|---|---|---|---|---|---|
| P ratio | 0. | 9.8E-4 | 9.8E-4 | 9.8E-4 | 9.8E-4 | 1.2E-4 |
| Pt ratio | 0. | 9.8E-4 | 9.8E-4 | 9.8E-4 | 9.8E-4 | 1.2E-4 |
| Px ratio | 0. | 6.3E-3 | 1.1E-3 | 9.8E-4 | 9.8E-4 | 1.2E-4 |
| Px diff | 0. | 1.1E-4 | 7.3E-5 | 7.0E-5 | - | - |
| Py ratio | 0. | 6.9E-3 | 1.1E-3 | 9.8E-4 | 9.8E-4 | 1.2E-4 |
| Py diff | 0. | 9.7E-6 | 1.0E-5 | 1.0E-5 | - | - |
| Pz ratio | 0. | 9.8E-4 | 9.8E-4 | 9.8E-4 | 9.8E-4 | 1.2E-4 |
| Phi diff | 0. | 4.8E-5 | 3.0E-6 | 1.9E-7 | 9.3E-4 | 1.1E-4 |
| Eta ratio | 0. | 0. | 0. | 0. | 1.8E-3 | 2.3E-4 |

I have now gone and applied an Eta fixed precision between -6 and 6 to all particleFlow PFCandidates in the 1000-event file and compared the precision. (This could not safely be done with GenParticles, since we do have important particles with eta > 6.)

| Max Deviation | Pt 9 bit, Phi&Eta 16 bit | Pt 9 bit, Phi&Eta 20 bit | Pt 9 bit, Phi&Eta 24 bit | Pt 12 bit, Phi&Eta 20 bit |
|---|---|---|---|---|
| P ratio | 1.1E-3 | 9.8E-4 | 9.8E-4 | 1.3E-4 |
| Pt ratio | 9.8E-4 | 9.8E-4 | 9.8E-4 | 1.2E-4 |
| Px ratio | 1.7E-3 | 9.8E-4 | 9.8E-4 | 1.6E-4 |
| Px diff | 1.5E-4 | 2.0E-4 | 2.0E-4 | 1.0E-5 |
| Py ratio | 1.7E-3 | 9.8E-4 | 9.8E-4 | 1.3E-4 |
| Py diff | 2.3E-4 | 2.2E-4 | 2.2E-4 | 2.7E-6 |
| Pz ratio | 4.4E-3 | 9.8E-4 | 9.8E-4 | 2.5E-4 |
| Pz diff | 6.1E-6 | 1.0E-4 | 1.0E-4 | 1.2E-5 |
| Phi diff | 4.8E-5 | 3.0E-6 | 1.9E-7 | 3.0E-6 |
| Eta diff | 9.2E-5 | 5.7E-6 | 3.6E-7 | 5.7E-6 |

@mmusich commented Sep 20, 2022

A major reason I made this issue is to start to get experts to begin looking at these options.

what are the target gains in % expected from this effort?

@slava77 commented Sep 20, 2022

About the number of bits for px, py, pz: if I take a naive linear propagation of the relative uncertainty, for the W mass we'll need 1e-6 precision (0.1 MeV); so that's 19 or 20 bits.

Perhaps a toy phase-space study would show the more appropriate connection.
@bendavid is my consideration appropriate or are there some cancellations (or, worse, enhancements) in the relationship between the track momentum precision and the fitted mass?

@Dr15Jones

what are the target gains in % expected from this effort?

Such a question is "above my pay grade" :). I'm just showing that we have the possibility for substantial gains by exploring this area. @dpiparo ?

@Dr15Jones commented Sep 20, 2022

@slava77

About the number of bits for px, py, pz: if I take a naive linear propagation of the relative uncertainty, for the W mass we'll need 1e-6 precision (0.1 MeV); so that's 19 or 20 bits.

From what I can tell, this is what MiniAOD is using:

```cpp
void pat::PackedCandidate::pack(bool unpackAfterwards) {
  float unpackedPt = std::min<float>(p4_.load()->Pt(), MiniFloatConverter::max());
  packedPt_ = MiniFloatConverter::float32to16(unpackedPt);
  packedEta_ = int16_t(std::round(p4_.load()->Eta() / 6.0f * std::numeric_limits<int16_t>::max()));
  packedPhi_ = int16_t(std::round(p4_.load()->Phi() / 3.2f * std::numeric_limits<int16_t>::max()));
  packedM_ = MiniFloatConverter::float32to16(p4_.load()->M());
  if (unpackAfterwards) {
    delete p4_.exchange(nullptr);
    delete p4c_.exchange(nullptr);
    unpack();  // force the values to match with the packed ones
  }
}
```

where MiniFloatConverter::float32to16 is

```cpp
inline static uint16_t float32to16(float x) { return float32to16round(x); }
```

which is

```cpp
inline static uint16_t float32to16round(float x) {
  uint32_t i32 = edm::bit_cast<uint32_t>(x);
  uint8_t shift = shifttable[(i32 >> 23) & 0x1ff];
  if (shift == 13) {
    uint16_t base2 = (i32 & 0x007fffff) >> 12;
    uint16_t base = base2 >> 1;
    if (((base2 & 1) != 0) && (base < 1023))
      base++;
    return basetable[(i32 >> 23) & 0x1ff] + base;
  } else {
    return basetable[(i32 >> 23) & 0x1ff] + ((i32 & 0x007fffff) >> shifttable[(i32 >> 23) & 0x1ff]);
  }
}
```

which (I think) stores 11 bits of the mantissa and 4 bits of the exponent and 1 bit for the sign.

@slava77 commented Sep 20, 2022

From what I can tell, this is what the MiniAOD is using ...
which (I think) stores 11 bits of the mantissa and 4 bits of the exponent and 1 bit for the sign.

On one hand, not everything in MiniAOD is saved at this precision; electrons and muons have the same precision as AOD.
On the other hand, my example leading to a 1e-6 precision need is too simplistic; that's more relevant for a fit over a distribution (so some sqrt(N) factor contributes). The single-candidate resolution requirements are likely much softer. I'll try to come up with a clearer toy/case.

@Dr15Jones

@slava77 how does reconstruction want to proceed?

@slava77 commented Sep 26, 2022

@slava77 how does reconstruction want to proceed?

I can only respond for tracking (the reco::Track part of this issue).
We could start with no-PU validation using samples of the same type as in a recent tracking validation (https://its.cern.ch/jira/browse/PDMVRELVALS-158), perhaps with covariance & Pxyz 10 bits and covariance & Pxyz 12 bits.

Please clarify what you did for the vertex_: was it a change in nbits or also a bound (150 cm was mentioned)? I'd take 11 m as a somewhat safer alternative, in case some muon studies decide to put the reference point on the last possible measurement.

@Dr15Jones

@slava77

Please clarify what you did for the vertex_: was it a change in nbits or also a bound (150 cm was mentioned)? I'd take 11 m as a somewhat safer alternative, in case some muon studies decide to put the reference point on the last possible measurement.

It was both
https://github.com/Dr15Jones/cmssw/blob/7992e286bc39a8035eca3a03c6b4bcd8ab60448f/DataFormats/TrackReco/src/TrackPositionStorage.h#L31-L33

Going to 11 m while keeping the same position accuracy (which is now about 1 µm) would mean at least 3 more bits of storage.

@Dr15Jones

@slava77

We could start with no-PU validation using samples of the same type as in a recent tracking validation https://its.cern.ch/jira/browse/PDMVRELVALS-158

If the plots are made during RECO then they would not reflect the change in precision since that only appears for later steps reading the data.

@slava77 commented Sep 26, 2022

@slava77

We could start with no-PU validation using samples of the same type as in a recent tracking validation https://its.cern.ch/jira/browse/PDMVRELVALS-158

If the plots are made during RECO then they would not reflect the change in precision since that only appears for later steps reading the data.

good point, we need to split the request to run PAT/miniAOD in a separate step from the RECO step.

@Dr15Jones commented Sep 26, 2022

@slava77 I updated the Track lossy section with new values where I used covariance & Pxyz 9 bits and position x & y using bounds of -1100 and 1100 with 24 bits of compression. The findings give a size identical to the more restricted x & y using bounds of -150 and 150 with 21 bits. I hypothesize that the two are identical except that in memory the 24-bit version has more 0 bits than the 21-bit version, and the compression algorithm does a great job of removing those extra 0s.

@Dr15Jones

@slava77 how to proceed? Should I make two pull requests, with just covariance & Pxyz 10 bits and covariance & Pxyz 12 bits and no position compression? Would you then request two different special builds with those changes so they can be validated?

@slava77 commented Sep 29, 2022

@slava77 how to proceed? Should I make two pull requests, with just covariance & Pxyz 10 bits and covariance & Pxyz 12 bits and no position compression? Would you then request two different special builds with those changes so they can be validated?

@cms-sw/orp-l2 please advise
IIUC, we will need a pre-release as a reference and two additional builds (pre-release + one of the two PRs mentioned above) so that the relvals can be requested.
I guess separate patch branches would be needed, or [merge-[revert-merge]-revert] steps if you'd prefer to stay on one branch.

@Dr15Jones

@slava77 do you want the validation test to use compression for the position as well?

@perrotta

When do you need that validation? With the present schedule, CMSSW_12_6_0_pre3 is expected for Oct 10. Once there is a pre-release, we can branch off a couple of releases specific for this test.
If you need it earlier, we could cut pre3 next Tuesday, make another couple of pre-releases two weeks apart, and keep the same target date with one additional pre-release with respect to what is planned in https://twiki.cern.ch/twiki/bin/view/CMS/CMSSW_12_6_0
Are normal RelVals enough for your studies? (I apologize if it was already written somewhere above, but I have not read the whole thread yet.)
In any case, please let us know your desires/needs.

@slava77 commented Sep 29, 2022

When do you need that validation? With the present schedule, a CMSSW_12_6_0_pre3 is expected for Oct 10.

I think that pre3 is fine,
IIUC, we'll not put this into any useful production until 13_X
but @Dr15Jones please correct me if I'm wrong.

@Dr15Jones

I think that pre3 is fine, IIUC, we'll not put this into any useful production until 13_X

I don't have any insight into the time scale on which this would be deployed, other than my personal bias of "sooner would be better" in order to help with our recent resource constraints.

@slava77 commented Sep 29, 2022

@slava77 do you want the validation test to use compression for the position as well?

please add some details in the table #39449 (comment)

to accompany the covariance & Pxyz 10 bits and covariance & Pxyz 12 bits variants.

From #39449 (comment) I'm not sure there is much gain, if 24 bits (default float) gives the same size as 21 bits. But perhaps I misunderstood the test reported there.

@Dr15Jones commented Sep 29, 2022

From #39449 (comment) I'm not sure there is much gain, if 24 bits (default float) gives the same size as 21 bits. But perhaps I misunderstood the test reported there.

When comparing just using covariance & Pxyz 9 bits to covariance & Pxyz 9 bits PLUS x & y at 24 bits, the total AOD file size decreases by an additional 1.5%. In the covariance & Pxyz 9 bits only case, x, y & z are already being converted to float when stored, as they are Double32_t values. A float is 32 bits in size, a 24-bit mantissa with 8 bits of exponent. As the exponent is base 2, values have their information split between the mantissa and the exponent. To illustrate, let's look at a few representations, assuming a fixed conversion with a min of 0, a max of 2^23 and 24 bits.

| decimal value | int representation | float representation | comment |
|---|---|---|---|
| 8.0 | b[20 0s]1000 | b[24 0s]00000011 | 2^3 |
| 9.0 | b[20 0s]1001 | b001[21 0s]00000011 | 2^3*(1+1/8) |
| 255.0 | b[16 0s]11111111 | b1111111[17 0s]00000111 | 2^7*(1+1/2+1/4+...+1/128) |

Given that the conversion of x & y to fixed-point 24-bit values (with a min of -1100 cm and a max of 1100 cm) still reduces the size of the AOD, this implies that the fixed 24-bit conversion is easier to compress than the 32-bit float.

@slava77 commented Sep 29, 2022

From #39449 (comment) I'm not sure there is much gain, if 24 bits (default float) gives the same size as 21 bits. But perhaps I misunderstood the test reported there.

When comparing just using covariance & Pxyz 9 bits to covariance & Pxyz 9 bits PLUS x & y at 24 bits, the total AOD file size decreases by an additional 1.5%. In the covariance & Pxyz 9 bits only case, x, y & z are already being converted to float when stored, as they are Double32_t values. A float is 32 bits in size, a 24-bit mantissa with 8 bits of exponent. As the exponent is base 2, values have their information split between the mantissa and the exponent. To illustrate, let's look at a few representations, assuming a fixed conversion with a min of 0, a max of 2^23 and 24 bits.

| decimal value | int representation | float representation | comment |
|---|---|---|---|
| 8.0 | b[20 0s]1000 | b[24 0s]00000011 | 2^3 |
| 9.0 | b[20 0s]1001 | b001[21 0s]00000011 | 2^3*(1+1/8) |
| 255.0 | b[16 0s]11111111 | b1111111[17 0s]00000111 | 2^7*(1+1/2+1/4+...+1/128) |

Given that the conversion of x & y to fixed-point 24-bit values (with a min of -1100 cm and a max of 1100 cm) still reduces the size of the AOD, this implies that the fixed 24-bit conversion is easier to compress than the 32-bit float.

Ugh,

[image]

I proposed making PRs for the two last cases in the table. You followed up with a possible addition of the position compression, and I just wanted to see the explicit expected compression in this table for the proposed variant of position compression. I wasn't looking for more academic details.

@Dr15Jones

@slava77 it will probably take a few days (given the speed of the jobs) to make the measurements for

  • covariance & Pxyz 10 bits with x&y using 24 bit fixed (min -1100 max 1100) and
  • covariance & Pxyz 12 bits with x&y using 24 bit fixed (min -1100 max 1100)

Given that, to first order, the different field compressions are independent of each other, I expect the addition of position compression to decrease the AOD size by an additional 1.5% compared with covariance & Pxyz alone (since that was the case in the 9-bit compression test). So the overall estimated reductions of the AOD are 12.7% and 10.5% (for the 10-bit and 12-bit cases of covariance and Pxyz).

@Dr15Jones

@slava77 the table has been updated with the additional measurements. The savings are even a bit better than my original estimate.

@slava77 commented Sep 30, 2022

@slava77 the table has been updated with the additional measurements. The savings are even a bit better than my original estimate.

Thanks.

Perhaps we can start with the PRs now? Perhaps with just one first, to confirm backward compatibility and what can be seen in the PR tests.

The somewhat more aggressive option covariance & Pxyz 10 bits, vertex 24 bits could be a start.

@Dr15Jones

@slava77

The somewhat more aggressive option covariance & Pxyz 10 bits, vertex 24 bits could be a start.

See #39554

@slava77 commented Oct 8, 2022

When do you need that validation? With the present schedule, CMSSW_12_6_0_pre3 is expected for Oct 10. Once there is a pre-release, we can branch off a couple of releases specific for this test. If you need it earlier, we could cut pre3 next Tuesday, make another couple of pre-releases two weeks apart, and keep the same target date with one additional pre-release with respect to what is planned in https://twiki.cern.ch/twiki/bin/view/CMS/CMSSW_12_6_0 Are normal RelVals enough for your studies? (I apologize if it was already written somewhere above, but I have not read the whole thread yet.) In any case, please let us know your desires/needs.

pre3 is out. I guess we can proceed with the plan.
One variant is available as #39554 (currently made against master) and at a glance it seems ready (the backward compatibility is confirmed and the HLT validation plots roughly confirm the expected precision of the lossy compression change).

Should special branches be made?

@perrotta @rappoccio

@mandrenguyen

+reconstruction
It seems that, at least for the case of tracking, the available lossy compression options were studied and found to be too approximate to be safe for physics.
Do we consider this case closed?

@cmsbuild commented Apr 2, 2024

cms-bot internal usage

@makortel commented Apr 3, 2024

-core

Do we consider this case closed?

I guess at this point it seems clear the lossy compression won't be going forward.
