
[Q&A] InflateState reset has measurable costs when decompressing millions of objects #89

Closed
Byron opened this issue Aug 4, 2020 · 3 comments · Fixed by #91

Comments

Byron commented Aug 4, 2020

When decompressing the objects in the linux kernel pack with 7.5 million objects/decompressions, I am using a boxed InflateState which is reset for each object to get ready for the next decompression. Note that this is the common case: it happens whenever a pack is received during a clone or fetch, as an index has to be built from the pack. gitoxide is already faster than git when resolving these objects (i.e. computing the SHA1), but is quite a bit slower during indexing, which is what I am trying to fix :D.

Since it comes with its own output buffer, this one is reset as well.
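To make the cost concrete, here is a minimal toy sketch of the reuse pattern described above. `DecompressState`, `reset`, and `OUT_BUF_SIZE` are hypothetical stand-ins, not the actual miniz_oxide API; the point is only that a full `reset` re-zeroes the internal output buffer on every one of millions of iterations:

```rust
const OUT_BUF_SIZE: usize = 32 * 1024;

// Toy stand-in for a boxed inflate state with its own output buffer.
struct DecompressState {
    out_buf: Box<[u8; OUT_BUF_SIZE]>, // internal output buffer
    out_pos: usize,                   // write position within the buffer
}

impl DecompressState {
    fn new() -> DecompressState {
        DecompressState {
            out_buf: Box::new([0u8; OUT_BUF_SIZE]),
            out_pos: 0,
        }
    }

    // Resetting zeroes the whole 32KB buffer. Done once this is cheap;
    // done 7.5 million times it shows up prominently in a profile.
    fn reset(&mut self) {
        self.out_buf.fill(0);
        self.out_pos = 0;
    }
}

// One state, reused across many small streams, reset before each one.
fn decompress_all(objects: usize) -> usize {
    let mut state = DecompressState::new();
    let mut total = 0;
    for _ in 0..objects {
        state.reset();      // the per-object zeroing this issue is about
        state.out_pos = 42; // stand-in for decompressing one small object
        total += state.out_pos;
    }
    total
}
```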

This shows up when profiling decompressing/indexing the entire kernel pack:

[Screenshot 2020-08-04 at 11:25:03]

Commenting out the line that resets the buffer has the desired effect, removing this cost entirely.

[Screenshot 2020-08-04 at 11:30:10]

The respective field has a TODO associated with it, and I wonder if you think my issue could be fixed as part of addressing that TODO.

Thanks for sharing your thoughts.

oyvindln (Collaborator) commented Aug 5, 2020

Hm, re-using the buffer without zeroing it could in theory pose a security risk if there were a bug somewhere in the decompression code (not that we have found anything like that through fuzzing). It's probably possible to use a structure or something similar to statically guarantee that old content is never accidentally read, though.

A simple temporary solution would be to add an extra reset function that doesn't zero out the buffer for uses like this, with an accompanying warning. In any case the streaming API needs a serious rework.
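A sketch of what that extra reset function might look like, using toy names (`InflateState`, `reset_keep_buffer`) that are assumptions here, not the names miniz_oxide actually adopted:

```rust
// Toy model of an inflate state with an internal output buffer.
struct InflateState {
    out_buf: Box<[u8]>,
    out_pos: usize,
}

impl InflateState {
    fn new() -> InflateState {
        InflateState {
            out_buf: vec![0u8; 32 * 1024].into_boxed_slice(),
            out_pos: 0,
        }
    }

    /// Full reset: zeroes the buffer so stale bytes can never leak
    /// into the output of the next stream, even if a decompression
    /// bug reads past what was written.
    fn reset(&mut self) {
        self.out_buf.fill(0);
        self.out_pos = 0;
    }

    /// Faster reset that keeps the buffer contents.
    ///
    /// Warning (as discussed above): if a decompression bug ever
    /// reads bytes that the current stream did not write, it could
    /// expose data left over from a previous stream.
    fn reset_keep_buffer(&mut self) {
        self.out_pos = 0;
    }
}
```

The trade-off is purely cost vs. defense in depth: both variants leave the state ready for the next stream, but only the full reset wipes the window.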

It is also possible that the compiler isn't optimizing the zeroing of the array well and is calling bzero more often than it needs to, but I haven't looked into what code that line generates.

Byron (Author) commented Aug 6, 2020

> A simple temporary solution would be to add an extra reset function that doesn't zero out the buffer for uses like this, with an accompanying warning. In any case the streaming API needs a serious rework.

I agree, that seems like the easiest course of action, and I am happy to submit a PR for it if you give me the go-ahead along with a name for that method :D.

On another note, can you think of any other overhead/cost, off the top of your head, associated with starting to decompress a stream? I had the feeling that even with this issue removed, gitoxide could only maintain about 25MB/s when decompressing many, many small streams, whereas git keeps up at ~50MB/s.

In any case, thanks a lot for your help!

Byron (Author) commented Aug 11, 2020

I took a closer look at this issue and did my best to ensure that the slowdown at the end of the pack (where there are many, many small streams) is not caused by anything outside of actually decompressing the streams.

To achieve that, I split the producing code that decompresses small objects into its own thread and avoided making any allocations there. The outcome (an entry) is sent to a consumer thread in chunks big enough to minimize synchronization overhead; the consumer integrates the entries into a tree. That way, it's easy to see if, for instance, the consumer is the bottleneck. Fortunately, one core can integrate millions of entries, and I have seen it receive peaks of close to 400k entries per second.
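The producer/consumer split with chunked sends can be sketched as follows. This is an illustrative std-only version, not gitoxide's actual code; `Entry`, `CHUNK_SIZE`, and the channel choice are assumptions:

```rust
use std::sync::mpsc;
use std::thread;

// Stand-in for one decoded pack entry.
struct Entry {
    id: u64,
}

// Batching entries into chunks means one channel send per CHUNK_SIZE
// entries instead of one per entry, cutting synchronization overhead.
const CHUNK_SIZE: usize = 1024;

fn produce_and_consume(total: u64) -> u64 {
    let (tx, rx) = mpsc::channel::<Vec<Entry>>();

    // Producer thread: "decompresses" entries and ships them in chunks.
    let producer = thread::spawn(move || {
        let mut chunk = Vec::with_capacity(CHUNK_SIZE);
        for id in 0..total {
            chunk.push(Entry { id }); // stand-in for decompressing one object
            if chunk.len() == CHUNK_SIZE {
                let full = std::mem::replace(&mut chunk, Vec::with_capacity(CHUNK_SIZE));
                tx.send(full).expect("consumer alive");
            }
        }
        if !chunk.is_empty() {
            tx.send(chunk).expect("consumer alive"); // flush the final partial chunk
        }
        // dropping tx closes the channel, ending the consumer loop
    });

    // Consumer: integrate entries (here, just count them).
    let mut integrated = 0u64;
    for chunk in rx {
        integrated += chunk.len() as u64;
    }
    producer.join().expect("producer finished cleanly");
    integrated
}
```

Because the consumer only synchronizes once per chunk, it is easy to tell from a profile whether it (rather than decompression) is the bottleneck.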

I have also applied the patch in the linked PR to miniz_oxide to shave off several seconds.

Below is how this looks after profiling the decompression of the linux kernel pack.

[Screenshot 2020-08-11 at 15:08:56]

Most time is spent in the producer thread decompressing entries, some time is spent on IO as well, which is expected.

Now I get these times, which are quite respectable: 52.1MB/s, vs 53.7MB/s for git itself.

[asciicast]

[asciicast]

Once we get the PR to a mergeable state and a new patch release out, I think this issue can be closed as well - it's good enough :).
