Use a faster `deflate` setting #37298

Merged
2 commits merged into rust-lang:master from nnethercote:faster-deflate, Oct 22, 2016

nnethercote commented Oct 20, 2016

In #37086 we have considered various ideas for reducing the cost of LLVM bytecode compression. This PR implements the simplest of these: use a faster `deflate` setting. It's very simple and reduces the compression time by almost half while increasing the size of the resulting rlibs by only about 2%.

I looked at using zstd, which might be able to halve the compression time again. But integrating zstd is beyond my Rust FFI integration abilities at the moment -- it consists of a few dozen C files, has a non-trivial build system, etc. I decided it was worth getting a big chunk of the possible improvement with minimum effort.

The following table shows the before and after percentages of instructions executed during compression while doing debug builds of some of the rustc-benchmarks with a stage1 compiler.

html5ever-2016-08-25      1.4% -> 0.7%
hyper.0.5.0               3.8% -> 2.4%
inflate-0.1.0             1.0% -> 0.5%
piston-image-0.10.3       2.9% -> 1.8%
regex.0.1.30              3.4% -> 2.1%
rust-encoding-0.3.0       4.8% -> 2.9%
syntex-0.42.2             2.9% -> 1.8%
syntex-0.42.2-incr-clean 14.2% -> 8.9%

The omitted benchmarks spend 0% of their time in compression.
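
As a rough cross-check of the "almost half" figure, the before/after shares can be turned into an estimate of how much the compression cost itself shrank. This is only a sketch: it assumes that everything other than compression executes the same number of instructions before and after, and `phase_cost_ratio` is a hypothetical helper, not something from this PR. If compression costs `c` and the rest costs `r`, the share is `p = c / (c + r)`, so `c / r = p / (1 - p)`, and the ratio of the two odds gives new cost over old cost.

```rust
/// Estimate new_cost / old_cost for one phase from its share of total
/// instructions before and after a change.
///
/// Assumption (not from the PR): the rest of the work is unchanged, so the
/// share p = c / (c + r) gives c / r = p / (1 - p), and the ratio of the two
/// odds is the ratio of the two costs.
///
/// Both arguments are percentages strictly between 0 and 100.
fn phase_cost_ratio(before_pct: f64, after_pct: f64) -> f64 {
    let odds = |pct: f64| {
        let p = pct / 100.0;
        p / (1.0 - p)
    };
    odds(after_pct) / odds(before_pct)
}
```

Under that assumption, `phase_cost_ratio(1.4, 0.7)` (the html5ever row) comes out at about 0.50 and `phase_cost_ratio(14.2, 8.9)` (the syntex incremental row) at about 0.59. The shares in the table are rounded to one decimal place, so estimates for the rows with small shares are coarse.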

And here are actual timings:

futures-rs-test  4.110s vs  4.102s --> 1.002x faster (variance: 1.017x, 1.004x)
helloworld       0.223s vs  0.226s --> 0.986x faster (variance: 1.012x, 1.022x)
html5ever-2016-  4.218s vs  4.186s --> 1.008x faster (variance: 1.008x, 1.010x)
hyper.0.5.0      4.746s vs  4.661s --> 1.018x faster (variance: 1.002x, 1.016x)
inflate-0.1.0    4.194s vs  4.143s --> 1.012x faster (variance: 1.007x, 1.006x)
issue-32062-equ  0.317s vs  0.316s --> 1.001x faster (variance: 1.013x, 1.005x)
issue-32278-big  1.811s vs  1.825s --> 0.992x faster (variance: 1.014x, 1.006x)
jld-day15-parse  1.412s vs  1.412s --> 1.001x faster (variance: 1.019x, 1.008x)
piston-image-0. 11.058s vs 10.977s --> 1.007x faster (variance: 1.008x, 1.039x)
reddit-stress    2.331s vs  2.342s --> 0.995x faster (variance: 1.019x, 1.006x)
regex.0.1.30     2.294s vs  2.276s --> 1.008x faster (variance: 1.007x, 1.007x)
rust-encoding-0  1.963s vs  1.924s --> 1.020x faster (variance: 1.009x, 1.006x)
syntex-0.42.2   29.667s vs 29.391s --> 1.009x faster (variance: 1.002x, 1.023x)
syntex-0.42.2-i 15.257s vs 14.148s --> 1.078x faster (variance: 1.018x, 1.008x)
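
For reading the table above: the "faster" column appears to be the first time divided by the second, so values above 1 mean the second (new) build was quicker. A minimal sketch of that calculation, with hypothetical helper names that are not part of the benchmarking tool:

```rust
/// Speedup factor: how many times faster the "after" run is than the
/// "before" run. Values below 1.0 mean the "after" run was slower.
fn speedup(before_secs: f64, after_secs: f64) -> f64 {
    before_secs / after_secs
}

/// Format the factor the way the table above prints it.
fn format_speedup(before_secs: f64, after_secs: f64) -> String {
    format!("{:.3}x faster", speedup(before_secs, after_secs))
}
```

For example, `format_speedup(15.257, 14.148)` reproduces the `1.078x faster` of the last row. Because the times shown are rounded to milliseconds, a ratio recomputed from them can differ from the printed one in the last digit on some rows.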

r? @alexcrichton

Remove `{in,de}flate_bytes_zlib`.
These functions are unused.
Use fast compression in `deflate_bytes`.
This commit changes the parameters of `deflate` to do faster,
lower-quality compression. For the compression of LLVM bytecode -- which
is the main use of `deflate_bytes` -- it makes compression almost twice
as fast while the size of the compressed files is only ~2% worse.
alexcrichton commented Oct 20, 2016

@bors: r+

bors commented Oct 20, 2016

📌 Commit 94771a1 has been approved by alexcrichton

Manishearth added a commit to Manishearth/rust that referenced this pull request Oct 22, 2016

Rollup merge of #37298 - nnethercote:faster-deflate, r=alexcrichton
Use a faster `deflate` setting

bors added a commit that referenced this pull request Oct 22, 2016

Auto merge of #37341 - Manishearth:rollup, r=Manishearth
Rollup of 8 pull requests

- Successful merges: #37294, #37298, #37301, #37310, #37318, #37321, #37326, #37327
- Failed merges:
bors commented Oct 22, 2016

⌛️ Testing commit 94771a1 with merge b5f6d7e...

bors added a commit that referenced this pull request Oct 22, 2016

Auto merge of #37298 - nnethercote:faster-deflate, r=alexcrichton
Use a faster `deflate` setting

@bors bors merged commit 94771a1 into rust-lang:master Oct 22, 2016

2 checks passed

continuous-integration/travis-ci/pr: The Travis CI build passed
homu: Test successful

@nnethercote nnethercote deleted the nnethercote:faster-deflate branch Oct 23, 2016

jsgf commented Nov 30, 2016

BTW, there's a zstd binding crate on crates.io.

bors added a commit that referenced this pull request Jul 11, 2017

Auto merge of #43147 - oyvindln:deflate_fix, r=alexcrichton
Use similar compression settings as before updating to use flate2

Fixes #42879

(My first PR to rust-lang yay)

This changes the compression settings back to how they were before the change to use the flate2 crate rather than the in-tree flate library. The specific changes are to use the `Fast` compression level (which should be equivalent to what was used before), and to use a raw deflate stream rather than wrapping the stream in a zlib wrapper. The [zlib](https://tools.ietf.org/html/rfc1950) wrapper adds an extra 2 bytes of header data and 4 bytes for a checksum at the end. The change to use a faster compression level did give some compile speedups in the past (see #37298). Having to calculate a checksum also added a small overhead, which didn't exist before the change to flate2.

r? @alexcrichton
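
To make the 2 + 4 byte overhead of the zlib wrapper concrete, here is a std-only sketch of the container layout from RFC 1950. It is not the flate2 code path that rustc uses: `adler32`, `stored_block` and `zlib_wrap` are hypothetical helpers written for illustration, and the raw deflate stream is a single stored (uncompressed) block from RFC 1951 so that no compressor is needed.

```rust
/// Adler-32 checksum as defined in RFC 1950.
fn adler32(data: &[u8]) -> u32 {
    const MOD: u32 = 65521; // largest prime below 2^16
    let (mut a, mut b) = (1u32, 0u32);
    for &byte in data {
        a = (a + u32::from(byte)) % MOD;
        b = (b + a) % MOD;
    }
    (b << 16) | a
}

/// Encode `data` as one stored (uncompressed) raw deflate block (RFC 1951).
/// Illustration only: a stored block holds at most 65535 bytes.
fn stored_block(data: &[u8]) -> Vec<u8> {
    assert!(data.len() <= 0xFFFF, "a stored block holds at most 65535 bytes");
    let len = data.len() as u16;
    let mut out = vec![0x01]; // BFINAL = 1, BTYPE = 00 (stored)
    out.extend_from_slice(&len.to_le_bytes()); // LEN, little-endian
    out.extend_from_slice(&(!len).to_le_bytes()); // NLEN, one's complement of LEN
    out.extend_from_slice(data);
    out
}

/// Wrap a raw deflate stream in the zlib container: a 2-byte header in front
/// and the big-endian Adler-32 of the uncompressed data behind it.
fn zlib_wrap(raw_deflate: &[u8], uncompressed: &[u8]) -> Vec<u8> {
    // CMF = 0x78 (deflate, 32 KiB window); FLG = 0x01 (fastest level, no
    // preset dictionary, check bits chosen so 0x7801 is a multiple of 31).
    let mut out = vec![0x78, 0x01];
    out.extend_from_slice(raw_deflate);
    out.extend_from_slice(&adler32(uncompressed).to_be_bytes());
    out
}
```

With `let raw = stored_block(b"Wikipedia");`, the result of `zlib_wrap(&raw, b"Wikipedia")` is exactly 6 bytes longer than `raw`. The checksum has to read every input byte, which is the small extra cost the commit message mentions.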