
build: introduce zopfli compression #10629

@PeterDaveHello

  • Version: any
  • Platform: any Unix OS with a .gz tarball release
  • Subsystem: build

Dear Node maintainers,

I'm not sure whether you have heard of zopfli. I spent a while testing its performance and its impact on the Node release tarballs and want to share the results with you. Would you be willing to accept a PR that uses it?

zopfli is a fully gzip-compatible compression algorithm introduced by Google in 2013. It gives roughly an additional 5% size saving, but compression may be about 80 times slower (decompression performance is not affected). It is meant for compressing static resources such as release tarballs: we compress once and the file is then downloaded a huge number of times, so we can save significant bandwidth and disk space. It is already packaged in the Ubuntu, Debian, and Fedora Linux distributions and in FreeBSD. For details, there are some references:

I took the node-v7.4.0-linux-x64.tar.gz release tarball as an example and compared the time and size impact on my computer between the original gzip -9 and zopfli at different iteration counts
(gzip -9, zopfli --i1, zopfli --i9, zopfli --i19, zopfli --i50).
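
The compatibility claim can be sanity-checked before timing anything: compress with one tool, decompress with plain gzip, and compare against the input. The sketch below does that in POSIX shell. The helper name `roundtrip_ok` and the generated sample file are illustrative assumptions, not part of the Node build, and the zopfli check only runs if a `zopfli` binary is on the PATH.

```shell
#!/bin/sh
# roundtrip_ok COMPRESSOR FILE
# Compress FILE to stdout with COMPRESSOR, decompress with plain gzip,
# and succeed only if the result is byte-identical to FILE.
roundtrip_ok() {
  compressor=$1
  file=$2
  # $compressor is intentionally unquoted so "zopfli --i1 -c" splits into words.
  $compressor "$file" | gzip -dc | cmp -s - "$file"
}

# Hypothetical sample input; a real check would use the release tarball.
sample=$(mktemp)
awk 'BEGIN { for (i = 0; i < 20000; i++) print "line", i }' > "$sample"

if roundtrip_ok "gzip -9 -c" "$sample"; then
  echo "gzip -9: round trip OK"
fi

if command -v zopfli >/dev/null 2>&1; then
  if roundtrip_ok "zopfli --i1 -c" "$sample"; then
    echo "zopfli --i1: round trip OK"
  else
    echo "zopfli --i1: round trip FAILED"
  fi
else
  echo "zopfli not installed, skipping"
fi

rm -f "$sample"
```

Decompressing with gzip rather than with zopfli itself is the point of the check, since end users will only ever run gzip or tar on the download.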

$ time gzip -9 node-v7.4.0-linux-x64.tar

real    0m6.452s
user    0m6.442s
sys     0m0.008s

size: 15537444
$ time zopfli --i1 node-v7.4.0-linux-x64.tar

real    1m49.536s
user    1m49.452s
sys     0m0.052s

node-v7.4.0-linux-x64.tar.gz 15537444 -> 14975152 (96.38%), 16 times slower
$ time zopfli --i9 node-v7.4.0-linux-x64.tar

real    3m44.684s
user    3m44.546s
sys     0m0.068s

node-v7.4.0-linux-x64.tar.gz 15537444 -> 14937034 (96.13%), 34 times slower
$ time zopfli --i19 node-v7.4.0-linux-x64.tar

real    6m11.833s
user    6m11.656s
sys     0m0.068s

node-v7.4.0-linux-x64.tar.gz 15537444 -> 14935402 (96.13%), 56 times slower
$ time zopfli --i50 node-v7.4.0-linux-x64.tar

real    13m49.044s
user    13m48.278s
sys     0m0.064s

node-v7.4.0-linux-x64.tar.gz 15537444 -> 14933583 (96.11%), about 128 times slower

So compression takes at least 16 times longer, about 1 min 43 s more on my computer (E3-1220 V2 @ 3.10GHz CPU with 8GB of DDR3-1333 RAM), and the size is reduced to 96.38% of the gzip -9 result. With more time and iterations the output gets smaller, but the gain does not grow significantly; that is the tradeoff.
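
The slowdown factors can be recomputed from the `real` times in the transcripts above. This is only a small sketch: the helper names `to_seconds` and `slowdown` are made up here, and the timings are copied from the runs above, so the ratios are specific to that one machine.

```shell
#!/bin/sh
LC_ALL=C; export LC_ALL   # keep "." as the decimal separator in awk

# to_seconds 1m49.536s  prints the duration in seconds (109.536)
to_seconds() {
  echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# slowdown BASE OTHER  prints how many times longer OTHER took than BASE
slowdown() {
  awk -v base="$(to_seconds "$1")" -v other="$(to_seconds "$2")" \
    'BEGIN { printf "%.1f\n", other / base }'
}

gzip_real=0m6.452s   # "real" time of gzip -9 from the transcript above

for entry in "--i1 1m49.536s" "--i9 3m44.684s" "--i19 6m11.833s" "--i50 13m49.044s"; do
  set -- $entry   # $1 = zopfli option, $2 = its "real" time
  echo "zopfli $1: $(slowdown "$gzip_real" "$2")x the gzip -9 time"
done
```

The printed ratios are wall-clock time relative to gzip -9; subtracting 1 gives the extra time as a multiple of the gzip run.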

I'm not sure how many downloads each release gets, but we don't release a new version every day (except for nightly builds, where we could disable zopfli and keep using the original gzip). Spending a few more minutes per release to save about 4% on each gz release tarball, which saves bandwidth and disk space on both the Node.js side and the user side, may be worth it.
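
One way the nightly/release split could look in a build script is sketched below. Everything here is hypothetical: the function name `pick_compressor`, the build type names, and the choice of `--i1` are assumptions for illustration and do not reflect how the Node.js release tooling is actually structured.

```shell
#!/bin/sh
# pick_compressor BUILD_TYPE
# Prints the compression command to use for a given build type.
# "release" gets zopfli; nightlies and anything unknown keep plain gzip.
pick_compressor() {
  case "$1" in
    release) echo "zopfli --i1" ;;
    *)       echo "gzip -9" ;;
  esac
}

for kind in nightly release; do
  echo "$kind: $(pick_compressor "$kind")"
done
```

A real build step would also need to account for one behavioral difference: zopfli writes `FILE.gz` and leaves the input in place, while gzip replaces the input with the compressed file.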

Just FYI, I also tested zopfli on all the gz tarballs with different iteration counts; the results are below:

zopfli --i1 size changes:

node-v7.4.0-headers.tar.gz          483170 ->   460281 (95.26%)
node-v7.4.0-darwin-x64.tar.gz     13416624 -> 12893965 (96.10%)
node-v7.4.0-linux-arm64.tar.gz    14711615 -> 14155294 (96.21%)
node-v7.4.0-linux-armv6l.tar.gz   14687810 -> 14147345 (96.32%)
node-v7.4.0-linux-armv7l.tar.gz   14666298 -> 14122855 (96.29%)
node-v7.4.0-linux-ppc64.tar.gz    15675943 -> 14887267 (94.96%)
node-v7.4.0-linux-ppc64le.tar.gz  15335840 -> 14750604 (96.18%)
node-v7.4.0-linux-s390x.tar.gz    15952274 -> 15138093 (94.89%)
node-v7.4.0-linux-x64.tar.gz      15537444 -> 14975152 (96.38%)
node-v7.4.0-linux-x86.tar.gz      14999886 -> 14469933 (96.46%)
node-v7.4.0-sunos-x86.tar.gz      15309353 -> 14751377 (96.35%)
node-v7.4.0.tar.gz                27904025 -> 26594185 (95.30%)

zopfli --i9 size changes:

node-v7.4.0-headers.tar.gz          483170 ->   458036 (94.79%)
node-v7.4.0-darwin-x64.tar.gz     13416624 -> 12848587 (95.76%)
node-v7.4.0-linux-arm64.tar.gz    14711615 -> 14111936 (95.92%)
node-v7.4.0-linux-armv6l.tar.gz   14687810 -> 14102389 (96.01%)
node-v7.4.0-linux-armv7l.tar.gz   14666298 -> 14080710 (96.00%)
node-v7.4.0-linux-ppc64.tar.gz    15675943 -> 14826696 (94.58%)
node-v7.4.0-linux-ppc64le.tar.gz  15335840 -> 14710986 (95.92%)
node-v7.4.0-linux-s390x.tar.gz    15952274 -> 15089968 (94.59%)
node-v7.4.0-linux-x64.tar.gz      15537444 -> 14937034 (96.13%)
node-v7.4.0-linux-x86.tar.gz      14999886 -> 14435642 (96.23%)
node-v7.4.0-sunos-x86.tar.gz      15309353 -> 14712741 (96.10%)
node-v7.4.0.tar.gz                27904025 -> 26472385 (94.86%)

zopfli --i50 size changes:

node-v7.4.0-headers.tar.gz          483170 ->   457820 (94.75%)
node-v7.4.0-darwin-x64.tar.gz     13416624 -> 12843218 (95.72%)
node-v7.4.0-linux-arm64.tar.gz    14711615 -> 14109192 (95.90%)
node-v7.4.0-linux-armv6l.tar.gz   14687810 -> 14096693 (95.97%)
node-v7.4.0-linux-armv7l.tar.gz   14666298 -> 14075986 (95.97%)
node-v7.4.0-linux-ppc64.tar.gz    15675943 -> 14821324 (94.54%)
node-v7.4.0-linux-ppc64le.tar.gz  15335840 -> 14706944 (95.89%)
node-v7.4.0-linux-s390x.tar.gz    15952274 -> 15086037 (94.56%)
node-v7.4.0-linux-x64.tar.gz      15537444 -> 14933583 (96.11%)
node-v7.4.0-linux-x86.tar.gz      14999886 -> 14431547 (96.21%)
node-v7.4.0-sunos-x86.tar.gz      15309353 -> 14703688 (96.04%)
node-v7.4.0.tar.gz                27904025 -> 26461851 (94.83%)
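
To get a per-release total rather than per-file numbers, the rows of any of the tables above can be summed. The following is a rough sketch, assuming the rows are fed in exactly in the `name  orig -> new (pct%)` layout used above; `sum_table` is a made-up helper name, and only two rows from the --i1 table are included as sample input.

```shell
#!/bin/sh
LC_ALL=C; export LC_ALL   # keep "." as the decimal separator in awk

# sum_table: read "name  orig -> new (pct%)" rows on stdin and print
# the summed sizes and the overall ratio in the same layout.
sum_table() {
  awk '$3 == "->" { before += $2; after += $4 }
       END { if (before > 0)
               printf "%.0f -> %.0f (%.2f%%)\n", before, after, 100 * after / before }'
}

# Two sample rows copied from the --i1 table; paste a full table to total it.
sum_table <<'EOF'
node-v7.4.0-headers.tar.gz          483170 ->   460281 (95.26%)
node-v7.4.0-linux-x64.tar.gz      15537444 -> 14975152 (96.38%)
EOF
```

Multiplying the byte difference by a download count would give a bandwidth estimate, but no download figures are given here, so that step is left out.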

What do you guys think?

Thanks for your time :)

Labels: build, feature request