- Version: any
- Platform: any Unix OS with a .gz tarball release
- Subsystem: build
Dear Node maintainers,
I'm not sure whether you have heard of zopfli. I spent a while testing its performance and its impact on the Node release tarballs, and I want to share the results. Would you be willing to accept a PR that uses it?
zopfli is a fully gzip-compatible compression algorithm introduced by Google in 2013. It saves roughly an additional 5% in size but can be about 80 times slower to compress (decompression performance is not affected). It is meant for static resources such as release tarballs: we compress once and the file is downloaded many times, so we can save significant bandwidth and disk space. It is already packaged in the Ubuntu, Debian, and Fedora Linux distros, and also in FreeBSD. For details, there are some references:
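A quick way to check the "fully gzip compatible" claim on any file is to decompress the zopfli output with stock gzip and compare it to the input. The helper below is only a sketch; the name `roundtrip_ok` is made up for illustration.

```shell
#!/bin/sh
# Succeeds only if COMPRESSED.gz passes gzip's integrity test and
# decompresses (with stock gzip) to exactly the bytes of ORIGINAL.
# Usage: roundtrip_ok ORIGINAL COMPRESSED.gz
roundtrip_ok() {
    gzip -t "$2" && gzip -dc "$2" | cmp -s - "$1"
}

# Example (assumes zopfli is installed and writes FILE.gz next to FILE):
#   zopfli --i1 node-v7.4.0-linux-x64.tar
#   roundtrip_ok node-v7.4.0-linux-x64.tar node-v7.4.0-linux-x64.tar.gz && echo ok
```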
I took the node-v7.4.0-linux-x64.tar.gz release tarball as an example and compared the time and size impact on my computer between the original gzip -9 and zopfli at different iteration counts (gzip -9, zopfli --i1, zopfli --i9, zopfli --i19, zopfli --i50):
$ time gzip -9 node-v7.4.0-linux-x64.tar
real 0m6.452s
user 0m6.442s
sys 0m0.008s
size: 15537444
$ time zopfli --i1 node-v7.4.0-linux-x64.tar
real 1m49.536s
user 1m49.452s
sys 0m0.052s
node-v7.4.0-linux-x64.tar.gz 15537444 -> 14975152 (96.38%), 16 times slower
$ time zopfli --i9 node-v7.4.0-linux-x64.tar
real 3m44.684s
user 3m44.546s
sys 0m0.068s
node-v7.4.0-linux-x64.tar.gz 15537444 -> 14937034 (96.13%), 34 times slower
$ time zopfli --i19 node-v7.4.0-linux-x64.tar
real 6m11.833s
user 6m11.656s
sys 0m0.068s
node-v7.4.0-linux-x64.tar.gz 15537444 -> 14935402 (96.13%), 57 times slower
$ time zopfli --i50 node-v7.4.0-linux-x64.tar
real 13m49.044s
user 13m48.278s
sys 0m0.064s
node-v7.4.0-linux-x64.tar.gz 15537444 -> 14933583 (96.11%), 128 times slower
So compression takes at least 16 times as long, about 1 min 43 s more on my computer (E3-1220 V2 @ 3.10GHz CPU with 8GB of DDR3-1333 RAM), and the size is reduced to 96.38% of the original. With more time and iterations the file gets smaller, but the gain does not grow much: going from --i1 to --i50 only moves it from 96.38% to 96.11%. That's the tradeoff.
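For anyone who wants to re-derive the slowdown factors from the `time` output, here is a small sketch. The helper names `to_seconds` and `slowdown` are made up for illustration; the inputs are the `real` values from the runs above.

```shell
#!/bin/sh
# Convert a `time`-style duration such as "1m49.536s" to seconds.
to_seconds() {
    awk -v t="$1" 'BEGIN {
        split(t, p, "m")          # p[1] = minutes, p[2] = "SS.sss" + "s"
        sub(/s$/, "", p[2])       # drop the trailing "s"
        printf "%.3f\n", p[1] * 60 + p[2]
    }'
}

# Print how many times as long NEW took compared with BASE.
# Usage: slowdown BASE NEW   (both in `time` format)
slowdown() {
    awk -v base="$(to_seconds "$1")" -v new="$(to_seconds "$2")" \
        'BEGIN { printf "%.1f\n", new / base }'
}

slowdown 0m6.452s 1m49.536s    # zopfli --i1,  prints 17.0
slowdown 0m6.452s 3m44.684s    # zopfli --i9,  prints 34.8
slowdown 0m6.452s 6m11.833s    # zopfli --i19, prints 57.6
slowdown 0m6.452s 13m49.044s   # zopfli --i50, prints 128.5
```

The ratios are total elapsed time relative to `gzip -9` on the same machine, so they will differ on other hardware.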
I'm not sure how many downloads each release gets. Since we don't release a new version every day (except nightly builds, where we can disable zopfli and keep using the original gzip), giving each release a few more minutes to save about 4% on the gz release tarballs may be worth it: it saves bandwidth and disk space on both the nodejs side and the user side.
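To put the saving in bandwidth terms, a back-of-the-envelope sketch follows. The download count is unknown, so the one million used here is purely a hypothetical placeholder, and `saved_gb` is a made-up helper name.

```shell
#!/bin/sh
# Estimate bandwidth saved, in GB (10^9 bytes), over a number of downloads.
# Usage: saved_gb ORIG_BYTES NEW_BYTES DOWNLOADS
saved_gb() {
    awk -v orig="$1" -v new="$2" -v n="$3" \
        'BEGIN { printf "%.1f\n", (orig - new) * n / 1000000000 }'
}

# linux-x64 tarball at --i1 (sizes from the runs above),
# with a HYPOTHETICAL one million downloads.
saved_gb 15537444 14975152 1000000   # prints 562.3
```

Plugging in real download numbers per release would show whether the extra minutes of compression time pay off.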
Just FYI, I also tested zopfli compression on all the gz tarballs at different iteration counts; the results are below:
zopfli --i1 size changes (bytes, original -> zopfli, % of original size):
node-v7.4.0-headers.tar.gz 483170 -> 460281 (95.26%)
node-v7.4.0-darwin-x64.tar.gz 13416624 -> 12893965 (96.10%)
node-v7.4.0-linux-arm64.tar.gz 14711615 -> 14155294 (96.21%)
node-v7.4.0-linux-armv6l.tar.gz 14687810 -> 14147345 (96.32%)
node-v7.4.0-linux-armv7l.tar.gz 14666298 -> 14122855 (96.29%)
node-v7.4.0-linux-ppc64.tar.gz 15675943 -> 14887267 (94.96%)
node-v7.4.0-linux-ppc64le.tar.gz 15335840 -> 14750604 (96.18%)
node-v7.4.0-linux-s390x.tar.gz 15952274 -> 15138093 (94.89%)
node-v7.4.0-linux-x64.tar.gz 15537444 -> 14975152 (96.38%)
node-v7.4.0-linux-x86.tar.gz 14999886 -> 14469933 (96.46%)
node-v7.4.0-sunos-x86.tar.gz 15309353 -> 14751377 (96.35%)
node-v7.4.0.tar.gz 27904025 -> 26594185 (95.30%)
zopfli --i9 size changes (bytes, original -> zopfli, % of original size):
node-v7.4.0-headers.tar.gz 483170 -> 458036 (94.79%)
node-v7.4.0-darwin-x64.tar.gz 13416624 -> 12848587 (95.76%)
node-v7.4.0-linux-arm64.tar.gz 14711615 -> 14111936 (95.92%)
node-v7.4.0-linux-armv6l.tar.gz 14687810 -> 14102389 (96.01%)
node-v7.4.0-linux-armv7l.tar.gz 14666298 -> 14080710 (96.00%)
node-v7.4.0-linux-ppc64.tar.gz 15675943 -> 14826696 (94.58%)
node-v7.4.0-linux-ppc64le.tar.gz 15335840 -> 14710986 (95.92%)
node-v7.4.0-linux-s390x.tar.gz 15952274 -> 15089968 (94.59%)
node-v7.4.0-linux-x64.tar.gz 15537444 -> 14937034 (96.13%)
node-v7.4.0-linux-x86.tar.gz 14999886 -> 14435642 (96.23%)
node-v7.4.0-sunos-x86.tar.gz 15309353 -> 14712741 (96.10%)
node-v7.4.0.tar.gz 27904025 -> 26472385 (94.86%)
zopfli --i50 size changes (bytes, original -> zopfli, % of original size):
node-v7.4.0-headers.tar.gz 483170 -> 457820 (94.75%)
node-v7.4.0-darwin-x64.tar.gz 13416624 -> 12843218 (95.72%)
node-v7.4.0-linux-armv7l.tar.gz 14666298 -> 14075986 (95.97%)
node-v7.4.0-linux-armv6l.tar.gz 14687810 -> 14096693 (95.97%)
node-v7.4.0-linux-x86.tar.gz 14999886 -> 14431547 (96.21%)
node-v7.4.0-linux-arm64.tar.gz 14711615 -> 14109192 (95.90%)
node-v7.4.0-linux-x64.tar.gz 15537444 -> 14933583 (96.11%)
node-v7.4.0-linux-s390x.tar.gz 15952274 -> 15086037 (94.56%)
node-v7.4.0-linux-ppc64.tar.gz 15675943 -> 14821324 (94.54%)
node-v7.4.0-linux-ppc64le.tar.gz 15335840 -> 14706944 (95.89%)
node-v7.4.0-sunos-x86.tar.gz 15309353 -> 14703688 (96.04%)
node-v7.4.0.tar.gz 27904025 -> 26461851 (94.83%)
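Lists like the ones above could be regenerated with something along these lines. This is a sketch, not the exact script used for the numbers here: `size_line` and `recompress_all` are made-up names, and it assumes zopfli is installed and writes its output as FILE.gz next to the input file.

```shell
#!/bin/sh
# Print "NAME ORIG -> NEW (PCT%)" for two files, sizes in bytes.
# Usage: size_line ORIGINAL.gz RECOMPRESSED.gz
size_line() {
    orig=$(wc -c < "$1")
    new=$(wc -c < "$2")
    awk -v n="$(basename "$1")" -v o="$orig" -v s="$new" \
        'BEGIN { printf "%s %d -> %d (%.2f%%)\n", n, o, s, 100 * s / o }'
}

# Recompress each given .tar.gz with zopfli and print one line per file.
# Usage: recompress_all ITERATIONS FILE.tar.gz...
recompress_all() {
    iterations=$1
    shift
    workdir=$(mktemp -d) || return 1
    for gz in "$@"; do
        tar_path="$workdir/$(basename "${gz%.gz}")"
        gzip -dc "$gz" > "$tar_path" || return 1
        # Assumption: zopfli leaves the input in place and writes "$tar_path.gz".
        zopfli "--i$iterations" "$tar_path" || return 1
        size_line "$gz" "$tar_path.gz"
    done
    rm -rf "$workdir"
}

# Example: recompress_all 1 node-v7.4.0*.tar.gz
```

The original tarballs are left untouched; all recompression happens in a temporary directory.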
What do you guys think?
Thanks for your time :)