Reducing output size

Jakub Arnold edited this page Jun 23, 2014 · 7 revisions

This page covers the options you have for reducing the output JS size from Fay. Before reading this please read the Generated code page which explains what code Fay outputs. We're going to work on a simple file to reduce it to the smallest we can.

The file:

import Prelude

main = print (fib 10) where
  fib :: Int -> Int
  fib 0 = 0
  fib 1 = 1
  fib n = fib (n - 1) + fib (n - 2)

The file is very simple, and optimizing it in reality probably wouldn't be worth our time, but you can apply the same techniques in this article to much larger code with very good results.

So our first compilation command is:

$ fay demo.hs

And we're running:

$ node demo.js
55

What's our current size?

$ cat demo.js | gzip > demo.js.gz
$ cat demo.js | gzip -9 > demo.js.gz.9
$ ls -alh demo.js*
-rw-rw-r-- 1 chris chris  62K Feb  2 13:44 demo.js
-rw-rw-r-- 1 chris chris 8.6K Feb  2 13:48 demo.js.gz
-rw-rw-r-- 1 chris chris 8.4K Feb  2 13:48 demo.js.gz.9

We care about output size for a number of reasons, gzip may or may not be applicatible at the time, code size itself might be important (for small footprint devices), or for pasting a small script, etc. But we're going to just try to whittle this down to nothing (hopefully).

No -p

Obviously, don't use the -p flag if you want small output.

Google Closure compiler

The Google Closure compiler is your first line of attack. Fay is specifically aimed at being compressible by such tools.

$ closure --compilation_level=ADVANCED_OPTIMIZATIONS demo.js > demo.js.closure
$ cat demo.js.closure | gzip > demo.js.closure.gz
$ cat demo.js.closure | gzip -9 > demo.js.closure.gz.9
$ ls demo.js* -alh
-rw-rw-r-- 1 chris chris  62K Feb  2 13:44 demo.js
-rw-rw-r-- 1 chris chris 3.8K Feb  2 13:50 demo.js.closure
-rw-rw-r-- 1 chris chris 1.3K Feb  2 13:50 demo.js.closure.gz
-rw-rw-r-- 1 chris chris 1.3K Feb  2 13:50 demo.js.closure.gz.9
-rw-rw-r-- 1 chris chris 8.6K Feb  2 13:48 demo.js.gz
-rw-rw-r-- 1 chris chris 8.4K Feb  2 13:48 demo.js.gz.9

So straight away demo.js.closure has reduced the 62K of demo.js to a mere 3.8K. That's because demo.js contains Prelude and a bunch of code in runtime, none of which is actually used. So Closure strips them out and inlines quite a lot of stuff, generally it produces smaller—and in some cases, faster—code.

The gzip'd version demo.js.closure.gz has been further reduced to 1.3K, and gzip'ing at -9 (highest level of compression) produces no results.

Optimization

Optimization can (thankfully) produce smaller code after Closure:

$ fay demo.hs -O

The output is 25K larger than it was without optimizations:

$ ls demo.js -alh
-rw-rw-r-- 1 chris chris 87K Feb  2 13:54 demo.js

But Closure is now able to reduce the file to 3.7K:

$ ls demo.js.closure -alh
-rw-rw-r-- 1 chris chris 3.7K Feb  2 13:55 demo.js.closure

Optimization gains like this shouldn't be too high, I don't believe, but it's worth taking into consideration.

The gzipped version is still 1.3K, so we saved nothing on gzip.

Don't enable user-generated type serialization

The print function is defined like this:

print :: Automatic a -> Fay ()
print = ffi "(function(x) { if (console && console.log) console.log(x) })(%1)"

The automatic serializer code in the runtime is large. But if we're not using polymorphic values with our FFI functions, we can avoid this all together:

showInt :: Int -> String
showInt = ffi "%1+''"

Before we make that change in the file, let's check the size of the file when we import FFI (which brings with it some data types):

import FFI
import Prelude
main = print (fib 10) where
  fib :: Int -> Int
  fib 0 = 0
  fib 1 = 1
  fib n = fib (n - 1) + fib (n - 2)

And we see that it increases our base size to 4.2K.

$ fay demo.hs -O
$ closure --compilation_level=ADVANCED_OPTIMIZATIONS demo.js > demo.js.closure
$ ls demo.js* -alh
-rw-rw-r-- 1 chris chris  89K Feb  2 14:07 demo.js
-rw-rw-r-- 1 chris chris 4.2K Feb  2 14:08 demo.js.closure

This is because user-defined types generate a "dispatcher" used to serialize them generically with the Automatic type. If we're not using Automatic anywhere, we can disable the dispatcher generation:

$ fay demo.hs -O --no-dispatcher

And then we see that this significantly reduces the output size:

$ closure --compilation_level=ADVANCED_OPTIMIZATIONS demo.js > demo.js.closure
$ ls demo.js* -alh
-rw-rw-r-- 1 chris chris  84K Feb  2 14:10 demo.js
-rw-rw-r-- 1 chris chris 2.6K Feb  2 14:10 demo.js.closure

But we're stil using the Automatic feature because of print:

$ node demo.js
demo.js:225
    jsObj = Fay$$fayToJsUserDefined(type,fayObj);

So let's replace print with our showInt:

import FFI
import Prelude
main = putStrLn (showInt (fib 10)) where
  fib :: Int -> Int
  fib 0 = 0
  fib 1 = 1
  fib n = fib (n - 1) + fib (n - 2)
showInt :: Int -> String
showInt = ffi "%1+''"

And we're okay:

$ node demo.js
55

We're also now down to 2.7K in the closure output, and 1020 bytes in the gzip output:

$ cat demo.js.closure | gzip > demo.js.closure.gz
$ cat demo.js.closure | gzip -9 > demo.js.closure.gz.9
$ ls demo.js* -alh
-rw-rw-r-- 1 chris chris  84K Feb  2 14:13 demo.js
-rw-rw-r-- 1 chris chris 2.7K Feb  2 14:13 demo.js.closure
-rw-rw-r-- 1 chris chris 1020 Feb  2 14:14 demo.js.closure.gz
-rw-rw-r-- 1 chris chris 1018 Feb  2 14:14 demo.js.closure.gz.9

Only export used things

If we only export main, we get another substantial saving, because showInt isn't exported:

module Main (main) where
import FFI
import Prelude
main = putStrLn (showInt (fib 10)) where
  fib :: Int -> Int
  fib 0 = 0
  fib 1 = 1
  fib n = fib (n - 1) + fib (n - 2)
showInt :: Int -> String
showInt = ffi "%1+''"

However, in our tiny case, we gain nothing. But in other, more normal-sized code-bases, this saves you a lot. This is the more serious kind of output size benefit you will get in real apps.

Don't export the built-ins

If we're not using the built-ins, let's not export them. This allows Closure to inline even further:

$ fay demo.hs -O --no-dispatcher --no-builtins
$ closure --compilation_level=ADVANCED_OPTIMIZATIONS demo.js > demo.js.closure
$ cat demo.js.closure | gzip -9 > demo.js.closure.gz.9
$ cat demo.js.closure | gzip > demo.js.closure.gz
$ ls demo.js* -alh
-rw-rw-r-- 1 chris chris  85K Feb  2 14:20 demo.js
-rw-rw-r-- 1 chris chris  807 Feb  2 14:24 demo.js.closure
-rw-rw-r-- 1 chris chris  374 Feb  2 14:20 demo.js.closure.gz
-rw-rw-r-- 1 chris chris  373 Feb  2 14:20 demo.js.closure.gz.9

Of course, we're really down to super micro optimization at this point. But if you want to embed a little JS script somewhere or that you want people to be able to copy/paste, these kind of things can count.

Bring locals to the top-level

Bringing local definitions to the top-level can help Closure to inline away code.

module Main (main) where
import FFI
import Prelude
main = putStrLn (showInt (fib 10))
fib :: Int -> Int
fib 0 = 0
fib 1 = 1
fib n = fib (n - 1) + fib (n - 2)
showInt :: Int -> String
showInt = ffi "%1+''"

Yielding:

-rw-rw-r-- 1 chris chris  798 Feb  2 14:20 demo.js.closure

Again, this can be considered a micro-optimization.

Final output:

The whole program is here, you can cat | node and paste it or in your browser console:

var i=new function(){function f(a){return new b(function(){var c;if(0===d(a))c=0
;else if(1===d(a))c=1;else{var j=f(g(a,1)),e=f(g(a,2));c=new b(function(){
return d(j)+d(e)})}return c})}function d(a,c){for(;a instanceof b;)a=a.f(c);return a}
function b(a){this.a=!1;this.value=a}function k(){this.value=void 0}function h(a,
c){this.d=a;this.e=c}function g(a,c){return new b(function(){return d(a)-d(c)})}
b.prototype.f=function(a){return a?this.value():this.a?this.value:(this.value=
this.value(),this.a=!0,this.value)};this.b=new b(function(){var a=f(10),c=new b(
function(){for(var c=d(a)+"",e=null,b=c.length-1;0<=b;b--)e=new h(c[b],e);return e
});return new b(function(){for(var a=c,b="",a=d(a);a instanceof h;)b+=a.d,a=d(a.
e);console.log(b);return new k})});this.c=d};i.c(i.b);

Dissection

We can dissect it further if we want. This is addition and subtraction. Note the d is thunk forcing. With strictness analysis this wouldn't be necessary.

c=new b(function(){return d(j)+d(e)})}return c})}
function g(a,c){return new b(function(){return d(a)-d(c)})}

This is the thunk forcer loop:

function d(a,c){for(;a instanceof b;)a=a.f(c);return a}

And this is the actual thunk forcer:

b.prototype.f=function(a){return a?this.value():this.a?this.value:(this.value=
this.value(),this.a=!0,this.value)};

This makes a thunk:

function b(a){this.a=!1;this.value=a}

The rest is code for serialization to/from JS strings (for print = putStrLn .show), and i.c(i.b); is “force the main thunk”. Finally, the first line is:

function f(a){return new b(function(){var c;if(0===d(a))c=0
;else if(1===d(a))c=1;else{var j=f(g(a,1)),e=f(g(a,2))

which is the fib function. As you can see, it uses the lazy addition and lazy subtraction functions and it itself is lazy. If Fay did some strictness analysis to realise that none of the number operations in here are non-strict, then the code wouuuld look like a native JavaScript implementation, using native - and + operations, Closure would determine that the (+) and - are no longer needed, and this code would be shorter still.