Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Uncurrying #3713

Closed
chrisdone opened this issue Oct 30, 2020 · 14 comments
Closed

Uncurrying #3713

chrisdone opened this issue Oct 30, 2020 · 14 comments
Labels
feat Feature Request

Comments

@chrisdone
Copy link

One thing that functional languages like PureScript do is generate curried functions. This appears to be a blocker for Closure. For example,

(function(){

  var Show_string = { show: function(s){ return s; } };

  function print(d,v){
    console.log(d.show(v));
  }

  function greet(d,v){
    return print(d,v + "!");
  }

  return greet(Show_string, "Mary");
})();

Closure outputs,

console.log("Mary!");

Excellent result. It did pretty straight-forward transformations here. Including removing the overhead of generic functions, which are achieved via dictionary (argument) passing in Haskell-like languages. However, in languages like Haskell/PureScript, functions are all single-argument chains of functions, like this:

(function(){

  var Show_string = { show: function(s){ return s; } };

  function print(d) {
    return function(v){
      console.log(d.show(v));
    }
  }

  function greet(d){
    return function(v){
      return print(d)(v + "!");
    }
  }

  return greet(Show_string)("Mary");
})();

Unfortunately, Closure outputs the following:

(function() {
  function c(a) {
    return function(b) {
      console.log(a.show(b));
    };
  }
  return function(a) {
    return function(b) {
      return c(a)(b + "!");
    };
  }({show:function(a) {
    return a;
  }})("Mary");
})();

This means that PureScript's output is both large and slow -- this example is small, but imagine large codebases with nested levels of this.

So it seems that "uncurrying" is not a process that Closure currently does. I'm not sure whether this was decided against, simply because it's not normal in JS to have functions like this, or whether it's an oversight, or whether there's a good reason that makes reducing these in the general case difficult.

Could someone fill me in?

If it's simply not something considered, how difficult would it be to add? Is it something I could implement myself in Closure? I'm discussing with the PureScript community and considering whether to add it to the compiler output, but on the other hand,
this kind of thing feels like Closure's bread and butter, so I'm attempting to do it the Right Way before going off and patching something further up the chain.

@mprobst
Copy link
Contributor

mprobst commented Nov 2, 2020

So it seems that "uncurrying" is not a process that Closure currently does. I'm not sure whether this was decided against, simply because it's not normal in JS to have functions like this, or whether it's an oversight, or whether there's a good reason that makes reducing these in the general case difficult.

I suspect this simply never came up as an important thing to look at, indeed because normal JavaScript doesn't produce code like this.

Can you share what the optimization algorithm you propose would look like? How would you identify the function expressions, and when to inline their contents?

@chrisdone
Copy link
Author

My thinking so far is that whenever there is an uninterrupted sequence of successive function expressions, that is a match. This is a curried function of arity 2.

function f(x){
  return function(y){
    return ...;
    }
}

If the call site is f(1) then nothing special needs to happen. But if the function is fully saturated, i.e. f(1)(2), then we want to produce f(1,2) and go back to the definition and rewrite it to function f(x,y){ ... }. Obviously, this can only happen if all calls to the function are saturated to arity N.

Also consider that when uncurrying a function g e.g. of arity 5, that is called mostly fully saturated g(1,2,3,4,5), apart from sometimes with one final arg partially applied, we could generate a definition currying the last argument, e.g. h = function(arg5){ return g(1,2,3,4,arg5) }, then we call h(1,2,3,4).

I've implemented this in a Haskell-like-to-JS compiler in the past (Fay) and found that increased code size pre-Closure, but decreased code size post-Closure pass. It's also a bit faster and less liable to stack overflow. In particular, type-class generic functions implicitly by the compiler are provided an argument for each class instance (this avoids doing vlookups). Naturally, the larger the call chain, the more argument passing we're doing. With curried functions, that's a lot of nested functions and lots of f(d1)(d2)(arg1)(arg1). Here's an example of compiler output.

PureScript like Haskell generates impure code only within a module boundary, so there's a lot of mileage out of such rewrites.

Please let me know if you think this is something that could be practically added to Closure or whether you think it'd be unlikely to make it in.

@mprobst
Copy link
Contributor

mprobst commented Nov 2, 2020

We'll discuss this and get back to you.

@lauraharker
Copy link
Contributor

lauraharker commented Jan 6, 2021

Some notes:

  • Here's a webservice link showing the lack of optimization even with ADVANCED optimizations. [edited - forgot to add the link]
  • The existing function inliner never inlines functions with both inner functions and local variables/parameters:
    if (NodeUtil.has(block, Node::isFunction, alwaysTrue())) {
    functionState.setHasInnerFunctions(true);
    // If there are inner functions, we can inline into global scope
    // if there are no local vars or named functions.
    // TODO(johnlenz): this can be improved by looking at the possible
    // values for locals. If there are simple values, or constants
    // we could still inline.
    if (!assumeMinimumCapture && hasLocalNames(fnNode)) {
    functionState.disallowInlining();
    }
    }
    }
    . Unfortunately that would include all curried functions. @concavelenz - do you recall the TODO in there?

@chrisdone
Copy link
Author

Thanks for the update @lauraharker. 😄 🎉

I added some thoughts below as concerns that PS programmers have, and why I think they'll be covered already if we follow straight-forward rules.


Nullary functions

I know this may be obvious, especially to the Closure Compiler maintainers, but I'll note it anyway: PureScript uses function (){ .. } as a way to delay evaluation. It may be tempting to reduce that down, too, but it'd cause different semantics, e.g.

function print(s){
  return function(){
    console.log(s);
  }
}

then these are strung together via

when(x>5,then(print("hi"),then(print("world"),pure(123)))

and the then will call each action as action() to force evaluation. The when is a function that'll only run the composed action of then(.., then(.., ..)) if the bool matches.

Removing the intermediate function(){..} would change the semantics here and even break performance, as the actions would all run immediately.

Of course, following "only reduce it if fully saturated" would not remove such nullary functions anyway.

Verdict: no problem.

Intermediate results

Others in the PS community have pointed out e.g. func x = let thing = bigCalculation(x) in \y -> go y thing. would compile to something like

function func(x){
  let thing = bigCalculation(x);
  return function(y){
    return go(y)(thing);
  }
}

The motivation being that you might use func in map, e.g. map(func(blah), [1,2,3,...]) - you only want bigCalculation to happen once. Plus, it doesn't satisfy what I wrote above in

uninterrupted sequence of successive

As this sequence is interrupted by let thing ... Only successive return function(){ <repeat> } would be a valid chain. In this example, I personally believe this is an exception to the rule. We do it sometimes, but it's a conscious decision to interrupt the function arguments to basically cache a value.

In other words, it's a concern PS programmers have, but it wouldn't come up anyway based on this thread.

Verdict: no problem.

@nreid260 nreid260 added the feat Feature Request label Jan 13, 2021
@nreid260
Copy link
Contributor

Unfortunately, given the body of code Closure is generally used on and the technical cost of this idea, we don't have any plans to implement it in the near future.

@chrisdone
Copy link
Author

Thanks for the update @nreid260; I appreciate the frankness and quick turn around on this reply. Pleasure interacting with the team.

It was worth asking. I may take a shot at it sometime in Haskell as a basic rename-and-then-uncurry step. But don’t let that stop anyone else trying; my time budget is presently allocated elsewhere!

@krk
Copy link

krk commented Dec 21, 2021

Tried setting assumeClosuresOnlyCaptureReferences = true and it worked:

On commit 2809700, diff:

diff --git a/src/com/google/javascript/jscomp/CompilationLevel.java b/src/com/google/javascript/jscomp/CompilationLevel.java
index 02ec99aee..0211a6cc1 100644
--- a/src/com/google/javascript/jscomp/CompilationLevel.java
+++ b/src/com/google/javascript/jscomp/CompilationLevel.java
@@ -181,7 +181,7 @@ public enum CompilationLevel {
     options.setSmartNameRemoval(true);
     options.setInlineConstantVars(true);
     options.setInlineFunctions(Reach.ALL);
-    options.setAssumeClosuresOnlyCaptureReferences(false);
+    options.setAssumeClosuresOnlyCaptureReferences(true);
     options.setInlineVariables(Reach.ALL);
     options.setComputeFunctionSideEffects(true);
     options.setAssumeStrictThis(true);

Command:

java -jar bazel-bin/compiler_unshaded_deploy.jar -O ADVANCED --js ps.js --js_output_file minps.js

Input:

(function(){

  var Show_string = { show: function(s){ return s; } };

  function print(d) {
    return function(v){
      console.log(d.show(v));
    }
  }

  function greet(d){
    return function(v){
      return print(d)(v + "!");
    }
  }

  return greet(Show_string)("Mary");
})();

Output:

console.log("Mary!");

@chrisdone
Copy link
Author

Thanks a bunch @krk, that's a highly encouraging tip! I'll try it.

@chrisdone
Copy link
Author

😞 I'm not able to reproduce your result @krk with this Dockerfile https://gist.github.com/chrisdone/52124977ff5b6833558544f977423f0c and your patch on commit 2809700cd Run InferConsts as part of typechecking integration tests.

I made a docker image with that Dockerfile and ran

bazelisk build //:compiler_unshaded_deploy.jar

and then

java -jar bazel-bin/compiler_unshaded_deploy.jar -O ADVANCED --js demo.js --js_output_file /dev/stdout

The output is:

(function(){function c(a){return function(b){console.log(a.show(b))}}return function(a){return function(b){return c(a)(b+"!")}}({show:function(a){return a}})("Mary")})();

Screenshot from 2021-12-29 15-34-26

Any idea what's different between your and my setup?

@krk
Copy link

krk commented Dec 31, 2021

@chrisdone, I have created a Dockerfile that demonstrates the patch at https://gist.github.com/krk/01355fedc9df9dd639498e23b6fa0380

# Expected output:
# Vanilla:
# (function(){function c(a){return function(b){console.log(a.show(b))}}return function(a){return function(b){return c(a)(b+"!")}}({show:function(a){return a}})("Mary")})();
# 
# Patched:
# console.log("Mary!");

@chrisdone
Copy link
Author

chrisdone commented Dec 31, 2021

Thanks a bunch @krk! I’ve eyeballed the Dockerfile and can’t see where my attempt diverged from yours, but no matter, I’m sure it was something subtle and trivial.

Happy new year! 🎊

EDIT: I was able to reproduce the result. ✔️

@sd-yip
Copy link

sd-yip commented Feb 2, 2022

Hi @chrisdone, I am also a PureScript user looking for better handling on minification. Glad to see this thread that enables me to reuse the same findings with @krk.

There is an NPM package that I just published for easy usage:
https://www.npmjs.com/package/google-closure-compiler-acocr

@jeslie0
Copy link

jeslie0 commented Jan 11, 2024

I have packaged up @sd-yip's NPM package into a Nix Flake, which can be found here: https://github.com/jeslie0/closure-compiler-acocr. Hopefully, this is useful to anyone that finds this thread.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feat Feature Request
Projects
None yet
Development

No branches or pull requests

7 participants