Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP

Loading…

Possibly bring back closed over variables in loops. #1804

Closed
jashkenas opened this Issue · 27 comments

8 participants

@jashkenas
Owner

Chatting with @swannodette the other day ... he mentioned that ClojureScript managed to smartly close over loop-local variables, to provide let semantics. We used to have this, but abandoned it due to inconsistencies back when we introduced do.

for item in list
  functions.push ->
    call item

... used to close over the per-iteration value of item. If ClojureScript indeed manages to do this without leaky semantics, perhaps we can follow suit.

@zmthy
Collaborator

Wasn't our problem not that we couldn't implement it properly, but that it broke consistency with Javascript? People were just confused about the magic.

@geraldalewis

I think it's a great idea, especially considering it would solve the issue where loop vars are re-declared within nested functions (and only within nested functions):

-> 
  outerVar = "outer"
  ->
     console.log outerVar # "outerVar"

->
  outerVar = "outer"
  ->
    console.log outerVar # "undefined"
    0 for outerVar in [0]
@satyr
Collaborator

Does ClojureScript have break/continue/return equivalent?

@swannodette

No ClojureScript has no form of non-local control beyond exceptions. Is non-local control support relevant here?

@satyr
Collaborator

Is non-local control support relevant here?

Inconsistency wrt those controls was the reason we gave up the feature before.

@swannodette

Interesting, got an example where it becomes an issue? In what way does break / continue / return matter from inside a fn closed over a loop local?

BTW, what is being suggested is not wrapping the whole body of the loop in a fn to preserve let semantics - only doing so around the fns that are already closing over the loop local.

@satyr
Collaborator

See #959.

@swannodette

The implementation of #959 borks break / continue / return. That's not what I'm suggesting.

@jashkenas
Owner

So, here was the initial bug:

for i in [1, 2]
  setTimeout (-> console.log i), 0
  return if false

The return needs to return out of the enclosing function, so we suppressed the closure wrapping. ... I'm thinking we might be able to get around this by making the generated JavaScript more specific, to just wrap the inner function definition, and nothing more:

var i, _fn, _i, _len, _ref;
_ref = [1, 2];
_fn = function(i) {
  return function() {
    return console.log(i);
  };
};
for (_i = 0, _len = _ref.length; _i < _len; _i++) {
  i = _ref[_i];
  setTimeout(_fn(i), 0);
  if (false) return;
}
@swannodette

Yup, the trick is delimit the scope precisely to the fn that uses the local(s)

@satyr
Collaborator

I've suggested it before. Not sure why it got rejected.

@swannodette

Hmm, I think there's a big problem with this enhancement - breaking the mutability of locals. I consider this awful but in a language with near JS semantics you are allowed to do the following:

for i in xs
  setTimeout (-> i++), 100
  i++
  setTimeout (-> somef(i++)), 200

This seems like a show stopper to me.

@michaelficarra
Collaborator

I believe my solution from #959 is still valid and addresses all of the concerns mentioned. It's just not very easy to implement. Not impossibly difficult, but definitely not easy.

@swannodette

@michaelficarra yes that looks like it could work.

@geraldalewis

@michaelficarra that's pretty crazy clever. I understand you made the gnarliest example you could, but I wonder if it's in keeping with the "readable output" philosophy?

I think you could make the case that most JS devs wouldn't naturally transliterate

a = (e++ for e in a)

as:

var a, e;
a = (function() {
  var _i, _len, _results;
  _results = [];
  for (_i = 0, _len = a.length; _i < _len; _i++) {
    e = a[_i];
    _results.push(e++);
  }
  return _results;
})();

That's not incredibly readable, and it's pretty easily distinguishable as computer generated. But the difference might be that you're not seeing the semantics hacking to make the comprehension work. _results is one thing; __return = {} feels like we're edging into defining a new runtime.

@jashkenas
Owner

Yep -- it's a bit too nasty as generated JS output ... but merely closure-wrapping the function definitions would make it so there are no return/continue/break special cases.

That said, @swannodette's point about mutating local variables stands. That would be very hard to do right.

@swannodette

@michaelficarra's solution does support mutable locals since it wraps the body of the loop and gives continue / break / return special handling. However __results and __return are not necessary of course. You can accomplish the same thing with exceptions.

While I wouldn't be surprised that my continue / break / return statements were converted to exceptions inside of for - I think most JS devs would.

@michaelficarra
Collaborator

@geraldalewis: That example's not unnatural at all. How else would you write it? That's exactly how I would hand-code that.

@jashkenas: I don't think it's that unnatural; it doesn't feel like a very far step from the example @geraldalewis gave of our current output. Very few of the showcased features would be used in the average case. See the second example in my proposal. Can you show me a simple example that has an unacceptably complex compilation? Also, see @swannodette's note. Mutability within iterations but not between was one of the goals of the proposal.

@swannodette: exceptions were avoided because try/catch is slow. Very slow. It should never be used in control flow.

@swannodette

@michaelficarra, if JS engines continue to optimize the conventional wisdom about the speed of throw will turn out to be false. The most expensive part of exceptions is capturing the stack. On the JVM this only done when creating exception objects. Thus an throw with a preallocated exception on the JVM is equivalent to a longjmp (goto) - http://blogs.oracle.com/jrose/entry/longjumps_considered_inexpensive.

In anycase the cost of creating functions that close over the environment inside the loop thoroughly demolishes the cost of using exceptions for flow control.

http://jsperf.com/exception-speed

@michaelficarra
Collaborator

@swannodette: Thanks for showing me that. I thought there was an inherent inefficiency in try/catch.

@geraldalewis

@michaelficarra I didn't express myself correctly. My point was that there is normally no bending of JS semantics to create a CoffeeScript comprehension. On the other hand, while your solution was really beautiful, it felt like code I see in the Dart runtime (which I really like, by the way). In other words, I would never write that code unless I was working around JavaScript's nature, which seems like a tenet of CoffeeScript. (And I want to reiterate how much I personally dig that solution).

As for how I would transliterate CoffeeScript's comprehensions (in that example I gave in particular):

a = (e++ for e in a)

would be....

for( var i = 0, l = a.length; i < l; i++ )
  a[i] += 1;

or

a = a.map(function(e){return ++e;});

I hope I'm not on some sub-wavelength, and that's sensible code. Did you mean in the generic sense?

@snan

Sorry for the noise, but I’m a new CoffeeScript user and this bit me too. I fixed the issues locally by wrapping it in an extra -> but I really want it to close over variables per each iteration.
Just one more vote, I guess…

@showell

I'm -1 on this, for the simple reason that it would complicate CS's scoping rules. Right now the rules are pretty simple. The only way to limit the scope of a variable is to put it inside a function, correct? The simplicity does lead to a few gotchas:

  1. Obviously, loop variables tend to introduce naming collisions (hence this issue).
  2. Introducing top level variables can lead to accidental naming collisions.

Despite the gotchas, I find that the overall simplicity leads to fewer bugs, because you never have a false sense of security, and the scoping model is very easy to reason about.

If we were to limit the scope of loop variables, then the conceptual complexity of the scoping model naturally increases (unless I'm overlooking an elegant way to express how the model would work under this proposal).

The upsides of this proposal are pretty self-evident, so I'll summarize the objections:

  1. It would make the scoping model more complex.
  2. It would potentially break existing code.
  3. Implementation might be tricky.
  4. The generated JS code might look ugly and be harder to understand.
@jashkenas
Owner

Yep -- closing this because the initial premise (follow ClojureScript's lead) won't work for for us, with break/return/continue and mutable locals. But, @michaelficarra, if you come up with a pull request that you think is an improvement, I'll be looking forward to it.

@jashkenas jashkenas closed this
@geraldalewis

We may be able to implement this by exploiting the fact that catch variables are block scoped.

  • Control flow statements break/return/continue behave as expected
  • Loop variables behave as expected:
    • after exiting the loop, the loop variable reflects its final value within the loop body
    • loop variables can be modified within the loop body*
  • This method works for both Arrays and Objects**

Here's an example:

  var callbacks = [];
  for (var i = 0; i < 3; i++) {
    try{ throw i }
    catch (i){
      // `catch` variables are block scoped
      callbacks.push(function() {
        return console.log(i);
      });
    }
  }
  // code might rely on the loop variable after the loop exits
  // for instance, when finding the index of an element
  i === 3; // true
  for (var j = 0; j <= 2; j++) callbacks[j](); // 0,1,2

* So that loop variables remain mutable, we just add two assignments:

  var callbacks = [];
  for (var i = 0, j = 0; i < 3; i++) {
    try{ throw i }
    catch (i){
      ++i;
      callbacks.push(function() {
        return console.log('  ' + i);
      });
      // update j with the updated index
      j = i;
    }
    // update the index to reflect changes made within the catch body
    i = j;
  }
  i === 4; // true
  for (var k = 0; k <= 2; k++) 
    typeof callbacks[k] === 'function' && callbacks[k](); // 1,3

** Both keys and values in Object iteration can be captured by nesting try/catch blocks.

  var o = {
    key1: 'val 1',
    key2: 'val 2',
    key3: 'val 3'
  };

  var callbacks = [];
  for (var key in o) {
    // both the key and value can be captured by
    // nesting each in their own try..catch block
    try{ throw key}
    catch(key){
      try{ throw o[key] }
      catch(val){
        callbacks.push(function(){
          console.log('  ' + key + ": " + val);
        });
      }
    }
  }
  key === 'key3'; // true
  for (var j = 0; j <= 2; j++) callbacks[j]();
  /*
    key1: val 1
    key2: val 2
    key3: val 3
  */

I'm not sure how cross-platform this behavior is, and I'm sure there are performance implications, though I'm not sure how severe. Just some food for thought.

Here's a gist to illustrate this approach.

@satyr
Collaborator

We may be able to implement this by exploiting the fact that catch variables are block scoped.

JScript says no.

@geraldalewis

Thanks @satyr , thought there might be some cross-browser differences.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.