
Sloppy mode block-scoped self-defining functions don't hoist well #162

Closed
littledan opened this issue Nov 5, 2015 · 79 comments
Labels
needs consensus: This needs committee consensus before it can be eligible to be merged.

Comments

@littledan
Member

There's an idiom called "self-defining functions". It usually looks like this:

var foo = function() {
  foo = ...
}

Some users (e.g., GWT, which has widely deployed compilation output all over the web) write a slightly different version. GWT uses it to implement Java static {} class initialization blocks that run just once, with checks scattered throughout the code to run the initializer if it hasn't run yet:

function foo() {
   foo = ...
}
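For what it's worth, here is a rough sketch of the intended run-once behavior when such a declaration sits at the top level of a sloppy-mode script or function (initFoo is a made-up name, not actual GWT output):

function initFoo() {
  // ...one-time setup work goes here...
  initFoo = function() {}   // later calls become no-ops
}
initFoo()   // runs the setup once and overwrites the binding
initFoo()   // no-op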

Many module systems wrap all the definition code in a try/catch. Google does this internally, and the JS code on Reddit seems to have had similar treatment. If we have a sloppy-mode block-scoped self-defining function as follows:

try {
  function foo() {
    alert("hey!")
    foo = () => {}
  }
} catch (e) { }
foo()
foo()

then (I believe) all browsers pre-ES2015 (IE10 and earlier, WebKit, Firefox, and Chrome) would alert "hey!" just once. This is the long-standing web-compatible intersection, as far as I can tell.

However, ES2015 semantics would say that there are two bindings--the inner lexical one inside the try block, and the outer function-hoisted one. The line redefining foo will redefine the lexical binding, and leave the hoisted one in place. So "hey!" will be alerted twice. This is IE11 and Edge semantics, as the first implementors of something approximating Annex B 3.3.
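A rough desugaring may make the two-binding behavior easier to see; this is only an illustrative sketch (the block-scoped binding is renamed foo_block so that both bindings are visible), not spec text:

var foo                        // function-scoped binding created by hoisting
try {
  // block-scoped binding, created on entering the try block
  let foo_block = function() {
    alert("hey!")
    foo_block = () => {}       // the assignment resolves to the block binding
  }
  foo = foo_block              // Annex B 3.3 copies the block binding into the
                               // var binding when the declaration is evaluated
} catch (e) { }
foo()                          // alerts "hey!" and rewrites only foo_block
foo()                          // still the original function, so "hey!" again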

I don't think these semantics are web-compatible. They break Google Inbox, though Inbox does not currently try to work on IE. More generally, the breakage is just the result of a combination of two techniques that independently work and are independently widely used--self-defining functions (through a rather odd syntax that GWT has been emitting for years), and surrounding function definitions by try/catch (which many module systems seem to do). It is, at the very least, a refactoring hazard to prohibit them together.

I think it's possible that this breaks the web even though IE11 has been shipping for a couple years because

  • GWT is used in a lot of deploy-once intranet applications which might be only used on particular browsers, and might hit only old versions of IE or non-IE browsers.
  • Lots of static initialization blocks could be run multiple times, and the program can still sorta get by, but that doesn't make its behavior correct.

I think the second binding, the lexical one, isn't necessary to get something working that will allow references to lexically scoped variables from hoisted functions and be within the intersection of all browsers' semantics. An alternative semantics suggested by adamk@ is to have just one var-like binding, initialized to undefined, which is set to the function value at the top of the block where the function declaration occurs. There would be no conditionality semantics, and no multiple bindings for the same variable name, so the result would be easier for both users and implementors to understand.
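For comparison, a sketch (again purely illustrative) of that single-binding semantics: one var-like binding, initialized to undefined and assigned at the top of the block containing the declaration:

var foo                        // one var-like binding, initially undefined
try {
  foo = function() {           // set at the top of the block containing the declaration
    alert("hey!")
    foo = () => {}             // rebinds the same (and only) binding
  }
} catch (e) { }
foo()                          // alerts "hey!" once, then replaces foo
foo()                          // calls the replacement; no second alert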

@allenwb
Member

allenwb commented Nov 5, 2015

Wow, I've never seen that idiom before. I wonder if anybody other than GWT uses it.

I'd suggest starting by making the smallest possible accommodation for the specific idiomatic use that has been observed. In this case what we would look for would be:

  1. A try block that only contains a FunctionDeclaration
  2. The function body contains a LHS reference to the name of the function.

Because the block only contains the declaration, its only possible reason for existing is to introduce a hoisted function declaration. It isn't even possible to call the function if it isn't hoisted, so there is actually no need for a block-level lexical binding to be created. You can probably even get away without checking for the second condition.

So, I'd look for this specific pattern and hoist without creating a lexical binding.

However, all the other restrictions on when to apply function-declaration compat fixes should still apply, as we don't want these semantics being used in new code. In particular, you should never do such an implicit hoist across an outer lexical declaration (let, class, etc.) of the same name.

@littledan
Member Author

I think it's fairly widely used. A version of the idiom is described in https://www.safaribooksonline.com/library/view/javascript-patterns/9781449399115/ch04.html under the heading "Self-defining functions", though that uses the first form of the syntax.

I'm a little uncertain about this restriction. It sounds complicated to implement and counter-intuitive for users. Users have always had a single var-style binding for sloppy-mode function declarations across all browsers, in all contexts. We didn't do a "lexical fixup" for var with these hoisting constraints; if you have a var which is declared under a let, it doesn't get an implicit hoist crossing an outer lexical declaration--it just throws. What if we did the same for sloppy-mode function bindings?

I see a few problems with the extremely narrow fix you described:

  • With eval, we can't determine in advance whether a function body contains an LHS reference to the name of the function. By the time the code runs in eval, it would be too late. This would be the first time ES contained such an inconsistency with respect to eval.
  • The try blocks here contain both FunctionDeclarations and other statements, so the formulation is too restrictive. We would probably want to allow arbitrary try blocks.
  • The mental model here for programmers is pretty inconsistent. When they make a little change, all of a sudden the semantics of the variables are different, due to this or that edge case. Do we really need this complexity? With this proposal, there are three ways that function declarations in sloppy mode can be bound: just with a lexical binding, just with a var-style binding, or with both.
  • The var-style binding in Annex B 3.3 is defined to be set by reading the lexical binding and writing it to the var binding. Sounds like different machinery will be needed for this new type of hoisting. Also, to allow functions to be used in a top-down way within a try statement, setting the variable will have to be hoisted to the top of the block. This amounts to all the complexity of @ajklein's proposal, except implemented just for this one edge case.
  • For implementations, it'll create some amount of overhead in the parser to be constantly testing whether function bodies have LHS references to the same function anywhere inside them lexically. There's not currently spec machinery or implementation machinery for doing this. Delaying creating any binding at all will also be somewhat difficult for implementations like V8, which attempt to do a few things in the first parser pass--we'll have to go back and fix up code that's already been parsed, recursively through all inner scopes.

I hope we can come up with simple, consistent semantics that satisfy all the use cases.

@bterlson
Member

bterlson commented Nov 5, 2015

It is notable that IE11 implemented essentially the current Annex B semantics and these are still implemented in Edge as well. This amounts to millions of users, and none of them have reported sites broken due to this issue. It would be great to get a sense of whether this pattern is actually used broadly enough to justify either a) adding a lot of complexity with a targeted fix as @allenwb proposes, or b) eliminating block-scoped function bindings in sloppy mode.

@eplawless

We use this pattern internally at Netflix. It's not very common, but it appears probably at least once in each of our projects. It's used mostly to lazily allocate some resource the first time a function is called.

function calculate(arg) {
  var resource = createExpensiveResource();
  calculate = function(arg) {
    doSomethingWith(resource, arg);
  };
  return calculate(arg);
}

@allenwb
Member

allenwb commented Nov 6, 2015

@eplawless But do you ever use such declarations inside a block (such as a catch block)? There is no problem with this pattern if it occurs at the global script level or the top level of a function.

@eplawless

@allenwb I'd have to search for it; offhand I don't know if we do. I wouldn't be surprised to see it appear within an IIFE, though. Additionally, I know of at least one user-facing project whose build process is just to concatenate the source files and wrap the whole thing in a try/catch.

I think the pattern is reasonably idiomatic, and I would be very surprised if its behavior changed in the way described.

@allenwb
Member

allenwb commented Nov 6, 2015

@littledan

Users have always had a single var-style binding for sloppy-mode function declarations across all browsers, in all contexts.

That is not correct. Haven't you looked at the previous analysis that was done by @bterlson and others? There are only a limited number of legacy usage patterns for block-level function declarations that are actually interoperable across all browsers. Apparently, you have identified another one.

We didn't do a "lexical fixup" for var with these hoisting constraints; if you have a var which is declared under a let, it doesn't get an implicit hoist crossing an outer lexical declaration--it just throws.

Because let declarations didn't exist on the web until ES6, so there can't be any pre-existing code that does this that we need to preserve.

What if we did the same for sloppy-mode function bindings?

We do: it is never legal to hoist a function declaration over a like-named let declaration.

Edge has already demonstrated that the solution given in B.3.3 is implementable. Your new case is actually simpler than the one that B.3.3 already deals with, as it doesn't require two bindings, just the hoisted one. Essentially, for such functions you would revert to whatever you have already done.

@allenwb
Member

allenwb commented Nov 6, 2015

@eplawless appearing in an IIFE wouldn't be a problem, because declarations are never hoisted across function boundaries. It's only { } blocks containing functions that are problematic.
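A quick sketch of why the IIFE case is safe, reusing the calculate example from above (the helper functions are stand-ins): inside the IIFE the declaration sits at function top level, so there is only ever the single var-like binding and the self-assignment keeps working:

(function() {
  function createExpensiveResource() { return {} }   // stand-in for the real helper
  function doSomethingWith(resource, arg) { }        // stand-in
  function calculate(arg) {
    var resource = createExpensiveResource()
    calculate = function(arg) { doSomethingWith(resource, arg) }
    return calculate(arg)
  }
  calculate(1)   // allocates once, then delegates to the replacement
  calculate(2)   // hits the replacement directly; no new allocation
})()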

@eplawless

@allenwb yep, makes sense. It's the try/catch wrapping that worries me, which I can confirm we do on one of the libraries which we deploy to a majority of our devices.

@bterlson
Member

bterlson commented Nov 6, 2015

@eplawless is it possible to find where this pattern is used and give me a URL? (I have a Netflix account if needed :)) If you can find it, the link won't work in Edge or IE11, which seems very bad...

@eplawless

@bterlson sure, let me go speak to the team whose library I'm thinking of. It's served to game consoles, media boxes, and smart TVs, so I should be able to find a CDN link for you (hopefully unminified).

@bterlson
Member

bterlson commented Nov 6, 2015

@eplawless that would be amazing!

@eplawless

@bterlson so, updates:

The team who used to concat their build and wrap in a try/catch has switched to Browserify and does not wrap in a try/catch anymore. In that rewrite they removed the instance of this pattern I was aware of.

Another team does have this pattern but each usage I see happens to be within an IIFE. I want to stress that this is accidental and that at any point a refactor could remove it. I have a minified link for the code that contains this pattern; this is what runs on devices: https://secure.netflix.com/us/tvui/release-2015_11_02-49/13_3/sapphire/720p/gibbon/darwin_main.js

Here's an unminified excerpt:

function shouldLogVerboseRequests() {
    var result = false;
    var showVerboseConsolidatedLogs = util.CurrentUrlParams.get('showVerboseConsolidatedLogs') || '',
        arrToLog = showVerboseConsolidatedLogs.split(','),
        result = arrToLog.indexOf('request') > -1 ||
            arrToLog.indexOf('requests') > -1 ||
            showVerboseConsolidatedLogs === 'true';

    shouldLogVerboseRequests = function memoizedShouldLogVerboseRequests() { return result; }
    return result;
}

I can find 5-10 instances of it offhand, though I haven't been in this codebase for about a year. The behavior change wouldn't make this code incorrect, but it would degrade performance on low end devices (of which we support many) given the high volume of calls to this and similar functions.

That's probably all I can add to the conversation. The other engineers I spoke with were alarmed at the change; I think it is a very surprising behavior.

Edit: fixed URL

@allenwb
Member

allenwb commented Nov 6, 2015

@eplawless

That's probably all I can add to the conversation. The other engineers I spoke with were alarmed at the change; I think it is a very surprising behavior.

Exactly which change is alarming? The ES6 semantics for the above code at the top level of either a script or a function has not changed in any way.

It's only if the code was something like:

if (somePredicate()) {
   function shouldLogVerboseRequests() { ... }
}

would anything be different. The above code is already non-standard and not very interoperable. Have you ever seen that combination?

It would also break for:

try {
   function shouldLogVerboseRequests() { ... }
} catch (e) {};

that is the case that GWT apparently is producing. But, I think that pattern is specific enough that we should be able to account for it.

@eplawless

@allenwb

My knowledge of the spec is very limited, but I'm surprised that you mention this pattern is non-standard. Was the fact that a function statement creates a single binding in its containing scope specified for ES5, or is this undefined (or against-spec) behavior that just happens to work?

I think our alarm comes from the fact that a seemingly orthogonal detail (whether the statement is within a block) can impact the behavior of this pattern. I'd expect to be able to reason about this by saying one of "function statements only create an outer binding" or "function statements have separate inner and outer bindings", but I'm not sure I understand why this should behave differently inside a block.

@allenwb
Member

allenwb commented Nov 6, 2015

@eplawless
Prior to ES6, the ECMAScript specification did not specify the syntactic possibility or semantics of a function declaration occurring within a { } code block. Strictly following the standard, it would be a syntax error.

In practice, browsers did allow function declarations to occur in blocks, but they did not all use the same semantics. There are many ways a function could be defined within a block that historically will not work the same in all browsers.

ES6 defines a standard syntax and semantics for functions in blocks. The semantics is different from what has historically been in any browser (but is totally consistent with the ES6 lexical scoping rules). The ES6 semantics tries to be backwards compatible with most rational and interoperable legacy uses of block-level function declarations. Annex B.3.3 attempts to make some less rational legacy uses that occur interoperably on the web work as well.

Specifically, the main use case that Annex B.3.3 is trying to address is when a function is defined within a block (and so, according to conventional block-scoping rules, should be scoped only to the defining block) but is referenced from outside the block. That's what leads to distinct inner and outer bindings in some situations.
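A minimal sketch of that main use case (sloppy-mode script assumed; someCondition and helper are made-up names):

var someCondition = true
if (someCondition) {
  function helper() { return 1 }
}
// Legacy code calls the function outside the defining block, so Annex B.3.3 also
// creates a var-like binding in the enclosing function or script and copies the
// block-scoped value into it when the declaration is evaluated.
console.log(typeof helper)   // "function" in sloppy mode, thanks to B.3.3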

@eplawless

@allenwb

That makes sense, thanks for taking the time to explain.

@littledan
Member Author

I think the fact that Google Inbox and a previous version of Netflix both ran into this issue without sharing any code is a bit worrying, as far as web compatibility risk goes. If this was shared behavior between all browsers before IE11, I would not feel comfortable shipping the Annex B 3.3 semantics in Google Chrome. I share @eplawless's concern about orthogonal refactoring--until ES2015, it was a valid refactoring to put such function declarations in a try/catch block, across browsers, and multiple module systems do so. Even if block-scoped functions were never defined before, and ES2015 attempts to define them in a way that meets its needs, such a definition might not be web-compatible.

Even if it shipped in IE11, and IE11 reached wide distribution, sites which faced issues with IE11 mode could have flipped on a compatibility mode, and it could have never generated a bug report. Edge explicitly sacrifices compatibility with certain websites, and what it does is not necessarily a usable strategy for Chrome, which needs to be updated to existing customers without breaking too much content.

@allenwb

That is not correct. Haven't you look at the previous analysis that was done by @bterlson, etc.

I've read a version of it from the TC39 minutes. Is there another one I should look at? I didn't see any suggestion that multiple bindings were introduced by a single FunctionDeclaration in any particular browser.

We didn't do a "lexical fixup" for var with these hoisting constraints; if you have a var which is declared under a let, it doesn't get an implicit hoist crossing an outer lexical declaration--it just throws.

Because let declarations didn't exist on the web until ES6, so there can't be any pre-existing code that does this that we need to preserve.

Exactly, and we could do the same for FunctionDeclarations--making them always var-style declarations which throw if there's a surrounding, conflicting let declaration that it would hoist over. Just like var declarations, there is pre-existing code that does this that we need to preserve the behavior of (even if it was operating in a minefield of unspecified and inconsistent semantics).
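For concreteness, a sketch of the var analogy being proposed (illustrative, not spec text):

let x = 1
{
  var x = 2   // SyntaxError today: a var can't be hoisted across the outer let
}
// The suggestion is to give sloppy-mode block-level function declarations the same
// shape: a single var-style binding, plus a SyntaxError if hoisting it would cross
// a conflicting lexical declaration of the same name.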

Edge has already demonstrated that the solution given in B.3.3 is implementable. Your new case is actually simpler than the one that B.3.3 already deals with, as it doesn't require two bindings, just the hoisted one. Essentially, for such functions you would revert to whatever you have already done.

I've implemented the Annex B 3.3 semantics in V8 behind a flag, but what I haven't done is ship it because I'm not convinced I wouldn't break the web. However, I don't see how your semantics would be reverting to anything in particular.

  • First of all, the test has to be specified and implemented somehow. I've pointed out some issues with the test you mentioned. Chiefly, what does it mean exactly to have the function on the LHS, when this may come in an eval? Also I'm worried it'd be slow, and that it'll be complicated within the V8 parser to defer finding out where the function binding is until this test can be done.
  • Second, once there's the test, what are the semantics to fall back to? Is it the old Chrome semantics of choosing the last one overall? Or will FunctionDeclarations still execute some code to overwrite some binding? Or maybe something will happen when entering the block where the FunctionDeclaration occurs? I bet that once we figure out the semantics for the hardest case, the case that this bug calls out, we might be able to use them for everything.

@rossberg
Member

rossberg commented Nov 9, 2015

Right, I'd like to reemphasize what Dan said: the problem is not implementability -- it is already implemented in V8. However, immediately after turning it on on dev channel we got the bug report Dan mentioned, which is why we are very concerned about actually shipping this. (https://code.google.com/p/chromium/issues/detail?id=535836 for reference; it's also worth noting that it actually wasn't easy tracking down the incompatibility discussed here as the root cause of the observed failure.)

The fact that Annex B also is a rather complicated solution is a secondary matter, but worth keeping in mind if we now consider going for an even more complicated solution that is likely going to be just as brittle.

@bterlson added the "needs consensus" label on Nov 9, 2015
@bterlson
Member

bterlson commented Nov 9, 2015

Even if it shipped in IE11, and IE11 reached wide distribution, sites which faced issues with IE11 mode could have flipped on a compatibility mode, and it could have never generated a bug report.

True, but given the volume of bug reports we do get (things I'm sure affect only a handful of websites) and the massive scale of compatibility testing, this doesn't seem very likely to me.

Edge explicitly sacrifices compatibility with certain websites, and what it does is not necessarily a usable strategy for Chrome, which needs to be updated to existing customers without breaking too much content.

I can't speak authoritatively for the Edge team, but I can for Chakra, and this is not a good way to state our compatibility goals. We explicitly broke compatibility for sites that depended on Microsoft-specific extensions. Compatibility with the internet remains a top priority in addition to interoperability and spec conformance. This specific issue is certainly not an issue that I would break sites over, which is why we did the due diligence to locate sites that were broken back in IE11.

I know you really want to discount the IE11/Edge data (or lack thereof), but I don't think you can so easily. We're talking millions of users and billions of internet-hours here.

IMO, changing a standard behavior (one long debated, already agreed to by all parties in this thread, and already implemented) deserves a higher bar than breaking one app. When we found that adding %TypedArray%.prototype.map broke one library and some sites, we didn't remove the method; we fixed the sites, and it hasn't been an issue since. Why won't that work here?

@ajklein
Contributor

ajklein commented Nov 9, 2015

It seems like this is a good agenda topic for next week's meeting, which would bring a broader range of stakeholders into this discussion (I notice that @bterlson just added a "needs consensus" label to this issue, so I'm guessing he feels the same way).

@allenwb
Member

allenwb commented Nov 9, 2015

It isn't clear to me what the googlers are suggesting as a viable alternative to Annex B.3.3.

There is no prior standard (or fully interoperable non-standard) semantics to revert to. Leaving the meaning of function declarations in blocks unspecified as in prior editions doesn't help as browsers will have to implement something anyway.

The most likely alternative semantics (in non-strict code treat block level function declarations as var declarations) would create a serious semantic divergence between strict mode and non-strict mode that would be a pedagogical nightmare and undesirable deterrent to modularizing code or otherwise converting to strict mode.

In agreeing to Annex B.3.3 we all understood that it was only trying to be "good enough" to asymptotically cover most interoperable uses of legacy block-level function declarations that violated block-scoping principles. In other words, it is not (and can never be) perfect backwards compatibility. If necessary we can add other cases, such as the one I sketched for try blocks. I generally agree with Brian in that we should start by seeing if we can simply evangelize changing identified offending uses. However, in this case, since the culprit is a transpiler, we may have second-order compatibility issues that make that impossible. In that case, the way forward (assuming this is really an interoperability issue and not just a Chrome issue) is probably to extend the asymptote.

@bterlson
Member

bterlson commented Nov 9, 2015

@ajklein Yep, added the label but not yet added to the agenda. Feel free to send a PR for that if you want to own it :)

There is no prior standard (or fully interoperable non-standard) semantics to revert to.

I think there is: prior to IE11, there was only a var binding, and so assigning to the function declaration's name would assign to this var binding. We could decide that functions are var-scoped in sloppy mode and let-scoped in strict mode, and this would be a more backwards-compatible semantics. But this has other drawbacks, as you point out. We could add additional one-off cases, like making sloppy functions in try blocks var-scoped, but this also has drawbacks.

@littledan
Member Author

@allenwb Completely agree that we don't want to go back to the ES5 days of unspecified semantics here. I just hope we can find something that satisfies web compatibility concerns without being extremely complicated and hard to understand for users and implementors.

The most likely alternative semantics (in non-strict code treat block level function declarations as var declarations) would create a serious semantic divergence between strict mode and non-strict mode that would be a pedagogical nightmare and undesirable deterrent to modularizing code or otherwise converting to strict mode.

Strict mode and sloppy mode code already have huge differences, for example the different receiver for methods called on primitives. We already have the difference between hoisting and not hoisting out of blocks in sloppy mode--we need it for web compatibility. Whether that hoisting means that there isn't a lexical binding anymore seems like a much smaller, secondary semantic difference that most users won't notice. I don't think it'll be a deterrent to using strict mode; the main deterrent in this area will be the fact that functions aren't hoisted out of blocks. What do you mean about modularizing?

@bterlson Thanks for merging my pull request to add this to the agenda a few days ago at tc39/agendas#122 .

I don't think this is an issue of breaking one app. It's more an issue of combining two features of JavaScript which could previously be used independently: self-defining functions, and putting definitions in a try/catch. There's ample evidence that each one is often used individually. The pattern tends to be used for initialization code. When initialization code runs multiple times, sometimes that doesn't create a user-visible failure, but sometimes it does; this could be another cause of insufficient bug reports. Annex B 3.3 creates a subtle, counter-intuitive failure when these features are combined.

@allenwb
Member

allenwb commented Nov 9, 2015

@bterlson
Actually, I believe the most common semantics prior to ES6 was to treat function declarations in blocks as a hoisted, pre-initialized binding. Hence:

let pred = true;  // somebody using let, they must expect block scoping
if (pred) {
   function foo() { console.log("foo 1"); }
   foo();
} else {
   function foo() { console.log("foo 2"); }
   foo();
}

amazingly outputs "foo 2" because the last occurrence of function foo provides the initial value.

@allenwb
Member

allenwb commented Nov 9, 2015

@littledan

Completely agree that we don't want to go back to the ES5 days of unspecified semantics here. I just hope we can find something that satisfies web compatibility concerns without being extremely complicated and hard to understand for users and implementors.

I understand why you would hope that, but I don't see why you would expect it, given that many smart people on TC39 put a lot of time (and analysis of code scraped from the web) into trying to find such a thing, and the B.3.3 solution was the best we could come up with.

I suspect that if we had discovered the try-protected function declaration pattern we would have included it as a case handled in B.3.3.

So, since you already have code that implements B.3.3, and you have users with function-in-try code, why don't you add the try-block function case to your implementation and see if it takes care of the problem?

Working code that proves or disproves a proposition is better than endless discussion or wishful thinking.

(BTW, I still don't understand what exceptions the authors of such code expect could occur. Could you post a snippet of some actual GWT-generated code that does this?)

@littledan
Member Author

@allenwb I could try it out, if you could offer an implementable, well-defined refinement of "The function body contains a LHS reference to the name of the function." I still have concerns about complexity though, and my current implementation relies, like the spec machinery, on the lexical binding when defining the hoisted binding, so it'll be a nontrivial amount of work to implement your new three-way semantics.

The code is fairly long, but basically, there is one piece of code which generates these self-defining functions with the syntax I included in the first post (to implement static {} blocks in Java), and there's another piece of code that wraps JavaScript in a try/catch as all the code is being concatenated and minified on the way out. Wrapping all definitions in a try/catch sounds a bit weird, since a FunctionDeclaration can't cause a Runtime Error, but apparently lots of programmers do it, even when typing code by hand. It's an idiom with an expiration date, since it won't work with class declarations, but it's still current practice.

In Google Inbox, that static block sets up a WebWorker where a lot of the work is done. The result here is a spinning cursor and emails never loading. However, other initialization code is less important and will have less wide-ranging effects. It's important to note that only references outside the try/catch are affected--in Inbox, most of the references are inside it, but some are also outside. (I debugged mostly by viewing the minified source. There's a lot of code and nothing neat to extract from it, as far as I can tell.)
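To make the inside/outside distinction concrete, here is a hedged sketch (setup is a made-up name, not the actual Inbox code) of why only call sites outside the try are affected under ES2015 plus Annex B 3.3:

try {
  function setup() {
    // ...expensive one-time work...
    setup = function() {}   // rewrites only the block-scoped binding
  }
  setup()   // inside the block: sees the block binding, so the rewrite sticks here
  setup()   // inside: no-op, as intended
} catch (e) { }
setup()     // outside: sees the hoisted var binding, which still holds the original,
            // so the expensive setup runs again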

@allenwb
Member

allenwb commented Nov 10, 2015

On Nov 9, 2015, at 4:14 PM, littledan notifications@github.com wrote:

@allenwb https://github.com/allenwb I could try it out, if you could offer an implementable, well-defined refinement of "The function body contains a LHS reference to the name of the function." I still have concerns about complexity though, and my current implementation relies, like the spec machinery, on the lexical binding when defining the hoisted binding, so it'll be a nontrivial amount of work to implement your new three-way semantics.

how about checking if the LHS of an assignment operator AST node is the Identifier that is included in the head of the function declaration?

Or, try leaving that check out. If the block only contains a single item that is a FunctionDeclaration, only create the local binding.

Or look for a try-catch AST node whose try body is a single item that is a FunctionDeclaration. Since FunctionDeclarations never throw dynamically, you can replace the try-catch node with the FunctionDeclaration node.

...

The code is fairly long, but basically, there is one piece of code which generates these self-defining functions with the syntax I included in the first post (to implement static {} blocks in Java), and there's another piece of code that wraps JavaScript in a try/catch as all the code is being concatenated and minified on the way out. Wrapping all definitions in a try/catch sounds a bit weird, since a FunctionDeclaration can't cause a Runtime Error, but apparently lots of programmers do it, even when typing code by hand. It's an idiom with an expiration date, since it won't work with class declarations, but it's still current practice.

Like I mentioned above, if it is a try-catch that wraps just a single function declaration, you should be able to just eliminate the try-catch. That should handle it for hand-written code.

I’m most interested in understanding exactly what GWT was generating because a lot of deployed GWT-generated code would be hard to get changed.

In Google Inbox, that static block sets up a WebWorker where a lot of the work is done. The result here is a spinning cursor and emails never loading. However, other initialization code is less important and will have less wide-ranging effects. It's important to note that only references outside the try/catch are affected--in Inbox, most of the references are inside it, but some are also outside. (I debugged mostly by viewing the minified source. There's a lot of code and nothing neat to extract from it, as far as I can tell.)

So is Inbox implemented using GWT? Since it is a single app and one that Google controls, I’d think you should be able to evangelize getting this fixed.

Allen

@littledan
Member Author

how about checking if the LHS of an assignment operator AST node is the Identifier that is included in the head of the function declaration?

This would be the first time that ECMAScript makes eval "not work" so drastically--an assignment within an eval to the function's binding would reach only the lexical binding, if there isn't a literal AST node not in an eval which uses the identifier on the LHS. To be concrete:

try {
  function foo() {
    eval("foo = undefined");
  }
} catch (e) {}
foo();
foo()

wouldn't work properly--it wouldn't throw on the second run because there was no literal AST node which met that definition. Not that this is likely to be a big web compat issue, but it would be strange to break eval here after the heroic lengths we took to make it work elsewhere, for example, in ES2016, it works to eval super construct calls within an arrow function within a constructor.

Or, try leaving that check out. If the block only contains a single item that is a FunctionDeclaration, only create the local binding.

That wouldn't hit the typical case, which is a long piece of JavaScript including multiple FunctionDeclarations as well as a bunch of other statements. This is what Google Inbox is hitting.

Or look for a try-catch AST node whose try body is a single item that is a FunctionDeclaration. Since FunctionDeclarations never throw dynamically, you can replace the try-catch node with the FunctionDeclaration node.

Again, a single item doesn't cut it.

Maybe we could say that, in general, FunctionDeclarations which are inside try blocks are hoisted out.

I’m most interested in understanding exactly what GWT was generating because a lot of deployed GWT-generated code would be hard to get changed.

GWT generates these self-defining functions. Further module-wrapping code on the outside takes a large amount of GWT code (but not necessarily the whole program), including multiple FunctionDeclarations and other statements which do have side effects, and puts it in a try block. The try block technique is used both by module systems and hand-written code.

So is Inbox implemented using GWT? Since it is a single app and one that Google controls, I’d think you should be able to evangelize getting this fixed.

Yes, Inbox uses GWT. If it were as simple as fixing Inbox, then I'd definitely go for that solution (indeed, that's how the conversation started between me and the Inbox team). There's an easy way to fix it for newly generated code--just declare your function like this:

var foo = function() {
  foo = ...
}

We have some draft implementation of that change. However, we can't go back and ask everyone on the web to rebuild their GWT, unfortunately; we can only ask Google to do that, so it seems futile.

@eplawless potentially faces this issue without using either GWT or Google's internal module infrastructure, so I don't think this is about any sort of Google-specific weirdness.

@concavelenz

If the changes don't reveal additional issues that need to be fixed: roughly a week to be end-user visible.

On Mon, Dec 14, 2015 at 1:49 PM, Jason Orendorff notifications@github.com wrote:

We've backed out Annex B.3 support for now:
https://hg.mozilla.org/integration/mozilla-inbound/rev/d2bec6ed7b30

@concavelenz https://github.com/concavelenz Any idea how long it will take for this to be fixed? A guess would be fine. I'm not asking you to commit to anything - we aren't blocking anything on it.



@jorendorff
Contributor

Thank you! That's good news.

@MatrixFrog

Inbox should be compatible with the new semantics now.

@syg
Contributor

syg commented Dec 18, 2015

I have relanded Annex B support in Firefox Nightly in light of Tyler's comment.

   shu


@gskachkov

Sorry, possibly this is the wrong thread for the question.
But how should the following script work according to Annex B.3.3.3, Changes to EvalDeclarationInstantiation?

function foo() {
    {
        let f = 20;
        eval(" { function f() { }; } ");
        console.log(f);// I expect - 20 
    }
    console.log(f);// I expect - function f() {};
}
foo();

I've checked in Chrome/Chrome Canary/Firefox/Firefox Nightly/Edge and there are different results.
In Chrome Canary & Firefox Nightly:

1 - 20
2 - Link to window object

In Chrome & Firefox:

Syntax Error redeclaration of f variable

In Edge:

1 - 20
2 - function () {}

@syg
Contributor

syg commented Oct 6, 2016

'f' in the 2nd console.log(f) should be undefined.

Annex B.3.3.3 for EvalDeclarationInstantiation says in 1.d.2.ii: "If replacing the FunctionDeclaration f with a VariableStatement that has F as a BindingIdentifier would not produce any Early Errors for body [...]"

You can try to run this program where function f() { } is replaced with var f:

function foo() {
    {
        let f = 20;
        eval(" { var f; } ");
        console.log(f);
    }
    console.log(f);
}
foo();

This throws a redeclaration error, so Annex B.3.3.3 does not synthesize a var binding for the function.

@gskachkov

@syg Thanks for your answer!
Seems I need to read the spec more attentively.

@saambarati

@syg Why would the VariableEnvironment be the scope that holds "let f"? I had assumed that that meant the "var" scope.

@syg
Contributor

syg commented Oct 7, 2016

@saambarati The VariableEnvironment wouldn't hold the "let f". What are you looking at?

@allenwb
Member

allenwb commented Oct 7, 2016

The problem here is that the var f would be hoisted over the block-level let f, and that is always illegal:

14.1.2

It is a Syntax Error if any element of the LexicallyDeclaredNames of FunctionStatementList also occurs in the VarDeclaredNames of FunctionStatementList.

18.2.1.3 5.d.ii.2.d

NOTE: A direct eval will not hoist var declaration over a like-named lexical declaration.

@saambarati

@allenwb The LexicallyDeclaredNames of FunctionStatementList should be the empty list. Am I completely misunderstanding something here?

I still agree that we shouldn't bind to a var variable, since there is a loop in Annex B.3.3 that walks the entire scope chain checking if such a variable with such a name exists. However, I'm not sure how 14.1.2 is relevant here. Can you elaborate?

@saambarati

@syg I guess I'm fundamentally confused about why your example program throws a redeclaration error. Can you elaborate further? Thanks.

@bakkot
Contributor

bakkot commented Oct 7, 2016

@saambarati, B.3.3 contains no scope-walking loops. Rather, it says If replacing the FunctionDeclaration f with a VariableStatement that has F as a BindingIdentifier would not produce any Early Errors [...].

In @gskachkov's program, the Early Error which would be produced is the error in @syg's program, which comes from EvalDeclarationInstantiation 5.d.ii.a.i, which amounts to saying that it is a SyntaxError for an eval to declare a var that conflicts with an outer let.

@syg
Contributor

syg commented Oct 7, 2016

@bakkot Actually, I now think @saambarati is right. My explanation was wrong, in that it's not actually an Early Error. Replacing the function with a var would cause an error in EvalDeclarationInstantiation, but that's not an Early Error in the body of the eval.

I worked through B.3.3.3 again, and I guess it comes down to the loop in step 1.d.ii.

Substep 4 sets bindingExists to true, since there's a let f. Substep 7 doesn't apply then at all because bindingExists is true, so no var gets synthesized or assigned to.

The story in my head was morally right but my actual understanding was wrong. I wonder if we can clarify the spec mechanism better if we lumped in errors raised by the various DeclarationInstantiation metaprocedures into the counterfactual semantics...

@bakkot
Contributor

bakkot commented Oct 7, 2016

Ah, whoops, I made the same error. Sorry, @saambarati. (And B.3.3 does of course contain a scope-walking loop, just not what I was thinking of.)

@syg's example does throw a redeclaration error in EvalDeclarationInstantiation 5.d.ii.a.i and the spec text in B.3.3.1.d.ii does have the effect of preventing hoisting in the cases where that error would be thrown.

I think perhaps a note to that effect in B.3.3 would be the simplest clarification?

@allenwb
Member

allenwb commented Oct 8, 2016

However, I'm not sure how 14.1.2 is relevant here. Can you elaborate?

The early error I quoted, and similar early errors in 13.2.3 and 15.1.1 and a couple of other places, collectively establish the basic var hoisting rule: it is an error to declare a var binding of a name in a context where it would be hoisted over or into a scope that has a lexical binding (let, const, etc.) for that same name.

Normally such hoisting errors are early errors, but eval can dynamically inject declarations into such contexts, so the corresponding error validation has to take place dynamically as part of eval. That's the purpose of step 5 of 18.2.1.3 and the reason that the errors thrown in that step are SyntaxErrors, as if they were being reported as early errors. Another way to look at this is that step 5 is dynamically doing the early error validation that would have been done if the eval code statically occurred in the same context.
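For concreteness, a small sketch of the dynamic case (sloppy mode and a direct eval assumed):

function g() {
  let x = 1
  // The conflict can't be detected statically here, so step 5 of
  // EvalDeclarationInstantiation reports it dynamically, still as a SyntaxError.
  eval("var x;")   // throws a SyntaxError when the eval runs
}
g()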

@saambarati

@allenwb Interesting. I didn't know this. So this is a syntax error?

function foo() {
    {
        let f;
        { var f; }
    }
}

If so, is the section you pointed to what determines this?

@allenwb
Member

allenwb commented Oct 9, 2016

@saambarati yes and yes
plus an understanding of the LexicallyDeclaredNames and VarDeclaredNames abstract operations is essential

@saambarati

@allenwb Thanks for explaining. I see there is some subtlety to how the recursive definitions of VarDeclaredNames, and the like, are defined. I'm going to read over it more closely when I'm at my computer and not on my phone.

@syg So is the consensus that @gskachkov's example program should execute without throwing an exception but it should not synthesize a var?

@syg
Contributor

syg commented Oct 9, 2016

@saambarati Yes, the example program should execute without throwing an exception but it should not synthesize, nor assign to, a var.

@gskachkov

gskachkov commented Oct 27, 2016

One more question: how should the following code behave according to the spec?

function foo() {
    function boo() { debug(a); }
    eval("{ boo()/*1*/; eval('delete a;'); boo();/*2*/ function a() { }  boo(); /*3*/;} ");
}
foo();

As I understand the spec, it should return the following results depending on where we put the boo() call:

1 --> undefined
2 --> ReferenceError: Can't find variable: a
3 --> ReferenceError: Can't find variable: a

@allenwb
Member

allenwb commented Oct 27, 2016

A bit of analysis.

  1. Placing delete a; in an eval shouldn’t change the semantics of delete so the inner eval can be eliminated. Is there a reason you had the inner eval?

  2. Block-level bindings are never deletable and are created upon entry to the block. See https://tc39.github.io/ecma262/#sec-blockdeclarationinstantiation. The entire body of the outer eval is a block, so the block-level binding for a is not deletable. So, delete a does nothing other than return false. Removing it entirely should change nothing.

  3. The result should be exactly as if the block was inlined (without the delete) rather than evaled:

function foo() {
    function boo() { debug(a); }
    { boo() /* 1 */; boo() /* 2 */; function a() { } boo(); /* 3 */ }
}
foo();

  4. But, is this actually legacy (pre-ES2015) interoperable code? Did closure capture of eval-created bindings work interoperably among browsers prior to ES2015? If not, then we shouldn't be trying to make this work.

Allen

@gskachkov

  1. Placing delete a; in an eval shouldn’t change the semantics of delete so the inner eval can be eliminated. Is there a reason you had the inner eval?

Yeah, you are right, it does not matter if delete a is in an inner eval; however:

Block-level bindings are never deletable and are created upon entry to the block. See

It confused me a little bit, because I thought it was deletable according to B.3.3.3.d.ii.7.a.ii.ii.i:

i. Perform ! varEnvRec.CreateMutableBinding(F, true).

If D is true, record that the newly created binding may be deleted by a subsequent DeleteBinding call.

and according to B.3.3.3.d.ii.7.b.v & B.3.3.3.d.ii.7.b.vi we cannot set the binding, so when we try to access a inside the boo function we need to throw an exception.

So I decided that it should work as if there were a var instead of a function:

function foo() {
    function boo() { console.log(a); }
    eval("{ boo()/* 1 */; delete a;  boo()/* 2 */; var a = 10; boo(); /* 3 */;} ");
 }
 foo();

returns:

1 --> undefined
2 --> ReferenceError: Can't find variable: a
3 --> 10 // It takes the value not from foo's function scope but from the global scope, where a is created and assigned by the expression a = 10; in the case of a function we do not create a variable in the global scope and do not bind the function there

@allenwb
Member

allenwb commented Oct 27, 2016

On Oct 27, 2016, at 10:37 AM, Alexander Skachkov notifications@github.com wrote:

  1. Placing delete a; in an eval shouldn’t change the semantics of delete so the inner eval can be eliminated. Is there a reason you had the inner eval?

Yeah, you are right, it does not matter if delete a is in an inner eval; however:

Block-level bindings are never deletable and are created upon entry to the block. See

It confused me a little bit, because I thought it was deletable according to B.3.3.3.d.ii.7.a.ii.ii.i:

i. Perform ! varEnvRec.CreateMutableBinding(F, true).

This is creating the function-level var binding that is provided for backwards compat. But the function declaration in the block also creates a block-level lexical binding when block declaration instantiation is performed. That binding is created using false as the second argument. So, the block-level binding is not deletable.

If D is true, record that the newly created binding may be deleted by a subsequent DeleteBinding call.

The delete a operation resolves a to the undeletable block-level binding. So it does nothing. It does not touch the function-level compatibility var binding created in 7.a.ii.ii.i

and according to B.3.3.3.d.ii.7.b.v & B.3.3.3.d.ii.7.b.vi we cannot set the binding, so when we try to access a inside the boo function we need to throw an exception.

These work, because both bindings are still there.

So I decided that it should work as if there were a var instead of a function:

var instead of function is used to determine whether the compatibility semantics should be applied. But that does not mean that the function declaration is given var semantics. Within its enclosing block, the function declaration is bound according to the normal BlockDeclarationInstantiation rules.

@saambarati

Allen, I have a similar program whose expected behavior I'm curious about. It looks like the code that binds the function as the execution step of the declaration expects an outer binding to exist. Why would a binding exist in this program:

function delete_a() { delete a; }
function get_a() { return a; }
eval(`
    get_a(); // undefined;
    {
           get_a(); // undefined;
           delete_a();
           function a() { }; // what does this do?
           get_a(); // what does this do?
    }
`)

The reason I think the spec might be wrong is:
Before any of the body of the eval executes, we create a binding for a that is deletable. As part of the execution of the eval, before we execute the function declaration and bind to the outer binding of a, we delete the outer binding. However, the spec seems to assume the outer binding will always be there. Why is the binding guaranteed to still exist?

I could be misunderstanding what's happening in the spec, so an analysis of why I might be wrong would be helpful.

@gskachkov
Copy link

@saambarati Good question.
I think the last get_a() should return function a() {}.
Here is how I understand the spec: during binding we invoke SetMutableBinding:

B.3.3.3.d.ii.7.b.vi Perform ? genvRec.SetMutableBinding(F, fobj, false).

That is described in 8.1.1.1.5: in step 8.1.1.1.5.2 we check if the binding exists and, if it was removed, recreate it in step 8.1.1.1.5.2.b.

@allenwb
Member

allenwb commented Nov 13, 2016

Well, this is another case that doesn't fall into the set of legacy interoperable FiB scenarios that Annex B.3.3 is trying to support. So, for cases like this there isn't any legacy semantics we are trying to preserve. Instead, we should be trying to minimize observable B.3.3 deviations from the normal (ie, without B.3.3) ES2015+ semantics.

In this case, I think minimizing deviation means that B.3.3.3 step 2.d.ii.1.b.iii.vi should be guarded by genvRec.HasBinding(F). That will prevent the global binding from being recreated after the delete.

The guard probably isn't needed for the non-eval cases because in those cases the extra binding isn't deletable.

@gskachkov

gskachkov commented Feb 7, 2017

Sorry if I missed it, but just to double-check that I correctly understand how hoisting of a function declaration should behave in the case of the 'with' keyword. I'm expecting that it will behave the same way as a variable declaration in eval. Could you please correct me if I'm wrong?

function foo(scope) {
  with(scope) {
    eval('{ function y() {} }');
    console.log(y);
  }
  console.log(y);
  console.log(scope.y);
}

//1
foo({});
// function y() {}
// function y() {}
// undefined
foo({y:10});
// function y() {}
// undefined
// function y() {}
foo(10);
// function y() {}
// function y() {}
// undefined
var v = 10;
v.y = 20;
foo(v);
// function y() {}
// function y() {}
// undefined
