Skip to content
This repository has been archived by the owner on Sep 8, 2021. It is now read-only.

Different parsing errors between Babel and V8: await as identifier #43

Closed
JLHwung opened this issue Apr 1, 2021 · 31 comments · Fixed by #46
Closed

Different parsing errors between Babel and V8: await as identifier #43

JLHwung opened this issue Apr 1, 2021 · 31 comments · Fixed by #46

Comments

@JLHwung
Copy link
Contributor

JLHwung commented Apr 1, 2021

Related versions: Babel 7.13.9 V8 9.1.216

await as BindingIdentifier of FunctionDeclaration

C = class { static { function await() {} } };

Babel throws, V8 passes.

According to #27 (comment), a function boundary includes parameters and body. So the binding identifier await should throw here.

await as IdentifierReference in ConciseBody of ArrowFunctionExpression

C = class { static { () => await } };

Babel throws, V8 passes.

C = class { static { () => arguments } }

Babel throws, V8 throws.

In the spec we have same treatment for IdentifierReference when its SV is arguments or await, so they should either both throw or both pass. I am not clear whether function boundary includes an arrow function. But since this already throw

C = class { static p = () => arguments }

I am good to throw for C = class { static { () => arguments } }.

I don't have strong opinion for C = class { static { () => await } };, I tend to forbid it even though we allow C = class { static p = () => await }.

/cc @syg since you implemented static block features.

@ljharb
Copy link
Member

ljharb commented Apr 1, 2021

This perhaps should be a v8 bug?

@ljharb ljharb closed this as completed Apr 1, 2021
@ljharb ljharb reopened this Apr 1, 2021
@JLHwung
Copy link
Contributor Author

JLHwung commented Apr 1, 2021

I posted here because

I am not sure if C = class { static { () => await } }; should throw, specifically whether function boundary applies to arrow function, too. If C = class { static { () => await } }; should pass, then so as for C = class { static { () => arguments } };.

But since C = class { static p = () => arguments } already throws, allowing C = class { static { () => arguments } }; will be inconsistent.

If we throw C = class { static { () => arguments } }; but allow C = class { static { () => await } };, the spec needs to be updated to reflect different treatments here.

If we decide both should throw, we should update the definition of function boundaries to rule out arrow functions.

@ljharb
Copy link
Member

ljharb commented Apr 1, 2021

an arrow function does not bind arguments, and so should not affect it. It does, however, matter if it’s an async arrow or not, for await - so I’d not expect the two to be consistent, necessarily.

@JLHwung
Copy link
Contributor Author

JLHwung commented Apr 1, 2021

Ah there is an SE on 5.1:

It is a Syntax Error if ContainsArguments of ClassStaticBlockStatementList is true.

I think current behaviour is good. C = class { static { () => await } }; should be allowed.

On #27 (comment), can you add an editor note that function boundary does not include the binding identifier of function declaration? That will clearly mark C = class { static { function await() {} } }; as SE.

@syg
Copy link
Contributor

syg commented Apr 1, 2021

Thanks for the report!

C = class { static { function await() {} } };

I think this is a V8 bug, I'll double check.

C = class { static { () => await } };

This is a Babel bug.

C = class { static p = () => arguments }

This is a spec bug in the spec draft of the class static blocks proposal. It should align with what the class fields is doing, which is using this ContainsArguments abstract operations that can look through arrow functions. Instead, the class static blocks proposal is using the "doesn't cross function boundaries" prose language, which isn't precise enough.

@JLHwung
Copy link
Contributor Author

JLHwung commented Apr 1, 2021

@syg Yeah I agree that C = class { static p = () => arguments } should throw, the spec defines that on 5.1

It is a Syntax Error if ContainsArguments of ClassStaticBlockStatementList is true.

I agree that C = class { static { () => await } }; should not throw. It is a Babel bug.

I have reported another two cases in #27 (comment) where both Babel and V8 throw but it seems to be allowed by the spec.

@syg
Copy link
Contributor

syg commented Apr 1, 2021

C = class { static { function await() {} } };

Actually I take this back. From a the perspective of the Contains Static Semantics, the function binding identifier does cross the function boundary, so this actually should be allowed. @rbuckton what was your intention here?

@ljharb
Copy link
Member

ljharb commented Apr 1, 2021

Given that (async function f() { return function await() {}; }()) works fine, i'd expect (class { static { function await() {} } }) to work as well even if static blocks became async.

@bakkot
Copy link

bakkot commented Apr 1, 2021

(async function f() { return function await() {}; }()) is different from (async function f() { function await() {}; }()) in that in the former case there is no await declared in the scope of the async function, so it's fine, whereas in the latter there would be, so it's a syntax error.

Since C = class { static { function await() {} } }; does declare await in the scope of the static block, as in the second of my two cases, I assume it should be disallowed.

@JLHwung
Copy link
Contributor Author

JLHwung commented Apr 1, 2021

C = class { static p = function await() {} } is allowed, so it will be inconsistent if we disallow C = class { static { function await() {} } }.

The current V8 implementation seems to follow Contains Static Semantics, although the await error was specified differently.

@ljharb
Copy link
Member

ljharb commented Apr 1, 2021

@bakkot aha, good point, (async function f() { function await() {}; }()) is a syntax error, thanks.

@JLHwung the inconsistency already exists then, since (async function f() { function await() {}; }()) throws but (async function f() { (function await() {}); }()) does not.

@bakkot bakkot closed this as completed Apr 1, 2021
@bakkot bakkot reopened this Apr 1, 2021
@bakkot
Copy link

bakkot commented Apr 1, 2021

(sorry, misclick)

@JLHwung
Copy link
Contributor Author

JLHwung commented Apr 1, 2021

@ljharb Oh I was meant to say C = class { static { this.p = function await() {} } } which is, if I wear developers hat, equivalent to C = class { static p = function await() {} }.

From a parser viewpoint, throwing for C = class { static { this.p = function await() {} } } is same as throwing for the one without this.p so I lazily drop this.p = .

@bakkot
Copy link

bakkot commented Apr 1, 2021

From a parser viewpoint, throwing for C = class { static { this.p = function await() {} } } is same as throwing for the one without this.p

That's not true. (async function(){ this.p = function await() {} }) is legal; (async function(){ function await() {} }) is an error. The BindingIdentifier works very differently for functions-as-declarations versus functions-as-expressions, and is subject to different rules.

@rbuckton
Copy link
Collaborator

rbuckton commented Apr 1, 2021

If await were a reserved word like yield, this would be so much easier. Also, the spec today does not clearly define what "function boundary" means.

The goal is to reserve await in all of its meanings inside of a static {} block, allowing us to defer until later whether static {} blocks can participate in top-level-await or in the body of an async function, or whether special syntax like async static {} might be necessary.

The goal is for await to be illegal as a BindingIdentifier or IdentifierReference anywhere it would be illegal inside of an async function or where it could have possibly been treated as an AwaitExpression:

let await;
class C {
  static {
    let await; // illegal, cannot declare a new binding for await
    let { await } = {}; // illegal, cannot declare a new binding for await
    let { await: other } = {}; // legal
    await; // illegal
    await(1); // illegal
    function await() {}; // illegal
    class await {}; // illegal
    ({ await }); // illegal short-hand property reference
    ({ [await]: 1 }); // illegal
    class D {
      await = 1; // legal
      x = await; // legal (initializers have an implicit function boundary)
      [await] = 1; // illegal (computed property names are evaluated outside of a class body
    };
    (function await() {}); // legal, 'await' in function expression name not bound inside of static block
    (class await {}); // legal, 'await' in class expression name not bound inside of static block
    (function () { return await; }); // legal, 'await' is inside of a new function boundary
    (() => await); // legal, 'await' is inside of a new function boundary
    await: // illegal, 'await' cannot be used as a label
      break await; // illegal, 'await' cannot be used as a label
  }
}

@rbuckton
Copy link
Collaborator

rbuckton commented Apr 1, 2021

I mentioned in #27 that another way we could specify this is to parse the body of static {} in +Await and then error if the block Contains AwaitExpression (or for-await-of).

@JLHwung
Copy link
Contributor Author

JLHwung commented Apr 1, 2021

@rbuckton Thanks for a detailed example. In the cases listed above, both Babel and V8 throws

(async function() { (class await {}); })

The other cases match current implementations if static { } is replaced by async function() { }.

Unlike FunctionExpression, the ClassExpression inherits the [Await] parameter, so I think (class await() {}); should be illegal under static block, since it is illegal under async function body.

@rbuckton
Copy link
Collaborator

rbuckton commented Apr 2, 2021

@rbuckton Thanks for a detailed example. In the cases listed above, both Babel and V8 throws

(async function() { (class await {}); })

Interesting. Its likely because of the oddities of how the class name's binding works:

C = { x: 1 }
(class C { [C.x]() {} }); // error: Cannot access 'C' before initialization

Even though computed property names essentially evaluate in the outer scope, we introduce an environment record in between that scope and class element evaluation in which C is declared but not initialized. So if (class await {}) were allowed in an async function, there'd be this odd interaction of async function() { (class await { [await(1)]() {} }) }, where await(1) is actually awaiting the value 1 even though there might be a lexical binding for the name await.

@Kingwl
Copy link
Member

Kingwl commented Apr 2, 2021

Oh my goodness. 🤦

@hax
Copy link
Member

hax commented Apr 2, 2021

We really should simplify the things --- just make the await as reserved word in classes and modules. Allowing await as normal identifier in some places doesn't help developers at all and finally people need pay another efforts to add linter rules to forbid those holes 😅

@bakkot
Copy link

bakkot commented Apr 2, 2021

It is reserved in modules, actually! Alas, not all code is in modules. (I'm not sure why it wasn't reserved in classes; probably to make it easier to refactor from functions?)

Allowing await as normal identifier in some places doesn't help developers at all

Well, it helps anyone who shipped code that used await as an identifier during the two decades when it didn't mean anything in the language. Or, more to the point, any random user of a website running such code.

@rbuckton
Copy link
Collaborator

rbuckton commented Apr 2, 2021

It is reserved in modules, actually!

Only at the top level, IIRC. You can still do function f() { let await = 0; } in a module, because the function body is not parsed in +Await.

@bakkot
Copy link

bakkot commented Apr 2, 2021

No, it's reserved everywhere: there is an Early Error rule for identifiers named "await" when the goal symbol of the syntactic grammar is |Module|.

(This doesn't run into the tricky edge cases you've been facing here because the reservation is comprehensive, meaning you don't have to worry about function boundaries and arrow parameters so on.)

@rbuckton
Copy link
Collaborator

rbuckton commented Apr 2, 2021

I was just looking at that section and completely missed that rule. 😓

@bakkot
Copy link

bakkot commented Apr 2, 2021

Yeah, there's a lot of special cases in the grammar. Not sure I would've been able to find it if I hadn't already known where to look.

@jugglinmike
Copy link

One of the examples above surprised me a bit:

class C {
  static {
    class D {
      x = await; // legal (initializers have an implicit function boundary)
    }
  }
}

I don't object to that specific design decision, but the phrasing of the relevant early error doesn't really convey it:

IdentifierReference : Identifier

  • It is a Syntax Error if the code matched by this production is nested, directly or indirectly (but not crossing function or static initialization block boundaries), within a ClassStaticBlock and the StringValue of Identifier is "arguments" or "await".

The term "function boundary" is not formally defined, so some readers may not recognize that the boundary of the "implicit function" qualifies.

@jugglinmike
Copy link

Although @rbuckton's earlier comment makes the intention clear, there are some more ambiguities in the term "function boundary" which may confuse readers of the current proposed text.

Are function "names" inside the function boundary? The intent is for this to depend on how the bindings interact with the relevant environment records. However, the early error isn't defined in terms of environment records. The interpretation that function "names" are inside the function boundary isn't as clear cut. There is at least one aspect of the specification where the notional boundary does not match the environment records--the interpretation of "strict mode code." Readers may interpret "function boundary" to follow the behavior of strict mode instead of the behavior of environment records.

Are class "names" inside a function boundary? The intent is for this to match the answer for function "names." Readers may not agree that a class definition constitutes a function boundary, though. For one, the constructor method (itself a boundary for the produced function) raises questions about which function boundary the class describes. This is also supported by some of the language in the proposal: "but not crossing function or static initialization block boundaries" would be redundant if class definitions were function boundaries.

Also, it's notoriously complicated to say on which side of the boundary default parameter initializers fall.

It seems to me that there's enough ambiguity here to warrant a formal definition of the term "function boundary." Would you folks agree?

@rbuckton
Copy link
Collaborator

rbuckton commented Jun 5, 2021

I'm looking into the switch to using +Await and forbidding await operations rather than the current function boundary-based checks. It's easy enough to add an early error for AwaitExpression, i.e.:

It is a Syntax Errror if |ClassStaticBlockStatementList| Contains |AwaitExpression| is *true*.

But its a bit harder to check for for await, since it doesn't have its own production. @ljharb, @bakkot any suggestions?

@rbuckton
Copy link
Collaborator

rbuckton commented Jun 5, 2021

Would I add it as a special case to https://tc39.es/ecma262/#sec-static-semantics-contains? Something like this:

ForInOfStatement : `for` `await` `(` LeftHandSideExpression `of` AssignmentExpression `)` Statement
  1. If symbol is AwaitExpression, return true.
  2. If LeftHandSideExpression Contains symbol, return true.
  3. If AssignmentExpression Contains symbol, return true.
  4. Return Statement Contains symbol.
ForInOfStatement : `for` `await` `(` `var` ForBinding `of` AssignmentExpression `)` Statement
  1. If symbol is AwaitExpression, return true.
  2. If ForBinding Contains symbol, return true.
  3. If AssignmentExpression Contains symbol, return true.
  4. Return Statement Contains symbol.
ForInOfStatement : `for` `await` `(` ForDeclaration `of` AssignmentExpression `)` Statement
  1. If symbol is AwaitExpression, return true.
  2. If ForDeclaration Contains symbol, return true.
  3. If AssignmentExpression Contains symbol, return true.
  4. Return Statement Contains symbol.

We do something similar in AsyncArrowFunction : `async` AsyncArrowBindingIdentifier `=>` AsyncConciseBody for NewTarget, SuperProperty, SuperCall, super, and this, but returning false (instead of true like we would here).

@bakkot
Copy link

bakkot commented Jun 5, 2021

The cleanest solution would be to split ForInOfStatement into ForInStatement, ForOfStatement, and ForAwaitOfStatement, so that you can check for ForAwaitOfStatement just as you currently can check for AwaitExpression.

This involves touching a few dozen places in the spec, but they're all straightforward. If you'd like I can do this as a PR to ecma262 and you can add an editor's note to the section in this proposal containing the reference to ForAwaitOfStatement saying that this production is defined by said PR.

@rbuckton
Copy link
Collaborator

rbuckton commented Jun 6, 2021

I ended up going with a slightly different approach and added a ContainsAwait that is similar to ContainsArguments. I will have a PR up shortly.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants