refactor: introduce lookaheadInLineCharCode #15510

Merged
merged 2 commits into from Mar 24, 2023
3 changes: 1 addition & 2 deletions packages/babel-parser/src/parser/expression.ts
@@ -1287,8 +1287,7 @@ export default abstract class ExpressionParser extends LValParser {
   if (tokenIsIdentifier(type)) {
     if (
       this.isContextual(tt._module) &&
-      this.lookaheadCharCode() === charCodes.leftCurlyBrace &&
-      !this.hasFollowingLineBreak()
+      this.lookaheadInLineCharCode() === charCodes.leftCurlyBrace
     ) {
       return this.parseModuleExpression();
     }
17 changes: 8 additions & 9 deletions packages/babel-parser/src/parser/statement.ts
@@ -329,10 +329,11 @@ export default abstract class StatementParser extends ExpressionParser {

   /**
    * Assuming we have seen a contextual `using` and declaration is allowed, check if it
-   * starts a variable declaration so that it should be interpreted as a keyword.
+   * starts a variable declaration in the same line so that it should be interpreted as
+   * a keyword.
    */
-  hasFollowingBindingIdentifier(): boolean {
-    const next = this.nextTokenStart();
+  hasInLineFollowingBindingIdentifier(): boolean {
+    const next = this.nextTokenInLineStart();
     const nextCh = this.codePointAtPos(next);
     return this.chStartsBindingIdentifier(nextCh, next);
   }
@@ -485,9 +486,8 @@ export default abstract class StatementParser extends ExpressionParser {
   case tt._using:
     // using [no LineTerminator here][lookahead != `await`] BindingList[+Using]
     if (
-      this.hasFollowingLineBreak() ||
       this.state.containsEsc ||
-      !this.hasFollowingBindingIdentifier()
+      !this.hasInLineFollowingBindingIdentifier()
     ) {
       break;
     }
@@ -914,12 +914,11 @@ export default abstract class StatementParser extends ExpressionParser {

   const startsWithLet = this.isContextual(tt._let);
   const startsWithUsing =
-    this.isContextual(tt._using) && !this.hasFollowingLineBreak();
+    this.isContextual(tt._using) &&
+    this.hasInLineFollowingBindingIdentifier();
Member

Can we move this check again to after startsWithUsing in line 921 below? We don't need to run it when isLetOrUsing is true because of startsWithLet.

Contributor Author

Actually the line break check here is not required.

for (const
  x of []);

is valid, and so should be

// Currently throwing "Missing semicolon. (1:10)"
for (using
  x of []);

Member

The proposal spec requires it; @rbuckton is there a reason?

Contributor Author

Ah you are right. I overlooked the spec.

A line break is probably safe here. `for (using` requires a trailing semicolon to start a ForStatement, so there won't be ASI issues from `using` not being a keyword.

As for this PR I will keep the current behaviour.

> The proposal spec requires it; @rbuckton is there a reason?

It's definitely required in a regular `for`, because `for` reuses `LexicalDeclaration`, which requires it. I'd have to look into whether removing the restriction from `for-of` would cause problems, but either way that would be a normative change requiring consensus. I'm leaning towards leaving it as-is, however, to keep the line termination rules for `using` and `await using` consistent in all places.

Contributor Author

> keep the line termination rules for using and await using consistent in all places.

I agree. If there is strong pushback, we can always loosen the restriction later; adding one afterwards would be a breaking change.
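
For readers following the thread, here is a minimal standalone sketch of the "no LineTerminator here" rule being discussed. It is not the Babel implementation: `lookaheadInLineCharCode` is rewritten as a free function over a plain string, `usingIsKeyword` is a hypothetical helper, the input is assumed to begin with the token `using`, and the identifier-start test is simplified to ASCII letters, `$` and `_`.

```typescript
// Skips whitespace and comments that stay on the current line; a line
// terminator is never consumed, so the match stops in front of it.
const inLineSkip = /(?:[^\S\n\r\u2028\u2029]|\/\/.*|\/\*.*?\*\/)*/y;

// Hypothetical free-function version: char code at the next token start,
// or at the line break if one comes first.
function lookaheadInLineCharCode(input: string, pos: number): number {
  inLineSkip.lastIndex = pos;
  const next = inLineSkip.test(input) ? inLineSkip.lastIndex : pos;
  return input.charCodeAt(next);
}

// Hypothetical helper: assumes `input` starts with the token `using`.
// Simplification: only ASCII identifier starts are recognized.
function usingIsKeyword(input: string): boolean {
  const ch = lookaheadInLineCharCode(input, "using".length);
  const isLower = ch >= 97 && ch <= 122; // a-z
  const isUpper = ch >= 65 && ch <= 90; // A-Z
  return isLower || isUpper || ch === 36 /* $ */ || ch === 95 /* _ */;
}

console.log(usingIsKeyword("using res = open();")); // binding on the same line
console.log(usingIsKeyword("using\nres = open();")); // line break comes first
```

With the line break in between, the lookahead lands on the newline character, so `using` is treated as an ordinary identifier in this sketch.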

   const isLetOrUsing =
     (startsWithLet && this.hasFollowingBindingAtom()) ||
-    (startsWithUsing &&
-      this.hasFollowingBindingIdentifier() &&
-      this.startsUsingForOf());
+    (startsWithUsing && this.startsUsingForOf());
   if (this.match(tt._var) || this.match(tt._const) || isLetOrUsing) {
     const initNode = this.startNode<N.VariableDeclaration>();
     const kind = this.state.value;
29 changes: 29 additions & 0 deletions packages/babel-parser/src/tokenizer/index.ts
@@ -29,6 +29,7 @@ import {
   isNewLine,
   isWhitespace,
   skipWhiteSpace,
+  skipWhiteSpaceInLine,
 } from "../util/whitespace";
 import State from "./state";
 import type { LookaheadState, DeferredStrictError } from "./state";
@@ -195,6 +196,34 @@ export default abstract class Tokenizer extends CommentsParser {
     return this.input.charCodeAt(this.nextTokenStart());
   }

+  /**
+   * Similar to nextToken, but it will stop at line break when it is seen before the next token
+   *
+   * @returns {number} position of the next token start or line break, whichever is seen first.
+   * @memberof Tokenizer
+   */
+  nextTokenInLineStart(): number {
+    return this.nextTokenInLineStartSince(this.state.pos);
+  }
+
+  nextTokenInLineStartSince(pos: number): number {
+    skipWhiteSpaceInLine.lastIndex = pos;
+    return skipWhiteSpaceInLine.test(this.input)
+      ? skipWhiteSpaceInLine.lastIndex
+      : pos;
+  }
Comment on lines +199 to +214
Member

nextTokenInLineStart(pos = this.state.pos): number {

Would this be better?

Contributor Author

A default parameter adds a new branch. I can inline `nextTokenInLineStartSince` since we are not using it anyway. (An optimizing compiler will inline it for sure.)

Contributor Author

Ah, `nextTokenInLineStartSince` will be used in parsing `await using foo = bar()`, when we do a second lookahead from `using`.
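
The "second lookahead" mentioned here could look roughly like the sketch below. It is only an illustration under stated assumptions: `nextTokenInLineStartSince` is rewritten as a free function taking the input string, `looksLikeAwaitUsing` is a hypothetical helper that does not exist in this PR, and the identifier-start test is simplified to ASCII.

```typescript
// Same pattern as skipWhiteSpaceInLine in the diff (with the g flag).
const skipWhiteSpaceInLine = /(?:[^\S\n\r\u2028\u2029]|\/\/.*|\/\*.*?\*\/)*/g;

// Free-function version of the method from the diff.
function nextTokenInLineStartSince(input: string, pos: number): number {
  skipWhiteSpaceInLine.lastIndex = pos;
  return skipWhiteSpaceInLine.test(input)
    ? skipWhiteSpaceInLine.lastIndex
    : pos;
}

// Hypothetical helper: `posAfterAwait` is the position right after `await`.
// First lookahead: find `using` on the same line.
// Second lookahead: starting after `using`, find an identifier start on the
// same line (simplified to ASCII letters, `$` and `_`).
function looksLikeAwaitUsing(input: string, posAfterAwait: number): boolean {
  const usingStart = nextTokenInLineStartSince(input, posAfterAwait);
  if (!input.startsWith("using", usingStart)) return false;
  const afterUsing = usingStart + "using".length;
  const next = nextTokenInLineStartSince(input, afterUsing);
  // Require at least one skipped character so `usingfoo` is not accepted.
  return next > afterUsing && /[A-Za-z_$]/.test(input.charAt(next));
}

console.log(looksLikeAwaitUsing("await using foo = bar()", 5));
console.log(looksLikeAwaitUsing("await using\nfoo = bar()", 5));
```

Taking the start position as a parameter is what allows the second call to begin after `using` rather than at the tokenizer's current position.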


+  /**
+   * Similar to lookaheadCharCode, but it will return the char code of line break if it is
+   * seen before the next token
+   *
+   * @returns {number} char code of the next token start or line break, whichever is seen first.
+   * @memberof Tokenizer
+   */
+  lookaheadInLineCharCode(): number {
+    return this.input.charCodeAt(this.nextTokenInLineStart());
+  }
+
   codePointAtPos(pos: number): number {
     // The implementation is based on
     // https://source.chromium.org/chromium/chromium/src/+/master:v8/src/builtins/builtins-string-gen.cc;l=1455;drc=221e331b49dfefadbc6fa40b0c68e6f97606d0b3;bpv=0;bpt=1
2 changes: 1 addition & 1 deletion packages/babel-parser/src/util/whitespace.ts
@@ -22,7 +22,7 @@ export function isNewLine(code: number): boolean {
 export const skipWhiteSpace = /(?:\s|\/\/.*|\/\*[^]*?\*\/)*/g;

 export const skipWhiteSpaceInLine =
-  /(?:[^\S\n\r\u2028\u2029]|\/\/.*|\/\*.*?\*\/)*/y;
+  /(?:[^\S\n\r\u2028\u2029]|\/\/.*|\/\*.*?\*\/)*/g;
Contributor Author

This change has no effect since we are not using `.exec()`; here I am just aligning with the flag of `skipWhiteSpace`.
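
One way to check that the flag swap is behavior-preserving for the "set `lastIndex`, then `test()`" call pattern is the small sketch below. The pattern can match the empty string, so a match always succeeds starting exactly at `lastIndex` under either flag. `skipFrom` is a hypothetical helper mirroring `nextTokenInLineStartSince`, and the sample input is made up for illustration.

```typescript
// The pattern from the diff, once with the sticky flag and once with the global flag.
const stickyRe = /(?:[^\S\n\r\u2028\u2029]|\/\/.*|\/\*.*?\*\/)*/y;
const globalRe = /(?:[^\S\n\r\u2028\u2029]|\/\/.*|\/\*.*?\*\/)*/g;

// Hypothetical helper: same logic as nextTokenInLineStartSince, with the
// regular expression passed in so both flags can be compared.
function skipFrom(re: RegExp, input: string, pos: number): number {
  re.lastIndex = pos;
  return re.test(input) ? re.lastIndex : pos;
}

// Made-up sample containing in-line whitespace, both comment styles and a line break.
const sample = "using /* c */ x // tail\n  y";

// Collect every start position where the two flags disagree.
const mismatches: number[] = [];
for (let pos = 0; pos <= sample.length; pos++) {
  if (skipFrom(stickyRe, sample, pos) !== skipFrom(globalRe, sample, pos)) {
    mismatches.push(pos);
  }
}
console.log(mismatches.length === 0 ? "flags agree" : mismatches);
```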


 // Skip whitespace and single-line comments, including /* no newline here */.
 // After this RegExp matches, its lastIndex points to a line terminator, or