start position discrepancy with esprima #136

msridhar · 2014-10-02T22:26:08Z

Consider the following code:

((((function () {
}()))));

According to esprima, the start column of both the ExpressionStatement and CallExpression node in the AST is 4. With acorn, the start column is 0. I don't know if there is a correct answer here, but 4 seems more right to me. Thanks for the great parser!

EDIT: see my next comment below for a more accurate characterization of Esprima's current behavior for this example.

The text was updated successfully, but these errors were encountered:

RReverser · 2014-10-03T08:00:11Z

AFAIK this deviation was always intentional (and I counted on that when adapting ES6 tests). There is no "correct answer" as parenthesized expression doesn't have it's representation in AST. However, it's IMO even more correct to have parenthesis counted as part of expression node which they belong to and not part of parent block.

In any case, positions have sense mostly for source maps, and both cases you'll have correct will be handled correctly by any debugger.

marijnh · 2014-10-03T08:05:28Z

The statement definitely starts at position zero in this case. An argument could be made for the CallExpression starting at 4, but Acorn's current behavior is a response to issue #14, which interestingly enough points at Esprima for a precedent of the behavior you are now claiming Esprima does not show.

Or am I missing something?

msridhar · 2014-10-03T16:25:50Z

Ok, so I re-tested with the latest code from esprima and acorn master. For my example above, both esprima and acorn actually agree that the start column of the ExpressionStatement is 0. esprima says the CallExpression starts at column 4, while acorn says column 0.

For the example from #14, both esprima and acorn agree that the BinaryExpression that is the test of the conditional starts at column 4. But for the Literal 1 that is parenthesized, acorn says it starts at column 4, but esprima says 5.

IMHO, for both these cases, the esprima logic for handling parentheses is more sensible. To give more context, my client is an IDE tool where I'd to be able to highlight the relevant expressions. Particularly for my original example, when highlighting the CallExpression, I'd rather not highlight the parentheses. However, I'm not sure if esprima's handling of parentheses would break @RReverser's use case.

RReverser · 2014-10-03T16:34:04Z

It doesn't break anything for me if ask the tests will be fixed correspondingly. However I'm still not sure if in (1)+2 it's correct to make parenthesis part of BinaryExpression and not Literal that they semantically belong to.

msridhar · 2014-10-03T17:22:11Z

It's not really clear to me where the location of parentheses should be accounted for in an "abstract" syntax tree, where presumably the function of the parentheses has been expressed in the tree's structure. Having some kind of ParenthesizedExpression would make things more explicit, but then the tree is less abstract. And that's not an option here anyway. FWIW, I can't think of a particular use case where a client interested in an expression e in the AST really needs the parentheses enclosing e to be part of its location. If someone knows of such a case, that would be interesting to hear about.

getify · 2014-10-03T18:23:53Z

I've suggested a heuristic before when discussing where to unambiguously associate (in the tree) various concrete syntax elements (whitespace, comments, unnecessary parens, etc), and I think it is worth considering:

These concrete syntax elements (that is, items which are not strictly required for program execution) should be associated as low (leaf-wise) in the tree as possible, and when applicable, left-associative (to the left, to the left...). That seems like the best and easiest location for code generation, and is probably a wash for any of the analytical use-cases.

So, to your situation, I'd say the ( ) around the 1 should associate to the 1 element, not to the statement. But, in 2 * (3 + 4), of course the ( ) has to associate to the 3 + 4 expression, not to the 3 and 4 elements.

msridhar · 2014-10-03T21:31:01Z

@getify the issue here is that the parentheses are not explicitly associated with the 1 element; it's just that its location range includes the parentheses. If I instead write this program:

if ((    1    ) === 1) {
}

Now, acorn says the 1 literal has a start position of {"line":1,"column":4} and an end position of {"line":1,"column":15}. But, if I understand correctly, there is no other information in the AST to tell a code generator how many nested parentheses pairs to use to fill that space. Given the AST structure we have, to me it makes more sense to not include these characters in the the location range of the enclosed expression.

getify · 2014-10-03T22:04:55Z

I was talking about extending the tree structure with the additional concrete syntax elements metadata. I'm aware the standard tree doesn't keep this info.

Even if you didn't actually extend the tree to hold the data, I still think the heuristic works in terms of where to assume the data (and thus position counts) should be associated to.

…rentheses Issue #136

Issue #136

marijnh · 2014-10-08T10:39:32Z

I've pushed two patches that implement the following:

Parenthesized expressions now have a range that includes only the expression itself
Composite expressions and statements cover all of the text inside them, including leading and trailing parentheses that wrap inner expressions.

I think this should cover everybody's cases. It is still not consistent with Esprima (which will leave parentheses out of the statement nodes), but I think what I implemented is preferable. You can, for example, replace the range given for a statement with another statement in the program text, and the result will continue to make sense.

msridhar · 2014-10-08T15:56:03Z

Actually, for my test cases, the behavior of the current master branch of acorn and esprima is now exactly the same (my original bug report was based on an older version of esprima). Thanks for the fix!

marijnh added a commit that referenced this issue Oct 8, 2014

Adjust start positions of parenthesized expressions to not include pa…

3603f7d

…rentheses Issue #136

marijnh added a commit that referenced this issue Oct 8, 2014

Make locations in loose parser consistent with those in regular parser

bdee9e8

Issue #136

marijnh mentioned this issue Oct 8, 2014

Issue with range information and parenthesis #14

Closed

marijnh closed this as completed Oct 8, 2014

manucorporat mentioned this issue Jun 9, 2019

Treeshaking of arguments leads to SyntaxError rollup/rollup#2909

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

start position discrepancy with esprima #136

start position discrepancy with esprima #136

msridhar commented Oct 2, 2014

RReverser commented Oct 3, 2014

marijnh commented Oct 3, 2014

msridhar commented Oct 3, 2014

RReverser commented Oct 3, 2014

msridhar commented Oct 3, 2014

getify commented Oct 3, 2014

msridhar commented Oct 3, 2014

getify commented Oct 3, 2014

marijnh commented Oct 8, 2014

msridhar commented Oct 8, 2014

start position discrepancy with esprima #136

start position discrepancy with esprima #136

Comments

msridhar commented Oct 2, 2014

RReverser commented Oct 3, 2014

marijnh commented Oct 3, 2014

msridhar commented Oct 3, 2014

RReverser commented Oct 3, 2014

msridhar commented Oct 3, 2014

getify commented Oct 3, 2014

msridhar commented Oct 3, 2014

getify commented Oct 3, 2014

marijnh commented Oct 8, 2014

msridhar commented Oct 8, 2014