New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fail to tokenize a regular expression #1873

Closed
vitaly-t opened this Issue Nov 1, 2017 · 14 comments

Comments

Projects
None yet
3 participants
@vitaly-t

vitaly-t commented Nov 1, 2017

Fails to process the following code:

const input = 'if(true){}/\\d/.test(\'\');'

or

const input = "if(true){}/\\d/.test('');";

This is a perfectly valid JavaScript after compression that Esprima cannot handle - throws an error:

Error: Line 1: Unexpected token ILLEGAL
    at ErrorHandler.constructError (D:\NodeJS\tests\node_modules\esprima\dist\esprima.js:5004:22)
    at ErrorHandler.createError (D:\NodeJS\tests\node_modules\esprima\dist\esprima.js:5020:27)
    at ErrorHandler.throwError (D:\NodeJS\tests\node_modules\esprima\dist\esprima.js:5027:21)
    at Scanner.throwUnexpectedToken (D:\NodeJS\tests\node_modules\esprima\dist\esprima.js:5155:35)
    at Scanner.getComplexIdentifier (D:\NodeJS\tests\node_modules\esprima\dist\esprima.js:5488:23)
    at Scanner.scanIdentifier (D:\NodeJS\tests\node_modules\esprima\dist\esprima.js:5556:67)
    at Scanner.lex (D:\NodeJS\tests\node_modules\esprima\dist\esprima.js:6223:26)
    at Tokenizer.getNextToken (D:\NodeJS\tests\node_modules\esprima\dist\esprima.js:6666:84)
    at Object.tokenize (D:\NodeJS\tests\node_modules\esprima\dist\esprima.js:154:36)
@vitaly-t

This comment has been minimized.

Show comment
Hide comment
@vitaly-t

vitaly-t Nov 18, 2017

@ariya Any chance this issue will be looked at? It was logged 18 days ago, and no response since.

vitaly-t commented Nov 18, 2017

@ariya Any chance this issue will be looked at? It was logged 18 days ago, and no response since.

@ariya

This comment has been minimized.

Show comment
Hide comment
@ariya

ariya Nov 20, 2017

Contributor

@vitaly-t Please be patient, every contributor seems to be extremely busy these days.

Contributor

ariya commented Nov 20, 2017

@vitaly-t Please be patient, every contributor seems to be extremely busy these days.

@michaelficarra

This comment has been minimized.

Show comment
Hide comment
@michaelficarra

michaelficarra Nov 20, 2017

Contributor

@vitaly-t What version of esprima are you using? I cannot reproduce a failure using the online demo.

Contributor

michaelficarra commented Nov 20, 2017

@vitaly-t What version of esprima are you using? I cannot reproduce a failure using the online demo.

@vitaly-t

This comment has been minimized.

Show comment
Hide comment
@vitaly-t

vitaly-t Nov 20, 2017

@michaelficarra There is something not right with that online test. I cannot get it to fail, but locally it fails always, with any 3.x or 4.0.0 version of Esprima.

You must define the input correctly, like this:

const input = 'if(true){}/\\d/.test(\'\');';

then it will fail when you run it locally through Esprima.

vitaly-t commented Nov 20, 2017

@michaelficarra There is something not right with that online test. I cannot get it to fail, but locally it fails always, with any 3.x or 4.0.0 version of Esprima.

You must define the input correctly, like this:

const input = 'if(true){}/\\d/.test(\'\');';

then it will fail when you run it locally through Esprima.

@vitaly-t

This comment has been minimized.

Show comment
Hide comment
@vitaly-t

vitaly-t Nov 20, 2017

Actually no, I can, and here it is.

vitaly-t commented Nov 20, 2017

Actually no, I can, and here it is.

@michaelficarra

This comment has been minimized.

Show comment
Hide comment
@michaelficarra

michaelficarra Nov 20, 2017

Contributor

@vitaly-t That's not a JavaScript program. Check your quotes.

Contributor

michaelficarra commented Nov 20, 2017

@vitaly-t That's not a JavaScript program. Check your quotes.

@vitaly-t

This comment has been minimized.

Show comment
Hide comment
@vitaly-t

vitaly-t Nov 20, 2017

@michaelficarra it perfectly is JavaScript. You can change it to this, if you like:

const input = "if(true){}/\\d/.test('');";

Not sure what you mean by program, it is a piece of compressed JavaScript, as I described it in the beginning.

I can see that the online test fails to reproduce the issue, but you can reproduce it locally 😉

vitaly-t commented Nov 20, 2017

@michaelficarra it perfectly is JavaScript. You can change it to this, if you like:

const input = "if(true){}/\\d/.test('');";

Not sure what you mean by program, it is a piece of compressed JavaScript, as I described it in the beginning.

I can see that the online test fails to reproduce the issue, but you can reproduce it locally 😉

@michaelficarra

This comment has been minimized.

Show comment
Hide comment
@michaelficarra

michaelficarra Nov 20, 2017

Contributor

@vitaly-t Sorry, but you're just wrong.

$ npm install esprima
+ esprima@4.0.0
added 1 package in 1.318s
$ echo "if(true){}/\\d/.test('');" > test.js
$ node_modules/.bin/esvalidate test.js
$ echo $?
0

I cannot reproduce this either on the website or locally using a freshly-installed esprima 4.0.0. Which do you think is more likely: esprima, a widely used parser, is failing to parse a very simple program, or you are having trouble giving it the correct input?

Contributor

michaelficarra commented Nov 20, 2017

@vitaly-t Sorry, but you're just wrong.

$ npm install esprima
+ esprima@4.0.0
added 1 package in 1.318s
$ echo "if(true){}/\\d/.test('');" > test.js
$ node_modules/.bin/esvalidate test.js
$ echo $?
0

I cannot reproduce this either on the website or locally using a freshly-installed esprima 4.0.0. Which do you think is more likely: esprima, a widely used parser, is failing to parse a very simple program, or you are having trouble giving it the correct input?

@ariya

This comment has been minimized.

Show comment
Hide comment
@ariya

ariya Nov 20, 2017

Contributor

Another way to play with this is with RunKit, e.g. https://runkit.com/5a127690868980001150b6df/5a127690bb5c880012193356.

Contributor

ariya commented Nov 20, 2017

Another way to play with this is with RunKit, e.g. https://runkit.com/5a127690868980001150b6df/5a127690bb5c880012193356.

@vitaly-t

This comment has been minimized.

Show comment
Hide comment
@vitaly-t

vitaly-t Nov 20, 2017

In a simplified form, this is the code that fails:

const esprima = require('esprima');
const input = "if(true){}/\\d/.test('');";
esprima.tokenize(input);

I'm using method tokenize with more parameters, as shown below, but they don't matter, as removing them, and passing in just the input fails the same.

esprima.tokenize(input, {loc: true, range: true}, node => {
    if (node.type === 'RegularExpression') {
        // processing...
    }
});

vitaly-t commented Nov 20, 2017

In a simplified form, this is the code that fails:

const esprima = require('esprima');
const input = "if(true){}/\\d/.test('');";
esprima.tokenize(input);

I'm using method tokenize with more parameters, as shown below, but they don't matter, as removing them, and passing in just the input fails the same.

esprima.tokenize(input, {loc: true, range: true}, node => {
    if (node.type === 'RegularExpression') {
        // processing...
    }
});
@ariya

This comment has been minimized.

Show comment
Hide comment
@ariya

ariya Nov 20, 2017

Contributor

@vitaly-t Thanks for the clarification! So this is basically a problem for the pure tokenizer (not the parser).

Contributor

ariya commented Nov 20, 2017

@vitaly-t Thanks for the clarification! So this is basically a problem for the pure tokenizer (not the parser).

@vitaly-t

This comment has been minimized.

Show comment
Hide comment
@vitaly-t

vitaly-t Nov 20, 2017

@ariya it would appear, yes, sorry for the confusion earlier. The tokenizer fails with the given example.

vitaly-t commented Nov 20, 2017

@ariya it would appear, yes, sorry for the confusion earlier. The tokenizer fails with the given example.

@ariya ariya changed the title from Fails on a compressed piece of code to Fail to tokenize a regular expression Nov 21, 2017

@ariya ariya added the defect label Nov 23, 2017

ariya added a commit to ariya/esprima that referenced this issue Nov 23, 2017

Make the pure tokenizer a bit aggressive in recognizing regex literals.
At the same time, ensure that it can fall back if the regex literal
proved to be invalid.

Fix jquery#1772
Fix jquery#1873

@ariya ariya closed this in df749a7 Nov 26, 2017

@vitaly-t

This comment has been minimized.

Show comment
Hide comment
@vitaly-t

vitaly-t Aug 13, 2018

@ariya Just tested against version 4.0.1, and the issue is still there. So whatever that commit was - does not fix the issue. Please re-open it.

i'm getting the very same error:

Error: Line 1: Unexpected token ILLEGAL
    at ErrorHandler.constructError (D:\NodeJS\decomment\node_modules\esprima\dist\esprima.js:5012:22)
    at ErrorHandler.createError (D:\NodeJS\decomment\node_modules\esprima\dist\esprima.js:5028:27)
    at ErrorHandler.throwError (D:\NodeJS\decomment\node_modules\esprima\dist\esprima.js:5035:21)
    at Scanner.throwUnexpectedToken (D:\NodeJS\decomment\node_modules\esprima\dist\esprima.js:5164:35)
    at Scanner.getComplexIdentifier (D:\NodeJS\decomment\node_modules\esprima\dist\esprima.js:5497:23)
    at Scanner.scanIdentifier (D:\NodeJS\decomment\node_modules\esprima\dist\esprima.js:5565:67)
    at Scanner.lex (D:\NodeJS\decomment\node_modules\esprima\dist\esprima.js:6232:26)
    at Tokenizer.getNextToken (D:\NodeJS\decomment\node_modules\esprima\dist\esprima.js:6675:84)
    at Object.tokenize (D:\NodeJS\decomment\node_modules\esprima\dist\esprima.js:154:36)
    at Object.parseRegEx (D:\NodeJS\decomment\lib\utils.js:37:13)

vitaly-t commented Aug 13, 2018

@ariya Just tested against version 4.0.1, and the issue is still there. So whatever that commit was - does not fix the issue. Please re-open it.

i'm getting the very same error:

Error: Line 1: Unexpected token ILLEGAL
    at ErrorHandler.constructError (D:\NodeJS\decomment\node_modules\esprima\dist\esprima.js:5012:22)
    at ErrorHandler.createError (D:\NodeJS\decomment\node_modules\esprima\dist\esprima.js:5028:27)
    at ErrorHandler.throwError (D:\NodeJS\decomment\node_modules\esprima\dist\esprima.js:5035:21)
    at Scanner.throwUnexpectedToken (D:\NodeJS\decomment\node_modules\esprima\dist\esprima.js:5164:35)
    at Scanner.getComplexIdentifier (D:\NodeJS\decomment\node_modules\esprima\dist\esprima.js:5497:23)
    at Scanner.scanIdentifier (D:\NodeJS\decomment\node_modules\esprima\dist\esprima.js:5565:67)
    at Scanner.lex (D:\NodeJS\decomment\node_modules\esprima\dist\esprima.js:6232:26)
    at Tokenizer.getNextToken (D:\NodeJS\decomment\node_modules\esprima\dist\esprima.js:6675:84)
    at Object.tokenize (D:\NodeJS\decomment\node_modules\esprima\dist\esprima.js:154:36)
    at Object.parseRegEx (D:\NodeJS\decomment\lib\utils.js:37:13)

@ariya

This comment has been minimized.

Show comment
Hide comment
@ariya

ariya Aug 25, 2018

Contributor

@vitaly-t If you want to try it out, can you try the master branch?
The fix was not included in 4.0.1, it is too risky for a patch release.

Contributor

ariya commented Aug 25, 2018

@vitaly-t If you want to try it out, can you try the master branch?
The fix was not included in 4.0.1, it is too risky for a patch release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment