esprima.tokenizer cannot tokenize a single '/' #1772

Closed
eddieantonio opened this Issue Feb 27, 2017 · 1 comment

eddieantonio commented Feb 27, 2017

Esprima: v3.1.3
Node: v7.5.0

When I tokenize an input that consists solely of /, Esprima throws Invalid regular expression: missing /.

Here's Esprima working as expected (the input is not syntactically correct, but still produces tokens):

> esprima.tokenize('a /')
[ { type: 'Identifier', value: 'a' },
  { type: 'Punctuator', value: '/' },
  errors: [] ]

Simply remove the leading identifier and 💥:

> esprima.tokenize('/')
Error: Line 1: Invalid regular expression: missing /
    at ErrorHandler.constructError (/Users/eddieantonio/Projects/training-grammar-guru/tokenize-js/node_modules/esprima/dist/esprima.js:3396:22)
    at ErrorHandler.createError (/Users/eddieantonio/Projects/training-grammar-guru/tokenize-js/node_modules/esprima/dist/esprima.js:3414:27)
    at ErrorHandler.throwError (/Users/eddieantonio/Projects/training-grammar-guru/tokenize-js/node_modules/esprima/dist/esprima.js:3422:21)
    at Scanner.throwUnexpectedToken (/Users/eddieantonio/Projects/training-grammar-guru/tokenize-js/node_modules/esprima/dist/esprima.js:3505:28)
    at Scanner.scanRegExpBody (/Users/eddieantonio/Projects/training-grammar-guru/tokenize-js/node_modules/esprima/dist/esprima.js:4510:19)
    at Scanner.scanRegExp (/Users/eddieantonio/Projects/training-grammar-guru/tokenize-js/node_modules/esprima/dist/esprima.js:4566:26)
    at Tokenizer.getNextToken (/Users/eddieantonio/Projects/training-grammar-guru/tokenize-js/node_modules/esprima/dist/esprima.js:6363:72)
    at Object.tokenize (/Users/eddieantonio/Projects/training-grammar-guru/tokenize-js/node_modules/esprima/dist/esprima.js:136:36)
    at repl:1:9
    at ContextifyScript.Script.runInThisContext (vm.js:23:33)

Turning on tolerant: true just reports the same failure in the errors array, with no tokens:

> esprima.tokenize('/', {tolerant: true})
[ errors: [ { Error: Line 1: Invalid regular expression: missing /
        at ErrorHandler.constructError (/Users/eddieantonio/Projects/training-grammar-guru/tokenize-js/node_modules/esprima/dist/esprima.js:3396:22)
        at ErrorHandler.createError (/Users/eddieantonio/Projects/training-grammar-guru/tokenize-js/node_modules/esprima/dist/esprima.js:3414:27)
        at ErrorHandler.throwError (/Users/eddieantonio/Projects/training-grammar-guru/tokenize-js/node_modules/esprima/dist/esprima.js:3422:21)
        at Scanner.throwUnexpectedToken (/Users/eddieantonio/Projects/training-grammar-guru/tokenize-js/node_modules/esprima/dist/esprima.js:3505:28)
        at Scanner.scanRegExpBody (/Users/eddieantonio/Projects/training-grammar-guru/tokenize-js/node_modules/esprima/dist/esprima.js:4510:19)
        at Scanner.scanRegExp (/Users/eddieantonio/Projects/training-grammar-guru/tokenize-js/node_modules/esprima/dist/esprima.js:4566:26)
        at Tokenizer.getNextToken (/Users/eddieantonio/Projects/training-grammar-guru/tokenize-js/node_modules/esprima/dist/esprima.js:6363:72)
        at Object.tokenize (/Users/eddieantonio/Projects/training-grammar-guru/tokenize-js/node_modules/esprima/dist/esprima.js:136:36)
        at repl:1:9
        at ContextifyScript.Script.runInThisContext (vm.js:23:33)
      index: 1,
      lineNumber: 1,
      description: 'Invalid regular expression: missing /' } ] ]

This raises the question: is this by design? Is an input consisting of a single solidus a valid token stream of one token (the division operator), or is it really the beginning of a malformed regular expression? For my purposes, the former is far more convenient.

Possibly related to #1516 and #1493.

EDIT: It fails for /= as well:

> esprima.tokenize('/=')
Error: Line 1: Invalid regular expression: missing /
    at ErrorHandler.constructError (/Users/eddieantonio/Projects/training-grammar-guru/tokenize-js/node_modules/esprima/dist/esprima.js:3396:22)
    at ErrorHandler.createError (/Users/eddieantonio/Projects/training-grammar-guru/tokenize-js/node_modules/esprima/dist/esprima.js:3414:27)
    at ErrorHandler.throwError (/Users/eddieantonio/Projects/training-grammar-guru/tokenize-js/node_modules/esprima/dist/esprima.js:3422:21)
    at Scanner.throwUnexpectedToken (/Users/eddieantonio/Projects/training-grammar-guru/tokenize-js/node_modules/esprima/dist/esprima.js:3505:28)
    at Scanner.scanRegExpBody (/Users/eddieantonio/Projects/training-grammar-guru/tokenize-js/node_modules/esprima/dist/esprima.js:4510:19)
    at Scanner.scanRegExp (/Users/eddieantonio/Projects/training-grammar-guru/tokenize-js/node_modules/esprima/dist/esprima.js:4566:26)
    at Tokenizer.getNextToken (/Users/eddieantonio/Projects/training-grammar-guru/tokenize-js/node_modules/esprima/dist/esprima.js:6363:72)
    at Object.tokenize (/Users/eddieantonio/Projects/training-grammar-guru/tokenize-js/node_modules/esprima/dist/esprima.js:136:36)
    at repl:1:9
    at ContextifyScript.Script.runInThisContext (vm.js:23:33)
ariya Feb 28, 2017

Contributor

Thanks for the detailed report @eddieantonio! This is definitely a defect and we ought to fix it.

@ariya ariya added the defect label Feb 28, 2017

@ghost ghost referenced this issue Feb 28, 2017

Closed

no errors in tolerant mode #1766

ariya added a commit to ariya/esprima that referenced this issue Nov 23, 2017

Make the pure tokenizer a bit aggressive in recognizing regex literals.
At the same time, ensure that it can fall back if the regex literal
proved to be invalid.

Fix #1772
Fix #1873
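
For readers who want a feel for the "try a regex literal, fall back if it is invalid" idea in the commit message above, here is a minimal sketch. It is not Esprima's code: tryScanRegExp and scanSlash are hypothetical names, the regex scanning is deliberately simplified, and it assumes comments (// and /*) have already been handled before this point.

```javascript
// Hypothetical helper, NOT Esprima's implementation: try to read a regex
// literal that starts at source[start]. Returns the index just past the
// literal (including flags), or -1 if the literal is unterminated.
function tryScanRegExp(source, start) {
  let inClass = false; // true while inside a [...] character class
  for (let i = start + 1; i < source.length; i++) {
    const ch = source[i];
    if (ch === '\n' || ch === '\r') return -1; // a regex literal cannot span lines
    if (ch === '\\') { i++; continue; }        // skip the escaped character
    if (ch === '[') inClass = true;
    else if (ch === ']') inClass = false;
    else if (ch === '/' && !inClass) {
      let end = i + 1;
      while (end < source.length && /[a-z]/i.test(source[end])) end++; // flags
      return end;
    }
  }
  return -1; // ran off the end without finding the closing '/'
}

// Hypothetical entry point: prefer a regex literal, but fall back to a
// Punctuator token ('/' or '/=') when no valid literal can be read.
function scanSlash(source, start) {
  const end = tryScanRegExp(source, start);
  if (end !== -1) {
    return { type: 'RegularExpression', value: source.slice(start, end) };
  }
  const value = source[start + 1] === '=' ? '/=' : '/';
  return { type: 'Punctuator', value };
}

console.log(scanSlash('/', 0));
console.log(scanSlash('/=', 0));
console.log(scanSlash('/ab+c/gi', 0));
```

With this fallback, the two inputs from the report ('/' and '/=') come back as Punctuator tokens instead of raising an error, while a well-formed literal such as /ab+c/gi is still recognized as a regular expression.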

@ariya ariya closed this in df749a7 Nov 26, 2017
