Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression: "\8", "\9" - Unexpected token ILLEGAL #1226

Closed
zlobz opened this issue Jul 8, 2015 · 13 comments
Closed

Regression: "\8", "\9" - Unexpected token ILLEGAL #1226

zlobz opened this issue Jul 8, 2015 · 13 comments

Comments

@zlobz
Copy link

zlobz commented Jul 8, 2015

2.4.1
See #1106
bf5c615

// Error: Line 1: Unexpected token ILLEGAL
var esprima = require('esprima');
esprima.parse('"\\8"');

Some valid strings that esprima currently does not parse:

"\8"
"\9"
@zlobz zlobz changed the title Regression: "\812" - Unexpected token ILLEGAL Regression: "Hello\812World" - Unexpected token ILLEGAL Jul 8, 2015
@zlobz zlobz changed the title Regression: "Hello\812World" - Unexpected token ILLEGAL Regression: "\\8" - Unexpected token ILLEGAL Jul 8, 2015
@zlobz zlobz changed the title Regression: "\\8" - Unexpected token ILLEGAL Regression: "\\8", "\\9" - Unexpected token ILLEGAL Jul 8, 2015
zlobz pushed a commit to zlobz/esprima that referenced this issue Jul 8, 2015
@zlobz zlobz changed the title Regression: "\\8", "\\9" - Unexpected token ILLEGAL Regression: "\8", "\9" - Unexpected token ILLEGAL Jul 8, 2015
@michaelficarra
Copy link
Contributor

Those are not valid. If you feel otherwise, show me the grammar production that allows them.

@zlobz
Copy link
Author

zlobz commented Jul 8, 2015

No, it is valid. "\8\9" works in the wild.

> "\8\9"
'89'

https://es5.github.io/#x16
An implementation may extend program syntax and regular expression pattern or flag syntax.

@dmethvin
Copy link
Member

dmethvin commented Jul 8, 2015

Seems like the \ DecimalDigit production would apply? Things are complicated a bit because in non-strict mode there's also LegacyOctalEscapeSequence but none of those productions apply since 8 is not an octal digit. In strict a valid octal sequence generates a syntax error but an invalid one like \8 silently converts to 8.

@zlobz
Copy link
Author

zlobz commented Jul 8, 2015

Standard ECMA-262 June 1997 http://tecfa.unige.ch/guides/js/e262-pdf.pdf
7.7.4 String Literals

EscapeSequence ::
CharacterEscapeSequence
OctalEscapeSequence
HexEscapeSequence
UnicodeEscapeSequence

CharacterEscapeSequence ::
\ SingleEscapeCharacter
\ NonEscapeCharacter

NonEscapeCharacter::
SourceCharacter but not EscapeCharacter or LineTerminator

EscapeCharacter ::
SingleEscapeCharacter
OctalDigit
x
u

SingleEscapeCharacter :: one of
'"\ b f n r t

@mathiasbynens
Copy link
Contributor

@dmethvin
Copy link
Member

dmethvin commented Jul 8, 2015

This is what i saw in ES2015 RC1:

SingleStringCharacter ::
\ EscapeSequence

EscapeSequence ::
CharacterEscapeSequence

CharacterEscapeSequence ::
NonEscapeCharacter

NonEscapeCharacter ::
SourceCharacter but not one of EscapeCharacter or LineTerminator

EscapeCharacter ::
SingleEscapeCharacter
DecimalDigit
x
u

@ariya
Copy link
Contributor

ariya commented Jul 8, 2015

@dmethvin Good analysis! That means \8 does not fulfill any production since 8 is a DecimalDigit and NonEscapeCharacter can't be it.

@zlobz
Copy link
Author

zlobz commented Jul 8, 2015

@dmethvin

but an invalid one like \8 silently converts to 8.

because it is valid in old spec - ECMA-262 1997. since
8 is SourceCharacter
8 is not EscapeCharacter because it is not OctalDigit

NonEscapeCharacter ::
SourceCharacter but not one of EscapeCharacter or LineTerminator

EscapeCharacter ::
OctalDigit

@dmethvin
Copy link
Member

dmethvin commented Jul 8, 2015

Yeah, I meant \8 was invalid octal.

@ariya
Copy link
Contributor

ariya commented Jul 8, 2015

@xzo That's very old. ES 5.1 already specified EscapeCharacter to be exactly like above (see 7.8.4)

I don't think Esprima needs to support ECMAScript < 5 at all.

@zlobz
Copy link
Author

zlobz commented Jul 8, 2015

Why parser doesn't need to support it, If modern ECMAScript engines support it? How to parse obfuscated code, for example?

@ariya
Copy link
Contributor

ariya commented Jul 9, 2015

Do we have some information as to how many scripts out there still utilizing such an escape sequence?

In all cases, even if we want to support, at best this is an error that should be tolerated, aka when tolerant is set to true in the parsing options.

@mathiasbynens
Copy link
Contributor

This is on the table for ES7: https://bugs.ecmascript.org/show_bug.cgi?id=3477

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

5 participants