-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix octal escapes starting with 0 #1
Conversation
Actually, I’m just noticing that things are a little more tricky; Consider strings like |
8b43d6f
to
e7195ef
Compare
I’ve updated with a more comprehensive solution that is, unfortunately, also kind of a full rewrite of the |
3 similar comments
Wow, thanks for the PR! Also, I guess I should enable email notifications :D |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed the solution. Thanks a lot, seriously, this fixes the problems I didn't even have a clue about!
Left a couple of notes regarding the code. Once we fix them, I think it's ready to merge and be published.
@@ -0,0 +1 @@ | |||
{"/home/sqrt/src/unescape-js/dist/index.js":{"path":"/home/sqrt/src/unescape-js/dist/index.js","s":{"1":1,"2":1,"3":1,"4":1,"5":1,"6":5,"7":1,"8":2,"9":1,"10":18,"11":18,"12":2,"13":16,"14":3,"15":2,"16":1,"17":13,"18":2,"19":11,"20":1},"b":{"1":[2,16],"2":[3,13],"3":[2,1],"4":[2,11]},"f":{"1":5,"2":2,"3":18,"4":18},"fnMap":{"1":{"name":"fromHex","line":24,"loc":{"start":{"line":24,"column":14},"end":{"line":24,"column":36}}},"2":{"name":"fromOct","line":27,"loc":{"start":{"line":27,"column":14},"end":{"line":27,"column":36}}},"3":{"name":"(anonymous_3)","line":31,"loc":{"start":{"line":31,"column":18},"end":{"line":31,"column":36}}},"4":{"name":"(anonymous_4)","line":32,"loc":{"start":{"line":32,"column":41},"end":{"line":32,"column":95}}}},"statementMap":{"1":{"start":{"line":3,"column":0},"end":{"line":5,"column":3}},"2":{"start":{"line":7,"column":0},"end":{"line":7,"column":32}},"3":{"start":{"line":9,"column":0},"end":{"line":9,"column":105}},"4":{"start":{"line":11,"column":0},"end":{"line":22,"column":2}},"5":{"start":{"line":24,"column":0},"end":{"line":26,"column":2}},"6":{"start":{"line":25,"column":4},"end":{"line":25,"column":51}},"7":{"start":{"line":27,"column":0},"end":{"line":29,"column":2}},"8":{"start":{"line":28,"column":4},"end":{"line":28,"column":50}},"9":{"start":{"line":31,"column":0},"end":{"line":43,"column":2}},"10":{"start":{"line":32,"column":4},"end":{"line":42,"column":7}},"11":{"start":{"line":33,"column":8},"end":{"line":41,"column":9}},"12":{"start":{"line":34,"column":12},"end":{"line":34,"column":37}},"13":{"start":{"line":35,"column":15},"end":{"line":41,"column":9}},"14":{"start":{"line":36,"column":12},"end":{"line":36,"column":86}},"15":{"start":{"line":36,"column":34},"end":{"line":36,"column":57}},"16":{"start":{"line":36,"column":62},"end":{"line":36,"column":86}},"17":{"start":{"line":37,"column":15},"end":{"line":41,"column":9}},"18":{"start":{"line":38,"column":12},"end":{"line":38,"column":34}},"19":{"start":{"line":40,"column":12},"end":{"line":40,"column":47}},"20":{"start":{"line":45,"column":0},"end":{"line":45,"column":36}}},"branchMap":{"1":{"line":33,"type":"if","locations":[{"start":{"line":33,"column":8},"end":{"line":33,"column":8}},{"start":{"line":33,"column":8},"end":{"line":33,"column":8}}]},"2":{"line":35,"type":"if","locations":[{"start":{"line":35,"column":15},"end":{"line":35,"column":15}},{"start":{"line":35,"column":15},"end":{"line":35,"column":15}}]},"3":{"line":36,"type":"if","locations":[{"start":{"line":36,"column":12},"end":{"line":36,"column":12}},{"start":{"line":36,"column":12},"end":{"line":36,"column":12}}]},"4":{"line":37,"type":"if","locations":[{"start":{"line":37,"column":15},"end":{"line":37,"column":15}},{"start":{"line":37,"column":15},"end":{"line":37,"column":15}}]}}}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added .nyc_output
to .gitignore
. Could you please merge or rebase upon the master
branch so that this file gets removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, done!
string = unescapeVariableHexSequence(string); | ||
return string; | ||
return string.replace(JSEscapeRegex, (_, match, longHex, varHex, shortHex, octal) => { | ||
if (match[0] === 'x') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I get from short debugging, every time this function is invoked, among longHex
, varHex
, shortHex
and octal
, only a single parameter will have a value will be defined. Is this true? Can we write just something like
if (longHex !== undefined) {
// ...
} else if (varHex !== undefined) {
// ...
}
then? This will make the code easier to understand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems to work, yeah. :)
[/\\"/g, '\"'], | ||
[/\\\\/g, '\\'] | ||
]; | ||
const JSEscapeRegex = /\\(u(\{([0-9A-Fa-f]+)\}|[0-9A-Fa-f]{4})|x([0-9A-Fa-f]{2})|(\d{3})|['"tbrnfv0/\\])/g; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: let's rename this to jsEscapeRegex
? Only classes start from the upper-case letter usually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, whatever you prefer. :)
Octal escape sequences starting with 0 should be expanded before the usual escape sequences to avoid ambiguity with the `\0` escape. Also, accidental double unescaping, e.g. `\\x41 → \x41 → A` needs to be avoided. Fixing this requires using a single `.replace()` call per input string.
Simplify: duplicate the `u` part between the first and the second alternative. This reduces the regex depth.
`\/` is not a valid JS escape sequence.
1 similar comment
1 similar comment
* Swap `varHex` and `longHex` because it’s how they actually correspond to the regex * Reorder `if`s to have the same order as the parameters have * For uniformity, add the `specialCharacter` param as there’s now a corresponding alternative in the regex, and alias `match` with `__` as it’s not necessary anymore
Cool, thanks for the updates! I also altered the code slightly (added docs and refactored a bit). Let’s wait for the tests to finish, and I’m merging it. |
Cool, thanks! (Btw, the reason I included |
Okay, I should’ve asked you, yeah :D Thanks for the explanation. At the moment, I dedicate this module to parsing only string escape sequences. There’re two points that keep me from adding
Anyway, |
Well,
Yeah no, just playing around with stuff. This isn’t a problem for me or anything. :) |
Good point! I think we can handle this as a generic case: i.e. when an escape sequence is invalid, just return the character it escapes (i.e. |
Maybe, but not today… everything’s still a bit christmas-y right now ;) |
Sure. Happy holidays then :–) |
Thanks – happy holidays to you too! |
Octal escape sequences starting with 0 should be expanded before the usual escape sequences to avoid ambiguity with the
\0
escape.This patch aligns the behaviour with how JS parses the sequences.