Handle all none unicode character class escapes natively #439

tjenkinson · 2023-10-15T16:12:29Z

Before this change, character class escapes (with the exception of \d) were not resolved to their possible values.

This meant that a pattern like /\w+\W+/ would be considered vulnerable even though /[a-zA-Z0-9_]+[^a-zA-Z0-9_]+/ would not, because the analysis didn't know that \w and \W have no possible values in common. Note that there was already some logic that would cancel out exact inversions if the pattern was written like /\w+[^\w]+/.

All non-Unicode character class escapes are now resolved to their possible values.
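
To illustrate the idea (this is a rough sketch, not the code from this PR), an escape can be resolved to a list of code point ranges, and two escapes can then be checked for overlap. The names `Range`, `WORD`, `DIGIT`, `invert` and `overlaps` are made up for this example, and the ranges assume the `i` flag is not set.

```typescript
// Inclusive code point range [start, end].
type Range = readonly [number, number];

const MAX_CODE_POINT = 0x10ffff;

// Hypothetical lookup values for illustration: \d and \w without the `i` flag.
const DIGIT: Range[] = [[0x30, 0x39]];
const WORD: Range[] = [
  [0x30, 0x39], // 0-9
  [0x41, 0x5a], // A-Z
  [0x5f, 0x5f], // _
  [0x61, 0x7a], // a-z
];

// Complement of a sorted, non-overlapping range list (e.g. \w -> \W).
function invert(ranges: readonly Range[]): Range[] {
  const out: Range[] = [];
  let next = 0;
  for (const [start, end] of ranges) {
    if (start > next) out.push([next, start - 1]);
    next = end + 1;
  }
  if (next <= MAX_CODE_POINT) out.push([next, MAX_CODE_POINT]);
  return out;
}

// True if any range in `a` shares at least one code point with a range in `b`.
function overlaps(a: readonly Range[], b: readonly Range[]): boolean {
  return a.some(([aStart, aEnd]) =>
    b.some(([bStart, bEnd]) => aStart <= bEnd && bStart <= aEnd),
  );
}

const NOT_WORD = invert(WORD);

console.log(overlaps(WORD, NOT_WORD)); // false: \w and \W share no values
console.log(overlaps(WORD, DIGIT)); // true: every \d value is also a \w value
```

With the values resolved like this, /\w+\W+/ and /[a-zA-Z0-9_]+[^a-zA-Z0-9_]+/ end up with the same ranges, so they can be treated the same way.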

Unicode escapes are not expanded. I'm not sure how to safely support those, given that the contents of a Unicode property can change over time as new Unicode versions are released. There's also currently no API to ask the browser which version of Unicode it's on, or what the contents of a Unicode property are.

As a result, /^[\w+-]+(?:\.[\w+-]+)*@[\da-zA-Z]+(?:[.-][\da-zA-Z]+)*\.[a-zA-Z]{2,}$/u, which is used in some places for validating email addresses, is now marked as safe, where it previously wasn't.

tjenkinson force-pushed the handle-more-character-classes branch 2 times, most recently from 4b82d85 to ece1ede, October 15, 2023 16:16

Handle all none unicode character class escapes natively

2d08b12

tjenkinson force-pushed the handle-more-character-classes branch from ece1ede to 2d08b12, October 19, 2023 21:12

tjenkinson merged commit 2a5cb51 into main Oct 21, 2023
6 checks passed

tjenkinson deleted the handle-more-character-classes branch October 21, 2023 18:09
