Commit 48cb152

LibRegex: Explicitly check if a character falls into a table-based range
Previously, for a regex such as /[a-sy-z]/i, we would incorrectly think the character "u" fell into the range "a-s" because neither of the conditions "u > s && U > s" or "u < a && U < a" would be true, resulting in the lookup falling back to assuming the character is in the range. Instead, first explicitly check if the character falls into the range, rather than checking if it falls outside the range. If the explicit checks fail, then we know the character is outside the range.
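The failure described above can be reproduced in isolation. The following is a minimal sketch, not the LibRegex source: `CharRange`, `ascii_lower`, `ascii_upper`, `compare_old`, and `compare_new` are hypothetical stand-ins that mirror the comparison logic before and after this commit, assuming ASCII-only case folding.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a table entry: an inclusive code point range.
struct CharRange {
    uint32_t from;
    uint32_t to;
};

// ASCII-only case folding, assumed to behave like to_ascii_lowercase/to_ascii_uppercase.
static uint32_t ascii_lower(uint32_t c) { return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c; }
static uint32_t ascii_upper(uint32_t c) { return (c >= 'a' && c <= 'z') ? c - ('a' - 'A') : c; }

// Old logic: only rules the needle out when BOTH case variants lie on the
// same side of the range, and otherwise falls through to "in range" (0).
int compare_old(uint32_t needle, CharRange range)
{
    uint32_t lower = ascii_lower(needle);
    uint32_t upper = ascii_upper(needle);
    if (lower > range.to && upper > range.to)
        return 1;
    if (lower < range.from && upper < range.from)
        return -1;
    return 0;
}

// New logic: test membership explicitly first, then pick a direction.
int compare_new(uint32_t needle, CharRange range)
{
    uint32_t lower = ascii_lower(needle);
    uint32_t upper = ascii_upper(needle);
    if (lower >= range.from && lower <= range.to)
        return 0;
    if (upper >= range.from && upper <= range.to)
        return 0;
    if (lower > range.to || upper > range.to)
        return 1;
    return -1;
}
```

For the range "a-s", 'u' (117) lies above 's' (115) while 'U' (85) lies below 'a' (97), so neither of the old conditions holds and `compare_old('u', { 'a', 's' })` returns 0, a false match. `compare_new` returns 1 for the same input and still returns 0 for a character such as 'b'.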
1 parent 27f5a18 commit 48cb152

2 files changed, +11 -5 lines changed


Tests/LibRegex/Regex.cpp

Lines changed: 4 additions & 1 deletion
@@ -690,7 +690,10 @@ TEST_CASE(ECMA262_match)
         { "a|$"sv, "x"sv, true, (ECMAScriptFlags)regex::AllFlags::Global }, // #11940, Global (not the 'g' flag) regexps should attempt to match the zero-length end of the string too.
         { "foo\nbar"sv, "foo\nbar"sv, true }, // #12126, ECMA262 regexp should match literal newlines without the 's' flag.
         { "foo[^]bar"sv, "foo\nbar"sv, true }, // #12126, ECMA262 regexp should match newline with [^].
-        { "^[_A-Z]+$"sv, "_aA"sv, true, ECMAScriptFlags::Insensitive } // Insensitive lookup table: characters in a range do not necessarily lie in the same range after being converted to lowercase.
+        { "^[_A-Z]+$"sv, "_aA"sv, true, ECMAScriptFlags::Insensitive }, // Insensitive lookup table: characters in a range do not necessarily lie in the same range after being converted to lowercase.
+        { "^[a-sy-z]$"sv, "b"sv, true, ECMAScriptFlags::Insensitive },
+        { "^[a-sy-z]$"sv, "y"sv, true, ECMAScriptFlags::Insensitive },
+        { "^[a-sy-z]$"sv, "u"sv, false, ECMAScriptFlags::Insensitive },
     };
     // clang-format on

Userland/Libraries/LibRegex/RegexByteCode.cpp

Lines changed: 7 additions & 4 deletions
@@ -557,11 +557,14 @@ ALWAYS_INLINE ExecutionResult OpCode_Compare::execute(MatchInput const& input, M
             upper_case_needle = to_ascii_uppercase(needle);
             lower_case_needle = to_ascii_lowercase(needle);
         }
-        if (lower_case_needle > range.to && upper_case_needle > range.to)
+
+        if (lower_case_needle >= range.from && lower_case_needle <= range.to)
+            return 0;
+        if (upper_case_needle >= range.from && upper_case_needle <= range.to)
+            return 0;
+        if (lower_case_needle > range.to || upper_case_needle > range.to)
             return 1;
-        if (lower_case_needle < range.from && upper_case_needle < range.from)
-            return -1;
-        return 0;
+        return -1;
     });

     if (matching_range) {
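To show how the new comparator behaves across a whole table, here is a small self-contained sketch. It assumes the table is a list of ranges sorted by code point and searched with a binary search driven by the comparator's -1/0/1 result. The names `CharRange`, `compare_insensitive`, and `table_contains_insensitive` are hypothetical and not taken from LibRegex, and the search loop is an illustration rather than the library's own lookup routine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a table entry: an inclusive code point range.
struct CharRange {
    uint32_t from;
    uint32_t to;
};

static uint32_t ascii_lower(uint32_t c) { return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c; }
static uint32_t ascii_upper(uint32_t c) { return (c >= 'a' && c <= 'z') ? c - ('a' - 'A') : c; }

// Mirrors the comparison added in this commit: explicit membership checks
// for both case variants, then a direction hint for the search.
static int compare_insensitive(uint32_t needle, CharRange range)
{
    uint32_t lower = ascii_lower(needle);
    uint32_t upper = ascii_upper(needle);
    if (lower >= range.from && lower <= range.to)
        return 0;
    if (upper >= range.from && upper <= range.to)
        return 0;
    if (lower > range.to || upper > range.to)
        return 1;
    return -1;
}

// Assumed lookup: binary search over ranges sorted by code point.
bool table_contains_insensitive(std::vector<CharRange> const& sorted_ranges, uint32_t needle)
{
    std::ptrdiff_t low = 0;
    std::ptrdiff_t high = static_cast<std::ptrdiff_t>(sorted_ranges.size()) - 1;
    while (low <= high) {
        std::ptrdiff_t mid = low + (high - low) / 2;
        int result = compare_insensitive(needle, sorted_ranges[static_cast<std::size_t>(mid)]);
        if (result == 0)
            return true;
        if (result > 0)
            low = mid + 1; // needle sorts after this range
        else
            high = mid - 1; // needle sorts before this range
    }
    return false;
}
```

With the table `{ { 'a', 's' }, { 'y', 'z' } }`, which corresponds to the class in the new test cases, this sketch accepts 'b' and 'y' and rejects 'u', matching the expectations added to Tests/LibRegex/Regex.cpp.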
