Regex matching errors when using \W character class and /i option #4
I tested several patterns with Perl 5.14 and 5.16:
★: Be careful with these results.
Other test patterns:
Perl 5.14 has some inconsistencies, but they are fixed in Perl 5.16.
Test script for Onigmo:
More test patterns with Perl 5.16:
\p{ASCII}, [[:ascii:]], \p{Word}, \w, [[:word:]] and their negated patterns inside a character class must be handled specially. They should not match across ASCII/non-ASCII boundary. Exclude them from ASCII/non-ASCII case folding.
All POSIX brackets should not match across ASCII boundary when ASCII flag is on.
/(?ia)[[:lower:]][[:upper:]]/ =~ "Ab" failed.
\p{ASCII}, \w, all POSIX brackets and their negated patterns inside a character class must be handled specially. They should not match across ASCII/non-ASCII boundary. Exclude them from ASCII/non-ASCII case folding.
* \p{ASCII} and [[:ascii:]] should not match across ASCII boundary. They don't depend on ASCII flag.
* \w and all POSIX brackets should not match across ASCII boundary when ASCII flag is on.
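To make the "ASCII/non-ASCII boundary" concrete, here is a small Python sketch (not Onigmo code) that lists non-ASCII characters whose case folding is a single ASCII character. It relies on Python's `str.casefold`, which is only assumed to approximate the folding table a regex engine uses; the function name `cross_ascii_folds` is made up for illustration. The results include U+017F (LATIN SMALL LETTER LONG S, folds to "s") and U+212A (KELVIN SIGN, folds to "k").

```python
def cross_ascii_folds(limit=0x10000):
    """Map each non-ASCII char below `limit` to its fold, if that fold is one ASCII char.

    Hypothetical helper: uses Python's full case folding as a stand-in
    for the engine's own case-fold table.
    """
    result = {}
    for cp in range(0x80, limit):
        ch = chr(cp)
        folded = ch.casefold()
        # Keep only one-to-one folds that land inside the ASCII range.
        if len(folded) == 1 and ord(folded) < 0x80:
            result[ch] = folded
    return result


if __name__ == "__main__":
    for ch, folded in sorted(cross_ascii_folds().items()):
        print(f"U+{ord(ch):04X} folds to {folded!r}")
```

A class that contains such a character (for example an ASCII-only \W) would, under naive case folding, also start matching the ASCII letter it folds to.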
Fix character class with ignore case. (Issue #4)
Conflicts: regparse.c
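Details of this bug in Japanese.
Symptom
Affected versions
Cause: ignore case flag
Fix: When parsing a character class and expanding character properties or POSIX classes into individual characters, determine whether case folding may cross the ASCII boundary, and a character class that collects only the characters for which it is allowed (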
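The approach in the fix description can be sketched roughly as follows in Python. This is not the actual `regparse.c` change: `expand_ignore_case`, its parameters, and the use of `str.casefold` are all assumptions made for illustration. The idea is that a candidate character joins the class through case folding only if the fold is allowed to cross the ASCII boundary, or if the candidate and the class member are on the same side of it.

```python
def is_ascii(ch):
    return ord(ch) < 0x80


def expand_ignore_case(members, candidates, allow_cross_ascii):
    """Return `members` plus the candidates that case-fold to some member.

    Hypothetical sketch: when `allow_cross_ascii` is False, a candidate is
    added only if it lies on the same side of the ASCII boundary as the
    member it folds to.
    """
    by_fold = {}
    for m in members:
        by_fold.setdefault(m.casefold(), []).append(m)

    result = set(members)
    for c in candidates:
        for m in by_fold.get(c.casefold(), []):
            if allow_cross_ascii or is_ascii(c) == is_ascii(m):
                result.add(c)
                break
    return result


if __name__ == "__main__":
    # Two non-ASCII members, as they would appear in an ASCII-only \W.
    members = {"\u017f", "\u212a"}
    print(sorted(expand_ignore_case(members, "skSK", allow_cross_ascii=False)))
    print(sorted(expand_ignore_case(members, "skSK", allow_cross_ascii=True)))
```

With `allow_cross_ascii=False` the ASCII letters stay out of the class, which is the behavior the commit messages above ask for in \p{ASCII}, \w and the POSIX brackets.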
* Uses Onigmo (Oniguruma-mod) 5.15.0 for bregonig.dll. https://github.com/k-takata/Onigmo/tree/Onigmo-5.15.0_for_bregonig
- Supports Unicode 7.0
- Merged Oniguruma 5.9.5
- Fixed a crash when a large number of groups is used k-takata/Onigmo#24
- Fixed /\x{1ffc}/i =~ "\x1ff3" failing to match
- Fixed /[a-c#]+\W/ =~ "def#" failing to match in UTF-16/32
- Fixed /(?i)\u0149\u0149/ =~ "\u0149\u0149" failing to match k-takata/Onigmo#40
- Fixed a problem when /w is used inside a character class with the /i option k-takata/Onigmo#4
- Fixed character properties ignoring the /i option k-takata/Onigmo#41
- Fixed "ab" =~ /(?!^a).*b/ failing to match k-takata/Onigmo#44
see: http://bugs.ruby-lang.org/issues/4044