- Fixed bug #52971 (PCRE-Meta-Characters not working with utf-8)
# In PCRE, by default, \d, \D, \s, \S, \w, and \W recognize only ASCII
# characters, even in UTF-8 mode. However, this can be changed by setting
# the PCRE_UCP option.
felipensp committed Oct 3, 2010
1 parent 4b0927b commit 090a9b3
Showing 3 changed files with 52 additions and 1 deletion.
1 change: 1 addition & 0 deletions NEWS
@@ -22,6 +22,7 @@
 - Fixed possible crash in mssql_fetch_batch(). (Kalle)
 - Fixed inconsistent backlog default value (-1) in FPM on many systems. (fat)
 
+- Fixed bug #52971 (PCRE-Meta-Characters not working with utf-8). (Felipe)
 - Fixed bug #52947 (segfault when ssl stream option capture_peer_cert_chain
   used). (Felipe)
 - Fixed bug #52944 (Invalid write on second and subsequent reads with an
9 changes: 8 additions & 1 deletion ext/pcre/php_pcre.c
@@ -350,7 +350,14 @@ PHPAPI pcre_cache_entry* pcre_get_compiled_regex_cache(char *regex, int regex_le
         case 'S': do_study = 1; break;
         case 'U': coptions |= PCRE_UNGREEDY; break;
         case 'X': coptions |= PCRE_EXTRA; break;
-        case 'u': coptions |= PCRE_UTF8; break;
+        case 'u': coptions |= PCRE_UTF8;
+            /* In PCRE, by default, \d, \D, \s, \S, \w, and \W recognize only ASCII
+               characters, even in UTF-8 mode. However, this can be changed by setting
+               the PCRE_UCP option. */
+#ifdef PCRE_UCP
+            coptions |= PCRE_UCP;
+#endif
+            break;
 
         /* Custom preg options */
         case 'e': poptions |= PREG_REPLACE_EVAL; break;
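The behavioural difference behind this change can be sketched in plain C without linking against PCRE. The sketch below is only an illustration under stated assumptions: the helper names (`is_word_ascii`, `is_word_approx_ucp`, `find_at_boundary`) are hypothetical and not part of PHP or PCRE, the search works on bytes rather than code points, and the "UCP-like" classifier crudely treats every non-ASCII byte as part of a word character, whereas PCRE consults Unicode properties.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* ASCII-only notion of a word character, roughly what \w means when
 * PCRE_UCP is not set: bytes outside ASCII are never word characters. */
static int is_word_ascii(unsigned char c)
{
    return c < 0x80 && (isalnum(c) || c == '_');
}

/* Crude stand-in for Unicode-aware \w (an assumption for illustration):
 * any byte of a multi-byte UTF-8 sequence counts as a word character. */
static int is_word_approx_ucp(unsigned char c)
{
    return c >= 0x80 || is_word_ascii(c);
}

/* Case-insensitive prefix test for an ASCII needle. */
static int starts_with_nocase(const char *text, const char *needle)
{
    for (; *needle != '\0'; text++, needle++) {
        if (tolower((unsigned char)*text) != tolower((unsigned char)*needle)) {
            return 0;
        }
    }
    return 1;
}

/* Collect byte offsets where `needle` starts right after a non-word byte
 * (or at the start of `text`), similar in spirit to /\bneedle/i.
 * Returns the number of offsets stored. */
static size_t find_at_boundary(const char *text, const char *needle,
                               int (*is_word)(unsigned char),
                               size_t *offsets, size_t max_offsets)
{
    size_t count = 0;
    for (size_t i = 0; text[i] != '\0' && count < max_offsets; i++) {
        if (i > 0 && is_word((unsigned char)text[i - 1])) {
            continue; /* preceded by a word character: no boundary here */
        }
        if (!starts_with_nocase(text + i, needle)) {
            continue;
        }
        offsets[count++] = i;
    }
    return count;
}
```

Run against the UTF-8 test string from bug52971.phpt with the needle "wasser", the ASCII-only classifier also reports a boundary inside "Süßwasserpool" because the bytes of "ß" are not ASCII word characters, while the UCP-like classifier reports only the standalone "Wasser...".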
43 changes: 43 additions & 0 deletions ext/pcre/tests/bug52971.phpt
@@ -0,0 +1,43 @@
+--TEST--
+Bug #52971 (PCRE-Meta-Characters not working with utf-8)
+--SKIPIF--
+<?php if ((double)PCRE_VERSION < 8.1) die('skip PCRE_VERSION >= 8.1 is required!'); ?>
+--FILE--
+<?php
+
+$message = 'Der ist ein Süßwasserpool Süsswasserpool ... verschiedene Wassersportmöglichkeiten bei ...';
+
+$pattern = '/\bwasser/iu';
+preg_match_all($pattern, $message, $match, PREG_OFFSET_CAPTURE);
+var_dump($match);
+
+$pattern = '/[^\w]wasser/iu';
+preg_match_all($pattern, $message, $match, PREG_OFFSET_CAPTURE);
+var_dump($match);
+
+?>
+--EXPECTF--
+array(1) {
+  [0]=>
+  array(1) {
+    [0]=>
+    array(2) {
+      [0]=>
+      string(6) "Wasser"
+      [1]=>
+      int(61)
+    }
+  }
+}
+array(1) {
+  [0]=>
+  array(1) {
+    [0]=>
+    array(2) {
+      [0]=>
+      string(7) " Wasser"
+      [1]=>
+      int(60)
+    }
+  }
+}
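The offsets in the expected output (61 and 60) are byte offsets into the UTF-8 string, not character counts, since PREG_OFFSET_CAPTURE reports positions in bytes. A small C sketch, using hypothetical helper names that do not appear in the PHP source, shows how the two measures differ for a UTF-8 string:

```c
#include <stddef.h>
#include <string.h>

/* Byte offset of the first occurrence of `needle` in `haystack`,
 * or (size_t)-1 if it does not occur. */
static size_t byte_offset(const char *haystack, const char *needle)
{
    const char *hit = strstr(haystack, needle);
    return hit ? (size_t)(hit - haystack) : (size_t)-1;
}

/* Number of UTF-8 code points in the first `n_bytes` bytes of `s`.
 * Continuation bytes have the form 10xxxxxx and are not counted.
 * Assumes `s` is valid UTF-8 and at least `n_bytes` bytes long. */
static size_t utf8_chars_before(const char *s, size_t n_bytes)
{
    size_t chars = 0;
    for (size_t i = 0; i < n_bytes; i++) {
        if (((unsigned char)s[i] & 0xC0) != 0x80) {
            chars++;
        }
    }
    return chars;
}
```

For the test message, "Wasser" starts at byte 61 but is preceded by only 58 characters, because "ü", "ß" and "ü" each occupy two bytes in UTF-8.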
