Closed
1 change: 1 addition & 0 deletions ext/pcre/php_pcre.c
@@ -725,6 +725,7 @@ PHPAPI pcre_cache_entry* pcre_get_compiled_regex_cache_ex(zend_string *regex, in
/* Perl compatible options */
case 'i': coptions |= PCRE2_CASELESS; break;
case 'm': coptions |= PCRE2_MULTILINE; break;
case 'n': coptions |= PCRE2_NO_AUTO_CAPTURE; break;
case 's': coptions |= PCRE2_DOTALL; break;
case 'x': coptions |= PCRE2_EXTENDED; break;

25 changes: 25 additions & 0 deletions ext/pcre/tests/preg_match_non_capture.phpt
@@ -0,0 +1,25 @@
--TEST--
testing /n modifier in preg_match()
--FILE--
<?php

preg_match('/.(.)./n', 'abc', $m);
var_dump($m);

preg_match('/.(?P<test>.)./n', 'abc', $m);
var_dump($m);

?>
--EXPECT--
array(1) {
  [0]=>
  string(3) "abc"
}
array(3) {
  [0]=>
  string(3) "abc"
  ["test"]=>
  string(1) "b"
  [1]=>

How is producing a numbered capture "passing" when that's what the n modifier is supposed to suppress? From the second call, I'd expect this output:

array(2) {
  [0]=>
  string(3) "abc"
  ["test"]=>
  string(1) "b"
}

Contributor Author

@felipensp felipensp Oct 17, 2021

Hi Roy, this is the same behavior as in Perl.

$ echo "abc" | perl -ne 'my @a = $_ =~ /.(?P<a>.)/n;print $1;'
b

Fair enough, but that's a pity since Perl provides a separate hash for named groups, but PHP gets the numbered and named groups lumped in the same array.

Contributor Author

> Fair enough, but that's a pity since Perl provides a separate hash for named groups, but PHP gets the numbered and named groups lumped in the same array.

That is not a problem here. PCRE itself behaves this way; PHP is not the one creating this numbered entry. See the PCRE2 documentation:

  PCRE2_NO_AUTO_CAPTURE

       If this option is set, it disables the use of numbered capturing
       parentheses in the pattern. Any opening parenthesis that is not
       followed by ? behaves as if it were followed by ?: but named
       parentheses can still be used for capturing (and they acquire
       numbers in the usual way). This is the same as Perl's /n option.
       Note that, when this option is set, references to capture groups
       (backreferences or recursion/subroutine calls) may only refer to
       named groups, though the reference can be by name or by number.
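
To make the quoted rule concrete, here is a minimal sketch that assumes a PHP build which already includes this patch (the `n` modifier does not exist otherwise). The variable names are illustrative only. It compares a pattern compiled with `/n` against the same pattern with the unnamed group written explicitly as `(?:...)`, which is what the documentation says `/n` is equivalent to.

```php
<?php
// Assumption: this PHP build accepts the /n modifier added by the patch.

// Without /n: the unnamed group captures as 1, the named group becomes 2.
preg_match('/(.)(?P<test>.)/', 'ab', $plain);

// With /n: the unnamed group no longer captures, so the named group
// is the first capturing group and acquires number 1.
preg_match('/(.)(?P<test>.)/n', 'ab', $noAuto);

// The same pattern with the unnamed group made non-capturing by hand.
preg_match('/(?:.)(?P<test>.)/', 'ab', $explicit);

var_dump($plain);
var_dump($noAuto);
var_dump($noAuto === $explicit);
```

If the quoted documentation holds, the last `var_dump()` should report that both arrays are identical, including the numbered entry for the named group.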

Yes, I understood that PCRE returns matches with both numbered and named references, but surely it has always been PHP's choice to put both of those reference sets into the one array and not provide a way to conditionally omit/separate the numbered set, has it not? This is unlike Perl which puts the named set into its own hash:

echo abc | perl -ne '/(.)(?<x>.)(.)/n; $s = keys %+; print "$s\n"'

Member

The PCRE docs seem pretty clear-cut on what the intention for this flag is ("and they acquire numbers in the usual way" and "though the reference can be by name or by number"). I don't see a reason to deviate from that.

Maybe there is room here for another modifier or option to only return named captures, but this modifier clearly isn't it.
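
Until such an option exists, the named-only view can be approximated in userland. The sketch below is one possible workaround, not something this patch provides: `named_captures()` is a hypothetical helper that keeps only the string-keyed entries of the match array. It does not need the `n` modifier.

```php
<?php
// Hypothetical helper (not part of PHP or of this patch): drop the
// integer-keyed entries so only named captures remain.
function named_captures(array $matches): array
{
    // ARRAY_FILTER_USE_KEY passes each key to the callback; preg_match()
    // uses integer keys for numbered groups and string keys for names.
    return array_filter($matches, 'is_string', ARRAY_FILTER_USE_KEY);
}

preg_match('/.(?P<test>.)./', 'abc', $m);
var_dump(named_captures($m));
```

This relies on group names never being purely numeric, so a string key always denotes a named group.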

  string(1) "b"
}