Narrow regex subject to decimal-int-string when every alternation branch is a decimal integer #5814

Merged
staabm merged 3 commits into phpstan:2.2.x from phpstan-bot:create-pull-request/patch-jjjt2e1 on Jun 7, 2026
@phpstan-bot
Collaborator

Summary

preg_match() can narrow its subject parameter to decimal-int-string when the
pattern only matches decimal integers. For the common integer pattern
/^(?:0|-?[1-9][0-9]*)$/, however, PHPStan previously narrowed the subject only to
non-empty-string. This change makes an alternation whose every branch is a
decimal integer narrow the subject (and the whole-match group) to
decimal-int-string.
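
To illustrate the kind of code that benefits, here is a minimal sketch. The function name parseDecimalInt() is hypothetical and not part of PHPStan or this PR; the inferred types in the comment are the ones stated in this summary.

```php
<?php
declare(strict_types=1);

// Hypothetical helper for illustration only.
function parseDecimalInt(string $input): ?int
{
    if (preg_match('/^(?:0|-?[1-9][0-9]*)$/', $input) === 1) {
        // Per the summary above, PHPStan infers decimal-int-string for $input
        // in this branch after the change (previously non-empty-string).
        return (int) $input;
    }

    // Anything else (empty string, leading zeros, letters) is rejected.
    return null;
}
```

The runtime behaviour is unchanged by the PR; only the type PHPStan infers for $input inside the branch becomes more precise.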

Changes

  • src/Type/Regex/RegexGroupParser.php:
    • Replaced the decimalInteger($walkResult->isDecimalInteger()->and($decimalInteger))
      combination in the alternation branch of walkGroupAst() with a call to the new
      concatDecimalInteger() helper.
    • Added concatDecimalInteger(TrinaryLogic $left, TrinaryLogic $right): a
      non-decimal part forces no, a decimal part forces yes, otherwise maybe.
  • tests/PHPStan/Analyser/nsrt/bug-14784.php: regression test plus analogous cases.
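
The rule stated for concatDecimalInteger() can be sketched as follows. This is not the actual PHPStan code: the Tri enum is a hypothetical stand-in for TrinaryLogic, and the precedence (no is checked before yes) is an assumption read from the description above.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for PHPStan's TrinaryLogic, reduced to three states.
enum Tri
{
    case Yes;
    case Maybe;
    case No;
}

// Sketch of the described rule; the real helper takes TrinaryLogic arguments.
function concatDecimalIntegerSketch(Tri $left, Tri $right): Tri
{
    // A non-decimal part on either side rules the whole string out.
    if ($left === Tri::No || $right === Tri::No) {
        return Tri::No;
    }

    // A decimal part commits to yes, even if the other side is undetermined.
    if ($left === Tri::Yes || $right === Tri::Yes) {
        return Tri::Yes;
    }

    return Tri::Maybe;
}
```

Under these assumptions, combining the initial maybe state with an all-decimal alternation (yes) gives yes, which is the case the PR targets.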

Root cause

Decimal-integer-ness of a string is the conjunction of its parts, but the running
state starts as maybe (nothing seen yet). The per-token logic upgrades maybe
to yes when it sees a digit. The alternation branch instead combined its
all-branches-decimal result into the running state with TrinaryLogic::and(), and
maybe->and(yes) is maybe, so an all-decimal alternation never committed to
yes. Because the whole match (the subject base type) is computed from this
walkGroupAst() result, the narrowing was lost for any pattern whose top-level
shape is an alternation. The new helper mirrors the token logic: a decimal part
forces yes regardless of the still-undetermined maybe prefix, while a non-decimal
part still forces no. The capturing-group element type is produced by a separate
root-alternation path (getRootAlternation()) and was already precise
('0'|(decimal-int-string&non-falsy-string)), so it was left untouched.
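
The difference between the two combinations can be reproduced in a small, self-contained sketch. The Trinary enum, its concat() method, and foldParts() are hypothetical stand-ins written for this illustration; only the three-valued and() behaviour (the weaker operand wins) is taken from the description above.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for PHPStan's TrinaryLogic (an assumption of this sketch).
enum Trinary: int
{
    case No = 0;
    case Maybe = 1;
    case Yes = 2;

    // Three-valued AND: the result is the weaker of the two operands.
    public function and(Trinary $other): Trinary
    {
        return self::from(min($this->value, $other->value));
    }

    // The rule described for the new helper: no wins, then yes, else maybe.
    public function concat(Trinary $other): Trinary
    {
        if ($this === self::No || $other === self::No) {
            return self::No;
        }

        return ($this === self::Yes || $other === self::Yes) ? self::Yes : self::Maybe;
    }
}

/** @param list<Trinary> $parts per-part decimal-integer results, in order */
function foldParts(array $parts, bool $useConcat): Trinary
{
    $state = Trinary::Maybe; // nothing seen yet
    foreach ($parts as $part) {
        $state = $useConcat ? $state->concat($part) : $state->and($part);
    }

    return $state;
}

$old = foldParts([Trinary::Yes], false);            // Trinary::Maybe: the yes is swallowed
$new = foldParts([Trinary::Yes], true);             // Trinary::Yes: the alternation commits
$bad = foldParts([Trinary::Yes, Trinary::No], true); // Trinary::No: a non-decimal part still wins
```

The sketch treats a pattern as a flat list of parts, which is a simplification of how walkGroupAst() traverses the regex AST.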

Test

tests/PHPStan/Analyser/nsrt/bug-14784.php reproduces the reported pattern and
covers analogous cases:

  • the reported /^(?:0|-?[1-9][0-9]*)$/ (non-capturing) and its capturing variant,
  • alternation of two decimal literals /^(?:0|123)$/,
  • alternation without a leading sign /^(?:0|[1-9][0-9]*)$/,
  • required digits followed by an alternation /^[0-9]+(?:0|5)$/,
  • a negative case where a non-decimal branch (/^(?:0|abc)$/) keeps non-empty-string,
  • the capturing-group element type.
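
As a runtime spot check of why the positive and negative cases differ, the sketch below runs two of the listed patterns over a handful of sample strings. It is not the nsrt test itself. looksLikeDecimalInt() is a rough approximation chosen for this example (optional minus sign followed by ASCII digits); PHPStan's exact definition of decimal-int-string may differ, and the sample list is arbitrary.

```php
<?php
declare(strict_types=1);

// Arbitrary sample strings, chosen for this illustration only.
const SAMPLES = ['0', '7', '-42', '123', '', '007', '-0', 'abc', '12a'];

// Rough approximation (an assumption, not PHPStan's definition):
// an optional minus sign followed by one or more ASCII digits.
function looksLikeDecimalInt(string $s): bool
{
    return preg_match('/^-?[0-9]+$/D', $s) === 1;
}

/**
 * Returns true when every sample accepted by $pattern looks like a decimal integer.
 *
 * @param list<string> $samples
 */
function matchesOnlyDecimalInts(string $pattern, array $samples): bool
{
    foreach ($samples as $s) {
        if (preg_match($pattern, $s) === 1 && !looksLikeDecimalInt($s)) {
            return false;
        }
    }

    return true;
}

// Every branch is a decimal integer, so every accepted sample is one too.
$reported = matchesOnlyDecimalInts('/^(?:0|-?[1-9][0-9]*)$/', SAMPLES); // true

// The abc branch accepts a non-decimal string, so narrowing would be unsound.
$negative = matchesOnlyDecimalInts('/^(?:0|abc)$/', SAMPLES); // false
```

A sample-based check like this only illustrates the intent; the static analysis reasons about the pattern's structure rather than about concrete inputs.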

Additionally verified via ad-hoc analysis that nested alternations
(/^(?:0|(?:1|2))$/), non-anchored patterns, empty-or-decimal alternations, and
hex alternations all infer the correct types. The full NodeScopeResolverTest
suite, the preg_match* shape tests, the regex type tests, and make phpstan
self-analysis all pass.

Fixes phpstan/phpstan#14784

phpstan-bot and others added 2 commits June 6, 2026 13:53
…ranch is a decimal integer

- `RegexGroupParser::walkGroupAst()` combined the running decimal-integer
  state with an alternation result using `TrinaryLogic::and()`. With the
  initial "maybe" state this swallowed an all-decimal alternation
  (`maybe->and(yes) === maybe`), so patterns like `/^(?:0|-?[1-9][0-9]*)$/`
  narrowed the `preg_match()` subject only to `non-empty-string`.
- Introduce `concatDecimalInteger()` which mirrors the per-token logic: a
  non-decimal part forces `no`, a decimal part forces `yes`, otherwise
  `maybe`. This fixes the subject base type and the whole-match group used
  by `preg_match`/`preg_match_all`.
- Works for the capturing-group variant, alternations without a leading
  sign, alternations of decimal literals, and nested alternations.
@staabm staabm requested a review from VincentLanglet June 6, 2026 14:17
@staabm staabm merged commit 444acab into phpstan:2.2.x Jun 7, 2026
655 of 671 checks passed
@staabm staabm deleted the create-pull-request/patch-jjjt2e1 branch June 7, 2026 09:05
Development

Successfully merging this pull request may close these issues.

narrow to decimal-int-string by regex
