Narrow regex subject to decimal-int-string when every alternation branch is a decimal integer #5814
Merged
staabm merged 3 commits intoJun 7, 2026
…ranch is a decimal integer
- `RegexGroupParser::walkGroupAst()` combined the running decimal-integer state with an alternation result using `TrinaryLogic::and()`. With the initial "maybe" state this swallowed an all-decimal alternation (`maybe->and(yes) === maybe`), so patterns like `/^(?:0|-?[1-9][0-9]*)$/` narrowed the `preg_match()` subject only to `non-empty-string`.
- Introduce `concatDecimalInteger()` which mirrors the per-token logic: a non-decimal part forces `no`, a decimal part forces `yes`, otherwise `maybe`. This fixes the subject base type and the whole-match group used by `preg_match`/`preg_match_all`.
- Works for the capturing-group variant, alternations without a leading sign, alternations of decimal literals, and nested alternations.
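The difference between the two combinations can be sketched in isolation. This is a minimal illustration, not the merged code: `Tri` is a hypothetical stand-in for PHPStan's `TrinaryLogic`, and `concatDecimalIntegerSketch()` is one reading of the rule described above (a non-decimal part forces `no`, a decimal part forces `yes`, otherwise `maybe`).

```php
<?php
declare(strict_types=1);

// Hypothetical three-valued logic type for illustration only;
// this is NOT PHPStan's TrinaryLogic class.
enum Tri: string
{
    case Yes = 'yes';
    case Maybe = 'maybe';
    case No = 'no';

    // Three-valued conjunction: "no" dominates, then "maybe", then "yes".
    public function and(Tri $other): Tri
    {
        if ($this === Tri::No || $other === Tri::No) {
            return Tri::No;
        }
        if ($this === Tri::Maybe || $other === Tri::Maybe) {
            return Tri::Maybe;
        }
        return Tri::Yes;
    }
}

// Assumed shape of the helper, derived from the description:
// any "no" wins, otherwise any "yes" wins, otherwise "maybe".
function concatDecimalIntegerSketch(Tri $left, Tri $right): Tri
{
    if ($left === Tri::No || $right === Tri::No) {
        return Tri::No;
    }
    if ($left === Tri::Yes || $right === Tri::Yes) {
        return Tri::Yes;
    }
    return Tri::Maybe;
}

$running = Tri::Maybe;   // nothing seen yet
$alternation = Tri::Yes; // every branch is a decimal integer

echo $running->and($alternation)->value, "\n";                       // maybe
echo concatDecimalIntegerSketch($running, $alternation)->value, "\n"; // yes
echo concatDecimalIntegerSketch($running, Tri::No)->value, "\n";      // no
```

With plain conjunction the initial `maybe` absorbs the `yes` coming from the alternation, whereas the concatenation rule lets the `yes` through and still lets a `no` veto the result.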
staabm
approved these changes
Jun 6, 2026
Summary
`preg_match()` can narrow its subject parameter to `decimal-int-string` when the pattern only matches decimal integers. For the common integer pattern `/^(?:0|-?[1-9][0-9]*)$/` PHPStan only narrowed the subject to `non-empty-string`. This change makes alternations whose every branch is a decimal integer narrow the subject (and the whole-match group) to `decimal-int-string`.

Changes
- `src/Type/Regex/RegexGroupParser.php`:
  - Replace the `decimalInteger` (`$walkResult->isDecimalInteger()->and($decimalInteger)`) combination in the alternation branch of `walkGroupAst()` with a call to the new `concatDecimalInteger()` helper.
  - `concatDecimalInteger(TrinaryLogic $left, TrinaryLogic $right)`: a non-decimal part forces `no`, a decimal part forces `yes`, otherwise `maybe`.
- `tests/PHPStan/Analyser/nsrt/bug-14784.php`: regression test plus analogous cases.

Root cause
Decimal-integer-ness of a string is the conjunction of its parts, but the running state starts as `maybe` (nothing seen yet). The per-token logic upgrades `maybe` to `yes` when it sees a digit. The alternation branch instead combined its all-branches-decimal result into the running state with `TrinaryLogic::and()`, and `maybe->and(yes)` is `maybe`, so an all-decimal alternation never committed to `yes`. The whole match (subject base type) is computed from this `walkGroupAst()` result, so the narrowing was lost for any pattern whose top-level shape is an alternation.

The new helper mirrors the token logic: a decimal part forces `yes` regardless of the still-undetermined `maybe` prefix, while a non-decimal part still forces `no`.

The capturing-group element type is produced by a separate root-alternation path (`getRootAlternation()`) and was already precise (`'0'|(decimal-int-string&non-falsy-string)`), so it was left untouched.

Test
`tests/PHPStan/Analyser/nsrt/bug-14784.php` reproduces the reported pattern and covers analogous cases:
- `/^(?:0|-?[1-9][0-9]*)$/` (non-capturing) and its capturing variant,
- `/^(?:0|123)$/`,
- `/^(?:0|[1-9][0-9]*)$/`,
- `/^[0-9]+(?:0|5)$/`,
- `/^(?:0|abc)$/` keeps `non-empty-string`.

Additionally verified via ad-hoc analysis that nested alternations (`/^(?:0|(?:1|2))$/`), non-anchored patterns, empty-or-decimal alternations, and hex alternations all infer the correct types. The full `NodeScopeResolverTest` suite, the `preg_match*` shape tests, the regex type tests, and `make phpstan` self-analysis all pass.
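As a runtime cross-check of two of the patterns above, the following sketch lists which sample subjects each pattern accepts. It only exercises `preg_match()` itself and says nothing about PHPStan's inference; the helper name `acceptedSubjects()` and the sample strings are illustrative assumptions, not part of the regression test.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: returns the subjects that the pattern accepts.
function acceptedSubjects(string $pattern, array $subjects): array
{
    return array_values(array_filter(
        $subjects,
        static fn (string $s): bool => preg_match($pattern, $s) === 1
    ));
}

// Illustrative samples chosen for this sketch.
$samples = ['0', '42', '-12', '', '01', '-0', 'abc'];

// Reported pattern: every alternation branch is a decimal integer.
$reported = acceptedSubjects('/^(?:0|-?[1-9][0-9]*)$/', $samples);

// Mixed pattern: one branch is not a decimal integer.
$mixed = acceptedSubjects('/^(?:0|abc)$/', $samples);

echo implode(',', $reported), "\n"; // 0,42,-12
echo implode(',', $mixed), "\n";    // 0,abc
```

The mixed pattern accepts `abc`, which is consistent with that case keeping `non-empty-string` rather than being narrowed further.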
Fixes phpstan/phpstan#14784