Skip to content

Guard *scanf() return type extension by counter#5641

Open
hakre wants to merge 2 commits into
phpstan:2.1.xfrom
hakre:patch-1
Open

Guard *scanf() return type extension by counter#5641
hakre wants to merge 2 commits into
phpstan:2.1.xfrom
hakre:patch-1

Conversation

@hakre
Copy link
Copy Markdown
Contributor

@hakre hakre commented May 11, 2026

Use the recently fixed PrintfHelper::getScanfPlaceholdersCount() to detect invalid format strings in the return type extension. If the format is uncountable, the call is guaranteed to fail – return NullType immediately.

This is the same approach that eliminated the count regression. It already fixes all false‑positive type inferences for malformed formats.

For example, invalid format (mixing positional %n$ with sequential %) returns null.

@staabm
Copy link
Copy Markdown
Contributor

staabm commented May 11, 2026

can you come up with a test, this change fixes?

@hakre
Copy link
Copy Markdown
Contributor Author

hakre commented May 11, 2026

Absolutely — I already have a fixture for the exact example from the commit message (mixing positional %n$ with sequential %). I’ll push it now as a fixup.

@hakre
Copy link
Copy Markdown
Contributor Author

hakre commented May 11, 2026

Wow make build didn't notice the missing string parameter, refreshed.

@hakre hakre force-pushed the patch-1 branch 2 times, most recently from 3b1a8e8 to c386aab Compare May 11, 2026 13:26
@hakre
Copy link
Copy Markdown
Contributor Author

hakre commented May 11, 2026

Tests are green – the phpbench and Infection reds look pre‑existing / unrelated to this change. Let me know if you’d like me to dig into those or if this is good to go. 😊

@hakre hakre force-pushed the patch-1 branch 2 times, most recently from e4293ad to e38254a Compare May 11, 2026 13:55
@hakre
Copy link
Copy Markdown
Contributor Author

hakre commented May 11, 2026

Created a try snippet so it's better to compare with before and now aligned with the fixture that had another blooper: https://phpstan.org/r/087c4c5f-9a00-47cc-8d36-f21fb1b27a31

Comment thread tests/PHPStan/Analyser/nsrt/sscanf.php Outdated
}

function sscanfInvalidFormatMixingPositionalWithSequential(string $s) {
assertType('null', sscanf($s, '%1$s %s'));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be NEVER or Error on PHP8+ since it will throw an error ?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But given the fact the return type extension already use TypeCombinator::addNull somewhere else, I'm unsure...

Copy link
Copy Markdown
Contributor Author

@hakre hakre May 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! Yes, on PHP 8+ it should be never (the call throws, 3v4l.org). I kept null for now as a safe lower bound that doesn’t require a PhpVersion dependency in the extension. If you prefer the stricter version, I can inject PhpVersion and return NeverType for ≥8.0.

/E: found the pattern

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But given the fact the return type extension already use TypeCombinator::addNull somewhere else, I'm unsure...

... me either but I wanted to give it a try and put a fixup on top with php version id test data files (.php scripts) that mimic the same branching as for explode(). The fixup keeps all options open if you prefer something else – let me know.

@staabm
Copy link
Copy Markdown
Contributor

staabm commented May 11, 2026

Initially I thought using the new count helper, we could get rid of a regex path.

Since this is not the case, I think the PR as is does not provide much value.

We would already report a error for the invalid format.
It doesn't bring much value having a more precise return type IMO

hakre added 2 commits May 11, 2026 20:28
Use the recently fixed `PrintfHelper::getScanfPlaceholdersCount()` to
detect invalid format strings in the return type extension. If the
format is uncountable, the call is guaranteed to fail – return
`NullType` immediately.

This is the same approach that eliminated the count regression. It
already fixes all false‑positive type inferences for malformed
formats.

For example, invalid format (mixing positional %n$ with sequential %)
returns `null`.

Gegenprobe: the counter doesn’t guess – it asks PHP itself.
Return `NeverType` on PHP 8+ (where an invalid format throws
`ValueError`) and `NullType` on PHP < 8.0. This makes the gatekeeper
precise about the call never returning on modern PHP.

The test strategy for the version‑dependent types will be settled
in review – the CI now whispers `*NEVER*` where `null` stood before.
@hakre
Copy link
Copy Markdown
Contributor Author

hakre commented May 11, 2026

Thanks for the honest feedback! I’d like to clarify what this small PR actually fixes, because I think there might be a subtle but important distinction.

The existing rule already reports an error on an invalid format, yes. But the return type extension runs independently—and previously it would still infer array|null for that same invalid format. That false type can then leak into subsequent analysis (e.g., a later $result[0] or a foreach), producing confusing downstream errors that are only indirectly related to the real problem.

This PR adds a minimal gatekeeper that short‑circuits the type inference for any format the counter finds invalid. It returns NullType (or NeverType on PHP 8+), so no false array shape ever enters the type system. That’s exactly the same probe‑based approach that fixed the count regression you already merged.

So the value isn’t “a more precise return type” in isolation—it’s eliminating a class of false‑positive type information that otherwise pollutes later analysis steps. And it does so with a tiny, proven, self‑contained check that doesn’t touch the regex at all.

I’m happy to iterate or, if you’d rather, close this PR and save the gatekeeper for a later round. Just wanted to make sure the motivation is clear. 😊

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants