Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Array shape mangled when union with empty array is created? #10834

Closed
RobertMe opened this issue Apr 4, 2024 · 7 comments · Fixed by phpstan/phpstan-src#3003
Closed

Array shape mangled when union with empty array is created? #10834

RobertMe opened this issue Apr 4, 2024 · 7 comments · Fixed by phpstan/phpstan-src#3003

Comments

@RobertMe
Copy link

RobertMe commented Apr 4, 2024

Bug report

I have a method which returns a very big array shape (array{....}). But when I use this shape in a ternary / null coalesce operation with a "fallback" to [] the type get's manged. In this case the shape is lost and it's converted into a generic array where the key type some of the keys but as a union, and after a certain amount (seems to be 64 and above) the keys are just "lost" (not added to the union), and the same happens to the value types, these are combined into a union as well, and new / unique types after the first 64 seem to be lost.

So see the linked example. Line 12 correctly dumps the array shape (array{1: string, ..., 64: float}|null), while when combining it with a ?? [] (on line 13) it's converted into array<1|2|3|...|63, bool|int|string> (notice that float is only used as type for key 64, and is not part of the converted generic either). And the same happens when using a normal ternary (truty ? shape() : []).

Also not that dropping a single array key isn't enough (while it would limit the size to 63, which is the number of items in the array key union), and at least 2 keys need to be dropped.

While the code is (very) badly designed I would still like to report this and see if this is something which maybe can be fixed (but it might as well be an artificial limit as these big shapes might hurt performance?)

Code snippet that reproduces the problem

https://phpstan.org/r/8b1640c3-2c07-4c7a-85fc-c143048ef03d

Expected output

Line 13 should log:

Dumped type: array{}|array{1: string, 2: int, 3: bool, 4: string, 5: int, 6: bool, 7: string, 8: int, 9: bool, 10: string, 11: int, 12: bool, 13: string, 14: int, 15: bool, 16: string, 17: int, 18: bool, 19: string, 20: int, 21: bool, 22: string, 23: int, 24: bool, 25: string, 26: int, 27: bool, 28: string, 29: int, 30: bool, 31: string, 32: int, 33: bool, 34: string, 35: int, 36: bool, 37: string, 38: int, 39: bool, 40: string, 41: int, 42: bool, 43: string, 44: int, 45: bool, 46: string, 47: int, 48: bool, 49: string, 50: int, 51: bool, 52: string, 53: int, 54: bool, 55: string, 56: int, 57: bool, 58: string, 59: int, 60: bool, 61: string, 62: int, 63: bool, 64: float}

(which is the same as line 12, except |null now being array{})

And line 15 should be array{}|array{1: ...}|null

Did PHPStan help you today? Did it make you happy in any way?

PHPStan is helping me a lot with refactoring and adding type hints to a very old code base. Generate baseline, refactor, run again, and if there are no (new) errors I'm pretty sure I didn't break anything 😄.

@RobertMe
Copy link
Author

RobertMe commented Apr 4, 2024

Hmm, right. The title is even more correct than I tested. If I change the [] to false (for example) the shape is correctly retained. So it does seem to be specific to arrays (using some array generic, like [1.0] does add the (literal) value (1.0) to the value type union, and using a shape like ['foo' => 1.0] does also add the key(s) to the keys union).

@ondrejmirtes
Copy link
Member

The array gets simplified if it's too big, for performance reasons. So in these situations the precision of being an array shape might get lost.

@RobertMe
Copy link
Author

RobertMe commented Apr 4, 2024

That's what I expected already.

Thinking out loud, might it be an idea to add special handling for an empty array (so just [])? As in that case there is nothing to merge really. It's either the actual array (shape) or an array{}/array<>. This as the working union of an array shape with an empty array is also just array{...}|array{}.

@ondrejmirtes
Copy link
Member

Yeah, of course it can be improved. The logic happens in TypeCombinator::processArrayTypes and optimizeConstantArrays so it gets lost somewhere in there.

@RobertMe
Copy link
Author

RobertMe commented Apr 4, 2024

Yeah, of course it can be improved. The logic happens in TypeCombinator::processArrayTypes and optimizeConstantArrays so it gets lost somewhere in there.

👍🏻 , already found those based on searching in ArrayType / ConstantArrayType for something *merge*() (which is ConstantArrayType::mergeWith) and searching for usages. So will see if I can come up with some optimization for that in case of ->isIterableAtLeastOnce()->no() (which already is somewhere in that code, otherwise I didn't know about isIterableAtLeastOnce() :P)

And to add to the "it's 'just' mangled". This obviously leads to errors later on where undefined array keys are being reported on this mangled type. While, just tested, it works fine when for example using false or null as fallback value (so apparently the "unreadable" legacy code path still leads to some conclusion where the variable definitely is the array shape and not the empty array).

@ondrejmirtes
Copy link
Member

Fixed phpstan/phpstan-src#3003

Copy link

github-actions bot commented May 8, 2024

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 8, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants