Performance: array_reduce is drastically slower (~1000x) than implementing the same logic with foreach #8283
Comments
I don't think big O notation applies here. FWIW, your examples are not comparable, because the one with foreach doesn't use function calls or array reassignment. If you create the same example, the performance is identical. See these two examples:
So, I am not sure what exactly you consider to be a bug. In terms of performance |
Hi @Firehed! The problem is that you're causing array separation (arrays are copy on write, unless their reference count is 1) when doing
$result = [];
foreach ($data as $item) {
    $dummy = $result;
    @$result[$item]++;
}
That code is pretty much equivalent in performance as the |
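To make the separation effect easier to see in isolation, here is a minimal sketch. The input data and the function names `countInPlace` and `countWithSeparation` are made up for illustration and are not from the original report. Both functions produce the same counts; the second one keeps a second handle to the array alive, so every write has to copy it first. Actual timings depend on the PHP version and settings, so none are quoted here.

```php
<?php
// Hypothetical input: 10,000 integers drawn from 2,000 distinct values.
$data = [];
for ($i = 0; $i < 10000; $i++) {
    $data[] = $i % 2000;
}

/** Counts occurrences in place; $result is the only handle to the array. */
function countInPlace(array $data): array
{
    $result = [];
    foreach ($data as $item) {
        $result[$item] = ($result[$item] ?? 0) + 1;
    }
    return $result;
}

/** Same logic, but a second handle makes each write copy the array first. */
function countWithSeparation(array $data): array
{
    $result = [];
    foreach ($data as $item) {
        $dummy = $result;                           // reference count is now 2
        $result[$item] = ($result[$item] ?? 0) + 1; // write separates (copies) the array
    }
    return $result;
}

$start = hrtime(true);
$fast = countInPlace($data);
$fastMs = (hrtime(true) - $start) / 1e6;

$start = hrtime(true);
$slow = countWithSeparation($data);
$slowMs = (hrtime(true) - $start) / 1e6;

printf("in place: %.2f ms, with separation: %.2f ms\n", $fastMs, $slowMs);
```

The copy cost grows with the size of the array being built, which is why the effect is most visible when reducing a large input into a large result.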
Noted @kamil-tekiela - I was about to update the issue regarding the function call and saw your reply indicating the same. "Bug" may not be the best description here, but it was the closest issue template for me to work from.
As far as end-user semantics, I don't think
Ideally, there could be some sort of performance optimization within the array_reduce source (perhaps finding a way to inline the function and avoid the calling overhead for every item?). But I suppose if that's not possible, a warning in the documentation noting that there can be serious performance implications to using array_reduce on larger (1000+ item) arrays, and recommending inlining the logic for best results, would do. |
I don't think that is necessary. Also, as @iluuu1994 pointed out, the issue is present even without functions. The function overhead is rather small in this example. Most of the time is spent recreating the array every time the function is called and it's not an issue of |
Yes, let's disregard the function call thing as that turned out to be incorrect as @iluuu1994 pointed out. I think the end result on performance expectations that users would have (and I did) is the same. Reducing a large array into a smaller one (re-indexing, etc) is a very common use-case for |
It might be possible to do some ref-count hackery but I'm not sure if that's frowned upon. I'll test tomorrow. But most likely there's nothing we can do here. |
Given that the culprit was identified as the array copy-on-write operation being triggered, a common solution in these kinds of cases is to make sure we deal with a mutable object reference instead of a plain array. You will have good performance if you change the
$result = array_reduce($data, function ($carry, $item) {
    @$carry[$item]++;
    return $carry;
}, new ArrayObject())
    ->getArrayCopy(); |
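For anyone who wants to try the ArrayObject approach, here is a self-contained sketch of the same idea. The helper name `countValuesViaReduce` is hypothetical, and the sketch uses `??` instead of the `@` error suppression above, which is a stylistic substitution rather than part of the original suggestion. The carry is an object handle, so handing it to the callback and returning it does not copy the underlying data.

```php
<?php
/**
 * Counts occurrences of each value using array_reduce with an ArrayObject
 * as the carry. countValuesViaReduce is a hypothetical name for this sketch.
 */
function countValuesViaReduce(array $data): array
{
    $carry = array_reduce(
        $data,
        function (ArrayObject $carry, $item): ArrayObject {
            // Read the current count (0 if missing) and write it back.
            $carry[$item] = ($carry[$item] ?? 0) + 1;
            return $carry;
        },
        new ArrayObject()
    );

    // Convert back to a plain array once, at the end.
    return $carry->getArrayCopy();
}

$counts = countValuesViaReduce(['a', 'b', 'a', 'c', 'a']);
print_r($counts);
```

For this particular counting task the built-in array_count_values() gives the same result; the sketch is only meant to show the carry pattern, which also applies to reductions that have no built-in equivalent.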
I took a quick look and the |
@iluuu1994 I had a similar problem, so thanks for explaining what causes it :) Since you can't change ZEND_COPY, maybe you could use the proposal from @drealecs as an internal optimization for the case where $initial is an empty array []? |
It slows down initialization and file upload by around a factor of 300. See: php/php-src#8283
Description
The following code:
Resulted in this output:
But I expected this output instead:
Possibly impactful opcache ini settings:
Note that opcache_get_status() returns false when running this script via CLI, which is where I encountered this.
General observation: this doesn't appear to be something that gets incrementally slower the further along the array the algorithm is. By adding an echo '.' in either section, it's apparent that even the very first item (with an empty $carry) is drastically slower in array_reduce than in the body of foreach.
I know they're not 100% comparable, but both should have O(n) complexity so I would not expect three orders of magnitude performance difference to produce the same result.
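The snippet from the report is not reproduced in this text. As a rough sketch of the kind of comparison being described (not the original code), the following builds the same value counts once with array_reduce and once with foreach. The input data and variable names are assumptions made up for this example, and no timings are quoted because they vary with PHP version and configuration.

```php
<?php
// Hypothetical input; not the data from the original report.
$data = [];
for ($i = 0; $i < 10000; $i++) {
    $data[] = $i % 2000;
}

// Variant 1: array_reduce with a plain array as the carry.
$start = hrtime(true);
$viaReduce = array_reduce($data, function (array $carry, int $item): array {
    $carry[$item] = ($carry[$item] ?? 0) + 1;
    return $carry;
}, []);
$reduceMs = (hrtime(true) - $start) / 1e6;

// Variant 2: the same logic inlined in a foreach loop.
$start = hrtime(true);
$viaForeach = [];
foreach ($data as $item) {
    $viaForeach[$item] = ($viaForeach[$item] ?? 0) + 1;
}
$foreachMs = (hrtime(true) - $start) / 1e6;

printf("array_reduce: %.2f ms, foreach: %.2f ms\n", $reduceMs, $foreachMs);
```

Both variants do one pass over the input, so any large gap between the two numbers comes from per-iteration cost rather than from a difference in the number of iterations.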
PHP Version
8.1.3
Operating System
macOS 12.3