
Performance: array_reduce is drastically slower (~1000x) than implementing the same logic with foreach #8283

Closed
Firehed opened this issue Mar 30, 2022 · 9 comments


@Firehed

Firehed commented Mar 30, 2022

Description

The following code:

<?php

declare(strict_types=1);

// generate a large sample of pseudo-random data which may have duplicates
$data = [];
for ($i = 0; $i < 40000; $i++) {
    $data[] = crc32((string)$i);
}


echo 'Foreach';

$start = hrtime(true);
$result = [];
foreach ($data as $item) {
    @$result[$item]++; // @ silences the "Undefined array key" warning the first time a value is seen
}
$durationNs = hrtime(true) - $start;
echo sprintf('%f s', $durationNs / 1_000_000_000);
echo "\n\n";

unset($start, $result, $item, $durationNs);
echo 'Array reduce';

$start = hrtime(true);
$result = array_reduce($data, function ($carry, $item) {
    @$carry[$item]++;
    return $carry;
}, []);
$durationNs = hrtime(true) - $start;
echo sprintf('%f s', $durationNs / 1_000_000_000);

Resulted in this output:

Foreach 0.007446s

Array reduce 8.685649s

But I expected this output instead:

(comparable runtimes)

Possibly impactful opcache ini settings:

 $ php -i | rg opcache
Additional .ini files parsed => /usr/local/etc/php/8.1/conf.d/ext-opcache.ini
    with Zend OPcache v8.1.3, Copyright (c), by Zend Technologies
Zend OPcache
opcache.blacklist_filename => no value => no value
opcache.consistency_checks => 0 => 0
opcache.dups_fix => Off => Off
opcache.enable => On => On
opcache.enable_cli => Off => Off
opcache.enable_file_override => Off => Off
opcache.error_log => no value => no value
opcache.file_cache => no value => no value
opcache.file_cache_consistency_checks => On => On
opcache.file_cache_only => Off => Off
opcache.file_update_protection => 2 => 2
opcache.force_restart_timeout => 180 => 180
opcache.huge_code_pages => Off => Off
opcache.interned_strings_buffer => 8 => 8
opcache.jit => tracing => tracing
opcache.jit_bisect_limit => 0 => 0
opcache.jit_blacklist_root_trace => 16 => 16
opcache.jit_blacklist_side_trace => 8 => 8
opcache.jit_buffer_size => 0 => 0
opcache.jit_debug => 0 => 0
opcache.jit_hot_func => 127 => 127
opcache.jit_hot_loop => 64 => 64
opcache.jit_hot_return => 8 => 8
opcache.jit_hot_side_exit => 8 => 8
opcache.jit_max_exit_counters => 8192 => 8192
opcache.jit_max_loop_unrolls => 8 => 8
opcache.jit_max_polymorphic_calls => 2 => 2
opcache.jit_max_recursive_calls => 2 => 2
opcache.jit_max_recursive_returns => 2 => 2
opcache.jit_max_root_traces => 1024 => 1024
opcache.jit_max_side_traces => 128 => 128
opcache.jit_prof_threshold => 0.005 => 0.005
opcache.lockfile_path => /tmp => /tmp
opcache.log_verbosity_level => 1 => 1
opcache.max_accelerated_files => 10000 => 10000
opcache.max_file_size => 0 => 0
opcache.max_wasted_percentage => 5 => 5
opcache.memory_consumption => 128 => 128
opcache.opt_debug_level => 0 => 0
opcache.optimization_level => 0x7FFEBFFF => 0x7FFEBFFF
opcache.preferred_memory_model => no value => no value
opcache.preload => no value => no value
opcache.preload_user => no value => no value
opcache.protect_memory => Off => Off
opcache.record_warnings => Off => Off
opcache.restrict_api => no value => no value
opcache.revalidate_freq => 2 => 2
opcache.revalidate_path => Off => Off
opcache.save_comments => On => On
opcache.use_cwd => On => On
opcache.validate_permission => Off => Off
opcache.validate_root => Off => Off
opcache.validate_timestamps => On => On

Note that opcache_get_status() returns false when running this script via CLI, which is where I encountered this.

General observation: this doesn't appear to be something that gets incrementally slower the further along the array the algorithm gets. By adding an echo '.' in either section, it's apparent that even the very first item (with an empty $carry) is processed drastically more slowly in array_reduce than in the body of the foreach.

I know they're not 100% comparable, but both should have O(n) complexity, so I would not expect a three-orders-of-magnitude performance difference between two approaches that produce the same result.

PHP Version

8.1.3

Operating System

macOS 12.3

@kamil-tekiela
Member

I don't think big O notation applies here.

FWIW, your examples are not comparable, because the one with foreach doesn't use function calls or array reassignment. If you make the two equivalent, the performance is identical.

See these two examples:

<?php

declare(strict_types=1);

// generate a large sample of pseudo-random data which may have duplicates
$data = [];
for ($i = 0; $i < 40000; $i++) {
    $data[] = crc32((string) $i);
}

echo 'Foreach';

// Same logic as the array_reduce callback: a function that takes the carry
// array by value and returns the reassigned result.
function reduce($carry, $item)
{
    @$carry[$item]++;
    return $carry;
}

$start = hrtime(true);
$result = [];
foreach ($data as $item) {
    $result = reduce($result, $item);
}
$durationNs = hrtime(true) - $start;
echo sprintf('%f s', $durationNs / 1_000_000_000);
echo "\n\n";

unset($start, $result, $item, $durationNs);
echo 'Array reduce';

$start = hrtime(true);
$result = array_reduce($data, function ($carry, $item) {
    @$carry[$item]++;
    return $carry;
}, []);
$durationNs = hrtime(true) - $start;
echo sprintf('%f s', $durationNs / 1_000_000_000);

array_reduce isn't the culprit. It's the function calls and constant reassignment of arrays; the latter being the biggest offender.

So, I am not sure what exactly you consider to be a bug. In terms of performance, array_reduce is almost identical to a naive foreach loop if implemented in exactly the same way. If you compare apples to oranges, then obviously you will notice a difference.

@iluuu1994
Member

Hi @Firehed!

The problem is that you're causing array separation (arrays are copy on write, unless their reference count is 1) when doing @$carry[$item]++;. The array is being copied over and over again for every iteration of array_reduce. The same would happen for code like this:

$result = [];
foreach ($data as $item) {
    $dummy = $result;    // a second reference to $result, so its refcount is now 2
    @$result[$item]++;   // copy-on-write separation: the whole array is copied here, on every iteration
}

That code is pretty much equivalent in performance to the array_reduce example. So I'm closing this as not a bug.
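
A minimal sketch (not from the thread itself) that isolates the copy-on-write cost described above. Because the carry array is copied in full on every callback invocation, the array_reduce version is effectively O(n²) in the input size, while the plain foreach stays O(n). The array size of 100,000 is arbitrary:

<?php

// Only one reference to the array: the write happens in place.
$a = range(1, 100_000);
$start = hrtime(true);
$a[0] = 42;
printf("refcount 1, in-place write:    %d ns\n", hrtime(true) - $start);

// A second reference forces "separation": the next write copies the whole array first.
$b = $a;
$start = hrtime(true);
$a[0] = 43;
printf("refcount 2, copy before write: %d ns\n", hrtime(true) - $start);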

@Firehed
Author

Firehed commented Mar 30, 2022

Noted @kamil-tekiela - I was about to update the issue regarding the function call and saw your reply indicating the same.

"Bug" may not be the best description here, but it was the closest issue template for me to work from. As far as end-user semantics, I don't think function calls CoW array behavior being involved change expectations around performance here, although it clearly explains the source of them.

Ideally, there could be some sort of performance optimization within the array_reduce implementation (perhaps finding a way to inline the function and avoid the calling overhead for every item?). But I suppose if that's not possible, a warning in the documentation, noting that there can be serious performance implications to using array_reduce on larger (1000+ item) arrays and that inlining the logic gives the best results, would do.
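
A side workaround not suggested in the thread (editor's sketch, assuming the goal is simply to keep a callback style for the counting repro above): capture the result array by reference in a closure so only one array ever exists and no carry copies occur. The remaining overhead is the per-item function call; $data is the array from the original script.

$result = [];
array_walk($data, function ($item) use (&$result) {
    @$result[$item]++;   // writes into the single, by-reference $result; no per-iteration array copy
});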

@kamil-tekiela
Member

> I suppose if that's not possible, a warning in the documentation, noting that there can be serious performance implications to using array_reduce on larger (1000+ item) arrays and that inlining the logic gives the best results, would do.

I don't think that is necessary. Also, as @iluuu1994 pointed out, the issue is present even without functions. The function call overhead is rather small in this example; most of the time is spent recreating the array every time the function is called, and that is not an issue with array_reduce itself. Doing the same in a loop takes just as long.

@Firehed
Author

Firehed commented Mar 30, 2022

Yes, let's disregard the function call point, since that turned out to be incorrect, as @iluuu1994 pointed out. I think the end result for the performance expectations users would have (and I did) is the same: reducing a large array into a smaller one (re-indexing, etc.) is a very common use case for array_reduce, and having an array as $carry cause a 1000x performance loss compared to foreach is extremely surprising.
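
As an aside not raised in the thread: for this specific counting repro, a dedicated built-in sidesteps the question entirely (array_count_values accepts the int values produced by crc32 in the original script's $data):

$result = array_count_values($data);   // counts occurrences of each value in one call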

@iluuu1994
Member

It might be possible to do some ref-count hackery but I'm not sure if that's frowned upon. I'll test tomorrow. But most likely there's nothing we can do here.

@iluuu1994 iluuu1994 reopened this Mar 30, 2022
@iluuu1994 iluuu1994 self-assigned this Mar 30, 2022
@drealecs

Given that the culprit was identified as triggering the array copy-on-write operation, a common solution in this kind of case is to make sure we deal with a mutable object reference instead of a plain array.

You will have good performance if you change the $initial parameter from [] to new ArrayObject():

$result = array_reduce($data, function ($carry, $item) {
    @$carry[$item]++;
    return $carry;
}, new ArrayObject())
    ->getArrayCopy();
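
This is fast because an ArrayObject, like any object, is passed by handle: every callback invocation receives the same instance and mutates shared storage, so no per-iteration array separation occurs. A quick illustrative check (editor's sketch, not from the thread) shows the carry is one and the same object throughout:

$seen = [];
array_reduce($data, function ($carry, $item) use (&$seen) {
    $seen[spl_object_id($carry)] = true;   // record the identity of the carry object
    return $carry;
}, new ArrayObject());
var_dump(count($seen));                    // int(1): the same ArrayObject on every iteration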

@iluuu1994
Copy link
Member

I took a quick look and the ZEND_COPY which includes refcounting is in a central place, so it can't really be changed.

@erjotek

erjotek commented Jan 24, 2023

@iluuu1994 I had a similar problem, so thanks for explaining what causes it :)

Since you can't change ZEND_COPY, maybe you could use the proposition of @drealecs as an internal optimization for the case where $initial is an empty array []?

paflov added a commit to iqb-berlin/testcenter that referenced this issue Sep 11, 2023
It slows down initialization and file upload by around a factor of 300.

See:
php/php-src#8283