Performance of reimplemented functions? #167

Closed
ghost opened this issue Mar 25, 2021 · 3 comments

Comments

ghost commented Mar 25, 2021

I understand that a completely reimplemented function in userland code will be slower than the compiled C counterpart, yet I thought that for many functions PSL simply implemented a wrapper for native functions, thus having little overhead.

I was a little surprised, when replacing some instances of \array_unique, to suddenly see a measurable drop in performance (these are rather large arrays, but that's what I get).

Here I can replicate it with this code:

$arr = [];
$iterations   = 10;
$array_size   = 1000;
$random_range = 5000;

for ($i = 0; $i < $array_size; $i++) {
    $arr[] = Psl\SecureRandom\int(0, $random_range);
}

$total = 0;
$ta0 = microtime(true);
for ($e = 0; $e < $iterations; $e++) {
    $total += \count(\array_unique($arr));
}
$ta1 = microtime(true);

echo "Native\n";
echo $total / $iterations, "\n";
echo 'Time: ' . round($ta1 - $ta0, 4), "\n\n";

$total = 0;
$ta0   = microtime(true);
for ($e = 0; $e < $iterations; $e++) {
    $total += \Psl\Iter\count(\Psl\Dict\unique($arr));
}
$ta1 = microtime(true);

echo "Psl\n";
echo $total / $iterations, "\n";
echo 'Time: ' . round($ta1 - $ta0, 4);

The result in this case is 0.0009 seconds for the native implementation, against 0.0841 seconds using Psl. True, for this case 0.08 seconds is still something I can live with, but it's still a very marked cost increase. For my particular use case it can even become problematic.
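For anyone repeating this measurement, here is a minimal sketch of a timing helper, assuming PHP >= 7.4. The name `bench` is hypothetical and not part of Psl. It uses `hrtime(true)` (a monotonic clock) instead of `microtime(true)` and runs one warm-up call so that one-time costs such as autoloading are not counted. `random_int` stands in for `Psl\SecureRandom\int` only to keep the snippet free of dependencies.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of Psl): average wall-clock seconds per call.
 */
function bench(callable $fn, int $iterations): float
{
    // Warm-up call so one-time costs (autoloading, first compilation) are excluded.
    $fn();

    $start = hrtime(true); // monotonic clock, in nanoseconds
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    $elapsed = hrtime(true) - $start;

    return $elapsed / $iterations / 1e9; // nanoseconds -> seconds, averaged
}

// Same shape of input as the benchmark above: 1000 random integers in [0, 5000].
$arr = [];
for ($i = 0; $i < 1000; $i++) {
    $arr[] = random_int(0, 5000);
}

$native = bench(static fn () => count(array_unique($arr)), 10);
printf("Native: %.6f s per call\n", $native);
```

The Psl variant can be timed the same way by passing a closure that calls `\Psl\Iter\count(\Psl\Dict\unique($arr))`, provided the library is installed and autoloaded.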

Is this what's expected, or am I using the library incorrectly? Mind, this is in no way a complaint. I love the library even if this is a tradeoff to be aware of, but I just wanted to be sure this was not me "holding it wrong".

Owner

azjezz commented Mar 25, 2021

This is expected. If you are using PHP >= 7.4, I recommend including vendor/azjezz/psl/src/preload.php in your preload file, or using it as your preload file if you don't have anything else to preload. That should give better performance, as functions will be cached after the first run (assuming you are using php-fpm or FastCGI). If it's a CLI script, I cannot recommend anything that would improve performance here.
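A minimal sketch of what such a preload file could look like, assuming it sits in the project root next to `vendor/` and that `opcache.preload` in php.ini points at it (both are assumptions about the setup, not something stated in this thread):

```php
<?php
// preload.php: hypothetical project-level preload script.
// Assumed php.ini setting (PHP >= 7.4, OPcache enabled):
//   opcache.preload=/path/to/project/preload.php
declare(strict_types=1);

// Path taken from the comment above; the project-root location is an assumption.
$pslPreload = __DIR__ . '/vendor/azjezz/psl/src/preload.php';

// Guard so a missing vendor directory does not abort server startup.
if (is_file($pslPreload)) {
    require_once $pslPreload;
}
```

Preloading happens once when the server starts, so php-fpm has to be restarted after updating the library.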

Owner

azjezz commented Mar 25, 2021

Hm, after trying https://phpsandbox.io/n/wispy-leaf-75nn-pnldg, I noticed that the unique function is causing the performance issue here, not count (which has almost the same performance).

I think there's a way to fix this.

in https://github.com/azjezz/psl/blob/1.6.x/src/Psl/Dict/unique.php#L17, we should add

if (is_array($iterable)) {
  return array_unique($iterable);
}

can you please send a PR and verify that performance has been improved in this case specifically?

P.S: any pull request to improve performance is welcome, even if it adds complexity.
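To illustrate the idea, here is a self-contained sketch of the fast path. `unique_fast` is a hypothetical stand-in, not the actual body of `Psl\Dict\unique`. It delegates to the native `array_unique` when given an array and falls back to a userland loop for other iterables. The fallback here uses strict comparison, which is an assumption and may differ from what the library does.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch (not Psl's implementation): keep the first occurrence
 * of each value, preserving its key.
 */
function unique_fast(iterable $iterable): array
{
    // Fast path: arrays go straight to the native C implementation.
    // Note that array_unique compares values loosely (as strings by default).
    if (is_array($iterable)) {
        return array_unique($iterable);
    }

    // Slow path for generators and other Traversables.
    // Assumption: strict comparison, and keys that are valid array keys.
    $result = [];
    foreach ($iterable as $key => $value) {
        if (!in_array($value, $result, true)) {
            $result[$key] = $value;
        }
    }

    return $result;
}

var_dump(unique_fast([1, 2, 2, 3, 1]) === [0 => 1, 1 => 2, 3 => 3]); // bool(true)
```

The slow path is quadratic in the number of distinct values, so it only illustrates the control flow and is not tuned.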

Author

ghost commented Mar 25, 2021

Just tried the fix and now the performance is perfectly comparable.

PR incoming.

Thanks!

This issue was closed.