Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stripos with large haystack has bad performance #7847

Closed
Lerchensporn opened this issue Dec 28, 2021 · 5 comments
Closed

stripos with large haystack has bad performance #7847

Lerchensporn opened this issue Dec 28, 2021 · 5 comments

Comments

@Lerchensporn
Copy link

Description

stripos is slow because both input strings are always lowercased completely. This also entails unnecessary memory consumption.

preg_match with case-insensitive flag is a much faster alternative.

Source code:

haystack_dup = zend_string_tolower(haystack);

Test case:

<?php

$haystack = str_repeat('A', 1e+7) . 'BBB';

$repetitions = 0;
$start = microtime(true);
do {
	strpos($haystack, 'bbb');
	$repetitions += 1;
} while (microtime(true) - $start < 5);
$rate = round($repetitions / (microtime(true) - $start));
$rate = str_pad($rate, 8, ' ', STR_PAD_LEFT);
print("strpos:     $rate / second\n");

$repetitions = 0;
$start = microtime(true);
do {
	stripos($haystack, 'bbb');
	$repetitions += 1;
} while (microtime(true) - $start < 5);
$rate = round($repetitions / (microtime(true) - $start));
$rate = str_pad($rate, 8, ' ', STR_PAD_LEFT);
print("stripos:    $rate / second\n");

$repetitions = 0;
$start = microtime(true);
do {
	preg_match('/bbb/i', $haystack, $match, PREG_OFFSET_CAPTURE);
	$repetitions += 1;
} while (microtime(true) - $start < 5);
$rate = round($repetitions / (microtime(true) - $start));
$rate = str_pad($rate, 8, ' ', STR_PAD_LEFT);
print("preg_match: $rate / second\n");

Example output:

strpos:         1192 / second
stripos:         176 / second
preg_match:     1118 / second

PHP Version

PHP 8.0.14

Operating System

No response

@iluuu1994
Copy link
Member

Not sure if this qualifies as a bug but there's definitely room for improvement.

@cmb69
Copy link
Member

cmb69 commented Dec 28, 2021

I'd rather treat this as feature request; we probably don't want to replace the current implementation of stripos() in stable releases (given that it works this way since "forever"), and the recent changes to https://wiki.php.net/rfc/strtolower-ascii may have already improved the performance, and at least should make it easier to avoid using zend_string_tolower(). Furthermore, I somewhat doubt that the given test script is representative for real world applications, where case-insensitive searches for long ASCII strings appear to be rather rare (and especially the case where we only have to match the first character of needle to recognize that we do not have to compare the rest of needle, and can procede to the next character of haystack).

That said, consider the following drastically simplified but naive patch for master:

 ext/standard/string.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/ext/standard/string.c b/ext/standard/string.c
index 7c2dce4595..c350ebdd0a 100644
--- a/ext/standard/string.c
+++ b/ext/standard/string.c
@@ -1911,19 +1911,20 @@ PHP_FUNCTION(stripos)
 		RETURN_FALSE;
 	}
 
-	haystack_dup = zend_string_tolower(haystack);
-	needle_dup = zend_string_tolower(needle);
-	found = (char*)php_memnstr(ZSTR_VAL(haystack_dup) + offset,
-			ZSTR_VAL(needle_dup), ZSTR_LEN(needle_dup), ZSTR_VAL(haystack_dup) + ZSTR_LEN(haystack));
+	char *pn = ZSTR_VAL(needle);
+	char *eh = ZSTR_VAL(haystack) + ZSTR_LEN(haystack) - ZSTR_LEN(needle);
+	for (char *ph = ZSTR_VAL(haystack); ph <= eh; ++ph) {
+		if (*ph == *pn || *ph & ~0x40 == *pn & ~0x40) {
+			found = ph;
+			break;
+		}
+	}
 
 	if (found) {
-		RETVAL_LONG(found - ZSTR_VAL(haystack_dup));
+		RETVAL_LONG(found - ZSTR_VAL(haystack));
 	} else {
 		RETVAL_FALSE;
 	}
-
-	zend_string_release_ex(haystack_dup, 0);
-	zend_string_release_ex(needle_dup, 0);
 }
 /* }}} */

The given test script outputs here:

strpos:         1032 / second
stripos:         157 / second
preg_match:     1472 / second

This is even slightly slower than the unpatched variant, so the zend_string_tolower() and php_memnstr() approach doesn't appear that bad, especially given that preg_match() even rivals the performance of strpos().

iluuu1994 added a commit to iluuu1994/php-src that referenced this issue Dec 29, 2021
Closes phpGH-7847

Previously stripos would lowercase both the haystack and needle to reuse
strpos. The approach in this PR is similar to strpos. memchr is highly
optimized so we're using it to search for the first character of the
needle in the haystack. If we find it we compare the remaining
characters of the needle manually.

The new implementation seems to perform about half as well as strpos (as
two memchr calls are necessary).
iluuu1994 added a commit to iluuu1994/php-src that referenced this issue Dec 29, 2021
Closes phpGH-7847

Previously stripos would lowercase both the haystack and needle to reuse
strpos. The approach in this PR is similar to strpos. memchr is highly
optimized so we're using it to search for the first character of the
needle in the haystack. If we find it we compare the remaining
characters of the needle manually.

The new implementation seems to perform about half as well as strpos (as
two memchr calls are necessary).
iluuu1994 added a commit to iluuu1994/php-src that referenced this issue Dec 30, 2021
Closes phpGH-7847

Previously stripos/stristr would lowercase both the haystack and the
needle to reuse strpos. The approach in this PR is similar to strpos.
memchr is highly optimized so we're using it to search for the first
character of the needle in the haystack. If we find it we compare the
remaining characters of the needle manually.

The new implementation seems to perform about half as well as strpos (as
two memchr calls are necessary to find the next candidate).
@iluuu1994
Copy link
Member

function test($haystack, $needle) {
    $repetitions = 0;
    $start = microtime(true);
    do {
        stripos($haystack, $needle);
        $repetitions += 1;
    } while (microtime(true) - $start < 5);

    $rate = round($repetitions / (microtime(true) - $start));
    $rate = str_pad($rate, 8, ' ', STR_PAD_LEFT);

    echo "$rate / second\n";
}

// Worst case
test(str_repeat('A', 1e+7) . 'BB', 'bb');
test(str_repeat('aA', 1e+6) . 'aB', 'ab');
test(str_repeat('A', 1e+1) . 'BB', 'bb');
test(str_repeat('aA', 1e+1) . 'aB', 'ab');
// Best case
test(str_repeat('A', 1e+7), 'aa');
test(str_repeat('aA', 1e+6), 'aa');
test(str_repeat('A', 1e+1), 'aa');
test(str_repeat('aA', 1e+1), 'aa');
// Real life cases
test(file_get_contents(__DIR__ . '/php_net.html'), '<article'); // Content of https://www.php.net/
test('content-type: application/json', 'content-type:');
test('int', 'int');
Old New Difference
108 2227 +1962%
62 90 +45%
2332685 3636611 +56%
1625586 2639407 +62%
111 4559 +4007%
766 3675103 +479678%
2347434 3669210 +56%
2291298 3654664 +60%
3862 8003 +107%
2776782 3623213 +30%
3171980 3870687 +22%

The new algorithm seems to be faster in all cases. Even in the worst case (compared to the old implementation) it is faster, likely because there is no allocation happening. The larger the haystack the more dramatic the difference. There is some variance when executing the benchmark but the difference in the benchmark is way above the margin of error.

@kamil-tekiela
Copy link
Member

Is there any impact on the memory usage?

@iluuu1994
Copy link
Member

@kamil-tekiela It avoids copying of both the haystack and needle so two allocations less essentially. But I haven't measured it. It will likely be insignificant unless the string is very very large as the memory is freed before the function returns.

@iluuu1994 iluuu1994 changed the title stripos has very bad performance stripos with large haystack has bad performance Jan 3, 2022
iluuu1994 added a commit to iluuu1994/php-src that referenced this issue Jan 3, 2022
Closes phpGH-7847
Closes phpGH-7852

Previously stripos/stristr would lowercase both the haystack and the
needle to reuse strpos. The approach in this PR is similar to strpos.
memchr is highly optimized so we're using it to search for the first
character of the needle in the haystack. If we find it we compare the
remaining characters of the needle manually.

The new implementation seems to perform about half as well as strpos (as
two memchr calls are necessary to find the next candidate).
iluuu1994 added a commit to iluuu1994/php-src that referenced this issue Jan 26, 2022
Closes phpGH-7847
Closes phpGH-7852

Previously stripos/stristr would lowercase both the haystack and the
needle to reuse strpos. The approach in this PR is similar to strpos.
memchr is highly optimized so we're using it to search for the first
character of the needle in the haystack. If we find it we compare the
remaining characters of the needle manually.

The new implementation seems to perform about half as well as strpos (as
two memchr calls are necessary to find the next candidate).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants