Skip to content

nmoinvaz/strcasecmp

Repository files navigation

strcasecmp

Benchmark for strcasecmp to determine whether or not there is a better implementation.

Most implementations will convert each character to tolower without checking to see if the characters are first the same. If the characters are the same then no lowercase conversion is needed.

This code uses Google Benchmark to perform tests against several 256 character strings of varying similarity.

Popular implementations

glibc

int
__strcasecmp (const char *s1, const char *s2 LOCALE_PARAM)
{
    const unsigned char *p1 = (const unsigned char *) s1;
    const unsigned char *p2 = (const unsigned char *) s2;
    int result;

    if (p1 == p2)
        return 0;

    while ((result = TOLOWER (*p1) - TOLOWER (*p2++)) == 0)
        if (*p1++ == '\0')
        break;

    return result;
}

https://github.com/bminor/glibc/blob/b92a49359f33a461db080a33940d73f47c756126/string/strcasecmp.c

The check for p1 == p2 can add an additional benefit for user-mode applications that are compiled with string pooling enabled.

Apple

int
strcasecmp(s1, s2)
	const char *s1, *s2;
{
    const u_char
            *us1 = (const u_char *)s1,
            *us2 = (const u_char *)s2;

    while (tolower(*us1) == tolower(*us2++))
        if (*us1++ == '\0')
            return (0);
    return (tolower(*us1) - tolower(*--us2));
}

https://opensource.apple.com/source/Libc/Libc-583/string/FreeBSD/strcasecmp.c.auto.html

OpenSSL

int OPENSSL_strncasecmp(const char *s1, const char *s2, size_t n)
{
    int t;
    size_t i;

    for (i = 0; i < n; i++)
        if ((t = ossl_tolower(*s1) - ossl_tolower(*s2++)) != 0)
            return t;
        else if (*s1++ == '\0')
            return 0;
    return 0;
}

https://github.com/openssl/openssl/blob/79c8dcf3985a7b75eac8e53eb8652728af6c5d3d/crypto/o_str.c

cURL

int Curl_strcasecompare(const char *first, const char *second)
{
    while(*first && *second) {
        if(Curl_raw_toupper(*first) != Curl_raw_toupper(*second))
            /* get out of the loop as soon as they don't match */
            return 0;
        first++;
        second++;
    }
    /* If we're here either the strings are the same or the length is different.
       We can just test if the "current" character is non-zero for one and zero
       for the other. Note that the characters may not be exactly the same even
       if they match, we only want to compare zero-ness. */
    return !*first == !*second;
}

https://github.com/curl/curl/blob/master/lib/strcase.c

Linux Kernel

int strcasecmp(const char *s1, const char *s2)
{
    int c1, c2;

    do {
        c1 = tolower(*s1++);
        c2 = tolower(*s2++);
    } while (c1 == c2 && c1 != 0);
    return c1 - c2;
}

https://github.com/torvalds/linux/blob/master/lib/string.c

Interestingly enough the kernel's implementations for strcasecmp and strncasecmp will return different negative values for string1 = "", string2 = "A" because the quick return in strncasecmp when !c1 || !c2. In this case strcasecmp = -'a' and strncasecmp = '-A'. But perhaps it is left up to the implementation what negative value to return.

Our implementation

Partially derived from Linux kernel implementations for strcasecmp and strncasecmp.

int strcasecmp_new(const char *s1, const char *s2)
{
    unsigned char c1, c2;

    do {
        c1 = *s1++;
        c2 = *s2++;
        if (c1 != c2) {
            c1 = tolower(c1);
            c2 = tolower(c2);
            if (c1 != c2)
                return (int)c1 - (int)c2;
        }
    } while (c1 != 0);
    return 0;
}

Build

CMake instructions:

cmake -S . -B build -D CMAKE_BUILD_TYPE=Release
cmake --build build --config Release

Running

The application allows most Google Benchmark arguments including --benchmark_repetitions=# which can specify the number of repetitions to perform.

Results

2022-10-25T08:48:13-07:00
Running 20 repetitions
Run on (12 X 3202.02 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x6)
  L1 Instruction 32 KiB (x6)
  L2 Unified 256 KiB (x6)
  L3 Unified 12288 KiB (x1)

two strings 100% same case = ~48% improvement

strcasecmp_bench/100% same:0_mean              300 ns          164 ns           20
strcasecmp_bench/100% same:0_median            291 ns          229 ns           20
strcasecmp_bench/100% same:0_stddev           23.5 ns          120 ns           20
strcasecmp_bench/100% same:0_cv               7.84 %         73.54 %            20

strcasecmp_new_bench/100% same:0_mean          158 ns         89.8 ns           20
strcasecmp_new_bench/100% same:0_median        159 ns          102 ns           20
strcasecmp_new_bench/100% same:0_stddev       2.66 ns         53.5 ns           20
strcasecmp_new_bench/100% same:0_cv           1.69 %         59.58 %            20

two strings 83% same case = ~32% improvement

strcasecmp_bench/83% same:1_mean               304 ns          173 ns           20
strcasecmp_bench/83% same:1_median             304 ns          220 ns           20
strcasecmp_bench/83% same:1_stddev            6.58 ns          123 ns           20
strcasecmp_bench/83% same:1_cv                2.16 %         70.95 %            20

strcasecmp_new_bench/83% same:1_mean           204 ns          121 ns           20
strcasecmp_new_bench/83% same:1_median         203 ns          120 ns           20
strcasecmp_new_bench/83% same:1_stddev        3.48 ns         73.3 ns           20
strcasecmp_new_bench/83% same:1_cv            1.71 %         60.41 %            20

two strings 75% same case = ~36% improvement

strcasecmp_bench/75% same:2_mean               321 ns          183 ns           20
strcasecmp_bench/75% same:2_median             322 ns          237 ns           20
strcasecmp_bench/75% same:2_stddev            5.37 ns          129 ns           20
strcasecmp_bench/75% same:2_cv                1.67 %         70.76 %            20

strcasecmp_new_bench/75% same:2_mean           205 ns          127 ns           20
strcasecmp_new_bench/75% same:2_median         205 ns          173 ns           20
strcasecmp_new_bench/75% same:2_stddev        4.52 ns         81.0 ns           20
strcasecmp_new_bench/75% same:2_cv            2.21 %         63.68 %            20

two strings 66% same case = ~32% improvement

strcasecmp_bench/66% same:3_mean               351 ns          197 ns           20
strcasecmp_bench/66% same:3_median             348 ns          199 ns           20
strcasecmp_bench/66% same:3_stddev            22.8 ns         37.4 ns           20
strcasecmp_bench/66% same:3_cv                6.50 %         18.92 %            20

strcasecmp_new_bench/66% same:3_mean           236 ns          138 ns           20
strcasecmp_new_bench/66% same:3_median         232 ns          141 ns           20
strcasecmp_new_bench/66% same:3_stddev        10.9 ns         57.4 ns           20
strcasecmp_new_bench/66% same:3_cv            4.60 %         41.62 %            20

two strings 50% same case = ~32% improvement

strcasecmp_bench/50% same:4_mean               384 ns          217 ns           20
strcasecmp_bench/50% same:4_median             371 ns          220 ns           20
strcasecmp_bench/50% same:4_stddev            27.9 ns         21.2 ns           20
strcasecmp_bench/50% same:4_cv                7.26 %          9.77 %            20

strcasecmp_new_bench/50% same:4_mean           259 ns          147 ns           20
strcasecmp_new_bench/50% same:4_median         259 ns          175 ns           20
strcasecmp_new_bench/50% same:4_stddev        4.08 ns          102 ns           20
strcasecmp_new_bench/50% same:4_cv            1.57 %         69.35 %            20

two strings 33% same case = ~13% improvement

strcasecmp_bench/33% same:5_mean               303 ns          169 ns           20
strcasecmp_bench/33% same:5_median             299 ns          159 ns           20
strcasecmp_bench/33% same:5_stddev            15.7 ns         54.2 ns           20
strcasecmp_bench/33% same:5_cv                5.17 %         32.01 %            20

strcasecmp_new_bench/33% same:5_mean           263 ns          156 ns           20
strcasecmp_new_bench/33% same:5_median         264 ns          166 ns           20
strcasecmp_new_bench/33% same:5_stddev        14.7 ns         64.0 ns           20
strcasecmp_new_bench/33% same:5_cv            5.57 %         41.05 %            20

two strings 25% same case = ~5% improvement

strcasecmp_bench/25% same:6_mean               287 ns          163 ns           20
strcasecmp_bench/25% same:6_median             287 ns          161 ns           20
strcasecmp_bench/25% same:6_stddev            2.99 ns         59.9 ns           20
strcasecmp_bench/25% same:6_cv                1.04 %         36.73 %            20

strcasecmp_new_bench/25% same:6_mean           269 ns          159 ns           20
strcasecmp_new_bench/25% same:6_median         268 ns          158 ns           20
strcasecmp_new_bench/25% same:6_stddev        3.46 ns         8.69 ns           20
strcasecmp_new_bench/25% same:6_cv            1.29 %          5.47 %            20

two strings 15% same case = ~2% decrease

strcasecmp_bench/15% same:7_mean               292 ns          168 ns           20
strcasecmp_bench/15% same:7_median             291 ns          169 ns           20
strcasecmp_bench/15% same:7_stddev            2.76 ns         14.9 ns           20
strcasecmp_bench/15% same:7_cv                0.95 %          8.83 %            20

strcasecmp_new_bench/15% same:7_mean           299 ns          161 ns           20
strcasecmp_new_bench/15% same:7_median         300 ns          139 ns           20
strcasecmp_new_bench/15% same:7_stddev        6.81 ns          122 ns           20
strcasecmp_new_bench/15% same:7_cv            2.28 %         75.59 %            20

two strings 0% same case = ~11% decrease

strcasecmp_bench/0% same:8_mean                297 ns          155 ns           20
strcasecmp_bench/0% same:8_median              293 ns          193 ns           20
strcasecmp_bench/0% same:8_stddev             14.1 ns          118 ns           20
strcasecmp_bench/0% same:8_cv                 4.75 %         76.46 %            20

strcasecmp_new_bench/0% same:8_mean            332 ns          209 ns           20
strcasecmp_new_bench/0% same:8_median          331 ns          242 ns           20
strcasecmp_new_bench/0% same:8_stddev         4.96 ns          111 ns           20
strcasecmp_new_bench/0% same:8_cv             1.50 %         52.89 %            20

For strings that are over 25% the same case our implementation provides up to 48% performance benefits over other implementations. However if the two strings are less than 25% there can be a similar up to 11% performance decrease.

We surmise that perhaps CPU caching does not play a significant role in string case comparison, but only branches and perdictions. tolower has its own branch to determine if the character is uppercase or lowercase which results in a lookup to a hash map.

If we can reduce the total number of branches, by skipping an extra branch when characters match, we can speed up the function in corrolation to the similarity of the two strings.

Hash Maps

There is another way that involves reducing the number of branches and that is by using a different character hash map.

As stated previously, some implementations of tolower first check if the character is uppercase and if so convert it to lowercase.

This is because these tolower implementations rely on the _ctype hash map which is a 256 character map that can be used to lookup what each character's type is - digit, upper, lower, etc.

However, it is possible to use hash maps that are dedicated to the uppercase/lowercase conversion where the results of tolower(c) and toupper(c) are precalculated for each of the 256 characters.

Resources

Releases

No releases published

Packages

No packages published