Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Improve performance of string.IndexOfAny for 2 & 3 char searches #13219

Merged
merged 4 commits into from
Aug 8, 2017
Merged

Improve performance of string.IndexOfAny for 2 & 3 char searches #13219

merged 4 commits into from
Aug 8, 2017

Conversation

bbowyersmyth
Copy link

Improves the performance of string.IndexOfAny, special casing common 2 and 3 character searches.

2 char search
Beats a minimal hard coded loop after 3rd string character.
Small string (22chars) - 67% faster than existing
Large string (366 chars) - 56% faster than existing

3 char search
Beats a minimal hard coded loop after 8th string character.
Small string (22chars) - 49% faster than existing
Large string (366 chars) - 28% faster than existing

InitializeProbabilisticMap
Very common to search for ASCII symbols. The high map always writes the same value to the same location. Only doing that once gives a 13% improvement (isolated, 10 char array).

Issues
https://github.com/dotnet/corefx/issues/22771
aspnet/Mvc#5362
NuGet/NuGet.Client#1590 (comment)
https://github.com/dotnet/coreclr/issues/7017

cc @davkean @benaadams

@bbowyersmyth
Copy link
Author

Intention was to also improve ProbablyContains but there is an existing 80% difference between x86 and x64 (isolated test) that made it difficult. That also limited what could be done to InitializeProbabilisticMap.
Test code here if anyone wants to tackle that but taking 2 and 3 char searches out of this path helps.

@davkean
Copy link
Member

davkean commented Aug 4, 2017

tag @stephentoub @jkotas


while (count > 0)
{
if (*pCh == value1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This depends on JIT CSEing *pCh for best performance. The JIT will probably do it, but it is best to write hand-optimized code like this one to be as close as possible to the native code.

Why not:

 char c = *pCh; 
 if (c == value1 || c == value2)
     goto ReturnIndex;

like in the 3 argument overload?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got the same time between the two variations so I just went with the simpler one. 3 char was better with a variable. Will change.

if (anyOf == null)
{
throw new ArgumentNullException(nameof(anyOf));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may not be common enough to warrant it (and thus may hurt the more common cases of 2/3), but I wonder if it's worthwhile adding here:

if (anyOf.Length == 1)
{
    return IndexOf(anyOf[0], startIndex, count);
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stephentoub The only place I could think of that would be using a 1 char array was the Unix path handling code. That looked to be taken care of with the Path internal changes so I didn't bother with it.
But true to performance testing form the 2 char call is actually 10% better if it has another if before it. Not so if I change that to a switch though which is interesting. Will run a few more tests on the other paths.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope sorry it was the 3 char test that was 10% better. 2 char is 4% worse. Thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope sorry it was the 3 char test that was 10% better. 2 char is 4% worse. Thoughts?

These kind of odd variations are typically caused by things like code alignment or branch prediction that will vary from build to build. One has to run it under Intel profiler to tell what is going on. I typically just look at the JITed code whether it looks fine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it's worthwhile adding here

I think it is worthwhile to add it, and also Length == 0. Otherwise, the API has unexpected perf cliff.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😢 but don't optimize for them early? e.g.

if (anyOf.Length == 2)
{
    // Very common optimization for directory separators (/, \), quotes (", '), brackets, etc
    return IndexOfAny(anyOf[0], anyOf[1], startIndex, count);
}
else if (anyOf.Length == 3)
{
    return IndexOfAny(anyOf[0], anyOf[1], anyOf[2], startIndex, count);
}
else if (anyOf.Length > 3)
{
    return IndexOfCharArray(anyOf, startIndex, count);
}
else if (anyOf.Length == 1)
{
    return IndexOf(anyOf[0], startIndex, count);
}
else // anyOf.Length == 0
{
    return -1;
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems like a good compromise. Done.

{
char c = *pCh;

if (c == value1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: if (c == value1 || c == value2) ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also below

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@jkotas
Copy link
Member

jkotas commented Aug 8, 2017

@dotnet-bot test Tizen armel Cross Release Build

@jkotas
Copy link
Member

jkotas commented Aug 8, 2017

@dotnet-bot test Windows_NT x64 corefx_baseline

Copy link
Member

@jkotas jkotas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@jkotas jkotas merged commit 9a405f2 into dotnet:master Aug 8, 2017
@bbowyersmyth bbowyersmyth deleted the String_IndexOfAny branch August 26, 2017 02:43
@karelz karelz modified the milestone: 2.1.0 Aug 28, 2017
Copy link

@dnickless dnickless left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sory for being late. ;)

if (c == value1 || c == value2)
goto ReturnIndex1;

pCh += 2;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason why we only unroll 2x here? The other methods in this file, IndexOf and LastIndexOf both use 4x. My tests runs also show that switching to 4x cuts the time to 60-70% (actually, 8x is even faster on my machine but let's not go over the top...). Also, I would see value in keeping the two new explicit implementations for 2/3 elements char arrays as close as possible to the existing one for IndexOf() just with 2/3 if checks instead of just 1?!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the usages of IndexOfAny looked like they would be on short strings, path segments and the like. IndexOf use is much more varied and needs to have great performance on all string lengths.
It's a bit of a trade off as the more you unroll, the more time you spend outside that loop in slower code going a char at a time which hurts small string performance.
It's quite possible that with further analysis this function would be better off with 4x unroll but I locked in the win when it achieved its first two objectives. Which was to avoid the relative expensive probability map creation/comparison and to alleviate the need for the hard coded loops that have been popping up.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might want to look at that test code once you've got too much time: https://gist.github.com/dnickless/eb4807d698b7032b625a8d04794e90d8

if (c == value1 || c == value2 || c == value3)
goto ReturnIndex;

pCh++;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would 4x unrolling here not make sense, too? I didn't test this case but I would expect it to have a similar effect like with all other methods in this file.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 char is a lot less common. This was really about giving a lot better performance with minimal code.

@danmoseley
Copy link
Member

Marked port consider per associated request via MSBuild profiles.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
9 participants