Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Conversation

@stephentoub
Copy link
Member

@stephentoub stephentoub commented Apr 25, 2016

Today the String.Concat(params string[]) implementation makes a defensive copy of the input string[] in order to avoid issues where another thread mutates the array concurrently with the concatenation. Such a situation is possible but exceedingly rare, so we can redo the implementation to make it cheaper when there isn't concurrent mutation and more expensive when there is, rather than always having it be more expensive.

This commit changes string.Concat to optimistically assume there won't be such concurrent mutation. It avoids allocating the defensive string[] array copy, and instead just proceeds to allocate a string of the right length and copy the input strings into it. If along the way it discovers that something has changed such that the string lengths no longer add up to exactly what was precomputed, then it falls back to the original implementation.

Example microbenchmark:

using System;
using System.Diagnostics;

public class Program
{
    public static void Main()
    {
        string result;
        string[] inputs = new[] { "abcd", "efgh", "ijkl", "mnop", "qrst", "uvwx", "yz" };
        var sw = new Stopwatch();
        while (true)
        {
            int gen0 = GC.CollectionCount(0);
            sw.Restart();
            for (int i = 0; i < 20000000; i++) result = string.Concat(inputs);
            sw.Stop();
            Console.WriteLine($"{GC.CollectionCount(0) - gen0}: {sw.Elapsed.TotalSeconds}");
        }
    }
}

Before the change:

> corerun test.exe
1525: 2.0733829
1526: 2.0599474
1526: 2.0717786
1526: 2.0318797
^C

After the change:

> corerun test.exe
763: 1.4700695
762: 1.446919
763: 1.4581139
763: 1.4568889
^C

cc: @jkotas, @ellismg

Today the String.Concat(params string[]) implementation makes a defensive copy of the input string[] in order to avoid issues where another thread mutates the array concurrently with the concatenation.  Such a situation is possible but exceedingly rare, so we can redo the implementation to make it cheaper when there isn't concurrent mutation and more expensive when there is, rather than always having it be more expensive.

This commit changes string.Concat to optimistically assume there won't be such concurrent mutation.  It avoids allocating the defensive string[] array copy, and instead just proceeds to allocate a string of the right length and copy the input strings into it.  If along the way it discovers that something has changed such that the string lengths no longer add up to exactly what was precomputed, then it falls back to the original implementation.

Example microbenchmark:
```C#
using System;
using System.Diagnostics;

public class Program
{
    public static void Main()
    {
        string result;
        string[] inputs = new[] { "abcd", "efgh", "ijkl", "mnop", "qrst", "uvwx", "yz" };
        var sw = new Stopwatch();
        while (true)
        {
            int gen0 = GC.CollectionCount(0);
            sw.Restart();
            for (int i = 0; i < 20000000; i++) result = string.Concat(inputs);
            sw.Stop();
            Console.WriteLine($"{GC.CollectionCount(0) - gen0}: {sw.Elapsed.TotalSeconds}");
        }
    }
}
```
Before the change:
```
> corerun test.exe
1525: 2.0733829
1526: 2.0599474
1526: 2.0717786
1526: 2.0318797
^C
```
After the change:
```
763: 1.4700695
762: 1.446919
763: 1.4581139
763: 1.4568889
^C
```
// fall back should be extremely rare.
return copiedLength == totalLength ?
result :
ConcatWithDefensiveCopy(values);
Copy link
Member

@jkotas jkotas Apr 25, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not need to duplicate the implementation for defensive copy - it should be fine sufficient to just call the method recursively as Concat((string[])values.Clone()).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you mean Concat:)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, will change.

@jkotas
Copy link
Member

jkotas commented Apr 25, 2016

Do we have tests in corefx to exercise the rare path?

@stephentoub
Copy link
Member Author

stephentoub commented Apr 25, 2016

Do we have tests in corefx to exercise the rare path?

Not currently checked in, as the special case didn't previously exist. But I have a little mini-stress outerloop test locally that typically ends up hitting both of the special paths (too long in the middle, incorrect length at the end), and I'll put up a PR for it tomorrow once this change is checked in and consumable in the packages. I don't want to put it up yet as it also includes a few more tests that don't currently pass, e.g. checking for the 0-length optimization that exists in the other string.Concat overloads, doesn't currently exist in the string.Concat(string[]) overload, but now with this change does exist in that one.

EDIT: I put up the PR at dotnet/corefx#8052, but it can't be merged until corefx consumes this change.

@stephentoub
Copy link
Member Author

Updated.


return ConcatArray(internalValues, totalLength);

// If it's too long, fail, or if it's empty, return an empty string.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ConcatArray is called from a single place now - can fold it into its caller.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@jkotas
Copy link
Member

jkotas commented Apr 25, 2016

LGTM otherwise

Use a recursive call to Concat rather than using a separate ConcatWithDefensiveCopy method.
@stephentoub stephentoub force-pushed the remove_stringconcat_defensivecopy branch from 424a7c6 to 0b088d2 Compare April 25, 2016 15:59
@hughbe
Copy link

hughbe commented Apr 25, 2016

Nice. Do we have a similar situation with other params string[] methods such as string.Join or string.Format? If so, am I correct in suggesting that these may be optimized too?

@stephentoub
Copy link
Member Author

Do we have a similar situation with other params string[] methods such as string.Join or string.Format ?

Not as-is, in that they use StringBuilderCache and Append directly to the received StringBuilder. The StringBuilderCache approach is a good one when it's expected that the returned string will fit within the limits of the StringBuilder cache, and it's a good approach for string.Format where there's a lot more complexity involved.

But... it might be interesting to look at string.Join specifically to see whether changing to use an approach like the one now employed by Concat would help with performance. At the moment I'm not seeing significant perf downsides to doing so (other than the super rare case of concurrent manipulation), but there may be something I'm missing, and I expect the implementation would end up being slightly more complex, though not terribly. For the cases where the string ends up being longer than what fits in a StringBuilder from the cache, I'd expect this approach would be more efficient.

@ellismg
Copy link

ellismg commented Apr 25, 2016

LGTM as well. Thanks, Steve.

@stephentoub
Copy link
Member Author

Test Windows_NT x64 Debug Build and Test please
Test Windows_NT x64 Release Priority 1 Build and Test please

@stephentoub
Copy link
Member Author

Not sure why the Windows_NT x86 Debug Legacy Backend Build and Test leg is still showing as yellow; clicking on the Details link shows that it completed successfully.

@stephentoub stephentoub merged commit de32aed into dotnet:master Apr 26, 2016
@stephentoub stephentoub deleted the remove_stringconcat_defensivecopy branch April 26, 2016 01:09
jkotas pushed a commit to dotnet/corert that referenced this pull request May 1, 2016
Avoid defensive copy in String.Concat(string[])
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
* Avoid defensive copy in String.Concat(string[])

Today the String.Concat(params string[]) implementation makes a defensive copy of the input string[] in order to avoid issues where another thread mutates the array concurrently with the concatenation.  Such a situation is possible but exceedingly rare, so we can redo the implementation to make it cheaper when there isn't concurrent mutation and more expensive when there is, rather than always having it be more expensive.

This commit changes string.Concat to optimistically assume there won't be such concurrent mutation.  It avoids allocating the defensive string[] array copy, and instead just proceeds to allocate a string of the right length and copy the input strings into it.  If along the way it discovers that something has changed such that the string lengths no longer add up to exactly what was precomputed, then it falls back to the original implementation.

Example microbenchmark:
```C#
using System;
using System.Diagnostics;

public class Program
{
    public static void Main()
    {
        string result;
        string[] inputs = new[] { "abcd", "efgh", "ijkl", "mnop", "qrst", "uvwx", "yz" };
        var sw = new Stopwatch();
        while (true)
        {
            int gen0 = GC.CollectionCount(0);
            sw.Restart();
            for (int i = 0; i < 20000000; i++) result = string.Concat(inputs);
            sw.Stop();
            Console.WriteLine($"{GC.CollectionCount(0) - gen0}: {sw.Elapsed.TotalSeconds}");
        }
    }
}
```
Before the change:
```
> corerun test.exe
1525: 2.0733829
1526: 2.0599474
1526: 2.0717786
1526: 2.0318797
^C
```
After the change:
```
> corerun test.exe
763: 1.4700695
762: 1.446919
763: 1.4581139
763: 1.4568889
^C
```

Commit migrated from dotnet/coreclr@de32aed
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants