Skip to content

Improve speed of Randomizer.GetString#4512

Merged
stevenaw merged 6 commits intonunit:masterfrom
CollinAlpert:getrandomstring_speed_up
Jun 16, 2024
Merged

Improve speed of Randomizer.GetString#4512
stevenaw merged 6 commits intonunit:masterfrom
CollinAlpert:getrandomstring_speed_up

Conversation

@CollinAlpert
Copy link
Contributor

@CollinAlpert CollinAlpert commented Oct 17, 2023

Fixes #4733

This PR improves speed and reduces allocations of the Randomizer.GetString method. Here is a benchmark:

Screenshot 2023-10-17 at 13 16 46

Benchmark code:

[MemoryDiagnoser]
public class Benchmark
{
	private const string DefaultStringChars = "abcdefghijkmnopqrstuvwxyzABCDEFGHJKLMNOPQRSTUVWXYZ0123456789_";

	[Params(5, 10, 100, 1000)]
	public int OutputLength;

	[Benchmark(Baseline = true)]
	public string GetRandomStringOld()
	{
		var data = new char[OutputLength];

		for (int i = 0; i < data.Length; i++)
			data[i] = DefaultStringChars[Random.Shared.Next(0, DefaultStringChars.Length)];

		return new string(data);
	}

	[Benchmark]
	public string GetRandomStringNew()
	{
		return string.Create(OutputLength, DefaultStringChars, static (span, allowedChars) =>
		{
			for (var i = 0; i < span.Length; i++)
			{
				span[i] = allowedChars[Random.Shared.Next(0, allowedChars.Length)];
			}
		});
	}
}

#if NET6_0_OR_GREATER
return string.Create(outputLength, allowedChars, (span, chars) =>
{
for (var i = 0; i < span.Length; i++)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use int iso var for i
Also of you use the same names for the parameters as the other version makes this look similar.

I wonder if you can make a local function for the loop as that is exactly the same between both versions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I renamed the parameters and used int.

As for making a local function, that would add an extra allocation for the closure and we wouldn't get as much of a performance benefit.

Copy link
Member

@OsirisTerje OsirisTerje Oct 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if you make it local and mark it with the MethodImpAttribute(MethodImplOptions.AggressiveInlining) ? Does that affect performance?

And, you measure this in release, right?

Sorry for being a bit outdated: The inline keyword should be enough, and then the code can be beautiful and still performant :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That might help with the allocation (I'd need to check if the JIT would actually be able to inline the function), but that still wouldn't solve the problem of finding a common type between Span<char> and char[]. Any magic performed to put both of those types in a common function will definitely result in worse performance.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A char[] can be assigned to a Span<char> also on .NET Framework. So both could use Span<char>.
I don't know if that changes any optimizations done by the JITter.

Could you put the below in your benchmark please, prevents me from having to write one.

public string GetString(int outputLength, string allowedChars)
{
#if NET6_0_OR_GREATER
    return string.Create(outputLength, allowedChars, Fill);
#else
    char[] data = new char[outputLength];
    Fill(data, allowedChars);
    return new string(data);
#endif

    void Fill(Span<char> data, string allowed)
    {
        for (int i = 0; i < data.Length; i++)
            data[i] = allowed[Next(0, allowed.Length)];
    }
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However I have just run a benchmark with the updated code and it seems the method is not inlined and the performance is a bit worse than the NetFx code:

image

Please note that I do not have net472 installed and am running the net472 code on net8.0, so the string.Create on >= net6.0 version will still be faster than the stackalloc version on net472. It will just not be as performant as it could be.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what you are comparing here? .NET Core vs NET Framework or old vs new?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no way of providing a realistic benchmark, since I can't install net472. In these benchmarks I am comparing the PR code for net472 and >= net6.0, but running both benchmarks on net8.0. Unfortunately that's the best I can do.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My benchmark results show that for .NET48 nor NET6.0 the new code is not improving things or in case of .NET48 even making things slower, but for NET8.0 the improvement is a lot.

What I haven't tested is to see if a .NET6.0 build of NUNIT when run under NET8.0 would give the same improvement as we are not intending to make an NET8.0 binary Nuget package.

image

Copy link
Contributor Author

@CollinAlpert CollinAlpert Oct 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does your benchmark do for GetStringNew in net48, since it can't use the string.Create API? Does it use stackalloc?

I managed to run some benchmarks on a Windows machine and also noticed that stackalloc was slower than char[] on net48, so I will probably revert that change.

@OsirisTerje
Copy link
Member

This seems to have gone stale.

Does this help in any way now, or should it be closed?

@stevenaw
Copy link
Member

@CollinAlpert Is this something you'd still like to pursue?

On the trade-off of net8 vs net4.8, my own take is that I think a big improvement on the former is worth a small slowdown in the other if it means the code stays simple. At this point I think most NET4.8 users know they're not getting the same performance as on newer runtimes.

# Conflicts:
#	src/NUnitFramework/framework/nunit.framework.csproj
@CollinAlpert
Copy link
Contributor Author

@stevenaw Yes, sure! Just let me know what you need me to do to consider this mergable.

Copy link
Member

@stevenaw stevenaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @CollinAlpert ! I left one minor suggestion just to minimize the effects of large or negative values being passed in.

@manfred-brands I know you were already heavily involved in this review. I'm not sure if you had any other thoughts on your end.

#if NET6_0_OR_GREATER
return string.Create(outputLength, allowedChars, FillSpan);
#else
Span<char> data = stackalloc char[outputLength];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only thing I would suggest changing is this line. If a particularly large value of outputLength is passed then it will allocate a very large amount on the stack and potentially cause a stack overflow and take down the process. I'd suggest we conditionally use stackalloc only if the value of outputLength is below a low value like 256 and otherwise use an array like the original code does.

Perhaps something like (untested):

const int MaxStackAllocSize = 256;
Span<char> data = (uint)outputLength <= MaxStackAllocSize ? stackalloc char[outputLength] : new char[outputLength];

The cast to uint is there to catch and avoid stackalloc if a negative value were to be passed in. new char[-1] will still fail as well, but it would be an OverflowException rather than a StackOverflowException which will only fail the test rather than the process.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point, thanks! What do you think about throwing an ArgumentOutOfRangeException when a negative value gets passed in?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's a great idea and would be a great usability improvement over the existing code. Thanks!

I don't think the change in thrown exception type here would be enough to break anyone so it should be fairly safe. If anything, the more targeted exception could help alert to issues in calling code

Copy link
Member

@stevenaw stevenaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @CollinAlpert ! Changes look great, thanks for your contribution.

@stevenaw stevenaw merged commit d19a0c4 into nunit:master Jun 16, 2024
@CollinAlpert CollinAlpert deleted the getrandomstring_speed_up branch June 16, 2024 17:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improve speed of Randomizer.GetString

4 participants