Avoid most of the (small) regressions for seeded and derived Random instances #57530

Merged: 4 commits, Aug 17, 2021

stephentoub
Member

Fixes #57272
cc: @GrabYourPitchforks, @tannergooding

When we introduced the new Random algorithm, we did so by factoring the old algorithm out into an implementation strategy class that is instantiated for all uses other than new Random(). This ends up penalizing those other uses (providing a seed and/or deriving from Random) by adding more virtual dispatch than is strictly necessary, in particular for new Random(seed). This PR negates most but not all of that (expected and small) regression by splitting the compat implementation in two: one class for new Random(seed) and one for new DerivedRandom()/new DerivedRandom(seed). The former no longer needs to make virtual calls back out to the parent type. The former is also a case a consumer can't really do anything to improve, whereas in the derived case, the derivation may override methods to provide a more efficient implementation.
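
A minimal sketch of that split, to make the dispatch difference concrete. Everything here is hypothetical: `SketchRandom`, `Impl`, `SeededImpl`, `DerivedCompatImpl` and `HalfRandom` are illustrative names, and the toy linear congruential step stands in for the real generator. It is not the runtime's code.

```csharp
using System;

// Hypothetical stand-in for Random; names and generator are illustrative only.
public class SketchRandom
{
    private readonly Impl _impl;

    public SketchRandom(int seed)
    {
        // Exact base type: nothing can be overridden, so the implementation
        // may call its own generator directly.
        // Derived type: overrides must still be honored, so the implementation
        // routes through virtual methods on the parent.
        _impl = GetType() == typeof(SketchRandom)
            ? new SeededImpl(seed)
            : new DerivedCompatImpl(this, seed);
    }

    public virtual int Next(int maxValue) => _impl.Next(maxValue);

    protected virtual double Sample() => _impl.Sample();

    private abstract class Impl
    {
        private uint _state;

        protected Impl(int seed) => _state = (uint)seed;

        // Toy linear congruential step (assumption: not the real algorithm).
        // Returns a value in [0, 1).
        protected double NextUnit()
        {
            _state = unchecked(_state * 1664525u + 1013904223u);
            return (_state >> 8) * (1.0 / (1 << 24));
        }

        public abstract double Sample();
        public abstract int Next(int maxValue);
    }

    // Used for new SketchRandom(seed): no calls back out to the parent.
    private sealed class SeededImpl : Impl
    {
        public SeededImpl(int seed) : base(seed) { }

        public override double Sample() => NextUnit();

        public override int Next(int maxValue) => (int)(NextUnit() * maxValue);
    }

    // Used for derived types: goes through the parent's virtual Sample().
    private sealed class DerivedCompatImpl : Impl
    {
        private readonly SketchRandom _parent;

        public DerivedCompatImpl(SketchRandom parent, int seed) : base(seed) => _parent = parent;

        public override double Sample() => NextUnit();

        public override int Next(int maxValue) => (int)(_parent.Sample() * maxValue);
    }
}

// Hypothetical derived type whose override must be observed by Next(int).
public sealed class HalfRandom : SketchRandom
{
    public HalfRandom() : base(0) { }

    protected override double Sample() => 0.5;
}
```

With this sketch, `new HalfRandom().Next(42)` returns 21 because the derived implementation consults the overridden `Sample()`, while `new SketchRandom(42).Next(42)` never leaves `SeededImpl`.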

Additionally, we introduced NextInt64 in this release, which means we can still change the algorithm it uses and which derived methods it calls. This PR changes the implementation to make 3 calls to an overridable Next overload instead of 8.
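
As a rough illustration of how three bounded draws can cover 64 bits, here is a sketch. The 22/22/20-bit split, the `NextInt64Sketch` class and its method names are assumptions made for this example and are not taken from the PR text.

```csharp
using System;

// Hypothetical helper; the bit split below is an assumption for illustration.
public static class NextInt64Sketch
{
    // Compose 64 bits from three calls to Next(int): 22 + 22 + 20 bits.
    public static ulong NextUInt64(Random random) =>
        ((ulong)(uint)random.Next(1 << 22)) |
        (((ulong)(uint)random.Next(1 << 22)) << 22) |
        (((ulong)(uint)random.Next(1 << 20)) << 44);

    // Keep the top 63 bits, and retry in the rare case the result equals
    // long.MaxValue so the returned value stays in [0, long.MaxValue).
    public static long NextInt64(Random random)
    {
        while (true)
        {
            ulong candidate = NextUInt64(random) >> 1;
            if (candidate != (ulong)long.MaxValue)
            {
                return (long)candidate;
            }
        }
    }
}
```

Because every bit comes from `Next(int)`, a derived type that overrides that method still controls the output, and each 64-bit value costs three virtual calls.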

I also experimented with, but ultimately decided against, the proposal in the cited issue to lazily initialize the array for a derived type the first time it uses the state. The overhead of the lazy initialization was measurable, and most derived types I found do end up using one of the base methods (even though a derived type could obviously use Random purely as an abstraction). We can reconsider in the future.
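
For context on where that overhead comes from, a lazily initialized state array has to be checked on every access. The sketch below is hypothetical (`LazyStateSketch`, the array length and the mixing step are all made up for illustration) and only shows the shape of the trade-off.

```csharp
#nullable enable
using System;

// Hypothetical illustration of lazy state allocation; not the runtime's code.
public sealed class LazyStateSketch
{
    private const int StateLength = 56; // length chosen for illustration only
    private readonly int _seed;
    private int[]? _state;

    public LazyStateSketch(int seed) => _seed = seed;

    public bool IsStateAllocated => _state != null;

    // Every access pays for a null check; this is the per-call cost that
    // lazy initialization adds compared with allocating in the constructor.
    private int[] State => _state ??= CreateState(_seed);

    public int Next()
    {
        int[] state = State;
        // Toy mixing step (assumption: stands in for the real algorithm).
        int value = state[0] = unchecked(state[0] * 31 + state[1]);
        return value & int.MaxValue;
    }

    private static int[] CreateState(int seed)
    {
        var state = new int[StateLength];
        for (int i = 0; i < state.Length; i++)
        {
            state[i] = unchecked(seed + i * 0x9E3779B);
        }
        return state;
    }
}
```

A derived type that never touches the base state would skip the allocation entirely, but any type that does call a base method pays the check on each call.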

| Type | Method | Job | Toolchain | Mean | Error | StdDev | Ratio |
|---|---|---|---|---:|---:|---:|---:|
| RandomDerived | Next | Job-MYJVJF | \main\corerun.exe | 8.256 ns | 0.1118 ns | 0.0873 ns | 1.00 |
| RandomDerived | Next | Job-SVARSY | \pr\corerun.exe | 8.216 ns | 0.0484 ns | 0.0452 ns | 1.00 |
| RandomSeed | Next | Job-MYJVJF | \main\corerun.exe | 9.153 ns | 0.0145 ns | 0.0129 ns | 1.00 |
| RandomSeed | Next | Job-SVARSY | \pr\corerun.exe | 8.282 ns | 0.0595 ns | 0.0527 ns | 0.90 |
| RandomDerived | NextMax | Job-MYJVJF | \main\corerun.exe | 13.140 ns | 0.0733 ns | 0.0685 ns | 1.00 |
| RandomDerived | NextMax | Job-SVARSY | \pr\corerun.exe | 12.978 ns | 0.0863 ns | 0.0765 ns | 0.99 |
| RandomSeed | NextMax | Job-MYJVJF | \main\corerun.exe | 12.890 ns | 0.0623 ns | 0.0583 ns | 1.00 |
| RandomSeed | NextMax | Job-SVARSY | \pr\corerun.exe | 9.623 ns | 0.0717 ns | 0.0636 ns | 0.75 |
| RandomDerived | NextMinMax | Job-MYJVJF | \main\corerun.exe | 13.435 ns | 0.0845 ns | 0.0749 ns | 1.00 |
| RandomDerived | NextMinMax | Job-SVARSY | \pr\corerun.exe | 13.756 ns | 0.1365 ns | 0.1277 ns | 1.02 |
| RandomSeed | NextMinMax | Job-MYJVJF | \main\corerun.exe | 13.366 ns | 0.0667 ns | 0.0591 ns | 1.00 |
| RandomSeed | NextMinMax | Job-SVARSY | \pr\corerun.exe | 10.686 ns | 0.0471 ns | 0.0441 ns | 0.80 |
| RandomDerived | NextInt64 | Job-MYJVJF | \main\corerun.exe | 78.238 ns | 0.2008 ns | 0.1677 ns | 1.00 |
| RandomDerived | NextInt64 | Job-SVARSY | \pr\corerun.exe | 43.399 ns | 0.6699 ns | 0.5939 ns | 0.55 |
| RandomSeed | NextInt64 | Job-MYJVJF | \main\corerun.exe | 74.153 ns | 0.4255 ns | 0.3772 ns | 1.00 |
| RandomSeed | NextInt64 | Job-SVARSY | \pr\corerun.exe | 23.521 ns | 0.0418 ns | 0.0327 ns | 0.32 |
| RandomDerived | NextInt64Max | Job-MYJVJF | \main\corerun.exe | 113.069 ns | 0.5636 ns | 0.5272 ns | 1.00 |
| RandomDerived | NextInt64Max | Job-SVARSY | \pr\corerun.exe | 68.993 ns | 0.8238 ns | 0.6431 ns | 0.61 |
| RandomSeed | NextInt64Max | Job-MYJVJF | \main\corerun.exe | 117.979 ns | 0.6594 ns | 0.8098 ns | 1.00 |
| RandomSeed | NextInt64Max | Job-SVARSY | \pr\corerun.exe | 40.567 ns | 0.1941 ns | 0.1816 ns | 0.34 |
| RandomDerived | NextInt64MinMax | Job-MYJVJF | \main\corerun.exe | 112.131 ns | 0.3612 ns | 0.3017 ns | 1.00 |
| RandomDerived | NextInt64MinMax | Job-SVARSY | \pr\corerun.exe | 67.423 ns | 0.1900 ns | 0.1684 ns | 0.60 |
| RandomSeed | NextInt64MinMax | Job-MYJVJF | \main\corerun.exe | 116.348 ns | 0.4822 ns | 0.4275 ns | 1.00 |
| RandomSeed | NextInt64MinMax | Job-SVARSY | \pr\corerun.exe | 41.832 ns | 0.2730 ns | 0.2554 ns | 0.36 |
| RandomDerived | NextSingle | Job-MYJVJF | \main\corerun.exe | 11.778 ns | 0.0799 ns | 0.0709 ns | 1.00 |
| RandomDerived | NextSingle | Job-SVARSY | \pr\corerun.exe | 11.889 ns | 0.0619 ns | 0.0517 ns | 1.01 |
| RandomSeed | NextSingle | Job-MYJVJF | \main\corerun.exe | 11.718 ns | 0.0680 ns | 0.0636 ns | 1.00 |
| RandomSeed | NextSingle | Job-SVARSY | \pr\corerun.exe | 9.043 ns | 0.0333 ns | 0.0295 ns | 0.77 |
| RandomDerived | NextDouble | Job-MYJVJF | \main\corerun.exe | 12.318 ns | 0.1869 ns | 0.1561 ns | 1.00 |
| RandomDerived | NextDouble | Job-SVARSY | \pr\corerun.exe | 11.460 ns | 0.0667 ns | 0.0557 ns | 0.93 |
| RandomSeed | NextDouble | Job-MYJVJF | \main\corerun.exe | 11.387 ns | 0.0752 ns | 0.0704 ns | 1.00 |
| RandomSeed | NextDouble | Job-SVARSY | \pr\corerun.exe | 8.947 ns | 0.1129 ns | 0.1056 ns | 0.79 |
| RandomDerived | NextBytesArray | Job-MYJVJF | \main\corerun.exe | 62,459.876 ns | 384.2017 ns | 340.5848 ns | 1.00 |
| RandomDerived | NextBytesArray | Job-SVARSY | \pr\corerun.exe | 68,100.319 ns | 321.5379 ns | 300.7667 ns | 1.09 |
| RandomSeed | NextBytesArray | Job-MYJVJF | \main\corerun.exe | 62,466.595 ns | 260.9075 ns | 231.2877 ns | 1.00 |
| RandomSeed | NextBytesArray | Job-SVARSY | \pr\corerun.exe | 68,294.184 ns | 755.4770 ns | 706.6736 ns | 1.09 |
| RandomDerived | NextBytesSpan | Job-MYJVJF | \main\corerun.exe | 80,514.292 ns | 600.0956 ns | 561.3298 ns | 1.00 |
| RandomDerived | NextBytesSpan | Job-SVARSY | \pr\corerun.exe | 81,179.767 ns | 634.0510 ns | 529.4613 ns | 1.01 |
| RandomSeed | NextBytesSpan | Job-MYJVJF | \main\corerun.exe | 85,292.812 ns | 648.1876 ns | 606.3151 ns | 1.00 |
| RandomSeed | NextBytesSpan | Job-SVARSY | \pr\corerun.exe | 64,892.721 ns | 594.0378 ns | 555.6634 ns | 0.76 |

using System;
using BenchmarkDotNet.Attributes;

public class RandomSeed : TestBase
{
    public override Random Create() => new Random(42);
}

public class RandomDerived : TestBase
{
    public override Random Create() => new DerivedRandom();

    private sealed class DerivedRandom : Random { }
}

public abstract class TestBase
{
    private byte[] _buffer = new byte[10_000];
    private Random _random;

    public abstract Random Create();

    [GlobalSetup]
    public void Setup() => _random = Create();

    [Benchmark]
    public Random Ctor() => Create();

    [Benchmark]
    public int Next() => _random.Next();

    [Benchmark]
    public void NextMax() => _random.Next(42);

    [Benchmark]
    public void NextMinMax() => _random.Next(0, 42);

    [Benchmark]
    public void NextInt64() => _random.NextInt64();

    [Benchmark]
    public void NextInt64Max() => _random.NextInt64(42);

    [Benchmark]
    public void NextInt64MinMax() => _random.NextInt64(0, 42);

    [Benchmark]
    public void NextSingle() => _random.NextSingle();

    [Benchmark]
    public void NextDouble() => _random.NextDouble();

    [Benchmark]
    public void NextBytesArray() => _random.NextBytes(_buffer);

    [Benchmark]
    public void NextBytesSpan() => _random.NextBytes((Span<byte>)_buffer);
}

I'm not currently sure why NextBytesArray gets a little slower... that'll need more investigation.

@dotnet-issue-labeler
I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label.

@stephentoub stephentoub added this to the 6.0.0 milestone Aug 16, 2021
@ghost
ghost commented Aug 16, 2021

Tagging subscribers to this area: @dotnet/area-system-runtime
See info in area-owners.md if you want to be subscribed.

Author: stephentoub
Assignees: -
Labels:

area-System.Runtime

Milestone: 6.0.0

Member

@tannergooding tannergooding left a comment

LGTM.

When we introduced the new Random algorithm, we did so by factoring the old algorithm out into an implementation strategy class that is instantiated for all uses other than `new Random()`. This ends up penalizing other uses (providing a seed and/or deriving from Random) by adding more virtual dispatch than is strictly necessary, in particular for `new Random(seed)`. This PR negates most of that (expected) regression by splitting the compat implementation in two, one class for `new Random(seed)` and one for `new DerivedRandom()`/`new DerivedRandom(seed)`; the former no longer needs to make virtual calls back out to the parent type. The former is also one that a consumer can't really do anything to improve, whereas in the derived case, the derivation may override to provide a more optimal implementation.
NextInt64 hasn't shipped yet, so we can change its implementation to make 3 calls instead of 8 and to delegate to a different overload of Next.
@stephentoub stephentoub merged commit f7633f4 into dotnet:main Aug 17, 2021
@stephentoub stephentoub deleted the randomperf branch August 17, 2021 15:07
@ghost ghost locked as resolved and limited conversation to collaborators Sep 16, 2021
Successfully merging this pull request may close these issues.

[.NET 6] Improving the default Random.NextInt64 performance for derived classes