Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

System.Random: Parameterless constructor seeding improvement #1919 #2192

Merged
merged 2 commits into from
Oct 13, 2016

Conversation

dasMulli
Copy link

Implementation for Issue #1919: Improved the parameterless constructor for System.Random to be safer to use when called repeatedly.

@dnfclas
Copy link

dnfclas commented Nov 30, 2015

Hi @dasMulli, I'm your friendly neighborhood .NET Foundation Pull Request Bot (You can call me DNFBOT). Thanks for your contribution!

In order for us to evaluate and accept your PR, we ask that you sign a contribution license agreement. It's all electronic and will take just minutes. I promise there's no faxing. https://cla2.dotnetfoundation.org.

TTYL, DNFBOT;

@dasMulli
Copy link
Author

If i understand #1919 correctly, it isn't intended to be (as is System.Random because it is biased for some values). A secure way would be to use RandomNumberGenerator, but as i understand, referencing that from mscorlib is nontrivial on non-windows builds?

@dnfclas
Copy link

dnfclas commented Nov 30, 2015

@dasMulli, Thanks for signing the contribution license agreement so quickly! Actual humans will now validate the agreement and then evaluate the PR.

Thanks, DNFBOT;

}

/*==================================GenerateGlobalSeed====================================
**Action: Creates a cryptographically safe random number.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As @Ironyx mentioned, this really isn't cryptographically safe (at least in part because no 32 bit number is). Please remove the comment.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I'm shocked why i wrote the comment that way. Shame on me.

@dasMulli
Copy link
Author

dasMulli commented Dec 2, 2015

So, to clarify: This PR is to facilitate the use case "just give me some numbers", which should be the intent of calling the parameterless constructor. If someone built a Shuffle()-Method with new Random(), one would expect different results on consecutive calls. The approach of using Environment.TickCount for the empty constructor would cause the result to be the same for a millisecond, which is an awful lot of time for modern computers.
As @Ironyx mentioned, it is not even close to any cryptographic safety. I made a mistake because i used RandomNumberGenerator.Create() the first time to find out that it would require build configuration changes to make it compile on OS X. That is how the comment ended up there (thanks @morganbr).

@morganbr
Copy link

morganbr commented Dec 4, 2015

CC @janvorli

@GSPP
Copy link

GSPP commented Apr 3, 2016

Good implementation! This addresses my ticket and finally lays this common issue to rest.

**Returns: An integer that is safe to use as seed values for thread-local seed generators.
==========================================================================================*/
private static int GenerateGlobalSeed() {
return Guid.NewGuid().GetHashCode();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need a Guid here? Can we use Environment.TickCount instead?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, on second thought that should be enough. One could argue that this may have some advantages if you started multiple processes during the same timer resolution interval but that is too hypothetical. Also Environment.TickCount is much less resource intensive. Will change it tomorrow.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Do you know what the perf impact of this is going to be?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Introducing the same time-based problem again with that change. It will lead to problems much more rarely but it will lead to cross process identical random numbers. Any .NET process started within 15ms intervals will have the same random numbers (depending on threading structure)!

Also, the Random algorithm is not good. It has problems with near-identical seeds. E.g.: http://stackoverflow.com/questions/36376888/seeding-multiple-random-number-generators. I worry that the seeds x and x plus 15ms might result in correlated numbers.

Is it really that expensive to initialize the Guid generation system? When someone uses random numbers with the big and slow Random class, how bad can a Guid generation be?! It is just one Guid generation call ever. I'm sure 99% of the cost is initialization (JIT, PInvoke, the native Guid generator whatever it does).

This change would be very risky. At least its benefit should be quantified before making it.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

        Process.GetCurrentProcess().PriorityClass = ProcessPriorityClass.High;

        Stopwatch.StartNew().Restart(); //warmup

        var sw = new Stopwatch();

        for (int i = 0; i < 100; i++)
        {
            sw.Restart();
            Guid.NewGuid();
            sw.Stop();
            Console.WriteLine(sw.Elapsed.Ticks);
        }

Gives me:

292
84
75
69
63
57
6
3
3
6
3
3
3
6
3
3
6
3
3
6
3
6
3
6
3
3
6
6
...

x64, Release, no Debugger. Dirty benchmark, but plausible results and repeatable.

Ticks are 100ns units. So we pay about 2.92 micro seconds for quality random numbers. var dummy = Environment.TickCount; appears to take 9 ticks for the first call.

Funny, that the first 6 numbers seem to have higher non-initialization costs (which are born by the first call). I wonder why that is.

new Random().Next(); gives:

51
27
21
15
12
12
12
12
12
12

All these first-call overheads are peanuts, both the for Guid and the 4.6 Random class.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @AlexGhiondea that TickCount is good enough as a global seed.

Although it may become more common to start a bunch of processes for scenarios for which AppDomains would have previously been used, relying on System.Random for important logic is discouraged anyway and applications doing that would have already faced seeding problems during startup.

Not to forget that if an applications requires "good" seeds, they most probably also need "good" random numbers so it should be using RNGCryptoServiceProvider instead of System.Random. That would be best addressed through documentation.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes, multiple appdomains will have the same random numbers. This is worse.

What's the reasoning for not using Guid.NewGuid? Is 2.92 micro seconds too much?? That's 1/300000th of a second! This cost is only payed if the random infrastructure is being used. This is not in the startup path of most apps.

And if someone does not want to pay 2.92 micro seconds he can generate the seed himself. This argument works both ways.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is only one AppDomain in CoreCLR so that should not be an issue.

My concern with Guid is that is requires a bunch of additional computation. For the default case (which most people will use) I was hoping to avoid additional computations. This is really a convenience fix to help people start with a more random seed. If you have an app that really needs random seeds, you should not use the default ctor anyway.

Do we know what the difference is between using Guid and using TickCount in the new implementation?

Copy link

@GSPP GSPP Apr 6, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

additional computations

I have quantified those to 2.92 micro seconds. This is not "a bunch", it is a tiny amount. Are you concerned about 2.92 micro seconds?

Let's solve this problem for good and play it safe. Hopefully this is the last time this code needs to be touched. Just scroll through the result pages: https://www.google.com/webhp?complete=1&hl=en&gws_rd=ssl#complete=1&hl=en&q=c%23+random+same+number This trips people up. My hypothesis is that 50-90% of all .NET devs are affected by this issue once in their life. This is important to solve completely.

There is only one AppDomain in CoreCLR so that should not be an issue.

I understand CoreCLR code eventually makes its way into the Desktop product, does it not?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The short answer is 'yes' I do worry 😄

However, in this case, I think using a Guid should be fine. I would still prefer that we don't introduce a dependency to Guid, but there doesn't seem to be any other way of introducing process specific entropy (which would solve the problem of multiple processes starting withing 15ms).

Regarding Desktop -- the code does not automatically make its way into Desktop. We have a [process in place] to track those kinds of changes. (https://github.com/dotnet/corefx/blob/master/Documentation/project-docs/triage.md#porting-to-net-framework).

@AlexGhiondea
Copy link

@dotnet-bot please test Ubuntu x64 Checked Build and Test

…eed ThreadStatic Random instances that generate seeds for newly created Random instances.
@dasMulli dasMulli force-pushed the random-seeding-improvements branch from 3177551 to 9f6a0b6 Compare April 8, 2016 04:45
@dasMulli
Copy link
Author

dasMulli commented Apr 8, 2016

Squashed commits and rebased onto master.

@chrisaut
Copy link

What's happening here?

@Petermarcu
Copy link
Member

@AlexGhiondea , is this one ready to go? If so, lets finish it up...

}
t_threadRandom = new Random(seed);
}
return t_threadRandom.Next();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless things have changed recently around thread statics, it'd be good to cache t_threadRandom into a local so that the common path of it already being initialized has only one TLS read, rather than one for the null check and another for the Next call.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stephentoub Thanks for pointing that out! Updated PR with your other comments as well.


[ThreadStatic]
private static Random t_threadRandom;
private static Random s_globalRandom = new Random(GenerateGlobalSeed());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: s_globalRandom can be readonly



/*=========================================================================================
**Action: Initializes a new instance of the Random class, using a the specified seed value
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do need these added header comments? If we keep them, the "a the" here should be fixed.

@dasMulli
Copy link
Author

@dotnet-bot test Linux ARM Emulator Cross Debug Build

@dasMulli
Copy link
Author

1st build emulator segfault during regression test b49809, 2nd build timeout.
@dotnet-bot test Linux ARM Emulator Cross Debug Build

@Petermarcu
Copy link
Member

Checks have passed and change signed off. Merging. Thanks @dasMulli!

@Petermarcu Petermarcu merged commit 19ad9dd into dotnet:master Oct 13, 2016
@karelz karelz modified the milestone: 2.0.0 Aug 28, 2017
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
…mprovements

System.Random: Parameterless constructor seeding improvement dotnet/coreclr#1919

Commit migrated from dotnet/coreclr@19ad9dd
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
9 participants