-
Notifications
You must be signed in to change notification settings - Fork 2.6k
System.Random: Parameterless constructor seeding improvement #1919 #2192
System.Random: Parameterless constructor seeding improvement #1919 #2192
Conversation
|
Hi @dasMulli, I'm your friendly neighborhood .NET Foundation Pull Request Bot (You can call me DNFBOT). Thanks for your contribution! TTYL, DNFBOT; |
|
If i understand #1919 correctly, it isn't intended to be (as is System.Random because it is biased for some values). A secure way would be to use RandomNumberGenerator, but as i understand, referencing that from mscorlib is nontrivial on non-windows builds? |
|
@dasMulli, Thanks for signing the contribution license agreement so quickly! Actual humans will now validate the agreement and then evaluate the PR. |
src/mscorlib/src/System/Random.cs
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As @Ironyx mentioned, this really isn't cryptographically safe (at least in part because no 32 bit number is). Please remove the comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I'm shocked why i wrote the comment that way. Shame on me.
|
So, to clarify: This PR is to facilitate the use case "just give me some numbers", which should be the intent of calling the parameterless constructor. If someone built a |
|
CC @janvorli |
|
Good implementation! This addresses my ticket and finally lays this common issue to rest. |
| **Returns: An integer that is safe to use as seed values for thread-local seed generators. | ||
| ==========================================================================================*/ | ||
| private static int GenerateGlobalSeed() { | ||
| return Guid.NewGuid().GetHashCode(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need a Guid here? Can we use Environment.TickCount instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, on second thought that should be enough. One could argue that this may have some advantages if you started multiple processes during the same timer resolution interval but that is too hypothetical. Also Environment.TickCount is much less resource intensive. Will change it tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Do you know what the perf impact of this is going to be?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Introducing the same time-based problem again with that change. It will lead to problems much more rarely but it will lead to cross process identical random numbers. Any .NET process started within 15ms intervals will have the same random numbers (depending on threading structure)!
Also, the Random algorithm is not good. It has problems with near-identical seeds. E.g.: http://stackoverflow.com/questions/36376888/seeding-multiple-random-number-generators. I worry that the seeds x and x plus 15ms might result in correlated numbers.
Is it really that expensive to initialize the Guid generation system? When someone uses random numbers with the big and slow Random class, how bad can a Guid generation be?! It is just one Guid generation call ever. I'm sure 99% of the cost is initialization (JIT, PInvoke, the native Guid generator whatever it does).
This change would be very risky. At least its benefit should be quantified before making it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Process.GetCurrentProcess().PriorityClass = ProcessPriorityClass.High;
Stopwatch.StartNew().Restart(); //warmup
var sw = new Stopwatch();
for (int i = 0; i < 100; i++)
{
sw.Restart();
Guid.NewGuid();
sw.Stop();
Console.WriteLine(sw.Elapsed.Ticks);
}
Gives me:
292
84
75
69
63
57
6
3
3
6
3
3
3
6
3
3
6
3
3
6
3
6
3
6
3
3
6
6
...
x64, Release, no Debugger. Dirty benchmark, but plausible results and repeatable.
Ticks are 100ns units. So we pay about 2.92 micro seconds for quality random numbers. var dummy = Environment.TickCount; appears to take 9 ticks for the first call.
Funny, that the first 6 numbers seem to have higher non-initialization costs (which are born by the first call). I wonder why that is.
new Random().Next(); gives:
51
27
21
15
12
12
12
12
12
12
All these first-call overheads are peanuts, both the for Guid and the 4.6 Random class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @AlexGhiondea that TickCount is good enough as a global seed.
Although it may become more common to start a bunch of processes for scenarios for which AppDomains would have previously been used, relying on System.Random for important logic is discouraged anyway and applications doing that would have already faced seeding problems during startup.
Not to forget that if an applications requires "good" seeds, they most probably also need "good" random numbers so it should be using RNGCryptoServiceProvider instead of System.Random. That would be best addressed through documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yes, multiple appdomains will have the same random numbers. This is worse.
What's the reasoning for not using Guid.NewGuid? Is 2.92 micro seconds too much?? That's 1/300000th of a second! This cost is only payed if the random infrastructure is being used. This is not in the startup path of most apps.
And if someone does not want to pay 2.92 micro seconds he can generate the seed himself. This argument works both ways.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is only one AppDomain in CoreCLR so that should not be an issue.
My concern with Guid is that is requires a bunch of additional computation. For the default case (which most people will use) I was hoping to avoid additional computations. This is really a convenience fix to help people start with a more random seed. If you have an app that really needs random seeds, you should not use the default ctor anyway.
Do we know what the difference is between using Guid and using TickCount in the new implementation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
additional computations
I have quantified those to 2.92 micro seconds. This is not "a bunch", it is a tiny amount. Are you concerned about 2.92 micro seconds?
Let's solve this problem for good and play it safe. Hopefully this is the last time this code needs to be touched. Just scroll through the result pages: https://www.google.com/webhp?complete=1&hl=en&gws_rd=ssl#complete=1&hl=en&q=c%23+random+same+number This trips people up. My hypothesis is that 50-90% of all .NET devs are affected by this issue once in their life. This is important to solve completely.
There is only one AppDomain in CoreCLR so that should not be an issue.
I understand CoreCLR code eventually makes its way into the Desktop product, does it not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The short answer is 'yes' I do worry 😄
However, in this case, I think using a Guid should be fine. I would still prefer that we don't introduce a dependency to Guid, but there doesn't seem to be any other way of introducing process specific entropy (which would solve the problem of multiple processes starting withing 15ms).
Regarding Desktop -- the code does not automatically make its way into Desktop. We have a [process in place] to track those kinds of changes. (https://github.com/dotnet/corefx/blob/master/Documentation/project-docs/triage.md#porting-to-net-framework).
|
@dotnet-bot please test Ubuntu x64 Checked Build and Test |
…eed ThreadStatic Random instances that generate seeds for newly created Random instances.
3177551 to
9f6a0b6
Compare
|
Squashed commits and rebased onto master. |
|
What's happening here? |
|
@AlexGhiondea , is this one ready to go? If so, lets finish it up... |
src/mscorlib/src/System/Random.cs
Outdated
| } | ||
| t_threadRandom = new Random(seed); | ||
| } | ||
| return t_threadRandom.Next(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless things have changed recently around thread statics, it'd be good to cache t_threadRandom into a local so that the common path of it already being initialized has only one TLS read, rather than one for the null check and another for the Next call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stephentoub Thanks for pointing that out! Updated PR with your other comments as well.
src/mscorlib/src/System/Random.cs
Outdated
|
|
||
| [ThreadStatic] | ||
| private static Random t_threadRandom; | ||
| private static Random s_globalRandom = new Random(GenerateGlobalSeed()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: s_globalRandom can be readonly
src/mscorlib/src/System/Random.cs
Outdated
|
|
||
|
|
||
| /*========================================================================================= | ||
| **Action: Initializes a new instance of the Random class, using a the specified seed value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do need these added header comments? If we keep them, the "a the" here should be fixed.
|
@dotnet-bot test Linux ARM Emulator Cross Debug Build |
|
1st build emulator segfault during regression test b49809, 2nd build timeout. |
|
Checks have passed and change signed off. Merging. Thanks @dasMulli! |
…mprovements System.Random: Parameterless constructor seeding improvement dotnet/coreclr#1919 Commit migrated from dotnet/coreclr@19ad9dd
Implementation for Issue #1919: Improved the parameterless constructor for System.Random to be safer to use when called repeatedly.