New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added GetInt32 to RandomNumberGenerator. #31243

Merged
merged 1 commit into from Jul 25, 2018

Conversation

Projects
None yet
6 participants
@vcsjones
Collaborator

vcsjones commented Jul 21, 2018

Attempt at implementing RNG.GetInt32.

/cc @bartonjs @GrabYourPitchforks

Fixes #30873.

private static int LeadingZeroCount(uint value)
{
//TODO: Use lzcnt intrinsics when available.

This comment has been minimized.

@vcsjones

vcsjones Jul 21, 2018

Collaborator

I wasn't sure if it was acceptable at the current point to add a reference to System.Runtime.Intrinsics.Experimental.

}
while ((result & mask) > maskedRange);
}
else

This comment has been minimized.

@vcsjones

vcsjones Jul 21, 2018

Collaborator

I don't have any strong opinions about how to handle big endian archs here or the appropriate way to test it. You could also use the same slicing from the little endian path and use BinaryPrimitives.ReverseEndianness. I thought BE should avoid the penalty though.

This comment has been minimized.

@jkotas

jkotas Jul 22, 2018

Member

The actual random number generation is going to dominate the performance of this method. I doubt that you are getting any measurable speed up by micro-optimizing this for BE/LE.

It would be better to make this simpler and smaller code, with no special casing for BE/LE, something like:

do
{
    Span<byte> randomFillSlice = valueResultBytes.Slice(0, bytesRequired);
    RandomNumberGeneratorImplementation.FillSpan(randomFillSlice);
    result = BinaryPrimitives.ReadInt32LittleEndian(randomFillSlice);
}
while ((result & mask) > maskedRange);

This comment has been minimized.

@vcsjones

vcsjones Jul 22, 2018

Collaborator

Bah. That makes way more sense. Thanks!

@@ -96,12 +97,107 @@ public static void Fill(Span<byte> data)
RandomNumberGeneratorImplementation.FillSpan(data);
}
public static int GetInt32(int fromInclusive, int toExclusive)
{
if (fromInclusive >= toExclusive) throw new ArgumentException(SR.Argument_InvalidRandomRange);

This comment has been minimized.

@bartonjs

bartonjs Jul 22, 2018

Member

Style: The throw statement should be on the next line, indented.

value <<= 2;
c += 2;
}
return c + ((int)value >> 31);

This comment has been minimized.

@bartonjs

bartonjs Jul 22, 2018

Member

I'm not sure this is properly implemented.

value = 00000000
c = 1
yes
value = 00000000
c = 17
yes
value = 00000000
c = 25
yes
value = 00000000
c = 29
yes
value = 00000000
c = 31

return 31 + (0 >> 31) => 31
(32 was correct)

The "maybe subtraction" at the end seems less straightforward than the clz3 algorithm on Wikipedia (https://en.wikipedia.org/wiki/Find_first_set#CLZ). Of course, that one also has to special case 0 😄.

0 won't be used in context here, but since this is supposed to be functionally equivalent to lzcnt it should handle 0 correctly, particularly because this is likely to serve as the implementation on arm.

This comment has been minimized.

@vcsjones

vcsjones Jul 22, 2018

Collaborator

Hm, OK. I borrowed it from here and eyeballed it and it looked OK (shame on me), I will re-implement.

private static int LeadingZeroCount(uint value)

public static int GetInt32(int toExclusive)
{
if (toExclusive <= 0) throw new ArgumentOutOfRangeException(nameof(toExclusive), SR.ArgumentOutOfRange_NeedPosNum);

This comment has been minimized.

@bartonjs

bartonjs Jul 22, 2018

Member

throw on next line. The next function happens to violate our style (rule 1, at that). If you wanted to clean it up I wouldn't mind.

Span<uint> valueResult = stackalloc uint[1];
ref uint result = ref valueResult[0];
Span<byte> valueResultBytes = MemoryMarshal.AsBytes(valueResult);
if (BitConverter.IsLittleEndian)

This comment has been minimized.

@bartonjs

bartonjs Jul 22, 2018

Member

My personal style is to have a blank line before a control flow statement because I find it helps legibility. I'm just offering it out there, our style guide isn't prescriptive on this one way or another.

public static void GetInt32_FullRange()
{
int result = RandomNumberGenerator.GetInt32(int.MinValue, int.MaxValue);
Assert.NotEqual(0, result);

This comment has been minimized.

@bartonjs

bartonjs Jul 22, 2018

Member

There's a 1 in 4 billion chance this fails. If you want to assert something you should just assert it wasn't int.MaxValue

observedNumbers[number]++;
}
}
const double tollerance = 0.07;

This comment has been minimized.

@bartonjs

bartonjs Jul 22, 2018

Member

typo: tolerance

foreach ((_, int occurences) in observedNumbers)
{
double percentage = occurences / (double)numbers.Length;
Assert.True(Math.Abs(expected - percentage) < tollerance, "Occured number of times within threshold.");

This comment has been minimized.

@bartonjs

bartonjs Jul 22, 2018

Member

Typo: Occurred

double percentage = occurences / (double)numbers.Length;
Assert.True(Math.Abs(expected - percentage) < tollerance, "Occured number of times within threshold.");
}
}
}

This comment has been minimized.

@bartonjs

bartonjs Jul 22, 2018

Member

It would be good to see range validity test for negative bounds, and some tests with interesting-looking bounds sizes (really I want to test both the lzcnt implementation and the algorithm. Something annoying like (x, x + 0x0101) so the high bit in the mask value is technically required, but rarely valid. If you ran that 1000 times you "should" have at least one hit on the highest value, and shouldn't have value skew.)

(int.MinValue, int.MinValue + 3)
(-257, -129)
(-100, 5)
(254, 512)
...

Also, roll 1000d6, check for skew. (The existing "coinflip" test sort of does this, but it's bit-aligned in range, so it doesn't test the high value rejection vs mod bias algorithm differential)

// undue pressure to the underlying random number generator.
int bytesRequired = (32 + 7 - leadingZeroCount) / 8;
Span<uint> valueResult = stackalloc uint[1];

This comment has been minimized.

@jkotas

jkotas Jul 22, 2018

Member

We disable zero-initialization for span stackalloc in corefx in number of files (not for this assembly currently). It would be nice to code this without depending on stackalloc zero initialization.

@vcsjones

This comment has been minimized.

Collaborator

vcsjones commented Jul 22, 2018

I'm not fully sure what the failure in d243963 was. It looks like infrastructure? The most information I could glean was a workItemStatus had a failure count of 1, but the test failure count was 0.

Update: "mission control" says it was System.IO.Packaging.Tests.

@vcsjones

This comment has been minimized.

Collaborator

vcsjones commented Jul 22, 2018

@bartonjs @jkotas thank you for the (quick!) code review. I think I have addressed all feedback.

// We only want to generate as many bytes as required to satisfy the mask to not apply
// undue pressure to the underlying random number generator.
int bytesRequired = (32 + 7 - leadingZeroCount) / 8;

This comment has been minimized.

@danmosemsft

danmosemsft Jul 22, 2018

Member

Maybe comment where the 7 comes from?

mask = (1U << 32 - leadingZeroCount) - 1;
}
// We only want to generate as many bytes as required to satisfy the mask to not apply

This comment has been minimized.

@GrabYourPitchforks

GrabYourPitchforks Jul 22, 2018

Member

Is it really that bad to drain 4 bytes at a time from the CSPRNG? Seems like it would also simplify the logic below considerably.

This comment has been minimized.

@GrabYourPitchforks

GrabYourPitchforks Jul 22, 2018

Member

You can also get away with not relying on the lzcnt instruction, since if you're draining 4 bytes at a time you don't really care about the leading zero count. All you really care about is smearing the most significant set bit rightward to generate a mask:

// assume 'value' is uint
uint mask = value;
mask |= mask >> 1;
mask |= mask >> 2;
mask |= mask >> 4;
mask |= mask >> 8;
mask |= mask >> 16;

This comment has been minimized.

@GrabYourPitchforks

GrabYourPitchforks Jul 22, 2018

Member

In particular, lines 119 - 149 become much simpler, collapsing to:

uint mask = maskedRange;
mask |= mask >> 1;
mask |= mask >> 2;
mask |= mask >> 4;
mask |= mask >> 8;
mask |= mask >> 16;

Span<uint> valueResult = stackalloc uint[1]; // no need to zero-init since we're filling the whole buffer
uint result;
do {
    RandomNumberGeneratorImplementation.FillSpan(MemoryMarshal.AsBytes(valueResult));
    result = valueResult[0] & mask;   
} while (result > maskedRange)

And you get rid of the LeadingZeroCount function entirely.

This comment has been minimized.

@vcsjones

vcsjones Jul 22, 2018

Collaborator

Is it really that bad to drain 4 bytes at a time from the CSPRNG? Seems like it would also simplify the logic below considerably.

I don't really have a strong opinion - agree the code is simpler, but my overall feeling is that while it adds a lot of lines of code, it doesn't add that much additional complexity.

@bartonjs ?

This comment has been minimized.

@vcsjones

vcsjones Jul 22, 2018

Collaborator

For context, I think the majority of random numbers are for indexing into collections to pick a random "thing", and most collections are fairly small, less than 256, so we'd be asking the CSPRNG for something that we throw away 75% of the time. I know the Windows RNG is pretty capable of dealing with that, but I'm not sure about other platforms like IoT where it may have a larger impact.

This comment has been minimized.

@bartonjs

bartonjs Jul 23, 2018

Member

@GrabYourPitchforks You've worked with much more constrained environments than I have. If you don't think that burning that much data from the CSPRNG is bad then I'll endorse the "simple until we need to be more complicated" plan.

All of our "with a keyboard" systems are probably measured in MB/s, but I don't have a sense of the limit of MD5 calls per second on a Raspberry Pi. (Or whatever PRF is used these days)

This comment has been minimized.

@GrabYourPitchforks

GrabYourPitchforks Jul 24, 2018

Member

I'm not aware of any device where it's significantly more expensive to drain 4 bytes from the RNG than it is to drain fewer than 4 bytes. Maybe we'll have such a platform in the future. For now IMO our best bet is to keep the code as simple as possible, and we can optimize in a future release if a new requirement comes online.

This comment has been minimized.

@vcsjones

vcsjones Jul 24, 2018

Collaborator

Okie doke, simple it is.

throw new ArgumentException(SR.Argument_InvalidRandomRange);
// The total possible range is [0, 4,294,967,295).
long adjustedRange = (long)toExclusive - fromInclusive;

This comment has been minimized.

@GrabYourPitchforks

GrabYourPitchforks Jul 22, 2018

Member

This result fits into a uint, no need to make it a long.

This comment has been minimized.

@vcsjones

vcsjones Jul 22, 2018

Collaborator

@GrabYourPitchforks Well, since ranges themselves can be negative it made sense to use long to allow subtracting a negative from a negative to just do the right thing. Are you suggesting

uint adjustedRange = (uint)((long)toExclusive - fromInclusive);

Or do you want to avoid the long entirely?

This comment has been minimized.

@vcsjones

vcsjones Jul 22, 2018

Collaborator

Oh I think you are proposing

uint adjustedRange = (uint)toExclusive - (uint)fromInclusive;

I forget that two's complement is magic.

}
while (result > maskedRange);
return (int)(result + fromInclusive);

This comment has been minimized.

@GrabYourPitchforks

GrabYourPitchforks Jul 22, 2018

Member

This will spill to a long then back to an int. Recommend instead return (int)result + fromInclusive.

@vcsjones

This comment has been minimized.

Collaborator

vcsjones commented Jul 23, 2018

Test failures look unrelated.

@vcsjones

This comment has been minimized.

Collaborator

vcsjones commented Jul 25, 2018

@GrabYourPitchforks @bartonjs OK. Changed, rebased, squashed.

@GrabYourPitchforks GrabYourPitchforks merged commit 928873f into dotnet:master Jul 25, 2018

14 checks passed

CROSS Check Build finished.
Details
Linux arm Release Build Build finished.
Details
Linux arm64 Release Build Build finished.
Details
Linux x64 Release Build Build finished.
Details
Linux-musl x64 Debug Build Build finished.
Details
NETFX x86 Release Build Build finished.
Details
OSX x64 Debug Build Build finished.
Details
Packaging All Configurations x64 Debug Build Build finished.
Details
UWP CoreCLR x64 Debug Build Build finished.
Details
UWP NETNative x86 Release Build Build finished.
Details
WIP ready for review
Details
Windows x64 Debug Build Build finished.
Details
Windows x86 Release Build Build finished.
Details
license/cla All CLA requirements met.
Details

@vcsjones vcsjones deleted the vcsjones:30873-rng-integers branch Jul 25, 2018

@karelz karelz added this to the 3.0 milestone Aug 21, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment