New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API Proposal: CSPRNG integers with ranges #30873

Closed
vcsjones opened this Issue Jul 6, 2018 · 10 comments

Comments

Projects
None yet
5 participants
@vcsjones
Collaborator

vcsjones commented Jul 6, 2018

Currently, there are two main ways of getting random data. The insecure System.Random, and System.Security.Cryptography.RandomNumberGenerator and its derivatives.

The Problem

A common issue I find when doing reviews is that developers are unsure of the best way to get random integers in a secure manner, specifically when they need to clamp the integer to a specific range. They will usually take a few incorrect approaches.

  1. Use System.Random because it offers a Next(int, int) option that has exactly what they want, at the cost of insecurity. They may seed System.Random with output from a CSPRNG, however this is also problematic for a variety of reasons.

  2. Use RandomNumberGenerator incorrectly. They may use GetBytes to fill an integer and use modulus to clamp it to a specific range. Modulus usually introduces selection bias. I've seen all manners of trying to use S.S.C.RNG to coerce its values into integers, some better than others, but many of which are flawed.

Developers usually want to do this for a variety of reasons. A common reason I see is to build a random string with a specific character set. Something like:

static string GetRandomString(string characterSet, int length) {
  var builder = new StringBuilder();
  for (var i = 0; i < length; i++) {
    var randInt = GetSomeRandomInteger(fromInclusive: 0, toExclusive: characterSet.Length);
    builder.Append(characterSet[randInt]);
  }
  return builder.ToString();
}

This can be done for reset codes, confirmation codes, temporary passwords, etc.

Proposal

We should offer an API that allows developers to get random integers, securely, without having to think too hard about it.

Add the following APIs:

namespace System.Security.Cryptography {
  public abstract class RandomNumberGenerator {

    /// <summary>
    /// Provides a random integer within a specific range.
    /// </summary>
    /// <param name="fromInclusive">The inclusive lower-bound of the range.</param>
    /// <param name="toExclusive">The exclusive upper-bound of the range.</param>
    public static int GetInteger(int fromInclusive, int toExclusive) => throw null;

    /// <summary>
    /// Provides a random integer from zero to an upper-bound.
    /// </summary>
    /// <param name="toExclusive">The exclusive upper-bound of the range.</param>
    public static int GetInteger(int toExclusive) => throw null;
  }
}

S.S.C.RNG will implement these and rely on RandomNumberGeneratorImplementation's GetBytes implementation.

Other thoughts

Is it worth making this cancellable? It's very likely that the implementation will have handle the case of making multiple attempts to generate a number that falls within the correct range. Those implementations often distribute binomially, so it's unlikely to be a source of a DoS, but possible. If this is a concern, perhaps an overload returning ValueTask<int> and accepting a CancellationToken are desirable.

Is there prior art somewhere in .NET that we can make public, or move in to CoreFX? (Perhaps aspnet has already solved this.)

/cc @bartonjs @GrabYourPitchforks

@GrabYourPitchforks

This comment has been minimized.

Member

GrabYourPitchforks commented Jul 6, 2018

I don't think we need to make this cancelable. Assuming our implementation generates a random 64-bit value internally, and assuming that value is evenly drawn from the domain of all 64-bit integers, the odds of us needing to try again for bias elimination are around 1 in 8 billion.

@bartonjs

This comment has been minimized.

Member

bartonjs commented Jul 6, 2018

Something like

if (toExclusive <= fromInclusive)
    throw something();

int range = toExclusive - fromInclusive - 1;

if (range == 0)
    return fromInclusive;

int lzcnt = Intrinsics.SSE4.lzcnt(range);
int mask = (1 << 32-lzcnt) - 1;

int val = int.MaxValue;
Span<int> valSpan = stackalloc int[1];

while (val > range)
{
    GetBytes(valSpan.AsBytes());
    val = valSpan[0] & mask;
}

return val + fromInclusive;

?

Spinning with the lzcnt/mask approach never happens if the total range was a power of 2 (range == mask), and at approaching 50% (from below) probability in the case where the total range is a power of 2 plus one. 4 spins would have a ~6.25% chance, and only consume 32 bytes from the CSPRNG.

If someone wanted to hook it up to a blocking TRNG I guess maybe they'd have a concern; but any CSPRNG should be able to handle that at no worse than "F10 fast" (aka if you're in the debugger you never see this call lag).

@GrabYourPitchforks

This comment has been minimized.

Member

GrabYourPitchforks commented Jul 6, 2018

Clearly we also need a method void Shuffle<T>(IList<T> buffer). :)

@vcsjones

This comment has been minimized.

Collaborator

vcsjones commented Jul 6, 2018

Clearly we also need a method void Shuffle

I had considered adding a proposal for different shuffling algorithms, however decided that I may do those later as a separate API proposal to keep this one succinct and have a higher chance of the proposal moving forward.

The broader issue I see is confusion about when to use random and when to use a shuffle. "I want to randomly generate a string with no repeating characters" would make good use for a proper shuffle.

If someone wanted to hook it up to a blocking TRNG

Yes. This API proposal is dependent on GetBytes and its implementation. This was the rationale behind making the members virtual. If a user implemented this against a very slow RNG, they could re-implement the GetInteger functions, perhaps even just throwing a NotSupportedException because the implementation can't be used for this API, for whatever reason.

A case I might see is a developer that derives RandomNumberGenerator to create a mock one that returns predictable values for unit testing. If they were to call GetInteger on their derived type (say it gets swapped out via IoC in unit tests), it might spin forever.

Speaking to some offline comments I got asked, "why is the upper bound exclusive", that's to match existing behaviors of many other APIs and frameworks. System.Random uses an exclusive upper-bound, and it would be nice if developers that want to switch to this could do so fairly easily.

@bartonjs

This comment has been minimized.

Member

bartonjs commented Jul 6, 2018

I see the value, and it might be the right shape for what it is. There are two drawbacks I see:

  • We just added Fill(Span<byte>) to remove most of the places where people needed an instance of a CSPRNG (modulo places they want to swap one out), this goes against that grain.
  • It wastes a lot of random bits.

Yahtzee's 5d6 can be accomplished in 15 bits, 3 per die (with a 1/4 chance of a "cocked die" in each set). 5 * 3 + 5 * 3 / 4 + 5 * 3 / 16 = 15 + 3 + 0 = 18 bits EV. 5 calls to GetInteger(6) will burn 5 * 32 + 5 * 32/4 + 5 * 32/16 + 5 * 32/64 + 5 * 32/256 = 160 + 40 + 10 + 2 + 0 = 212 bits EV. Essentially, that threw 200 bits in the garbage (1200% overhead). It shouldn't really be a burden on a CSPRNG, but it's still less than ideal.

Certainly the code could be modified to ask only the "relevant bytes" to be filled, which would drop Yahtzee to 5 * 8 + 5 * 8 / 4 + 5 * 8 / 16 + 5 * 8 / 64 = 40 + 10 + 2 + 0 = 52 bits EV., down to only 200% overhead.

A "bit bank" would also reduce it, but that would allow a segment of memory to spy on to know what the next random value of something would be, and that feels wrong.

@vcsjones

This comment has been minimized.

Collaborator

vcsjones commented Jul 6, 2018

We just added Fill(Span<byte>) to remove most of the places where people needed an instance of a CSPRNG (modulo places they want to swap one out), this goes against that grain.

I see. Would we want additional static members that defer to RandomNumberGeneratorImplementation?

namespace System.Security.Cryptography {
  public abstract class RandomNumberGenerator {
    public static int GetInteger(int fromInclusive, int toExclusive) {
      RandomNumberGeneratorImplementation.GetInteger(fromInclusive, toExclusive);
    }
  }
}

Or should we not bother with the instance members at all now and rely on RandomNumberGeneratorImplementation's GetBytes?

It wastes a lot of random bits.

Yes. As you mentioned, it can be improved to only request the necessary bytes instead of filling a whole integer. The trade off for me is that some developers will continue to do the wrong thing unless something is in the box.

@bartonjs

This comment has been minimized.

Member

bartonjs commented Jul 6, 2018

The problem is just that you can't have a static and an instance member with the same name. I don't know that it's "wrong" to add this as an instance member, but it means that the caller is back to managing an object lifetime. So it's just a thing to think about.

@vcsjones

This comment has been minimized.

Collaborator

vcsjones commented Jul 6, 2018

I updated the proposal to remove the instance methods and instead propose static methods.

  1. This aligns with the desire to not require managing an instance of S.S.C.RNG.
  2. Alleviates the concern where a developer has subclassed RandomNumberGenerator and changed GetBytes to work in a way that will break the implementation of GetInteger.
  3. I don't see tremendous value in allowing passing different instances around to swap out behavior. If developers really want that behavior, they can build their own instance-based wrapper around the static methods.
@terrajobst

This comment has been minimized.

Member

terrajobst commented Jul 17, 2018

We discussed if we should add a float or double version because that would be consistent with Random but that's because the underlying primitive there is a double while RandomNumberGenerator's primitive is a random bits. We could add floating point version if someone ever cares.

We considered a parameterless overload GetInt32() as this would allow int.MaxValue to be part of the range but in practice there really isn't a scenario for it (and it can be down allocation free with the GetBytes() method if ever needed).

We also considered GetInt64() versions but given that these values are mostly used for indexing this seems equally over the top.

namespace System.Security.Cryptography
{
  public abstract class RandomNumberGenerator
  {
    public static int GetInt32(int fromInclusive, int toExclusive);
    public static int GetInt32(int toExclusive);
  }
}
@vcsjones

This comment has been minimized.

Collaborator

vcsjones commented Jul 17, 2018

I'm happy to implement this if someone can assign it to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment