Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std.uuid.randomUUID() is insecure and has been deprecated #7618

Closed
wants to merge 1 commit into from

Conversation

jpf91
Copy link
Contributor

@jpf91 jpf91 commented Sep 6, 2020

The randomUUID() overload not taking any arguments has been deprecated. It is dangerous, as it causes predictable, non-secure UUIDs and has not been documented as doing that previously. Silently changing this function to use a secure system RNG is not possible, as these RNGs may block at early boot when they are not yet seeded, causing randomUUID to block as well: https://wiki.debian.org/BoottimeEntropyStarvation

See https://forum.dlang.org/thread/zjmyxudvscfwrkrdabmt@forum.dlang.org for a detailed discussion of the problem.

Depending on whether the desired output needs to be cryptographically secure and considering the potential blocking problem, users should use rndGen (insecure) from std.random or a secure alternative (currently not available in phobos) like this:

auto insecureUUID = rndGen.randomUUID();

Affected phobos modules

All callers in phobos are in unittests and don't necessarily need secure, unpredictable UUIDs, they just need unique IDs to generate file names. The affected unittests have been updated to explicitly use the insecure rndGen.

Followup PRs

#7619 adds an interface to the OS RNG, which allows to generate secure UUIDs with std.uuid.

@dlang-bot
Copy link
Contributor

Thanks for your pull request and interest in making D better, @jpf91! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please verify that your PR follows this checklist:

  • My PR is fully covered with tests (you can see the coverage diff by visiting the details link of the codecov check)
  • My PR is as minimal as possible (smaller, focused PRs are easier to review than big ones)
  • I have provided a detailed rationale explaining my changes
  • New or modified functions have Ddoc comments (with Params: and Returns:)

Please see CONTRIBUTING.md for more information.


If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub run digger -- build "master + phobos#7618"

@jpf91
Copy link
Contributor Author

jpf91 commented Sep 6, 2020

Dscanner complains that local imports should only import individual symbols. But import std.uuid : randomUUID imports the whole overload set and triggers a deprecation warning on the 0-argument overload, even if it's never used. What's the preferred way to solve this?

EDIT: I just moved the import out of the unittest blocks into a version(unittest).

std/uuid.d Outdated Show resolved Hide resolved
The `randomUUID()` overload not taking any arguments has been
deprecated. It is dangerous, as it causes predictable, non-secure UUIDs
and has not been documented as doing that previously. Silently changing
this function to use a secure system RNG is not possible, as these RNGs
may block at early boot when they are not yet seeded, causing
`randomUUID` to block as well. See
https://forum.dlang.org/thread/zjmyxudvscfwrkrdabmt@forum.dlang.org for
a detailed discussion of the problem.

Depending on whether the desired output needs to be cryptographically
secure and considering the potential blocking problem, users should use
`rndGen` (insecure) from std.random or a secure alternative like this:
----------------------------
auto insecureUUID = rndGen.randomUUID();
----------------------------

All callers in phobos are in unittests and don't necessarily need
secure, unpredictable UUIDs, they just need unique IDs to generate file
names. The affected unittests have been updated to explicitly use the
insecure rndGen.
@CyberShadow
Copy link
Member

It may help to clarify why using a PRNG to generate UUIDs is "insecure". Are random UUIDs somehow generally defined as having been generated by a cryptographically secure RNG? (Otherwise, the same would apply to everything in Phobos using std.random.)

@jpf91
Copy link
Contributor Author

jpf91 commented Sep 6, 2020

The forum discussion linked in the description is based on this article: https://breakpoint.purrfect.fr/article/cracking_phobos_uuid.html

The author argues although the RFC does not guarantee that, users expect random UUIDs to be secure, as they use those as session IDs, etc. And the module did not previously document that the generated IDs are not secure in the context.

However, if you question that, you should rather discuss that with the author of that article / on the newsgroup than with me ;-) I don't have a strong opinion on the issue. I think this fact should have been documented, but phobos doesn't do any cryptography and we never stated that the function is cryptographically safe. Nevertheless, it may be nicer to force the user to explicitly decide whether he wants secure behaviour, rather than defaulting to insecure behaviour.

I don't have a strong opinion on that, but in the newsgroup there seemed to be an agreement that this is an issue which should be addressed, so I prepared this PR.

@CyberShadow
Copy link
Member

CyberShadow commented Sep 6, 2020

The forum discussion linked in the description is based on this article: https://breakpoint.purrfect.fr/article/cracking_phobos_uuid.html

I read the article. It seems a bit scaremongery-ish?

The change is good, but personally I would completely avoid the words "secure" and "insecure" altogether. We can use "unpredictable" to describe the OS RNG.

BTW, I have raised a similar concern many years ago for the Phobos's use of rndGen, which can be seeded with a constant value (incl. by some library deep in the import hierarchy), for generating random file names. The random filenames can also expose the state of rndGen and allow predicting future output. I was promptly put on a stake and dismissed.

@pbackus
Copy link
Contributor

pbackus commented Sep 6, 2020

The change is good, but personally I would completely avoid the words "secure" and "insecure" altogether. We can use "unpredictable" to describe the OS RNG.

It's probably better to use the precise technical term, "[not] cryptographically secure."

@schveiguy
Copy link
Member

I disagree with this change completely. This change:

  1. Breaks code that might already be OK (not used in generating security tokens)
  2. Recommends using the insecure random number generator via documentation and deprecation messages.

Someone who is ignorant of the security issues here, following the deprecation message, and substituting all uses of randomUUID with rndGen.randomUUID is still going to have the same problem. And Phobos will still be the one giving them that recommendation. In other words, this doesn't even shift the blame to the user!

I instead recommend simply switching to a secure RNG under the hood, which will not break any code, and fix the underlying insecurity issue. If they use randomUUID without caring which RNG is used, then we should use the one that is appropriate for the most situations. If they have an issue with that, there is the version that accepts an RNG to use.

The counter argument to this was here: https://forum.dlang.org/post/rj25m8$of8$1@digitalmars.com

In short, using a secure number generator may pause an application while entropy is gathered, whereas using the insecure one will not. However, this does not actually break anything. The code compiles as normal, the data that comes back is usable, it's just that it's possible (but very unlikely) that the application will not come out of pause (but this is only for system startup applications). Or it's possible the pause is unacceptable for the specific usage.

I don't think we should break all code that uses randomUUID for the sake of theoretical applications. I'm also OK with telling people who complain about pauses to go ahead and use the insecure version if they want to build with the latest Phobos. This is not the same as "silent breakage", where a programming error results. This is 100% an implementation detail that we can change.

I will note that vibe.d uses a secure RNG to generate UUIDs, and I've never seen any slowdown based on that.

@WebFreak001
Copy link
Member

I partly agree and partly disagree with @schveiguy here:

  1. Breaks code that might already be OK (not used in generating security tokens)

This does not break any code, in fact this changes nothing at all and is thus even less breakage than changing the default random generator. (which is already pretty much not breaking) People can just choose to ignore this notice, it's not like the simple randomUUID is going away for legacy applications.

  1. Recommends using the insecure random number generator via documentation and deprecation messages.

I believe this is a good way to shift blame to the user. The documentation clearly shows in the examples that this is insecure (in the example code comments and in documentation) so users will definitely know if they read the documentation. If they just discover this from auto complete, they will be greeted with the notice to explicitly specify the RNG they want to use. (some insecure or possible future secure one) In case a user uses an insecure algorithm where they need secure code, it's on them.

However I do agree that this is an implementation detail and the default should be changed to a more reasonable secure default, especially because there are very few cases where you need this function to be the fastest possible. (I can only really think of bulk UUID generation here)

Even in the case phobos gets a secure RNG and the default here is changed, this notice should stay in and then be flipped to tell people to use things like secureRNG.randomUUID() or rngGen.randomUUID() depending on the use-case they need.

@wilzbach
Copy link
Member

wilzbach commented Sep 6, 2020

We should not merge this until there's actually a secure rndGen in Phobos. Otherwise the user can't do anything about the message.

@schveiguy
Copy link
Member

schveiguy commented Sep 6, 2020

This does not break any code, in fact this changes nothing at all and is thus even less breakage than changing the default random generator.

This is deprecating a function. That is a breaking change for users of that function (by default you cannot build with deprecated functions).

Edit: I confused deprecations with warnings.

But what is the goal here? Is it to remove the no-arg version? If so, then it will break code.

It would actually be interesting if the end result is instead of removing the deprecated function, we immediately replace it with a secure one. Is that a possibility?

@schveiguy
Copy link
Member

I believe this is a good way to shift blame to the user. The documentation clearly shows in the examples that this is insecure (in the example code comments and in documentation) so users will definitely know if they read the documentation.

This entire endeavor is based on people NOT reading the documentation, and NOT realizing the issue. You think they will change now? They will read the compiler saying "blah blah use rndGen.randomUUID instead blah blah", and do that. I don't think we have a path from there to nudge them in the right direction (I suppose you could pragma(msg) a warning if it detects an unsecure RNG type as a parameter?)

I'll put it another way -- we are better off adding a secure RNG, updating the documentation to ONLY ever use that RNG and making NO other code changes than making this change. The onus is ALREADY on the interested user to use the right RNG, and it's possible. It's just that the default is dangerous and normally shouldn't be used.

@CyberShadow
Copy link
Member

As far as breaking changes go, this seems to be on the better side of the spectrum:

  • Fixing compilation requires a simple search & replace to preserve the old functionality.
  • The fixed code will work in old D versions, even much older ones (not just n-1 as we usually try to ensure in such cases).
  • The fixed code is beneficially clearer in a few problematic areas (not just "security" but also implicitly shared global state).

So, I'm not against the deprecation in this case, but Steven's proposal also sounds good.

One more argument to be made against this change is that we already have a lot of precedents for functions such as randomUUID: all functions in std.random already optionally take a RNG and default to rndGen, and randomUUID follows this established pattern. So, this raises the question why randomUUID should get this special treatment, and not everything else.

Another direction we could go is to replace MT19937 with a CSPRNG, and then seed it from an unpredictable RNG (i.e. the OS/hardware RNG) at runtime initialization. That should avoid the blocking/IO for each RNG call.

@schveiguy
Copy link
Member

schveiguy commented Sep 6, 2020

  • The fixed code will work in old D versions

It depends on what you mean by "work". It will still have the same issues that the old code has (insecure UUIDs generated).

This is an interesting deprecation in that we aren't actually trying to fix bugs or even remove the behavior at all. It's just that we want people to pay attention to what they are doing, and make sure it fits the use case they have. And the mechanism we are using is pretty much a bikeshed recoloring. It's like "hey, we repainted the bikeshed, maybe now you'll notice the hole in it and fix that."

I'm not sure this change accomplishes more than making people rewrite their already flawed code into another identically flawed version. It would be useful to explore further the expected behavior changes and consider whether we think this PR will effect those changes.

Note that std.random has a huge disclaimer on the top warning about how these RNGs are not cryptographically secure, but randomUUID does not. An uncontroversial change might simply be to copy that disclaimer to the randomUUID function.

why randomUUID should get this special treatment, and not everything else.

An excellent question. I think the simple answer is, UUIDs are often used as generated tokens for authentication. If there are other functions which are commonly used to implement security features such as authentication or cryptography, then we should make sure those use a better RNG as well by default.

I see no reason to require usage of a possibly much slower crypto-appropriate RNG for something like std.random.dice. I'll also note again that the disclaimer at the top puts std.random in a much less culpable position than it is for std.uuid.

@jpf91
Copy link
Contributor Author

jpf91 commented Sep 6, 2020

Note that std.random has a huge disclaimer on the top warning about how these RNGs are not cryptographically secure, but randomUUID does not. An uncontroversial change might simply be to copy that disclaimer to the randomUUID function.

I'm fine with that, but there was the argument made on the newsgroup that this does not notify maintainers of currently broken code.

Another direction we could go is to replace MT19937 with a CSPRNG, and then seed it from an unpredictable RNG (i.e. the OS/hardware RNG) at runtime initialization. That should avoid the blocking/IO for each RNG call.

It might be nice to have such csprng in phobos, but it might be nontrivial to get right. I guess chacha20 + fast key erasure would be the minimum things to consider: https://crypto.stackexchange.com/questions/67700/what-are-implementations-of-prngs-based-on-chacha20 . On some OS, this is actually the variant used in arc4random.

libsodium has such an implementation, but it's ISC licensed, so probably not compatible with phobos.

@pbackus
Copy link
Contributor

pbackus commented Sep 6, 2020

Note that std.random has a huge disclaimer on the top warning about how these RNGs are not cryptographically secure, but randomUUID does not. An uncontroversial change might simply be to copy that disclaimer to the randomUUID function.

I'm fine with that, but there was the argument made on the newsgroup that this does not notify maintainers of currently broken code.

Changing randomUUID to use a CSPRNG by default does better than notify maintainers: it actually fixes currently-broken code, without any effort required on the maintainers' part.

The only reason there is any need to notify maintainers in the first place is because this solution has (thus far) been rejected.

@jpf91
Copy link
Contributor Author

jpf91 commented Sep 6, 2020

The only reason there is any need to notify maintainers in the first place is because this solution has (thus far) been rejected.

Well, I don't mind. We can change randomUUID to default to a system RNG, if everyone agrees to that. However, we have to get #7619 merged first then.

@RazvanN7
Copy link
Collaborator

Since #7619 was closed and this was depending on it, I will also close this one.

@RazvanN7 RazvanN7 closed this Jan 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
8 participants