[WIP] Easy User-land CSPRNG #1119

Closed
wants to merge 23 commits into
from

@SammyK
Contributor
SammyK commented Feb 24, 2015

Submitting as a PR per @nikic's recommendation.

RFC can be found here.

@lt

I wonder about the usefulness of this function. This can be easily achieved with random_bytes and bin2hex, so I feel like this is unnecessary duplication of functionality for the sake of one function call.
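For reference, here is a minimal C sketch of what that combination amounts to once the random bytes are in hand: a plain hex encoding of the buffer. The helper name `hex_encode` is made up for illustration and is not part of php-src.

```c
#include <stddef.h>

/* Hypothetical helper: write the lowercase hex form of `in` into `out`.
 * `out` must have room for 2 * len + 1 chars (including the NUL). */
void hex_encode(const unsigned char *in, size_t len, char *out)
{
    static const char digits[] = "0123456789abcdef";
    size_t i;

    for (i = 0; i < len; i++) {
        out[2 * i]     = digits[in[i] >> 4];   /* high nibble */
        out[2 * i + 1] = digits[in[i] & 0x0f]; /* low nibble */
    }
    out[2 * len] = '\0';
}
```

Every input byte becomes exactly two output characters, so the encoding is reversible and carries the same information as the raw buffer.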

Owner

Yeah, I wouldn't put up too much of a fight on this one, but 100% of the times I've ever needed random bytes it has always been in the form of hex or int. I'm thinking most user-land kids are in the same boat. :)

I noticed that the hex table for bin2hex() is just a char of 16 as opposed to the larger base64_encode() or session hex tables. Would it be considered more random with a bigger hex table?

Owner

So I've been doing a bunch of research on binary to hex conversions and learned a lot. Since hex (base 16) is just a different representation of the same data in binary (base 2), it wouldn't necessarily add more randomness to convert the data to Base32 or Base64. But at the end of the day, the common user-land PHP developer won't know about these things in too much detail (as I didn't before tonight!). What the user-land dev is thinking is, "I need a random alpha-numeric string", so hex should suffice.

I've also been researching how other languages like Ruby, Go and Python do random. They all look for /dev/urandom, or CryptGenRandom on Windows.

BTW: I've seen most implementations of a unique ID in other languages make use of the CSPRNG like in Python. I think we should do the same with uniqid(). :)

lt replied Feb 17, 2015

Good job doing the research (I feel almost proud! :)).

You're absolutely right, hex or base64 doesn't add anything. In fact it takes away entropy, and we're not talking about small amounts here.

So let's take hex for example. Conversion to hex literally takes an 8 bit representation of data, and turns it into a 4 bit representation of data (used as an index in a character table). So how much entropy do we really lose? Maths time (I'm thinking free-form here, and I'm not a mathematician, but hopefully this is along the right lines!)

The entropy in a base-encoded string compared to the raw binary is: $1 / 2^{(8 - \text{bits\_per\_byte}) \times \text{nbytes}}$. Just to test, plugging in the full 8 bits per byte we get $1 / 2^{0}$ which is 1/1, full entropy.

So for example a 16 byte (128 bit) piece of keying material, keeping the input as hex rather than full binary, the key is $1 / 2^{4 \times 16}$ as strong. $1 / 2^{64}$.

$2^{64}$ is a reasonably familiar number, being the number of combinations in a 64-bit unsigned integer. I can't remember it in full, but I can remember it's roughly 18.5 quintillion. A hex encoded string, for 16 bytes, is 18.5 quintillion times weaker than the full binary string. Incidentally, that is the same number of key combinations there are ($2^{128} / 2^{64} = 2^{64}$).

It's still going to take a long time to crack, but I have no idea how this affects attacks on ciphers that break X of so many rounds, if you can guarantee specific weaknesses in the key.
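To make the arithmetic concrete, here is the same comparison written out, assuming (as the comment does) that the key stays 16 characters long and each character is restricted to one of the 16 hex digits:

```latex
\begin{align*}
\text{raw 16-byte key:} \quad & 256^{16} = 2^{8 \cdot 16} = 2^{128} \text{ possible keys} \\
\text{16 hex characters:} \quad & 16^{16} = 2^{4 \cdot 16} = 2^{64} \text{ possible keys} \\
\text{ratio:} \quad & 2^{128} / 2^{64} = 2^{64} \approx 1.8 \times 10^{19}
\end{align*}
```

The loss comes from using the hex text itself as fixed-length key material. Hex-encoding all 16 random bytes yields 32 characters, which still represent one of $2^{128}$ values.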

lt replied Feb 17, 2015

I think migrating other parts of PHP to use a CSPRNG should be a separate PR.

uniqid() and session_create_id() are obvious candidates.

Owner

Oh man! Math! I've got more research to do. :)

Yes, I agree with you on creating separate PRs for uniqid() and session_create_id(). Hopefully those will be easy to knock out after this one.

I should have a little bit more time on Thursday to convert some of this research into commits on this bad boy. :)

@lt

Edit: - Hah, I see you already did this in your second commit :)

With your includes at the top:

#if PHP_WIN32
#include "win32/winutil.h"
#endif

Then use php_win32_get_random_bytes(unsigned char *buf, size_t len)
This already uses the CryptGenRandom windows API call, so we're all good here.

@lt

This should go back to being zend_string and use bytes->val as your char *.
Use zend_string_alloc to create it and zend_string_release if you hit an error condition. On success, don't release it; use RETURN_STR instead. This saves having to copy the string at the end of the function with RETVAL_STRINGL.

@lt

We need to be careful about uniformity in this function. If you use modulus to bring the number down to your upper bound and the divisor is not a power of 2, you can end up with some bias in the result which we obviously want to avoid with a CSPRNG.

I'll help with the implementation of fixing this.
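A small sketch of the bias being described, using a single random byte (256 equally likely values) as the source. The function name `count_residues` is hypothetical; it simply enumerates every byte value and counts which result `%` maps it to.

```c
#include <stddef.h>

/* Count how often each residue 0..modulus-1 occurs when every possible
 * byte value 0..255 is reduced with %. `counts` must hold `modulus` entries. */
void count_residues(unsigned modulus, unsigned *counts)
{
    unsigned i;
    unsigned v;

    for (i = 0; i < modulus; i++) {
        counts[i] = 0;
    }
    for (v = 0; v < 256; v++) {
        counts[v % modulus]++;
    }
}
```

With a modulus of 100, the results 0 to 55 are each produced by three byte values and 56 to 99 by only two, because 256 = 2 * 100 + 56. With a power of two such as 64, every result is produced by exactly four byte values.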

Owner

Should we remove the min option and just have max? None of the PRNGs in other languages I've reviewed have a min. Also, in Ruby the default functionality returns a float. Don't know if we want to explore that path as well.

lt replied Feb 20, 2015

Remove min 👍
Return float 👎

Owner

Sounds good!

@lt

There are a few more options to investigate here. /dev/urandom is going to be "ok", but I also think it should be the last option. Where possible we should try and avoid opening file descriptors.

Off the top of my head, things to investigate (I can do all of this if you like):

  • What would be the security impact of having a user space arc4random implementation for Linux?
  • Where arc4random exists, and libc is shared amongst all processes, is the arc4random state shared or per process?
  • Attempt to use /dev/arandom if it exists and for some odd reason we don't have arc4random
  • Linux getrandom syscall (for kernel versions >= 3.17)
@lt

To get a random number to work on, call php_random_bytes(&number, sizeof(number)); that way, whatever source of randomness is picked as the best for the platform is used.

@lt

@rican7 - Documentation issue in the RFC. The value in the implementation here is correct. I've let Sammy know.

Owner

Fixed in the RFC.

👍

@lt

@SammyK I derped in the error message, minimum definitely should not be greater than the maximum :)

lt replied Feb 24, 2015

Perhaps we should make it so both min and max need to be specified to be valid? Doesn't really feel sensible to allow the min without a max. random_int(0) => 123456 // wat!

if (ZEND_NUM_ARGS() == 1) {
    php_error_docref(NULL, E_WARNING, "Error message of your choosing");
    RETURN_FALSE;
}
Owner

Lol. Nice catch. :) I'll fix.

@lt

This needs min putting back in.

@nikic nikic and 1 other commented on an outdated diff Feb 24, 2015
ext/standard/random.c
+ zend_string *bytes;
+
+ if (zend_parse_parameters(ZEND_NUM_ARGS(), "l", &size) == FAILURE) {
+ return;
+ }
+
+ if (size < 1) {
+ php_error_docref(NULL, E_WARNING, "Length must be greater than 0");
+ RETURN_FALSE;
+ }
+
+ bytes = zend_string_alloc(size, 0);
+
+ if (php_random_bytes(bytes->val, size) == FAILURE) {
+ zend_string_release(bytes);
+ return;
@nikic
nikic Feb 24, 2015 Contributor

Probably that should be RETURN_FALSE as well? Same in random_int.

@SammyK
SammyK Feb 24, 2015 Contributor

Nice catch! I'll fix.

@nikic nikic commented on an outdated diff Feb 24, 2015
ext/standard/random.c
+/* $Id$ */
+
+#include <stdlib.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <math.h>
+
+#include "php.h"
+
+#if PHP_WIN32
+# include "win32/winutil.h"
+#endif
+
+static int php_random_bytes(void *bytes, size_t size)
+{
+ int n = 0;
@nikic
nikic Feb 24, 2015 Contributor

Better would be ssize_t.

@nikic
nikic Feb 24, 2015 Contributor

The declaration should also be moved into the while loop, otherwise this will likely generate a warning on BSD systems with arc4random.

@laruence laruence and 2 others commented on an outdated diff Feb 25, 2015
ext/standard/random.c
+ RETURN_STR(bytes);
+}
+/* }}} */
+
+/* {{{ proto int random_int(int min, int max)
+Return an arbitrary pseudo-random integer */
+PHP_FUNCTION(random_int)
+{
+ zend_long min = ZEND_LONG_MIN;
+ zend_long max = ZEND_LONG_MAX;
+ zend_ulong limit;
+ zend_ulong umax;
+ zend_ulong result;
+
+ if (ZEND_NUM_ARGS() == 1) {
+ php_error_docref(NULL, E_WARNING, "A minimum and maximum value are expected, only minimum given");
@laruence
laruence Feb 25, 2015 Member

Since you already have LONG_MIN and LONG_MAX as defaults, if only the minimum value is given, why not simply assume it means min to LONG_MAX?

@patrickallaert
patrickallaert Mar 19, 2015 Contributor

@laruence: I thought about the same thing, but then I realized that reading:
random_int(10);
might be understood as 10 being the max.

@laruence
laruence Mar 19, 2015 Member

hmm, that could be a doc issue.. but it's not a big deal :)

@SammyK
SammyK Mar 20, 2015 Contributor

Oops - this is a dingleberry left over from when min and max were optional args. They are both required in the current spec that's being voted on. I'll remove this check. :)

@laruence laruence commented on an outdated diff Feb 25, 2015
ext/standard/random.c
+ zend_long max = ZEND_LONG_MAX;
+ zend_ulong limit;
+ zend_ulong umax;
+ zend_ulong result;
+
+ if (ZEND_NUM_ARGS() == 1) {
+ php_error_docref(NULL, E_WARNING, "A minimum and maximum value are expected, only minimum given");
+ RETURN_FALSE;
+ }
+
+ if (zend_parse_parameters(ZEND_NUM_ARGS(), "|ll", &min, &max) == FAILURE) {
+ return;
+ }
+
+ if (min >= max) {
+ php_error_docref(NULL, E_WARNING, "Minimum value must be less than the maximum value");
@laruence
laruence Feb 25, 2015 Member

Maybe something like "1st arg must be less than 2nd arg" would be more obvious for the user?

@laruence laruence commented on an outdated diff Feb 25, 2015
ext/standard/random.c
+ umax = max - min;
+
+ if (php_random_bytes(&result, sizeof(result)) == FAILURE) {
+ RETURN_FALSE;
+ }
+
+ // Special case where no modulus is required
+ if (umax == ZEND_ULONG_MAX) {
+ RETURN_LONG((zend_long)result);
+ }
+
+ // Increment the max so the range is inclusive of max
+ umax++;
+
+ // Powers of two are not biased
+ if (umax & ~umax != umax) {
@laruence
laruence Feb 25, 2015 Member

Use braces to make it more readable.
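Explicit grouping matters for more than readability here. In C, `!=` binds more tightly than `&`, so `umax & ~umax != umax` parses as `umax & (~umax != umax)`, and `umax & ~umax` on its own is always 0. A common idiom for the power-of-two test is sketched below; the function name `is_power_of_two` is an illustrative assumption, not code from the PR.

```c
#include <stdint.h>

/* Nonzero if x is a power of two. Clearing the lowest set bit with
 * x & (x - 1) leaves zero only when exactly one bit was set. */
int is_power_of_two(uint64_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}
```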

@tom-- tom-- commented on the diff Feb 25, 2015
ext/standard/random.c
+ }
+
+ // Increment the max so the range is inclusive of max
+ umax++;
+
+ // Powers of two are not biased
+ if (umax & ~umax != umax) {
+ // Ceiling under which ZEND_LONG_MAX % max == 0
+ limit = ZEND_ULONG_MAX - (ZEND_ULONG_MAX % umax) - 1;
+
+ // Discard numbers over the limit to avoid modulo bias
+ while (result > limit) {
+ if (php_random_bytes(&result, sizeof(result)) == FAILURE) {
+ return;
+ }
+ }
@tom--
tom-- Feb 25, 2015

So the random_int() function:

  • has a non-deterministic execution time (actually, it's quite random)
  • that doesn't have any limit
  • so the function isn't guaranteed to return at all, ever.

Naturally I understand that the likelihood of these characteristics causing problems in PHP applications is small but I think reviewers should be aware of it.

@nikic
nikic Feb 25, 2015 Contributor

Let bw = sizeof(zend_ulong) * 8. ZEND_ULONG_MAX % umax will take its largest value for umax = 2 ** (bw - 1), in which case limit = 2 ** (bw - 1) - 1. This means that all results with top bit 1 will be discarded here. For every result the probability for a set top bit is 1/2. As such the worst case probability that this loop has not stopped after generating n random numbers is 1 / 2**n.

@tom--
tom-- Feb 25, 2015

@nikic That looks correct.
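The loop under discussion can be sketched in standalone C as follows. This is an illustration under stated assumptions, not the PR's code: `uniform_in_range`, the `byte_source` callback type and `demo_source` are hypothetical names, and `demo_source` is a deterministic xorshift stand-in that is NOT cryptographically secure. A real caller would pass a function backed by the platform CSPRNG.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: fill buf with len random bytes, return 0 on success. */
typedef int (*byte_source)(void *buf, size_t len);

/* Deterministic stand-in for illustration only. NOT secure. */
static uint64_t demo_state = 0x9e3779b97f4a7c15ULL;

int demo_source(void *buf, size_t len)
{
    unsigned char *p = buf;
    size_t i;

    for (i = 0; i < len; i++) {
        demo_state ^= demo_state << 13;
        demo_state ^= demo_state >> 7;
        demo_state ^= demo_state << 17;
        p[i] = (unsigned char)(demo_state >> 32);
    }
    return 0;
}

/* Pick a uniform value in [min, max] by rejection sampling.
 * Returns 0 on success, -1 on bad arguments or source failure.
 * The unsigned-to-signed conversions assume a two's-complement platform. */
int uniform_in_range(int64_t min, int64_t max, byte_source src, int64_t *out)
{
    uint64_t umax, result, limit;

    if (min > max) {
        return -1;
    }
    umax = (uint64_t)max - (uint64_t)min; /* unsigned width of the range */

    if (src(&result, sizeof(result)) != 0) {
        return -1;
    }
    if (umax == UINT64_MAX) {             /* full range: no reduction needed */
        *out = (int64_t)result;
        return 0;
    }
    umax++;                               /* number of distinct outcomes */

    if ((umax & (umax - 1)) != 0) {       /* not a power of two */
        /* Largest value such that [0, limit] holds a whole multiple of umax. */
        limit = UINT64_MAX - (UINT64_MAX % umax) - 1;
        while (result > limit) {          /* discard and redraw */
            if (src(&result, sizeof(result)) != 0) {
                return -1;
            }
        }
    }
    *out = (int64_t)((uint64_t)min + result % umax);
    return 0;
}
```

Each draw is rejected with probability below 1/2, which matches the bound given above: the chance of still looping after n draws is at most 1 / 2**n.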

@datibbaw
datibbaw Mar 21, 2015 Contributor

Why can't we just use the RAND_RANGE macro (or something similar) for this? It seems to me that fighting against the modulo problem is a wasted effort.

@nikic
nikic Mar 21, 2015 Contributor

@datibbaw Because RAND_RANGE is highly biased. It's not an option for crypto-quality randomness. I'm not aware of algorithms for this that don't use a form of rejection sampling.

@lt
lt Mar 21, 2015 Contributor

It seems to me that fighting against the modulo problem is a wasted effort.

Get out! :)

In rand() or mt_rand() I completely agree, it would be wasted effort. In a function that is advertised as crypto-quality it's important.

@datibbaw
datibbaw Mar 21, 2015 Contributor

Well, that's fair enough; I shall read more into this, thanks :)

@narfbg narfbg commented on the diff Feb 25, 2015
ext/standard/random.c
+ return;
+ }
+
+ if (min >= max) {
+ php_error_docref(NULL, E_WARNING, "Minimum value must be less than the maximum value");
+ RETURN_FALSE;
+ }
+
+ umax = max - min;
+
+ if (php_random_bytes(&result, sizeof(result)) == FAILURE) {
+ RETURN_FALSE;
+ }
+
+ // Special case where no modulus is required
+ if (umax == ZEND_ULONG_MAX) {
@narfbg
narfbg Feb 25, 2015 Contributor

I'm pretty sure this is dead code - an ULONG_MAX is always higher than LONG_MAX, and therefore even if you have umax = ZEND_LONG_MAX - 0, it can never be as high ...

Also, for some reason random_int(0, PHP_INT_MAX) results in umax = -1 ...

@lt
lt Mar 15, 2015 Contributor

@narfbg I don't think it's dead code. If you take the absolute minimum value, (unsigned) umax = (signed) PHP_INT_MAX - (signed) PHP_INT_MIN (full negative). Which should be the same as ZEND_ULONG_MAX.
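A standalone sketch of that subtraction, using fixed-width types in place of `zend_long` and `zend_ulong` (an assumption for illustration; `range_width` is a hypothetical name). Converting to unsigned before subtracting matters, because overflow in signed subtraction is undefined behaviour in C.

```c
#include <stdint.h>

/* Width of [min, max] as an unsigned value, assuming min <= max.
 * The casts make the subtraction wrap modulo 2^64 instead of overflowing. */
uint64_t range_width(int64_t min, int64_t max)
{
    return (uint64_t)max - (uint64_t)min;
}
```

For the full signed range the result is all ones, i.e. the unsigned maximum. An all-ones value reads as -1 if it is printed or cast as a signed integer on two's-complement platforms.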

Not sure how umax is equal to -1 under those conditions either. It's unsigned...

@SammyK SammyK referenced this pull request in SammyK/php-src-csprng Feb 25, 2015
Closed

Implement random_bytes() and random_int() #3

@SammyK SammyK commented on the diff Feb 25, 2015
ext/standard/random.c
+
+ if (ZEND_NUM_ARGS() == 1) {
+ php_error_docref(NULL, E_WARNING, "A minimum and maximum value are expected, only minimum given");
+ RETURN_FALSE;
+ }
+
+ if (zend_parse_parameters(ZEND_NUM_ARGS(), "|ll", &min, &max) == FAILURE) {
+ return;
+ }
+
+ if (min >= max) {
+ php_error_docref(NULL, E_WARNING, "Minimum value must be less than the maximum value");
+ RETURN_FALSE;
+ }
+
+ umax = max - min;
@SammyK
SammyK Feb 25, 2015 Contributor

We need to make sure the min & max values weren't the same.

@nikic
nikic Feb 25, 2015 Contributor

You're already checking this on line 126.

@SammyK
SammyK Feb 25, 2015 Contributor

Oh snap. Not. Enough. Coffee! Oh wait, I'm drinking decaf for some reason today. :/

@nikic nikic added RFC PHP7 labels Mar 8, 2015
@patrickallaert patrickallaert and 1 other commented on an outdated diff Mar 19, 2015
ext/standard/random.c
+ zend_long min = ZEND_LONG_MIN;
+ zend_long max = ZEND_LONG_MAX;
+ zend_ulong limit;
+ zend_ulong umax;
+ zend_ulong result;
+
+ if (ZEND_NUM_ARGS() == 1) {
+ php_error_docref(NULL, E_WARNING, "A minimum and maximum value are expected, only minimum given");
+ RETURN_FALSE;
+ }
+
+ if (zend_parse_parameters(ZEND_NUM_ARGS(), "|ll", &min, &max) == FAILURE) {
+ return;
+ }
+
+ if (min >= max) {
@patrickallaert
patrickallaert Mar 19, 2015 Contributor

I would change >= to be >.
Reason: If I need a number between a and b, I expect that it could possibly return either a or b, if the range is inclusive (note that it isn't clear in the RFC).
In the case a == b, I would expect that this method returns a or b.
While that isn't particularly useful with fixed values being passed to the function, it is in the case of variables.

@ircmaxell
ircmaxell Mar 26, 2015 Contributor

Well, if min == max, then there's nothing random about the answer (it's min). Hence there's no reason for the function to return a valid value (in fact, returning a value could mask bugs where you thought it was random but it wasn't).

@datibbaw

The random_int() function isn't very efficient in terms of entropy consumption; if I want random values between 1 and 100 it may consume at least 8 bytes to return the value, wasting 7 bytes.

@lt
Contributor
lt commented Mar 21, 2015

@datibbaw We've discussed at quite some length that it is virtually (and in most cases literally) impossible to exhaust any sources of entropy.

Do you have any concerns about this that aren't related to the pool exhaustion myth?

We can check the bounds and request fewer bytes, but I wonder if it's worth the overhead. Obviously pulling bytes and discarding them helps the cause of being unpredictable, so I'm personally not too bothered about this until a substantiated problem arises from it.
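As a rough idea of what such a bounds check could look like, here is a hypothetical helper (not something in the PR) that computes how many whole bytes are enough to cover a range whose largest offset is `umax`:

```c
#include <stddef.h>
#include <stdint.h>

/* Smallest number of whole bytes able to represent every value 0..umax. */
size_t bytes_needed(uint64_t umax)
{
    size_t n = 1;

    /* Keep adding bytes while bits remain above the first n bytes. */
    while (n < sizeof(umax) && (umax >> (8 * n)) != 0) {
        n++;
    }
    return n;
}
```

A range of 1 to 100 has `umax` = 99 and would need a single byte per draw. The discard step for out-of-range values would still have to be applied to the smaller value.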

@tom--
tom-- commented Mar 21, 2015

@datibbaw

There is no question about the requirement: this feature must generate uniformly-distributed random numbers. Our context means we need an algorithm that will derive them from uniformly-distributed random bytes.

Back in the 1980s when computers were really slow and expensive and efficiency of algorithms was actually important, I worked on Monte Carlo simulations, which do not even need cryptographic randomness. We still had to use this algorithm for generating uniformly-distributed random integers in a given range because we needed statistically meaningful results. To this day I have never come across an alternative.

If "entropy consumption" is so serious a problem that we cannot use this algorithm then we have to drop the feature altogether. So I think you need to explain what "entropy consumption" means in this context and how its "efficiency" becomes a serious consideration in the design of this generator.

@datibbaw
Contributor

I agree that with a proper entropy source (or a combination of them) this ought not be a problem. The only reason for bringing this up is that I've run into issues with /dev/random before :)

@tom--
tom-- commented Mar 21, 2015

@datibbaw Everybody runs into the problem of /dev/random blocking. The solution is simple: never use /dev/random on Linux. (/dev/random on Linux is a bug.) This generator does not use it.

@ircmaxell ircmaxell commented on an outdated diff Mar 25, 2015
ext/standard/random.c
+#if HAVE_DEV_ARANDOM
+ fd = open("/dev/arandom", O_RDONLY);
+#else
+#if HAVE_DEV_URANDOM
+ fd = open("/dev/urandom", O_RDONLY);
+#endif // URANDOM
+#endif // ARANDOM
+ if (fd < 0) {
+ php_error_docref(NULL, E_WARNING, "Cannot open source device");
+ return FAILURE;
+ }
+
+ RANDOM_G(fd) = fd;
+ }
+
+ size_t n = 0;
@ircmaxell
ircmaxell Mar 25, 2015 Contributor

n must be signed. It can be int or ssize_t, but it needs to be signed. Otherwise the if (n < 0) condition can literally never happen.
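A minimal sketch of a POSIX read loop with a signed byte count, to show why the type matters. The function name `read_exact` is an assumption for illustration and the PR's actual loop may differ in detail.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Read exactly `size` bytes from fd into buf.
 * Returns 0 on success, -1 on error or unexpected end of input. */
int read_exact(int fd, void *buf, size_t size)
{
    unsigned char *p = buf;
    size_t done = 0;

    while (done < size) {
        /* read() returns -1 on error, so n has to be a signed type. */
        ssize_t n = read(fd, p + done, size - done);

        if (n < 0) {
            if (errno == EINTR) {
                continue; /* interrupted by a signal, retry */
            }
            return -1;
        }
        if (n == 0) {
            return -1; /* end of input before `size` bytes arrived */
        }
        done += (size_t)n;
    }
    return 0;
}
```

With `size_t n`, the -1 returned by `read()` would be converted to a huge positive value and the error branch would never run.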

@weltling weltling commented on an outdated diff Mar 26, 2015
ext/standard/php_random.h
+/* $Id$ */
+
+#ifndef PHP_RANDOM_H
+#define PHP_RANDOM_H
+
+PHP_FUNCTION(random_bytes);
+PHP_FUNCTION(random_int);
+
+PHP_MINIT_FUNCTION(lcg);
+
+ZEND_BEGIN_MODULE_GLOBALS(random)
+ int fd;
+ZEND_END_MODULE_GLOBALS(random)
+
+#ifdef ZTS
+# define RANDOM_G(v) TSRMG(random_globals_id, zend_random_globals *, v);
@weltling
weltling Mar 26, 2015 Contributor

there should be no semicolon at the end

@weltling
weltling Mar 26, 2015 Contributor

Also ZEND_TSRMG should be used to avoid unnecessary function calls. But even without it I guess the TS build is broken as there's no code allocating/deallocating globals. You can look up how it is done in some other submodule, e.g. browscap.

I'd also like to ask: shouldn't the secure device actually be reopened on each request rather than once at startup? If so, maybe it would be better to introduce request handlers. Or even not put it into globals at all, but reopen it every time it's required; then it's guaranteed to be valid every time it's used.

@weltling weltling commented on the diff Mar 26, 2015
ext/standard/basic_functions.c
@@ -3646,6 +3659,8 @@ PHP_MINIT_FUNCTION(basic) /* {{{ */
# endif
#endif
+ BASIC_MINIT_SUBMODULE(random)
@weltling
weltling Mar 26, 2015 Contributor

The prototype for this needs to be somewhere in the header, otherwise there'll be a visibility issue.

@weltling weltling commented on an outdated diff Mar 26, 2015
ext/standard/random.c
+#include <fcntl.h>
+#include <math.h>
+
+#include "php.h"
+#include "php_random.h"
+
+#if PHP_WIN32
+# include "win32/winutil.h"
+#endif
+
+ZEND_DECLARE_MODULE_GLOBALS(random);
+
+/* {{{ */
+PHP_MINIT_FUNCTION(random)
+{
+ RANDOM_G(fd) = -1;
@weltling
weltling Mar 26, 2015 Contributor

This descriptor should be closed at some point.

@weltling weltling commented on the diff Mar 26, 2015
ext/standard/random.c
+ php_error_docref(NULL, E_WARNING, "A minimum and maximum value are expected, only minimum given");
+ RETURN_FALSE;
+ }
+
+ if (zend_parse_parameters(ZEND_NUM_ARGS(), "|ll", &min, &max) == FAILURE) {
+ return;
+ }
+
+ if (min >= max) {
+ php_error_docref(NULL, E_WARNING, "Minimum value must be less than the maximum value");
+ RETURN_FALSE;
+ }
+
+ umax = max - min;
+
+ if (php_random_bytes(&result, sizeof(result)) == FAILURE) {
@weltling
weltling Mar 26, 2015 Contributor

What is the principal difference between this and casting some garbage into an integer? Could someone point to the theory behind this?

@ircmaxell
ircmaxell Mar 26, 2015 Contributor

What do you mean? How else would you do it?

@weltling
weltling Mar 26, 2015 Contributor

Yeah, that was actually my question :) Maybe I'm too fixed on LCG, so just making a stab on finding some system. In a LCG one would have a seed and a kind of formula. Here, we read some sequence of bits which are then inclined to be an integer.

Probably right, in the end those bits are glued together. In the end, the integer is in the exact range of how many bits were requested. But just wondering: no further shuffling, endianness differences, etc., just taking them as is? Are there some tests on the quality of the outcome? That's basically what I was asking.

@tom--
tom-- Mar 26, 2015

@weltling There is no need. If the input to the algorithm is a sequence of independent, uniformly-distributed random bytes then its output is guaranteed to be independent, uniformly-distributed over the requested range and have the same "kind" of randomness as the input. In this case, since we are using trusted sources of crypto-secure pseudo-random bytes, the output is a crypto-secure pseudo-random integer.

Try thinking of it this way. Say you need random numbers in the range 0 to 13 and all you have as a source of randomness is a friend with a 20-sided die. The only way (that I know) to get the numbers you need without bias or skewing the distribution is to ask your friend to keep rolling the die until it gives a number in the desired range.

Now, say you need numbers in the range 0 to 8. You can map two disjoint subsets of the die's 0 to 19 range to the desired range and this improves the algorithm's efficiency.

I believe this is exactly analogous to how this algorithm is supposed to work. (I am making no comment about the correctness of the implementation.)

It's an old algorithm we can trust.

@lt
lt Mar 26, 2015 Contributor

@weltling There's no need for any extra work. If an integer occupies 64 bits of memory, and we use a random source to set every one of those bits, the result is a random integer with the same quality as the random source.

Endianness does not matter since every byte is independently random. If the bytes are ordered AB they are equally random to being ordered BA.

There are tools that can test the quality of the random output, but you'll literally be testing the underlying source. We're putting our faith in the Linux/Windows/BSD APIs here. If it turns out that these sources are in fact low quality, then civilisation itself will collapse :)

@lt
lt Mar 26, 2015 Contributor

@tom--

I believe this is exactly analogous to how this algorithm is supposed to work.

Indeed. Find the ceiling under RAND_MAX where ceiling % upper_bound == 0, and discard all values greater than or equal to that ceiling.

@weltling
weltling Mar 26, 2015 Contributor

@lt the world is concurrent enough not to break because of whichever virtual RNG :) But ok, so the presumption of innocence is applied to the OS randomness sources.

@tom-- yeah, maybe another improvement could be to ask the friend to throw more than one die at once. Possibly it could reduce the overall number of rounds. However, I'm not sure how reliable it would be with this method (I mean how much uniform random data one can get at once); for LCG I can say it could be done with something like AVX/SSE vectorization capabilities. Here it can probably only be determined empirically.

Thanks for the answers, guys.

@sarciszewski

Congratulations on passing unanimously! One thing: Will you ever support the getrandom(2) syscall on newer Linux flavors?

(I didn't see anything in the RFC and I'm not really awake enough to peruse the code.)

@lt
Contributor
lt commented Mar 30, 2015

@sarciszewski This is something we have discussed, but decided to leave for future scope. getrandom (and getentropy on BSD) are typically used to seed RNGs, and are not designed for high throughput themselves.

If you compile against LibreSSL and have a Linux kernel version >= 3.17, you will be using getrandom today.

@ircmaxell
Contributor

The vote passed, but was never closed. This should be merged soon ;-)

@nikic nikic commented on an outdated diff Apr 9, 2015
ext/standard/random.c
+#if HAVE_DEV_ARANDOM
+ fd = open("/dev/arandom", O_RDONLY);
+#else
+#if HAVE_DEV_URANDOM
+ fd = open("/dev/urandom", O_RDONLY);
+#endif // URANDOM
+#endif // ARANDOM
+ if (fd < 0) {
+ php_error_docref(NULL, E_WARNING, "Cannot open source device");
+ return FAILURE;
+ }
+
+ RANDOM_G(fd) = fd;
+ }
+
+ ssize_t n = 0;
@nikic
nikic Apr 9, 2015 Contributor

This needs to be moved into the while loop (combined declaration and assignment) to conform with C89.

@SammyK
Contributor
SammyK commented Apr 9, 2015

Thanks for the push @ircmaxell! :) I'm coordinating with @lt to get this bad boy ready for merge. :)

@lt lt commented on an outdated diff Apr 9, 2015
ext/standard/random.c
+ if (php_random_bytes(bytes->val, size) == FAILURE) {
+ zend_string_release(bytes);
+ RETURN_FALSE;
+ }
+
+ bytes->val[size] = '\0';
+
+ RETURN_STR(bytes);
+}
+/* }}} */
+
+/* {{{ proto int random_int(int min, int max)
+Return an arbitrary pseudo-random integer */
+PHP_FUNCTION(random_int)
+{
+ zend_long min = ZEND_LONG_MIN;
@lt
lt Apr 9, 2015 Contributor

@SammyK Since both parameters are required, we don't need these defaults any more.

@SammyK
Contributor
SammyK commented Apr 10, 2015

Hokay! @lt updated this PR based on feedback. I normalized the return values when errors happen, updated the tests, merged in the latest from master and fixed conflicts. Phew!

The RFC officially passed unanimously and this PR can be merged into master now. W00t! Not sure if @nikic is the one to do that or who we need to ping.

Thanks again @ircmaxell, @lt, @rdlowrey, and all the other kids who helped me along on this one! 🍻 Let's do some more of these! :D

@nikic
nikic commented on f8a6d38 Apr 29, 2015

Zpp failures should always return null (what was implemented initially), unless there is some very strong reason to do otherwise. We usually only make exceptions to this for legacy functions.

@nikic nikic referenced this pull request May 9, 2015
Closed

Userland CSPRNG #1268

@nikic
Contributor
nikic commented May 9, 2015

Merged via #1268. Thanks everyone who worked on this!

@nikic nikic closed this May 9, 2015
@SammyK
Contributor
SammyK commented May 11, 2015

Yay! Thanks @nikic! :)

@SammyK SammyK deleted the SammyK:rand-bytes branch May 11, 2015
@CodesInChaos
  1. The documentation for random_int could use some improvements:
    • Clarify the behaviour if min<=max is violated
    • Specify if the bounds are inclusive or exclusive (looking at rand and mt_rand it's probably inclusive)
    • It should mention that the distribution is uniform
  2. Why does it return FALSE, which can easily be treated as 0, instead of something you can't ignore, like an exception, when the RNG is unavailable?
  3. What about a secure polyfill?
@SammyK
Contributor
SammyK commented Jul 6, 2015

Why does it return FALSE, which can easily be treated as 0, instead of something you can't ignore, like an exception, when the RNG is unavailable?

Good point. I think a RuntimeException should be thrown when a proper source of random cannot be detected. That way when errors are turned off we won't have people using false/0 as their CSPRNG. :) Thoughts @lt?

@SammyK
Contributor
SammyK commented Jul 6, 2015

@lt: I just pushed up a branch with this change, can you take a look? :)

@CodesInChaos

@SammyK What should happen if min > max? Also an exception?

@lt
Contributor
lt commented Jul 7, 2015

I think when we wrote this, exceptions in the engine hadn't yet passed.

I'd be happy with InvalidArgumentException for min>max and len<=0

@CodesInChaos

Should len === 0 be considered invalid? I'd consider it valid and only throw an exception if len < 0.

@tom--
tom-- commented Jul 7, 2015

When someone tries to generate an empty string using the CSPRNG, I think it's likely a programming error (if it were my project, I'd define it as a programming error). So I think it is safer if PHP treats this case as invalid.

And, from the other point of view, I think PHP can reasonably reject bug reports that you can't generate empty strings with the CSPRNG, especially if the conditions for these exceptions are documented.

@lt
Contributor
lt commented Jul 7, 2015

If it's not a programming error, it's a stupid thing to do. Invalid all the way.

@CodesInChaos

At least min == max on random_int shouldn't be an error, since that can be useful in practice. For example when picking the last card in a shuffling algorithm.

@tom--
tom-- commented Jul 7, 2015

@CodesInChaos I agree.

@lt
Contributor
lt commented Jul 7, 2015

I'll disagree with practically useful. But I'll agree that it should be allowed, in the "spirit of PHP"

If we're going to allow min == max it should be special cased, no point fetching from the rng for it.

    if (min >= max) {
        if (min == max) {
            RETURN_LONG(min);
        }

        zend_throw_exception(spl_ce_InvalidArgumentException, "Maximum value must not be less than the minimum value", 0);
        return;

    }

@SammyK perhaps do something like this for SPL exceptions: https://github.com/php/php-src/blob/ed1b64877d82af71bc64a48bf914046640e8a270/ext/mysqli/mysqli.c#L606 So it can fall back to a less specific exception if SPL is not included.

@nikic
Contributor
nikic commented Jul 7, 2015

Please do the change to exceptions separately from changing any error conditions.

@SammyK
Contributor
SammyK commented Jul 7, 2015

Sounds good. I'll submit 2 separate PRs: one for throwing general exceptions and one for new error conditions. Then I'll update the docs when they are merged and post here for another once-over to make sure the docs cover all the bases. :)
