[WIP] Easy User-land CSPRNG #1119
I wonder about the usefulness of this function. This can be easily achieved with |
Yeah, I wouldn't put up too much of a fight on this one, but 100% of the times I've ever needed random bytes it has always been in the form of hex or int. I'm thinking most user-land kids are in the same boat. :) I noticed that the hex table for |
So I've been doing a bunch of research on binary to hex conversions and learned a lot. Since hex (base 16) is just a different representation of the same data in binary (base 2), it wouldn't necessarily add more randomness to convert the data to Base32 or Base64. But at the end of the day, the common user-land PHP developer won't know about these things in too much detail (as I didn't before tonight!). But what the user-land dev is thinking is, "I need a random alpha-numeric string", so hex should suffice. I've also been researching how other languages do random, like Ruby, Go and Python. They all look for BTW: I've seen most implementations of a unique ID in other languages make use of the CSPRNG like in Python. I think we should do the same with |
Good job doing the research (I feel almost proud! :)). You're absolutely right, hex or base64 doesn't add anything. In fact it takes away entropy, and we're not talking about small amounts here. So let's take hex for example. Conversion to hex literally takes an 8 bit representation of data, and turns it into a 4 bit representation of data (used as an index in a character table). So how much entropy do we really lose? Maths time (I'm thinking free-form here, and I'm not a mathematician, but hopefully this is along the right lines!) The entropy in a base-encoded string compared to the raw binary is: 1 / 2^((8 - bits_per_byte) * nbytes). Just to test, plugging in the full 8 bits per byte we get 1 / 2^0 which is 1/1, full entropy. So for example, for a 16 byte (128 bit) piece of keying material, keeping the input as hex rather than full binary, the key is 1 / 2^(4*16) as strong, i.e. 1 / 2^64. 2^64 is a reasonably familiar number, being the number of combinations in a 64-bit unsigned integer. I can't remember it in full, but I can remember it's roughly 18.4 quintillion. A hex encoded string, for 16 bytes, is 18.4 quintillion times weaker than the full binary string. Incidentally, that is the same as the number of key combinations that are left (2^128 / 2^64 == 2^64). It's still going to take a long time to crack, but I have no idea how this affects attacks on ciphers that break X of so many rounds, if you can guarantee specific weaknesses in the key. |
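The arithmetic in this comment can be checked with a few lines of C. This is only a sketch of the bookkeeping, not code from the PR: the names `entropy_bits` and `lost_bits` are made up for illustration, and "bits per character" means how many random bits each 8-bit character of the encoded string carries (4 for hex, 6 for base64, 8 for raw binary).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bits of entropy carried by nchars characters when
 * each 8-bit character only encodes bits_per_char random bits. */
size_t entropy_bits(unsigned bits_per_char, size_t nchars)
{
    assert(bits_per_char <= 8);
    return (size_t) bits_per_char * nchars;
}

/* Hypothetical helper: bits lost compared with nchars bytes of raw binary.
 * The key space shrinks by a factor of 2 to the power of this value. */
size_t lost_bits(unsigned bits_per_char, size_t nchars)
{
    assert(bits_per_char <= 8);
    return (size_t) (8 - bits_per_char) * nchars;
}
```

For a 16 character hex string used directly as a 16 byte key, `entropy_bits(4, 16)` is 64 and `lost_bits(4, 16)` is 64, matching the 2^128 / 2^64 == 2^64 figure above.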
I think migrating other parts of PHP to use a CSPRNG should be a separate PR.
Oh man! Math! I've got more research to do. :) Yes, I agree with you on creating separate PR's for I should have a little bit more time on Thursday to convert some of this research into commits on this bad boy. :) |
Edit: - Hah, I see you already did this in your second commit :) With your includes at the top:
Then use |
This should go back to being |
We need to be careful about uniformity in this function. If you use modulus to bring the number down to your upper bound and the divisor is not a power of 2, you can end up with some bias in the result which we obviously want to avoid with a CSPRNG. I'll help with the implementation of fixing this. |
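One common way to avoid the modulo bias described here is rejection sampling: throw away raw values that fall in the incomplete last "cycle" of the range, then reduce. The sketch below is an illustration under assumptions, not the code from this PR. `uniform_in_range`, `u64_source` and `demo_source` are hypothetical names, and `demo_source` is a splitmix64 generator used only so the example is repeatable; it is not cryptographically secure and would be replaced by a real CSPRNG read. The signed conversions assume the usual two's complement behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical callback type: returns 64 fresh random bits per call. */
typedef uint64_t (*u64_source)(void);

/* Demo source only (splitmix64). NOT a CSPRNG; swap in real random bytes. */
static uint64_t demo_state = 42;
uint64_t demo_source(void)
{
    uint64_t z = (demo_state += 0x9E3779B97F4A7C15ULL);
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

/* Returns a value uniformly distributed in [min, max], both inclusive. */
int64_t uniform_in_range(u64_source next, int64_t min, int64_t max)
{
    uint64_t umax, result, ceiling;

    assert(min <= max);
    umax = (uint64_t) max - (uint64_t) min; /* number of values minus one */
    result = next();

    if (umax == UINT64_MAX) {
        return (int64_t) result; /* full 64-bit range, nothing to reduce */
    }
    umax++; /* number of distinct values in the range */

    if ((umax & (umax - 1)) != 0) {
        /* Not a power of two: 2^64 is not a multiple of umax, so a plain
         * modulus would favour small results. Reject anything above the
         * last complete multiple of umax and draw again. */
        ceiling = UINT64_MAX - (UINT64_MAX % umax) - 1;
        while (result > ceiling) {
            result = next();
        }
    }
    return (int64_t) ((uint64_t) min + (result % umax));
}
```

With a range of 1 to 6, for example, only a handful of the 2^64 raw values are rejected, so the retry loop almost never runs more than once.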
Should we remove the |
Remove |
Sounds good! |
There are a few more options to investigate here. Off the top of my head, things to investigate (I can do all of this if you like):
To get a random number to work on do |
First attempt at random_int(). Commented-out random_hex() until the other functions are rocking.
Changes from my fork of rand-bytes
@Rican7 - Documentation issue in the RFC. The value in the implementation here is correct. I've let Sammy know. |
Fixed in the RFC. |
@SammyK I derped in the error message, minimum definitely should not be greater than the maximum :) |
Perhaps we should make it so both min and max need to be specified to be valid? Doesn't really feel sensible to allow the min without a max.
if (ZEND_NUM_ARGS() == 1) {
    php_error_docref(NULL, E_WARNING, "Error message of your choosing");
    RETURN_FALSE;
} |
Lol. Nice catch. :) I'll fix. |
This needs |
if (php_random_bytes(bytes->val, size) == FAILURE) {
    zend_string_release(bytes);
    return;
nikic
Feb 24, 2015
Member
Probably that should be RETURN_FALSE as well? Same in random_int.
SammyK
Feb 24, 2015
Author
Contributor
Nice catch! I'll fix.
static int php_random_bytes(void *bytes, size_t size)
{
    int n = 0;
nikic
Feb 24, 2015
Member
Better would be ssize_t.
nikic
Feb 24, 2015
Member
The declaration should also be moved into the while loop, otherwise this will likely generate a warning on BSD systems with arc4random.
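As a rough illustration of both review points (use `ssize_t`, and declare the counter inside the loop), here is a minimal POSIX read loop. It is a sketch under assumptions rather than the function in this PR: `read_random_bytes` is a hypothetical name, the device path `/dev/urandom` is assumed to exist, and the real implementation has other code paths (such as arc4random on BSD) where a function-level `n` would go unused.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not the PR's php_random_bytes().
 * Fills bytes[0..size) from /dev/urandom; returns 0 on success, -1 on failure. */
int read_random_bytes(void *bytes, size_t size)
{
    unsigned char *out = bytes;
    size_t read_bytes = 0;
    int fd = open("/dev/urandom", O_RDONLY);

    if (fd < 0) {
        return -1;
    }

    while (read_bytes < size) {
        /* Declared and assigned at the top of the loop body, which is valid
         * C89. It is ssize_t because read() returns -1 on error. */
        ssize_t n = read(fd, out + read_bytes, size - read_bytes);

        if (n < 0) {
            if (errno == EINTR) {
                continue; /* interrupted by a signal, try again */
            }
            break; /* real error */
        }
        if (n == 0) {
            break; /* unexpected end of file */
        }
        read_bytes += (size_t) n;
    }

    close(fd);
    return read_bytes == size ? 0 : -1;
}
```

Keeping `n` local to the loop also means it only exists in the code path that actually calls `read()`.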
zend_ulong result;

if (ZEND_NUM_ARGS() == 1) {
    php_error_docref(NULL, E_WARNING, "A minimum and maximum value are expected, only minimum given");
laruence
Feb 25, 2015
Member
Since you already have LONG_MIN and LONG_MAX, if only the minimum value min is given, why not simply assume it means min to LONG_MAX?
patrickallaert
Mar 19, 2015
Contributor
@laruence: I thought about the same thing, but then I realized that reading:
random_int(10);
might be understood as 10 being the max.
laruence
Mar 19, 2015
Member
hmm, that could be a doc issue.. but it's not a big deal :)
SammyK
Mar 20, 2015
Author
Contributor
Oops - this is a dingleberry left over from when min and max were optional args. They are both required in the current spec that's being voted on. I'll remove this check. :)
}

if (min >= max) {
    php_error_docref(NULL, E_WARNING, "Minimum value must be less than the maximum value");
laruence
Feb 25, 2015
Member
Maybe something like "1st arg must be less than 2nd arg" would be more obvious for the user?
@sarciszewski This is something we have discussed, but decided to leave for future scope. If you compile against LibreSSL and have a Linux Kernel version > 3.17, you will be using |
The vote passed, but was never closed. This should be merged soon ;-) |
    RANDOM_G(fd) = fd;
}

ssize_t n = 0;
nikic
Apr 9, 2015
Member
This needs to be moved into the while loop (combined declaration and assignment) to conform with C89.
Thanks for the push @ircmaxell! :) I'm coordinating with @lt to get this bad boy ready for merge. :) |
Return an arbitrary pseudo-random integer */
PHP_FUNCTION(random_int)
{
    zend_long min = ZEND_LONG_MIN;
Hokay! @lt updated this PR based on feedback. I normalized the return values when errors happen & updated the tests & merged in the latest from master & fixed conflicts. Phew! The RFC officially passed unanimously and this PR can be merged into master now. W00t! Not sure if @nikic is the one to do that or who we need to ping. Thanks again @ircmaxell, @lt, @rdlowrey, and all the other kids who helped me along on this one! |
Zpp failures should always return null (what was implemented initially), unless there is some very strong reason to do otherwise. We usually only make exceptions to this for legacy functions. |
Merged via #1268. Thanks everyone who worked on this! |
Yay! Thanks @nikic! :) |
Good point. I think a |
@lt: I just pushed up a branch with this change, can you take a look? :) |
@SammyK What should happen if |
I think when we wrote this, exceptions in the engine hadn't yet passed. I'd be happy with InvalidArgumentException for |
Should |
When someone tries to generate an empty string using the CSPRNG, I think it's likely a programming error (if it were my project, I'd define it as a programming error). So I think it is safer if PHP treats this case as invalid. And, from the other point of view, I think PHP can reasonably reject bug reports that you can't generate empty strings with the CSPRNG, especially if the conditions for these exceptions are documented. |
If it's not a programming error, it's a stupid thing to do. Invalid all the way. |
At least |
@CodesInChaos I agree. |
I'll disagree with practically useful. But I'll agree that it should be allowed, in the "spirit of PHP". If we're going to allow
if (min >= max) {
    if (min == max) {
        RETURN_LONG(min);
    }
    zend_throw_exception(spl_ce_InvalidArgumentException, "Maximum value must not be less than the minimum value", 0);
    return;
}
@SammyK perhaps do something like this for SPL exceptions: Line 606 in ed1b648 |
Please do the change to exceptions separately from changing any error conditions. |
Sounds good. I'll submit 2 separate PRs - one for throwing general exceptions and one for new error conditions. Then I'll update the docs when they are merged and post here for another once-over to make sure the docs cover all the bases. :) |
Submitting as a PR per @nikic's recommendation.
RFC can be found here.