Skip to content

Conversation

bishopb
Copy link
Contributor

@bishopb bishopb commented Jul 13, 2015

Requesting a MT random number in a range that exceeds PHP_INT_MAX causes overflow, which then results in a non-uniform distribution of random numbers. Examples:

$rand = mt_rand(PHP_INT_MIN, PHP_INT_MAX); // causes overflow $rand will be disproportionately 0

Fixed as follows. If overflow would not happen, proceed as before scaling into the desired range. If overflow would happen, (a) scale the number into the negative range, (b) scale the number into the positive range, and (c) add them together. While not technically a BC breakage, it's possible that someone was relying on the old behavior which produced disproportionately more zero values. Advise if this should be back patched.

See: https://bugs.php.net/bug.php?id=63174

@bishopb bishopb changed the title Fixes bug #63174 Fix #63174 - mt_rand() fails when given range exceeding PHP_INT_MAX Jul 13, 2015
@cmb69
Copy link
Member

cmb69 commented Jul 14, 2015

Travis shows strange errors for ext/mysqli/tests/mysqli_fetch_all.php on line 155. Can you please have a look at this issue, Bishop?

*
* -RL
*/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you move this to the header file? IMO it makes more sense having it closer to the implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@datibbaw Because the implementation of the algorithm the comment describes is in the header file, lines 69-70.

bishopb added 2 commits July 15, 2015 09:06
…eavy lifting. Can reuse for all the places where we might have overflow.
…e, detect overflow situations and compensate by adding two random numbers together: one from the lower half plus one from the upper half. Also, improve test to detect how the original PRNG were non-random. Also, apply fix to rand as well.
@cmb69
Copy link
Member

cmb69 commented Jul 16, 2015

I saw your patch. Does the (zend_long) cast fix the issue?

I don't have a 32bit Linux at hand, so I tested on Windows 32bit, where getrandmax() == 32767. Therefore I changed the first 2 lines of test script to:

$rand_min = PHP_INT_MIN;
$rand_max = PHP_INT_MAX;

That worked out fine, and also replacing rand() with mt_rand(). I believe that'll work on 32 Linux (i.e. where getrandmax() == 2137483647), too, but I may be mistaken, and it would be great if you could run some tests,

Another, albeit minor, issue is the step to (M+1). Somehow that doesn't feel quite right – I may err, though.

@nikic
Copy link
Member

nikic commented Jul 23, 2015

I would go with the very first version, i.e. disallow ranges that exceed mtrandmax. The implementation just doesn't work reasonably for anything else -- if you start off with a 31bit number you shouldn't try to stretch it to anything more than that.

@smalyshev smalyshev added the Bug label Jul 25, 2015
@cmb69
Copy link
Member

cmb69 commented Jul 30, 2015

@nikic I agree that it's best disallow such ranges. However, that constitutes a BC break, so if we do it, it should be done for PHP 7.0, and not for a minor version, what demands some hurry.

@nikic
Copy link
Member

nikic commented Jul 30, 2015

@cmb69 The alternative would be (also for PHP 7 only) to use proper rejection sampling (from 2 MT rand outputs on 64bit) if we exceed maxrand range. This will change the sequence if such mt_rand calls are made, but at least you will get a uniform result.

@cmb69
Copy link
Member

cmb69 commented Jul 30, 2015

@nikic Isn't it a problem that two consecutive mt_rand()s are not independent? Otherwise we could simply construct n1<<31|n2 and scale down like before (of course, missing some values due to limited double precision, either way).

@bishopb
Copy link
Contributor Author

bishopb commented Jul 30, 2015

@cmb69 Yes, my understanding is that MT samples are not independent. There is an alternate MT disjoint samples algorithm, but I think that might be overkill.

I think we're all in agreement that the safest approach is the very first version: disallow ranges exceeding mtrandmax. If the user wants a bigger range, he can do something like this in userland:

$rn = (rand(0, 1) ? mt_rand() : mt_rand(-mt_getrandmax(), -1);

@cmb69
Copy link
Member

cmb69 commented Jul 30, 2015

@bishopb ACK. What do we do with rand(), which has basically the same issue? Note that PHP_RAND_MAX is 32767 on Windows.

@bishopb
Copy link
Contributor Author

bishopb commented Sep 23, 2015

Handling this in a different way, closing this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants