Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[5.8] Fix seeding logic in Arr::shuffle #27861

Closed
wants to merge 1 commit into from

Conversation

Projects
None yet
3 participants
@cviebrock
Copy link
Contributor

cviebrock commented Mar 12, 2019

The random comparator logic used -- return rand(-1, 1) -- is not really that random, as it's biased 2:1 in favour of not swapping the two elements.

values before random comparator result after
a, b -1 a, b
a, b 0 a, b
a, b 1 b, a

This can be illustrated with the following code, which checks what the first element of a "randomized" array is when using the same logic as Arr::shuffle() when a seed is passed (assuming the seeds in this case are randomized as well).

$k = [ 'a', 'b', 'c', 'd', 'e' ];
$r = array_fill_keys($k, 0);

for ($i=0; $i<1000000; $i++) {
    $array = $k;

    usort($array, function () {
        return rand(-1, 1);
    });

    $r[ $array[0] ] ++;
}

print_r($r);

Results:

Array
(
    [a] => 564137
    [b] => 105554
    [c] => 281362
    [d] => 36514
    [e] => 12433
)

So, when that random logic runs, more than half the time the shuffled array still has the first element in the first position.

Changing the logic to use return rand(0,1) helps a bit, but is still biased:

Array
(
    [a] => 307937
    [b] => 204875
    [c] => 308190
    [d] => 116441
    [e] => 62557
)

I think the basic fix for this is to simply use PHP's shuffle() which returns a much more evenly randomized array. If the randomization needs to be seeded, we can just set that before calling shuffle() (and then set it again afterwards, with a random seed, to prevent later logic in the code from being biased).

This PR does this.

Colin Viebrock

@GrahamCampbell GrahamCampbell changed the title Fix seeding logic in Arr::shuffle [5.8] Fix seeding logic in Arr::shuffle Mar 12, 2019

@driesvints

This comment has been minimized.

Copy link
Member

driesvints commented Mar 13, 2019

Can you add a test to verify this?

@taylorotwell

This comment has been minimized.

Copy link
Member

taylorotwell commented Mar 13, 2019

Merged.

@cviebrock

This comment has been minimized.

Copy link
Contributor Author

cviebrock commented Mar 13, 2019

I was going to ask what kind of test would make sense on a function that should produce random results. The only thing I could think of was making sure that shuffling the same array with the same seed produced the same result. Would that be useful @taylorotwell / @driesvints ... or are you happy with having just merged it?

@driesvints

This comment has been minimized.

Copy link
Member

driesvints commented Mar 18, 2019

@cviebrock this is fine for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.