Skip to content

[9.x] Random function doesn't generate evenly distributed random chars#45916

Merged
taylorotwell merged 4 commits into
laravel:9.xfrom
janh-kramer:9.x
Feb 2, 2023
Merged

[9.x] Random function doesn't generate evenly distributed random chars#45916
taylorotwell merged 4 commits into
laravel:9.xfrom
janh-kramer:9.x

Conversation

@janh-kramer

Copy link
Copy Markdown
Contributor

The random function does not generate evenly distributed random characters particularly for short strings (<12 chars). For example, if exactly 1 character is generated, the letters A, Q, g and w are significantly overrepresented.

This is due to the conversion of the randomly generated bytes using the base64 function. If the number of bytes is not divisible by three, zeros are added at the end in the algorithm (see https://en.wikipedia.org/wiki/Base64).

This leads to the increased frequency of A (000000), Q (010000), g (100000) and w (110000).

Evenly distributed random characters are obtained if it is ensured that the number of generated random bytes modulo 3 is equal to 0:

$bytesSize = (int) ceil(($size) / 3) * 3;

@driesvints driesvints changed the title Bugfix: random function doesn't generate evenly distributed random chars [10.x] Random function doesn't generate evenly distributed random chars Feb 2, 2023
@taylorotwell taylorotwell merged commit 4f2e46c into laravel:9.x Feb 2, 2023
@janh-kramer

Copy link
Copy Markdown
Contributor Author

@taylorotwell, When changing the test, you forgot to adjust the delta, so now the test will always pass.

@taylorotwell

Copy link
Copy Markdown
Member

ok, what should I change the delta to?

@janh-kramer

Copy link
Copy Markdown
Contributor Author

At most 200, could even be less

@janh-kramer

Copy link
Copy Markdown
Contributor Author

Should I target this to 10.x?

@taylorotwell

Copy link
Copy Markdown
Member

Sure

@giggsey

giggsey commented Feb 3, 2023

Copy link
Copy Markdown
Contributor

It's a minor complaint, but can the comments be updated to match the change in the iterations in 30681bc

        // take 6.200.000 samples, because there are 62 different characters
        for ($i = 0; $i < 620000; $i++) {

@driesvints driesvints changed the title [10.x] Random function doesn't generate evenly distributed random chars [9.x] Random function doesn't generate evenly distributed random chars Feb 3, 2023
@driesvints driesvints removed the bug-fix label Feb 3, 2023
@driesvints

Copy link
Copy Markdown
Member

@janh-kramer this is already merged into 10.x 👍

@janh-kramer

Copy link
Copy Markdown
Contributor Author

Is this PR then accepted as a candidate for the Bughunt?

@driesvints

Copy link
Copy Markdown
Member

It was sent to 9.x so no sorry.

@janh-kramer

Copy link
Copy Markdown
Contributor Author

come on ..

@driesvints

Copy link
Copy Markdown
Member

I'm very sorry @janh-kramer but the rules are very clear on this. Only bug fixes sent to 10.x are eligible. The contest is specifically to figure out breaking changes in Laravel v10. Bugs that already exist in 9.x need to be solved on that version.

Please see https://blog.laravel.com/laravel-v10-bug-hunt

@janh-kramer

Copy link
Copy Markdown
Contributor Author

no problem thanks, it was just a try. Glad to contribute the first time

@ziming

ziming commented Feb 3, 2023

Copy link
Copy Markdown
Contributor

I dont see any harm in allowing. not like 1 counted PR will win the contest. plus 7 feb is near. but that's just an opinion

@driesvints

Copy link
Copy Markdown
Member

@janh-kramer we very much appreciate your contribution 👍

@ziming the problem with allowing one now would be unfair to everyone else who has sent in bug fixes to 9.x in the past weeks.

@ziming

ziming commented Feb 3, 2023

Copy link
Copy Markdown
Contributor

Ah i see. :)

@GrahamCampbell

Copy link
Copy Markdown
Collaborator

This seems overly complex? Why not just base62 encode the output (by which i mean, divide and ciel the number of chars to determine the number of bytes we need, gen the bytes, base62 encode the output, then trim to the correct length - which will remove either 0 or 1 chars off of the end)?

@GrahamCampbell

Copy link
Copy Markdown
Collaborator

The reason this is correct is that bsse62 can be thought of as stretching out the bytes over the charset we want, which means we get the perfect distribution we want.

@taylorotwell

Copy link
Copy Markdown
Member

@janh-kramer let us think about the bug hunt stuff - so far we only have 3 bug fixes contributed to 10.x which feels a bit lame to me and I think it would be better to include 9.x as well personally. Will revisit this.

chu121su12 pushed a commit to chu121su12/framework that referenced this pull request Feb 6, 2023
…rs (laravel#45916)

* Bugfix: The random function does not generate evenly distributed random characters particularly for short strings (<12 chars)

* fixed styling

* fixed styling

* formatting

---------

Co-authored-by: Jan Kramer <j.kramer@codenker.de>
Co-authored-by: Taylor Otwell <taylor@laravel.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants