Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat!: high precision random number generator #2357

Merged
merged 55 commits into from
Feb 27, 2024

Conversation

ST-DDT
Copy link
Member

@ST-DDT ST-DDT commented Aug 29, 2023

Fixes #2355

To Fix

  • fix/reduce duplicates returned from faker.number.int() and faker.number.float()
  • fix faker.number.int() only returning even numbers (31bits out of 53)
  • fix faker.number.int(max) is incapable of returning max value for large maxes

@ST-DDT ST-DDT added c: bug Something isn't working help wanted Extra attention is needed needs test More tests are needed p: 2-high Fix main branch s: needs decision Needs team/maintainer decision m: number Something is referring to the number module labels Aug 29, 2023
@ST-DDT ST-DDT self-assigned this Aug 29, 2023
@codecov
Copy link

codecov bot commented Aug 29, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.57%. Comparing base (0d4cba6) to head (a4b4203).

❗ Current head a4b4203 differs from pull request most recent head 998611b. Consider uploading reports for the commit 998611b to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             next    #2357      +/-   ##
==========================================
- Coverage   99.57%   99.57%   -0.01%     
==========================================
  Files        2846     2815      -31     
  Lines      249892   249870      -22     
  Branches     1055     1066      +11     
==========================================
- Hits       248830   248804      -26     
- Misses       1062     1066       +4     
Files Coverage Δ
src/faker.ts 100.00% <100.00%> (ø)
src/index.ts 100.00% <100.00%> (ø)
src/internal/mersenne.ts 97.58% <100.00%> (+0.14%) ⬆️
src/simple-faker.ts 100.00% <100.00%> (ø)

... and 83 files with indirect coverage changes

@matthewmayer
Copy link
Contributor

Probably also worth checking if genrandRes53CC is significantly slower, or if there are any other downsides to using it.

@ST-DDT
Copy link
Member Author

ST-DDT commented Aug 30, 2023

It's twice as slow.

@matthewmayer
Copy link
Contributor

I think this should be probably considered a breaking change given it causes nearly every snapshot to change. That would also allow for beta testing of the change before 9.0 in case the slower algorithm causes performance problems.

@matthewmayer matthewmayer added the breaking change Cannot be merged when next version is not a major release label Aug 30, 2023
@ST-DDT
Copy link
Member Author

ST-DDT commented Aug 30, 2023

The breaking of the snapshots is caused by next() now effectively taking 2 values from the seed.
We could reduce this if we dynamically request either 32/53 bits of precision depending on the values in use.
e.g. an array with length 15 doesn't need 53 bits of precision for the index.

@ST-DDT
Copy link
Member Author

ST-DDT commented Aug 30, 2023

I'm not that deep into math, but for some reason using two values from the seed increases the variance of the values.
Which increase likelihood of this test to fail: https://github.com/faker-js/faker/actions/runs/6025687970/job/16346941314?pr=2357

@ST-DDT
Copy link
Member Author

ST-DDT commented Aug 30, 2023

I'm not that deep into math, but for some reason using two values from the seed increases the variance of the values. Which increase likelihood of this test to fail: faker-js/faker/actions/runs/6025687970/job/16346941314?pr=2357

Nevermind, I found the error, currently the first and the last element aren't evenly distributed for number.int()

@ST-DDT ST-DDT added p: 1-normal Nothing urgent and removed s: needs decision Needs team/maintainer decision p: 2-high Fix main branch labels Aug 31, 2023
@ST-DDT
Copy link
Member Author

ST-DDT commented Aug 31, 2023

Team Decision

We will change mersenne to 53 bit precision to reduce the duplicates and create a separate PR for the float issues.

test/internal/mersenne/twister.spec.ts Outdated Show resolved Hide resolved
test/internal/mersenne/twister.spec.ts Outdated Show resolved Hide resolved
@ST-DDT ST-DDT requested review from Shinigami92 and a team February 22, 2024 18:39
@ST-DDT ST-DDT linked an issue Feb 23, 2024 that may be closed by this pull request
docs/guide/randomizer.md Outdated Show resolved Hide resolved
docs/guide/upgrading_v9/2357.md Outdated Show resolved Hide resolved
Shinigami92
Shinigami92 previously approved these changes Feb 26, 2024
matthewmayer
matthewmayer previously approved these changes Feb 27, 2024
@ST-DDT ST-DDT removed the s: needs decision Needs team/maintainer decision label Feb 27, 2024
@ST-DDT
Copy link
Member Author

ST-DDT commented Feb 27, 2024

3 approving reviews. Enabling auto merge.

@ST-DDT ST-DDT enabled auto-merge (squash) February 27, 2024 19:25
@ST-DDT ST-DDT requested review from a team February 27, 2024 19:26
@ST-DDT ST-DDT merged commit 4ab0731 into next Feb 27, 2024
14 checks passed
@ST-DDT ST-DDT deleted the fix/mersenne-and-number-precision branch February 27, 2024 20:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking change Cannot be merged when next version is not a major release c: bug Something isn't working m: number Something is referring to the number module p: 1-normal Nothing urgent
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Export the default randomizer faker.datatype.number often gives duplicates
5 participants