feat!: high precision random number generator (#2357)
ST-DDT committed Feb 27, 2024
1 parent 0d4cba6 commit 4ab0731
Showing 35 changed files with 1,821 additions and 1,679 deletions.
17 changes: 17 additions & 0 deletions docs/guide/randomizer.md
@@ -12,6 +12,23 @@ There are two connected use cases we have considered where this might be needed:
1. Re-Use of the same `Randomizer` within multiple `Faker` instances.
2. The use of a random number generator from a third party library.

## Built-In `Randomizer`s

Faker ships with two built-in variations:

```ts
import {
  generateMersenne32Randomizer, // Default prior to v9
  generateMersenne53Randomizer, // Default since v9
} from '@faker-js/faker';

const randomizer = generateMersenne53Randomizer();
```

The 32-bit `Randomizer` is faster, but the 53-bit `Randomizer` generates higher-precision random values (with significantly fewer duplicates).

You can also provide your own by implementing the [`Randomizer` interface](/api/randomizer.html).
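
As a rough illustration of a custom implementation, the sketch below builds a randomizer around a simple linear congruential generator. The local `Randomizer` interface only mirrors the `next()` and `seed()` methods that the built-in factories in this commit return. The name `createLcgRandomizer` and the LCG itself are assumptions made for the example, not part of Faker, and an LCG is far weaker than the Mersenne Twister.

```ts
// Local copy of the shape used by the built-in randomizers (next + seed),
// declared here only so the sketch is self-contained.
interface Randomizer {
  next(): number;
  seed(seed: number | number[]): void;
}

// Hypothetical factory, for illustration only.
function createLcgRandomizer(): Randomizer {
  let state = 1;

  return {
    next(): number {
      // Linear congruential step; `>>> 0` keeps the state an unsigned 32-bit integer.
      state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
      // Scale into the half-open interval [0, 1).
      return state / 0x100000000;
    },
    seed(seed: number | number[]): void {
      const values = typeof seed === 'number' ? [seed] : seed;
      // Fold all seed values into a single unsigned 32-bit state.
      state = values.reduce((acc, v) => (Math.imul(acc, 31) + v) >>> 0, 0);
    },
  };
}

const custom = createLcgRandomizer();
custom.seed(42);
console.log(custom.next(), custom.next());
```

An object like this would presumably be passed as the `randomizer` option at construction time, in the same way as the built-in factories.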

## Using `Randomizer`s

A `Randomizer` has to be set during construction of the instance:
72 changes: 72 additions & 0 deletions docs/guide/upgrading_v9/2357.md
@@ -0,0 +1,72 @@
# Use High Precision RNG by default

TL;DR: Many Faker methods will return a different result in v9 compared to v8 for the same seed.

In v9 we switch from a 32-bit random value to a 53-bit random value.
We don't change the underlying algorithm much, but each step now consumes two values from the generator instead of one.
This affects generated values in two ways:

- In large lists or long numbers the values are spread more evenly.
This also reduces the number of duplicates generated.
For `faker.number.int()`, this reduces the rate of duplicates from `1 / 10_000` to less than `1 / 8_000_000`.
- If you generate values from the same initial seed, you might see different results than before.
This is because we now work with a higher precision, which affects how numbers are rounded.
As a result, the methods might produce slightly different outcomes.
And since each result now consumes two generator values, subsequent results appear to skip every other value of the old sequence.

```ts
import {
  SimpleFaker,
  generateMersenne32Randomizer,
  generateMersenne53Randomizer,
} from '@faker-js/faker';

// < v9 default
const f32 = new SimpleFaker({ randomizer: generateMersenne32Randomizer() });
f32.seed(123);
const r32 = f32.helpers.multiple(() => f32.number.int(10), { count: 10 });
// >= v9 default
const f53 = new SimpleFaker({ randomizer: generateMersenne53Randomizer() });
f53.seed(123);
const r53 = f53.helpers.multiple(() => f53.number.int(10), { count: 5 });

// Illustrative only: `diff` is not a Faker function; lines marked [!code --] appear only in r32.
diff(r32, r53);
//[
// 7,
// 7, // [!code --]
// 3,
// 4, // [!code --]
// 2,
// 7, // [!code --]
// 6,
// 7, // [!code --]
// 7,
// 5, // [!code --]
//]
```
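
The "two values per step" behavior described above can be sketched as follows. This assumes the 53-bit value is assembled the way the reference Mersenne Twister `genrand_res53` does it: the top 27 bits of one 32-bit draw and the top 26 bits of the next. The helper names `makeUint32Source`, `next32` and `next53` are hypothetical, and the 32-bit source is a simple stand-in rather than Faker's twister.

```ts
// Hypothetical stand-in for a 32-bit generator; any function returning
// unsigned 32-bit integers would do for this illustration.
function makeUint32Source(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state;
  };
}

// 32 bits of precision: one draw per result.
function next32(nextUint32: () => number): number {
  return nextUint32() / 0x100000000; // divide by 2^32
}

// 53 bits of precision: two draws per result.
function next53(nextUint32: () => number): number {
  const high = nextUint32() >>> 5; // keep the top 27 bits
  const low = nextUint32() >>> 6; // keep the top 26 bits
  return (high * 0x4000000 + low) / 0x20000000000000; // (high * 2^26 + low) / 2^53
}

const source32 = makeUint32Source(123);
const source53 = makeUint32Source(123);

// The i-th 53-bit result starts from the same draw as the (2i)-th 32-bit result,
// so the old sequence appears to lose every other value.
console.log([next32(source32), next32(source32), next32(source32)]);
console.log([next53(source53), next53(source53)]);
```

Because the leading bits of each 53-bit result come from a single 32-bit draw, a result often rounds to the same small integer as the corresponding 32-bit result, while the extra low bits change values that are mapped onto larger ranges.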

## Adoption

If you don't have any seeded tests and just want some random values, then you don't have to change anything.

If you have seeded tests, you have to update most test snapshots or similar comparisons to new values.

If you are using vitest, you can do that using `pnpm vitest run -u`.

## Keeping the old behavior

You can keep the old behavior by creating your own `Faker` instance
and passing it a `Randomizer` created by the `generateMersenne32Randomizer()` function.

```ts{8}
import {
  Faker,
  generateMersenne32Randomizer, // < v9 default
  generateMersenne53Randomizer, // >= v9 default
} from '@faker-js/faker';
const faker = new Faker({
  randomizer: generateMersenne32Randomizer(),
  ...
});
```
2 changes: 1 addition & 1 deletion src/faker.ts
@@ -152,7 +152,7 @@ export class Faker extends SimpleFaker {
* Specify this only if you want to use it to achieve a specific goal,
* such as sharing the same random generator with other instances/tools.
*
* @default generateMersenne32Randomizer()
* @default generateMersenne53Randomizer()
*/
randomizer?: Randomizer;
}) {
4 changes: 4 additions & 0 deletions src/index.ts
@@ -72,6 +72,10 @@ export type {
export { FakerError } from './errors/faker-error';
export { Faker } from './faker';
export type { FakerOptions } from './faker';
export {
generateMersenne32Randomizer,
generateMersenne53Randomizer,
} from './internal/mersenne';
export * from './locale';
export { fakerEN as faker } from './locale';
export * from './locales';
27 changes: 24 additions & 3 deletions src/internal/mersenne.ts
@@ -328,9 +328,7 @@ export class MersenneTwister19937 {

/**
* Generates a MersenneTwister19937 randomizer with 32 bits of precision.
* This is the default randomizer used by Faker.
*
* @internal
* This is the default randomizer used by faker prior to v9.0.
*/
export function generateMersenne32Randomizer(): Randomizer {
const twister = new MersenneTwister19937();
@@ -350,3 +348,26 @@ export function generateMersenne32Randomizer(): Randomizer {
},
};
}

/**
* Generates a MersenneTwister19937 randomizer with 53 bits of precision.
* This is the default randomizer used by faker starting with v9.0.
*/
export function generateMersenne53Randomizer(): Randomizer {
const twister = new MersenneTwister19937();

twister.initGenrand(Math.ceil(Math.random() * Number.MAX_SAFE_INTEGER));

return {
next(): number {
return twister.genrandRes53();
},
seed(seed: number | number[]): void {
if (typeof seed === 'number') {
twister.initGenrand(seed);
} else if (Array.isArray(seed)) {
twister.initByArray(seed, seed.length);
}
},
};
}
6 changes: 3 additions & 3 deletions src/simple-faker.ts
@@ -1,4 +1,4 @@
import { generateMersenne32Randomizer } from './internal/mersenne';
import { generateMersenne53Randomizer } from './internal/mersenne';
import { DatatypeModule } from './modules/datatype';
import { SimpleDateModule } from './modules/date';
import { SimpleHelpersModule } from './modules/helpers';
@@ -117,12 +117,12 @@ export class SimpleFaker {
* Specify this only if you want to use it to achieve a specific goal,
* such as sharing the same random generator with other instances/tools.
*
* @default generateMersenne32Randomizer()
* @default generateMersenne53Randomizer()
*/
randomizer?: Randomizer;
} = {}
) {
const { randomizer = generateMersenne32Randomizer() } = options;
const { randomizer = generateMersenne53Randomizer() } = options;

this._randomizer = randomizer;
}
12 changes: 12 additions & 0 deletions test/internal/__snapshots__/mersenne.spec.ts.snap
@@ -11,3 +11,15 @@ exports[`generateMersenne32Randomizer() > seed: 42 > should return deterministic
exports[`generateMersenne32Randomizer() > seed: 1211 > should return deterministic value for next() 1`] = `0.9285201537422836`;

exports[`generateMersenne32Randomizer() > seed: 1337 > should return deterministic value for next() 1`] = `0.2620246761944145`;

exports[`generateMersenne53Randomizer() > seed: [42,1,2] > should return deterministic value for next() 1`] = `0.8562037477947296`;

exports[`generateMersenne53Randomizer() > seed: [1211,1,2] > should return deterministic value for next() 1`] = `0.8916433279801969`;

exports[`generateMersenne53Randomizer() > seed: [1337,1,2] > should return deterministic value for next() 1`] = `0.17990487224060836`;

exports[`generateMersenne53Randomizer() > seed: 42 > should return deterministic value for next() 1`] = `0.3745401188473625`;

exports[`generateMersenne53Randomizer() > seed: 1211 > should return deterministic value for next() 1`] = `0.9285201539025842`;

exports[`generateMersenne53Randomizer() > seed: 1337 > should return deterministic value for next() 1`] = `0.2620246750155817`;
8 changes: 6 additions & 2 deletions test/internal/mersenne.spec.ts
@@ -2,6 +2,7 @@ import { beforeAll, beforeEach, describe, expect, it } from 'vitest';
import {
MersenneTwister19937,
generateMersenne32Randomizer,
generateMersenne53Randomizer,
} from '../../src/internal/mersenne';
import type { Randomizer } from '../../src/randomizer';
import { seededRuns } from '../support/seeded-runs';
@@ -84,8 +85,11 @@ describe('MersenneTwister19937', () => {
});
});

describe('generateMersenne32Randomizer()', () => {
const randomizer: Randomizer = generateMersenne32Randomizer();
describe.each([
['generateMersenne32Randomizer()', generateMersenne32Randomizer],
['generateMersenne53Randomizer()', generateMersenne53Randomizer],
])('%s', (_, factory) => {
const randomizer: Randomizer = factory();

it('should return a result matching the interface', () => {
expect(randomizer).toBeDefined();
82 changes: 41 additions & 41 deletions test/modules/__snapshots__/airline.spec.ts.snap
@@ -23,33 +23,33 @@ exports[`airline > 42 > airport 1`] = `
}
`;

exports[`airline > 42 > flightNumber > flightNumber addLeadingZeros 1`] = `"0089"`;
exports[`airline > 42 > flightNumber > flightNumber addLeadingZeros 1`] = `"0097"`;

exports[`airline > 42 > flightNumber > flightNumber length 2 to 4 1`] = `"891"`;
exports[`airline > 42 > flightNumber > flightNumber length 2 to 4 1`] = `"975"`;

exports[`airline > 42 > flightNumber > flightNumber length 2 to 4 and addLeadingZeros 1`] = `"0891"`;
exports[`airline > 42 > flightNumber > flightNumber length 2 to 4 and addLeadingZeros 1`] = `"0975"`;

exports[`airline > 42 > flightNumber > flightNumber length 3 1`] = `"479"`;
exports[`airline > 42 > flightNumber > flightNumber length 3 1`] = `"497"`;

exports[`airline > 42 > flightNumber > flightNumber length 3 and addLeadingZeros 1`] = `"0479"`;
exports[`airline > 42 > flightNumber > flightNumber length 3 and addLeadingZeros 1`] = `"0497"`;

exports[`airline > 42 > flightNumber > noArgs 1`] = `"89"`;
exports[`airline > 42 > flightNumber > noArgs 1`] = `"97"`;

exports[`airline > 42 > recordLocator > allowNumerics 1`] = `"DTY7RT"`;
exports[`airline > 42 > recordLocator > allowNumerics 1`] = `"DYRM66"`;

exports[`airline > 42 > recordLocator > allowVisuallySimilarCharacters 1`] = `"JUYETU"`;
exports[`airline > 42 > recordLocator > allowVisuallySimilarCharacters 1`] = `"JYTPEE"`;

exports[`airline > 42 > recordLocator > both allowNumerics and allowVisuallySimilarCharacters 1`] = `"DSY6QS"`;
exports[`airline > 42 > recordLocator > both allowNumerics and allowVisuallySimilarCharacters 1`] = `"DYQL55"`;

exports[`airline > 42 > recordLocator > noArgs 1`] = `"JVYETU"`;
exports[`airline > 42 > recordLocator > noArgs 1`] = `"JYTQDD"`;

exports[`airline > 42 > seat > aircraftType narrowbody 1`] = `"14E"`;
exports[`airline > 42 > seat > aircraftType narrowbody 1`] = `"14F"`;

exports[`airline > 42 > seat > aircraftType regional 1`] = `"8D"`;

exports[`airline > 42 > seat > aircraftType widebody 1`] = `"23H"`;
exports[`airline > 42 > seat > aircraftType widebody 1`] = `"23K"`;

exports[`airline > 42 > seat > noArgs 1`] = `"14E"`;
exports[`airline > 42 > seat > noArgs 1`] = `"14F"`;

exports[`airline > 1211 > aircraftType 1`] = `"widebody"`;

@@ -74,33 +74,33 @@ exports[`airline > 1211 > airport 1`] = `
}
`;

exports[`airline > 1211 > flightNumber > flightNumber addLeadingZeros 1`] = `"5872"`;
exports[`airline > 1211 > flightNumber > flightNumber addLeadingZeros 1`] = `"9296"`;

exports[`airline > 1211 > flightNumber > flightNumber length 2 to 4 1`] = `"5872"`;
exports[`airline > 1211 > flightNumber > flightNumber length 2 to 4 1`] = `"9296"`;

exports[`airline > 1211 > flightNumber > flightNumber length 2 to 4 and addLeadingZeros 1`] = `"5872"`;
exports[`airline > 1211 > flightNumber > flightNumber length 2 to 4 and addLeadingZeros 1`] = `"9296"`;

exports[`airline > 1211 > flightNumber > flightNumber length 3 1`] = `"948"`;
exports[`airline > 1211 > flightNumber > flightNumber length 3 1`] = `"982"`;

exports[`airline > 1211 > flightNumber > flightNumber length 3 and addLeadingZeros 1`] = `"0948"`;
exports[`airline > 1211 > flightNumber > flightNumber length 3 and addLeadingZeros 1`] = `"0982"`;

exports[`airline > 1211 > flightNumber > noArgs 1`] = `"5872"`;
exports[`airline > 1211 > flightNumber > noArgs 1`] = `"9296"`;

exports[`airline > 1211 > recordLocator > allowNumerics 1`] = `"XGWT86"`;
exports[`airline > 1211 > recordLocator > allowNumerics 1`] = `"XW8ZPQ"`;

exports[`airline > 1211 > recordLocator > allowVisuallySimilarCharacters 1`] = `"YLXUFD"`;
exports[`airline > 1211 > recordLocator > allowVisuallySimilarCharacters 1`] = `"YXFZRR"`;

exports[`airline > 1211 > recordLocator > both allowNumerics and allowVisuallySimilarCharacters 1`] = `"XGWS84"`;
exports[`airline > 1211 > recordLocator > both allowNumerics and allowVisuallySimilarCharacters 1`] = `"XW8ZOO"`;

exports[`airline > 1211 > recordLocator > noArgs 1`] = `"YMXUFC"`;
exports[`airline > 1211 > recordLocator > noArgs 1`] = `"YXFZSS"`;

exports[`airline > 1211 > seat > aircraftType narrowbody 1`] = `"33C"`;
exports[`airline > 1211 > seat > aircraftType narrowbody 1`] = `"33F"`;

exports[`airline > 1211 > seat > aircraftType regional 1`] = `"19B"`;
exports[`airline > 1211 > seat > aircraftType regional 1`] = `"19D"`;

exports[`airline > 1211 > seat > aircraftType widebody 1`] = `"56E"`;
exports[`airline > 1211 > seat > aircraftType widebody 1`] = `"56J"`;

exports[`airline > 1211 > seat > noArgs 1`] = `"33C"`;
exports[`airline > 1211 > seat > noArgs 1`] = `"33F"`;

exports[`airline > 1337 > aircraftType 1`] = `"narrowbody"`;

@@ -125,30 +125,30 @@ exports[`airline > 1337 > airport 1`] = `
}
`;

exports[`airline > 1337 > flightNumber > flightNumber addLeadingZeros 1`] = `"0061"`;
exports[`airline > 1337 > flightNumber > flightNumber addLeadingZeros 1`] = `"0022"`;

exports[`airline > 1337 > flightNumber > flightNumber length 2 to 4 1`] = `"61"`;
exports[`airline > 1337 > flightNumber > flightNumber length 2 to 4 1`] = `"22"`;

exports[`airline > 1337 > flightNumber > flightNumber length 2 to 4 and addLeadingZeros 1`] = `"0061"`;
exports[`airline > 1337 > flightNumber > flightNumber length 2 to 4 and addLeadingZeros 1`] = `"0022"`;

exports[`airline > 1337 > flightNumber > flightNumber length 3 1`] = `"351"`;
exports[`airline > 1337 > flightNumber > flightNumber length 3 1`] = `"312"`;

exports[`airline > 1337 > flightNumber > flightNumber length 3 and addLeadingZeros 1`] = `"0351"`;
exports[`airline > 1337 > flightNumber > flightNumber length 3 and addLeadingZeros 1`] = `"0312"`;

exports[`airline > 1337 > flightNumber > noArgs 1`] = `"61"`;
exports[`airline > 1337 > flightNumber > noArgs 1`] = `"22"`;

exports[`airline > 1337 > recordLocator > allowNumerics 1`] = `"AK68AJ"`;
exports[`airline > 1337 > recordLocator > allowNumerics 1`] = `"A6AGBJ"`;

exports[`airline > 1337 > recordLocator > allowVisuallySimilarCharacters 1`] = `"GOEFHO"`;
exports[`airline > 1337 > recordLocator > allowVisuallySimilarCharacters 1`] = `"GEHLIN"`;

exports[`airline > 1337 > recordLocator > both allowNumerics and allowVisuallySimilarCharacters 1`] = `"9K57AJ"`;
exports[`airline > 1337 > recordLocator > both allowNumerics and allowVisuallySimilarCharacters 1`] = `"95AGBI"`;

exports[`airline > 1337 > recordLocator > noArgs 1`] = `"GPDEGP"`;
exports[`airline > 1337 > recordLocator > noArgs 1`] = `"GDGMHN"`;

exports[`airline > 1337 > seat > aircraftType narrowbody 1`] = `"10D"`;
exports[`airline > 1337 > seat > aircraftType narrowbody 1`] = `"10A"`;

exports[`airline > 1337 > seat > aircraftType regional 1`] = `"6C"`;
exports[`airline > 1337 > seat > aircraftType regional 1`] = `"6A"`;

exports[`airline > 1337 > seat > aircraftType widebody 1`] = `"16F"`;
exports[`airline > 1337 > seat > aircraftType widebody 1`] = `"16B"`;

exports[`airline > 1337 > seat > noArgs 1`] = `"10D"`;
exports[`airline > 1337 > seat > noArgs 1`] = `"10A"`;
