The docs mention being able to call the seed() method so you can use a generated dataset as part of a unit test.
Due to the way Faker uses the random module, this use case is a bit fragile. Any change to the data requested, or any outside use of the random module during generation, will cause the dataset to diverge.
Here is a quick script demonstrating the problem along with a couple of potential solutions:
import random
from faker import Faker
fake = Faker()
# initial run
fake.seed(1234)
print fake.name()
print fake.name()
print fake.name()
# repeated run with same data
fake.seed(1234)
print fake.name()
print fake.name()
print fake.name()
# adding new fake calls prevents us from getting the same names we had originally
fake.seed(1234)
print fake.name(), fake.email()
print fake.name(), fake.email()
print fake.name(), fake.email()
# One option is a preserve/restore mechanism: save the random state before the extra call and restore it afterwards, so the original sequence of names is unaffected
fake.seed(1234)
print fake.name()
r = random.getstate()
print fake.email()
random.setstate(r)
print fake.name()
r = random.getstate()
print fake.email()
random.setstate(r)
print fake.name()
r = random.getstate()
print fake.email()
random.setstate(r)
# A similar problem arises if the program using faker happens to use a non-instance random call during generation.
# The best way to prevent this issue is to have faker use an instance of random rather than the module version.
# If faker used an instance version of random, you could also resolve the original problem by using a separate faker instance for each kind of data:
fake.seed(1234)
fake2 = Faker()
fake2.seed(1234)
print fake.name(), fake2.email()
print fake.name(), fake2.email()
print fake.name(), fake2.email()
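To make the two proposals concrete without depending on Faker internals, here is a minimal sketch that uses only the standard library's random module. Everything in it is a stand-in: `preserved_random_state`, `shared_module_draws`, and `per_instance_draws` are hypothetical names, `randint` plays the role of `fake.name()`, and `random()` plays the role of `fake.email()`. It is not how Faker is implemented, just an illustration of the behaviour described above.

```python
import contextlib
import random


@contextlib.contextmanager
def preserved_random_state():
    # Hypothetical helper: snapshot the module-level random state and
    # put it back on exit, so draws made inside the block leave no trace.
    state = random.getstate()
    try:
        yield
    finally:
        random.setstate(state)


def shared_module_draws(seed, extra_draws, preserve=False):
    # Stand-in for generators that all share the module-level random.
    random.seed(seed)
    names = []
    for _ in range(3):
        names.append(random.randint(0, 10**6))  # plays the role of fake.name()
        if preserve:
            with preserved_random_state():
                for _ in range(extra_draws):
                    random.random()  # plays the role of fake.email()
        else:
            for _ in range(extra_draws):
                random.random()
    return names


def per_instance_draws(seed, extra_draws):
    # Stand-in for generators that each own a random.Random instance.
    name_rng = random.Random(seed)
    email_rng = random.Random(seed)
    names = []
    for _ in range(3):
        names.append(name_rng.randint(0, 10**6))
        for _ in range(extra_draws):
            email_rng.random()  # does not touch name_rng
    return names
```

With the shared module state, adding one extra draw per iteration changes the "names" after the first one. Wrapping the extra draws in `preserved_random_state()`, or giving each stream its own `random.Random` instance, keeps the original sequence. The instance approach also protects against unrelated code calling the module-level functions, which the preserve/restore approach cannot do unless every such call is wrapped.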