Enhance random generator repeatibility #14
Comments
Thank you very much for the very good explanation. I spent quite a bit of time looking for an easy way to correct this behavior, but I think we will have to remove all the classmethod/staticmethod decorators. Then change the faker.generators.Generator:
and a lot of other provider methods. Anyway, I think it will be difficult to use the [...] correctly. Currently, this method seems to be buggy.
I don't think it will be that painful to clean up this issue. I wish I weren't so rusty with Python app development beyond a small single script, or I'd try to put together a pull request for you, but I fear it wouldn't be adequate unless I put more time into it than I have available.

I definitely agree that Generator needs to get an instance of random rather than using the module static. That is step 0. Even that one simple change would remove most of the problem and enable the approach I put at the bottom of my example, where people could just spawn more than one faker object when they want to keep the seed stable. I suggest testing for regressions and then releasing just that.

If that goes well, the next thing I'd recommend is adding two more methods to Generator to allow checkpointing the random instance via getstate and setstate. The end user wouldn't even have to store the state themselves. You would probably be fine introducing it as a kind of transactional system where they call a method like "savestate" before they start doing "weird" stuff, then call "restorestate" when they want to get back to normal. savestate would call getstate and store the state in an instance variable, and restorestate would just call setstate to put the random instance back where it was.

As for your worry about some methods using random a variable number of times, I'd be happy to look at it if you could point me at one, but I suspect they might not be a problem because of the pseudo-random nature of the PRNG. Here is why I think that; let me know if I'm wrong. Since all of the randomness in methods a, b, and c is based on the same RNG, the random instance stored inside Generator, a program that calls a, then b, then c with the same initial seed should still get the same output. The random number between 1 and 5 that b uses to decide how many strings to output will always be the same given the same initial seed.
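To make the proposal above concrete, here is a minimal sketch, not Faker's actual code. `SketchGenerator`, `save_state`, `restore_state`, `digit`, and `words` are hypothetical names chosen for illustration; only `random.Random`, `getstate`, and `setstate` come from the standard library. It assumes each generator owns one private `random.Random` instance.

```python
import random


class SketchGenerator:
    """Hypothetical stand-in for faker.generators.Generator (illustration only)."""

    WORDS = ["alpha", "beta", "gamma", "delta"]

    def __init__(self, seed=None):
        # Step 0: a private RNG instance instead of the module-level functions.
        self._random = random.Random(seed)
        self._saved_state = None

    def seed(self, value):
        # Reseeds only this generator, never the global random module.
        self._random.seed(value)

    def save_state(self):
        # Checkpoint the RNG; the caller does not have to hold the state.
        self._saved_state = self._random.getstate()

    def restore_state(self):
        # Roll the RNG back to the last checkpoint.
        if self._saved_state is None:
            raise RuntimeError("save_state() has not been called")
        self._random.setstate(self._saved_state)

    def digit(self):
        return self._random.randint(0, 9)

    def words(self):
        # Uses the RNG a variable number of times: the count is itself drawn
        # from the same RNG, so it is reproducible for a given seed.
        count = self._random.randint(1, 5)
        return [self._random.choice(self.WORDS) for _ in range(count)]


def run(seed):
    # Calls "a, then b, then c" on a freshly seeded generator.
    gen = SketchGenerator(seed)
    return gen.digit(), gen.words(), gen.digit()
```

With this shape, two calls to `run` with the same seed return the same tuple even though `words` consumes a variable number of random values, and anything generated between `save_state()` and `restore_state()` does not shift the values that follow the restore.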
Just to jump in with some initial assessment. @deinspanjer is correct. Checking the current state of the seed would be optimal. Based on my assessment, it seems that you have one single instance of the Generator object per factory. Thus, I have a possible simpler idea...
Seeding the random module will seed that module globally -- this affects calls to random outside the faker package. Seeding an instance of the Random class will not affect other calls to the random module. This addresses one concern in issue joke2k#14.
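The difference described in that commit message can be checked with the standard library alone. This is a small sketch under that assumption; the function names are hypothetical, and the "library" is simulated by a single seeding call rather than by Faker itself.

```python
import random


def caller_sequence(library_step):
    """Return three values the caller draws, with a library step in between."""
    random.seed(99)              # the caller seeds the global module
    values = [random.random()]
    library_step()               # something a library does mid-sequence
    values.append(random.random())
    values.append(random.random())
    return values


def do_nothing():
    pass


def seed_module_globally():
    # Reseeds the shared generator behind random.random(), random.choice(), ...
    random.seed(0)


def seed_private_instance():
    # Seeds and uses an independent generator; the global one is untouched.
    private = random.Random(0)
    private.random()


baseline = caller_sequence(do_nothing)
after_global_seed = caller_sequence(seed_module_globally)
after_private_seed = caller_sequence(seed_private_instance)
```

Here `after_private_seed` matches `baseline`, while `after_global_seed` diverges from it after the first value, which is the concern the commit addresses.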
Fixed in #259
The docs mention being able to call the seed() method so you can use a generated dataset as part of a unit test.
Due to the way Faker uses the random module, this use case is a bit fragile. Any change to the data requested, or any outside use of the random module during generation, will cause the dataset to diverge.
Here is a quick script demonstrating the problem along with a couple of potential solutions:
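The script itself is not reproduced on this page. As a stand-in, the following sketch shows the same kind of divergence using only the standard library. It is not the original script and does not call Faker; `fake_name` is a hypothetical provider-like function that, by assumption, draws from the module-level `random` functions.

```python
import random

NAMES = ["Ada", "Grace", "Linus", "Guido"]


def fake_name():
    # Hypothetical provider method that relies on the global random module.
    return random.choice(NAMES) + "-" + str(random.randint(0, 10**6))


def dataset(outside_call=False):
    random.seed(1234)            # the "seed for a repeatable test dataset" step
    rows = [fake_name()]
    if outside_call:
        random.random()          # any unrelated use of random during generation
    rows.append(fake_name())
    rows.append(fake_name())
    return rows


stable = dataset()
repeated = dataset()
disturbed = dataset(outside_call=True)
```

`stable` and `repeated` are identical, but `disturbed` shares only its first row with them: one extra draw from the shared generator shifts every value produced afterwards.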