Sequence Determinism When Adding New Property #104

Closed
MarcoTheFirst opened this issue Nov 9, 2017 · 6 comments

@MarcoTheFirst

MarcoTheFirst commented Nov 9, 2017

It is my goal to always create the same sample data, every time I re-create the DB. I’m aware of the following way:

Randomizer.Seed = new Random(8675309);

However, this only yields constant sample data as long as I don’t add another property / RuleFor. As soon as a new property is added, the randomizer’s values are shifted by one, so all following objects and property values become different.
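To illustrate the shift, here is a minimal sketch that uses only System.Random and does not touch Bogus. The class and method names are hypothetical. It assumes each property consumes exactly one draw from a single shared, seeded stream, which is a simplification of what a real generator does.

```csharp
using System;

public static class SeedShiftDemo
{
    // Draws one value per "property" for each of `count` objects from a single
    // seeded stream and returns the value that lands on the first property.
    public static int[] FirstPropertyValues(int seed, int propertiesPerObject, int count)
    {
        var random = new Random(seed);
        var result = new int[count];
        for (var i = 0; i < count; i++)
        {
            result[i] = random.Next(1000);       // first property
            for (var p = 1; p < propertiesPerObject; p++)
            {
                random.Next(1000);               // other properties consume draws too
            }
        }
        return result;
    }

    public static void Main()
    {
        // Same seed, but the second run has one extra property per object.
        var before = FirstPropertyValues(8675309, 2, 3);
        var after = FirstPropertyValues(8675309, 3, 3);
        Console.WriteLine(string.Join(", ", before));
        Console.WriteLine(string.Join(", ", after));
    }
}
```

Only the first object's first property is guaranteed to match between the two runs. Every later object starts at a different position in the stream once the extra property exists.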

Do you already have a workaround for this problem?

If not, I have a suggestion: we could have one Randomizer per “Type.Property” combination. Your internal mappers would create separate Randomizers (each initialized with a constant) for Customer.FirstName and Customer.LastName, and likewise for all other properties. Objects would then always be the same, regardless of how many properties are added or removed.
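As a rough sketch of this idea, independent of Bogus internals: a stable seed can be derived from the "Type.Property" name and used to create one random stream per property. The PropertySeed class, its For method, and the baseSeed parameter are hypothetical names for illustration. How such a seed would be wired into Bogus is not shown here.

```csharp
using System;

public static class PropertySeed
{
    // FNV-1a over the characters of "Type.Property", mixed with a base seed.
    // string.GetHashCode() is avoided because it is not guaranteed to be
    // stable across processes or runtime versions.
    public static int For(string typeName, string propertyName, int baseSeed)
    {
        unchecked
        {
            uint hash = 2166136261u;
            foreach (var c in typeName + "." + propertyName)
            {
                hash ^= c;
                hash *= 16777619u;
            }
            // Clear the sign bit so the result is a non-negative seed.
            return (int)((hash ^ (uint)baseSeed) & 0x7FFFFFFFu);
        }
    }

    public static void Main()
    {
        // One independent stream per property: adding or removing another
        // property does not change which values these two streams produce.
        var firstName = new Random(For("Customer", "FirstName", 1234));
        var lastName = new Random(For("Customer", "LastName", 1234));
        Console.WriteLine(firstName.Next(1000));
        Console.WriteLine(lastName.Next(1000));
    }
}
```

The trade-off is one generator per property instead of one per faker, in exchange for values that depend only on the property name and the object's position in the sequence.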

Is there a way to implement this easily with the current architecture of Bogus, or is this something that requires a core modification?

Thanks,
Marco

@bchavez
Owner

bchavez commented Nov 9, 2017

Hi @MarcoTheFirst ,

Thanks for creating a GitHub issue, it's easier for me to keep track of just in case we need to make changes to Bogus. It also helps out the community/google search when people run across the same problem; we can have something to refer to.

So, you have an interesting problem. I think your issue can be solved by using local seeds instead of relying on the global Randomizer.Seed.

To do that, you'll need to use the .UseSeed method on all Faker<T>s in your application. For example:

var faker = new Faker<Order>()
   .UseSeed(1234)
   .RuleFor(o => o.OrderId, f => f.IndexVariable++)
   .RuleFor(o => o.Item, f => f.Commerce.Product())
   .RuleFor(o => o.Quantity, f => f.Random.Int(1, 5));
   // Append rules for new properties or fields here, before the closing semicolon.

var order = faker.Generate();

This way, all Faker<T> fakers are isolated from one another, each using its own localized seed instead of all fakers drawing from a global static seed. Then, any time you need to add properties/fields, just be sure to add them at the end of the method chain as you encounter them.

Additionally, you can derive from Faker<T> and use your derived class across your entire application too:

var faker = new MyFaker<Order>()
   .RuleFor(o => o.OrderId, f => f.IndexVariable++)
   .RuleFor(o => o.Item, f => f.Commerce.Product())
   .RuleFor(o => o.Quantity, f => f.Random.Int(1, 5));
   // Append rules for new properties or fields here, before the closing semicolon.

class MyFaker<T> : Faker<T> where T : class
{
   public MyFaker()
   {
      this.UseSeed(1234);
   }
}

Let me know if that helps. Please close the issue if you feel this solution works for you.

Thanks,
Brian

@MarcoTheFirst
Author

Hi Brian

Thanks for the interesting approach. However, I think it only works for generating single objects. As soon as I call faker.Generate(5), the random shifting will occur once again within the same faker instance (because adding a property will shift the random number for the second object instance).

The reason I want to get close to 100% reproduction of test data is that I'd like to run automated UI tests with as little effort as possible. For example, if I want to test a full-text search, I need to be sure that the contact "Laura Smith" exists and that it's contact number 356 on page 52 of my list... you get the idea. Of course I could create static objects, but that would create more work in the end and defeat the purpose of having such a great tool as Bogus.

I'll try fiddling around with subclassing faker.. maybe I'll find a suitable method to override. If you have another idea, let me know. Thanks!

Marco

@MarcoTheFirst
Author

OK, I found a way to make this work for more than one object. This just leaves the question whether you see any drawbacks to my solution... maybe performance is not so great because I create a new Randomizer for every object instance, but currently I don't see any other way:

    public class CustomFaker<T> : Faker<T> where T : class
    {
        int startSeed = 0;

        public CustomFaker(string locale = "en") 
            : base(locale)
        {
        }

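        public override T Generate(string ruleSets = null)
        {
            // Give every generated object its own seed, so an added property
            // only affects values within that object, not the following ones.
            startSeed++;
            FakerHub.Random = new Randomizer(startSeed);
            return base.Generate(ruleSets);
        }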
    }

bchavez added a commit that referenced this issue Nov 10, 2017
@bchavez
Owner

bchavez commented Nov 10, 2017

Hi @MarcoTheFirst ,

Ah, thank you for the clarification. Most definitely, your workaround would be the way to get the expected behavior you're looking for. Although, I might subclass it like this:

public class CustomFaker<T> : Faker<T> where T : class
{
   private int seed;
   protected override void PopulateInternal(T instance, string[] ruleSets)
   {
      this.UseSeed(seed++);
      base.PopulateInternal(instance, ruleSets);
   }
}

PopulateInternal hooks your code a little deeper into Bogus, just before the rules start executing, and might help catch other API calls like Populate in case your code ever uses them in the future. Also, I used .UseSeed here to avoid any unexpected behavior when cloning Faker<T>. But it's six of one, half a dozen of the other; your solution should work just fine. I see no immediate problems.
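A usage sketch of the subclass above, to check that appending a rule leaves earlier properties unchanged across several generated objects. The Contact class and the ContactFixtures helper are hypothetical, and the sketch assumes a Bogus version that exposes PopulateInternal with the signature shown in this comment and that runs rules in the order they were added.

```csharp
using System;
using System.Linq;
using Bogus;

// Hypothetical model used only for this illustration.
public class Contact
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Email { get; set; }
}

// Same subclass as above: reseed just before each object is populated.
public class CustomFaker<T> : Faker<T> where T : class
{
    private int seed;

    protected override void PopulateInternal(T instance, string[] ruleSets)
    {
        this.UseSeed(seed++);
        base.PopulateInternal(instance, ruleSets);
    }
}

public static class ContactFixtures
{
    // Generates `count` contacts and returns their first names. When
    // `withEmail` is true, an extra rule is appended at the end of the chain.
    public static string[] FirstNames(int count, bool withEmail)
    {
        Faker<Contact> faker = new CustomFaker<Contact>()
            .RuleFor(c => c.FirstName, f => f.Name.FirstName())
            .RuleFor(c => c.LastName, f => f.Name.LastName());

        if (withEmail)
        {
            faker.RuleFor(c => c.Email, f => f.Internet.Email());
        }

        return faker.Generate(count).Select(c => c.FirstName).ToArray();
    }

    public static void Main()
    {
        var before = FirstNames(5, withEmail: false);
        var after = FirstNames(5, withEmail: true);
        // Prints whether the first names survived the added Email rule.
        Console.WriteLine(before.SequenceEqual(after));
    }
}
```

Because the seed is reset per object, a new rule at the end of the chain should only consume draws after the existing properties have already received their values.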

I recognize your issue as a desirable default behavior for Faker<T> so thanks for bringing this to my attention. I'll keep a mental note of this in case others look for the same behavior too. We might switch to doing this by default in the future, maybe.

Also, I should point out that locales change often upstream in faker.js. So any time we update locales from faker.js, there's always a possibility that Bogus' locale data might change too. Locale data upstream gets added, removed, and updated, and this might break hard-coded assertions like ("Laura Smith", contact number 356, on page 52). So, just something to keep in mind. If you want to keep your Bogus dependency updated to the latest version, I'd recommend avoiding hard-coded assertions as much as possible.

Related: #100, #101

🍫 🍪 🍭 Ronald Jenkees - Stay Crunchy

@MarcoTheFirst
Author

Thanks a lot for your tips and code improvement! Indeed, the update of locales from faker.js could pose a problem down the road.

For unit tests I'd never use hard-coded assertions, but we're using an awesome tool called LeapTest to fully automate our UI testing. There it would be very helpful to re-populate a sample database overnight as close as possible to the previous day's state. Restoring from a static DB backup is not really an option during the development process, because DB schemas change, etc. So Bogus comes in handy, because with identical test data sets we can do UI testing on "known" cases (such as the Laura Smith example mentioned earlier). It would be nearly impossible to create all these UI tests dynamically.

Anyway, that's not part of my initial problem, but maybe helpful to you or other readers finding their way here :-)

Thanks again!

bchavez changed the title from "Keep test data identical after add" to "Sequence Determinism When Adding New Property" on Nov 10, 2017
@bchavez
Owner

bchavez commented Mar 21, 2019

Referencing this issue from twitter: https://twitter.com/sgoguen/status/1108564023363153922
