Datafaker annotations #675

Closed
snuyanzin opened this issue Feb 8, 2023 · 19 comments

Comments

@snuyanzin
Collaborator

Currently this is just an idea out loud. So feel free to criticize/propose improvements or something completely new.

The idea is to have

  1. a class-level annotation with locale and seed info (optional)
  2. field-level annotations with a method name
  3. after all validations pass, it should be able to generate objects with fields filled according to the Datafaker-specific annotations.

@bodiam
Contributor

bodiam commented Feb 11, 2023

Something like:

@FakeLocale("nl", "nl")
class Person {
    @FakeValue("name.firstName")
    private String firstName;
}

And then:

val persons: List<Person> = faker.generate(Person::class, 100)

?

@kingthorin
Collaborator

I like this idea, but I'll be honest I don't know much about custom annotations.

@snuyanzin
Collaborator Author

kind of

@Fake(country="nl", language="nl", seed = 31415)
class Person {
    @Provider(expression="#{...faker expression here ...}")
    private String firstname;
    @MethodProvider("methodToUse")
    private Address address;
}

The Fake annotation takes locale-specific and seed info.
Provider works only for String fields, since expressions can only result in strings.
MethodProvider works for any object; however, there should be validation that the types match.
There are 2 possibilities here:

  1. the method returns the same type as the field, and then the return value is just set
  2. the method returns a collection or an array, and then only one value is picked and set
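
A rough sketch of how those two cases might be resolved with plain reflection. Everything here is an assumption for illustration: the MethodProvider annotation is a local stand-in, City and its provider methods are invented, and provider methods are assumed to be static and parameterless. It is not how Datafaker implements anything.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Array;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Random;

// Hypothetical annotation: names a static, no-arg method on the same class.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface MethodProvider {
    String value();
}

// Invented example class (wrapper types only; primitives are not handled here).
class City {
    @MethodProvider("names")
    String name;            // case 2: the provider returns a collection

    @MethodProvider("population")
    Integer population;     // case 1: the provider returns the field type

    static List<String> names() { return List.of("Utrecht", "Delft"); }
    static Integer population() { return 42; }
}

class MethodProviderSketch {
    static <T> T fill(T target, Random random) {
        try {
            for (Field field : target.getClass().getDeclaredFields()) {
                MethodProvider provider = field.getAnnotation(MethodProvider.class);
                if (provider == null) {
                    continue; // field is not managed by a provider
                }
                Method method = target.getClass().getDeclaredMethod(provider.value());
                method.setAccessible(true);
                Object value = pickOne(method.invoke(null), field.getType(), random);
                // Validation step: the resolved value must match the field type.
                if (!field.getType().isInstance(value)) {
                    throw new IllegalStateException("Type mismatch for field " + field.getName());
                }
                field.setAccessible(true);
                field.set(target, value);
            }
            return target;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    static Object pickOne(Object result, Class<?> fieldType, Random random) {
        if (fieldType.isInstance(result)) {
            return result; // case 1: same type, use the value as is
        }
        if (result instanceof Collection && !((Collection<?>) result).isEmpty()) {
            List<Object> items = new ArrayList<Object>((Collection<?>) result);
            return items.get(random.nextInt(items.size())); // case 2: pick one element
        }
        if (result != null && result.getClass().isArray() && Array.getLength(result) > 0) {
            return Array.get(result, random.nextInt(Array.getLength(result))); // case 2: arrays
        }
        return result; // left for the type validation in fill() to reject
    }
}
```

Passing the Random in from outside keeps the pick reproducible, which is where a seed attribute could plug in.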

@ssimiao

ssimiao commented Feb 11, 2023

would these annotations stay in the domain class? I believe it would be polluted if that same class has other annotations, but I liked the idea

@snuyanzin
Collaborator Author

in the class which will be faked.
Another way (without annotation pollution) is #513,
or custom builders; however, that requires more code.

@bodiam
Contributor

bodiam commented Feb 12, 2023

I don't really see the need for the top level annotation but yes, I think this could be helpful for classes which are used for generating test data only.

@bodiam
Contributor

bodiam commented Feb 12, 2023

I like this idea, but I'll be honest I don't know much about custom annotations.

The idea is that you create a new annotation class with some fields, just like a normal class, and using reflection you can check if the class has the annotation, get access to it and its fields, and handle them how you see fit. It's less magic than it seems.
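
To make that concrete, here is a small self-contained sketch of the mechanism, using only the JDK. The FakeValue annotation, the AnnotatedPerson class and the GENERATORS map (a fixed-value stand-in for a faker) are all hypothetical names made up for this example, not Datafaker API.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical field-level annotation; it is declared much like an interface.
@Retention(RetentionPolicy.RUNTIME) // without RUNTIME retention, reflection cannot see it
@Target(ElementType.FIELD)
@interface FakeValue {
    String value();
}

class AnnotatedPerson {
    @FakeValue("name.firstName")
    private String firstName;

    private String untouched; // no annotation, so the generator leaves it null

    String getFirstName() { return firstName; }
    String getUntouched() { return untouched; }
}

class AnnotationSketch {
    // Stand-in for a faker: maps an annotation key to a value supplier.
    static final Map<String, Supplier<String>> GENERATORS =
            Map.of("name.firstName", () -> "Sanne");

    static <T> T generate(Class<T> type) {
        try {
            Constructor<T> constructor = type.getDeclaredConstructor();
            constructor.setAccessible(true);
            T instance = constructor.newInstance();
            for (Field field : type.getDeclaredFields()) {
                // Check whether the field carries the annotation and read its value.
                FakeValue fake = field.getAnnotation(FakeValue.class);
                if (fake == null) {
                    continue;
                }
                Supplier<String> generator = GENERATORS.get(fake.value());
                if (generator == null) {
                    throw new IllegalArgumentException("Unknown key: " + fake.value());
                }
                field.setAccessible(true); // allow writing to private fields
                field.set(instance, generator.get());
            }
            return instance;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling AnnotationSketch.generate(AnnotatedPerson.class) returns an instance whose firstName comes from the supplier, while the unannotated field stays null.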

@bodiam
Contributor

bodiam commented Feb 12, 2023

would these annotations stay in the domain class? I believe it would be polluted if that same class has other annotations, but I liked the idea

They would, but I wouldn't recommend mixing production classes, like Hibernate- or Jackson-annotated classes, with classes typically used for testing, like Datafaker classes. But that depends mostly on the use case, I guess.

@snuyanzin
Collaborator Author

yes, the top-level annotation could probably be replaced by field-level ones.
Locale and seed info could be passed there if required

@RVRhub
Contributor

RVRhub commented Feb 19, 2023

MethodProvider maybe also should have some config (for instance seed) or it doesn't make sense because it will be defined in class annotation for Address?

What should the expression look like?:

  • should it be something like Address::street?
  • or should it be some kind of DSL?

@snuyanzin
Collaborator Author

MethodProvider maybe also should have some config (for instance seed) or it doesn't make sense because it will be defined in class annotation for Address?

could be extra attributes for annotations with some predefined defaults

What should the expression look like?:

same as mentioned here https://www.datafaker.net/documentation/expressions/

@RVRhub
Contributor

RVRhub commented Mar 13, 2023

I would like to implement the first version; I already have a small prototype. I will create a PR as soon as possible to continue the discussion.

@snuyanzin
Collaborator Author

go ahead @RVRhub, looking forward to seeing it in datafaker

@snuyanzin
Collaborator Author

snuyanzin commented Apr 2, 2023

After having several chats and POCs with @RVRhub we faced 2 issues:

  1. yes, it looks like the code does become polluted with annotations
  2. it's impossible to have objects of the same class generated in different ways.

To address these issues we decided to have only one annotation, @FakeForSchema, which is a reference to a method providing a default schema to generate objects.

The idea is to build the solution on the Java transformers approach from #513, which can already generate objects.

So the objects could be generated in 2 ways:

1. With the default schema:

MyClass object = Faker.populate(MyClass.class);

This generates an object using the default schema referenced by @FakeForSchema.

2. With an explicit schema:

MyClass object = Faker.populate(MyClass.class, myschema);

This allows generating objects based on other schemas.

In this way it's possible to have a number of different schemas and generate objects based on them.
At the same time, schema definition and object generation are decoupled.
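
A simplified, JDK-only sketch of that decoupling, to show the shape of the idea rather than the actual implementation. The SchemaRef annotation stands in for @FakeForSchema, a plain Map of field names to suppliers stands in for a schema, and Customer and PopulateSketch are invented names; none of this is Datafaker code.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for @FakeForSchema: names a static method returning the default schema.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface SchemaRef {
    String value();
}

@SchemaRef("defaultSchema")
class Customer {
    String name;
    Integer age;

    // The only annotation-related code in the class: one default schema provider.
    static Map<String, Supplier<Object>> defaultSchema() {
        return Map.of("name", () -> "default-name", "age", () -> 30);
    }
}

class PopulateSketch {
    // Way 1: look up the default schema through the class-level annotation.
    static <T> T populate(Class<T> type) {
        SchemaRef ref = type.getAnnotation(SchemaRef.class);
        if (ref == null) {
            throw new IllegalArgumentException(type + " declares no default schema");
        }
        try {
            Method provider = type.getDeclaredMethod(ref.value());
            provider.setAccessible(true);
            @SuppressWarnings("unchecked")
            Map<String, Supplier<Object>> schema =
                    (Map<String, Supplier<Object>>) provider.invoke(null);
            return populate(type, schema);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Way 2: the caller supplies any schema, independent of the annotation.
    static <T> T populate(Class<T> type, Map<String, Supplier<Object>> schema) {
        try {
            Constructor<T> constructor = type.getDeclaredConstructor();
            constructor.setAccessible(true);
            T instance = constructor.newInstance();
            for (Map.Entry<String, Supplier<Object>> entry : schema.entrySet()) {
                Field field = type.getDeclaredField(entry.getKey());
                field.setAccessible(true);
                field.set(instance, entry.getValue().get());
            }
            return instance;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The class carries a single annotation, and any number of alternative schemas can live outside it.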

@snuyanzin
Collaborator Author

merged with #754

@snuyanzin
Collaborator Author

// cc @eliasnogueira since it could be an alternative to several builders

@eliasnogueira
Contributor

eliasnogueira commented Apr 15, 2023

Thank you @snuyanzin for pinging me.
Awesome feature guys! Congrats!

To highlight the differences, here is a comparison with the builder alternative.

Alternative 1 (without using this feature)

Usage of a Model class that implements the Builder pattern. The faker is applied directly to each attribute/field.

Preconditions

  • object implementing the Builder pattern

Advantages

  • No use of extra features (custom methods and classes)
  • Faster execution compared to Alternative 2, around 370 ms

Disadvantages

  • model objects must implement the Builder pattern

public class SimulationDataFactory {
    public static Simulation newSimulation() {
        return new SimulationBuilder().
                name(faker.name().nameWithMiddle()).
                cpf(faker.cpf().valid()).
                email(faker.internet().emailAddress()).
                amount(new BigDecimal(faker.number().numberBetween(MIN_AMOUNT, MAX_AMOUNT))).
                installments(faker.number().numberBetween(MIN_INSTALLMENTS, MAX_INSTALLMENTS)).
                insurance(faker.bool().bool()).build();
    }
}

Alternative 2 (using this feature)

Preconditions

  • raw object (getters and setters only)

Advantages

  • No need to implement the Builder pattern for the Model objects
  • Model objects can be Java records (reducing the boilerplate code and removing Lombok)
  • The schema works as a template to generate data into Models in different ways, without the need for extra methods

Disadvantages

  • fields are associated through string names, so any change to a field name requires a manual update of the schema
  • slower execution compared to Alternative 1, around 500 ms

public class SimulationDataFactory {
    public static Simulation newSimulation() {
        Schema<Object, ?> schema = Schema.of(
                field("name", () -> faker.name().nameWithMiddle()),
                field("cpf", () -> faker.cpf().valid()),
                field("email", () -> faker.internet().emailAddress()),
                field("amount", () -> new BigDecimal(faker.number().numberBetween(MIN_AMOUNT, MAX_AMOUNT))),
                field("installments", () -> faker.number().numberBetween(MIN_INSTALLMENTS, MAX_INSTALLMENTS)),
                field("insurance", () -> faker.bool().bool())
        );

        return (Simulation) new JavaObjectTransformer().apply(Simulation.class, schema);
    }
}
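
One way to soften the string-name disadvantage is to check the schema's field names against the model class before generating anything. The sketch below is only an illustration of that idea using plain reflection; LoanRequest and SchemaNameCheck are hypothetical names, and nothing here is provided by Datafaker.

```java
import java.lang.reflect.Field;
import java.util.Set;
import java.util.TreeSet;

// Invented model class for the example.
class LoanRequest {
    String name;
    String email;
    Integer installments;
}

class SchemaNameCheck {
    // Returns the schema names that match no declared field of the target type.
    static Set<String> unknownNames(Class<?> type, Set<String> schemaNames) {
        Set<String> declared = new TreeSet<>();
        for (Field field : type.getDeclaredFields()) {
            declared.add(field.getName());
        }
        Set<String> unknown = new TreeSet<>(schemaNames);
        unknown.removeAll(declared); // whatever is left has no matching field
        return unknown;
    }
}
```

A test could assert that unknownNames(...) is empty, so a renamed field shows up as a failing check instead of a surprise at generation time.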

@snuyanzin
Collaborator Author

thanks for the summary @eliasnogueira

a minor comment about Alternative 2 (using this feature)

in fact, net.datafaker.providers.base.BaseFaker#populate can be used here,
like

public class SimulationDataFactory {
    public static Simulation newSimulation() {
        Schema<Object, ?> schema = Schema.of(
                field("name", () -> faker.name().nameWithMiddle()),
                field("cpf", () -> faker.cpf().valid()),
                field("email", () -> faker.internet().emailAddress()),
                field("amount", () -> new BigDecimal(faker.number().numberBetween(MIN_AMOUNT, MAX_AMOUNT))),
                field("installments", () -> faker.number().numberBetween(MIN_INSTALLMENTS, MAX_INSTALLMENTS)),
                field("insurance", () -> faker.bool().bool())
        );

        return Faker.populate(Simulation.class, schema);
    }
}

@eliasnogueira
Contributor

@snuyanzin Not yet, as it's not available in 1.8.1 :-(

6 participants