Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for regular expressions when specifying a field name #322

Closed
wants to merge 1 commit into from
Closed

Add support for regular expressions when specifying a field name #322

wants to merge 1 commit into from

Conversation

kkurczewski
Copy link

Hi,
Change is simple, allow to build field definitions using patterns.

I work with many large schemas that has many plain Strings fields. Random mess in each of these doesn't look good as example data. There is a lot of different context IDs, names, etc. Sadly as mentioned I can't differentiate these by type.

I could list them by hand but it would be way easier to use generic regex for them.

Tests are green, most impacted are FieldDefinition constructors, on the other hand FieldDefinitionBuilder still allow to build definitions old way.

:)

@coveralls
Copy link

Coverage Status

Coverage increased (+0.01%) to 90.242% when pulling a331fa7 on kkurczewski:master into 87ba0e6 on benas:master.

@fmbenhassine
Copy link
Member

Hi @kkurczewski ,

Thank you for this PR! If I understand correctly, the goal is to be able to use a regexp in the name of a field definition in order to match multiple fields and use the same randomizer for all of them. An example would be:

@lombok.Data
public class Foo {
    private String name1;
    private String name2;
    private String job;
    private Bar bar;
}

@lombok.Data
public class Bar {
    private String name3;
    private String address;
}

@Test
void testFieldDefinitionWithNameAsRegexp() {
    // given
    EnhancedRandom enhancedRandom = new EnhancedRandomBuilder()
            .randomize(
                    field().named("name*").ofType(String.class).get(),
                    (Randomizer<String>) () -> "foo")
            .build();

    // when
    Foo foo = enhancedRandom.nextObject(Foo.class);

    // then
    assertThat(foo.getName1()).isEqualTo("foo");
    assertThat(foo.getName2()).isEqualTo("foo");
    assertThat(foo.getBar().getName3()).isEqualTo("foo");
    assertThat(foo.getJob()).isNotEqualTo("foo");
    assertThat(foo.getBar().getAddress()).isNotEqualTo("foo");
}

Is that correct?

If my understanding is correct, then it s possible to achieve this with the following change in AbstractRandomizerRegistry:

     protected boolean hasName(final Field field, final String name) {
-        return name == null || field.getName().equals(name);
+        return name == null || java.util.regex.Pattern.compile(name).asPredicate().test(field.getName());
     }

This change makes the previous test pass and is backward compatible.

My concern is that changes suggested in this PR are breaking changes (public constructors of FieldDefinition have changed and the protected method AbstractRandomizerRegistry#hasName has been renamed to matches).

What do you think?

Kr,
Mahmoud

@kkurczewski
Copy link
Author

Hi @benas,

Yes, you got the intent right and yours solution is way simpler than mine.

But one thing was buggin' me. In your example name* should be name.*, but test passes right?

I found counter intuitive behavior of Pattern.compile(name).asPredicate() which is problematic.
This predicate returns true for first substring that match given pattern.

Even more strange behavior, this line will fail:
assertFalse(Pattern.compile("n*me").asPredicate().test("name"));

So in case of pattern matching we should rely only on:
Pattern.compile(name).matcher(field.getName()).matches();

One thing that may be drawback is that Pattern compilation is not persisted between uses. But I'm not sure how much it will affect execution.

@fmbenhassine
Copy link
Member

fmbenhassine commented Feb 11, 2019

Hi @kkurczewski

Thank you for your feedback!

But one thing was buggin' me. In your example name* should be name.*, but test passes right?

True. name.* is more intuitive: It means "name" followed by any character (the dot) zero or more times. While name* means "nam" followed by zero or more "e". The test still passes because Pattern.compile("name*").asPredicate().test("name1") is true.

I found counter intuitive behavior of Pattern.compile(name).asPredicate() which is problematic.
This predicate returns true for first substring that match given pattern.

Even more strange behavior, this line will fail:
assertFalse(Pattern.compile("n*me").asPredicate().test("name"));

So in case of pattern matching we should rely only on:
Pattern.compile(name).matcher(field.getName()).matches();

That's how regular expressions work in Java and it is not really the business of RB. As a library developer, I will document the method FieldDefinition#name saying that it accepts a regular expression as defined in java.util.regex.Pattern.compile.

One thing that may be drawback is that Pattern compilation is not persisted between uses. But I'm not sure how much it will affect execution.

That's true and I thought about it too. We can't do anything about it (like storing the compiled pattern as a field in the state of the registry) because the method might be called multiple times with different names. I think that's not a bid deal for now, let's try to not optimize prematurely and tackle any performance issue lazily when reported (if ever).

If you agree with the change I suggested, would you mind if I apply it? Otherwise, please update the PR accordingly and I will merge it.

Kr,
Mahmoud

@kkurczewski
Copy link
Author

Not sure I we are on the same page with regex behavior, say I use Pattern#asPredicate() method with this snippet:

 EnhancedRandom enhancedRandom = new EnhancedRandomBuilder()
            .randomize(
                    field().named("name").ofType(String.class).get(),
                    (Randomizer<String>) () -> "foo")
            .build();

Now name, name1 and surname will be set to foo even if I don't want to use regex.

Conclusion, Pattern#matcher() is safe whereas Pattern#asPredicate() is not.

If you agree with the change I suggested, would you mind if I apply it?

I don't mind. Feel free to apply it :)

@fmbenhassine
Copy link
Member

Looks like I missed the point Pattern#matcher() vs Pattern#asPredicate(). Here is the test that should IMO pass to accept the feature:

import io.github.benas.randombeans.api.EnhancedRandom;
import io.github.benas.randombeans.api.Randomizer;
import org.junit.jupiter.api.Test;

import static io.github.benas.randombeans.FieldDefinitionBuilder.field;
import static org.assertj.core.api.Assertions.assertThat;

public class Issue322 {

    @lombok.Data
    public class Foo {
        private String name1;
        private String name2;
        private String nickname;
        private String job;
        private Bar bar;
    }

    @lombok.Data
    public class Bar {
        private String name;
        private String address;
    }

    @Test
    void testFieldDefinitionWithNameAsRegexp() {
        // given
        EnhancedRandom enhancedRandom = new EnhancedRandomBuilder()
                .randomize(
                        field().named("name.*").ofType(String.class).get(),
                        (Randomizer<String>) () -> "foo")
                .build();

        // when
        Foo foo = enhancedRandom.nextObject(Foo.class);

        // then
        assertThat(foo.getName1()).isEqualTo("foo");
        assertThat(foo.getName2()).isEqualTo("foo");
        assertThat(foo.getBar().getName()).isEqualTo("foo");

        assertThat(foo.getNickname()).isNotEqualTo("foo");
        assertThat(foo.getJob()).isNotEqualTo("foo");
        assertThat(foo.getBar().getAddress()).isNotEqualTo("foo");
    }
}

This test passes with the following change in AbstractRandomizerRegistry:

     protected boolean hasName(final Field field, final String name) {
-        return name == null || field.getName().equals(name);
+        return name == null || java.util.regex.Pattern.compile(name).matcher(field.getName()).matches();
     }

Note that I updated the change to use Pattern#matcher instead of Pattern#asPredicate(). BTW, there is also a Pattern#asMatchPredicate() that can do the trick.

I don't mind. Feel free to apply it :)

I don't apply anything until we both agree and the change is correct (which was not the case in my previous comment) 😄 What do you think about these updates? Did I miss a test case?

@kkurczewski
Copy link
Author

Looks good to me :)

@fmbenhassine fmbenhassine added this to the 3.9.0 milestone Feb 12, 2019
@fmbenhassine fmbenhassine changed the title Add support for field name pattern Add support for regular expressions when specifying a field name Feb 12, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants