Add support for regular expressions when specifying a field name #322

kkurczewski · 2018-11-30T20:06:47Z

Hi,
Change is simple, allow to build field definitions using patterns.

I work with many large schemas that has many plain Strings fields. Random mess in each of these doesn't look good as example data. There is a lot of different context IDs, names, etc. Sadly as mentioned I can't differentiate these by type.

I could list them by hand but it would be way easier to use generic regex for them.

Tests are green, most impacted are FieldDefinition constructors, on the other hand FieldDefinitionBuilder still allow to build definitions old way.

:)

coveralls · 2018-11-30T20:13:04Z

Coverage increased (+0.01%) to 90.242% when pulling a331fa7 on kkurczewski:master into 87ba0e6 on benas:master.

fmbenhassine · 2019-02-09T18:14:29Z

Hi @kkurczewski ,

Thank you for this PR! If I understand correctly, the goal is to be able to use a regexp in the name of a field definition in order to match multiple fields and use the same randomizer for all of them. An example would be:

@lombok.Data
public class Foo {
    private String name1;
    private String name2;
    private String job;
    private Bar bar;
}

@lombok.Data
public class Bar {
    private String name3;
    private String address;
}

@Test
void testFieldDefinitionWithNameAsRegexp() {
    // given
    EnhancedRandom enhancedRandom = new EnhancedRandomBuilder()
            .randomize(
                    field().named("name*").ofType(String.class).get(),
                    (Randomizer<String>) () -> "foo")
            .build();

    // when
    Foo foo = enhancedRandom.nextObject(Foo.class);

    // then
    assertThat(foo.getName1()).isEqualTo("foo");
    assertThat(foo.getName2()).isEqualTo("foo");
    assertThat(foo.getBar().getName3()).isEqualTo("foo");
    assertThat(foo.getJob()).isNotEqualTo("foo");
    assertThat(foo.getBar().getAddress()).isNotEqualTo("foo");
}

Is that correct?

If my understanding is correct, then it s possible to achieve this with the following change in AbstractRandomizerRegistry:

     protected boolean hasName(final Field field, final String name) {
-        return name == null || field.getName().equals(name);
+        return name == null || java.util.regex.Pattern.compile(name).asPredicate().test(field.getName());
     }

This change makes the previous test pass and is backward compatible.

My concern is that changes suggested in this PR are breaking changes (public constructors of FieldDefinition have changed and the protected method AbstractRandomizerRegistry#hasName has been renamed to matches).

What do you think?

Kr,
Mahmoud

kkurczewski · 2019-02-11T18:08:10Z

Hi @benas,

Yes, you got the intent right and yours solution is way simpler than mine.

But one thing was buggin' me. In your example name* should be name.*, but test passes right?

I found counter intuitive behavior of Pattern.compile(name).asPredicate() which is problematic.
This predicate returns true for first substring that match given pattern.

Even more strange behavior, this line will fail:
assertFalse(Pattern.compile("n*me").asPredicate().test("name"));

So in case of pattern matching we should rely only on:
Pattern.compile(name).matcher(field.getName()).matches();

One thing that may be drawback is that Pattern compilation is not persisted between uses. But I'm not sure how much it will affect execution.

fmbenhassine · 2019-02-11T21:27:53Z

Hi @kkurczewski

Thank you for your feedback!

But one thing was buggin' me. In your example name* should be name.*, but test passes right?

True. name.* is more intuitive: It means "name" followed by any character (the dot) zero or more times. While name* means "nam" followed by zero or more "e". The test still passes because Pattern.compile("name*").asPredicate().test("name1") is true.

I found counter intuitive behavior of Pattern.compile(name).asPredicate() which is problematic.
This predicate returns true for first substring that match given pattern.

Even more strange behavior, this line will fail:
assertFalse(Pattern.compile("n*me").asPredicate().test("name"));

So in case of pattern matching we should rely only on:
Pattern.compile(name).matcher(field.getName()).matches();

That's how regular expressions work in Java and it is not really the business of RB. As a library developer, I will document the method FieldDefinition#name saying that it accepts a regular expression as defined in java.util.regex.Pattern.compile.

One thing that may be drawback is that Pattern compilation is not persisted between uses. But I'm not sure how much it will affect execution.

That's true and I thought about it too. We can't do anything about it (like storing the compiled pattern as a field in the state of the registry) because the method might be called multiple times with different names. I think that's not a bid deal for now, let's try to not optimize prematurely and tackle any performance issue lazily when reported (if ever).

If you agree with the change I suggested, would you mind if I apply it? Otherwise, please update the PR accordingly and I will merge it.

Kr,
Mahmoud

kkurczewski · 2019-02-12T15:59:55Z

Not sure I we are on the same page with regex behavior, say I use Pattern#asPredicate() method with this snippet:

 EnhancedRandom enhancedRandom = new EnhancedRandomBuilder()
            .randomize(
                    field().named("name").ofType(String.class).get(),
                    (Randomizer<String>) () -> "foo")
            .build();

Now name, name1 and surname will be set to foo even if I don't want to use regex.

Conclusion, Pattern#matcher() is safe whereas Pattern#asPredicate() is not.

If you agree with the change I suggested, would you mind if I apply it?

I don't mind. Feel free to apply it :)

fmbenhassine · 2019-02-12T20:25:25Z

Looks like I missed the point Pattern#matcher() vs Pattern#asPredicate(). Here is the test that should IMO pass to accept the feature:

import io.github.benas.randombeans.api.EnhancedRandom;
import io.github.benas.randombeans.api.Randomizer;
import org.junit.jupiter.api.Test;

import static io.github.benas.randombeans.FieldDefinitionBuilder.field;
import static org.assertj.core.api.Assertions.assertThat;

public class Issue322 {

    @lombok.Data
    public class Foo {
        private String name1;
        private String name2;
        private String nickname;
        private String job;
        private Bar bar;
    }

    @lombok.Data
    public class Bar {
        private String name;
        private String address;
    }

    @Test
    void testFieldDefinitionWithNameAsRegexp() {
        // given
        EnhancedRandom enhancedRandom = new EnhancedRandomBuilder()
                .randomize(
                        field().named("name.*").ofType(String.class).get(),
                        (Randomizer<String>) () -> "foo")
                .build();

        // when
        Foo foo = enhancedRandom.nextObject(Foo.class);

        // then
        assertThat(foo.getName1()).isEqualTo("foo");
        assertThat(foo.getName2()).isEqualTo("foo");
        assertThat(foo.getBar().getName()).isEqualTo("foo");

        assertThat(foo.getNickname()).isNotEqualTo("foo");
        assertThat(foo.getJob()).isNotEqualTo("foo");
        assertThat(foo.getBar().getAddress()).isNotEqualTo("foo");
    }
}

This test passes with the following change in AbstractRandomizerRegistry:

     protected boolean hasName(final Field field, final String name) {
-        return name == null || field.getName().equals(name);
+        return name == null || java.util.regex.Pattern.compile(name).matcher(field.getName()).matches();
     }

Note that I updated the change to use Pattern#matcher instead of Pattern#asPredicate(). BTW, there is also a Pattern#asMatchPredicate() that can do the trick.

I don't mind. Feel free to apply it :)

I don't apply anything until we both agree and the change is correct (which was not the case in my previous comment) 😄 What do you think about these updates? Did I miss a test case?

kkurczewski · 2019-02-12T20:37:41Z

Looks good to me :)

Add support for field name pattern

a331fa7

fmbenhassine added the waiting-for-feedback label Feb 9, 2019

fmbenhassine added the feature label Feb 12, 2019

fmbenhassine added this to the 3.9.0 milestone Feb 12, 2019

fmbenhassine removed the waiting-for-feedback label Feb 12, 2019

fmbenhassine closed this in 5886032 Feb 12, 2019

fmbenhassine changed the title ~~Add support for field name pattern~~ Add support for regular expressions when specifying a field name Feb 12, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add support for regular expressions when specifying a field name #322

Add support for regular expressions when specifying a field name #322

kkurczewski commented Nov 30, 2018

coveralls commented Nov 30, 2018

fmbenhassine commented Feb 9, 2019

kkurczewski commented Feb 11, 2019

fmbenhassine commented Feb 11, 2019 •

edited

Loading

kkurczewski commented Feb 12, 2019

fmbenhassine commented Feb 12, 2019

kkurczewski commented Feb 12, 2019

Add support for regular expressions when specifying a field name #322

Add support for regular expressions when specifying a field name #322

Conversation

kkurczewski commented Nov 30, 2018

coveralls commented Nov 30, 2018

fmbenhassine commented Feb 9, 2019

kkurczewski commented Feb 11, 2019

fmbenhassine commented Feb 11, 2019 • edited Loading

kkurczewski commented Feb 12, 2019

fmbenhassine commented Feb 12, 2019

kkurczewski commented Feb 12, 2019

fmbenhassine commented Feb 11, 2019 •

edited

Loading