Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HV-1466: UniqueElements constraint #879

Closed
wants to merge 10 commits into from

Conversation

tadhgpearson
Copy link
Contributor

@tadhgpearson tadhgpearson commented Nov 3, 2017

https://hibernate.atlassian.net/browse/HV-1466

Adding the @Uniqueitems annotation to a collection ensures that it contains no two items that are equal. This is pretty handy when you're using JAX-RS and want to receive a set of items - Jackson and Moxy both deserialize JSON arrays (or XML collections) to List - this allows you to validate that you won't lose fidelity if you convert it to a sorted set.

We've been using this for a while and it works pretty well. The duplicate check code is tuned for the fact that validation will normally pass, and that the original order of the input collection should be respected when producing the output message.
I tweaked the message since the original proposal since the $validated_value can often be very large - repeating it in the output is not necessarily very useful.

I haven't provided any validation message with this, since as you can see, I'm currently overwriting the message to provide a value in English only, directly from the Validator, which also contains the duplicate items. I don't know of a way to get the duplicate items value into a property message, any feedback on that would be welcome. I'm happy to localize this in French & English for you, can't help you much with other languages I'm afraid.

Looking forward to your feedback, thanks! Tadhg

@Hibernate-CI
Copy link
Contributor

Can one of the admins add this person to the trusted builders? (reply with: "add to whitelist" or "ok to test")

Copy link
Member

@gsmet gsmet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @tadhgpearson

Thanks, nice work. I added a couple of comments inline to help you simplify what you did and be more in line with what we usually do.

Feel free to ask any question you might have!

public class UniqueItemsValidator implements ConstraintValidator<UniqueItems, Collection<?>> {
@Override
public void initialize(UniqueItems uniqueItems) {
//No annotation fields to initialize
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to override this method, we have a default method now.

@Constraint(validatedBy = UniqueItemsValidator.class)
public @interface UniqueItems {

String message() default "{org.hibernate.validator.constraints.UniqueItems.message}";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the idea is that you should declare this string with the other constraints in ValidationMessages.properties so that it can be translated.

List<String> duplicates = findDuplicates( objects );
String message = "Contains the following duplicate items: " + duplicates;
validatorContext.disableDefaultConstraintViolation();
validatorContext.buildConstraintViolationWithTemplate( message ).addConstraintViolation();
Copy link
Member

@gsmet gsmet Nov 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of building a new message you won't be able to translate, you can use the default message and add parameters for the interpolation. You can find an example of that in PatternValidator (the unwrapping of the context and the addition of regexp as a message parameter).

Note that you should use InterpolationHelper.escapeMessageParameter() on the string to escape the parameter string.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll take a look

private List<String> findDuplicates(Collection<?> objects) {
Set<Object> uniqueItems = new HashSet<>();
List<?> duplicates = objects.stream().filter( o -> !uniqueItems.add( o ) ).collect( Collectors.toList() );
return duplicates.stream().map( String::valueOf ).collect( Collectors.toList() );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why you create an intermediate list here? You could continue to use the stream?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good feedback - there used to be a method in there - it got pulled back and the streams never got joined

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed this with Marko this morning and we are wondering if, maybe, we shouldn't add the items to the message but rather provide the objects in a Payload (the payload API allows to add an object to the constraint violation that the user can get afterwards). That would allow the users to get the duplicate objects directly and do whatever they want with it instead of just having a string.

Calling toString() might be ugly if not implemented.

So, maybe, we could have a generic message saying that there are duplicate items and then the user could use the payload to do whatever he wants with it?

As the first user of this annotation, let us know what you think.

Btw, thinking a bit more about it, I think we should support having nulls in the collections, don't you think? That would qualify for a new test :).

Copy link
Contributor Author

@tadhgpearson tadhgpearson Nov 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are many options like that... allowing users to define their own comparators instead of equals seems like another obvious way to go. My philosophy here was start simple - rely on the existing equals and toString, and see if anyone care. It's easy to expand it if necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding null items in the collections, this is already supported. One null is unique, multiple nulls are not unique and cause an exception.

I can add this to the tests, should it also be documented?

* @param objects Non-null collection of objects in which to find duplicates.
*
* @return Non-null collection of the toString output for the dupliciate objects.
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need a comment for such a trivial private method. Let's keep it simple :).

assertTrue( results.stream().anyMatch( cv -> cv.getMessage().contains( duplicate ) ) );
}

static class TestObjectA {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can make your test objects private.

}
}

static class TestObjectB {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Frankly, I don't think we need these weird cases. A test on an object with a working equals/hashCode seems sufficient to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The world is full of weird use cases... I'm happy to test less if you prefer, let me know :D

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think it's not worth it testing the weird hashCode/equals behaviors.

Copy link
Member

@gunnarmorling gunnarmorling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @tadhgpearson, thanks a lot for this contribution. I've added some remarks on the user-facing API, details inline.

import org.hibernate.validator.internal.constraintvalidators.hv.UniqueItemsValidator;

/**
* Validates that the provided Collection is a Set of unique items. By default, uniqueness is defined using the object's
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"is a Set of" seems a bit confusing because it's not limited to Set, the Java type. How about "Validates that the provided {@link Collection} contains only unique elements."?

Is it limited to Collection actually, or could it even be Iterable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to use many synonyms to help clarify behavior for different users, and outside of the Java concept of a set, we are essentially attempting to verify a set- or bag-like behavior. But I'm happy to use the terminology you've provided if you think it fits better with the project.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be iterable, but it's substantially more expensive for large collections, since iterables don't provide a size method to quickly check if the size of the set of items equals the size of the original collection.
Also, since iterables tend to be streams, the annotation would have to read the whole stream to check, which would add complexity and might not be the behavior the user expects.

* Validates that the provided Collection is a Set of unique items. By default, uniqueness is defined using the object's
* equals method.
* <p>
* This is necessary because JAX-RS always deserializes to a list. Thus, duplicates are implicitly and silently removed
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't limit this to JAX-RS. Maybe "One use case for this constraint is JAX-Rs which ...".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about "This can be useful with JAX-RS" ?

@Target({ ElementType.FIELD, ElementType.METHOD, ElementType.PARAMETER, ElementType.TYPE })
@Retention(RetentionPolicy.RUNTIME)
@Constraint(validatedBy = UniqueItemsValidator.class)
public @interface UniqueItems {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be an @since tag.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, might be helpful to tell you which version we should target: 6.0.5 (our next micro)!

@gsmet
Copy link
Member

gsmet commented Nov 13, 2017

@tadhgpearson thanks!

I rebased and pushed a couple more changes to be more consistent with what we do elsewhere.

I made a few changes to the validator so that we avoid to create 2 HashSets in the failing case and only create one. I think it's better this way and I think it's also more readable.

I also used our collection helper to avoid the set to be resized.

I adjusted the tests to use our test helpers (and made a few adjustments to our test helpers to add the description).

Let me know what you think!

@tadhgpearson
Copy link
Contributor Author

^ Not sure what's causing the build to fail here - my local build passes and the error in the Jenkins logs doesn't seem related to any of my code changes

@gsmet
Copy link
Member

gsmet commented Nov 13, 2017

@tadhgpearson I'm on it. I introduced a JavaDoc error. I'll amend my commit and force-push.

@gsmet gsmet force-pushed the HV-1466 branch 2 times, most recently from b27e4c6 to f8f5adf Compare November 13, 2017 15:04
@gsmet
Copy link
Member

gsmet commented Nov 13, 2017

@tadhgpearson fixed and rebased.

I also added a programmatic definition class and some documentation. Review welcome!

@marko-bekhta
Copy link
Member

Added Ukrainian translation for the message with the last commit.

Checks that a collection doesn't contain elements that are equal.
@gsmet gsmet changed the title HV-1466: Unique Items annotation HV-1466: UniqueElements constraint Nov 14, 2017
@tadhgpearson
Copy link
Contributor Author

programmaticMapping = configuration.createConstraintMapping();

That's kinda handy - I had no idea you could do this!

@tadhgpearson
Copy link
Contributor Author

Looks very much more in line with the rest of the project now 👍

My only comment is that I don't see the duplicate item in the message any more. Knowing the duplicate item in the message is actually very useful for the creator of the input, because finding the one duplicate in collection of thousands of items (which is the scenario in which you're going to want this constraint) is one of the major reasons for implementing this validator in the first place.

@gsmet gsmet force-pushed the HV-1466 branch 2 times, most recently from fe6454a to 43498fb Compare November 14, 2017 15:11
@gsmet
Copy link
Member

gsmet commented Nov 14, 2017

@tadhgpearson about having {duplicates} in the message, we decided to discard it but we kept it in the validator. So you can simply override the message in your app and include it. You also can get the list of duplicate elements in the dynamic payload of the constraint violation if you want. I added something in the documentation about it. I added a test to be sure we keep this behavior. You can use this test as an example on how to implement it in your application.

We did it to stay consistent with the rest of the project and to provide a sensible message even if the user does not have a user-readable toString().

We also renamed the constraint to @UniqueElements as we usually talk about collection elements in the existing documentation.

I think we are done with this one now.

@marko-bekhta could you make a last quick review of the changes to be sure I didn't miss anything when I renamed the constraint? Thanks!

Copy link
Member

@marko-bekhta marko-bekhta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gsmet looks good. Found one last usage of items. And also one thing that we always forget about - AnnotationProcessor :) probably need to add new constraint in there as well. But these two are very minor.

/**
* @author Guillaume Smet
*/
public class UniqueItemsDefTest {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only this test still has items instead of elements :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, I knew I would miss one :).

@gsmet
Copy link
Member

gsmet commented Nov 14, 2017

@marko-bekhta I just pushed the support for the annotation processor.

@marko-bekhta
Copy link
Member

Thanks @gsmet !

@gsmet
Copy link
Member

gsmet commented Nov 14, 2017

Merged!

@tadhgpearson @marko-bekhta thanks!

@gsmet gsmet closed this Nov 14, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants