New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HV-1867 Add UUID validation #1199
Conversation
Can one of the admins add this person to the trusted builders? (reply with: "add to whitelist" or "ok to test") |
@gsmet Could you have a look at it, please? I think, a character sequence UUID validation could be useful for many users. |
engine/src/main/java/org/hibernate/validator/constraints/UUID.java
Outdated
Show resolved
Hide resolved
...ne/src/main/java/org/hibernate/validator/internal/constraintvalidators/hv/UUIDValidator.java
Outdated
Show resolved
Hide resolved
@yrodiere @marko-bekhta Something else I can do? Have you decided to merge it yet? |
@gsmet Do you have time to look at this pull request? |
Thanks for your pull request! This pull request appears to follow the contribution rules. › This message was automatically generated. |
Rebased to get the latest CI fixes. |
Hey @dheid Sorry for the delay, I just got back from PTO and things were crazy before and obviously even crazier after. This is on my TODO list. I definitely think it has value and it looks like a very nice work. I'll ping you as soon as I'll get to it. |
@gsmet Thank you so much!!! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other than my suggestion the code looks good to me.
for ( int charIndex = 0; charIndex < valueLength; charIndex++ ) { | ||
|
||
char ch = value.charAt( charIndex ); | ||
|
||
if ( ch == '-' ) { | ||
groupIndex++; | ||
groupLength = 0; | ||
} | ||
else { | ||
|
||
groupLength++; | ||
if ( groupLength > GROUP_LENGTHS[groupIndex] ) { | ||
return false; | ||
} | ||
|
||
int numericValue = Character.digit( ch, 16 ); | ||
if ( numericValue == -1 ) { | ||
// not a hex digit | ||
return false; | ||
} | ||
if ( letterCase == LetterCase.LOWER_CASE && numericValue > 9 && !Character.isLowerCase( ch ) ) { | ||
return false; | ||
} | ||
if ( letterCase == LetterCase.UPPER_CASE && numericValue > 9 && !Character.isUpperCase( ch ) ) { | ||
return false; | ||
} | ||
checksum += numericValue; | ||
version = extractVersion( version, charIndex, numericValue ); | ||
variant = extractVariant( variant, charIndex, numericValue ); | ||
|
||
} | ||
|
||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a request for a change, but I just wanted to share how I would have expected this to look like in case you want to change the implementation.
I would have expected a constant loop from 0 to 35 with a big switch covering all indices and asserting a valid char is used. Something like this:
for ( int charIndex = 0; charIndex < valueLength; charIndex++ ) { | |
char ch = value.charAt( charIndex ); | |
if ( ch == '-' ) { | |
groupIndex++; | |
groupLength = 0; | |
} | |
else { | |
groupLength++; | |
if ( groupLength > GROUP_LENGTHS[groupIndex] ) { | |
return false; | |
} | |
int numericValue = Character.digit( ch, 16 ); | |
if ( numericValue == -1 ) { | |
// not a hex digit | |
return false; | |
} | |
if ( letterCase == LetterCase.LOWER_CASE && numericValue > 9 && !Character.isLowerCase( ch ) ) { | |
return false; | |
} | |
if ( letterCase == LetterCase.UPPER_CASE && numericValue > 9 && !Character.isUpperCase( ch ) ) { | |
return false; | |
} | |
checksum += numericValue; | |
version = extractVersion( version, charIndex, numericValue ); | |
variant = extractVariant( variant, charIndex, numericValue ); | |
} | |
} | |
LOOP: for ( int charIndex = 0; charIndex < 36; charIndex++ ) { | |
char ch = value.charAt( charIndex ); | |
switch ( charIndex ) { | |
// Handle hyphens | |
case 9: | |
case 14: | |
case 19: | |
case 24: | |
if ( ch != '-' ) { | |
return false; | |
} | |
continue LOOP; | |
// Handle M and N parts | |
case 15: | |
version = extractVersion( ch ); | |
break; | |
case 20: | |
variant = extractVariant( ch ); | |
break; | |
} | |
int numericValue = Character.digit( ch, 16 ); | |
if ( numericValue == -1 ) { | |
// not a hex digit | |
return false; | |
} | |
switch ( letterCase ) { | |
case LOWER_CASE: | |
if ( numericValue > 9 && !Character.isLowerCase( ch ) ) { | |
return false; | |
} | |
break; | |
case UPPER_CASE : | |
if ( numericValue > 9 && !Character.isUpperCase( ch ) ) { | |
return false; | |
} | |
break; | |
} | |
checksum += numericValue; | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the chars have to be in ASCII encoding, you could go even a step further if you like by introducing an array for the character code points.
private static final byte[] DIGITS = new byte[] {
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, -1, -1,
-1, -1, -1, -1, -1, 10, 11, 12, 13, 14, 15, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 10, 11, 12,
13, 14, 15
};
int charAsInt = ch;
int numericValue = charAsInt >= 0 && charAsInt < DIGITS.length ? DIGITS[charAsInt] : -1;
if ( numericValue == -1 ) {
// not a hex digit
return false;
}
switch ( letterCase ) {
case LOWER_CASE:
if ( ch > 'A' && ch < 'F' ) {
return false;
}
break;
case UPPER_CASE :
if ( ch > 'a' && ch < 'f' ) {
return false;
}
break;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tbh Christian, I find Daniel's approach more concise and I don't see how your suggestion actually improves upon it? It also doesn't account for the group length, IINM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this creative solution! But if you don't mind I'll stick to my solution, because it's easier to understand from my point of view.
} | ||
return Arrays.binarySearch( this.variant, variant ) != -1; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't there be some sort of check in here that the version belongs to {1,2,3,4,5} ?
In the UUIDValidatorTest you do verify that versions match, but you're using versions that are (IIUC) are not allowed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope that I understand you correctly: The version can have a value that is not yet officially existing. But maybe in the future there will be a new kind of UUID and the implementation will support it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, in general I'm somewhat more in favor of not foreseeing everything that could potentially happen in the future, but it's not a big issue, I'll leave it up to you. I do like your validation implementation, I think it's quite elegant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! Really cheers me up.
assertFalse( uuidValidator.isValid( "2d5614ff-891e-07a8-b49e-9758506a9bab", null ) ); | ||
assertFalse( uuidValidator.isValid( "2d5614ff-891e-07a8-b49e-a758506a9bab", null ) ); | ||
assertFalse( uuidValidator.isValid( "2d5614ff-891e-07a8-b49e-b758506a9bab", null ) ); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor detail: is there a point in doing 12 invalid variant checks if only the last group is different?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haha, that's a mistake. I'll fix that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, thanks
@jrenaat @yrodiere @beikov @gsmet @marko-bekhta I extracted the letter case check to a separate method for enhanced readability |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the various reviews everyone.
This looks good and will go into 8.0.0 (note that 8.0.0 is targeting Jakarta EE 10 and will be released when we have all the Jakarta EE 10 specs in their final versions). I adjusted the @since
accordingly.
@dheid I will merge as soon as CI is green. Thanks for this work! |
And merged! Thanks. |
Thanks everyone so much! |
https://hibernate.atlassian.net/browse/HV-1867