Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HV-1867 Add UUID validation #1199

Merged
merged 1 commit into from Feb 28, 2022
Merged

HV-1867 Add UUID validation #1199

merged 1 commit into from Feb 28, 2022

Conversation

dheid
Copy link
Contributor

@dheid dheid commented Jan 9, 2022

@Hibernate-CI
Copy link
Contributor

Can one of the admins add this person to the trusted builders? (reply with: "add to whitelist" or "ok to test")

@dheid
Copy link
Contributor Author

dheid commented Jan 9, 2022

@gsmet Could you have a look at it, please? I think, a character sequence UUID validation could be useful for many users.

@dheid
Copy link
Contributor Author

dheid commented Jan 11, 2022

@yrodiere @marko-bekhta Something else I can do? Have you decided to merge it yet?

@yrodiere
Copy link
Member

yrodiere commented Jan 11, 2022

@dheid I don't decide that kind of things on this project :) I just saw your patch and thought I'd suggest an improvement.

You'll have to wait for @gsmet to have a look, and this can take time since he has a lot to do (on this project and others) these days.

@dheid
Copy link
Contributor Author

dheid commented Feb 20, 2022

@gsmet Do you have time to look at this pull request?

@hibernate-github-bot
Copy link

hibernate-github-bot bot commented Feb 20, 2022

Thanks for your pull request!

This pull request appears to follow the contribution rules.

› This message was automatically generated.

@gsmet
Copy link
Member

gsmet commented Feb 22, 2022

Rebased to get the latest CI fixes.

@gsmet
Copy link
Member

gsmet commented Feb 22, 2022

Hey @dheid

Sorry for the delay, I just got back from PTO and things were crazy before and obviously even crazier after.

This is on my TODO list. I definitely think it has value and it looks like a very nice work.

I'll ping you as soon as I'll get to it.

@dheid
Copy link
Contributor Author

dheid commented Feb 22, 2022

@gsmet Thank you so much!!!

Copy link

@beikov beikov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than my suggestion the code looks good to me.

Comment on lines 89 to 118
for ( int charIndex = 0; charIndex < valueLength; charIndex++ ) {

char ch = value.charAt( charIndex );

if ( ch == '-' ) {
groupIndex++;
groupLength = 0;
}
else {

groupLength++;
if ( groupLength > GROUP_LENGTHS[groupIndex] ) {
return false;
}

int numericValue = Character.digit( ch, 16 );
if ( numericValue == -1 ) {
// not a hex digit
return false;
}
if ( letterCase == LetterCase.LOWER_CASE && numericValue > 9 && !Character.isLowerCase( ch ) ) {
return false;
}
if ( letterCase == LetterCase.UPPER_CASE && numericValue > 9 && !Character.isUpperCase( ch ) ) {
return false;
}
checksum += numericValue;
version = extractVersion( version, charIndex, numericValue );
variant = extractVariant( variant, charIndex, numericValue );

}

}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a request for a change, but I just wanted to share how I would have expected this to look like in case you want to change the implementation.
I would have expected a constant loop from 0 to 35 with a big switch covering all indices and asserting a valid char is used. Something like this:

Suggested change
for ( int charIndex = 0; charIndex < valueLength; charIndex++ ) {
char ch = value.charAt( charIndex );
if ( ch == '-' ) {
groupIndex++;
groupLength = 0;
}
else {
groupLength++;
if ( groupLength > GROUP_LENGTHS[groupIndex] ) {
return false;
}
int numericValue = Character.digit( ch, 16 );
if ( numericValue == -1 ) {
// not a hex digit
return false;
}
if ( letterCase == LetterCase.LOWER_CASE && numericValue > 9 && !Character.isLowerCase( ch ) ) {
return false;
}
if ( letterCase == LetterCase.UPPER_CASE && numericValue > 9 && !Character.isUpperCase( ch ) ) {
return false;
}
checksum += numericValue;
version = extractVersion( version, charIndex, numericValue );
variant = extractVariant( variant, charIndex, numericValue );
}
}
LOOP: for ( int charIndex = 0; charIndex < 36; charIndex++ ) {
char ch = value.charAt( charIndex );
switch ( charIndex ) {
// Handle hyphens
case 9:
case 14:
case 19:
case 24:
if ( ch != '-' ) {
return false;
}
continue LOOP;
// Handle M and N parts
case 15:
version = extractVersion( ch );
break;
case 20:
variant = extractVariant( ch );
break;
}
int numericValue = Character.digit( ch, 16 );
if ( numericValue == -1 ) {
// not a hex digit
return false;
}
switch ( letterCase ) {
case LOWER_CASE:
if ( numericValue > 9 && !Character.isLowerCase( ch ) ) {
return false;
}
break;
case UPPER_CASE :
if ( numericValue > 9 && !Character.isUpperCase( ch ) ) {
return false;
}
break;
}
checksum += numericValue;
}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the chars have to be in ASCII encoding, you could go even a step further if you like by introducing an array for the character code points.

private static final byte[] DIGITS = new byte[] {
        -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
        -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
        -1, -1, -1, -1, -1, -1, -1, -1,  0,  1,  2,  3,  4,  5,  6,  7,  8,  9, -1, -1,
        -1, -1, -1, -1, -1, 10, 11, 12, 13, 14, 15, -1, -1, -1, -1, -1, -1, -1, -1, -1,
        -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 10, 11, 12,
        13, 14, 15
};
int charAsInt = ch;
int numericValue = charAsInt >= 0 && charAsInt < DIGITS.length ? DIGITS[charAsInt] : -1;
if ( numericValue == -1 ) {
	// not a hex digit
	return false;
}

switch ( letterCase ) {
	case LOWER_CASE:
		if ( ch > 'A' && ch < 'F' ) {
			return false;
		}
		break;
	case UPPER_CASE :
		if ( ch > 'a' && ch < 'f' ) {
			return false;
		}
		break;
}

Copy link
Contributor

@jrenaat jrenaat Feb 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tbh Christian, I find Daniel's approach more concise and I don't see how your suggestion actually improves upon it? It also doesn't account for the group length, IINM.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this creative solution! But if you don't mind I'll stick to my solution, because it's easier to understand from my point of view.

}
return Arrays.binarySearch( this.variant, variant ) != -1;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't there be some sort of check in here that the version belongs to {1,2,3,4,5} ?
In the UUIDValidatorTest you do verify that versions match, but you're using versions that are (IIUC) are not allowed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope that I understand you correctly: The version can have a value that is not yet officially existing. But maybe in the future there will be a new kind of UUID and the implementation will support it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, in general I'm somewhat more in favor of not foreseeing everything that could potentially happen in the future, but it's not a big issue, I'll leave it up to you. I do like your validation implementation, I think it's quite elegant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Really cheers me up. ☺️

assertFalse( uuidValidator.isValid( "2d5614ff-891e-07a8-b49e-9758506a9bab", null ) );
assertFalse( uuidValidator.isValid( "2d5614ff-891e-07a8-b49e-a758506a9bab", null ) );
assertFalse( uuidValidator.isValid( "2d5614ff-891e-07a8-b49e-b758506a9bab", null ) );

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor detail: is there a point in doing 12 invalid variant checks if only the last group is different?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha, that's a mistake. I'll fix that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, thanks

@dheid
Copy link
Contributor Author

dheid commented Feb 23, 2022

@jrenaat @yrodiere @beikov @gsmet @marko-bekhta I extracted the letter case check to a separate method for enhanced readability

Copy link
Member

@gsmet gsmet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the various reviews everyone.

This looks good and will go into 8.0.0 (note that 8.0.0 is targeting Jakarta EE 10 and will be released when we have all the Jakarta EE 10 specs in their final versions). I adjusted the @since accordingly.

@gsmet
Copy link
Member

gsmet commented Feb 28, 2022

@dheid I will merge as soon as CI is green. Thanks for this work!

@gsmet gsmet merged commit 233e4ea into hibernate:main Feb 28, 2022
@gsmet
Copy link
Member

gsmet commented Feb 28, 2022

And merged! Thanks.

@dheid
Copy link
Contributor Author

dheid commented Feb 28, 2022

Thanks everyone so much!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
7 participants