Use JDK UUID#fromString for Java >= 15 #15

Open
wants to merge 4 commits into
base: main
Changes from 3 commits
2 changes: 2 additions & 0 deletions README.md
@@ -118,6 +118,8 @@ This implementation does away with the string concatenation (and resulting strin
| `UUID#fromString(String)` | 2,613,730 UUIDs/second | ± 25,278 UUIDs/second |
| `FastUUID#parseUUID(String)` | 16,796,302 UUIDs/second | ± 191,695 UUIDs/second |
andrebrait marked this conversation as resolved.

+ Java 15 uses a [much improved method](https://github.com/openjdk/jdk/commit/ebadfaeb2e1cc7b5ce5f101cd8a539bc5478cf5b) to parse UUIDs from strings. As a result, we just pass calls to `FastUUID#parseUUID(String)` through to `UUID#fromString(String)` under Java 15 and newer.
andrebrait marked this conversation as resolved.
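
As a rough illustration of the pass-through described in the README change, the sketch below gates on the running Java version. How fast-uuid actually computes `majorVersion` is not visible in this diff, so the `java.specification.version` parsing and the names `VersionGateSketch`, `majorVersionOf`, and `parse` are assumptions made for this example only. The fallback parser is passed in as a parameter so the sketch stays self-contained.

```java
import java.util.UUID;
import java.util.function.Function;

/** Hypothetical sketch of a version-gated pass-through; not the fast-uuid source. */
final class VersionGateSketch {

    /** Extracts the major version from strings such as "1.8", "9", or "15". */
    static int majorVersionOf(final String specVersion) {
        // Java 8 and older report "1.x"; Java 9 and newer report the major version directly.
        final String[] parts = specVersion.split("\\.");
        final int first = Integer.parseInt(parts[0]);
        return (first == 1 && parts.length > 1) ? Integer.parseInt(parts[1]) : first;
    }

    /** True when the running JVM is Java 15 or newer (assumed detection strategy). */
    static final boolean USE_JDK_UUID_FROM_STRING =
            majorVersionOf(System.getProperty("java.specification.version")) >= 15;

    /** Delegates to the JDK on Java 15+, otherwise to the caller-supplied parser. */
    static UUID parse(final String uuidString, final Function<String, UUID> fallbackParser) {
        return USE_JDK_UUID_FROM_STRING
                ? UUID.fromString(uuidString)
                : fallbackParser.apply(uuidString);
    }
}
```

Computing the flag once in a static initializer keeps the per-call cost to a single branch on a constant, which the JIT can typically fold away.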

### UUIDs to strings

We've shown that we can significantly improve upon the stock `UUID#fromString(String)` implementation. Can we achieve similar gains in going from a `UUID` to a `String`? Let's take a look at the stock implementation of `UUID#toString()` from Java 8:
8 changes: 4 additions & 4 deletions pom.xml
@@ -48,21 +48,21 @@
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
- <version>4.13.1</version>
+ <version>4.13.2</version>
<scope>test</scope>
</dependency>

<dependency>
<groupId>org.openjdk.jmh</groupId>
<artifactId>jmh-core</artifactId>
- <version>1.21</version>
+ <version>1.32</version>
Owner

Note to self: we should probably pull this out into a property so we don't have to change the version in three different places. Let's definitely leave that for a separate effort, though.

Author

I can do that, no problem. I'll extract it when I get the chance to open the project in IntelliJ again.

<scope>test</scope>
</dependency>

<dependency>
<groupId>org.openjdk.jmh</groupId>
<artifactId>jmh-generator-annprocess</artifactId>
- <version>1.21</version>
+ <version>1.32</version>
<scope>test</scope>
</dependency>
</dependencies>
@@ -110,7 +110,7 @@
<path>
<groupId>org.openjdk.jmh</groupId>
<artifactId>jmh-generator-annprocess</artifactId>
- <version>1.21</version>
+ <version>1.32</version>
</path>
</annotationProcessorPaths>
</configuration>
28 changes: 23 additions & 5 deletions src/benchmark/java/com/eatthepath/UUIDBenchmark.java
@@ -40,7 +40,8 @@ public class UUIDBenchmark {
private final UUID[] uuids = new UUID[PREGENERATED_UUID_COUNT];
private final String[] uuidStrings = new String[PREGENERATED_UUID_COUNT];

- private int i = 0;
+ // Using a long here because on very fast machines the int will overflow
+ private long i = 0;

@Setup
public void setup() {
@@ -54,21 +55,38 @@ public void setup() {

@Benchmark
public UUID benchmarkUUIDFromString() {
- return UUID.fromString(this.uuidStrings[this.i++ % PREGENERATED_UUID_COUNT]);
+ resetCounterIfNecessary();
+ return UUID.fromString(this.uuidStrings[(int) (this.i++ % PREGENERATED_UUID_COUNT)]);
}

@Benchmark
public UUID benchmarkUUIDFromFastParser() {
- return FastUUID.parseUUID(this.uuidStrings[this.i++ % PREGENERATED_UUID_COUNT]);
+ resetCounterIfNecessary();
+ return FastUUID.parseUUID(this.uuidStrings[(int) (this.i++ % PREGENERATED_UUID_COUNT)]);
}

+ // Checking if type-check to String won't affect performance
+ @Benchmark
+ public UUID benchmarkUUIDFromCharSequenceFastParser() {
+ resetCounterIfNecessary();
+ return FastUUID.parseUUID((CharSequence) this.uuidStrings[(int) (this.i++ % PREGENERATED_UUID_COUNT)]);
+ }

@Benchmark
public String benchmarkUUIDToString() {
- return this.uuids[this.i++ % PREGENERATED_UUID_COUNT].toString();
+ resetCounterIfNecessary();
+ return this.uuids[(int) (this.i++ % PREGENERATED_UUID_COUNT)].toString();
}

@Benchmark
public String benchmarkFastParserToString() {
- return FastUUID.toString(this.uuids[this.i++ % PREGENERATED_UUID_COUNT]);
+ resetCounterIfNecessary();
+ return FastUUID.toString(this.uuids[(int) (this.i++ % PREGENERATED_UUID_COUNT)]);
}

+ private void resetCounterIfNecessary() {
+ if (this.i == Long.MAX_VALUE) {
+ this.i = 0;
+ }
+ }
}
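
To make the motivation for the counter change in `UUIDBenchmark` concrete, here is a small sketch of what happens when an `int` counter wraps. The class and method names are hypothetical, and the array length of `100_000` is an assumed stand-in because the value of `PREGENERATED_UUID_COUNT` is not shown in this diff. The `Math.floorMod` variant is an alternative for comparison, not what the PR does.

```java
/** Hypothetical sketch of the int-overflow hazard in a wrapping benchmark counter. */
final class CounterOverflowSketch {

    /** Index computed with the remainder operator: negative once the counter has wrapped. */
    static int remainderIndex(final int counter, final int length) {
        return counter % length;
    }

    /** Alternative (not used in the PR): floorMod is never negative for a positive length. */
    static int floorModIndex(final int counter, final int length) {
        return Math.floorMod(counter, length);
    }

    public static void main(final String[] args) {
        int counter = Integer.MAX_VALUE;
        counter++; // silently wraps to Integer.MIN_VALUE

        // A negative remainder would throw ArrayIndexOutOfBoundsException if used as an index.
        System.out.println("remainder index: " + remainderIndex(counter, 100_000));

        // floorMod keeps the index inside [0, length).
        System.out.println("floorMod index:  " + floorModIndex(counter, 100_000));
    }
}
```

Widening the counter to `long`, as the PR does, pushes the wrap point far enough out that the explicit reset at `Long.MAX_VALUE` is mostly a safety net.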
31 changes: 31 additions & 0 deletions src/main/java/com/eatthepath/uuid/FastUUID.java
@@ -41,6 +41,7 @@
public class FastUUID {

private static final boolean USE_JDK_UUID_TO_STRING;
+ private static final boolean USE_JDK_UUID_FROM_STRING;

static {
int majorVersion = 0;
@@ -53,6 +54,7 @@ public class FastUUID {
}

USE_JDK_UUID_TO_STRING = majorVersion >= 9;
+ USE_JDK_UUID_FROM_STRING = majorVersion >= 15;
}

private static final int UUID_STRING_LENGTH = 36;
@@ -107,6 +109,35 @@ private FastUUID() {
* described in {@link UUID#toString()}
*/
public static UUID parseUUID(final CharSequence uuidSequence) {
+ if (USE_JDK_UUID_FROM_STRING && uuidSequence instanceof String) {
+ // OpenJDK 15 and newer use a faster method for parsing UUIDs
+ return UUID.fromString((String) uuidSequence);
+ }

+ return parseUUIDInternal(uuidSequence);
+ }

+ /**
+ * Parses a UUID from the given string. The string must represent a UUID as described in
+ * {@link UUID#toString()}.
+ *
+ * @param uuidString the string from which to parse a UUID
+ *
+ * @return the UUID represented by the given string
+ *
+ * @throws IllegalArgumentException if the given string does not conform to the string representation as
+ * described in {@link UUID#toString()}
+ */
+ public static UUID parseUUID(final String uuidString) {
Owner

I think I'm missing something: why do we need this if we also have parseUUID(CharSequence), which also checks if the CharSequence is a String?

Author

@andrebrait andrebrait Nov 25, 2021

Because JDK's method only works with String.

So we can:

  1. Change the signature of our method to parseUUID(String) and possibly break compatibility;
  2. Call CharSequence#toString(), but that will lead to memory copies for any non-String objects that implement CharSequence;
  3. Add a method that takes a String (so it can be passed to JDK's in case we're running on >= 15), keep the CharSequence one too, and share the common code path between the two (for JDKs prior to 15; a CharSequence would always go there since we don't want to copy memory);
  4. Number 3, plus we check if the CharSequence is a String so we can pass it to JDK's, but we keep using our implementation in case it's not (since it performs closely anyway).

instanceof has basically no cost on modern JVMs anyway, so I chose to go with 4.

Owner

Ah—I think I see where you're going here. If I can reword it to make sure we're on the same page, I think you're saying that you want to add a String-specific variant because that will lower the cost of passing through to UUID#fromString under Java 15 and newer.

If I'm understanding that right, I'd gently push back that I think it's okay to continue using fast-uuid's parser even under Java 15. My rationale is that if somebody is working with non-String CharSequences, they probably have some specific thing they're doing that means they want to avoid String conversion in the first place, and using the sliiiiightly slower FastUUID#parseUUID(CharSequence) will still be faster for those users than converting to a String, then using the slightly faster UUID#fromString(String).

As you say, using instanceof in FastUUID#parseUUID(CharSequence) is a negligible cost, so most users would still get the benefits of using the built-in parser under Java 15 and newer, and it's only the oddballs (like me!) who have a CharSequence-specific use case that would be staying in the fast-uuid space.

Does that make sense? Am I missing your point?
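
To make the CharSequence-specific use case concrete, the following is a minimal sketch of parsing a UUID straight from a `CharSequence` (for example a `StringBuilder`) without first calling `toString()`. It is not fast-uuid's implementation, which is only partially visible in this diff. The class name `CharSequenceParseSketch` and its methods are hypothetical, and the loop favors clarity over speed.

```java
import java.util.UUID;

/** Hypothetical sketch of copy-free UUID parsing from any CharSequence. */
final class CharSequenceParseSketch {

    /** Returns the value of an ASCII hex digit, or -1 if the character is not one. */
    private static int hexValue(final char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    static UUID parse(final CharSequence s) {
        // Canonical form: 36 characters with dashes at indices 8, 13, 18, and 23.
        if (s.length() != 36 || s.charAt(8) != '-' || s.charAt(13) != '-'
                || s.charAt(18) != '-' || s.charAt(23) != '-') {
            throw new IllegalArgumentException("Illegal UUID string: " + s);
        }

        long mostSignificantBits = 0;
        long leastSignificantBits = 0;

        for (int i = 0; i < 36; i++) {
            if (i == 8 || i == 13 || i == 18 || i == 23) {
                continue; // skip the dashes validated above
            }

            final int digit = hexValue(s.charAt(i));
            if (digit < 0) {
                throw new IllegalArgumentException("Illegal UUID string: " + s);
            }

            // The first 16 hex digits (before index 18) form the high 64 bits.
            if (i < 18) {
                mostSignificantBits = (mostSignificantBits << 4) | digit;
            } else {
                leastSignificantBits = (leastSignificantBits << 4) | digit;
            }
        }

        return new UUID(mostSignificantBits, leastSignificantBits);
    }

    public static void main(final String[] args) {
        // A StringBuilder is a CharSequence but not a String, so no String copy is made here.
        final StringBuilder buffer = new StringBuilder("123e4567-e89b-12d3-a456-426614174000");
        System.out.println(parse(buffer));
    }
}
```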

Author

It's more of a compile-time optimization and maybe being able to deal with JVMs where instanceof isn't super-duper cheap.

If you upgrade from existing code to this, your compiler will link your calls that use String-typed references to the new method. So

String uuidString = someMethodThatReturnsString();
// this will now link directly to the String method instead of the CharSequence one without the user having to change any code
UUID id = FastUUID.parseUUID(uuidString);

CharSequence uuidChars = someMethodThatReturnsCharSequence();
// this will still link to the CharSequence method
UUID idFromCharSequence = FastUUID.parseUUID(uuidChars);

If uuidChars happens to actually be a String, we can still leverage the speed boost by checking that and calling JDK's UUID#fromString.

Providing the overload with String allows us to skip that step without the user changing any code because upon compilation, they'd link to the new method if the reference is String-typed.

A CharSequence reference to a CharSequence non-String object will still just use the existing parser just fine. There is no conversion involved.

I see this as a win-win change. One thing to be aware of is that FastUUID.parseUUID(null) would now resolve to the String overload, because String is the more specific parameter type. You'd have to cast the null to CharSequence to reach the other overload.

For someone using fast-uuid indirectly, the type-check to String inside the CharSequence overload still allows for the performance boost in Java 15 even without recompilation.
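
The compile-time linking described in this comment can be checked with a small stand-alone sketch. `OverloadSketch` and its `parse` methods are hypothetical stand-ins for the two `parseUUID` overloads; they only report which overload was selected. The sketch also shows how a bare `null` literal is resolved when one parameter type is a subtype of the other, which follows Java's most-specific-method rule.

```java
/** Hypothetical sketch showing which overload javac selects for each static type. */
final class OverloadSketch {

    static String parse(final CharSequence value) {
        return "CharSequence overload";
    }

    static String parse(final String value) {
        return "String overload";
    }

    public static void main(final String[] args) {
        final String asString = "123e4567-e89b-12d3-a456-426614174000";
        final CharSequence asCharSequence = asString; // same object, wider static type

        // Overload selection uses the static (compile-time) type of the argument.
        System.out.println(parse(asString));        // String overload
        System.out.println(parse(asCharSequence));  // CharSequence overload

        // String is a subtype of CharSequence, so it is the more specific match for null.
        System.out.println(parse(null));                  // String overload
        System.out.println(parse((CharSequence) null));   // CharSequence overload
    }
}
```

Because the choice is made at compile time, a `CharSequence`-typed reference to a `String` still lands in the `CharSequence` overload, which is why the runtime `instanceof` check there remains useful.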

+ if (USE_JDK_UUID_FROM_STRING) {
+ // OpenJDK 15 and newer use a faster method for parsing UUIDs
+ return UUID.fromString(uuidString);
+ }

+ return parseUUIDInternal(uuidString);
+ }

+ private static UUID parseUUIDInternal(final CharSequence uuidSequence) {
if (uuidSequence.length() != UUID_STRING_LENGTH ||
uuidSequence.charAt(8) != '-' ||
uuidSequence.charAt(13) != '-' ||