Skip to content

8263677: Improve Character.isLowerCase/isUpperCase lookups #3028

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 6 commits into from

Conversation

cl4es
Copy link
Member

@cl4es cl4es commented Mar 16, 2021

This patch changes the otherLowercase / otherUppercase bits to be set if either the codepoint is of type LOWERCASE_LETTER and UPPERCASE_LETTER, or the Unicode Other_Lowercase / Other_Uppercase property is set. This simplifies the lookup in Character.isLowerCase/isUpperCase to a single table lookup, which appears to be healthy for performance.

I also took the opportunity to clean up the somewhat dated GenerateCharacter utility class.

Testing: tier1-3


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8263677: Improve Character.isLowerCase/isUpperCase lookups

Reviewers

Download

To checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/3028/head:pull/3028
$ git checkout pull/3028

To update a local copy of the PR:
$ git checkout pull/3028
$ git pull https://git.openjdk.java.net/jdk pull/3028/head

@bridgekeeper
Copy link

bridgekeeper bot commented Mar 16, 2021

👋 Welcome back redestad! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Mar 16, 2021

@cl4es The following labels will be automatically applied to this pull request:

  • build
  • i18n

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added build build-dev@openjdk.org i18n i18n-dev@openjdk.org labels Mar 16, 2021
@cl4es
Copy link
Member Author

cl4es commented Mar 16, 2021

Baseline:

Benchmark               (codePoint)  Mode  Cnt   Score   Error  Units
Characters.isLowerCase            9  avgt    5  13.809 ± 0.032  ns/op
Characters.isLowerCase           65  avgt    5  13.811 ± 0.052  ns/op
Characters.isLowerCase           97  avgt    5  12.552 ± 0.057  ns/op
Characters.isLowerCase          128  avgt    5  13.823 ± 0.076  ns/op
Characters.isLowerCase          170  avgt    5  13.811 ± 0.066  ns/op
Characters.isLowerCase          223  avgt    5  12.556 ± 0.058  ns/op
Characters.isLowerCase          410  avgt    5  19.466 ± 0.104  ns/op
Characters.isLowerCase          430  avgt    5  20.718 ± 0.100  ns/op
Characters.isUpperCase            9  avgt    5  12.556 ± 0.056  ns/op
Characters.isUpperCase           65  avgt    5  12.559 ± 0.067  ns/op
Characters.isUpperCase           97  avgt    5  12.555 ± 0.055  ns/op
Characters.isUpperCase          128  avgt    5  12.559 ± 0.060  ns/op
Characters.isUpperCase          170  avgt    5  12.556 ± 0.036  ns/op
Characters.isUpperCase          223  avgt    5  12.554 ± 0.055  ns/op
Characters.isUpperCase          410  avgt    5  20.722 ± 0.129  ns/op
Characters.isUpperCase          430  avgt    5  19.459 ± 0.091  ns/op

Patch:

Benchmark               (codePoint)  Mode  Cnt   Score   Error  Units
Characters.isLowerCase            9  avgt    5  12.556 ± 0.035  ns/op
Characters.isLowerCase           65  avgt    5  12.562 ± 0.073  ns/op
Characters.isLowerCase           97  avgt    5  12.551 ± 0.062  ns/op
Characters.isLowerCase          128  avgt    5  12.553 ± 0.039  ns/op
Characters.isLowerCase          170  avgt    5  12.554 ± 0.051  ns/op
Characters.isLowerCase          223  avgt    5  12.552 ± 0.035  ns/op
Characters.isLowerCase          410  avgt    5  18.833 ± 0.068  ns/op
Characters.isLowerCase          430  avgt    5  18.832 ± 0.074  ns/op
Characters.isUpperCase            9  avgt    5  12.555 ± 0.050  ns/op
Characters.isUpperCase           65  avgt    5  12.557 ± 0.041  ns/op
Characters.isUpperCase           97  avgt    5  12.554 ± 0.056  ns/op
Characters.isUpperCase          128  avgt    5  12.554 ± 0.055  ns/op
Characters.isUpperCase          170  avgt    5  12.555 ± 0.054  ns/op
Characters.isUpperCase          223  avgt    5  12.553 ± 0.036  ns/op
Characters.isUpperCase          410  avgt    5  18.831 ± 0.099  ns/op
Characters.isUpperCase          430  avgt    5  18.826 ± 0.047  ns/op

@cl4es cl4es marked this pull request as ready for review March 16, 2021 19:44
@openjdk openjdk bot added the rfr Pull request is ready for review label Mar 16, 2021
@mlbridge
Copy link

mlbridge bot commented Mar 16, 2021

Webrevs

Copy link
Member

@erikj79 erikj79 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good from build point of view. I like the code cleanups.

@openjdk
Copy link

openjdk bot commented Mar 16, 2021

@cl4es This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8263677: Improve Character.isLowerCase/isUpperCase lookups

Reviewed-by: erikj, ihse, naoto, rriggs

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 22 new commits pushed to the master branch:

  • 000012a: 8148937: (str) Adapt StringJoiner for Compact Strings
  • a707fcb: 8263723: [BACKOUT] MoveAndUpdateClosure::do_addr calls function with side-effects in an assert
  • 86e9cd9: 8263667: Avoid running GitHub actions on branches named pr/*
  • 41276eb: 8259863: doc: JShell snippet doesn't compile
  • f9f2eef: 8263434: Dangling references after MethodComparator::methods_EMCP
  • 23fc2a4: 8263688: Coordinate equals, hashCode and compareTo of JavacFileManager.PathAndContainer
  • d1baed6: 8263672: fatal error: no reachable node should have no use
  • 086a66a: 8261095: Add test for clhsdb "symbol" command
  • ec95a5c: 8263410: ListModel javadoc refers to non-existent interface
  • 7b9d256: 8261262: Kitchensink24HStress.java crashed with EXCEPTION_ACCESS_VIOLATION
  • ... and 12 more: https://git.openjdk.java.net/jdk/compare/e33bfb3977153e9c74261d1836b2cb153c47c192...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Mar 16, 2021
@@ -314,7 +307,7 @@ static void FAIL(String s) {
static long[] buildMap(UnicodeSpec[] data, SpecialCaseMap[] specialMaps, PropList propList)
{
long[] result;
if (bLatin1 == true) {
if (bLatin1) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps shorten to:
final long[] result = new long[bLatin1 ? 256 : 1 << 16];

for (int k = 0; k < m; k++) {
buffer[ptr+k] = map[i+k];
}
if (m >= 0) System.arraycopy(map, i, buffer, ptr, m);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If m == 0, you could skip the arraycopy.

for (int j = 0; j < ptr; j++) {
newdata[j] = buffer[j];
}
if (ptr >= 0) System.arraycopy(buffer, 0, newdata, 0, ptr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto, if ptr == 0, skip the arraycopy

for (int j=0; j<args.length; ++j) {
desc.append(" " + args[j]);
for (String arg : args) {
desc.append(" " + arg);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid string concat:
desc.append(' ').append(arg);

@cl4es
Copy link
Member Author

cl4es commented Mar 17, 2021

/integrate

@openjdk openjdk bot closed this Mar 17, 2021
@openjdk openjdk bot added integrated Pull request has been integrated and removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Mar 17, 2021
@cl4es cl4es deleted the character_case branch March 17, 2021 15:22
@openjdk
Copy link

openjdk bot commented Mar 17, 2021

@cl4es Since your change was applied there have been 23 commits pushed to the master branch:

  • b63b5d4: 8263732: ProblemList serviceability/sa/ClhsdbSymbol.java on ZGC
  • 000012a: 8148937: (str) Adapt StringJoiner for Compact Strings
  • a707fcb: 8263723: [BACKOUT] MoveAndUpdateClosure::do_addr calls function with side-effects in an assert
  • 86e9cd9: 8263667: Avoid running GitHub actions on branches named pr/*
  • 41276eb: 8259863: doc: JShell snippet doesn't compile
  • f9f2eef: 8263434: Dangling references after MethodComparator::methods_EMCP
  • 23fc2a4: 8263688: Coordinate equals, hashCode and compareTo of JavacFileManager.PathAndContainer
  • d1baed6: 8263672: fatal error: no reachable node should have no use
  • 086a66a: 8261095: Add test for clhsdb "symbol" command
  • ec95a5c: 8263410: ListModel javadoc refers to non-existent interface
  • ... and 13 more: https://git.openjdk.java.net/jdk/compare/e33bfb3977153e9c74261d1836b2cb153c47c192...master

Your commit was automatically rebased without conflicts.

Pushed as commit e152cc0.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
build build-dev@openjdk.org i18n i18n-dev@openjdk.org integrated Pull request has been integrated
Development

Successfully merging this pull request may close these issues.

5 participants