8321180: Condition for non-latin1 string size too large exception is off by one #17008

Closed

Conversation

@RogerRiggs (Contributor) commented Dec 6, 2023

In the compact string implementation, the length of a non-latin1 (UTF16) string is constrained by the VM implementation limit on the size of a byte array that can be allocated. To produce a useful exception, the implementation checks the string size against the maximum byte array size. The exception message is correct:

"UTF16 String size is " + len + ", should be less than or equal to " + MAX_LENGTH

The code should match the message; otherwise, the exception thrown is:

java.lang.OutOfMemoryError: Requested array size exceeds VM limit 
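
To make the boundary concrete, here is a minimal sketch of how a size guard and its message can agree or disagree at the limit. The class name, the method names, and the value chosen for `MAX_LENGTH` are assumptions for illustration; they are not copied from the JDK sources or from this change.

```java
// Hypothetical sketch: names and the limit value are illustrative assumptions.
class Utf16SizeCheckSketch {
    // Assumed limit: the largest char count whose doubled byte count still fits in an int.
    static final int MAX_LENGTH = Integer.MAX_VALUE >> 1;

    // Guard that rejects len == MAX_LENGTH, paired with a "less than" message.
    static int strictBytesLength(int len) {
        if (len < 0) {
            throw new NegativeArraySizeException();
        }
        if (len >= MAX_LENGTH) {
            throw new OutOfMemoryError("UTF16 String size is " + len
                    + ", should be less than " + MAX_LENGTH);
        }
        return len << 1; // two bytes per UTF16 char
    }

    // Guard that accepts len == MAX_LENGTH, paired with a "less than or equal to" message.
    static int inclusiveBytesLength(int len) {
        if (len < 0) {
            throw new NegativeArraySizeException();
        }
        if (len > MAX_LENGTH) {
            throw new OutOfMemoryError("UTF16 String size is " + len
                    + ", should be less than or equal to " + MAX_LENGTH);
        }
        return len << 1;
    }

    public static void main(String[] args) {
        // At the boundary, the inclusive guard lets the request through to the allocator.
        System.out.println(inclusiveBytesLength(MAX_LENGTH)); // 2147483646
        try {
            strictBytesLength(MAX_LENGTH);
        } catch (OutOfMemoryError e) {
            // The strict guard reports the problem itself, with its own message.
            System.out.println(e.getMessage());
        }
    }
}
```

Whichever comparison is chosen, the wording of the message has to describe the same boundary. If the guard admits a length whose byte array the VM cannot allocate, the caller sees the generic VM limit error instead of the descriptive one.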

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8321180: Condition for non-latin1 string size too large exception is off by one (Bug - P4)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/17008/head:pull/17008
$ git checkout pull/17008

Update a local copy of the PR:
$ git checkout pull/17008
$ git pull https://git.openjdk.org/jdk.git pull/17008/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 17008

View PR using the GUI difftool:
$ git pr show -t 17008

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/17008.diff

Webrev

Link to Webrev Comment

bridgekeeper bot commented Dec 6, 2023

👋 Welcome back rriggs! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

openjdk bot added the rfr (Pull request is ready for review) label Dec 6, 2023

openjdk bot commented Dec 6, 2023

@RogerRiggs The following label will be automatically applied to this pull request:

  • core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

openjdk bot added the core-libs (core-libs-dev@openjdk.org) label Dec 6, 2023

mlbridge bot commented Dec 6, 2023

Webrevs

  throw new OutOfMemoryError("UTF16 String size is " + len +
- ", should be less than " + MAX_LENGTH);
+ ", should be less than or equal to " + MAX_LENGTH);

Suggested change
- ", should be less than or equal to " + MAX_LENGTH);
+ ", should be less than " + MAX_LENGTH);

I second @ExE-Boss's comment.

@RogerRiggs (Contributor, Author) Dec 8, 2023

Need to recheck.

Correct exception message text
Removed unused imports
byte[] nonAscii = "\u0100".getBytes();
int nonAsciiSize = nonAscii.length;
int asciisize = byteSize - nonAsciiSize;
byte[] arr = new byte[asciisize + nonAsciiSize];

Suggested change
- byte[] arr = new byte[asciisize + nonAsciiSize];
+ byte[] arr = new byte[byteSize];

Note that asciisize + nonAsciiSize = byteSize - nonAsciiSize + nonAsciiSize = byteSize, even in the presence of overflows. Then asciisize becomes useless.
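
The identity in this comment holds because Java `int` arithmetic wraps around (two's complement), so subtracting a value and adding it back always cancels. A minimal sketch, using a hypothetical class and method name, checks it at the extremes:

```java
// Hypothetical sketch: demonstrates that (total - part) + part == total for all ints.
class WraparoundIdentitySketch {
    static int roundTrip(int total, int part) {
        int remainder = total - part; // may wrap around on overflow
        return remainder + part;      // wraps back to the original value
    }

    public static void main(String[] args) {
        // The subtraction overflows in both cases, yet the round trip is exact.
        System.out.println(roundTrip(Integer.MIN_VALUE, 2) == Integer.MIN_VALUE);
        System.out.println(roundTrip(Integer.MAX_VALUE, -1) == Integer.MAX_VALUE);
    }
}
```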

int nonAsciiSize = nonAscii.length;
int asciisize = byteSize - nonAsciiSize;
byte[] arr = new byte[asciisize + nonAsciiSize];
Arrays.fill(arr, (byte)'x'); // fill with latin1

(byte) 0 is a valid UTF-8 sequence after all, so there is no need to fill the array with 'x'.
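
This can be checked directly: a freshly allocated `byte[]` is zero-filled, and each zero byte decodes to U+0000 under UTF-8. The sketch below rests on several assumptions that the excerpt does not confirm: the class and method names are hypothetical, it uses an explicit `StandardCharsets.UTF_8` rather than the default charset, and it places the non-ASCII bytes at the start of the array.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: decodes a zero-filled array that carries one non-latin1 character.
class ZeroFilledUtf8Sketch {
    // size must be at least 2, the UTF-8 length of U+0100.
    static String decode(int size) {
        byte[] arr = new byte[size]; // already zero-filled, no Arrays.fill needed
        byte[] nonAscii = "\u0100".getBytes(StandardCharsets.UTF_8);
        // Assumed placement: copy the non-ASCII bytes to the front of the array.
        System.arraycopy(nonAscii, 0, arr, 0, nonAscii.length);
        return new String(arr, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String s = decode(8);
        System.out.println(s.length());                 // 7: one char for U+0100, six NULs
        System.out.println(s.indexOf('\uFFFD') == -1);  // true: no replacement characters
    }
}
```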

char[] nonAscii = "\u0100".toCharArray();
int nonAsciiSize = nonAscii.length;
int asciisize = size - nonAsciiSize;
char[] arr = new char[asciisize + nonAsciiSize];

Same as above

int nonAsciiSize = nonAscii.length;
int asciisize = size - nonAsciiSize;
char[] arr = new char[asciisize + nonAsciiSize];
Arrays.fill(arr, 'x'); // fill with latin1

Same as above

@rgiulietti (Contributor)

LGTM

openjdk bot commented Dec 9, 2023

@RogerRiggs This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8321180: Condition for non-latin1 string size too large exception is off by one

Reviewed-by: rgiulietti

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 43 new commits pushed to the master branch:

  • ce10844: 8319577: x86_64 AVX2 intrinsics for Arrays.sort methods (int, float arrays)
  • 5c12a18: 8320790: Update --release 22 symbol information for JDK 22 build 27
  • 7180088: 8321429: (fc) FileChannel.lock creates a FileKey containing two long index values, they could be stored as int values
  • 0c178be: 8321206: Make Locale related system properties StaticProperty
  • 6c13a30: 8312307: Obsoleted code in hb-jdk-font.cc
  • 5e6bfc5: 8321539: Minimal build is broken by JDK-8320935
  • 2c2d4d2: 8321485: remove serviceability/attach/ConcAttachTest.java from problemlist on macosx
  • 0eb299a: 8316141: Improve CEN header validation checking
  • b893a2b: 8321597: Use .template consistently for files treated as templates by the build
  • 05f9509: 8321374: Add a configure option to explicitly set CompanyName property in VersionInfo resource for Windows exe/dll
  • ... and 33 more: https://git.openjdk.org/jdk/compare/91ffdfb1fcacbb95b93491d412e506695198946e...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

openjdk bot added the ready (Pull request is ready to be integrated) label Dec 9, 2023
@RogerRiggs (Contributor, Author)

/integrate

openjdk bot commented Dec 12, 2023

Going to push as commit 4fb5c12.
Since your change was applied there have been 65 commits pushed to the master branch:

  • d5a96e3: 8321621: Update JCov version to 3.0.16
  • aadf368: 6230751: [Fmt-Ch] Recursive MessageFormats in ChoiceFormats ignore indicated subformats
  • a3447ec: 8321827: Remove unnecessary suppress warnings annotations from the printing processor
  • b25ed57: 8270269: Desktop.browse method fails if earlier CoInitialize call as COINIT_MULTITHREADED
  • df4ed7e: 8321739: Source launcher fails with "Not a directory" error
  • 5718039: 8321542: C2: Missing ChaCha20 stub for x86_32 leads to crashes
  • c516852: 8321889: JavaDoc method references with wrong (nested) type
  • 7d90396: 8321422: Test gc/g1/pinnedobjs/TestPinnedObjectTypes.java times out after completion
  • 6f48240: 8321729: Remove 'orb' field in RMIConnector
  • e1fd663: 8311306: Test com/sun/management/ThreadMXBean/ThreadCpuTimeArray.java failed: out of expected range
  • ... and 55 more: https://git.openjdk.org/jdk/compare/91ffdfb1fcacbb95b93491d412e506695198946e...master

Your commit was automatically rebased without conflicts.

openjdk bot added the integrated (Pull request has been integrated) label Dec 12, 2023
@openjdk openjdk bot closed this Dec 12, 2023
openjdk bot removed the ready (Pull request is ready to be integrated) and rfr (Pull request is ready for review) labels Dec 12, 2023

openjdk bot commented Dec 12, 2023

@RogerRiggs Pushed as commit 4fb5c12.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@RogerRiggs RogerRiggs deleted the 8321180-utf16-greater-equal branch November 27, 2024 16:40
Labels
core-libs (core-libs-dev@openjdk.org), integrated (Pull request has been integrated)

3 participants