-
Notifications
You must be signed in to change notification settings - Fork 5.8k
8311906: Race condition in String constructor #15902
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
👋 Welcome back liach! A progress list of the required criteria for merging this PR into |
The JBS issue is assigned to Roger Riggs, he has changes in progress for this so please coordinate to avoid this duplicate effort. |
I've been working on something similar and am in the process of fine tuning it to ensure that it is performance neutral both in both allocation rate and cpu time and has little/no impact on either latin1 or UTF16 string codings. |
A prototype I thought of was to modify array compression to return the stopping index + the bad 2-byte value, packed like lengthCoder used by String concatenation, at that index to avoid double-reads, but was not sure how to modify hotspot code to accomplish that. As long as we have a valid 2-byte value into the UTF16 value array, we can prevent constructing invalid Strings. |
A side comment: I don't think it's problematic if we incorrectly create a LATIN1 String (such as by downcasting a char to byte), for such a String is valid and it's user's fault (for not publishing their writes to the array in happens-before order). We only think about avoiding writing a values array with no significant byte: we can just write any non-trivial 2-byte into UTF16 to make it valid, and that's why I'm looking for compress to return a That said, my adjustment to UTF8 parsing code is overkill: https://github.com/openjdk/jdk/pull/15902/files#diff-f8131d8a48caf7cfc908417fad241393c2ef55408172e9a28dcaa14b1d73e1fbR575-L580 plus changing initialization of |
@liach This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration! |
Closing in favor of #16425. |
In the constructor of String, many locations the user-supplied byte or char arrays are read multiple times with a plain memory access; if a user previously wrote to one of such locations out of happens-before order, distinct plain memory reads may result in different unanticipated values.
The main problem caused by such error is that String constructor may incorrectly produce a UTF16 coder string with all-LATIN1 compatible characters when
COMPACT_STRING
is true, which breaks the contract of String. (The error can happen the other way around, but the resulting LATIN1 string is valid; this patch does not address that.)Thus, I modified the String data compression for non-trusted arrays: a LATIN1 compression first-pass is still done, but if the first compression fails, a second compression pass is done on a trusted (that is, copied from the original data) data where reading would be consistent. The approach takes a toll on UTF16 string construction time, but should not be more costly memory-wise.
A separate routine to decode UTF8 in String constructor that takes byte encoding has the same multi-read problem, that the old
offset--
leads to a problematic double read. This is resolved by copying the data to decode to a local array at first instead of reading from the user-provided byte array. This fix also costs more runtime but at no extra memory cost.Internal APIs such as newStringNoRepl are not guarded by this patch, as they are already trusted to be immutable and unshared.
test/jdk/java/lang/String
tests pass. More testing is needed to see if there are other edge cases not covered.Please review and don't hesitate to critique my approach and patch.
Progress
Integration blocker
Issues
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/15902/head:pull/15902
$ git checkout pull/15902
Update a local copy of the PR:
$ git checkout pull/15902
$ git pull https://git.openjdk.org/jdk.git pull/15902/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 15902
View PR using the GUI difftool:
$ git pr show -t 15902
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/15902.diff
Webrev
Link to Webrev Comment