-
Notifications
You must be signed in to change notification settings - Fork 6k
8275007: Java fails to start with null charset if LC_ALL is set to certain locales #6282
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
👋 Welcome back naoto! A progress list of the required criteria for merging this PR into |
Webrevs
|
@naotoj This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 186 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need to add test?
I would if I could, but specifying the environment dependent settings in the test would be fragile and error-prone, so I did not add a test but marked the issue as |
@naotoj |
This is about running on unusual configurations where the native encoding (native.encoding and sun.jnu.encoding) doesn't name a charset in java.base. There has been several reports of NPE in such cases. Now that the default charset is UTF-8 it means there is a sane fallback so the VM can startup. |
Thanks @AlanBateman . |
Yes, it's a JDK internal property with the charset name to use when decoding or encoding file names (not the file content). The issue in this PR is about what to do when starting in unsupported configurations. In the recent releases the JDK will typically NPE or fail with an exception. Since JEP 400 this no longer impacts the default charset because it is UTF-8. This moves the problem on to choosing a fallback for places in the JDK that use the value of native.encoding or sun.jnu.encoding. I think the only choices to either fail at startup or default to UTF-8 as proposed. |
Thanks, @AlanBateman for the explanation. |
If this issue is not a part of JEP-400, I think UTF_8 may not be only solution for sun/nio/fs/Util.java. |
The same rationale as JEP 400, that UTF-8 is the de-facto encoding these days. This is for the exceptional case where JDK does not support the native encoding, choosing fallback as UTF-8 is not a bad choice IMO. |
Just to add to Naoto's comment. When the JDK starts in unusual/unsupported configurations then it needs a default for the stdout/stderr print streams, for encoding/decoding files paths in java.io, and for encoding/decoding files paths in the java.nio.file implementation. It's important that the latter two are the same, having one be UTF-8 and the other be 8859-1 won't work. There may be an argument that the JDK should emit a warning at startup. |
Thanks @AlanBateman . I'm very sorry for your confusion... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Updated to emit a warning, if |
This doesn't surprise me. It would be be useful if you could do the same experiment with main line and JDK 17 too. There have been several reports of NPE when trying to startup in environments where the locale does not map to a charset that the JDK supports. In your output I see "WARNING: The encoding of the underlying platform's file system is not supported by the JVM: PT154". Thereafter what you are seeing a recursive initialisation issue. To break this cycle we need jnuEncoding to be set before initialising the file system. |
@takiguc, thank you for trying out the fix and informing us of the issue. I've been using Oracle Linux 7.6 with the same locale setting as yours, but I've never seen that failure. Will need to recreate the error and investigate. |
I don't have Oracle Linux...
I tried following command, "sun.jnu.encoding" system property was referred following source code.
|
In your previous post you used env LC_ALL=kk_KZ.pt154 java Hello.java. Would it be possible to repeat that with JDK 17 and the main-line (latest JDK 18 EA build is okay). I'm wondering if you see the NPE. In any case, our next step here will be to have fallback saved during early startup so that file system and other places don't do the lookup. This seems the sanest way to avoid the recursive initialisation issue. |
I could see following output on my CentOS7.
|
I applied same kind of patch into jdk17u.
|
I reproduced those failures on Debian Linux. Corrected the issue by replacing unsupported jnu encoding with |
if (!jnuEncodingSupported) { | ||
props.setProperty("sun.jnu.encoding", "UTF-8"); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could replace the two fields with something like "notSupportedJnuEncoding" that is only set when not supported. That keeps the additional code in initPhase1 to a minimum.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replaced as suggested.
System.err.printf( | ||
"WARNING: The encoding of the underlying platform's" + | ||
" file system is not supported by the JVM: %s%n", | ||
jnuEncoding); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can drop "by the JVM" from the warning.
Also just to point out that this is running in initPhase3. I'm pretty sure the use of formatters here will execute code that checks VM.isBooted. It might be better to just use string concatenation and avoid loading formatters here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
// Check if sun.jnu.encoding is supporetd or not | ||
var jnuEncoding = props.getProperty("sun.jnu.encoding"); | ||
if (jnuEncoding == null || !Charset.isSupported(jnuEncoding)) { | ||
notSupportedJnuEncoding = jnuEncoding == null ? "null" : jnuEncoding; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we are nearly there. When setting notSupportedJnuEncoding then we also need to set the sun.jnu.encoding property to "UTF-8", setting it in initPhase3 is too late because the file system may be used using initPhase2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initially, I thought that was the case, but I mistook your suggestion as setting only notSupportedJnuEncoding
in initPhase1. Reverted the fallback location into initPhase1()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initially, I thought that was the case, but I mistook your suggestion as setting only
notSupportedJnuEncoding
in initPhase1. Reverted the fallback location intoinitPhase1()
.
np, it's always tricky to touch code that executes early in the startup. I think we are good now.
@naotoj , sorry for bothering you again.
|
Fallback to UTF-8 in jni_uti.c::InitializeEncoding() as well, if the jnu encoding is not supported. @takiguc, would you try the changeset on your environment? |
Hello @naotoj .
These were what I expected. |
private static final Charset jnuEncoding = | ||
Charset.forName( | ||
GetPropertyAction.privilegedGetProperty("sun.jnu.encoding"), | ||
sun.nio.cs.UTF_8.INSTANCE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Startup will ensure that sun.jnu.encoding has a value that is a supported encoding so I assume the the fallback in sun.nio.ch.Util is no longer needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reverted the fix.
return 0; | ||
} | ||
(*env)->GetByteArrayRegion(env, hab, 0, len, (jbyte *)result); | ||
result[len] = 0; /* NULL-terminate */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change to newSizedStringJava is okay but I think the change to getStringBytes needs more eyes. Will it fall through to the code path sets it to FAST_UTF_8 during startup or does that happen later? If later then I assume there is a race with other threads that read jnuEncoding and potential for the JNI global ref to be freed from under their feet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Removed the premature jnuEncoding
initialization in InitializeEncoding()
.
Eliminated premature jnuEncoding initializaiton in jni_util.c::InitializeEncoding()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update, the latest version looks good.
/integrate |
Going to push as commit b8453eb.
Your commit was automatically rebased without conflicts. |
Please review the subject fix. In light of JEP400, Java runtime can/should start in UTF-8 charset if the underlying native encoding is not supported.
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/6282/head:pull/6282
$ git checkout pull/6282
Update a local copy of the PR:
$ git checkout pull/6282
$ git pull https://git.openjdk.java.net/jdk pull/6282/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 6282
View PR using the GUI difftool:
$ git pr show -t 6282
Using diff file
Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/6282.diff