Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8259045: Exception message from saproc.dll is garbled on Windows with Japanese locale #1928

Closed
wants to merge 1 commit into from

Conversation

YaSuenag
Copy link
Member

@YaSuenag YaSuenag commented Jan 4, 2021

I got garbled exception message as following when I run livenmethods CLHSDB command:

sun.jvm.hotspot.debugger.DebuggerException : ?w???????W

My Windows laptop is set Japanese Locale, garbled message was written in Japanese.
saproc.dll would throw exception via ThrowNew() JNI function, but it accepts UTF-8 encoded message. However FormatMessage() Windows API might not return UTF-8 encoded string on Japanese locale.

java.dll (libjava,so) provides good functions to resolve this issue. We can convert localized (non ascii) chars to UTF-8 string. I use them in this PR and remove FormatMessage() call from sadis.c.
And also I remove -D_MBCS from compiler option because MBCS has been already deprecated - it does not seem to add to any other executables.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8259045: Exception message from saproc.dll is garbled on Windows with Japanese locale

Reviewers

Download

$ git fetch https://git.openjdk.java.net/jdk pull/1928/head:pull/1928
$ git checkout pull/1928

@bridgekeeper
Copy link

bridgekeeper bot commented Jan 4, 2021

👋 Welcome back ysuenaga! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Jan 4, 2021

@YaSuenag The following labels will be automatically applied to this pull request:

  • build
  • serviceability

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added serviceability serviceability-dev@openjdk.org build build-dev@openjdk.org labels Jan 4, 2021
@YaSuenag YaSuenag marked this pull request as ready for review January 4, 2021 11:45
@openjdk openjdk bot added the rfr Pull request is ready for review label Jan 4, 2021
@mlbridge
Copy link

mlbridge bot commented Jan 4, 2021

Webrevs

Copy link
Member

@erikj79 erikj79 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Build changes look ok to me, but someone from serviceability needs to comment on the actual change.

@openjdk
Copy link

openjdk bot commented Jan 4, 2021

@YaSuenag This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8259045: Exception message from saproc.dll is garbled on Windows with Japanese locale

Reviewed-by: erikj, cjplummer, iklam

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 37 new commits pushed to the master branch:

  • 940b053: 8259232: Bad JNI lookup during printing
  • df721f0: 8259291: Cleanup unnecessary local variables
  • d20d2fa: 8258143: Update --release 16 symbol information for JDK 16 build 30 or later
  • 3be6e06: 8259312: VerifyCACerts.java fails as soneraclass2ca cert will expire in 90 days
  • 722f236: 8259231: Epsilon: improve performance under contention during virtual space expansion
  • f6cb8c5: 8258908: Remove JVM option CleanChunkPoolAsync
  • c0540ff: 8231627: ThreadsListHandleInErrorHandlingTest.java fails in printing all threads
  • 7e01bc9: 8255264: Support for identifying the full range of IPv4 localhost addresses on Windows
  • 8a05d60: 8259042: Inconsistent use of general primitives loops
  • e3b9da1: 8259287: AbstractCompiler marks const in wrong position for is_c1/is_c2/is_jvmci
  • ... and 27 more: https://git.openjdk.java.net/jdk/compare/a2a3f4a3dcdf3aa7eb9f17d892d1ffe729733330...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jan 4, 2021
Copy link
Contributor

@plummercj plummercj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are probably 25 or so places in our code where we use FormatMessage to get the error message. Are these all going to run into the same FormateMessage bug?

Also, it's not clear to me why you are getting garbled text in the first place. You said "Windows API might not return UTF-8 encoded string on Japanese locale." Why is that the case?


#ifdef _WINDOWS
static int getLastErrorString(char *buf, size_t len)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this being removed because a copy already exists in jni_util_md.c, and you now have access to this copy because you are linking against java.dll?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly.

@YaSuenag
Copy link
Member Author

YaSuenag commented Jan 5, 2021

There are probably 25 or so places in our code where we use FormatMessage to get the error message. Are these all going to run into the same FormateMessage bug?

jdk.hotspot.agent do not have FormatMessage() call in other place.
Did you say about whole JDK code? I haven't checked all of them, but some code (e.g. net_util_md.c) uses JNU_ThrowByName() which is provided by java.dll.

Also, it's not clear to me why you are getting garbled text in the first place. You said "Windows API might not return UTF-8 encoded string on Japanese locale." Why is that the case?

Japanese locale on Windows uses CP932., so FormatMessage() would return error message with Japanese chars which are encoded by CP932. It is not UTF-8.

@iklam
Copy link
Member

iklam commented Jan 5, 2021

Now the saproc DLL has an external reference to getLastErrorString, JNU_NewStringPlatform and JNU_NewObjectByName on all platforms. Do we need to modify the makefiles for platforms other than Windows?

@plummercj
Copy link
Contributor

jdk.hotspot.agent do not have FormatMessage() call in other place.
Did you say about whole JDK code? I haven't checked all of them, but some code (e.g. net_util_md.c) uses JNU_ThrowByName() which is provided by java.dll.

Yes, I was referring to all JDK code. Given how many references there are to FormatMessage(), I'm surprised this issue has not turned up before. I was wondering if there is something unique about the SA reference that makes the issue only turn up there.

@YaSuenag
Copy link
Member Author

YaSuenag commented Jan 5, 2021

Now the saproc DLL has an external reference to getLastErrorString, JNU_NewStringPlatform and JNU_NewObjectByName on all platforms. Do we need to modify the makefiles for platforms other than Windows?

@iklam Agree, so I added LIBS_unix := -ljava to make/modules/jdk.hotspot.agent/Lib.gmk in this change. It works fine on Windows x64 and Linux x64.

@iklam
Copy link
Member

iklam commented Jan 5, 2021

jdk.hotspot.agent do not have FormatMessage() call in other place.
Did you say about whole JDK code? I haven't checked all of them, but some code (e.g. net_util_md.c) uses JNU_ThrowByName() which is provided by java.dll.

Yes, I was referring to all JDK code. Given how many references there are to FormatMessage(), I'm surprised this issue has not turned up before. I was wondering if there is something unique about the SA reference that makes the issue only turn up there.

I looked at a cases in the JDK code where Java_sun_security_pkcs11_wrapper_PKCS11_connect() calls FormatMessage(). It eventually passes the char* to jni_ThrowNew, which eventually gets to Exceptions::new_exception with to_utf8_safe==safe_to_utf8. This means the message will be converted with java_lang_String::create_from_str(), which does NOT call JNU_NewStringPlatform. So I think in this case, the error message will also be garbled.

java.base/windows/native/libnet/Inet4AddressImpl.c seems to have a similar problem in the ping4 function.

But, I don't know how to write a simple test case for the above.

@YaSuenag
Copy link
Member Author

YaSuenag commented Jan 5, 2021

@iklam Thanks for your review!

I looked at a cases in the JDK code where Java_sun_security_pkcs11_wrapper_PKCS11_connect() calls FormatMessage(). It eventually passes the char* to jni_ThrowNew, which eventually gets to Exceptions::new_exception with to_utf8_safe==safe_to_utf8. This means the message will be converted with java_lang_String::create_from_str(), which does NOT call JNU_NewStringPlatform. So I think in this case, the error message will also be garbled.

java.base/windows/native/libnet/Inet4AddressImpl.c seems to have a similar problem in the ping4 function.

Agree. ping4() calls NET_ThrowNew() which passes ThrowNew() JNI function without modifying arguments.
I looked at other cases in JDK code, following places has possible to garble:

  • throwByName() @ jdk.crypto.cryptoki/share/native/libj2pkcs11/p11_util.c
    • Java_sun_security_pkcs11_Secmod_nssLoadLibrary() @ j2secmod_md.c
  • jdk.jdi/share/native/libdt_shmem/SharedMemoryTransport.c
    • throwShmemException() @ SharedMemoryTransport.c

But, I don't know how to write a simple test case for the above.

Agree, we might need to fix garbled messages one by one when we find out issues.
IMHO we need to establish rule that we have to use JNU_NewStringPlatform() if we want to throw exception with message from platform (FormatMessage(), strerror()) from JNI.

@plummercj
Copy link
Contributor

Given that this seems to be a common problem in our code, and likely a very very old problem at that, why has it never been reported before? I'm not questioning the fix except to the extent that I'm questioning our understanding of the problem.

@iklam
Copy link
Member

iklam commented Jan 5, 2021

Given that this seems to be a common problem in our code, and likely a very very old problem at that, why has it never been reported before? I'm not questioning the fix except to the extent that I'm questioning our understanding of the problem.

I think it's probably because the errors only happen in very rare cases. For example, when a DLL is missing from the JAVA_HOME. For more common cases, such as Process_impl.c, the code has been rewritten to use FormatMessageW, which is unicode.

@YaSuenag
Copy link
Member Author

YaSuenag commented Jan 5, 2021

Given that this seems to be a common problem in our code, and likely a very very old problem at that, why has it never been reported before? I'm not questioning the fix except to the extent that I'm questioning our understanding of the problem.

I think it's probably because the errors only happen in very rare cases. For example, when a DLL is missing from the JAVA_HOME. For more common cases, such as Process_impl.c, the code has been rewritten to use FormatMessageW, which is unicode.

In addition, this problem can be seen on non-ASCII locales such as CJK, so I guess many people overlooked it.
And also I think it is not just a problem on Windows because some older Linux machines on Japan set locale to EUC JP.

In case of ProcessImpl_md.c which @iklam pointed, result of FormatMessageW() which is encoded WideString (UTF-16) will be converted to UTF-8 with WideCharToMultiByte() at win32Error(), so it should work fine.

@YaSuenag
Copy link
Member Author

YaSuenag commented Jan 6, 2021

/integrate

@openjdk openjdk bot closed this Jan 7, 2021
@openjdk openjdk bot added integrated Pull request has been integrated and removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Jan 7, 2021
@openjdk
Copy link

openjdk bot commented Jan 7, 2021

@YaSuenag Since your change was applied there have been 39 commits pushed to the master branch:

  • 28e1f4d: 8247957: remove doclint support for HTML 4
  • 80544e4: 8250564: Remove terminally deprecated constructor in GSSUtil
  • 940b053: 8259232: Bad JNI lookup during printing
  • df721f0: 8259291: Cleanup unnecessary local variables
  • d20d2fa: 8258143: Update --release 16 symbol information for JDK 16 build 30 or later
  • 3be6e06: 8259312: VerifyCACerts.java fails as soneraclass2ca cert will expire in 90 days
  • 722f236: 8259231: Epsilon: improve performance under contention during virtual space expansion
  • f6cb8c5: 8258908: Remove JVM option CleanChunkPoolAsync
  • c0540ff: 8231627: ThreadsListHandleInErrorHandlingTest.java fails in printing all threads
  • 7e01bc9: 8255264: Support for identifying the full range of IPv4 localhost addresses on Windows
  • ... and 29 more: https://git.openjdk.java.net/jdk/compare/a2a3f4a3dcdf3aa7eb9f17d892d1ffe729733330...master

Your commit was automatically rebased without conflicts.

Pushed as commit 67c2211.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@YaSuenag YaSuenag deleted the JDK-8259045 branch January 7, 2021 00:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
build build-dev@openjdk.org integrated Pull request has been integrated serviceability serviceability-dev@openjdk.org
4 participants