Skip to content
This repository has been archived by the owner. It is now read-only.

8269335: Unable to load svml library #143

Closed
wants to merge 8 commits into from
Closed

Conversation

@sviswa7
Copy link
Contributor

@sviswa7 sviswa7 commented Jun 24, 2021

The OpenJDK 17 early access build from jdk.java.net fails to load svml library.
We need to give full svml library path to dll_load call for the load to succeed.,


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk17 pull/143/head:pull/143
$ git checkout pull/143

Update a local copy of the PR:
$ git checkout pull/143
$ git pull https://git.openjdk.java.net/jdk17 pull/143/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 143

View PR using the GUI difftool:
$ git pr show -t 143

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk17/pull/143.diff

@bridgekeeper
Copy link

@bridgekeeper bridgekeeper bot commented Jun 24, 2021

👋 Welcome back sviswanathan! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

@openjdk openjdk bot commented Jun 24, 2021

@sviswa7 The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@sviswa7 sviswa7 marked this pull request as ready for review Jun 24, 2021
@openjdk openjdk bot added the rfr label Jun 24, 2021
@mlbridge
Copy link

@mlbridge mlbridge bot commented Jun 24, 2021

Copy link
Member

@PaulSandoz PaulSandoz left a comment

Unsure how this slipped through when we tested, but i think we need an explicit test. The test can execute a Java process with logging enabled and examine its log output (since the loading of SVML is logged). Searching the test code base should find some examples to copy.

@@ -7000,7 +7001,12 @@ address generate_avx_ghash_processBlocks() {
// Get svml stub routine addresses
void *libsvml = NULL;
char ebuf[1024];
libsvml = os::dll_load(JNI_LIB_PREFIX "svml" JNI_LIB_SUFFIX, ebuf, sizeof ebuf);
char dll_name[JVM_MAXPATHLEN];
int ret = jio_snprintf(dll_name, sizeof(dll_name), "%s%slib%s" JNI_LIB_PREFIX "svml" JNI_LIB_SUFFIX,
Copy link
Member

@PaulSandoz PaulSandoz Jun 24, 2021

Can we use os::dll_locate_lib?

@DamonFool
Copy link
Member

@DamonFool DamonFool commented Jun 24, 2021

I used to test the svml in JDK17.
It seems to work on Linux.

So how can I reproduce this bug?
Thanks.

@sviswa7
Copy link
Contributor Author

@sviswa7 sviswa7 commented Jun 24, 2021

@DamonFool The problem shows up only with the build on jdk.java.net. I also didn't see this problem while testing with internal builds.

@DamonFool
Copy link
Member

@DamonFool DamonFool commented Jun 24, 2021

@DamonFool The problem shows up only with the build on jdk.java.net. I also didn't see this problem while testing with internal builds.

So why it works for our internal builds but fails with the builds on jdk.java.net?

@sviswa7
Copy link
Contributor Author

@sviswa7 sviswa7 commented Jun 24, 2021

@PaulSandoz I have attempted to use dll_locate_lib. Please see if this looks ok to you.

@sviswa7
Copy link
Contributor Author

@sviswa7 sviswa7 commented Jun 24, 2021

@DamonFool Very good question. It took me sometime to figure this out. There is something in java.exe which makes the difference. If you replace the java.exe in the internal build with the one on jdk.java.net you will be able to reproduce the problem. Giving the full path to dll_load in libjvm.so as in this patch fixes the issue.

@DamonFool
Copy link
Member

@DamonFool DamonFool commented Jun 25, 2021

@DamonFool Very good question. It took me sometime to figure this out. There is something in java.exe which makes the difference. If you replace the java.exe in the internal build with the one on jdk.java.net you will be able to reproduce the problem. Giving the full path to dll_load in libjvm.so as in this patch fixes the issue.

I didn't get it tested on windows before.

Does the original code also work on Windows with your internal builds?
If so, I think more investigation is needed to figure out the root cause.
Thanks.

@sviswa7
Copy link
Contributor Author

@sviswa7 sviswa7 commented Jun 25, 2021

@DamonFool The internal builds worked on both the platforms (Linux and Windows). Placing the built jvm.dll or libjvm.so with this patch in the jdk.java.net build also works on both the platforms.
Please let me know if you have a better solution. This is the best that I could come up with. Please do keep in mind, we don't have too much time before RDP2.

@PaulSandoz
Copy link
Member

@PaulSandoz PaulSandoz commented Jun 25, 2021

/cc hotspot-runtime

@openjdk
Copy link

@openjdk openjdk bot commented Jun 25, 2021

@PaulSandoz
The hotspot-runtime label was successfully added.

@PaulSandoz
Copy link
Member

@PaulSandoz PaulSandoz commented Jun 25, 2021

@PaulSandoz I have attempted to use dll_locate_lib. Please see if this looks ok to you.

Be useful if someone from hotspot-runtime can look, @dholmes-ora WDYT?

We still need to add a test.

@DamonFool
Copy link
Member

@DamonFool DamonFool commented Jun 25, 2021

@DamonFool The internal builds worked on both the platforms (Linux and Windows). Placing the built jvm.dll or libjvm.so with this patch in the jdk.java.net build also works on both the platforms.
Please let me know if you have a better solution. This is the best that I could come up with. Please do keep in mind, we don't have too much time before RDP2.

I'm not sure if the change really fix the problem since the root cause still remains unknown.

Replacing the jvm.dll or libjvm.so makes very little sense to me.

A better way to verify the fix is re-building the whole image with the patch on Oracle's platform.
But that is still not enough.
Even though it gets passed on Oracle's machines, we still don't know if it really get fixed on all others' platforms.

I don't have a Windows machine so it's hard for me to investigate it.
Thanks.

@openjdk openjdk bot removed the rfr label Jun 25, 2021
@sviswa7
Copy link
Contributor Author

@sviswa7 sviswa7 commented Jun 25, 2021

@PaulSandoz I have added a test. Please let me know if it looks ok.

@openjdk openjdk bot added the rfr label Jun 25, 2021
@dholmes-ora
Copy link
Member

@dholmes-ora dholmes-ora commented Jun 25, 2021

@PaulSandoz I think we need input from the build team to determine why the java.net binary exhibits different behaviour to an internal build.

@@ -0,0 +1,76 @@
/*
* Copyright (c) 2020, 2021, Oracle and/or its affiliates. All rights reserved.
Copy link
Member

@DamonFool DamonFool Jun 25, 2021

I think the copyright year should be 2021 only since it's a new file, right?

Copy link
Contributor Author

@sviswa7 sviswa7 Jun 25, 2021

Thanks for catching it. I will correct the copyright year.

ProcessBuilder pb = ProcessTools.createJavaProcessBuilder(
"-Xmn8m", "-Xlog:library=info",
"--add-modules=jdk.incubator.vector",
"LoadSvmlTest$VectorTest");
Copy link
Member

@PaulSandoz PaulSandoz Jun 25, 2021

Can you do VectorTest.class.getName() instead?

@PaulSandoz
Copy link
Member

@PaulSandoz PaulSandoz commented Jun 25, 2021

@PaulSandoz I have added a test. Please let me know if it looks ok.

Test looks good. I will run through our build/test system tomorrow morning, and try to get some details on build differences.

@sviswa7
Copy link
Contributor Author

@sviswa7 sviswa7 commented Jun 25, 2021

Thanks a lot Paul.

char dll_path[JVM_MAXPATHLEN];
#ifdef _WINDOWS
int ret = jio_snprintf(dll_path, sizeof(dll_path), "%s%sbin", Arguments::get_java_home(), os::file_separator());
#else
int ret = jio_snprintf(dll_path, sizeof(dll_path), "%s%slib", Arguments::get_java_home(), os::file_separator());
#endif
Copy link
Member

@dholmes-ora dholmes-ora Jun 25, 2021

There is a platform independent way to do this. Here is the example of how libverify is loaded by the JVM:

// Load verify dll
char buffer[JVM_MAXPATHLEN];
char ebuf[1024];
if (!os::dll_locate_lib(buffer, sizeof(buffer), Arguments::get_dll_dir(), "verify"))
return NULL; // Caller will throw VerifyError

void *lib_handle = os::dll_load(buffer, ebuf, sizeof(ebuf));

@sviswa7
Copy link
Contributor Author

@sviswa7 sviswa7 commented Jun 25, 2021

@dholmes-ora Thanks a lot. I have updated the patch accordingly.

@dholmes-ora
Copy link
Member

@dholmes-ora dholmes-ora commented Jun 25, 2021

I've checked the layout of the jdk.java.net binaries and those we use in our internal testing and see no difference, so I cannot explain why one case would fail when the other succeeds.

I'll take the latest patch for a spin through our build/test system and pass the link to @PaulSandoz to check progress/success.

David

if (Platform.isWindows())
output.shouldMatch("Loaded library .*svml.dll");
else
output.shouldMatch("Loaded library .*libsvml.so");
Copy link
Member

@dholmes-ora dholmes-ora Jun 25, 2021

You could just check for .svml. and not worry about dll vs so. That should suffice and keeps the test simpler.

Copy link
Contributor Author

@sviswa7 sviswa7 Jun 25, 2021

Updated the test accordingly. Thanks.

Copy link
Member

@PaulSandoz PaulSandoz left a comment

All tests pass. I will talk with the build folks to find a root cause but i am confident it's more robust due to the use of os::dll_locate_lib.

If you can please update the test to use the class name, rather explicitly using the result of that.

@openjdk
Copy link

@openjdk openjdk bot commented Jun 25, 2021

@sviswa7 This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8269335: Unable to load svml library

Reviewed-by: psandoz

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 12 new commits pushed to the master branch:

  • 4eb3212: 8268871: Adjust javac to updated exhaustiveness specification
  • 44691cc: 8268972: Add default impl for recent new Reporter.print method
  • 7ab1285: 8267610: NPE at at jdk.compiler/com.sun.tools.javac.jvm.Code.emitop
  • cfa6a99: 8269316: ProblemList vmTestbase/vm/mlvm/mixed/stress/regression/b6969574/INDIFY_Test.java on Linux-X64 -Xcomp
  • 22d8675: 8269315: ProblemList javax/swing/JFileChooser/FileSystemView/SystemIconTest.java on Win-X64
  • 443a79a: 8269314: ProblemList serviceability/dcmd/gc/RunFinalizationTest.java on Win-X64 and linux-aarch64
  • 424cc50: 8269307: ProblemList java/awt/KeyboardFocusmanager/TypeAhead/ButtonActionKeyTest/ButtonActionKeyTest.java on win-x64
  • 63bcd33: 8269246: Scoped ByteBuffer vector access
  • 3fb28d3: 8269218: GaloisCounterMode.overlapDetection misses the JDK-8263436 fix again
  • d3d3b22: 8269265: ProblemList serviceability/sa/TestJmapCoreMetaspace.java with ZGC
  • ... and 2 more: https://git.openjdk.java.net/jdk17/compare/b4743143428a3e0c9a6d1d7dcaf73f7a06882e84...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready label Jun 25, 2021
@sviswa7
Copy link
Contributor Author

@sviswa7 sviswa7 commented Jun 25, 2021

@PaulSandoz I have updated the test to use VectorTest.class.getName(). Please let me know if I can integrate.

Copy link
Member

@PaulSandoz PaulSandoz left a comment

Looks good.

@sviswa7
Copy link
Contributor Author

@sviswa7 sviswa7 commented Jun 25, 2021

/integrate

@openjdk
Copy link

@openjdk openjdk bot commented Jun 25, 2021

Going to push as commit 1e3b418.
Since your change was applied there have been 12 commits pushed to the master branch:

  • 4eb3212: 8268871: Adjust javac to updated exhaustiveness specification
  • 44691cc: 8268972: Add default impl for recent new Reporter.print method
  • 7ab1285: 8267610: NPE at at jdk.compiler/com.sun.tools.javac.jvm.Code.emitop
  • cfa6a99: 8269316: ProblemList vmTestbase/vm/mlvm/mixed/stress/regression/b6969574/INDIFY_Test.java on Linux-X64 -Xcomp
  • 22d8675: 8269315: ProblemList javax/swing/JFileChooser/FileSystemView/SystemIconTest.java on Win-X64
  • 443a79a: 8269314: ProblemList serviceability/dcmd/gc/RunFinalizationTest.java on Win-X64 and linux-aarch64
  • 424cc50: 8269307: ProblemList java/awt/KeyboardFocusmanager/TypeAhead/ButtonActionKeyTest/ButtonActionKeyTest.java on win-x64
  • 63bcd33: 8269246: Scoped ByteBuffer vector access
  • 3fb28d3: 8269218: GaloisCounterMode.overlapDetection misses the JDK-8263436 fix again
  • d3d3b22: 8269265: ProblemList serviceability/sa/TestJmapCoreMetaspace.java with ZGC
  • ... and 2 more: https://git.openjdk.java.net/jdk17/compare/b4743143428a3e0c9a6d1d7dcaf73f7a06882e84...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot closed this Jun 25, 2021
@openjdk openjdk bot added integrated and removed ready rfr labels Jun 25, 2021
@openjdk
Copy link

@openjdk openjdk bot commented Jun 25, 2021

@sviswa7 Pushed as commit 1e3b418.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
4 participants