extended.functional testvmcheck_6 crash vmState=0x0005ffff #7247
https://ci.eclipse.org/openj9/job/Test_openjdknext_j9_sanity.functional_x86-64_mac_OpenJDK/62
|
This is the findenv crash again. Looking through OpenJ9, the only setenv I can find is: Looking through OMR, I only see references in AUXV: Looking in the Extensions repo, the references I see are either in the launcher code or AWT related. I wouldn't expect either to be changing the environment after startup. |
https://ci.eclipse.org/openj9/job/Test_openjdk11_j9_extended.functional_x86-64_mac_Nightly/198
|
https://ci.eclipse.org/openj9/job/Test_openjdk8_j9_extended.functional_x86-64_mac_Release/19/
|
https://ci.eclipse.org/openj9/job/Test_openjdk13_j9_extended.functional_x86-64_mac_Release/12
|
From an internal build:
To rebuild the failed tests in https://ci.adoptopenjdk.net/job/Grinder, use the following links: |
https://ci.eclipse.org/openj9/job/Test_openjdk11_j9_extended.functional_x86-64_mac_Nightly/279
|
From an internal test
To rebuild the failed tests in https://ci.adoptopenjdk.net/job/Grinder, use the following links: |
Saw similar crash on internal build, java/lang/ProcessBuilder/Basic.java#id0.Basic_id0
|
assigned to @rpshukla |
https://ci.eclipse.org/openj9/job/Test_openjdk14_j9_sanity.functional_x86-64_mac_Release/8 |
Note the problem occurred a couple of times in grinders in a specific test, as shown in #9108. |
I'll try reproducing the failure locally for |
Note that for a jdk_custom grinder, the name of the test is |
https://ci.eclipse.org/openj9/job/Test_openjdk11_j9_extended.functional_x86-64_mac_Nightly/326 |
Update on my investigation: I haven't yet been able to reproduce this locally on my macbook. However, I did use dapptrace to log system libc calls on a successful run of Looking into the sources for libc on MacOS, I noticed that between libc version 1082.50.1 and 1158.1.2, the function
I was also able to reproduce the failure in a grinder pinned to a machine running MacOS 10.10. 1 of 10 runs failed: https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/2830/ I am running a 100x grinder on MacOS 10.14 to see whether the failure shows up on a newer system: https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/2831/ If the MacOS 10.14 grinder has no failures, then I suspect this segfault is related to the libc version on the system. I'm not sure exactly how libc versions map to MacOS versions, but the problem has definitely been seen on MacOS 10.10. Update: the 100x MacOS 10.14 grinder had no failures. |
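A rough sanity check on that conclusion, under assumptions not stated in the thread: suppose runs are independent and that MacOS 10.14 had the same per-run failure probability $p \approx 0.1$ suggested by the single 1-in-10 result on MacOS 10.10. The chance of seeing 100 clean runs would then be

$$
P(\text{0 failures in 100 runs}) = (1 - p)^{100} = 0.9^{100} \approx 2.7 \times 10^{-5}
$$

The 10% figure comes from only ten runs, so it is a very loose estimate. Even a per-run failure rate of 3% would still give only about $0.97^{100} \approx 0.05$, so a clean 100x run is at least consistent with the failure being specific to the older system.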
100x grinder for |
It looks like macOS 10.12 is the first version to have the 10.11.6 has Libc-1082.60.1 which doesn't have source code available but some of the failures above occurred on a 10.11.6 machine so I'm guessing it has the non-locked version of 10.11.5 has Libc-1082.50.1 which definitely has the non-locked version of I will look into trying to reproduce the failure on a machine with 10.11.6 or earlier. |
https://ci.eclipse.org/openj9/job/Test_openjdk11_j9_sanity.openjdk_x86-64_mac_Nightly/12 |
https://ci.eclipse.org/openj9/job/Test_openjdk11_j9_sanity.openjdk_x86-64_mac_Nightly/13 |
https://ci.eclipse.org/openj9/job/Test_openjdk8_j9_sanity.openjdk_x86-64_mac_Nightly/12 |
https://ci.eclipse.org/openj9/job/Test_openjdk11_j9_sanity.openjdk_x86-64_mac_Nightly/17 |
https://ci.eclipse.org/openj9/job/Test_openjdk8_j9_sanity.openjdk_x86-64_mac_Nightly/18 |
I’ve been trying to find documentation from Apple on the change from the unlocked to the locked versions of libc. So far I haven’t been able to find anything relevant. Looking at the source code, I think this lock was added to protect the static memory accessed by a call to a function called Again, here is the source for the locked version of getenv: https://opensource.apple.com/source/Libc/Libc-1158.1.2/stdlib/FreeBSD/getenv.c.auto.html And here is the source for Note this bit of code inside getenv where the lock is acquired:
Furthermore, according to the POSIX standard, getenv does not need to be thread-safe: https://pubs.opengroup.org/onlinepubs/9699919799/functions/getenv.html. The lock seems to have been added by Apple as a convenience. @andrewcraik fyi |
In looking further at
There are a couple of places where
This is worth trying. We can add code to lock / unlock around getenv calls in the JIT and see if that's sufficient to avoid the crash. Running that way for a month or so should give us a fair bit of confidence in whether or not the crash still occurs |
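For illustration, the lock / unlock idea suggested above could look roughly like the sketch below. This is not the code from the JIT or from any OpenJ9/OMR change; `env_lock`, `locked_getenv`, and `locked_setenv` are hypothetical names, and the sketch assumes a POSIX system with pthreads. The value is copied out while the lock is held, because the pointer returned by `getenv` may become invalid once another thread modifies the environment.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical process-wide lock; not an OpenJ9 or OMR symbol. */
static pthread_mutex_t env_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Copy the value of `name` into `buf` while holding env_lock.
 * Returns the length of the value, or -1 if the variable is unset.
 * A return value >= bufsize means the copy was truncated.
 */
static int
locked_getenv(const char *name, char *buf, size_t bufsize)
{
    int len = -1;

    pthread_mutex_lock(&env_lock);
    const char *value = getenv(name);
    if (value != NULL) {
        len = (int)strlen(value);
        if (bufsize > 0) {
            /* Copy before unlocking: the pointer is only safe under the lock. */
            strncpy(buf, value, bufsize - 1);
            buf[bufsize - 1] = '\0';
        }
    }
    pthread_mutex_unlock(&env_lock);
    return len;
}

/* setenv under the same lock, so it cannot interleave with locked_getenv. */
static int
locked_setenv(const char *name, const char *value)
{
    pthread_mutex_lock(&env_lock);
    int rc = setenv(name, value, 1);
    pthread_mutex_unlock(&env_lock);
    return rc;
}
```

A wrapper like this only helps if every reader and writer of the environment in the process goes through it. Any direct `setenv`, `putenv`, or `getenv` call elsewhere (for example in a library the VM does not control) would still be able to race with it, which is presumably why running with the change for a while would be needed to build confidence.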
I'm looking into locking |
@pshipton Has this crash been seen recently (since April 26th)? I've run 400+ grinder iterations, but I haven't had luck reproducing.
|
No. There is a second issue for this problem (for no good reason), #5153, but the last reported occurrence in there is April 17. |
Still not able to reproduce in grinder. I've started running grinders with some additional options, i.e. Alongside testing, I'm adding locking around the Also wanted to note that #5061 looks like the same problem and has some good discussion there too. |
Ran several more sets of grinders, but no crashes occurred. Tested with the following
Seems like this issue is not reproducible at the moment. If this crash shows up again, I'll circle back to this. |
I've opened #9994 and eclipse/omr#5338 to replace |
Another _findEnv crash with a different vmState. https://ci.eclipse.org/openj9/job/Test_openjdk8_j9_sanity.functional_x86-64_mac_OMR_testList_0/42
|
@AdamBrousseau @jdekonin Can you confirm the version of OSX installed on https://ci.eclipse.org/openj9/computer/osx1011-x86-1/ ? We've been trying to hunt down this |
Machine -2 is the same. Edit: |
Ran some more grinders on
I don't have access to run grinders on the
In case those links don't work, these are the parameters I'm using:
Could also run some grinders on the adopt machines, but it seems there isn't an OSX machine with version 10.11 https://ci.adoptopenjdk.net/label/sw.os.osx/. |
10x osx1011-x86-1 https://ci.eclipse.org/openj9/job/Grinder/934 |
20x osx1011-x86-1 https://ci.eclipse.org/openj9/job/Grinder/936 |
fyi #11430 |
https://ci.eclipse.org/openj9/job/Test_openjdk8_j9_extended.functional_x86-64_mac_Release_testList_1/19
|
https://ci.eclipse.org/openj9/job/Test_openjdk8_j9_extended.functional_x86-64_mac_Nightly_testList_1/241
|
https://ci.eclipse.org/openj9/job/Test_openjdk8_j9_extended.functional_x86-64_mac_Nightly_testList_1/257
|
https://openj9-jenkins.osuosl.org/job/Test_openjdk8_j9_extended.functional_x86-64_mac_Nightly_testList_0/19 - osx1011-x86-1
|
A similar
Also observed at JDK8 0.27 release build https://openj9-jenkins.osuosl.org/job/Test_openjdk8_j9_extended.functional_x86-64_mac_Release/5/tapResults/ |
Another one #13487 (comment) |
https://openj9-jenkins.osuosl.org/job/Test_openjdk8_j9_extended.functional_x86-64_mac_Nightly_testList_0/127 - osx1011-x86-1
|
https://openj9-jenkins.osuosl.org/job/Test_openjdk17_j9_extended.functional_x86-64_mac_Release_testList_0/12 - osx1011-x86-1
|
https://openj9-jenkins.osuosl.org/job/Test_openjdk8_j9_sanity.functional_x86-64_mac_Nightly_testList_1/220 - osx1011-x86-1
|
https://ci.eclipse.org/openj9/job/Test_openjdk8_j9_extended.functional_x86-64_mac_Nightly/153/
Possible dup of #5153