jdk19 Compiling method: java/lang/Thread.genThreadName() crash vmstate 0005FFFF #15855
Comments
Maybe related to #15251 I only started running regular testing on jdk19 two nights ago. |
@r30shah ^^ |
Looking at the core dump from one of the failures in the diagnostic files (.../aqa-tests/TKG/output_16626899173765/jdk_util_1/work/java/util/concurrent/Semaphore/RacingReleases/core.20220908.225950.46458.0001.dmp), I see the following stack trace.
So the failure happens when we are installing an AOT cached method. The segmentation fault happens while trying to find an entry in the hash table, where we call the
So when we come into the Further checking down the stack from where we pass in this wrong object, I see that a call made from vmStruct = 0x000003ff10002200 - VMThread pointer On the surface, it looks like we have emitted code in the AOT compiled body for which we do not have a relocation yet, but confirming that needs further investigation. @dchopra001 Can you please take a look at this? I did launch an internal grinder and it seems to be reproducible there as well. openj9/runtime/util/modularityHelper.c Line 78 in b8752ec
[3]. openj9/runtime/vm/resolvesupport.cpp Line 710 in b8752ec
[4]. openj9/runtime/vm/resolvesupport.cpp Line 257 in b8752ec
|
Any new news on this one? |
@dchopra001 Can you post an update on your investigation of this issue? |
I haven't been able to reproduce this one locally, so instead I'm trying to reproduce the failure in the internal Jenkins grinder infrastructure. So far I've been able to reproduce the crash, and some of the rtlogs generated during the AOT load are available. However, I don't see the original AOT compilation trace logs anywhere in the failing jobs, and I'll need those to progress further on this one. It's possible that the original JVM that does the AOT compilation doesn't receive the JVM options, or that the logs are deleted for whatever reason. I'm experimenting with the options and will update here once I know more or am able to reproduce the failure with enough tracing info. |
https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_extended.functional_s390x_linux_Nightly_testList_0/14
|
https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_sanity.openjdk_s390x_linux_Nightly/22 There is a core / javacore / Snap in .\aqa-tests\TKG\output_16650183664248\jdk_util_1\work\java\util\concurrent\FutureTask\ExplicitSet
|
I suspect there are a number of dups of this issue (timeouts) that weren't identified as this issue. |
https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_extended.functional_s390x_linux_Nightly_testList_1/18
|
https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_sanity.openjdk_s390x_linux_Nightly/26/ It's not actually running a test but doing cleanup before running the test. core, javacore under
|
https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_extended.system_s390x_linux_Nightly_testList_2/32 https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_sanity.openjdk_s390x_linux_Nightly/33 |
@dchopra001 If you are unable to narrow it down to the failing change set, and getting the AOT compilation logs from the failure is the next necessary step, then let's see if there is a way to manually execute the steps from the failing job. |
I did finally manage to get an AOT log for this. Due to the way the files are moved around post crash by the test harness the rtlog isn't there. I think I should still be able to get some information out of this though. I'm looking through the results right now. Hopefully I'll have an update soon. |
This problem happens because of a
The Here's the table: openj9/runtime/compiler/optimizer/IdiomTransformations.cpp Lines 10452 to 10474 in 1af0a29
I'm not sure why we've only started seeing this in Java 19 as this code looks to have been around for a while. The fix will probably be either adding a new relo record or disabling this transformation for relocatable compiles (I'm not sure if the perf gains are meaningful here). I've launched a grinder to see if this fails when we disable the transformation. I noticed that @hzongaro modified code in this area earlier this year. @hzongaro any chance this issue could be related to your PR? |
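To make the two candidate fixes concrete, here is a minimal toy model in C++. It is only a sketch under assumptions: it assumes the problem has the general shape described in this thread (AOT compiled code carrying a value that is valid only in the JVM that compiled it, with no relocation record to patch it at load time). Every name in it (`ToyProcess`, `ToyCompiledBody`, `Policy`, `compile`, `load`, `isValidIn`) is hypothetical and is not an OpenJ9 or OMR API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy model; none of these types exist in OpenJ9.
struct ToyProcess {
    std::uintptr_t tableBase;  // where the lookup table lives in this JVM
};

struct ToyRelocation {
    std::size_t slot;  // index of the code slot that holds a table address
};

struct ToyCompiledBody {
    std::vector<std::uintptr_t> slots;       // stand-in for emitted code
    std::vector<ToyRelocation> relocations;  // records applied at load time
};

enum class Policy {
    EmbedRaw,             // embed the address, no relocation (the bug shape)
    EmbedWithRelocation,  // candidate fix 1: add a relocation record
    SkipTransformation    // candidate fix 2: do not transform when relocatable
};

// "Compile" in one process: optionally bake that process's table address in.
ToyCompiledBody compile(const ToyProcess& compiler, Policy policy) {
    ToyCompiledBody body;
    if (policy == Policy::EmbedRaw || policy == Policy::EmbedWithRelocation) {
        body.slots.push_back(compiler.tableBase);
    }
    if (policy == Policy::EmbedWithRelocation) {
        body.relocations.push_back(ToyRelocation{0});
    }
    return body;
}

// "Load" in another process: patch every slot that has a relocation record.
void load(ToyCompiledBody& body, const ToyProcess& loader) {
    for (const ToyRelocation& r : body.relocations) {
        assert(r.slot < body.slots.size());
        body.slots[r.slot] = loader.tableBase;
    }
}

// A body is valid in a process if every embedded address matches that process.
bool isValidIn(const ToyCompiledBody& body, const ToyProcess& p) {
    for (std::uintptr_t s : body.slots) {
        if (s != p.tableBase) return false;
    }
    return true;
}
```

In this model, a body compiled with `Policy::EmbedRaw` stays valid only in the process that compiled it, while the other two policies remain valid after `load` in a process with a different `tableBase`. Skipping the transformation trades whatever performance it provides for not needing a new relocation kind.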
Dhruv @dchopra001, I'll take a look |
Similar segmentation error seen on RTC 148250 |
https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_extended.functional_s390x_linux_Nightly_testList_0/28 |
@jdmpapin mentioned in an offline discussion that it should be reasonably okay to disable this transformation for relocatable compiles. I was curious as to why this was only showing up now in Java 19. @hzongaro took a look at the logs for Java 17 and 19 earlier, and it seems like it might just be luck that this is so prevalent in Java 19 now. |
I'm testing out a fix for this in the internal builds and will open a PR once the grinder passes. |
Thank you Dhruv |
My test grinder with the fix failed. It seems to be a similar issue. I'm not sure why yet, so I have launched another grinder to collect trace logs again. |
I wasn't able to get the logs when the failure occurred due to an infra failure:
Trying again. |
I've opened PRs here: |
https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_sanity.openjdk_s390x_linux_Nightly/45 https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_extended.system_s390x_linux_Nightly_testList_0/44 |
@dchopra001 Do you think this will be resolved within 2 weeks? |
Yes, the PRs should be approved+merged soon. |
https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_sanity.openjdk_s390x_linux_Nightly/3/
jdk_util_1\work\java\util\concurrent\FutureTask\ExplicitSet\
jdk_util_1\work\java\util\concurrent\Semaphore\RacingReleases
jdk_util_1\work\java\util\jar\JarFile\jarVerification\MultiProviderTest\
contains a javacore and core file for a gpf.
https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk19_j9_sanity.openjdk_s390x_linux_Nightly/3/openjdk_test_output.tar.gz
Changes from previous build:
2712974...0ed4cfa
eclipse-openj9/openj9-omr@16e09fb...560f16b
@tajila