DDR should detect truncated core files #8983
Comments
dbx seems to agree there isn't any memory at 0x30000000 in the problematic core. I didn't find any way to get a memory map from dbx, although
For the record, I'm not expecting my change to have any effect on non-mixed builds.
@keithc-ca I got a core from last night's build and it's working. I don't know whether that is because the problem is fixed or because this build just didn't hit it. Given that a couple of previous builds of this kind that I checked didn't work, it looks promising. For the record, can you please point out which part of the change may have fixed it?
Prior to #9026,
@keithc-ca @gacholio was there a prior change that broke it? We need to check if any release branches need to be fixed.
Yes, my previous change to the RAS init is what broke this.
#5783 is the culprit.
That was delivered in May 2019. We should put a fix for this problem into the 0.20.0 release so we can service the release. @DanHeidinga fyi
@pshipton has graciously volunteered to do the backporting.
Backporting the rasdump.c change from #9026 is probably the safest course - all of the other changes should have no effect on single-mode builds.
Originally broken by eclipse-openj9#5783, fixed by eclipse-openj9#9026.
Port the rasdump.c changes from eclipse-openj9#9026 to the 0.20.0 release branch to resolve eclipse-openj9#8983, AIX core files which cannot be read by DDR.
[ci skip]
Signed-off-by: Peter Shipton <Peter_Shipton@ca.ibm.com>
Created #9117 for the backport.
I looked at The
According to [1], the second byte is
[1] https://www.ibm.com/support/knowledgecenter/en/ssw_aix_72/filesreference/core.html
Obviously it would be useful to see a message that the core is truncated. Opening the core in dbx, I see a message about truncation. I thought I had done the same with a previous core that DDR could not read (that core is now gone), but I didn't see any truncation message then. I'll keep this open for adding a truncation message, but move it to the next milestone.
With #9199,
This is still open until truncation messages are added for other platforms.
@keith am I correct that DDR detects truncated core files on AIX and Linux (ELF)? If so, I'd be inclined to close this until somebody complains about a problem on another platform.
Yes, DDR detects truncated core files on AIX and Linux.
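For readers unfamiliar with what "truncated" means for an ELF core, here is a minimal sketch in Python of one common check: each program header says where its segment's bytes live in the file (p_offset, p_filesz), so a segment that ends past the end of the file indicates truncation. This is only an illustration and not DDR's actual implementation; the names find_truncated_segments and build_core are hypothetical, only ELF64 is handled, and the extended program header count (e_phnum == 0xffff) is not handled.

```python
import struct

ELF64_EHDR = "16sHHIQQQIHHHHHH"  # ELF64 file header layout, 64 bytes
ELF64_PHDR = "IIQQQQQQ"          # ELF64 program header layout, 56 bytes
PT_LOAD = 1


def find_truncated_segments(data):
    """Return (vaddr, expected_end_offset) for every segment that ends past EOF.

    Hypothetical helper for illustration; not taken from the DDR sources.
    """
    if len(data) < struct.calcsize("<" + ELF64_EHDR) or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file (or the header itself is truncated)")
    if data[4] != 2:  # EI_CLASS: 2 means 64-bit
        raise ValueError("this sketch only handles ELF64")
    endian = "<" if data[5] == 1 else ">"  # EI_DATA: 1 little, 2 big endian
    ehdr = struct.unpack_from(endian + ELF64_EHDR, data, 0)
    e_phoff, e_phentsize, e_phnum = ehdr[5], ehdr[9], ehdr[10]

    truncated = []
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        if off + struct.calcsize(endian + ELF64_PHDR) > len(data):
            raise ValueError("program header table is truncated")
        fields = struct.unpack_from(endian + ELF64_PHDR, data, off)
        p_offset, p_vaddr, p_filesz = fields[2], fields[3], fields[5]
        end = p_offset + p_filesz
        if end > len(data):  # segment bytes extend beyond the file
            truncated.append((p_vaddr, end))
    return truncated


def build_core(segments):
    """Build a tiny little-endian ELF64 core image from (vaddr, size) pairs.

    Test helper with made-up contents; real cores also carry PT_NOTE data.
    """
    ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)
    count = len(segments)
    offset = 64 + 56 * count  # segment data starts after the headers
    phdrs, body = b"", b""
    for vaddr, size in segments:
        phdrs += struct.pack("<" + ELF64_PHDR, PT_LOAD, 6, offset, vaddr, 0, size, size, 1)
        body += bytes(size)
        offset += size
    ehdr = struct.pack("<" + ELF64_EHDR, ident, 4, 62, 1, 0, 64, 0, 0, 64, 56, count, 0, 0, 0)
    return ehdr + phdrs + body


if __name__ == "__main__":
    core = build_core([(0x30000000, 16), (0x7F0000000000, 32)])
    print(find_truncated_segments(core))       # complete image
    print(find_truncated_segments(core[:-8]))  # last 8 bytes cut off
```

A reader that finds any truncated segment can report it up front instead of failing later with a less direct symptom such as "No JRE".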
extended.system testing has been failing on AIX jdk11+ for a while due to OOM problems related to setting MALLOCOPTIONS. I tried to look at some of the core files from xlinux, but all the ones I've tried show "No JRE". I also recall @dmitripivkine got a core for a crash on AIX, and had a similar problem. I couldn't find the issue, but maybe Dmitri can track it down.
At the time Dmitri found the core I tried a simple test, but the core file produced was readable across platforms.
I'm wondering if there is something going wrong when trying to locate the JRE in the core. Perhaps not all the memory segments are being found properly. Seems to me there was a problem in this area fairly recently.
Looking at the allocateRASStruct() code, it starts looking to put the RAS struct at 0x30000000. Looking at the core, the lowest addresses are the following. This must be incorrect because there has to be memory below 4G to support compressed refs.
Looking at a core created by a simple command (java -Xdump:system), I see low memory
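The low-memory reasoning in this description (is there any memory at 0x30000000, and is there any memory below 4G at all?) can be sketched as a check over a core's memory segments. This is an illustrative Python sketch only: the function names are hypothetical, segments are assumed to be (start, size) pairs already extracted from the core, and the addresses in the example are made up rather than taken from the problematic core.

```python
FOUR_GB = 1 << 32


def segment_containing(segments, address):
    """Return the (start, size) segment covering address, or None if unmapped."""
    for start, size in segments:
        if start <= address < start + size:
            return (start, size)
    return None


def has_memory_below_4g(segments):
    """True if at least one non-empty segment starts below the 4G boundary."""
    return any(size > 0 and start < FOUR_GB for start, size in segments)


if __name__ == "__main__":
    # Hypothetical segment list with nothing below 4G, for illustration only.
    suspicious = [(0x100000000, 0x1000), (0x7000000000, 0x2000)]
    print(segment_containing(suspicious, 0x30000000))
    print(has_memory_below_4g(suspicious))
```

If both checks come back empty for a core from a compressed refs JVM, that suggests the reader is missing segments (or the file is incomplete) rather than that the memory never existed.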