-
Notifications
You must be signed in to change notification settings - Fork 6.2k
8367609: serviceability/sa/ClhsdbPmap.java fails when built with Clang #27274
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
👋 Welcome back fandreuzzi! A progress list of the required criteria for merging this PR into |
|
@fandreuz This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be: You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 49 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@kevinjwalls, @plummercj) but any other Committer may sponsor as well. ➡️ To flag this PR as ready for integration with the above commit message, type |
Webrevs
|
|
Hi - Do you still have the same core? Can you attach a "readelf -a " output from it in the jbs issue with the other (helpful) files? Would like to see how things get mapped. It's interesting to me at the moment that in a random build of mine with gcc, libjvm has e.g. ..so loadable text is just part of the mapping at the base address, no offset. A core of that contains: But your clang build has the text with some offset and vaddr: ..so does that appear as a distinct PH in the core? |
|
Hi @kevinjwalls, I don't have the same core. I ran the test another time, you find the new files attached to the ticket (those starting with This is the corresponding entry in the new core: |
|
Thanks - does that still have the same problem? libjvm second seg: libjvm binary: These figures look like they work, for that check which has been working for gcc builds: ROUNDUP(existing_map->memsz, page_size) != ROUNDUP(lib_php->p_memsz, page_size) left: core file memsize: 0x0000000000e23000 Was core 1 different? Looks like a slightly smaller libjvm in that run (mem size was 0e225c0 and the core contained a 0xe24000 size mapping, which was the problem). |
|
Hi @kevinjwalls, I think that might be my fault. I rerun the test and I noticed the failure this time is not in I think this happens by chance. I added the files from the new round to the ticket. |
Thanks looking again - do you still have the files, as libjimage_sections.txt ... has the Section Headers, not the Program Headers (readelf -a is fine 8-) ) |
|
Hi @kevinjwalls, I just uploaded |
|
Thanks for the additional info. |
|
Hi @kevinjwalls, what you say makes sense to me. I tried the updated check on my environment and it fixed the problem as well as what I initially proposed in this PR. I updated the PR too, the new check looks cleaner than what I had. |
kevinjwalls
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK glad it works. 8-)
Maybe a comment above line 423 so say something like:
// Account for the PH being at some vaddr offset from mapping in core file.
We could have left the ROUNDUP on line 427 to the comparison on line 431 to make it more like the original. I think either way is good.
Good for me, let's check Chris who looked at it also agrees.
| // Account for the PH being at some vaddr offset from mapping in core file. | ||
| uint64_t lib_memsz = lib_php->p_memsz; | ||
| if (target_vaddr > existing_map->vaddr) { | ||
| lib_memsz += target_vaddr - existing_map->vaddr; | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this change looks good, but just want to make sure I'm understanding it properly. Kevin commented the following:
hsdb> ERROR: address conflict @ 0x7fa9ff0e4440 (existing map size = 102400, size = 97328, flags = 5)
print_error("address conflict @ 0x%lx (existing map size = %ld, size = %ld, flags = %d)\n",
target_vaddr, existing_map->memsz, lib_php->p_memsz, lib_php->p_flags);
core has no mapping at exactly 0x7fa9ff0e4440 but has:
Start End Page Offset
0x00007fa9ff0db000 0x00007fa9ff0e4000 0x0000000000000000 /home/fandreuz/code/jdk/build/clang/images/jdk/lib/libjimage.so (0x9000 size) 0x841c rounded up
0x00007fa9ff0e4000 0x00007fa9ff0fd000 0x0000000000000008 /home/fandreuz/code/jdk/build/clang/images/jdk/lib/libjimage.so (0x19000 size) the existing map size from the error
So the problem here is that we were expecting 0x00007fa9ff0e4000 but got 0x7fa9ff0e4440. Basically target_vaddr is at an unexpected offset from existing_map->vaddr. The fix is to ad this offset to lib_memsz so the error is no longer triggered. Do we understand why this offset is happening?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we understand why this offset is happening?
It looks like the binary is just built differently, clang or the linker or some combination arranges things differently to what we have seen from gcc.
Maybe it's enough to say there doesn't have to be a 1:1 mapping from program headers to segments in the process.
We can have a core file segment that contains more than one program header from the library (that 0x19000 size segment at the libjimage base address).
So, new question:
we must have been through this loop already, in the libjimage example with the first LOAD PH of size 0x000000000000841c at the library base address.
Did that pass the same test?
No it can't, the mapping is 0x19000 and size 0x841c does not round up to that.
I think we must ignore the first LOAD PH from the clang build as we check:
408 } else if (lib_php->p_flags != existing_map->flags) {
..and the first LOAD PH from https://bugs.openjdk.org/secure/attachment/116209/3libjimage_all.txt
..is a Read only, not execute.
This should be OK in this case:
https://bugs.openjdk.org/secure/attachment/116199/3core_all.txt
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000141000 0x00007fa9ff0e4000 0x0000000000000000
0x0000000000019000 0x0000000000019000 R E 0x1000
Filesize and memsiz both 0x019000 must mean the core contains everything, we don't need either of these from the library.
But there is no harm in continuing past the problematic check and populating existing_map from the library.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. That's mostly making sense to me. I'm still a bit fuzzy on it, but that's ok. Has testing been done to make sure this doesn't break anything with gcc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Has testing been done to make sure this doesn't break anything with gcc?
Yes, I didn't see any problem while testing the change with gcc.
plummercj
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
|
Thanks for the review @kevinjwalls and @plummercj ! |
|
/integrate |
|
/sponsor |
|
Going to push as commit 8be1616.
Your commit was automatically rebased without conflicts. |
|
@kevinjwalls @fandreuz Pushed as commit 8be1616. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
The problem seems to be in read_lib_segments (ps_core.c), this check is too harsh:
jdk/src/jdk.hotspot.agent/linux/native/libsaproc/ps_core.c
Lines 423 to 425 in 5271448
In my run,
existing_map->memsz = 0xe24000, while the rhs in L425 is0xe23000. According to the NT_FILE entry, this segment oflibjvm.sohas file offset 0x67f000. It seems that the linker aligned it down according to the page size (0x1000). The offset of the same segment according toreadelf -l libjvm.sois 0x67fc80. This additional offset should be added top_memszto obtain the 0xe24000, which we see in the core dump.I added some files to the ticket for context.
Passes
tier1andtier2.Progress
Issue
Reviewers
Reviewing
Using
gitCheckout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/27274/head:pull/27274$ git checkout pull/27274Update a local copy of the PR:
$ git checkout pull/27274$ git pull https://git.openjdk.org/jdk.git pull/27274/headUsing Skara CLI tools
Checkout this PR locally:
$ git pr checkout 27274View PR using the GUI difftool:
$ git pr show -t 27274Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/27274.diff
Using Webrev
Link to Webrev Comment