Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8306561: Possible out of bounds access in print_pointer_information #16381

Closed
wants to merge 8 commits into from

Conversation

TOatGithub
Copy link
Contributor

@TOatGithub TOatGithub commented Oct 26, 2023

MallocTracker::print_pointer_information in src/hotspot/share/services/mallocTracker.cpp is called to check the highest pointer address of the reserved region. To do so it aligns the test pointer down to the next 8 Byte boundary and casts this address to class MallocHeader in order to use this classes eye-catcher member _canary for validation. Method looks_valid() dereferences _canary's content. _canary has an offset of 14 bytes relative to the class. Therefore it resides outside the reserved region for the highest pointer address, which causes a segmentation violation.

We would expect the same error also for other platforms than AIX as memory is read, which is not allocated. Interestingly, Linux seems to allow this access for 5 times 4K above the reserved region.

As a solution, looks_valid() should check _canary's address as being invalid, and return false immediately.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8306561: Possible out of bounds access in print_pointer_information (Bug - P3)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/16381/head:pull/16381
$ git checkout pull/16381

Update a local copy of the PR:
$ git checkout pull/16381
$ git pull https://git.openjdk.org/jdk.git pull/16381/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 16381

View PR using the GUI difftool:
$ git pr show -t 16381

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/16381.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Oct 26, 2023

👋 Welcome back TOatGithub! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@TOatGithub
Copy link
Contributor Author

/label add hotspot

@openjdk openjdk bot added the hotspot hotspot-dev@openjdk.org label Oct 26, 2023
@openjdk
Copy link

openjdk bot commented Oct 26, 2023

@TOatGithub
The hotspot label was successfully added.

@TOatGithub TOatGithub changed the title JDK 8306561: Gtests crash with SIGSEGV in print_pointer_information on AIX JDK 8306561 : Gtests crash with SIGSEGV in print_pointer_information on AIX Oct 26, 2023
@TOatGithub TOatGithub changed the title JDK 8306561 : Gtests crash with SIGSEGV in print_pointer_information on AIX 8306561: Gtests crash with SIGSEGV in print_pointer_information on AIX Oct 26, 2023
@openjdk openjdk bot added the rfr Pull request is ready for review label Oct 26, 2023
@mlbridge
Copy link

mlbridge bot commented Oct 26, 2023

Webrevs

@tstuefe
Copy link
Member

tstuefe commented Oct 26, 2023

@TOatGithub I changed the description in JBS since this is not AIX specific. Could you adjust the PR title please?

@TOatGithub TOatGithub changed the title 8306561: Gtests crash with SIGSEGV in print_pointer_information on AIX 8306561: Possible out of bounds access in print_pointer_information Oct 26, 2023
Copy link
Member

@tstuefe tstuefe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would fix it another way. The underlying assumption in MallocHeader::looks_valid() is that the header resides fully in readable memory. The caller must make sure of that.

The better way to fix this would be in print_pointer_information(). It must make sure, before calling MallocHeader::looks_valid(), that the header is contained fully in readable memory.

Something like:

-     if (!os::is_readable_pointer(here)) {
+    if (!os::is_readable_pointer(here) || !os::is_readable_pointer(here + sizeof(MallocHeader) {

or something similar.

@TOatGithub
Copy link
Contributor Author

Agreed. Would you still prefer to blow up if the caller hands down bad pointers? In other words: should MallocHeader::looks_valid() not do my additional check?

@tstuefe
Copy link
Member

tstuefe commented Oct 26, 2023

Yes, let it blow up then.

@TOatGithub
Copy link
Contributor Author

Thanks for pointing this out, and kudos to @JoKern65 for his help and insights; he confirms your point.

@JoKern65
Copy link
Contributor

@tstuefe: I helped analyzing this problem by writing a plain c test pgm mmaping a page and trying to read beyound. On AIX as expected the very next byte after the requested region leads to a segmentation violation, but on linux (both flavours, linuxintel and linuxppc64) I was able to read exactly 20 KB beyond, before running into segmentation violation.
This might be the reason, why the developer of print_pointer_information() was not aware that he creates code that could crash.
Thomas, do you have an idea why linux (and maybe other platforms) map more memory as requested?
It has nothing to do with memory pages. The additional memory does not end at the next memory page boundary, but exactly 20KB after the end of the requested region.
Astonishing is that at the lower end of the region there is no extra memory accessible.

@tstuefe
Copy link
Member

tstuefe commented Oct 27, 2023

@tstuefe: I helped analyzing this problem by writing a plain c test pgm mmaping a page and trying to read beyound. On AIX as expected the very next byte after the requested region leads to a segmentation violation, but on linux (both flavours, linuxintel and linuxppc64) I was able to read exactly 20 KB beyond, before running into segmentation violation. This might be the reason, why the developer of print_pointer_information() was not aware that he creates code that could crash. Thomas, do you have an idea why linux (and maybe other platforms) map more memory as requested? It has nothing to do with memory pages. The additional memory does not end at the next memory page boundary, but exactly 20KB after the end of the requested region. Astonishing is that at the lower end of the region there is no extra memory accessible.

Plain bad luck and rare test execution.

Whether or not you can read over the end of an mmaped segment depends on whether there are VMAs mapped beyond that. Linux kernel clusters VMAs to keep VMA fragmentation down. So you may have adjacent mappings. This is subject to ASLR of course but I always see the VMAs pretty much clustered regardless of ASLR. And though this sounds like it should be random, you can have a pretty consistent order of VMA allocation across many VM runs, and therefore similarly looking process memory maps.

Just looking at /proc/pid/maps will be probably clearer to you than me explaining it :)

@tstuefe
Copy link
Member

tstuefe commented Oct 27, 2023

@tstuefe: I helped analyzing this problem by writing a plain c test pgm mmaping a page and trying to read beyound. On AIX as expected the very next byte after the requested region leads to a segmentation violation, but on linux (both flavours, linuxintel and linuxppc64) I was able to read exactly 20 KB beyond, before running into segmentation violation. This might be the reason, why the developer of print_pointer_information() was not aware that he creates code that could crash. Thomas, do you have an idea why linux (and maybe other platforms) map more memory as requested? It has nothing to do with memory pages. The additional memory does not end at the next memory page boundary, but exactly 20KB after the end of the requested region. Astonishing is that at the lower end of the region there is no extra memory accessible.

Plain bad luck and rare test execution.

Whether or not you can read over the end of an mmaped segment depends on whether there are VMAs mapped beyond that. Linux kernel clusters VMAs to keep VMA fragmentation down. So you may have adjacent mappings. This is subject to ASLR of course but I always see the VMAs pretty much clustered regardless of ASLR. And though this sounds like it should be random, you can have a pretty consistent order of VMA allocation across many VM runs, and therefore similarly looking process memory maps.

Just looking at /proc/pid/maps will be probably clearer to you than me explaining it :)

Just read your answer and I see you wrote a little test program that shows that behavior, probably single threaded. The same explanation applies here though, unless there is a simple bug.

For instance, if - after the mmap - you do something that allocates C-heap, the libc may allocate a new arena with mmap and place it just beyond your mmaped region. A lot of libc functions need C-heap under the hood (e.g. calling C assert() will allocate C-heap to assemble the assert line).

@JoKern65
Copy link
Contributor

@tstuefe: Meanwhile I had also asked Klaus Möser who gave me a similar explanation and we both together examined /proc/pid/maps to understand that a runtime library had allocated the memory beyond my mmap region.
Klaus also told me that AIX has an mmap alignment of 256GB, so it is very likely that the addresses beyond and below your mapped region are not used, explaining the segmentation violation.

Copy link
Member

@tstuefe tstuefe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay (if tests are green).

@openjdk
Copy link

openjdk bot commented Oct 27, 2023

@TOatGithub This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8306561: Possible out of bounds access in print_pointer_information

Reviewed-by: stuefe, clanger

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 57 new commits pushed to the master branch:

  • 4a85f6a: 8317132: Prepare HotSpot for permissive-
  • 5207443: 8317965: TestLoadLibraryDeadlock.java fails with "Unable to load native library.: expected true, was false"
  • ee57e73: 8317612: ChoiceFormat and MessageFormat constructors call non-final public method
  • f262f06: 8319211: Regression in LoopOverNonConstantFP
  • bfaf570: 8311546: Certificate name constraints improperly validated with leading period
  • d354141: 8318694: [JVMCI] disable can_call_java in most contexts for libjvmci compiler threads
  • c86592d: 8319046: Execute tests in source/class-file order in JavadocTester
  • 3660a90: 8319139: Improve diagnosability of JavadocTester output
  • 7f47c51: 8316025: Use testUI() method of PassFailJFrame.Builder in FileChooserSymLinkTest.java
  • 36de19d: 8317048: VerifyError with unnamed pattern variable and more than one components
  • ... and 47 more: https://git.openjdk.org/jdk/compare/141dae8b76d41accfa02a0250a1c24364cbf6f25...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@tstuefe, @RealCLanger) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Oct 27, 2023
Copy link
Contributor

@RealCLanger RealCLanger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but wait for the nightly testing results.

src/hotspot/share/nmt/mallocHeader.inline.hpp Outdated Show resolved Hide resolved
@TOatGithub
Copy link
Contributor Author

tests passed in dbg build;
opt build still faces an SIGILL error in GTestWrapper when executing AsyncLogTest, which to my understanding is unrelated; therefore, I created created https://bugs.openjdk.org/browse/JDK-8319104

@openjdk
Copy link

openjdk bot commented Oct 31, 2023

⚠️ @TOatGithub This pull request contains merges that bring in commits not present in the target repository. Since this is not a "merge style" pull request, these changes will be squashed when this pull request in integrated. If this is your intention, then please ignore this message. If you want to preserve the commit structure, you must change the title of this pull request to Merge <project>:<branch> where <project> is the name of another project in the OpenJDK organization (for example Merge jdk:master).

@tstuefe
Copy link
Member

tstuefe commented Nov 1, 2023

tests passed in dbg build; opt build still faces an SIGILL error in GTestWrapper when executing AsyncLogTest, which to my understanding is unrelated; therefore, I created created https://bugs.openjdk.org/browse/JDK-8319104

Yes, that looks unrelated. @navyxliu could you please take a look at JDK-8319104 ?

@JoKern65
Copy link
Contributor

JoKern65 commented Nov 2, 2023

@navyxliu : I have already taken over JDK-8319104, started analyzation, found the reason for the SIGILL and are working now on a fix for the root cause.

@TOatGithub
Copy link
Contributor Author

Checks still run fine with the latest switch to os::is_readable_range. Therefore:

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Nov 2, 2023
@openjdk
Copy link

openjdk bot commented Nov 2, 2023

@TOatGithub
Your change (at version 60d46df) is now ready to be sponsored by a Committer.

@MBaesken
Copy link
Member

MBaesken commented Nov 2, 2023

/sponsor

@openjdk
Copy link

openjdk bot commented Nov 2, 2023

Going to push as commit d6ce62e.
Since your change was applied there have been 58 commits pushed to the master branch:

  • 7f31a05: 8319203: Parallel: Rename addr_is_marked_imprecise
  • 4a85f6a: 8317132: Prepare HotSpot for permissive-
  • 5207443: 8317965: TestLoadLibraryDeadlock.java fails with "Unable to load native library.: expected true, was false"
  • ee57e73: 8317612: ChoiceFormat and MessageFormat constructors call non-final public method
  • f262f06: 8319211: Regression in LoopOverNonConstantFP
  • bfaf570: 8311546: Certificate name constraints improperly validated with leading period
  • d354141: 8318694: [JVMCI] disable can_call_java in most contexts for libjvmci compiler threads
  • c86592d: 8319046: Execute tests in source/class-file order in JavadocTester
  • 3660a90: 8319139: Improve diagnosability of JavadocTester output
  • 7f47c51: 8316025: Use testUI() method of PassFailJFrame.Builder in FileChooserSymLinkTest.java
  • ... and 48 more: https://git.openjdk.org/jdk/compare/141dae8b76d41accfa02a0250a1c24364cbf6f25...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Nov 2, 2023
@openjdk openjdk bot closed this Nov 2, 2023
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels Nov 2, 2023
@openjdk
Copy link

openjdk bot commented Nov 2, 2023

@MBaesken @TOatGithub Pushed as commit d6ce62e.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@@ -211,7 +211,8 @@ bool MallocTracker::print_pointer_information(const void* p, outputStream* st) {
const uint8_t* here = align_down(addr, smallest_possible_alignment);
const uint8_t* const end = here - (0x1000 + sizeof(MallocHeader)); // stop searching after 4k
for (; here >= end; here -= smallest_possible_alignment) {
if (!os::is_readable_pointer(here)) {
// JDK-8306561: cast to a MallocHeader needs to guarantee it can reside in readable memory
if (!os::is_readable_range(here, here + sizeof(MallocHeader) - 1)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I noticed this late, but the " - 1" looks wrong here, because is_readable_range() checks for < to, not <= to.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Dean, thanks for finding this. I opened https://bugs.openjdk.org/browse/JDK-8319542 to address this and will fix it in a timely manner.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot hotspot-dev@openjdk.org integrated Pull request has been integrated
6 participants