-
Couldn't load subscription status.
- Fork 6.1k
8342504: Remove NMT header and footer canaries #21560
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
👋 Welcome back jsjolen! A progress list of the required criteria for merging this PR into |
|
@jdksjolen This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be: You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 90 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
|
@jdksjolen The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
e9b3c9a to
a3203f7
Compare
a3203f7 to
90734f6
Compare
|
Nice code removal! I wonder if there is any measurable performance impact? Not a review yet, but I took a superficial peek and I like it. Real review coming up... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see code removal, tweaks to malloc header size, tests fixes, so all looks good, except for my question about MallocTracker::print_pointer_information()
| // accidentally like a valid malloc block header (canaries and all) so this is not | ||
| // totally failproof and may give a wrong answer. It is safe in that it will never | ||
| // crash, even when encountering unmapped memory. | ||
| bool MallocTracker::print_pointer_information(const void* p, outputStream* st) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the only thing that had to go was candidate->looks_valid()
Isn't it useful to keep the rest of this code though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've also removed the is_live() call. So now we really can't make a reasonable guess whether a potential pointer really points to a MallocHeader. Since, in my opinion, we can't find a reasonable candidate (not sufficient clues), I decided to remove the code.
| * 8 9 10 11 12 13 14 15 16 ++ | ||
| * +--------+--------+--------+--------+--------+--------+--------+--------+ ------------------------ | ||
| * ... | malloc site table marker | flags | unused | canary | ... User payload .... | ||
| * ... | malloc site table marker | flags | unused | ... User payload .... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
flags -> Memory Tags
| * 8 9 10 11 12 13 14 15 16 ++ | ||
| * +--------+--------+--------+--------+--------+--------+--------+--------+ ------------------------ | ||
| * ... | malloc site table marker | flags | unused | canary | ... User payload .... | ||
| * ... | malloc site table marker | flags | unused | ... User payload .... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
flags -> Memory Tags
|
|
||
| // realloc is the trickiest of the bunch. Test that realloc works and correctly takes over | ||
| // NMT header and footer to the resized block. We just test that nothing crashes - if the | ||
| // header/footer get corrupted, NMT heap corruption checker will trigger alert on os::free()). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After removing canary, can NMT still check memory corruption?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, not anymore. The canaries are how we perform memory corruption checks. As mentioned in the PR description, I believe that since we have introduced UBSan and ASan it is no longer necessary for NMT to be able to detect memory corruption.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to caution that this would trash the ability to check for corruption for platforms that don't support ubsan and asan. Windows, with the VC compiler, comes to mind (Actually, even the Windows/gcc Port cannot use ubsan and asan either, since the sanitizers haven't been properly ported for Windows in general)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the input, Julian. It seems like both ubsan and asan are available on Clang for Windows (8.1+ for asan), so we'd have a way of getting memory corruption detection on those platforms also.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang and its many variants (clang-cl, clang in MinGW mode, etc) are not supported as compilers for the Windows JDK. The only compiler that the build system allows for Windows is VC. I do have a port that supports gcc (Which unfortunately also does not have access to ubsan or asan), but have not started work on clang because it was a bit above my capabilities
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. I'm personally not aware of why using a different C++-compiler on a particular platform would be a problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since ASAN is supported by VC on Windows platforms, we can still be safe to remove the memory corruption checking by NMT. ASAN is already available in JDK build, so available for the Windows platform.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this work.
Some flags vs MemTag found.
|
Thanks! I switched the name to 'tags' to make sure that the name fits within the description. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
Not a review, just a comment in passing. Removing the NMT header/footer canaries is probably needed to give us a |
I don't think that it'll help too much, we will still have a header in NMT, same size. It's just not used for canaries any more. Edit: Scratch that. I just used the fact that we no longer have headers to propose a way of solving the problem. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment line saying that NMT can detect memory corruptions is not valid any more and is still in the code.
I disagree. See my comment in 8318721: https://bugs.openjdk.org/browse/JDK-8318721?focusedId=14727326&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14727326 |
|
As I said in preparation of this work, I don't oppose it, but I am not happy. ASAN is not a replacement. ASAN is a special build, slow, needs tons of additional memory, stops at the first (often false) positive, and is often bitrotted since, to my knowledge, no vendor builds ASAN-enabled JVMs regularly. More importantly, if you have a problem in the field, it is easy to convince a customer to switch on NMT. You will not convince a customer to switch their production JVMs against an ASAN-enabled one. But okay, let's remove it. I hope the capabilities this will enable are worth the loss of this capability. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay
|
@jdksjolen This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration! |
|
We're now in JDK-25 so I'm planning to integrate this ASAP. |
|
@jdksjolen This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration! |
|
@jdksjolen this pull request can not be integrated into git checkout delete-canaries
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push |
Do you have any recent examples of NMT canaries helping out "in the field"? I do feel a little uneasy about loosing this feature, assuming it still helps out in the field. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still LGTM.
Nothing I can share, but I had two customers using Unsafe.AllocateMemory (which atm are the only allocations from outside the JVM that are tracked), and they overwrote the boundaries. Took a while to find this, but the canaries added by NMT (mainly the hex dump outputs) were helpful. I wonder, do we have to completely throw this feature away? Even a single byte as canary would be better than nothing? |
Or make the feature optional for those devs, who are willing to pay the price? I would be OK with this being a dynamic, optional feature that can be turned OFF/ON. |
increases complexity, though. malloc header is 16 byte, we could use eg the most significant byte of the size field. Heck, we could use the topmost four bytes, since probably Nobody ever allocates more than 4g with malloc. But certainly the topmost byte. 56 bit are larger than most addrss spaces. |
|
Hi all, As you might suspect, there is a reason that this PR is still open. I am also a bit unsure about removing this. The goal is to replace the two canary bytes in the header with 2 bytes for MemTags, and using the unused byte to gain a total of 4 bytes. One thing we can do is to replace only one of the canary bytes, to gain a total of 3 bytes for MemTags. A 3 byte MemTag would absolutely be sufficient for what we want. |
Thinking about this, I actually rely on this feature a lot more often: A) Customer: "Look at my hs-err file" B) Me: (see crashes in somewhere in libc, or something that looks like corrupted C-heap) "Switch on NMT please" C) Customer: "ok" - does it, no change. D) Now I know it is not a simple double free or overwrite of memory allocated by the hotspot or direct byte buffers. I can exclude, for now, us (the VM vendor) as a culprit and take a closer look at whatever third-party JNI libraries are running in the process. In the end, it may still turn out we broke something, but for now a misuse of C-heap from non-JDK code is more likely. So, NMT is a useful first-responder tool in these cases. This is not that rare. And this feature will be a lot more useful once we have full-process - or at least, with Johan's plans - full JDK integration of NMT. Because then, the surface of instrumented code is larger. If, at step (B), I would have asked the customer: "Can you please switch out the JVM against one I give you that has ASAN enabled", that runs into tons of problems: they may not be in a position to switch out binaries (maybe deployed out of reach somewhere), they may not be allowed to, ASAN build may be too slow, ASAN crashes may be unacceptable in production (and these crashes will be much more likely than NMT finding some problem). For that reason, I propose to leave the footer canary as it is. I cannot see what purpose removing it serves (other than "oh it is simpler now"), and arguably it's the more important of the two canaries for catching end-of-buffer overruns. The leading canary let's shrink to one byte. Ideally placed right in front of the user payload. |
This sounds like a good middle-ground. |
Today NMT has two canaries: A header and a footer canary. These enable mainly two things:
With the introduction of UBSan and Asan into OpenJDK we have gained stronger tools for this sort of analysis, without requiring NMT to be activated. Therefore, I believe that point 2 is no longer something that NMT needs to support. For point number one, we will unfortunately be losing this ability.
I want to delete these canaries to open up a few free bytes. These can allow us to have "practically unlimited" (4 bytes) of memory tags.
tier1-tier2 tests succeeded.
I am awaiting discussion on the Hotspot-dev mailing list, but keeping this PR open for review.
Progress
Issue
Reviewers
Reviewing
Using
gitCheckout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/21560/head:pull/21560$ git checkout pull/21560Update a local copy of the PR:
$ git checkout pull/21560$ git pull https://git.openjdk.org/jdk.git pull/21560/headUsing Skara CLI tools
Checkout this PR locally:
$ git pr checkout 21560View PR using the GUI difftool:
$ git pr show -t 21560Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/21560.diff
Using Webrev
Link to Webrev Comment