8261401: Add sanity check for UseSHM large pages similar to the one used with hugetlb large pages #2488
Conversation
Hi Stefan,
Testing UseSHM definitely makes sense. Since SysV shm has more restrictions than the mmap API, I am not sure the former works where the latter does not.
About your patch: the shmget test is good, I am not sure about the rest.
You first attempt to shmget(IPC_PRIVATE, page_size, SHM_HUGETLB). Then you separately scan for CAP_IPC_LOCK. But would shmget(SHM_HUGETLB) not fail with EPERM if CAP_IPC_LOCK is missing? man shmget says:
EPERM The SHM_HUGETLB flag was specified, but the caller was not
privileged (did not have the CAP_IPC_LOCK capability).
So I think querying for EPERM after shmget should be sufficient? If CAP_IPC_LOCK was missing, the shmget should have failed and we should never get to the point of can_lock_memory().
About the rlimit test: Assuming the man page is wrong and shmget succeeds even without the capability, you only print this limit if can_lock_memory() returned false. The way I understand this is that the limit can still be the limiting factor even with those capabilities in place.
I think it would make more sense to extend the error reporting if later real "shmget" calls fail. Note that later reserve calls can fail for a number of reasons (huge page pool exhausted, RLIMIT_MEMLOCK reached, SHMMAX or SHMALL reached, commit charge reached its limit...) which means that just reporting on RLIMIT_MEMLOCK would be arbitrary. Whether or not more intelligent error reporting makes sense depends also on whether we think we still need the SysV path at all. I personally doubt that this still makes sense.
Cheers, Thomas
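(Purely for illustration, not part of the patch: the simplification Thomas describes would let shmget() itself report a missing CAP_IPC_LOCK via EPERM, roughly along these lines, assuming the usual <sys/shm.h> and <errno.h> includes already present in os_linux.cpp.)

// Illustrative sketch of the errno-based check, not the actual patch.
int shmid = shmget(IPC_PRIVATE, page_size, SHM_HUGETLB|IPC_CREAT|SHM_R|SHM_W);
if (shmid == -1) {
  if (warn && errno == EPERM) {
    warning("shmget(SHM_HUGETLB) failed with EPERM; CAP_IPC_LOCK is probably missing");
  }
  return false;
}
shmctl(shmid, IPC_RMID, NULL);  // the segment was only a probe, remove it again
return true;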
src/hotspot/os/linux/os_linux.cpp
Outdated
@@ -3564,6 +3564,66 @@ bool os::Linux::hugetlbfs_sanity_check(bool warn, size_t page_size) {
  return result;
}

// The capability needed to lock memory CAP_IPC_LOCK
#define CAP_IPC_LOCK 14
#define CAP_IPC_LOCK_BIT (1 << CAP_IPC_LOCK)
Can we not just include linux/capabilities.h ?
We can, I'll change it.
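(Illustrative sketch of that change: the kernel UAPI header <linux/capability.h> already defines CAP_IPC_LOCK and a CAP_TO_MASK macro, so the hand-written constants could become something like the following.)

#include <linux/capability.h>                        // defines CAP_IPC_LOCK (14)
#define CAP_IPC_LOCK_BIT CAP_TO_MASK(CAP_IPC_LOCK)   // expands to (1 << (CAP_IPC_LOCK & 31))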
Nice to hear that you agree Thomas :)
This was my initial thought as well, but it turns out that the capability is only needed to "lock" more memory than the MEMLOCK-limit defines. So the first I could of course design the other check to once again call
In old kernels this was the case, but according to
So if we have the capability the limit will never be a limiting factor.
I agree that this warning is not covering all bases, but I thought it was a good addition compared to just being silent about it. But I like your idea of just doing the simplest sanity check here and then, when a mapping goes bad, trying to figure out why it did. In that case I would only include the first
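(A hypothetical sketch, not taken from the patch, of how the MEMLOCK limit discussed above could be queried; getrlimit() is the standard way to read it.)

#include <sys/resource.h>

struct rlimit rlim;
if (getrlimit(RLIMIT_MEMLOCK, &rlim) == 0 && rlim.rlim_cur != RLIM_INFINITY) {
  // Without CAP_IPC_LOCK, locking is limited to rlim.rlim_cur bytes,
  // which could later be the reason a large page mapping fails.
}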
Removed the check for the IPC_LOCK capability. If a large page mapping fails, the errno is already present in the warning printed out. We could look at improving this to better explain when EPERM vs ENOMEM occurs.
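(One possible shape of that improved warning, purely as an illustration of the idea; the reasons shown are the ones already mentioned in this conversation, not an exhaustive list.)

// Hypothetical refinement of the existing warning, not part of this change.
if (warn) {
  if (errno == EPERM) {
    warning("shmget failed (EPERM): missing privilege to use SHM_HUGETLB (CAP_IPC_LOCK)");
  } else if (errno == ENOMEM) {
    warning("shmget failed (ENOMEM): huge page pool exhausted or a limit (RLIMIT_MEMLOCK, SHMMAX, SHMALL) reached");
  } else {
    warning("shmget failed (errno = %d)", errno);
  }
}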
Looks almost good. Minor nit below.
Okay, I see what you did now.
This makes sense, especially since you could add that warning to both the SysV and mmap paths, since they share at least some of the restrictions, like the huge page pool size. It also would mean you don't pay for some of these tests unless needed.
Looks good now. Thank you!
@kstefanj This change now passes all automated pre-integration checks. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 27 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.
src/hotspot/os/linux/os_linux.cpp
Outdated
// Possible reasons for shmget failure:
// 1. shmmax is too small for the request.
//    > check shmmax value: cat /proc/sys/kernel/shmmax
//    > increase shmmax value: echo "0xffffffff" > /proc/sys/kernel/shmmax
I'd avoid using a 32 bit constant on potentially 64 bit systems. According to that same man page, the limit on 64 bits is 127 TB which is more than the suggested 4 GB :)
The man page talks about ULONG_MAX - 2^24 as default (for 64 bit systems?).
The comment is a copy of the old one where we actually use shmget(), but I agree that it could use an update. Not even sure we have to suggest a specific value, something like:
// > increase shmmax value: echo "0xffffffff" > /proc/sys/kernel/shmmax
// > increase shmmax value: echo "new value" > /proc/sys/kernel/shmmax
Might be good enough.
I will also update the original comment.
bool os::Linux::shm_hugetlbfs_sanity_check(bool warn, size_t page_size) {
  // Try to create a large shared memory segment.
  int shmid = shmget(IPC_PRIVATE, page_size, SHM_HUGETLB|IPC_CREAT|SHM_R|SHM_W);
  if (shmid == -1) {
I think the flags should contain the SHM_HUGE_* flags (considering large pages), similar to hugetlbfs, for proper checking to avoid a failure like the one in JDK-8261636 reported just recently for the same reason. According to the man pages such flags exist.
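(Sketch of what using those flags could look like, based on the shmget(2) man page; SHM_HUGE_SHIFT and the size-specific constants may need to be defined locally with older headers, and the values below are the ones documented for Linux >= 3.8.)

#ifndef SHM_HUGE_SHIFT
#define SHM_HUGE_SHIFT 26                       // same encoding as MAP_HUGE_SHIFT
#define SHM_HUGE_2MB   (21 << SHM_HUGE_SHIFT)   // log2(2M) = 21
#define SHM_HUGE_1GB   (30 << SHM_HUGE_SHIFT)   // log2(1G) = 30
#endif
// Request 2M pages explicitly instead of the kernel's default huge page size:
int shmid = shmget(IPC_PRIVATE, page_size, SHM_HUGE_2MB|SHM_HUGETLB|IPC_CREAT|SHM_R|SHM_W);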
I would agree if the later usage of shmget() would make use of those flags, but it doesn't. We might want to add a CR for enabling use of other page sizes than the default large page size for SHM as well. But there are also discussions about trying to remove this SHM support.
I am good with creating a CR for that. We can always close it later.
src/hotspot/os/linux/os_linux.cpp
Outdated
// 2. not enough large page memory.
//    > check available large pages: cat /proc/meminfo
//    > increase amount of large pages:
//      echo new_value > /proc/sys/vm/nr_hugepages
This is just my opinion, but I would prefer to refer to some generic documentation about these options (and use the "new" locations in /sys/kernel/mm/hugepages/hugepages-*/ instead of the "legacy" /proc/sys), e.g. https://lwn.net/Articles/376606/ - yes, this is not official documentation, but it is the best I could find on the spot.
I agree, the documentation I usually refer to is:
https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt
Since our SHM support currently only handles the default large page size, I think we should suggest using:
sysctl -w vm.nr_hugepages=new_value
// echo new_value > /proc/sys/vm/nr_hugepages
// sysctl -w vm.nr_hugepages=new_value
// > For more information regarding large pages please refer to:
//   https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt
Thanks for taking a look Thomas
I can't approve the changes you made in the "Suggested changes" boxes, or even comment on them as "looks good" directly, but they all look good.
Thanks for the reviews @tstuefe and @tschatzl! FYI, I just filed three more enhancements:
/integrate
@kstefanj Since your change was applied there have been 27 commits pushed to the
Your commit was automatically rebased without conflicts. Pushed as commit f639df4.
When large pages are enabled on Linux (using -XX:+UseLargePages), both UseHugeTLBFS and UseSHM can be used. We prefer HugeTLBFS and first do a sanity check to see if this kind of large pages is available; if so, we disable UseSHM.
The problematic part is when HugeTLBFS pages are not available: then we disable that flag and, without doing any sanity check for UseSHM, mark large pages as enabled using SHM. One big problem with this is that SHM requires the same type of explicitly allocated huge pages as HugeTLBFS, and also privileges to lock memory. So if we cannot use HugeTLBFS, we probably cannot use SHM either.
A fix for this would be to do a similar sanity check as is currently done for HugeTLBFS and, if it fails, disable UseLargePages, since such allocation attempts would always fail anyway.
The proposed sanity check consists of two parts. The first part simply tries to create a shared memory segment using shmget() with SHM_HUGETLB to use large pages. If this fails there is no point in trying to use SHM to get large pages.
The second part checks if the process has privileges to lock memory or if there will be a limit on the SHM usage. I think this would be a nice addition since it will notify the user about the limit and explain why large page mappings fail. The implementation parses /proc/self/status to make sure the needed capability is available.
This change needs two tests to be updated to handle that large pages now can be disabled even when running with +UseLargePages. One of these tests is also updated in PR#2486 and I plan to get that integrated before this one.
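(A minimal sketch of the first part of the check, matching the excerpt reviewed above; the warning text and the cleanup call are illustrative rather than the exact final code.)

bool os::Linux::shm_hugetlbfs_sanity_check(bool warn, size_t page_size) {
  // Try to create a large shared memory segment.
  int shmid = shmget(IPC_PRIVATE, page_size, SHM_HUGETLB|IPC_CREAT|SHM_R|SHM_W);
  if (shmid == -1) {
    if (warn) {
      warning("Large pages using UseSHM are not configured (shmget failed, errno = %d)", errno);
    }
    return false;
  }
  // The segment was only needed for the check, remove it again.
  shmctl(shmid, IPC_RMID, NULL);
  return true;
}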
Download
$ git fetch https://git.openjdk.java.net/jdk pull/2488/head:pull/2488
$ git checkout pull/2488