8243315: ParallelScavengeHeap::initialize() passes GenAlignment as page size to os::trace_page_sizes instead of actual page size #1161
Conversation
Hi @jaokim, welcome to this OpenJDK project and thanks for contributing!

We do not recognize you as Contributor and need to ensure you have signed the Oracle Contributor Agreement (OCA). If you have not signed the OCA, please follow the instructions. Please fill in your GitHub username in the "Username" field of the application. Once you have signed the OCA, please let us know by writing

If you already are an OpenJDK Author, Committer or Reviewer, please click here to open a new issue so that we can record that fact. Please use "Add GitHub user jaokim" as summary for the issue.

If you are contributing this work on behalf of your employer and your employer has signed the OCA, please let us know by writing
Webrevs
Thanks for taking a look at this issue. A few suggestions.
@@ -66,7 +66,7 @@ jint ParallelScavengeHeap::initialize() {
   os::trace_page_sizes("Heap",
                        MinHeapSize,
                        reserved_heap_size,
-                       GenAlignment,
+                       os::vm_page_size(),
I briefly looked at this in the past and I don't think using os::vm_page_size() straight off will work for the case when large pages are used. So we need a way to figure out if the heap got reserved using large pages. Once again you can take a look at how G1 does it; it's in actual_reserved_page_size() in g1CollectedHeap.cpp. I'm not sure you can use that straight off, but something similar should work.
Thanks for the comments! I have rearranged the logging of this. The logic for G1 also applies to how Parallel's heap is reserved.
@jaokim this pull request can not be integrated due to one or more merge conflicts. To resolve them and update this pull request, run the following commands in your personal fork:

git checkout JBS-8180069
git fetch https://git.openjdk.java.net/jdk master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push
log_info(pagesize)("Alignments: "
                   " space_align=" SIZE_FORMAT "%s"
                   " gen_align=" SIZE_FORMAT "%s"
                   " heap_align=" SIZE_FORMAT "%s",
                   byte_size_in_exact_unit(SpaceAlignment), exact_unit_for_byte_size(SpaceAlignment),
                   byte_size_in_exact_unit(GenAlignment), exact_unit_for_byte_size(GenAlignment),
                   byte_size_in_exact_unit(HeapAlignment), exact_unit_for_byte_size(HeapAlignment)
                   );
This is moved to the ParallelInitLogger
virtual void print_heap();
//virtual void print_workers();
I pushed too soon, just ignore these.
#include "gc/shared/genArguments.hpp"
#include "gc/shared/gcLogPrecious.hpp"

void ParallelInitLogger::print_gc_specific() {
I see this as heap information, so I would suggest using print_heap instead and, as now, printing it before the general heap info.
log_info_p(gc, init)("Alignments:"
                     " space_align=" SIZE_FORMAT "%s"
                     " gen_align=" SIZE_FORMAT "%s"
                     " heap_align=" SIZE_FORMAT "%s",
                     byte_size_in_exact_unit(SpaceAlignment), exact_unit_for_byte_size(SpaceAlignment),
                     byte_size_in_exact_unit(GenAlignment), exact_unit_for_byte_size(GenAlignment),
                     byte_size_in_exact_unit(HeapAlignment), exact_unit_for_byte_size(HeapAlignment)
                     );
The indentation here got a bit off now. I also took a look at the current output and we do not use '=' anywhere else, so I suggest going with something like this:
log_info_p(gc, init)("Alignments:"
                     " space_align=" SIZE_FORMAT "%s"
                     " gen_align=" SIZE_FORMAT "%s"
                     " heap_align=" SIZE_FORMAT "%s",
                     byte_size_in_exact_unit(SpaceAlignment), exact_unit_for_byte_size(SpaceAlignment),
                     byte_size_in_exact_unit(GenAlignment), exact_unit_for_byte_size(GenAlignment),
                     byte_size_in_exact_unit(HeapAlignment), exact_unit_for_byte_size(HeapAlignment)
                     );
log_info_p(gc, init)("Alignments:"
                     " Space " SIZE_FORMAT "%s,"
                     " Generation " SIZE_FORMAT "%s,"
                     " Heap " SIZE_FORMAT "%s",
                     byte_size_in_exact_unit(SpaceAlignment), exact_unit_for_byte_size(SpaceAlignment),
                     byte_size_in_exact_unit(GenAlignment), exact_unit_for_byte_size(GenAlignment),
                     byte_size_in_exact_unit(HeapAlignment), exact_unit_for_byte_size(HeapAlignment));
This would generate output like this:
[0.811s][info][gc,init] Compressed Oops: Enabled (Zero based)
[0.811s][info][gc,init] Alignments: Space 512K, Generation 512K, Heap 2M
[0.811s][info][gc,init] Heap Min Capacity: 8M
[0.811s][info][gc,init] Heap Initial Capacity: 1002M
[0.811s][info][gc,init] Heap Max Capacity: 16012M
[0.811s][info][gc,init] Pre-touch: Disabled
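The K/M suffixes in that output come from the byte_size_in_exact_unit / exact_unit_for_byte_size pair, which picks the largest unit that divides the value exactly. A rough standalone sketch of that behavior (a hypothetical re-implementation for illustration, not HotSpot's actual code):

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical re-implementation of the exact-unit formatting used in the
// log line: choose the largest unit (G/M/K/B) that divides the byte count
// exactly, so 2*1024*1024 prints as "2M" but 3000 stays "3000B".
static std::string format_exact_unit(size_t bytes) {
    const size_t K = 1024, M = K * 1024, G = M * 1024;
    if (bytes >= G && bytes % G == 0) return std::to_string(bytes / G) + "G";
    if (bytes >= M && bytes % M == 0) return std::to_string(bytes / M) + "M";
    if (bytes >= K && bytes % K == 0) return std::to_string(bytes / K) + "K";
    return std::to_string(bytes) + "B";
}
```

With a 512K SpaceAlignment and a 2M HeapAlignment this reproduces the "512K" and "2M" seen in the sample output above.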
Yes, looks much better! I just took the format from the original pagesize output, but this looks better.
                     MinHeapSize,
                     reserved_heap_size,
                     page_size,
                     rs.base(),
                     rs.size());
Indentation is one off.
size_t page_size = os::vm_page_size();
if (UseLargePages) {
  // There are two ways to manage large page memory.
  // 1. OS supports committing large page memory.
  // 2. OS doesn't support committing large page memory, so ReservedSpace manages it.
  //    And ReservedSpace calls it 'special'. If we failed to set 'special',
  //    we reserved memory without large pages.
  if (os::can_commit_large_page_memory() || rs.special()) {
    // The alignment of a ReservedSpace comes from the preferred page size or
    // the heap alignment, and if it came from the heap alignment it could be
    // larger than the large page size. So cap it with the large page size.
    page_size = MIN2(rs.alignment(), os::large_page_size());
  }
}
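The effect of this deduction can be illustrated with a standalone sketch; the function name and the mock parameters standing in for UseLargePages, os::can_commit_large_page_memory(), rs.special(), rs.alignment(), os::vm_page_size() and os::large_page_size() are all hypothetical:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Standalone model of the page-size deduction above: start from the small
// page size, and only when large pages are in use and actually backed the
// reservation, cap the reservation alignment at the large page size.
static size_t deduced_page_size(bool use_large_pages,
                                bool can_commit_large_pages,
                                bool rs_special,
                                size_t rs_alignment,
                                size_t small_page,
                                size_t large_page) {
    size_t page_size = small_page;
    if (use_large_pages && (can_commit_large_pages || rs_special)) {
        // The reservation alignment may come from the heap alignment and can
        // be larger than the large page size, so cap it at the large page size.
        page_size = std::min(rs_alignment, large_page);
    }
    return page_size;
}
```

For example, a 4M reservation alignment with 2M large pages yields 2M, while a reservation that failed to become 'special' falls back to the small page size.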
Since this is exactly the same code as in actual_reserved_page_size() in g1CollectedHeap.cpp, I think we should lift it out to a helper that can be used by both G1 and Parallel. Not entirely sure where the best location for such a helper is, but a static function ReservedSpace::actual_page_size(ReservedSpace) could work.
@jaokim I've run some additional testing and haven't seen any problems, so I'm ready to sponsor whenever you want to integrate.
/sponsor
@kstefanj @jaokim Since your change was applied there have been 22 commits pushed to the
Your commit was automatically rebased without conflicts. Pushed as commit 8cd2e0f.
…ge size to os::trace_page_sizes instead of actual page size Reviewed-by: sjohanss, tschatzl
Hi guys, I was just reviewing this but see that it was already pushed in the meantime. I have some concerns about this change.

Most importantly, it perpetuates the notion that alignment has anything to do with pages, which is just wrong. I know we still have some coding in hotspot which assumes this, but it is not true. We should remove any coding assuming that, instead of building onto it. ReservedSpace::alignment is just that, the alignment of the start address of the ReservedSpace. Arguably, there is no need to even keep it as a member in ReservedSpace; it is needed for reservation and should not be needed beyond that. So, (why is this function static btw?)

As an example for larger alignments which have nothing to do with page size: Metaspace reserves 4M-aligned ReservedSpaces. Currently it only uses small pages, but that may change.

Furthermore I think the notion of asking about the page size of a range is itself shaky. A range can and does consist of multiple page sizes (see e.g. what Linux reserves with UseHugeTLBFS). The patch also assumes os::large_page_size() to be the one large page size used, and currently there are attempts by Intel to more or less fall back to a "small large page" if the large page is too large for a given area: see #1153. This would increase the number of used page sizes, and could mean space reserved with reserve_memory_special() would silently return memory with small-large-pages instead of os::large_page_size().

I am sorry to make such a fuss. In an ideal world we would have something like

which would return information about all page sizes in the range (page_sizes_t could be a bitmask btw since all page sizes are pow2). We have something like this on AIX, see

That'd be awesome since it would take the burden from us of keeping page-size information or second-guessing reservation logic. That information could even be cached in a ReservedSpace if it's difficult to get.

Cheers, Thomas
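The bitmask idea mentioned here (all page sizes are powers of two, so a set of them fits in one word by OR-ing the sizes together) could be sketched roughly like this; the page_sizes_t type and helper names are hypothetical, not actual HotSpot code:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical page_sizes_t: every supported page size is a power of two,
// so a set of page sizes can be stored in a single word, one bit per size.
typedef uintptr_t page_sizes_t;

static void page_sizes_add(page_sizes_t* set, uintptr_t page_size) {
    *set |= page_size;  // page_size must be a power of two
}

static bool page_sizes_contains(page_sizes_t set, uintptr_t page_size) {
    return (set & page_size) != 0;
}

// The largest page size in the set is simply its highest set bit.
static uintptr_t page_sizes_largest(page_sizes_t set) {
    uintptr_t p = 0;
    for (uintptr_t bit = 1; bit != 0 && bit <= set; bit <<= 1) {
        if (set & bit) p = bit;
    }
    return p;
}
```

A reservation backed by both 4K and 2M pages would then carry both bits, and callers could query the set instead of guessing a single page size.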
I wrote
which is of course wrong since it just returns the global small page size. Querying the page size of an arbitrary range seems to require walking smaps, and that is way more cumbersome.
Hi Thomas, Thanks for sharing your concerns. I agree that this becomes more of a problem now when it is exposed outside the GC, that's why I wasn't sure where we should put this:
This is also why it was added as a static helper rather than a member. An alternative is to move this helper to a shared GC utility, and maybe document it a bit more. Adding multiple large page sizes will require this function to be updated; that's one of the reasons I wanted it shared, not duplicated.

As you say, the best thing would be if we had a way to really query the page size rather than doing a good estimation given the looks of the reserved space. But for the current use-cases in the GC, this estimation seems to be enough. Going the other route is, as you say, way more work. A different solution would be to add a

What would be your preferred way forward?

Cheers,
Hi Stefan,
Thanks for taking my concerns seriously. I know that this is all a bit messy and has a long history.

I briefly looked into some way to ask the OS about the page size of a memory region. That is certainly possible but annoyingly complicated. On AIX we have a thing already. The only platform that has :) On Linux you would have to scan /proc/self/smaps. May be worthwhile but complex and expensive for large processes. On Windows I have no clue what to do. VirtualQuery() does not return page information. Neither do I know what to do on MacOS. Interestingly, I found that we have os::page_info() and os::scan_pages(), which look like they would do what we want but they are empty on all platforms. A Solaris leftover. I filed JDK-8257076 for you guys to track this since you may remove some of the associated NUMA coding too.

So the more practical and faster way would be to store this information ourselves. It would be better than trying to deduce what we did at reservation time. I am currently working on a prototype to expand os::reserve_memory_special() to optionally return information about the page sizes in the reserved range. Kinda like sigset_t. That would allow the os implementors to do whatever (e.g. mix in 2M pages), and you have the information about that set of page sizes, which could be stored in ReservedSpace as a member. Later, that info can be used as "what the underlying page sizes are to our best knowledge". That prototype would look a bit like this:
This is a smaller version of what I think would generally be a good idea, which is for all os::reserve_... APIs to return meta information about the reserved space for the caller to hold on to. But this is only for os::reserve_memory_special(), and only for page sizes, so less daunting. I'll post an RFR if I have a prototype ready. Then we may take another look at this change and rework this actual_page_size(). Cheers, Thomas
So Thomas, in what cases do we need the
Happens if you reserve a space with a size not aligned to the underlying large page size. See reserve_memory_special_huge_tlbfs_mixed() on Linux. In that case, we try to be smart and fill in the ends with small pages rather than failing.
Of which the Intel approach with the 2M pages could be seen as an expansion (e.g. if you have a heap of 1.5G, allocate 1G huge page, and fill in the ends with 2M pages, and if necessary with 4K pages).
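The stitching described here (largest pages first, smaller ones for the remainder) can be sketched as a greedy layout computation; this only illustrates the arithmetic of the 1.5G example, it is not real reservation code:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Greedy layout: cover `bytes` with as many of the largest page size as
// fit, then fill the remainder with progressively smaller pages.
// Returns (page_size, count) pairs; illustration only.
static std::vector<std::pair<size_t, size_t>>
stitch(size_t bytes, const std::vector<size_t>& page_sizes_desc) {
    std::vector<std::pair<size_t, size_t>> layout;
    for (size_t ps : page_sizes_desc) {  // page sizes in descending order
        size_t n = bytes / ps;
        if (n > 0) {
            layout.push_back(std::make_pair(ps, n));
            bytes -= n * ps;
        }
    }
    return layout;
}
```

For a 1.5G heap with {1G, 2M, 4K} pages this yields one 1G page plus 256 2M pages, with no 4K fill needed since the remainder divides evenly.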
Sorry, I was a bit unclear. I know about this case, but I was more thinking: do we need to support it? Could we just always align correctly?
The model would be so much nicer if a
Yes, but I think it makes sense to allow the os layer to mix page sizes for large paged areas, for performance reasons. The fact that this coding exists, and that Intel wants to further complicate it and add 2M pages, means we have a need here. Trying to avoid this just means that people add patches sideways to satisfy specific needs, which hurts maintainability. Also, I don't like to discourage first time contributors with lots of concerns, therefore I'd like a cleaner, more flexible os layer.

But no-one forces you to accept multi-page-sized areas. If you really want just one page size, you can query the largest page size available beforehand and align the reservation size accordingly, and with my proposed change you could now assert the result and log it correctly. But if one just generally wants large pages without caring for the precise layout, one could let os::reserve_memory_special() do its best, and would now get the information about what reserve_memory_special() did.

For example, were I to re-introduce large pages for Metaspace, I would like to have the luxury of just calling os::reserve_memory_special() without having to think about specific geometry: if the space is large enough for large pages, it should stitch the area together as best as it can.
Hi,

On 25.11.20 16:52, Thomas Stuefe wrote:

> Yes, but I think it makes sense to allow the os layer to mix page sizes
> for large paged areas, for performance reasons. The fact that this
> coding exists,

That code may be an artifact of 32 bit machines from 10+ years ago where address space fragmentation has been a real concern (typically Windows). Or RAM sizes were measured in hundreds of MB instead of GBs, where (on Linux) pre-reserving hundreds of MB for huge pages is/has been an issue.

> and that Intel wants to further complicate it and add 2M
> pages, means we have a need here.

The JDK-8256155 (Intel) CR does not state that requirement. Imho it only says that the author wants to use (any and all) configured pages for different reserved spaces.

E.g. the machine has (on x64, to simplify the case) configured:

10 1G pages
1000 2M pages

so the heap should use the 1G pages (assuming it's less than 10G); other reservations like code heap should first use the 50 2M pages before falling back to other page sizes, to better use available TLB cache entries.

I would prefer if we do not overcomplicate the requirements :) Also, probably this should be asked and followed up in the correct review thread.

> Trying to avoid this just means that people add patches sideways to
> satisfy specific needs, which hurts maintainability. Also, I don't like
> to discourage first time contributors with lots of concerns, therefore
> I'd like a cleaner, more flexible os layer.

I'd like a simpler, maybe less flexible but understandable-by-mere-mortals OS layer :) A lower layer that does not make upper layers too complicated.

> But no-one forces you to accept multi-page-sized-areas. If you really
> want just one page size, you can query the largest page size available
> beforehand and align the reservation size accordingly, and with my

Which is no issue with 64 bit machines at all, but probably has been with the prevalence of 32 bit address spaces.

> proposed change you could now assert the result and log it correctly.
> But if one just generally wants large pages without caring for the
> precise layout, one could let os::reserve_memory_special() do its best,
> and would now get the information about what reserve_memory_special() did.

This is a kind of one-sided argument, taking only commit into account. Since actually giving back memory is expected nowadays, taking care of different random page sizes is complicated.

E.g. when implementing G1's region memory management (in 8u40) the decision to only support a single page size for every one of its GC data structures has been a conscious one, because the complexity overhead did not justify the gains. Nobody complained yet.

> For example, were I to re-introduce large pages for Metaspace, I would
> like to have the luxury of just calling os::reserve_memory_special()
> without having to think about specific geometry - if the space is large
> enough for large pages, it should stitch the area together as best as it
> can.

That's true, but that Metaspace layer then needs to be aware of multiple page sizes when uncommitting, and (presumably) tracking liveness on the lowest granularity anyway. Which does not make the code easier.

Thanks,
  Thomas
Hi Thomas, first off, the last thing I want is for matters to become more complicated. God no. Today the virtual memory layer is inconsistent and undocumented and has subtle bugs. Among other things this means that PRs can take forever to discuss, which I guess is frustrating for newcomers. Or, if they are pushed, they can cause a lot of cleanup work afterwards.

Just attempting to write a consistent API description of the virtual memory layer (parameters, expected behavior, etc.) is very difficult. I know, since I tried. And that is a bad sign. Tests are missing too, since what you cannot describe you cannot test. For an interesting example, see JDK-8255978. That bug had been in there since at least JDK 8.

One problem I see reoccurring with these APIs is that meta information about a reservation is needed later, but is lost. Many examples:

The code then often tries to guess this information from "circumstantial evidence", which introduces lots of strange dependencies and makes the coding fragile. ReservedSpace::actual_page_size() in this patch is one example. A better way would be to have this information kept somewhere. Ideally, the OS would do this and I could ask it for properties of a memory mapping. In reality, the OS is not so forthcoming. So we should keep this information ourselves. One such piece of information is the page sizes of a region. In that light you can see my proposal.

I did not propose to change a single atom about how os::reserve_memory_special() works. I just proposed for it to be honest and return information about what it did, so that this information could be kept inside ReservedSpace and reused instead of guessed. And since the reality today is that os::reserve_memory_special() can reserve multiple page sizes, a correct implementation would have to reflect that or be wrong.

You are right, reporting multiple page sizes is more difficult than just one. But even more complex is what we have today, where a mapping can have multiple page sizes but this is not acknowledged by upper layers. In this case, we have ReservedSpace::actual_page_size() which plainly cannot be implemented correctly.

I am fine with the idea of restricting RS to just one page size. But then let's do that and get rid of mixed TLBFS reservation (and whatever Windows does; they may be doing something similar). We also should specify the page size as an explicit parameter and not confuse this with the alignment.
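The bookkeeping argued for here, recording page-size metadata at reservation time instead of deducing it later, might look roughly like this; the struct and function names are hypothetical, not actual HotSpot code:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical reservation record that keeps page-size metadata from the
// moment the mapping is created, so later queries are a read, not a guess.
struct Reservation {
    char*  base;
    size_t size;
    size_t page_size;  // recorded at reserve time
};

static Reservation make_reservation(char* base, size_t size, size_t page_size) {
    // In a real implementation this metadata would come from whatever
    // reservation call actually created the mapping.
    Reservation r;
    r.base = base;
    r.size = size;
    r.page_size = page_size;
    return r;
}

// Replaces deduction from "circumstantial evidence" with a trivial lookup.
static size_t actual_page_size(const Reservation& r) {
    return r.page_size;
}
```

The point of the sketch is only the shape of the design: the page size is stored once, by the code that knows it, and every later consumer reads it back instead of re-deriving it from flags and alignments.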
I am not sure what coding you refer to.

Flexible stitching would have one advantage though, in that a huge page pool could be better used. In your example, if someone starts with a 12G heap, it would fail since no pool individually is large enough, but combining different page sizes would work. I won't insist on this. As I wrote, I am fine with one-pagesize-per-mapping if we can arrive there.

Since it was relevant to this PR, I did mention it here.

I do too.

I may be missing something here, but is huge paged space not pinned? How can this be uncommitted? Okay yes, if it can be uncommitted, multiple page sizes should be avoided.

See above. I was under the assumption that uncommit goes out of the window the moment you choose explicit large pages.

Thanks, Thomas
You are correct that explicit huge pages are pinned, and of course there might be situations where it would be beneficial to do the stitching, but I think the maintenance cost will be significantly higher. It sounds like you would prefer a simpler model as well, and it would certainly be interesting to see if there are any problems connected to getting rid of the mixed mappings we have today. I also agree that we should do better bookkeeping since it is so hard to get the information later on.
Hi guys, gentle ping. I know we are all busy with the impending code freeze :)

I am fine with removing the ability to reserve space with multiple page sizes. But that is something I cannot drive; probably someone at Oracle should, since it has repercussions for GC and for backward compatibility. Might also take a bit longer.

Meanwhile, we still have the new ReservedSpace::actual_page_size(), which is subtly broken and will be more so once JDK-8234930 is implemented. IIUC you did not like the proposal of making reserve_space_special() return information about the reserved page sizes. That's totally fine for me, but have we decided on how to deal with this instead?

Cheers, Thomas
Hi Thomas, I just filed JDK-8257757 to keep track of this, feel free to add information to it. Let's work together to try to improve this. Cheers,
Description
The logging for reserved heap space printed the GenAlignment value instead of the page size.

Before:

Besides changing to display the page size using os::vm_page_size(), this fix also adds logging of SpaceAlignment, GenAlignment, and HeapAlignment.

After:

Please advise whether the added logging of alignments is sufficient or irrelevant. There were no signs of either alignment being logged anywhere.
Testing
-Xlog:pagesize=info -XX:+UseParallelGC
Download
$ git fetch https://git.openjdk.java.net/jdk pull/1161/head:pull/1161
$ git checkout pull/1161