Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8243315: ParallelScavengeHeap::initialize() passes GenAlignment as page size to os::trace_page_sizes instead of actual page size #1161

Closed
wants to merge 9 commits into from

Conversation

jaokim
Copy link
Contributor

@jaokim jaokim commented Nov 11, 2020

Description

The logging for reserved heap space, printed the GenAlignment value instead of page size.
Before:

[0.369s][info][gc      ] Using Parallel
[0.374s][info][pagesize] Heap:  min=2M max=2M base=0x00000000ffe00000 page_size=512K size=2M

Besides changing to display page size using os::vm_page_size(), this fix also adds logging of SpaceAlignment ,GenAlignment, and HeapAlignment.
After:

[0.367s][info][pagesize] Alignments:  space_align=512K gen_align=512K heap_align=2M
[0.369s][info][gc      ] Using Parallel
[0.374s][info][pagesize] Heap:  min=2M max=2M base=0x00000000ffe00000 page_size=4K size=2M

Please advice whether the added logging of alignments is sufficient or irrelevant. There were no signs of either alignment being logged anywhere.

Testing

  • Tested manually, and logs are showing with arguments -Xlog:pagesize=info -XX:+UseParallelGC
  • Tested Tier1, Tier2, Tier3.

Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Testing

Linux aarch64 Linux arm Linux ppc64le Linux s390x Linux x64 Linux x86 Windows x64 macOS x64
Build ✔️ (1/1 passed) (1/1 failed) (1/1 failed) ✔️ (1/1 passed) ✔️ (6/6 passed) (2/2 failed) ✔️ (2/2 passed) ✔️ (2/2 passed)
Test (tier1) ✔️ (9/9 passed) ✔️ (9/9 passed) ✔️ (9/9 passed)

Failed test tasks

Issue

  • JDK-8243315: ParallelScavengeHeap::initialize() passes GenAlignment as page size to os::trace_page_sizes instead of actual page size

Reviewers

Download

$ git fetch https://git.openjdk.java.net/jdk pull/1161/head:pull/1161
$ git checkout pull/1161

…ge size to os::trace_page_sizes instead of actual page size
@bridgekeeper bridgekeeper bot added the oca label Nov 11, 2020
@bridgekeeper
Copy link

@bridgekeeper bridgekeeper bot commented Nov 11, 2020

Hi @jaokim, welcome to this OpenJDK project and thanks for contributing!

We do not recognize you as Contributor and need to ensure you have signed the Oracle Contributor Agreement (OCA). If you have not signed the OCA, please follow the instructions. Please fill in your GitHub username in the "Username" field of the application. Once you have signed the OCA, please let us know by writing /signed in a comment in this pull request.

If you already are an OpenJDK Author, Committer or Reviewer, please click here to open a new issue so that we can record that fact. Please use "Add GitHub user jaokim" as summary for the issue.

If you are contributing this work on behalf of your employer and your employer has signed the OCA, please let us know by writing /covered in a comment in this pull request.

@openjdk
Copy link

@openjdk openjdk bot commented Nov 11, 2020

@jaokim The following label will be automatically applied to this pull request:

  • hotspot-gc

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-gc label Nov 11, 2020
@jaokim jaokim changed the title 8180069: ParallelScavengeHeap::initialize() passes GenAlignment as page size to os::trace_page_sizes instead of actual page size 8243315: ParallelScavengeHeap::initialize() passes GenAlignment as page size to os::trace_page_sizes instead of actual page size Nov 11, 2020
@bridgekeeper bridgekeeper bot removed the oca label Nov 11, 2020
@jaokim jaokim marked this pull request as ready for review Nov 12, 2020
@openjdk openjdk bot added the rfr label Nov 12, 2020
@mlbridge
Copy link

@mlbridge mlbridge bot commented Nov 12, 2020

Webrevs

Copy link
Contributor

@kstefanj kstefanj left a comment

Thanks for taking a look at this issue. A few suggestions.

src/hotspot/share/gc/parallel/parallelArguments.cpp Outdated Show resolved Hide resolved
@@ -66,7 +66,7 @@ jint ParallelScavengeHeap::initialize() {
os::trace_page_sizes("Heap",
MinHeapSize,
reserved_heap_size,
GenAlignment,
os::vm_page_size(),
Copy link
Contributor

@kstefanj kstefanj Nov 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I briefly looked at this in the past and I don't think using os::vm_page_size() straight off will not work for the case when large pages are used. So we need a way to figure out if the heap got reserved using large pages. Once again you can take a look at how G1 does it, it's in actual_reserved_page_size() in g1CollectedHeap.cpp. I'm not sure you can use that straight off, but something similar should work.

Copy link
Contributor Author

@jaokim jaokim Nov 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for comments! Have rearranged the the logging of this. The logic for G1 also applies to how parallel's heap is reserved.

@openjdk openjdk bot removed the rfr label Nov 19, 2020
@openjdk
Copy link

@openjdk openjdk bot commented Nov 19, 2020

@jaokim this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout JBS-8180069
git fetch https://git.openjdk.java.net/jdk master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push

@openjdk openjdk bot added merge-conflict rfr labels Nov 19, 2020

log_info(pagesize)("Alignments: "
" space_align=" SIZE_FORMAT "%s"
" gen_align=" SIZE_FORMAT "%s"
" heap_align=" SIZE_FORMAT "%s",
byte_size_in_exact_unit(SpaceAlignment), exact_unit_for_byte_size(SpaceAlignment),
byte_size_in_exact_unit(GenAlignment), exact_unit_for_byte_size(GenAlignment),
byte_size_in_exact_unit(HeapAlignment), exact_unit_for_byte_size(HeapAlignment)
);
Copy link
Contributor Author

@jaokim jaokim Nov 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is moved to the ParallelInitLogger

virtual void print_heap();
//virtual void print_workers();
Copy link
Contributor Author

@jaokim jaokim Nov 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed too soon, just ignore these.

Copy link
Contributor

@kstefanj kstefanj left a comment

Thanks for the update Joakim, some more comments :)

#include "gc/shared/genArguments.hpp"
#include "gc/shared/gcLogPrecious.hpp"

void ParallelInitLogger::print_gc_specific() {
Copy link
Contributor

@kstefanj kstefanj Nov 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this as heap information so I would suggest using print_heap instead and as now print it before the general heap info.

log_info_p(gc, init)("Alignments:"
" space_align=" SIZE_FORMAT "%s"
" gen_align=" SIZE_FORMAT "%s"
" heap_align=" SIZE_FORMAT "%s",
byte_size_in_exact_unit(SpaceAlignment), exact_unit_for_byte_size(SpaceAlignment),
byte_size_in_exact_unit(GenAlignment), exact_unit_for_byte_size(GenAlignment),
byte_size_in_exact_unit(HeapAlignment), exact_unit_for_byte_size(HeapAlignment)
);
Copy link
Contributor

@kstefanj kstefanj Nov 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The indentation here got a bit of now. I also took a look at the current output and we do not use '=' anywhere else so I suggest going with something like this:

Suggested change
log_info_p(gc, init)("Alignments:"
" space_align=" SIZE_FORMAT "%s"
" gen_align=" SIZE_FORMAT "%s"
" heap_align=" SIZE_FORMAT "%s",
byte_size_in_exact_unit(SpaceAlignment), exact_unit_for_byte_size(SpaceAlignment),
byte_size_in_exact_unit(GenAlignment), exact_unit_for_byte_size(GenAlignment),
byte_size_in_exact_unit(HeapAlignment), exact_unit_for_byte_size(HeapAlignment)
);
log_info_p(gc, init)("Alignments:"
" Space " SIZE_FORMAT "%s,"
" Generation " SIZE_FORMAT "%s,"
" Heap " SIZE_FORMAT "%s",
byte_size_in_exact_unit(SpaceAlignment), exact_unit_for_byte_size(SpaceAlignment),
byte_size_in_exact_unit(GenAlignment), exact_unit_for_byte_size(GenAlignment),
byte_size_in_exact_unit(HeapAlignment), exact_unit_for_byte_size(HeapAlignment));

This would generate output like this:

[0.811s][info][gc,init] Compressed Oops: Enabled (Zero based)                                                                                                                                                                                                                                                        
[0.811s][info][gc,init] Alignments: Space 512K, Generation 512K, Heap 2M                                                                                                                                                                                                                                             
[0.811s][info][gc,init] Heap Min Capacity: 8M                                                                                                                                                                                                                                                                        
[0.811s][info][gc,init] Heap Initial Capacity: 1002M                                                                                                                                                                                                                                                                 
[0.811s][info][gc,init] Heap Max Capacity: 16012M                                                                                                                                                                                                                                                                    
[0.811s][info][gc,init] Pre-touch: Disabled                                                                                                                                                                                                                                                                          

Copy link
Contributor Author

@jaokim jaokim Nov 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, looks much better! I just took the format from the original pagesize output, but this looks better.

MinHeapSize,
reserved_heap_size,
page_size,
rs.base(),
rs.size());
Copy link
Contributor

@kstefanj kstefanj Nov 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation is one off.

size_t page_size = os::vm_page_size();
if (UseLargePages) {
// There are two ways to manage large page memory.
// 1. OS supports committing large page memory.
// 2. OS doesn't support committing large page memory so ReservedSpace manages it.
// And ReservedSpace calls it 'special'. If we failed to set 'special',
// we reserved memory without large page.
if (os::can_commit_large_page_memory() || rs.special()) {
// An alignment at ReservedSpace comes from preferred page size or
// heap alignment, and if the alignment came from heap alignment, it could be
// larger than large pages size. So need to cap with the large page size.
page_size = MIN2(rs.alignment(), os::large_page_size());
}
}
Copy link
Contributor

@kstefanj kstefanj Nov 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is exactly the same code as in actual_reserved_page_size() in g1CollectedHeap.cpp I think we should lift it out to a helper that can be used by both G1 and Parallel. Not entirely sure where the best location for such helper is, but a static function ReservedSpace::actual_page_size(ReservedSpace) could work.

@openjdk openjdk bot added hotspot and removed rfr hotspot-gc labels Nov 24, 2020
@openjdk openjdk bot added rfr and removed merge-conflict labels Nov 24, 2020
Copy link
Contributor

@tschatzl tschatzl left a comment

Lgtm.

@kstefanj
Copy link
Contributor

@kstefanj kstefanj commented Nov 25, 2020

@jaokim I've run some additional testing and haven't seen any problems so I'm ready to sponsor whenever you want to integrate.

@jaokim
Copy link
Contributor Author

@jaokim jaokim commented Nov 25, 2020

Thank you @kstefanj and @tschatzl for reviewing and comments!

Tested tier1, tier2, tier3 without problems.

/integrate

@openjdk openjdk bot added the sponsor label Nov 25, 2020
@openjdk
Copy link

@openjdk openjdk bot commented Nov 25, 2020

@jaokim
Your change (at version 8293b65) is now ready to be sponsored by a Committer.

@kstefanj
Copy link
Contributor

@kstefanj kstefanj commented Nov 25, 2020

/sponsor

@openjdk openjdk bot closed this Nov 25, 2020
@openjdk openjdk bot added integrated and removed sponsor ready rfr labels Nov 25, 2020
@openjdk
Copy link

@openjdk openjdk bot commented Nov 25, 2020

@kstefanj @jaokim Since your change was applied there have been 22 commits pushed to the master branch:

  • cdb41ba: 8255904: Remove superfluous use of reflection in Class::isRecord
  • c45725e: 8256747: GitHub Actions: decouple the hotspot build-only jobs from Linux x64 testing
  • 79e57ac: 8256643: Terminally deprecate ThreadGroup stop, destroy, isDestroyed, setDaemon and isDaemon
  • 0d91f0a: 8252848: Optimize small primitive arrayCopy operations through partial inlining using AVX-512 masked instructions
  • 66943fe: 8256517: (ref) Reference.clear during reference processing may lose notification
  • 3c230b8: 8256993: Clarify Package::isSealed javadoc about package sealing vs sealed class or interface
  • 1b7a61f: 8254999: Move G1RemSetSamplingTask to more appropriate location
  • 695117f: 8255479: [aarch64] assert(src->section_index_of(target) == CodeBuffer::SECT_NONE) failed: sanity
  • dbfeb90: 8243559: Remove root certificates with 1024-bit keys
  • 2a1e9be: 8256364: vmTestbase/nsk/jvmti/scenarios/capability/CM01/cm01t002 failed with "assert(handle != __null) failed: JNI handle should not be null"
  • ... and 12 more: https://git.openjdk.java.net/jdk/compare/b52f6c053bf5cf6363088cb8bd210abb4e7350ff...master

Your commit was automatically rebased without conflicts.

Pushed as commit 8cd2e0f.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

openjdk-notifier bot referenced this issue Nov 25, 2020
…ge size to os::trace_page_sizes instead of actual page size

Reviewed-by: sjohanss, tschatzl
@tstuefe
Copy link
Member

@tstuefe tstuefe commented Nov 25, 2020

Hi guys,

I was just reviewing this but see that it was already pushed in the meantime.

I have some concerns about this change.

Most importantly, it perpetuates the notion that alignment has anything to do with pages, which is just wrong. I know we have still some coding in the hotspot which assumes this but it is not true. We should remove any coding assuming that, instead of building onto it.

ReservedSpace::alignment is just that, the alignment of the start address of the ReservedSpace. Arguably, there is no need to even keep it as a member in ReservedSpace, it is needed for reservation and should not be needed beyond that.

So, page_size = MIN2(rs.alignment(), os::large_page_size()); is wrong, e.g. if I create a space with 128M aligned base, that does say nothing about the underlying page size. Unfortunately we do not have tests to test this. This coding had been less of a problem as long as it lived in gc land and was only called for heap areas. Now it is a general purpose function.

(why is this function static btw?)

As an example for larger alignment which have nothing to do with page size: Metaspace reserves 4M-aligned ReservedSpaces. Currently only uses small pages, but that may change.

Furthermore I think the notion of asking about the page size of a range is itself shaky. A range can and does consist of multiple page sizes (see e.g. what Linux reserves with UseHugeTLBFS).

The patch also assumes os::large_page_size() to be the one large page size used, and currently there are attempts by Intel to more or less fall back to a "small large page" if the large page is too large for a given area: see #1153. This would increase the number of used page sizes, and could mean space reserved with reserve_memory_special() would silently return memory with small-large-pages instead of os::large_page_size().

I am sorry to make such a fuss.


In an ideal world we would have something like

pagesizes_t os::query_page_sizes(address range);

which would return information about all page sizes in the range (page_sizes_t could be a bitmask btw since all page sizes are pow2). We have something like this on AIX, see os::Aix::query_pagesize. On Linux, we have e.g. int getpagesize(void);

That'd be awesome since it would take the burden from us to keep information of page sizes or second-guessing reservation logic. That information could even be cached in a ReservedSpace if its difficult to get.

Cheers, Thomas

@tstuefe
Copy link
Member

@tstuefe tstuefe commented Nov 25, 2020

I wrote

On Linux, we have e.g. int getpagesize(void);

which is of course wrong since it just returns the global small page size. Querying the page size of an arbitrary range seems to require walking smaps, and that is way more cumbersome.

@kstefanj
Copy link
Contributor

@kstefanj kstefanj commented Nov 25, 2020

Hi Thomas,

Thanks for sharing your concerns. I agree that this becomes more of a problem now when it is exposed outside the GC, that's why I wasn't sure where we should put this:

Not entirely sure where the best location for such helper is, but a static function ReservedSpace::actual_page_size(ReservedSpace) could work.

This is also why it was added as a static helper rather than a member. An alternative is to move this helper to a shared GC utility, and maybe document it a bit more. Adding multiple large page sizes will require this function to be updated, that's one of the reasons I wanted it shared not duplicated.

As you say the best thing would be if we had a way to really query the page size rather than doing a good estimation given the looks of the reserved space. But for the current use-cases in the GC, this estimation seems to be enough. Going the other route is, as you say, way more work.

A different solution would be to add a _largest_page_size member to the reserved space, which would be the larges page size used by that reservation. That might make sense if we are moving towards having multiple large page sizes.

What would be your preferred way forward?

Cheers,
Stefan

@tstuefe
Copy link
Member

@tstuefe tstuefe commented Nov 25, 2020

Hi Stefan,

Hi Thomas,

Thanks for sharing your concerns. I agree that this becomes more of a problem now when it is exposed outside the GC, that's why I wasn't sure where we should put this:

Not entirely sure where the best location for such helper is, but a static function ReservedSpace::actual_page_size(ReservedSpace) could work.

This is also why it was added as a static helper rather than a member. An alternative is to move this helper to a shared GC utility, and maybe document it a bit more. Adding multiple large page sizes will require this function to be updated, that's one of the reasons I wanted it shared not duplicated.

As you say the best thing would be if we had a way to really query the page size rather than doing a good estimation given the looks of the reserved space. But for the current use-cases in the GC, this estimation seems to be enough. Going the other route is, as you say, way more work.

A different solution would be to add a _largest_page_size member to the reserved space, which would be the larges page size used by that reservation. That might make sense if we are moving towards having multiple large page sizes.

What would be your preferred way forward?

Cheers,
Stefan

Thanks for taking my concerns seriously. I know that this is all a bit messy, and has a long history.

I briefly looked into some way to ask the OS about the page size of a memory region. That is certainly possible but annoyingly complicated.

On AIX we have a thing already. The only platform that has :)

On Linux you would have to scan /proc/self/smaps. May be worthwhile but complex and expensive for large processes.

On Windows I have no clue what do do. VirtualQuery() does not return page information. Neither do I know what to do on MacOS.

Interestingly, I found that we have os::page_info() and os::scan_pages(), which look like they would do what we want but they are empty on all platforms. Solaris leftover. I filed JDK-8257076 for you guys to track this since you may remove some of the associated NUMA coding too.


So the more practical and faster way would be to store this information ourselves. It would be better than trying to deduce what we did at reservation time.

I am currently working on a prototype to expand os::reserve_memory_special() to optionally return information about the page sizes in the reserved range. Kinda like sigset_t. That would allow the os implementors to do whatever (eg mix in 2M pages), and you have the information about that set of page sizes, which could be stored in ReservedSpace as a member. Later, that info can be used as "what the underlying page sizes are to our best knowledge".

That prototype would look a bit like this:

char* os::reserve_memory_special(..., pagesizeset_t* page_sizes = NULL);

This is a smaller version of what I think would be generally a good idea, which is for all os::reserve__... APIs to return meta information about the reserved space for the caller to hold on to. But this is only for os::reserve_memory_special(), and only for page sizes, so less daunting.

I'll post an RFR if I have a prototype ready. Then we may take another look at this change and rework this actual_page_size().

Cheers, Thomas

@kstefanj
Copy link
Contributor

@kstefanj kstefanj commented Nov 25, 2020

So Thomas in what cases do we need the ReservedSpace to be able to handle multiple page-sizes? A simpler approach would be to aim for only having one page size per ReservedSpace. For example, in G1 it would be problematic or at least inefficient to have multiple page sizes per mapping.

@tstuefe
Copy link
Member

@tstuefe tstuefe commented Nov 25, 2020

So Thomas in what cases do we need the ReservedSpace to be able to handle multiple page-sizes? A simpler approach would be to aim for only having one page size per ReservedSpace. For example, in G1 it would be problematic or at least inefficient to have multiple page sizes per mapping.

Happens if you reserve a space with a size not aligned to the underlying large page size. See reserve_memory_special_huge_tlbfs_mixed() on Linux. In that case, we try to be smart and fill in the ends with small pages rather than failing.

@tstuefe
Copy link
Member

@tstuefe tstuefe commented Nov 25, 2020

Of which the Intel approach with the 2M pages could be seen as an expansion (e.g. if you have a heap of 1.5G, allocate 1G huge page, and fill in the ends with 2M pages, and if necessary with 4K pages).

@kstefanj
Copy link
Contributor

@kstefanj kstefanj commented Nov 25, 2020

Sorry was a bit unclear, I know about this case but I was more thinking do we need to support it. Could we just always align correctly.

@kstefanj
Copy link
Contributor

@kstefanj kstefanj commented Nov 25, 2020

The model would be so much nicer if a ReservedSpace always just had one page-size.

@tstuefe
Copy link
Member

@tstuefe tstuefe commented Nov 25, 2020

Yes, but I think it makes sense to allow the os layer to mix page sizes for large paged areas, for performance reasons. The fact that this coding exists, and that Intel wants to further complicate it and add 2M pages, means we have a need here.

Trying to avoid this just means that people add patches sideways to satisfy specific needs, which hurts maintainability. Also, I don't like to discourage first time contributors with lots of concerns, therefore I'd like a cleaner, more flexible os layer.

But no-one forces you to accept multi-page-sized-areas. If you really want just one page size, you can query the largest page size available beforehand and align the reservation size accordingly, and with my proposed change you could now assert the result and log it correctly.

But if one just generally wants large pages without caring for the precise layout, one could let os::reserve_memory_special() do its best, and would now get the information about what reserve_memory_special() did.

For example, were I to re-introduce large pages for Metaspace, I would like to have the luxury of just calling os::reserve_memory_special() without having to think about specific geometry - if the space is large enough for large pages, it should stitch the area together as best as it can.

@tschatzl
Copy link
Contributor

@tschatzl tschatzl commented Nov 25, 2020

@tstuefe
Copy link
Member

@tstuefe tstuefe commented Nov 25, 2020

Hi Thomas,

first off, the last thing I want is for matters to become more complicated. God no.

Today the virtual memory layer is inconsistent and undocumented and has subtle bugs. Among other things this means that PRs can take forever to discuss, which I guess is frustrating for newcomers. Or if they are pushed they can cause a lot of cleanup work afterwards.

Just attempting to write a consistent API description of the virtual memory layer, parameters expected behavior etc, is very difficult. I know since I tried. And that is a bad sign. Tests are missing too, since what you cannot describe you cannot test. For an interesting example, see JDK-8255978. That bug had been in there since at least JDK8.

One problem I see reoccurring with these APIs is that meta information about a reservation are needed later, but are lost. Many examples:

  • the exact flags with which a mapping was established (exec or not, MAP_JIT, etc)
  • which API has been used (AIX)
  • what the page sizes were at reservation time
  • whether the area had been stitched together with multiple mappings or a single one
    .. etc.

The code then often tries to guess these information from "circumstantial evidence", which introduces lots of strange dependencies and makes the coding fragile. ReservedSpace::actual_page_size() in this patch is one example.

A better way would be to have these information kept somewhere. Ideally, the OS would do this and I could ask it for properties of a memory mapping. In reality, the OS is not so forthcoming. So we should keep these information ourselves.

One of these information is the page sizes of a region. In that light you can see my proposal. I did not propose to change a single atom about how os::reserve_memory_special() works. I just proposed for it to just be honest and return information about what it did. So that this information could be kept inside ReservedSpace and reused instead of guessed. And since the reality today is that os::reserve_memory_special() can reserve multiple page sizes, a correct implementation would have to reflect that or be wrong.

You are right, reporting multiple page sizes is more difficult than just one. But even more complex is what we have today, where a mapping can have multiple page sizes but this is not acknowledged by upper layers. In this case, we have ReservedSpace::actual_page_size() which plain cannot be implemented correctly.

I am fine with the idea of restricting RS to just one page size. But then lets do that and get rid of mixed TLBFS reservation (and whatever Windows does, they may be doing something similar). We also should specify the page size as an explicit parameter and not confuse this with the alignment.

That code may be an artifact of 32 bit machines from 10+ years ago where address space fragmentation has been a real concern (typically Windows). Or RAM sizes were measured in hundreds of MB instead of GBs where (on Linux) pre-reserving hundreds of MB for huge pages is/has been an issue.

I am not sure what coding you refer to.

and that Intel wants to further complicate it and add 2M pages, means we have a need here.

The JDK-8256155 (Intel) CR does not state that requirement. Imho it only says that the author wants to use (any and all) configured pages for different reserved spaces. E.g. the machine has (on x64, to simplify the case) configured: 10 1G pages 1000 2M pages so the heap should use the 1G pages (assuming it's less than 10G), other reservations like code heap should first use the 50 2M pages before falling back to other page sizes to better use available TLB cache entries. I would prefer if we do not overcomplicate the requirements :)

Flexible stitching would have one advantage though, in that a huge page pool could be better used. In your example, if someone starts with a 12G heap, it would fail since no pool indicidually is large enough, but combining different page sizes would work.

I wont insist on this. As I wrote, I am fine with one-pagesize-per-mapping if we can arrive there.

Also probably this should be asked and followed up in the correct review thread.

Since it was relevant to this PR, I did mention it here.

I'd like a simpler, maybe less flexible but understandable by mere mortals OS layer :) lower layer that does not make upper layers too complicated.

I do too.

This is a kind of one-sided argument, taking only commit into account. Since actually giving back memory is expected nowadays, taking care of different random page sizes is complicated. E.g. when implementing G1's region memory management (in 8u40) the decision to only support a single page size for every one of its GC data structures has been a conscious one - because the complexity overhead did not justify the gains. Nobody complained yet.

I may be missing something here, but is hupe paged space not pinned? How can this be uncommitted? Okay yes, if it can be uncommitted, multiple page sizes should be avoided.

For example, were I to re-introduce large pages for Metaspace, I would like to have the luxury of just calling os::reserve_memory_special() without having to think about specific geometry - if the space is large enough for large pages, it should stitch the area together as best as it can.

That's true, but that Metaspace layer then needs to be aware of multiple page sizes when uncommitting, and (presumably) tracking liveness on the lowest granularity anyway. Which does not make the code easier.

See above. I was under the assumption that uncommit goes out of the window the moment you choose explicit large pages.

Thanks, Thomas

Thanks, Thomas

@kstefanj
Copy link
Contributor

@kstefanj kstefanj commented Nov 26, 2020

You are correct that explicit huge pages are pinned and of course there might be situations where it would be beneficial to do the stitching, but I think the maintenance cost will be significantly higher. It sound like you would prefer a simpler model as well and int would certainly be interesting to see if there are any problems connected to getting rid of the mixed mappings we have today.

I also agree that we should do better book keeping since it is so hard to get the information later on.

@tstuefe
Copy link
Member

@tstuefe tstuefe commented Dec 3, 2020

Hi guys,

gentle ping. I know we are all busy with the impending code freeze :)

I am fine with removing the ability to reserve space with multiple page sizes. But that is something I cannot drive - probably someone at Oracle should, since it has repercussions for GC and for backward compatibility. Might also take a bit longer.

Meanwhile, we still have the new ReservedSpace::actual_page_size(), which is subtly broken and will be more so once JDK-8234930 is implemented. IIUC you did not like the proposal of making reserve_space_special() return information about the reserved page sizes. That's totally fine for me, but have we decided on how to deal with this instead?

Cheers, Thomas

@kstefanj
Copy link
Contributor

@kstefanj kstefanj commented Dec 4, 2020

Hi Thomas,

I just filed JDK-8257757 to keep track of this, feel free to add information to it. Let's work together to try to improve this.

Cheers,
Stefan

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot integrated
4 participants