
JDK-8270308: Amalloc aligns size but not return value #4784


Conversation

tstuefe
Member

@tstuefe tstuefe commented Jul 14, 2021

Hi,

may I have reviews for this fix to Arena alignment handling in hotspot. This affects the old Arena classes as well as child classes (ResourceArea, HandleArea etc). This is a followup to https://bugs.openjdk.java.net/browse/JDK-8270179 and other recent Arena fixes.

This makes arenas more flexible, allowing allocation with different alignments to achieve tighter packing. Theoretically, arenas should be able to do that today, but it is actually broken. We did not notice any errors, though, probably because the only platforms affected are 32-bit platforms other than x86 (x86 allows unaligned access). Also, nobody in the VM today mixes different alignments: you either allocate only with Amalloc, i.e. with 64-bit alignment, or only with AmallocWords (the former Amalloc_4), which means 32-bit alignment on 32-bit platforms.

I redid this patch twice. My first attempts rewrote a lot of the code, but I scratched that; this patch stays focused on the alignment issue only. Cleanups are deferred to separate RFEs.

This patch establishes new alignment rules, but actual behavioral changes to current Arena users should be minimal and should only affect 32-bit anyway.

Before this patch, Arena allocation checked (or auto-aligned) the allocation size in an attempt to guarantee alignment of the allocation start address. That was neither needed nor sufficient - see JBS text.

The new code allows any allocation size, aligned or unaligned. However, it now clearly guarantees alignment of the allocation start address. For example, you could theoretically allocate a 16-bit word at a 64-bit boundary.

The arena does this by aligning, if needed, before an allocation. Alignment gaps are filled (in debug builds with ZapResourceArea) with a special zap pattern.
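
As an illustration only, here is a minimal standalone sketch of this align-on-demand scheme (helper and variable names are mine, not the patch's; the real patch works on Chunk and _hwm inside the Arena classes, and zaps only in debug builds with ZapResourceArea):

#include <cstddef>
#include <cstdint>
#include <cstring>

// Round a pointer up to a power-of-two alignment (illustrative helper).
static char* align_up_ptr(char* p, size_t alignment) {
  uintptr_t v = reinterpret_cast<uintptr_t>(p);
  v = (v + alignment - 1) & ~(uintptr_t)(alignment - 1);
  return reinterpret_cast<char*>(v);
}

// Bump allocation within one chunk: align the top pointer first, zap the
// resulting gap with the 'G' pattern, then hand out 'size' bytes. Note that
// the top may be left unaligned afterwards.
static void* chunk_alloc(char*& top, char* end, size_t size, size_t alignment) {
  char* start = align_up_ptr(top, alignment);
  if (start + size > end) {
    return nullptr;                        // chunk exhausted; the arena would grow
  }
  memset(top, 'G', (size_t)(start - top)); // zap the alignment gap
  top = start + size;
  return start;
}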


Notes:

  • Chunk start, payload start, and payload end have to be aligned to the max arena alignment. I wondered whether to just statically assert that sizeof(Chunk) is aligned, but I did not find a pragma for this that works on all platforms, and did not want to play whack-a-mole with 32-bit and strange compilers like Xlc. Therefore we align the payload start separately (see the sketch after this list).
  • To simplify the implementation, I added the rule that the four standard chunk sizes (those cached in the pools) have to be aligned to the max arena alignment. They mostly were already; I just had to tweak the numbers a bit for 32-bit.
  • Chunks are now preceded by a canary for debugging purposes and for potential sanity tests in gtests (not used yet).
  • Arena::realloc():
    • I commented in the header that the original alignment is preserved (which it is, since Arealloc either changes the allocation in place, leaving the start address unchanged, or reallocates with the max possible alignment).
    • I fixed another pointer calculation to use pointer_delta when an allocation is grown in place.
    • I added code to zap the leftover memory if an allocation shrinks.
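
To illustrate the first note, a small standalone sketch of aligning the payload start separately (names are mine, not the patch's):

#include <cstddef>
#include <cstdint>

// The usable payload of a chunk begins after the chunk header, rounded up to
// the maximum arena alignment, so the first allocation in every chunk starts
// aligned even if sizeof(Chunk) itself is not a multiple of that alignment.
static char* chunk_payload_start(char* chunk, size_t header_size, size_t max_align) {
  uintptr_t p = reinterpret_cast<uintptr_t>(chunk) + header_size;
  p = (p + max_align - 1) & ~(uintptr_t)(max_align - 1);
  return reinterpret_cast<char*>(p);
}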

Tests:

  • I wrote a large number of gtests for the Arena class. They test proper behavior on alloc, realloc, and free, including overwrite tests and gap-pattern tests (one such test is sketched below).
  • I wrote a new gtest jtreg wrapper to run the Arena gtests with +UseMallocOnly - as long as we support that mode, I think we should test it.
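
One of the gap-pattern tests could look roughly like this (a sketch only, assuming ZapResourceArea is enabled as it is in debug builds; the test name is made up):

#include "precompiled.hpp"
#include "memory/arena.hpp"
#include "unittest.hpp"

// A one-byte allocation leaves the arena top unaligned, the next 64-bit
// allocation has to insert a padding gap, and the gap bytes should carry
// the 'G' zap pattern.
TEST_VM(Arena, alignment_gap_is_zapped) {
#ifdef ASSERT
  Arena ar(mtTest);
  char* p1 = (char*) ar.Amalloc(1);  // just one byte, top now at offset 1
  char* p2 = (char*) ar.Amalloc(8);  // realigns to 64 bit, creating a gap
  for (char* q = p1 + 1; q < p2; q++) {
    EXPECT_EQ(*q, 'G');              // padding was zapped
  }
#endif
}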

Ran all tests on 32-bit and 64-bit Linux on my box.

Nightlies at SAP are in progress.

Manual tests:

  • I also tested a higher arena max alignment (16 bytes on 64-bit), ran the gtests, and they worked fine.
  • I also tested an unaligned Chunk size on 64-bit to verify that padding after the header is handled correctly; it works fine.

Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8270308: Amalloc aligns size but not return value

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/4784/head:pull/4784
$ git checkout pull/4784

Update a local copy of the PR:
$ git checkout pull/4784
$ git pull https://git.openjdk.java.net/jdk pull/4784/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 4784

View PR using the GUI difftool:
$ git pr show -t 4784

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/4784.diff

@bridgekeeper

bridgekeeper bot commented Jul 14, 2021

👋 Welcome back stuefe! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Jul 14, 2021

@tstuefe The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot hotspot-dev@openjdk.org label Jul 14, 2021
@tstuefe tstuefe force-pushed the JDK-8270308-Amalloc-aligns-size-but-not-return-value branch 5 times, most recently from 2ae2c2b to 6b398db on July 16, 2021 09:32
@tstuefe tstuefe force-pushed the JDK-8270308-Amalloc-aligns-size-but-not-return-value branch from 6b398db to 14e7c08 on July 16, 2021 09:56
@tstuefe tstuefe marked this pull request as ready for review July 16, 2021 11:07
@openjdk openjdk bot added the rfr Pull request is ready for review label Jul 16, 2021
@mlbridge

mlbridge bot commented Jul 16, 2021

Webrevs

Contributor

@coleenp coleenp left a comment


I think you made this too general and too complicated and I don't see the benefit.
Thank you for writing the gtests though. It was an area where they were missing.

// _hwm (in arena)
//
// Start of Chunk (this), bottom and top have to be aligned to max arena alignment.
// _hwm does not have to be aligned to anything - it gets aligned on demand
Contributor


I would rather _hwm always be aligned to at least the machine word size. It's way simpler to think about. If all allocations are implicitly scaled up to, or asserted to be, at least wordSize, then _hwm will be too. This generality doesn't seem beneficial.

//
// Allocation from an arena is possible in alignments [1..Chunk::max_arena_alignment].
// The arena will automatically align on each allocation if needed, adding padding if
// necessary. In debug builds, those paddings will be zapped with a pattern ('GGGG..').
Contributor


Okay, so here you say that an allocation from an arena is aligned on ? alignment and the result is padded if the request is unaligned. So _hwm is always aligned on something. That something really should be at least wordSize. Maybe that's the case, but there are a lot of words here that imply otherwise, so I'm confused.

Member Author


No, _hwm is not aligned. It points to the address beyond the last allocation (like it did before).

_hwm is X + y, with X being the last allocation start address, aligned as the last caller wanted, and y being the length of the allocation, which does not have to be aligned to anything.

If the last allocation was "allocate 5 bytes at a 64byte boundary" (e.g. for a 5-char string in the resource area), the resulting _hwm would be X+5. It is unaligned. Any follow-up allocation requiring an alignment > 1 needs padding, and at that moment the pad is added before the next allocation.

// Allocate n bytes with 32-bit alignment
void* Amalloc32(size_t n, AllocFailType failmode = AllocFailStrategy::EXIT_OOM) {
  return Amalloc_aligned(n, sizeof(uint32_t), failmode);
}
Contributor


Are these three public for the gtests? Can you make them private and the gtest class a friend instead? Too many Amalloc choices!


TEST_VM(Arena, alloc_alignment_32) {
  Arena ar(mtTest);
  void* p1 = ar.Amalloc(1); // just one byte, leaves _hwm unaligned at offset 1
Contributor


No, I don't want this. Amalloc should still round the allocation size up to 64 bits, then return a pointer aligned to 64 bits.
The test that I was going to write is:
p1 = AmallocWords(4)
p2 = Amalloc(some size that gets aligned to 64 bits)
verify p2 is 64 bit aligned.

I don't want _hwm to ever be unaligned.
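
Spelled out as a gtest, the test described above might look roughly like this (a sketch only; the AmallocWords(4) from the pseudocode is written as AmallocWords(BytesPerWord) so the request is also word-aligned on 64-bit):

#include "precompiled.hpp"
#include "memory/arena.hpp"
#include "unittest.hpp"

// Mix a word-sized AmallocWords request with a regular Amalloc and verify
// that the second allocation still starts 64-bit aligned.
TEST_VM(Arena, amalloc_after_amallocwords_is_64bit_aligned) {
  Arena ar(mtTest);
  void* p1 = ar.AmallocWords(BytesPerWord);    // word-aligned request
  void* p2 = ar.Amalloc(10);                   // size gets rounded up internally
  ASSERT_NE(p1, (void*)nullptr);
  ASSERT_EQ((uintptr_t)p2 % 8, (uintptr_t)0);  // start address 64-bit aligned
}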

@tstuefe
Member Author

tstuefe commented Jul 19, 2021

I think you made this too general and too complicated and I don't see the benefit.
Thank you for writing the gtests though. It was an area where they were missing.

Hi Coleen,

thank you for looking at this.

The benefit of having the caller choose the alignment would be that callers could allocate with tighter alignments than 64-bit, wasting less space. Currently, the vast majority of allocations come via NEW_RESOURCE_ARRAY in allocation.hpp, which just allocates with 64-bit alignment. For many allocations this alignment is unnecessary; strings, for example, need no alignment at all. If you repeatedly put strings into a ResourceArea, you may waste a lot of memory due to this alignment (three consecutive 5-byte strings, for instance, occupy 24 bytes instead of 15). And we put a lot of strings into RAs. I should probably measure how much space we really waste this way.

(Side note: because Arenas use the C-heap, and because glibc caches way too aggressively, the JVM never really recovers from spikes in arena usage. See also #4510. We even have a proprietary patch at SAP where we switched to mmap() for Compiler arenas for that very reason. So reducing the Arena footprint is actually worthwhile.)

ATM we allow two alignments, pointer size and 64-bit. But you cannot actually mix them, since you may end up with unaligned allocations - the original bug. The fix for that is to correctly align at allocation time. I thought allowing just any alignment would be even simpler, and it might make it possible later on to have something like, say, a NEW_RESOURCE_ARRAY_1 to allocate unaligned space for strings, which need no alignment at all.


However, if you don't like this, here is a different proposal. We could make the alignment a property of an arena: every arena has a fixed allocation alignment, and that is what you get when allocating. Things like HandleAreas would have pointer-sized alignment, since allocating with a different alignment makes no sense there and introduces errors (alignment gaps would be misinterpreted as oops). Thread ResourceAreas could be 64-bit aligned. That way we could even reduce the number of Amalloc.. functions to just one: Amalloc(), which would allocate with whatever default alignment the arena has.

That way we would not benefit from tighter packing, but the code would be simpler, safer, and more obviously correct. Since you cannot mix different alignments, _hwm would never be unaligned.
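
A rough standalone sketch of that alternative, just to make the shape concrete (class and member names are made up; a real arena would carve memory out of chunks rather than a single malloc'ed block):

#include <cstddef>
#include <cstdlib>

// The alignment is a fixed property of the arena, chosen at construction,
// and there is only one Amalloc(), which always uses it. Because the
// alignment never varies, the top offset stays aligned at all times.
class FixedAlignmentArena {
  const size_t _alignment;   // must be a power of two, e.g. sizeof(void*)
  char*  _chunk;
  size_t _capacity;
  size_t _top;               // always a multiple of _alignment
public:
  FixedAlignmentArena(size_t alignment, size_t capacity)
    : _alignment(alignment), _chunk((char*)malloc(capacity)),
      _capacity(capacity), _top(0) {}
  ~FixedAlignmentArena() { free(_chunk); }

  void* Amalloc(size_t n) {
    // Round the size up so the top stays aligned; no per-allocation
    // re-alignment is ever needed.
    size_t bytes = (n + _alignment - 1) & ~(_alignment - 1);
    if (_top + bytes > _capacity) return nullptr;  // real code would grow
    void* p = _chunk + _top;
    _top += bytes;
    return p;
  }
};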

Cheers, Thomas

@tstuefe
Member Author

tstuefe commented Jul 20, 2021

I decided to withdraw this patch and to provide a minimal version in a separate PR.

@tstuefe tstuefe closed this Jul 20, 2021
@tstuefe tstuefe deleted the JDK-8270308-Amalloc-aligns-size-but-not-return-value branch August 23, 2021 12:32