8349146: [REDO] Implement a better allocator for downcalls #24829
Conversation
👋 Welcome back pminborg! A progress list of the required criteria for merging this PR into the target branch will be added to the body of your pull request.

/contributor add @mernst-github

@minborg This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be: You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 51 new commits pushed to the target branch. As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message, use the /integrate command in a new comment.

Here are the current benchmark results:

@minborg Syntax:
User names can only be used for users in the census associated with this repository. For other contributors you need to supply the full name and email address.
Can we just cleanly revert commit 7764742?

The idea was to use records for trusted components and less boilerplate. The idea with the interface was to hide the record accessors a bit.
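For illustration only (hypothetical names, not code from this patch), the pattern of hiding a record's generated accessors behind an interface looks roughly like this:

```java
// Callers program against the interface, which exposes only the operations they need...
interface Allocator {
    long allocate(long byteSize);
}

// ...while the trusted implementation is a compact record. Its generated component
// accessors (base(), capacity()) are still public on the record, but code that only
// holds the Allocator type never sees them.
record SlabAllocator(long base, long capacity) implements Allocator {
    @Override
    public long allocate(long byteSize) {
        return base; // placeholder logic, just for the sketch
    }
}

class Demo {
    public static void main(String[] args) {
        Allocator a = new SlabAllocator(0L, 4096L);
        System.out.println(a.allocate(16)); // only the interface method is visible here
    }
}
```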
Looks to be in parity with the original patch.
Webrevs
We cannot just re-use the original test due to https://bugs.openjdk.org/browse/JDK-8350455. See my comment here: #23142 (comment)

I ran the test 250 times on my Mac machine (macOS 15.4.1) with no problems, but maybe we should problem-list or exclude macOS testing?

Yes, I think we should. The issue is quite intermittent, and last time it only started showing up in CI as well. I think we should move the
```java
@Benchmark
public void byValue() throws Throwable {
    // point = unit();
    MemorySegment unused = (MemorySegment) MH_UNIT_BY_VALUE.invokeExact(
```
This benchmark is a bit misleading, because the allocator object will add some noise in the mix. I suggest having some allocator object ready in a field and just passing that (avoiding the lambda). That would also be more similar to idiomatic FFM code (which assumes some allocator is already available at the call site).
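As an illustration of that shape (this is not the PR's benchmark: libc's div(3), which returns a struct by value, stands in for the unit() downcall, and all names and layouts here are assumptions):

```java
import java.lang.foreign.*;
import java.lang.invoke.MethodHandle;
import org.openjdk.jmh.annotations.*;

@State(Scope.Thread)
public class ByValueAllocatorInField {

    // div(3) returns a struct by value, so its downcall handle takes a
    // SegmentAllocator for the returned segment -- a stand-in for unit().
    static final StructLayout DIV_T = MemoryLayout.structLayout(
            ValueLayout.JAVA_INT.withName("quot"),
            ValueLayout.JAVA_INT.withName("rem"));

    static final MethodHandle MH_DIV = Linker.nativeLinker().downcallHandle(
            Linker.nativeLinker().defaultLookup().find("div").orElseThrow(),
            FunctionDescriptor.of(DIV_T, ValueLayout.JAVA_INT, ValueLayout.JAVA_INT));

    // Allocator prepared once in a field: the measured call then only pays for
    // the downcall, not for creating an arena/allocator, and no lambda capture
    // is involved. prefixAllocator reuses the same scratch segment every call.
    private final MemorySegment scratch = Arena.ofAuto().allocate(DIV_T);
    private final SegmentAllocator allocator = SegmentAllocator.prefixAllocator(scratch);

    @Benchmark
    public MemorySegment byValue() throws Throwable {
        return (MemorySegment) MH_DIV.invokeExact(allocator, 17, 5);
    }
}
```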
Updated benchmarks:
```java
private record PerThread(ReentrantLock lock, Arena arena, SlicingAllocator stack) {

    @ForceInline
    public Arena pushFrame(long size, long byteAlignment) {
        boolean needsLock = Thread.currentThread().isVirtual() && !lock.isHeldByCurrentThread();
        if (needsLock && !lock.tryLock()) {
            // Rare: another virtual thread on the same carrier competed for acquisition.
            return Arena.ofConfined();
        }
        if (!stack.canAllocate(size, byteAlignment)) {
            if (needsLock) lock.unlock();
            return Arena.ofConfined();
        }
        return new Frame(needsLock, size, byteAlignment);
    }

    static PerThread of(long byteSize, long byteAlignment) {
        final Arena arena = Arena.ofAuto();
        return new PerThread(new ReentrantLock(),
                arena,
                new SlicingAllocator(arena.allocate(byteSize, byteAlignment)));
    }

    private final class Frame implements Arena {

        private final boolean locked;
        private final long parentOffset;
        private final long topOfStack;
        private final Arena confinedArena = Arena.ofConfined();
        private final SegmentAllocator frame;

        @SuppressWarnings("restricted")
        @ForceInline
        public Frame(boolean locked, long byteSize, long byteAlignment) {
            this.locked = locked;
            parentOffset = stack.currentOffset();
            MemorySegment frameSegment = stack.allocate(byteSize, byteAlignment);
            topOfStack = stack.currentOffset();
            // The cleanup action will keep the original automatic `arena` (from which
            // the reusable segment is first allocated) alive even if this Frame
            // becomes unreachable but there are reachable segments still alive.
            frame = new SlicingAllocator(frameSegment.reinterpret(confinedArena, new CleanupAction(arena)));
        }

        record CleanupAction(Arena arena) implements Consumer<MemorySegment> {
            @Override
            public void accept(MemorySegment memorySegment) {
                Reference.reachabilityFence(arena);
            }
        }

        @ForceInline
        private void assertOrder() {
            if (topOfStack != stack.currentOffset())
                throw new IllegalStateException("Out of order access: frame not top-of-stack");
        }

        @ForceInline
        @Override
        @SuppressWarnings("restricted")
        public MemorySegment allocate(long byteSize, long byteAlignment) {
            // Make sure we are on the right thread and not closed
            MemorySessionImpl.toMemorySession(confinedArena).checkValidState();
            return frame.allocate(byteSize, byteAlignment);
        }

        @ForceInline
        @Override
        public MemorySegment.Scope scope() {
            return confinedArena.scope();
        }

        @ForceInline
        @Override
        public void close() {
            assertOrder();
            // the Arena::close method is called "early" as it checks thread
            // confinement and crucially before any mutation of the internal
            // state takes place.
            confinedArena.close();
            stack.resetTo(parentOffset);
            if (locked) {
                lock.unlock();
            }
        }
    }
}
```
Suggested change:

```java
private record PerThread(ReentrantLock lock, Arena arena, SlicingAllocator stack) {

    @ForceInline
    @SuppressWarnings("restricted")
    public Arena pushFrame(long size, long byteAlignment) {
        boolean needsLock = Thread.currentThread().isVirtual() && !lock.isHeldByCurrentThread();
        if (needsLock && !lock.tryLock()) {
            // Rare: another virtual thread on the same carrier competed for acquisition.
            return Arena.ofConfined();
        }
        if (!stack.canAllocate(size, byteAlignment)) {
            if (needsLock) lock.unlock();
            return Arena.ofConfined();
        }
        Arena confinedArena = Arena.ofConfined();
        long parentOffset = stack.currentOffset();
        MemorySegment frameSegment = stack.allocate(size, byteAlignment);
        long topOfStack = stack.currentOffset();
        // The cleanup action will keep the original automatic `arena` (from which
        // the reusable segment is first allocated) alive even if this Frame
        // becomes unreachable but there are reachable segments still alive.
        SegmentAllocator frame = new SlicingAllocator(frameSegment.reinterpret(confinedArena, new Frame.CleanupAction(arena)));
        return new Frame(this, needsLock, parentOffset, topOfStack, confinedArena, frame);
    }

    static PerThread of(long byteSize, long byteAlignment) {
        final Arena arena = Arena.ofAuto();
        return new PerThread(new ReentrantLock(),
                arena,
                new SlicingAllocator(arena.allocate(byteSize, byteAlignment)));
    }
}

private record Frame(PerThread thread, boolean locked, long parentOffset, long topOfStack,
                     Arena confinedArena, SegmentAllocator frame) implements Arena {

    record CleanupAction(Arena arena) implements Consumer<MemorySegment> {
        @Override
        public void accept(MemorySegment memorySegment) {
            Reference.reachabilityFence(arena);
        }
    }

    @ForceInline
    private void assertOrder() {
        if (topOfStack != thread.stack.currentOffset())
            throw new IllegalStateException("Out of order access: frame not top-of-stack");
    }

    @ForceInline
    @Override
    @SuppressWarnings("restricted")
    public MemorySegment allocate(long byteSize, long byteAlignment) {
        // Make sure we are on the right thread and not closed
        MemorySessionImpl.toMemorySession(confinedArena).checkValidState();
        return frame.allocate(byteSize, byteAlignment);
    }

    @ForceInline
    @Override
    public MemorySegment.Scope scope() {
        return confinedArena.scope();
    }

    @ForceInline
    @Override
    public void close() {
        assertOrder();
        // the Arena::close method is called "early" as it checks thread
        // confinement and crucially before any mutation of the internal
        // state takes place.
        confinedArena.close();
        thread.stack.resetTo(parentOffset);
        if (locked) {
            thread.lock.unlock();
        }
    }
}
```
I find it a bit strange to nest a class inside a record. What do you think if I change it to two records?
```java
private record PerThread(ReentrantLock lock,
                         Arena arena,
                         SlicingAllocator stack,
                         CleanupAction cleanupAction) {

    @ForceInline
    public Arena pushFrame(long size, long byteAlignment) {
        boolean needsLock = Thread.currentThread().isVirtual() && !lock.isHeldByCurrentThread();
        if (needsLock && !lock.tryLock()) {
            // Rare: another virtual thread on the same carrier competed for acquisition.
            return Arena.ofConfined();
        }
        if (!stack.canAllocate(size, byteAlignment)) {
            if (needsLock) lock.unlock();
            return Arena.ofConfined();
        }
        return new Frame(needsLock, size, byteAlignment);
    }

    static PerThread of(long byteSize, long byteAlignment) {
        final Arena arena = Arena.ofAuto();
        return new PerThread(new ReentrantLock(),
                arena,
                new SlicingAllocator(arena.allocate(byteSize, byteAlignment)),
                new CleanupAction(arena));
    }

    private record CleanupAction(Arena arena) implements Consumer<MemorySegment> {
        @Override
        public void accept(MemorySegment memorySegment) {
            Reference.reachabilityFence(arena);
        }
    }

    private final class Frame implements Arena {

        private final boolean locked;
        private final long parentOffset;
        private final long topOfStack;
        private final Arena confinedArena;
        private final SegmentAllocator frame;

        @SuppressWarnings("restricted")
        @ForceInline
        public Frame(boolean locked, long byteSize, long byteAlignment) {
            this.locked = locked;
            this.parentOffset = stack.currentOffset();
            final MemorySegment frameSegment = stack.allocate(byteSize, byteAlignment);
            this.topOfStack = stack.currentOffset();
            this.confinedArena = Arena.ofConfined();
            // The cleanup action will keep the original automatic `arena` (from which
            // the reusable segment is first allocated) alive even if this Frame
            // becomes unreachable but there are reachable segments still alive.
            this.frame = new SlicingAllocator(frameSegment.reinterpret(confinedArena, cleanupAction));
        }

        @ForceInline
        private void assertOrder() {
            if (topOfStack != stack.currentOffset())
                throw new IllegalStateException("Out of order access: frame not top-of-stack");
        }

        @ForceInline
        @Override
        @SuppressWarnings("restricted")
        public MemorySegment allocate(long byteSize, long byteAlignment) {
            // Make sure we are on the right thread and not closed
            MemorySessionImpl.toMemorySession(confinedArena).checkValidState();
            return frame.allocate(byteSize, byteAlignment);
        }

        @ForceInline
        @Override
        public MemorySegment.Scope scope() {
            return confinedArena.scope();
        }

        @ForceInline
        @Override
        public void close() {
            assertOrder();
            // the Arena::close method is called "early" as it checks thread
            // confinement and crucially before any mutation of the internal
            // state takes place.
            confinedArena.close();
            stack.resetTo(parentOffset);
            if (locked) {
                lock.unlock();
            }
        }
    }
}
```
Suggested change:

```java
private record PerThread(ReentrantLock lock,
                         Arena arena,
                         SlicingAllocator stack,
                         CleanupAction cleanupAction) {

    @SuppressWarnings("restricted")
    @ForceInline
    public Arena pushFrame(long size, long byteAlignment) {
        boolean needsLock = Thread.currentThread().isVirtual() && !lock.isHeldByCurrentThread();
        if (needsLock && !lock.tryLock()) {
            // Rare: another virtual thread on the same carrier competed for acquisition.
            return Arena.ofConfined();
        }
        if (!stack.canAllocate(size, byteAlignment)) {
            if (needsLock) lock.unlock();
            return Arena.ofConfined();
        }
        long parentOffset = stack.currentOffset();
        final MemorySegment frameSegment = stack.allocate(size, byteAlignment);
        long topOfStack = stack.currentOffset();
        Arena confinedArena = Arena.ofConfined();
        // The cleanup action will keep the original automatic `arena` (from which
        // the reusable segment is first allocated) alive even if this Frame
        // becomes unreachable but there are reachable segments still alive.
        return new Frame(this, needsLock, parentOffset, topOfStack, confinedArena,
                new SlicingAllocator(frameSegment.reinterpret(confinedArena, cleanupAction)));
    }

    static PerThread of(long byteSize, long byteAlignment) {
        final Arena arena = Arena.ofAuto();
        return new PerThread(new ReentrantLock(),
                arena,
                new SlicingAllocator(arena.allocate(byteSize, byteAlignment)),
                new CleanupAction(arena));
    }

    private record CleanupAction(Arena arena) implements Consumer<MemorySegment> {
        @Override
        public void accept(MemorySegment memorySegment) {
            Reference.reachabilityFence(arena);
        }
    }
}

private record Frame(PerThread thread, boolean locked, long parentOffset, long topOfStack,
                     Arena confinedArena, SegmentAllocator frame) implements Arena {

    @ForceInline
    private void assertOrder() {
        if (topOfStack != thread.stack.currentOffset())
            throw new IllegalStateException("Out of order access: frame not top-of-stack");
    }

    @ForceInline
    @Override
    @SuppressWarnings("restricted")
    public MemorySegment allocate(long byteSize, long byteAlignment) {
        // Make sure we are on the right thread and not closed
        MemorySessionImpl.toMemorySession(confinedArena).checkValidState();
        return frame.allocate(byteSize, byteAlignment);
    }

    @ForceInline
    @Override
    public MemorySegment.Scope scope() {
        return confinedArena.scope();
    }

    @ForceInline
    @Override
    public void close() {
        assertOrder();
        // the Arena::close method is called "early" as it checks thread
        // confinement and crucially before any mutation of the internal
        // state takes place.
        confinedArena.close();
        thread.stack.resetTo(parentOffset);
        if (locked) {
            thread.lock.unlock();
        }
    }
}
```
As with the suggestion above, the code has changed since, so I updated and remade the suggestion, again splitting it into two records. What do you think?
```java
return new PerThread(new ReentrantLock(),
        arena,
        new SlicingAllocator(arena.allocate(byteSize, byteAlignment)),
        new CleanupAction(arena));
```
any reason why you didn't use a lambda here?
Also, not a big fan of records here -- it seems that many implementation details such as the cleanup action, lock, and slicing allocator are "leaked out" to the caller, which is now responsible for setting things up correctly. I think a PerThread class with a constructor taking arena, size, and alignment would make the code less coupled and more readable.
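A rough sketch of that shape, reusing the names from the snippets above (an assumption about how it could look, not code from the patch):

```java
// Sketch: PerThread builds its own lock, backing arena, slicing allocator and
// cleanup action internally, so callers only supply a size and alignment.
private static final class PerThread {

    private final ReentrantLock lock = new ReentrantLock();
    private final Arena arena = Arena.ofAuto();
    private final SlicingAllocator stack;
    private final CleanupAction cleanupAction;

    PerThread(long byteSize, long byteAlignment) {
        this.stack = new SlicingAllocator(arena.allocate(byteSize, byteAlignment));
        this.cleanupAction = new CleanupAction(arena);
    }

    // pushFrame(...), CleanupAction and the Frame implementation would remain
    // essentially as in the snippets above.
}
```

Clients would then construct it directly with `new PerThread(byteSize, byteAlignment)`, which is what the remark below about dropping the of factory refers to.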
And, if you do that, you then don't need the of factory -- clients can just use new
> any reason why you didn't use a lambda here?

I also think that CleanupAction should be changed to a lambda.
Using an anonymous class for the cleanup action had very adverse effects on performance. I didn't want to use a lambda for startup performance reasons.
If using a lambda affects performance, how about using an anonymous class?

```java
return new PerThread(new ReentrantLock(),
        arena,
        new SlicingAllocator(arena.allocate(byteSize, byteAlignment)),
        new Consumer<MemorySegment>() {
            @Override
            public void accept(MemorySegment memorySegment) {
                Reference.reachabilityFence(arena);
            }
        });
```
Anonymous classes also capture outer variables and break the cleaner/GC. The only safe constructs are local enums or records, which never capture outer variables (anonymous classes cannot be enums or records).
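For illustration (not code from this PR), the accidental-capture hazard being described, shown with java.lang.ref.Cleaner and hypothetical names:

```java
import java.lang.ref.Cleaner;

class Resource {
    private static final Cleaner CLEANER = Cleaner.create();
    private long handle = 42; // pretend native handle

    Resource() {
        // BROKEN: the anonymous class (like a lambda) captures the enclosing `this`
        // via the `handle` access, so the Resource stays strongly reachable from its
        // own cleaning action and the cleaner can never run.
        CLEANER.register(this, new Runnable() {
            @Override
            public void run() {
                System.out.println("freeing " + handle); // implicit this.handle
            }
        });
    }
}

class SaferResource {
    private static final Cleaner CLEANER = Cleaner.create();

    // A nested record can only reference what is passed in explicitly; it cannot
    // implicitly capture the enclosing instance or enclosing locals.
    record CleanupAction(long handle) implements Runnable {
        @Override
        public void run() {
            System.out.println("freeing " + handle);
        }
    }

    SaferResource(long handle) {
        CLEANER.register(this, new CleanupAction(handle));
    }
}
```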
What was the result before this PR?
Looks good - I left some stylistic comments on the use of records
Latest version looks good to me as well.
The performance before this PR can be seen in the "confined" benchmarks above. In those benchmarks, a regular confined arena is allocated for each call.

/integrate
I will integrate this PR now. @wenshao, can you summarize your proposed changes below? It was a bit unclear to me what you meant. I can create a separate PR with those changes later.
Going to push as commit 9f9e73d.
Your commit was automatically rebased without conflicts.
My suggestion is to use two records (both PerThread and Frame) to replace the original inner class Frame. This would also eliminate one use of @ForceInline. This might just be my personal coding style preference – feel free to adopt it if you like. Just a suggestion!
Yes - but I was referring to the performance of the native call -- not that of the allocation per se. E.g. it would be useful to also see how the by-value call has improved before/after this PR, in addition to looking at the difference between byValue/byRef in this PR.
This PR is based on the work of @mernst-github and aims to implement an internal thread-local "stack" allocator, which works like a dynamically sized arena but can be reset so that the allocated size drops back to a previous level. The underlying memory can stay around between calls, which can improve performance.
Re-allocated segments are not zeroed between allocations.
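For illustration, a sketch of the intended usage pattern, pieced together from the snippets in this thread (BufferStack is an internal JDK class, so the factory name, pushFrame signature, and sizes here are assumptions):

```java
// One reusable per-thread buffer of 256 bytes, 8-byte aligned (assumed API shape).
BufferStack stack = BufferStack.of(256, 8);

try (Arena frame = stack.pushFrame(64, 8)) {   // reserve a frame on the per-thread stack
    MemorySegment arg = frame.allocate(ValueLayout.JAVA_INT);
    arg.set(ValueLayout.JAVA_INT, 0, 42);
    // ... pass `arg` to a downcall ...
}   // close() pops the frame: the offset is reset and the memory is reused by the
    // next frame; as noted above, it is NOT zeroed between allocations
```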
Reviewing
Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/24829/head:pull/24829
$ git checkout pull/24829

Update a local copy of the PR:
$ git checkout pull/24829
$ git pull https://git.openjdk.org/jdk.git pull/24829/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 24829

View PR using the GUI difftool:
$ git pr show -t 24829

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/24829.diff

Using Webrev

Link to Webrev Comment