Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JDK-8282405: Make thread resource areas signal safe #7624

Closed

Conversation

tstuefe
Copy link
Member

@tstuefe tstuefe commented Feb 25, 2022

In the context of signal handlers, we may allocate RA memory. That is not ideal but may happen. One example is error reporting - even if we are careful, some code down the stack may use RA. Another example is code running in the context of AsyncGetCallTrace. I'm sure there may be more examples.

The problem is that the signal may (rarely) leave the current thread's RA in an inconsistent state, especially if it got interrupted in the middle of a chunk turnover. Subsequent allocations from it inside the signal handler then would malfunction.

A simple solution would be double buffering. Let each thread have a second resource area, to be used only in signal handling. At the entrance of the hotspot signal handler (which everyone goes through, even in chain scenarios like with AsyncGetCallTrace) we would switch over to the secondary resource area, and switch back when leaving the hotspot signal handler.

Note that I proposed this on hs-runtime-dev [1] but I am actually not sure if the mailing lists work, since I did not see that mail delivered to subscribers. Therefore I went ahead and implemented a simple prototype.

The prototype keeps matters simple:

  • we just use two resource areas: the normal one and an alternate one for signal handling. So we don't handle recursive calls to signal handlers, see comment in signals_posix.cpp.
  • we preallocate both resource area at thread creation time. For the pros and cons of pre-allocating them vs creating them on demand, and possible further improvements, pls see [1].

Tests:

  • SAP nightlies
  • GHAs
  • I tested this manually by corrupting the resource area of a thread, then faulting, and inside the signal handler, I was able to use the secondary resource area as expected.
  • Automated tests are somewhat more difficult, akin to the existing SafeFetchInErrorHandlerTest. I am not sure if its worth the complexity.

[1] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2022-February/054126.html


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/7624/head:pull/7624
$ git checkout pull/7624

Update a local copy of the PR:
$ git checkout pull/7624
$ git pull https://git.openjdk.java.net/jdk pull/7624/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 7624

View PR using the GUI difftool:
$ git pr show -t 7624

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/7624.diff

@bridgekeeper
Copy link

bridgekeeper bot commented Feb 25, 2022

👋 Welcome back stuefe! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Feb 25, 2022

@tstuefe The following label will be automatically applied to this pull request:

  • hotspot-runtime

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-runtime hotspot-runtime-dev@openjdk.org label Feb 25, 2022
@tstuefe tstuefe force-pushed the JDK-8282405-make-ra-signal-safe branch from ffa5d85 to 19b550a Compare February 25, 2022 18:00
@tstuefe tstuefe marked this pull request as ready for review February 25, 2022 18:04
@openjdk openjdk bot added the rfr Pull request is ready for review label Feb 25, 2022
@mlbridge
Copy link

mlbridge bot commented Feb 25, 2022

Webrevs

@tstuefe
Copy link
Member Author

tstuefe commented Mar 2, 2022

Gentle ping. All tests are green in GHA and at SAP. This patch increases stability for the async profiler and in hs_err reports.

Copy link
Member

@dholmes-ora dholmes-ora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Thomas,

Seems a reasonable idea in principle, but I'm unsure of some details.

At the entrance of the hotspot signal handler (which everyone goes through, even in chain scenarios like with AsyncGetCallTrace)

That is not my understanding. For AGCT the signal handler is installed directly for, e.g. SIGPROF, and we don't intercept or chain it in the VM AFAIK. Also UserHandler is separate from JVM_HANDLE_XXX_SIGNAL.

Further comments below.

Thanks,
David

Comment on lines 516 to 517
// When entering signal handling with a valid current Thread, we switch to the
// Thread's secondary resource area. That lets us safely use resource area
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style nit: no need to indent paragraph continuation lines.

// memory during signal handling even if the signal left the primary area in
// an inconsistent state. We then swap back to the primary area upon leaving
// the signal handler.
// Note that signal handlers can nest, and in theory we should do this on each
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what happens if you get a secondary signal whilst in the midst of trying to switch to the secondary resource area? Seems we need to ensure signals be deferred whilst this happens so that it is atomic with respect to subsequent signals.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is necessary.

We already block asynchronous signals (see set_signal_handler()). We explicitly allow synchronous error signals though since blocking them would just mean the process terminates right away when receiving them while blocked.

Thinking this through more, I don't see actually a way how signals could destroy the secondary RA or interrupt the switchover:

  • we don't allow asynchronous signals for the time of the signal handling
  • we allow error signals but that would be a programming error that should be fixed. It is preventable. If I fault when accessing the secondary resource area or trying a switch over to the secondary resource area, I should code more defensively.

This is all assuming that for ACGT the same rules hold, and whoever uses ACGT sets up the signal handler with async signals blocked. But we could just block them ourselves inside ACGT if we think this is necessary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay that sounds reasonable - we can't get nested non-error signals.

@tstuefe
Copy link
Member Author

tstuefe commented Mar 3, 2022

Hi David,

Hi Thomas,

Seems a reasonable idea in principle, but I'm unsure of some details.

Thank you for taking the time to look at this!

At the entrance of the hotspot signal handler (which everyone goes through, even in chain scenarios like with AsyncGetCallTrace)

That is not my understanding. For AGCT the signal handler is installed directly for, e.g. SIGPROF, and we don't intercept or chain it in the VM AFAIK. Also UserHandler is separate from JVM_HANDLE_XXX_SIGNAL.

Oh, it is? Thanks for the catch, seems my assumption was false. How embarrassing :) But should be easy enough to fix by guarding ACGT directly.

Further comments below.

Thanks, David

Thanks, Thomas

@dholmes-ora
Copy link
Member

The SR_handler would also need this.

And AGCT is not always called from a signal handler e.g. we have a test:

./test/hotspot/jtreg/serviceability/AsyncGetCallTrace/

which seems to use it synchronously.

Maybe only error reporting (known to use the RA) should actually do this? It isn't obvious to me that UserHandler, SR_handler, or even AGCT are affected by this potential problem.

@tstuefe
Copy link
Member Author

tstuefe commented Mar 3, 2022

Maybe only error reporting (known to use the RA) should actually do this? It isn't obvious to me that UserHandler, SR_handler, or even AGCT are affected by this potential problem.

AGCT was affected by this. That was the original motivation for the patch. The patch was done by us downstream a while ago because we had sporadic crashes with corrupted RAs on the PPC port of the async profiler. The original variant of this patch did patch up AGCT directly, not the signal handler, and the crashes disappeared. So I would do this at least for AGCT too, and for error handling, and preferably for the regular "big" signal handler too.

About UserHandler and SR_handler, I agree. Those seem simple enough. Although switching the resource area does not really cost much. We also unblock error signals at each signal handler entry, and this feels like the same precaution.

Alternatively, if we think a signal handler should not use RA, I thought that we could add checks for this assumption. Temporarily forbidding RA access. But since that would incur cost on every RA allocation (you'd have to check every time) I would only do this for debug. Do you think this makes sense? But I would do this in a separate RFE, if at all.

@dholmes-ora
Copy link
Member

Out of curiosity, where does AGCT perform RA allocation? That seems like a bug ... but then AGCT is really a fingers-crossed mechanism anyway so not much point trying to make it too bullet-proof.

A debug mechanism to disallow RA allocation might be interesting ... but might just show we have a lot more RA allocation than we thought. :) Up to you if you want to investigate that further.

@tstuefe
Copy link
Member Author

tstuefe commented Mar 3, 2022

Out of curiosity, where does AGCT perform RA allocation? That seems like a bug ... but then AGCT is really a fingers-crossed mechanism anyway so not much point trying to make it too bullet-proof.

I agree that this would be a bug. I try to find out what the call stacks were.

AGCT seems like a best effort mechanism, but async profiler got a lot more popular recently, so it would be good to make it safer. And the RA double buffering mechanism I propose is quite simple, and we need it anyway for error handling.

A debug mechanism to disallow RA allocation might be interesting ... but might just show we have a lot more RA allocation than we thought. :) Up to you if you want to investigate that further.

I will try, but no promises :) This feels like it could be a whack the mole game.

@tstuefe
Copy link
Member Author

tstuefe commented Mar 3, 2022

Changes:

  • made Thread::switch_to_(primary|secondary)_resource_area() private to Thread, now only usable via the new RAII object Thread::ResourceAreaSwitcher
  • Thread::ResourceAreaSwitcher is nestable: first invocation will switch to secondary RA, last rollback back to primary, subsequent invocations are noops.
  • Now I guard AGCT explicitly too
  • needed to slightly modify the way VMError::report_and_die is guarded since we now use an RAII object
  • slimmed down comments

@tstuefe
Copy link
Member Author

tstuefe commented Mar 3, 2022

Test errors unrelated:

  • windows, linux x64: infra problems
  • linux 32-bit: caused by JDK-8281472

Tests at SAP went through. Will rerun them tonight with the new patch version.

@dholmes-ora
Copy link
Member

Hi Thomas,

I have a remaining concern. It seemed to me that after we switch the RA we should always have a ResourceMark to ensure that the secondary RA is cleared by the time we switch back. Or else we should be able to check it is "empty" before switching back. But that lead to a second concern - which really relates to how RA allocation works in general. Consider this hypothetical scenario:

void foo() {
  ResourceMark rm;
  bar();   // allocates in RA for some data structure X
  baz();  //  Also allocates for X causing X to expand in RA
}

now suppose a signal hits before baz(), and so we switch the the secondary RA. The signal handler invokes code that also leads to use of X and causes X expand, but now in the secondary RA. When the signal handler returns X has "dangling pointers" into the secondary RA.

In short you can't know that RA allocation in the secondary area is really safe without knowing what code is going to do the allocation and how that interacts with existing ResourceMarks on the stack. If you add a ResourceMark when switching to the secondary RA then you may cause a "dangling pointer", and if you don't then you may cause a leak in the secondary RA.

@tstuefe
Copy link
Member Author

tstuefe commented Mar 4, 2022

Hi David,

Hi Thomas,

I have a remaining concern. It seemed to me that after we switch the RA we should always have a ResourceMark to ensure that the secondary RA is cleared by the time we switch back. Or else we should be able to check it is "empty" before switching back. But that lead to a second concern - which really relates to how RA allocation works in general. Consider this hypothetical scenario:

void foo() {
  ResourceMark rm;
  bar();   // allocates in RA for some data structure X
  baz();  //  Also allocates for X causing X to expand in RA
}

now suppose a signal hits before baz(), and so we switch the the secondary RA. The signal handler invokes code that also leads to use of X and causes X expand, but now in the secondary RA. When the signal handler returns X has "dangling pointers" into the secondary RA.

In short you can't know that RA allocation in the secondary area is really safe without knowing what code is going to do the allocation and how that interacts with existing ResourceMarks on the stack. If you add a ResourceMark when switching to the secondary RA then you may cause a "dangling pointer", and if you don't then you may cause a leak in the secondary RA.

I don't understand. The signal handler, if it calls foo(), executes foo(), from start to finish, then returns. With RA1 as primary, RA2 as secondary area I see it like this:

                         RA1          RM1              RA1           RM2

                         n            -                0             -

--Enter foo()--
ResourceMark()           n            ->RA1[n]         0             - 
bar() alloc X            n+X          ->RA1[n]         0             -
--Signal!--                       
--Enter foo()--
ResourceMark()           n+X          ->RA1[n]         0             ->RA2[0]
bar() alloc X            n+X          ->RA1[n]         0+X           ->RA2[0]
baZ() alloc X            n+X          ->RA1[n]         0+X+X         ->RA2[0]
~ResourceMark()          n+X          ->RA1[n]         0             -
--Signal handler returns--                       
baZ() alloc X            n+X+X        ->RA1[n]         0             -
~ResourceMark()          n            -                0             -

In other words, entering the handler causes us to switch stacks, very similar to what one could do with sigaltstack(3). We set aside the primary RA and all its marks and switch to the secondary RA. We don't switch back to the primary RA without completely unrolling the secondary RA and all its marks.

I do agree though that when switching to the secondary RA we need a Mark. Otherwise subsequent invocations of a signal handler would just accumulate memory in the secondary RA.

@tstuefe
Copy link
Member Author

tstuefe commented Mar 4, 2022

BTW, I was sure I had seen problems with AsyncGetCallTrace and ResourceMarks recently, so I looked and found: https://bugs.openjdk.java.net/browse/JDK-8265150

@dholmes-ora
Copy link
Member

The signal handler doesn't call foo(), it just happens to call some code that also uses the data structure X. If X expands while on the alt-RA and doesn't contract again, then X will have a chunk that still refers to the alt-RA when we have switched back to the primary-RA.

Ideally such things can never happen and the alt-RA is always unused by the time the signal handler returns, but without looking at the kind of RA usage that might occur from the signal handler, I can't say whether it will be clean this way or not.

JDK-8265150 is a great example as it shows that allocating from the RA in a signal handler via AGCT is fundamentally broken. Even with the alt-RA do we still not have the ThreadCritical problem?

I don't want the perfect to be the enemy of the "good enough" here, but do want to be sure the cost:benefit ratio is sufficiently small to make this worthwhile.

@tstuefe
Copy link
Member Author

tstuefe commented Mar 4, 2022

The signal handler doesn't call foo(), it just happens to call some code that also uses the data structure X. If X expands while on the alt-RA and doesn't contract again, then X will have a chunk that still refers to the alt-RA when we have switched back to the primary-RA.

Ideally such things can never happen and the alt-RA is always unused by the time the signal handler returns, but without looking at the kind of RA usage that might occur from the signal handler, I can't say whether it will be clean this way or not.

Okay, I think I understand. Things like the former implementation of stringStream, which used an internal buffer allocated from RA, which could grow on demand. We switched stringStream to C-heap allocation for the very reason that this is broken by design.

Pointers allocated from RA should not escape the boundary of the ResourceMark it was allocated in, because the ResourceMark could unwind the RA and deallocate the chunk the pointer points to. That also means that you should not hand a on-demand-growable structure up and down the stack unless you are very sure that you don't cross a RM boundary. That excludes using RA memory as back buffers in utility objects that are handed down the stack naturally, like outputStream.

My argument is that this is a broken pattern, and it was one before my patch too. Switching the underlying RA itself is really not different from crossing a RM boundary. Look at it like this, if your allocation escapes an RM, it does not matter whether the Chunk it lives in is part of the secondary RA or part of the primary RA, it will get cleaned out and your allocation becomes invalid either way.

JDK-8265150 is a great example as it shows that allocating from the RA in a signal handler via AGCT is fundamentally broken. Even with the alt-RA do we still not have the ThreadCritical problem?

Definitely. Its interesting that such a shaky concept became the cornerstone of a popular product like the async profiler.

I don't want the perfect to be the enemy of the "good enough" here, but do want to be sure the cost:benefit ratio is sufficiently small to make this worthwhile.

Hmm, I think it is. I know our sporadic crashes on PPC in the async profiler disappeared with the use of secondary RAs.

@dholmes-ora
Copy link
Member

Okay. I think RA's have become much maligned because they have been used incorrectly in conjunction with RM's. It should be okay to pass an RA allocated object around in a well-defined way, but we've become overzealous in the use of RM's at the immediate use-site of an RA object, making it impossible to actually establish well-defined scopes in which that can happen. But I digress. :)

Applying one more band-aid to AGCT doesn't seem too harmful here.

Copy link
Member

@dholmes-ora dholmes-ora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm really not getting the nested entry story here - what scenarios do you envisage and how are they safe? If you can get nested entry then the counter itself can cause breakage as the switch between areas needs to be essentially an atomic operation and the counter prevents that.

// an inconsistent state.

void Thread::switch_to_primary_resource_area() {
_secondary_resource_area_switch_count --;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style nit: no spaces with unary operators (here and elsewhere)

Copy link

@kimbarrett kimbarrett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like this perhaps narrows but does not fix the problem.

What if we're in the Chunk allocator when a signal is received, and the signal
handler allocates enough that it needs another chunk? The Chunk allocator uses
ThreadCritical to protect it's critical regions, which does not make it
re-entrant either.

And if the Chunk allocator finds the associated ChunkPool free-list to be
empty, then we need to go to malloc, which has the same problem of not being
async-signal safe.

I'm also not thrilled by the idea of devoting 1K (or more, if we discover 1K
isn't enough for all the allocations we might need) for each thread for this
purpose. That seems like quite a bit of memory being (mostly) wasted.

So basically I'm not convinced the proposed design is right yet.

One thing I was wondering (because I don't know the answer) is whether there
is any bound on the number of in-flight signals that might need to be handled
at the same time by different threads.

@tstuefe
Copy link
Member Author

tstuefe commented Mar 5, 2022

Hi Kim, thanks for looking at this. Answers below.

It seems like this perhaps narrows but does not fix the problem.

What if we're in the Chunk allocator when a signal is received, and the signal handler allocates enough that it needs another chunk? The Chunk allocator uses ThreadCritical to protect it's critical regions, which does not make it re-entrant either.

And if the Chunk allocator finds the associated ChunkPool free-list to be empty, then we need to go to malloc, which has the same problem of not being async-signal safe.

I'm also not thrilled by the idea of devoting 1K (or more, if we discover 1K isn't enough for all the allocations we might need) for each thread for this purpose. That seems like quite a bit of memory being (mostly) wasted.

So basically I'm not convinced the proposed design is right yet.

One thing I was wondering (because I don't know the answer) is whether there is any bound on the number of in-flight signals that might need to be handled at the same time by different threads.

There are some ideas I have to address your concerns:

  1. Let's arenas not allocate a chunk up front, but only on first use. I never liked that upfront allocation anyway, it did not make much sense to me. We pay for what we may never need. (I always suspected that the small initial chunk size was a workaround for the unconditional allocation costs and that if we delay allocation all chunks could be uni-sized).

By delaying the first chunk allocation, we prevent memory from being wasted if the secondary RA is not being used. All we would pay is the base cost for sizeof(ResourceArea) in Thread, 40 bytes. And that could be reduced too if we were to groom Arena a bit, but that's a different story.

  1. Don't use ChunkPool, for the secondary RA. Allocate the chunks directly via malloc.

That would mean we don't need to pull a central lock. We still use malloc() if someone uses RA inside signal handling - it is used on the first allocation, and if subsequent allocations expand the secondary RA.

Side note, I am curious if ChunkPool is even needed nowadays. I often find that the libc keeps returned memory close by. Re-acquisition via malloc is fast and often handled thread-local without synchronization. Especially compared to ChunkPool, where we 1) always pull a central lock and 2) free in a different thread from where the memory was malloced, which may not be ideal. So maybe removing ChunkPool would be okay in general.

  1. Maybe don't release the chunks in the secondary RA on stack unwinding. Keep them locally around in the secondary RA in a freelist.

That reduces follow-up mallocs. We now only call malloc for the first dive down the stack.

  1. Maybe don't allow expanding the secondary RA at all. Instead, make the first chunk - that we allocate on-demand only - large enough to handle what we think is reasonable for signal handling and during ACGT. Let's say 64K or so.

That reduces the number of dangerous malloc calls to exactly one: the initial large chunk. We keep that in the secondary RA and never release it.

The idea behind (4) is that we don't expect a ton of RA memory to be allocated in signal handling or in ACGT. We only expect small utility functions to be called, that don't need much space.

A variant of (4) could be to place the one big chunk we allow the secondary RA to have on the thread stack instead of using malloc(). Basically, we would use ResourceArea as an interface for normal stack memory allocated in the frame above us where we switched to the secondary RA. Unfortunately, this would reduce its usefulness for error reporting, which has to work in low-stack scenarios. But whoever calls ACGT at least could make sure it gets called on a large enough stack.

  1. If we use C-heap for the one big chunk, and are unhappy with this one remaining malloc: Pre-allocate the initial large chunk for the secondary RA. Since that would again increase the per-thread costs, just preallocate a bunch of chunks (2-4?) at VM start - whatever parallel use we expect - and keep them around in a global freelist. That list would have to be lock-free but only needs to implement take, not return, since we don't return the chunks.

The idea behind this is that we don't need this for every thread, we don't expect a ton of threads to use RA during signal handling. We may be able to just use this for a bunch of threads (the one (?) that does ACGT sampling, and maybe the one that crashes and writes a hs-err file)

I admit (5) is not really fleshed out yet in my head. But even with (1)..(4) we could have a solution that is a lot more signal safe than what we do today.

Cheers, Thomas

@tstuefe
Copy link
Member Author

tstuefe commented Mar 9, 2022

I put this back into draft for further refinement.

@kimbarrett , @dholmes-ora :

Atm I tend toward a solution as laid out in my comment #7624 (comment).

Condensed:

The secondary "safe" ResourceArea would be one that has a single chunk and cannot grow more chunks. That chunk would be either (a) allocated with raw ::malloc(), on demand only on first RA use, or (b) just use an array placed on the stack in the frame that entered ACGT.

If I go with (b), that solution may be restricted to AGCT only since with signal handling, we cannot be sure how much stack we have left.

If that sounds okay to you both, I go with this. I only want to spend time on this if we agree on the general direction.

..Thomas

@tstuefe tstuefe marked this pull request as draft March 9, 2022 09:12
@openjdk openjdk bot removed the rfr Pull request is ready for review label Mar 9, 2022
@mlbridge
Copy link

mlbridge bot commented Mar 11, 2022

Mailing list message from David Holmes on hotspot-runtime-dev:

Hi Thomas,

On 9/03/2022 7:15 pm, Thomas Stuefe wrote:

I put this back into draft for further refinement.

@kimbarrett , @dholmes-ora :

Atm I tend toward a solution as laid out in my comment https://github.com//pull/7624#issuecomment-1059713107.

Condensed:

The secondary "safe" ResourceArea would be one that has a single chunk and cannot grow more chunks. That chunk would be either (a) allocated with raw ::malloc(), on demand only on first RA use, or (b) just use an array placed on the stack in the frame that entered ACGT.

Sorry but how can you constrain the new RA to only a single chunk? What
if it needs to grow? IIUC we don't know exactly what RA allocation may
be attempted in the signal context via AGCT or other code.

And apologies but I'm about to disappear for a week on vacation.

Cheers,
David
-----

@tstuefe
Copy link
Member Author

tstuefe commented Mar 11, 2022

Hi Thomas,

On 9/03/2022 7:15 pm, Thomas Stuefe wrote:

I put this back into draft for further refinement.
@kimbarrett , @dholmes-ora :
Atm I tend toward a solution as laid out in my comment https://github.com/[/pull/7624](https://github.com/openjdk/jdk/pull/7624)#issuecomment-1059713107.
Condensed:
The secondary "safe" ResourceArea would be one that has a single chunk and cannot grow more chunks. That chunk would be either (a) allocated with raw ::malloc(), on demand only on first RA use, or (b) just use an array placed on the stack in the frame that entered ACGT.

Sorry but how can you constrain the new RA to only a single chunk? What if it needs to grow? IIUC we don't know exactly what RA allocation may be attempted in the signal context via AGCT or other code.

The idea is that even though we don't know exactly who or what uses RA, we can assume that it will not need a ton of memory. That holds def. true for error handling, and probably for ACGT too. The culprits are usually small utility functions, tightly guarded by ResourceMarks. So we should be able to safely get by with a single 64k or 128k chunk. And if we still see problems, those can be solved by enlarging the chunk or placing tighter resource marks.

It's a pragmatic way to defuse the problem. We'd go from "forbid RA under any circumstances" to covering ~95% of use cases.

And apologies but I'm about to disappear for a week on vacation.

No problem, have a good time!

Cheers, David -----

@tstuefe tstuefe closed this Mar 23, 2022
@tstuefe tstuefe deleted the JDK-8282405-make-ra-signal-safe branch February 15, 2023 06:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-runtime hotspot-runtime-dev@openjdk.org
3 participants