
8315559: Delay TempSymbol cleanup to avoid symbol table churn #16398

Closed
wants to merge 16 commits

Conversation

@olivergillespie (Contributor) commented Oct 27, 2023

Attempt to fix regressions in class-loading performance caused by fixing a symbol table leak in JDK-8313678.

See lengthy discussion in https://bugs.openjdk.org/browse/JDK-8315559 for more background. In short, the leak was providing an accidental cache for temporary symbols, allowing reuse.

This change keeps new temporary symbols alive in a queue for a short time, allowing them to be re-used by subsequent operations. For example, when attempting to load a class we may call JVM_FindLoadedClass for multiple classloaders back-to-back, and all of them will create a TempNewSymbol for the same string. At present, each call will leave a dead symbol in the table and create a new one. Dead symbols add cleanup and lookup overhead, and insertion is also costly. With this change, the symbol from the first invocation will live for some time after it is used, and subsequent callers can find the symbol alive in the table - avoiding the extra work.
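For illustration, the back-to-back lookup pattern described above looks roughly like this (a minimal sketch, not code from this PR; the function and its body are invented for the example):

#include "classfile/symbolTable.hpp"
#include "oops/symbolHandle.hpp"

// Each caller wraps the same UTF-8 name in a TempNewSymbol. Before this change,
// every call inserted the Symbol, used it, and released it immediately, leaving a
// dead entry behind for the next caller; with the delay queue the first caller's
// Symbol stays alive briefly, so subsequent lookups find it alive in the table.
static bool probe_loaders_for_class(const char* name, int name_len) {
  TempNewSymbol class_name = SymbolTable::new_symbol(name, name_len);
  // ... probe each class loader's dictionary with class_name ...
  return false;  // placeholder result for the sketch
}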

The queue is bounded, and when full new entries displace the oldest entry. This means symbols are held for the time it takes for 100 new temp symbols to be created. 100 is chosen arbitrarily - the tradeoff is memory usage versus 'cache' hit rate.

When concurrent symbol table cleanup runs, it also drains the queue.

In my testing, this brings Dacapo pmd performance back to where it was before the leak was fixed.

Thanks @shipilev, @coleenp and @MDBijman for helping with this fix.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8315559: Delay TempSymbol cleanup to avoid symbol table churn (Bug - P3)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/16398/head:pull/16398
$ git checkout pull/16398

Update a local copy of the PR:
$ git checkout pull/16398
$ git pull https://git.openjdk.org/jdk.git pull/16398/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 16398

View PR using the GUI difftool:
$ git pr show -t 16398

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/16398.diff

Webrev

Link to Webrev Comment

@bridgekeeper bot commented Oct 27, 2023

👋 Welcome back ogillespie! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk bot added the rfr ("Pull request is ready for review") label Oct 27, 2023
@openjdk bot commented Oct 27, 2023

@olivergillespie The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk bot added the hotspot (hotspot-dev@openjdk.org) label Oct 27, 2023
@mlbridge bot commented Oct 27, 2023

The queue destructor requires the queue to be empty, but this is
not easy to achieve in cases like ours with a class-level queue.
Not all uses need to have an empty queue at shutdown. This could
also be made optional at instantiation time if we want to keep
the assertions for other users (at the moment only g1 dirty card queue
set).
@olivergillespie changed the title from "8315559: Dacapo pmd regressions 5-20% on all platforms in 22-b11" to "8315559: Delay TempSymbol cleanup to avoid symbol table churn" Oct 30, 2023
TempNewSymbol now increments refcount again, messing with the
expectations. This is not a complete fix, I will have to read the
individual cases and make sure they are correct.

@coleenp (Contributor) left a comment

I think this looks really good. Test idea enclosed.

@@ -37,7 +37,8 @@ TEST_VM(SymbolTable, temp_new_symbol) {
Symbol* abc = SymbolTable::new_symbol("abc");
int abccount = abc->refcount();
TempNewSymbol ss = abc;
ASSERT_EQ(ss->refcount(), abccount) << "only one abc";
// TODO: properly account for Symbol cleanup delay queue

Contributor:

I wonder if you can programmatically change the queue length to zero and keep these counts.

Then add a test with some loop of n being the queue length, and create n symbols and check that the first has been reclaimed?
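
A rough sketch of that kind of test could look like the following (illustrative only: the constant name comes from a later revision of this PR, the symbol strings and assertions are invented, and the committed test may differ):

TEST_VM(SymbolTable, temp_symbol_delay_queue_eviction) {
  JavaThread* THREAD = JavaThread::current();
  // the thread should be in vm to use locks
  ThreadInVMfromNative tiv(THREAD);
  ResourceMark rm(THREAD);

  Symbol* first = SymbolTable::new_symbol("delay_queue_victim");
  first->increment_refcount();       // hold our own reference so 'first' stays valid
  int with_queue_ref;
  {
    TempNewSymbol handle = first;    // consumes new_symbol's reference, enqueues (+1)
    with_queue_ref = first->refcount();
  }
  // Push enough newer temp symbols through the queue to evict 'first'.
  for (uint i = 0; i < TempNewSymbol::CLEANUP_DELAY_MAX_ENTRIES; i++) {
    stringStream filler_name;
    filler_name.print("delay_queue_filler_%u", i);
    TempNewSymbol filler = SymbolTable::new_symbol(filler_name.as_string());
  }
  // Eviction released the queue's reference; only the one we took above remains.
  ASSERT_LT(first->refcount(), with_queue_ref) << "queue reference was released";
  first->decrement_refcount();
}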

Contributor Author:

Good idea. I have implemented this in the next commit, but I'm not sure what the idiomatic way to expose this value for testing is - I have just done it naively so far.

Fix indentation
Improve tests
Improve comment
Remove redundant null check
Improve naming
Pop when >, not >= max len

@coleenp (Contributor) left a comment

This looks good. Thank you for fixing this problem.

@openjdk bot commented Oct 31, 2023

@olivergillespie This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8315559: Delay TempSymbol cleanup to avoid symbol table churn

Reviewed-by: coleenp, kbarrett, shade

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 128 new commits pushed to the master branch:

  • ed5b8c3: 8225220: When the Tab Policy is checked,the scroll button direction displayed incorrectly.
  • f32ab8c: 8320924: Improve heap dump performance by optimizing archived object checks
  • 93b9235: 8321120: Shenandoah: Remove ShenandoahElasticTLAB flag
  • 9b8eaa2: 8319872: AArch64: [vectorapi] Implementation of unsigned (zero extended) casts
  • b9b8263: 8317307: test/jdk/com/sun/jndi/ldap/LdapPoolTimeoutTest.java fails with ConnectException: Connection timed out: no further information
  • 0d0a657: 5108458: JTable does not properly layout its content
  • 2b00ac0: 8308753: Class-File API transition to Preview
  • b9df827: 8309871: jdk/jfr/api/consumer/recordingstream/TestSetEndTime.java timed out
  • 9498469: 8318983: Fix comment typo in PKCS12Passwd.java
  • 4dcbd13: 8314905: jdk/jfr/tool/TestView.java fails with RuntimeException 'Invoked Concurrent' missing from stdout/stderr
  • ... and 118 more: https://git.openjdk.org/jdk/compare/0c9a61c18545c7bd48e54e6b4e523b9ad8d0507d...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@coleenp, @kimbarrett, @shipilev) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk bot added the ready ("Pull request is ready to be integrated") label Oct 31, 2023
assert(_head == nullptr, "precondition");
assert(_tail == nullptr, "precondition");
}
#endif


Why is this being removed? Without some good explanation, I'm disapproving this change.

Contributor Author:

These assertions require the queue to be empty when the destructor is called. My static queue is destroyed at shutdown, and it may not be empty. I see no explicit problem with a queue not being empty when destroyed; I can only assume the assertions were added as an extra reminder for the specific use-case of G1DirtyCardQueueSet, which is the only existing user of NonblockingQueue. In my opinion it's better for individual queue owners to decide if they care about this.

Alternatives: I can modify NonblockingQueue to accept an argument (e.g. check_empty_at_shutdown) which is true for g1 card set and false for my usage. Or I can avoid NonblockingQueue entirely as there are a few other review concerns arising from its limitations.


The asserts are there to catch memory leaks. It is the responsibility of the
owner of the NBQ to ensure it is empty before deleting it. (Probably that
should have been mentioned in a comment in the header.) This is a generic
requirement, and is not specific to G1DCQS (the sole previous client).

HotSpot code avoids variables with static storage duration for types with
non-trivial construction or destruction. (I thought this was discussed in the
style guide, but apparently not. It's been discussed multiple times in
various reviews.) That's what's happening here - there's no owning object for
the NBQ, and there's no explicit setup/teardown of it during VM initialization
and destruction. That's the reason for hitting the asserts.

So no, don't add a configuration argument for NBQ. I wouldn't approve that
either.

Contributor Author:

Thanks. Is the 'leak' relevant in this case, since we're shutting down anyway? I tried draining the queue in before_exit but that doesn't seem to run with ctrl-c/SIGINT shutdowns.

HotSpot code avoids variables with static storage duration for types with
non-trivial construction or destruction

Do you have a suggested alternative? Everything in symbol table seems to be static.


I've filed a bug report against the style guide to provide guidance in this area.
https://bugs.openjdk.org/browse/JDK-8319242

We typically define a variable with a pointer type, and initialize it somewhere in the VM startup process. So in
this case, something like

// declared in class
static TempSymbolDelayQueue* _cleanup_delay;

// in the .cpp
TempSymbolDelayQueue* SymbolHandleBase::_cleanup_delay = nullptr;
// and update uses accordingly.

Also declare and define an initialization function to initialize that variable, and call it somewhere in init.cpp,
probably in init_globals(). Use the style that exists there. I expect if you don't put it early enough that you'll
reliably get an obvious crash from attempting to access a nullptr.
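
A minimal sketch of that idiom, reusing the names from the snippet above (the init function name and its exact placement in init.cpp are assumptions for illustration, not the code that was eventually committed):

// in symbolHandle.cpp
// (init_cleanup_delay() is declared as a static member next to _cleanup_delay)
void SymbolHandleBase::init_cleanup_delay() {
  assert(_cleanup_delay == nullptr, "initialize exactly once");
  _cleanup_delay = new TempSymbolDelayQueue();
}

// in init.cpp, called from init_globals() early enough that class loading can use it:
//   SymbolHandleBase::init_cleanup_delay();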


As you suggest, in this case the "leak" doesn't matter because we're on the
way to shutdown. One might think the asserts could be conditionalized on that,
but so far as I know there isn't a reliable "on the way to shutdown" detector.
The idiom described above is what's done throughout HotSpot for initialization
(to ensure we have control of initialization order, among other things), and
we often don't worry about the teardown side of things.

TempSymbolDelayQueueNode* result = _cleanup_delay.pop();
if (result != nullptr) {
delete result;
Atomic::dec(&_cleanup_delay_len);


Because of a limitation on NonblockingQueue (from the class description: "A
queue may appear empty even though elements have been added and not
removed."), it is theoretically possible for the max-entries value to be
exceeded. (List is empty, thread1 starts a push but is paused, other threads
push lots of entries.) But that will eventually be cleaned up by completion
of the initial push and then later draining the list. So I don't think this
is a problem in practice, but wanted to note that I'd looked at the question.

Contributor Author:

Thanks! Yes, I agree that it's not perfectly bounded but that this usage works out okay.


// If the queue is now full, implement a one-in, one-out policy.
if (Atomic::add(&_cleanup_delay_len, 1, memory_order_relaxed) > _cleanup_delay_max_entries) {
TempSymbolDelayQueueNode* result = _cleanup_delay.pop();


NonblockingQueue's push and pop operations are subject to ABA problems, and
require the client to address that in some fashion. There's nothing here to do
that. I think one possibility would be to wrap the push/pop calls in a
GlobalCounter::CriticalSection and do a GlobalCounter::write_synchronize
before deleting a node.

Contributor:

If you have to add more code to wrap NonblockingQueue, please implement it in a .cpp file. I thought NBQ was sufficient for this. Maybe we want some other data structure for this.

Contributor Author:

I have attempted to avoid locks if at all possible. Re ABA problems, my reading of the push/pop concern is that the problem occurs if a caller pops a node and then re-uses that same node later by re-pushing it. Something like this:

Queue contains Node1(val=a, next=null).

  1. Thread 1 (start push) pushes Node2(val=b, next=null)
  2. Thread 2 pops Node1(val=a, next=null)
  3. Thread 2 sees that Node1.next is null and reuses the container, making Node1(val=c, next=null)
  4. Thread 2 (start push) starts to push Node1(val=c, next=null)
  5. Thread 1 (finish push) sets Node1.next=Node2
  6. Thread 2 (finish push) sets Node2.next=Node1 <-- circular list

Since my change doesn't reuse nodes, I believe it's safe.


Coleen -

I think NBQ is a reasonable choice for use here. But it's not a complete
solution on its own. It imposes documented requirements on clients. I don't
think we have a different data structure for this purpose (thread-safe FIFO
without locks), so any alternative would need to be invented, and would be
solving the same problem as NBQ and the surrounding client-provided code.

Oliver -

The current usage is not safe. The reuse can occur through the allocator. For
example, one thread starts a pop. Another thread steals that pop, then deletes
the object. Later, an allocation gets a new node at the same address as the
deleted node. That newly allocated node makes its way through the queue to
eventually become visible to that first thread's still in-progress pop. (So
this is an SMR bug. You generally can't delete an object while some other
thread might be looking at it.)

GlobalCounter is not a locking mechanism. It is an RCU-style synchronization
mechanism, so related to but different from RWLocks. In particular, readers
(threads in a critical section) never block due to this mechanism - only
write_synchronize blocks.

A problem with using GlobalCounter in that simplistic way is that once the
queue is "full", the one-in-one-out policy is going to have every allocation
hit GlobalCounter::write_synchronize (a potentially somewhat expensive
operation, since it needs to iterate over all threads), at least until the
queue is bulk drained. Switching over to a one-in-N-out policy could ameliorate
that by batching the synchronizes over several nodes, and also remove the need
for complete bulk draining. Having min/max queue sizes and switching between
insert-only and one-in-N-out policies depending on the current size seems like
a possible solution.

Contributor Author:

Thanks for all the details, I hadn't considered the SMR angle. I'll think about alternatives.

@coleenp (Contributor) commented Nov 7, 2023

In a brief conversation with Kim, in which I bemoaned that there should be a simpler solution, he suggested a fixed ring buffer with an xchg to replace the n'th element, something like:

const uint N = ...;
Symbol* volatile _delay_queue[N]; // initialize to nullptr
volatile uint _index = 0;

void add(Symbol* s) {
  ... increment refcount for s 
  uint i = Atomic::add(&_index, 1u) % N;
  Symbol* old = Atomic::xchg(&_delay_queue[i], s);
  if (old != nullptr) {
    ... decrement refcount for old, possibly deleting it
  }
}

Contributor Author:

🥳 no more NonblockingQueue! I think this is great, thanks, I will try it out today.

Edit: done.

_cleanup_delay.push(*node);

// If the queue is now full, implement a one-in, one-out policy.
if (Atomic::add(&_cleanup_delay_len, 1, memory_order_relaxed) > _cleanup_delay_max_entries) {


Why is incrementing relaxed? Now I have to think hard about whether there
might be any ordering problems resulting from that.

Contributor Author:

A colleague suggested that was the appropriate constraint ("there is no transitive memory effects that ride on the counter"), but I don't understand it enough to defend it myself. Atomic::inc uses memory_order_conservative. Happy to go with whatever is preferred.


One of the things I'm worried about is that I think the decrement operation
might be able to become temporarily negative. I'm not sure whether that's a
problem. Maybe not. I was a little surprised that the queue length stuff is
signed rather than unsigned, as I expected the latter.

The idiom I've usually used for such things is increment-before-push and
decrement-after-pop. That ensures the counter never goes negative or
underflows. Also usually use unsigned types for "sizes".


@dholmes-ora (Member) commented:

I tried draining the queue in before_exit but that doesn't seem to run with ctrl-c/SIGINT shutdowns.

before_exit is executed on all non-VM-abort VM shutdown paths.

@coleenp (Contributor) left a comment

This also looks good to me with a couple of minor comments. Kim might be away right now and it appears that you addressed his previous comments with this latest version.

@shipilev (Member) left a comment

Very nice, looks much better. I have cosmetic comments.

@@ -33,6 +33,8 @@ TEST_VM(SymbolTable, temp_new_symbol) {
JavaThread* THREAD = JavaThread::current();
// the thread should be in vm to use locks
ThreadInVMfromNative ThreadInVMfromNative(THREAD);
// Disable the temp symbol cleanup delay queue because it increases refcounts.
TempNewSymbol::set_cleanup_delay_enabled(false);

Member:

So we have an additional check for the "enabled" flag in hot production code only to make these tests happy? If so, can we "just" drain the delay queue after new_symbol here? Maybe with a helper method here in the test?

Contributor Author:

Yes that's true. I can try to avoid it.

Contributor Author:

I have added a small helper in the latest commit - do you think it's reasonable?

@openjdk bot removed the ready ("Pull request is ready to be integrated") and rfr ("Pull request is ready for review") labels Nov 10, 2023
@openjdk bot added the ready ("Pull request is ready to be integrated") and rfr ("Pull request is ready for review") labels Nov 10, 2023
@@ -43,16 +44,29 @@
// probe() and lookup_only() will increment the refcount if symbol is found.
template <bool TEMP>
class SymbolHandleBase : public StackObj {
static Symbol* volatile _cleanup_delay_queue[];
static volatile uint _cleanup_delay_index;


Putting the delay queue implementation in SymbolHandleBase<> creates unnecessary and unused
data, and possibly duplicate copies of the code, since it is only used when the template parameter is true.
Better would be to put the cleanup delay queue in a separate, non-templated, class. The entire
implementation of the queue could then be in the .cpp file. (I suggest the overhead of an out-of-line
call to add to the queue is in the noise, given that adding to the queue performs 3 atomic RMW
operations.) So something like this:

// In symbolHandle.hpp.
class TempSymbolCleanupDelayer : AllStatic {
  // Or make these file-scoped statics in the .cpp file.
  static const uint QueueSize = 128;
  static Symbol* volatile _queue[];
  static volatile uint _index;

public:
  static void delay_cleanup(Symbol* s);
};

// In symbolHandle.cpp.
Symbol* volatile TempSymbolCleanupDelayer::_queue[QueueSize] = {};
volatile uint TempSymbolCleanupDelayer::_index = 0;

void TempSymbolCleanupDelayer::delay_cleanup(Symbol* sym) {
  assert(sym != nullptr, "precondition");
  sym->increment_refcount();
  uint i = Atomic::add(&_index, 1u) % QueueSize;
  Symbol* old = Atomic::xchg(&_queue[i], sym);
  Symbol::maybe_decrement_refcount(old);
}

Code is completely untested. It incorporates suggestions I'm making elsewhere in this review too.

Contributor:

This seems like a reasonable suggestion. It would be better as a singleton in the .cpp file.
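
For illustration, the call site implied by this suggestion could look roughly like the following in the SymbolHandleBase conversion constructor (a hedged sketch; the final committed code may differ in its details):

// Conversion from a Symbol* to a SymbolHandleBase. Permanent handles take a
// reference as before; temporary handles also park the symbol on the cleanup
// delay queue, which takes its own reference.
SymbolHandleBase(Symbol* s) : _temp(s) {
  if (!TEMP) {
    if (_temp != nullptr) {
      _temp->increment_refcount();
    }
  } else if (_temp != nullptr) {
    TempSymbolCleanupDelayer::delay_cleanup(_temp);
  }
}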

Symbol* _temp;

public:
static constexpr uint CLEANUP_DELAY_MAX_ENTRIES = 128;


This doesn't need to be public.

Contributor Author:

It's used in the test; would you prefer another approach, like a friend class?

Symbol* _temp;

public:
static constexpr uint CLEANUP_DELAY_MAX_ENTRIES = 128;

SymbolHandleBase() : _temp(nullptr) { }

// Conversion from a Symbol* to a SymbolHandleBase.
// Does not increment the current reference count if temporary.


This comment is no longer true for temp symbols, since adding to the delay queue increments the refcount.

Contributor Author:

Well spotted, thanks.

@@ -84,8 +111,22 @@ class SymbolHandleBase : public StackObj {
static unsigned int compute_hash(const SymbolHandleBase& name) {
return (unsigned int) name->identity_hash();
}

static void drain_cleanup_delay_queue() {


It's not obvious that draining the queue is useful. Unless there's a reason I'm missing, I suggest not doing so.

Contributor Author:

I don't feel too strongly either way but someone else previously suggested draining during the periodic task so I added it.
The benefit is not leaving Symbols hanging around in the queue indefinitely (though granted, a fixed number of them, so the memory waste is limited). The downside is a small piece of added code and work on the periodic task.


I didn't find any discussion of whether draining is needed in this PR, and draining is in the initial commit.
Other downsides include the need to test that feature and the impact that feature has on testing other parts
of this change. Unless someone argues for it, I'd prefer to see it removed.

Contributor Author:

Yes, there is one additional test for the draining but it's quite simple. The drain feature actually helps with the other tests; that's how we avoid the queue interfering with refcounts (create a temp symbol, immediately drain the queue). So even if we don't drain periodically, I'd still leave the feature there for the existing tests.

What about the fact that not draining the queue means we could keep 128 symbols alive indefinitely? Are you not concerned because that's a small amount, and/or because natural churn will likely keep the queue moving?

Contributor:

I think the footprint savings is probably negligible. I like the idea that the queue is periodically drained because it might show issues where we've messed up the refcounts, but they'd be hard to debug, so maybe not worth it. So I see neither the harm nor the benefit of draining the queue in the periodic task, but we should have the feature for the one test.

Contributor Author:

I've removed the drain from the concurrent cleanup. I left it public and tested; is that reasonable, or would you prefer I make it private somehow (I'm not sure of the typical way to have test-only methods)?

Contributor:

I like that there's a test. Some gtests use friend declarations if they're in a class, but test_symbolTable isn't. I don't think it's worth changing.

void add_to_cleanup_delay_queue(Symbol* sym) {
sym->increment_refcount();
STATIC_ASSERT(is_power_of_2(CLEANUP_DELAY_MAX_ENTRIES)); // allow modulo shortcut
uint i = Atomic::add(&_cleanup_delay_index, 1u) & (CLEANUP_DELAY_MAX_ENTRIES - 1);


I disagree with @shipilev . Just use % CLEANUP_DELAY_MAX_ENTRIES. This is an entirely
unreasonable amount of code to explicitly provide a micro-optimization that any non-stupid compiler
will do for us anyway. At most, add a comment to the definition of that constant that being a power
of 2 benefits the ring-buffer. But I wouldn't even bother with that.
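
In other words, the indexing line above would simply become (sketch of the simplification being suggested):

uint i = Atomic::add(&_cleanup_delay_index, 1u) % CLEANUP_DELAY_MAX_ENTRIES;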

Symbol* old = Atomic::xchg(&_cleanup_delay_queue[i], sym);
if (old != nullptr) {
old->decrement_refcount();
}


This conditional refcount decrement could instead be Symbol::maybe_decrement_refcount(old);.

@olivergillespie (Contributor Author) commented:

Thanks for the suggestions, I'm hoping to get back to this next week.

@coleenp (Contributor) left a comment

This still looks good to me.

@kimbarrett left a comment

Looks good.

@shipilev (Member) left a comment

I am good with this version. New file needs a copyright header.

@olivergillespie (Contributor Author) commented:

/integrate

@openjdk bot added the sponsor ("Pull request is ready to be sponsored") label Dec 4, 2023
@openjdk bot commented Dec 4, 2023

@olivergillespie
Your change (at version b88c70d) is now ready to be sponsored by a Committer.

@coleenp (Contributor) commented Dec 4, 2023

/sponsor

@openjdk bot commented Dec 4, 2023

Going to push as commit d23f4f1.
Since your change was applied there have been 129 commits pushed to the master branch:

  • c17b8cf: 8320655: awt screencast robot spin and sync issues with native libpipewire api
  • ed5b8c3: 8225220: When the Tab Policy is checked,the scroll button direction displayed incorrectly.
  • f32ab8c: 8320924: Improve heap dump performance by optimizing archived object checks
  • 93b9235: 8321120: Shenandoah: Remove ShenandoahElasticTLAB flag
  • 9b8eaa2: 8319872: AArch64: [vectorapi] Implementation of unsigned (zero extended) casts
  • b9b8263: 8317307: test/jdk/com/sun/jndi/ldap/LdapPoolTimeoutTest.java fails with ConnectException: Connection timed out: no further information
  • 0d0a657: 5108458: JTable does not properly layout its content
  • 2b00ac0: 8308753: Class-File API transition to Preview
  • b9df827: 8309871: jdk/jfr/api/consumer/recordingstream/TestSetEndTime.java timed out
  • 9498469: 8318983: Fix comment typo in PKCS12Passwd.java
  • ... and 119 more: https://git.openjdk.org/jdk/compare/0c9a61c18545c7bd48e54e6b4e523b9ad8d0507d...master

Your commit was automatically rebased without conflicts.

@openjdk bot added the integrated ("Pull request has been integrated") label Dec 4, 2023
@openjdk openjdk bot closed this Dec 4, 2023
@openjdk bot removed the ready ("Pull request is ready to be integrated"), rfr ("Pull request is ready for review") and sponsor ("Pull request is ready to be sponsored") labels Dec 4, 2023
@openjdk bot commented Dec 4, 2023

@coleenp @olivergillespie Pushed as commit d23f4f1.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@olivergillespie deleted the keepalive2 branch December 4, 2023 13:56
@olivergillespie (Contributor Author) commented:

Thanks all for the help and reviews!

@coleenp (Contributor) commented Dec 4, 2023

Nice work @olivergillespie

@olivergillespie (Contributor Author) commented:

/backport jdk21u-dev

@openjdk bot commented Jan 9, 2024

@olivergillespie the backport was successfully created on the branch backport-olivergillespie-d23f4f12 in my personal fork of openjdk/jdk21u-dev. To create a pull request with this backport targeting openjdk/jdk21u-dev:master, just click the following link:

➡️ Create pull request

The title of the pull request is automatically filled in correctly and below you find a suggestion for the pull request body:

Hi all,

This pull request contains a backport of commit d23f4f12 from the openjdk/jdk repository.

The commit being backported was authored by Oli Gillespie on 4 Dec 2023 and was reviewed by Coleen Phillimore, Kim Barrett and Aleksey Shipilev.

Thanks!

If you need to update the source branch of the pull then run the following commands in a local clone of your personal fork of openjdk/jdk21u-dev:

$ git fetch https://github.com/openjdk-bots/jdk21u-dev.git backport-olivergillespie-d23f4f12:backport-olivergillespie-d23f4f12
$ git checkout backport-olivergillespie-d23f4f12
# make changes
$ git add paths/to/changed/files
$ git commit --message 'Describe additional changes made'
$ git push https://github.com/openjdk-bots/jdk21u-dev.git backport-olivergillespie-d23f4f12

@olivergillespie (Contributor Author) commented:

/backport jdk17u-dev

@openjdk bot commented Jan 9, 2024

@olivergillespie Could not automatically backport d23f4f12 to openjdk/jdk17u-dev due to conflicts in the following files:

  • src/hotspot/share/oops/symbolHandle.hpp
  • test/hotspot/gtest/classfile/test_placeholders.cpp

Please fetch the appropriate branch/commit and manually resolve these conflicts by using the following commands in your personal fork of openjdk/jdk17u-dev. Note: these commands are just some suggestions and you can use other equivalent commands you know.

# Fetch the up-to-date version of the target branch
$ git fetch --no-tags https://git.openjdk.org/jdk17u-dev.git master:master

# Check out the target branch and create your own branch to backport
$ git checkout master
$ git checkout -b backport-olivergillespie-d23f4f12

# Fetch the commit you want to backport
$ git fetch --no-tags https://git.openjdk.org/jdk.git d23f4f12adf1ea26b8c340efe2c3854e50b68301

# Backport the commit
$ git cherry-pick --no-commit d23f4f12adf1ea26b8c340efe2c3854e50b68301
# Resolve conflicts now

# Commit the files you have modified
$ git add files/with/resolved/conflicts
$ git commit -m 'Backport d23f4f12adf1ea26b8c340efe2c3854e50b68301'

Once you have resolved the conflicts as explained above continue with creating a pull request towards the openjdk/jdk17u-dev with the title Backport d23f4f12adf1ea26b8c340efe2c3854e50b68301.

Labels
hotspot (hotspot-dev@openjdk.org), integrated (Pull request has been integrated)
5 participants