-
Notifications
You must be signed in to change notification settings - Fork 5.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8261447: MethodInvocationCounters frequently run into overflow #2511
Conversation
👋 Welcome back lucy! A progress list of the required criteria for merging this PR into |
/cc hotspot-dev |
@RealLucy |
Webrevs
|
src/hotspot/share/oops/method.hpp
Outdated
@@ -708,6 +725,7 @@ class Method : public Metadata { | |||
} | |||
#ifndef PRODUCT | |||
static ByteSize compiled_invocation_counter_offset() { return byte_offset_of(Method, _compiled_invocation_count); } | |||
static ByteSize compiled_invocation_counter_offset64() { return byte_offset_of(Method, _compiled_invocation_count64); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
compiled_invocation_counter_offset()
looks unused.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. Removed.
src/hotspot/share/oops/method.hpp
Outdated
#else | ||
// for PrintMethodData in a product build | ||
int compiled_invocation_count() const { return 0; } | ||
int compiled_invocation_count() const { return 0; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
compiled_invocation_count()
looks unused.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
compiled_invocation_count() was used in compare_methods() and collect_invoked_methods(). I have converted those call sites to compiled_invocation_count64().
src/hotspot/share/oops/method.hpp
Outdated
#if defined(VM_LITTLE_ENDIAN) | ||
struct { | ||
int _compiled_invocation_count; // Number of nmethod invocations so far (for perf. debugging) | ||
// Must preserve this as int. Is used outside the jdk by SA. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not update the SA code to access 64 bit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two reasons: scope limitation and lack of expertise in SA code. Don't we need the 32bit decl anyway because it is contained in vmstructs.cpp?
Thank you Tobias for having a look. I'll post an update asap. |
@veresov please review these changes |
I don't really like these .*64 suffixes. Can we just make the counter 64 bit, update the SA as Tobias suggested and keep the existing method names? Or is there a reason for doing this that eludes me? |
I introduced the *64 suffixes to not break anything that still uses the old calls. As old uses disappear step by step, I'm more than happy to remove the suffixes. I will have a look into SA and try to make it 64bit counter ready. There may be no new version before the weekend is over. |
/cc serviceability-dev This is a request for help. Could someone with SA knowledge please check if my assumption is correct? In hotspot code, the field Method::_compiled_invocation_count is annotated with a comment that it is used by SA. The field is also exposed via vmStructs.cpp to enable such use. I have scanned SA code in OpenJDK11 and OpenJDK head but found no evidence that this particular field is accessed. Is this finding/assumption correct? If so, I could just stop exposing the field, making my life easier. Thanks! |
@RealLucy |
|
Looks like I have completely messed up my pull request. Please disregard for now. I'm trying to find a way how to clean up. Maybe I'll just start over. |
67fb3f7
to
273d55c
Compare
273d55c
to
0a99ee4
Compare
OK, my pull request is back in a reviewable state. Here is what changed:
In summary: All global invocation counters are 64-bit now. From those counters that register method-individual calls, only _compiled_invocation_count and _nof_megamorphic_calls were widened to 64-bit. The three remaining method-individual counters (invocation_count, interpreter_invocation_count. and backedge_count) remain untouched. I appreciate your feedback! Here is how stats look like now:
|
src/hotspot/share/oops/method.cpp
Outdated
tty->print_cr (" interpreter_invocation_count: %8d ", interpreter_invocation_count()); | ||
tty->print_cr (" invocation_counter: %8d ", invocation_count()); | ||
tty->print_cr (" backedge_counter: %8d ", backedge_count()); | ||
// Internal counting is based on signed int counters. They tend to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a good reason to not simply make them unsigned int?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, depends on what you accept as a good reason. :-)
I decided to keep the counters as they are to limit the scope of the change. A grep for backedge_counter returns 94 lines, for example. Deep down, these counters are InvocationCounters, declared as uint. On their way up to the surface, they are treated signed or unsigned. Pretty inconsistent, yes. But a huge task to get it all straight, including checking/fixing assembly code.
Is that reason enough?
Mailing list message from David Holmes on hotspot-dev: On 18/02/2021 9:25 pm, Lutz Schmidt wrote:
I guess so :) It sounds terribly messy and confused. Thanks, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This version looks ok. I understand that you don't want to clean up the whole singed / unsigned mess. That's fine with me. I'd only like to see one confusing comment removed or replaced.
You may also want to check your 64 bit overflow time in the description: I guess 185 days matches a 1 THz counter update frequency. With 1 GHz it should be above 500 years.
src/hotspot/share/runtime/java.cpp
Outdated
return ((*b)->invocation_count() + (*b)->compiled_invocation_count()) | ||
- ((*a)->invocation_count() + (*a)->compiled_invocation_count()); | ||
// invocation_count() may have overflowed already. Interpret it's result as | ||
// unsigned int to shift the limit of meaningless results by a factor of 2. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code is fine, but this comment doesn't make sense to me. The result is the same with your version. But it has the advantage that it avoids signed integer overflow (undefined behavior).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. The comments could be misleading. They are gone.
I corrected the "time to overflow" statement. In my first estimate, I just forgot 10 of the 64 bits.
Thank you for your thorough look into the change.
@RealLucy This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 199 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
@TobiHartmann |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me.
Thank you for your review, Tobias! |
Hi @RealLucy , I didn't do an actual review just made a passing comment. |
@dholmes-ora OK then. I just didn't want to ignore anyone's opinion. |
I left a comment a while ago about the unsigned int casts (in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have few comments.
@@ -156,6 +156,7 @@ VtableStub* VtableStubs::create_itable_stub(int itable_index) { | |||
if (s == NULL) { | |||
return NULL; | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why you did not update asm instruction to update nof_megamorphic_calls
in this file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason is plain simple: there is no incrementq() for x86_32. I could emulate that with a few lines like
address ctrAddr = (address)SharedRuntime::nof_megamorphic_calls_addr();
__ lea(rscratch1, ExternalAddress(ctrAddr));
__ addl(Address(rscratch1, 0), 1);
__ adcl(Address(rscratch1, 4), 0);
Not sure if that would be desirable here. Just let me know. As is, the code just updates the less significant half of the 8-byte counter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, let keep as it is. Then revert this file back - the only change is new empty line.
src/hotspot/share/oops/method.cpp
Outdated
tty->print_cr (" backedge_counter: " UINT32_FORMAT_W(11) " %s", bec, addMsg); | ||
|
||
if (method_data() != NULL) { | ||
unsigned int dcc = (unsigned int)method_data()->decompile_count(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
decompile_count() returns uint
why do cast and why you check decompile_count for overflow? It is very rare updated and limited by PerMethodRecompilationCutoff
flag (400 by default):
https://github.com/openjdk/jdk/blob/master/src/hotspot/share/oops/methodData.hpp#L2391
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That uint slipped my attention. And with the cutoff parameter, overflow is no issue. I can remove the cast and the check.
src/hotspot/share/oops/method.cpp
Outdated
// another factor of 2 before the counter values become meaningless. | ||
// Print a "overflow" notification to create awareness. | ||
const char* addMsg; | ||
unsigned int maxInt = (1U<<31) - 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use INT_MAX?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be better to change the logic to check if the counter is >= InvocationCounter::count_limit
then it's in overflow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With overflow I do not mean "counter overflow" as it is used to trigger compiler activities but plain simple "range of singed int exceeded". Should I use different wording? E.g. "counter in signed int overflow"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then it will never happen. These values come from InvocationCounter::count()
. And it will never return a value > 2^31 - 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, it can return 2^31. Which would be an indication of a counter overflow (InvocationCounter::count_limit == 2^31). Your code is correct in both senses. It is a counter overflow and a signed int overflow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, my arithmetic is bad again. InvocationCounter::count_limit is 2^30. So, I don't think there ever going to be an overflow that you're looking for. The sign bit is always 0. Or am I missing something again? :)
@veresov I can't see your comment re. the casts. The only comment I see is re. the *64 suffixes. |
No it was a different question. I see my comments try clicking on the "Files changed" tab and scrolling down to method.cpp. |
Igor, I'm sorry, I can't see your comments. Neither those for method.cpp nor those for java.cpp. I checked the piper mail traffic as well. No mail from you after the *64 suffix comment. For simplicity, could you just paste your comment here, please? Or send them to me via mail (off-list, first.last at sap.com)? |
Ok, odd. I've sent you an email. |
src/hotspot/share/oops/method.cpp
Outdated
// the overflow limit, we interpret the return value as unsigned int. | ||
// This is ok because counters are unsigned by nature, and it gives us | ||
// another factor of 2 before the counter values become meaningless. | ||
// Print a "overflow" notification to create awareness. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What invocation_count()
returns (which is currently equivalent to interpreter_invocation_count()
btw) comes from the InvocationCounter::count()
which cannot grow beyond 2^31, so all these counts are always positive. What exactly do these casts to unsigned do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, why do we need the casts to unsigned in this method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you increment (2^31-1), you get 2^31 which is 0x80000000. When interpreted as signed int, it is MIN_INT. I don't want that. I want to treat the value as positive number - what it actually is. There is no negative count!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to make a point that these counters are always < MAX_INT. InvocationCounter::count()
shifts the counter right by 1, ensuring that the sign bit is 0. Method::{invocation, backedge, interpreter_invocation}_count()
can also return InvocationCounter::count_limit
, but this one is 2^30, which is also positive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Slowly, but surely, we are coming to a common understanding. Thanks for educating me. I hadn't seen the range limitation for the counters. Now that I know, I recognise there is some knowledge in my almost faded memory.
OK, these three counters will never get dangerously close to 2^31-1. I will rework the section, remove some checks and casts. This will only happen tomorrow (Wednesday) which starts in 90 minutes in my time zone (GMT+1).
src/hotspot/share/runtime/java.cpp
Outdated
return ((*b)->invocation_count() + (*b)->compiled_invocation_count()) | ||
- ((*a)->invocation_count() + (*a)->compiled_invocation_count()); | ||
return (int32_t)(((uint32_t)(*b)->invocation_count() + (*b)->compiled_invocation_count()) | ||
- ((uint32_t)(*a)->invocation_count() + (*a)->compiled_invocation_count())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this correct? The arithmetic look to be: (int32_t) (uint64_t - uint64_t). If the 64 values inside don't fit in 32, you'll get a negative number which would break the sorting logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that you've fixed the types since the last comment, but it think it's still broken (and has been before).
How about:
int64_t diff = ((*b)->compiled_invocation_count() - (*a)->compiled_invocation_count()) + ((*b)->invocation_count() - (*a)->invocation_count());
if (diff > 0) return 1;
else if (diff < 0) return -1;
else return 0;
It's kind of hacky too, because it assumes that compiled_invocation_count() are positive and didn't overflow. But at least we'd get rid of a possible overflow during summation. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. As soon as there is overflow, the original formula doesn't do the trick either.
We can fix it as long as either (unsigned int)invocation_count() does not wrap around from 2^32-1 to 0. The entire expression is calculated as int64_t, protecting us from overflow for the next few years. If we then calculate the return value as you propose, we are good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In your new code here casts to uint32_t are probably unnecessary.
src/hotspot/share/oops/method.cpp
Outdated
// another factor of 2 before the counter values become meaningless. | ||
// Print a "overflow" notification to create awareness. | ||
const char* addMsg; | ||
unsigned int maxInt = (1U<<31) - 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be better to change the logic to check if the counter is >= InvocationCounter::count_limit
then it's in overflow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me.
Thank you, Igor! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good.
Thank you, Vladimir! |
/integrate |
@RealLucy Since your change was applied there have been 200 commits pushed to the
Your commit was automatically rebased without conflicts. Pushed as commit 268d9b7. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
Dear community,
may I please request reviews for this fix, improving the usefulness of method invocation counters.
Thank you!
Lutz
Progress
Issue
Reviewers
Download
$ git fetch https://git.openjdk.java.net/jdk pull/2511/head:pull/2511
$ git checkout pull/2511