Improve global-cached-lookups metric reporting #13219
@@ -245,7 +245,8 @@ static long getByteLengthOfObject(@Nullable Object o)
 {
   if (null != o) {
     if (o.getClass().getName().equals(STRING_CLASS_NAME)) {
-      return ((String) (o)).length();
+      // Each String object has ~40 bytes of overhead
+      return ((long) ((String) (o)).length() * Character.BYTES) + 40;
You could also consider doing this instead:
druid/processing/src/main/java/org/apache/druid/segment/StringDimensionDictionary.java
Lines 41 to 47 in 346fbf1
@Override
public long estimateSizeOfValue(String value)
{
  // According to https://www.ibm.com/developerworks/java/library/j-codetoheap/index.html
  // Total string size = 28B (string metadata) + 16B (char array metadata) + 2B * num letters
  return 28 + 16 + (2L * value.length());
}
Thanks for pointing this out! I think the overhead depends on the particular version of Java and, beyond that, on the particular distribution of it. In my testing I found 40 bytes to be a good estimate for the overhead; sometimes it's a bit less. I'll leave it as is unless you feel strongly about changing it.
I think we can proceed with what you have. It's a difference of just 4 bytes (40 vs 44) per String anyway, and as you mention, it does depend on java version.
It's better to use the same calculation and code.
@zachjsh - can you reuse the code? The common code can be placed in the StringUtils class.
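For illustration, here is a minimal sketch of what a shared size estimate could look like, with the two overhead figures from this thread side by side. The class name `StringSizeEstimator` and its method names are hypothetical and not part of Druid; the constants are the estimates quoted above (28 + 16 bytes versus a flat 40 bytes), and the real overhead varies with the Java version and distribution.

```java
// Hypothetical helper, not an existing Druid class: compares the two
// per-String heap-size estimates discussed in this review thread.
final class StringSizeEstimator
{
  // Estimate from StringDimensionDictionary: 28B string metadata + 16B char array metadata.
  static final int STRING_METADATA_BYTES = 28;
  static final int CHAR_ARRAY_METADATA_BYTES = 16;

  // Flat overhead used in this PR, taken from the author's testing (an approximation).
  static final int FLAT_OVERHEAD_BYTES = 40;

  // 28 + 16 + 2 bytes per character.
  static long estimateWithSplitOverhead(String value)
  {
    return STRING_METADATA_BYTES + CHAR_ARRAY_METADATA_BYTES + (2L * value.length());
  }

  // 2 bytes per character (Character.BYTES) + 40 bytes of overhead.
  static long estimateWithFlatOverhead(String value)
  {
    return ((long) value.length() * Character.BYTES) + FLAT_OVERHEAD_BYTES;
  }

  public static void main(String[] args)
  {
    String key = "lookup-key"; // 10 characters
    System.out.println(estimateWithSplitOverhead(key)); // prints 64
    System.out.println(estimateWithFlatOverhead(key));  // prints 60
  }
}
```

Both estimates charge 2 bytes per character, so they differ only by a constant 4 bytes per String regardless of length.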
LGTM
It was found that the `namespace/cache/heapSizeInBytes` metric, which tracks the total heap size in bytes of all lookup caches loaded on a service instance, was being under-reported. We were not accounting for the memory overhead of the String object, which I've found in testing to be ~40 bytes. While this overhead may depend on the Java version, it should not vary much, and accounting for it provides a better estimate. Also fixed some logging, and made reading bytes from the JDBI result set a little more efficient by saving hash table lookups. Also added some of the lookup metrics to the default statsD emitter metric whitelist.
This PR has: