8310026: [8u] make java_lang_String::hash_code consistent across platforms #336
@@ -177,14 +177,24 @@ class java_lang_String : AllStatic {
   // hash P(31) from Kernighan & Ritchie
   //
   // For this reason, THIS ALGORITHM MUST MATCH String.hashCode().
-  template <typename T> static unsigned int hash_code(T* s, int len) {
+  static unsigned int hash_code(const jchar* s, int len) {
     unsigned int h = 0;
     while (len-- > 0) {
       h = 31*h + (unsigned int) *s;
       s++;
     }
     return h;
   }
+
+  static unsigned int hash_code(const jbyte* s, int len) {
+    unsigned int h = 0;
+    while (len-- > 0) {
+      h = 31*h + (((unsigned int) *s) & 0xFF);
+      s++;
+    }
+    return h;
+  }
I don't understand this. According to the comment, both of these functions are to mimic Assuming the For example, let string be a single unicode "ぁ" character, aka Hash for the first would use the jchar* variant, len=1, and return 0x3041. Hash for the UTF8 variant would get, I assume, a byte array of I must be missing something basic here.

I am not really sure what you are suggesting is a problem here, Thomas. I /think/ the only problem here is that the comment is wrong. You are right that only the The problem this is fixing is to do with the disparity between This current fix decouples the definitions of As far as I can tell it doesn't actually matter what interpretation is placed on the data sitting in field

Few uses of
other uses dealing with Strings use
In newer JDKs,

Thank you, Andrew, for disentangling this. I get it now. The byte variant only has to match its Java counterpart in the agent.
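
To make the "ぁ" example from this thread concrete, here is a minimal standalone sketch of the two overloads. It is not the HotSpot source: `jchar_t` and `jbyte_t` are hypothetical stand-ins for `jchar` and `jbyte`, the function names are invented for illustration, and the UTF-8 encoding of U+3041 (`E3 81 81`) is filled in from the Unicode encoding rules rather than from the discussion.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for HotSpot's jchar (unsigned 16-bit) and jbyte (signed 8-bit).
typedef uint16_t jchar_t;
typedef int8_t   jbyte_t;

// Same loop as the jchar overload in the diff: h = 31*h + c over UTF-16 code units.
static unsigned int hash_code_chars(const jchar_t* s, int len) {
  unsigned int h = 0;
  while (len-- > 0) {
    h = 31*h + (unsigned int) *s;
    s++;
  }
  return h;
}

// Same loop as the jbyte overload in the diff: each byte is masked to 0..255.
static unsigned int hash_code_bytes(const jbyte_t* s, int len) {
  unsigned int h = 0;
  while (len-- > 0) {
    h = 31*h + (((unsigned int) *s) & 0xFF);
    s++;
  }
  return h;
}

// "ぁ" as a single UTF-16 code unit.
static unsigned int hash_utf16_example() {
  const jchar_t utf16[] = { 0x3041 };
  return hash_code_chars(utf16, 1);   // 0x3041
}

// "ぁ" as its three UTF-8 bytes, stored as signed bytes.
static unsigned int hash_utf8_example() {
  const jbyte_t utf8[] = { static_cast<jbyte_t>(0xE3),
                           static_cast<jbyte_t>(0x81),
                           static_cast<jbyte_t>(0x81) };
  return hash_code_bytes(utf8, 3);    // 0x36443
}
```

The two results differ, which is consistent with the conclusion above: the byte overload cannot reproduce `String.hashCode()` for non-ASCII text and only needs to agree with whatever computes the same byte-wise hash on the Java side.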

   static unsigned int hash_code(oop java_string);

   // This is the string hash code used by the StringTable, which may be

Byte.toUnsignedInt() would be clearer.

Yes, but that would also be different to upstream jdk11u.
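
For readers comparing the `& 0xFF` idiom with `Byte.toUnsignedInt()`, the sketch below shows the C++ side of the same idea. The helper names are hypothetical and this is only an illustration of why the mask matters; it assumes a 32-bit `unsigned int`, as on the usual HotSpot targets.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: C++ analogue of Java's Byte.toUnsignedInt(byte),
// written with the same mask the jbyte overload uses.
static unsigned int to_unsigned_int(int8_t b) {
  return static_cast<unsigned int>(b) & 0xFFu;
}

// Equivalent formulation: reinterpret the byte as unsigned before widening.
static unsigned int to_unsigned_int_cast(int8_t b) {
  return static_cast<uint8_t>(b);
}

// What happens without the mask: the negative byte is sign-extended,
// so 0x81 becomes 0xFFFFFF81 (assuming a 32-bit unsigned int).
static unsigned int sign_extended(int8_t b) {
  return static_cast<unsigned int>(b);
}
```

With the mask or the cast, a byte such as `0x81` contributes 129 to the hash regardless of whether the underlying byte type is signed; without it, the contribution depends on sign extension, which is the kind of inconsistency a byte-wise hash has to avoid.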