8289389: Fix warnings: type should also implement hashCode() since it overrides Object.equals()#821

Closed
andy-goryachev-oracle wants to merge 6 commits into openjdk:master from andy-goryachev-oracle:8289389.hash

Conversation

@andy-goryachev-oracle
Contributor

@andy-goryachev-oracle andy-goryachev-oracle commented Jul 8, 2022

  • added missing hashCode() methods

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8289389: Fix warnings: type should also implement hashCode() since it overrides Object.equals()

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jfx pull/821/head:pull/821
$ git checkout pull/821

Update a local copy of the PR:
$ git checkout pull/821
$ git pull https://git.openjdk.org/jfx pull/821/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 821

View PR using the GUI difftool:
$ git pr show -t 821

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jfx/pull/821.diff

@bridgekeeper

bridgekeeper bot commented Jul 8, 2022

👋 Welcome back andy-goryachev-oracle! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Ready for review label Jul 8, 2022
@mlbridge

mlbridge bot commented Jul 8, 2022

Webrevs

Member

@kevinrushforth kevinrushforth left a comment

This is not a complete review, just one thing I spotted quickly.

@hjohn
Collaborator

hjohn commented Jul 8, 2022

Just a general note, shouldn't most of these use Objects.hash?

I see a lot of code in the form of:

  if(x == null) return 0;

  return x.hashCode();

And even a few where hashes are calculated manually for two values:

  x.hashCode() * 31 + y.hashCode();

These could be written as:

  Objects.hash(x);  // null check for free
  Objects.hash(x, y);  // free null checks and hashes are merged
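For reference, a small sketch of what these calls compute. It is only an illustration (the class and method names are hypothetical), and it shows one detail worth keeping in mind: `Objects.hashCode(x)` reproduces the hand-written null check exactly, while `Objects.hash(x)` is null-safe but yields `31 + x.hashCode()` rather than `x.hashCode()`.

```java
import java.util.Objects;

// Illustrative only: ObjectsHashDemo and nullSafe are hypothetical names.
class ObjectsHashDemo {
    // The hand-written pattern quoted above: 0 for null, else the value's own hash.
    static int nullSafe(Object x) {
        if (x == null) return 0;
        return x.hashCode();
    }

    public static void main(String[] args) {
        String x = "x";
        String y = null;

        // Objects.hashCode(Object) matches the hand-written null check exactly.
        System.out.println(Objects.hashCode(x) == nullSafe(x)); // true
        System.out.println(Objects.hashCode(y) == nullSafe(y)); // true

        // Objects.hash(x) is null-safe too, but folds in the seed: 31 * 1 + hash(x).
        System.out.println(Objects.hash(x) == 31 + nullSafe(x)); // true

        // Two values: 31 * (31 * 1 + hash(x)) + hash(y).
        System.out.println(Objects.hash(x, y) == 31 * (31 + nullSafe(x)) + nullSafe(y)); // true
    }
}
```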

h = 31 * h + value.hashCode();
}
return h;
}
Collaborator

Just an example, but wouldn't: Objects.hash(relative, origin, value) here work just as well?

Collaborator

Honestly, I would just turn this class into a record and everything will be taken care of. I'm not sure it is within the scope of the fix.

Member

Such a change is definitely out of scope. More importantly, we can't use record anywhere in JavaFX until we bump the minimum version of the JDK needed to run JavaFX. I plan to start a discussion on the openjfx-dev mailing list, since it has been something I've wanted to do for a while now.

Contributor Author

@hjohn: yes, but at a price: Objects.hash(Object...) incurs overhead by creating a temporary Object[] and boxing the boolean.
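A minimal sketch of the trade-off being described. The field names `relative`, `origin` and `value` come from the review comment above; their types and the class name are assumptions made for illustration. Both methods return the same number; the hand-rolled one simply avoids the varargs array and the `Boolean` box (which the JIT may or may not eliminate in the first form).

```java
import java.util.Objects;

// Hypothetical helper; parameter types are assumed, not taken from the PR.
class AllocationFreeHash {
    // Convenient form: builds an Object[3] and boxes 'relative'.
    static int withObjectsHash(boolean relative, Object origin, Object value) {
        return Objects.hash(relative, origin, value);
    }

    // Hand-rolled form: the same arithmetic without the array or the box.
    static int byHand(boolean relative, Object origin, Object value) {
        int h = 1;
        h = 31 * h + Boolean.hashCode(relative);
        h = 31 * h + (origin == null ? 0 : origin.hashCode());
        h = 31 * h + (value == null ? 0 : value.hashCode());
        return h;
    }

    public static void main(String[] args) {
        System.out.println(withObjectsHash(true, null, "v") == byHand(true, null, "v")); // true
    }
}
```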

Collaborator

Ah, I didn't realize you checked how this is optimized by the JIT.

Member

In order to reduce collisions, the hash of each component is typically added to h * 31 even when that hash is 0, whereas you skip the h = 31 * h in the case of null. It might not be a problem in practice, since value and origin are unlikely to collide, being of different types, but you might want to look at it.
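To make the concern concrete, here is a small sketch (hypothetical class and method names, two `Object` components assumed) comparing a hash that skips null components with one that always accumulates. With skipping, moving a single non-null value from one field to the other leaves the hash unchanged.

```java
// Illustrative only; not code from the PR.
class NullSkipDemo {
    // Skips the multiply-and-add step entirely when a component is null.
    static int skipNulls(Object a, Object b) {
        int h = 1;
        if (a != null) h = 31 * h + a.hashCode();
        if (b != null) h = 31 * h + b.hashCode();
        return h;
    }

    // Always accumulates, using 0 as the hash of a null component.
    static int accumulateAll(Object a, Object b) {
        int h = 1;
        h = 31 * h + (a == null ? 0 : a.hashCode());
        h = 31 * h + (b == null ? 0 : b.hashCode());
        return h;
    }

    public static void main(String[] args) {
        // ("x", null) and (null, "x") collide when nulls are skipped...
        System.out.println(skipNulls("x", null) == skipNulls(null, "x")); // true
        // ...but not when every component is accumulated.
        System.out.println(accumulateAll("x", null) == accumulateAll(null, "x")); // false
    }
}
```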

In any case, I need time to look at it, which I won't have until after JavaFX 19 RDP1, so let's leave this until then.

Contributor Author

You bring up a good point, Kevin, thank you!

Collaborator

I would use Boolean.hashCode(relative); for a boolean.

Kevin, I checked Effective Java 3rd Edition and it also says to use 0 (or some other constant) for null.

Member

Using 0 for a null component of the hash is fine, as long as the accumulator variable is initialized to something > 0, and as long as you accumulate all values, even a 0. By which I mean that the following is not OK:

    int h = 1;
    if (comp1 != null) h = 31 * h + comp1.hashCode();
    if (comp2 != null) h = 31 * h + comp2.hashCode();

nor is this:

    int h = (comp1 == null) ? 0 : comp1.hashCode();
    h = 31 * h + ((comp2 == null) ? 0 : comp2.hashCode());

but this is:

    int h = 1;
    h = 31 * h + ((comp1 == null) ? 0 : comp1.hashCode());
    h = 31 * h + ((comp2 == null) ? 0 : comp2.hashCode());

Anyway, the latest change in this PR is good.

@kevinrushforth kevinrushforth self-requested a review July 12, 2022 11:38
Collaborator

@nlisker nlisker left a comment

Except for one case, looks good. I think that nearly all of the equals implementations are not the best (or just wrong), but it's outside the scope of the PR.

Comment on lines 1714 to 1715
if (obj == null) return false;
return id == ((RT22599_DataType)obj).id;
Collaborator

I note that the equals method here is wrong. The hashCode implementation is fine.

Member

This does look dubious. Since this was written for a specific unit test, it might have been intentional to have a "wrong" equals method, but if so, a comment should have been added to that effect. In any case, fixing equals is out of scope for this PR. And given what equals does, the proposed hashCode method is correct.
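The thread does not spell out what is wrong with the quoted equals. One plausible reading is that it casts without checking the argument's type, so comparing against an unrelated object would throw a ClassCastException instead of returning false. A sketch of a more conventional pair, under that assumption (the class below is a hypothetical stand-in for the test's RT22599_DataType, and the `int` type of `id` is assumed):

```java
// Illustrative only; not the code under review.
class EqualsSketch {
    static class DataType {
        final int id;

        DataType(int id) { this.id = id; }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            // instanceof is false for null, so this also covers the null check.
            if (!(obj instanceof DataType)) return false;
            return id == ((DataType) obj).id;
        }

        @Override
        public int hashCode() {
            // Derived from the same single field that equals compares.
            return Integer.hashCode(id);
        }
    }

    public static void main(String[] args) {
        System.out.println(new DataType(1).equals(new DataType(1))); // true
        System.out.println(new DataType(1).equals("not a DataType")); // false, no exception
    }
}
```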

Comment on lines 5086 to 5087
if (obj == null) return false;
return id == ((RT22599_DataType)obj).id;
Collaborator

Same as ListViewTest.

Comment on lines 5742 to 5743
if (obj == null) return false;
return id == ((RT22599_DataType)obj).id;
Collaborator

Same as ListViewTest.

Comment on lines 3076 to 3077
if (obj == null) return false;
return id == ((RT22599_DataType)obj).id;
Collaborator

Same as ListViewTest.

Contributor Author

This and ListViewTest: the logic in hashCode() follows the logic in equals().
It is a fairly safe change, as these objects are never put into a hashtable and never accessed outside of the test context.

Member

Yes.

Comment on lines 2321 to 2322
return false;
}
Collaborator

Same as ListViewTest.

Contributor Author

In the case of FXMLLoader, hashCode() also follows the logic of equals().
However, we could mix something specific to FXMLLoader into the hash, to avoid a collision with the origin URL.

Member

I think this is fine as you now have it. If anyone feels it is worth it, we could file a follow-on issue to look into whether equals should be changed, but I don't know whether it is worth it.

Comment on lines +455 to +457
public int hashCode() {
return sourceClip.hashCode();
}
Collaborator

Why are you ignoring the significant fields? equals compares priority, loopCount, volume, balance, rate and pan in addition.
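A sketch of the point being raised, using a hypothetical stand-in class (the field names follow the comment above; the types and the `String` source are assumptions). A hash built from `sourceClip` alone still satisfies the equals/hashCode contract, but objects that differ only in the other fields all land on the same hash; mixing in every field that equals compares avoids that.

```java
// Illustrative only; not the class changed in the PR.
class ClipHashDemo {
    static class Clip {
        final String source;
        final int priority, loopCount;
        final double volume, balance, rate, pan;

        Clip(String source, int priority, int loopCount,
             double volume, double balance, double rate, double pan) {
            this.source = source;
            this.priority = priority;
            this.loopCount = loopCount;
            this.volume = volume;
            this.balance = balance;
            this.rate = rate;
            this.pan = pan;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Clip)) return false;
            Clip c = (Clip) o;
            // Double.compare keeps equals consistent with Double.hashCode (NaN, -0.0).
            return source.equals(c.source)
                && priority == c.priority && loopCount == c.loopCount
                && Double.compare(volume, c.volume) == 0
                && Double.compare(balance, c.balance) == 0
                && Double.compare(rate, c.rate) == 0
                && Double.compare(pan, c.pan) == 0;
        }

        // Legal but coarse: clips that differ only in volume etc. share this hash.
        int narrowHash() {
            return source.hashCode();
        }

        // Mixes in every field that equals compares.
        @Override
        public int hashCode() {
            int h = 1;
            h = 31 * h + source.hashCode();
            h = 31 * h + priority;
            h = 31 * h + loopCount;
            h = 31 * h + Double.hashCode(volume);
            h = 31 * h + Double.hashCode(balance);
            h = 31 * h + Double.hashCode(rate);
            h = 31 * h + Double.hashCode(pan);
            return h;
        }
    }

    public static void main(String[] args) {
        Clip quiet = new Clip("a.wav", 0, 1, 0.2, 0.0, 1.0, 0.0);
        Clip loud = new Clip("a.wav", 0, 1, 1.0, 0.0, 1.0, 0.0);
        System.out.println(quiet.equals(loud));                     // false
        System.out.println(quiet.narrowHash() == loud.narrowHash()); // true: unequal, same hash
    }
}
```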

Comment on lines +528 to +531
if (player == null) {
return 0;
}
return player.hashCode();
Collaborator

This is fine, but it can also be written as a ternary: return player == null ? 0 : player.hashCode();

@hjohn
Collaborator

hjohn commented Jul 19, 2022

I really must wonder at some of these changes. They seem... unnecessary. Just because equals is overridden, doesn't mean hashCode must be as well, especially not if the objects involved are mutable -- it might in fact be better to leave it alone or return a constant value from it (or only use immutable fields).

For example, let's take FXMLLoader. I put it in a HashMap. I modify its location. With a traditionally implemented hashCode that checks the same fields as equals, the object will now fail the HashMap contains check, even though I am actually using the same instance. If we had left the default hashCode in place, it would be found correctly (as the identity hash code will be unchanged). However, if we had used a different instance, then the default identity hashCode would not work either. Returning a constant from hashCode then would "solve" the problem again (at the expense of poor performance in large hash maps).

TL;DR: don't use mutable fields in hashCode; if there are only mutable fields, return a constant. Or turn off these overly broad warnings that don't really grasp the full impact. I know this is considered a bit of a "holy grail", but it fails in the face of mutable objects, to the point that it might be better not to implement hashCode at all than to give the wrong impression.
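The scenario described above can be reproduced with a few lines. This is a sketch only: `Loader` is a hypothetical stand-in with a single mutable `location` field, not the real FXMLLoader.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Illustrative only; class and field names are assumptions.
class MutableKeyDemo {
    static class Loader {
        String location; // mutable, and used by both equals and hashCode

        Loader(String location) { this.location = location; }

        @Override
        public boolean equals(Object o) {
            return o instanceof Loader && Objects.equals(location, ((Loader) o).location);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(location);
        }
    }

    // Returns whether the set still finds the very same instance after it was mutated.
    static boolean foundAfterMutation() {
        Set<Loader> set = new HashSet<>();
        Loader loader = new Loader("a.fxml");
        set.add(loader);
        loader.location = "b.fxml"; // the stored entry keeps the hash computed from "a.fxml"
        return set.contains(loader);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMutation()); // false
    }
}
```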

@andy-goryachev-oracle
Contributor Author

John, thank you for your comments!

I fully agree with the idea of not putting objects with mutable fields that participate in either equals() or hashCode() into Hashtable. This scenario might still be legitimate, for example by not modifying the fields in question after instantiation.

I also agree that some of the changes might look unnecessary, or perhaps we should either return super.hashCode() or maybe getClass().hashCode() -- not a constant, as hashCode() guarantees some variability, while a constant does not.

I do think, however, that suppressing this warning or turning it off is not a good idea. It forces the developer to think about exactly this issue.

Do you have any specific objections?

@hjohn
Collaborator

hjohn commented Jul 19, 2022

> John, thank you for your comments!
>
> I fully agree with the idea of not putting objects with mutable fields that participate in either equals() or hashCode() into Hashtable. This scenario might still be legitimate, for example by not modifying the fields in question after instantiation.

Risky, but that's on the user... they'll have to check the source code to see how hashCode is implemented before doing this kind of thing.

> I also agree that some of the changes might look unnecessary, or perhaps we should either return super.hashCode() or maybe getClass().hashCode() -- not a constant, as hashCode() guarantees some variability, while a constant does not.

Yes, getClass().hashCode() is recommended; it is constant for all instances of that class.

> I do think, however, that suppressing this warning or turning it off is not a good idea. It forces the developer to think about exactly this issue.

Yeah, I suppose you can "suppress" the warning by overriding it with getClass().hashCode() if there are no immutable fields. Even though performance would suffer, it would at least work correctly all the time if someone accidentally put these in a hashed collection. Once the problem is discovered and the developer realizes their mistake, they can always wrap the object to provide a hash code more suitable to their purpose.

I think many of these have equals implemented primarily because they're used as values for properties. These need equals because of how changes are detected.

@tomsontom
Collaborator

> I really must wonder at some of these changes. They seem... unnecessary. Just because equals is overridden, doesn't mean hashCode must be as well, especially not if the objects involved are mutable -- it might in fact be better to leave it alone or return a constant value from it (or only use immutable fields).

This is not true. There's a contract between hashCode and equals; quoting the Javadoc of hashCode():

> If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
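The quoted rule only runs in one direction: equal objects must share a hash, while unequal objects are free to share one too. A minimal sketch of that check (hypothetical names throughout), which also shows that a constant hashCode, as discussed earlier in the thread, does not violate it:

```java
// Illustrative only; Tag and honorsContract are hypothetical names.
class ContractCheck {
    // Value class whose equals compares 'name' while hashCode is a constant.
    static class Tag {
        final String name;

        Tag(String name) { this.name = name; }

        @Override
        public boolean equals(Object o) {
            return o instanceof Tag && name.equals(((Tag) o).name);
        }

        @Override
        public int hashCode() {
            return 0;
        }
    }

    // True when the pair does not violate the rule "equal implies same hashCode".
    static boolean honorsContract(Object a, Object b) {
        return !a.equals(b) || a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        System.out.println(honorsContract(new Tag("a"), new Tag("a"))); // true
        System.out.println(honorsContract(new Tag("a"), new Tag("b"))); // true
    }
}
```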

@hjohn
Collaborator

hjohn commented Jul 21, 2022

> > I really must wonder at some of these changes. They seem... unnecessary. Just because equals is overridden, doesn't mean hashCode must be as well, especially not if the objects involved are mutable -- it might in fact be better to leave it alone or return a constant value from it (or only use immutable fields).
>
> This is not true. There's a contract between hashCode and equals; quoting the Javadoc of hashCode():

Thanks, I'm well aware. The issue here is that once a mutable object is stored in a map, you can either:

  1. Have it find and match the same instance of the object (by not overriding hashCode or calling System.identityHashCode()), regardless of whether that object was mutated in the meantime
  2. Have it find and match any instance (by overriding hashCode and using the same mutable fields as equals uses), but only if the original instance wasn't mutated in the meantime (its bucket will not be adjusted by the hash map)
  3. Have it work in all cases, but with terrible performance (by overriding hashCode with the suggested getClass().hashCode() which is effectively a constant)

This isn't mentioned anywhere aside from the small note ("provided no information used in {@code equals} comparisons on the object is modified").
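Option 3 from the list above can be sketched as follows. `Loader` is again a hypothetical stand-in with one mutable field; because every instance reports the same hash, mutation cannot strand an entry in the wrong bucket, and lookups fall back to equals.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Illustrative only; class and field names are assumptions.
class ClassHashDemo {
    static class Loader {
        String location; // mutable; used by equals but deliberately not by hashCode

        Loader(String location) { this.location = location; }

        @Override
        public boolean equals(Object o) {
            return o instanceof Loader && Objects.equals(location, ((Loader) o).location);
        }

        @Override
        public int hashCode() {
            // Identical for every Loader instance within one JVM run.
            return getClass().hashCode();
        }
    }

    static boolean[] lookupsAfterMutation() {
        Set<Loader> set = new HashSet<>();
        Loader loader = new Loader("a.fxml");
        set.add(loader);
        loader.location = "b.fxml";
        return new boolean[] {
            set.contains(loader),               // same instance
            set.contains(new Loader("b.fxml"))  // different but equal instance
        };
    }

    public static void main(String[] args) {
        boolean[] found = lookupsAfterMutation();
        System.out.println(found[0] + " " + found[1]); // true true
    }
}
```

The cost is the one named in the list: all entries of this type share one bucket, so lookups degrade as the collection grows.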

@andy-goryachev-oracle
Contributor Author

John:

You do bring up a good point. I am sure someone has fallen into this trap by adding a mutable object to a hashtable and then mutating it afterwards. We do have some words of caution in the java.util.Map interface javadoc:

"Note: great care must be exercised if mutable objects are used as map keys..."

At the same time, I feel like this discussion goes beyond the scope of this PR, as it falls under the rubric of design decisions in the client code.

Is there anything specific you think should be changed in this PR, or are you against these code changes in principle?

@hjohn
Collaborator

hjohn commented Jul 21, 2022

Yes, it's beyond the scope of this PR. I suppose people will have to be careful as with all classes that are mutable and have decided to override equals/hashCode -- no reason to stop now I suppose.

Member

@kevinrushforth kevinrushforth left a comment

Looks pretty good now. I noted a couple places that might need a change.


@Override
public int hashCode() {
int h = sourceClip.getLocator().getURI().hashCode();
Member

Is URI::hashCode() guaranteed to return a non-zero value? If not, then initializing h to 1 and accumulating this would be better.


@Override
public int hashCode() {
int h = Float.floatToIntBits(x);
Member

Best to initialize h to 1 and accumulate x.
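A sketch of the suggested shape, assuming a hypothetical two-float value class (only the `Float.floatToIntBits(x)` call is taken from the snippet under review). Every component, including the first, goes through the same `31 * h + ...` step, so the result does not depend on whether the first component happens to hash to 0.

```java
// Illustrative only; Vec2 is a hypothetical class.
class FloatHashDemo {
    static class Vec2 {
        final float x, y;

        Vec2(float x, float y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Vec2)) return false;
            Vec2 v = (Vec2) o;
            // Compare bit patterns so equals agrees with the hash below.
            return Float.floatToIntBits(x) == Float.floatToIntBits(v.x)
                && Float.floatToIntBits(y) == Float.floatToIntBits(v.y);
        }

        @Override
        public int hashCode() {
            int h = 1;                              // non-zero seed
            h = 31 * h + Float.floatToIntBits(x);   // accumulate x
            h = 31 * h + Float.floatToIntBits(y);   // accumulate y
            return h;
        }
    }

    public static void main(String[] args) {
        // Both components hash to 0, yet the result is 31 * 31 = 961 rather than 0.
        System.out.println(new Vec2(0f, 0f).hashCode()); // 961
    }
}
```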

@andy-goryachev-oracle
Contributor Author

excellent suggestions, Kevin - many thanks!

Member

@kevinrushforth kevinrushforth left a comment

The changes look good.

I note that by initializing the hash to CLASSNAME.class.hashCode(), instead of a constant, the hash code produced will not be predictable or stable across different invocations of the JVM. This is entirely permissible by the spec of Object.hashCode(), so it is just a (somewhat) interesting note.
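A sketch of the pattern being described, with a hypothetical class standing in for CLASSNAME. `Class` does not override `Object.hashCode()`, so the seed is an identity hash: consistent for equal objects within one JVM run, but typically different from run to run.

```java
// Illustrative only; Seeded is a hypothetical class.
class SeededHashDemo {
    static class Seeded {
        final String name;

        Seeded(String name) { this.name = name; }

        @Override
        public boolean equals(Object o) {
            return o instanceof Seeded && java.util.Objects.equals(name, ((Seeded) o).name);
        }

        @Override
        public int hashCode() {
            int h = Seeded.class.hashCode(); // identity hash of the Class object
            h = 31 * h + (name == null ? 0 : name.hashCode());
            return h;
        }
    }

    public static void main(String[] args) {
        // Equal objects agree within this run; the printed value may change between runs.
        System.out.println(new Seeded("a").hashCode() == new Seeded("a").hashCode()); // true
        System.out.println(new Seeded("a").hashCode());
    }
}
```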

@openjdk

openjdk bot commented Jul 22, 2022

@andy-goryachev-oracle This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8289389: Fix warnings: type should also implement hashCode() since it overrides Object.equals()

Reviewed-by: kcr

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 12 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@kevinrushforth, @nlisker) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk bot added the ready Ready to be integrated label Jul 22, 2022
@andy-goryachev-oracle
Contributor Author

> The changes look good.
>
> I note that by initializing the hash to CLASSNAME.class.hashCode(), instead of a constant, the hash code produced will not be predictable or stable across different invocations of the JVM. This is entirely permissible by the spec of Object.hashCode(), so it is just a (somewhat) interesting note.

I'd say this is a desired outcome.

There is a group of vulnerabilities caused by the predictable nature of hashCode, for example a DoS attack that places a large number of objects into the same hashtable bin; see
https://stackoverflow.com/questions/8669946/application-vulnerability-due-to-non-random-hash-functions
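As a small illustration of why a predictable hash makes this possible: "Aa" and "BB" have the same `String.hashCode()`, so any concatenation of those two blocks collides as well. The sketch below (hypothetical class and method names) generates 2^n such keys; it is a demonstration of the arithmetic, not an attack on any particular API.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only.
class CollidingKeys {
    // Builds 2^n distinct strings of length 2n that all share one String.hashCode().
    static List<String> collisions(int n) {
        List<String> out = new ArrayList<>();
        out.add("");
        for (int i = 0; i < n; i++) {
            List<String> next = new ArrayList<>();
            for (String s : out) {
                next.add(s + "Aa"); // "Aa".hashCode() == 2112
                next.add(s + "BB"); // "BB".hashCode() == 2112
            }
            out = next;
        }
        return out;
    }

    public static void main(String[] args) {
        for (String key : collisions(2)) {
            System.out.println(key + " -> " + key.hashCode());
        }
    }
}
```

Since Java 8, HashMap converts heavily colliding bins into balanced trees, which limits the damage for keys such as String, though it does not make the collisions themselves go away.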

I know this has no impact on this PR, just wanted to add to your interesting note, Kevin.

@andy-goryachev-oracle
Contributor Author

If there are no objections, I'd like to integrate this PR on Monday (25 July)

@andy-goryachev-oracle
Contributor Author

/integrate

@openjdk openjdk bot added the sponsor Ready to sponsor label Jul 25, 2022
@openjdk

openjdk bot commented Jul 25, 2022

@andy-goryachev-oracle
Your change (at version 530bdd9) is now ready to be sponsored by a Committer.

@nlisker
Collaborator

nlisker commented Jul 25, 2022

/sponsor

@openjdk

openjdk bot commented Jul 25, 2022

Going to push as commit 075cc80.
Since your change was applied there have been 12 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Jul 25, 2022
@openjdk openjdk bot closed this Jul 25, 2022
@openjdk openjdk bot removed ready Ready to be integrated rfr Ready for review sponsor Ready to sponsor labels Jul 25, 2022
@openjdk

openjdk bot commented Jul 25, 2022

@nlisker @andy-goryachev-oracle Pushed as commit 075cc80.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@andy-goryachev-oracle andy-goryachev-oracle deleted the 8289389.hash branch July 25, 2022 22:17
Labels

integrated Pull request has been integrated

5 participants