New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature request] Reordering of comparison in @EqualsAndHashCode for better performance #1543

Open
stsypanov opened this Issue Dec 19, 2017 · 7 comments

Comments

Projects
None yet
4 participants
@stsypanov

stsypanov commented Dec 19, 2017

Consider simple class:

@EqualsAndHashCode
class Smth {
  Long id;
  List<Long> props;
  String name;
  int count;
}

It generates this code:

public boolean equals(Object o) {
  if (o == this) return true;
  if (!(o instanceof Smth)) return false;
  final Smth other = (Smth) o;
  if (!other.canEqual((Object) this)) return false;
  final Object this$id = this.id;
  final Object other$id = other.id;
  if (this$id == null ? other$id != null : !this$id.equals(other$id)) return false;
  final Object this$props = this.props;
  final Object other$props = other.props;
  if (this$props == null ? other$props != null : !this$props.equals(other$props)) return false;
  final Object this$name = this.name;
  final Object other$name = other.name;
  if (this$name == null ? other$name != null : !this$name.equals(other$name)) return false;
  if (this.count != other.count) return false;
  return true;
}

protected boolean canEqual(Object other) {
  return other instanceof Smth;
}

Then the same code generated by IDEA (for Java 7+):

public boolean equals(Object o) {
    if (this == o) return true;
    if (o == null || getClass() != o.getClass()) return false;
    Smth smth = (Smth) o;
    return count == smth.count &&
            Objects.equals(id, smth.id) &&
            Objects.equals(props, smth.props) &&
            Objects.equals(name, smth.name);
}

IDEA-generated code reorders comparison (primitives and wrappers first) yielding better performance for the case when count or id fields are mutually different and we can immediately return false skipping more complicated comparison of strings and collections.

It'd be nice if Lombok could do the same.

@Maaartinus

This comment has been minimized.

Show comment
Hide comment
@Maaartinus

Maaartinus Dec 20, 2017

Contributor

I'm not sure about the usefulness of such an optimization. Obviously, testing the string before the list is better if only the former differs. It's probably also better in the average real use case, but what if the string is megabytes long and the list differ already in their size?

I always order my fields accordingly; not as an optimization, but placing the simple things first makes sense to me. There's a proposal for generating compareTo, where the fields can be ranked via a special annotation.

Contributor

Maaartinus commented Dec 20, 2017

I'm not sure about the usefulness of such an optimization. Obviously, testing the string before the list is better if only the former differs. It's probably also better in the average real use case, but what if the string is megabytes long and the list differ already in their size?

I always order my fields accordingly; not as an optimization, but placing the simple things first makes sense to me. There's a proposal for generating compareTo, where the fields can be ranked via a special annotation.

@stsypanov

This comment has been minimized.

Show comment
Hide comment
@stsypanov

stsypanov Dec 20, 2017

My proposal is only about primitives and wrappers like Integer (their comparison is in fact comparison of underlying primitives) as you are never sure about other fields of type Object.

stsypanov commented Dec 20, 2017

My proposal is only about primitives and wrappers like Integer (their comparison is in fact comparison of underlying primitives) as you are never sure about other fields of type Object.

@Maaartinus

This comment has been minimized.

Show comment
Hide comment
@Maaartinus

Maaartinus Dec 20, 2017

Contributor

@stsypanov You're right, somehow I "saw" name being placed before props. I may be misreading it again, but the primitive count was placed before the wrapper id and the wrapper wasn't moved before anything as it was already at the top.

Anyway, I guess, the primitive comparison is so cheap that moving primitives up sounds like a good idea. I'd place wrappers next because of the two memory indirections (which may be much more costly than the comparison itself). Everything else then in the last group.

Disclaimer: I'm just a random commenter here, so whatever I say, doesn't matter. ;)

Contributor

Maaartinus commented Dec 20, 2017

@stsypanov You're right, somehow I "saw" name being placed before props. I may be misreading it again, but the primitive count was placed before the wrapper id and the wrapper wasn't moved before anything as it was already at the top.

Anyway, I guess, the primitive comparison is so cheap that moving primitives up sounds like a good idea. I'd place wrappers next because of the two memory indirections (which may be much more costly than the comparison itself). Everything else then in the last group.

Disclaimer: I'm just a random commenter here, so whatever I say, doesn't matter. ;)

@stsypanov

This comment has been minimized.

Show comment
Hide comment
@stsypanov

stsypanov Dec 21, 2017

@Maaartinus your comment is useful as it helped me to clarify what is expected from requested change.

stsypanov commented Dec 21, 2017

@Maaartinus your comment is useful as it helped me to clarify what is expected from requested change.

@victorwss

This comment has been minimized.

Show comment
Hide comment
@victorwss

victorwss Jan 9, 2018

Contributor

Strings are ubiquitous and are reasonably fast to compare if their hashCode()'s do not collide nor happens to be zero. So, I propose that we should use four groups: primitives first, wrappers second, strings third and everything else last.

Contributor

victorwss commented Jan 9, 2018

Strings are ubiquitous and are reasonably fast to compare if their hashCode()'s do not collide nor happens to be zero. So, I propose that we should use four groups: primitives first, wrappers second, strings third and everything else last.

@rspilker

This comment has been minimized.

Show comment
Hide comment
@rspilker

rspilker Jan 10, 2018

Collaborator

I think primitives and primitive wrappers first is a good idea.

Regarding Strings, I'm a bit on the fence here. There could be many other objects that are even faster, and with upcoming value types, if their fields are primitives, they are probably faster. And if we give Strings special treatment, there is no way for a user to optimize their code. Currently, they can do that by reordering their fields.

Collaborator

rspilker commented Jan 10, 2018

I think primitives and primitive wrappers first is a good idea.

Regarding Strings, I'm a bit on the fence here. There could be many other objects that are even faster, and with upcoming value types, if their fields are primitives, they are probably faster. And if we give Strings special treatment, there is no way for a user to optimize their code. Currently, they can do that by reordering their fields.

@Maaartinus

This comment has been minimized.

Show comment
Hide comment
@Maaartinus

Maaartinus Jan 10, 2018

Contributor

@victorwss String are only fast to compare if they hash codes differ. When the hash codes are equal (either by collision or because of the strings being equal), then you're out of luck and need to compare their arrays.

Another problem: In order to use the hash code, you have to call String#hashCode, which possibly has to compute the hash code and that for both strings. This may take much longer than the comparison. Thereafter, they're cached (unless zero), but this may or may not be useful. You could use reflection instead, but that's possibly slow, surely hacky and complicated by compact strings since JDK 9.

Java itself uses neither String.hash nor String.hashCode() in its equals and we're in a worse position as the field is private. So I don't think, we cab do anything fast here.


@rspilker I guess, we could use a benchmark. I'd bet, primitives are the fastest with a large margin as they require no indirection and method call.

Primitive wrappers could IMHO profit most from testing

 Integer a = this.myInt, b = that.myInt;
 if ((a != b) && (a == null || b == null || a.intValue() != b.intValue())) return false;

and avoiding the call.

All objects might profit from testing

 if (a != b && (a == null || b == null || !a.equals(b))) return false;

instead of

 if (a != null ? !a.equals(b) : b != null) return false;

The added tests are redundant, but avoid the indirection and the call in some common cases. The added cost in case the call is necessary is probably negligible (it'd get eliminated completely with inlining, but I guess, this doesn't happen).

Currently, they can do that by reordering their fields.

Sure, imagine a value class starting with an UUID.

Contributor

Maaartinus commented Jan 10, 2018

@victorwss String are only fast to compare if they hash codes differ. When the hash codes are equal (either by collision or because of the strings being equal), then you're out of luck and need to compare their arrays.

Another problem: In order to use the hash code, you have to call String#hashCode, which possibly has to compute the hash code and that for both strings. This may take much longer than the comparison. Thereafter, they're cached (unless zero), but this may or may not be useful. You could use reflection instead, but that's possibly slow, surely hacky and complicated by compact strings since JDK 9.

Java itself uses neither String.hash nor String.hashCode() in its equals and we're in a worse position as the field is private. So I don't think, we cab do anything fast here.


@rspilker I guess, we could use a benchmark. I'd bet, primitives are the fastest with a large margin as they require no indirection and method call.

Primitive wrappers could IMHO profit most from testing

 Integer a = this.myInt, b = that.myInt;
 if ((a != b) && (a == null || b == null || a.intValue() != b.intValue())) return false;

and avoiding the call.

All objects might profit from testing

 if (a != b && (a == null || b == null || !a.equals(b))) return false;

instead of

 if (a != null ? !a.equals(b) : b != null) return false;

The added tests are redundant, but avoid the indirection and the call in some common cases. The added cost in case the call is necessary is probably negligible (it'd get eliminated completely with inlining, but I guess, this doesn't happen).

Currently, they can do that by reordering their fields.

Sure, imagine a value class starting with an UUID.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment