Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change Geometry.equals(Geometry) to have equalsExact semantics #159

Open
dr-jts opened this issue Sep 5, 2017 · 17 comments
Open

Change Geometry.equals(Geometry) to have equalsExact semantics #159

dr-jts opened this issue Sep 5, 2017 · 17 comments

Comments

@dr-jts
Copy link
Contributor

dr-jts commented Sep 5, 2017

A long-standing confusion in the JTS Geometry class is that Geometry.equals(Geometry) has the semantics of topological-equality, rather than exact-equality (which is more usual for Java). This impacts using JTS in collections or frameworks which rely on equals having standard Java semantics (for example, ORMs such as JPA). In most use cases for collections and containers the desire is for exact-equality semantics.

Even more confusingly, to mitigate this the method Geometry.equals(Object) was added, which has exact-equality semantics. This helps ensure that some container classes "do the right thing" of using the faster exact-equality, but it's hard to be sure they are doing this without close inspection.

Geometry.equals(Geometry) should be changed to have exact-equality semantics.

To compute topological-equality the Geometry.equalsTopo(Geometry) method can be used. The use cases for topological-equality seem to be fairly rare, so having a slightly more verbose method name should not be a problem. This causes a slight discrepancy with the standard OGC Spatial Predicate definition, but the benefits of comprehension and safety outweigh this.

When this change is made the existing Geometry.equalsExact(Geometry) method will be redundant and can be deprecated. EDIT: Geometry.equalsExact(Geometry) will be kept, since the name clearly indicates the semantics. equals will delegate to it.

NOTE that this is definitely a breaking API change.

NOTE 2: As pointed out below, this does not affect the implentation of hashCode(), which is currently appropriate for either definition of equality.

@airbreather
Copy link
Contributor

Wouldn't the more painful part be that Geometry.equals(Geometry) uses the semantics of equalsTopo, whereas Geometry.equals(object) uses the semantics of equalsExact?

i.e., assuming JTS works the same as NTS, you ought to wind up with this apparent inconsistency, which seems more confusing (to me) than the exact truth value on either line (especially in real-world scenarios where o2 is only typed as Object by virtue of it being a key in a hash-based collection):

static void tst()
{
    Geometry g1 = createNewGeom();
    Geometry g2 = createSimilarGeom(); /* such that g1.equalsTopo(g2) && !g1.equalsExact(g2) */
    Object o2 = g2;
    g1.equals(g2); /* true */
    g1.equals(o2); /* false */
}

Also, rather than being impossible*, hashCode() is already correct for both topological and "exact" equality: it will always return the same value for two topologically equal instances, therefore it will always return that same value for two "exactly" equal instances. Admittedly, you could invite a "better"** (correct) hashCode() implementation (i.e., likely fewer collisions) by restricting the semantics of Geometry.equals(Object) to "exact" equality, but just using the extents is sufficient.

*In fact, you can always trivially make a correct hashCode() implementation by simply doing return 0;... all implementations of that method which are more complicated than that are (knowingly or otherwise) voluntarily introducing that extra complexity in exchange for improving performance of hash-based collections that key off of that object, not because it's necessary for correctness.

**This actually wouldn't be unambiguously "better" overall, though: in many real-world cases, I expect that getEnvelopeInternal() will already be cached, so it's blindingly fast today... making it slower by, e.g., scanning the entire coordinate data (which I imagine is the most likely alternative) doesn't seem likely to eliminate enough collisions in "interesting" real-world scenarios to pay for itself. YMMV.

@dr-jts
Copy link
Contributor Author

dr-jts commented Oct 17, 2017

Also, rather than being impossible*, hashCode() is already correct for both topological and "exact" equality

Yes, correct. My error. So this is not a driver for this change. Updated the issue to reflect this.

And agreed that the current hashcode based on extent is a good balance of performance and selectivity.

@dr-jts
Copy link
Contributor Author

dr-jts commented Oct 17, 2017

Wouldn't the more painful part be that Geometry.equals(Geometry) uses the semantics of equalsTopo, whereas Geometry.equals(object) uses the semantics of equalsExact?

Yes, exactly. That's why this is a breaking change.

The example you give is why this change would improve the API, by removing the un-obvious difference between equals(Geometry) and equals(Object). Or am I missing your point?

@airbreather
Copy link
Contributor

airbreather commented Oct 17, 2017

The example you give is why this change would improve the API, by removing the un-obvious difference between equals(Geometry) and equals(Object). Or am I missing your point?

I'm missing something in general, I guess.

The way the issue is described, it sounds like the implementation of Geometry.equals(Geometry) needs to change (because of the two methods that "Geometry.equals" could refer to, this is the only one that hasn't been implemented that way for a while now).

But then as it goes on, it justifies itself as though the method to be changed is Geometry.equals(Object) ("frameworks which rely on equals having standard Java semantics" and the (edit: now-deleted) note about hashCode()).

The description doesn't (and definitely didn't) read like "change equals(Geometry) to behave the same way that equals(Object) does", which is why I chose to ask for clarification.

@dr-jts
Copy link
Contributor Author

dr-jts commented Oct 18, 2017

Good point - I will clarify the issue description.

@AndreasWBartels
Copy link

Please do this change, or implement equalsTopo support for GeometryCollection.
The current implemtation throws a IllegalArgumentException on instances of GeometryCollection.

@dr-jts
Copy link
Contributor Author

dr-jts commented Jun 6, 2019

Please do this change, or implement equalsTopo support for GeometryCollection.
The current implemtation throws a IllegalArgumentException on instances of GeometryCollection.

The reason there is no equalsTopo support (and in fact relate in general) for GeometryCollection is that it's too hard to compute the exact topology in certain situations (in particular, lines and polygons all crossing at non-vertices). So this won't happen, at least until some better algorithm comes along.

Will have to think about the effects of this breaking change before implementing it.

@aaime
Copy link

aaime commented Jan 10, 2021

Just stumbled into this by doing some PMD related cleanups in GeoTools.
One of the PMD rules for tests is not to use assertTrue when assertEquals is a logical equivalent, e.g.:
assertEquals(a,b)
is better than:
assertTrue(a.equals(b))
as assertEquals has a more descriptive failure message, showing the two values, while assertTrue gives nothing unless a message is explicitly provided.

Now, doing the migration, with a and b being geometries, ended up changing the semantics of the comparison, from equalsTopo to equalsExact, because the assertEquals method in JUnit uses Object as argument, ends up bidinding the "Geometry.equals(Object)`` implementation. I know the reason why it happens, but still find it confusing.

@dr-jts dr-jts changed the title Change Geometry.equals to have equalsExact semantics Change Geometry.equals(Geometry) to have equalsExact semantics Jan 11, 2021
@dr-jts
Copy link
Contributor Author

dr-jts commented Jan 11, 2021

@aaime yes, that's the kind of crazy-making thing that happens with the current API semantics. I take it then that you would be in favour of this change?

@dr-jts
Copy link
Contributor Author

dr-jts commented Jan 11, 2021

@aaime actually it seems to me to be suspect that the topological equality would be used in a unit test. It is a relatively weak condition, which might mask certain kinds of errors. And in a unit test it should be possible to be more precise about what the expected value is.

@dr-jts
Copy link
Contributor Author

dr-jts commented Jan 11, 2021

@airbreather Are you in favour of this change?

@airbreather
Copy link
Contributor

@airbreather Are you in favour of this change?

Hm. I am in favor of removing the inconsistency that I posted earlier, and if we have to change the semantics of one, then Geometry.equals(Geometry) seems to be the right one to change, even despite the OGC discrepancy.

I wonder if it might not make more sense to deprecate and start working towards removing Geometry.equals(Geometry) entirely? After all, if it's going to work exactly the same as Geometry.equals(object), then I don't see much of a reason to have a separate method for it other than backwards compatibility.

When this change is made the existing Geometry.equalsExact(Geometry) method will be redundant and can be deprecated.

Personally, I prefer the equalsExact spelling, since it makes it clear how the equality check would be carried out.

@dr-jts
Copy link
Contributor Author

dr-jts commented Jan 12, 2021

@airbreather Are you in favour of this change?

Hm. I am in favor of removing the inconsistency that I posted earlier, and if we have to change the semantics of one, then Geometry.equals(Geometry) seems to be the right one to change, even despite the OGC discrepancy.

Good.

I wonder if it might not make more sense to deprecate and start working towards removing Geometry.equals(Geometry) entirely? After all, if it's going to work exactly the same as Geometry.equals(object), then I don't see much of a reason to have a separate method for it other than backwards compatibility.

Agreed. Start with deprecation, then remove.

When this change is made the existing Geometry.equalsExact(Geometry) method will be redundant and can be deprecated.

Personally, I prefer the equalsExact spelling, since it makes it clear how the equality check would be carried out.

Agreed. Will keep equalsExact, since it is clear about what it means. equals will delegate to it.

@aaime
Copy link

aaime commented Jan 12, 2021

I'm in agreement with any move that resolves the inconsistencies, the current situation is hard explain and reason about. The test that failed after migration from assertTrue to assertEquals really needed the equalsTopo behavior, but we can just update the tests to explicitly use that method when the chanage in JTS happens.

@dr-jts
Copy link
Contributor Author

dr-jts commented Jan 12, 2021

The test that failed after migration from assertTrue to assertEquals really needed the equalsTopo behavior, but we can just update the tests to explicitly use that method when the chanage in JTS happens.

Why not update it now, so that it works?

@aaime
Copy link

aaime commented Jan 12, 2021

Cause I already have spent enough time on this. Those upgrading JTS in GeoTools will have a reason to change the tests (they are not many, I'm not setting a major roadblock in front of them).

@jnh5y
Copy link
Contributor

jnh5y commented Jan 12, 2021

@aaime thanks for pointing that out! Hopefully @jodygarnett and I will remember when the time comes!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants