Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API: custom classes for GEOS exceptions #991

Open
jorisvandenbossche opened this issue Sep 25, 2020 · 13 comments
Open

API: custom classes for GEOS exceptions #991

jorisvandenbossche opened this issue Sep 25, 2020 · 13 comments
Milestone

Comments

@jorisvandenbossche
Copy link
Member

Moving an inline discussion to its separate dedicated issue: #983 (comment)

Currently, Shapely has a few custom error classes, such as TopologicalError. Using an example from the tests:

from shapely.geometry import Polygon
p1 = [(339, 346), (459,346), (399,311), (340, 277), (399, 173),
      (280, 242), (339, 415), (280, 381), (460, 207), (339, 346)]
p2 = [(339, 207), (280, 311), (460, 138), (399, 242), (459, 277),
      (459, 415), (399, 381), (519, 311), (520, 242), (519, 173),
      (399, 450), (339, 207)]

>>> Polygon(p1).within(Polygon(p2))
TopologyException: side location conflict at 459 346
Traceback (most recent call last):
  File "/home/joris/miniconda3/envs/DS-python-geospatial/lib/python3.8/site-packages/shapely/predicates.py", line 15, in __call__
    return self.fn(this._geom, other._geom, *args)
  File "/home/joris/miniconda3/envs/DS-python-geospatial/lib/python3.8/site-packages/shapely/geos.py", line 580, in errcheck_predicate
    raise PredicateError("Failed to evaluate %s" % repr(func))
shapely.errors.PredicateError: Failed to evaluate <_FuncPtr object at 0x7f0a641ce1c0>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/joris/miniconda3/envs/DS-python-geospatial/lib/python3.8/site-packages/shapely/geometry/base.py", line 752, in within
    return bool(self.impl['within'](self, other))
  File "/home/joris/miniconda3/envs/DS-python-geospatial/lib/python3.8/site-packages/shapely/predicates.py", line 18, in __call__
    self._check_topology(err, this, other)
  File "/home/joris/miniconda3/envs/DS-python-geospatial/lib/python3.8/site-packages/shapely/topology.py", line 35, in _check_topology
    raise TopologicalError(
shapely.errors.TopologicalError: The operation 'GEOSWithin_r' could not be performed. Likely cause is invalidity of the geometry <shapely.geometry.polygon.Polygon object at 0x7f0a6488e490>

So you can see that here a custom TopologicalError exception class is used with a general message about geometries likely being invalid, but the original GEOS error message is also still printed at the beginning of the stacktrace.

Using pygeos on the same example:

>>> import pygeos
>>> p1 = pygeos.polygons(p1)
>>> p2 = pygeos.polygons(p2)
>>> pygeos.within(p1, p2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/joris/miniconda3/envs/DS-python-geospatial/lib/python3.8/site-packages/pygeos/decorators.py", line 55, in wrapped
    return func(*args, **kwargs)
  File "/home/joris/miniconda3/envs/DS-python-geospatial/lib/python3.8/site-packages/pygeos/predicates.py", line 653, in within
    return lib.within(a, b, **kwargs)
pygeos.GEOSException: TopologyException: side location conflict at 459 346

Here, pygeos currently uses a generic GEOSException class for all exceptions raised directly by GEOS, and then adds the specific error message from GEOS.


So for Shapely 2.0 when moving in pygeos' code (started in #983), we need to decide if we are fine with the current approach in pygeos or if we want to extend it to be more like current Shapely's behaviour.

Personally, I am fine with the approach of pygeos: the error message already indicates it is a Topology related error, so I am not sure a custom exception class gives that much added value.

cc @caspervdw @sgillies @mwtoews @brendan-ward

@jorisvandenbossche
Copy link
Member Author

And to give an example of where Shapely has another custom exception class: parsing WKB/WKT.

>>> from shapely.wkt import loads
>>> loads("nonsense")
ParseException: Unknown type: 'NONSENSE'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/joris/miniconda3/envs/DS-python-geospatial/lib/python3.8/site-packages/shapely/wkt.py", line 10, in loads
    return geos.WKTReader(geos.lgeos).read(data)
  File "/home/joris/miniconda3/envs/DS-python-geospatial/lib/python3.8/site-packages/shapely/geos.py", line 285, in read
    raise WKTReadingError(
shapely.errors.WKTReadingError: Could not create geometry because of errors while reading input.

vs

>>> import pygeos
>>> pygeos.from_wkt("nonsense")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/joris/miniconda3/envs/DS-python-geospatial/lib/python3.8/site-packages/pygeos/io.py", line 160, in from_wkt
    return lib.from_wkt(geometry, **kwargs)
pygeos.GEOSException: ParseException: Unknown type: 'NONSENSE'

@brendan-ward
Copy link
Collaborator

Custom exception classes add value where it is possible to programmatically catch and then respond to them as separate categories or where they add clarity / context to the emitted exception. I'm not seeing much of either case here, so in my opinion, it seems OK to deprecate these.

For topology errors, every time I've hit those, it required manual intervention and some custom workaround - so I wouldn't want to catch and handle this automatically, as distinct from other exceptions.

For WKT errors, the pygeos stacktrace makes it clear that it is coming from from_wkt call, so it doesn't seem like WKTReadingError adds much additional context there. Again, this is one that would require manual inspection and correction rather than being automatically handled.

Some of the other custom errors don't seem like they need to be custom errors. Some seem like ValueError if caught during validation here rather than in GEOS (e.g., DimensionError), others seem like ImportError or RuntimeError (e.g., UnsupportedGEOSVersionError).

@sgillies
Copy link
Contributor

I think about exceptions very differently now, compared to when shapely was written. The most important thing, in my opinion (influenced by Brett Slatkin), is to be able to distinguish between errors in shapely usage and bugs in shapely itself. For the first category we need exceptions that derive from a class in shapely.errors. Bugs in shapely would manifest themselves as builtin Python exceptions or some sort of low-level GEOS exception. Perhaps the pygeos.GEOSException can play that role? We'd aspire to isolate users from pygeos.GEOSException by preventing GEOS methods from being called with args that will lead to GEOS errors and, if we can't do that, by catching this error and re-raising something that users know to handle.

What do you think about that? I'm not wedded to any specific exception class name in shapely 1.x.

@brendan-ward
Copy link
Collaborator

@sgillies If I understand correctly, this means that the custom shapely exceptions would largely be derived from validation in shapely prior to calling internals (pygeos) / GEOS where a validation error would raise pygeos.GEOSException? So instead of returning a more generic ValueError, we'd return a more specific exception type?

There has been some discussion in pygeos of validation prior to calling GEOS, but at that time the preference was to leave validation up to GEOS. Is your instinct to add more validation on the shapely side prior to calling into the internals / GEOS?

👍 on goal of helping users distinguish between incorrect usage and bugs.

@sgillies
Copy link
Contributor

@brendan-ward In the past I've been hesitant to insist on validation, because not checking for GEOS validity can allow some optimization, right? Allowing exceptions to bubble up from lower down has seemed okay in some cases.

What I'm looking for is a state like this:

try:
    something()
except shapely.errors.ShapelyError:
    # fix my own code.
except Exception:
    # open issue in the Shapely tracker

I don't think we necessarily need more validation on the front end. It's a balance.

@jorisvandenbossche
Copy link
Member Author

to be able to distinguish between errors in shapely usage and bugs in shapely itself. For the first category we need exceptions that derive from a class in shapely.errors.

That sounds good to me. But:

Bugs in shapely would manifest themselves as builtin Python exceptions or some sort of low-level GEOS exception. Perhaps the pygeos.GEOSException can play that role?

I would rather say that GEOSException is the first category and not this second?
If you get a GEOSException, that is typically (as far as I understand) a usage bug and not a bug in Shapely. For example, if you pass an invalid string to the WKT parser, then you get a GEOSException, and this is a user error, not a Shapely bug. Or, if you pass an invalid (eg self-intersecting) geometry to some spatial operation that cannot handle this, then you get an exception from GEOS, and again this is a user error and not a bug in Shapely, I think?

I think your code snippet in the comment above looks good, but so for me this shapely.errors.ShapelyError is basically GEOSException (and the naming of this can be changed of course).

We'd aspire to isolate users from pygeos.GEOSException by preventing GEOS methods from being called with args that will lead to GEOS errors and, if we can't do that, by catching this error and re-raising something that users know to handle.

What's the reason to isolate users from GEOS error messages? Is it because you find them not informative enough?
(I would say that the WKT parsing error is very clear; the custom TopologyError from Shapely on the other hand is maybe more informative (it gives a hint for the possible reason, i.e. invalidity of the geometry) than GEOS' TopologyException)

Buts so if I understand correctly, it is more about "adding more context to the error message from GEOS", and not directly about hiding GEOS exceptions. I think we can certainly look into ways to refine the GEOSExceptions a bit more, if needed (eg append with additional message in certain operations)

@sgillies
Copy link
Contributor

sgillies commented Oct 2, 2020

@jorisvandenbossche thank you, yes, I think you are right that often exceptions coming up from GEOS aren't shapely bugs and that the situation isn't so black-and-white as I made it out to be.

Has the pygeos project looked into preserving the types of exceptions raised by GEOS? It seems like the distinction between TopologyException and ParseException (for example) could be nice to keep.

@jorisvandenbossche
Copy link
Member Author

Has the pygeos project looked into preserving the types of exceptions raised by GEOS? It seems like the distinction between TopologyException and ParseException (for example) could be nice to keep.

So internally, in GEOS C++, they have a bunch of different exception types (TopologyException, UnsupportedOperationException, IllegalArgumentException, ParseException, ... eg see https://github.com/libgeos/geos/tree/master/include/geos/util).
But in the C API, this becomes just a general error notice with a message.

So at the moment, we detect GEOS errors (based on return values), and then get the error message from the handler, and then raise a GEOSException with this message.

In theory, it would certainly be possible to have multiple exception classes, and inspect the error message string, and then raise a custom class based on the first word of the error message (the GEOS error message alway start with "...Exception: .."). But since this part is written in C, it's a bit a trade-off with complexity of implementation.

@jorisvandenbossche
Copy link
Member Author

Further thoughts on this?

For now, in the shapely-2.0 branch, I am going with the exceptions as now raised by PyGEOS (so the current GEOSException), but we can of course still change this if we have a clear picture where we want to go (in practice, for now it's mostly updating some tests regarding the expected raised error type/message)

@jorisvandenbossche
Copy link
Member Author

This came up again in #1065 (comment), so would like to revive this to come to some sort of conclusion on this thread.


I think in the above discussion we converged to agreeing that the errors from GEOS which pygeos currently raises as GEOSException exceptions, are errors to be seen by the user (and not notifying bugs in shapely/pygeos). But one question for this is: are we fine with having a single GEOSException class for all those errors, or do we want multiple exception classes?

Concrete example: the master branch of Shapely has the TopologicalError and PredicateError classes that get raised in the errcheck functions in the ctypes wrapper that check the return value from the GEOS function. On the shapely-2.0 branch / in pygeos, those cases always return a GEOSException instead (the return value of the GEOS function is now check in the C code, and there we currently use a single exception class for all GEOS errors).

In theory, also in the C code we could define multiple exception classes, and depending on the function in question, raise a different python error when GEOS errors. I think this is certainly possible, but I am not sure what the added value would be by doing this.

Small example for the current PredicateError. With shapely master:

from shapely.geometry import Polygon
g1 = Polygon([(0, 0), (0, 1), (3, 1), (3, 0), (0, 0)])
g2 = Polygon([(1, -1), (1, 2), (2, 2), (2, -1), (1, -1)])

>>> g1.relate_pattern(g2, 'fail')
IllegalArgumentException: IllegalArgumentException: Should be length 9, is [fail] instead
...
PredicateError: Failed to evaluate <_FuncPtr object at 0x7fb16abb9f40>

and with shapely-2.0 branch:

>>> g1.relate_pattern(g2, 'fail')
...
GEOSException: IllegalArgumentException: IllegalArgumentException: Should be length 9, is [fail] instead

Is it needed that this case (the function being a predicate) returns a specific class? (to be able to specifically catch errors coming from predicates, and not coming from other shapely functions)


The case in #1065 is about voronoi_diagram. Currently this raises a ValueError:

from shapely.wkt import loads as load_wkt
from shapely.ops import voronoi_diagram
poly = load_wkt('POLYGON ((0 0, 0.5 0, 0.5 0.5, 0 0.5, 0 0))')

>>> voronoi_diagram(poly, tolerance=0.6)
IllegalArgumentException: Invalid number of points in LinearRing found 2 - must be 0 or >= 4
...
ValueError: Could not create Voronoi Diagram with the specified inputs. Try running again with default tolerance value.

With pygeos this gives:

>>> pygeos.voronoi_polygons(pygeos.Geometry('POLYGON ((0 0, 0.5 0, 0.5 0.5, 0 0.5, 0 0))'), tolerance=0.6)
...
GEOSException: IllegalArgumentException: Invalid number of points in LinearRing found 2 - must be 0 or >= 4

Here, Shapely adds some useful context / hint to the error message. This is really useful, and we could look into a way to specify an extra message to be added to the GEOS message in the C code as well.
To me, that seems more important than having this raise a custom exception type like VoronoiDiagramError instead of the general GEOSException (you know you are calling the voronoi_diagram function, so does the error message need to indicate that?)

@sgillies
Copy link
Contributor

sgillies commented Jul 6, 2021

@jorisvandenbossche I'd like to reiterate something that I've said before. The advantage of designing shapely so that it only intentionally raises exceptions deriving from ShapelyError is that a user can easily catch all of the intentionally raised exceptions in their code:

except ShapelyError as exc:  # a shapely API method can't return a result
    do_something(exc)
except Exception:  # user is mis-using Python or there is a shapely bug
    do_something_else()

The disadvantage of intentionally raising two different root exceptions (ShapelyError and GEOSException) is that users have to know to catch both classes.

except (ShapelyError, GEOSException) as exc:  # a shapely API method can't return a result
    do_something(exc)
except Exception:  # user is mis-using Python or there is a shapely bug
    do_something_else()

To me this seems unnecessarily complicated. One root exception for this package seems more simple.

How about this?

  • Pygeos continues to raise a single all purpose GEOSException. No change on the pygeos end.
  • Shapely methods catch this exception when they should and re-raise an exception deriving from ShapelyError.

I'm willing to do the work of catching and re-raising. It's worth it.

@jorisvandenbossche
Copy link
Member Author

The disadvantage of intentionally raising two different root exceptions (ShapelyError and GEOSException)

As I mentioned above (#991 (comment)), for me the purpose of ShapelyError and GEOSException is the same (a custom error class indicating an error related to our own methods or the actual spatial operation). Now it are two different exception classes since they are implemented by two packages, but when we merge pygeos into Shapely, this can be a single exception class, and it's just a naming question.

Short term I am not sure it is worth the effort to catch/re-raise this error. Once we actually move the pygeos code into shapely (instead of importing it as we do now in the shapely-2.0 branch), we can simply do a search/replace of the old name. It's only temporary on a development branch that we would be raising two different errors classes.

@sgillies
Copy link
Contributor

sgillies commented Jul 7, 2021

@jorisvandenbossche okay! As long as we're headed toward a single root exception for Shapely, I'm happy. 2.0 can be released with

class ShapelyError(Exception):
    pass

class GEOSException(ShapelyError):
    pass

and if subclassing GEOSException ever becomes useful, we can do that. How does that sound?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants