Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support releasing the GIL in the cython algorithms #226

Open
jorisvandenbossche opened this issue Nov 5, 2020 · 4 comments
Open

Support releasing the GIL in the cython algorithms #226

jorisvandenbossche opened this issue Nov 5, 2020 · 4 comments
Labels
enhancement New feature or request to-shapely This issue needs to be transferred to shapely

Comments

@jorisvandenbossche
Copy link
Member

As a follow-up on #197, we can investigate if we can release the GIL in the cython with similar tricks as in the C code.

In any case, just putting a with nogil: context around the for loop does not work, as cython does not allow accessing elements in an object dtype numpy array in a nogil context.

We might need to discuss this with the cython team about ways to force cython to allow this (in cases the user can guarantee this doesn't give problems, a bit similar like the @cython.boundscheck(False) also disables safety checks giving the developer the responsibility to not violate the assumptions).

I was experimenting with this using a small toy example (that doesn't involve the full GEOS/pygeos setup, and so easier to discuss with cython people as well): an object array of floats (where the actual C floating point value is also accessible on the PyObjec struct, similarly as our GEOS pointer): https://nbviewer.jupyter.org/gist/jorisvandenbossche/b03bfe660d02eb43922cd632f3fa9857

@brendan-ward
Copy link
Contributor

Having something builtin to Cython that allows us to do this would be good.

One potential option - though a bit inefficient - would be to split this into 3 loops:

  1. while holding gil, loop over all GeometryObjects and get GEOS geometries. For those that meet condition of that particular function, add the pointer to the GEOS geometry to a resizeable vector (or malloc'd block of memory if size known in advance)
  2. within a nogil block loop over all GEOS geometries and perform the more computationally expensive GEOS operations on them
  3. then while holding gil again, loop over all GEOS geometries and create GeometryObjects from them, and return these

We'd always have the latter 2 loops (depending on return type), and looping over pointers should be pretty fast, so this seems potentially better than having to hold the gil the entire time.

@brendan-ward
Copy link
Contributor

I found another potential issue: try / except / finally require the gil.

Provided I'm seeing this correctly, this means that we should not declare the exception value for GEOS functions within _geos.pxd, especially anywhere that raising the exception without cleanup would leave dangling GEOS geometries or other allocated memory.

I think the fix is simply to state the exception value in a comment next to the declaration in _geos.pxd and add specific checks for that to the implementation. Unfortunately a bit more verbose, but this should allow blocks that do not operate on Python objects to release the gil.

@brendan-ward
Copy link
Contributor

Correction: it looks fine to declare the exception value for GEOS functions; these do not automatically require use of the gil and are a good signal of what to watch for.

It looks like we can have avoid try / except / finally blocks and keep a higher level function marked with nogil, but when we do need to raise an exception, we check return type ourselves and use a with gil block, e.g.,

cdef void some_func() nogil:
    cdef GEOSGeometry *geom = GEOSFunc_r(handle, geom) # returns NULL on error
    if geom == NULL:
        with gil:
            raise RuntimeError("some error occurred...")

Not sure of the performance implications of this, but the compiler allowed this, and testing it produced the expected RuntimeError

@jorisvandenbossche
Copy link
Member Author

when we do need to raise an exception, we check return type ourselves and use a with gil block

Indeed, that's the typical way to raise exceptions in a nogil block (see also my example linked in the top post where I use the same pattern). This shouldn't have a performance impact, at least not yet for the typical case you care about when no errors are raised, as this with gil is only reached when an error happens.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request to-shapely This issue needs to be transferred to shapely
Projects
None yet
Development

No branches or pull requests

3 participants