Trac #768 - Add GEOSSTRtree_nearest to CAPI #61

Closed
wants to merge 7 commits into
from

3 participants

@dbaston
dbaston commented Feb 13, 2016

https://trac.osgeo.org/geos/ticket/768

This is a necessary precursor to implementing MinimumClearance in GEOS/PostGIS, but useful in its own right.

@dbaston
dbaston commented Feb 13, 2016

@strk how does one run the GEOS unit tests through valgrind? When I intentionally introduce a memory leak, it doesn't show up with valgrind --trace-children=yes ctest.

@strk
Member
strk commented Feb 15, 2016

I had no idea you could use "ctest" (I don't use cmake).
What I do is run the specific test I want via libtool,
for example:

 cd tests/unit
 libtool --mode=execute valgrind ./geos_unit
@dbaston
dbaston commented Feb 15, 2016

Thanks, that did the trick.

@dbaston dbaston initial work for GEOSSTRtree_nearest.
dffba3d
@dbaston
dbaston commented Apr 13, 2016

@strk any thoughts about this pull request?

@mloskot
Member
mloskot commented Apr 14, 2016

@dbaston Since there is no MemCheck config in the CMakeLists.txt, you would have to patch it using CTEST_MEMORYCHECK_* variables, then it would be possible to use CTest MemCheck.

@strk
Member
strk commented Apr 14, 2016

Could you add some documentation about the new CAPI entry ?
A few lines in a comment above the signature in the public
header file would do.

Also, can we avoid the void pointers in the distance function ?

@dbaston
dbaston commented Apr 14, 2016

Yes, some documentation is clearly needed. The void pointers (and the user-provided distance function) are there because we don't know the type of the objects stored in the STRtree (PostGIS stores LWGEOMS in them, for example). This is similar to the use of void pointers in the user-provided callback to GEOSSTRtree_query. If the distance function is NULL, the function assumes the items are GEOS geometries and computes the distance between them.

Looking at this again after quite a bit of time away, I can see that the way I exposed this flexibility in the CAPI is flawed. There's currently no way you could query a tree of LWGEOMS. I think the signature actually needs to be:

extern const void* GEOS_DLL GEOSSTRtree_nearest(
    GEOSSTRtree *tree,
    const GEOSGeometry* envelope,
    const void* item,
    int (*distancefn)(const void* item1, const void* item2, double* distance));

Is there a better way to do it?

@strk
Member
strk commented Apr 14, 2016

On Thu, Apr 14, 2016 at 06:10:32AM -0700, Dan Baston wrote:

extern const void* GEOS_DLL GEOSSTRtree_nearest(
    GEOSSTRtree *tree,
    const GEOSGeometry* envelope,
    const void* item,
    int (*distancefn)(const void* item1, const void* item2, double* distance));

Until I see the documentation I can't tell if there's a better way to
do it :)

Note that I didn't implement the GEOSSTRtree interface myself so that
part is also obscure to me. I guess the tree stores pairs of
"envelopes" and associated arbitrary pointers, and only ever uses
those envelopes to do its operations, is that correct ?

In your suggested signature, is the item parameter the one that
would be passed as first parameter to the distancefn ?
Any reason it should be named "item" if that's the case, rather
than just being an arbitrary "context" for the caller ?
(it could for example also store the GEOS context).

@dbaston
dbaston commented Apr 14, 2016

Yes, item would be passed as the first parameter to distancefn. Perhaps the signature of distancefn could be changed to take something more general, but I tried not to stray from the JTS implementation. Here's a crack at some documentation:

/*
 * Returns the nearest item in the STRtree to the supplied item
 *
 * @param tree the STRtree to search
 * @param item the item with which the tree should be queried
 * @param itemGeom a GEOSGeometry having the bounding box of 'item'.  If 'item' is
 *            itself a GEOSGeometry, it may be provided again here
 * @param distancefn a function that can compute the distance between two items
 *            in the STRtree.  The function should return nonzero in case of error,
 *            and should store the computed distance to the location pointed to by
 *            the 'distance' argument.  If NULL, the default GEOS geometry distance 
 *            function will be used.
 * 
 * @return a const pointer to the nearest item in the tree to 'item', or NULL in
 *            case of exception
 */
extern const void* GEOS_DLL GEOSSTRtree_nearest(GEOSSTRtree *tree,
                                        const void* item,
                                        const GEOSGeometry* itemEnvelope,
                                        int (*distancefn)(const void* item1, const void* item2, double* distance));

@dbaston
dbaston commented Apr 14, 2016

I guess the tree stores pairs of "envelopes" and associated arbitrary pointers, and only ever uses
those envelopes to do its operations, is that correct ?

Yes. It's similar to the JTS implementation, which stores Object.

@strk
Member
strk commented Apr 14, 2016

On Thu, Apr 14, 2016 at 09:04:33AM -0700, Dan Baston wrote:

Yes, item would be passed as the first parameter to distancefn. Perhaps the signature of distancefn could be changed to take something more general, but I tried not to stray from the JTS implementation. Here's a crack at some documentation:

/*
 * Returns the nearest item in the STRtree to the supplied item
 *
 * @param tree the STRtree to search
 * @param item the item with which the tree should be queried
 * @param itemGeom a GEOSGeometry having the bounding box of 'item'.  If 'item' is
 *            itself a GEOSGeometry, it may be provided again here

You meant "itemEnvelope" here, I guess.
"May" or "Must" be provided again ?

 * @param distancefn a function that can compute the distance between two items
 *            in the STRtree.  The function should return nonzero in case of error,
 *            and should store the computed distance to the location pointed to by
 *            the 'distance' argument.  If NULL, the default GEOS geometry distance
 *            function will be used.

Will the default GEOS geometry distance assume that the void pointer
is in effect a GEOSGeometry ?
Should the provided function guarantee an invariant based on the
envelope of each item in the tree ? (like: the distance between the
envelopes of two items cannot be bigger than the distance returned
by the distance function, or something along those lines).

 * @return a const pointer to the nearest item in the tree to 'item', or NULL in
 *            case of exception

Could the function be extended to return the N nearest items ?

@dbaston
dbaston commented Apr 14, 2016 edited

You meant "itemEnvelope" here, I guess.

Yes

"May" or "Must" be provided again ?

Let's say "must", but I think a better solution is to have two signatures (see below)

Will the default GEOS geometry distance assume that the void pointer
is in effect a GEOSGeometry ?

Yes. Again, resolved by having two signatures.

Should the provided function guarantee an invariant based on the
envelope of each item in the tree ?

Presumably, but I've never seen it documented what that invariant is. I've never used anything other than a plain-old distance function.
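To make the question concrete, here is a minimal sketch of the kind of invariant being asked about: the distance between two items' envelopes should not exceed the distance the callback reports for the items themselves. This is an assumption about what a branch-and-bound search would need, not something documented in JTS or GEOS. The `Point` and `Box` types and all function names are hypothetical stand-ins, and squared distances are compared so that the sketch needs no math library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in types; these are not GEOS types. */
typedef struct { double x, y; } Point;
typedef struct { double minx, miny, maxx, maxy; } Box;

static double max2(double a, double b) { return a > b ? a : b; }

/* Squared distance between two axis-aligned boxes; 0 if they touch or overlap. */
static double box_distance_sq(const Box *a, const Box *b)
{
    double dx = max2(0.0, max2(a->minx - b->maxx, b->minx - a->maxx));
    double dy = max2(0.0, max2(a->miny - b->maxy, b->miny - a->maxy));
    return dx * dx + dy * dy;
}

/* Squared Euclidean distance in the callback shape discussed in this thread.
 * Returns nonzero on error, otherwise stores the result in *distance. */
static int point_distance_sq(const void *item1, const void *item2, double *distance)
{
    const Point *p = item1;
    const Point *q = item2;
    double dx, dy;

    assert(distance != NULL);
    if (p == NULL || q == NULL) return 1;
    dx = p->x - q->x;
    dy = p->y - q->y;
    *distance = dx * dx + dy * dy;
    return 0;
}

/* Returns 1 if the envelope distance is a lower bound for the item distance. */
static int lower_bound_holds(const Point *p, const Box *pb,
                             const Point *q, const Box *qb)
{
    double d;
    if (point_distance_sq(p, q, &d) != 0) return 0;
    return box_distance_sq(pb, qb) <= d;
}
```

With envelopes that actually contain their items and a plain Euclidean metric, the check always passes; it can only fail if the envelope or the distance function is inconsistent with the item.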

Could the function be extended to return the N nearest items ?

Unfortunately not, I think that change would have to happen at the JTS level.

Since this is proving to be a bit cumbersome in the most likely case, where the tree is in fact storing GEOSGeometry objects, maybe there should be two functions:

/*
 * Returns the nearest item in the STRtree to the supplied GEOSGeometry
 *
 * @param tree the STRtree to search
 * @param geom the geometry with which the tree should be queried
 * @return a const pointer to the nearest GEOSGeometry in the tree to 'geom', or NULL in
 *            case of exception
 */
extern const GEOSGeometry* GEOS_DLL GEOSSTRtree_nearest(GEOSSTRtree *tree, const GEOSGeometry* geom);

/*
 * Returns the nearest item in the STRtree to the supplied item
 *
 * @param tree the STRtree to search
 * @param item the item with which the tree should be queried
 * @param itemEnvelope a GEOSGeometry having the bounding box of 'item'
 * @param distancefn a function that can compute the distance between two items
 *            in the STRtree.  The function should return nonzero in case of error,
 *            and should store the computed distance to the location pointed to by
 *            the 'distance' argument.
 * @return a const pointer to the nearest item in the tree to 'item', or NULL in
 *            case of exception
 */
extern const void* GEOS_DLL GEOSSTRtree_nearest_generic(GEOSSTRtree *tree,
                                        const void* item,
                                        const GEOSGeometry* itemEnvelope,
                                        int (*distancefn)(const void* item1, const void* item2, double* distance));

@strk
Member
strk commented Apr 15, 2016

/*
 * Returns the nearest item in the STRtree to the supplied GEOSGeometry
 *
 * @param tree the STRtree to search
 * @param geom the geometry with which the tree should be queried
 * @return a const pointer to the nearest GEOSGeometry in the tree to 'geom', or NULL in
 *            case of exception
 */
extern const GEOSGeometry* GEOS_DLL GEOSSTRtree_nearest_generic(GEOSSTRtree *tree, const GEOSGeometry* geom);

An empty tree would then throw an exception ? Or the input geometry ?
The function name, I guess, would be GEOSSTRtree_nearest_geometry ?

/*
 * Returns the nearest item in the STRtree to the supplied item
 *
 * @param tree the STRtree to search
 * @param item the item with which the tree should be queried
 * @param itemEnvelope a GEOSGeometry having the bounding box of 'item'

I'd say that "itemEnvelope" is the object with which the tree is
queried, while "item" (or some other name) is an opaque object
that is passed to the distancefn callback function for each of
the "candidates" (to be better explained?) in the tree ?

 * @param distancefn a function that can compute the distance between two items
 *            in the STRtree.  The function should return nonzero in case of error,
 *            and should store the computed distance to the location pointed to by
 *            the 'distance' argument.
 * @return a const pointer to the nearest item in the tree to 'item', or NULL in
 *            case of exception
 */
extern const void* GEOS_DLL GEOSSTRtree_nearest_generic(GEOSSTRtree *tree,
                                        const void* item,
                                        const GEOSGeometry* itemEnvelope,
                                        int (*distancefn)(const void* item1, const void* item2, double* distance));
@dbaston
dbaston commented Apr 15, 2016

An empty tree would then throw an exception ? Or the input geometry ?

Yes, an empty tree or input geometry should throw an exception and return NULL. I will add test cases for this.

I'd say that "itemEnvelope" is the object with which the tree is
queried, while "item" (or some other name) is an opaque object
that is passed to the distancefn callback function for each of
the "candidates" (to be better explained?) in the tree ?

I don't want to provide too much detail on how the function is executed; the important thing (from a caller's perspective) is that it computes the distance between objects of the type that is stored in the tree, and with which the tree is queried. This is just coming from the JTS interface ItemDistance:

    /**
     * Computes the distance between two items.
     *
     * @param item1
     * @param item2
     * @return the distance between the items
     *
     * @throws IllegalArgumentException if the metric is not applicable to the arguments
     */

I will change the CAPI to have both GEOSSTRtree_nearest and GEOSSTRtree_nearest_generic, and then I'll add a test case using GEOSSTRtree_nearest_generic with a user-defined geometry type. Sound good?

@strk
Member
strk commented Apr 15, 2016
@dbaston
dbaston commented Apr 15, 2016

I'm not understanding the need to provide a context in this particular case, but I'm no expert here and very happy to take your word for it. That said, I don't want to thread a void pointer all the way through this algorithm, breaking similarity with JTS, for what's probably a fringe use case. How about we just forget about supporting a "nearest" operation on anything other than GEOSGeometry? That leaves us with simply

/*
 * Returns the nearest item in the STRtree to the supplied GEOSGeometry
 *
 * @param tree the STRtree to search
 * @param geom the geometry with which the tree should be queried
 * @return a const pointer to the nearest GEOSGeometry in the tree to 'geom', or NULL in
 *            case of exception
 */
extern const GEOSGeometry* GEOS_DLL GEOSSTRtree_nearest(GEOSSTRtree *tree, const GEOSGeometry* geom);

@strk
Member
strk commented Apr 16, 2016
@dbaston
dbaston commented Apr 16, 2016

@strk thanks for the review and analysis. This matches my understanding of the API too. (I'm happy to suggest some docstring comments for the rest of the STRtree functions too, after this is wrapped up).

Looking at this some more, it looks like we can pass void* userdata to the user-provided distance function without "contaminating" the ItemDistance class, as I had thought before. So we can do the two signatures, adding another parameter to the GEOSSTRtree_nearest_generic version:

/*
 * Returns the nearest item in the STRtree to the supplied GEOSGeometry
 *
 * @param tree the STRtree to search
 * @param geom the geometry with which the tree should be queried
 * @return a const pointer to the nearest GEOSGeometry in the tree to 'geom', or NULL in
 *            case of exception
 */
extern const GEOSGeometry* GEOS_DLL GEOSSTRtree_nearest(GEOSSTRtree *tree, const GEOSGeometry* geom);

/*
 * Returns the nearest item in the STRtree to the supplied item
 *
 * @param tree the STRtree to search
 * @param item the item with which the tree should be queried
 * @param itemEnvelope a GEOSGeometry having the bounding box of 'item'
 * @param distancefn a function that can compute the distance between two items
 *            in the STRtree.  The function should return nonzero in case of error,
 *            and should store the computed distance to the location pointed to by
 *            the 'distance' argument.
 * @param userdata optional pointer to arbitrary data; will be passed to distancefn
 *            each time it is called.
 * @return a const pointer to the nearest item in the tree to 'item', or NULL in
 *            case of exception
 */
extern const void* GEOS_DLL GEOSSTRtree_nearest_generic(GEOSSTRtree *tree,
                                                        const void* item,
                                                        const GEOSGeometry* itemEnvelope,
                                                        int (*distancefn)(const void* item1, const void* item2, double* distance, void* userdata),
                                                        void* userdata);
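
As a rough illustration of how a caller might use the extra userdata argument, the sketch below threads a call counter through a distance callback of the shape proposed above. It does not use GEOS at all: `nearest_linear` is a hypothetical linear scan standing in for the tree search, `Point` is a made-up item type, and the callback stores a squared distance to avoid depending on the math library.

```c
#include <assert.h>
#include <stddef.h>

/* Callback shape from the proposal above: nonzero return signals an error. */
typedef int (*distance_fn)(const void *item1, const void *item2,
                           double *distance, void *userdata);

/* Hypothetical item type standing in for a non-GEOS object. */
typedef struct { double x, y; } Point;

/* Squared Euclidean distance; counts its own invocations through userdata. */
static int point_distance_sq(const void *item1, const void *item2,
                             double *distance, void *userdata)
{
    const Point *p = item1;
    const Point *q = item2;
    int *calls = userdata;
    double dx = p->x - q->x;
    double dy = p->y - q->y;

    if (calls != NULL) ++*calls;
    *distance = dx * dx + dy * dy;
    return 0;
}

/* Linear scan standing in for the tree search (an assumption, not the PR's
 * algorithm). Returns NULL for an empty item list or on callback error. */
static const void *nearest_linear(const void *const *items, size_t n,
                                  const void *query, distance_fn fn,
                                  void *userdata)
{
    const void *best = NULL;
    double best_d = 0.0;
    size_t i;

    assert(fn != NULL);
    for (i = 0; i < n; ++i) {
        double d;
        if (fn(query, items[i], &d, userdata) != 0) return NULL;
        if (best == NULL || d < best_d) {
            best = items[i];
            best_d = d;
        }
    }
    return best;
}
```

The query item is always passed as the first callback argument, matching the convention described earlier in this thread; passing NULL as userdata simply disables the counter.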
@strk
Member
strk commented Apr 17, 2016
@dbaston
dbaston commented Apr 18, 2016

OK, I've modified the PR to:

  • add Doxygen comments for the GEOSSTRtree_nearest functions and all other functions having to do with STRtrees
  • correctly handle a nearest query on an empty tree
  • add unit tests showing the use of GEOSSTRtree_nearest with a non-GEOS data type
  • allow user to pass an arbitrary void pointer to distancefn

The motivation for having a GEOSGeometry-specific version of the "nearest" function was that the generic version is a bit confusing: for a GEOSGeometry you'd have to call it as GEOSSTRtree_nearest(tree, g, g, NULL, NULL). I'm fine with taking it out though if you prefer.

@strk
Member
strk commented Apr 19, 2016

From the documentation of the generic function it doesn't look like you can pass NULL as the distance function, right ? So to use a geometry-distance function you should still pass in your own distance function.

If you want to keep the geometry-specific function you should document very carefully that it will assume that the items put in the STRtree objects are in fact geometries.

Please also add tests for passing empty geometries (this one would be also interesting to do for the tree insertion functions, like passing empty envelope).

Any reason not to use a typedef for the callback function ? As it is used for other callbacks, it would sound more consistent.

@dbaston
dbaston commented Apr 19, 2016

If you want to keep the geometry-specific function you should document very carefully that it will assume that the items put in the STRtree objects are in fact geometries.

Added a note to the docstring.

Please also add tests for passing empty geometries (this one would be also interesting to do for the tree insertion functions, like passing empty envelope).

Added.

Any reason not to use a typedef for the callback function ? As it is used for other callbacks, it would sound more consistent.

I remember that I found the typedef of GEOSQueryCallback confusing (as opposed to seeing the expected function signature) when I first used this part of the API a while back, but I think having the function signature described in the docstring mitigates this. I added a typedef for GEOSDistanceCallback.
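
For readers weighing the two styles, here is a small sketch of the same callback parameter written inline and through a typedef. The name `DistanceCallback` and the helper functions are illustrative only; the typedef actually added in this PR is GEOSDistanceCallback, and its exact definition lives in the public header rather than here.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative typedef mirroring the callback shape proposed in this thread. */
typedef int (*DistanceCallback)(const void *item1, const void *item2,
                                double *distance, void *userdata);

/* The callback parameter spelled out inline... */
static int call_inline(int (*fn)(const void *item1, const void *item2,
                                 double *distance, void *userdata),
                       const void *a, const void *b, double *out)
{
    assert(fn != NULL);
    return fn(a, b, out, NULL);
}

/* ...and the same parameter declared through the typedef. */
static int call_typedef(DistanceCallback fn,
                        const void *a, const void *b, double *out)
{
    assert(fn != NULL);
    return fn(a, b, out, NULL);
}

/* Absolute difference of two doubles, used as a trivial distance. */
static int abs_diff(const void *item1, const void *item2,
                    double *distance, void *userdata)
{
    double a = *(const double *)item1;
    double b = *(const double *)item2;

    (void)userdata; /* unused in this example */
    *distance = a > b ? a - b : b - a;
    return 0;
}
```

Both helpers accept the same function; the typedef only shortens the declaration, which is why documenting the expected signature in the docstring matters.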

@strk
Member
strk commented Apr 19, 2016
@dbaston
dbaston commented Apr 19, 2016

My (very limited) understanding is that it only requires C++11 if you want to use it as a template parameter.
http://stackoverflow.com/questions/10979984/are-there-any-penalties-for-defining-a-struct-inside-a-function

@strk
Member
strk commented Apr 19, 2016
@strk
Member
strk commented Apr 19, 2016

Merged as r4184 = 63d2554 (refs/remotes/trunk)
Thanks

@strk strk closed this Apr 19, 2016
@dbaston
dbaston commented Apr 19, 2016

Excellent - thanks for working with me on this!

@strk
Member
strk commented Apr 19, 2016
@strk strk pushed a commit that referenced this pull request Apr 19, 2016
Sandro Santilli Add GEOSSTRtree_nearest API
Includes tests for the new API and pre-existing STRtree API
Closes #768

Patch by Daniel Baston <dbaston@gmail.com>
via #61

git-svn-id: http://svn.osgeo.org/geos/trunk@4184 5242fede-7e19-0410-aef8-94bd7d2200fb
63d2554
@strk
Member
strk commented Nov 10, 2016

The PR was incomplete, can you look at it @dbaston ?
See https://trac.osgeo.org/geos/ticket/796
