-
Notifications
You must be signed in to change notification settings - Fork 786
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RDP filter #3865
Comments
GEOS already has a DouglasPeuckerSimplifier algorithm. Is the RDP algorithm you propose different? As for your second consideration for shortest distances using geodescis and Rhumb lines, these are geographic operations. As far as I know, the common Douglas–Peucker algorithms assume projected 2D Cartesian space, not geographic space. GEOS mostly operates with 2D projected coordinates (occasionally 3D orthogonal coords). There are a few geographic-oriented libraries available. The main one is GeographicLib's C++ implementation. There are some (e.g.) Python bindings with geofun to work with geodesic and Rhumb lines. |
@avitase: I could find this quite useful, e.g. for analysis and design of national border delineations, which are typically given as series of geodesic segments. ArchitectureTo get this into the PROJ code base, I believe that the path of least resistance would be to first and foremost provide the essential "perpendicular distance to geodesic" building block, which fits smoothly into the Coordinate Operation concept, without having to extend the API surface. Perhaps using a syntax like
i.e. defining the geodesic segment using OutputThe next question would be what the output of the rdp-operation should be. Obviously one of the output components would be the distance from the point to the geodesic. But I assume that (depending on your implementation strategy), additional user-useful information could be conveyed. Personally, I could find use for the coordinates of the intersection between the geodesic, and its perpendicular-to-the-input-point, i.e. a 3D output in the format (longitude, latitude, distance). Unit-wise, this would fit with the structure of an ordinary (latitude, longitude, height) tuple, i.e. (angle, angle, length). It would, however, fit poorly with 2D filters like the InversionCan this be trivially inverted, i.e. computing the geographical coordinates of the "point a km from the geodesic segment, which perpendicularly intersects it b km from the segment origin"? RationaleAs already mentioned, implementing the distance-to-geodesic-segment rather than the entire RDP algorithm, has the advantage of not having to extend the PROJ API and architecture: It fits neatly with PROJ's traditional "streaming mode" processing model, where one point is considered at a time, i.e. the coordinate operation building blocks are sequentially served their input points from an external data flow architecture. They do not have any memory of past points passed, and hence cannot carry out the reduction part of the RDP algorithm: This would require a buffering API, taking as input a set of N coordinate tuples, and returning another set of M coordinate tuples. This functionality does not currently exist in PROJ, and it probably would be a poor fit, architecturally and API-wise. So as you already suggest, the "full algorithm" implementation (which is trivial once having your distance-to-geodesic available) fits better into PROJ-wrapping libraries such as pyproj. I agree fully with your statement that it feels more natural to implement this feature once here instead of multiple times in different wrappers General architectural remarkIn general, I believe that a large number of geodetical operations can meaningfully fit into PROJ's point-sequential coordinate operation architecture. In Rust Geodesy, I have e.g. implemented geodesics, radii-of-curvature, and auxiliary latitudes as plain operators. As one of the aims of Rust Geodesy is to be a playground for potential PROJ improvements, these are obviously something I think should be considered for PROJ, and your RDP suggestion is very much in line with this. |
@mwtoews, thank you for your comment! You are right, there are implementations of RDP out there in the wild. However, what I propose here is an implementation that uses a different metric to decide whether a point is (in)significant. For short distances on Earth, I don't expect any difference between an RDP operating on a Mercator projection using the canonical cartesian distance or the accurate geodesic distance as the distance is only used as a threshold value. Still, if the distance between points becomes large, there will be a difference and I don't see any good reason to not use the right metric here. (Whatever right means depends on the context, for example, the genuine movement of a vessel between points could be a Rhumb line, and for a plane, it could be a geodesic.)
I know, The question is whether this is something that should be part of PROJ.
@busstoptaktik, I really like this idea, thanks for the suggestion! In fact, I already have written a library (in Python) that does exactly this: github.com/avitase/geordpy/. There, my naive implementation only needs the (absolute) distance but estimating the distance on the segment actually comes for free and could thus be yielded as well. Once such a distance function is available, implementing the RDP algorithm can indeed be done with a few LoCs.
The inversion is the canonical direct geodesic problem if one can disambiguate the point at +90° and -90°, right? I conjecture that this could be done by assigning a sign to the distance. Open question |
I'm on vacation right now. I'll respond once I've returned (Sept 8). |
I found some answer(s) to my open question(s):
|
Would you accept a pull request that adds an implementation of the RDP algorithm?
An implementation that uses the shortest distance between a point and a geodesic line segment (or alternatively a Rhumb line segment?) is relatively straightforward, but there are some small caveats; hence, having a robust implementation here can save others some time scratching their head in a debugger session. However, I think this feature could come in handy in libraries such as pyproj that wrap PROJ but somehow it feels more natural to implement this feature once here instead of multiple times in different wrappers. What do you think?
The text was updated successfully, but these errors were encountered: