-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
osmdata_sp failure #85
Comments
Reproduce with this: Polygon(cbind(c(0, 0, 1), c(0, 1, 1))) cran/sp@0884691#diff-e2cb7a7e598b8bafc380e1bfcdcb339fR38 It's common in the real world to get shapes that only have as many coordinates as they need (like a triangle), but it's standard in GISy things to close the ring explicitly. I think it's so that you can be sure that you aren't a line, though a line can intersect its start point - there is a kind of rationale but it escapes me. Imagine a polygon attempted with only two coordinates and an automated additional third to close it, the pathologies you need to catch get a bit out of hand. :) At any rate, sp has only recently added this warning, it used to silently "close" by adding the extra coordinate. |
@mdsumner Curious stuff this.
and
so there are then 3! repeated vertices in a four-vertex polygon. (And rownames are crucial in |
that was another pathology I was trying to remember. This stuff was impossible to understand "empirically", the code had to be understood. You remind me too, this was a huge breakthrough for me personally: https://stat.ethz.ch/pipermail/r-sig-geo/2016-March/024214.html It meant that the very simplest case, a triangle with three unique coordinates, would trigger auto-closing and also auto-re-winding in the most confusing way. It's an edge case, but also the first thing anyone tries. To the credit of the authors sp solved a much larger gap at the time, it wasn't ever designed for "componentry", but for various reasons no one's really dug into it much until recently. This "sub-feature identity" (terminology?) of vertices, segments, paths is something that ggplot2/fortify ruffled pretty strongly too, and it's a big missing piece of SF. Is that "need for rownames" a ubiquitous feature in OSM? (I need to get across this OSM stuff) |
You can probably see that I'm not even close to your "very simplest case" yet - i've just dealt with processing **non-**polygons with < 3 unique vertices. That now leaves me free to move on to your 3-vertex cases ... rownames are crucial in OSM because they contain the OSM ids that allow everything to be hierarchically inter-related. It's what distinguishes |
Ah I see - I understand the value, but the actual rownames in R are a bit flaky - I'm terrified of R's base rowname management as a "unique ID". Are they mighty handy because they are robust, or because they "exist long enough" to allow tracing "far enough"? (Consider that this stance was developed pre-dplyr join-nirvana, though - probably rownames are fine, but I was too lost in match/merge etc.). This is why sc committed to memory heavy unique IDs for every row (it was a blocker originally, but I just went with it, to resolve later), it means at least in theory that an arbitrary set of inputs can be combined by rbinding each analogous table together. I see it as a unresolved issue, how to apply and persist unique IDs when they are valuable, but drop/compress them when a structural array-like-index is sufficient (i.e. cbind, or match). |
new |
This might be just a case of ill-formed OSM data, but that's very unlikely, so better check it out ...
The text was updated successfully, but these errors were encountered: