Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

generality of st_cast #110

Closed
mdsumner opened this issue Dec 10, 2016 · 19 comments
Closed

generality of st_cast #110

mdsumner opened this issue Dec 10, 2016 · 19 comments

Comments

@mdsumner
Copy link
Member

mdsumner commented Dec 10, 2016

I've tested a case of casting MULTIPOLYGON to MULTILINESTRING, inserting this into st_cast.sfg

MULTILINESTRING = {chkTp(x, "MULTIPOLYGON"); st_multilinestring(unlist(x, recursive = FALSE))},

This works for what I want, but then breaks the case for LINESTRING to MULTILINESTRING, since each case is exclusive.

Will other coercions be handled different than the current design in st_cast - it needs a wider mapping of type to type - or is st_cast perhaps not the place for this?

@edzer
Copy link
Member

edzer commented Dec 10, 2016

I'd be happy to see this work more general, no reason to stay with the switch. Feel free to send a PR.

@mdsumner
Copy link
Member Author

Ok thanks that's good.

Currently, only the first part is kept for conversion to the single-part types, but another option is to replicate the feature out for every part, though now I think that's not a job for st_cast.

Other related issues,

  • should a linestring be automatically closed if not already on cast to polygon
  • should a polygon closing coordinate be dropped on cast to point
  • should multipoint be interpretable as linestring/polygon

I'll proceed with

  • provide for all possible conversions
  • error on cases that result in loss (e.g. multi-ring multipolygon to polygon, but allow single-ring mpl to polygon)
  • auto-close linestrings turned into polygon
  • not keep the duplicated polygon coordinate on conversion to multipoint (keep it for to linestring obvs.)
  • allow multipoint native order conversion to linestring/polygon (but with a warning)
  • in the case of error on loss, suggest to use alternative function that duplicates the features appropriately

@edzer
Copy link
Member

edzer commented Dec 11, 2016

Sounds good!

@mdsumner
Copy link
Member Author

The new plot defaults are excellent for this!

@edzer
Copy link
Member

edzer commented Dec 11, 2016

Glad you like them!

plot(nc)

rplot001

@edzer
Copy link
Member

edzer commented Dec 11, 2016

Note that we now have st_polygonize and st_linemerge, to help.

@mdsumner
Copy link
Member Author

Oh cool that's great. I'm happy with the framework I have for this now. More in a few days.

@edzer
Copy link
Member

edzer commented Dec 20, 2016

@mdsumner : any news on this one?

@mdsumner
Copy link
Member Author

Nothing useful yet, I got most of the combinations working in test-mode but not ready for integration. I have questions about managing the require-closed argument to the internal poly-builder (MatrixSetSet) - can it be brought up to the user to decide etc? but I'm not able to spend time on that right now. I'm hoping to get back to this before the end of the year.

@edzer
Copy link
Member

edzer commented Dec 21, 2016

You mean we can let sf close a polygon in case it is not closed? Yeah, we could do that; sp does that too. Conversion from polygon to multipoint could also drop the closing (=duplicate) point.

@mdsumner
Copy link
Member Author

mdsumner commented Dec 21, 2016 via email

@edzer
Copy link
Member

edzer commented Dec 21, 2016

Mike, I see the general picture like this:

cast

where:

  • a green double arrow means we can cast back and forth by simply changing the class name, since the representations are identical and will not change (unless a non-closed line gets closed when cast to a polygon; I wouldn't worry about the roundtrip being different since this is what the user asked for; in general, line->polygon has constraints that the user is responsible for)
  • a dashed arrow means that (i) when going to the left, we create more geometries, hence have to duplicate attributes (potentially with a warning for the case the attribute is not constant throughout the geometry, we have a field for this in sf objects - relation_to_geometry - still mostly unused but intended for this case), and (ii) when going to the right, a grouping variable is needed that tells how geometries are to be merged, and a rule (function) how attribute fields are aggregated (given this grouping variable, they can be aggregated using base::aggregate). When going to the left, we could add a such a factor / index as attribute, and use that by default on the way back.
  • the bold dashed line arrow, well, similar properties as the thin dashed one.

Do look at c.sfg, which already does a lot of the arrows to the right.

The 10 other geometries still need to be fitted in, but are in one of the columns of the graph.

@mdsumner
Copy link
Member Author

Thanks for the diagram and discussion, it's very helpful. I've pushed a tentative approach here:

https://github.com/mdsumner/sfr/blob/cast-additions/R/cast-individual.R

I've only done the from-to cases for the six usual suspects (MULTIPOLYGON, MULTILINESTRING, MULTIPOINT, POLYGON, LINESTRING, POINT). I used an override of the internal MtrxSetSet, before I realized there's no checks on polygon sense unless st_polygonize is used (which you pointed out previously). I'm also not doing the one-to-many expansion, I just drop all but the first geometry.

I expect to replace my st_multipolygon and st_polygon calls with st_polygonize on appropriate linestring sets. I see that it is:

  • very strict on non-closing inputs, you need the final repeated coordinate
  • not strict with overlapping lines and internal boundaries (no hole-nesting detection)
## from the first vignette "mpol"
plot(st_polygonize(st_cast(mpol, "MULTILINESTRING")), col = "grey")

I think the code layout I have is reasonable, in terms of a switch for the the output type, dispatched on the generic for the input type. It's pretty easy to see how to modify things from this.

I'll do some more testing and exploring before I PR this, but keen to get any feedback.

I'm currently not sure where st_cast should end, and where a "st_explode" should begin - though it seems you have clear ideas on this already.

@edzer
Copy link
Member

edzer commented Dec 27, 2016

Mike, that looks like great work; it does essentially the conversion of a single geometry to another single geometry. I've been working (not completely finished) on casting sfc_XX objects (geometry sets) with expansion and collapsing, draft here. I source it in an R session, test examples now follow the R functions.

What's missing in the above graph is the GEOMETRY case, where we have a mix of geometry types. That is the case where we typically want to convert each to a single type, so we have a uniform sfc_SOMETHING where SOMETHING is not GEOMETRY, and this is where your code comes in.

@mdsumner
Copy link
Member Author

Great, this is sounding good, it's going to be very powerful! That turned out tidier than I was expecting, but you had already put the scaffolding in to make those methods make sense.

@edzer
Copy link
Member

edzer commented Dec 30, 2016

Could you make a PR with your cast-individual.R?

@etiennebr
Copy link
Member

I'm kind of confused about the difference between GEOMETRY and GEOMETRYCOLLECTION and see both as interchangeable; I'd like to know if I'm wrong. Is there an official policy on which one to prefer (if they are interchangeable) ? I see you didn't include GEOMETRY in the diagram. Should everything be coerced to GEOMETRYCOLLECTION ?

@mdsumner
Copy link
Member Author

mdsumner commented Jan 7, 2017 via email

@edzer
Copy link
Member

edzer commented Jan 7, 2017

Geometry is the the type of a simple feature column when the feature geometries in the column are a mix of types. GeometryCollection is the geometry type of a single feature geometry where this geometry can be anything, including a mix of geometry types. I don't think the standard is clear about whether GeometryCollections can be nested; I assume they can't.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants