Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to go from MULTIPOINT to just point? #114

Closed
Robinlovelace opened this issue Dec 12, 2016 · 21 comments
Closed

How to go from MULTIPOINT to just point? #114

Robinlovelace opened this issue Dec 12, 2016 · 21 comments

Comments

@Robinlovelace
Copy link
Contributor

Reproducible example:

x = st_line_sample(st_transform(ls, 3857), density = c(1/1000, 1/10000)) # one per km, one per 10 km
class(x)
## [1] "sfc_MULTIPOINT" "sfc"   

Is there a function, e.g. as.st_points that could convert this into just points? If not any ideas of the best way to go about creating such a function?

@Robinlovelace
Copy link
Contributor Author

Update - works (with warning) for sp objects:

y = as(x, "Spatial")
plot(y)
SpatialPoints(y)
class       : SpatialPoints 
features    : 112 
extent      : 0, 5565.975, -7.081155e-10, 110823.7  (xmin, xmax, ymin, ymax)
coord. ref. : NA 
Warning message:
In validityMethod(object) :
  duplicate rownames are interpreted by rgeos as MultiPoints; use SpatialMultiPoints to define these; in future sp versions this warning will become an error

@edzer
Copy link
Member

edzer commented Dec 12, 2016

could sfc_MULTIPOINT -> sfc_POINT be part of st_cast? We now see

> st_cast(st_sfc(st_multipoint(matrix(1:6,,2))), "POINT")
Geometry set for 1 feature 
geometry type:  POINT
dimension:      XY
bbox:           xmin: 1 ymin: 4 xmax: 3 ymax: 6
epsg (SRID):    NA
proj4string:    NA
POINT(1 4) 
Warning message:
In chk(x, "MULTIPOINT") :
  object does not have length one: casting causes information loss

but maybe one could return all points? Or an sf object with an integer attribute denoting the original ID (nr) in the sfc_MULTIPOINT?

@Robinlovelace
Copy link
Contributor Author

I think returning all points would be good - I imagine the saved ID could be useful in some instances. Not sure if that reply is useful...

I found a way to stop the warning in sp btw - just remove the repeated matrix row names:

ropensci/stplanr@52734f7#diff-c451334402a5623377b6bfc78472163dR73

@Robinlovelace
Copy link
Contributor Author

Note also: I think I found a bug in sp: spTransform() seems to fail on multipoint objects...

@mdsumner
Copy link
Member

I think this is not a job for st_cast but a new function, because of the implied 1-to-many aspect (and more generally perhaps the many-to-1 possibilties within aggregate operations).

I expect st_cast to work within a mutate() call for example, which would be precluded if st_cast was exploding multis to single rather than "dropping".

@mdsumner
Copy link
Member

mdsumner commented Dec 13, 2016

I would go about it something like this, though there's probably more concise ways. This is all standard R data.frame/list idioms, I'm exploring similar techniques for the general cast case (without changing row numbers at the top level).

The following assumes there's no mixing, the input is all MULTIPOINT.

## mk multipoint worker
mkmpt <- function(n, mean = 0) st_multipoint(matrix(rnorm(n * 2, mean), ncol = 2))
## mk sf worker
mksf <- function(d, gc) {
  stopifnot(nrow(d) == length(gc))
  d[["geometry"]] <- gc
 st_as_sf(d)
}
## drop geom
drop_geom <- function(x) {
  as.data.frame(x)[-match(attr(x, "sf_column"), names(x))]
}

## create a data set
mp <- mksf(data.frame(a = c("a", "b"), b = 1:2), st_sfc(list(mkmpt(10), mkmpt(20, 5))))

## convert multipoint to point
## simple case (it's all multipoint)
mp_to_pt <- function(mp) {
  ## build map of grouping in geometry
 idx <- unlist(lapply(st_geometry(mp), nrow))
 ## initialize replacement object
 ## expand out (by duplicating rows for every individual coordinate)
 d <- drop_geom(mp)[rep(seq_len(nrow(mp)), idx),  ]
 rownames(d) <- NULL
 ## split/explode the multipoints and rebuild 
 d[["geometry"]] <- st_sfc(unlist(lapply(st_geometry(mp), function(a) lapply(split(a, seq_len(nrow(a))), st_point)), recursive = FALSE, use.names = FALSE), 
                          crs = st_crs(mp))

 st_as_sf(d)
}

## do the conversion, MULTIPOINT sf to POINT sf with auto-one-to-many expansion
mp_to_pt(mp)

I think there's a nice case for a general decomposition to rows-per part, but it's got complications - like holes being elevated to islands, so I'm not sure how much of this will belong in sf directly.

@edzer
Copy link
Member

edzer commented Dec 13, 2016

How about an st_un_multi for all MULTI* objects? It could also unfold GEOMETRYCOLLECTIONS one level down. Or st_unwrap, to take a verb?

@Robinlovelace
Copy link
Contributor Author

Both sound good, but I like the one with multi in it better as it makes the link to MULTI* features more clearly. I think st_unmulti is another contender but less clear than your st_un_multi suggestion.

@edzer
Copy link
Member

edzer commented Dec 22, 2016

Here's some draft code that un-multipoints:

st_un_multipoint = function(x) {
    g = st_geometry(x)
    i = rep(seq_len(nrow(x)), sapply(g, nrow))
    x = x[i,]
    st_geometry(x) = st_sfc(do.call(c,
        lapply(g, function(geom) lapply(1:nrow(geom), function(i) st_point(geom[i,])))))
    x$original_geom_id = i
    x
}

# example:
library(sf)
g = st_sfc(st_multipoint(matrix(1:6,,2)),
    st_multipoint(matrix(1:4,,2)),
    st_multipoint(matrix(11:18,,2)))
df = data.frame(a = 1:3, b = 5:7)
st_geometry(df) = g
df

st_un_multipoint(df)

@etiennebr
Copy link
Member

I think this is equivalent to st_points in postgis

@edzer
Copy link
Member

edzer commented Jan 15, 2017

We now have:

> st_cast(x, "POINT")
Geometry set for 112 features 
geometry type:  POINT
dimension:      XY
bbox:           xmin: 0 ymin: -7.081155e-10 xmax: 5565.975 ymax: 110823.7
epsg (SRID):    3857
proj4string:    +proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +wktext +no_defs
First 5 geometries:
POINT(0 501.464607505533)
POINT(0 1504.39382251802)
POINT(0 2507.3230375305)
POINT(0 3510.25225254298)
POINT(0 4513.18146755547)
> ?st_cast
> st_cast(x, "POINT", group_or_split = FALSE)
Geometry set for 2 features 
geometry type:  POINT
dimension:      XY
bbox:           xmin: 0 ymin: -7.081155e-10 xmax: 5565.975 ymax: 501.4646
epsg (SRID):    3857
proj4string:    +proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +wktext +no_defs
POINT(0 501.464607505533)
POINT(5565.97453966368 -7.08115455161362e-10)
Warning messages:
1: In st_cast.MULTIPOINT(X[[i]], ...) : point from first coordinate only
2: In st_cast.MULTIPOINT(X[[i]], ...) : point from first coordinate only

As it is now, st_cast by default does the one-to-many, and even (if you specify ids) many-to-one with an aggregate on the attributes. I will move the latter to a proper aggregate method, but don't see how st_point catches all the one-to-many casts -- I suggest to leave that in.

@Robinlovelace
Copy link
Contributor Author

Robinlovelace commented Mar 29, 2017

Update: this is still not working:

# test multipoint to point conversion
x = st_multipoint(matrix(1:10, ncol = 2))
st_cast(x = x, to = "POINT")
POINT(1 6)
Warning message:
In st_cast.MULTIPOINT(x = x, to = "POINT") :
  point from first coordinate only

@Robinlovelace
Copy link
Contributor Author

Robinlovelace commented Mar 29, 2017

I was expecting 5 points, that can get ids associated. Equivalent sp code:

p = sp::SpatialPoints(matrix(1:10, ncol = 2))

How does one do that in sf?

@mdsumner
Copy link
Member

From scratch:

d <- as.data.frame(matrix(1:10, ncol = 2))
d$id <- 1:5
st_as_sf(d, coords = c("V1", "V2"))

But the casting method also works if it's sfc:

st_cast(st_sfc(x), "POINT")

Compare when the group_or_split is FALSE (this argument is not available to all st_cast methods):

st_cast(st_sfc(x), "POINT", group_or_split = FALSE)

@Robinlovelace
Copy link
Contributor Author

Thanks for the clarification - I missed out the st_sfc part - gradually making sense - seems like a logical system. I'd say this issue is closed then, although not sure about other feature types such as POLYGON. For my use-case this issue is closed.

@Robinlovelace
Copy link
Contributor Author

Question on this @edzer: how to create a POINT sf with 5 features based on this:

library(sf)
set.seed(1985)
m3 = matrix(runif(10), ncol = 2)
mp = st_multipoint(x = m3)
p = st_cast(x = mp, to = "POINT")

At the moment I'm getting this useful warning but no indication of how to make it include all points:

#> Warning in st_cast.MULTIPOINT(x = mp, to = "POINT"): point from first

@edzer
Copy link
Member

edzer commented Jul 24, 2017

You can't create a POINT that includes more than one point. You can create a list of points, though:

p = st_cast(x = st_sfc(mp), to = "POINT")

@Robinlovelace
Copy link
Contributor Author

Thanks - makes sense. This if for some documentation of the new aggregate method you've created, which looks awesome. PR incoming!

@mbedward
Copy link

mbedward commented Aug 4, 2017

Is there a PR, or one in the works, for st_un_multipoint?

@tim-salabim
Copy link
Member

Is st_cast(mp, "POINT") not enough?

@mbedward
Copy link

mbedward commented Aug 4, 2017

My bad - I thought from first reading the discussion that there was a problem with using casting for this case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants