adds runifpoint #1204

Nowosad · 2019-11-24T17:00:55Z

@edzer this PR adds the spatstat::runifpoint function's support to sf. Please take a look at the code and let me know if this implementation is ok. If so, I can add some documentation, etc.

The second thing is to decide which sampling types from spatstat are worth adding - see pages 3 and 4 at https://spatstat.org/resources/spatstatQuickref.pdf for the complete list.

Example of use:

library(sf)
#> Linking to GEOS 3.7.1, GDAL 2.3.2, PROJ 5.2.0

window = read_sf(system.file("shape/nc.shp", package = "sf"))
p1 = st_sample(window, size = 20, type = "runifpoint")
plot(p1)

^{Created on 2019-11-24 by the reprex package (v0.3.0)}

edzer · 2019-11-24T19:53:31Z

This looks attractive, but still needs a bit of thought. Pinging @rubak.

geographic coordinates; with newer maptools, you get

> p1 = st_sample(window, size = 20, type = "runifpoint")
Error in maptools::as.owin.SpatialPolygons(as(x, "Spatial")) : 
  Only projected coordinates may be converted to spatstat class objects

for very good reasons, but we should catch that earlier. Actually, type "random" (sf native) does "proper" sampling on the sphere in that case, can spatstat can do that too somehow, Ege?

naming: "runifpoint" is not a sampling type, but a routine. If . For testing now this would be OK, but once we release this people start counting on that name. The type as used by sf is still "random", but the path taken is through spatstat.
at line 149 we still kick out points outside the window, but spatstat already did that. This may be a costly redundant operation if geometries are complex and point sets large.

edzer · 2019-11-25T10:40:16Z

Further thoughts: spatstat doesn't do anything fancy for uniform random sampling, we want it for the special cases, like type = Strauss, type = StraussHard, type = Thomas and so on. It would be best to set up a single interface to all these cases.

rubak · 2019-11-25T15:35:40Z

Indeed, spatstat doesn't do anything special for uniform sampling, and we don't handle geographic coordinates at all. This has all been postponed to spatstat.sphere which should interface s2 in a spatstat-centric way. However, this still requires some work by me and is stalled at the moment.

For a list of the random point generators in spatstat see the spatstat package help (p. 3-4 in this pdf).

Nowosad · 2019-11-26T13:54:44Z

@edzer
Ad1. Should spatstat "types" return an error for lonlat data? I could add a test for that and stop the code if data is not projected.
Ad2. Could you take a look at the documentation and decide which sampling types should be implemented? I think their names should be the same as in spatstat, but without the r prefix.
Ad3. I think this is not a problem - I return the object in line 146. Therefore line 149 is not executed.

edzer · 2019-11-26T13:59:16Z

Having a return() on line 146 is entirely against the logic of the whole if/else block, that's why I didn't see it. R return()s are the GOTOs of Fortran.

edzer · 2019-11-26T20:58:49Z

Yes, a simple solution would be using type to create the function call, e.g. sth along the lines of

type = "MaternI"
f = eval(parse(text = paste0("spatstat::r", type)))
f(n = 30, ...) # calls rMaternI(n = 30, ...)

which would avoid writing an interface for each and every method. But this would require that the functions share a common interface (at least: parameter names).Each of them seems to have win, but not all of them have an n argument for instance. Do you see a way out here, @rubak?

Also: all rxxx functions seem to have an nsim argument. One could interface that by returning a set of MULTIPOINT geometries, each with a single simulation, when used. (Not sure if that is a good idea.)

rubak · 2019-11-27T11:02:05Z

I think it should be a curated list of simulation functions from spatstat with a direct interface from sf since it otherwise maybe difficult for the user to find relevant functions. Then we also have the option of including a unit test for each function we claim to interface. The sketched way of calling these by @edzer is great. Sometimes I like to start with the documentation when thinking such an interface through:

@param type character; indicates the spatial sampling type; either one of 
\code{random}, \code{hexagonal} and \code{regular} or one of the random 
generators from \pkg{spatstat} listed in the Details section.

#' @details if \code{x} has dimension 2 (polygons) and geographical coordinates 
(long/lat), uniform random sampling on the sphere is applied, see e.g. 
\url{http://mathworld.wolfram.com/SpherePointPicking.html}. 
The \pkg{spatstat} can be used to generate other types of sampling points if 
\code{x} has dimension 2 (polygons) and planar coordinates (not long/lat) by 
using the appropriate \code{type} (see below). In this case arguments \code{size} 
and \code{exact} are ignored and all relevant sampling parameters for 
\pkg{spatstat} should be passed through \code{\dots}. For each sampling type 
below we refer to the relevant help page in \pkg{spatstat} for details on sampling 
parameters. The \pkg{spatstat} function name is the \code{type} argument 
prefixed by \code{r}, e.g., for \code{type="Thomas"} see the help of 
\code{\link[spatstat]{rThomas}}. The available sampling types are:
\describe{
\item{\code{type="point"}}{
Independent point distribution. Fixed number of (non-uniform) independent sampling points.
}
\item{\code{type="poispp"}}{
Poisson process. Random number of independent sampling points.
}
\item{\code{type="Thomas"}}{
Thomas process. Random number of clustered sampling points.
}
\item{\code{type="MatClust"}}{
Matern cluster process. Random number of clustered sampling points.
}
\item{\code{type="MaternI"}}{
Matern type I inhibiton process. Random number of regularly spaced sampling points.
}
\item{\code{type="MaternII"}}{
Matern type II inhibiton process. Random number of regularly spaced sampling points.
}
}

rubak · 2019-11-27T11:07:23Z

This list is not complete, but a starting point. Most simulation functions in spatstat generate a random number of sampling points, so I thought it was better to explicitly say that size is ignored and then use the argument names from spatstat. Let me know what you think.

edzer · 2019-11-27T11:12:21Z

I think this is a good approach: making clear where the docs are to be found. I'm hesitant to add unit tests to sf, because it would test spatstat functionality really; a change in spatstat would then cause a breaking sf.

rubak · 2019-11-27T11:51:55Z

I see your point about the unit tests. What about examples? Are you OK with having an example of some of the spatstat sampling types at the end of the help page?

edzer · 2019-11-27T12:14:59Z

Yes, that makes sense.

rubak · 2019-11-27T13:10:24Z

OK. With the proposal by @edzer I think the spatstat part of the function can be written simply as (plus some checking etc. for geographic coordinates etc.):

st_sample_poly_spatstat <- function(x, ..., type){
  spatstat_fun <- try(get(paste0("r", type), asNamespace("spatstat")), silent = TRUE)
  if(inherits(spatstat_fun, "try-error")){
    stop(paste0("r", type), " is not an exported function from spatstat.")
  }
  rslt <- try(spatstat_fun(..., win = maptools::as.owin.SpatialPolygons(as(x, "Spatial"))), silent = TRUE)
  if(inherits(rslt, "try-error")){
    stop("The spatstat function ", paste0("r", type),
         " did not return a valid result. Consult the help file.\n",
         "Error message from spatstat:\n", rslt)
  }
  return(rslt)
}
x <- sf::st_sfc(sf::st_polygon(list(rbind(c(0,0),c(10,0),c(10,10),c(0,0)))))
pts <- st_sample_poly_spatstat(x, type = "thomas")
#> Error in st_sample_poly_spatstat(x, type = "thomas"): rthomas is not an exported function from spatstat.
pts <- st_sample_poly_spatstat(x, kappa = 1, mu = 10, type = "Thomas")
#> Error in st_sample_poly_spatstat(x, kappa = 1, mu = 10, type = "Thomas"): The spatstat function rThomas did not return a valid result. Consult the help file.
#> Error message from spatstat:
#> Error : 'scale' should be a single number
pts <- st_sample_poly_spatstat(x, kappa = 1, mu = 10, scale = 0.1, type = "Thomas")
plot(pts)

This also allows the advanced user to call any random generator from spatstat which has argument win, but I still think we should list the relevant ones and let this be a hidden feature which may be removed at any point.

Nowosad · 2019-12-08T09:20:36Z

@edzer, I have tried to add spatstat sampling to be inside of the st_sample() function, but I have failed. You can see my code at fec8a53 and the results below. The problem is probably with ..., which goes from one function to another inside sf, and then to some other in spatstat...

library(sf)
#> Linking to GEOS 3.7.1, GDAL 2.3.2, PROJ 5.2.0

x <- sf::st_sfc(sf::st_polygon(list(rbind(c(0, 0), c(10, 0), c(10, 10), c(0, 0)))))
plot(x)

# error expected
pts <- st_sample(x, type = "thomas")
#> Error in st_poly_sample(x, ..., size, type = type): sampling type thomas not implemented for polygons

# error expected
pts <- st_sample(x, kappa = 1, mu = 10, type = "Thomas")
#> Error in st_poly_sample(x, ..., size, type = type): The spatstat function rThomas did not return a valid result. Consult the help file.
#> Error message from spatstat:
#> Error : 'scale' should be a single number

# points expected
pts <- st_sample(x, kappa = 1, mu = 10, scale = 0.1, type = "Thomas")
#> Error in st_poly_sample(x, ..., size, type = type): The spatstat function rThomas did not return a valid result. Consult the help file.
#> Error message from spatstat:
#> Error in st_poly_sample(x, ..., size, type = type) : 
#>   argument "size" is missing, with no default
plot(pts)
#> Error in plot(pts): object 'pts' not found

^{Created on 2019-12-08 by the reprex package (v0.3.0)}

Nowosad · 2019-12-08T09:22:03Z

Alternative approach would be to add a new function st_sample_spatstat() for samplings using spatstat. It is a slightly modified version of the function by @rubak. Let me know what you think @edzer.

st_sample_spatstat = function(x, ..., type){
    if (!requireNamespace("spatstat", quietly = TRUE)){
        stop("package spatstat required, please install it first")
    }
    spatstat_fun = try(get(paste0("r", type), asNamespace("spatstat")), silent = TRUE)
    if(inherits(spatstat_fun, "try-error")){
        stop(paste0("r", type), " is not an exported function from spatstat.")
    }
    if (!requireNamespace("maptools", quietly = TRUE)){
        stop("package maptools required, please install it first")
    }
    spatstat_fun = try(get(paste0("r", type), asNamespace("spatstat")), silent = TRUE)
    rslt = try(spatstat_fun(..., win = maptools::as.owin.SpatialPolygons(as(x, "Spatial"))), silent = TRUE)
    if(inherits(rslt, "try-error")){
        stop("The spatstat function ", paste0("r", type),
             " did not return a valid result. Consult the help file.\n",
             "Error message from spatstat:\n", rslt)
    }
    return(rslt)
}

library(sf)
#> Linking to GEOS 3.7.1, GDAL 2.3.2, PROJ 5.2.0

x <- sf::st_sfc(sf::st_polygon(list(rbind(c(0, 0), c(10, 0), c(10, 10), c(0, 0)))))
plot(x)

# error expected
pts <- st_sample_spatstat(x, type = "thomas")
#> Error in st_sample_spatstat(x, type = "thomas"): rthomas is not an exported function from spatstat.

# error expected
pts <- st_sample_spatstat(x, kappa = 1, mu = 10, type = "Thomas")
#> Error in st_sample_spatstat(x, kappa = 1, mu = 10, type = "Thomas"): The spatstat function rThomas did not return a valid result. Consult the help file.
#> Error message from spatstat:
#> Error : 'scale' should be a single number

# points expected
pts <- st_sample_spatstat(x, kappa = 1, mu = 10, scale = 0.1, type = "Thomas")
plot(pts)

^{Created on 2019-12-08 by the reprex package (v0.3.0)}

rubak · 2019-12-08T09:36:10Z

I like the idea of simply having an exported function st_sample_spatstat to call spatstat samplers from sf rather than wrapping it in st_sample. It makes the documentation more straightforward and I just think there should be a link to this new function in the help of st_sample.

edzer · 2019-12-08T12:12:10Z

I don't think that the missing size error justifies introducing a new function, but should be solved. This alternative gives us

library(sf)
# Linking to GEOS 3.8.0, GDAL 3.0.2, PROJ 6.2.1
#> Linking to GEOS 3.7.1, GDAL 2.3.2, PROJ 5.2.0

x <- sf::st_sfc(sf::st_polygon(list(rbind(c(0, 0), c(10, 0), c(10, 10), c(0, 0)))))
s = st_sample(x, 10)

# error expected
try(pts <- st_sample(x, type = "thomas"))
# Error in st_poly_sample(x, size = size, ..., type = type) : 
#   rthomas is not an exported function from spatstat.

# error expected
try(pts <- st_sample(x, kappa = 1, mu = 10, type = "Thomas"))
# Error in st_poly_sample(x, size = size, ..., type = type) : 
#   The spatstat function rThomas did not return a valid result. Consult the help file.
# Error message from spatstat:
# Error : ‘scale’ should be a single number


# points expected
pts <- st_sample(x, kappa = 1, mu = 10, scale = 0.1, type = "Thomas")
plot(x)
plot(pts, add = TRUE)

giving the plot generated above by @Nowosad .

There's a couple of things to do or to discuss:

check what happens if an empty set is returned
~~we could convert size into the intensity (or n) argument of the spatstat r* function, in case that is not specified but size is (although that doesn't seem to have a constant naming)~~
remove the dependency on maptools and sp (write sfc -> owin converter)
interface sampling on lines
adding docs
adding tests

rubak · 2019-12-09T08:28:55Z

we could convert size into the intensity (or n) argument of the spatstat r* function, in case that is not specified but size is (although that doesn't seem to have a constant naming)

For many people it would indeed be convenient to specify the average number of points they want rather than the intensity. I considered this at some point, but I don't think it is worth the effort. The problem is that many methods don't have a single concept of intensity, and it may be difficult for us to do the conversion in a generic fashion:

For the Thomas model there is the intensity of the cluster centers kappa and then the average number of points in each cluster mu and the overall intensity is kappa*mu.
For Gibbs models like Hardcore and Strauss the intensity is actually unknown (but can be approximated by saddlepoint methods).

So rather than have a list of spatstat models where size is converted in some way and others where it isn't I think it is easier just to say that the sampling is done as in the spatstat documentation and the sf argument size is ignored.

remove the dependency on maptools and sp (write sfc -> owin converter)

That would indeed make sense, but I won't get around to it in the near future.

interface sampling on lines

Nice to have, but it could also be postponed. I have no strong opinion about this right now.

edzer · 2019-12-30T21:50:22Z

I now (manually) merged this into master. Thanks!

* remove maptools dependency * namespace issues * test results

edzer · 2020-01-04T12:17:17Z

Removed maptools dependency.

adds runifpoint

acb35f1

shows not working spatstat sampling

fec8a53

edzer added a commit that referenced this pull request Dec 8, 2019

alternative to #1204

89da16f

edzer added a commit that referenced this pull request Dec 8, 2019

update to #1204

cb8ae95

edzer added a commit that referenced this pull request Dec 9, 2019

add tests; #1204

f9cadb3

edzer added a commit that referenced this pull request Dec 9, 2019

add documentation; #1204

80d36f7

edzer closed this Dec 30, 2019

edzer added a commit that referenced this pull request Jan 4, 2020

spatstat - sf related; #1204 #1233

ad8e371

* remove maptools dependency * namespace issues * test results

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

adds runifpoint #1204

adds runifpoint #1204

Nowosad commented Nov 24, 2019

edzer commented Nov 24, 2019

edzer commented Nov 25, 2019

rubak commented Nov 25, 2019

Nowosad commented Nov 26, 2019

edzer commented Nov 26, 2019

edzer commented Nov 26, 2019

rubak commented Nov 27, 2019 •

edited

Loading

rubak commented Nov 27, 2019

edzer commented Nov 27, 2019

rubak commented Nov 27, 2019

edzer commented Nov 27, 2019

rubak commented Nov 27, 2019

Nowosad commented Dec 8, 2019

Nowosad commented Dec 8, 2019 •

edited

Loading

rubak commented Dec 8, 2019

edzer commented Dec 8, 2019 •

edited

Loading

rubak commented Dec 9, 2019

edzer commented Dec 30, 2019

edzer commented Jan 4, 2020

adds runifpoint #1204

adds runifpoint #1204

Conversation

Nowosad commented Nov 24, 2019

edzer commented Nov 24, 2019

edzer commented Nov 25, 2019

rubak commented Nov 25, 2019

Nowosad commented Nov 26, 2019

edzer commented Nov 26, 2019

edzer commented Nov 26, 2019

rubak commented Nov 27, 2019 • edited Loading

rubak commented Nov 27, 2019

edzer commented Nov 27, 2019

rubak commented Nov 27, 2019

edzer commented Nov 27, 2019

rubak commented Nov 27, 2019

Nowosad commented Dec 8, 2019

Nowosad commented Dec 8, 2019 • edited Loading

rubak commented Dec 8, 2019

edzer commented Dec 8, 2019 • edited Loading

rubak commented Dec 9, 2019

edzer commented Dec 30, 2019

edzer commented Jan 4, 2020

rubak commented Nov 27, 2019 •

edited

Loading

Nowosad commented Dec 8, 2019 •

edited

Loading

edzer commented Dec 8, 2019 •

edited

Loading