Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adds runifpoint #1204

Closed
wants to merge 2 commits into from
Closed

adds runifpoint #1204

wants to merge 2 commits into from

Conversation

Nowosad
Copy link
Contributor

@Nowosad Nowosad commented Nov 24, 2019

@edzer this PR adds the spatstat::runifpoint function's support to sf. Please take a look at the code and let me know if this implementation is ok. If so, I can add some documentation, etc.

The second thing is to decide which sampling types from spatstat are worth adding - see pages 3 and 4 at https://spatstat.org/resources/spatstatQuickref.pdf for the complete list.

Example of use:

library(sf)
#> Linking to GEOS 3.7.1, GDAL 2.3.2, PROJ 5.2.0

window = read_sf(system.file("shape/nc.shp", package = "sf"))
p1 = st_sample(window, size = 20, type = "runifpoint")
plot(p1)

Created on 2019-11-24 by the reprex package (v0.3.0)

@edzer
Copy link
Member

edzer commented Nov 24, 2019

This looks attractive, but still needs a bit of thought. Pinging @rubak.

  1. geographic coordinates; with newer maptools, you get
> p1 = st_sample(window, size = 20, type = "runifpoint")
Error in maptools::as.owin.SpatialPolygons(as(x, "Spatial")) : 
  Only projected coordinates may be converted to spatstat class objects

for very good reasons, but we should catch that earlier. Actually, type "random" (sf native) does "proper" sampling on the sphere in that case, can spatstat can do that too somehow, Ege?

  1. naming: "runifpoint" is not a sampling type, but a routine. If . For testing now this would be OK, but once we release this people start counting on that name. The type as used by sf is still "random", but the path taken is through spatstat.

  2. at line 149 we still kick out points outside the window, but spatstat already did that. This may be a costly redundant operation if geometries are complex and point sets large.

@edzer
Copy link
Member

edzer commented Nov 25, 2019

Further thoughts: spatstat doesn't do anything fancy for uniform random sampling, we want it for the special cases, like type = Strauss, type = StraussHard, type = Thomas and so on. It would be best to set up a single interface to all these cases.

@rubak
Copy link
Contributor

rubak commented Nov 25, 2019

Indeed, spatstat doesn't do anything special for uniform sampling, and we don't handle geographic coordinates at all. This has all been postponed to spatstat.sphere which should interface s2 in a spatstat-centric way. However, this still requires some work by me and is stalled at the moment.

For a list of the random point generators in spatstat see the spatstat package help (p. 3-4 in this pdf).

@Nowosad
Copy link
Contributor Author

Nowosad commented Nov 26, 2019

@edzer
Ad1. Should spatstat "types" return an error for lonlat data? I could add a test for that and stop the code if data is not projected.
Ad2. Could you take a look at the documentation and decide which sampling types should be implemented? I think their names should be the same as in spatstat, but without the r prefix.
Ad3. I think this is not a problem - I return the object in line 146. Therefore line 149 is not executed.

@edzer
Copy link
Member

edzer commented Nov 26, 2019

Having a return() on line 146 is entirely against the logic of the whole if/else block, that's why I didn't see it. R return()s are the GOTOs of Fortran.

@edzer
Copy link
Member

edzer commented Nov 26, 2019

Yes, a simple solution would be using type to create the function call, e.g. sth along the lines of

type = "MaternI"
f = eval(parse(text = paste0("spatstat::r", type)))
f(n = 30, ...) # calls rMaternI(n = 30, ...)

which would avoid writing an interface for each and every method. But this would require that the functions share a common interface (at least: parameter names).Each of them seems to have win, but not all of them have an n argument for instance. Do you see a way out here, @rubak?

Also: all rxxx functions seem to have an nsim argument. One could interface that by returning a set of MULTIPOINT geometries, each with a single simulation, when used. (Not sure if that is a good idea.)

@rubak
Copy link
Contributor

rubak commented Nov 27, 2019

I think it should be a curated list of simulation functions from spatstat with a direct interface from sf since it otherwise maybe difficult for the user to find relevant functions. Then we also have the option of including a unit test for each function we claim to interface. The sketched way of calling these by @edzer is great. Sometimes I like to start with the documentation when thinking such an interface through:

@param type character; indicates the spatial sampling type; either one of 
\code{random}, \code{hexagonal} and \code{regular} or one of the random 
generators from \pkg{spatstat} listed in the Details section.
#' @details if \code{x} has dimension 2 (polygons) and geographical coordinates 
(long/lat), uniform random sampling on the sphere is applied, see e.g. 
\url{http://mathworld.wolfram.com/SpherePointPicking.html}. 
The \pkg{spatstat} can be used to generate other types of sampling points if 
\code{x} has dimension 2 (polygons) and planar coordinates (not long/lat) by 
using the appropriate \code{type} (see below). In this case arguments \code{size} 
and \code{exact} are ignored and all relevant sampling parameters for 
\pkg{spatstat} should be passed through \code{\dots}. For each sampling type 
below we refer to the relevant help page in \pkg{spatstat} for details on sampling 
parameters. The \pkg{spatstat} function name is the \code{type} argument 
prefixed by \code{r}, e.g., for \code{type="Thomas"} see the help of 
\code{\link[spatstat]{rThomas}}. The available sampling types are:
\describe{
\item{\code{type="point"}}{
Independent point distribution. Fixed number of (non-uniform) independent sampling points.
}
\item{\code{type="poispp"}}{
Poisson process. Random number of independent sampling points.
}
\item{\code{type="Thomas"}}{
Thomas process. Random number of clustered sampling points.
}
\item{\code{type="MatClust"}}{
Matern cluster process. Random number of clustered sampling points.
}
\item{\code{type="MaternI"}}{
Matern type I inhibiton process. Random number of regularly spaced sampling points.
}
\item{\code{type="MaternII"}}{
Matern type II inhibiton process. Random number of regularly spaced sampling points.
}
}

@rubak
Copy link
Contributor

rubak commented Nov 27, 2019

This list is not complete, but a starting point. Most simulation functions in spatstat generate a random number of sampling points, so I thought it was better to explicitly say that size is ignored and then use the argument names from spatstat. Let me know what you think.

@edzer
Copy link
Member

edzer commented Nov 27, 2019

I think this is a good approach: making clear where the docs are to be found. I'm hesitant to add unit tests to sf, because it would test spatstat functionality really; a change in spatstat would then cause a breaking sf.

@rubak
Copy link
Contributor

rubak commented Nov 27, 2019

I see your point about the unit tests. What about examples? Are you OK with having an example of some of the spatstat sampling types at the end of the help page?

@edzer
Copy link
Member

edzer commented Nov 27, 2019

Yes, that makes sense.

@rubak
Copy link
Contributor

rubak commented Nov 27, 2019

OK. With the proposal by @edzer I think the spatstat part of the function can be written simply as (plus some checking etc. for geographic coordinates etc.):

st_sample_poly_spatstat <- function(x, ..., type){
  spatstat_fun <- try(get(paste0("r", type), asNamespace("spatstat")), silent = TRUE)
  if(inherits(spatstat_fun, "try-error")){
    stop(paste0("r", type), " is not an exported function from spatstat.")
  }
  rslt <- try(spatstat_fun(..., win = maptools::as.owin.SpatialPolygons(as(x, "Spatial"))), silent = TRUE)
  if(inherits(rslt, "try-error")){
    stop("The spatstat function ", paste0("r", type),
         " did not return a valid result. Consult the help file.\n",
         "Error message from spatstat:\n", rslt)
  }
  return(rslt)
}
x <- sf::st_sfc(sf::st_polygon(list(rbind(c(0,0),c(10,0),c(10,10),c(0,0)))))
pts <- st_sample_poly_spatstat(x, type = "thomas")
#> Error in st_sample_poly_spatstat(x, type = "thomas"): rthomas is not an exported function from spatstat.
pts <- st_sample_poly_spatstat(x, kappa = 1, mu = 10, type = "Thomas")
#> Error in st_sample_poly_spatstat(x, kappa = 1, mu = 10, type = "Thomas"): The spatstat function rThomas did not return a valid result. Consult the help file.
#> Error message from spatstat:
#> Error : 'scale' should be a single number
pts <- st_sample_poly_spatstat(x, kappa = 1, mu = 10, scale = 0.1, type = "Thomas")
plot(pts)

This also allows the advanced user to call any random generator from spatstat which has argument win, but I still think we should list the relevant ones and let this be a hidden feature which may be removed at any point.

@Nowosad
Copy link
Contributor Author

Nowosad commented Dec 8, 2019

@edzer, I have tried to add spatstat sampling to be inside of the st_sample() function, but I have failed. You can see my code at fec8a53 and the results below. The problem is probably with ..., which goes from one function to another inside sf, and then to some other in spatstat...

library(sf)
#> Linking to GEOS 3.7.1, GDAL 2.3.2, PROJ 5.2.0

x <- sf::st_sfc(sf::st_polygon(list(rbind(c(0, 0), c(10, 0), c(10, 10), c(0, 0)))))
plot(x)

# error expected
pts <- st_sample(x, type = "thomas")
#> Error in st_poly_sample(x, ..., size, type = type): sampling type thomas not implemented for polygons

# error expected
pts <- st_sample(x, kappa = 1, mu = 10, type = "Thomas")
#> Error in st_poly_sample(x, ..., size, type = type): The spatstat function rThomas did not return a valid result. Consult the help file.
#> Error message from spatstat:
#> Error : 'scale' should be a single number

# points expected
pts <- st_sample(x, kappa = 1, mu = 10, scale = 0.1, type = "Thomas")
#> Error in st_poly_sample(x, ..., size, type = type): The spatstat function rThomas did not return a valid result. Consult the help file.
#> Error message from spatstat:
#> Error in st_poly_sample(x, ..., size, type = type) : 
#>   argument "size" is missing, with no default
plot(pts)
#> Error in plot(pts): object 'pts' not found

Created on 2019-12-08 by the reprex package (v0.3.0)

@Nowosad
Copy link
Contributor Author

Nowosad commented Dec 8, 2019

Alternative approach would be to add a new function st_sample_spatstat() for samplings using spatstat. It is a slightly modified version of the function by @rubak. Let me know what you think @edzer.

st_sample_spatstat = function(x, ..., type){
    if (!requireNamespace("spatstat", quietly = TRUE)){
        stop("package spatstat required, please install it first")
    }
    spatstat_fun = try(get(paste0("r", type), asNamespace("spatstat")), silent = TRUE)
    if(inherits(spatstat_fun, "try-error")){
        stop(paste0("r", type), " is not an exported function from spatstat.")
    }
    if (!requireNamespace("maptools", quietly = TRUE)){
        stop("package maptools required, please install it first")
    }
    spatstat_fun = try(get(paste0("r", type), asNamespace("spatstat")), silent = TRUE)
    rslt = try(spatstat_fun(..., win = maptools::as.owin.SpatialPolygons(as(x, "Spatial"))), silent = TRUE)
    if(inherits(rslt, "try-error")){
        stop("The spatstat function ", paste0("r", type),
             " did not return a valid result. Consult the help file.\n",
             "Error message from spatstat:\n", rslt)
    }
    return(rslt)
}

library(sf)
#> Linking to GEOS 3.7.1, GDAL 2.3.2, PROJ 5.2.0

x <- sf::st_sfc(sf::st_polygon(list(rbind(c(0, 0), c(10, 0), c(10, 10), c(0, 0)))))
plot(x)

# error expected
pts <- st_sample_spatstat(x, type = "thomas")
#> Error in st_sample_spatstat(x, type = "thomas"): rthomas is not an exported function from spatstat.

# error expected
pts <- st_sample_spatstat(x, kappa = 1, mu = 10, type = "Thomas")
#> Error in st_sample_spatstat(x, kappa = 1, mu = 10, type = "Thomas"): The spatstat function rThomas did not return a valid result. Consult the help file.
#> Error message from spatstat:
#> Error : 'scale' should be a single number

# points expected
pts <- st_sample_spatstat(x, kappa = 1, mu = 10, scale = 0.1, type = "Thomas")
plot(pts)

Created on 2019-12-08 by the reprex package (v0.3.0)

@rubak
Copy link
Contributor

rubak commented Dec 8, 2019

I like the idea of simply having an exported function st_sample_spatstat to call spatstat samplers from sf rather than wrapping it in st_sample. It makes the documentation more straightforward and I just think there should be a link to this new function in the help of st_sample.

edzer added a commit that referenced this pull request Dec 8, 2019
@edzer
Copy link
Member

edzer commented Dec 8, 2019

I don't think that the missing size error justifies introducing a new function, but should be solved. This alternative gives us

library(sf)
# Linking to GEOS 3.8.0, GDAL 3.0.2, PROJ 6.2.1
#> Linking to GEOS 3.7.1, GDAL 2.3.2, PROJ 5.2.0

x <- sf::st_sfc(sf::st_polygon(list(rbind(c(0, 0), c(10, 0), c(10, 10), c(0, 0)))))
s = st_sample(x, 10)

# error expected
try(pts <- st_sample(x, type = "thomas"))
# Error in st_poly_sample(x, size = size, ..., type = type) : 
#   rthomas is not an exported function from spatstat.

# error expected
try(pts <- st_sample(x, kappa = 1, mu = 10, type = "Thomas"))
# Error in st_poly_sample(x, size = size, ..., type = type) : 
#   The spatstat function rThomas did not return a valid result. Consult the help file.
# Error message from spatstat:
# Error : ‘scale’ should be a single number


# points expected
pts <- st_sample(x, kappa = 1, mu = 10, scale = 0.1, type = "Thomas")
plot(x)
plot(pts, add = TRUE)

giving the plot generated above by @Nowosad .

There's a couple of things to do or to discuss:

  • check what happens if an empty set is returned
  • we could convert size into the intensity (or n) argument of the spatstat r* function, in case that is not specified but size is (although that doesn't seem to have a constant naming)
  • remove the dependency on maptools and sp (write sfc -> owin converter)
  • interface sampling on lines
  • adding docs
  • adding tests

edzer added a commit that referenced this pull request Dec 8, 2019
@rubak
Copy link
Contributor

rubak commented Dec 9, 2019

we could convert size into the intensity (or n) argument of the spatstat r* function, in case that is not specified but size is (although that doesn't seem to have a constant naming)

For many people it would indeed be convenient to specify the average number of points they want rather than the intensity. I considered this at some point, but I don't think it is worth the effort. The problem is that many methods don't have a single concept of intensity, and it may be difficult for us to do the conversion in a generic fashion:

  • For the Thomas model there is the intensity of the cluster centers kappa and then the average number of points in each cluster mu and the overall intensity is kappa*mu.
  • For Gibbs models like Hardcore and Strauss the intensity is actually unknown (but can be approximated by saddlepoint methods).

So rather than have a list of spatstat models where size is converted in some way and others where it isn't I think it is easier just to say that the sampling is done as in the spatstat documentation and the sf argument size is ignored.

remove the dependency on maptools and sp (write sfc -> owin converter)

That would indeed make sense, but I won't get around to it in the near future.

interface sampling on lines

Nice to have, but it could also be postponed. I have no strong opinion about this right now.

edzer added a commit that referenced this pull request Dec 9, 2019
edzer added a commit that referenced this pull request Dec 9, 2019
@edzer
Copy link
Member

edzer commented Dec 30, 2019

I now (manually) merged this into master. Thanks!

@edzer edzer closed this Dec 30, 2019
edzer added a commit that referenced this pull request Jan 4, 2020
* remove maptools dependency
* namespace issues
* test results
@edzer
Copy link
Member

edzer commented Jan 4, 2020

Removed maptools dependency.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants