accepting sf object as input spatial data
rvalavi committed Sep 18, 2018
1 parent bf997cc commit 1ee9b57
Showing 13 changed files with 73 additions and 48 deletions.
Binary file modified .RData
Binary file not shown.
8 changes: 4 additions & 4 deletions .Rhistory
@@ -1,7 +1,3 @@
testTable$pred <- predict(rf, mydata[testSet, ], type="prob")[,2] # predict the test set
AUCs[k] <- as.numeric(auc)
}
for(k in 1:length(folds)){
trainSet <- unlist(folds[[k]][1]) # extract the training set indices
testSet <- unlist(folds[[k]][2]) # extract the testing set indices
rf <- randomForest(Species~., mydata[trainSet, ]) # model fitting on training set
@@ -510,3 +506,7 @@ library(blockCV)
library(blockCV)
library(blockCV)
library(blockCV)
?dismo::gridSample
library(blockCV)
library(blockCV)
library(blockCV)
27 changes: 20 additions & 7 deletions R/blocking.R
@@ -17,7 +17,7 @@
#' records is stored and returned in the \code{records} table. If \code{species = NULL} (no column with 0s and 1s are defined), the procedure is like presence-absence data.
#'
#'
-#' @param speciesData A SpatialPointsDataFrame or SpatialPoints object containing species data.
+#' @param speciesData A SpatialPointsDataFrame, SpatialPoints or sf object containing species data.
#' @param species Character. Indicating the name of the field in which species presence/absence data (0s and 1s) are stored. If \code{speceis = NULL}
#' the presence and absence data will be treated the same and only training and testing records will be counted.
#' @param theRange Numeric value of the specified range by which the training and testing datasets are separated (See \code{\link{spatialAutoRange}}).
@@ -79,9 +79,13 @@
#'
#' }
buffering <- function(speciesData, species=NULL, theRange, spDataType="PA", addBG=TRUE, progress=TRUE){
-if(!methods::is(speciesData, 'SpatialPoints')){stop("speciesData should be SpatialPoints or SpatialPointsDataFrame")}
+if((methods::is(speciesData, "SpatialPoints") || methods::is(speciesData, "sf"))==FALSE){stop("speciesData should be SpatialPointsDataFrame, SpatialPoints or sf object")}
+if(methods::is(speciesData, "sf")){
+sfobj <- speciesData
+} else{
+sfobj <- sf::st_as_sf(speciesData)
+}
speciesData$ID <- 1:length(speciesData)
-sfobj <- sf::st_as_sfc(speciesData)
if(is.null(sf::st_crs(sfobj))){
stop("The coordinate reference system of species data should be defined")
} else if(sp::is.projected(speciesData)){ # this is due to a recent change (Jan 2018) in the sf package that doesn't recognize the projected crs units. It will be fixed soon.
@@ -141,7 +145,7 @@ buffering <- function(speciesData, species=NULL, theRange, spDataType="PA", addB
testSet <- i
foldList[[i]] <- assign(paste0("fold", i), list(trainSet, testSet))
lnPrsences <- length(presences)
-lnAbsence <- length(speciesData) - lnPrsences
+# lnAbsence <- length(speciesData) - lnPrsences
trainPoints <- speciesData[trainSet, ]
trainTestTable$trainPr[i] <- length(trainPoints[trainPoints@data[,species]==1,])
trainTestTable$trainAb[i] <- length(trainPoints[trainPoints@data[,species]!=1,])
@@ -237,7 +241,7 @@ systematicNum <- function(layer, num=5){
#'
#'
#' @inheritParams buffering
-#' @param blocks A SpatialPolygons* object to be used as the blocks. This can be a user defined polygon and it must cover all
+#' @param blocks A SpatialPolygons* or sf object to be used as the blocks. This can be a user defined polygon and it must cover all
#' the species points.
#' @param theRange Numeric value of the specified range by which blocks are created and training/testing data are separated.
#' This distance should be in \strong{metres}. The range could be explored by \code{spatialAutoRange()} and \code{rangeExplorer()} functions.
@@ -252,7 +256,7 @@ systematicNum <- function(layer, num=5){
#' created based on the raster extent, but only those blocks covering species data is kept. The default is \code{TRUE}.
#' @param degMetre Integer. The conversion rate of metres to degree. See the details section for more information.
#' @param rasterLayer RasterLayer for visualisation. If provided, this will be used to specify the blocks covering the area.
-#' @param border SpatialPolygons* to clip the block based on a border. This might increase the computation time.
+#' @param border SpatialPolygons* or sf object to clip the block based on a border. This might increase the computation time.
#' @param showBlocks Logical. If TRUE the final blocks with fold numbers will be plotted. A raster layer could be specified
#' in \code{rasterlayer} argument to be as background.
#' @param biomod2Format Logical. Creates a matrix of folds that can be directly used in the \pkg{biomod2} package as
@@ -337,6 +341,12 @@ spatialBlock <- function(speciesData, species=NULL, blocks=NULL, rasterLayer=NUL
message("k has been set to 2 because of checkerboard fold selection")
}
}
+if(methods::is(speciesData, "sf")){
+speciesData <- sf::as_Spatial(speciesData)
+}
+if(methods::is(border, "sf")){
+border <- sf::as_Spatial(border)
+}
if(is.null(blocks)){
if(is.null(rasterLayer)){
net <- rasterNet(speciesData, resolution=theRange, xbin=cols, ybin=rows, degree=degMetre, xOffset=xOffset, yOffset=yOffset, checkerboard=chpattern)
@@ -367,8 +377,11 @@ spatialBlock <- function(speciesData, species=NULL, blocks=NULL, rasterLayer=NUL
subBlocks <- raster::intersect(blocks, speciesData)
if(!is.null(species)){
species <- NULL
-message("The species has been set to NULL since there is no associated table with SpatialPoints object")
+message("The species argument has been set to NULL since there is no table associated with SpatialPoints object")
}
+} else if(methods::is(blocks, "sf")){
+blocks <- sf::as_Spatial(blocks)
+subBlocks <- raster::intersect(blocks, speciesData)
} else{
stop("The input blocks should be a SpatialPolygons* object")
}
3 changes: 3 additions & 0 deletions R/environBlock.R
@@ -92,6 +92,9 @@ normalize <- function(x){
#' }
#'
envBlock <- function(rasterLayer, speciesData, species=NULL, k=5, standardization="normal", rasterBlock=TRUE, biomod2Format=TRUE, numLimit=0){
+if(methods::is(speciesData, "sf")){
+speciesData <- sf::as_Spatial(speciesData)
+}
if(methods::is(rasterLayer, 'Raster')){
if(raster::nlayers(rasterLayer) >= 1){
foldList <- list()
6 changes: 6 additions & 0 deletions R/explorer.R
@@ -84,6 +84,9 @@ foldExplorer <- function(blocks, rasterLayer, speciesData){
kmax <- length(folds)
species <- blocks$species
# set x and y coordinates
+if(methods::is(speciesData, "sf")){
+speciesData <- sf::as_Spatial(speciesData)
+}
if(is.na(sp::proj4string(speciesData))){
mapext <- raster::extent(speciesData)[1:4]
if(mapext >= -180 && mapext <= 180){
@@ -413,6 +416,9 @@ rangeExplorer <- function(rasterLayer, speciesData=NULL, species=NULL, rangeTabl
scale_fill_gradient2(low="darkred", mid="yellow", high="darkgreen", midpoint=mid) + guides(fill=FALSE) +
xlab(xaxes) + ylab(yaxes)
if(!is.null(speciesData)){
+if(methods::is(speciesData, "sf")){
+speciesData <- sf::as_Spatial(speciesData)
+}
coor <- sp::coordinates(speciesData)
coor <- as.data.frame(coor)
speciesData@data <- cbind(speciesData@data, coor)
5 changes: 4 additions & 1 deletion R/spatialAutoRange.R
@@ -134,7 +134,7 @@ multiplot <- function(..., plotlist=NULL, file, cols=2, layout=NULL) {
#' @param rasterLayer RasterLayer, RasterBrick or RasterStack of covariates to find spatial autocorrelation range.
#' @param sampleNumber Integer. The number of sample points of each raster layer to fit variogram models. It is 5000 by default,
#' however it can be increased by user to represent their region well (relevant to the extent and resolution of rasters).
-#' @param border A SpatialPolygons* for clipping output blocks. This increases the computation time slightly.
+#' @param border A SpatialPolygons* or sf object for clipping output blocks. This increases the computation time slightly.
#' @param showPlots Logical. Show final plot of spatial blocks and autocorrelation ranges.
#' @param maxpixels Number of random pixels to select the blocks over the study area.
#' @param plotVariograms Logical. Plot fitted variograms. This can also be done after the analysis. Set to \code{FALSE} by default.
@@ -304,6 +304,9 @@ spatialAutoRange <- function(rasterLayer, sampleNumber=5000, border=NULL, doPara
subBlocks <- rasterNet(rasterLayer[[1]], resolution=theRange2, degree=degMetre, mask=TRUE, maxpixels =maxpixels)
} else{
net <- rasterNet(rasterLayer[[1]], resolution=theRange2, degree=degMetre, mask=FALSE)
+if(methods::is(border, "sf")){
+border <- sf::as_Spatial(border)
+}
subBlocks <- raster::crop(net, border)
}
if(numLayer>1){
2 changes: 1 addition & 1 deletion man/buffering.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion man/envBlock.Rd


2 changes: 1 addition & 1 deletion man/foldExplorer.Rd


2 changes: 1 addition & 1 deletion man/spatialAutoRange.Rd


6 changes: 3 additions & 3 deletions man/spatialBlock.Rd


4 changes: 2 additions & 2 deletions vignettes/BlockCV_for_SDM.Rmd
@@ -46,7 +46,7 @@ awt <- raster::brick(system.file("extdata", "awt.grd", package = "blockCV"))
```

-The presence absence species data include 116 presence points and 138 absence points. The appropriate format of species data for the **blockCV** package is *SpatialPointsDataFrame*. We convert the data.frame to SpatialPointsDataFrame as follows:
+The presence absence species data include 116 presence points and 138 absence points. The appropriate format of species data for the **blockCV** package is *SpatialPointsDataFrame* (or *sf*). We convert the data.frame to SpatialPointsDataFrame as follows:

```{r, fig.height=4.5, fig.width=7.1}
# import presence-absence species data
@@ -141,7 +141,7 @@ The function *Buffering* generates spatially separated train and test folds by c

When working with *presence-background* (presence and pseudo-absence) data (specified by `spDataType` argument), only presence records are used for specifying the folds. Consider a target presence point. The buffer is defined around this target point, using the specified range (`theRange`). The testing fold comprises the target presence point and all background points within the buffer. Any non-target presence points inside the buffer are excluded. All points (presence and background) outside of buffer are used for training set. The method cycles through all the presence data, so the number of folds is equal to the number of presence points in the dataset.

-For *presence-absence* data, folds are created based on all records, both presences and absences. As above, a target observation (presence or absence) forms a test point. All presence and absence points other than the target point within the buffer are ignored, and the training set comprises all presences and absences outside the buffer. Apart from the folds, the number of *training-presence*, *training-absence*, *testing-presence* and *testing-absence* points is stored and returned in the `records` table. If `species = NULL`, the procedure is like presence-absence data. The `species` argument is the name of the column with 0s (absences or backgrounds) and 1s (presences) in the species SpatialPointsDataFrame file.
+For *presence-absence* data, folds are created based on all records, both presences and absences. As above, a target observation (presence or absence) forms a test point. All presence and absence points other than the target point within the buffer are ignored, and the training set comprises all presences and absences outside the buffer. Apart from the folds, the number of *training-presence*, *training-absence*, *testing-presence* and *testing-absence* points is stored and returned in the `records` table. If `species = NULL`, the procedure is like presence-absence data. The `species` argument is the name of the column with 0s (absences or backgrounds) and 1s (presences) in the species SpatialPointsDataFrame or sf object.
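
The two buffering procedures described above can be sketched in a few lines of base R. This is only an illustration and not the **blockCV** implementation: `bufferFolds()` and its argument layout are hypothetical, the coordinates are assumed to be planar and in metres, distances are plain Euclidean, and treating a point at exactly `theRange` as inside the buffer is an assumption.

```r
# Hypothetical helper: leave-one-out buffering on a two-column x/y matrix.
# Assumes projected coordinates in metres; not the package's own code.
bufferFolds <- function(coords, theRange, species = NULL, spDataType = "PA") {
  coords <- as.matrix(coords)
  n <- nrow(coords)
  dmat <- as.matrix(stats::dist(coords))  # pairwise Euclidean distances

  if (spDataType == "PB") {
    if (is.null(species)) stop("species is required for presence-background data")
    targets <- which(species == 1)        # only presences define folds
  } else {
    targets <- seq_len(n)                 # every record defines a fold
  }

  lapply(targets, function(i) {
    inside <- unname(which(dmat[i, ] <= theRange))  # includes the target itself
    train <- setdiff(seq_len(n), inside)            # everything outside the buffer
    if (spDataType == "PB") {
      # target presence plus background points inside the buffer;
      # other presences inside the buffer are dropped
      test <- c(i, inside[species[inside] != 1])
    } else {
      test <- i                                     # the target record only
    }
    list(train = train, test = test)
  })
}

# Five made-up points along a line, with a 200 m buffer
coords  <- cbind(x = c(0, 100, 250, 1000, 1100), y = c(0, 0, 0, 0, 0))
species <- c(1, 0, 1, 0, 1)

folds    <- bufferFolds(coords, theRange = 200)                  # presence-absence
folds_pb <- bufferFolds(coords, theRange = 200, species = species,
                        spDataType = "PB")                       # presence-background
length(folds)     # one fold per record
length(folds_pb)  # one fold per presence
```

In the presence-absence case each fold tests a single record, so the number of folds equals the number of records; in the presence-background case it equals the number of presences, as the text notes.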

```{r, warning=FALSE, message=FALSE}
# buffering with presence-absence data
54 changes: 27 additions & 27 deletions vignettes/BlockCV_for_SDM.html

Large diffs are not rendered by default.
