-
Notifications
You must be signed in to change notification settings - Fork 0
/
00_KDE_analysis_workflow.Rmd
368 lines (281 loc) · 18.7 KB
/
00_KDE_analysis_workflow.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
---
title: "KDE Caribou"
author: "G Perkins"
date: "April 28, 2019"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
# run libraries
library(lattice)
library(ggplot2)
library(scales)
library(dplyr)
library(xlsx)
library(readxl)
#library(GGally)
library(lubridate)
library(sf)
library(sp)
library(adehabitatHR)
library(ggmap)
library(scales)
library(here)
library(ggmap)
library(tidyr)
# set up working directories
here()
# set up a key
register_google(key = "INSERT KEY HERE", # your Static Maps API key
account_type = "standard")
## load data into R
# <- read.csv("5177_TEC GPS telemetry 2014-ongoing_April2_2019.csv",header = TRUE,stringsAsFactors = TRUE)
twdata <- read.csv("5177_TEC GPS telemetry 2014-ongoing_April18_2019.csv",header = TRUE,stringsAsFactors = TRUE)
# select the columns of interst
twdata<- twdata %>% dplyr::select(Animal.ID, Date.Time,Latitude..ø., Longitude..ø., Height..m.,DOP,FixType,LMT_Time,Month,Year)
twdata <- twdata %>% mutate(Latitude = Latitude..ø., Longitude = Longitude..ø.,Month.0 = Month, Year.0 = Year )
twdata <- twdata[ , -which(names(twdata) %in% c("Year","Month",'Latitude..ø.', 'Longitude..ø.'))]
animal.no <- length(unique(twdata$Animal.ID)) # 45? individuals (with blank)
##############################################################
#### PART 1: DATA EXPLORATION #####
twdata$Date2=as.Date(twdata$Date.Time, format = '%Y-%m-%d')
twdata$Year<- year(twdata$Date2)
twdata$Month <- month(twdata$Date2) # break out DMY columns
twdata$Day <- day(twdata$Date2)
twdata$Date.j <- julian(twdata$Date2)#Add julian day
twdata$Hours <- as.numeric(format(as.POSIXct(strptime(twdata$Date.Time,"%Y-%m-%d %H:%M",tz="")) ,format = "%H"))
# remove data with missing ID no
twdata <- twdata[twdata$Animal.ID != "",]
#if the month and year is missing then fill in wih original dataset values for Month and year
nomonthdata <- twdata %>% filter(is.na(Month)) %>% mutate(Month = Month.0,Year = Year.0); length(nomonthdata$Animal.ID)
twdata <- twdata %>% drop_na(Month); length(twdata$Month)
twdata <- rbind(twdata, nomonthdata)
# group cariou into seasons
twdata <- twdata %>% mutate(season = ifelse(Month %in% c(4,5),"SM",
ifelse(Month == 6,"C",
ifelse(Month %in% c(7,8,9),"S",
ifelse(Month == 10,"R",
ifelse(Month == 11,"FM",
ifelse(Month %in% c(12,1,2,3),"W", NA)))))))
## Add a caribou year (december = following year)
twdata <- twdata %>% mutate(year.C = ifelse(Month %in% c(1:11),Year,
ifelse(Month == 12,Year + 1, NA)))
animal.years <- na.omit(unique(twdata$year.C))
```
## Summary
We used Caribou telemetry data collected from 2015 - 2019 to create Kernal Density Estimates(KDE) for individual animals and all animals combined.KDEs are also calaculated per season and per year to investigate temporal change.
Telemetry data is collected for `r animal.no` from `r min(animal.years)` to `r max(animal.years)`. Seasonal Kernal density estimates were calculated per season as defined by Cichowski (2005). These include: Summer (July to September), Rut (October), Fall migration (November), Winter (December to March), Calving (June), Spring Migration (April and May). Where seasons overlapped years (ie; winter), seasons were grouped into succesive years. For example Winter 2016 was defined from December 2015 to March 2015.
KDEs were calculated using R (package: adeHabitatHR) after investigating the effect of various smoothing parameters and in conjunction with the current literature. All analysis and data are stored here:`r #"I:\ES\General\Wildlife\WildlifeSpecies\Caribou\Tweedsmuircaribou\2015-2019\KHR_Analysis"`. and on Github:TO BE FILLED IN
## Caribou Telemetry Data
The total number of raw fixes varied between individuals with season and years. All KDEs were calculated using a minimum of 50 unique locations as recommended for home range calculations (Seaman 1999, Kernohan 2001). Note telemetry data is yet to be subset by a specific time interval and currently includes all raw fixes.
```{r include=TRUE, echo = FALSE}
# create some summary plots to show the Caribou data
p1 <- ggplot(twdata) + geom_bar(aes(Animal.ID)) + theme(axis.text.x = element_text(angle = 60, hjust = 1)) + ggtitle("Number of Raw fixes per Caribou (Animal.ID)"); p1
# Check the multiple counts of anaimals per day
counts.per.day <- twdata %>% group_by(Animal.ID,Date.j) %>%
summarise(total = n(),unique = unique(Date.j)) %>%
group_by(Animal.ID,total) %>%
summarise(total.j = n())
pp = ggplot(data = counts.per.day, aes(total,total.j,col = Animal.ID)) + geom_point() + ggtitle('Number of fixes per day per animal')
pp
id.season.yr <- twdata %>% group_by(Animal.ID,year.C,season) %>% summarise(count = n()) #; id.season.yr
id.season <- twdata %>% group_by(Animal.ID,season) %>% summarise(count = n()) #; id.season
id.jdate <- twdata %>% group_by(Animal.ID,year.C) %>% summarise(count = length(unique(Date.j))) #; id.jdate
id.total <- twdata %>% group_by(Animal.ID) %>% summarise(count = n()) #; id.total
p3.1 <- ggplot(id.season,aes(x = season,count)) + geom_bar(stat ="identity") + facet_wrap(~Animal.ID) + theme(axis.text.x = element_text(angle = 90)) + ggtitle("Total Telemetry fixes per animal per season "); p3.1
# check the spread of Caribou with maped locations
p7.1 = ggplot(twdata, aes(Longitude,Latitude,colour = season)) + geom_point() #; p7.1
p8.1 = ggplot(twdata, aes(Longitude,Latitude,colour = season)) + facet_wrap(~year.C) + geom_point() +
ggtitle("Distribution of Caribou Telemetry fixes per season per year"); p8.1
```
## Kernel density estimates (KDE)
KDEs are very sensitive to input parameters, specfically the bandwidth (h) which determines the smoothing parameter (https://cran.r-project.org/web/packages/adehabitatHR/vignettes/adehabitatHR.pdf). H parameters can be estimated using three methods 1) a reference bandwidth (h = σ × n ^−1/6), however this is generally an overestimate of the range, and is not suitable for multi-modal distributions. 2) Least Square Cross Validation (LSCV), which minimises the difference in volumne between the true UD and the estimates UD. 3) A subjective visual choice for the smoothing parameter, based on successive trials (Silverman, 1986; Wand & Jones 1995).
To create home ranges for each unique id we used a bivariate normal kernel with a variety of h (smoothing parameters). We then interpreted visually to determine size to use based on successive trials. This is supported by the literature (Hemson et al. 2005; Calenge et al. 2011)
After extensive testing with a variety of h parameters we elected to have a user estimated h paramter. This followed implementation of similar Caribou Analysis conducted by T. Muhly <https://github.com/bcgov/clus/blob/master/R/caribou_habitat/04_caribou_habitat_model_telemetry_data_prep_doc.Rmd>
## Kernal density example plots
The following plots show the difference in smoothing parameters (1000,4000)
```{r, include = TRUE,echo = TRUE, warning=FALSE}
## Calculate Kernal density estiamtes ################
aoi <- unique(twdata$Animal.ID)
yrs.to.map <- unique(twdata$year.C)
seasons.to.map <-unique(twdata$season)
###################################
## Testing with Winter
####################################
#for(i in 1:length(season.to.map)){
i = 1 # winter season
seas <-seasons.to.map[i]
tdf.0 <- twdata %>% filter(season == seas)
## drop animals with less than 5 records
tdf.0$Animal.ID <- droplevels(tdf.0$Animal.ID)
tdf <- tdf.0[, c("Animal.ID", "Longitude", "Latitude")] #; length(tdf$Animal.ID)
# filter for individuals with more than 50 locations
tdf.u <- unique(tdf)
#tdf.u.df <- tdf.u %>% group_by(Animal.ID) %>% summarise (count = n ())
tdf <- tdf.u [tdf.u$Animal.ID %in% names(table(tdf.u$Animal.ID)) [table(tdf.u$Animal.ID) >= 50], ]
tdf$Animal.ID <- droplevels(tdf$Animal.ID)
# Create a SpatialPointsDataFrame by defining the coordinates
coordinates(tdf) <- c("Longitude", "Latitude")
proj4string(tdf) <- CRS( "+proj=longlat +datum=WGS84 +units=m +no_defs" )
tdfgeo <- spTransform(tdf, CRS("+init=epsg:3005")) # Transform to UTM
##################################################################################################
# Test a number of KDE versions: href, h = 1000,2000,4000, LSCV
##############################################################################################
# KDE 2) run kernal density for inidividual animasl with user selected h ref (after testing)
file.oi <- here("outputs",paste("KDE95",seas,"1000id.shp",sep = ""))
if (file.exists(file.oi)) {
ver95.2.sf <- st_read(here("outputs",paste("KDE95",seas,"1000id.shp",sep = "")))}else{
print( "... running KDE this may take a few minutes")
kde2 <- kernelUD(tdfgeo, h = 1000, kern = c("bivnorm"), grid = 500,extent = 2)
ver95.2 <- getverticeshr(kde2,95) ; ver95.2.sf<- st_as_sf(ver95.2)
#ver75.2 <- getverticeshr(kde2,75) ; ver75.2.sf<- st_as_sf(ver75.2)
#ver50.2 <- getverticeshr(kde2,50) ; ver50.2.sf<- st_as_sf(ver50.2)
st_write(ver95.2.sf,here("outputs",paste("KDE95",seas,"1000id.shp",sep = "")))}
plot(st_geometry(ver95.2.sf),col = "yellow")
#plot(st_geometry(ver75.2.sf),col = "blue", add = TRUE)
#plot(st_geometry(ver50.2.sf),col = "purple", add = TRUE)
plot(tdfgeo, pch = 1, size = 0.5, add = TRUE) # Add points
```
Figure 1: h bandwidth = 2000
```{r, include = TRUE,echo = FALSE, warning=FALSE}
##################################################################################################
# KDE 4) run kernal density for inidividual animasl with user selected h ref (after testing)
file.oi <- here("outputs",paste("KDE95",seas,"4000id.shp",sep = ""))
if (file.exists(file.oi)) {
ver95.4.sf <- st_read(here("outputs",paste("KDE95",seas,"4000id.shp",sep = "")))}else{
print( "... running KDE this may take a few minutes")
kde4 <- kernelUD(tdfgeo, h = 1000, kern = c("bivnorm"), grid = 500,extent = 2)
ver95.4 <- getverticeshr(kde2,95) ; ver95.4.sf<- st_as_sf(ver95.4)
st_write(ver95.4.sf,here("outputs",paste("KDE95",seas,"4000id.shp",sep = "")))}
plot(st_geometry(ver95.4.sf),col = "blue")
plot(tdfgeo, pch = 1, size = 0.5, add = TRUE) # Add points
```
Figure 1: h bandwidth = 4000
```{r, echo = FALSE}
#
# ##################################################################################################
# ###################################################################################################
# ## PART 2:
# ## Run KDE for all animals combined
# ##################################################################################################
#
# ## drop animals with less than 5 records
# tdf <- tdf.0[, c("Longitude", "Latitude")] #; length(tdf$Animal.ID)
#
# # Create a SpatialPointsDataFrame by defining the coordinates
# coordinates(tdf) <- c("Longitude", "Latitude")
# proj4string(tdf) <- CRS( "+proj=longlat +datum=WGS84 +units=m +no_defs" )
# tdfgeo <- spTransform(tdf, CRS("+init=epsg:3005")) # Transform to UTM
#
# ##############################################################################################
# # KDE 2) run kernal density for inidividual animasl with user selected h ref (after testing)
# kde2 <- kernelUD(tdfgeo, h = 2000, kern = c("bivnorm"), grid = 500,extent = 2)
# ver95.2 <- getverticeshr(kde2,95) ; ver95.2.sf<- st_as_sf(ver95.2)
# #st_write(ver95.2.sf,here("outputs",paste("KDE95",seas,"2000all.shp",sep = "")))
#
# plot(st_geometry(ver95.2.sf),col = "green", add = TRUE)
# plot(tdfgeo, pch = 1, size = 0.5, add = TRUE) # Add points
#
# ```
#
# Add some desciptive text
#
# ```{r}
# ###################################################################################################
# ## PART 3:
# ## Run KDE for years all animals
# ##################################################################################################
#
# ## drop animals with less than 5 records
# tdf.0$Animal.ID <- droplevels(tdf.0$Animal.ID)
# tdf <- tdf.0[, c("year.C","Longitude", "Latitude")] #; length(tdf$Animal.ID)
#
# # filter for individuals with more than 50 locations
# tdf.u <- unique(tdf)
# years.c.sum <- tdf %>% group_by(year.C) %>% summarise(count = n())
#
# # Create a SpatialPointsDataFrame by defining the coordinates
# coordinates(tdf) <- c("Longitude", "Latitude")
# proj4string(tdf) <- CRS( "+proj=longlat +datum=WGS84 +units=m +no_defs" )
# tdfgeo <- spTransform(tdf, CRS("+init=epsg:3005")) # Transform to UTM
#
# # KDE 5) run kernal density for inidividual animasl with user selected h ref (after testing)
# kde5 <- kernelUD(tdfgeo, h = 2000, kern = c("bivnorm"), grid = 500,extent = 2)
# ver95.5 <- getverticeshr(kde5,95) ; ver95.5.sf<- st_as_sf(ver95.5)
# #st_write(ver95.5.sf,here("outputs","KDE95W2000pyr.shp"))
# #plot(st_geometry(ver95.5.sf),col = "forest green", add = TRUE)
# #plot(st_geometry(ver75.4.sf),col = "blue", add = TRUE)
# #plot(st_geometry(ver50.4.sf),col = "purple", add = TRUE)
# plot(tdfgeo, pch = 1, size = 0.5, add = TRUE) # Add points
#
# ```
#
# ADD SOME DESCRIPTIVE
#
# ```{r}
# ####################################################################################################
# ###################################################################################################
# ## PART 4:
# ## Run KDE all animals per year.
# ##################################################################################################
#
# ## drop animals with less than 5 records
# #tdf.0$Animal.ID <- droplevels(tdf.0$Animal.ID)
# tdf <- tdf.0[, c("Animal.ID","year.C","Longitude", "Latitude")] #; length(tdf$Animal.ID)
#
# # filter for individuals with more than 50 locations
# tdf.u <- unique(tdf)
#
# # get a list of animal id and year when unique locations is >50
# years.c.id.sum <- tdf.u %>% group_by(year.C,Animal.ID) %>% summarise(count = n())
#
# # filter the dataset
# tdf.u <- left_join(tdf.u, years.c.id.sum , by = c('year.C','Animal.ID'))
# tdf.u <- tdf.u[tdf.u$count > 50,] #; unique(tdf.u$count)
#
# yrs.to.map <- unique(tdf.u$year.C )
#
# ## loop this through the years of interest
# for(ii in 4:length(yrs.to.map )){
# #ii = 2 # 2015 or first year
# yrs <-yrs.to.map [ii]
# tdf.yr <- tdf.u %>% filter(year.C == yrs)
# sum.yr <- tdf.yr %>% group_by(year.C,Animal.ID) %>% summarise(count = n())
# tdf <- tdf.yr %>% dplyr::select('Animal.ID','Longitude', 'Latitude')
# tdf$Animal.ID <- droplevels(tdf$Animal.ID)
#
# #sum.yr <- tdf.yr %>% group_by(year.C,Animal.ID) %>% summarise(count = n())
#
# # Create a SpatialPointsDataFrame by defining the coordinates
# coordinates(tdf) <- c("Longitude", "Latitude")
# proj4string(tdf) <- CRS( "+proj=longlat +datum=WGS84 +units=m +no_defs" )
# tdfgeo <- spTransform(tdf, CRS("+init=epsg:3005")) # Transform to UTM
#
# # KDE 5) run kernal density for inidividual animasl with user selected h ref (after testing)
# kde5 <- kernelUD(tdfgeo, h = 2000, kern = c("bivnorm"), grid = 500,extent = 2)
# ver95.5 <- getverticeshr(kde5,95) ; ver95.5.sf<- st_as_sf(ver95.5)
# #st_write(ver95.5.sf,here("outputs", paste("KDE95",seas,"2000ID_",yrs,".shp",sep = "")))
#
# plot(st_geometry(ver95.5.sf),col = "forest green", add = TRUE)
# #plot(st_geometry(ver75.4.sf),col = "blue", add = TRUE)
# #plot(st_geometry(ver50.4.sf),col = "purple", add = TRUE)
# plot(tdfgeo, pch = 1, size = 0.5, add = TRUE) # Add points
#
# }
#
```
### Appendices 1: KDE Parameters in Detail
KDE were used to determine density of individuals. Calculations for KDE require consideration of the mathematical method to describe the density function. A detailed description of the parameters can be found here : (http://www.spatialecology.com/gme/kde.htm).
- Kernal Type: The type of kernal is limited to Gaussian (bivariate normal), quadratic or normal. We used the bivariate normal model as default.
- Bandwidth (h): This determines the level of smoothing used. The two types include "href" and "LSCV"(Least Squared Cross Validation). The bandwidth will be determined by the type of kernal chosen. Some tools and functions are available to estiamte the optimal Bandwidth ie: (ks: Hpi.diag) (https://cran.r-project.org/web/packages/ks/ks.pdf) or to plot optimal bandwidth using LSCV method. (ks:plotLSCV()). Using the Hpi.diag provides estimates of h that can be entered manually.
- Cell size : This determines the area or extent over which the home range will be estimated. This is a mix of fine scale and time consuming processing and faster blocky resolution over a continuous surface. As a rule of thumb you can GME suggests: take the square root of the x or y variance value (whichever is smaller) and divide by 5 or 10 (I usually round to the nearest big number - so 36.7 becomes 40). Before using this rule of thumb value calculate how many cells this will result in for the output (take the width and height of you input points, divide by the cell size, and multiply the resulting numbers together). If you get a value somewhere between 1-20 million, then you have a reasonable value. If you have a value much larger then 20 million cells then consider increasing the cell size (http://www.spatialecology.com/gme/kde.htm).
Note it is also possible to run KDE with set barriers and boundaries .
##References
- Packages: estimate Kernel home range Utiliation Distribution Using adehabitatHR (Calenge et al. 2011) (https://cran.r-project.org/web/packages/adehabitatHR/vignettes/adehabitatHR.pdf)
- KDE : "http://www.spatialecology.com/gme/kde.htm"
- https://www.ckwri.tamuk.edu/sites/default/files/publication/pdfs/2017/leonard_analyzing_wildlife_telemetry_data_in_r.pdf
- Seaman, D. E., Millspaugh, J. J., Kernohan, B. J., Brundige, G. C., Raedeke, K. J., & Gitzen, R. A. (1999). Effects of sample #size on kernel home range estimates. The journal of wildlife management, 739-747.
- Kernohan, B. J., R. A. Gitzen, and J. J. Millspaugh. 2001. Analysis of animal space use and movements. Pages 125–166 in J. J. #Millspaugh and J. M. Marzluff, editors. Radio tracking and animal populations. Academic Press, San Diego, CA, USA
- Hemson, G., Johnson, P., South, A., Kenward, R., Ripley, R., & MACDONALD, D. (2005). Are kernels the mustard? Data from global positioning system (GPS) collars suggests problems for kernel home‐range analyses with least‐squares cross‐validation. Journal of Animal Ecology, 74(3), 455-463
## Data Location.
- Previously Run KDE (Hearnden, B) : "W:\srm\smt\Workarea\BHearnden\Client_Proj\AM\Tweedsmuir_Cariboo"