-
Notifications
You must be signed in to change notification settings - Fork 4
/
4_3_Ptarmigan_Presentation.rmd
436 lines (258 loc) · 14.3 KB
/
4_3_Ptarmigan_Presentation.rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
---
title: "4_3 Willow ptarmigan line transect data"
author: "Prepared by Erlend B. Nilsen, https://orcid.org/0000-0002-5119-8331"
date: ""
output:
ioslides_presentation:
fig_height: 4
fig_width: 7
incremental: yes
smaller: yes
widescreen: yes
widescreen: yes
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE)
```
## Introduction to the use case
<div class="columns-2">
```{r, out.width = "250px", echo=FALSE}
knitr::include_graphics("pictures/logosmall.png")
```
```{r, out.width = "300px", echo=FALSE}
knitr::include_graphics("pictures/HFP_screenshot.png")
```
- The use case covered here is based on a data set recently published on gbif, containing inforamtion from a long-term line transect sampling survey targeting willow ptarmigan (*Lagopus lagopus*)
- The surveys are part of a national program coordinated through the project Hønsefuglportalen [(http://honsefugl.nina.no)](http://honsefugl.nina.no).
- The program is operated in close collaboration between the Norwegian Institute for Nature Research [(NINA; www.nina.no)](www.nina.no), Nord University [(www.nord.no)](www.nord.no), Inland Norway University of Applied Sciences [(www.inn.no)](www.inn.no) as well as landowners responsible for grouse management on their properties (FeFo, Statskog, Fjellstyrene as well as private landowners). The specific part of the project data used here is from surveys carried out in Finnmark county (with FeFo as management authority and data owner).
</div>
## Accessing the data (1)
```{r, out.width = "300px", echo=FALSE}
knitr::include_graphics("pictures/gbif_screenshot.png")
```
```{r, echo = TRUE, message=FALSE, warning=FALSE, tidy=TRUE, tidy.opts=list(width.cutoff=60)}
library(RJSONIO)
library(dplyr)
library(tidyr)
library(stringr)
library(xml2)
datasetID <- "c47f13c1-7427-45a0-9f12-237aad351040"
dataset <- RJSONIO::fromJSON(paste0("http://api.gbif.org/v1/dataset/",datasetID,"/endpoint"))
endpoint_url <- dataset[[1]]$url
download.file(endpoint_url, destfile="data/temp.zip", mode="wb")
unzip ("data/temp.zip", exdir = "data")
```
## Accessing the data (2)
```{r, echo = TRUE, tidy=TRUE, tidy.opts=list(width.cutoff=60)}
d <- read.csv("data/event.txt", sep="\t", encoding = "UTF-8") %>% select(-id)
Occ <- read.csv("data/occurrence.txt", sep="\t", stringsAsFactors = FALSE, encoding = "UTF-8") %>% select(-id)
```
## Resource citation
Following acceptable reuse of openly published data, we should cite the data and the data set to give credit to the data providers.
One way to get to the citation string from within R is to follow the procedure bellow:
```{r, echo = TRUE, message=FALSE, warning=FALSE, tidy=TRUE, tidy.opts=list(width.cutoff=60)}
tmp <- tempfile()
download.file(paste0("http://api.gbif.org/v1/dataset/",datasetID,"/document"),tmp)
meta <- read_xml(tmp) %>% as_list()
gbif_citation <- meta$additionalMetadata$metadata$gbif$citation[[1]]
citation <- gsub("GBIF.org", paste(endpoint_url), gbif_citation)
```
- `r paste(citation)`
## Preparation (1)
First - loading some libraries that we will use during the excercise:
```{r, echo = TRUE, message=FALSE, warning=FALSE, tidy=TRUE, tidy.opts=list(width.cutoff=60)}
library(tidyverse)
library(sp)
library(rgeos)
library(geosphere)
library(rgdal)
library(lme4)
library(AICcmodavg)
library(MuMIn)
library(geosphere)
library(Distance)
library(lubridate)
library(stringr)
library(gridExtra)
library(Distance)
library(knitr)
LongLat = CRS("+proj=longlat +ellps=WGS84 +datum=WGS84")
```
## Preparation (2)
Before entering the data analysis, we also need to do some simple data manipulation:
```{r, echo=TRUE, message=FALSE, warning =FALSE, tidy=TRUE, tidy.opts=list(width.cutoff=60)}
d <- d %>% mutate(eventYear=year(eventDate))
d$dynamicProperties <- gsub("}", '', d$dynamicProperties)
temp2 <- reshape2::colsplit(d$dynamicProperties, ":", c("temp1", "distanceToTransectLine"))
d <- mutate(d, distanceToTransectLine=temp2$distanceToTransectLine)
```
## Inspecting effort data (1)
Summarizing number of years each line is included in the surveys
```{r, echo=TRUE, message=FALSE, warning=FALSE, tidy=TRUE, tidy.opts=list(width.cutoff=60)}
LineData <- filter(d, eventRemarks=="Line transect")
N_years_line <- LineData %>% dplyr::count(locationID)
head(N_years_line)
```
## Inspecting effort data (2)
Summarizing total numerber of transects surveyd each year (object *N_lines_year*), as well as toal transect length each year (object *Effort*)
```{r, echo=TRUE, message=FALSE, warning =FALSE, tidy=TRUE, tidy.opts=list(width.cutoff=60)}
N_lines_year <- LineData %>% dplyr::count(eventYear)
EffortData <- LineData %>% group_by(eventYear)
Effort <- dplyr::summarize(EffortData, Effort=sum(sampleSizeValue)/1000)
```
## Inspecting effort data (3)
```{r, message=FALSE, warning =FALSE, tidy=TRUE, tidy.opts=list(width.cutoff=60)}
par(mfrow=c(2,1), bty="l", cex=0.9)
par(mar=c(1, 4, 3,1))
barplot(N_lines_year$n, names.arg=N_lines_year$eventYear, col="dark orange", xlab="", ylab="Number of transect lines")
text("A)", cex=1.2, x=2000, y=150)
par(mar=c(4, 4, 1,1))
plot(Effort$eventYear, Effort$Effort, type="b", ylim=c(0, 1000), lwd=2, pch=16, col="dark orange", xlab="Year", ylab="Effort (km)")
text("B)", cex=1.2, x=2000, y=800)
```
**Figure 1** *Summary of number of transect lines (A) and total combined length of line transects (B) survey each year in a line transect survey program used to monitor fluctuations in willow ptarmigan (Lagopus lagopus) population in Finnmark county, Norway.*
## Spatial locations of transect lines (1)
Transforming the footprintWKT-field into a *SpatialLinesDataFrame* object.
```{r, echo=TRUE, message=FALSE, warning =FALSE, tidy=TRUE, tidy.opts=list(width.cutoff=60)}
lines <- unique(d$locationID)
for (i in 1:length(lines)) {
if (i == 1) {
temp <- filter(d, locationID==lines[i] & footprintWKT != "NULL")
spTemp <- readWKT(temp$footprintWKT[1], temp$locationID[1], p4s=LongLat)
}
else {
temp <- filter(d, locationID==lines[i] & footprintWKT != "NULL")
spTemp = rbind(
spTemp, readWKT(temp$footprintWKT[1], temp$locationID[1], p4s=LongLat)
)
}
}
data <- as.data.frame(lines)
colnames(data) <- c("locationID")
rownames(data) <- paste(data$locationID)
Lines_transect <- SpatialLinesDataFrame(spTemp, data, match.ID=T)
```
## Spatial locations of transect lines (2)
Making a map showing the locations of the transect lines
```{r, eval=FALSE, echo=TRUE, message=FALSE, warning =FALSE, evel=FALSE, tidy=TRUE, tidy.opts=list(width.cutoff=60)}
wrld <- map_data("world")
Nor <- subset(wrld, region=="Norway")
Lines_fortify <- fortify(Lines_transect, id="locationID")
x <- c(21.5, 32, 32, 21.5, 21.5)
y <- c(68, 68, 72, 72, 68)
id <- rep(1,5)
box <- data.frame(x,y, id)
fortify_box <- fortify(box, id="id")
p1 <- ggplot() + geom_polygon(data = Nor, aes(x=long, y = lat, group = group), fill="grey70", color="yellow") +
geom_path(data=fortify_box, aes(x=x, y=y), lwd=2, col="dark green")+
coord_map(xlim=c(0, 35), ylim=c(55, 75))
p2 <- ggplot() + geom_polygon(data = Nor, aes(x=long, y = lat, group = group), fill="grey70", color="yellow") +
geom_path(data=Lines_fortify, aes(x=long, y=lat, group=group), lwd=3, col="red") +
coord_map(xlim=c(21.5, 32), ylim=c(68, 71.5))
grid.arrange(p1, p2, ncol=2)
```
## Spatial locations of transect lines (3)
```{r, echo=FALSE, message=FALSE, warning =FALSE, evel=TRUE, tidy=TRUE, tidy.opts=list(width.cutoff=60)}
wrld <- map_data("world")
Nor <- subset(wrld, region=="Norway")
Lines_fortify <- fortify(Lines_transect, id="locationID")
x <- c(21.5, 32, 32, 21.5, 21.5)
y <- c(68, 68, 72, 72, 68)
id <- rep(1,5)
box <- data.frame(x,y, id)
fortify_box <- fortify(box, id="id")
p1 <- ggplot() + geom_polygon(data = Nor, aes(x=long, y = lat, group = group), fill="grey70", color="yellow") +
geom_path(data=fortify_box, aes(x=x, y=y), lwd=2, col="dark green")+
coord_map(xlim=c(0, 35), ylim=c(55, 75))
p2 <- ggplot() + geom_polygon(data = Nor, aes(x=long, y = lat, group = group), fill="grey70", color="yellow") +
geom_path(data=Lines_fortify, aes(x=long, y=lat, group=group), lwd=3, col="red") +
coord_map(xlim=c(21.5, 32), ylim=c(68, 71.5))
grid.arrange(p1, p2, ncol=2)
```
**Figure 2** *Map illustrating locations of transect lines used in a line transect survey program used to monitor fluctuations in willow ptarmigan (Lagopus lagopus) population in Finnmark county, Norway.*
## Setting up the occurence data
Combining the event table and the occurence table
```{r, echo=TRUE, message=FALSE, warning =FALSE, tidy=TRUE, tidy.opts=list(width.cutoff=60)}
Occu <- filter(d, dynamicProperties!="NA")
distanceToTransectLine <- numeric()
Occ <- Occ %>% mutate(SexStage=str_c(sex, lifeStage, sep = ""))
Occ_wide <- spread(Occ[c("eventID", "SexStage", "individualCount", "scientificName")], key="SexStage", value= "individualCount", fill=0)
Occ_wide <- mutate(Occ_wide, clusterSize=rowSums(Occ_wide[c("FemaleAdult", "MaleAdult", "unknownJuvenile", "unknownunknown")], na.rm=T))
Occ_combined <- plyr::join(Occu, Occ_wide, "eventID")
```
## A simplyfied Distance Sampling example (1)
<div class="centered">
```{r, out.width = "800px", echo=FALSE}
knitr::include_graphics("pictures/ds.png")
```
</div>
## A simplyfied Distance Sampling example (2)
<div class="centered">
```{r, out.width = "300px", echo=FALSE}
knitr::include_graphics("pictures/Detection_curve.png")
```
</div>
## A simplyfied Distance Sampling example (3)
Example of occurences (orange dots) along four transect lines (red lines) in one of the study areas in the years 2013-2017
```{r, echo=FALSE, message=FALSE, warning =FALSE, tidy=TRUE, tidy.opts=list(width.cutoff=60)}
li <- c("1C64965F-D590-4201-9C4B-79CFE732D30C", "C49B5802-C408-42E7-ABDE-95C146087100", "C9E68D8E-B7BE-49EF-95D7-6D265BCDAA46", "E23DA718-9423-4F25-9274-2E84A3857EB2")
for (i in 1:length(li)) {
if (i == 1) {
temp <- filter(d, locationID==paste(li[i]) & footprintWKT != "NULL")
spTemp <- readWKT(temp$footprintWKT[1], temp$locationID[1], p4s=LongLat)
}
else {
temp <- filter(d, locationID==paste(li[i]) & footprintWKT != "NULL")
spTemp = rbind(
spTemp, readWKT(temp$footprintWKT[1], temp$locationID[1], p4s=LongLat)
)
}
}
data <- as.data.frame(li)
colnames(data) <- c("locationID")
rownames(data) <- paste(data$locationID)
Lines_transect <- SpatialLinesDataFrame(spTemp, data, match.ID=T)
temp2 <- filter(Occ_combined, locality=="Jarfjord" & eventYear>2012)
coords <- cbind(temp2$decimalLongitude, temp2$decimalLatitude)
points1 <- SpatialPoints(coords, LongLat)
plot(Lines_transect, col="dark red", lwd=4)
points(points1, pch=16, col="dark orange")
```
## A simplyfied Distance Sampling example (4)
Recasting data into the data in the format required by the function *ds* in library *Distance*
```{r, echo=TRUE, message=FALSE, warning =FALSE, tidy=TRUE, tidy.opts=list(width.cutoff=60)}
Occ_combined_Lagopus <- Occ_combined %>% filter(scientificName=="Lagopus lagopus" & distanceToTransectLine !="NA")
Reg_dat <- data.frame(Region.Label=paste(sort(unique(LineData$eventYear))), Area=rep(1000, length(unique(LineData$eventYear))))
Samp_dat <- data.frame(Sample.Label=LineData$eventID, Region.Label=paste(LineData$eventYear), Effort=LineData$sampleSizeValue/1000)
Obs_dat <- data.frame(object=as.numeric(Occ_combined_Lagopus$eventID),
Region.Label=paste(Occ_combined_Lagopus$eventYear), Sample.Label=Occ_combined_Lagopus$parentEventID)
Dat_tab <- data.frame(object=as.numeric(Occ_combined_Lagopus$eventID), distance=Occ_combined_Lagopus$distanceToTransectLine/1000,
size=Occ_combined_Lagopus$clusterSize, Strata=paste(Occ_combined_Lagopus$eventYear))
```
## A simplyfied Distance Sampling example (5)
Using Distance Sampling methods to estimate abundance.
```{r, echo=TRUE, message=FALSE, warning =FALSE, tidy=TRUE, tidy.opts=list(width.cutoff=60)}
ds.model <-ds(data=Dat_tab, region.table=Reg_dat, sample.table=Samp_dat,
obs.table=Obs_dat, adjustment="poly", order=2, transect="line", truncation="10%",
formula= ~size, key="hn")
```
*We would also like to stress the fact that because there is considerable variation in the study design during the study period from 2000- and onwards. In a proper analysis of the temporal dynamics such variation should be accounted for in the model framework. This is however beyond the scope of this excercise.*
*In our analysis presented here, we will use functions from library [Distance](https://cran.r-project.org/web/packages/Distance/index.html). Note that there are several other r-packages that might provide greater flexibility in the modelling, and that there has been made recent progress in distance sampling models for open populations using Bayesian inference with code written in the BUGS language.*
## A simplyfied Distance Sampling example (6)
Assessing goodness-of-fit....
```{r, echo=TRUE, message=FALSE, warning =FALSE, tidy=TRUE, tidy.opts=list(width.cutoff=60)}
par(mfrow=c(1,2))
plot(ds.model, main="Detection model", pch=16, lwd=2)
fit.test <- ddf.gof(ds.model$ddf, lwd=2, pch=16)
```
## A simplyfied Distance Sampling example (6)
```{r, echo=FALSE, message=FALSE, warning =FALSE, tidy=TRUE, tidy.opts=list(width.cutoff=60)}
x <- ds.model$dht$individuals$D$Label[-length(ds.model$dht$individuals$D$Label)]
y <- ds.model$dht$individuals$D$Estimate[-length(ds.model$dht$individuals$D$Estimate)]
par(bty="l", cex=1.1, lwd=1.2)
plot(as.numeric(as.character(x)), y, type="b", pch=16, col="dark orange", ylim=c(0, 35), xlab="Year", ylab="Ptarmigan density")
```
**Figure** *Estimated mean density (ptarmigan pr. km-2) in the years 2000-2017 based on data from a line transect survey program used to monitor fluctuations in willow ptarmigan (Lagopus lagopus) population in Finnmark county, Norway. Data was published using the event-core data model of gbif. Note that the estimates reported here are from a simplified analysis that does not model the variation in spatial arrangement of transect lines across years*
## Concluding remarks
- As is evident from this example, using the event-core data model makes it possible to publish data that allows for a proper time series analysis of abundance data.
- The example made here is just one out of many possible ways of analysing these data, but it should be evindent that using the event-core opens up for completely new oportunities.