-
Notifications
You must be signed in to change notification settings - Fork 2
/
BirdFlowR.Rmd
340 lines (271 loc) · 11.1 KB
/
BirdFlowR.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
---
title: "BirdFlowR"
output:
rmarkdown::html_vignette:
toc: true
vignette: >
%\VignetteIndexEntry{BirdFlowR}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
out.width = "65%",
fig.width = 6,
fig.height = 5
)
```
## Setup
### Install packages
```{r install, eval = FALSE }
installed <- rownames(installed.packages())
if (!"remotes" %in% installed)
install.packages("remotes")
if (!"rnaturalearthdata" %in% installed)
install.packages("rnaturalearthdata")
remotes::install_github("birdflow-science/BirdFlowModels")
remotes::install_github("birdflow-science/BirdFlowR", build_vignettes = TRUE)
```
### Load libraries
```{r setup}
library(BirdFlowModels)
library(BirdFlowR)
library(terra)
library(sf)
```
### Load model
The BirdFlow Science team has shared a [collection of fitted models](`r paste0(birdflow_options("collection_url"), "index.html")`) for use with
the BirdFlowR package. The website includes reports on each species.
We can also access the collection index through the package.
```{r load index}
# Load and print index
index <- load_collection_index()
print(index[, c("model", "common_name")])
```
And we can load a model from the collection based on the `model` column from
the index.
**Note:** in the vignette this block isn't executed.
```{r load model, eval = FALSE}
# Load a specific model
bf <- load_model("amewoo_prebreeding") # caches locally and loads from cache
```
This loads the smaller example model instead for efficiency of package building
and testing, but do not use this one for science!
```{r load example model, eval = TRUE}
bf <- BirdFlowModels::amewoo # example and test dataset
```
## Access basic information
`dim()`, `nrow()`, and `ncol()` all report on raster dimensions associated with the model.
`n_active` is the total number of cells that the BirdFlow model can route birds through and is a subset of the cells in the raster.
`n_transitions()` and `n_distr()` report on temporal dimensions. If the model `is_cyclical()`, they will be equal.
```{r access}
# Methods for base R functions:
dim(bf)
c(nrow(bf), ncol(bf))
bf # same as print(bf)
# BirdFlowR functions
n_active(bf)
n_transitions(bf)
n_timesteps(bf)
# Contents
has_marginals(bf)
has_distr(bf)
has_transitions(bf)
is_cyclical(bf)
```
## Species information and metadata
`species_info()` and `get_metadata()` take a BirdFlow object as their first argument. An optional second argument allows specifying a specific item, if omitted a list is returned with all the available information.
`species(bf)` is a shortcut for `species_info(bf, "common_name")`
Use `?species_info()` to see descriptions of all the available information. Dates associated with migration and resident seasons are likely to be useful.
```{r BirdFlow specific info}
species(bf)
species(bf, "scientific")
species_info(bf, "prebreeding_migration_start")
si <- species_info(bf) # list with all species information
md <- get_metadata(bf) # list with all metadata
get_metadata(bf, "birdflow_model_date") # date model was exported from python
validate_BirdFlow(bf) # throws error if there are problems
```
## Spatial aspects
BirdFlow models are based on a raster representation of a time series of species
distributions and contain all the spatial information necessary to recreate
those distributions and to define how the raster is positioned in space.
BirdFlowR uses the **terra** package to import raster data and provides BirdFlow
methods for functions defined in the terra package - so that you can use those
functions on BirdFlow objects.
`crs()` returns the coordinate reference system - useful if you need to project other data to match the BirdFlow object.
`res()`, `xres()`, and `yres()` describe the dimensions of individual cells in the model.
`ext()` returns a terra extent object.
`compare_geom()` tests if the extent, resolution, and CRS of two objects is the
same. BirdFlowR includes methods to compare BirdFlow models with each other and
with terra objects.
```{r spatial aspects terra}
# Methods for terra functions:
a <- crs(bf) # well known text (long)
crs(bf, proj = TRUE) # proj4 string
res(bf)
c(xres(bf), yres(bf)) # same as res(bf)
ext(bf)
c(xmin(bf), xmax(bf), ymin(bf), ymax(bf)) # same as ext(bf)
# Compare geometries - do they have the same CRS, extent, and cell size
compareGeom(bf, rast(bf))
```
BirdFlow objects also play nicely with the *sf* package.
```{r sf}
bb <- sf::st_bbox(bf)
crs <- sf::st_crs(bf)
```
## Retrieve and plot distributions
A distribution in BirdFlow is stored as a vector of values that correspond to only the active cells (`n_active()`) in the model. Multiple distributions are stored as matrices with `n_active()` rows and a column for each distribution.
We can retrieve distributions in this format with `get_distr()`.
Use timestep, character dates, date objects, or "all" to specify which distributions to retrieve.
Retrieve the first distribution and compare its length to the number of active cells.
```{r single distribution}
d <- get_distr(bf, 1) # get first timestep distribution
length(d) # 1 distribution so d is a vector
n_active(bf) # its length is the the number of active cells in the model
```
Get 5 distributions, the result is a matrix in which each column is a distribution with a row for each active cell.
```{r multiple distributions}
d <- get_distr(bf, 26:30)
dim(d)
head(d, 3)
```
We can also specify distributions with dates, or use "all" to retrieve all the distributions.
```{r get_distr options}
d <- get_distr(bf, "2022-12-15") # from character date
d <- get_distr(bf, "all") # all distributions
d <- get_distr(bf, Sys.Date()) # Using a Date object
```
Use `rasterize_distr()` to convert a distribution to a SpatRaster defined in the terra package. The second argument, the BirdFlow
model, is needed for the spatial information it contains.
```{r get distributions}
d <- get_distr(bf, c(1, 26)) # winter and summer
r <- rasterize_distr(d, bf) # convert to SpatRaster
```
Alternatively convert directly from BirdFlow to SpatRaster with `rast()`. The second (optional) argument `which` accepts the same inputs as `which` in `get_distr()`.
```{r rast, fig.width=8, fig.height=4, out.width='100%'}
r <- rast(bf) # all distributions
r <- rast(bf, c(1, 26)) # 1st, and 26th timesteps.
plot(r)
```
BirdFlowR provides convenience wrappers to functions in **rnaturalearth**
that load vector data and then crop and transform it to make it suitable for plotting with BirdFlow output.
***Note:** until **rnaturalearth** has fully transitioned
away from legacy packages you may see a warning about them, but **BirdFlowR**
does not use the legacy packages or data formats itself.*
```{r }
r <- rast(bf, species_info(bf, "prebreeding_migration_start"))
plot(r)
coast <- get_coastline(bf) # lines
plot(coast, add = TRUE)
```
# Forecasting
In this section we will sample a single starting location from the winter
distribution and project it forward. We will generate a distribution of
predicted breeding grounds for birds that wintered at the starting location.
Set predict parameters.
```{r predict parameters}
start <- 1 # winter
end <- 26 # summer
```
## Sample starting distribution
`sample_distr()` will sample from one or more input distribution to select a
single location per distribution. The result is one or more distributions
with ones in the selected location(s) and zero elsewhere.
```{r starting location}
set.seed(0)
d <- get_distr(bf, start)
location <- sample_distr(d)
print(i_to_xy(which(as.logical(location)), bf)) # starting coordinates
```
## Project forward from this location to summer
`predict()` returns the distribution over time as a matrix with
one column per timestep.
The plot shows where birds that winter at a particular location are
likely to be as the year progresses and ultimately where they might spend their
summer. The probability density spreads as the weeks progress.
```{r predict, out.width='100%'}
f <- predict(bf, distr = location, start = start, end = end,
direction = "forward")
r <- rasterize_distr(f[, c(1, 7, 14, 19)], bf)
plot(r)
```
Additionally, we can calculate the difference between the projected distribution
and the distribution of the species as a whole at the same timestep.
```{r probability over time}
projected <- f[, ncol(f)] # last projected distribution
diff <- projected - get_distr(bf, end)
plot(rasterize_distr(diff, bf))
```
# Generate synthetic routes
Here we sample locations from the American Woodcock winter
distribution and generate routes to their summer grounds.
Set route parameters.
```{r route parameters}
n_positions <- 15 # number of starting positions
start <- 1 # starting timestep (winter)
end <- 26 # ending timestep (summer)
```
## Generate starting locations
First extract the winter distribution, then use `sample_locations()` with
`n = n_positions` to sample the input distribution repeatedly. The result is a
matrix in which each column has a single '1' representing the sampled location.
```{r starting locations}
d <- get_distr(bf, start)
locations <- sample_distr(d, n = n_positions, bf = bf, format = "xy")
x <- locations$x
y <- locations$y
```
Plot the starting (winter) distribution and sampled locations.
```{r plot starting distribution}
winter <- rasterize_distr(d, bf)
plot(winter)
points(x, y)
```
## Generate routes
`route()` will generate synthetic routes for each starting position.
`route()` returns a BirdFlowRoutes object which is at it's core a data frame
with a row for each timestep of each route, but also includes some additional
spatial and temporal information from the BirdFlow object.
```{r route, fig.show='hide'}
rts <- route(bf, x_coord = x, y_coord = y, start = start, end = end)
head(rts$points, 4)
```
The `route()` function can sample starting locations from the distribution for
the starting timestep so the following is equivalent to the preceding two sections.
```{r route with sample}
rts2 <- route(bf, n = n_positions, start = start, end = end)
```
We can specify the date range with any arguments supported by
`lookup_timestep_sequence()` so an alternative to the above with slightly
different start and end dates is to use the season argument. Here we route
during the prebreeding migration.
```{r route with season}
rts3 <- route(bf, n = n_positions, season = "prebreeding")
```
## Using base R plotting to plot routes
Plot the route lines over the summer distribution along with points at the
starting and ending positions.
```{r plot routes with base R}
d <- get_distr(bf, end)
summer <- rasterize_distr(d, bf)
line_col <- rgb(0, 0, 0, .2)
pt_col <- rgb(0, 0, 0, .5)
plot(summer)
points(x, y, cex = .4, col = pt_col, pch = 16) # starting points
lines <- sf::st_as_sf(rts, type = "line") # convert to sf lines
plot(lines, add = TRUE, col = line_col) # routes
end_pts <- rts[rts$timestep == end, ] # end points
points(x = end_pts$x, y = end_pts$y,
cex = 0.4, pch = 12, col = pt_col)
title(main = species(bf))
```
## Or use plot_routes()
`plot_routes()` will visualize time as a color gradient using **ggplot2** with
stop point dots that indicate how long a bird was at each stopping point.
```{r plot routes}
plot_routes(rts, bf)
```