---
title: "Why to Crop Quadrats by Area"
description: >
  The function `crop_area()` is powerful when working with quadrats of different
  sizes but the same density of points. This vignette will walk through why you
  might want to crop quadrats based on area and how to go about using the
  `crop_area()` function.
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Why to Crop Quadrats by Area}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
# Introduction
The function `crop_area()` is powerful when working with quadrats of different sizes but the same density of points. This vignette will walk through why you might want to crop quadrats based on area and how to go about using the `crop_area()` function.

When working with quadrats, both the area sampled and the density of points sampled will affect any diversity measures calculated from them. When sampling a larger area, or an area of a different shape, the amount of heterogeneity and the number of possible species can differ. Simply averaging the cover of individuals or their proportional abundance will be influenced by the size of the quadrat, introducing error or variation between samples ([Anderson and Marcus 1993](#References)).

If the substrate and organisms have already been identified for quadrats of different sizes, this vignette demonstrates how to subset, or 'crop', the quadrats to the same size while maintaining similar effort and comparability between samples. For this to work, the sampling effort needs to be roughly the same per unit area, so that the cropped areas will have the same sampling effort.

- If uniform grids with cells of the same area are used to identify the organism or substrate underneath, cropping to a smaller area will give exactly the same effort, with the same number of points in each cropped area. For example, splitting each quadrat into 10 cm by 10 cm squares and identifying the substrate or organism under the vertices of the squares.
- If random points are used to identify the organism or substrate underneath, the density of points needs to be the same per unit area, so that on average the cropped areas will contain the same number of points. For example, randomly identifying 100 points in a 1 m by 1 m quadrat and 54 points in a 0.9 m by 0.6 m quadrat, so that both have an average density of 100 points per m^2^ (1 point for every 100 cm^2^ of quadrat area).
- If the full area of the quadrat is identified, then the same effort is applied to any area, though this method of cropping will not be of value.
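
The equal-density requirement for random points can be checked with a little arithmetic. The sketch below uses the point counts and quadrat dimensions from the example above; the variable names are only illustrative and are not part of `quadcleanR`.

```r
# Points identified and quadrat dimensions (in metres), taken from the example above
points_large <- 100
area_large <- 1 * 1        # 1 m by 1 m quadrat
points_small <- 54
area_small <- 0.9 * 0.6    # 0.9 m by 0.6 m quadrat

# Density of identified points per square metre
density_large <- points_large / area_large
density_small <- points_small / area_small

# Expected number of points left after cropping the large quadrat to the small size
expected_cropped <- density_large * area_small

c(large = density_large, small = density_small, expected = expected_cropped)
```

If the two densities were not approximately equal, the cropped quadrats would not represent the same sampling effort as the smaller quadrats.
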
Additionally, this method is beneficial because it preserves the spatial relationships between organisms within each quadrat. If a subset of the points were instead selected at random, such as 54 of the 100 identified points, those points would still cover the whole quadrat and would therefore represent a larger sampled area than 54 of the 100 points that are all aggregated in the bottom left corner of the quadrat. Keeping only the points within one corner of the quadrat is the method this vignette will explore, and both random and uniform grid points will be displayed.
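
The difference between the two kinds of subset can be seen in a small base R sketch. The quadrat here is hypothetical (100 random points, with coordinates in cm, in a 1 m by 1 m quadrat), and the object names are made up for illustration.

```r
set.seed(42)  # arbitrary seed so the sketch is reproducible

# Hypothetical 1 m by 1 m quadrat with 100 random points (coordinates in cm)
pts <- data.frame(row = runif(100, min = 0, max = 100),
                  column = runif(100, min = 0, max = 100))

# Option 1: keep 54 points chosen at random; they still span the whole quadrat
random_subset <- pts[sample(nrow(pts), size = 54), ]

# Option 2: keep only the points inside a 90 cm by 60 cm corner of the quadrat
area_subset <- pts[pts$row <= 90 & pts$column <= 60, ]

# Compare how much of the quadrat each subset spans
range(random_subset$column)
range(area_subset$column)

# The number of points in the cropped corner varies around 54 between quadrats
nrow(area_subset)
```

The random subset always has exactly 54 points but is spread over the full quadrat, while the area subset is confined to the smaller region and has a point count that varies by chance.
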
First, some packages need to be loaded: the `quadcleanR` package, and `ggplot2` to help you visualize the data and results.
```{r load package, message = FALSE}
library(quadcleanR)
library(ggplot2)
```
# Randomized Data
First, we will randomize some data to help illustrate what is going on. In this example we will create 4 quadrats of randomized data. Each quadrat will have 100 identifications, each with a random row and column location. We will treat each quadrat as being 2000 by 2000 pixels, and randomly assign each identification to one of 4 soft coral genera: Cladiella, Sinularia, Sarcophyton and Lobophytum.
```{r randomized data, out.width = '45%', fig.show='hold'}
#Creating a vector of the soft coral genera
tags <- c("Cladiella", "Sinularia", "Sarcophyton", "Lobophytum")
#Creating a vector of quadrat names
rep <- c(rep("Q1", times = 100),
         rep("Q2", times = 100),
         rep("Q3", times = 100),
         rep("Q4", times = 100))
#Creating a vector of randomized row locations
row <- c(sample(x = c(0:2000), size = 100, replace = TRUE),
         sample(x = c(0:2000), size = 100, replace = TRUE),
         sample(x = c(0:2000), size = 100, replace = TRUE),
         sample(x = c(0:2000), size = 100, replace = TRUE))
#Creating a vector of randomized column locations
column <- c(sample(x = c(0:2000), size = 100, replace = TRUE),
            sample(x = c(0:2000), size = 100, replace = TRUE),
            sample(x = c(0:2000), size = 100, replace = TRUE),
            sample(x = c(0:2000), size = 100, replace = TRUE))
#Creating a vector of randomized identification labels
label <- c(sample(x = tags, size = 100, replace = TRUE),
           sample(x = tags, size = 100, replace = TRUE),
           sample(x = tags, size = 100, replace = TRUE),
           sample(x = tags, size = 100, replace = TRUE))
#Joining vectors into a data frame
coral_annotations <- data.frame(rep, row, column, label)
#Plotting each quadrat
ggplot(coral_annotations[1:100,], aes(x = column, y = row)) +
geom_point() +
theme_classic() +
labs(y = "", x = "", title = "Quadrat 1")
ggplot(coral_annotations[101:200,], aes(x = column, y = row)) +
geom_point() +
theme_classic() +
labs(y = "", x = "", title = "Quadrat 2")
ggplot(coral_annotations[201:300,], aes(x = column, y = row)) +
geom_point() +
theme_classic() +
labs(y = "", x = "", title = "Quadrat 3")
ggplot(coral_annotations[301:400,], aes(x = column, y = row)) +
geom_point() +
theme_classic() +
labs(y = "", x = "", title = "Quadrat 4")
```
These plots show where each identification was randomly placed within our quadrats. Now we will crop this area to 50% of the original length and width. There are two ways of doing this: if we know how large each quadrat is, we can specify its exact size; otherwise, the size can be estimated from the maximum row and column locations.
First we will examine this by just estimating the size:
```{r randomized crop, out.width = '45%', fig.show='hold'}
crop_area_coral_1 <- crop_area(data = coral_annotations, row = "row",
column = "column", id = "rep", dim = c(0.5, 0.5))
#Plotting each quadrat
ggplot(coral_annotations[1:100, ], aes(x = column, y = row)) +
geom_point() +
theme_classic() +
labs(y = "", x = "", title = "Quadrat 1") +
geom_rect(
aes(
xmin = 0,
xmax = 0.5 * max(column),
ymin = 0,
ymax = 0.5 * max(row)
),
color = "black",
alpha = 0
) +
geom_point(data = subset(crop_area_coral_1, rep == "Q1"),
color = "red")
ggplot(coral_annotations[101:200, ], aes(x = column, y = row)) +
geom_point() +
theme_classic() +
labs(y = "", x = "", title = "Quadrat 2") +
geom_rect(
aes(
xmin = 0,
xmax = 0.5 * max(column),
ymin = 0,
ymax = 0.5 * max(row)
),
color = "black",
alpha = 0
) +
geom_point(data = subset(crop_area_coral_1, rep == "Q2"),
color = "red")
ggplot(coral_annotations[201:300, ], aes(x = column, y = row)) +
geom_point() +
theme_classic() +
labs(y = "", x = "", title = "Quadrat 3") +
geom_rect(
aes(
xmin = 0,
xmax = 0.5 * max(column),
ymin = 0,
ymax = 0.5 * max(row)
),
color = "black",
alpha = 0
) +
geom_point(data = subset(crop_area_coral_1, rep == "Q3"),
color = "red")
ggplot(coral_annotations[301:400, ], aes(x = column, y = row)) +
geom_point() +
theme_classic() +
labs(y = "", x = "", title = "Quadrat 4") +
geom_rect(
aes(
xmin = 0,
xmax = 0.5 * max(column),
ymin = 0,
ymax = 0.5 * max(row)
),
color = "black",
alpha = 0
) +
geom_point(data = subset(crop_area_coral_1, rep == "Q4"),
color = "red")
```
These plots show the cropped coral points in red and the points from the full, uncropped quadrats in black. The black rectangle marks the cropped area, showing how the cropped points are grouped in the bottom left corner, maintaining their spatial relationships.
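
The estimated-size crop can be approximated in a few lines of base R. This is only a sketch of the idea, not the `crop_area()` source code. The data frame `annotations`, the vector `crop_prop`, and the rule of keeping points at or below the proportional limit are all assumptions made for illustration, based on the rectangles drawn in the plots.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Hypothetical annotations: 2 quadrats with 100 random points each
annotations <- data.frame(
  rep = rep(c("Q1", "Q2"), each = 100),
  row = sample(0:2000, size = 200, replace = TRUE),
  column = sample(0:2000, size = 200, replace = TRUE)
)

# Proportion of the rows and of the columns to keep (rows first, then columns)
crop_prop <- c(0.5, 0.5)

# Crop each quadrat using limits estimated from its own largest row and column
cropped <- do.call(rbind, lapply(split(annotations, annotations$rep), function(q) {
  q[q$row <= crop_prop[1] * max(q$row) &
      q$column <= crop_prop[2] * max(q$column), ]
}))

# Number of points kept per quadrat (about a quarter of the 100 points on average)
table(cropped$rep)
```

Because each quadrat is split and cropped separately, the limits adapt to quadrats whose pixel dimensions differ.
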
Now the same cropping, but specifying the dimensions of the quadrat:
```{r randomized crop with dimensions, out.width = '45%', fig.show='hold'}
coral_annotations[["col_dim"]] <- 2000
coral_annotations[["row_dim"]] <- 2000
crop_area_coral_2 <- crop_area(data = coral_annotations, row = "row",
column = "column", id = "rep", dim = c(0.5, 0.5),
res = TRUE, res_dim_x = "col_dim", res_dim_y = "row_dim")
#Plotting each quadrat
ggplot(coral_annotations[1:100, ], aes(x = column, y = row)) +
geom_point() +
theme_classic() +
labs(y = "", x = "", title = "Quadrat 1") +
geom_rect(
aes(
xmin = 0,
xmax = 0.5 * 2000,
ymin = 0,
ymax = 0.5 * 2000
),
color = "black",
alpha = 0
) +
geom_point(data = subset(crop_area_coral_2, rep == "Q1"),
color = "red")
ggplot(coral_annotations[101:200, ], aes(x = column, y = row)) +
geom_point() +
theme_classic() +
labs(y = "", x = "", title = "Quadrat 2") +
geom_rect(
aes(
xmin = 0,
xmax = 0.5 * 2000,
ymin = 0,
ymax = 0.5 * 2000
),
color = "black",
alpha = 0
) +
geom_point(data = subset(crop_area_coral_2, rep == "Q2"),
color = "red")
ggplot(coral_annotations[201:300, ], aes(x = column, y = row)) +
geom_point() +
theme_classic() +
labs(y = "", x = "", title = "Quadrat 3") +
geom_rect(
aes(
xmin = 0,
xmax = 0.5 * 2000,
ymin = 0,
ymax = 0.5 * 2000
),
color = "black",
alpha = 0
) +
geom_point(data = subset(crop_area_coral_2, rep == "Q3"),
color = "red")
ggplot(coral_annotations[301:400, ], aes(x = column, y = row)) +
geom_point() +
theme_classic() +
labs(y = "", x = "", title = "Quadrat 4") +
geom_rect(
aes(
xmin = 0,
xmax = 0.5 * 2000,
ymin = 0,
ymax = 0.5 * 2000
),
color = "black",
alpha = 0
) +
geom_point(data = subset(crop_area_coral_2, rep == "Q4"),
color = "red")
```
By specifying the dimensions of the quadrat, the cropping will be more accurate than when they are estimated. As you can see, though, if the dimensions are unknown, which may occur when using photoquadrats and pixel locations, the estimated method is still very accurate.
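
To get a feel for how close the estimate is, the sketch below compares the cropping limit computed from a known width with the limit estimated from the largest observed column. The quadrat and variable names are hypothetical, and the sketch assumes the estimate is based on the maximum observed location, as described earlier.

```r
set.seed(7)  # arbitrary seed so the sketch is reproducible

# Hypothetical quadrat known to be 2000 pixels wide, annotated with 100 random points
true_width <- 2000
column <- sample(0:true_width, size = 100, replace = TRUE)

# Cropping limit when the width is specified versus estimated from the data
limit_known <- 0.5 * true_width
limit_estimated <- 0.5 * max(column)

# The estimated limit can never exceed the known limit
c(known = limit_known, estimated = limit_estimated,
  difference = limit_known - limit_estimated)

# Number of points kept under the known limit but dropped under the estimated one
sum(column <= limit_known) - sum(column <= limit_estimated)
```

With many points per quadrat, the largest observed location tends to fall close to the true edge, which is why the two approaches usually keep nearly the same points.
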
# Coral Data
Now for a real example. These data are published in [Maucieri and Baum 2021](#References) and archived in the [GitHub repository](https://github.com/baumlab/Maucieri_Baum_2021_BioCon) for the paper. In this study, quadrat sizes were switched in 2013. Prior to 2013, they were 0.9 m by 0.6 m, and after that they were changed to 1 m by 1 m. However, these quadrats were randomly annotated using 54 and 100 points respectively, so that each quadrat had a density of 100 points per m^2^ (1 point per 100 cm^2^). This method of cropping quadrats was developed to deal with these different sized quadrats and to make them all 0.9 m by 0.6 m, or 0.54 m^2^.
First, let's load the data.
```{r load data, message = FALSE}
data(softcoral_annotations)
```
There are quite a few quadrats, so we will visualize the first 4, as in the previous example:
```{r visualize soft coral, out.width = '45%', fig.show='hold'}
ex_1 <- subset(softcoral_annotations, Name == unique(softcoral_annotations$Name)[1])
ex_2 <- subset(softcoral_annotations, Name == unique(softcoral_annotations$Name)[2])
ex_3 <- subset(softcoral_annotations, Name == unique(softcoral_annotations$Name)[3])
ex_4 <- subset(softcoral_annotations, Name == unique(softcoral_annotations$Name)[4])
#Plotting each quadrat
ggplot(ex_1, aes(x = Column, y = Row)) +
geom_point() +
theme_classic() +
labs(y = "", x = "", title = "KI2013_site14_Q10")
ggplot(ex_2, aes(x = Column, y = Row)) +
geom_point() +
theme_classic() +
labs(y = "", x = "", title = "KI2013_site14_Q11")
ggplot(ex_3, aes(x = Column, y = Row)) +
geom_point() +
theme_classic() +
labs(y = "", x = "", title = "KI2013_site14_Q12")
ggplot(ex_4, aes(x = Column, y = Row)) +
geom_point() +
theme_classic() +
labs(y = "", x = "", title = "KI2013_site14_Q13")
```
You may notice that the axes differ between the photos. This is because the quadrat photos were cropped before uploading to [CoralNet](https://coralnet.ucsd.edu/), which allowed for easy randomized annotations. Because of this, the number of pixels in each photo quadrat differs and is mostly unknown. This is why the `crop_area()` function is able to estimate the length and width in pixels.
For this example, we will also specify `obs_range` as `c(36, 64)`, so that quadrats with very high or very low numbers of points within the cropped area will not be included. Now to crop each of these 1 m by 1 m quadrats to 0.9 m by 0.6 m:
```{r randomized crop soft corals, out.width = '45%', fig.show='hold'}
crop_area_softcoral <- crop_area(data = softcoral_annotations, row = "Row",
column = "Column", id = "Name", dim = c(0.9, 0.6),
obs_range = c(36,64))
ex_1_sub <- subset(crop_area_softcoral, Name == unique(softcoral_annotations$Name)[1])
ex_2_sub <- subset(crop_area_softcoral, Name == unique(softcoral_annotations$Name)[2])
ex_3_sub <- subset(crop_area_softcoral, Name == unique(softcoral_annotations$Name)[3])
ex_4_sub <- subset(crop_area_softcoral, Name == unique(softcoral_annotations$Name)[4])
#Plotting each quadrat
ggplot(ex_1, aes(x = Column, y = Row)) +
geom_point() +
theme_classic() +
labs(y = "", x = "", title = "KI2013_site14_Q10") +
geom_rect(
aes(
xmin = 0,
xmax = 0.6 * max(Column),
ymin = 0,
ymax = 0.9 * max(Row)
),
color = "black",
alpha = 0
) +
geom_point(data = ex_1_sub,
color = "red")
ggplot(ex_2, aes(x = Column, y = Row)) +
geom_point() +
theme_classic() +
labs(y = "", x = "", title = "KI2013_site14_Q11") +
geom_rect(
aes(
xmin = 0,
xmax = 0.6 * max(Column),
ymin = 0,
ymax = 0.9 * max(Row)
),
color = "black",
alpha = 0
) +
geom_point(data = ex_2_sub,
color = "red")
ggplot(ex_3, aes(x = Column, y = Row)) +
geom_point() +
theme_classic() +
labs(y = "", x = "", title = "KI2013_site14_Q12") +
geom_rect(
aes(
xmin = 0,
xmax = 0.6 * max(Column),
ymin = 0,
ymax = 0.9 * max(Row)
),
color = "black",
alpha = 0
) +
geom_point(data = ex_3_sub,
color = "red")
ggplot(ex_4, aes(x = Column, y = Row)) +
geom_point() +
theme_classic() +
labs(y = "", x = "", title = "KI2013_site14_Q13") +
geom_rect(
aes(
xmin = 0,
xmax = 0.6 * max(Column),
ymin = 0,
ymax = 0.9 * max(Row)
),
color = "black",
alpha = 0
) +
geom_point(data = ex_4_sub,
color = "red")
```
And that's it. All the quadrats are now cropped to 0.9 m by 0.6 m, and quadrats that by chance had an unusually high or low number of points in the cropped area have been removed.
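
The effect of `obs_range` can be illustrated with a base R sketch that counts the points in each cropped quadrat and drops quadrats whose count falls outside the range. The data frame and quadrat names below are hypothetical, and treating both bounds as inclusive is an assumption of this sketch rather than a documented detail of `crop_area()`.

```r
set.seed(3)  # arbitrary seed so the sketch is reproducible

# Hypothetical cropped annotations: one row per point, with uneven counts per quadrat
cropped <- data.frame(
  Name = rep(c("quad_a", "quad_b", "quad_c"), times = c(30, 52, 70)),
  Row = sample(0:900, size = 152, replace = TRUE),
  Column = sample(0:600, size = 152, replace = TRUE)
)

# Accepted range for the number of points per cropped quadrat
obs_range <- c(36, 64)

# Count the points in each quadrat and keep quadrats whose count is in range
# (inclusive bounds are an assumption of this sketch)
n_points <- table(cropped$Name)
keep <- names(n_points)[n_points >= obs_range[1] & n_points <= obs_range[2]]
filtered <- cropped[cropped$Name %in% keep, ]

unique(filtered$Name)
```

Here only the quadrat with 52 points is retained, since 30 and 70 fall outside the accepted range.
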
# <a name="References" />References
Anderson, S., and L. F. Marcus. 1993. Effect of quadrat size on measurements of species density. Journal of Biogeography 20: 421-428.
Maucieri, D. G., and J. K. Baum. 2021. Impacts of heat stress on soft corals, an overlooked and highly vulnerable component of coral reef ecosystems, at a central equatorial Pacific atoll. Biological Conservation 262: 1-10. https://doi.org/10.1016/j.biocon.2021.109328