This repository has been archived by the owner on Oct 1, 2020. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 44
/
09-B-networks.Rmd
422 lines (278 loc) · 16.5 KB
/
09-B-networks.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
---
layout: topic
title: "Network analysis"
author: "Beth & Nicole"
output: html_document
---
**Assigned Reading:**
> Dormann et al. 2009 [DOI: 10.2174/1874213000902010007](http://dx.doi.org/10.2174/1874213000902010007)
**Supplemental Reading:**
> Montoya et al. 2006 [DOI: 10.1038/nature04927](http://dx.doi.org/10.1038/nature04927)
```{r include = FALSE}
# This code block sets up the r session when the page is rendered to html
# include = FALSE means that it will not be included in the html document
# Write every code block to the html document
knitr::opts_chunk$set(echo = TRUE)
# Write the results of every code block to the html document
knitr::opts_chunk$set(eval = TRUE)
# Define the directory where images generated by knit will be saved
knitr::opts_chunk$set(fig.path = "images/09-B/")
# Set the web address where R will look for files from this repository
# Do not change this address
repo_url <- "https://raw.githubusercontent.com/fukamilab/BIO202/master/"
```
### Key Points
Anything that can be represented as an interaction matrix can be represented as a network (e.g., food webs, plant-pollinator networks, plant-fungal networks, microbe co-occurrence networks, facultative networks, host-parasite networks, etc.)
Networks are interesting because they are a way to organize and quantify community structure beyond simple species identities and numbers. Network structure is purportedly related to community robustness (the ability of the community to resist change after a disturbance), resilience (the ability of the community to recover after a disturbance), evolutionary history, ecosystem service provision, and biodiversity maintenance.
Columns and rows of interaction matrices represent interacting species (columns as pollinators and rows as plants, for example), and cell values represent either the presence/absence of an interaction between two species (binary networks) or the frequency of interactions between two species (weighted networks)
* Binary and weighted interaction networks are analyzed differently
Networks are often unipartite (all species in the same guild), bipartite (two guilds), or tripartite (three guilds). We focus today on **bipartite networks**.
There are many R packages to analyze and visualize networks including:
* `vegan` (not strictly a network package, but required by `bipartite`)
* `bipartite`
* `igraph`
* `sna`
* `d3network`
* `graphviz` (java)
* `qgraph`
For the example analysis we use `vegan` and `bipartite`.
Network metrics are used to quantify network structural properties (for example, a very simple one would be the average number of interactions per species). What network metric you pick depends on what question you’re trying to answer.
Some common *network* metrics that appear often in the literature are:
* **Connectance**: the number of observed interactions divided by the number of possible interactions
+ has been proposed that connectance is a measure of complexity
+ indices: ‘connectance’, ‘weighted connectance’
* **Nestedness**: a network in which the species with many interactions interact with the species with few interactions (and vice versa) is a nested network
+ common in mutualistic networks
+ indices: ‘nestedness’, ‘weighted nestedness’, 'NODF’, 'weighted NODF’
* **Modularity**: a network in which the species form discrete interaction compartments, with little interactions between species outside of each compartment, is a modular network
+ common in antagonistic networks
+ indices: 'computeModules', 'modularity' (in `igraph`)
* **Specialization** – two potential meanings
+ specialized networks = species have few interaction partners
+ indices: 'links per species'
+ specialized networks = networks where species are specific in their interaction partner choice (not opportunistic)
+ indices: '$H_2$'
Some common *species level* (node level) metrics that appear often in the literature are:
* **Species degree**: number of interaction partners
+ indices: 'species degree'
* **Centrality** - 2 kinds
+ **Closeness centrality**: the sum of the number of shortest distances (i.e. number of interactions, also known as 'path lengths') between the species in question and all other species in the network
+ indices: 'closeness_w'
+ **Betweenness centrality**: the sum of the number of pairs of species whose shortest path lengths are connected through the species in question
+ indices: 'betweenness_w'
* **Specialization** (specificity): species level metric similar to '$H_2$'
+ the deviation of observed interactions from opportunistic, abundance-based interactions
+ takes into account interaction partner diversity, relative abundance of other possible interaction partners, and reciprocal specialization between interaction partners
+ Index: 'd'
Use null models to determine if observed network structure is significantly different from random network structure.
* Use the z-score $(observed - mean(null)/sd(null))$ to calculate the statistical significance of observed network structure (more or less than two standard deviations away from the mean $\rightarrow p < 0.05$)
* Null models are common in community ecology in general (e.g., niche overlap, species-area relationship, species co-occurrence, phylogenetic diversity, etc.) so this null model approach is useful for ecologists in general
A common binary null model:
* 'shuffle.web'
+ shuffles cells around the interaction matrix, keeps nothing constant
Common weighted null models:
* 'r2dtable'
+ changes cell values while keeping row and column sums constant
* 'vaznull'
+ changes cell values while keeping connectance (number of links in the matrix) constant
* 'swap.web'
+ changes cell values while keeping both connectance and row/column sums constant
Choose null model wisely - how restrictive do you want to be?
Also use null models if you want to compare structural properties between networks.
* All network metrics are affected by network size (number of rows and columns) and shape (ratio of rows:columns) so it is not really possible to reliably compare network metrics between networks of different sizes and shapes
* Common solution is to calculate the z-score to standardize metrics first, then compare the z-scores of a metric between networks
That's networks in a nutshell! Proceed at your own risk…
<br>
### Analysis Example
In this analysis example, we're going to use the `bipartite` R package to visualize networks and to generate null models for significance of different network patterns. We will be using datasets that are already included in the package. These are plant-pollinator interaction matrices (pollinators species as columns and plant species as rows).
Each dataset, or interaction matrix, is a summary of observed interactions between a pollinator species and a particular plant species in a specific country (or site). The number of these pairwise interactions is given by the value in each matrix cell. Each dataset was obtained in a different country, so the dataset/matrix and the network constructed from that will be named after its country of origin (or region).
```{r include = TRUE}
# Clear environment
rm(list=ls())
# Load required packages for bipartite
library(permute)
library(lattice)
library(vegan)
library(statnet.common)
library(network)
library(sna)
# Load main package
library(bipartite)
# Read all the interaction network matrices into a list
webs <- list(Safariland, barrett1987, elberling1999,
memmott1999, motten1982, olesen2002aigrettes)
# Re-name the datasets according to the sites for each plant-pollinator network
webs.names <- c("Argentina", "Canada", "Sweden", "UK", "USA", "Azores")
names(webs) <- webs.names
# View data (interaction matrix, for generating a network, or a web)
lapply(webs, head, n = 2L) # Only display the first two rows in the dataset
```
As you can see, these are all quantitative or *weighted networks* (cells contain values ranging from $0$ to any whole number $n$, as supposed to just $0$ and $1$, as in binary networks).
<br>
####Visualizing networks
We can visualize these networks using the `visweb()` and `plotweb()` functions in the `bipartite` R package. For the viswebs, the plants are in rows and the pollinators are in columns. For the plotwebs, the plant species are colored in green and the pollinators are in blue.
```{r include = TRUE}
# Visualize the observed networks from the datasets
visweb(webs$Argentina)
plotweb(webs$Argentina, text.rot=90, col.low = "green", col.high = "blue")
visweb(webs$Canada)
plotweb(webs$Canada, text.rot=90, col.low = "green", col.high = "blue")
visweb(webs$Sweden)
plotweb(webs$Sweden, text.rot=90, col.low = "green", col.high = "blue")
visweb(webs$UK)
plotweb(webs$UK, text.rot=90, col.low = "green", col.high = "blue")
visweb(webs$USA)
plotweb(webs$USA, text.rot=90, col.low = "green", col.high = "blue")
visweb(webs$Azores)
plotweb(webs$Azores, text.rot=90, col.low = "green", col.high = "blue")
```
All networks look different, since they are essentially very different networks (different species and different ecosystems) even though they are all plant-pollinator networks.
<br>
####Network structure analysis
Next, we're going to calculate some network metrics, or indices. We're going to look at 'nestedness' and 'links per species' at the network level. You can also repeat the same analyses for indices at the species level (e.g., degree, which could be thought of as the equivalent of 'links per species' at a different level). We'll be using the function `networklevel()` (and `specieslevel()` can be used for species level analyses).
```{r include = TRUE}
# Calculate network metric nestedness for all plant-pollinator sites
net.metrics.nest <- lapply(webs, networklevel, index = 'nestedness')
# Calculate network metric links per species for all plant-pollinator sites
net.metrics.links <- lapply(webs, networklevel, index = 'links per species')
```
Then we will create null models, which are new networks that have been randomized to a certain extent (based on the original observed networks) to remove any possible patterns. We will use three different null model types that work for weighted networks:
1. r2dtable
2. vaznull
3. swap.web
and see how the significance varies depending on which null model type we use as a fun exercise. We use the function `nullmodel()` to create our nulls (we set the number of nulls to 500) and we can use the `method` argument to specify the null model type.
```{r include = TRUE}
# Time consuming step!
# Load environment (already saved objects)
#load("data/network_analysis_example.RData")
# Make null models for all sites using the r2dtable null
net.nulls.r2d <- lapply(webs, nullmodel, method = "r2dtable", N = 500)
# Make null models for all sites using the vaznull null
net.nulls.vaz <- lapply(webs, nullmodel, method = "vaznull", N = 500)
# Make null models for all sites using the swap.web null
net.nulls.swap <- lapply(webs, nullmodel, method = "swap.web", N = 500)
# Save null objects
#save(net.nulls.r2d, net.nulls.vaz, net.nulls.swap, file = "data/network_analysis_example.RData")
```
Then, we need to calculate the same indices for the different nulls (as we already did with the observed networks: 'nestedness' and 'links per species') created by each null model type. Since there will be a lot of repetition, we start by defining a function for each index where we can specify a null model type for generating the null distribution.
```{r include = TRUE}
# Null distribution function for nestedness - calculates the network nestedness for each null (using a particular null method) for each site
net.null.nest = function(nulls){
net.null.metric <- list()
for (i in 1:length(nulls)) {
net.null.metric[[i]] = do.call('rbind',
lapply(nulls[[i]], networklevel, index = 'nestedness'))
}
names(net.null.metric) <- webs.names
return(net.null.metric)
}
# Null distribution function for links per species - calculates the network links per species metric for each null (using a particular null method) for each site
net.null.links = function(nulls){
net.null.metric <- list()
for (i in 1:length(nulls)) {
net.null.metric[[i]] = do.call('rbind',
lapply(nulls[[i]], networklevel, index = 'links per species'))
}
names(net.null.metric) <- webs.names
return(net.null.metric)
}
```
```{r include = TRUE}
# Time consuming step!
# Load environment (already saved objects)
#load("data/network_analysis_example.RData")
r2d.nest <- net.null.nest(net.nulls.r2d)
vaz.nest <- net.null.nest(net.nulls.vaz)
swap.nest <- net.null.nest(net.nulls.swap)
r2d.links <- net.null.links(net.nulls.r2d)
vaz.links <- net.null.links(net.nulls.vaz)
swap.links <- net.null.links(net.nulls.swap)
# Save environment
#save.image(file="data/network_analysis_example.RData")
```
Next, we will define a function for calculating the z-score when comparing the observed network with the null networks.
```{r include = TRUE}
# Z-score function for comparing different networks
net.zscore = function(obsval, nullval) {
(obsval - mean(nullval))/sd(nullval)
}
```
Then, we apply the `net.zscore()` function to obtain our z-scores for each network site.
```{r include = TRUE}
# Function that perform z-score calculation of nestedness using the observed and null networks
nest.zscore = function(nulltype){
net.nest.zscore <- list()
for(i in 1:length(net.metrics.nest)){
net.nest.zscore[[i]] = net.zscore(net.metrics.nest[[i]]['nestedness'],
nulltype[[i]][ ,'nestedness'])
}
names(net.nest.zscore) <- webs.names
return(net.nest.zscore)
}
# Function that perform z-score calculation of links per species using the observed and null networks
links.zscore = function(nulltype){
net.links.zscore <- list()
for(i in 1:length(net.metrics.links)){
net.links.zscore[[i]] = net.zscore(net.metrics.links[[i]]['links per species'],
nulltype[[i]][ ,'links per species'])
}
names(net.links.zscore) <- webs.names
return(net.links.zscore)
}
```
```{r include = TRUE}
r2d.nest.zscore <- nest.zscore(r2d.nest)
vaz.nest.zscore <- nest.zscore(vaz.nest)
swap.nest.zscore <- nest.zscore(swap.nest)
r2d.links.zscore <- links.zscore(r2d.links)
vaz.links.zscore <- links.zscore(vaz.links)
swap.links.zscore <- links.zscore(swap.links)
```
Lastly, we calculate the two-sided p-value for significance of the network property for nestedness and links per species. We start by defining a function below and then apply it for the various null models and metrics.
```{r include = TRUE}
# Function that adds p-values according to the obtained z-scores
add.pvalues = function(net.metric.zscore){
# Change the output class from list of a list into a matrix
net.metric.zscore <- do.call('rbind', net.metric.zscore)
# Convert z-scores to p-values (two-sided)
net.metric.pvalue <- 2*pnorm(-abs(net.metric.zscore))
# Change matrix into a dataframe
net.metric.pvalue <- as.data.frame(as.table(net.metric.pvalue))
colnames(net.metric.pvalue) <- c('site', 'metric', 'pvalue')
net.metric.pvalue <- within(net.metric.pvalue, {
significance <- ifelse(pvalue <= 0.001, "***",
ifelse(pvalue <= 0.01, "**",
ifelse(pvalue <= 0.05, "*", "not significant")))
})
return(net.metric.pvalue)
}
```
```{r include = TRUE}
# Add the p-values to our nestedness results
r2d.test.nest <- add.pvalues(r2d.nest.zscore)
vaz.test.nest <- add.pvalues(vaz.nest.zscore)
swap.test.nest <- add.pvalues(swap.nest.zscore)
# Add the p-values to our links per species results
r2d.test.links <- add.pvalues(r2d.links.zscore)
vaz.test.links <- add.pvalues(vaz.links.zscore)
swap.test.links <- add.pvalues(swap.links.zscore)
# Print the nestedness results
print(r2d.test.nest)
print(vaz.test.nest)
print(swap.test.nest)
# Print the links per species results
print(r2d.test.links)
print(vaz.test.links)
print(swap.test.links)
# Save environment
#save.image(file="data/network_analysis_example.RData")
```
<br>
### Discussion Questions
* Based on the final output above, how do the different null methods compare for
+ nestedness?
+ links per species?
* Which null method is more conservative?
* How does nestedness compare between the different sites (i.e., the networks constructed for each geographic region)?
* Why does the 'r2dtable' null method work for 'links per species' but not 'vaznull' and 'swap.web' (i.e., why do we obtain `NaN` values for these null methods)?
* Do you think bipartite networks are useful for your work?