# Species interactions (co-occurrence)

Source: https://esajournals.onlinelibrary.wiley.com/doi/full/10.1002/ecy.2133

## Correcting pairwise co-occurrence for environmental/habitat filtering

Species pairs may or may not co-occur for a variety of reasons. Although pairwise co-occurrence is often interpreted directly as a signal of pairwise species interactions, co-occurrence may instead result from habitat filtering (e.g., Blois et al. 2014, Morueta-Holme et al. 2016). Several methods have been proposed that first account for the influence of environmental factors on species co-occurrence and then examine the remaining "significant" associations. For some methods, the environmental correction is implemented alongside the co-occurrence method itself (see the R package `BayesComm` for the JSDM residual covariance method and the R/Bioconductor package `ccrepe` for the correlation methods). For the remaining methods (all constraint-based methods and partial correlation), we use the framework proposed by Blois et al. (2014; see also an implementation in Li & Waller 2016). In brief, the framework operates post hoc on significant pairwise associations to evaluate whether negative co-occurrence is due to differences in habitat or whether positive co-occurrence is due to similarity in habitat (see Figures 1 and 2 in Blois et al. 2014 for a conceptual diagram). Note that the Blois et al. (2014) framework also considers how dispersal could generate patterns of positive or negative co-occurrence; dispersal is neither considered in the present study nor implemented in this example.

In [1]:
# Housekeeping

library(cooccur)
library(reshape2)
library(tidyverse)
library(vegan)

── Attaching packages ─────────────────────────────────────── tidyverse 1.2.1 ──
✔ ggplot2 3.1.0       ✔ purrr   0.3.2  
✔ tibble  2.1.1       ✔ dplyr   0.8.0.1
✔ tidyr   0.8.3       ✔ stringr 1.4.0  
✔ readr   1.1.1       ✔ forcats 0.3.0  
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
Loading required package: permute
Loading required package: lattice
This is vegan 2.5-4


In [2]:
# Read in data

species_composition = read.table("../../../data/amplicon/species_composition_normalized_to_15k_reads.txt", 
                                 sep = "\t",
                                 header = T,
                                 row.names = 1)

metadata = read.table("../../../data/amplicon/metadata.txt",
                      sep = "\t",
                      header = T,
                      row.names = 1)


# Extract regime shift data without predation

x = metadata$Experiment != "FiltrateExp" & # keep only regime shift data
    metadata$Predation != 1 & # exclude predation
    metadata$Immigration != "stock" # exclude stock

# Subset

species_composition = species_composition[x,] # keep only the selected samples
species_composition = species_composition[,colSums(species_composition)>0] # keep only species with data
metadata = metadata[x,-c(1, 3, 6)] # remove redundant columns

# Convert species composition to presence absence data
species_composition = as.data.frame(ifelse(species_composition == 0, 0, 1))

# Convert metadata factors to numeric
metadata$Streptomycin = as.numeric(as.character(metadata$Streptomycin))
metadata$Immigration = as.numeric(as.character(metadata$Immigration))
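
The `as.numeric(as.character(...))` idiom in the cell above matters because calling `as.numeric()` directly on a factor returns the internal level codes rather than the labelled values. A minimal sketch, using made-up concentration levels that are not taken from the data set:

```r
# Hypothetical factor of concentrations; the values are illustrative only
strep <- factor(c("0", "16", "128", "16"))

# Direct conversion returns level codes (levels sort as "0", "128", "16")
codes <- as.numeric(strep)

# Converting via character recovers the labelled values
values <- as.numeric(as.character(strep))

print(codes)
print(values)
```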

The correction requires a list of the "significant" positive and negative co-occurrences among species pairs, so it is applied _after_ a co-occurrence analysis has been run. Here we use a simple co-occurrence method (combinatorics) available in R from the `cooccur` package.

In [3]:
# run the combinatorics method
# note: must transpose our "site x species" data to "species x site" data for function to work
cooccur(t(species_composition), type = "spp_site", spp_names = TRUE)$results %>%
  # select only our variables of interest: species pairs, and their
  # probabilities of being positively associated ("p_gt" < 0.05) or
  # negatively associated ("p_lt" < 0.05)
  # for more information see ?cooccur::prob.table
  select(sp1_name, sp2_name, positive = p_gt, negative = p_lt) %>%
  gather(positive, negative, key = sign, value = probability) %>%
  filter(sign == "positive" & probability < 0.05 |
           sign == "negative" & probability < 0.05) %>%
  # assign "association" based on the sign
  mutate(association = ifelse(sign == "positive", 1, -1)) %>%
  select(-sign, -probability) -> data_pairs
# place positive & negative associations in different data frames
data_pairs_neg <- filter(data_pairs, association < 0)
data_pairs_pos <- filter(data_pairs, association > 0)
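
To make the `p_lt` and `p_gt` columns more concrete, the following is a minimal base R sketch of the pairwise probabilities for a single species pair. It assumes the combinatorics model can be expressed as a hypergeometric distribution and that both tails include the observed count; `pair_probs` and the toy vectors are hypothetical and are not part of the `cooccur` package.

```r
# Sketch of the pairwise co-occurrence probabilities, assuming the
# hypergeometric formulation of the combinatorics model.
# `pair_probs` is a hypothetical helper, not a cooccur function.
pair_probs <- function(sp1, sp2) {
  n_sites <- length(sp1)
  n1 <- sum(sp1 == 1)              # sites occupied by species 1
  n2 <- sum(sp2 == 1)              # sites occupied by species 2
  obs <- sum(sp1 == 1 & sp2 == 1)  # sites where both occur
  c(obs  = obs,
    exp  = n1 * n2 / n_sites,      # expected co-occurrences under independence
    # P(co-occurrences <= obs): small values suggest a negative association
    p_lt = phyper(obs, n1, n_sites - n1, n2),
    # P(co-occurrences >= obs): small values suggest a positive association
    p_gt = phyper(obs - 1, n1, n_sites - n1, n2, lower.tail = FALSE))
}

# Toy presence-absence vectors for 10 sites (illustrative only):
# the two species never share a site
a <- c(1, 1, 1, 1, 1, 0, 0, 0, 0, 0)
b <- c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1)
res <- pair_probs(a, b)
print(res)
```

In this toy case the pair would be classified as negatively associated, because `p_lt` falls well below 0.05 while `p_gt` equals 1.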



The goal is to loop through each pair of negatively associated species, find the sites where only one of the two species occurs, and determine whether environmental characteristics differ between the sites occupied by one species and the sites occupied by the other. To test this, Blois et al. (2014) recommend ANOVA, but one often has multiple environmental characteristics of interest, so we illustrate the use of permutational multivariate analysis of variance (PERMANOVA; Anderson 2001). Finally, if the two groups of sites differ significantly, that pair is removed from the list of negatively associated species.

In [4]:
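num_neg <- nrow(data_pairs_neg)
data_pairs_neg_env <- rep(NA_real_, num_neg)

# seq_len() skips the loop cleanly when there are no negative pairs
for (i in seq_len(num_neg)) {
  # as.character() ensures columns are selected by name even if the
  # species names are stored as factors
  sp1 <- as.character(data_pairs_neg[i, "sp1_name"])
  sp2 <- as.character(data_pairs_neg[i, "sp2_name"])
  both_sp <- cbind(species_composition[, sp1], species_composition[, sp2])
  # sites where exactly one member of the pair is present
  sp1_only <- which(both_sp[, 1] == 1 & both_sp[, 2] == 0)
  sp2_only <- which(both_sp[, 1] == 0 & both_sp[, 2] == 1)
  # environmental variables at those sites (first metadata column dropped)
  env_tmp_1 <- metadata[sp1_only, -1]
  env_tmp_2 <- metadata[sp2_only, -1]
  if (nrow(env_tmp_1) == 0 | nrow(env_tmp_2) == 0) {
    # no comparison is possible when one group of sites is empty
    data_pairs_neg_env[i] <- NA
  } else {
    env_tmp <- bind_rows(list(one = env_tmp_1, two = env_tmp_2), .id = "fac")
    # 'adonis' is the PERMANOVA function in vegan 2.5;
    # newer vegan releases provide adonis2() instead
    ad_tmp <- adonis(select(env_tmp, -fac) ~ env_tmp$fac, permutations = 99)
    data_pairs_neg_env[i] <- ad_tmp$aov.tab$`Pr(>F)`[1]
  }
}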
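
For readers unfamiliar with PERMANOVA, the test applied inside the loop above can be sketched in base R. This is a simplified illustration under stated assumptions, not the `vegan` implementation: it uses Euclidean distance (whereas `adonis()` defaults to Bray-Curtis dissimilarity), handles a single grouping factor only, and the function name `permanova_sketch` and the toy data are hypothetical.

```r
# One-way PERMANOVA sketch in base R (Euclidean distance).
# `permanova_sketch` is a hypothetical helper written for illustration.
permanova_sketch <- function(env, groups, permutations = 99) {
  d2 <- as.matrix(dist(env))^2  # squared pairwise distances between sites
  n <- nrow(d2)
  pseudo_f <- function(g) {
    g <- as.factor(g)
    a <- nlevels(g)
    # total sum of squares from all pairwise distances
    ss_total <- sum(d2[upper.tri(d2)]) / n
    # within-group sum of squares from distances inside each group
    ss_within <- sum(sapply(levels(g), function(l) {
      idx <- which(g == l)
      sub <- d2[idx, idx, drop = FALSE]
      sum(sub[upper.tri(sub)]) / length(idx)
    }))
    ss_among <- ss_total - ss_within
    (ss_among / (a - 1)) / (ss_within / (n - a))
  }
  f_obs <- pseudo_f(groups)
  # null distribution: recompute pseudo-F after shuffling the group labels
  f_perm <- replicate(permutations, pseudo_f(sample(groups)))
  # permutation p-value, counting the observed statistic itself
  p <- (sum(f_perm >= f_obs) + 1) / (permutations + 1)
  list(F = f_obs, p = p)
}

# Toy environment: sites with only species 1 have low antibiotic levels,
# sites with only species 2 have high levels (values are made up)
env <- data.frame(streptomycin = c(0, 0, 4, 4, 16, 16, 128, 128, 128, 128),
                  immigration  = c(0, 1, 0, 1, 0, 1, 0, 1, 0, 1))
groups <- rep(c("sp1_only", "sp2_only"), times = c(6, 4))

set.seed(1)
res <- permanova_sketch(env, groups)
print(res)
```

A small p-value here would indicate that the two groups of sites differ environmentally, which in the framework above means the negative association can be attributed to habitat differences rather than to an interaction.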

In [5]:
# which species associations remain after removing those caused by
# environmental conditions that are significantly different?
data_pairs_neg[which(data_pairs_neg_env > 0.05),]

| | sp1_name | sp2_name | association |
|---|---|---|---|
| 1 | Aeromonas_caviae_HAMBI_1972 | Stenotrophomonas_maltophilia_HAMBI_2659 | -1 |
| 3 | Bordetella_avium_HAMBI_2160 | Stenotrophomonas_maltophilia_HAMBI_2659 | -1 |
| 4 | Comamonas_testosteroni_HAMBI_403 | Stenotrophomonas_maltophilia_HAMBI_2659 | -1 |
| 6 | Myroides_odoratus_HAMBI_1923 | Stenotrophomonas_maltophilia_HAMBI_2659 | -1 |
| 7 | Niabella_yanshanensis_HAMBI_3031 | Pseudomonas_chlororaphis_HAMBI_1977 | -1 |
| 8 | Paracoccus_denitrificans_HAMBI_2443 | Stenotrophomonas_maltophilia_HAMBI_2659 | -1 |
| 9 | Pseudomonas_chlororaphis_HAMBI_1977 | Stenotrophomonas_maltophilia_HAMBI_2659 | -1 |
| 11 | Sphingobacterium_multivorum_HAMBI_1874 | Stenotrophomonas_maltophilia_HAMBI_2659 | -1 |
| 12 | Sphingobacterium_spiritivorum_HAMBI_1896 | Stenotrophomonas_maltophilia_HAMBI_2659 | -1 |
| 13 | Sphingobium_yanoikuyae_HAMBI_1842 | Stenotrophomonas_maltophilia_HAMBI_2659 | -1 |


Then, for each pair of positively associated species, test whether the environment at sites where both species occur differs from the environment at sites where neither species occurs, and remove the pairs for which it does from subsequent analysis.

In [6]:
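num_pos <- nrow(data_pairs_pos)
data_pairs_pos_env <- rep(NA_real_, num_pos)

# seq_len() skips the loop cleanly when there are no positive pairs
for (i in seq_len(num_pos)) {
  # as.character() ensures columns are selected by name even if the
  # species names are stored as factors
  sp1 <- as.character(data_pairs_pos[i, "sp1_name"])
  sp2 <- as.character(data_pairs_pos[i, "sp2_name"])
  both_sp <- cbind(species_composition[, sp1], species_composition[, sp2])
  # sites where both species occur and sites where neither occurs
  both_occ <- which(both_sp[, 1] == 1 & both_sp[, 2] == 1)
  neith_occ <- which(both_sp[, 1] == 0 & both_sp[, 2] == 0)
  # environmental variables at those sites (all metadata columns kept)
  env_tmp_1 <- metadata[both_occ, ]
  env_tmp_2 <- metadata[neith_occ, ]
  if (nrow(env_tmp_1) == 0 | nrow(env_tmp_2) == 0) {
    # no comparison is possible when one group of sites is empty
    data_pairs_pos_env[i] <- NA
  } else {
    env_tmp <- bind_rows(list(one = env_tmp_1, two = env_tmp_2), .id = "fac")
    # PERMANOVA on the environmental variables, grouped by site type
    ad_tmp <- adonis(select(env_tmp, -fac) ~ env_tmp$fac, permutations = 99)
    data_pairs_pos_env[i] <- ad_tmp$aov.tab$`Pr(>F)`[1]
  }
}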
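
The two loops rely on different partitions of the sites for a given species pair. The sketch below classifies sites for a made-up pair to show which sites enter each comparison; the vectors and the `site_class` labels are illustrative assumptions, not objects from the analysis.

```r
# Toy presence-absence data for two species across 8 sites (illustrative)
sp1 <- c(1, 1, 0, 0, 1, 0, 1, 0)
sp2 <- c(1, 0, 1, 0, 1, 0, 0, 1)

# Classify every site by which members of the pair are present
site_class <- ifelse(sp1 == 1 & sp2 == 1, "both",
              ifelse(sp1 == 1, "sp1_only",
              ifelse(sp2 == 1, "sp2_only", "neither")))

# Negative pairs compare "sp1_only" sites against "sp2_only" sites
neg_sites <- which(site_class %in% c("sp1_only", "sp2_only"))

# Positive pairs compare "both" sites against "neither" sites
pos_sites <- which(site_class %in% c("both", "neither"))

print(table(site_class))
```

Every site falls into exactly one class, so the negative and positive tests for the same pair never use the same site.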

In [7]:
# which species associations remain after removing those caused by
# environmental conditions that are significantly different?
data_pairs_pos[which(data_pairs_pos_env > 0.05),]

| | sp1_name | sp2_name | association |
|---|---|---|---|
| 6 | Azorhizobium_caulinodans_HAMBI_216 | Paracoccus_denitrificans_HAMBI_2443 | 1 |
| 28 | Myroides_odoratus_HAMBI_1923 | Paracoccus_denitrificans_HAMBI_2443 | 1 |
| 37 | Paracoccus_denitrificans_HAMBI_2443 | Sphingobium_yanoikuyae_HAMBI_1842 | 1 |
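
One detail of the filtering step is worth keeping in mind: pairs for which one group of sites was empty carry an `NA` p-value, and `which(... > 0.05)` drops them silently, so they are neither retained nor identified as environmentally explained. A small sketch with made-up p-values (the vector `p_env` is hypothetical) shows how the three outcomes could be separated:

```r
# Hypothetical p-values for five pairs; NA marks a pair that could not be tested
p_env <- c(0.01, 0.20, NA, 0.06, 0.03)

retained  <- which(p_env > 0.05)   # association not explained by environment
explained <- which(p_env <= 0.05)  # environment differs significantly
untested  <- which(is.na(p_env))   # one group of sites was empty

print(list(retained = retained, explained = explained, untested = untested))
```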


When corrected for environmental filtering, very few cases of higher-than-expected negative or positive species co-occurrence remain: 10 negative and 3 positive pairs. All but one of the negative pairs involve a single rare species, Stenotrophomonas maltophilia (HAMBI 2659), which does not co-occur with 9 other species. The remaining negative pair is between HAMBI 1977 and HAMBI 3031. Positive co-occurrences remain among four relatively low-abundance soil bacteria (HAMBI 216, 1842, 1923, and 2443), with HAMBI 2443 a member of every pair.

## References

Anderson, M. J. 2001. A new method for non-parametric multivariate analysis of variance. Austral Ecology 26:32–46. doi: [10.1111/j.1442-9993.2001.01070.pp.x](https://doi.org/10.1111/j.1442-9993.2001.01070.pp.x)

Blois, J. L., N. J. Gotelli, A. K. Behrensmeyer, J. T. Faith, S. K. Lyons, J. W. Williams, K. L. Amatangelo, A. Bercovici, A. Du, J. T. Eronen, G. R. Graves, N. Jud, C. Labandeira, C. V. Looy, B. McGill, D. Patterson, R. Potts, B. Riddle, R. Terry, A. Tóth, A. Villaseñor, and S. Wing. 2014. A framework for evaluating the influence of climate, dispersal limitation, and biotic interactions using fossil pollen associations across the late Quaternary. Ecography 37:1095–1108. doi: [10.1111/ecog.00779](https://doi.org/10.1111/ecog.00779)

Li, D., and D. Waller. 2016. Long-term shifts in the patterns and underlying processes of plant associations in Wisconsin forests. Global Ecology and Biogeography 25:516–526. doi: [10.1111/geb.12432](https://doi.org/10.1111/geb.12432)

Morueta-Holme, N., B. Blonder, B. Sandel, B. J. McGill, R. K. Peet, J. E. Ott, C. Violle, B. J. Enquist, P. M. Jørgensen, and J.-C. Svenning. 2016. A network approach for inferring species associations from co-occurrence data. Ecography 39:1–12. doi: [10.1111/ecog.01892](https://doi.org/10.1111/ecog.01892)