In [1]:
%load_ext rpy2.ipython
# Turning on in-IPy R capabilities

In [2]:
%%R
library(phyloseq)
library(vegan)
library(plyr)
library(dplyr)
library(ggplot2)

Loading required package: permute
Loading required package: lattice
This is vegan 2.3-0

Attaching package: ‘dplyr’

The following objects are masked from ‘package:plyr’:

    arrange, count, desc, failwith, id, mutate, rename, summarise,
    summarize

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union



In [3]:
%%R
#Creating the physeq object from our tree and biom table with taxonomic and metadata already included, telling it \
# that the taxonomy is from greengenes database, so it will recognize the samples
physeq = import_biom("../../SeqData/otu_table.tax.meta.biom", "../../SeqData/trees/fulltree.tre", parseFunction = parse_taxonomy_greengenes)

In [4]:
%%R
physeq

phyloseq-class experiment-level object
otu_table()   OTU Table:         [ 5452 taxa and 72 samples ]
sample_data() Sample Data:       [ 72 samples by 3 sample variables ]
tax_table()   Taxonomy Table:    [ 5452 taxa by 8 taxonomic ranks ]
phy_tree()    Phylogenetic Tree: [ 5452 tips and 5450 internal nodes ]


In [5]:
%%R
# Normalizes the sample counts by the total - i.e., reporting what fraction of each sample each OTU makes up.
physeq = transform_sample_counts(physeq, function(x) x / sum(x))

In [None]:
%%R
df = as(sample_data(physeq), "data.frame")
d = distance(physeq, method = "unifrac")
sampdat = sample_data(physeq)
groups = as.factor(sampdat$Day)
x = betadisper(d, groups)
boxplot(x, ylab = "Distance to centroid")
anova(x)
TukeyHSD(x, ordered = FALSE, conf.level = 0.95)
# Performing test of beta dispersion - the permanova/adonis test requires that groups have similar dispersion.
# This doesn't look outrageously bad here. Some worry re. Kaolinite, as expected, and soils look tight too.

In [42]:
%%R
df = as(sample_data(physeq), "data.frame")
d = distance(physeq, method = "bray")

d.adonis = adonis(d ~ sample_data(physeq)$Month + sample_data(physeq)$Trtmt, df)
d.adonis


Call:
adonis(formula = d ~ sample_data(physeq)$Month + sample_data(physeq)$Trtmt,      data = df) 

Permutation: free
Number of permutations: 999

Terms added sequentially (first to last)

                          Df SumsOfSqs MeanSqs F.Model      R2 Pr(>F)    
sample_data(physeq)$Month  1    0.2669 0.26694  1.6345 0.04511  0.055 .  
sample_data(physeq)$Trtmt  1    0.7512 0.75123  4.5999 0.12695  0.001 ***
Residuals                 30    4.8995 0.16332         0.82794           
Total                     32    5.9176                 1.00000           
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


There is a significant effect of both month and treatment, across samples

In [7]:
%%R
physeq.QS = subset_samples(physeq, Trtmt == c("Soil","Quartz"))
physeq.FS = subset_samples(physeq, Trtmt == c("Soil","Ferrihydrite"))
physeq.QF = subset_samples(physeq, Trtmt == c("Quartz","Ferrihydrite"))

In [8]:
%%R
ps = physeq.QS
df = as(sample_data(ps), "data.frame")
d = distance(ps, method = "bray")

d.adonis = adonis(d ~ sample_data(ps)$Month + sample_data(ps)$Trtmt, df)
d.adonis


Call:
adonis(formula = d ~ sample_data(ps)$Month + sample_data(ps)$Trtmt,      data = df) 

Permutation: free
Number of permutations: 999

Terms added sequentially (first to last)

                      Df SumsOfSqs MeanSqs F.Model      R2 Pr(>F)    
sample_data(ps)$Month  1   0.09045 0.09045  0.8509 0.03781  0.479    
sample_data(ps)$Trtmt  1   1.02607 1.02607  9.6527 0.42894  0.001 ***
Residuals             12   1.27558 0.10630         0.53325           
Total                 14   2.39211                 1.00000           
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


In [9]:
%%R
ps = physeq.FS
df = as(sample_data(ps), "data.frame")
d = distance(ps, method = "bray")

d.adonis = adonis(d ~ sample_data(ps)$Month + sample_data(ps)$Trtmt, df)
d.adonis


Call:
adonis(formula = d ~ sample_data(ps)$Month + sample_data(ps)$Trtmt,      data = df) 

Permutation: free
Number of permutations: 999

Terms added sequentially (first to last)

                      Df SumsOfSqs MeanSqs F.Model      R2 Pr(>F)    
sample_data(ps)$Month  1   0.17345 0.17345  1.5799 0.05869  0.159    
sample_data(ps)$Trtmt  1   1.24509 1.24509 11.3409 0.42127  0.001 ***
Residuals             14   1.53703 0.10979         0.52005           
Total                 16   2.95557                 1.00000           
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


In [10]:
%%R
ps = physeq.QF
df = as(sample_data(ps), "data.frame")
d = distance(ps, method = "bray")

d.adonis = adonis(d ~ sample_data(ps)$Month + sample_data(ps)$Trtmt, df)
d.adonis


Call:
adonis(formula = d ~ sample_data(ps)$Month + sample_data(ps)$Trtmt,      data = df) 

Permutation: free
Number of permutations: 999

Terms added sequentially (first to last)

                      Df SumsOfSqs MeanSqs F.Model      R2 Pr(>F)   
sample_data(ps)$Month  1   0.33423 0.33423  2.1606 0.11609  0.016 * 
sample_data(ps)$Trtmt  1   0.37921 0.37921  2.4514 0.13171  0.005 **
Residuals             14   2.16572 0.15469         0.75220          
Total                 16   2.87916                 1.00000          
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
