# This re-analysis provides the greengenes corrected tree to allow more accurate calculations of PD_whole_tree
## Australia analysis alpha-diversity

The goal of this analysis is to run basic alpha-diversity comparisons for the Australia GCMP dataset.
These will include estimates of richness (observed OTUs), evenness (equitability), and phylogenetic diversity (PD_whole_tree) for all samples, as well as alpha-diversity comparisons between compartments, sites, and other metadata categories.
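
To make the first two metrics concrete, here is a minimal sketch of how richness and equitability can be computed from one sample's OTU counts. The function names and the example counts are hypothetical, and this is not the QIIME implementation. PD_whole_tree is left out because it needs the phylogenetic tree.

```python
import math

def observed_otus(counts):
    """Richness: the number of OTUs with at least one read in the sample."""
    return sum(1 for c in counts if c > 0)

def shannon_equitability(counts):
    """Evenness: Shannon entropy divided by its maximum, log(richness)."""
    present = [c for c in counts if c > 0]
    total = float(sum(present))
    richness = len(present)
    if richness < 2:
        # Evenness is undefined when fewer than two OTUs are present
        return float("nan")
    entropy = -sum((c / total) * math.log(c / total) for c in present)
    return entropy / math.log(richness)

# Hypothetical sample: four equally abundant OTUs and one absent OTU
example_counts = [250, 250, 250, 250, 0]
print(observed_otus(example_counts))
print(shannon_equitability(example_counts))
```

Because equitability is a ratio of two logarithms, the result does not depend on the logarithm base.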

In [1]:
cd ..

/Users/FJPollock/Dropbox/Coral_Microbiomes_Postdoc/GCMP_OSU_PSU_Shared_Folder/Coral_microbe_coevolution/Projects/Australia_Coevolution_Paper/16S_analysis/3a_adiv_australia_analysis


In [2]:
#Import libraries
from os.path import join,abspath
from os import listdir

In [8]:
#Set file paths
input_dir = abspath("input/")
#NOTE: this is the rarefied OTU table
otu_table_1000 = abspath("input/otu_table_mc2_wtax_no_pynast_failures_no_organelles_even1000.biom")
otu_table = abspath("input/otu_table_mc2_wtax_no_pynast_failures_no_organelles.biom")
tree_fp = abspath("input/gg_constrained_rep_set_fastttree.tre")
mapping = abspath("input/gcmp16S_map_r24.txt")
output_dir = abspath("output/")
out_table_1000_scleractinia_only = abspath("input/otu_table_subset_scleractinian_no_whole_coral_even1000.biom")

### Alpha rarefaction
First, let's calculate alpha-diversity rarefaction curves for the whole dataset. Later we may want to filter the table down to mostly the study corals.
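
As a rough illustration of what a rarefaction curve measures, the sketch below repeatedly subsamples a single sample's reads without replacement at several depths and averages the observed richness. The OTU names, counts, depths, and function names are all hypothetical. The actual `alpha_rarefaction.py` workflow does considerably more, including handling every sample and several metrics.

```python
import random
from collections import Counter

def rarefy(counts, depth, rng):
    """Subsample `depth` reads without replacement from a dict of OTU counts."""
    reads = [otu for otu, n in counts.items() for _ in range(n)]
    if depth > len(reads):
        raise ValueError("depth exceeds the number of reads in the sample")
    return Counter(rng.sample(reads, depth))

def rarefaction_curve(counts, depths, replicates=10, seed=0):
    """Mean observed richness at each depth, averaged over replicates."""
    rng = random.Random(seed)
    curve = {}
    for depth in depths:
        richness = [len(rarefy(counts, depth, rng)) for _ in range(replicates)]
        curve[depth] = sum(richness) / float(replicates)
    return curve

# Hypothetical sample with 1000 reads spread over four OTUs
toy_counts = {"otu_a": 600, "otu_b": 300, "otu_c": 90, "otu_d": 10}
curve = rarefaction_curve(toy_counts, depths=[10, 100, 1000])
for depth in sorted(curve):
    print(depth, curve[depth])
```

Rare OTUs such as `otu_d` are often missed at shallow depths, which is why richness climbs with depth and why samples should be compared at a common depth.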

In [4]:
curr_output_dir = join(output_dir,"alpha_rarefaction_1000_gg_constrained_tree")
!alpha_rarefaction.py -i $otu_table_1000 -t $tree_fp -m $mapping -o $curr_output_dir -f

The alpha rarefaction step gives some interesting results. There may be minor differences in richness between compartments, but they are not striking. Instead, the differences between all corals and outgroups look substantial (corals are less rich), as do the differences by mode_of_larval_reproduction (brooders are indeed less rich) and sample_type.

So now it's time to run some statistical tests on these categories.
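
For intuition about this kind of group comparison, here is a minimal sketch of a Monte Carlo permutation test on the two-sample t statistic. It is a generic illustration that assumes a nonparametric comparison is wanted, and it is not the code inside `compare_alpha_diversity.py`. The function names and the example richness values are hypothetical.

```python
import math
import random

def t_statistic(a, b):
    """Two-sample t statistic using the pooled (equal-variance) estimate."""
    na, nb = len(a), len(b)
    mean_a = sum(a) / float(na)
    mean_b = sum(b) / float(nb)
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((x - mean_b) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / float(na + nb - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))

def permutation_p_value(a, b, permutations=999, seed=0):
    """Two-sided p-value from shuffling group labels `permutations` times."""
    rng = random.Random(seed)
    observed = abs(t_statistic(a, b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(pooled)
        shuffled_t = t_statistic(pooled[:len(a)], pooled[len(a):])
        if abs(shuffled_t) >= observed:
            hits += 1
    # Add one to both counts so the p-value is never exactly zero
    return (hits + 1) / float(permutations + 1)

# Hypothetical richness values for two groups of samples
group_1 = [10, 11, 12, 13, 14]
group_2 = [30, 31, 32, 33, 34]
print(permutation_p_value(group_1, group_2))
```

The sketch assumes each group has at least two samples and that the values are not all identical, since otherwise the pooled variance is zero.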

In [6]:
# PD_whole_tree w gg_constrained_tree
collated_alpha_dir = join(output_dir,"alpha_rarefaction_1000_gg_constrained_tree/alpha_div_collated/")
obs_otus_file = join(collated_alpha_dir,"PD_whole_tree.txt")
test_cat_output_dir = join(output_dir,"alpha_rarefaction_1000_stats_scleractinian_only_PD_gg_constrained_tree")
test_categories = "host_clade_sensu_fukami,field_host_genus_id,field_host_name,Huang_Roy_tree_name,NCBI_inherited_blast_name,Mode_of_larval_development,sample_type,BiologicalMatter,reef_name,functional_group_sensu_darling,sediment_contact,binary_macroalgal_contact,binary_turf_contact,dominant_cover_2m"
!compare_alpha_diversity.py -i $obs_otus_file -o $test_cat_output_dir -d 1000 -c $test_categories -p fdr -m $mapping
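
The `-p fdr` option in the command above requests a false discovery rate correction, which matters because each category produces many pairwise comparisons. The sketch below shows the standard Benjamini-Hochberg step-up adjustment. It is an assumption that this matches what the script does, and QIIME's own implementation may differ in details such as monotonicity enforcement. The function name and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(p_values)
    # Indices of the p-values, ordered from smallest to largest
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / float(rank))
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw p-values from four pairwise comparisons
raw = [0.01, 0.04, 0.03, 0.5]
print(benjamini_hochberg(raw))
```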



In [7]:
# Observed OTUs w gg_constrained_tree
collated_alpha_dir = join(output_dir,"alpha_rarefaction_1000_gg_constrained_tree/alpha_div_collated/")
obs_otus_file = join(collated_alpha_dir,"observed_otus.txt")
test_cat_output_dir = join(output_dir,"alpha_rarefaction_1000_stats_Observed_OTUs_gg_constrained_tree")
test_categories = "host_clade_sensu_fukami,field_host_genus_id,field_host_name,Huang_Roy_tree_name,NCBI_inherited_blast_name,Mode_of_larval_development,sample_type,BiologicalMatter,reef_name,functional_group_sensu_darling,sediment_contact,binary_macroalgal_contact,binary_turf_contact,dominant_cover_2m"
!compare_alpha_diversity.py -i $obs_otus_file -o $test_cat_output_dir -d 1000 -c $test_categories -p fdr -m $mapping



# Scleractinian-only

In [9]:
curr_output_dir = join(output_dir,"alpha_rarefaction_1000_Scleractinia_only_gg_constrained_tree")
!alpha_rarefaction.py -i $out_table_1000_scleractinia_only -t $tree_fp -m $mapping -o $curr_output_dir -f

In [10]:
# Observed OTUs w gg_constrained_tree (Scleractinian-only)
collated_alpha_dir = join(output_dir,"alpha_rarefaction_1000_Scleractinia_only_gg_constrained_tree/alpha_div_collated/")
obs_otus_file = join(collated_alpha_dir,"observed_otus.txt")
test_cat_output_dir = join(output_dir,"alpha_rarefaction_1000_Scleractinia_only_gg_constrained_tree_Observed_OTU_Scleractinian_only_stats")
test_categories = "BiologicalMatter"
!compare_alpha_diversity.py -i $obs_otus_file -o $test_cat_output_dir -d 1000 -c $test_categories -p fdr -m $mapping



In [11]:
# PD_whole_tree w gg_constrained_tree (Scleractinian-only)
collated_alpha_dir = join(output_dir,"alpha_rarefaction_1000_Scleractinia_only_gg_constrained_tree/alpha_div_collated/")
obs_otus_file = join(collated_alpha_dir,"PD_whole_tree.txt")
test_cat_output_dir = join(output_dir,"alpha_rarefaction_1000_Scleractinia_only_gg_constrained_tree_PD_Scleractinian_only_stats")
test_categories = "BiologicalMatter"
!compare_alpha_diversity.py -i $obs_otus_file -o $test_cat_output_dir -d 1000 -c $test_categories -p fdr -m $mapping



# Calculate per-sample alpha diversity for downstream use

In [None]:
!alpha_diversity.py -i otu_table.biom -m chao1,PD_whole_tree,equitability -o adiv_chao1_pd.txt -t rep_set.tre

In [6]:
!alpha_diversity.py -i input/otu_table_mc2_wtax_no_pynast_failures_no_organelles_even1000.biom -m equitability -o output/equitability.txt -t input/rep_set.tre
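
For downstream use, the per-sample results can be read back into Python. The sketch below assumes the output is a tab-separated table with a header row of metric names (and an empty first cell) followed by one row per sample. The function name, sample IDs, and values are hypothetical and stand in for the real `output/equitability.txt`.

```python
import csv
import io

def read_alpha_diversity(handle):
    """Parse a tab-separated alpha diversity table into {sample: {metric: value}}."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    metrics = header[1:]  # first header cell is empty, the rest are metric names
    table = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        table[row[0]] = {m: float(v) for m, v in zip(metrics, row[1:])}
    return table

# Hypothetical file contents, for illustration only
example = io.StringIO(u"\tequitability\nsample_1\t0.75\nsample_2\t0.5\n")
table = read_alpha_diversity(example)
print(table["sample_2"]["equitability"])
```

To read a real file, pass an open file handle in place of the `io.StringIO` object.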