# Deep rarefaction analysis
Reviewers suggested that we try alternatives to the 1k rarefaction depth. There is also evidence that, in many systems, reads continue to add significant information up to at least 50k reads per sample. The tradeoff, of course, is that a deeper rarefaction retains fewer samples.

So in this script we will test different rarefaction depths and check whether patterns such as skeleton > tissue > mucus diversity still hold.

The choice of depths is somewhat arbitrary. We are going with 1k (original study), 5k (preserves many samples), 10k (same as Apprill et al., mentioned by reviewers), 15k (keeps about half of the samples), and 20k (about the highest depth that preserves reasonable sample numbers).
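
To make the depth vs. sample-count tradeoff concrete, here is a minimal sketch that counts how many samples survive each depth. The read counts below are made up for illustration; real values would come from the OTU table summary (e.g. `table_summary.txt`). The assumption is that a sample is kept at a given depth only if it has at least that many reads.

```python
# Hypothetical per-sample read counts (illustrative only, not real data).
reads_per_sample = {
    "sample_a": 800,
    "sample_b": 4200,
    "sample_c": 12500,
    "sample_d": 18000,
    "sample_e": 26000,
}

def samples_retained(read_counts, depth):
    """Return the sorted sample IDs that have at least `depth` reads."""
    return sorted(s for s, n in read_counts.items() if n >= depth)

for depth in [1000, 5000, 10000, 15000, 20000]:
    kept = samples_retained(reads_per_sample, depth)
    print("depth %i: %i of %i samples retained"
          % (depth, len(kept), len(reads_per_sample)))
```

Running the same count on the real per-sample totals is a quick way to sanity-check statements like "15k keeps about half of the samples" before committing to a depth.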

In [1]:
from os import listdir
from os.path import join,abspath,splitext
!ls ../input
input_dir = "../input/"
output_dir = "../output/"
otu_table = join(input_dir,"otu_table_mc2_wtax_no_pynast_failures_no_organelles_coral_tissue_mucus_skeleton_only.biom")
mapping = join(input_dir,"gcmp16S_map_r25.txt")
tree = join(input_dir,"gg_constrained_rep_set_fastttree.tre")
rarefaction_depths = [1000,5000,10000,15000,20000]
bdiv_prefs_fp = join(input_dir,"bdiv_bc_prefs.txt")

bdiv_bc_prefs.txt
gcmp16S_map_r25.txt
gg_constrained_rep_set_fastttree.tre
host_tree_from_step_11.newick
otu_table_mc2_wtax_no_pynast_failures_no_organelles_coral_tissue_mucus_skeleton_only.biom
readme.txt
table_summary.txt


In [10]:
#Prefiltering
#Table summaries suggest that there are a few outgroup samples
#still in the analysis.
otu_table_no_outgroups = join(output_dir,"otu_table_no_outgroups.biom")
!filter_samples_from_otu_table.py -i $otu_table -m $mapping -s "outgroup:n" \
  -o $otu_table_no_outgroups

    

In [2]:

rarefied_data = {}
n_reps = 10 #for the record, this is the default value
for depth in rarefaction_depths:
    
    print("Rarefying at depth %i" % depth)
    
    curr_output_dir = join(output_dir,"rarified_to_%i/" %depth)
    rarefied_data[depth] = curr_output_dir
    !multiple_rarefactions_even_depth.py -i $otu_table_no_outgroups \
      -o $curr_output_dir --lineages_included -d $depth -n $n_reps
    

Rarefying at depth 1000
Error in multiple_rarefactions_even_depth.py: option -i: file does not exist: '-o'

If you need help with QIIME, see:
http://help.qiime.org
Rarefying at depth 5000
Error in multiple_rarefactions_even_depth.py: option -i: file does not exist: '-o'

If you need help with QIIME, see:
http://help.qiime.org
Rarefying at depth 10000
Error in multiple_rarefactions_even_depth.py: option -i: file does not exist: '-o'

If you need help with QIIME, see:
http://help.qiime.org
Rarefying at depth 15000
Error in multiple_rarefactions_even_depth.py: option -i: file does not exist: '-o'

If you need help with QIIME, see:
http://help.qiime.org
Rarefying at depth 20000
Error in multiple_rarefactions_even_depth.py: option -i: file does not exist: '-o'

If you need help with QIIME, see:
http://help.qiime.org
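
The `file does not exist: '-o'` message above is what I would expect if `otu_table_no_outgroups` was not defined when this cell ran (the prefiltering cell is numbered In [10], so it may have run later). In that case nothing is substituted after `-i`, and the option parser takes `-o` as the file name. That is an inference from the log, not something I have confirmed. A minimal guard, using a hypothetical helper `require_file`, fails early with a clearer message:

```python
from os.path import isfile

def require_file(path, label="input file"):
    """Return `path` unchanged, or raise if it is empty or not an existing file.

    `require_file` is a hypothetical helper, not part of QIIME.
    """
    if not path or not isfile(path):
        raise IOError("%s not found: %r" % (label, path))
    return path

# Demonstration with an empty path, which mimics an undefined variable.
try:
    require_file("", "prefiltered OTU table")
except IOError as err:
    print("Caught: %s" % err)
```

Calling `require_file(otu_table_no_outgroups, "prefiltered OTU table")` once before the rarefaction loop would stop the cell before any QIIME command is launched.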


### Define rarefaction directories

Do this without running the analytical steps, to allow easier checkpointing.

In [2]:
#Separately define just the bookkeeping part of the rarefaction step so you can
#recover the rarefied filepaths without actually rerunning the analysis.
rarefaction_depths = [1000,5000,10000,15000,20000]
print(rarefaction_depths)
rarefied_data = {}
collated_alpha_dirs = {}
for depth in rarefaction_depths: 
    curr_output_dir = join(output_dir,"rarified_to_%i/" %depth)
    rarefied_data[depth] = curr_output_dir
    collated_alpha_dir = join(output_dir,"adiv_%i/" %depth)
    collated_alpha_dirs[depth] = collated_alpha_dir
    
print(collated_alpha_dirs)

[1000, 5000, 10000, 15000, 20000]
{1000: '../output/adiv_1000/', 5000: '../output/adiv_5000/', 15000: '../output/adiv_15000/', 10000: '../output/adiv_10000/', 20000: '../output/adiv_20000/'}


#### Alpha-diversity compartment comparison

Run alpha diversity for ALL samples (NOT split by compartment).

In [None]:
adiv_methods = "equitability,PD_whole_tree,chao1,observed_otus"
comparison_categories = "BiologicalMatter,functional_group_sensu_darling"
for depth in rarefaction_depths:
    print ("Analyzing depth %i" %depth)
    otu_table_dir = rarefied_data[depth]
    curr_outdir = join(output_dir,"adiv_%i" %depth)
    
    !alpha_diversity.py -i $otu_table_dir -o $curr_outdir -t $tree -m $adiv_methods
    
    print ("Collating alpha diversity...")
    collated_alpha_dir = join(output_dir,"collated_alpha_%i/" %depth)
    print ("Collated alpha dir: %s" %collated_alpha_dir)
    print ("alpha diversity dir: %s" %curr_outdir)
    !collate_alpha.py -i $curr_outdir -o $collated_alpha_dir
    print ("Running statistical comparisons...")
    compare_alpha_results = join(output_dir,"compare_alpha_%i" %depth)
    for m in adiv_methods.split(","):
        collated_alpha_file = join(collated_alpha_dir,"%s.txt" %m)
        !compare_alpha_diversity.py -i $collated_alpha_file -o $compare_alpha_results \
          -m $mapping -c $comparison_categories -t nonparametric

Analyzing depth 1000
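
For reference, here is a rough sketch of what three of the four metrics above measure, computed on a toy vector of OTU counts for one sample. These are textbook formulas written from scratch; I have not checked that they match `alpha_diversity.py` term for term (the bias-corrected form of Chao1 in particular is an assumption). `PD_whole_tree` is left out because it needs the phylogeny.

```python
from math import log

def observed_otus(counts):
    """Number of OTUs with at least one read."""
    return sum(1 for c in counts if c > 0)

def chao1_bias_corrected(counts):
    """Chao1 richness estimate, bias-corrected form:
    S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)),
    where F1 and F2 are the numbers of singleton and doubleton OTUs."""
    s_obs = observed_otus(counts)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return s_obs + f1 * (f1 - 1) / (2.0 * (f2 + 1))

def equitability(counts):
    """Shannon entropy divided by its maximum, log(S_obs).
    Returns NaN when fewer than two OTUs are present."""
    present = [c for c in counts if c > 0]
    if len(present) < 2:
        return float("nan")
    total = float(sum(present))
    shannon = -sum((c / total) * log(c / total) for c in present)
    return shannon / log(len(present))

toy_counts = [10, 5, 2, 2, 1, 1, 1, 0]  # made-up counts for one sample
print(observed_otus(toy_counts))
print(chao1_bias_corrected(toy_counts))
print(equitability(toy_counts))
```

Because Chao1 depends on singleton and doubleton counts, it is the metric I would expect to be most sensitive to rarefaction depth.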


### Add per-sample alpha-diversity to mapping file

In [20]:
depth = 1000
alpha_dir = collated_alpha_dirs[depth]
#alpha_files = ",".join([join(alpha_dir,m) for m in ("observed_otus.txt","equitability.txt","PD_whole_tree.txt","chao1.txt")])
alpha_file = join(alpha_dir,"alpha_rarefaction_1000_0.txt")
output_file = join(output_dir,"mapping_with_alpha_values.txt")
cmd_str = "add_alpha_to_mapping_file.py -i %s -o %s -m %s --depth %i --binning_method quantile -b 4" %(alpha_file,output_file,mapping,depth)
print(cmd_str)
!$cmd_str

add_alpha_to_mapping_file.py -i ../output/adiv_1000/alpha_rarefaction_1000_0.txt -o ../output/mapping_with_alpha_values.txt -m ../input/gcmp16S_map_r25.txt --depth 1000 --binning_method quantile -b 4


### Per compartment alpha-diversity comparisons

To run compare_alpha_diversity.py, I need separate alpha-diversity results for each compartment.

In [6]:
compartments = ['Coral Tissue','Coral Mucus','Coral Skeleton']

for depth in rarefaction_depths:
    print ("Analyzing depth %i" %depth)
    otu_table_dir = rarefied_data[depth]
    #curr_otu_table = join(otu_table_dir,"rarefaction_%i_0.biom" %depth)
    #print ("curr_otu_table:",curr_otu_table)
    for compartment in compartments:
        compartment_no_spaces = compartment.replace(" ","_")
        criterion = "'BiologicalMatter:%s'" %compartment
        print("curr compartment filter:%s" %criterion)
        
        curr_outdir = join(output_dir,"rarified_to_%i_%s" %(depth,compartment_no_spaces))
        print("curr outdir: %s" %curr_outdir)
        !mkdir $curr_outdir
        
        for rarified_otu_table in listdir(otu_table_dir):
            print("Rarified OTU table: %s" %rarified_otu_table)
            base_name,ext = splitext(rarified_otu_table)
            if not base_name[-1].isdigit():
                print("Skipping file: %s ....doesn't look like a rarified OTU table" %rarified_otu_table)
                continue
            new_name = "".join(["_".join([compartment_no_spaces,base_name]),ext])
            print("Filtered filename: %s" %new_name)
            new_fp = join(curr_outdir,new_name)
            print("Output filepath:%s" %new_fp)
            
            
            table_to_filter = join(otu_table_dir,rarified_otu_table)
            print("Table to filter:%s" %table_to_filter)
            !filter_samples_from_otu_table.py -i $table_to_filter -o $new_fp --valid_states $criterion -m $mapping
        
        
    
 

Analyzing depth 1000
curr compartment filter:'BiologicalMatter:Coral Tissue'
curr outdir: ../output/rarified_to_1000_Coral_Tissue
mkdir: ../output/rarified_to_1000_Coral_Tissue: File exists
Rarified OTU table: rarefaction_1000_0.biom
Filtered filename: Coral_Tissue_rarefaction_1000_0.biom
Output filepath:../output/rarified_to_1000_Coral_Tissue/Coral_Tissue_rarefaction_1000_0.biom
Table to filter:../output/rarified_to_1000/rarefaction_1000_0.biom
Rarified OTU table: rarefaction_1000_0_Coral_Mucus.biom
Skipping file: rarefaction_1000_0_Coral_Mucus.biom ....doesn't look like a rarified OTU table
Rarified OTU table: rarefaction_1000_0_Coral_Skeleton.biom
Skipping file: rarefaction_1000_0_Coral_Skeleton.biom ....doesn't look like a rarified OTU table
Rarified OTU table: rarefaction_1000_0_Coral_Tissue.biom
Skipping file: rarefaction_1000_0_Coral_Tissue.biom ....doesn't look like a rarified OTU table
Rarified OTU table: rarefaction_1000_1.biom
Filtered filename: Coral_Tissue_rarefaction_1000
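
The log above shows that the rarefied directory also holds per-compartment tables such as `rarefaction_1000_0_Coral_Mucus.biom`, which the "last character is a digit" check skips. That check would still accept unrelated files whose base name happens to end in a digit. A stricter alternative, sketched here under the assumption that rarefied tables are always named `rarefaction_<depth>_<replicate>.biom` (as in the listing), is to match the whole filename:

```python
import re

# Assumed naming pattern, taken from the filenames visible in the log.
RAREFIED_TABLE = re.compile(r"^rarefaction_(\d+)_(\d+)\.biom$")

def parse_rarefied_name(filename):
    """Return (depth, replicate) for a rarefied OTU table, else None."""
    match = RAREFIED_TABLE.match(filename)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

for name in ["rarefaction_1000_0.biom",
             "rarefaction_1000_0_Coral_Mucus.biom",
             "notes_v2.txt"]:
    print("%s -> %s" % (name, parse_rarefied_name(name)))
```

In the loop, `if parse_rarefied_name(rarified_otu_table) is None: continue` would replace the `isdigit()` test.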

In [3]:
compartments = ['Coral Tissue','Coral Mucus','Coral Skeleton']
adiv_methods = "equitability,PD_whole_tree,chao1,observed_otus"

for depth in rarefaction_depths:
    print ("Analyzing depth %i" %depth)
    
    #here's the issue
    
    #curr_otu_table = join(otu_table_dir,"rarefaction_%i_0.biom" %depth)
    #print ("curr_otu_table:",curr_otu_table)
    for compartment in compartments:
        compartment_no_spaces = compartment.replace(" ","_")
        otu_table_dir =  join(output_dir,"rarified_to_%i_%s" %(depth,compartment_no_spaces))
        curr_outdir = join(output_dir,"adiv_%s_%i" %(compartment_no_spaces,depth))
       
        
        print("Calculating alpha diversity on directory: %s" %otu_table_dir)
        adiv_cmd = "alpha_diversity.py -i %s -o %s -t %s -m %s" %(otu_table_dir,curr_outdir,tree,adiv_methods)
        print("adiv cmd:%s"%adiv_cmd)
        !$adiv_cmd
        print ("Done. alpha diversity output dir: %s" %curr_outdir)
        collated_alpha_dir = join(output_dir,"collated_alpha_%s_%i" %(compartment_no_spaces,depth))
        print ("Collating alpha diversity to output dir: %s" %collated_alpha_dir)
        
        !collate_alpha.py -i $curr_outdir -o $collated_alpha_dir
        

Analyzing depth 1000
Calculating alpha diversity on directory: ../output/rarified_to_1000_Coral_Tissue
adiv cmd:alpha_diversity.py -i ../output/rarified_to_1000_Coral_Tissue -o ../output/adiv_Coral_Tissue_1000 -t ../input/gg_constrained_rep_set_fastttree.tre -m equitability,PD_whole_tree,chao1,observed_otus
Done. alpha diversity output dir: ../output/adiv_Coral_Tissue_1000
Collating alpha diversity to output dir: ../output/collated_alpha_Coral_Tissue_1000
Calculating alpha diversity on directory: ../output/rarified_to_1000_Coral_Mucus
adiv cmd:alpha_diversity.py -i ../output/rarified_to_1000_Coral_Mucus -o ../output/adiv_Coral_Mucus_1000 -t ../input/gg_constrained_rep_set_fastttree.tre -m equitability,PD_whole_tree,chao1,observed_otus
Done. alpha diversity output dir: ../output/adiv_Coral_Mucus_1000
Collating alpha diversity to output dir: ../output/collated_alpha_Coral_Mucus_1000
Calculating alpha diversity on directory: ../output/rarified_to_1000_Coral_Skeleton
adiv cmd:alpha_diversi

In [5]:
adiv_methods = "equitability,PD_whole_tree,chao1,observed_otus"
comparison_categories = "functional_group_sensu_darling"
compartments = ['Coral Tissue','Coral Mucus','Coral Skeleton']

for depth in rarefaction_depths:
    print ("Analyzing depth %i" %depth)
    for compartment in compartments:
        compartment_no_spaces = compartment.replace(" ","_")
        #collated per-compartment alpha diversity, written by the previous cell
        collated_alpha_dir = join(output_dir,"collated_alpha_%s_%i" %(compartment_no_spaces,depth))

        print ("Running alpha diversity comparisons comparisons...")
        
        for m in adiv_methods.split(","):
            compare_alpha_results = join(output_dir,"compare_alpha_%s_%i_%s" %(compartment_no_spaces,depth,m))
            collated_alpha_file = join(collated_alpha_dir,"%s.txt" %m)
            !compare_alpha_diversity.py -i $collated_alpha_file -o $compare_alpha_results \
              -m $mapping -c $comparison_categories -t nonparametric

Analyzing depth 1000
Running alpha diversity comparisons comparisons...
Running alpha diversity comparisons comparisons...
Running alpha diversity comparisons comparisons...
Analyzing depth 5000
Running alpha diversity comparisons comparisons...
Error in compare_alpha_diversity.py: option -i: file does not exist: '../output/collated_alpha_Coral_Tissue_5000/equitability.txt'

If you need help with QIIME, see:
http://help.qiime.org
Error in compare_alpha_diversity.py: option -i: file does not exist: '../output/collated_alpha_Coral_Tissue_5000/PD_whole_tree.txt'

If you need help with QIIME, see:
http://help.qiime.org
Error in compare_alpha_diversity.py: option -i: file does not exist: '../output/collated_alpha_Coral_Tissue_5000/chao1.txt'

If you need help with QIIME, see:
http://help.qiime.org
Error in compare_alpha_diversity.py: option -i: file does not exist: '../output/collated_alpha_Coral_Tissue_5000/observed_otus.txt'

If you need help with QIIME, see:
http://help.qiime.org
Running

#### Core microbiome analysis

In [17]:
## Pick the 0th rarefaction at each depth,
# calculate core microbiomes for each tissue compartment

for depth in rarefaction_depths:
    print ("Analyzing depth %i" %depth)
    otu_table_dir = rarefied_data[depth]
    curr_otu_table = join(otu_table_dir,"rarefaction_%i_0.biom" %depth)
    #print ("curr_otu_table:",curr_otu_table)
    for compartment in ['Coral Tissue','Coral Mucus','Coral Skeleton']:
        curr_outdir = join(output_dir,"core_%s_%i" %(compartment.replace(" ","_"),depth))
        curr_compartment = "BiologicalMatter:%s" %compartment
        print("curr_compartment:%s" %curr_compartment)
        !echo $curr_otu_table
        !ls $otu_table_dir
        
        
        #NOTE this shouldn't be necessary
        #but was getting an arcane error when substituting directly using !/$ magic
        cmd_template = 'compute_core_microbiome.py -i %s --mapping_fp %s --valid_states "%s" -o %s --num_fraction_for_core_steps 11'
        cmd_str = cmd_template %(curr_otu_table,mapping,curr_compartment,curr_outdir)
        print(cmd_str)
        !$cmd_str
        

Analyzing depth 1000
curr_compartment:BiologicalMatter:Coral Tissue
../output/rarified_to_1000/rarefaction_1000_0.biom
rarefaction_1000_0.biom rarefaction_1000_2.biom rarefaction_1000_4.biom rarefaction_1000_6.biom rarefaction_1000_8.biom
rarefaction_1000_1.biom rarefaction_1000_3.biom rarefaction_1000_5.biom rarefaction_1000_7.biom rarefaction_1000_9.biom
compute_core_microbiome.py -i ../output/rarified_to_1000/rarefaction_1000_0.biom --mapping_fp ../input/gcmp16S_map_r25.txt --valid_states "BiologicalMatter:Coral Tissue" -o ../output/core_Coral_Tissue_1000 --num_fraction_for_core_steps 21
Traceback (most recent call last):
  File "/macqiime/anaconda/bin/compute_core_microbiome.py", line 171, in <module>
    main()
  File "/macqiime/anaconda/bin/compute_core_microbiome.py", line 156, in main
    write_biom_table(core_table, output_table_fp)
  File "/macqiime/anaconda/lib/python2.7/site-packages/qiime/util.py", line 569, in write_biom_table
    "Attempting to write an empty BIOM table 

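The idea behind the core microbiome step is to list the OTUs present in at least a given fraction of samples, for a series of fractions. The sketch below shows that logic on a toy table. It assumes the fractions are evenly spaced between 0.5 and 1.0 and that "present" means a nonzero count. Both are my assumptions, not settings read from this notebook. The toy example also shows how the strictest threshold can yield an empty core, which may be related to the "empty BIOM table" message in the truncated traceback above.

```python
def core_fractions(n_steps, min_fraction=0.5, max_fraction=1.0):
    """Evenly spaced fractions from min_fraction to max_fraction (inclusive)."""
    step = (max_fraction - min_fraction) / (n_steps - 1)
    return [round(min_fraction + i * step, 10) for i in range(n_steps)]

def core_otus(table, sample_ids, fraction):
    """OTUs with a nonzero count in at least `fraction` of `sample_ids`.

    `table` maps OTU id -> {sample id: count}.
    """
    n_samples = len(sample_ids)
    core = []
    for otu, counts in table.items():
        n_present = sum(1 for s in sample_ids if counts.get(s, 0) > 0)
        if n_present >= fraction * n_samples:
            core.append(otu)
    return sorted(core)

# Toy data (made up): three OTUs across four samples.
samples = ["s1", "s2", "s3", "s4"]
toy_table = {
    "otu_1": {"s1": 5, "s2": 3, "s3": 1, "s4": 0},  # present in 3 of 4
    "otu_2": {"s1": 2},                             # present in 1 of 4
    "otu_3": {"s1": 1, "s2": 1},                    # present in 2 of 4
}

for fraction in core_fractions(3):
    print("%.2f: %s" % (fraction, core_otus(toy_table, samples, fraction)))
```
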
# Beta-diversity analysis across rarefactions

As a next step, we'd like to test the sensitivity of our conclusions about beta-diversity
(mostly drawn at 1000 seqs/sample) to additional depths.

Across all of our analyses, the factors we test are:
prop_Colony_maximum_diameter_universal
field_host_name
latitude
field_host_genus_id
reef_name
oz_disease_mean
visibility
turf_contact_percent
host_clade_sensu_fukami
binary_turf_contact
photosynthetically_active_radiation
cyanobacteria_percent
functional_group_sensu_darling
cyanobacteria_contact
n_macroalgal_contacts
binary_macroalgal_contact

Additional factors we should include for consistency or based on reviewer comments:
temperature 
depth
geographic_area
maximum_corallite_width
symbiodinium_sp_in_propagules
growth_form_typical
complex_robust


### Step 1: Infer beta-diversity distances at each depth



In [14]:
#Beta diversity across depths
#First, generate compartment tables at each rarefaction depth
for depth in rarefaction_depths:
    print ("Analyzing depth %i" %depth)
    otu_table_dir = rarefied_data[depth]
    curr_otu_table = join(otu_table_dir,"rarefaction_%i_0.biom" %depth)
    #print ("curr_otu_table:",curr_otu_table)
    for compartment in ['Coral Tissue','Coral Mucus','Coral Skeleton']:
        curr_outdir = join(output_dir,"bdiv_%s_%i" %(compartment.replace(" ","_"),depth))
        curr_compartment = "BiologicalMatter:%s" %compartment
        #print("curr_compartment:%s" %curr_compartment)
        compartment_table = join(otu_table_dir,"rarefaction_%i_0_%s.biom" %(depth,compartment.replace(" ","_"))) 
        cmd_str = "filter_samples_from_otu_table.py -i %s -o %s --valid_states '%s' -m %s"
        cmd_str = cmd_str %(curr_otu_table,compartment_table,curr_compartment,mapping)
        print(cmd_str)
        !$cmd_str

Analyzing depth 1000
filter_samples_from_otu_table.py -i ../output/rarified_to_1000/rarefaction_1000_0.biom -o ../output/rarified_to_1000/rarefaction_1000_0_Coral_Tissue.biom --valid_states 'BiologicalMatter:Coral Tissue' -m ../input/gcmp16S_map_r25.txt
filter_samples_from_otu_table.py -i ../output/rarified_to_1000/rarefaction_1000_0.biom -o ../output/rarified_to_1000/rarefaction_1000_0_Coral_Mucus.biom --valid_states 'BiologicalMatter:Coral Mucus' -m ../input/gcmp16S_map_r25.txt
filter_samples_from_otu_table.py -i ../output/rarified_to_1000/rarefaction_1000_0.biom -o ../output/rarified_to_1000/rarefaction_1000_0_Coral_Skeleton.biom --valid_states 'BiologicalMatter:Coral Skeleton' -m ../input/gcmp16S_map_r25.txt
Analyzing depth 5000
filter_samples_from_otu_table.py -i ../output/rarified_to_5000/rarefaction_5000_0.biom -o ../output/rarified_to_5000/rarefaction_5000_0_Coral_Tissue.biom --valid_states 'BiologicalMatter:Coral Tissue' -m ../input/gcmp16S_map_r25.txt
filter_samples_from_otu_

In [17]:
#Beta diversity across depths
#Second, run beta_diversity_through_plots.py
for depth in rarefaction_depths:
    print ("Analyzing depth %i" %depth)
    otu_table_dir = rarefied_data[depth]
    curr_otu_table = join(otu_table_dir,"rarefaction_%i_0.biom" %depth)
    #print ("curr_otu_table:",curr_otu_table)
    for compartment in ['Coral Tissue','Coral Mucus','Coral Skeleton']:
        curr_outdir = join(output_dir,"bdiv_%s_%i" %(compartment.replace(" ","_"),depth))
        curr_compartment = "BiologicalMatter:%s" %compartment 
        compartment_table = join(otu_table_dir,"rarefaction_%i_0_%s.biom" %(depth,compartment.replace(" ","_")))
        #NOTE this shouldn't be necessary
        #but was getting an arcane error when substituting directly using !/$ magic
        cmd_template = 'beta_diversity_through_plots.py -i %s -t %s --mapping_fp %s -o %s -p %s'
        cmd_str = cmd_template %(compartment_table,tree,mapping,curr_outdir,bdiv_prefs_fp)
        print(cmd_str)
        !$cmd_str

Analyzing depth 1000
beta_diversity_through_plots.py -i ../output/rarified_to_1000/rarefaction_1000_0_Coral_Tissue.biom -t ../input/gg_constrained_rep_set_fastttree.tre --mapping_fp ../input/gcmp16S_map_r25.txt -o ../output/bdiv_Coral_Tissue_1000 -p ../input/bdiv_bc_prefs.txt
beta_diversity_through_plots.py -i ../output/rarified_to_1000/rarefaction_1000_0_Coral_Mucus.biom -t ../input/gg_constrained_rep_set_fastttree.tre --mapping_fp ../input/gcmp16S_map_r25.txt -o ../output/bdiv_Coral_Mucus_1000 -p ../input/bdiv_bc_prefs.txt
beta_diversity_through_plots.py -i ../output/rarified_to_1000/rarefaction_1000_0_Coral_Skeleton.biom -t ../input/gg_constrained_rep_set_fastttree.tre --mapping_fp ../input/gcmp16S_map_r25.txt -o ../output/bdiv_Coral_Skeleton_1000 -p ../input/bdiv_bc_prefs.txt
Analyzing depth 5000
beta_diversity_through_plots.py -i ../output/rarified_to_5000/rarefaction_5000_0_Coral_Tissue.biom -t ../input/gg_constrained_rep_set_fastttree.tre --mapping_fp ../input/gcmp16S_map_r25.tx
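
As a reminder of what one of these distance matrices contains, here is a minimal sketch of Bray-Curtis dissimilarity between two samples, written from the standard formula rather than taken from QIIME. The count vectors are made up. UniFrac is not sketched because it needs the tree.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two count vectors of equal length:
    1 - 2 * sum(min(u_i, v_i)) / (sum(u) + sum(v))."""
    shared = sum(min(a, b) for a, b in zip(u, v))
    total = sum(u) + sum(v)
    return 1.0 - 2.0 * shared / total

# Toy samples, both "rarefied" to 20 reads (illustrative only).
sample_1 = [10, 0, 5, 5]
sample_2 = [5, 5, 5, 5]
print(bray_curtis(sample_1, sample_2))
```

When both samples have been rarefied to the same depth N, the formula reduces to 1 - shared/N, so the dissimilarity is just the fraction of reads not shared between the two samples.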

In [9]:
categories = ['Range_size','Colony_maximum_diameter','prop_Colony_maximum_GCMP_recorded',\
            'max_dimension','enclosed_area','IUCN_Red_List_categoy','16S_tree_name','outgroup',\
  'Oocyte_size_at_maturity','temperature','depth','geographic_area',\
  'Corallite_width_maximum','Corallite_width_minimum','Symbiodinium_sp_in_propagules',\
 'Growth_form_typical','Sexual_system','Mode_of_larval_development','complex_robust',\
'prop_Colony_maximum_diameter_universal','host_name','latitude','longitude','host_genus',\
 'reef_name','oz_disease_mean','visibility','Skeletal_density','turf_contact_percent',\
  'host_clade_sensu_fukami','binary_turf_contact',\
  'photosynthetically_active_radiation','cyanobacteria_percent',\
    'functional_group_sensu_darling','cyanobacteria_contact',\
  'binary_CCA_contact','cca_contact_percent','n_macroalgal_contacts',\
    'binary_macroalgal_contact']
#categories = ['depth']

#Beta diversity across depths
#Third, run compare_categories.py (adonis) for each category, metric, compartment and depth
#This is going to be annoying to report at a gazillion depths. Maybe just report the most
#reasonable ones (probably 1k for comparison to Zaneveld et al., 2016 and 10k for
#comparison to Apprill et al.)

#The loop is set up to be general; to run only weighted UniFrac,
#swap in the commented-out metrics line below
metrics = ['weighted_unifrac','bray_curtis','unweighted_unifrac']
#metrics = ['weighted_unifrac']
rarefaction_depths = [1000,5000,10000,15000,20000]
#rarefaction_depths = [1000]
for depth in rarefaction_depths:
    print ("Analyzing depth %i" %depth)
    otu_table_dir = rarefied_data[depth]
    curr_otu_table = join(otu_table_dir,"rarefaction_%i_0.biom" %depth)
    #print ("curr_otu_table:",curr_otu_table)
    for compartment in ['Coral Tissue','Coral Mucus','Coral Skeleton']:
        for metric in metrics:
            for category in categories:
                curr_dm = join(output_dir,"bdiv_%s_%i"%(compartment.replace(" ","_"),depth),"%s_dm.txt"%(metric))
                print(curr_dm)
                curr_outdir = join(output_dir,"compare_categories_r2","%i"%depth,\
                  "%s"%(compartment.replace(" ","_")),"compare_categories_%s_%s" %(category,metric))
                curr_compartment = "BiologicalMatter:%s" %compartment 
                relevant_dm = join(output_dir,"bdiv_%s_%i"%(compartment.replace(" ","_"),depth),"%s_dm_filtered_for_%s.txt"%(metric,category))
                negative_filter = "'%s:Unknown'" %category
                !filter_distance_matrix.py -i $curr_dm -o $relevant_dm -m $mapping -s $negative_filter --negate
                
                filtered_otu_table = join(output_dir,"bdiv_%s_%i"%(compartment.replace(" ","_"),depth),"%s_otu_table_filtered_for_%s.biom"%(metric,category))
                filtered_mapping = join(output_dir,"bdiv_%s_%i"%(compartment.replace(" ","_"),depth),"%s_mapping_filtered_for_%s.txt"%(metric,category))
                
                #NOTE: I only really want to filter samples with 'Unknown' values from the mapping file
                #but there is no separate script for this. So I have to filter the OTU table too.
                
                positive_filter = "'%s:*,!Unknown'" %category
                !filter_samples_from_otu_table.py -i $curr_otu_table -o $filtered_otu_table --output_mapping_fp $filtered_mapping -m $mapping -s $positive_filter
                #NOTE this shouldn't be necessary
                #but was getting an arcane error when substituting directly using !/$ magic
                cmd_template = "compare_categories.py -i %s  -m %s --method adonis -o %s -c '%s' -n 9999"
                cmd_str = cmd_template %(relevant_dm,filtered_mapping,curr_outdir,category)
                print(cmd_str)
                !$cmd_str

Analyzing depth 1000
../output/bdiv_Coral_Tissue_1000/weighted_unifrac_dm.txt
compare_categories.py -i ../output/bdiv_Coral_Tissue_1000/weighted_unifrac_dm_filtered_for_depth.txt  -m ../output/bdiv_Coral_Tissue_1000/weighted_unifrac_mapping_filtered_for_depth.txt --method adonis -o ../output/compare_categories_r2/1000/Coral_Tissue/compare_categories_depth_weighted_unifrac -c 'depth' -n 9999
../output/bdiv_Coral_Tissue_1000/bray_curtis_dm.txt
compare_categories.py -i ../output/bdiv_Coral_Tissue_1000/bray_curtis_dm_filtered_for_depth.txt  -m ../output/bdiv_Coral_Tissue_1000/bray_curtis_mapping_filtered_for_depth.txt --method adonis -o ../output/compare_categories_r2/1000/Coral_Tissue/compare_categories_depth_bray_curtis -c 'depth' -n 9999
../output/bdiv_Coral_Tissue_1000/unweighted_unifrac_dm.txt
compare_categories.py -i ../output/bdiv_Coral_Tissue_1000/unweighted_unifrac_dm_filtered_for_depth.txt  -m ../output/bdiv_Coral_Tissue_1000/unweighted_unifrac_mapping_filtered_for_depth.txt --me
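
The number we mainly care about from each Adonis run is R², the share of variation in the distance matrix explained by the category. For a single categorical factor this follows the standard PERMANOVA partitioning of squared distances, sketched below on a toy matrix. This is not the code that `compare_categories.py` runs, it omits the permutation test (the `-n 9999` part), and it does not cover continuous factors such as latitude.

```python
def permanova_r2(dm, groups):
    """R^2 for one categorical factor from a square, symmetric distance matrix.

    SS_total  = (1/N)   * sum of squared distances over all pairs
    SS_within = sum over groups of (1/n_g) * sum of squared within-group distances
    R^2       = (SS_total - SS_within) / SS_total
    """
    n = len(groups)
    ss_total = sum(dm[i][j] ** 2
                   for i in range(n) for j in range(i + 1, n)) / float(n)
    ss_within = 0.0
    for label in set(groups):
        idx = [i for i, g in enumerate(groups) if g == label]
        ss_within += sum(dm[i][j] ** 2
                         for pos, i in enumerate(idx)
                         for j in idx[pos + 1:]) / float(len(idx))
    return (ss_total - ss_within) / ss_total

# Toy distance matrix (made up): two tight groups that are far apart.
toy_dm = [
    [0.0, 0.1, 0.9, 0.9],
    [0.1, 0.0, 0.9, 0.9],
    [0.9, 0.9, 0.0, 0.1],
    [0.9, 0.9, 0.1, 0.0],
]
toy_groups = ["A", "A", "B", "B"]
print(permanova_r2(toy_dm, toy_groups))
```

Samples with an 'Unknown' value have to be removed from both the distance matrix and the mapping file first, which is what the two filtering commands in the cell above do.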

## Compile Adonis results

To generate heatmaps, we now need to compile all of these Adonis results into a single table (one file per depth).

In [10]:
rarefaction_depths = [1000,5000,10000,15000,20000]

for depth in rarefaction_depths:
    print ("Analyzing depth %i" %depth)   
    for compartment in ['Coral Tissue','Coral Mucus','Coral Skeleton']:
        for category in categories:
            curr_outdir = join(output_dir,"compare_categories_r2","%i"%depth)
            curr_output = join(output_dir,"compare_categories_r2","adonis_results_%i.tsv" %depth)
            !python csv_from_adonis_results_deep_rarefaction.py $curr_outdir $curr_output
            !python make_heatmap.py $curr_output
                 

Analyzing depth 1000
  if self._edgecolors == str('face'):
  if self._edgecolors == str('face'):
  if self._edgecolors == str('face'):
Analyzing depth 5000
  if self._edgecolors == str('face'):
  if self._edgecolors == str('face'):
  if self._edgecolors == str('face'):
Analyzing depth 10000
  if self._edgecolors == str('face'):
  if self._edgecolors == str('face'):
  if self._edgecolors == str('face'):
Analyzing depth 15000
  if self._edgecolors == str('face'):
  if self._edgecolors == str('face'):
  if self._edgecolors == str('face'):
Analyzing depth 20000
  if self._edgecolors == str('face'):
  if self._edgecolors == str('face'):
  if self._edgecolors == str('face'):
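
The compile script itself is not shown here, so the following is only a sketch of one piece it plausibly needs: recovering the category and metric from result directory names like `compare_categories_depth_weighted_unifrac`. Since both parts can contain underscores, the name is split by matching a known metric at the end. `parse_result_dir` is a hypothetical helper, not a function from `csv_from_adonis_results_deep_rarefaction.py`.

```python
METRICS = ["weighted_unifrac", "bray_curtis", "unweighted_unifrac"]
PREFIX = "compare_categories_"

def parse_result_dir(dirname, metrics=METRICS):
    """Split 'compare_categories_<category>_<metric>' into (category, metric).

    Returns None if the name lacks the prefix or a known metric suffix.
    """
    if not dirname.startswith(PREFIX):
        return None
    rest = dirname[len(PREFIX):]
    # Try longer metric names first. The leading underscore in the suffix
    # already keeps 'unweighted_unifrac' from matching as 'weighted_unifrac';
    # the ordering is an extra safeguard.
    for metric in sorted(metrics, key=len, reverse=True):
        suffix = "_" + metric
        if rest.endswith(suffix):
            return rest[:-len(suffix)], metric
    return None

for name in ["compare_categories_depth_weighted_unifrac",
             "compare_categories_host_clade_sensu_fukami_unweighted_unifrac",
             "compare_categories_reef_name_bray_curtis"]:
    print(parse_result_dir(name))
```

Combined with the `<depth>/<compartment>/` levels of the output path, this gives the four keys (depth, compartment, category, metric) needed to lay out one row per Adonis result.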


#### Summarize taxa for ASR

Summarize taxonomy at 1000 seqs/sample and add to the mapping file

Resulting heatmaps: