Per body site variability within and between individuals

gregcaporaso edited this page Nov 17, 2012 · 5 revisions

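Filter the distance matrices to include only individuals who donated more than 5 samples. The commands below do this by keeping samples marked TimeCourseAnalysis:yes in the mapping file, once per body site for the unweighted and then the weighted UniFrac matrices.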

filter_distance_matrix.py -i forehead_even10000_unweighted_unifrac_dm.txt -o forehead_even10000_unweighted_unifrac_dm_tc_only.txt -m ../StudentMicrobiomeProject-map.tsv -s "TimeCourseAnalysis:yes"
filter_distance_matrix.py -i gut_even10000_unweighted_unifrac_dm.txt -o gut_even10000_unweighted_unifrac_dm_tc_only.txt -m ../StudentMicrobiomeProject-map.tsv -s "TimeCourseAnalysis:yes"
filter_distance_matrix.py -i palm_even10000_unweighted_unifrac_dm.txt -o palm_even10000_unweighted_unifrac_dm_tc_only.txt -m ../StudentMicrobiomeProject-map.tsv -s "TimeCourseAnalysis:yes"
filter_distance_matrix.py -i tongue_even10000_unweighted_unifrac_dm.txt -o tongue_even10000_unweighted_unifrac_dm_tc_only.txt -m ../StudentMicrobiomeProject-map.tsv -s "TimeCourseAnalysis:yes"
filter_distance_matrix.py -i forehead_even10000_weighted_unifrac_dm.txt -o forehead_even10000_weighted_unifrac_dm_tc_only.txt -m ../StudentMicrobiomeProject-map.tsv -s "TimeCourseAnalysis:yes"
filter_distance_matrix.py -i gut_even10000_weighted_unifrac_dm.txt -o gut_even10000_weighted_unifrac_dm_tc_only.txt -m ../StudentMicrobiomeProject-map.tsv -s "TimeCourseAnalysis:yes"
filter_distance_matrix.py -i palm_even10000_weighted_unifrac_dm.txt -o palm_even10000_weighted_unifrac_dm_tc_only.txt -m ../StudentMicrobiomeProject-map.tsv -s "TimeCourseAnalysis:yes"
filter_distance_matrix.py -i tongue_even10000_weighted_unifrac_dm.txt -o tongue_even10000_weighted_unifrac_dm_tc_only.txt -m ../StudentMicrobiomeProject-map.tsv -s "TimeCourseAnalysis:yes"
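
The eight filtering commands above differ only in body site and UniFrac variant, so they can be produced by a loop. The following is a minimal sketch, not part of the original analysis: the function name gen_filter_cmds is illustrative, and the script only prints the commands so they can be inspected before being run.

```shell
#!/bin/sh
# Print (do not run) one filter_distance_matrix.py command per metric and body site.
# File names and flags are copied from the commands above; the loop and the
# function name gen_filter_cmds are illustrative assumptions.
gen_filter_cmds() {
    map="../StudentMicrobiomeProject-map.tsv"
    for metric in unweighted weighted; do
        for site in forehead gut palm tongue; do
            dm="${site}_even10000_${metric}_unifrac_dm"
            echo "filter_distance_matrix.py -i ${dm}.txt -o ${dm}_tc_only.txt -m ${map} -s \"TimeCourseAnalysis:yes\""
        done
    done
}

gen_filter_cmds
```

Piping the output to a shell (gen_filter_cmds | sh) from the beta_diversity directory would run the commands in the same order as listed above.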

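Generate box plots for each individual on a per body site basis, using the filtered unweighted UniFrac matrices. The first four commands plot every individual and skip the significance tests. The last four write summary plots (output directories ending in _summ) that suppress the individual within and between boxes and leave the significance tests on.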

make_distance_boxplots.py -d /Users/caporaso/analysis/student-microbiome-project/beta_diversity/forehead_even10000_unweighted_unifrac_dm_tc_only.txt -o /Users/caporaso/analysis/student-microbiome-project/beta_diversity/forehead_even10000_unweighted_unifrac_box_plots -m /Users/caporaso/analysis/student-microbiome-project/StudentMicrobiomeProject-map.tsv -f PersonalID --suppress_significance_tests --sort --height 8
make_distance_boxplots.py -d /Users/caporaso/analysis/student-microbiome-project/beta_diversity/gut_even10000_unweighted_unifrac_dm_tc_only.txt -o /Users/caporaso/analysis/student-microbiome-project/beta_diversity/gut_even10000_unweighted_unifrac_box_plots -m /Users/caporaso/analysis/student-microbiome-project/StudentMicrobiomeProject-map.tsv -f PersonalID --suppress_significance_tests --sort --height 8
make_distance_boxplots.py -d /Users/caporaso/analysis/student-microbiome-project/beta_diversity/tongue_even10000_unweighted_unifrac_dm_tc_only.txt -o /Users/caporaso/analysis/student-microbiome-project/beta_diversity/tongue_even10000_unweighted_unifrac_box_plots -m /Users/caporaso/analysis/student-microbiome-project/StudentMicrobiomeProject-map.tsv -f PersonalID --suppress_significance_tests --sort --height 8
make_distance_boxplots.py -d /Users/caporaso/analysis/student-microbiome-project/beta_diversity/palm_even10000_unweighted_unifrac_dm_tc_only.txt -o /Users/caporaso/analysis/student-microbiome-project/beta_diversity/palm_even10000_unweighted_unifrac_box_plots -m /Users/caporaso/analysis/student-microbiome-project/StudentMicrobiomeProject-map.tsv -f PersonalID --suppress_significance_tests --sort --height 8
make_distance_boxplots.py -d /Users/caporaso/analysis/student-microbiome-project/beta_diversity/forehead_even10000_unweighted_unifrac_dm_tc_only.txt -o /Users/caporaso/analysis/student-microbiome-project/beta_diversity/forehead_even10000_unweighted_unifrac_box_plots_summ -m /Users/caporaso/analysis/student-microbiome-project/StudentMicrobiomeProject-map.tsv -f PersonalID --sort --height 8 --suppress_individual_within --suppress_individual_between
make_distance_boxplots.py -d /Users/caporaso/analysis/student-microbiome-project/beta_diversity/gut_even10000_unweighted_unifrac_dm_tc_only.txt -o /Users/caporaso/analysis/student-microbiome-project/beta_diversity/gut_even10000_unweighted_unifrac_box_plots_summ -m /Users/caporaso/analysis/student-microbiome-project/StudentMicrobiomeProject-map.tsv -f PersonalID --sort --height 8 --suppress_individual_within --suppress_individual_between
make_distance_boxplots.py -d /Users/caporaso/analysis/student-microbiome-project/beta_diversity/tongue_even10000_unweighted_unifrac_dm_tc_only.txt -o /Users/caporaso/analysis/student-microbiome-project/beta_diversity/tongue_even10000_unweighted_unifrac_box_plots_summ -m /Users/caporaso/analysis/student-microbiome-project/StudentMicrobiomeProject-map.tsv -f PersonalID --sort --height 8 --suppress_individual_within --suppress_individual_between
make_distance_boxplots.py -d /Users/caporaso/analysis/student-microbiome-project/beta_diversity/palm_even10000_unweighted_unifrac_dm_tc_only.txt -o /Users/caporaso/analysis/student-microbiome-project/beta_diversity/palm_even10000_unweighted_unifrac_box_plots_summ -m /Users/caporaso/analysis/student-microbiome-project/StudentMicrobiomeProject-map.tsv -f PersonalID --sort --height 8 --suppress_individual_within --suppress_individual_between
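
The box plot commands above hard-code one analysis directory. As a sketch for anyone rerunning them elsewhere, the loop below prints the same eight commands from a single BASE variable. BASE and gen_boxplot_cmds are illustrative names, not part of the original analysis; the flags are copied unchanged from the commands above.

```shell
#!/bin/sh
# Print (do not run) the eight make_distance_boxplots.py commands listed above.
# BASE is an assumption: set it to your own copy of the analysis directory.
BASE="/Users/caporaso/analysis/student-microbiome-project"

gen_boxplot_cmds() {
    map="${BASE}/StudentMicrobiomeProject-map.tsv"
    # Per-individual plots, significance tests suppressed.
    for site in forehead gut tongue palm; do
        prefix="${BASE}/beta_diversity/${site}_even10000_unweighted_unifrac"
        echo "make_distance_boxplots.py -d ${prefix}_dm_tc_only.txt -o ${prefix}_box_plots -m ${map} -f PersonalID --suppress_significance_tests --sort --height 8"
    done
    # Summary plots: individual within/between boxes suppressed, tests left on.
    for site in forehead gut tongue palm; do
        prefix="${BASE}/beta_diversity/${site}_even10000_unweighted_unifrac"
        echo "make_distance_boxplots.py -d ${prefix}_dm_tc_only.txt -o ${prefix}_box_plots_summ -m ${map} -f PersonalID --sort --height 8 --suppress_individual_within --suppress_individual_between"
    done
}

gen_boxplot_cmds
```

Printing rather than executing keeps the sketch safe to try on a machine without QIIME installed; pipe the output to sh to run the commands.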

I added some additional plots showing the variability on a per-individual basis for each body site. You can find these here. Within each directory you'll find a PDF of the box plots. These are pretty unruly because there is so much data, but you can zoom in and scroll through them. We'll need a better way to view these.

Dan is currently looking at these plots in the context of Gilbert's per-individual distance decay plots and the disturbance data to see (1) whether individuals reporting some disturbance types have higher variability, and (2) whether the timing of the reported disturbances correlates with a 'jump' in the distance decay plot (we'll of course need to figure out how to quantify a 'jump'). Jai is working on updating the boxplotting code so we can color boxes by metadata, for example to color boxes differently for individuals who reported a disturbance.

For all body sites, the within-individual distances are significantly lower than the between-individual distances (two-tailed, two-sample parametric t-test):

| Body Site | Group 1 | Group 2 | t statistic | p-value |
| --- | --- | --- | --- | --- |
| Forehead | All within PersonalID | All between PersonalID | -73.7896603811 | 0.0 |
| Gut | All within PersonalID | All between PersonalID | -142.184026639 | 0.0 |
| Palm | All within PersonalID | All between PersonalID | -62.2498126892 | 0.0 |
| Tongue | All within PersonalID | All between PersonalID | -57.2688909063 | 0.0 |
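
To read the sign of the t statistics in the table: with Group 1 (within) in the first position, a negative value means the mean within-individual distance is below the mean between-individual distance. As a sketch, assuming the standard pooled-variance form of the two-sample t statistic (this page does not say which variant make_distance_boxplots.py computes), with sample means, variances and sizes written for Group 1 and Group 2:

```latex
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},
\qquad
s_p^2 = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2}
```

Here $\bar{x}_1$ and $\bar{x}_2$ are the mean within and between distances, $s_1^2$ and $s_2^2$ their sample variances, and $n_1$ and $n_2$ the number of pairwise distances in each group.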