## 16S meta-analysis of sociodemographic associations with childhood gut microbiomes

This is a meta-analysis of publicly available 16S rRNA gene sequencing datasets from the gut microbiomes of pre-adolescent children. The goal is to test whether sociodemographic factors are associated with gut microbiome composition.

This analysis was run in QIIME2-2021.4.

### Importing data

Each study was imported individually because sequencing protocols and primer sets differed among studies. Each import cell below handles one study.
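
Several of the imports in this section read a manifest file (`Manifest.tsv`). As a rough sketch of what such a file contains, the function here writes a paired-end manifest in the `PairedEndFastqManifestPhred33V2` layout: a tab-separated file with `sample-id`, `forward-absolute-filepath`, and `reverse-absolute-filepath` columns. The function name `make_paired_manifest` and the `<sample>_1.fastq.gz` / `<sample>_2.fastq.gz` file naming are assumptions for illustration; the manifests used in this analysis may have been built differently.

```shell
# Hypothetical helper: write a paired-end manifest for every
# <sample>_1.fastq.gz / <sample>_2.fastq.gz pair in a directory.
make_paired_manifest() {
    local fastq_dir="$1" manifest="$2"
    local abs_dir fwd rev sample
    # The manifest format requires absolute file paths
    abs_dir="$(cd "${fastq_dir}" && pwd)"
    printf 'sample-id\tforward-absolute-filepath\treverse-absolute-filepath\n' > "${manifest}"
    for fwd in "${abs_dir}"/*_1.fastq.gz; do
        [ -e "${fwd}" ] || continue          # no matching files
        rev="${fwd%_1.fastq.gz}_2.fastq.gz"
        [ -e "${rev}" ] || continue          # skip samples without a reverse read file
        sample="$(basename "${fwd}" _1.fastq.gz)"
        printf '%s\t%s\t%s\n' "${sample}" "${fwd}" "${rev}" >> "${manifest}"
    done
}

# Example usage (paths are placeholders):
# make_paired_manifest /path/to/fastq /path/to/fastq/Manifest.tsv
```

A single-end manifest (`SingleEndFastqManifestPhred33V2`) uses the same idea with only `sample-id` and `absolute-filepath` columns.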

In [None]:
#Importing sequences from Herman et al., 2019
project=/Users/elizabethmallott/Dropbox/Projects/VMI/children/herman

qiime tools import --type EMPPairedEndSequences --input-path ${project} --output-path ${project}/herman_raw_seqs.qza

qiime demux emp-paired --m-barcodes-file ${project}/herman_metadata.txt --m-barcodes-column BarcodeSequence \
    --p-rev-comp-mapping-barcodes --i-seqs ${project}/herman_raw_seqs.qza \
    --o-per-sample-sequences ${project}/herman-paired-end-demux.qza \
    --o-error-correction-details ${project}/herman-demux-details.qza
    
qiime demux summarize --i-data ${project}/herman-paired-end-demux.qza \
    --o-visualization ${project}/herman-paired-end-demux.qzv

In [None]:
#Importing sequences from Chu et al., 2016
project=/Users/elizabethmallott/Dropbox/Projects/VMI/children/chu

cd ${project}

qiime tools import --type 'SampleData[SequencesWithQuality]' --input-path ${project}/Manifest.txt \
    --output-path chu-single-end-demux.qza --input-format SingleEndFastqManifestPhred33V2
    
cd

qiime demux summarize --i-data ${project}/chu-single-end-demux.qza \
    --o-visualization ${project}/chu-single-end-demux.qzv

In [None]:
#Importing sequences from Planer et al., 2016
project=/Users/elizabethmallott/Dropbox/Projects/VMI/children/planer

cd ${project}

qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' --input-path ${project}/Manifest.tsv \
    --output-path planer-paired-end-demux.qza --input-format PairedEndFastqManifestPhred33V2
    
cd

qiime demux summarize --i-data ${project}/planer-paired-end-demux.qza \
    --o-visualization ${project}/planer-paired-end-demux.qzv

In [None]:
#Importing sequences from Kim type 1 diabetes study (QIITA 11129)
project=/Users/elizabethmallott/Dropbox/Projects/VMI/children/kim_t1d

cd ${project}

qiime tools import --type 'EMPPairedEndSequences' \
    --input-path ${project}/study_raw_data_11129_070221-125259/FASTQ/31359 \
    --output-path ${project}/kim-t1d-paired-end.qza
    
cd

qiime tools peek ${project}/kim-t1d-paired-end.qza

qiime demux emp-paired --i-seqs ${project}/kim-t1d-paired-end.qza \
    --m-barcodes-file ${project}/study_raw_data_11129_070221-125259/mapping_files/31359_mapping_file.txt \
    --m-barcodes-column barcode \
    --p-rev-comp-barcodes --p-rev-comp-mapping-barcodes \
    --o-per-sample-sequences ${project}/kim-t1d-paired-end-demux.qza \
    --o-error-correction-details ${project}/demux-details.qza

qiime demux summarize --i-data ${project}/kim-t1d-paired-end-demux.qza \
    --o-visualization ${project}/kim-t1d-paired-end-demux.qzv


In [None]:
#Importing sequences from MGD time series study (QIITA 10894)
project=/Users/elizabethmallott/Dropbox/Projects/VMI/children/MGD_infant_timeseries

cd ${project}

qiime tools import --type 'SampleData[SequencesWithQuality]' --input-path ${project}/Manifest.tsv \
    --output-path mgd-single-end-demux.qza --input-format SingleEndFastqManifestPhred33V2
    
cd

qiime demux summarize --i-data ${project}/mgd-single-end-demux.qza \
    --o-visualization ${project}/mgd-single-end-demux.qzv

In [None]:
#Importing sequences from Levin et al., 2016
#Using forward reads only, because some sequences have length 0 in multiple files
project=/Users/elizabethmallott/Dropbox/Projects/VMI/children/levin

cd ${project}

qiime tools import --type 'SampleData[SequencesWithQuality]' --input-path ${project}/Manifest.tsv \
    --output-path levin-single-end-demux.qza --input-format SingleEndFastqManifestPhred33V2
    
cd

qiime demux summarize --i-data ${project}/levin-single-end-demux.qza \
    --o-visualization ${project}/levin-single-end-demux.qzv

In [None]:
#Importing sequences from Robinson et al., 2017
project=/Users/elizabethmallott/Dropbox/Projects/VMI/children/robinson

cd ${project}

qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' --input-path ${project}/Manifest.tsv \
    --output-path robinson-paired-end-demux.qza --input-format PairedEndFastqManifestPhred33V2
    
cd

qiime demux summarize --i-data ${project}/robinson-paired-end-demux.qza \
    --o-visualization ${project}/robinson-paired-end-demux.qzv

In [None]:
#Importing sequences from Cioffi et al., 2020
project=/Users/elizabethmallott/Dropbox/Projects/VMI/children/cioffi

cd ${project}

qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' --input-path ${project}/Manifest.tsv \
    --output-path cioffi-paired-end-demux.qza --input-format PairedEndFastqManifestPhred33V2
    
cd

qiime demux summarize --i-data ${project}/cioffi-paired-end-demux.qza \
    --o-visualization ${project}/cioffi-paired-end-demux.qzv

### Running DADA2

Each study was individually run through DADA2, per the QIIME developers' suggestions. Trimming and truncation parameters were set individually for each study due to differences in primer sets and sequencing quality.
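
For the paired-end studies, the truncation lengths must leave enough overlap for DADA2 to merge forward and reverse reads (the DADA2 R package requires at least 12 overlapping bases by default). A minimal sketch of that check follows. The function name `expected_overlap` and the 253 bp amplicon length (typical of the V4 region) are assumptions for illustration, not values taken from these studies.

```shell
# Hypothetical helper: expected overlap (in bases) between truncated forward
# and reverse reads, assuming the reads start at opposite ends of the amplicon
# and the amplicon length includes any primers still present in the reads.
expected_overlap() {
    local trunc_f="$1" trunc_r="$2" amplicon_len="$3"
    echo $(( trunc_f + trunc_r - amplicon_len ))
}

# Assumed example: 150 bp truncation on both reads, 253 bp amplicon
overlap="$(expected_overlap 150 150 253)"
if [ "${overlap}" -lt 12 ]; then
    echo "Overlap of ${overlap} bases is likely too short for merging"
else
    echo "Overlap of ${overlap} bases should allow merging"
fi
```

Under these assumptions the overlap is 150 + 150 - 253 = 47 bases. Trimming from the 5' end (`--p-trim-left-f`, `--p-trim-left-r`) does not change this figure, because the overlap sits at the 3' ends of the reads.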

In [None]:
#DADA2 for Herman et al., 2019 sequences
project=/Users/elizabethmallott/Dropbox/Projects/VMI/children/herman

qiime dada2 denoise-paired --i-demultiplexed-seqs ${project}/herman-paired-end-demux.qza \
    --p-trunc-len-f 150 --p-trunc-len-r 150 --p-trim-left-f 19 --p-trim-left-r 20 --p-n-threads 2 \
    --o-table ${project}/herman-table.qza \
    --o-representative-sequences ${project}/herman-rep-seqs.qza \
    --o-denoising-stats ${project}/herman-dada2-stats.qza

qiime metadata tabulate --m-input-file ${project}/herman-dada2-stats.qza \
    --o-visualization ${project}/herman-dada2-stats.qzv

qiime feature-table summarize --i-table ${project}/herman-table.qza \
    --o-visualization ${project}/herman-table.qzv \
    --m-sample-metadata-file ${project}/herman_metadata.txt

In [None]:
#DADA2 for Chu et al., 2016 sequences
project=/Users/elizabethmallott/Dropbox/Projects/VMI/children/chu

qiime dada2 denoise-pyro --i-demultiplexed-seqs ${project}/chu-single-end-demux.qza \
    --p-trunc-len 450 --p-trim-left 17 --p-n-threads 2 \
    --o-table ${project}/chu-table.qza \
    --o-representative-sequences ${project}/chu-rep-seqs.qza \
    --o-denoising-stats ${project}/chu-dada2-stats.qza

qiime metadata tabulate --m-input-file ${project}/chu-dada2-stats.qza \
    --o-visualization ${project}/chu-dada2-stats.qzv

qiime feature-table summarize --i-table ${project}/chu-table.qza \
    --o-visualization ${project}/chu-table.qzv \
    --m-sample-metadata-file ${project}/chu_metadata.txt

In [None]:
#DADA2 for Planer et al., 2016 sequences
project=/Users/elizabethmallott/Dropbox/Projects/VMI/children/planer

qiime dada2 denoise-paired --i-demultiplexed-seqs ${project}/planer-paired-end-demux.qza \
    --p-trunc-len-f 220 --p-trunc-len-r 220 --p-trim-left-f 19 --p-trim-left-r 20 --p-n-threads 2 \
    --o-table ${project}/planer-table.qza \
    --o-representative-sequences ${project}/planer-rep-seqs.qza \
    --o-denoising-stats ${project}/planer-dada2-stats.qza

qiime metadata tabulate --m-input-file ${project}/planer-dada2-stats.qza \
    --o-visualization ${project}/planer-dada2-stats.qzv

qiime feature-table summarize --i-table ${project}/planer-table.qza \
    --o-visualization ${project}/planer-table.qzv \
    --m-sample-metadata-file ${project}/planer_metadata.txt

In [None]:
#DADA2 for Kim type 1 diabetes study (QIITA 11129)
project=/Users/elizabethmallott/Dropbox/Projects/VMI/children/kim_t1d

qiime dada2 denoise-paired --i-demultiplexed-seqs ${project}/kim-t1d-paired-end-demux.qza \
    --p-trunc-len-f 150 --p-trunc-len-r 150 --p-trim-left-f 0 --p-trim-left-r 0 --p-n-threads 2 \
    --o-table ${project}/kim-t1d-table.qza \
    --o-representative-sequences ${project}/kim-t1d-rep-seqs.qza \
    --o-denoising-stats ${project}/kim-t1d-dada2-stats.qza

qiime metadata tabulate --m-input-file ${project}/kim-t1d-dada2-stats.qza \
    --o-visualization ${project}/kim-t1d-dada2-stats.qzv

qiime feature-table summarize --i-table ${project}/kim-t1d-table.qza \
    --o-visualization ${project}/kim-t1d-table.qzv \
    --m-sample-metadata-file ${project}/kim_metadata.txt

In [None]:
#DADA2 for MGD time series study (QIITA 10894)
project=/Users/elizabethmallott/Dropbox/Projects/VMI/children/MGD_infant_timeseries

qiime dada2 denoise-single --i-demultiplexed-seqs ${project}/mgd-single-end-demux.qza \
    --p-trunc-len 125 --p-trim-left 19 --p-n-threads 2 \
    --o-table ${project}/mgd-table.qza \
    --o-representative-sequences ${project}/mgd-rep-seqs.qza \
    --o-denoising-stats ${project}/mgd-dada2-stats.qza

qiime metadata tabulate --m-input-file ${project}/mgd-dada2-stats.qza \
    --o-visualization ${project}/mgd-dada2-stats.qzv

qiime feature-table summarize --i-table ${project}/mgd-table.qza \
    --o-visualization ${project}/mgd-table.qzv \
    --m-sample-metadata-file ${project}/mgd_timeseries_metadata.txt

In [None]:
#DADA2 for Levin et al., 2016 sequences
project=/Users/elizabethmallott/Dropbox/Projects/VMI/children/levin

qiime dada2 denoise-single --i-demultiplexed-seqs ${project}/levin-single-end-demux.qza \
    --p-trunc-len 250 --p-trim-left 19 --p-n-threads 2 \
    --o-table ${project}/levin-table.qza \
    --o-representative-sequences ${project}/levin-rep-seqs.qza \
    --o-denoising-stats ${project}/levin-dada2-stats.qza

qiime metadata tabulate --m-input-file ${project}/levin-dada2-stats.qza \
    --o-visualization ${project}/levin-dada2-stats.qzv

qiime feature-table summarize --i-table ${project}/levin-table.qza \
    --o-visualization ${project}/levin-table.qzv \
    --m-sample-metadata-file ${project}/levin_metadata.txt

In [None]:
#DADA2 for Robinson et al., 2017 sequences
project=/Users/elizabethmallott/Dropbox/Projects/VMI/children/robinson

qiime dada2 denoise-paired --i-demultiplexed-seqs ${project}/robinson-paired-end-demux.qza \
    --p-trunc-len-f 248 --p-trunc-len-r 200 --p-trim-left-f 19 --p-trim-left-r 20 --p-n-threads 2 \
    --o-table ${project}/robinson-table.qza \
    --o-representative-sequences ${project}/robinson-rep-seqs.qza \
    --o-denoising-stats ${project}/robinson-dada2-stats.qza

qiime metadata tabulate --m-input-file ${project}/robinson-dada2-stats.qza \
    --o-visualization ${project}/robinson-dada2-stats.qzv

qiime feature-table summarize --i-table ${project}/robinson-table.qza \
    --o-visualization ${project}/robinson-table.qzv \
    --m-sample-metadata-file ${project}/robinson_metadata.txt

In [None]:
#DADA2 for Cioffi et al., 2020 sequences
project=/Users/elizabethmallott/Dropbox/Projects/VMI/children/cioffi

qiime dada2 denoise-paired --i-demultiplexed-seqs ${project}/cioffi-paired-end-demux.qza \
    --p-trunc-len-f 150 --p-trunc-len-r 150 --p-trim-left-f 19 --p-trim-left-r 20 --p-n-threads 2 \
    --o-table ${project}/cioffi-table.qza \
    --o-representative-sequences ${project}/cioffi-rep-seqs.qza \
    --o-denoising-stats ${project}/cioffi-dada2-stats.qza

qiime metadata tabulate --m-input-file ${project}/cioffi-dada2-stats.qza \
    --o-visualization ${project}/cioffi-dada2-stats.qzv

qiime feature-table summarize --i-table ${project}/cioffi-table.qza \
    --o-visualization ${project}/cioffi-table.qzv \
    --m-sample-metadata-file ${project}/cioffi_metadata.txt

### Assigning taxonomy

Taxonomy was assigned using a naive Bayes classifier trained on the Greengenes 13_8 99% OTU full-length 16S sequence database. Taxonomy was assigned for each study individually, and then for the merged dataset (see below).

Pre-trained classifiers were obtained from the QIIME2 website:
Bokulich, N.A., Robeson, M., Dillon, M.R. bokulich-lab/RESCRIPt. Zenodo. http://doi.org/10.5281/zenodo.3891931
Bokulich, N.A., Kaehler, B.D., Rideout, J.R. et al. Optimizing taxonomic classification of marker-gene amplicon sequences with QIIME 2’s q2-feature-classifier plugin. Microbiome 6, 90 (2018). https://doi.org/10.1186/s40168-018-0470-z

In [None]:
#Taxonomy for Herman et al., 2019 sequences
project=/Users/elizabethmallott/Dropbox/Projects/VMI/children/herman

qiime feature-classifier classify-sklearn --i-classifier gg-13-8-99-nb-classifier.qza \
    --i-reads ${project}/herman-rep-seqs.qza --o-classification ${project}/herman-taxonomy.qza

In [None]:
#Taxonomy for Chu et al., 2016 sequences
project=/Users/elizabethmallott/Dropbox/Projects/VMI/children/chu

qiime feature-classifier classify-sklearn --i-classifier gg-13-8-99-nb-classifier.qza \
    --i-reads ${project}/chu-rep-seqs.qza --o-classification ${project}/chu-taxonomy.qza

In [None]:
#Taxonomy for Planer et al., 2016 sequences
project=/Users/elizabethmallott/Dropbox/Projects/VMI/children/planer

qiime feature-classifier classify-sklearn --i-classifier gg-13-8-99-nb-classifier.qza \
    --i-reads ${project}/planer-rep-seqs.qza --o-classification ${project}/planer-taxonomy.qza

In [None]:
#Taxonomy for Kim type 1 diabetes study (QIITA 11129)
project=/Users/elizabethmallott/Dropbox/Projects/VMI/children/kim_t1d

qiime feature-classifier classify-sklearn --i-classifier gg-13-8-99-nb-classifier.qza \
    --i-reads ${project}/kim-t1d-rep-seqs.qza --o-classification ${project}/kim-t1d-taxonomy.qza

In [None]:
#Taxonomy for MGD time series study (QIITA 10894)
project=/Users/elizabethmallott/Dropbox/Projects/VMI/children/MGD_infant_timeseries

qiime feature-classifier classify-sklearn --i-classifier gg-13-8-99-nb-classifier.qza \
    --i-reads ${project}/mgd-rep-seqs.qza --o-classification ${project}/mgd-taxonomy.qza

In [None]:
#Taxonomy for Levin et al., 2016
project=/Users/elizabethmallott/Dropbox/Projects/VMI/children/levin

qiime feature-classifier classify-sklearn --i-classifier gg-13-8-99-nb-classifier.qza \
    --i-reads ${project}/levin-rep-seqs.qza --o-classification ${project}/levin-taxonomy.qza

In [None]:
#Taxonomy for Robinson et al., 2017
project=/Users/elizabethmallott/Dropbox/Projects/VMI/children/robinson

qiime feature-classifier classify-sklearn --i-classifier gg-13-8-99-nb-classifier.qza \
    --i-reads ${project}/robinson-rep-seqs.qza --o-classification ${project}/robinson-taxonomy.qza

In [None]:
#Taxonomy for Cioffi et al., 2020
project=/Users/elizabethmallott/Dropbox/Projects/VMI/children/cioffi

qiime feature-classifier classify-sklearn --i-classifier gg-13-8-99-nb-classifier.qza \
    --i-reads ${project}/cioffi-rep-seqs.qza --o-classification ${project}/cioffi-taxonomy.qza

### Merging datasets

The feature tables and representative sequences output by DADA2 for all studies were merged. The q2-fragment-insertion QIIME2 plugin was used to build a phylogenetic tree from the merged representative sequences. The merged table was then filtered to remove sequences not present in the insertion tree.
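
Because `qiime feature-table merge` can be given one `--i-tables` flag per table, the argument list can also be generated from a list of studies rather than typed by hand. This is a sketch only: the function name `table_args` is hypothetical, and the `directory:prefix` pairs simply mirror the file names used in this notebook.

```shell
# Hypothetical helper: print one "--i-tables <path>" pair per study.
# Each argument after the base directory is "directory:prefix",
# e.g. "kim_t1d:kim-t1d" for <base>/kim_t1d/kim-t1d-table.qza.
table_args() {
    local base="$1"; shift
    local pair dir prefix
    for pair in "$@"; do
        dir="${pair%%:*}"       # text before the first colon
        prefix="${pair#*:}"     # text after the first colon
        printf -- '--i-tables %s/%s/%s-table.qza ' "${base}" "${dir}" "${prefix}"
    done
}

# Example usage (relies on word splitting, so paths must not contain spaces):
# qiime feature-table merge \
#     $(table_args "${project}" herman:herman chu:chu kim_t1d:kim-t1d) \
#     --o-merged-table "${project}"/merged-table-new.qza
```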

In [None]:
project=/Users/elizabethmallott/Dropbox/Projects/VMI/children

qiime feature-table merge --i-tables ${project}/herman/herman-table.qza --i-tables ${project}/chu/chu-table.qza \
     --i-tables ${project}/planer/planer-table.qza --i-tables ${project}/kim_t1d/kim-t1d-table.qza \
     --i-tables ${project}/MGD_infant_timeseries/mgd-table.qza --i-tables ${project}/levin/levin-table.qza \
     --i-tables ${project}/robinson/robinson-table.qza --i-tables ${project}/cioffi/cioffi-table.qza \
     --o-merged-table ${project}/merged-table-new.qza

qiime feature-table merge-seqs --i-data ${project}/herman/herman-rep-seqs.qza \
    --i-data ${project}/chu/chu-rep-seqs.qza --i-data ${project}/planer/planer-rep-seqs.qza \
    --i-data ${project}/kim_t1d/kim-t1d-rep-seqs.qza --i-data ${project}/MGD_infant_timeseries/mgd-rep-seqs.qza \
    --i-data ${project}/levin/levin-rep-seqs.qza --i-data ${project}/robinson/robinson-rep-seqs.qza \
    --i-data ${project}/cioffi/cioffi-rep-seqs.qza --o-merged-data ${project}/merged-rep-seqs-new.qza 

wget -O "sepp-refs-gg-13-8.qza" \
  "https://data.qiime2.org/2021.4/common/sepp-refs-gg-13-8.qza"
  
qiime fragment-insertion sepp --i-representative-sequences ${project}/merged-rep-seqs-new.qza \
    --i-reference-database /Users/elizabethmallott/sepp-refs-gg-13-8.qza --o-tree ${project}/insertion-tree-new.qza \
    --o-placements ${project}/insertion-placements-new.qza --p-threads 2

qiime fragment-insertion filter-features --i-table ${project}/merged-table-new.qza \
    --i-tree ${project}/insertion-tree-new.qza --o-filtered-table ${project}/insertion-filtered-merged-table-new.qza \
    --o-removed-table ${project}/removed-insertion-merged-table-new.qza

### Taxonomic assignment of merged datasets

We assigned taxonomy to the merged dataset in two ways. First, we used a naive Bayes classifier trained on the Greengenes 13_8 99% OTU full-length 16S sequence database, as above. Second, we used the experimental OTU classification method, which is based on the fragment insertion results. Mitochondrial and chloroplast sequences were removed based on the results of both methods.

In [None]:
project=/Users/elizabethmallott/Dropbox/Projects/VMI/children

qiime feature-classifier classify-sklearn --i-classifier gg-13-8-99-nb-classifier.qza \
    --i-reads ${project}/merged-rep-seqs-new.qza --o-classification ${project}/merged-taxonomy-new.qza
    
qiime fragment-insertion classify-otus-experimental --i-representative-sequences ${project}/merged-rep-seqs.qza \
    --i-tree ${project}/insertion-tree.qza --i-reference-taxonomy ${project}/taxonomy_gg99.qza \
    --o-classification ${project}/merged-otu-taxonomy.qza

### Filtering table

The merged table was filtered to remove mitochondrial and chloroplast sequences, and to remove control and non-fecal samples present in some datasets. The table was then filtered to remove samples lacking specific metadata categories prior to downstream analysis. Study-specific and age-group-specific feature tables were also created.
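
Before filtering on a metadata column with `--p-where`, it can help to check how many samples fall into each category. The sketch here counts the values in a named column of a tab-separated metadata file. The function name `count_by_column` is hypothetical; `Host_age_cat` is one of the metadata columns used by the filters in this section.

```shell
# Hypothetical helper: count samples per value of a metadata column.
count_by_column() {
    local metadata="$1" column="$2"
    awk -F'\t' -v col="${column}" '
        NR == 1 {                       # header row: locate the column
            for (i = 1; i <= NF; i++) if ($i == col) idx = i
            if (!idx) { print "column not found: " col > "/dev/stderr"; exit 1 }
            next
        }
        /^#q2:types/ { next }           # skip the optional QIIME 2 types row
        { counts[$idx]++ }
        END { for (v in counts) printf "%s\t%d\n", v, counts[v] }
    ' "${metadata}" | sort
}

# Example usage:
# count_by_column "${project}"/metadata-combined-re.txt Host_age_cat
```

Categories with very few samples are worth noting before creating a separate feature table for them.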

In [None]:
project=/Users/elizabethmallott/Dropbox/Projects/VMI/children

qiime taxa filter-table --i-table ${project}/insertion-filtered-merged-table-new.qza \
    --i-taxonomy ${project}/merged-taxonomy-new.qza --p-exclude mitochondria,chloroplast \
    --o-filtered-table ${project}/merged-table-nomito-nochloro-new.qza
    
qiime feature-table filter-samples --i-table ${project}/merged-table-nomito-nochloro-new.qza \
    --m-metadata-file ${project}/metadata-combined.txt --o-filtered-table ${project}/merged-table-filtered-new.qza
    
qiime feature-table filter-samples --i-table ${project}/merged-table-nomito-nochloro-otu.qza \
    --m-metadata-file ${project}/metadata-combined.txt --o-filtered-table ${project}/merged-table-filtered-otu.qza
    
qiime feature-table summarize --i-table ${project}/merged-table-filtered-new.qza \
    --o-visualization ${project}/merged-table-filtered-new.qzv \
    --m-sample-metadata-file ${project}/metadata-combined.txt

qiime feature-table summarize --i-table ${project}/merged-table-filtered-otu.qza \
    --o-visualization ${project}/merged-table-filtered-otu.qzv \
    --m-sample-metadata-file ${project}/metadata-combined.txt

qiime feature-table filter-samples --i-table ${project}/merged-table-nomito-nochloro-new.qza \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --o-filtered-table ${project}/merged-table-filtered-re-new.qza

qiime feature-table filter-samples --i-table ${project}/merged-table-nomito-nochloro-new.qza \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --p-where "NOT [Race]='Other' AND NOT [Ethnicity]='Other'" \
    --o-filtered-table ${project}/merged-table-filtered-re-noother-new.qza
    
qiime feature-table filter-samples --i-table ${project}/merged-table-nomito-nochloro-new.qza \
    --m-metadata-file ${project}/metadata-combined-re-onesubj.txt \
    --p-where "NOT [Race]='Other' AND NOT [Ethnicity]='Other'" \
    --o-filtered-table ${project}/merged-table-filtered-re-onesubj-noother-new.qza
    
qiime feature-table filter-samples --i-table ${project}/merged-table-filtered-re-noother-new.qza \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --p-where "NOT [Race]='Asian/Pacific Islander'" \
    --o-filtered-table ${project}/merged-table-filtered-re-noother-noasian-new.qza
    
qiime feature-table filter-samples --i-table ${project}/merged-table-nomito-nochloro-new.qza \
    --m-metadata-file ${project}/metadata-combined-re-age-sex-delivery-feeding-noother.txt \
    --o-filtered-table ${project}/merged-table-filtered-re-age-sex-delivery-feeding-noother.qza

qiime feature-table filter-samples --i-table ${project}/merged-table-filtered-re-noother-noasian-new.qza \
    --m-metadata-file ${project}/metadata-combined-re-age-sex-delivery-feeding-noother.txt \
    --o-filtered-table ${project}/merged-table-filtered-re-age-sex-delivery-feeding-noother-noasian.qza

qiime feature-table filter-samples --i-table ${project}/merged-table-nomito-nochloro-new.qza \
    --m-metadata-file ${project}/metadata-combined-re-age-sex-delivery-feeding-onesubj-noother.txt \
    --o-filtered-table ${project}/merged-table-filtered-re-age-sex-delivery-feeding-onesubj-noother.qza

qiime feature-table filter-samples --i-table ${project}/merged-table-filtered-re-noother-new.qza \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --p-where "[Host_age_cat]='0-1 week'" \
    --o-filtered-table ${project}/merged-table-filtered-re-01week-new.qza

qiime feature-table filter-samples --i-table ${project}/merged-table-filtered-re-noother-new.qza \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --p-where "[Host_age_cat]='1-6 weeks'" \
    --o-filtered-table ${project}/merged-table-filtered-re-16week-new.qza

qiime feature-table filter-samples --i-table ${project}/merged-table-filtered-re-noother-new.qza \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --p-where "[Host_age_cat] IN ('0-1 week', '1-6 weeks')" \
    --o-filtered-table ${project}/merged-table-filtered-re-06week-new.qza

qiime feature-table filter-samples --i-table ${project}/merged-table-filtered-re-noother-new.qza \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --p-where "[Host_age_cat]='6 weeks-3 months'" \
    --o-filtered-table ${project}/merged-table-filtered-re-6week3month-new.qza

qiime feature-table filter-samples --i-table ${project}/merged-table-filtered-re-noother-new.qza \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --p-where "[Host_age_cat]='3-12 months'" \
    --o-filtered-table ${project}/merged-table-filtered-re-312month-new.qza

qiime feature-table filter-samples --i-table ${project}/merged-table-filtered-re-noother-new.qza \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --p-where "[Host_age_cat] IN ('0-1 week', '1-6 weeks', '6 weeks-3 months', '3-12 months')" \
    --o-filtered-table ${project}/merged-table-filtered-re-012month-new.qza

qiime feature-table filter-samples --i-table ${project}/merged-table-filtered-re-noother-new.qza \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --p-where "[Host_age_cat]='1-3 years'" \
    --o-filtered-table ${project}/merged-table-filtered-re-13year-new.qza

qiime feature-table filter-samples --i-table ${project}/merged-table-filtered-re-noother-new.qza \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --p-where "[Host_age_cat]='3-6 years'" \
    --o-filtered-table ${project}/merged-table-filtered-re-36year-new.qza

qiime feature-table filter-samples --i-table ${project}/merged-table-filtered-re-noother-new.qza \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --p-where "[Host_age_cat]='6-12 years'" \
    --o-filtered-table ${project}/merged-table-filtered-re-612year-new.qza

qiime feature-table filter-samples --i-table ${project}/merged-table-filtered-re-noother-new.qza \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --p-where "[Host_age_cat] IN ('3-6 years', '6-12 years')" \
    --o-filtered-table ${project}/merged-table-filtered-re-312year-new.qza
   
qiime feature-table filter-samples --i-table ${project}/merged-table-filtered-re-noother-noasian-new.qza \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --p-where "[Host_age_cat]='0-1 week'" \
    --o-filtered-table ${project}/merged-table-filtered-re-noother-noasian-01week-new.qza

qiime feature-table filter-samples --i-table ${project}/merged-table-filtered-re-noother-noasian-new.qza \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --p-where "[Host_age_cat]='1-6 weeks'" \
    --o-filtered-table ${project}/merged-table-filtered-re-noother-noasian-16week-new.qza

qiime feature-table filter-samples --i-table ${project}/merged-table-filtered-re-noother-noasian-new.qza \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --p-where "[Host_age_cat]='6 weeks-3 months'" \
    --o-filtered-table ${project}/merged-table-filtered-re-noother-noasian-6week3month-new.qza

qiime feature-table filter-samples --i-table ${project}/merged-table-filtered-re-noother-noasian-new.qza \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --p-where "[Host_age_cat]='3-12 months'" \
    --o-filtered-table ${project}/merged-table-filtered-re-noother-noasian-312month-new.qza

qiime feature-table filter-samples --i-table ${project}/merged-table-filtered-re-noother-noasian-new.qza \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --p-where "[Host_age_cat]='1-3 years'" \
    --o-filtered-table ${project}/merged-table-filtered-re-noother-noasian-13year-new.qza

qiime feature-table filter-samples --i-table ${project}/merged-table-filtered-re-noother-noasian-new.qza \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --p-where "[Host_age_cat] IN ('3-6 years', '6-12 years')" \
    --o-filtered-table ${project}/merged-table-filtered-re-noother-noasian-312year-new.qza
    
qiime feature-table filter-samples --i-table ${project}/merged-table-filtered-re-noother-new.qza \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --p-where "[Study]='Kim'" \
    --o-filtered-table ${project}/merged-table-filtered-re-kim.qza

qiime feature-table filter-samples --i-table ${project}/merged-table-filtered-re-noother-new.qza \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --p-where "[Study]='Herman'" \
    --o-filtered-table ${project}/merged-table-filtered-re-herman.qza
    
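#Herman table again, also excluding samples with no age category (overwrites the output of the previous command)
qiime feature-table filter-samples --i-table ${project}/merged-table-filtered-re-noother-new.qza \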
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --p-where "[Study]='Herman' AND NOT [Host_age_cat]='NA'" \
    --o-filtered-table ${project}/merged-table-filtered-re-herman.qza

qiime feature-table filter-samples --i-table ${project}/merged-table-filtered-re-noother-new.qza \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --p-where "[Study]='Chu'" \
    --o-filtered-table ${project}/merged-table-filtered-re-chu.qza

qiime feature-table filter-samples --i-table ${project}/merged-table-filtered-re-noother-new.qza \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --p-where "[Study]='Robinson'" \
    --o-filtered-table ${project}/merged-table-filtered-re-robinson.qza

qiime feature-table filter-samples --i-table ${project}/merged-table-filtered-re-noother-new.qza \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --p-where "[Study]='Planer'" \
    --o-filtered-table ${project}/merged-table-filtered-re-planer.qza

qiime feature-table filter-samples --i-table ${project}/merged-table-filtered-re-noother-new.qza \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --p-where "[Study]='Levin' AND NOT [Breastfeeding]='NA'" \
    --o-filtered-table ${project}/merged-table-filtered-re-levin.qza

qiime feature-table filter-samples --i-table ${project}/merged-table-filtered-re-noother-new.qza \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --p-where "[Study]='Dominguez-Bello' AND NOT [host_sex]='NA' AND NOT [Breastfeeding]='NA'" \
    --o-filtered-table ${project}/merged-table-filtered-re-db.qza
    
qiime feature-table filter-samples --i-table ${project}/merged-table-filtered-re-noother-new.qza \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --p-where "[Study]='Cioffi'" \
    --o-filtered-table ${project}/merged-table-filtered-re-cioffi.qza

### Alpha and beta diversity analyses

Alpha and beta diversity metrics were calculated in QIIME2, and the results were then exported to R for statistical analysis.
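
QIIME2 artifacts can be unpacked to plain text files for R with `qiime tools export`. A minimal sketch, assuming the default output names of `core-metrics-phylogenetic` (for example `faith_pd_vector.qza` and `bray_curtis_distance_matrix.qza`); the function name `export_core_metrics` and the `exported` subdirectory are hypothetical and not part of the original workflow.

```shell
# Hypothetical helper: export every alpha diversity vector and distance
# matrix in a core-metrics output directory to its own subdirectory.
export_core_metrics() {
    local metrics_dir="$1"
    local artifact name
    for artifact in "${metrics_dir}"/*_vector.qza "${metrics_dir}"/*_distance_matrix.qza; do
        [ -e "${artifact}" ] || continue     # glob matched nothing
        name="$(basename "${artifact}" .qza)"
        qiime tools export --input-path "${artifact}" \
            --output-path "${metrics_dir}/exported/${name}"
    done
}

# Example usage:
# export_core_metrics "${project}"/core-metrics-5000-re
```

Each export directory should then hold a tab-separated file that R can read directly, for instance with `read.delim()`.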

In [None]:
project=/Users/elizabethmallott/Dropbox/Projects/VMI/children

qiime diversity core-metrics-phylogenetic --i-phylogeny ${project}/insertion-tree-new.qza \
    --i-table ${project}/merged-table-filtered-re-noother-new.qza --p-sampling-depth 5000 \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --output-dir ${project}/core-metrics-5000-re

qiime diversity alpha --i-table ${project}/merged-table-filtered-re-noother-new.qza \
    --p-metric 'pielou_e' --o-alpha-diversity ${project}/core-metrics-5000-re/pielou_e_vector.qza
    
qiime diversity alpha --i-table ${project}/merged-table-filtered-re-noother-new.qza \
    --p-metric 'chao1' --o-alpha-diversity ${project}/core-metrics-5000-re/chao1_vector.qza

qiime diversity core-metrics-phylogenetic --i-phylogeny ${project}/insertion-tree-new.qza \
    --i-table ${project}/merged-table-filtered-re-age-sex-delivery-feeding-noother.qza --p-sampling-depth 5000 \
    --m-metadata-file ${project}/metadata-combined-re-age-sex-delivery-feeding-noother.txt \
    --output-dir ${project}/core-metrics-5000-re-age-sex-delivery-feeding

qiime diversity core-metrics-phylogenetic --i-phylogeny ${project}/insertion-tree-new.qza \
    --i-table ${project}/merged-table-filtered-re-noother-noasian-new.qza --p-sampling-depth 5000 \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --output-dir ${project}/core-metrics-5000-re-noasian

qiime diversity core-metrics-phylogenetic --i-phylogeny ${project}/insertion-tree-new.qza \
    --i-table ${project}/merged-table-filtered-re-age-sex-delivery-feeding-noother-noasian.qza --p-sampling-depth 5000 \
    --m-metadata-file ${project}/metadata-combined-re-age-sex-delivery-feeding-noother.txt \
    --output-dir ${project}/core-metrics-5000-re-age-sex-delivery-feeding-noasian

qiime diversity core-metrics-phylogenetic --i-phylogeny ${project}/insertion-tree-new.qza \
    --i-table ${project}/merged-table-filtered-re-onesubj-noother-new.qza --p-sampling-depth 5000 \
    --m-metadata-file ${project}/metadata-combined-re-onesubj.txt \
    --output-dir ${project}/core-metrics-5000-re-onesubj-new

qiime diversity core-metrics-phylogenetic --i-phylogeny ${project}/insertion-tree-new.qza \
    --i-table ${project}/merged-table-filtered-re-age-sex-delivery-feeding-onesubj-noother.qza --p-sampling-depth 5000 \
    --m-metadata-file ${project}/metadata-combined-re-age-sex-delivery-feeding-onesubj-noother.txt \
    --output-dir ${project}/core-metrics-5000-re-age-sex-delivery-feeding-onesubj

qiime diversity core-metrics-phylogenetic --i-phylogeny ${project}/insertion-tree-new.qza \
    --i-table ${project}/merged-table-filtered-re-01week-new.qza --p-sampling-depth 5000 \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --output-dir ${project}/core-metrics-5000-re-01week

qiime diversity core-metrics-phylogenetic --i-phylogeny ${project}/insertion-tree-new.qza \
    --i-table ${project}/merged-table-filtered-re-16week-new.qza --p-sampling-depth 5000 \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --output-dir ${project}/core-metrics-5000-re-16week

qiime diversity core-metrics-phylogenetic --i-phylogeny ${project}/insertion-tree-new.qza \
    --i-table ${project}/merged-table-filtered-re-06week-new.qza --p-sampling-depth 5000 \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --output-dir ${project}/core-metrics-5000-re-06week

qiime diversity core-metrics-phylogenetic --i-phylogeny ${project}/insertion-tree-new.qza \
    --i-table ${project}/merged-table-filtered-re-6week3month-new.qza --p-sampling-depth 5000 \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --output-dir ${project}/core-metrics-5000-re-6week3month

qiime diversity core-metrics-phylogenetic --i-phylogeny ${project}/insertion-tree-new.qza \
    --i-table ${project}/merged-table-filtered-re-312month-new.qza --p-sampling-depth 5000 \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --output-dir ${project}/core-metrics-5000-re-312month

qiime diversity core-metrics-phylogenetic --i-phylogeny ${project}/insertion-tree-new.qza \
    --i-table ${project}/merged-table-filtered-re-012month-new.qza --p-sampling-depth 5000 \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --output-dir ${project}/core-metrics-5000-re-012month

qiime diversity core-metrics-phylogenetic --i-phylogeny ${project}/insertion-tree-new.qza \
    --i-table ${project}/merged-table-filtered-re-13year-new.qza --p-sampling-depth 5000 \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --output-dir ${project}/core-metrics-5000-re-13year

qiime diversity core-metrics-phylogenetic --i-phylogeny ${project}/insertion-tree-new.qza \
    --i-table ${project}/merged-table-filtered-re-36year-new.qza --p-sampling-depth 5000 \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --output-dir ${project}/core-metrics-5000-re-36year

qiime diversity core-metrics-phylogenetic --i-phylogeny ${project}/insertion-tree-new.qza \
    --i-table ${project}/merged-table-filtered-re-612year-new.qza --p-sampling-depth 5000 \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --output-dir ${project}/core-metrics-5000-re-612year

qiime diversity core-metrics-phylogenetic --i-phylogeny ${project}/insertion-tree-new.qza \
    --i-table ${project}/merged-table-filtered-re-312year-new.qza --p-sampling-depth 5000 \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --output-dir ${project}/core-metrics-5000-re-312year

qiime diversity core-metrics-phylogenetic --i-phylogeny ${project}/insertion-tree-new.qza \
    --i-table ${project}/merged-table-filtered-re-noother-noasian-01week-new.qza --p-sampling-depth 5000 \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --output-dir ${project}/core-metrics-5000-re-noother-noasian-01week

qiime diversity core-metrics-phylogenetic --i-phylogeny ${project}/insertion-tree-new.qza \
    --i-table ${project}/merged-table-filtered-re-noother-noasian-16week-new.qza --p-sampling-depth 5000 \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --output-dir ${project}/core-metrics-5000-re-noother-noasian-16week

qiime diversity core-metrics-phylogenetic --i-phylogeny ${project}/insertion-tree-new.qza \
    --i-table ${project}/merged-table-filtered-re-noother-noasian-6week3month-new.qza --p-sampling-depth 5000 \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --output-dir ${project}/core-metrics-5000-re-noother-noasian-6week3month

qiime diversity core-metrics-phylogenetic --i-phylogeny ${project}/insertion-tree-new.qza \
    --i-table ${project}/merged-table-filtered-re-noother-noasian-312month-new.qza --p-sampling-depth 5000 \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --output-dir ${project}/core-metrics-5000-re-noother-noasian-312month

qiime diversity core-metrics-phylogenetic --i-phylogeny ${project}/insertion-tree-new.qza \
    --i-table ${project}/merged-table-filtered-re-noother-noasian-13year-new.qza --p-sampling-depth 5000 \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --output-dir ${project}/core-metrics-5000-re-noother-noasian-13year

qiime diversity core-metrics-phylogenetic --i-phylogeny ${project}/insertion-tree-new.qza \
    --i-table ${project}/merged-table-filtered-re-noother-noasian-312year-new.qza --p-sampling-depth 5000 \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --output-dir ${project}/core-metrics-5000-re-noother-noasian-312year

qiime diversity core-metrics-phylogenetic --i-phylogeny ${project}/insertion-tree-new.qza \
    --i-table ${project}/merged-table-filtered-re-noother-new.qza --p-sampling-depth 5000 \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --output-dir ${project}/core-metrics-5000-re-noother
    
qiime diversity core-metrics-phylogenetic --i-phylogeny ${project}/insertion-tree-new.qza \
    --i-table ${project}/merged-table-filtered-re-kim.qza --p-sampling-depth 5000 \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --output-dir ${project}/core-metrics-5000-re-kim
    
qiime diversity core-metrics-phylogenetic --i-phylogeny ${project}/insertion-tree-new.qza \
    --i-table ${project}/merged-table-filtered-re-herman.qza --p-sampling-depth 5000 \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --output-dir ${project}/core-metrics-5000-re-herman

qiime diversity core-metrics-phylogenetic --i-phylogeny ${project}/insertion-tree-new.qza \
    --i-table ${project}/merged-table-filtered-re-chu.qza --p-sampling-depth 5000 \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --output-dir ${project}/core-metrics-5000-re-chu

qiime diversity core-metrics-phylogenetic --i-phylogeny ${project}/insertion-tree-new.qza \
    --i-table ${project}/merged-table-filtered-re-robinson.qza --p-sampling-depth 5000 \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --output-dir ${project}/core-metrics-5000-re-robinson
    
qiime diversity core-metrics-phylogenetic --i-phylogeny ${project}/insertion-tree-new.qza \
    --i-table ${project}/merged-table-filtered-re-planer.qza --p-sampling-depth 5000 \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --output-dir ${project}/core-metrics-5000-re-planer

qiime diversity core-metrics-phylogenetic --i-phylogeny ${project}/insertion-tree-new.qza \
    --i-table ${project}/merged-table-filtered-re-levin.qza --p-sampling-depth 5000 \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --output-dir ${project}/core-metrics-5000-re-levin

qiime diversity core-metrics-phylogenetic --i-phylogeny ${project}/insertion-tree-new.qza \
    --i-table ${project}/merged-table-filtered-re-db.qza --p-sampling-depth 5000 \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --output-dir ${project}/core-metrics-5000-re-db

qiime diversity core-metrics-phylogenetic --i-phylogeny ${project}/insertion-tree-new.qza \
    --i-table ${project}/merged-table-filtered-re-cioffi.qza --p-sampling-depth 5000 \
    --m-metadata-file ${project}/metadata-combined-re.txt \
    --output-dir ${project}/core-metrics-5000-re-cioffi

qiime tools extract --input-path ${project}/core-metrics-5000-re/unweighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re

qiime tools extract --input-path ${project}/core-metrics-5000-re/weighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re

qiime tools extract --input-path ${project}/core-metrics-5000-re-age-sex-delivery-feeding/unweighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-age-sex-delivery-feeding

qiime tools extract --input-path ${project}/core-metrics-5000-re-age-sex-delivery-feeding/weighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-age-sex-delivery-feeding

qiime tools extract --input-path ${project}/core-metrics-5000-re-age-sex-delivery-feeding/bray_curtis_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-age-sex-delivery-feeding

qiime tools extract --input-path ${project}/core-metrics-5000-re-age-sex-delivery-feeding-noasian/unweighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-age-sex-delivery-feeding-noasian

qiime tools extract --input-path ${project}/core-metrics-5000-re-age-sex-delivery-feeding-noasian/weighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-age-sex-delivery-feeding-noasian

qiime tools extract --input-path ${project}/core-metrics-5000-re-age-sex-delivery-feeding-noasian/bray_curtis_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-age-sex-delivery-feeding-noasian
  
qiime tools extract --input-path ${project}/core-metrics-5000-re-onesubj-new/unweighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-onesubj-new

qiime tools extract --input-path ${project}/core-metrics-5000-re-onesubj-new/weighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-onesubj-new

qiime tools extract --input-path ${project}/core-metrics-5000-re-age-sex-delivery-feeding-onesubj/weighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-age-sex-delivery-feeding-onesubj

qiime tools extract --input-path ${project}/core-metrics-5000-re-age-sex-delivery-feeding-onesubj/unweighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-age-sex-delivery-feeding-onesubj

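# Each alpha diversity export writes alpha-diversity.tsv into the output directory,
# so rename or move that file before running the next export.
qiime tools export --input-path ${project}/core-metrics-5000-re/faith_pd_vector.qza \
    --output-path ${project}/alpha_out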

qiime tools export --input-path ${project}/core-metrics-5000-re/shannon_vector.qza \
    --output-path ${project}/alpha_out

qiime tools export --input-path ${project}/core-metrics-5000-re/observed_features_vector.qza \
    --output-path ${project}/alpha_out

qiime tools export --input-path ${project}/core-metrics-5000-re/pielou_e_vector.qza \
    --output-path ${project}/alpha_out

qiime tools export --input-path ${project}/core-metrics-5000-re/chao1_vector.qza \
    --output-path ${project}/alpha_out

qiime tools extract --input-path ${project}/core-metrics-5000-re-01week/weighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-01week

qiime tools extract --input-path ${project}/core-metrics-5000-re-01week/unweighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-01week
   
qiime tools extract --input-path ${project}/core-metrics-5000-re-01week/bray_curtis_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-01week

qiime tools extract --input-path ${project}/core-metrics-5000-re-16week/weighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-16week

qiime tools extract --input-path ${project}/core-metrics-5000-re-16week/unweighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-16week

qiime tools extract --input-path ${project}/core-metrics-5000-re-16week/bray_curtis_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-16week

qiime tools extract --input-path ${project}/core-metrics-5000-re-06week/weighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-06week

qiime tools extract --input-path ${project}/core-metrics-5000-re-06week/unweighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-06week

qiime tools extract --input-path ${project}/core-metrics-5000-re-6week3month/weighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-6week3month

qiime tools extract --input-path ${project}/core-metrics-5000-re-6week3month/unweighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-6week3month

qiime tools extract --input-path ${project}/core-metrics-5000-re-6week3month/bray_curtis_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-6week3month
    
qiime tools extract --input-path ${project}/core-metrics-5000-re-312month/weighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-312month

qiime tools extract --input-path ${project}/core-metrics-5000-re-312month/unweighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-312month

qiime tools extract --input-path ${project}/core-metrics-5000-re-312month/bray_curtis_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-312month

qiime tools extract --input-path ${project}/core-metrics-5000-re-012month/weighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-012month

qiime tools extract --input-path ${project}/core-metrics-5000-re-012month/unweighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-012month

qiime tools extract --input-path ${project}/core-metrics-5000-re-13year/weighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-13year

qiime tools extract --input-path ${project}/core-metrics-5000-re-13year/unweighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-13year

qiime tools extract --input-path ${project}/core-metrics-5000-re-13year/bray_curtis_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-13year

qiime tools extract --input-path ${project}/core-metrics-5000-re-36year/weighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-36year

qiime tools extract --input-path ${project}/core-metrics-5000-re-36year/unweighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-36year

qiime tools extract --input-path ${project}/core-metrics-5000-re-612year/weighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-612year

qiime tools extract --input-path ${project}/core-metrics-5000-re-612year/unweighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-612year

qiime tools extract --input-path ${project}/core-metrics-5000-re-312year/weighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-312year

qiime tools extract --input-path ${project}/core-metrics-5000-re-312year/unweighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-312year

qiime tools extract --input-path ${project}/core-metrics-5000-re-312year/bray_curtis_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-312year

qiime tools extract --input-path ${project}/core-metrics-5000-re-noother/weighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-noother

qiime tools extract --input-path ${project}/core-metrics-5000-re-noother/unweighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-noother
  
qiime tools extract --input-path ${project}/core-metrics-5000-re-noother/bray_curtis_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-noother

qiime tools extract --input-path ${project}/core-metrics-5000-re-noother-noasian-01week/weighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-noother-noasian-01week

qiime tools extract --input-path ${project}/core-metrics-5000-re-noother-noasian-01week/unweighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-noother-noasian-01week
   
qiime tools extract --input-path ${project}/core-metrics-5000-re-noother-noasian-01week/bray_curtis_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-noother-noasian-01week

qiime tools extract --input-path ${project}/core-metrics-5000-re-noother-noasian-16week/weighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-noother-noasian-16week

qiime tools extract --input-path ${project}/core-metrics-5000-re-noother-noasian-16week/unweighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-noother-noasian-16week

qiime tools extract --input-path ${project}/core-metrics-5000-re-noother-noasian-16week/bray_curtis_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-noother-noasian-16week

qiime tools extract --input-path ${project}/core-metrics-5000-re-noother-noasian-6week3month/weighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-noother-noasian-6week3month

qiime tools extract --input-path ${project}/core-metrics-5000-re-noother-noasian-6week3month/unweighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-noother-noasian-6week3month

qiime tools extract --input-path ${project}/core-metrics-5000-re-noother-noasian-6week3month/bray_curtis_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-noother-noasian-6week3month
    
qiime tools extract --input-path ${project}/core-metrics-5000-re-noother-noasian-312month/weighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-noother-noasian-312month

qiime tools extract --input-path ${project}/core-metrics-5000-re-noother-noasian-312month/unweighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-noother-noasian-312month

qiime tools extract --input-path ${project}/core-metrics-5000-re-noother-noasian-312month/bray_curtis_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-noother-noasian-312month

qiime tools extract --input-path ${project}/core-metrics-5000-re-noother-noasian-13year/weighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-noother-noasian-13year

qiime tools extract --input-path ${project}/core-metrics-5000-re-noother-noasian-13year/unweighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-noother-noasian-13year

qiime tools extract --input-path ${project}/core-metrics-5000-re-noother-noasian-13year/bray_curtis_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-noother-noasian-13year

qiime tools extract --input-path ${project}/core-metrics-5000-re-noother-noasian-312year/weighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-noother-noasian-312year

qiime tools extract --input-path ${project}/core-metrics-5000-re-noother-noasian-312year/unweighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-noother-noasian-312year

qiime tools extract --input-path ${project}/core-metrics-5000-re-noother-noasian-312year/bray_curtis_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-noother-noasian-312year

qiime tools extract --input-path ${project}/core-metrics-5000-re-noasian/weighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-noasian

qiime tools extract --input-path ${project}/core-metrics-5000-re-noasian/unweighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-noasian
  
qiime tools extract --input-path ${project}/core-metrics-5000-re-noasian/bray_curtis_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-noasian
    
qiime tools extract --input-path ${project}/core-metrics-5000-re-kim/weighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-kim

qiime tools extract --input-path ${project}/core-metrics-5000-re-kim/unweighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-kim
    
qiime tools extract --input-path ${project}/core-metrics-5000-re-chu/weighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-chu

qiime tools extract --input-path ${project}/core-metrics-5000-re-chu/unweighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-chu

qiime tools extract --input-path ${project}/core-metrics-5000-re-robinson/weighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-robinson

qiime tools extract --input-path ${project}/core-metrics-5000-re-robinson/unweighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-robinson

qiime tools extract --input-path ${project}/core-metrics-5000-re-herman/weighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-herman

qiime tools extract --input-path ${project}/core-metrics-5000-re-herman/unweighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-herman

qiime tools extract --input-path ${project}/core-metrics-5000-re-cioffi/weighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-cioffi

qiime tools extract --input-path ${project}/core-metrics-5000-re-cioffi/unweighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-cioffi

qiime tools extract --input-path ${project}/core-metrics-5000-re-planer/weighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-planer

qiime tools extract --input-path ${project}/core-metrics-5000-re-planer/unweighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-planer

qiime tools extract --input-path ${project}/core-metrics-5000-re-levin/weighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-levin

qiime tools extract --input-path ${project}/core-metrics-5000-re-levin/unweighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-levin

qiime tools extract --input-path ${project}/core-metrics-5000-re-db/weighted_unifrac_distance_matrix.qza \
    --output-path ${project}/core-metrics-5000-re-db
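
The extraction commands above repeat the same call for each subset and distance metric. A minimal sketch of the same pattern written as a loop is shown below. The `run` helper, the `DRY_RUN` switch, and the placeholder default for `project` are assumptions added for illustration; the subset list covers only the age subsets for which all three distance matrices are extracted above.

```shell
# Sketch: loop over subsets and metrics instead of writing each extract call out.
project=${project:-/path/to/children}   # placeholder default (assumption)
DRY_RUN=${DRY_RUN:-1}                   # 1 = print the commands, 0 = run them (assumption)

# Print a command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

extract_matrices() {
    for subset in 01week 16week 6week3month 312month 13year 312year; do
        dir="${project}/core-metrics-5000-re-${subset}"
        for metric in weighted_unifrac unweighted_unifrac bray_curtis; do
            run qiime tools extract \
                --input-path "${dir}/${metric}_distance_matrix.qza" \
                --output-path "${dir}"
        done
    done
}

extract_matrices
```

With `DRY_RUN=1` the commands are only printed, which makes it easy to compare them against the explicit list above before running anything.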




### Taxa bar plots

Taxa bar plots were constructed for both taxonomic assignment methods. Feature tables were then converted to relative abundance and presence/absence tables and exported.

In [None]:
project=/Users/elizabethmallott/Dropbox/Projects/VMI/children

qiime taxa barplot --i-table ${project}/merged-table-filtered.qza --i-taxonomy ${project}/merged-taxonomy.qza \
    --m-metadata-file ${project}/metadata-combined.txt --o-visualization ${project}/taxa_plots_merged_filtered.qzv

qiime taxa barplot --i-table ${project}/merged-table-filtered-otu.qza --i-taxonomy ${project}/merged-otu-taxonomy.qza \
    --m-metadata-file ${project}/metadata-combined.txt --o-visualization ${project}/taxa_plots_merged_filtered-otu.qzv
    
qiime feature-table relative-frequency --i-table ${project}/merged-table-nomito-nochloro-new.qza \
    --o-relative-frequency-table ${project}/merged-table-nomito-nochloro-new-relab.qza
    
qiime feature-table presence-absence --i-table ${project}/merged-table-nomito-nochloro-new.qza \
    --o-presence-absence-table ${project}/merged-table-nomito-nochloro-new-pa.qza

qiime tools export --input-path ${project}/merged-table-nomito-nochloro-new-relab.qza \
    --output-path ${project}/exported

qiime tools export --input-path ${project}/merged-taxonomy-new.qza --output-path ${project}/exported

cp ${project}/exported/taxonomy.tsv ${project}/biom-taxonomy.tsv #manually edit the tab-separated header line to: #OTUID taxonomy confidence

biom add-metadata -i ${project}/exported/feature-table.biom -o ${project}/table-with-taxonomy.biom \
    --observation-metadata-fp ${project}/biom-taxonomy.tsv --sc-separated taxonomy

biom convert -i ${project}/table-with-taxonomy.biom -o ${project}/feature-table-full-relab.tsv \
    --to-tsv --header-key taxonomy
    
qiime tools export --input-path ${project}/merged-table-nomito-nochloro-new.qza \
    --output-path ${project}/exported

qiime tools export --input-path ${project}/merged-taxonomy-new.qza --output-path ${project}/exported

cp ${project}/exported/taxonomy.tsv ${project}/biom-taxonomy.tsv #manually edit the tab-separated header line to: #OTUID taxonomy confidence

biom add-metadata -i ${project}/exported/feature-table.biom -o ${project}/table-with-taxonomy.biom \
    --observation-metadata-fp ${project}/biom-taxonomy.tsv --sc-separated taxonomy

biom convert -i ${project}/table-with-taxonomy.biom -o ${project}/feature-table-full.tsv \
    --to-tsv --header-key taxonomy

qiime tools export --input-path ${project}/merged-table-nomito-nochloro-new-pa.qza \
    --output-path ${project}/exported

qiime tools export --input-path ${project}/merged-taxonomy-new.qza --output-path ${project}/exported

cp ${project}/exported/taxonomy.tsv ${project}/biom-taxonomy.tsv #manually edit the tab-separated header line to: #OTUID taxonomy confidence

biom add-metadata -i ${project}/exported/feature-table.biom -o ${project}/table-with-taxonomy.biom \
    --observation-metadata-fp ${project}/biom-taxonomy.tsv --sc-separated taxonomy

biom convert -i ${project}/table-with-taxonomy.biom -o ${project}/feature-table-full-pa.tsv \
    --to-tsv --header-key taxonomy
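
The header edit noted in the comments above was done by hand. A scripted version might look like the sketch below, which assumes the exported `taxonomy.tsv` has exactly three tab-separated columns (feature ID, taxon, confidence). The function name `rewrite_taxonomy_header` and the placeholder default for `project` are hypothetical.

```shell
# Sketch: replace the header line of an exported taxonomy.tsv so that
# `biom add-metadata` can read it. Assumes a three-column, tab-separated file.
project=${project:-/path/to/children}   # placeholder default (assumption)

rewrite_taxonomy_header() {
    # $1 = exported taxonomy.tsv, $2 = file to write
    {
        printf '#OTUID\ttaxonomy\tconfidence\n'   # new header line
        tail -n +2 "$1"                            # all rows after the old header
    } > "$2"
}

# Only runs when the exported file is present.
if [ -f "${project}/exported/taxonomy.tsv" ]; then
    rewrite_taxonomy_header "${project}/exported/taxonomy.tsv" "${project}/biom-taxonomy.tsv"
fi
```

Using `printf` and `tail` rather than `sed -i` avoids the differences in in-place editing between GNU and BSD (macOS) `sed`.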

Feature tables were also collapsed to the phylum, family, genus, and species levels and exported. The collapsed tables were then converted to presence/absence tables and exported as well.

In [None]:
project=/Users/elizabethmallott/Dropbox/Projects/VMI/children

qiime taxa collapse --i-table ${project}/merged-table-nomito-nochloro-new.qza \
    --i-taxonomy ${project}/merged-taxonomy-new.qza \
    --p-level 2 --o-collapsed-table ${project}/merged-table-nomito-nochloro-new-phyla.qza

qiime taxa collapse --i-table ${project}/merged-table-nomito-nochloro-new.qza \
    --i-taxonomy ${project}/merged-taxonomy-new.qza \
    --p-level 5 --o-collapsed-table ${project}/merged-table-nomito-nochloro-new-family.qza

qiime taxa collapse --i-table ${project}/merged-table-nomito-nochloro-new.qza \
    --i-taxonomy ${project}/merged-taxonomy-new.qza \
    --p-level 6 --o-collapsed-table ${project}/merged-table-nomito-nochloro-new-genus.qza
    
qiime taxa collapse --i-table ${project}/merged-table-nomito-nochloro-new.qza \
    --i-taxonomy ${project}/merged-taxonomy-new.qza \
    --p-level 7 --o-collapsed-table ${project}/merged-table-nomito-nochloro-new-species.qza

qiime tools export --input-path ${project}/merged-table-nomito-nochloro-new-phyla.qza \
    --output-path ${project}/exported

biom convert -i ${project}/exported/feature-table.biom -o ${project}/feature-table-phyla.tsv --to-tsv

qiime tools export --input-path ${project}/merged-table-nomito-nochloro-new-family.qza \
    --output-path ${project}/exported

biom convert -i ${project}/exported/feature-table.biom -o ${project}/feature-table-family.tsv --to-tsv

qiime tools export --input-path ${project}/merged-table-nomito-nochloro-new-genus.qza \
    --output-path ${project}/exported

biom convert -i ${project}/exported/feature-table.biom -o ${project}/feature-table-genus.tsv --to-tsv

qiime tools export --input-path ${project}/merged-table-nomito-nochloro-new-species.qza \
    --output-path ${project}/exported

biom convert -i ${project}/exported/feature-table.biom -o ${project}/feature-table-species.tsv --to-tsv

qiime feature-table presence-absence --i-table ${project}/merged-table-nomito-nochloro-new-phyla.qza \
    --o-presence-absence-table ${project}/merged-table-nomito-nochloro-new-phyla-pa.qza

qiime feature-table presence-absence --i-table ${project}/merged-table-nomito-nochloro-new-family.qza \
    --o-presence-absence-table ${project}/merged-table-nomito-nochloro-new-family-pa.qza

qiime feature-table presence-absence --i-table ${project}/merged-table-nomito-nochloro-new-genus.qza \
    --o-presence-absence-table ${project}/merged-table-nomito-nochloro-new-genus-pa.qza

qiime feature-table presence-absence --i-table ${project}/merged-table-nomito-nochloro-new-species.qza \
    --o-presence-absence-table ${project}/merged-table-nomito-nochloro-new-species-pa.qza

qiime tools export --input-path ${project}/merged-table-nomito-nochloro-new-phyla-pa.qza \
    --output-path ${project}/exported

biom convert -i ${project}/exported/feature-table.biom -o ${project}/feature-table-phyla-pa.tsv --to-tsv

qiime tools export --input-path ${project}/merged-table-nomito-nochloro-new-family-pa.qza \
    --output-path ${project}/exported

biom convert -i ${project}/exported/feature-table.biom -o ${project}/feature-table-family-pa.tsv --to-tsv

qiime tools export --input-path ${project}/merged-table-nomito-nochloro-new-genus-pa.qza \
    --output-path ${project}/exported

biom convert -i ${project}/exported/feature-table.biom -o ${project}/feature-table-genus-pa.tsv --to-tsv

qiime tools export --input-path ${project}/merged-table-nomito-nochloro-new-species-pa.qza \
    --output-path ${project}/exported

biom convert -i ${project}/exported/feature-table.biom -o ${project}/feature-table-species-pa.tsv --to-tsv
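
The collapse, export, and convert steps above follow one pattern per taxonomic level. The sketch below writes that pattern as a loop, pairing each output label with the `--p-level` value used above (2, 5, 6, 7). The `run` helper, the `DRY_RUN` switch, and the placeholder default for `project` are assumptions for illustration, and the presence/absence step is left out for brevity.

```shell
# Sketch: collapse, export, and convert once per taxonomic level.
project=${project:-/path/to/children}   # placeholder default (assumption)
DRY_RUN=${DRY_RUN:-1}                   # 1 = print the commands, 0 = run them (assumption)

# Print a command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

collapse_and_export() {
    # Each pair is label:level, matching the file names and levels used above.
    for pair in phyla:2 family:5 genus:6 species:7; do
        name=${pair%%:*}
        level=${pair##*:}
        collapsed="${project}/merged-table-nomito-nochloro-new-${name}.qza"
        run qiime taxa collapse \
            --i-table "${project}/merged-table-nomito-nochloro-new.qza" \
            --i-taxonomy "${project}/merged-taxonomy-new.qza" \
            --p-level "${level}" --o-collapsed-table "${collapsed}"
        run qiime tools export --input-path "${collapsed}" \
            --output-path "${project}/exported"
        run biom convert -i "${project}/exported/feature-table.biom" \
            -o "${project}/feature-table-${name}.tsv" --to-tsv
    done
}

collapse_and_export
```

Because every export overwrites `exported/feature-table.biom`, the conversion to TSV has to happen inside the loop, immediately after each export.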

### Supervised machine learning

We used a Random Forest classifier to test whether metadata categories could be predicted from microbial composition. We ran this for all samples across all age categories combined and for infants 3-12 months of age alone, using both ASV-level and genus-level feature tables.
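
Every classifier run in this section uses the same settings and differs only in the metadata column, metadata file, and output directory. A minimal sketch of that shared call as a function is given here. The function name `classify_column`, the `run` helper, the `DRY_RUN` switch, and the placeholder default for `project` are assumptions for illustration; all `qiime` options are the ones used in the commands that follow.

```shell
# Sketch: one function for the repeated classify-samples call.
project=${project:-/path/to/children}   # placeholder default (assumption)
DRY_RUN=${DRY_RUN:-1}                   # 1 = print the commands, 0 = run them (assumption)

# Print a command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

classify_column() {
    # $1 = metadata column, $2 = metadata file, $3 = label for the output directory
    run qiime sample-classifier classify-samples \
        --i-table "${project}/merged-table-filtered-re-noother-new.qza" \
        --m-metadata-file "$2" --m-metadata-column "$1" \
        --p-missing-samples ignore --p-optimize-feature-selection \
        --p-parameter-tuning --p-estimator RandomForestClassifier --p-random-state 123 \
        --output-dir "${project}/merged-table-filtered-$3-classifier"
}

classify_column Race "${project}/metadata-combined-re.txt" race
classify_column Ethnicity "${project}/metadata-combined-re.txt" ethnicity
```

Keeping `--p-random-state 123` fixed across runs, as in the explicit commands, makes the results reproducible between columns and reruns.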

In [None]:
project=/Users/elizabethmallott/Dropbox/Projects/VMI/children

#All ages ASV level
qiime sample-classifier classify-samples \
  --i-table ${project}/merged-table-filtered-re-noother-new.qza --m-metadata-file ${project}/metadata-combined-re.txt \
  --m-metadata-column Race --p-missing-samples ignore --p-optimize-feature-selection \
  --p-parameter-tuning --p-estimator RandomForestClassifier --p-random-state 123 \
  --output-dir ${project}/merged-table-filtered-race-classifier

qiime metadata tabulate \
  --m-input-file ${project}/merged-table-filtered-race-classifier/feature_importance.qza \
  --o-visualization ${project}/merged-table-filtered-race-classifier/feature_importance.qzv
  
qiime metadata tabulate \
  --m-input-file ${project}/merged-table-filtered-race-classifier/predictions.qza \
  --o-visualization ${project}/merged-table-filtered-race-classifier/predictions.qzv

qiime metadata tabulate \
  --m-input-file ${project}/merged-table-filtered-race-classifier/probabilities.qza \
  --o-visualization ${project}/merged-table-filtered-race-classifier/probabilities.qzv

qiime sample-classifier heatmap \
  --i-table ${project}/merged-table-filtered-re-noother-new.qza \
  --i-importance ${project}/merged-table-filtered-race-classifier/feature_importance.qza \
  --m-sample-metadata-file ${project}/metadata-combined-re.txt \
  --m-sample-metadata-column Race \
  --p-group-samples \
  --p-feature-count 30 \
  --o-filtered-table ${project}/merged-table-filtered-race-classifier/important-feature-table-top-30.qza \
  --o-heatmap ${project}/merged-table-filtered-race-classifier/important-feature-heatmap.qzv
  
qiime sample-classifier classify-samples \
  --i-table ${project}/merged-table-filtered-re-noother-new.qza --m-metadata-file ${project}/metadata-combined-re.txt \
  --m-metadata-column Ethnicity --p-missing-samples ignore --p-optimize-feature-selection \
  --p-parameter-tuning --p-estimator RandomForestClassifier --p-random-state 123 \
  --output-dir ${project}/merged-table-filtered-ethnicity-classifier

qiime metadata tabulate \
  --m-input-file ${project}/merged-table-filtered-ethnicity-classifier/feature_importance.qza \
  --o-visualization ${project}/merged-table-filtered-ethnicity-classifier/feature_importance.qzv

qiime metadata tabulate \
  --m-input-file ${project}/merged-table-filtered-ethnicity-classifier/predictions.qza \
  --o-visualization ${project}/merged-table-filtered-ethnicity-classifier/predictions.qzv

qiime metadata tabulate \
  --m-input-file ${project}/merged-table-filtered-ethnicity-classifier/probabilities.qza \
  --o-visualization ${project}/merged-table-filtered-ethnicity-classifier/probabilities.qzv

qiime sample-classifier heatmap \
  --i-table ${project}/merged-table-filtered-re-noother-new.qza \
  --i-importance ${project}/merged-table-filtered-ethnicity-classifier/feature_importance.qza \
  --m-sample-metadata-file ${project}/metadata-combined-re.txt \
  --m-sample-metadata-column Ethnicity \
  --p-group-samples \
  --p-feature-count 30 \
  --o-filtered-table ${project}/merged-table-filtered-ethnicity-classifier/important-feature-table-top-30.qza \
  --o-heatmap ${project}/merged-table-filtered-ethnicity-classifier/important-feature-heatmap.qzv
  
qiime sample-classifier classify-samples \
  --i-table ${project}/merged-table-filtered-re-noother-new.qza \
  --m-metadata-file ${project}/metadata-combined-re-age-sex-delivery-feeding-noother.txt \
  --m-metadata-column host_sex --p-missing-samples ignore --p-optimize-feature-selection \
  --p-parameter-tuning --p-estimator RandomForestClassifier --p-random-state 123 \
  --output-dir ${project}/merged-table-filtered-sex-classifier
  
qiime sample-classifier classify-samples \
  --i-table ${project}/merged-table-filtered-re-noother-new.qza \
  --m-metadata-file ${project}/metadata-combined-re-age-sex-delivery-feeding-noother.txt \
  --m-metadata-column Host_age_cat --p-missing-samples ignore --p-optimize-feature-selection \
  --p-parameter-tuning --p-estimator RandomForestClassifier --p-random-state 123 \
  --output-dir ${project}/merged-table-filtered-age-classifier
  
qiime sample-classifier classify-samples \
  --i-table ${project}/merged-table-filtered-re-noother-new.qza \
  --m-metadata-file ${project}/metadata-combined-re-age-sex-delivery-feeding-noother.txt \
  --m-metadata-column Breastfeeding --p-missing-samples ignore --p-optimize-feature-selection \
  --p-parameter-tuning --p-estimator RandomForestClassifier --p-random-state 123 \
  --output-dir ${project}/merged-table-filtered-infantdiet-classifier
  
qiime sample-classifier classify-samples \
  --i-table ${project}/merged-table-filtered-re-noother-new.qza \
  --m-metadata-file ${project}/metadata-combined-re-age-sex-delivery-feeding-noother.txt \
  --m-metadata-column host_del_route --p-missing-samples ignore --p-optimize-feature-selection \
  --p-parameter-tuning --p-estimator RandomForestClassifier --p-random-state 123 \
  --output-dir ${project}/merged-table-filtered-delivery-classifier
  
qiime sample-classifier classify-samples \
  --i-table ${project}/merged-table-filtered-re-noother-new.qza \
  --m-metadata-file ${project}/metadata-combined-re.txt \
  --m-metadata-column Study --p-missing-samples ignore --p-optimize-feature-selection \
  --p-parameter-tuning --p-estimator RandomForestClassifier --p-random-state 123 \
  --output-dir ${project}/merged-table-filtered-study-classifier
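
#A minimal sketch, not part of the original run: the sex, age, infant diet, delivery, and study
#classifiers above are trained without the tabulation steps used for Race and Ethnicity.
#Assuming each output directory holds feature_importance.qza, predictions.qza, and probabilities.qza
#(as the Race and Ethnicity directories do), a hypothetical helper, tabulate_classifier_outputs,
#could repeat those three qiime metadata tabulate calls for any classifier directory.
tabulate_classifier_outputs() {
  local dir=$1
  local name
  for name in feature_importance predictions probabilities; do
    qiime metadata tabulate \
      --m-input-file "${dir}/${name}.qza" \
      --o-visualization "${dir}/${name}.qzv"
  done
}

#Example use (run after the classifiers above have finished):
#for target in sex age infantdiet delivery study; do
#  tabulate_classifier_outputs ${project}/merged-table-filtered-${target}-classifier
#done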
  
#3-12 months ASV level

qiime sample-classifier classify-samples \
  --i-table ${project}/merged-table-filtered-re-312month-new.qza \
  --m-metadata-file ${project}/metadata-combined-re.txt \
  --m-metadata-column Race --p-missing-samples ignore --p-optimize-feature-selection \
  --p-parameter-tuning --p-estimator RandomForestClassifier --p-random-state 123 \
  --output-dir ${project}/merged-table-filtered-312month-race-classifier

qiime metadata tabulate \
  --m-input-file ${project}/merged-table-filtered-312month-race-classifier/feature_importance.qza \
  --o-visualization ${project}/merged-table-filtered-312month-race-classifier/feature_importance.qzv
  
qiime metadata tabulate \
  --m-input-file ${project}/merged-table-filtered-312month-race-classifier/predictions.qza \
  --o-visualization ${project}/merged-table-filtered-312month-race-classifier/predictions.qzv

qiime metadata tabulate \
  --m-input-file ${project}/merged-table-filtered-312month-race-classifier/probabilities.qza \
  --o-visualization ${project}/merged-table-filtered-312month-race-classifier/probabilities.qzv
  
qiime sample-classifier classify-samples \
  --i-table ${project}/merged-table-filtered-re-312month-new.qza \
  --m-metadata-file ${project}/metadata-combined-re.txt \
  --m-metadata-column Ethnicity --p-missing-samples ignore --p-optimize-feature-selection \
  --p-parameter-tuning --p-estimator RandomForestClassifier --p-random-state 123 \
  --output-dir ${project}/merged-table-filtered-312month-ethnicity-classifier

qiime metadata tabulate \
  --m-input-file ${project}/merged-table-filtered-312month-ethnicity-classifier/feature_importance.qza \
  --o-visualization ${project}/merged-table-filtered-312month-ethnicity-classifier/feature_importance.qzv
  
qiime metadata tabulate \
  --m-input-file ${project}/merged-table-filtered-312month-ethnicity-classifier/predictions.qza \
  --o-visualization ${project}/merged-table-filtered-312month-ethnicity-classifier/predictions.qzv

qiime metadata tabulate \
  --m-input-file ${project}/merged-table-filtered-312month-ethnicity-classifier/probabilities.qza \
  --o-visualization ${project}/merged-table-filtered-312month-ethnicity-classifier/probabilities.qzv
  
#All ages species level

qiime sample-classifier classify-samples \
  --i-table ${project}/merged-table-nomito-nochloro-new-species.qza \
  --m-metadata-file ${project}/metadata-combined-re.txt \
  --m-metadata-column Race --p-missing-samples ignore --p-optimize-feature-selection \
  --p-parameter-tuning --p-estimator RandomForestClassifier --p-random-state 123 \
  --output-dir ${project}/merged-table-filtered-species-race-classifier

qiime metadata tabulate \
  --m-input-file ${project}/merged-table-filtered-species-race-classifier/feature_importance.qza \
  --o-visualization ${project}/merged-table-filtered-species-race-classifier/feature_importance.qzv
  
qiime metadata tabulate \
  --m-input-file ${project}/merged-table-filtered-species-race-classifier/predictions.qza \
  --o-visualization ${project}/merged-table-filtered-species-race-classifier/predictions.qzv

qiime metadata tabulate \
  --m-input-file ${project}/merged-table-filtered-species-race-classifier/probabilities.qza \
  --o-visualization ${project}/merged-table-filtered-species-race-classifier/probabilities.qzv
  
qiime sample-classifier classify-samples \
  --i-table ${project}/merged-table-nomito-nochloro-new-species.qza \
  --m-metadata-file ${project}/metadata-combined-re.txt \
  --m-metadata-column Ethnicity --p-missing-samples ignore --p-optimize-feature-selection \
  --p-parameter-tuning --p-estimator RandomForestClassifier --p-random-state 123 \
  --output-dir ${project}/merged-table-filtered-species-ethnicity-classifier

qiime metadata tabulate \
  --m-input-file ${project}/merged-table-filtered-species-ethnicity-classifier/feature_importance.qza \
  --o-visualization ${project}/merged-table-filtered-species-ethnicity-classifier/feature_importance.qzv
  
qiime metadata tabulate \
  --m-input-file ${project}/merged-table-filtered-species-ethnicity-classifier/predictions.qza \
  --o-visualization ${project}/merged-table-filtered-species-ethnicity-classifier/predictions.qzv

qiime metadata tabulate \
  --m-input-file ${project}/merged-table-filtered-species-ethnicity-classifier/probabilities.qza \
  --o-visualization ${project}/merged-table-filtered-species-ethnicity-classifier/probabilities.qzv

#3-12 months species level

qiime taxa collapse --i-table ${project}/merged-table-filtered-re-312month-new.qza \
    --i-taxonomy ${project}/merged-taxonomy-new.qza \
    --p-level 7 --o-collapsed-table ${project}/merged-table-nomito-nochloro-312month-new-species.qza

qiime sample-classifier classify-samples \
  --i-table ${project}/merged-table-nomito-nochloro-312month-new-species.qza \
  --m-metadata-file ${project}/metadata-combined-re.txt \
  --m-metadata-column Race --p-missing-samples ignore --p-optimize-feature-selection \
  --p-parameter-tuning --p-estimator RandomForestClassifier --p-random-state 123 \
  --output-dir ${project}/merged-table-filtered-312month-species-race-classifier

qiime metadata tabulate \
  --m-input-file ${project}/merged-table-filtered-312month-species-race-classifier/feature_importance.qza \
  --o-visualization ${project}/merged-table-filtered-312month-species-race-classifier/feature_importance.qzv
  
qiime metadata tabulate \
  --m-input-file ${project}/merged-table-filtered-312month-species-race-classifier/predictions.qza \
  --o-visualization ${project}/merged-table-filtered-312month-species-race-classifier/predictions.qzv

qiime metadata tabulate \
  --m-input-file ${project}/merged-table-filtered-312month-species-race-classifier/probabilities.qza \
  --o-visualization ${project}/merged-table-filtered-312month-species-race-classifier/probabilities.qzv
  
qiime sample-classifier classify-samples \
  --i-table ${project}/merged-table-nomito-nochloro-312month-new-species.qza \
  --m-metadata-file ${project}/metadata-combined-re.txt \
  --m-metadata-column Ethnicity --p-missing-samples ignore --p-optimize-feature-selection \
  --p-parameter-tuning --p-estimator RandomForestClassifier --p-random-state 123 \
  --output-dir ${project}/merged-table-filtered-312month-species-ethnicity-classifier

qiime metadata tabulate \
  --m-input-file ${project}/merged-table-filtered-312month-species-ethnicity-classifier/feature_importance.qza \
  --o-visualization ${project}/merged-table-filtered-312month-species-ethnicity-classifier/feature_importance.qzv
  
qiime metadata tabulate \
  --m-input-file ${project}/merged-table-filtered-312month-species-ethnicity-classifier/predictions.qza \
  --o-visualization ${project}/merged-table-filtered-312month-species-ethnicity-classifier/predictions.qzv

qiime metadata tabulate \
  --m-input-file ${project}/merged-table-filtered-312month-species-ethnicity-classifier/probabilities.qza \
  --o-visualization ${project}/merged-table-filtered-312month-species-ethnicity-classifier/probabilities.qzv
  
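#Classifier built from the childhood samples, for later testing on adult AGP data
#Note: --p-test-size is the fraction of samples held out for testing, so 0.9 trains on 10% of the
#childhood samples and tests on 90%; training on 90% of samples would need --p-test-size 0.1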
qiime sample-classifier classify-samples \
  --i-table ${project}/merged-table-filtered-re-noother-new.qza \
  --m-metadata-file ${project}/metadata-combined-re.txt \
  --m-metadata-column Race --p-missing-samples ignore --p-optimize-feature-selection \
  --p-parameter-tuning --p-estimator RandomForestClassifier --p-random-state 123 --p-test-size 0.9 \
  --output-dir ${project}/merged-table-filtered-race-classifier-1
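
#A hedged sketch of how the classifier above might then be applied to adult AGP samples.
#The helper name (predict_with_classifier), the AGP table path, and the output names are
#placeholders for illustration, not files or functions from this project.
#Predictions are only meaningful if the new table shares feature IDs with the training table,
#so the AGP data would need to be processed comparably.
predict_with_classifier() {
  local classifier_dir=$1   #directory written by classify-samples
  local new_table=$2        #feature table to classify (placeholder)
  local out_prefix=$3       #prefix for the prediction artifacts (placeholder)
  qiime sample-classifier predict-classification \
    --i-table "${new_table}" \
    --i-sample-estimator "${classifier_dir}/sample_estimator.qza" \
    --o-predictions "${out_prefix}-predictions.qza" \
    --o-probabilities "${out_prefix}-probabilities.qza"
}

#Example use, with a hypothetical AGP table:
#predict_with_classifier ${project}/merged-table-filtered-race-classifier-1 \
#  ${project}/agp-table.qza ${project}/agp-race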

