## Running core diversity metrics in QIIME 2 on the total_sum_scaled data

First, I need to convert my TSV file to a QZA file so I can run the core diversity analysis.

In [1]:
## converting my tsv to a biom table as an intermediate
! biom convert \
    -i data/total_sum_scaling3.tsv \
    -o data/total_sum_scaling3.biom \
    --table-type "Table" \
    --to-hdf5

In [2]:
## converting my new biom table to a qza file 
! qiime tools import \
    --input-path data/total_sum_scaling3.biom \
    --type 'FeatureTable[Frequency]' \
    --output-path data/total_sum_scaling3.qza

Imported updated/total_sum_scaling3.biom as BIOMV210DirFmt to updated/total_sum_scaling3.qza

In [3]:
! qiime feature-table summarize \
    --i-table data/total_sum_scaling3.qza \
    --o-visualization data/total_sum_scaling3.qzv

Saved Visualization to: /Users/madiapgar/gut_microbiome_metabolomics/total_sum_scaled/total_sum_scaling2.qzv

In [4]:
! qiime tools view data/total_sum_scaling3.qzv

Press the 'q' key, Control-C, or Control-D to quit. This view may no longer be accessible or work correctly after quitting.

Converting the rep-seqs to a FASTA file (updated as of May 16, 2023).

In [6]:
! qiime tools export \
    --input-path data/euk_filt-mergedDietAim1rep-seqs_051523-Copy1.qza \
    --output-path /Users/madiapgar/gut_microbiome_metabolomics/total_sum_scaled/tss3/data

Exported /Users/madiapgar/gut_microbiome_metabolomics/CaseyandMadi/euk_filt-mergedDietAim1rep-seqs_051523-Copy1.qza as DNASequencesDirectoryFormat to directory /Users/madiapgar/gut_microbiome_metabolomics/total_sum_scaled/updated

Next, I need to use my rep-seqs file to generate a phylogenetic tree via SEPP.

In [8]:
! qiime fragment-insertion sepp \
    --i-representative-sequences data/euk_filt-mergedDietAim1rep-seqs_051523-Copy1.qza \
    --i-reference-database inputs/sepp-refs-silva-128.qza \
    --o-tree data/tree.qza \
    --o-placements data/placements.qza

^C


Next, I need to filter my feature table against the SEPP phylogenetic tree so that it only includes ASVs that were placed in the tree.

In [None]:
! qiime fragment-insertion filter-features \
    --i-table data/total_sum_scaling3.qza \
    --i-tree data/tree.qza \
    --o-filtered-table data/total-sum-filtered-table.qza \
    --o-removed-table data/total-sum-removed-table.qza
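
As a rough illustration of what this step produces, here is a minimal Python sketch of the split between the "filtered" and "removed" tables: features whose IDs are tips in the tree are kept, and the rest are set aside. The `split_by_tree` helper and the IDs are hypothetical and only stand in for the real table and tree.

```python
def split_by_tree(table_feature_ids, tree_tip_ids):
    """Split feature IDs into those found among the tree tips and those not found."""
    tips = set(tree_tip_ids)
    kept = [f for f in table_feature_ids if f in tips]
    removed = [f for f in table_feature_ids if f not in tips]
    return kept, removed


# Hypothetical IDs, for illustration only.
kept, removed = split_by_tree(
    ["asv1", "asv2", "asv3"],
    ["asv1", "asv3", "reference_tip"],
)
print("kept:", kept)
print("removed:", removed)
```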

Taxonomic classification and filtering steps that come before the alpha and beta diversity analyses.

In [None]:
! qiime feature-classifier classify-sklearn \
    --i-classifier /Users/madiapgar/Desktop/qiimework/silva-138-99-515-806-nb-classifier.qza \
    --i-reads data/euk_filt-mergedDietAim1rep-seqs_051523-Copy1.qza \
    --o-classification data/taxonomy.qza

In [None]:
! qiime metadata tabulate \
    --m-input-file data/taxonomy.qza \
    --o-visualization data/taxonomy.qzv

In [None]:
## filtering my feature table by taxonomy: keep phylum-level annotations, drop mitochondria and chloroplast
! qiime taxa filter-table \
    --i-table data/total_sum_scaling3.qza \
    --i-taxonomy data/taxonomy.qza \
    --p-include p_ \
    --p-exclude mitochondria,chloroplast \
    --o-filtered-table data/taxonomy_filtered.qza
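
To make the include/exclude logic concrete, here is a minimal Python sketch of how I understand this filter to behave: a feature is kept when its taxonomy string contains an include term and none of the exclude terms. The `keep_feature` helper and the example taxonomy strings are hypothetical, and the case-insensitive substring matching is an assumption about the default behaviour rather than something checked against the QIIME 2 source.

```python
def keep_feature(taxon, include=("p_",), exclude=("mitochondria", "chloroplast")):
    """Return True if the taxonomy string passes the include/exclude filter.

    Assumption: terms match as case-insensitive substrings.
    """
    label = taxon.lower()
    has_include = any(term.lower() in label for term in include)
    has_exclude = any(term.lower() in label for term in exclude)
    return has_include and not has_exclude


# Hypothetical taxonomy strings, for illustration only.
examples = [
    "d__Bacteria; p__Firmicutes; c__Bacilli",
    "d__Bacteria; p__Cyanobacteria; c__Cyanobacteriia; o__Chloroplast",
    "d__Bacteria",
    "Unassigned",
]
decisions = {taxon: keep_feature(taxon) for taxon in examples}
for taxon, keep in decisions.items():
    print(keep, taxon)
```

Only the first example passes: the second hits an exclude term, and the last two have no phylum-level annotation.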

In [None]:
## I'm creating a visualization of my taxonomy-filtered table; I can then use that table for my alpha/beta
## diversity analysis and for building my taxa bar chart
! qiime feature-table summarize \
    --i-table data/taxonomy_filtered.qza \
    --o-visualization data/taxonomy_filtered.qzv

In [2]:
! qiime tools view /Users/madiapgar/gut_microbiome_metabolomics/total_sum_scaled/tss3/data/taxonomy_filtered.qzv

Press the 'q' key, Control-C, or Control-D to quit. This view may no longer be accessible or work correctly after quitting.

Now I can run core metrics, since I have the two QZA files it needs (the tree and the filtered table).
 -- I need to know my sampling depth from my filtered table before I can do this.
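
Core metrics rarefies every sample to the sampling depth and drops any sample whose total falls below it, so a quick look at per-sample totals helps when picking the value. Below is a minimal Python sketch, assuming a TSV with features as rows, samples as columns, and the header on the first line. The helper names and the toy numbers are hypothetical and are not taken from my actual table.

```python
import csv
import io


def sample_totals(tsv_text):
    """Sum each sample column of a feature-by-sample TSV."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    samples = header[1:]  # first column holds the feature IDs
    totals = dict.fromkeys(samples, 0.0)
    for row in reader:
        if not row:
            continue  # skip blank lines
        for sample, value in zip(samples, row[1:]):
            totals[sample] += float(value)
    return totals


def dropped_at_depth(totals, depth):
    """List samples whose total is below the chosen sampling depth."""
    return sorted(s for s, total in totals.items() if total < depth)


# Toy table, for illustration only.
toy_tsv = "#OTU ID\ts1\ts2\nasv1\t60\t50\nasv2\t40\t40\n"
totals = sample_totals(toy_tsv)
print(totals)
print(dropped_at_depth(totals, 100))
```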

In [None]:
! qiime diversity core-metrics-phylogenetic \
    --i-phylogeny data/tree.qza \
    --i-table data/taxonomy_filtered.qza \
    --p-sampling-depth 99976 \
    --m-metadata-file data/total_sum_metadata.tsv \
    --output-dir tss_core_metrics3

Some of my samples couldn't go through core metrics, so I had to filter them out of the taxonomy-filtered table using my metadata file before I could re-run core metrics.
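
My understanding is that filter-samples, when given only a metadata file, keeps the samples whose IDs appear in that file, so removing the problem samples from the metadata is what drops them from the table. Here is a minimal Python sketch of that logic. The `metadata_sample_ids` helper, the toy metadata, and the sample IDs are all hypothetical.

```python
import csv
import io


def metadata_sample_ids(metadata_text):
    """Collect sample IDs from the first column of a metadata TSV."""
    reader = csv.reader(io.StringIO(metadata_text), delimiter="\t")
    next(reader)  # skip the header row
    ids = []
    for row in reader:
        # Skip blank rows and comment/directive rows such as "#q2:types".
        if not row or not row[0].strip() or row[0].startswith("#"):
            continue
        ids.append(row[0])
    return ids


# Toy metadata and table sample IDs, for illustration only.
toy_metadata = "sample-id\tdiet\n#q2:types\tcategorical\ns1\tchow\ns3\tchow\n"
table_sample_ids = ["s1", "s2", "s3"]

listed = set(metadata_sample_ids(toy_metadata))
retained = [s for s in table_sample_ids if s in listed]
print(retained)
```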

In [7]:
! qiime feature-table filter-samples \
    --i-table data/taxonomy_filtered.qza \
    --m-metadata-file data/total_sum_metadata.tsv \
    --o-filtered-table data/tax_filt_actual.qza

Saved FeatureTable[Frequency] to: /Users/madiapgar/gut_microbiome_metabolomics/total_sum_scaled/tax_filt_actual.qza

## fixing core outputs for R

I have to fix my Faith's PD QZA because R doesn't like that it has empty cells, so I'm going to convert it to a TSV and see if I can deal with it that way.

Putting my Faith's PD vector into qiime metadata tabulate gave me a TSV file that I could download once the visualization was opened. This got me past the problem of not being able to read it into R or do basically anything else with it. Go team!
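
Another possible route to the same TSV, sketched here in Python, relies on a .qza being a zip archive. The sketch assumes the alpha diversity values live at `<uuid>/data/alpha-diversity.tsv` inside the archive; that layout is my assumption about the artifact format, and the `read_artifact_data_file` helper and the stand-in archive below are hypothetical.

```python
import os
import tempfile
import zipfile


def read_artifact_data_file(qza_path, filename="alpha-diversity.tsv"):
    """Return the text of <top-level dir>/data/<filename> from a zip archive."""
    with zipfile.ZipFile(qza_path) as archive:
        for member in archive.namelist():
            parts = member.split("/")
            if len(parts) == 3 and parts[1] == "data" and parts[2] == filename:
                return archive.read(member).decode("utf-8")
    raise FileNotFoundError(f"{filename} not found under data/ in {qza_path}")


# Build a tiny stand-in archive so the sketch runs without a real artifact.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.qza")
    with zipfile.ZipFile(demo_path, "w") as archive:
        archive.writestr("0000/data/alpha-diversity.tsv", "\tfaith_pd\ns1\t4.2\n")
    demo_text = read_artifact_data_file(demo_path)

print(demo_text)
```

With a real artifact, the call would take the path to the faith_pd_vector.qza file in place of the stand-in.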

In [17]:
! qiime metadata tabulate \
    --m-input-file tss_core_metrics3/faith_pd_vector.qza \
    --o-visualization tss_core_metrics3/faith_pd_vector.qzv

Saved Visualization to: tss_core_metrics2/faith_pd_vector.qzv

In [20]:
! qiime tools view tss_core_metrics3/faith_pd_vector.qzv

Press the 'q' key, Control-C, or Control-D to quit. This view may no longer be accessible or work correctly after quitting.

In [12]:
! qiime diversity alpha-group-significance \
    --i-alpha-diversity tss_core_metrics3/faith_pd_vector.qza \
    --m-metadata-file data/total_sum_metadata.tsv \
    --o-visualization tss_core_metrics3/faith_pd.qzv

Saved Visualization to: tss_core_metrics2/faith_pd.qzv

In [19]:
! qiime tools view tss_core_metrics3/faith_pd.qzv

Press the 'q' key, Control-C, or Control-D to quit. This view may no longer be accessible or work correctly after quitting.

## Running core metrics on tss3 (total sum scaled, part three) with Lactococcus ASVs subtracted out (finally)

In [1]:
## checking the sampling depth for core metrics analysis 
! qiime feature-table summarize \
    --i-table data/tax_filt_actual.qza \
    --o-visualization data/tax_filt_actual.qzv

Saved Visualization to: total_sum_part_three/tax_filt_actual.qzv

In [2]:
! qiime tools view data/tax_filt_actual.qzv
## sampling depth should be 99976

Press the 'q' key, Control-C, or Control-D to quit. This view may no longer be accessible or work correctly after quitting.

In [3]:
## actual core metrics analysis run locally bc I'm tired of dealing with the remote server
! qiime diversity core-metrics-phylogenetic \
    --i-phylogeny data/tree.qza \
    --i-table data/tax_filt_actual.qza \
    --p-sampling-depth 99976 \
    --m-metadata-file data/total_sum_metadata.tsv \
    --output-dir tss_core_metrics3

Saved FeatureTable[Frequency] to: total_sum_part_three/tss_core_metrics3/rarefied_table.qza
Saved SampleData[AlphaDiversity] to: total_sum_part_three/tss_core_metrics3/faith_pd_vector.qza
Saved SampleData[AlphaDiversity] to: total_sum_part_three/tss_core_metrics3/observed_features_vector.qza
Saved SampleData[AlphaDiversity] to: total_sum_part_three/tss_core_metrics3/shannon_vector.qza
Saved SampleData[AlphaDiversity] to: total_sum_part_three/tss_core_metrics3/evenness_vector.qza
Saved DistanceMatrix to: total_sum_part_three/tss_core_metrics3/unweighted_unifrac_distance_matrix.qza
Saved DistanceMatrix to: total_sum_part_three/tss_core_metrics3/weighted_unifrac_distance_matrix.qza
Saved DistanceMatrix to: total_sum_part_three/tss_core_metrics3/jaccard_distance_matrix.qza
Saved DistanceMatrix to: total_sum_part_three/tss_core_metrics3/bray_curtis_distance_matrix.qza
Saved PCoAResults to: total_sum_part_t