Here we tidy up the data: we denoise with dada2, then summarize and filter the resulting feature table.

Set your working directory, or make sure you are already in the right folder.

In [None]:
import os
os.chdir("<your path>")  # replace the placeholder with the folder that holds your data

Denoising with dada2.

 - input the demultiplexed data created in the data import code
 - trim according to the primers that you have used
 - truncate according to the quality plots
 - output representative sequences
 - output a dada2 feature table
 - define how many threads to use; more threads help with large datasets if you have many cores
 - output the denoising stats

In [None]:

!qiime dada2 denoise-paired \
--i-demultiplexed-seqs demux-paired-end.qza \
--p-trim-left-f <value> \
--p-trim-left-r <value> \
--p-trunc-len-f <value> \
--p-trunc-len-r <value> \
--o-representative-sequences rep-seqs-dada2.qza \
--o-table dada2-table.qza \
--p-n-threads <value> \
--o-denoising-stats dada2-stats.qza
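
When picking the two truncation lengths, it helps to check that the truncated forward and reverse reads would still overlap enough to be merged. The sketch below is only a rough arithmetic check, not part of the dada2 plugin: `expected_overlap` and `can_merge` are hypothetical helper names, `amplicon_len` is a value you have to supply yourself (the full amplicon length, primers included, assuming the reads still start with the primers), and the default minimum overlap of 12 bases is an assumption that you should verify against your dada2 version.

```python
def expected_overlap(trunc_len_f, trunc_len_r, amplicon_len):
    """Bases shared by the truncated forward and reverse reads.

    amplicon_len is an assumed, user-supplied amplicon length (primers
    included); it is not produced anywhere in this notebook.
    """
    return trunc_len_f + trunc_len_r - amplicon_len


def can_merge(trunc_len_f, trunc_len_r, amplicon_len, min_overlap=12):
    """True if the expected overlap reaches min_overlap.

    min_overlap=12 is an assumption; check the value your dada2 version uses.
    """
    return expected_overlap(trunc_len_f, trunc_len_r, amplicon_len) >= min_overlap


# Hypothetical numbers, for illustration only.
print(expected_overlap(240, 200, 420))  # 20
print(can_merge(240, 200, 420))         # True
print(can_merge(220, 180, 420))         # False, the reads no longer overlap
```

Real amplicons vary in length, so treat the result as a lower bound to stay clear of rather than an exact prediction.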

Summarize and visualize the feature table. Include the metadata file when you summarize.

In [None]:
!qiime feature-table summarize \
--i-table dada2-table.qza \
--o-visualization dada2-table.qzv \
--m-sample-metadata-file <your-sample-metadata.tsv>

Visualize the representative sequences.

In [None]:
!qiime feature-table tabulate-seqs \
--i-data rep-seqs-dada2.qza \
--o-visualization rep-seqs-dada2.qzv

Typically, OTUs/ASVs with few reads are removed for quality purposes.

Use the summary of the dada2 feature table produced above, together with the reads per sample listed in "demux-paired-end.qzv" from the previous section, as a guide when choosing a threshold that suits your data. As before, the visualization can be saved as a csv.

In [None]:
from qiime2 import Visualization

In [None]:
Visualization.load('dada2-table.qzv')

Based on what you know about your dataset, define a minimum frequency, i.e. the minimum number of reads per feature ID. This filters the dada2 feature table and saves the result with the prefix "minreads".

In [None]:
!qiime feature-table filter-features \
--i-table dada2-table.qza \
--p-min-frequency <value> \
--o-filtered-table minreads-dada2-table.qza
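
To make the effect of the minimum frequency concrete, here is a minimal sketch of the rule on a toy table. It assumes, following the plugin's help text, that a feature is kept when its total count summed over all samples is at least the threshold. The function name, the dictionary layout and the numbers are made up for illustration; the sketch does not read or write any .qza file.

```python
def filter_features_by_min_frequency(table, min_frequency):
    """Keep features whose total count across all samples is >= min_frequency.

    table maps feature ID -> {sample ID -> read count} (a toy layout,
    not the format QIIME 2 uses internally).
    """
    return {
        feature_id: counts
        for feature_id, counts in table.items()
        if sum(counts.values()) >= min_frequency
    }


# Toy data, for illustration only.
toy_table = {
    "ASV1": {"S1": 120, "S2": 80},  # total 200
    "ASV2": {"S1": 3, "S2": 1},     # total 4
    "ASV3": {"S1": 0, "S2": 10},    # total 10
}

filtered = filter_features_by_min_frequency(toy_table, min_frequency=10)
print(sorted(filtered))  # ['ASV1', 'ASV3']
```

Note that the threshold applies to the total over all samples, so a feature with few reads in each sample can still pass if it occurs in many samples.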

Below, the data are summarized and visualized again to see how the filtering affected the feature table.

In [None]:
!qiime feature-table summarize \
--i-table minreads-dada2-table.qza \
--o-visualization minreads-dada2-table.qzv \
--m-sample-metadata-file sample-metadata.tsv

In [None]:
from qiime2 import Visualization

In [None]:
Visualization.load('minreads-dada2-table.qzv')
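
When comparing the two summaries, one useful number is the fraction of reads each sample keeps after filtering. The sketch below shows that calculation on toy tables; `reads_per_sample` and `retained_fraction` are hypothetical helpers, and the tables are invented stand-ins for the per-sample totals you can read off the two visualizations.

```python
def reads_per_sample(table):
    """Sum read counts per sample for a table of feature ID -> {sample ID -> count}."""
    totals = {}
    for counts in table.values():
        for sample_id, n in counts.items():
            totals[sample_id] = totals.get(sample_id, 0) + n
    return totals


def retained_fraction(before, after):
    """Fraction of each sample's reads that remain after filtering."""
    before_totals = reads_per_sample(before)
    after_totals = reads_per_sample(after)
    return {
        sample_id: (after_totals.get(sample_id, 0) / total if total else 0.0)
        for sample_id, total in before_totals.items()
    }


# Toy tables, for illustration only.
before = {
    "ASV1": {"S1": 90, "S2": 50},
    "ASV2": {"S1": 10, "S2": 0},
}
after = {"ASV1": {"S1": 90, "S2": 50}}  # ASV2 was filtered out

print(retained_fraction(before, after))  # {'S1': 0.9, 'S2': 1.0}
```

A sample that loses a large share of its reads may indicate that the threshold is too strict for your data.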