# Filter Feature Table

This step sets up the 'base' feature table by removing sequences that are likely contaminants or spurious, including low-abundance features. The filtering is done in three steps:

1. `qiime feature-table filter-features` with `--p-min-samples 2`, to drop features observed in only one sample
2. `qiime taxa filter-table`, to retain only features annotated to at least the phylum level
3. `qiime feature-table filter-features` with `--p-exclude-ids`, to remove the identified contaminants

After filtering the feature table, filter the `rep-seqs.qza` file so that it contains only the features remaining in the filtered table. The resulting `filtered-rep-seqs.qza` file is then used for tree building.

The R script `identify-contaminants.R`, located in `/bin/scripts/`, was used to identify contaminants.

## Filter features that appear in only one sample

Features observed in only one sample are most likely spurious.

In [None]:
%%bash
qiime feature-table filter-features \
    --i-table table.qza \
    --p-min-samples 2 \
    --o-filtered-table filtered-table.qza
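
To make the effect of `--p-min-samples 2` concrete, here is a minimal sketch that applies the same rule to a toy tab-separated table with plain `awk`. The file names and counts are made up for illustration; this is not how QIIME 2 implements the filter, only a picture of which rows survive.

```shell
# Toy feature table (hypothetical data): rows are features, columns are samples.
printf 'feature-id\tS1\tS2\tS3\n' >  toy-table.tsv
printf 'featA\t10\t0\t0\n'        >> toy-table.tsv
printf 'featB\t3\t7\t0\n'         >> toy-table.tsv
printf 'featC\t0\t0\t0\n'         >> toy-table.tsv
printf 'featD\t1\t1\t1\n'         >> toy-table.tsv

# Keep the header plus every feature with a non-zero count in at least `min` samples.
awk -F'\t' -v min=2 '
NR == 1 { print; next }
{
    n = 0
    for (i = 2; i <= NF; i++) if ($i > 0) n++
    if (n >= min) print
}' toy-table.tsv > toy-filtered.tsv

cat toy-filtered.tsv
```

In this toy table `featA` (one sample) and `featC` (no samples) are dropped, while `featB` and `featD` are kept.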

## Filter features that are not assigned at least to the Phylum level

The SILVA annotations label taxonomic levels as D_0 = kingdom, D_1 = phylum, and so on. Retain only those features whose annotation reaches *at least* the phylum level, that is, whose taxonomy string contains `D_1__`.

In [None]:
%%bash
qiime taxa filter-table \
    --i-table filtered-table.qza \
    --i-taxonomy taxonomy.qza \
    --p-include D_1__ \
    --o-filtered-table taxa-filtered-table.qza
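
The sketch below illustrates why searching for `D_1__` selects phylum-level annotations, assuming `--p-include` matches the term as a substring of the taxonomy string (the `contains` mode). The taxonomy strings and file names are hypothetical examples in the SILVA `D_n__` style, not taken from the real `taxonomy.qza`.

```shell
# Hypothetical SILVA-style annotations: feature ID, then taxonomy string.
printf 'featA\tD_0__Bacteria;D_1__Firmicutes;D_2__Bacilli\n' >  toy-taxonomy.tsv
printf 'featB\tD_0__Bacteria\n'                              >> toy-taxonomy.tsv
printf 'featC\tUnassigned\n'                                 >> toy-taxonomy.tsv
printf 'featD\tD_0__Archaea;D_1__Euryarchaeota\n'            >> toy-taxonomy.tsv

# Keep the IDs of features whose annotation contains the phylum prefix "D_1__".
awk -F'\t' 'index($2, "D_1__") > 0 { print $1 }' toy-taxonomy.tsv > toy-phylum-ids.txt

cat toy-phylum-ids.txt
```

Here `featB` (kingdom only) and `featC` (unassigned) have no `D_1__` field and would be filtered out; `featA` and `featD` are retained.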

## Filter contaminants identified by decontam

Use the `contaminants.tsv` file created by the decontam script to filter out possible contaminant FeatureIDs.

In [None]:
%%bash
qiime feature-table filter-features \
    --i-table taxa-filtered-table.qza \
    --m-metadata-file contaminants.tsv \
    --p-exclude-ids \
    --o-filtered-table decontam-taxa-filtered-table.qza
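
With `--p-exclude-ids`, the feature IDs listed in the metadata file are removed rather than retained. As a rough illustration, the sketch below does the same thing on toy files with `awk`. The layout shown for the contaminant list (a `feature-id` header followed by one ID per row) is an assumption; the actual `contaminants.tsv` written by the decontam script may contain additional columns.

```shell
# Hypothetical contaminant list: a header line, then one feature ID per row.
printf 'feature-id\nfeatB\n' > toy-contaminants.tsv

# Toy feature table (hypothetical data).
printf 'feature-id\tS1\tS2\n' >  toy-decontam-in.tsv
printf 'featA\t5\t2\n'        >> toy-decontam-in.tsv
printf 'featB\t9\t4\n'        >> toy-decontam-in.tsv
printf 'featD\t1\t1\n'        >> toy-decontam-in.tsv

# First file: record contaminant IDs (skipping its header).
# Second file: print the header and every row whose ID is not a contaminant.
awk -F'\t' '
NR == FNR { if (FNR > 1) bad[$1] = 1; next }
FNR == 1 || !($1 in bad)
' toy-contaminants.tsv toy-decontam-in.tsv > toy-decontam-out.tsv

cat toy-decontam-out.tsv
```

Only `featB` is listed as a contaminant, so the output keeps the header, `featA`, and `featD`.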

## Filter the rep-seqs to match the new feature table

In [None]:
%%bash
qiime feature-table filter-seqs \
    --i-data rep-seqs.qza \
    --i-table decontam-taxa-filtered-table.qza \
    --o-filtered-data filtered-rep-seqs.qza
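
Conceptually, this step keeps a representative sequence only if its feature ID is still present in the filtered table. The following minimal sketch shows that idea on a toy FASTA file; the sequences, IDs, and file names are invented for illustration and are not part of the actual workflow.

```shell
# Hypothetical representative sequences and the IDs remaining in the filtered table.
printf '>featA\nACGT\n>featB\nGGCC\n>featD\nTTAA\n' > toy-rep-seqs.fasta
printf 'featA\nfeatD\n' > toy-kept-ids.txt

# First file: load the IDs to keep.
# Second file: at each header decide whether the record is kept, then print
# the header and its sequence lines only while `show` is true.
awk '
NR == FNR { keep[$1] = 1; next }
/^>/      { id = substr($1, 2); show = (id in keep) }
show
' toy-kept-ids.txt toy-rep-seqs.fasta > toy-filtered-seqs.fasta

cat toy-filtered-seqs.fasta
```

The record for `featB` is dropped, so the sequences and the table describe the same set of features.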