![VWBPRGbanner.jpeg](VWBPRGbanner.jpeg)

# 16S amplicon NGS

# Data Import and Preliminary analysis

![download.png](attachment:download.png)

This workflow follows the QIIME2 documentation on [data import](https://docs.qiime2.org/2017.12/tutorials/importing/).
<br>
<br>
***16S amplicon NGS analysis***

This notebook continues on from the notebook on native installation of QIIME2 and the USEARCH pipeline.

**Assumptions**
- Using a macOS environment
- Installed QIIME2 following their [native installation guide](https://docs.qiime2.org/2017.12/install/native/)
- Worked through the USEARCH Pipeline as outlined <font color=green>[INSERT LINK HERE]</font>

**What you will need**
- **`biom`** file: this is generated by the [UNOISE algorithm](https://www.drive5.com/usearch/manual/unoise_pipeline.html) in the USEARCH pipeline. At present the UNOISE pipeline generates a v1 format file; however, it is worth checking on the USEARCH webpage that this is still the case before proceeding further. See here for more information on the [biom format](http://biom-format.org/documentation/format_versions/biom-1.0.html).
- **`sequences`** file: Unaligned sequence data is imported from a fasta formatted file containing DNA sequences that are not aligned (i.e., do not contain - or . characters). The sequences may contain degenerate nucleotide characters, such as `N`, but some QIIME2 actions may not support these characters. See the [scikit-bio fasta format description](http://scikit-bio.org/docs/latest/generated/skbio.io.format.fasta.html#fasta-format) for more information about the fasta format.
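
The two requirements on the `sequences` file (fasta rather than fastq, and no alignment characters) can be checked before importing. The following is a minimal sketch, not part of QIIME2 or USEARCH: the function name `check_unaligned_fasta` and its messages are hypothetical, and it assumes every header line starts with `>`.

```shell
# Hypothetical helper: report problems that would stop a file from being
# imported as unaligned sequence data.
check_unaligned_fasta() {
    fasta="$1"
    # A fasta file begins with a '>' header; fastq records begin with '@'.
    if ! head -n 1 "$fasta" | grep -q '^>'; then
        echo "not fasta"
        return 1
    fi
    # Sequence lines (those not starting with '>') must not contain '-' or '.'.
    if grep -v '^>' "$fasta" | grep -q '[.-]'; then
        echo "aligned"
        return 1
    fi
    echo "ok"
}
```

For example, `check_unaligned_fasta unoise_zotus_relabelled.fasta` prints `ok` when neither problem is found.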

----

## 1. BIOMV1.0.0 and Feature Table

### (a) Import BIOM file

    qiime tools import \
    --input-path unoise_otu_biom.biom \
    --type 'FeatureTable[Frequency]' \
    --source-format BIOMV100Format \
    --output-path feature-table-1.qza

 <div class="alert alert-block alert-info">The `input-path` will depend on where your `biom` file is located and what it is called. In this example the `biom` file is called `unoise_otu_biom.biom` and is located in the current directory `/7.unoise_all`.</div>

**Output artifacts:**

    feature-table-1.qza

<div class="alert alert-block alert-danger">There are currently issues with the biom file output by UNOISE3. If the above instructions did not work, you will need to do the following.</div>

This needs to be run in the QIIME environment, unless you have the biom package installed locally. Navigate to the final unoise directory `/7.unoise_all` (it should contain the file `unoise_otu_tab.txt`) and execute the following:

    biom convert -i unoise_otu_tab.txt -o table.from_txt_json.biom --table-type="OTU table" --to-json

Now, in the QIIME environment, navigate to the relevant directory and execute the following:

    qiime tools import \
    --input-path table.from_txt_json.biom \
    --type 'FeatureTable[Frequency]' \
    --source-format BIOMV100Format \
    --output-path feature-table-1.qza

### (b) Import per-feature unaligned sequence data (i.e., representative sequences)

    qiime tools import \
    --input-path unoise_zotus_relabelled.fasta \
    --output-path sequences.qza \
    --type 'FeatureData[Sequence]'

 <div class="alert alert-block alert-info">The `input-path` will depend on where your `sequences` file is located and what it is called. In this example the `sequences` file is called `unoise_zotus_relabelled.fasta` and is located in the current directory `/7.unoise_all`.</div>

**Output artifacts:**

    sequences.qza

<div class="alert alert-block alert-info">Note that the file given as the `input-path` is in the `fasta` format. The file extensions `.fasta`, `.fa` and `.fna` are synonymous and all denote this format,</div>

<div class="alert alert-block alert-danger"> but **NOT** the same as `.fq` or `.fastq`. </div>

***

## 2. Create Feature Table and Feature Data Summaries

Once the BIOM file and sequences have been imported, the feature table and feature data summaries can be generated.

**Requirements**
- Feature table, called `feature-table-1.qza`
- Sequences, called `sequences.qza`
- Metadata, called `metadata.tsv`. It is essential that the metadata is in the correct format; see below for more information.

### Create Feature Table Summary

    qiime feature-table summarize \
    --i-table feature-table-1.qza \
    --o-visualization table.qzv \
    --m-sample-metadata-file metadata.tsv

**Output visualization:**

    table.qzv
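
The per-sample frequencies reported in `table.qzv` can be cross-checked against the raw OTU table. The following is a minimal sketch under stated assumptions: the table (for example `unoise_otu_tab.txt`) is tab-separated, its first line is a header with the sample names in columns 2 onwards, and every other line is one feature with integer counts. The function name `per_sample_totals` is hypothetical and not part of QIIME2 or USEARCH.

```shell
# Hypothetical helper: sum the counts in each sample column of a
# tab-separated OTU table and print "sample<TAB>total" per sample.
per_sample_totals() {
    awk -F '\t' '
        # Header row: remember the sample name for each column.
        NR == 1 { for (i = 2; i <= NF; i++) name[i] = $i; n = NF; next }
        # Feature rows: add each count to its column total.
        { for (i = 2; i <= n; i++) total[i] += $i }
        END { for (i = 2; i <= n; i++) printf "%s\t%d\n", name[i], total[i] }
    ' "$1"
}
```

For example, `per_sample_totals unoise_otu_tab.txt` lists one total per sample, in the column order of the table.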

<div class="alert alert-block alert-info"> <font color=red>**Error messages while creating Feature Table?**</font>
<br>
If you are having trouble with the above command, the most likely cause is a problem with your metadata, or samples that do not match your metadata. To check whether this is the case, run the above command without the last line, which adds your metadata:</div>

    qiime feature-table summarize \
    --i-table feature-table-1.qza \
    --o-visualization table.qzv

**Output visualization:**

    table.qzv

Go to the second tab, `Interactive Sample Detail`, and check that the sample names match those in your metadata.
If you are still having issues, see the QIIME2 documentation on metadata, available [here](https://docs.qiime2.org/2017.12/tutorials/metadata/).
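
The comparison of sample names can also be done directly on the text files. The following is a minimal sketch under stated assumptions: the OTU table (for example `unoise_otu_tab.txt`) is tab-separated with the sample names in columns 2 onwards of its first line, and the metadata file is tab-separated with a header row and the sample IDs in its first column. The function name `compare_sample_ids` is hypothetical and not part of QIIME2 or USEARCH.

```shell
# Hypothetical helper: list sample IDs found in only one of the two files.
compare_sample_ids() {
    otu_table="$1"
    metadata="$2"
    table_ids=$(mktemp)
    meta_ids=$(mktemp)
    # Header fields 2..N of the OTU table are the sample IDs.
    awk -F '\t' 'NR == 1 { for (i = 2; i <= NF; i++) print $i; exit }' \
        "$otu_table" | sort > "$table_ids"
    # Column 1 of the metadata, skipping the header row, comment lines
    # (starting with '#') and blank lines.
    awk -F '\t' 'NR > 1 && $1 != "" && $1 !~ /^#/ { print $1 }' \
        "$metadata" | sort > "$meta_ids"
    # comm needs sorted input: -23 keeps lines unique to the first file,
    # -13 keeps lines unique to the second.
    comm -23 "$table_ids" "$meta_ids" | sed 's/^/only in table: /'
    comm -13 "$table_ids" "$meta_ids" | sed 's/^/only in metadata: /'
    rm -f "$table_ids" "$meta_ids"
}
```

For example, `compare_sample_ids unoise_otu_tab.txt metadata.tsv` prints nothing when the two sets of sample IDs are identical.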

### Create Feature Table Sequences

    qiime feature-table tabulate-seqs \
    --i-data sequences.qza \
    --o-visualization sequences.qzv

**Output visualization:**

    sequences.qzv

****