# Module: Exporting QIIME2 Data Files

QIIME2 artifacts (`.qza`) and visualizations (`.qzv`) are of limited use outside QIIME2 because few external tools can read them. To use the data with other utilities, we need to export it into more common formats (e.g. TSV, FASTA).

This notebook demonstrates how to convert QIIME2-native files into more familiar bioinformatics file formats. In most cases the `qiime tools export` command is all that is needed, but a few additional steps discussed below may be helpful when converting.

The following was used as reference for this notebook: [QIIME2 Exporting](https://docs.qiime2.org/2024.10/tutorials/exporting/).

Created by: _Microbial Oceanography Laboratory (MOLab)_

<div class="alert alert-block alert-warning">
<b>Warning from QIIME2:</b> 
    
When exporting data from a QIIME 2 artifact, there will no longer be provenance associated with the data. If you subsequently re-import the exported data, the provenance associated with the new artifact will begin with the import step and all existing provenance will be lost. It’s therefore best to only export data from artifacts when you are done with all processing steps that can be achieved with QIIME 2 to maximize the value of each artifact’s provenance.
</div>

---
## How to Use This Notebook

1. Activate your conda environment in a terminal window. Change the environment name to match your own installation.
>`conda activate qiime2-2023.2`
2. Start Jupyter Notebook with the command below and select this notebook.
>`jupyter notebook`
3. To run the cells in this notebook, press Shift+Enter.

---
## Tools Used
1. **QIIME 2 Amplicon Distribution**
    - Installation procedure can be found here: [QIIME2 native installation](https://docs.qiime2.org/2024.10/install/native/)

---
## Starting Files 

1. `.qza` files of the following types to export:
    * `FeatureData[Taxonomy]`
    * `FeatureTable[Frequency]`
    * `FeatureData[Sequence]`
    
    
2. Other `.qza` types that are exportable.
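
If you are unsure which type a given `.qza` file holds, the sketch below reads it directly from the file. It relies on the fact that an artifact is a zip archive with a single top-level directory (named after the artifact UUID) containing a `metadata.yaml` file. The function name `peek_artifact` is hypothetical, and the simple `key: value` parsing assumes the flat layout that file normally has. The `qiime tools peek` command reports the same fields.

```python
import zipfile


def peek_artifact(path):
    """Return the fields (uuid, type, format) found in an artifact's metadata.yaml."""
    with zipfile.ZipFile(path) as archive:
        # metadata.yaml sits directly inside the single top-level UUID directory.
        names = [
            name for name in archive.namelist()
            if name.count("/") == 1 and name.endswith("/metadata.yaml")
        ]
        if not names:
            raise ValueError(f"{path} does not look like a QIIME 2 artifact")
        text = archive.read(names[0]).decode("utf-8")

    info = {}
    for line in text.splitlines():
        # Assumption: each line is a flat "key: value" pair.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip().strip("'\"")
    return info


# Usage (file name as used later in this notebook):
# print(peek_artifact("feature-table.qza")["type"])
```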

---
## Expected Outputs

1. Feature table as BIOM and TSV files.
2. Taxonomy assignments TSV file.
3. FASTA file of feature sequences.
4. Other file formats resulting from `qiime tools export`.

---
## Table of Contents
 * [**Exporting Taxonomic Annotations**](#Exporting-Taxonomic-Annotations)
 * [**Exporting Feature Table**](#Exporting-Feature-Table)
 * [**Exporting Representative Sequences**](#Exporting-Representative-Sequences)
 * [**Exporting Other Formats**](#Exporting-Other-Formats)

---
# <font color = 'gray'>Exporting Taxonomic Annotations</font>

The code below converts a `.qza` file of type `FeatureData[Taxonomy]` into a three-column TSV file named `taxonomy.tsv`, written to the directory given by `--output-path`. The columns are:

* Feature ID - The OTU/ASV ID of each feature.
* Taxon - Inferred taxonomy for the feature.
* Confidence - Confidence score reported by the taxonomy assignment method.

In [None]:
!qiime tools export \
    --input-path feature-taxa.qza \
    --output-path exported
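
Once exported, the table can be read with the Python standard library. The sketch below is a minimal example, not part of the QIIME2 tooling: the function name `read_taxonomy` is hypothetical, the rows are made up for illustration, and it assumes rank labels in the `Taxon` column are separated by semicolons, which depends on the reference database used.

```python
import csv
import io


def read_taxonomy(handle):
    """Map each feature ID to the list of rank labels in its Taxon string."""
    reader = csv.DictReader(handle, delimiter="\t")
    taxonomy = {}
    for row in reader:
        # Assumption: ranks are separated by ';' (database dependent).
        ranks = [rank.strip() for rank in row["Taxon"].split(";") if rank.strip()]
        taxonomy[row["Feature ID"]] = ranks
    return taxonomy


# Hypothetical rows for illustration only. For real data use:
# with open("exported/taxonomy.tsv", newline="") as handle: ...
sample = io.StringIO(
    "Feature ID\tTaxon\tConfidence\n"
    "asv1\td__Bacteria; p__Proteobacteria\t0.98\n"
    "asv2\tUnassigned\t0.51\n"
)
taxa = read_taxonomy(sample)
print(taxa["asv1"])  # ['d__Bacteria', 'p__Proteobacteria']
```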

---
# <font color = 'gray'>Exporting Feature Table</font>

The code below converts a `.qza` file of type `FeatureTable[Frequency]` to a [BIOM table](https://biom-format.org/documentation/table_objects.html). The exported file is written to the directory given by the `--output-path` argument and is named `feature-table.biom`.

In [None]:
!qiime tools export \
    --input-path feature-table.qza \
    --output-path exported

Spreadsheet-ready files are usually easier to inspect and manipulate than BIOM tables, so we further convert the BIOM file to TSV.

In [None]:
!biom convert \
    -i exported/feature-table.biom \
    -o exported/feature-table.tsv \
    --to-tsv
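
The TSV written by `biom convert` normally starts with a comment line (`# Constructed from biom file`) followed by a header row beginning with `#OTU ID`, so a plain table reader may need to skip the first line. The sketch below shows one way to parse it with the standard library. The function name `read_biom_tsv` and the sample rows are hypothetical, and the header detection assumes the layout just described.

```python
import io


def read_biom_tsv(handle):
    """Parse a TSV from `biom convert --to-tsv` into (sample_ids, counts)."""
    sample_ids = None
    counts = {}
    for line in handle:
        line = line.rstrip("\r\n")
        if not line:
            continue
        fields = line.split("\t")
        if line.startswith("#"):
            # The header is the comment line that has tab-separated columns;
            # the leading "# Constructed from biom file" line has none.
            if len(fields) > 1:
                sample_ids = fields[1:]
            continue
        counts[fields[0]] = [float(value) for value in fields[1:]]
    return sample_ids, counts


# Hypothetical content for illustration only. For real data use:
# with open("exported/feature-table.tsv") as handle: ...
sample = io.StringIO(
    "# Constructed from biom file\n"
    "#OTU ID\tS1\tS2\n"
    "asv1\t12.0\t0.0\n"
    "asv2\t3.0\t7.0\n"
)
samples, table = read_biom_tsv(sample)
print(samples)        # ['S1', 'S2']
print(table["asv2"])  # [3.0, 7.0]
```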

Optionally, you can also add a column containing the taxonomy assigned to each feature. To do that, first replace the header row of `taxonomy.tsv` so that the columns are named `#OTUID`, `taxonomy`, and `confidence`, and save the result as `biom-taxonomy.tsv`.

In [None]:
!sed '1c#OTUID\ttaxonomy\tconfidence' exported/taxonomy.tsv > exported/biom-taxonomy.tsv
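
The one-line `1c` form with `\t` escapes relies on GNU `sed` behaviour and may not work with the BSD `sed` shipped on macOS. As a portable alternative, the same header replacement can be sketched in Python. The function name `rename_taxonomy_header` is hypothetical, and the sample rows are made up for illustration.

```python
def rename_taxonomy_header(text, new_header=("#OTUID", "taxonomy", "confidence")):
    """Replace the first line of a taxonomy TSV with the given column names."""
    lines = text.splitlines()
    if not lines:
        raise ValueError("taxonomy file is empty")
    lines[0] = "\t".join(new_header)
    return "\n".join(lines) + "\n"


# Hypothetical content for illustration only.
original = "Feature ID\tTaxon\tConfidence\nasv1\td__Bacteria\t0.98\n"
converted = rename_taxonomy_header(original)
print(converted.splitlines()[0].split("\t"))  # ['#OTUID', 'taxonomy', 'confidence']

# Usage with the paths from this notebook:
# from pathlib import Path
# source = Path("exported/taxonomy.tsv").read_text()
# Path("exported/biom-taxonomy.tsv").write_text(rename_taxonomy_header(source))
```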

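Then, add the taxonomic assignments to the feature table as observation metadata. This writes a new file, `feature-table-with-taxonomy.biom`, and leaves `feature-table.biom` unchanged.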

In [None]:
!biom add-metadata \
    -i exported/feature-table.biom \
    -o exported/feature-table-with-taxonomy.biom \
    --observation-metadata-fp exported/biom-taxonomy.tsv \
    --sc-separated taxonomy

Finally, convert the feature table with taxonomic identities (`feature-table-with-taxonomy.biom`) to TSV.

In [None]:
!biom convert \
    -i exported/feature-table-with-taxonomy.biom \
    -o exported/feature-table-with-taxonomy.tsv \
    --to-tsv \
    --header-key taxonomy

---
# <font color = 'gray'>Exporting Representative Sequences</font>

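The code below converts a `.qza` file of type `FeatureData[Sequence]` to a FASTA file. The exported file is written to the directory given by `--output-path` and is named `dna-sequences.fasta`.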

In [None]:
!qiime tools export \
    --input-path rep-seqs.qza \
    --output-path exported
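
A quick way to sanity-check the export is to count the sequences and look at their lengths. The sketch below uses only the standard library. The function name `read_fasta` and the sample records are hypothetical, and the path in the usage comment assumes the export wrote `dna-sequences.fasta` into `exported`.

```python
import io


def read_fasta(handle):
    """Return {sequence_id: sequence} from a FASTA file handle."""
    parts = {}
    current_id = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Keep only the ID, dropping any description after the first space.
            header = line[1:].split()
            current_id = header[0] if header else ""
            parts[current_id] = []
        elif current_id is not None:
            # Sequence lines may be wrapped, so collect and join them later.
            parts[current_id].append(line)
    return {seq_id: "".join(chunks) for seq_id, chunks in parts.items()}


# Hypothetical records for illustration only. For real data use:
# with open("exported/dna-sequences.fasta") as handle: ...
sample = io.StringIO(">asv1\nACGT\nAC\n>asv2\nGGCC\n")
seqs = read_fasta(sample)
lengths = [len(seq) for seq in seqs.values()]
print(len(seqs), min(lengths), max(lengths))  # 2 4 6
```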

---
# <font color = 'gray'>Exporting Other Formats</font>

See the QIIME2 exporting tutorial for more `.qza` types that can be exported for use outside QIIME2: [QIIME2 exporting](https://docs.qiime2.org/2024.10/tutorials/exporting/)
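
When several artifacts need exporting, the same `qiime tools export` call can be scripted. The sketch below is one possible approach, not an official QIIME2 interface: the function names are hypothetical, and writing each artifact to its own subdirectory (named after the file) is a design choice made here so that exports with identical file names do not overwrite each other.

```python
import shutil
import subprocess
from pathlib import Path


def build_export_command(artifact, output_dir):
    """Build the `qiime tools export` argument list for one artifact."""
    return [
        "qiime", "tools", "export",
        "--input-path", str(artifact),
        "--output-path", str(output_dir),
    ]


def export_artifacts(artifacts, output_root="exported"):
    """Export each artifact into <output_root>/<artifact file stem>/."""
    if shutil.which("qiime") is None:
        raise RuntimeError("qiime not found on PATH; activate the QIIME 2 environment first")
    for artifact in artifacts:
        output_dir = Path(output_root) / Path(artifact).stem
        # check=True raises CalledProcessError if the export fails.
        subprocess.run(build_export_command(artifact, output_dir), check=True)


# Usage (file names as used in this notebook):
# export_artifacts(["feature-taxa.qza", "feature-table.qza", "rep-seqs.qza"])
```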