### Data Folder Overview

The `data` folder contains the following `.qza` files:

- `rarefied_table.qza`: Rarefied feature table generated by the `core-metrics-phylogenetic` pipeline at a sampling depth of 3815.
- `rooted-tree_24h_A.qza`: Rooted phylogenetic tree produced by the `align-to-tree-mafft-fasttree` pipeline.
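- `taxonomy.qza`: Taxonomic assignments for the features. Note that `align-to-tree-mafft-fasttree` produces only alignments and trees, so this file most likely comes from a separate taxonomic classification step (for example, a `feature-classifier` method) in the earlier analysis.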

These files were obtained from the earlier analysis titled  
**`Step3_Microbiome_Raw_data_preprocessing_Qiime_24h_artichoke`**.

In this step, we will extract the `.biom` files by unzipping the `.qza` files,  
and convert them into `.tsv` format for further analysis.
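
A `.qza` artifact is a ZIP archive whose payload sits under `<uuid>/data/`. As a minimal sketch, the helper below lists the `.biom` members of an archive without extracting it. The name `list_biom_members` is hypothetical and not part of the original notebook. Feature-table artifacts are expected to contain a `feature-table.biom`, whereas tree and taxonomy artifacts normally hold other file types (such as `tree.nwk` or `taxonomy.tsv`) and would return an empty list.

```python
import zipfile


def list_biom_members(qza_path):
    """Return the archive-internal paths of all .biom files in a .qza file."""
    # A .qza file is a plain ZIP archive, so its contents can be listed directly.
    with zipfile.ZipFile(qza_path) as archive:
        return [name for name in archive.namelist() if name.endswith(".biom")]
```

For example, `list_biom_members("data/rarefied_table.qza")` should show where the table sits inside the archive before anything is unpacked.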

### File Generation for Analysis
- **QZA and BIOM File Generation**  
  From the feature table, we generated the `.qza` and `.biom` files needed for downstream analyses.  
  These files serve as the basis for the taxonomic and diversity analyses.

In [13]:
import os
import zipfile

In [14]:
# Before extracting the .qza data files for later analysis, move them to the export_qza_dataset directory.

In [15]:
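def extract_qza_files(folder_path):
    """Unzip every .qza file under folder_path into a sibling '<name>_extracted' directory."""
    # Collect all .qza artifacts, including those in subdirectories.
    qza_files = []
    for root, _, files in os.walk(folder_path):
        for file in files:
            if file.endswith(".qza"):
                qza_files.append(os.path.join(root, file))

    for qza_file in qza_files:
        # e.g. data/rarefied_table.qza -> data/rarefied_table_extracted/
        output_dir = f"{os.path.splitext(qza_file)[0]}_extracted"
        os.makedirs(output_dir, exist_ok=True)

        try:
            # A .qza artifact is a ZIP archive, so zipfile can unpack it directly.
            with zipfile.ZipFile(qza_file, 'r') as zip_ref:
                zip_ref.extractall(output_dir)
        except zipfile.BadZipFile:
            print(f"Error extracting {qza_file}: not a valid ZIP archive")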


In [19]:
# extract_qza_files('data')

In [10]:
# Extracting the .qza files yields .biom files. To convert them to .tsv format, first gather them in the export_biom_dataset directory.
# Give each .biom file a unique name when doing so, because files with identical names would overwrite one another.
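
The gathering and renaming step described above can be sketched as follows. This is an illustration under stated assumptions, not the notebook's own method: `collect_biom_files` is a hypothetical helper, it expects the `<name>_extracted` folders created by `extract_qza_files`, and it assumes that artifact file names are unique across the input folder. Each copy is renamed after its source artifact, so several tables that are all stored as `feature-table.biom` do not overwrite one another.

```python
import shutil
from pathlib import Path


def collect_biom_files(extracted_root, dest_dir):
    """Copy every .biom file found under '*_extracted' folders into dest_dir.

    Each copy is named after its artifact, e.g.
    rarefied_table_extracted/<uuid>/data/feature-table.biom -> rarefied_table.biom
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for artifact_dir in sorted(Path(extracted_root).rglob("*_extracted")):
        if not artifact_dir.is_dir():
            continue
        # Strip the '_extracted' suffix to recover the artifact name.
        artifact_name = artifact_dir.name[: -len("_extracted")]
        for index, biom_path in enumerate(sorted(artifact_dir.rglob("*.biom"))):
            # Add a numeric suffix only if one artifact holds several .biom files.
            suffix = "" if index == 0 else f"_{index}"
            target = dest / f"{artifact_name}{suffix}.biom"
            shutil.copy2(biom_path, target)
            copied.append(target)
    return copied
```

A call such as `collect_biom_files("data", "export_biom_dataset")` would then leave one uniquely named `.biom` file per feature-table artifact in the destination directory.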

In [18]:
!biom convert -i extracted/rarefied_table.biom -o extracted/rarefied_table.tsv --to-tsv
!biom convert -i extracted/filtered_yoo_24h_table_A.biom -o extracted/filtered_yoo_24h_table_A.tsv --to-tsv
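
Once the tables are in `.tsv` format, they can be loaded with the standard library alone. The sketch below assumes the layout that `biom convert --to-tsv` typically writes: a leading `# Constructed from biom file` comment, a header row that starts with `#OTU ID` followed by the sample IDs, and one row of numeric counts per feature. `read_biom_tsv` is a hypothetical helper name. If the file carries extra non-numeric columns (for example a taxonomy column), the parsing would need to be adapted.

```python
import csv


def read_biom_tsv(tsv_path):
    """Parse a feature-table TSV into (sample_ids, {feature_id: [counts]})."""
    sample_ids = []
    counts = {}
    with open(tsv_path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if not row:
                continue
            if row[0].startswith("#"):
                # A '#' line with several columns is treated as the header row;
                # single-column '#' lines are plain comments and are skipped.
                if len(row) > 1:
                    sample_ids = row[1:]
                continue
            # First column is the feature ID; the rest are per-sample counts.
            counts[row[0]] = [float(value) for value in row[1:]]
    return sample_ids, counts
```

For instance, `samples, counts = read_biom_tsv("extracted/rarefied_table.tsv")` would give the sample IDs and a per-feature list of counts in the same sample order.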