## Converting Kraken results to a BIOM table
This notebook converts multiple Kraken outputs to BIOM tables and merges them into a single BIOM file (`json` format).
Before starting the analysis, extract the report file `GMS-2405-ITS.zip` into the `data` folder (see [README](../README.md)).

### Step 1. Convert kraken output to individual biom tables
In this step, we generate one BIOM table per sample (in both `hdf5` and `json` formats) in the `../output` folder using [kraken-biom](https://github.com/smdabdoub/kraken-biom).

In [None]:
%%bash
mkdir -p ../output  # make sure the output folder exists
for s in '09' '10' '11'
do
    input_file="../data/GMS-2405-ITS/Analysis/barcode$s/read_classifications.kraken"
    # stage a copy under a per-sample name so each sample gets a proper column name in the biom table
    staged_file="sample_$s.kraken"
    cp "$input_file" "$staged_file"
    kraken-biom "$staged_file" -o "../output/sample_$s.hdf5.biom"
    kraken-biom "$staged_file" -o "../output/sample_$s.json.biom" --fmt json
    rm "$staged_file"
done
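
The staging copy matters because every input file is called `read_classifications.kraken`, and the kraken-biom README describes sample IDs as being taken from the input file name (everything up to the first `.`). The sketch below only illustrates that naming rule; `sample_id_from_filename` is a hypothetical helper written for this example and is not part of kraken-biom.

```python
from pathlib import Path


def sample_id_from_filename(path):
    """Hypothetical helper: mimic the naming rule described in the
    kraken-biom README (sample ID = file name up to the first '.')."""
    return Path(path).name.split(".")[0]


barcodes = ["09", "10", "11"]

# Without staging, all three inputs share one file name, hence one ID.
original_ids = [
    sample_id_from_filename(
        f"../data/GMS-2405-ITS/Analysis/barcode{s}/read_classifications.kraken"
    )
    for s in barcodes
]

# With staging, each copy has its own file name, hence a distinct ID.
staged_ids = [sample_id_from_filename(f"sample_{s}.kraken") for s in barcodes]

print(original_ids)  # ['read_classifications', 'read_classifications', 'read_classifications']
print(staged_ids)    # ['sample_09', 'sample_10', 'sample_11']
```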

### Step 2. Using biom library to merge all data together
Next, we follow the instructions from [biom-format](https://biom-format.org/index.html) to merge all samples into a single table.

In [None]:
# load python libraries
from pathlib import Path
from biom import Table, load_table

In [None]:
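# read the per-sample biom tables into Python objects
# (match only sample_*.json.biom so a previously written all_sample.json.biom is not re-read)
biom_tables = []
for s in sorted(Path("../output/").glob("sample_*.json.biom")):
    table = load_table(str(s))
    biom_tables.append(table)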

In [None]:
# merge all tables together, one at a time, so any number of samples works
merged_table = biom_tables[0]
for table in biom_tables[1:]:
    merged_table = merged_table.merge(table)

# the merged table's "type" attribute comes back as None, so assign it manually
merged_table.type = "OTU table"
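
Conceptually, merging tables with distinct samples takes the union of the observed taxa and fills the missing combinations with zeros. The sketch below illustrates that idea with plain dictionaries. It is not how biom implements `Table.merge`; the helper names, taxon IDs, and counts are all invented for illustration.

```python
from functools import reduce


def merge_counts(left, right):
    """Merge two {sample_id: {taxon_id: count}} mappings with disjoint samples."""
    overlap = set(left) & set(right)
    if overlap:
        raise ValueError(f"duplicate sample IDs: {sorted(overlap)}")
    return {**left, **right}


def to_dense(counts):
    """Return (taxa, samples, matrix); taxa missing from a sample count as 0."""
    samples = sorted(counts)
    taxa = sorted({t for per_sample in counts.values() for t in per_sample})
    matrix = [[counts[s].get(t, 0) for s in samples] for t in taxa]
    return taxa, samples, matrix


# Toy counts, invented purely for illustration.
tables = [
    {"sample_09": {"taxon_A": 5, "taxon_B": 2}},
    {"sample_10": {"taxon_B": 7}},
    {"sample_11": {"taxon_A": 1, "taxon_C": 4}},
]

# Fold the list pairwise, the same shape as the merge loop in the cell above.
merged = reduce(merge_counts, tables)
taxa, samples, matrix = to_dense(merged)

print(samples)  # ['sample_09', 'sample_10', 'sample_11']
for taxon, row in zip(taxa, matrix):
    print(taxon, row)
```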

In [None]:
# write output to file
with open("../output/all_sample.json.biom", "w") as f:
    f.write(merged_table.to_json("biom-format"))
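
Before opening the merged file in another tool, it may be worth checking that the JSON carries the top-level fields listed as required in the BIOM 1.0 format description. This is a minimal sketch under that assumption: `missing_biom_keys` is a hypothetical helper, and the document checked here is a hand-written toy fragment, not real output from this notebook.

```python
import json

# Top-level fields listed as required in the BIOM 1.0 JSON format description.
REQUIRED_KEYS = {
    "id", "format", "format_url", "type", "generated_by", "date",
    "rows", "columns", "matrix_type", "matrix_element_type", "shape", "data",
}


def missing_biom_keys(document):
    """Hypothetical helper: return the required keys absent from a parsed BIOM JSON dict."""
    return sorted(REQUIRED_KEYS - set(document))


# Toy fragment for illustration only; a real file would be parsed with
# json.load(open("../output/all_sample.json.biom")) instead.
toy = json.loads(
    '{"id": null, "format": "Biological Observation Matrix 1.0.0", "type": "OTU table"}'
)

print(missing_biom_keys(toy))
```

An empty list would mean every required field is present; it says nothing about whether the values themselves are valid.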

### Step 3. Analysis and visualization
Now that the merged data are in the standard BIOM format, we can explore them using [MEGAN6](https://uni-tuebingen.de/fakultaeten/mathematisch-naturwissenschaftliche-fakultaet/fachbereiche/informatik/lehrstuehle/algorithms-in-bioinformatics/software/megan6/).