## Generating input for KEGGDecoder

By default, the KEGGDecoder tool takes files output by KEGG GhostKOALA or kofamscan. This page walks you through generating these files with kofamscan. If you would like to generate them another way, each line of a KEGGDecoder input file holds a gene identifier, followed by a tab and a KO number if one was assigned to that gene:

```
NORP96_1
NORP96_2
NORP96_3
NORP96_4
NORP96_5
NORP96_6	K04764
NORP96_7	K01890
NORP96_8	K01889
NORP96_9	K02887
NORP96_10	K02916
NORP96_11	K02520
NORP96_12	K01868
NORP96_13
NORP96_14
NORP96_15
NORP96_16
NORP96_17
NORP96_18	K07334
```
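If you build these files yourself, a quick format check can catch problems early. The sketch below assumes a strict reading of the example above: every line is either a gene ID alone, or a gene ID, a tab, and a `K` followed by five digits. The file name `example_ko.txt` is made up for illustration; point the `awk` command at your own file instead.

```shell
# Build a tiny file in the expected format (hypothetical file name).
printf 'NORP96_5\nNORP96_6\tK04764\nNORP96_7\tK01890\n' > example_ko.txt

# Count lines that are neither "gene_id" nor "gene_id<TAB>Kxxxxx".
bad_lines=$(awk -F'\t' '
  NF > 2 || $1 == "" || (NF == 2 && $2 !~ /^K[0-9][0-9][0-9][0-9][0-9]$/) { n++ }
  END { print n + 0 }
' example_ko.txt)

echo "malformed lines: $bad_lines"
```

A count of zero means every line matches this assumed layout. Anything else is worth inspecting by hand before passing the file on.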

We did not binder-ize kofamscan because it requires more compute than a Binder cloud computer provides. However, you can copy and paste the code below and try it on your local computer.

The documentation states that kofamscan needs to be run on Linux, but we have successfully run it on Unix using conda.

Download the databases and executables using:
```
# download the ko list
wget ftp://ftp.genome.jp/pub/db/kofam/ko_list.gz
# download the hmm profiles
wget ftp://ftp.genome.jp/pub/db/kofam/profiles.tar.gz
# download kofamscan tool
wget ftp://ftp.genome.jp/pub/tools/kofamscan/kofamscan.tar.gz
# download README
wget ftp://ftp.genome.jp/pub/tools/kofamscan/README.md
```

And then unzip and untar the relevant files:

```
gunzip ko_list.gz
tar xf profiles.tar.gz
tar xf kofamscan.tar.gz
```


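Next, make a conda environment using miniconda. The first command assumes your conda configuration already includes the channels that provide `hmmer` and `parallel` (typically bioconda and conda-forge):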

```
conda create -n kofamscan hmmer parallel
conda activate kofamscan
conda install -c conda-forge ruby
```
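Before moving on, it can help to confirm that the three tools installed above are visible on your `$PATH`. This is a minimal sketch using the shell built-in `command -v`; it only checks that the executables can be found, not that their versions suit kofamscan. Run it inside the activated environment.

```shell
# Report whether each tool kofamscan relies on can be found on $PATH.
report=$(
  for tool in hmmsearch parallel ruby; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool: found"
    else
      echo "$tool: MISSING"
    fi
  done
)
echo "$report"
```

If a tool is reported as missing, either install it into the environment or set its path explicitly in the config file.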

Then copy the template config file to `config.yml` and add the relative paths of your newly downloaded kofamscan databases. The config file should look like this:

`config.yml`:

```
# Path to your KO-HMM database
# A database can be a .hmm file, a .hal file or a directory in which
# .hmm files are. Omit the extension if it is .hal or .hmm file
profile: ./profiles

# Path to the KO list file
ko_list: ko_list

# Path to an executable file of hmmsearch
# You do not have to set this if it is in your $PATH
# hmmsearch: /usr/local/bin/hmmsearch

# Path to an executable file of GNU parallel
# You do not have to set this if it is in your $PATH
# parallel: /usr/local/bin/parallel

# Number of hmmsearch processes to be run parallelly
cpu: 8
```

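Lastly, run kofamscan on your protein FASTA file. KEGGDecoder needs the two-column output shown at the top of this page; if your kofamscan version writes a more detailed table by default, request the two-column format by adding `-f mapper` to the command: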

```
./exec_annotation -o sb1_out sb1_proteins.faa
```
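Once the run finishes, a quick summary shows how many genes received a KO. The sketch below works on a small made-up file (`sb1_out_example`) in the two-column format; swap in your real output file. It assumes gene IDs look like `<genome>_<gene number>` and groups on everything before the last underscore, which is a convention of this sketch rather than something kofamscan enforces.

```shell
# Tiny stand-in for a kofamscan output file (hypothetical content).
printf 'NORP96_5\nNORP96_6\tK04764\nNORP96_7\tK01890\nSB1_1\tK02887\nSB1_2\n' > sb1_out_example

# Per genome: number of genes with a KO, then total number of genes.
awk -F'\t' '
  {
    genome = $1
    sub(/_[^_]*$/, "", genome)          # drop the trailing _<gene number>
    total[genome]++
    if (NF >= 2 && $2 != "") hit[genome]++
  }
  END {
    for (g in total) printf "%s\t%d\t%d\n", g, hit[g] + 0, total[g]
  }
' sb1_out_example | sort > ko_summary.tsv

cat ko_summary.tsv
```

Each row of `ko_summary.tsv` lists a genome prefix, its annotated gene count, and its total gene count, separated by tabs. A genome with no annotated genes at all usually points to a problem with the config paths or the input FASTA.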