## Guide trees
The trees generated below serve two purposes: they are the guide trees in the
progressive alignment procedure, and they are the trees used in the clustering
step of the regressive alignment procedure.

The command lines can be found below and in the `templates/trees` directory of the repository.

The generated trees can be found in the `data/trees` directory of this repository.
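
As a minimal sketch, assuming the files in `data/trees` keep the `<id>.<METHOD>.dnd` naming produced by the tree commands in this guide, the tree files can be grouped by tree method as follows. The helper names `split_tree_name` and `group_by_method` are hypothetical and not part of the repository, and the example paths are illustrative only.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def split_tree_name(path):
    """Split '<dir>/<id>.<METHOD>.dnd' into (id, METHOD).

    Assumes the method name contains no '.', which holds for the four
    methods used here (CLUSTALO, MAFFT_PARTTREE, MAFFT, MAFFT-FFTNS1).
    """
    name = PurePosixPath(path).name
    if not name.endswith(".dnd"):
        raise ValueError(f"not a .dnd tree file: {path}")
    stem = name[: -len(".dnd")]
    # Split on the LAST dot so that dots inside the family id are tolerated.
    family, sep, method = stem.rpartition(".")
    if not sep or not family or not method:
        raise ValueError(f"expected <id>.<METHOD>.dnd, got: {name}")
    return family, method


def group_by_method(paths):
    """Return {METHOD: sorted list of family ids} for the given tree paths."""
    groups = defaultdict(list)
    for path in paths:
        family, method = split_tree_name(path)
        groups[method].append(family)
    return {method: sorted(families) for method, families in groups.items()}


# Illustrative paths only; in practice they could come from
# glob.glob("data/trees/*.dnd").
example_paths = [
    "data/trees/ins.CLUSTALO.dnd",
    "data/trees/ins.MAFFT_PARTTREE.dnd",
    "data/trees/kunitz.MAFFT-FFTNS1.dnd",
    "data/trees/Stap_Strp_toxin.MAFFT-FFTNS1.dnd",
]
grouped = group_by_method(example_paths)
```

Grouping by method makes it easy to spot a family that is missing a tree for one of the four methods.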


### Command lines used in the workflow to generate trees

#### Clustal Omega Trees (mBed)
```
clustalo -i ${seqs} --guidetree-out ${id}.CLUSTALO.dnd
```


#### MAFFT PartTree trees
```
t_coffee -other_pg seq_reformat                 \
          -in ${seqs} -action +seq2dnd parttree \
          -output newick                        \
          >> ${id}.MAFFT_PARTTREE.dnd
```         


#### MAFFT trees
```
t_coffee -other_pg seq_reformat \
         -in ${seqs} -action +seq2dnd mafftdnd \
         -output newick \
         >> ${id}.MAFFT.dnd 
```


#### MAFFT-FFTNS1 trees
```
mafft-fftns --retree 1 --anysymbol --treeout ${seqs}
t_coffee -other_pg seq_reformat \
         -in ${seqs}.tree -in2 ${seqs} \
         -input newick \
         -action +mafftnewick2newick \
         > ${id}.MAFFT-FFTNS1.dnd
```
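
Each of the commands above writes a Newick tree for one input FASTA file. One quick sanity check is that the tree has exactly one leaf per input sequence. The sketch below is an illustration only, not part of the workflow. It assumes that leaf labels are the sequence identifiers from the FASTA headers (the first whitespace-delimited token after `>`) and that labels are unquoted and contain no parentheses, commas, colons or semicolons. The function names are hypothetical.

```python
import re


def newick_leaves(newick):
    """Return leaf labels from a Newick string with unquoted labels.

    A leaf label directly follows '(' or ','. Labels that follow ')'
    belong to internal nodes, and values after ':' are branch lengths,
    so neither is matched.
    """
    return re.findall(r"[(,]\s*([^(),:;\s]+)", newick)


def fasta_ids(fasta_text):
    """Return the first token of every FASTA header line."""
    ids = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                ids.append(tokens[0])
    return ids


def check_tree(newick, fasta_text):
    """Compare tree leaves with FASTA ids and report any mismatch."""
    leaves = newick_leaves(newick)
    ids = fasta_ids(fasta_text)
    return {
        "missing_from_tree": sorted(set(ids) - set(leaves)),
        "extra_in_tree": sorted(set(leaves) - set(ids)),
        "duplicated_leaves": sorted({x for x in leaves if leaves.count(x) > 1}),
    }


# Toy data for illustration; real input would be read from ${seqs}
# and the matching .dnd file.
toy_tree = "((seqA:0.1,seqB:0.2):0.05,\nseqC:0.3);"
toy_fasta = ">seqA some description\nMKV\n>seqB\nMKL\n>seqC\nMRV\n"
report = check_tree(toy_tree, toy_fasta)
```

All three lists in `report` are empty when the tree and the FASTA file agree.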

The following commands were used to generate the trees.


In [1]:
```
# Change to the base directory of the repository (one level up from the current directory)
import os

pwd = os.getcwd()
work_dir = os.path.join(pwd, "..")
os.chdir(work_dir)
os.getcwd()
```

'/nfs/users2/cn/efloden/projects/dpa-analysis'

In [5]:
```
!nextflow run main.nf \
             --align_method='CLUSTALO' \
             --tree_method='CLUSTALO,MAFFT_PARTTREE,MAFFT,MAFFT-FFTNS1' \
             --refs='data/refs/*.ref' \
             --seqs='data/combined_seqs/*.fa' \
             --regressive_align=true \
             --standard_align=false \
             --default_align=false \
             --output results \
             -profile crg \
             -with-singularity \
             -resume
```

N E X T F L O W  ~  version 0.32.0
Launching `main.nf` [fabulous_minsky] - revision: 77411192a0
R E G R E S S I V E   M S A   A n a l y s i s  ~  version 0.1"
Input sequences (FASTA)                        : data/combined_seqs/*.fa
Input references (Aligned FASTA)               : data/refs/*.ref
Input trees (NEWICK)                           : false
Output directory (DIRECTORY)                   : results
Alignment methods                              : CLUSTALO
Tree methods                                   : CLUSTALO,MAFFT_PARTTREE,MAFFT,MAFFT-FFTNS1
Generate default alignments                    : false
Generate standard alignments                   : false
Generate regressive alignments (DPA)           : true
Bucket Sizes for regressive alignments         : 1000
Perform evaluation? Requires reference         : true
Output directory (DIRECTORY)                   : results

[warm up] executor > crg
WARN: Singularity cache directory has not been defined -- Remote image will be stored 

[05/97fbb1] Submitted process > guide_trees (int.CLUSTALO)
[13/af14dd] Submitted process > guide_trees (ghf5.MAFFT_PARTTREE)
[0d/4cc8ba] Submitted process > guide_trees (cah.MAFFT-FFTNS1)
[db/afd39e] Submitted process > guide_trees (Stap_Strp_toxin.MAFFT-FFTNS1)
[7e/a2a0cd] Submitted process > guide_trees (ins.MAFFT_PARTTREE)
[85/2f574a] Submitted process > guide_trees (ghf5.MAFFT-FFTNS1)
[20/a3180c] Submitted process > guide_trees (Stap_Strp_toxin.MAFFT_PARTTREE)
[6d/0879a0] Submitted process > guide_trees (ins.CLUSTALO)
[0e/ab8e9f] Submitted process > guide_trees (ins.MAFFT)
[d3/f8ddff] Submitted process > guide_trees (msb.MAFFT)
[9a/5d4c5f] Submitted process > guide_trees (kunitz.MAFFT)
[91/9be5c7] Submitted process > guide_trees (msb.MAFFT_PARTTREE)
[09/96b3b2] Submitted process > guide_trees (kunitz.MAFFT_PARTTREE)
[ad/bb97cd] Submitted process > guide_trees (egf.MAFFT)
[d5/1657a0] Submitted process > guide_trees (kunitz.CLUSTALO)
[4f/298ced] Submitted process > guide_trees (msb.M

[07/6c2a75] Submitted process > guide_trees (OTCace.MAFFT-FFTNS1)
[c8/15fa61] Submitted process > guide_trees (scorptoxin.MAFFT-FFTNS1)
[7f/dc6ebe] Submitted process > guide_trees (oxidored_q6.MAFFT_PARTTREE)
[3c/dad13c] Submitted process > guide_trees (slectin.CLUSTALO)
[96/0544a3] Submitted process > guide_trees (scorptoxin.MAFFT)
[f0/5cc730] Submitted process > guide_trees (p450.CLUSTALO)
[83/9d171d] Submitted process > guide_trees (OTCace.MAFFT_PARTTREE)
[61/6d98a5] Submitted process > guide_trees (phc.MAFFT-FFTNS1)
[02/51cde0] Submitted process > guide_trees (oxidored_q6.CLUSTALO)
[51/bf2c6c] Submitted process > guide_trees (seatoxin.MAFFT_PARTTREE)
[ee/e89416] Submitted process > guide_trees (seatoxin.MAFFT-FFTNS1)
[f8/e2f099] Submitted process > guide_trees (seatoxin.CLUSTALO)
[cd/9a500f] Submitted process > guide_trees (Ald_Xan_dh_2.CLUSTALO)
[10/730d88] Submitted process > guide_trees (Ald_Xan_dh_2.MAFFT)
[83/1fdc24] Submitted process > guide_trees (oxidored_q6.MAFFT)
[f6/e2bd

[31/b715c7] Submitted process > guide_trees (sti.MAFFT)
[79/3fc7d5] Submitted process > guide_trees (hom.MAFFT-FFTNS1)
[ff/428d18] Submitted process > guide_trees (sti.MAFFT-FFTNS1)
[b7/ce65cc] Submitted process > guide_trees (hom.MAFFT)
[cc/145288] Submitted process > guide_trees (rnasemam.MAFFT-FFTNS1)
[b7/6d5577] Submitted process > guide_trees (aadh.MAFFT-FFTNS1)
[6c/d36809] Submitted process > guide_trees (aadh.MAFFT_PARTTREE)
[7b/456dc5] Submitted process > guide_trees (sti.MAFFT_PARTTREE)
[00/89cfaa] Submitted process > regressive_alignment (serpin.CLUSTALO.DPA.1000.MAFFT_PARTTREE)
[e4/a396d6] Submitted process > regressive_alignment (hr.CLUSTALO.DPA.1000.MAFFT_PARTTREE)
[d6/f73c44] Submitted process > regressive_alignment (hpr.CLUSTALO.DPA.1000.MAFFT)
[e6/7bea67] Submitted process > regressive_alignment (hpr.CLUSTALO.DPA.1000.MAFFT_PARTTREE)
[f9/d909bd] Submitted process > regressive_alignment (hr.CLUSTALO.DPA.1000.MAFFT-FFTNS1)
[1c/11fef5] Submitted process > regressive_alignm

[ef/066531] Submitted process > regressive_alignment (ins.CLUSTALO.DPA.1000.MAFFT_PARTTREE)
[fc/4ae484] Submitted process > regressive_alignment (ghf5.CLUSTALO.DPA.1000.MAFFT-FFTNS1)
[36/d9f2c4] Submitted process > regressive_alignment (Stap_Strp_toxin.CLUSTALO.DPA.1000.MAFFT_PARTTREE)
[c8/68d0ea] Submitted process > regressive_alignment (ins.CLUSTALO.DPA.1000.CLUSTALO)
[84/d35340] Submitted process > regressive_alignment (ins.CLUSTALO.DPA.1000.MAFFT)
[53/df4fc0] Submitted process > regressive_alignment (kunitz.CLUSTALO.DPA.1000.MAFFT)
[74/3d6746] Submitted process > regressive_alignment (msb.CLUSTALO.DPA.1000.MAFFT_PARTTREE)
[ba/d6f6c7] Submitted process > regressive_alignment (kunitz.CLUSTALO.DPA.1000.MAFFT_PARTTREE)
[58/efc743] Submitted process > regressive_alignment (egf.CLUSTALO.DPA.1000.MAFFT)
[99/7eb069] Submitted process > regressive_alignment (kunitz.CLUSTALO.DPA.1000.CLUSTALO)
[40/76fd9d] Submitted process > regressive_alignment (msb.CLUSTALO.DPA.1000.MAFFT-FFTNS1)
[12/7c087

[39/73d3ee] Submitted process > regressive_alignment (biotin_lipoyl.CLUSTALO.DPA.1000.MAFFT-FFTNS1)
[28/812c60] Submitted process > regressive_alignment (HLH.CLUSTALO.DPA.1000.CLUSTALO)
[3d/175230] Submitted process > regressive_alignment (trfl.CLUSTALO.DPA.1000.CLUSTALO)
[3f/4adffc] Submitted process > regressive_alignment (HLH.CLUSTALO.DPA.1000.MAFFT)
[c7/d6cb0f] Submitted process > regressive_alignment (ltn.CLUSTALO.DPA.1000.CLUSTALO)
[47/3a256c] Submitted process > regressive_alignment (LIM.CLUSTALO.DPA.1000.CLUSTALO)
[23/47de78] Submitted process > regressive_alignment (gluts.CLUSTALO.DPA.1000.MAFFT-FFTNS1)
[9f/52992f] Submitted process > regressive_alignment (gpdh.CLUSTALO.DPA.1000.MAFFT-FFTNS1)
[64/877181] Submitted process > regressive_alignment (uce.CLUSTALO.DPA.1000.MAFFT)
[2e/770c61] Submitted process > regressive_alignment (proteasome.CLUSTALO.DPA.1000.MAFFT-FFTNS1)
[03/92635b] Submitted process > regressive_alignment (uce.CLUSTALO.DPA.1000.MAFFT-FFTNS1)
[89/86131a] Submitt

[99/749635] Submitted process > regressive_alignment (adh.CLUSTALO.DPA.1000.MAFFT-FFTNS1)
[64/b6e4cd] Submitted process > regressive_alignment (biotin_lipoyl.CLUSTALO.DPA.1000.MAFFT)
[1f/075f84] Submitted process > regressive_alignment (flav.CLUSTALO.DPA.1000.MAFFT_PARTTREE)
[37/209c41] Submitted process > regressive_alignment (annexin.CLUSTALO.DPA.1000.MAFFT_PARTTREE)
[74/6444f6] Submitted process > regressive_alignment (tim.CLUSTALO.DPA.1000.CLUSTALO)
[fd/620df3] Submitted process > regressive_alignment (KAS.CLUSTALO.DPA.1000.MAFFT-FFTNS1)
[00/4d8b30] Submitted process > regressive_alignment (cyclo.CLUSTALO.DPA.1000.MAFFT_PARTTREE)
[d8/037db2] Submitted process > regressive_alignment (ghf22.CLUSTALO.DPA.1000.MAFFT-FFTNS1)
[be/158589] Submitted process > regressive_alignment (blm.CLUSTALO.DPA.1000.MAFFT_PARTTREE)
[bc/64a226] Submitted process > regressive_alignment (TNF.CLUSTALO.DPA.1000.CLUSTALO)
[26/9eb09f] Submitted process > regressive_alignment (ghf1.CLUSTALO.DPA.1000.MAFFT_PARTT
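
The submission lines in the log above share one shape: `[hash] Submitted process > <process> (<tag>)`. As a rough sketch, assuming the log text has been saved to a string, the number of tasks submitted per process can be tallied as shown below. The function name `tally_submitted` and the sample text are illustrative. Lines truncated by the notebook output do not match the pattern and are skipped rather than guessed at.

```python
import re
from collections import Counter

# One complete submission line: "[05/97fbb1] Submitted process > guide_trees (int.CLUSTALO)"
SUBMITTED = re.compile(
    r"^\[(?P<hash>[0-9a-f]{2}/[0-9a-f]{6})\] "
    r"Submitted process > (?P<process>\w+) \((?P<tag>[^)]+)\)$"
)


def tally_submitted(log_text):
    """Count complete 'Submitted process' lines per process name."""
    counts = Counter()
    for line in log_text.splitlines():
        match = SUBMITTED.match(line.strip())
        if match:
            counts[match.group("process")] += 1
    return counts


# A few lines copied from the log, including one truncated line.
sample_log = """\
[05/97fbb1] Submitted process > guide_trees (int.CLUSTALO)
[13/af14dd] Submitted process > guide_trees (ghf5.MAFFT_PARTTREE)
[4f/298ced] Submitted process > guide_trees (msb.M
[00/89cfaa] Submitted process > regressive_alignment (serpin.CLUSTALO.DPA.1000.MAFFT_PARTTREE)
"""
counts = tally_submitted(sample_log)
```

Because truncated lines are ignored, the tally is a lower bound on the number of tasks actually submitted.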