# Phylogenetic inference

Next, we'll build a couple of quick phylogenetic trees from the samples in our ipyrad assembly.

<br>

## iqtree

IQ-TREE is a widely used, easy-to-run program for inferring maximum likelihood phylogenies, so we'll start with it.

### MORE BACKGROUND HERE once code is all there





Load the IQ-TREE module on the UW cluster. (For GCP this will probably become a container with IQ-TREE plus PAUP for SVDQuartets.)

In [None]:
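# Note: each `!` command runs in its own subshell, so this module load does not
# carry over to later cells; the iqtree2 call further down uses the full path instead.
! module load iq-tree/2.3.6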

In [None]:
# ! /apps/u/opt/linux/iq-tree/2.3.6/bin/iqtree2

IQ-TREE is easy to run: we mostly just need to point it at the input file, which here is the PHYLIP-formatted output from ipyrad (the `.phy` file).


We'll store the input and output paths in variables. That way, switching to a different dataset only means changing these variables, not the program call itself.

`INFILE` is the path to the input file.

`OUTFIX` is the prefix prepended to the name of each output file.

`OUTDIR` is the directory that all output will be written to.
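
Before launching a long run, it can be worth confirming that `INFILE` really points at a readable PHYLIP file. The sketch below assumes the usual PHYLIP layout, in which the first line holds the number of taxa and the number of sites. The helper name `read_phylip_header` and the tiny demo alignment are made up for illustration.

```python
import os
import tempfile


def read_phylip_header(path):
    """Return (n_taxa, n_sites) from the first line of a PHYLIP file."""
    with open(path) as handle:
        fields = handle.readline().split()
    if len(fields) != 2:
        raise ValueError(f"unexpected PHYLIP header in {path!r}")
    return int(fields[0]), int(fields[1])


# Demo on a tiny made-up alignment written to a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo.phy")
    with open(demo, "w") as out:
        out.write("2 8\nsampleA   ACGTACGT\nsampleB   ACGTACGA\n")
    n_taxa, n_sites = read_phylip_header(demo)
    print(n_taxa, n_sites)
```

On the real data, the same call would be `read_phylip_header(os.environ["INFILE"])` once the variables are set.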


Options that we'll set in the program call include:

`-s $INFILE` sets the input sequence file.

`-m MFP` tells IQ-TREE to run ModelFinder Plus, which finds the best-fitting model of sequence evolution and then infers the tree under it, so we don't have to specify a model ourselves.

`-T auto` tells IQ-TREE to automatically determine the best number of threads to use, up to the maximum we set with `-ntmax` based on what we've allocated.

`--prefix $OUTFIX` sets the prefix for our output files to whatever we define in our `OUTFIX` variable.

`-B 1000` tells IQ-TREE to run 1000 ultrafast bootstrap (UFBoot) replicates for assessing support.

`-alrt 1000` runs 1000 replicates of the SH-aLRT test (SH-like approximate likelihood ratio test, a likelihood-based metric of branch support).

`-ntmax 32` sets the maximum number of threads that `-T auto` may use. This should not exceed the number of cores in your instance, so adjust it to match your allocation.
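
To see how these options fit together, here is a minimal sketch that assembles them into one command line. The function name `build_iqtree_cmd`, the bare `iqtree2` binary name, and the example file names are assumptions for illustration. The snippet only prints the command and does not run IQ-TREE.

```python
import shlex


def build_iqtree_cmd(infile, prefix, max_threads, binary="iqtree2",
                     ufboot=1000, alrt=1000):
    """Return the IQ-TREE argument list for the options described above."""
    return [
        binary,
        "-s", infile,                # input alignment
        "-m", "MFP",                 # ModelFinder Plus
        "-T", "auto",                # choose the thread count automatically...
        "-ntmax", str(max_threads),  # ...up to this maximum
        "--prefix", prefix,          # prefix for all output files
        "-B", str(ufboot),           # ultrafast bootstrap replicates
        "-alrt", str(alrt),          # SH-aLRT replicates
    ]


cmd = build_iqtree_cmd("alignment.phy", "example", max_threads=4)
print(shlex.join(cmd))
```

Keeping the command as a list makes it easy to change one option per dataset, and the same list could be handed to `subprocess.run` if you prefer that over a `!` shell escape.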


In [None]:
import os



# set up the input file, outfile prefix, and output directory
os.environ["INFILE"] = "/project/inbreh/radseq_cloud/ruber_reduced_denovo_outfiles/ruber_reduced_denovo.phy"
os.environ["OUTFIX"] = "ruber"
OUTDIR = "/project/inbreh/radseq_cloud/iqtree_out"

In [None]:
! mkdir -p $OUTDIR # create the output directory if it doesn't already exist
os.chdir(OUTDIR)

In [None]:
## Execute iqtree
# `-nt AUTO` is the older spelling of `-T auto`, so the thread option is given only once here

! /apps/u/opt/linux/iq-tree/2.3.6/bin/iqtree2 -s $INFILE -m MFP -T auto --prefix $OUTFIX -B 1000 -alrt 1000 -ntmax 32
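
Once the run finishes, a quick check that the main output files exist can save confusion later. The sketch below assumes the usual IQ-TREE output names for a run with `-B`: `<prefix>.iqtree` (report), `<prefix>.treefile` (maximum likelihood tree), `<prefix>.contree` (bootstrap consensus tree), and `<prefix>.log`. The helper `missing_outputs` is hypothetical, and the demo uses empty stand-in files rather than real output.

```python
import os
import tempfile

# Assumed output suffixes for a run that includes -B
EXPECTED_SUFFIXES = [".iqtree", ".treefile", ".contree", ".log"]


def missing_outputs(outdir, prefix, suffixes=EXPECTED_SUFFIXES):
    """Return the expected output file names that are not present in outdir."""
    return [
        prefix + suffix
        for suffix in suffixes
        if not os.path.isfile(os.path.join(outdir, prefix + suffix))
    ]


# Demo: create only two of the four files, then report what is missing
with tempfile.TemporaryDirectory() as tmp:
    for suffix in (".iqtree", ".treefile"):
        open(os.path.join(tmp, "demo" + suffix), "w").close()
    missing = missing_outputs(tmp, "demo")
    print(missing)
```

For this notebook the call would be `missing_outputs(OUTDIR, os.environ["OUTFIX"])`; an empty list means all four files were found.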

## SVDQuartets


SVDQuartets is a quartet-based method designed to infer species trees from SNPs, but it can also be used with full concatenated alignments to generate trees of individuals, as we've done with IQ-TREE.

It is somewhat more involved to set up, and we'll again set it up with a bunch of bash variables.

What we'll do is run a single search for the best tree, save it, then run a second search that includes bootstrapping and save those trees. Later, in R, we'll plot the bootstrap support onto the best tree. Note that if you follow the tutorial from **ADD IT IN**, the bootstraps will be plotted on a consensus of the bootstrap trees, not on the best tree estimated from your actual data. I consider this to be highly undesirable.

## NOTE: you will need to manually edit the nexus file, saving a new copy with the charsets deleted
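
If you would rather not edit the file by hand, the sketch below shows one way to drop the charsets. It assumes they sit in a `begin sets; ... end;` block with `begin sets;` and `end;` each on their own line. Check your own nexus file before relying on this, since that layout is an assumption. The helper name `strip_sets_block` and the demo lines are made up for illustration.

```python
def strip_sets_block(lines):
    """Return the lines of a nexus file with any 'begin sets; ... end;' block removed."""
    kept = []
    in_sets = False
    for line in lines:
        token = line.strip().lower()
        if not in_sets and token.startswith("begin sets"):
            in_sets = True   # start skipping at the sets block header
            continue
        if in_sets:
            if token.startswith("end;") or token.startswith("endblock;"):
                in_sets = False  # the block's closing line is skipped too
            continue
        kept.append(line)
    return kept


# Demo on a tiny made-up nexus file
nexus_lines = [
    "#nexus",
    "begin data;",
    "  dimensions ntax=2 nchar=8;",
    "  format datatype=dna missing=N gap=-;",
    "  matrix",
    "    sampleA ACGTACGT",
    "    sampleB ACGTACGA",
    "  ;",
    "end;",
    "begin sets;",
    "  charset locus_1 = 1-4;",
    "  charset locus_2 = 5-8;",
    "end;",
]
cleaned = strip_sets_block(nexus_lines)
print("\n".join(cleaned))
```

For a real file, read its lines, pass them through `strip_sets_block`, and write the result under a new name so the original ipyrad output stays untouched.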

In [None]:
%%bash
PAUP=/project/inbreh/software/paup4a168_centos64 # set up PAUP path
OUTDIR="/project/inbreh/radseq_cloud/svdq_out"


# define variables for the PAUP block
filebname="ruber_reduced_denovo" # basename for all produced files
# input nexus file; a full path is fine, so the input does not have to live in the working directory
infile="/project/inbreh/radseq_cloud/ruber_reduced_denovo_outfiles/ruber_reduced_denovoPAUP.nex"
nthreads=12 # number of threads in the slurm script
nreps=200 # number of reps for bootstrapping



################################################################################################################################################################
################################################################################################################################################################
####    Run based on the parameters set above
################################################################################################################################################################
################################################################################################################################################################


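# change working directory to where your output files will go
# (stop if the directory cannot be entered, so files are not written somewhere unexpected)
mkdir -p "$OUTDIR"
cd "$OUTDIR" || exit 1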


cat <<EOF > $filebname.paup.txt
Begin paup;
set autoclose=yes warntree=no warnreset=no flock=no;
log start file=$filebname.log ;
execute $infile;
svdQuartets evalQuartets=all showScores=no ambigs=distribute bootstrap=no nthreads=$nthreads;
savetrees file=$filebname.besttree.tre;
svdQuartets evalQuartets=all showScores=no ambigs=distribute bootstrap=standard nreps=$nreps nthreads=$nthreads treefile=$filebname.svdqboots.tre;  
quit; 
end;
EOF

$PAUP $filebname.paup.txt #execute your new paup block file
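
After PAUP finishes, one simple sanity check is to count the trees in the bootstrap tree file and compare that with `nreps`. The sketch below assumes the tree file is NEXUS-formatted with one `tree <name> = ...;` statement per line and one tree per bootstrap replicate. Both are assumptions to confirm against your own output. The helper `count_trees` and the demo file are made up for illustration.

```python
import os
import re
import tempfile

# Matches the start of a NEXUS tree statement, e.g. "tree 'PAUP_1' = [&U] (...);"
TREE_STATEMENT = re.compile(r"^\s*tree\s+\S+\s*=", re.IGNORECASE)


def count_trees(path):
    """Count the tree statements in a NEXUS tree file."""
    with open(path) as handle:
        return sum(1 for line in handle if TREE_STATEMENT.match(line))


# Demo on a tiny made-up tree file with three trees
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo.tre")
    with open(demo, "w") as out:
        out.write("#NEXUS\nBegin trees;\n")
        for i in range(1, 4):
            out.write(f"tree 'PAUP_{i}' = [&U] ((A,B),(C,D));\n")
        out.write("End;\n")
    n_trees = count_trees(demo)
    print(n_trees)
```

On the real run, the file to check would be `ruber_reduced_denovo.svdqboots.tre` inside the SVDQuartets output directory.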



