-
Notifications
You must be signed in to change notification settings - Fork 2
mptp_exercise
PTP takes as input a phylogenetic tree and tries to determine the transition point from a between- to a within-species process. The single rate PTP assumes that the branch lengths of the phylogenetic tree can be split in two clusters, fitting two distinct exponential distributions; one corresponding to the speciation and one to the coalescent process. The newly introduced mPTP assumes that the branch lengths of each putative species may form different clusters and therefore it assumes each may fit a distinct exponential distribution.
$ git clone https://github.com/Pas-Kapli/mptp.git
$ cd mptp
$ ./autogen.sh
$ ./configure
$ make
$ sudo make install$ mafft BR_cob_57ind.fasta > BR_cob_57ind_mafft.fasta
Note: Need to be much more careful with the alignment when using datasets with many indels (e.g. 16S rRNA)
$ raxmlHPC-PTHREADS-SSE3 -f c -s BR_cob_57ind_mafft.fasta -m GTRGAMMA -n read_alignment
On the screen and the RAxML_info.read_alignment file there is the following information
Found 17 sequences that are exactly identical to other sequences in the alignment.
Normally they should be excluded from the analysis.```
A reduced file is automatically created: [BR_cob_57ind_mafft.fasta.reduced](https://raw.githubusercontent.com/Pas-Kapli/assets/master/tutorials/Norway2016/data/Branchiomma/ptp/BR_cob_57ind_mafft.fasta.reduced) in phylip format
Check the difference:
```bash
$ head -n 1 BR_cob_57ind_mafft.fasta.reduced
$ grep ">"
$ grep ">" BR_cob_57ind_mafft.fasta | wc -l
Q: How many sequences were in the fasta file and how many are there in the reduced phylip file?
$ phylip_to_fasta.py BR_cob_57ind_mafft.fasta.reduced
The output file will be named BR_cob_57ind_mafft.fasta.reduced.fasta
$ raxmlHPC-PTHREADS-SSE3 -s BR_cob_57ind_mafft.fasta.reduced -m GTRGAMMA -n Branchiomma -p $RANDOM -T 2 -o BR_076,BR_018 Using the -o argument we retrieve a rooted phylogeny, if not then we can root it with mptp later
$ mkdir raxml multi single
$ mv RAxML* raxmlCheck the mptp options:
$ cd multi
$ mptp --help $ mptp --tree_file ../raxml/RAxML_bestTree.Branchiomma --minbr_auto ../../BR_cob_57ind.reduced.fasta -output_file minbr If the phylogeny is unrooted use the --outgroup taxon_name1,taxon_name2 for rooting the phylogeny prior to the delimitation
$ mptp --ml --multi --tree_file ../raxml/RAxML_bestTree.Branchiomma --output_file Branchiomma_mptp_multi --outgroup BR_076,BR_018 --outgroup_crop --minbr 0.0022718068 Check the output files:
$ firefox Branchiomma_mptp_multi.svg
$ nano Branchiomma_mptp_multi.txt
Q: How many putative species are delimited?
Re-run delimitation without taking the minimum branch length into account:
$ mptp --ml --multi --tree_file ../raxml/RAxML_bestTree.Branchiomma --output_file Branchiomma_mptp_multi --outgroup BR_076,BR_018 --outgroup_cropQ: Does the delimitation change when not using the minimum branch length or the crop-outgroup command? Why? When would these options matter?
Repeat the exercise with the single-rate algorithm using the --single instead of the --multi option
Q: Does the delimitation change under the single-rate PTP?
Perform MCMC sampling for the multi and the single rate PTP:
$ mptp --tree_file ../raxml/RAxML_bestTree.Branchiomma --minbr 0.0022718068 --mcmc 10000000 --mcmc_log --mcmc_sample 50000 --multi --mcmc_runs 2 --output_file mcmc_multi --outgroup BR_076,BR_018 --outgroup_crop Output files
tree file with support values in svg format (mcmc_multi.1476550918.combined.svg) and in newick format (mcmc_multi.1476550918.combined.tree)
Plot of the likelihood per mcmc step for the multi rate algorithm
Q: Did the two runs converge?
Q: Did the runs converge to the Maximum Likelihood solution?