
mptp_exercise

Pas-Kapli edited this page Oct 15, 2016 · 20 revisions

Commands for installing mptp

$ git clone https://github.com/Pas-Kapli/mptp.git
$ cd mptp
$ ./autogen.sh
$ ./configure
$ make
$ sudo make install
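The rest of the exercise also calls `mafft` and `raxmlHPC-PTHREADS-SSE3`, so it may help to confirm that every program is on the `PATH` before starting. The helper below is a minimal sketch; `missing_tools` is a hypothetical name introduced here, not part of mptp.

```bash
# Print each named tool that is not found on PATH; print nothing if all are present.
# missing_tools is a hypothetical helper, not something shipped with mptp.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}
```

For this exercise it could be called as `missing_tools mafft raxmlHPC-PTHREADS-SSE3 mptp`; an empty result means all three were found.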

Data pre-processing

1.1. Infer alignment:

$ mafft BR_cob_57ind.fasta > BR_cob_57ind_mafft.fasta

Take much more care with the alignment when the dataset contains many indels (e.g. 16S rRNA).

1.2. Remove identical sequences:

$ raxmlHPC-PTHREADS-SSE3 -f c -s BR_cob_57ind_mafft.fasta -m GTRGAMMA -n read_alignment

RAxML prints the following message to the screen and to the RAxML_info.read_alignment file:

```
Found 17 sequences that are exactly identical to other sequences in the alignment.
Normally they should be excluded from the analysis.
```

RAxML automatically writes a reduced alignment in phylip format: BR_cob_57ind_mafft.fasta.reduced

Check the difference: 

```bash
$ head -n 1 BR_cob_57ind_mafft.fasta.reduced
$ grep ">" BR_cob_57ind_mafft.fasta | wc -l
```

Q: How many sequences were in the fasta file and how many are there in the reduced phylip file?
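The two counts can also be compared in one step. The functions below are a minimal sketch: they assume the phylip header line starts with the number of sequences and that every fasta record starts with a `>` line. The function names are hypothetical and introduced only for illustration.

```bash
# Count fasta records: each record starts with a ">" header line.
count_fasta() {
  grep -c '^>' "$1"
}

# Read the sequence count from a phylip file: the first field of the header line.
count_phylip() {
  awk 'NR == 1 { print $1; exit }' "$1"
}

# Report how many sequences were dropped between the fasta ($1) and the phylip file ($2).
count_removed() {
  echo $(( $(count_fasta "$1") - $(count_phylip "$2") ))
}
```

Calling `count_removed BR_cob_57ind_mafft.fasta BR_cob_57ind_mafft.fasta.reduced` should agree with the number of identical sequences that RAxML reported.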

Convert phylip to fasta (we will need it for mptp):

$ phylip_to_fasta.py BR_cob_57ind_mafft.fasta.reduced

The output file will be named BR_cob_57ind_mafft.fasta.reduced.fasta
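The `phylip_to_fasta.py` script is not shown on this page. If it is not available, the conversion can be approximated with awk. This is only a sketch, not the script used above: it assumes a sequential phylip file in which each taxon occupies a single line (name first, then the sequence), and it does not handle interleaved files. The function name is hypothetical.

```bash
# Convert a sequential phylip alignment (one "name sequence" record per line) to fasta.
# Assumption: no interleaved blocks; the header line (NR == 1) is skipped.
phylip_to_fasta_sketch() {
  awk 'NR > 1 && NF >= 2 {
         name = $1
         seq = ""
         for (i = 2; i <= NF; i++) seq = seq $i   # rejoin sequences split by spaces
         print ">" name
         print seq
       }' "$1"
}
```

To mirror the file name used in this exercise, the output could be redirected: `phylip_to_fasta_sketch BR_cob_57ind_mafft.fasta.reduced > BR_cob_57ind_mafft.fasta.reduced.fasta`.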

1.3. Infer phylogenetic tree with RAxML:

 $ raxmlHPC-PTHREADS-SSE3 -s BR_cob_57ind_mafft.fasta.reduced -m GTRGAMMA -n Branchiomma -p $RANDOM -T 2 -o BR_076,BR_018 

The -o argument makes RAxML return a phylogeny rooted on the given outgroup. If the tree is left unrooted, it can be rooted later with mptp.

 $ mkdir raxml multi single
 $ mv RAxML* raxml

"Species" delimitation with mptp:

Check the mptp options:

$ cd multi
$ mptp --help

Delimitation with the multi-rate algorithm

If the phylogeny is unrooted, use `--outgroup taxon_name1,taxon_name2` to root it prior to the delimitation

 $ mptp --ml --multi --tree_file ../raxml/RAxML_bestTree.Branchiomma --output_file Branchiomma_mptp_multi --outgroup BR_076,BR_018 --outgroup_crop 

Check the output files:

$ firefox Branchiomma_mptp_multi.svg

$ nano Branchiomma_mptp_multi.txt

Q: How many putative species are delimited?

Detect minimum branch length:

 $ mptp --tree_file ../raxml/RAxML_bestTree.Branchiomma --minbr_auto ../BR_cob_57ind_mafft.fasta.reduced.fasta --output_file minbr

Re-run delimitation taking the minimum branch length into account:

 $ mptp --ml --multi --tree_file ../raxml/RAxML_bestTree.Branchiomma --output_file Branchiomma_mptp_multi --outgroup BR_076,BR_018 --outgroup_crop --minbr 0.0022718068 

Q: Does the delimitation change when using the minimum branch length (--minbr) or the --outgroup_crop option? Why? When would these options matter?

Repeat the exercise with the single-rate algorithm
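To keep the two runs comparable, the same command line can be generated for either algorithm. The sketch below assumes that the single-rate model is selected with `--single`, mirroring `--multi`; confirm this against `mptp --help`. The function name and the `Branchiomma_mptp_single` output name are hypothetical choices for this illustration.

```bash
# Print the mptp maximum-likelihood command for one algorithm: "multi" or "single".
# Assumption: the single-rate model is selected with --single; check `mptp --help`.
mptp_ml_cmd() {
  echo mptp --ml "--$1" \
    --tree_file ../raxml/RAxML_bestTree.Branchiomma \
    --output_file "Branchiomma_mptp_$1" \
    --outgroup BR_076,BR_018 --outgroup_crop
}
```

From inside the `single` directory, `mptp_ml_cmd single` prints the command so it can be inspected first; `eval "$(mptp_ml_cmd single)"` would then run it.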

Q: How do the results differ from the multi-rate PTP?

Support values:

Perform MCMC sampling (proportional to the likelihood) for both the multi-rate and the single-rate PTP:

$ mptp --tree_file ../raxml/RAxML_bestTree.Branchiomma --minbr 0.0022718068 --mcmc 10000000 --mcmc_log 1000000 --multi --mcmc_runs 2 --output_file mcmc_multi 
$ firefox mcmc_multi.SOME_NUMBER.combined.svg

Q: Did the two runs converge?

Q: Did the runs converge to the Maximum Likelihood solution?
