PhyloPhlAn 3: Example 02: Tree of life

Katarina Mladenovic edited this page Apr 11, 2024 · 1 revision

Prokaryotes Tree of life reconstruction


Before starting, make sure PhyloPhlAn 3 is installed.

  • Make sure the PhyloPhlAn 3 scripts are executable and available on your PATH
  • The commands in this tutorial assume that you are inside the tutorial folder examples/02_tol
  • All the steps below are reported in the run_02.sh script
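
Several commands below pipe their output through tee into a logs/ directory. tee does not create missing directories, so a minimal preparation step, assuming you are already inside examples/02_tol, could look like this:

```shell
# Create the directory that the later `tee logs/...` redirections write into.
# -p makes this a no-op if the directory already exists.
mkdir -p logs

# Confirm that the directory is in place
ls -d logs
```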

By following these steps, the user will be able to build the microbial tree of life and, if desired, place newly sequenced genomes into it.

Step 1. Download reference genomes

This tutorial requires at least one genome from each bacterial and archaeal species. You can download them with phylophlan_get_reference, using the parameters -g all and -n 1, as explained in the tutorial:

phylophlan_get_reference \
    -g all \
    -o input_genomes/ \
    -n 1 \
    --verbose 2>&1 | tee logs/phylophlan_get_reference.log

This will create a directory called input_genomes inside the examples/02_tol directory, which will contain all the genomes necessary to build the prokaryotes tree of life.

If you wish to phylogenetically place your own genomes into the prokaryotes tree of life, move them into the directory where the reference genomes were downloaded (i.e., examples/02_tol/input_genomes).
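
As a rough sketch of that step, the snippet below copies files from a hypothetical my_genomes directory (the name is an assumption for illustration, not part of the tutorial) and then counts the files that will be passed to phylophlan. It may be worth checking that your files use the same extension as the downloaded references.

```shell
# Hypothetical location of your own assemblies; adjust to your setup
MY_GENOMES=my_genomes

# input_genomes normally exists after Step 1; -p keeps this harmless
mkdir -p input_genomes

# Copy only if the hypothetical source directory exists
if [ -d "$MY_GENOMES" ]; then
    cp "$MY_GENOMES"/* input_genomes/
fi

# Count the files that will be given to `phylophlan -i input_genomes`
find input_genomes -type f | wc -l
```

Copying rather than moving keeps the original files untouched; use mv instead if disk space is a concern.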

Step 2. Generating the configuration file

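The configuration file for this analysis can be generated with the following command, which selects an amino-acid database (-d a), DIAMOND for database building and mapping, MAFFT for the multiple-sequence alignment, trimAl for trimming, and IQ-TREE for tree inference: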

phylophlan_write_config_file \
    -d a \
    -o 02_tol.cfg \
    --db_aa diamond \
    --map_dna diamond \
    --map_aa diamond \
    --msa mafft \
    --trim trimal \
    --tree1 iqtree \
    --verbose 2>&1 | tee phylophlan_write_config_file.log
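
The configuration above names four external tools. A quick check that they are installed might look like the sketch below. The executable names are assumptions based on the option values; on some systems IQ-TREE is installed as iqtree2, for example.

```shell
# Report, for each name given, whether an executable is found on the PATH.
# check_tools is a helper defined here for illustration only.
check_tools() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
        fi
    done
}

# Executable names assumed from the options passed to phylophlan_write_config_file
check_tools diamond mafft trimal iqtree
```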

Step 3. Reconstruct the prokaryotes tree of life

To reconstruct the prokaryotes tree of life (potentially including your newly sequenced genomes) you can run PhyloPhlAn 3 as follows:

phylophlan \
    -i input_genomes \
    -d phylophlan \
    -f 02_tol.cfg \
    --diversity high \
    --fast \
    -o output_tol \
    --nproc 16 \
    --verbose 2>&1 | tee logs/phylophlan.log

In this case, we specified both --diversity high and --fast because we want to use the PhyloPhlAn 3 settings tuned for very large phylogenies (see here for more information).

In the above command we also specified --nproc 16, but we suggest using as many cores as possible, as explained in the tutorial.
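
One way to pick that value is to query the number of online cores instead of hard-coding it. This is a sketch that assumes getconf is available, as it is on most Linux and macOS systems. On a shared machine you may prefer a smaller number.

```shell
# Number of processors currently online
NPROC=$(getconf _NPROCESSORS_ONLN)

# Show the option that would replace `--nproc 16` in the command above
echo "--nproc $NPROC"
```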

The output files produced by the pipeline are written to the output folder output_tol, and the final tree file is input_genomes.tre.treefile.
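
As a simple sanity check, you can compare the number of leaves in the final tree with the number of input files. The sketch below counts leaves as the number of commas in the Newick string plus one, which assumes that no label contains a comma. It searches output_tol for the tree rather than assuming its exact subfolder. count_leaves is a helper defined here for illustration.

```shell
# Count leaves in a Newick file: commas + 1 (assumes no commas inside labels)
count_leaves() {
    commas=$(tr -cd ',' < "$1" | wc -c)
    echo $((commas + 1))
}

# Locate the final tree without assuming where it sits inside output_tol
tree=$(find output_tol -name 'input_genomes.tre.treefile' 2>/dev/null | head -n 1)

if [ -n "$tree" ]; then
    echo "leaves in tree: $(count_leaves "$tree")"
    echo "input files:    $(find input_genomes -type f | wc -l)"
fi
```

The two numbers need not match exactly, since a genome can be left out of the final tree, but a large gap is worth investigating in the logs.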

The resulting tree can be visualized with any common tree editor (see here for a list).
