# RAxML-NG on 50% ID MSA

Aim: Use RAxML-NG to generate a phylogenetic tree for my protein. 
This notebook documents the process I used:
1. Building the maximum-likelihood (ML) tree
2. Generating bootstrap replicates
3. Mapping bootstrap support onto the ML tree


In [1]:
from pathlib import Path # allows handling of files and folder paths cleanly
import subprocess # allows terminal commands to be run from python

# PARAMETERS
MSA = Path("../../data/cleaned/trimal/espy3-50.mafft.r07s08.trim.fasta") # trimmed MSA
OUTDIR = Path("../../data/cleaned/raxml/test/01_50id_bootstrap") # output folder for RAxML-NG results
PREFIX = "mytree" # label to prefix all the output files
MODEL = "Blosum62+G4" # substitution model used for protein data
BS_TREES = 100 # number of bootstrap replicates
THREADS = "auto" # let RAxML-NG choose the number of CPU threads automatically

# Make sure output folder exists
OUTDIR.mkdir(parents=True, exist_ok=True)

# Print settings to confirm they are correct
print(f"MSA: {MSA.resolve()}")
print(f"OUTDIR: {OUTDIR.resolve()}")
print(f"MODEL: {MODEL}")

MSA: /Users/lachlanblack/Documents/GitHub/EspY3-protein-evolution/data/cleaned/trimal/espy3-50.mafft.r07s08.trim.fasta
OUTDIR: /Users/lachlanblack/Documents/GitHub/EspY3-protein-evolution/data/cleaned/raxml/test/01_50id_bootstrap
MODEL: Blosum62+G4
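
Before starting a long run it can help to confirm that the trimmed MSA really is an alignment, meaning every sequence has the same length. The sketch below is one minimal way to do that in plain Python; `read_fasta` and `check_alignment` are hypothetical helper names, not part of RAxML-NG or of this notebook's original code. RAxML-NG also has its own `--check` mode, which is the more thorough option.

```python
from pathlib import Path

def read_fasta(path):
    """Return {name: sequence} from a FASTA file (name = first word of the header)."""
    records = {}
    name = None
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line  # sequences may wrap over several lines
    return records

def check_alignment(records):
    """Return (number of sequences, alignment length); raise if lengths differ."""
    if not records:
        raise ValueError("No sequences found")
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError(f"Sequences have differing lengths: {sorted(lengths)}")
    return len(records), lengths.pop()
```

In this notebook it would be called as `check_alignment(read_fasta(MSA))`. Note that duplicate sequence names would silently overwrite each other in this sketch.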


In [2]:
# Define small helper function to run terminal commands
def run(cmd):
    print(f"\n$ {cmd}\n") # prints the command to show exactly what was run
    process = subprocess.run(cmd, shell=True) # runs command in terminal
    if process.returncode != 0: # checks if it failed (non-zero = error)
        raise RuntimeError(f"Command failed: {cmd}") # stop notebook if it failed
        

Explanation of the RAxML-NG command:
- `--search` -> perform a maximum-likelihood search
- `--msa` -> the MSA file used to make tree
- `--model` -> the substitution model (Blosum62+G4)
- `--prefix` -> sets output file prefix
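- `--threads auto` -> let RAxML-NG choose the number of CPU threads automatically
- `--redo` -> overwrite existing output files that share the same prefix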
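
The flags above can also be assembled as a list of arguments rather than one long f-string, which makes each option easier to see and keeps paths with spaces safe. This is only a sketch: `build_search_cmd` is a hypothetical helper, and the file names in the example call are placeholders.

```python
import shlex

def build_search_cmd(msa, model, prefix, threads="auto", redo=True):
    """Assemble the raxml-ng --search call as a list of arguments."""
    args = [
        "raxml-ng", "--search",
        "--msa", str(msa),
        "--model", model,
        "--prefix", str(prefix),
        "--threads", str(threads),
    ]
    if redo:
        args.append("--redo")  # overwrite earlier results with this prefix
    return args

# Placeholder inputs, for illustration only
cmd = build_search_cmd("aln.fasta", "Blosum62+G4", "out/mytree")
print(shlex.join(cmd))
# prints: raxml-ng --search --msa aln.fasta --model Blosum62+G4 --prefix out/mytree --threads auto --redo
```

A list like this can be passed straight to `subprocess.run(cmd)` without `shell=True`, while `shlex.join` gives a copy-pasteable string for the log.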

In [4]:
# Build the ML tree

# Outputs:
    # {PREFIX}.raxml.bestTree -> Best ML tree
    # {PREFIX}.raxml.log -> Log file
    # {PREFIX}.raxml.mlTrees -> ML search trees
run(f"raxml-ng --search --msa {MSA} --model {MODEL} --prefix {OUTDIR / PREFIX} --threads {THREADS} --redo")



$ raxml-ng --search --msa ../../data/cleaned/trimal/espy3-50.mafft.r07s08.trim.fasta --model Blosum62+G4 --prefix ../../data/cleaned/raxml/test/01_50id_bootstrap/mytree --threads auto --redo


RAxML-NG v. 1.2.2 released on 30.04.2024 by The Exelixis Lab.
Developed by: Alexey M. Kozlov and Alexandros Stamatakis.
Contributors: Diego Darriba, Tomas Flouri, Benoit Morel, Sarah Lutteropp, Ben Bettisworth, Julia Haag, Anastasis Togkousidis.
Latest version: https://github.com/amkozlov/raxml-ng
Questions/problems/suggestions? Please visit: https://groups.google.com/forum/#!forum/raxml

System: Apple M2, 8 cores, 8 GB RAM

RAxML-NG was called at 03-Nov-2025 18:56:17 as follows:

raxml-ng --search --msa ../../data/cleaned/trimal/espy3-50.mafft.r07s08.trim.fasta --model Blosum62+G4 --prefix ../../data/cleaned/raxml/test/01_50id_bootstrap/mytree --threads auto --redo

Analysis options:
  run mode: ML tree search
  start tree(s): random (10) + parsimony (10)
  random seed: 1762196177
  tip-inner: 

In [6]:
# Generate bootstraps

# Output: {PREFIX}_bs.raxml.bootstraps
run(f"raxml-ng --bootstrap --msa {MSA} --model {MODEL} --bs-trees {BS_TREES} --prefix {OUTDIR / (PREFIX + '_bs')} --threads {THREADS} --redo")


$ raxml-ng --bootstrap --msa ../../data/cleaned/trimal/espy3-50.mafft.r07s08.trim.fasta --model Blosum62+G4 --bs-trees 100 --prefix ../../data/cleaned/raxml/test/01_50id_bootstrap/mytree_bs --threads auto --redo


RAxML-NG v. 1.2.2 released on 30.04.2024 by The Exelixis Lab.
Developed by: Alexey M. Kozlov and Alexandros Stamatakis.
Contributors: Diego Darriba, Tomas Flouri, Benoit Morel, Sarah Lutteropp, Ben Bettisworth, Julia Haag, Anastasis Togkousidis.
Latest version: https://github.com/amkozlov/raxml-ng
Questions/problems/suggestions? Please visit: https://groups.google.com/forum/#!forum/raxml

System: Apple M2, 8 cores, 8 GB RAM

RAxML-NG was called at 03-Nov-2025 18:57:12 as follows:

raxml-ng --bootstrap --msa ../../data/cleaned/trimal/espy3-50.mafft.r07s08.trim.fasta --model Blosum62+G4 --bs-trees 100 --prefix ../../data/cleaned/raxml/test/01_50id_bootstrap/mytree_bs --threads auto --redo

Analysis options:
  run mode: Bootstrapping
  start tree(s): 
  bootstrap replicates: pa
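
To make clear what a bootstrap replicate is: each one is a new alignment of the same length, built by sampling columns of the original MSA with replacement, and a tree is then inferred from it. The toy sketch below illustrates only the resampling idea; it is not RAxML-NG's implementation, and the sequences and the name `bootstrap_columns` are made up for the example.

```python
import random

def bootstrap_columns(alignment, rng):
    """Resample alignment columns with replacement to make one replicate."""
    n_sites = len(next(iter(alignment.values())))
    # Draw n_sites column indices; some columns repeat, others drop out
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}

# Toy alignment (hypothetical sequences)
toy = {"seqA": "MKTAYI", "seqB": "MKSAYV", "seqC": "MRTAFI"}
rng = random.Random(42)  # fixed seed so the example is repeatable
replicate = bootstrap_columns(toy, rng)
for name, seq in replicate.items():
    print(name, seq)
```

The same column indices are used for every sequence, so each resampled site is still a genuine column from the original alignment.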

In [7]:
# Map bootstrap support onto ML tree

# Output: {PREFIX}_support.raxml.support -> final tree with bootstrap values
run(f"raxml-ng --support --tree {OUTDIR / (PREFIX + '.raxml.bestTree')} --bs-trees {OUTDIR / (PREFIX + '_bs.raxml.bootstraps')} --prefix {OUTDIR / (PREFIX + '_support')} --redo")


$ raxml-ng --support --tree ../../data/cleaned/raxml/test/01_50id_bootstrap/mytree.raxml.bestTree --bs-trees ../../data/cleaned/raxml/test/01_50id_bootstrap/mytree_bs.raxml.bootstraps --prefix ../../data/cleaned/raxml/test/01_50id_bootstrap/mytree_support --redo


RAxML-NG v. 1.2.2 released on 30.04.2024 by The Exelixis Lab.
Developed by: Alexey M. Kozlov and Alexandros Stamatakis.
Contributors: Diego Darriba, Tomas Flouri, Benoit Morel, Sarah Lutteropp, Ben Bettisworth, Julia Haag, Anastasis Togkousidis.
Latest version: https://github.com/amkozlov/raxml-ng
Questions/problems/suggestions? Please visit: https://groups.google.com/forum/#!forum/raxml

System: Apple M2, 8 cores, 8 GB RAM

RAxML-NG was called at 03-Nov-2025 19:00:33 as follows:

raxml-ng --support --tree ../../data/cleaned/raxml/test/01_50id_bootstrap/mytree.raxml.bestTree --bs-trees ../../data/cleaned/raxml/test/01_50id_bootstrap/mytree_bs.raxml.bootstraps --prefix ../../data/cleaned/raxml/test/01_50id_bootstrap/mytree_su

# Visualising the tree

Open `{PREFIX}_support.raxml.support` in FigTree:
- Show Node Labels -> label to see bootstrap support values (0-100)
- The scale bar shows substitutions per site
- The tree is unrooted, so midpoint-root or outgroup-root it to give it direction
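
The support values can also be summarised without opening FigTree. The sketch below assumes the support file is a Newick string where each internal node label (the number straight after a closing parenthesis) is the bootstrap support. The toy tree and the name `support_values` are illustrative only, and the regex would be fooled by quoted taxon names that contain `)`.

```python
import re
from statistics import mean

def support_values(newick):
    """Return the numeric labels that directly follow a closing parenthesis."""
    return [float(v) for v in re.findall(r"\)(\d+(?:\.\d+)?)", newick)]

# Toy tree (hypothetical), in the same style as a .raxml.support file
toy_tree = "((A:0.1,B:0.2)95:0.05,(C:0.3,D:0.1)62:0.07,E:0.4);"
values = support_values(toy_tree)
print(values)        # prints: [95.0, 62.0]
print(min(values))   # prints: 62.0
print(mean(values))  # prints: 78.5
```

For the real file, the string could be read with `(OUTDIR / (PREFIX + '_support.raxml.support')).read_text()`.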