Run raxml through outsider in R

Randomized Axelerated Maximum Likelihood (RAxML): Phylogenetic Analysis and Post-Analysis of Large Phylogenies

Install and look up help

library(outsider)
#> ----------------
#> outsider v 0.1.0
#> ----------------
#> - Security notice: be sure of which modules you install
module_install(repo = "dombennett/om..raxml")
#> -----------------------------------------------------
#> Warning: You are about to install an outsider module!
#> -----------------------------------------------------
#> Outsider modules install and run external programs
#> via Docker <https://www.docker.com>. These external
#> programs may communicate with the internet and could
#> potentially be malicious.
#> 
#> Be sure to know the module you are about to install:
#> Is it from a trusted developer? Are colleagues using
#> it? Is it supposed to download lots of data? Is it
#> well used (e.g. check number of stars on GitHub)?
#> -----------------------------------------------------
#>  Module information
#> -----------------------------------------------------
#> program: RAxML
#> details: Randomized Axelerated Maximum Likelihood (8.2.12 HPC PTHREADS SSE3) for inferring phylogenies
#> docker: dombennett
#> github: dombennett
#> url: https://github.com/DomBennett/om..raxml
#> image: dombennett/om_raxml
#> container: om_raxml
#> package: om..raxml
#> Travis CI: Failing/Erroring
#> -----------------------------------------------------
#> Enter any key to continue or press Esc to quit
# look up help for the module (uncomment to run)
# module_help(repo = "dombennett/om..raxml")

Partitioned DNA analysis

All demonstrations are taken from the RAxML “hands-on” session.

# ----
# Data
# ----
# example DNA
dna_phy <- "10 60
Cow       ATGGCATATCCCATACAACTAGGATTCCAAGATGCAACATCACCAATCATAGAAGAACTA
Carp      ATGGCACACCCAACGCAACTAGGTTTCAAGGACGCGGCCATACCCGTTATAGAGGAACTT
Chicken   ATGGCCAACCACTCCCAACTAGGCTTTCAAGACGCCTCATCCCCCATCATAGAAGAGCTC
Human     ATGGCACATGCAGCGCAAGTAGGTCTACAAGACGCTACTTCCCCTATCATAGAAGAGCTT
Loach     ATGGCACATCCCACACAATTAGGATTCCAAGACGCGGCCTCACCCGTAATAGAAGAACTT
Mouse     ATGGCCTACCCATTCCAACTTGGTCTACAAGACGCCACATCCCCTATTATAGAAGAGCTA
Rat       ATGGCTTACCCATTTCAACTTGGCTTACAAGACGCTACATCACCTATCATAGAAGAACTT
Seal      ATGGCATACCCCCTACAAATAGGCCTACAAGATGCAACCTCTCCCATTATAGAGGAGTTA
Whale     ATGGCATATCCATTCCAACTAGGTTTCCAAGATGCAGCATCACCCATCATAGAAGAGCTC
Frog      ATGGCACACCCATCACAATTAGGTTTTCAAGACGCAGCCTCTCCAATTATAGAAGAATTA"
# example partition
simpleDNApartition <- "DNA, p1=1-30
DNA, p2=31-60"
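# Write the files through binary-mode connections ('wb') so that line endings
# are not converted (e.g. to \r\n on Windows)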
input_file <- file.path(tempdir(), 'dna.phy')
input_connection <- file(input_file, 'wb')
write(file = input_connection, x = dna_phy)
close(input_connection)
partition_file <- file.path(tempdir(), 'simpleDNApartition.txt')
partition_connection <- file(partition_file, 'wb')
write(file = partition_connection, x = simpleDNApartition)
close(partition_connection)


# -----
# RAxML
# -----
library(outsider)
# import function
raxml <- module_import(fname = 'raxml', repo = "dombennett/om..raxml")
# create folder to host results
results_dir <- file.path(tempdir(), 'raxml_example')
dir.create(results_dir)
# run raxml
# arglist = command arguments that would have been passed to command-line
# program.
# Note: R objects are allowed in the arglist, e.g. input_file
raxml(arglist = c('-m', 'GTRGAMMA', '-p', '12345', '-q', partition_file,
                  '-s', input_file, '-n', 'T21'), outdir = results_dir)
#> 
#> WARNING: The number of threads is currently set to 0
#> You can specify the number of threads to run via -T numberOfThreads
#> NumberOfThreads must be set to an integer value greater than 1
#> 
#> RAxML, will now set the number of threads automatically to 2 !
#> 
#> 
#> This is the RAxML Master Pthread
#> 
#> This is RAxML Worker Pthread Number: 1
#> 
#> 
#> This is RAxML version 8.2.12 released by Alexandros Stamatakis on May 2018.
#> 
#> With greatly appreciated code contributions by:
#> Andre Aberer      (HITS)
#> Simon Berger      (HITS)
#> Alexey Kozlov     (HITS)
#> Kassian Kobert    (HITS)
#> David Dao         (KIT and HITS)
#> Sarah Lutteropp   (KIT and HITS)
#> Nick Pattengale   (Sandia)
#> Wayne Pfeiffer    (SDSC)
#> Akifumi S. Tanabe (NRIFS)
#> Charlie Taylor    (UF)
#> 
#> 
#> Alignment has 38 distinct alignment patterns
#> 
#> Proportion of gaps and completely undetermined characters in this alignment: 0.00%
#> 
#> RAxML rapid hill-climbing mode
#> 
#> Using 2 distinct models/data partitions with joint branch length optimization
#> 
#> 
#> Executing 1 inferences on the original alignment using 1 distinct randomized MP trees
#> 
#> All free model parameters will be estimated by RAxML
#> GAMMA model of rate heterogeneity, ML estimate of alpha-parameter
#> 
#> GAMMA Model parameters will be estimated up to an accuracy of 0.1000000000 Log Likelihood units
#> 
#> Partition: 0
#> Alignment Patterns: 20
#> Name: p1
#> DataType: DNA
#> Substitution Matrix: GTR
#> 
#> 
#> 
#> Partition: 1
#> Alignment Patterns: 18
#> Name: p2
#> DataType: DNA
#> Substitution Matrix: GTR
#> 
#> 
#> 
#> 
#> RAxML was called as follows:
#> 
#> raxmlHPC-PTHREADS-SSE3 -m GTRGAMMA -p 12345 -q simpleDNApartition.txt -s dna.phy -n T21 
#> 
#> 
#> Partition: 0 with name: p1
#> Base frequencies: 0.323 0.293 0.153 0.230 
#> 
#> Partition: 1 with name: p2
#> Base frequencies: 0.327 0.283 0.183 0.207 
#> 
#> Inference[0]: Time 0.098335 GAMMA-based likelihood -377.005373, best rearrangement setting 5
#> 
#> 
#> Conducting final model optimizations on all 1 trees under GAMMA-based models ....
#> 
#> Inference[0] final GAMMA-based Likelihood: -375.308100 tree written to file /working_dir/RAxML_result.T21
#> 
#> 
#> Starting final GAMMA-based thorough Optimization on tree 0 likelihood -375.308100 .... 
#> 
#> Final GAMMA-based Score of best tree -375.308100
#> 
#> Program execution info written to /working_dir/RAxML_info.T21
#> Best-scoring ML tree written to: /working_dir/RAxML_bestTree.T21
#> 
#> Overall execution time: 0.147280 secs or 0.000041 hours or 0.000002 days
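
The log above names three output files (RAxML_info, RAxML_result and RAxML_bestTree, each suffixed with the run name given to -n). As a minimal sketch, assuming these files are copied from the container's /working_dir into outdir, a small helper can report which of them are present. The function name raxml_output_files is hypothetical and is not part of outsider or this module.

```r
# Hypothetical helper (not part of outsider): report which of the output
# files named in the RAxML log are present in a results folder.
raxml_output_files <- function(outdir, run_name) {
  # File prefixes taken from the log output of the partitioned DNA example
  prefixes <- c("RAxML_info", "RAxML_result", "RAxML_bestTree")
  paths <- file.path(outdir, paste0(prefixes, ".", run_name))
  data.frame(file = basename(paths), exists = file.exists(paths),
             stringsAsFactors = FALSE)
}

# Only runs if the example above has already created results_dir
if (exists("results_dir")) {
  print(raxml_output_files(results_dir, "T21"))
}
```

If the ape package is installed, the best-scoring tree could then be read into R with ape::read.tree(file.path(results_dir, "RAxML_bestTree.T21")).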

Key arguments

Some key arguments for running the RAxML program.

Argument  Usage    Description
m         -m       Model to run, e.g. GTRGAMMA or GTRCAT
p         -p #     Specify seed #
s         -s file  Specify input file
#         -# #     Specify # iterations
n         -n name  Specify name of analysis
q         -q file  Specify partition file

Additionally, the R interface allows a user to specify an outdir where all the resulting files should be saved. By default, the outdir is the current working directory.
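
Each flag and its value from the table become two consecutive elements of the character vector passed as arglist. As a minimal sketch, named R arguments can be interleaved into that vector. The helper name build_arglist is hypothetical and is not part of outsider or this module, and it assumes every flag takes exactly one value.

```r
# Hypothetical helper (not part of outsider): turn named arguments into the
# flat character vector expected by arglist, e.g. m = "GTRGAMMA" becomes
# "-m", "GTRGAMMA". Assumes every flag takes exactly one value.
build_arglist <- function(...) {
  args <- list(...)
  stopifnot(length(args) > 0, !is.null(names(args)), all(names(args) != ""))
  flags <- paste0("-", names(args))
  values <- vapply(args, as.character, character(1))
  # rbind() + as.vector() interleaves flags and values column by column
  as.vector(rbind(flags, values))
}

# Usage: the '#' flag needs backticks because it is not a syntactic R name
args <- build_arglist(m = "BINGAMMA", p = 12345, s = "binary.phy",
                      `#` = 20, n = "T5")
print(args)
```

The result could then be passed on as raxml(arglist = args, outdir = results_dir). Note that very large numbers may be converted to scientific notation by as.character(), so passing such values as strings is safer.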

Other examples: from command-line to R

ML

# command line
raxmlHPC -m BINGAMMA -p 12345 -s binary.phy -# 20 -n T5
# R
raxml(arglist = c('-m', 'BINGAMMA', '-p', '12345', '-s', 'binary.phy', '-#',
                  '20', '-n', 'T5'))

Ordered morphological character matrix

# command line
raxmlHPC -p 12345 -m MULTIGAMMA -s multiState.phy -K ORDERED -n T12
# R
raxml(arglist = c('-p', '12345', '-m', 'MULTIGAMMA', '-s', 'multiState.phy',
                  '-K', 'ORDERED', '-n', 'T12'))

Bootstrap

# command line
raxmlHPC -m GTRCAT -p 12345 -f b -t RAxML_bestTree.T13 -z RAxML_bootstrap.T14 \
  -n T15
# R
raxml(arglist = c('-m', 'GTRCAT', '-p', '12345', '-f', 'b', '-t',
                  'RAxML_bestTree.T13', '-z', 'RAxML_bootstrap.T14', '-n', 'T15'))
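
The translation in the examples above is mechanical: drop the program name and pass every remaining whitespace-separated token as one element of arglist. A minimal sketch of that step follows. The helper name cmd_to_arglist is hypothetical and is not part of outsider, and it assumes a single-line command with no quoted arguments or paths containing spaces.

```r
# Hypothetical helper (not part of outsider): split a single-line RAxML
# command into the character vector used for arglist.
# Assumes no quoting and no file paths that contain spaces.
cmd_to_arglist <- function(cmd) {
  # Split on any run of whitespace, so repeated spaces are harmless
  tokens <- strsplit(trimws(cmd), "[[:space:]]+")[[1]]
  # Drop the first token, which is the program name (e.g. raxmlHPC)
  tokens[-1]
}

# Usage
args <- cmd_to_arglist("raxmlHPC -p 12345 -m MULTIGAMMA -s multiState.phy -K ORDERED -n T12")
print(args)
```

The resulting vector could be passed directly as raxml(arglist = args). Commands that use a trailing backslash to continue onto the next line would need to be joined into one line first.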

Links

Find out more by visiting RAxML’s homepage.

Please cite

  • A. Stamatakis: “RAxML Version 8: A tool for Phylogenetic Analysis and Post-Analysis of Large Phylogenies”. In Bioinformatics, 2014, open access.
  • Bennett et al. (2020). outsider: Install and run programs, outside of R, inside of R. Journal of Open Source Software, In review

An outsider module

Learn more at the outsider website. Want to build your own module? Check out the outsider.devtools website.
