GET_PHYLOMARKERS_v2.2.0_2024-04-14

@vinuesa vinuesa released this 15 Apr 01:56

This major release (v2.2.0, 2024-04-14) contains new features, significant code improvements, binary updates, and some bug fixes.

New features

  1. Significant extension of run mode 2 (run_get_phylomarkers_pipeline.sh -R 2) for population genetics (multiple sequences from the same species):
    • the FASTA files of non-recombinant, kdetrees-compliant, and neutral loci are saved in their own directory and concatenated
    • SNP sites are extracted from the concatenated alignment with snp-sites and saved in FASTA and VCF formats
    • ML trees are estimated from the SNP supermatrix using either IQ-TREE or FastTree
  2. run_get_phylomarkers_pipeline.sh now also calls the C binary WEIGHTED-ASTRAL to estimate a species tree using as input the filtered gene trees estimated by iqtree2 or FastTree from the core-genome clusters computed by get_homologues.
    • run_ASTRAL computes the best-fitting model from protein or DNA concatenated alignments using IQ-TREE, with the wASTRAL species tree as a constraint
    • run_ASTRAL calls compute_ASTRALspTree_branch_lenghts to estimate branch lengths from protein or DNA concatenated alignments using IQ-TREE, with the wASTRAL species tree as a constraint
    • run_ASTRAL now calls both astral4 and wASTRAL from the ASTER package, and uses the wASTRAL species tree for the downstream analyses listed above
  3. The main script run_get_phylomarkers_pipeline.sh:
    • Adds complex protein mixture models for concatenated protein alignments
    • Prints the script invocation arguments to STDOUT at the beginning of the run
    • Collects the results of the different filtering steps and prints an overview of the pipeline's filtering process before exiting
    • Is fully shellcheck-compliant
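
The invocation echo and the filtering overview described in item 3 could look roughly like the sketch below. This is not the code from run_get_phylomarkers_pipeline.sh: the function names, the temporary log file, the step labels, and the locus counts are all hypothetical placeholders chosen for illustration.

```shell
#!/bin/sh
# Illustrative sketch only: names, labels, and counts are hypothetical and
# are NOT taken from run_get_phylomarkers_pipeline.sh.

# Echo the script name and its arguments as they were passed.
print_invocation() {
    printf '# invocation: %s\n' "$*"
}

# Temporary file that accumulates one "step<TAB>loci retained" record per filter.
filter_log=$(mktemp)

# record_filter <step label> <number of loci retained after the step>
record_filter() {
    printf '%s\t%s\n' "$1" "$2" >> "$filter_log"
}

# Print the collected records as an aligned two-column overview.
print_filter_overview() {
    printf '%-28s %s\n' 'filtering step' 'loci retained'
    awk -F '\t' '{ printf "%-28s %d\n", $1, $2 }' "$filter_log"
}

# Demo with made-up counts.
print_invocation run_get_phylomarkers_pipeline.sh -R 2
record_filter 'input clusters' 120
record_filter 'non-recombinant' 110
record_filter 'kdetrees-compliant' 101
print_filter_overview
```

Appending each record as soon as a filter finishes means the overview can still be printed from whatever was collected if a later step aborts.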

Updated and new external programs

  • A static binary of ASTRAL-IV is used to estimate the concatenation-free species tree
  • A static binary of WEIGHTED-ASTRAL is used to estimate the concatenation-free species tree
  • snp-sites is now used under run mode 2 (run_get_phylomarkers_pipeline.sh -R 2)
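
As a rough illustration of the SNP extraction step in run mode 2, the sketch below calls snp-sites twice on a concatenated alignment, once for multi-FASTA output (-m) and once for VCF output (-v). The function names, the file names, and the DRY_RUN switch are assumptions made for this example; the pipeline's own code may invoke snp-sites differently.

```shell
#!/bin/sh
# Hypothetical sketch: file names, function names, and the DRY_RUN switch are
# illustrative assumptions, not the pipeline's actual code.

# Run a command, or only print it when DRY_RUN is set to a non-empty value.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# extract_snps <concatenated alignment (FASTA)> <output prefix>
# Writes the variable sites as multi-FASTA (-m) and as VCF (-v).
extract_snps() {
    run snp-sites -m -o "$2.snps.fasta" "$1" &&
    run snp-sites -v -o "$2.snps.vcf" "$1"
}

# Demo: print the two commands without requiring snp-sites to be installed.
DRY_RUN=1
extract_snps concat_alignment.fasta concat_alignment
```

Leaving DRY_RUN unset runs the same two commands for real, which requires snp-sites to be on the PATH.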

Updated scripts, library code, and test files

  • run_get_phylomarkers_pipeline.sh
  • run_test_suite.sh
  • lib/get_phylomarkers_fun_lib
  • install_R_deps.R omits installing R packages that are not used anymore
  • test_get_phylomarkers.t now runs 24 tests

Docker image

The distribution contains the Dockerfile used to build the Docker image, which is ready to pull from Docker Hub. On Docker Hub, you will find detailed instructions on installing and configuring the Docker client on your machine, pulling the latest image, and running the containerized instance of the GET_PHYLOMARKERS pipeline.