PlantTribes is a collection of automated gene family analysis pipelines for comparative plant genomics

PlantTribes

Overview

PlantTribes is a collection of automated modular analysis pipelines that utilize objective classifications of complete protein sequences from sequenced plant genomes to perform comparative evolutionary studies. It post-processes de novo assembly transcripts into putative coding sequences and their corresponding amino acid translations, locally assembles targeted gene families, estimates paralogous/orthologous pairwise synonymous/non-synonymous substitution rates for a set of gene sequences, classifies gene sequences into pre-computed orthologous plant gene family clusters, and builds gene family multiple sequence alignments and their corresponding phylogenies.

Please submit all questions, inquiries, and bug reports using the PlantTribes repository issues tab.

In addition to this README file, you can consult the PlantTribes manual for more detailed information.

Installation

PlantTribes pipeline scripts have many external dependencies that need to be installed and available on the environment's $PATH before the pipelines can be used.
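As a rough way to confirm that dependencies are visible on $PATH before running a pipeline, a small shell helper along the following lines may be useful. This is a minimal sketch: the function name missing_tools is hypothetical, and the executable names passed to it are assumptions based on the methods mentioned in this README; the authoritative list is in the pipeline dependencies documentation.

```shell
#!/usr/bin/env bash
# Print the name of every given executable that is NOT found on $PATH.
# (missing_tools is a hypothetical helper, not part of PlantTribes.)
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# The executable names below are assumptions; replace them with the
# dependencies required by the pipelines you plan to run.
missing_tools blastp mafft estscan
```

An empty output means every listed executable was found; otherwise each missing name is printed on its own line.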

Pipeline dependencies

PlantTribes scaffolds datasets

PlantTribes gene family scaffolds download website

Install PlantTribes

  1. Open a terminal and change to the location where you would like to keep PlantTribes.
  • Example: cd ~/softwares
  2. Clone the PlantTribes GitHub repository, or download the zip archive and decompress it in your desired location.
  • Examples: git clone https://github.com/dePamphilis/PlantTribes.git, or download https://github.com/dePamphilis/PlantTribes/archive/master.zip and run unzip master.zip
  3. Download the scaffold data set(s) that you would like to use into the PlantTribes data subdirectory and decompress them.
  • Examples: cd PlantTribes/data, then md5sum 22Gv1.1.tar.bz2 (the output should match the MD5 checksum provided for the data archive), followed by tar -xjvf 22Gv1.1.tar.bz2
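The checksum comparison in the last step can be automated so that the archive is only unpacked when the digest matches. The following is a minimal sketch, assuming md5sum and awk are available; verify_md5 is a hypothetical helper name, and the expected digest must be the value published alongside the scaffold archive.

```shell
#!/usr/bin/env bash
# Succeed only if the MD5 digest of the file equals the expected value.
# (verify_md5 is a hypothetical helper, not part of PlantTribes.)
verify_md5() {
  local file="$1" expected="$2" actual
  actual="$(md5sum "$file" | awk '{print $1}')"
  [ "$actual" = "$expected" ]
}

# Usage, with a placeholder for the published checksum:
#   verify_md5 22Gv1.1.tar.bz2 "<published-md5>" && tar -xjvf 22Gv1.1.tar.bz2
```

Chaining with && means tar runs only after a successful match, so a corrupted or truncated download is never extracted.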

Using PlantTribes

The executables for the PlantTribes pipelines are in the pipelines subdirectory of the installation. You can either add that directory to your PATH environment variable or execute the pipelines directly from the PlantTribes installation.
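For the first option, the pipelines directory can be prepended to PATH in the current shell session. The sketch below assumes the installation lives under ~/softwares, as in the installation example; PLANTTRIBES_HOME is a hypothetical variable name used only for illustration.

```shell
# Location of the PlantTribes checkout (assumed; adjust to your installation).
PLANTTRIBES_HOME="$HOME/softwares/PlantTribes"

# Make the pipeline executables callable by name in this session.
export PATH="$PLANTTRIBES_HOME/pipelines:$PATH"
```

Adding the same two lines to a shell startup file such as ~/.bashrc would make the change persistent across sessions.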

  • AssemblyPostProcessor pipeline:
    • Display all usage options:
      • PlantTribes/pipelines/AssemblyPostProcessor
    • Basic run using ESTScan prediction method:
      • PlantTribes/pipelines/AssemblyPostProcessor --transcripts transcripts.fasta --prediction_method estscan --score_matrices /path/to/score/matrices/Arabidopsis_thaliana.smat
  • GeneFamilyClassifier pipeline:
    • Display all usage options:
      • PlantTribes/pipelines/GeneFamilyClassifier
    • Basic run using 22Gv1.1 scaffolds, orthomcl clustering method, and blastp classifier:
      • PlantTribes/pipelines/GeneFamilyClassifier --proteins proteins.fasta --scaffold 22Gv1.1 --method orthomcl --classifier blastp
  • PhylogenomicsAnalysis pipeline:
    • Display all usage options:
      • PlantTribes/pipelines/PhylogenomicsAnalysis
    • Basic run using 22Gv1.1 scaffolds, orthomcl clustering method, and RAxML phylogenetic tree inference method:
      • PlantTribes/pipelines/PhylogenomicsAnalysis --orthogroup_faa geneFamilyClassification_dir/orthogroups_fasta --scaffold 22Gv1.1 --method orthomcl --add_alignments --tree_inference raxml
  • GeneFamilyIntegrator pipeline:
    • Display all usage options:
      • PlantTribes/pipelines/GeneFamilyIntegrator
    • Basic run using 22Gv1.1 scaffolds, orthomcl clustering method:
      • PlantTribes/pipelines/GeneFamilyIntegrator --orthogroup_faa geneFamilyClassification_dir/orthogroups_fasta --scaffold 22Gv1.1 --method orthomcl
  • GeneFamilyAligner pipeline:
    • Display all usage options:
      • PlantTribes/pipelines/GeneFamilyAligner
    • Basic run using the mafft alignment method:
      • PlantTribes/pipelines/GeneFamilyAligner --orthogroup_faa integratedGeneFamilies_dir --alignment_method mafft
  • GeneFamilyPhylogenyBuilder pipeline:
    • Display all usage options:
      • PlantTribes/pipelines/GeneFamilyPhylogenyBuilder
    • Basic run using the FastTree phylogenetic tree inference method:
      • PlantTribes/pipelines/GeneFamilyPhylogenyBuilder --orthogroup_aln geneFamilyAlignments_dir/orthogroups_aln --tree_inference fasttree
  • KaKsAnalysis pipeline:
    • Display all usage options:
      • PlantTribes/pipelines/KaKsAnalysis
    • Basic run for paralogous analysis:
      • PlantTribes/pipelines/KaKsAnalysis --coding_sequences_species_1 species1.fna --proteins_species_1 species1.faa --comparison paralogs --num_threads 4
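The input paths in the basic runs above suggest that several pipelines can be chained, with the output directory of one step feeding the next. The sketch below strings four of the example commands together behind a dry-run switch. The ordering is inferred from the directory names in the examples rather than stated by this README, the helper names run and classify_to_trees are hypothetical, and the pipelines are assumed to be on PATH.

```shell
#!/usr/bin/env bash
# DRY_RUN=1 (the default here) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

# Print or execute a command, depending on DRY_RUN.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Classify proteins, integrate them with scaffold gene families, align,
# and build trees. Each step runs only if the previous one succeeded.
classify_to_trees() {
  proteins="$1"
  run GeneFamilyClassifier --proteins "$proteins" --scaffold 22Gv1.1 --method orthomcl --classifier blastp &&
  run GeneFamilyIntegrator --orthogroup_faa geneFamilyClassification_dir/orthogroups_fasta --scaffold 22Gv1.1 --method orthomcl &&
  run GeneFamilyAligner --orthogroup_faa integratedGeneFamilies_dir --alignment_method mafft &&
  run GeneFamilyPhylogenyBuilder --orthogroup_aln geneFamilyAlignments_dir/orthogroups_aln --tree_inference fasttree
}

classify_to_trees proteins.fasta
```

Reviewing the printed commands first, then rerunning with DRY_RUN=0, is a cheap way to catch a wrong path before a long analysis starts.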

Please consult the PlantTribes manual and tutorial for a detailed description of all pipeline options and their usage.

License

PlantTribes is distributed under the GNU GPL v3. For more information, see license.
