A Phylogenomic protein function prediction method based on weighting evolutionary distances

Phylogenomic Protein Function Prediction

This project aims to implement a full phylogenomic pipeline that serves as a tool for transmembrane helix (TMH) prediction. A target sequence and its neighboring homologs are annotated with TMHMM, and a maximum likelihood tree built with RAxML is used to determine a contribution score for each sequence. See the Presentation files for a brief overview of core functionality.

Implementing a full phylogenomic pipeline

Given an input sequence, this software gathers the 100 closest homologs, generates a multiple sequence alignment, masks that alignment, and then uses RAxML's maximum likelihood estimator to generate a phylogenetic tree.
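As an illustration of the masking step only, here is a minimal sketch that drops alignment columns dominated by gaps. The function name mask_alignment, the max_gap_fraction parameter, and the 50% threshold are assumptions made for this example; the README does not say which masking rule or tool the pipeline actually uses.

```python
def mask_alignment(seqs, max_gap_fraction=0.5):
    """Drop alignment columns whose gap fraction exceeds max_gap_fraction.

    seqs maps sequence names to aligned strings of equal length.
    The threshold is an illustrative assumption, not the project's setting.
    """
    if not seqs:
        return {}
    lengths = {len(s) for s in seqs.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    rows = list(seqs.values())
    n_rows = len(rows)
    # Keep a column only if the share of gap characters is small enough.
    keep = [
        col for col in range(n_cols)
        if sum(row[col] == "-" for row in rows) / n_rows <= max_gap_fraction
    ]
    return {name: "".join(seq[c] for c in keep) for name, seq in seqs.items()}


# Toy alignment: the third column is a gap in two of three sequences.
alignment = {
    "target": "MK-LV",
    "hit_1": "MR-LV",
    "hit_2": "MKALI",
}
masked = mask_alignment(alignment)
```

In this toy alignment the third column is removed because two of the three sequences have a gap there, so every masked sequence is four residues long. The masked alignment is what would then be handed to the tree-building step.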

Annotation Transfer Protocol

This software transfers TMH annotations using a protocol based on evolutionary distances between proteins. That is, the more closely related a hit is to the target protein, the more weight its annotation at a particular site carries in determining the target's true annotation.
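The idea of distance-weighted transfer can be sketched as a per-site weighted vote. Everything specific here is an assumption made for illustration: the inverse-distance weighting, the 0.5 decision threshold, the function name transfer_annotations, and the encoding of annotations as aligned strings in which "M" marks a membrane residue. The project's actual contribution scores may be computed differently.

```python
def transfer_annotations(distances, annotations, threshold=0.5):
    """Combine homolog TMH labels into a per-site prediction for the target.

    distances maps each homolog to its evolutionary distance from the target.
    annotations maps each homolog to an aligned label string ("M" = membrane).
    Closer homologs get larger weights (assumed here to be 1 / distance).
    """
    names = list(annotations)
    if not names:
        raise ValueError("at least one annotated homolog is required")
    n_sites = len(annotations[names[0]])
    if any(len(annotations[n]) != n_sites for n in names):
        raise ValueError("all annotation strings must have the same length")
    # A small constant avoids division by zero for identical sequences.
    weights = {n: 1.0 / (distances[n] + 1e-6) for n in names}
    total = sum(weights.values())
    scores = []
    for site in range(n_sites):
        tm_weight = sum(weights[n] for n in names if annotations[n][site] == "M")
        scores.append(tm_weight / total)
    predicted = "".join("M" if s >= threshold else "-" for s in scores)
    return scores, predicted


# Hypothetical distances and labels for three homologs.
distances = {"hit_1": 0.1, "hit_2": 0.4, "hit_3": 1.0}
annotations = {"hit_1": "MM--", "hit_2": "M-M-", "hit_3": "--MM"}
scores, predicted = transfer_annotations(distances, annotations)
```

With these toy values the closest homolog, hit_1, dominates the vote, so the predicted labels follow its annotation even where the two more distant hits disagree.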
