No description, website, or topics provided.
Branch: master
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
comparisonFastaFiles
smallTest Improved ease of obtaining ortholog groups Aug 22, 2018
README Removed License Sep 20, 2018
README_OTHER_PROGRAMS Removed License Sep 20, 2018
README_RUN_MULTIPLE_SPECIES Removed License Sep 20, 2018
README_WRAPPER Removed License Sep 20, 2018
combineOrthoGroups.py Revised directory structure and README Jan 5, 2018
getNoException.py Revised directory structure and README Jan 5, 2018
gff3_parser.py Added increased functionality to gff3_parser.py to support different … Jul 2, 2018
justOrthologs.py Removed getCor.pyx Oct 18, 2018
run_multiple_species.sh Increased flexibility for file extensions Aug 22, 2018
sortFastaBySeqLen.sh Revised directory structure and README Jan 5, 2018
wrapper.py Improved ease of obtaining ortholog groups Aug 22, 2018

README

##########################
#     JustOrthologs      #
##########################

JustOrthologs
Created By: Justin Miller, Brandon Pickett, Perry Ridge
Email: jmiller@byu.edu

##########################

ARGUMENT OPTIONS:

 -h, --help    							show this help message and exit
 -q	Query Fasta File					A fasta file that has been divided into CDS regions and sorted based on the number of CDS regions.
 -s	Subject Fasta File					A fasta file that has been divided into CDS regions and sorted based on the number of CDS regions.
 -o	Output File							A path to an output file that will be created from this program.
 -t	Number of Cores						The number of Cores available 
 -d	For More Distantly Related Species	A flag that implements the -d algorithm, which gives better accuracy for more distantly related species.
 -c	Combine Both Algorithms				Combines both the normal and -d algorithms for the best overall accuracy.
 -r Correlation value					An optional value for changing the required correlation value between orthologs.

##########################

REQUIREMENTS:

JustOrthologs uses Python version 2.7

Python libraries that must be installed include:
1. sys
2. os
3. multiprocessing
4. argparse

If any of those libraries is not currently in your Python Path, use the following command:
pip install --user [library_name]
to install the library in your path.

##########################

Input Files:
This algorithm requires two fasta files that have header/sequence alternating lines with an asterisk (*) 
between every CDS region and at the end of each sequence. To create this file, look at the example files in
the directory smallTest, or use the included wrapper to extract all CDS regions from a reference genome and 
a gff3 file.

##########################

USAGE:

Typically, the -c option should be used to get the best overall precision and recall.

The default number of threads is the number of threads available. If you want to change that, use the -t option.

The query, subject, and output files  must always be supplied 

Example usage:

python justOrthologs.py -q smallTest/orthologTest/bonobo.fa -s smallTest/orthologTest/human.fa -o output -c -t 16

Running the above command will produce a single output file called output in the current directory. For us, this test took
approximately 4 seconds of real time and 41 seconds of user time.

##########################


Thank you, and happy researching!!