Skip to content

ridgelab/JustOrthologs

Repository files navigation

##########################
#     JustOrthologs      #
##########################

JustOrthologs
Created By: Justin Miller, Brandon Pickett, Perry Ridge
Email: justin.miller@uky.edu

##########################

ARGUMENT OPTIONS:

 -h, --help    							show this help message and exit
 -q	Query Fasta File					A fasta file that has been divided into CDS regions and sorted based on the number of CDS regions.
 -s	Subject Fasta File					A fasta file that has been divided into CDS regions and sorted based on the number of CDS regions.
 -o	Output File							A path to an output file that will be created from this program.
 -t	Number of Cores						The number of Cores available 
 -d	For More Distantly Related Species	A flag that implements the -d algorithm, which gives better accuracy for more distantly related species.
 -c	Combine Both Algorithms				Combines both the normal and -d algorithms for the best overall accuracy.
 -r Correlation value					An optional value for changing the required correlation value between orthologs.

##########################

REQUIREMENTS:

JustOrthologs uses Python version 2.7

Python libraries that must be installed include:
1. sys
2. os
3. multiprocessing
4. argparse

If any of those libraries is not currently in your Python Path, use the following command:
pip install --user [library_name]
to install the library in your path.

##########################

Input Files:
This algorithm requires two fasta files that have header/sequence alternating lines with an asterisk (*) 
between every CDS region and at the end of each sequence. To create this file, look at the example files in
the directory smallTest, or use the included wrapper to extract all CDS regions from a reference genome and 
a gff3 file.

##########################

USAGE:

Typically, the -c option should be used to get the best overall precision and recall.

The default number of threads is the number of threads available. If you want to change that, use the -t option.

The query, subject, and output files  must always be supplied 

Example usage:

python justOrthologs.py -q smallTest/orthologTest/bonobo.fa -s smallTest/orthologTest/human.fa -o output -c -t 16

Running the above command will produce a single output file called output in the current directory. For us, this test took
approximately 4 seconds of real time and 41 seconds of user time.

##########################


Thank you, and happy researching!!

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published