oacar/SynORFan

SynORFan

This package finds overlapping open reading frames (ORFs) in a multiple sequence alignment (MSA), given a reference alignment. The two main scripts, bioconductor.py and analysis.py, are both designed to be used as CLI programs.

python bioconductor.py --help

prints the options and input files the program needs:

usage: bioconductor.py [-h] -p PATH -n ORF_NAME [-a] -y YEAST [-alg ALGORITHM]

optional arguments:
  -h, --help      show this help message and exit
  -p PATH         Directory path for alignment and output folder
  -n ORF_NAME     ORF name for output names
  -a              Is the sequence is annotated?
  -y YEAST        Fasta file containing dna sequence for annotated yeast genes
  -alg ALGORITHM  Select alignment algorithm. Default is mafft

Example Usage:

python bioconductor.py -p input_data/ -n YBR196C-A -y input_data/orf_genomic_all.fasta -a
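
If you want to launch the same run from another Python script (for example, to loop over several ORF names), the command line can be assembled programmatically. The following is a minimal sketch, not part of this package: `build_command` and its parameter names are hypothetical, and only the flags shown in the help output above are used.

```python
import sys


def build_command(path, orf_name, yeast_fasta, annotated=False,
                  algorithm=None, script="bioconductor.py"):
    """Build an argument list for bioconductor.py (hypothetical helper).

    Only the flags documented in the --help output are used:
    -p, -n, -y, -a and -alg.
    """
    cmd = [sys.executable, script,
           "-p", str(path),
           "-n", orf_name,
           "-y", str(yeast_fasta)]
    if annotated:
        cmd.append("-a")              # flag only, takes no value
    if algorithm is not None:
        cmd += ["-alg", algorithm]    # omitted: the script uses its default (mafft)
    return cmd


if __name__ == "__main__":
    cmd = build_command("input_data/", "YBR196C-A",
                        "input_data/orf_genomic_all.fasta", annotated=True)
    print(" ".join(cmd))
    # To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when paths contain spaces.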

Requirements:

Python requirements are listed in the requirements.txt file. You also need mafft on your system path and a tmp folder in your home folder (i.e. $HOME/tmp/ must exist). When -a is specified, the -y argument needs the orf_genomic_all.fasta file, which can be downloaded from SGD for yeast and is used to get the sequence; otherwise, the ORF sequence can be given directly to -y.
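
The two non-Python requirements can be checked before a run. The following is a minimal sketch using only the standard library; `check_environment` is a hypothetical helper, not something this package provides, and it only tests that a `mafft` executable is found on the path and that `$HOME/tmp/` exists.

```python
import shutil
from pathlib import Path


def check_environment(home=None):
    """Return a list of problems with the runtime environment.

    An empty list means both requirements from the README are met.
    `home` can be overridden for testing; by default the current
    user's home folder is used.
    """
    problems = []

    # mafft must be discoverable on the system path
    if shutil.which("mafft") is None:
        problems.append("mafft not found on PATH")

    # $HOME/tmp/ must exist
    home_dir = Path(home) if home is not None else Path.home()
    tmp_dir = home_dir / "tmp"
    if not tmp_dir.is_dir():
        problems.append(f"{tmp_dir} does not exist")

    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("Missing requirement:", problem)
```

Running the file directly prints one line per missing requirement and nothing if both are satisfied.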

About

Align and subset genomic sequences based on ORF positions
