madmaze/gpuFFTMSA
gpuFFTMSA

GPU-accelerated FFT-based multiple sequence aligner.

Requirements:

For the CPU portion of the code to work, only Python, NumPy, and pylab (for debugging) need to be installed.

For the GPU portion, the above must be available, as well as PyCUDA and pyFFT.CUDA.
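
As a quick check before running either code path, the following sketch reports which of these packages cannot be found in the current environment. The top-level module names (`numpy`, `pylab`, `pycuda`, `pyfft`) are assumptions about how the packages are imported, and `missing` is a hypothetical helper that is not part of this repository.

```python
import importlib.util

# Assumed top-level import names for the packages listed above.
CPU_MODULES = ("numpy", "pylab")
GPU_MODULES = ("pycuda", "pyfft")


def missing(modules):
    """Return the names in `modules` that cannot be located for import."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("missing CPU dependencies:", missing(CPU_MODULES) or "none")
    print("missing GPU dependencies:", missing(GPU_MODULES) or "none")
```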

Usage:

  • If pyFFTalign.py is run without arguments, it uses the default input files in ./data.
usage: pyFFTalign.py [-h] [-i INPUT_GENOME] [-s INPUT_SEQS]
                     [--logFile LOGFILE] [-l LOGLEVEL] [-g] [-e] [--verify]

given a directory of input genomes and sequences it will try to match up each
sequence to its genome

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT_GENOME, --inputgenome INPUT_GENOME
                        Input genome file or dir of fna files (Default:
                        ./data/sampleGenome.fna)
  -s INPUT_SEQS, --inputseqs INPUT_SEQS
                        Input sequence file (Default: ./data/sampleGenome.seq)
  --logFile LOGFILE     output to log file (Default: False)
  -l LOGLEVEL, --log LOGLEVEL
                        Log level, use DEBUG of more output (Default: INFO)
  -g, --usegpu          gpu option (Default: False)
  -e, --chopefficient   chop efficiently (Default: False)
  --verify              verify successfull transcription (Default: False)
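
For readers who want to see how the options above fit together, here is a minimal `argparse` sketch that produces a similar interface. It is reconstructed from the help text only and is not the repository's actual parser; `build_parser` and the `dest` names are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="pyFFTalign.py",
        description="given a directory of input genomes and sequences it "
                    "will try to match up each sequence to its genome",
    )
    parser.add_argument("-i", "--inputgenome", dest="input_genome",
                        default="./data/sampleGenome.fna",
                        help="Input genome file or dir of fna files")
    parser.add_argument("-s", "--inputseqs", dest="input_seqs",
                        default="./data/sampleGenome.seq",
                        help="Input sequence file")
    parser.add_argument("--logFile", default=False, help="output to log file")
    parser.add_argument("-l", "--log", dest="loglevel", default="INFO",
                        help="Log level")
    parser.add_argument("-g", "--usegpu", action="store_true",
                        help="gpu option")
    parser.add_argument("-e", "--chopefficient", action="store_true",
                        help="chop efficiently")
    parser.add_argument("--verify", action="store_true",
                        help="verify successful transcription")
    return parser


if __name__ == "__main__":
    # Example: request the GPU path with debug logging.
    print(build_parser().parse_args(["-g", "-l", "DEBUG"]))
```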

Note: input files are expected in the following formats:

*.fna => INPUT_GENOME:
>some title
ATATTTTTTCTTGTTTTTTATATCCACAAACTCTTTTCGTACTTTTACACAGTATATCGTGTTGTGGACA
ATTTTATTCCACAAGGTATTGATTTTGTGGATAACTTTCTTAATTTCATTGCTATAGCTACTTTTTTTTG
ATATTATAGTTGTGTTTTCACTTTGAATAAGTTTTCCACATCTTTATCTTATCCACAATTTGTGTATAAC
ATGTGGACAGTTTTAATCACATGTGGGTAAATGATTATCCACATTTGCTTTTTTGTCGAAAACCCTATCT

*.seq => INPUT_SEQS:
some title|ATATTATAGTTGTGTTTTCACTTTGAATAAGTTTTCCACATCTTTATCTTATCCACAATTTGTGTATAAC
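
The two formats can be read with a few lines of Python. The sketch below assumes one genome record per *.fna file and one `title|sequence` pair per line in a *.seq file; `parse_fna` and `parse_seqs` are hypothetical helpers, not functions from this repository.

```python
def parse_fna(text):
    """Return (title, sequence) from FASTA-style text with a single record."""
    title = None
    parts = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            title = line[1:].strip()  # header line: ">some title"
        else:
            parts.append(line)        # sequence lines are concatenated
    return title, "".join(parts)


def parse_seqs(text):
    """Return a list of (title, sequence) pairs from 'title|SEQUENCE' lines."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        title, _, sequence = line.partition("|")
        pairs.append((title, sequence))
    return pairs


if __name__ == "__main__":
    title, genome = parse_fna(">some title\nACGT\nTTGA\n")
    print(title, genome)
    print(parse_seqs("some title|TTGA\n"))
```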

Code layout:

  • pyFFTalign.py
    • Provides basic command-line parsing and wraps the other two modules
    • Reads in the genome (long sequence)
    • Reads in the sequences (shorter sequences)
  • dataObj.py
    • Data container for raw and transcribed sequences
    • Transcribes sequences on object creation
    • Provides methods for returning padded raw and transcribed sequences
  • aligner.py
    • CPU and GPU correlation functions
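
The general idea behind FFT-based alignment can be illustrated on the CPU with NumPy alone. The sketch below is not the code in aligner.py: it assumes a one-hot encoding per base (the transcription used by dataObj.py may differ), zero-pads both sequences to a common power-of-two length, and takes the offset with the highest cross-correlation score. `one_hot` and `best_offset` are hypothetical names.

```python
import numpy as np


def one_hot(seq, length):
    """Encode a DNA string as a 4 x length array, zero-padded on the right."""
    out = np.zeros((4, length))
    for i, base in enumerate(seq.upper()):
        row = "ACGT".find(base)
        if row >= 0:  # unknown symbols (e.g. N) contribute nothing
            out[row, i] = 1.0
    return out


def best_offset(genome, read):
    """Return (offset, score): where `read` matches `genome` best.

    The score is the number of matching bases at that offset.
    """
    n = len(genome) + len(read) - 1
    size = 1 << (n - 1).bit_length()  # next power of two >= n
    G = np.fft.rfft(one_hot(genome, size), axis=1)
    R = np.fft.rfft(one_hot(read, size), axis=1)
    # Correlation theorem: corr[k] = sum_j genome[j + k] * read[j]
    corr = np.fft.irfft(G * np.conj(R), n=size, axis=1).sum(axis=0)
    valid = corr[: len(genome) - len(read) + 1]  # read fully inside genome
    k = int(np.argmax(valid))
    return k, int(round(valid[k]))


if __name__ == "__main__":
    print(best_offset("ACGTTGCAAGGCTTAACG", "GGCTTA"))
```

Summing the per-base correlations counts matching positions at each offset, so an exact match scores the full read length.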

More information:
