EAGLE


EAGLE is a program to map minimal Relative Absent Words (mRAWs). It identifies and localizes the mRAWs across a range of k-mer sizes, runs from the command line, and uses multiple threads to reduce computation time. It includes extensions to estimate GC distributions and to generate plots automatically (Gnuplot). It works on FASTA data without size limitations.
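To make the notion of an mRAW concrete, here is a toy sketch, not EAGLE's implementation. It assumes the usual definition: a word of size k is a relative absent word if it occurs in the target but not in the reference, and it is minimal if both of its sub-words of size k-1 do occur in the reference. The function name `toy_mraws` is hypothetical, the inputs are raw sequences rather than FASTA files, and inverted repeats are ignored.

```shell
# Toy illustration of the assumed mRAW definition (not EAGLE's algorithm).
# Prints the 0-based position and the word for every minimal relative
# absent word of size k found in the target sequence.
toy_mraws() {
  # $1 = reference sequence, $2 = target sequence, $3 = k
  awk -v ref="$1" -v tgt="$2" -v k="$3" 'BEGIN {
    for (i = 1; i + k - 1 <= length(tgt); i++) {
      w = substr(tgt, i, k)
      if (index(ref, w) > 0) continue        # word occurs in the reference: not absent
      left  = substr(w, 1, k - 1)            # word without its last symbol
      right = substr(w, 2, k - 1)            # word without its first symbol
      if (index(ref, left) > 0 && index(ref, right) > 0)
        printf "%d\t%s\n", i - 1, w          # minimal: both sub-words occur in the reference
    }
  }'
}

# "TT" is absent from the reference while "T" is present, so it is minimal.
toy_mraws ACGTAC CGTT 2   # prints: 2<TAB>TT
```

With k = 3 the same inputs print nothing: "GTT" is absent from the reference, but its sub-word "TT" is absent as well, so "GTT" is not minimal.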

Installation

CMake must be installed to compile EAGLE. CMake can be downloaded from the CMake webpage (http://www.cmake.org/) or installed through an appropriate package manager. The following instructions show how to download and compile EAGLE manually:

git clone https://github.com/pratas/eagle.git
cd eagle/src/
cmake .
make

The external, complementary dependencies used to download, align, and visualize the data are installed through conda.

Steps to install conda:

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh

Additional instructions can be found here:

https://docs.conda.io/projects/conda/en/latest/user-guide/install/linux.html

To install the dependencies using conda:

conda install -c cobilab gto --yes
conda install -c bioconda tabix --yes
conda install -c bioconda bowtie2 --yes
conda install -c bioconda samtools --yes
conda install -c bioconda entrez-direct --yes
conda install -c bioconda/label/cf201901 entrez-direct --yes
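After installing, it can be useful to confirm that the tools are reachable from the current shell. The sketch below assumes the usual executable names (`tabix`, `bowtie2`, `samtools`, and `efetch` from entrez-direct); the helper name `missing_tools` is hypothetical, and the gto package is left out because it installs several separately named programs.

```shell
# Print the name of every given command that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# An empty result means all four executables were found.
missing_tools tabix bowtie2 samtools efetch
```

If a name is printed, activating the conda environment in which the packages were installed is the first thing to check.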

Run EAGLE

Run EAGLE using:

./EAGLE -v -t -min 11 -max 14 -p -r Human.fna SARS-CoV-2.fa
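For repeated runs, a small wrapper can catch obvious mistakes before EAGLE starts. The following is a sketch under stated assumptions: the function name `run_eagle` and the `EAGLE_BIN` variable are hypothetical, and it passes only the `-v`, `-min` and `-max` options, mirroring the SAMPLE line of the help menu, with the reference file first and the target file second.

```shell
# Hypothetical wrapper (not part of EAGLE): validate the arguments, then run.
# EAGLE_BIN is an assumed override for the path to the executable.
run_eagle() {
  # $1 = minimum k, $2 = maximum k, $3 = reference FASTA, $4 = target FASTA
  if [ "$#" -ne 4 ]; then
    echo "usage: run_eagle MIN MAX REFERENCE.fa TARGET.fa" >&2
    return 2
  fi
  case "$1$2" in
    *[!0-9]*) echo "error: MIN and MAX must be non-negative integers" >&2; return 2 ;;
  esac
  if [ "$1" -gt "$2" ]; then
    echo "error: minimum k ($1) exceeds maximum k ($2)" >&2
    return 2
  fi
  for f in "$3" "$4"; do
    if [ ! -r "$f" ]; then
      echo "error: cannot read FASTA file: $f" >&2
      return 2
    fi
  done
  # Reference first, target second, as in the SYNOPSIS of the help menu.
  "${EAGLE_BIN:-./EAGLE}" -v -min "$1" -max "$2" "$3" "$4"
}
```

A call such as `run_eagle 11 14 Human.fna SARS-CoV-2.fa` then fails early with a message if either file is unreadable or the k range is inverted.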

Parameters

To see the available options, type:

./EAGLE

or

./EAGLE -h

These will print the following options:


NAME                                                                    
      EAGLE v2.3 2015-2020                                            
      Efficient computation of minimal Relative Absent                  
      Words (mRAWs) and its associated GC distributions,                
      profiles, and patterns.                                           
                                                                        
AUTHORS                                                                 
      D. Pratas and J. M. Silva.                                        
                                                                        
SYNOPSIS                                                                
      ./EAGLE [OPTION]... [FILE] [FILE]                                 
                                                                        
SAMPLE                                                                  
      Run: ./EAGLE -v -min 11 -max 16 human.fa SARS-CoV2.fa             
                                                                        
DESCRIPTION                                                             
      Localization and quantification of minimal Relative               
      Absent Words (mRAWs) and GC associated measures                   
                                                                        
      -h,  --help                                                       
           usage guide (help menu)                                      
                                                                        
      -V,  --version                                                    
           display program and version information                      
                                                                        
      -f,  --force                                                      
           force mode. Overwrites old files                             
                                                                        
      -v,  --verbose                                                    
           verbose mode (more information)                              
                                                                        
      -vv, --very-verbose                                               
           very verbose mode (much more information)                    
                                                                        
      -t,  --threads                                                    
           does NOT use threads if flag is set (slower)                 
                                                                        
      -i,  --ignore-ir                                                  
           does NOT use inverted repeats if flag is set                 
                                                                        
      -c,  --ignore-profiles                                            
           does NOT compute GC profiles                                 
                                                                        
      -o,  --stdout                                                     
           write overall statistics to standard output                  
                                                                        
      -p,  --plots                                                      
           print Shell code to generate plots (gnuplot)                 
                                                                        
      -min [NUMBER],  --minimum [NUMBER]                                
           k-mer minimum size (usually 10)                              
                                                                        
      -max [NUMBER],  --maximum [NUMBER]                                
           k-mer maximum size (usually 16)                              
                                                                        
      [FILE]                                                            
           Input FASTA reference (e.g. human) -- MANDATORY.             
           This content will be loaded in the models.                   
                                                                        
      [FILE]                                                            
           Input FASTA target (e.g. SARS-CoV-2) -- MANDATORY.           
           The mRAWs will be mapped on this content file.               
                                                                        
COPYRIGHT                                                               
      Copyright (C) 2014-2020, IEETA/DETI, University of Aveiro.        
      This is a Free software, under GPLv3. You may redistribute        
      copies of it under the terms of the GNU - General Public          
      License v3 <http://www.gnu.org/licenses/gpl.html>. There          
      is NOT ANY WARRANTY, to the extent permitted by law. 

Citation

Version 2.2:

  • D. Pratas, J. M. Silva. Persistent minimal sequences of SARS-CoV-2. Bioinformatics (2020): btaa686.

Version 1.0:

  • R. M. Silva, D. Pratas, L. Castro, A. J. Pinho & P. J. S. G. Ferreira. Three minimal sequences found in Ebola virus genomes and absent from human DNA. Bioinformatics (2015): btv189.

Issues

For any issue, please let us know through the repository's issue tracker.

License

GPL v3.

For more information:

http://www.gnu.org/licenses/gpl-3.0.html
