# download workflow
git clone
cd detectEVE

# install dependencies via conda or mamba
mamba create -n detectEVE
mamba activate detectEVE
mamba env update --file workflow/envs/env.yaml

# run detectEVE
./detectEVE -h                       # show help
./detectEVE [options] [<in.fa> ...]  # analyze local fasta files
./detectEVE [options] -a acc.csv     # download & analyze NCBI accession table
./detectEVE [options] -A acc,acc     # download & analyze NCBI accession list

# or combine local fasta files and remote accessions
./detectEVE [options] [(-a acc.csv | -A acc,acc)] [<in.fa> ...]

# download and prep databases
./detectEVE --setup-databases [--snake ARGS]

# run example data
cd examples
../detectEVE *.fna

See NCBI SRA WGS for downloadable accessions. Exported csv tables from the Sequence Set Browser can be directly used with detectEVE (-a wgs_selector.csv). Genomes will be downloaded to genomes/<accession>.fna.
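If accessions are kept in a plain text file rather than an exported table, they can be turned into the comma-separated form that -A expects. The following is a minimal sketch, not part of detectEVE: the helper name join_accessions and the file accessions.txt are hypothetical, and the input is assumed to hold one accession per line.

```shell
# Join accession IDs (one per line on stdin) into a comma-separated list.
# Blank lines are dropped so they do not create empty list entries.
join_accessions() {
    grep -v '^[[:space:]]*$' | paste -sd, -
}

# Hypothetical usage: accessions.txt is an assumed file, one accession per line.
# ./detectEVE -A "$(join_accessions < accessions.txt)"
```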

Note, by default, downloaded genomes will be removed again automatically after being scanned. If you want to keep them, add --notemp to --snake-args.

See Advanced database setup for alternatives and customization of databases.

See Known issues for questions, problems or feedback.


The pipeline produces the following final files in results/:

  • <genome_id>-validatEVEs.tsv - best hit of the EVE with evidence and confidence annotation (high confidence: EVE score > 30, low confidence: EVE score > 10)
  • <genome_id>-validatEVEs.fna - validatEVEs nucleotide sequences
  • <genome_id>-validatEVEs.pdf - graphical overview of hit distribution for validatEVEs
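The confidence thresholds above can be expressed as a small helper, for example when post-filtering results by score. This is only a sketch of the stated cutoffs: the function name classify_eve_score is hypothetical, and the "none" label for scores of 10 or below is an assumption, since the text does not say how such hits are labelled.

```shell
# Map an EVE score to a confidence tier using the thresholds listed above:
#   score > 30 -> high, score > 10 -> low.
# The "none" label for anything else is an assumption made for illustration.
classify_eve_score() {
    awk -v s="$1" 'BEGIN {
        s = s + 0                      # force numeric comparison
        if (s > 30)      print "high"
        else if (s > 10) print "low"
        else             print "none"
    }'
}
```

For instance, `classify_eve_score 42.5` would fall into the high-confidence tier, while a score of exactly 30 is treated as low confidence because the cutoff is strict.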


Workflow overview and background

detectEVE is based on the EVE search strategy developed by S. Lequime and previously used in the following publications:

The current workflow involves the following steps:


Advanced database setup

If you need databases in a different location you can adjust db_dir in config.yaml to whatever suits your system.
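To confirm which location is currently configured, the value can be read back from the config file. The sketch below assumes db_dir appears as a simple top-level "db_dir: <path>" line in config.yaml; that layout is an assumption, the helper name get_db_dir is hypothetical, and this is not a full YAML parser.

```shell
# Print the value of a top-level "db_dir:" key from a YAML file.
# Assumes a single-line "db_dir: <path>" entry (layout is an assumption).
# Trailing "# comments", trailing whitespace and surrounding quotes are stripped.
get_db_dir() {
    sed -n \
        -e 's/[[:space:]]*#.*$//' \
        -e 's/[[:space:]]*$//' \
        -e 's/^db_dir:[[:space:]]*//p' "$1" | head -n 1 | tr -d "\"'"
}

# Hypothetical usage:
# ls "$(get_db_dir config.yaml)"
```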

If you prefer to handle downloads manually or use existing files, copy any file you don’t want detectEVE to download automatically into databases/ (or the respective config.yaml/db_dir) before running --setup-databases.

Note though, unless you add --notemp to the --snake-arguments, all but the final diamond-formatted database files will be deleted from databases/ at the end of the setup phase.

cd databases/

# Latest RVDB
url= && 
db=$(curl -fs "$url" | grep -oPm1 'files/U-RVDBv[0-9.]+-prot.fasta.xz')
curl "$url/$db" -o rvdb100.faa.xz

# UniRef50

# NCBI taxonomy
tar -xzf taxdump.tar.gz nodes.dmp names.dmp

Known issues

tidyverse stringi libicui

If you encounter an error related to tidyverse/stringi/libicui, try reinstalling stringi locally. To restart the workflow from where it failed, just run the same command again.

mamba remove r-stringi r-tidyverse
R -e 'install.packages("stringi")'
mamba install r-tidyverse

diamond v2.1.9 send bug

diamond v2.1.9 has a bug: it does not output the send field correctly in long-read mode (--range-culling). Because v2.1.9 is currently the latest diamond release, detectEVE defaults to diamond v2.1.8 to avoid this bug.
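If you manage the diamond installation yourself, a quick check for the affected release might look like the sketch below. The helper name is_affected_diamond is hypothetical, and only v2.1.9 is flagged because that is the only version named here. The commented usage assumes that `diamond version` prints a line of the form "diamond version X.Y.Z".

```shell
# Succeed (exit 0) if the given version string is the affected release.
# Accepts "2.1.9" or "v2.1.9"; other versions are not assessed here.
is_affected_diamond() {
    [ "${1#v}" = "2.1.9" ]
}

# Hypothetical usage, assuming `diamond version` prints "diamond version X.Y.Z":
# installed=$(diamond version | awk '{print $3}')
# if is_affected_diamond "$installed"; then
#     echo "diamond $installed: send may be wrong with --range-culling" >&2
# fi
```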