Installation

Jaime Huerta-Cepas edited this page Sep 20, 2016 · 2 revisions

Requirements

Software Requirements:

  • Python 2.7+
  • wget
  • HMMER 3 and/or DIAMOND binaries (if none are installed, the ones packaged with eggNOG-mapper are used)
  • BioPython (required only if using the --translate option)

Storage Requirements:

  • ~20GB for the eggNOG annotation database

  • ~20GB for eggNOG fasta files

  • ~130GB for the three optimized eggNOG databases (euk, bact, arch), and from 1GB to 35GB for each taxonomy-specific eggNOG HMM database. You don't have to download them all; just pick the ones you are interested in.

(you can check the size of individual datasets at http://beta-eggnogdb.embl.de/download/eggnog_4.5/hmmdb_levels/)
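The figures above can be added up before downloading anything. The following is a minimal sketch, assuming Python 3 (for shutil.disk_usage) and assuming that everything listed is downloaded; the function names required_gb and has_enough_space are hypothetical helpers and are not part of eggNOG-mapper.

```python
import shutil

# Approximate sizes in GB, copied from the storage requirements listed above.
ANNOTATION_DB_GB = 20
FASTA_FILES_GB = 20
OPTIMIZED_HMM_DBS_GB = 130  # euk + bact + arch combined


def required_gb(include_optimized=True, extra_level_dbs_gb=0):
    """Return a rough total (GB) of disk space for the chosen data.

    extra_level_dbs_gb is the summed size of any taxonomy-specific
    HMM databases you plan to fetch (1GB to 35GB each).
    """
    total = ANNOTATION_DB_GB + FASTA_FILES_GB + extra_level_dbs_gb
    if include_optimized:
        total += OPTIMIZED_HMM_DBS_GB
    return total


def has_enough_space(path, needed_gb):
    """Check whether the filesystem holding `path` has `needed_gb` GB free."""
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= needed_gb


if __name__ == "__main__":
    needed = required_gb()
    print("Need ~%d GB; enough space here: %s"
          % (needed, has_enough_space(".", needed)))
```

With the defaults, the estimate is 20 + 20 + 130 = 170GB. The sizes are approximate, so leave some headroom.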

Memory Requirements:

eggNOG-mapper can run very fast searches by loading the target HMM databases into memory (using the HMMER3 hmmpgmd program). This is enabled by the --usemem or --servermode flags, and it requires a lot of RAM (depending on the size of the target database). As a reference:

  • ~90GB to load the optimized eukaryotic database (euk)
  • ~32GB to load the optimized bacterial database (bact)
  • ~10GB to load the optimized archaeal database (arch)

Note:

  • Searches are still possible on low-memory systems. However, disk I/O will be a bottleneck (especially when using multiple CPUs). Place the databases on the fastest disk possible.
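As a rough aid for deciding when in-memory mode is realistic, the reference figures above can be compared with the RAM available on a machine. This is only a sketch: the function name databases_fitting_in_memory is hypothetical, the numbers are the approximate values quoted above, and each database is checked on its own (not all of them loaded at once).

```python
# Approximate RAM (GB) needed to load each optimized database,
# copied from the reference figures above.
MEMORY_GB = {"euk": 90, "bact": 32, "arch": 10}


def databases_fitting_in_memory(available_gb):
    """Return the optimized databases whose footprint fits in available_gb.

    Each database is considered individually; the result does not mean
    that all returned databases fit in memory at the same time.
    """
    return sorted(db for db, gb in MEMORY_GB.items() if gb <= available_gb)
```

For example, on a machine with 64GB of RAM this returns arch and bact, which suggests running euk searches from disk instead.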

Installation

Download

git clone https://github.com/jhcepas/eggnog-mapper.git

eggNOG database retrieval

  • eggNOG-mapper provides 107 taxonomically restricted HMM databases (xxxNOG), three optimized databases [Eukaryota (euk), Bacteria (bact) and Archaea (arch)] and a virus-specific database (viruses).

  • The three optimized databases include all HMM models from their corresponding taxonomic levels in eggNOG (euNOG, bactNOG, arNOG) plus additional models splitting large alignments into taxonomically restricted (smaller) HMM models. In particular, HMM models with more than 500 (euk) or 50 (bact) sequences are expanded. The arch database includes all models from all archaeal taxonomic levels in eggNOG.

  • Taxonomically restricted databases are listed here. They can be referred to by their code (e.g. maNOG for Mammals).

To download a given set of databases, execute the download script, providing the list of databases to fetch:

download_eggnog_data.py euk bact arch viruses

This will fetch and decompress all precomputed eggNOG data into the data/ directory.
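If the download needs to be driven from another Python script (for example, as part of a setup routine), the command above can be wrapped as in the sketch below. The helper names build_download_command and download are hypothetical, and the sketch assumes that download_eggnog_data.py is executable and reachable from the current directory or PATH; it passes only the database names shown above and no other options.

```python
import subprocess


def build_download_command(databases, script="download_eggnog_data.py"):
    """Build the argument list for the download script shown above.

    `databases` is a list of database codes, e.g. ["euk", "bact"].
    """
    if not databases:
        raise ValueError("at least one database name is required")
    return [script] + list(databases)


def download(databases, script="download_eggnog_data.py"):
    """Run the download script; raises CalledProcessError if it fails."""
    subprocess.check_call(build_download_command(databases, script))
```

Calling download(["euk", "bact", "arch", "viruses"]) would then be equivalent to the command line shown above.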
