- Python 2.7+
- HMMER 3 and/or DIAMOND binaries (if they are not available, the ones packaged with eggNOG-mapper are used)
- BioPython (required only if using the
~20GB for the eggNOG annotation database
~20GB for eggNOG fasta files
~130GB for the three optimized eggNOG databases (euk, bact, arch), and from 1GB to 35GB for each taxonomic-specific eggNOG HMM database. You don't have to download them all; just pick the ones you are interested in.
(you can check the size of individual datasets at http://beta-eggnogdb.embl.de/download/eggnog_4.5/hmmdb_levels/)
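As a rough aid for planning disk space, the figures above can be summed and compared against the free space on the target disk. This is only a sketch, assuming Python 3: the component names (`annotation_db`, `fasta`, `optimized_hmm`) and the helper functions are hypothetical labels introduced here, not part of eggnog-mapper, and the sizes are the approximate values quoted above.

```python
import shutil

# Approximate sizes in GB, taken from the figures quoted above.
# The dictionary keys are illustrative labels, not eggnog-mapper names.
APPROX_SIZE_GB = {
    "annotation_db": 20,   # eggNOG annotation database
    "fasta": 20,           # eggNOG fasta files
    "optimized_hmm": 130,  # euk + bact + arch optimized databases together
}


def required_gb(components):
    """Sum the approximate disk footprint of the chosen components."""
    return sum(APPROX_SIZE_GB[name] for name in components)


def has_enough_space(path, components):
    """Return True if `path` has at least the estimated free space."""
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= required_gb(components)


if __name__ == "__main__":
    wanted = ["annotation_db", "fasta", "optimized_hmm"]
    print("Estimated space needed: ~%d GB" % required_gb(wanted))
    print("Enough free space here:", has_enough_space(".", wanted))
```

Taxonomic-specific HMM databases are not included in this estimate, since their sizes vary from 1GB to 35GB each.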
eggnog-mapper can run very fast searches by loading the target HMM
databases into memory (using the HMMER3 hmmpgmd program). This is enabled when
using of the
--servermode flags, and it requires a lot of
RAM (depending on the size of the target database). As a reference:
- ~90GB to load the optimized eukaryotic databases (
- ~32GB to load the optimized bacterial database (
- ~10GB to load the optimized archaeal database (
- Searches are still possible on low-memory systems. However, disk I/O will be a bottleneck (especially when using multiple CPUs). Place the databases on the fastest disk possible.
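The reference figures above can be turned into a quick check of which optimized databases are likely to fit in memory on a given machine. This is a minimal sketch, not something eggnog-mapper provides: the function names are hypothetical, the RAM values are the approximate ones listed above, and the physical-memory lookup relies on `os.sysconf`, which is assumed to be available only on Unix-like systems.

```python
import os

# Approximate RAM (GB) needed to load each optimized database,
# copied from the reference figures above.
APPROX_RAM_GB = {"euk": 90, "bact": 32, "arch": 10}


def databases_that_fit(available_gb):
    """Return the optimized databases whose estimated footprint fits in RAM."""
    return sorted(db for db, need in APPROX_RAM_GB.items() if need <= available_gb)


def total_ram_gb():
    """Best-effort total physical memory in GB, or None if it cannot be read."""
    try:
        return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE") / 1e9
    except (AttributeError, ValueError, OSError):
        return None


if __name__ == "__main__":
    ram = total_ram_gb()
    if ram is None:
        print("Could not determine physical memory on this platform.")
    else:
        print("Databases likely to fit in memory:", databases_that_fit(ram))
```

The comparison uses total rather than free memory, so it is optimistic; databases that do not pass this check can still be searched from disk.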
Download and decompress the latest version of eggnog-mapper from https://github.com/jhcepas/eggnog-mapper/releases. The program requires neither compilation nor installation.
or clone this git repository:
git clone https://github.com/jhcepas/eggnog-mapper.git
eggNOG database retrieval
eggNOG mapper provides 107 taxonomically restricted HMM databases (xxxNOG), three optimized databases [Eukaryota (euk), Bacteria (bact) and Archaea (arch)] and a virus-specific database (
The three optimized databases include all HMM models from their corresponding taxonomic levels in eggNOG (euNOG, bactNOG, arNOG) plus additional models that split large alignments into taxonomically restricted (smaller) HMM models. In particular, HMM models with more than 500 (euk) or 50 (bact) sequences are expanded. The
arch database includes all models from all archaeal taxonomic levels in eggNOG.
Taxonomically restricted databases are listed here. They can be referred to by their code (e.g. maNOG for Mammals).
To download a given database, execute the download script, providing the list of databases to fetch:
download_eggnog_data.py euk bact arch viruses
This will fetch and decompress all precomputed eggNOG data into the data/ directory.
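When the download is driven from another script, the command shown above can be assembled programmatically. The helper below is a hypothetical sketch, not part of eggnog-mapper; it only builds the argument list in the form used above (script name followed by database names) and does not run anything.

```python
def build_download_command(databases):
    """Build the argument list for download_eggnog_data.py.

    Duplicate database names are dropped while keeping the original order.
    """
    unique = []
    for db in databases:
        if db not in unique:
            unique.append(db)
    if not unique:
        raise ValueError("at least one database name is required")
    return ["download_eggnog_data.py"] + unique


if __name__ == "__main__":
    cmd = build_download_command(["euk", "bact", "arch", "viruses"])
    print(" ".join(cmd))
```

The resulting list could then be passed to `subprocess.check_call`, assuming the script is executable and reachable from the current directory or `PATH`.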