Installation is somewhat involved because of the number of programs used. For everything to work smoothly, there are three main considerations:
First is the virtual environment you will be in while executing the pipeline. Work inside a virtual environment so that all the dependencies stay compatible with each other.
Second is the set of programs you install. You will need access to many different programs, all of which should be housed within your virtual environment.
Third is the set of databases needed to annotate your data. Some programs ship with built-in data, but many require external databases to compare your data against.
The next thing to consider is how to run the program. There are three Snakemake files: Download.smk, Filter.smk, and Annotate.smk. Each can be accessed with the command `$ genotator [download | filter | annotate] ...params`, so you choose the download, filter, or annotate command. Note that the stages are chained as download --> filter --> annotate: running filter also runs download, and running annotate also runs download and filter. See below for detailed instructions on how to run the program.
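
The chaining described above can be pictured with a short Python sketch. It is only an illustration of which stages run for each subcommand; the helper `stages_to_run` is hypothetical and is not part of the package, and the real ordering is presumably enforced by the Snakemake files themselves.

```python
# Illustrative only: stage names come from the README, the helper is hypothetical.
STAGES = ["download", "filter", "annotate"]


def stages_to_run(command: str) -> list[str]:
    """Return every stage that runs for the given subcommand, in order."""
    if command not in STAGES:
        raise ValueError(f"unknown command: {command!r}")
    # Each stage depends on all stages before it in the chain.
    return STAGES[: STAGES.index(command) + 1]


if __name__ == "__main__":
    for cmd in STAGES:
        print(f"genotator {cmd}: runs {' --> '.join(stages_to_run(cmd))}")
```

In this sketch, asking for `filter` yields the download and filter stages, which mirrors the behaviour described for the `genotator` subcommands.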
Part 1: Create an isolated virtual environment using conda. See the conda documentation for more details.

    conda create -y --name gen-annotator python=3.10
    source activate /path/to/environment/gen-annotator

Part 2: Install all the dependencies. Sorry there are so many 😧
Start by installing mamba, which I prefer over conda for installing packages whenever possible. Mamba is almost always much faster at resolving dependencies and completing the installation. See the mamba documentation for more information.

    conda install -y -c conda-forge mamba
    mamba install -y -c conda-forge -c bioconda python=3.10 \
        sqlite prodigal idba mcl muscle=3.8.1551 famsa hmmer diamond \
        blast megahit spades bowtie2 bwa graphviz "samtools>=1.9" \
        trimal iqtree trnascan-se fasttree vmatch r-base r-tidyverse \
        r-optparse r-stringi r-magrittr bioconductor-qvalue meme ghostscript
    mamba install -y -c bioconda fastani
    conda install -c bioconda dbcan
    pip install checkm-genome
    # Grab the latest anvio version
    curl -L https://github.com/merenlab/anvio/releases/download/v8/anvio-8.tar.gz --output anvio-8.tar.gz
    pip install anvio-8.tar.gz
    # pip install fastani (deprecated)

Below is where we set up some databases through the Anvi'o suite.
- anvi-setup-pfams
  https://github.com/merenlab/anvio/blob/master/anvio/pfam.py#L55
- OR, I can download it directly for them into the lib
- anvi-setup-ncbi-cogs (Run_COGs)
- anvi-setup-kegg-kofams (Run_Kegg)

`conda install -y -c conda-forge ncbi-datasets-cli`

Note that RAST may be finicky.

- pip install rast
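
After the installation steps above, it can help to confirm that the programs are visible from inside the environment. The following is a minimal sketch using Python's standard `shutil.which`. The list of executable names is an assumption (it takes a few tools whose command name is assumed to match the package name used in the install step), and `missing_tools` is a hypothetical helper, not something the package provides.

```python
import shutil

# Assumption: these executables share their names with the packages installed above.
TOOLS = ["prodigal", "diamond", "megahit", "bowtie2", "bwa", "samtools"]


def missing_tools(names):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools(TOOLS)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All listed tools were found.")
```

Run it with the conda environment activated. Any name it reports is either not installed or installed outside the active environment.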
## Common Issues
If the genome is too small and nothing is annotated, the pipeline raises an error.
- Probably have some sort of `$ genome-annotator --setup-dbs` line
- Install the conda package ($ conda install -c bioconda genome-annotator)
- Run the database configuration script ($ annotator --init-dbs {all,cazyme,cogs,kegg,pfam,rast,tigr
# User Guide
Note: A leading `$` marks the terminal (command-line) prompt. It is not part of the command itself.
- Clone the repository:
  `$ git clone https://github.com/ddeemerpurdue/genome-annotator.git`
- Enter the new directory:
  `$ cd genome-annotator`
- Create the fresh conda environment:
  See Installation.
- Download and initialize databases:
  See Installation.
- Add genotator.py to your path:
  `$ export PATH=$PATH:/path/to/genome-annotator/bin/`
- Pass