We have now added reproducible R code for producing element labels for mobileOGs using only the metadata (.csv) file. The resulting metadata can be accessed here. This metadata file can be used to classify contigs, as in the figure here:
mobileOG-pl. v. kyanite is a lightweight mobile genetic element annotation pipeline using the mobile orthologous groups database (https://mobileogdb.flsi.cloud.vt.edu/). It takes a set of contigs or long reads as input and produces:
- Open reading frames using prodigal
- Alignment summaries to a mobile orthologous groups database file using diamond
- Element-mapping data summarizing matches to proteins from different element classes.
This pipeline reports the presence of MGE proteins in a set of contigs using mobileOG-db annotations. It classifies each protein hit as putatively derived from plasmids, phages, insertion sequences, and/or integrative genomic elements. Thus, mobileOG-pl can serve as the basis for detecting any major class of bacterial MGE and can be complemented with other tools to achieve a fine-grained element classification.
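As an illustration of the element-mapping idea, the sketch below tallies protein hits per contig and element class. The contig names, the class labels, and the `summarize_element_hits` helper are hypothetical stand-ins chosen for this example; they do not reflect the actual mobileOG-db metadata schema or the pipeline's output format.

```python
from collections import Counter, defaultdict

# Toy (contig, element class) pairs, one per protein hit.
# These values are hypothetical and do not come from real pipeline output.
hits = [
    ("contig_1", "plasmid"),
    ("contig_1", "plasmid"),
    ("contig_1", "phage"),
    ("contig_2", "phage"),
    ("contig_2", "insertion sequence"),
]

def summarize_element_hits(hits):
    """Return {contig: Counter({element_class: number_of_hits})}."""
    summary = defaultdict(Counter)
    for contig, element_class in hits:
        summary[contig][element_class] += 1
    return dict(summary)

summary = summarize_element_hits(hits)
for contig, counts in sorted(summary.items()):
    print(contig, dict(counts))
```

A contig whose hits fall mostly in one class is a candidate for that element type, although a dedicated tool would be needed to confirm the classification.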
Python 3.6.15 with pandas (argparse and itertools are part of the standard library)
prodigal
diamond 0.9.24 or greater
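A quick way to see which of these requirements are already available is sketched below. The `missing_requirements` helper is hypothetical and not part of mobileOG-pl; it only checks that the executables are on `PATH` and that the Python module can be located, not that the versions match those listed above.

```python
import importlib.util
import shutil

def missing_requirements(executables=("prodigal", "diamond"), modules=("pandas",)):
    """Return the executables not found on PATH and the Python modules that cannot be located."""
    missing = [exe for exe in executables if shutil.which(exe) is None]
    missing += [mod for mod in modules if importlib.util.find_spec(mod) is None]
    return missing

missing = missing_requirements()
print("Missing:", ", ".join(missing) if missing else "nothing")
```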
- Install Conda environment:
conda create -n mobileOG-db python=3.6.15
conda activate mobileOG-db
conda install -c conda-forge biopython
conda install -c bioconda prodigal
conda install -c bioconda diamond
conda install -c anaconda pandas
- Download mobileOG-db (from website):
Database (mobileOG-db-beatrix-1.X.All.faa)
Metadata (mobileOG-db-beatrix-1.X.All.csv)
Code (mobileOGs-pl-kyanite.sh and mobileOGs-pl-kyanite.py)
mkdir mobileOG-db_workdir
cd mobileOG-db_workdir
PATH_TO_DOWNLOAD="download link"
wget "$PATH_TO_DOWNLOAD"
DOWNLOADED_ZIP="downloaded zip file"
unzip "$DOWNLOADED_ZIP"
chmod +x mobileOGs-pl-kyanite.sh
- Make Diamond Database:
diamond makedb --in mobileOG-db-beatrix-1.X.All.faa -d mobileOG-db-beatrix-1.X.dmnd
- Run Code (example of stringent settings):
conda activate mobileOG-db
chmod +x mobileOGs-pl-kyanite.sh
./mobileOGs-pl-kyanite.sh -i test.fasta -d mobileOG-db-beatrix-1.X.dmnd -m mobileOG-db-beatrix-1.X.All.csv -k 15 -e 1e-20 -p 90 -q 90
- Usage:
-i, --input | Input FASTA file
-k, --kvalue | Number of diamond alignments to report
-e, --escore | Maximum E-value
-d, --db | Diamond database
-m, --metadata | mobileOG-db metadata (csv file) to compare against samples
-p, --pidentvalue | Minimum percent of identical matches between sample and database sequences
-q, --queryscore | Minimum percent query coverage
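To make the thresholds concrete, the sketch below applies `-e`, `-p`, and `-q` style cutoffs to a few made-up alignment rows. It assumes that `-p` and `-q` act as minimum thresholds and `-e` as a maximum, with defaults mirroring the stringent example above. The rows and the `passes_thresholds` helper are hypothetical; in the pipeline itself the filtering is presumably delegated to diamond rather than done in Python.

```python
# Made-up alignment rows: (query, percent identity, percent query coverage, e-value).
alignments = [
    ("orf_1", 98.5, 95.0, 1e-50),
    ("orf_2", 92.0, 60.0, 1e-30),  # fails on query coverage
    ("orf_3", 85.0, 99.0, 1e-40),  # fails on percent identity
    ("orf_4", 95.0, 93.0, 1e-10),  # fails on e-value
]

def passes_thresholds(pident, qcov, evalue,
                      min_pident=90.0, min_qcov=90.0, max_evalue=1e-20):
    """Return True when an alignment meets all three cutoffs."""
    return pident >= min_pident and qcov >= min_qcov and evalue <= max_evalue

# Keep only the query names whose alignments satisfy every threshold.
kept = [row[0] for row in alignments if passes_thresholds(*row[1:])]
print(kept)
```

Relaxing any one cutoff, for example `min_qcov=50.0`, admits more hits at the cost of less confident matches. The `-k` option is not modeled here, since it limits how many alignments are reported per query rather than filtering on alignment quality.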