metagenome_annotation

Annotation pipeline for metagenomic data using KEGG, UniProt, NCBI, PFAM and InterProScan (IPERscan).

This script takes a scaffold FASTA file of nucleic acids, calls genes using Prodigal, and then annotates those genes against the KEGG, NCBI, PFAM and UniProt databases. It produces multiple Prodigal files for gene mapping, the hits against each database, and a summary directory containing a text file with the full annotation, an annotated amino acid FASTA file with the best hit for each protein, and an annotated nucleic acid FASTA file (genes).
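To make the gene-calling step concrete, here is a minimal sketch of how a Prodigal command for metagenomic scaffolds might be assembled. It is not taken from annotate_fasta.sh: the function name `prodigal_cmd` and the output file names are hypothetical, and the use of `-p meta` assumes a Prodigal 2.x release. The function only prints the command, so it can be inspected before being run.

```shell
#!/usr/bin/env bash
# Sketch only: builds and prints a Prodigal command for metagenomic scaffolds.
# Output file names are hypothetical, not the names annotate_fasta.sh produces.
prodigal_cmd() {
    local scaffolds="$1"
    local prefix="${scaffolds%.*}"   # strip the final extension: scaffold.fa -> scaffold
    # -i input, -o gene coordinates, -f output format,
    # -a protein translations, -d nucleotide gene sequences, -p meta for metagenomes
    printf 'prodigal -i %s -o %s.gff -f gff -a %s.genes.faa -d %s.genes.fna -p meta\n' \
        "$scaffolds" "$prefix" "$prefix" "$prefix"
}

prodigal_cmd scaffold.fa
```

The protein file written with `-a` is what a downstream database search would consume, while the `-d` file corresponds to the nucleic acid gene sequences mentioned above.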

Use annotate_fasta.sh to run the pipeline.

Software dependencies

prodigal

KEGG

interproscan

ublast

uniref

For software, be sure the script calls the correct database name for the software and databases you have. The pipeline is set up for our server, but should be transferable with a few tweaks.

Additional dependent scripts:

pullcontigs.pl

ANNOTATION_PIPELINE_IPER_OPTION.sh

make_fasta_seq_single_line.py

interproscan_parallel.sh

parallel_PfamScan.py

convert_pfam_to_iperscan.py

reverse_best_hits.sh

perl1.pl

perl2.pl

perl4_NEW.pl

pull_all_contig_annotations.py

perl6.pl

write_annotation_to_fasta.py

Dependent scripts must be in the same directory, or executable from a root directory.
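Before running the pipeline, it can help to confirm that the helper scripts listed above are present. The following is a small sketch, assuming the scripts sit in a single directory; the function name `missing_scripts` is hypothetical and not part of this repository.

```shell
#!/usr/bin/env bash
# Print the helper scripts named in this README that are absent from a directory.
# Defaults to the current directory when no argument is given.
missing_scripts() {
    local dir="${1:-.}"
    local script
    for script in \
        pullcontigs.pl ANNOTATION_PIPELINE_IPER_OPTION.sh \
        make_fasta_seq_single_line.py interproscan_parallel.sh \
        parallel_PfamScan.py convert_pfam_to_iperscan.py \
        reverse_best_hits.sh perl1.pl perl2.pl perl4_NEW.pl \
        pull_all_contig_annotations.py perl6.pl write_annotation_to_fasta.py
    do
        # Report the name only when the file is not there.
        [ -f "$dir/$script" ] || printf '%s\n' "$script"
    done
}

missing_scripts .
```

An empty result means every listed script was found in that directory. Scripts that are installed elsewhere and reachable through your PATH would still be reported here, since this sketch only looks in one directory.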

Run this command as follows:

bash annotate_fasta.sh scaffold.fa DATABASE ID

$1 File to annotate

$2 Database option: IPER, NO_IPER or PFAM

$3 ID for metagenome that will be added to the beginning of all of the scaffolds
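The arguments above can be illustrated with a short sketch. It checks the second argument against the three options listed and shows what adding the ID to the beginning of each scaffold name could look like. Both function names are hypothetical, the option check assumes the values are case-sensitive, and the `_` separator between the ID and the scaffold name is an assumption; the pipeline itself may join them differently.

```shell
#!/usr/bin/env bash
# Sketch: validate the second argument and show the effect of the ID prefix.

# Succeeds only for the three database options listed in this README.
valid_mode() {
    case "$1" in
        IPER|NO_IPER|PFAM) return 0 ;;
        *) return 1 ;;
    esac
}

# Read FASTA on stdin and prefix every header with the metagenome ID.
# The "_" separator is an assumption, not confirmed by the pipeline.
prefix_headers() {
    local id="$1"
    awk -v id="$id" '/^>/ { sub(/^>/, ">" id "_") } { print }'
}

valid_mode NO_IPER && echo "NO_IPER is accepted"
printf '>scaffold_1\nATGC\n' | prefix_headers SAMPLE1
```

With these assumptions, a header such as `>scaffold_1` becomes `>SAMPLE1_scaffold_1`, while sequence lines pass through unchanged. A matching pipeline call would be `bash annotate_fasta.sh scaffold.fa NO_IPER SAMPLE1`, where `SAMPLE1` is a placeholder ID.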

TheWrightonLab/metagenome_annotation
