annot-nf

A portable, scalable eukaryotic genome annotation pipeline implemented in Nextflow.

Purpose

This pipeline annotates eukaryotic genomes, such as those of protozoan parasites. It performs the following tasks:

- Fast generation of pseudomolecules from scaffolds by ordering and orientating against a reference
- Accurate transfer of highly conserved gene models from the reference
- De novo gene finding as a complement to the gene transfer
- Non-coding RNA detection (tRNA, rRNA, sn(o)RNA, ...)
- Pseudogene detection
- Functional annotation (GO, products, ...)
  - ...by transferring reference annotations to the target genome
  - ...by inferring GO terms and products from Pfam pHMM matches
- Consistent gene ID assignment
- Preparation of validated GFF3, GAF and EMBL output files for jump-starting manual curation and quick turnaround time to submission

It supports parallelized execution on a single machine as well as on large cluster platforms (LSF, SGE, ...).

Requirements

The pipeline uses Nextflow as its workflow engine, so Nextflow needs to be installed first:

curl -fsSL get.nextflow.io | bash

With Nextflow installed, the easiest way to run the pipeline is with the prepared Docker container (https://registry.hub.docker.com/u/satta/annot-nf), which contains all external dependencies:

docker pull satta/annot-nf
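
Before starting a run, it can help to confirm that both tools are reachable from the shell. The following is a minimal sketch and not part of the pipeline; the helper name `missing_tools` is hypothetical.

```shell
#!/bin/sh
# Print the name of every given command that is not found on PATH.
# missing_tools is a hypothetical helper, not shipped with annot-nf.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Report anything the Docker-based setup needs that is not installed.
missing=$(missing_tools nextflow docker)
if [ -n "$missing" ]; then
    printf 'Missing prerequisites:\n%s\n' "$missing" >&2
else
    echo "nextflow and docker are both available"
fi
```

The check only tests that the commands exist; it does not verify versions or that the Docker daemon is running.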

Running the pipeline

To start an example run with Docker, using the example dataset and parameterization included in the distribution:

nextflow run nextflow-io/annot-nf -profile docker

For your own runs, provide your own file names, paths, parameters, etc. as defined in the nextflow.config file.
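
One possible way to supply such settings without editing the distributed files is Nextflow's `-c` option, which adds a further configuration file to the ones the pipeline already loads. The sketch below only composes and prints the command line; the file name `my_run.config` and the helper `custom_run_cmd` are hypothetical, and the parameter names that belong in the file are the ones defined in the pipeline's own configuration files, not shown here.

```shell
#!/bin/sh
# Compose the Nextflow command line for a run with an extra config file.
# custom_run_cmd and my_run.config are hypothetical names for illustration.
custom_run_cmd() {
    config_file=$1
    # -c is a top-level Nextflow option and therefore precedes "run".
    printf 'nextflow -c %s run nextflow-io/annot-nf -profile docker\n' "$config_file"
}

# Print the command for inspection; run it once the config file exists.
custom_run_cmd my_run.config
```

Printing the command first keeps the example safe to try on a machine where the pipeline has not been set up yet.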

Preparing reference annotations

The reference annotations used in the pipeline need to be pre-processed before they can be used. TODO: add documentation on how to prepare references.

Contact

Sascha Steinbiss (ss34@sanger.ac.uk)

