TAQLoRE - Transcript Annotation and Quantification using Long Reads

TAQLoRE is a Snakemake-based pipeline for improving existing annotations and quantifying transcripts from long-read, amplicon-based cDNA sequencing technologies (Oxford Nanopore Technologies, PacBio). It was tested on Linux (CentOS 6) and should also work on macOS. Briefly, it uses LAST to align all reads to the transcriptome, discovers new exons by looking at insertions in the alignments, creates a meta-gene containing all known and novel exons, aligns all reads to it, and generates TMM-normalised read counts together with expression heatmaps and PCA plots. It also identifies new splice sites by looking at reads that align perfectly to the genome and correcting all splice sites to the closest most abundant canonical ones. For more information, refer to :doc:`description`.
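The splice-site correction step can be illustrated with a minimal sketch. It is not the pipeline's actual code: the function name ``correct_splice_site``, the ``max_distance`` window and the tie-breaking rule (nearest canonical site first, higher read support on ties) are assumptions made for illustration only.

```python
from typing import Dict, Optional


def correct_splice_site(
    position: int,
    canonical_support: Dict[int, int],
    max_distance: int = 10,  # hypothetical search window, in bases
) -> Optional[int]:
    """Snap an observed splice site to a nearby canonical site.

    canonical_support maps a canonical splice-site coordinate to the
    number of reads supporting it. Returns None if no canonical site
    lies within max_distance of the observed position.
    """
    candidates = [
        site for site in canonical_support
        if abs(site - position) <= max_distance
    ]
    if not candidates:
        return None
    # Assumed rule: nearest site wins; ties go to the site with more
    # supporting reads, then to the lower coordinate for determinism.
    return min(
        candidates,
        key=lambda site: (abs(site - position), -canonical_support[site], site),
    )


# Toy data: coordinate -> supporting read count (made up for the example).
canonical = {100: 50, 104: 200, 150: 5}
print(correct_splice_site(102, canonical))  # 104 (tie on distance, more reads)
print(correct_splice_site(101, canonical))  # 100 (strictly closer)
print(correct_splice_site(130, canonical))  # None (nothing within 10 bases)
```

Other readings of "closest most abundant" are possible, for example weighting abundance above distance. Refer to :doc:`description` for what the pipeline itself does.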

Installation

Source code: GitHub
Issue tracker: Issue tracker

Citations of dependencies

Our pipeline is based on the following software:

Snakemake: Köster J, Rahmann S. "Snakemake - A scalable bioinformatics workflow engine". Bioinformatics. 2018 Oct 15;34(20):3600.
LAST: Kiełbasa SM, Wan R, Sato K, Horton P, Frith MC. "Adaptive seeds tame genomic sequence comparison". Genome Res. 2011 Mar;21(3):487-93.
GMAP: Wu TD, Watanabe CK. "GMAP: a genomic mapping and alignment program for mRNA and EST sequences". Bioinformatics. 2005 May 1;21(9):1859-75.
Bedtools: Quinlan AR, Hall IM. "BEDTools: a flexible suite of utilities for comparing genomic features". Bioinformatics. 2010 Mar 15;26(6):841-2.
Pybedtools: Dale RK, Pedersen BS, Quinlan AR. "Pybedtools: a flexible Python library for manipulating genomic datasets and annotations". Bioinformatics. 2011 Dec 15;27(24):3423-4.

Papers using the pipeline

The following papers and pre-prints use our pipeline:

Clark M, Wrzesinski T, Garcia-Bea A, Kleinman J, Hyde T, Weinberger D, Haerty W, Tunbridge E. bioRxiv 260562.

Authors

Developers:

Wilfried Haerty (Earlham Institute)
Tomasz Wrzesinski (Earlham Institute)

Contributors:

Elizabeth Tunbridge (University of Oxford)
Michael Clark (University of Melbourne)
Nicola Hall (University of Oxford)
Syed Hussain (University of Oxford)
Hami Lee (University of Oxford)

Things to add

Splice-site-based pipeline (part4 and part5).
Usage of splice-site-based approach (part4 and part5).
Description of output files.
Description of scripts.
Description of example dataset.

.. toctree::
   :maxdepth: 2
   :caption: Documentation index

   description
   installation
   usage_exon_based
   usage_splice_site_based
   output_files_exon_based
   output_files_splice_site_based
