SPAR – sequencing-based pipeline for sncRNA identification and annotation

Background: Small non-coding RNAs (sncRNAs) are highly abundant RNAs, typically <100 nt long, that act as key regulators of diverse cellular processes. Although thousands of sncRNA genes are known to exist in the human genome, no single sequencing-based pipeline/method provides accurate identification, unified annotation, expression and processing information for full sncRNA transcripts and mature RNA products derived from these larger RNAs.

Uniqueness: SPAR integrates significantly improved mapping, segmentation, annotation, and RNA processing information for both human sncRNA genes and mature sncRNA products.

SPAR's improved mapping provides significantly more data for downstream analysis (on average, over 28% more reads per library). SPAR segmentation is not only significantly more accurate but also an order of magnitude faster (60x speed-up on average), owing to novel segmentation algorithms.

SPAR has been used to process over 187 smRNA high-throughput sequencing (smRNA-seq) datasets with over 2.5 billion reads. SPAR results are integrated into the database of small human non-coding RNAs (DASHR, http://lisanwanglab.org/DASHR; NAR 2015 Database issue, in revision).

Link (software): SPAR http://github.com/wanglab-upenn/SPAR

Link (database): DASHR http://lisanwanglab.org/DASHR