ASCOT - Alternative Splicing and Gene Expression Summaries of Public RNA-Seq Data
License: CC BY-NC 4.0 (see license.txt)



Jonathan P. Ling, Chris Wilks, Rone Charles, Ben Langmead, Seth Blackshaw

ASCOT quantifies alternative splicing and gene expression across tens of thousands of bulk and single-cell RNA-Seq datasets in human and mouse. This repository contains the scripts used to generate the PSI/NAUC tables for this resource.

ASCOT uses annotation-free methods to detect exon percent spliced-in (PSI) values within Snaptron, a rapidly queryable database of splice junction counts derived from public RNA-Seq data. Gene expression levels are calculated using a normalized "area-under-curve" (NAUC) metric, as described in recount2.
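To make the PSI idea concrete, here is a minimal sketch of the commonly used junction-count definition of exon PSI: the average of the two inclusion junction counts divided by that average plus the exclusion (skipping) junction count. This is an illustration only, not necessarily the exact calculation performed by the scripts in this repository. The function name `exon_psi` and the `min_reads` coverage threshold are hypothetical and were chosen for this example.

```python
from typing import Optional


def exon_psi(upstream_inclusion: int,
             downstream_inclusion: int,
             exclusion: int,
             min_reads: float = 10) -> Optional[float]:
    """Return exon PSI as a percentage, or None if coverage is too low.

    Assumption: PSI = inclusion / (inclusion + exclusion), where inclusion
    is the mean of the two junction counts flanking the exon and exclusion
    is the count of the junction that skips it. `min_reads` is a
    hypothetical cutoff used only in this sketch.
    """
    # Average the two inclusion junctions so that an included exon
    # (two junctions) is comparable with a skipped exon (one junction).
    inclusion = (upstream_inclusion + downstream_inclusion) / 2
    total = inclusion + exclusion
    if total < min_reads:
        return None  # too few junction reads to report a value
    return 100.0 * inclusion / total


if __name__ == "__main__":
    print(exon_psi(30, 30, 10))  # mostly included exon
    print(exon_psi(0, 0, 20))    # fully skipped exon
    print(exon_psi(2, 2, 1))     # below the coverage cutoff
```

Returning `None` rather than 0 for low-coverage exons keeps "not measurable" distinct from "fully skipped", which matters when comparing datasets of very different sequencing depth.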

Please refer to our bioRxiv preprint and the accompanying ASCOT website. All data tables are available for download.

Comments and suggestions are always welcome:

UCSC TrackHubs for data visualization:

We strongly recommend that users cross-validate any splicing results obtained from ASCOT. One way to do so is to visualize the data on the UCSC Genome Browser. We provide TrackHubs (collections of .bigwig files) for each dataset in ASCOT:

>Mouse cell types and tissues from bulk RNA-Seq (MESA) TrackHub link
>Mouse single-cell RNA-Seq data (CellTower) TrackHub link
>Human GTEx tissues + eye (GTEx) TrackHub link
>ENCODE shRNA-Seq knockdown data (ENCODE) TrackHub link

Instructions for using TrackHubs are available on the UCSC help page. In brief, go to My Data > Track Hubs in the top menu bar, select the My Hubs tab, enter one of the URLs above, and select Add Hub.

Usage instructions (we recommend a system with at least 30 GB of RAM):

git clone
cd ./ascot/software/snaptron

To generate the splicing PSI data tables:

>Mouse cell types and tissues from bulk RNA-Seq (MESA) - ## datasets, ## exon query
    python3 --i ./exons/mesa_exons.tsv --a mesaall --c mesalinked --o mesa_psi.tsv

>Mouse single-cell RNA-Seq data (CellTower) - ## datasets, ## exon query
    python3 --i ./exons/ctms_exons.tsv --a ctmsall --c ctmslinked --min 10 --f 1 --o ctms_psi.tsv

>Human GTEx tissues + eye (GTEx) - ## datasets, ## exon query
    python3 --i ./exons/gtexeye_exons_1.tsv --a gtexeyeall --c gtexeyelinked --o gtexeye_psi_1.tsv
    python3 --i ./exons/gtexeye_exons_2.tsv --a gtexeyeall --c gtexeyelinked --o gtexeye_psi_2.tsv
    python3 --i ./exons/gtexeye_exons_3.tsv --a gtexeyeall --c gtexeyelinked --o gtexeye_psi_3.tsv

>ENCODE shRNA-Seq knockdown data (ENCODE) - # datasets, ## exon query
    python3 --i ./exons/encode_exons_1.tsv --a encodegtexall --c encodegtexlinked --o encode_psi_1.tsv
    python3 --i ./exons/encode_exons_2.tsv --a encodegtexall --c encodegtexlinked --o encode_psi_2.tsv

To generate the gene expression NAUC data tables:

>Mouse cell types and tissues from bulk RNA-Seq (MESA)
    python3 --a mesaall --c mesalinked --o mesa_nauc.tsv

>Mouse single-cell RNA-Seq data (CellTower)
    python3 --a ctmsall --c ctmslinked --o ctms_nauc.tsv

>Human GTEx tissues + eye (GTEx)
    python3 --a gtexeyeall --c gtexeyelinked --o gtexeye_nauc.tsv

>ENCODE shRNA-Seq knockdown data (ENCODE)
    python3 --a encodegtexall --c encodegtexlinked --o encode_nauc.tsv