differential allele-specific expression

Singh-Lab/diffASE

diffASE User’s Guide

This guide provides an overview and example of how to calculate differential Allele-Specific Expression (ASE) as described in “Differential Allele-Specific Expression Uncovers Breast Cancer Genes Dysregulated by Cis Noncoding Mutation.”

To use the method, source “diffASEfunctions.R” in R. The functions that compute differential ASE depend on the R packages “metap”, “limma”, and “biomaRt”.

source("diffASEfunctions.R")
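
Before sourcing the file, it can help to confirm that the three dependencies are available. The following is a minimal sketch of such a check; the variable names are illustrative and not part of diffASE.

```r
# Packages the guide lists as dependencies of diffASEfunctions.R.
required <- c("metap", "limma", "biomaRt")

# requireNamespace() returns TRUE when a package can be loaded, FALSE otherwise.
is_installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Install before sourcing diffASEfunctions.R: ",
          paste(missing_pkgs, collapse = ", "))
}
```

Note that “metap” is distributed on CRAN, while “limma” and “biomaRt” are Bioconductor packages and are typically installed with BiocManager::install().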

The main data required to calculate differential ASE are allele counts at heterozygous sites in a tumor and matched normal sample. These can be found using the package “AllelicImbalance” in R with the function “getAlleleCounts” called on BAM files for each sample. Make sure to use the same set of heterozygous sites for both the tumor and matched normal files. Heterozygous sites can be found using “scanForHeterozygotes” from the same package.

Example allele counts should look as follows, with columns for Entrez gene id, sample name, genomic location, normal sample primary allele count, normal sample total read count, tumor sample primary allele count, tumor sample total read count:

V1 V2 V3 V4 V5 V6 V7
1 sample1 chr19:58346952 17 25 19 26
1 sample2 chr19:58346952 12 19 21 25
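
For experimenting with the examples, the table above can be built by hand. This sketch assumes the functions accept a plain data frame with the default column names V1 to V7 (as produced by read.table without a header); the guide does not state the required column types, so treat those as an assumption.

```r
# Example allele counts, one row per heterozygous site per sample.
# Column layout follows the guide: gene id, sample, location,
# normal primary count, normal total, tumor primary count, tumor total.
testASE <- data.frame(
  V1 = c(1, 1),
  V2 = c("sample1", "sample2"),
  V3 = c("chr19:58346952", "chr19:58346952"),
  V4 = c(17, 12),
  V5 = c(25, 19),
  V6 = c(19, 21),
  V7 = c(26, 25),
  stringsAsFactors = FALSE
)

# Sanity check: an allele count can never exceed the total read count.
stopifnot(all(testASE$V4 <= testASE$V5), all(testASE$V6 <= testASE$V7))
```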

Using just this ASE data we can compute baseline differential ASE:

computeASEbaseline(testASE)
sample gene pval ASE
sample1 1 0.5789254 0.0507692
sample2 1 0.0307450 0.2084211
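
To make the ASE column easier to interpret: for these two samples, the reported values are consistent with the tumor primary-allele fraction minus the normal primary-allele fraction. The sketch below reproduces that arithmetic only. It is an observation about the example output, not a description of how computeASEbaseline is implemented, and it does not attempt to reproduce the p-values.

```r
# Allele counts from the example table above.
normal_primary <- c(sample1 = 17, sample2 = 12)
normal_total   <- c(sample1 = 25, sample2 = 19)
tumor_primary  <- c(sample1 = 19, sample2 = 21)
tumor_total    <- c(sample1 = 26, sample2 = 25)

# Primary-allele fraction in each sample type.
normal_frac <- normal_primary / normal_total
tumor_frac  <- tumor_primary / tumor_total

# Difference between tumor and matched normal allele fractions.
baseline_ase <- tumor_frac - normal_frac
print(round(baseline_ase, 7))
```

Rounded to seven decimals, this gives 0.0507692 for sample1 and 0.2084211 for sample2, matching the ASE column above.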

Next, to compute purity-adjusted differential ASE, we need a tumor purity estimate for each sample. For TCGA data, many of these have been pre-computed in the paper “Systematic pan-cancer analysis of tumour purity.” These should be stored in a table with the sample name in the first column and the purity in the second.

V1 V2
sample1 0.8
sample2 0.9
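
A purity table in this shape can be constructed directly. The sketch below also checks that every sample with allele counts has a purity value; the check and the name ase_samples are illustrative additions, not part of diffASE.

```r
# Purity table: sample name in the first column, purity in the second.
allPurity <- data.frame(
  V1 = c("sample1", "sample2"),
  V2 = c(0.8, 0.9),
  stringsAsFactors = FALSE
)

# Sample names that appear in the allele count table (hypothetical variable).
ase_samples <- c("sample1", "sample2")

# Every sample needs a purity, and purities are fractions in (0, 1].
missing_purity <- setdiff(ase_samples, allPurity$V1)
stopifnot(length(missing_purity) == 0,
          all(allPurity$V2 > 0 & allPurity$V2 <= 1))
```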

Given that purity data, we can compute diffASE-purity:

computeASEpurity(testASE, allPurity)
sample gene pval ASE
sample1 1 0.5349559 0.0634615
sample2 1 0.0227733 0.2315789
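
In this example, the purity-adjusted ASE values are consistent with the baseline allele-fraction difference divided by the tumor purity. The sketch below shows that arithmetic under this assumption; it is inferred from the example numbers and is not taken from the diffASE source.

```r
# Baseline difference: tumor allele fraction minus normal allele fraction.
baseline_ase <- c(sample1 = 19 / 26 - 17 / 25,
                  sample2 = 21 / 25 - 12 / 19)

# Tumor purity per sample, from the purity table above.
purity <- c(sample1 = 0.8, sample2 = 0.9)

# Scale the difference by the fraction of the sample that is tumor.
purity_ase <- baseline_ase / purity
print(round(purity_ase, 7))
```

Rounded to seven decimals, this gives 0.0634615 for sample1 and 0.2315789 for sample2, matching the ASE column above. Lower purity means more dilution by normal cells, so the adjustment is larger.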

Finally, expression-adjusted differential ASE can be computed if expression values for each gene in each sample are known. This requires two tables of counts per million (CPM): one for the normal sample and one for the matched tumor sample. Each table should have the Ensembl gene name as the row name and one column per sample.

                sample1 sample2
ENSG00000121410     1.2     2.1

                sample1 sample2
ENSG00000121410       2     1.9

With that tumor purity and expression data, we can compute diffASE-exp:

computeASEexp(testASE, allPurity, cpmTumor, cpmNormal)
sample gene pval ASE
sample1 1 0.4967081 0.0761538
sample2 1 0.0234966 0.2291540
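
The example values are consistent with dividing the baseline difference by the estimated share of tumor-sample expression that comes from tumor cells, which is the tumor CPM minus the normal-cell contribution, (1 - purity) times the normal CPM, taken as a fraction of the tumor CPM. The sketch below reproduces the ASE column under two assumptions: that this is the relevant adjustment, and that the first printed expression table (1.2, 2.1) is the tumor table and the second (2, 1.9) the normal table. Both are inferred from the example numbers, not from the diffASE source.

```r
# Baseline difference: tumor allele fraction minus normal allele fraction.
baseline_ase <- c(sample1 = 19 / 26 - 17 / 25,
                  sample2 = 21 / 25 - 12 / 19)

purity     <- c(sample1 = 0.8, sample2 = 0.9)
cpm_tumor  <- c(sample1 = 1.2, sample2 = 2.1)  # assumed: first printed table
cpm_normal <- c(sample1 = 2.0, sample2 = 1.9)  # assumed: second printed table

# Share of tumor-sample expression attributable to tumor cells,
# after removing the contribution of contaminating normal cells.
tumor_cell_share <- (cpm_tumor - (1 - purity) * cpm_normal) / cpm_tumor

exp_ase <- baseline_ase / tumor_cell_share
print(round(exp_ase, 7))
```

Rounded to seven decimals, this gives 0.0761538 for sample1 and 0.229154 for sample2, matching the ASE column above. Swapping the two expression tables does not reproduce those values.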

For comparison with differential ASE, we can also compute ASE from the tumor sample alone. This shows that ASE can sometimes look significant when the matched normal sample is not taken into account:

computeTumorSampleASE(testASE)
sample gene pval ASE
sample1 1 0.0186029 0.2307692
sample2 1 0.0006739 0.3400000
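
For these two samples, the tumor-only ASE values are consistent with the tumor primary-allele fraction minus the balanced expectation of 0.5. The sketch below shows that arithmetic next to the baseline differential values for contrast. Whether computeTumorSampleASE takes an absolute value is not stated in the guide, so the signed difference here is an assumption.

```r
# Tumor and normal primary-allele fractions from the example table.
tumor_frac  <- c(sample1 = 19 / 26, sample2 = 21 / 25)
normal_frac <- c(sample1 = 17 / 25, sample2 = 12 / 19)

# Tumor-only ASE: deviation from balanced (50/50) allelic expression.
tumor_only_ase <- tumor_frac - 0.5

# Differential ASE: deviation from the matched normal sample instead.
baseline_ase <- tumor_frac - normal_frac

print(round(rbind(tumor_only = tumor_only_ase, differential = baseline_ase), 7))
```

Rounded to seven decimals, the tumor-only values are 0.2307692 and 0.34, matching the ASE column above. For sample1, most of that imbalance is already present in the normal sample (17/25 = 0.68), which is why its differential ASE is small.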
