consensusMIBC

This package implements a nearest-centroid transcriptomic classifier, that assigns class labels according to the consensus molecular classification of Muscle-Invasive Bladder Cancer (Manuscript submitted). The consensus classification identifies 6 molecular classes : Luminal Papillary (LumP), Luminal Non Specified (LumNS), Luminal Unstable (LumU), Stroma-rich, Basal/Squamous (Ba/Sq), Neuroendocrine-like (NE-like).
Two example data sets are provided to run the classifier.

Citation

For now, you can cite the following bioRxiv preprint: bioRxiv 488460; doi: https://doi.org/10.1101/488460

Install

You may install this package with devtools:

library(devtools)
devtools::install_github("cit-bioinfo/consensusMIBC", build_vignettes = TRUE)
library(consensusMIBC)

Long-form documentation

To see example results for the consensusMIBC package, use the package vignette:

vignette("consensusMIBC")

Usage

getConsensusClass(x, minCor = .2, gene_id = c("entrezgene", "ensembl_gene_id", "hgnc_symbol")[1])

Where x is either a single named vector of gene expression values or a dataframe formatted according to the example data sets provided (unique genes in row, samples in column). Gene names (vector names or dataframe rownames) may be supplied as Entrez IDs, Ensembl gene IDs, or HUGO gene symbols. RNA-seq data needs to be log-transformed, for example using log2(normalized counts + 1).

minCor is a numeric value specifying a confidence minimal threshold for best Pearson's correlation. Classifier predictions relying on a correlation lower than minCor are set to NA. Default is 0.2.

gene_id is a character value specifying the type of gene identifiers used for the names/rownames of x : entrezgene for Entrez IDs, ensembl_gene_id for Ensembl gene IDs, or hgnc_symbol for HUGO gene symbols. Default value is entrezgene.

Example

data(tcgadat)

# Single sample classification

getConsensusClass(tcga.dat[, 1])

#   consensusClass adjusted_pval separationLevel      LumP     LumNS      LumU Stroma-rich    Ba/Sq   NE-like
#ss           LumP  1.698954e-90       0.6626931 0.6176298 0.5735781 0.5641684    0.589056 0.575446 0.1820276

# Classification of a matrix of samples

res <- getConsensusClass(tcga.dat)
head(res)

#                 consensusClass      cor_pval separationLevel      LumP     LumNS      LumU Stroma-rich     Ba/Sq   NE-like
#TCGA-2F-A9KO-01A           LumP  1.698954e-90       0.6626931 0.6176298 0.5735781 0.5641684   0.5890560 0.5754460 0.1820276
#TCGA-2F-A9KP-01A           LumP 8.546007e-142       0.3213549 0.7284300 0.6806556 0.6938315   0.5608757 0.4695459 0.2734213
#TCGA-2F-A9KQ-01A           LumP 8.031287e-137       0.5460606 0.7196396 0.6443850 0.6345979   0.5290542 0.4500750 0.2041000
#TCGA-2F-A9KR-01A          Ba/Sq  3.288756e-94       0.2368501 0.5980346 0.4938276 0.4900237   0.5126923 0.6274487 0.2116246
#TCGA-2F-A9KT-01A          Ba/Sq 3.171797e-108       0.6737278 0.5772637 0.5144653 0.5335134   0.5393338 0.6615954 0.2865093
#TCGA-2F-A9KW-01A           LumU  6.990305e-83       0.6139238 0.5136967 0.5527899 0.5963426   0.5371054 0.4725626 0.3087334

table(res$consensusClass)

##      Ba/Sq       LumNS        LumP        LumU     NE-like Stroma-rich 
##        153          21         128          53           6          45

The classifier returns a dataframe with 9 columns :

consensusClass returns the consensus calls for each sample. Calls are set to NA for low confidence predictions (maximal correlation is below the given minCor parameter).

cor_pval returns the p-value(s) associated to the Pearson's correlation of the sample(s) with the nearest centroid.

separationLevel gives a measure of how a sample is representative of its consensus class. It ranges from 0 to 1, with 0 meaning the sample is too close to other consensus classes to be confidently assigned one consensus class label, and 1 meaning the sample is highly representative of its consensus class and well separated from the other consensus classes. The separationLevel is measured as follows : (correlation to nearest centroid - correlation to second nearest centroid) / median difference of sample-to-centroid correlation.

The 6 other columns return the Pearson's correlation between each sample and each consensus class.

Name		Name	Last commit message	Last commit date
Latest commit History 36 Commits
R		R
data		data
doc		doc
man		man
vignettes		vignettes
DESCRIPTION		DESCRIPTION
LICENSE		LICENSE
NAMESPACE		NAMESPACE
README.md		README.md
consensusMIBC.Rproj		consensusMIBC.Rproj

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

R

R

data

data

doc

doc

man

man

vignettes

vignettes

DESCRIPTION

DESCRIPTION

LICENSE

LICENSE

NAMESPACE

NAMESPACE

README.md

README.md

consensusMIBC.Rproj

consensusMIBC.Rproj

Repository files navigation

consensusMIBC

Citation

Install

Long-form documentation

Usage

Example

About

Releases

Packages

Contributors 2

Languages

License

cit-bioinfo/consensusMIBC

Folders and files

Latest commit

History

Repository files navigation

consensusMIBC

Citation

Install

Long-form documentation

Usage

Example

About

Resources

License

Stars

Watchers

Forks

Languages