PreCanCell

PreCanCell is a simple and effective ensemble learning algorithm for distinguishing cancer cells from non-cancer cells in single-cell transcriptomes. PreCanCell first identifies the differentially expressed genes (DEGs) between malignant and non-malignant cells that are common to five cancer-associated single-cell transcriptome datasets. Then, with each of the five datasets as a training set and the DEGs as features, k-NN (k = 5) classifies a single cell as malignant or non-malignant. Finally, the cell receives the label chosen by the majority vote of the five k-NN classifications.
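
The voting scheme described above can be illustrated with a minimal sketch. It is not the package's implementation: the data are simulated, the five training sets are toy stand-ins for the real datasets, and `class::knn` is assumed here as the k-NN backend in place of fastknn.

```r
# Illustrative sketch only: simulated data, with class::knn standing in
# for the k-NN backend (the package itself depends on fastknn).
library(class)

set.seed(1)
n_genes <- 10  # stand-in for the DEG feature set

# Simulate one labelled training set; "cancer" cells are shifted upward.
make_training_set <- function(n_per_class = 30) {
  cancer     <- matrix(rnorm(n_per_class * n_genes, mean = 2), ncol = n_genes)
  non_cancer <- matrix(rnorm(n_per_class * n_genes, mean = 0), ncol = n_genes)
  list(
    x = rbind(cancer, non_cancer),
    y = factor(rep(c("cancer", "non_cancer"), each = n_per_class))
  )
}

training_sets <- lapply(1:5, function(i) make_training_set())

# Four test cells: two drawn from each simulated class.
test_x <- rbind(
  matrix(rnorm(2 * n_genes, mean = 2), ncol = n_genes),
  matrix(rnorm(2 * n_genes, mean = 0), ncol = n_genes)
)

# One k-NN (k = 5) prediction per training set; columns index training sets.
votes <- sapply(training_sets, function(ts) {
  as.character(knn(train = ts$x, test = test_x, cl = ts$y, k = 5))
})

# Majority vote across the five classifiers (five votes, so no ties).
freq_cancer <- rowMeans(votes == "cancer")
pred_label  <- ifelse(freq_cancer > 0.5, "cancer", "non_cancer")
data.frame(freq_cancer, freq_non_cancer = 1 - freq_cancer, pred_label)
```

Because the number of classifiers is odd, every cell gets an unambiguous majority label.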

Details

The function PreCanCell_data() is used for data preprocessing. Its input should be a normalized expression matrix with genes as row names and cells as column names. The input can use any library-depth normalization (e.g. TPM, CPM).
The function PreCanCell_classifier() identifies malignant and non-malignant cells from single-cell transcriptomes. It takes two parameters: testdata and cores.
- "testdata" is the output matrix of the function PreCanCell_data().
- "cores" is the number of threads.
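
As an illustration of the input layout expected by PreCanCell_data(), the sketch below turns a small raw count matrix into CPM with base R. The count values and cell names are hypothetical, and whether a further log transform is expected is not stated here, so none is applied.

```r
# Hypothetical raw count matrix: genes in rows, cells in columns.
counts <- matrix(
  c(10,  0,  5,
     0, 20,  5,
    90, 80, 40),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("EFNA1", "SOX4", "S100A1"),
                  c("cell_0", "cell_1", "cell_2"))
)

# Counts per million: divide each cell (column) by its library size.
cpm <- sweep(counts, 2, colSums(counts), "/") * 1e6

# After normalization every cell sums to one million.
colSums(cpm)
```

The resulting matrix keeps genes as row names and cells as column names, which is the orientation described above.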

Installation

The Seurat package (version >= 4.3.0) is used for data preprocessing.
To install PreCanCell, first install the fastknn package as follows:

if (!requireNamespace("devtools", quietly = TRUE))
    install.packages("devtools")

devtools::install_github("davpinto/fastknn")

Then install the released version of PreCanCell with:

devtools::install_github("WangX-Lab/PreCanCell")

Examples

Prepare data:

library(PreCanCell)
path <- system.file("extdata", "example.txt", package = "PreCanCell", mustWork = TRUE)
input <- read.table(path, stringsAsFactors = FALSE, header = TRUE, check.names = FALSE, sep = "\t", row.names = 1)
input[1:5,1:5]

	0	3
SERINC2	0.00	0.00
PTPRF	0.00	0.00
S100A1	0.00	0.00
EFNA1	0.00	6.33
SOX4	2.59	5.23

Data preprocessing (select the matched genes and scale gene expression values to [0,1]):

testdata <- PreCanCell_data(input)
testdata[1:5,1:5]

	EFNA1	SOX4
0	0.00	0.46
1	0.00	0.00
2	0.00	0.00
3	1.00	0.94
4	0.00	0.00
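
The kind of transformation this step performs can be sketched in base R. This is an assumption-laden illustration rather than the code inside PreCanCell_data(): the feature list `feature_genes` and the helper `scale01` are hypothetical, and per-gene min-max scaling is assumed as the way values are mapped to [0,1].

```r
# Hypothetical normalized matrix: genes in rows, cells in columns.
expr <- matrix(
  c(0,    0, 0, 6.33,
    2.59, 0, 0, 5.23),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("EFNA1", "SOX4"), c("0", "1", "2", "3"))
)

# Hypothetical feature list; one gene is absent from the data on purpose.
feature_genes <- c("EFNA1", "SOX4", "GENE_NOT_IN_DATA")

# Keep only the feature genes that are present in the data.
matched <- intersect(feature_genes, rownames(expr))

# Transpose so that cells are rows and genes are columns.
x <- t(expr[matched, , drop = FALSE])

# Min-max scale each gene to [0, 1]; constant genes are mapped to 0.
scale01 <- function(v) {
  rng <- range(v)
  if (rng[1] == rng[2]) return(rep(0, length(v)))
  (v - rng[1]) / (rng[2] - rng[1])
}
scaled <- apply(x, 2, scale01)
round(scaled, 2)
```

As in the output above, the result has cells as rows and genes as columns.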

Predict malignant and non-malignant cells (cores is the number of cores to use for parallel execution):

results <- PreCanCell_classifier(testdata, cores = 2)
head(results)

Sample	freq_cancer	freq_non_cancer	pred_label
0	1	0	cancer
1	0	1	non_cancer
2	0.8	0.2	cancer
3	1	0	cancer
4	0	1	non_cancer
5	0	1	non_cancer
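
A short sketch of how such a result table might be summarized follows. The data frame is rebuilt by hand from the rows shown above so that the example runs without the package. Reading freq_cancer and freq_non_cancer as the fractions of the five k-NN votes is an interpretation consistent with the algorithm description, not a documented definition.

```r
# Results copied from the table above.
results <- data.frame(
  Sample          = c("0", "1", "2", "3", "4", "5"),
  freq_cancer     = c(1, 0, 0.8, 1, 0, 0),
  freq_non_cancer = c(0, 1, 0.2, 0, 1, 1),
  pred_label      = c("cancer", "non_cancer", "cancer",
                      "cancer", "non_cancer", "non_cancer"),
  stringsAsFactors = FALSE
)

# Number of cells per predicted label.
table(results$pred_label)

# Cells on which the classifiers did not all agree.
disputed <- results[results$freq_cancer > 0 & results$freq_cancer < 1, ]
disputed

# Check that each label matches the larger of the two vote fractions.
majority <- ifelse(results$freq_cancer > results$freq_non_cancer,
                   "cancer", "non_cancer")
all(majority == results$pred_label)
```

Cells with a split vote, such as sample 2 here, may deserve a closer look before downstream analysis.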

Vignettes

Predicting malignant and non-malignant cells from single-cell transcriptomes

Contact

E-mail any questions to Xiaosheng Wang (xiaosheng.wang@cpu.edu.cn)
