---
title: "Getting Started with De Novo CIDER (dnCIDER)"
output:
  rmarkdown::html_vignette:
    toc: TRUE
    number_sections: true
vignette: >
  %\VignetteIndexEntry{Getting Started with De Novo CIDER (dnCIDER)}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```
# Introduction
This vignette performs dnCIDER on a cross-species pancreas dataset.
# Set up
In addition to **CIDER**, we will load the following packages:
```{r setup}
library(CIDER)
library(Seurat)
library(parallel)
library(cowplot)
```
# Load pancreas data
The example data can be downloaded from https://figshare.com/s/d5474749ca8c711cc205.
The pancreatic dataset$^1$ contains cells from human (8241 cells) and mouse (1886 cells).
```{r}
load("../data/pancreas_counts.RData") # count matrix
load("../data/pancreas_meta.RData") # metadata (cell information)
seu <- CreateSeuratObject(counts = pancreas_counts, meta.data = pancreas_meta)
table(seu$Batch)
```
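`CreateSeuratObject` expects the row names of `meta.data` to match the cell names (column names) of `counts`. A minimal sketch of that sanity check is shown here in base R. The objects `toy_counts` and `toy_meta` are invented stand-ins for `pancreas_counts` and `pancreas_meta`, not part of the example data.

```r
# Toy stand-ins for pancreas_counts and pancreas_meta (hypothetical values).
toy_counts <- matrix(
  c(0, 3, 1, 5, 0, 2),
  nrow = 2,
  dimnames = list(c("geneA", "geneB"), c("cell1", "cell2", "cell3"))
)
toy_meta <- data.frame(
  Batch = c("human", "human", "mouse"),
  row.names = c("cell1", "cell2", "cell3")
)

# Cells in the count matrix and the metadata should be identical and in the same order.
cells_match <- identical(colnames(toy_counts), rownames(toy_meta))
stopifnot(cells_match)

# Number of cells per batch, analogous to table(seu$Batch).
batch_sizes <- table(toy_meta$Batch)
batch_sizes
```

Running the same `identical()` check on the real objects before calling `CreateSeuratObject` is one way to catch misaligned metadata early.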
# Perform dnCIDER (high-level)
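dnCIDER consists of three steps: initial clustering of cells within each batch (`initialClustering`), computing the IDER-based similarity between the initial clusters (`getIDEr`), and merging the initial clusters into final clusters (`finalClustering`).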
```{r}
# Step 1: initial clustering
seu <- initialClustering(seu, additional.vars.to.regress = "Sample", dims = 1:15)
# Step 2: IDER-based similarity between initial clusters
ider <- getIDEr(seu, downsampling.size = 35, use.parallel = FALSE, verbose = FALSE)
# Step 3: final clustering
seu <- finalClustering(seu, ider, cutree.h = 0.35)
```
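The `cutree.h` argument suggests that the final step cuts a hierarchical clustering tree at a fixed height. The sketch here illustrates that idea with base R only. It assumes a symmetric similarity matrix between initial clusters that is converted to a distance as `1 - similarity`, which may differ from what `finalClustering` does internally. The matrix values and cluster names are invented for illustration.

```r
# Hypothetical similarity matrix among four initial clusters
# (h1, h2 from human; m1, m2 from mouse).
cluster_names <- c("h1", "m1", "h2", "m2")
sim <- matrix(
  c(1.0, 0.9, 0.1, 0.2,
    0.9, 1.0, 0.2, 0.1,
    0.1, 0.2, 1.0, 0.8,
    0.2, 0.1, 0.8, 1.0),
  nrow = 4, byrow = TRUE,
  dimnames = list(cluster_names, cluster_names)
)

# Convert similarity to distance (assumption) and build the tree.
hc <- hclust(as.dist(1 - sim), method = "complete")

# Cut the tree at height 0.35: initial clusters joined below this height
# receive the same final label.
merged <- cutree(hc, h = 0.35)
merged
```

In this toy case `h1` and `m1` are merged, as are `h2` and `m2`, because their pairwise distances (0.1 and 0.2) fall below the cut height. A smaller height keeps more initial clusters separate, and a larger height merges more of them.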
# Visualise clustering results
For visualisation, we run the standard Seurat pipeline$^2$: normalisation (`NormalizeData`), feature selection and scaling (`FindVariableFeatures` and `ScaleData`), and dimension reduction (`RunPCA` and `RunTSNE`).
```{r seurat-pipeline}
seu <- NormalizeData(seu, verbose = FALSE)
seu <- FindVariableFeatures(seu, selection.method = "vst", nfeatures = 2000, verbose = FALSE)
seu <- ScaleData(seu, verbose = FALSE)
seu <- RunPCA(seu, npcs = 20, verbose = FALSE)
seu <- RunTSNE(seu, reduction = "pca", dims = 1:12)
```
The t-SNE plot below shows the cells coloured by their dnCIDER cluster assignment.
```{r tsne-plot-CIDER-results, fig.height=3, fig.width=4}
scatterPlot(seu, "tsne", colour.by = "CIDER_cluster", title = "dnCIDER clustering results")
```
Comparing the dnCIDER results with the cell annotation from the publication$^1$, we observe that dnCIDER correctly identifies the majority of populations across the two species.
```{r tsne-plot-ground-truth, fig.height=3, fig.width=4}
scatterPlot(seu, "tsne", colour.by = "Group", title = "Ground truth of cell populations")
```
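The visual comparison can be complemented with numbers. The sketch here cross-tabulates cluster labels against annotated cell types and computes the adjusted Rand index (ARI) in base R. The label vectors are invented stand-ins for `seu$CIDER_cluster` and `seu$Group`, and `adjusted_rand_index` is a hypothetical helper written for this example, not a CIDER function.

```r
# Adjusted Rand index between two labelings of the same cells.
# Not defined when both labelings put all cells in a single group.
adjusted_rand_index <- function(x, y) {
  tab <- table(x, y)
  sum_cells <- sum(choose(tab, 2))          # pairs grouped together in both labelings
  sum_rows <- sum(choose(rowSums(tab), 2))  # pairs grouped together in x
  sum_cols <- sum(choose(colSums(tab), 2))  # pairs grouped together in y
  expected <- sum_rows * sum_cols / choose(length(x), 2)
  max_index <- (sum_rows + sum_cols) / 2
  (sum_cells - expected) / (max_index - expected)
}

# Toy labels (hypothetical): eight cells, three annotated populations.
truth <- c("alpha", "alpha", "alpha", "beta", "beta", "beta", "delta", "delta")
clusters <- c(1, 1, 1, 2, 2, 2, 3, 3)      # agrees with the annotation
clusters_off <- c(1, 1, 2, 2, 2, 2, 3, 3)  # one alpha cell placed with beta

# Contingency table of clusters against annotated populations.
table(clusters_off, truth)

ari_perfect <- adjusted_rand_index(clusters, truth)
ari_off <- adjusted_rand_index(clusters_off, truth)
c(perfect = ari_perfect, one_misplaced = ari_off)
```

An ARI of 1 indicates identical partitions, and values near 0 indicate agreement no better than chance. With the real data, the two toy vectors would be replaced by the corresponding metadata columns.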
# Technical
```{r sessionInfo}
sessionInfo()
```
# References
1. Baron, M. et al. A Single-Cell Transcriptomic Map of the Human and Mouse Pancreas Reveals Inter- and Intra-cell Population Structure. Cell Syst 3, 346–360.e4 (2016).
2. Satija, R. et al. Spatial reconstruction of single-cell gene expression data. Nat Biotechnol 33, 495–502 (2015).