CellMap

CellMap is a tool for mapping individual cells onto spatial coordinates within tissue slices. It can be used to reveal the spatial distribution of cell types, to resolve the cellular composition of the spatial spots in a tissue section, and to identify critical functional structures within biological systems.

In this tutorial, we will demonstrate how to install and use CellMap to resolve spatial transcriptomic spots at single-cell resolution.

1. Installing the package and dependencies

To install CellMap, we recommend using devtools:

library(devtools)
devtools::install_github("liuhong-jia/CellMap")

Dependencies
R version >= 4.3.0.
R packages: Seurat, dplyr, ggplot2, Matrix, clue, jsonlite, magrittr, randomForest, parallel
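
Before installing, it can help to confirm that the local R version and the packages listed above are available. The sketch below uses only base R; the helper name `missing_packages` is hypothetical and is not part of CellMap.

```r
# Packages listed under "Dependencies" above.
required <- c("Seurat", "dplyr", "ggplot2", "Matrix", "clue",
              "jsonlite", "magrittr", "randomForest", "parallel")

# Hypothetical helper: return the packages that cannot be loaded.
missing_packages <- function(pkgs) {
  ok <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!ok]
}

# Compare the running R version with the documented minimum.
r_ok <- getRversion() >= "4.3.0"
to_install <- missing_packages(required)

if (!r_ok) {
  message("R >= 4.3.0 is required; found ", as.character(getRversion()))
}
if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
}
```

Anything reported in `to_install` can then be installed from CRAN before running `install_github()`.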

2. Importing packages and preparing input data (scRNA-seq and spatial transcriptomics data)

The 10X Visium low-resolution ST data of human HER2+ breast cancer is used as an example (https://zenodo.org/records/4739739).

library(CellMap)
library(devtools)
library(Seurat)
library(dplyr)
library(ggplot2)   # provides labs(), theme(), and element_*() used in the plotting code
library(Matrix)
library(clue)
library(jsonlite)
library(magrittr)
library(randomForest)
library(parallel)

sc.obj <- readRDS("sc.obj.rds")
st.obj <- readRDS("st.obj.rds")

To ensure compatibility with CellMap, the spatial transcriptomics (ST) data should first be processed using the createSpObj function, which standardizes the data into the required format for subsequent analysis within the CellMap framework.

st.obj <- createSpObj(counts, coord.df, coord.label = c("imagerow", "imagecol"), meta.data = metadata)
# counts: The counts expression matrix of ST data, where rows represent genes and columns represent barcodes.
# coord.df: A data frame containing spatial coordinates for the barcodes in the ST data.
# coord.label: A character vector specifying the column names in coord.df for the spatial coordinates.
# meta.data: Optional metadata data frame, where rows are barcodes.
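
To make the expected shapes concrete, the sketch below builds toy versions of the three inputs and checks that they line up. It assumes that the row names of `coord.df` and `metadata` carry the barcodes (as in the Visium HD example in section 6); the gene names, barcodes, and the `total_counts` column are made up, and a small dense matrix stands in for the real counts matrix.

```r
set.seed(1)
genes <- paste0("gene", 1:5)
barcodes <- paste0("spot", 1:4)

# Genes in rows, barcodes in columns.
counts <- matrix(rpois(20, lambda = 3), nrow = 5,
                 dimnames = list(genes, barcodes))

# One row per barcode, with the two coordinate columns.
coord.df <- data.frame(imagerow = c(10, 10, 20, 20),
                       imagecol = c(5, 15, 5, 15),
                       row.names = barcodes)

# Optional per-barcode metadata (illustrative column).
metadata <- data.frame(total_counts = colSums(counts),
                       row.names = barcodes)

# Sanity checks before handing the inputs over.
coord.label <- c("imagerow", "imagecol")
inputs_ok <- all(coord.label %in% colnames(coord.df)) &&
  identical(colnames(counts), rownames(coord.df)) &&
  identical(rownames(metadata), colnames(counts))

# With CellMap loaded, the objects would then be passed on as:
# st.obj <- createSpObj(counts, coord.df, coord.label = coord.label, meta.data = metadata)
```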

Set the identities of the scRNA-seq data by using the cell-type annotation column when available; otherwise, use the clustering results (seurat_clusters).

Idents(sc.obj) <- sc.obj$celltype
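
The choice between the annotation column and the clustering results can be written as a small fallback. This is only a sketch: `pick_label_column` is a hypothetical helper, and the toy data frames stand in for `sc.obj@meta.data`.

```r
# Hypothetical helper: choose the metadata column to use as cell identities.
pick_label_column <- function(meta, preferred = "celltype",
                              fallback = "seurat_clusters") {
  if (preferred %in% colnames(meta)) return(preferred)
  if (fallback %in% colnames(meta)) return(fallback)
  stop("Neither '", preferred, "' nor '", fallback, "' found in the metadata")
}

# Toy metadata: one annotated dataset, one with clusters only.
meta_annotated <- data.frame(celltype = c("T-cells", "CAFs"),
                             seurat_clusters = c("0", "1"))
meta_clusters_only <- data.frame(seurat_clusters = c("0", "1"))

col_annotated <- pick_label_column(meta_annotated)
col_clusters  <- pick_label_column(meta_clusters_only)

# With a real Seurat object:
# label.col <- pick_label_column(sc.obj@meta.data)
# Idents(sc.obj) <- sc.obj@meta.data[[label.col]]
```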

3. Setting the parameters

Parameter	Description
st.obj	Seurat object of spatial transcriptome data.
sc.obj	Seurat object of scRNA-seq data.
coord	Column names of the coordinates in the ST images slot, e.g. coord = c("x","y") or coord = c("imagerow","imagecol").
norm.method	Normalization method for scRNA-seq and ST data: norm.method = "NormalizeData" or "SCTransform".
celltype.column	The column name for cell type in the single-cell Seurat object. Default: "idents".
sc.sub.size	Downsampling proportion or number for scRNA-seq data. Default: NULL.
min.sc.cells	The minimum number of cells required for a cell type in the scRNA-seq data. Default: 50.
factor.size	Factor size for scaling the weight of gene expression. Default: 0.1.
seed.num	Number of seed genes of each cell type for recognizing candidate markers. Default: 30.
pvalue.cut	Threshold for filtering cell type marker genes. Default: 0.1.
knn	The number of nearest neighboring single cells for each spot. Set to 5 for low-resolution data and 1 for high-resolution data.
mean.cell.num	The average number of single cells in a spot. Set to 5 for low-resolution data and 1 for high-resolution data.
max.cell.num	The maximum number of cells within each spot. A value of 1 indicates that each spot contains only a single cell.
res	Resolution for clustering ST spots. Default: 0.5.
n.workers	Number of cores to be used for parallel processing. Default: 4.
verbose	Show running messages or not. Default: TRUE.
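
The resolution-dependent settings in the table can be collected into presets. The sketch below is not part of CellMap: `cellmap_preset` is a hypothetical helper, and `max.cell.num = 10` for low-resolution data is taken from the Visium example in section 4 rather than from a documented default.

```r
# Hypothetical helper: arguments that depend on the ST resolution.
cellmap_preset <- function(resolution = c("low", "high")) {
  resolution <- match.arg(resolution)
  switch(resolution,
         low  = list(knn = 5, mean.cell.num = 5, max.cell.num = 10),
         high = list(knn = 1, mean.cell.num = 1, max.cell.num = 1))
}

low.res.args  <- cellmap_preset("low")
high.res.args <- cellmap_preset("high")

# With CellMap loaded, a preset could be merged into the call via do.call():
# results <- do.call(CellMap, c(list(st.obj = st.obj, sc.obj = sc.obj,
#                                    coord = c("x", "y")), high.res.args))
```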

4. Run CellMap to assign single cells on 10X Visium spatial transcriptome data

results <- CellMap(st.obj = st.obj,
                   sc.obj = sc.obj,
                   coord = c("imagerow","imagecol"),
                   norm.method = "NormalizeData",
                   celltype.column = "idents",
                   sc.sub.size = NULL,
                   min.sc.cell = 50,
                   factor.size = 0.1,
                   seed.num = 30,
                   pvalue.cut = 0.1,
                   knn = 5,
                   mean.cell.num = 5,
                   max.cell.num = 10,
                   res = 0.5,
                   n.workers = 4,
                   verbose = TRUE)

 [INFO] Identification of cell type-specific genes...
 [INFO] Estimate the number of single cells in the spot
 [INFO] Integrate single-cell and spatial spot data
 [INFO] Train a random forest model and predict,waiting...
 [INFO] Map single cells onto spatial spots
 [INFO] Construct Seurat object
 [INFO] Finish!

Details of the results are described in the table below.

Output	Details
sc.out	Seurat object of spatial transcriptomic data with single-cell resolution.
decon	The cellular composition of each spot in the tissue section.
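
As a quick first look at the mapped cells, the cell-type labels can be tabulated. The sketch below uses a toy character vector standing in for `results$sc.out$CellType` (the metadata column used for plotting in this tutorial); the labels and counts are made up.

```r
# Toy labels standing in for results$sc.out$CellType.
cell.type <- c("T-cells", "T-cells", "CAFs", "Cancer Epithelial",
               "Cancer Epithelial", "Cancer Epithelial", "B-cells", "T-cells")

# Number of mapped cells per type, and the corresponding proportions.
type.counts <- table(cell.type)
type.props  <- round(prop.table(type.counts), 3)

# Most abundant cell types first.
sort(type.counts, decreasing = TRUE)
```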

Visualization

# Named color vector: one entry per cell type.
colors <- c("B-cells" = "#e68fac", "CAFs" = "#a1caf1", "Cancer Epithelial" = "#f7b565", "Endothelial" = "#875692",
            "Myeloid" = "#d14c6f", "Normal Epithelial" = "#894846", "Plasmablasts" = "#848482", "PVL" = "#56af8f", "T-cells" = "#0067a5")

# UMAP of the scRNA-seq reference, colored by annotated cell type.
p1 <- DimPlot(sc.obj, group.by = "celltype_major", label = TRUE, label.size = 6,
              cols = colors, pt.size = 1.5, repel = TRUE) +
  NoLegend() + labs(x = "UMAP1", y = "UMAP2", title = "CellType") +
  theme(panel.border = element_rect(fill = NA, color = "black", size = 1, linetype = "solid"),
        plot.title = element_text(hjust = 0.5, size = 20, face = "bold"),
        axis.text = element_text(size = 12, face = "bold"),
        axis.title.x = element_text(size = 14),
        axis.title.y = element_text(size = 14))

# Mapped single cells on the tissue section.
p2 <- SpatialDimPlot(results$sc.out, group.by = "CellType", pt.size.factor = 1, label.size = 8, cols = colors) +
  theme(legend.title = element_text(size = 14),
        legend.text = element_text(size = 12))

p1 + p2
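
Because the colors are looked up by name, it is worth checking that every cell type has an entry in the color vector before plotting. A minimal base R sketch with toy values (the vectors `toy.colors` and `toy.types` are illustrative):

```r
# Toy color vector and toy cell-type labels.
toy.colors <- c("B-cells" = "#e68fac", "CAFs" = "#a1caf1", "T-cells" = "#0067a5")
toy.types  <- c("B-cells", "T-cells", "Myeloid", "T-cells")

# Cell types that have no color assigned.
uncolored <- setdiff(unique(toy.types), names(toy.colors))
if (length(uncolored) > 0) {
  message("No color defined for: ", paste(uncolored, collapse = ", "))
}
```

With real data, `toy.types` would be replaced by the plotted metadata column, for example `results$sc.out$CellType`.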

5. Run CellMap with single-cell data lacking cell type annotation

When cell type information is not provided in the single-cell dataset, our in-house automated annotation tool, scAnno, can be used to identify cell types. The HER2+ breast cancer data is again used as an example.

library(scAnno)
data(gene.anno)
data(tcga.data.u)
data(hcl.sc)
ref.obj <- hcl.sc
ref.expr <- GetAssayData(ref.obj, slot = 'data') %>% as.data.frame
ref.anno <- Idents(ref.obj) %>% as.character
sc.obj <- readRDS("sc.obj.rds")
# Use the cluster labels as identities, with factor levels in sorted order.
sc.obj$seurat_clusters <- factor(sc.obj$seurat_clusters, levels = sort(unique(sc.obj$seurat_clusters)))
Idents(sc.obj) <- sc.obj$seurat_clusters
results <- scAnno(query = sc.obj,
	ref.expr = ref.expr,
	ref.anno = ref.anno,
	save.markers = "markers",
	cluster.col = "seurat_clusters",
	factor.size = 0.1,
	pvalue.cut = 0.01,
	seed.num = 10,
	redo.markers = FALSE,
	gene.anno = gene.anno,
	permut.num = 100,
	permut.p = 0.01,
	show.plot = TRUE,
	verbose = TRUE,
	tcga.data.u = tcga.data.u
	)
sc.obj <- results$query
Idents(sc.obj) <- sc.obj$scAnno

Run CellMap after annotating cell types in the single-cell data.

st.obj <- readRDS("st.obj.rds")
results <-  CellMap(st.obj = st.obj,
                    sc.obj = sc.obj,
                    coord = c("imagerow","imagecol"),
                    norm.method = "NormalizeData",
                    celltype.column = "idents",
                    sc.sub.size = NULL,
                    min.sc.cell = 50,
                    factor.size = 0.1,
                    seed.num = 30,
                    pvalue.cut = 0.1,
                    knn = 5,
                    mean.cell.num = 5,
                    max.cell.num = 10,
                    n.workers = 4,
                    verbose = TRUE)

Visualization

table(Idents(sc.obj))
# B cell   Endothelial cell    Epithelial cell         Fibroblast
#    887               1092               2457               1428
#   hESC            Myeloid Smooth muscle cell             T cell
#    471               1290                915              10771

colors <- c(
 "B cell" = "#e68fac",
 "Fibroblast" = "#a1caf1",
 "Epithelial cell" = "#f7b565",
 "Endothelial cell" = "#875692",
 "Myeloid" = "#d14c6f",
 "Smooth muscle cell" = "#894846",
 "hESC" = "#848482",
 "T cell" = "#0067a5"
)
p1 <- DimPlot(sc.obj, group.by = "scAnno", label = TRUE, label.size = 6,
        cols = colors,
        pt.size = 1,
        repel = TRUE) +
  NoLegend() + labs(x = "UMAP1", y = "UMAP2") +
  theme(panel.border = element_rect(fill = NA, color = "black", size = 1, linetype = "solid"),
        plot.title = element_text(hjust = 0.5, size = 20, face = "bold"),
        axis.text = element_text(size = 12, face = "bold"),
        axis.title.x = element_text(size = 14),
        axis.title.y = element_text(size = 14))
p2 <- SpatialDimPlot(results$sc.out, group.by = "CellType", pt.size.factor = 1, label.size = 8, cols = colors) + 
  theme(
    legend.title = element_text(size = 14),  
    legend.text = element_text(size = 12)   
  )
p1 + p2

6. Run CellMap to assign single cells on high-resolution ST data, such as Slide-seq V2, Stereo-seq, Visium HD, and imaging-based ST platforms

To ensure compatibility with CellMap, the spatial transcriptomics (ST) data derived from high-resolution datasets across multiple platforms should first be processed using the createSpObj function, which standardizes the data into the required format for subsequent analysis within the CellMap framework.

st.obj <- createSpObj(counts, coord.df, coord.label = c("x", "y"), meta.data = metadata)
# counts: The counts expression matrix of ST data, where rows represent genes and columns represent barcodes.
# coord.df: A data frame containing spatial coordinates for the barcodes in the ST data.
# coord.label: A character vector specifying the column names in coord.df for the spatial coordinates.
# meta.data: Optional metadata data frame, where rows are barcodes.

Assign single cells to spatial spots by setting knn = 1 and mean.cell.num = 1. Since each spot in Visium HD data contains only a single cell, the parameter max.cell.num is set to 1 for mapping.

Visium HD high-resolution ST data of human CRC is used as an example (https://www.10xgenomics.com/products/visium-hd-spatial-gene-expression/dataset-human-crc).

scRNA-seq preprocessing

crc.obj <- Read10X_h5("path/HumanColonCancer_Flex_Multiplex_count_filtered_feature_bc_matrix.h5")
crc.sc.obj <- CreateSeuratObject(crc.obj)
crc.sc.obj$orig.ident <- "CRC"
metadata <- read.csv("SingleCell_MetaData.csv")

rownames(metadata) <- metadata$Barcode  
crc.sc.obj <- AddMetaData(crc.sc.obj, metadata = metadata)

# Keep patient P1CRC, then keep only the cells that passed QC.
p1.sc.obj <- subset(crc.sc.obj, subset = Patient == "P1CRC")
sc.obj <- subset(p1.sc.obj, subset = QCFilter == "Keep")

# Merge the cell subtypes into broader cell types.
level2_to_level1 <- c(
  "CAF" = "Fibroblast",
  "CD4 T cell" = "T cells",
  "CD8 Cytotoxic T cell" = "T cells",
  "Endothelial" = "Endothelial",
  "Enteric Glial" = "Neuronal",
  "Enterocyte" = "Intestinal Epithelial",
  "Epithelial" = "Intestinal Epithelial",
  "Fibroblast" = "Fibroblast",
  "Goblet" = "Intestinal Epithelial",
  "Lymphatic Endothelial" = "Endothelial",
  "Macrophage" = "Myeloid",
  "Mast" = "Myeloid",
  "Mature B" = "B cells",
  "mRegDC" = "Myeloid",
  "Myofibroblast" = "Fibroblast",
  "Neuroendocrine" = "Neuronal",
  "Neutrophil" = "Myeloid",
  "pDC" = "Myeloid",
  "Pericytes" = "Endothelial",
  "Plasma" = "B cells",
  "Proliferating Immune II" = "T cells",
  "SM Stress Response" = "Smooth Muscle",
  "Smooth Muscle" = "Smooth Muscle",
  "Tuft" = "Intestinal Epithelial",
  "Tumor I" = "Tumor",
  "Tumor II" = "Tumor",
  "Tumor III" = "Tumor",
  "Tumor V" = "Tumor",
  "Unknown III (SM)" = "Smooth Muscle",
  "vSM" = "Smooth Muscle"
)

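# Look up by label (not factor code) and drop the names so they are not mistaken for cell barcodes.
sc.obj$celltype <- unname(level2_to_level1[as.character(sc.obj$Level2)])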
Idents(sc.obj) <- sc.obj$celltype
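
A named-vector lookup returns NA for any subtype that has no entry in the mapping, so it is worth checking for unmapped labels before setting identities. The sketch below uses a shortened toy mapping; "Unlisted subtype" is a made-up label used only to trigger the check.

```r
# Shortened toy mapping from fine to broad cell types.
toy.map <- c("CAF" = "Fibroblast",
             "CD4 T cell" = "T cells",
             "Mature B" = "B cells")

# Toy fine-grained labels; the last one has no entry in the mapping.
level2 <- c("CAF", "Mature B", "CD4 T cell", "Unlisted subtype")

# Look up by name and drop the names carried over from the mapping.
broad <- unname(toy.map[level2])

# Labels that would silently become NA.
unmapped <- unique(level2[is.na(broad)])
if (length(unmapped) > 0) {
  message("Unmapped subtypes: ", paste(unmapped, collapse = ", "))
}
```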

ST data preprocessing

st.data <- readRDS("path/st.visiumHD.P1CRC.8um.rds")
metadata <- read.csv("DeconvolutionResults_P1CRC.csv")
rownames(metadata) <- metadata$barcode
metadata <- metadata[rownames(st.data@meta.data),]
st.data <- AddMetaData(st.data, metadata = metadata)
st.data <- subset(st.data, subset = DeconvolutionClass == "singlet")

counts <- GetAssayData(st.data,layer = "counts")
coord.df <- st.data@images$slice1.008um$centroids@coords %>% as.data.frame
rownames(coord.df) <- st.data@images$slice1.008um@boundaries$centroids@cells
metadata <- st.data@meta.data
st.obj <- createSpObj(counts, coord.df, coord.label = c("x", "y"), meta.data = metadata)
# Normalize, reduce dimensions, and cluster the ST data.
st.obj <- SCTransform(st.obj, assay = "Spatial") %>%
  RunPCA() %>%
  RunUMAP(dims = 1:30) %>%
  FindNeighbors() %>%
  FindClusters(resolution = 0.3)
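
The step `metadata[rownames(st.data@meta.data), ]` above reorders the external metadata by barcode. Row-name indexing of a data frame returns a row of NAs for any barcode that is absent, which can be checked explicitly. A toy sketch in base R (the barcodes and values are made up):

```r
# Barcode order used by the ST object (toy values).
st.barcodes <- c("bc1", "bc2", "bc3")

# External metadata in a different order, with one barcode missing.
toy.meta <- data.frame(barcode = c("bc3", "bc1"),
                       DeconvolutionClass = c("singlet", "doublet"))
rownames(toy.meta) <- toy.meta$barcode

# Reorder to match the ST object; missing barcodes yield NA rows.
aligned <- toy.meta[st.barcodes, ]
rownames(aligned) <- st.barcodes

# Barcodes without metadata.
no.meta <- st.barcodes[is.na(aligned$barcode)]
```

Barcodes listed in `no.meta` would carry NA in `DeconvolutionClass` and would therefore not pass a filter such as `DeconvolutionClass == "singlet"`.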

Run CellMap

results <- CellMap(st.obj = st.obj,
                   sc.obj = sc.obj,
                   coord = c("x","y"),
                   norm.method = "SCTransform",
                   celltype.column = "idents",
                   sc.sub.size = NULL,
                   min.sc.cell = 50,
                   factor.size = 0.1,
                   seed.num = 30,
                   pvalue.cut = 0.1,
                   knn = 1,
                   mean.cell.num = 1,
                   max.cell.num = 1,
                   res = 0.5,
                   n.workers = 4,
                   verbose = TRUE)

# Visualize the mapped cells on the tissue section.
colors <- c("B cells" = "#6baed6", "Endothelial" = "#66c2a5", "Fibroblast" = "#f781bf",
            "Intestinal Epithelial" = "#fc8d62", "Myeloid" = "#8da0cb", "Neuronal" = "#377eb8",
            "Smooth Muscle" = "#ffed6f", "T cells" = "#ccebc5", "Tumor" = "#ee6655")
SpatialDimPlot(results$sc.out, group.by = "CellType", pt.size.factor = 1, label.size = 8, cols = colors, image.alpha = 0) +
  theme(legend.title = element_text(size = 14),
        legend.text = element_text(size = 12))
