
liuhong-jia/CellMap


CellMap

CellMap is a tool for precisely mapping individual cells onto spatial coordinates within tissue slices. It can be used to reveal the spatial distribution of cell types, dissect the cellular composition of spatial spots in tissue sections, and identify critical functional structures within biological systems.


In this tutorial, we demonstrate how to install and use CellMap to resolve spatial transcriptomic spots at single-cell resolution.

1. Installing the package and dependencies

To install CellMap, we recommend using devtools:

library(devtools)
devtools::install_github("liuhong-jia/CellMap")  
  • Dependencies
    R version >= 4.3.0.
    R packages: Seurat, dplyr, ggplot2, Matrix, clue, jsonlite, magrittr, randomForest, parallel
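Before installing, it can help to confirm that the environment meets the requirements listed above. The following is a minimal sketch in base R, not part of CellMap itself; the variable names (`required_pkgs`, `is_installed`, `missing_pkgs`, `r_ok`) are illustrative.

```r
# Packages listed under "Dependencies" above.
required_pkgs <- c("Seurat", "dplyr", "ggplot2", "Matrix", "clue",
                   "jsonlite", "magrittr", "randomForest", "parallel")

# TRUE/FALSE per package; requireNamespace() loads a namespace without attaching it.
is_installed <- vapply(required_pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required_pkgs[!is_installed]

# Compare the running R version with the documented minimum.
r_ok <- getRversion() >= "4.3.0"

if (!r_ok) {
  message("R ", as.character(getRversion()), " is older than the required 4.3.0")
}
if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All listed R packages are installed")
}
```

Any packages reported as missing can be installed with `install.packages()` before running `devtools::install_github()`.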

2. Importing packages and preparing input data (scRNA-seq and spatial transcriptomics data)

library(CellMap)
library(devtools)
library(Seurat)
library(dplyr)
library(ggplot2)  # provides labs(), theme(), and element_*() used in the plotting code
library(Matrix)
library(clue)
library(jsonlite)
library(magrittr)
library(randomForest)
library(parallel)
sc.obj <- readRDS("sc.obj.rds")
st.obj <- readRDS("st.obj.rds")
  • To ensure compatibility with CellMap, the spatial transcriptomics (ST) data should first be processed using the createSpObj function, which standardizes the data into the required format for subsequent analysis within the CellMap framework.
st.obj <- createSpObj(counts, coord.df, coord.label = c("imagerow", "imagecol"), meta.data = metadata)
# counts: The count expression matrix of the ST data, where rows represent genes and columns represent barcodes.
# coord.df: A data frame containing spatial coordinates for the barcodes in the ST data.
# coord.label: A character vector specifying the column names in coord.df for the spatial coordinates.
# meta.data: Optional metadata data frame, where rows are barcodes.
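To make the expected input shapes concrete, here is a small sketch that builds toy versions of `counts`, `coord.df`, and `metadata` and checks that they line up. The gene names, barcodes, and the `region` column are invented for illustration, and the requirement that row names of `coord.df` and `metadata` match the barcodes is an assumption based on the argument descriptions above, not a documented contract of createSpObj.

```r
library(Matrix)

set.seed(1)
genes <- paste0("Gene", 1:5)       # hypothetical gene names
barcodes <- paste0("spot_", 1:4)   # hypothetical spot barcodes

# Genes in rows, barcodes in columns, stored as a sparse matrix.
counts <- Matrix(matrix(rpois(20, lambda = 2), nrow = 5,
                        dimnames = list(genes, barcodes)),
                 sparse = TRUE)

# One row of coordinates per barcode.
coord.df <- data.frame(imagerow = c(10, 10, 20, 20),
                       imagecol = c(5, 15, 5, 15),
                       row.names = barcodes)

# Optional per-barcode metadata.
metadata <- data.frame(region = c("A", "A", "B", "B"), row.names = barcodes)

coord.label <- c("imagerow", "imagecol")

# Sanity checks before handing the inputs to createSpObj (assumed requirements).
stopifnot(
  all(coord.label %in% colnames(coord.df)),
  identical(colnames(counts), rownames(coord.df)),
  identical(colnames(counts), rownames(metadata))
)

# With CellMap loaded, the call would then be:
# st.obj <- createSpObj(counts, coord.df, coord.label = coord.label, meta.data = metadata)
```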
  • Set the identities of the scRNA-seq data by using the cell-type annotation column when available; otherwise, use the clustering results (seurat_clusters).
Idents(sc.obj) <- sc.obj$celltype

3. Setting the parameters

| Parameter | Description |
| --- | --- |
| st.obj | Seurat object of spatial transcriptomics data. |
| sc.obj | Seurat object of scRNA-seq data. |
| coord | Column names of the coordinates in the ST images slot: coord = c("x","y") or coord = c("imagerow","imagecol"). |
| norm.method | Normalization method for scRNA-seq and ST data: norm.method = "NormalizeData" or "SCTransform". |
| celltype.column | Column name for cell type in the single-cell Seurat object. Default: "idents". |
| sc.sub.size | Downsampling proportion or number for scRNA-seq data. Default: NULL. |
| min.sc.cells | Minimum number of cells per cell type in scRNA-seq data. Default: 50. |
| factor.size | Factor size for scaling the weight of gene expression. Default: 0.1. |
| seed.num | Number of seed genes of each cell type for recognizing candidate markers. Default: 30. |
| pvalue.cut | Threshold for filtering cell type marker genes. Default: 0.1. |
| knn | Number of nearest neighboring single cells for each spot. Set to 5 for low-resolution data and 1 for high-resolution data. |
| mean.cell.num | Average number of single cells in a spot. Set to 5 for low-resolution data and 1 for high-resolution data. |
| max.cell.num | Maximum number of cells within each spot; a value of 1 means that each spot contains only a single cell. |
| res | Resolution for clustering ST spots. Default: 0.5. |
| n.workers | Number of cores to be used for parallel processing. Default: 4. |
| verbose | Show running messages or not. Default: TRUE. |
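The resolution-dependent parameters can be collected in a small helper so that they are switched together. This is a sketch only: `cellmap_resolution_params` is a hypothetical helper, not a CellMap function. The values for knn and mean.cell.num come from the parameter descriptions above, and the max.cell.num values (10 for low resolution, 1 for high resolution) are taken from the example calls in this tutorial.

```r
# Hypothetical helper: returns the resolution-dependent CellMap arguments.
cellmap_resolution_params <- function(resolution = c("low", "high")) {
  resolution <- match.arg(resolution)
  if (resolution == "low") {
    # Multi-cell spots, e.g. 10X Visium.
    list(knn = 5, mean.cell.num = 5, max.cell.num = 10)
  } else {
    # Near single-cell spots, e.g. Visium HD or Slide-seq V2.
    list(knn = 1, mean.cell.num = 1, max.cell.num = 1)
  }
}

params <- cellmap_resolution_params("high")
str(params)

# With CellMap loaded, the list can be spliced into the call:
# results <- do.call(CellMap, c(list(st.obj = st.obj, sc.obj = sc.obj,
#                                    coord = c("x", "y")), params))
```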

4. Run CellMap to assign single cells on 10X Visium spatial transcriptome data

results <- CellMap(st.obj = st.obj,
                   sc.obj = sc.obj,
                   coord = c("imagerow", "imagecol"),
                   norm.method = "NormalizeData",
                   celltype.column = "idents",
                   sc.sub.size = NULL,
                   min.sc.cell = 50,
                   factor.size = 0.1,
                   seed.num = 30,
                   pvalue.cut = 0.1,
                   knn = 5,
                   mean.cell.num = 5,
                   max.cell.num = 10,
                   res = 0.5,
                   n.workers = 4,
                   verbose = TRUE)

 [INFO] Identification of cell type-specific genes...
 [INFO] Estimate the number of single cells in the spot
 [INFO] Integrate single-cell and spatial spot data
 [INFO] Train a random forest model and predict,waiting...
 [INFO] Map single cells onto spatial spots
 [INFO] Construct Seurat object
 [INFO] Finish!

Details of the results are described in the table below.

| Output | Details |
| --- | --- |
| sc.out | Seurat object of spatial transcriptomic data with single-cell resolution. |
| decon | The cellular composition of each spot in the tissue section. |
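To illustrate what a per-spot cellular composition means, the sketch below derives proportions from a toy table of mapped cells using base R. The data frame `mapped` and its `spot` column are invented for illustration; the actual layout of `results$decon` is not shown in this tutorial, so this is not a description of CellMap's output format.

```r
# Toy mapping result: one row per mapped single cell (hypothetical data).
mapped <- data.frame(
  spot = c("spot_1", "spot_1", "spot_1", "spot_2", "spot_2"),
  CellType = c("T-cells", "T-cells", "CAFs", "Myeloid", "CAFs")
)

# Count cells of each type per spot, then convert each row to proportions.
cell_counts <- table(mapped$spot, mapped$CellType)
composition <- prop.table(cell_counts, margin = 1)

round(composition, 2)
```

Each row of `composition` sums to 1, so the entries can be read as the fraction of a spot occupied by each cell type.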
  • Visualization
colors <- c("B-cells" = "#e68fac", "CAFs" = "#a1caf1", "Cancer Epithelial" = "#f7b565",
            "Endothelial" = "#875692", "Myeloid" = "#d14c6f", "Normal Epithelial" = "#894846",
            "Plasmablasts" = "#848482", "PVL" = "#56af8f", "T-cells" = "#0067a5")

# UMAP of the scRNA-seq reference, colored by cell type.
# "celltype_major" is the annotation column of this example dataset; replace it with your own column.
p1 <- DimPlot(sc.obj, group.by = "celltype_major", label = TRUE, label.size = 6,
              cols = colors, pt.size = 1.5, repel = TRUE) +
  NoLegend() +
  labs(x = "UMAP1", y = "UMAP2", title = "CellType") +
  theme(panel.border = element_rect(fill = NA, color = "black", size = 1, linetype = "solid"),
        plot.title = element_text(hjust = 0.5, size = 20, face = "bold"),
        axis.text = element_text(size = 12, face = "bold"),
        axis.title.x = element_text(size = 14),
        axis.title.y = element_text(size = 14))

# Spatial map of the single cells assigned by CellMap.
p2 <- SpatialDimPlot(results$sc.out, group.by = "CellType", pt.size.factor = 1,
                     label.size = 8, cols = colors) +
  theme(legend.title = element_text(size = 14),
        legend.text = element_text(size = 12))

p1 + p2
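Because the palette is a named vector, every cell type label in the data needs a matching name, otherwise that type can end up without an assigned color. A quick base R check, sketched here with a hypothetical label vector `cell_types` (in practice this might come from the mapped object's CellType column):

```r
colors <- c("B-cells" = "#e68fac", "CAFs" = "#a1caf1", "Cancer Epithelial" = "#f7b565",
            "Endothelial" = "#875692", "Myeloid" = "#d14c6f", "Normal Epithelial" = "#894846",
            "Plasmablasts" = "#848482", "PVL" = "#56af8f", "T-cells" = "#0067a5")

# Hypothetical labels; "Cycling" is deliberately absent from the palette.
cell_types <- c("T-cells", "CAFs", "Myeloid", "T-cells", "Cycling")

# Labels present in the data but missing from the palette.
missing_colors <- setdiff(unique(cell_types), names(colors))

if (length(missing_colors) > 0) {
  message("No color defined for: ", paste(missing_colors, collapse = ", "))
}
```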


5. Run CellMap with single-cell data lacking cell type annotation

  • When the single-cell dataset does not provide cell type information, we apply our in-house automated annotation tool, scAnno, to identify cell types. We also use HER2+ breast cancer data as the example here.
library(scAnno)
data(gene.anno)
data(tcga.data.u)
data(hcl.sc)
ref.obj <- hcl.sc
ref.expr <- GetAssayData(ref.obj, slot = 'data') %>% as.data.frame
ref.anno <- Idents(ref.obj) %>% as.character
sc.obj <- readRDS("sc.obj.rds")
sc.obj$seurat_clusters <- factor(sc.obj$seurat_clusters) 
sc.obj$seurat_clusters <- factor(sc.obj$seurat_clusters, levels = sort(unique(sc.obj$seurat_clusters)))
Idents(sc.obj) <- sc.obj$seurat_clusters
results <- scAnno(query = sc.obj,
                  ref.expr = ref.expr,
                  ref.anno = ref.anno,
                  save.markers = "markers",
                  cluster.col = "seurat_clusters",
                  factor.size = 0.1,
                  pvalue.cut = 0.01,
                  seed.num = 10,
                  redo.markers = FALSE,
                  gene.anno = gene.anno,
                  permut.num = 100,
                  permut.p = 0.01,
                  show.plot = TRUE,
                  verbose = TRUE,
                  tcga.data.u = tcga.data.u)
sc.obj <- results$query
Idents(sc.obj) <- sc.obj$scAnno
  • Running CellMap after annotating cell types in single-cell data.
st.obj <- readRDS("st.obj.rds")
results <-  CellMap(st.obj = st.obj,
                    sc.obj = sc.obj,
                    coord = c("imagerow","imagecol"),
                    norm.method = "NormalizeData",
                    celltype.column = "idents",
                    sc.sub.size = NULL,
                    min.sc.cell = 50,
                    factor.size = 0.1,
                    seed.num = 30,
                    pvalue.cut = 0.1,
                    knn = 5,
                    mean.cell.num = 5,
                    max.cell.num = 10,
                    n.workers = 4,
                    verbose = TRUE)
  • Visualization
table(Idents(sc.obj))
# B cell   Endothelial cell    Epithelial cell         Fibroblast
#  887               1092               2457               1428
# hESC            Myeloid Smooth muscle cell             T cell
#  471               1290                915              10771
colors <- c(
 "B cell" = "#e68fac",
 "Fibroblast" = "#a1caf1",
 "Epithelial cell" = "#f7b565",
 "Endothelial cell" = "#875692",
 "Myeloid" = "#d14c6f",
 "Smooth muscle cell" = "#894846",
 "hESC" = "#848482",
 "T cell" = "#0067a5"
)
p1 <- DimPlot(sc.obj, group.by = "scAnno", label = TRUE, label.size = 6,
              cols = colors, pt.size = 1, repel = TRUE) +
  NoLegend() +
  labs(x = "UMAP1", y = "UMAP2") +
  theme(panel.border = element_rect(fill = NA, color = "black", size = 1, linetype = "solid"),
        plot.title = element_text(hjust = 0.5, size = 20, face = "bold"),
        axis.text = element_text(size = 12, face = "bold"),
        axis.title.x = element_text(size = 14),
        axis.title.y = element_text(size = 14))
p2 <- SpatialDimPlot(results$sc.out, group.by = "CellType", pt.size.factor = 1, label.size = 8, cols = colors) + 
  theme(
    legend.title = element_text(size = 14),  
    legend.text = element_text(size = 12)   
  )
p1 + p2


6. Run CellMap to assign single cells on high-resolution ST data, such as Slide-seq V2, Stereo-seq, Visium HD, and imaging-based ST platforms

  • To ensure compatibility with CellMap, the spatial transcriptomics (ST) data derived from high-resolution datasets across multiple platforms should first be processed using the createSpObj function, which standardizes the data into the required format for subsequent analysis within the CellMap framework.
st.obj <- createSpObj(counts, coord.df, coord.label = c("x", "y"), meta.data = metadata)
# counts: The count expression matrix of the ST data, where rows represent genes and columns represent barcodes.
# coord.df: A data frame containing spatial coordinates for the barcodes in the ST data.
# coord.label: A character vector specifying the column names in coord.df for the spatial coordinates.
# meta.data: Optional metadata data frame, where rows are barcodes.
  • scRNA-seq data preprocessing
crc.obj <- Read10X_h5("path/HumanColonCancer_Flex_Multiplex_count_filtered_feature_bc_matrix.h5")
crc.sc.obj <- CreateSeuratObject(crc.obj)
crc.sc.obj$orig.ident <- "CRC"
metadata <- read.csv("SingleCell_MetaData.csv")

rownames(metadata) <- metadata$Barcode  
crc.sc.obj <- AddMetaData(crc.sc.obj, metadata = metadata)

p1.sc.obj <- subset(crc.sc.obj, subset = Patient == "P1CRC")
# Keep only the cells of patient P1CRC that passed quality control.
sc.obj <- subset(p1.sc.obj, subset = QCFilter == "Keep")

# Merge the cell subtypes into broader cell types.
level2_to_level1 <- c(
  "CAF" = "Fibroblast",
  "CD4 T cell" = "T cells",
  "CD8 Cytotoxic T cell" = "T cells",
  "Endothelial" = "Endothelial",
  "Enteric Glial" = "Neuronal",
  "Enterocyte" = "Intestinal Epithelial",
  "Epithelial" = "Intestinal Epithelial",
  "Fibroblast" = "Fibroblast",
  "Goblet" = "Intestinal Epithelial",
  "Lymphatic Endothelial" = "Endothelial",
  "Macrophage" = "Myeloid",
  "Mast" = "Myeloid",
  "Mature B" = "B cells",
  "mRegDC" = "Myeloid",
  "Myofibroblast" = "Fibroblast",
  "Neuroendocrine" = "Neuronal",
  "Neutrophil" = "Myeloid",
  "pDC" = "Myeloid",
  "Pericytes" = "Endothelial",
  "Plasma" = "B cells",
  "Proliferating Immune II" = "T cells",
  "SM Stress Response" = "Smooth Muscle",
  "Smooth Muscle" = "Smooth Muscle",
  "Tuft" = "Intestinal Epithelial",
  "Tumor I" = "Tumor",
  "Tumor II" = "Tumor",
  "Tumor III" = "Tumor",
  "Tumor V" = "Tumor",
  "Unknown III (SM)" = "Smooth Muscle",
  "vSM" = "Smooth Muscle"
)

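# as.character() avoids indexing by factor codes; unname() drops the subtype names
# so that they are not mistaken for cell barcodes when the column is added.
sc.obj$celltype <- unname(level2_to_level1[as.character(sc.obj$Level2)])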
Idents(sc.obj) <- sc.obj$celltype
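A lookup through a named vector returns NA for any subtype that is not listed in the map, so it is worth checking for unmapped labels before running CellMap. The sketch below uses an excerpt of the mapping above and a hypothetical label vector `level2`; "Some Other Subtype" is an invented label used only to trigger the check.

```r
# Excerpt of the subtype-to-type map defined above.
level2_to_level1 <- c("CAF" = "Fibroblast", "Mast" = "Myeloid", "Tumor I" = "Tumor")

# Hypothetical subtype labels; the last one has no entry in the map.
level2 <- c("CAF", "Tumor I", "Mast", "Some Other Subtype")

# Same lookup as in the tutorial code: character index, names dropped.
celltype <- unname(level2_to_level1[as.character(level2)])

# Subtypes that did not map to a broader cell type.
unmapped <- unique(level2[is.na(celltype)])

if (length(unmapped) > 0) {
  message("Unmapped subtypes: ", paste(unmapped, collapse = ", "))
}
```

Cells with an unmapped subtype would carry NA as their cell type, so they should either be added to the map or removed before setting the identities.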
  • ST data preprocessing
st.data <- readRDS("path/st.visiumHD.P1CRC.8um.rds")
metadata <- read.csv("DeconvolutionResults_P1CRC.csv")
rownames(metadata) <- metadata$barcode
metadata <- metadata[rownames(st.data@meta.data),]
st.data <- AddMetaData(st.data, metadata = metadata)
st.data <- subset(st.data, subset = DeconvolutionClass == "singlet")

counts <- GetAssayData(st.data, layer = "counts")
coord.df <- st.data@images$slice1.008um$centroids@coords %>% as.data.frame
rownames(coord.df) <- st.data@images$slice1.008um@boundaries$centroids@cells
metadata <- st.data@meta.data
st.obj <- createSpObj(counts, coord.df, coord.label = c("x", "y"), meta.data = metadata)
st.obj <- SCTransform(st.obj, assay = "Spatial") %>%
  RunPCA() %>%
  RunUMAP(dims = 1:30) %>%
  FindNeighbors() %>%
  FindClusters(resolution = 0.3)
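The metadata alignment in the ST preprocessing above indexes the metadata table by the barcodes of the spatial object. If some barcodes are absent from the CSV, that indexing silently yields rows of NA. The sketch below shows one way to detect this with `match()`, using toy data; the barcodes and the "doublet" value are invented for illustration.

```r
# Barcodes present in the spatial object (hypothetical).
spot_barcodes <- c("s1", "s2", "s3")

# Metadata covering only two of them, in a different order (hypothetical).
metadata <- data.frame(barcode = c("s3", "s1"),
                       DeconvolutionClass = c("singlet", "doublet"))
rownames(metadata) <- metadata$barcode

# Position of each spot barcode in the metadata; NA where it is absent.
idx <- match(spot_barcodes, rownames(metadata))
missing_barcodes <- spot_barcodes[is.na(idx)]

# Reorder to the spot order; absent barcodes become rows of NA.
aligned <- metadata[idx, , drop = FALSE]
rownames(aligned) <- spot_barcodes

if (length(missing_barcodes) > 0) {
  message("Barcodes without metadata: ", paste(missing_barcodes, collapse = ", "))
}
```

Spots whose DeconvolutionClass is NA are dropped by a filter such as `DeconvolutionClass == "singlet"`, so reporting them first makes that loss visible.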
  • Run CellMap
results <- CellMap(st.obj = st.obj,
                    sc.obj = sc.obj,
                    coord = c("x","y"),
                    norm.method = "SCTransform",
                    celltype.column = "idents",
                    sc.sub.size = NULL,
                    min.sc.cell = 50,
                    factor.size = 0.1,
                    seed.num = 30,
                    pvalue.cut = 0.1,
                    knn = 1,
                    mean.cell.num = 1,
                    max.cell.num = 1,
                    res = 0.5,
                    n.workers = 4,
                    verbose = TRUE)
colors <- c("B cells" = "#6baed6", "Endothelial" = "#66c2a5", "Fibroblast" = "#f781bf",
            "Intestinal Epithelial" = "#fc8d62", "Myeloid" = "#8da0cb", "Neuronal" = "#377eb8",
            "Smooth Muscle" = "#ffed6f", "T cells" = "#ccebc5", "Tumor" = "#ee6655")
SpatialDimPlot(results$sc.out, group.by = "CellType", pt.size.factor = 1, label.size = 8, cols = colors,image.alpha = 0) + 
                   theme(legend.title = element_text(size = 14),  
                   legend.text = element_text(size = 12))

