## CellChat analysis of multiple spatial transcriptomics datasets
Suoqin Jin
22 February, 2024
This vignette outlines the steps of inference, analysis and visualization of cell-cell communication networks for multiple spatial transcriptomics datasets using CellChat. We showcase this by applying CellChat to two replicates from human spatial intestine datasets, which were downloaded from https://simmonslab.shinyapps.io/FetalAtlasDataPortal/.

Below we briefly describe the key steps of applying CellChat to multiple spatial transcriptomics datasets. For detailed descriptions of the methods and steps, please check the vignette on applying CellChat to an individual spatially resolved transcriptomics dataset. For guidance on different types of spatial transcriptomics data, please check the FAQ vignette on applying CellChat to spatially resolved transcriptomics data.

### Load the required libraries

In [None]:
ptm = Sys.time()
library(CellChat)
library(patchwork)

### Part I: Data input & processing and initialization of CellChat object
#### Load data

In [None]:
library(Seurat)
library(CellChat)
library(jsonlite)
library(Matrix)

In [None]:
setwd("/media/bio/Disk/Research Data/EBV/omicverse")

In [None]:
sample_ids <- c('NPC_ST05', 'NPC_ST06', 'NPC_ST07', 'NPC_ST08', 'NPC_ST09',
                'NPC_ST10', 'NPC_ST11', 'NPC_ST12', 'NPC_ST16', 'NPC_ST17',
                'NPC_ST18', 'NPC_ST19')

base_dir <- "Dataset/GSE206245"
tangram_path <- "Processed Data/tangram_ct_pred.csv"
scNiche_path <- "Processed Data/GSE206245_NPC_ST_scNiche.csv"
spot.size <- 65

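# Assign each spot the cell type with the highest predicted score
assignLabels <- function(object, prediction = "tangram_ct_pred") {
  pred <- object[[prediction]]@data
  # Drop the last row before taking the maximum. This assumes the last row is
  # not a cell type (e.g. a summary row); verify this for your prediction table.
  pred <- pred[1:(nrow(pred) - 1), ]
  labels <- rownames(pred)[apply(pred, 2, which.max)]
  names(labels) <- colnames(pred)
  object$labels <- factor(labels)
  Idents(object) <- "labels"
  return(object)
}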

seurat_list <- list()
data_list <- list()
meta_list <- list()
spatial_locs_list <- list()
spatial_factors_list <- list()

# Read the annotation tables once instead of once per sample
anno <- read.csv(tangram_path, header = TRUE, row.names = 1, check.names = FALSE)
scNiche_all <- read.csv(scNiche_path, header = TRUE, row.names = 1, check.names = FALSE)

for (sid in sample_ids) {
  cat("Processing:", sid, "\n")

  sample_dir <- file.path(base_dir, sid)

  # Load the Visium sample and normalize it with SCTransform
  seu <- Load10X_Spatial(data.dir = sample_dir,
                         filename = "filtered_feature_bc_matrix.h5",
                         assay = "Spatial",
                         slice = sid)
  seu <- SCTransform(seu, assay = "Spatial", verbose = FALSE)
  # Suffix barcodes with the sample ID so spot names are unique across samples
  seu <- RenameCells(seu, new.names = paste0(colnames(seu), "_", sid))

  # Keep the Tangram predictions for this sample (cell types x spots)
  anno_filtered <- anno[grep(sid, rownames(anno), fixed = TRUE), ]
  anno_matrix <- as(t(as.matrix(anno_filtered)), "dgCMatrix")
  spot_names <- rownames(anno_filtered)

  seu <- subset(seu, cells = spot_names)
  seu[["TangramPred"]] <- CreateAssayObject(counts = anno_matrix)

  seu <- assignLabels(seu, prediction = "TangramPred")
  seurat_list[[sid]] <- seu

  # Normalized expression; prefix with the sample ID to form the final spot names
  sct_data <- GetAssayData(seu, slot = "data", assay = "SCT")
  colnames(sct_data) <- paste0(sid, "_", colnames(sct_data))
  data_list[[sid]] <- sct_data

  # Per-spot metadata: labels, sample ID and scNiche annotation
  meta_tmp <- data.frame(labels = Idents(seu), samples = sid)
  scNiche_tmp <- subset(scNiche_all, sample_id == sid)
  scNiche_tmp$sample_id <- NULL
  # merge() sorts rows by name, so restore the spot order of `seu` afterwards
  # to keep the metadata aligned with the expression columns
  meta <- merge(meta_tmp, scNiche_tmp, by = "row.names", all.x = TRUE)
  rownames(meta) <- meta$Row.names
  meta <- meta[colnames(seu), -1, drop = FALSE]
  meta$scNiche2 <- ifelse(meta$scNiche == "Niche4", "Niche4", "Other Niches")
  meta_list[[sid]] <- meta

  # Spot coordinates (assumed to be returned in the same order as the spots in `seu`)
  coords <- GetTissueCoordinates(seu, scale = NULL, cols = c("imagerow", "imagecol"))
  rownames(coords) <- colnames(sct_data)
  spatial_locs_list[[sid]] <- coords

  # Conversion factor from pixels to micrometers, based on the spot diameter
  json_path <- file.path(sample_dir, "spatial", "scalefactors_json.json")
  sf <- jsonlite::fromJSON(json_path)
  ratio <- spot.size / sf$spot_diameter_fullres
  spatial_factors_list[[sid]] <- data.frame(ratio = ratio, tol = spot.size / 2)
}

genes.common <- Reduce(intersect, lapply(data_list, rownames))
data.input <- do.call(cbind, lapply(data_list, function(x) x[genes.common, ]))

meta <- do.call(rbind, meta_list)
rownames(meta) <- colnames(data.input)
meta$labels <- factor(meta$labels)
meta$samples <- factor(meta$samples, levels = sample_ids)

spatial.locs <- do.call(rbind, spatial_locs_list)
rownames(spatial.locs) <- colnames(data.input)

spatial.factors <- do.call(rbind, spatial_factors_list)
rownames(spatial.factors) <- sample_ids

cellchat <- createCellChat(
  object = data.input,
  meta = meta,
  group.by = "labels",
  datatype = "spatial",
  coordinates = spatial.locs,
  spatial.factors = spatial.factors
)

cat("✅ Built CellChat object successfully\n")

saveRDS(cellchat, file = "Processed Data/cellchat_12samples.rds")
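
The label assignment used above picks, for each spot, the cell type with the highest predicted score. A minimal base-R sketch of that step on a toy matrix is given here; the cell type names, spot names and scores are hypothetical and only serve to illustrate the `which.max` logic.

```r
# Toy prediction matrix: rows are cell types, columns are spots.
# matrix() fills column by column, so each line below is one spot.
pred <- matrix(
  c(0.7, 0.2, 0.1,   # spot1
    0.1, 0.3, 0.6,   # spot2
    0.2, 0.5, 0.3),  # spot3
  nrow = 3,
  dimnames = list(c("Tcell", "Bcell", "Tumor"),
                  c("spot1", "spot2", "spot3"))
)

# For each spot (column), take the row name with the highest score
labels <- rownames(pred)[apply(pred, 2, which.max)]
names(labels) <- colnames(pred)
labels <- factor(labels)
print(labels)
```

Note that `which.max` returns the first maximum, so ties are resolved in favor of the cell type listed first.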

In [None]:
# show the image and annotated spots for the last sample processed in the loop (`seu`)
color.use <- scPalette(nlevels(seu))
names(color.use) <- levels(seu)
Seurat::SpatialDimPlot(seu, label = FALSE, label.size = 3, cols = color.use)

In [None]:
cellchat
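
NB: CellChat matches the rows of `meta` to the columns of the expression matrix, so it is worth confirming that per-spot tables joined by row name keep the original spot order. The base-R sketch below uses two hypothetical tables (`labels_df` and `niche_df`) to show that `merge()` sorts by row name by default, and one possible way to restore the order afterwards.

```r
# Hypothetical per-spot tables; spot names are deliberately not in sorted order
labels_df <- data.frame(labels = c("Tumor", "Tcell", "Bcell"),
                        row.names = c("spotC", "spotA", "spotB"))
niche_df <- data.frame(scNiche = c("Niche1", "Niche4"),
                       row.names = c("spotA", "spotC"))

# merge() on "row.names" adds a Row.names column and sorts by it by default
merged <- merge(labels_df, niche_df, by = "row.names", all.x = TRUE)
rownames(merged) <- merged$Row.names
merged$Row.names <- NULL

# Reindex to the original spot order so rows line up with the expression columns
aligned <- merged[rownames(labels_df), , drop = FALSE]
print(aligned)
```

Spots without a match in the second table (here `spotB`) are kept with `NA` in the joined column.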

#### Set the ligand-receptor interaction database

In [None]:
CellChatDB <- CellChatDB.human # use CellChatDB.human for human data (CellChatDB.mouse for mouse data)

In [None]:
# use a subset of CellChatDB for cell-cell communication analysis
CellChatDB.use <- subsetDB(CellChatDB, search = "Secreted Signaling", key = "annotation") # use Secreted Signaling
# set the used database in the object
cellchat@DB <- CellChatDB.use

#### Preprocessing the expression data for cell-cell communication analysis
To infer the cell state-specific communications, we first identify ligands or receptors that are over-expressed in one cell group, and then identify a ligand-receptor interaction as over-expressed if either its ligand or its receptor is over-expressed.

In [None]:
# subset the expression data of signaling genes for saving computation cost
cellchat <- subsetData(cellchat) # This step is necessary even if using the whole database
future::plan("multisession", workers = 64) # adjust `workers` to the number of cores available on your machine
cellchat <- identifyOverExpressedGenes(cellchat)
cellchat <- identifyOverExpressedInteractions(cellchat)

In [None]:
execution.time = Sys.time() - ptm
print(as.numeric(execution.time, units = "secs"))

### Part II: Inference of cell-cell communication network
#### Compute the communication probability and infer cellular communication network

In [None]:
ptm = Sys.time()
cellchat <- computeCommunProb(cellchat, type = "truncatedMean", trim = 0.1, 
                              distance.use = FALSE, interaction.range = 250, scale.distance = NULL,
                              contact.dependent = TRUE, contact.range = 100)
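
As a rough illustration of what `type = "truncatedMean"` with `trim = 0.1` implies, the sketch below applies base R's `mean(x, trim = 0.1)` to hypothetical expression values. It assumes the per-group average is a trimmed mean of this kind; it is not CellChat's internal code.

```r
# Hypothetical expression of one gene across 10 spots of a cell group:
# the gene is detected in only one spot
x <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 5)

plain_mean   <- mean(x)              # ordinary average
trimmed_mean <- mean(x, trim = 0.1)  # drops the lowest and highest 10% first

print(plain_mean)
print(trimmed_mean)
```

Under this assumption, a gene detected in very few spots of a group contributes an average of zero, which makes the inference more conservative than a plain mean.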

Users can filter out cell-cell communication involving cell groups that contain only a few cells. By default, each cell group must contain at least 10 cells to be considered for cell-cell communication.

In [None]:
cellchat <- filterCommunication(cellchat, min.cells = 10)
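
To see which cell groups would fall below the `min.cells` threshold, the group sizes can be tabulated beforehand. The sketch below uses hypothetical labels and only mimics the counting step; how `filterCommunication` treats such groups internally is not shown here.

```r
# Hypothetical labels for 25 spots
labels <- factor(c(rep("Tumor", 14), rep("Tcell", 8), rep("Bcell", 3)))

min.cells <- 10  # same threshold as passed to filterCommunication above

# Number of spots per group, and the groups below the threshold
group_sizes <- table(labels)
small_groups <- names(group_sizes)[group_sizes < min.cells]

print(group_sizes)
print(small_groups)
```

With real data, `labels` would be replaced by the `labels` column of the metadata.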

#### Infer the cell-cell communication at a signaling pathway level
CellChat computes the communication probability at the signaling pathway level by summarizing the communication probabilities of all ligand-receptor interactions associated with each signaling pathway.

NB: The inferred intercellular communication networks of each ligand-receptor pair and each signaling pathway are stored in the slots ‘net’ and ‘netP’, respectively.

In [None]:
cellchat <- computeCommunProbPathway(cellchat)
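
The pathway-level summary described above can be sketched in base R as a sum over the ligand-receptor pairs that belong to the same pathway. The array, pair names and pathway names below are hypothetical, and summation is assumed as the summarizing operation.

```r
# Toy probability array: sender x receiver x ligand-receptor pair
groups <- c("Tumor", "Tcell")
lr_pairs <- c("LigA_RecA", "LigB_RecA", "LigC_RecC")
prob <- array(
  c(0.0, 0.2, 0.1, 0.0,   # LigA_RecA
    0.0, 0.1, 0.0, 0.0,   # LigB_RecA
    0.3, 0.0, 0.0, 0.4),  # LigC_RecC
  dim = c(2, 2, 3),
  dimnames = list(groups, groups, lr_pairs)
)
pathway <- c("P1", "P1", "P2")  # pathway of each pair, in the same order

# Sum the probabilities of all pairs within each pathway
prob_pathway <- sapply(unique(pathway), function(p) {
  apply(prob[, , pathway == p, drop = FALSE], c(1, 2), sum)
}, simplify = "array")

print(prob_pathway[, , "P1"])
```

The result is again a sender x receiver array, now with one slice per pathway instead of one per ligand-receptor pair.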

#### Calculate the aggregated cell-cell communication network
We can calculate the aggregated cell-cell communication network by counting the number of links or summarizing the communication probability.

In [None]:
cellchat <- aggregateNet(cellchat)

execution.time = Sys.time() - ptm
print(as.numeric(execution.time, units = "secs"))
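
The two aggregation modes mentioned above, counting links and summing probabilities, can be illustrated on a toy array. This is a sketch with hypothetical values, assuming a link is any ligand-receptor pair with a non-zero probability between two groups.

```r
# Toy probability array: sender x receiver x ligand-receptor pair
prob <- array(
  c(0.0, 0.2, 0.1, 0.0,   # pairA
    0.0, 0.1, 0.0, 0.0,   # pairB
    0.3, 0.0, 0.0, 0.4),  # pairC
  dim = c(2, 2, 3),
  dimnames = list(c("Tumor", "Tcell"), c("Tumor", "Tcell"),
                  c("pairA", "pairB", "pairC"))
)

# Number of links: pairs with non-zero probability per sender-receiver pair
count <- apply(prob > 0, c(1, 2), sum)
# Interaction strength: summed probability per sender-receiver pair
weight <- apply(prob, c(1, 2), sum)

print(count)
print(weight)
```

Rows are treated as senders and columns as receivers in this sketch.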

NB: Upon inferring the intercellular communication network from spatial transcriptomics data, CellChat’s various functionalities can be used for further data exploration, analysis, and visualization. Please check other functionalities in the basic tutorial of CellChat and in the tutorial on comparison analysis across different conditions.

### Part III: Save the CellChat object

In [None]:
saveRDS(cellchat, file = "Processed Data/GSE206245_NPC_ST_Cluster_Tangram_Cellchat.rds")

In [None]:
sessionInfo()