Skip to content

4 Spatial RNA: VisiumHD ‐ Mouse Brain 2026

Ryan Mulqueen edited this page Jun 19, 2026 · 1 revision

Visium HD Spatial Gene Expression (10x Genomics)

The Visium HD Spatial Gene Expression platform offers whole-transcriptome spatial profiling at 2um resolution. Below is a concise summary of the key features and benefits.

image

Technology Key Features

image

  • High-Resolution Capture

    • Slides contain two 6.5 x 6.5 mm capture areas.
    • Each area includes millions of contiguous 2 x 2 µm barcoded squares.
    • Enables single-cell–scale spatial resolution.
  • Broad Sample Compatibility

    • Supports human and mouse tissue.
    • Compatible with:
      • Fresh frozen
      • Fixed frozen
      • FFPE sections
      • Archived blocks and pre-sectioned slides
  • CytAssist Workflow Integration

    • Uses the Visium CytAssist instrument.
    • Transfers transcriptomic probes from standard slides to Visium HD slides.
    • Streamlines sample preparation and improves precision.
  • Enhanced Data Analysis

    • Spatial gene expression data at 2 µm resolution.
    • Recommended binning: 8 x 8 µm for analysis.
    • Compatible with:
      • Space Ranger
      • Loupe Browser

Applications

  • Spatially resolved transcriptomics at single-cell resolution.
  • Analyze cell identity, function, and tissue architecture.
  • Ideal for research in cancer, neuroscience, and developmental biology.

Learn more at: 10x Genomics Visium HD

The difference of VisiumHD and Visium

image

Feature Visium HD Visium (Standard)
Spatial Resolution 2 × 2 µm barcoded squares (~10 million features) 55 µm diameter spots (~5,000 features)
Sample Types FFPE, fresh frozen, fixed frozen Fresh frozen (v1), FFPE (v2 with CytAssist)
Data Analysis High-resolution, single-cell level Lower resolution, may need scRNA-seq integration

Choosing the Right Bin Size in Visium HD: Implications from Human Cell Size

image

Human cell sizes vary widely across tissues and cell types—from small red blood cells (~6–8 µm) to large adipocytes or neurons (50–100+ µm). Visium HD provides raw data start at 2 × 2 µm resolution, allowing users to flexibly define bin sizes during analysis.

Why Bin Size Matters

The bin size determines how raw 2 µm pixel data are aggregated spatially. Choosing the appropriate bin size impacts:

  • Signal strength: Larger bins capture more transcripts (UMIs), reducing sparsity.
  • Cell boundary resolution: Smaller bins preserve spatial detail but may increase data dropout.

Smaller bins often have lower total UMI counts (nCount) and fewer detected genes (nFeature), which can pose challenges for clustering, normalization, and downstream statistical power.

Best Practices

  • Use 8 × 8 µm binning for most human tissues to balance resolution and data quality.
  • Use exploratory binning at multiple scales (e.g. 8 µm, 16 µm) to assess resolution vs. noise trade-off.
  • Align bin size with known cell morphology and tissue architecture.
  • Consider downstream filtering thresholds and quality control—small bins may require more stringent preprocessing or imputation.
  • Use segmentation tools like:
    • StarDist – for accurate nuclear segmentation from DAPI or H&E images.
    • bin2cell – to assign spatial bins to cells using segmentation-informed boundaries.

Data Downloads

Before starting this tutorial, please making sure to downloaded/find VisiumHD output.

In this tutorial, we will use one of 10X public datasets, Mouse brain tissue (Formalin fixed paraffin embedded).

More different tissue type data can be found at here

# standard spaceranger output for visiumHD
curl -O https://cf.10xgenomics.com/samples/spatial-exp/3.0.0/Visium_HD_Mouse_Brain/Visium_HD_Mouse_Brain_web_summary.html
curl -O https://cf.10xgenomics.com/samples/spatial-exp/3.0.0/Visium_HD_Mouse_Brain/Visium_HD_Mouse_Brain_cloupe_008um.cloupe
curl -O https://cf.10xgenomics.com/samples/spatial-exp/3.0.0/Visium_HD_Mouse_Brain/Visium_HD_Mouse_Brain_feature_slice.h5
curl -O https://cf.10xgenomics.com/samples/spatial-exp/3.0.0/Visium_HD_Mouse_Brain/Visium_HD_Mouse_Brain_metrics_summary.csv
curl -O https://cf.10xgenomics.com/samples/spatial-exp/3.0.0/Visium_HD_Mouse_Brain/Visium_HD_Mouse_Brain_molecule_info.h5
curl -O https://cf.10xgenomics.com/samples/spatial-exp/3.0.0/Visium_HD_Mouse_Brain/Visium_HD_Mouse_Brain_spatial.tar.gz
curl -O https://cf.10xgenomics.com/samples/spatial-exp/3.0.0/Visium_HD_Mouse_Brain/Visium_HD_Mouse_Brain_binned_outputs.tar.gz
## unzip
tar -xvf Visium_HD_Mouse_Brain_spatial.tar.gz
tar -xvf Visium_HD_Mouse_Brain_binned_outputs.tar.gz

Just a quick overview of the data, check cloupe with Loupe Browser 8.1.2

Check 10X Web_Summary

Always the easiest and first step to understand your sample with XXXX_web_summary.html


Let’s dive in !!

installing R packages

#increase timeout because some of these packages take longer to download
options(timeout=9999999)
#the extra arguments are so that it won't stop to ask you any questions
remotes::install_github("mojaveazure/seurat-disk", upgrade = "always")
remotes::install_github("cellgeni/schard", upgrade = "always")
remotes::install_github("huayc09/SeuratExtend", upgrade = "always")
remotes::install_github("dmcable/spacexr", upgrade = "always")
remotes::install_github("RubD/Giotto", upgrade = "always") 
remotes::install_github("jinworks/CellChat", upgrade = "always")
install.packages("arrow", repos='http://cran.us.r-project.org')
remotes::install_github("immunogenomics/presto", upgrade = "always")

loading R packages

Click to expand R pacakges list
{
  library(matrixStats)  ## remotes::install_version("matrixStats", version="1.1.0") ## packageVersion('matrixStats')
  library(scCustomize)
  
  library(Seurat) # packageVersion('Seurat'), 5.2.0
  library(SeuratDisk)

  library(fields) 
  library(grid)
  library(ggplot2)
  library(schard)
  library(cowplot)
  library(paletteer)
  library(SeuratExtend)
  library(reshape2)
  library(patchwork) 
  library(plotly)
  library(RColorBrewer)
  library(arrow)
  library(magick) 
  library(ggpubr)
  library(dplyr)
  
  library(rvest)
  
  library(ggrepel) 
  
  ## visiumHD annotation
  library(spacexr)
  library(Rfast)
  
  ## copykat
  library(copykat)
  
  ##spatial
  library(deldir)
  library(Rfast2)
  library(deldir)
  library(igraph)
  library(Giotto)

  set.seed(1234) ## for Reproducible
}

Note, this tutorial was prepared using Seurat 5.1 For this workshop all necessary packages with correct versions are already installed. If you are working on your local computers, you will need to install the packages yourself.

Functions in R

Here are some pre-defined functions that will be used in this tutorial, which help to organize, reuse and simplify your code (Navin Lab).

Click to expand R function code
Filter_cluster <- function(Object_8um){
## Extract metadata with cluster info
before <- Object_8um@meta.data %>%
  count(cluster = default_10X_cluster, name = "n_before")

after <- Object_8um_filtered@meta.data %>%
  count(cluster = default_10X_cluster, name = "n_after")

## Join and calculate filtered-out percentage
filtered_stats_8bin <- left_join(before, after, by = "cluster") %>%
  mutate(
    n_after = ifelse(is.na(n_after), 0, n_after),
    n_filtered = n_before - n_after,
    percent_filtered = round(100 * n_filtered / n_before, 1),
    percent_Keep = 100-percent_filtered
  )

if( any(is.na(filtered_stats_8bin$cluster)) ){
filtered_stats_8bin <- filtered_stats_8bin[-which(is.na(filtered_stats_8bin$cluster)),]
}
 
p2_8u_BF <- ggplot(filtered_stats_8bin, aes(x = factor(cluster), y = percent_Keep)) +
  geom_bar(stat = "identity", fill = "#3182bd", width = 0.7) +
  geom_text(
    aes(label = n_after),
    vjust = -0.6,
    size = 4.5,
    fontface = "bold"
  ) +
  labs(
    title = "Qualified bins per Cluster",
    x = "Cluster",
    y = "Percentage of bins Pass QC"
  ) +
  theme_minimal(base_size = 15) +
  theme(
    plot.title = element_text( size = 16, hjust = 0.5),
    axis.text = element_text(size = 12),
    panel.grid.major.x = element_blank(),
    panel.grid.minor = element_blank()
  )

return(p2_8u_BF)
}

## Custom color Palette for the workshop
Color_palette_VisiumHD <- c("#ffe5cd","#F70373","#822E1CFF","#AA0DFEFF","#1CBE4FFF",
             "#ff89cc","#1CFFCEFF","#ed968c","#f35d36","#bbb1ff","#D3FC5E",
             "#7c4b73","#CD69C9","#31c7ba","#5f5c0b","#4d1da9","#f4d451","#d0ffb7","#239f6e","#1b80ad","#2F4F4F",
             "#acb1b4","#080000","#2ab2e5","#b97e45","#f03c2b","#daa9d9","#63ff55","#ebc2b9",
             "#fae70f","#c9ceda","#564c6c","#4539dd","#dd0cc5","#c6662f","#105c13","#dd7d6d","#b1d8ff","#FEAF16FF",
             "#ffd000","#6596cd","#b90303","#aabf88","#534e46","#974949","#828282","#bd8399","#5373a7")

Data Preprocessing (Very similar to single cell RNAseq data or is it?!?)

data_path = "/data/shared/VisiumHD/"

Object <- Load10X_Spatial(data.dir = "/data/shared/VisiumHD/", bin.size = c(8, 16))

Object

names(Object@meta.data)

#Taking default clusters from CellRanger pipeline and adding them into the rds object for 8um
projection_8um <- read.csv(paste0(data_path,"/binned_outputs/square_008um/analysis/clustering/gene_expression_graphclust/clusters.csv",sep=""), row.names = 1,stringsAsFactors=F,check.names = FALSE)
cluster_8um <- read.csv(paste0(data_path,"/binned_outputs/square_008um/analysis/umap/gene_expression_2_components/projection.csv"), row.names = 1,stringsAsFactors=F,check.names = FALSE)
projection_cluster_8um <- cbind(projection_8um,cluster_8um)
#head(projection_cluster_8um)

Object@reductions$umap.8um <- CreateDimReducObject(embeddings = as.matrix(projection_cluster_8um[, c("UMAP-1", "UMAP-2")]),
                                             key = "UMAP-", assay = DefaultAssay(Object))
Object$default_10X_cluster.8um <- projection_cluster_8um$Cluster[match(colnames(Object), rownames(projection_cluster_8um))]

#Change default assay to 16um!
DefaultAssay(Object) <- "Spatial.016um"

#Taking default clusters from CellRanger pipeline and adding them into the rds object for 16um
projection_16um <- read.csv(paste0(data_path,"/binned_outputs/square_016um/analysis/clustering/gene_expression_graphclust/clusters.csv",sep=""), row.names = 1,stringsAsFactors=F,check.names = FALSE)
cluster_16um <- read.csv(paste0(data_path,"/binned_outputs/square_016um/analysis/umap/gene_expression_2_components/projection.csv"), row.names = 1,stringsAsFactors=F,check.names = FALSE)
projection_cluster_16um <- cbind(projection_16um,cluster_16um)

Object@reductions$umap.16um <- CreateDimReducObject(embeddings = as.matrix(projection_cluster_16um[, c("UMAP-1", "UMAP-2")]),key = "UMAP-", assay = DefaultAssay(Object))
Object$default_10X_cluster.16um <- projection_cluster_16um$Cluster[match(colnames(Object),rownames(projection_cluster_16um))]

#Plotting both 
p1<-SpatialDimPlot(Object, 
               label = F, repel = F,  
               pt.size.factor = 1.5, 
               image.alpha = 0.6,
               stroke = 0.01,
               group.by = c("default_10X_cluster.8um"))

p2<-SpatialDimPlot(Object, 
               label = F, repel = F,  
               pt.size.factor = 1.5, 
               image.alpha = 0.6,
               stroke = 0.01,
               group.by = c("default_10X_cluster.16um"))

plot_grid(p1,p2,ncol=2)
Spatialdimplot2bothresolutions

Let's dive into the analysis using 8um spots for this workshop

For the workshop we will use 8um spots

## split into 8 um and load 10X umap and cluster
DefaultAssay(Object) <- "Spatial.008um"
Object_8um <- DietSeurat(Object,
                       assays = "Spatial.008um")

projection_8um <- read.csv(paste0(data_path,"/square_008um/analysis/clustering/gene_expression_graphclust/clusters.csv",sep=""), row.names = 1,stringsAsFactors=F,check.names = FALSE)
cluster_8um <- read.csv(paste0(data_path,"/square_008um/analysis/umap/gene_expression_2_components/projection.csv"), row.names = 1,stringsAsFactors=F,check.names = FALSE)
projection_cluster_8um <- cbind(projection_8um,cluster_8um)
#head(projection_cluster_8um)

Object_8um@reductions$umap <- CreateDimReducObject(embeddings = as.matrix(projection_cluster_8um[, c("UMAP-1", "UMAP-2")]),
                                             key = "UMAP-", assay = DefaultAssay(Object_8um))
Object_8um$default_10X_cluster <- projection_cluster_8um$Cluster[match(colnames(Object_8um), rownames(projection_cluster_8um))]

p1_8u <- DimPlot(Object_8um, reduction = "umap", 
        group.by = c("default_10X_cluster"),
        ncol = 1,label=T,raster = T)

p2_8u <-SpatialDimPlot(Object_8um, 
               label = F, repel = F,  
               pt.size.factor = 1.5, 
               image.alpha = 0.6,
               stroke = 0.01,
               group.by = c("default_10X_cluster"))


plot_grid(p1_8u,p2_8u,ncol=1)

rm(Object)
plot1

QC metrics Visualization in 8 um Bin

Object_8um[["percent.mt"]] <- PercentageFeatureSet(object = Object_8um, pattern = "^mt-")


bin8_bin_ncount <- ggplot(Object_8um@meta.data, aes(x = nCount_Spatial.008um)) +
  xlim(0,500) + 
  #ylim(0,100000) +
  geom_histogram(binwidth = 10, fill = "steelblue", color = "black") +
  labs(x = "nCount_Spatial.008um",
       y = "Spot Count") + theme_minimal()

bin8_bin_nFeature <- ggplot(Object_8um@meta.data, aes(x = nFeature_Spatial.008um)) +
  xlim(0,500) + 
  #ylim(0,100000) +
  geom_histogram(binwidth = 10, fill = "steelblue", color = "black") +
  labs(x = "nFeature_Spatial.008um",
       y = "Spot Count") + theme_minimal()

Bin8_cellViability4 <-ggplot(Object_8um@meta.data,aes(x=percent.mt))+stat_ecdf(aes(colour=orig.ident))+ scale_x_continuous(name = "percent.Mitochondrial",breaks=seq(0,100,10),limits=c(0,100))+theme_bw()+ylab("The percentage of bins")+ theme(legend.position="none")

Bin8_QC <-plot_grid(bin8_bin_nFeature,bin8_bin_ncount,Bin8_cellViability4,ncol=3)

Bin8_Feature <- FeaturePlot(Object_8um,features = c("nFeature_Spatial.008um","nCount_Spatial.008um")) + theme(legend.position = "right")
Bin8_S_Feature <- SpatialFeaturePlot(Object_8um, features = c("nFeature_Spatial.008um","nCount_Spatial.008um")) + theme(legend.position = "top")

plot_grid(Bin8_QC,Bin8_Feature,Bin8_S_Feature,ncol=1)
QCplot2_NEWwithSpotcount

Filtering criterion for 8um Bin

  1. Bins detected in fewer than 50 spots were excluded
  2. Bins with fewer than 50 total detected genes were excluded
Object_8um_filtered <- subset(
  Object_8um,
  subset = nFeature_Spatial.008um > 50 &
           nCount_Spatial.008um > 50) #& percent.mt < 20)

## subset for 10k cell
sampled_cells <- sample(Cells(Object_8um_filtered), size = 10000, replace = FALSE)
Object_8um_filteredQ <- subset(Object_8um_filtered, cells = sampled_cells)

#rm(Object_8um, Object_8um_filtered)

## standard seurat pre-processing like scRNA
{
Object_8um_filteredQ <- NormalizeData(Object_8um_filteredQ)
Object_8um_filteredQ <- FindVariableFeatures(Object_8um_filteredQ,nfeatures = 2000)
Object_8um_filteredQ <- ScaleData(Object_8um_filteredQ)
#Object_8um_filteredQ <- SCTransform(Object_8um_filteredQ, assay = "Spatial.008um", verbose = FALSE)

Object_8um_filteredQ <- RunPCA(Object_8um_filteredQ, verbose = FALSE)
Object_8um_filteredQ <- RunUMAP(Object_8um_filteredQ, dims = 1:30, verbose = FALSE)
Object_8um_filteredQ <- FindNeighbors(Object_8um_filteredQ, dims = 1:30, verbose = FALSE)
Object_8um_filteredQ <- FindClusters(Object_8um_filteredQ, verbose = FALSE,resolution = 0.3)
}

bin8_F_d <- DimPlot(Object_8um_filteredQ, label = TRUE)

bin8_F_sd <-SpatialDimPlot(Object_8um_filteredQ, 
               label = F, repel = F,  
               pt.size.factor = 10, 
              image.alpha = 0.5,
              stroke = 0.1)

Bin8_filter_VP <- VlnPlot(Object_8um_filteredQ, features = c("nCount_Spatial.008um"), pt.size = 0, log = TRUE)

plot_grid(plot_grid(bin8_F_d,bin8_F_sd,ncol = 2), 
          Bin8_filter_VP, 
          rel_heights=c(0.3,0.2),ncol=1)
plot4

Check clustering with increased resolution

Object_8um_filteredQ <- FindClusters(Object_8um_filteredQ, verbose = FALSE,resolution = 0.8)

bin8_F_d <- DimPlot(Object_8um_filteredQ, label = TRUE)

bin8_F_sd <-SpatialDimPlot(Object_8um_filteredQ, 
               label = F, repel = F,  
               pt.size.factor = 10, 
              image.alpha = 0.5,
              stroke = 0.1)

Bin8_filter_VP <- VlnPlot(Object_8um_filteredQ, features = c("nCount_Spatial.008um"), pt.size = 0, log = TRUE)

plot_grid(plot_grid(bin8_F_d,bin8_F_sd,ncol = 2), 
          Bin8_filter_VP, 
          rel_heights=c(0.3,0.2),ncol=1)
plot4_newresolution
## cluster markers
bin8_markers <- FindAllMarkers(Object_8um_filteredQ, only.pos = TRUE)
bin8_markers %>%
  group_by(cluster) %>%
  dplyr::filter(avg_log2FC > 1) %>%
  slice_head(n = 10) %>%
  ungroup() -> bin8_top10

bin8_F_G <- DoHeatmap(Object_8um_filteredQ, features = bin8_top10$gene, size = 2.5) + theme(axis.text = element_text(size = 8)) + NoLegend()
bin8_F_G
Top10heatmap_NEWresolution

Due to the probe-based chemistry (similar to FLEX) used in Visium HD, and the platform’s inherent characteristics (such as targeted transcript capture, limited probe design coverage, and reduced sensitivity to low-abundance transcripts), not all canonical scRNA-seq markers are reliably captured in Visium HD data.

## canonical scRNA markers for some mouse brain cell types
Major_CellType_Markers  <- list("Neural_genes" = c("Cadm2","Pcdh9","Syt1","Dlg2","Dlgap1","Negr1", "Fgf14"),
# Oligodendrocyte
"Oligo_genes" = c("Mbp","Plp1","Cnp","Mag","Mobp", "Olig1", "Mal"),
# Astrocytes
"Astro_genes" = c("Slc39a12","Ntsr2","Fgfr3", "Bcan", "Aqp4", "Gfap"),
# Microglial
"Microglial_genes" = c("Hexb","C1qa","C1qb", "C1qc", "Csf1r", "Ctss"),
# Purkinje
"Purkinje_genes" = c("Auts2","Foxp2","Ebf1","Grid2", "Dner", "Khdrsb2"),
# Hypendymal
"Hypendymal_genes" = c("Rmst","Epha7","Zfhx4", "Efnb3", "Zic4", "Wls"),
#Hippocamal neuron
"HIPneuron_genes" = c("Ncdn","Ppfia2","Rfx3", "Ahcyl2", "Epha7"),#Cortical Interneuron
"CTXInterneuron_genes" = c("Galntl6","Gad2","Nap1l5", "Grip1", "Tspyl4", "Cplx1")
)


gene_df <- bind_rows(lapply(names(Major_CellType_Markers), function(x) {
tibble(Feature = Major_CellType_Markers[[x]], Group = x)
}))
all_markers <- gene_df$Feature

# Generate plot object and retrieve ggplot data
dot_plot <- DotPlot(object = Object_8um_filteredQ, features = unique(all_markers)) +
RotatedAxis()
# Extract the calculated data from the Seurat plot object
plot_data <- dot_plot$data
final_plot_data <- plot_data %>%
left_join(gene_df, by = c("features.plot" = "Feature"))

#Final plot
ggplot(final_plot_data, aes(x = id, y = features.plot)) +
geom_point(aes(size = pct.exp, color = avg.exp.scaled)) +
scale_size(range = c(0, 6)) +
facet_wrap(~ Group, scales = "free", ncol = 4) + # Stack vertically
theme_minimal() +
theme(
axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1),
strip.text = element_text(face = "bold", size = 12)
) +
labs(x = "Cell Types / Clusters", y = "Marker Genes", color = "Average Expression", size = "% Expressing")
DotPlot_MouseBrain

We can infer some clusters that are unique to certain cell types:

  1. Cluster 5 is oligodendrocytes
  2. Cluster 9 is microglial
  3. Cluster 13 is hippocampus neurons
  4. Broad neural genes are in every cluster

#Other neuronal markers to explore yourself!

# Granular cell types - Neurotransmitters
#GABAergic neurons
GABAergic_genes <- c("Slc32a1","Gabra1","Gad2","Galntl6", "Grip1", "Gbx1")
#Glutamergic neurons
Glutamate_genes <- c("Slc17a6", "Slc17a7", "Slc17a8", "Arpp21","Car10","Sv2b","Camk4", "Pcsk2", "Brinp1")
#Cholinergic neurons
Choline_genes <- c("Chat","Slc18a3","Tbx20","Chrnb3","Nppb","Ush1c", "Npy4r")
#Dopaminergic neurons
Dopa_genes <- c("Slc6a3", "Chrna6", "Gucy2c", "Aldh1a7", "Tdh")
#Serotonin neurons
Sero_genes <- c("Slc6a4","Tph2") #not in this section
#Noradrenergic neurons 
Noradren_genes <- c("Slc6a2", "Dbh") #not in this section

Extra ways to plot your data for both clusters and gene features

## Highlighting specific clusters

cells <- CellsByIdentities(Object_8um_filteredQ, idents = c(5, 9, 13))
p <- SpatialDimPlot(Object_8um_filteredQ,
                    cells.highlight = cells[setdiff(names(cells), "NA")],
                    cols.highlight = c("#FFFF00", "grey50"), facet.highlight = T, combine = T,pt.size.factor = 10
) + NoLegend()
p

##Highlighting specific genes
p1 <- SpatialFeaturePlot(Object_8um_filteredQ, features = "Rorb",pt.size.factor = 10) + ggtitle("Rorb expression")

p2 <- SpatialFeaturePlot(Object_8um_filteredQ, features = "Hpca",pt.size.factor = 10) + ggtitle("Hpca expression")

p1 | p2
Spatialdimplot_mousebrain Featurespatialplot_Mousebrain

Spot Decomposition with RCTD (Don't RUN!!!)

Robust Cell Type Decomposition (RCTD) is a probabilistic method used to infer cell type composition in spatial transcriptomics data by mapping expression profiles to a single-cell RNA-seq reference. Each spatial bin is classified based on how confidently the model matches the expression to one or more cell types.

For the current mouse brain tissue VisiumHD data, we will utilize a single-cell reference dataset sourced from Allen brain map consortium (https://community.brain-map.org/c/how-to/abc-atlas/19/l/top)

## load the RCTD reference made from Allen mouse brain scRNA data (for the workshop it is a downsampled version!)
reference_path <- paste("THE_PATH_TO_THE_REFERENCEscRNAseqdataset_rds") 
reference <- readRDS(reference_path) #Read the mouse brain dataset - allen_scRNAseq_ref.rds

#Examine the reference dataset
reference
names(reference@meta.data)

### Create the RCTD reference object (for mouse scRNAseq data)
Idents(reference) <- "subclass_label"
counts <- reference[["RNA"]]$counts
cluster <- as.factor(reference$subclass_label)
nUMI <- reference$nCount_RNA
levels(cluster) <- gsub("/", "-", levels(cluster))
cluster <- droplevels(cluster)
reference <- Reference(counts, cluster, nUMI)

#Examine what is reference now
str(reference)

## prepare the query 
counts_hd <- GetAssayData(Object_8um_filteredQ, layer = "counts")
cortex_cells_hd <- colnames(Object_8um_filteredQ)
coords <- GetTissueCoordinates(Object_8um_filteredQ)[cortex_cells_hd, 1:2]
query <- SpatialRNA(coords, counts_hd, colSums(counts_hd))
  
## run RCTD (it take around 4 hours to run on my server)
Bin8_RCTD <- create.RCTD(query, reference, max_cores = 1, UMI_min = 10) ## the number of core need to set as 1 on core server, but it take all cores anyway
Bin8_RCTD <- run.RCTD(Bin8_RCTD, doublet_mode = "doublet")

## RCTD Result
Object_8um_filteredQ <- AddMetaData(Object_8um_filteredQ, metadata = Bin8_RCTD@results$results_df)
Object_8um_filteredQ$RCTD_CellType <- as.character(Object_8um_filteredQ$first_type)
Object_8um_filteredQ$RCTD_CellType[is.na(Object_8um_filteredQ$RCTD_CellType)] <- "Unknown"

## Skip waiting time by load pre-run rds object
Object_8um_filteredQ <- readRDS(paste(wd, SampleName,"_RCTDNew.rds",sep=""))

Let's plot the RCTD results!

## cell type
Idents(Object_8um_filteredQ) <- "RCTD_CellType"

## fix the cell type order
celltypes_8 <- as.character(Idents(Object_8um_filteredQ))
current_levels_8 <- unique(sort(celltypes_8))
unknown_levels_8 <- sort(current_levels_8[grepl("(?i)^unknown$", current_levels_8, perl = TRUE)])
other_levels_8 <- setdiff(current_levels_8, unknown_levels_8)
new_levels_8 <- c(other_levels_8, unknown_levels_8)
Object_8um_filteredQ$RCTD_CellType <- factor(celltypes_8, levels = new_levels_8)

Bin8_RCTD_celltype_VP <- VlnPlot(Object_8um_filteredQ, features = c("nCount_Spatial.008um"), group.by = "RCTD_CellType",pt.size = 0, log = TRUE, cols = Color_palette_VisiumHD) + NoLegend()

Bin8_RCTD_dimplot_celltype <- DimPlot(Object_8um_filteredQ, reduction = "umap",pt.size = 1 , group.by = c("RCTD_CellType"), cols = Color_palette_VisiumHD, ncol = 1) + ggtitle("UMAP of RCTD clustering")

## cluster markers
bin8_celltye_markers <- FindAllMarkers(Object_8um_filteredQ, only.pos = TRUE)
bin8_celltye_markers %>%
  group_by(cluster) %>%
  dplyr::filter(avg_log2FC > 1) %>%
  slice_head(n = 5) %>%
  ungroup() -> bin8_ct_top5

bin8_CT_Heat <- DoHeatmap(Object_8um_filteredQ, features = bin8_ct_top5$gene, size = 2.5) + theme(axis.text = element_text(size = 8)) + NoLegend()

Bin8_RCTD_CellType_DP <- SpatialDimPlot(Object_8um_filteredQ, label = F, repel = F,  
                                        group.by = "RCTD_CellType",
                                        pt.size.factor = 10, 
                                        image.alpha = 0.5,
                                        stroke = 0.1)  + 
                                        scale_fill_manual(values = Color_palette_VisiumHD)

Bin8_RCTD_CellType_ALL <- plot_grid(Bin8_RCTD_celltype_VP,
                                    bin8_CT_Heat,
                                    Bin8_RCTD_dimplot_celltype,
                                    Bin8_RCTD_CellType_DP, 
                                    rel_heights = c(0.2,0.2,0.3,0.3,0.3), ncol=1)
Bin8_RCTD_CellType_ALL
RCTD_nCount_Violinplot RCTD_Heatmap2 RCTD_UMAPplot RCTD_Celltype RCTD_Spatialdimplot_Labels

Identification of Spatially Variable Features (Moran's I)

Moran’s I was used to identify genes with similar expression levels in neighboring spots across Visium HD data. This high-resolution spatial analysis highlights genes influenced by local tissue architecture and the tumor microenvironment, revealing region-specific biological activity within the tissue.

## the FindSpatiallyVariableFeatures only working with SCT assay for now
#Object_8um_filteredQ <- SCTransform(Object_8um_filteredQ, assay = "Spatial.008um", verbose = FALSE)

Object_8um_filteredQ <- FindSpatiallyVariableFeatures(Object_8um_filteredQ, 
                                                      assay = "SCT",
                                                      features = VariableFeatures(Object_8um_filteredQ)[1:500],
                                                      selection.method = "moransi")

SpatialFeaturePlot(Object_8um_filteredQ, 
                   features = head(SpatiallyVariableFeatures(Object_8um_filteredQ, method = "moransi"), 12),
                   ncol = 4,  max.cutoff = "q95", pt.size.factor = 12)
Screenshot 2026-05-20 at 3 06 58 PM

Spatial Domain Analysis

BANKSY BANKSY is a method for clustering spatial omics data by augmenting the features of each cell with both an average of the features of its spatial neighbors along with neighborhood feature gradients. BANKSY is applicable to a wide array of spatial technologies (e.g. 10x Visium, Slide-seq, MERFISH, CosMX, CODEX) and scales well to large datasets.

For more details, please refer to the paper - https://www.nature.com/articles/s41588-024-01664-3

the BANKSY may take long time to run, we can skip this section by reading the processed object

##Load the Libraries
library(Banksy)
library(SummarizedExperiment)
library(SpatialExperiment)
library(scuttle)

library(scater)
library(cowplot)
library(ggplot2)

## Read in the Seurat object as a Banksy Spatial Experiment object
gcm1 <- Object_8um_filteredQ@assays$Spatial.008um$counts
coords <- GetTissueCoordinates(Object_8um_filteredQ)
coords<-coords[,-3]
Banksyobject <- SpatialExperiment(assay = list(counts = gcm1), spatialCoords = as.matrix(coords))

#Examine the Banksy object

## Do basic normalization necessary for running Banksy
Banksyobject <- computeLibraryFactors(Banksyobject)
aname <- "normcounts"
assay(Banksyobject, aname) <- normalizeCounts(Banksyobject, log = FALSE)

## Run Banksy algorithm
lambda <- c(0.2)
k_geom <- c(15, 30)
Banksyobject <- Banksy::computeBanksy(Banksyobject, assay_name = aname, compute_agf = TRUE, k_geom = k_geom)
set.seed(1000)
Banksyobject <- Banksy::runBanksyPCA(Banksyobject, use_agf = TRUE, lambda = lambda)
Banksyobject <- Banksy::runBanksyUMAP(Banksyobject, use_agf = TRUE, lambda = lambda)
Banksyobject <- Banksy::clusterBanksy(Banksyobject, use_agf = TRUE, lambda = lambda, resolution = 1.2)
#Banksyobject <- Banksy::connectClusters(Banksyobject) [Not needed when working with just one lambda value]

## Put Banksy spatial domain information back into the Seurat Object for plotting and DE analysis
Object_8um_filteredQ$Banksyclusters<-Banksyobject$clust_M1_lam0.2_k50_res1.2

Let's plot BANKSY spatial niche results

##Read in the Banksy Object
Object_8um_filteredQ <- readRDS("/data/shared/VisumHD/Mousebrain_RCTD_NEWwithBANKSY.rds")

p1<-SpatialDimPlot(Object_8um_filteredQ, label = F, repel = F,
                   group.by = c("CellType"),pt.size.factor = 10,
                   image.alpha = 0.5,stroke = 0.1) +
  scale_fill_manual(values = Color_palette_VisiumHD) +
  theme(plot.margin = margin(0.2, 0.2, 0.2, 0.2), aspect.ratio = 1)

p2<-SpatialDimPlot(Object_8um_filteredQ, label = F, repel = F,
                   group.by = c("Banksyclusters"),pt.size.factor = 10,
                   image.alpha = 0.5,stroke = 0.1) +
  scale_fill_manual(values = Color_palette_VisiumHD) +
  theme(plot.margin = margin(0.2, 0.2, 0.2, 0.2), aspect.ratio = 1)

plot_grid(p1,p2)

## Calculate how many different cell types are present in each spatial domain

library(data.table)
df_m<-dcast(data=data.table(Object_8um_filteredQ@meta.data),
Banksyclusters ~ RCTD_CellType,
fun.aggregate = length,
value.var = "RCTD_CellType")

df_m_melt = melt(df_m, id.vars = "Banksyclusters")

celltype_df_prop <- df_m_melt %>%
group_by(Banksyclusters) %>%
mutate(freq = value / sum(value)*10)

Cell_type_Bar <- ggplot(celltype_df_prop, aes(x = Banksyclusters, y = freq, fill = variable)) +
geom_bar(stat = "identity", position = "fill") +
theme_minimal() +
labs(x = "Banksy_clusters", y = "Proportion", fill = "variable") +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
scale_fill_manual(values = Color_palette_VisiumHD)
Cell_type_Bar
RCTD_Celltype Banksy_clusters Banksy_RCTDCelltype_Barplot

GSEA Analysis

One of the most important analysis we can use spot-level sequencing based data for is to perform gene set enrichment analysis and/or gene signature scores analysis. For this workshop we will utilize the msigdbr database curated cell type gene signatures category: C8 (https://www.gsea-msigdb.org/gsea/msigdb/human/collection_details.jsp#C8)

##Load the Libraries
library(presto)
library(msigdbr)
library(fgsea)

#First we need to add an RNA assay slot into the object
Object_8um_filteredQ[["RNA"]] <- Object_8um_filteredQ[["Spatial.008um"]]
DefaultAssay(Object_8um_filteredQ)<-"RNA"

#Convert Seurat to SinglecellExperiment
Object_8um_filtered_sce<-as.SingleCellExperiment(Object_8um_filteredQ)
#We calculate DE genes to calculate weights for gene set enrichments
Mousebrain.genes <- wilcoxauc(Object_8um_filtered_sce, 'RCTD_CellType')

#Now we can subset to our particular cell of interest e.g. let's use Car3 cells that are a major subclass of glutamatergic excitatory, intratelencephalic (IT) projection neurons associated with sensory perceptions and behavioral functions
Car3.genes<- Mousebrain.genes %>%
dplyr::filter(group == "Car3") %>%
arrange(dplyr::desc(auc)) %>%
dplyr::select(feature, auc)
ranks<- deframe(Car3.genes)

#Specify species and category of curated gene signature (You can also use 'H' for Hallmark gene sets)
m_df<- msigdbr(species = "Mus musculus", category = "C8")
fgsea_sets<- m_df %>% split(x = .$gene_symbol, f = .$gs_name)
gene_data <- tibble::enframe(ranks, name = "gene", value = "stat")
head(gene_data)

gene_data_unique <- gene_data %>%
dplyr::group_by(gene) %>%
dplyr::summarize(stat = mean(stat, na.rm = TRUE))
dim(gene_data)
dim(gene_data_unique)
ranked_genes_unique <- deframe(gene_data_unique)
ranked_genes_unique <- sort(ranked_genes_unique, decreasing = TRUE)

#Calculate GSEA
fgseaRes<- fgsea(fgsea_sets, stats = ranked_genes_unique, nperm = 1000)
fgseaResTidy <- fgseaRes %>%
as_tibble() %>%
arrange(dplyr::desc(NES))
fgseaResTidy %>%
dplyr::select(-leadingEdge, -ES, -nMoreExtreme) %>%
arrange(padj) %>%
head()

#Plot the GSEA pathways most enriched with pvalues (unadjusted & adjusted)
plot_data <- fgseaResTidy %>%
mutate(sign = ifelse(NES > 0, "Activated", "Suppressed"),
Count = sapply(leadingEdge, length)) %>%
filter(pval < 0.05) %>%
head(20) # Top 20

p5<-ggplot(plot_data, aes(x = NES, y = reorder(pathway, NES), size = Count, color = pval)) +
geom_point() +
scale_color_continuous(low="red", high="blue") +
theme_minimal() +
labs(title = "GSEA Dotplot of Car3 glutamatergic neurons", y = "C8 Cell Type Signatures")

plot_data <- fgseaResTidy %>%
mutate(sign = ifelse(NES > 0, "Activated", "Suppressed"),
Count = sapply(leadingEdge, length)) %>%
filter(padj < 0.05) %>%
head(20) # Top 20

p6<-ggplot(plot_data, aes(x = NES, y = reorder(pathway, NES), size = Count, color = padj)) +
geom_point() +
scale_color_continuous(low="red", high="blue") +
theme_minimal() +
labs(title = "GSEA Dotplot of Car3 glutamatergic neurons", y = "C8 Cell Type Signatures")

p5 | p6
GSEA_Car3_pval GSEA_Car3_padj

Try running the same analysis for Oligodendrocytes and generate the plot below! C8celltype_signature_Oligo

Delaunay-Based Network

Giotto is an R package for analyzing and visualizing spatial transcriptomics data. In our study, we use it to build spatial networks and perform neighborhood enrichment analysis to explore cell–cell interactions in the tumor microenvironment.

A Delaunay-based network was used to connect each spot in the Visium HD data to nearby neighbors by forming triangles without overlaps. This creates a balanced and accurate map of local spatial relationships, even when spot positions are uneven. It helps capture the tissue structure and is useful for analyzing how cells interact within their neighborhood.

## convert seurat object for Giotto
{
find("seuratToGiottoV5")

# Copy the installed function
seuratToGiottoV5_fixed <- seuratToGiottoV5

# View outdated calls
grep(
  pattern = "slot\\s*=",
  x = deparse(body(seuratToGiottoV5_fixed)),
  value = TRUE
)

# Replace slot = with layer =
fn_txt <- paste(deparse(body(seuratToGiottoV5_fixed)), collapse = "\n")
fn_txt <- gsub("slot\\s*=", "layer =", fn_txt)

body(seuratToGiottoV5_fixed) <- parse(text = fn_txt)[[1]]
environment(seuratToGiottoV5_fixed) <- environment(seuratToGiottoV5)

}

## for the Seurat V4
# Object_8um_filteredQG<-seuratToGiottoV5(sobject =Object_8um_filteredQ, spatial_assay ="Spatial.008um")

## for the Seurat V5
Object_8um_filteredQG<-seuratToGiottoV5_fixed(sobject =Object_8um_filteredQ, spatial_assay ="Spatial.008um")

## Create Delaunay Network
Object_8um_filteredQG <- createSpatialDelaunayNetwork(Object_8um_filteredQG)

spatPlot2D(
  gobject = Object_8um_filteredQG,
  show_network = TRUE,
  point_size = 1.2,
  network_color = "grey",
  save_plot = TRUE,
 cell_color ="CellType"
)
0-spatPlot2D

Neighborhood Enrichment Analysis

Neighborhood Enrichment Analysis using a Delaunay-based network was used to study how different cell types are positioned near each other in Visium HD data. By comparing real cell–cell interactions to random patterns, we identified which cell types tend to cluster together or stay apart. These spatial patterns help reveal unique communicating patterns of the neurons and supportive cells in the highly organized brain structure. In tumors, this can help identify tumor cell proximities and cell communication in the tumor microenvironment e.g. with tumor infiltrating immune cells.

Object_8um_filteredQG_Enrich <- cellProximityEnrichment(
  gobject = Object_8um_filteredQG,
  cluster_column = "RCTD_CellType",           # column in metadata
  spatial_network_name = "Delaunay_network",  # default name
  adjust_method = "fdr"                   # multiple test correction
)

## heatmap from cell-cell proximity scores
cellProximityHeatmap( Object_8um_filteredQG, 
                      Object_8um_filteredQG_Enrich, 
                      save_plot =TRUE)
1-cellProximityHeatmap

copykat Infering CNV

Copykat is the Navin lab tool that infers genome-wide copy number variations (CNVs) from single-cell RNA-seq data, distinguishing aneuploid tumor cells from diploid normal cells. Please refer to our published paper Delineating copy number and clonal substructure in human tumors from single cell transcriptomes

For this analysis, we will use a human breast cancer Fresh Frozen VisiumHD dataset from 10X

## run copykat
BC_16um_filteredQ <- readRDS("/data/shared/VisumHD/BreastCancer_FF_10X_16um_Filtered_Subset10k.rds")
head(BC_16um_filteredQ@meta.data)

BC_16um_filteredQ <- copykat(rawmat=BC_16um_filteredQ@assays$Spatial.016um$counts,
        ngene.chr=5,
        win.size=25,
        KS.cut=0.2, ##  segmentation parameter 0.05-0.15, Increasing KS.cut decreases sensitivity, i.e. less segments/breakpoints
        min.gene.per.cell = 50, ## min gene
        sam.name="BreastCancer_FF", distance="euclidean", n.cores=50)

knitr::include_graphics("BreastCancer_FF_copykat_heatmap.jpeg")

image

Cell Ploidy Mapping

ck <- read.table("BreastCancer_FF_copykat_prediction.txt"),header = T)

BC_16um_filteredQ@meta.data$CopyKat<-"Undefined"
diploid<-ck[which((ck$copykat.pred=="diploid")|(ck$copykat.pred=="c1:low.confidence")),]
aneuploid<-ck[which((ck$copykat.pred=="aneuploid")|(ck$copykat.pred=="c2:low.confidence")),]

BC_16um_filteredQ@meta.data[which(row.names(BC_16um_filteredQ@meta.data) %in% aneuploid$cell.names),"CopyKat"]<-"Aneuploid"
BC_16um_filteredQ@meta.data[which(row.names(BC_16um_filteredQ@meta.data) %in% diploid$cell.names),"CopyKat"]<-"Diploid"

#saveRDS(BC_16um_filteredQ, paste(SampleName,"_8um_Filtered_Subset10k.rds",sep=""))

Bin16_dimplot_copykat <- DimPlot(BC_16um_filteredQ, reduction = "umap",pt.size = 1 , group.by = c("CopyKat"), ncol = 1) + ggtitle("UMAP of RCTD clustering") + theme(plot.margin = margin(0.2, 0.2, 0.2, 0.2), aspect.ratio = 1)

Bin16_Sdimplot_copykat <- SpatialDimPlot(BC_16um_filteredQ, label = F, repel = F,  
                        pt.size.factor = 10, 
                        image.alpha = 0.5,
                        stroke = 0.1,
                        group.by = c("CopyKat") ) + theme(plot.margin = margin(0.2, 0.2, 0.2, 0.2), aspect.ratio = 1)

plot_grid(Bin16_dimplot_copykat, Bin16_Sdimplot_copykat, ncol=2)

image

Input copykat ratio matrix to copykit for further analysis

library(copykit)
source('/data/shared/VisumHD/Copykit_copykat_functions.R')

CNA.test <- read.table(paste(SampleName,"_copykat_CNA_results.txt",sep="") ,header = T, check.names = FALSE)
colnames(CNA.test) <- sub("\\.1$", "-1", colnames(CNA.test))

cpkobj <- Copykit_copykat(CNA_result = CNA.test, genome_version = 'hg19',resolution = '200k',method = 'copykat')
assay(cpkobj, "segratio_unlog") <- 2^assay(cpkobj, "segment_ratios")
assay(cpkobj, "logr") <- assay(cpkobj, "segment_ratios")

meta <- colData(cpkobj) %>% data.frame()

#add celltype annotation
meta_16Q <- BC_16um_filteredQ@meta.data
meta_16Q$cell.name <- rownames(meta_16Q)

meta <- left_join(meta,meta_16Q)
colData(cpkobj)$CopyKat = meta$CopyKat
colData(cpkobj)$CellType = meta$CellType

color_heat = circlize::colorRamp2(breaks = c(-0.5,0,0.5), c("dodgerblue3", "white", "firebrick3"))
getwd()
#subset the data for quick demonstration
#cpkobj_sub = cpkobj[,sample(colnames(cpkobj),300,replace = F)]
pdf('Copykit_Heatmap_celltype.pdf',width = 10,height = 10)
print(plotHeatmap_copykatST(cpkobj,n_threads = 40, pt.name = 'test', order_cells = "hclust",
                            use_raster = T, assay = "segratio_unlog", col = color_heat,
                            row_split = 'CellType',
                            label = c('CopyKat','CellType')))
dev.off()

image

image

Key Observations

  1. All three tumor clusters (lumhr_1, lumhr_6, lumhr_8) show strong CNV signals, and their spatial locations match well with pathologist annotations of IDC and DCIS, confirming their malignant nature.

  2. lumhr_1 is located in the IDC region, while lumhr_6 and lumhr_8 are in DCIS areas. Although DCIS is non-invasive, both clusters already show clear CNV patterns, indicating early genetic changes.

  3. CAFs_2 are strongly enriched around lumhr_1 in the IDC region, suggesting stromal involvement in tumor invasion.

  4. The surrounding cell types differ between IDC and DCIS, reflecting changes in the tumor microenvironment during cancer progression.

Clone this wiki locally