## Integration of ScaleBio Single-Cell RNA-Seq Data and Vizgen MERSCOPE Spatial Transcriptomics Data

This notebook integrates ScaleBio scRNA-seq data with Vizgen MERSCOPE spatial transcriptomics data. Each dataset is normalized, reduced with PCA, clustered, and embedded with UMAP; cluster labels and expression profiles are then transferred from the scRNA-seq reference to the spatial data. Each step is commented to clarify its purpose within the analysis. The analysis was adapted from the following tutorial on the Seurat website.

https://satijalab.org/seurat/articles/spatial_vignette.html

This chunk defines the variables for the paths to the matrix files for the ScaleBio data.

In [1]:
scale_mtx_path <- "/path/to/ScaleBio/matrix.mtx"
scale_feature_path <- "/path/to/ScaleBio/features.tsv"
scale_barcode_path <- "/path/to/ScaleBio/barcodes.csv"

This chunk defines the variables for the paths to the Vizgen MERSCOPE input data.

In [2]:
vizgen_data_dir <- "/path/to/vizgen/data/directory"
vizgen_transcript_counts <- "/path/to/vizgen/cellpose_cell_by_gene.csv"
vizgen_spatial_meta_data <- "/path/to/vizgen/cellpose_cell_metadata.csv"
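
Because every later cell depends on these paths, it can help to confirm they exist before loading anything. The following is a minimal sketch in base R; `find_missing_inputs` is a hypothetical helper (not part of Seurat), and the demo uses temporary files so that it runs without the real data.

```r
# Hypothetical helper (not part of Seurat): return the paths that do not exist.
# file.exists() is TRUE for directories as well as files.
find_missing_inputs <- function(paths) {
  paths <- unlist(paths)
  paths[!file.exists(paths)]
}

# Demo on temporary files so the sketch runs anywhere.
demo_dir <- tempfile("inputs_")
dir.create(demo_dir)
present <- file.path(demo_dir, "matrix.mtx")
absent <- file.path(demo_dir, "features.tsv")
file.create(present)

missing_inputs <- find_missing_inputs(c(mtx = present, features = absent))
print(missing_inputs)
```

In the notebook, the same helper could be called on `scale_mtx_path`, `scale_feature_path`, `scale_barcode_path`, `vizgen_data_dir`, `vizgen_transcript_counts`, and `vizgen_spatial_meta_data`, stopping with an error if the result is not empty.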

This chunk loads the necessary packages and defines a custom function for loading the Vizgen MERSCOPE data into a Seurat object.

In [None]:
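library(Seurat)        # Analysis functions (ReadVizgen, SCTransform, RunPCA, ...)
library(SeuratObject)  # Object constructors (CreateCentroids, CreateFOV, ...)

# Custom function to load Vizgen data into a Seurat object.
#   data:  list returned by ReadVizgen() (uses $centroids, $microns, $transcripts, $metadata)
#   fov:   name under which the spatial coordinates are stored in the object
#   assay: name of the assay that holds the counts
#   z:     accepted for interface compatibility but not used in the function body
LoadVizgenCustom <- function(data, fov, assay = 'Spatial', z = 3L) {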
  cents <- CreateCentroids(data$centroids)  # Create centroids from spatial data.
  segmentations.data <- list(
    "centroids" = cents  # Store centroids in a list for segmentation data.
  )
  coords <- CreateFOV(
    coords = segmentations.data,
    type = c("centroids"),
    molecules = data$microns,
    assay = assay
  )
  # Create a Seurat object using the transcript count data and metadata.
  obj <- CreateSeuratObject(counts = data$transcripts, assay = assay, meta.data = data$metadata)
  # Add coordinates to the Seurat object.
  obj[[fov]] <- coords
  return(obj)
}

This chunk processes the ScaleBio single-cell RNA-seq data: normalization, PCA, clustering, and UMAP.

In [None]:
# Load the ScaleBio scRNA-seq data
mtx <- ReadMtx(
    mtx = scale_mtx_path,
    features = scale_feature_path,
    cells = scale_barcode_path
)

# Create a Seurat object
scrna <- CreateSeuratObject(mtx)

# Normalize and scale the data using SCTransform
scrna <- SCTransform(scrna, ncells = 3000, conserve.memory = TRUE)

# Perform Principal Component Analysis (PCA)
scrna <- RunPCA(scrna, features = VariableFeatures(object = scrna))

# Find the nearest neighbors
scrna <- FindNeighbors(scrna, dims = 1:15)

# Cluster the cells
scrna <- FindClusters(scrna, resolution = 0.1)

# Run UMAP for dimensionality reduction to visualize the clusters in 2D space.
scrna <- RunUMAP(scrna, dims = 1:15, verbose = FALSE)


Visualization of the ScaleBio single-cell RNA-seq data.

In [None]:
UMAPPlot(object = scrna)

This chunk performs analysis of the Vizgen Merscope spatial data.

In [None]:
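# Read the MERSCOPE outputs: cell-by-gene counts, cell centroids, and per-cell volume.
# molecules = NA skips the molecule coordinates file; filter drops the "Blank-" control probes.
vizgen <- ReadVizgen(
    data.dir = vizgen_data_dir,
    transcripts = vizgen_transcript_counts,
    spatial = vizgen_spatial_meta_data,
    molecules = NA,
    type = "centroids",
    metadata = "volume",
    filter = "^Blank-"
)

# Build a Seurat object from the loaded list using the custom function; the FOV is named "slice1".
vizgen <- LoadVizgenCustom(vizgen, "slice1")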

# Filter the Vizgen data
vizgen <- subset(vizgen, subset = nCount_Spatial > 50 & nCount_Spatial < 3000 & volume < 2500 & volume > 100)

# Calculate the total counts for each gene
gene_sums <- rowSums(vizgen@assays$Spatial$counts)

# Get all gene names from the Vizgen data.
all.genes <- row.names(vizgen)

# Normalize and scale the Vizgen data using SCTransform.
vizgen <- SCTransform(vizgen, assay = "Spatial", clip.range = c(-10,10))

# Perform PCA on all panel genes
vizgen <- RunPCA(vizgen, features = all.genes, verbose = FALSE)

# Find the nearest neighbors
vizgen <- FindNeighbors(vizgen, dims = 1:15)

# Cluster the cells
vizgen <- FindClusters(vizgen, resolution = 0.1)

# Run UMAP for dimensionality reduction to visualize the clusters.
vizgen <- RunUMAP(vizgen, dims = 1:15, verbose = FALSE)
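
The filtering step in the cell above keeps cells with 50 < `nCount_Spatial` < 3000 and 100 < `volume` < 2500. Before applying such a filter, it can be useful to see how many cells each criterion removes. The sketch below reproduces the same thresholds in base R on a toy metadata table; the values are made up for illustration and are not taken from the real data.

```r
# Toy per-cell metadata; the values are made up for illustration.
meta <- data.frame(
  nCount_Spatial = c(10, 120, 800, 3500, 400),
  volume         = c(500, 50, 1200, 900, 2600),
  row.names      = paste0("cell", 1:5)
)

# Thresholds copied from the subset() call in the notebook.
pass_counts <- meta$nCount_Spatial > 50 & meta$nCount_Spatial < 3000
pass_volume <- meta$volume > 100 & meta$volume < 2500

# Count how many cells survive each criterion and both together.
qc_summary <- c(
  total       = nrow(meta),
  pass_counts = sum(pass_counts),
  pass_volume = sum(pass_volume),
  pass_both   = sum(pass_counts & pass_volume)
)
print(qc_summary)

# Names of the cells that would be kept.
kept_cells <- rownames(meta)[pass_counts & pass_volume]
print(kept_cells)
```

On the real object, the per-cell metadata table returned by `vizgen[[]]` could take the place of `meta`, provided the summary is run before the `subset()` call.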

Visualization of the Vizgen MERSCOPE data.

In [None]:
UMAPPlot(object = vizgen)

This chunk integrates the ScaleBio single-cell RNA-seq data with the Vizgen MERSCOPE spatial transcriptomics data.

In [None]:
# Add a metadata column with the log10-transformed UMI count for each cell.
vizgen$log_umi <- log10(vizgen$nCount_Spatial)

# Identify genes shared between the Vizgen and scRNA-seq datasets.
sharedGenes <- intersect(row.names(vizgen), row.names(scrna))

# Find anchors for data integration between the scRNA-seq and Vizgen datasets using shared genes.
anchors <- FindTransferAnchors(reference = scrna, query = vizgen, normalization.method = "SCT", features = sharedGenes, reduction = 'cca')

# Transfer cell type predictions from the scRNA-seq data to the Vizgen data.
predictions_1 <- TransferData(
    anchorset = anchors,
    refdata = Idents(scrna),
    weight.reduction = 'cca'
)

# Add the cell type predictions as metadata to the Vizgen object.
vizgen <- AddMetaData(object = vizgen, metadata = predictions_1)

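# Transfer (impute) SCT expression values from the scRNA-seq reference onto the Vizgen cells.
# With refdata = "SCT", TransferData() returns an assay, so it is stored with [[ ]].
vizgen[["projRNA"]] <- TransferData(
    anchorset = anchors,
    reference = scrna,
    refdata = "SCT",
    weight.reduction = 'cca'
)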

# Set the default assay in the Vizgen object to the projected RNA data.
DefaultAssay(vizgen) <- "projRNA"

# Switch the default assay back to the original Spatial data.
DefaultAssay(vizgen) <- "Spatial"
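
Anchor finding in the cell above is restricted to genes present in both datasets, so the size of that overlap matters: a spatial panel typically measures far fewer genes than an scRNA-seq experiment. The sketch below shows one way to report the overlap in base R. The gene names are made-up placeholders, not the real panel.

```r
# Made-up gene lists standing in for row.names(vizgen) and row.names(scrna).
spatial_genes   <- c("Gad1", "Slc17a7", "Aqp4", "Pdgfra")
reference_genes <- c("Gad1", "Slc17a7", "Aqp4", "Mbp", "Snap25")

# Genes usable for anchor finding, and panel genes absent from the reference.
shared_genes  <- intersect(spatial_genes, reference_genes)
missing_genes <- setdiff(spatial_genes, reference_genes)

# Fraction of the spatial panel that is covered by the reference.
coverage <- length(shared_genes) / length(spatial_genes)

cat(sprintf("%d of %d panel genes found in the reference (%.0f%%)\n",
            length(shared_genes), length(spatial_genes), 100 * coverage))
print(missing_genes)
```

A low coverage value often points to a naming mismatch (for example, differences in capitalization or gene identifiers versus gene symbols) rather than a real absence, so the missing genes are worth inspecting by eye.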

Visualization of the integrated data sets.

In [None]:
UMAPPlot(object = vizgen, group.by = "predicted.id")
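
For categorical labels, `TransferData()` returns a `prediction.score.max` column alongside `predicted.id`, and both are added to the metadata by `AddMetaData()`. One optional follow-up, sketched here in base R, is to mark low-scoring cells as unassigned before interpreting the plot. The table is a toy stand-in with made-up values, and the 0.6 cutoff and the `filtered.id` column name are assumptions for illustration only.

```r
# Toy stand-in for the prediction columns in the metadata; values are made up.
pred <- data.frame(
  predicted.id         = c("0", "1", "0", "2", "1"),
  prediction.score.max = c(0.95, 0.40, 0.72, 0.55, 0.88),
  row.names            = paste0("cell", 1:5),
  stringsAsFactors     = FALSE
)

# Hypothetical cutoff; choose one after inspecting the score distribution.
min_score <- 0.6

# Keep the predicted label only where the maximum score reaches the cutoff.
pred$filtered.id <- ifelse(pred$prediction.score.max >= min_score,
                           pred$predicted.id, "unassigned")

print(table(pred$filtered.id))
```

On the real object, the same logic could be applied to `vizgen$predicted.id` and `vizgen$prediction.score.max`, and the resulting column passed to `group.by` in the plot.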

Save ScaleBio single-cell RNA-Seq data analysis

In [13]:
saveRDS(object = scrna, file = "scale_data.rds")

Save integrated Vizgen spatial transcriptomic data

In [None]:
saveRDS(object = vizgen, file = "vizgen_data.rds")
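
The saved files can be restored in a later session with `readRDS()`, which returns the object exactly as it was written. The sketch below demonstrates the round trip on a small stand-in list written to a temporary file; the real Seurat objects would be handled the same way, with `library(Seurat)` loaded first.

```r
# Small stand-in object; the real notebook saves Seurat objects instead.
toy <- list(clusters = factor(c("0", "1", "0")), n_cells = 3L)

# Write to a temporary file, then read it back.
rds_path <- tempfile(fileext = ".rds")
saveRDS(object = toy, file = rds_path)
restored <- readRDS(rds_path)

# The restored object matches the original.
print(identical(toy, restored))
```

For the notebook's outputs, this corresponds to `scrna <- readRDS("scale_data.rds")` and `vizgen <- readRDS("vizgen_data.rds")`.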