![GML Logo](../images/logo.png)

# Tissue Segmentation with Voyager

Contact: Andrew Newman (andrew.newman@uq.edu.au)

The Visium spatial transcriptomics platform by 10X Genomics, based on the Spatial Transcriptomics (ST) technology published in 2016, captures mRNA from tissue sections on spatially barcoded spots immobilized on a microarray slide. After constructing a barcoded cDNA library, mRNA transcripts are mapped to specific spots on the slide and overlaid with a high-resolution tissue image, enabling visualization and analysis of gene expression in a spatial context.

Visium provides:
* 55 µm spot diameter and 100 µm center-to-center distance,
* Capture of 1-10% of the total mRNA molecules present in a given spot,
* A protocol that has been adapted for long-read sequencing, and
* Compatibility with fresh frozen or FFPE tissue samples.

More reading:
* [An introduction to spatial transcriptomics for biomedical research](https://genomemedicine.biomedcentral.com/articles/10.1186/s13073-022-01075-1)
* [Museum of spatial transcriptomics](https://www.nature.com/articles/s41592-022-01409-2)

# Setup

The cells below:
* load the libraries,
* set up functions to render graphics in the notebook, and
* define the paths for reading in the data.

In [None]:
library(dplyr)
library(Voyager)
library(SpatialExperiment)
library(SpatialFeatureExperiment)
library(SingleCellExperiment)
library(ggplot2)
library(scater)
library(rlang)
library(scran)
library(scuttle)
library(terra)
library(sf)
library(rmapshaper)
library(stringr)
library(EBImage)
library(patchwork)
library(bluster)
library(rjson)
theme_set(theme_bw())

In [None]:
# Layout
custom_theme <- function() {
  theme_bw() +
    theme(
      legend.text = element_text(size = 14),
      legend.title = element_text(size = 16, face = "bold"),
      axis.text = element_text(size = 12),
      axis.title = element_text(size = 14, face = "bold"),
      legend.position = "right",
      legend.box.just = "right"
    )
}
options(repr.plot.width = 10, repr.plot.height = 8)

In [None]:
data_dir <- R.utils::getAbsolutePath('../data')
mouse_dir <- glue::glue("{data_dir}/Visium_Mouse_Olfactory_Bulb/outs")

## Visium Files Overview

This is a brief introduction to the raw Visium output data that is produced from an experiment.

### Count Data

The [count matrix directory](https://www.10xgenomics.com/support/software/space-ranger/latest/analysis/space-ranger-feature-barcode-matrices) is either **filtered_feature_bc_matrix** or **raw_feature_bc_matrix**. Each contains:
* a count matrix,
* feature or gene matrix, and
* barcode (cell/spot) information.

Run the code to display the first six lines of the matrix file:

In [None]:
# Read the raw text lines: the file is in Matrix Market format, not CSV.
readLines(glue::glue("{mouse_dir}/raw_feature_bc_matrix/matrix.mtx.gz"), n = 6)

This shows that the matrix was produced using Space Ranger version 2. The next line indicates:
* 32285 rows (genes),
* 4992 columns (spots),
* 6382095 non-zero entries in the matrix.

Subsequent lines are the data: for example, the gene at index 1393 has a count of 1 in spot 1.
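
The header layout described above can be reproduced on a toy matrix. The following is a minimal sketch, assuming the Matrix package is installed. The 3-gene by 2-spot values are made up for illustration and are not taken from the mouse dataset.

```r
library(Matrix)

# Toy count matrix (hypothetical values): 3 genes (rows) x 2 spots (columns)
toy <- sparseMatrix(i = c(1, 3, 2), j = c(1, 1, 2), x = c(5, 1, 2), dims = c(3, 2))

# Write it in Matrix Market format, the same text format used by matrix.mtx
mtx_file <- tempfile(fileext = ".mtx")
invisible(writeMM(toy, mtx_file))

# The first line not starting with "%" holds: rows, columns, non-zero entries
mtx_lines <- readLines(mtx_file)
dims_line <- mtx_lines[!startsWith(mtx_lines, "%")][1]
mtx_dims <- scan(text = dims_line, quiet = TRUE)
print(mtx_dims)

# Reading it back gives a sparse matrix with genes as rows and spots as columns
m <- readMM(mtx_file)
print(dim(m))
```

Each remaining line of the file is a triplet of gene index, spot index, and count, which is how the data lines of the Visium matrix file are read as well.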

Next, run the following code to display the features file:

In [None]:
# The features file is tab-separated and has no header row.
head(read.delim(glue::glue("{mouse_dir}/raw_feature_bc_matrix/features.tsv.gz"), header = FALSE))

This shows, for each row, the Ensembl ID, the gene symbol (for example "Gm1992"), and the feature type (here "Gene Expression").

Next, run the following code to display the barcodes file:

In [None]:
# The barcodes file has no header row, so do not treat the first barcode as one.
head(read.csv(glue::glue("{mouse_dir}/raw_feature_bc_matrix/barcodes.tsv.gz"), header = FALSE))

This gives the barcode (spot) ID for each column of the matrix.

### Spatial Metadata

The scalefactors_json.json file contains image metadata:
* **tissue_hires_scalef** and **tissue_lowres_scalef** are the ratios of the size of the high-resolution (but not full-resolution) and low-resolution H&E images, respectively, to the size of the full-resolution image.
* **fiducial_diameter_fullres** is the diameter, in pixels of the full-resolution image, of each fiducial spot used to align the spots to the H&E image.
* **spot_diameter_fullres** is the diameter, in pixels, of each Visium spot in the full-resolution H&E image.

In [None]:
scale_factors <- fromJSON(file = glue::glue("{mouse_dir}/spatial/scalefactors_json.json"))
str(scale_factors)
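
To make the role of the scale factors concrete, here is a small sketch of converting between full-resolution and hi-res pixel coordinates. All numbers and the name `toy_scale_factors` are hypothetical stand-ins, not values read from this dataset.

```r
# Hypothetical scale factors, standing in for the values in scalefactors_json.json
toy_scale_factors <- list(
  tissue_hires_scalef = 0.15,
  tissue_lowres_scalef = 0.045,
  spot_diameter_fullres = 90
)

# A spot centre in full-resolution pixel coordinates (made-up values)
spot_fullres <- c(x = 4000, y = 6000)

# Multiply by a scale factor to move from full resolution to a downsampled image
spot_hires <- spot_fullres * toy_scale_factors$tissue_hires_scalef

# Divide by it to go back from hi-res pixels to full-resolution pixels
spot_back <- spot_hires / toy_scale_factors$tissue_hires_scalef

# The spot diameter scales in the same way
spot_diameter_hires <- toy_scale_factors$spot_diameter_fullres *
  toy_scale_factors$tissue_hires_scalef

print(spot_hires)
print(spot_diameter_hires)
```

The same division is what brings a polygon traced on the hi-res image back into the full-resolution coordinate system used by the spots.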

### Tissue Metadata

The tissue_positions.csv file (named tissue_positions_list.csv in Space Ranger versions before 2.0) contains information about each spot/barcode:
* **in_tissue** indicates whether each spot is in tissue (1 means yes and 0 means no), as automatically detected by Space Ranger or manually annotated in the Loupe browser,
* **array_row** and **array_col** are the coordinates of the spot on the array of spots, and
* **pxl_row_in_fullres** and **pxl_col_in_fullres** are the coordinates of the spot in the full resolution image.

In [None]:
head(read.csv(glue::glue("{mouse_dir}/spatial/tissue_positions.csv")))
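
As a quick illustration of how these columns are typically used, the sketch below builds a tiny table with the same column names and keeps only the in-tissue spots. The barcodes and coordinates are invented for the example.

```r
# Toy table with the same columns as tissue_positions.csv (all values hypothetical)
positions <- data.frame(
  barcode = c("AAAC-1", "AAAG-1", "AACA-1", "AACC-1"),
  in_tissue = c(0, 1, 1, 0),
  array_row = c(0, 0, 1, 1),
  array_col = c(0, 2, 1, 3),
  pxl_row_in_fullres = c(1000, 1000, 1120, 1120),
  pxl_col_in_fullres = c(2000, 2140, 2070, 2210)
)

# How many spots are flagged as in or out of tissue
in_tissue_counts <- table(positions$in_tissue)
print(in_tissue_counts)

# Keep only the spots that overlap tissue
tissue_spots <- subset(positions, in_tissue == 1)
print(tissue_spots$barcode)
```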

## Read Visium Data

In [None]:
raw_sfe <- SpatialFeatureExperiment::read10xVisiumSFE(dirs = mouse_dir, samples = ".", type = "sparse", data = "raw")

## Read Hi-res Image

In [None]:
img <- readImage(glue::glue("{mouse_dir}/spatial/tissue_hires_image.png"))
EBImage::display(img)

<img src="images/mouse_bulb_hires.png" height="240">

The following shows the separate red, green, and blue channels of the H&E image, followed by a histogram of their intensities.

In [None]:
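# Work on a copy so the colour image stays available for later overlays.
img2 <- img
# Switching to grayscale mode exposes the R, G and B channels as separate frames.
EBImage::colorMode(img2) <- EBImage::Grayscale
EBImage::display(img2, all = TRUE)
# hist() on an Image draws with base graphics, so a ggplot2 theme cannot be added to it.
EBImage::hist(img)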

<img src="images/mouse_bulb_rgb.png">

<img src="images/mouse_bulb_histogram.png">

Next, we use [EBImage](https://www.bioconductor.org/packages/devel/bioc/vignettes/EBImage/inst/doc/EBImage-introduction.html) to create a mask. We threshold the blue channel, keeping pixels with an intensity below 0.87 (87%) as foreground. We then apply an opening (erosion followed by dilation) and a closing (dilation followed by erosion) with a circular kernel (brush). During erosion, the brush slides over the image, and a pixel is set to the background value if any part of the brush overlaps with a background pixel. During dilation, the brush slides over the image, and a pixel is set to the foreground value if any part of the brush overlaps with a foreground pixel. This removes small artefacts and creates a smooth boundary.

In [None]:
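# Foreground = pixels whose blue channel (frame 3) is below the 0.87 threshold.
mask <- img2[,,3] < 0.87
# Disc-shaped brush used as the kernel for both operations.
kern <- EBImage::makeBrush(3, shape='disc')
# Opening is applied first, then closing is applied to its result.
mask_open <- EBImage::opening(mask, kern)
mask_close <- EBImage::closing(mask_open, kern)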
# Display the mask after both opening and closing.
EBImage::display(mask_close)

<img src="images/mouse_bulb_mask_1.png">
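
To see what opening and closing do, the sketch below implements both in base R on small logical matrices. It is an illustration only: it uses a 3x3 square brush instead of EBImage's disc, treats pixels outside the image as background, and all function names are hypothetical helpers rather than EBImage functions.

```r
# Count foreground pixels in each pixel's 3x3 neighbourhood.
# Pixels outside the image are treated as background.
neighbourhood_count <- function(m) {
  padded <- matrix(0L, nrow(m) + 2, ncol(m) + 2)
  padded[seq_len(nrow(m)) + 1, seq_len(ncol(m)) + 1] <- m
  total <- matrix(0L, nrow(m), ncol(m))
  for (dr in 0:2) {
    for (dc in 0:2) {
      total <- total + padded[seq_len(nrow(m)) + dr, seq_len(ncol(m)) + dc]
    }
  }
  total
}

# Erosion keeps a pixel only if the whole brush sits on foreground;
# dilation sets a pixel if the brush touches any foreground pixel.
erode <- function(m) neighbourhood_count(m) == 9L
dilate <- function(m) neighbourhood_count(m) >= 1L
opening <- function(m) dilate(erode(m))
closing <- function(m) erode(dilate(m))

# Opening: a 3x3 block survives, an isolated speck is removed
speck <- matrix(FALSE, 7, 7)
speck[2:4, 2:4] <- TRUE
speck[6, 6] <- TRUE
opened <- opening(speck)

# Closing: a one-pixel hole inside a 5x5 block is filled
hole <- matrix(FALSE, 7, 7)
hole[2:6, 2:6] <- TRUE
hole[4, 4] <- FALSE
closed <- closing(hole)

print(opened * 1L)
print(closed * 1L)
```

In the toy input the speck plays the role of a staining artefact outside the tissue, and the hole plays the role of a pale patch inside it.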

Next, we process the mask further. EBImage's bwlabel command labels every connected set of foreground pixels as a separate object, and computeFeatures.shape computes shape features for each labelled object. We then remove objects by area (those smaller than 100 pixels) and by label (a hard-coded list that includes object 797), and fill any holes in the remaining objects.

In [None]:
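# Label each connected foreground region with its own integer ID.
mask_label <- EBImage::bwlabel(mask_close)
# Shape features (area, perimeter, radius statistics) for each labelled object.
fts <- EBImage::computeFeatures.shape(mask_label)
# Extent of the largest object along each dimension (only needed for the optional cropping below).
max_ind <- which.max(fts[,"s.area"])
inds <- which(as.array(mask_label) == max_ind, arr.ind = TRUE)
row_inds <- c(seq_len(min(inds[,1])-1), seq(max(inds[,1])+1, nrow(mask_label), by = 1))
col_inds <- c(seq_len(min(inds[,2])-1), seq(max(inds[,2])+1, ncol(mask_label), by = 1))
# mask_label[row_inds, ] <- 0
# mask_label[,col_inds] <- 0
# Keep the features of every labelled object (label 0 is background), largest first.
fts2 <- fts[setdiff(unique(as.vector(mask_label)), 0),]
fts2 <- fts2[order(fts2[,"s.area"], decreasing = TRUE),]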

In [None]:
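# Hard-coded object labels to remove (specific to the labelling of this image).
polygon_ids_to_remove <- c(174, 561, 546, 484, 150, 74, 622, 551, 121, 47, 450, 849, 797, 461, 840, 862, 839, 775)
# Also remove every object with an area below this many pixels.
polygon_area <- 100
polygon_by_area_to_remove <- as.numeric(rownames(fts2)[fts2[,"s.area"] < polygon_area])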

In [None]:
mask_label[mask_label %in% c(polygon_ids_to_remove, polygon_by_area_to_remove)] <- 0
mask_label <- EBImage::fillHull(mask_label)
EBImage::display(mask_label)

<img src="images/mouse_bulb_mask_2.png">
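
The removal step can be illustrated without EBImage. The following base R sketch applies the same area-and-label filter to a hand-written label matrix. The matrix, the area cutoff of 2, and the manually removed label are all hypothetical.

```r
# Hand-written label matrix standing in for the output of bwlabel (0 = background)
labels <- matrix(c(
  1, 1, 0, 0, 2,
  1, 1, 0, 0, 0,
  0, 0, 0, 3, 3,
  4, 0, 0, 3, 3,
  0, 0, 0, 3, 3
), nrow = 5, byrow = TRUE)

# Area of each object = number of pixels carrying its label
areas <- table(labels[labels > 0])

# Objects below the area cutoff, plus one label removed by hand
min_area <- 2
small_ids <- as.numeric(names(areas)[areas < min_area])
manual_ids <- c(1)

# Set every pixel of the unwanted objects to background
labels[labels %in% c(small_ids, manual_ids)] <- 0
print(labels)
```

Only object 3 remains: objects 2 and 4 fall below the area cutoff, and object 1 is removed by label even though it is large enough.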

Next, we can visualise the area of each object to check whether the processing is adequate. Ideally, a small number of objects account for most of the area.

In [None]:
plot(fts2[,1][-1], type = "l", ylab = "Area")

<img src="images/mouse_bulb_mask_stats.png">

In [None]:
head(fts2, 20)
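
One way to put a number on "a small number of objects account for most of the area" is the cumulative area share. The sketch below uses made-up areas, not the values from `fts2`, and the 95% cutoff is an arbitrary choice for illustration.

```r
# Hypothetical object areas in pixels
areas <- c(450, 52000, 80, 3100, 15, 120, 40)

# Sort from largest to smallest and compute the cumulative share of total area
sorted_areas <- sort(areas, decreasing = TRUE)
cum_share <- cumsum(sorted_areas) / sum(sorted_areas)

# Number of objects needed to reach 95% of the total area
n_for_95 <- which(cum_share >= 0.95)[1]
print(round(cum_share, 3))
print(n_for_95)
```

With the real data, the same calculation could be run on the `s.area` column of `fts2`.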

We can now visualise the final output.

In [None]:
display(paintObjects(mask_label, img, col=c("red", "yellow"), opac=c(1, 0.3)))

<img src="images/mouse_bulb_mask_3.png">

In [None]:
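# Convert a labelled mask to sf polygons; `keep` is the proportion of vertices retained when simplifying.
raster2polygon <- function(seg, keep = 0.2) {
    # Flip and transpose to move from EBImage's array layout to terra's raster layout.
    seg <- EBImage::flip(seg)
    r <- terra::rast(as.array(seg), extent = terra::ext(0, nrow(seg), 0, ncol(seg))) |> terra::trans()
    # Background (label 0) becomes NA so that it is excluded from the polygons.
    r[r < 1] <- NA
    contours <- sf::st_as_sf(terra::as.polygons(r, dissolve = TRUE))
    simplified <- rmapshaper::ms_simplify(contours, keep = keep)
    return(list(full = contours, simplified = simplified))
}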

In [None]:
tb <- raster2polygon(mask_label)
print(head(tb$full,5))
print(head(tb$simplified,5))

We can now visualise a sample of the tissue boundary against the H&E image.

In [None]:
scale_factors <- fromJSON(file = glue::glue("{mouse_dir}/spatial/scalefactors_json.json"))
# The mask was traced on the hi-res image, so divide by the scale factor to get full-resolution pixels.
tb[["simplified"]][["geometry"]] <- tb[["simplified"]][["geometry"]] / scale_factors[["tissue_hires_scalef"]]
# Flag mitochondrial genes and compute per-spot QC metrics, including the total count ("sum").
is_mt <- str_detect(rowData(raw_sfe)$symbol, "^mt-")
segmented_sfe <- scuttle::addPerCellQCMetrics(raw_sfe, subsets = list(mito = is_mt))
colData(segmented_sfe)[["nCounts"]] <- colSums(counts(segmented_sfe))
# Attach the simplified polygon as the tissue boundary annotation.
SpatialFeatureExperiment::tissueBoundary(segmented_sfe) <- tb[["simplified"]]
Voyager::plotSpatialFeature(segmented_sfe, "sum", annotGeometryName = "tissueBoundary",
                   annot_fixed = list(fill = NA, color = "black"),
                   image_id = "lowres") + custom_theme()

<img src="images/mouse_bulb_mask_4.png">

The fiducials indicate that the image needs to be reoriented (so the pyramid is in the bottom left). Here we do this by transposing the object.

In [None]:
segmented_sfe <- SpatialFeatureExperiment::transpose(segmented_sfe)
Voyager::plotSpatialFeature(segmented_sfe, "sum", annotGeometryName = "tissueBoundary", 
                   annot_fixed = list(fill = NA, color = "black"),
                   image_id = "lowres")

<img src="images/mouse_bulb_mask_5.png">

We can now flag which spots intersect the tissue boundary and which are fully covered by it. This lets us compare the Space Ranger "in tissue" annotation with our new tissue mask.

In [None]:
segmented_sfe$int_tissue <- SpatialFeatureExperiment::annotPred(segmented_sfe, colGeometryName = "spotPoly", 
                            annotGeometryName = "tissueBoundary",
                            pred = st_intersects)
segmented_sfe$cov_tissue <- SpatialFeatureExperiment::annotPred(segmented_sfe, colGeometryName = "spotPoly", 
                            annotGeometryName = "tissueBoundary",
                            pred = st_covered_by)
segmented_sfe$diff_sr <- 
    dplyr::case_when(
        segmented_sfe[['in_tissue']] == segmented_sfe[['int_tissue']] ~ "same",
        segmented_sfe[['in_tissue']] & !segmented_sfe[['int_tissue']] ~ "Space Ranger",
        segmented_sfe[['int_tissue']] & !segmented_sfe[['in_tissue']] ~ "segmentation"
    ) |> 
    factor(levels = c("Space Ranger", "same", "segmentation"))
Voyager::plotSpatialFeature(
    segmented_sfe, "diff_sr", 
    annotGeometryName = "tissueBoundary", 
    annot_fixed = list(fill = NA, size = 0.5, color = "black")) +
    scale_fill_brewer(type = "div", palette = 4) + custom_theme()

<img src="images/mouse_bulb_space_ranger_comparison.png">
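
The difference between the two predicates, and the comparison with Space Ranger, can be sketched in base R with a simplified geometry. The example below assumes a rectangular "tissue" and circular spots. The coordinates and the Space Ranger flags are made up, and the real analysis uses sf predicates on polygons instead.

```r
# Hypothetical rectangular tissue region and circular spots (x, y = centre, r = radius)
tissue <- list(xmin = 0, xmax = 10, ymin = 0, ymax = 10)
spots <- data.frame(
  x = c(5, 10, 14),
  y = c(5, 5, 5),
  r = 1,
  in_tissue = c(TRUE, FALSE, FALSE)  # made-up Space Ranger flags
)

# Distance from each spot centre to the nearest point of the rectangle (0 if inside)
dx <- pmax(tissue$xmin - spots$x, 0, spots$x - tissue$xmax)
dy <- pmax(tissue$ymin - spots$y, 0, spots$y - tissue$ymax)

# A spot intersects the tissue if any part of the circle reaches the rectangle
intersects <- sqrt(dx^2 + dy^2) <= spots$r

# A spot is covered only if the whole circle lies inside the rectangle
covered <- spots$x - spots$r >= tissue$xmin & spots$x + spots$r <= tissue$xmax &
  spots$y - spots$r >= tissue$ymin & spots$y + spots$r <= tissue$ymax

# Same three-way comparison as in the notebook, written with ifelse
diff_sr <- ifelse(spots$in_tissue == intersects, "same",
  ifelse(spots$in_tissue, "Space Ranger", "segmentation"))
diff_sr <- factor(diff_sr, levels = c("Space Ranger", "same", "segmentation"))

print(data.frame(spots, intersects, covered, diff_sr))
```

The second spot straddles the edge, so it intersects the tissue without being covered by it, and it is labelled "segmentation" because only the new mask counts it as in tissue.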

Finally, we save the output as an RDS to be used in the "voyager" tutorial.

In [None]:
# Uncomment the lines below to write a new output file for downstream processing.
# data_dir <- R.utils::getAbsolutePath('../data')
# saveRDS(segmented_sfe, glue::glue("{data_dir}/Visium_Mouse_Olfactory_Bulb.rds"))

# More Information

The homepage for the Voyager R project is https://pachterlab.github.io/voyager/index.html

Introduction to Visium Technology:
* https://pachterlab.github.io/voyager/articles/visium_landing.html

This tutorial was based on the following:
* https://pachterlab.github.io/voyager/articles/visium_10x.html
* https://pachterlab.github.io/voyager/articles/vig1_visium_basic.html
* https://pachterlab.github.io/voyager/articles/vig2_visium.html
* https://pachterlab.github.io/voyager/articles/visium_10x_spatial.html
* https://pachterlab.github.io/voyager/articles/multispati.html