In [None]:
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>", fig.align = "center"
)

In [None]:
# Install Google Colab dependencies
# Note: this can take 30+ minutes (many of the dependencies include C++ code, which needs to be compiled)

# First install `sf`, `ragg` and `textshaping` and their system dependencies:
system("apt-get -y update && apt-get install -y  libudunits2-dev libgdal-dev libgeos-dev libproj-dev libharfbuzz-dev libfribidi-dev")
install.packages("sf")
install.packages("textshaping")
install.packages("ragg")

# Install system dependencies of some other R packages that Voyager either imports or suggests:
system("apt-get install -y libfribidi-dev libcairo2-dev libmagick++-dev")

# Install Voyager from Bioconductor:
install.packages("BiocManager")
BiocManager::install(version = "3.16", ask = FALSE, update = FALSE, Ncpus = 2)
BiocManager::install("scater")
system.time(
  BiocManager::install("Voyager", dependencies = TRUE, Ncpus = 2, update = FALSE)
)

packageVersion("Voyager")

# Introduction

Xenium is a new technology from 10x Genomics for single-cell-resolution, smFISH-based spatial transcriptomics. The first Xenium dataset is from formalin-fixed paraffin-embedded (FFPE) human breast tumor, reported in [@Janesick2022-rp] and downloaded from the [10X website](https://www.10xgenomics.com/products/xenium-in-situ/preview-dataset-human-breast).

The gene count matrix was downloaded as an HDF5 file and read into R as a `SingleCellExperiment` (SCE) object with `DropletUtils::read10xCounts()`. The gene count matrix is originally a `DelayedArray`, so the data is not all loaded into memory. For now, the matrix has been converted into an in-memory `dgCMatrix`. For the next release, however, we would like to write another vignette about on-disk analyses. The challenge is representing `sf` data frames on disk, perhaps with [`sedona`](https://github.com/apache/incubator-sedona) and [`SQLDataFrame`](https://bioconductor.org/packages/release/bioc/html/SQLDataFrame.html).

The cell metadata (including centroid coordinates) and cell segmentation polygons were downloaded as `parquet` files, a more compact way to store columnar data than CSV, and read into R as data frames with `arrow::read_parquet()`. The cell polygons were converted into an `sf` data frame with `SpatialFeatureExperiment::df2sf()`. Then the SCE object was converted into a `SpatialFeatureExperiment` (SFE) object and the polygon geometry was added to it. The resulting SFE object is available in the `SFEData` package.

Here we load the packages used in this vignette.

In [None]:
library(Voyager)
library(SFEData) # Pushed fix to Bioc but changes may take a while to show up
library(SingleCellExperiment)
library(SpatialExperiment)
library(SpatialFeatureExperiment)
library(ggplot2)
library(stringr)
library(scater) # devel version of plotExpression
library(scuttle)
library(BiocParallel)
library(BiocSingular)
library(bluster)
library(scran)
library(patchwork)
theme_set(theme_void())

In [None]:
# The fixed version of this function may not immediately show up on Bioconductor
# Use remotes::install_github("pachterlab/SFEData") if Bioc version doesn't work
(sfe <- JanesickBreastData(dataset = "rep2"))

There are 118708 cells in this dataset, a little more than in the CosMX dataset.

The SFE object doesn't have column names (i.e. cell IDs). Here we assign cell IDs.

In [None]:
colnames(sfe) <- seq_len(ncol(sfe))

This is what the tissue, with the cell outlines, looks like:

In [None]:
ggplot(cellSeg(sfe)) + geom_sf() +
    theme_bw() +
    scale_x_continuous(expand = expansion()) +
    scale_y_continuous(expand = expansion())

Plot cell density in space

In [None]:
plotCellBin2D(sfe) + theme_bw()

# Quality control
## Cells
Some QC metrics are precomputed and are stored in `colData`

In [None]:
names(colData(sfe))

Since there're more cells, it would be better to plot the tissue larger, so we'll plot the histogram of QC metrics and the spatial plots separately, unlike in the CosMx vignette.

In [None]:
n_panel <- 313
colData(sfe)$nCounts_normed <- sfe$nCounts/n_panel
colData(sfe)$nGenes_normed <- sfe$nGenes/n_panel

Here we divided nCounts by the total number of genes probed, so this histogram is comparable to those from other smFISH-based datasets. 

In [None]:
plotColDataHistogram(sfe, c("nCounts_normed", "nGenes_normed")) + theme_bw()

Compared to the [FFPE CosMX non-small cell lung cancer dataset](https://pachterlab.github.io/voyager/articles/vig4_cosmx.html#cells), more transcripts per gene on average and a larger proportion of all genes are detected in this dataset, which is also FFPE. However, this should be interpreted with care, since these two datasets are from different tissues and have different gene panels, so this may or may not indicate that Xenium has better detection efficiency than CosMX.

In [None]:
plotSpatialFeature(sfe, "nCounts", colGeometryName = "cellSeg")

There seem to be FOV artifacts. However, the cell ID and FOV information were unavailable so we cannot examine them. 

In [None]:
plotSpatialFeature(sfe, "nGenes", colGeometryName = "cellSeg")

A standard examination is to look at the relationship between nCounts and nGenes:

In [None]:
plotColDataBin2D(sfe, "nCounts", "nGenes") + theme_bw()

There appear to be two branches. 

Here we plot the distribution of cell area

In [None]:
plotColDataHistogram(sfe, c("cell_area", "nucleus_area"), scales = "free_y") +
    theme_bw()

That should be in pixels. There's a very long tail. The nuclei are much smaller than the cells.

How is cell area distributed in space?

In [None]:
plotSpatialFeature(sfe, "cell_area", colGeometryName = "cellSeg")

Cells in the sparse region tend to be larger than those in the dense region. This may be biological or an artifact of the cell segmentation algorithm or both.

Here the nuclei segmentations are plotted instead of cell segmentation. The nuclei are much smaller to the extent that they are difficult to see.

In [None]:
plotSpatialFeature(sfe, "nucleus_area", colGeometryName = "nucSeg")

There's an outlier near the right edge of the section, throwing off the dynamic range of the plot. Upon inspection of the H&E image, the outlier is a bit of tissue debris that doesn't look like a cell. But we can still see that cells in the dense, gland-like regions tend to have larger nuclei. This may be biological, or it may be that nuclei are so densely packed in those regions that they are more likely to be undersegmented, i.e. multiple nuclei are counted as one by the nuclei segmentation program, or both.

These observations motivate an examination of the relationship between cell area and nuclei area:

In [None]:
plotColDataBin2D(sfe, "cell_area", "nucleus_area") + theme_bw() +
    scale_fill_viridis_c()

Again, there are two branches, probably related to cell density and cell type. The nucleus outlier also has large cell area, though it is not as much an outlier in cell area. However, it is a spatial outlier as it's unusually large compared to its neighbors (scroll up two plots back). 

Next we calculate the proportion of cell in this z-plane taken up by the nucleus, and examine the distribution:

In [None]:
colData(sfe)$prop_nuc <- sfe$nucleus_area / sfe$cell_area

In [None]:
plotColDataHistogram(sfe, "prop_nuc") + theme_bw()

This distribution could have been generated from two peaks that were combined. From the histogram, there do not seem to be cells without nuclei or segmentation artifacts where the nucleus is larger than the cell. However, there are so many cells in this dataset and it is possible that just a few cells would not be visible on this histogram. We double check:

In [None]:
# No nucleus
sum(sfe$nucleus_area < 1)
# Nucleus larger than cell
sum(sfe$nucleus_area > sfe$cell_area)

So there are no cells without nuclei or nuclei larger than their cells. Here we plot the nuclei proportion in space:

In [None]:
plotSpatialFeature(sfe, "prop_nuc", colGeometryName = "cellSeg")

Cells in some histological regions have larger proportions occupied by the nuclei. It is interesting to check, controlling for cell type, how cell area, nucleus area, and the proportion of cell occupied by nucleus relate to gene expression. However, a problem in performing such an analysis is that cell segmentation is only available for one z-plane here and these areas also relate to where this z-plane intersects each cell. 

Below we plot a 2D histogram to better show the density of points on this plot:

In [None]:
plotColDataBin2D(sfe, "cell_area", "prop_nuc") + theme_bw() +
    scale_fill_viridis_c()

Smaller cells tend to have higher proportion occupied by the nucleus. This can be related to cell type, or it could be a limitation in how small the nuclei can be in this tissue.

We also examine the relationship between nucleus area and the proportion of cell occupied by the nucleus:

In [None]:
plotColDataBin2D(sfe, "nucleus_area", "prop_nuc") + theme_bw() +
    scale_fill_viridis_c()

The outlier is obvious. There are more cells with both small nuclei and low proportion of area occupied by the nucleus.

## Negative controls
Since there are only a few hundred genes plus negative control probes, all row names of the SFE object can be printed out to find what the negative control probes are called.

In [None]:
rownames(sfe)

According to the Xenium paper [@Janesick2022-rp], there are 3 types of controls:

> 1) probe controls to assess non-specific binding to RNA, 
> 2) decoding controls to assess misassigned genes, and 
> 3) genomic DNA (gDNA) controls to ensure the signal is from RNA.

The paper does not explain in detail how those control probes were designed, nor explain what the blank probes are. But the blank probes can be used as a negative control.

In [None]:
is_blank <- str_detect(rownames(sfe), "^BLANK_")
sum(is_blank)

This should be number 1, the probe control

In [None]:
is_neg <- str_detect(rownames(sfe), "^NegControlProbe")
sum(is_neg)

This should be number 2, the decoding control

In [None]:
is_neg2 <- str_detect(rownames(sfe), "^NegControlCodeword")
sum(is_neg2)

This must be number 3, gDNA control

In [None]:
is_anti <- str_detect(rownames(sfe), "^antisense")
sum(is_anti)

Also make an indicator of whether a feature is any sort of negative control

In [None]:
is_any_neg <- is_blank | is_neg | is_neg2 | is_anti

The `addPerCellQCMetrics()` function in the `scuttle` package can conveniently add transcript counts, proportion of total counts, and number of features detected for any subset of features to the SCE object. Here we do this for the SFE object, as SFE inherits from SCE. 

In [None]:
sfe <- addPerCellQCMetrics(sfe, subsets = list(blank = is_blank,
                                               negProbe = is_neg,
                                               negCodeword = is_neg2,
                                               anti = is_anti,
                                               any_neg = is_any_neg))

In [None]:
names(colData(sfe))
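
To make the per-subset metrics concrete, here is a minimal base R sketch of the three quantities computed for one subset: the count sum, the number of features detected, and the percentage of total counts. The toy matrix, its feature and cell names, and the variable names are made up for illustration; this is not how `scuttle` is implemented. In `colData`, the corresponding columns are named `subsets_<name>_sum`, `subsets_<name>_detected` and `subsets_<name>_percent`.

```r
# Toy count matrix: rows are features, columns are cells (all names are made up)
counts <- matrix(
  c(5, 0, 2, 0,
    0, 0, 1, 0,
    3, 1, 0, 0,
    0, 0, 0, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("GeneA", "GeneB", "BLANK_0001", "NegControlProbe_0001"),
                  paste0("cell", 1:4))
)
is_blank_toy <- grepl("^BLANK_", rownames(counts))

# Total counts per cell across all features
total <- colSums(counts)
# Counts per cell coming from the blank probes only
blank_sum <- colSums(counts[is_blank_toy, , drop = FALSE])
# Number of blank probes with at least one count in each cell
blank_detected <- colSums(counts[is_blank_toy, , drop = FALSE] > 0)
# Percentage of counts from blank probes; 0 / 0 gives NaN for the empty cell
blank_percent <- 100 * blank_sum / total

data.frame(total, blank_sum, blank_detected, blank_percent)
```

The last cell has no counts at all, so its percentage is undefined. This is the same situation as the segmented cells without any detected transcripts in the real data.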

Next we plot the proportion of transcript counts coming from any negative control. 

In [None]:
cols_use <- names(colData(sfe))[str_detect(names(colData(sfe)), "_percent$")]
plotColDataHistogram(sfe, cols_use, bins = 100, ncol = 3) + theme_bw()

The histogram is dominated by the bin at zero and there are some extreme outliers too few to be seen but evident from the scale of the x axis. We also plot the histogram only for cells with at least 1 count from a negative control. The NA's come from cells that got segmented but have no transcripts detected.

In [None]:
plotColDataHistogram(sfe, cols_use, bins = 100, ncol = 3) + 
    scale_x_log10() +
    annotation_logticks(sides = "b") +
    theme_bw()

The vast majority of these cells have less than 1% of transcript counts from negative controls, but there are outliers with up to 50%. 

Next we plot the distribution of the number of negative control features detected per cell:

In [None]:
cols_use2 <- names(colData(sfe))[str_detect(names(colData(sfe)), "_detected$")]
plotColDataHistogram(sfe, cols_use2, bins = 20, ncol = 3) +
    # Avoid decimal breaks on x axis unless there're too few breaks
    scale_x_continuous(breaks = scales::breaks_extended(Q = c(1,2,5))) +
    theme_bw()

The counts are low, mostly zero, but there are outliers with up to 10 counts of all types aggregated. Then the outlier with 50% of counts from negative controls must have very low total real transcript counts to begin with.

The `scuttle` package can detect outliers, but by default it assigns anything above zero as an outlier, since that is over 3 median absolute deviations (MADs) away from the median, which is 0, and the MAD is 0 since the vast majority of cells don't have any negative control count. But it makes sense to allow a small proportion of negative controls. Here we use the distribution just for cells with at least 1 negative control count to find outliers. This distribution has a very long tail and some definite outliers.
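
The effect of a zero MAD can be reproduced with a small base R sketch. The toy percentages and the helper `mad_outlier_high()` are hypothetical and only mimic the idea of a "median plus 3 MADs" threshold; they are not the `scuttle` implementation.

```r
# Toy percentages: most cells have no negative control counts (values are made up)
pct <- c(rep(0, 95), 0.5, 0.6, 0.8, 1, 30)

# Flag values more than `nmads` MADs above the median
mad_outlier_high <- function(x, nmads = 3) {
  x > median(x) + nmads * mad(x)
}

# Using all cells: median and MAD are both 0, so every nonzero value is flagged
naive <- mad_outlier_high(pct)
sum(naive)

# Using only cells with a nonzero value: the threshold becomes meaningful
positive <- pct > 0
restricted <- rep(FALSE, length(pct))
restricted[positive] <- mad_outlier_high(pct[positive])
sum(restricted)
```

In this toy example all five nonzero cells are flagged when every cell is used, while only the extreme value of 30 is flagged when the threshold is computed from the nonzero cells.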

The code below extracts the outliers, based only on cells with at least one negative control count

In [None]:
get_neg_ctrl_outliers <- function(col, sfe) {
    inds <- colData(sfe)$nCounts > 0 & colData(sfe)[[col]] > 0
    df <- colData(sfe)[inds,]
    outlier_inds <- isOutlier(df[[col]], type = "higher")
    outliers <- rownames(df)[outlier_inds]
    col2 <- str_remove(col, "^subsets_")
    col2 <- str_remove(col2, "_percent$")
    new_colname <- paste("is", col2, "outlier", sep = "_")
    colData(sfe)[[new_colname]] <- colnames(sfe) %in% outliers
    sfe
}

In [None]:
cols_use <- names(colData(sfe))[str_detect(names(colData(sfe)), "_percent$")]
for (n in cols_use) {
    sfe <- get_neg_ctrl_outliers(n, sfe)
}

In [None]:
names(colData(sfe))

Below we examine where the outliers are located in space:

In [None]:
plotSpatialFeature(sfe, "is_blank_outlier", colGeometryName = "cellSeg")

The outliers are difficult to see in the spatial plot, so we instead compare the cell area of outliers and non-outliers:

In [None]:
plotColData(sfe, y = "is_blank_outlier", x = "cell_area", 
            # point_fun = function(...) list() # uncomment to omit the points (only in devel version of scater)
            ) 

The analysis reveals that the outliers seem to be smaller. Outliers for negative probe controls and negative codeword controls are also hard to see on the plot, so their plots are skipped here. But the top left region in the tissue tends to have more counts from antisense controls. 

In [None]:
plotSpatialFeature(sfe, "is_anti_outlier", colGeometryName = "cellSeg")

Now that we have identified the outliers, we can remove them along with empty cells before proceeding to further analysis:

In [None]:
inds_keep <- sfe$nCounts > 0 & sfe$nucleus_area < 400 & !sfe$is_anti_outlier &
    !sfe$is_blank_outlier & !sfe$is_negCodeword_outlier & !sfe$is_negProbe_outlier
(sfe <- sfe[,inds_keep])

Over 1000 cells were removed. 

Next we check how many negative control features are detected per cell:

In [None]:
plotColDataHistogram(sfe, cols_use2, bins = 20, ncol = 3) +
    # Avoid decimal breaks on x axis unless there're too few breaks
    scale_x_continuous(breaks = scales::breaks_extended(3, Q = c(1,2,5))) +
    theme_bw()

There are at most 3 counts per cell per type. For the non-outliers, each type is at most around 1%, so this data looks good. 

## Genes
Here we look at the mean and variance of each gene

In [None]:
rowData(sfe)$means <- rowMeans(counts(sfe))
rowData(sfe)$vars <- rowVars(counts(sfe))

Real genes generally have higher mean expression across cells than negative controls.

In [None]:
rowData(sfe)$is_neg <- is_any_neg
plotRowData(sfe, x = "means", y = "is_neg") +
    scale_y_log10() +
    annotation_logticks(sides = "b")

Here the real genes and negative controls are plotted in different colors

In [None]:
plotRowDataBin2D(sfe, "means", "vars", subset = "is_neg", 
                 name_true = "Counts (negative controls)", 
                 name_false = "Counts (real genes)", bins = 50) +
    geom_abline(slope = 1, intercept = 0, color = "red") +
    scale_x_log10() + scale_y_log10() +
    annotation_logticks() +
    coord_equal() +
    theme_bw()

The red line $y = x$ is expected if the data follows a Poisson distribution. Negative controls and real genes form mostly separate clusters. Negative controls stick close to the line, while real genes are overdispersed. Unlike in the [CosMX dataset](https://pachterlab.github.io/voyager/articles/vig4_cosmx.html#genes), the negative controls don't seem overdispersed.
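
As a quick illustration of why the line $y = x$ is the Poisson reference, the simulation below compares the variance-to-mean ratio of Poisson counts with that of overdispersed negative binomial counts. The parameters (`lambda = 2`, `size = 0.5`) are arbitrary choices for this sketch and are not estimated from the Xenium data.

```r
set.seed(29)
n_cells <- 10000

# Poisson counts: the variance equals the mean, so the ratio is close to 1
pois <- rpois(n_cells, lambda = 2)
# Negative binomial counts with the same mean: variance = mu + mu^2 / size
nb <- rnbinom(n_cells, mu = 2, size = 0.5)

c(poisson = var(pois) / mean(pois), negbin = var(nb) / mean(nb))
```

Points for Poisson-like features fall on the line, while overdispersed features lie above it on the mean-variance plot.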

# Spatial autocorrelation of QC metrics

There is a sparse region and a dense region, which raises the question of what type of neighborhood graph to use. For example, it is conceivable that cells in the sparse region should just be singletons. Furthermore, the length scale over which cells influence each other is unclear. It might depend on the cell type and on how contact and secreted signals are used in that cell type. If k nearest neighbors are used, then the neighbors in the dense region are much closer together than those in the sparse region. If distance based neighbors are used, then cells in the dense region will have more neighbors than cells in the sparse region, and the sparse region can break into multiple compartments if the distance cutoff is not long enough. 

For the purpose of demonstration, we use k nearest neighbors with $k = 5$, with inverse distance weighting. Note that using more neighbors leads to longer computation time of spatial autocorrelation metrics.
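
The trade-off between the two kinds of graphs can be seen in a toy example. The base R sketch below builds a k nearest neighbor weight matrix with inverse distance weights and row standardization by hand, assuming that inverse distance weighting means weights proportional to $1/d$ and that style `"W"` means each row sums to 1. The simulated coordinates and the distance cutoff of 0.5 are made up, and this is not the code path used by `findSpatialNeighbors()`.

```r
set.seed(29)
# Toy centroids: 30 cells packed in a unit square, 30 spread over a 10 x 10 area
dense  <- cbind(x = runif(30, 0, 1),  y = runif(30, 0, 1))
sparse <- cbind(x = runif(30, 5, 15), y = runif(30, 0, 10))
coords <- rbind(dense, sparse)
is_dense <- rep(c(TRUE, FALSE), each = 30)

d <- as.matrix(dist(coords))
diag(d) <- Inf  # a cell is not its own neighbor

# k nearest neighbors with inverse distance weights
k <- 5
W <- matrix(0, nrow(d), ncol(d))
for (i in seq_len(nrow(d))) {
  nb <- order(d[i, ])[seq_len(k)]
  W[i, nb] <- 1 / d[i, nb]
}
W <- W / rowSums(W)  # row standardization: each row sums to 1

# With kNN, the mean distance to the neighbors differs sharply between regions
knn_dist <- apply(d, 1, function(r) mean(sort(r)[seq_len(k)]))
tapply(knn_dist, is_dense, mean)

# With a fixed distance cutoff, the number of neighbors differs instead
n_band <- rowSums(d < 0.5)
tapply(n_band, is_dense, mean)
```

Every cell has exactly `k` neighbors in the kNN graph, but those neighbors are much farther away in the sparse region. With the fixed cutoff, many cells in the sparse region end up with few or no neighbors.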

In [None]:
system.time(
    colGraph(sfe, "knn5") <- findSpatialNeighbors(sfe, method = "knearneigh", 
                                                  dist_type = "idw", k = 5, 
                                                  style = "W")
)

In [None]:
sfe <- colDataMoransI(sfe, c("nCounts", "nGenes", "cell_area", "nucleus_area"),
                      colGraphName = "knn5")

In [None]:
colFeatureData(sfe)[c("nCounts", "nGenes", "cell_area", "nucleus_area"),]

Global Moran's I indicates positive spatial autocorrelation. As the strength of spatial autocorrelation can vary spatially, we also run local Moran's I.
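
For readers unfamiliar with these statistics, the following base R sketch computes global and local Moran's I by hand on a chain of 10 locations. The helpers `chain_weights()`, `morans_i()` and `local_morans_i()` are hypothetical names written for this illustration. The local version is scaled by the variance with denominator $n$, which is an assumption about the convention; with row standardized weights this makes the mean of the local values equal the global value.

```r
# Row-standardized weights for a chain of n locations (neighbors are i - 1 and i + 1)
chain_weights <- function(n) {
  W <- matrix(0, n, n)
  for (i in seq_len(n)) {
    nb <- intersect(c(i - 1, i + 1), seq_len(n))
    W[i, nb] <- 1 / length(nb)
  }
  W
}

# Global Moran's I: (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2
morans_i <- function(x, W) {
  z <- x - mean(x)
  length(x) / sum(W) * sum(z * (W %*% z)) / sum(z^2)
}

# Local Moran's I: z_i times the spatially lagged z, divided by the variance
local_morans_i <- function(x, W) {
  z <- x - mean(x)
  as.vector(z * (W %*% z)) / (sum(z^2) / length(x))
}

W <- chain_weights(10)
gradient <- 1:10                  # neighbors are similar
alternating <- rep(c(1, -1), 5)   # neighbors are dissimilar

morans_i(gradient, W)     # about 0.89: positive spatial autocorrelation
morans_i(alternating, W)  # -1: negative spatial autocorrelation
local_morans_i(gradient, W)
```

A smooth gradient gives a strongly positive value, while a checkerboard-like pattern, in which every location differs from its neighbors, gives a negative one.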

In [None]:
sfe <- colDataUnivariate(sfe, type = "localmoran", 
                         features = c("nCounts", "nGenes", "cell_area", 
                                      "nucleus_area"),
                         colGraphName = "knn5", BPPARAM = MulticoreParam(2))

The `pointsize` argument adjusts the point size in `scattermore`. The default is 0, meaning single pixels, but since the cells in the sparse region are hard to see that way, we increase `pointsize`. We still plot the polygons in larger single-panel plots, but to save time we use `scattermore` in multi-panel plots, where the polygons in each panel are too small to be seen anyway.

In [None]:
plotLocalResult(sfe, "localmoran",
                features = c("nCounts", "nGenes", "cell_area", "nucleus_area"),
                colGeometryName = "centroids", scattermore = TRUE,
                divergent = TRUE, diverge_center = 0, pointsize = 1)

Interestingly, nCounts is more homogeneous in the interior of the dense region, while nGenes is more homogeneous by the edge of the dense region. As expected, cell area is more homogeneous in the sparse region. However, the nucleus area is more homogeneous in the interior of the dense region. 

Moran plot for nCounts

In [None]:
sfe <- colDataUnivariate(sfe, "moran.plot", "nCounts", colGraphName = "knn5")

In [None]:
p1 <- moranPlot(sfe, "nCounts", binned = TRUE, plot_influential = FALSE) 
p2 <- moranPlot(sfe, "nCounts", binned = TRUE)
p1 / p2 + plot_layout(guides = "collect") & theme_bw()

There are no obvious clusters here. In the lower panel, the 2D histogram of influential points is plotted in red.

# Moran's I

By default, for gene expression, the log normalized counts are used in spatial autocorrelation metrics, so before running Moran's I, we normalize the data.

In [None]:
sfe <- logNormCounts(sfe)

Use more cores if available to speed this up.

In [None]:
system.time(
    sfe <- runMoransI(sfe, colGraphName = "knn5", BPPARAM = MulticoreParam(2))
)

In [None]:
rowData(sfe)$is_neg <- is_any_neg

In [None]:
plotRowData(sfe, x = "moran_sample01", y = "is_neg")

As expected, the negative controls are generally tightly clustered around 0, while the real genes have positive Moran's I, which suggests that there is generally no spatial trend caused by technical artifacts. No significantly negative Moran's I is observed. Why is negative spatial autocorrelation so rare in gene expression?

What are the two negative controls with a sizable Moran's I?

In [None]:
ord <- order(rowData(sfe)$moran_sample01[is_any_neg], decreasing = TRUE)[1:2]
top_neg <- rownames(sfe)[is_any_neg][ord]
plotSpatialFeature(sfe, top_neg, colGeometryName = "centroids",
                   scattermore = TRUE, pointsize = 1)

There is somewhat of a spatial trend for that antisense probe, with more detected in the upper left. However, this might not significantly affect other results, since this probe has at most 2 counts per cell, which is at most about 1% of all counts in each cell. The negative control codeword has at most 1 count per cell and the cells with this negative control detected seem to be few and far between. 

These are the most detected negative controls, and the most detected one is also the one with the highest Moran's I among negative controls. However, the other negative control with higher Moran's I is not among the most detected.

In [None]:
head(sort(rowData(sfe)$means[is_any_neg], decreasing = TRUE), 15)

What are the genes with the highest Moran's I?

In [None]:
top_moran <- rownames(sfe)[order(rowData(sfe)$moran_sample01, decreasing = TRUE)[1:6]]
plotSpatialFeature(sfe, top_moran, colGeometryName = "centroids",
                   scattermore = TRUE, ncol = 2, pointsize = 0.5)

They all highlight the same histological regions, as in the [CosMX vignette](https://pachterlab.github.io/voyager/articles/vig4_cosmx.html#morans-i). How does Moran's I relate to gene expression level?

In [None]:
plotRowData(sfe, x = "means", y = "moran_sample01")

Very highly expressed genes have higher Moran's I, but there are some less expressed genes with higher Moran's I as well.

# Non-spatial dimension reduction and clustering

Here we run non-spatial PCA as for scRNA-seq data

In [None]:
set.seed(29)
sfe <- runPCA(sfe, ncomponents = 30, scale = TRUE, BSPARAM = IrlbaParam())

In [None]:
ElbowPlot(sfe, ndims = 30) + theme_bw()
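
An elbow plot shows the proportion of variance explained by each principal component. As a minimal sketch of that quantity, the example below uses base R's exact `prcomp()` on simulated data rather than the approximate IRLBA algorithm used above; the toy matrix and its dimensions are made up.

```r
set.seed(29)
# Toy matrix with cells in rows and 5 features in columns (values are simulated)
x <- matrix(rnorm(200 * 5), nrow = 200)
x[, 2] <- x[, 1] + rnorm(200, sd = 0.1)  # make two features strongly correlated

# Center and scale each feature to unit variance before PCA
pca <- prcomp(x, center = TRUE, scale. = TRUE)
# Proportion of total variance explained by each PC, in decreasing order
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
round(var_explained, 3)
```

Because two of the five features are nearly identical, the first PC captures roughly two features' worth of variance and the last PC almost none, which produces the kind of drop-off that an elbow plot is meant to reveal.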

In [None]:
plotDimLoadings(sfe, dims = 1:6) + theme_bw()

In [None]:
spatialReducedDim(sfe, "PCA", 6, colGeometryName = "centroids", divergent = TRUE,
                  diverge_center = 0, ncol = 2, scattermore = TRUE, pointsize = 0.5)

While spatial region is not explicitly used, the PC's highlight spatial regions due to spatial autocorrelation in gene expression and histological regions with different cell types.

In [None]:
set.seed(29)
sfe <- runUMAP(sfe, dimred = "PCA", n_dimred = 15)

Non-spatial clustering and locating the clusters in space

In [None]:
colData(sfe)$cluster <- clusterRows(reducedDim(sfe, "PCA")[,1:15],
                                    BLUSPARAM = SNNGraphParam(
                                        cluster.fun = "leiden",
                                        cluster.args = list(
                                            resolution_parameter = 0.5,
                                            objective_function = "modularity")))

`scater` can now also rasterize plots with many points using the `rasterise` argument, but with a different mechanism from `scattermore` that requires more system dependencies. 

In [None]:
plotPCA(sfe, ncomponents = 4, colour_by = "cluster", rasterise = FALSE)

In [None]:
plotUMAP(sfe, colour_by = "cluster", rasterise = FALSE)

Plot the location of the clusters in space

In [None]:
plotSpatialFeature(sfe, "cluster", colGeometryName = "cellSeg")

# Differential expression
Cluster marker genes are found with Wilcoxon rank sum test as commonly done for scRNA-seq.

In [None]:
markers <- findMarkers(sfe, groups = colData(sfe)$cluster,
                       test.type = "wilcox", pval.type = "all", direction = "up")
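
To give an intuition for `pval.type = "all"` with `direction = "up"`, the base R sketch below runs one-sided Wilcoxon rank sum tests of one cluster against every other cluster and keeps the largest p-value, so a gene only scores well if it is higher than in all other clusters. Taking the maximum p-value is our understanding of how the pairwise results are combined and should be treated as an assumption; `iut_wilcox()` and the toy expression values are hypothetical.

```r
# Largest p-value across one-sided Wilcoxon tests of `target` against every other group
iut_wilcox <- function(x, groups, target) {
  others <- setdiff(unique(groups), target)
  p <- vapply(others, function(g) {
    wilcox.test(x[groups == target], x[groups == g],
                alternative = "greater")$p.value
  }, numeric(1))
  max(p)
}

groups <- rep(c("A", "B", "C"), each = 5)
# Hypothetical gene that is higher in cluster A than in both B and C
marker <- c(5:9, seq(0.1, 0.5, by = 0.1), seq(0.6, 1, by = 0.1))
# Hypothetical gene that is higher in A than in B but lower than in C
not_marker <- c(1:5, seq(0.1, 0.5, by = 0.1), 6:10)

iut_wilcox(marker, groups, "A")
iut_wilcox(not_marker, groups, "A")
```

The first gene gets a small combined p-value, while the second gets a p-value of 1 because it fails the comparison against cluster C.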

It's already sorted by p-values:

In [None]:
markers[[6]]

The code below extracts the top marker gene for each cluster and plots its expression across clusters:

In [None]:
genes_use <- vapply(markers, function(x) rownames(x)[1], FUN.VALUE = character(1))
plotExpression(sfe, genes_use, x = "cluster", 
              # point_fun = function(...) list() # uncomment to omit the points (only in devel version of scater)
              )

This allows for plotting more top marker genes in a heatmap:

In [None]:
genes_use2 <- unique(unlist(lapply(markers, function(x) rownames(x)[1:5])))
plotGroupedHeatmap(sfe, genes_use2, group = "cluster", colour = scales::viridis_pal()(100))

# Local spatial statistics of marker genes

First we plot those genes in space as a reference

In [None]:
plotSpatialFeature(sfe, genes_use, colGeometryName = "centroids", ncol = 3,
                   pointsize = 0.3, scattermore = TRUE)

Global Moran's I of these marker genes is shown below:

In [None]:
setNames(rowData(sfe)[genes_use, "moran_sample01"], genes_use)

All these marker genes have positive spatial autocorrelation, but some stronger than others.

Local Moran's I of these marker genes is shown below:

In [None]:
sfe <- runUnivariate(sfe, "localmoran", features = genes_use, colGraphName = "knn5",
                     BPPARAM = MulticoreParam(2))

In [None]:
plotLocalResult(sfe, type = "localmoran", features = genes_use, 
                colGeometryName = "centroids", ncol = 3, divergent = TRUE,
                diverge_center = 0, scattermore = TRUE, pointsize = 0.3)

It seems that some histological regions tend to be more spatially homogeneous in gene expression than others. The epithelial region tends to be more homogeneous. For some genes, regions with higher expression also have higher local Moran's I, such as FOXA1 and GATA3, while for some genes, this is not the case, such as FGL2 and LUM. 

Finally, we assess local spatial heteroscedasticity (LOSH) for these marker genes to find local heterogeneity:

In [None]:
sfe <- runUnivariate(sfe, "LOSH", features = genes_use, colGraphName = "knn5",
                     BPPARAM = MulticoreParam(2))

In [None]:
plotLocalResult(sfe, type = "LOSH", features = genes_use, 
                colGeometryName = "centroids", ncol = 3, scattermore = TRUE, 
                pointsize = 0.3)

Again, just like in the [CosMX dataset](https://pachterlab.github.io/voyager/articles/vig4_cosmx.html#local-spatial-statistics-of-marker-genes), LOSH is higher where the gene is more highly expressed in some (e.g. CD3E, LUM, TENT5C) but not all cases (e.g. FOXA1, GATA3). This may be due to spatial distribution of different cell types.
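
As a reminder of what LOSH measures, here is a minimal base R sketch on a chain of 20 locations. It assumes the Ord and Getis form of the statistic with exponent $a = 2$: residuals are taken around the spatially weighted local mean, and the locally weighted mean of the squared residuals is divided by their global mean. The helpers `chain_weights()` and `losh()` are hypothetical and written only for this illustration.

```r
# Row-standardized weights for a chain of n locations (neighbors are i - 1 and i + 1)
chain_weights <- function(n) {
  W <- matrix(0, n, n)
  for (i in seq_len(n)) {
    nb <- intersect(c(i - 1, i + 1), seq_len(n))
    W[i, nb] <- 1 / length(nb)
  }
  W
}

# LOSH with exponent a: locally weighted mean of |residual|^a, scaled by its global mean
losh <- function(x, W, a = 2) {
  W <- W / rowSums(W)                    # make sure the weights are row standardized
  local_mean <- as.vector(W %*% x)       # spatially weighted mean of the neighbors
  e_a <- abs(x - local_mean)^a           # residuals around the local mean
  as.vector(W %*% e_a) / mean(e_a)
}

x <- c(rep(1, 10), rep(c(0, 2), 5))  # flat first half, alternating second half
H <- losh(x, chain_weights(20))
round(H, 2)
```

LOSH is 0 in the interior of the flat half and above 1 in the alternating half, where neighboring values differ strongly from their local mean.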

# Session info

In [None]:
sessionInfo()

# References