Welcome to RETROFIT, a Bioconductor package for reference-free learning of cell-type composition and cell-type-specific gene expression in spatial transcriptomics (ST).
If you find this R package or any part of this repository useful for your work, please kindly cite the following research article:
Roopali Singh, Xi He, Adam Keebum Park, Ross Cameron Hardison, Xiang Zhu, Qunhua Li. RETROFIT: Reference-free deconvolution of cell-type mixtures in spatial transcriptomics. bioRxiv (2023) https://doi.org/10.1101/2023.06.07.544126.
Correspondence should be addressed to X.Z. (xiangzhu[at]psu.edu) and Q.L. (qunhua.li[at]psu.edu).
ST profiles gene expression in intact tissues. However, ST data measured at each spatial location may represent gene expression of multiple cell types, making it difficult to identify cell-type-specific transcriptional variation across spatial contexts. Existing cell-type deconvolution methods for ST data often require single-cell transcriptomic references, which can be limited by the availability, completeness and platform effects of such references. We present RETROFIT, a reference-free Bayesian method that produces sparse and interpretable solutions to deconvolve the cell types underlying each location without relying on single-cell transcriptomic references.
Built on a Bayesian hierarchical model, RETROFIT deconvolves the ST data matrix (
The following figure shows the method schematic of RETROFIT. First, RETROFIT takes an ST data matrix as the only input and decomposes this matrix into latent components in an unsupervised manner, without using any single-cell transcriptomic information. Second, RETROFIT matches these latent components to known cell types using either a cell-type-specific gene expression reference or a list of cell-type-specific marker genes for the cell types present in the ST sample. Lastly, RETROFIT outputs a cell-type-specific gene expression matrix and a cell-type proportion matrix.
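To make the second step more concrete, here is a minimal base-R sketch of correlation-based matching on toy data. It does not call any retrofit function, and the greedy assignment rule is an assumption made for illustration; RETROFIT's own matching procedure may differ. All object names (`ref`, `W`, `assigned`, `W_matched`) and the simulated values are hypothetical.

```r
## Toy illustration only: base R, no retrofit functions are called.
set.seed(1)
G <- 50  # number of genes
K <- 3   # number of known cell types
L <- 6   # number of latent components (L > K)

## Hypothetical reference: genes x cell types
ref <- matrix(rgamma(G * K, shape = 2), nrow = G,
              dimnames = list(paste0("gene", 1:G), paste0("type", 1:K)))

## Hypothetical latent components: the first K are noisy copies of the
## reference columns, the remaining L - K are unrelated noise.
W <- cbind(ref + matrix(rnorm(G * K, sd = 0.1), nrow = G),
           matrix(rgamma(G * (L - K), shape = 2), nrow = G))
colnames(W) <- paste0("comp", 1:L)

## Correlate every cell type (rows) with every component (columns)
cors <- cor(ref, W)

## Greedily give each cell type its best-correlated unused component
## (an assumed, simplified rule; not necessarily what RETROFIT does)
assigned <- integer(K)
for (k in seq_len(K)) {
  cors_k <- cors[k, ]
  cors_k[assigned[seq_len(k - 1)]] <- -Inf  # mask components already used
  assigned[k] <- which.max(cors_k)
}
names(assigned) <- colnames(ref)

## Keep only the matched components and label them by cell type
W_matched <- W[, assigned, drop = FALSE]
colnames(W_matched) <- colnames(ref)
```

In this toy setup each cell type is paired with the component that was simulated from it, and the unrelated components are discarded. This mirrors, in simplified form, how L latent components are reduced to K annotated cell types.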
To install retrofit from Bioconductor, please start R (version "4.3") and enter:
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("retrofit")
Alternatively, please follow these steps to install retrofit from GitHub:
install.packages("devtools")
devtools::install_github("qunhualilab/retrofit")
Please follow the code below to run RETROFIT on your ST data.
library(retrofit)
## load the built-in demo data
utils::data(testSimulationData)
x <- testSimulationData$extra5_x
sc_ref <- testSimulationData$sc_ref
marker_ref <- testSimulationData$marker_ref
## decompose the ST data matrix into latent components
res <- retrofit::decompose(x, L=16, iterations=100, verbose=TRUE)
W <- res$w
H <- res$h
TH <- res$th
## match the latent components to known cell types using a cell-type-specific gene expression reference
res <- retrofit::annotateWithCorrelations(sc_ref, K=8, decomp_w=W, decomp_h=H)
W_annotated <- res$w
H_annotated <- res$h
cells <- res$ranked_cells
## match the latent components to known cell types using a list of cell-type-specific marker genes
res <- retrofit::annotateWithMarkers(marker_ref, K=8, decomp_w=W, decomp_h=H)
W_annotated <- res$w
H_annotated <- res$h
cells <- res$ranked_cells
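As a possible follow-up, the sketch below shows one way to turn an annotated H-like matrix into per-spot cell-type proportions and to read off the dominant cell type at each spot. It uses a small made-up matrix (`H_toy`) rather than real output, and it assumes that the matrix is non-negative with cell types in rows and spots in columns. The package may already provide normalized proportions, so please check the function documentation before relying on this.

```r
## Toy stand-in for an annotated H matrix (cell types x spots).
## Values and names are made up; matrix() fills column by column,
## so each triple below is one spot.
H_toy <- matrix(c(2, 1, 1,   # spot1
                  0, 3, 1,   # spot2
                  1, 1, 6),  # spot3
                nrow = 3,
                dimnames = list(c("typeA", "typeB", "typeC"),
                                c("spot1", "spot2", "spot3")))

## Rescale each spot (column) so that its cell-type weights sum to 1
H_prop <- sweep(H_toy, 2, colSums(H_toy), "/")

## Dominant cell type at each spot
dominant <- rownames(H_prop)[apply(H_prop, 2, which.max)]
names(dominant) <- colnames(H_prop)
```

With real data, `H_toy` would be replaced by `H_annotated` from the code above, provided its layout matches the assumption stated here.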
Here we provide two vignettes illustrating how to use the RETROFIT R package.
The first and simpler vignette (RMD file; HTML file) aims to help users get started with RETROFIT and understand its basic usage on a simulated ST dataset. Running the code in this vignette will give users an overall picture of what RETROFIT can do.
The second and slightly more advanced vignette (RMD file; HTML file) aims to showcase how to use RETROFIT in a real-world ST study. This vignette uses ST data from a human fetal intestine sample, generated on the 10x Genomics Visium Spatial Gene Expression platform by Fawkner-Corbett et al. (Cell 2021).