Shanyr edited this page Dec 11, 2022 · 13 revisions

TIST: Transcriptome and Histopathological Image Integrative Analysis for Spatial Transcriptomics

Introduction

Sequencing-based spatial transcriptomics (ST) is an emerging technique for studying in situ gene expression patterns at whole-genome scale. However, the technique still has several limitations, especially high data dropout rates and local molecular diffusion effects. In addition to the transcriptomic data, it usually generates matched histopathological images of the same tissue sample. The image data have high spatial continuity and resolution, and can provide cellular phenotypic information that complements the noisy ST data.

Here, we propose a novel ST data analysis method called TIST (Transcriptome and histopathological Image integrative analysis for Spatial Transcriptomics), which integrates information from sequencing-based ST data and histopathological images. TIST uses a Markov random field to learn macroscopic cellular features from the histopathological images, and devises a random-walk-based strategy to integrate the extracted image features, the transcriptomic features and the location information for spatial cluster (SC) identification and gene expression enhancement. The workflow of TIST is as follows:

[Figure: TIST workflow]

Basic instructions

System Requirements

  • R version: >= 3.5.0
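The requirement above can be checked from an R session before installing. The following is a minimal sketch; the variable names `min_r_version` and `r_ok` are illustrative and not part of TIST.

```r
# Minimum R version stated in the system requirements.
min_r_version <- "3.5.0"

# getRversion() returns the running R version as a comparable object.
r_ok <- getRversion() >= min_r_version
if (!r_ok) {
  stop("TIST requires R >= ", min_r_version,
       "; found ", as.character(getRversion()))
}
```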

Installation

First, install or update the devtools package by running

install.packages("devtools")

Then TIST can be installed via

library(devtools)
devtools::install_github("ShanYiran/TIST")

Note:

  1. NNLM, a package that TIST depends on, was recently removed from the CRAN repository, so the installation may report an error about it. If so, install a formerly available version of NNLM manually from its archive.
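One possible way to install NNLM from the archive is sketched below. The helper `install_nnlm_if_missing` is hypothetical and not part of TIST. It relies on `devtools::install_version()`, which, when no `version` is given, is expected to pick the most recent release and fall back to the CRAN archive for removed packages; if that does not resolve, an explicit `version` taken from the archive listing can be passed instead.

```r
# Hypothetical helper (not part of TIST): install NNLM from the CRAN archive
# only when it is not already installed.
install_nnlm_if_missing <- function() {
  if (requireNamespace("NNLM", quietly = TRUE)) {
    return(invisible(FALSE))  # already installed, nothing to do
  }
  # Assumption: with no `version` argument, install_version() selects the
  # most recent archived release of a package that was removed from CRAN.
  devtools::install_version("NNLM")
  invisible(TRUE)
}
```

Calling `install_nnlm_if_missing()` before `devtools::install_github("ShanYiran/TIST")` should avoid the dependency error, provided a compiler toolchain is available to build NNLM from source.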

Loading

library(TIST)

Data preparation

TIST is mainly designed for the 10X Genomics platform, and it requires a data folder containing the results generated by the Space Ranger software. In general, the data folder should be organized as follows, which matches the output of Space Ranger:

    /sampleFolder
    ├── filtered_feature_bc_matrix
    │   ├── barcodes.tsv.gz
    │   ├── features.tsv.gz
    │   └── matrix.mtx.gz
    ├── spatial
    │   ├── tissue_hires_image.png
    │   ├── tissue_lowres_image.png
    │   ├── scalefactors_json.json
    │   └── tissue_positions_list.csv
    └── web_summary.html

For other droplet-based platforms, the data folder should be prepared likewise.
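Before running TIST, it can help to confirm that a sample folder matches the layout above. The sketch below uses only base R; `missing_spaceranger_files` is a hypothetical helper, not a TIST function, and it simply mirrors the file list shown in the tree.

```r
# Hypothetical helper (not part of TIST): return the expected files that are
# missing from a Space Ranger output folder.
missing_spaceranger_files <- function(sample_dir) {
  expected <- c(
    file.path("filtered_feature_bc_matrix", "barcodes.tsv.gz"),
    file.path("filtered_feature_bc_matrix", "features.tsv.gz"),
    file.path("filtered_feature_bc_matrix", "matrix.mtx.gz"),
    file.path("spatial", "tissue_hires_image.png"),
    file.path("spatial", "tissue_lowres_image.png"),
    file.path("spatial", "scalefactors_json.json"),
    file.path("spatial", "tissue_positions_list.csv"),
    "web_summary.html"
  )
  # Keep only the relative paths that do not exist under sample_dir.
  expected[!file.exists(file.path(sample_dir, expected))]
}
```

An empty result from `missing_spaceranger_files("./data/V1_Adult_Mouse_Brain")` means every file in the tree was found.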

Quick start

Here, we provide an example dataset of mouse brain from 10X Genomics. Users can download it and run the following scripts to understand the workflow of TIST. When there is no mask file, call the following function to generate one:

generate_mask(imagefile,
              savePath,
              sacle_score,
              spaceFile,
              spot_r_max = 20)

library(TIST)
# Path to the mask file. Without a mask, an all-ones matrix can be used instead. The recommended approach is to generate the mask file with the Python code included in this package.
Maskfile <- "./data/V1_Adult_Mouse_Brain/spatial/Imginit/mask1.txt"
# Path to the image file.
imagefile <- "./data/V1_Adult_Mouse_Brain/spatial/tissue_hires_image.png"
# Path where the results are stored.
savePath <- "./data/V1_Adult_Mouse_Brain/test_package_results/"
# Path to the location file of all spots.
spaceFile <- "./data/V1_Adult_Mouse_Brain/spatial/tissue_positions_list.csv"
# Path to the RNA-seq data folder.
exprPath <- "./data/V1_Adult_Mouse_Brain/filtered_feature_bc_matrix/"
# Scale factor that maps the spot coordinates in spaceFile onto the image.
sacle_score <- 0.17011142

#Run Meta_St_img_unsupervised
Spot_manifest_imgunsup <- Meta_St_img_unsupervised(Maskfile = Maskfile,
                                                     imagefile = imagefile,
                                                     spaceFile = spaceFile,
                                                     exprPath = exprPath,
                                                     colors = NULL,
                                                     savePath = savePath,
                                                     Method = "walktrap",
                                                     sacle_score = sacle_score)
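The `sacle_score` value used above maps spot coordinates onto the image. It is an assumption here, not something stated on this page, that this value corresponds to the `tissue_hires_scalef` field that Space Ranger writes to `scalefactors_json.json` when the high-resolution image is used. Under that assumption, the value could be read from the file rather than typed by hand. `read_scalefactor` below is a hypothetical base-R helper, not a TIST function.

```r
# Hypothetical helper (not part of TIST): extract one numeric field from the
# flat scalefactors_json.json file without extra dependencies.
read_scalefactor <- function(json_file, field = "tissue_hires_scalef") {
  txt <- paste(readLines(json_file, warn = FALSE), collapse = "")
  # Match "field": <number>, capturing the number.
  pattern <- paste0('"', field, '"[[:space:]]*:[[:space:]]*([-+0-9.eE]+)')
  hit <- regmatches(txt, regexec(pattern, txt))[[1]]
  if (length(hit) < 2) stop("Field not found: ", field)
  as.numeric(hit[2])
}
```

For example, `sacle_score <- read_scalefactor("./data/V1_Adult_Mouse_Brain/spatial/scalefactors_json.json")` would replace the hard-coded number, if the assumption about the field holds for your data.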

Running Meta_St_img_unsupervised generates the following files:

  1. Spot_manifest_imgunsup.csv : Cluster labels of each sample.
  2. ST_imgunsupnet.RDS : The multi-feature network integrating histological and transcriptomic features: TIST-net.
  3. McRFlabel.RDS : Initial labels of histopathological images.
  4. expr_obj.RDS : Expression data in Seurat format.
  5. ST_meta_imgunsup_plot.png : Visualized results of clustering.

[Figure: TIST_img1]

We then merge all spots within each cluster to obtain a meta expression matrix, using Meta_expr_matrix.

Mc_manifest <- Meta_expr_matrix(exprPath = exprPath,
                                Spot_manifest = Spot_manifest_imgunsup,
                                imagefile = imagefile,
                                savePath = savePath,
                                merge_method = "mean")
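To illustrate what merging spots with `merge_method = "mean"` could look like, the sketch below averages a toy genes-by-spots matrix within each cluster using base R. It is an illustration under assumed inputs, not TIST's internal implementation; the matrix values, the cluster labels and the names `expr`, `cluster` and `meta_expr` are made up.

```r
# Toy genes x spots expression matrix (values are made up for illustration).
expr <- matrix(c(1, 3, 5, 7,
                 2, 4, 6, 8),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("geneA", "geneB"), paste0("spot", 1:4)))
# Hypothetical cluster label for each spot, in column order.
cluster <- c("SC1", "SC1", "SC2", "SC2")

# rowsum() adds up the spots (rows of t(expr)) belonging to each cluster;
# dividing each cluster row by its spot count gives the per-cluster mean.
sums <- rowsum(t(expr), group = cluster)
meta_expr <- t(sums / as.vector(table(cluster)[rownames(sums)]))
```

Here `meta_expr` is a genes-by-clusters matrix in which each column is the average expression profile of one spatial cluster.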

Running Meta_expr_matrix generates the following files:

  1. Mc_obj.RDS : Integrative meta expression data in Seurat format.
  2. Mc_manifest : TIST-net for downstream analysis.
  3. MC_idents.RDS : Initial labels of histopathological images.
  4. expr_obj.RDS : Expression data in Seurat format.

Downstream analysis can then be performed to identify spatially differentially expressed (SDE) genes. We also provide a comparison between TIST and SPARK, a popular SDE gene detection method.

Spark_methods(exprPath,
              spaceFile,
              savePath)

# Path to the saved gene expression data
SC_obj_file <- paste0(savePath, "expr_obj.RDS")
# Path to the saved clustering results, which show the similarity of expression data within clusters
MC_obj_file <- paste0(savePath, "Mc_obj.RDS")
# Path to the saved TIST-net
netfile <- paste0(savePath, "ST_imgunsupnet.RDS")
SPARK_file <- paste0(savePath, "SPARK.rds")
mc_marker_plot(gl = NULL, MC_obj_file, Mc_manifest, Spot_manifest_imgunsup, savePath)
# Load the expression data in Seurat format
SC_obj <- readRDS(SC_obj_file)
SpaceDiffGene(SC_obj = SC_obj,
              Spot_manifest = Spot_manifest_imgunsup,
              savePath = savePath,
              MC_obj_file = MC_obj_file,
              Mc_manifest = Mc_manifest,
              methods = "walktrap",
              SPARK_file = NULL,
              netfile = netfile)


Running SpaceDiffGene generates the following files:

  1. TIST_markers.csv : Spatially differentially expressed genes identified by TIST in .csv format.
  2. TIST_markers.RDS : Spatially differentially expressed genes identified by TIST in .RDS format.

[Figure: TIST_img2]

Dataset

  1. The 12 10X Genomics datasets were downloaded from the 10X Genomics datasets page: https://www.10xgenomics.com/resources/datasets
  2. The human liver datasets were downloaded from GSA under accession number HRA000437.