Spa3D is an algorithm that utilizes spatial pattern enhancement and graph convolutional neural network model to reconstruct 3D-based spatial structure from multiple spatially resolved transcriptomics (SRT) slides with 3D spatial coordinates.
Paper: 3D reconstruction of spatial transcriptomics with spatial pattern enhanced graph convolutional neural network
Keywords: Spatial transcriptomics, 3D reconstruction algorithm, graph convolutional network, spatial patten enhancement.
Below is an example of the Spa3D for the Human embryonic heart dataset data.
Inserting these data into a 3D human heart model can result in:
3.1. Download the package
The package can be downloaded by running the following command in the terminal:
git clone https://github.com/Lin-Xu-lab/Spa3D.git
Then, use
cd Spa3D
to access the downloaded folder.
If the "git clone" command does not work with your operation system, you can download the zip file from the website https://github.com/Lin-Xu-lab/Spa3D.git and decompress it. Then, the folder that you need to access is Spa3D-main.
Note, please see the SPE folder for the processes of spatial pattern enhancement and 3D data assembly.
3.2. Environment setup
The package has been successuflly tested in a Linux environment of python version 3.8.8, pandas version 1.3.4, and g++ version 11.2.0. An option to set up the environment is to use Conda (https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html).
You can use the following command to create an environment for Spa3D:
conda create -n myenv_Spa3D python=3.8.8 pandas=1.3.4 numpy==1.20.1
pip install numba==0.53.1 python-igraph torch louvain scipy scanpy anndata natsort sklearn
When you install the package, if we find a package is not available or is not consistent with your system, my suggestion is to use the nearest version that you could find. For example, if numpy==1.20.1 does not work with your system and the system messages tells you the earlist version that you could install is numpy==1.21.0, please go ahead to install it and see if you could obtain the same result with the notebook example below. I have tested numpy==1.21.0 using my workstation and it works.
After the environment is created, you can use the following command to activate it:
conda activate myenv_Spa3D
Please install Jupyter Notebook from https://jupyter.org/install. For example, you can run
pip install notebook
in the terminal to install the classic Jupyter Notebook.
3.3. Install Spa3D
To install Spa3D on your computer, please run
python setup.py install --user &> log
in the terminal.
After doing these successfully, you are supposed to be able to import Spa3D when you are using Python or Jupyter Notebook:
import Spa3D
Another option is to directly use
import __init__ as Spa3D
to import Spa3D in the current folder. This way is suggested if you would like to development your own method based on this Spa3D software.
Most of the parameters have a default value. There are two parameters that I think are the most important ones:
num_cluster: The number of clusters.
num_neighbors: The number of nearest neighbors for GCN clustering.
There are also some other parameters:
p: This parameter adjusts the weights of the current cell and its nearest neighbors when searching for the decal value, when p = 0.5, it means their weights are equal.
start: This parameter determines the lowest value of the searching range for the decay value.
end: This parameter determines the highest value of the searching range for the decay value.
For introductions about other parameters, please see the *.py files.
5.1. Spa3D example
There are one Spa3D examples in the "Spa3D-example" folder.
cd Spa3D-example
Please use jupyter notebook to open the spagcn_heart3D_week6_20230510.ipynb for the 3D human heart example. The input data is a 3D data with spatial pattern enhancement.
5.2. Preprocessing example
There is one preprocessing example in the "Spa3D-preprocessing-example" folder.
cd preprocessing-example
This folder also includes the MPI code for spatial pattern enhancement.
Please use Jupyter notebook to open the preprocessing_alignment.ipynb for the Human Heart example.
5.3. The notebook script for Spa3D example
Below is the notebook script for the Spa3D example in 5.1. First, please decompress the data.zip file and obtain the data folder. Then, please type
python3 -m notebook &
to open the Jupyter Notebook. Left click the Spa3D_heart_example.ipynb file to open it.
Run the code below to import packages:
import sys, os, csv, re, math, cv2, random, torch
import pandas as pd
import anndata as ad
import numpy as np
import scanpy as sc
import numba
import louvain
import scipy
import natsort
import sklearn
import igraph
from scipy.sparse import issparse
from sklearn.metrics.cluster import adjusted_rand_score
import matplotlib.colors as clr
import matplotlib.pyplot as plt
Run the code below to import Spa3D:
import __init__ as Spa3D
Run the code below to print the version of packages:
print("Spa3D version: " + Spa3D.__version__)
print("python version: " + sys.version)
print("pandas version: " + pd.__version__)
print("numpy version: " + np.__version__)
print("numba version: " + numba.__version__)
print("python-igraph: " + igraph.__version__)
print("torch version: " + torch.__version__)
print("louvain version: " + louvain.__version__)
print("scipy version: " + scipy.__version__)
print("scanpy version: " + sc.__version__)
print("anndata version: " + ad.__version__)
print("natsort version: " + natsort.__version__)
print("sklearn version: " + sklearn.__version__)
The output is:
Load data and apply simple processings such as norm and log.
adata = sc.read("/home/chentang/data/humanHeart/adata3D_week6_part_aligned_lfft_filter.h5ad")
adata.var_names_make_unique()
Spa3D.preprocessing_filter_by_gene_number(adata,min_cells=3)
sc.pp.normalize_per_cell(adata)
sc.pp.log1p(adata)
Create x_array, y_array, z_array, x_pixel, y_pixel, z_pixel, and set up the paramters:
x_array = adata.obs["x_array"].tolist()
y_array = adata.obs["y_array"].tolist()
z_array = adata.obs["z_array"].tolist()
x_pixel = adata.obs["x_pixel_aligned"].tolist()
y_pixel = adata.obs["y_pixel_aligned"].tolist()
z_pixel = adata.obs["z_pixel"].tolist()
p = 0.5
start = 0.01
end = 200
num_clusters = 5
num_neighbors = 20
r_seed = t_seed = n_seed = 100
Calculate the adjacent matrix, decay_value, and the resolution:
adj3d = Spa3D.calculate_adj_matrix_3D(x = x_pixel, y = y_pixel, m = z_pixel)
decay_value = Spa3D.search_decay_value_3D(adj3d, p = p, start = start, end = end)
resolution = Spa3D.search_proper_resolution_3D(adata, adj3d, decay_value, num_clusters = num_clusters, num_neighbors = num_neighbors, r_seed = r_seed, t_seed = t_seed, n_seed = n_seed)
The output is:
GCN clustering
random.seed(r_seed)
torch.manual_seed(t_seed)
np.random.seed(n_seed)
clf = Spa3D.Spa3D()
clf.set_l(decay_value)
clf.train(adata, adj3d, init_spa = True,init = "louvain",res = resolution, tol=5e-16, lr=0.05, num_neighbors=num_neighbors, max_epochs=2000)
y_pred, prob, pca=clf.predict()
adata.obs["pred"]= y_pred
adata.obs["pred"]=adata.obs["pred"].astype('category')
The output is:
Plot the Spa3D Figure:
plot_color=["#F56867","#7495D3","#59BE86","#FEB915","#997273"]
x_array = adata.obs["x_array"]
y_array = adata.obs["y_array"]
z_array = 12 - adata.obs["z_array"]
z_array_cluster = set(z_array.to_list())
x_pixel = adata.obs["x_pixel_aligned"]
y_pixel = adata.obs["y_pixel_aligned"]
z_pixel = adata.obs["z_pixel"]
fig = plt.figure(figsize=(16, 16))
ax = fig.add_subplot(projection='3d')
ax.grid(False)
dict_x_pixel = {}
dict_y_pixel = {}
dict_z_array = {}
dict_z_pixel = {}
pred = adata.obs["pred"].to_numpy()
cluster = set(pred)
#plt.axis('off')
ax.set_xlim3d(0, 1000)
for j in z_array_cluster:
i = 0
for id in cluster:
dict_x_pixel[id] = x_pixel[(pred == id) & (z_array == j)].tolist()
dict_y_pixel[id] = y_pixel[(pred == id) & (z_array == j)].tolist()
dict_z_pixel[id] = z_pixel[(pred == id) & (z_array == j)].tolist()
# ax.axes.invert_yaxis()
ax.scatter(dict_z_pixel[id], dict_x_pixel[id],dict_y_pixel[id], c=plot_color[i], linewidths = 5.0)
i = i + 1
Plot the Spa3D Figure that can show each individual slide:
plot_color=["#F56867","#7495D3","#59BE86","#FEB915","#997273"]
x_array = adata.obs["x_array"]
y_array = adata.obs["y_array"]
z_array = 12 - adata.obs["z_array"]
z_array_cluster = set(z_array.to_list())
x_pixel = adata.obs["x_pixel_aligned"]
y_pixel = adata.obs["y_pixel_aligned"]
z_pixel = adata.obs["z_pixel"]
fig = plt.figure(figsize=(8, 8))
ax = fig.add_subplot(projection='3d')
ax.grid(False)
dict_x_pixel = {}
dict_y_pixel = {}
dict_z_array = {}
pred = adata.obs["pred"].to_numpy()
cluster = set(pred)
plt.axis('off')
for j in z_array_cluster:
i = 0
for id in cluster:
dict_x_pixel[id] = x_pixel[(pred == id) & (z_array == j)].tolist()
dict_y_pixel[id] = y_pixel[(pred == id) & (z_array == j)].tolist()
dict_z_array[id] = z_array[(pred == id) & (z_array == j)].tolist()
# ax.axes.invert_yaxis()
ax.scatter(dict_z_array[id], dict_x_pixel[id],dict_y_pixel[id], c=plot_color[i], linewidths = 2.0)
i = i + 1
The output is:
Calculate Spa3D ARI:
adata.obs["annotation2"] = adata.obs["annotation"]
domains="annotation2"
b = adata.obs[domains].to_numpy()
n = len(b)
for i in range(n):
if b[i] == 0:
b[i] = 0
elif b[i] == 1:
b[i] = 1
elif b[i] == 2:
b[i] = 1
elif b[i] == 3:
b[i] = 1
elif b[i] == 4:
b[i] = 2
elif b[i] == 5:
b[i] = 3
elif b[i] == 6:
b[i] = 3
elif b[i] == 7:
b[i] = 3
elif b[i] == 8:
b[i] = 4
elif b[i] == 9:
b[i] = 0
else:
b[i] = 0
b = b.astype(int)
adata.obs[domains] = b
domains1="pred"
domains2="annotation2"
a = adata.obs[domains1].to_numpy()
b = adata.obs[domains2].to_numpy()
n = len(b)
a = a.astype(float)
b = b.astype(float)
b[np.isnan(b)] = 0
ARI = adjusted_rand_score(a, b)
print("ARI =", ARI)
The output is:
Plot the slide ID 9 from Spa3D:
adata_sub = adata[adata.obs['z_array'].isin([9]),:]
#Set colors used
plot_color=["#F56867","#7495D3","#59BE86","#FEB915","#997273"]
#Plot spatial domains
domains="pred"
num_celltype=len(adata_sub.obs[domains].unique())
adata_sub.uns[domains+"_colors"]=list(plot_color[:num_celltype])
ax=sc.pl.scatter(adata_sub,alpha=1,y="y_pixel_aligned",x="x_pixel_aligned",color=domains,title="Spa3D",color_map=plot_color,show=False,size=100000/adata_sub.shape[0])
ax.set_aspect('equal', 'box')
#ax.axes.invert_yaxis()
#plt.savefig(pred_pngfilename, dpi=600)
#plt.close()
plot_color=["#F56867","#7495D3","#FEB915","#59BE86","#997273"]
#Plot refined spatial domains
domains="annotation2"
adata_sub.obs[domains] = adata_sub.obs[domains].astype(str)
num_celltype=len(adata_sub.obs[domains].unique())
adata_sub.uns[domains+"_colors"]=list(plot_color[:num_celltype])
ax=sc.pl.scatter(adata_sub,alpha=1,y="y_pixel_aligned",x="x_pixel_aligned",color=domains,title="annotation",color_map=plot_color,show=False,size=100000/adata_sub.shape[0])
ax.set_aspect('equal', 'box')
#ax.axes.invert_yaxis()
#plt.savefig(pred_pngfilename, dpi=600)
#plt.close()
domains1="pred"
domains2="annotation2"
a = adata_sub.obs[domains1].to_numpy()
b = adata_sub.obs[domains2].to_numpy()
n = len(b)
a = a.astype(float)
b = b.astype(float)
b[np.isnan(b)] = 0
ARI = adjusted_rand_score(a, b)
print("ARI =", ARI)
The output is:
Plot the slide ID 10 from Spa3D::
adata_sub = adata[adata.obs['z_array'].isin([10]),:]
#Set colors used
plot_color=["#F56867","#7495D3","#59BE86","#FEB915","#997273"]
#Plot spatial domains
domains="pred"
num_celltype=len(adata_sub.obs[domains].unique())
adata_sub.uns[domains+"_colors"]=list(plot_color[:num_celltype])
ax=sc.pl.scatter(adata_sub,alpha=1,y="y_pixel_aligned",x="x_pixel_aligned",color=domains,title="Spa3D",color_map=plot_color,show=False,size=100000/adata_sub.shape[0])
ax.set_aspect('equal', 'box')
#ax.axes.invert_yaxis()
#plt.savefig(pred_pngfilename, dpi=600)
#plt.close()
plot_color=["#F56867","#7495D3","#FEB915","#59BE86","#997273"]
#Plot refined spatial domains
domains="annotation2"
adata_sub.obs[domains] = adata_sub.obs[domains].astype(str)
num_celltype=len(adata_sub.obs[domains].unique())
adata_sub.uns[domains+"_colors"]=list(plot_color[:num_celltype])
ax=sc.pl.scatter(adata_sub,alpha=1,y="y_pixel_aligned",x="x_pixel_aligned",color=domains,title="annotation",color_map=plot_color,show=False,size=100000/adata_sub.shape[0])
ax.set_aspect('equal', 'box')
#ax.axes.invert_yaxis()
#plt.savefig(pred_pngfilename, dpi=600)
#plt.close()
domains1="pred"
domains2="annotation2"
a = adata_sub.obs[domains1].to_numpy()
b = adata_sub.obs[domains2].to_numpy()
n = len(b)
a = a.astype(float)
b = b.astype(float)
b[np.isnan(b)] = 0
ARI = adjusted_rand_score(a, b)
print("ARI =", ARI)
The output is:
Plot the slide ID 11 from Spa3D::
adata_sub = adata[adata.obs['z_array'].isin([11]),:]
#Set colors used
plot_color=["#F56867","#7495D3","#59BE86","#FEB915","#997273"]
#Plot spatial domains
domains="pred"
num_celltype=len(adata_sub.obs[domains].unique())
adata_sub.uns[domains+"_colors"]=list(plot_color[:num_celltype])
ax=sc.pl.scatter(adata_sub,alpha=1,y="y_pixel_aligned",x="x_pixel_aligned",color=domains,title="Spa3D",color_map=plot_color,show=False,size=100000/adata_sub.shape[0])
ax.set_aspect('equal', 'box')
#ax.axes.invert_yaxis()
#plt.savefig(pred_pngfilename, dpi=600)
#plt.close()
plot_color=["#F56867","#7495D3","#FEB915","#59BE86","#997273"]
#Plot refined spatial domains
domains="annotation2"
adata_sub.obs[domains] = adata_sub.obs[domains].astype(str)
num_celltype=len(adata_sub.obs[domains].unique())
adata_sub.uns[domains+"_colors"]=list(plot_color[:num_celltype])
ax=sc.pl.scatter(adata_sub,alpha=1,y="y_pixel_aligned",x="x_pixel_aligned",color=domains,title="annotation",color_map=plot_color,show=False,size=100000/adata_sub.shape[0])
ax.set_aspect('equal', 'box')
#ax.axes.invert_yaxis()
#plt.savefig(pred_pngfilename, dpi=600)
#plt.close()
domains1="pred"
domains2="annotation2"
a = adata_sub.obs[domains1].to_numpy()
b = adata_sub.obs[domains2].to_numpy()
n = len(b)
a = a.astype(float)
b = b.astype(float)
b[np.isnan(b)] = 0
ARI = adjusted_rand_score(a, b)
print("ARI =", ARI)
The output is:
Plot the slide ID 12 from Spa3D::
adata_sub = adata[adata.obs['z_array'].isin([12]),:]
#Set colors used
plot_color=["#F56867","#7495D3","#59BE86","#FEB915","#997273"]
#Plot spatial domains
domains="pred"
num_celltype=len(adata_sub.obs[domains].unique())
adata_sub.uns[domains+"_colors"]=list(plot_color[:num_celltype])
ax=sc.pl.scatter(adata_sub,alpha=1,y="y_pixel_aligned",x="x_pixel_aligned",color=domains,title="Spa3D",color_map=plot_color,show=False,size=100000/adata_sub.shape[0])
ax.set_aspect('equal', 'box')
#ax.axes.invert_yaxis()
#plt.savefig(pred_pngfilename, dpi=600)
#plt.close()
plot_color=["#F56867","#7495D3","#FEB915","#59BE86","#997273"]
#Plot refined spatial domains
domains="annotation2"
adata_sub.obs[domains] = adata_sub.obs[domains].astype(str)
num_celltype=len(adata_sub.obs[domains].unique())
adata_sub.uns[domains+"_colors"]=list(plot_color[:num_celltype])
ax=sc.pl.scatter(adata_sub,alpha=1,y="y_pixel_aligned",x="x_pixel_aligned",color=domains,title="annotation",color_map=plot_color,show=False,size=100000/adata_sub.shape[0])
ax.set_aspect('equal', 'box')
#ax.axes.invert_yaxis()
#plt.savefig(pred_pngfilename, dpi=600)
#plt.close()
domains1="pred"
domains2="annotation2"
a = adata_sub.obs[domains1].to_numpy()
b = adata_sub.obs[domains2].to_numpy()
n = len(b)
a = a.astype(float)
b = b.astype(float)
b[np.isnan(b)] = 0
ARI = adjusted_rand_score(a, b)
print("ARI =", ARI)
The output is:
Please contact our team if you have any questions:
Chen Tang (Chen.Tang@UTSouthwestern.edu)
Lei Dong (Lei.Dong@UTSouthwestern.edu)
Xue Xiao (Xiao.Xue@UTSouthwestern.edu)
Yuansheng Zhou (Yuansheng.Zhou@UTSouthwestern.edu)
Lin Xu (Lin.Xu@UTSouthwestern.edu)
Please contact Chen Tang for questions about the code and the README file.
Please see the LICENSE file for the copyright information of the Spa3D software.
Notice: This Spa3D software is adapted from the SpaGCN code (https://github.com/jianhuupenn/SpaGCN). The implementation of the SpaGCN software is described in the publication "SpaGCN: Integrating gene expression, spatial location and histology to identify spatial domains and spatially variable genes by graph convolutional network" (https://doi.org/10.1038/s41592-021-01255-8). Please also refer to the LICENSE file on SpaGCN's GitHub website.