Learning to Relative Expression under Batch Effects and Stochastic Noise in Spatial Transcriptomics

This repository is the official implementation of the paper "Learning Relative Gene Expression Trends under Batch Effects and Stochastic Noise in Spatial Transcriptomics".

Requirements

pytorch
doten

Training and evaluation on synthetic data

To train the model(s) in the paper, run this command:

cd toy_example
bash ./scripts/run_comparisons.sh

Training and evaluation on real dataset

Data preparation: Our experiments use the HEST-1k dataset. Before running the experiments, you need to obtain permission to access the dataset.

Execute all experiments

python ./scripts/execute_all_exps.py

Running each step separately

# Download the dataset
# Put the HEST_v1_1_0.csv file into dataset/hest1k/

# Other examples
python ./preprocessing/download_hest_benchmarks.py st_v2
python ./preprocessing/download_hest_benchmarks.py task_1
python ./preprocessing/download_hest_benchmarks.py st_v3

Preprocessing

python ./preprocessing/make_paired.py ./dataset/hest1k/task_1
python ./preprocessing/feature_extraction.py --model_name conch_v1 \
    --save_dir ./dataset/hest1k/task_1/feat/conch_v1 \
    --input_dir ./dataset/hest1k/task_1/paired_data

python ./scripts/export_highly_variable.py \
    --data_dir ./dataset/hest1k/task_1/feat/conch_v1 \
    --output_path ./dataset/hest1k/task_1/opts/comp/highly_variable_genes.txt \
    --ntop_genes 50
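
The last command exports the 50 most variable genes (`--ntop_genes 50`). The README does not say how variability is scored, so the following is only a minimal sketch that assumes plain per-gene variance across spots. `top_variable_genes` and the toy counts are hypothetical and not part of this repository, whose script may use a different criterion.

```python
from statistics import pvariance


def top_variable_genes(counts, gene_names, ntop):
    """Return the ntop gene names with the largest variance across spots.

    counts: list of rows, one per spot, each row holding one value per gene.
    Assumption: variability is scored by population variance; the
    repository's export_highly_variable.py may score genes differently.
    """
    if not counts:
        return []
    # Transpose so each entry holds all spot values of one gene.
    per_gene = list(zip(*counts))
    scored = [(pvariance(values), name) for values, name in zip(per_gene, gene_names)]
    # Sort by variance, highest first; ties keep the original gene order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [name for _, name in scored[:ntop]]


# Toy example: 4 spots x 3 genes (made-up numbers).
toy_counts = [
    [0, 5, 2],
    [10, 5, 3],
    [0, 5, 2],
    [10, 5, 3],
]
selected = top_variable_genes(toy_counts, ["GENE_A", "GENE_B", "GENE_C"], ntop=2)
print(selected)
```

In this toy input GENE_B is constant across spots, so it is the one gene dropped when two are kept.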

Run benchmark

python ./strank/train.py \
        --data_dir ./dataset/hest1k/task_1/feat/conch_v1 \
        --param_path ./dataset/hest1k/task_1/opts/comp/stranklist/opt_param.pt \
        --test_sample_ids NCBI783  \
        --val_sample_ids TENX95 \
        --log_dir ./dataset/hest1k/task_1/opts/comp/logs \
        --loss stranklist \
        --model linear \
        --max_epochs 1000 \
        --use_gene ./dataset/hest1k/task_1/opts/comp/highly_variable_genes.txt \
        --ngpu 1

python ./strank/evaluation.py \
        --data_dir ./dataset/hest1k/task_1/feat/conch_v1 \
        --param_path ./dataset/hest1k/task_1/opts/comp/stranklist/opt_param.pt \
        --sample_ids NCBI783  \
        --model linear \
        --loss stranklist \
        --batch_size 1024 \
        --use_gene ./dataset/hest1k/task_1/opts/comp/highly_variable_genes.txt \
        --output_csv {save_path}
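
Both commands select data by sample ID: training holds out the samples given in `--test_sample_ids` and `--val_sample_ids`, and evaluation reads `--sample_ids`. As a rough illustration of this kind of sample-level hold-out, here is a sketch. `split_by_sample` is a hypothetical helper, the IDs other than NCBI783 and TENX95 are invented placeholders, and the repository's actual split logic may differ.

```python
def split_by_sample(all_ids, test_ids, val_ids):
    """Partition sample IDs into train/val/test with no overlap.

    Hypothetical helper: every sample not listed as test or validation
    is treated as training data.
    """
    test, val = set(test_ids), set(val_ids)
    overlap = test & val
    if overlap:
        raise ValueError(f"samples in both test and val: {sorted(overlap)}")
    unknown = (test | val) - set(all_ids)
    if unknown:
        raise ValueError(f"unknown sample IDs: {sorted(unknown)}")
    # Keep the original order of the remaining samples for training.
    train = [s for s in all_ids if s not in test and s not in val]
    return {"train": train, "val": sorted(val), "test": sorted(test)}


# NCBI783 and TENX95 come from the commands above; the other IDs are placeholders.
samples = ["NCBI783", "NCBI785", "TENX95", "TENX99"]
split = split_by_sample(samples, test_ids=["NCBI783"], val_ids=["TENX95"])
print(split)
```

Splitting by whole sample rather than by spot keeps every spot of a held-out slide out of training, so the test score reflects performance on an unseen sample.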

Results

Our model achieves the following performance on the real HEST-1k dataset:

Table: Results on the real HEST-1k dataset. Bold = best performance, underline = second-best, Ave. = average performance.

|  | Loss | IDC | PRAD | PAAD | COAD | READ | ccRCC | IDC-L | Ave. |
|---|---|---|---|---|---|---|---|---|---|
| Point | MSE | 0.393 | 0.484 | 0.307 | 0.556 | 0.140 | 0.093 | 0.168 | 0.306 |
|  | Po | 0.314 | 0.485 | 0.336 | 0.524 | 0.172 | 0.091 | 0.134 | 0.293 |
|  | NB | 0.199 | 0.491 | 0.119 | 0.538 | 0.160 | 0.075 | 0.126 | 0.244 |
| Pair | Rank | 0.317 | 0.317 | 0.181 | 0.566 | 0.047 | 0.059 | 0.110 | 0.228 |
|  | PairSTrank | 0.494 | 0.458 | 0.346 | 0.613 | 0.136 | 0.127 | 0.228 | 0.343 |
| List | PCC | 0.472 | 0.459 | 0.307 | 0.640 | 0.105 | 0.102 | 0.198 | 0.326 |
|  | ListSTrank | 0.510 | 0.459 | 0.343 | 0.597 | 0.140 | 0.125 | 0.238 | 0.345 |
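
The Ave. column appears to be the unweighted mean of the seven cohort columns. The short check below, with values copied from the table, reproduces it to within rounding. The tolerance of 0.001 is an assumption, on the guess that the published averages were computed from unrounded scores.

```python
# Per-cohort scores copied from the table, in column order:
# IDC, PRAD, PAAD, COAD, READ, ccRCC, IDC-L.
scores = {
    "MSE":        [0.393, 0.484, 0.307, 0.556, 0.140, 0.093, 0.168],
    "Po":         [0.314, 0.485, 0.336, 0.524, 0.172, 0.091, 0.134],
    "NB":         [0.199, 0.491, 0.119, 0.538, 0.160, 0.075, 0.126],
    "Rank":       [0.317, 0.317, 0.181, 0.566, 0.047, 0.059, 0.110],
    "PairSTrank": [0.494, 0.458, 0.346, 0.613, 0.136, 0.127, 0.228],
    "PCC":        [0.472, 0.459, 0.307, 0.640, 0.105, 0.102, 0.198],
    "ListSTrank": [0.510, 0.459, 0.343, 0.597, 0.140, 0.125, 0.238],
}

# Ave. column as reported in the table.
reported_ave = {
    "MSE": 0.306, "Po": 0.293, "NB": 0.244, "Rank": 0.228,
    "PairSTrank": 0.343, "PCC": 0.326, "ListSTrank": 0.345,
}

# Unweighted mean over the seven cohorts for each loss.
recomputed = {loss: sum(vals) / len(vals) for loss, vals in scores.items()}

for loss, value in recomputed.items():
    print(f"{loss:<11} recomputed={value:.4f} reported={reported_ave[loss]:.3f}")

# Largest disagreement between the recomputed and reported averages.
max_gap = max(abs(recomputed[loss] - reported_ave[loss]) for loss in scores)
print(f"largest gap: {max_gap:.4f}")
```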

Acknowledgement

We used the HEST-1k dataset.
The feature extraction code is based on CLAM.

Citation

If you use the code or results in this repository, please cite our paper:

@article{kazuya2025learning,
  title={Learning Relative Gene Expression Trends under Batch Effects and Stochastic Noise in Spatial Transcriptomics},
  author={Kazuya Nishimura and Haruka Hirose and Ryoma Bise and Kaito Shiku and Yasuhiro Kojima},
  journal={NeurIPS},
  year={2025}
}

License

This repository is licensed under the MIT License. See the LICENSE file for more details.
