QuIIL/HEXST


HEXST: Hexagonal Shifted-Window Transformer for Spatial Transcriptomics Gene Expression Prediction


Official implementation of HEXST, a geometry-aligned Transformer for spatial gene expression prediction from histology.


Given spot-level image features and spatial coordinates, HEXST predicts spot-wise gene expression.

The core modules are:

  • model/HEXST.py: main HEXST architecture
  • model/hex.py: hexagonal coordinate conversion, window construction, slot packing
  • model/pos_embed.py: HexRoPE implementation
  • utils/loss.py: MSE, Pearson loss, deviation-matching loss, and feature-alignment loss
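
As a rough illustration of what "hexagonal coordinate conversion" can involve, the sketch below converts offset coordinates to axial hexagonal coordinates and measures distances on the hexagonal grid. It assumes a doubled-width offset layout, where col and row always share the same parity. This is not the code in model/hex.py, and the function names are hypothetical.

```python
# Assumption: spots use a doubled-width offset layout, i.e. (col - row) is even.
# Illustrative only; model/hex.py may use a different convention.

AXIAL_NEIGHBORS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]


def doubled_to_axial(col, row):
    """Convert doubled-width offset coordinates to axial (q, r)."""
    if (col - row) % 2 != 0:
        raise ValueError("col and row must have the same parity")
    return (col - row) // 2, row


def hex_distance(a, b):
    """Number of hexagonal steps between two axial coordinates."""
    dq = a[0] - b[0]
    dr = a[1] - b[1]
    return (abs(dq) + abs(dr) + abs(dq + dr)) // 2


def neighbors(q, r):
    """The six axial coordinates adjacent to (q, r)."""
    return [(q + dq, r + dr) for dq, dr in AXIAL_NEIGHBORS]


origin = doubled_to_axial(0, 0)
far = doubled_to_axial(3, 1)
print(origin, far, hex_distance(origin, far))
```

In axial coordinates every spot has exactly six neighbors at distance 1, which is the property a hexagonal window needs.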

Installation

The spatial gene expression prediction experiments in the paper were run with Python 3.10 and PyTorch 2.6.

Create a conda environment:

conda create -n hexst python=3.10 -y
conda activate hexst

Install PyTorch according to your CUDA environment, then install the remaining dependencies:

pip install torch torchvision
pip install numpy scipy scikit-learn pandas h5py pyyaml pillow tqdm
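
To confirm the environment before training, a small check such as the following can report which of the packages above are not importable. The import names differ from the pip names for scikit-learn (sklearn), pyyaml (yaml), and pillow (PIL). The helper name find_missing is hypothetical and not part of this repository.

```python
from importlib.util import find_spec

# Import names for the packages installed above.
REQUIRED_MODULES = [
    "torch", "torchvision", "numpy", "scipy", "sklearn",
    "pandas", "h5py", "yaml", "PIL", "tqdm",
]


def find_missing(modules):
    """Return the top-level module names that cannot be found."""
    return [name for name in modules if find_spec(name) is None]


missing = find_missing(REQUIRED_MODULES)
print("missing:", ", ".join(missing) if missing else "none")
```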

Data Preparation

The training and evaluation scripts expect pre-cached .pt files.

  • dataloader/data_cacheing.py contains project-specific absolute paths and needs to be edited before use.
  • The current implementation assumes pre-extracted UNI image features and scFoundation transcriptomic embeddings.

Each cached file should contain the following structure:

{
    "train": {
        "UNI_feats": Tensor[N_train, 1024],
        "scFM_feats": Tensor[N_train, 3072],
        "metadata": [
            [img_path, spot_id, split, slide_id, gene_expression, coords],
            ...
        ],
    },
    "val": {
        "UNI_feats": Tensor[N_val, 1024],
        "scFM_feats": Tensor[N_val, 3072],
        "metadata": [...],
    },
    "test": {
        "UNI_feats": Tensor[N_test, 1024],
        "scFM_feats": Tensor[N_test, 3072],
        "metadata": [...],
    },
    "num_genes": 128
}
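
A quick sanity check of a cache file against this layout might look like the sketch below. It only inspects keys and `.shape`, so it runs here with a stand-in object; with a real cache you would pass the dict returned by `torch.load`. The function name validate_cache is hypothetical, and the check that each feature tensor has one row per metadata entry is an assumption drawn from the structure above.

```python
from collections import namedtuple

EXPECTED_DIMS = {"UNI_feats": 1024, "scFM_feats": 3072}
SPLITS = ("train", "val", "test")


def validate_cache(cache):
    """Return a list of problems; an empty list means the layout matches."""
    problems = []
    if "num_genes" not in cache:
        problems.append("missing key: num_genes")
    for split in SPLITS:
        if split not in cache:
            problems.append(f"missing split: {split}")
            continue
        block = cache[split]
        if "metadata" not in block:
            problems.append(f"{split}: missing metadata")
            continue
        n_rows = len(block["metadata"])
        for key, dim in EXPECTED_DIMS.items():
            if key not in block:
                problems.append(f"{split}: missing {key}")
                continue
            shape = tuple(block[key].shape)
            if shape != (n_rows, dim):
                problems.append(
                    f"{split}: {key} has shape {shape}, expected {(n_rows, dim)}"
                )
    return problems


# Stand-in for a tensor so the sketch runs without PyTorch.
FakeTensor = namedtuple("FakeTensor", ["shape"])


def make_split(n):
    """Build a toy split with n spots and placeholder metadata rows."""
    row = ["img.png", "spot_0", "train", "slide_A", [0.0] * 128, (0, 0)]
    return {
        "UNI_feats": FakeTensor((n, 1024)),
        "scFM_feats": FakeTensor((n, 3072)),
        "metadata": [row] * n,
    }


demo = {"train": make_split(3), "val": make_split(2),
        "test": make_split(2), "num_genes": 128}
print(validate_cache(demo))
```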

The current dataloader reads the following fields:

  • UNI_feats: spot-level pathology foundation model features
  • scFM_feats: transcriptomic embeddings used for feature alignment
  • metadata[i][3]: slide ID
  • metadata[i][4]: target gene expression vector
  • metadata[i][5]: spot coordinate
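
Because the slide ID, expression vector, and coordinate sit at fixed positions in each metadata row, grouping spots by slide is a matter of indexing. The sketch below uses toy rows and a hypothetical helper name; it is not the repository's dataloader. The row index `i` is kept on the assumption that it also indexes the matching rows of UNI_feats and scFM_feats.

```python
from collections import defaultdict

# Positions within a metadata row, as listed above.
SLIDE_ID, EXPRESSION, COORDS = 3, 4, 5


def group_by_slide(metadata):
    """Map slide ID -> list of (row index, coords, expression), in input order."""
    slides = defaultdict(list)
    for i, row in enumerate(metadata):
        slides[row[SLIDE_ID]].append((i, row[COORDS], row[EXPRESSION]))
    return dict(slides)


# Toy rows: [img_path, spot_id, split, slide_id, gene_expression, coords]
toy_metadata = [
    ["a.png", "s0", "train", "slide_A", [0.1, 0.2], (0, 0)],
    ["b.png", "s1", "train", "slide_B", [0.3, 0.4], (2, 0)],
    ["c.png", "s2", "train", "slide_A", [0.5, 0.6], (1, 1)],
]

for slide_id, spots in group_by_slide(toy_metadata).items():
    print(slide_id, [index for index, _, _ in spots])
```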

Supported Dataset Configs

Dataset configs are provided under config/data/:

abalo_human_squamous_cell_carcinoma.yaml
erickson_human_prostate_cancer_p1.yaml
mirzazadeh_mouse_bone.yaml
mirzazadeh_mouse_brain_p1.yaml
mirzazadeh_mouse_brain_p2.yaml
vicari_mouse_brain.yaml
villacampa_lung_organoid.yaml

Additional configs are also included:

mirzazadeh_human_small_intestine.yaml
vicari_human_striatium.yaml
villacampa_mouse_brain.yaml

Training

Train HEXST on a single dataset:

python train.py \
  --base_config ./config/baseline.yaml \
  --data_config ./config/data/abalo_human_squamous_cell_carcinoma.yaml \
  --model_config ./config/model/HEXST.yaml \
  --loss_function_config ./config/loss/function/MSEPL.yaml \
  --loss_mode_config ./config/loss/mode/DIOR.yaml \
  --loss_type_config ./config/loss/type/IF.yaml

Checkpoints and logs are saved to:

./results/<project>/HEXST_MSEPL_DIOR_IF/
├── model_best.pth
├── model_last.pth
└── log.txt
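
To train on several datasets in sequence, the command above can be generated per dataset config. The sketch below only builds and prints the commands, using the flags and config paths shown in this README; the helper name build_command is hypothetical. Passing each list to `subprocess.run(cmd, check=True)` would execute it.

```python
import shlex

# Dataset config names from config/data/ (without the .yaml extension).
DATASETS = [
    "abalo_human_squamous_cell_carcinoma",
    "erickson_human_prostate_cancer_p1",
    "mirzazadeh_mouse_bone",
    "mirzazadeh_mouse_brain_p1",
    "mirzazadeh_mouse_brain_p2",
    "vicari_mouse_brain",
    "villacampa_lung_organoid",
]


def build_command(dataset, script="train.py"):
    """Return the argument list for one run on the given dataset config."""
    return [
        "python", script,
        "--base_config", "./config/baseline.yaml",
        "--data_config", f"./config/data/{dataset}.yaml",
        "--model_config", "./config/model/HEXST.yaml",
        "--loss_function_config", "./config/loss/function/MSEPL.yaml",
        "--loss_mode_config", "./config/loss/mode/DIOR.yaml",
        "--loss_type_config", "./config/loss/type/IF.yaml",
    ]


for name in DATASETS:
    print(shlex.join(build_command(name)))
```

Keeping the command as a list rather than a single string avoids shell quoting issues when it is passed to `subprocess.run`.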

Evaluation

Evaluate a trained model on the test split:

python eval.py \
  --base_config ./config/baseline.yaml \
  --data_config ./config/data/abalo_human_squamous_cell_carcinoma.yaml \
  --model_config ./config/model/HEXST.yaml \
  --loss_function_config ./config/loss/function/MSEPL.yaml \
  --loss_mode_config ./config/loss/mode/DIOR.yaml \
  --loss_type_config ./config/loss/type/IF.yaml

Predicted expression tensors are saved to:

/data/SpaRED_pred/HEXST/<project>/HEXST_MSEPL_DIOR_IF/<slide_id>.pt

The evaluation code reports:

  • PCC_F: gene-wise Pearson correlation
  • PCC_S: spot-wise Pearson correlation
  • PCC_M: matrix-level Pearson correlation
  • MI_F: gene-wise mutual information
  • MI_M: matrix-level mutual information
  • NRMSE_F
  • NRMSE_M
  • AUC_0vNZ
  • AUC_Q50
  • JSDIV_M
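
The three Pearson variants differ only in the axis along which predictions and targets are compared. The pure-Python sketch below illustrates that distinction for matrices with spots as rows and genes as columns. Averaging the per-gene and per-spot correlations and returning NaN for constant vectors are assumptions here; the evaluation code may aggregate or handle edge cases differently, and the function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


def _mean(values):
    return sum(values) / len(values)


def pcc_gene_wise(pred, target):
    """Correlate each gene (column) across spots, then average over genes."""
    return _mean([pearson(p, t) for p, t in zip(zip(*pred), zip(*target))])


def pcc_spot_wise(pred, target):
    """Correlate each spot (row) across genes, then average over spots."""
    return _mean([pearson(p, t) for p, t in zip(pred, target)])


def pcc_matrix(pred, target):
    """Correlate the flattened prediction and target matrices."""
    flat_pred = [v for row in pred for v in row]
    flat_target = [v for row in target for v in row]
    return pearson(flat_pred, flat_target)


# Toy example: 2 spots x 3 genes.
target = [[1.0, 2.0, 3.0], [4.0, 5.0, 7.0]]
pred = [[1.0, 2.0, 3.0], [7.0, 5.0, 4.0]]
print(pcc_gene_wise(pred, target), pcc_spot_wise(pred, target), pcc_matrix(pred, target))
```

In this toy case each gene is perfectly correlated across the two spots, while the second spot's profile across genes is reversed, so the gene-wise and spot-wise scores diverge.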

Third-party Resources and Data

The current implementation assumes pre-extracted UNI image features and scFoundation transcriptomic embeddings. The spatial transcriptomics benchmark datasets and splits follow SpaRED.

  • UNI
    Chen, Richard J., et al. "Towards a general-purpose foundation model for computational pathology." Nature Medicine 30.3 (2024): 850–862.
    We use the official implementation from: https://github.com/mahmoodlab/UNI
  • scFoundation
    Hao, Minsheng, et al. "Large-scale foundation model on single-cell transcriptomics." Nature Methods 21.8 (2024): 1481–1491.
    We use the official implementation from: https://github.com/biomap-research/scFoundation
  • SpaRED
    Mejia, Gabriel, et al. "Enhancing gene expression prediction from histology images with spatial transcriptomics completion." International Conference on Medical Image Computing and Computer-Assisted Intervention. Cham: Springer Nature Switzerland, 2024.
    We use the SpaRED benchmark data and splits from: https://bcv-uniandes.github.io/spared_webpage/

Please refer to the original repositories, papers, and dataset webpage for license terms, model access, data access, and usage restrictions.


License

This repository is released under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
Commercial use is not permitted. Non-commercial research and educational use is permitted with appropriate attribution.
This license applies only to the original HEXST source code in this repository. Third-party resources are subject to their own licenses and usage restrictions.


Citation

If you find this repository useful, please cite HEXST:

@article{byeon2026hexst,
  title   = {HEXST: Hexagonal Shifted-Window Transformer for Spatial Transcriptomics Gene Expression Prediction},
  author  = {Byeon, Keunho and Kwak, Jin Tae},
  journal = {arXiv preprint arXiv:2605.04682},
  year    = {2026}
}

Acknowledgements

This work was supported by grants from the National Research Foundation of Korea (NRF) (Nos. RS-2025-00558322 and RS-2024-00397293) and by the AI Computing Infrastructure Enhancement (GPU Rental Support) User Support Program funded by the Ministry of Science and ICT (MSIT) (No. RQT-25-120213), Republic of Korea.
