The official code of "Pathology-and-genomics Multimodal Transformer for Survival Outcome Prediction" (accepted to MICCAI 2023, top 9%).
Our Paper [Link]
[New Update!!!] We updated the paper list of pathology-and-genomics multimodal analysis approaches in healthcare at the end of this repo.
Workflow overview of the pathology-and-genomics multimodal transformer (PathOmics) for survival prediction. (a) Unsupervised pretraining extracts image and genomics feature embeddings for multimodal data fusion. (b) and (c) The supervised finetuning scheme flexibly handles multiple types of data for prognostic prediction. With the multimodal pretrained backbones, either multimodal or single-modal data can be used for finetuning.
@inproceedings{ding2023pathology,
title={Pathology-and-genomics multimodal transformer for survival outcome prediction},
author={Ding, Kexin and Zhou, Mu and Metaxas, Dimitris N and Zhang, Shaoting},
booktitle={International Conference on Medical Image Computing and Computer-Assisted Intervention},
pages={622--631},
year={2023},
organization={Springer}
}
Python 3.8.18
PyTorch 2.0.1
pytorch-cuda 11.8
Torchvision 0.15.2
Pillow 9.4.0
numpy 1.24.3
pandas 2.0.3
scikit-survival 0.21.0
scikit-learn 1.2.0
h5py 2.8.0
1. Download WSIs from TCGA-COAD and TCGA-READ.
2. Download genomics data from cBioPortal and move the downloaded folder into the "PathOmics" folder.
* "coadread_tcga_pan_can_atlas_2018" in `bash_main.py` and `bash_main_read.py` is the name of the downloaded folder. Download it before you run the code.
3. Split WSIs into patches and only keep the foreground patches.
4. Extract patch features via pretrained models (e.g., ImageNet-pretrained ResNet50, ResNet101, etc).
5. Save patch features as .npz files. (For each slide, we generate one .npz file to save patch features).
For more details on feature extraction, see Issue 1 and the code in `split_tiles_utils/helper.py`.
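Steps 3 and 5 above can be sketched as follows. This is a minimal illustration, not the code in `split_tiles_utils/helper.py`: the function names, the intensity-based foreground rule, its thresholds, the `features` key inside the .npz file, and the slide id are all assumptions made for the example. Random arrays stand in for real tiles and for the CNN features of step 4.

```python
import os
import tempfile

import numpy as np


def is_foreground(patch, white_threshold=220, min_tissue_fraction=0.5):
    """Return True if enough pixels are darker than near-white background.

    Hypothetical rule: the thresholds are illustrative, not the repo's values.
    """
    gray = patch.mean(axis=-1)  # (H, W): mean over the RGB channels
    tissue_fraction = (gray < white_threshold).mean()
    return tissue_fraction >= min_tissue_fraction


def save_slide_features(features, out_dir, slide_id):
    """Write one .npz file per slide holding an (n_patches, feat_dim) array."""
    path = os.path.join(out_dir, f"{slide_id}.npz")
    # The key name "features" is an assumption; match whatever the loader expects.
    np.savez_compressed(path, features=np.asarray(features, dtype=np.float32))
    return path


# Toy data standing in for WSI tiles: two blank (white) patches, three dark ones.
rng = np.random.default_rng(0)
background = [np.full((224, 224, 3), 255, dtype=np.uint8) for _ in range(2)]
tissue = [rng.integers(0, 180, size=(224, 224, 3), dtype=np.uint8) for _ in range(3)]
kept = [p for p in background + tissue if is_foreground(p)]

# Placeholder for step 4: real features would come from a pretrained CNN
# such as an ImageNet-pretrained ResNet50 (2048-dimensional pooled output).
fake_features = rng.normal(size=(len(kept), 2048))

out_dir = tempfile.mkdtemp()
npz_path = save_slide_features(fake_features, out_dir, "TCGA-XX-0000")
loaded = np.load(npz_path)["features"]
```

Keeping one file per slide means a slide's whole bag of patch features can be loaded with a single `np.load` call during training.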
The model will be pretrained and finetuned on the TCGA-COAD training set (4-fold cross-validation). The finetuned model will be evaluated on the TCGA-COAD hold-out set.
python bash_main.py --pretrain_loss 'MSE' --save_model_folder_name 'reproduce_experiments' --experiment_folder_name 'COAD_reproduce' --omic_modal 'miRNA' --kfold_split_seed 42 --pretrain_epochs 25 --finetune_epochs 25 --model_type 'PathOmics' --model_fusion_type 'concat' --model_pretrain_fusion_type 'concat' --cuda_device '2' --experiment_id '1' --use_GAP_in_pretrain_flag --seperate_test
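The evaluation protocol described above (a fixed hold-out set plus 4-fold cross-validation on the remaining data, with a seeded split) can be sketched as below. This is only an illustration of the idea, not the split logic in `bash_main.py`: the function name, the 20% hold-out fraction, and the patient ids are assumptions.

```python
import random


def holdout_then_kfold(sample_ids, n_folds=4, test_fraction=0.2, seed=42):
    """Split ids into a fixed hold-out set plus n_folds train/val splits.

    test_fraction is an assumed value; the README does not state the hold-out size.
    """
    ids = sorted(sample_ids)
    random.Random(seed).shuffle(ids)  # seeded, so the split is reproducible

    n_test = int(round(len(ids) * test_fraction))
    test_ids, dev_ids = ids[:n_test], ids[n_test:]

    folds = []
    for k in range(n_folds):
        val_ids = dev_ids[k::n_folds]  # every n_folds-th id, starting at offset k
        val_set = set(val_ids)
        train_ids = [i for i in dev_ids if i not in val_set]
        folds.append((train_ids, val_ids))
    return test_ids, folds


patients = [f"patient_{i:03d}" for i in range(50)]  # hypothetical ids
test_ids, folds = holdout_then_kfold(patients)
```

The hold-out ids never appear in any fold, so the hold-out score is not influenced by choices made during cross-validation.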
The model will be pretrained on TCGA-COAD (5-fold cross-validation). It will then be finetuned, validated, and evaluated on the TCGA-READ dataset.
python bash_main_read.py --k_fold 5 --fusion_mode 'concat' --prev_fusion_mode 'concat' --pretrain_loss 'MSE' --save_model_folder_name 'reproduce_experiments' --experiment_folder_name 'READ_reproduce' --omic_modal 'miRNA' --kfold_split_seed 42 --pretrain_epochs 25 --finetune_epochs 25 --model_type 'PathOmics' --cuda_device '2' --experiment_id '1' --use_GAP_in_pretrain_flag
If you want to use TCGA-COAD pretrained weights and skip the pretraining stage, add `--load_model_finetune` to your script.
Modify the code as needed so that it points to the directory where your pretrained weights are saved.
Add `--less_data` to your script and set `--finetune_test_ratio` to your preferred value, which indicates the ratio of data used for model finetuning.
Publish Date | Title | Paper Link | Code |
---|---|---|---|
2023.10 | Pathology-and-genomics Multimodal Transformer for Survival Outcome Prediction | MICCAI 2023 | Code |
2023.10 | Gene-induced Multimodal Pre-training for Image-omic Classification | MICCAI 2023 | Code |
2023.10 | Cross-Modal Translation and Alignment for Survival Analysis | ICCV 2023 | Code |
2023.07 | Assessment of emerging pretraining strategies in interpretable multimodal deep learning for cancer prognostication | BioData Mining | NA |
2023.04 | Multimodal data fusion for cancer biomarker discovery with deep learning | Nature Machine Intelligence | NA |
2023.03 | Hierarchical multimodal fusion framework based on noisy label learning and attention mechanism for cancer classification with pathology and genomic features | Computerized Medical Imaging and Graphics | NA |
2023.03 | Hybrid Graph Convolutional Network With Online Masked Autoencoder for Robust Multimodal Cancer Survival Prediction | TMI | Code |
2023.01 | Multimodal deep learning to predict prognosis in adult and pediatric brain tumors | Communications Medicine | Code |
2023.01 | Survival Prediction for Gastric Cancer via Multimodal Learning of Whole Slide Images and Gene Expression | BIBM 2022 | Code |
2023.01 | Deep Biological Pathway Informed Pathology-Genomic Multimodal Survival Prediction | arXiv | NA |
2022.09 | Survival Prediction of Brain Cancer with Incomplete Radiology, Pathology, Genomic, and Demographic Data | MICCAI 2022 | Code |
2022.09 | Discrepancy and gradient guided multi-modal knowledge distillation for pathological glioma grading | MICCAI 2022 | Code |
2022.08 | Multimodal integration of radiology, pathology and genomics for prediction of response to PD-(L)1 blockade in patients with non-small cell lung cancer | Nature Cancer | Code |
2022.08 | Pan-cancer integrative histology-genomic analysis via multimodal deep learning | Cancer Cell | Code |
2022.03 | HFBSurv: hierarchical multimodal fusion with factorized bilinear models for cancer survival prediction | Bioinformatics | Code |
2021.10 | Multimodal co-attention transformer for survival prediction in gigapixel whole slide images | ICCV 2021 | Code |
2021.09 | Deep Orthogonal Fusion: Multimodal Prognostic Biomarker Discovery Integrating Radiology, Pathology, Genomic, and Clinical Data | MICCAI 2021 | NA |
2020.09 | Pathomic fusion: an integrated framework for fusing histopathology and genomic features for cancer diagnosis and prognosis | TMI | Code |
2020.08 | Applying Machine Learning for Integration of Multi-Modal Genomics Data and Imaging Data to Quantify Heterogeneity in Tumour Tissues | Artificial Neural Networks | NA |
2019.09 | Multimodal Multitask Representation Learning for Pathology Biobank Metadata Prediction | arXiv | NA |
2019.07 | Deep learning with multimodal representation for pancancer prognosis prediction | Bioinformatics | Code |
2019.06 | Integrative Analysis of Pathological Images and Multi-Dimensional Genomic Data for Early-Stage Cancer Prognosis | TMI | Code |