Cassie07/PathOmics

PathOmics: Pathology-and-genomics Multimodal Transformer for Survival Outcome Prediction

The official code of "Pathology-and-genomics Multimodal Transformer for Survival Outcome Prediction" (accepted to MICCAI 2023, top 9%).

Our Paper [Link]

[New Update!!!] We updated the paper list of pathology-and-genomics multimodal analysis approaches in healthcare at the end of this repo.

Workflow overview of PathOmics

Workflow overview of the pathology-and-genomics multimodal transformer (PathOmics) for survival prediction. In (a), we show the pipeline for extracting image and genomics feature embeddings through unsupervised pretraining for multimodal data fusion. In (b) and (c), our supervised finetuning scheme flexibly handles multiple types of data for prognostic prediction. With the multimodal pretrained model backbones, either multimodal or single-modal data can be used for model finetuning.

Citation

@inproceedings{ding2023pathology,
  title={Pathology-and-genomics multimodal transformer for survival outcome prediction},
  author={Ding, Kexin and Zhou, Mu and Metaxas, Dimitris N and Zhang, Shaoting},
  booktitle={International Conference on Medical Image Computing and Computer-Assisted Intervention},
  pages={622--631},
  year={2023},
  organization={Springer}
}

Prerequisites

Python 3.8.18
PyTorch 2.0.1
pytorch-cuda 11.8
torchvision 0.15.2
Pillow 9.4.0
numpy 1.24.3
pandas 2.0.3
scikit-survival 0.21.0
scikit-learn 1.2.0
h5py 2.8.0

Usage

Data preprocessing

1. Download whole-slide images (WSIs) from TCGA-COAD and TCGA-READ.
2. Download genomics data from cBioPortal and move the downloaded folder into the "PathOmics" folder.
* The "coadread_tcga_pan_can_atlas_2018" folder referenced in `bash_main.py` and `bash_main_read.py` is this downloaded folder; download it before running the code.
3. Split WSIs into patches and keep only the foreground patches.
4. Extract patch features with pretrained models (e.g., ImageNet-pretrained ResNet50, ResNet101, etc.).
5. Save patch features as .npz files (one .npz file per slide).

For more details about feature extraction, please check Issue 1 and the code in split_tiles_utils/helper.py
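As a rough illustration of steps 3 and 5, the sketch below filters near-white background patches and writes one .npz file per slide. It is not the code from split_tiles_utils/helper.py: the function names, the intensity threshold, and the `features` array key are all hypothetical choices made for this example.

```python
from pathlib import Path

import numpy as np


def is_foreground(patch: np.ndarray, max_mean_intensity: float = 220.0) -> bool:
    """Treat a patch as tissue if it is darker than near-white background.

    The threshold is a hypothetical value, not taken from the repository.
    """
    return float(patch.mean()) < max_mean_intensity


def save_slide_features(features: np.ndarray, out_dir: Path, slide_id: str) -> Path:
    """Write one .npz file per slide holding an (n_patches, dim) array."""
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / f"{slide_id}.npz"
    # "features" is an assumed key name; the repository may use another one.
    np.savez_compressed(out_path, features=features)
    return out_path


def load_slide_features(path: Path) -> np.ndarray:
    """Read back the per-slide feature matrix written above."""
    with np.load(path) as data:
        return data["features"]
```

In practice the feature matrix would come from the pretrained backbone in step 4; here any `(n_patches, dim)` array works.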

Run code on TCGA-COAD only

The model will be pretrained and finetuned on the TCGA-COAD training set (4-fold cross-validation). The finetuned model will be evaluated on the TCGA-COAD hold-out set.

python bash_main.py --pretrain_loss 'MSE' --save_model_folder_name 'reproduce_experiments' --experiment_folder_name 'COAD_reproduce' --omic_modal 'miRNA' --kfold_split_seed 42 --pretrain_epochs 25 --finetune_epochs 25 --model_type 'PathOmics' --model_fusion_type 'concat' --model_pretrain_fusion_type 'concat' --cuda_device '2' --experiment_id '1' --use_GAP_in_pretrain_flag --seperate_test
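The evaluation protocol described for this command (a hold-out set plus 4-fold cross-validation on the remaining cases) can be sketched in plain Python as follows. This is only an illustration of the idea: the function name and the 20% hold-out fraction are assumptions, and the repository's own splitting logic may differ. The seed mirrors `--kfold_split_seed 42`.

```python
import random


def holdout_kfold_split(case_ids, k=4, test_fraction=0.2, seed=42):
    """Return (folds, test_ids); folds is a list of (train_ids, val_ids).

    test_fraction is a hypothetical value chosen for this sketch.
    """
    ids = sorted(case_ids)
    random.Random(seed).shuffle(ids)

    # Reserve the hold-out test cases first so they never enter any fold.
    n_test = int(round(len(ids) * test_fraction))
    test_ids, dev_ids = ids[:n_test], ids[n_test:]

    folds = []
    for i in range(k):
        val_ids = dev_ids[i::k]  # every k-th case, offset by i
        val_set = set(val_ids)
        train_ids = [c for c in dev_ids if c not in val_set]
        folds.append((train_ids, val_ids))
    return folds, test_ids
```

Each case outside the hold-out set appears in exactly one validation fold, and the hold-out cases are only used for the final evaluation.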

Run code on TCGA-COAD and TCGA-READ

Model will be pretrained on TCGA-COAD (5-fold cross-validation). Model will be finetuned, validated, and evaluated on the TCGA-READ dataset.

python bash_main_read.py --k_fold 5 --fusion_mode 'concat' --prev_fusion_mode 'concat' --pretrain_loss 'MSE' --save_model_folder_name 'reproduce_experiments' --experiment_folder_name 'READ_reproduce' --omic_modal 'miRNA' --kfold_split_seed 42 --pretrain_epochs 25 --finetune_epochs 25 --model_type 'PathOmics' --cuda_device '2' --experiment_id '1' --use_GAP_in_pretrain_flag
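When launching several runs, it can be convenient to assemble a command like the one above from a dictionary instead of editing one long line. The helper below is a hypothetical convenience wrapper, not part of the repository; it only uses flag names that appear in the commands in this README.

```python
import shlex
import sys


def build_command(script, options, flags=()):
    """Build an argument list: valued options first, then boolean flags."""
    cmd = [sys.executable, script]
    for name, value in options.items():
        cmd += [f"--{name}", str(value)]
    cmd += [f"--{flag}" for flag in flags]
    return cmd


if __name__ == "__main__":
    cmd = build_command(
        "bash_main_read.py",
        {"k_fold": 5, "omic_modal": "miRNA", "kfold_split_seed": 42},
        flags=["use_GAP_in_pretrain_flag"],
    )
    # Print the shell-quoted command; pass cmd to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` avoids shell quoting issues.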

If you want to use the TCGA-COAD pretrained weights and skip the pretraining stage, add --load_model_finetune to your command. Make sure the directory where the pretrained weights are saved is set correctly in the code.

Use data-efficient mode in finetuning stage

Add --less_data to your command and set --finetune_test_ratio to the ratio of data you want to use for model finetuning.
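Conceptually, data-efficient finetuning keeps only a fraction of the training cases. The sketch below shows one way such a subsample could be drawn reproducibly; the function name and the sampling strategy are assumptions for illustration, not the repository's implementation of `--less_data`.

```python
import random


def subsample_finetune_ids(train_ids, ratio, seed=42):
    """Keep a reproducible random fraction of the training case IDs."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    ids = sorted(train_ids)
    # Always keep at least one case so finetuning has data to run on.
    n_keep = max(1, round(len(ids) * ratio))
    return sorted(random.Random(seed).sample(ids, n_keep))
```

Fixing the seed keeps the reduced training set identical across runs, so results at different ratios remain comparable.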

Literature review of pathology-and-genomics multimodal analysis approaches in healthcare

| Publish Date | Title | Paper Link | Code |
|---|---|---|---|
| 2023.10 | Pathology-and-genomics Multimodal Transformer for Survival Outcome Prediction | MICCAI 2023 | Code |
| 2023.10 | Gene-induced Multimodal Pre-training for Image-omic Classification | MICCAI 2023 | Code |
| 2023.10 | Cross-Modal Translation and Alignment for Survival Analysis | ICCV 2023 | Code |
| 2023.07 | Assessment of emerging pretraining strategies in interpretable multimodal deep learning for cancer prognostication | BioData Mining | NA |
| 2023.04 | Multimodal data fusion for cancer biomarker discovery with deep learning | Nature Machine Intelligence | NA |
| 2023.03 | Hierarchical multimodal fusion framework based on noisy label learning and attention mechanism for cancer classification with pathology and genomic features | Computerized Medical Imaging and Graphics | NA |
| 2023.03 | Hybrid Graph Convolutional Network With Online Masked Autoencoder for Robust Multimodal Cancer Survival Prediction | TMI | Code |
| 2023.01 | Multimodal deep learning to predict prognosis in adult and pediatric brain tumors | Communications Medicine | Code |
| 2023.01 | Survival Prediction for Gastric Cancer via Multimodal Learning of Whole Slide Images and Gene Expression | BIBM 2022 | Code |
| 2023.01 | Deep Biological Pathway Informed Pathology-Genomic Multimodal Survival Prediction | arXiv | NA |
| 2022.09 | Survival Prediction of Brain Cancer with Incomplete Radiology, Pathology, Genomic, and Demographic Data | MICCAI 2022 | Code |
| 2022.09 | Discrepancy and gradient guided multi-modal knowledge distillation for pathological glioma grading | MICCAI 2022 | Code |
| 2022.08 | Multimodal integration of radiology, pathology and genomics for prediction of response to PD-(L)1 blockade in patients with non-small cell lung cancer | Nature Cancer | Code |
| 2022.08 | Pan-cancer integrative histology-genomic analysis via multimodal deep learning | Cancer Cell | Code |
| 2022.03 | HFBSurv: hierarchical multimodal fusion with factorized bilinear models for cancer survival prediction | Bioinformatics | Code |
| 2021.10 | Multimodal co-attention transformer for survival prediction in gigapixel whole slide images | ICCV 2021 | Code |
| 2021.09 | Deep Orthogonal Fusion: Multimodal Prognostic Biomarker Discovery Integrating Radiology, Pathology, Genomic, and Clinical Data | MICCAI 2021 | NA |
| 2020.09 | Pathomic fusion: an integrated framework for fusing histopathology and genomic features for cancer diagnosis and prognosis | TMI | Code |
| 2020.08 | Applying Machine Learning for Integration of Multi-Modal Genomics Data and Imaging Data to Quantify Heterogeneity in Tumour Tissues | Artificial Neural Networks | NA |
| 2019.09 | Multimodal Multitask Representation Learning for Pathology Biobank Metadata Prediction | arXiv | NA |
| 2019.07 | Deep learning with multimodal representation for pancancer prognosis prediction | Bioinformatics | Code |
| 2019.06 | Integrative Analysis of Pathological Images and Multi-Dimensional Genomic Data for Early-Stage Cancer Prognosis | TMI | Code |

About

[MICCAI 2023 Oral] The official code of "Pathology-and-genomics Multimodal Transformer for Survival Outcome Prediction" (top 9%)
