MOMA: The Multi-omics Multi-cohort Assessment (MOMA) Platform

Pei-Chen Tsai, Tsung-Hua Lee, Kun-Chi Kuo, Fang-Yi Su, Tsung-Lu Michael Lee, Eliana Marostica, Tomotaka Ugai, Melissa Zhao, Mai Chan Lau, Juha P. Väyrynen, Marios Giannakis, Yasutoshi Takashima, Seyed Mousavi Kahaki, Kana Wu, Mingyang Song, Jeffrey A. Meyerhardt, Andrew T. Chan, Jung-Hsien Chiang, Jonathan Nowak, Shuji Ogino, Kun-Hsing Yu. Histopathology Images Predicted Multi-Omics Aberrations and Prognoses in Colorectal Cancer Patients. Nature Communications. 2023 Apr 13;14(1):2102.

Requirements

Survival prediction
- Python==3.6.0
- tensorflow==2.4.0
- lifelines
- scipy
- statistics
- matplotlib

Multi-omics characterization
- Python==3.6.0
- torch==1.6.0
- torchvision==0.7.0
- scikit-learn
- numpy
- smooth-topk
- opencv-python
- tqdm

Dataset

Survival prediction: TCGA-COAD and TCGA-READ
Multi-omics characterization: TCGA-COAD and TCGA-READ
Interpretation: the NCT-CRC-HE-100K dataset provided by Kather et al.

Data Preprocessing

Tiling: Modified from the DeepSlide GitHub repository. Alternatively, you can download the processed dataset provided by Kather et al.
Tumor detection: ResNet50
Color normalization: Modified from the HEnorm_python GitHub repository

Feature Extraction

To extract the features of each patch, you can either use any pre-trained CNN model (as in our multi-omics characterization task) or train a model of your own (as in our survival prediction task).

Data Preparation

Survival Prediction

Color normalization

Create a dataframe

# Survival dataframe
import pandas as pd

data = {
    'bcr_patient_barcode' : patient id,
    'vital_status' : overall survival status or disease-free survival status,
    'Days' : overall survival days or disease-free survival days,
    '0' : pathology image feature (dimension 1),
    '1' : pathology image feature (dimension 2),
    ...
    'n' : pathology image feature (dimension n+1)
}

df = pd.DataFrame(data)
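
As a rough illustration of the layout above, the following sketch builds such a dataframe from synthetic values. The patient IDs, the 0/1 encoding of `vital_status`, and the 4-dimensional feature vectors are assumptions made for this example only; they are not taken from the repository.

```python
import numpy as np
import pandas as pd

# Hypothetical inputs: three patients, each with a 4-dimensional image feature.
patient_ids = ["PATIENT-01", "PATIENT-02", "PATIENT-03"]  # placeholder IDs
vital_status = [1, 0, 1]   # assumed encoding: 1 = event observed, 0 = censored
days = [420, 1310, 95]     # overall survival or disease-free survival time in days
rng = np.random.default_rng(0)
features = rng.normal(size=(len(patient_ids), 4))  # stand-in for extracted features

data = {
    "bcr_patient_barcode": patient_ids,
    "vital_status": vital_status,
    "Days": days,
}
# One column per feature dimension, with string column names '0', '1', ...
for dim in range(features.shape[1]):
    data[str(dim)] = features[:, dim]

df = pd.DataFrame(data)
print(df.columns.tolist())
```

Each row holds one patient, so patch-level features would need to be aggregated to one vector per patient before this step.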

Multi-omics characterization

XXX_id can be either a patient ID or a slide ID, depending on your task. Make sure that each patch_name in the features pickle file matches the corresponding patch_name in the cluster pickle file.

Sample file

# Patch features pickle
{
  'patch_name' : array([latent feature]),
  'patch_name' : array([latent feature]),
  ...
}

# Cluster pickle file
{
  XXX_id: {
    'patch_name' : cluster label,
    'patch_name' : cluster label,
    ...
  },
  XXX_id: {
    'patch_name' : cluster label,
    'patch_name' : cluster label,
    ...
  },
}

# Label pickle file
{
  XXX_id: class,
  XXX_id: class,
  ...
}
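
Since the patch names must agree across files, a quick consistency check before training may help. The sketch below is one possible way to do it; the file names, IDs, and the `missing_patches` helper are hypothetical, and plain lists stand in for the feature arrays.

```python
import pickle
import tempfile
from pathlib import Path


def missing_patches(features, clusters):
    """Map each XXX_id to the patch names it lists that are absent from features."""
    missing = {}
    for sample_id, patch_to_cluster in clusters.items():
        absent = sorted(name for name in patch_to_cluster if name not in features)
        if absent:
            missing[sample_id] = absent
    return missing


# Toy data with hypothetical names; 'patch_c' deliberately has no feature entry.
features = {"patch_a": [0.1, 0.2], "patch_b": [0.3, 0.4]}
clusters = {"id_1": {"patch_a": 0, "patch_b": 1}, "id_2": {"patch_c": 0}}
labels = {"id_1": 1, "id_2": 0}

# Write the dictionaries to pickle files and read two of them back.
with tempfile.TemporaryDirectory() as tmp:
    files = {"features.pkl": features, "clusters.pkl": clusters, "labels.pkl": labels}
    for name, obj in files.items():
        with open(Path(tmp) / name, "wb") as fh:
            pickle.dump(obj, fh)
    with open(Path(tmp) / "features.pkl", "rb") as fh:
        loaded_features = pickle.load(fh)
    with open(Path(tmp) / "clusters.pkl", "rb") as fh:
        loaded_clusters = pickle.load(fh)

problems = missing_patches(loaded_features, loaded_clusters)
unlabeled = sorted(set(loaded_clusters) - set(labels))  # IDs without a class label
print(problems)
print(unlabeled)
```

An empty `problems` dictionary and an empty `unlabeled` list indicate that the three files refer to the same patches and IDs.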

Interpretation

Create a dataframe

# Interpretation dataframe
import pandas as pd

data = {
    'fig' : figure file name,
    'folder' : file path,
    'class' : class index (0 to n)
}

df = pd.DataFrame(data)
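
The three columns can be collected by walking an image folder. The sketch below assumes a layout of one sub-folder per class containing `.tif` files, that `folder` holds the directory of each image, and that class indices follow the order of `class_names`. The `collect_rows` helper and all file names are hypothetical.

```python
import tempfile
from pathlib import Path


def collect_rows(root, class_names):
    """Build the 'fig', 'folder', and 'class' columns from root/<class_name>/*.tif."""
    data = {"fig": [], "folder": [], "class": []}
    for class_index, class_name in enumerate(class_names):
        class_dir = Path(root) / class_name
        for image_path in sorted(class_dir.glob("*.tif")):
            data["fig"].append(image_path.name)
            data["folder"].append(str(class_dir))
            data["class"].append(class_index)
    return data


# Tiny fake dataset with two class folders and three empty image files.
with tempfile.TemporaryDirectory() as tmp:
    for class_name, file_name in [("TUM", "t1.tif"), ("STR", "s1.tif"), ("TUM", "t2.tif")]:
        (Path(tmp) / class_name).mkdir(exist_ok=True)
        (Path(tmp) / class_name / file_name).touch()
    data = collect_rows(tmp, class_names=["STR", "TUM"])

print(data["fig"], data["class"])
```

The returned dictionary can be passed directly to `pd.DataFrame(data)`.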

Usage

Survival prediction
- Overall survival prediction and disease-free survival prediction use the same .ipynb file.

Multi-omics characterization

Sample Command

# Training
python3 Train.py --level patient --hidden_dim 512 --encoder_layer 6 --k_sample 3 --tau 0.5 --save_path 'path/to/save/' --label 'path/to/label pickle file' --use_kather_data True --epoch 60 --lr 3e-4 --evaluate_mode kfold --kfold 5

# Validation
python3 Validation.py --level patient --hidden_dim 512 --encoder_layer 6 --k_sample 3 --tau 0.5 --save_path 'path/to/save/' --label 'path/to/label pickle file' --use_kather_data True

--level                 Prediction level: slide or patient
--hidden_dim            Hidden dimension of the Transformer encoder
--encoder_layer         Number of layers in the Transformer encoder
--k_sample              Number of top-k and bottom-k instances used for instance selection
--tau                   Smoothness term for smoothSVM
--use_kather_data       Whether to use the data provided by Kather et al.
--save_path             Path for saving model weights
--label                 Path to the label pickle file
--lr                    Learning rate
--epoch                 Number of training epochs
--evaluate_mode         Evaluation mode: kfold (k-fold cross-validation) or holdout test
--kfold                 Number of folds
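
To make the `--k_sample` option more concrete, the sketch below shows a generic top-k and bottom-k selection over per-patch scores. It is an illustration only, not the code used by `Train.py`; the `select_top_bottom_k` helper and the score values are hypothetical, and how the scores are produced (for example, by an attention mechanism) is an assumption.

```python
def select_top_bottom_k(scores, k):
    """Return (top_k_indices, bottom_k_indices) for a list of per-patch scores."""
    if 2 * k > len(scores):
        raise ValueError("need at least 2*k patches for disjoint top and bottom sets")
    # Patch indices sorted from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k], order[-k:]


patch_scores = [0.10, 0.85, 0.40, 0.95, 0.05, 0.60, 0.30]  # hypothetical scores
top, bottom = select_top_bottom_k(patch_scores, k=3)
print(top)     # indices of the three highest-scoring patches
print(bottom)  # indices of the three lowest-scoring patches
```

With `--k_sample 3`, as in the sample command, three patches would be drawn from each end of the ranking.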

Interpretation
- Testing on images with a low foreground-to-background ratio is not recommended.
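
One simple way to screen images before interpretation is to estimate how much of each image is tissue. The sketch below treats near-white pixels as background; the `foreground_ratio` helper, the threshold of 220, and any cut-off applied to the result are assumptions rather than values from the repository.

```python
import numpy as np


def foreground_ratio(rgb_image, background_threshold=220):
    """Fraction of pixels darker than a near-white background threshold."""
    gray = rgb_image.astype(np.float32).mean(axis=2)  # crude grayscale conversion
    return float((gray < background_threshold).mean())


# Synthetic 100x100 image: white background with a 50x40 block of 'tissue'.
patch = np.full((100, 100, 3), 255, dtype=np.uint8)
patch[10:60, 20:60] = (180, 90, 150)
ratio = foreground_ratio(patch)
print(ratio)
```

Images whose ratio falls below a chosen cut-off could then be excluded from the interpretation dataframe.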
