MoCLIM: Towards Accurate Cancer Subtyping via Multi-Omics Contrastive Learning with Omics-Inference Modeling
This repository contains the implementation of MoCLIM, presented at the 32nd ACM International Conference on Information and Knowledge Management (CIKM '23).
Original paper: [https://dl.acm.org/doi/10.1145/3583780.3614970].
Presentation and slides: [https://youtu.be/26uYBmsyiLM]
MoCLIM, developed by Ziwei Yang, Zheng Chen, Yasuko Matsubara, and Yasushi Sakurai, introduces a novel approach to cancer subtype identification using multi-omics contrastive learning with omics-inference modeling.
MoCLIM Framework Overview |
---|
An overview of the MoCLIM workflow: (1) MoCLIM takes multi-omics data as input. (2) Omics-specific encoders learn latent features for each omics source in parallel. (3) Multi-omics contrastive learning with a contrastive anchor integrates the learned features, and clustering is then performed on the integrated features. (4) Comprehensive biomedical evaluations following the feature learning help users understand the results generated by MoCLIM.
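To make step (3) more concrete, here is a minimal sketch of an anchored contrastive objective in plain Python. It is not the code from this repository: it assumes an InfoNCE-style loss, assumes the transcriptomics embedding plays the role of the contrastive anchor, and uses hypothetical names (`info_nce`, `anchored_contrastive_loss`, the view keys, and the `temperature` default) chosen only for illustration.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def info_nce(anchor, other, temperature=0.5):
    """InfoNCE loss between anchor embeddings and one other omics view.

    `anchor` and `other` are lists of per-patient embedding vectors in the same
    patient order. Row i of both views (the same patient) is the positive pair;
    all other patients in the batch act as negatives.
    """
    a = [l2_normalize(v) for v in anchor]
    b = [l2_normalize(v) for v in other]
    loss = 0.0
    for i, a_i in enumerate(a):
        logits = [sum(x * y for x, y in zip(a_i, b_j)) / temperature for b_j in b]
        # Numerically stable log-sum-exp, then subtract the positive logit.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(z - m) for z in logits))
        loss += log_denom - logits[i]
    return loss / len(a)


def anchored_contrastive_loss(views, anchor_name="transcriptomics", temperature=0.5):
    """Sum InfoNCE terms between the anchor view and every other omics view."""
    anchor = views[anchor_name]
    return sum(
        info_nce(anchor, emb, temperature)
        for name, emb in views.items()
        if name != anchor_name
    )


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy latent features (hypothetical): 4 patients, 3 dimensions per view.
    mrna = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(4)]
    views = {
        "transcriptomics": mrna,
        "genomics": [[x + 0.01 * rng.gauss(0, 1) for x in row] for row in mrna],
        "proteomics": [[rng.gauss(0, 1) for _ in range(3)] for _ in range(4)],
    }
    print(anchored_contrastive_loss(views))
```

In this sketch every non-anchor view is pulled toward the anchor embedding of the same patient and pushed away from other patients, so the views end up in a shared space where a standard clustering algorithm can be applied. The actual encoders, loss, and anchor construction used by MoCLIM are described in the paper.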
Each specific cancer comprises multiple subtypes. These subtypes refer to groups of patients with specific biochemical mechanisms that require tailored therapeutic strategies.
Different cancer subtypes often share the same morphological traits. This can result in high similarity in histopathological images. However, the differences can be found across various omics levels.
Omics Data for Cancer Subtyping |
---|
MoCLIM builds on a biological axiom: genome-wide transcriptomics analysis is the mainstay of omics studies.
In this schematic diagram, various omics information, including genomics (DNA) and proteomics (protein) data, are interconnected through processes like gene expression and regulation at the mRNA (transcriptomics) level.
Biological Observation: Transcriptomics as a Focal Point |
---|
Experimental results on six cancer datasets demonstrate that our approach significantly improves data fit and subtyping performance, even when only a small number of high-dimensional cancer samples are available.
Moreover, our framework incorporates various medical evaluations as the final component, providing high interpretability in medical analysis.
Cancer Subtyping Examples |
---|
Gene Set Enrichment Analysis Examples |
---|
The implementation details and code can be found in `MoCLIM_main`. Follow the setup instructions below to run the code successfully.
To set up the project environment, install the required dependencies using pip:
`pip install -r requirements.txt`
If you're using a CUDA version other than 10.2, make sure the PyTorch build you install matches your CUDA version. Refer to the official PyTorch installation instructions for details.
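As a quick sanity check after installation, a small script along the following lines can report which CUDA version the installed PyTorch build targets. The helper name `describe_torch_cuda` is hypothetical and not part of this repository; the `torch` attributes it reads are standard PyTorch.

```python
def describe_torch_cuda():
    """Return a one-line summary of the installed PyTorch build and its CUDA support."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed"
    # torch.version.cuda is None for CPU-only builds.
    return (
        f"PyTorch {torch.__version__}, built for CUDA {torch.version.cuda}, "
        f"GPU available: {torch.cuda.is_available()}"
    )


if __name__ == "__main__":
    print(describe_torch_cuda())
```

If the reported CUDA version does not match the one on your machine, or the GPU is reported as unavailable, reinstall PyTorch with the build that matches your setup.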
If you find our work helpful for your research, please consider citing our paper:
@inproceedings{MoCLIM,
author={Yang, Ziwei and Chen, Zheng and Matsubara, Yasuko and Sakurai, Yasushi},
booktitle = {Proceedings of the 32nd ACM International Conference on Information and Knowledge Management},
title={MoCLIM: Towards Accurate Cancer Subtyping via Multi-Omics Contrastive Learning with Omics-Inference Modeling},
year={2023},
series = {CIKM '23},
pages={2895--2905}}
Thank you for your interest in our research. For any questions or inquiries, feel free to reach out to us.