Because complex diseases are inherently intricate, finding interpretable associations between multi-omics data types is challenging with standard approaches.
We propose a contrastive learning approach that uses multi-omics data to generate many-to-many associations between any two omics modalities. We generate learnable embeddings from tokenizations of each modality and use attention-based encoders to learn the connections between them.
Our modality-agnostic approach identifies many-to-many associations through self-supervised learning and cross-modal attention encoders. We also provide a pre-trained model for many-to-many multi-omic association discovery.
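To make the contrastive objective concrete, the sketch below computes a symmetric, CLIP-style InfoNCE loss between paired embeddings from two modalities. It is a minimal illustration in pure Python, not the model's actual implementation: the function names, the temperature value, and the choice of this particular loss are assumptions, since the description above does not specify them.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def similarity_matrix(emb_a, emb_b, temperature=0.1):
    """Temperature-scaled cosine similarities between every (a, b) pair."""
    a = [l2_normalize(v) for v in emb_a]
    b = [l2_normalize(v) for v in emb_b]
    return [[sum(x * y for x, y in zip(u, v)) / temperature for v in b] for u in a]

def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[i]
    return total / len(logits)

def symmetric_contrastive_loss(emb_a, emb_b, temperature=0.1):
    """Average of the a->b and b->a losses (assumed, hypothetical objective)."""
    logits = similarity_matrix(emb_a, emb_b, temperature)
    logits_t = [list(col) for col in zip(*logits)]  # transpose for b->a
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_t))

# Toy example: row i of modality A is paired with row i of modality B.
modality_a = [[1.0, 0.0], [0.0, 1.0]]
aligned_b = [[1.0, 0.0], [0.0, 1.0]]
shuffled_b = [[0.0, 1.0], [1.0, 0.0]]

aligned_loss = symmetric_contrastive_loss(modality_a, aligned_b)
shuffled_loss = symmetric_contrastive_loss(modality_a, shuffled_b)
```

In this toy setup the loss is small when matching rows are similar and large when the pairing is shuffled, which is the signal a contrastive model is trained on to pull true pairs together and push other pairs apart.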
- Create the environment from the comical_env.yml file:
conda env create -f comical_env.yml
- Note: if you receive the error "bash: conda: command not found...", you need to install Anaconda in your development environment (see "Additional resources" below)
- Activate the new environment:
conda activate comical-env
- Verify that the new environment was installed correctly:
conda env list
- Additional resources:
- Request resources from the computing cluster:
jbsub -cores 2+1 -q x86_1h -mem 800g -interactive bash
- Activate the new environment:
conda activate comical-env
- Move to the directory with the source code and data:
cd /dccstor/ukb-pgx/comical/comical
- Run Comical:
nohup python wrapper.py --fname_out_root new_run_check_code --epochs 4 --top_n_perc 0.5 &
- List the available command-line options:
python wrapper.py --help
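The run command above passes three flags: --fname_out_root, --epochs, and --top_n_perc. As a rough sketch of how a script could accept flags like these with argparse, consider the following. The types, defaults, and help strings are assumptions made for illustration and may not match the real wrapper.py; use python wrapper.py --help for the authoritative list.

```python
import argparse

def build_parser():
    """Hypothetical parser mirroring only the flags shown in the run command."""
    parser = argparse.ArgumentParser(
        description="Illustrative sketch of a wrapper.py-style command line"
    )
    # Assumed to be a string prefix for output file names.
    parser.add_argument("--fname_out_root", type=str, default="run",
                        help="root name for output files (assumed meaning)")
    # Assumed to be an integer number of training epochs.
    parser.add_argument("--epochs", type=int, default=1,
                        help="number of training epochs (assumed meaning)")
    # Assumed to be a float; its exact semantics are defined by the real script.
    parser.add_argument("--top_n_perc", type=float, default=0.5,
                        help="value passed as --top_n_perc (semantics not shown here)")
    return parser

# Parse the same arguments used in the run command above.
args = build_parser().parse_args(
    ["--fname_out_root", "new_run_check_code", "--epochs", "4", "--top_n_perc", "0.5"]
)
```

Passing an explicit argument list to parse_args keeps the sketch runnable without a shell; a real script would normally call parse_args() with no arguments so that it reads sys.argv.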
Contributors and contact info:
- Diego Machado Reyes
- Myson Burch (myson dot burch at ibm dot com)
- Aritra Bose (a dot bose at ibm dot com)
- Laxmi Parida (parida at us dot ibm dot com)
- 0.1
- Initial Release