✨ Official implementation of MoMST from ICML 2026.
📄 paper | 🔗 code
This repository implements MOMST, a framework for multi-objective protein sequence design. This framework alternates between noising and memory-guided denoising in diffusion models. By combining self-contrastive learning to extract residue-level preferences from historical trajectories with inference-time Pareto alignment, MOMST effectively balances conflicting functional rewards while strictly preserving the pre-trained model's sequence naturalness.
We report results on optimizing several basic structural objectives, including:
- Single-objective protein design, exemplified by optimizing the cRMSD metric (e.g., from run EHEE_rd1_0101).
- Multi-objective design, utilizing a dual-objective combination of globularity and pLDDT.
- Multi-objective design, incorporating a triple-objective combination of hydrophobicity, globularity, and pLDDT.
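
To make the multi-objective setting concrete, here is a minimal sketch of Pareto filtering over candidate scores. It is a generic illustration, not the repository's Pareto-alignment code: the function names `dominates` and `pareto_front` and the reward values are hypothetical, and it assumes every reward is "higher is better".

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Indices of candidates that no other candidate dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical (globularity, pLDDT) rewards for four candidate sequences.
candidates = [(0.9, 0.60), (0.7, 0.80), (0.6, 0.70), (0.9, 0.55)]
print(pareto_front(candidates))  # [0, 1]
```

Candidates 2 and 3 are dropped because another candidate matches or beats them on both rewards; candidates 0 and 1 trade one reward against the other, so both stay.
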
Install PyTorch and PyRosetta, then run the following:
conda create -n MoMST python=3.9
conda activate MoMST
pip install torch torchvision torchaudio
pip install -r requirements.txt
Also, to optimize match_ss and crmsd, go to the ./datasets folder and download the protein examples as shown below. You can also use any PDB files.
python download_model_data.py
This script places several PDB files in ./datasets/AlphaFoldPDB/.
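
A quick way to confirm the setup is to check that the required packages are importable and that the download produced files. The sketch below is not part of the repository. The helper names are hypothetical, and it assumes the import names are `torch` and `pyrosetta` and that the downloaded files carry a `.pdb` extension.

```python
import importlib.util
from pathlib import Path
from typing import List


def missing_modules(names: List[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def list_pdb_files(folder: str) -> List[str]:
    """Return sorted names of .pdb files in folder, or an empty list if it does not exist."""
    root = Path(folder)
    if not root.is_dir():
        return []
    return sorted(
        p.name for p in root.iterdir() if p.is_file() and p.suffix.lower() == ".pdb"
    )


# Assumed import names and the folder mentioned above.
print("missing modules:", missing_modules(["torch", "pyrosetta"]))
print("PDB files found:", len(list_pdb_files("./datasets/AlphaFoldPDB")))
```

An empty "missing modules" list and a non-zero file count suggest the environment is ready for the match_ss and crmsd examples.
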
Below is an explanation of the available options.
| Argument | Description |
|---|---|
| --decoding | decoding method (momst, SVDD_edit, SVDD) |
| --repeatnum | batch size |
| --duplicate | number of candidates |
| --metrics_name | reward functions |
| --metrics_list | weights for rewards |
| --proteinname | target PDB name |
| --iteration | number of iterations |
| --seq_length | protein length |
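
The options above can be mirrored in a small parser to show how `--metrics_name` and `--metrics_list` line up. This is a sketch under assumptions, not the parser in `refinement.py`: the types, the default for `--decoding`, and the helper `pair_metrics` are hypothetical, and it assumes both arguments are comma-separated lists of equal length, as in the example commands.

```python
import argparse
from typing import Dict


def build_parser() -> argparse.ArgumentParser:
    """Illustrative mirror of the documented options (types and defaults are assumed)."""
    parser = argparse.ArgumentParser(description="Illustrative mirror of the documented options")
    parser.add_argument("--decoding", choices=["momst", "SVDD_edit", "SVDD"], default="momst")
    parser.add_argument("--repeatnum", type=int, help="batch size")
    parser.add_argument("--duplicate", type=int, help="number of candidates")
    parser.add_argument("--metrics_name", required=True, help="comma-separated reward functions")
    parser.add_argument("--metrics_list", required=True, help="comma-separated reward weights")
    parser.add_argument("--proteinname", help="target PDB name")
    parser.add_argument("--iteration", type=int, help="number of iterations")
    parser.add_argument("--seq_length", type=int, help="protein length")
    return parser


def pair_metrics(names: str, weights: str) -> Dict[str, float]:
    """Pair each reward name with its weight; raise if the two lists differ in length."""
    name_list = [n.strip() for n in names.split(",") if n.strip()]
    weight_list = [float(w) for w in weights.split(",") if w.strip()]
    if len(name_list) != len(weight_list):
        raise ValueError("metrics_name and metrics_list must have the same number of entries")
    return dict(zip(name_list, weight_list))


args = build_parser().parse_args(
    ["--metrics_name", "globularity,plddt", "--metrics_list", "1,1", "--seq_length", "150"]
)
print(pair_metrics(args.metrics_name, args.metrics_list))  # {'globularity': 1.0, 'plddt': 1.0}
```

Checking the two lists against each other up front catches a mismatched weight count before a long run starts.
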
1. Secondary Structure Match
Design a sequence that folds into a target secondary structure.
CUDA_VISIBLE_DEVICES=0 python refinement.py --decoding momst --repeatnum 10 --duplicate 20 --metrics_name match_ss --metrics_list 1 --proteinname XX_run1_0254_0003 --iteration 30
2. cRMSD
Design a sequence that folds into a target structure based on cRMSD.
CUDA_VISIBLE_DEVICES=0 python refinement.py --decoding momst --repeatnum 20 --duplicate 20 --metrics_name crmsd --metrics_list 1 --proteinname 5KPH --iteration 40
1. Globularity + pLDDT
Combining globularity with pLDDT favors compact, confidently predicted folds, which suits stable scaffold design.
CUDA_VISIBLE_DEVICES=0 python refinement.py --decoding momst --repeatnum 10 --duplicate 20 --metrics_name globularity,plddt --metrics_list 1,1 --iteration 20 --seq_length 150
2. Hydrophobicity + Surface Exposure + pLDDT
The hydrophobicity, surface-exposure, and pLDDT combination suits therapeutic protein design, where the goals are high structural stability, good solubility, and reduced aggregation-mediated immunogenicity risk.
CUDA_VISIBLE_DEVICES=0 python refinement.py --decoding momst --repeatnum 10 --duplicate 20 --metrics_name hydrophobic,surface_expose,plddt --metrics_list 1,1,1 --iteration 20 --seq_length 150
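
When scripting several runs, the command line can be assembled programmatically. The sketch below only rebuilds the command shown above from the documented flags. The helper `build_command` is hypothetical and not part of the repository, and it does not launch anything.

```python
import shlex
from typing import Dict, List, Optional


def build_command(
    metrics: Dict[str, float],
    repeatnum: int,
    duplicate: int,
    iteration: int,
    seq_length: Optional[int] = None,
    proteinname: Optional[str] = None,
    decoding: str = "momst",
) -> List[str]:
    """Assemble an argv list for refinement.py from the documented flags."""
    cmd = [
        "python", "refinement.py",
        "--decoding", decoding,
        "--repeatnum", str(repeatnum),
        "--duplicate", str(duplicate),
        "--metrics_name", ",".join(metrics),
        "--metrics_list", ",".join(f"{w:g}" for w in metrics.values()),
    ]
    if proteinname is not None:
        cmd += ["--proteinname", proteinname]
    cmd += ["--iteration", str(iteration)]
    if seq_length is not None:
        cmd += ["--seq_length", str(seq_length)]
    return cmd


cmd = build_command(
    {"hydrophobic": 1, "surface_expose": 1, "plddt": 1},
    repeatnum=10, duplicate=20, iteration=20, seq_length=150,
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run`, with `CUDA_VISIBLE_DEVICES` set through its `env` argument, to execute the run.
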
Our codebase builds heavily on RERD, EvoDiff, OpenFold, and ESMFold.


