Official/author-maintained partial code release for:
MoLA: Molecular multimodal layerwise adaptive network for molecular property prediction
Jiayi Li, Zihang Zhang, Zhenyu Lei, Jiujun Cheng, Lianbo Ma, Cong Liu, Shangce Gao
Knowledge-Based Systems, Volume 338, 115563, 2026.
- DOI: https://doi.org/10.1016/j.knosys.2026.115563
- ScienceDirect: https://www.sciencedirect.com/science/article/pii/S0950705126003059
- Share Link: https://authors.elsevier.com/c/1md~S3OAb9Kb5p
- Multimodal molecular representation learning with:
  - molecular graph features
  - Morgan fingerprint features
  - SMILES sequence features
  - optional MoLFormer embeddings
- Layerwise adaptive fusion with cross-layer attention.
- Reproducible training script for MoleculeNet classification tasks.
- Built-in ablation switch for MoLFormer: `--use-molformer` (default) / `--no-molformer`
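The SMILES-sequence branch can be illustrated with a minimal character-level encoder. This is a sketch, not the repository's actual featurizer; the vocabulary construction, special tokens, and padding scheme are assumptions.

```python
# Minimal character-level SMILES encoding sketch (illustrative only;
# the real featurizer used by main_Cla_MoLA.py may tokenize differently).
def build_vocab(smiles_list):
    """Build a character vocabulary with pad/unk special tokens."""
    chars = sorted({ch for s in smiles_list for ch in s})
    vocab = {"<pad>": 0, "<unk>": 1}
    vocab.update({ch: i + 2 for i, ch in enumerate(chars)})
    return vocab

def encode_smiles(smiles, vocab, max_len=128):
    """Map each character to an integer id, then truncate/right-pad to max_len."""
    ids = [vocab.get(ch, vocab["<unk>"]) for ch in smiles]
    ids = ids[:max_len]                              # truncate long sequences
    ids += [vocab["<pad>"]] * (max_len - len(ids))   # right-pad short ones
    return ids

if __name__ == "__main__":
    vocab = build_vocab(["CCO", "c1ccccc1O"])
    print(encode_smiles("CCO", vocab, max_len=8))  # 3 character ids, then pad zeros
```

The `max_len` parameter plays the same role as the `--max-len` training flag: all sequences in a batch are brought to a fixed length.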
```
.
|-- main_Cla_MoLA.py                  # training/evaluation entry
|-- model_MoLA.py                     # MoLA architecture
|-- prepare_molformer_embeddings.py   # build MoLFormer embeddings (reference utility)
|-- prepare_MoLFormer.py              # backward-compatible launcher
|-- environment.yml                   # base conda environment specification
|-- setup_env.sh                      # one-command environment bootstrap
|-- utils.py                          # metrics (PRC-AUC / ROC-AUC)
|-- datasets/
|   |-- featurized/                   # DeepChem cached data
|   `-- Molformer_OUTPUT/             # optional MoLFormer embeddings
|-- result_Cla/                       # training logs
`-- weights/                          # checkpoints
```
Requirements:

- conda (or mamba)
- Python 3.10
- CUDA 12.1 (for GPU training with the default setup)
- Linux/macOS shell (or Git Bash on Windows) for `setup_env.sh`

Setup:

```bash
bash setup_env.sh
conda activate chem
```

The training script uses DeepChem MoleculeNet loaders and expects:
- DeepChem cache directory: `./datasets/featurized/...`
- (Optional) MoLFormer embedding file: `./datasets/Molformer_OUTPUT/<dataset_name>/Molformer_Emb_2025.h5`
  - H5 keys: `train_fp`, `valid_fp`, `test_fp`

This repository provides `prepare_molformer_embeddings.py` to compute the embeddings and save them to `datasets/Molformer_OUTPUT/<dataset_name>/Molformer_Emb_2025.h5`.
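The expected H5 layout (datasets `train_fp`, `valid_fp`, `test_fp`) can be sketched with `h5py`. The embedding width and array contents below are placeholders, not values from the repository.

```python
import numpy as np
import h5py

def write_embedding_file(path, train, valid, test):
    """Write the three splits under the key names the training script expects."""
    with h5py.File(path, "w") as f:
        f.create_dataset("train_fp", data=train)
        f.create_dataset("valid_fp", data=valid)
        f.create_dataset("test_fp", data=test)

def read_embedding_file(path):
    """Load all three splits back into NumPy arrays."""
    with h5py.File(path, "r") as f:
        return {k: f[k][()] for k in ("train_fp", "valid_fp", "test_fp")}

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Dummy embeddings: one row per molecule; the 768-dim width is an assumption.
    splits = {name: rng.standard_normal((n, 768)).astype("float32")
              for name, n in (("train", 10), ("valid", 2), ("test", 3))}
    write_embedding_file("Molformer_Emb_demo.h5",
                         splits["train"], splits["valid"], splits["test"])
    print({k: v.shape for k, v in read_embedding_file("Molformer_Emb_demo.h5").items()})
```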
Notes:

- This script is a project helper for MoLA and is provided for reference.
- It must be used together with the official IBM MoLFormer release.
Run:

```bash
python prepare_molformer_embeddings.py
```

Training with MoLFormer embeddings (default):

```bash
python main_Cla_MoLA.py \
    --use-molformer \
    --run-times 1 \
    --epochs 300 \
    --batch-size 256
```

Ablation without MoLFormer:

```bash
python main_Cla_MoLA.py \
    --no-molformer \
    --run-times 1 \
    --epochs 300 \
    --batch-size 256
```

- Dataset loaders currently configured: `bace_classification`, `bbbp`, `clintox`, `muv`, `sider`, `tox21`
- `main_Cla_MoLA.py` does not require `--datasets`; it runs the built-in dataset list.
- Seed control: `--seed` (default: 2025)
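A typical way such a `--seed` flag is applied is to seed every random-number source at startup. This is a hedged sketch; how `main_Cla_MoLA.py` actually seeds is not visible in this snapshot.

```python
import os
import random

import numpy as np

def set_seed(seed: int = 2025) -> None:
    """Seed Python, NumPy, and the hash seed so repeated runs are comparable.
    If the repo uses PyTorch, torch.manual_seed(seed) and
    torch.cuda.manual_seed_all(seed) would also belong here (assumption)."""
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    np.random.seed(seed)

if __name__ == "__main__":
    set_seed(2025)
    a = np.random.rand(3)
    set_seed(2025)
    b = np.random.rand(3)
    assert np.allclose(a, b)  # identical draws after re-seeding
```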
- Typical output paths:
  - logs: `result_Cla/<model_name>/result_<model_name>_<run_id>.txt`
  - checkpoints: `weights/<model_name>_<dataset>_<run_id>.pth`
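The log and checkpoint path patterns can be built with small helpers like the following. These helper names are hypothetical, for illustration only.

```python
import os

def log_path(model_name: str, run_id: int) -> str:
    """Path matching result_Cla/<model_name>/result_<model_name>_<run_id>.txt."""
    return os.path.join("result_Cla", model_name,
                        f"result_{model_name}_{run_id}.txt")

def ckpt_path(model_name: str, dataset: str, run_id: int) -> str:
    """Path matching weights/<model_name>_<dataset>_<run_id>.pth."""
    return os.path.join("weights", f"{model_name}_{dataset}_{run_id}.pth")

if __name__ == "__main__":
    print(log_path("MoLA", 1))             # e.g. result_Cla/MoLA/result_MoLA_1.txt on POSIX
    print(ckpt_path("MoLA", "bbbp", 1))
```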
Command-line arguments:

- `--model-name`: experiment name prefix
- `--run-times`: repeated runs
- `--epochs`: maximum epochs
- `--batch-size`: mini-batch size
- `--embed-dim`: hidden width
- `--num-layers`: encoder depth
- `--max-len`: SMILES max length
- `--fp-bits`: Morgan fingerprint bits (default 2048)
- `--lr`: learning rate
- `--weight-decay`: weight decay
- `--scheduler-patience`: LR scheduler patience
- `--early-stop-patience`: early-stopping patience
- `--device`: `cuda` or `cpu`
- `--use-molformer` / `--no-molformer`: whether to use MoLFormer embeddings
This repository is a partial release and currently includes limited prepared data (mainly BACE-related files in this snapshot). It focuses on core model/training logic and reproducible experiments.
If you use this codebase, please cite:
```bibtex
@article{li2026mola,
  title   = {{MoLA}: Molecular multimodal layerwise adaptive network for molecular property prediction},
  author  = {Li, Jiayi and Zhang, Zihang and Lei, Zhenyu and Cheng, Jiujun and Ma, Lianbo and Liu, Cong and Gao, Shangce},
  journal = {Knowledge-Based Systems},
  volume  = {338},
  pages   = {115563},
  year    = {2026},
  doi     = {10.1016/j.knosys.2026.115563}
}
```