ConfVAE for Conformation Generation


[arXiv] [Code]

This is the official code repository for our ICML 2021 paper, "An End-to-End Framework for Molecular Conformation Generation via Bilevel Programming".


Install via Conda (Recommended)

You can follow the instructions here. We use the same environment as in one of our previous projects.

Install Manually

# Create conda environment
conda create --name ConfVAE python=3.7

# Activate the environment
conda activate ConfVAE

# Install packages
conda install pytorch torchvision cudatoolkit=10.2 -c pytorch
conda install rdkit==2020.03.3 -c rdkit
conda install tqdm networkx scipy scikit-learn h5py tensorboard -c conda-forge
pip install torchdiffeq==0.0.1

# Install PyTorch Geometric
conda install pytorch-geometric -c rusty1s -c conda-forge
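
After a manual install, it can help to confirm that every package is visible to the interpreter. The following is a minimal sketch, not part of this repository: the helper name `find_missing` is hypothetical, and the import names in `REQUIRED` are the usual module names for the packages installed above (for example, `sklearn` for scikit-learn and `torch_geometric` for PyTorch Geometric).

```python
import importlib.util

# Import names corresponding to the packages installed above.
REQUIRED = [
    "torch", "torchvision", "rdkit", "tqdm", "networkx", "scipy",
    "sklearn", "h5py", "tensorboard", "torchdiffeq", "torch_geometric",
]

def find_missing(names):
    """Return the names that cannot be located, without importing them."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = find_missing(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages were found.")
```

Run it inside the activated ConfVAE environment. It only checks that the modules can be located, not that their versions match the ones pinned above.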


Official Datasets

The official datasets are available here.

Input Format / Make Your Own Datasets

The dataset file is a pickled Python list of rdkit.Chem.rdchem.Mol objects. Each conformation is stored as its own Mol object. For example, if a dataset contains 3 molecules with 4, 5, and 6 conformations respectively, the pickled list contains 4 + 5 + 6 = 15 Mol objects in total.
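
The flat layout can be illustrated with a small sketch. To keep it runnable without RDKit, plain strings stand in for the Mol objects. The helper `count_conformations` and its `key` argument are hypothetical and not part of this repository. With real data, `key` could be a function such as `rdkit.Chem.MolToSmiles`, so that entries belonging to the same molecule share one key.

```python
import io
import pickle
from collections import Counter

def count_conformations(dataset, key):
    """Count how many list entries (conformations) share each molecule key."""
    return Counter(key(item) for item in dataset)

# Stand-in dataset: 3 molecules with 4, 5, and 6 conformations.
# In a real file, every entry would be an rdkit.Chem.rdchem.Mol
# that holds exactly one conformation.
dataset = ["CCO"] * 4 + ["CCN"] * 5 + ["CCC"] * 6

# Round-trip through pickle, since dataset files are pickled lists.
buffer = io.BytesIO()
pickle.dump(dataset, buffer)
buffer.seek(0)
loaded = pickle.load(buffer)

counts = count_conformations(loaded, key=lambda item: item)
total = len(loaded)
print(total, dict(counts))
```

For an actual dataset file, replace the in-memory buffer with `open(path, "rb")` and the stand-in strings with Mol objects.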

Output Format

The output format is identical to the input format.



Example: training a model for QM9 molecules.

python \
    --train_dataset ./data/qm9/train_QM9.pkl \
    --val_dataset ./data/qm9/val_QM9.pkl

More training options can be found in

Generate Conformations

Example: for each molecule in the QM9 test split, generate twice as many conformations as that molecule has in the test set.

python \
    --ckpt ./logs/VAE_QM9 \
    --dataset ./data/iclr/qm9/test_QM9.pkl \
    --num_samples -2

More generation options can be found in
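
The example above pairs "twice the number in the test set" with `--num_samples -2`. One way to read this is sketched below, as an assumption rather than the script's confirmed behavior: a negative value acts as a multiplier on the molecule's reference conformation count, and a positive value is an absolute count. The function name `resolve_num_samples` is hypothetical.

```python
def resolve_num_samples(num_samples: int, num_reference: int) -> int:
    """Return how many conformations to generate for one molecule.

    Assumption (not confirmed by the repository): a negative value is a
    multiplier on the reference conformation count, and a positive value
    is used as an absolute count.
    """
    if num_samples == 0:
        raise ValueError("num_samples must be non-zero")
    if num_samples < 0:
        return -num_samples * num_reference
    return num_samples

# A molecule with 5 test-set conformations and --num_samples -2.
n = resolve_num_samples(-2, 5)
print(n)
```

Under this reading, a molecule with 5 reference conformations would receive 10 generated ones.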


Please consider citing our work if you find it helpful.

  title={An End-to-End Framework for Molecular Conformation Generation via Bilevel Programming},
  author={Xu, Minkai and Wang, Wujie and Luo, Shitong and Shi, Chence and Bengio, Yoshua and Gomez-Bombarelli, Rafael and Tang, Jian},
  booktitle={International Conference on Machine Learning},


If you have any questions, please contact me at or

📢 Attention

Please also check out our concurrent work on molecular conformation generation, which was also accepted at ICML 2021 (Long Talk): Learning Gradient Fields for Molecular Conformation Generation. [Code]