Sample code for Constrained Graph Variational Autoencoders

Constrained Graph Variational Autoencoders for Molecule Design

This repository contains our implementation of Constrained Graph Variational Autoencoders for Molecule Design (CGVAE).

  title={Constrained Graph Variational Autoencoders for Molecule Design},
  author={Liu, Qi and Allamanis, Miltiadis and Brockschmidt, Marc and Gaunt, Alexander L.},
  journal={The Thirty-second Conference on Neural Information Processing Systems},


This code was tested in Python 3.5 with TensorFlow 1.3. It also requires conda, docopt and rdkit. A Bash script is provided to install all of these requirements.

source ./

To evaluate SAS scores, use to download the SAS implementation from rdkit

Data Extraction

This code uses three datasets: QM9, ZINC and CEPDB. To download CEPDB, please refer to CEPDB.

For downloading QM9 and ZINC, please go to data directory and run and, respectively.



Running CGVAE

We provide two settings of CGVAE. The first setting samples one breadth-first search (BFS) path for each molecule. The second setting samples transitions from multiple BFS paths for each molecule.
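To make the difference between the two settings concrete, the sketch below samples breadth-first visiting orders of a small molecular graph. It is an illustration only, not the repository's code: the function name `bfs_node_order`, the adjacency-list representation and the toy ethanol graph are all assumptions made for this example.

```python
import random
from collections import deque


def bfs_node_order(adjacency, start=None, rng=random):
    """Return one breadth-first visiting order of a connected graph.

    `adjacency` maps each node to a list of its neighbours. Neighbours are
    shuffled before being queued, so repeated calls can yield different
    BFS paths for the same graph.
    """
    if start is None:
        start = rng.choice(sorted(adjacency))
    order, seen, queue = [], {start}, deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        neighbours = list(adjacency[node])
        rng.shuffle(neighbours)
        for nxt in neighbours:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order


# Toy heavy-atom graph of ethanol (C0-C1-O2), used only for illustration.
ethanol = {0: [1], 1: [0, 2], 2: [1]}

# First setting (sketch): a single BFS path per molecule.
single_path = bfs_node_order(ethanol, start=0)

# Second setting (sketch): several BFS paths per molecule, each from a
# randomly chosen start node with randomly ordered neighbours.
multiple_paths = [bfs_node_order(ethanol, rng=random.Random(seed)) for seed in range(3)]
```

Each order is one way to present the molecule to the model node by node; sampling several orders exposes the model to more than one generation trace per molecule.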

To train and generate molecules using the first setting, use

python --dataset qm9|zinc|cep

To skip training and generate molecules with a pretrained model, use

python --dataset qm9|zinc|cep --restore pretrained_model --config '{"generation": true}'

To train and generate molecules using the second setting, use

python --dataset qm9|zinc|cep --config '{"sample_transition": true, "multi_bfs_path": true, "path_random_order": true}'
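The --config argument in the commands above takes a JSON object whose keys override default settings. A minimal sketch of how such an overlay could work is shown below. The default values listed here are placeholders rather than the repository's actual defaults, `build_params` is a hypothetical helper, and rejecting unknown keys is an assumption of this sketch.

```python
import json


def default_params():
    # Placeholder defaults for illustration only; the real values are
    # defined in the repository's own default_params function.
    return {
        "generation": False,
        "sample_transition": False,
        "multi_bfs_path": False,
        "path_random_order": False,
        "optimization_step": 0,
    }


def build_params(config_json=None):
    """Overlay a JSON string (as passed to --config) on the defaults."""
    params = default_params()
    if config_json:
        overrides = json.loads(config_json)
        # Assumption of this sketch: a misspelt key is an error rather
        # than a silently ignored setting.
        unknown = set(overrides) - set(params)
        if unknown:
            raise KeyError("unknown config keys: %s" % sorted(unknown))
        params.update(overrides)
    return params


params = build_params(
    '{"sample_transition": true, "multi_bfs_path": true, "path_random_order": true}'
)
```

Note that the JSON literals are lowercase `true` and `false`, and that the whole object is wrapped in single quotes on the command line so the shell passes it through unchanged.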

To use optimization in the latent space, set optimization_step to a positive number

python --dataset qm9|zinc|cep --restore pretrained_model --config '{"generation": true, "optimization_step": 50}'
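This README does not spell out the update rule behind optimization_step. One common reading is a fixed number of gradient-ascent steps on a latent vector, guided by a differentiable property predictor. The sketch below illustrates that idea under this assumption; `optimize_latent`, the step size and the quadratic stand-in for the predictor are all hypothetical and are not taken from the repository.

```python
def optimize_latent(z, property_grad, steps, step_size=0.1):
    """Run `steps` gradient-ascent updates on a latent vector z."""
    z = list(z)
    for _ in range(steps):
        grad = property_grad(z)
        z = [zi + step_size * gi for zi, gi in zip(z, grad)]
    return z


# Hypothetical stand-in for a learned property predictor:
# score(z) = -sum((z_i - t_i)^2), which is maximised at z == target.
target = [1.0, -2.0]


def score(z):
    return -sum((zi - ti) ** 2 for zi, ti in zip(z, target))


def score_grad(z):
    # Analytic gradient of score with respect to z.
    return [-2.0 * (zi - ti) for zi, ti in zip(z, target)]


z0 = [0.0, 0.0]
z_opt = optimize_latent(z0, score_grad, steps=50)  # mirrors "optimization_step": 50
```

In this toy setting the optimized vector scores higher than the starting point; in a real model the optimized latent vector would then be decoded into a molecule.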

More configurations can be found at function default_params in


To evaluate the generated molecules, use

python --dataset qm9|zinc|cep

Pretrained Models and Generated Molecules

We will provide pretrained models and generated molecules soon.

A program in the molecules folder is provided to read and visualize the molecules

python molecule_file output_file


Please submit a GitHub issue or contact


This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit

When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact with any additional questions or comments.