Skip to content

GFNOrg/gflownet

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Flow Network based Generative Models for Non-Iterative Diverse Candidate Generation

Implementation for our paper, submitted to NeurIPS 2021 (also check this high-level blog post).

This is a minimum working version of the code used for the paper, which is extracted from the internal repository of the Mila Molecule Discovery project. Original commits are lost here, but the credit for this code goes to @bengioe, @MJ10 and @MKorablyov (see paper).

Note: for more modern implementations of GFlowNet, check out recursionpharma/gflownet, saleml/gfn, and alexhernandezgarcia/gflownet.

Grid experiments

Requirements for base experiments:

  • torch numpy scipy tqdm

Additional requirements for active learning experiments:

  • botorch gpytorch

Molecule experiments

Additional requirements:

  • pandas rdkit torch_geometric h5py ray
  • a few biochemistry programs, see mols/Programs/README

For rdkit in particular we found it to be easier to install through (mini)conda, but rdkit-pypi also works on pip in a vanilla python virtual environment. torch_geometric has non-trivial installation instructions.

If you have CUDA 10.1 configured, you can run pip install -r requirements.txt. You can also change requirements.txt to match your CUDA version. (Replace cu101 to cuXXX, where XXX is your CUDA version).

We compress the 300k molecule dataset for size. To uncompress it, run cd mols/data/; gunzip docked_mols.h5.gz.