
RJT-RL: De novo molecular design using a Reversible Junction Tree and Reinforcement Learning

Implementation of "Molecular design method using a reversible tree representation of chemical compounds and deep reinforcement learning" by Ryuichiro Ishitani, Toshiki Kataoka, and Kentaro Rikimaru.
(Paper: https://doi.org/10.1021/acs.jcim.2c00366)

Environment

This package runs under the following environment:

Other package dependencies are described in requirements.txt.

To install the package and dependencies:

pip install .

Usage

Download dataset and model weights

If you want to use the policy pretrained on the Zinc250k dataset, download the dataset and model weight files from the Zenodo DOI and save them to the data directory.
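A minimal sketch of this step is shown below. The Zenodo record URL and file names are placeholders, not the actual ones; substitute the link and file names listed on the Zenodo record.

```python
# Hypothetical download sketch; replace ZENODO_URL and FILES with the
# actual record link and file names from the Zenodo page.
import urllib.request
from pathlib import Path

ZENODO_URL = "https://zenodo.org/record/<record-id>/files"  # placeholder
FILES = ["dataset.pkl", "model_weights.pt"]  # placeholder names

data_dir = Path("data")
data_dir.mkdir(exist_ok=True)
for name in FILES:
    urllib.request.urlretrieve(f"{ZENODO_URL}/{name}", data_dir / name)
```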

Pretraining the policy network

Caution: if you are not using a different dataset and have already downloaded the dataset and pretrained model files above, you can skip this section.

To pretrain the policy network using Zinc250k dataset:

  1. Download the dataset into the data directory:

     mkdir data
     wget https://raw.githubusercontent.com/aspuru-guzik-group/chemical_vae/main/models/zinc/250k_rndm_zinc_drugs_clean_3.csv -O data/zinc250k.csv

  2. Preprocess the CSV file containing the SMILES strings, creating the vocabulary file and pkl files containing the preprocessed mols (a rough conceptual sketch of steps 2 and 3 is shown after this list):

     bash examples/run_prep_dataset.sh

     The files will be created in the "results" directory.

  3. Create the workers' pkl files for the pretraining from the preprocessed pkl files. This example assumes 16 worker processes are used in the training:

     bash examples/pretrain/run_create_expert_dataset.sh

  4. Run the pretraining of the policy network using the created dataset files:

     bash examples/pretrain/run_pretrain_policy.sh

You may change the size of the hidden vectors (128 in the example) and the number of worker processes (16 in the example) depending on your dataset and/or environment.
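The sketch below is a rough conceptual illustration of steps 2 and 3, not the actual pipeline in examples/run_prep_dataset.sh or examples/pretrain/run_create_expert_dataset.sh: it parses the SMILES column of the Zinc250k CSV with RDKit, pickles the valid molecules, and splits them round-robin into 16 per-worker files. The output file names are hypothetical, and the real scripts also create the vocabulary file, which this sketch omits.

```python
# Conceptual sketch of preprocessing (step 2) and worker sharding (step 3).
import pickle
from pathlib import Path

import pandas as pd
from rdkit import Chem

out = Path("results")
out.mkdir(exist_ok=True)

# Step 2 (sketch): parse the SMILES strings and keep the valid molecules.
df = pd.read_csv("data/zinc250k.csv")
mols = [m for m in (Chem.MolFromSmiles(s.strip()) for s in df["smiles"])
        if m is not None]
with open(out / "preprocessed_mols.pkl", "wb") as f:  # hypothetical name
    pickle.dump(mols, f)

# Step 3 (sketch): split the molecules round-robin into one pkl per worker.
n_workers = 16
for i in range(n_workers):
    with open(out / f"expert_dataset_{i:02d}.pkl", "wb") as f:  # hypothetical
        pickle.dump(mols[i::n_workers], f)
```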

Training and exploration of molecules

To train the policy using a specific reward function, run the scripts contained in the corresponding subdirectories.
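As an illustration of what a reward function might look like (a generic sketch, not one of the repository's reward scripts), QED drug-likeness is a common reward for de novo molecular design and is easy to compute with RDKit:

```python
# Illustrative reward: QED drug-likeness in [0, 1], returning 0.0 for
# SMILES strings that RDKit cannot parse.
from rdkit import Chem
from rdkit.Chem import QED

def qed_reward(smiles: str) -> float:
    mol = Chem.MolFromSmiles(smiles)
    return QED.qed(mol) if mol is not None else 0.0

print(qed_reward("CCO"))  # ethanol
```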

Citation

If you find our work relevant to your research, please cite:

@article{ishitani2022rjtrl,
    title={Molecular design method using a reversible tree representation of chemical compounds and deep reinforcement learning},
    author={Ryuichiro Ishitani and Toshiki Kataoka and Kentaro Rikimaru},
    year={2022},
    journal={J. Chem. Inf. Model.},
    doi={10.1021/acs.jcim.2c00366}
}
