PyTorch implementation of "Generating diverse molecular de novo structures using reinforcement learning"

This model is similar to the one we used in "Molecular De Novo Design through Deep Reinforcement Learning". Its implementation differs from the one in the paper in several ways:

 * The GRU model has an embedding layer
 * Scores lie in (0, 1) rather than (-1, 1)
 * Only unique sequences are considered, i.e. if the same sequence is generated twice, it
   still only contributes to the loss once.
 * Sequences are penalized for being very likely. Together with the point above, this makes
   training much more robust against getting stuck in local minima, and often very high
   values of sigma can be used if needed.
 * Prioritized experience replay is implemented. This is a little unusual for policy gradient
   methods, since they are sensitive to how often an action is taken, but it works well in
   some cases. (It is deactivated by default.)
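
The "unique sequences only" rule in the list above can be sketched in plain Python. This is a minimal illustration, not the code from this repository: the helper names `unique_indices` and `batch_loss` are hypothetical, and the loss form (squared difference between the agent likelihood and an augmented likelihood `prior + sigma * score`) is assumed from the formulation in the earlier paper. The likelihood penalty and experience replay are not shown.

```python
def unique_indices(sequences):
    """Return the index of the first occurrence of each distinct sequence."""
    seen = set()
    keep = []
    for i, seq in enumerate(sequences):
        if seq not in seen:
            seen.add(seq)
            keep.append(i)
    return keep


def batch_loss(sequences, prior_ll, agent_ll, scores, sigma):
    """Mean squared loss over the unique sequences of a batch.

    prior_ll / agent_ll: per-sequence log-likelihoods under Prior and Agent.
    scores: per-sequence scores, assumed to lie in (0, 1).
    sigma: weight of the score in the augmented likelihood (assumed form).
    """
    keep = unique_indices(sequences)
    if not keep:
        return 0.0
    losses = []
    for i in keep:
        augmented = prior_ll[i] + sigma * scores[i]  # assumed: prior + sigma * score
        losses.append((augmented - agent_ll[i]) ** 2)
    return sum(losses) / len(losses)
```

With a batch such as `["CCO", "c1ccccc1", "CCO"]`, the second `CCO` is dropped before averaging, so a sequence the Agent samples repeatedly does not dominate the update.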

Install

A Conda environment.yml is supplied with all the required libraries.

git clone https://github.com/tblaschke/reinvent
cd reinvent
conda env create -f environment.yml
conda activate reinvent

General usage

### Use the provided ChEMBL model

We already provide a model that has been reasonably well trained on ChEMBL. To get started, we recommend using this model and experimenting with the different scoring functions.

Run reinforce_model.py to start reinforcement learning and generate new structures.
(Optional) Check out Vizor (https://github.com/tblaschke/vizor) to visualize the reinforcement learning run.

### Create a new model

You may want to train a model on your own set of compounds. To do so, follow these steps:

Use create_model.py to preprocess a SMILES file and to build an untrained model.
Use train_model.py to train your model on a SMILES file.

BONUS: train_model.py also lets you do transfer learning on any model: simply train an already trained Prior a second time on a small subset of compounds.

Run reinforce_model.py to start reinforcement learning and generate new structures.
(Optional) Check out Vizor (https://github.com/tblaschke/vizor) to visualize the reinforcement learning run.
