Skip to content

mathcom/ReBADD-SE

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

15 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

ReBADD-SE: Multi-objective Molecular Optimisation using SELFIES Fragment and Off-Policy Self-critical Sequence Training

This is the repository for ReBADD-SE, a multi-objective molecular optimization model that designs a molecular structures in the format of SELFIES. For more details, please refer to our paper.

  • Latest update: 26 Jan 2024

Install

conda env create -f environment.yml

Task Descriptions

  • TASK1: ReBADD-SE for GSK3b, JNK3, QED, and SA (frag-level)
  • TASK3: ReBADD-SE for BCL2, BCLXL, and BCLW (frag-level)
  • TASK4: ReBADD-SE for BCL2, BCLXL, and BCLW (char-level)
  • TASK7: SELFIES Collapse Analaysis between ReBADD-SE (frag-level) and GA+D

Notebook Descriptions

0_preprocess_data.ipynb

  • (Important!) Before starting any TASK, please first run the scripts in the directory 'data/chembl' or 'data/zinc15'
  • Read the training data
  • Preprocess the data for model training

1_pretraining.ipynb

  • Read the training data
  • The generator learns the grammar rules of SELFIES

2_optimize+{objectives}.ipynb

  • (Important!) Please check first the 'ReBADD_config.py' in which a reward function have to be defined appropriately
  • Load the pretrained generator

3_checkpoints+{objectives}.ipynb

  • Load the checkpoints stored during optimization
  • Sample molecules for each checkpoint

4_calculate_properties.ipynb

  • For each checkpoint, load the sampled molecules
  • Evaluate their property scores

5_evaluate_checkpoints.ipynb

  • Calculate metrics (e.g. success rate)
  • Find the best checkpoint

Note

If you have any further questions, please do not hesitate to let me know.

jonghwanc@hallym.ac.kr

Citation

@article{CHOI2023106721,
	title = {ReBADD-SE: Multi-objective molecular optimisation using SELFIES fragment and off-policy self-critical sequence training},
	journal = {Computers in Biology and Medicine},
	volume = {157},
	pages = {106721},
	year = {2023},
	issn = {0010-4825},
	doi = {https://doi.org/10.1016/j.compbiomed.2023.106721},
	url = {https://www.sciencedirect.com/science/article/pii/S0010482523001865},
	author = {Jonghwan Choi and Sangmin Seo and Seungyeon Choi and Shengmin Piao and Chihyun Park and Sung Jin Ryu and Byung Ju Kim and Sanghyun Park},
	keywords = {Drug discovery, De novo drug design, Multi-objective optimisation, SELFIES, Reinforcement learning}
}

About

ReBADD-SE: Multi-objective molecular optimisation using SELFIES fragment and off-policy self-critical sequence training

Resources

License

Stars

Watchers

Forks

Packages

No packages published