This is the code repository for our paper "Empirical Evidence for the Fragment level understanding on Drug Molecular Structure of LLMs", which was accepted at the AAAI 2024 LLMs4Bio workshop.
Requirements:
pytorch==1.12.1
rdkit==2020.03
tqdm
tensorboard
guacamol
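As a quick sanity check before running anything, the pinned versions above can be compared against what is installed. The following is a minimal sketch, not part of the repository: the helper name `check_pins` is hypothetical, the distribution names in `PINS` are assumptions (PyTorch is published on PyPI as `torch`; the name under which `rdkit` registers itself can depend on how it was installed), and versions are matched by prefix so that a local build such as `1.12.1+cu113` still passes.

```python
from importlib import metadata

# Pins copied from the requirements list. The distribution names used as keys
# are assumptions and may need adjusting for your environment.
PINS = {"torch": "1.12.1", "rdkit": "2020.03"}


def check_pins(pins, get_version=metadata.version):
    """Return {name: (expected, found)} for each missing or mismatched package.

    `found` is None when the package is not installed. Versions are compared
    by prefix, so "1.12.1+cu113" satisfies the pin "1.12.1".
    """
    problems = {}
    for name, expected in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found is None or not found.startswith(expected):
            problems[name] = (expected, found)
    return problems


# Example: print anything that does not match the pins.
# for name, (expected, found) in check_pins(PINS).items():
#     print(f"{name}: expected {expected}, found {found}")
```

An empty result means every pinned package was found with a matching version prefix.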
The ChEMBL dataset can be downloaded from the ChEMBL database.
Pretraining:
python codes/pretrain.py
Generation (one run per --task_id):
python codes/generator_guacamol.py --task_id 0
python codes/generator_guacamol.py --task_id 1
python codes/generator_guacamol.py --task_id 2
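The three generator commands above differ only in `--task_id`, so they can be launched in sequence from a small driver. This is a hedged sketch rather than a script shipped with the repository: the helper names `build_generation_commands` and `run_all` are hypothetical, and it assumes the script path and the `--task_id` flag exactly as they appear in the commands above, run from the repository root.

```python
import subprocess
import sys

# Script path taken from the commands above; assumed relative to the repo root.
SCRIPT = "codes/generator_guacamol.py"


def build_generation_commands(task_ids=(0, 1, 2), script=SCRIPT):
    """Build one argv list per task id, using the current Python interpreter."""
    return [[sys.executable, script, "--task_id", str(t)] for t in task_ids]


def run_all(commands):
    """Run each command in order; stop at the first one that fails."""
    for cmd in commands:
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure


# Uncomment to launch the three runs one after another:
# run_all(build_generation_commands())
```

Using `sys.executable` keeps the runs in the same environment as the driver, and `check=True` prevents a failed task from being silently skipped.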
See the .ipynb notebooks.