LSDC

1. Data processing and data partitioning

(1) deal_SMILES.py

First, run deal_SMILES.py to process the initial data set.

(2) data_structs.py

Next, run data_structs.py to continue cleaning the data.

Note: do not use the vocabulary (voc) file generated at this stage; the vocabulary is built in step (4).

(3) calculate_qed_sa.py

Run calculate_qed_sa.py to calculate properties such as QED, SA, and logP. After obtaining test.csv, calculate the same properties (QED, SA, logP, etc.) for it as well and add them to the data set.
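As a rough illustration of adding property columns to a data set, the sketch below appends one column per property calculator to each row. It is not the code in calculate_qed_sa.py: the function names, the `SMILES` column name, and the placeholder calculators are assumptions made so the example runs without extra dependencies. In practice the calculators would typically wrap RDKit calls such as `QED.qed` and `Crippen.MolLogP` on a molecule parsed with `Chem.MolFromSmiles`, and SA is commonly computed with the `sascorer` module from RDKit's Contrib directory.

```python
import csv
import io


def add_property_columns(rows, calculators, smiles_key="SMILES"):
    """Return new rows with one extra column per calculator.

    rows: iterable of dicts, each holding a SMILES string under smiles_key.
    calculators: dict mapping a column name to a function SMILES -> float.
    """
    out = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        for name, fn in calculators.items():
            new_row[name] = fn(row[smiles_key])
        out.append(new_row)
    return out


def rows_to_csv(rows):
    """Serialize a non-empty list of dict rows to CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Placeholder calculators (hypothetical): stand-ins for real QED / SA / logP functions.
demo_calculators = {
    "length": lambda smi: float(len(smi)),
    "n_carbons": lambda smi: float(smi.upper().count("C")),
}
demo_rows = [{"SMILES": "c1ccccc1"}, {"SMILES": "CCO"}]
demo_out = add_property_columns(demo_rows, demo_calculators)
```

Passing the calculators in as a dictionary keeps the column logic independent of the cheminformatics toolkit, so the same function can be reused for the training data and for test.csv.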

(4) 4_bulid_voc

Build the vocabulary (character library), then split the data into training and validation sets.
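The sketch below shows one common way to build a token vocabulary from SMILES strings and to split a data set into train and valid parts. It is an illustration under stated assumptions, not the code in 4_bulid_voc: the regular expression, the special tokens, the 10% validation fraction, and all function names are hypothetical choices.

```python
import random
import re

# Match multi-character tokens first: bracket atoms, then two-letter halogens,
# then fall back to any single character.
TOKEN_RE = re.compile(r"\[[^\]]+\]|Br|Cl|.")


def tokenize(smiles):
    """Split a SMILES string into tokens."""
    return TOKEN_RE.findall(smiles)


def build_vocab(smiles_list, special_tokens=("<pad>", "<start>", "<end>")):
    """Map every token to an integer index, special tokens first."""
    tokens = set()
    for smi in smiles_list:
        tokens.update(tokenize(smi))
    ordered = list(special_tokens) + sorted(tokens)
    return {tok: idx for idx, tok in enumerate(ordered)}


def split_train_valid(items, valid_fraction=0.1, seed=0):
    """Shuffle reproducibly, then hold out a fraction as the validation set."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_valid = max(1, int(len(shuffled) * valid_fraction))
    return shuffled[n_valid:], shuffled[:n_valid]
```

Treating `Cl`, `Br`, and bracket atoms such as `[C@@H]` as single tokens keeps the vocabulary chemically meaningful, and a fixed seed makes the split repeatable across runs.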

2. Prior_train.py

Use the positive and negative samples for NLRP3 to train the prior (pre-trained) model.

3. Generator_train.py

Use the trained generative model to generate a large number of compounds targeting NLRP3.

4. Dm_train.py

Use the compounds generated in the previous step as an expanded training set to train the distillation model.

5. LSDC-agent_train.py

Use LSDC's reinforcement learning strategy to train the agent model and improve the scaffold (skeleton) diversity of the generated molecules.
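To make "scaffold diversity" concrete, the sketch below computes the fraction of unique scaffolds in a batch and shows a simple scaffold memory that zeroes the reward once a scaffold has been generated too often. This is a generic illustration, similar in spirit to the scaffold-memory filters used in some reinforcement-learning generators; it is not taken from LSDC-agent_train.py, and the class name, the `bucket_size` parameter, and the use of plain strings as scaffolds are assumptions.

```python
from collections import Counter


def unique_scaffold_ratio(scaffolds):
    """Fraction of distinct scaffolds among generated molecules (0.0 if empty)."""
    if not scaffolds:
        return 0.0
    return len(set(scaffolds)) / len(scaffolds)


class ScaffoldMemory:
    """Zero the reward of molecules whose scaffold has already filled its bucket."""

    def __init__(self, bucket_size=2):
        self.bucket_size = bucket_size  # how many molecules per scaffold keep their reward
        self.counts = Counter()

    def penalized_reward(self, scaffold, reward):
        self.counts[scaffold] += 1
        if self.counts[scaffold] > self.bucket_size:
            return 0.0  # scaffold is over-represented, so stop rewarding it
        return reward
```

With `bucket_size=2`, the first two molecules sharing a scaffold keep their reward and later ones receive zero, which pushes the agent toward scaffolds it has not yet explored.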

References

[1] Jike Wang et al. Multi-constraint molecular generation based on conditional transformer, knowledge distillation and reinforcement learning. Nature Machine Intelligence, 3, 914–922 (2021).

