GitHub - uta-smile/CD-MVGNN

Data

Datasets used are from MoleculeNet [1]. We also include the csv files of the data in the data folder. [1] Wu, Zhenqin, Bharath Ramsundar, Evan N. Feinberg, Joseph Gomes, Caleb Geniesse, Aneesh S. Pappu, Karl Leswing, and Vijay Pande. "MoleculeNet: a benchmark for molecular machine learning." Chemical science 9, no. 2 (2018): 513-530.

Training example

### Generate rdkit features

python scripts/save_features.py --data_path bbbp.csv --features_generator rdkit_2d_normalized --save_path bbbp.npz --restart

### train CD-MVGNN model
CUDA_VISIBLE_DEVICES=0 python train.py --data_path ../data/bbbp.csv \
--features_path ../data/bbbp.npz --no_features_scaling \
--split_type scaffold_balanced --dataset_type classification --epochs 40 --metric auc \
--model_type dualmpnnplus --init_lr 1.5e-4 --max_lr 1e-3 --final_lr 1.5e-4 --batch_size 32 --depth 4 \
--hidden_size 4 --ffn_num_layers 4 --ffn_hidden_size 2 --dropout 0 --weight_decay 1e-7

The training settings for each dataset is also provided in the scripts folder, which can be run by

bash <dataset>.sh

Conda environment

The environment can be created by

conda env create -f environment.yaml

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
automl/legacy		automl/legacy
data		data
dglt		dglt
scripts		scripts
service		service
third_party		third_party
CHANGELOG		CHANGELOG
CONTRIBUTING.md		CONTRIBUTING.md
DESIGN.md		DESIGN.md
LICENSE		LICENSE
README.md		README.md
__init__.py		__init__.py
decrypt.pyd		decrypt.pyd
deploy.py		deploy.py
environment.yaml		environment.yaml
mpp_service.py		mpp_service.py
predict.py		predict.py
random_forest.py		random_forest.py
start.sh		start.sh
stop.sh		stop.sh
train.py		train.py
turbo.py		turbo.py

License

uta-smile/CD-MVGNN

Folders and files

Latest commit

History

Repository files navigation

Data

Training example

Conda environment

About

Resources

License

Stars

Watchers

Forks

Languages