Improving Compound-Protein Interaction Prediction by Self-Training with Augmenting Negative Samples.
Authors: Takuto Koyama, Shigeyuki Matsumoto, Hiroaki Iwata, Ryosuke Kojima, and Yasushi Okuno
Journal of Chemical Information and Modeling
https://pubs.acs.org/doi/10.1021/acs.jcim.3c00269
Dependencies can be installed with conda:
conda env create -f environment.yml
conda activate kmol
bash install.sh
The datasets used in this work are available at https://drive.google.com/drive/folders/1LILb8msjzWFPdM79dOWF1XZzSqerxm4d?usp=share_link
Please refer to the original repository for the usage of kMoL v.1.0.1 (https://github.com/elix-tech/kmol).
.
├── data
│   └── self_training_kinase : data for kinase families
│       └── cv1
│           └── kinase_self_training
│               ├── config/ : configuration file
│               ├── data/ : dataset (csv)
│               ├── split/ : json files for split
│               └── run.sh : script for running self-training
├── docker/ :
├── src : source code
│   ├── kmol/ : modules for kmol
│   ├── mila/ : modules for federated learning
│   └── self : scripts for self-training
├── LICENSE : LICENSE file
└── README.md : this file
After moving to the ./data/self_training_*/cv1/*_self_training/
directory, you can start self-training with the following command:
bash run.sh
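As a minimal sketch, the two steps above (changing into the run directory, then launching the script) can be combined for the kinase data shown in the directory tree. The helper `self_training_dir` is hypothetical and not part of this repository; it only expands the `./data/self_training_*/cv1/*_self_training/` pattern for a given family name, and it assumes the commands are run from the repository root. Only the `kinase` family and the `cv1` fold are confirmed by the tree above.

```shell
# Hypothetical helper (not part of this repository): build the run directory
# for a family name, following ./data/self_training_*/cv1/*_self_training/.
self_training_dir() {
    family="$1"        # e.g. "kinase", as in the directory tree
    fold="${2:-cv1}"   # fold directory; cv1 is the one shown in the tree
    printf './data/self_training_%s/%s/%s_self_training' "$family" "$fold" "$family"
}

run_dir="$(self_training_dir kinase)"
echo "$run_dir"

# Start self-training only if the directory exists (run from the repository root).
if [ -d "$run_dir" ]; then
    (cd "$run_dir" && bash run.sh)
else
    echo "Directory not found: $run_dir" >&2
fi
```

Running `run.sh` inside a subshell keeps the caller's working directory unchanged after self-training finishes.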