chao1224/BioChemGNN

BioChem GNN

This repository provides the main Graph Neural Network models for drug discovery, implemented in PyTorch.

BioChem GNN currently supports the following tasks:

  • Molecule Property Prediction
  • Drug-Target Prediction
  • Drug-Drug Prediction

Environments

conda create -n bcg python=3.7
source activate bcg

conda install -y -c pytorch pytorch=1.6 torchvision
conda install -y -c rdkit rdkit
conda install -y scikit-learn
conda install -y numpy
conda install -y matplotlib

The package can currently be installed in development mode. Go to the root directory of this repo and run:

pip install --user -e .

Graph Neural Networks

We currently provide the following graph neural networks:

| Model | Paper |
| --- | --- |
| Graph Attention Networks (GAT) | Graph Attention Networks, ICLR 2018 |
| Graph Isomorphism Network (GIN) | How Powerful are Graph Neural Networks?, ICLR 2019 |
| Directed Message Passing Neural Network (D-MPNN) | Analyzing Learned Molecular Representations for Property Prediction, ACS JCIM 2019 |

Motivation

BioChem GNN includes sparse and efficient implementations of Graph Neural Network models. Specifically, the sparse operation relies on the fact that every node in a molecular graph has very low degree, which allows us to pad each node's neighbor list up to a fixed maximum degree.

The benefit is that padding avoids the scatter operation. Recall that the scatter operation is not deterministic, which can cause extra issues on smaller datasets such as delaney. For more details on determinism, see the Reproducibility page in the PyTorch documentation.
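The padding idea can be illustrated with a minimal, dependency-free sketch. It is not this repo's implementation: the function names, the toy graph, and the convention of reserving index 0 for an all-zero dummy node are assumptions made for illustration. Each node gets a fixed-length row of neighbor indices, so aggregation becomes a gather followed by a sum in a fixed order, with no scatter-style accumulation over edges.

```python
# Hypothetical sketch of padded neighbor aggregation (pure Python, no PyTorch).
# Assumption: row 0 is a dummy node with all-zero features; real node i is row i + 1.

def build_padded_neighbors(num_nodes, edges, max_degree):
    """Return a (num_nodes + 1) x max_degree table of neighbor row indices.

    Unused slots keep the value 0, which points at the dummy node.
    """
    table = [[0] * max_degree for _ in range(num_nodes + 1)]
    fill = [0] * (num_nodes + 1)  # number of slots already used in each row
    for u, v in edges:
        # Treat each bond as two directed edges.
        for src, dst in ((u, v), (v, u)):
            row = dst + 1
            if fill[row] >= max_degree:
                raise ValueError("node degree exceeds max_degree")
            table[row][fill[row]] = src + 1
            fill[row] += 1
    return table


def aggregate_padded(features, table):
    """Sum neighbor features by gathering rows of the table in a fixed order."""
    dim = len(features[0])
    padded = [[0.0] * dim] + features  # prepend the all-zero dummy node
    out = []
    for row in table[1:]:  # skip the dummy node's own row
        acc = [0.0] * dim
        for idx in row:
            for k in range(dim):
                acc[k] += padded[idx][k]
        out.append(acc)
    return out


def aggregate_scatter(features, edges):
    """Reference: edge-wise accumulation, the pattern a scatter-add performs."""
    dim = len(features[0])
    out = [[0.0] * dim for _ in features]
    for u, v in edges:
        for src, dst in ((u, v), (v, u)):
            for k in range(dim):
                out[dst][k] += features[src][k]
    return out


# Toy graph: 4 atoms with bonds 0-1, 1-2, 1-3.
edges = [(0, 1), (1, 2), (1, 3)]
features = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [3.0, 1.0]]

table = build_padded_neighbors(4, edges, max_degree=4)
result = aggregate_padded(features, table)
```

In a tensor library the inner loops would typically become an index lookup followed by a sum over the padded axis. The cost is wasted slots for low-degree nodes and a hard cap on degree, which is why the approach suits molecules better than high-degree graphs.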

Accordingly, the drawback of this repo is that it does not support graphs whose nodes have high degree, such as biological knowledge graphs. For those applications, we recommend another well-organized repo: Drug Discovery Platform.

Acknowledgements

We would like to acknowledge the following related projects.

  • BioChemGNN_Dense: This is a smart and dense version of BioChemGNN, and is only applicable for small-molecule data.
  • Drug Discovery Platform: This is a well-organized platform that also supports large-scale biological knowledge graphs. However, it relies on the scatter operation, which is non-deterministic.
  • chemprop: This is the repo for D-MPNN and the first well-acknowledged repo to use this padding idea to avoid the non-deterministic operation.
