Merge with latest updates from master #1

Merged
merged 11 commits into from
Aug 11, 2022
1 change: 1 addition & 0 deletions CONTRIBUTORS.md
Expand Up @@ -21,3 +21,4 @@ Contributors
* [Wenxuan Fan](https://github.com/wenx00): Strategies for Pre-training Graph Neural Networks
* [Vignesh Venkataraman](https://github.com/VIGNESHinZONE): PAGTN
* [Eric O. Korman](https://github.com/ekorman): Fix for ogbg_ppa
* [Marcos Leal](https://github.com/marcossilva): Change default number of processes to 1 for rexgen
41 changes: 31 additions & 10 deletions README.md
Expand Up @@ -4,20 +4,30 @@

We also have a **Slack channel** for real-time discussion. If you want to join the channel, contact mufeili1996@gmail.com.

## Table of Contents

- [Introduction](#introduction)
- [Installation](#installation)
* [Requirements](#requirements)
* [Pip installation for DGL-LifeSci](#pip-installation-for-dgl-lifesci)
* [Installation from source](#installation-from-source)
* [Verifying successful installation](#verifying-successful-installation)
- [Command Line Interface](#command-line-interface)
- [Examples](#examples)
- [Cite](#cite)

## Introduction

Deep learning on graphs has been an emerging trend in the past few years. Life science has many
graphs, such as molecular graphs and biological networks, making it an important area for applying
deep learning on graphs. DGL-LifeSci is a DGL-based package for various applications in life science
with graph neural networks.

We provide various functionalities, including but not limited to methods for graph construction,
featurization, and evaluation, as well as model architectures, training scripts, and pre-trained models.

For a list of community contributors, see [here](CONTRIBUTORS.md).

**For a full list of work implemented in DGL-LifeSci, see [here](examples/README.md).**

## Installation

### Requirements
Expand Down Expand Up @@ -46,7 +56,7 @@ Additionally, we require `RDKit 2018.09.3` for utils related to cheminformatics.
```
conda install -c rdkit rdkit==2018.09.3
```

For other installation recipes for RDKit, see the [official documentation](https://www.rdkit.org/docs/Install.html).

### Pip installation for DGL-LifeSci
Expand All @@ -67,7 +77,7 @@ python setup.py install

### Verifying successful installation

Once you have installed the package, you can verify the installation with

```python
import dgllife
Expand All @@ -76,7 +86,18 @@ print(dgllife.__version__)
# 0.2.9
```
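
As a complement to the check above, here is a minimal sketch that reports installed versions without importing the packages themselves. It relies only on the standard-library `importlib.metadata`; the helper name `installed_version` and the distribution names passed to it (`dgllife`, `dgl`, `torch`) are illustrative assumptions, not part of DGL-LifeSci.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent.

    `installed_version` is a hypothetical helper written for this sketch.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution names are assumptions; adjust them to match your environment.
for name in ("dgllife", "dgl", "torch"):
    print(name, installed_version(name) or "not installed")
```

Querying metadata avoids triggering any import-time side effects of the packages themselves, which can be handy in a quick environment check.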

### Cite
## Command Line Interface

DGL-LifeSci provides command line interfaces that allow users to perform modeling without any background in programming and deep learning. You will first need to clone the GitHub repo.

- [Molecular Property Prediction](examples/property_prediction/csv_data_configuration/)
- [Reaction Prediction](examples/reaction_prediction/rexgen_direct/)

## Examples

For a full list of work implemented in DGL-LifeSci, see [here](examples/README.md).

## Cite

If you use DGL-LifeSci in a scientific publication, we would appreciate citations to the following paper:

Expand Down
10 changes: 10 additions & 0 deletions docs/source/api/model.gnn.rst
Expand Up @@ -63,3 +63,13 @@ GNNOGB
------
.. automodule:: dgllife.model.gnn.gnn_ogb
:members:

NF
--
.. automodule:: dgllife.model.gnn.nf
:members:

PAGTN
-----
.. automodule:: dgllife.model.gnn.pagtn
:members:
20 changes: 20 additions & 0 deletions docs/source/api/model.zoo.rst
Expand Up @@ -69,6 +69,16 @@ GNN OGB Predictor
.. automodule:: dgllife.model.model_zoo.gnn_ogb_predictor
:members:

Neural Fingerprint Predictor
````````````````````````````
.. automodule:: dgllife.model.model_zoo.nf_predictor
:members:

Path-Augmented Graph Transformer Predictor
``````````````````````````````````````````
.. automodule:: dgllife.model.model_zoo.pagtn_predictor
:members:

Generative Models
-----------------

Expand All @@ -77,6 +87,11 @@ DGMG
.. automodule:: dgllife.model.model_zoo.dgmg
:members:

JTNNVAE
```````
.. automodule:: dgllife.model.model_zoo.jtvae
:members:

Reaction Prediction

WLN for Reaction Center Prediction
Expand All @@ -95,3 +110,8 @@ ACNN
````
.. automodule:: dgllife.model.model_zoo.acnn
:members:

PotentialNet
````````````
.. automodule:: dgllife.model.model_zoo.potentialnet
:members:
1 change: 1 addition & 0 deletions docs/source/api/utils.complexes.rst
Expand Up @@ -9,3 +9,4 @@ Utilities in DGL-LifeSci for working with protein-ligand complexes.
:toctree: ../generated/

dgllife.utils.ACNN_graph_construction_and_featurization
dgllife.utils.PN_graph_construction_and_featurization
7 changes: 7 additions & 0 deletions docs/source/api/utils.mols.rst
Expand Up @@ -49,6 +49,9 @@ three common graph constructions:
dgllife.utils.k_nearest_neighbors
dgllife.utils.mol_to_nearest_neighbor_graph
dgllife.utils.smiles_to_nearest_neighbor_graph
dgllife.utils.ToGraph
dgllife.utils.MolToBigraph
dgllife.utils.SMILESToBigraph

Featurization for Molecules
---------------------------
Expand Down Expand Up @@ -130,6 +133,8 @@ For using featurization methods like above in creating node features:
dgllife.utils.PretrainAtomFeaturizer
dgllife.utils.AttentiveFPAtomFeaturizer
dgllife.utils.AttentiveFPAtomFeaturizer.feat_size
dgllife.utils.PAGTNAtomFeaturizer
dgllife.utils.PAGTNAtomFeaturizer.feat_size

Featurization for Edges
```````````````````````
Expand Down Expand Up @@ -164,3 +169,5 @@ For using featurization methods like above in creating edge features:
dgllife.utils.PretrainBondFeaturizer
dgllife.utils.AttentiveFPBondFeaturizer
dgllife.utils.AttentiveFPBondFeaturizer.feat_size
dgllife.utils.PAGTNEdgeFeaturizer
dgllife.utils.PAGTNEdgeFeaturizer.feat_size
2 changes: 1 addition & 1 deletion docs/source/api/utils.pipeline.rst
Expand Up @@ -21,4 +21,4 @@ Early stopping is a standard practice for preventing models from overfitting and
class for handling it.

.. autoclass:: dgllife.utils.EarlyStopping
:members:
:members: step
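
To make the early-stopping idea concrete, here is a minimal sketch of the general pattern: track the best validation score and stop once it has not improved for `patience` consecutive checks. `SimpleEarlyStopper` and `epochs_run` are hypothetical names written for this illustration; this is not the implementation of `dgllife.utils.EarlyStopping`, and the sketch omits model checkpointing.

```python
class SimpleEarlyStopper:
    """Hypothetical stand-in that only illustrates the early-stopping pattern."""

    def __init__(self, patience=3, mode="higher"):
        if mode not in ("higher", "lower"):
            raise ValueError("mode must be 'higher' or 'lower'")
        self.patience = patience
        self.mode = mode
        self.best_score = None
        self.counter = 0  # consecutive checks without improvement

    def _improved(self, score):
        if self.best_score is None:
            return True
        if self.mode == "higher":
            return score > self.best_score
        return score < self.best_score

    def step(self, score):
        """Record one validation score; return True when training should stop."""
        if self._improved(score):
            self.best_score = score
            self.counter = 0
        else:
            self.counter += 1
        return self.counter >= self.patience


def epochs_run(scores, patience=2):
    """Return how many epochs run before the stopper fires (or all of them)."""
    stopper = SimpleEarlyStopper(patience=patience)
    for epoch, score in enumerate(scores, start=1):
        if stopper.step(score):
            return epoch
    return len(scores)
```

In a real training loop, the score passed to `step` would be the validation metric computed at the end of each epoch.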
35 changes: 35 additions & 0 deletions docs/source/cli.rst
@@ -0,0 +1,35 @@
.. _cli:

Command Line Interface
======================

DGL-LifeSci provides command line interfaces that allow users
to perform modeling without any background in programming and
deep learning. In addition to installing the package, you will need to
clone the GitHub repo with

.. code:: bash

   git clone https://github.com/awslabs/dgl-lifesci.git

Molecular Property Prediction
-----------------------------

Go to the directory below with

.. code:: bash

   cd dgl-lifesci/examples/property_prediction/csv_data_configuration/

and then follow the README file.

Reaction Prediction
-------------------

Go to the directory below with

.. code:: bash

   cd dgl-lifesci/examples/reaction_prediction/rexgen_direct/

and then follow the README file.
28 changes: 6 additions & 22 deletions docs/source/index.rst
Expand Up @@ -2,25 +2,12 @@ DGL-LifeSci: Bringing Graph Neural Networks to Chemistry and Biology
===========================================================================================

DGL-LifeSci is a Python package for applying graph neural networks to various tasks in chemistry
and biology, on top of PyTorch and DGL. It provides:
and biology, on top of PyTorch, DGL, and RDKit. It covers various applications, including:

* Various utilities for data processing, training and evaluation.
* Efficient and flexible model implementations.
* Pre-trained models for use without training from scratch.

We cover various applications in our
`examples <https://github.com/awslabs/dgl-lifesci/tree/master/examples>`_, including:

* `Molecular property prediction <https://github.com/awslabs/dgl-lifesci/tree/master/examples/property_prediction>`_
* `Attention visualization <https://github.com/awslabs/dgl-lifesci/tree/master/examples/property_prediction/pubchem_aromaticity>`_
* `Generative models <https://github.com/awslabs/dgl-lifesci/tree/master/examples/generative_models>`_
* `Protein-ligand binding affinity prediction <https://github.com/awslabs/dgl-lifesci/tree/master/examples/binding_affinity_prediction>`_
* `Reaction prediction <https://github.com/awslabs/dgl-lifesci/tree/master/examples/reaction_prediction>`_

Get Started
------------

Follow the :doc:`instructions<install/index>` to install DGL.
* Molecular property prediction
* Generative models
* Reaction prediction
* Protein-ligand binding affinity prediction

.. toctree::
:maxdepth: 1
Expand All @@ -29,6 +16,7 @@ Follow the :doc:`instructions<install/index>` to install DGL.
:glob:

install/index
cli

.. toctree::
:maxdepth: 2
Expand All @@ -50,7 +38,3 @@ Free software
-------------
DGL-LifeSci is free software; you can redistribute it and/or modify it under the terms
of the Apache License 2.0. We welcome contributions. Join us on `GitHub <https://github.com/awslabs/dgl-lifesci>`_.

Index
-----
* :ref:`genindex`
26 changes: 3 additions & 23 deletions docs/source/install/index.rst
@@ -1,11 +1,9 @@
Install DGL-LifeSci
===================

This topic explains how to install DGL-LifeSci. We recommend installing DGL-LifeSci by using ``conda`` or ``pip``.
Installation
============

System requirements
-------------------
DGL-LifeSci works with the following operating systems:
DGL-LifeSci should work on:

* Ubuntu 16.04
* macOS X
Expand All @@ -17,15 +15,6 @@ DGL-LifeSci requires:
* `DGL 0.4.3 or later <https://www.dgl.ai/pages/start.html>`_
* `PyTorch 1.2.0 or later <https://pytorch.org/>`_

If you have just installed DGL, the first time you use it, a message will pop up as follows:

.. code:: bash

DGL does not detect a valid backend option. Which backend would you like to work with?
Backend choice (pytorch, mxnet or tensorflow):

and you need to enter ``pytorch``.

Additionally, we require **RDKit 2018.09.3** for cheminformatics. We recommend installing it with

.. code:: bash
Expand All @@ -34,15 +23,6 @@ Additionally, we require **RDKit 2018.09.3** for cheminformatics. We recommend i

Other versions of RDKit are not tested.

Install from conda
----------------------
If ``conda`` is not yet installed, get either `miniconda <https://conda.io/miniconda.html>`_ or
the full `anaconda <https://www.anaconda.com/download/>`_.

.. code:: bash

conda install -c dglteam dgllife

Install from pip
----------------

Expand Down
3 changes: 3 additions & 0 deletions examples/README.md
Expand Up @@ -66,6 +66,9 @@ We provide various examples across 3 applications -- property prediction, genera
- Atomic Convolutional Networks for Predicting Protein-Ligand Binding Affinity (ACNN) [[paper]](https://arxiv.org/abs/1703.10603), [[github]](https://github.com/deepchem/deepchem/tree/master/contrib/atomicconv)
- [ACNN with DGL](../python/dgllife/model/model_zoo/acnn.py)
- [Example Training Script](binding_affinity_prediction)
- PotentialNet for molecular property prediction (PotentialNet) [[paper]](https://pubs.acs.org/doi/10.1021/acscentsci.8b00507)
- [PotentialNet with DGL](../python/dgllife/model/model_zoo/potentialnet.py)
- [Example Training Script](binding_affinity_prediction)

## Reaction Prediction
- A graph-convolutional neural network model for the prediction of chemical reactivity [[paper]](https://pubs.rsc.org/en/content/articlelanding/2019/sc/c8sc04228d#!divAbstract), [[github]](https://github.com/connorcoley/rexgen_direct)
Expand Down
5 changes: 3 additions & 2 deletions examples/property_prediction/MTL/README.md
Expand Up @@ -24,9 +24,10 @@ For demonstration, you can generate a synthetic dataset as follows.
 import torch
 import pandas as pd

+# 'nan' for missing property labels
 data = {
-    'smiles': ['CCO' for _ in range(128)],
-    'logP': torch.randn(128).numpy().tolist(),
+    'smiles': ['CCO', 'CO', 'C', 'O'] * 32,
+    'logP': torch.randn(127).numpy().tolist() + [float('nan')],
     'logD': torch.randn(128).numpy().tolist()
 }
 df = pd.DataFrame(data)
Expand Down
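
For readers who want to inspect the shape of such a dataset without installing PyTorch or pandas, the following is a rough standard-library sketch of the same idea: 128 rows, four repeating SMILES strings, and a single `nan` marking a missing `logP` label. The helper names `make_synthetic_rows` and `rows_to_csv` are hypothetical, and `random.gauss` is swapped in for `torch.randn`, so the values will differ from those produced by the README snippet.

```python
import csv
import io
import math
import random


def make_synthetic_rows(n=128, seed=0):
    """Build n rows with random labels; the last logP is NaN (missing label)."""
    rng = random.Random(seed)
    smiles_cycle = ['CCO', 'CO', 'C', 'O']
    rows = []
    for i in range(n):
        rows.append({
            'smiles': smiles_cycle[i % len(smiles_cycle)],
            'logP': rng.gauss(0.0, 1.0),
            'logD': rng.gauss(0.0, 1.0),
        })
    # 'nan' marks a missing property label, as in the snippet above
    rows[-1]['logP'] = float('nan')
    return rows


def rows_to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=['smiles', 'logP', 'logD'])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows = make_synthetic_rows()
csv_text = rows_to_csv(rows)
print(csv_text.splitlines()[0])  # header row
print(sum(math.isnan(r['logP']) for r in rows), "missing logP label(s)")
```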
13 changes: 7 additions & 6 deletions examples/property_prediction/MTL/main.py
Expand Up @@ -31,7 +31,7 @@

from argparse import ArgumentParser
from dgllife.data import MoleculeCSVDataset
from dgllife.utils import smiles_to_bigraph, RandomSplitter
from dgllife.utils import SMILESToBigraph, RandomSplitter

from configure import configs
from run import main
Expand Down Expand Up @@ -67,13 +67,14 @@
# Setup for experiments
mkdir_p(args['result_path'])

node_featurizer = atom_featurizer
edge_featurizer = CanonicalBondFeaturizer(bond_data_field='he', self_loop=True)
df = pd.read_csv(args['csv_path'])

smiles_to_g = SMILESToBigraph(add_self_loop=True, node_featurizer=atom_featurizer,
edge_featurizer=edge_featurizer)

dataset = MoleculeCSVDataset(
df, partial(smiles_to_bigraph, add_self_loop=True),
node_featurizer=node_featurizer,
edge_featurizer=edge_featurizer,
df, smiles_to_g,
smiles_column=args['smiles_column'],
cache_file_path=args['result_path'] + '/graph.bin',
task_names=args['tasks']
Expand All @@ -84,4 +85,4 @@
dataset, frac_train=0.8, frac_val=0.1,
frac_test=0.1, random_state=0)

main(args, node_featurizer, edge_featurizer, train_set, val_set, test_set)
main(args, atom_featurizer, edge_featurizer, train_set, val_set, test_set)
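
The change in this file moves the featurizers out of the dataset constructor and into a single configured callable, replacing the earlier `functools.partial`. The sketch below illustrates that general pattern with toy stand-ins only: `toy_smiles_to_graph` and `ToyGraphBuilder` are hypothetical names, the "graph" is a plain dict, and the character-based atom parsing is deliberately naive. It does not reproduce the behavior of `SMILESToBigraph`.

```python
from functools import partial


def toy_smiles_to_graph(smiles, add_self_loop=False, node_featurizer=None):
    """Hypothetical stand-in for a graph constructor; returns a plain dict."""
    # Naive parsing for illustration only: every letter counts as one atom.
    atoms = [ch for ch in smiles if ch.isalpha()]
    graph = {'nodes': atoms, 'self_loops': add_self_loop}
    if node_featurizer is not None:
        graph['node_feats'] = node_featurizer(atoms)
    return graph


class ToyGraphBuilder:
    """Callable object that bundles the construction options in one place."""

    def __init__(self, add_self_loop=False, node_featurizer=None):
        self.add_self_loop = add_self_loop
        self.node_featurizer = node_featurizer

    def __call__(self, smiles):
        return toy_smiles_to_graph(
            smiles,
            add_self_loop=self.add_self_loop,
            node_featurizer=self.node_featurizer,
        )


def count_atoms(atoms):
    """Toy featurizer: a single feature holding the atom count."""
    return [len(atoms)]


# Old style: a partial that fixes some keyword arguments
old_style = partial(toy_smiles_to_graph, add_self_loop=True, node_featurizer=count_atoms)
# New style: one configured object passed wherever a callable is expected
new_style = ToyGraphBuilder(add_self_loop=True, node_featurizer=count_atoms)

print(old_style('CCO') == new_style('CCO'))
```

One practical difference is that a configured object keeps its options as named attributes, which makes them easy to inspect or log later.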
Original file line number Diff line number Diff line change
Expand Up @@ -9,17 +9,17 @@
 import torch

 from dgllife.data import UnlabeledSMILES
-from dgllife.utils import mol_to_bigraph
-from functools import partial
+from dgllife.utils import MolToBigraph
 from torch.utils.data import DataLoader
 from tqdm import tqdm

 from utils import mkdir_p, collate_molgraphs_unlabeled, load_model, predict, init_featurizer

 def main(args):
-    dataset = UnlabeledSMILES(args['smiles'], node_featurizer=args['node_featurizer'],
-                              edge_featurizer=args['edge_featurizer'],
-                              mol_to_graph=partial(mol_to_bigraph, add_self_loop=True))
+    mol_to_g = MolToBigraph(add_self_loop=True,
+                            node_featurizer=args['node_featurizer'],
+                            edge_featurizer=args['edge_featurizer'])
+    dataset = UnlabeledSMILES(args['smiles'], mol_to_graph=mol_to_g)
     dataloader = DataLoader(dataset, batch_size=args['batch_size'],
                             collate_fn=collate_molgraphs_unlabeled, num_workers=args['num_workers'])
     model = load_model(args).to(args['device'])
Expand Down