## Meta-MGNN: Few-Shot Graph Learning for Molecular Property Prediction

ABSTRACT: The recent success of graph neural networks has significantly
boosted molecular property prediction, advancing activities such as
drug discovery. Existing deep neural network methods usually
require a large training dataset for each property, which impairs their
performance in cases (especially for new molecular properties) with
a limited amount of experimental data, which are common in real
situations. To this end, we propose Meta-MGNN, a novel model
for few-shot molecular property prediction. Meta-MGNN applies
a molecular graph neural network to learn molecular representations
and builds a meta-learning framework for model optimization.
To exploit unlabeled molecular information and address the task
heterogeneity of different molecular properties, Meta-MGNN further
incorporates molecular structure- and attribute-based self-supervised
modules and self-attentive task weights into the former framework,
strengthening the whole learning model. Extensive experiments on
two public multi-property datasets demonstrate that Meta-MGNN
outperforms a variety of state-of-the-art methods.
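
To make the "self-attentive task weights" idea from the abstract more concrete, the sketch below combines per-task losses using softmax-normalized weights. It is only an illustration under assumptions, not the repository's implementation: the function names, the scalar task scores, and the example numbers are hypothetical, and how such scores would be produced is left out.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_meta_loss(task_losses, task_scores):
    """Combine per-task losses using softmax-normalized task weights.

    Assumption: each task has one scalar score; producing these scores
    (for example with a learned layer) is outside this sketch.
    """
    weights = softmax(task_scores)
    total = sum(w * l for w, l in zip(weights, task_losses))
    return total, weights

# Hypothetical losses and scores for three property-prediction tasks.
loss, weights = weighted_meta_loss([0.9, 0.4, 0.6], [0.0, 1.0, 0.5])
print(weights, loss)
```

Tasks with a higher score contribute more to the combined loss; with equal scores the result reduces to the plain mean of the task losses.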

Link to paper: https://arxiv.org/pdf/2102.07916v1.pdf

Credit: https://github.com/zhichunguo/Meta-MGNN

Google Colab: https://colab.research.google.com/drive/1sV3gdPlRjSY0FDZ25z7cXFxAAls3bFae?usp=sharing

In [None]:
# Clone the repository and cd into its directory
!git clone https://github.com/zhichunguo/Meta-MGNN.git
%cd Meta-MGNN

/content/Meta-MGNN


In [None]:
# Install requirements / dependencies
!pip install torch==1.8.0 torchvision==0.9.0
!pip install torch-scatter==2.0.6 torch-sparse==0.6.9 -f https://pytorch-geometric.com/whl/torch-1.8.0+cu102.html
!pip install scikit-learn==0.23.2 tqdm==4.50.0

# Install RDKit
!pip install rdkit-pypi==2021.3.1.5
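
After installation, it can help to confirm that the pinned versions are the ones actually present in the environment. The following is a minimal sketch using only the standard library; the `check_versions` helper is hypothetical, and the pins are copied from the install cell above.

```python
from importlib import metadata

# Version pins copied from the install cell above.
EXPECTED = {
    "torch": "1.8.0",
    "torchvision": "0.9.0",
    "torch-scatter": "2.0.6",
    "torch-sparse": "0.6.9",
    "scikit-learn": "0.23.2",
    "tqdm": "4.50.0",
    "rdkit-pypi": "2021.3.1.5",
}

def check_versions(expected):
    """Return {package: (installed_or_None, expected, matches)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Local version suffixes such as "+cu102" are ignored in the comparison.
        matches = installed is not None and installed.split("+")[0] == wanted
        report[name] = (installed, wanted, matches)
    return report

for name, (installed, wanted, ok) in check_versions(EXPECTED).items():
    status = "OK" if ok else "MISMATCH"
    print(f"{name}: installed={installed} expected={wanted} {status}")
```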

### Datasets

The datasets uploaded to the repository can be used to train our model directly.

The original datasets are downloaded from Data. We use `Original_datasets/splitdata.py` to split each dataset by molecular property and save the splits as separate files under `Original_datasets/[DatasetName]/new`. When `main.py` is run, `loader.py` preprocesses the datasets automatically and saves the results in `Original_datasets/[DatasetName]/new/[PropertyNumber]/propcessed`.
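
As a rough illustration of what splitting a multi-property dataset by property involves, the sketch below groups molecules by property column. It does not reproduce `splitdata.py`: the CSV layout, the column names (`smiles`, `prop_a`, `prop_b`), and the `split_by_property` helper are assumptions made for this example.

```python
import csv
import io
from collections import defaultdict

def split_by_property(csv_text, smiles_col="smiles"):
    """Group (smiles, label) pairs by property column.

    Rows with a missing label for a property are skipped for that property.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    per_property = defaultdict(list)
    for row in reader:
        for column, value in row.items():
            if column == smiles_col or value in ("", None):
                continue
            per_property[column].append((row[smiles_col], int(value)))
    return dict(per_property)

# Hypothetical toy table: two properties, one missing label.
toy = "smiles,prop_a,prop_b\nCCO,1,0\nc1ccccc1,0,\nCC(=O)O,1,1\n"
splits = split_by_property(toy)
for name, rows in splits.items():
    print(name, rows)
```

Each property then has its own list of labeled molecules, which is the shape a per-task few-shot loader needs.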

### Run code

Datasets and k (for k-shot) can be changed in the last line of `main.py`.
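
To show what k controls, here is a minimal sketch of sampling one k-shot episode. It assumes each property is a binary classification task and that the support set takes k examples per class. The `sample_k_shot` helper and the toy labels are hypothetical and are not taken from `main.py`.

```python
import random

def sample_k_shot(labels, k, n_query, seed=0):
    """Sample a 2-way k-shot episode from binary labels.

    Returns (support_indices, query_indices): k positives and k negatives
    form the support set; the query set is drawn from the remaining examples.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    if len(pos) < k or len(neg) < k:
        raise ValueError("not enough examples per class for k-shot sampling")
    support = rng.sample(pos, k) + rng.sample(neg, k)
    chosen = set(support)
    remaining = [i for i in range(len(labels)) if i not in chosen]
    query = rng.sample(remaining, min(n_query, len(remaining)))
    return support, query

# Hypothetical labels for ten molecules of one property.
labels = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
support, query = sample_k_shot(labels, k=2, n_query=4)
print("support:", support, "query:", query)
```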

In [None]:
# Create the result folder (no error if it already exists)
!mkdir -p result

In [None]:
!python main.py

### Performance
Meta-learning performance is not stable for some properties. For reference, we report the results of two runs, together with the iteration at which each run obtained its best results. Paired values in the table are given as first run/second run.

| Dataset    | k    | Iteration (run 1/run 2) | Property   | Results (run 1/run 2)   | | k    | Iteration (run 1/run 2) | Property  | Results (run 1/run 2)   |
| ---------- | :-----------:  | :-----------: | :-----------: | :-----------:  | ---------- | :-----------:  | :-----------: | :-----------: | :-----------:  |
| Sider | 1 | 307/599 | Si-T1| 75.08/75.74 | | 5 | 561/585 | Si-T1 | 76.16/76.47 | 
|  |  | | Si-T2| 69.44/69.34 | |  | | Si-T2 | 68.90/69.77 | 
|  |  | | Si-T3| 69.90/71.39 | |  | | Si-T3 | 72.23/72.35 | 
|  |  | | Si-T4| 71.78/73.60 | |  | | Si-T4 | 74.40/74.51 | 
|  |  | | Si-T5| 79.40/80.50 | |  | | Si-T5 | 81.71/81.87 | 
|  |  | | Si-T6| 71.59/72.35 | |  | | Si-T6 | 74.90/73.34 | 
|  |  | | Ave.| 72.87/73.82 | |  | | Ave. | 74.74/74.70 | 
| Tox21 | 1 | 1271/1415 | SR-HS | 73.72/73.90 | | 5 | 1061/882 | SR-HS | 74.85/74.74 | 
|  |  | | SR-MMP | 78.56/79.62 | |  | | SR-MMP | 80.25/80.27 | 
|  |  | | SR-p53| 77.50/77.91 | |  | | SR-p53 | 78.86/79.14 | 
|  |  | | Ave.| 76.59/77.14 | |  | | Ave. | 77.99/78.05 | 
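
The "Ave." rows are per-dataset means over the listed properties. As a small worked check, the sketch below recomputes the Tox21 average for k = 1 from the values in the table above; the `average_runs` helper is hypothetical and only parses the `run1/run2` strings.

```python
def average_runs(results):
    """Average 'run1/run2' strings column-wise, rounded to two decimals."""
    runs = [tuple(float(v) for v in r.split("/")) for r in results]
    n = len(runs)
    return tuple(round(sum(run[i] for run in runs) / n, 2) for i in range(2))

# Tox21, k = 1: SR-HS, SR-MMP, SR-p53 (values copied from the table above).
tox21_1shot = ["73.72/73.90", "78.56/79.62", "77.50/77.91"]
print(average_runs(tox21_1shot))  # (76.59, 77.14)
```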

### Acknowledgements

The code is implemented based on <a href="https://github.com/snap-stanford/pretrain-gnns"> Strategies for Pre-training Graph Neural Networks</a>.