https://www.kaggle.com/code/widhiwinata/mpnn-a-type-of-graph-neural-network-gnn

https://deepchem.readthedocs.io/en/latest/api_reference/models.html

# Install libraries

In [1]:
!pip uninstall torch torch-geometric rdkit dgl dpdata rdkit-pypi Pillow pydot dgllife deepchem -y

Found existing installation: torch 2.4.0
Uninstalling torch-2.4.0:
  Successfully uninstalled torch-2.4.0
Found existing installation: torch_geometric 2.5.3
Uninstalling torch_geometric-2.5.3:
  Successfully uninstalled torch_geometric-2.5.3
Found existing installation: rdkit 2024.3.5
Uninstalling rdkit-2024.3.5:
  Successfully uninstalled rdkit-2024.3.5
Found existing installation: dgl 2.1.0
Uninstalling dgl-2.1.0:
  Successfully uninstalled dgl-2.1.0
Found existing installation: dpdata 0.2.19
Uninstalling dpdata-0.2.19:
  Successfully uninstalled dpdata-0.2.19
Found existing installation: rdkit-pypi 2022.9.5
Uninstalling rdkit-pypi-2022.9.5:
  Successfully uninstalled rdkit-pypi-2022.9.5
Found existing installation: pillow 10.4.0
Uninstalling pillow-10.4.0:
  Successfully uninstalled pillow-10.4.0
Found existing installation: pydot 3.0.1
Uninstalling pydot-3.0.1:
  Successfully uninstalled pydot-3.0.1
Found existing installation: dgllife 0.3.2
Uninstalling dgllife-0.3.2:
  Successful

In [2]:
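# Note: the -f URLs below point at Windows (win_amd64) wheels built for Python 3.9,
# while this runtime is Linux with Python 3.10. The output shows pip resolving
# dgl 2.1.0 from a manylinux wheel instead of the requested builds.
!pip install dgl-cu116 -f https://data.dgl.ai/wheels/dgl_cu116-0.9.1-cp39-cp39-win_amd64.whl
!pip install dgl -f https://data.dgl.ai/wheels/dgl-2.2.1-cp39-cp39-win_amd64.whl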

Looking in links: https://data.dgl.ai/wheels/dgl_cu116-0.9.1-cp39-cp39-win_amd64.whl
Looking in links: https://data.dgl.ai/wheels/dgl-2.2.1-cp39-cp39-win_amd64.whl
Collecting dgl
  Using cached dgl-2.1.0-cp310-cp310-manylinux1_x86_64.whl.metadata (553 bytes)
Collecting torch>=2 (from torchdata>=0.5.0->dgl)
  Using cached torch-2.4.0-cp310-cp310-manylinux1_x86_64.whl.metadata (26 kB)
Using cached dgl-2.1.0-cp310-cp310-manylinux1_x86_64.whl (8.5 MB)
Using cached torch-2.4.0-cp310-cp310-manylinux1_x86_64.whl (797.2 MB)
Installing collected packages: torch, dgl
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
fastai 2.7.16 requires pillow>=9.0.0, which is not installed.
torchvision 0.18.1+cu121 requires pillow!=8.3.*,>=5.3.0, which is not installed.
torchaudio 2.3.1+cu121 requires torch==2.3.1, but you have torch 2.4.0 which is incompatible.
torchvision

In [3]:
!pip install torch torch-geometric rdkit dpdata rdkit-pypi Pillow pydot dgllife deepchem lightning

Collecting torch-geometric
  Using cached torch_geometric-2.5.3-py3-none-any.whl.metadata (64 kB)
Collecting rdkit
  Using cached rdkit-2024.3.5-cp310-cp310-manylinux_2_28_x86_64.whl.metadata (3.9 kB)
Collecting dpdata
  Using cached dpdata-0.2.19-py3-none-any.whl.metadata (26 kB)
Collecting rdkit-pypi
  Using cached rdkit_pypi-2022.9.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.9 kB)
Collecting Pillow
  Using cached pillow-10.4.0-cp310-cp310-manylinux_2_28_x86_64.whl.metadata (9.2 kB)
Collecting pydot
  Using cached pydot-3.0.1-py3-none-any.whl.metadata (9.9 kB)
Collecting dgllife
  Using cached dgllife-0.3.2-py3-none-any.whl.metadata (667 bytes)
Collecting deepchem
  Using cached deepchem-2.8.0-py3-none-any.whl.metadata (2.0 kB)
Using cached torch_geometric-2.5.3-py3-none-any.whl (1.1 MB)
Using cached rdkit-2024.3.5-cp310-cp310-manylinux_2_28_x86_64.whl (33.1 MB)
Using cached dpdata-0.2.19-py3-none-any.whl (151 kB)
Using cached rdkit_pypi-2022.9.5-cp310-cp

In [4]:
!sudo apt-get -qq install graphviz

In [5]:
!pip show dgllife

Name: dgllife
Version: 0.3.2
Summary: DGL-based package for Life Science
Home-page: https://github.com/awslabs/dgl-lifesci
Author: 
Author-email: 
License: APACHE
Location: /usr/local/lib/python3.10/dist-packages
Requires: hyperopt, joblib, networkx, numpy, pandas, requests, scikit-learn, scipy, tqdm
Required-by: 


# Import libraries and load the dataset

In [6]:
import os

# Temporarily suppress TensorFlow logs
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import warnings
from rdkit import Chem
from rdkit import RDLogger
from rdkit.Chem.Draw import IPythonConsole
from rdkit.Chem.Draw import MolsToGridImage
from rdkit.Chem import PandasTools

# Temporarily suppress warnings and RDKit logs
warnings.filterwarnings("ignore")
RDLogger.DisableLog("rdApp.*")

np.random.seed(42)
tf.random.set_seed(42)

In [7]:
train = pd.read_csv('/content/train.csv')[['IC50_nM', 'Smiles']]
train[10:12]

    IC50_nM  Smiles
10     0.19  CC(C)(O)[C@H](F)CN1Cc2cc(NC(=O)c3cnn4cccnc34)c...
11     0.2   COc1cc2nn([C@H]3CC[C@@]4(CC3)CC(=O)N(C)C4)cc2c...


# Define Features

In [8]:
from deepchem.feat import ConvMolFeaturizer

train_smiles = ConvMolFeaturizer().featurize(train['Smiles'])
train_smiles[0]



FileNotFoundError: Cannot find DGL C++ graphbolt library at /usr/local/lib/python3.10/dist-packages/dgl/graphbolt/libgraphbolt_pytorch_2.4.0.so
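
The missing file name embeds the PyTorch version (2.4.0), which suggests that the installed DGL build (2.1.0, per the pip output above) does not ship a graphbolt binary for that PyTorch release. A quick way to confirm which versions actually ended up in the environment is to query the package metadata without importing anything. The sketch below is only a diagnostic aid; `installed_versions` and `PACKAGES` are hypothetical names, not part of any library used in this notebook.

```python
from importlib import metadata

# Distribution names taken from the pip commands earlier in this notebook.
PACKAGES = ["torch", "dgl", "dgllife", "deepchem", "rdkit"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name:10s} {version or 'not installed'}")
```

Because this reads metadata only, it still works when `import dgl` itself fails.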

In [3]:
from deepchem.models import *

AttributeError: module 'dgl' has no attribute 'DGLGraph'
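
A wildcard import stops at the first failure, so only one broken dependency is visible per run. One possible way to see every failing module at once is to import each one separately and record the exception. This is a sketch with a hypothetical helper name (`probe_imports`). The module list in the usage block is an assumption based on the packages installed above.

```python
import importlib


def probe_imports(module_names):
    """Try to import each module; return {name: None or 'ErrorType: message'}."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:
            # Catch broadly on purpose: a broken install can raise errors
            # other than ImportError, as the DGL tracebacks here show.
            results[name] = f"{type(exc).__name__}: {exc}"
    return results


if __name__ == "__main__":
    # Assumed module names; adjust to the environment being checked.
    for name, error in probe_imports(["torch", "dgl", "dgllife", "deepchem"]).items():
        print(f"{name:10s} {'ok' if error is None else error}")
```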

In [63]:
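MPNN_model = MPNNModel(
    n_tasks=1,  # number of targets to predict (IC50)
    mode='regression',  # this is a regression problem, so 'regression'
    number_of_features=75,  # default number of features produced by ConvMolFeaturizer
    n_graph_feat=128,  # number of graph features used at each node
    n_pair_feat=14,  # number of features used at each edge
    T=5,  # number of message-passing steps
    M=3,  # number of message-passing layers
)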

ImportError: cannot import name 'DGLHeteroGraph' from 'dgl.heterograph' (/usr/local/lib/python3.10/dist-packages/dgl/heterograph.py)
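
Since the model could not be built here, the role of the number of message-passing steps (`T` above) can still be illustrated without DeepChem or DGL. The following is a toy, pure-Python sketch of message passing on a small undirected graph. It is not DeepChem's implementation: a real MPNN learns its message and update functions, whereas this sketch uses a fixed sum for messages and a fixed average for updates. All names (`message_passing`, `readout`) are hypothetical.

```python
def message_passing(node_feats, edges, T):
    """Run T rounds of sum-aggregation message passing on an undirected graph."""
    # Build an adjacency list from the edge list.
    neighbors = {i: [] for i in range(len(node_feats))}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    h = [list(f) for f in node_feats]  # copy so the input is not mutated
    for _ in range(T):
        new_h = []
        for i, h_i in enumerate(h):
            # Message: element-wise sum of the neighbours' current states.
            msg = [0.0] * len(h_i)
            for j in neighbors[i]:
                msg = [m + x for m, x in zip(msg, h[j])]
            # Update: average of own state and message
            # (a fixed stand-in for a learned update function).
            new_h.append([0.5 * (a + b) for a, b in zip(h_i, msg)])
        h = new_h
    return h


def readout(h):
    """Sum node states into one graph-level vector."""
    return [sum(col) for col in zip(*h)]


if __name__ == "__main__":
    # Path graph 0 - 1 - 2, with a signal placed only on node 0.
    feats = [[1.0], [0.0], [0.0]]
    edges = [(0, 1), (1, 2)]
    print(message_passing(feats, edges, T=1))  # node 2 is still 0.0
    print(message_passing(feats, edges, T=2))  # node 2 now sees node 0
```

On the three-node path, node 2 only receives information from node 0 after two steps, which is why `T` bounds how far information can travel across the molecule graph.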