# AlphaNN

We studied the success of AlphaFold 2 and attempted to incorporate elements of its architectural design into our previously described model. AlphaFold 2 uses neural network architectures and training procedures that are guided by the evolutionary, physical, and geometric constraints of protein structures.

We integrate GAT and GCN layers as the attention-based and non-attention-based components, respectively, of a subnetwork we call AlphaNN (the `AlphaGNN` class used in the cells below). The aim is to combine the strengths of both layer types. GAT layers learn a separate attention weight for each edge, which makes them proficient at modeling node-to-node relationships, while GCN layers aggregate neighbors with fixed, degree-based weights, which makes them well suited to capturing the overall graph structure.
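
To make the contrast concrete, here is a minimal pure-Python sketch of how a single node aggregates its neighbors under each scheme. It is not the AlphaNN implementation: the toy graph, the scalar node features, and the names `gcn_aggregate` and `gat_aggregate` are all hypothetical, and a hand-written `score` function stands in for the learned attention function.

```python
import math

# Toy undirected graph with self-loops, as adjacency lists (hypothetical data).
neighbors = {0: [0, 1, 2], 1: [0, 1], 2: [0, 2]}
feats = {0: 1.0, 1: 2.0, 2: 4.0}  # one scalar feature per node


def gcn_aggregate(node):
    """GCN-style update: edge weights depend only on node degrees
    (symmetric normalisation, 1 / sqrt(deg_i * deg_j))."""
    deg = {n: len(nbrs) for n, nbrs in neighbors.items()}
    return sum(feats[j] / math.sqrt(deg[node] * deg[j]) for j in neighbors[node])


def gat_aggregate(node, score):
    """GAT-style update: edge weights are a softmax over per-edge scores.

    `score(i, j)` is a stand-in for the learned attention function.
    """
    raw = [math.exp(score(node, j)) for j in neighbors[node]]
    total = sum(raw)
    return sum((r / total) * feats[j] for r, j in zip(raw, neighbors[node]))


# With constant scores, attention reduces to a plain mean over the neighborhood.
uniform = gat_aggregate(0, lambda i, j: 0.0)
# A score that favors node 2 pulls the result towards feats[2].
focused = gat_aggregate(0, lambda i, j: 5.0 if j == 2 else 0.0)
```

The GCN weights are fixed by the graph topology, whereas the GAT weights change with the scores, so the same neighborhood can yield different outputs depending on which edges the attention favors.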

Another notable technique in AlphaFold 2 is iterative refinement, termed recycling, in which the network's outputs are fed back into the same network as inputs for further passes. This idea can be integrated into our solubility prediction models, such as AlphaNN and 1D-CNN.
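
As an illustration of recycling, the sketch below passes a representation through the same update function a fixed number of times, blending each new output with the previous state. The blending rule and the reading of `alpha` as a mixing weight are assumptions made for illustration only; the notebook does not show how `recycle` and `recycle_alpha` are used inside the model. The helper names `recycle` and `smooth` are hypothetical.

```python
def recycle(update, h0, n_cycles=3, alpha=0.7):
    """Apply `update` repeatedly, mixing each new output with the previous state.

    `update` stands in for a stack of network layers; `alpha` is ASSUMED
    to weight the new output against the recycled state.
    """
    h = list(h0)
    for _ in range(n_cycles):
        new = update(h)
        h = [alpha * n + (1.0 - alpha) * old for n, old in zip(new, h)]
    return h


# Hypothetical update: pull every value halfway towards the mean.
def smooth(h):
    mean = sum(h) / len(h)
    return [0.5 * (x + mean) for x in h]


refined = recycle(smooth, [0.0, 1.0, 5.0])
```

Each pass reuses the same `update`, so recycling adds refinement steps without adding parameters.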

In [1]:
import sys

# Make the project root (one level up) importable from this notebook.
sys.path.insert(0, '..')

In [2]:
from data.dataset import Dataset
import torch
from sklearn.metrics import mean_squared_error, r2_score
from data.featurization.dgl_Graph import DGL_Graph
from model.abstractmodel import AbstractModel
import pandas as pd

In [3]:
TRN = Dataset.load_csv("ds/all/TRN_DC")
TST1 = Dataset.load_csv("ds/all/TST_1")
TST2 = Dataset.load_csv("ds/all/TST_2")

In [4]:
featurizer = DGL_Graph(
    graph_type="BI_GRAPH",
    featurize_type="Canonical",
    self_loop=True
)
TRN.X = TRN.featurize(featurizer)
TST1.X = TST1.featurize(featurizer)
TST2.X = TST2.featurize(featurizer)

In [None]:
from model.alpha.AlphaGNN import AlphaGNN

AbstractModel.set_seed(2387)
num_heads = 5
MODEL = AlphaGNN(
    task_type="regression",
    # AlphaGNN Configuration
    n_tasks=1,
    in_feats=featurizer.get_node_feat_size(),
    recycle=3,
    # hidden_feats=[64, 64 * num_heads],
    allow_zero_in_degree=False,
    gat_num_heads=num_heads,
    gat_feat_drop=0.,
    gat_attn_drop=0.,
    gat_alpha=0,
    gat_residual=True,
    gat_agg_mode="flatten",
    gat_bias=True,
    gcn_norm="both",
    gcn_residual=True,
    gcn_batchnorm=False,
    gcn_dropout=0.13108904159657686,
    recycle_alpha=0.7,
    predictor_hidden_feats=128,
    predictor_dropout=0.,
    # Abstract DGL Configuration
    lr=0.001,
    y_name="LogS exp (mol/L)",
    weight_decay=0.007319939418114051,
    batch_size=4096,
)
scores = MODEL.fit(TRN, val=TST1, epochs=300)

In [None]:
pd.DataFrame({
    "loss": [v.item() for v in MODEL.scores["loss"]],
    "rmse": [v.item() for v in MODEL.scores["rmse"]]
}).plot()

In [5]:
trn_sets, val_sets = TRN.k_fold_split(5)

In [6]:
from model.alpha.AlphaGNN import AlphaGNN

AbstractModel.set_seed(2387)
num_heads = 4

k_pred_tst1 = []
k_pred_tst2 = []

for trn, val in zip(trn_sets, val_sets):
    model = AlphaGNN(
        task_type="regression",
        # AlphaGNN Configuration
        n_tasks=1,
        in_feats=featurizer.get_node_feat_size(),
        recycle=3,
        # hidden_feats=[64, 64 * num_heads],
        allow_zero_in_degree=False,
        gat_num_heads=num_heads,
        gat_feat_drop=0.,
        gat_attn_drop=0.,
        gat_alpha=0,
        gat_residual=True,
        gat_agg_mode="flatten",
        gat_bias=True,
        gcn_norm="both",
        gcn_residual=True,
        gcn_batchnorm=False,
        gcn_dropout=0.13108904159657686,
        recycle_alpha=0.7,
        predictor_hidden_feats=128,
        predictor_dropout=0.,
        # Abstract DGL Configuration
        lr=0.001,
        y_name="LogS exp (mol/L)",
        weight_decay=0.007319939418114051,
        batch_size=4096,
    )
    model.fit(trn, val=val, epochs=800, min_epoch=300, early_stop=20)

    k_pred_tst1.append(model.predict(TST1).cpu())
    k_pred_tst2.append(model.predict(TST2).cpu())

[INFO] Expect to use 'DGL_Graph' to featurize SMILES
[INFO] Device cuda


 42%|████▏     | 338/800 [02:10<02:58,  2.59it/s, loss: 2.345 rmse: 1.499]


[INFO] Expect to use 'DGL_Graph' to featurize SMILES
[INFO] Device cuda


 42%|████▏     | 336/800 [02:09<02:58,  2.60it/s, loss: 1.828 rmse: 1.863]


[INFO] Expect to use 'DGL_Graph' to featurize SMILES
[INFO] Device cuda


 40%|████      | 323/800 [02:04<03:03,  2.60it/s, loss: 1.989 rmse: 1.868]


[INFO] Expect to use 'DGL_Graph' to featurize SMILES
[INFO] Device cuda


 44%|████▎     | 349/800 [02:38<03:25,  2.20it/s, loss: 2.091 rmse: 1.201]


[INFO] Expect to use 'DGL_Graph' to featurize SMILES
[INFO] Device cuda


 42%|████▏     | 334/800 [02:18<03:12,  2.42it/s, loss: 2.102 rmse: 1.313]


In [7]:
# Average the k fold models' predictions per sample (columns = folds).
pred_tst1 = torch.cat(k_pred_tst1, dim=1).mean(dim=1).tolist()
pred_tst2 = torch.cat(k_pred_tst2, dim=1).mean(dim=1).tolist()

In [8]:
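# Take the square root explicitly: the `squared=False` argument is deprecated
# and has been removed in recent scikit-learn releases.
print(f"TST1 : RMSE {mean_squared_error(TST1.y, pred_tst1) ** 0.5}")
print(f"TST2 : RMSE {mean_squared_error(TST2.y, pred_tst2) ** 0.5}")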

TST1 : RMSE 0.9859856373465161
TST2 : RMSE 1.7583186905921087


In [9]:
print(f"TST1 : R^2 {r2_score(TST1.y, pred_tst1)}")
print(f"TST2 : R^2 {r2_score(TST2.y, pred_tst2)}")

TST1 : R^2 0.3933251547531803
TST2 : R^2 0.3259840971371806
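
For reference, both metrics reported above can be computed directly from their definitions. The following is a pure-Python sketch on hypothetical toy values, not the notebook's data; `rmse` and `r2` are illustrative helper names, not part of the project.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: sqrt(mean((y - y_hat)^2))."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values for illustration only.
y_true = [-1.0, -2.0, -3.0, -4.0]
y_pred = [-1.5, -2.0, -2.5, -4.0]
print(f"RMSE {rmse(y_true, y_pred)}  R^2 {r2(y_true, y_pred)}")
```

Because R^2 normalises the squared error by the variance of the targets, two test sets can differ considerably in RMSE yet have similar R^2 when the set with the larger error also has a wider spread of target values.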
