# Steerable Graph Convolutions for Learning Protein Structures
<!-- An analysis of the paper and its key components. Think about it as nicely formatted review as you would see on OpenReview.net -->
by *Synthesized Solutions*

Machine learning is increasingly applied to the analysis of molecules, for tasks such as protein design and model quality assessment. These techniques can help us better understand the structure and function of proteins, which is useful for many medical applications, such as drug discovery. Convolutional Neural Networks (CNNs) and Graph Neural Networks (GNNs) are two types of machine learning models that are particularly well suited to analyzing molecular data: CNNs can operate directly on the geometry of a structure, while GNNs are expressive in terms of relational reasoning.

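- Both are translation equivariant or invariant when built on convolutions (CNNs) or on relative positions (GNNs).
- Neither is rotation equivariant by default, which is a problem for molecules, whose orientation in space is arbitrary.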

<!-- State something about it still being hard to make these models rotation equivariant; and write what equivariant is? -->

- CNNs achieve shift invariance in two steps: convolutional layers are shift equivariant (since convolution commutes with shifts), and pooling layers are approximately shift invariant (if the image is shifted a little, the maximum in a pooling window, for example, stays the same).
- CNNs are not rotation equivariant: if an image is rotated, it is no longer detected by the same filter as before, so the CNN needs to learn a separate filter for every possible rotation, which is very inconvenient: "Although, a deep CNN can be made rotationally equivalent by making the network explicitly learn the different feature maps at rotated versions of the input image. But in this approach, CNN is compelled to learn the rotated versions of the same image which increases the risk of overfitting and introduces a redundant degree of freedom."
- GNNs are expressive, but not rotation equivariant by default: they do not recognize a molecule rotated one way as the same molecule rotated another way.

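Formally, a function $f$ is equivariant to a group of transformations $G$ (for example, rotations) if, for every $g \in G$ and every input $x$,
$$f(g\cdot x) = g\cdot f(x)$$
Invariance is the special case in which $g$ acts trivially on the output, so that $f(g\cdot x) = f(x)$.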
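The definition can be checked numerically. The sketch below is only an illustration and is not taken from the paper: the helper names (`random_rotation`, `scale_by_norm`, `relu_per_component`, `is_equivariant`) are hypothetical, and NumPy is used instead of PyTorch to keep it short. It compares $f(g\cdot x)$ with $g\cdot f(x)$ for one function that should be rotation equivariant and one that should not.

```python
import numpy as np

def random_rotation(rng):
    """Draw a random 3x3 rotation matrix from the QR decomposition of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))      # fix the column signs of the factorization
    if np.linalg.det(q) < 0:         # turn a reflection into a proper rotation
        q[:, 0] = -q[:, 0]
    return q

def scale_by_norm(x):
    """Rescale every vector by its own length; the length does not change under rotation."""
    return x * np.linalg.norm(x, axis=-1, keepdims=True)

def relu_per_component(x):
    """Apply ReLU to the x, y and z components separately."""
    return np.maximum(x, 0.0)

def is_equivariant(f, x, rot, atol=1e-8):
    """Check f(g . x) == g . f(x), where g rotates each row vector of x."""
    return np.allclose(f(x @ rot.T), f(x) @ rot.T, atol=atol)

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))          # five 3D vectors
rot = random_rotation(rng)

print("scale_by_norm equivariant:     ", is_equivariant(scale_by_norm, x, rot))
print("relu_per_component equivariant:", is_equivariant(relu_per_component, x, rot))
```

Scaling by the norm passes the check because the norm itself is rotation invariant, while a component-wise nonlinearity depends on the choice of coordinate axes and fails it. This is the reason equivariant layers cannot simply apply a standard activation to vector features.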



To retain more geometric information, Jing, Eismann, Suriana, Townshend, and Dror (2020) modify the standard GNN by replacing its multilayer perceptrons (MLPs) with geometric vector perceptrons (GVPs), and use the resulting network to learn the relationship between protein sequences and their structures. A GVP is a layer that operates on geometric vector features (lists of 3D vectors) alongside scalar features, rather than on scalar values alone like most neural network layers. This makes GVPs well suited to tasks that involve analyzing spatial relationships, such as learning from protein structures.
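To make the idea concrete, here is a minimal NumPy sketch of a single GVP forward pass for one node, following our reading of the original formulation (without the vector gating option). It is not the `gvp-pytorch` implementation: the function names, the dimensions, and the choice of ReLU and sigmoid as nonlinearities are assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gvp_forward(s, V, W_h, W_mu, W_m, b):
    """Simplified GVP for one node: s is (n,) scalars, V is (nu, 3) vectors."""
    V_h = W_h @ V                                   # (h, 3): mix vector channels, no bias
    V_mu = W_mu @ V_h                               # (mu, 3): output vector channels
    s_h = np.linalg.norm(V_h, axis=-1)              # (h,): norms are rotation invariant
    s_out = np.maximum(W_m @ np.concatenate([s_h, s]) + b, 0.0)    # ReLU on the scalars
    gate = sigmoid(np.linalg.norm(V_mu, axis=-1, keepdims=True))  # (mu, 1)
    return s_out, gate * V_mu                       # each vector is rescaled, never bent

def rotation(alpha, beta):
    """Rotation about the z-axis by alpha, followed by a rotation about the x-axis by beta."""
    ca, sa, cb, sb = np.cos(alpha), np.sin(alpha), np.cos(beta), np.sin(beta)
    rot_z = np.array([[ca, -sa, 0.0], [sa, ca, 0.0], [0.0, 0.0, 1.0]])
    rot_x = np.array([[1.0, 0.0, 0.0], [0.0, cb, -sb], [0.0, sb, cb]])
    return rot_x @ rot_z

rng = np.random.default_rng(0)
n, nu, h, mu, m = 4, 3, 5, 2, 6                     # illustrative feature dimensions
s, V = rng.normal(size=n), rng.normal(size=(nu, 3))
W_h, W_mu = rng.normal(size=(h, nu)), rng.normal(size=(mu, h))
W_m, b = rng.normal(size=(m, h + n)), rng.normal(size=m)

R = rotation(0.7, -1.2)
s_out, V_out = gvp_forward(s, V, W_h, W_mu, W_m, b)
s_rot, V_rot = gvp_forward(s, V @ R.T, W_h, W_mu, W_m, b)   # rotate the input vectors

print("scalars unchanged:", np.allclose(s_rot, s_out))
print("vectors rotated:  ", np.allclose(V_rot, V_out @ R.T))
```

In this sketch the scalar output is rotation invariant and the vector output is rotation equivariant. Vectors only reach the scalar channel through their norms, and the vector channel is only ever rescaled.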



In [7]:
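import importlib
# The package directory "gvp-pytorch" contains a hyphen, so a plain `import` statement cannot be used.
gvp = importlib.import_module("gvp-pytorch.gvp")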
import torch
import torch_geometric
import torch.nn.functional as F

In [8]:
tasks = ["SMP", "PSR", "RSR", "MSP", "LBA", "LEP"] #,'PPI', 'RES']

args = {"task": tasks[0], # see options above
        "num-workers": 4,
        "smp-idx": None, # range 0-19
        "batch": 8,
        "lba-split": 30, # 30 or 60
        "train-time": 120, # in minutes
        "val-time": 20, # in minutes
        "epochs": 50,
        "test": None, # path to evaluate a trained model
        "lr": 1e-4,
        "load": None, # path to initialize first 2 GNN layers with pretrained weights
        "save": "models", # directory to save models to
        "data": "data", # directory to data
        "monitor": True, # trigger tensorboard monitoring
        "no-pbar": True, # when set, do not show tqdm bars
}

In [9]:
### OUR CODE HERE
# from run_atom3d import *



# note: `args` is a plain dict in this notebook, so it is indexed by key rather than by attribute
# datasets = get_datasets(args["task"], args["data"], args["lba-split"])
# dataloader = partial(torch_geometric.loader.DataLoader,
#                 num_workers=args["num-workers"], batch_size=args["batch"])
# if args["task"] not in ['PPI', 'RES']:
#     dataloader = partial(dataloader, shuffle=True)

# trainset, valset, testset = map(dataloader, datasets)
# model = get_model(args["task"]).to(device)
# model = nn.DataParallel(model)

# if args["test"]:
#     test(model, testset)

# quick smoke test of the GVP building blocks
scalars_in, vectors_in = 10, 10
scalars_out, vectors_out = 10, 10

in_dims = scalars_in, vectors_in
out_dims = scalars_out, vectors_out
gvp_ = gvp.GVP(in_dims, out_dims)

gvp_ = gvp.GVP(in_dims, out_dims,
            activations=(F.relu, None), vector_gate=True)

dropout = gvp.Dropout(drop_rate=0.1)
layernorm = gvp.LayerNorm(out_dims)

x = gvp.randn(n=5, dims=in_dims)
# x = (s, V) with s.shape = [5, scalars_in] and V.shape = [5, vectors_in, 3]

out = gvp_(x)
out = dropout(out)
out = layernorm(out)

y = gvp.randn(n=5, dims=in_dims)
z = gvp.tuple_sum(x, y)
z = gvp.tuple_cat(x, y, dim=-1) # concat along channel axis
z = gvp.tuple_cat(x, y, dim=-2) # concat along node / batch axis

node_mask = torch.rand(5) < 0.5
z = gvp.tuple_index(x, node_mask) # select half the nodes / batch at random

print(z)


Output (truncated): a tuple `(s, V)` for the 3 randomly selected nodes, with `s.shape = [3, 10]` and `V.shape = [3, 10, 3]`.

Exposition of its weaknesses/strengths/potential which triggered your group to come up with a response.

- The current model is not very expressive: it is not steerable and can only handle features up to type 1 (scalars and vectors).
  - GVPs would arguably fall under the invariant message passing neural networks?
  - So we consider the GVP an “incomplete” steerable MLP.
  - A steerable MLP enables information exchange between all possible pairs of feature types (type 0, 1, …, n), whereas a GVP can only pass information from scalars to type-1 vectors through gating, and from type-1 vectors to scalars through the norm.
- Information passed from vectors to scalars is only rotation invariant, because it goes through the norm, which is a scalar value (we think).
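To illustrate what a feature of type higher than 1 looks like, the sketch below builds a type-2 feature from a single vector as the symmetric traceless part of its outer product, and checks how it transforms under a rotation. This is an illustration only and is not part of the GVP paper or code: the helper names are hypothetical, and steerable libraries typically store type-2 features as 5 coefficients in a spherical-harmonic basis rather than as a 3×3 matrix.

```python
import numpy as np

def rotation_z(theta):
    """Rotation about the z-axis by angle theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def type2_feature(v):
    """Symmetric traceless part of the outer product v v^T (5 independent entries)."""
    outer = np.outer(v, v)
    return outer - np.trace(outer) / 3.0 * np.eye(3)

v = np.array([1.0, -2.0, 0.5])
R = rotation_z(0.9)

M = type2_feature(v)
M_from_rotated_input = type2_feature(R @ v)   # f(g . x)
M_rotated_output = R @ M @ R.T                # g . f(x): both indices are rotated

print("traceless:  ", np.isclose(np.trace(M), 0.0))
print("equivariant:", np.allclose(M_from_rotated_input, M_rotated_output))
```

A type-1 feature rotates as $v \mapsto Rv$, while this type-2 feature rotates as $M \mapsto RMR^\top$. A layer restricted to scalars and vectors has no channel that transforms in this second way, which is the sense in which a GVP is less expressive than a fully steerable layer.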

In [2]:
### OUR CODE HERE

Describe your novel contribution.
- Replace the GVP layers with steerable graph convolution layers.
- Possibly change the k of the k-nearest-neighbour (kNN) graph used by these graph convolution (message passing) layers.
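For reference, the role of k can be sketched as follows. The function `knn_edges` is a hypothetical helper written in plain NumPy, not the graph construction used in `gvp-pytorch`; it assumes node coordinates in an `(N, 3)` array and `k < N`.

```python
import numpy as np

def knn_edges(pos, k):
    """Return a (2, N*k) array of directed edges (source, target).

    Every target node receives one edge from each of its k nearest neighbours.
    """
    diff = pos[:, None, :] - pos[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)        # (N, N) pairwise distances
    np.fill_diagonal(dist, np.inf)              # exclude self-loops
    nbrs = np.argsort(dist, axis=1)[:, :k]      # (N, k) nearest neighbours per node
    targets = np.repeat(np.arange(len(pos)), k)
    return np.stack([nbrs.reshape(-1), targets])

# four points on a line, at x = 0, 1, 2.5 and 6
pos = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.5, 0.0, 0.0], [6.0, 0.0, 0.0]])

for k in (1, 2, 3):
    edges = knn_edges(pos, k)
    print(f"k={k}: {edges.shape[1]} edges")
```

The number of messages per layer grows linearly with k, so a larger k gives each node a wider geometric context at a higher computational cost.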

hidden stuff
<!-- instead of xyz components, use steerable components to make it equivariant
- at the moment it is more invariant, because a curled-up arrangement of \/ molecules (which all point in different directions) is currently treated the same as \/ molecules on a straight line
- by making it equivariant (with steerable features) we hope that these are treated differently

- scalar features (already rotation invariant anyway, they have no angle) and vector features (rotation invariant atm), of which we also take the norm -> scalar, so rotation invariant -->

In [3]:
### OUR CODE HERE

Support your contribution with actual code and experiments (hence the colab format!)

In [13]:
### OUR CODE HERE

# small test to see whether we can visualize the tfevent files directly in the notebook
# for now, this cell has to be run twice before TensorBoard actually becomes visible
%load_ext tensorboard
%tensorboard --logdir tfevent-files/


Conclude

In [5]:
### OUR CODE HERE

Close the notebook with a description of each student's contribution.