# Molecule Property Prediction

Each graph represents a molecule, where nodes are atoms and edges are chemical bonds. Input node features are 9-dimensional: they contain the atomic number and chirality, along with other atom features such as formal charge and whether the atom is in a ring.

**Prediction**: The task is to predict the target molecular property as accurately as possible. The property is cast as a binary label, e.g., whether or not a molecule inhibits HIV replication.

In [1]:
from ogb.graphproppred import PygGraphPropPredDataset, Evaluator
from torch_geometric.loader import DataLoader
from tqdm.notebook import tqdm


dataset = PygGraphPropPredDataset(name='ogbg-molhiv', root='data')

device = 'cpu' # 'cuda' if torch.cuda.is_available() else 'cpu'
print('Device: {}'.format(device))

split_idx = dataset.get_idx_split()

# Check the task type and the number of prediction tasks
print('Task type: {}'.format(dataset.task_type))
print('Number of tasks: {}'.format(dataset.num_tasks))



Device: cpu
Task type: binary classification
Number of tasks: 1


In [2]:
# Load each dataset split in a different DataLoader
train_loader = DataLoader(dataset[split_idx["train"]], batch_size=32, shuffle=True, num_workers=0)
valid_loader = DataLoader(dataset[split_idx["valid"]], batch_size=32, shuffle=False, num_workers=0)
test_loader = DataLoader(dataset[split_idx["test"]], batch_size=32, shuffle=False, num_workers=0)

In [3]:
dataset[0] # this is the first graph

Data(edge_index=[2, 40], edge_attr=[40, 3], x=[19, 9], y=[1, 1], num_nodes=19)

In [4]:
batch_0 = next(iter(train_loader)) # this is the first batch of 32 graphs
batch_0

DataBatch(edge_index=[2, 1718], edge_attr=[1718, 3], x=[804, 9], y=[32, 1], num_nodes=804, batch=[804], ptr=[33])

When working with a batch of graphs, PyG merges them into a single disconnected graph (with a block-diagonal adjacency matrix) so that all graphs can be processed in parallel. Specifically, the batch is a `torch_geometric.data.Batch`, which inherits from `Data` and additionally carries the *batch* attribute, a vector that maps each node to the graph it belongs to. The batch above contains 804 nodes from 32 distinct graphs.

    batch = [0, ..., 0, 1, ..., n - 2, n - 1, ..., n - 1]


The final pooling step uses this attribute to aggregate node embeddings separately for each graph (global pooling).
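
To make the batching concrete, here is a minimal sketch in plain PyTorch (not PyG's actual implementation) that merges two toy graphs into one disconnected graph, builds the `batch` vector, and mean-pools node features per graph. The toy graphs, their feature values, and the helper name `mean_pool` are illustrative assumptions.

```python
import torch

# Two toy graphs: graph 0 has 3 nodes, graph 1 has 2 nodes (illustrative values).
x0 = torch.tensor([[1.0, 0.0], [3.0, 0.0], [5.0, 0.0]])
x1 = torch.tensor([[2.0, 2.0], [4.0, 6.0]])
edge_index0 = torch.tensor([[0, 1, 1, 2], [1, 0, 2, 1]])  # path 0-1-2, both directions
edge_index1 = torch.tensor([[0, 1], [1, 0]])              # single edge 0-1

# Stack node features and shift the node ids of graph 1 by the size of graph 0,
# so that no edge connects the two graphs (block-diagonal adjacency).
x = torch.cat([x0, x1], dim=0)
edge_index = torch.cat([edge_index0, edge_index1 + x0.size(0)], dim=1)

# batch[i] is the index of the graph that node i belongs to.
num_nodes = torch.tensor([x0.size(0), x1.size(0)])
batch = torch.repeat_interleave(torch.arange(2), num_nodes)


def mean_pool(x, batch, num_graphs):
    """Average node features per graph (hypothetical helper that mimics mean pooling)."""
    sums = torch.zeros(num_graphs, x.size(1)).index_add_(0, batch, x)
    counts = torch.bincount(batch, minlength=num_graphs).unsqueeze(1)
    return sums / counts


graph_embeddings = mean_pool(x, batch, num_graphs=2)
print(batch)             # tensor([0, 0, 0, 1, 1])
print(graph_embeddings)  # one row per graph
```

In the notebook itself this bookkeeping is handled by the `DataLoader`, and the pooling by `global_mean_pool`.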



In [5]:
print(f"Feature matrix X: {batch_0.x}")
batch_0.batch # node to graph mapping -- 32 different graphs in the batch

tensor([[ 5,  0,  4,  ...,  2,  0,  0],
        [ 5,  0,  3,  ...,  1,  0,  0],
        [ 7,  0,  1,  ...,  1,  0,  0],
        ...,
        [ 5,  0,  4,  ...,  2,  0,  0],
        [34,  0,  1,  ...,  2,  0,  0],
        [34,  0,  1,  ...,  2,  0,  0]])


tensor([ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
         2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,
         2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  3,  3,  3,  3,  3,  3,  3,  3,
         3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  4,  4,  4,  4,  4,  4,  4,  4,
         4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4,
         4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  5,
         5,  5,  5,  5,  5,  5,  5,  5,  6,  6,  6,  6,  6,  6,  6,  6,  6,  6,
         6,  6,  6,  6,  6,  6,  6,  6,  7,  7,  7,  7,  7,  7,  7,  7,  7,  7,
         7,  7,  7,  7,  7,  7,  7,  7,  7,  7,  7,  7,  8,  8,  8,  8,  8,  8,
         8,  8,  8,  8,  9,  9,  9,  9,  9,  9,  9,  9,  9,  9,  9,  9,  9,  9,
         9,  9,  9,  9, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10,
        10, 10, 10, 11, 11, 11, 11, 11, 

## Model

We use the GCN we created in the node prediction task to create the node embeddings, which will then be pooled together to create the graph-level embedding.

In [6]:
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class GCN(torch.nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim, num_layers, act_fn = F.relu, batch_norm = True, dropout = 0, return_embeddings = False) -> None:
        super(GCN, self).__init__()
        
        self.batch_norm = batch_norm
        self.act_fn = act_fn
        self.dropout = dropout
        self.prediction_head = torch.nn.LogSoftmax(dim=1)
        self.return_embeddings = return_embeddings
        
        # Initialize the convolution layers
        self.conv_layers = torch.nn.ModuleList([GCNConv(input_dim, hidden_dim)]) #first layer
        for _ in range(num_layers-2):
            self.conv_layers.append(GCNConv(hidden_dim, hidden_dim)) #hidden layers
        self.conv_layers.append(GCNConv(hidden_dim, output_dim)) #last layer
        
        if batch_norm:
            self.batch_units = nn.ModuleList([torch.nn.BatchNorm1d(hidden_dim) for _ in range(num_layers - 1)])
            
    
    def reset_parameters(self):
        for conv in self.conv_layers:
            conv.reset_parameters()
        if self.batch_norm:
            for b in self.batch_units:
                b.reset_parameters()
                

    def forward(self, x, edge_index):
        
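        for l in range(len(self.conv_layers) - 1):
            x = self.conv_layers[l](x, edge_index)
            if self.batch_norm:  # batch-norm layers exist only when enabled
                x = self.batch_units[l](x)
            x = self.act_fn(x)
            # Apply dropout only in training mode (it is disabled after model.eval())
            x = F.dropout(x, p = self.dropout, training = self.training)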
        
        # Final layer
        out = self.conv_layers[-1](x, edge_index) # Embeddings
        if not self.return_embeddings:
            out = self.prediction_head(out)  # Classification
        
        return out       
                
        

In [7]:
from ogb.graphproppred.mol_encoder import AtomEncoder
from torch_geometric.nn import global_add_pool, global_mean_pool
import torch_geometric.data.batch
import typing

class GCN_Graph(torch.nn.Module):
    def __init__(self, hidden_dim, output_dim, num_layers, dropout) -> None:
        super().__init__()
        
        self.node_encoder = AtomEncoder(hidden_dim)
        
        self.gnn = GCN(input_dim = hidden_dim,
                              hidden_dim= hidden_dim,
                              output_dim= hidden_dim,
                              num_layers = num_layers,
                              dropout=dropout,
                              return_embeddings = True) # now we need them
        
        self.pool = global_mean_pool
        
        # Output layer
        self.linear = torch.nn.Linear(hidden_dim, output_dim)
        
    def reset_parameters(self):
        self.gnn.reset_parameters()
        self.linear.reset_parameters()

    # With a single graph we would pass x and edge_index directly to forward. Here we
    # pass the whole batched object instead. As explained above, a batch stores all of
    # its graphs as one disconnected graph, so its edge_index can be fed to the
    # convolution layers as-is: no message crosses between graphs. The batch attribute
    # is needed only afterwards, to pool node embeddings separately for each graph.
    def forward(self, batched_data: torch_geometric.data.Batch):
        
        x = batched_data.x
        edge_index = batched_data.edge_index
        batch_indicator_vector = batched_data.batch # maps each node to the index of its graph
        
        x = self.node_encoder(x)
        x = self.gnn(x, edge_index) # edge index contains all the edges in the disconnected graph
        x = self.pool(x, batch = batch_indicator_vector) # batch-wise pooling thanks to the batch attribute
        out=self.linear(x)
        
        return out
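
The `AtomEncoder` used above turns the 9 integer-coded atom features into a single `hidden_dim` vector per node. A rough sketch of the idea, assuming one embedding table per feature column whose outputs are summed. The class name `ToyAtomEncoder` and the vocabulary sizes below are made up for illustration and are not OGB's actual values.

```python
import torch
import torch.nn as nn


class ToyAtomEncoder(nn.Module):
    """Hypothetical stand-in: one embedding table per categorical feature column."""

    def __init__(self, vocab_sizes, emb_dim):
        super().__init__()
        self.tables = nn.ModuleList([nn.Embedding(v, emb_dim) for v in vocab_sizes])

    def forward(self, x):
        # x: [num_nodes, num_features] tensor of integer category ids
        out = 0
        for i, table in enumerate(self.tables):
            out = out + table(x[:, i])  # sum the embeddings of all columns
        return out


# Illustrative vocabulary sizes for 9 feature columns (assumed, not OGB's real ones).
vocab_sizes = [120, 5, 12, 12, 10, 6, 6, 2, 2]
encoder = ToyAtomEncoder(vocab_sizes, emb_dim=16)

# Two atoms with 9 integer features each, shaped like rows of `batch_0.x`.
x = torch.tensor([[5, 0, 4, 5, 3, 0, 2, 0, 0],
                  [7, 0, 1, 5, 0, 0, 1, 0, 0]])
h = encoder(x)
print(h.shape)  # torch.Size([2, 16])
```

This is why `GCN_Graph` sets the GCN's `input_dim` to `hidden_dim`: the convolution layers never see the raw integer features, only their learned embeddings.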
      

In [8]:
args = {
    'device': device,
    'num_layers': 5,
    'hidden_dim': 256,
    'dropout': 0.5,
    'lr': 0.001,
    'epochs': 30,
}
args

{'device': 'cpu',
 'num_layers': 5,
 'hidden_dim': 256,
 'dropout': 0.5,
 'lr': 0.001,
 'epochs': 30}

In [9]:
model = GCN_Graph(hidden_dim = args['hidden_dim'],
                  output_dim=dataset.num_tasks,
                  num_layers=args['num_layers'],
                  dropout=args['dropout']
                  ).to(device)

model(batch_0)[0:10] # prediction without training, just for debugging

tensor([[0.3511],
        [0.1852],
        [0.3711],
        [0.0610],
        [0.0609],
        [0.2694],
        [0.4110],
        [0.4006],
        [0.4486],
        [0.0654]], grad_fn=<SliceBackward0>)

### Train

In [10]:
def train(model, device, data_loader, optimizer, loss_fn):

    # data_loader: loader containing the batches of data (graphs)

    model.train()
    loss = torch.tensor(0.0)  # so that loss.item() works even if every batch is skipped

    for step, batch in enumerate(tqdm(data_loader, desc="Iteration")):
        batch = batch.to(device)

        # Skip degenerate batches (a single node, or a single graph)
        if batch.x.shape[0] == 1 or batch.batch[-1] == 0:
            continue

        # Ignore NaN targets (unlabeled) when computing the training loss:
        # NaN != NaN, so this mask is False exactly at the NaN entries.
        is_labeled = batch.y == batch.y
        optimizer.zero_grad()
        preds = model(batch)
        loss = loss_fn(preds[is_labeled].to(torch.float32), batch.y[is_labeled].to(torch.float32))

        loss.backward()
        optimizer.step()

    # Note: this is the loss of the last processed batch, not the epoch average
    return loss.item()
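
Two details of the training step are easy to miss: `batch.y == batch.y` is `False` exactly where the label is NaN (NaN never equals itself), and `BCEWithLogitsLoss` expects raw logits rather than probabilities. A small sketch with made-up logits and labels:

```python
import torch

# Made-up raw model outputs (logits) and labels; NaN marks an unlabeled graph.
preds = torch.tensor([[2.0], [-1.0], [0.5]])
y = torch.tensor([[1.0], [float("nan")], [0.0]])

# NaN != NaN, so comparing a tensor with itself is False only at NaN entries.
is_labeled = y == y

loss_fn = torch.nn.BCEWithLogitsLoss()
loss = loss_fn(preds[is_labeled], y[is_labeled])

# The same value computed by hand: sigmoid first, then binary cross-entropy.
probs = torch.sigmoid(preds[is_labeled])
targets = y[is_labeled]
manual = -(targets * probs.log() + (1 - targets) * (1 - probs).log()).mean()

print(is_labeled.view(-1).tolist())  # [True, False, True]
print(loss.item(), manual.item())    # the two values agree up to rounding
```

The mask is equivalent to `~torch.isnan(y)`. Because the model outputs logits, applying `torch.sigmoid` to a prediction gives the estimated probability of the positive class.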

### Evaluation

In [11]:
import pandas as pd

# The evaluation function, used to get the metrics and to save the results, when a model has been trained 
def eval(model, device, loader, evaluator, save_model_results=False, save_file=None):
    model.eval()
    y_true = []
    y_pred = []

    for step, batch in enumerate(tqdm(loader, desc="Iteration")):
        batch = batch.to(device)

        if batch.x.shape[0] == 1:
            pass
        else:
            with torch.no_grad():
                pred = model(batch)

            y_true.append(batch.y.view(pred.shape).detach().cpu())
            y_pred.append(pred.detach().cpu())

    y_true = torch.cat(y_true, dim = 0).numpy()
    y_pred = torch.cat(y_pred, dim = 0).numpy()

    input_dict = {"y_true": y_true, "y_pred": y_pred}

    if save_model_results:
        print ("Saving Model Predictions")

        # Create a pandas dataframe with two columns:
        # y_pred | y_true
        data = {}
        data['y_pred'] = y_pred.reshape(-1)
        data['y_true'] = y_true.reshape(-1)

        df = pd.DataFrame(data=data)
        # Save to csv
        df.to_csv('ogbg-molhiv_graph_' + save_file + '.csv', sep=',', index=False)

    return evaluator.eval(input_dict)

In [12]:
import copy

model = GCN_Graph(hidden_dim = args['hidden_dim'],
                  output_dim=dataset.num_tasks,
                  num_layers=args['num_layers'],
                  dropout=args['dropout']
                  ).to(device)

evaluator = Evaluator(name='ogbg-molhiv')
optimizer = torch.optim.Adam(model.parameters(), lr=args['lr'])
loss_fn = torch.nn.BCEWithLogitsLoss()
best_model = None
best_valid_acc = 0



for epoch in range(1, 1 + args["epochs"]):
    print('Training...')
    loss = train(model, device, train_loader, optimizer, loss_fn)

    print('Evaluating...')
    train_result = eval(model, device, train_loader, evaluator)
    val_result = eval(model, device, valid_loader, evaluator)
    test_result = eval(model, device, test_loader, evaluator)

    # dataset.eval_metric is ROC-AUC for ogbg-molhiv, so the "*_acc" values below are ROC-AUC scores
    train_acc, valid_acc, test_acc = train_result[dataset.eval_metric], val_result[dataset.eval_metric], test_result[dataset.eval_metric]
    if valid_acc > best_valid_acc:
        best_valid_acc = valid_acc
        best_model = copy.deepcopy(model)
    print(f'Epoch: {epoch:02d}, '
            f'Loss: {loss:.4f}, '
            f'Train: {100 * train_acc:.2f}%, '
            f'Valid: {100 * valid_acc:.2f}% '
            f'Test: {100 * test_acc:.2f}%')



Epoch: 01, Loss: 0.0349, Train: 69.33%, Valid: 67.61% Test: 68.04%
Epoch: 02, Loss: 0.0196, Train: 73.58%, Valid: 69.40% Test: 67.56%
Epoch: 03, Loss: 1.4120, Train: 75.12%, Valid: 72.50% Test: 70.79%
Epoch: 04, Loss: 0.0115, Train: 75.12%, Valid: 75.94% Test: 69.85%
Epoch: 05, Loss: 0.0168, Train: 76.91%, Valid: 76.61% Test: 72.49%
Epoch: 06, Loss: 0.0211, Train: 77.69%, Valid: 75.89% Test: 72.46%
Epoch: 07, Loss: 0.0435, Train: 77.26%, Valid: 77.25% Test: 71.49%
Epoch: 08, Loss: 0.0205, Train: 78.54%, Valid: 77.60% Test: 72.07%
Epoch: 09, Loss: 0.0305, Train: 77.56%, Valid: 78.17% Test: 71.68%
Epoch: 10, Loss: 0.0358, Train: 78.93%, Valid: 76.88% Test: 70.99%
Epoch: 11, Loss: 0.0526, Train: 79.45%, Valid: 76.79% Test: 70.01%
Epoch: 12, Loss: 0.0287, Train: 79.94%, Valid: 77.97% Test: 72.31%
Epoch: 13, Loss: 0.0407, Train: 80.45%, Valid: 77.16% Test: 72.16%
Epoch: 14, Loss: 0.0299, Train: 79.86%, Valid: 77.97% Test: 71.91%
Epoch: 15, Loss: 0.0222, Train: 81.16%, Valid: 76.32% Test: 72.65%
Epoch: 16, Loss: 0.0219, Train: 81.15%, Valid: 76.54% Test: 72.10%
Epoch: 17, Loss: 0.0213, Train: 80.10%, Valid: 79.11% Test: 70.60%
Epoch: 18, Loss: 0.0355, Train: 80.82%, Valid: 75.56% Test: 72.17%
Epoch: 19, Loss: 0.0280, Train: 81.21%, Valid: 76.29% Test: 72.65%
Epoch: 20, Loss: 0.0243, Train: 82.03%, Valid: 77.96% Test: 72.85%
Epoch: 21, Loss: 0.4573, Train: 81.96%, Valid: 77.91% Test: 72.86%
Epoch: 22, Loss: 0.0235, Train: 81.23%, Valid: 78.00% Test: 73.07%
Epoch: 23, Loss: 0.6924, Train: 82.24%, Valid: 80.49% Test: 73.53%
Epoch: 24, Loss: 0.0532, Train: 82.29%, Valid: 80.88% Test: 75.46%
Epoch: 25, Loss: 0.0187, Train: 82.28%, Valid: 79.80% Test: 72.27%
Epoch: 26, Loss: 0.0248, Train: 82.05%, Valid: 77.97% Test: 73.66%
Epoch: 27, Loss: 0.0303, Train: 82.77%, Valid: 77.87% Test: 73.54%
Epoch: 28, Loss: 0.0199, Train: 82.78%, Valid: 79.31% Test: 72.91%
Epoch: 29, Loss: 0.0361, Train: 83.08%, Valid: 76.30% Test: 71.39%
Epoch: 30, Loss: 0.0407, Train: 83.32%, Valid: 79.88% Test: 74.50%
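
The percentages in the log come from `dataset.eval_metric`, which for `ogbg-molhiv` is ROC-AUC rather than accuracy, so the raw logits can be passed to the evaluator without thresholding. As a rough illustration in pure Python (not the evaluator's actual implementation), ROC-AUC is the fraction of positive/negative pairs in which the positive example receives the higher score, with ties counted as half. The helper `roc_auc` and the scores below are made up for this sketch.

```python
def roc_auc(y_true, y_score):
    """Pairwise definition of ROC-AUC (hypothetical helper, quadratic time)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Made-up labels and raw scores (logits work as well as probabilities,
# because only the ranking of the scores matters).
y_true = [0, 0, 1, 1]
y_score = [-1.2, 0.4, 0.3, 2.0]
print(roc_auc(y_true, y_score))  # 0.75
```

A score of 0.5 corresponds to random ranking and 1.0 to a perfect one.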
