Building and training a simple neural network (MLP)
===================================================



In [None]:
# Don't forget to configure the GPU
!pip install glasspy

## Introduction



In this notebook we will create and train a simple multilayer perceptron (MLP) neural network. We will use the `lightning` module to do this.



## Imports



In [None]:
import pickle
import torch
import lightning as L
import torch.nn as nn
import torch.optim as optim
from sklearn.metrics import mean_squared_error
from torch.nn import functional as F
from sklearn.model_selection import train_test_split
from glasspy.data import SciGlass
from sklearn.preprocessing import MaxAbsScaler
from torch.utils.data import DataLoader, TensorDataset

## Data pipeline



The first step is to collect, process and split the data.



### Collecting the data



In [None]:
config_prop = {
    "must_have_and": ["Tg"],
}

config_comp = {
    "must_have_only": [
        "SiO2",
        "Li2O",
        "Na2O",
        "K2O",
        "CaO",
        "MgO",
        "BaO",
        "Al2O3",
        "TiO2",
    ],
}

source = SciGlass(
    elements_cfg={},
    properties_cfg=config_prop,
    compounds_cfg=config_comp,
)

source.remove_duplicate_composition(
    scope="compounds",
    decimals=3,
    aggregator="median",
)

df = source.data

df["property"].info()

In [None]:
idx = df.index

X = df["compounds"]
y = df["property"]["Tg"]
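
The selections above suggest that the table returned by `source.data` uses two-level column labels (a group such as `compounds` or `property`, then a column name). The sketch below reproduces that kind of selection on a tiny stand-in table. The table `toy_df` and all of its values are made up for illustration and do not come from SciGlass.

```python
import pandas as pd

# Hypothetical stand-in for `source.data`: columns are (group, name) pairs.
columns = pd.MultiIndex.from_tuples(
    [("compounds", "SiO2"), ("compounds", "Na2O"), ("property", "Tg")]
)
toy_df = pd.DataFrame(
    [[0.75, 0.25, 740.0], [0.80, 0.20, 760.0]],  # made-up numbers
    columns=columns,
)

# Selecting a top-level key returns every column in that group.
toy_X = toy_df["compounds"]
# Chained selection returns a single column as a Series.
toy_y = toy_df["property"]["Tg"]

print(toy_X.shape)     # (rows, number of compound columns)
print(toy_y.tolist())
```

Here `toy_X` is a two-dimensional table of features and `toy_y` is a one-dimensional series of targets, which is why the targets are later reshaped with `reshape(-1,1)`.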

### Splitting the data



For more information on why we need to split the data into training, test and validation datasets, see **Raschka, Model Evaluation, Model Selection, and Algorithm Selection in Machine Learning, (2020). [https://doi.org/10.48550/arXiv.1811.12808](https://doi.org/10.48550/arXiv.1811.12808)**.



In [None]:
TEST_SIZE = 0.1
RANDOM_SEED = 61455

In [None]:
indices = df.index
train_val_idx, test_idx = train_test_split(
    indices, test_size=TEST_SIZE, random_state=RANDOM_SEED
)

X_train_val = X.loc[train_val_idx]
y_train_val = y.loc[train_val_idx]

X_test = X.loc[test_idx].values
y_test = y.loc[test_idx].values.reshape(-1,1)

In [None]:
train_idx, val_idx = train_test_split(
    train_val_idx, test_size=TEST_SIZE, random_state=RANDOM_SEED
)

X_train = X.loc[train_idx].values
y_train = y.loc[train_idx].values.reshape(-1,1)

X_val = X.loc[val_idx].values
y_val = y.loc[val_idx].values.reshape(-1,1)
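
Because the same fraction (`TEST_SIZE = 0.1`) is applied twice, the validation set is 10% of the *remaining* data, not 10% of the whole dataset. The sketch below checks the resulting sizes on a hypothetical dataset of 1000 rows; the real sizes depend on how many entries the SciGlass query returns.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical dataset of 1000 row labels (not the real dataset size).
toy_indices = np.arange(1000)

# First split: hold out 10% of everything for testing.
toy_train_val, toy_test = train_test_split(
    toy_indices, test_size=0.1, random_state=0
)
# Second split: hold out 10% of the remainder for validation.
toy_train, toy_val = train_test_split(
    toy_train_val, test_size=0.1, random_state=0
)

print(len(toy_train), len(toy_val), len(toy_test))
```

With these settings the data ends up split roughly 81% / 9% / 10% between training, validation and test.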

### Normalization



Normalizing the data is usually good practice when training neural networks. Note that to avoid *data leakage*, we must fit the scalers using only the training dataset and then apply them, unchanged, to the validation and test datasets.



In [None]:
x_scaler = MaxAbsScaler()
x_scaler.fit(X_train)

y_scaler = MaxAbsScaler()
y_scaler.fit(y_train)

X_train = x_scaler.transform(X_train)
y_train = y_scaler.transform(y_train)

X_val = x_scaler.transform(X_val)
y_val = y_scaler.transform(y_val)

X_test = x_scaler.transform(X_test)
y_test = y_scaler.transform(y_test)
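
To make the behaviour of `MaxAbsScaler` concrete, here is a minimal sketch on made-up numbers. It shows that the scaler divides each column by the largest absolute value seen during `fit`, so data that was not used for fitting can end up outside the range [-1, 1].

```python
import numpy as np
from sklearn.preprocessing import MaxAbsScaler

# Made-up data: two features, two training rows, one validation row.
toy_train = np.array([[1.0, -2.0], [2.0, 4.0]])
toy_val = np.array([[4.0, 2.0]])

toy_scaler = MaxAbsScaler()
toy_scaler.fit(toy_train)  # learns the max |value| of each column: [2, 4]

scaled_train = toy_scaler.transform(toy_train)
scaled_val = toy_scaler.transform(toy_val)

print(toy_scaler.max_abs_)  # per-feature maximum absolute value
print(scaled_train)         # every entry lies in [-1, 1]
print(scaled_val)           # may fall outside [-1, 1]
```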

In [None]:
X_train = torch.tensor(X_train, dtype=torch.float32)
y_train = torch.tensor(y_train, dtype=torch.float32)

X_val = torch.tensor(X_val, dtype=torch.float32)
y_val = torch.tensor(y_val, dtype=torch.float32)

X_test = torch.tensor(X_test, dtype=torch.float32)
y_test = torch.tensor(y_test, dtype=torch.float32)

### The DataModule class



The `DataModule` class takes care of feeding data to the neural network model during training and validation. While you can write your own `DataModule` class, the code below should be good for most use cases.



In [None]:
class DataModule(L.LightningDataModule):
    def __init__(
        self,
        X_train,
        y_train,
        X_val,
        y_val,
        X_test,
        y_test,
        batch_size = 256,
        num_workers = 2,
    ):
        super().__init__()

        self.batch_size = batch_size
        self.num_workers = num_workers

        self.X_train = X_train
        self.y_train = y_train
        self.X_val = X_val
        self.y_val = y_val
        self.X_test = X_test
        self.y_test = y_test

    def train_dataloader(self):
        return DataLoader(
            TensorDataset(self.X_train, self.y_train),
            batch_size=self.batch_size,
            num_workers=self.num_workers,
            shuffle=True,  # present the training samples in a new order each epoch
        )

    def val_dataloader(self):
        return DataLoader(
            TensorDataset(self.X_val, self.y_val),
            batch_size=self.batch_size,
            num_workers=self.num_workers,
        )

    def test_dataloader(self):
        return DataLoader(
            TensorDataset(self.X_test, self.y_test),
            batch_size=self.batch_size,
            num_workers=self.num_workers,
        )
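
To see what a `DataLoader` actually hands to the network, the sketch below wraps two small made-up tensors in a `TensorDataset` and iterates over the batches. The tensor names and sizes are chosen only for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Made-up tensors: 10 samples, 3 features, 1 target.
toy_X = torch.arange(30, dtype=torch.float32).reshape(10, 3)
toy_y = torch.arange(10, dtype=torch.float32).reshape(10, 1)

toy_loader = DataLoader(TensorDataset(toy_X, toy_y), batch_size=4)

# Each iteration yields one (features, targets) pair of tensors.
batch_sizes = [len(xb) for xb, yb in toy_loader]
print(batch_sizes)
```

The last batch is smaller because 10 is not a multiple of 4. The same happens with the default `batch_size = 256` whenever the dataset size is not a multiple of 256.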

We need to instantiate the `DataModule` class.



In [None]:
dm = DataModule(X_train, y_train, X_val, y_val, X_test, y_test)

## Neural network



### Creating the neural network class



Neural networks are complex machine learning algorithms with many dials and knobs to configure. When using `PyTorch` or `Lightning`, we need to define the neural network as a class. Below is an example that creates the `MLP` class with two hidden layers. The methods ending in `_step` define what happens to each batch during training, validation and testing, and you probably won't want to change them when you start learning neural networks.



In [None]:
class MLP(L.LightningModule):
    def __init__(
        self, num_features, layer1, layer2, num_targets
    ):
        super().__init__()

        self.layers = nn.Sequential(
            nn.Linear(num_features, layer1),
            nn.Sigmoid(),
            nn.Linear(layer1, layer2),
            nn.Sigmoid(),
            nn.Linear(layer2, num_targets),
        )

        self.loss_fun = F.mse_loss

    def forward(self, x):
        x = self.layers(x)
        return x

    def training_step(self, batch, batch_idx):
        x, y = batch
        y_pred = self(x)
        loss = self.loss_fun(y_pred, y)  # mse_loss expects (input, target)

        self.log("loss", loss, prog_bar=True)

        return loss

    def validation_step(self, batch, batch_idx):
        x, y = batch
        y_pred = self(x)
        loss = self.loss_fun(y_pred, y)  # mse_loss expects (input, target)

        self.log("val_loss", loss, prog_bar=True)

        return loss

    def test_step(self, batch, batch_idx):
        x, y = batch
        y_pred = self(x)
        loss = self.loss_fun(y_pred, y)  # mse_loss expects (input, target)

        self.log("test_loss", loss)

        return loss

    def configure_optimizers(self):
        optimizer = optim.SGD(self.parameters(), lr=1e-3)
        return optimizer
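
By default, Lightning runs the backward pass and the optimizer update for you each time `training_step` returns a loss. As a rough illustration of what a single update with `optim.SGD` does, the sketch below applies one step to a single made-up parameter `w` with a hand-picked loss. It is not the loss used by the `MLP` class.

```python
import torch
import torch.optim as optim

# One trainable parameter, starting at 1.0 (illustrative value).
w = torch.tensor([1.0], requires_grad=True)
optimizer = optim.SGD([w], lr=1e-3)

loss = ((2.0 * w) ** 2).sum()  # loss = 4 * w**2, so d(loss)/dw = 8 * w
loss.backward()                # computes the gradient (8.0 here)
optimizer.step()               # w <- w - lr * gradient

print(w.item())  # close to 1.0 - 0.001 * 8.0
```

The small learning rate means each step changes the parameter only slightly, which is one reason training needs many passes over the data.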

We need to instantiate the class we have just created.



In [None]:
num_features = X.shape[1]
num_targets = 1
layer1 = 3
layer2 = 2

my_mlp = MLP(
    num_features, layer1, layer2, num_targets
)
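
To get a feel for how small this network is, the sketch below rebuilds the same stack of layers in plain PyTorch and counts the trainable parameters. It assumes 9 input features (one per compound listed in `config_comp`); the actual value of `num_features` depends on the columns of `X`.

```python
import torch
import torch.nn as nn

# Same layer sizes as above; 9 input features is an assumption.
toy_layers = nn.Sequential(
    nn.Linear(9, 3),
    nn.Sigmoid(),
    nn.Linear(3, 2),
    nn.Sigmoid(),
    nn.Linear(2, 1),
)

# Each Linear layer holds (inputs x outputs) weights plus one bias per output.
num_params = sum(p.numel() for p in toy_layers.parameters())
print(num_params)  # (9*3 + 3) + (3*2 + 2) + (2*1 + 1)

# A batch of 5 random samples produces one prediction per sample.
toy_out = toy_layers(torch.rand(5, 9))
print(toy_out.shape)
```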

### Training the neural network



We say that an epoch has passed when the neural network "sees" all the training data. Training a good neural network usually involves running the data through it many times. You can control this with the `max_epochs` argument.



In [None]:
NUM_EPOCHS = 20

trainer = L.Trainer(max_epochs=NUM_EPOCHS)
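
The number of epochs and the batch size together determine how many weight updates the network receives. The arithmetic can be sketched as follows, assuming one update per batch and a hypothetical training set of 10,000 samples (the real size depends on the data query).

```python
import math

# Hypothetical training-set size; the other two values match this notebook.
n_train = 10_000
batch_size = 256
num_epochs = 20

# The last batch of an epoch may be smaller, hence the ceiling.
steps_per_epoch = math.ceil(n_train / batch_size)
total_steps = steps_per_epoch * num_epochs

print(steps_per_epoch, total_steps)
```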

In [None]:
trainer.fit(my_mlp, dm)

### Testing the neural network



Finally, we will put the network into `evaluation` mode and test if it is a reasonable model.



In [None]:
my_mlp.eval()

In [None]:
dm.setup("test")

with torch.no_grad():
    X_true = dm.X_test

    y_true = dm.y_test
    y_true = y_scaler.inverse_transform(y_true)

    y_pred = my_mlp(X_true)
    y_pred = y_scaler.inverse_transform(y_pred)

    RMSE = mean_squared_error(y_true, y_pred) ** (1/2)

    print(RMSE)
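
The calls to `inverse_transform` matter because the network was trained on scaled targets: an RMSE computed on scaled values is not in the units of the original property. The sketch below illustrates this with made-up targets and predictions. For simplicity the toy scaler is fitted on the toy targets themselves, which differs from the notebook, where `y_scaler` is fitted on the training targets.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import MaxAbsScaler

# Made-up targets and predictions in the original units.
toy_true = np.array([[500.0], [800.0], [1000.0]])
toy_pred = np.array([[520.0], [790.0], [1000.0]])

toy_scaler = MaxAbsScaler().fit(toy_true)  # divides by 1000 here

rmse_scaled = mean_squared_error(
    toy_scaler.transform(toy_true), toy_scaler.transform(toy_pred)
) ** 0.5
rmse_original = mean_squared_error(toy_true, toy_pred) ** 0.5

print(rmse_scaled, rmse_original)
```

The two values differ by exactly the scaling factor, so only the second one can be read directly in the units of the measured property.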

We have not trained this neural network in a deterministic way. Therefore, every time you run this code, you will (most likely) get a different result. In any case, this is a very, very, very simple neural network, and it is unreasonable to expect it to perform well on the glass transition data we have collected. Try increasing the number of neurons to see if the performance improves!
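
One source of this run-to-run variation is the random initialization of the weights. The sketch below shows, on a small stand-alone layer, that fixing PyTorch's random seed makes the initial weights repeatable. The helper `make_layer` is hypothetical and exists only for this illustration. Lightning also provides a `seed_everything` helper, and its documentation describes further options for reproducible training.

```python
import torch
import torch.nn as nn

def make_layer(seed):
    # Fixing the seed fixes the random initial weights.
    torch.manual_seed(seed)
    return nn.Linear(3, 2)

layer_a = make_layer(0)
layer_b = make_layer(0)
layer_c = make_layer(1)

same_seed_match = torch.equal(layer_a.weight, layer_b.weight)
diff_seed_match = torch.equal(layer_a.weight, layer_c.weight)
print(same_seed_match, diff_seed_match)
```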



### Saving the model



You may wish to save your model for future use. This is easy to do.



In [None]:
file_name = "my_model.p"
torch.save(my_mlp.state_dict(), file_name)

Then you can load it...



In [None]:
other_mlp = MLP(
    num_features, layer1, layer2, num_targets
)

state_dict = torch.load(file_name, weights_only=True)
other_mlp.load_state_dict(state_dict)

... and check that it gives the same results (as it should).



In [None]:
other_mlp.eval()

dm.setup("test")

with torch.no_grad():
    X_true = dm.X_test

    y_true = dm.y_test
    y_true = y_scaler.inverse_transform(y_true)

    y_pred = other_mlp(X_true)
    y_pred = y_scaler.inverse_transform(y_pred)

    RMSE = mean_squared_error(y_true, y_pred) ** (1/2)

    print(RMSE)
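
It can help to know what `state_dict()` contains: a mapping from parameter names to tensors, without the code of the class itself. That is why the `MLP` class must be instantiated before the weights are loaded. The sketch below inspects the state dictionary of a small plain-PyTorch stand-in. `ToyNet` is hypothetical, and its 9 input features are an assumption.

```python
import torch.nn as nn

class ToyNet(nn.Module):
    # Plain-PyTorch stand-in that stores its layers in `self.layers`, like `MLP`.
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(9, 3),  # 9 input features is an assumption
            nn.Sigmoid(),     # no parameters, so nothing is saved for it
            nn.Linear(3, 1),
        )

toy_state = ToyNet().state_dict()

# Keys combine the attribute name, the position in Sequential and the tensor name.
state_shapes = {name: tuple(t.shape) for name, t in toy_state.items()}
for name, shape in state_shapes.items():
    print(name, shape)
```

Loading a saved state dictionary therefore only works if the new model has layers with matching names and shapes.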

### Training on the GPU



Training on the GPU can greatly increase the training speed, especially for larger networks and datasets. Make sure you have the GPU-enabled version of `PyTorch` and up-to-date GPU drivers.



In [None]:
yet_another_mlp = MLP(num_features, layer1, layer2, num_targets)

trainer = L.Trainer(
    devices=1,
    accelerator="gpu",
    max_epochs=NUM_EPOCHS,
)

trainer.fit(yet_another_mlp, dm)
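
Requesting `accelerator="gpu"` on a machine without a usable GPU will fail. One possible safeguard, sketched below, is to ask PyTorch whether a CUDA device is available and fall back to the CPU otherwise. This covers CUDA GPUs only; other accelerator types are outside the scope of this sketch.

```python
import torch

# Fall back to the CPU when no CUDA-capable GPU is visible to PyTorch.
accelerator = "gpu" if torch.cuda.is_available() else "cpu"
print(accelerator)
```

The resulting string can then be passed as the `accelerator` argument of `L.Trainer`.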