In [None]:
%load_ext autoreload
%autoreload 2

In [None]:
import sys; sys.path.extend(["../src", ".."])
import sensai
import pandas as pd
import numpy as np
from typing import *
import config

cfg = config.get_config()
sensai.util.logging.configureLogging()

# Neural Networks

Neural networks are a very powerful class of models, especially in cases where learning representations from low-level information (such as pixels, audio samples or text) is key. sensAI provides many useful abstractions for dealing with this class of models, facilitating data handling, learning and evaluation.

sensAI mainly provides abstractions for PyTorch, but there is also rudimentary support for TensorFlow.

## Image Classification

As an example use case, let us classify handwritten digits in pixel images from the MNIST dataset. The images are greyscale (no colour information) and 28x28 pixels in size.

In [None]:
mnistDF = pd.read_csv(cfg.datafile_path("mnist_train.csv.zip"))

In addition to the `label` column, the data frame contains one column for every pixel, each pixel being represented by an 8-bit integer (0 to 255).

In [None]:
mnistDF.head(5)

Let's create the I/O data for our experiments.

In [None]:
mnistIoData = sensai.InputOutputData.fromDataFrame(mnistDF, "label")

Now that we have the image data separated from the labels, let's write a function to restore the 2D image arrays and take a look at some of the images.

In [None]:
import matplotlib.pyplot as plt

def reshape2DImage(series):
    return series.values.reshape(28, 28)

fig, axs = plt.subplots(nrows=1, ncols=5, figsize=(10, 5))
for i in range(5):
    axs[i].imshow(reshape2DImage(mnistIoData.inputs.iloc[i]), cmap="binary")

### Applying Predefined Models



We create an evaluator in order to test the performance of our models, randomly splitting the data.

In [None]:
evaluatorParams = sensai.evaluation.VectorClassificationModelEvaluatorParams(fractionalSplitTestFraction=0.2)
evalUtil = sensai.evaluation.ClassificationEvaluationUtil(mnistIoData, evaluatorParams=evaluatorParams)

One pre-defined model we could try is a simple multi-layer perceptron. A PyTorch-based implementation is provided via class `MultiLayerPerceptronVectorClassificationModel`. This implementation supports CUDA-accelerated computations (on Nvidia GPUs), yet we shall stick to CPU-based computation (`cuda=False`) in this tutorial.

In [None]:
import sensai.torch

nnOptimiserParams = sensai.torch.NNOptimiserParams(earlyStoppingEpochs=2, batchSize=54)
torchMLPModel = sensai.torch.models.MultiLayerPerceptronVectorClassificationModel(hiddenDims=(50, 20), 
        cuda=False, normalisationMode=sensai.torch.NormalisationMode.MAX_ALL, 
        nnOptimiserParams=nnOptimiserParams, pDropout=0.0) \
    .withName("MLP")

Neural networks work best on **normalised inputs**, so we have opted to apply basic normalisation by specifying a normalisation mode which will transform inputs by dividing by the maximum value found across all columns in the training data. For more elaborate normalisation options, we could have used a data frame transformer (DFT), particularly `DFTNormalisation` or `DFTSkLearnTransformer`.
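
To make the idea concrete, here is a minimal NumPy sketch of this kind of normalisation. It is not sensAI's implementation; the helper names `fit_max_all` and `apply_max_all` and the toy arrays are hypothetical and serve illustration only. The key point is that the scaling factor is determined from the training data alone and then reused for any other data.

```python
import numpy as np


def fit_max_all(train: np.ndarray) -> float:
    """Determine the single largest value across all columns of the training inputs."""
    return float(train.max())


def apply_max_all(x: np.ndarray, max_value: float) -> np.ndarray:
    """Scale inputs by the maximum that was found in the training data."""
    return x / max_value


# Toy stand-ins for pixel columns (hypothetical data, 8-bit values)
train = np.array([[0, 128, 255], [64, 0, 32]], dtype=np.uint8)
test = np.array([[255, 51, 0]], dtype=np.uint8)

scale = fit_max_all(train)
train_norm = apply_max_all(train, scale)
test_norm = apply_max_all(test, scale)  # reuses the training maximum
```

For 8-bit pixel data, this amounts to dividing every pixel by 255, which maps all inputs to the range [0, 1].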

sensAI's default **neural network training algorithm** is based on early stopping, which involves checking, at regular intervals, the performance of the model on a validation set (which is split from the training set) and ultimately selecting the model that performed best on the validation set. You have full control over the loss evaluation method used to select the best model (by passing a respective `NNLossEvaluator` instance to `NNOptimiserParams`) as well as the method that is used to split the training set into the actual training set and the validation set (by adding a `DataFrameSplitter` to the model or using a custom `TorchDataSetProvider`).
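
The selection logic behind early stopping can be sketched in a few lines of plain Python. This is a simplified illustration rather than sensAI's actual training loop: the function `select_best_epoch`, its `patience` parameter and the validation losses are all hypothetical, and we assume for the sketch that training stops once the validation loss has not improved for `patience` consecutive epochs.

```python
from typing import Sequence, Tuple


def select_best_epoch(val_losses: Sequence[float], patience: int) -> Tuple[int, int]:
    """Return (best_epoch, last_epoch_run), both 1-based.

    Iterates over per-epoch validation losses and stops as soon as
    `patience` epochs have passed without a new best loss.
    """
    best_epoch = 0
    best_loss = float("inf")
    last_epoch = 0
    for epoch, loss in enumerate(val_losses, start=1):
        last_epoch = epoch
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop training
    return best_epoch, last_epoch


# Hypothetical validation losses, one per epoch
losses = [0.90, 0.55, 0.48, 0.50, 0.49, 0.47, 0.60]
best_epoch, last_epoch = select_best_epoch(losses, patience=2)
```

With these made-up losses, training stops after epoch 5 and the model from epoch 3 is selected, so the lower loss at epoch 6 is never reached. This illustrates the trade-off that a small patience value implies.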

Given the vectorised nature of our MNIST dataset, we can apply any type of model that accepts numeric inputs. Let's compare the neural network we defined above against another pre-defined model, which is based on a scikit-learn implementation and uses an ensemble of decision trees (a random forest) rather than a neural network.

In [None]:
randomForestModel = sensai.sklearn.classification.SkLearnRandomForestVectorClassificationModel(min_samples_leaf=1, n_estimators=10) \
    .withName("RandomForest")

Let's compare the two models using our evaluation utility.

In [None]:
evalUtil.compareModels([randomForestModel, torchMLPModel])

Both models perform reasonably well.

### Creating a Custom CNN Model

Given that this is an image recognition problem, it can be sensible to apply convolutional neural networks (CNNs), which can analyse patches of the image in order to generate more high-level features from them.
Specifically, we shall apply a neural network model which uses multiple convolutions, a max-pooling layer and a multi-layer perceptron at the end in order to produce the classification.

For classification and regression, sensAI provides the fundamental classes `TorchVectorClassificationModel` and `TorchVectorRegressionModel` respectively. Ultimately, these classes will wrap an instance of `torch.nn.Module`, the base class for neural networks in PyTorch.

#### Wrapping a Custom torch.nn.Module Instance

If we already have an implementation of a ``torch.nn.Module``, it can be straightforwardly adapted to a sensAI model.

Let's say we had the following implementation of a torch module, which performs the steps described above.


In [None]:
import torch

class MnistCnnModule(torch.nn.Module):
    def __init__(self, imageDim: int, outputDim: int, numConv: int, kernelSize: int, poolingKernelSize: int, 
            mlpHiddenDims: Sequence[int], outputActivationFn: sensai.torch.ActivationFunction, pDropout=0.0):
        super().__init__()
        k = kernelSize
        p = poolingKernelSize
        self.cnn = torch.nn.Conv2d(1, numConv, (k, k))
        self.pool = torch.nn.MaxPool2d((p, p))
        self.dropout = torch.nn.Dropout(p=pDropout)
        reducedDim = (imageDim-k+1)/p
        if int(reducedDim) != reducedDim:
            raise ValueError(f"Pooling kernel size {p} is not a divisor of post-convolution dimension {imageDim-k+1}")
        self.mlp = sensai.torch.models.MultiLayerPerceptron(numConv * int(reducedDim)**2, outputDim, mlpHiddenDims,
            outputActivationFn=outputActivationFn.getTorchFunction(),
            hidActivationFn=sensai.torch.ActivationFunction.RELU.getTorchFunction(),
            pDropout=pDropout)

    def forward(self, x):
        x = self.cnn(x.unsqueeze(1))
        x = self.pool(x)
        x = x.view(x.shape[0], -1)
        x = self.dropout(x)
        return self.mlp(x)

Since this module requires 2D images as input, we will need a component that transforms the vector input that is given in our data frame into a tensor that will serve as input to the module.
In sensAI, the abstraction for this purpose is a ``sensai.torch.Tensoriser``. A **Tensoriser** can, in principle, perform arbitrary computations in order to produce, from a data frame with N rows, one or more tensors of length N (first dimension equal to N) that will ultimately be fed to the neural network.

Luckily, for the case at hand, we already have the function ``reshape2DImage`` from above to assist in the implementation of the tensoriser.

In [None]:
class ImageReshapingInputTensoriser(sensai.torch.RuleBasedTensoriser):
    def _tensorise(self, df: pd.DataFrame) -> Union[torch.Tensor, List[torch.Tensor]]:
        images = [reshape2DImage(row) for _, row in df.iterrows()]
        return torch.tensor(np.stack(images)).float() / 255

In this case, we derived the class from ``RuleBasedTensoriser`` rather than ``Tensoriser``, because it does not require fitting. We additionally took care of the normalisation within the tensoriser.
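
The shape transformation performed by ``ImageReshapingInputTensoriser`` can be illustrated without sensAI or PyTorch. The following sketch uses a NumPy array as a stand-in for the data frame's values and reshapes all rows in a single call instead of iterating over rows. The function `rows_to_images` and the toy 4x4 "images" are hypothetical and serve only to show the shapes involved.

```python
import numpy as np


def rows_to_images(flat: np.ndarray, image_dim: int) -> np.ndarray:
    """Reshape (N, image_dim * image_dim) pixel rows into (N, image_dim, image_dim)
    float images scaled to the range [0, 1]."""
    n, num_pixels = flat.shape
    if num_pixels != image_dim * image_dim:
        raise ValueError(f"Expected {image_dim * image_dim} pixels per row, got {num_pixels}")
    return flat.reshape(n, image_dim, image_dim).astype(np.float32) / 255


# Two hypothetical 4x4 "images", flattened to 16 columns each
flat = np.arange(2 * 16, dtype=np.uint8).reshape(2, 16)
images = rows_to_images(flat, image_dim=4)
print(images.shape)
```

The first dimension of the result equals the number of rows N, as is required of a tensoriser's output. For large data frames, a single vectorised reshape of this kind is likely to be faster than row-wise iteration.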

Now we have all we need to create a sensAI ``TorchVectorClassificationModel`` that will work on the input/output data we loaded earlier.

In [None]:
cnnModule = MnistCnnModule(28, 10, 32, 5, 2, (200, 20), sensai.torch.ActivationFunction.LOG_SOFTMAX)
nnOptimiserParams = sensai.torch.NNOptimiserParams(optimiser=sensai.torch.Optimiser.ADAMW, optimiserLR=0.01, batchSize=1024, 
    earlyStoppingEpochs=3)
cnnModelFromModule = sensai.torch.TorchVectorClassificationModel.fromModule(
        sensai.torch.ClassificationOutputMode.LOG_PROBABILITIES, 
        cnnModule, cuda=False, nnOptimiserParams=nnOptimiserParams) \
    .withInputTensoriser(ImageReshapingInputTensoriser()) \
    .withName("CNN")

We have now fully defined all the necessary parameters, including parameters controlling the training of the model.
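
To see how the arguments passed to ``MnistCnnModule`` above determine the input dimension of the final multi-layer perceptron, we can reproduce the arithmetic from the module's constructor in isolation. The helper `mlp_input_dim` is a hypothetical function written for this illustration; it assumes, as in the module, a convolution without padding and with stride 1, followed by non-overlapping max-pooling.

```python
def mlp_input_dim(image_dim: int, kernel_size: int, pooling_kernel_size: int, num_conv: int) -> int:
    """Compute the size of the flattened CNN output that is fed to the MLP."""
    # A convolution without padding (stride 1) shrinks each side by kernel_size - 1
    conv_dim = image_dim - kernel_size + 1
    if conv_dim % pooling_kernel_size != 0:
        raise ValueError(
            f"Pooling kernel size {pooling_kernel_size} is not a divisor of "
            f"post-convolution dimension {conv_dim}")
    # Non-overlapping pooling divides each side by the pooling kernel size
    reduced_dim = conv_dim // pooling_kernel_size
    # One reduced_dim x reduced_dim feature map per convolution
    return num_conv * reduced_dim ** 2


flat_dim = mlp_input_dim(image_dim=28, kernel_size=5, pooling_kernel_size=2, num_conv=32)
print(flat_dim)  # 4608
```

The 28x28 image is reduced to 24x24 by the convolution and to 12x12 by the pooling, yielding 32 * 12 * 12 = 4608 inputs for the MLP.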

We are now ready to evaluate the model.

In [None]:
evalUtil.performSimpleEvaluation(cnnModelFromModule);

#### Creating an Input-/Output-Adaptive Custom Model

While the above approach allows us to straightforwardly encapsulate a ``torch.nn.Module``, it does not follow sensAI's principle of adapting model hyperparameters, whenever possible, to the inputs and outputs we receive during training. Notice that in the above example, we had to hard-code the image dimension (``28``) as well as the number of classes (``10``), even though these parameters could have been easily determined from the data. Especially in other domains where feature engineering is possible, we might want to experiment with different combinations of features, and therefore automatically adapting to inputs is key if we want to avoid editing the model hyperparameters time and time again; similarly, we might change the set of target labels in our classification problem and the model should simply adapt to a changed output dimension.

To design a model that can fully adapt to the inputs and outputs, we can simply subclass ``TorchVectorClassificationModel``, where the late instantiation of the underlying model is catered for. Naturally, delayed construction of the underlying model necessitates the use of factories and thus results in some indirections. 

If we had designed the above model to be within the sensAI ``VectorModel`` realm from the beginning, here's what we might have written:

In [None]:
import torch

class CnnModel(sensai.torch.TorchVectorClassificationModel):
    def __init__(self, cuda: bool, kernelSize: int, numConv: int, poolingKernelSize: int, mlpHiddenDims: Sequence[int], 
            nnOptimiserParams: sensai.torch.NNOptimiserParams, pDropout=0.0):
        self.cuda = cuda
        self.outputActivationFn = sensai.torch.ActivationFunction.LOG_SOFTMAX
        self.kernelSize = kernelSize
        self.numConv = numConv
        self.poolingKernelSize = poolingKernelSize
        self.mlpHiddenDims = mlpHiddenDims
        self.pDropout = pDropout
        super().__init__(sensai.torch.ClassificationOutputMode.forActivationFn(self.outputActivationFn),
            modelClass=self.VectorTorchModel, modelArgs=[self], nnOptimiserParams=nnOptimiserParams)

    class VectorTorchModel(sensai.torch.VectorTorchModel):
        def __init__(self, parent: "CnnModel"):
            super().__init__(parent.cuda)
            self._parent = parent

        def createTorchModuleForDims(self, inputDim: int, outputDim: int) -> torch.nn.Module:
            return self.Module(int(np.sqrt(inputDim)), outputDim, self._parent)

        class Module(torch.nn.Module):
            def __init__(self, imageDim, outputDim, parent: "CnnModel"):
                super().__init__()
                k = parent.kernelSize
                p = parent.poolingKernelSize
                self.cnn = torch.nn.Conv2d(1, parent.numConv, (k, k))
                self.pool = torch.nn.MaxPool2d((p, p))
                self.dropout = torch.nn.Dropout(p=parent.pDropout)
                reducedDim = (imageDim-k+1)/p
                if int(reducedDim) != reducedDim:
                    raise ValueError(f"Pooling kernel size {p} is not a divisor of post-convolution dimension {imageDim-k+1}")
                self.mlp = sensai.torch.models.MultiLayerPerceptron(parent.numConv * int(reducedDim)**2, outputDim, parent.mlpHiddenDims,
                    outputActivationFn=parent.outputActivationFn.getTorchFunction(),
                    hidActivationFn=sensai.torch.ActivationFunction.RELU.getTorchFunction(),
                    pDropout=parent.pDropout)

            def forward(self, x):
                x = self.cnn(x.unsqueeze(1))
                x = self.pool(x)
                x = x.view(x.shape[0], -1)
                x = self.dropout(x)
                return self.mlp(x)

This requires only marginally more code than the previous implementation.
The outer class, which provides the sensAI `VectorModel` features, serves mainly to hold the parameters, and the inner class inheriting from `VectorTorchModel` serves as a factory for the `torch.nn.Module`, providing us with the input and output dimensions (number of input columns and number of classes respectively) based on the data, thus enabling the model to adapt. If we had required even more adaptiveness, we could have learnt more about the data from within the fitting process of a custom input tensoriser (i.e. we could have added an inner ``Tensoriser`` class, which could have derived further hyperparameters from the data in its implementation of the fitting method).
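
Note that `createTorchModuleForDims` derives the image side length via `int(np.sqrt(inputDim))`, which silently truncates if the number of input columns is not a perfect square. A stricter variant could validate this explicitly. The following is a sketch of such a check; the helper `image_dim_from_input_dim` is hypothetical and not part of sensAI.

```python
import math


def image_dim_from_input_dim(input_dim: int) -> int:
    """Derive the side length of a square image from the number of input columns,
    raising an error if the columns cannot form a square image."""
    side = math.isqrt(input_dim)  # exact integer square root (rounded down)
    if side * side != input_dim:
        raise ValueError(f"Input dimension {input_dim} is not a perfect square")
    return side


print(image_dim_from_input_dim(784))  # 28
```

Failing early in this way would make it obvious when, for example, an additional feature column has accidentally been added to the inputs.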

Let's instantiate our model and evaluate it.

In [None]:
cnnModel = CnnModel(cuda=False, kernelSize=5, numConv=32, poolingKernelSize=2, mlpHiddenDims=(200,20),
        nnOptimiserParams=nnOptimiserParams) \
    .withName("CNN'") \
    .withInputTensoriser(ImageReshapingInputTensoriser())

evalData = evalUtil.performSimpleEvaluation(cnnModel)

Our CNN models slightly improve upon the MLP model we evaluated earlier. Let's do another comparison, so we get all the metrics in one place.

In [None]:
comparisonData = evalUtil.compareModels([torchMLPModel, cnnModelFromModule, cnnModel, randomForestModel], fitModels=False)
comparisonData.resultsDF

Note that any differences between the two CNN models are due only to randomness in the parameter initialisation; they are functionally identical.

Could the CNN model have produced even better results? Let's take a look at some examples where the CNN model went wrong by inspecting the evaluation data that was returned earlier.

In [None]:
misclassified = evalData.getMisclassifiedTriplesPredTrueInput()
fig, axs = plt.subplots(nrows=3, ncols=3, figsize=(9,9))
for i, (predClass, trueClass, inputRow) in enumerate(misclassified[:9]):  # avoid shadowing the builtin `input`
    axs[i//3][i%3].imshow(reshape2DImage(inputRow), cmap="binary")
    axs[i//3][i%3].set_title(f"{trueClass} misclassified as {predClass}")
plt.tight_layout()

While some of these examples are indeed ambiguous, there still is room for improvement.