I'm going to explore multi-dimensional inputs and outputs in this notebook.

We're going to have vector-valued inputs and outputs: $x = [x_{0} \; x_{1} \; \dots \; x_{n}]$ and $y = [y_{0} \; y_{1} \; \dots \; y_{n}]$. For simplicity's sake, let $n = 1$.

I'm also going to assume each component of $x$ maps only to the corresponding component of $y$, that is,

$x_{0} \rightarrow y_{0}$, and $x_{1} \rightarrow y_{1}$.

There'll be functions mapping each input to its corresponding output, of course, for example,

$y_{0} = x_{0}^{2}$, and $y_{1} = x_{1}^{4}$.

I'm going to use more complex functions to strain the model a little more, but the real challenge is plotting these, because

$x = [x_{0} \; x_{1}]$, and $y = [y_{0} \; y_{1}]$.

A neural network usually has one output layer. This time I'm going to branch the output, so that each component of $y$ gets its own output layer.

Also, I'm going to use a dropout probability of 0.3, since that gave the best results in the dropouts notebook.

With that, I'll get started.

In [21]:
#importing libraries
import numpy as np
from sklearn import datasets
import torch
import torch.nn as nn
import torch.optim as optim
import torchbnn as bnn
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.model_selection import train_test_split

In [22]:
#picking the GPU when one is available, for speed's sake (datasets and model get moved to it later)
is_available = torch.cuda.is_available()
device = torch.device("cuda:0" if is_available else "cpu")

My data generation will be different here. I'm going to:
* Have two linspaces: x0, x1
* Have two functions: fx0, fx1
* Compute two outputs from them: y0 = fx0(x0), y1 = fx1(x1)
* Form a dataframe (df) with the four columns
* Use train_test_split on the df

I'll deal with moving the data to the device later.

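I'll also keep the functions to simple polynomials:

$y_{0} = f(x_{0}) = x_{0}^{5} - 7x_{0}^{2} + 4$

$y_{1} = f(x_{1}) = 3x_{1}^{2} + 2x_{1} - 5$

In the code below, the constant term of each function is replaced by a uniform random offset (between 2 and 6 for $y_{0}$, and between $-7$ and $-2$ for $y_{1}$), so the data are noisy.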
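As a quick sanity check on the two polynomials above, here's a minimal noise-free sketch in plain Python. The names `f0_clean` and `f1_clean` are mine, picked for illustration; they aren't part of the pipeline in this notebook.

```python
# Noise-free reference versions of the two target polynomials.
# f0_clean and f1_clean are illustrative names, not used elsewhere in the notebook.

def f0_clean(x: float) -> float:
    """y0 = x^5 - 7x^2 + 4"""
    return x**5 - 7 * x**2 + 4


def f1_clean(x: float) -> float:
    """y1 = 3x^2 + 2x - 5"""
    return 3 * x**2 + 2 * x - 5


# Evaluate both at the ends and the centre of the [-2, 2] input range.
for x in (-2, 0, 2):
    print(x, f0_clean(x), f1_clean(x))
```

On $[-2, 2]$, $y_{0}$ drops to $-56$ at $x_{0} = -2$, while $y_{1}$ stays between roughly $-5.3$ and $11$, so the two outputs sit on quite different scales.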

In [33]:
#creating dataset
x0, x1 = torch.linspace(-2, 2, 1000), torch.linspace(-2, 2, 1000)

#noisy targets: the constant term of each polynomial is replaced by a uniform random offset
fx0 = lambda x: torch.pow(x, 5) - 7*torch.pow(x, 2) + (torch.rand(x.shape)*4 + 2)
fx1 = lambda x: 3*torch.pow(x, 2) + 2*x - (torch.rand(x.shape)*5 + 2)

y0 = fx0(x0)
y1 = fx1(x1)

df = pd.DataFrame(
    {
        "x0":x0,
        "x1":x1,
        "y0":y0,
        "y1":y1
    }
)

#df.head()
x_train, x_test, y_train, y_test = train_test_split(df[["x0", "x1"]], df[["y0", "y1"]], test_size=0.2, random_state=1)

#reset indices and drop old indices of train and test sets
x_train.reset_index(drop=True, inplace=True)
x_test.reset_index(drop=True, inplace=True)
y_train.reset_index(drop=True, inplace=True)
y_test.reset_index(drop=True, inplace=True)

| | x0 | x1 |
|---|---|---|
| 0 | -0.470471 | -0.470471 |
| 1 | 1.97998 | 1.97998 |
| 2 | 1.931932 | 1.931932 |
| 3 | -1.811812 | -1.811812 |
| 4 | 0.086086 | 0.086086 |
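I left device handling for later. One way it could look is sketched below, assuming the splits are pandas DataFrames like the ones above. The small `x_demo` and `y_demo` frames and the `to_device_tensor` helper are hypothetical stand-ins, there only so the snippet runs on its own.

```python
import pandas as pd
import torch

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")


def to_device_tensor(frame: pd.DataFrame) -> torch.Tensor:
    """Hypothetical helper: DataFrame -> float32 tensor of shape (rows, columns) on `device`."""
    return torch.tensor(frame.to_numpy(), dtype=torch.float32).to(device)


# Tiny stand-in frames with the same column layout as x_train / y_train.
x_demo = pd.DataFrame({"x0": [-2.0, 0.0, 2.0], "x1": [-2.0, 0.0, 2.0]})
y_demo = pd.DataFrame({"y0": [-56.0, 4.0, 8.0], "y1": [3.0, -5.0, 11.0]})

x_tensor = to_device_tensor(x_demo)
y_tensor = to_device_tensor(y_demo)

# Each target column can be sliced out as an (N, 1) tensor, one per output layer.
y0_tensor, y1_tensor = y_tensor[:, 0:1], y_tensor[:, 1:2]
print(x_tensor.shape, y0_tensor.shape, y1_tensor.shape)
```

Slicing with `0:1` rather than `0` keeps the trailing dimension, so each target has the same `(N, 1)` shape that a single-unit output layer produces.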


Since I'm working with multi-dimensional inputs and outputs, the model architecture has to change to accommodate the new dataset's properties.

In [34]:
class DualOutputBNN(nn.Module):
    def __init__(self, no_of_neurones, dropout_prob):
        super(DualOutputBNN, self).__init__()
        #shared hidden layer: takes the 2-D input [x0, x1]
        self.shared_layer = nn.Sequential(
            bnn.BayesLinear(prior_mu=0, prior_sigma=0.1, in_features=2, out_features=no_of_neurones),
            nn.ReLU(),
            nn.Dropout(dropout_prob),
        )
        #one Bayesian output layer per target, both fed by the shared layer
        self.output_layer_y0 = bnn.BayesLinear(prior_mu=0, prior_sigma=0.1, in_features=no_of_neurones, out_features=1)
        self.output_layer_y1 = bnn.BayesLinear(prior_mu=0, prior_sigma=0.1, in_features=no_of_neurones, out_features=1)

    def forward(self, x):
        shared = self.shared_layer(x)
        y0 = self.output_layer_y0(shared)
        y1 = self.output_layer_y1(shared)
        return y0, y1
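To check that the branching pattern gives one prediction per target, here's a minimal sketch with the same layout. It swaps `bnn.BayesLinear` for plain `nn.Linear` so it runs without torchbnn; `DualOutputSketch` is a hypothetical stand-in, not the model used in this notebook.

```python
import torch
import torch.nn as nn


class DualOutputSketch(nn.Module):
    """Deterministic stand-in: a shared hidden layer feeding two separate output layers."""

    def __init__(self, no_of_neurones: int, dropout_prob: float):
        super().__init__()
        self.shared_layer = nn.Sequential(
            nn.Linear(2, no_of_neurones),
            nn.ReLU(),
            nn.Dropout(dropout_prob),
        )
        self.output_layer_y0 = nn.Linear(no_of_neurones, 1)
        self.output_layer_y1 = nn.Linear(no_of_neurones, 1)

    def forward(self, x):
        shared = self.shared_layer(x)
        return self.output_layer_y0(shared), self.output_layer_y1(shared)


sketch = DualOutputSketch(no_of_neurones=16, dropout_prob=0.3)
batch = torch.zeros(5, 2)  # 5 samples, each [x0, x1]
y0_pred, y1_pred = sketch(batch)
print(y0_pred.shape, y1_pred.shape)  # each output layer returns one value per sample
```

The hidden size of 16 is arbitrary here. The point is only that `forward` returns a pair of `(batch, 1)` tensors, which can each be compared against its own target column.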

I'll now write a function to initialise the model above, in the same way that I did in previous notebooks.

In [None]:
def initialise_model(no_of_neurones: int, dropout_prob: float, lr: float = 0.01) -> tuple:
    """
    Initialise the DualOutputBNN model with its loss functions and optimizer.

    Parameters:
    - no_of_neurones (int): Number of neurons in the hidden layer.
    - dropout_prob (float): Dropout probability.
    - lr (float): Learning rate for the optimizer. Default is 0.01.

    Returns:
    - A tuple containing the initialized model, MSE loss function, KL loss function, KL weight, and optimizer.
    """

    model = DualOutputBNN(no_of_neurones, dropout_prob).to(device)

    mse_loss = nn.MSELoss().to(device)
    kl_loss = bnn.BKLLoss(reduction='mean', last_layer_only=False).to(device)
    kl_weight = 0.01  # This could also be parameterized if needed

    optimizer = torch.optim.Adam(model.parameters(), lr=lr)

    return model, mse_loss, kl_loss, kl_weight, optimizer
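
The function returns the pieces of the loss separately. One plausible way to combine them in a training step (an assumption at this point, since the training loop isn't defined yet) is to sum the two per-output MSE terms and add the weighted KL term:

$$\mathcal{L} = \mathrm{MSE}(\hat{y}_{0}, y_{0}) + \mathrm{MSE}(\hat{y}_{1}, y_{1}) + w_{\mathrm{KL}} \cdot \mathrm{KL}$$

Here $\hat{y}_{0}$ and $\hat{y}_{1}$ are the two outputs of `forward`, $w_{\mathrm{KL}}$ is `kl_weight` (0.01 above), and $\mathrm{KL}$ is the value of `kl_loss(model)`.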

