Optimising functions
====================================

Now for something a bit different.
PyTorch is a tensor processing library and, whilst it has a focus on neural networks, it can also be used for more standard function optimisation.
In this example we will use torchbearer to minimise a simple function.

Imports
--------------------------------

In [None]:
import torch
from torch.nn import Module

try:
    import torchbearer
except ImportError:
    !pip install torchbearer
    import torchbearer

The Model
----------------------------------------

First we will need to create something that looks very similar to a neural network model - but with the purpose of minimising our function.
We store the current estimates for the minimum as parameters in the model (so PyTorch optimisers can find and optimise them) and we return the function value in the forward method.

In [None]:
ESTIMATE = torchbearer.state_key('est')


class Net(Module):
    def __init__(self, x):
        super().__init__()
        self.pars = torch.nn.Parameter(x)

    def f(self):
        """
        function to be minimised:
        f(x) = (x[0]-5)^2 + x[1]^2 + (x[2]-1)^2
        Solution:
        x = [5,0,1]
        """
        out = torch.zeros_like(self.pars)
        out[0] = self.pars[0]-5
        out[1] = self.pars[1]
        out[2] = self.pars[2]-1
        return torch.sum(out**2)

    def forward(self, _, state):
        state[ESTIMATE] = self.pars.detach().unsqueeze(1)
        return self.f()
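To make the idea concrete, here is a minimal sketch of the same minimisation written as a plain PyTorch loop, without torchbearer. It is not part of the example itself: the vectorised `f`, the `target` tensor, the learning rate of 0.1 and the 100 steps are all illustrative assumptions chosen so that the loop converges quickly.

```python
import torch

# Known minimum of f(x) = (x[0]-5)^2 + x[1]^2 + (x[2]-1)^2.
target = torch.tensor([5.0, 0.0, 1.0])


def f(x):
    # Vectorised form of the same function that Net.f computes element by element.
    return torch.sum((x - target) ** 2)


# The estimate is a leaf tensor that requires gradients, playing the role of
# the torch.nn.Parameter stored in the model.
x = torch.tensor([2.0, 1.0, 10.0], requires_grad=True)

# Illustrative settings (assumptions): lr=0.1 and 100 steps.
optimiser = torch.optim.SGD([x], lr=0.1)

for _ in range(100):
    optimiser.zero_grad()   # clear gradients from the previous step
    value = f(x)            # the function value doubles as the "loss"
    value.backward()        # gradient of f with respect to x
    optimiser.step()        # x <- x - lr * grad

final_estimate = x.detach()
print(final_estimate)
```

Wrapping the estimate in a Module, as above, does the same job but lets torchbearer drive the loop and report metrics.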

The Loss
----------------------------------------

For function minimisation we have an analogue to neural network losses - we minimise the value of the function under the current estimates of the minimum.
Note that as we are using a base loss, torchbearer passes it the network output and the "label" (which is of no use here), so the loss simply returns the network output.


In [None]:
def loss(y_pred, y_true):
    return y_pred

Optimising
----------------------------------------

We need two more things before we can start optimising with torchbearer.
First, we need our initial guess, which we've set to [2.0, 1.0, 10.0]. Second, we need to tell torchbearer how "long" an epoch is, i.e. how many optimisation steps we want for each epoch.
For our simple function, we can complete the optimisation in a single epoch, but for more complex optimisations we might want to take multiple epochs and include tensorboard logging and perhaps learning rate annealing to find a final solution.
We have set the number of optimisation steps for this example as 50000.


In [None]:
p = torch.tensor([2.0, 1.0, 10.0])
training_steps = 50000

The learning rate chosen for this example is very low and we could get convergence much faster with a larger rate; however, the low rate allows us to view convergence in real time.
We define the model and optimiser in the standard way.

In [None]:
model = Net(p)
optim = torch.optim.SGD(model.parameters(), lr=0.0001)
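To see why a larger rate would converge faster, note that the gradient of our function is 2 * (x - target), so each plain SGD step multiplies the distance to the minimum by (1 - 2 * lr). The sketch below works through that arithmetic in pure Python; the helper names `error_factor` and `steps_needed`, and the comparison rate of 0.1, are illustrative assumptions rather than part of the example.

```python
import math


def error_factor(lr, steps):
    """Factor by which the distance to the minimum shrinks after `steps` updates.

    Valid for plain gradient descent on f(x) = sum((x - target)^2), where one
    update maps the error e = x - target to (1 - 2 * lr) * e.
    """
    return (1.0 - 2.0 * lr) ** steps


def steps_needed(lr, shrink_to):
    """Smallest number of updates after which the error factor is <= shrink_to."""
    return math.ceil(math.log(shrink_to) / math.log(1.0 - 2.0 * lr))


# With the rate used in this example the error shrinks slowly but steadily.
print(error_factor(0.0001, 50000))

# Steps needed to shrink the error by a factor of 10,000 at two different rates.
print(steps_needed(0.0001, 1e-4))
print(steps_needed(0.1, 1e-4))
```

Under these assumptions, a rate of 0.1 reaches the same accuracy in a few dozen steps that lr=0.0001 needs tens of thousands of steps to reach.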

Finally, we start optimising on the GPU (this requires a CUDA-capable device) and print the final minimum estimate.

In [None]:
tbtrial = torchbearer.Trial(model, optim, loss, [torchbearer.metrics.running_mean(ESTIMATE, dim=1), 'loss'])
tbtrial.for_train_steps(training_steps).to('cuda')
tbtrial.run()
print(list(model.parameters())[0].data)
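If no GPU is available, the call to .to('cuda') will fail. One possible way around this, sketched below, is to pick the device at run time with `torch.cuda.is_available()`. The `device` variable and the stand-in `estimate` tensor are illustrative assumptions and not part of the example.

```python
import torch

# Use the GPU when PyTorch can see one, otherwise fall back to the CPU.
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Stand-in for the initial guess, created directly on the chosen device.
estimate = torch.tensor([2.0, 1.0, 10.0], device=device)
print(device, estimate.device)
```

The resulting string could then be passed to the trial in place of the hard-coded 'cuda', for instance tbtrial.for_train_steps(training_steps).to(device).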

Viewing Progress
--------------------------------------------

You might have noticed in the previous snippet that the example uses a metric we've not seen before.

In [None]:
0/1(t): 100%|██████████| 50000/50000 [00:53<00:00, 931.36it/s, loss=4.5502, running_est=[4.9988, 0.0, 1.0004], running_loss=0.0]