First, we import all necessary modules from NumPy and PyTorch (code copied from neural_net.py).

The last import is just so I can slow the training loop down a little later.

In [1]:
import numpy as np
import torch.nn as nn
import torch
import time

I create a set of input data. I believe this must end up as a Torch tensor, but first I will create it as a list.

In [2]:
Input_Data = [n for n in range(10)]
print(Input_Data, type(Input_Data))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9] <class 'list'>


To turn this into a Torch tensor, I first turn it into a NumPy array. This step is not strictly required, since torch.tensor also accepts a Python list directly, but it is the route I follow here.

In [3]:
Input_Data = np.array(Input_Data)
print(Input_Data, type(Input_Data))

[0 1 2 3 4 5 6 7 8 9] <class 'numpy.ndarray'>
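
As an aside, here is a minimal sketch suggesting that the NumPy step is optional, since torch.tensor also accepts a plain Python list. The names values, from_list and from_array are made up for this illustration and are not used elsewhere in the notebook.

```python
import numpy as np
import torch

values = [n for n in range(10)]

# Build the tensor straight from the list, and via a NumPy array for comparison.
from_list = torch.tensor(values, dtype=torch.float32)
from_array = torch.tensor(np.array(values), dtype=torch.float32)

print(from_list.dtype, torch.equal(from_list, from_array))
```

Both routes should give the same float32 tensor, so the choice between them is mostly a matter of taste.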


We can then turn this array into a tensor. Contrary to what the guidelines in the 60 minute blitz tutorial say, you can turn a NumPy array into a PyTorch tensor simply using the torch.tensor operation. This, crucially, allows you to specify the data type as float. Without the dtype argument, torch.tensor keeps the data type of the array, which here is an integer type. The layers of the network hold float32 weights, so an integer input creates errors later on.

The requires_grad=True is not necessary, at least for feedforward use of the network. Training does not need it either, since the optimizer only uses gradients with respect to the weights and biases.

In [4]:
Input_Data = torch.tensor(Input_Data, dtype=torch.float32, requires_grad=True)
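
To see why the dtype matters, here is a small sketch (my own illustration, with made-up names int_array, inferred, as_float and layer). It compares the inferred data type with the forced one, then feeds the integer tensor to a linear layer, which I expect to raise a RuntimeError.

```python
import numpy as np
import torch
import torch.nn as nn

int_array = np.array([n for n in range(10)])

inferred = torch.tensor(int_array)                       # dtype taken from the array (an integer type)
as_float = torch.tensor(int_array, dtype=torch.float32)  # dtype forced to float32
print(inferred.dtype, as_float.dtype)

# nn.Linear stores float32 weights, so an integer input should be rejected.
layer = nn.Linear(1, 3)
try:
    layer(inferred.view(-1, 1))
    integer_input_failed = False
except RuntimeError as err:
    integer_input_failed = True
    print("Integer input failed:", err)
```

The exact error message differs between PyTorch versions, so it is printed rather than assumed.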

We then need to make sure the tensor is given as a column, not a row. Each row is taken later as a separate instance for training, and the remaining dimensions hold the data associated with that instance. If we supply our simple tensor as a row vector, it will look like one single instance with 10 input parameters. Thus, we reshape it into a column. A plain transpose does not work here because a vector built from a flat list has only one dimension, so there is no second axis to swap it with; view gives it one explicitly. The -1 means: infer whatever size (the number of rows) is necessary.

In [5]:
Input_Data = Input_Data.view(-1, 1)
print(Input_Data, type(Input_Data))

tensor([[0.],
        [1.],
        [2.],
        [3.],
        [4.],
        [5.],
        [6.],
        [7.],
        [8.],
        [9.]], grad_fn=<ViewBackward>) <class 'torch.Tensor'>
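
The effect of the reshape is easiest to see from the shapes. This is a minimal sketch with made-up names (row, column, single, layer), assuming a layer that expects 1 input per instance as in the network built further down.

```python
import torch
import torch.nn as nn

row = torch.arange(10, dtype=torch.float32)   # shape (10,): one dimension only
column = row.view(-1, 1)                      # shape (10, 1): 10 instances, 1 feature each
single = row.view(1, -1)                      # shape (1, 10): 1 instance, 10 features

layer = nn.Linear(1, 3)                       # expects 1 feature per instance
print(row.shape, column.shape, single.shape)
print(layer(column).shape)                    # one row of 3 outputs per instance
```

Passing single to this layer would not work, because the layer would see 10 features where it expects 1.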


We can now create the target data, which in this case is the result of performing y = 2x + 54, where y is Target_Data and x is Input_Data. The same method will be used, but in one step this time.

In [6]:
def Poly(x):
    y = 2*x + 54
    return y

Target_Data = [Poly(n) for n in range(10)]
Target_Data = np.array(Target_Data)
Target_Data = torch.tensor(Target_Data, dtype=torch.float32, requires_grad=True)
Target_Data = Target_Data.view(-1,1)
print(Target_Data, type(Target_Data))

tensor([[54.],
        [56.],
        [58.],
        [60.],
        [62.],
        [64.],
        [66.],
        [68.],
        [70.],
        [72.]], grad_fn=<ViewBackward>) <class 'torch.Tensor'>
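
An alternative sketch, not the route taken above: because arithmetic on tensors is elementwise, the target can also be computed directly from a column tensor. The names x and y are placeholders for this illustration.

```python
import torch

x = torch.arange(10, dtype=torch.float32).view(-1, 1)
y = 2 * x + 54          # elementwise, so the result keeps the (10, 1) shape

print(y.shape, y[0].item(), y[-1].item())
```

This avoids the list and NumPy steps, at the cost of hiding the function Poly that documents what the target is.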


Now that we have data and the corresponding target, we change subject slightly and set up a simple neural network. The design for this will be one that has a single input, 2 hidden layers, each with 3 nodes, and no activation functions. Edit: (the loop fails after at most the first 10 epochs unless some kind of activation function is used, presumably because the results blow up. All results become NaN, so sigmoid activations are added below.) There will be one output. Deciding on the shape and functional description of each node should be enough to make a forward pass on the network. To do this we will need to use PyTorch.
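
A side observation on the design without activation functions: stacked linear layers can only ever represent a single straight-line map, because composing affine maps gives another affine map. The sketch below checks this numerically. It does not explain the NaN values, and all names (l1, l2, l3, linear_only, W, b) are made up for the illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
l1, l2, l3 = nn.Linear(1, 3), nn.Linear(3, 3), nn.Linear(3, 1)
linear_only = nn.Sequential(l1, l2, l3)   # same shape as the design, no activations

with torch.no_grad():
    # Compose the three affine maps into one weight and one bias.
    W = l3.weight @ l2.weight @ l1.weight                      # shape (1, 1)
    b = l3.weight @ (l2.weight @ l1.bias + l2.bias) + l3.bias  # shape (1,)

    x = torch.arange(10, dtype=torch.float32).view(-1, 1)
    stacked = linear_only(x)
    collapsed = x @ W.T + b

print(torch.allclose(stacked, collapsed, atol=1e-4))
```

For a target like y = 2x + 54 a straight-line map is in principle enough, so the activation functions matter here for the training behaviour rather than for what the network can represent.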

I will construct the network the same way as in "neural_net", which seems to differ significantly from what I read in the tutorial. In the tutorial, fundamentally, a class was created with certain properties that were then linked together by a method of that class. In DeepMoD, what seems to be occurring is that a list of neural network layers is created, and then some function is used to transform that list into a network of some class that I cannot guess yet.
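
For comparison, here is a rough sketch of the two styles side by side, assuming sigmoid activations between the layers. The class name SmallNet and its attribute names are hypothetical; this is neither the tutorial's code nor DeepMoD's.

```python
import torch
import torch.nn as nn

class SmallNet(nn.Module):
    """Tutorial style: layers are attributes, linked together in forward()."""
    def __init__(self):
        super().__init__()
        self.hidden1 = nn.Linear(1, 3)
        self.hidden2 = nn.Linear(3, 3)
        self.output = nn.Linear(3, 1)

    def forward(self, x):
        x = torch.sigmoid(self.hidden1(x))
        x = torch.sigmoid(self.hidden2(x))
        return self.output(x)

class_net = SmallNet()

# List style: the same layers in order, chained by nn.Sequential.
list_net = nn.Sequential(nn.Linear(1, 3), nn.Sigmoid(),
                         nn.Linear(3, 3), nn.Sigmoid(),
                         nn.Linear(3, 1))

x = torch.arange(10, dtype=torch.float32).view(-1, 1)
print(class_net(x).shape, list_net(x).shape)
```

Both objects are nn.Module instances with the same architecture; the list style just saves writing the forward method when the layers are applied strictly one after another.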

Create the list of layers. We start with the first hidden layer. It has 1 as the first argument to specify that each node should expect 1 input (each single element of our input data). It then has a 3 to specify the number of nodes in this layer.

In [7]:
Network = [nn.Linear(1, 3)]

We then append the activation function, a sigmoid, which is applied elementwise to the output of the layer before it.

In [8]:
Network.append(nn.Sigmoid())

We then append the second hidden layer, which we chose to be identical to the first. This time the first argument is 3, as there were 3 nodes in the previous layer. The activation function appended above sits in between these two layers, and a second one follows this layer. append() is a method of lists.

In [9]:
Network.append(nn.Linear(3, 3))
Network.append(nn.Sigmoid())

We then append the output layer. It is single valued, so the second argument becomes 1. There is no activation function here as we just want to see the result.

In [11]:
Network.append(nn.Linear(3, 1))

We then apply this funny function that changes the list into a network. The * unpacks the list so that each layer is passed as a separate argument:

In [12]:
Torch_Network = nn.Sequential(*Network)
print(type(Torch_Network))

<class 'torch.nn.modules.container.Sequential'>
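
To get a feel for what the resulting object contains, the sketch below rebuilds the same list under made-up names (layers, network, n_params), prints the network and counts its trainable parameters.

```python
import torch.nn as nn

layers = [nn.Linear(1, 3), nn.Sigmoid(), nn.Linear(3, 3), nn.Sigmoid(), nn.Linear(3, 1)]
network = nn.Sequential(*layers)   # same as nn.Sequential(layers[0], layers[1], ...)

print(network)                     # lists the layers in order, indexed from 0
n_params = sum(p.numel() for p in network.parameters())
print(n_params)                    # (1*3 + 3) + (3*3 + 3) + (3*1 + 1) = 22
```

The sigmoid layers contribute no parameters, so only the three linear layers are counted.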


We now have a built network. We can feed a tensor to this network and get the feedforward output. Currently, the weights and biases have all been set randomly, so the result will itself be fairly random. However, for a given input, it will be consistent, as no backprop occurs and the network is unchanged.

To demonstrate the feedforward output, we can create a tensor holding a single value, which will therefore output a single value. We need to write the number 1 as "1." so that the tensor is created with a float data type; a bare 1 would give an integer tensor. We will output what the network currently tells us for 1 and 20, twice each, to show consistency.

In [13]:
print(Torch_Network(torch.tensor([1.])))
print(Torch_Network(torch.tensor([1.])))
print(Torch_Network(torch.tensor([20.])))
print(Torch_Network(torch.tensor([20.])))

tensor([0.7607], grad_fn=<AddBackward0>)
tensor([0.7607], grad_fn=<AddBackward0>)
tensor([0.7586], grad_fn=<AddBackward0>)
tensor([0.7586], grad_fn=<AddBackward0>)
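
If only the feedforward output is wanted, the forward pass can be wrapped in torch.no_grad(), which skips the gradient bookkeeping. A minimal sketch on a freshly built network (the names net, first and second are made up, and the seed is only there to make the run repeatable):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(1, 3), nn.Sigmoid(),
                    nn.Linear(3, 3), nn.Sigmoid(),
                    nn.Linear(3, 1))

with torch.no_grad():   # no graph is recorded, so the outputs carry no grad_fn
    first = net(torch.tensor([1.]))
    second = net(torch.tensor([1.]))

print(first, second, torch.equal(first, second))
```

The two outputs should be identical, since nothing changes the weights between the calls.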


In [14]:
Output_Data = Torch_Network(Input_Data)
print(Output_Data, type(Output_Data))

tensor([[0.7589],
        [0.7607],
        [0.7619],
        [0.7624],
        [0.7625],
        [0.7621],
        [0.7617],
        [0.7612],
        [0.7607],
        [0.7603]], grad_fn=<AddmmBackward>) <class 'torch.Tensor'>


The next thing to do is to start to train the neural network. For that, two additional things need to be decided upon: the loss function and the optimisation function. For the former, we will simply use MSE loss:

In [15]:
Loss_Function = nn.MSELoss()
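
As a quick check of what MSE loss computes, this sketch compares nn.MSELoss with the mean of the squared differences worked out by hand. The numbers are invented for the illustration.

```python
import torch
import torch.nn as nn

prediction = torch.tensor([[1.0], [2.0], [4.0]])
target = torch.tensor([[1.0], [3.0], [2.0]])

loss = nn.MSELoss()(prediction, target)
by_hand = ((prediction - target) ** 2).mean()   # (0 + 1 + 4) / 3

print(loss, loss.dim(), loss.item())
```

The loss comes back as a tensor with zero dimensions, and .item() turns it into an ordinary Python float.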

And for the optimisation function, we will simply use stochastic gradient descent with a learning rate of 0.01.

In [16]:
optimizer = torch.optim.SGD(Torch_Network.parameters(), lr=0.01)
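
What one step of this optimizer does can be checked by hand. The sketch below assumes plain SGD with no momentum or weight decay, as configured above, and uses a single made-up linear layer: each parameter should move by the learning rate times its gradient.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Linear(1, 1)
optimizer = torch.optim.SGD(layer.parameters(), lr=0.01)

x = torch.arange(10, dtype=torch.float32).view(-1, 1)
y = 2 * x + 54

optimizer.zero_grad()
loss = nn.MSELoss()(layer(x), y)
loss.backward()

# What the update should be: parameter minus learning rate times gradient.
expected = [p.detach() - 0.01 * p.grad for p in layer.parameters()]
optimizer.step()
updated = [p.detach().clone() for p in layer.parameters()]

print(all(torch.allclose(e, u) for e, u in zip(expected, updated)))
```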

Borrowing the simple components of the loop from the Neural Networks part of the PyTorch tutorial, we combine everything into a loop, so that we train the neural network. The function "optimizer.zero_grad()" needs to be run first, because PyTorch adds new gradients to the stored ones rather than overwriting them.
After that, we calculate the network's prediction on the data. We are feeding it a tensor of 10 rows, so it evaluates every row with the same weights (in one batched operation, not one element at a time) and outputs a tensor of equal size to give the results.
The loss is then calculated. This is a tensor as well, holding a single value.
Then Loss.backward() does the backprop to work out the gradient of the loss with respect to each weight and bias.
Then optimizer.step() triggers the SGD, which adjusts every weight and bias once, using the gradient of the loss averaged over all 10 input data elements.

In [17]:
Max_Iterations = 10000

for n in range(Max_Iterations):
    optimizer.zero_grad()   # zero the gradient buffers
    Output_Data = Torch_Network(Input_Data)
    Loss = Loss_Function(Output_Data, Target_Data)
    Loss.backward()
    optimizer.step()    # Does the update
    # Uncomment to watch the progress every 10 epochs:
    # if n % 10 == 0:
    #     print('Reached Epoch', n)
    #     print('Loss is', Loss)
    #     print('Current Prediction is', Output_Data)
    #     # time.sleep(5) # This is just for my convenience so I can see what is happening before it continues

print('Ready to end this')

Ready to end this
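
A variation on the loop, sketched under my own assumptions: it records the loss at every step as a plain float and prints it occasionally, which makes it easier to see whether training is converging. The seed, the 2000 steps, the printing interval and the names (x, y, net, history) are choices made for this illustration only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.arange(10, dtype=torch.float32).view(-1, 1)
y = 2 * x + 54

net = nn.Sequential(nn.Linear(1, 3), nn.Sigmoid(),
                    nn.Linear(3, 3), nn.Sigmoid(),
                    nn.Linear(3, 1))
loss_function = nn.MSELoss()
optimizer = torch.optim.SGD(net.parameters(), lr=0.01)

history = []
for n in range(2000):             # fewer steps than the loop above, to keep the sketch quick
    optimizer.zero_grad()
    loss = loss_function(net(x), y)
    loss.backward()
    optimizer.step()
    history.append(loss.item())   # .item() stores a plain float, not a tensor
    if n % 500 == 0:
        print('Reached Epoch', n, 'Loss is', history[-1])
```

The exact loss values depend on the random initial weights, so none are quoted here.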


Seeing as we never used the network from after the 10000th iteration to calculate results, we do this last bit one last time:

In [18]:
Output_Data = Torch_Network(Input_Data)
Loss = Loss_Function(Output_Data, Target_Data)
print(Loss)
print(Output_Data)

tensor(0.0627, grad_fn=<MeanBackward0>)
tensor([[54.1621],
        [55.9690],
        [58.1034],
        [60.2701],
        [62.2707],
        [64.1645],
        [66.1712],
        [68.3785],
        [70.4899],
        [72.0553]], grad_fn=<AddmmBackward>)
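
As a sanity check, the printed loss can be recomputed from the printed predictions. The sketch below copies the ten predictions from the output above and compares them with y = 2x + 54; the names printed, target, mse and worst are made up.

```python
import torch
import torch.nn as nn

# Predictions copied from the printed output above.
printed = torch.tensor([54.1621, 55.9690, 58.1034, 60.2701, 62.2707,
                        64.1645, 66.1712, 68.3785, 70.4899, 72.0553]).view(-1, 1)
target = 2 * torch.arange(10, dtype=torch.float32).view(-1, 1) + 54

mse = nn.MSELoss()(printed, target).item()
worst = (printed - target).abs().max().item()
print(round(mse, 4), round(worst, 4))
```

The mean squared error comes out at about 0.0627, matching the printed loss, and the largest single error is about 0.49 (for x = 8).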
