# Tutorial 1: Introduction

In this tutorial we would like to implement and train a simple perceptron.

We will do so using PyTorch, a machine learning framework originally developed by Facebook AI Research (FAIR) that is now a thriving open-source project.

Over the last couple of years it has gained traction in industry and academia as a de facto standard for implementing and deploying neural network models.

PyTorch is particularly useful for implementing neural networks because, as we will see in this and the following tutorials, it provides many of the core operations required to implement, train and test neural networks out of the box, such as backpropagation and stochastic gradient descent.

Moreover, PyTorch also supports training and fine-tuning models on Graphics Processing Units (GPUs). This is particularly useful in many industrial applications, where training the models on standard computing hardware might be infeasible.

# Section 1: Implementing a simple linear NN

First let's import PyTorch. Make sure you install it beforehand if you don't have it already.

In [1]:
# to install pytorch using conda run the following in your terminal:
# conda install -c pytorch pytorch
import torch
import torch.nn as nn

To implement a neural network model in PyTorch we first need to define a template for it. We can do this by extending the nn.Module class:

In [2]:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(2,1)
        
    def forward(self, x):
        x = self.fc1(x)
        return x

In the __init__ function we specify the type of neural network we are using.

Here we specified it as nn.Linear(2,1), which models a NN with only one neuron that takes two inputs.

fc1 stands for "fully connected layer 1". It is conventional in PyTorch to name layers in this way, where the number indicates which layer is meant. Here we have only one layer, but later in the module we will deal with multi-layer networks, where this convention will make more sense.

In the forward function we specify which function to apply to the input.

In this example we specified the output of the neuron to be the plain output returned by nn.Linear, which is just the weighted sum of the inputs plus the bias of the neuron.

Now that we have specified the template for our neuron (called Net) let's create one:

In [3]:
net1 = Net()

Because we didn't specify any weights and biases for our NN, PyTorch simply picks some random parameters.

You can view these as follows:

In [5]:
print(list(net1.parameters()))

[Parameter containing:
tensor([[-0.4143,  0.5419]], requires_grad=True), Parameter containing:
tensor([0.5279], requires_grad=True)]


You can see from the above that there are two sets of parameters. The first holds the two weights of the neuron (since we specified that it takes two values as input), and the second holds the bias.

There are two important things to notice here:

First, the values are wrapped in a __tensor__. This is an important feature of PyTorch: it stores numerical data in tensors rather than in plain Python lists, so you will need to get used to them. You can think of tensors as optimised multi-dimensional arrays. You can read more about them here:

https://pytorch.org/tutorials/beginner/nlp/pytorch_tutorial.html

You will need to get used to wrapping any list that you pass to a PyTorch object in a tensor, using torch.tensor. The good news is that using tensors is very similar to using lists or numpy arrays, so there shouldn't be any new conceptual difficulty in using them.
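
As a small illustration of how closely tensors mirror nested lists, the sketch below wraps a list in a tensor and reads values back out. The variable names are made up for this example; torch.tensor, indexing, .item() and .tolist() are standard PyTorch calls.

```python
import torch

# A nested Python list wrapped in a tensor: 2 rows, 3 columns.
t = torch.tensor([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])

shape = tuple(t.shape)       # the dimensions of the tensor
first_row = t[0].tolist()    # indexing works as it does for nested lists
element = t[1, 2].item()     # .item() turns a one-element tensor into a Python number
doubled = (t * 2).tolist()   # arithmetic is applied elementwise

print(shape, first_row, element, doubled)
```

Indexing and slicing behave as they do for lists or numpy arrays; the main difference is that arithmetic applies to every element at once.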

The second thing to note in the parameter list is the keyword _requires_grad=True_. This is another important flag in PyTorch: it tells PyTorch to track gradients for these parameters, so that they can be optimised when we run backpropagation.

We will not be running any backpropagation in this tutorial, so you can ignore this keyword for now.

Now let's give our neuron an input to verify it is working properly. First let's create an input consisting of two coordinates (again, note that we have to wrap it in a tensor).

In [8]:
x_input = torch.tensor([[0.1,  0.5]])

Now let's feed it to the neuron and verify the output:

In [9]:
net1(x_input)

tensor([[0.7574]], grad_fn=<AddmmBackward>)

You should verify that the output is the weighted sum of the inputs plus the bias.
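
One way to do this check is to redo the arithmetic in plain Python. The sketch below copies the weights, bias and input printed above; since PyTorch initialises parameters randomly, your own run will show different numbers, so substitute the values you see.

```python
# Values copied from the printed parameters and the input shown above.
weights = [-0.4143, 0.5419]
bias = 0.5279
x_input = [0.1, 0.5]

# Weighted sum of the inputs plus the bias.
weighted_sum = sum(w * x for w, x in zip(weights, x_input)) + bias

print(round(weighted_sum, 4))  # 0.7574, matching the output of net1(x_input)
```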

# Section 2: Implementing a perceptron in PyTorch

To make the above neural network a perceptron, we need to implement a threshold function. This can be specified by filling in the following two gaps:

In [None]:
class Perceptron(nn.Module):
    def __init__(self):
        super(Perceptron, self).__init__()
        self.fc1 = # Gap 1
        
    def forward(self, x):
        x = self.fc1(x)
        x = # Gap 2
        return x

Once the gaps are filled in, forward applies the Heaviside function to x before returning the output. That is, our NN now returns 1 if its output is more than 0, and zero otherwise. In other words it is now a perceptron, which is why we also changed its class name.

It would also be nice to hardcode the initial weights and bias, rather than relying on randomised values set by PyTorch. This can be done as follows:

In [8]:
my_perceptron = Perceptron()
my_perceptron.fc1.weight.data = torch.tensor([[ 0.4, 0.2]])
my_perceptron.fc1.bias.data = torch.tensor([-0.3])

Here I hardcode the weights 0.4 and 0.2, and the bias -0.3

Now let's check the output of this perceptron, and verify it is working as expected (testing both for when it should return 1 and when it should return 0):
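
To know what to expect from the PyTorch perceptron, it can help to compute the same thing by hand first. The sketch below is plain Python rather than PyTorch; heaviside and perceptron_by_hand are illustrative names, and the threshold follows the description above (1 if the weighted sum is more than 0, zero otherwise).

```python
def heaviside(z):
    """Return 1.0 if z is strictly positive, 0.0 otherwise."""
    return 1.0 if z > 0 else 0.0


def perceptron_by_hand(x1, x2, w1=0.4, w2=0.2, bias=-0.3):
    """Threshold the weighted sum, using the weights and bias hardcoded above."""
    return heaviside(w1 * x1 + w2 * x2 + bias)


print(perceptron_by_hand(1.0, 1.0))  # 0.4 + 0.2 - 0.3 = 0.3 > 0, so 1.0
print(perceptron_by_hand(0.1, 0.1))  # 0.04 + 0.02 - 0.3 = -0.24, so 0.0
```

Feeding the same two inputs to my_perceptron, each wrapped in a tensor, should give matching results.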

# Section 3: Solving a classification problem using a Perceptron

You are now equipped with the PyTorch knowledge needed to implement a perceptron that solves the classification problem presented in the following graph. The model takes as input two attributes of a student: what proportion of exams they failed and what proportion of lectures they visited. Both are represented by a number in the interval [0,1]. Students who are above the line are classified as having passed the exam (so they get a value of 1); all others get a value of 0 (meaning they failed the exam).

Set the weights of a threshold perceptron that solves this problem.

<img src="ex1.png">

In [13]:
class ExNet(nn.Module):
    def __init__(self):
        super(ExNet, self).__init__()
        self.fc1 = nn.Linear(2,1)
        
        # nn.Linear(2,1) expects a weight of shape (1, 2) and a bias of shape (1,)
        self.fc1.weight = torch.nn.Parameter(torch.tensor([[0.66, 1.0]]))
        self.fc1.bias = torch.nn.Parameter(torch.tensor([-0.5]))
        
        self.heaviside = torch.heaviside
        
    def forward(self, x):
        x = self.fc1(x)
        # torch.heaviside needs a tensor for the value returned when x is exactly 0
        output = self.heaviside(x, torch.tensor([0.5]))
        return output

# Section 4: Training a perceptron

In the above problem, we first computed the weights of the model by hand and then hardcoded the model with these values. In real life this is impractical, so we need to implement a method that trains the model using some training data. In particular, we will implement the method presented in the lecture and then evaluate its performance.

We will be using the following small dataset as training data:

In [14]:
data = [(0.7,0.3,1),(0.4,0.5,1), (0.6,0.9, 1), (0.2,0.2, 0), (0.1, 0.1, 0)]

We will then use the following template for the perceptron we will train (it is the same one as above, just repeated here for convenience):

In [15]:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(2,1)
        
    def forward(self, x):
        x = self.fc1(x)
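        # the second argument is the value returned when x is exactly 0;
        # torch.Tensor(1) would create an uninitialised tensor here
        x = torch.heaviside(x, torch.tensor([0.0]))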
        return x

Now let's define some values. First we define our training data (which contains only the first four samples of the dataset above), which we call training_data. Then we specify the learning_rate, and finally we define some initial weights to kick-start our training with:

In [16]:
training_data = torch.tensor(
    [(0.7, 0.3, 1),
     (0.4, 0.5, 1),
     (0.6, 0.9, 1),
     (0.2, 0.2, 0)],
    dtype=torch.float32
)
learning_rate = 0.1
initial_weights = torch.tensor((-0.5, 0.3, -0.2))

For the training method, it will be useful to have the following helper function. It takes as input a NN and some data, and returns True if the NN fails to classify at least one sample correctly. That's why it's called keep_training: whenever it returns True, our NN requires more training.

In [17]:
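def keep_training(model, data):
    # the parameter is called model so that it does not shadow the imported nn module
    for sample in data:
        # sample[0:2] holds the two inputs, sample[-1] holds the label
        if not torch.eq(model(sample[0:2]), sample[-1]):
            return True
    return False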

We now write a function that implements the learning algorithm on page 17 of the notes:

In [18]:
def train_perceptron(learning_rate, initial_weights, data):
    perceptron = # Gap 1
    perceptron.fc1.bias.data = # Gap 2
    perceptron.fc1.weight.data = # Gap 3
    
    
    while keep_training(perceptron, data):
        for sample in data:
            temp_output = # Gap 4
            label = sample[-1]
            delta_w = # Gap 5
            print(delta_w)
            perceptron.fc1.bias.data = # Gap 6
            perceptron.fc1.weight.data = # Gap 7
            
    return perceptron
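
Before filling in the gaps, it may help to see the shape of such a training loop in plain Python. The sketch below is not the algorithm from the notes: it assumes the standard perceptron update, in which each weight changes by learning_rate * (label - output) * input and the bias changes by learning_rate * (label - output). The names heaviside and train_by_hand are hypothetical, and treating the first entry of initial_weights as the bias is also an assumption, since the text does not say which entry is which.

```python
def heaviside(z):
    """Return 1.0 if z is strictly positive, 0.0 otherwise."""
    return 1.0 if z > 0 else 0.0


def train_by_hand(samples, weights, bias, learning_rate, max_epochs=100):
    """Repeat passes over the data until one full pass makes no mistakes."""
    weights = list(weights)  # copy, so the caller's list is left untouched
    for _ in range(max_epochs):  # cap the passes in case the data is not separable
        mistakes = 0
        for x1, x2, label in samples:
            output = heaviside(weights[0] * x1 + weights[1] * x2 + bias)
            error = label - output  # 0 if correct, +1 or -1 if misclassified
            if error != 0:
                mistakes += 1
                weights[0] += learning_rate * error * x1
                weights[1] += learning_rate * error * x2
                bias += learning_rate * error
        if mistakes == 0:
            break
    return weights, bias


samples = [(0.7, 0.3, 1), (0.4, 0.5, 1), (0.6, 0.9, 1), (0.2, 0.2, 0)]
# Assumed split of initial_weights: bias = -0.5, weights = (0.3, -0.2).
trained_weights, trained_bias = train_by_hand(samples, [0.3, -0.2], -0.5, 0.1)
print(trained_weights, trained_bias)
```

The max_epochs cap is a safety net added for this sketch, so that the loop still terminates on data that no straight line can separate.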
            
            

Let us now run our algorithm, and see how far it goes in solving our classification problem!

In [None]:
train_perceptron(0.1, initial_weights, training_data)

**Question:** *Compute the generalisation error of the perceptron trained above w.r.t. the ideal classification given in the above figure, i.e. draw the classification boundary for the trained perceptron and compare it with that of the figure.*
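
To draw the boundary, note that a perceptron with weights w1, w2 and bias b switches its output where w1*x1 + w2*x2 + b = 0, that is, along the line x2 = -(w1*x1 + b) / w2 (assuming w2 is not zero). The sketch below computes two points on that line. The function name boundary_points is hypothetical, and the weights are the hardcoded ones from Section 2, used purely as an illustration; substitute the weights and bias of your trained perceptron.

```python
def boundary_points(w1, w2, bias, x1_values=(0.0, 1.0)):
    """Return (x1, x2) points on the line w1*x1 + w2*x2 + bias = 0."""
    if w2 == 0:
        raise ValueError("w2 is zero: the boundary is the vertical line x1 = -bias / w1")
    return [(x1, -(w1 * x1 + bias) / w2) for x1 in x1_values]


# Illustrative values only: the weights and bias hardcoded in Section 2.
points = boundary_points(0.4, 0.2, -0.3)
print(points)
```

Joining the two points gives the classification boundary, which can then be compared with the line in the figure.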