# The first neural network

We want to build a neural network that computes the XOR of two binary inputs. The network is defined as follows:

- It takes as input a vector x of dimension 2.
- A hidden layer non-linearly transforms this vector into a 4-dimensional vector.
- The 4-dimensional vector is given as input to a logistic regression, which produces the output.

$f(x) = \sigma(\sigma(xW_1 + b_1)W_2 + b_2)$
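To make the shapes in the formula concrete, here is a minimal sketch of a single forward pass. It assumes random, untrained weights and uses the built-in `torch.sigmoid` for $\sigma$; the variable names simply mirror the symbols in the formula, and the input `[1, 0]` is an arbitrary example.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed, only to make the sketch repeatable

x = torch.tensor([1.0, 0.0])   # input vector, shape (2,)
W1 = torch.rand(2, 4)          # hidden-layer weights, shape (2, 4)
b1 = torch.rand(4)             # hidden-layer bias, shape (4,)
W2 = torch.rand(4, 1)          # output weights, shape (4, 1)
b2 = torch.rand(1)             # output bias, shape (1,)

h = torch.sigmoid(x @ W1 + b1)     # hidden representation, shape (4,)
out = torch.sigmoid(h @ W2 + b2)   # network output, shape (1,)

print(h.shape, out.shape)
```

Because the last operation is a sigmoid, the output always lies strictly between 0 and 1 and can be read as the probability that the XOR of the inputs is 1.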

Let's start by initializing the learnable weights:

In [2]:
import torch

In [3]:
W1 = torch.rand(2, 4, requires_grad=True)
b1 = torch.rand(4, requires_grad=True)
W1, b1

(tensor([[ 0.6701,  0.6276,  0.0528,  0.1514],
         [ 0.0269,  0.5575,  0.3007,  0.7299]]),
 tensor([ 0.2402,  0.1666,  0.5950,  0.1036]))

In [4]:
W2 = torch.rand(4, 1, requires_grad=True)
b2 = torch.rand(1, requires_grad=True)
W2, b2

(tensor([[ 0.7847],
         [ 0.9579],
         [ 0.2046],
         [ 0.6609]]), tensor(1.00000e-02 *
        [ 5.1208]))

Next, define the sigmoid function:

In [5]:
def sigmoid(x):
    return 1 / (1 + torch.exp(-x))

sigmoid(torch.Tensor([1, 2, 3, -3]))

tensor([ 0.7311,  0.8808,  0.9526,  0.0474])
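As a sanity check, the hand-written function can be compared against PyTorch's built-in `torch.sigmoid`. The sketch below repeats the `sigmoid` definition so that it runs on its own; the test values are the same ones used above.

```python
import torch

def sigmoid(x):
    # Same definition as in the notebook cell: 1 / (1 + e^(-x)).
    return 1 / (1 + torch.exp(-x))

values = torch.tensor([1.0, 2.0, 3.0, -3.0])
manual = sigmoid(values)
builtin = torch.sigmoid(values)

# The two implementations should agree up to floating-point error.
print(torch.allclose(manual, builtin))
```

In practice the built-in is usually the more convenient choice; the manual version is kept here because it makes the formula explicit.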

Prepare the dataset:

In [6]:
import random

def dataset(n):
    """Yield n random (input, label) pairs, where label = x1 XOR x2."""
    for _ in range(n):
        x1 = random.choice([0, 1])
        x2 = random.choice([0, 1])
        y = x1 ^ x2  # XOR of the two input bits
        yield torch.FloatTensor([x1, x2]), torch.FloatTensor([y])
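To see what the generator yields, the sketch below draws a few samples and checks each label against the XOR of its inputs. It repeats the `dataset` definition so that it runs on its own; the seed and the sample count of 8 are arbitrary choices.

```python
import random
import torch

def dataset(n):
    # Same generator as above: random binary pairs labeled with their XOR.
    for _ in range(n):
        x1 = random.choice([0, 1])
        x2 = random.choice([0, 1])
        y = int(bool(x1) ^ bool(x2))
        yield torch.FloatTensor([x1, x2]), torch.FloatTensor([y])

random.seed(0)  # assumption: fixed seed, only for repeatability
samples = list(dataset(8))
for x, y in samples:
    a, b = int(x[0].item()), int(x[1].item())
    assert int(y.item()) == (a ^ b)  # label is the XOR of the two inputs
    print(a, b, '->', int(y.item()))
```

Each sample is a pair of tensors: an input of shape `(2,)` and a label of shape `(1,)`, which matches the shape of the network output.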

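Train the network one example at a time, using gradient descent on the binary cross-entropy loss with a learning rate of 0.05. The average loss is printed every 1000 examples: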

In [7]:
lr = 0.05  # learning rate
running_loss = 0.0
for i, (x, y) in enumerate(dataset(10000)):
    # Forward pass: hidden layer, then logistic-regression output.
    h1 = sigmoid(x @ W1 + b1)
    out = sigmoid(h1 @ W2 + b2)

    # Binary cross-entropy loss for a single example.
    loss = -(y * torch.log(out) + (1 - y) * torch.log(1 - out))

    # Accumulate first so that each report averages exactly 1000 losses.
    running_loss += loss.item()
    if i % 1000 == 999:
        print('{:.3f}'.format(running_loss / 1000))
        running_loss = 0.0

    # Backward pass: compute the gradients of the loss.
    loss.backward()

    # Gradient-descent step, performed outside autograd tracking.
    with torch.no_grad():
        for p in (W1, b1, W2, b2):
            p -= lr * p.grad
            p.grad.zero_()  # reset so gradients do not accumulate


0.708
0.696
0.689
0.667
0.611
0.532
0.440
0.312
0.199
0.127
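The loss used in the loop is the binary cross-entropy of a single example, $-(y \log \hat{y} + (1 - y) \log(1 - \hat{y}))$. As a sketch, the snippet below checks this hand-written expression against `torch.nn.functional.binary_cross_entropy`, and also checks the standard identity that, for a sigmoid output, the gradient of this loss with respect to the pre-activation is $\hat{y} - y$. The pre-activation value `0.3` and the name `z` are arbitrary illustrations, not part of the notebook.

```python
import torch
import torch.nn.functional as F

z = torch.tensor([0.3], requires_grad=True)  # hypothetical pre-activation
y = torch.tensor([1.0])                      # target label

out = torch.sigmoid(z)
manual_loss = -(y * torch.log(out) + (1 - y) * torch.log(1 - out))
library_loss = F.binary_cross_entropy(out, y)

# Backpropagate through the hand-written loss only.
manual_loss.backward()

print(torch.allclose(manual_loss, library_loss))    # same loss value
print(torch.allclose(z.grad, (out - y).detach()))   # gradient equals out - y
```

This identity is why the update is large when the prediction is far from the label and shrinks as the loss decreases.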


Check the trained network on the input (1, 1), for which the expected XOR is 0:

In [8]:
x = torch.FloatTensor([1, 1])

In [9]:
h1 = sigmoid(x @ W1 + b1)
out = sigmoid(h1 @ W2 + b2)
    

In [10]:
x, h1, out

(tensor([ 1.,  1.]),
 tensor([ 0.7267,  0.9999,  0.7104,  0.7552]),
 tensor([ 0.1113]))

In [11]:
W1, b1

(tensor([[ 1.8405,  5.8679,  0.9192,  2.3144],
         [ 0.8119,  5.8647,  1.6645,  2.4469]]),
 tensor([-1.6743, -2.4024, -1.6863, -3.6347]))

In [12]:
W2, b2

(tensor([[-3.1569],
         [ 8.1206],
         [-3.2604],
         [-4.8421]]), tensor([-1.9302]))
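To confirm that the learned parameters implement XOR on every input, and not only on (1, 1), the sketch below hard-codes the values printed above and evaluates all four input pairs. The printed values are rounded to four decimals, so results may differ slightly from the live tensors. The 0.5 decision threshold is an assumption; the notebook itself does not specify one.

```python
import torch

# Parameters copied from the printed output of the trained network.
W1 = torch.tensor([[1.8405, 5.8679, 0.9192, 2.3144],
                   [0.8119, 5.8647, 1.6645, 2.4469]])
b1 = torch.tensor([-1.6743, -2.4024, -1.6863, -3.6347])
W2 = torch.tensor([[-3.1569], [8.1206], [-3.2604], [-4.8421]])
b2 = torch.tensor([-1.9302])

# All four possible inputs, one per row.
X = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])

h = torch.sigmoid(X @ W1 + b1)                  # hidden layer, shape (4, 4)
probs = torch.sigmoid(h @ W2 + b2).squeeze(1)   # outputs, shape (4,)

preds = (probs > 0.5).int().tolist()  # assumed threshold of 0.5
print(preds)  # → [0, 1, 1, 0]
```

The batched matrix product processes the four inputs at once; each row of `X` goes through the same computation as the single vector `x` in the cells above.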