## Lower-level Programming with PyTorch

Let's dig just a little deeper. We'll first get the data again.

In [1]:
from torch.autograd import Variable
import torch
import numpy as np
import matplotlib.pyplot as plt

from make_face_dataset import make_dataset


act = ['Fran Drescher', 'America Ferrera', 'Kristin Chenoweth', 'Alec Baldwin', 'Bill Hader', 'Steve Carell']
train_x, train_y = make_dataset(range(100), act)
test_x, test_y = make_dataset(range(100,120),act)

dim_x = 1024
dim_h = 20
dim_out = 6

Now, let's define `Variable`s containing the training data. They don't require gradients, since we won't be optimizing the inputs or the targets.

In [2]:
dtype_float = torch.FloatTensor

x = Variable(torch.from_numpy(train_x), requires_grad=False).type(dtype_float)
y = Variable(torch.from_numpy(train_y.astype(float)), requires_grad=False).type(dtype_float)
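On PyTorch 0.4 and later, `Variable` has been merged into `Tensor`, so the wrapper is no longer needed. Here is a minimal sketch of the equivalent conversion. It uses small random arrays as hypothetical stand-ins for `train_x` and `train_y`; the names `fake_x`, `fake_y`, `x_t` and `y_t` are made up for illustration.

```python
import numpy as np
import torch

# Hypothetical stand-ins for train_x / train_y: 8 samples, 1024 features, 6 classes.
rng = np.random.RandomState(0)
fake_x = rng.rand(8, 1024)                        # float64 array
fake_y = np.eye(6)[rng.randint(0, 6, size=8)]     # one-hot label rows

# from_numpy wraps the array without copying; .float() then converts to
# float32, the dtype of torch.FloatTensor (this conversion makes a copy).
x_t = torch.from_numpy(fake_x).float()
y_t = torch.from_numpy(fake_y).float()

# Tensors do not track gradients unless asked to, matching requires_grad=False.
print(x_t.shape, x_t.dtype, x_t.requires_grad)
```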

In [3]:
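# Network parameters, initialized from a standard normal distribution.
# requires_grad=True tells autograd to compute gradients with respect to them.
b0 = Variable(torch.randn((1, dim_h)), requires_grad=True)
W0 = Variable(torch.randn((dim_x, dim_h)), requires_grad=True)

b1 = Variable(torch.randn((1, dim_out)), requires_grad=True)
W1 = Variable(torch.randn((dim_h, dim_out)), requires_grad=True)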

Note that everything is accessible right away:

In [4]:
b0

Variable containing:

Columns 0 to 9 
 0.3389  1.9895  0.0045 -0.0334 -1.5259  2.0825  0.7016  2.3932 -0.1088  0.6518

Columns 10 to 19 
 0.6502  1.4249  1.0805 -2.4361 -1.2705  0.1939 -1.0271  0.7681 -0.3775  0.0505
[torch.FloatTensor of size 1x20]

Let's now define the model. Since we'll want to reuse it for different inputs, we'll put it in a function (or really in a class; we'll show how to do that later). First, we'll remind ourselves of the dimensions of the data:

In [5]:
x.data.shape, b0.data.shape, W0.data.shape

(torch.Size([600, 1024]), torch.Size([1, 20]), torch.Size([1024, 20]))

In [6]:
b1.data.shape, W1.data.shape, y.data.shape

(torch.Size([1, 6]), torch.Size([20, 6]), torch.Size([600, 6]))

In [7]:
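def model(x, b0, W0, b1, W1):
    # Hidden layer: (N, dim_x) x (dim_x, dim_h) -> (N, dim_h); the bias row is tiled to N rows
    h = torch.matmul(x, W0) + b0.repeat(x.data.shape[0], 1)
    # Output layer: (N, dim_h) x (dim_h, dim_out) -> (N, dim_out)
    # Note: no nonlinearity is applied to h, so the two layers compose into a single affine map
    out = torch.matmul(h, W1) + b1.repeat(h.data.shape[0], 1)
    return out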
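The explicit `repeat` tiles the bias row so that its shape matches the batch. On PyTorch versions that support broadcasting, adding the `(1, dim_h)` bias directly should give the same result. A small sketch with made-up sizes and hypothetical names (`x_demo`, `W_demo`, `b_demo`) that checks this:

```python
import torch

torch.manual_seed(0)
x_demo = torch.randn(5, 4)      # 5 samples, 4 features (made-up sizes)
W_demo = torch.randn(4, 3)
b_demo = torch.randn(1, 3)

# Explicit tiling of the bias row, as in model().
h_repeat = torch.matmul(x_demo, W_demo) + b_demo.repeat(x_demo.shape[0], 1)

# Broadcasting: the size-1 leading dimension of b_demo is expanded automatically.
h_broadcast = torch.matmul(x_demo, W_demo) + b_demo

same = torch.allclose(h_repeat, h_broadcast)
print(same)
```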

In [8]:
y_out = model(x, b0, W0, b1, W1)

In [9]:
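# We'll be too lazy to define this one by hand; dim=1 normalizes over the classes
logSoftMax = torch.nn.LogSoftmax(dim=1)
# Cross-entropy: for each sample, -sum_k y_k * log p_k, then averaged over the samples
loss = -torch.mean(torch.sum(y * logSoftMax(y_out), 1))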
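For reference, here is one way the log-softmax could be written by hand. This is a sketch rather than the library's implementation: the helper `log_softmax_rows` and the toy `scores` and `targets` are hypothetical, and the code assumes a PyTorch version (0.4 or later) where tensors can be used without `Variable`.

```python
import torch
import torch.nn.functional as F

def log_softmax_rows(z):
    # Subtract the row-wise max before exponentiating to avoid overflow;
    # the shift cancels out, so the result is unchanged.
    z_shift = z - z.max(dim=1, keepdim=True)[0]
    return z_shift - torch.log(torch.exp(z_shift).sum(dim=1, keepdim=True))

torch.manual_seed(0)
scores = torch.randn(4, 6) * 10                      # made-up class scores
targets = torch.eye(6)[torch.tensor([0, 2, 5, 1])]   # one-hot rows

manual = log_softmax_rows(scores)
builtin = F.log_softmax(scores, dim=1)

# Same cross-entropy expression as in the cell above, on the toy data.
loss_demo = -torch.mean(torch.sum(targets * manual, 1))
print(torch.allclose(manual, builtin, atol=1e-4), loss_demo.item())
```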

In [10]:
loss

Variable containing:
 47.8923
[torch.FloatTensor of size 1]

In [11]:
learning_rate = 1e-1

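for t in range(1000):
    # Forward pass and cross-entropy loss
    y_out = model(x, b0, W0, b1, W1)
    loss = -torch.mean(torch.sum(y * logSoftMax(y_out), 1))
    # Backward pass: accumulates d(loss)/d(parameter) into each parameter's .grad
    loss.backward()

    # Gradient descent step, applied to .data so that autograd does not track it
    b0.data -= learning_rate * b0.grad.data
    W0.data -= learning_rate * W0.grad.data
    b1.data -= learning_rate * b1.grad.data
    W1.data -= learning_rate * W1.grad.data

    # Reset the gradients; otherwise the next backward() call would add to them
    b0.grad.data.zero_()
    W0.grad.data.zero_()
    b1.grad.data.zero_()
    W1.grad.data.zero_()

    #print(loss.data.numpy())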
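On PyTorch 0.4 and later, the same loop is usually written without `Variable` and `.data`, wrapping the update in `torch.no_grad()` instead. The sketch below runs on small synthetic data with made-up sizes, and all names in it (`x_s`, `y_s`, `params`, `model_s`, `loss_fn`) are hypothetical. It illustrates the pattern and is not a drop-in replacement for the cell above.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
n, d_in, d_h, d_out = 60, 10, 5, 3            # made-up sizes
x_s = torch.randn(n, d_in)
labels = torch.randint(0, d_out, (n,))
y_s = torch.eye(d_out)[labels]                # one-hot targets

# Same parameter layout as b0, W0, b1, W1.
params = [torch.randn(1, d_h, requires_grad=True),
          torch.randn(d_in, d_h, requires_grad=True),
          torch.randn(1, d_out, requires_grad=True),
          torch.randn(d_h, d_out, requires_grad=True)]

def model_s(x, b0, W0, b1, W1):
    # Two affine layers; the biases are added by broadcasting.
    return torch.matmul(torch.matmul(x, W0) + b0, W1) + b1

def loss_fn(out, y):
    return -torch.mean(torch.sum(y * F.log_softmax(out, dim=1), 1))

lr = 1e-2
losses = []
for t in range(200):
    loss_s = loss_fn(model_s(x_s, *params), y_s)
    losses.append(loss_s.item())
    loss_s.backward()
    with torch.no_grad():         # the update itself must not be tracked by autograd
        for p in params:
            p -= lr * p.grad
            p.grad.zero_()        # clear the accumulated gradient for the next step

print(losses[0], losses[-1])
```

Keeping the parameters in a list makes the update and the gradient reset a single loop, which is roughly what `torch.optim` optimizers do internally.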

In [12]:
x_test_all_var = Variable(torch.from_numpy(test_x), requires_grad=False).type(dtype_float)

In [13]:
y_test_out = model(x_test_all_var, b0, W0, b1, W1).data.numpy()

In [14]:
np.argmax(y_test_out, 1)

array([0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 5, 0, 0, 0, 0, 0, 0, 4, 0, 0, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 5, 1, 1, 1, 1, 3, 0, 1, 1, 1, 2, 2, 2, 1, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 0, 2, 3, 5, 2, 3, 5, 3, 3, 3, 0,
       3, 3, 3, 2, 5, 5, 3, 3, 3, 3, 3, 4, 4, 4, 5, 4, 4, 4, 4, 4, 1, 4, 4,
       4, 4, 4, 4, 4, 4, 4, 4, 5, 2, 1, 5, 5, 4, 5, 5, 5, 5, 1, 5, 5, 2, 5,
       5, 2, 2, 5, 5])
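To turn predictions like these into a single number, compare them with the true class indices. The sketch below uses tiny made-up arrays (`scores_demo`, `labels_demo`). Applying the same two `argmax` calls to `y_test_out` and `test_y` should work, assuming `test_y` is one-hot encoded like `train_y`.

```python
import numpy as np

# Hypothetical scores for 4 test images and 3 classes, plus one-hot labels.
scores_demo = np.array([[ 2.0, 0.1, -1.0],
                        [ 0.3, 1.5,  0.2],
                        [ 0.9, 0.8,  0.7],
                        [-0.5, 0.0,  2.2]])
labels_demo = np.array([[1, 0, 0],
                        [0, 1, 0],
                        [0, 0, 1],
                        [0, 0, 1]])

pred = np.argmax(scores_demo, 1)     # predicted class per row
true = np.argmax(labels_demo, 1)     # class index encoded by each one-hot row
accuracy = np.mean(pred == true)     # fraction of rows classified correctly
print(accuracy)   # 0.75
```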