## Chapter 3: Deep Learning with PyTorch

No RL in this chapter! Just an intro to using PyTorch for deep learning. I'm excited, as I've seen a lot of positive comments about PyTorch, and I'm interested to see how it compares to TensorFlow and Keras.

#### Tensors

In [1]:
import torch
import numpy as np

In [2]:
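# Create an uninitialised tensor: the values are whatever was already in
# memory (not random draws), which is why the output below looks like garbage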
a = torch.FloatTensor(3, 2)
a

tensor([[-7.1296e+10,  4.5652e-41],
        [-7.1296e+10,  4.5652e-41],
        [ 1.6533e+19,  1.8336e+31]])

In [3]:
# Tensors have methods, e.g. set all values to 0 with .zero_()
# Trailing underscore indicates an in-place method
a.zero_()

tensor([[0., 0.],
        [0., 0.],
        [0., 0.]])
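To make the trailing-underscore convention concrete, here is a minimal sketch contrasting an out-of-place method with its in-place twin. The variable names `b` and `c` are just for illustration.

```python
import torch

a = torch.zeros(3, 2)

# Out-of-place: returns a new tensor and leaves `a` untouched
b = a.add(1.0)

# In-place (trailing underscore): modifies `a` itself and returns it
c = a.add_(2.0)

print(a)                              # every element is now 2.0
print(b)                              # every element is 1.0
print(c.data_ptr() == a.data_ptr())   # the in-place result uses the same storage as `a`
```

In-place methods save memory, but they overwrite values that autograd may need later, so the out-of-place versions are usually the safer default.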

In [4]:
# Alternatively we can convert a NumPy array, but we probably want a smaller
# dtype than NumPy's 64-bit float default (overkill for DL generally)
n = np.zeros(shape=(3, 2))
torch.tensor(n, dtype=torch.float32)

tensor([[0., 0.],
        [0., 0.],
        [0., 0.]])
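One thing worth knowing when converting from NumPy is whether the tensor copies the data or shares it. The sketch below compares `torch.tensor()` (copies) with `torch.from_numpy()` (shares memory and keeps the array's dtype). The names `copied` and `shared` are only illustrative.

```python
import numpy as np
import torch

n = np.zeros(shape=(3, 2))                      # float64 by default

copied = torch.tensor(n, dtype=torch.float32)   # copies the data
shared = torch.from_numpy(n)                    # shares memory, keeps float64

n[0, 0] = 5.0                                   # mutate the NumPy array afterwards

print(copied[0, 0].item())                      # unaffected by the change
print(shared[0, 0].item())                      # sees the change
print(copied.dtype, shared.dtype)
```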

#### Gradient calculations

In [5]:
# If we want auto-calculated gradients, we have to explicitly say so
# That property will then be appropriately inherited
v1 = torch.tensor([1.0, 1.0], requires_grad=True)
v2 = torch.tensor([2.0, 2.0])

v_sum = v1 + v2
v_res = (v_sum*2).sum()
v_res

tensor(12., grad_fn=<SumBackward0>)

In [6]:
# Note how the is_leaf (= was this explicitly defined by the user, rather than
# created as a result of function transformation) and requires_grad attributes change
print(v1.is_leaf, v2.is_leaf, v_sum.is_leaf, v_res.is_leaf)
print(v1.requires_grad, v2.requires_grad, v_sum.requires_grad, v_res.requires_grad)

True True False False
True False True True
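The inheritance of `requires_grad` can also be switched off when we don't want an operation tracked. A small sketch, assuming we just want to inspect the flags, using `.detach()` and `torch.no_grad()`:

```python
import torch

v1 = torch.tensor([1.0, 1.0], requires_grad=True)
v_sum = v1 + 2.0                     # result of an operation: tracked, not a leaf

detached = v_sum.detach()            # same values, cut off from the graph
with torch.no_grad():
    untracked = v1 * 2               # operations in this block are not recorded

print(v_sum.requires_grad, v_sum.is_leaf)          # True False
print(detached.requires_grad, detached.is_leaf)    # False True
print(untracked.requires_grad, untracked.is_leaf)  # False True
```

Tensors that don't require gradients count as leaves by convention, which is why the last two report `is_leaf` as `True`.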


In [7]:
# Tell PyTorch to calculate gradients - the .backward() method walks the graph backwards
# and computes exact derivatives via automatic differentiation (not numerical approximation)
v_res.backward()
v1.grad

tensor([2., 2.])

In [8]:
# We don't get gradients for anything which didn't require them (i.e. for which we
# didn't state that they were required), so this is None and nothing is displayed
v2.grad
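A detail that is easy to miss: gradients accumulate in `.grad` across calls to `.backward()` instead of being overwritten. The sketch below rebuilds the same expression twice to show this. The helper `loss()` is a made-up name for illustration.

```python
import torch

v1 = torch.tensor([1.0, 1.0], requires_grad=True)
v2 = torch.tensor([2.0, 2.0])

def loss():
    # Same expression as above: sum(2 * (v1 + v2)), so d(loss)/d(v1) = 2 per element
    return ((v1 + v2) * 2).sum()

loss().backward()
first = v1.grad.clone()      # gradient after one backward pass

loss().backward()            # a second pass adds to the stored gradient
second = v1.grad.clone()

v1.grad.zero_()              # reset in place before the next pass
print(first, second, v1.grad)
```

This is why training loops typically zero the gradients once per step.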

#### NN building blocks

There are lots of pre-implemented classes in the `torch.nn` package.

In [9]:
import torch.nn as nn

# Randomly initialised feed-forward layer with 2 inputs and 5 outputs
L = nn.Linear(2, 5)
v = torch.FloatTensor([1, 2])
L(v)

tensor([-0.1829, -0.3304, -0.7165, -1.3641,  1.6994], grad_fn=<AddBackward0>)
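To see what the layer is actually doing, we can reproduce its output by hand. This sketch checks that `nn.Linear` computes `x @ W.T + b`. The seed is arbitrary and only there for repeatability.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)                 # arbitrary seed, just for repeatability
layer = nn.Linear(2, 5)
v = torch.tensor([1.0, 2.0])

print(layer.weight.shape)            # (out_features, in_features)
print(layer.bias.shape)              # (out_features,)

# Reproduce the forward pass manually
manual = v @ layer.weight.T + layer.bias
print(torch.allclose(layer(v), manual))
```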

In [10]:
# The Sequential() class is useful for building a multilayered network
s = nn.Sequential(
    nn.Linear(2, 5),
    nn.ReLU(),
    nn.Linear(5, 20),
    nn.ReLU(),
    nn.Linear(20, 10),
    nn.Dropout(p=0.3),
    nn.Softmax(dim=1)
)
s

Sequential(
  (0): Linear(in_features=2, out_features=5, bias=True)
  (1): ReLU()
  (2): Linear(in_features=5, out_features=20, bias=True)
  (3): ReLU()
  (4): Linear(in_features=20, out_features=10, bias=True)
  (5): Dropout(p=0.3, inplace=False)
  (6): Softmax(dim=1)
)

In [11]:
# Pushing a tensor through it, just to prove it works
# NOTE: we are defining a 2d tensor here using nested lists i.e. [[row0], [row1], ...]
s(torch.FloatTensor([[1, 2]]))

tensor([[0.0947, 0.0727, 0.1015, 0.1015, 0.1553, 0.1015, 0.1017, 0.1015, 0.0458,
         0.1237]], grad_fn=<SoftmaxBackward0>)
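Because the network contains a `Dropout` layer, its output depends on whether the module is in training or evaluation mode. Below is a minimal sketch with a smaller stand-in network (the layer sizes are arbitrary), showing that `.eval()` makes the output repeatable and that each softmax row sums to 1.

```python
import torch
import torch.nn as nn

s = nn.Sequential(
    nn.Linear(2, 5),
    nn.ReLU(),
    nn.Linear(5, 10),
    nn.Dropout(p=0.3),
    nn.Softmax(dim=1),
)
x = torch.tensor([[1.0, 2.0]])

s.eval()                             # dropout becomes a no-op
with torch.no_grad():                # no graph needed for inference
    out_a = s(x)
    out_b = s(x)

print(torch.equal(out_a, out_b))     # identical outputs in eval mode
print(out_a.sum(dim=1))              # each softmax row sums to 1
s.train()                            # switch back before further training
```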

#### Custom modules

These are really easy to create, e.g. to implement new layer types. Usually the only thing we _must_ define is the `.forward()` method.

In [17]:
class OurModule(nn.Module):
    def __init__(self, num_inputs, num_classes, dropout_prob=0.3):
        super(OurModule, self).__init__()

        # This module is just going to wrap a few other layers
        self.pipe = nn.Sequential(
            nn.Linear(num_inputs, 5),
            nn.ReLU(),
            nn.Linear(5, 20),
            nn.ReLU(),
            nn.Linear(20, num_classes),
            nn.Dropout(p=dropout_prob),
            nn.Softmax(dim=1)
        )

    def forward(self, x):
        return self.pipe(x)
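Assigning a submodule as an attribute (like `self.pipe`) is enough for `nn.Module` to register it, so all of its parameters show up automatically. The sketch below repeats the class so it runs on its own, then lists and counts the parameters.

```python
import torch.nn as nn

class OurModule(nn.Module):
    def __init__(self, num_inputs, num_classes, dropout_prob=0.3):
        super().__init__()
        self.pipe = nn.Sequential(
            nn.Linear(num_inputs, 5),
            nn.ReLU(),
            nn.Linear(5, 20),
            nn.ReLU(),
            nn.Linear(20, num_classes),
            nn.Dropout(p=dropout_prob),
            nn.Softmax(dim=1),
        )

    def forward(self, x):
        return self.pipe(x)

net = OurModule(num_inputs=2, num_classes=3)

# Parameter names are prefixed with the attribute name and the layer index
for name, param in net.named_parameters():
    print(name, tuple(param.shape))

total = sum(p.numel() for p in net.parameters())
print(total)   # (2*5 + 5) + (5*20 + 20) + (20*3 + 3) = 198
```

This registration is what lets an optimiser be handed `net.parameters()` without us listing the weights manually.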

In [18]:
# When we "call" a Module object, nn.Module.__call__() runs any registered hooks
# and then calls .forward() for us
# So we should call the module itself rather than .forward() directly
net = OurModule(num_inputs=2, num_classes=3)
v = torch.FloatTensor([[2, 3]])
out = net(v)
print(net)
print(out)

OurModule(
  (pipe): Sequential(
    (0): Linear(in_features=2, out_features=5, bias=True)
    (1): ReLU()
    (2): Linear(in_features=5, out_features=20, bias=True)
    (3): ReLU()
    (4): Linear(in_features=20, out_features=3, bias=True)
    (5): Dropout(p=0.3, inplace=False)
    (6): Softmax(dim=1)
  )
)
tensor([[0.3901, 0.3000, 0.3099]], grad_fn=<SoftmaxBackward0>)
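One reason to call the module rather than `.forward()` is that hooks only run through `__call__()`. Here is a small sketch using a plain `nn.Linear` as a stand-in module. The hook function `record_shape` and the `calls` list are made up for illustration.

```python
import torch
import torch.nn as nn

net = nn.Linear(2, 3)                # any nn.Module would do here
calls = []

def record_shape(module, inputs, output):
    # Forward hooks receive the module, a tuple of inputs, and the output
    calls.append(tuple(output.shape))

handle = net.register_forward_hook(record_shape)
x = torch.tensor([[2.0, 3.0]])

net(x)                               # goes through __call__, so the hook fires
net.forward(x)                       # bypasses __call__, so the hook is skipped

handle.remove()                      # detach the hook when done
print(calls)                         # a single entry: [(1, 3)]
```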


#### TensorBoard

Though originally developed for and released with TensorFlow, the TensorBoard utility is now well integrated with PyTorch and is really useful for monitoring DL training.

In [19]:
import math
# This previously lived in the tensorboardX package
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter()

# Define functions that we are going to visualise
funcs = {"sin": math.sin, "cos": math.cos, "tan": math.tan}

for angle in range(-360, 360):
    angle_rad = angle*math.pi / 180
    for name, fun in funcs.items():
        val = fun(angle_rad)
        writer.add_scalar(name, val, angle)

writer.close()

No output here, but we do now have data in a directory called `runs/` (we could have changed this name with the `log_dir` argument to `SummaryWriter()`). Then we can point TensorBoard at that directory with the following in a terminal:

    tensorboard --logdir runs
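As a variation, here is a sketch that sets `log_dir` explicitly and uses a shared tag prefix so TensorBoard groups the curves together. The temporary directory and the `trig_demo` run name are assumptions for illustration, and this needs the `tensorboard` package installed.

```python
import math
import os
import tempfile

from torch.utils.tensorboard import SummaryWriter

# Hypothetical run directory inside a fresh temp folder
log_dir = os.path.join(tempfile.mkdtemp(), "trig_demo")

# Using the writer as a context manager closes (and flushes) it on exit
with SummaryWriter(log_dir=log_dir) as writer:
    for angle in range(-360, 360):
        angle_rad = math.radians(angle)
        # A shared "trig/" prefix groups both curves in one TensorBoard section
        writer.add_scalar("trig/sin", math.sin(angle_rad), angle)
        writer.add_scalar("trig/cos", math.cos(angle_rad), angle)

event_files = os.listdir(log_dir)
print(event_files)                   # the event file(s) TensorBoard will read
```

Pointing `tensorboard --logdir` at the parent of several run directories like this one should show each run as a separate line on the same charts.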