* Understanding and working with PyTorch computation graphs
* Working with PyTorch tensor objects
* Solving the classic XOR problem and understanding model capacity
* Building complex NN models using PyTorch's **Sequential** class and **nn.Module** class
* Computing gradients using automatic differentiation and **torch.autograd**

PyTorch builds its computation graph dynamically, as the code runs, which makes models easier to debug.

PyTorch performs its computations based on a **directed acyclic graph (DAG)**.
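A minimal sketch of how the graph can be inspected, assuming three scalar leaf tensors created only for this illustration. Each tensor produced by an operation carries a `grad_fn` attribute that points to the graph node that created it, and `next_functions` links that node to the nodes that produced its inputs.

```python
import torch

# Leaf tensors (illustrative values) that take part in gradient tracking
a = torch.tensor(1.0, requires_grad=True)
b = torch.tensor(2.0, requires_grad=True)
c = torch.tensor(3.0, requires_grad=True)

# Each operation adds a node to the graph at the moment the line runs
r1 = a - b
r2 = r1 * 2
z = r2 + c

# Leaf tensors were not produced by an operation, so they have no grad_fn
print(a.grad_fn)
# The last operation was an addition
print(type(z.grad_fn).__name__)
# Walk one step back through the graph from z
for fn, _ in z.grad_fn.next_functions:
    print(type(fn).__name__)
```

Because the graph is rebuilt on every forward pass, ordinary Python control flow and `print` calls can be placed between any two operations.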

In [2]:
import torch

# Computes z = 2 * (a - b) + c
def compute_z(a, b, c):
    r1 = torch.sub(a, b)
    r2 = torch.mul(r1, 2)
    z = torch.add(r2, c)
    return z

print('Scalar Inputs: ', compute_z(torch.tensor(1),
                                    torch.tensor(2),
                                    torch.tensor(3)))
print('Rank 1 Inputs: ', compute_z(torch.tensor([1]),
                                    torch.tensor([2]),
                                    torch.tensor([3])))
print('Rank 2 Inputs: ', compute_z(torch.tensor([[1]]),
                                    torch.tensor([[2]]),
                                    torch.tensor([[3]])))

Scalar Inputs:  tensor(1)
Rank 1 Inputs:  tensor([1])
Rank 2 Inputs:  tensor([[1]])


In [3]:
a = torch.tensor(3.14, requires_grad=True)
print(a)
b = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
print(b)

tensor(3.1400, requires_grad=True)
tensor([1., 2., 3.], requires_grad=True)
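As a small sketch of the default behavior: `requires_grad` is `False` unless it is requested, and it can be switched on afterwards with the in-place method `requires_grad_()`. The variable names below are illustrative only. Only floating-point (and complex) tensors can track gradients, so the integer case is wrapped in a `try` block.

```python
import torch

w = torch.tensor([1.0, 2.0, 3.0])
print(w.requires_grad)   # False by default

# The trailing underscore marks an in-place operation
w.requires_grad_()
print(w.requires_grad)   # True

# Integer tensors cannot require gradients; PyTorch raises a RuntimeError
int_error = None
try:
    torch.tensor([1, 2, 3], requires_grad=True)
except RuntimeError as err:
    int_error = err
    print('RuntimeError:', err)
```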


In [4]:
import torch.nn as nn
torch.manual_seed(0)
w = torch.empty(2, 3)
nn.init.xavier_normal_(w)
print(w)

tensor([[ 0.9746, -0.1856, -1.3780],
        [ 0.3595, -0.6859, -0.8845]])
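With the default gain of 1, `nn.init.xavier_normal_` draws values from a zero-mean normal distribution with standard deviation $\sqrt{2 / (fan_{in} + fan_{out})}$. The sketch below checks this empirically on a larger tensor; the sizes 400 and 600 are arbitrary choices for illustration, picked so that the sample statistics are stable.

```python
import math
import torch
import torch.nn as nn

torch.manual_seed(0)
fan_out, fan_in = 400, 600   # arbitrary illustrative sizes
w_big = torch.empty(fan_out, fan_in)
nn.init.xavier_normal_(w_big)

# Standard deviation predicted by the Xavier (Glorot) formula, gain = 1
expected_std = math.sqrt(2.0 / (fan_in + fan_out))
empirical_std = w_big.std().item()
print('expected std : ', round(expected_std, 4))
print('empirical std: ', round(empirical_std, 4))
```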


#### Computing gradients via automatic differentiation

The book uses the term gradient to refer to both partial derivatives and gradients.

A simple example: $ z = wx + b $, with $ Loss = (y - z)^2 $.
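Writing $ L $ for the loss, the chain rule gives the two partial derivatives that autograd is expected to return:

$$ \frac{\partial L}{\partial w} = 2(z - y)\,\frac{\partial z}{\partial w} = 2x\,(wx + b - y), \qquad \frac{\partial L}{\partial b} = 2(z - y)\,\frac{\partial z}{\partial b} = 2\,(wx + b - y) $$

For $ w = 1.0 $, $ b = 0.5 $, $ x = 1.4 $ and $ y = 2.1 $, the prediction is $ z = 1.9 $, so $ \partial L / \partial w = 2 \cdot 1.4 \cdot (-0.2) = -0.56 $ and $ \partial L / \partial b = 2 \cdot (-0.2) = -0.4 $.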

In [8]:
w = torch.tensor(1.0, requires_grad=True)
b = torch.tensor(0.5, requires_grad=True)
x = torch.tensor([1.4])
y = torch.tensor([2.1])

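# Forward pass: prediction and squared-error loss
z = torch.add(torch.mul(w, x), b)
loss = (y-z).pow(2).sum()
# Backward pass: fills w.grad and b.grad
loss.backward()
print(type(loss))
print('dL/dw : ', w.grad)
print('dL/db : ', b.grad)
# Manual check of dL/dw = 2x((wx + b) - y)
print(2 * x * ((w * x + b) - y))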

<class 'torch.Tensor'>
dL/dw :  tensor(-0.5600)
dL/db :  tensor(-0.4000)
tensor([-0.5600], grad_fn=<MulBackward0>)
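As an alternative sketch, the same gradients can be requested directly with `torch.autograd.grad`, which returns them as a tuple instead of storing them in the `.grad` attributes. The tensors are recreated here so the snippet runs on its own; the names `dw` and `db` are illustrative.

```python
import torch

w = torch.tensor(1.0, requires_grad=True)
b = torch.tensor(0.5, requires_grad=True)
x = torch.tensor([1.4])
y = torch.tensor([2.1])

# Same forward pass as above, written with operators
loss = (y - (w * x + b)).pow(2).sum()

# One gradient per requested input, returned in the same order
dw, db = torch.autograd.grad(loss, [w, b])
print('dL/dw : ', dw)
print('dL/db : ', db)
# .grad is not populated by torch.autograd.grad
print(w.grad)
```

This form can be convenient when the gradients are needed as values and should not accumulate in `.grad` across repeated calls.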
