## Intro to PyTorch

PyTorch is a numerical computation library similar to NumPy, but it operates on tensors instead of arrays.

Tensors behave much like NumPy arrays, but PyTorch also records the operations you perform on them in a computation graph. Using that graph, it can compute gradients with a single call to backward().

![title](images/NN1_computation_graph.JPG)
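
As a minimal sketch of the idea (the tensor values here are illustrative and not taken from the figure), the cell below builds a tiny graph and asks PyTorch for the gradient of a scalar with respect to its input:

```python
import torch

# A leaf tensor that autograd should track.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# Each operation adds a node to the computation graph.
y = (x ** 2).sum()  # y = x1^2 + x2^2 + x3^2

# Walk the graph backwards; this fills x.grad with dy/dx = 2 * x.
y.backward()

print(x.grad)  # tensor([2., 4., 6.])
```

Only tensors created with requires_grad=True (and values computed from them) are tracked, which is why the training loops later in this notebook set that flag on the weights.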

In [1]:
import torch

# there are many ways of initializing tensors
a = torch.tensor([1.0, 2.0])  # from a Python list (float32)
b = torch.arange(10).view(5, 2)
c = torch.zeros(10, 2)
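
The constructors do not all produce the same dtype, which matters once tensors are combined in matrix products. The following sketch, assuming PyTorch's default dtype settings have not been changed, prints the dtype and shape each constructor yields:

```python
import torch

legacy = torch.Tensor([1, 2])        # legacy constructor: uses the default float dtype
inferred = torch.tensor([1, 2])      # infers the dtype from the data
grid = torch.arange(10).view(5, 2)   # integers 0..9 reshaped to 5 rows, 2 columns
zeros = torch.zeros(10, 2)           # floating-point zeros

print(legacy.dtype)    # torch.float32
print(inferred.dtype)  # torch.int64
print(grid.shape)      # torch.Size([5, 2])
print(zeros.dtype)     # torch.float32
```

Gradients can only be tracked for floating-point tensors, so float data is the usual choice for anything that will be trained.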

### Linear regression

In [42]:
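def sample_linear_data(m=20):
    """Sample m noisy points from a linear model with known weights and intercept."""
    ground_truth_w = torch.tensor([[2.3, 1.2]])  # true weights (slopes), shape (1, 2)
    ground_truth_b = -8  # true intercept
    X = torch.randn(m, 2)  # inputs drawn from a standard normal distribution
    # labels: linear function of X plus Gaussian noise with standard deviation 0.2
    Y = torch.matmul(X, ground_truth_w.t()) + ground_truth_b + 0.2 * torch.randn(m, 1)
    return X, Y  # X (inputs, shape (m, 2)) and Y (labels, shape (m, 1))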

In [43]:
X, Y = sample_linear_data()
print('X:',X, '\n')
print('Y:',Y, '\n')

X: tensor([[ 0.1392,  0.6675],
        [ 1.2486,  1.4047],
        [-0.1421, -0.3008],
        [-1.0824, -0.1464],
        [-0.7338, -0.8334],
        [ 0.3825, -0.9441],
        [-0.6978, -0.2023],
        [-1.0797, -0.0226],
        [ 1.8519, -0.1263],
        [ 0.2155, -1.2444],
        [-0.3460,  0.0163],
        [-2.0616,  0.9679],
        [ 0.0282, -0.9284],
        [ 0.3159, -1.5359],
        [-0.3081,  0.1873],
        [ 0.0820, -0.1434],
        [ 0.6531, -1.6971],
        [-1.9849,  0.2293],
        [-0.9236, -0.1811],
        [-0.4425,  1.4756]]) 

Y: tensor([[ -6.7628],
        [ -3.4803],
        [ -8.6642],
        [-10.7079],
        [-10.8685],
        [ -8.4336],
        [ -9.5745],
        [-10.7069],
        [ -3.4764],
        [ -8.6176],
        [ -8.5480],
        [-11.6444],
        [ -9.2652],
        [ -9.0537],
        [ -8.3913],
        [ -7.9818],
        [ -8.6755],
        [-11.9595],
        [ -9.9847],
        [ -7.1044]]) 



#### From scratch

In [48]:
# linear regression optimized with gradient descent in PyTorch, from scratch
lr = 0.001
epochs = 5
# initial weights; requires_grad=True tells autograd to compute gradients for this tensor
# note: this model has no bias term, so it cannot fit the intercept of the data
weights = torch.randn((2, 1), requires_grad=True)
optimizer = torch.optim.SGD([weights], lr=lr)  # optimizer that updates the weights tensor

for i in range(epochs):
    y_hat = torch.matmul(X, weights)  # predictions, shape (m, 1)
    cost = torch.sum(torch.pow(y_hat - Y, 2))  # sum of squared errors

    optimizer.zero_grad()  # clear gradients stored from the previous iteration
    cost.backward()  # compute gradients of the cost with respect to tensors that have requires_grad=True
    optimizer.step()  # update the weights using the computed gradients

    print('Epoch cost:', cost.item())  # print the cost for this epoch

Epoch cost: 1457.4254150390625
Epoch cost: 1436.4327392578125
Epoch cost: 1416.7264404296875
Epoch cost: 1398.2254638671875
Epoch cost: 1380.8533935546875
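
The loop above fits only a weight matrix, with no bias term, and delegates the update to torch.optim.SGD. The sketch below is one possible variant, not the notebook's original method: it adds a bias tensor and writes the gradient descent update by hand. The seed, learning rate, and epoch count are assumptions chosen so the example converges, and the data is regenerated inline with the same ground truth as sample_linear_data.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed so runs are repeatable

# Synthetic data with the same ground truth as sample_linear_data.
m = 20
X = torch.randn(m, 2)
Y = X @ torch.tensor([[2.3], [1.2]]) - 8 + 0.2 * torch.randn(m, 1)

lr = 0.005    # assumption: larger step than the cell above
epochs = 500  # assumption: enough iterations to converge

weights = torch.randn((2, 1), requires_grad=True)
bias = torch.zeros(1, requires_grad=True)  # lets the model fit the intercept

costs = []
for i in range(epochs):
    y_hat = X @ weights + bias
    cost = torch.sum((y_hat - Y) ** 2)  # sum of squared errors
    cost.backward()  # fills weights.grad and bias.grad

    # Manual SGD step; no_grad keeps the update itself out of the graph.
    with torch.no_grad():
        weights -= lr * weights.grad
        bias -= lr * bias.grad

    # Gradients accumulate across backward() calls, so reset them.
    weights.grad.zero_()
    bias.grad.zero_()
    costs.append(cost.item())

print('first cost:', costs[0], 'last cost:', costs[-1])
print('weights:', weights.detach().flatten(), 'bias:', bias.item())
```

With the bias included, the learned values should end up near the ground truth of 2.3, 1.2, and -8, up to the noise in the labels.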


#### Using torch.nn.Module

In [49]:
class LinearModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.dense1 = torch.nn.Linear(2, 1)
    def forward(self, x):
        return self.dense1(x)
        
lr = 0.001
epochs = 5
H = LinearModel()
optimizer = torch.optim.SGD(H.parameters(), lr=lr)  # optimizer that updates the model's parameters (weight and bias)

for i in range(epochs):
    y_hat = H(X)  # calling the module runs forward()
    cost = torch.sum(torch.pow(y_hat - Y, 2))  # sum of squared errors

    optimizer.zero_grad()  # clear gradients stored from the previous iteration
    cost.backward()  # compute gradients of the cost with respect to the model's parameters
    optimizer.step()  # update the parameters using the computed gradients

    print('Epoch cost:', cost.item())  # print the cost for this epoch

Epoch cost: 1421.1077880859375
Epoch cost: 1289.6917724609375
Epoch cost: 1170.557373046875
Epoch cost: 1062.5509033203125
Epoch cost: 964.6268310546875
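
torch.nn.Linear learns a bias by default, so this model can represent the intercept of the data. Five epochs are not enough to see that, so here is a hedged sketch that trains longer and then reads the learned parameters back. The seed, learning rate, epoch count, and the switch to torch.nn.MSELoss (mean rather than sum of squared errors) are assumptions made for this illustration.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed so runs are repeatable

# Synthetic data with the same ground truth as sample_linear_data.
m = 20
X = torch.randn(m, 2)
Y = X @ torch.tensor([[2.3], [1.2]]) - 8 + 0.2 * torch.randn(m, 1)


class LinearModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.dense1 = torch.nn.Linear(2, 1)  # holds a (1, 2) weight and a (1,) bias

    def forward(self, x):
        return self.dense1(x)


H = LinearModel()
loss_fn = torch.nn.MSELoss()  # mean squared error, averaged over samples
optimizer = torch.optim.SGD(H.parameters(), lr=0.1)

for i in range(500):
    cost = loss_fn(H(X), Y)
    optimizer.zero_grad()
    cost.backward()
    optimizer.step()

# Learned parameters, to compare against the ground truth (2.3, 1.2) and -8.
print('weight:', H.dense1.weight.detach())
print('bias:', H.dense1.bias.detach())
print('final cost:', cost.item())
```

Averaging the loss instead of summing it keeps the gradient scale independent of the number of samples, which is why a larger learning rate is usable here.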
