<h4 align="center">PLEASE OPEN THIS FILE IN JUPYTER, AS GITHUB CAN'T DISPLAY IT PROPERLY.</h4>

### PyTorch

<center><img src="img/pytorch-logo.jpeg" width="800" /></center>

### What is PyTorch?

<center><img src="img/pytorch-logo.jpeg" width="400" /></center>

* open-source machine learning library written in Python, C++ and CUDA

* has NumPy-like interfaces

* provides two core features:
    * operations with tensors
    * automatic differentiation
    
    
* initially developed at Facebook

https://pytorch.org

### What are tensors?

* Tensors are multidimensional arrays: a scalar is a 0-d tensor, a vector is 1-d, a matrix is 2-d, and so on.

<img src="img/tensor.jpeg" width="800">
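As a rough illustration of the idea, the sketch below builds tensors with an increasing number of dimensions and prints how many axes each one has. The variable names are made up for this example.

```python
import torch

scalar = torch.tensor(3.5)                 # 0-d tensor (a single number)
vector = torch.tensor([3.5, 2.6, 7.1])     # 1-d tensor
matrix = torch.ones(2, 3)                  # 2-d tensor
cube = torch.zeros(2, 3, 4)                # 3-d tensor

for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("cube", cube)]:
    # dim() is the number of axes, shape lists the size of each axis
    print(name, t.dim(), tuple(t.shape))
```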

### PyTorch Tensor operations - Vector

In [1]:
import torch

In [2]:
v = torch.tensor([3.5, 2.6, 7.1])

In [3]:
v

tensor([3.5000, 2.6000, 7.1000])

In [4]:
type(v)

torch.Tensor

In [5]:
v.shape

torch.Size([3])

### PyTorch Tensor operations - Matrix

In [6]:
m = torch.ones(3, 3)

In [7]:
m

tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])

In [8]:
m.shape

torch.Size([3, 3])

### Multiply matrix by vector (element-wise)

In [9]:
m = torch.rand(5, 5)

In [10]:
m

tensor([[0.8417, 0.6079, 0.0884, 0.2759, 0.2184],
        [0.0930, 0.7163, 0.2147, 0.3660, 0.4807],
        [0.1781, 0.8792, 0.2100, 0.4183, 0.2520],
        [0.3426, 0.1494, 0.5589, 0.8576, 0.5515],
        [0.5582, 0.9533, 0.1440, 0.4115, 0.3344]])

In [11]:
v = torch.rand(5, 1)

In [12]:
v

tensor([[0.3837],
        [0.1602],
        [0.0260],
        [0.8665],
        [0.7974]])

In [13]:
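# note: `*` is element-wise with broadcasting (row i of m is scaled by v[i]);
# the matrix-vector product is m @ v (or torch.matmul(m, v))
m * v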

tensor([[0.3229, 0.2332, 0.0339, 0.1058, 0.0838],
        [0.0149, 0.1147, 0.0344, 0.0586, 0.0770],
        [0.0046, 0.0228, 0.0055, 0.0109, 0.0065],
        [0.2969, 0.1295, 0.4843, 0.7432, 0.4779],
        [0.4451, 0.7601, 0.1148, 0.3281, 0.2666]])
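The `m * v` expression above multiplies element-wise: the 5x1 column `v` is broadcast across the columns of `m`, which is why the result is still 5x5. The sketch below contrasts this with the matrix-vector product from linear algebra, using small hand-picked values (chosen here for illustration only) so the results can be checked by hand.

```python
import torch

m = torch.tensor([[1., 2.],
                  [3., 4.]])
v = torch.tensor([[10.],
                  [100.]])

# element-wise product: v (shape 2x1) is broadcast across the columns of m,
# so row i of m is scaled by v[i]
elementwise = m * v          # shape (2, 2)

# matrix-vector product: each output entry is the dot product of a row of m with v
product = m @ v              # same as torch.matmul(m, v), shape (2, 1)

print(elementwise)
print(product)
```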

### Converting a PyTorch Tensor to a NumPy Array

In [14]:
a = torch.randn(10)

In [15]:
a

tensor([-0.1857, -0.2784, -1.1172, -0.8458, -1.0361,  1.0809,  2.7587, -1.3076,
        -0.6104, -1.8193])

In [16]:
a.numpy()

array([-0.18573917, -0.27842855, -1.1172398 , -0.84583735, -1.0360615 ,
        1.080877  ,  2.7587461 , -1.3075684 , -0.61044526, -1.8193228 ],
      dtype=float32)

### Converting NumPy Array to PyTorch Tensor

In [17]:
import numpy as np

In [18]:
a = np.ones(7)
a

array([1., 1., 1., 1., 1., 1., 1.])

In [19]:
b = torch.from_numpy(a)
b

tensor([1., 1., 1., 1., 1., 1., 1.], dtype=torch.float64)
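One detail worth keeping in mind: `torch.from_numpy` does not copy the data, so the array and the tensor share the same memory, and the tensor keeps NumPy's `float64` dtype. A minimal sketch of both effects, with made-up values:

```python
import numpy as np
import torch

a = np.ones(3)                 # float64 NumPy array
b = torch.from_numpy(a)        # tensor that shares memory with `a`

a[0] = 5.0                     # modify the array in place...
print(b)                       # ...and the tensor sees the change

c = b.float()                  # converting to float32 makes an independent copy
a[1] = 7.0
print(c)                       # unchanged by the second write
```

This is why the training code later in this notebook calls `.float()` after `torch.from_numpy`: model parameters are `float32` by default.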

### autograd.Variable

<center><img src="img/autograd-variable.png" width="400" /></center>

* part of the **autograd** module of PyTorch
* simply a wrapper around **torch.Tensor** (since PyTorch 0.4 the two are merged, and `Variable(t)` returns a plain tensor)
* used to build the computational graph
* automatically accumulates the gradient w.r.t. this variable (controlled by the **requires_grad** parameter)
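In PyTorch 0.4 and later the same behaviour is available directly on tensors, so the wrapper is optional. The sketch below, with made-up values, shows a tensor that tracks gradients next to one that does not.

```python
import torch

# a leaf tensor that tracks gradients, no Variable wrapper needed
x = torch.ones(2, 2, requires_grad=True)
# a tensor that does not track gradients (the default)
c = torch.ones(2, 2)

y = (x * 3 + c).sum()   # operations on x are recorded in the graph
y.backward()            # populate x.grad with dy/dx

print(x.grad)           # every entry is 3
print(c.grad)           # None: c was not tracked
```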

### autograd.Variable

In [20]:
from torch.autograd import Variable

In [21]:
x = Variable(torch.ones(2, 2), requires_grad=True)

In [22]:
y = x + 100

In [23]:
z = 2 * (y ** 2)

In [24]:
z

tensor([[20402., 20402.],
        [20402., 20402.]], grad_fn=<MulBackward>)

In [25]:
out = torch.mean(z)

In [26]:
out

tensor(20402., grad_fn=<MeanBackward1>)

In [27]:
# computes the sum of gradients of given tensors w.r.t. graph leaves
out.backward()

In [28]:
x.grad

tensor([[101., 101.],
        [101., 101.]])
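The value 101 can be checked by hand. With $y = x + 100$, $z = 2y^2$ and `out` the mean over the four entries of $z$:

$$
\text{out} = \frac{1}{4}\sum_{i,j} 2\,(x_{ij} + 100)^2
\quad\Longrightarrow\quad
\frac{\partial\,\text{out}}{\partial x_{ij}} = \frac{1}{4}\cdot 4\,(x_{ij} + 100) = x_{ij} + 100
$$

Every entry of $x$ is 1, so every entry of the gradient is $1 + 100 = 101$.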

### Simple autograd example - Regression

In [29]:
# Input feature vector

x = [1., 2., 3., 4., 5.]

In [30]:
# Target variables

y = [10., 20., 30., 40., 50.]

In [31]:
# Weight (this example has no bias term)

w = Variable(torch.tensor([1.]), requires_grad=True)

<center><img src="img/linear-regression-graph.png" width="600"></center>

In [32]:
# training iterations

for epoch in range(5):
    
    for x_i, y_i in zip(x, y):
        
        # compute predicted target variable
        y_pred = x_i * w
                
        # compute squared error for this single sample
        loss = (y_pred - y_i) ** 2
        
        # compute gradients
        loss.backward()
        
        print('\t x={x_i}, y={y_i}, w.grad={w.grad[0]}'.format(**locals()))
        
        # make one step towards the local minimum, with learning rate 0.01
        w.data -= 0.01 * w.grad.data
        
        # clear gradients after updating weights
        w.grad.data.zero_()
        
    print('Loss at epoch #%d: %.6f \n' % (epoch+1, loss.data[0]))

print('Final: w = %.4f' % w.data)

	 x=1.0, y=10.0, w.grad=-18.0
	 x=2.0, y=20.0, w.grad=-70.55999755859375
	 x=3.0, y=30.0, w.grad=-146.0592041015625
	 x=4.0, y=40.0, w.grad=-212.92185974121094
	 x=5.0, y=50.0, w.grad=-226.22947692871094
Loss at epoch #1: 511.797760 

	 x=1.0, y=10.0, w.grad=-4.524589538574219
	 x=2.0, y=20.0, w.grad=-17.73638916015625
	 x=3.0, y=30.0, w.grad=-36.71432876586914
	 x=4.0, y=40.0, w.grad=-53.521331787109375
	 x=5.0, y=50.0, w.grad=-56.866416931152344
Loss at epoch #2: 32.337894 

	 x=1.0, y=10.0, w.grad=-1.1373271942138672
	 x=2.0, y=20.0, w.grad=-4.458320617675781
	 x=3.0, y=30.0, w.grad=-9.228721618652344
	 x=4.0, y=40.0, w.grad=-13.45343017578125
	 x=5.0, y=50.0, w.grad=-14.294281005859375
Loss at epoch #3: 2.043265 

	 x=1.0, y=10.0, w.grad=-0.2858867645263672
	 x=2.0, y=20.0, w.grad=-1.1206741333007812
	 x=3.0, y=30.0, w.grad=-2.3197975158691406
	 x=4.0, y=40.0, w.grad=-3.381744384765625
	 x=5.0, y=50.0, w.grad=-3.5931015014648438
Loss at epoch #4: 0.129104 

	 x=1.0, y=10.0, w.grad=
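The gradients printed above can be reproduced without autograd. For a single sample the loss is $(x_i w - y_i)^2$, so its derivative with respect to $w$ is $2 x_i (x_i w - y_i)$. The following plain-Python sketch repeats the same loop with that formula; `first_epoch_grads` is a helper list introduced here only for inspection.

```python
xs = [1., 2., 3., 4., 5.]
ys = [10., 20., 30., 40., 50.]

w = 1.0      # same initial weight as above
lr = 0.01    # same learning rate as above
first_epoch_grads = []

for epoch in range(5):
    for x_i, y_i in zip(xs, ys):
        # d/dw (x_i * w - y_i)^2 = 2 * x_i * (x_i * w - y_i)
        grad = 2 * x_i * (x_i * w - y_i)
        if epoch == 0:
            first_epoch_grads.append(grad)
        # plain gradient descent step
        w -= lr * grad

print(first_epoch_grads)
print('Final: w = %.4f' % w)
```

The first three gradients come out as about -18.0, -70.56 and -146.0592, matching the autograd output up to float32 rounding, and `w` moves towards 10, the slope that maps `x` to `y` exactly.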

### Linear Regression using PyTorch

The diabetes dataset consists of 10 physiological variables (including age, sex,
weight and blood pressure) measured on 442 patients, and an indication of
disease progression after one year.

* **Samples total** - 442
* **Dimensionality** - 10
* **Features** - real, -.2 < x < .2
* **Targets** - integer 25 - 346

##### Load dataset from scikit-learn

In [33]:
from sklearn import datasets

# http://scikit-learn.org/stable/auto_examples/linear_model/plot_ols.html
diabetes = datasets.load_diabetes()

X = diabetes.data[:]
y = diabetes.target[:].reshape(-1,1)

### Define Neural Network class

In [34]:
from torch import nn

In [35]:
class LinearRegressionModel(nn.Module):
    """
    Define Linear Regression class
    """

    def __init__(self, input_dim, output_dim):
        super(LinearRegressionModel, self).__init__() 
        self.linear = nn.Linear(input_dim, output_dim)  # neural network with 1 layer

    def forward(self, x):
        return self.linear(x)
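To make the single layer less of a black box, here is a small sketch showing that `nn.Linear` computes `x @ W.T + b`. The sizes 10 and 1 mirror the 10 features and single target of the diabetes dataset; the input batch is random and purely illustrative.

```python
import torch
from torch import nn

torch.manual_seed(0)          # fixed seed so the sketch is repeatable
linear = nn.Linear(10, 1)     # 10 input features, 1 output
x = torch.rand(4, 10)         # a made-up batch of 4 samples

# nn.Linear stores its weight with shape (out_features, in_features)
manual = x @ linear.weight.t() + linear.bias

print(linear.weight.shape, linear.bias.shape)
print(torch.allclose(linear(x), manual))
```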

In [36]:
# instantiate Linear Regression

model = LinearRegressionModel(10, 1)

In [37]:
# define loss function - Mean Squared Error (MSE)

criterion = nn.MSELoss() # mean of (y_hat - y)^2 over all elements

In [38]:
# define learning rate

lr = 0.5

# define parameter optimizer

optimizer = torch.optim.SGD(model.parameters(), lr=lr)

In [39]:
list(model.parameters())

[Parameter containing:
 tensor([[-0.3020, -0.1452,  0.2359, -0.1441,  0.2556,  0.0921,  0.0622,  0.0337,
          -0.2090,  0.0956]], requires_grad=True), Parameter containing:
 tensor([-0.1396], requires_grad=True)]
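With default settings (no momentum, no weight decay), one `optimizer.step()` of `torch.optim.SGD` replaces each parameter `p` with `p - lr * p.grad`. The sketch below checks this on a small made-up model and random data; none of these values come from the diabetes example.

```python
import torch
from torch import nn

torch.manual_seed(0)
layer = nn.Linear(3, 1)                       # small made-up model
optimizer = torch.optim.SGD(layer.parameters(), lr=0.5)

x = torch.rand(8, 3)                          # made-up inputs
y = torch.rand(8, 1)                          # made-up targets

loss = nn.MSELoss()(layer(x), y)
loss.backward()

# what plain SGD should produce: p - lr * p.grad for every parameter
expected = [p.detach() - 0.5 * p.grad for p in layer.parameters()]

optimizer.step()

for p, e in zip(layer.parameters(), expected):
    print(torch.allclose(p.detach(), e))
```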

### Main training loop

In [40]:
n_epochs = 1000

for epoch in range(n_epochs):
    
    # convert features and target into PyTorch Variable
    inputs = Variable(torch.from_numpy(X).float())
    targets = Variable(torch.from_numpy(y).float())

    # forward pass
    outputs = model(inputs)

    # calculate loss (MSE)
    loss = criterion(outputs, targets)
    
    # compute gradients
    loss.backward()
    
    # perform one step in the opposite direction to the gradient (update weights)
    optimizer.step()
    
    # clear gradient values after weights are updated
    optimizer.zero_grad()
    
    if epoch % 100 == 0:
        print('epoch {}, loss {}'.format(epoch, loss.item()))

epoch 0, loss 29117.064453125
epoch 100, loss 3934.593017578125
epoch 200, loss 3407.29345703125
epoch 300, loss 3194.216796875
epoch 400, loss 3079.41845703125
epoch 500, loss 3009.8671875
epoch 600, loss 2965.934814453125
epoch 700, loss 2937.665283203125
epoch 800, loss 2919.249267578125
epoch 900, loss 2907.12158203125


### Measure the goodness of fit

In [41]:
# switch to evaluation mode

model = model.eval()

In [42]:
# predict target variable for the whole dataset

with torch.no_grad():
    y_pred = model.forward(Variable(torch.from_numpy(X).float()))
    y_pred = y_pred.data.numpy()

In [43]:
from sklearn.metrics import r2_score

# calculate R^2 score (coefficient of determination)
r2_score(y, y_pred)

0.511112265566122
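`r2_score` returns the coefficient of determination, $R^2 = 1 - SS_{res} / SS_{tot}$, so a value of about 0.51 means the model explains roughly half of the variance in the target. A minimal plain-Python sketch of the formula follows; the helper `r2` and the sample values are made up for illustration and are not part of scikit-learn.

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

y_true = [10., 20., 30., 40.]               # made-up targets
perfect = r2(y_true, y_true)                # 1.0: predictions match exactly
baseline = r2(y_true, [25.] * 4)            # 0.0: always predicting the mean
print(perfect, baseline)
```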