# Introduction to PyTorch

Welcome to this PyTorch tutorial.

All of the work done here is taught in: [https://www.youtube.com/watch?v=GIsg-ZUy0MY&t=14882s]

In [1]:
# Display URL content
from jinja2 import Template
from IPython.display import IFrame, display, HTML

In [19]:
import torch
import numpy as np
import pandas as pd

## What is PyTorch?

In [30]:
IFrame('https://en.wikipedia.org/wiki/PyTorch', width=1550, height=450)

## Creating Tensors

In short, a tensor is a generalization of a matrix; in other words, it is an n-dimensional array.

A key concept for tensors is the _rank_, which is the number of dimensions (axes) of the tensor:

<div>
<ul>
    <li> Scalar: rank 0 </li>
    <li> Vector: rank 1 </li>
    <li> Matrix: rank 2 </li>
    <li> Tensor: rank n </li>
</ul>
</div>
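The rank depends only on how the elements are arranged, not on how many there are. A minimal sketch, using illustrative tensor names (`v`, `m`, `t`) that are not part of the tutorial, shows the same six elements at rank 1, 2 and 3:

```python
import torch

# A flat tensor with 6 elements: rank 1
v = torch.arange(6)

# The same 6 elements arranged as 2 rows x 3 columns: rank 2
m = v.reshape(2, 3)

# Adding a leading axis of size 1 gives rank 3
t = m.unsqueeze(0)

for name, x in [("v", v), ("m", m), ("t", t)]:
    print(name, "rank:", x.dim(), "shape:", tuple(x.shape), "elements:", x.numel())
```

In all three cases `numel()` stays at 6; only `dim()` and the shape change.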

In [4]:
from torch import tensor

# We create several tensors
t0 = tensor(1.0)
t1 = tensor([1., 2., 3., 4., 5., 6.], dtype=torch.float64)
t2 = tensor([[1, 2, 3], [4, 5, 6]], dtype=torch.double)
t3 = tensor([[[1, 1], [1, 1], [1, 1]]], dtype=torch.int8)

tensors = [t0, t1, t2, t3]

print("-----> Tensors <-----")
for t in tensors:
    print(f"\t>> {np.array(t)}")
    print(f"\t\tRank:\t{t.dim()}")
    print(f"\t\tType:\t{t.dtype}")
    print(f"\t\tShape:\t{t.size()}")
    print(f"\t\tN. Elements:\t{t.numel()}")

-----> Tensors <-----
	>> 1.0
		Rank:	0
		Type:	torch.float32
		Shape:	torch.Size([])
		N. Elements:	1
	>> [1. 2. 3. 4. 5. 6.]
		Rank:	1
		Type:	torch.float64
		Shape:	torch.Size([6])
		N. Elements:	6
	>> [[1. 2. 3.]
 [4. 5. 6.]]
		Rank:	2
		Type:	torch.float64
		Shape:	torch.Size([2, 3])
		N. Elements:	6
	>> [[[1 1]
  [1 1]
  [1 1]]]
		Rank:	3
		Type:	torch.int8
		Shape:	torch.Size([1, 3, 2])
		N. Elements:	6


In [5]:
template = Template(
"""
<html>
<big> This is {{ whoami }} </big>
<hr/>
<h3> Tensors </h3>
    <ul>
        {% for t in tensors %}
            <li> {{ t }}  </li>
            <ul>
                <li> Rank = {{ t.dim() }} </li>
                <li> Size = {{ t.size() }} </li>
                <li> Type = {{ t.dtype }} </li>
                <li> Number of elements = {{ t.numel() }} </li>
            </ul>
        {% endfor %}
    </ul>
</html>
""")

display(HTML(template.render({
    "whoami": "PyTorch",
    "tensors": tensors
})))

## Tensor Arithmetic

In [6]:
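# Basic arithmetic and tensor operations
t1 = tensor([1, 1, 1, 1])
t2 = tensor([1, 2, 3, 4])

# Element-wise addition, same as t1 + t2
add = torch.add(t1, t2)

# Element-wise subtraction, same as t1 - t2
subs = torch.sub(t1, t2)

# For two 1-D tensors, matmul (same as t1 @ t2) returns their dot product
times = torch.matmul(t1, t2)

# t() returns a 1-D tensor unchanged, since it has no second axis to swap
transpose = t1.t()

# Dot product of two 1-D tensors
dot = torch.dot(t1, t2)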



In [7]:
template = Template(
"""
<html>
<h3> Binary and Unary operators </h3>
    
    <strong>Tensor 1:</strong> {{ t1 }}
    <br/>
    <strong>Tensor 2:</strong> {{ t2 }}
    
    <ul>
        <li> Addition:<br/> {{ add }} </li>
        <li> Subtraction:<br/> {{ subs }} </li>
        <li> Multiplication:<br/> {{ times }} </li>
        <li> Transpose:<br/> {{ transpose }} </li>
        <li> Dot product:<br/> {{ dot }} </li>
    </ul>
</html>
""")

display(HTML(template.render({
    "t1": t1,
    "t2": t2,
    "add": add,
    "subs": subs,
    "times": times,
    "transpose": transpose,
    "dot": dot
})))
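The tensors above are 1-D, where `torch.matmul` reduces to a dot product and a transpose has no visible effect. A small sketch with 2-D tensors, using made-up matrices `A` and `B`, may make the difference between element-wise and matrix multiplication easier to see:

```python
import torch

# Two illustrative 2x2 matrices (not taken from the tutorial)
A = torch.tensor([[1, 2],
                  [3, 4]])
B = torch.tensor([[0, 1],
                  [1, 0]])

elementwise = A * B       # multiplies matching entries
matrix_product = A @ B    # rows of A times columns of B
transposed = A.t()        # swaps the two axes of a 2-D tensor

print(elementwise)     # tensor([[0, 2], [3, 0]])
print(matrix_product)  # tensor([[2, 1], [4, 3]])
print(transposed)      # tensor([[1, 3], [2, 4]])
```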

## Tensor Gradients

We are interested not only in using tensors but mainly in building an artificial neural network. To train the network, its weights are readjusted via the backpropagation algorithm every time the outputs are in error.

Backpropagation means computing the gradient (the partial derivatives) of the _loss function_ $J$ with respect to the _network weights_ $\theta$.

Symbolically: $$\frac{\partial J}{\partial \theta}$$

To compute the __gradient__ with respect to a tensor, we pass the keyword argument _requires\_grad=True_ to the torch.tensor function.

In the basic model:

$ y = Wx + b$ 

where $W\in{R^{n,m}}$, 

$b\in{R^{n}}$

and $x\in{R^{m}}$

When we compute the gradient of each output $y_i$ with respect to the parameters $W$ and $b$, we get:

$$ \frac{\partial y_i}{\partial W_{ij}} = x_j $$

$$ \frac{\partial y_i}{\partial b_i} = 1 $$
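The two derivatives above can be checked with autograd. The following is a minimal sketch, assuming a model with two outputs and three inputs (sizes and values chosen here only for illustration). Since `backward()` needs a scalar, the outputs are summed first, so every row of the weight gradient should equal `x` and the bias gradient should be all ones:

```python
import torch

# Illustrative parameters: n = 2 outputs, m = 3 inputs
W = torch.tensor([[1., 0., -1.],
                  [2., 1., 0.]], requires_grad=True)
x = torch.tensor([3., -1., -1.])
b = torch.zeros(2, requires_grad=True)

y = W @ x + b          # shape (2,)
y.sum().backward()     # reduce to a scalar before differentiating

print(W.grad)  # each row is a copy of x
print(b.grad)  # tensor([1., 1.])
```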

In [11]:
# We can use combinations of those operations
# For example, in Artificial Neural Networks the firing of
# a neuron can be represented as:

Weights = tensor([[1, 0, -1]], dtype=torch.float, requires_grad=True)
x = tensor([3, -1, -1], dtype=torch.float)
bias = tensor(1, dtype=torch.float, requires_grad=True)


output = Weights@x + bias

print(output)

tensor([5.], grad_fn=<AddBackward0>)


In [12]:
# Once the output variable has been calculated,
# we can compute its gradient with respect to
# the tensors that require gradients
output.backward()

print(f"Weight Grad dO/dW = {Weights.grad}")
print(f"Bias Grad dO/db = {bias.grad}")

Weight Grad dO/dW = tensor([[ 3., -1., -1.]])
Bias Grad dO/db = 1.0
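One detail of `backward()` that matters for training loops: gradients are accumulated in `.grad` across calls rather than overwritten. A short sketch with a hypothetical scalar parameter `w` illustrates this, and why the gradients are usually reset with `zero_()` between steps:

```python
import torch

w = torch.tensor(2.0, requires_grad=True)

(3 * w).backward()
first = w.grad.clone()     # d(3w)/dw = 3

(3 * w).backward()
second = w.grad.clone()    # 3 + 3 = 6, the new gradient was added to the old one

w.grad.zero_()             # reset the stored gradient in place
(3 * w).backward()
third = w.grad.clone()     # back to 3

print(first, second, third)
```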


# Basic Machine Learning Algorithms

We'll cover:

+ Linear Regression
+ Neural Networks

## Linear Regression 

The goal is to predict crop yields (apples and oranges) from temperature, rainfall and humidity.

In linear regression, we assume that the output variable is a linear function of the inputs, with the weights and the bias as parameters. This means that a hyperplane is fitted to the data by minimizing an error function (typically the Mean Squared Error) that compares the target values with the model's predicted values.

### What LR is

Make a prediction by multiplying the weight matrix by the input vector, then adding the bias vector.

$$ y_i = \sum_{j=1}^{m} W_{ij}x_{j} + b_i $$

$$ y = Wx + b $$
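When several samples are stacked as the rows of a matrix, the same formula is usually applied to the whole batch at once. The sketch below assumes 5 samples, 3 features and 2 outputs, with random values used purely as placeholders; the names `X`, `W`, `b` and `Y` are illustrative:

```python
import torch

X = torch.randn(5, 3)   # 5 samples as rows, 3 features each
W = torch.randn(2, 3)   # one row of weights per output
b = torch.randn(2)      # one bias per output

# (5, 3) @ (3, 2) -> (5, 2); b is broadcast across the 5 rows
Y = X @ W.t() + b

# The first row of Y matches the single-vector formula y = Wx + b
y0 = W @ X[0] + b
print(Y.shape, torch.allclose(Y[0], y0, atol=1e-5))
```

Writing the product as `W @ X.t()` instead would give a `(2, 5)` result, which can no longer be added to a bias of shape `(2,)` by broadcasting.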

### Loss Function

Compare the model's prediction with the true targets, then adjust the weights to minimize the error.

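$$ MSE = \frac{1}{N}\sum_{i=1}^{N}(Y_i-Y'_i)^2 $$

where $Y_i$ is a target value, $Y'_i$ is the model's prediction for it, and $N$ is the number of values compared.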
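As a worked example of the formula, the sketch below computes the MSE by hand and compares it with `torch.nn.functional.mse_loss`, which averages over all elements by default. The target and prediction values are made up for illustration:

```python
import torch
import torch.nn.functional as F

# Illustrative values only
y_true = torch.tensor([[56., 70.],
                       [81., 101.]])
y_pred = torch.tensor([[50., 75.],
                       [80., 100.]])

# Errors are 6, -5, 1, 1 -> squares 36, 25, 1, 1 -> mean 63 / 4
manual = torch.sum((y_true - y_pred) ** 2) / y_true.numel()
builtin = F.mse_loss(y_pred, y_true)

print(manual.item())   # 15.75
print(builtin.item())  # 15.75
```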

### Gradient

The gradient is a vector that points in the direction of the steepest increase of the loss surface. Another way to view it is as the slope of the loss surface at a given point.

### Adjust the weights

The weights of the model are then updated using the calculated gradients. The update goes in the opposite direction of the steepest increase, so that the loss function is minimized.

$$ W_i = W_i - \alpha\frac{\partial MSE}{\partial W_i}  $$
$$ b_i = b_i - \alpha\frac{\partial MSE}{\partial b_i}  $$

where $\alpha$ is the learning rate.
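The update rule can be tried on a toy problem before moving to the crop data. This is a minimal sketch under assumed values: a single weight `w`, targets generated as `y = 2x`, and a learning rate of 0.05 picked by hand for this example:

```python
import torch

x = torch.tensor([1., 2., 3.])
y = 2 * x                                  # targets produced by a true weight of 2
w = torch.tensor(0.0, requires_grad=True)  # start far from the solution
alpha = 0.05                               # learning rate (assumed for this toy case)

losses = []
for step in range(20):
    loss = torch.mean((w * x - y) ** 2)    # MSE of the current prediction
    losses.append(loss.item())
    loss.backward()                        # d(loss)/dw is stored in w.grad
    with torch.no_grad():                  # the update itself is not tracked
        w -= alpha * w.grad
        w.grad.zero_()                     # clear the gradient for the next step

print(f"w = {w.item():.4f}, first loss = {losses[0]:.4f}, last loss = {losses[-1]:.6f}")
```

Each step moves `w` against the gradient, so the loss shrinks and `w` approaches 2.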

In [37]:
# Input (temp, rainfall, humidity)
INPUTS = ['Temperature', 'Rainfall', 'Humidity']
inputs = np.array([[73, 67, 43], 
                   [91, 88, 64], 
                   [87, 134, 58], 
                   [102, 43, 37], 
                   [69, 96, 70]], dtype='float32')
inputT = torch.from_numpy(inputs)
print(f'---Input Tensor---\n{inputT}\n')

# Targets (apples, oranges)
TARGETS = ['Apples', 'Oranges']
targets = np.array([[56, 70], 
                    [81, 101], 
                    [119, 133], 
                    [22, 37], 
                    [103, 119]], dtype='float32')
targetT = torch.from_numpy(targets)
print(f'---Target Tensor---\n{targetT}')
      
# Dataframe for the training data (a flat list of names gives plain column labels)
df = pd.DataFrame(np.concatenate((inputs, targets), axis=1), columns=INPUTS + TARGETS)
df

---Input Tensor---
tensor([[ 73.,  67.,  43.],
        [ 91.,  88.,  64.],
        [ 87., 134.,  58.],
        [102.,  43.,  37.],
        [ 69.,  96.,  70.]])

---Target Tensor---
tensor([[ 56.,  70.],
        [ 81., 101.],
        [119., 133.],
        [ 22.,  37.],
        [103., 119.]])


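|   | Temperature | Rainfall | Humidity | Apples | Oranges |
|---|-------------|----------|----------|--------|---------|
| 0 | 73.0        | 67.0     | 43.0     | 56.0   | 70.0    |
| 1 | 91.0        | 88.0     | 64.0     | 81.0   | 101.0   |
| 2 | 87.0        | 134.0    | 58.0     | 119.0  | 133.0   |
| 3 | 102.0       | 43.0     | 37.0     | 22.0   | 37.0    |
| 4 | 69.0        | 96.0     | 70.0     | 103.0  | 119.0   |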


In [122]:
# Linear Regression Model
class LinearReg():
    def __init__(self, features, outputs):
        # One row of weights per output, one column per input feature
        self.W = torch.randn((len(outputs), len(features)), requires_grad=True)
        self.b = torch.randn((len(outputs),), requires_grad=True)
    def predict(self, x):
        # x: (samples, features) -> y: (samples, outputs)
        # The bias is broadcast across the rows of the result
        y = x@self.W.t() + self.b
        return y
    def mse(self, y, yh):
        error = y - yh
        return torch.sum(torch.square(error))/error.numel()
    def fit(self, x, y, epochs, lr=1e-5):
        # The inputs are not normalized, so the step size has to be
        # small; a value such as 5e-3 makes the loss diverge here
        # List for each epoch loss
        L = []
        for e in range(epochs):
            # Make predictions
            yh = self.predict(x)
            # Compute the loss
            loss = self.mse(y, yh)
            L.append(loss.item())
            # Compute the gradient
            loss.backward()
            # Update the parameters
            # no_grad() keeps the update itself out of the autograd graph
            with torch.no_grad():
                self.W -= self.W.grad * lr
                self.b -= self.b.grad * lr
                # backward() accumulates gradients, so reset them
                self.W.grad.zero_()
                self.b.grad.zero_()
        return np.array(L)

In [123]:
# Instance of the model
linear_model = LinearReg(INPUTS, TARGETS)
print(f'Weights:\n{linear_model.W}')
print(f'Bias:\n{linear_model.b}')

# Training the model
Loss = linear_model.fit(inputT, targetT, 100)

Weights:
tensor([[ 1.0862, -0.7111, -0.4393],
        [-0.0042,  1.8993,  0.8857]], requires_grad=True)
Bias:
tensor([-0.5391, -1.0671], requires_grad=True)

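For comparison, the same regression can be sketched with PyTorch's built-in building blocks. This is an alternative under stated assumptions, not the approach of the cell above: `nn.Linear`, `nn.MSELoss` and `torch.optim.SGD` stand in for the hand-written model, loss and update, and the learning rate of `1e-5` is an assumed value chosen because the inputs are not normalized.

```python
import torch
from torch import nn

# Same training data as the tutorial (temperature, rainfall, humidity)
inputs = torch.tensor([[73., 67., 43.],
                       [91., 88., 64.],
                       [87., 134., 58.],
                       [102., 43., 37.],
                       [69., 96., 70.]])
# Targets (apples, oranges)
targets = torch.tensor([[56., 70.],
                        [81., 101.],
                        [119., 133.],
                        [22., 37.],
                        [103., 119.]])

torch.manual_seed(0)                 # reproducible initial weights
model = nn.Linear(3, 2)              # 3 input features, 2 outputs
loss_fn = nn.MSELoss()               # mean over all elements
optimizer = torch.optim.SGD(model.parameters(), lr=1e-5)

losses = []
for epoch in range(100):
    preds = model(inputs)            # shape (5, 2)
    loss = loss_fn(preds, targets)
    losses.append(loss.item())
    optimizer.zero_grad()            # clear gradients from the previous step
    loss.backward()
    optimizer.step()                 # gradient descent update

print(f"first loss = {losses[0]:.2f}, last loss = {losses[-1]:.2f}")
```

The list `losses` can be plotted against the epoch number to check that training is converging.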