# Introduction to PyTorch

- [PyTorch](http://pytorch.org/) is a framework for developing and training neural networks.
- It is very similar to NumPy, but its arrays are called `tensors`.
- `tensors` can be moved between CPU and GPU much more easily than NumPy `arrays`.
- PyTorch also has useful functions to calculate gradients (which is great for backpropagation) and to build neural networks.
- Compared with TensorFlow and other frameworks, PyTorch tends to feel closer to ordinary Python / NumPy / SciPy code.
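
As a minimal sketch of the gradient support mentioned above, the snippet below differentiates `y = x^2` at `x = 3`. The variable names and the choice of function are illustrative assumptions, not part of the original notebook.

```python
import torch

# A scalar tensor that records operations for automatic differentiation
x = torch.tensor(3.0, requires_grad=True)

# y = x^2, so dy/dx = 2x
y = x ** 2

# Backpropagate: fills x.grad with dy/dx evaluated at x = 3
y.backward()

print(x.grad)  # tensor(6.)
```

The same `backward()` call is what computes the gradients of a loss with respect to the weights when training a network.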

Simple machine learning models (e.g. the perceptron, linear regression, and logistic regression) compute a weighted sum of their inputs plus a bias and pass it through a function `f`:

$$
\begin{align}
y &= f(w_1 x_1 + w_2 x_2 + b) \\
y &= f\left(\sum_i w_i x_i +b \right)
\end{align}
$$

In vector form, the weighted sum (before adding the bias) is a dot product:

$$
h = \begin{bmatrix}
x_1 & x_2 & \cdots & x_n
\end{bmatrix}
\cdot 
\begin{bmatrix}
           w_1 \\
           w_2 \\
           \vdots \\
           w_n
\end{bmatrix}
$$
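
The equivalence between the explicit sum and the dot product can be checked with a small sketch. The values of `x` and `w` below are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0])    # hypothetical features
w = torch.tensor([0.5, -1.0, 2.0])   # hypothetical weights

# Explicit sum of products: w_1*x_1 + w_2*x_2 + w_3*x_3
h_sum = sum(w_i * x_i for w_i, x_i in zip(w, x))

# The same quantity as a dot product of two 1D tensors
h_dot = torch.dot(x, w)

print(h_sum, h_dot)  # both are 0.5 - 2.0 + 6.0 = 4.5
```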


## Tensors

- Machine learning algorithms are essentially linear algebra on `tensors`, which generalize matrices:
   - a vector is a 1D tensor
   - a matrix is a 2D tensor
   - an array with three indices is a 3D tensor (e.g. RGB images)
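
A short sketch of these three cases, using `ndim` and `shape` to inspect each tensor. The sizes, and the channels x height x width layout for the image, are assumptions made for illustration.

```python
import torch

vector = torch.zeros(5)          # 1D tensor
matrix = torch.zeros(4, 3)       # 2D tensor
image = torch.zeros(3, 32, 32)   # 3D tensor: channels x height x width (assumed layout)

for name, t in [("vector", vector), ("matrix", matrix), ("image", image)]:
    # ndim is the number of dimensions, shape is the size along each one
    print(name, t.ndim, tuple(t.shape))
```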


In [1]:
import torch

In [8]:
def activation_function_sigmoid(x):
    """Sigmoid activation function.

        Args:
        ---------
        x: torch.Tensor

        Returns:
        ---------
        torch.Tensor with 1 / (1 + exp(-x)) applied element-wise
    """
    return 1/(1+torch.exp(-x))
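
As a sanity check, the hand-written sigmoid can be compared against PyTorch's built-in `torch.sigmoid`. This is a minimal sketch; the input values are arbitrary and the function is repeated here only so the snippet runs on its own.

```python
import torch

def activation_function_sigmoid(x):
    # Same formula as the cell above: 1 / (1 + exp(-x))
    return 1 / (1 + torch.exp(-x))

x = torch.tensor([-2.0, 0.0, 2.0])   # arbitrary test inputs

manual = activation_function_sigmoid(x)
builtin = torch.sigmoid(x)

print(manual)
print(torch.allclose(manual, builtin))  # True
```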

In [9]:
### GENERATING RANDOM DATA
torch.manual_seed(7) # Setting the seed for replicable results

# creating a tensor with 1 row (because we have only 1 sample) and 5 columns (5 features per sample)
features = torch.randn((1, 5))     # torch.randn ---> standard normal distribution (mean=0, variance=1)

# generating random weights for the model with randn_like
weights = torch.randn_like(features)   # random tensor with the same shape, dtype and device as "features"

# BIAS term - a tensor with only 1 row and 1 column
bias = torch.randn((1, 1))

In [10]:
print('features:', features)
print('weights: ', weights)
print('bias:    ', bias)

features: tensor([[-0.1468,  0.7861,  0.9468, -1.1143,  1.6908]])
weights:  tensor([[-0.8948, -0.3556,  1.2324,  0.1382, -1.6822]])
bias:     tensor([[0.3177]])


* Just like NumPy `arrays`, `tensors` can be added, subtracted, multiplied, etc.
* The advantage here is that the same operations can also run on the GPU.
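
A minimal sketch of both points: element-wise arithmetic, then the same multiplication on whichever device is available. The tensor values are hypothetical, and the code falls back to the CPU when no CUDA GPU is present.

```python
import torch

a = torch.tensor([1.0, 2.0, 3.0])
b = torch.tensor([10.0, 20.0, 30.0])

# Element-wise arithmetic, as with NumPy arrays
print(a + b)
print(a * b)

# Pick the GPU if one is available, otherwise stay on the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Move both tensors to the chosen device and compute there
a_dev = a.to(device)
b_dev = b.to(device)

# Bring the result back to the CPU for printing or NumPy conversion
result = (a_dev * b_dev).cpu()
print(result)
```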

## Example: computing the output of a neuron model

In [11]:
### HOW TO PREDICT THE OUTPUT OF THE NEURON:

# Just like numpy, we can use the function torch.sum() as well as the .sum() method on tensors.

# torch.sum(features * weights) + bias, i.e. the weighted sum plus the bias
y1 = activation_function_sigmoid(torch.sum(features * weights) + bias)
print('option 1: ', y1)

# .sum()
y2 = activation_function_sigmoid((features * weights).sum() + bias)
print('option 2: ', y2)

# or we can multiply the matrices (more efficient, especially on GPUs) using torch.mm() or torch.matmul()
#  torch.mm()
y3 = activation_function_sigmoid(torch.mm(features, weights.view(5,1)) + bias)
print('option 3: ', y3)

option 1:  tensor([[0.1595]])
option 2:  tensor([[0.1595]])
option 3:  tensor([[0.1595]])


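- As we can see, all 3 options give the same result.
- Note that in the 3rd option we had to reshape the `weights` tensor by calling `view(5,1)`, so that a (1x5) matrix is multiplied by a (5x1) matrix.
- Without the reshape, the error would be `RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x5 and 1x5)`
- To inspect and change the shape of a tensor, we can use:

    1) `tensor.shape` (reports the current shape; it does not reshape)

    2) `tensor.reshape()` (returns a tensor with the new shape, copying the data only if needed)

    3) `tensor.resize_()` (changes the shape in place)

    4) `tensor.view()` (returns a tensor with the new shape that shares the same data)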
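
The options listed above can be sketched as follows. The `weights` tensor here is a stand-in built with `torch.arange`, not the random one from the notebook, and `resize_` is applied to a copy so the original stays untouched.

```python
import torch

# Stand-in for the weights tensor, with shape (1, 5)
weights = torch.arange(5.0).reshape(1, 5)

print(weights.shape)   # torch.Size([1, 5])

# reshape: returns a tensor with the requested shape
w_reshape = weights.reshape(5, 1)

# view: same data, new shape (no copy is made)
w_view = weights.view(5, 1)

# resize_: trailing underscore means in-place, so work on a clone here
w_resized = weights.clone().resize_(5, 1)

# The view points at the same memory as the original tensor
print(w_view.data_ptr() == weights.data_ptr())  # True
```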

In [13]:
torch.mm(features, weights)   # ERROR: shows what happens if we don't reshape the weights tensor

RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x5 and 1x5)
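
The sketch below reproduces the shape mismatch inside a `try`/`except` and then shows a few equivalent ways to get a valid (1x5) by (5x1) product. The tensors are freshly generated here, so the numbers differ from the cells above; using `.T` and the `@` operator is an alternative to `view(5,1)`, not something the original cells use.

```python
import torch

features = torch.randn(1, 5)
weights = torch.randn(1, 5)

# (1x5) times (1x5) is not a valid matrix product
failed = False
try:
    torch.mm(features, weights)
except RuntimeError as err:
    failed = True
    print("mm failed:", err)

# Three equivalent fixes: transpose, view + matmul, and the @ operator
out_t = torch.mm(features, weights.T)
out_matmul = torch.matmul(features, weights.view(5, 1))
out_at = features @ weights.T

print(out_t.shape)  # torch.Size([1, 1])
```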

## Converting between NumPy and PyTorch

In [14]:
import numpy as np
# generating a random numpy array with 4 rows and 3 columns
a = np.random.rand(4,3)
a

array([[0.63251587, 0.1920365 , 0.21510379],
       [0.46302438, 0.28097784, 0.68033106],
       [0.39671541, 0.12517758, 0.11944696],
       [0.57902134, 0.79701151, 0.22283755]])

In [15]:
# Converting to tensor
b = torch.from_numpy(a)
b

tensor([[0.6325, 0.1920, 0.2151],
        [0.4630, 0.2810, 0.6803],
        [0.3967, 0.1252, 0.1194],
        [0.5790, 0.7970, 0.2228]], dtype=torch.float64)

In [16]:
# converting back to numpy
b.numpy()

array([[0.63251587, 0.1920365 , 0.21510379],
       [0.46302438, 0.28097784, 0.68033106],
       [0.39671541, 0.12517758, 0.11944696],
       [0.57902134, 0.79701151, 0.22283755]])
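
The `dtype=torch.float64` in the tensor output above comes from NumPy, whose default float type is 64-bit, while tensors created directly in PyTorch default to 32-bit floats. A minimal sketch of the round trip and of an explicit conversion (the name `b32` is just illustrative):

```python
import numpy as np
import torch

a = np.random.rand(4, 3)

# from_numpy keeps NumPy's dtype (float64 here)
b = torch.from_numpy(a)
print(b.dtype)   # torch.float64

# Converting back gives the same values
back = b.numpy()
print(np.array_equal(a, back))  # True

# .float() returns a float32 copy, matching PyTorch's default dtype
b32 = b.float()
print(b32.dtype)  # torch.float32
```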

In [None]:
# the numpy array and the tensor share the same memory, so an in-place change to one changes both objects
# so, if we multiply 'b' by 2 in-place, 'a' is doubled as well
b.mul_(2)
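
A small self-contained sketch of this memory sharing, using an array of ones so the effect is easy to see. The `clone()` step is an added suggestion for when an independent copy is wanted.

```python
import numpy as np
import torch

a = np.ones((2, 2))
b = torch.from_numpy(a)   # b shares memory with a

b.mul_(2)                 # in-place: doubles b and therefore a
print(a)                  # every entry is now 2.0

# clone() makes an independent copy, so changes no longer propagate
c = torch.from_numpy(a).clone()
c.mul_(2)
print(a)                  # still 2.0
print(c)                  # every entry is 4.0
```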