## 1. Introduction to PyTorch, a Deep Learning Library
### 1.1. Introduction to deep learning with PyTorch

Importing **PyTorch** and related packages:
* torchvision - image data
* torchaudio - audio data 
* torchtext - text data

In [15]:
import torch

**Tensors** are multidimensional arrays and the basic data structure in PyTorch. We can create them from:
* list

In [16]:
# load from list
lst = [[1,2,3], [4,5,6]]
tensor = torch.tensor(lst)

* NumPy array

In [17]:
# load from NumPy array
import numpy as np
np_array = np.array([1,2])
np_tensor = torch.tensor(np_array)
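`torch.tensor()` copies the data of a NumPy array, whereas `torch.from_numpy()` returns a tensor that shares memory with it. A minimal sketch of the difference (the names `copied` and `shared` are illustrative and not part of the original notes):

```python
import numpy as np
import torch

np_array = np.array([1, 2])

copied = torch.tensor(np_array)      # independent copy of the data
shared = torch.from_numpy(np_array)  # view onto the same memory as np_array

# Modify the NumPy array in place after creating both tensors
np_array[0] = 99

print(copied)  # still holds the original values
print(shared)  # reflects the in-place change
```

If the array will be modified later and the tensor should stay unchanged, the copying form is the safer choice.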

**Tensor attributes:**
* shape

In [18]:
tensor.shape

torch.Size([2, 3])

* data type

In [19]:
tensor.dtype

torch.int64

* device (CPU/GPU)

In [20]:
tensor.device

device(type='cpu')

**Tensor operations:**
* addition / subtraction

In [21]:
a = torch.tensor([[1, 1],
                 [2, 2]])
b = torch.tensor([[2, 2],
                 [3, 3]])
c = torch.tensor([[2, 2, 4],
                  [3, 3, 5]])

a + b
a - b
# a + c  -> error! (Incompatible shapes: 2x2 vs. 2x3)

tensor([[-1, -1],
        [-1, -1]])
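Addition and subtraction are element-wise, so the shapes of the operands must be compatible. A small sketch, reusing the values of `a` and `c` from the cell above, of what happens when they are not (the flag `shapes_compatible` is an illustrative name):

```python
import torch

a = torch.tensor([[1, 1],
                  [2, 2]])      # shape (2, 2)
c = torch.tensor([[2, 2, 4],
                  [3, 3, 5]])   # shape (2, 3)

try:
    a + c
    shapes_compatible = True
except RuntimeError as err:
    # The last dimensions differ (2 vs. 3) and neither is 1,
    # so the tensors cannot be broadcast against each other.
    shapes_compatible = False
    print(f"Cannot add shapes {tuple(a.shape)} and {tuple(c.shape)}: {err}")
```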

* element-wise multiplication

In [22]:
a*b

tensor([[2, 2],
        [6, 6]])

* transposition
* matrix multiplication
* concatenation
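The last three operations can be sketched as follows, reusing the values of `a`, `b` and `c` from the addition example. The variable names for the results are illustrative.

```python
import torch

a = torch.tensor([[1, 1],
                  [2, 2]])
b = torch.tensor([[2, 2],
                  [3, 3]])
c = torch.tensor([[2, 2, 4],
                  [3, 3, 5]])

# Transposition: swap rows and columns, (2, 3) -> (3, 2)
c_t = c.T

# Matrix multiplication: inner dimensions must match, (2, 2) @ (2, 3) -> (2, 3)
ab = a @ b
ac = a @ c

# Concatenation: all dimensions except `dim` must match
rows = torch.cat([a, b], dim=0)   # stack along rows    -> (4, 2)
cols = torch.cat([a, c], dim=1)   # stack along columns -> (2, 5)

print(ab)
print(rows.shape, cols.shape)
```

Note that `a @ b` differs from the element-wise product `a * b`.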

### 1.2. Creating our first neural network

**Linear layer operation:**

For input $X$, weights $W_0$ and bias $b_0$, the linear layer performs:
$$
y_0 = W_0 \cdot X + b_0
$$
In PyTorch: `output = W0 @ input + b0` (for inputs stored as rows, `nn.Linear` evaluates this as `input @ W0.T + b0`)
* Weights and biases are initialized randomly
* They are not useful until they are tuned

In [23]:
import torch.nn as nn

# Create input_tensor with three features
input_tensor = torch.tensor([[0.3471, 0.4547, -0.2356]])

# Define first linear layer
linear_layer = nn.Linear(in_features=3, out_features=2)

# Pass input through linear layer
output = linear_layer(input_tensor)
output

tensor([[-0.2671, -0.6894]], grad_fn=<AddmmBackward0>)

* Input dimensions: $1 \times 3$
* Linear layer arguments:
  - `in_features = 3`
  - `out_features = 2`
* Output dimensions: $1 \times 2$
* Networks with only linear layers are called **fully connected**
<center>
<img src = "img/picture1.png", alt="Neural network image", width=200>
</center>
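To make the linear layer operation concrete, the sketch below recomputes the layer's output by hand from its `weight` and `bias` attributes. The seed is an arbitrary assumption added only to make the run repeatable, so the printed values will differ from the output shown earlier.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only for repeatability

input_tensor = torch.tensor([[0.3471, 0.4547, -0.2356]])   # shape (1, 3)
linear_layer = nn.Linear(in_features=3, out_features=2)

W0 = linear_layer.weight   # shape (out_features, in_features) = (2, 3)
b0 = linear_layer.bias     # shape (out_features,) = (2,)

# Manual version of the layer: one row of input times the transposed weights
manual = input_tensor @ W0.T + b0
output = linear_layer(input_tensor)

print(output.shape)                    # one sample, two output features
print(torch.allclose(manual, output))  # both computations agree
```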



**Stacking layers with nn.Sequential()**

In [24]:
model = nn.Sequential(
    nn.Linear(10, 18),
    nn.Linear(18, 20),
    nn.Linear(20, 5)
)

input_tensor = torch.tensor([[-0.0014, 0.4038, 1.0305, 0.7521, 0.7489, -0.3968, 0.0113, -1.3844, 0.8705, -0.9743]])
output_tensor = model(input_tensor)
output_tensor

tensor([[ 0.1028,  0.1490, -0.1487,  0.0694, -0.2521]],
       grad_fn=<AddmmBackward0>)
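When stacking layers, the `out_features` of each layer must equal the `in_features` of the next. A short sketch that traces the tensor shape through the same three layers (the random input and the list `shapes` are illustrative assumptions):

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(10, 18),
    nn.Linear(18, 20),
    nn.Linear(20, 5)
)

x = torch.randn(1, 10)       # one sample with 10 features
shapes = [tuple(x.shape)]

# nn.Sequential is iterable, so each layer can be applied one at a time
for layer in model:
    x = layer(x)
    shapes.append(tuple(x.shape))

print(shapes)  # [(1, 10), (1, 18), (1, 20), (1, 5)]
```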

### 1.3. Discovering activation functions

**Sigmoid function**

*Binary classification* problems: predicting whether something is `True (1)` or `False (0)`

<center>
<img src = "img/picture3.png", alt="NN image", width=400>
</center>

$$
σ(x)=\frac{1}{1+e^{-x}}
$$
<center>
<img src = "img/picture2.png", alt="Sigmoid image", width=800>
</center>


* If the output is $> 0.5$, the class label is $1$ (True)
* If the output is $\leq 0.5$, the class label is $0$ (False)

In [25]:
input_tensor = torch.tensor([[6.0]])
sigmoid = nn.Sigmoid()
output_tensor = sigmoid(input_tensor)
output_tensor

tensor([[0.9975]])
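The formula and the threshold rule can be checked directly. The sketch below compares `nn.Sigmoid` with a manual evaluation of $1/(1+e^{-x})$ and then applies the $0.5$ threshold. The extra input values and the names `manual` and `labels` are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Three example pre-activation values, one per row
logits = torch.tensor([[6.0], [0.3], [-2.0]])

probs = nn.Sigmoid()(logits)

# Manual evaluation of the sigmoid formula
manual = 1 / (1 + torch.exp(-logits))

# Threshold at 0.5 to obtain class labels
labels = (probs > 0.5).int()

print(probs)
print(labels)
```

Large positive inputs map close to 1 and negative inputs map below 0.5, which is what makes the threshold rule work.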

Activation function as the *last layer*

In [26]:
model = nn.Sequential(
    nn.Linear(6, 4), # First linear layer
    nn.Linear(4, 1), # Second linear layer
    nn.Sigmoid()     # Sigmoid activation function
)

**Softmax function**

*Multiclass classification* problems:
* takes an N-element vector as input and outputs a vector of the same size, whose elements are probabilities that sum to 1
<center>
<img src = "img/picture4.png", alt="NN image", width=400>
</center>

$$
σ(x)_i=\frac{e^{x_i}}{\sum_{j=1}^{N}e^{x_j}}
$$


In [27]:
input_tensor = torch.tensor([[4.3, 6.1, 2.3]])

softmax = nn.Softmax(dim=-1) # dim = -1 => softmax is applied to the input tensor's last dimension
output_tensor = softmax(input_tensor)
output_tensor

tensor([[0.1392, 0.8420, 0.0188]])
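As a check on the formula, the sketch below computes softmax by hand for the same input and compares it with `nn.Softmax`. The names `exps`, `manual` and `predicted_class` are illustrative.

```python
import torch
import torch.nn as nn

input_tensor = torch.tensor([[4.3, 6.1, 2.3]])

# Manual softmax: exponentiate, then normalize along the last dimension
exps = torch.exp(input_tensor)
manual = exps / exps.sum(dim=-1, keepdim=True)

output_tensor = nn.Softmax(dim=-1)(input_tensor)

# The predicted class is the index of the highest probability
predicted_class = output_tensor.argmax(dim=-1)

print(manual)
print(output_tensor.sum(dim=-1))   # probabilities sum to 1
print(predicted_class)
```

The highest input value (6.1, at index 1) receives the highest probability, so it is the predicted class.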