# Torch

PyTorch is a library that provides convenient and flexible interfaces for building neural networks.

For installation instructions, see [this page](https://pytorch.org/get-started/locally/).

In [1]:
import torch

## Tensor

A tensor is a generalisation of a matrix to an arbitrary number of dimensions. It is the basic entity that torch operates on. Find out more on the [specific page](torch/tensor.ipynb).

---

The following example demonstrates how to create a three-dimensional tensor. Each element is written as $\left[ijk\right]$, where $i$ is the layer index (the first, outermost dimension), $j$ is the row index, and $k$ is the column index.

In [5]:
torch.tensor([
    [
        [111,112,113,114],
        [121,122,123,124],
        [131,132,133,134]
    ],
    [
        [211,212,213,214],
        [221,222,223,224],
        [231,232,233,234]
    ],
])

tensor([[[111, 112, 113, 114],
         [121, 122, 123, 124],
         [131, 132, 133, 134]],

        [[211, 212, 213, 214],
         [221, 222, 223, 224],
         [231, 232, 233, 234]]])
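
To make the $\left[ijk\right]$ convention concrete, the following sketch builds a tensor with the same pattern and reads it back by index. The variable names `i`, `j`, `k`, and `t` are illustrative, and the tensor is constructed with broadcasting rather than typed by hand. Note that indices in code are zero-based, so `t[0, 1, 2]` is the first layer, second row, third column.

```python
import torch

# Layer index i contributes the hundreds, row index j the tens, column index k the units.
i = torch.arange(1, 3).view(2, 1, 1)  # 2 layers
j = torch.arange(1, 4).view(1, 3, 1)  # 3 rows
k = torch.arange(1, 5).view(1, 1, 4)  # 4 columns
t = 100 * i + 10 * j + k              # broadcasts to shape (2, 3, 4)

print(t.shape)             # torch.Size([2, 3, 4])
print(t[0, 1, 2].item())   # 123: first layer, second row, third column
print(t[1].shape)          # torch.Size([3, 4]): one whole layer
```

The order of the sizes in `t.shape` matches the order of the indices: layers first, then rows, then columns.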

## Gradient

A key feature of PyTorch that sets it apart from NumPy is its ability to automatically compute gradients for tensors involved in computations. You just need to call the `backward` method on the result of your computations. The tensors that were created with `requires_grad=True` and participated in these computations will then have a `grad` attribute containing the gradients. Find out more on the [relevant page](torch/differentiation.ipynb).

---

As an example, consider the function:

$$f(\overline{X})=\sum_i x_i^2, \quad \overline{X} = (x_1, x_2, x_3)$$

Suppose we want to compute the gradient of $f$ with respect to $\overline{X}$ at the point $(1,2,3)$:

$$\nabla f=(2x_1, 2x_2, 2x_3) \Rightarrow \nabla f(1,2,3)=(2,4,6)$$

Now repeat the same computation with PyTorch.

In [8]:
X = torch.tensor([1,2,3], dtype=torch.float, requires_grad=True)
res = (X**2).sum()
res.backward()
X.grad

tensor([2., 4., 6.])
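
As a sanity check, the gradient computed by autograd can be compared with the analytical formula $\nabla f = 2\overline{X}$ at another point. This is a minimal sketch: the evaluation point and the name `analytical` are chosen only for illustration.

```python
import torch

# A different evaluation point, chosen only for illustration.
X = torch.tensor([0.5, -1.0, 4.0], requires_grad=True)

res = (X ** 2).sum()
res.backward()  # fills X.grad with the gradient of res with respect to X

# Analytical gradient 2 * x_i; detach() keeps this computation out of the autograd graph.
analytical = 2 * X.detach()

print(X.grad)                              # gradient values 1, -2, 8
print(torch.allclose(X.grad, analytical))  # True
```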

## Functions

Torch implements many functions commonly used in neural networks. They are defined in the `torch.nn.functional` module, which is conventionally imported under the alias `F`.

In [11]:
import torch.nn.functional as F
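
As a brief illustration of the `F` alias, the sketch below applies two functions from this module, `F.relu` and `F.softmax`, to a small tensor. The input values are arbitrary.

```python
import torch
import torch.nn.functional as F

x = torch.tensor([-1.0, 0.0, 2.0])

# ReLU zeroes out negative entries and keeps the rest unchanged.
relu_out = F.relu(x)

# Softmax turns the values into probabilities along the given dimension.
probs = F.softmax(x, dim=0)

print(relu_out)     # values 0, 0, 2
print(probs.sum())  # sums to 1 up to floating-point error
```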

### Loss functions

Torch implements common loss functions. The following table shows some of them:

| Loss Function                         | Description                              |
|--------------------------------------|------------------------------------------|
| `torch.nn.functional.binary_cross_entropy` | Binary Cross Entropy                     |
| `torch.nn.functional.binary_cross_entropy_with_logits` | Binary Cross Entropy with Logits        |
| `torch.nn.functional.cross_entropy`       | Cross Entropy Loss                       |
| `torch.nn.functional.hinge_embedding_loss` | Hinge Embedding Loss                     |
| `torch.nn.functional.kl_div`              | Kullback-Leibler Divergence Loss         |
| `torch.nn.functional.l1_loss`             | Mean Absolute Error Loss                |
| `torch.nn.functional.mse_loss`            | Mean Squared Error Loss                  |
| `torch.nn.functional.margin_ranking_loss` | Margin Ranking Loss                      |
| `torch.nn.functional.multi_label_margin_loss` | Multi-Label Margin Loss                |
| `torch.nn.functional.multi_label_soft_margin_loss` | Multi-Label Soft Margin Loss           |
| `torch.nn.functional.smooth_l1_loss`      | Smooth L1 Loss                           |
| `torch.nn.functional.triplet_margin_loss` | Triplet Margin Loss                      |
| `torch.nn.functional.nll_loss`            | Negative Log Likelihood Loss            |
| `torch.nn.functional.cosine_embedding_loss` | Cosine Embedding Loss                   |


---

The following cell shows how to apply `mse_loss`.

In [17]:
F.mse_loss(
    torch.tensor([1,2,3], dtype=torch.float),
    torch.tensor([2,3,4], dtype=torch.float)
)

tensor(1.)
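
Other losses from the table are called in the same way. The sketch below, with illustrative inputs, applies `l1_loss` to the same tensors and checks that `binary_cross_entropy_with_logits` on raw scores agrees with `binary_cross_entropy` applied after a sigmoid.

```python
import torch
import torch.nn.functional as F

pred = torch.tensor([1.0, 2.0, 3.0])
target = torch.tensor([2.0, 3.0, 4.0])

# Mean absolute error: every |pred - target| equals 1, so the mean is 1.
mae = F.l1_loss(pred, target)

# Binary cross entropy on raw scores (logits) versus on probabilities.
logits = torch.tensor([0.5, -1.0, 2.0])  # illustrative values
labels = torch.tensor([1.0, 0.0, 1.0])
bce_logits = F.binary_cross_entropy_with_logits(logits, labels)
bce_probs = F.binary_cross_entropy(torch.sigmoid(logits), labels)

print(mae.item())                             # 1.0
print(torch.allclose(bce_logits, bce_probs))  # True
```

The logits variant is generally preferred in practice because it combines the sigmoid and the loss in a numerically more stable way.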

#### Reduction

The `reduction` parameter specifies how the per-element results of the loss function are aggregated. It accepts three values: `none` (return the per-element results without aggregation), `mean` (the default, average them), and `sum` (add them up).

---

The following cell demonstrates how different types of reduction are applied to the same inputs:

In [25]:
tens1 = torch.tensor([1,2,3], dtype=torch.float)
tens2 = torch.tensor([2,3,4], dtype=torch.float)

for reduction in ["mean", "sum", "none"]:
    res = F.mse_loss(tens1, tens2, reduction=reduction)
    print(f"reduction - {reduction}, res={res}")

reduction - mean, res=1.0
reduction - sum, res=3.0
reduction - none, res=tensor([1., 1., 1.])
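
The relationship between the three modes can also be checked directly: `sum` and `mean` should agree with aggregating the `none` result by hand. This is a small sketch with inputs chosen so that the per-element errors differ.

```python
import torch
import torch.nn.functional as F

pred = torch.tensor([1.0, 2.0, 3.0])
target = torch.tensor([2.0, 4.0, 6.0])

# Per-element squared errors: 1, 4, 9.
per_element = F.mse_loss(pred, target, reduction="none")

total = F.mse_loss(pred, target, reduction="sum")
average = F.mse_loss(pred, target, reduction="mean")

# "sum" and "mean" match manual aggregation of the per-element result.
print(torch.allclose(total, per_element.sum()))     # True
print(torch.allclose(average, per_element.mean()))  # True
print(total.item())                                 # 14.0
```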
