<a href="https://colab.research.google.com/github/Existanze54/sirius-neural-networks-2025/blob/main/Seminars/S01_PyTorch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Deep Learning

### Seminar 1: Introduction to the PyTorch Library

In [None]:
import torch

In [None]:
torch.__version__

'2.2.1+cu121'

![PyTorch 2.0](https://data.bioml.ru/htdocs/courses/bioml/neural_networks/pytorch/img/pytorch_2.png)

**Who has worked with NumPy?**

In [None]:
import numpy as np

I recommend brushing up on the basics by reading the [lecture from the previous course](https://colab.research.google.com/drive/1UZPtzIyxCTlKjyeyrrHj2zt8Z3qwWEfc) on NumPy.

**PyTorch** is one of the most popular libraries today for building and training neural networks.

There is an alternative: ~ducks~ TensorFlow + Keras, but we will not cover them in this course.

## Tensors and Vectorized Computation

Like NumPy, PyTorch implements a multidimensional array class. It is called `Tensor` here, but it should not be confused with tensors in physics: it is really an array with extra functionality.

For the curious: there used to be two classes, `Tensor` and `Variable`, but `Variable` was eventually deprecated (the two were merged in PyTorch 0.4) and only `Tensor` remains...

<font color="#bbbbbb">_As for how `Variable` differed from `Tensor`, Dima Penzar will tell you..._</font>


Let's create a tensor! There are several ways to do this:

In [None]:
x = torch.tensor([[0.2, 0.5, 0.8],
                  [1.3, 1.7, 8.0]])
x

tensor([[0.2000, 0.5000, 0.8000],
        [1.3000, 1.7000, 8.0000]])

You can generate pseudorandom data:

In [None]:
x = torch.rand(size=(2, 3))
x

tensor([[0.5421, 0.0563, 0.8661],
        [0.4258, 0.8212, 0.2117]])

You can also just allocate memory without initializing it, so the tensor holds whatever values were already there:

In [None]:
torch.empty((2, 3))

tensor([[2.6048e+05, 4.5576e-41, 2.6048e+05],
        [4.5576e-41, 7.4937e+31, 1.7753e+28]])
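
When the contents need to be predictable, one of the initializing constructors may be a better starting point than `torch.empty`. The sketch below is only an illustration; the shapes and the fill value are arbitrary choices, not something taken from the seminar.

```python
import torch

# torch.empty only reserves memory, so its contents are unpredictable.
# The constructors below set every element explicitly.
zeros = torch.zeros((2, 3))        # every element is 0.0
ones = torch.ones((2, 3))          # every element is 1.0
sevens = torch.full((2, 3), 7.0)   # every element is 7.0 (arbitrary fill value)

print(zeros)
print(ones)
print(sevens)
```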

Or you can load data from NumPy:

In [None]:
x = np.random.normal(size=(2, 3)).astype(np.float32)
y = torch.from_numpy(x)

In [None]:
y

tensor([[-0.0243, -0.9843, -1.5323],
        [-1.5652,  0.4349,  0.0736]])

Note that `x` and `y` share the same block of memory:

In [None]:
y[1, 1] = -10
x

array([[ -0.02431436,  -0.9843401 ,  -1.5322767 ],
       [ -1.5652288 , -10.        ,   0.07359219]], dtype=float32)

In [None]:
x[1, 1] = 5
y

tensor([[-0.0243, -0.9843, -1.5323],
        [-1.5652,  5.0000,  0.0736]])
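
If shared memory is not what you want, `torch.tensor` makes a copy instead. A minimal sketch contrasting the two; the variable names `arr`, `shared`, `copied`, and `back` are made up for this illustration.

```python
import numpy as np
import torch

arr = np.zeros((2, 3), dtype=np.float32)

shared = torch.from_numpy(arr)   # shares memory with arr
copied = torch.tensor(arr)       # copies the data into a new tensor

arr[0, 0] = 1.0                  # modify the NumPy array in place

print(shared[0, 0].item())       # 1.0: the change is visible through the shared tensor
print(copied[0, 0].item())       # 0.0: the copy is unaffected

back = shared.numpy()            # for CPU tensors this also shares memory
back[1, 1] = 2.0
print(arr[1, 1])                 # 2.0
```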

### Tensor Properties

In [None]:
dir(y)  # y is the tensor; x is still the NumPy array at this point

['H',
 'T',
 '__abs__',
 '__add__',
 '__and__',
 '__array__',
 '__array_priority__',
 '__array_wrap__',
 '__bool__',
 '__class__',
 '__complex__',
 '__contains__',
 '__deepcopy__',
 '__delattr__',
 '__delitem__',
 '__dict__',
 '__dir__',
 '__div__',
 '__dlpack__',
 '__dlpack_device__',
 '__doc__',
 '__eq__',
 '__float__',
 '__floordiv__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__gt__',
 '__hash__',
 '__iadd__',
 '__iand__',
 '__idiv__',
 '__ifloordiv__',
 '__ilshift__',
 '__imod__',
 '__imul__',
 '__index__',
 '__init__',
 '__init_subclass__',
 '__int__',
 '__invert__',
 '__ior__',
 '__ipow__',
 '__irshift__',
 '__isub__',
 '__iter__',
 '__itruediv__',
 '__ixor__',
 '__le__',
 '__len__',
 '__long__',
 '__lshift__',
 '__lt__',
 '__matmul__',
 '__mod__',
 '__module__',
 '__mul__',
 '__ne__',
 '__neg__',
 '__new__',
 '__nonzero__',
 '__or__',
 '__pos__',
 '__pow__',
 '__radd__',
 '__rand__',
 '__rdiv__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__reversed__',
 ...

In [None]:
y.dtype

torch.float32

In [None]:
y.shape

torch.Size([2, 3])

In [None]:
y.ndim

2

In [None]:
y.device

device(type='cpu')

### Basic Tensor Operations

#### Type Conversions

In [None]:
x = torch.tensor([1, 2, 3])
x.dtype

torch.int64

In [None]:
y = torch.tensor([0.4, 1.5, 2.6])
y.dtype

torch.float32

In [None]:
xf = x.float()
xf.dtype

torch.float32

In [None]:
yf = y.int()
yf.dtype

torch.int32

In [None]:
yf

tensor([0, 1, 2], dtype=torch.int32)
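
Note that `.int()` drops the fractional part rather than rounding. The sketch below shows the difference on a few arbitrarily chosen values, and also the more general `.to(dtype)` form.

```python
import torch

t = torch.tensor([-1.7, -0.2, 0.4, 2.6])

truncated = t.int()               # the cast truncates toward zero
rounded = t.round().int()         # round to the nearest integer first, then cast
as_double = t.to(torch.float64)   # .to() accepts any target dtype

print(truncated.tolist())         # [-1, 0, 0, 2]
print(rounded.tolist())           # [-2, 0, 0, 3]
print(as_double.dtype)            # torch.float64
```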

#### Binary Operations

In [None]:
x = torch.tensor([[0, 1, 2],
                  [3, 4, 5]])
x

tensor([[0, 1, 2],
        [3, 4, 5]])

In [None]:
y = torch.tensor([[20, 10,  5],
                  [40, 30, 25]])
y

tensor([[20, 10,  5],
        [40, 30, 25]])

In [None]:
z = torch.tensor([[3],
                  [7]])
z

tensor([[3],
        [7]])

In [None]:
x + y

tensor([[20, 11,  7],
        [43, 34, 30]])

In [None]:
x + z

tensor([[ 3,  4,  5],
        [10, 11, 12]])

#### Elementary Functions

In [None]:
x

tensor([[0, 1, 2],
        [3, 4, 5]])

In [None]:
x.log()

tensor([[  -inf, 0.0000, 0.6931],
        [1.0986, 1.3863, 1.6094]])

In [None]:
x.log1p()

tensor([[0.0000, 0.6931, 1.0986],
        [1.3863, 1.6094, 1.7918]])

In [None]:
x.exp()

tensor([[  1.0000,   2.7183,   7.3891],
        [ 20.0855,  54.5981, 148.4132]])

In [None]:
x.sin()

tensor([[ 0.0000,  0.8415,  0.9093],
        [ 0.1411, -0.7568, -0.9589]])

In [None]:
x.cos()

tensor([[ 1.0000,  0.5403, -0.4161],
        [-0.9900, -0.6536,  0.2837]])

In [None]:
x.log1p().exp().log1p().exp()

tensor([[2.0000, 3.0000, 4.0000],
        [5.0000, 6.0000, 7.0000]])

#### Aggregation Operations

In [None]:
x

tensor([[0, 1, 2],
        [3, 4, 5]])

In [None]:
x.mean()  # type conversion must be explicit (sometimes)!

RuntimeError: ignored

In [None]:
x.float().mean()  # why is it Tensor?

tensor(2.5000)

In [None]:
x = x.float()

In [None]:
x.mean(axis=1)

tensor([1., 4.])

In [None]:
x.max()

tensor(5.)

In [None]:
x.argmax(axis=None)

tensor(5)

In [None]:
x.argmax(axis=0)

tensor([1, 1, 1])
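
Aggregations return tensors even when the result is a single number: a reduction over all elements yields a 0-dimensional tensor, and `.item()` extracts a plain Python value from it. The sketch below also shows `keepdim=True` and the pair returned by `max` along a dimension; the input is rebuilt here so the example is self-contained.

```python
import torch

x = torch.arange(6.0).reshape(2, 3)      # [[0., 1., 2.], [3., 4., 5.]]

total_mean = x.mean()                    # 0-dimensional tensor
print(total_mean.ndim)                   # 0
print(total_mean.item())                 # 2.5 as a plain Python float

row_mean = x.mean(dim=1, keepdim=True)   # shape (2, 1) instead of (2,)
print(row_mean.shape)                    # torch.Size([2, 1])
print((x - row_mean).tolist())           # [[-1.0, 0.0, 1.0], [-1.0, 0.0, 1.0]]

values, indices = x.max(dim=1)           # per-row maxima and their positions
print(values.tolist())                   # [2.0, 5.0]
print(indices.tolist())                  # [2, 2]
```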

### Manipulating Array Shape

#### Indexing and Filtering

In [None]:
x = torch.normal(1, 1.0, size=(6, 4))
x

tensor([[-0.4974,  1.1712,  1.7070,  2.2499],
        [ 1.2007,  1.3238, -0.9118,  0.3934],
        [ 0.4878,  3.1433,  0.2791,  0.1968],
        [ 1.6048,  2.7025,  2.2585,  1.2531],
        [ 1.9028,  1.8974,  1.4985,  2.2564],
        [ 1.1477,  1.9754,  2.5848,  2.3228]])

In [None]:
r = x[1,:]
r

tensor([ 1.2007,  1.3238, -0.9118,  0.3934])

In [None]:
x[1,1]

tensor(1.3238)

In [None]:
c = x[:,1]
c

tensor([1.1712, 1.3238, 3.1433, 2.7025, 1.8974, 1.9754])

In [None]:
x_m = x.mean(axis=0)
x_m

tensor([0.9744, 2.0356, 1.2360, 1.4454])

In [None]:
x_m > 1.4

tensor([False,  True, False,  True])

In [None]:
r[x_m > 1.4]

tensor([1.3238, 0.3934])

In [None]:
x[x_m > 1.4]  # won't work!

IndexError: ignored

In [None]:
x[:,(x_m > 1.4)]

tensor([[1.1712, 2.2499],
        [1.3238, 0.3934],
        [3.1433, 0.1968],
        [2.7025, 1.2531],
        [1.8974, 2.2564],
        [1.9754, 2.3228]])
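
A boolean mask has to be as long as the dimension it indexes. That is why a mask with 4 entries fails on the first dimension (size 6) but works on the second. A small sketch with deterministic data instead of random values; the thresholds are arbitrary.

```python
import torch

x = torch.arange(24.0).reshape(6, 4)

col_mask = x.mean(dim=0) > 11.5    # one boolean per column, shape (4,)
print(col_mask.tolist())           # [False, False, True, True]
selected_cols = x[:, col_mask]     # mask applied along dim 1
print(selected_cols.shape)         # torch.Size([6, 2])

row_mask = x.mean(dim=1) > 10.0    # one boolean per row, shape (6,)
selected_rows = x[row_mask]        # mask applied along dim 0
print(selected_rows.shape)         # torch.Size([3, 4])
```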

In [None]:
x[1:3,0:3]

tensor([[ 1.2007,  1.3238, -0.9118],
        [ 0.4878,  3.1433,  0.2791]])

In [None]:
x[[1, 2],[0, 2]]

tensor([1.2007, 0.2791])
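
Two index lists are paired element by element, so the result above contains `x[1, 0]` and `x[2, 2]`, not a 2×2 block. If the block is what you need, one possible approach is sketched below on deterministic data.

```python
import torch

x = torch.arange(24.0).reshape(6, 4)
rows = [1, 2]
cols = [0, 2]

pairs = x[rows, cols]              # elements (1, 0) and (2, 2)
print(pairs.tolist())              # [4.0, 10.0]

block = x[rows][:, cols]           # rows 1 and 2 first, then columns 0 and 2
print(block.tolist())              # [[4.0, 6.0], [8.0, 10.0]]

# Same block in one step: a (2, 1) row index broadcasts against a (2,) column index.
block_bc = x[torch.tensor(rows)[:, None], torch.tensor(cols)]
print(torch.equal(block, block_bc))  # True
```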

#### Reshaping

In [None]:
x

tensor([[-0.4974,  1.1712,  1.7070,  2.2499],
        [ 1.2007,  1.3238, -0.9118,  0.3934],
        [ 0.4878,  3.1433,  0.2791,  0.1968],
        [ 1.6048,  2.7025,  2.2585,  1.2531],
        [ 1.9028,  1.8974,  1.4985,  2.2564],
        [ 1.1477,  1.9754,  2.5848,  2.3228]])

In [None]:
x.shape

torch.Size([6, 4])

In [None]:
x.reshape(shape=(8, 3))

tensor([[-0.4974,  1.1712,  1.7070],
        [ 2.2499,  1.2007,  1.3238],
        [-0.9118,  0.3934,  0.4878],
        [ 3.1433,  0.2791,  0.1968],
        [ 1.6048,  2.7025,  2.2585],
        [ 1.2531,  1.9028,  1.8974],
        [ 1.4985,  2.2564,  1.1477],
        [ 1.9754,  2.5848,  2.3228]])

In [None]:
x = torch.arange(0.0, 24.0).reshape(shape=(6, 4))
x

tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.],
        [12., 13., 14., 15.],
        [16., 17., 18., 19.],
        [20., 21., 22., 23.]])

In [None]:
x.reshape(shape=(4, 6))

tensor([[ 0.,  1.,  2.,  3.,  4.,  5.],
        [ 6.,  7.,  8.,  9., 10., 11.],
        [12., 13., 14., 15., 16., 17.],
        [18., 19., 20., 21., 22., 23.]])

In [None]:
x.reshape(shape=24)  # won't work!

TypeError: ignored

In [None]:
x.reshape(shape=(24))  # what's wrong??

TypeError: ignored

In [None]:
x.reshape(shape=(24,))

tensor([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12., 13.,
        14., 15., 16., 17., 18., 19., 20., 21., 22., 23.])
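
The reason `shape=(24)` fails is plain Python syntax: parentheses alone do not create a tuple, the trailing comma does. The sketch below shows this, along with a few equivalent ways to flatten that avoid the issue.

```python
import torch

print(type((24)))            # <class 'int'>: just the number 24 in parentheses
print(type((24,)))           # <class 'tuple'>: the trailing comma makes a tuple

x = torch.arange(24.0).reshape(6, 4)

flat_a = x.reshape((24,))    # explicit one-element tuple
flat_b = x.reshape(-1)       # -1 asks PyTorch to infer the size
flat_c = x.flatten()         # dedicated method giving the same result
auto = x.reshape(3, -1)      # 24 / 3 = 8 columns are inferred

print(flat_a.shape, flat_b.shape, flat_c.shape)  # all torch.Size([24])
print(auto.shape)                                # torch.Size([3, 8])
```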

In [None]:
x.T

tensor([[ 0.,  4.,  8., 12., 16., 20.],
        [ 1.,  5.,  9., 13., 17., 21.],
        [ 2.,  6., 10., 14., 18., 22.],
        [ 3.,  7., 11., 15., 19., 23.]])

In [None]:
x.permute(1, 0)

tensor([[ 0.,  4.,  8., 12., 16., 20.],
        [ 1.,  5.,  9., 13., 17., 21.],
        [ 2.,  6., 10., 14., 18., 22.],
        [ 3.,  7., 11., 15., 19., 23.]])

In [None]:
x = torch.arange(0.0, 30.0).reshape(shape=(5, 3, 2))
x

tensor([[[ 0.,  1.],
         [ 2.,  3.],
         [ 4.,  5.]],

        [[ 6.,  7.],
         [ 8.,  9.],
         [10., 11.]],

        [[12., 13.],
         [14., 15.],
         [16., 17.]],

        [[18., 19.],
         [20., 21.],
         [22., 23.]],

        [[24., 25.],
         [26., 27.],
         [28., 29.]]])

In [None]:
x.permute(0, 2, 1)

tensor([[[ 0.,  2.,  4.],
         [ 1.,  3.,  5.]],

        [[ 6.,  8., 10.],
         [ 7.,  9., 11.]],

        [[12., 14., 16.],
         [13., 15., 17.]],

        [[18., 20., 22.],
         [19., 21., 23.]],

        [[24., 26., 28.],
         [25., 27., 29.]]])
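
Unlike `reshape`, `permute` reorders the axes themselves: element `[i, j, k]` of the result is element `[i, k, j]` of the original. It returns a view of the same memory rather than a copy. A short sketch of what that implies:

```python
import torch

x = torch.arange(30.0).reshape(5, 3, 2)
p = x.permute(0, 2, 1)

print(p.shape)                          # torch.Size([5, 2, 3])
print((p[1, 0, 2] == x[1, 2, 0]).item())  # True: the last two axes are swapped
print(p.is_contiguous())                # False: only the strides changed

p[0, 0, 0] = -1.0                       # writing through the view changes x too
print(x[0, 0, 0].item())                # -1.0

c = p.contiguous()                      # materializes an independent copy
c[0, 0, 0] = 100.0
print(x[0, 0, 0].item())                # still -1.0
```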

#### Broadcasting

In [None]:
a = torch.arange(6.).reshape((2, 3))
a

tensor([[0., 1., 2.],
        [3., 4., 5.]])

In [None]:
b = torch.arange(2.)
b

tensor([0., 1.])

In [None]:
a + b

RuntimeError: ignored

In [None]:
b.reshape((1, 2))

tensor([[0., 1.]])

In [None]:
a + b.reshape((1, 2))

RuntimeError: ignored

In [None]:
b.reshape((2, 1))

tensor([[0.],
        [1.]])

In [None]:
a + b.reshape((2, 1))

tensor([[0., 1., 2.],
        [4., 5., 6.]])

In [None]:
a[:,None,:]

tensor([[[0., 1., 2.]],

        [[3., 4., 5.]]])

There is a **broadcasting rule**:

1. Suppose `a.shape = (a_1, a_2, ..., a_n)` and `b.shape = (b_1, b_2, ..., b_n)`. An elementwise binary operation can be applied to `a` and `b` if, $\forall \; i \in \{1, \dots, n\}$, at least one of the following holds:
    * `a_i == b_i`;
    * `a_i == 1`;
    * `b_i == 1`.


2. If the numbers of dimensions differ, **_leading_ dummy dimensions** of size 1 are prepended to the array with fewer dimensions.
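
The two rules above can be sketched in plain Python. The helper `broadcast_shape` is a hypothetical name written for this illustration and is not part of PyTorch; the library's own counterpart is `torch.broadcast_shapes`.

```python
def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError if incompatible."""
    n = max(len(shape_a), len(shape_b))
    # Rule 2: pad the shorter shape with leading dimensions of size 1.
    a = (1,) * (n - len(shape_a)) + tuple(shape_a)
    b = (1,) * (n - len(shape_b)) + tuple(shape_b)
    result = []
    for a_i, b_i in zip(a, b):
        # Rule 1: the sizes must match, or one of them must be 1.
        if a_i == b_i or b_i == 1:
            result.append(a_i)
        elif a_i == 1:
            result.append(b_i)
        else:
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
    return tuple(result)


print(broadcast_shape((2, 3), (2, 1)))     # (2, 3)
print(broadcast_shape((2, 3), (3,)))       # (2, 3)
print(broadcast_shape((5, 1, 4), (3, 1)))  # (5, 3, 4)

try:
    broadcast_shape((2, 3), (2,))          # (2,) is padded to (1, 2), and 3 != 2
except ValueError as err:
    print(err)
```

This matches the cells above: `a + b` fails for shapes `(2, 3)` and `(2,)`, while reshaping `b` to `(2, 1)` makes the shapes compatible.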

Documentation:
* https://pytorch.org/docs/stable/notes/broadcasting.html
* https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html (the same rules in NumPy)


In [None]:
lonng = torch.arange(20)[None,:]  # -> (1, 20)
lonng

tensor([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
         18, 19]])

In [None]:
lonng_multiplied = lonng.broadcast_to(10, 20)
lonng_multiplied

tensor([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
         18, 19],
        [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
         18, 19],
        [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
         18, 19],
        [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
         18, 19],
        [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
         18, 19],
        [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
         18, 19],
        [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
         18, 19],
        [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
         18, 19],
        [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
         18, 19],
        [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
         18, 19]])

In [None]:
lonng = torch.arange(20)[:,None]  # -> (20, 1)
lonng

tensor([[ 0],
        [ 1],
        [ 2],
        [ 3],
        [ 4],
        [ 5],
        [ 6],
        [ 7],
        [ 8],
        [ 9],
        [10],
        [11],
        [12],
        [13],
        [14],
        [15],
        [16],
        [17],
        [18],
        [19]])

In [None]:
lonng = torch.arange(20)[:,None]
lonng.broadcast_to(20, 10)

tensor([[ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
        [ 1,  1,  1,  1,  1,  1,  1,  1,  1,  1],
        [ 2,  2,  2,  2,  2,  2,  2,  2,  2,  2],
        [ 3,  3,  3,  3,  3,  3,  3,  3,  3,  3],
        [ 4,  4,  4,  4,  4,  4,  4,  4,  4,  4],
        [ 5,  5,  5,  5,  5,  5,  5,  5,  5,  5],
        [ 6,  6,  6,  6,  6,  6,  6,  6,  6,  6],
        [ 7,  7,  7,  7,  7,  7,  7,  7,  7,  7],
        [ 8,  8,  8,  8,  8,  8,  8,  8,  8,  8],
        [ 9,  9,  9,  9,  9,  9,  9,  9,  9,  9],
        [10, 10, 10, 10, 10, 10, 10, 10, 10, 10],
        [11, 11, 11, 11, 11, 11, 11, 11, 11, 11],
        [12, 12, 12, 12, 12, 12, 12, 12, 12, 12],
        [13, 13, 13, 13, 13, 13, 13, 13, 13, 13],
        [14, 14, 14, 14, 14, 14, 14, 14, 14, 14],
        [15, 15, 15, 15, 15, 15, 15, 15, 15, 15],
        [16, 16, 16, 16, 16, 16, 16, 16, 16, 16],
        [17, 17, 17, 17, 17, 17, 17, 17, 17, 17],
        [18, 18, 18, 18, 18, 18, 18, 18, 18, 18],
        [19, 19, 19, 19, 19, 19, 19, 19, 19, 19]])
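
`broadcast_to` does not copy data: the result is a view in which the repeated dimension has stride 0. A minimal sketch of how one might check this; the names `col`, `wide`, and `materialized` are illustrative.

```python
import torch

col = torch.arange(20)[:, None]            # shape (20, 1)
wide = col.broadcast_to((20, 10))          # shape (20, 10), no data copied

print(wide.shape)                          # torch.Size([20, 10])
print(wide.stride()[1])                    # 0: moving along dim 1 stays on the same element
print(wide.data_ptr() == col.data_ptr())   # True: same underlying memory

materialized = wide.contiguous()           # a real (20, 10) copy with its own memory
print(materialized.stride())               # (10, 1)
```

Because every row of `wide` aliases a single stored element, it is usually safer to call `.contiguous()` or `.clone()` before writing to such a tensor in place.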

#### Concatenating Arrays

In [None]:
a = torch.arange(10).reshape((2, 5))
a

tensor([[0, 1, 2, 3, 4],
        [5, 6, 7, 8, 9]])

In [None]:
b = torch.arange(20).reshape((4, 5))
b

tensor([[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14],
        [15, 16, 17, 18, 19]])

In [None]:
c = torch.arange(12).reshape((2, 6))
c

tensor([[ 0,  1,  2,  3,  4,  5],
        [ 6,  7,  8,  9, 10, 11]])

In [None]:
torch.concat([a, a, a, a])

tensor([[0, 1, 2, 3, 4],
        [5, 6, 7, 8, 9],
        [0, 1, 2, 3, 4],
        [5, 6, 7, 8, 9],
        [0, 1, 2, 3, 4],
        [5, 6, 7, 8, 9],
        [0, 1, 2, 3, 4],
        [5, 6, 7, 8, 9]])

In [None]:
torch.concat([a, b])

tensor([[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14],
        [15, 16, 17, 18, 19]])

In [None]:
torch.concat([a, c])

RuntimeError: ignored

In [None]:
torch.concat([a, c], axis=1)

tensor([[ 0,  1,  2,  3,  4,  0,  1,  2,  3,  4,  5],
        [ 5,  6,  7,  8,  9,  6,  7,  8,  9, 10, 11]])

In [None]:
torch.stack([a, a, a])

tensor([[[0, 1, 2, 3, 4],
         [5, 6, 7, 8, 9]],

        [[0, 1, 2, 3, 4],
         [5, 6, 7, 8, 9]],

        [[0, 1, 2, 3, 4],
         [5, 6, 7, 8, 9]]])

In [None]:
torch.stack([a, b])

RuntimeError: ignored
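
To summarize the difference: `torch.concat` joins tensors along an existing dimension, so all other dimensions must match, while `torch.stack` creates a new dimension, so all shapes must be identical. The sketch below compares the resulting shapes; the choice of `dim` values is just for illustration.

```python
import torch

a = torch.arange(10).reshape(2, 5)

cat0 = torch.concat([a, a, a], dim=0)   # existing dim 0 grows: (6, 5)
cat1 = torch.concat([a, a, a], dim=1)   # existing dim 1 grows: (2, 15)
st0 = torch.stack([a, a, a], dim=0)     # new leading dim: (3, 2, 5)
st1 = torch.stack([a, a, a], dim=1)     # new dim at position 1: (2, 3, 5)

print(cat0.shape, cat1.shape)
print(st0.shape, st1.shape)

# stack behaves like adding a new axis to each tensor and concatenating along it
via_concat = torch.concat([t[None] for t in (a, a, a)], dim=0)
print(torch.equal(st0, via_concat))     # True
```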