# Getting Started with PyTorch

Torch is like NumPy, but better: it can compute derivatives automatically (automatic differentiation, which powers backpropagation) and works seamlessly with accelerators such as:

- GPUs
- TPUs

It also has multiple solutions for putting models into production, such as:
- TorchMobile
- TorchServe
- TorchTensorRT

It also has multiple extensions for different applications, such as:
- TorchVision
- TorchAudio
- Transformers
- and many others

Torch is one of the most widely used neural network toolkits.

In this notebook we will learn how to work with it at a very basic level, much as we would with NumPy.
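To make the automatic differentiation mentioned above concrete, here is a minimal sketch. The function `y = x^2 + 2x` and the input value `3.0` are arbitrary choices for illustration, not part of the notebook's tasks.

```python
import torch

# A scalar input that autograd should track.
x = torch.tensor(3.0, requires_grad=True)

# Build a small computation: y = x^2 + 2x.
y = x ** 2 + 2 * x

# Backpropagate: fills x.grad with dy/dx evaluated at x = 3.
y.backward()

print(x.grad)  # dy/dx = 2x + 2, which is 8 at x = 3
```

Nothing here is symbolic: Torch records the operations applied to `x` and replays them backwards to obtain the numeric gradient.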

In [1]:
%pip install "numpy<2.0" lovely_tensors


import json_tricks
import torch
import lovely_tensors


lovely_tensors.monkey_patch()

answer = {}


# Task 1: Creating a tensor

Create the following tensors:
- `X_zeros`, tensor of zeros of shape `[3, 4, 5]` (see `torch.zeros`)
- `X_ones`, tensor of ones of the same shape (see `torch.ones`)
- `X_custom`, tensor with all the numbers from 1 to 12 in 3 rows and 4 columns, enumerated row-by-row (any approach will work)
- `X_random`, tensor with normally distributed random values (see `torch.randn`)

In [20]:
import torch
X_zeros = (
    torch.zeros(3,4,5)
)
X_ones = (
    torch.ones(3,4,5)
)

X_custom = (
    torch.arange(1, 13, step=1.0).reshape(3, 4)
)

X_random = (
    # torch.randn samples from a standard normal distribution;
    # torch.rand would sample uniformly from [0, 1) instead.
    torch.randn(3, 4, 5)
)

answer['zeros'] = X_zeros.clone().numpy()
answer['ones'] = X_ones.clone().numpy()
answer['custom'] = X_custom.clone().numpy()

print('X_zeros', X_zeros)
print('X_ones', X_ones)
print('X_custom', X_custom)
print('X_random', X_random)

X_custom.numpy()


X_zeros tensor[3, 4, 5] n=60 all_zeros
X_ones tensor[3, 4, 5] n=60 x∈[1.000, 1.000] μ=1.000 σ=0.
X_custom tensor[3, 4] n=12 x∈[1.000, 12.000] μ=6.500 σ=3.606
X_random tensor[3, 4, 5] n=60 (statistics vary from run to run)


array([[ 1.,  2.,  3.,  4.],
       [ 5.,  6.,  7.,  8.],
       [ 9., 10., 11., 12.]], dtype=float32)
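`torch.rand` and `torch.randn` are easy to confuse, so the sketch below contrasts them. The sample size and the seed are arbitrary assumptions chosen only to make the comparison stable.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only to make the sketch repeatable

uniform = torch.rand(10_000)   # uniform on [0, 1)
normal = torch.randn(10_000)   # standard normal: mean 0, std 1

# Uniform samples never leave [0, 1); normal samples are centred on 0.
print(uniform.min().item(), uniform.max().item())
print(normal.mean().item(), normal.std().item())
```

If a tensor meant to be normally distributed has no negative values, `torch.rand` was probably used by mistake.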

# Checking tensor's properties

For the tensor called `X_custom`, extract its:
- shape
- mean value
- standard deviation
- minimal value
- maximal value
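One detail worth knowing when extracting these statistics: `Tensor.std()` divides by `n - 1` by default (the sample standard deviation), whereas `np.std` divides by `n`. The sketch below rebuilds a tensor with the same values as `X_custom` under the hypothetical name `values` and compares the two conventions.

```python
import numpy as np
import torch

values = torch.arange(1, 13, dtype=torch.float32).reshape(3, 4)

sample_std = values.std()                    # divides by n - 1 (Torch default)
population_std = values.std(unbiased=False)  # divides by n
numpy_std = np.std(values.numpy())           # NumPy divides by n by default

print(tuple(values.shape))
print(sample_std.item(), population_std.item(), numpy_std)
```

This is why Torch reports about `3.606` for the numbers 1 to 12, while NumPy's default would give about `3.452`.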

In [2]:
shape = (
    X_custom.shape
)
mean = (
    X_custom.mean()
)
std = (
    X_custom.std()
)
min_val = (
    X_custom.min()
)
max_val = (
    X_custom.max()
)

answer['shape'] = shape
answer['mean'] = mean.clone().numpy()
answer['std'] = std.clone().numpy()
answer['min_val'] = min_val.clone().numpy()
answer['max_val'] = max_val.clone().numpy()

print('shape', shape)
print('mean', mean)
print('std', std)
print('min_val', min_val)
print('max_val', max_val)



# Slicing the tensors

From the matrix, called `X_custom`, extract:
- `x_0`: 0th row 
- `x_1`: 1st row
- `x_0_0`: element from the 0th row and 0th column
- `x_all_0`: 0th column

In [None]:
x = X_custom
print(x.numpy())

x_0 = (
    x[0]
)

x_1 = (
    x[1]
)

x_0_0 = (
    x[0, 0]
)

x_all_0 = (
    x[:, 0]
)

answer['x_0'] = x_0.clone().numpy()
answer['x_1'] = x_1.clone().numpy()
answer['x_0_0'] = x_0_0.clone().numpy()
answer['x_all_0'] = x_all_0.clone().numpy()

print('x_0', x_0)
print('x_1', x_1)
print('x_0_0', x_0_0)
print('x_all_0', x_all_0)

[[ 1.  2.  3.  4.]
 [ 5.  6.  7.  8.]
 [ 9. 10. 11. 12.]]
x_0 tensor[4] x∈[1.000, 4.000] μ=2.500 σ=1.291 [1.000, 2.000, 3.000, 4.000]
x_1 tensor[4] x∈[5.000, 8.000] μ=6.500 σ=1.291 [5.000, 6.000, 7.000, 8.000]
x_0_0 tensor 1.000
x_all_0 tensor[3] x∈[1.000, 9.000] μ=5.000 σ=4.000 [1.000, 5.000, 9.000]
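Slicing supports the same range and negative-index syntax as NumPy, and basic slices are views rather than copies. The following is a small sketch on a separate integer matrix `m` (a hypothetical name, so that the notebook's `x` stays untouched).

```python
import torch

m = torch.arange(1, 13).reshape(3, 4)

last_row = m[-1]       # negative indices count from the end
middle = m[1:, 1:3]    # rows 1..2, columns 1..2
first_col = m[:, 0]    # every row, column 0

# Basic slices are views: writing through one changes the original tensor.
first_col[0] = -1
print(m[0, 0].item())
```

Call `.clone()` on a slice when an independent copy is needed.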


# Operations with tensors

For the pair of matrices `x` and `y` defined below, find:
- `x_and_10`: $X + 10$
- `x_squared`: $X^2$ (elementwise)
- `x_plus_y`: $X + Y$
- `x_times_y`: $X \cdot Y$ (elementwise)
- `x_divided_by_y`: $X / Y$ (elementwise)
- `x_mod_y`: $X \% Y$ (elementwise)
- `x_exp`: $\exp(X)$ (elementwise)
- `x_log`: $\log(X)$ (elementwise)
- `x_sin`: $\sin(X)$ (elementwise)
- `x_cos`: $\cos(X)$ (elementwise)
- `x_matmul_y`: $XY^\top$ (matrix multiplication; `x` and `y` are both $3 \times 4$, so `y` is transposed)
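Elementwise and matrix products have different shape requirements, which matters here because both matrices are $3 \times 4$. The sketch below uses the same values under the hypothetical names `a` and `b` and shows which products are defined.

```python
import torch

a = torch.tensor([[1., 2., 3., 4.],
                  [5., 6., 7., 8.],
                  [9., 10., 11., 12.]])
b = torch.tensor([[12., 11., 10., 9.],
                  [8., 7., 6., 5.],
                  [4., 3., 2., 1.]])

elementwise = a * b   # same shape as the inputs: (3, 4)
gram = a @ b.T        # (3, 4) @ (4, 3) -> (3, 3)
outer = a.T @ b       # (4, 3) @ (3, 4) -> (4, 4)

print(elementwise.shape, gram.shape, outer.shape)
```

`a @ b` without a transpose would raise an error, because the inner dimensions (4 and 3) do not match. The `@` operator is shorthand for `torch.matmul`.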

In [None]:
x = torch.Tensor([[1,  2,  3,  4],
                  [5,  6,  7,  8],
                  [9, 10, 11, 12]])

y = torch.Tensor([[12, 11, 10, 9],
                  [8, 7, 6, 5],
                  [4, 3, 2, 1]])

x_and_10 = (
    x + 10
)

x_squared = (
    torch.square(x)
)

x_plus_y = (
    x+y
)

x_times_y = (
    x * y
)

x_divided_by_y = (
    x / y
)

x_mod_y = (
    x % y
)

x_exp = (
    torch.exp(x)
)

x_log = (
    torch.log(x)
)

x_sin = (
    torch.sin(x)
)

x_cos = (
    torch.cos(x)
)

x_matmul_y = (
    # x and y are both 3x4, so y is transposed to obtain a 3x3 product
    torch.matmul(x, y.t())
)

answer['x_and_10'] = x_and_10.clone().numpy()
answer['x_squared'] = x_squared.clone().numpy()
answer['x_plus_y'] = x_plus_y.clone().numpy()
answer['x_times_y'] = x_times_y.clone().numpy()
answer['x_divided_by_y'] = x_divided_by_y.clone().numpy()
answer['x_mod_y'] = x_mod_y.clone().numpy()
answer['x_exp'] = x_exp.clone().numpy()
answer['x_log'] = x_log.clone().numpy()
answer['x_sin'] = x_sin.clone().numpy()
answer['x_cos'] = x_cos.clone().numpy()
answer['x_matmul_y'] = x_matmul_y.clone().numpy()

print('x_and_10', x_and_10)
print('x_squared', x_squared)
print('x_plus_y', x_plus_y)
print('x_times_y', x_times_y)
print('x_divided_by_y', x_divided_by_y)
print('x_mod_y', x_mod_y)
print('x_exp', x_exp)
print('x_log', x_log)
print('x_sin', x_sin)
print('x_cos', x_cos)
print('x_matmul_y', x_matmul_y)

print(x_log.numpy())
print(x_sin.numpy())    

x_and_10 tensor[3, 4] n=12 x∈[11.000, 22.000] μ=16.500 σ=3.606
x_squared tensor[3, 4] n=12 x∈[1.000, 144.000] μ=54.167 σ=48.149
x_plus_y tensor[3, 4] n=12 x∈[13.000, 13.000] μ=13.000 σ=0.
x_times_y tensor[3, 4] n=12 x∈[12.000, 42.000] μ=30.333 σ=11.015
x_divided_by_y tensor[3, 4] n=12 x∈[0.083, 12.000] μ=2.362 σ=3.423
x_mod_y tensor[3, 4] n=12 x∈[0., 6.000] μ=2.333 σ=1.875
x_exp tensor[3, 4] n=12 x∈[2.718, 1.628e+05] μ=2.146e+04 σ=4.778e+04
x_log tensor[3, 4] n=12 x∈[0., 2.485] μ=1.666 σ=0.756
x_sin tensor[3, 4] n=12 x∈[-1.000, 0.989] μ=-0.010 σ=0.756
x_cos tensor[3, 4] n=12 x∈[-0.990, 0.960] μ=-0.047 σ=0.719
x_matmul_y tensor[3, 3] n=9 x∈[20.000, 436.000] μ=164.000 σ=135.174 [[100.000, 60.000, 20.000], [268.000, 164.000, 60.000], [436.000, 268.000, 100.000]]
[[0.        0.6931472 1.0986123 1.3862944]
 [1.609438  1.7917595 1.9459101 2.0794415]
 [2.1972246 2.3025851 2.3978953 2.4849067]]
[[ 0.84147096  0.9092974   0.14112    -0.7568025 ]
 [-0.9589243  -0.2794155   0.6569866   0.98935825

# Conditions and masking

For the matrix `x`, find:
- `x_greater_than_3`: the mask of all the elements that are greater than `3`
- `x_greater_than_3_and_less_than_10`: the mask of all the elements that are greater than `3` and less than `10`
- `x_greater_than_10_or_less_than_3`: the mask of all the elements that are either less than `3` or greater than `10`
- `x_not_equal_to_3`: the mask of all the elements that are not equal to `3`
- `x_vals_greater_than_3`: all the elements that are greater than `3`

In [38]:
x_greater_than_3 = (
    x > 3
)

x_greater_than_3_and_less_than_10 = (
    (x > 3) & (x < 10)
)

x_greater_than_10_or_less_than_3 = (
    (x > 10) | (x < 3)
)

x_not_equal_to_3 = (
    x != 3
)

x_vals_greater_than_3 = (
    x[x > 3]
)

answer['x_greater_than_3'] = x_greater_than_3.clone().numpy()
answer['x_greater_than_3_and_less_than_10'] = x_greater_than_3_and_less_than_10.clone().numpy()
answer['x_greater_than_10_or_less_than_3'] = x_greater_than_10_or_less_than_3.clone().numpy()
answer['x_not_equal_to_3'] = x_not_equal_to_3.clone().numpy()
answer['x_vals_greater_than_3'] = x_vals_greater_than_3.clone().numpy()

print('x_greater_than_3', x_greater_than_3)
print('x_greater_than_3_and_less_than_10', x_greater_than_3_and_less_than_10)
print('x_greater_than_10_or_less_than_3', x_greater_than_10_or_less_than_3)
print('x_not_equal_to_3', x_not_equal_to_3)
print('x_vals_greater_than_3', x_vals_greater_than_3)

print(x_vals_greater_than_3.numpy())

x_greater_than_3 tensor[3, 4] bool n=12 x∈[False, True] μ=0.750 σ=0.452
x_greater_than_3_and_less_than_10 tensor[3, 4] bool n=12 x∈[False, True] μ=0.500 σ=0.522
x_greater_than_10_or_less_than_3 tensor[3, 4] bool n=12 x∈[False, True] μ=0.333 σ=0.492
x_not_equal_to_3 tensor[3, 4] bool n=12 x∈[False, True] μ=0.917 σ=0.289
x_vals_greater_than_3 tensor[9] x∈[4.000, 12.000] μ=8.000 σ=2.739 [4.000, 5.000, 6.000, 7.000, 8.000, 9.000, 10.000, 11.000, 12.000]
[ 4.  5.  6.  7.  8.  9. 10. 11. 12.]
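Boolean masks can do more than select values. As a rough sketch (the tensor `v` and the other names are illustrative), the example below counts the matching elements, negates a mask, and uses `torch.where` to keep the tensor's shape while zeroing everything outside the mask.

```python
import torch

v = torch.tensor([[1., 2., 3., 4.],
                  [5., 6., 7., 8.],
                  [9., 10., 11., 12.]])

# Parentheses are required: & binds more tightly than the comparisons.
mask = (v > 3) & (v < 10)

count = mask.sum().item()                            # True counts as 1
negated = ~mask                                      # logical NOT of the mask
clipped = torch.where(mask, v, torch.zeros_like(v))  # keep masked values, zero the rest

print(count, clipped)
```

Unlike `v[mask]`, which always returns a flat 1-D tensor, `torch.where` preserves the original `[3, 4]` shape.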


# Beware of shallow copying

Note that assigning a tensor to a new variable does not copy its data: both names refer to the same tensor and share memory!

- `y_shallow`: create a shallow copy of tensor `x` (plain assignment)
- change element `y_shallow[0, 0]` to `999`
- check `x`
- you should see that the original tensor has changed too (because the tensors share memory)

In [39]:
y_shallow = None
y_shallow = x

y_shallow[0, 0] = 999
print(x.numpy())

answer['y_shallow'] = y_shallow.clone().numpy()
answer['x_shallow_victim'] = x.clone().numpy()

print('x_shallow_victim', x)
print('y_shallow', y_shallow)

[[999.   2.   3.   4.]
 [  5.   6.   7.   8.]
 [  9.  10.  11.  12.]]
x_shallow_victim tensor[3, 4] n=12 x∈[2.000, 999.000] μ=89.667 σ=286.383
y_shallow tensor[3, 4] n=12 x∈[2.000, 999.000] μ=89.667 σ=286.383


In [40]:
x = torch.Tensor([[1,  2,  3,  4],
                  [5,  6,  7,  8],
                  [9, 10, 11, 12]])

Now create a deep copy:
- `y_deep`: create a deep copy of the tensor `x` using the `.clone()` method
- change the element `y_deep[0, 0]` to `999`
- check the original tensor
- the original tensor stays the same!


In [41]:
y_deep = x.clone()
y_deep[0, 0] = 999

print(x.numpy())

answer['y_deep'] = y_deep.clone().numpy()
answer['x_cloned'] = x.clone().numpy()

print('x_cloned', x)
print('y_deep', y_deep)

[[ 1.  2.  3.  4.]
 [ 5.  6.  7.  8.]
 [ 9. 10. 11. 12.]]
x_cloned tensor[3, 4] n=12 x∈[1.000, 12.000] μ=6.500 σ=3.606
y_deep tensor[3, 4] n=12 x∈[2.000, 999.000] μ=89.667 σ=286.383
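One way to check whether two tensors share memory is to compare their `data_ptr()` values. The sketch below uses hypothetical names and contrasts a plain alias, a view, and a clone.

```python
import torch

original = torch.ones(2, 2)

alias = original          # same Python object
view = original.view(4)   # new tensor object, same underlying storage
deep = original.clone()   # new tensor object, new storage

print(alias is original)
print(view.data_ptr() == original.data_ptr())
print(deep.data_ptr() == original.data_ptr())

view[0] = 5.0  # visible through `original` and `alias`, but not through `deep`
```

Views produced by `.view()`, `.reshape()` (when possible), `.t()` and slicing behave like the alias here, so `.clone()` is the safe choice whenever the original must stay unchanged.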


# Types, devices and casting

For the tensor `x`, find:
- `x_dtype`: its data type
- `x_device`: its device
- `x_double`: the tensor cast to double
- `x_int`: the tensor cast to int
- `x_float`: the tensor cast to float
- `x_half`: the tensor cast to half

In [53]:
x_dtype = (
    x.dtype
)

x_device = (
    x.device
)

x_double = (
    x.double()
)

x_int = (
    x.to(torch.int32)
)

x_float = (
    x.to(torch.float32)
)

x_half = (
    x.to(torch.float16)
)


answer['x_dtype'] = str(x_dtype)
answer['x_double'] = x_double.clone().numpy()
answer['x_int'] = x_int.clone().numpy()
answer['x_float'] = x_float.clone().numpy()
answer['x_half'] = x_half.clone().numpy()

print('x_dtype', x_dtype)
print('x_double', x_double)
print('x_int', x_int)
print('x_float', x_float)
print('x_half', x_half)

print(x.dtype)
print(x_device)
print(x.numpy())
print(x_double.numpy()) 
            

x_dtype torch.float32
x_double tensor[3, 4] f64 n=12 x∈[1.000, 12.000] μ=6.500 σ=3.606
x_int tensor[3, 4] i32 n=12 x∈[1, 12] μ=6.500 σ=3.606
x_float tensor[3, 4] n=12 x∈[1.000, 12.000] μ=6.500 σ=3.606
x_half tensor[3, 4] f16 n=12 x∈[1.000, 12.000] μ=6.500 σ=3.605
torch.float32
cpu
[[ 1.  2.  3.  4.]
 [ 5.  6.  7.  8.]
 [ 9. 10. 11. 12.]]
[[ 1.  2.  3.  4.]
 [ 5.  6.  7.  8.]
 [ 9. 10. 11. 12.]]
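Casting is not always lossless, which the small integers in `x` do not reveal. The sketch below uses values picked purely for illustration to show that float-to-int casts truncate toward zero and that `float16` cannot represent every integer above 2048.

```python
import torch

f = torch.tensor([1.7, -1.7, 2049.0])

as_int = f.to(torch.int32)        # truncates toward zero, no rounding
as_half = f.to(torch.float16)     # fewer mantissa bits than float32
back = as_half.to(torch.float32)  # casting back does not restore lost precision

print(as_int)
print(back)
```

Here `2049.0` becomes `2048.0` after the round trip through half precision, which is why `float16` is usually reserved for cases where the reduced precision is acceptable.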


# Integration with NumPy

For the ndarray `x_np`:
- `x_torch`: convert the array to a tensor using `from_numpy`
- `x_sqrt_np`: calculate the `sqrt` of the tensor and convert the result to NumPy

In [54]:
import numpy as np
x_np = np.array([[1, 2, 3, 4],
              [4, 3, 2, 1]])

x_torch = (
    torch.from_numpy(x_np)
)
x_sqrt_np = (
    torch.sqrt(x_torch).numpy()
)

answer['x_torch'] = x_torch.clone().numpy()
answer['x_sqrt_np'] = x_sqrt_np.copy()

print('x_torch', x_torch)
print('x_sqrt_np', x_sqrt_np)

x_torch tensor[2, 4] i64 n=8 x∈[1, 4] μ=2.500 σ=1.195 [[1, 2, 3, 4], [4, 3, 2, 1]]
x_sqrt_np [[1.        1.4142135 1.7320508 2.       ]
 [2.        1.7320508 1.4142135 1.       ]]
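Two behaviours of the conversion are easy to miss: `torch.from_numpy` shares memory with the source array, and `torch.sqrt` promotes integer input to floating point. A minimal sketch, with array contents chosen only for illustration:

```python
import numpy as np
import torch

arr = np.array([1, 4, 9])
t = torch.from_numpy(arr)        # shares memory with `arr`, keeps its integer dtype

arr[0] = 16                      # the change is visible through the tensor
roots = torch.sqrt(t)            # integer input is promoted to float32

independent = torch.tensor(arr)  # copies the data instead of sharing it
arr[1] = 25                      # affects `t` but not `independent`

print(t, roots, independent)
```

Use `torch.tensor(arr)` or `torch.from_numpy(arr).clone()` when the tensor must not follow later changes to the array.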


# Working with CUDA

Torch is seamlessly integrated with CUDA and GPU calculations.

To check it, you can upload this notebook to Colab and try the cells below. Note that some of them will fail if you do not have a CUDA device.

Besides, Torch can work with:
- Nvidia GPUs
- AMD GPUs
- Apple's MPS (Metal Performance Shaders)
- TPUs

Torch also has very powerful tools for multi-device parallelization.

In [64]:
%pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118



In [2]:
import torch

if torch.backends.mps.is_available():
    print("MPS is available!")
    device = torch.device("mps")
else:
    print("MPS is not available.")

MPS is not available.


In [56]:
torch.device('cpu')

device(type='cpu')

In [57]:
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
device

device(type='cpu')
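The CUDA and MPS checks above can be combined into one helper so the rest of the code stays device-agnostic. This is a sketch: `pick_device` is a hypothetical helper name, not a Torch API.

```python
import torch

def pick_device() -> torch.device:
    """Return the first available accelerator, falling back to the CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    if torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")

device = pick_device()
x_on_device = torch.ones(3, 4, device=device)  # allocate directly on the device
back_on_cpu = x_on_device.cpu()                # required before calling .numpy()

print(device, back_on_cpu.numpy().shape)
```

Creating the tensor with `device=device` avoids allocating it on the CPU first and copying it over afterwards.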

In [58]:
x_cuda = x.to(device)
x_cuda


tensor[3, 4] n=12 x∈[1.000, 12.000] μ=6.500 σ=3.606

In [59]:
%time y = (x - x + x * 10.0) ** 2

CPU times: user 874 μs, sys: 1.94 ms, total: 2.82 ms
Wall time: 3.39 ms


In [60]:
%time y_cuda = (x_cuda - x_cuda + x_cuda * 10.0) ** 2


CPU times: user 1.26 ms, sys: 30 μs, total: 1.29 ms
Wall time: 1.09 ms
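In the run above `device` resolved to `cpu`, so both timings measure CPU work. On a real GPU, CUDA kernels are launched asynchronously, so a plain wall-clock measurement may capture only the launch. A hedged sketch of a fairer measurement, where `timed` is a hypothetical helper:

```python
import time
import torch

def timed(fn, device: torch.device):
    """Run fn() and return (result, seconds), waiting for pending GPU work."""
    if device.type == "cuda":
        torch.cuda.synchronize()  # finish earlier kernels before starting the clock
    start = time.perf_counter()
    result = fn()
    if device.type == "cuda":
        torch.cuda.synchronize()  # wait until the timed kernels have finished
    return result, time.perf_counter() - start

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
a = torch.arange(1.0, 13.0, device=device).reshape(3, 4)
result, seconds = timed(lambda: (a - a + a * 10.0) ** 2, device)

print(result.shape, seconds)
```

For a tensor this small, launch overhead dominates either way; GPU speedups usually become visible only with much larger tensors.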


In [61]:
from pprint import pprint

json_tricks.dump(answer, '.answer.json')


'{"zeros": {"__ndarray__": [[[0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0]], [[0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0]], [[0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0]]], "dtype": "float32", "shape": [3, 4, 5], "Corder": true}, "ones": {"__ndarray__": [[[1.0, 1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0, 1.0]], [[1.0, 1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0, 1.0]], [[1.0, 1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0, 1.0]]], "dtype": "float32", "shape": [3, 4, 5], "Corder": true}, "custom": {"__ndarray__": [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0], [9.0, 10.0, 11.0, 12.0]], "dtype": "float32", "shape": [3, 4], "Corder": true}, "shape": [3, 4], "me