In [8]:
import torch
import numpy as np

## What are `tensors`?

Tensors are a generalization of vectors and matrices. In PyTorch, a tensor is a multi-dimensional matrix containing elements of a **single data type**.

In [2]:
t = torch.tensor([[1., -1.], [1., -1.]])
t

tensor([[ 1., -1.],
        [ 1., -1.]])

In [3]:
t.dtype # They have a type

torch.float32

In [4]:
t.shape # a shape

torch.Size([2, 2])

In [5]:
t.device # and live in some device

device(type='cpu')
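To make the "single data type" point concrete, here is a small sketch of how `torch.tensor()` picks one dtype for the whole tensor. It assumes PyTorch's default floating-point dtype has not been changed; the variable names are only illustrative.

```python
import torch

# All integers: the tensor gets a single integer dtype.
ints = torch.tensor([1, 2])

# Mixed int/float input: every element is promoted to one float dtype.
mixed = torch.tensor([1, 2.5])

# The dtype can also be requested explicitly at construction time.
explicit = torch.tensor([1, 2], dtype=torch.float64)

print(ints.dtype, mixed.dtype, explicit.dtype)
```

Whatever the input looks like, each tensor ends up with exactly one `dtype` for all of its elements.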

Some insights:
- PyTorch's heavy work is actually implemented in C++.
- In Python, C++ code is (usually) integrated through what is called an **extension**.
- PyTorch uses **ATen**, the foundational tensor operation library on which everything else is built.
- For automatic differentiation, PyTorch uses **Autograd**, which is an augmentation on top of the **ATen** framework.
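As a minimal illustration of the last point, the sketch below lets Autograd record a tiny computation and differentiate it. The function `y = x ** 2` is an arbitrary choice made for this example.

```python
import torch

# A leaf tensor that Autograd is asked to track.
x = torch.tensor(3.0, requires_grad=True)

# The operation is recorded in the autograd graph as it runs.
y = x ** 2

# Backpropagate through the recorded graph: dy/dx = 2 * x.
y.backward()

print(x.grad)  # tensor(6.)
```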

## Everything is an object

In [6]:
a = 300
b = 300
a is b

False

In [7]:
a = 200
b = 200
a is b

True
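The two cells above behave differently because `is` compares object identity, not value, and CPython keeps a cache of preallocated objects for small integers (from -5 to 256). The sketch below reproduces the effect. It relies on this CPython implementation detail, and it builds the integers with `int("...")` at runtime only so that the compiler cannot merge equal constants.

```python
# Build the integers at runtime so equal literals are not merged by the compiler.
small_a, small_b = int("200"), int("200")
large_a, large_b = int("300"), int("300")

# `is` checks whether both names point to the same object.
same_small = small_a is small_b   # cached small int: same object on CPython
same_large = large_a is large_b   # outside the cache: two distinct objects

# `==` compares values, so it is the right tool for numbers.
equal_large = large_a == large_b

print(same_small, same_large, equal_large)
```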

## Zero-copying `tensors`

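It is very common to load data as NumPy arrays and convert them to PyTorch tensors, or vice versa. Note that `torch.tensor()` always copies its input, so the tensor and the array do not share memory: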

In [9]:
np_array = np.ones((2,2))
np_array

array([[1., 1.],
       [1., 1.]])

In [10]:
torch_array = torch.tensor(np_array)
torch_array

tensor([[1., 1.],
        [1., 1.]], dtype=torch.float64)

In [11]:
torch_array.add_(1.0)

tensor([[2., 2.],
        [2., 2.]], dtype=torch.float64)

In [12]:
np_array

array([[1., 1.],
       [1., 1.]])

Let's use `torch.from_numpy()` instead, which shares memory with the NumPy array rather than copying it:

In [13]:
torch_array = torch.from_numpy(np_array)
torch_array

tensor([[1., 1.],
        [1., 1.]], dtype=torch.float64)

In [14]:
torch_array.add_(1.0)

tensor([[2., 2.],
        [2., 2.]], dtype=torch.float64)

In [15]:
np_array

array([[2., 2.],
       [2., 2.]])

The difference between in-place and standard (out-of-place) operations is not always obvious:

In [16]:
np_array = np.ones((2,2))
np_array

array([[1., 1.],
       [1., 1.]])

In [17]:
torch_array = torch.from_numpy(np_array)

In [18]:
np_array = np_array + 1.0

In [19]:
torch_array # unchanged: `np_array + 1.0` created a new array. `np_array += 1.0` would be an in-place operation that changes the memory shared with `torch_array`.

tensor([[1., 1.],
        [1., 1.]], dtype=torch.float64)
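The behaviour in this section can be checked directly. The following sketch uses `np.shares_memory()` to test whether a tensor and an array point at the same buffer, then contrasts an in-place update with a rebinding. The variable names `copied`, `shared`, and `after_inplace` are illustrative.

```python
import numpy as np
import torch

np_array = np.ones((2, 2))

copied = torch.tensor(np_array)       # copies the data into new memory
shared = torch.from_numpy(np_array)   # reuses the NumPy buffer (zero-copy)

# Tensor.numpy() exposes the tensor's own memory as an array,
# so it can be compared against the original NumPy buffer.
copy_shares = np.shares_memory(np_array, copied.numpy())
view_shares = np.shares_memory(np_array, shared.numpy())

# In-place: writes into the shared buffer, so `shared` sees the change.
np_array += 1.0
after_inplace = shared.clone()

# Out-of-place: allocates a new array and rebinds the name `np_array`;
# `shared` still points at the old buffer and does not change.
np_array = np_array + 1.0

print(copy_shares, view_shares)
print(after_inplace)
print(shared)
```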

## Tensor Storage

The abstraction responsible for holding the data isn’t actually the `Tensor`, but the `Storage`.
- The `Storage` abstraction is very powerful because it decouples the raw data from how we interpret it.
- Multiple tensors can share the **same storage** with different interpretations, called **views**, without **duplicating memory**:

In [20]:
x = torch.ones((2, 2))
x

tensor([[1., 1.],
        [1., 1.]])

In [21]:
x_view = x.view(4)
x_view

tensor([1., 1., 1., 1.])

In [23]:
x_data = x.data_ptr()
x_view_data = x_view.data_ptr()
x_data == x_view_data # True

# x_view is a different view (interpretation) of the same data
# present in the underlying storage that is shared between both
# tensors.

True
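What distinguishes two views of the same storage is metadata such as shape, strides, and storage offset. The sketch below, with tensors chosen only for illustration, shows a transpose and a row slice that reuse the original storage, and a write through one view that is visible through the others.

```python
import torch

x = torch.arange(6.0).view(2, 3)   # [[0., 1., 2.], [3., 4., 5.]]
x_t = x.t()                        # transpose: same storage, swapped strides
row = x[1]                         # second row: same storage, offset by 3 elements

print(x.stride(), x_t.stride())    # (3, 1) (1, 3)
print(row.storage_offset())        # 3

# The transpose starts at the same address; the row starts 3 elements later.
same_start = x_t.data_ptr() == x.data_ptr()
row_shift = row.data_ptr() - x.data_ptr()   # in bytes

# Writing through one view changes what every other view sees.
row[0] = 100.0
print(x[1, 0], x_t[0, 1])          # tensor(100.) tensor(100.)
```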

## Memory allocators (CPU/GPU)

The tensor storage can be allocated in either CPU or GPU memory, so a mechanism is required to switch between these different allocations:

https://github.com/pytorch/pytorch/blob/main/c10/core/Allocator.h

There are `Allocator`s that use GPU allocation functions such as `cudaMalloc()` when the storage is meant for the GPU, or POSIX functions such as `posix_memalign()` for data in CPU memory.

PyTorch uses a CUDA caching allocator that maintains a cache of allocations with the `Block` structure:

https://github.com/pytorch/pytorch/blob/main/c10/cuda/CUDACachingAllocator.cpp
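The effect of the cache can be observed from Python. In the sketch below, `torch.cuda.memory_allocated()` reports memory occupied by live tensors, while `torch.cuda.memory_reserved()` reports memory held by the caching allocator. The helper `caching_allocator_stats` is a hypothetical name for this example, and it returns `None` when no CUDA device is available.

```python
import torch

def caching_allocator_stats():
    """Allocate and free a CUDA tensor, recording allocator counters."""
    if not torch.cuda.is_available():
        return None  # nothing to measure without a CUDA device

    x = torch.empty(1024, 1024, device="cuda")
    stats = {
        "allocated_before": torch.cuda.memory_allocated(),
        "reserved_before": torch.cuda.memory_reserved(),
    }

    # Dropping the tensor returns its block to the cache, not to the driver.
    del x
    stats["allocated_after"] = torch.cuda.memory_allocated()
    stats["reserved_after"] = torch.cuda.memory_reserved()
    return stats

stats = caching_allocator_stats()
print(stats)
```

On a GPU machine, the allocated count should drop after `del x` while the reserved count typically stays the same, since the freed block is kept for reuse. Calling `torch.cuda.empty_cache()` asks the allocator to release unused cached blocks.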