# PyTorch Tensors

Understanding tensors is critical to working with PyTorch. My background is in physics, so I am used to thinking of tensors as mathematical objects that describe a multilinear map

$u = T_{ijk\dotsb}^{lmn\dotsb}v$

and obey transformation laws like

$T_{i^\prime j^\prime k^\prime \dotsb}^{l^\prime m^\prime n^\prime \dotsb} = \Gamma_{i}^{i^\prime}\Gamma_{j}^{j^\prime}\Gamma_{k}^{k^\prime}\dotsb\Gamma^{l}_{l^\prime}\Gamma^{m}_{m^\prime}\Gamma^{n}_{n^\prime} T_{ijk\dotsb}^{lmn\dotsb}$

Working with PyTorch tensors does require some index gymnastics, but that is not what is explored here. Let's see what PyTorch tensors actually are.

For instance, how are PyTorch tensors stored?

In [1]:
import sys

import torch

In [2]:
range_t: torch.Tensor = torch.tensor(list(range(9)))

In [3]:
sys.getsizeof(list(range(9)))

128

In [4]:
sys.getsizeof(range_t)

72

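The tensor reports a smaller size than the list used to construct it. Some care is needed with this comparison: `sys.getsizeof` only measures the Python object it is handed, so for the list it counts the list's array of pointers and not the integer objects they reference, and for the tensor it most likely counts only the Python wrapper and not the data buffer. The underlying difference is real, though: the backend stores tensor data as raw values in a contiguous memory block, whereas a Python list stores pointers to separately allocated objects.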
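One way to measure the data itself, independent of Python object overhead, is to multiply the number of elements by the width of each element. The following is a minimal sketch; `data_bytes` is a name introduced here for illustration.

```python
import torch

range_t = torch.tensor(list(range(9)))

# Bytes per element (8 for int64) times the number of elements (9)
data_bytes = range_t.element_size() * range_t.nelement()
print(range_t.dtype, data_bytes)
```

For nine `int64` values this gives 72 bytes of raw data, which is the amount the contiguous buffer has to hold.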

In [5]:
range_t.is_contiguous()

True

In [23]:
range_t.storage()

 0
 1
 2
 3
 4
 5
 6
 7
 8
[torch.storage._TypedStorage(dtype=torch.int64, device=cpu) of size 9]

How does indexing work?

In [6]:
range_t.size(), range_t.storage_offset(), range_t.stride()

(torch.Size([9]), 0, (1,))

Since this is a 1D tensor, we are effectively working with a vector. Moving from one element to the next is a unit step in storage. What if we want to make this a matrix?

In [24]:
range_t_tr: torch.Tensor = range_t.view(3, 3)

In [25]:
range_t_tr

tensor([[0, 1, 2],
        [3, 4, 5],
        [6, 7, 8]])

In [35]:
range_t_tr.storage()

 0
 1
 2
 3
 4
 5
 6
 7
 8
[torch.storage._TypedStorage(dtype=torch.int64, device=cpu) of size 9]

While the two tensors do not have the same shape, they look the same in storage.

In [36]:
range_t_tr.storage() == range_t.storage()

False

The equality operator says the storage containers are not equal, but are the storage elements equal?

In [34]:
assert all(
    z1 == z2
    for (z1, z2) in zip(range_t_tr.storage(), range_t.storage())
)
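Element-wise equality does not by itself show that the two tensors share one buffer, since a copy would pass the same check. A possible way to test sharing, sketched below, is to compare `data_ptr()` (the address of each tensor's first element) and to write through the view. The name `same_buffer` is introduced here for illustration.

```python
import torch

range_t = torch.tensor(list(range(9)))
range_t_tr = range_t.view(3, 3)

# Both tensors report the same address for their first element
same_buffer = range_t.data_ptr() == range_t_tr.data_ptr()

# Writing through the view is visible in the original tensor
range_t_tr[0, 0] = 100
print(same_buffer, range_t[0].item())
```

If the view were a copy, the write to `range_t_tr` would leave `range_t` unchanged.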

What about using a slice? Will the offset and stride be different?

In [41]:
range_t_tr[1:, 1:]

tensor([[4, 5],
        [7, 8]])

In [42]:
range_t_tr[1:, 1:].size(), range_t_tr[1:, 1:].stride(), range_t_tr[1:, 1:].storage_offset()

(torch.Size([2, 2]), (3, 1), 4)
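The offset and stride together describe where each element of the slice sits in the shared storage: element `(i, j)` is found at `offset + i * stride[0] + j * stride[1]`. The sketch below works through that arithmetic for the slice above; `sliced`, `flat_positions`, and `values` are names introduced here for illustration.

```python
import torch

range_t = torch.tensor(list(range(9)))
sliced = range_t.view(3, 3)[1:, 1:]

offset = sliced.storage_offset()
s0, s1 = sliced.stride()

# Element (i, j) of the slice lives at flat position offset + i*s0 + j*s1
# of the original 1D tensor, which shares the same storage.
flat_positions = [
    [offset + i * s0 + j * s1 for j in range(sliced.size(1))]
    for i in range(sliced.size(0))
]

# Look the positions up in the original 1D tensor
values = [[range_t[p].item() for p in row] for row in flat_positions]
print(flat_positions)
print(values)
```

The looked-up values reproduce the slice, even though the slice never copied any data.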

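The stride is the same as before, so it no longer matches the size of the tensor. This is due to the nature of slicing: the slice is a view into the same storage, so it keeps the parent's strides and only changes the size and the offset. What does a clone do?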

In [44]:
range_t_tr[1:, 1:].clone().stride()

(2, 1)

This is the expected result: the clone is a compact copy with its own storage. The slice is NOT a contiguous memory object, so its stride is different. To get the compact layout we have to make the tensor contiguous (or clone it if necessary).

In [48]:
range_t_tr[1:, 1:].contiguous().stride()

(2, 1)
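As a rough check of what `contiguous()` does here, the sketch below compares the slice with its contiguous version. The names `sliced` and `compact` are introduced for illustration only.

```python
import torch

range_t = torch.tensor(list(range(9)))
sliced = range_t.view(3, 3)[1:, 1:]
compact = sliced.contiguous()

# The slice skips elements in storage; the copy does not
print(sliced.is_contiguous(), compact.is_contiguous())

# The copy has its own buffer, so it starts at offset 0 ...
print(compact.storage_offset())

# ... and writing to it leaves the original data untouched
compact[0, 0] = -1
print(range_t[4].item())
```

Because the slice is not contiguous, `contiguous()` has to allocate new memory, which is why changes to `compact` do not propagate back to `range_t`.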

What about applying functions to tensors?

In [72]:
range_t.dtype, range_t.sqrt().dtype

(torch.int64, torch.float32)

These are two different tensors. As with `sorted` and `list.sort`, one method returns a new object and the other works in place, respectively. In PyTorch, the in-place variant carries a trailing underscore. Let's verify this.

In [74]:
range_t.sqrt_()

RuntimeError: result type Float can't be cast to the desired output type Long

The in-place operation has to write floating-point results into the existing int64 tensor, which is not possible. We have to convert the tensor to a floating-point type first; the square root can then be applied element-wise in place.

In [76]:
range_t.clone().to(torch.float).sqrt_()

tensor([0.0000, 1.0000, 1.4142, 1.7321, 2.0000, 2.2361, 2.4495, 2.6458, 2.8284])
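To check that the underscore variant really reuses the existing memory, one option is to compare `data_ptr()` before and after. This is a minimal sketch that starts from a float tensor built with `torch.arange`; `float_t`, `ptr_before`, `out_of_place`, and `in_place` are names introduced here for illustration.

```python
import torch

float_t = torch.arange(9, dtype=torch.float32)
ptr_before = float_t.data_ptr()

out_of_place = float_t.sqrt()   # allocates a new tensor, float_t unchanged
in_place = float_t.sqrt_()      # overwrites the data of float_t

# The out-of-place result lives in a different buffer
print(out_of_place.data_ptr() == ptr_before)

# The in-place result lives in the original buffer
print(in_place.data_ptr() == ptr_before)
```

After the in-place call, `float_t` itself holds the square roots, so it matches `out_of_place` element for element.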