In [1]:
import warnings
warnings.filterwarnings('ignore')

# Day 18 - It starts with a tensor

## The tensor API

* Most operations in the `torch` module can also be called as methods of a tensor object, with identical results

In [2]:
import torch

a = torch.ones(3, 2)
a_t = torch.transpose(a, 0, 1)

a.shape, a_t.shape

(torch.Size([3, 2]), torch.Size([2, 3]))

* This is equivalent to the following:

In [3]:
a = torch.ones(3, 2)
a_t = a.transpose(0, 1)

a.shape, a_t.shape

(torch.Size([3, 2]), torch.Size([2, 3]))

* The [online docs](http://pytorch.org/docs) are exhaustive, and group the operations into categories:
* Creation (`zeros`, `ones`, ...)
* Indexing, slicing, mutating (`transpose`, ...)
* Math
    * Pointwise
    * Reduction
    * Comparison
    * Spectral (frequencies)
    * Other special functions
    * BLAS and LAPACK, standardized linear algebra operations
* Random sampling
* Serialization
* Parallelism (setting parameters, e.g. `set_num_threads`)

## Tensors: Scenic views of storage

* The values are stored in a `torch.Storage`, which holds a 1D, contiguous chunk of memory
* Actual `torch.Tensor`s are views over this storage
* Each tensor can have different offsets and per-dimension strides

### Indexing into storage

In [4]:
points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])
points.storage() # This is deprecated, and untyped_storage() should be used instead

 4.0
 1.0
 5.0
 3.0
 2.0
 1.0
[torch.storage.TypedStorage(dtype=torch.float32, device=cpu) of size 6]

In [5]:
points_storage = points.storage()
points_storage[0]

4.0

In [6]:
points.storage()[1]

1.0

* Of course, changing the underlying storage also changes the contents of every tensor that views it

In [7]:
points_storage[0] = 2.0
points

tensor([[2., 1.],
        [5., 3.],
        [2., 1.]])

In [8]:
bools = torch.tensor([[True, False], [False, False], [False, True], [True, True]])
bools.untyped_storage() # <- Each bool takes just one byte!

 1
 0
 0
 0
 0
 1
 1
 1
[torch.storage.UntypedStorage(device=cpu) of size 8]
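
* The one-byte-per-element observation can be checked directly. This is a minimal sketch, assuming a recent PyTorch release in which `untyped_storage()` is available; the variable names are only illustrative

```python
import torch

bools = torch.tensor([[True, False], [False, False], [False, True], [True, True]])
floats = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])

# element_size() reports how many bytes one element of the tensor's dtype takes
bool_bytes = bools.element_size()
float_bytes = floats.element_size()

# The untyped storage is a raw byte buffer, so its size is counted in bytes
bool_storage_bytes = bools.untyped_storage().nbytes()
float_storage_bytes = floats.untyped_storage().nbytes()

print(bool_bytes, bool_storage_bytes)
print(float_bytes, float_storage_bytes)
```

* For these freshly created tensors, the byte count of the storage is the number of elements times the element size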

### Modifying stored values: In-place operations

* Some operations are only available as methods on `Tensor` objects
* These are identified by their trailing underscore, like `zero_()`
* They are *in-place* operations, modifying the underlying data
* Thus, they do not create and return a new tensor

In [9]:
a = torch.ones(3, 2)
a

tensor([[1., 1.],
        [1., 1.],
        [1., 1.]])

In [10]:
a.zero_()
a

tensor([[0., 0.],
        [0., 0.],
        [0., 0.]])
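
* To contrast an in-place method with its out-of-place counterpart, here is a small sketch using `add` and `add_`, chosen only as an illustrative pair

```python
import torch

a = torch.ones(3, 2)

# Out-of-place: returns a new tensor with its own memory, `a` is left untouched
b = a.add(1.0)
a_unchanged = torch.equal(a, torch.ones(3, 2))

# In-place (trailing underscore): modifies `a` and returns that same tensor
c = a.add_(1.0)
same_object = c is a

print(a_unchanged, same_object)
print(a)
```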

## Tensor metadata: Size, offset, and stride

* A tensor is fully defined by storage, size, offset and stride
* Size is a tuple, indicating the number of elements along each dimension
* Offset is the index of the tensor's first element in the underlying storage
* Stride is the number of storage elements that need to be skipped to move one step along each dimension
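
* For a freshly created tensor, the stride follows from the size alone. The helper below is a pure-Python sketch of that relationship; `contiguous_strides` is a hypothetical name and not part of PyTorch

```python
def contiguous_strides(size):
    """Return the strides of a row-major tensor with the given size."""
    strides = []
    step = 1
    # Walk the dimensions from right to left: the rightmost one moves by 1 element,
    # each dimension to its left moves by the product of the sizes to its right
    for dim_size in reversed(size):
        strides.append(step)
        step *= dim_size
    return tuple(reversed(strides))


print(contiguous_strides((3, 2)))     # (2, 1)
print(contiguous_strides((3, 4, 5)))  # (20, 5, 1)
```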

### Views of another tensor's storage

* To access the second point, the offset has to be 2, as the first two elements of the storage have to be skipped

In [11]:
points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])
second_point = points[1]
second_point.storage_offset()

2

In [12]:
second_point.size(), second_point.shape

(torch.Size([2]), torch.Size([2]))

* Two elements have to be skipped to get to the next point
* One element has to be skipped to get to the next coordinate of a point

In [13]:
points.stride()

(2, 1)

* Accessing the element at `i, j` retrieves the storage element at index `storage_offset + i * stride[0] + j * stride[1]`
* Defining a tensor like this makes many operations cheap, by simply allocating a new tensor with different size, offset, and stride, viewing the same underlying memory
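
* The index formula can be tried out on a plain Python list standing in for the storage. This is a sketch; `storage_index` is a hypothetical helper, not a PyTorch function

```python
def storage_index(offset, stride, index):
    """Map a multi-dimensional index to a position in the flat storage."""
    return offset + sum(i * s for i, s in zip(index, stride))


# Flat storage of the points tensor used in these notes
storage = [4.0, 1.0, 5.0, 3.0, 2.0, 1.0]

# points has offset 0 and stride (2, 1): element [1, 0] is the x of the second point
second_x = storage[storage_index(0, (2, 1), (1, 0))]

# points[1] is a 1D view with offset 2 and stride (1,): element [1] is its y
second_y = storage[storage_index(2, (1,), (1,))]

print(second_x, second_y)  # 5.0 3.0
```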

In [14]:
second_point = points[1]
second_point.size(), second_point.storage_offset(), second_point.stride()

(torch.Size([2]), 2, (1,))

* Sometimes, we want to modify a subtensor without changing the original data
* In order to do this, we can clone the tensor

In [15]:
points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])
second_point = points[1].clone()
second_point[0] = 10.0
points

tensor([[4., 1.],
        [5., 3.],
        [2., 1.]])
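
* For comparison, a short sketch that writes through a plain view and through a clone side by side; the variable names are illustrative

```python
import torch

points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])

# Without clone(): the row is a view, so the write reaches the original tensor
view_row = points[1]
view_row[0] = 10.0
changed_by_view = points[1, 0].item()

# With clone(): the row owns a separate copy, so the original stays as it was
copy_row = points[2].clone()
copy_row[0] = 20.0
untouched = points[2, 0].item()

print(changed_by_view, untouched)
```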

### Transposing without copying

* The `t` function is a shorthand for `transpose`-ing a two-dimensional tensor
* This allows us, for example, to turn our points array from one where each row represents a point, to one where each column represents a point

In [16]:
points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])
points

tensor([[4., 1.],
        [5., 3.],
        [2., 1.]])

In [17]:
points_t = points.t()
points_t

tensor([[4., 5., 2.],
        [1., 3., 1.]])

* These share the same untyped storage, but differ in shape and stride

In [18]:
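# Compare the memory addresses of the storages; id() would only compare temporary Python wrapper objects
points.untyped_storage().data_ptr() == points_t.untyped_storage().data_ptr()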

True

In [19]:
points.stride(), points_t.stride()

((2, 1), (1, 2))

* Transposing simply swaps the two entries of the tensor's shape and stride, leaving the storage untouched
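
* This can be mimicked without PyTorch by reading a flat list through swapped size and stride. A pure-Python sketch; `read_2d` is a hypothetical helper

```python
def read_2d(storage, size, stride, offset=0):
    """Materialize a 2D strided view of a flat list as nested lists."""
    return [
        [storage[offset + i * stride[0] + j * stride[1]] for j in range(size[1])]
        for i in range(size[0])
    ]


storage = [4.0, 1.0, 5.0, 3.0, 2.0, 1.0]
size, stride = (3, 2), (2, 1)

points = read_2d(storage, size, stride)

# Transposing: reverse the two entries of size and stride, keep the same storage
points_t = read_2d(storage, size[::-1], stride[::-1])

print(points)    # [[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]]
print(points_t)  # [[4.0, 5.0, 2.0], [1.0, 3.0, 1.0]]
```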

### Transposing in higher dimensions

* To transpose in higher dimensions, we just have to specify the dimensions along which to flip the shape and stride

In [20]:
some_t = torch.ones(3, 4, 5)
transpose_t = some_t.transpose(0, 2)
some_t.shape, transpose_t.shape

(torch.Size([3, 4, 5]), torch.Size([5, 4, 3]))

In [21]:
some_t.stride(), transpose_t.stride()

((20, 5, 1), (1, 5, 20))
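
* The same swap works for any pair of dimensions. A pure-Python sketch that operates on the metadata only; `transpose_metadata` is a hypothetical name

```python
def transpose_metadata(size, stride, dim0, dim1):
    """Swap the entries for dim0 and dim1 in both size and stride."""
    size, stride = list(size), list(stride)
    size[dim0], size[dim1] = size[dim1], size[dim0]
    stride[dim0], stride[dim1] = stride[dim1], stride[dim0]
    return tuple(size), tuple(stride)


new_size, new_stride = transpose_metadata((3, 4, 5), (20, 5, 1), 0, 2)
print(new_size, new_stride)  # (5, 4, 3) (1, 5, 20)
```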

* A tensor whose values are laid out in storage starting from the rightmost dimension onward (for a 2D tensor, moving along rows) is called `contiguous`, and can be visited efficiently element by element
* Visiting elements in storage order improves data locality, which can benefit performance

### Contiguous tensors

* Some operations work only on contiguous tensors
* Calling one of these on a non-contiguous tensor raises an informative exception; calling `contiguous` first costs nothing if the tensor already is contiguous

In [22]:
points.is_contiguous(), points_t.is_contiguous()

(True, False)
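
* Whether a tensor is contiguous can be decided from its size and stride alone. The function below is a simplified pure-Python sketch of such a check, not PyTorch's actual implementation; it ignores dimensions of size 1, whose stride never matters

```python
def is_contiguous(size, stride):
    """Check whether size and stride describe a row-major layout."""
    expected = 1
    # Walk from the rightmost dimension to the left, tracking the stride
    # a row-major layout would need at each dimension
    for dim_size, dim_stride in zip(reversed(size), reversed(stride)):
        if dim_size != 1:
            if dim_stride != expected:
                return False
            expected *= dim_size
    return True


print(is_contiguous((3, 2), (2, 1)))  # True
print(is_contiguous((2, 3), (1, 2)))  # False
```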

* To obtain a contiguous tensor from a non-contiguous one, the `contiguous` method returns a new tensor with the same values, but with a reshuffled storage and a stride that matches it

In [23]:
points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])
points_t = points.t()
points_t, points_t.stride(), points_t.storage()

(tensor([[4., 5., 2.],
         [1., 3., 1.]]),
 (1, 2),
  4.0
  1.0
  5.0
  3.0
  2.0
  1.0
 [torch.storage.TypedStorage(dtype=torch.float32, device=cpu) of size 6])

In [24]:
points_t_cont = points_t.contiguous()
points_t_cont, points_t_cont.stride(), points_t_cont.storage()

(tensor([[4., 5., 2.],
         [1., 3., 1.]]),
 (3, 1),
  4.0
  5.0
  2.0
  1.0
  3.0
  1.0
 [torch.storage.TypedStorage(dtype=torch.float32, device=cpu) of size 6])

* Essentially, transposing only changes the view, while making a transpose `contiguous` copies the values into a new storage, laid out row by row for the transposed shape
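
* The reshuffling can be reproduced on a plain Python list. This is a pure-Python sketch under the same storage model as above; `make_contiguous` is a hypothetical helper and not how PyTorch implements it

```python
from itertools import product


def make_contiguous(storage, size, stride, offset=0):
    """Copy a strided view into a new row-major storage; return it with its stride."""
    # product() varies the last index fastest, which is exactly row-major order
    new_storage = [
        storage[offset + sum(i * s for i, s in zip(index, stride))]
        for index in product(*(range(n) for n in size))
    ]
    # Row-major strides for the new storage
    new_stride = []
    step = 1
    for n in reversed(size):
        new_stride.append(step)
        step *= n
    return new_storage, tuple(reversed(new_stride))


storage = [4.0, 1.0, 5.0, 3.0, 2.0, 1.0]

# The transposed view: size (2, 3), stride (1, 2)
new_storage, new_stride = make_contiguous(storage, (2, 3), (1, 2))
print(new_storage)  # [4.0, 5.0, 2.0, 1.0, 3.0, 1.0]
print(new_stride)   # (3, 1)
```

* The original list is left untouched, mirroring how `points` keeps its own storage after `points_t.contiguous()`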