# 2. Introduction to tensors

Loosely based on [Deep Learning with PyTorch, Eli Stevens, Luca Antiga, and Thomas Viehmann](https://www.manning.com/books/deep-learning-with-pytorch)

In [None]:
%%HTML
<style>
th {
  font-size: 24px
}
td {
  font-size: 16px
}
</style>

In [None]:
import torch
import helper
from matplotlib import pyplot as plt
import numpy as np
import seaborn as sns
sns.set_theme(style="ticks")

## Core concepts of this section

1. A `Tensor` is a `View` onto a `Storage`
2. `contiguous` memory layout enables fast computations
3. `broadcasting`: expand Tensor dimensions as needed

## Fundamentals
### Contrast to python list

<!-- ![](../img/memory.png "src: ") -->
<div align="center">
    <img src="../img/memory.svg" width="1200px" alt="in pytorch, a tensor refers to numbers in memory that are all next to each other">
</div>

    
| entity | plain Python | PyTorch |
|:-------|:------------:|:------:|
| numbers | **boxed**: objects with reference counting | 32-bit numbers |
| lists | sequential (1D) collections of pointers to Python objects | **adjacent entries in memory**: optimized for computational operations |
| interpreter | slow list and math operations | fast: operations run in compiled code |


### Instantiation

The default dtype for floating-point tensors at instantiation is `torch.float32`. Tensors created from Python integers default to `torch.int64`.

In [None]:
a = torch.ones(3); print(a, a.dtype)
b = torch.zeros((3, 2)).short(); print(b)
c = torch.tensor([1.,2.,3.], dtype=torch.double); print(c)

In [None]:
torch.tensor??

### Tensors and storages

* the `torch.Storage` is where the numbers actually are
* A `torch.Tensor` is a view onto a *torch.Storage*


In [None]:
a = torch.tensor([1,2,3,4,5,6])
b = a.reshape((3,2))
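# storage() returns a new Python wrapper object on every call, so id() is not
# a reliable check. Compare the address of the underlying memory instead.
assert a.storage().data_ptr() == b.storage().data_ptr()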

* layout of the storage is always *1D*
* hence, changing the value in the storage changes the values of all views (i.e. torch.Tensor) that refer to the same storage 
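
To make the shared storage concrete, here is a minimal sketch that writes through one view and reads the change through another. The variable `same_memory` is just an illustrative name, and the example assumes `reshape` returns a view, which holds for a contiguous tensor like this one.

```python
import torch

a = torch.tensor([1, 2, 3, 4, 5, 6])
b = a.reshape((3, 2))  # a view for contiguous input: no data is copied

# Writing through one view is visible through the other,
# because both read from the same underlying storage.
b[0, 0] = 100
print(a.tolist())  # [100, 2, 3, 4, 5, 6]

# Both tensors start at the same memory address.
same_memory = a.data_ptr() == b.data_ptr()
print(same_memory)  # True
```

If an independent copy is needed, `a.clone()` allocates a new storage.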

### Size, storage offset, and strides

<div align="center">
    <img src="../img/tensor.svg" width="1200px" alt="Meaning of size, offset and stride">
</div>

* A Tensor is a view on a storage that is defined by its
  * **size:** `t.size()` / `t.shape`
  * **storage offset:** `t.storage_offset()`
  * **stride:** `t.stride()`
* the **stride** tells how many elements one needs to skip in the storage to get to the next value along that dimension
* to get `t[i,j]`, read the storage element at index `storage_offset + i * stride[0] + j * stride[1]`
* this makes some tensor operations very cheap, because a new tensor has the same storage but different values for size, offset and stride

In [None]:
a = torch.tensor([[1,2,3], [4,5,6]])
print(f"a.size: {a.size()}")
print(f"a.storage_offset: {a.storage_offset()}")
print(f"a.stride: {a.stride()}")

In [None]:
b = a[1]
print(f"b.size: {b.size()}")
print(f"b.storage_offset: {b.storage_offset()}")
print(f"b.stride: {b.stride()}")
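
The index formula can be checked by hand. The sketch below uses a hypothetical helper `lookup` (not part of PyTorch) and a flat Python list standing in for the storage; it assumes the base tensor is contiguous with offset 0, so that flattening it reproduces the storage order.

```python
import torch

a = torch.tensor([[1, 2, 3], [4, 5, 6]])
v = a[:, 1:]  # view on the last two columns, no data is copied

# Flat copy of the storage contents, only for illustration.
flat = a.flatten().tolist()  # [1, 2, 3, 4, 5, 6]

def lookup(t, flat_storage, i, j):
    """Hypothetical helper: read t[i, j] via storage offset and strides."""
    index = t.storage_offset() + i * t.stride(0) + j * t.stride(1)
    return flat_storage[index]

print(v.size(), v.storage_offset(), v.stride())  # torch.Size([2, 2]) 1 (3, 1)

manual = lookup(v, flat, 1, 0)  # 1 + 1 * 3 + 0 * 1 = 4 -> flat[4]
print(manual, v[1, 0].item())   # 5 5
```

The view `v` only differs from `a` in its size and offset; the strides are inherited unchanged.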

#### Transposing a tensor

* the transpose just swaps entries in size and stride

<div align="center">
    <img src="../img/transpose.svg" width="1200px" alt="Transpose explained">
</div>
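
A short sketch of the swap, assuming a small 2x3 tensor; `a_t` and `shares_memory` are illustrative names.

```python
import torch

a = torch.tensor([[1, 2, 3], [4, 5, 6]])
a_t = a.t()

# Size and stride entries are swapped; the storage is untouched.
print(a.size(), a.stride())      # torch.Size([2, 3]) (3, 1)
print(a_t.size(), a_t.stride())  # torch.Size([3, 2]) (1, 3)

# The transpose is a view: both tensors start at the same memory address.
shares_memory = a.data_ptr() == a_t.data_ptr()
print(shares_memory)  # True
```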


#### Contiguous

* A tensor whose values are laid out in the storage starting from the rightmost dimension onward is **contiguous**
  * e.g. 2D tensor:
    * `t.size() # torch.Size([#rows, #columns])`
    * moving along rows (i.e. fix row, go from one column to the next) is equivalent to going through storage one by one
* this data locality improves performance

In [None]:
a = torch.tensor([[1,2,3], [4,5,6]])
assert a.is_contiguous()

In [None]:
b = a.t()
assert not b.is_contiguous()

In [None]:
c = b.contiguous()
assert c.is_contiguous()
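
As a rough illustration of what `contiguous()` does for a non-contiguous input: it appears to allocate a new storage and rearrange the values so that the rightmost dimension is adjacent in memory again. The names `copied` and `values_equal` are only used for this sketch.

```python
import torch

a = torch.tensor([[1, 2, 3], [4, 5, 6]])
b = a.t()           # view: same storage, swapped strides
c = b.contiguous()  # copy: new storage, values reordered

print(b.stride(), c.stride())  # (1, 3) (2, 1)
print(c.flatten().tolist())    # [1, 4, 2, 5, 3, 6]

copied = c.data_ptr() != a.data_ptr()  # c no longer shares memory with a
values_equal = torch.equal(b, c)       # but holds the same values as b
print(copied, values_equal)  # True True
```

Calling `contiguous()` on a tensor that is already contiguous returns the tensor itself without copying.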

### Numeric types

* `torch.floatXX`: `torch.float32` (alias `torch.float`), `torch.float64` (alias `torch.double`), `torch.float16` (alias `torch.half`)
* `torch.intXX`: 8, 16, 32, 64
* `torch.uint8`: torch.ByteTensor
* `torch.Tensor`: equivalent to torch.FloatTensor
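
The following sketch shows how these dtypes turn up in practice and how to convert between them. It assumes the global default dtype has not been changed with `torch.set_default_dtype`.

```python
import torch

x = torch.tensor([1.5, 2.5])  # Python floats give float32
y = torch.tensor([1, 2])      # Python ints give int64
print(x.dtype, y.dtype)       # torch.float32 torch.int64

# Convert with a shorthand method or with .to(dtype)
half = x.half()
double = x.to(torch.float64)
byte = torch.zeros(2, dtype=torch.uint8)
print(half.dtype, double.dtype, byte.dtype)  # torch.float16 torch.float64 torch.uint8

# Bytes per element differ with the dtype
print(x.element_size(), double.element_size())  # 4 8
```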


## Exercise 1:

Create a tensor `a` from `list(range(9))`. Predict then check what the size, offset, and strides are.

In [None]:
from helper import test_attributes

# TODO define tensor
a = None
test_attributes(a)

## Exercise 2:

Create a tensor `b = a.view(3, 3)`. What is the value of `b[1,1]`?

In [None]:
# TODO define tensor
b = None 

b[1,1]

## Exercise 3:

Create a tensor `c = b[1:,1:]`. Predict then check what the size, offset, and strides are.

In [None]:
# TODO define tensor
c = None

test_attributes(c)

# Indexing and Broadcasting

## Indexing

* similar to [numpy indexing](https://numpy.org/devdocs/user/basics.indexing.html), e.g. `points[1:, 0]`: all but first rows, first column
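
A few basic indexing patterns, sketched on a made-up `points` tensor (the values are arbitrary):

```python
import torch

points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])

print(points[0].tolist())      # first row: [4.0, 1.0]
print(points[1:, 0].tolist())  # all but first row, first column: [5.0, 2.0]
print(points[:, -1].tolist())  # all rows, last column: [1.0, 3.0, 1.0]
```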

#### Tips and tricks

In [None]:
# Pairwise indexing works
t = torch.tensor(range(1, 10)).reshape(3, -1)
diagonal = t[range(3), range(3)]
diagonal

In [None]:
# Inject additional dimensions with indexing

t = torch.rand((3, 64, 64))

# Index with `None` at second dim to `unsqueeze`.
assert t[:, None].shape == torch.Size([3, 1, 64, 64])

# Do it multiple times
assert t[:, None, : , None].shape == torch.Size([3, 1, 64, 1, 64])

# Can also use ellipsis
assert t[..., None].shape == torch.Size([3, 64, 64, 1])

## Exercise 4: 

Get the diagonal elements of `torch.rand(3, 3)` by reshaping it into a 1D tensor and taking every fourth element, starting from the first.

In [None]:
t = torch.rand(3,3)
# TODO: Calculate actual tensor
diag_actual = None

In [None]:
from helper import test_indexing
test_indexing(t, diag_actual)

## Broadcasting

Look at the examples below and think about why we can multiply two tensors of different shapes and still get the result one would expect.

In [None]:
a = torch.tensor([
    3
])
b = torch.tensor([
    1, 2, 3
])
torch.allclose(a*b, torch.tensor([
    3, 6, 9
]))

In [None]:
a = torch.tensor([
    [1, 2],
    [3, 4]
])
b = torch.tensor([
    1, 2
])
torch.allclose(a*b, torch.tensor([
    [1, 4],
    [3, 8]
]))

The answer is that PyTorch implicitly *expands* the shapes of the tensors so that they match and the operation can be performed element-wise.
&rarr; This is called **broadcasting**.

### How is broadcasting done?

1. Compare the dimensions of all tensors, starting from the trailing one.
2. If the dims are the same, do nothing.
3. If one dim is 1 (or missing), expand it to match the other dim.
4. Otherwise, the shapes are incompatible and PyTorch raises a `RuntimeError`.

**Note:** When broadcasting, PyTorch does not actually need to expand the dimensions of a tensor in memory in order to perform efficient tensor operations.

```
Example 1
[a]:    3 x 64 x 64
[b]:              1
[a*b]:  3 x 64 x 64

Example 2
[a]:    3 x  1 x 64
[b]:    1 x 64 x  1
[a*b]:  3 x 64 x 64
```
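
The rules and the note about memory can be tried out on shapes other than the two examples. This is a minimal sketch with arbitrarily chosen shapes; `b_expanded` and `failed` are illustrative names. It uses `Tensor.expand` to make visible what broadcasting does conceptually: expanded dimensions get stride 0, so the same memory is read repeatedly instead of being copied.

```python
import torch

a = torch.ones(2, 1, 3)
b = torch.ones(4, 1)

# Align from the trailing dim: (2, 1, 3) vs (_, 4, 1) -> (2, 4, 3)
print((a * b).shape)  # torch.Size([2, 4, 3])

# expand returns a view with stride 0 along the expanded dims: no copy
b_expanded = b.expand(2, 4, 3)
print(b_expanded.stride())                   # (0, 1, 0)
print(b_expanded.data_ptr() == b.data_ptr())  # True

# Incompatible trailing dims (3 vs 4) cannot be broadcast
try:
    torch.ones(2, 3) * torch.ones(4)
    failed = False
except RuntimeError:
    failed = True
print(failed)  # True
```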

## Exercise 5 - Broadcasting: 

Write down the shapes of the tensors in the examples and convince yourself that the output shape is as expected.

In [None]:
# TODO: define tensors from Example 1 and check output shape

In [None]:
# TODO: define tensors from Example 2 and check output shape