<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Creating-and-Copying-Tensors" data-toc-modified-id="Creating-and-Copying-Tensors-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Creating and Copying Tensors</a></span></li><li><span><a href="#Tensor-Shapes" data-toc-modified-id="Tensor-Shapes-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Tensor Shapes</a></span></li><li><span><a href="#Broadcasting" data-toc-modified-id="Broadcasting-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Broadcasting</a></span></li><li><span><a href="#Playground" data-toc-modified-id="Playground-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Playground</a></span></li></ul></div>

In [None]:
import torch
from torch import Tensor
from fastcore.test import *

**Helpers**

In [None]:
class T(Tensor): pass

In [None]:
def pnl(*args): print(*args, sep='\n')

# Creating and Copying Tensors

**torch.empty(shape)**

Create a tensor of `shape` from uninitialized memory; it will be filled with whatever garbage was already there, but skipping initialization makes it the fastest way to create a tensor. Like `torch.zeros` and `torch.rand`, it returns a float tensor (the default dtype) unless a `dtype` is passed.

In [None]:
torch.empty((2,3,4))

tensor([[[3.5505e-09, 4.5553e-41, 3.5505e-09, 4.5553e-41],
         [0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00],
         [0.0000e+00, 1.0800e-05, 1.8788e+31, 1.7220e+22]],

        [[1.9152e+23, 1.0432e+21, 2.5639e-09, 1.6691e-07],
         [2.0314e+20, 1.3423e-05, 1.3075e+22, 5.3483e+22],
         [2.5100e-18, 1.9421e+31, 2.7491e+20, 6.1949e-04]]])

In [None]:
test_eq(torch.empty(0), torch.Tensor([]))

**torch.zeros(shape)**

Create a float tensor of `shape` filled with zeros.

In [None]:
torch.zeros(1,2,3)

tensor([[[0., 0., 0.],
         [0., 0., 0.]]])

**torch.rand(shape)**

Create a float tensor of `shape` with values drawn uniformly from [0, 1).

In [None]:
torch.rand(2,2)

tensor([[0.7242, 0.1377],
        [0.1792, 0.3165]])

**Tensor[idx]**  *(aka: slicing on indices)*

In [None]:
t = torch.rand(2,2)

pnl(t,
    t[0,0],
    t[:,0],
    t[1,:])

tensor([[0.2705, 0.3561],
        [0.4726, 0.6325]])
tensor(0.2705)
tensor([0.2705, 0.4726])
tensor([0.4726, 0.6325])


**Tensor.repeat(\*sizes)**  *(number of repeats per dim)*

In [None]:
t = torch.rand(2,2)
t.repeat(1,1)

tensor([[0.4580, 0.8005],
        [0.8964, 0.3002]])

In [None]:
t.repeat(2,3)

tensor([[0.4580, 0.8005, 0.4580, 0.8005, 0.4580, 0.8005],
        [0.8964, 0.3002, 0.8964, 0.3002, 0.8964, 0.3002],
        [0.4580, 0.8005, 0.4580, 0.8005, 0.4580, 0.8005],
        [0.8964, 0.3002, 0.8964, 0.3002, 0.8964, 0.3002]])

In [None]:
t.repeat(1,1,1,2,3)

tensor([[[[[0.4580, 0.8005, 0.4580, 0.8005, 0.4580, 0.8005],
           [0.8964, 0.3002, 0.8964, 0.3002, 0.8964, 0.3002],
           [0.4580, 0.8005, 0.4580, 0.8005, 0.4580, 0.8005],
           [0.8964, 0.3002, 0.8964, 0.3002, 0.8964, 0.3002]]]]])
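A minimal sketch of how the arguments map to the output shape, assuming the behavior shown in the cells above: each argument is a repeat count for one dimension, and extra leading counts are matched against new leading dimensions of size 1. The tensor values and variable names here are illustrative only.

```python
import torch

t = torch.arange(4.).view(2, 2)

# One repeat count per dim: rows are tiled 2x, cols are tiled 3x.
r = t.repeat(2, 3)
print(r.shape)   # torch.Size([4, 6])

# More counts than dims: the extra leading counts create new leading dims.
r5 = t.repeat(1, 1, 1, 2, 3)
print(r5.shape)  # torch.Size([1, 1, 1, 4, 6])

# repeat copies the data, so writing to the result leaves `t` untouched.
r[0, 0] = 100.
print(t[0, 0].item())  # 0.0
```

Unlike broadcasting (covered later), `repeat` really does allocate memory for every copy.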

**torch.arange(num_of_elements)**

In [None]:
torch.arange(4.)

tensor([0., 1., 2., 3.])
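A quick check of which dtype each creation function in this section returns. This sketch assumes the default dtype has not been changed with `torch.set_default_dtype`.

```python
import torch

# Factory functions that take only a shape use the default dtype.
print(torch.empty(2, 3).dtype)  # torch.float32
print(torch.zeros(2, 3).dtype)  # torch.float32
print(torch.rand(2, 3).dtype)   # torch.float32

# arange infers its dtype from its arguments.
print(torch.arange(4).dtype)    # torch.int64
print(torch.arange(4.).dtype)   # torch.float32
```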

**Tensor.view(shape)**

Create a view of a tensor with a new shape. The view shares its data (and memory) with the underlying tensor, which avoids extra data copying and makes reshaping cheap.

In [None]:
torch.arange(6).view(2,3)

tensor([[0, 1, 2],
        [3, 4, 5]])

In [None]:
torch.arange(6).view(1,1,2,3,1)

tensor([[[[[0],
           [1],
           [2]],

          [[3],
           [4],
           [5]]]]])
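A small sketch of what "shares memory" means in practice: a write through the view shows up in the original tensor. The names `base` and `v` are illustrative.

```python
import torch

base = torch.arange(6)
v = base.view(2, 3)

# Writing through the view changes the underlying tensor.
v[0, 0] = 100
print(base.tolist())  # [100, 1, 2, 3, 4, 5]

# Both tensors start at the same memory address.
print(v.data_ptr() == base.data_ptr())  # True

# -1 asks view to infer that dim from the number of elements.
print(base.view(3, -1).shape)  # torch.Size([3, 2])
```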

**torch.squeeze(tensor)**

Remove dims of size 1.

In [None]:
torch.squeeze(T(1,2,1,1,3,1,4,5,1)).shape

torch.Size([2, 3, 4, 5])
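A short sketch of the optional `dim` argument, which limits `squeeze` to a single dimension, plus `torch.unsqueeze` going the other way. The shapes are chosen only for illustration.

```python
import torch

t = torch.zeros(1, 2, 1, 3)

print(torch.squeeze(t).shape)     # torch.Size([2, 3])

# With a dim, only that dim is removed, and only if its size is 1.
print(torch.squeeze(t, 0).shape)  # torch.Size([2, 1, 3])
print(torch.squeeze(t, 1).shape)  # torch.Size([1, 2, 1, 3]) (dim 1 has size 2, so no change)

# unsqueeze inserts a size-1 dim at the given position.
print(torch.unsqueeze(torch.squeeze(t), 0).shape)  # torch.Size([1, 2, 3])
```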

**Tensor.flatten()**

In [None]:
#todo
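One possible way to fill in this cell, as a sketch: `flatten` collapses all dims into one by default, and `start_dim` / `end_dim` restrict it to a range of dims.

```python
import torch

t = torch.arange(24).view(2, 3, 4)

print(t.flatten().shape)             # torch.Size([24])

# Flatten only a range of dims.
print(t.flatten(start_dim=1).shape)  # torch.Size([2, 12])
print(t.flatten(0, 1).shape)         # torch.Size([6, 4])
```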

**Tensor.unflatten(dim, sizes)**

Expand dimension `dim` into new dimensions `sizes`.

IDK if this still works?
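A sketch for checking this, assuming a reasonably recent PyTorch release in which `Tensor.unflatten` accepts an integer `dim` on ordinary (unnamed) tensors; older releases may behave differently.

```python
import torch

t = torch.arange(24).view(2, 12)

# Split dim 1 (size 12) into two dims of sizes 3 and 4.
u = t.unflatten(1, (3, 4))
print(u.shape)  # torch.Size([2, 3, 4])

# Flattening the same dims again recovers the original tensor.
print(torch.equal(u.flatten(start_dim=1), t))  # True
```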

**Tensor.split()**

In [None]:
#todo
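One possible way to fill in this cell, as a sketch: `split` takes either a single chunk size or a list of exact chunk sizes, plus the dim to split along.

```python
import torch

t = torch.arange(10)

# An int asks for chunks of that size; the last chunk may be smaller.
chunks = t.split(4)
print([c.tolist() for c in chunks])  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]

# A list gives the exact size of each chunk (it must sum to the dim's size).
a, b = t.view(2, 5).split([2, 3], dim=1)
print(a.shape, b.shape)  # torch.Size([2, 2]) torch.Size([2, 3])
```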

# Tensor Shapes

A tensor's shape looks like a tuple (or list, or vector) of integers. Ex: (5,2,3). Each entry is the size of one dimension, and the length of the tuple is the total number of dimensions (aka the tensor's rank).

To visualize the shape of a tensor, I basically visualize a spreadsheet workbook. I start at the "trailing" dimension (the right-most one), which is 3 in the example above. This is the number of "columns" on one sheet. The next number to the left is the number of rows. The next number is the number of sheets. And, if there were a number to the left of that one, I'd visualize it as the number of workbooks. (If there were one beyond that, I'd visualize it as the number of folders; from there it's nested folders all the way down.)

**Ex: visualizing a tensor of shape (2,2,2,3,4).**

From right (trailing dim) to left:
- 4 cols
- 3 rows
- 2 sheets
- 2 wbs
- 2 folders
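The same hierarchy can be walked with indexing, sketched here on a tensor of the same shape filled with consecutive integers (the variable names simply mirror the spreadsheet analogy). Each index added on the left peels off one level.

```python
import torch

t = torch.arange(2 * 2 * 2 * 3 * 4).view(2, 2, 2, 3, 4)

print(t.shape)  # torch.Size([2, 2, 2, 3, 4])
print(t.ndim)   # 5

folder = t[0]           # one folder: 2 workbooks
wb     = t[0, 1]        # one workbook: 2 sheets
sheet  = t[0, 1, 0]     # one sheet: 3 rows x 4 cols
row    = t[0, 1, 0, 2]  # one row: 4 cols

print(folder.shape, wb.shape, sheet.shape, row.shape)
# torch.Size([2, 2, 3, 4]) torch.Size([2, 3, 4]) torch.Size([3, 4]) torch.Size([4])
```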

In [None]:
                                # end of cols (4 per row)
T([[[[[ 3.5506e-09,  4.5553e-41,  3.5506e-09,  4.5553e-41],   # end of rows (3 per sheet)
      [ 0.0000e+00,  0.0000e+00,  0.0000e+00,  0.0000e+00], 
      [ 0.0000e+00,  0.0000e+00,  0.0000e+00,  0.0000e+00]],  # end of sheet (2 per wb)
                                                              
     [[-3.5106e-13,  3.0617e-41,  0.0000e+00,  0.0000e+00],
      [-3.6195e-13,  3.0617e-41, -3.6195e-13,  3.0617e-41],
      [-3.6195e-13,  3.0617e-41,  0.0000e+00,  0.0000e+00]]], # end of wb (2 per folder)
                                                              
                                                              
    [[[ 4.5553e-41,  3.2320e-12,  4.5552e-41,  1.6158e-09],
      [ 4.5553e-41,  1.6158e-09,  4.5553e-41,  1.4013e-45],
      [-3.6195e-13,  3.0617e-41, -3.6195e-13,  3.0617e-41]], 
      
     [[-3.6195e-13,  3.0617e-41,  1.4013e-45,  0.0000e+00],
      [-3.5587e-13,  3.0617e-41,  1.4013e-45,  0.0000e+00],
      [ 1.4013e-45,  0.0000e+00,  1.4013e-45,  0.0000e+00]]]], # end of folder (2 per tensor)
                                                               
                                                               
                                                               
   [[[[ 0.0000e+00,  0.0000e+00,  1.4013e-45,  0.0000e+00],
      [-1.9100e-04,  4.5552e-41,  1.4013e-45,  9.1834e-41],
      [ 1.4013e-45,  0.0000e+00,  1.4013e-45,  2.3511e-38]],
   
     [[ 3.5873e-43,  0.0000e+00,  3.5873e-43,  0.0000e+00],
      [-3.6196e-13,  3.0617e-41,  2.7185e-43,  0.0000e+00],
      [ 0.0000e+00,  0.0000e+00,  0.0000e+00,  0.0000e+00]]],
      
   
    [[[ 0.0000e+00,  0.0000e+00,  1.8788e+31,  1.7220e+22],
      [ 1.8704e+20,  1.7000e+22,  1.0311e-11,  6.7940e+22],
      [ 8.3106e+20,  1.0431e-08,  6.5562e-10,  1.0991e-05]],
  
     [[ 2.1065e-07,  1.0490e-08,  3.1369e+27,  7.0800e+31],
      [ 3.1095e-18,  1.8590e+34,  7.7767e+31,  7.1536e+22],
      [ 3.3803e-18,  1.9421e+31,  2.7491e+20,  6.1949e-04]]]]]) # end of tensor

tensor([[[[[ 3.5506e-09,  4.5553e-41,  3.5506e-09,  4.5553e-41],
           [ 0.0000e+00,  0.0000e+00,  0.0000e+00,  0.0000e+00],
           [ 0.0000e+00,  0.0000e+00,  0.0000e+00,  0.0000e+00]],

          [[-3.5106e-13,  3.0617e-41,  0.0000e+00,  0.0000e+00],
           [-3.6195e-13,  3.0617e-41, -3.6195e-13,  3.0617e-41],
           [-3.6195e-13,  3.0617e-41,  0.0000e+00,  0.0000e+00]]],


         [[[ 4.5553e-41,  3.2320e-12,  4.5552e-41,  1.6158e-09],
           [ 4.5553e-41,  1.6158e-09,  4.5553e-41,  1.4013e-45],
           [-3.6195e-13,  3.0617e-41, -3.6195e-13,  3.0617e-41]],

          [[-3.6195e-13,  3.0617e-41,  1.4013e-45,  0.0000e+00],
           [-3.5587e-13,  3.0617e-41,  1.4013e-45,  0.0000e+00],
           [ 1.4013e-45,  0.0000e+00,  1.4013e-45,  0.0000e+00]]]],



        [[[[ 0.0000e+00,  0.0000e+00,  1.4013e-45,  0.0000e+00],
           [-1.9100e-04,  4.5552e-41,  1.4013e-45,  9.1834e-41],
           [ 1.4013e-45,  0.0000e+00,  1.4013e-45,  2.3511e-38]],

         

# Broadcasting

References:
- [NumPy broadcasting](https://numpy.org/doc/stable/user/basics.broadcasting.html#module-numpy.doc.broadcasting)
- [PyTorch broadcasting](https://pytorch.org/docs/stable/notes/broadcasting.html) (mostly from this)

Broadcasting is the step in matrix algebra involving vectors/matrices/tensors of different sizes where you make copies of each operand until the shapes match, in order to do element-wise operations. For example, recall that adding a vector of shape 3x1 to a vector of shape 1x3 results in a matrix of shape 3x3. Each element in the result is created by adding one element from each of the input vectors.

`ex:
          |1|   |1+1 1+2 1+3|   |2 3 4|
|1 2 3| + |2| = |1+2 2+2 3+2| = |3 4 5|
          |3|   |1+3 2+3 3+3|   |4 5 6|`

Notice how each of the input vectors had to be copied three times to perform all of the element-wise operations. That copying is known as **broadcasting.**
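The same sum written in PyTorch, as a small sketch (the names `row` and `col` are illustrative):

```python
import torch

row = torch.tensor([[1, 2, 3]])      # shape (1, 3)
col = torch.tensor([[1], [2], [3]])  # shape (3, 1)

result = row + col                   # both operands are broadcast to shape (3, 3)
print(result.tolist())  # [[2, 3, 4], [3, 4, 5], [4, 5, 6]]
```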

From the PyTorch docs: "If a PyTorch operation supports broadcast, then its Tensor arguments can be automatically expanded to be of equal sizes *(without making copies of the data)*." (emphasis added).

Being able to do tensor maths with massive tensors and without having to make copies when broadcasting is a big deal.
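One way to see the "no copies" part is `Tensor.expand`, which returns a view with the broadcast shape. This is a sketch to illustrate the idea, not a description of what every broadcasting op does internally.

```python
import torch

col = torch.tensor([[1.], [2.], [3.]])  # shape (3, 1)

# expand returns a view with the larger shape; no data is copied.
wide = col.expand(3, 4)
print(wide.shape)     # torch.Size([3, 4])

# Stride 0 along the expanded dim: the same element is reread for every column.
print(wide.stride())  # (1, 0)

# Same starting address as the original tensor.
print(wide.data_ptr() == col.data_ptr())  # True
```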

Two tensors are "broadcastable" if:
- Each has at least one dimension.
- When iterating over the dimension sizes, starting at the trailing dimension, each pair of sizes is equal, one of them is 1, or one of them does not exist.

**Case 1:**
- Tensors have the same shape.
- Addition works; no broadcasting takes place (no copies needed).

In [None]:
T(2,2) + T(2,2)

tensor([[6.7333e+22, 1.7591e+22],
        [1.7184e+25, 4.3222e+27]])

**Case 2:**
- One of them has a trailing dimension of size 1, like Tensor(4,9,...,1).
- Works because 1||3 and 3||1 (|| now means "is compatible with")

In [None]:
T(1,3) + T(3,1)

tensor([[ 3.5501e-09,  3.5505e-09,  3.5501e-09],
        [-3.5675e-13,  7.6170e-41, -3.5282e-13],
        [ 3.5501e-09,  3.5505e-09,  3.5501e-09]])

**Case 3:**
- Missing dimensions.
- Works because 2||2, and 3||na.

In [None]:
T(3,2) + T(2)

tensor([[3.5501e-09, 7.6170e-41],
        [3.5505e-09, 4.5623e-41],
        [3.5505e-09, 4.5553e-41]])

**Case 4:**
- Diff shapes.
- Does not work: 2||1 and 1||na would be fine, but 3|!|2 at the trailing dimension.

In [None]:
T(1,2,3) + T(1,2)

RuntimeError: The size of tensor a (3) must match the size of tensor b (2) at non-singleton dimension 2

**Case 5:**
- Diff shapes.
- Works: 1||2, 3||1, 2||na, 1||na

In [None]:
T(1,2,3,1) + T(1,2)

tensor([[[[ 3.5501e-09, -3.5538e-13],
          [ 3.5505e-09,  7.6170e-41],
          [ 7.1009e-09,  3.5505e-09]],

         [[ 3.5505e-09,  9.1107e-41],
          [ 3.5505e-09,  4.5598e-41],
          [ 3.5505e-09,  4.5553e-41]]]])

**Overall Rules**

Given two non-scalar tensors, compare their dimension sizes pairwise, starting at the trailing (right-most) dimension and moving left. The tensors are broadcastable if, for every pair, the sizes are equal, one of them is 1, or one of them is missing.
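The rule can be sketched as a small function. `broadcastable` is a hypothetical helper written for this notebook, not part of PyTorch; it treats a missing dimension as size 1. `torch.broadcast_shapes` is the built-in that computes the resulting shape (or raises a `RuntimeError`).

```python
import torch
from itertools import zip_longest

def broadcastable(shape_a, shape_b):
    """Hypothetical helper: apply the right-to-left compatibility rule to two shapes."""
    # Walk both shapes from the trailing dim; a missing dim counts as size 1.
    for a, b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if a != b and a != 1 and b != 1:
            return False
    return True

print(broadcastable((1, 3), (3, 1)))        # True  (Case 2)
print(broadcastable((3, 2), (2,)))          # True  (Case 3)
print(broadcastable((1, 2, 3), (1, 2)))     # False (Case 4)
print(broadcastable((1, 2, 3, 1), (1, 2)))  # True  (Case 5)

# Built-in: returns the broadcast shape directly.
print(torch.broadcast_shapes((1, 2, 3, 1), (1, 2)))  # torch.Size([1, 2, 3, 2])
```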

# Playground