# Deep Dive Into Torch Tensors

This notebook goes a bit more into the details of `torch.Tensor` objects.

## Tensors

Each tensor has a type and a shape. Tensors are associated with `Storage` objects. A tensor is not a `Storage` object, but it contains one.

In [1]:
import torch
import numpy as np

x = torch.randn(2, 2)
print(torch.is_tensor(x))  # True
print(torch.is_storage(x)) # False
print(torch.is_storage(x.storage())) # True

True
False
True
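The relationship between a tensor and its storage can be illustrated with a view. This is a minimal sketch, assuming PyTorch 0.4 or later (for `.item()`); the names `x` and `y` are only illustrative. Two tensors with different shapes can sit on top of the same underlying memory.

```python
import torch

x = torch.zeros(2, 2)
y = x.view(4)   # a different shape over the same underlying memory

y[0] = 42.0     # writing through the view...
print(x[0, 0].item())                  # ...is visible in the original: 42.0
print(x.data_ptr() == y.data_ptr())    # True: both start at the same address
```

This is why operations such as `view` are cheap: no data is copied, only the shape metadata changes.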


### Type

The type of a tensor is generally decided when the tensor is instantiated. One can add flexibility by storing the desired tensor type in a `dtype` variable and casting with `.type()`, as shown below.

In [2]:
dtype = torch.FloatTensor
# If CUDA is available, one can use instead:
# dtype = torch.cuda.FloatTensor

x = torch.Tensor(2, 2).type(dtype)
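Note that `torch.Tensor(2, 2)` allocates uninitialized memory, so its values are arbitrary. In more recent releases the same flexibility is usually expressed with separate `dtype` and `device` arguments. The following is a sketch under the assumption that PyTorch 0.4 or later is installed; it is not the idiom used in the rest of this notebook.

```python
import torch

# Assumption: PyTorch 0.4 or later, where dtype and device are separate arguments.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create an initialized tensor with an explicit element type and device.
x = torch.zeros(2, 2, dtype=torch.float32, device=device)
print(x.dtype, x.device.type)

# Casting an existing tensor to another element type.
y = x.to(torch.float64)
print(y.dtype)
```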

### Indexing, Slicing, Joining, Mutating Ops

#### torch.cat

`torch.cat` concatenates a sequence of tensors along a given `dim`. Optionally one can specify the output tensor `out`. Remember:
- `dim=0` is equivalent to R's `rbind`.
- `dim=1` is equivalent to R's `cbind`.

In [82]:
x1, x2, x3 = torch.randn(2, 3), torch.randn(4, 3), torch.randn(2, 4)

print(torch.cat((x1, x2), dim=0).size()) # concatenate by row (same n. of columns)
print(torch.cat((x1, x3), dim=1).size()) # concatenate by column (same n. of rows)

torch.Size([6, 3])
torch.Size([2, 7])


#### torch.chunk

`torch.chunk` splits a tensor into a tuple of a given number of sub-tensors ("chunks") along a given axis.

In [83]:
x12 = torch.cat((x1, x2), dim=0)
print(torch.chunk(x12, chunks=3, dim=0))

(
 0.1816 -0.2602 -0.0569
 0.6215  0.4222  0.6585
[torch.FloatTensor of size 2x3]
, 
-1.3553 -0.2051 -0.2057
-0.2135  0.5906 -0.3909
[torch.FloatTensor of size 2x3]
, 
 1.2769  0.1044 -1.4429
 0.6632 -0.0868  0.5538
[torch.FloatTensor of size 2x3]
)


#### torch.gather

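`torch.gather` picks values along a given `dim` according to a `torch.LongTensor` of indices, and returns a tensor with the same size as the index tensor.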
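A minimal sketch of `torch.gather` on a 2D tensor, assuming PyTorch 0.4 or later (for `torch.tensor`). With `dim=1`, each output element is `out[i][j] = x[i][index[i][j]]`, so the index tensor chooses a column for every position in each row. The values below are made up for illustration.

```python
import torch

x = torch.tensor([[10., 11., 12.],
                  [20., 21., 22.]])

# For every row, the column to pick at each output position.
index = torch.tensor([[2, 0],
                      [1, 1]])

out = torch.gather(x, 1, index)
print(out.tolist())  # [[12.0, 10.0], [21.0, 21.0]]
```

The output has the same size as `index`, not as `x`.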

#### torch.index_select

`torch.index_select` selects elements of a tensor based on a `torch.LongTensor` of indices and a `dim`. For example:

In [84]:
x = torch.randn(3, 4)
indices = torch.LongTensor([0, 2])
print(torch.index_select(x, dim=0, index=indices)) # Select *rows* 0 and 2
print(torch.index_select(x, dim=1, index=indices)) # Select *columns* 0 and 2


-0.5316 -0.5159  0.9304 -0.1629
 1.2964 -1.1541 -0.1048 -1.0506
[torch.FloatTensor of size 2x4]


-0.5316  0.9304
-1.3817  1.3097
 1.2964 -0.1048
[torch.FloatTensor of size 3x2]
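As a cross-check, the same selections can be written with plain indexing. This is a small sketch on a deterministic tensor, assuming PyTorch 0.4 or later.

```python
import torch

x = torch.arange(12).view(3, 4)
indices = torch.tensor([0, 2])

rows = torch.index_select(x, 0, indices)   # rows 0 and 2
cols = torch.index_select(x, 1, indices)   # columns 0 and 2

print(torch.equal(rows, x[indices]))       # True
print(torch.equal(cols, x[:, indices]))    # True
```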



#### torch.masked_select

Given a tensor, we can create a boolean mask of class `torch.ByteTensor` (`torch.BoolTensor` in recent PyTorch releases) and use it to select elements. The selected elements are always returned as a 1D tensor, regardless of the shape of the input.

In [85]:
x = torch.randn(4, 3, 2)
mask = x.ge(0.5) # This returns a torch.ByteTensor
print(torch.masked_select(x, mask=mask).size())

torch.Size([7])
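With random data the number of selected elements changes at every run. The sketch below uses fixed, illustrative values so the result is predictable, and assumes PyTorch 0.4 or later.

```python
import torch

x = torch.tensor([[0.25, 1.0],
                  [0.75, 0.0]])
mask = x.ge(0.5)                        # True where x >= 0.5

selected = torch.masked_select(x, mask)
print(selected.tolist())                # [1.0, 0.75]
print(selected.dim())                   # 1: the 2D structure is lost
print(torch.equal(selected, x[mask]))   # True: indexing with the mask is equivalent
```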


#### torch.nonzero

`torch.nonzero` returns a tensor with the indices of all the nonzero elements. In the example below, we apply `torch.nonzero` to a 2D tensor, and the resulting `indices` contain the row and column of the nonzero elements.

In [86]:
x = torch.Tensor([[0, 1, 0], [0, 0, 0], [0, 1, 1]])
print(x)
indices = torch.nonzero(x)
print(indices)


 0  1  0
 0  0  0
 0  1  1
[torch.FloatTensor of size 3x3]


 0  1
 2  1
 2  2
[torch.LongTensor of size 3x2]
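The indices can be fed back into the tensor to recover the nonzero values. This is a sketch that rebuilds the same matrix, assuming PyTorch 0.4 or later; `rows`, `cols` and `values` are illustrative names.

```python
import torch

x = torch.tensor([[0., 1., 0.],
                  [0., 0., 0.],
                  [0., 1., 1.]])
indices = torch.nonzero(x)

# Each row of `indices` is a (row, column) pair.
rows, cols = indices[:, 0], indices[:, 1]
values = x[rows, cols]

print(indices.tolist())  # [[0, 1], [2, 1], [2, 2]]
print(values.tolist())   # [1.0, 1.0, 1.0]
```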



#### torch.split

`torch.split` is similar to `torch.chunk`, but instead of returning a given number of chunks, it splits the tensor into chunks of a given size. It allows for non-exact chunking: the last chunk will be smaller if the tensor size along the given `dim` is not divisible by `split_size`.
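A short sketch of the difference between the two functions, on a tensor with 5 rows. The chunk size is passed positionally because the keyword name of that argument has changed between PyTorch releases.

```python
import torch

x = torch.arange(10).view(5, 2)

# split: chunks of 2 rows each; the last one holds the remainder.
parts = torch.split(x, 2, dim=0)
print([p.size(0) for p in parts])    # [2, 2, 1]

# chunk: ask for 2 chunks; their size is derived from the tensor.
halves = torch.chunk(x, 2, dim=0)
print([h.size(0) for h in halves])   # [3, 2]
```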

#### torch.squeeze and torch.unsqueeze

`torch.squeeze` takes a tensor and returns a tensor with all the dimensions of size 1 removed.

In [87]:
x = torch.randn(2, 1, 3, 1)
print(x)
x_squeezed = torch.squeeze(x)


(0 ,0 ,.,.) = 
  0.0941
  0.5801
 -0.0761

(1 ,0 ,.,.) = 
 -0.0790
 -0.9265
  0.5988
[torch.FloatTensor of size 2x1x3x1]



In [88]:
print(x_squeezed)


 0.0941  0.5801 -0.0761
-0.0790 -0.9265  0.5988
[torch.FloatTensor of size 2x3]



If a `dim` is given, the tensor is "squeezed" only along that dimension.

In [89]:
print(x.squeeze(dim=1).size())

torch.Size([2, 3, 1])


`torch.unsqueeze` does the opposite, but always needs a `dim` to know where to add the dimension. Since `torch.nn` layers generally expect mini-batches as input, this command is useful for turning a single example into a one-example mini-batch.

In [90]:
print(x_squeezed.unsqueeze(dim=1))


(0 ,.,.) = 
  0.0941  0.5801 -0.0761

(1 ,.,.) = 
 -0.0790 -0.9265  0.5988
[torch.FloatTensor of size 2x1x3]
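The mini-batch use case can be sketched as follows. The `sample` tensor is a hypothetical single image with layout channels x height x width; the sizes are made up for illustration.

```python
import torch

sample = torch.randn(3, 32, 32)   # hypothetical single example (C x H x W)

batch = sample.unsqueeze(0)       # add a batch dimension in front
print(batch.size())               # torch.Size([1, 3, 32, 32])

restored = batch.squeeze(0)       # remove it again
print(torch.equal(restored, sample))  # True
```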



#### torch.stack

`torch.stack` is similar to `torch.cat`, but it joins the tensors along a new dimension instead of an existing one. It is less flexible, in that it expects all the tensors to have the same size. In the example below, `torch.cat` works, but `torch.stack` fails.

In [91]:
print(x1.size(), x2.size(), x3.size())

torch.cat((x1, x2), dim=0)

# torch.stack((x1, x2), dim=0) # RuntimeError

torch.Size([2, 3]) torch.Size([4, 3]) torch.Size([2, 4])



 0.1816 -0.2602 -0.0569
 0.6215  0.4222  0.6585
-1.3553 -0.2051 -0.2057
-0.2135  0.5906 -0.3909
 1.2769  0.1044 -1.4429
 0.6632 -0.0868  0.5538
[torch.FloatTensor of size 6x3]

In [92]:
x1, x2, x3 = torch.chunk(torch.randn(3, 2, 2), chunks=3, dim=0)
print(x1.size(), x2.size(), x3.size())
torch.stack((x1, x2, x3))

torch.Size([1, 2, 2]) torch.Size([1, 2, 2]) torch.Size([1, 2, 2])



(0 ,0 ,.,.) = 
  1.0579  1.1103
  1.3379 -0.7545

(1 ,0 ,.,.) = 
  0.3428  1.2521
  0.4236  0.1182

(2 ,0 ,.,.) = 
  0.3056 -0.2605
  0.5931 -2.6714
[torch.FloatTensor of size 3x1x2x2]

Note that we have retained the unit dimension in `dim=1`. We can easily get rid of it with `torch.squeeze`.
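The sketch below contrasts the two functions on same-sized tensors and shows the unit dimension being removed after stacking. The tensor names are illustrative.

```python
import torch

a, b = torch.zeros(2, 3), torch.ones(2, 3)

stacked = torch.stack((a, b), dim=0)   # new leading dimension
joined = torch.cat((a, b), dim=0)      # existing dimension grows
print(stacked.size())                  # torch.Size([2, 2, 3])
print(joined.size())                   # torch.Size([4, 3])

# Stacking chunks that kept a unit dimension, then squeezing it away.
pieces = torch.chunk(torch.randn(3, 2, 2), 3, dim=0)   # each piece is (1, 2, 2)
cleaned = torch.stack(pieces).squeeze(1)
print(cleaned.size())                  # torch.Size([3, 2, 2])
```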

#### torch.t and torch.transpose

These two functions are similar, but `torch.t` is more specialized: it expects a 2D tensor and transposes dimensions 0 and 1. `torch.transpose`, instead, takes a tensor with any number of dimensions and swaps dimensions `dim0` and `dim1`.

In [93]:
x = torch.randn(3, 2, 4)
print(x.transpose(dim0=1, dim1=2).size())

torch.Size([3, 4, 2])
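On a 2D tensor the two calls give the same result. The sketch below also shows that the transposed tensor shares memory with the original, so it is no longer contiguous until `contiguous()` makes a copy.

```python
import torch

m = torch.arange(6).view(2, 3)

print(m.t().size())                             # torch.Size([3, 2])
print(torch.equal(m.t(), m.transpose(0, 1)))    # True

mt = m.t()
print(mt.data_ptr() == m.data_ptr())            # True: same memory, different strides
print(mt.is_contiguous())                       # False
print(mt.contiguous().is_contiguous())          # True: contiguous() copies the data
```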


#### torch.unbind

`torch.unbind` removes a given dimension and returns a tuple of all slices along that dimension. The example below takes a (3, 2, 3) tensor and removes the last dimension, returning a tuple of three (3, 2) slices.

In [94]:
x = torch.randn(3, 2, 3)
print(torch.unbind(x, dim=2))

(
 1.8726  0.6956
-2.0518 -0.3820
 0.3543  0.1863
[torch.FloatTensor of size 3x2]
, 
-0.2550 -0.3574
 1.0613  3.0645
 1.0749 -2.3235
[torch.FloatTensor of size 3x2]
, 
 1.9762  0.4940
-2.5179 -0.0927
 0.5357  0.0243
[torch.FloatTensor of size 3x2]
)
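`torch.unbind` can be seen as the inverse of `torch.stack` along the same dimension. A brief sketch of that round trip:

```python
import torch

x = torch.randn(3, 2, 3)
slices = torch.unbind(x, dim=2)

print(len(slices))                       # 3
print(slices[0].size())                  # torch.Size([3, 2])
print(torch.equal(slices[1], x[:, :, 1]))                 # True: each slice is x[:, :, k]
print(torch.equal(torch.stack(slices, dim=2), x))         # True: stacking restores x
```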


## Random Sampling

### Serialization

### Parallelism

## Math Operations

## Torch Sparse

## Torch Storage

## Broadcasting Semantics

Two tensors are broadcastable if they satisfy the following properties:
1. Each tensor has at least one dimension.
2. When iterating over the dimension sizes, starting from the last one, each pair of sizes must satisfy one of the following:
    - They are equal.
    - One of them is equal to 1.
    - One of them does not exist.
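The two rules above can be written down as a small pure-Python check on shapes. This is only an illustrative sketch; `is_broadcastable` is a hypothetical helper, not a PyTorch function.

```python
def is_broadcastable(shape_a, shape_b):
    """Return True if two shapes satisfy the broadcasting rules (hypothetical helper)."""
    # Rule 1: each tensor has at least one dimension.
    if len(shape_a) == 0 or len(shape_b) == 0:
        return False
    # Rule 2: compare sizes starting from the last dimension.
    # zip stops at the shorter shape, which covers the "does not exist" case.
    for a, b in zip(reversed(shape_a), reversed(shape_b)):
        if a != b and a != 1 and b != 1:
            return False
    return True


print(is_broadcastable((4, 1), (4,)))       # True
print(is_broadcastable((2, 1, 2), (2, 2)))  # True
print(is_broadcastable((2, 3), (4, 3)))     # False: 2 and 4 differ and neither is 1
```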

Let's see a few cases, from the simplest to the most complex.

In [95]:
# Two tensors, one of which has dimension 1
x1, x2 = torch.randn(1), torch.randn(2, 2)
x1 + x2


 0.2257 -1.4464
 0.2828 -0.9714
[torch.FloatTensor of size 2x2]

In [106]:
# Two tensors with a different number of dimensions.
# You can line up the dimensions as
# (4, 1)
# (   4)
# so the size-1 dimension is expanded to 4.
x1 = torch.ones([4, 1])
x2 = torch.ones([4])
x1 + x2


 2  2  2  2
 2  2  2  2
 2  2  2  2
 2  2  2  2
[torch.FloatTensor of size 4x4]

In [96]:
# Two tensors with the same shape (no real broadcasting here)
x1, x2 = torch.randn(2, 2), torch.randn(2, 2)
x1 + x2


-0.0408 -0.2068
-0.6204  1.2971
[torch.FloatTensor of size 2x2]

In [97]:
# Two tensors, one of which has an extra dimension equal to one
# In this case you can line up the dimensions as
# (1, 2, 2)
# (   2, 2)
x3 = torch.randn(1, 2, 2)
x1 + x3


(0 ,.,.) = 
 -3.2766 -1.7948
  2.1794  1.4836
[torch.FloatTensor of size 1x2x2]

In [98]:
# But the position of the 1-dimension does not matter
x3 = torch.randn(2, 1, 2)
x1 + x3


(0 ,.,.) = 
 -2.5369 -1.7435
 -1.1534  1.0840

(1 ,.,.) = 
 -2.4881 -2.2367
 -1.1045  0.5908
[torch.FloatTensor of size 2x2x2]

In [99]:
# Two tensors whose trailing dimensions match, while the leading
# dimension does not exist in one of them.
# You can line up the dimensions as
# (2, 2, 2)
# (   2, 2)
x2 = torch.randn(2, 2, 2)
x1 + x2


(0 ,.,.) = 
 -1.1321 -1.2815
 -0.0913  3.1672

(1 ,.,.) = 
 -1.8309 -2.7214
  0.3292  1.2932
[torch.FloatTensor of size 2x2x2]

When two tensors are broadcastable, the size of the resulting tensor is computed as follows:
1. If the two tensors do not have the same number of dimensions, 1s are prepended to the size of the tensor with fewer dimensions until both have the same length.
2. After this step, or if the tensors already have the same number of dimensions, the size of each dimension of the result is the max of the two corresponding sizes.
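These two steps can be sketched in pure Python. `broadcast_shape` is a hypothetical helper written for illustration, not part of PyTorch; it picks the size that is not 1, which for positive sizes is the same as taking the max.

```python
def broadcast_shape(shape_a, shape_b):
    """Compute the shape resulting from broadcasting two shapes (hypothetical helper)."""
    ndim = max(len(shape_a), len(shape_b))
    # Step 1: prepend 1s to the shorter shape.
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    # Step 2: combine the sizes dimension by dimension.
    result = []
    for size_a, size_b in zip(a, b):
        if size_a != size_b and size_a != 1 and size_b != 1:
            raise ValueError("shapes are not broadcastable")
        result.append(size_b if size_a == 1 else size_a)
    return tuple(result)


print(broadcast_shape((4, 1), (4,)))       # (4, 4)
print(broadcast_shape((3, 3), (1, 3)))     # (3, 3)
print(broadcast_shape((2, 1, 2), (2, 2)))  # (2, 2, 2)
```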

For example, in the case below we add a (3, 3) tensor to a (1, 3) tensor. The result has size (3, 3) and is obtained by adding the row tensor to each row of the matrix.

In [100]:
# .float() keeps the result a FloatTensor regardless of arange's default type
x1 = torch.arange(9).view(3, 3).float()
print(x1)
x2 = torch.arange(3).view(1, 3).float()
print(x2)
print(x1 + x2)


 0  1  2
 3  4  5
 6  7  8
[torch.FloatTensor of size 3x3]


 0  1  2
[torch.FloatTensor of size 1x3]


  0   2   4
  3   5   7
  6   8  10
[torch.FloatTensor of size 3x3]



Similarly, if we consider a column tensor, we end up adding it to each column of the matrix.

In [102]:
x2 = x2.view(3, 1)
print(x2)
print(x1 + x2)


 0
 1
 2
[torch.FloatTensor of size 3x1]


  0   1   2
  4   5   6
  8   9  10
[torch.FloatTensor of size 3x3]

