Useful resources:
1. https://pytorch.org/docs/stable/tensors.html (retrieved 2022-12-24, [Github](https://github.com/pytorch/tutorials/blob/d5161086e7277f10c68dd44914f8925fda62f399/beginner_source/blitz/tensor_tutorial.py))
2. https://pytorch.org/tutorials/beginner/introyt/tensors_deeper_tutorial.html (retrieved 2022-12-24, [Github](https://github.com/pytorch/tutorials/blob/c2115df8003e6a3aeeb327441ff4d8389576d6f0/beginner_source/introyt/tensors_deeper_tutorial.py))
3. Chollet, p. 26-47 (for definitions and a TensorFlow implementation)

In [1]:
import sys
import torch
import helpers as h

In [2]:
h.print_pytorch_version()
h.print_python_version()

pyTorch version: 1.13.1+cpu
Python version: 3.10.9 (tags/v3.10.9:1dd9be6, Dec  6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)]


### Initializing Tensors

#### Initialize a rank-1 tensor (vector)

In [3]:
x = torch.tensor([1., 2.])
h.print_tensor_info(x)

Tensor       tensor([1., 2.])
Type         <class 'torch.Tensor'>
dtype        torch.float32
Dimension    1
Shape        (2,)


* Tip: Look into the function definition of `h.print_tensor_info` to see the Tensor's methods and attributes used to display the values above.
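
The `helpers` module is not shown in this notebook, so the following is only a guess at what such a helper might do. The function name `describe_tensor` and the label strings are hypothetical; the attributes and methods it reads (`dtype`, `dim()`, `shape`) are standard on every `torch.Tensor`.

```python
import torch

def describe_tensor(t: torch.Tensor) -> dict:
    """Hypothetical stand-in for h.print_tensor_info: collect basic metadata."""
    info = {
        "Tensor": t,
        "Type": type(t),
        "dtype": t.dtype,          # element data type, e.g. torch.float32
        "Dimension": t.dim(),      # rank (number of axes); same as t.ndim
        "Shape": tuple(t.shape),   # torch.Size converted to a plain tuple
    }
    for label, value in info.items():
        print(f"{label:<12} {value}")
    return info

info = describe_tensor(torch.tensor([1., 2.]))
```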

In PyTorch, each tensor has a `device` associated with it. By default (for constructors such as `torch.tensor`) this device is the CPU. It can also be a GPU. To check which device a tensor is on:

In [10]:
x.device.type

'cpu'
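
A device can also be chosen when a tensor is created, or a tensor can be copied to another device afterwards. The sketch below assumes nothing about the machine: it falls back to the CPU when no CUDA GPU is present. The variable names are illustrative only.

```python
import torch

# Pick a GPU when one is present, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

a = torch.tensor([1., 2.])                  # default device: CPU
b = torch.tensor([1., 2.], device=device)   # device chosen at construction
c = a.to(device)                            # returns a tensor on `device`

print(a.device.type, b.device.type, c.device.type)
```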

Below we define a vector with a single element. Notice the tensor-dimensionality and shape of this tensor:

In [43]:
x = torch.tensor([1.,])
h.print_tensor_info(x)

Tensor:     tensor([1.])
Type:       <class 'torch.Tensor'>
dtype:      torch.float32
Dimensions: 1
Shape:      (1,)


#### Initialize a rank-0 tensor (scalar)

In [37]:
x = torch.tensor(2.5)
h.print_tensor_info(x)

Tensor:     2.5
Type:       <class 'torch.Tensor'>
dtype:      torch.float32
Dimensions: 0
Size:       ()


#### Initialize a rank-2 tensor (matrix)

In [45]:
x = torch.tensor([[1., 2.],[3., 4.]])
h.print_tensor_info(x)

Tensor:     tensor([[1., 2.],
        [3., 4.]])
Type:       <class 'torch.Tensor'>
dtype:      torch.float32
Dimensions: 2
Shape:      (2, 2)


#### Initialize a rank-3 tensor

In [48]:
x = torch.tensor([
    [[1.,],[2.,]],
    [[3.,],[4.,]],
    [[5.,],[6.,]],
    ])
h.print_tensor_info(x)

Tensor:     tensor([[[1.],
         [2.]],

        [[3.],
         [4.]],

        [[5.],
         [6.]]])
Type:       <class 'torch.Tensor'>
dtype:      torch.float32
Dimensions: 3
Shape:      (3, 2, 1)


Note that even though tensors can have different shapes and dimensions, they are all PyTorch tensors, which means they all carry the same kinds of metadata (a data type, a device, a shape, and so on), although the values may differ from tensor to tensor. Additionally, they support the same operations and methods, such as element-wise arithmetic, reshaping, and indexing.
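
As a small illustration of this point, the sketch below builds tensors of rank 0, 1 and 2 and reads the same attributes from each, then applies a few of the shared operations to the matrix. The variable names are illustrative.

```python
import torch

scalar = torch.tensor(2.5)                     # rank 0
vector = torch.tensor([1., 2.])                # rank 1
matrix = torch.tensor([[1., 2.], [3., 4.]])    # rank 2

for t in (scalar, vector, matrix):
    # The same attributes exist on every tensor; only their values differ.
    print(t.dim(), tuple(t.shape), t.dtype, t.device)

doubled = matrix * 2          # element-wise arithmetic
flat = matrix.reshape(4)      # reshaping: (2, 2) -> (4,)
first_row = matrix[0]         # indexing returns another tensor
```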

#### Explicitly defining the tensor's data type

In [6]:
x = torch.tensor([1.], dtype=torch.double)
h.print_tensor_info(x)

tensor([1.], dtype=torch.float64)
<class 'torch.Tensor'>
torch.float64
1
torch.Size([1])
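
When no `dtype` is given, `torch.tensor` infers it from the data. The sketch below assumes PyTorch's default floating-point type has not been changed (it is `torch.float32` out of the box) and also shows converting an existing tensor with `.to`.

```python
import torch

ints = torch.tensor([1, 2])          # Python ints   -> torch.int64
floats = torch.tensor([1., 2.])      # Python floats -> torch.float32 (default)
doubles = floats.to(torch.float64)   # explicit conversion returns a new tensor

print(ints.dtype, floats.dtype, doubles.dtype)
print(torch.double is torch.float64)  # torch.double is an alias of torch.float64
```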


#### Note

You might have seen (or might see) defining a constant tensor by `torch.Tensor` instead of `torch.tensor`. Let's take a quick look:

In [35]:
x = torch.Tensor([1.])
h.print_tensor_info(x)

Tensor:     tensor([1.])
Type:       <class 'torch.Tensor'>
dtype:      torch.float32
Dimensions: 1
Size:       (1,)


As you might expect, `torch.tensor` and `torch.Tensor` are not the same. `torch.tensor` is a function that constructs a tensor from data and infers its dtype, while `torch.Tensor` is the tensor class, which the PyTorch documentation describes as an alias for the default tensor type (`torch.FloatTensor`). As can be seen below, the first is a function, while the second is a class. You might find it useful to have a quick look at their source (ctrl-left-click on Windows and Linux, or cmd-left-click on Mac; this works on Google Colab as well).

In [36]:
print(type(torch.tensor))
print(type(torch.Tensor))

<class 'builtin_function_or_method'>
<class 'torch._C._TensorMeta'>
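
The practical difference shows up with integer inputs. The sketch below assumes the default tensor type has not been changed; it is meant as an illustration rather than a complete account of the two constructors.

```python
import torch

a = torch.tensor([1, 2])    # dtype inferred from the data: int64
b = torch.Tensor([1, 2])    # always the default tensor type: float32

s = torch.tensor(3)         # rank-0 tensor holding the value 3
u = torch.Tensor(3)         # rank-1 tensor with 3 *uninitialized* elements

print(a.dtype, b.dtype)
print(tuple(s.shape), tuple(u.shape))
```

Because an integer argument to `torch.Tensor` is read as a size rather than a value, `torch.tensor` is usually the less surprising choice for building tensors from data.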


### Tensor Indexing, Slicing and Assignments

In [12]:
x = torch.tensor([
    [[1., 2.],[3., 4.]],
    [[5., 6.],[7., 8.]],
    [[9., 10.],[11., 12.]],
    ])
h.print_tensor_info(x)

Tensor       tensor([[[ 1.,  2.],
         [ 3.,  4.]],

        [[ 5.,  6.],
         [ 7.,  8.]],

        [[ 9., 10.],
         [11., 12.]]])
Type         <class 'torch.Tensor'>
dtype        torch.float32
Dimension    3
Shape        (3, 2, 2)


In [13]:
x[0]

tensor([[1., 2.],
        [3., 4.]])

In [14]:
x[0][0][0]

tensor(1.)

In [15]:
x[1,1,1] = 100

In [16]:
x

tensor([[[  1.,   2.],
         [  3.,   4.]],

        [[  5.,   6.],
         [  7., 100.]],

        [[  9.,  10.],
         [ 11.,  12.]]])

In [17]:
x[1,:,:]

tensor([[  5.,   6.],
        [  7., 100.]])
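
A few more slicing patterns, as a sketch. It rebuilds the original 3x2x2 tensor (before the assignment above) with `torch.arange`, and also shows that basic indexing returns a view that shares memory with the original tensor.

```python
import torch

x = torch.arange(1., 13.).reshape(3, 2, 2)   # values 1..12, shape (3, 2, 2)

first_rows = x[:, 0, :]    # row 0 of every 2x2 block   -> shape (3, 2)
last_col = x[..., -1]      # last column of every block -> shape (3, 2)
print(first_rows)
print(last_col)

block = x[1]               # basic indexing returns a view, not a copy
block[0, 0] = -1.          # writing through the view changes x as well
print(x[1, 0, 0])
```

Use `.clone()` on a slice when an independent copy is needed.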

### Tensor Arithmetic

Arithmetic is the branch of mathematics that deals with numbers and the operations on them. It covers the basic operations of addition, subtraction, multiplication, and division, as well as more advanced ones such as exponentiation, roots, logarithms, and trigonometric functions.

Let's look at two rank-1 tensors:

In [74]:
x = torch.tensor([1., 2.])
y = torch.tensor([3., 4.])
print(f'x: {x}')
print(f'y: {y}')

x: tensor([1., 2.])
y: tensor([3., 4.])


The four basic operations of arithmetic are element-wise:

In [65]:
print(f'x + y : {x + y}')

x + y : tensor([4., 6.])


In [64]:
print(f'x - y : {x - y}')

x - y : tensor([-2., -2.])


In [66]:
print(f'x * y : {x * y}')

x * y : tensor([3., 8.])


In [67]:
print(f'x / y : {x / y}')

x / y : tensor([0.3333, 0.5000])
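
Each operator also has a functional form, and a plain Python number is applied to every element. A short sketch with the same `x` and `y`:

```python
import torch

x = torch.tensor([1., 2.])
y = torch.tensor([3., 4.])

# Operators and their functional forms give the same result.
same_add = torch.equal(x + y, torch.add(x, y))
same_mul = torch.equal(x * y, torch.mul(x, y))

# A Python scalar is broadcast to every element.
shifted = x + 10      # tensor([11., 12.])
squared = x ** 2      # tensor([1., 4.])
print(same_add, same_mul, shifted, squared)
```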


The dot-product:

In [69]:
x.dot(y)

tensor(11.)
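
The dot product is the sum of the element-wise products, which can be checked directly (a sketch using the same two vectors):

```python
import torch

x = torch.tensor([1., 2.])
y = torch.tensor([3., 4.])

manual = (x * y).sum()       # 1*3 + 2*4 = 11
builtin = torch.dot(x, y)    # same value as x.dot(y)
print(manual, builtin)
```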

#### Matrix Multiplication

Looking at the definitions can actually be useful here:
https://pytorch.org/docs/stable/generated/torch.matmul.html

>If both tensors are 1-dimensional, the dot product (scalar) is returned.

In [78]:
torch.matmul(x, y)

tensor(11.)

The dot product is symmetric:

In [75]:
x.matmul(y)

tensor(11.)

In [77]:
y.matmul(x)

tensor(11.)

`@` is an alias of `matmul`:

In [70]:
x@y

tensor(11.)

>If both arguments are 2-dimensional, the matrix-matrix product is returned.

In [91]:
x = torch.tensor([[1., 2.],[3., 4.]])
y = torch.tensor([[5., 6.],[7., 8.]])
print(x)
print(y)

tensor([[1., 2.],
        [3., 4.]])
tensor([[5., 6.],
        [7., 8.]])


In [90]:
print(torch.matmul(x, y))
print(torch.matmul(x, y).shape)

tensor([[19., 22.],
        [43., 50.]])
torch.Size([2, 2])
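
It is worth contrasting the matrix product with the element-wise product, and noting that the matrix product is not commutative. A sketch with the same two matrices:

```python
import torch

x = torch.tensor([[1., 2.], [3., 4.]])
y = torch.tensor([[5., 6.], [7., 8.]])

xy = x @ y            # matrix product:  [[19., 22.], [43., 50.]]
yx = y @ x            # different order: [[23., 34.], [31., 46.]]
elementwise = x * y   # element-wise:    [[ 5., 12.], [21., 32.]]
print(xy, yx, elementwise, sep="\n")
```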


>If the first argument is 1-dimensional and the second argument is 2-dimensional, a 1 is prepended to its dimension for the purpose of the matrix multiply. After the matrix multiply, the prepended dimension is removed.

In other words, the first tensor is treated as a row vector of shape `(1, n)` for the multiplication, and the extra dimension of size 1 is dropped from the result.

In [92]:
x = torch.tensor([1., 2.])
y = torch.tensor([[3., 4.],[5., 6.]])
print(x)
print(y)

tensor([1., 2.])
tensor([[3., 4.],
        [5., 6.]])


In [88]:
print(torch.matmul(x,y))
print(torch.matmul(x,y).shape)

tensor([13., 16.])
torch.Size([2])
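
The quoted rule can be reproduced by hand with `unsqueeze` and `squeeze`. This is a sketch of the equivalent steps, not a description of how `matmul` is implemented internally.

```python
import torch

x = torch.tensor([1., 2.])
y = torch.tensor([[3., 4.], [5., 6.]])

row = x.unsqueeze(0)           # prepend a dimension: shape (1, 2)
product = row @ y              # (1, 2) @ (2, 2) -> (1, 2)
result = product.squeeze(0)    # remove the prepended dimension: shape (2,)

print(result)
print(torch.equal(result, x @ y))
```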


>If the first argument is 2-dimensional and the second argument is 1-dimensional, the matrix-vector product is returned.

In [82]:
torch.matmul(y,x)

tensor([23., 34.])

>If both arguments are at least 1-dimensional and at least one argument is N-dimensional (where N > 2), then a batched matrix multiply is returned. 
>1. If the first argument is 1-dimensional, a 1 is prepended to its dimension for the purpose of the batched matrix multiply and removed after.
>2. If the second argument is 1-dimensional, a 1 is appended to its dimension for the purpose of the batched matrix multiple and removed after.
>3. The non-matrix (i.e. batch) dimensions are broadcasted (and thus must be broadcastable). For example, if `input` is a $(j \times 1 \times n \times n)$ tensor and `other` is a $(k \times n \times n)$ tensor, `out` will be a $(j \times k \times n \times n)$ tensor.

>Note that the broadcasting logic only looks at the batch dimensions when determining if the inputs are broadcastable, and not the matrix dimensions. For example, if `input` is a $(j \times 1 \times n \times m)$ tensor and `other` is a $(k \times m \times p)$ tensor, these inputs are valid for broadcasting even though the final two dimensions (i.e. the matrix dimensions) are different. `out` will be a $(j \times k \times n \times p)$ tensor.

For helpful background on broadcasting, see:
https://pytorch.org/docs/stable/notes/broadcasting.html#broadcasting-semantics
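
The shape rule quoted above can be checked with random tensors. The sizes `j, k, n, m, p` below are arbitrary values chosen for illustration; `torch.broadcast_shapes` is used only to show how the batch dimensions combine.

```python
import torch

j, k, n, m, p = 2, 3, 4, 5, 6

a = torch.randn(j, 1, n, m)    # batch dims (j, 1), matrix dims (n, m)
b = torch.randn(k, m, p)       # batch dims (k,),   matrix dims (m, p)

out = a @ b                    # batch dims broadcast to (j, k); matrices give (n, p)
batch_shape = torch.broadcast_shapes((j, 1), (k,))

print(tuple(out.shape))        # (j, k, n, p)
print(tuple(batch_shape))      # (j, k)
```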

In [95]:
x = torch.tensor([1., 2.])
y = torch.tensor([
    [[1., 2.],[3., 4.]],
    [[5., 6.],[7., 8.]],
    [[9., 10.],[11., 12.]],
    ])
print(x)
print(x.dim())
print(x.shape)
print(y)
print(y.dim())
print(y.shape)

tensor([1., 2.])
1
torch.Size([2])
tensor([[[ 1.,  2.],
         [ 3.,  4.]],

        [[ 5.,  6.],
         [ 7.,  8.]],

        [[ 9., 10.],
         [11., 12.]]])
3
torch.Size([3, 2, 2])


In [97]:
print(torch.matmul(x,y))
print(torch.matmul(x,y).dim())
print(torch.matmul(x,y).shape)

tensor([[ 7., 10.],
        [19., 22.],
        [31., 34.]])
2
torch.Size([3, 2])


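As can be seen above, `x` is treated as a row vector and multiplied by each of the three matrices in `y`, a kind of "dot product broadcast". The result has one dimension fewer than `other` (here `y`).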
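
The same result can be obtained with an explicit loop over the batch, which may make the behaviour easier to see. This is only a sketch of an equivalent computation; `matmul` does not need the loop.

```python
import torch

x = torch.tensor([1., 2.])
y = torch.arange(1., 13.).reshape(3, 2, 2)   # the same 3x2x2 tensor as above

# Multiply the vector by each 2x2 matrix in turn and stack the results.
looped = torch.stack([x @ m for m in y])     # shape (3, 2)

print(looped)
print(torch.allclose(looped, x @ y))
```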

In [98]:
x = torch.tensor([
    [[1., 2.],[3., 4.]],
    [[5., 6.],[7., 8.]],
    [[9., 10.],[11., 12.]],
    ])
y = torch.tensor([1., 2.])
print(x)
print(x.dim())
print(x.shape)
print(y)
print(y.dim())
print(y.shape)

tensor([[[ 1.,  2.],
         [ 3.,  4.]],

        [[ 5.,  6.],
         [ 7.,  8.]],

        [[ 9., 10.],
         [11., 12.]]])
3
torch.Size([3, 2, 2])
tensor([1., 2.])
1
torch.Size([2])


In [99]:
print(torch.matmul(x,y))
print(torch.matmul(x,y).dim())
print(torch.matmul(x,y).shape)

tensor([[ 5., 11.],
        [17., 23.],
        [29., 35.]])
2
torch.Size([3, 2])


Notice that here `y` is treated as a column vector, so each row of every matrix in `x` is dotted with `y`. In the previous case the vector came first and was treated as a row vector, so it was dotted with each column of every matrix.
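
Written out with an explicit column vector, the computation looks like this (a sketch of the equivalent steps, following the "a 1 is appended" rule quoted earlier):

```python
import torch

x = torch.arange(1., 13.).reshape(3, 2, 2)   # the same 3x2x2 tensor as above
y = torch.tensor([1., 2.])

col = y.unsqueeze(-1)              # append a dimension: shape (2, 1)
result = (x @ col).squeeze(-1)     # (3, 2, 2) @ (2, 1) -> (3, 2, 1) -> (3, 2)

print(result)
print(torch.allclose(result, x @ y))
```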

In [100]:
x = torch.tensor([
    [[1., 2.],[3., 4.]],
    [[5., 6.],[7., 8.]],
    [[9., 10.],[11., 12.]],
    ])
y = torch.tensor([[1., 2.],[3., 4.]])
print(x)
print(x.dim())
print(x.shape)
print(y)
print(y.dim())
print(y.shape)

tensor([[[ 1.,  2.],
         [ 3.,  4.]],

        [[ 5.,  6.],
         [ 7.,  8.]],

        [[ 9., 10.],
         [11., 12.]]])
3
torch.Size([3, 2, 2])
tensor([[1., 2.],
        [3., 4.]])
2
torch.Size([2, 2])


In [101]:
print(torch.matmul(x,y))
print(torch.matmul(x,y).dim())
print(torch.matmul(x,y).shape)

tensor([[[ 7., 10.],
         [15., 22.]],

        [[23., 34.],
         [31., 46.]],

        [[39., 58.],
         [47., 70.]]])
3
torch.Size([3, 2, 2])
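
Here the 2-dimensional `y` is reused for every matrix in the batch. An explicit loop gives the same values (again only a sketch of an equivalent computation):

```python
import torch

x = torch.arange(1., 13.).reshape(3, 2, 2)   # the same 3x2x2 tensor as above
y = torch.tensor([[1., 2.], [3., 4.]])

# Multiply each 2x2 matrix in the batch by the same y and stack the results.
looped = torch.stack([m @ y for m in x])     # shape (3, 2, 2)

print(looped[0])
print(torch.allclose(looped, x @ y))
```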


In [18]:
# the section below is based on https://github.com/pytorch/tutorials/blob/d5161086e7277f10c68dd44914f8925fda62f399/beginner_source/blitz/tensor_tutorial.py
# See: https://pytorch.org/tutorials/beginner/blitz/tensor_tutorial.html
# Presented here in accordance with BSD-3 license

#### Tensor Operations

Over 100 tensor operations, including transposing, indexing, slicing,
mathematical operations, linear algebra, random sampling, and more are
comprehensively described [here](https://pytorch.org/docs/stable/torch.html).

Each of them can be run on the GPU (at typically higher speeds than on a
CPU).

In [20]:
tensor = torch.ones(4, 4)
# We move our tensor to the GPU if available
if torch.cuda.is_available():
    tensor = tensor.to('cuda')
print(f"Device tensor is stored on: {tensor.device}")

Device tensor is stored on: cpu


#### In-place operations
Operations that have a ``_`` suffix are in-place. For example: ``x.copy_(y)``, ``x.t_()``, will change ``x``.

Note: In-place operations save some memory, but can be problematic when computing derivatives because of an immediate loss of history. Hence, their use is discouraged.

In [21]:
print(tensor, "\n")
tensor.add_(5)
print(tensor)

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]]) 

tensor([[6., 6., 6., 6.],
        [6., 6., 6., 6.],
        [6., 6., 6., 6.],
        [6., 6., 6., 6.]])


Compare to:

In [22]:
tensor.add(5)

tensor([[11., 11., 11., 11.],
        [11., 11., 11., 11.],
        [11., 11., 11., 11.],
        [11., 11., 11., 11.]])

In [23]:
print(tensor)

tensor([[6., 6., 6., 6.],
        [6., 6., 6., 6.],
        [6., 6., 6., 6.],
        [6., 6., 6., 6.]])
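
The sketch below illustrates both halves of the note on in-place operations: an in-place call reuses the tensor's memory, while the out-of-place version allocates a new tensor, and autograd rejects an in-place change to a leaf tensor that requires gradients. The variable names are illustrative.

```python
import torch

t = torch.ones(2)
before = t.data_ptr()
t.add_(5)                          # in-place: same memory, new values
same_storage = t.data_ptr() == before

u = t.add(5)                       # out-of-place: result lives in new memory
new_storage = u.data_ptr() != t.data_ptr()

w = torch.ones(2, requires_grad=True)
try:
    w.add_(1)                      # in-place edit of a leaf that requires grad
    raised = False
except RuntimeError as err:
    raised = True
    print(f"RuntimeError: {err}")

print(same_storage, new_storage, raised)
```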
