# Thinking in tensors in PyTorch

Deep learning for neuroscientists - hands-on training by [Piotr Migdał](https://p.migdal.pl) (2019). Version 0.2.


## Notebook 1: Tensors

In [2]:
import torch
from torch import tensor

Linear algebra is the language of deep learning... and quantum mechanics.

Note: in physics and engineering, a tensor is not just any array. For naming by number of dimensions, there is a one-two-many rule:

* 0: scalar
* 1: vector
* 2: matrix
* 3 and above: n-dimensional tensor

In theory, tensors can have an arbitrarily high number of dimensions. In deep learning, they rarely exceed 5.
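
The one-two-many rule can be checked directly with `dim()`. This is a minimal sketch; the variable names and the 4-dimensional image batch are illustrative assumptions, not part of the notebook.

```python
import torch

scalar = torch.tensor(42.)                    # 0 dimensions
vector = torch.tensor([1.5, -0.5, 3.0])       # 1 dimension
matrix = torch.tensor([[1., 2.], [3., 4.]])   # 2 dimensions
# 4 dimensions, e.g. a batch of 8 RGB images of 32x32 pixels (hypothetical sizes)
batch = torch.zeros(8, 3, 32, 32)

dims = [t.dim() for t in (scalar, vector, matrix, batch)]
print(dims)
```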

## Scalar

A scalar is "just a number". Real-world examples of scalars include temperature, pressure, and the price of an apple in a given shop.

In [3]:
x = tensor(42.)
x

tensor(42.)

In [4]:
x.dim()

0

In [5]:
2 * x

tensor(84.)

In [6]:
x.item()

42.0

### Food for thought

> The scalar fallacy is the false but pervasive assumption that real-world things (hotels, sandwiches, people, mutual funds, chemo drugs, whatever) have some single-dimension ordering of "goodness".

> When you project a multi-dimensional space down to one dimension, you are involving a lot of context and preferences in the act of projecting. - [rlucas on HN](https://news.ycombinator.com/item?id=8132525)

See also: [Scalar fallacy](http://observationalepidemiology.blogspot.com/2011/01/scalar-fallacy.html).


## Vector

A vector is an ordered list of numbers, such as `[-5., 2., 0.]`.

In physics and mechanical engineering, not everything is a vector:

> it is not generally true that any three numbers form a vector. It is true only if, when we rotate the coordinate system, the components of the vector transform among themselves in the correct way. - [II 02: Differential Calculus of Vector Fields](http://www.feynmanlectures.caltech.edu/II_02.html) from [The Feynman Lectures on Physics](http://www.feynmanlectures.caltech.edu/)

Examples of quantities that are vectors in this strict sense:

* position
* velocity
* electric field
* spatial gradient of a scalar field ($\nabla T$)


In deep learning we are more... relaxed. Usually vectors are abstract, for example:


* a feature vector produced by an ImageNet-trained network
* a word representation (a word embedding; see: [king - man + woman is queen; but why?](https://p.migdal.pl/2017/01/06/king-man-woman-queen-why.html))
* user and product vectors in [Factorization Machines](https://www.reddit.com/r/MachineLearning/comments/65d3lt/r_factorization_machines_2010_a_classic_paper_in/) and related recommendation systems


$$\vec{v} = \left[ v_1, v_2, \ldots, v_n \right]$$

In [7]:
v = tensor([1.5, -0.5, 3.0])
v

tensor([ 1.5000, -0.5000,  3.0000])

In [8]:
v.dim()

1

In [9]:
v.size()

torch.Size([3])

### Vector arithmetic

$$\vec{v} + \vec{u}$$

In [10]:
2 * v

tensor([ 3., -1.,  6.])

In [15]:
1 + 1.

2.0

In [13]:
# Integer literals give an integer tensor; tensor([1., 0., 1.]) would be a float tensor.
u = tensor([1, 0, 1])

In [16]:
v + u.float()

tensor([ 2.5000, -0.5000,  4.0000])
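
The `.float()` call is there because `tensor([1, 0, 1])` holds integers while `v` holds floats. A short sketch of how to inspect and convert the data type follows. Whether mixed-type addition works without the explicit cast depends on the PyTorch version, so the cast is kept here; the name `w` is illustrative.

```python
import torch

v = torch.tensor([1.5, -0.5, 3.0])
u = torch.tensor([1, 0, 1])   # integer literals give an integer tensor

print(v.dtype)                # floating-point type
print(u.dtype)                # integer type
print(u.float().dtype)        # converted copy, floating-point type

w = v + u.float()             # explicit cast, so both operands share a dtype
print(w)
```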

### Vector length


$$|\vec{v}| = \sqrt{v_1^2 + v_2^2 + \ldots + v_n^2} = \sqrt{\sum_{i=1}^n v_i^2}$$

In [17]:
v.pow(2).sum().sqrt()

tensor(3.3912)

In [20]:
v**2

tensor([2.2500, 0.2500, 9.0000])

In [21]:
torch.pow(v, 2)

tensor([2.2500, 0.2500, 9.0000])

In [19]:
v.pow(2).sum()

tensor(11.5000)

In [24]:
v / v.norm()

tensor([ 0.4423, -0.1474,  0.8847])
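
As a sanity check, the formula written out by hand should agree with the built-in `norm()`, and dividing a vector by its length should give a vector of length 1. A minimal sketch, with illustrative variable names:

```python
import torch

v = torch.tensor([1.5, -0.5, 3.0])

length_manual = v.pow(2).sum().sqrt()   # the formula written out step by step
length_builtin = v.norm()               # built-in Euclidean norm

v_unit = v / v.norm()                   # same direction as v, unit length

print(torch.allclose(length_manual, length_builtin))
print(v_unit.norm())
```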

## Matrix

Typical operations:

* rotation
* next step in a stochastic process
* scalar products
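
The three operations listed above can be sketched as follows. The rotation angle, the transition probabilities, and all variable names are made-up values for illustration only.

```python
import math
import torch

# Rotation by 90 degrees counterclockwise in 2D.
angle = math.pi / 2
R = torch.tensor([[math.cos(angle), -math.sin(angle)],
                  [math.sin(angle),  math.cos(angle)]])
e_x = torch.tensor([1., 0.])
rotated = R.matmul(e_x)       # approximately [0., 1.]

# One step of a two-state stochastic process (Markov chain).
# Each row sums to 1; P[i, j] is the probability of moving from state i to state j.
P = torch.tensor([[0.9, 0.1],
                  [0.5, 0.5]])
p0 = torch.tensor([1., 0.])   # start in state 0 with certainty
p1 = p0.matmul(P)             # distribution after one step

# Scalar (dot) product of two vectors.
a = torch.tensor([1., 2., 3.])
b = torch.tensor([4., 5., 6.])
dot = a.dot(b)                # 1*4 + 2*5 + 3*6

print(rotated, p1, dot)
```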


To do: give an example with colors, transforming from $RGB$ to $black/R-G$.

See also: [xkcd: Matrix Transform](https://xkcd.com/184/)

* [Hessian matrix](https://en.wikipedia.org/wiki/Hessian_matrix) of a scalar
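
A Hessian can be computed numerically with autograd. The sketch below assumes a PyTorch version that provides `torch.autograd.functional.hessian`; the function `f` is a made-up example, not something from this notebook.

```python
import torch
from torch.autograd.functional import hessian

def f(x):
    # Hypothetical scalar function of a 2-element vector: f(x) = x0^2 * x1 + x1^3
    return x[0] ** 2 * x[1] + x[1] ** 3

x = torch.tensor([1., 2.])
H = hessian(f, x)

# Analytically, the matrix of second derivatives is
# [[2*x1, 2*x0], [2*x0, 6*x1]], evaluated here at x = (1, 2).
print(H)
```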

In [25]:
M = tensor([[1., 2.], [3., 4.]])
M

tensor([[1., 2.],
        [3., 4.]])

In [26]:
M.matmul(M)

tensor([[ 7., 10.],
        [15., 22.]])

In [27]:
tensor([1., 0.]).matmul(M)

tensor([1., 2.])

In [None]:
# for Python 3.5+
M @ M

In [28]:
M * M

tensor([[ 1.,  4.],
        [ 9., 16.]])

In [29]:
tensor([1., 2.]).matmul(M)

tensor([ 7., 10.])
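
Two things are easy to mix up here: `*` is element-wise while `matmul` is the matrix product, and the order of a vector and a matrix in `matmul` matters. A small sketch, with illustrative variable names:

```python
import torch

M = torch.tensor([[1., 2.], [3., 4.]])
v = torch.tensor([1., 2.])

left = v.matmul(M)       # v acts as a row vector: [1*1 + 2*3, 1*2 + 2*4]
right = M.matmul(v)      # v acts as a column vector: [1*1 + 2*2, 3*1 + 4*2]

elementwise = M * M      # squares each entry; not a matrix product
product = M.matmul(M)    # matrix product, the same as M @ M

print(left, right)
print(elementwise)
print(product)
```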

In [30]:
M.svd()

(tensor([[-0.4046, -0.9145],
         [-0.9145,  0.4046]]),
 tensor([5.4650, 0.3660]),
 tensor([[-0.5760,  0.8174],
         [-0.8174, -0.5760]]))

In [31]:
M.det()

tensor(-2.0000)
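
The singular value decomposition factors a matrix so that multiplying the factors back together recovers it, and the product of the singular values equals the absolute value of the determinant. The sketch below uses `torch.linalg.svd`, which is an assumption about the installed PyTorch version; unlike `M.svd()`, it returns the last factor already transposed (`Vh`).

```python
import torch

M = torch.tensor([[1., 2.], [3., 4.]])

U, S, Vh = torch.linalg.svd(M)       # M = U @ diag(S) @ Vh
M_rebuilt = U @ torch.diag(S) @ Vh

print(torch.allclose(M_rebuilt, M, atol=1e-5))
print(S.prod(), M.det().abs())       # both close to 2
```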

## Tensor


A tensor is a generalization of vectors and matrices to more dimensions.

In physics and engineering, tensors have additional properties (such as how their components transform under a change of coordinates), as in:


![](https://upload.wikimedia.org/wikipedia/commons/thumb/f/fe/StressEnergyTensor_contravariant.svg/250px-StressEnergyTensor_contravariant.svg.png)

[Electromagnetic tensor](https://en.wikipedia.org/wiki/Electromagnetic_tensor) from [Introduction to the mathematics of general relativity - Wikipedia](https://en.wikipedia.org/wiki/Introduction_to_the_mathematics_of_general_relativity), see also: [Tensor](https://en.wikipedia.org/wiki/Tensor).

In deep learning, tensors are just arbitrary multidimensional arrays.
See also:

* [Einsum is All you Need - Einstein Summation in Deep Learning - Tim Rocktäschel](https://rockt.github.io/2018/04/30/einsum)
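
As a taste of Einstein summation, here is a minimal sketch with `torch.einsum`. The index strings name each dimension, and indices missing from the output are summed over. The batch of matrices and the variable names are illustrative assumptions.

```python
import torch

M = torch.tensor([[1., 2.], [3., 4.]])
v = torch.tensor([1.5, -0.5, 3.0])

# Matrix product: sum over the shared index j, the same as M @ M.
matmul_einsum = torch.einsum('ij,jk->ik', M, M)

# Outer product of a vector with itself: no index is summed, shape (3, 3).
outer = torch.einsum('i,j->ij', v, v)

# A 3-dimensional tensor: a batch of 5 matrices, each multiplied by M.
batch = torch.ones(5, 2, 2)
batch_times_M = torch.einsum('bij,jk->bik', batch, M)

print(matmul_einsum)
print(outer.shape, batch_times_M.shape)
```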

## To dos

(Internal notes...)

Technicalities:

* avoid Python loops 
* data type
* (un)squeeze
* stride
* from/to GPU

Advanced:

* Einstein summation
* Tensor diagrams

To do:

* links
* LaTeX formulas to compare 
* More practical examples

Extras:

* SVG diagrams for tensors