<a href="https://colab.research.google.com/github/SzymonNowakowski/Workshops/blob/2023_2/Day_1/0_tensors.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
import torch


# Before we begin

## Departure from neuron-like terminology. Arrival of mathematically oriented terminology

The ANN community is moving away from neuron-like biological terminology. Even in this very simple example you will see that it is more convenient to think of layers not as collections of data points (neurons) but as mathematical transformations. A layer may be a matrix multiplication by weights, an application of a nonlinear transform, or both; in the last case the layer decomposes further into those two steps.
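
To make the "layer as a transformation" view concrete, here is a minimal sketch. The weight matrix `W`, the bias `b`, and the sizes (4 inputs, 3 outputs) are hypothetical values chosen only for illustration; they do not come from any particular network.

```python
import torch

torch.manual_seed(0)  # make the random illustration values reproducible

x = torch.randn(4)     # input: a vector in R^4
W = torch.randn(3, 4)  # hypothetical weights mapping R^4 to R^3
b = torch.zeros(3)     # hypothetical bias

# First transformation: matrix multiplication by weights (plus bias)
linear_out = W @ x + b

# Second transformation: elementwise nonlinearity
activated = torch.relu(linear_out)

print(linear_out.shape)  # torch.Size([3])
print(activated.shape)   # torch.Size([3])
```

Seen this way, a "layer" that does both is just the composition of the two steps, which is why it can be decomposed further.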

For more complex networks such as Transformers, it is hard even to find a neuron analogy. Transformers work explicitly with matrices and, more generally, with mathematical abstractions.

The departure from neural terminology is also justified by biology itself. It turns out that a single neuron in the brain behaves much more like a full artificial neural network than like a single artificial neuron.

The authors of the paper cited below show that, at the very least, a 5-layer 128-unit temporal convolutional network (TCN) is needed to simulate the I/O patterns of a pyramidal neuron at millisecond resolution (single-spike precision). As a gross comparison, this means that a single biological neuron needs between 640 and 2048 artificial neurons to be simulated adequately.

[Beniaguev D, Segev I, London M. Single cortical neurons as deep artificial neural networks. Neuron. 2021 Sep 1;109(17):2727-2739.e3. doi: 10.1016/j.neuron.2021.07.002](https://pubmed.ncbi.nlm.nih.gov/34380016/)

**For the interested reader: [here you can read about Transformers](https://jalammar.github.io/illustrated-transformer/).**

# Let's talk about Tensors

At the storage level, a PyTorch or TensorFlow Tensor is just a **multidimensional array**. It is, however, much more transformation-centered, and by *transformation* I mean a mathematical transformation.

In the past I have often found it hard to speak clearly about Tensor transformations and dimensions. Below is a screenshot from a random tutorial in which the author is clearly confused when talking about Tensor dimensions.

![Tensor dimensions confusion - screenshot from https://beerensahu.wordpress.com/2018/03/21/pytorch-tutorial-lesson-1-tensor/](https://i.imgur.com/jikCq6K.png)

In [None]:
# Legacy constructor: allocates an uninitialized 2x3 tensor, so the printed values are arbitrary
torch.Tensor(2, 3)

tensor([[ 6.9204e-12,  4.4884e-41, -9.1724e+01],
        [ 3.2007e-41,  2.4085e+09,  4.4882e-41]])
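
The odd-looking values above are simply whatever happened to be in memory, because `torch.Tensor(2, 3)` does not initialize its storage. As a small sketch of the difference, the snippet below contrasts `torch.empty` (uninitialized, like the call above) with `torch.zeros` (explicitly initialized). The variable names are only illustrative.

```python
import torch

# Allocates a 2x3 tensor without initializing it; contents are arbitrary
uninitialized = torch.empty(2, 3)

# Allocates a 2x3 tensor and fills it with zeros
zeros = torch.zeros(2, 3)

# Both tensors have the same shape; only the contents differ
print(uninitialized.shape)  # torch.Size([2, 3])
print(zeros.shape)          # torch.Size([2, 3])
print(zeros)
```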

Since a Tensor is a much more mathematically inclined object than a multidimensional array, let me introduce a naming convention that is compatible with the mathematical objects a Tensor represents. For instance, a vector from $\mathbb{R}^n$ is considered $n$-dimensional in mathematics, but it can be stored in a one-dimensional array; a matrix in $\mathbb{R}^{r \times c}$ is considered an $r \times c$-dimensional object, but it can be stored in a two-dimensional array. This duality creates confusion. In my experience, the following naming convention is coherent, given that Tensors **are** mathematical objects.

Tensor Terms | Meaning | Multidimensional Array Terms
---|---|---
**order** | number of indices (levels) in a Tensor | dimension
**dimension** | number of components a Tensor can store in a given order | size
Tensor of order zero | a constant | -
Tensor of order one | a vector | one-dimensional array
Tensor of order two | a matrix | two-dimensional array
`torch.tensor([1.1, 4.12, 8.9, 14.85])` | an order-one 4-dimensional tensor representing a vector in $\mathbb{R}^4$ | `[1.1, 4.12, 8.9, 14.85]`

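The convention in the table can be checked directly in PyTorch. In this sketch, `t.dim()` reports what the table calls the **order**, and `t.shape` lists the **dimension** along each index. The scalar and matrix values are arbitrary examples; the vector is the one from the table.

```python
import torch

scalar = torch.tensor(3.14)                     # order zero: a constant
vector = torch.tensor([1.1, 4.12, 8.9, 14.85])  # order one, 4-dimensional
matrix = torch.zeros(2, 3)                      # order two, dimensions 2 and 3

for name, t in [("scalar", scalar), ("vector", vector), ("matrix", matrix)]:
    # t.dim() is the number of indices; tuple(t.shape) gives the size along each index
    print(name, "order:", t.dim(), "dimensions:", tuple(t.shape))

# scalar order: 0 dimensions: ()
# vector order: 1 dimensions: (4,)
# matrix order: 2 dimensions: (2, 3)
```

Note that PyTorch's own naming follows the array convention (`dim()` counts indices), which is exactly the source of the confusion the table tries to untangle.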

**Now, together, let's go through [this presentation on tensors and basic layers](https://drive.google.com/file/d/1JyTDpcDhe3Ep3bnzEtK5eP0KhaDru8Nc/view?usp=sharing).**