# Scalars, vectors, matrices, and tensors
>Correspondence between mathematics and PyTorch.

- toc: true 
- badges: true
- comments: true
- categories: [mathematics]

---

tags: scalars vectors matrices tensors mathematics linear algebra pytorch  

---

# tl;dr
We establish the following correspondence between entities commonly encountered in linear algebra and their counterparts in `PyTorch`.

|mathematical name|mathematical notation|`tensor` shape|`tensor` dimension|
|---|---|---|---|
|scalar|$x$|`()`|`0`|
|vector|$(x_1, \dots, x_n)$|`(n,)`|`1`|
|linear transform| $w_{i,j}$ <br> $i=1, \dots, m$<br>$j=1, \dots, n$|N/A|N/A|
|matrix|$\begin{bmatrix}w_{1,1}&\dots&w_{1,n}\\\vdots&\ddots&\vdots\\w_{m,1}&\dots&w_{m,n}\end{bmatrix}$|`(m,n)`| `2`|
|column vector|$\begin{bmatrix}x_1\\\vdots\\x_n\end{bmatrix}$|`(n,1)`|`2`|
|row vector|$\begin{bmatrix}x_1&\dots&x_n\end{bmatrix}$|`(1,n)`|`2`|

>Note:
When we write `tensor`, we mean a `PyTorch` `tensor`.  Otherwise, "tensor" refers to the mathematical notion.

>Important:
The dimension of a _vector_ is the number of its entries.  The dimension of a _tensor_ is the number of indices needed to label its entries, in other words the length of its shape.

>Note: As a consequence, it is enough to refer to a `(m,n)`-`tensor` rather than to a `2`-dimensional `tensor` with shape `(m,n)`, but at times we will want to emphasize the dimension and make the information explicit.
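
The correspondence in the table can be checked directly. The following is a minimal sketch, assuming illustrative sizes `m = 2` and `n = 3` and zero-filled placeholder tensors (neither is prescribed by the text); it confirms that `ndim` always equals the length of the shape.

```python
import torch

# Illustrative sizes (an assumption for this sketch); any positive integers work.
m, n = 2, 3

examples = {
    "scalar": torch.tensor(3.14),        # shape ()
    "vector": torch.zeros(n),            # shape (n,)
    "matrix": torch.zeros(m, n),         # shape (m, n)
    "column vector": torch.zeros(n, 1),  # shape (n, 1)
    "row vector": torch.zeros(1, n),     # shape (1, n)
}

for name, t in examples.items():
    # The tensor dimension is the number of indices, i.e. the length of the shape.
    assert t.ndim == len(t.shape)
    print(f"{name:>13}: shape={tuple(t.shape)}, ndim={t.ndim}")
```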

# Scalars
A scalar is a real number.  It corresponds to a `0`-dimensional `tensor` with empty shape:

In [1]:
from torch import tensor
x = tensor(3.14)
print("Example of a 0-dimensional tensor in PyTorch:")
print("-"*45)
print(x)
print(f"\nDimension:   {x.ndim}")
print(f"Shape:       {tuple(x.shape)}")

Example of a 0-dimensional tensor in PyTorch:
---------------------------------------------
tensor(3.1400)

Dimension:   0
Shape:       ()


# Vectors
An $n$-dimensional vector is a **list** of $n$ scalars:
\begin{equation}(x_1, \dots, x_n)\,.\end{equation}
It corresponds to a `1`-dimensional `tensor` of shape `(n,)`.

In [2]:
# collapse-show
from torch import tensor
x = tensor([-1., 5., 7.])
print("Example of a 1-dimensional tensor in PyTorch:")
print("-"*45)
print(x)
print(f"\nDimension:   {x.ndim}")
print(f"Shape:       {tuple(x.shape)}")

Example of a 1-dimensional tensor in PyTorch:
---------------------------------------------
tensor([-1.,  5.,  7.])

Dimension:   1
Shape:       (3,)


# Linear transformations
A linear transformation which maps $n$-dimensional vectors to $m$-dimensional vectors can be identified with a doubly-indexed list of scalars $w_{i,j}$, where $i$ runs from $1$ to $m$ and $j$ from $1$ to $n$.  It acts on an input vector $(x_1, \dots, x_n)$ to produce an output vector $(y_1, \dots, y_m)$ according to the familiar formula:
$$y_i = \sum_{j=1}^nw_{i,j}x_j\quad \textrm{for}\quad i=1, \dots, m\,.$$

>Tip:
The indices in $w_{i,j}x_j$ appear in the order $(i, j, j)$.  The dummy index $j$ is repeated next to itself and "disappears" upon summation, while the index $i$ remains.
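
The summation formula can be spelled out in code. This is a sketch under assumptions: the values of `w` and `x` are made up for illustration, and indices are 0-based in Python rather than 1-based as in the formula. The `einsum` signature `"ij,j->i"` mirrors the index pattern from the tip: `j` is repeated and summed away, `i` remains.

```python
import torch

m, n = 2, 3
# Illustrative coefficients w[i, j] and input x[j] (assumed values).
w = torch.tensor([[1., 2., 3.],
                  [4., 5., 6.]])   # shape (m, n)
x = torch.tensor([1., 0., -1.])    # shape (n,)

# y_i = sum_j w_{i,j} x_j, written as an explicit sum over the dummy index j.
y_loop = torch.stack([sum(w[i, j] * x[j] for j in range(n)) for i in range(m)])

# The same contraction, with the index pattern written out.
y_einsum = torch.einsum("ij,j->i", w, x)

print(y_loop)
print(y_einsum)
assert torch.allclose(y_loop, y_einsum)
```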

# Column vectors
For computational purposes, instead of a vector $(x_1, \dots, x_n)$, it is convenient to work with its **column vector** counterpart, which is the array
$$\begin{bmatrix}x_1\\\vdots\\x_n\end{bmatrix}\,.
$$
>Important:
It is important to distinguish a vector from its corresponding column vector:
$$(x_1, \dots, x_n) \quad \ne\quad \begin{bmatrix}x_1\\\vdots\\x_n\end{bmatrix}\,.$$  

Indeed, while the vector corresponds to a `1`-dimensional `tensor`, its column vector counterpart corresponds to a `2`-dimensional `tensor` with shape `(n,1)`:

In [3]:
# collapse-show
from torch import tensor
x = tensor([-1., 5., 7.]).reshape(-1, 1)
print("Example of a 2-dimensional tensor in PyTorch:")
print("-"*45)
print(x)
print(f"\nDimension:   {x.ndim}")
print(f"Shape:       {tuple(x.shape)}")

Example of a 2-dimensional tensor in PyTorch:
---------------------------------------------
tensor([[-1.],
        [ 5.],
        [ 7.]])

Dimension:   2
Shape:       (3, 1)


# Matrices
An **$m$-by-$n$ matrix** is a doubly-indexed list $w_{i,j}$ where $i$ ranges from $1$ to $m$ and $j$ from $1$ to $n$.  It is usually written as an array with $m$ rows and $n$ columns:
$$\begin{bmatrix}
w_{1,1}&\dots&w_{1,j}&\dots&w_{1,n}\\
\vdots&&\vdots&&\vdots\\
w_{i,1}&\dots&w_{i,j}&\dots&w_{i,n}\\
\vdots&&\vdots&&\vdots\\
w_{m,1}&\dots&w_{m,j}&\dots&w_{m,n}
\end{bmatrix}
$$

>Important:
The first index in $w_{i,j}$ labels rows and the second index labels columns.

An $m$-by-$n$ matrix corresponds to a `2`-dimensional `tensor` with shape `(m,n)`:

In [4]:
# collapse-show
import torch
m = 3  # number of rows
n = 4  # number of columns
x = torch.randn(m, n)
print("Example of a 2-dimensional tensor in PyTorch:")
print("-"*45)
print(x)
print(f"\nDimension:   {x.ndim}")
print(f"Shape:       {tuple(x.shape)}")

Example of a 2-dimensional tensor in PyTorch:
---------------------------------------------
tensor([[-0.3536,  1.9439, -0.0042, -0.8287],
        [-0.4937,  0.4641, -1.0579,  0.2006],
        [-0.4996, -1.2149, -0.9282,  0.1443]])

Dimension:   2
Shape:       (3, 4)
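
The row/column convention for the two indices can be checked by indexing. In this sketch the entries are chosen (as an assumption, purely for readability) so that each value encodes its own 1-based row and column, while the Python indices themselves are 0-based.

```python
import torch

# Entry "rc" sits in row r, column c (1-based), e.g. 13. is row 1, column 3.
W = torch.tensor([[11., 12., 13.],
                  [21., 22., 23.]])   # shape (2, 3)

first_row_third_col = W[0, 2]   # first index selects the row, second the column
second_row = W[1]               # a single index selects a whole row
first_col = W[:, 0]             # slicing the first index selects a column

print(first_row_third_col)
print(second_row)
print(first_col)
```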


# Computation and compact representation of linear transformations
The _collection_ of $m$ formulas $y_i = \sum_{j=1}^nw_{i,j}x_j$, where $i$ runs from $1$ to $m$, is compactly represented as
$$
\begin{bmatrix}y_1\\\vdots\\y_m\end{bmatrix}
\,=\,\begin{bmatrix}w_{1,1}&\dots&w_{1,n}\\\vdots&\ddots&\vdots\\w_{m,1}&\dots&w_{m,n}\end{bmatrix}
\begin{bmatrix}x_1\\\vdots\\x_n\end{bmatrix}
$$
where the matrix is written to the left of the input column vector.

>Note:
The number of columns in the matrix matches the number of rows in the column vector.
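
In `PyTorch`, this compact form is the matrix product `@`. A minimal sketch, assuming the same kind of made-up values as elsewhere in this post: a `(m,n)` matrix times a `(n,1)` column vector gives a `(m,1)` column vector, and a mismatch between columns and rows is rejected.

```python
import torch

W = torch.tensor([[1., 2., 3.],
                  [4., 5., 6.]])                   # shape (m, n) = (2, 3)
x_col = torch.tensor([1., 0., -1.]).reshape(-1, 1)  # shape (n, 1) = (3, 1)

y_col = W @ x_col                                   # shape (m, 1) = (2, 1)
print(y_col)
print(f"Shape: {tuple(y_col.shape)}")

# If the number of columns of W does not match the number of rows of the
# right-hand factor, the product is undefined and PyTorch raises an error.
try:
    W @ x_col.T          # (2, 3) @ (1, 3): shapes do not match
    mismatch_raised = False
except RuntimeError:
    mismatch_raised = True
print(f"Mismatch rejected: {mismatch_raised}")
```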

# Row vectors
Identifying a vector with its column vector is the more common convention.  Alternatively, the vector $(x_1, \dots, x_n)$ can be identified with its **row vector** counterpart, which is the array
$$\begin{bmatrix}x_1&\dots&x_n\end{bmatrix}\,.$$

>Important:
Again, a vector and its column and row vector counterparts are all different objects:
$$(x_1, \dots, x_n) \quad \ne \quad \begin{bmatrix}x_1\\\vdots\\x_n\end{bmatrix}
\quad\ne\quad
\begin{bmatrix}x_1&\dots&x_n\end{bmatrix}
\quad\ne\quad
(x_1, \dots, x_n)\,.$$  

Indeed, a row vector of length $n$ corresponds to a `2`-dimensional `tensor` with shape `(1,n)`:

In [5]:
# collapse-show
from torch import tensor
x = tensor([-1., 5., 7.]).reshape(1, -1)
print("Example of a 2-dimensional tensor in PyTorch:")
print("-"*45)
print(x)
print(f"\nDimension:   {x.ndim}")
print(f"Shape:       {tuple(x.shape)}")

Example of a 2-dimensional tensor in PyTorch:
---------------------------------------------
tensor([[-1.,  5.,  7.]])

Dimension:   2
Shape:       (1, 3)


> Note:
Vectors and row vectors are both written horizontally.  However, vectors are written with parentheses while row vectors are written with square brackets.
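
The three objects can be converted into one another. The sketch below is one possible way to do it, using `unsqueeze`, `.T`, and `flatten`; `reshape`, as in the cells above, works equally well.

```python
import torch

x = torch.tensor([-1., 5., 7.])   # vector, shape (3,)

x_col = x.unsqueeze(1)            # column vector, shape (3, 1)
x_row = x.unsqueeze(0)            # row vector, shape (1, 3)

# Transposing a column vector gives the corresponding row vector.
assert torch.equal(x_col.T, x_row)

# Flattening either one recovers the original vector.
assert torch.equal(x_col.flatten(), x)
assert torch.equal(x_row.flatten(), x)

# The three are distinct objects: torch.equal also compares shapes.
print(torch.equal(x, x_col), torch.equal(x, x_row), torch.equal(x_col, x_row))
```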

# Computing with row vectors
In terms of row vectors, the output of a linear transformation is more conveniently written using the similar but distinct formulas
$$y_j = \sum_{i=1}^nx_i\tilde w_{i,j}\qquad \textrm{for}\quad j=1, \dots, m$$
where the matrix of coefficients $\tilde w_{i,j}$, with $i$ running from $1$ to $n$ and $j$ from $1$ to $m$, now has $n$ rows and $m$ columns:
$$
\begin{bmatrix}
\tilde w_{1,1}&\dots&\tilde w_{1,j}&\dots&\tilde w_{1,m}\\
\vdots&&\vdots&&\vdots\\
\tilde w_{i,1}&\dots&\tilde w_{i,j}&\dots&\tilde w_{i,m}\\
\vdots&&\vdots&&\vdots\\
\tilde w_{n,1}&\dots&\tilde w_{n,j}&\dots&\tilde w_{n,m}
\end{bmatrix}\,.$$

>Tip:
The indices in $x_i\tilde w_{i,j}$ appear in the order $(i, i, j)$ (note the difference from earlier).  The dummy index, which is now $i$, is repeated next to itself.  The remaining index, which is now $j$, is the same as that in the quantity $y_j$ on the left of the equation.

The collection of $m$ formulas $y_j = \sum_{i=1}^nx_i\tilde w_{i,j}$, for $j$ from $1$ to $m$, is compactly represented as
$$
\begin{bmatrix}y_1&\dots&y_m\end{bmatrix}
\quad=\quad
\begin{bmatrix}x_1&\dots&x_n\end{bmatrix}
\begin{bmatrix}\tilde w_{1,1}&\dots&\tilde w_{1,m}\\\vdots&\ddots&\vdots\\\tilde w_{n,1}&\dots&\tilde w_{n,m}
\end{bmatrix}
$$
where the matrix is written to the right of the input row vector.

>Note:
The number of columns in the input row vector matches the number of rows in the matrix.
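
The row-vector form looks as follows in `PyTorch`. This is a sketch with assumed values for `W_tilde` and `x_row`; the `einsum` signature `"i,ij->j"` mirrors the index order $(i, i, j)$ noted in the tip.

```python
import torch

# Illustrative (n, m) = (3, 2) matrix of coefficients w~[i, j] (assumed values).
W_tilde = torch.tensor([[1., 4.],
                        [2., 5.],
                        [3., 6.]])
x_row = torch.tensor([[1., 0., -1.]])   # row vector, shape (1, n) = (1, 3)

# The matrix is written to the right of the input row vector.
y_row = x_row @ W_tilde                  # shape (1, m) = (1, 2)
print(y_row)
print(f"Shape: {tuple(y_row.shape)}")

# y_j = sum_i x_i w~_{i,j}: the dummy index i is summed away, j remains.
y_einsum = torch.einsum("i,ij->j", x_row[0], W_tilde)
assert torch.allclose(y_row[0], y_einsum)
```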

>Important:
The coefficients $\tilde w_{i,j}$ take the same values as the coefficients $w_{i,j}$ in the representation with column vectors, but they are arranged in a different matrix: the matrix of the $\tilde w_{i,j}$ is $n$-by-$m$ while the matrix of the $w_{i,j}$ is $m$-by-$n$.  In fact, these matrices are transposes of each other:
$$
\begin{bmatrix}\tilde w_{1,1}&\dots&\tilde w_{1,m}\\\vdots&\ddots&\vdots\\\tilde w_{n,1}&\dots&\tilde w_{n,m}
\end{bmatrix}
\quad=\quad
\begin{bmatrix}w_{1,1}&\dots&w_{1,n}\\\vdots&\ddots&\vdots\\w_{m,1}&\dots&w_{m,n}\end{bmatrix}^\top
\quad=\quad
\begin{bmatrix}w_{1,1}&\dots&w_{m,1}\\\vdots&\ddots&\vdots\\w_{1,n}&\dots&w_{m,n}\end{bmatrix}
\,.$$
Equivalently,
$$\tilde w_{i,j} = w_{j,i}\qquad \textrm{for}\quad i=1, \dots, n\,,\quad j=1, \dots, m\,.$$
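
The transpose relation can be verified numerically. A minimal sketch with assumed values: the same linear transformation is applied once in the column-vector convention and once in the row-vector convention, and the two results agree up to transposition.

```python
import torch

W = torch.tensor([[1., 2., 3.],
                  [4., 5., 6.]])   # shape (m, n) = (2, 3)
W_tilde = W.T                      # shape (n, m) = (3, 2)
x = torch.tensor([1., 0., -1.])    # vector, shape (n,)

# Entry-wise statement: w~[i, j] == w[j, i] (0-based indices here).
n, m = W_tilde.shape
assert all(W_tilde[i, j] == W[j, i] for i in range(n) for j in range(m))

y_col = W @ x.reshape(-1, 1)         # column convention, shape (m, 1)
y_row = x.reshape(1, -1) @ W_tilde   # row convention, shape (1, m)

# Same numbers, arranged as a column in one case and as a row in the other.
assert torch.equal(y_row, y_col.T)
print(y_col.flatten(), y_row.flatten())
```

For comparison, `torch.nn.Linear` stores its weight with shape `(out_features, in_features)` and is documented as computing $y = xA^\top + b$, which is the row-vector convention applied to the transpose of the stored matrix.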