# Tensor Operations

## Tensor Transposition

- The transpose of a scalar is itself, e.g.: $x^T = x$
- The transpose of a vector converts a column vector to a row vector, and vice versa
- Scalar and vector transposition are special cases of **matrix transposition**:
    - Flip of axes over the *main diagonal*: $(\boldsymbol{X}^T)_{i,j} = \boldsymbol{X}_{j,i}$

   $$\begin{bmatrix} x_{1,1} & x_{1,2} \\ x_{2,1} & x_{2,2} \\ x_{3,1} & x_{3,2} \end{bmatrix}^T = \begin{bmatrix} x_{1,1} & x_{2,1} & x_{3,1} \\ x_{1,2} & x_{2,2} & x_{3,2} \end{bmatrix}$$
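The index rule above can be checked directly. The following is a minimal pure-Python sketch, assuming a matrix stored as a list of rows; `transpose` is a hypothetical helper name used only for illustration, not a library function.

```python
def transpose(matrix):
    """Return the transpose of a matrix given as a list of equal-length rows."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    # Entry (i, j) of the result is entry (j, i) of the input.
    return [[matrix[j][i] for j in range(n_rows)] for i in range(n_cols)]


X = [[25, 2], [5, 26], [3, 7]]   # 3 x 2
X_T = transpose(X)               # 2 x 3
print(X_T)  # [[25, 5, 3], [2, 26, 7]]
```

Transposing twice returns the original matrix, which is a quick sanity check for any transpose routine.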

### Tensor Transposition with Python

In [1]:
import numpy as np
import torch
import tensorflow as tf

In [5]:
X = np.array([[25, 2], [5, 26], [3, 7]])
X_tf = tf.Variable([[25, 2], [5, 26], [3, 7]])
X_pt = torch.tensor([[25, 2], [5, 26], [3, 7]])

##### Transposition with NumPy

In [6]:
X.T

array([[25,  5,  3],
       [ 2, 26,  7]])

##### Transposition with TensorFlow

In [7]:
tf.transpose(X_tf)

<tf.Tensor: shape=(2, 3), dtype=int32, numpy=
array([[25,  5,  3],
       [ 2, 26,  7]])>

##### Transposition with PyTorch

In [8]:
X_pt.T

tensor([[25,  5,  3],
        [ 2, 26,  7]])

## Basic Tensor Arithmetic

Adding a scalar to a tensor, or multiplying a tensor by a scalar, applies the operation to every element of the tensor, and the tensor's shape is retained:

In [9]:
X * 2

array([[50,  4],
       [10, 52],
       [ 6, 14]])

In [10]:
X + 2

array([[27,  4],
       [ 7, 28],
       [ 5,  9]])

In [11]:
X*2+2

array([[52,  6],
       [12, 54],
       [ 8, 16]])

In NumPy, PyTorch, and TensorFlow, the Python operators `+` and `*` are overloaded: the same symbol behaves differently depending on the types of its operands. On tensors in all three libraries, `+` and `*` are applied element-wise (matrix multiplication uses the separate `@` operator). Because the meaning of an overloaded operator depends on context, the explicit functions provided by the libraries can make the intent clearer:

In [13]:
torch.add(torch.mul(X_pt, 2), 2)

tensor([[52,  6],
        [12, 54],
        [ 8, 16]])

In [14]:
tf.add(tf.multiply(X_tf, 2), 2)

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[52,  6],
       [12, 54],
       [ 8, 16]])>
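The same comparison can be made in NumPy. This is a small sketch, assuming only that NumPy is installed; it checks that the operator form and the explicit-function form produce identical arrays.

```python
import numpy as np

X = np.array([[25, 2], [5, 26], [3, 7]])

with_operators = X * 2 + 2
with_functions = np.add(np.multiply(X, 2), 2)

# Both spellings perform the same element-wise arithmetic.
print(np.array_equal(with_operators, with_functions))  # True
```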

### Hadamard Product (or Element-wise Product)

If two tensors have the same shape, arithmetic operations are applied element-wise by default. The element-wise product is called the **Hadamard product**; for two tensors $\boldsymbol{A}$ and $\boldsymbol{X}$ it is denoted $\boldsymbol{A} \odot \boldsymbol{X}$.

In [15]:
X

array([[25,  2],
       [ 5, 26],
       [ 3,  7]])

In [16]:
A = X + 2
A

array([[27,  4],
       [ 7, 28],
       [ 5,  9]])

In [17]:
A + X

array([[52,  6],
       [12, 54],
       [ 8, 16]])

In [19]:
# the Hadamard product in NumPy
A * X

array([[675,   8],
       [ 35, 728],
       [ 15,  63]])

In [23]:
A_pt = X_pt + 2

In [21]:
A_pt + X_pt

tensor([[52,  6],
        [12, 54],
        [ 8, 16]])

In [22]:
# the Hadamard product in PyTorch
A_pt * X_pt

tensor([[675,   8],
        [ 35, 728],
        [ 15,  63]])

In [24]:
A_tf = X_tf + 2

In [25]:
A_tf + X_tf

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[52,  6],
       [12, 54],
       [ 8, 16]])>

In [26]:
# the Hadamard product in TensorFlow
A_tf * X_tf

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[675,   8],
       [ 35, 728],
       [ 15,  63]])>
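To make explicit what the `*` operator computes in the cells above, here is a from-scratch sketch in plain Python. The helper name `hadamard` is hypothetical and the list-of-rows representation is an assumption made for illustration.

```python
def hadamard(a, b):
    """Element-wise product of two equal-shape matrices stored as lists of rows."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("Hadamard product requires tensors of the same shape")
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


X = [[25, 2], [5, 26], [3, 7]]
A = [[value + 2 for value in row] for row in X]  # same values as A = X + 2

print(hadamard(A, X))  # [[675, 8], [35, 728], [15, 63]]
```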

### Tensor Reduction

Calculating the sum across all elements of a tensor is a common operation. For example:

- For vector $\boldsymbol{x}$ of length $n$, we calculate $\sum_{i=1}^{n} x_i$
- For matrix $\boldsymbol{X}$ with $m$ by $n$ dimensions, we calculate $\sum_{i=1}^{m} \sum_{j=1}^{n} X_{i,j}$
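The two summations above translate almost literally into Python. This is an illustrative sketch using plain lists; the example vector `x` is an assumed value chosen for the demonstration.

```python
x = [25, 2, 5]
X = [[25, 2], [5, 26], [3, 7]]

# Vector: a single sum over the index i.
vector_total = sum(x[i] for i in range(len(x)))

# Matrix: an outer sum over rows i and an inner sum over columns j.
matrix_total = sum(X[i][j] for i in range(len(X)) for j in range(len(X[0])))

print(vector_total)  # 32
print(matrix_total)  # 68
```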


In [27]:
X

array([[25,  2],
       [ 5, 26],
       [ 3,  7]])

In [28]:
X.sum()

68

In [29]:
torch.sum(X_pt)

tensor(68)

In [30]:
tf.reduce_sum(X_tf)

<tf.Tensor: shape=(), dtype=int32, numpy=68>

It can also be done along a specific axis.

In [39]:
X.sum(0)

array([33, 35])

In [40]:
X.sum(1)

array([27, 31, 10])

In [33]:
torch.sum(X_pt, 0)

tensor([33, 35])

In [34]:
tf.reduce_sum(X_tf, 1)

<tf.Tensor: shape=(3,), dtype=int32, numpy=array([27, 31, 10])>
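What "along an axis" means can be seen by writing the reduction out by hand. The sketch below uses plain Python lists (an assumption for illustration): summing along axis 0 collapses the rows, while summing along axis 1 collapses the columns.

```python
X = [[25, 2], [5, 26], [3, 7]]

# Axis 0: collapse the rows, leaving one total per column.
sum_axis_0 = [sum(column) for column in zip(*X)]

# Axis 1: collapse the columns, leaving one total per row.
sum_axis_1 = [sum(row) for row in X]

print(sum_axis_0)  # [33, 35]
print(sum_axis_1)  # [27, 31, 10]
```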

Other operations that can be applied with reduction along all or a selection of axes include:

- maximum
- minimum
- mean
- product
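As a brief sketch of the reductions listed above, the NumPy array methods `max`, `min`, `mean`, and `prod` accept an `axis` argument in the same way as `sum`. The variable names below are chosen only for this example.

```python
import numpy as np

X = np.array([[25, 2], [5, 26], [3, 7]])

overall_max = X.max()       # largest element of the whole matrix: 26
col_max = X.max(axis=0)     # largest value in each column: 25, 26
row_min = X.min(axis=1)     # smallest value in each row: 2, 5, 3
col_mean = X.mean(axis=0)   # mean of each column: 33/3 and 35/3
col_prod = X.prod(axis=0)   # product of each column: 375, 364

print(overall_max, col_max, row_min, col_mean, col_prod)
```

TensorFlow follows the same naming pattern as `tf.reduce_sum`, with `tf.reduce_max`, `tf.reduce_min`, `tf.reduce_mean`, and `tf.reduce_prod`.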

## The Dot Product

If we have two vectors $\boldsymbol{x}$ and $\boldsymbol{y}$ of the same length $n$, we can calculate the dot product between them.

The dot product is annotated in different ways:

- $\boldsymbol{x} \cdot \boldsymbol{y}$
- $\boldsymbol{x}^T \boldsymbol{y}$
- $\langle \boldsymbol{x} , \boldsymbol{y}\rangle$

Regardless of the notation, the calculation is the same: first the two vectors are multiplied element-wise, then the products are summed to a single scalar value:

$$\boldsymbol{x} \cdot \boldsymbol{y} = \sum_{i=1}^{n} x_i y_i$$
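The multiply-then-sum recipe can be written out in a few lines of plain Python. This is a minimal sketch; `dot` is a hypothetical helper name, not one of the library functions used elsewhere in this section.

```python
def dot(x, y):
    """Dot product of two equal-length vectors stored as lists."""
    if len(x) != len(y):
        raise ValueError("dot product requires vectors of the same length")
    # Multiply element-wise, then reduce the products to a single scalar.
    return sum(x_i * y_i for x_i, y_i in zip(x, y))


print(dot([25, 2, 5], [0, 1, 2]))  # 12
```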

In [88]:
x = np.array([25, 2, 5])
x_pt = torch.tensor([25, 2, 5])
x_tf = tf.Variable([25, 2, 5])
y = np.array([0, 1, 2])
y_pt = torch.tensor([0, 1, 2])
y_tf = tf.Variable([0, 1, 2])

In [89]:
np.dot(x, y)

12

In [90]:
# np.dot converts the PyTorch tensors to NumPy arrays before computing
np.dot(x_pt, y_pt)

12

In [91]:
torch.dot(torch.tensor([25, 2, 5.]), torch.tensor([0, 1, 2.]))

tensor(12.)

In [92]:
tf.reduce_sum(tf.multiply(x_tf, y_tf))

<tf.Tensor: shape=(), dtype=int32, numpy=12>

In [93]:
# for 2-D arrays, np.dot performs matrix multiplication
A = np.array([[1,2,3],[4,5,6]])
B = np.array([[7,8],[9,10],[11,12]])
np.dot(A, B)

array([[ 58,  64],
       [139, 154]])

## Exercises

1. What is $\boldsymbol{Y}^T$ for the following matrix:

$$\boldsymbol{Y} = \begin{bmatrix} 42 & 4 & 7 & 99 \\ -99 & -3 & 17 & 22 \end{bmatrix}$$

In [111]:
y = np.array([[42, 4, 7, 99],[-99, -3, 17, 22]])
y

array([[ 42,   4,   7,  99],
       [-99,  -3,  17,  22]])

In [112]:
y.T

array([[ 42, -99],
       [  4,  -3],
       [  7,  17],
       [ 99,  22]])

2. What is the Hadamard product of the following matrices:

$$\begin{bmatrix} 25 & 10 \\ -2 & 1 \end{bmatrix} \odot \begin{bmatrix} -1 & 7 \\ 10 & 8 \end{bmatrix}$$

In [113]:
A = np.array([[25,10],[-2,1]])
B = np.array([[-1,7],[10,8]])
A * B

array([[-25,  70],
       [-20,   8]])

3. What is the dot product of the following vectors:

$$\boldsymbol{w} = [-1 \quad 2 \quad -2]$$
$$\boldsymbol{z} = [5 \quad 10 \quad 0]$$

In [114]:
w = np.array([-1, 2, -2])
z = np.array([5, 10, 0])
np.dot(w, z)

15