## Common Tensor Operations

### Tensor Transposition

In linear algebra, the transpose of a matrix flips the matrix over its main diagonal: it swaps the row and column indices of a matrix A, producing another matrix, often denoted Aᵀ.
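As a quick check of that definition, the sketch below verifies in NumPy that entry (i, j) of the transpose equals entry (j, i) of the original. The names `A` and `A_T` are illustrative and not part of the notebook; the values are the same as the matrix `X` used in the cells that follow.

```python
import numpy as np

# Illustrative 3x2 matrix (same values as the notebook's X)
A = np.array([[25, 2], [5, 26], [3, 7]])
A_T = A.T

# The shape flips from (3, 2) to (2, 3)
print(A.shape, A_T.shape)

# Entry (i, j) of the transpose equals entry (j, i) of the original
for i in range(A_T.shape[0]):
    for j in range(A_T.shape[1]):
        assert A_T[i, j] == A[j, i]

# Transposing twice recovers the original matrix
print(np.array_equal(A_T.T, A))
```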

In [2]:
import numpy as np
import matplotlib.pyplot as plt
import torch
import tensorflow as tf

In [4]:
X = np.array([[25, 2], [5, 26], [3, 7]])
X

array([[25,  2],
       [ 5, 26],
       [ 3,  7]])

In [5]:
X.T

array([[25,  5,  3],
       [ 2, 26,  7]])

In [6]:
X_pt = torch.tensor([[25, 2], [5, 26], [3, 7]])
X_pt

tensor([[25,  2],
        [ 5, 26],
        [ 3,  7]])

In [7]:
X_pt.T

tensor([[25,  5,  3],
        [ 2, 26,  7]])

In [8]:
X_tf = tf.Variable([[25, 2], [5, 26], [3, 7]])
X_tf

<tf.Variable 'Variable:0' shape=(3, 2) dtype=int32, numpy=
array([[25,  2],
       [ 5, 26],
       [ 3,  7]])>

In [9]:
tf.transpose(X_tf) # less Pythonic

<tf.Tensor: shape=(2, 3), dtype=int32, numpy=
array([[25,  5,  3],
       [ 2, 26,  7]])>

### Basic Arithmetical Properties

Adding a scalar to a tensor, or multiplying a tensor by a scalar, applies the operation to every element and leaves the tensor's shape unchanged:

In [10]:
X*2

array([[50,  4],
       [10, 52],
       [ 6, 14]])

In [11]:
X+2

array([[27,  4],
       [ 7, 28],
       [ 5,  9]])

In [12]:
X*2+2

array([[52,  6],
       [12, 54],
       [ 8, 16]])

In [13]:
X_pt*2+2 # Python operators are overloaded; could alternatively use torch.mul() or torch.add()

tensor([[52,  6],
        [12, 54],
        [ 8, 16]])

In [14]:
torch.add(torch.mul(X_pt, 2), 2)

tensor([[52,  6],
        [12, 54],
        [ 8, 16]])

In [15]:
X_tf*2+2 # Operators are likewise overloaded; could equally use tf.multiply() and tf.add()

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[52,  6],
       [12, 54],
       [ 8, 16]])>

In [16]:
tf.add(tf.multiply(X_tf, 2), 2)

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[52,  6],
       [12, 54],
       [ 8, 16]])>

If two tensors have the same shape, arithmetic operators are applied element-wise by default. For multiplication, this is **not matrix multiplication**; it is the **Hadamard product**, or simply the **element-wise product**.

The mathematical notation is $A \odot X$.

In [17]:
X

array([[25,  2],
       [ 5, 26],
       [ 3,  7]])

In [18]:
A = X+2
A

array([[27,  4],
       [ 7, 28],
       [ 5,  9]])

In [19]:
A + X

array([[52,  6],
       [12, 54],
       [ 8, 16]])

In [20]:
A * X

array([[675,   8],
       [ 35, 728],
       [ 15,  63]])

In [21]:
A_pt = X_pt + 2

In [22]:
A_pt + X_pt

tensor([[52,  6],
        [12, 54],
        [ 8, 16]])

In [23]:
A_pt * X_pt

tensor([[675,   8],
        [ 35, 728],
        [ 15,  63]])

In [24]:
A_tf = X_tf + 2

In [25]:
A_tf + X_tf

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[52,  6],
       [12, 54],
       [ 8, 16]])>

In [26]:
A_tf * X_tf

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[675,   8],
       [ 35, 728],
       [ 15,  63]])>
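To make the contrast with matrix multiplication concrete, the NumPy sketch below computes both on the same pair of matrices. It assumes the `@` operator (equivalent to `np.matmul`) for matrix multiplication; because `A` and `X` are both 3×2, the matrix product is only defined after transposing one of them.

```python
import numpy as np

X = np.array([[25, 2], [5, 26], [3, 7]])
A = X + 2

# Hadamard (element-wise) product: shapes must match, result has the same shape
hadamard = A * X
print(hadamard.shape)

# Matrix multiplication needs inner dimensions to agree: (2, 3) @ (3, 2) -> (2, 2)
matmul = A.T @ X
print(matmul.shape)

# Multiplying the two 3x2 matrices directly with @ is not defined
try:
    A @ X
except ValueError as err:
    print("matrix multiplication failed:", err)
```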

### Reduction

Calculating the sum across all elements of a tensor is a common operation. For example: 

* For vector ***x*** of length *n*, we calculate $\sum_{i=1}^{n} x_i$
* For matrix ***X*** with *m* by *n* dimensions, we calculate $\sum_{i=1}^{m} \sum_{j=1}^{n} X_{i,j}$
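The two formulas above translate directly into loops. The sketch below writes them out in plain Python and compares the totals with NumPy's built-in `sum()`; the vector `x` and the loop variable names are illustrative.

```python
import numpy as np

x = np.array([25, 2, 5])
X = np.array([[25, 2], [5, 26], [3, 7]])

# Vector case: a single sum over i = 1..n
vector_total = 0
for i in range(len(x)):
    vector_total += x[i]

# Matrix case: a double sum over i = 1..m and j = 1..n
m, n = X.shape
matrix_total = 0
for i in range(m):
    for j in range(n):
        matrix_total += X[i, j]

print(vector_total == x.sum(), matrix_total == X.sum())
```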

In [28]:
X

array([[25,  2],
       [ 5, 26],
       [ 3,  7]])

In [29]:
X.sum()

68

In [30]:
torch.sum(X_pt)

tensor(68)

In [31]:
tf.reduce_sum(X_tf)

<tf.Tensor: shape=(), dtype=int32, numpy=68>

In [32]:
# Can also be done along one specific axis alone, e.g.:
X.sum(axis=0) # sums down the rows, giving one total per column

array([33, 35])

In [33]:
X.sum(axis=1) # sums across the columns, giving one total per row

array([27, 31, 10])

In [34]:
torch.sum(X_pt, 0)

tensor([33, 35])

In [35]:
tf.reduce_sum(X_tf, 1)

<tf.Tensor: shape=(3,), dtype=int32, numpy=array([27, 31, 10])>
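The axis argument names the dimension that is collapsed by the reduction, which is easy to mix up. The sketch below, with illustrative variable names, shows the resulting shapes in NumPy and uses the `keepdims` option to retain the reduced axis with length 1.

```python
import numpy as np

X = np.array([[25, 2], [5, 26], [3, 7]])  # shape (3, 2)

# axis=0 collapses the row axis: one total per column, shape (2,)
column_totals = X.sum(axis=0)

# axis=1 collapses the column axis: one total per row, shape (3,)
row_totals = X.sum(axis=1)

# keepdims=True keeps the collapsed axis as length 1, shape (3, 1)
row_totals_2d = X.sum(axis=1, keepdims=True)

print(column_totals.shape, row_totals.shape, row_totals_2d.shape)
```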

Many other operations can be applied with reduction along all or a selection of axes, e.g.:

* maximum
* minimum
* mean
* product

They're fairly straightforward and used less often than summation, so you're welcome to look them up in library docs if you ever need them.

In [42]:
X.max(), X.max(axis=1), X.max(axis=0)

(26, array([25, 26,  7]), array([25, 26]))

In [43]:
X.min(), X.min(axis=1), X.min(axis=0)

(2, array([2, 5, 3]), array([3, 2]))

In [45]:
X.mean(), X.mean(axis=1), X.mean(axis=0)

(11.333333333333334,
 array([13.5, 15.5,  5. ]),
 array([11.        , 11.66666667]))

In [46]:
X.prod(), X.prod(axis=1), X.prod(axis=0)

(136500, array([ 50, 130,  21]), array([375, 364]))

### The Dot Product

If we have two vectors (say, ***x*** and ***y***) with the same length *n*, we can calculate the dot product between them. This is denoted in several different ways, including the following:

* $x \cdot y$
* $x^Ty$
* $\langle x,y \rangle$

Regardless of which notation you use (I prefer the first), the calculation is the same: we calculate the products element-wise and then sum across those products to obtain a scalar value. That is, $x \cdot y = \sum_{i=1}^{n} x_i y_i$.
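The definition can be written out step by step. This sketch computes the dot product with an explicit loop, as an element-wise product followed by a sum, and with NumPy's `np.dot()` and `@` operator, and checks that all four agree. The variable names are illustrative.

```python
import numpy as np

x = np.array([25, 2, 5])
y = np.array([0, 1, 2])

# Explicit version of the summation: multiply pairwise, then accumulate
loop_result = 0
for x_i, y_i in zip(x, y):
    loop_result += x_i * y_i

# Element-wise product followed by a sum reduction
reduce_result = np.sum(x * y)

# Built-in equivalents
dot_result = np.dot(x, y)
at_result = x @ y

print(loop_result == reduce_result == dot_result == at_result)
```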

The dot product is ubiquitous in deep learning: It is performed at every artificial neuron in a deep neural network, which may be made up of millions (or orders of magnitude more) of these neurons.
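As a rough illustration of where the dot product appears in a neuron, the sketch below computes a weighted sum of inputs plus a bias and passes it through a ReLU activation. The weights, bias, and choice of activation are hypothetical values picked for the example, not taken from any particular network.

```python
import numpy as np

# Hypothetical inputs, weights, and bias for a single artificial neuron
inputs = np.array([5, 10, 0])
weights = np.array([-1, 2, -2])
bias = -3

# The weighted sum is a dot product plus the bias term
z = np.dot(weights, inputs) + bias

# ReLU activation (an assumed choice): negative values are clipped to zero
activation = max(0, z)

print(z, activation)
```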

In [49]:
x = np.array([25, 2, 5])
x

array([25,  2,  5])

In [50]:
y = np.array([0, 1, 2])
y

array([0, 1, 2])

In [51]:
25*0 + 2*1 + 5*2

12

In [52]:
np.dot(x, y)

12

In [56]:
x_pt = torch.tensor([[25, 2, 5]])
x_pt

tensor([[25,  2,  5]])

In [57]:
y_pt = torch.tensor([0, 1, 2])
y_pt

tensor([0, 1, 2])

In [58]:
np.dot(x_pt, y_pt) # NumPy converts the tensors to arrays; x_pt is 2-D (1x3), so the result is a 1-element array

array([12], dtype=int64)
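The one-element array in the output above comes from the extra pair of brackets in `x_pt`, which makes it a 1×3 matrix and not a length-3 vector. The sketch below reproduces the effect with plain NumPy arrays standing in for the tensors (an assumption made to keep the example self-contained); `row` and `vec` are illustrative names.

```python
import numpy as np

row = np.array([[25, 2, 5]])  # 2-D, shape (1, 3)
vec = np.array([0, 1, 2])     # 1-D, shape (3,)

# (1, 3) dotted with (3,) is a matrix-vector product with shape (1,)
matrix_vector = np.dot(row, vec)
print(matrix_vector.shape)

# Using the 1-D row gives a true dot product, which is a scalar
scalar = np.dot(row[0], vec)
print(np.ndim(scalar))
```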

In [59]:
torch.dot(torch.tensor([25, 2, 5.]), torch.tensor([0, 1, 2.])) # torch.dot() expects two 1-D tensors, so the 2-D x_pt cannot be passed directly

tensor(12.)

In [64]:
x_tf = tf.Variable([25, 2, 5])
x_tf

<tf.Variable 'Variable:0' shape=(3,) dtype=int32, numpy=array([25,  2,  5])>

In [65]:
y_tf = tf.Variable([0, 1, 2])
y_tf

<tf.Variable 'Variable:0' shape=(3,) dtype=int32, numpy=array([0, 1, 2])>

In [66]:
tf.reduce_sum(tf.multiply(x_tf, y_tf))

<tf.Tensor: shape=(), dtype=int32, numpy=12>

#### Exercise

In [70]:
y = np.array([[42, 4, 7 , 99],[-99, -3, 17, 22]])
y

array([[ 42,   4,   7,  99],
       [-99,  -3,  17,  22]])

In [71]:
y.T

array([[ 42, -99],
       [  4,  -3],
       [  7,  17],
       [ 99,  22]])

In [74]:
x = np.array([[25, 10], [-2, 1]])
y = np.array([[-1, 7], [10, 8]])
x * y

array([[-25,  70],
       [-20,   8]])

In [75]:
w = np.array([-1, 2, -2])
x = np.array([5, 10, 0])
np.dot(w, x)

15