# Purpose
This notebook verifies what Aurélien Géron writes about multiplying higher-dimensional tensors in *Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow* (2nd ed.).

In essence, the book says that when the tensors have more than two dimensions (that is, they are no longer matrices), the product is computed as a batch of ordinary matrix multiplications: the last two dimensions of each tensor are treated as a matrix, and the matrices found at the same leading index in both tensors are multiplied together.
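
Written out as a formula, the claim can be sketched as follows. This assumes both tensors share the same leading dimensions $b_1, \dots, b_k$, that $A$ has shape $(b_1, \dots, b_k, n, m)$, and that $B$ has shape $(b_1, \dots, b_k, m, p)$. The symbols are introduced here for illustration and are not the book's notation.

$$
C_{b_1,\dots,b_k,\,i,\,j} \;=\; \sum_{l=1}^{m} A_{b_1,\dots,b_k,\,i,\,l}\; B_{b_1,\dots,b_k,\,l,\,j}
$$

For each fixed choice of the leading indices, this is the ordinary product of an $n \times m$ matrix with an $m \times p$ matrix, so $C$ has shape $(b_1, \dots, b_k, n, p)$.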

For reference, unless the pagination has changed, the statement can be found on p. 560, in Chapter 16 (Natural Language Processing with RNNs and Attention).

In [1]:
import numpy as np
import tensorflow as tf

In [2]:
np.random.seed(42)

In [3]:
A = np.random.randint(1, 50, size=(2, 3, 4))
A

array([[[39, 29, 15, 43],
        [ 8, 21, 39, 19],
        [23, 11, 11, 24]],

       [[36, 40, 24,  3],
        [22,  2, 24, 44],
        [30, 38,  2, 21]]])

In [4]:
B = np.random.randint(1, 50, size=(2, 4, 5))
B

array([[[33, 12, 22, 44, 25],
        [49, 27, 42, 28, 16],
        [15, 47, 44,  3, 37],
        [ 7, 21,  9, 39, 18]],

       [[ 4, 25, 14,  9, 26],
        [ 2, 20, 28, 47,  7],
        [44,  8, 47, 35, 14],
        [17, 36, 40,  4,  2]]])

When `A` and `B` above are multiplied, the operation can be viewed as follows:

- `A` consists of two matrices of shape `(3, 4)`
- `B` consists of two matrices of shape `(4, 5)`
- Their product consists of two matrices of shape `(3, 5)`
    - Each matrix is the product of the corresponding matrices of `A` and of `B`
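
The view described in the list above can be sketched in a few lines. This is a minimal illustration, not code from the book: the arrays `X` and `Y` and the seed are hypothetical stand-ins, chosen so that the notebook's own `A` and `B` are left untouched. It builds the batched product three ways and checks that they agree.

```python
import numpy as np

# Hypothetical inputs with the same shapes as A and B in this notebook.
rng = np.random.default_rng(0)
X = rng.integers(1, 50, size=(2, 3, 4))
Y = rng.integers(1, 50, size=(2, 4, 5))

# Batched product computed in a single call.
batched = X @ Y

# The same product assembled one matrix pair at a time.
stacked = np.stack([X[i] @ Y[i] for i in range(X.shape[0])])

# The same contraction spelled out: b is the batch index, j is summed over.
via_einsum = np.einsum("bij,bjk->bik", X, Y)

print(batched.shape)
print(np.array_equal(batched, stacked))
print(np.array_equal(batched, via_einsum))
```

The `einsum` subscripts make the structure explicit: the batch index `b` appears in both inputs and in the output, so it is carried along rather than summed.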

In [5]:
AB = A @ B
AB.shape

(2, 3, 5)

In [6]:
AB

array([[[3234, 2859, 3123, 4250, 2768],
        [2011, 2895, 2945, 1798, 2321],
        [1631, 1594, 1668, 2289, 1590]],

       [[1331, 2000, 2872, 3056, 1558],
        [1896, 2366, 3252, 1308, 1010],
        [ 641, 2282, 2418, 2210, 1116]]])

In [7]:
np.array_equal(A[0] @ B[0], AB[0])

True

In [8]:
np.array_equal(A[1] @ B[1], AB[1])

True

Let's verify that the same holds in TensorFlow.

In [9]:
tA = tf.constant(A)
tB = tf.constant(B)
tAB = tA @ tB

In [12]:
np.array_equal(tAB.numpy(), AB)

True

The same holds when `ndim` is larger, for example with 4-D arrays whose first two axes index the matrices.

In [16]:
A = np.random.randint(1, 50, size=(2, 3, 4, 5))
B = np.random.randint(1, 50, size=(2, 3, 5, 6))
AB = A @ B

In [18]:
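# The first two axes are batch (leading) dimensions; the last two hold the matrices.
n_outer, n_inner = A.shape[:2]
for i in range(n_outer):
    for j in range(n_inner):
        # Compare each slice of the batched product with the plain 2-D product.
        equal = np.array_equal(AB[i, j], A[i, j] @ B[i, j])
        print(f"AB[{i}, {j}] equals A[{i}, {j}] @ B[{i}, {j}]: {equal}")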

AB[0, 0] equals A[0, 0] @ B[0, 0]: True
AB[0, 1] equals A[0, 1] @ B[0, 1]: True
AB[0, 2] equals A[0, 2] @ B[0, 2]: True
AB[1, 0] equals A[1, 0] @ B[1, 0]: True
AB[1, 1] equals A[1, 1] @ B[1, 1]: True
AB[1, 2] equals A[1, 2] @ B[1, 2]: True


In [19]:
tA = tf.constant(A)
tB = tf.constant(B)
tAB = tA @ tB

In [20]:
np.array_equal(tAB.numpy(), AB)

True
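
The 3-D and 4-D checks above can be generalized to any number of leading dimensions. The sketch below is one possible way to do it, under the assumption that both arrays have identical leading dimensions (no broadcasting). The helper name `matches_per_matrix_products` and the arrays `P` and `Q` are hypothetical, introduced only for this illustration.

```python
import numpy as np

def matches_per_matrix_products(a, b):
    """Check that a @ b equals the 2-D matrix product at every leading index.

    Assumption: a and b have identical leading (batch) dimensions,
    so broadcasting is not covered by this check.
    """
    if a.shape[:-2] != b.shape[:-2]:
        raise ValueError("leading dimensions must match")
    ab = a @ b
    # np.ndindex iterates over every index tuple of the leading dimensions.
    return all(
        np.array_equal(ab[idx], a[idx] @ b[idx])
        for idx in np.ndindex(*a.shape[:-2])
    )

# Hypothetical 5-D example: three leading dimensions, then the matrices.
rng = np.random.default_rng(0)
P = rng.integers(1, 50, size=(2, 3, 2, 4, 5))
Q = rng.integers(1, 50, size=(2, 3, 2, 5, 6))
print(matches_per_matrix_products(P, Q))
```

Using `np.ndindex` avoids writing one nested loop per leading dimension, so the same helper works for 3-D, 4-D, or higher-dimensional inputs.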