# 6 - Matrix Multiplication
- 6.1 Standard multiplication
- 6.2 Multiplication and equations
- 6.3 Multiplication with diagonals
- 6.4 LIVE EVIL
- 6.5 Matrix-vector multiplication
- 6.6 Creating symmetric matrices
- 6.7 Multiply symmetric matrices
- 6.8 Hadamard multiplication
- 6.9 Frobenius dot product
- 6.10 Matrix norms
- 6.11 Matrix asymmetry index
- 6.12 What about matrix division?
- 6.13 Exercises
- 6.14 Answers
- 6.15 Code challenges
- 6.16 Code solutions


In [1]:
import numpy as np

## 6.1 Standard Multiplication
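Matrix $A \in \mathbb{R}^{M \times N}$ is compatible for multiplication with matrix $B \in \mathbb{R}^{N \times P}$ only when the inner dimensions are equal, that is, the number of columns of $A$ equals the number of rows of $B$. The product $AB$ takes the outer dimensions $M \times P$.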
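The compatibility rule can be checked directly in NumPy. This is a minimal sketch; the shapes and the names `C` and `compatible` are illustrative assumptions, not part of the original notes.

```python
import numpy as np

A = np.ones((4, 3))   # M x N
B = np.ones((3, 2))   # N x P, inner dimensions match (3 and 3)
C = np.ones((4, 2))   # hypothetical matrix whose rows do not match A's columns

AB = A @ B
print(AB.shape)       # (4, 2), the outer dimensions M x P

try:
    A @ C             # 3 columns against 4 rows: not compatible
    compatible = True
except ValueError:
    compatible = False
print(compatible)     # False
```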

### 1/ Element Perspective
Interpretation of matrix multiplication as dot products between the rows of $A$ and the columns of $B$: element $(i,j)$ of $AB$ is the dot product of row $i$ of $A$ with column $j$ of $B$.

$$
\begin{bmatrix}
a_{1,1} & a_{1,2} & a_{1,3} \\
a_{2,1} & a_{2,2} & a_{2,3} \\
a_{3,1} & a_{3,2} & a_{3,3} \\
\end{bmatrix}
\begin{bmatrix}
b_{1,1} & b_{1,2} \\
b_{2,1} & b_{2,2} \\
b_{3,1} & b_{3,2} \\
\end{bmatrix}
=
\begin{bmatrix}
a_{1,1} b_{1,1} + a_{1,2} b_{2,1} + a_{1,3} b_{3,1} & a_{1,1} b_{1,2} + a_{1,2} b_{2,2} + a_{1,3} b_{3,2} \\
a_{2,1} b_{1,1} + a_{2,2} b_{2,1} + a_{2,3} b_{3,1} & a_{2,1} b_{1,2} + a_{2,2} b_{2,2} + a_{2,3} b_{3,2} \\
a_{3,1} b_{1,1} + a_{3,2} b_{2,1} + a_{3,3} b_{3,1} & a_{3,1} b_{1,2} + a_{3,2} b_{2,2} + a_{3,3} b_{3,2} \\
\end{bmatrix}
$$

Notes
- The lower triangle of the product $AB$ contains dot products between later rows of $A$ and earlier columns of $B$, i.e. $i \gt j$.
- The upper triangle of the product $AB$ contains dot products between earlier rows of $A$ and later columns of $B$, i.e. $j \gt i$.
- The lower and upper triangle interpretations are important for QR decomposition.
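The triangle notes can be illustrated with a small sketch. It assumes square random matrices (the seed and the names `lower` and `upper` are illustrative) and sorts each row-column dot product by whether the row index or the column index is larger.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility only
A = rng.random((3, 3))
B = rng.random((3, 3))
AB = A @ B

lower = np.zeros((3, 3))
upper = np.zeros((3, 3))
for i in range(3):
    for j in range(3):
        value = np.dot(A[i, :], B[:, j])   # row i of A with column j of B
        if i > j:
            lower[i, j] = value            # later row, earlier column
        elif j > i:
            upper[i, j] = value            # earlier row, later column

# The collected values match the strict triangles of the product.
print(np.allclose(lower, np.tril(AB, k=-1)))
print(np.allclose(upper, np.triu(AB, k=1)))
```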

### 2/ Layer Perspective
Interpretation of matrix multiplication as a sum of outer products of the columns of $A$ and the rows of $B$.

$$
\begin{bmatrix}
a_{1,1} & a_{1,2} & a_{1,3} \\
a_{2,1} & a_{2,2} & a_{2,3} \\
a_{3,1} & a_{3,2} & a_{3,3} \\
\end{bmatrix}
\begin{bmatrix}
b_{1,1} & b_{1,2} \\
b_{2,1} & b_{2,2} \\
b_{3,1} & b_{3,2} \\
\end{bmatrix}
=
\begin{bmatrix}
a_{1,1} \\
a_{2,1} \\
a_{3,1} \\
\end{bmatrix}
\begin{bmatrix}
b_{1,1} & b_{1,2} \\
\end{bmatrix}
+
\begin{bmatrix}
a_{1,2} \\
a_{2,2} \\
a_{3,2} \\
\end{bmatrix}
\begin{bmatrix}
b_{2,1} & b_{2,2} \\
\end{bmatrix}
+
\begin{bmatrix}
a_{1,3} \\
a_{2,3} \\
a_{3,3} \\
\end{bmatrix}
\begin{bmatrix}
b_{3,1} & b_{3,2} \\
\end{bmatrix}
$$

Notes
- The layer perspective is closely related to the spectral theory of matrices, which says that any matrix can be represented as a sum of rank-1 matrices.
- The spectral theory of matrices is important for the singular value decomposition (SVD).
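As a rough check of the rank-1 claim, the sketch below builds each layer as an outer product and confirms that every layer has rank 1 while their sum reproduces the product. The random matrices, the seed, and the names `layers`, `ranks`, and `total` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed
A = rng.random((4, 3))
B = rng.random((3, 2))

# One layer per column of A paired with the matching row of B.
layers = [np.outer(A[:, k], B[k, :]) for k in range(A.shape[1])]
ranks = [int(np.linalg.matrix_rank(layer)) for layer in layers]
total = sum(layers)

print(ranks)                      # each layer is a rank-1 matrix
print(np.allclose(total, A @ B))  # the layers sum to the product
```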

### 3/ Column Perspective
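Interpretation of matrix multiplication as linear weighted combinations of the columns of $A$: each column of $AB$ is a weighted combination of the columns of $A$, with the weights taken from the corresponding column of $B$.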

$$
\begin{bmatrix}
a_{1,1} & a_{1,2} & a_{1,3} \\
a_{2,1} & a_{2,2} & a_{2,3} \\
a_{3,1} & a_{3,2} & a_{3,3} \\
\end{bmatrix}
\begin{bmatrix}
b_{1,1} & b_{1,2} \\
b_{2,1} & b_{2,2} \\
b_{3,1} & b_{3,2} \\
\end{bmatrix}
=
\begin{bmatrix}
 b_{1,1}
 \begin{bmatrix}
 a_{1,1} \\
 a_{2,1} \\
 a_{3,1} \\
 \end{bmatrix}
 +
 b_{2,1}
 \begin{bmatrix}
 a_{1,2} \\
 a_{2,2} \\
 a_{3,2} \\
 \end{bmatrix}
 +
 b_{3,1}
 \begin{bmatrix}
 a_{1,3} \\
 a_{2,3} \\
 a_{3,3} \\
 \end{bmatrix}
 &
 b_{1,2}
 \begin{bmatrix}
 a_{1,1} \\
 a_{2,1} \\
 a_{3,1} \\
 \end{bmatrix}
 +
 b_{2,2}
 \begin{bmatrix}
 a_{1,2} \\
 a_{2,2} \\
 a_{3,2} \\
 \end{bmatrix}
 +
 b_{3,2}
 \begin{bmatrix}
 a_{1,3} \\
 a_{2,3} \\
 a_{3,3} \\
 \end{bmatrix}
\end{bmatrix}
$$

Notes
- The column perspective is useful in statistical inference, where the columns of the left matrix $A$ contain the predictors and the columns of the right matrix $B$ contain the model coefficients or weights. The product $AB$ is the prediction from the model.
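The statistical reading can be sketched with a toy linear model. The predictor matrix `X`, the coefficient vector `beta`, and their values are hypothetical; the point is only that the prediction is a weighted sum of the predictor columns.

```python
import numpy as np

rng = np.random.default_rng(0)            # arbitrary seed
X = rng.random((5, 3))                    # hypothetical predictors, one per column
beta = np.array([[2.0], [-1.0], [0.5]])   # hypothetical coefficients (3 x 1)

# Build the prediction as a weighted combination of the columns of X.
prediction = np.zeros((5, 1))
for k in range(X.shape[1]):
    prediction[:, 0] += beta[k, 0] * X[:, k]

print(np.allclose(prediction, X @ beta))
```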

### 4/ Row Perspective
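Interpretation of matrix multiplication as linear weighted combinations of the rows of $B$: each row of $AB$ is a weighted combination of the rows of $B$, with the weights taken from the corresponding row of $A$.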

$$
\begin{bmatrix}
a_{1,1} & a_{1,2} & a_{1,3} \\
a_{2,1} & a_{2,2} & a_{2,3} \\
a_{3,1} & a_{3,2} & a_{3,3} \\
\end{bmatrix}
\begin{bmatrix}
b_{1,1} & b_{1,2} \\
b_{2,1} & b_{2,2} \\
b_{3,1} & b_{3,2} \\
\end{bmatrix}
=
\begin{bmatrix}
 a_{1,1}
 \begin{bmatrix}
 b_{1,1} & b_{1,2}
 \end{bmatrix}
 +
 a_{1,2}
 \begin{bmatrix}
 b_{2,1} & b_{2,2}
 \end{bmatrix}
 +
 a_{1,3}
 \begin{bmatrix}
 b_{3,1} & b_{3,2}
 \end{bmatrix} \\
 a_{2,1}
 \begin{bmatrix}
 b_{1,1} & b_{1,2}
 \end{bmatrix}
 +
 a_{2,2}
 \begin{bmatrix}
 b_{2,1} & b_{2,2}
 \end{bmatrix}
 +
 a_{2,3}
 \begin{bmatrix}
 b_{3,1} & b_{3,2}
 \end{bmatrix} \\
 a_{3,1}
 \begin{bmatrix}
 b_{1,1} & b_{1,2}
 \end{bmatrix}
 +
 a_{3,2}
 \begin{bmatrix}
 b_{2,1} & b_{2,2}
 \end{bmatrix}
 +
 a_{3,3}
 \begin{bmatrix}
 b_{3,1} & b_{3,2}
 \end{bmatrix} \\
\end{bmatrix}
$$

Notes
- The row perspective is useful in principal component analysis (PCA), where the rows of the right matrix $B$ contain data (observations in rows and features in columns) and the rows of the left matrix $A$ contain weights for combining the features. The product $AB$ gives the principal component scores.
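The row perspective can also be written out with the weights made explicit. In this sketch (random inputs and seed are illustrative assumptions), row $i$ of the product is accumulated as a weighted sum of the rows of $B$, with the weights read from row $i$ of $A$.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed
A = rng.random((4, 3))
B = rng.random((3, 2))

AB = np.zeros((4, 2))
for i in range(A.shape[0]):
    for k in range(A.shape[1]):
        # A[i, k] is the weight applied to row k of B.
        AB[i, :] += A[i, k] * B[k, :]

print(np.allclose(AB, A @ B))
```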

In [2]:
def mmult_as_dot(A,B):
    """
    mmult_as_dot returns the product AB as dot products of rows of A and columns of B

    :param A: numpy.ndarray  Matrix A
    :param B: numpy.ndarray  Matrix B
    :return: numpy.ndarray   Matrix product AB
    """
    assert A.shape[1] == B.shape[0]
    
    m, n, p = A.shape[0], A.shape[1], B.shape[1]
    AB = np.zeros((m,p))
    for i in range(m):
        for j in range(p):
            AB[i,j] = np.dot(A[i,:], B[:,j])
    return AB


m, n, p = 4, 3, 2
A = np.random.random((m,n))
B = np.random.random((n,p))
AB = mmult_as_dot(A,B)
expected = A @ B
np.testing.assert_almost_equal(AB, expected, err_msg="mmult_as_dot")

In [3]:
def mmult_as_outer(A,B):
    """
    mmult_as_outer returns the product AB as sum of outer products of columns of A and rows of B

    :param A: numpy.ndarray  Matrix A
    :param B: numpy.ndarray  Matrix B
    :return: numpy.ndarray   Matrix product AB
    """
    assert A.shape[1] == B.shape[0]
    
    m, n, p = A.shape[0], A.shape[1], B.shape[1]
    AB = np.zeros((m,p))
    for k in range(n):
        AB += np.outer(A[:,k], B[k,:])
    return AB


m, n, p = 4, 3, 2
A = np.random.random((m,n))
B = np.random.random((n,p))
AB = mmult_as_outer(A,B)
expected = A @ B
np.testing.assert_almost_equal(AB, expected, err_msg="mmult_as_outer")

In [4]:
def mmult_as_col(A,B):
    """
    mmult_as_col returns the product AB column by column: each column of AB is a weighted combination of the columns of A

    :param A: numpy.ndarray  Matrix A
    :param B: numpy.ndarray  Matrix B
    :return: numpy.ndarray   Matrix product AB
    """
    assert A.shape[1] == B.shape[0]
    
    m, n, p = A.shape[0], A.shape[1], B.shape[1]
    AB = np.empty((m,p))
    for k in range(p):
        AB[:,k] = A @ B[:,k]
    return AB


m, n, p = 4, 3, 2
A = np.random.random((m,n))
B = np.random.random((n,p))
AB = mmult_as_col(A,B)
expected = A @ B
np.testing.assert_almost_equal(AB, expected, err_msg="mmult_as_col")

In [5]:
def mmult_as_row(A,B):
    """
    mmult_as_row returns the product AB row by row: each row of AB is a weighted combination of the rows of B

    :param A: numpy.ndarray  Matrix A
    :param B: numpy.ndarray  Matrix B
    :return: numpy.ndarray   Matrix product AB
    """
    assert A.shape[1] == B.shape[0]
    
    m, n, p = A.shape[0], A.shape[1], B.shape[1]
    AB = np.empty((m,p))
    for i in range(m):
        AB[i,:] = A[i,:] @ B
    return AB


m, n, p = 4, 3, 2
A = np.random.random((m,n))
B = np.random.random((n,p))
AB = mmult_as_row(A,B)
expected = A @ B
np.testing.assert_almost_equal(AB, expected, err_msg="mmult_as_row")

## 6.2 Multiplication and equations
When applying a matrix operation to both sides of an equation, the order of multiplication must be preserved: either pre-multiply both sides or post-multiply both sides, because matrix multiplication is generally not commutative.

$$
\begin{align}
A &= A \\
AB &= AB \quad \text{post-multiply} \\
BA &= BA \quad \text{pre-multiply} \\
\end{align}
$$
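A quick numerical sketch shows why the order matters. It assumes random square matrices; the equation $C = AB$ holds by construction, and `D` is a hypothetical matrix used to multiply both sides.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed
A = rng.random((3, 3))
B = rng.random((3, 3))
C = A @ B                        # the equation C = AB holds by construction
D = rng.random((3, 3))           # hypothetical matrix applied to both sides

pre_ok = np.allclose(D @ C, D @ A @ B)     # pre-multiply both sides
post_ok = np.allclose(C @ D, A @ B @ D)    # post-multiply both sides
mixed_ok = np.allclose(D @ C, A @ B @ D)   # pre on the left, post on the right

print(pre_ok, post_ok, mixed_ok)
```

Mixing pre- and post-multiplication breaks the equality for generic matrices, which is why `mixed_ok` is expected to be false here.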

## 6.3 Multiplication with diagonals
Pre-multiplication by a diagonal matrix scales each row of the right matrix by the corresponding diagonal element.

$$
\begin{bmatrix}
k_1 & 0 & 0 \\
0 & k_2 & 0 \\
0 & 0 & k_3 \\
\end{bmatrix}
\begin{bmatrix}
a & b & c \\
d & e & f \\
g & h & i \\
\end{bmatrix} =
\begin{bmatrix}
k_1 a & k_1 b & k_1 c \\
k_2 d & k_2 e & k_2 f \\
k_3 g & k_3 h & k_3 i \\
\end{bmatrix}
$$

Post-multiplication by a diagonal matrix scales each column of the left matrix by the corresponding diagonal element.

$$
\begin{bmatrix}
a & b & c \\
d & e & f \\
g & h & i \\
\end{bmatrix}
\begin{bmatrix}
k_1 & 0 & 0 \\
0 & k_2 & 0 \\
0 & 0 & k_3 \\
\end{bmatrix} =
\begin{bmatrix}
k_1 a & k_2 b & k_3 c \\
k_1 d & k_2 e & k_3 f \\
k_1 g & k_2 h & k_3 i \\
\end{bmatrix}
$$
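Both scaling rules can be confirmed with broadcasting. The values of `k` and `A` below are illustrative assumptions; `k[:, None] * A` scales rows and `A * k` scales columns.

```python
import numpy as np

k = np.array([2.0, 3.0, 4.0])             # illustrative diagonal elements
D = np.diag(k)
A = np.arange(1.0, 10.0).reshape(3, 3)    # illustrative 3 x 3 matrix

pre = D @ A    # row i of A is scaled by k[i]
post = A @ D   # column j of A is scaled by k[j]

print(np.allclose(pre, k[:, None] * A))
print(np.allclose(post, A * k))
```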

The product of two diagonal matrices is a diagonal matrix whose diagonal elements are the element-wise products of the two diagonals.
- Multiplication of two diagonal matrices becomes relevant for eigendecomposition.
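A small sketch of the diagonal product, using made-up diagonals `d1` and `d2`. It also checks that two diagonal matrices of the same size commute, which follows from the element-wise form of the product.

```python
import numpy as np

d1 = np.array([1.0, 2.0, 3.0])   # illustrative diagonals
d2 = np.array([4.0, 5.0, 6.0])

product = np.diag(d1) @ np.diag(d2)
print(np.diag(product))          # [ 4. 10. 18.]

# Same result as multiplying the diagonals element-wise, in either order.
print(np.allclose(product, np.diag(d1 * d2)))
print(np.allclose(product, np.diag(d2) @ np.diag(d1)))
```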

## 6.4 LIVE EVIL
An operation such as the transpose or the inverse applied to a product of matrices is equal to the product of the operation applied to each matrix, but in reverse order. For the inverse, each matrix must be square and invertible.

$$
\begin{align}
(A_1 A_2 \cdots A_N)^T &= A_N^T \cdots A_2^T A_1^T \\
(A_1 A_2 \cdots A_N)^{-1} &= A_N^{-1} \cdots A_2^{-1} A_1^{-1} \\
\end{align}
$$
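The reversal rule can be verified numerically for three matrices. In this sketch the matrices are random with a multiple of the identity added, an assumption made only to keep them safely invertible; the helper `make_matrix` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed

def make_matrix():
    # Diagonally dominant by construction, hence invertible (an assumption for this demo).
    return rng.random((3, 3)) + 3.0 * np.eye(3)

A1, A2, A3 = make_matrix(), make_matrix(), make_matrix()
inv = np.linalg.inv

transpose_ok = np.allclose((A1 @ A2 @ A3).T, A3.T @ A2.T @ A1.T)
inverse_ok = np.allclose(inv(A1 @ A2 @ A3), inv(A3) @ inv(A2) @ inv(A1))
same_order_ok = np.allclose((A1 @ A2 @ A3).T, A1.T @ A2.T @ A3.T)

print(transpose_ok, inverse_ok, same_order_ok)
```

Keeping the original order (`same_order_ok`) is expected to fail for generic matrices, which is the point of the mnemonic.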

## 6.5 Matrix-vector multiplication
Matrix-vector multiplication always produces a vector.
- This is the connection to linear transformations, which underpin computer graphics.