$$\newcommand{\F}{\mathbb{F}}
\newcommand{\R}{\mathbb{R}}
\newcommand{\v}{\mathbf{v}}
\newcommand{\a}{\mathbf{a}}
\newcommand{\b}{\mathbf{b}}
\newcommand{\c}{\mathbf{c}}
\newcommand{\w}{\mathbf{w}}
\newcommand{\u}{\mathbf{u}}
\newcommand{\x}{\mathbf{x}}
\newcommand{\y}{\mathbf{y}}
\newcommand{\z}{\mathbf{z}}
\newcommand{\0}{\mathbf{0}}
\newcommand{\1}{\mathbf{1}}
\newcommand{\A}{\mathbf{A}}
\newcommand{\B}{\mathbf{B}}
\newcommand{\C}{\mathbf{C}}$$

## Table of Contents

- [Matrix Multiplication](#matrix-multiplication)
  - [Validity of Matrix Multiplication](#validity-of-matrix-multiplication)
  - [Basic Formula for Matrix Multiplication](#basic-formula-for-matrix-multiplication)
  - [Matrix-Vector Multiplication](#matrix-vector-multiplication)
    - [Right Multiplication (Linear Combination of Columns)](#right-multiplication-linear-combination-of-columns)
      - [Example (Right Multiplication)](#example-right-multiplication)
      - [Example (Linear Transformation)](#example-linear-transformation)
      - [Definition (Right Multiplication)](#definition-right-multiplication)
      - [Linear Equations (Right Multiplication)](#linear-equations-right-multiplication)
      - [Column Space (Right Multiplication)](#column-space-right-multiplication)
    - [Left Multiplication (Linear Combination of Rows)](#left-multiplication-linear-combination-of-rows)
      - [Example (Left Multiplication)](#example-left-multiplication)
  - [The Four Interpretations of Matrix Multiplication](#the-four-interpretations-of-matrix-multiplication)
    - [Element Wise Matrix Multiplication](#element-wise-matrix-multiplication)
      - [Significance (Element Wise Matrix Multiplication)](#significance-element-wise-matrix-multiplication)
    - [Outer Product Wise Matrix Multiplication](#outer-product-wise-matrix-multiplication)
    - [Matrix Multiplication using Right Multiplication (Columns)](#matrix-multiplication-using-right-multiplication-columns)
      - [Significance](#significance)
    - [Matrix Multiplication using Left Multiplication (Rows)](#matrix-multiplication-using-left-multiplication-rows)
      - [Significance](#significance-1)
  - [Matrix Multiplication Properties](#matrix-multiplication-properties)
  - [Symmetric Matrices](#symmetric-matrices)
- [Matrix Multiplication (Naive) in Python](#matrix-multiplication-naive-in-python)
- [References](#references)

!!! summary "Learning Objectives"
    - Matrix Multiplication

## Matrix Multiplication

### Validity of Matrix Multiplication


If you have dabbled in deep learning before, you have almost certainly met the dreaded `shape mismatch` error. It arises because matrix multiplication is only defined for matrices of compatible shapes.

If $\A$ is an $m \times n$ matrix and $\B$ is an $n \times p$ matrix, then $\A\B$ is well defined because the number of **columns** of $\A$ equals the number of **rows** of $\B$. If this does not hold, then the "shape is mismatched". When the matrix multiplication is well defined, $\C = \A\B$ has shape (size) $m \times p$.
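
A quick way to see this rule in action is to let NumPy perform the shape check. The following is a minimal sketch; the concrete sizes and the variable names are assumptions chosen purely for illustration.

```python
import numpy as np

# Illustrative shapes (assumed for this sketch): A is 3 x 4, B is 4 x 5.
A = np.ones((3, 4))
B = np.ones((4, 5))

# Columns of A (4) match rows of B (4), so the product is defined
# and has shape (m, p) = (3, 5).
C = A @ B
print(C.shape)  # (3, 5)

# Swapping the order breaks the rule: B has 5 columns but A has 3 rows.
try:
    B @ A
    mismatch_detected = False
except ValueError as err:
    mismatch_detected = True
    print(f"shape mismatch: {err}")
```

The second product fails because the inner dimensions (5 and 3) differ, which is the same condition described above.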

### Basic Formula for Matrix Multiplication

Let us go through the most basic formula for matrix multiplication. Quoting from Wikipedia[^matrix_multiplication]:

---

Let 

$$\A=\begin{bmatrix}
 a_{11} & a_{12} & \cdots & a_{1n} \\
 a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
 a_{m1} & a_{m2} & \cdots & a_{mn} \\
\end{bmatrix},\quad\mathbf{B}=\begin{bmatrix}
 b_{11} & b_{12} & \cdots & b_{1p} \\
 b_{21} & b_{22} & \cdots & b_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
 b_{n1} & b_{n2} & \cdots & b_{np} \\
\end{bmatrix}$$

then the **matrix product** $\C = \A\B$ is defined to be the $m \times p$ matrix:

$$\mathbf{C}=\A\B=\begin{bmatrix}
 c_{11} & c_{12} & \cdots & c_{1p} \\
 c_{21} & c_{22} & \cdots & c_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
 c_{m1} & c_{m2} & \cdots & c_{mp} \\
\end{bmatrix}$$

where $c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} +\cdots + a_{in}b_{nj}= \sum_{k=1}^n a_{ik}b_{kj}$ for $i = 1, \cdots , m$ and $j = 1, \cdots , p$.

That is, the entry $c_{i,j}$ of the product is obtained by multiplying term-by-term the entries of the $i$-th row of $\A$ and the $j$-th column of $\B$, and summing these $n$ products. In other words, $c_{i,j}$ is the **dot product** of the $i$-th row of $\A$ and the $j$-th column of $\B$.

Therefore, $\C = \A\B$ can also be written as

$$\mathbf{C}=\A\B=\begin{bmatrix}
 a_{11}b_{11} +\cdots + a_{1n}b_{n1} & a_{11}b_{12} +\cdots + a_{1n}b_{n2} & \cdots & a_{11}b_{1p} +\cdots + a_{1n}b_{np} \\
 a_{21}b_{11} +\cdots + a_{2n}b_{n1} & a_{21}b_{12} +\cdots + a_{2n}b_{n2} & \cdots & a_{21}b_{1p} +\cdots + a_{2n}b_{np} \\
\vdots & \vdots & \ddots & \vdots \\
 a_{m1}b_{11} +\cdots + a_{mn}b_{n1} & a_{m1}b_{12} +\cdots + a_{mn}b_{n2} & \cdots & a_{m1}b_{1p} +\cdots + a_{mn}b_{np} \\
\end{bmatrix} 
$$

Thus the product $\A\B$ is defined if and only if the number of columns in $\A$ equals the number of rows in $\B$.
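
As a small worked instance of the formula (the numbers are chosen here only for illustration), take two $2 \times 2$ matrices. Each entry of the product is the dot product of a row of the left matrix with a column of the right matrix:

$$\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}\begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix} = \begin{bmatrix} 1\cdot 5 + 2\cdot 7 & 1\cdot 6 + 2\cdot 8 \\ 3\cdot 5 + 4\cdot 7 & 3\cdot 6 + 4\cdot 8 \end{bmatrix} = \begin{bmatrix} 19 & 22 \\ 43 & 50 \end{bmatrix}$$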


[^matrix_multiplication]: [https://en.wikipedia.org/wiki/Matrix_multiplication](https://en.wikipedia.org/wiki/Matrix_multiplication)

### Matrix-Vector Multiplication

Before we go into **Matrix-Matrix** Multiplication, it is important to understand how **Matrix-Vector** Multiplication works. Take a mental note of how **linear combinations** are used here.

We will be referencing heavily from the article written by Eli Bendersky[^visualizing-matrix-multiplication-as-a-linear-combination].

[^visualizing-matrix-multiplication-as-a-linear-combination]: [https://eli.thegreenplace.net/2015/visualizing-matrix-multiplication-as-a-linear-combination/](https://eli.thegreenplace.net/2015/visualizing-matrix-multiplication-as-a-linear-combination/)

#### Right Multiplication (Linear Combination of Columns)

##### Example (Right Multiplication)

We motivate this with an example.

Given a 3 by 3 matrix $\A = \begin{bmatrix} x_1 & y_1 & z_1 \\ x_2 & y_2 & z_2 \\ x_3 & y_3 & z_3 \\ \end{bmatrix}$ and $\x = \begin{bmatrix} a \\ b \\ c \\ \end{bmatrix}$, then

$$\A\x = \begin{bmatrix} x_1 & y_1 & z_1 \\ x_2 & y_2 & z_2 \\ x_3 & y_3 & z_3 \\ \end{bmatrix} \begin{bmatrix} a \\ b \\ c \\ \end{bmatrix} = \begin{bmatrix} ax_1+by_1+cz_1 \\ ax_2+by_2+cz_2 \\ ax_3+by_3+cz_3 \\ \end{bmatrix}$$

But notice that the above can also be written as:

$$\A\x = \begin{bmatrix} x_1 & y_1 & z_1 \\ x_2 & y_2 & z_2 \\ x_3 & y_3 & z_3 \\ \end{bmatrix} \begin{bmatrix} a \\ b \\ c \\ \end{bmatrix} = \begin{bmatrix} ax_1+by_1+cz_1 \\ ax_2+by_2+cz_2 \\ ax_3+by_3+cz_3 \\ \end{bmatrix} 
= \color{red}{a}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \end{bmatrix} + \color{green}{b}\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ \end{bmatrix} + \color{blue}{c}\begin{bmatrix} z_1 \\ z_2 \\ z_3 \\ \end{bmatrix}$$

**The expression above says that the matrix $\A$ acts on the vector $\x$, and the output is a linear combination of the columns of the matrix $\A$**.

##### Example (Linear Transformation)

We will not go through the formal definition yet, but the idea follows directly from the previous example.

---

Given a 3 by 3 matrix $\A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 6 \\ 3 & 6 & 9 \\ \end{bmatrix}$ and $\x = \begin{bmatrix} 1 \\ 2 \\ 4 \\ \end{bmatrix}$, the geometric meaning of $\A\x$ can be described as a series of "linear transformations": scale the first column of $\A$ by 1, add 2 times the second column of $\A$ to the result, and then add 4 times the third column of $\A$.
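
Carrying out the arithmetic for this example makes the three steps explicit:

$$\A\x = 1\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} + 2\begin{bmatrix} 2 \\ 4 \\ 6 \end{bmatrix} + 4\begin{bmatrix} 3 \\ 6 \\ 9 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} + \begin{bmatrix} 4 \\ 8 \\ 12 \end{bmatrix} + \begin{bmatrix} 12 \\ 24 \\ 36 \end{bmatrix} = \begin{bmatrix} 17 \\ 34 \\ 51 \end{bmatrix}$$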

##### Definition (Right Multiplication)

Given a $m \times n$ matrix 
$\A=\begin{bmatrix}
 a_{11} & a_{12} & \cdots & a_{1n} \\
 a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
 a_{m1} & a_{m2} & \cdots & a_{mn} \\
\end{bmatrix}$ 
and a column vector 
$\x=\begin{bmatrix}
 x_{1} \\ x_{2} \\ \vdots \\ x_{n}\end{bmatrix}$ then 
 
$$\A\x = \begin{bmatrix}
 a_{11} & a_{12} & \cdots & a_{1n} \\
 a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
 a_{m1} & a_{m2} & \cdots & a_{mn} \\
\end{bmatrix}\begin{bmatrix}
 x_{1} \\ x_{2} \\ \vdots \\ x_{n}\end{bmatrix} 
= \color{red}{x_1}\begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \\ \end{bmatrix} + \color{green}{x_2}\begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2}  \\ \end{bmatrix} + \cdots +  \color{blue}{x_n}\begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \\ \end{bmatrix}$$ 

---

If the formula looks daunting, just remember that $\A\x$ gives nothing but the **linear combination** of the columns of $\A$ with values of $\x$ as coefficients.
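
The definition can be checked numerically. This is a minimal sketch with assumed example values; it compares NumPy's built-in product against an explicit linear combination of the columns.

```python
import numpy as np

# Example values (assumed for illustration): a 2 x 3 matrix and a 3-vector.
A = np.array([[1, 2, 3],
              [4, 5, 6]])
x = np.array([10, 20, 30])

# Built-in matrix-vector product.
direct = A @ x

# The same result as a linear combination of the columns of A,
# using the entries of x as coefficients.
combo = sum(x[j] * A[:, j] for j in range(A.shape[1]))

print(direct)  # [140 320]
print(combo)   # [140 320]
```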

##### Linear Equations (Right Multiplication)

This is a very important realization, so I will mention it here first and repeat it throughout.

Given a 3 by 3 matrix $\A = \begin{bmatrix} x_1 & y_1 & z_1 \\ x_2 & y_2 & z_2 \\ x_3 & y_3 & z_3 \\ \end{bmatrix}$ and $\x = \begin{bmatrix} a \\ b \\ c \\ \end{bmatrix}$, then

$$\A\x = \begin{bmatrix} x_1 & y_1 & z_1 \\ x_2 & y_2 & z_2 \\ x_3 & y_3 & z_3 \\ \end{bmatrix} \begin{bmatrix} a \\ b \\ c \\ \end{bmatrix} = \begin{bmatrix} ax_1+by_1+cz_1 \\ ax_2+by_2+cz_2 \\ ax_3+by_3+cz_3 \\ \end{bmatrix} 
= \color{red}{a}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \end{bmatrix} + \color{green}{b}\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ \end{bmatrix} + \color{blue}{c}\begin{bmatrix} z_1 \\ z_2 \\ z_3 \\ \end{bmatrix}$$

---

In the examples above, $\A$ and $\x$ are known, and we want to find the unknown $\b$, which is the product $\A\x$.

> Now, if we know $\A$ and $\b$ and wish to find $\x$ that solves the **linear equation/system** $\A\x = \b$ instead, what can we understand from the above?

We know that $\b = \A\x$ is a linear combination of the columns of $\A$ with the entries of $\x$ as coefficients. So if we want to solve for $\x$, we ask ourselves: what **combination** of the columns of $\A$ gives rise to the vector $\b$? If we can find the coefficients $a, b, c$ of that **combination**, we can recover $\x = \begin{bmatrix} a \\ b \\ c \\ \end{bmatrix}$.
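
To make the question concrete, here is a minimal sketch that recovers the coefficients with `np.linalg.solve`. The matrix and right-hand side are assumed example values, and this routine only applies when $\A$ is square and invertible.

```python
import numpy as np

# Example system (assumed for illustration): the columns of A are (1, 1) and (1, -1).
A = np.array([[1.0, 1.0],
              [1.0, -1.0]])
b = np.array([3.0, 1.0])

# Find the coefficients x such that x[0] * A[:, 0] + x[1] * A[:, 1] equals b.
x = np.linalg.solve(A, b)

# Rebuild b from the columns of A to confirm the interpretation.
rebuilt = x[0] * A[:, 0] + x[1] * A[:, 1]
print(x)
print(np.allclose(rebuilt, b))
```

Here the solution is 2 times the first column plus 1 times the second column, which adds up to $(3, 1)$.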

##### Column Space (Right Multiplication)

The **column space** of a matrix $\A$ is the set of all linear combinations of the columns of $\A$, and it is a subspace. We will go through it in more detail later, but for now, I want to introduce the idea.

As a consequence of the previous example, one should realize two things:

- $\A\x = \b$ may not always have a solution $\x$.
- If $\A\x = \b$ has a solution $\x$, then the product $\b$ must be a **linear combination** of the columns of $\A$. This has **important consequences** later on. For now, just know that if $\A\x = \b$ has a solution, then $\b$ resides in the **column space** of $\A$.
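
One possible numerical check for whether $\b$ lies in the column space is sketched below. The helper `in_column_space` is hypothetical (not part of the original notes), and the rank comparison it uses is an assumption of this sketch: appending $\b$ as an extra column leaves the rank unchanged exactly when $\b$ is a combination of the existing columns. The result depends on the numerical tolerance used by `np.linalg.matrix_rank`.

```python
import numpy as np

def in_column_space(A: np.ndarray, b: np.ndarray) -> bool:
    """Return True if b is (numerically) a linear combination of the columns of A."""
    augmented = np.column_stack([A, b])
    # Appending b leaves the rank unchanged exactly when b adds no new direction.
    return bool(np.linalg.matrix_rank(augmented) == np.linalg.matrix_rank(A))

# The matrix from the linear transformation example: every column is a multiple of (1, 2, 3).
A = np.array([[1, 2, 3],
              [2, 4, 6],
              [3, 6, 9]])

inside = in_column_space(A, np.array([2, 4, 6]))   # a multiple of (1, 2, 3)
outside = in_column_space(A, np.array([1, 0, 0]))  # not a multiple of (1, 2, 3)
print(inside, outside)
```

For the second vector, $\A\x = \b$ has no solution, which illustrates the first bullet above.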

#### Left Multiplication (Linear Combination of Rows)

This part is also necessary to better understand matrix multiplication later.

##### Example (Left Multiplication)

We motivate this with an example.

Given a 3 by 3 matrix $\A = \begin{bmatrix} x_1 & y_1 & z_1 \\ x_2 & y_2 & z_2 \\ x_3 & y_3 & z_3 \\ \end{bmatrix}$ and $\x = \begin{bmatrix} a & b & c \end{bmatrix}$, then

$$\x\A = \begin{bmatrix} a & b & c \end{bmatrix} \begin{bmatrix} x_1 & y_1 & z_1 \\ x_2 & y_2 & z_2 \\ x_3 & y_3 & z_3 \\ \end{bmatrix} = \begin{bmatrix} ax_1+bx_2+cx_3 & ay_1+by_2+cy_3 & az_1+bz_2+cz_3 \end{bmatrix}$$

But notice that the above can also be written as:

$$\x\A = \begin{bmatrix} a & b & c \end{bmatrix} \begin{bmatrix} x_1 & y_1 & z_1 \\ x_2 & y_2 & z_2 \\ x_3 & y_3 & z_3 \\ \end{bmatrix} = \begin{bmatrix} ax_1+bx_2+cx_3 & ay_1+by_2+cy_3 & az_1+bz_2+cz_3 \end{bmatrix}
= \color{red}{a}\begin{bmatrix} x_1 & y_1 & z_1 \end{bmatrix} + \color{green}{b}\begin{bmatrix} x_2 & y_2 & z_2 \end{bmatrix} + \color{blue}{c}\begin{bmatrix} x_3 & y_3 & z_3 \end{bmatrix}$$

**Notice that now $\x\A$ is just a linear combination of the rows of $\A$.**
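
The row version can be verified in the same way as the column version. This is a minimal sketch with assumed example values.

```python
import numpy as np

# Example values (assumed for illustration).
A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
x = np.array([1, 0, 2])  # treated as a row vector on the left of A

# Built-in vector-matrix product.
direct = x @ A

# The same result as a linear combination of the rows of A,
# using the entries of x as coefficients.
combo = sum(x[i] * A[i, :] for i in range(A.shape[0]))

print(direct)  # [15 18 21]
print(combo)   # [15 18 21]
```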

### The Four Interpretations of Matrix Multiplication

#### Element Wise Matrix Multiplication

We covered this in the previous section. We repeat it here for the sake of modularity.

---

Let 

$$\A=\begin{bmatrix}
 a_{11} & a_{12} & \cdots & a_{1n} \\
 a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
 a_{m1} & a_{m2} & \cdots & a_{mn} \\
\end{bmatrix},\quad\mathbf{B}=\begin{bmatrix}
 b_{11} & b_{12} & \cdots & b_{1p} \\
 b_{21} & b_{22} & \cdots & b_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
 b_{n1} & b_{n2} & \cdots & b_{np} \\
\end{bmatrix}$$

then the **matrix product** $\C = \A\B$ is defined to be the $m \times p$ matrix:

$$\mathbf{C}=\A\B=\begin{bmatrix}
 c_{11} & c_{12} & \cdots & c_{1p} \\
 c_{21} & c_{22} & \cdots & c_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
 c_{m1} & c_{m2} & \cdots & c_{mp} \\
\end{bmatrix}$$

where $c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} +\cdots + a_{in}b_{nj}= \sum_{k=1}^n a_{ik}b_{kj}$ for $i = 1, \cdots , m$ and $j = 1, \cdots , p$.

That is, the entry $c_{i,j}$ of the product is obtained by multiplying term-by-term the entries of the $i$-th row of $\A$ and the $j$-th column of $\B$, and summing these $n$ products. In other words, $c_{i,j}$ is the **dot product** of the $i$-th row of $\A$ and the $j$-th column of $\B$.

Therefore, $\C = \A\B$ can also be written as

$$\mathbf{C}=\A\B=\begin{bmatrix}
 a_{11}b_{11} +\cdots + a_{1n}b_{n1} & a_{11}b_{12} +\cdots + a_{1n}b_{n2} & \cdots & a_{11}b_{1p} +\cdots + a_{1n}b_{np} \\
 a_{21}b_{11} +\cdots + a_{2n}b_{n1} & a_{21}b_{12} +\cdots + a_{2n}b_{n2} & \cdots & a_{21}b_{1p} +\cdots + a_{2n}b_{np} \\
\vdots & \vdots & \ddots & \vdots \\
 a_{m1}b_{11} +\cdots + a_{mn}b_{n1} & a_{m1}b_{12} +\cdots + a_{mn}b_{n2} & \cdots & a_{m1}b_{1p} +\cdots + a_{mn}b_{np} \\
\end{bmatrix} 
$$

Thus the product $\A\B$ is defined if and only if the number of columns in $\A$ equals the number of rows in $\B$.

---

##### Significance (Element Wise Matrix Multiplication)

In **Mike X Cohen: Linear Algebra: Theory, Intuition, Code, 2021. (pp. 144)**, the author notes the following:

- **The diagonal of $\C$ contains dot products between row $i$ of $\A$ and column $j$ of $\B$ where $i = j$; this is relevant in data covariance matrices.**
- **The lower triangle of $\C$ contains dot products between row $i$ in $\A$ and column $j$ in $\B$ where $i > j$. The upper triangle of $\C$ contains dot products between row $i$ in $\A$ and column $j$ in $\B$ where $i < j$. Both are important in matrix decompositions, such as QR decomposition and generalized eigendecomposition**.

#### Outer Product Wise Matrix Multiplication

To simplify notation, we denote:

$$\A = \begin{bmatrix} \a_1 & \a_2 & \cdots & \a_n \end{bmatrix}, \quad \B = \begin{bmatrix} \b_1^\top \\ \b_2^\top \\ \vdots \\ \b_n^\top \end{bmatrix}$$

as shorthand, where $\a_i$ is the $i$-th column of $\A$ and $\b_j^\top$ is the $j$-th row of $\B$. Then

$$\A\B = \a_1\b_1^\top + \a_2\b_2^\top + \cdots + \a_n\b_n^\top$$

> Notice that in each of the $\a_i\b_i^\top$, the columns form a
dependent set (the same can be said of the rows). However, the
sum of these singular matrices—the product matrix—has columns
that form a linearly independent set. Each of these matrices is rank-1 (to be defined later). This forms the basis for the Singular Value Decomposition. - **Mike X Cohen: Linear Algebra: Theory, Intuition, Code, 2021. (pp. 145)**
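
The outer product formula can be checked numerically. The sketch below uses assumed example matrices and builds one matrix per shared index $k$, then sums them. The rank check at the end is an extra illustration of the rank-1 remark and relies on NumPy's default tolerance.

```python
import numpy as np

# Example matrices (assumed for illustration).
A = np.array([[1, 2],
              [3, 4],
              [5, 6]])        # 3 x 2
B = np.array([[7, 8, 9],
              [10, 11, 12]])  # 2 x 3

# One matrix per shared index k: (column k of A) times (row k of B).
layers = [np.outer(A[:, k], B[k, :]) for k in range(A.shape[1])]
total = sum(layers)

# The sum of the outer products equals the ordinary product.
print(np.array_equal(total, A @ B))  # True

# Each individual outer product has rank 1.
ranks = [int(np.linalg.matrix_rank(layer)) for layer in layers]
print(ranks)
```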

#### Matrix Multiplication using Right Multiplication (Columns)

Reusing the notation from **Element Wise Matrix Multiplication**, we can define 

$$\A\B = \A \begin{bmatrix} \b_1 & \b_2 & \cdots & \b_p \end{bmatrix} = \begin{bmatrix} \A\b_1 & \A\b_2 & \cdots & \A\b_p \end{bmatrix}$$

where $\b_i$ is column $i$ of the matrix $\B$.

This means that each column of the matrix $\C = \A\B$ is given by $\A\b_i$. Recall from the section "Right Multiplication (Linear Combination of Columns)" that $\A\b_i$ is a linear combination of the columns of $\A$ with the entries of $\b_i$ as weights.

---

They say a picture is worth a thousand words. The image below is taken from Eli Bendersky's website [here](https://eli.thegreenplace.net/2015/visualizing-matrix-multiplication-as-a-linear-combination/).


<img src="https://storage.googleapis.com/reighns/reighns_ml_projects/docs/linear_algebra/visualizing-matrix-multiplication-as-a-linear-combination-column-perspective.PNG" style="margin-left:auto; margin-right:auto"/>
<p style="text-align: center">
    <b>Matrix Multiplication, Column Perspective; Courtesy of Eli Bendersky</b>
</p>


##### Significance

> The column perspective of matrix multiplication is useful in statistics, when the columns of the left matrix contain a set of regressors (a simplified model of the data), and the right matrix contains coefficients. The coefficients encode the importance of each regressor, and the goal of statistical model-fitting is to find the best coefficients such that the weighted combination of regressors matches the data.  - **Mike X Cohen: Linear Algebra: Theory, Intuition, Code, 2021. (pp. 146)**

#### Matrix Multiplication using Left Multiplication (Rows)

Reusing the notation from **Element Wise Matrix Multiplication**, we can define 

$$\A\B = \begin{bmatrix}\a_1 \\ \a_2 \\  \vdots \\ \a_m \end{bmatrix}\B = \begin{bmatrix}\a_1\B \\ \a_2\B \\  \vdots \\ \a_m\B \end{bmatrix}$$

where $\a_i$ is row $i$ of the matrix $\A$. This means that each row of the matrix $\C = \A\B$ is given by $\a_i\B$. Recall from the section "Left Multiplication (Linear Combination of Rows)" that $\a_i\B$ is a linear combination of the rows of $\B$ with the entries of $\a_i$ as weights. This becomes apparent when we come to the chapter on **Row-Echelon Form**.
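
The row perspective can be turned into code directly. The function below is a hypothetical helper written for this sketch (it is not one of the implementations in the Python section), and the test matrices are assumed example values.

```python
import numpy as np

def np_matmul_row_wise(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    """Sketch: build C = AB one row at a time (hypothetical helper)."""
    if A.shape[1] != B.shape[0]:
        raise ValueError("The number of columns of A must equal the number of rows of B.")

    C = np.zeros((A.shape[0], B.shape[1]))
    for i in range(A.shape[0]):
        # Row i of C is a linear combination of the rows of B,
        # weighted by the entries of row i of A.
        for k in range(B.shape[0]):
            C[i, :] += A[i, k] * B[k, :]
    return C

# Example matrices (assumed for illustration).
A = np.array([[1, 2], [3, 4], [5, 6]])
B = np.array([[1, 0], [2, 1]])

result = np_matmul_row_wise(A, B)
print(np.allclose(result, A @ B))  # True
```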

---

They say a picture is worth a thousand words. The image below is taken from Eli Bendersky's website [here](https://eli.thegreenplace.net/2015/visualizing-matrix-multiplication-as-a-linear-combination/).


<img src="https://storage.googleapis.com/reighns/reighns_ml_projects/docs/linear_algebra/visualizing-matrix-multiplication-as-a-linear-combination-row-perspective.PNG" style="margin-left:auto; margin-right:auto"/>
<p style="text-align: center">
    <b>Matrix Multiplication, Row Perspective; Courtesy of Eli Bendersky</b>
</p>

##### Significance

> The row perspective is useful, for example in principal components analysis, where the rows of the right matrix contain data (observations in rows and features in columns) and the rows of the left matrix contain weights for combining the features. Then the weighted sum of data creates the principal component scores.  - **Mike X Cohen: Linear Algebra: Theory, Intuition, Code, 2021. (pp. 147)**

### Matrix Multiplication Properties

Read [Wikipedia](https://en.wikipedia.org/wiki/Matrix_multiplication#General_properties). To fill in when free as it is relatively light.

### Symmetric Matrices

## Matrix Multiplication (Naive) in Python

For a **Divide-and-conquer** method, see here[^solvay_strassen_algorithm].

[^solvay_strassen_algorithm]: [solvay_strassen_algorithm](https://www.baeldung.com/cs/matrix-multiplication-algorithms)

In [133]:
import os
import random
import numpy as np
import torch

def seed_all(seed: int = 19921930) -> None:
    """Seed all random number generators."""
    print(f"Using Seed Number {seed}")

    os.environ["PYTHONHASHSEED"] = str(
        seed
    )  # set PYTHONHASHSEED env var at fixed value
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.cuda.manual_seed(seed)  # pytorch (both CPU and CUDA)
    np.random.seed(seed)  # for numpy pseudo-random generator
    # set fixed value for python built-in pseudo-random generator
    random.seed(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    torch.backends.cudnn.enabled = False
    
seed_all()

Using Seed Number 19921930


In [134]:
import torch
import numpy as np
from typing import Tuple, List

A = np.random.randint(0, 10, size=(3, 2))
B = np.random.randint(0, 10, size=(2, 2))

### Pseudo Code

Let us run through the matrix multiplication of two simple matrices: $\A$ of shape $(3, 2)$ and $\B$ of shape $(2, 2)$. Their product $\C = \A\B$ is a matrix of shape $(3, 2)$.

- Take the first row of $\A$ and the first column of $\B$, compute their dot product and assign the value to the entry $\C_{1, 1}$.
- Take the first row of $\A$ and the second column of $\B$, compute their dot product and assign the value to the entry $\C_{1, 2}$.
- Take the second row of $\A$ and the first column of $\B$, compute their dot product and assign the value to the entry $\C_{2, 1}$.
- Take the second row of $\A$ and the second column of $\B$, compute their dot product and assign the value to the entry $\C_{2, 2}$.
- Take the third row of $\A$ and the first column of $\B$, compute their dot product and assign the value to the entry $\C_{3, 1}$.
- Take the third row of $\A$ and the second column of $\B$, compute their dot product and assign the value to the entry $\C_{3, 2}$.

---

We have seen the above in plain English. We can easily convert it to pseudo-code, but first we need to recognize that the method above is the "Element-wise" method, where the outer loop runs over the rows of $\A$ and the inner loop runs over the columns of $\B$.

- Input: matrices $\A$ and $\B$ of size $(m, n)$ and $(n, p)$ respectively.
- Initialize matrix $\C$ as a zero matrix of shape $(m, p)$, the shape of the product $\A\B$.
- For row i from 1 to m:
    * For col j from 1 to p:
        * compute dot product of row i of A and col j of B: `dot`
        * assign the dot product to entry $C_{i, j}$
- return $\C$

The pseudo-code above hides one layer of abstraction: the computation of the dot product, which we have coded up previously. Without that abstraction, we have:

- Input: matrices $\A$ and $\B$ of size $(m, n)$ and $(n, p)$ respectively.
- Initialize matrix $\C$ as a zero matrix of shape $(m, p)$, the shape of the product $\A\B$.
- For row i from 1 to m:
    * For col j from 1 to p:
        * Initialize summation = 0
        * For k from 1 to n:
            * summation += $a_{ik} \times b_{kj}$
        * assign summation to entry $C_{i, j}$
- return $\C$

### Python Code

In [135]:
def linear_combination_vectors(
    weights: List[float], *args: np.ndarray
) -> np.ndarray:
    """Computes the linear combination of vectors.

    Args:
        weights (List[float]): The set of weights corresponding to each vector.
        *args (np.ndarray): The vectors to combine, all of the same shape.

    Returns:
        linear_weighted_sum (np.ndarray): The linear combination of vectors.

    Examples:
        >>> v1 = np.asarray([1, 2, 3, 4, 5]).reshape(-1, 1)
        >>> v2 = np.asarray([2, 4, 6, 8, 10]).reshape(-1, 1)
        >>> v3 = np.asarray([3, 6, 9, 12, 15]).reshape(-1, 1)
        >>> weights = [10, 20, 30]
        >>> linear_combination_vectors(weights, v1, v2, v3)
    """
    
    linear_weighted_sum = np.zeros(shape=args[0].shape)
    for weight, vec in zip(weights, args):

        linear_weighted_sum += weight * vec
    return linear_weighted_sum



def dot_product(v1: np.ndarray, v2: np.ndarray) -> float:
    """Computes the dot product of two vectors.

    We assume both vectors are flattened, i.e. they are 1D arrays.

    Args:
        v1 (np.ndarray): The first vector.
        v2 (np.ndarray): The second vector.

    Returns:
        dot_product_v1_v2 (float): The dot product of two vectors.

    Examples:
        >>> v1 = np.asarray([1, 2, 3, 4, 5])
        >>> v2 = np.asarray([2, 4, 6, 8, 10])
        >>> dot_product(v1, v2)
    """

    v1, v2 = np.asarray(v1).flatten(), np.asarray(v2).flatten()

    dot_product_v1_v2 = 0
    for element_1, element_2 in zip(v1, v2):
        dot_product_v1_v2 += element_1 * element_2

    # sanity check against np.dot; both inputs are flattened to 1D above,
    # so the orientation of the vectors does not matter
    assert np.isclose(dot_product_v1_v2, np.dot(v1, v2))

    return dot_product_v1_v2

In [136]:
def check_matmul_shape(A: np.ndarray, B: np.ndarray) -> Tuple[int, int, int]:
    """Check if the shape of the matrices A and B are compatible for matrix multiplication.

    If A and B are of size (m, n) and (n, p), respectively, then the shape of the resulting matrix is (m, p).

    Args:
        A (np.ndarray): The first matrix.
        B (np.ndarray): The second matrix.

    Raises:
        ValueError: Raises a ValueError if the shape of the matrices A and B are not compatible for matrix multiplication.

    Returns:
        (Tuple[int, int, int]): (m, n, p) where (m, n) is the shape of A and (n, p) is the shape of B.
    """

    if A.shape[1] != B.shape[0]:
        raise ValueError(
            f"The number of columns of A must be equal to the number of rows of B, but got {A.shape[1]} and {B.shape[0]} respectively."
        )

    return (A.shape[0], A.shape[1], B.shape[1])


def np_matmul_naive(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    """Computes the matrix multiplication of two matrices.

    Args:
        A (np.ndarray): The first matrix.
        B (np.ndarray): The second matrix.

    Returns:
        matmul (np.ndarray): The matrix multiplication of two matrices.
    """

    num_rows_A, common_index, num_cols_B = check_matmul_shape(A, B)

    matmul = np.zeros(shape=(num_rows_A, num_cols_B))

    # 1st loop: loops through the rows of the first matrix A
    for i in range(num_rows_A):
        # 2nd loop: loops through the columns of the second matrix B
        for j in range(num_cols_B):
            # reset the running sum for every entry (i, j)
            summation = 0
            # 3rd loop: computes the dot product of row i of A and column j of B
            for k in range(common_index):
                summation += A[i, k] * B[k, j]
            matmul[i, j] = summation

    return matmul


def np_matmul_element_wise(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    """Computes the matrix multiplication of two matrices using element wise method.

    Args:
        A (np.ndarray): The first matrix.
        B (np.ndarray): The second matrix.

    Returns:
        matmul (np.ndarray): The matrix multiplication of two matrices.
    """

    num_rows_A, _, num_cols_B = check_matmul_shape(A, B)

    matmul = np.zeros(shape=(num_rows_A, num_cols_B))

    # 1st loop: loops through first matrix A
    for row_i in range(num_rows_A):
        # 2nd loop: loops through second matrix B
        for col_j in range(num_cols_B):
            # computes dot product of row i with column j of B and
            # assign the result to the element of the matrix matmul at row i and column j.
            matmul[row_i, col_j] = dot_product(A[row_i, :], B[:, col_j])

    return matmul


def np_matmul_column_wise(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    """Computes the matrix multiplication of two matrices using column wise method.

    Recall the section on Matrix Multiplication using Right Multiplication.

    Column i of C is represented by: Ab_i

    Args:
        A (np.ndarray): The first matrix.
        B (np.ndarray): The second matrix.

    Returns:
        matmul (np.ndarray): The matrix multiplication of two matrices.
    """
    num_rows_A, _, num_cols_B = check_matmul_shape(A, B)

    matmul = np.zeros(shape=(num_rows_A, num_cols_B))

    # we just need to populate the columns of C

    for col_i in range(matmul.shape[1]):
        # b_i
        col_i_B = B[:, col_i]
        # Ab_i
        linear_comb_A_on_col_i_B = linear_combination_vectors(col_i_B, *A.T)
        # C_i = Ab_i
        matmul[:, col_i] = linear_comb_A_on_col_i_B

    return matmul

In [137]:
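# check that both implementations agree with numpy's A @ B
# (np.allclose compares two arrays; a third positional argument would be read as rtol)
np.allclose(np_matmul_element_wise(A, B), A @ B) and np.allclose(np_matmul_column_wise(A, B), A @ B)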

True

### Time Complexity

The time complexity can be read off directly from the pseudo-code.

- Input: matrices $\A$ and $\B$ of size $(m, n)$ and $(n, p)$ respectively.
- Initialize matrix $\C$ as a zero matrix of shape $(m, p)$, the shape of the product $\A\B$.
- For row i from 1 to m:
    * For col j from 1 to p:
        * Initialize summation = 0
        * For k from 1 to n:
            * summation += $a_{ik} \times b_{kj}$
        * assign summation to entry $C_{i, j}$
- return $\C$

In rough terms, the outer loop runs $m$ times, the second loop runs $p$ times and the innermost loop runs $n$ times. If we further assume that each innermost operation is $\mathcal{O}(1)$, we have a total of $m \times n \times p$ operations, so the time complexity is

$$\mathcal{O}(mnp) \approx \mathcal{O}(n^3)$$

if $m \approx n \approx p$.
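
The operation count can be confirmed by instrumenting the three loops. The function below is a hypothetical helper for this sketch; it performs no arithmetic on matrices and only counts how many times the innermost statement would run.

```python
def count_scalar_multiplications(m: int, n: int, p: int) -> int:
    """Count how often the innermost line of the naive triple loop executes."""
    count = 0
    for i in range(m):          # rows of A
        for j in range(p):      # columns of B
            for k in range(n):  # shared dimension
                count += 1      # stands in for one multiply-add a_ik * b_kj
    return count

print(count_scalar_multiplications(3, 2, 2))     # 12
print(count_scalar_multiplications(10, 10, 10))  # 1000
```

The first call matches the $(3, 2)$ by $(2, 2)$ example used in this section: $3 \times 2 \times 2 = 12$.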

## References

- https://eli.thegreenplace.net/2015/visualizing-matrix-multiplication-as-a-linear-combination/
- https://math.stackexchange.com/questions/192835/fastest-and-intuitive-ways-to-look-at-matrix-multiplication
- https://math.stackexchange.com/questions/24456/matrix-multiplication-interpreting-and-understanding-the-process