# Four ways to look at matrix multiplication

If $A$ is $m\times n$ and $B$ is $n\times p$, then $C=AB$ is an $m\times p$ matrix. We may take the following formula as its definition:
\begin{equation}
C_{ij} = \sum_{k=1}^n A_{ik}B_{kj}.
\end{equation}

In [1]:
using LinearAlgebra

In [2]:
m,n,p = 4,5,3;
A = round.(5*rand(m,n));
B = round.(3*rand(n,p).-1);
display(A), display(B);

4×5 Array{Float64,2}:
 1.0  0.0  4.0  1.0  0.0
 3.0  3.0  4.0  3.0  2.0
 1.0  1.0  1.0  3.0  4.0
 4.0  3.0  4.0  1.0  0.0

5×3 Array{Float64,2}:
 1.0  -1.0  -1.0
 0.0  -0.0   1.0
 0.0  -1.0   2.0
 1.0   1.0   1.0
 1.0   2.0   2.0

In [3]:
C = A*B

4×3 Array{Float64,2}:
 2.0  -4.0   8.0
 8.0   0.0  15.0
 8.0   9.0  13.0
 5.0  -7.0   8.0

Here is a literal interpretation of the summation definition of the matrix product. Notice that Julia has "implicit for" loops: written as the argument of a function call, they form a generator that feeds values to the function one at a time; written inside brackets, they form a comprehension that builds a vector or matrix.

In [4]:
C_0 = [ sum(A[i,k]*B[k,j] for k=1:n) for i=1:m, j=1:p ]

4×3 Array{Float64,2}:
 2.0  -4.0   8.0
 8.0   0.0  15.0
 8.0   9.0  13.0
 5.0  -7.0   8.0
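
The comprehension above packs three loops into one line. As a minimal sketch of the same computation with the loops written out (the function name `matmul_loops` is hypothetical, and the small matrices are made up for illustration):

```julia
# Hypothetical helper: matrix product computed straight from the summation definition.
function matmul_loops(A, B)
    m, n = size(A)
    n2, p = size(B)
    n == n2 || throw(DimensionMismatch("inner dimensions must agree"))
    C = zeros(promote_type(eltype(A), eltype(B)), m, p)
    for j in 1:p, i in 1:m        # one entry of C per (i, j) pair
        for k in 1:n              # the sum over the shared index k
            C[i, j] += A[i, k] * B[k, j]
        end
    end
    return C
end

A_small = [1.0 2.0; 3.0 4.0; 5.0 6.0]   # 3×2, illustrative values
B_small = [1.0 0.0; -1.0 2.0]           # 2×2, illustrative values
matmul_loops(A_small, B_small) == A_small * B_small
```

The dimension check is an addition for safety; the definition itself only makes sense when the inner dimensions agree.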

## Inner products

If the matrices are real, then we can interpret each sum as the inner product of two vectors of length $n$: row $i$ of $A$ and column $j$ of $B$. In Julia we can use `dot` for the inner product, or the infix operator `⋅`, which is entered as `\cdot` followed by the Tab key.

In [5]:
C_1 = [ A[i,:]⋅B[:,j] for i=1:m, j=1:p ]

4×3 Array{Float64,2}:
 2.0  -4.0   8.0
 8.0   0.0  15.0
 8.0   9.0  13.0
 5.0  -7.0   8.0
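
As a worked instance with the matrices displayed earlier, the entry $C_{23}$ is the inner product of row 2 of $A$ with column 3 of $B$:
$$
C_{23} = \begin{bmatrix} 3 & 3 & 4 & 3 & 2 \end{bmatrix} \begin{bmatrix} -1 \\ 1 \\ 2 \\ 1 \\ 2 \end{bmatrix} = -3 + 3 + 8 + 3 + 4 = 15.
$$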

Note that `A[i,:]` and `B[:,j]` extract one row and one column, respectively. In Julia, each result will be of type `Vector`, and the shape distinction is not preserved, as every vector is simply one-dimensional. (This is unlike MATLAB, where even vectors are regarded as having two dimensions, one with size 1.) 

In [6]:
A[2,:]

5-element Array{Float64,1}:
 3.0
 3.0
 4.0
 3.0
 2.0

In [7]:
size(ans)

(5,)
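
If the row shape does matter, one option is to index with a range instead of a scalar, which keeps that dimension. A small sketch (the matrix `M` is made up for illustration):

```julia
M = [1.0 2.0 3.0; 4.0 5.0 6.0]   # 2×3, illustrative values

size(M[2, :])     # scalar row index: the result is a plain Vector of length 3
size(M[:, 3])     # a column also comes back as a plain Vector, of length 2
size(M[2:2, :])   # range row index: the result is still a Matrix, of size 1×3
```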

## Linear combinations of columns

If we express $B$ columnwise, then the matrix product $AB$ can also be expressed columnwise, as
\begin{equation}
AB = \begin{bmatrix} A b_1 & A b_2 & \cdots & A b_p \end{bmatrix}.
\end{equation}

In [8]:
b1 = B[:,1]

5-element Array{Float64,1}:
 1.0
 0.0
 0.0
 1.0
 1.0

In [9]:
display(C[:,1]), display(A*b1);

4-element Array{Float64,1}:
 2.0
 8.0
 8.0
 5.0

4-element Array{Float64,1}:
 2.0
 8.0
 8.0
 5.0

Furthermore, $A$ times a compatible vector is a linear combination of the columns of $A$:
$$
Av = v_1 a_1 + \cdots + v_n a_n.
$$

In [10]:
display(A*b1), display( sum(b1[k]*A[:,k] for k=1:n) );

4-element Array{Float64,1}:
 2.0
 8.0
 8.0
 5.0

4-element Array{Float64,1}:
 2.0
 8.0
 8.0
 5.0
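
Written out with the values displayed above, $b_1$ has nonzero entries only in positions 1, 4, and 5, so only those columns of $A$ contribute:
$$
A b_1 = 1\begin{bmatrix}1\\3\\1\\4\end{bmatrix} + 1\begin{bmatrix}1\\3\\3\\1\end{bmatrix} + 1\begin{bmatrix}0\\2\\4\\0\end{bmatrix} = \begin{bmatrix}2\\8\\8\\5\end{bmatrix}.
$$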

Putting this all together, the full columnwise interpretation of $C=AB$ is:

In [11]:
C_2 = hcat( ( sum(B[k,j]*A[:,k] for k=1:n) for j=1:p )... )

4×3 Array{Float64,2}:
 2.0  -4.0   8.0
 8.0   0.0  15.0
 8.0   9.0  13.0
 5.0  -7.0   8.0

Of course, there is no reason to go through all that in practice, since `A*B` computes the same result directly.
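
If the columnwise view is wanted for its own sake, a somewhat tidier sketch is to fill the columns of a preallocated result one at a time. The matrices below are copied from the output shown earlier so that the block runs on its own; the name `C_cols` is hypothetical.

```julia
A = [1.0 0.0 4.0 1.0 0.0;
     3.0 3.0 4.0 3.0 2.0;
     1.0 1.0 1.0 3.0 4.0;
     4.0 3.0 4.0 1.0 0.0]
B = [1.0 -1.0 -1.0;
     0.0  0.0  1.0;
     0.0 -1.0  2.0;
     1.0  1.0  1.0;
     1.0  2.0  2.0]

C_cols = zeros(size(A, 1), size(B, 2))
for j in axes(B, 2)
    C_cols[:, j] = A * B[:, j]    # column j of AB is A times column j of B
end
C_cols == A * B
```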

## Linear combinations of rows

This is the dual of the previous version:

$$
AB = \begin{bmatrix} a_1^T B \\ \vdots \\ a_m^T B \end{bmatrix}
$$

We put the transposes in because we want all named vectors to be column vectors. Thus, each row of $A$ has to have a transpose on it. Note also that applying `'` to a Julia vector creates a "row vector" (for real entries, the adjoint `'` is the same as the transpose):

In [12]:
a3T = A[3,:]'

1×5 Adjoint{Float64,Array{Float64,1}}:
 1.0  1.0  1.0  3.0  4.0

Thus, in the third row of the product:

In [13]:
display(C[3,:]), display(a3T*B);

3-element Array{Float64,1}:
  8.0
  9.0
 13.0

1×3 Adjoint{Float64,Array{Float64,1}}:
 8.0  9.0  13.0

These hold the same values, although the second result has row shape. Furthermore, each such vector-matrix product is a linear combination of the rows of $B$:

In [14]:
display(a3T*B), display( sum(a3T[k]*B[k,:] for k=1:n) );

1×3 Adjoint{Float64,Array{Float64,1}}:
 8.0  9.0  13.0

3-element Array{Float64,1}:
  8.0
  9.0
 13.0
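
With the values displayed above, the entries of row 3 of $A$ are the coefficients applied to the rows of $B$:
$$
a_3^T B = 1\begin{bmatrix}1 & -1 & -1\end{bmatrix} + 1\begin{bmatrix}0 & 0 & 1\end{bmatrix} + 1\begin{bmatrix}0 & -1 & 2\end{bmatrix} + 3\begin{bmatrix}1 & 1 & 1\end{bmatrix} + 4\begin{bmatrix}1 & 2 & 2\end{bmatrix} = \begin{bmatrix}8 & 9 & 13\end{bmatrix}.
$$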

Finally, doing this for all the rows of $A$, we have another identity for the product $AB$.

In [15]:
C_3 = vcat( ( sum(A[i,k]*B[k,:]' for k=1:n) for i=1:m )... )

4×3 Array{Float64,2}:
 2.0  -4.0   8.0
 8.0   0.0  15.0
 8.0   9.0  13.0
 5.0  -7.0   8.0
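
The duality with the column view can be made concrete through the identity $(AB)^T = B^T A^T$: row $i$ of $AB$, stored as a column, is $B^T$ times row $i$ of $A$. A minimal sketch, using small made-up matrices (`P`, `Q`, and `R_rows` are hypothetical names):

```julia
P = [1.0 2.0; 3.0 4.0; 5.0 6.0]   # 3×2, illustrative values
Q = [1.0 0.0 2.0; -1.0 1.0 0.0]   # 2×3, illustrative values

R_rows = zeros(size(P, 1), size(Q, 2))
for i in axes(P, 1)
    R_rows[i, :] = Q' * P[i, :]   # row i of PQ, computed as a column vector
end
R_rows == P * Q
```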

## Outer products

This form might be the most surprising, and it leads to some interesting perspectives. The outer product between two vectors is the matrix formed by all possible products of pairs of elements from them. 

In [16]:
outer(u,v) = [ u[i]v[j] for i=1:length(u), j=1:length(v) ];

In [17]:
outer(A[:,2],B[4,:])

4×3 Array{Float64,2}:
 0.0  0.0  0.0
 3.0  3.0  3.0
 1.0  1.0  1.0
 3.0  3.0  3.0

The outer product is consistent with the definition of the matrix product $uv^T$. Note that this is a *column* times a *row*, the reverse of the inner product, and the two vectors do not even need to have the same length.

In [18]:
A[:,2]*B[4,:]'

4×3 Array{Float64,2}:
 0.0  0.0  0.0
 3.0  3.0  3.0
 1.0  1.0  1.0
 3.0  3.0  3.0

Because each column of $uv^T$ is a multiple of $u$ (or equivalently, each row is a multiple of $v^T$), any nonzero outer product has rank 1. 
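
The rank claim, along with the earlier remark about lengths, can be checked numerically. The sketch below uses vectors of lengths 4 and 3 with illustrative values, and relies on the `rank` function from `LinearAlgebra`, which estimates the rank from the singular values:

```julia
using LinearAlgebra

u = [0.0, 3.0, 1.0, 3.0]   # length 4, illustrative values
v = [1.0, 1.0, 1.0]        # length 3, illustrative values

U = u * v'                 # 4×3 outer product of vectors of different lengths
size(U)
rank(U)                    # numerical rank of a nonzero outer product
rank(zeros(4) * v')        # the zero outer product is the exception, with rank 0
```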

An interesting identity is that a general matrix product can be written as a sum of outer products:
$$
AB = \sum_{k=1}^n a_k b_k^T,
$$
where we are writing $A$ by its columns and $B$ by its rows. 

In [19]:
C_4 = sum( A[:,k]*B[k,:]' for k=1:n )

4×3 Array{Float64,2}:
 2.0  -4.0   8.0
 8.0   0.0  15.0
 8.0   9.0  13.0
 5.0  -7.0   8.0
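
To see why the identity holds, note that the $(i,j)$ entry of the outer product $a_k b_k^T$ is $A_{ik}B_{kj}$, since $a_k$ is column $k$ of $A$ and $b_k^T$ is row $k$ of $B$. Summing over $k$ recovers the definition of the product:
$$
\left( \sum_{k=1}^n a_k b_k^T \right)_{ij} = \sum_{k=1}^n (a_k)_i (b_k)_j = \sum_{k=1}^n A_{ik} B_{kj} = C_{ij}.
$$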