# Deconstructing Matrix Multiplication 

Here, we will "deconstruct" matrix multiplication: that is, we will view a matrix product in terms of its individual columns, its individual rows, or subsets of them.

## Computing columns of a matrix product

Suppose we had two large matrices $A\in \mathbb{R}^{n\times m}$ and $B\in\mathbb{R}^{m\times p}$ that contain a bunch of information, but we're only interested in computing the $i$th column of the product $AB$. 

A naive way to find this column is to first compute the product $AB$ and then select the $i$th column using slicing in Python. Let's try this approach.

We first define two random matrices $A$ and $B$.

In [None]:
import numpy as np

n, m, p = 1000, 100, 1000

A = np.random.rand(n, m)
B = np.random.randn(m, p)

Let's time how long it takes to compute $AB$ and then select the $i$th column of the product.

In [None]:
import time

i = 20

tic = time.time()
AB = np.dot(A,B)
ith_column = AB[:,i]
print('time taken to compute AB and select the ith column: ', time.time()- tic)

This works, but as we'll see, it is not the most efficient way to find the desired column.

Let's write $B$ in block form, representing it in terms of its columns.
$$
B = \begin{pmatrix}|& | && |\\B_{:,1}&  B_{:,2}& \cdots & B_{:,p}\\ |&|&&|\end{pmatrix}
$$
Then the product $AB$ can be written as
$$
AB = A\begin{pmatrix}|& | && |\\B_{:,1}&  B_{:,2}& \cdots & B_{:,p}\\ |&|&&|\end{pmatrix}
$$
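Since multiplying by $A$ on the left acts on each column of $B$ separately, we can carry $A$ inside the block and write the product column by column:
$$
AB = \begin{pmatrix}|& | && |\\AB_{:,1}&  AB_{:,2}& \cdots & AB_{:,p}\\ |&|&&|\end{pmatrix}
$$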
From this representation, we see that the $i$th column of $AB$ is just $AB_{:,i}$, the matrix-vector product of $A$ with the $i$th column of $B$. We can therefore compute the $i$th column of $AB$ without computing the whole matrix $AB$ first: we select the $i$th column $B_{:,i}$ of $B$ and apply $A$ to it. Let's try this method and compare its running time with the approach above.

In [None]:
tic = time.time()
ith_column_fast = np.dot(A,B[:,i])
print('time taken to compute A*B[:,i]: ', time.time()- tic)

As we can see, this method is much faster, since the full product computes all $p$ columns of $AB$ when only one of them is needed. As the matrices get larger, this speedup will only become greater. Let's also verify that the two approaches give the same result.

In [None]:
np.allclose(ith_column, ith_column_fast)
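To see the same identity on numbers small enough to check by hand, here is a minimal sketch with two tiny integer matrices. The names `A_small` and `B_small` are made up for this illustration and are not used elsewhere in the notebook.

```python
import numpy as np

# Tiny matrices (hypothetical example data) so the arithmetic can be checked by hand.
A_small = np.array([[1, 2],
                    [3, 4],
                    [5, 6]])        # shape (3, 2)
B_small = np.array([[1, 0, 2],
                    [0, 1, 3]])     # shape (2, 3)

j = 2  # zero-based index, i.e. the 3rd column

# Slow route: form the whole product, then slice out one column.
column_from_full = (A_small @ B_small)[:, j]

# Fast route: apply A_small only to the j-th column of B_small.
column_direct = A_small @ B_small[:, j]

print(column_from_full)  # [ 8 18 28]
print(column_direct)     # [ 8 18 28]
```

By hand, the third column of `B_small` is $(2, 3)$, so the entries are $1\cdot 2 + 2\cdot 3 = 8$, $3\cdot 2 + 4\cdot 3 = 18$, and $5\cdot 2 + 6\cdot 3 = 28$.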

This method generalizes easily to selecting a subset of the columns of $AB$. For example, suppose we want the $1$st, $5$th, and $11$th columns of $AB$. Then we can multiply $A$ by only columns $1$, $5$, and $11$ of $B$. Because Python indexing starts at zero, these are indices `0`, `4`, and `10` in the following code.

In [None]:
cols = [0,4,10]

tic = time.time()
AB = np.dot(A,B)
subset_of_columns_slow = AB[:,cols]
print('time taken to compute AB and select subset of columns: ', time.time()- tic)

tic = time.time()
subset_of_columns_fast = np.dot(A,B[:,cols])
print('time taken to compute A*B[:,cols]: ', time.time()- tic)

And again we can verify that the two approaches give the same result.

In [None]:
np.allclose(subset_of_columns_slow, subset_of_columns_fast)
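A single call to `time.time()` can be noisy, so the numbers above may vary from run to run. As a hedged alternative, the sketch below repeats each computation several times with the standard-library `timeit` module and keeps the best run. The helper names `slow` and `fast`, the arrays `A_demo` and `B_demo`, and the repeat counts are choices made for this illustration only.

```python
import timeit
import numpy as np

# Illustrative data with the same shapes as in the cells above.
rng = np.random.default_rng(0)
A_demo = rng.random((1000, 100))
B_demo = rng.standard_normal((100, 1000))
demo_cols = [0, 4, 10]

def slow():
    # Form the full product, then keep three columns.
    return (A_demo @ B_demo)[:, demo_cols]

def fast():
    # Multiply A_demo by only the three needed columns of B_demo.
    return A_demo @ B_demo[:, demo_cols]

runs = 5
# timeit.repeat returns one total time per repeat; take the best and average per call.
slow_time = min(timeit.repeat(slow, number=runs, repeat=3)) / runs
fast_time = min(timeit.repeat(fast, number=runs, repeat=3)) / runs

print(f"full product then slice: {slow_time:.6f} s per call")
print(f"slice then multiply:     {fast_time:.6f} s per call")
```

Taking the minimum over repeats is a common convention for reducing the effect of other processes on the measurement; the exact timings will depend on your machine.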

## Computing rows of a matrix product

As with columns in the section above, we can also take advantage of the structure of matrix multiplication when computing a single row of a matrix product $AB$. To see this, let's write
$$
A = \begin{pmatrix}- &A_{1,:}^\top & -\\ - &A_{2,:}^\top & -\\ &\vdots&\\ - &A_{n,:}^\top& -\end{pmatrix}
$$
where $A_{i,:}^\top$ is the $i$th row of $A$. Then if we write out the matrix product $AB$ as
$$
AB = \begin{pmatrix}- &A_{1,:}^\top & -\\ - &A_{2,:}^\top & -\\ &\vdots&\\ - &A_{n,:}^\top& -\end{pmatrix}B
$$
we observe that the $i$th row of $AB$ is given by $A_{i,:}^\top B$. Let's compare this method to the naive approach of computing the full product $AB$ and then selecting the $i$th row.

In [None]:
i = 20

tic = time.time()
AB = np.dot(A,B)
ith_row = AB[i,:]
print('time taken to compute AB and select the ith row: ', time.time()- tic)

tic = time.time()
ith_row_fast = np.dot(A[i,:],B)
print('time taken to compute A[i,:]*B: ', time.time()- tic)

As expected, computing $A_{i,:}^\top B$ is substantially faster than computing $AB$ and then extracting the $i$th row. Let's verify that they do indeed give the same result.

In [None]:
np.allclose(ith_row, ith_row_fast)

Likewise, we can follow the same approach as above to select a subset of the rows of the product $AB$. For example, if we want the 4th, 12th, and 20th rows of $AB$, we can compute them with the following code.

In [None]:
rows = [3, 11, 19]

tic = time.time()
AB = np.dot(A,B)
subset_of_rows_slow = AB[rows,:]
print('time taken to compute AB and select subset of rows: ', time.time()- tic)

tic = time.time()
subset_of_rows_fast = np.dot(A[rows,:],B)
print('time taken to compute A[rows,:]*B: ', time.time()- tic)

And we can verify that the two methods give the same result.

In [None]:
np.allclose(subset_of_rows_slow, subset_of_rows_fast)
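The row and column tricks can also be combined. If only the entries of $AB$ in a few rows and a few columns are needed, one can multiply the selected rows of $A$ by the selected columns of $B$. The following is a minimal sketch under that assumption; `A_demo`, `B_demo`, `demo_rows`, and `demo_cols` are illustrative names, and `np.ix_` is used only to slice the same block out of the full product for comparison.

```python
import numpy as np

# Illustrative data with the same shapes as in the cells above.
rng = np.random.default_rng(0)
A_demo = rng.random((1000, 100))
B_demo = rng.standard_normal((100, 1000))

demo_rows = [3, 11, 19]   # 4th, 12th, and 20th rows
demo_cols = [0, 4, 10]    # 1st, 5th, and 11th columns

# Slow route: full product, then pick out the 3x3 block of interest.
block_slow = (A_demo @ B_demo)[np.ix_(demo_rows, demo_cols)]

# Fast route: multiply only the needed rows of A_demo by the needed columns of B_demo.
block_fast = A_demo[demo_rows, :] @ B_demo[:, demo_cols]

print(block_fast.shape)                      # (3, 3)
print(np.allclose(block_slow, block_fast))   # True
```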

For both of the above examples (finding columns and finding rows of $AB$), the speedup becomes even more dramatic as we make the matrices larger. This is because computing the full product $AB$ produces all $n \times p$ entries, so the number of entries we compute and then throw away grows with the dimensions. You can see this yourself by changing the values of $n$, $m$, and $p$ in the cells above and re-running the same code given here. In data science, we often encounter very large matrices when working with big datasets, and keeping the structure of operations like matrix multiplication in mind when working with these datasets can save you a great deal of computation time in practice.
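To put rough numbers on this, the sketch below counts scalar multiplications under the textbook algorithm, in which each entry of a product costs $m$ multiplications. This is an idealized count, not a prediction of wall-clock time, since optimized linear algebra libraries behave differently in practice; the variable names are chosen for this illustration.

```python
# Dimensions used in the cells above: A is n x m, B is m x p.
n, m, p = 1000, 100, 1000

# Each entry of a product is a dot product of length m.
full_product = n * m * p    # all n*p entries of AB
single_column = n * m       # the n entries of A @ B[:, i]
single_row = m * p          # the p entries of A[i, :] @ B

print("full product: ", full_product)    # 100000000
print("single column:", single_column)   # 100000
print("single row:   ", single_row)      # 100000

# The idealized savings factor is p for one column and n for one row.
print("column savings factor:", full_product // single_column)  # 1000
print("row savings factor:   ", full_product // single_row)     # 1000
```

Under this count, the savings factor for a single column equals $p$ and for a single row equals $n$, which is why the gap widens as those dimensions grow.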