# Matrix Algebra
## Basic Matrix Operations

Like vectors, matrices have their own set of algebraic operations. In this mission, we'll learn the core matrix operations and build up to using some of them to solve the matrix equation. Let's start with matrix addition and subtraction.

If you recall from the previous mission, a matrix consists of one or more column vectors.

![](https://s3.amazonaws.com/dq-content/162/matrix_vector_decomposition.svg)

Because of that, the operations from vectors also carry over to matrices. We could perform vector addition and subtraction between vectors with the same number of rows. We can perform matrix addition and subtraction between matrices containing the same number of rows and columns.

![](https://s3.amazonaws.com/dq-content/162/valid_matrix_sums.svg)

As with vectors, matrix addition and subtraction work by applying the operation to each pair of corresponding elements.

![](https://s3.amazonaws.com/dq-content/162/matrix_addition.svg)

Lastly, we can also multiply a matrix by a scalar value, just like we can with a vector.

![](https://s3.amazonaws.com/dq-content/162/matrix_scalar_multiplication.svg)

Let's practice applying these operators using NumPy.
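
As a minimal sketch of these three operations, the cell below adds, subtracts, and scales two small matrices. The values and the variable names (`sum_left`, `sum_right`, and so on) are made up for illustration.

```python
import numpy as np

# Two matrices with the same shape (2 rows, 3 columns)
sum_left = np.asarray([
    [1, 2, 3],
    [4, 5, 6]
])
sum_right = np.asarray([
    [10, 20, 30],
    [40, 50, 60]
])

matrix_sum = sum_left + sum_right     # element-wise addition
matrix_diff = sum_right - sum_left    # element-wise subtraction
matrix_scaled = 2 * sum_left          # every element multiplied by the scalar 2

print(matrix_sum)
print(matrix_diff)
print(matrix_scaled)
```

Note that NumPy broadcasting can silently accept some mismatched shapes, so it's worth checking `.shape` on both arrays when the goal is strict matrix addition.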

## Matrix Vector Multiplication

The matrix equation we discussed briefly in the last mission is an example of matrix-vector multiplication. When we multiply a **matrix by a vector**, we take the dot product of each row in the matrix with the column vector.

![](https://s3.amazonaws.com/dq-content/162/matrix_vector_multiplication.svg)

To multiply a matrix by a vector, the number of columns in the matrix needs to match the number of rows in the vector.

![](https://s3.amazonaws.com/dq-content/162/valid_matrix_products.svg)

To multiply a matrix with a vector in NumPy, we need to use the [numpy.dot()](https://docs.scipy.org/doc/numpy/reference/generated/numpy.dot.html) function.

In [1]:
import numpy as np

In [2]:
matrix_a = np.asarray([
    [0.7, 3, 9],
    [1.7, 2, 9],
    [0.7, 9, 2]
])

vector_b = np.asarray([
    [1],
    [2],
    [1]
])

ab_product = np.dot(matrix_a, vector_b)
print(ab_product)

[[ 15.7]
 [ 14.7]
 [ 20.7]]
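
To make the row-by-row picture concrete, here is a small sketch that rebuilds the same product one row at a time and compares it with a single `np.dot()` call. The names `vector_flat`, `row_products`, and `manual_product` are illustrative only.

```python
import numpy as np

matrix_a = np.asarray([
    [0.7, 3, 9],
    [1.7, 2, 9],
    [0.7, 9, 2]
])
vector_flat = np.asarray([1, 2, 1])  # 1-D version of the column vector above

# Take the dot product of each matrix row with the vector, one row at a time
row_products = [float(np.dot(row, vector_flat)) for row in matrix_a]

# Stack the results back into a 3 x 1 column vector
manual_product = np.asarray(row_products).reshape(-1, 1)

# The row-by-row result should agree with a single np.dot() call
print(np.allclose(manual_product, np.dot(matrix_a, vector_flat.reshape(-1, 1))))
```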


## Matrix Multiplication

Because a matrix consists of column vectors, we can extend what we learned about matrix vector multiplication to multiply matrices together. In matrix vector multiplication, we performed a dot product between each row in the matrix and the column vector. In matrix multiplication, we extend this to perform a dot product between each row in the first matrix and each column in the second matrix.

![](https://s3.amazonaws.com/dq-content/162/matrix_multiplication.svg)

As with matrix vector multiplication, the number of columns in the first matrix needs to match the number of rows in the second matrix.

![](https://s3.amazonaws.com/dq-content/162/valid_matrix_multiplication.svg)

Note that the order of multiplication also matters.

![](https://s3.amazonaws.com/dq-content/162/matrix_multiplication_2.svg)

To multiply matrices in NumPy, we use the same `numpy.dot()` function we used in the last screen.

In [6]:
matrix_a = np.asarray([
    [0.7, 3],
    [1.7, 2],
    [0.7, 9]
], dtype=np.float32)

matrix_b = np.asarray([
    [113, 3, 10],
    [1, 0, 1],
], dtype=np.float32)

product_ab = np.dot(matrix_a, matrix_b)
product_ba = np.dot(matrix_b, matrix_a)

try:
    print(np.equal(product_ab, product_ba))
except ValueError:
    print('Array sizes are different.')
    print('product_ab shape:',product_ab.shape)
    print('product_ba shape:',product_ba.shape)
    
print(product_ab)
print(product_ba)

Array sizes are different.
product_ab shape: (3, 3)
product_ba shape: (2, 2)
[[  82.09999847    2.0999999    10.        ]
 [ 194.1000061     5.10000038   19.        ]
 [  88.09999847    2.0999999    16.        ]]
[[  91.19999695  435.        ]
 [   1.39999998   12.        ]]
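
The order of multiplication matters even when both products have the same shape. The sketch below uses two made-up 2 x 2 matrices (`square_a` and `square_b` are illustrative names) to show this.

```python
import numpy as np

square_a = np.asarray([
    [1, 2],
    [3, 4]
])
square_b = np.asarray([
    [0, 1],
    [1, 0]
])

square_ab = np.dot(square_a, square_b)  # swaps the columns of square_a
square_ba = np.dot(square_b, square_a)  # swaps the rows of square_a

print(square_ab)
print(square_ba)

# Same shape, different values
print(np.array_equal(square_ab, square_ba))
```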


## Matrix Transpose

The transpose of a matrix switches its rows and columns. You can think of the transpose operation as flipping the matrix over its main diagonal, so the first row becomes the first column, and so on. In data science, we often work with data tables of different dimensions. Because of the requirements for matrix multiplication, we sometimes take the transpose of a matrix so we can multiply matrices whose shapes wouldn't otherwise be compatible.

Here's what the transpose of a matrix looks like visually:

![](https://upload.wikimedia.org/wikipedia/commons/e/e4/Matrix_transpose.gif)

Mathematically, we use the notation $A^T$ to specify the transpose operation.

$$A^T+B^T=C$$

The transpose follows a few rules, some of which are fairly intuitive. For example, when taking the transpose of the sum of two matrices, we can distribute the transpose operation to each matrix:

$$(A+B)^T=A^T+B^T$$

One counterintuitive rule is when we take the transpose of the product of 2 matrices:

$$(AB)^T=B^TA^T$$

Let's explore these properties using NumPy. To compute the transpose of a NumPy ndarray, we can use the [numpy.transpose()](https://docs.scipy.org/doc/numpy-1.12.0/reference/generated/numpy.transpose.html) function or the equivalent `.T` attribute, which is what the cells below use.

In [8]:
transpose_a = matrix_a.T
print(transpose_a)
print(matrix_a)

[[ 0.69999999  1.70000005  0.69999999]
 [ 3.          2.          9.        ]]
[[ 0.69999999  3.        ]
 [ 1.70000005  2.        ]
 [ 0.69999999  9.        ]]


In [13]:
trans_ba = np.dot(matrix_b.T, matrix_a.T)
trans_ab = np.dot(matrix_a.T, matrix_b.T)

product_ab = np.dot(matrix_a, matrix_b)
print(product_ab.T)

# confirm that the transpose of product_ab is the same as trans_ba
print(np.equal(product_ab.T, trans_ba))

[[  82.09999847  194.1000061    88.09999847]
 [   2.0999999     5.10000038    2.0999999 ]
 [  10.           19.           16.        ]]
[[ True  True  True]
 [ True  True  True]
 [ True  True  True]]
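
The sum rule $(A+B)^T=A^T+B^T$ can be checked the same way. This is a minimal sketch with made-up integer matrices; `sum_a` and `sum_b` are illustrative names.

```python
import numpy as np

sum_a = np.asarray([
    [1, 2, 3],
    [4, 5, 6]
])
sum_b = np.asarray([
    [10, 20, 30],
    [40, 50, 60]
])

lhs = (sum_a + sum_b).T      # transpose of the sum
rhs = sum_a.T + sum_b.T      # sum of the transposes

print(lhs)
print(np.array_equal(lhs, rhs))
```

With floating point data, `np.allclose()` is usually a safer comparison than exact equality, since rounding can differ slightly between two ways of computing the same value.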


## Identity Matrix

In the matrix equation that we discussed in the last mission, we're trying to solve for the vector $\vec{x}$.

$$A\vec{x}=\vec{b}$$

Right now, the matrix $A$ multiplies the vector $\vec{x}$ and we need a way to cancel $A$.

Let's look at the identity matrix, which we touched on briefly at the end of the first mission in this course. If you recall, the identity matrix contains $1$ along the main diagonal and $0$ elsewhere. Here's what the $2 \times 2$ identity matrix looks like, often represented symbolically using $I_2$:

$$
I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
$$

When we multiply $I_2$ with any vector containing 2 elements, the resulting vector matches the original vector exactly:

$$I_2 \vec{x} = \vec{x}$$

This is because each element in the vector is multiplied exactly once by the diagonal `1` value in the identity matrix:

![](https://s3.amazonaws.com/dq-content/162/identity_matrix.svg)

**If we can transform matrix $A$ into the identity matrix, then only the solution vector $\vec{x}$ will remain**. Let's practice working with the identity matrix before exploring how to transform $A$ into $I$.

We can create any $I_n$ identity matrix using the [numpy.identity()](https://docs.scipy.org/doc/numpy/reference/generated/numpy.identity.html) function. This function only has 1 required parameter, `n`, which specifies the `n x n` identity matrix we want.


In [15]:
i_2 = np.asarray([
    [1, 0],
    [0, 1]
])

i_3 = np.asarray([
    [1,0,0],
    [0,1,0],
    [0,0,1]
])

matrix_33 = np.asarray([
    [1,2,3],
    [4,5,6],
    [7,8,9]
])

matrix_23 = np.asarray([
    [1,2,3],
    [4,5,6]
])

identity_33 = np.dot(i_3, matrix_33)
identity_23 = np.dot(i_2, matrix_23)

print(np.equal(identity_33, matrix_33))
print(np.equal(identity_23, matrix_23))

[[ True  True  True]
 [ True  True  True]
 [ True  True  True]]
[[ True  True  True]
 [ True  True  True]]
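
The identity matrices above were typed out by hand. As a short sketch, the same check can be written with `numpy.identity()`; the names `generated_i3` and `sample_33` are illustrative.

```python
import numpy as np

# numpy.identity(n) returns an n x n identity matrix (float values by default)
generated_i3 = np.identity(3)

sample_33 = np.asarray([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

print(generated_i3)

# Multiplying by the identity matrix leaves the matrix unchanged
print(np.array_equal(np.dot(generated_i3, sample_33), sample_33))
```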


## Matrix Inverse

Now that we're more familiar with the identity matrix, let's discuss how to cancel the coefficient matrix $A$. Said another way, we want to transform $A$ into the identity matrix $I$. Multiplying the **inverse** of a matrix by the matrix accomplishes this task.

The matrix inverse is similar to the idea of the multiplicative inverse. For example, let's say we want to solve for $x$ in the equation $5x=10$. To do so, we need to multiply both sides by the multiplicative inverse of $5$, which is $5^{-1}$ (or $1/5$):

$$5^{-1}*5x = 5^{-1}*10$$

The inverse of $5$ transforms it to $1$ and leaves us with the solution: $x=2$. To solve for the vector $\vec{x}$ in the matrix equation, we need to multiply both sides by the inverse of $A$:

$$A^{-1}A\vec{x}=A^{-1}\vec{b}$$

This simplifies to $I\vec{x}=A^{-1}\vec{b}$ and we're then left with the formula for calculating the solution vector:

$$\vec{x} = A^{-1}\vec{b}$$

While we use the matrix inverse to cancel out specific terms in the same fashion as the multiplicative inverse, the calculation is completely different. Let's understand the calculation for the inverse of a $2 \times 2$ matrix.

If $A =  \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ then<br>
$A^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$<br>

The term $ad-bc$ is known as the **determinant** and is often written as $\det(A)=ad-bc$ or as $|A|=ad-bc$. Because we divide by the determinant when calculating the matrix inverse, **a 2 x 2 matrix is only invertible if the determinant ($ad-bc$) is not equal to 0**. In this step and the next step, we'll focus on finding the matrix inverse when $A$ is a $2 \times 2$ matrix. Later in this mission, we'll walk through how to compute the matrix inverse for a higher dimensional matrix ($3 \times 3$ and greater).
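
As a worked example of the formula, take a made-up matrix (not one from this mission) with $a=4$, $b=7$, $c=2$, $d=6$. The determinant is $4 \cdot 6 - 7 \cdot 2 = 10$, which is not $0$, so the inverse exists:

$$
A = \begin{bmatrix} 4 & 7 \\ 2 & 6 \end{bmatrix}
\quad\Rightarrow\quad
A^{-1} = \frac{1}{10} \begin{bmatrix} 6 & -7 \\ -2 & 4 \end{bmatrix}
= \begin{bmatrix} 0.6 & -0.7 \\ -0.2 & 0.4 \end{bmatrix}
$$

Multiplying back confirms the result:

$$
A A^{-1} = \begin{bmatrix} 4(0.6) + 7(-0.2) & 4(-0.7) + 7(0.4) \\ 2(0.6) + 6(-0.2) & 2(-0.7) + 6(0.4) \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
$$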

Let's implement the matrix inverse in Python before moving on to solving the matrix equation.

In [16]:
matrix_a = np.asarray([
    [1.5, 3],
    [1, 4]
])

In [18]:
matrix_a[0,1]

3.0

In [19]:
def matrix_inverse_two(arr22):
    
    det22 = arr22[0,0]*arr22[1,1]-arr22[0,1]*arr22[1,0]
    assert det22 != 0
    
    arr_rearranged = np.asarray([
                    [arr22[1,1], -arr22[0,1]],
                    [-arr22[1,0], arr22[0,0]]
    ])
    
    return 1/det22*arr_rearranged

In [20]:
inverse_a = matrix_inverse_two(matrix_a)

i_2 = np.dot(matrix_a, inverse_a)
print(i_2)

[[ 1.  0.]
 [ 0.  1.]]


## Solving The Matrix Equation

Now that we know how to compute the matrix inverse, we can solve our system using the matrix equation $A\vec{x}=\vec{b}$:

$$\begin{bmatrix}
30 & -1 \\
50 & -1
\end{bmatrix} \begin{bmatrix} x_1\\ x_2 \end{bmatrix} = \begin{bmatrix} -1000\\ -100 \end{bmatrix}$$

We start by left-multiplying both sides by $A^{-1}$:

$$\begin{bmatrix}
30 & -1 \\
50 & -1
\end{bmatrix}^{-1} \begin{bmatrix}
30 & -1 \\
50 & -1
\end{bmatrix} \begin{bmatrix} x_1\\ x_2 \end{bmatrix} = \begin{bmatrix}
30 & -1 \\
50 & -1
\end{bmatrix}^{-1} \begin{bmatrix} -1000\\ -100 \end{bmatrix}$$

This simplifies to:

$$\begin{bmatrix} x_1\\ x_2 \end{bmatrix} = \begin{bmatrix}
30 & -1 \\
50 & -1
\end{bmatrix}^{-1} \begin{bmatrix} -1000\\ -100 \end{bmatrix}$$

Let's finish this last step in Python. To compute the inverse of a NumPy ndarray, we need to use the [numpy.linalg.inv()](https://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.inv.html) function.

In [22]:
coef_matrix = np.asarray([
                [30, -1],
                [50, -1]
])

solution_x = np.dot(np.linalg.inv(coef_matrix),
                   np.asarray([[-1000],[-100]])
                   )
print(solution_x)

[[   45.]
 [ 2350.]]
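
As an alternative sketch, NumPy also provides `numpy.linalg.solve()`, which solves $A\vec{x}=\vec{b}$ directly without forming the inverse explicitly. This is not the approach taken in this mission, just a cross-check; the names `constants`, `direct_solution`, and `reconstructed_b` are illustrative.

```python
import numpy as np

coef_matrix = np.asarray([
    [30, -1],
    [50, -1]
])
constants = np.asarray([[-1000], [-100]])

# Solve A x = b directly, without computing the inverse of A first
direct_solution = np.linalg.solve(coef_matrix, constants)
print(direct_solution)

# Sanity check: plugging the solution back in should reproduce b
reconstructed_b = np.dot(coef_matrix, direct_solution)
print(np.allclose(reconstructed_b, constants))
```

Multiplying the coefficient matrix by the solution is a quick way to verify any answer, whichever way it was computed.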


## Determinant For Higher Dimensions

Before we discuss how to compute the matrix inverse for higher dimensional matrices, let's dive deeper into the determinant and introduce some more terminology. So far, we've mostly worked with matrices that contain the same number of rows and columns. These matrices are known as **square matrices** and we can only compute the determinant and matrix inverse for square matrices. In addition, we can only compute the matrix inverse of a square matrix when the determinant is not equal to $0$.

To find the determinant of a higher dimensional square matrix, we need to use the more general form of the determinant. Here's what that looks like:

![](https://s3.amazonaws.com/dq-content/178/3d_determinant_one.svg)

The determinant of a higher-dimensional system involves breaking down the full matrix into **minor matrices**. First, we select a row or column (most teaching materials select the first row). For the first value in that row, we "hide" the other values in that row (2nd and 3rd value in the row) and in that column (2nd and 3rd value in the column), select the rest of the elements as the minor matrix, and multiply the scalar value with the determinant of the minor matrix. We repeat this for the remaining values in the first row, then combine the results with alternating signs (add the first term, subtract the second, add the third). This diagram illustrates the process more clearly:

![](https://s3.amazonaws.com/dq-content/178/3d_determinant_two.svg)

Here's a concrete example:

![](https://s3.amazonaws.com/dq-content/178/3d_determinant_three.svg)

To compute the determinant in NumPy, we use the `numpy.linalg.det()` function. We'll leave it to you to read the [documentation](https://docs.scipy.org/doc/numpy-1.12.0/reference/generated/numpy.linalg.det.html) and learn how to use this function.


In [23]:
matrix_22 = np.asarray([
    [8, 4],
    [4, 2]
])

matrix_33 = np.asarray([
    [1, 1, 1],
    [1, 1, 6],
    [7, 8, 9]
])

In [24]:
matrix_33[1,1:]

array([1, 6])

In [26]:
matrix_33[2, :2]

array([7, 8])

In [27]:
def get_determinant22(arr22):
    return arr22[0,0]*arr22[1,1]-arr22[1,0]*arr22[0,1]

def get_determinant33(arr33):
    
    minor1 = np.asarray([
                arr33[1,1:],
                arr33[2,1:]
    ])
    
    minor2 = np.asarray([
                [arr33[1,0], arr33[1,2]],
                [arr33[2,0], arr33[2,2]]
    ])
    
    minor3 = np.asarray([
                arr33[1, :2],
                arr33[2, :2]
    ])
    
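    # multiply each minor's determinant by its first-row value; signs alternate +, -, +
    return (arr33[0,0]*get_determinant22(minor1)-\
            arr33[0,1]*get_determinant22(minor2)+\
            arr33[0,2]*get_determinant22(minor3))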
    

det_22 = get_determinant22(matrix_22)
det_33 = get_determinant33(matrix_33)
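
One way to sanity-check a hand-written determinant is to compare it with `numpy.linalg.det()`. The sketch below is a compact, loop-based version of the first-row expansion, tried on a made-up matrix whose first row is not all ones. The function name `det_3x3_first_row` and the matrix `example_33` are illustrative, not part of the mission's code.

```python
import numpy as np

def det_3x3_first_row(arr33):
    """Expand the determinant of a 3 x 3 matrix along its first row."""
    total = 0.0
    for col in range(3):
        # Hide row 0 and the current column to get the 2 x 2 minor matrix
        minor = np.delete(np.delete(arr33, 0, axis=0), col, axis=1)
        minor_det = minor[0, 0] * minor[1, 1] - minor[0, 1] * minor[1, 0]
        # Signs alternate +, -, + across the first row
        sign = (-1) ** col
        total += sign * arr33[0, col] * minor_det
    return total

example_33 = np.asarray([
    [2, 1, 3],
    [0, 4, 1],
    [5, 2, 6]
])

manual_det = det_3x3_first_row(example_33)
numpy_det = np.linalg.det(example_33)

print(manual_det)
# np.linalg.det() works in floating point, so compare with a tolerance
print(np.isclose(manual_det, numpy_det))
```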

## Matrix Inverse For Higher Dimensions

To calculate the matrix inverse for a 3 by 3 or larger matrix, we also need to work with the more general form of the matrix inverse equation. Similar to the determinant for higher-dimensional matrices, the matrix inverse works by generating minor matrices that depend on the position in the matrix. Here's a diagram describing the matrix inverse for a 3 by 3 matrix:

![](https://s3.amazonaws.com/dq-content/162/3d_matrix_inverse.svg)

While it's helpful to know how to compute the inverse this way for higher dimensional matrices, the amount of careful arithmetic you have to do by hand is large. Thankfully, the `numpy.linalg.inv()` function can work with any invertible `n x n` square matrix.
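
As a brief sketch, here is `numpy.linalg.inv()` applied to a made-up 3 x 3 matrix, with a check that the matrix times its inverse gives the identity matrix. The names `invertible_33` and `inverse_33` are illustrative.

```python
import numpy as np

invertible_33 = np.asarray([
    [2, 1, 3],
    [0, 4, 1],
    [5, 2, 6]
])

# The inverse only exists when the determinant is not 0
det_value = np.linalg.det(invertible_33)
print(det_value)

inverse_33 = np.linalg.inv(invertible_33)
print(inverse_33)

# A matrix multiplied by its inverse should give the identity matrix
# (compared with a tolerance because of floating point rounding)
print(np.allclose(np.dot(invertible_33, inverse_33), np.identity(3)))
```

If the matrix is singular (determinant equal to 0), `numpy.linalg.inv()` raises a `numpy.linalg.LinAlgError` instead of returning a result.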
