# 3. Arithmetic
Matrices are a foundational element of linear algebra. Matrices are used throughout the field of machine learning in the description of algorithms and processes such as the input data variable (X) when training an algorithm. In this tutorial, you will discover matrices in linear algebra and how to manipulate them in Python. After completing this tutorial, you will know:
- What a matrix is and how to define one in Python with NumPy.
- How to perform element-wise operations such as addition, subtraction, and the Hadamard product.
- How to multiply matrices together and the intuition behind the operation.

## 3.1 What is a Matrix
A matrix is a two-dimensional array of scalars with one or more columns and one or more rows.

> A matrix is a two-dimensional array (a table) of numbers.
>
> — Page 115, No Bullshit Guide To Linear Algebra, 2017.

The notation for a matrix is often an uppercase letter, such as A, and entries are referred to by their two-dimensional subscript of row (*i*) and column (*j*), such as $a_{i,j}$. For example, we can define a 3-row, 2-column matrix:

$A = ((a_{1,1},a_{1,2}),(a_{2,1},a_{2,2}),(a_{3,1},a_{3,2}))$

It is more common to see matrices defined using a horizontal notation.

$A = \begin{bmatrix}a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \\ a_{3,1} & a_{3,2} \end{bmatrix}$

A likely first place you may encounter a matrix in machine learning is in model training data comprised of many rows and columns and often represented using the capital letter X. The geometric analogy used to help understand vectors and some of their operations does not hold with matrices. Further, a vector itself may be considered a matrix with one column and multiple rows. Often the dimensions of the matrix are denoted as $m$ and $n$ or $m×n$ for the number of rows and the number of columns respectively. Now that we know what a matrix is, let’s look at defining one in Python.

## 3.2 Defining a Matrix
We can represent a matrix in Python using a two-dimensional NumPy array. A NumPy array
can be constructed given a list of lists. For example, below is a 2 row, 3 column matrix.

In [1]:
# create matrix
from numpy import array
A = array([[1, 2, 3], [4, 5, 6]])
print(A)

[[1 2 3]
 [4 5 6]]
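
As a small illustrative sketch (not part of the original example), the `shape` attribute of a NumPy array reports the number of rows and columns, matching the $m$ and $n$ described earlier. Note that NumPy indexes from 0, so the entry written $a_{1,1}$ in the math notation is `A[0, 0]` in code.

```python
from numpy import array

# the same 2-row, 3-column matrix as above
A = array([[1, 2, 3], [4, 5, 6]])

# shape is reported as (rows, columns)
print(A.shape)

# zero-based indexing: a_{1,1} is A[0, 0] and a_{2,3} is A[1, 2]
print(A[0, 0])
print(A[1, 2])
```

With this matrix, the shape is `(2, 3)` and the two printed entries are 1 and 6.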


## 3.3 Matrix Arithmetic
This section demonstrates simple matrix-matrix arithmetic, where all operations are
performed element-wise between two matrices of equal size to result in a new matrix with the
same size.

### 3.3.1 Matrix Addition
Two matrices with the same dimensions can be added together to create a new third matrix.

$C = A + B$

The scalar elements in the resulting matrix are calculated as the addition of the elements in each of the matrices being added.

$C = \begin{bmatrix}a_{1,1} + b_{1,1} & a_{1,2} + b_{1,2} & a_{1,3} + b_{1,3} \\ a_{2,1} + b_{2,1} & a_{2,2} + b_{2,2} & a_{2,3} + b_{2,3} \\ a_{3,1} + b_{3,1} & a_{3,2} + b_{3,2} & a_{3,3} + b_{3,3} \end{bmatrix}$

We can implement this in Python using the plus operator directly on the two NumPy arrays.

In [2]:
# matrix addition
from numpy import array
# define first matrix
A = array([
[1, 2, 3],
[4, 5, 6]])
print(A)
# define second matrix
B = array([
[1, 2, 3],
[4, 5, 6]])
print(B)
# add matrices
C = A + B
print(C)

[[1 2 3]
 [4 5 6]]
[[1 2 3]
 [4 5 6]]
[[ 2  4  6]
 [ 8 10 12]]
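
The requirement that both matrices have the same dimensions can be checked with a minimal sketch like the one below. The flag name `shapes_compatible` is only illustrative. NumPy does accept some unequal shapes through its broadcasting rules (for example a matrix and a single row), but a 2×3 and a 3×2 matrix cannot be combined this way and raise a `ValueError`.

```python
from numpy import array

A = array([[1, 2, 3], [4, 5, 6]])      # shape (2, 3)
B = array([[1, 2], [3, 4], [5, 6]])    # shape (3, 2)

try:
    C = A + B
    shapes_compatible = True
except ValueError as error:
    # the two shapes cannot be matched element by element
    shapes_compatible = False
    print("cannot add:", error)

print(shapes_compatible)
```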


### 3.3.2 Matrix Subtraction
Similarly, one matrix can be subtracted from another matrix with the same dimensions.

$C = A - B$

The scalar elements in the resulting matrix are calculated as the subtraction of the elements
in each of the matrices.

$C = \begin{bmatrix}a_{1,1} - b_{1,1} & a_{1,2} - b_{1,2} & a_{1,3} - b_{1,3} \\ a_{2,1} - b_{2,1} & a_{2,2} - b_{2,2} & a_{2,3} - b_{2,3} \\ a_{3,1} - b_{3,1} & a_{3,2} - b_{3,2} & a_{3,3} - b_{3,3} \end{bmatrix}$

We can implement this in Python using the minus operator directly on the two NumPy
arrays.

In [3]:
# matrix subtraction
from numpy import array
# define first matrix
A = array([
[1, 2, 3],
[4, 5, 6]])
print(A)
# define second matrix
B = array([
[0.5, 0.5, 0.5],
[0.5, 0.5, 0.5]])
print(B)
# subtract matrices
C = A - B
print(C)

[[1 2 3]
 [4 5 6]]
[[0.5 0.5 0.5]
 [0.5 0.5 0.5]]
[[0.5 1.5 2.5]
 [3.5 4.5 5.5]]


### 3.3.3 Matrix Multiplication (Hadamard Product)
Two matrices with the same size can be multiplied together, and this is often called element-wise
matrix multiplication or the Hadamard product. It is not the typical operation meant when
referring to matrix multiplication, therefore a different operator is often used, such as a circle $\circ$.

$C = A \circ B$

As with element-wise subtraction and addition, element-wise multiplication involves the
multiplication of elements from each parent matrix to calculate the values in the new matrix.

$C = \begin{bmatrix}a_{1,1} \times b_{1,1} & a_{1,2} \times b_{1,2} & a_{1,3} \times b_{1,3} \\ a_{2,1} \times b_{2,1} & a_{2,2} \times b_{2,2} & a_{2,3} \times b_{2,3} \\ a_{3,1} \times b_{3,1} & a_{3,2} \times b_{3,2} & a_{3,3} \times b_{3,3} \end{bmatrix}$

We can implement this in Python using the star operator directly on the two NumPy arrays.

In [4]:
# matrix Hadamard product
from numpy import array
# define first matrix
A = array([
[1, 2, 3],
[4, 5, 6]])
print(A)
# define second matrix
B = array([
[1, 2, 3],
[4, 5, 6]])
print(B)
# multiply matrices
C = A * B
print(C)

[[1 2 3]
 [4 5 6]]
[[1 2 3]
 [4 5 6]]
[[ 1  4  9]
 [16 25 36]]
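
As a brief aside, NumPy also exposes the element-wise product as the function `multiply()`. The sketch below, using the same two matrices, suggests that it gives the same result as the star operator. The names `C_star` and `C_func` are chosen only for this illustration.

```python
from numpy import array, array_equal, multiply

A = array([[1, 2, 3], [4, 5, 6]])
B = array([[1, 2, 3], [4, 5, 6]])

# Hadamard product with the star operator
C_star = A * B
# the same element-wise product via the multiply() function
C_func = multiply(A, B)

print(array_equal(C_star, C_func))
```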


### 3.3.4 Matrix Division
One matrix can be divided by another matrix with the same dimensions.

$C = A / B$

The scalar elements in the resulting matrix are calculated as the division of the elements in
each of the matrices.

$C = \begin{bmatrix}\frac{a_{1,1}}{b_{1,1}} & \frac{a_{1,2}}{b_{1,2}} & \frac{a_{1,3}}{b_{1,3}} \\ \frac{a_{2,1}}{b_{2,1}} & \frac{a_{2,2}}{b_{2,2}} & \frac{a_{2,3}}{b_{2,3}} \\ \frac{a_{3,1}}{b_{3,1}} & \frac{a_{3,2}}{b_{3,2}} & \frac{a_{3,3}}{b_{3,3}} \end{bmatrix}$

We can implement this in Python using the division operator directly on the two NumPy
arrays.

In [5]:
# matrix division
from numpy import array
# define first matrix
A = array([
[1, 2, 3],
[4, 5, 6]])
print(A)
# define second matrix
B = array([
[1, 2, 3],
[4, 5, 6]])
print(B)
# divide matrices
C = A / B
print(C)

[[1 2 3]
 [4 5 6]]
[[1 2 3]
 [4 5 6]]
[[1. 1. 1.]
 [1. 1. 1.]]
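
The trailing dots in the output above indicate floating-point values. As a small sketch of why, the `/` operator on NumPy arrays performs true division and returns a floating-point array even when both inputs hold integers, whereas the floor-division operator `//` keeps an integer type. The variable name `F` is only illustrative.

```python
from numpy import array

A = array([[1, 2, 3], [4, 5, 6]])
B = array([[1, 2, 3], [4, 5, 6]])

# true division: integer inputs, floating-point result
C = A / B
print(C.dtype)

# floor division: the result stays an integer type
F = A // B
print(F.dtype.kind)
```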


## 3.4 Matrix-Matrix Multiplication
Matrix multiplication, also called the matrix dot product, is more complicated than the previous
operations and involves a rule, as not all matrices can be multiplied together.

$C = A · B$

or

$C= AB$

The rule for matrix multiplication is as follows:

- The number of columns (n) in the first matrix (A) must equal the number of rows (n) in the second matrix (B).

For example, matrix $A$ has the dimensions $m$ rows and $n$ columns and matrix $B$ has the
dimensions $n$ rows and $k$ columns. The $n$ columns in $A$ and $n$ rows in $B$ are equal. The result is a new matrix
with $m$ rows and $k$ columns.

$C(m,k) = A(m,n)·B(n,k)$

This rule applies for a chain of matrix multiplications where the number of columns in one matrix in the chain must match the number of rows in the following matrix in the chain.

> One of the most important operations involving matrices is multiplication of two matrices. The matrix product of matrices $A$ and $B$ is a third matrix $C$. In order for this product to be defined, $A$ must have the same number of columns as $B$ has rows. If $A$ is of shape $m×n$ and $B$ is of shape $n×p$, then $C$ is of shape $m×p$.
>
> — Page 34, *Deep Learning*, 2016.
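
The shape rule can be explored with a minimal sketch that tracks only dimensions. It uses NumPy's `@` matrix multiplication operator, which is introduced later in this section, and matrices filled with ones, since only their shapes matter here. The matrix sizes and the flag name `inner_match` are assumptions made for illustration.

```python
from numpy import ones

A = ones((3, 2))   # m=3 rows, n=2 columns
B = ones((2, 4))   # 2 rows, 4 columns
C = ones((4, 1))   # 4 rows, 1 column

# chain: (3, 2) @ (2, 4) @ (4, 1) gives (3, 1)
D = A @ B @ C
print(D.shape)

try:
    # (2, 4) @ (3, 2): 4 columns do not match 3 rows
    B @ A
    inner_match = True
except ValueError as error:
    inner_match = False
    print("shape mismatch:", error)
```

Reversing the order of `A` and `B` breaks the rule in this case, which is why NumPy raises a `ValueError`.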

The intuition for matrix multiplication is that we are calculating the dot product between each row in matrix $A$ and each column in matrix $B$. For example, we can step down the rows of matrix $A$ and multiply each with column 1 in $B$ to give the scalar values in column 1 of $C$. The matrix multiplication is described below using matrix notation.

$A = \begin{bmatrix}a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \\ a_{3,1} & a_{3,2} \end{bmatrix}$

$B = \begin{bmatrix}b_{1,1} & b_{1,2} \\ b_{2,1} & b_{2,2} \end{bmatrix}$

$C = \begin{bmatrix}a_{1,1} \times b_{1,1} + a_{1,2} \times b_{2,1} & a_{1,1} \times b_{1,2} + a_{1,2} \times b_{2,2} \\ a_{2,1} \times b_{1,1} + a_{2,2} \times b_{2,1} & a_{2,1} \times b_{1,2} + a_{2,2} \times b_{2,2} \\ a_{3,1} \times b_{1,1} + a_{3,2} \times b_{2,1} & a_{3,1} \times b_{1,2} + a_{3,2} \times b_{2,2} \end{bmatrix}$

The matrix multiplication operation can be implemented in NumPy using the `dot()` function.
It can also be calculated using the newer `@` operator, available since Python 3.5. The example
below demonstrates both methods.

In [8]:
# matrix dot product
from numpy import array
# define first matrix
A = array([
[1, 2],
[3, 4],
[5, 6]])
print(A)
# define second matrix
B = array([
[1, 2],
[3, 4]])
print(B)
# multiply matrices
C = A.dot(B)
print(C)
# multiply matrices with @ operator
D = A @ B
print(D)

[[1 2]
 [3 4]
 [5 6]]
[[1 2]
 [3 4]]
[[ 7 10]
 [15 22]
 [23 34]]
[[ 7 10]
 [15 22]
 [23 34]]
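
To connect the result back to the row-by-column intuition, the following sketch computes the same product with explicit loops and compares it with `dot()`. It is written for clarity rather than speed, and the loop variable names (`i`, `j`, `p`) are only illustrative.

```python
from numpy import array, array_equal, zeros

A = array([[1, 2], [3, 4], [5, 6]])
B = array([[1, 2], [3, 4]])

m, n = A.shape      # rows and columns of A
k = B.shape[1]      # columns of B

# result has m rows and k columns
C = zeros((m, k), dtype=A.dtype)
for i in range(m):
    for j in range(k):
        # dot product of row i of A with column j of B
        C[i, j] = sum(A[i, p] * B[p, j] for p in range(n))

print(C)
print(array_equal(C, A.dot(B)))
```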


## 3.5 Matrix-Vector Multiplication
A matrix and a vector can be multiplied together as long as the rule of matrix multiplication is observed. Specifically, the number of columns in the matrix must equal the number of items in the vector. As with matrix multiplication, the operation can be written using the dot notation. Because the vector only has one column, the result is always a vector.

$c= A·v $

Or without the dot in a compact form.

$c= Av $

The result is a vector with the same number of rows as the parent matrix.

$A = \begin{bmatrix}a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \\ a_{3,1} & a_{3,2} \end{bmatrix}$

$v = \begin{bmatrix}v_1 \\ v_2 \end{bmatrix}$

$c = \begin{bmatrix} a_{1,1} \times v_1 + a_{1,2} \times v_2 \\ a_{2,1} \times v_1 + a_{2,2} \times v_2 \\ a_{3,1} \times v_1 + a_{3,2} \times v_2 \end{bmatrix}$

Or, more compactly.

$c = \begin{bmatrix} a_{1,1}v_1 + a_{1,2}v_2 \\ a_{2,1}v_1 + a_{2,2}v_2 \\ a_{3,1}v_1 + a_{3,2}v_2 \end{bmatrix}$

The matrix-vector multiplication can be implemented in NumPy using the `dot()` function.

In [9]:
# matrix-vector multiplication
from numpy import array
# define matrix
A = array([
[1, 2],
[3, 4],
[5, 6]])
print(A)
# define vector
v = array([0.5, 0.5])
print(v)
# multiply
c = A.dot(v)
print(c)

[[1 2]
 [3 4]
 [5 6]]
[0.5 0.5]
[1.5 3.5 5.5]
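
In the example above the vector is a one-dimensional array, so the result is also one-dimensional. As a sketch of an alternative, the vector can be stored as an explicit column (a matrix with one column), in which case the result is a column with the same number of rows as the matrix. The names `v_col` and `c_col` are illustrative only.

```python
from numpy import array

A = array([[1, 2], [3, 4], [5, 6]])

# 1-D vector with 2 items
v = array([0.5, 0.5])
c = A.dot(v)
print(c.shape)

# the same vector as an explicit 2-row, 1-column array
v_col = v.reshape(2, 1)
c_col = A.dot(v_col)
print(c_col.shape)
print(c_col)
```

Both forms hold the same values, 1.5, 3.5, and 5.5. Only the shape of the result differs, `(3,)` versus `(3, 1)`.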


## 3.6 Matrix-Scalar Multiplication
A matrix can be multiplied by a scalar. This can be represented using the dot notation between
the matrix and the scalar.

$C= A·b$

Or without the dot notation.

$C= Ab$

The result is a matrix with the same size as the parent matrix where each element of the
matrix is multiplied by the scalar value.

$A = \begin{bmatrix}a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \\ a_{3,1} & a_{3,2} \end{bmatrix}$


$C = \begin{bmatrix} a_{1,1} \times b & a_{1,2} \times b \\ a_{2,1} \times b & a_{2,2} \times b \\ a_{3,1} \times b & a_{3,2} \times b \end{bmatrix}$

or

$C = \begin{bmatrix} a_{1,1}b & a_{1,2}b \\ a_{2,1}b & a_{2,2}b \\ a_{3,1}b & a_{3,2}b \end{bmatrix}$

This can be implemented directly in NumPy with the multiplication operator.

In [10]:
# matrix-scalar multiplication
from numpy import array
# define matrix
A = array([[1, 2], [3, 4], [5, 6]])
print(A)
# define scalar
b = 0.5
print(b)
# multiply
C = A * b
print(C)

[[1 2]
 [3 4]
 [5 6]]
0.5
[[0.5 1. ]
 [1.5 2. ]
 [2.5 3. ]]
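
As a short check, and assuming the same matrix and scalar as above, the order of the operands does not matter for matrix-scalar multiplication in NumPy: `A * b` and `b * A` give the same matrix. The names `left` and `right` are only illustrative.

```python
from numpy import array, array_equal

A = array([[1, 2], [3, 4], [5, 6]])
b = 0.5

# scalar on the right and on the left
left = A * b
right = b * A

print(array_equal(left, right))
```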
