# Matrices and Matrix Arithmetic

Matrices are a foundational element of linear algebra. They are used throughout machine learning to describe algorithms and processes, for example as the input data variable (X) when training an algorithm.

## Defining a Matrix

We can represent a matrix in Python using a two-dimensional NumPy array. A NumPy array can be constructed from a list of lists. For example, below is a matrix with 2 rows and 3 columns.

In [1]:
# create matrix
from numpy import array
A = array([[1, 2, 3], [4, 5, 6]])
print(A)

[[1 2 3]
 [4 5 6]]
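As a quick check on the example above, the following sketch inspects the array's dimensions and reads one element. It relies only on the standard `shape` attribute and zero-based indexing; the names `rows` and `cols` are illustrative and do not come from the original text.

```python
# inspect a matrix
from numpy import array

A = array([[1, 2, 3], [4, 5, 6]])

# shape is a (rows, columns) tuple
rows, cols = A.shape
print(rows, cols)

# indexing is zero-based: row 0, column 2 holds the value 3
print(A[0, 2])
```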


## Matrix Arithmetic

In simple matrix-matrix arithmetic, all operations are performed element-wise between two matrices of equal size to result in a new matrix with the same size.

Two matrices with the same dimensions can be added together to create a new third matrix.

$C = A + B$

The scalar elements in the resulting matrix are calculated as the addition of the elements in each of the matrices being added.


$
\begin{aligned} 
C = \begin{pmatrix} a_{1,1}+b_{1,1} & a_{1,2}+b_{1,2} \\ a_{2,1}+b_{2,1} & a_{2,2}+b_{2,2} \\ a_{3,1}+b_{3,1} & a_{3,2}+b_{3,2} \end{pmatrix}
\end{aligned}
$


In [2]:
# matrix addition
from numpy import array

# define first matrix
A = array([
[1, 2, 3],
[4, 5, 6]])
print(A)

# define second matrix
B = array([
[1, 2, 3],
[4, 5, 6]])
print(B)

# add matrices
C = A + B
print(C)

[[1 2 3]
 [4 5 6]]
[[1 2 3]
 [4 5 6]]
[[ 2  4  6]
 [ 8 10 12]]
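To illustrate why the sizes must agree, the sketch below tries to add a 2 x 3 matrix to a 3 x 2 matrix. It assumes nothing beyond the `+` operator already used above; the second matrix is made up for the illustration.

```python
# element-wise addition requires compatible shapes
from numpy import array

A = array([[1, 2, 3], [4, 5, 6]])      # 2 rows, 3 columns
B = array([[1, 2], [3, 4], [5, 6]])    # 3 rows, 2 columns (hypothetical second matrix)

try:
    C = A + B
except ValueError as err:
    # NumPy cannot pair up the elements of a (2, 3) array with a (3, 2) array
    C = None
    print("cannot add:", err)
```

Note that NumPy's broadcasting rules do accept some unequal shapes, such as a matrix plus a single row, so equal shapes are the simple case described here rather than the only one NumPy will run.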


## Matrix Subtraction

Similarly, one matrix can be subtracted from another matrix with the same dimensions.

$C = A - B$

The scalar elements in the resulting matrix are calculated as the subtraction of the elements in each of the matrices.

$
\begin{aligned} 
C = \begin{pmatrix} a_{1,1}-b_{1,1} & a_{1,2}-b_{1,2} \\ a_{2,1}-b_{2,1} & a_{2,2}-b_{2,2} \\ a_{3,1}-b_{3,1} & a_{3,2}-b_{3,2} \end{pmatrix}
\end{aligned}
$

In [3]:
# matrix subtraction
from numpy import array

# define first matrix
A = array([
[1, 2, 3],
[4, 5, 6]])
print(A)

# define second matrix
B = array([
[0.5, 0.5, 0.5],
[0.5, 0.5, 0.5]])
print(B)

# subtract matrices
C = A - B
print(C)

[[1 2 3]
 [4 5 6]]
[[0.5 0.5 0.5]
 [0.5 0.5 0.5]]
[[0.5 1.5 2.5]
 [3.5 4.5 5.5]]
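The output above changes from integers to decimals because the two inputs have different element types. The sketch below makes that visible by printing each array's `dtype`; it is a small illustration under the same inputs, not something the original example states.

```python
# integer matrix minus float matrix gives a float matrix
from numpy import array

A = array([[1, 2, 3], [4, 5, 6]])                # integer elements
B = array([[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]])    # floating-point elements

C = A - B

print(A.dtype.kind)  # 'i' for a signed integer type
print(B.dtype)       # float64
print(C.dtype)       # float64: the integers are promoted before subtracting
```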


## Matrix Multiplication (Hadamard Product)

Two matrices with the same size can be multiplied together, and this is often called element-wise matrix multiplication or the Hadamard product. It is not the operation usually meant by matrix multiplication, so a different operator is often used, such as the circle operator $\circ$.

$C = A \circ B$

As with element-wise subtraction and addition, element-wise multiplication involves the multiplication of elements from each parent matrix to calculate the values in the new matrix.

$
\begin{aligned} 
C = \begin{pmatrix} a_{1,1} \times b_{1,1} & a_{1,2} \times b_{1,2} \\ a_{2,1} \times b_{2,1} & a_{2,2} \times b_{2,2} \\ a_{3,1} \times b_{3,1} & a_{3,2} \times b_{3,2} \end{pmatrix}
\end{aligned}
$

We can implement this in Python using the star operator directly on the two NumPy arrays.

In [4]:
# matrix Hadamard product
from numpy import array

# define first matrix
A = array([
[1, 2, 3],
[4, 5, 6]])
print(A)

# define second matrix
B = array([
[1, 2, 3],
[4, 5, 6]])
print(B)

# multiply matrices
C = A * B
print(C)

[[1 2 3]
 [4 5 6]]
[[1 2 3]
 [4 5 6]]
[[ 1  4  9]
 [16 25 36]]
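Because the star operator is easy to mistake for ordinary matrix multiplication, the sketch below computes both on the same pair of square matrices. The 2 x 2 values are chosen only for illustration, and `dot()` is the matrix-multiplication method described later in this document.

```python
# Hadamard product versus matrix multiplication
from numpy import array

A = array([[1, 2], [3, 4]])
B = array([[5, 6], [7, 8]])

# element-wise: each entry is a[i][j] * b[i][j]
hadamard = A * B
print(hadamard)   # [[ 5 12] [21 32]]

# matrix multiplication: rows of A combined with columns of B
product = A.dot(B)
print(product)    # [[19 22] [43 50]]
```

The two results differ in every position here, which is why the choice of operator matters.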


## Matrix Division

One matrix can be divided by another matrix with the same dimensions.

$ C=\dfrac{A}{B}$

The scalar elements in the resulting matrix are calculated as the division of the elements in each of the matrices.

$
\begin{aligned} 
C = \begin{pmatrix} \dfrac{a_{1,1}}{b_{1,1}} & \dfrac{a_{1,2}}{b_{1,2}} \\ \dfrac{a_{2,1}}{b_{2,1}} & \dfrac{a_{2,2}}{b_{2,2}} \\ \dfrac{a_{3,1}}{b_{3,1}} & \dfrac{a_{3,2}}{b_{3,2}} \end{pmatrix}
\end{aligned}
$

We can implement this in Python using the division operator directly on the two NumPy arrays.

In [5]:
# matrix division
from numpy import array

# define first matrix
A = array([
[1, 2, 3],
[4, 5, 6]])
print(A)

# define second matrix
B = array([
[1, 2, 3],
[4, 5, 6]])
print(B)

# divide matrices
C = A / B
print(C)

[[1 2 3]
 [4 5 6]]
[[1 2 3]
 [4 5 6]]
[[1. 1. 1.]
 [1. 1. 1.]]
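The example above has no zeros in the second matrix. When a divisor can contain zeros, one possible approach is sketched below: it uses NumPy's `divide()` function with its `out` and `where` arguments so that only non-zero positions are divided. Filling the skipped positions with 0.0 is an assumption made for this illustration, not a rule from the text.

```python
# element-wise division that skips zero divisors
from numpy import array, divide, zeros

A = array([[1, 2], [3, 4]])
B = array([[1, 0], [2, 0]])   # hypothetical divisor containing zeros

# divide only where B is non-zero; elsewhere keep the 0.0 already stored in `out`
C = divide(A, B, out=zeros(A.shape), where=B != 0)
print(C)
```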


## Matrix-Matrix Multiplication

Matrix multiplication, also called the matrix dot product, is more complicated than the previous operations and involves a rule, as not all matrices can be multiplied together.

$C = A \cdot B$

The rule for matrix multiplication is as follows:

- The number of columns (n) in the first matrix (A) must equal the number of rows (n) in the second matrix (B).

$C(m, k) = A(m, n) \cdot B(n, k)$

The rule also applies to a chain of matrix multiplications: the number of columns in each matrix must match the number of rows in the matrix that follows it.

$ 
\begin{aligned} 
A = \begin{pmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \\ a_{3,1} & a_{3,2} \end{pmatrix}
\end{aligned}
$


$ 
\begin{aligned} 
B = \begin{pmatrix} b_{1,1} & b_{1,2} \\ b_{2,1} & b_{2,2} \end{pmatrix}
\end{aligned}
$

$ 
\begin{aligned} 
C = \begin{pmatrix} 
a_{1,1} \times b_{1,1} + a_{1,2} \times b_{2,1} & a_{1,1} \times b_{1,2} + a_{1,2} \times b_{2,2} \\
a_{2,1} \times b_{1,1} + a_{2,2} \times b_{2,1} & a_{2,1} \times b_{1,2} + a_{2,2} \times b_{2,2} \\
a_{3,1} \times b_{1,1} + a_{3,2} \times b_{2,1} & a_{3,1} \times b_{1,2} + a_{3,2} \times b_{2,2}
\end{pmatrix}  
\end{aligned}
$ 
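The formula above can be spelled out with explicit loops to show where each element of C comes from. This is a teaching sketch only, much slower than NumPy's built-in routines; the index names `i`, `j`, and `p` are illustrative, and the values are small made-up examples.

```python
# matrix multiplication written out with loops
from numpy import array, zeros

A = array([[1, 2], [3, 4], [5, 6]])   # m x n = 3 x 2
B = array([[1, 2], [3, 4]])           # n x k = 2 x 2

m, n = A.shape
k = B.shape[1]

# C has the rows of A and the columns of B
C = zeros((m, k))
for i in range(m):          # each row of A
    for j in range(k):      # each column of B
        for p in range(n):  # sum of products along the shared dimension
            C[i, j] += A[i, p] * B[p, j]
print(C)
```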

The matrix multiplication operation can be implemented in NumPy using the dot() function. It can also be calculated using the @ operator, which is available from Python 3.5 onward. The example below demonstrates both methods.

In [6]:
# matrix dot product
from numpy import array

# define first matrix
A = array([
[1, 2],
[3, 4],
[5, 6]])
print(A)

# define second matrix
B = array([
[1, 2],
[3, 4]])
print(B)

# multiply matrices
C = A.dot(B)
print(C)

# multiply matrices with @ operator
D = A @ B
print(D)

[[1 2]
 [3 4]
 [5 6]]
[[1 2]
 [3 4]]
[[ 7 10]
 [15 22]
 [23 34]]
[[ 7 10]
 [15 22]
 [23 34]]


The dot() function is recommended for matrix multiplication for now, given the newness of the @ operator.
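The shape rule can be checked directly. The sketch below multiplies a chain of three matrices whose inner dimensions match, then reverses one pair so that they no longer match. The matrices are filled with ones purely as placeholders, and the names `D` and `failed` are illustrative.

```python
# shape rule for matrix multiplication
from numpy import ones

A = ones((3, 2))
B = ones((2, 2))
D = ones((2, 4))

# (3, 2) . (2, 2) gives (3, 2)
print(A.dot(B).shape)

# chain: (3, 2) . (2, 2) . (2, 4) gives (3, 4)
print(A.dot(B).dot(D).shape)

# reversed order: B has 2 columns but A has 3 rows, so this is rejected
failed = False
try:
    B.dot(A)
except ValueError as err:
    failed = True
    print("cannot multiply:", err)
```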

## Matrix-Vector Multiplication

A matrix and a vector can be multiplied together as long as the rule of matrix multiplication is observed: the number of columns in the matrix must equal the number of items in the vector. As with matrix multiplication, the operation can be written using the dot notation. Because the vector has only one column, the result is always a vector.

$C = A \cdot v$

The result is a vector with the same number of rows as the parent matrix.

$ 
\begin{aligned} 
A = \begin{pmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \\ a_{3,1} & a_{3,2} \end{pmatrix}
\end{aligned}
$

$
\begin{aligned}
v = \begin{pmatrix} v_{1}\\
v_{2}
\end{pmatrix}
\end{aligned}
$

$ 
\begin{aligned} 
C = \begin{pmatrix} 
a_{1,1} \times v_{1} + a_{1,2} \times v_{2} \\
a_{2,1} \times v_{1} + a_{2,2} \times v_{2} \\
a_{3,1} \times v_{1} + a_{3,2} \times v_{2}
\end{pmatrix}  
\end{aligned}
$ 

The matrix-vector multiplication can be implemented in NumPy using the dot() function.

In [7]:
# matrix-vector multiplication
from numpy import array

# define matrix
A = array([
[1, 2],
[3, 4],
[5, 6]])
print(A)

# define vector
B = array([0.5, 0.5])
print(B)

# multiply
C = A.dot(B)
print(C)

[[1 2]
 [3 4]
 [5 6]]
[0.5 0.5]
[1.5 3.5 5.5]
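The printed result above is a one-dimensional array rather than a column. The sketch below compares that with an explicit column vector built using `reshape()`; the names `v` and `col` are illustrative, and which form is preferable depends on the surrounding code.

```python
# matrix-vector product with a 1-D vector and with a column vector
from numpy import array

A = array([[1, 2], [3, 4], [5, 6]])

v = array([0.5, 0.5])      # 1-D array, shape (2,)
col = v.reshape(2, 1)      # explicit column vector, shape (2, 1)

flat_result = A.dot(v)
col_result = A.dot(col)

print(flat_result.shape)   # (3,)
print(col_result.shape)    # (3, 1)
```

Both results hold the same three values, one per row of the matrix.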


## Matrix-Scalar Multiplication

A matrix can be multiplied by a scalar. This can be represented using the dot notation between the matrix and the scalar.

$C = A \cdot b$

The result is a matrix with the same size as the parent matrix where each element of the matrix is multiplied by the scalar value.

$ 
\begin{aligned} 
A = \begin{pmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \\ a_{3,1} & a_{3,2} \end{pmatrix}
\end{aligned}
$

$ 
\begin{aligned} 
C = \begin{pmatrix} 
a_{1,1} \times b & a_{1,2} \times b \\
a_{2,1} \times b & a_{2,2} \times b \\
a_{3,1} \times b & a_{3,2} \times b
\end{pmatrix}  
\end{aligned}
$ 

This can be implemented directly in NumPy with the multiplication operator.

In [8]:
# matrix-scalar multiplication
from numpy import array

# define matrix
A = array([[1, 2], [3, 4], [5, 6]])
print(A)

# define scalar
b = 0.5
print(b)

# multiply
C = A * b
print(C)

[[1 2]
 [3 4]
 [5 6]]
0.5
[[0.5 1. ]
 [1.5 2. ]
 [2.5 3. ]]
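Two properties of scalar multiplication can be checked with a short sketch: the order of the operands does not matter, and the scalar distributes over matrix addition. The second matrix `B` is made up for the illustration, and the values are chosen so the floating-point comparison is exact.

```python
# properties of matrix-scalar multiplication
from numpy import array, array_equal

A = array([[1, 2], [3, 4], [5, 6]])
B = array([[6, 5], [4, 3], [2, 1]])   # hypothetical second matrix
b = 0.5

# the scalar can be written on either side
same_order = array_equal(A * b, b * A)
print(same_order)

# scaling a sum equals the sum of the scaled matrices
distributes = array_equal(b * (A + B), b * A + b * B)
print(distributes)
```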
