# Matrices

## Define a matrix

In [1]:
import numpy as np

In [4]:
# create matrix
A = np.array([[1, 2, 3], [4, 5, 6]])
print(A)

[[1 2 3]
 [4 5 6]]


## Matrix Arithmetic

The operations in this section are performed element-wise between two matrices of equal size
and produce a new matrix of the same size.
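
As a minimal sketch of the equal-size requirement (the matrix values below are illustrative and not taken from the cells in this notebook), NumPy applies `+` element by element when the shapes match and raises a `ValueError` when they cannot be aligned:

```python
import numpy as np

# two matrices of equal size (2 x 3): the operation is applied element by element
A = np.array([[1, 2, 3], [4, 5, 6]])
B = np.array([[10, 20, 30], [40, 50, 60]])
C = A + B
print(C.shape)  # same shape as both inputs

# a 3 x 2 matrix cannot be combined element-wise with a 2 x 3 matrix
D = np.array([[1, 2], [3, 4], [5, 6]])
try:
    A + D
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print("shape mismatch:", err)
```

NumPy's broadcasting rules do accept some unequal shapes, such as a matrix combined with a scalar, so the equal-size rule describes the mathematical operation rather than everything NumPy will allow.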

### Matrix Addition

Two matrices with the same dimensions can be added together to create a new third matrix.

C = A + B

The scalar elements in the resulting matrix are calculated as the addition of the elements in
each of the matrices being added.

In [5]:
# define first matrix
A = np.array([
    [1, 2, 3],
    [4, 5, 6]
])
print(A)

# define second matrix
B = np.array([
    [1, 2, 3],
    [4, 5, 6]
])
print(B)

# add matrices
C = A + B
print(C)

[[1 2 3]
 [4 5 6]]
[[1 2 3]
 [4 5 6]]
[[ 2  4  6]
 [ 8 10 12]]


### Matrix Subtraction

Similarly, one matrix can be subtracted from another matrix with the same dimensions.

C = A - B

The scalar elements in the resulting matrix are calculated as the subtraction of the elements
in each of the matrices.

In [6]:
# define first matrix
A = np.array([
    [1, 2, 3],
    [4, 5, 6]
])
print(A)

# define second matrix
B = np.array([
    [0.5, 0.5, 0.5],
    [0.5, 0.5, 0.5]
])
print(B)

# subtract matrices
C = A - B
print(C)

[[1 2 3]
 [4 5 6]]
[[0.5 0.5 0.5]
 [0.5 0.5 0.5]]
[[0.5 1.5 2.5]
 [3.5 4.5 5.5]]


### Matrix Multiplication (Hadamard Product)

Two matrices with the same size can be multiplied together, and this is often called element-wise
matrix multiplication or the Hadamard product. It is not the typical operation meant when
referring to matrix multiplication, therefore a different operator is often used, such as a circle ∘.

C = A ∘ B

As with element-wise subtraction and addition, element-wise multiplication involves the
multiplication of elements from each parent matrix to calculate the values in the new matrix.

In [7]:
# define first matrix
A = np.array([
    [1, 2, 3],
    [4, 5, 6]
])
print(A)

# define second matrix
B = np.array([
    [1, 2, 3],
    [4, 5, 6]
])
print(B)

# multiply matrices
C = A * B
print(C)

[[1 2 3]
 [4 5 6]]
[[1 2 3]
 [4 5 6]]
[[ 1  4  9]
 [16 25 36]]
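
To make the distinction between the two kinds of multiplication concrete, the sketch below compares the `*` operator with `np.multiply` and with the matrix product `@`. The transpose of `B` is used only so that the shapes are compatible for `@`; that choice is an assumption made for illustration.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])
B = np.array([[1, 2, 3], [4, 5, 6]])

# '*' and np.multiply both compute the element-wise (Hadamard) product
hadamard = A * B
same_as_multiply = np.array_equal(hadamard, np.multiply(A, B))
print(same_as_multiply)

# the matrix product needs matching inner dimensions, so B is transposed here:
# (2 x 3) @ (3 x 2) gives a 2 x 2 result, unlike the 2 x 3 Hadamard product
matrix_product = A @ B.T
print(matrix_product)
```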


### Matrix Division

One matrix can be divided by another matrix with the same dimensions.

C = A / B

The scalar elements in the resulting matrix are calculated as the division of the elements in
each of the matrices.

In [8]:
# define first matrix
A = np.array([
    [1, 2, 3],
    [4, 5, 6]
])
print(A)

# define second matrix
B = np.array([
    [1, 2, 3],
    [4, 5, 6]
])
print(B)

# divide matrices
C = A / B
print(C)

[[1 2 3]
 [4 5 6]]
[[1 2 3]
 [4 5 6]]
[[1. 1. 1.]
 [1. 1. 1.]]
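
Two practical details of element-wise division are worth a short sketch: integer inputs produce a floating-point result, and zeros in the divisor need handling. The divisor `B_with_zeros` and the fill value `0.0` below are assumptions chosen for illustration, not part of the example above.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])

# '/' is true division, so integer inputs give a float result
ratio = A / A
print(ratio.dtype)

# illustrative divisor that contains zeros
B_with_zeros = np.array([[1, 0, 3], [4, 5, 0]])

# divide only where the divisor is non-zero; other positions keep the fill value 0.0
C = np.divide(
    A,
    B_with_zeros,
    out=np.zeros(A.shape, dtype=float),
    where=(B_with_zeros != 0),
)
print(C)
```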


### Matrix-Matrix Multiplication

Matrix multiplication, also called the matrix dot product, is more complicated than the previous
operations and involves a rule, as not all matrices can be multiplied together.

C = A.B

or

C = AB

The rule for matrix multiplication is as follows:

 -- The number of columns (n) in the first matrix (A) must equal the number of rows (n) in
the second matrix (B).

For example, matrix A has m rows and n columns and matrix B has n rows and k columns.
The number of columns in A and the number of rows in B are both n. The result is a new matrix with m rows and k columns.

C(m, k) = A(m, n) . B(n, k)

This rule applies for a chain of matrix multiplications where the number of columns in one
matrix in the chain must match the number of rows in the following matrix in the chain.

One of the most important operations involving matrices is multiplication of two
matrices. The matrix product of matrices A and B is a third matrix C. In order for
this product to be defined, A must have the same number of columns as B has rows.
If A is of shape m x n and B is of shape n x p, then C is of shape m x p.

The intuition for matrix multiplication is that we are calculating the dot product between
each row in matrix A and each column in matrix B. For example, we can step down the rows of
matrix A and multiply each with column 1 in B to give the scalar values in column 1 of C.
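
The row-by-column intuition can be written as a formula. The index symbol l is introduced here only for illustration; m, n and k follow the shapes used above, with A of shape m x n and B of shape n x k.

```latex
c_{ij} = \sum_{l=1}^{n} a_{il}\, b_{lj}, \qquad 1 \le i \le m,\quad 1 \le j \le k
```

As a worked case, take A with rows (1, 2), (3, 4), (5, 6) and B with rows (1, 2), (3, 4):

```latex
c_{11} = 1 \cdot 1 + 2 \cdot 3 = 7, \qquad
c_{12} = 1 \cdot 2 + 2 \cdot 4 = 10, \qquad
c_{21} = 3 \cdot 1 + 4 \cdot 3 = 15
```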


In [9]:
#  define first matrix
A = np.array([
    [1, 2],
    [3, 4],
    [5, 6]
])
print(A)

#  define second matrix
B = np.array([
    [1, 2],
    [3, 4]
])
print(B)

# multiply matrices
C = A.dot(B)
print(C)

# multiply matrices with the @ operator, a newer alternative to .dot()
D = A @ B
print(D)

[[1 2]
 [3 4]
 [5 6]]
[[1 2]
 [3 4]]
[[ 7 10]
 [15 22]
 [23 34]]
[[ 7 10]
 [15 22]
 [23 34]]
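
The sketch below spells out the rule and the row-by-column intuition with explicit loops. The function name `matmul_by_dot_products` is hypothetical and exists only for this illustration; in practice `A.dot(B)` or `A @ B` should be preferred.

```python
import numpy as np

def matmul_by_dot_products(A, B):
    """Multiply A (m x n) by B (n x k) using explicit row-column dot products."""
    m, n = A.shape
    n_b, k = B.shape
    if n != n_b:
        raise ValueError(f"columns of A ({n}) must equal rows of B ({n_b})")
    C = np.zeros((m, k), dtype=np.result_type(A, B))
    for i in range(m):
        for j in range(k):
            # entry (i, j) is the dot product of row i of A and column j of B
            C[i, j] = np.dot(A[i, :], B[:, j])
    return C

A = np.array([[1, 2], [3, 4], [5, 6]])
B = np.array([[1, 2], [3, 4]])

C = matmul_by_dot_products(A, B)
print(C)
print(np.array_equal(C, A @ B))

# reversing the order breaks the rule: B is 2 x 2 but A has 3 rows
try:
    matmul_by_dot_products(B, A)
    order_matters = False
except ValueError as err:
    order_matters = True
    print(err)
```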


### Matrix-Vector Multiplication

A matrix and a vector can be multiplied together as long as the rule of matrix multiplication
is observed: the number of columns in the matrix must equal the number of
items in the vector. As with matrix multiplication, the operation can be written using the dot
notation. Because the vector only has one column, the result is always a vector.

c = A . v

or

c = Av

In [11]:
# define matrix
A = np.array([
    [1, 2],
    [3, 4],
    [5, 6]
])
print(A)

# define vector
B = np.array([0.5, 0.5])
print(B)

# multiply
C = A.dot(B)
print(C)

[[1 2]
 [3 4]
 [5 6]]
[0.5 0.5]
[1.5 3.5 5.5]
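
How NumPy represents the vector affects the shape of the result. The sketch below, which reuses the values from the example above, contrasts a 1-D array with an explicit column vector; the variable names `v_col` and `c_col` are introduced here for illustration.

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])

# 1-D vector with 2 items: the result is a 1-D array with 3 items
v = np.array([0.5, 0.5])
c = A @ v
print(c.shape)

# explicit column vector (2 x 1): the result is a 3 x 1 matrix
v_col = v.reshape(-1, 1)
c_col = A @ v_col
print(c_col.shape)
```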


### Matrix-Scalar Multiplication

A matrix can be multiplied by a scalar. This can be represented using the dot notation between
the matrix and the scalar.

C = A . b

or

C = Ab

The result is a matrix with the same size as the parent matrix where each element of the
matrix is multiplied by the scalar value.

In [12]:
# define matrix
A = np.array([
    [1, 2],
    [3, 4],
    [5, 6]
])
print(A)

# define scalar
b = 0.5
print(b)

# multiply
C = A * b
print(C)

[[1 2]
 [3 4]
 [5 6]]
0.5
[[0.5 1. ]
 [1.5 2. ]
 [2.5 3. ]]


## Transpose

A defined matrix can be transposed, which creates a new matrix with the rows and columns
swapped. An invisible diagonal line can be drawn through the matrix from top left to bottom right, and
the matrix is flipped over this line to give the transpose.

### C = A^T

The operation has no effect if the matrix is symmetric, i.e. it has the same number of
columns and rows and the same values at the same locations on both sides of the invisible
diagonal line.

In [2]:
# transpose matrix
A = np.array([
    [1, 2],
    [3, 4],
    [5, 6]
])
print(A)

# calculate transpose
C = A.T
print(C)

[[1 2]
 [3 4]
 [5 6]]
[[1 3 5]
 [2 4 6]]
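
The symmetric case can be checked directly. The matrix `S` below is an illustrative example, not one used elsewhere in this notebook.

```python
import numpy as np

# illustrative symmetric matrix: S[i, j] equals S[j, i] for every i and j
S = np.array([
    [1, 7, 3],
    [7, 4, 5],
    [3, 5, 6]
])
is_symmetric = np.array_equal(S, S.T)
print(is_symmetric)

# transposing twice returns the original matrix
A = np.array([[1, 2], [3, 4], [5, 6]])
round_trip = np.array_equal(A.T.T, A)
print(round_trip)
```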


## Inverse

Matrix inversion is a process that finds another matrix that, when multiplied with the original matrix,
results in an identity matrix. Given a square matrix A, find a matrix B such that AB = I_n and BA = I_n.

### AB = BA = I_n

In [3]:
# invert matrix
# define matrix
A = np.array([
    [1.0, 2.0],
    [3.0, 4.0]
])
print(A)

# invert matrix
B = np.linalg.inv(A)
print(B)

# multiply A and B
I = A.dot(B)
print(I)

[[1. 2.]
 [3. 4.]]
[[-2.   1. ]
 [ 1.5 -0.5]]
[[1.00000000e+00 1.11022302e-16]
 [0.00000000e+00 1.00000000e+00]]
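
The product printed above contains 1.11022302e-16 where an exact 0 is expected, which is floating-point rounding error. A reasonable way to verify an inverse is therefore a comparison with a tolerance, sketched here with `np.allclose` and its default tolerances:

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.linalg.inv(A)
identity = np.eye(2)

# compare with a tolerance instead of exact equality, because of rounding error
right_ok = np.allclose(A @ B, identity)
left_ok = np.allclose(B @ A, identity)
print(right_ok, left_ok)
```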


## Trace

The trace of a square matrix is the sum of the values on the main diagonal of the matrix (top-left
to bottom-right).

The operation of calculating a trace on a square matrix is described using the notation tr(A)
where A is the square matrix on which the operation is being performed.

### tr(A)

In [4]:
# matrix trace
# define matrix
A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])
print(A)

# calculate trace
B = np.trace(A)
print(B)

[[1 2 3]
 [4 5 6]
 [7 8 9]]
15
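
The definition can be checked by extracting the main diagonal with `np.diag` and summing it, as in this short sketch:

```python
import numpy as np

A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

# the main diagonal runs from top-left to bottom-right
diagonal = np.diag(A)
print(diagonal)

# summing the diagonal gives the same value as np.trace
manual_trace = diagonal.sum()
print(manual_trace == np.trace(A))
```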


## Determinant

The determinant of a square matrix is a scalar representation of the volume of the matrix.

The determinant describes the relative geometry of the vectors that make up the
rows of the matrix. More specifically, the determinant of a matrix A tells you the
volume of a box with sides given by rows of A.

It is denoted by the det(A) notation or |A|, where A is the matrix on which we are calculating
the determinant.

### det(A)

The determinant of a square matrix is calculated from the elements of the matrix. More
technically, the determinant is the product of all the eigenvalues of the matrix. The intuition for the determinant is
that it describes the way a matrix will scale another matrix when they are multiplied together.
For example, a determinant of 1 preserves the space of the other matrix. A determinant of 0
indicates that the matrix cannot be inverted.

In [6]:
# matrix determinant
# define matrix
A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])
print(A)

# calculate determinant
B = np.linalg.det(A)
print(B)

[[1 2 3]
 [4 5 6]
 [7 8 9]]
6.66133814775094e-16
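
The printed determinant is a tiny non-zero number, but the exact determinant of this matrix is 0, because the first row plus the third row equals twice the second row. The sketch below checks that with a tolerance and also illustrates the eigenvalue statement on a separate 2 x 2 matrix `M`, which is an assumption chosen for illustration.

```python
import numpy as np

# illustrative invertible matrix: det = 1*4 - 2*3 = -2
M = np.array([[1.0, 2.0], [3.0, 4.0]])
det_M = np.linalg.det(M)
eig_product = np.prod(np.linalg.eigvals(M))
print(np.isclose(det_M, eig_product))

# the 3 x 3 matrix from the example: row 1 + row 3 equals 2 * row 2,
# so the exact determinant is 0 and the printed value is rounding error
A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
det_is_zero = np.isclose(np.linalg.det(A), 0.0)
print(det_is_zero)

# the rank confirms that only two rows are linearly independent
rank_A = np.linalg.matrix_rank(A)
print(rank_A)
```

Because of this rounding behaviour, testing `det(A) == 0` exactly is unreliable as an invertibility check; a tolerance or the rank is usually the safer test.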


## Rank

The rank of a matrix is the number of linearly independent rows or columns in
the matrix; in practice it is estimated numerically. The rank of a matrix M is often denoted as rank(M).

### rank(A)

An intuition for rank is to consider it the number of dimensions spanned by all of the vectors
within a matrix. For example, a rank of 0 suggests all vectors span a point, a rank of 1 suggests
all vectors span a line, and a rank of 2 suggests all vectors span a two-dimensional plane. The rank
is estimated numerically, often using a matrix decomposition method. A common approach is to
use the Singular-Value Decomposition, or SVD for short.

In [8]:
# matrix rank
# rank 0
M0 = np.array([
    [0, 0],
    [0, 0]
])
print(M0)
mr0 = np.linalg.matrix_rank(M0)
print(mr0)

# rank1
M1 = np.array([
    [1, 2],
    [1, 2]
])
print(M1)
mr1 = np.linalg.matrix_rank(M1)
print(mr1)

# rank2
M2 = np.array([
    [1, 2],
    [3, 4]
])
print(M2)
mr2 = np.linalg.matrix_rank(M2)
print(mr2)

[[0 0]
 [0 0]]
0
[[1 2]
 [1 2]]
1
[[1 2]
 [3 4]]
2
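
As a rough sketch of the SVD approach, the rank can be estimated by counting the singular values that exceed a small tolerance. The helper name `rank_via_svd` is hypothetical, and the tolerance formula is an assumption modelled on a commonly used default (largest singular value times the largest dimension times machine epsilon).

```python
import numpy as np

def rank_via_svd(M):
    """Estimate the rank as the number of singular values above a tolerance."""
    singular_values = np.linalg.svd(M, compute_uv=False)
    # assumed tolerance: largest singular value * largest dimension * machine epsilon
    tol = singular_values.max() * max(M.shape) * np.finfo(float).eps
    return int(np.sum(singular_values > tol))

M0 = np.array([[0, 0], [0, 0]])
M1 = np.array([[1, 2], [1, 2]])
M2 = np.array([[1, 2], [3, 4]])

ranks = [rank_via_svd(M) for M in (M0, M1, M2)]
print(ranks)
```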


## Sparse Matrix

A sparse matrix is a matrix that is comprised of mostly zero values. Sparse matrices are distinct
from matrices with mostly non-zero values, which are referred to as dense matrices.

A matrix is sparse if many of its coefficients are zero. The interest in sparsity arises
because its exploitation can lead to enormous computational savings and because
many large matrix problems that occur in practice are sparse.

The sparsity of a matrix can be quantified with a score, which is the number of zero values
in the matrix divided by the total number of elements in the matrix.

### sparsity = count of zero elements/total elements

For example, a matrix with 13 zero values out of a total of 18 values will have a sparsity score of 0.722, or about 72%.

### Space complexity
Very large matrices require a lot of memory, and some very large matrices that we wish to work
with are sparse. In practice, most large matrices are sparse: almost all entries are zeros.

An example of a very large matrix that is too large to be stored in memory is a link matrix
that shows the links from one website to another. An example of a smaller sparse matrix might
be a word or term occurrence matrix for words in one book against all known words in English.
In both cases, the matrix contained is sparse with many more zero values than data values. The
problem with representing these sparse matrices as dense matrices is that memory is required
and must be allocated for each 32-bit or even 64-bit zero value in the matrix. This is clearly a
waste of memory resources as those zero values do not contain any information.
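
To put rough numbers on the memory cost, the sketch below compares a dense array with a simple (row, column, value) listing of its non-zero entries. The 1000 x 1000 size, the 100 diagonal entries and the triplet layout are all assumptions for illustration; real sparse formats differ in detail.

```python
import numpy as np

# illustrative 1000 x 1000 matrix of 64-bit floats with 100 non-zero entries
n = 1000
dense = np.zeros((n, n), dtype=np.float64)
idx = np.arange(100)
dense[idx, idx] = 1.0

# dense storage pays 8 bytes for every entry, zero or not
dense_bytes = dense.nbytes

# storing only the non-zero entries as (row, column, value) triplets
rows, cols = np.nonzero(dense)
values = dense[rows, cols]
triplet_bytes = rows.nbytes + cols.nbytes + values.nbytes

print(dense_bytes, triplet_bytes)
```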

### Time complexity
Assuming a very large sparse matrix can be fit into memory, we will want to perform operations
on this matrix. Simply, if the matrix contains mostly zero-values, i.e. no data, then performing
operations across this matrix may take a long time where the bulk of the computation performed
will involve adding or multiplying zero values together.

This is a problem of increased time complexity of matrix operations that increases with the
size of the matrix. This problem is compounded when we consider that even trivial machine
learning methods may require many operations on each row, column, or even across the entire
matrix, resulting in vastly longer execution times.
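
The wasted work can be made visible with a matrix-vector product that touches only the non-zero entries. The helper `matvec_nonzero_only` and the vector `v` are hypothetical and exist only for this sketch. Here `np.nonzero` still scans the dense matrix once to find the positions, whereas a real sparse format would store them up front.

```python
import numpy as np

def matvec_nonzero_only(A, v):
    """Compute A @ v using only the non-zero entries of A (illustrative helper)."""
    rows, cols = np.nonzero(A)   # positions of the non-zero entries
    values = A[rows, cols]       # the non-zero values themselves
    out = np.zeros(A.shape[0], dtype=np.result_type(A, v))
    # one multiplication per non-zero entry, accumulated into the matching row
    np.add.at(out, rows, values * v[cols])
    return out, values.size

A = np.array([
    [1, 0, 0, 1, 0, 0],
    [0, 0, 2, 0, 0, 1],
    [0, 0, 0, 2, 0, 0]
])
v = np.array([1, 2, 3, 4, 5, 6])

result, n_multiplications = matvec_nonzero_only(A, v)
print(result)
print(n_multiplications, "multiplications instead of", A.size)
```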


In [10]:
# calculate sparsity
# create dense matrix
A = np.array([
    [1, 0, 0, 1, 0, 0],
    [0, 0, 2, 0, 0, 1],
    [0, 0, 0, 2, 0, 0]
])
print(A)

# calculate sparsity
sparsity = 1.0 - np.count_nonzero(A)/A.size
print(sparsity)

[[1 0 0 1 0 0]
 [0 0 2 0 0 1]
 [0 0 0 2 0 0]]
0.7222222222222222


In [11]:
from scipy.sparse import csr_matrix

# convert dense matrix into a sparse matrix
# create dense matrix
A = np.array([
    [1, 0, 0, 1, 0, 0],
    [0, 0, 2, 0, 0, 1],
    [0, 0, 0, 2, 0, 0]
])
print(A)

# convert to sparse matrix (CSR method)
S = csr_matrix(A)
print(S)

# reconstruct dense matrix
B = S.todense()
print(B)

[[1 0 0 1 0 0]
 [0 0 2 0 0 1]
 [0 0 0 2 0 0]]
  (0, 0)	1
  (0, 3)	1
  (1, 2)	2
  (1, 5)	1
  (2, 3)	2
[[1 0 0 1 0 0]
 [0 0 2 0 0 1]
 [0 0 0 2 0 0]]
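
The printed (row, column) value pairs are a display format. Internally, a CSR matrix keeps three arrays, which the sketch below inspects through the `data`, `indices` and `indptr` attributes. It also uses `toarray()`, which returns a plain NumPy array, as an alternative to `todense()`.

```python
import numpy as np
from scipy.sparse import csr_matrix

A = np.array([
    [1, 0, 0, 1, 0, 0],
    [0, 0, 2, 0, 0, 1],
    [0, 0, 0, 2, 0, 0]
])
S = csr_matrix(A)

# non-zero values, stored row by row
print(S.data)
# column index of each stored value
print(S.indices)
# row i occupies positions indptr[i] up to indptr[i + 1] in data and indices
print(S.indptr)

# number of stored non-zero entries
print(S.nnz)

# convert back to a regular NumPy array
B = S.toarray()
print(np.array_equal(A, B))
```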
