# Lesson 3

## Matrices & matrix operations. Part 1

A __matrix__ of size $m\times n$ is a rectangular table consisting of $m$ rows and $n$ columns. A matrix is usually denoted by a capital letter (e.g. $A$) and its elements by $a_{ij}$, where the indices $i$ and $j$ are the numbers of the row and column, respectively, in which the element is located.

$$\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\ 
a_{21} & a_{22} & \cdots & a_{2n}\\  
\cdots & \cdots & \ddots & \cdots \\ 
a_{m1} & a_{m2} & \cdots & a_{mn}\\ 
\end{pmatrix}$$

The numbers $m$ and $n$ are called the _orders_ of a matrix. If $m=n$, then the matrix is called _square_, and the number $m=n$ is its _order_.

A matrix is sometimes written compactly as $\left\|a_{ij}\right\|$ or $(a_{ij})$, for example $A = \left\|a_{ij}\right\| = (a_{ij}) ~ (i = 1,2,\ldots,m; ~ j = 1,2,\ldots,n).$

In machine learning, matrices are typically used to store information about objects and their attributes. Usually each row corresponds to an object and each column to a feature (a property of the objects; feature values are used to train the model, and the model may then have to predict some of them for other objects).

The vectors studied in the previous lessons are special cases of matrices: in particular, the _row vector_ (a matrix of size $1\times n$) and the _column vector_ (a matrix of size $m\times1$) mentioned in the previous lesson:

$$\begin{pmatrix}
1 & 2 & 3
\end{pmatrix} ~ \text{and} ~ \begin{pmatrix}
1\\ 
2\\ 
3
\end{pmatrix}.$$

In Python, matrices are usually handled using the NumPy library.

In [1]:
import numpy as np

The easiest way to create a matrix is with the function `numpy.array(list, dtype=None, ...)`. Here `list` is a list of iterable objects (for example, nested lists), each of which becomes one row of the matrix. We already used this function in the first lesson to create vectors (the one-dimensional special case).

In [2]:
a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(f'Matrix:\n{a}')

Matrix:
[[1 2 3]
 [4 5 6]
 [7 8 9]]
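
As a small sketch of how the $a_{ij}$ notation maps onto NumPy (the matrix and the variable names below are only illustrative): the `shape` attribute returns the pair $(m, n)$, and elements are addressed as `a[i, j]`. Note that NumPy indices start at 0, so $a_{12}$ corresponds to `a[0, 1]`.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])  # a 2x3 matrix

m, n = a.shape          # number of rows and number of columns
a_12 = a[0, 1]          # element a_12: first row, second column (zero-based indices)
second_row = a[1, :]    # the whole second row
third_col = a[:, 2]     # the whole third column

print(f'Size: {m}x{n}')
print(f'a_12 = {a_12}')
print(f'Second row: {second_row}')
print(f'Third column: {third_col}')
```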


## Matrix operations

### 1. Matrix addition

Consider matrices $A$ of size $m\times n$, consisting of elements $a_{ij}$, and $B$ of the same size, consisting of elements $b_{ij}$.

Only matrices of the same size can be added, and they are added element by element: the _sum of two matrices_ $A+B$ is a matrix $C$ of the same size consisting of elements $c_{ij}=a_{ij}+b_{ij}$:

$$A + B = 
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\ 
a_{21} & a_{22} & \cdots & a_{2n}\\  
\cdots & \cdots & \ddots & \cdots \\ 
a_{m1} & a_{m2} & \cdots & a_{mn}\\ 
\end{pmatrix} + \begin{pmatrix}
b_{11} & b_{12} & \cdots & b_{1n}\\ 
b_{21} & b_{22} & \cdots & b_{2n}\\  
\cdots & \cdots & \ddots & \cdots \\ 
b_{m1} & b_{m2} & \cdots & b_{mn}\\ 
\end{pmatrix} = \begin{pmatrix}
a_{11} + b_{11} & a_{12} + b_{12} & \cdots & a_{1n} + b_{1n}\\ 
a_{21} + b_{21}& a_{22} + b_{22}& \cdots & a_{2n} + b_{2n}\\  
\cdots & \cdots & \ddots & \cdots \\ 
a_{m1} + b_{m1} & a_{m2} + b_{m2} & \cdots & a_{mn} + b_{mn}\\ 
\end{pmatrix}
= C.$$

__Example__

Add up two matrices filled with natural numbers:

$$\begin{pmatrix}
1 & 2 & 3\\ 
4 & 5 & 6\\ 
7 & 8 & 9
\end{pmatrix} + \begin{pmatrix}
1 & 1 & 1\\ 
1 & 1 & 1\\ 
1 & 1 & 1
\end{pmatrix} = \begin{pmatrix}
2 & 3 & 4\\ 
5 & 6 & 7\\ 
8 & 9 & 10
\end{pmatrix}.$$


And perform the same operation using Python:

In [3]:
a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
b = np.ones((3, 3))  # np.ones creates a matrix of the given shape filled with ones (floats by default)

print(f'Matrix A\n{a}\n')
print(f'Matrix B\n{b}\n')
print(f'Matrix C = A + B\n{a + b}')

Matrix A
[[1 2 3]
 [4 5 6]
 [7 8 9]]

Matrix B
[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]

Matrix C = A + B
[[ 2.  3.  4.]
 [ 5.  6.  7.]
 [ 8.  9. 10.]]


Matrix addition has the same properties as the addition of real numbers, namely<br>
1) commutativity: $A+B=B+A;$<br>
2) associativity: $(A+B)+C=A+(B+C).$
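
The two properties above can be checked numerically. The following is a minimal sketch on arbitrary $2\times2$ integer matrices (the values are illustrative, not taken from the lesson); `numpy.array_equal` compares two arrays element by element.

```python
import numpy as np

# Arbitrary illustrative matrices of the same size
A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 5], [6, 7]])
C = np.array([[9, 8], [7, 6]])

same_order = np.array_equal(A + B, B + A)                # A + B = B + A
same_grouping = np.array_equal((A + B) + C, A + (B + C))  # (A + B) + C = A + (B + C)

print(f'A + B == B + A: {same_order}')
print(f'(A + B) + C == A + (B + C): {same_grouping}')
```

A check on a few matrices is of course not a proof; it only illustrates the identities.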

### 2. Multiplication of matrices by a number

Multiplication of a matrix by a number is as simple as addition: the _product of a matrix_ $A=\left\|a_{ij}\right\|$ _and a number_ $\lambda$ is the matrix $C = \left\|c_{ij}\right\|$ with elements $c_{ij}=\lambda a_{ij}$, that is, each element of the matrix is multiplied by $\lambda$:

$$\lambda\cdot\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\ 
a_{21} & a_{22} & \cdots & a_{2n}\\  
\cdots & \cdots & \ddots & \cdots \\ 
a_{m1} & a_{m2} & \cdots & a_{mn}\\ 
\end{pmatrix}=
\begin{pmatrix}
\lambda a_{11} & \lambda a_{12} & \cdots & \lambda a_{1n}\\ 
\lambda a_{21} & \lambda a_{22} & \cdots & \lambda a_{2n}\\  
\cdots & \cdots & \ddots & \cdots \\ 
\lambda a_{m1} & \lambda a_{m2} & \cdots & \lambda a_{mn}\\ 
\end{pmatrix}.$$

__Example__

$$3\cdot\begin{pmatrix}
1 & 2 & 3\\ 
4 & 5 & 6\\ 
7 & 8 & 9
\end{pmatrix} = \begin{pmatrix}
3 & 6 & 9\\ 
12 & 15 & 18\\ 
21 & 24 & 27
\end{pmatrix}.$$


The same operation in Python:

In [4]:
a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
k = 3

print(f'Matrix A\n{a}\n')
print(f'Matrix 3*A\n{k*a}')

Matrix A
[[1 2 3]
 [4 5 6]
 [7 8 9]]

Matrix 3*A
[[ 3  6  9]
 [12 15 18]
 [21 24 27]]


Matrix multiplication by a number has the following properties:<br>
1) $(\lambda \mu)A=\lambda(\mu A);$<br>
2) $\lambda (A+B)=\lambda A + \lambda B;$<br>
3) $(\lambda + \mu) A = \lambda A + \mu A.$
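
A quick numerical illustration of these three properties, assuming arbitrary integer matrices and scalars (the names `lam` and `mu` stand for $\lambda$ and $\mu$ and are chosen here only for the example):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 0], [-1, 2]])
lam, mu = 2, -3

prop1 = np.array_equal((lam * mu) * A, lam * (mu * A))     # (lambda mu) A = lambda (mu A)
prop2 = np.array_equal(lam * (A + B), lam * A + lam * B)   # lambda (A + B) = lambda A + lambda B
prop3 = np.array_equal((lam + mu) * A, lam * A + mu * A)   # (lambda + mu) A = lambda A + mu A

print(prop1, prop2, prop3)
```

Integer values are used so that the comparison is exact; with floating-point matrices `numpy.allclose` would be the safer check.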

### 3. Matrix multiplication

The _product_ of a matrix $A = \left\|a_{ij}\right\|$ with orders $m$ and $n$ and a matrix $B = \left\|b_{ij}\right\|$ with orders $n$ and $k$ is the matrix $C = \left\|c_{ij}\right\|$ with orders $m$ and $k$:

$$C = \begin{pmatrix}
c_{11} & c_{12} & \cdots & c_{1k}\\ 
c_{21} & c_{22} & \cdots & c_{2k}\\ 
\cdots & \cdots & \ddots & \cdots\\ 
c_{m1} & c_{m2} & \cdots & c_{mk}
\end{pmatrix},$$

filled with the elements defined by the formula

$$ c_{ij}=\sum_{p=1}^{n}a_{ip}b_{pj}.$$

The rule of matrix multiplication can be stated verbally as follows: the _element_ $c_{ij}$ at the intersection of the $i$th row and $j$th column of the matrix $C=A\cdot B$ is equal to the sum of the pairwise products of the corresponding elements of the $i$th row of $A$ and the $j$th column of $B$.

Note that a matrix $A$ cannot be multiplied by an arbitrary matrix $B$: the number of columns of $A$ must equal the number of rows of $B$.

In particular, both products $A\cdot B$ and $B\cdot A$ are defined only if the number of columns of $A$ equals the number of rows of $B$ and the number of rows of $A$ equals the number of columns of $B$. In this case both $A\cdot B$ and $B\cdot A$ are square matrices. Their orders differ in general and coincide only when $A$ and $B$ are themselves square.
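
To make the formula $c_{ij}=\sum_{p=1}^{n}a_{ip}b_{pj}$ concrete, here is a deliberately naive sketch that implements it with three nested loops. The function name `matmul_naive` and the test matrices are hypothetical and serve only as an illustration; in practice the built-in NumPy routines should be used, since they are much faster.

```python
import numpy as np

def matmul_naive(A, B):
    """Multiply two matrices directly by the definition c_ij = sum_p a_ip * b_pj."""
    m, n = A.shape
    n_b, k = B.shape
    if n != n_b:
        # The number of columns of A must equal the number of rows of B
        raise ValueError(f'cannot multiply shapes {A.shape} and {B.shape}')
    C = np.zeros((m, k), dtype=np.result_type(A, B))
    for i in range(m):          # rows of A
        for j in range(k):      # columns of B
            for p in range(n):  # summation index
                C[i, j] += A[i, p] * B[p, j]
    return C

A = np.array([[1, 2, 3], [4, 5, 6]])    # size 2x3
B = np.array([[1, 0], [0, 1], [2, 2]])  # size 3x2

AB = matmul_naive(A, B)  # size 2x2
BA = matmul_naive(B, A)  # size 3x3

print(f'AB has shape {AB.shape}, BA has shape {BA.shape}')
print(f'Matches np.dot: {np.array_equal(AB, np.dot(A, B))}')
```

Here both products exist because $A$ is $2\times3$ and $B$ is $3\times2$, and they are square matrices of different orders, as described above.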

__Example 1__

The scalar product of vectors we discussed in the previous lesson can be understood as the multiplication of a row vector by a column vector.
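
A minimal sketch of this idea, with arbitrary example vectors: reshaping one vector into a $1\times3$ row and the other into a $3\times1$ column and multiplying them as matrices gives a $1\times1$ matrix whose only element is the scalar product.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

row = x.reshape(1, 3)   # row vector, size 1x3
col = y.reshape(3, 1)   # column vector, size 3x1

product = np.dot(row, col)  # matrix product, size 1x1
scalar = np.dot(x, y)       # ordinary scalar product of two vectors

print(f'Row by column:\n{product}')
print(f'Scalar product: {scalar}')
```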

__Example 2__

For a better understanding, let's look at the multiplication of second-order square matrices as an example:

$$\begin{pmatrix}
a_{11} & a_{12}\\ 
a_{21} & a_{22}
\end{pmatrix} \begin{pmatrix}
b_{11} & b_{12}\\ 
b_{21} & b_{22}
\end{pmatrix}=\begin{pmatrix}
(a_{11}b_{11} + a_{12}b_{21}) & (a_{11}b_{12} + a_{12}b_{22})\\ 
(a_{21}b_{11} + a_{22}b_{21}) & (a_{21}b_{12} + a_{22}b_{22})
\end{pmatrix}.$$

__Example 3__

Multiply the matrices

$$A=\begin{pmatrix}
1 & 0\\ 
2 & 1\\ 
10 & 5
\end{pmatrix} \; \text{and} \; B=\begin{pmatrix}
2 & 0 & 0\\ 
0 & 0 & 1
\end{pmatrix}.$$

$$A\cdot B=\begin{pmatrix}
1\cdot2+0\cdot0 & 1\cdot0+0\cdot0 & 1\cdot0+0\cdot1\\ 
2\cdot2+1\cdot0 & 2\cdot0+1\cdot0 & 2\cdot0+1\cdot1\\ 
10\cdot2+5\cdot0 & 10\cdot0+5\cdot0 & 10\cdot0+5\cdot1
\end{pmatrix}=\begin{pmatrix}
2 & 0 & 0\\ 
4 & 0 & 1\\ 
20 & 0 & 5
\end{pmatrix}.$$

Let's do the same with Python.

In the NumPy library, we multiply matrices using the same tools we used to find the scalar product of vectors - the function `numpy.dot(a, b)` or the method `a.dot(b)`, only in this case `a` and `b` are matrices.

In [5]:
A = np.array([[1, 0], [2, 1], [10, 5]])
B = np.array([[2, 0, 0], [0, 0, 1]])

print(f'Matrix A\n{A}')
print(f'Matrix B\n{B}')
print(f'Matrix AB\n{np.dot(A, B)}')

Matrix A
[[ 1  0]
 [ 2  1]
 [10  5]]
Matrix B
[[2 0 0]
 [0 0 1]]
Matrix AB
[[ 2  0  0]
 [ 4  0  1]
 [20  0  5]]


Let's try to multiply matrices that don't meet the rule of matching the number of rows and columns:

In [6]:
A = np.array([[1, 0], [2, 1], [10, 5]])
B = np.array([[2, 0, 0], [0, 0, 1], [0, 0, 1]])

print(f'Matrix A\n{A}')
print(f'Matrix B\n{B}')
print(f'Matrix AB\n{np.dot(A, B)}')

Matrix A
[[ 1  0]
 [ 2  1]
 [10  5]]
Matrix B
[[2 0 0]
 [0 0 1]
 [0 0 1]]


ValueError: shapes (3,2) and (3,3) not aligned: 2 (dim 1) != 3 (dim 0)

Running the code in the cell above raises an error because the number of columns of the first matrix (2) does not equal the number of rows of the second matrix (3).

__Example 4__

Multiply the matrices

$$A = \begin{pmatrix}
1 & 3\\ 
2 & 6
\end{pmatrix} ~ 
\text{and} ~
B = \begin{pmatrix}
9 & 6\\ 
-3 & -2
\end{pmatrix}.$$

$$\begin{pmatrix}
1 & 3\\ 
2 & 6
\end{pmatrix} \cdot
\begin{pmatrix}
9 & 6\\ 
-3 & -2
\end{pmatrix} = 
\begin{pmatrix}
1\cdot 9 + 3\cdot (-3) & 1\cdot6 + 3\cdot (-2)\\ 
2\cdot 9 + 6\cdot (-3) & 2\cdot6 + 6\cdot (-2)
\end{pmatrix} = 
\begin{pmatrix}
0 & 0\\ 
0 & 0
\end{pmatrix}.
$$

This is an important example: it shows that matrices have zero divisors, i.e. non-zero matrices whose product is the _zero matrix_.
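
The same computation can be reproduced in NumPy. This is just a sketch that repeats Example 4 with the matrices given above:

```python
import numpy as np

A = np.array([[1, 3], [2, 6]])
B = np.array([[9, 6], [-3, -2]])

AB = np.dot(A, B)

print(f'Matrix AB\n{AB}')
print(f'A is non-zero: {A.any()}, B is non-zero: {B.any()}')
print(f'AB is the zero matrix: {not AB.any()}')
```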

Matrix multiplication is not commutative: in general, $AB\neq BA$. This holds even for square matrices, for example:

$$\begin{pmatrix}
0 & 0\\ 
0 & 1
\end{pmatrix} \begin{pmatrix}
0 & 0\\ 
1 & 0
\end{pmatrix}=\begin{pmatrix}
0 & 0\\ 
1 & 0
\end{pmatrix},$$
$$\begin{pmatrix}
0 & 0\\ 
1 & 0
\end{pmatrix}\begin{pmatrix}
0 & 0\\ 
0 & 1
\end{pmatrix}=\begin{pmatrix}
0 & 0\\ 
0 & 0
\end{pmatrix}.$$
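
The two products above can be checked in NumPy. In this sketch the names `P` and `Q` are chosen only for illustration; the `@` operator is used for the matrix product, which for two-dimensional arrays gives the same result as `numpy.dot`.

```python
import numpy as np

P = np.array([[0, 0], [0, 1]])
Q = np.array([[0, 0], [1, 0]])

PQ = P @ Q  # first product from the text
QP = Q @ P  # the same matrices in the opposite order

print(f'PQ\n{PQ}')
print(f'QP\n{QP}')
print(f'PQ == QP: {np.array_equal(PQ, QP)}')
```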


However, the product of matrices of __appropriate dimensions__ does have the following properties:

1. Associativity: $(AB)C = A(BC).$

   __Proof__

    Take matrices $A$ of size $m\times n$, $B$ of size $n\times k$, $C$ of size $k\times l$. Then, by definition, the $(i,j)$th element of the product of the matrix $AB$ and the matrix $C$ is equal to:

    $$\left\{(AB)C\right\}_{ij}=\sum_{p=1}^{k}\left\{AB\right\}_{ip}c_{pj}=\sum_{p=1}^{k}\left(\sum_{q=1}^{n}a_{iq}b_{qp}\right)c_{pj}=\sum_{q=1}^{n}a_{iq}\left(\sum_{p=1}^{k}b_{qp}c_{pj}\right)=\left\{A(BC)\right\}_{ij},$$

    where the third equality swaps the order of the two finite sums. This is what was to be proved.


2. Distributivity: $(A+B)C = AC + BC$ and $A(B+C) = AB + AC$.

    This property is derived from the definitions of sum and product of matrices.
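
Both properties can be illustrated numerically. The sketch below uses arbitrary integer matrices whose sizes are chosen so that every product is defined; the values are not taken from the lesson.

```python
import numpy as np

A = np.array([[1, 2, 0], [3, -1, 4]])    # size 2x3
B = np.array([[2, 1], [0, 5], [-3, 2]])  # size 3x2
C = np.array([[1, 4, 2], [0, -2, 3]])    # size 2x3
D = np.array([[5, 0, 1], [2, 2, -1]])    # size 2x3, same as A

assoc = np.array_equal((A @ B) @ C, A @ (B @ C))          # (AB)C = A(BC)
right_distr = np.array_equal((A + D) @ B, A @ B + D @ B)  # (A + D)B = AB + DB
left_distr = np.array_equal(B @ (A + D), B @ A + B @ D)   # B(A + D) = BA + BD

print(assoc, right_distr, left_distr)
```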

Raising a square matrix to a power is a special case of matrix multiplication:

$$A\cdot A = A^{2}.$$

__Example__

$$\begin{pmatrix}
1 & 1\\ 
0 & 1
\end{pmatrix}^{2} = 
\begin{pmatrix}
1 & 1\\ 
0 & 1
\end{pmatrix} \cdot
\begin{pmatrix}
1 & 1\\ 
0 & 1
\end{pmatrix} = 
\begin{pmatrix}
1\cdot1 + 1\cdot0 & 1\cdot1 + 1\cdot1\\ 
0\cdot1 + 1\cdot0 & 0\cdot1 + 1\cdot1
\end{pmatrix} = 
\begin{pmatrix}
1 & 2\\ 
0 & 1
\end{pmatrix}
$$
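
A short sketch of the same computation in NumPy. It relies on `numpy.linalg.matrix_power(a, n)`, which raises a square matrix to an integer power. One thing to keep in mind is that for NumPy arrays the expression `A ** 2` squares each element separately and is not the matrix square.

```python
import numpy as np

A = np.array([[1, 1], [0, 1]])

squared = A @ A                          # matrix square A^2
cubed = np.linalg.matrix_power(A, 3)     # A^3 = A * A * A
elementwise = A ** 2                     # element-by-element squares, NOT the matrix square

print(f'A @ A\n{squared}')
print(f'matrix_power(A, 3)\n{cubed}')
print(f'A ** 2 (elementwise)\n{elementwise}')
```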

### 4. Matrix transposition

Consider another common operation on matrices, called _transposition_.

Suppose there is a matrix $A$ of size $m\times n$. In this case, the matrix $B$ obtained by transposing matrix $A$ and denoted by $B=A^{T}$ will be a matrix of size $n\times m$ whose elements $b_{ij}=a_{ji}$. In other words, a transposed matrix is a matrix mirrored with respect to the main diagonal (the diagonal on which the elements with $i=j$ are placed). By transposing, the rows of the original matrix become columns, and the columns become rows.

__Example__

$$A=\begin{pmatrix}
8 & 5 & 3\\ 
4 & 6 & 1\\ 
0 & 11 & 9\\ 
2 & 7 & 10
\end{pmatrix}, \;A^{T}=\begin{pmatrix}
8 & 4 & 0 & 2\\ 
5 & 6 & 11 & 7\\ 
3 & 1 & 9 & 10
\end{pmatrix}.$$

In NumPy, the transposed matrix is obtained with the `numpy.transpose(array)` function or the `array.T` attribute, where `array` is the matrix.

In [7]:
a = np.array([[8, 5, 3], [4, 6, 1], [0, 11, 9], [2, 7, 10]])

print(f'Matrix:\n{a}')
print(f'Transposed matrix:\n{a.T}')

Matrix:
[[ 8  5  3]
 [ 4  6  1]
 [ 0 11  9]
 [ 2  7 10]]
Transposed matrix:
[[ 8  4  0  2]
 [ 5  6 11  7]
 [ 3  1  9 10]]


Some important properties of transposition:<br>
1) $(A+B)^{T}=A^{T}+B^{T};$<br>
2) $(A\cdot B)^{T}=B^{T}\cdot A^{T}.$
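
These two properties can be illustrated with a small sketch on arbitrary integer matrices (the values are hypothetical). Note the reversed order of the factors in the second property.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])    # size 2x3
B = np.array([[0, 1, 0], [2, 0, 2]])    # size 2x3
C = np.array([[1, 0], [2, 1], [0, 3]])  # size 3x2

sum_rule = np.array_equal((A + B).T, A.T + B.T)       # (A + B)^T = A^T + B^T
product_rule = np.array_equal((A @ C).T, C.T @ A.T)   # (AC)^T = C^T A^T

print(f'(A + B)^T == A^T + B^T: {sum_rule}')
print(f'(AC)^T == C^T A^T: {product_rule}')
```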

## Matrix types

Consider some special types of matrices that we will need in the future, in particular when dealing with the topic of linear transformations and matrix decompositions.

### Diagonal matrix

A _diagonal matrix_ is a matrix in which all elements that do not lie on the main diagonal are zero:

$$D = \begin{pmatrix}
d_{1} & 0 & \cdots & 0\\ 
0 & d_{2} & \cdots  & 0\\ 
\cdots & \cdots & \ddots & \cdots\\ 
0 & 0 & \cdots & d_{n}
\end{pmatrix}.$$

A special case of a diagonal matrix is the _identity matrix_ (also called the _unit matrix_), usually denoted by $E$ or $I$, in which the main diagonal is occupied by ones and all other elements are zero:

$$E=\begin{pmatrix}
1 & 0 & \cdots & 0\\ 
0 & 1 & \cdots  & 0\\ 
\cdots & \cdots & \ddots & \cdots\\ 
0 & 0 & \cdots & 1
\end{pmatrix}.$$


In NumPy, the unit matrix is specified using the function `numpy.eye(n)`, where `n` is the order of the matrix:

In [8]:
e = np.eye(5)
print(f'Identity matrix:\n{e}')

Identity matrix:
[[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]
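
As an additional sketch (the matrices are arbitrary examples): a general diagonal matrix can be built with `numpy.diag`, which places a given list of values on the main diagonal. The example also shows that multiplying by the matrix $E$ leaves a matrix unchanged, which is the reason $E$ plays the role of the number 1 in matrix multiplication.

```python
import numpy as np

D = np.diag([2, 3, 5])     # diagonal matrix with d_1 = 2, d_2 = 3, d_3 = 5
E = np.eye(3, dtype=int)   # matrix E of order 3 with integer elements
A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(f'Diagonal matrix D\n{D}')
print(f'np.diag([1, 1, 1]) equals E: {np.array_equal(np.diag([1, 1, 1]), E)}')
print(f'AE == A: {np.array_equal(A @ E, A)}')
print(f'EA == A: {np.array_equal(E @ A, A)}')
print(f'DA (each row of A is scaled by the matching d_i)\n{D @ A}')
```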


### Triangular matrix

A _triangular matrix_ is a square matrix in which all elements below (or above) the main diagonal are zero.

A triangular matrix in which all the elements below the main diagonal are zero is called an _upper triangular_ matrix:

$$\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\ 
0 & a_{22} & \cdots & a_{2n}\\  
\cdots & \cdots & \ddots & \cdots \\ 
0 & 0 & \cdots & a_{nn}\\ 
\end{pmatrix}.$$

A triangular matrix in which all the elements above the main diagonal are zero is called a _lower triangular_ matrix:

$$\begin{pmatrix}
a_{11} & 0 & \cdots & 0\\ 
a_{21} & a_{22} & \cdots & 0\\  
\cdots & \cdots & \ddots & \cdots \\ 
a_{n1} & a_{n2} & \cdots & a_{nn}\\ 
\end{pmatrix}.$$

An important property of upper triangular matrices: the product of two upper triangular matrices is again an upper triangular matrix.
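
A minimal sketch illustrating this property. It uses `numpy.triu`, which returns a copy of a matrix with all elements below the main diagonal set to zero; the matrices themselves are arbitrary examples.

```python
import numpy as np

U1 = np.triu(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))  # keep only the upper triangle
U2 = np.array([[2, 0, 1], [0, 1, 4], [0, 0, 3]])           # already upper triangular

P = U1 @ U2

print(f'Matrix U1\n{U1}\n')
print(f'Matrix U2\n{U2}\n')
print(f'Matrix U1 U2\n{P}\n')
# A matrix is upper triangular exactly when zeroing its lower part changes nothing
print(f'Product is upper triangular: {np.array_equal(P, np.triu(P))}')
```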

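__Example__

Multiply a $2\times3$ matrix by its transpose; in this particular case the product happens to be a diagonal matrix: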

In [9]:
import numpy as np
a = np.array([[4, 5, 2], [1, -2, 3]])
b = np.array([[4, 1], [5, -2], [2, 3]])

print(f'Matrix A\n{a}\n')
print(f'Matrix B\n{b}\n')
print(f'Matrix AB\n{np.dot(a, b)}')

Matrix A
[[ 4  5  2]
 [ 1 -2  3]]

Matrix B
[[ 4  1]
 [ 5 -2]
 [ 2  3]]

Matrix AB
[[45  0]
 [ 0 14]]


### Orthogonal matrix

A matrix $A$ is called _orthogonal_ if 

$$AA^{T}=A^{T}A=E.$$

__Example__

The following matrix is orthogonal:

$$\begin{pmatrix}
\cos\varphi & -\sin\varphi\\ 
\sin\varphi & \cos\varphi
\end{pmatrix}.$$

Let's make sure that this is indeed the case:

$$\begin{pmatrix}
\cos\varphi & -\sin\varphi\\ 
\sin\varphi & \cos\varphi
\end{pmatrix} \cdot 
\begin{pmatrix}
\cos\varphi & \sin\varphi\\ 
-\sin\varphi & \cos\varphi
\end{pmatrix} = 
\begin{pmatrix}
\cos\varphi \cdot \cos\varphi + (-\sin\varphi) \cdot (-\sin\varphi) & \cos\varphi \cdot \sin\varphi + (-\sin\varphi) \cdot \cos\varphi\\ 
\sin\varphi \cdot \cos\varphi + \cos\varphi \cdot (-\sin\varphi) & \sin\varphi \cdot \sin\varphi + \cos\varphi \cdot \cos\varphi
\end{pmatrix} = 
\begin{pmatrix}
1 & 0\\ 
0 & 1
\end{pmatrix}.$$
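
The same check can be sketched numerically for a specific angle. The value `phi = 0.7` is an arbitrary choice. Because the elements are floating-point numbers, the comparison uses `numpy.allclose` (equality up to a small tolerance) rather than exact equality.

```python
import numpy as np

phi = 0.7  # arbitrary angle in radians
Q = np.array([[np.cos(phi), -np.sin(phi)],
              [np.sin(phi),  np.cos(phi)]])

left = Q @ Q.T
right = Q.T @ Q
is_orthogonal = np.allclose(left, np.eye(2)) and np.allclose(right, np.eye(2))

print(f'Q Q^T\n{left}')
print(f'Q^T Q\n{right}')
print(f'Q is orthogonal: {is_orthogonal}')
```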

### Symmetric matrix

A matrix $A$ is called _symmetric_ if

$$A=A^{T},$$

that is, its elements are mirrored about the main diagonal: $a_{ij}=a_{ji}$. A symmetric matrix is therefore always square.
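
A minimal sketch of a symmetry check in NumPy. The helper `is_symmetric` is a hypothetical function written for this example, not part of NumPy, and the matrices are arbitrary.

```python
import numpy as np

def is_symmetric(M):
    """Return True if M is a square matrix equal to its own transpose."""
    M = np.asarray(M)
    return M.ndim == 2 and M.shape[0] == M.shape[1] and np.array_equal(M, M.T)

S = np.array([[1, 2, 3], [2, 5, 6], [3, 6, 9]])  # a_ij = a_ji for all i, j
N = np.array([[1, 2], [3, 4]])                   # a_12 != a_21

print(f'S is symmetric: {is_symmetric(S)}')
print(f'N is symmetric: {is_symmetric(N)}')
```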

## Additional materials

1. [Ways to create arrays in NumPy](https://docs.scipy.org/doc/numpy-1.10.1/user/basics.creation.html).
2. [numpy.transpose](https://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.transpose.html).
3. [array.T](https://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.ndarray.T.html).
4. [Matrix multiplication in NumPy](https://docs.scipy.org/doc/numpy-1.10.0/reference/routines.linalg.html#matrix-and-vector-products).