# Linear Algebra
## Ch05 - Intro to Matrices

## Matrix terminology and dimensionality

Notation:

$$\large
A=\begin{bmatrix}
1 & 6 & 0\\
7 & 2 & 4\\
4 & 1 & 1
\end{bmatrix} \ \ \ \ \ \ \ \ \ \ a_{1,2} =6
$$

This is a $3\times 3$ matrix, i.e. an element of $\mathbb{R}^{M\times N}$ with $M=N=3$, where:
- M = Rows
- N = Columns

Note: $\mathbb{R}^{M\times N}$ (matrices with M rows and N columns) is **different** from $\mathbb{R}^{MN}$ (vectors with $MN$ elements).

Dimensionality refers to the **number of elements** in the matrix.
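
In NumPy terms, the distinction between $M\times N$ and $MN$ roughly corresponds to `shape` versus `size`. The sketch below re-creates the matrix $A$ from the notation above; note that NumPy indexes from 0, so $a_{1,2}$ is `A[0, 1]`. The variable name `v` is introduced here only for illustration.

```python
import numpy as np

# The 3x3 example matrix from the notation above
A = np.array([[1, 6, 0],
              [7, 2, 4],
              [4, 1, 1]])

print(A.shape)   # (rows, columns) -> (3, 3)
print(A.size)    # total number of elements -> 9
print(A[0, 1])   # a_{1,2} with 0-based indexing -> 6

# Flattening gives a vector with M*N elements (compare R^{MN})
v = A.flatten()
print(v.shape)   # (9,)
```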

In [1]:
import numpy as np

In [2]:
# Square vs. Rectangular
S = np.round(np.random.randn(5, 5), 1)
R = np.round(np.random.randn(5, 2), 1) # 5 rows, 2 columns
print(f"Square:\n{S}"), print('')
print(f"Rectangular:\n{R}"), print('')

# Identity
I = np.eye(3)
print(f"Identity:\n{I}"), print('')

# Zeros
Z = np.zeros((4, 4))
print(f"Zeros:\n{Z}"), print('')

# Diagonal
D = np.diag([1, 2, 3, 5, 2])
print(f"Diagonal:\n{D}"), print('')

# Create triangular matrices from a full matrix
S = np.random.randn(5, 5)
U = np.triu(S)
L = np.tril(S)
print(f"Lower Triangular:\n{np.round(L,1)}"), print('')

# Concatenate matrices side by side (with axis=1, the number of rows must match!)
A = np.random.randn(3, 2)
B = np.random.randn(3, 4)
C = np.concatenate((A, B), axis=1)
print(f"Concatenate A & B:\n{np.round(C)}")

Square:
[[ 0.7 -0.8  0.8  0.2  0.6]
 [-0.1  0.4  1.3  2.3  0.2]
 [-1.4 -0.5 -0.9 -0.1 -0.9]
 [-0.8  0.9 -1.3 -0.7 -0.3]
 [-0.4  0.4  1.5  1.6 -2.2]]

Rectangular:
[[ 0.9  0.1]
 [-1.  -2.5]
 [-0.3 -1.3]
 [-0.   0. ]
 [-0.2  0.2]]

Identity:
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]

Zeros:
[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]

Diagonal:
[[1 0 0 0 0]
 [0 2 0 0 0]
 [0 0 3 0 0]
 [0 0 0 5 0]
 [0 0 0 0 2]]

Lower Triangular:
[[ 0.4  0.   0.   0.   0. ]
 [-0.4 -1.1  0.   0.   0. ]
 [ 1.  -1.2  0.7  0.   0. ]
 [-1.6 -0.2  1.  -1.1  0. ]
 [ 0.2  0.4  0.9  0.7  0.5]]

Concatenate A & B:
[[ 1.  0. -1. -0. -0. -0.]
 [ 0. -1.  1.  1. -1.  0.]
 [ 0.  1. -1.  2.  0.  0.]]


## Matrix addition and subtraction
Matrix addition is commutative and associative.

$$\large
 \begin{array}{l}
A+B\ =\ B+A\\
\\
A+( B+C) =( A+B) +C
\end{array}
$$
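
Both properties can be checked numerically. A minimal sketch, assuming small random integer matrices (integers keep the comparison exact) and an arbitrarily chosen seed:

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for reproducibility

# Three random integer matrices of the same size
A = rng.integers(-5, 6, size=(2, 3))
B = rng.integers(-5, 6, size=(2, 3))
C = rng.integers(-5, 6, size=(2, 3))

commutative = np.array_equal(A + B, B + A)
associative = np.array_equal(A + (B + C), (A + B) + C)
print(commutative, associative)  # True True
```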


In [3]:
A = np.array([[1,2], [3,0]])
B = np.array([[0,4], [4,2]])

# Addition
A+B

array([[1, 6],
       [7, 2]])

In [4]:
A = np.array([[1,2,-3], [3,0,-2]])
B = np.array([[0,4,-3], [4,2,2]])

# Subtraction
A-B

array([[ 1, -2,  0],
       [-1, -2, -4]])

## Matrix-scalar multiplication

Matrix-scalar multiplication works element-wise: every element of the matrix is multiplied by the scalar.

$$\large
 \begin{array}{l}
\delta \begin{bmatrix}
a & b\\
c & d
\end{bmatrix} =\begin{bmatrix}
\delta a & \delta b\\
\delta c & \delta d
\end{bmatrix} =\begin{bmatrix}
a\delta  & b\delta \\
c\delta  & d\delta 
\end{bmatrix} =\begin{bmatrix}
a & b\\
c & d
\end{bmatrix} \delta \\
\\
\delta MA=M\delta A=MA\delta 
\end{array}
$$

In [5]:
# Define matrix and scalar
M = np.array([[1, 2], [2, 5]])
s = 2

# Pre- and post-multiplication give the same result:
print(M*s, '\n')
print(s*M)

[[ 2  4]
 [ 4 10]] 

[[ 2  4]
 [ 4 10]]
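
The last line of the formula above, $\delta MA=M\delta A=MA\delta$, involves a matrix product, which the cell above does not exercise. A minimal sketch, assuming a second matrix `A` chosen here purely for illustration and NumPy's `@` operator for matrix multiplication:

```python
import numpy as np

M = np.array([[1, 2], [2, 5]])
A = np.array([[0, 1], [3, -1]])  # hypothetical second matrix, for illustration only
d = 3                            # the scalar (delta in the formula)

# The scalar can sit before, between, or after the matrices
left = d * (M @ A)
middle = M @ (d * A)
right = (M @ A) * d

print(np.array_equal(left, middle) and np.array_equal(middle, right))  # True
```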


## Code challenge: is matrix-scalar multiplication a linear operation?
Test for some random $M\times N$ matrices whether $s(A+B) = sA + sB$.

In [6]:
M = 4
N = 3
A = np.round(np.random.randn(M, N), 1)
B = np.round(np.random.randn(M, N), 1)
s = np.round(np.random.randn(1), 1)

# Check s(A+B) & sA + sB
resL = s*(A+B)
resR = s*A + s*B

print(resL), print()
print(resR)

[[-1.12  0.28 -0.42]
 [-3.92  2.24  3.22]
 [-0.    0.42  0.56]
 [-1.26 -0.14 -0.14]]

[[-1.12  0.28 -0.42]
 [-3.92  2.24  3.22]
 [ 0.    0.42  0.56]
 [-1.26 -0.14 -0.14]]
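
Comparing two printed matrices by eye does not scale, and the results above differ in one entry by the sign of zero (`-0.` versus `0.`). A numerical comparison handles both issues. A minimal sketch, assuming `np.allclose` as the equality test and an arbitrarily chosen seed:

```python
import numpy as np

rng = np.random.default_rng(1)  # seed chosen arbitrarily
M, N = 4, 3
A = rng.standard_normal((M, N))
B = rng.standard_normal((M, N))
s = rng.standard_normal()

resL = s * (A + B)
resR = s * A + s * B

# allclose tolerates tiny floating-point differences
is_equal = np.allclose(resL, resR)
print(is_equal)  # True

# Negative zero and positive zero compare as equal
zeros_match = np.float64(-0.0) == np.float64(0.0)
print(zeros_match)  # True
```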


## Transpose
The transpose operation is a way of flipping a matrix by converting rows into columns and columns into rows.

$$\large
 \begin{array}{l}
\begin{bmatrix}
1 & 5\\
0 & 6\\
2 & 8\\
5 & 3\\
-2 & 0
\end{bmatrix}^{T} =\begin{bmatrix}
1 & 0 & 2 & 5 & -2\\
5 & 6 & 8 & 3 & 0
\end{bmatrix}\\
\\
\begin{bmatrix}
1 & 0 & 2 & 5 & -2\\
5 & 6 & 8 & 3 & 0
\end{bmatrix}^{T} =\begin{bmatrix}
1 & 5\\
0 & 6\\
2 & 8\\
5 & 3\\
-2 & 0
\end{bmatrix}\\
\\
A^{TT} =A
\end{array}
$$
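
Element-wise, transposing means that entry $(i,j)$ of the original matrix becomes entry $(j,i)$ of the transpose. A minimal sketch that checks this on the $5\times 2$ example matrix (the names `At` and `swapped` are introduced here for illustration):

```python
import numpy as np

# The 5x2 example matrix from the equation above
A = np.array([[1, 5],
              [0, 6],
              [2, 8],
              [5, 3],
              [-2, 0]])
At = A.T

print(A.shape, At.shape)  # (5, 2) (2, 5)

# Every element A[i, j] must equal At[j, i]
swapped = all(A[i, j] == At[j, i]
              for i in range(A.shape[0])
              for j in range(A.shape[1]))
print(swapped)  # True

# Transposing twice returns the original matrix
print(np.array_equal(At.T, A))  # True
```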

In [7]:
# Define matrix
M = np.array([ [1,2,3],
               [2,3,4] ])

print(M), print('')
print(M.T), print('') # one transpose
print(M.T.T), print('') # double-transpose returns the original matrix

# Can also use the function transpose
print(np.transpose(M))

[[1 2 3]
 [2 3 4]]

[[1 2]
 [2 3]
 [3 4]]

[[1 2 3]
 [2 3 4]]

[[1 2]
 [2 3]
 [3 4]]


In [8]:
# Warning! be careful when using complex matrices
C = np.array([[4+1j , 3 , 2-4j]])

print(C), print('')
print(C.T), print('')
print(np.transpose(C)), print('')

# Note: In MATLAB, the ' operator is the Hermitian (conjugate) transpose;
#       in NumPy, .T does not conjugate, so the Hermitian transpose must be requested explicitly,
#       here via the .H attribute of np.matrix
print(np.matrix(C).H) # note the sign flips on the imaginary parts!

[[4.+1.j 3.+0.j 2.-4.j]]

[[4.+1.j]
 [3.+0.j]
 [2.-4.j]]

[[4.+1.j]
 [3.+0.j]
 [2.-4.j]]

[[4.-1.j]
 [3.-0.j]
 [2.+4.j]]
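
The Hermitian transpose can also be computed on a plain array, without converting to `np.matrix`, by conjugating and then transposing. A minimal sketch (the name `CH` is introduced here for illustration):

```python
import numpy as np

C = np.array([[4+1j, 3, 2-4j]])

# Conjugate, then transpose: the Hermitian transpose of an ndarray
CH = C.conj().T
print(CH)

# The order of the two operations does not matter
print(np.array_equal(CH, np.conjugate(C.T)))  # True
```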


## Symmetric matrices
A matrix is symmetric when it equals its own transpose: $A = A^T$. Only square matrices can be symmetric.

In [9]:
B = np.array([[2,3,6],[3,4,5],[6,5,9]])
A = B.T

print(B), print()
print(A), print()
print(A == B)

[[2 3 6]
 [3 4 5]
 [6 5 9]]

[[2 3 6]
 [3 4 5]
 [6 5 9]]

[[ True  True  True]
 [ True  True  True]
 [ True  True  True]]
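
The element-wise comparison above can be wrapped into a single yes/no check. A minimal sketch, where `is_symmetric` is a hypothetical helper (not a NumPy function) and `N` is a counterexample chosen for illustration:

```python
import numpy as np

def is_symmetric(X):
    """Hypothetical helper: True if X is square and equals its transpose."""
    X = np.asarray(X)
    return X.ndim == 2 and X.shape[0] == X.shape[1] and np.array_equal(X, X.T)

B = np.array([[2, 3, 6], [3, 4, 5], [6, 5, 9]])  # symmetric
N = np.array([[1, 2], [3, 4]])                   # not symmetric

print(is_symmetric(B))  # True
print(is_symmetric(N))  # False
```

For floating-point matrices, `np.allclose(X, X.T)` may be a more forgiving test than exact equality.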


### Quiz
If $A=B^T$ and B is symmetric, then $A^T+B = 2\times A$

In [10]:
print(A.T+B), print()
print(2*A)

[[ 4  6 12]
 [ 6  8 10]
 [12 10 18]]

[[ 4  6 12]
 [ 6  8 10]
 [12 10 18]]
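
One way to see why the quiz statement holds, using only $A=B^T$, the double-transpose rule $(B^T)^T=B$, and the symmetry of $B$ ($B=B^T$):

$$\large
A^{T} +B=\left( B^{T}\right)^{T} +B=B+B=2B=2B^{T} =2A
$$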


## Diagonal
The diagonal elements of a matrix can be extracted into a vector. As a function, $\text{diag}$ takes a matrix as input and returns a vector of its diagonal elements as output. The matrix does not need to be square.

$$\large
 \begin{array}{l}
\text{diag}\begin{pmatrix}
\begin{bmatrix}
1 & -1 & 8\\
-1 & -2 & 4\\
0 & 3 & 5
\end{bmatrix}
\end{pmatrix} =\begin{bmatrix}
1\\
-2\\
5
\end{bmatrix}\\
\\
\text{diag}\begin{pmatrix}
\begin{bmatrix}
1 & -1\\
-1 & -2\\
0 & 3
\end{bmatrix}
\end{pmatrix} =\begin{bmatrix}
1\\
-2
\end{bmatrix}
\end{array}
$$

## Trace
The trace is the sum of all the diagonal elements of a matrix. Note, the trace is only defined for square matrices ($M=N$).

$$\large
\text{trace}\begin{pmatrix}
\begin{bmatrix}
1 & -1 & 8\\
-1 & -2 & 4\\
0 & 3 & 5
\end{bmatrix}
\end{pmatrix} =1+( -2) +5=4
$$

### Diagonal and trace: formal definitions
**Diagonal:**

$$\large
v_{i} =A_{i,i} ,\ \ \ \ \ \ \ i\in \{1,2,...,\min( M,N)\}
$$

**Trace:**

$$\large
tr( A) =\sum _{i=1}^{M} A_{i,i}
$$
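
The two definitions translate almost directly into loops. A minimal sketch, where `diag_vector` and `trace` are hypothetical helpers written for illustration (in practice `np.diag` and `np.trace` do this work):

```python
import numpy as np

def diag_vector(A):
    """v_i = A[i, i] for i up to min(M, N); works for rectangular matrices."""
    M, N = A.shape
    return np.array([A[i, i] for i in range(min(M, N))])

def trace(A):
    """Sum of the diagonal elements; defined here for square matrices only."""
    M, N = A.shape
    if M != N:
        raise ValueError("trace is only defined for square matrices")
    return sum(A[i, i] for i in range(M))

S = np.array([[1, -1, 8], [-1, -2, 4], [0, 3, 5]])  # square example from above
R = np.array([[1, -1], [-1, -2], [0, 3]])           # rectangular example from above

print(diag_vector(S))           # elements 1, -2, 5
print(diag_vector(R))           # elements 1, -2
print(trace(S))                 # 4
print(trace(S) == np.trace(S))  # True
```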

In [11]:
# Define matrix
M = np.round(6*np.random.randn(4,4))
print("Matrix:")
print(M), print()

# Extract the diagonal; notice the two ways of using the diag function
d = np.diag(M) # input is matrix, output is vector
D = np.diag(d) # input is vector, output is matrix
print('Diagonal; input is matrix, output is vector:')
print(d), print()

print('Diagonal; input is vector, output is matrix:')
print(D), print()

# Trace as sum of diagonal elements
tr = np.trace(M)
tr2 = sum(np.diag(M))
print('Trace:')
print(tr, tr2)

Matrix:
[[ -9.  -0. -13.   6.]
 [ -9.  16.  11.  -5.]
 [ -9.  -0.   5.  -2.]
 [ -2.   5.   3.   5.]]

Diagonal; input is matrix, output is vector:
[-9. 16.  5.  5.]

Diagonal; input is vector, output is matrix:
[[-9.  0.  0.  0.]
 [ 0. 16.  0.  0.]
 [ 0.  0.  5.  0.]
 [ 0.  0.  0.  5.]]

Trace:
17.0 17.0


## Code challenge: linearity of trace
1. Determine the relationship between $\text{trace}(A) + \text{trace}(B)$ and $\text{trace}(A+B)$
2. Determine the relationship between $\text{trace}(l\times A)$ and $l\times \text{trace}(A)$

In [12]:
# Sizes
M = 4
N = 4

# Define matrices A & B
A = np.round(np.random.randn(M,N),1)
B = np.round(np.random.randn(M,N),1)
l = np.round(20*np.random.randn(1),1)
print('Matrix A:'), print(A), print()
print('Matrix B:'), print(B), print()
print('scalar:'), print(l), print()

# 1. Determine the relationship between trace(𝐴) + trace(𝐵) and trace(𝐴+𝐵)
tr1 = np.trace(A) + np.trace(B)
tr2 = np.trace(A+B)
print('tr(A) + tr(B):', np.round(tr1, 1))
print('tr(A+B):', np.round(tr2, 1)), print()

# 2. Determine the relationship between trace(𝑙×𝐴) and 𝑙×trace(𝐴)
tr3 = np.trace(l*A)
tr4 = (l*np.trace(A)).item() # l is a 1-element array; .item() extracts the Python scalar
print('trace(l*A):', np.round(tr3, 1))
print('l*trace(A):', np.round(tr4, 1))

Matrix A:
[[ 0.2 -0.1  0.5  0.6]
 [ 0.6  0.5  0.4  0.6]
 [ 0.5 -0.2 -0.9  0.1]
 [ 1.2  1.5 -1.1 -0.9]]

Matrix B:
[[ 0.2 -0.   0.2  1.6]
 [ 0.7 -2.2  1.6  0.2]
 [-0.1 -0.6 -0.5 -1.2]
 [ 0.1 -0.5  1.3  0.5]]

scalar:
[-41.8]

tr(A) + tr(B): -3.1
tr(A+B): -3.1

trace(l*A): 46.0
l*trace(A): 46.0
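
A single random draw is suggestive rather than conclusive. A minimal sketch that repeats both checks over many random matrices, assuming `np.isclose` as the equality test and an arbitrarily chosen seed and trial count:

```python
import numpy as np

rng = np.random.default_rng(2)  # seed chosen arbitrarily
additive = True
homogeneous = True

for _ in range(100):
    A = rng.standard_normal((4, 4))
    B = rng.standard_normal((4, 4))
    l = rng.standard_normal()

    # 1. trace(A) + trace(B) versus trace(A + B)
    additive = additive and bool(np.isclose(np.trace(A) + np.trace(B), np.trace(A + B)))
    # 2. trace(l * A) versus l * trace(A)
    homogeneous = homogeneous and bool(np.isclose(np.trace(l * A), l * np.trace(A)))

print(additive, homogeneous)  # True True
```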


## Broadcasting matrix arithmetic
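The term broadcasting describes how NumPy treats arrays with different shapes during arithmetic operations. When the shapes are compatible, the smaller array is in effect replicated along the mismatched dimension so that it matches the larger one, without actually copying the data.

Shapes are compared starting from the last dimension and working backwards; two dimensions are compatible when they are equal or when one of them is 1.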
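
The "replication" can be made visible with `np.broadcast_to`, which returns a read-only view of an array stretched to a target shape. A minimal sketch, with a small matrix and vector chosen here for illustration:

```python
import numpy as np

A = np.arange(12).reshape(3, 4)  # shape (3, 4)
r = np.array([10, 20, 30, 40])   # shape (4,)

# r stretched to the shape of A: the same row repeated three times
r_stretched = np.broadcast_to(r, A.shape)
print(r_stretched.shape)  # (3, 4)
print(r_stretched)

# Adding r directly gives the same result as adding the stretched version
same = np.array_equal(A + r, A + r_stretched)
print(same)  # True
```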

In [13]:
# Create a matrix
A = np.reshape(np.arange(1,13), (3,4), order='F') # 'F' = column-major (fill columns first), 'C' = row-major

# And two vectors
r = np.array([10, 20, 30, 40])
c = np.array([100, 200, 300])

print('Matrix A:')
print(A), print()
print('Vector r:')
print(r), print()
print('Vector c:')
print(c)

Matrix A:
[[ 1  4  7 10]
 [ 2  5  8 11]
 [ 3  6  9 12]]

Vector r:
[10 20 30 40]

Vector c:
[100 200 300]


In [14]:
# Broadcast on the rows
print('A + r:')
print(A + r), print()

# Broadcast on the columns
# print(A+c) # raises a ValueError: shapes (3,4) and (3,) are not compatible
print('Broadcast vector c columns:')
print(A + np.reshape(c, (len(c), 1))) # c must first be reshaped into an explicit (3,1) column vector

A + r:
[[11 24 37 50]
 [12 25 38 51]
 [13 26 39 52]]

Broadcast vector c columns:
[[101 104 107 110]
 [202 205 208 211]
 [303 306 309 312]]
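
The commented-out `A+c` fails because the trailing dimensions (4 and 3) do not match. A minimal sketch that demonstrates the error and shows a few equivalent ways of turning `c` into a column; the variable names `failed`, `col1`, `col2`, and `col3` are introduced here for illustration:

```python
import numpy as np

A = np.reshape(np.arange(1, 13), (3, 4), order='F')
c = np.array([100, 200, 300])

# Shapes (3, 4) and (3,) are incompatible, so this raises a ValueError
try:
    A + c
    failed = False
except ValueError as err:
    failed = True
    print('A + c fails:', err)

# Three equivalent ways to add c to every column
col1 = A + c[:, np.newaxis]   # add a trailing axis -> shape (3, 1)
col2 = A + c.reshape(-1, 1)   # reshape into a column
col3 = (A.T + c).T            # broadcast on the transpose, then flip back

print(np.array_equal(col1, col2) and np.array_equal(col2, col3))  # True
```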
