## Matrix properties

### Frobenius Norm

In [1]:
import numpy as np

X = np.array([[1, 2], [3, 4]])

In [2]:
(1**2 + 2**2 + 3**2 + 4**2)**0.5

5.477225575051661

In [3]:
np.linalg.norm(X)  # for a matrix the default is the Frobenius norm, the matrix analogue of the vector L2 (Euclidean) norm

np.float64(5.477225575051661)

In [4]:
# Frobenius norm using pytorch
import torch
X_pt = torch.tensor([[1, 2], [3, 4.]])
torch.norm(X_pt)

tensor(5.4772)

In [5]:
# Frobenius norm using tensorflow
import tensorflow as tf
X_tf = tf.Variable([[1, 2], [3, 4.]])
tf.norm(X_tf)

<tf.Tensor: shape=(), dtype=float32, numpy=5.4772257804870605>
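The manual calculation in cell 2 is the definition $\|X\|_F = \sqrt{\sum_{i,j} X_{i,j}^2}$ written out by hand. A minimal sketch of the same computation for an arbitrary matrix, assuming NumPy only (the names `fro_manual` and `fro_numpy` are illustrative):

```python
import numpy as np

X = np.array([[1, 2], [3, 4]])

# Frobenius norm from its definition: square every entry, sum, take the square root
fro_manual = np.sqrt(np.sum(X.astype(float) ** 2))

# Same value with the norm type named explicitly instead of relying on the default
fro_numpy = np.linalg.norm(X, ord="fro")

print(fro_manual, fro_numpy)
```

Naming `ord="fro"` makes the intent explicit when the same code may later receive a vector or a higher-rank array.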

### Matrix Multiplication (with a Vector)

In [6]:
A = np.array([[3,4], [5,6], [7,8]])

b = np.array([1,2])

print(A)
print(A.shape)
print(b)
print(b.shape)

[[3 4]
 [5 6]
 [7 8]]
(3, 2)
[1 2]
(2,)


In [7]:
np.dot(A, b)  # strictly speaking, a dot product is between two vectors; with a 2-D array np.dot performs matrix-vector multiplication

array([11, 17, 23])

In [8]:
A_pt = torch.tensor([[3,4], [5,6], [7,8]])
b_pt = torch.tensor([1,2])

from_np = torch.from_numpy(b)

# torch.matmul, like np.dot, infers from the dimensions whether to perform a dot product, a matrix-vector product, or a matrix multiplication
torch.matmul(A_pt, b_pt)

tensor([11, 17, 23])

In [9]:
A_tf = tf.Variable([[3,4], [5,6], [7,8]])
b_tf = tf.Variable([1,2])

from_np_tf = tf.convert_to_tensor(b, dtype=tf.int32)

# matvec - matrix vector multiplication
tf.linalg.matvec(A_tf, b_tf)

<tf.Tensor: shape=(3,), dtype=int32, numpy=array([11, 17, 23], dtype=int32)>
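To make the result `[11, 17, 23]` less opaque, the sketch below rebuilds it row by row: each output entry is the dot product of one row of `A` with `b`. The name `row_by_row` is illustrative only.

```python
import numpy as np

A = np.array([[3, 4], [5, 6], [7, 8]])
b = np.array([1, 2])

# Each output entry is the dot product of one row of A with b
row_by_row = np.array([np.dot(row, b) for row in A])

# The @ operator performs the same matrix-vector product in one step
product = A @ b

print(row_by_row)  # [11 17 23]
print(product)     # [11 17 23]
```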

### (Matrix-by-)Matrix Multiplication

In [10]:
M1 = np.array([[3,4], [5,6], [7,8]])
M2 = np.array([[1,9], [2,0]])

print(M1)
print(M1.shape)
print(M2)
print(M2.shape)


[[3 4]
 [5 6]
 [7 8]]
(3, 2)
[[1 9]
 [2 0]]
(2, 2)


In [11]:
np.dot(M1, M2)

# Matrix multiplication is not commutative: in general, M1 M2 != M2 M1

array([[11, 27],
       [17, 45],
       [23, 63]])

In [12]:
M1_pt = torch.tensor([[3,4], [5,6], [7,8]])
M2_pt = torch.tensor([[1,9], [2,0]])

print(M1_pt)
print(M2_pt)

tensor([[3, 4],
        [5, 6],
        [7, 8]])
tensor([[1, 9],
        [2, 0]])


In [13]:
torch.matmul(M1_pt, M2_pt)

tensor([[11, 27],
        [17, 45],
        [23, 63]])

In [14]:
M1_tf = tf.Variable([[3,4], [5,6], [7,8]])
M2_tf = tf.Variable([[1,9], [2,0]])

tf.linalg.matmul(M1_tf, M2_tf)

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[11, 27],
       [17, 45],
       [23, 63]], dtype=int32)>
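The non-commutativity noted above can be seen in two ways: reversing `M1` and `M2` is not even defined because the inner dimensions no longer match, and for square matrices the two orders generally give different results. A small sketch, where `P` and `Q` are made-up matrices chosen for illustration:

```python
import numpy as np

M1 = np.array([[3, 4], [5, 6], [7, 8]])   # shape (3, 2)
M2 = np.array([[1, 9], [2, 0]])           # shape (2, 2)

# (2, 2) @ (3, 2) is undefined: the inner dimensions 2 and 3 do not match
try:
    M2 @ M1
    reversed_ok = True
except ValueError:
    reversed_ok = False

# Even when both orders are defined, the results usually differ
P = np.array([[1, 2], [3, 4]])
Q = np.array([[0, 1], [1, 0]])
PQ = P @ Q   # [[2, 1], [4, 3]]  (columns of P swapped)
QP = Q @ P   # [[3, 4], [1, 2]]  (rows of P swapped)

print(reversed_ok, np.array_equal(PQ, QP))
```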

### Symmetric Matrices

Properties are:
- Square
- $X^T$ = $X$

In [15]:
X_sym = np.array([[1, 2, 3], [2, 3 ,4], [3, 4, 5]])
X_sym

array([[1, 2, 3],
       [2, 3, 4],
       [3, 4, 5]])

In [16]:
X_sym.T

array([[1, 2, 3],
       [2, 3, 4],
       [3, 4, 5]])

In [17]:
X_sym.T == X_sym

array([[ True,  True,  True],
       [ True,  True,  True],
       [ True,  True,  True]])
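The element-wise comparison above returns a whole boolean matrix. When a single yes/no answer is more convenient, both properties (square, and equal to its transpose) can be folded into one check. The helper `is_symmetric` below is a hypothetical name, not a NumPy function, and the tolerance is an assumption suited to floating-point data.

```python
import numpy as np

def is_symmetric(M, tol=1e-8):
    """Return True if M is square and equal to its transpose within tol."""
    M = np.asarray(M)
    if M.ndim != 2 or M.shape[0] != M.shape[1]:
        return False  # a non-square matrix cannot be symmetric
    return bool(np.allclose(M, M.T, atol=tol))

X_sym = np.array([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
B = np.array([[1, 2], [3, 4]])

print(is_symmetric(X_sym))    # True
print(is_symmetric(B))        # False
print(is_symmetric(B + B.T))  # True: any square matrix plus its transpose is symmetric
```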

### Identity Matrices

- Special case of a symmetric matrix
- Every element along the main diagonal is 1
- All other elements are 0
- Notation: $I_n$, where $n$ is the height (or, equivalently, the width)
- An $n$-length vector is unchanged when multiplied by $I_n$

In [18]:
X_identity = torch.tensor([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
X_identity

tensor([[1, 0, 0],
        [0, 1, 0],
        [0, 0, 1]])

In [19]:
x_pt = torch.tensor([25, 3, 5])
x_pt

tensor([25,  3,  5])

In [20]:
torch.matmul(X_identity, x_pt)

tensor([25,  3,  5])
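Typing the identity matrix by hand does not scale beyond small $n$. As a sketch, NumPy's `np.eye` builds $I_n$ directly, and the same "unchanged" property holds for matrices as well as vectors. The matrix `M` below is a made-up example.

```python
import numpy as np

I_3 = np.eye(3, dtype=int)   # 3x3 identity matrix
x = np.array([25, 3, 5])
M = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])   # hypothetical matrix

# Multiplying by the identity leaves a vector unchanged
vec_unchanged = np.array_equal(I_3 @ x, x)

# The same holds for a matrix, from either side
mat_unchanged = np.array_equal(I_3 @ M, M) and np.array_equal(M @ I_3, M)

print(vec_unchanged, mat_unchanged)  # True True
```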

### Matrix Inversion

Properties:
- Clever, convenient approach for solving linear equations
- An alternative to manually solving with substitution or elimination

The matrix inverse of $X$ is denoted $X^{-1}$
- It satisfies $X^{-1} X$ = $I_n$

In [21]:
X = np.array([[4, 2], [-5, -3]])
X

array([[ 4,  2],
       [-5, -3]])

In [22]:
X_inv = np.linalg.inv(X)
X_inv

array([[ 1.5,  1. ],
       [-2.5, -2. ]])

In [23]:
y = np.array([4, -7])
y

array([ 4, -7])

The regression formula is $y = Xw$, where $w$ is the vector of weights.

- We can write the same regression formula as $Xw = y$
- Assuming $X^{-1}$ exists, left-multiply both sides by it: $X^{-1} Xw$ = $X^{-1}y$
- We know $X^{-1} X$ = $I_n$
- So, $I_n w$ = $X^{-1}y$
- Any vector multiplied by the identity matrix $I_n$ is unchanged
- Finally, $w$ = $X^{-1}y$

In [25]:
w = np.dot(X_inv, y)
w

array([-1.,  4.])

In [27]:
# now that we know the weights, we can plug them back into each equation
y1 = 4 * -1 + 2 * 4
y1

4

In [28]:
y2 = -5 * -1 + -3 * 4
y2

-7

In [29]:
# equivalently, recompute y in one step from the original equation y = Xw
y = np.dot(X, w)
y

array([ 4., -7.])
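Forming $X^{-1}$ explicitly mirrors the derivation, but in practice a linear system is often solved without computing the inverse at all. The sketch below compares the two routes on the same `X` and `y`; `np.linalg.solve` is a standard NumPy routine, while the variable names `w_inv` and `w_solve` are illustrative.

```python
import numpy as np

X = np.array([[4, 2], [-5, -3]])
y = np.array([4, -7])

# Route 1: follow the derivation, w = X^{-1} y
w_inv = np.linalg.inv(X) @ y

# Route 2: solve Xw = y directly, without forming the inverse
w_solve = np.linalg.solve(X, y)

print(w_inv, w_solve)
print(np.allclose(X @ w_solve, y))  # the solution reproduces y
```

Solving directly is generally preferred for numerical work because it avoids the extra rounding error and cost of building the full inverse.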

In [33]:
# inverse of a matrix in PyTorch and TensorFlow
torch.inverse(torch.tensor([[4, 2.], [-5, -3]]))

tensor([[ 1.5000,  1.0000],
        [-2.5000, -2.0000]])

In [32]:
tf.linalg.inv(tf.Variable([[4, 2.], [-5, -3]]))


<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[ 1.4999998,  0.9999998],
       [-2.4999995, -1.9999996]], dtype=float32)>

Matrix inversion is a nifty trick, but it can only be applied when:
- The matrix is not singular
- That is, all columns of the matrix are linearly independent
    - E.g., if one column is [1,2], another column cannot be [2,4] or [1,2]

Where can it be applied:
- We apply matrix inversion to square matrices, $n_{row}$ = $n_{col}$ (so that the span of the column vectors matches the dimensions of the matrix)
    - Avoid overdetermination: $n_{row}$ > $n_{col}$, i.e. $n_{equations}$ > $n_{dimensions}$
    - Avoid underdetermination: $n_{row}$ < $n_{col}$, i.e. $n_{equations}$ < $n_{dimensions}$

*There are other ways to solve overdetermined and underdetermined systems*
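One such alternative is least squares, shown here as a sketch rather than a recommendation. It reuses the 3x2 matrix that appears commented out in the next cell; the target vector `y` is invented purely for illustration. `np.linalg.lstsq` finds the `w` that minimises $\|Xw - y\|$, and the pseudoinverse `np.linalg.pinv` gives the same answer.

```python
import numpy as np

# Overdetermined system: 3 equations, 2 unknowns
X = np.array([[-4.0, 1.0], [-3.0, 6.0], [6.0, 8.0]])
y = np.array([5.0, 2.0, 1.0])   # hypothetical targets, not from the notebook

# Least-squares solution: minimises the Euclidean norm of Xw - y
w_lstsq, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)

# The Moore-Penrose pseudoinverse yields the same solution
w_pinv = np.linalg.pinv(X) @ y

print(w_lstsq, w_pinv, rank)
```

For an underdetermined system the same two calls return the minimum-norm solution among the infinitely many that fit.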

In [38]:
# Matrix inversion with no solution

# X = np.array([[-4, 1], [-8, 2]])
X = np.array([[-4, 1], [-4, 1]])
# X = np.array([[-4, 1], [-3, 6], [6, 8]])

X

array([[-4,  1],
       [-4,  1]])

In [39]:
Xinv = np.linalg.inv(X)

LinAlgError: Singular matrix
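Rather than letting the error surface, singularity can be checked beforehand. The sketch below uses the rank and the determinant as indicators and then confirms that `np.linalg.inv` raises `LinAlgError`; the variable names are illustrative.

```python
import numpy as np

X = np.array([[-4, 1], [-4, 1]])   # second row repeats the first

# Rank below the number of columns means the columns are linearly dependent
rank = np.linalg.matrix_rank(X)

# A singular matrix has determinant zero (up to floating-point error)
det = np.linalg.det(X)

try:
    np.linalg.inv(X)
    invertible = True
except np.linalg.LinAlgError:
    invertible = False

print(rank, det, invertible)
```

With floating-point data the determinant is a fragile test, so the rank (or a condition-number check) is usually the safer indicator.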

### Diagonal Matrix 

- Nonzero elements along the main diagonal; zeros everywhere else
- The identity matrix is an example
- If the matrix is square, it can be denoted diag(**x**), where **x** is the vector of main-diagonal elements
- Computationally efficient
    - Multiplication: $diag(x)y$ = $x \odot y$
    - Inversion: $diag(x)^{-1}$ = $diag([1/x_1, \ldots, 1/x_n]^T)$
        - We can't divide by zero, so no element of $x$ (the diagonal) can be zero
- Can be a non-square matrix, and computation is still efficient:
    - if $h > w$, simply add zeros to the product
    - if $h < w$, remove elements from the product


In [41]:
## Diagonal matrix

X = np.array([[1, 0, 0], [0, 2, 0], [0, 0, 3]])
X

array([[1, 0, 0],
       [0, 2, 0],
       [0, 0, 3]])
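The efficiency claims above can be checked numerically. The sketch below uses the same diagonal `[1, 2, 3]` together with a made-up vector `y`, and compares the full matrix operations with their element-wise shortcuts, including the two non-square cases.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])   # main diagonal
y = np.array([4.0, 5.0, 6.0])   # hypothetical vector

D = np.diag(x)

# Multiplication: diag(x) y equals the element-wise product x * y
full_product = D @ y
cheap_product = x * y

# Inversion: diag(x)^{-1} equals diag(1 / x), valid because no entry of x is zero
D_inv = np.linalg.inv(D)
D_inv_cheap = np.diag(1.0 / x)

# Taller than wide (h > w): the product is x * y with zeros appended
D_tall = np.vstack([D, np.zeros((1, 3))])   # shape (4, 3)
tall_product = D_tall @ y                   # [4, 10, 18, 0]

# Wider than tall (h < w): the product is x * y with trailing elements dropped
D_wide = D[:2, :]                           # shape (2, 3)
wide_product = D_wide @ y                   # [4, 10]

print(full_product, tall_product, wide_product)
```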

### Orthogonal Matrix

- In orthogonal matrices, orthonormal vectors:
    - make up all rows
    - make up all columns
- This means $A^T A$ = $A A^T$ = $I$
- So, let us consider $A A^T$ = $I$
    - Left-multiply both sides by the inverse: $A^{-1} A A^T$ = $A^{-1} I$
    - $A^{-1} A$ = $I$, so the left side reduces to $A^T$: $A^T$ = $A^{-1} I$
    - Multiplying $A^{-1}$ by $I$ leaves just $A^{-1}$:
        - $A^T$ = $A^{-1}$
        - Calculating $A^T$ is cheap, so for an orthogonal matrix calculating $A^{-1}$ is also cheap

In [42]:
# check if I3's columns are orthogonal to each other

I_3 = np.eye(3)
I_3

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [47]:
c1 = I_3[:, 0]
c2 = I_3[:, 1]
c3 = I_3[:, 2]
c1, c2, c3

(array([1., 0., 0.]), array([0., 1., 0.]), array([0., 0., 1.]))

In [48]:
v1 = np.dot(c1, c2)
v2 = np.dot(c1, c3)
v3 = np.dot(c2, c3)
v1, v2, v3

(np.float64(0.0), np.float64(0.0), np.float64(0.0))

In [None]:
# check if I3's columns have unit norm

uc_1 = np.linalg.norm(I_3[:, 0])
uc_2 = np.linalg.norm(I_3[:, 1])
uc_3 = np.linalg.norm(I_3[:, 2])
uc_1, uc_2, uc_3


(np.float64(1.0), np.float64(1.0), np.float64(1.0))

Since the matrix has mutually orthogonal columns and each column has unit norm, the column vectors of $I_3$ are *orthonormal*. Since
$I_3^T$ = $I_3$ this means the rows of $I_3$ must also be orthonormal.

Since the columns and rows of $I_3$ are orthonormal, $I_3$ is an orthogonal matrix.

In [49]:
K = torch.tensor([[2/3, 1/3, 2/3], [-2/3, 2/3, 1/3], [1/3, 2/3, -2/3]])
K

tensor([[ 0.6667,  0.3333,  0.6667],
        [-0.6667,  0.6667,  0.3333],
        [ 0.3333,  0.6667, -0.6667]])

In [50]:
k_c1 = K[:, 0]
k_c2 = K[:, 1]
k_c3 = K[:, 2]
k_c1, k_c2, k_c3

(tensor([ 0.6667, -0.6667,  0.3333]),
 tensor([0.3333, 0.6667, 0.6667]),
 tensor([ 0.6667,  0.3333, -0.6667]))

In [None]:
torch.dot(k_c1, k_c2)

tensor(0.)

In [None]:
torch.dot(k_c1, k_c3)

tensor(0.)

In [67]:
torch.dot(k_c2, k_c3)

tensor(0.)

In [54]:
torch.norm(k_c1)


tensor(1.)

In [55]:
torch.norm(k_c2)

tensor(1.)

In [56]:
torch.norm(k_c3)

tensor(1.)

For this matrix as well we see all the columns are orthogonal to each other and have unit norm. 

In [60]:
# compute K transpose, then check whether it equals K

K_t = K.T
K_t

tensor([[ 0.6667, -0.6667,  0.3333],
        [ 0.3333,  0.6667,  0.6667],
        [ 0.6667,  0.3333, -0.6667]])

In [None]:
K_t == K


tensor([[ True, False, False],
        [False,  True, False],
        [False, False,  True]])

$K^T$ is not equal to $K$, so the rows of $K$ are not automatically orthonormal just because its columns are. We therefore check explicitly that the columns of $K^T$ (the rows of $K$) are orthogonal to each other and have unit norm.

In [62]:
K_t_c1 = K_t[:, 0]
K_t_c2 = K_t[:, 1]
K_t_c3 = K_t[:, 2]
K_t_c1, K_t_c2, K_t_c3

(tensor([0.6667, 0.3333, 0.6667]),
 tensor([-0.6667,  0.6667,  0.3333]),
 tensor([ 0.3333,  0.6667, -0.6667]))

In [None]:
torch.dot(K_t_c1, K_t_c2)

tensor(0.)

In [None]:
torch.dot(K_t_c1, K_t_c3)

tensor(0.)

In [None]:
torch.dot(K_t_c2, K_t_c3)

tensor(0.)

In [66]:
# check the norms of the columns of K transpose
cn1 = torch.norm(K_t_c1)
cn2 = torch.norm(K_t_c2)
cn3 = torch.norm(K_t_c3)
cn1, cn2, cn3

(tensor(1.), tensor(1.), tensor(1.))

So for $K^T$ as well, we see that all the columns are orthogonal to each other and have unit norm.

Hence we can conclude that $K$ is orthogonal.

Alternatively, we can check whether $K$ is orthogonal by testing $A^T A$ = $I$ with $A = K$, using a single line of code

In [68]:
torch.matmul(K_t, K)

tensor([[ 1.0000e+00, -3.3114e-09,  3.3114e-09],
        [-3.3114e-09,  1.0000e+00,  6.6227e-09],
        [ 3.3114e-09,  6.6227e-09,  1.0000e+00]])

Here, we can confirm that $K^T K$ is in fact $I$ up to floating-point error (the diagonal elements are 1 and the other elements are close to 0), and therefore $K$ is an orthogonal matrix.
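Because the off-diagonal entries are only approximately zero, an exact `==` comparison against $I$ would fail. A tolerance-based check is one way to automate the test. The sketch below uses NumPy rather than PyTorch for brevity; `is_orthogonal` is a hypothetical helper and its tolerance is an assumption.

```python
import numpy as np

def is_orthogonal(A, tol=1e-8):
    """Return True if A is square and A^T A equals I within tol."""
    A = np.asarray(A, dtype=float)
    if A.ndim != 2 or A.shape[0] != A.shape[1]:
        return False
    return bool(np.allclose(A.T @ A, np.eye(A.shape[0]), atol=tol))

K = np.array([[2/3, 1/3, 2/3], [-2/3, 2/3, 1/3], [1/3, 2/3, -2/3]])
not_orthogonal = np.array([[1.0, 2.0], [3.0, 4.0]])   # made-up counterexample

print(is_orthogonal(K))               # True
print(is_orthogonal(not_orthogonal))  # False

# For an orthogonal matrix the inverse is simply the transpose
print(np.allclose(np.linalg.inv(K), K.T))  # True
```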