<h2 style="text-align: center;"><strong>Segment 3: Matrix Properties</strong></h2>

* The Frobenius Norm
* Matrix Multiplication
* Symmetric Matrices
* Identity Matrices
* Matrix Inversion
* Diagonal Matrices
* Orthogonal Matrices

---

In [65]:
import numpy as np
import matplotlib.pyplot as plt
import torch
import tensorflow as tf

## **Frobenius Norm**
*The **Frobenius norm** of a matrix is a measure of its overall magnitude. It is defined as the **square root of the sum of the squares of all the elements** in the matrix.*

> $\|A\|_F = \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}^2}$

##### **Frobenius Norm in NumPy**

In [66]:
X = np.array([[1, 2], [3, 4]])
X

array([[1, 2],
       [3, 4]])

In [67]:
(1**2 + 2**2 + 3**2 + 4**2)**(1/2)

5.477225575051661

In [68]:
np.linalg.norm(X)

np.float64(5.477225575051661)

##### **Frobenius Norm in PyTorch**

*`torch.norm()` requires a floating-point tensor, so at least one entry is written as a float (`4.`)*

In [69]:
X_pt = torch.tensor([[1, 2], [3, 4.]])

In [70]:
torch.norm(X_pt)

tensor(5.4772)

##### **Frobenius Norm in TensorFlow**

*`tf.norm()` also requires a floating-point tensor*

In [71]:
X_tf = tf.Variable([[1, 2], [3, 4.]])

In [72]:
tf.norm(X_tf)

<tf.Tensor: shape=(), dtype=float32, numpy=5.4772257804870605>
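
The definition can also be checked against a few equivalent formulations. The following is a minimal NumPy sketch; the name `X_demo` and the result variables are illustrative and not part of the notebook. It compares the sum-of-squares definition with the L2 norm of the flattened matrix, with $\sqrt{\operatorname{tr}(X^T X)}$, and with an explicit `ord="fro"` call.

```python
import numpy as np

# Same values as the matrix used in the cells above (float copy, separate name).
X_demo = np.array([[1.0, 2.0], [3.0, 4.0]])

# Definition: square root of the sum of the squared entries.
fro_by_definition = np.sqrt(np.sum(X_demo ** 2))

# Equivalent view 1: L2 norm of the matrix flattened into a vector.
fro_flattened = np.linalg.norm(X_demo.ravel())

# Equivalent view 2: square root of the trace of X^T X.
fro_trace = np.sqrt(np.trace(X_demo.T @ X_demo))

# Explicit request for the Frobenius norm (the default for 2-D input).
fro_numpy = np.linalg.norm(X_demo, ord="fro")

print(fro_by_definition, fro_flattened, fro_trace, fro_numpy)
```

All four values should agree up to floating-point round-off.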

---

## **Matrix Multiplication**
*Matrix multiplication is an operation where two matrices are combined to produce a third matrix. For the multiplication $A \cdot B$ to be valid, the number of **columns in $A$ must equal the number of rows in $B$**. Each element of the resulting matrix is the **dot product** of a row from the first matrix and a column from the second matrix.*

**Rules for matrix multiplication:**
1. Only defined if $A$ is of size $m \times n$ and $B$ is of size $n \times p$.  
2. The resulting matrix $C = A \cdot B$ will have size $m \times p$.  
3. Matrix multiplication is generally **not commutative**, i.e., $AB \neq BA$.  
4. Multiplication **is associative**: $(AB)C = A(BC)$.  
5. Multiplication **is distributive** over addition: $A(B + C) = AB + AC$.  

> $C_{ij} = \sum_{k=1}^{n} A_{ik} B_{kj}$
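
The summation formula translates directly into three nested loops. The sketch below is for illustration only: `matmul_by_definition` is a hypothetical helper rather than a library function, and in practice `np.dot` or the `@` operator should be used. It also checks rules 4 and 5 numerically on small integer matrices chosen for the example.

```python
import numpy as np

def matmul_by_definition(A, B):
    """Multiply A (m x n) by B (n x p) using C_ij = sum_k A_ik * B_kj."""
    m, n = A.shape
    n_b, p = B.shape
    if n != n_b:
        raise ValueError(f"Inner dimensions differ: {n} vs {n_b}")
    C = np.zeros((m, p), dtype=np.result_type(A, B))
    for i in range(m):          # rows of A
        for j in range(p):      # columns of B
            for k in range(n):  # shared inner dimension
                C[i, j] += A[i, k] * B[k, j]
    return C

A_demo = np.array([[3, 4], [5, 6], [7, 8]])    # 3 x 2
B_demo = np.array([[1, 9], [2, 0]])            # 2 x 2
C_demo = matmul_by_definition(A_demo, B_demo)  # 3 x 2
print(C_demo)

# Rules 4 and 5 checked on small integer matrices (exact arithmetic).
P = np.array([[1, 2], [0, 1]])
Q = np.array([[2, 0], [1, 3]])
assoc_ok = np.array_equal((A_demo @ P) @ Q, A_demo @ (P @ Q))
distrib_ok = np.array_equal(A_demo @ (P + Q), A_demo @ P + A_demo @ Q)
print(assoc_ok, distrib_ok)
```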

### **Matrix-Vector Multiplication**

##### **NumPy**

In [73]:
A = np.array([[3, 4], [5, 6], [7, 8]])
A

array([[3, 4],
       [5, 6],
       [7, 8]])

In [74]:
b = np.array([1, 2])
b

array([1, 2])

*Strictly speaking, a dot product is defined between two vectors, but `np.dot()` also handles matrix-vector and matrix-matrix products*

In [75]:
np.dot(A, b)

array([11, 17, 23])

##### **PyTorch**

In [76]:
A_pt = torch.tensor([[3, 4], [5, 6], [7, 8]])
A_pt

tensor([[3, 4],
        [5, 6],
        [7, 8]])

In [77]:
b_pt = torch.tensor([1, 2])
b_pt

tensor([1, 2])

*Like `np.dot()`, `torch.matmul()` infers from the dimensions of its arguments whether to perform a dot product, a matrix-vector product, or a matrix-matrix product*

In [78]:
torch.matmul(A_pt, b_pt)

tensor([11, 17, 23])

##### **TensorFlow**

In [79]:
A_tf = tf.Variable([[3, 4], [5, 6], [7, 8]])
A_tf

<tf.Variable 'Variable:0' shape=(3, 2) dtype=int32, numpy=
array([[3, 4],
       [5, 6],
       [7, 8]], dtype=int32)>

In [80]:
b_tf = tf.Variable([1, 2])
b_tf

<tf.Variable 'Variable:0' shape=(2,) dtype=int32, numpy=array([1, 2], dtype=int32)>

In [81]:
tf.linalg.matvec(A_tf, b_tf)

<tf.Tensor: shape=(3,), dtype=int32, numpy=array([11, 17, 23], dtype=int32)>

### **Matrix-Matrix Multiplication**

##### **NumPy**

In [82]:
A

array([[3, 4],
       [5, 6],
       [7, 8]])

In [83]:
B = np.array([[1, 9], [2, 0]])
B

array([[1, 9],
       [2, 0]])

In [84]:
np.dot(A, B)

array([[11, 27],
       [17, 45],
       [23, 63]])

##### **PyTorch**

In [85]:
B_pt = torch.from_numpy(B)
B_pt

tensor([[1, 9],
        [2, 0]])

*Another neat way to create the same tensor with transposition*

In [86]:
B_pt = torch.tensor([[1, 2], [9, 0]]).T
B_pt

tensor([[1, 9],
        [2, 0]])

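*`torch.matmul()` handles both matrix-vector and matrix-matrix products, whereas the TensorFlow cells use `tf.linalg.matvec()` for the former and `tf.matmul()` for the latter*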

In [87]:
torch.matmul(A_pt, B_pt)

tensor([[11, 27],
        [17, 45],
        [23, 63]])

##### **TensorFlow**

In [88]:
B_tf = tf.convert_to_tensor(B, dtype=tf.int32)
B_tf

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[1, 9],
       [2, 0]], dtype=int32)>

In [89]:
tf.matmul(A_tf, B_tf)

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[11, 27],
       [17, 45],
       [23, 63]], dtype=int32)>

**Note:** In general, **matrix multiplication is not commutative**, meaning:

> $AB \neq BA$

Assuming commutativity can result in **dimension mismatch errors** or incorrect computations.  

**Special cases where $AB = BA$:**
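1. **Diagonal matrices** of the same size commute with each other.
2. **Scalar multiples of the identity matrix** commute with every square matrix of the same size, e.g., $A(cI) = (cI)A = cA$.
3. **Pairs of matrices that commute by construction**, such as an invertible matrix and its inverse ($A A^{-1} = A^{-1} A = I$) or two powers of the same matrix. Such pairs are rare among arbitrary matrices.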
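
These cases can be checked numerically. The following is a minimal NumPy sketch with matrices chosen only for illustration (none of the names below appear in the notebook). It shows a pair that does not commute, two special cases that do, and the shape error raised when the operands of a non-square product are swapped.

```python
import numpy as np

A_sq = np.array([[1, 2], [3, 4]])
B_sq = np.array([[0, 1], [1, 0]])

# General case: the two orderings give different products.
general_commute = np.array_equal(A_sq @ B_sq, B_sq @ A_sq)

# Special case 1: two diagonal matrices of the same size.
D1 = np.diag([2, 3])
D2 = np.diag([5, 7])
diagonal_commute = np.array_equal(D1 @ D2, D2 @ D1)

# Special case 2: a scalar multiple of the identity.
cI = 3 * np.eye(2, dtype=int)
scalar_commute = np.array_equal(A_sq @ cI, cI @ A_sq)

# Dimension mismatch: (3 x 2)(2 x 2) is defined, (2 x 2)(3 x 2) is not.
A_rect = np.array([[3, 4], [5, 6], [7, 8]])
try:
    B_sq @ A_rect
    reversed_order_defined = True
except ValueError:
    reversed_order_defined = False

print(general_commute, diagonal_commute, scalar_commute, reversed_order_defined)
# False True True False
```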

---

## **Symmetric Matrices**
*A symmetric matrix is a square matrix that is equal to its transpose, meaning its elements satisfy $A = A^T$. The entries are mirrored across the main diagonal.*

> $A = A^T \quad \Rightarrow \quad a_{ij} = a_{ji}$

In [90]:
X_sym = np.array([[0, 1, 2], [1, 7, 8], [2, 8, 9]])
X_sym

array([[0, 1, 2],
       [1, 7, 8],
       [2, 8, 9]])

In [91]:
X_sym.T

array([[0, 1, 2],
       [1, 7, 8],
       [2, 8, 9]])

In [92]:
X_sym.T == X_sym

array([[ True,  True,  True],
       [ True,  True,  True],
       [ True,  True,  True]])
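
The element-wise comparison can be wrapped into a single boolean check. In the sketch below, `is_symmetric` is a hypothetical helper, not a NumPy function; it uses `np.allclose` so that floating-point matrices are compared with a tolerance. As a side illustration, it also confirms that $A^T A$ is symmetric for a non-square $A$.

```python
import numpy as np

def is_symmetric(M, tol=1e-8):
    """Return True if M is a square 2-D array equal to its transpose."""
    M = np.asarray(M)
    if M.ndim != 2 or M.shape[0] != M.shape[1]:
        return False  # only square matrices can be symmetric
    return bool(np.allclose(M, M.T, atol=tol))

S_demo = np.array([[0, 1, 2], [1, 7, 8], [2, 8, 9]])   # symmetric
N_demo = np.array([[0, 1], [5, 7]])                     # not symmetric
A_rect = np.array([[3, 4], [5, 6], [7, 8]])             # 3 x 2, not square

print(is_symmetric(S_demo))             # True
print(is_symmetric(N_demo))             # False
print(is_symmetric(A_rect))             # False
print(is_symmetric(A_rect.T @ A_rect))  # True
```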

---

## **Identity Matrices**
*An identity matrix is a square matrix in which all the diagonal elements are 1 and all off-diagonal elements are 0. It acts as the multiplicative identity in matrix multiplication.*

> $I_n = 
\begin{bmatrix}
1 & 0 & \dots & 0 \\
0 & 1 & \dots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \dots & 1
\end{bmatrix}$

In [93]:
I = torch.tensor([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
I

tensor([[1, 0, 0],
        [0, 1, 0],
        [0, 0, 1]])

In [94]:
x_pt = torch.tensor([25, 2, 5])
x_pt

tensor([25,  2,  5])

In [95]:
torch.matmul(I, x_pt)

tensor([25,  2,  5])
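
The same behavior can be reproduced in NumPy without typing the matrix by hand. This is a minimal sketch, assuming `np.eye` as the constructor; the names `I3`, `x_demo`, and `M_demo` are illustrative. It also shows that the identity leaves a matrix unchanged when multiplied from either side.

```python
import numpy as np

I3 = np.eye(3, dtype=int)   # 3 x 3 identity matrix
x_demo = np.array([25, 2, 5])
M_demo = np.array([[0, 1, 2], [1, 7, 8], [2, 8, 9]])

print(I3 @ x_demo)                          # [25  2  5]
print(np.array_equal(I3 @ M_demo, M_demo))  # True
print(np.array_equal(M_demo @ I3, M_demo))  # True
```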

---

## **Matrix Inversion**
*A square matrix $A$ is invertible if there exists a matrix $A^{-1}$ such that multiplying them yields the identity matrix. In other words, $A \cdot A^{-1} = A^{-1} \cdot A = I$.*

> $A \cdot A^{-1} = A^{-1} \cdot A = I$

##### **Solving a Linear System Using Matrix Inverse — Worked Example**

**Question**  
*Solve the system of linear equations*

$$
\begin{cases}
4w_1 + 2w_2 = 4 \\
-5w_1 - 3w_2 = -7
\end{cases}
$$

*Or in matrix form*

$$
y = X w, \quad
X = 
\begin{bmatrix}
4 & 2 \\
-5 & -3
\end{bmatrix}, \quad
y = 
\begin{bmatrix}
4 \\ -7
\end{bmatrix}, \quad
w = 
\begin{bmatrix}
w_1 \\ w_2
\end{bmatrix}.
$$

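**Goal:** Find $w$ such that $y = X w$.

> **Step 1: Compute the inverse of $X$**

*For a $2 \times 2$ matrix, swap the diagonal entries, negate the off-diagonal entries, and divide by the determinant. Here $\det(X) = (4)(-3) - (2)(-5) = -2$.*

$$
X^{-1} = \frac{1}{-2}
\begin{bmatrix}
-3 & -2 \\
5 & 4
\end{bmatrix} =
\begin{bmatrix}
1.5 & 1 \\
-2.5 & -2
\end{bmatrix}
$$

> **Step 2: Solve for $w$**

$$
w = X^{-1} y =
\begin{bmatrix}
1.5 & 1 \\
-2.5 & -2
\end{bmatrix}
\begin{bmatrix}
4 \\ -7
\end{bmatrix} =
\begin{bmatrix}
-1 \\ 4
\end{bmatrix}
$$

> **Step 3: Verify the solution**

$$
X w = y
$$

> **Step 4: Manual verification**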

*From the first equation*

$$
w_2 = 2 - 2 w_1
$$

*Substitute into the second equation*

$$
-5 w_1 - 3(2 - 2 w_1) = -7
$$

$$
w_1 - 6 = -7 \quad \Rightarrow \quad w_1 = -1
$$

*Then*

$$
w_2 = 2 - 2(-1) = 4
$$

**Final Answer**

$$
w =
\begin{bmatrix}
-1 \\ 4
\end{bmatrix}
$$

> Verification:  
> 
> $$
> X w =
> \begin{bmatrix}
> 4 & 2 \\
> -5 & -3
> \end{bmatrix} 
> \begin{bmatrix}
> -1 \\ 4
> \end{bmatrix} =
> \begin{bmatrix}
> 4 \\ -7
> \end{bmatrix} = y
> $$
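
The worked example can be reproduced with the closed-form inverse of a $2 \times 2$ matrix. In the sketch below, `inverse_2x2` is a hypothetical helper written for this example only (it is not a NumPy function), and the `_demo` names are illustrative.

```python
import numpy as np

def inverse_2x2(M):
    """Invert a 2 x 2 matrix [[a, b], [c, d]] as (1 / det) * [[d, -b], [-c, a]]."""
    a, b = M[0]
    c, d = M[1]
    det = a * d - b * c
    if det == 0:
        raise ValueError("Matrix is singular (determinant is zero)")
    return np.array([[d, -b], [-c, a]]) / det

X_demo = np.array([[4.0, 2.0], [-5.0, -3.0]])
y_demo = np.array([4.0, -7.0])

X_inv_demo = inverse_2x2(X_demo)   # Step 1: inverse of X
w_demo = X_inv_demo @ y_demo       # Step 2: w = X^{-1} y
y_check = X_demo @ w_demo          # Step 3: X w should reproduce y

print(X_inv_demo)
print(w_demo)
print(y_check)
```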

##### **Matrix Inversion in NumPy**

In [96]:
X = np.array([[4, 2], [-5, -3]])
X

array([[ 4,  2],
       [-5, -3]])

In [97]:
Xinv = np.linalg.inv(X)
Xinv

array([[ 1.5,  1. ],
       [-2.5, -2. ]])

In [98]:
np.dot(Xinv, X)

array([[1.00000000e+00, 3.33066907e-16],
       [0.00000000e+00, 1.00000000e+00]])

**Validation of $w = X^{-1} y$ as the Solution to $X w = y$**

In [99]:
y = np.array([4, -7])
y

array([ 4, -7])

In [100]:
w = np.dot(Xinv, y)
w

array([-1.,  4.])

In [101]:
np.dot(X, w)

array([ 4., -7.])

##### **Matrix Inversion in PyTorch**

*`torch.inverse()` requires a floating-point tensor*

In [102]:
X_pt = torch.tensor([[4, 2], [-5, -3.]])
X_pt

tensor([[ 4.,  2.],
        [-5., -3.]])

In [103]:
Xpt_inv = torch.inverse(X_pt)
Xpt_inv

tensor([[ 1.5000,  1.0000],
        [-2.5000, -2.0000]])

In [104]:
torch.matmul(X_pt,Xpt_inv)

tensor([[1., 0.],
        [0., 1.]])

##### **Matrix Inversion in TensorFlow**

*`tf.linalg.inv()` also requires a floating-point tensor*

In [105]:
X_tf = tf.Variable([[4, 2], [-5, -3.]])
X_tf

<tf.Variable 'Variable:0' shape=(2, 2) dtype=float32, numpy=
array([[ 4.,  2.],
       [-5., -3.]], dtype=float32)>

In [106]:
Xtf_inv = tf.linalg.inv(X_tf)
Xtf_inv

<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[ 1.4999998,  0.9999998],
       [-2.4999995, -1.9999996]], dtype=float32)>

In [107]:
tf.matmul(X_tf,Xtf_inv)

<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[ 1.0000000e+00,  0.0000000e+00],
       [-4.7683716e-07,  9.9999988e-01]], dtype=float32)>

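*The NumPy and TensorFlow products show tiny floating-point round-off errors (on the order of $10^{-16}$ in float64 and $10^{-7}$ in float32), while the PyTorch output prints as an exact identity; all three are effectively the identity matrix.*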
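
Rather than comparing the printed products by eye, a tolerance-based check is usually more reliable. The following is a minimal NumPy sketch using `np.allclose` with its default tolerances; the variable names are illustrative.

```python
import numpy as np

X_demo = np.array([[4.0, 2.0], [-5.0, -3.0]])
product = np.linalg.inv(X_demo) @ X_demo

# Exact equality may fail because of round-off in the inverse.
exactly_identity = np.array_equal(product, np.eye(2))

# Comparison within a small tolerance is the robust check.
close_to_identity = np.allclose(product, np.eye(2))

print(exactly_identity, close_to_identity)
```

Whether `exactly_identity` is true can depend on the platform and library version, whereas `close_to_identity` should hold for any well-conditioned matrix.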

##### **Limitations of Matrix Inversion**
*Not all matrices are invertible. A matrix can be inverted only if it is **square** and **non-singular** (i.e., its determinant is non-zero). Inverting large matrices is also **computationally expensive** and can lead to **numerical instability**.*

**Key points:**
1. Only **square matrices** (same number of rows and columns) can have an inverse.  
2. **Singular matrices** with determinant zero do not have an inverse.  
3. Inverting very large matrices can be **slow** and prone to **round-off errors**.  
4. For many practical ML applications, it is often better to **solve the linear system directly** (e.g., with `np.linalg.solve`) than to compute the inverse explicitly.
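
The key points can be illustrated in NumPy. This is a sketch under stated assumptions: `try_invert` is a hypothetical helper, and the singular and non-square matrices are made up for the example. It solves the earlier system with `np.linalg.solve` instead of an explicit inverse and shows that inverting a singular or non-square matrix raises `np.linalg.LinAlgError`.

```python
import numpy as np

def try_invert(M):
    """Return (inverse, None) on success or (None, error message) on failure."""
    try:
        return np.linalg.inv(M), None
    except np.linalg.LinAlgError as err:
        return None, str(err)

# Key point 4: solve X w = y directly, without forming X^{-1}.
X_demo = np.array([[4.0, 2.0], [-5.0, -3.0]])
y_demo = np.array([4.0, -7.0])
w_solve = np.linalg.solve(X_demo, y_demo)
print(w_solve)

# Key point 2: the second row is twice the first, so the determinant is zero.
S_demo = np.array([[1.0, 2.0], [2.0, 4.0]])
det_S = np.linalg.det(S_demo)
S_inv, S_error = try_invert(S_demo)
print(det_S, S_error)

# Key point 1: a non-square matrix has no inverse.
R_demo = np.ones((2, 3))
R_inv, R_error = try_invert(R_demo)
print(R_error)
```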

---

## **Diagonal Matrices**
*A **diagonal matrix** is a square matrix in which all the **off-diagonal elements are zero**. Only the elements on the main diagonal (from top-left to bottom-right) can be non-zero.* 

$$
D = 
\begin{bmatrix}
d_1 & 0 & 0 & \cdots & 0 \\
0 & d_2 & 0 & \cdots & 0 \\
0 & 0 & d_3 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & d_n
\end{bmatrix}
$$

##### **Properties of Diagonal Matrices**

1. **Multiplication**  
   Multiplying a diagonal matrix with a vector or another diagonal matrix is straightforward:

   $$
   D \cdot x = 
   \begin{bmatrix}
   d_1 x_1 \\
   d_2 x_2 \\
   \vdots \\
   d_n x_n
   \end{bmatrix}
   $$

2. **Inverse**  
   If all diagonal entries are non-zero, the inverse exists and is simply:

   $$
   D^{-1} = 
   \begin{bmatrix}
   1/d_1 & 0 & \cdots & 0 \\
   0 & 1/d_2 & \cdots & 0 \\
   \vdots & \vdots & \ddots & \vdots \\
   0 & 0 & \cdots & 1/d_n
   \end{bmatrix}
   $$

3. **Transpose**  
   The transpose of a diagonal matrix is itself:

   $$
   D^T = D
   $$

4. **Determinant**  
   The determinant of a diagonal matrix is the product of its diagonal elements:

   $$
   \text{det}(D) = d_1 \cdot d_2 \cdot \ldots \cdot d_n
   $$

##### **Example**

$$
D =
\begin{bmatrix}
3 & 0 & 0 \\
0 & 5 & 0 \\
0 & 0 & 2
\end{bmatrix}, \quad
x =
\begin{bmatrix}
1 \\
2 \\
3
\end{bmatrix}
$$

> Multiplying a Diagonal Matrix with a Vector

$$
D \cdot x =
\begin{bmatrix}
3 \cdot 1 \\
5 \cdot 2 \\
2 \cdot 3
\end{bmatrix} =
\begin{bmatrix}
3 \\
10 \\
6
\end{bmatrix}
$$

> Inverse

$$
D^{-1} =
\begin{bmatrix}
1/3 & 0 & 0 \\
0 & 1/5 & 0 \\
0 & 0 & 1/2
\end{bmatrix}
$$

**Takeaway:**  
Diagonal matrices are simple to work with because most operations reduce to element-wise operations on the diagonal elements. They frequently appear in **linear algebra, eigenvalue problems, and machine learning**.
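
The four properties can be checked on the example above. The following is a minimal NumPy sketch, assuming `np.diag` to build the matrix from its diagonal; all variable names are illustrative.

```python
import numpy as np

d = np.array([3.0, 5.0, 2.0])   # diagonal entries of D
D = np.diag(d)
x = np.array([1.0, 2.0, 3.0])

# Multiplication: D @ x equals the element-wise product d * x.
Dx_matrix = D @ x
Dx_elementwise = d * x
print(Dx_matrix, Dx_elementwise)

# Inverse: take the reciprocal of each diagonal entry.
D_inv = np.diag(1.0 / d)
print(np.allclose(D @ D_inv, np.eye(3)))

# Transpose: a diagonal matrix equals its transpose.
print(np.array_equal(D.T, D))

# Determinant: the product of the diagonal entries.
det_D = np.prod(d)
print(det_D, np.linalg.det(D))
```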

---

## **Orthogonal Matrices**
*An orthogonal matrix is a square matrix whose rows and columns are orthonormal vectors. A matrix $Q$ is orthogonal if its transpose is equal to its inverse.*

> $Q^\top Q = QQ^\top = I$

**Key properties:**
- Preserves lengths and angles during transformations.  
- Determinant is either **+1** or **−1**.  
- Transpose is the inverse: $Q^{-1} = Q^\top$.  
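
These properties can be illustrated with a 2D rotation matrix, a standard example of an orthogonal matrix. The sketch below uses NumPy; the angle, the vectors `u` and `v`, and the reflection matrix `F` are arbitrary choices made for this example.

```python
import numpy as np

theta = np.pi / 6   # rotate by 30 degrees
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

u = np.array([3.0, 4.0])
v = np.array([-1.0, 2.0])

# Lengths are preserved: ||Q u|| equals ||u||.
length_preserved = np.isclose(np.linalg.norm(Q @ u), np.linalg.norm(u))

# Dot products (and hence angles) are preserved: (Q u) . (Q v) equals u . v.
angle_preserved = np.isclose((Q @ u) @ (Q @ v), u @ v)

# The transpose is the inverse.
transpose_is_inverse = np.allclose(np.linalg.inv(Q), Q.T)

# A rotation has determinant +1; a reflection has determinant -1.
F = np.array([[1.0, 0.0], [0.0, -1.0]])
det_Q = np.linalg.det(Q)
det_F = np.linalg.det(F)

print(length_preserved, angle_preserved, transpose_is_inverse, det_Q, det_F)
```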

$ I = \begin{bmatrix} 
1 & 0 & 0 \\ 
0 & 1 & 0 \\ 
0 & 0 & 1 
\end{bmatrix} $

##### **Orthogonality Verification of Identity Matrix $I_3$ Using PyTorch**

In [108]:
I = torch.tensor([[1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=torch.float )
I

tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])

In [109]:
column_1 = I[:,0]
column_1

tensor([1., 0., 0.])

In [110]:
column_2 = I[:,1]
column_2

tensor([0., 1., 0.])

In [111]:
column_3 = I[:,2]
column_3

tensor([0., 0., 1.])

**Compute Dot Product Between Columns of $I_3$**

In [112]:
torch.matmul(column_1, column_2)

tensor(0.)

In [113]:
torch.matmul(column_1, column_3)

tensor(0.)

In [114]:
torch.matmul(column_2, column_3)

tensor(0.)

> The columns of $I_3$ have been verified to be orthogonal.

**Compute the Norm of Columns of $I_3$**

In [115]:
torch.norm(column_1)

tensor(1.)

In [116]:
torch.norm(column_2)

tensor(1.)

In [117]:
torch.norm(column_3)

tensor(1.)

> The columns of $I_3$ are orthogonal and have unit norm; hence, they are orthonormal.

*The columns of $I_3$ are orthonormal, and since $I_3^T = I_3$, the rows are also orthonormal. Therefore, $I_3$ is an **orthogonal matrix**.*

$K = \begin{bmatrix} 
\frac{2}{3} & \frac{1}{3} & \frac{2}{3} \\ 
-\frac{2}{3} & \frac{2}{3} & \frac{1}{3} \\ 
\frac{1}{3} & \frac{2}{3} & -\frac{2}{3} 
\end{bmatrix}$

##### **Orthogonality Verification of Matrix $K$ Using TensorFlow**

In [118]:
K = tf.Variable([[2/3, 1/3, 2/3], [-2/3, 2/3, 1/3], [1/3, 2/3, -2/3]])
K

<tf.Variable 'Variable:0' shape=(3, 3) dtype=float32, numpy=
array([[ 0.6666667 ,  0.33333334,  0.6666667 ],
       [-0.6666667 ,  0.6666667 ,  0.33333334],
       [ 0.33333334,  0.6666667 , -0.6666667 ]], dtype=float32)>

In [119]:
Kcol_1 = K[:,0]
Kcol_1

<tf.Tensor: shape=(3,), dtype=float32, numpy=array([ 0.6666667 , -0.6666667 ,  0.33333334], dtype=float32)>

In [120]:
Kcol_2 = K[:,1]
Kcol_2

<tf.Tensor: shape=(3,), dtype=float32, numpy=array([0.33333334, 0.6666667 , 0.6666667 ], dtype=float32)>

In [121]:
Kcol_3 = K[:,2]
Kcol_3

<tf.Tensor: shape=(3,), dtype=float32, numpy=array([ 0.6666667 ,  0.33333334, -0.6666667 ], dtype=float32)>

**Compute Dot Product Between Columns of $K$**

In [122]:
tf.tensordot(Kcol_1, Kcol_2, axes=1)

<tf.Tensor: shape=(), dtype=float32, numpy=-3.311368956815386e-09>

In [123]:
tf.tensordot(Kcol_1, Kcol_3, axes=1)

<tf.Tensor: shape=(), dtype=float32, numpy=3.311368956815386e-09>

In [124]:
tf.tensordot(Kcol_2, Kcol_3, axes=1)

<tf.Tensor: shape=(), dtype=float32, numpy=6.622737913630772e-09>

> The columns of $K$ have been verified to be orthogonal.

**Compute the Norm of Columns of $K$**

In [125]:
tf.norm(Kcol_1)

<tf.Tensor: shape=(), dtype=float32, numpy=1.0>

In [126]:
tf.norm(Kcol_2)

<tf.Tensor: shape=(), dtype=float32, numpy=1.0>

In [127]:
tf.norm(Kcol_3)

<tf.Tensor: shape=(), dtype=float32, numpy=1.0>

> The columns of $K$ are orthogonal and have unit norm; hence, they are orthonormal.

*The checks above cover only the columns of $K$. Since $K^T \neq K$, orthonormality of the rows does not follow by symmetry as it did for $I_3$. For a square matrix, however, $K^T K = I$ holds exactly when the columns are orthonormal, and it implies $K K^T = I$ (orthonormal rows) as well, so a single matrix product verifies orthogonality in one line of code.*

**Verify orthogonality using $K^T K = I$**

In [128]:
tf.matmul(tf.transpose(K), K)

<tf.Tensor: shape=(3, 3), dtype=float32, numpy=
array([[ 1.0000001e+00, -3.3113690e-09,  3.3113690e-09],
       [-3.3113690e-09,  1.0000000e+00,  6.6227379e-09],
       [ 3.3113690e-09,  6.6227379e-09,  1.0000000e+00]], dtype=float32)>

*Orthogonality of $K$ is thus verified.*
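
The same verification can be written as tolerance-based boolean checks that also cover the rows. This is a minimal NumPy sketch with illustrative variable names; `K_demo` has the same entries as $K$.

```python
import numpy as np

K_demo = np.array([[ 2.0, 1.0,  2.0],
                   [-2.0, 2.0,  1.0],
                   [ 1.0, 2.0, -2.0]]) / 3.0

# Columns are orthonormal: K^T K = I.
cols_orthonormal = np.allclose(K_demo.T @ K_demo, np.eye(3))

# Rows are orthonormal: K K^T = I.
rows_orthonormal = np.allclose(K_demo @ K_demo.T, np.eye(3))

# The transpose acts as the inverse.
transpose_is_inverse = np.allclose(np.linalg.inv(K_demo), K_demo.T)

# The determinant of an orthogonal matrix is +1 or -1.
det_K = np.linalg.det(K_demo)

print(cols_orthonormal, rows_orthonormal, transpose_is_inverse, det_K)
```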

---