In [None]:
'''
 * Copyright (c) 2018 Radhamadhab Dalai
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 * THE SOFTWARE.
'''

# 📘 2.2 Algorithms for Matrix Factorization

Matrix factorizations are essential tools in numerical linear algebra, especially for computing estimates and diagnostics in linear models. They help in evaluating the rank, determinant, and inverse of a matrix, and statistical software relies on them for numerical stability.

---

## 🔹 Result 2.2.1: Full-Rank Factorization

Let $A \in \mathbb{R}^{m \times n}$ be a matrix of rank $r$. Then there exist matrices $B \in \mathbb{R}^{m \times r}$ and $C \in \mathbb{R}^{r \times n}$ such that:

$$
A = BC
\tag{2.2.1}
$$

### Construction:

Let $\{x_1, \dots, x_r\}$ be a basis for $\mathcal{C}(A)$ (the column space of $A$). Define:

$$
B = (x_1, \dots, x_r)
$$

Each column $a_j$ of $A$ can be written as:

$$
a_j = \sum_{i=1}^{r} c_{ij} x_i = B c_j
$$

Let:

$$
C = (c_1, \dots, c_n)
$$

Then:

$$
A = BC
$$

Alternatively, if $P$ and $Q$ are nonsingular matrices such that:

$$
A = P \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix} Q
$$

then taking $B$ to be the first $r$ columns of $P$ and $C$ to be the first $r$ rows of $Q$ gives, again:

$$
A = BC
$$
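
The construction above can be sketched in code. This is a minimal illustration, not the author's method: it assumes the basis for $\mathcal{C}(A)$ is taken to be the pivot columns of $A$, in which case the coefficient vectors $c_j$ are the columns of the nonzero rows of the reduced row echelon form. The helper names `rref` and `full_rank_factorization` and the example matrix are hypothetical, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(xr, yc)) for yc in zip(*Y)] for xr in X]

def rref(A):
    """Return (R, pivots): the reduced row echelon form of A and its pivot columns."""
    R = [[Fraction(x) for x in row] for row in A]
    m, n = len(R), len(R[0])
    pivots, row = [], 0
    for col in range(n):
        # Look for a nonzero entry in this column at or below the current row.
        pr = next((i for i in range(row, m) if R[i][col] != 0), None)
        if pr is None:
            continue
        R[row], R[pr] = R[pr], R[row]
        pv = R[row][col]
        R[row] = [x / pv for x in R[row]]
        # Clear the pivot column in every other row.
        for i in range(m):
            if i != row and R[i][col] != 0:
                f = R[i][col]
                R[i] = [x - f * y for x, y in zip(R[i], R[row])]
        pivots.append(col)
        row += 1
        if row == m:
            break
    return R, pivots

def full_rank_factorization(A):
    """B holds the pivot columns of A (m x r); C holds the nonzero rows of rref(A) (r x n)."""
    R, pivots = rref(A)
    B = [[Fraction(row[j]) for j in pivots] for row in A]
    C = R[:len(pivots)]
    return B, C

# Illustrative rank-2 matrix: the second row is twice the first.
A = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]
B, C = full_rank_factorization(A)
print("rank =", len(C))            # rank = 2
print("A == BC:", matmul(B, C) == A)
```

The factorization is not unique: any nonsingular $r \times r$ matrix $T$ gives another pair $BT$ and $T^{-1}C$.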

---

## 🔹 Result 2.2.2: QR Decomposition

Let $A$ be an $m \times n$ matrix of full column rank. Then there exists:

- $Q \in \mathbb{R}^{m \times n}$ with orthogonal columns
- $R \in \mathbb{R}^{n \times n}$ upper triangular and nonsingular

such that:

$$
A = QR
\tag{2.2.2}
$$

---

## 🔹 Gram–Schmidt Orthogonalization

Let $a_1, \dots, a_n$ be the columns of $A$. Define:

$$
b_1 = a_1
$$

For $i = 2, \dots, n$:

$$
b_i = a_i - \sum_{j=1}^{i-1} c_{ji} b_j
$$

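where $c_{ji}$ is the coefficient of the projection of $a_i$ onto $b_j$:

$$
c_{ji} = \frac{a_i^\top b_j}{b_j^\top b_j}
$$

Since $a_i = b_i + \sum_{j=1}^{i-1} c_{ji} b_j$, the factor $R$ is unit upper triangular:

$$
Q = (b_1, \dots, b_n), \quad R_{ji} = c_{ji} \ (j < i), \quad R_{ii} = 1, \quad R_{ji} = 0 \ (j > i)
$$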

and:

$$
A = QR
$$
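
A minimal sketch of this recursion in plain Python follows. It assumes $A$ has full column rank (so no $b_j$ is zero) and uses the projection coefficient $c_{ji} = a_i^\top b_j / b_j^\top b_j$; the columns of $Q$ are left unnormalized, so they are orthogonal rather than orthonormal and $R$ has ones on its diagonal. The function name `gram_schmidt_qr` is hypothetical.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def gram_schmidt_qr(A):
    """Classical Gram-Schmidt on the columns of A; returns (Q, R) with A = QR."""
    m, n = len(A), len(A[0])
    a_cols = [[float(A[i][j]) for i in range(m)] for j in range(n)]
    b_cols = []
    R = [[0.0] * n for _ in range(n)]
    for i, a_i in enumerate(a_cols):
        b_i = a_i[:]
        for j, b_j in enumerate(b_cols):
            c_ji = dot(a_i, b_j) / dot(b_j, b_j)   # projection coefficient
            R[j][i] = c_ji                          # stored above the diagonal
            b_i = [x - c_ji * y for x, y in zip(b_i, b_j)]
        R[i][i] = 1.0
        b_cols.append(b_i)
    Q = [list(row) for row in zip(*b_cols)]         # b_1..b_n as columns
    return Q, R

A = [[3, 2], [2, 3], [1, 0]]
Q, R = gram_schmidt_qr(A)
QR = [[sum(Q[i][k] * R[k][j] for k in range(len(R))) for j in range(len(R))]
      for i in range(len(Q))]
q1 = [row[0] for row in Q]
q2 = [row[1] for row in Q]
print("q1 . q2 =", round(dot(q1, q2), 12))
```

Dividing each $b_i$ by its norm and scaling row $i$ of $R$ by the same norm gives the orthonormal form of the decomposition. In floating point, classical Gram–Schmidt can lose orthogonality for ill-conditioned $A$; the modified variant is the usual remedy.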

---

The QR decomposition is widely used in linear regression for computing stable estimates of coefficients, and is implemented in most statistical software packages.


# 📘 2.2 Matrix Diagonalization and Eigenstructure

Matrix diagonalization plays a central role in linear model theory, especially in understanding projections, diagnostics, and the behavior of symmetric and positive definite matrices. These results are foundational in numerical linear algebra and statistical computing (see Golub and Van Loan (1989), Stewart (1973)).

---

## 🔹 Definition 2.2.1: Diagonalizability

An $n \times n$ matrix $A$ is said to be **diagonalizable** if there exists a nonsingular matrix $Q$ such that:

$$
Q^{-1} A Q = D
\tag{2.2.3}
$$

where $D$ is a diagonal matrix. Equivalently:

$$
A = Q D Q^{-1}
$$

This process is called **diagonalization**, and is closely related to the eigensystem of $A$.

---

## 🔹 Definition 2.2.2: Orthogonal Diagonalizability

An $n \times n$ matrix $A$ is **orthogonally diagonalizable** if there exists an orthogonal matrix $P$ such that:

$$
P^\top A P = D
$$

where $D$ is diagonal.

---

## 🔹 Result 2.2.3: Properties of Diagonalization

Let $A$ be an $n \times n$ matrix. Suppose there exists a nonsingular matrix $Q$ such that:

$$
Q^{-1} A Q = D = \text{diag}(\lambda_1, \dots, \lambda_n)
$$

Let $Q = (q_1, \dots, q_n)$. Then:

### 1. Rank:
$$
r(A) = \text{number of nonzero } \lambda_i
$$

### 2. Determinant:
$$
|A| = \prod_{i=1}^{n} \lambda_i = |D|
$$

### 3. Trace:
$$
\text{tr}(A) = \sum_{i=1}^{n} \lambda_i = \text{tr}(D)
$$

### 4. Characteristic Polynomial:
$$
P(\lambda) = |A - \lambda I| = (-1)^n \prod_{i=1}^{n} (\lambda - \lambda_i)
$$

### 5. Eigenvalues:
The eigenvalues of $A$ are $\lambda_1, \dots, \lambda_n$, which may be zero or repeated.

### 6. Eigenvectors:
The columns of $Q$ are linearly independent eigenvectors of $A$, with:

$$
A q_i = \lambda_i q_i, \quad i = 1, \dots, n
$$
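
These properties can be checked on a small case. The sketch below is only an illustration: the matrix $A$ and the eigenvector matrix $Q$ are assumed example values, and `inverse_2x2` is a hypothetical helper that works for $2 \times 2$ matrices only. Exact fractions keep the comparison free of rounding.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(xr, yc)) for yc in zip(*Y)] for xr in X]

def inverse_2x2(M):
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[2, 2], [2, -1]]
Q = [[2, 1], [1, -2]]      # assumed: columns are eigenvectors of A

# D = Q^{-1} A Q should be diagonal with the eigenvalues on its diagonal.
D = matmul(matmul(inverse_2x2(Q), A), Q)
lams = [D[0][0], D[1][1]]

rank = sum(1 for lam in lams if lam != 0)          # property 1
det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]      # property 2
tr_A = A[0][0] + A[1][1]                           # property 3

print("D =", [[str(x) for x in row] for row in D])
print("rank:", rank, " det:", det_A, " trace:", tr_A)
```

Here $D = \text{diag}(3, -2)$, so the rank is 2, the determinant is $3 \cdot (-2) = -6$, and the trace is $3 + (-2) = 1$.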

---

These results form the backbone of spectral analysis and are essential for understanding matrix behavior in regression, PCA, and multivariate statistics.

# 📘 Result 2.2.7: Singular Value Decomposition (SVD)

Let $A \in \mathbb{R}^{m \times n}$ be a matrix of rank $r$. Then $A$ admits the decomposition:

$$
A = P 
\begin{pmatrix}
D_1 & 0 \\
0 & 0
\end{pmatrix}
Q^\top = P_1 D_1 Q_1^\top
$$

or equivalently:

$$
A = \sum_{i=1}^{r} d_i p_i q_i^\top
$$

---

## 🔹 Components of the SVD

- $P \in \mathbb{R}^{m \times m}$ is an orthogonal matrix  
- $Q \in \mathbb{R}^{n \times n}$ is an orthogonal matrix  
- $D_1 = \text{diag}(d_1, \dots, d_r)$ with $d_1 \ge d_2 \ge \dots \ge d_r > 0$  
- $P_1 = (p_1, \dots, p_r)$ are the first $r$ columns of $P$  
- $Q_1 = (q_1, \dots, q_r)$ are the first $r$ columns of $Q$  

---

## 🔹 Singular Values

The scalars $d_1, \dots, d_r$ are called the **singular values** of $A$. They are the positive square roots of the nonzero eigenvalues of $A^\top A$ and are invariant under the choice of $P$ and $Q$.

---

## 🔹 Eigenstructure

- The columns of $P$ are eigenvectors of $A A^\top$  
  - The first $r$ columns correspond to the nonzero eigenvalues $d_1^2, \dots, d_r^2$  
  - The remaining $m - r$ columns correspond to zero eigenvalues  

- The columns of $Q$ are eigenvectors of $A^\top A$  
  - The first $r$ columns correspond to the nonzero eigenvalues $d_1^2, \dots, d_r^2$  
  - The remaining $n - r$ columns correspond to zero eigenvalues  
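
The relations above can be verified numerically on a case where the SVD is known in closed form. The matrix and its factors below are assumed illustrative values (not taken from the text): for $A = \begin{pmatrix} 2 & 2 \\ -1 & 1 \end{pmatrix}$ the singular values are $2\sqrt{2}$ and $\sqrt{2}$.

```python
import math

def transpose(X):
    return [list(row) for row in zip(*X)]

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(xr, yc)) for yc in zip(*Y)] for xr in X]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

A = [[2, 2], [-1, 1]]
s = 1 / math.sqrt(2)
d = [2 * math.sqrt(2), math.sqrt(2)]   # singular values d_1 >= d_2 > 0
P = [[1, 0], [0, -1]]                  # columns p_1, p_2
Q = [[s, s], [s, -s]]                  # columns q_1, q_2

p_cols, q_cols = transpose(P), transpose(Q)

# Rank-one expansion: A = sum_i d_i p_i q_i^T
A_sum = [[sum(d[k] * p_cols[k][i] * q_cols[k][j] for k in range(2))
          for j in range(2)] for i in range(2)]

# Eigen-relations: (A^T A) q_i = d_i^2 q_i and (A A^T) p_i = d_i^2 p_i
AtA = matmul(transpose(A), A)
AAt = matmul(A, transpose(A))
residuals = []
for k in range(2):
    for M, v in ((AtA, q_cols[k]), (AAt, p_cols[k])):
        Mv = matvec(M, v)
        residuals.append(max(abs(Mv[i] - d[k] ** 2 * v[i]) for i in range(2)))

print("max residual:", max(residuals))
```

Flipping the sign of $p_i$ and $q_i$ together leaves every term $d_i p_i q_i^\top$ unchanged, which is why the singular vectors are determined only up to such paired choices.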

---

## 🔹 Uniqueness

Once the first $r$ columns of $P$ are specified, the first $r$ columns of $Q$ are uniquely determined, and vice versa.

For further details, see Harville (1997), Section 21.12.






In [1]:
import math

# -----------------------------
# Basic Matrix Operations
# -----------------------------
def matmul(A, B):
    return [[sum(a * b for a, b in zip(A_row, B_col)) for B_col in zip(*B)] for A_row in A]

def transpose(A):
    return [list(row) for row in zip(*A)]

def scalar_mul(scalar, M):
    return [[scalar * val for val in row] for row in M]

def normalize(v):
    norm = math.sqrt(sum(x**2 for x in v))
    return [x / norm for x in v]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def outer(u, v):
    return [[ui * vj for vj in v] for ui in u]

def print_matrix(M, label="Matrix"):
    print(f"\n🔹 {label}:")
    for row in M:
        print("  ".join(f"{val:8.4f}" for val in row))
    print()

# -----------------------------
# Eigen-decomposition of AᵗA (2x2 only)
# -----------------------------
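def eigen_decompose_2x2(M):
    """Eigenvalues (largest first) and unit eigenvectors of a 2x2 matrix with real eigenvalues."""
    a, b = M[0][0], M[0][1]
    c, d = M[1][0], M[1][1]
    if b == 0 and c == 0:
        # Diagonal input: the standard basis vectors are already eigenvectors.
        if a >= d:
            return [a, d], [[1.0, 0.0], [0.0, 1.0]]
        return [d, a], [[0.0, 1.0], [1.0, 0.0]]
    trace = a + d
    # Same value as trace**2 - 4*det, but never negative when M is symmetric (b == c).
    disc = math.sqrt((a - d)**2 + 4 * b * c)
    lambda1 = (trace + disc) / 2
    lambda2 = (trace - disc) / 2

    def eigenvector(lam):
        # Solve (M - lam*I) v = 0 from the row with the larger off-diagonal entry.
        a_s, d_s = a - lam, d - lam
        if abs(b) > abs(c):
            v = [1, -a_s / b]      # first row:  a_s*v1 + b*v2 = 0
        else:
            v = [-d_s / c, 1]      # second row: c*v1 + d_s*v2 = 0
        return normalize(v)

    v1 = eigenvector(lambda1)
    v2 = eigenvector(lambda2)
    return [lambda1, lambda2], [v1, v2]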

# -----------------------------
# SVD via Eigen-decomposition
# -----------------------------
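def svd(A):
    """Thin SVD A = P D Qᵗ of an m x 2 matrix with full column rank."""
    At = transpose(A)
    AtA = matmul(At, A)
    lambdas, Q_cols = eigen_decompose_2x2(AtA)
    # Eigenvalues of AᵗA are nonnegative in exact arithmetic; clamp rounding noise.
    singular_values = [math.sqrt(max(l, 0.0)) for l in lambdas]
    if min(singular_values) == 0:
        raise ValueError("A is rank deficient; P = A Q D^-1 is undefined")

    D_inv = [[1/singular_values[0], 0], [0, 1/singular_values[1]]]
    Q = transpose(Q_cols)        # eigenvectors become the columns of Q
    AQ = matmul(A, Q)
    P = matmul(AQ, D_inv)        # p_i = A q_i / d_i
    D = [[singular_values[0], 0], [0, singular_values[1]]]
    return P, D, Q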

# -----------------------------
# Example Matrix
# -----------------------------
A = [
    [3, 2],
    [2, 3],
    [1, 0]
]

P, D, Q = svd(A)

print_matrix(P, "Left Singular Vectors (P)")
print_matrix(D, "Singular Values (D)")
print_matrix(Q, "Right Singular Vectors (Q)")



🔹 Left Singular Vectors (P):
  0.7028   -0.5189
  0.6969    0.6396
  0.1429   -0.5672


🔹 Singular Values (D):
  5.0508    0.0000
  0.0000    1.2205


🔹 Right Singular Vectors (Q):
  0.7217   -0.6922
  0.6922    0.7217
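
As a sanity check, the printed factors can be multiplied back together. The sketch below copies the rounded values from the output above, so the reconstruction is only expected to match $A$ to roughly the printed precision; the tolerance used is an assumption chosen to allow for that rounding.

```python
def transpose(X):
    return [list(row) for row in zip(*X)]

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(xr, yc)) for yc in zip(*Y)] for xr in X]

# Values copied from the printed output (rounded to 4 decimals).
P = [[0.7028, -0.5189], [0.6969, 0.6396], [0.1429, -0.5672]]
D = [[5.0508, 0.0], [0.0, 1.2205]]
Q = [[0.7217, -0.6922], [0.6922, 0.7217]]
A = [[3, 2], [2, 3], [1, 0]]

# Reconstruct A = P D Q^T and measure the largest entrywise error.
A_hat = matmul(matmul(P, D), transpose(Q))
max_err = max(abs(A_hat[i][j] - A[i][j]) for i in range(3) for j in range(2))

# The columns of P should be (nearly) orthonormal.
PtP = matmul(transpose(P), P)

print("max reconstruction error:", round(max_err, 4))
print("p1 . p2:", round(PtP[0][1], 4))
```

Because $P$ here is $3 \times 2$, this is the thin form $A = P_1 D_1 Q_1^\top$ of the decomposition rather than the full form with a $3 \times 3$ orthogonal $P$.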



# 📘 2.3 Symmetric and Idempotent Matrices

---

## 🔹 Definition: Symmetric Matrix

An $n \times n$ matrix $A$ is **symmetric** if:

$$
A^\top = A
$$

---

## 🔹 Result 2.3.1: Real Eigenvalues of Symmetric Matrices

Let $A$ be a real symmetric matrix. Then all eigenvalues of $A$ are real.

**Proof Sketch:**

Let $\lambda$ be an eigenvalue of $A$ with eigenvector $x$ (possibly complex). Then:

$$
x^* A x = x^* (\lambda x) = \lambda x^* x
\tag{2.3.1}
$$

Also, since $A$ is real and symmetric, $A^* = A$, so:

$$
x^* A x = (A x)^* x = (\lambda x)^* x = \lambda^* x^* x
\tag{2.3.2}
$$

Equating (2.3.1) and (2.3.2), and using $x^* x > 0$ because $x \neq 0$:

$$
\lambda x^* x = \lambda^* x^* x \Rightarrow \lambda = \lambda^*
$$

So $\lambda$ is real.

---

## 🔹 Result 2.3.2: Orthogonality of Eigenvectors

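Let $A$ be symmetric, and let $x_1$ and $x_2$ be eigenvectors of $A$ corresponding to distinct eigenvalues $\lambda_1$ and $\lambda_2$. Then:

$$
x_1^\top x_2 = 0
$$

So $x_1$ and $x_2$ are orthogonal. This follows from $\lambda_1 x_1^\top x_2 = (A x_1)^\top x_2 = x_1^\top A x_2 = \lambda_2 x_1^\top x_2$, which forces $x_1^\top x_2 = 0$ when $\lambda_1 \neq \lambda_2$.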

---

## 🔹 Example 2.3.1

Let:

$$
A = \begin{pmatrix}
2 & 2 \\
2 & -1
\end{pmatrix}
$$

Compute the characteristic polynomial:

$$
|A - \lambda I| = (2 - \lambda)(-1 - \lambda) - 4 = 0
$$

Solving:

$$
\lambda = 3, \quad \lambda = -2
$$

Corresponding eigenvectors:

$$
p_1 = \begin{pmatrix} 2 \\ 1 \end{pmatrix}, \quad
p_2 = \begin{pmatrix} 1 \\ -2 \end{pmatrix}
$$

Clearly:

$$
p_1^\top p_2 = 0
$$

So they are orthogonal.
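
The arithmetic in this example is easy to confirm with integer operations. The short check below is a sketch using the values stated above; `matvec` is a helper introduced here for illustration.

```python
def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

A = [[2, 2], [2, -1]]
pairs = [(3, [2, 1]), (-2, [1, -2])]   # (eigenvalue, eigenvector) from the example

# Check A p = lambda p for each pair.
checks = [matvec(A, p) == [lam * x for x in p] for lam, p in pairs]

# Check orthogonality of the two eigenvectors.
inner = sum(x * y for x, y in zip(pairs[0][1], pairs[1][1]))

print("A p = lambda p holds:", checks)   # [True, True]
print("p1 . p2 =", inner)                # 0
```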

---

## 🔹 Result 2.3.3: Orthonormal Eigenvectors in Invariant Subspace

Let $A$ be symmetric and $V \subset \mathbb{R}^n$ be an invariant subspace under $A$ with $\dim(V) = k$. Then $V$ contains $k$ orthonormal eigenvectors of $A$.

**Proof Sketch:**

Use induction on $k$.  
- Base case: $k = 0$ is trivial.  
- Inductive step: find an eigenvector $q$ in $V$, construct $V_1 = \{v \in V : v \perp q\}$, show $V_1$ is invariant under $A$, and apply the hypothesis.

---

## 🔹 Result 2.3.4: Spectral Decomposition

Let $A$ be symmetric with eigenvalues $\lambda_1, \dots, \lambda_n$ and orthonormal eigenvectors $p_1, \dots, p_n$. Then:

### Spectral Form:
$$
A = \sum_{k=1}^{n} \lambda_k p_k p_k^\top
\tag{2.3.4}
$$

### Diagonalization:
$$
A = P D P^\top, \quad \text{where } P = (p_1, \dots, p_n), \quad D = \text{diag}(\lambda_1, \dots, \lambda_n)
\tag{2.3.3}
$$

So every symmetric matrix is orthogonally diagonalizable.
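
The spectral form can be illustrated with the matrix from Example 2.3.1. The sketch below assumes the eigenpairs given there and normalizes each eigenvector implicitly by dividing the outer product $v v^\top$ by $v^\top v$, so that $p_k p_k^\top = v v^\top / (v^\top v)$; fractions keep the result exact.

```python
from fractions import Fraction

lams = [3, -2]
vecs = [[2, 1], [1, -2]]     # unnormalized eigenvectors from Example 2.3.1

# Accumulate A = sum_k lambda_k * p_k p_k^T with p_k = v_k / ||v_k||.
A_rebuilt = [[Fraction(0)] * 2 for _ in range(2)]
for lam, v in zip(lams, vecs):
    norm_sq = sum(x * x for x in v)
    for i in range(2):
        for j in range(2):
            A_rebuilt[i][j] += Fraction(lam * v[i] * v[j], norm_sq)

print([[str(x) for x in row] for row in A_rebuilt])   # [['2', '2'], ['2', '-1']]
```

Each term $p_k p_k^\top$ is a rank-one orthogonal projection onto the span of $p_k$, so the spectral form writes $A$ as a weighted sum of mutually orthogonal projections.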

---



## 🔹 Result 2.3.5: Trace Properties of Symmetric Matrices

Let $ A \in \mathbb{R}^{n \times n} $ be symmetric with eigenvalues $ \lambda_1, \dots, \lambda_n $. Then:

1. **Trace**:
$$
\text{tr}(A) = \sum_{i=1}^{n} \lambda_i
$$

2. **Power Trace** (for any nonnegative integer $ s $):
$$
\text{tr}(A^s) = \sum_{i=1}^{n} \lambda_i^s
$$

3. **Inverse Trace** (if $ A $ is nonsingular):
$$
\text{tr}(A^{-1}) = \sum_{i=1}^{n} \frac{1}{\lambda_i}
$$

---

## 🔹 Definition 2.3.1: Idempotent Matrix

An $ n \times n $ matrix $ A $ is **idempotent** if:
$$
A^2 = A
$$

If $ A $ is also symmetric, that is, if:
$$
A^\top = A \quad \text{and} \quad A^2 = A
$$
then $ A $ is called a **symmetric idempotent** matrix.

**Examples**:
- Identity matrix: $ I_n $
- Averaging matrix: $ J_n = \frac{1}{n} \mathbf{1}_n \mathbf{1}_n^\top $
- Centering matrix: $ C_n = I_n - J_n $
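
These three examples can be checked directly. The sketch below builds them for an assumed size $n = 4$ with exact fractions, confirms $M^2 = M$ for each, and applies the centering matrix to an illustrative data vector.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(xr, yc)) for yc in zip(*Y)] for xr in X]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

n = 4
I = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
J = [[Fraction(1, n) for _ in range(n)] for _ in range(n)]       # averaging matrix
C = [[I[i][j] - J[i][j] for j in range(n)] for i in range(n)]    # centering matrix

for name, M in [("I", I), ("J", J), ("C", C)]:
    print(name, "idempotent:", matmul(M, M) == M, " trace:", trace(M))

# Centering subtracts the mean (here 3) from every entry.
x = [1, 2, 3, 6]
Cx = [sum(C[i][j] * x[j] for j in range(n)) for i in range(n)]
print("C x =", [str(v) for v in Cx])   # ['-2', '-1', '0', '3']
```

The traces come out as 4, 1, and 3, which match the ranks $n$, $1$, and $n - 1$ of $I_n$, $J_n$, and $C_n$.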

---

## 🔹 Result 2.3.6: Properties of Idempotent Matrices

1. $ A^\top $ is idempotent if and only if $ A $ is idempotent  
2. $ I - A $ is idempotent if and only if $ A $ is idempotent  
3. If $ A $ is idempotent, then it can be diagonalized:
$$
Q^{-1} A Q = D, \quad \text{where } D = \text{diag}(1, \dots, 1, 0, \dots, 0)
$$

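Since the rank and the trace of $ A $ both equal the number of ones on the diagonal of $ D $, and $ I_n - A $ is itself idempotent: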
$$
r(A) = \text{tr}(A), \quad r(I_n - A) = n - \text{tr}(A)
$$

4. If $ A $ is idempotent and $ r(A) = n $, then:
$$
A = I_n
$$

---

## 🔹 Result 2.3.7: Eigenvalues of Symmetric Idempotent Matrices

Let $ A \in \mathbb{R}^{n \times n} $ be symmetric and idempotent of rank $ m $. Then:

- $ m $ eigenvalues of $ A $ are equal to 1  
- $ n - m $ eigenvalues of $ A $ are equal to 0

That is:
$$
\text{Spec}(A) = \{ \underbrace{1, \dots, 1}_{m}, \underbrace{0, \dots, 0}_{n - m} \}
$$

---

## 🔹 Result 2.3.8: Cauchy’s Interlacing Theorem

Let $ A \in \mathbb{R}^{n \times n} $ be symmetric with eigenvalues $ \lambda_1(A) \le \dots \le \lambda_n(A) $, and let $ A_1 $ be an $ (n-1) \times (n-1) $ principal submatrix with eigenvalues $ \lambda_1(A_1) \le \dots \le \lambda_{n-1}(A_1) $. Then:

$$
\lambda_i(A) \le \lambda_i(A_1) \le \lambda_{i+1}(A), \quad \text{for } i = 1, \dots, n-1
$$
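
For $n = 2$ the theorem says that each diagonal entry of a symmetric matrix lies between its two eigenvalues, since the $1 \times 1$ principal submatrices are just the diagonal entries. The sketch below checks this on an assumed example matrix; `eig_sym_2x2` is a hypothetical helper using the closed-form eigenvalues of a symmetric $2 \times 2$ matrix.

```python
import math

def eig_sym_2x2(M):
    """Eigenvalues of a symmetric 2x2 matrix, in ascending order."""
    a, b, d = M[0][0], M[0][1], M[1][1]
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return [mean - radius, mean + radius]

A = [[2, 2], [2, -1]]
lam = eig_sym_2x2(A)

# Deleting row and column k leaves the 1x1 submatrix holding the other diagonal entry.
interlaced = []
for k in range(2):
    mu = A[1 - k][1 - k]
    interlaced.append(lam[0] <= mu <= lam[1])

print("eigenvalues of A:", lam)
print("interlacing holds for each principal submatrix:", interlaced)
```

For larger $n$ the same check needs a general symmetric eigenvalue solver, which is outside the scope of this sketch.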

---

These results are foundational for understanding projections, regression diagnostics, and the spectral behavior of symmetric matrices in linear models.


In [2]:
import math

# -----------------------------
# Basic Matrix Operations
# -----------------------------
def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(A_row, B_col)) for B_col in zip(*B)] for A_row in A]

def is_symmetric(A):
    return A == transpose(A)

def is_idempotent(A):
    A2 = matmul(A, A)
    return all(abs(A2[i][j] - A[i][j]) < 1e-8 for i in range(len(A)) for j in range(len(A)))

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

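def power_trace(A, s):
    # Start from the identity so that s = 0 gives tr(I) = n.
    n = len(A)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(s):
        result = matmul(result, A)
    return trace(result)

def inverse_trace(eigenvalues):
    # tr(A^-1) = sum of 1/lambda_i, defined only when every eigenvalue is nonzero.
    if any(lam == 0 for lam in eigenvalues):
        raise ValueError("matrix is singular; tr(A^-1) is undefined")
    return sum(1 / lam for lam in eigenvalues)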

# -----------------------------
# Eigen-decomposition (2x2 only)
# -----------------------------
def eigen_decompose_2x2(A):
    a, b = A[0][0], A[0][1]
    c, d = A[1][0], A[1][1]
    trace_val = a + d
    det = a * d - b * c
    disc = math.sqrt(trace_val**2 - 4 * det)
    lam1 = (trace_val + disc) / 2
    lam2 = (trace_val - disc) / 2
    return [lam1, lam2]

# -----------------------------
# Example Matrix
# -----------------------------
A = [
    [2, 2],
    [2, -1]
]

print("\n🔹 Matrix A:")
for row in A:
    print(row)

# Check symmetry and idempotency
print("\nIs symmetric:", is_symmetric(A))
print("Is idempotent:", is_idempotent(A))

# Eigenvalues
eigenvals = eigen_decompose_2x2(A)
print("\nEigenvalues of A:", [round(l, 4) for l in eigenvals])

# Trace properties
print("Trace from matrix:", trace(A))
print("Trace from eigenvalues:", sum(eigenvals))
print("Trace of A²:", power_trace(A, 2))
print("Trace of A⁻¹ (if nonsingular):", inverse_trace(eigenvals))

# Idempotent matrix example
B = [
    [1, 0],
    [0, 0]
]

print("\n🔹 Matrix B (Idempotent):")
for row in B:
    print(row)

print("Is symmetric:", is_symmetric(B))
print("Is idempotent:", is_idempotent(B))
print("Trace of B:", trace(B))
print("Eigenvalues of B:", eigen_decompose_2x2(B))



🔹 Matrix A:
[2, 2]
[2, -1]

Is symmetric: True
Is idempotent: False

Eigenvalues of A: [3.0, -2.0]
Trace from matrix: 1
Trace from eigenvalues: 1.0
Trace of A²: 13
Trace of A⁻¹ (if nonsingular): -0.16666666666666669

🔹 Matrix B (Idempotent):
[1, 0]
[0, 0]
Is symmetric: True
Is idempotent: True
Trace of B: 1
Eigenvalues of B: [1.0, 0.0]
