In [None]:
'''
 * Copyright (c) 2018 Radhamadhab Dalai
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 * THE SOFTWARE.
'''

# 2.4 Nonnegative Definite Quadratic Forms and Matrices

This notebook introduces quadratic forms and matrices of quadratic forms, describing their fundamental properties and definitions.

## Linear and Bilinear Forms

### Definition 2.4.1: Linear Form in x

Given an arbitrary vector $\mathbf{a} = (a_1, \ldots, a_n)^T$, a linear form in $\mathbf{x} = (x_1, \ldots, x_n)^T$ is a function that assigns to each vector $\mathbf{x} \in \mathbb{R}^n$ the value:

$$\mathbf{a}^T\mathbf{x} = \sum_{i=1}^{n} a_i x_i = a_1 x_1 + \cdots + a_n x_n \tag{2.4.1}$$

**Key Properties:**
- The linear form $\mathbf{a}^T\mathbf{x}$ can also be written as $\mathbf{x}^T\mathbf{a}$
- It is a homogeneous polynomial of degree 1 with coefficient vector $\mathbf{a}$
- Two linear forms $\mathbf{a}^T\mathbf{x}$ and $\mathbf{b}^T\mathbf{x}$ are identically equal for all $\mathbf{x}$ if and only if $\mathbf{a} = \mathbf{b}$

**Example:** $4x_1 + 5x_2 - 3x_3$ is a linear form in $\mathbf{x} = (x_1, x_2, x_3)^T$ with coefficient vector $\mathbf{a} = (4, 5, -3)^T$

### Definition 2.4.2: Bilinear Form in x and y

Given an arbitrary $m \times n$ matrix $A = \{a_{ij}\}$, a bilinear form is a function that assigns to each pair of vectors $\mathbf{x} = (x_1, \ldots, x_m)^T$ and $\mathbf{y} = (y_1, \ldots, y_n)^T$ the value:

$$\mathbf{x}^T A \mathbf{y} = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} x_i y_j \tag{2.4.2}$$

where $A$ is the matrix of the bilinear form.

**Key Properties:**
- The form can also be written as $\mathbf{y}^T A^T \mathbf{x}$
- Two bilinear forms $\mathbf{x}^T A \mathbf{y}$ and $\mathbf{x}^T B \mathbf{y}$ are identically equal if and only if $A = B$
- A bilinear form $\mathbf{x}^T A \mathbf{y}$ is **symmetric** if $\mathbf{x}^T A \mathbf{y} = \mathbf{y}^T A \mathbf{x}$ for all $\mathbf{x}$ and $\mathbf{y}$, which occurs if and only if the matrix is (square) symmetric: $A = A^T$. (The identity $\mathbf{x}^T A \mathbf{y} = \mathbf{y}^T A^T \mathbf{x}$, by contrast, holds for every $A$.)

#### Example 2.4.1

The expression $x_1 y_1 + 2x_1 y_2 + 4x_2 y_1 + 7x_2 y_2 + 2x_3 y_1 - 2x_3 y_2$ is a bilinear form in $\mathbf{x} = (x_1, x_2, x_3)^T$ and $\mathbf{y} = (y_1, y_2)^T$, with matrix:

$$A = \begin{pmatrix} 1 & 2 \\ 4 & 7 \\ 2 & -2 \end{pmatrix}$$

An example of a **symmetric** bilinear form in $\mathbf{x} = (x_1, x_2, x_3)^T$ and $\mathbf{y} = (y_1, y_2, y_3)^T$ is:
$x_1 y_1 + 2x_1 y_2 - 3x_1 y_3 + 2x_2 y_1 + 7x_2 y_2 + 6x_2 y_3 - 3x_3 y_1 + 6x_3 y_2 + 5x_3 y_3$

with symmetric matrix:
$$A = \begin{pmatrix} 1 & 2 & -3 \\ 2 & 7 & 6 \\ -3 & 6 & 5 \end{pmatrix}$$
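
As a quick numerical check of Example 2.4.1, the sketch below evaluates $\mathbf{x}^T A \mathbf{y}$ directly from the double sum in (2.4.2) and compares it with the expanded polynomial. The helper name `bilinear_form` and the test vectors are illustrative assumptions, not part of the original text.

```python
def bilinear_form(x, A, y):
    """Evaluate x^T A y as the double sum in (2.4.2)."""
    return sum(A[i][j] * x[i] * y[j]
               for i in range(len(x)) for j in range(len(y)))

# Matrix of the first bilinear form in Example 2.4.1 (3 x 2)
A = [[1, 2], [4, 7], [2, -2]]
x, y = [1, -1, 2], [3, 1]            # arbitrary illustrative vectors

# Expand the polynomial by hand and compare with the matrix form
x1, x2, x3 = x
y1, y2 = y
poly = x1*y1 + 2*x1*y2 + 4*x2*y1 + 7*x2*y2 + 2*x3*y1 - 2*x3*y2
print(bilinear_form(x, A, y), poly)  # -6 -6

# Symmetric example: swapping the two arguments leaves the value unchanged
S = [[1, 2, -3], [2, 7, 6], [-3, 6, 5]]
u, v = [1, 0, 2], [-1, 3, 1]
print(bilinear_form(u, S, v), bilinear_form(v, S, u))  # 54 54
```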

## Quadratic Forms

### Definition 2.4.3: Quadratic Form in x

Given an arbitrary $n \times n$ matrix $A = \{a_{ij}\}$, a quadratic form is a function that assigns to each vector $\mathbf{x} = (x_1, \ldots, x_n)^T \in \mathbb{R}^n$ the value:

$$\mathbf{x}^T A \mathbf{x} = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j \tag{2.4.3}$$

which is a **homogeneous polynomial of degree two**.

#### Example 2.4.2

The expression $x_1^2 + 7x_2^2 + 4x_3^2 + 4x_1 x_2 + 10x_1 x_3 - 4x_2 x_3$ is a quadratic form in $\mathbf{x} = (x_1, x_2, x_3)^T$, with matrix:

$$A = \begin{pmatrix} 1 & 2 & 5 \\ 2 & 7 & -2 \\ 5 & -2 & 4 \end{pmatrix}$$

**Important Properties:**
- When $\mathbf{x} = \mathbf{0}$, then $\mathbf{x}^T A \mathbf{x} = 0$ for all $A$
- Let $A = \{a_{ij}\}$ and $B = \{b_{ij}\}$ be two arbitrary $n \times n$ matrices. Then $\mathbf{x}^T A \mathbf{x}$ and $\mathbf{x}^T B \mathbf{x}$ are identically equal if and only if $A + A^T = B + B^T$
- If $A$ and $B$ are symmetric matrices, then $\mathbf{x}^T A \mathbf{x}$ and $\mathbf{x}^T B \mathbf{x}$ are identically equal if and only if $A = B$
- For any matrix $A$, note that $C = \frac{A + A^T}{2}$ is always symmetric and $\mathbf{x}^T A \mathbf{x} = \mathbf{x}^T C \mathbf{x}$

**Without loss of generality**, we may assume that corresponding to a given quadratic form, there exists a unique symmetric matrix $A$ which is the matrix of that quadratic form.
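
The symmetrization step can be illustrated with a small sketch. It starts from a deliberately non-symmetric (upper triangular) matrix that generates the quadratic form of Example 2.4.2 and recovers the symmetric matrix shown there. The function names and the test vector are assumptions made for this illustration.

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def quadratic_form(A, x):
    """Evaluate x^T A x as the double sum in (2.4.3)."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def symmetrize(A):
    """Return C = (A + A^T) / 2."""
    At = transpose(A)
    n = len(A)
    return [[(A[i][j] + At[i][j]) / 2 for j in range(n)] for i in range(n)]

# Non-symmetric matrix giving the same polynomial as Example 2.4.2
A = [[1, 4, 10], [0, 7, -4], [0, 0, 4]]
C = symmetrize(A)
x = [1, 2, -1]                       # arbitrary illustrative vector

print(C)   # [[1.0, 2.0, 5.0], [2.0, 7.0, -2.0], [5.0, -2.0, 4.0]]
print(quadratic_form(A, x), quadratic_form(C, x))   # 39 39.0
```

Both matrices give the same value for every $\mathbf{x}$, which is why only the symmetric representative needs to be studied.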

### Congruent Matrices

Let $\mathbf{x}^T A \mathbf{x}$ be a quadratic form in $\mathbf{x}$ and let $\mathbf{y} = C^{-1} \mathbf{x}$, where $C$ is an $n \times n$ nonsingular matrix. Then:

$$\mathbf{x}^T A \mathbf{x} = \mathbf{y}^T C^T A C \mathbf{y} = \mathbf{y}^T B \mathbf{y}, \quad \text{where } B = C^T A C$$

We refer to $A$ and $B$ as **congruent matrices**.
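
A minimal sketch of the change of variables, assuming an illustrative $2 \times 2$ matrix $A$ and a hand-picked nonsingular $C$ whose inverse is written out explicitly (none of these values come from the original text):

```python
def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def quadratic_form(A, x):
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

A = [[4, 2], [2, 3]]
C = [[1, 1], [0, 1]]          # nonsingular (det = 1)
C_inv = [[1, -1], [0, 1]]     # its inverse, written out by hand

B = matmul(transpose(C), matmul(A, C))    # B = C^T A C, congruent to A
x = [3, -2]
y = [sum(C_inv[i][k] * x[k] for k in range(2)) for i in range(2)]   # y = C^{-1} x

print(B)                                          # [[4, 6], [6, 11]]
print(quadratic_form(A, x), quadratic_form(B, y)) # 24 24
```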

## Definiteness Classifications

### Definition 2.4.4: Nonnegative Definite Quadratic Form

An arbitrary quadratic form $\mathbf{x}^T A \mathbf{x}$ is said to be **nonnegative definite (n.n.d.)** if:

$\mathbf{x}^T A \mathbf{x} \geq 0 \text{ for every vector } \mathbf{x} \in \mathbb{R}^n$

The matrix $A$ is called a **nonnegative definite (n.n.d.) matrix**.

### Definition 2.4.5: Positive Definite Quadratic Form

A nonnegative definite quadratic form $\mathbf{x}^T A \mathbf{x}$ is said to be **positive definite (p.d.)** if:

$\mathbf{x}^T A \mathbf{x} > 0 \text{ for all non-null vectors } \mathbf{x} \in \mathbb{R}^n$

and 

$\mathbf{x}^T A \mathbf{x} = 0 \text{ only when } \mathbf{x} = \mathbf{0}$

The matrix $A$ is called a **positive definite (p.d.) matrix**.

### Definition 2.4.6: Positive Semidefinite Quadratic Form

A nonnegative definite quadratic form $\mathbf{x}^T A \mathbf{x}$ is said to be **positive semidefinite (p.s.d.)** if:

$\mathbf{x}^T A \mathbf{x} \geq 0 \text{ for every } \mathbf{x} \in \mathbb{R}^n$

and 

$\mathbf{x}^T A \mathbf{x} = 0 \text{ for some non-null } \mathbf{x}$

The matrix $A$ is called a **positive semidefinite (p.s.d.) matrix**.

#### Example 2.4.3

1. **Positive Definite:** The quadratic form $x_1^2 + \cdots + x_n^2 = \mathbf{x}^T I_n \mathbf{x} > 0$ for every non-null $\mathbf{x} \in \mathbb{R}^n$ and is p.d.

2. **Positive Semidefinite:** The quadratic form $(x_1 + \cdots + x_n)^2 = \mathbf{x}^T \mathbf{1}\mathbf{1}^T \mathbf{x} = \mathbf{x}^T J_n \mathbf{x} \geq 0$ for every $\mathbf{x} \in \mathbb{R}^n$ and is equal to 0 when $x_1 + \cdots + x_n = 0$; it is a p.s.d. quadratic form.
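
The two forms in Example 2.4.3 can be compared numerically. The sketch below assumes $n = 4$ and two hand-picked vectors, one of which sums to zero; these choices are illustrative only.

```python
def quadratic_form(A, x):
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

n = 4
I_n = [[1 if i == j else 0 for j in range(n)] for i in range(n)]   # identity
J_n = [[1] * n for _ in range(n)]                                  # all ones, 1 1^T

x_zero_sum = [1, -1, 2, -2]   # non-null, components sum to 0
x_other = [1, 2, 3, 4]

print(quadratic_form(I_n, x_zero_sum))  # 10  (sum of squares, positive)
print(quadratic_form(J_n, x_zero_sum))  # 0   (square of the sum)
print(quadratic_form(J_n, x_other))     # 100 (= 10 ** 2)
```

The non-null vector `x_zero_sum` makes the second form vanish while the first stays positive, which is exactly the difference between p.s.d. and p.d.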

## Negative Definiteness

A quadratic form $\mathbf{x}^T A \mathbf{x}$ is respectively:
- **Nonpositive definite** if $-\mathbf{x}^T A \mathbf{x}$ is nonnegative definite
- **Negative definite** if $-\mathbf{x}^T A \mathbf{x}$ is positive definite  
- **Negative semidefinite** if $-\mathbf{x}^T A \mathbf{x}$ is positive semidefinite

**Important Fact:** The only symmetric $n \times n$ matrix which is both nonnegative definite and nonpositive definite is the null matrix, $\mathbf{O}$.

## Indefinite Forms

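A quadratic form is said to be **indefinite** if $\mathbf{x}^T A \mathbf{x} > 0$ for some vectors $\mathbf{x} \in \mathbb{R}^n$ and $\mathbf{x}^T A \mathbf{x} < 0$ for some other vectors $\mathbf{x} \in \mathbb{R}^n$. The matrix of a nonpositive definite, negative definite, negative semidefinite, or indefinite quadratic form is given the same name as the form.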

## Summary of Classifications

| Classification | Condition on the quadratic form | Eigenvalues (symmetric $A$) |
|---|---|---|
| Positive Definite (p.d.) | $\mathbf{x}^T A \mathbf{x} > 0$ for all $\mathbf{x} \neq \mathbf{0}$ | All eigenvalues > 0 |
| Positive Semidefinite (p.s.d.) | $\mathbf{x}^T A \mathbf{x} \geq 0$ for all $\mathbf{x}$, with equality for some $\mathbf{x} \neq \mathbf{0}$ | All eigenvalues ≥ 0, at least one = 0 |
| Negative Definite (n.d.) | $\mathbf{x}^T A \mathbf{x} < 0$ for all $\mathbf{x} \neq \mathbf{0}$ | All eigenvalues < 0 |
| Negative Semidefinite (n.s.d.) | $\mathbf{x}^T A \mathbf{x} \leq 0$ for all $\mathbf{x}$, with equality for some $\mathbf{x} \neq \mathbf{0}$ | All eigenvalues ≤ 0, at least one = 0 |
| Indefinite | $\mathbf{x}^T A \mathbf{x}$ takes both positive and negative values | Mixed positive and negative eigenvalues |
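
The table can be turned into a small classifier. The sketch below is restricted to symmetric $2 \times 2$ matrices so that the eigenvalues have a closed form; the function names, the tolerance `tol`, and the label used for the null matrix are assumptions of this illustration.

```python
import math

def eigenvalues_sym_2x2(A):
    """Eigenvalues (largest first) of a symmetric 2 x 2 matrix."""
    a, b, d = A[0][0], A[0][1], A[1][1]
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return mean + radius, mean - radius

def classify(A, tol=1e-12):
    """Classify a symmetric 2 x 2 matrix using the eigenvalue column of the table."""
    hi, lo = eigenvalues_sym_2x2(A)
    if lo > tol:
        return "p.d."
    if hi < -tol:
        return "n.d."
    if abs(hi) <= tol and abs(lo) <= tol:
        return "null matrix"       # both n.n.d. and nonpositive definite
    if lo >= -tol:
        return "p.s.d."            # smallest eigenvalue is 0, largest is positive
    if hi <= tol:
        return "n.s.d."            # largest eigenvalue is 0, smallest is negative
    return "indefinite"

for M in ([[4, 2], [2, 3]], [[1, 0], [0, 0]], [[-2, 0], [0, -1]],
          [[-1, 0], [0, 0]], [[1, 2], [2, 1]]):
    print(M, classify(M))
```

For larger matrices a numerical eigenvalue routine would be needed in place of the closed form.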

## Main Results

### Result 2.4.1: Matrix Transformation Properties

Let $P$ be an $n \times m$ matrix and let $A$ be an $n \times n$ n.n.d. matrix. Then the matrix $P^T AP$ is n.n.d. If $r(P) < m$, then $P^T AP$ is p.s.d. If $A$ is p.d. and $r(P) = m$, then $P^T AP$ is p.d.

**Proof:** Since $A$ is n.n.d., by Definition 2.4.4, $\mathbf{x}^T A \mathbf{x} \geq 0$ for every $\mathbf{x} \in \mathbb{R}^n$. For any $\mathbf{y} \in \mathbb{R}^m$, $\mathbf{x} = P\mathbf{y} \in \mathbb{R}^n$. Then

$$\mathbf{y}^T (P^T AP)\mathbf{y} = (P\mathbf{y})^T A(P\mathbf{y}) = \mathbf{x}^T A \mathbf{x} \geq 0 \tag{2.4.4}$$

which implies, by Definition 2.4.4 that $P^T AP$ is n.n.d.

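If $r(P) < m$, then by property 4 of Result 1.3.11, $r(P^T AP) \leq r(P) < m$, so $P^T AP$ is singular. Equivalently, there is a non-null $\mathbf{y} \in \mathbb{R}^m$ with $P\mathbf{y} = \mathbf{0}$, for which $\mathbf{y}^T (P^T AP)\mathbf{y} = 0$. Hence $P^T AP$ is p.s.d.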

Further, if $A$ is p.d., the quadratic form $(P\mathbf{y})^T A(P\mathbf{y}) = 0$ only when $P\mathbf{y} = \mathbf{0}$, which implies that $\mathbf{y} = \mathbf{0}$ since $r(P) = m$. Thus, in (2.4.4), $\mathbf{y}^T (P^T AP)\mathbf{y} = 0$ only when $\mathbf{y} = \mathbf{0}$, i.e., $P^T AP$ is p.d. $\square$
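
Both cases of Result 2.4.1 can be seen in a small numerical sketch. The matrices `P_full` (rank $2 = m$) and `P_def` (rank $1 < m$) and the helper `pt_a_p` are illustrative assumptions.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def quadratic_form(A, x):
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def pt_a_p(P, A):
    """Return P^T A P."""
    return matmul(transpose(P), matmul(A, P))

A = [[4, 2], [2, 3]]          # p.d. (n = 2)
P_full = [[1, 0], [1, 1]]     # rank 2 = m
P_def = [[1, 2], [1, 2]]      # rank 1 < m = 2; P_def maps (2, -1)^T to 0

B_full = pt_a_p(P_full, A)
B_def = pt_a_p(P_def, A)
y = [2, -1]                   # non-null vector in the null space of P_def

print(B_full, quadratic_form(B_full, y))   # [[11, 5], [5, 3]] 27
print(B_def, quadratic_form(B_def, y))     # [[11, 22], [22, 44]] 0
```

With full column rank the form stays positive at `y`; with deficient rank the same non-null `y` gives zero, so `B_def` is p.s.d. rather than p.d.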

### Result 2.4.2: Properties of n.n.d. matrices

1. If an $n \times n$ matrix $A$ is p.d. (or p.s.d.), and $c > 0$ is a positive scalar, then $cA$ is also p.d. (or p.s.d.).

2. If two $n \times n$ matrices $A$ and $B$ are both n.n.d., then $A + B$ is n.n.d. If, in addition, either $A$ or $B$ is p.d., then $A + B$ is also p.d.

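3. Any principal submatrix of a n.n.d. matrix is n.n.d., and any principal submatrix of a p.d. matrix is p.d. A principal submatrix of a p.s.d. matrix is n.n.d., but it need not be p.s.d.: for $A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$, which is p.s.d., the principal submatrix $(1)$ is p.d.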

**Proof:** 

**Property 1:** Since $c > 0$, the value $\mathbf{x}^T (cA) \mathbf{x} = c\,\mathbf{x}^T A \mathbf{x}$ has the same sign as $\mathbf{x}^T A \mathbf{x}$ and vanishes for exactly the same vectors $\mathbf{x}$. By Definitions 2.4.5 and 2.4.6, the quadratic form $\mathbf{x}^T (cA) \mathbf{x}$ is therefore p.d. (or p.s.d.) whenever $\mathbf{x}^T A \mathbf{x}$ is, i.e., $cA$ is p.d. (or p.s.d.).

**Property 2:** Since $A$ and $B$ are both n.n.d., we have by Definition 2.4.4 that for every non-null vector $\mathbf{x} \in \mathbb{R}^n$, $\mathbf{x}^T A \mathbf{x} \geq 0$ and $\mathbf{x}^T B \mathbf{x} \geq 0$. Hence, $\mathbf{x}^T A \mathbf{x} + \mathbf{x}^T B \mathbf{x} = \mathbf{x}^T (A + B)\mathbf{x} \geq 0$, which implies that the matrix $A + B$ is n.n.d. 

In addition, suppose that $A$ is p.d. Then, by Definition 2.4.5, $\mathbf{x}^T A \mathbf{x} > 0$, while $\mathbf{x}^T B \mathbf{x} \geq 0$, for every non-null $\mathbf{x} \in \mathbb{R}^n$. Hence, $\mathbf{x}^T (A + B)\mathbf{x} = \mathbf{x}^T A \mathbf{x} + \mathbf{x}^T B \mathbf{x} > 0$, so that $A + B$ is p.d.

**Property 3:** Consider the principal submatrix of an $n \times n$ matrix $A$ obtained by deleting all its rows and columns except its $i_1, \ldots, i_m$-th, where $i_1 < \cdots < i_m$. We can write the resulting submatrix as $P^T AP$, where $P$ is the $n \times m$ matrix of rank $m$, whose columns are the $i_1, \ldots, i_m$-th columns of $I_n$. If $A$ is n.n.d., it follows from Result 2.4.1 that $P^T AP$ is too. If $A$ is p.d., then, since $r(P) = m$, Result 2.4.1 shows that $P^T AP$ is p.d. In particular, the principal minors of a p.d. matrix are all positive. $\square$

### Result 2.4.3: Invertibility of p.d. matrices

An $n \times n$ p.d. matrix $A$ is nonsingular and its inverse is also a p.d. matrix.

**Proof:** Suppose that, on the contrary, the p.d. matrix $A$ is singular, with $r(A) < n$. The columns of $A$ are then linearly dependent, so there exists a vector $\mathbf{v} \neq \mathbf{0}$ such that $A\mathbf{v} = \mathbf{0}$. This implies that $\mathbf{v}^T A \mathbf{v} = 0$, which contradicts our assumption that $A$ is p.d. Hence $A$ must be nonsingular; let $A^{-1}$ denote its inverse.

Since $A$ is p.d., by Result 2.4.1, $(A^{-1})^T AA^{-1}$ is p.d. But $(A^{-1})^T AA^{-1} = (A^{-1})^T$, implying that $(A^{-1})^T$ is p.d. and so is $A^{-1}$. $\square$
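
A small sketch of Result 2.4.3 for a $2 \times 2$ case. It checks positive definiteness of $A^{-1}$ through its eigenvalues, which relies on the eigenvalue test of Result 2.4.4; the helper names and the example matrix are assumptions of this illustration.

```python
import math

def inverse_2x2(A):
    """Inverse of a nonsingular 2 x 2 matrix via the adjugate formula."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def eigenvalues_sym_2x2(A):
    """Eigenvalues (largest first) of a symmetric 2 x 2 matrix."""
    a, b, d = A[0][0], A[0][1], A[1][1]
    mean, radius = (a + d) / 2, math.hypot((a - d) / 2, b)
    return mean + radius, mean - radius

A = [[4, 2], [2, 3]]                  # symmetric p.d. example
A_inv = inverse_2x2(A)
hi, lo = eigenvalues_sym_2x2(A)
hi_inv, lo_inv = eigenvalues_sym_2x2(A_inv)

print(A_inv)             # [[0.375, -0.25], [-0.25, 0.5]]
print(hi_inv, 1 / lo)    # largest eigenvalue of A^{-1} vs. 1 / smallest of A
print(lo_inv, 1 / hi)    # smallest eigenvalue of A^{-1} vs. 1 / largest of A
```

The eigenvalues of $A^{-1}$ are the reciprocals of those of $A$, so they are positive whenever those of $A$ are.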

### Result 2.4.4: Eigenvalue Characterization

Let $A$ be an $n \times n$ symmetric matrix of rank $r$ with eigenvalues $\lambda_1, \ldots, \lambda_n$. Then:

1. $A$ is **positive definite** if and only if all eigenvalues are positive: $\lambda_i > 0$ for all $i = 1, \ldots, n$

2. $A$ is **positive semidefinite** if and only if all eigenvalues are non-negative: $\lambda_i \geq 0$ for all $i = 1, \ldots, n$, with at least one $\lambda_i = 0$

3. $A$ is **negative definite** if and only if all eigenvalues are negative: $\lambda_i < 0$ for all $i = 1, \ldots, n$

4. $A$ is **negative semidefinite** if and only if all eigenvalues are non-positive: $\lambda_i \leq 0$ for all $i = 1, \ldots, n$, with at least one $\lambda_i = 0$

5. $A$ is **indefinite** if and only if it has both positive and negative eigenvalues

# 📘 Section 2.4: Positive Definite and Non-Negative Definite Matrices

---

## 🔹 Result 2.4.1: Transformation of Quadratic Forms

Let $P \in \mathbb{R}^{n \times m}$ and let $A \in \mathbb{R}^{n \times n}$ be a non-negative definite (n.n.d.) matrix. Then:

$$
P^\top A P \text{ is n.n.d.}
$$

- If $\text{rank}(P) < m$, then $P^\top A P$ is positive semi-definite (p.s.d.)
- If $A$ is positive definite (p.d.) and $\text{rank}(P) = m$, then $P^\top A P$ is p.d.

**Proof Sketch**:

For any $y \in \mathbb{R}^m$, let $x = P y \in \mathbb{R}^n$. Then:

$$
y^\top (P^\top A P) y = (P y)^\top A (P y) = x^\top A x \ge 0
$$

So $P^\top A P$ is n.n.d.  
If $\text{rank}(P) < m$, then $\text{rank}(P^\top A P) \le \text{rank}(P) < m$, hence p.s.d.  
If $A$ is p.d. and $\text{rank}(P) = m$, then $x^\top A x = 0 \Rightarrow x = 0 \Rightarrow y = 0$, so $P^\top A P$ is p.d.

---

## 🔹 Result 2.4.2: Properties of n.n.d. Matrices

1. If $A$ is p.d. (or p.s.d.) and $c > 0$, then:

$$
c A \text{ is also p.d. (or p.s.d.)}
$$

2. If $A$ and $B$ are both n.n.d., then:

$$
A + B \text{ is n.n.d.}
$$

If either $A$ or $B$ is p.d., then $A + B$ is p.d.

3. Any principal submatrix of a n.n.d. matrix is n.n.d.  
Any principal submatrix of a p.d. matrix is p.d. (A principal submatrix of a p.s.d. matrix is n.n.d., but it may itself be p.d.)

**Proof Sketch**:

- For scalar multiplication:  
  If $x^\top A x \ge 0$, then $x^\top (c A) x = c x^\top A x \ge 0$

- For addition:  
  If $x^\top A x \ge 0$ and $x^\top B x \ge 0$, then:  
  $x^\top (A + B) x = x^\top A x + x^\top B x \ge 0$

- For principal submatrix:  
  Let $P$ be a selection matrix from $I_n$. Then the principal submatrix is:  
  $P^\top A P$, which is n.n.d. by Result 2.4.1

---

## 🔹 Result 2.4.3: Inverse of a Positive Definite Matrix

If $A \in \mathbb{R}^{n \times n}$ is p.d., then:

- $A$ is nonsingular  
- $A^{-1}$ exists and is also p.d.

**Proof Sketch**:

If $A$ were singular, then $\exists v \ne 0$ such that $A v = 0 \Rightarrow v^\top A v = 0$, contradicting p.d.  
So $A$ is nonsingular.  
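Applying Result 2.4.1 with $P = A^{-1}$, which has rank $n$, the matrix

$$
(A^{-1})^\top A A^{-1} = (A^{-1})^\top
$$

is p.d. Since $x^\top (A^{-1})^\top x = x^\top A^{-1} x$ for every $x$, $A^{-1}$ is p.d. as well.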

---

## 🔹 Result 2.4.4: Spectral Properties of Symmetric Matrices

Let $A \in \mathbb{R}^{n \times n}$ be symmetric with rank $r$ and eigenvalues $\lambda_1, \dots, \lambda_n$. Then:

- $A$ is n.n.d. if and only if $\lambda_i \ge 0$ for all $i$
- $A$ is p.d. if and only if $\lambda_i > 0$ for all $i$
- $\text{rank}(A) = \# \{ \lambda_i \ne 0 \}$
- $\text{tr}(A) = \sum_{i=1}^{n} \lambda_i$

---

These results are essential for understanding quadratic forms, projections, and stability in linear models and multivariate analysis.


In [1]:
import math

# -----------------------------
# Basic Matrix Operations
# -----------------------------
def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(A_row, B_col)) for B_col in zip(*B)] for A_row in A]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def scalar_mul(c, A):
    return [[c * val for val in row] for row in A]

def is_symmetric(A):
    return A == transpose(A)

def is_idempotent(A):
    A2 = matmul(A, A)
    return all(abs(A2[i][j] - A[i][j]) < 1e-8 for i in range(len(A)) for j in range(len(A)))

def quadratic_form(A, x):
    xt = [[xi] for xi in x]
    xT = [x]
    Ax = matmul(A, xt)
    return matmul(xT, Ax)[0][0]

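def eigen_decompose_2x2(A):
    # Eigenvalues of a 2×2 matrix from its trace and determinant.
    # Assumes real eigenvalues (always the case for symmetric A).
    a, b = A[0][0], A[0][1]
    c, d = A[1][0], A[1][1]
    trace_val = a + d
    det = a * d - b * c
    # Clamp at 0 so round-off cannot make the discriminant slightly negative
    disc = math.sqrt(max(trace_val**2 - 4 * det, 0.0))
    lam1 = (trace_val + disc) / 2
    lam2 = (trace_val - disc) / 2
    return [lam1, lam2]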

def principal_submatrix(A, indices):
    return [[A[i][j] for j in indices] for i in indices]

def inverse_2x2(A):
    a, b = A[0][0], A[0][1]
    c, d = A[1][0], A[1][1]
    det = a * d - b * c
    if abs(det) < 1e-8:
        return None
    return [[d/det, -b/det], [-c/det, a/det]]

# -----------------------------
# Example Matrices
# -----------------------------
A = [
    [4, 2],
    [2, 3]
]  # Positive definite

B = [
    [1, 0],
    [0, 0]
]  # Positive semi-definite

C = [
    [0, 0],
    [0, 0]
]  # Null matrix: n.n.d. and nonpositive definite (not used below)

x = [1, 1]

# -----------------------------
# Diagnostics
# -----------------------------
print("\n🔹 Matrix A:")
for row in A: print(row)
print("Symmetric:", is_symmetric(A))
print("Idempotent:", is_idempotent(A))
print("Quadratic form xᵗAx:", quadratic_form(A, x))
print("Trace:", trace(A))
print("Eigenvalues:", [round(l, 4) for l in eigen_decompose_2x2(A)])

print("\n🔹 Scalar multiplication (2A):")
A_scaled = scalar_mul(2, A)
for row in A_scaled: print(row)

print("\n🔹 Sum A + B:")
A_plus_B = [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]
for row in A_plus_B: print(row)

print("\n🔹 Principal submatrix of A (index [0]):")
A_sub = principal_submatrix(A, [0])
for row in A_sub: print(row)

print("\n🔹 Inverse of A:")
A_inv = inverse_2x2(A)
if A_inv:
    for row in A_inv: print(row)
else:
    print("Matrix A is singular.")

print("\n🔹 Matrix B:")
for row in B: print(row)
print("Symmetric:", is_symmetric(B))
print("Idempotent:", is_idempotent(B))
print("Quadratic form xᵗBx:", quadratic_form(B, x))
print("Trace:", trace(B))
print("Eigenvalues:", [round(l, 4) for l in eigen_decompose_2x2(B)])



🔹 Matrix A:
[4, 2]
[2, 3]
Symmetric: True
Idempotent: False
Quadratic form xᵗAx: 11
Trace: 7
Eigenvalues: [5.5616, 1.4384]

🔹 Scalar multiplication (2A):
[8, 4]
[4, 6]

🔹 Sum A + B:
[5, 2]
[2, 3]

🔹 Principal submatrix of A (index [0]):
[4]

🔹 Inverse of A:
[0.375, -0.25]
[-0.25, 0.5]

🔹 Matrix B:
[1, 0]
[0, 0]
Symmetric: True
Idempotent: True
Quadratic form xᵗBx: 1
Trace: 1
Eigenvalues: [1.0, 0.0]


# 📘 Section 2.4.5: Factorization and Square Roots of Non-Negative Definite Matrices

---

## 🔹 Characterization via Eigenvalues

1. $A$ is non-negative definite (n.n.d.) if and only if:
$$
\lambda_j \ge 0, \quad j = 1, \dots, n
$$
with exactly $r$ of the eigenvalues being strictly positive.

2. $A$ is positive definite (p.d.) if and only if:
$$
\lambda_j > 0, \quad j = 1, \dots, n
$$

**Proof Sketch**:  
From Result 2.3.4, there exists an orthogonal matrix $Q$ such that:
$$
A = Q D Q^\top, \quad \text{where } D = \text{diag}(\lambda_1, \dots, \lambda_n)
$$

Since $\text{rank}(D) = \text{rank}(A) = r$, exactly $r$ of the eigenvalues are nonzero.  
If $\lambda_j \ge 0$, then $D$ is n.n.d., and by Result 2.4.1, $A$ is n.n.d.  
Conversely, if $A$ is n.n.d., then so is $D$, implying $\lambda_j \ge 0$.  
Property 2 follows similarly.

---

## 🔹 Result 2.4.5: Matrix Factorization and Square Roots

Let $A$ be an $n \times n$ symmetric matrix.

### 1. Factorization

$A$ can be factorized as:
$$
A = P P^\top
$$
for some $P \in \mathbb{R}^{n \times r}$ with $\text{rank}(P) = r$  
if and only if $A$ is n.n.d. with rank $r$.

**Proof Sketch**:  
If $A = P P^\top$ and $P$ has rank $r$, then $A$ is symmetric and:
$$
x^\top A x = x^\top P P^\top x = (P^\top x)^\top (P^\top x) \ge 0
$$
So $A$ is n.n.d., and $\text{rank}(A) = \text{rank}(P P^\top) = \text{rank}(P) = r$.  
Conversely, if $A$ is symmetric and n.n.d. with rank $r$, then:
$$
A = Q D Q^\top
$$
where $Q$ is orthogonal and $D = \text{diag}(\lambda_1, \dots, \lambda_r, 0, \dots, 0)$ with $\lambda_i > 0$.  
Define $C$ as an $n \times r$ matrix with $c_{ii} = \sqrt{\lambda_i}$ and all other entries zero.  
Let $P = Q C$, then:
$$
A = P P^\top
$$
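
The "if" direction of the factorization can be checked numerically. The sketch below assumes an illustrative $3 \times 2$ matrix $P$ of rank 2, forms $A = P P^\top$, and confirms that $x^\top A x = \lVert P^\top x \rVert^2$; the specific vectors are hand-picked for the example.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def quadratic_form(A, x):
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

P = [[1, 0], [0, 1], [1, 1]]          # 3 x 2 with rank 2 (illustrative)
A = matmul(P, transpose(P))           # 3 x 3, n.n.d. with rank 2

x = [1, 2, 3]
Ptx = [sum(P[i][k] * x[i] for i in range(3)) for k in range(2)]   # P^T x
print(A)                                               # [[1, 0, 1], [0, 1, 1], [1, 1, 2]]
print(quadratic_form(A, x), sum(v * v for v in Ptx))   # 41 41

z = [1, 1, -1]                        # non-null vector with P^T z = 0
print(quadratic_form(A, z))           # 0
```

Because $r = 2 < n = 3$, a non-null `z` with $P^\top z = 0$ exists, so this $A$ is p.s.d. rather than p.d.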

---

### 2. Square Root of a Matrix

If $A$ is n.n.d., then there exists a unique symmetric n.n.d. matrix $B$ such that:
$$
A = B^\top B = B^2
$$
This matrix $B$ is called the **square root** of $A$, denoted:
$$
B = A^{1/2}
$$

**Proof Sketch**:  
Let $A = Q D Q^\top$ as above. Define:
$$
D^{1/2} = \text{diag}(\sqrt{\lambda_1}, \dots, \sqrt{\lambda_r}, 0, \dots, 0)
$$
Then:
$$
B = Q D^{1/2} Q^\top
$$
is symmetric and n.n.d., and:
$$
B^2 = B^\top B = A
$$

To prove uniqueness, suppose $M$ is symmetric and n.n.d. such that $M^2 = A$.  
Let $N = Q^\top M Q$, then $D = N^2$.  
Partition:
$$
N = \begin{pmatrix} H & K \\ K^\top & L \end{pmatrix}
$$
Then:
$$
N^2 = \begin{pmatrix} H^2 + K K^\top & H K + K L \\ K^\top H + L K^\top & K^\top K + L^2 \end{pmatrix}
$$
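Since $N^2 = D$ and the lower-right block of $D$ is zero, $K^\top K + L^2 = 0$. Both terms are n.n.d. (recall that $L$ is symmetric, so $L^2 = L^\top L$), hence each must vanish, and we get: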
$$
K = 0, \quad L = 0
$$
So:
$$
D = N^2 = \begin{pmatrix} H^2 & 0 \\ 0 & 0 \end{pmatrix}
$$
and $H^2 = \text{diag}(\lambda_1, \dots, \lambda_r)$  
By induction, $H = \text{diag}(\sqrt{\lambda_1}, \dots, \sqrt{\lambda_r})$, so:
$$
N = D^{1/2}, \quad M = Q D^{1/2} Q^\top = B
$$
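
For a $2 \times 2$ p.d. matrix the square root can be computed without an eigendecomposition. The sketch below uses the closed form $B = (A + \sqrt{\det A}\, I) / \sqrt{\operatorname{tr} A + 2\sqrt{\det A}}$, which follows from the Cayley-Hamilton theorem; this is a swapped-in shortcut for the $2 \times 2$ case, not the spectral construction used in the proof. Since the resulting $B$ is symmetric and p.d., the uniqueness argument above implies it equals $A^{1/2}$. The function name `sqrt_pd_2x2` is an assumption of this illustration.

```python
import math

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def sqrt_pd_2x2(A):
    """Symmetric p.d. square root of a symmetric p.d. 2 x 2 matrix (closed form)."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    s = math.sqrt(det)                          # sqrt(det A) = det(A^{1/2})
    t = math.sqrt(A[0][0] + A[1][1] + 2 * s)    # trace of A^{1/2}
    return [[(A[0][0] + s) / t, A[0][1] / t],
            [A[1][0] / t, (A[1][1] + s) / t]]

A = [[4, 2], [2, 3]]
B = sqrt_pd_2x2(A)
B2 = matmul(B, B)

print(B)
print(B2)   # agrees with A up to floating-point round-off
```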

---

### 3. Inverse Square Root

If $A$ is p.d., then:
$$
(A^{1/2})^{-1} = (A^{-1})^{1/2}
$$
We denote both sides by:
$$
A^{-1/2}
$$

**Proof Sketch**:  
Since $A = Q D Q^\top$ with all $\lambda_j > 0$, define:
$$
D^{-1/2} = \text{diag}(1/\sqrt{\lambda_1}, \dots, 1/\sqrt{\lambda_n})
$$

Then:
$$
(A^{1/2})^{-1} = (Q D^{1/2} Q^\top)^{-1} = Q D^{-1/2} Q^\top
$$
and:
$$
(A^{-1})^{1/2} = (Q D^{-1} Q^\top)^{1/2} = Q D^{-1/2} Q^\top
$$

So:
$$
(A^{1/2})^{-1} = (A^{-1})^{1/2}
$$
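
The identity $(A^{1/2})^{-1} = (A^{-1})^{1/2}$ can be checked numerically for a $2 \times 2$ example. As an assumption of this sketch, the square root is computed with the closed-form $2 \times 2$ expression $(A + \sqrt{\det A}\, I)/\sqrt{\operatorname{tr} A + 2\sqrt{\det A}}$ rather than through $Q D^{1/2} Q^\top$; the helper names are illustrative.

```python
import math

def sqrt_pd_2x2(A):
    """Symmetric p.d. square root of a symmetric p.d. 2 x 2 matrix (closed form)."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    s = math.sqrt(det)
    t = math.sqrt(A[0][0] + A[1][1] + 2 * s)
    return [[(A[0][0] + s) / t, A[0][1] / t],
            [A[1][0] / t, (A[1][1] + s) / t]]

def inverse_2x2(A):
    """Inverse of a nonsingular 2 x 2 matrix via the adjugate formula."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[4, 2], [2, 3]]                       # symmetric p.d. example
left = inverse_2x2(sqrt_pd_2x2(A))         # (A^{1/2})^{-1}
right = sqrt_pd_2x2(inverse_2x2(A))        # (A^{-1})^{1/2}
max_gap = max(abs(left[i][j] - right[i][j]) for i in range(2) for j in range(2))

print(left)
print(right)
print(max_gap)   # round-off sized
```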

---

These results are foundational for understanding matrix square roots, spectral decompositions, and positive definiteness in linear models.


# 📘 Section 2.4.7–2.4.8: Triangular and Spectral Decomposition of Symmetric Matrices

---

## 🔹 Triangular Decomposition of a Positive Definite Symmetric Matrix

Let $A$ be an $m \times m$ positive definite symmetric matrix.  
Then there exists a unique lower triangular matrix $L$ with unit diagonal and a unique diagonal matrix $D$ with positive diagonal entries such that:

$$
A = L^{-1} D L^{\top -1}
\tag{2.4.8}
$$

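Equivalently:

$$
L A L^{\top} = D, \quad \text{or} \quad A = \tilde{L} D \tilde{L}^{\top} \text{ with } \tilde{L} = L^{-1}
$$

where $\tilde{L}$ is again lower triangular with unit diagonal.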

---

### 🔸 Equivalent Forms of Triangular Decomposition

- **Crout decomposition**:
  $$
  A = (L^{-1} D) L^{\top -1} = U L^{\top -1}, \quad \text{where } U = L^{-1} D
  $$

- **Doolittle decomposition**:
  $$
  A = L^{-1} (D L^{\top -1}) = L^{-1} U^{\top}
  $$

- **Cholesky decomposition**:
  $$
  A = V V^{\top}, \quad \text{where } V = L^{-1} D^{1/2}
  \tag{2.4.9}
  $$
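
The relation $V = L^{-1} D^{1/2}$ in (2.4.9) can be read backwards: given a Cholesky factor $V$, the diagonal matrix is $D = \text{diag}(v_{11}^2, \dots, v_{mm}^2)$ and the unit lower triangular factor is $L^{-1} = V D^{-1/2}$. The sketch below illustrates this on a $2 \times 2$ example; the function names and the example matrix are assumptions.

```python
import math

def cholesky(A):
    """Lower-triangular V with A = V V^T; assumes A is symmetric p.d."""
    n = len(A)
    V = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(V[i][k] * V[j][k] for k in range(j))
            if i == j:
                V[i][j] = math.sqrt(A[i][i] - s)
            else:
                V[i][j] = (A[i][j] - s) / V[j][j]
    return V

def unit_triangular_and_diagonal(V):
    """Split V = L_inv D^{1/2}: d[i] = v_ii^2 and L_inv = V D^{-1/2} (unit diagonal)."""
    n = len(V)
    d = [V[i][i] ** 2 for i in range(n)]
    L_inv = [[V[i][j] / V[j][j] for j in range(n)] for i in range(n)]
    return L_inv, d

A = [[4, 2], [2, 3]]
V = cholesky(A)
L_inv, d = unit_triangular_and_diagonal(V)

# Rebuild A = L_inv D L_inv^T entry by entry
n = len(A)
A_rebuilt = [[sum(L_inv[i][k] * d[k] * L_inv[j][k] for k in range(n))
              for j in range(n)] for i in range(n)]
print(L_inv)       # unit lower triangular
print(d)           # positive diagonal entries of D
print(A_rebuilt)   # agrees with A up to round-off
```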

---

## 🔹 Result 2.4.7: Cholesky Decomposition

Let $A$ be symmetric and positive definite. Then:

$$
A = V V^{\top}, \quad V = L^{-1} D^{1/2}
$$

where $D$ is diagonal and $L$ is unit lower triangular.

### 🧠 Proof Sketch

- From Result 2.4.5: $A = B^{\top} B$ for some nonsingular $B$
- QR decomposition: $B = Q R$ with $Q$ orthogonal, $R$ upper triangular
- Then: $A = R^{\top} R$
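- Let $D_0 = \text{diag}(r_1, \dots, r_n)$, where $r_1, \dots, r_n$ are the diagonal entries of $R$ (all nonzero, since $R$ is nonsingular), and let $L = D_0 R^{\top -1}$, which is lower triangular with unit diagonal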
- Then: $A = L^{-1} D_0^2 L^{\top -1}$, so $V = L^{-1} D^{1/2}$ with $D = D_0^2$

### 🔐 Uniqueness

Suppose:

$$
A = L_1^{-1} D_1 L_1^{\top -1} = L_2^{-1} D_2 L_2^{\top -1}
$$

Then:

$$
D_1 = M D_2 M^{\top}, \quad M = L_1 L_2^{-1}
$$

Since $M$ is lower triangular with unit diagonal, comparing diagonals shows:

$$
D_1 = D_2, \quad M = I \Rightarrow L_1 = L_2
$$

---

## 🔹 Result 2.4.8: Spectral Decomposition of Symmetric n.n.d. Matrix

Let $A$ be an $n \times n$ symmetric non-negative definite matrix. Then:

$$
A = Q
\begin{pmatrix}
D_1 & 0 \\
0 & 0
\end{pmatrix}
Q^{\top}
\tag{2.4.10}
$$

where $Q$ is orthogonal and $D_1$ is diagonal with positive entries.

### 🧠 Proof Sketch

This follows directly from Result 2.3.4 and the nonnegativity of the eigenvalues of a n.n.d. matrix.

---

These decompositions are central to numerical linear algebra, regression diagnostics, and multivariate analysis. They reveal the structure and stability of symmetric matrices and enable efficient computation.


In [2]:
import math

# -----------------------------
# Matrix Utilities
# -----------------------------
def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(A_row, B_col)) for B_col in zip(*B)] for A_row in A]

def print_matrix(M, label):
    print(f"\n🔹 {label}:")
    for row in M:
        print("  ".join(f"{val:8.4f}" for val in row))

# -----------------------------
# Cholesky Decomposition (A = V Vᵗ)
# -----------------------------
def cholesky_decomposition(A):
    n = len(A)
    V = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            sum_val = sum(V[i][k] * V[j][k] for k in range(j))
            if i == j:
                V[i][j] = math.sqrt(A[i][i] - sum_val)
            else:
                V[i][j] = (A[i][j] - sum_val) / V[j][j]
    return V

# -----------------------------
# Spectral Decomposition (A = Q D Qᵗ)
# Cyclic Jacobi rotations; works for any symmetric matrix
# -----------------------------
def eigen_decompose_symmetric(A, tol=1e-12, max_sweeps=50):
    n = len(A)
    S = [[float(x) for x in row] for row in A]  # working copy, driven to diagonal form
    Q = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        off = math.sqrt(sum(S[i][j] ** 2 for i in range(n) for j in range(n) if i != j))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if S[p][q] == 0:
                    continue
                # Rotation angle that zeroes S[p][q]
                diff = S[q][q] - S[p][p]
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * S[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                J = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
                J[p][p], J[q][q], J[p][q], J[q][p] = c, c, s, -s
                S = matmul(matmul(transpose(J), S), J)  # S <- Jᵗ S J
                Q = matmul(Q, J)                        # accumulate eigenvectors
    # Sort eigenvalues in decreasing order; flip signs so each eigenvector's first entry is >= 0
    order = sorted(range(n), key=lambda i: -S[i][i])
    signs = [-1.0 if Q[0][k] < 0 else 1.0 for k in order]
    Q = [[signs[pos] * Q[r][k] for pos, k in enumerate(order)] for r in range(n)]
    D = [[S[order[i]][order[i]] if i == j else 0.0 for j in range(n)] for i in range(n)]
    return Q, D

# -----------------------------
# Example Matrix (Symmetric p.d.)
# -----------------------------
A = [
    [6, 2, 1],
    [2, 3, 1],
    [1, 1, 1]
]

print_matrix(A, "Original Matrix A")

# Cholesky
V = cholesky_decomposition(A)
print_matrix(V, "Cholesky Factor V")
VVt = matmul(V, transpose(V))
print_matrix(VVt, "Reconstructed A from V Vᵗ")

# Spectral
Q, D = eigen_decompose_symmetric(A)
print_matrix(Q, "Orthogonal Matrix Q")
print_matrix(D, "Diagonal Matrix D")
QDQ = matmul(matmul(Q, D), transpose(Q))
print_matrix(QDQ, "Reconstructed A from Q D Qᵗ")



🔹 Original Matrix A:
  6.0000    2.0000    1.0000
  2.0000    3.0000    1.0000
  1.0000    1.0000    1.0000

🔹 Cholesky Factor V:
  2.4495    0.0000    0.0000
  0.8165    1.5275    0.0000
  0.4082    0.4364    0.8018

🔹 Reconstructed A from V Vᵗ:
  6.0000    2.0000    1.0000
  2.0000    3.0000    1.0000
  1.0000    1.0000    1.0000

🔹 Orthogonal Matrix Q:
  0.8664    0.4974    0.0432
  0.4531   -0.8196    0.3507
  0.2098   -0.2843   -0.9355

🔹 Diagonal Matrix D:
  7.2880    0.0000    0.0000
  0.0000    2.1331    0.0000
  0.0000    0.0000    0.5789

🔹 Reconstructed A from Q D Qᵗ:
  6.0000    2.0000    1.0000
  2.0000    3.0000    1.0000
  1.0000    1.0000    1.0000


![image.png](attachment:image.png)

FIG. 1. Orthogonal projection of three vectors onto a 2-dimensional subspace $V$ of $\mathbb{R}^3$.

# 📘 Section 2.5: Simultaneous Diagonalization of Matrices

---

## 🔹 Result 2.5.1: Simultaneous Diagonalization

Let $A$ and $B$ be two $n \times n$ symmetric matrices.  
Then there exists an orthogonal matrix $P$ such that:

$$
P^\top A P \quad \text{and} \quad P^\top B P \quad \text{are both diagonal}
$$

if and only if:

$$
AB = BA
$$

This result extends to $k > 2$ symmetric matrices $A_1, \dots, A_k$ which are simultaneously diagonalizable by an orthogonal matrix $P$ if and only if they commute pairwise.
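
To make Result 2.5.1 concrete, the sketch below checks it on two hand-picked symmetric $2 \times 2$ matrices that commute. The matrices and the names `A_sym`, `B_sym` and `P` are illustrative assumptions; the common eigenvectors $(1, 1)^\top/\sqrt{2}$ and $(1, -1)^\top/\sqrt{2}$ were found by inspection rather than by a general algorithm.

```python
import math

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(M, N):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*N)] for row in M]

A_sym = [[2.0, 1.0], [1.0, 2.0]]
B_sym = [[3.0, -1.0], [-1.0, 3.0]]

# The matrices commute, so a common orthogonal diagonalizer should exist
AB = matmul(A_sym, B_sym)
BA = matmul(B_sym, A_sym)
commute = all(abs(AB[i][j] - BA[i][j]) < 1e-12 for i in range(2) for j in range(2))

# Common orthonormal eigenvectors as the columns of P
r = 1 / math.sqrt(2)
P = [[r, r], [r, -r]]
PtAP = matmul(matmul(transpose(P), A_sym), P)
PtBP = matmul(matmul(transpose(P), B_sym), P)

print("AB == BA:", commute)
print("P^T A P =", [[round(v, 10) for v in row] for row in PtAP])
print("P^T B P =", [[round(v, 10) for v in row] for row in PtBP])
```

Both products come out diagonal, with diagonals $(3, 1)$ for $A$ and $(2, 4)$ for $B$.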

---

## 🔹 Result 2.5.2: Generalized Eigenvalue Problem

Let $A$ be a symmetric positive definite matrix and $B$ be symmetric.  
Then there exists a nonsingular matrix $P$ such that:

$$
P^\top A P = I, \quad P^\top B P = \Lambda = \text{diag}(\lambda_1, \dots, \lambda_n)
$$

where $\lambda_i$ are solutions to:

$$
|B - \lambda A| = 0
$$

This is known as the **generalized eigenvalue problem**, and is equivalent to finding eigenvalues of:

$$
A^{-1} B \quad \text{or} \quad B A^{-1}
$$
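
The following sketch verifies Result 2.5.2 on a small hand-picked pair. The matrices `A_pd` and `B_sym` are illustrative assumptions. The matrix `P` was obtained by one standard construction: factor $A = V V^\top$, diagonalize $V^{-1} B V^{\top -1}$ with an orthogonal $Q$, and set $P = V^{\top -1} Q$. The code checks the result instead of deriving it.

```python
import math

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(M, N):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*N)] for row in M]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

A_pd = [[4.0, 0.0], [0.0, 1.0]]    # symmetric positive definite
B_sym = [[8.0, 2.0], [2.0, 2.0]]   # symmetric

r = 1 / math.sqrt(2)
P = [[0.5 * r, 0.5 * r], [r, -r]]  # nonsingular, but not orthogonal

PtAP = matmul(matmul(transpose(P), A_pd), P)    # expected: identity
PtBP = matmul(matmul(transpose(P), B_sym), P)   # expected: diagonal Lambda
lambdas = [PtBP[0][0], PtBP[1][1]]

# Each lambda should satisfy |B - lambda A| = 0
dets = [det2([[B_sym[i][j] - lam * A_pd[i][j] for j in range(2)] for i in range(2)])
        for lam in lambdas]

print("P^T A P =", [[round(v, 10) for v in row] for row in PtAP])
print("P^T B P =", [[round(v, 10) for v in row] for row in PtBP])
print("|B - lambda A| at each lambda:", [round(v, 10) for v in dets])
```

For this pair, $|B - \lambda A| = 4(\lambda - 1)(\lambda - 3)$, so the generalized eigenvalues are $3$ and $1$, matching the diagonal of $P^\top B P$.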

---

# 📘 Section 2.6: Geometrical Perspectives

---

## 🔹 Orthogonality in $\mathbb{R}^n$

- Vectors $u, v \in \mathbb{R}^n$ are orthogonal if:
  $$
  u^\top v = 0
  $$

- A vector $u \in \mathbb{R}^n$ is orthogonal to a subspace $V$ if:
  $$
  u^\top v = 0 \quad \text{for all } v \in V
  $$

- Subspaces $U$ and $V$ are orthogonal if:
  $$
  u^\top v = 0 \quad \text{for all } u \in U, \, v \in V
  $$

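- The space $\mathbb{R}^n$ is the direct sum of subspaces $U$ and $V$ if every $y \in \mathbb{R}^n$ can be written uniquely as $y = u + v$ with $u \in U$ and $v \in V$; this is denoted:
  $$
  \mathbb{R}^n = U \oplus V
  $$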

---

## 🔹 Definition 2.6.1: Orthogonal Projection of a Vector

The orthogonal projection of $v_1$ onto $v_2$ is:

$$
\frac{v_1^\top v_2}{v_2^\top v_2} v_2 = \frac{v_1^\top v_2}{\|v_2\|} \cdot \frac{v_2}{\|v_2\|}
$$

Length of the projection:

$$
\|v_1\| \cdot |\cos(\theta)|
$$

where $\theta$ is the angle between $v_1$ and $v_2$.
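
A minimal numerical check of Definition 2.6.1, using two hand-picked vectors in $\mathbb{R}^2$. The helper names `dot` and `project` are illustrative, not from the text.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project(v1, v2):
    """Orthogonal projection of v1 onto the line spanned by v2 (v2 must be nonzero)."""
    scale = dot(v1, v2) / dot(v2, v2)
    return [scale * x for x in v2]

v1 = [3.0, 4.0]
v2 = [2.0, 0.0]

proj = project(v1, v2)
length = math.sqrt(dot(proj, proj))

# Length via the angle formula ||v1|| * |cos(theta)|
cos_theta = dot(v1, v2) / (math.sqrt(dot(v1, v1)) * math.sqrt(dot(v2, v2)))
length_from_angle = math.sqrt(dot(v1, v1)) * abs(cos_theta)

# The residual v1 - proj should be orthogonal to v2
residual = [a - b for a, b in zip(v1, proj)]

print("projection:", proj)
print("length:", length, "and via angle:", length_from_angle)
print("residual . v2 =", dot(residual, v2))
```

Here the projection is $(3, 0)^\top$ with length $3 = 5 \cdot 0.6$, and the residual $(0, 4)^\top$ is orthogonal to $v_2$.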

---

## 🔹 Definition 2.6.2: Orthogonal Projection Matrix

Let $V$ be a subspace of $\mathbb{R}^n$.  
An $n \times n$ matrix $P_V$ is the orthogonal projection matrix onto $V$ if, for every $y \in \mathbb{R}^n$:

$$
P_V y \in V, \quad (I - P_V) y \in V^\perp
$$

If $X$ is an $n \times k$ matrix whose columns form a basis for $V$, then $P_V = P$, where:

$$
P = X (X^\top X)^{-1} X^\top
$$

and:

$$
I - P \quad \text{is the projection onto } V^\perp
$$

---

## 🔹 Result 2.6.2: Properties of Projection Matrices

Let $k = \dim(V)$ and $P = X (X^\top X)^{-1} X^\top$. Then:

- $P$ and $I - P$ are symmetric and idempotent; in particular $P^\top = P$ and $P^2 = P$
- $PX = X$
- $\text{rank}(P) = \text{tr}(P) = k$
- $\text{rank}(I - P) = n - k$

---

## 🔹 Result 2.6.3: Column Spaces

- $\text{Col}(P) = V$
- $\text{Col}(I - P) = V^\perp$
- If $\dim(V) = k$, then:

$$
\text{tr}(P) = \text{rank}(P) = k, \quad \text{tr}(I - P) = \text{rank}(I - P) = n - k
$$

---

These results form the geometric backbone of least squares estimation and projection theory in linear models.


# 📘 Section 2.6 (continued): Orthogonal Projections and Subspace Geometry

---

![image.png](attachment:image.png)

FIG. 2. Orthogonal projection of a 3-dimensional vector $y$ onto a 2-dimensional subspace $V$ and its orthogonal complement $V^\perp$.


## 🔹 Projection Matrix Rank and Trace

Since the orthogonal projection matrix $P$ is symmetric and idempotent, it follows from Result 2.3.6 that:

$$
\text{rank}(P) = \text{tr}(P) = \text{tr}[X(X^\top X)^{-1} X^\top] = \text{tr}[X^\top X (X^\top X)^{-1}] = \text{tr}(I_k) = k
$$

Since $I_n - P$ is also symmetric and idempotent:

$$
\text{rank}(I_n - P) = \text{tr}(I_n) - \text{tr}(P) = n - k
$$

---

## 🔹 Result 2.6.4: Projection Error Minimization

Let $P$ be the orthogonal projection matrix onto $V$, let $y \in \mathbb{R}^n$, and let $v \in V$. Then:

1. $$
\|y - v\|^2 = \|y - P y\|^2 + \|P y - v\|^2
$$

2. $$
\|y - P y\|^2 \le \|y - v\|^2 \quad \text{for all } v \in V
$$

Equality holds if and only if $v = P y$.
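
Result 2.6.4 can be checked numerically. The sketch below takes $V = \text{Col}(X)$ for a hand-picked $3 \times 2$ matrix $X$, with $P = X(X^\top X)^{-1}X^\top$ entered by hand, and compares $\|y - v\|^2$ with $\|y - Py\|^2 + \|Py - v\|^2$ for a few vectors $v = Xb$. The vector `y`, the coefficient vectors `b` and the helper names are illustrative assumptions.

```python
def matvec(M, x):
    return [sum(a * b for a, b in zip(row, x)) for row in M]

def sq_norm(x):
    return sum(a * a for a in x)

X = [[1, 0], [0, 1], [1, 1]]
# Projection matrix onto Col(X), i.e. X (X^T X)^{-1} X^T, entered by hand
P = [[2 / 3, -1 / 3, 1 / 3],
     [-1 / 3, 2 / 3, 1 / 3],
     [1 / 3, 1 / 3, 2 / 3]]

y = [1.0, 2.0, 0.0]
Py = matvec(P, y)
best = sq_norm([a - b for a, b in zip(y, Py)])   # ||y - Py||^2

results = []
for b in ([1, 0], [0, 1], [2, -1], [0.5, 0.5]):
    v = matvec(X, b)                                          # v = X b lies in V
    lhs = sq_norm([a - c for a, c in zip(y, v)])              # ||y - v||^2
    rhs = best + sq_norm([a - c for a, c in zip(Py, v)])      # Pythagorean split
    results.append((lhs, rhs))
    print(f"b = {b}: ||y - v||^2 = {lhs:.4f}, split = {rhs:.4f}")

print(f"||y - Py||^2 = {best:.4f}")
```

For this $y$ the projection is $Py = (0, 1, 1)^\top$, which is $Xb$ with $b = (0, 1)^\top$, so that choice of $b$ attains the minimum $\|y - Py\|^2 = 3$.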

---

## 🔹 Result 2.6.5: Null Space and Column Space

Let $X$ be an $n \times k$ matrix of rank $k$.  
Then there exists a matrix $Z \in \mathbb{R}^{(n-k) \times n}$ such that:

$$
Z X = 0, \quad \text{and} \quad \text{Col}(X) = \text{Null}(Z)
$$

---

## 🔹 Result 2.6.6: Projection Matrix Relationships

Let $\Omega$ be a subspace of $\mathbb{R}^n$ and $\Omega_1$ a subspace of $\Omega$, and let $P_\Omega$, $P_{\Omega_1}$, and $P_{\Omega_1^\perp}$ denote the orthogonal projection matrices onto $\Omega$, $\Omega_1$, and $\Omega_1^\perp$. Then:

1. $$
P_\Omega P_{\Omega_1} = P_{\Omega_1} P_\Omega = P_{\Omega_1}
$$

2. $$
P_{\Omega_1} P_{\Omega_1^\perp} = P_{\Omega_1^\perp} P_{\Omega_1} = 0
$$

3. $$
P_\Omega = P_{\Omega_1} + P_{\Omega_1^\perp \cap \Omega}
$$

4. $$
P_\Omega P_{\Omega_1^\perp} = P_{\Omega_1^\perp \cap \Omega}
$$
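
The four identities of Result 2.6.6 can be checked on a small example. In this sketch $\Omega = \text{span}\{(1,0,1)^\top, (0,1,1)^\top\}$ and $\Omega_1 = \text{span}\{(1,0,1)^\top\}$; the choice of subspaces and all variable names are illustrative assumptions, and $P_\Omega$ is entered by hand.

```python
def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(M, N):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*N)] for row in M]

def close(M, N, tol=1e-12):
    return all(abs(a - b) < tol for rm, rn in zip(M, N) for a, b in zip(rm, rn))

I3 = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
ZERO = [[0.0] * 3 for _ in range(3)]

# Projection onto Omega = span{(1,0,1), (0,1,1)}, entered by hand
P_omega = [[2 / 3, -1 / 3, 1 / 3], [-1 / 3, 2 / 3, 1 / 3], [1 / 3, 1 / 3, 2 / 3]]

# Projection onto Omega_1 = span{x}: x x^T / (x^T x)
x = [1.0, 0.0, 1.0]
xx = sum(a * a for a in x)
P_1 = [[a * b / xx for b in x] for a in x]
P_1_perp = [[I3[i][j] - P_1[i][j] for j in range(3)] for i in range(3)]

# Candidate for the projection onto (Omega_1-perp intersect Omega)
P_diff = [[P_omega[i][j] - P_1[i][j] for j in range(3)] for i in range(3)]

check1 = close(matmul(P_omega, P_1), P_1) and close(matmul(P_1, P_omega), P_1)
check2 = close(matmul(P_1, P_1_perp), ZERO) and close(matmul(P_1_perp, P_1), ZERO)
# P_diff is symmetric and idempotent, maps into Omega, and is annihilated by P_1
check3 = (close(P_diff, transpose(P_diff)) and close(matmul(P_diff, P_diff), P_diff)
          and close(matmul(P_omega, P_diff), P_diff) and close(matmul(P_1, P_diff), ZERO))
check4 = close(matmul(P_omega, P_1_perp), P_diff)

print(check1, check2, check3, check4)
```

Identity 3 is checked indirectly: $P_\Omega - P_{\Omega_1}$ is shown to be a symmetric idempotent matrix whose range lies in $\Omega$ and is orthogonal to $\Omega_1$.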

---

## 🔹 Result 2.6.7: Subspace Intersections and Sums

Let $\Omega_1$ and $\Omega_2$ be arbitrary subspaces of $\mathbb{R}^n$.

1. $$
(\Omega_1 \cap \Omega_2)^\perp = \Omega_1^\perp + \Omega_2^\perp
$$

2. $$
(\Omega_1^\perp + \Omega_2) \cap \Omega_1 = P_{\Omega_1} \Omega_2
$$

3. $$
\Omega_1 + \Omega_2 = \Omega_1 \oplus P_{\Omega_1^\perp} \Omega_2
$$

---

## 🔹 Application to Column Spaces

Let $A = (A_1, A_2)$ be a matrix with full column rank.  
If $\Omega_i = \text{Col}(A_i)$ for $i = 1, 2$, then:

$$
\text{Col}(A) = \Omega_1 + \Omega_2
$$

Let $\widetilde{\Omega} = P_{\Omega_1^\perp} \Omega_2$.  
Since $\Omega_1 \perp \widetilde{\Omega}$, property 3 of Result 2.6.7 implies:

$$
P_{\text{Col}(A)} = P_{\Omega_1} + P_{\widetilde{\Omega}}
$$

Also, from Result 2.6.1:

- $P = P_{\text{Col}(A)}$
- $P_1 = P_{\text{Col}(A_1)} = P_{\Omega_1}$
- $P_2 = P_{\text{Col}(B)}$, where $B = (I - P_1) A_2 = P_{\Omega_1^\perp} A_2$

Since $\text{Col}(B) = P_{\Omega_1^\perp} \Omega_2 = \widetilde{\Omega}$, the identity above reads $P = P_1 + P_2$; this geometric interpretation matches the algebraic result from Example 2.1.3.
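
The decomposition of $P_{\text{Col}(A)}$ into two mutually orthogonal pieces can be illustrated numerically. This sketch treats the special case where $A_1$ and $A_2$ are single columns, so each projection onto a line is an outer product; the vectors `a1`, `a2` and the helper names are illustrative assumptions.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def line_projection(a):
    """Projection matrix onto span{a} for a nonzero vector a."""
    aa = dot(a, a)
    return [[x * y / aa for y in a] for x in a]

def two_column_projection(c1, c2):
    """P = A (A^T A)^{-1} A^T for A = (c1, c2) with linearly independent columns."""
    g11, g12, g22 = dot(c1, c1), dot(c1, c2), dot(c2, c2)
    det = g11 * g22 - g12 * g12
    inv = [[g22 / det, -g12 / det], [-g12 / det, g11 / det]]
    cols = [c1, c2]
    n = len(c1)
    return [[sum(cols[r][i] * inv[r][s] * cols[s][j]
                 for r in range(2) for s in range(2))
             for j in range(n)] for i in range(n)]

a1 = [1.0, 0.0, 1.0]   # A_1
a2 = [0.0, 1.0, 1.0]   # A_2

P1 = line_projection(a1)
# B = (I - P1) A_2: the part of a2 orthogonal to a1
P1a2 = [sum(P1[i][j] * a2[j] for j in range(3)) for i in range(3)]
b = [a2[i] - P1a2[i] for i in range(3)]
P2 = line_projection(b)

P = two_column_projection(a1, a2)
max_diff = max(abs(P[i][j] - (P1[i][j] + P2[i][j]))
               for i in range(3) for j in range(3))

print("B =", b)
print("a1 . B =", dot(a1, b))
print("max |P - (P1 + P2)| =", max_diff)
```

Here $B = (-1/2, 1, 1/2)^\top$ is orthogonal to $A_1$, and the sum $P_1 + P_2$ agrees with the projection computed directly from $A$.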

---

These results provide a powerful geometric framework for understanding projections, subspace interactions, and least squares estimation in linear models.


In [3]:
# -----------------------------
# Basic Matrix Operations
# -----------------------------
def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def inverse_2x2(M):
    a, b = M[0][0], M[0][1]
    c, d = M[1][0], M[1][1]
    det = a * d - b * c
    if abs(det) < 1e-8:
        return None
    return [[d/det, -b/det], [-c/det, a/det]]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

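def is_symmetric(A, tol=1e-8):
    # Compare entries with a tolerance; exact float equality is fragile
    n = len(A)
    return all(abs(A[i][j] - A[j][i]) < tol for i in range(n) for j in range(n))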

def is_idempotent(A):
    A2 = matmul(A, A)
    return all(abs(A2[i][j] - A[i][j]) < 1e-8 for i in range(len(A)) for j in range(len(A)))

def print_matrix(M, label):
    print(f"\n🔹 {label}:")
    for row in M:
        print("  ".join(f"{val:8.4f}" for val in row))

# -----------------------------
# Projection Matrix: P = X(XᵗX)⁻¹Xᵗ
# -----------------------------
def projection_matrix(X):
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    XtX_inv = inverse_2x2(XtX)
    if XtX_inv is None:
        raise ValueError("XtX is singular")
    return matmul(matmul(X, XtX_inv), Xt)

# -----------------------------
# Example: X ∈ ℝ³ˣ² of full rank
# -----------------------------
X = [
    [1, 0],
    [0, 1],
    [1, 1]
]

P = projection_matrix(X)
print_matrix(X, "Basis Matrix X")
print_matrix(P, "Projection Matrix P")

# Verify properties
print("\n✅ Symmetric:", is_symmetric(P))
print("✅ Idempotent:", is_idempotent(P))
PX = matmul(P, X)
print_matrix(PX, "PX (should equal X)")

# Complement projection
I = identity(3)
I_minus_P = [[I[i][j] - P[i][j] for j in range(3)] for i in range(3)]
print_matrix(I_minus_P, "Projection onto V⊥ (I - P)")

# Construct Z such that ZX = 0
Z = I_minus_P[:1]  # One row from I - P
ZX = matmul(Z, X)
print_matrix(Z, "Matrix Z (row from I - P)")
print_matrix(ZX, "ZX (should be zero)")

# Trace (equals the rank for a projection matrix)
print("\nTrace of P:", round(trace(P), 4))
print("Trace of I - P:", round(trace(I_minus_P), 4))



🔹 Basis Matrix X:
  1.0000    0.0000
  0.0000    1.0000
  1.0000    1.0000

🔹 Projection Matrix P:
  0.6667   -0.3333    0.3333
 -0.3333    0.6667    0.3333
  0.3333    0.3333    0.6667

✅ Symmetric: True
✅ Idempotent: True

🔹 PX (should equal X):
  1.0000    0.0000
  0.0000    1.0000
  1.0000    1.0000

🔹 Projection onto V⊥ (I - P):
  0.3333    0.3333   -0.3333
  0.3333    0.3333   -0.3333
 -0.3333   -0.3333    0.3333

🔹 Matrix Z (row from I - P):
  0.3333    0.3333   -0.3333

🔹 ZX (should be zero):
  0.0000    0.0000

Trace of P: 2.0
Trace of I - P: 1.0
