## 5A Invariant Subspaces

### Eigenvalues

**Definition 5.1: Operator**
- A linear map from a vector space to itself is called an *operator*

**Definition 5.2: Invariant Subspace**
- Let $T\in\mathcal{L}(V)$, a subspace $U$ of $V$ is called *invariant* under $T$ if: $$Tu\in U, \ \forall u\in U$$

Consider this operator on $V$. If $T|_U$, which is the mapping of $T$ restricted to the domain of $U$, is an *operator on* $U$, then $U$ is an invariant subspace.

Ex. $T\in\mathcal{L}(\mathcal{P}(\reals))\coloneqq Tp=p'$
- Consider $\mathcal{P}_4(\reals)$. This is a subspace of $\mathcal{P}(\reals)$ that is invariant under $T$ because any $p\in\mathcal{P}_4(\reals)$ has degree at most 4, so its derivative $p'$ has degree at most 3. Thus $p'\in\mathcal{P}_4(\reals)$

Ex. Generically invariant subspaces for any operator $T\in\mathcal{L}(V)$:
- $\{0\}$ is an invariant subspace under $T$: $$Tu=0\in\{0\}, \ \forall u\in\{0\}$$
- $V$ is an invariant subspace under $T$: $$Tu\in V, \ \forall u \in V$$
- $\text{null } T$ is an invariant subspace under $T$: $$Tu=0\in\text{null } T, \ \forall u\in \text{null } T$$
- $\text{range } T$ is an invariant subspace under $T$: $$Tu\in\text{range } T, \ \forall u \in\text{range } T$$

**NOTE:** From these generic examples, it is clear that any operator on $V$ has at least the invariant subspaces $\{0\}$ and $V$. It turns out (as we will see later) that operators on finite-dimensional complex vector spaces of dimension greater than $1$ always have other invariant subspaces as well.

**NOTE:** When $T$ is an automorphism (i.e. it is invertible), then $\text{null }T = \{0\}$ and $\text{range } T = V$, so the last two examples give nothing new

#### Eigenvalues

Consider a one-dimensional subspace. Take any $v\in V$ with $v\ne 0$ and let $U$ be the set of *all scalar multiples* of $v$: $$U=\{\lambda v: \lambda\in\mathbb{F}\} = \text{span}(v)$$
$U$ is a one-dimensional subspace of $V$. Note that $U$ is uniquely defined w.r.t. the chosen $v\in V$. Any one-dimensional subspace of $V$ is similarly defined based on the choice of $v$.\
If $U$ is invariant under an operator $T\in\mathcal{L}(V)$, then $Tv\in U$, and thus there exists a scalar $\lambda\in\mathbb{F}$ such that: $$Tv=\lambda v$$
The converse is also true:\
If $Tv=\lambda v$ for some $\lambda\in\mathbb{F}$, then $\text{span}(v)$ is a one-dimensional subspace of $V$ that is invariant under $T$.

**Definition 5.5: Eigenvalue**
- For $T\in\mathcal{L}(V)$, a number $\lambda \in \mathbb{F}$ is called an *eigenvalue* of $T$ if there exists $v\in V$ such that $v\ne 0$ and $Tv = \lambda v$
- I.e.: $$\exists v\in V : Tv = \lambda v, v\ne 0 \implies \lambda \text{ is eigenvalue}$$

So, $V$ has a one-dimensional subspace that is invariant under $T$ if and only if $T$ has an eigenvalue

Ex. $T\in\mathcal{L}(\mathbb{F}^3)$:
$$ T(x, y, z) = \big(7x + 3z, \ 3x + 6y + 9z, \ -6y \big), \ \forall (x,y,z)\in\mathbb{F}^3$$
Then $T(3, 1, -1) = (18, 6, -6) = 6(3, 1, -1)$, thus $6$ is an *eigenvalue* of $T$

**Equivalent Conditions for an Eigenvalue**
- For finite-dimensional vector space $V$, operator $T\in\mathcal{L}(V)$, and $\lambda\in\mathbb{F}$, the following are equivalent:
    - $\lambda \text{ is an eigenvalue of } T$
    - $T - \lambda I \text{ is not injective}$
    - $T - \lambda I \text{ is not surjective}$
    - $T - \lambda I \text{ is not invertible}$

From $Tv = \lambda v$ it follows that $(T - \lambda I)v = 0$
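The example and the equivalent conditions above can be checked numerically. This is a minimal sketch, assuming $\mathbb{F}=\reals$ and that `A` is the matrix of the example operator w.r.t. the standard basis (the names `A`, `v`, and `rank` are illustrative only):

```python
import numpy as np

# Matrix of T(x, y, z) = (7x + 3z, 3x + 6y + 9z, -6y) w.r.t. the standard basis
A = np.array([
    [7, 0, 3],
    [3, 6, 9],
    [0, -6, 0],
])
v = np.array([3, 1, -1])

Tv = A @ v
print(Tv)                          # T applied to v
print(np.array_equal(Tv, 6 * v))   # True exactly when Tv = 6v

# Equivalent condition: T - 6I is not injective, so its rank is below dim V = 3
rank = np.linalg.matrix_rank(A - 6 * np.eye(3))
print(rank)
```

A rank below $3$ for $T - 6I$ is the matrix version of "not injective / not surjective / not invertible".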

**Definition 5.8: Eigenvector**
- For $T\in\mathcal{L}(V)$ and eigenvalue $\lambda\in\mathbb{F}$ of $T$, the vector $v\in V$ is called an *eigenvector* of $T$ corresponding to $\lambda$ if $v\ne 0 $ and $Tv = \lambda v$

From the definition of an eigenvalue (5.5), an eigenvalue always comes with at least one nonzero vector $v\in V$ satisfying $Tv = \lambda v$. Any such vector is an eigenvector.\
A nonzero vector $v\in V$ is an eigenvector of an operator $T$ on $V$ if and only if $Tv$ is a scalar multiple of $v$. Correspondingly, a vector $v\in V$ is an eigenvector of $T$ corresponding to the eigenvalue $\lambda$ if and only if $v\ne 0$ and $v\in \text{null}(T - \lambda I)$

**Linear Independence of Eigenvectors**
- For an operator $T\in\mathcal{L}(V)$, *every* list of eigenvectors of $T$ corresponding to distinct eigenvalues of $T$ is linearly independent

*Proof:*

Proof by contradiction. Were this theorem false, then there would exist a linearly dependent list of eigenvectors **of minimal length** $m$, $v_1,...,v_m$, corresponding to *distinct* eigenvalues $\lambda_1,...,\lambda_m$ of $T$. Note that $m\ge 2$ because eigenvectors are nonzero by definition (eigenvalues, by contrast, may be zero). Then by the definition of linear dependence, there exist coefficients $a_1,...,a_m\in\mathbb{F}$, not all zero, such that: $$a_1v_1 + \cdots + a_mv_m = 0$$
In fact none of the $a_k$ is $0$: if some $a_k$ were $0$, dropping $v_k$ would leave a shorter linearly dependent list, contradicting the minimality of $m$.\
Apply $T-\lambda_m I$ to both sides:
$$a_1(\lambda_1 - \lambda_m) v_1 + \cdots + a_{m-1} (\lambda_{m-1} - \lambda_m) v_{m-1} = 0$$
Because the eigenvalues are distinct and every $a_k$ is nonzero, none of the coefficients in this equation is $0$. Thus the list of eigenvectors $v_1,...,v_{m-1}$ is a linearly dependent list of length $m-1$. This violates the minimality of $m$, and thus completes the proof.

**Operators Cannot Have More Distinct Eigenvalues than the Dimension of the Vector Space**
- For finite-dimensional $V$, each operator on $V$ has *at most* $\dim V$ distinct eigenvalues

This follows from the theorem that eigenvectors corresponding to distinct eigenvalues are linearly independent: choosing one eigenvector for each distinct eigenvalue gives a linearly independent list, and the length of any linearly independent list in a finite-dimensional vector space $V$ is at most $\dim V$.
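As a quick numerical illustration of the independence result, here is a minimal sketch using a hypothetical $3\times 3$ matrix with three distinct eigenvalues (the matrix and variable names are assumptions made for the example):

```python
import numpy as np

# Hypothetical operator on R^3 with distinct eigenvalues 2, 3 and 5
A = np.array([
    [2.0, 1.0, 0.0],
    [0.0, 3.0, 1.0],
    [0.0, 0.0, 5.0],
])

eigvals, eigvecs = np.linalg.eig(A)      # columns of eigvecs are eigenvectors
rank = np.linalg.matrix_rank(eigvecs)    # rank 3 means the columns are independent

print(np.sort(eigvals))
print(rank)
```

Three independent eigenvectors already fill $\reals^3$, so there is no room for a fourth distinct eigenvalue.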

### Polynomials Applied to Operators

Operators may be raised to powers. This isn't that remarkable when you think about it. A linear map between two different vector spaces (a non-operator) cannot be composed with itself because its output does not lie in its domain. But an operator (i.e. a linear map from a vector space to itself) may be composed with itself any number of times.

**Notation $T^m$**\
For operator $T\in\mathcal{L}(V)$:
- $T^m\in\mathcal{L}(V)$ is defined by $T^m = T\cdots T$ ($m$ times) for each positive integer $m$
- $T^0$ is defined to be the identity operator $I$ on $V$
- If $T$ is invertible, then $T^{-m}$ is defined by: $$T^{-m} = (T^{-1})^m$$

**Notation: $p(T)$**\
For operator $T\in\mathcal{L}(V)$ and polynomial $p\in\mathcal{P}(\mathbb{F})$ given by $p(z) = a_0 + a_1z + a_2z^2 + \cdots + a_mz^m$, $p(T)$ is the operator on $V$ defined by: $$p(T) = a_0 I + a_1T + a_2T^2 + \cdots + a_mT^m$$

**Definition 5.16: Product of Polynomials**
- If $p,q \in\mathcal{P}(\mathbb{F})$, then $pq\in\mathcal{P}(\mathbb{F})$ is the polynomial defined by: $$(pq)(z) = p(z)q(z), \ \forall z \in\mathbb{F}$$

**Multiplicative Properties:**
$$\begin{align*}
& a) \ \ (pq)(T) = p(T)q(T) \\
& b) \ \ p(T)q(T) = q(T)p(T)
\end{align*}$$
So, the order doesn't matter. This corresponds exactly with the product of ordinary polynomials. It is possible because powers of the *same* operator commute with each other ($T^jT^k = T^{j+k} = T^kT^j$), so any two polynomials in $T$ commute. Operators in general do *not* commute.
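The definition of $p(T)$ and the two multiplicative properties can be sketched in NumPy. The helper `poly_of_operator` and the sample `T`, `p`, `q` below are hypothetical, chosen only for illustration; coefficients are stored lowest degree first:

```python
import numpy as np

def poly_of_operator(coeffs, T):
    """Evaluate a_0 I + a_1 T + ... + a_m T^m, with coeffs = [a_0, ..., a_m]."""
    n = T.shape[0]
    result = np.zeros((n, n))
    power = np.eye(n)              # T^0 = I
    for a in coeffs:
        result = result + a * power
        power = power @ T          # move on to the next power of T
    return result

T = np.array([[1.0, 2.0], [3.0, 4.0]])
p = [1.0, 0.0, 2.0]                # p(z) = 1 + 2z^2
q = [-3.0, 1.0]                    # q(z) = -3 + z
pq = np.polynomial.polynomial.polymul(p, q)   # coefficients of the product pq

lhs = poly_of_operator(pq, T)
rhs = poly_of_operator(p, T) @ poly_of_operator(q, T)
swapped = poly_of_operator(q, T) @ poly_of_operator(p, T)

print(np.allclose(lhs, rhs))       # (pq)(T) = p(T) q(T)
print(np.allclose(rhs, swapped))   # p(T) q(T) = q(T) p(T)
```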

**Null Space and Range of $p(T)$ are invariant under $T$**
- For an operator $T\in\mathcal{L}(V)$ and a polynomial $p\in\mathcal{P}(\mathbb{F})$, $\text{null } p(T)$ and $\text{range } p(T)$ are both *invariant* under $T$

We saw earlier that the null space and range of an operator are invariant under that operator. This extends that result to a polynomial of the operator. The key fact is that $p(T)$ commutes with $T$: if $p(T)u = 0$, then $p(T)(Tu) = T\big(p(T)u\big) = 0$, so $Tu\in\text{null } p(T)$; and if $u = p(T)v$, then $Tu = p(T)(Tv)\in\text{range } p(T)$.
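A small numerical check of this invariance, as a sketch only. It assumes the $3\times 3$ example operator from the eigenvalue section and the polynomial $p(z) = z - 6$; the variable names are illustrative:

```python
import numpy as np

A = np.array([[7.0, 0.0, 3.0], [3.0, 6.0, 9.0], [0.0, -6.0, 0.0]])
P = A - 6 * np.eye(3)              # p(T) for p(z) = z - 6

# Null space: v is in null p(T), and so is Tv
v = np.array([3.0, 1.0, -1.0])
in_null = np.allclose(P @ v, 0)
image_in_null = np.allclose(P @ (A @ v), 0)

# Range: the columns of P span range p(T). Appending their images under T
# must not raise the rank if range p(T) is invariant under T.
rank_P = np.linalg.matrix_rank(P)
rank_aug = np.linalg.matrix_rank(np.hstack([P, A @ P]))

print(in_null, image_in_null)
print(rank_P == rank_aug)
```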

### Exercises 5A

In [13]:
import numpy as np
x = np.array([1, 2, 3, 4, 5])
y = np.array([3, 1, 3, 4, 5])

v = np.column_stack((x, y))

T = np.array([[0, -3], [1, 0]])
print(v)
print(T @ v.T)

[[1 3]
 [2 1]
 [3 3]
 [4 4]
 [5 5]]
[[ -9  -3  -9 -12 -15]
 [  1   2   3   4   5]]


## 5B The Minimal Polynomial

### Existence of Eigenvalues on Complex Vector Spaces

***Major Theorem:***
**Existence of Eigenvalues**
- Every operator on a finite-dimensional nonzero complex vector space has an eigenvalue

*Proof:*\
Let $V$ be a finite-dimensional complex vector space with dimension $n>0$ and $T\in\mathcal{L}(V)$ be an operator on $V$. Fix some $v \in V$ with $v\ne 0$ and consider the list: $$v, Tv, T^2v, ..., T^n v$$
This list *cannot be* linearly independent because its length is $n+1$ and $\dim V = n$. Thus, some non-trivial linear combination of the vectors in this list equals $0$, i.e. there exists a nonzero polynomial $p$ such that: $$p(T)v = 0$$
Among all such polynomials, choose $p$ to have the *smallest degree*. Because $v\ne 0$, $p$ is not constant.\
The first version of the fundamental theorem of algebra (ch 4) states: "Every non-constant polynomial with complex coefficients has a root in $\mathbb{C}$". Thus, there exists $\lambda \in \mathbb{C} : p(\lambda) = 0$, and some polynomial $q\in\mathcal{P}(\mathbb{C})$ such that: $$p(z) = (z-\lambda)q(z), \ \forall z\in \mathbb{C}$$
where $p$ has some degree $m$ and $q$ has degree $m-1$ (ch 4 again).\
This implies that:
$$p(T)v = (T - \lambda I ) \big(q(T)v\big) = 0$$
Because $q$ has degree less than $p$ (i.e. $m-1 < m$) and $p$ was chosen with the smallest degree such that $p(T)v = 0$, it must be the case that $q(T)v \ne 0$.\
 Thus $\lambda$ is an eigenvalue with eigenvector $q(T)v$

*A Cleaner Proof:*\
Let $V$ be a complex finite-dimensional vector space with $\dim V = n > 0$ and operator $T\in\mathcal{L}(V)$.\
 Then consider the list of polynomial terms: 
 $$v, Tv, T^2v,...,T^nv, \ v\ne 0 \in V$$
This list *cannot be* linearly independent because it has length $n+1$ while $\dim V = n$.\
 Thus, there exist *complex numbers* $a_0,...,a_n$ not all zero such that: 
 $$a_0v + a_1Tv + a_2T^2v + \cdots + a_n T^n v = 0$$
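The coefficients $a_1,...,a_n$ cannot all be $0$, because otherwise $a_0v = 0$ with $a_0\ne 0$ would force $v=0$. So the $a_k$ are the coefficients of a non-constant polynomial with complex coefficients.\
 By the fundamental theorem of algebra (ch 4), it has a factorization of the form:
$$a_0 + a_1z + \cdots + a_n z^n = c(z - \lambda_1) \cdots (z - \lambda_m)$$ 
for some $c\ne 0 \in \mathbb{C}$, $1\le m\le n$, and $\lambda_1,...,\lambda_m \in \mathbb{C}$.\
Substituting $T$ for $z$: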
$$
\begin{align*}
& 0 = a_0v + a_1Tv + a_2T^2v + \cdots + a_n T^n v \\
& \ \ = \big(a_0 I + a_1 T + \cdots + a_n T^n \big) v \\
& \ \ = c(T - \lambda_1 I ) \cdots (T - \lambda_m I) v
\end{align*}
$$
Because $c \ne 0$ and $v \ne 0$, the factors $T - \lambda_j I$ cannot all be injective (a product of injective maps is injective and would send $v$ to a nonzero vector). Thus $T-\lambda_j I$ is *not injective* for some $1\le j\le m$, so $\lambda_j$ is an eigenvalue of $T$.
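The steps of this proof can be traced numerically. The sketch below is only an illustration under assumptions: `T` is a hypothetical rotation of $\reals^2$ (which has no real eigenvalues, so the complex field is genuinely needed), and the dependence among $v, Tv, ..., T^nv$ is found from a singular value decomposition rather than by hand:

```python
import numpy as np

T = np.array([[0.0, -1.0], [1.0, 0.0]])   # rotation by 90 degrees
n = T.shape[0]
v = np.array([1.0, 0.0])

# Columns v, Tv, ..., T^n v: n + 1 vectors in an n-dimensional space
K = np.column_stack([np.linalg.matrix_power(T, k) @ v for k in range(n + 1)])

# K has only n rows, so the last right-singular vector satisfies K @ a = 0
_, _, Vh = np.linalg.svd(K)
a = Vh[-1]                                 # coefficients a_0, ..., a_n

# Roots of a_0 + a_1 z + ... + a_n z^n (np.roots wants highest degree first)
roots = np.roots(a[::-1])

# At least one root makes T - lambda I singular (smallest singular value ~ 0)
smallest_sv = [np.linalg.svd(T - lam * np.eye(n), compute_uv=False)[-1]
               for lam in roots]
print(roots)
print(min(smallest_sv) < 1e-8)
```

For this rotation the polynomial is a multiple of $1 + z^2$, whose roots are $\pm i$.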

### Eigenvalues and the Minimal Polynomial

**Definition 5.21: Monic Polynomial**
- A *monic polynomial* is a polynomial whose highest-degree *coefficient* equals $1$
- E.g. $2 + 9z^2 + z^7$ is a monic polynomial of degree $7$

**Existence, Uniqueness, and Degree of the Minimal Polynomial**
- For finite-dimensional $V$ and operator $T\in\mathcal{L}(V)$, there is a unique monic polynomial $p\in\mathcal{P}(\mathbb{F})$ of smallest degree such that $p(T)=0$
- Furthermore, $\deg p \le \dim V$

**Definition 5.24: Minimal Polynomial**\
For finite-dimensional $V$ and operator $T\in\mathcal{L}(V)$:
- The minimal polynomial of $T$ is the unique monic polynomial $p\in\mathcal{P}(\mathbb{F})$ of smallest degree such that $p(T) =0$

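This minimal polynomial is found by taking the *smallest* $m\in\{1,...,\dim V\}$ for which the following equation has a solution $c_0,...,c_{m-1}\in\mathbb{F}$. The minimal polynomial is then $c_0 + c_1z + \cdots + c_{m-1}z^{m-1} + z^m$: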
$$c_0 I + c_1 T + \cdots + c_{m-1}T^{m-1} = -T^m$$
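One way to carry this out numerically is to try $m = 1, 2, ..., \dim V$ in turn and solve the equation above by least squares, stopping at the first $m$ for which the solution is exact. This is a minimal sketch under stated assumptions: `minimal_polynomial` is a hypothetical helper, the tolerance is arbitrary, and floating-point round-off means it is only reliable for small, well-scaled matrices.

```python
import numpy as np

def minimal_polynomial(T, tol=1e-8):
    """Return [c_0, ..., c_{m-1}, 1]: the monic polynomial of least degree with p(T) = 0."""
    n = T.shape[0]
    powers = [np.eye(n).ravel()]            # flattened T^0
    current = np.eye(n)
    for m in range(1, n + 1):
        current = current @ T               # T^m
        lhs = np.column_stack(powers)       # columns: I, T, ..., T^(m-1), flattened
        rhs = -current.ravel()
        c, *_ = np.linalg.lstsq(lhs, rhs, rcond=None)
        if np.allclose(lhs @ c, rhs, rtol=0, atol=tol):
            return np.append(c, 1.0)        # exact solution: smallest such m
        powers.append(current.ravel())
    raise ValueError("no solution found up to degree dim V")

# Illustrative 5x5 matrix
M = np.array([
    [0, 0, 0, 0, -3],
    [1, 0, 0, 0, 6],
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
], dtype=float)

coeffs = minimal_polynomial(M)
print(np.round(coeffs, 6))   # lowest degree first
```

For this `M` the loop only succeeds at $m = 5$, giving $3 - 6z + z^5$.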

***IMPORTANT RESULT:* Eigenvalues are the Zeros of the Minimal Polynomial**
- For finite-dimensional $V$ and operator $T\in\mathcal{L}(V)$:
    - The zeros of the minimal polynomial of $T$ are the eigenvalues of $T$
    - If $V$ is a complex vector space, then the minimal polynomial of $T$ has the form:$$(z - \lambda_1)\cdots(z - \lambda_m)$$ where $\lambda_1,..., \lambda_m$ is a list of all eigenvalues of $T$, possibly with repetitions

*Proof:*\
Let $p$ be the minimal polynomial of $T$ and $\lambda\in\mathbb{F}$ be a zero of $p$. Then $p$ may be expressed as:
$$p(z) = (z-\lambda)q(z)$$
where $q$ is a monic polynomial with coefficients in $\mathbb{F}$ and $\deg q = (\deg p) - 1$.\
Because the minimal polynomial $p(T) = 0$, we have:
$$0 = (T-\lambda I) \big(q(T)v\big), \ \forall v\in V$$
$p$ is the minimal polynomial, so no nonzero polynomial of degree less than $\deg p$ gives the zero operator when applied to $T$. Since $\deg q = (\deg p) - 1$, $q(T)\ne 0$, i.e. $q(T)v \ne 0$ for *at least one* $v\in V$. For such a $v$, the equation above shows that $\lambda$ is an eigenvalue of $T$ with eigenvector $q(T)v$.

Now we must prove that every eigenvalue of $T$ is also a zero of $p$. To do this, let $\lambda \in \mathbb{F}$ be an eigenvalue of $T$. Thus, there exists $v \in V$ with $v\ne 0$ such that $Tv = \lambda v$. Applying $T$ repeatedly yields $T^kv = \lambda^k v$ for every positive integer $k$. Thus:
$$p(T)v = p(\lambda)v$$
Because $p$ is the minimal polynomial of $T$, $p(T) = 0$, so $p(\lambda)v = 0$. Since $v\ne 0$, $p(\lambda) =0$, i.e. $\lambda$ is a zero of $p$.

The second result simply follows from the fundamental theorem of algebra.\
These results also imply that the operator $T$ has at most as many distinct eigenvalues as the degree of its minimal polynomial. Since the minimal polynomial has degree at most $\dim V$, this again shows that $T$ has at most $\dim V$ distinct eigenvalues. This is an application of the result from chapter 4 that a nonzero polynomial has at most as many distinct zeros as its degree.

**$q(T)=0$ If and Only If $q$ is a Polynomial Multiple of the Minimal Polynomial**
- For finite-dimensional $V$, operator $T$, and $q\in\mathcal{P}(\mathbb{F})$, $q(T)=0$ if and only if $q$ is a polynomial multiple of the minimal polynomial

*Proof:*\
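Let $p$ be the minimal polynomial of $T$ and suppose $q(T)=0$. Then, by the division algorithm for polynomials (ch 4), there exist polynomials $s,r\in\mathcal{P}(\mathbb{F})$ such that:
$$q = ps + r, \ \deg r < \deg p$$
Then,
$$0 = q(T) = p(T)s(T) + r(T) = r(T)$$
This implies that $r = 0$: otherwise, dividing $r$ by its leading coefficient would give a monic polynomial of degree less than $\deg p$ that is zero when applied to $T$, contradicting the minimality of $p$. Thus:
$$q = ps$$
Therefore, $q$ is a polynomial multiple of the minimal polynomial $p$.\
Conversely, if $q = ps$ for some $s\in\mathcal{P}(\mathbb{F})$, then $q(T) = p(T)s(T) = 0$.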

**Minimal Polynomial of a Restriction Operator**
- For finite-dimensional $V$, operator $T\in\mathcal{L}(V)$, and subspace $U\subseteq V$ invariant under $T$:
    - The minimal polynomial of $T$ is a polynomial multiple of the minimal polynomial of $T|_U$

*Proof:*\
Let $p$ be the minimal polynomial of $T$. Then $p(T)v=0, \ \forall v\in V$, and: $$p(T)u = 0, \ \forall u \in U$$
Thus $p\big(T|_U\big)=0$.\
By the result above, $p\big(T|_U\big)=0$ *if and only if* $p$ is a polynomial multiple of the minimal polynomial of $T|_U$.

***Big Result:* An Operator $T$ is Not Invertible If and Only If the Constant Term of its Minimal Polynomial is $0$**
- For finite-dimensional $V$ and operator $T\in\mathcal{L}(V)$
    - $T$ is not invertible if and only if the constant term of the minimal polynomial of $T$ is $0$

*Proof:*\
Let $p$ be the minimal polynomial of $T$. First note that $T$ is not invertible if and only if $0$ is an eigenvalue of $T$: by the equivalent conditions for an eigenvalue (with $\lambda = 0$), $0$ is an eigenvalue exactly when $T - 0I = T$ is not injective, which for an operator on a finite-dimensional vector space is the same as $T$ not being invertible.

Now, the eigenvalues of $T$ are exactly the zeros of the minimal polynomial of $T$, as we have seen. And for any polynomial $p$, $p(0)$ is equal to the constant term of $p$, so $0$ is a zero of $p$ if and only if the constant term of $p$ is $0$. Summarizing:
$$
\begin{align*}
& T \ \text{is not invertible} \iff 0 \ \text{is an eigenvalue of } T \\
& \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \iff 0 \ \text{is a zero of } p \\
& \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \iff \text{the constant term of } p \ \text{is} \ 0
\end{align*}
$$

In [1]:
import numpy as np
M = np.array([
    [0, 0, 0, 0, -3],
    [1, 0, 0, 0, 6],
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0]
])

e1 = np.array([1, 0, 0, 0, 0])

print(M @ e1.T)

[0 1 0 0 0]


### Eigenvalues on Odd-Dimensional Real Vector Spaces

**Even-Dimensional Null Space**
- Let $V$ be finite-dimensional on the reals with operator $T\in\mathcal{L}(V)$ and $b,c \in \reals$ with $b^2 < 4c$, then:
    - $\dim \text{null}\big(T^2 + b T + cI\big)$ *is an even number*

*Proof:*\
Recall that the null space of an operator $T$ is invariant under $T$. The null space of a polynomial of $T$ is also invariant under $T$. Thus $\text{null}\big(T^2 + b T + cI\big)$ is invariant under $T$. Let $U \coloneqq \text{null}\big(T^2 + b T + cI\big)$ and $S \coloneqq T|_U$, then:
$$0 = \big(S^2 + bS + cI\big)u, \ \forall u\in U$$
Now suppose $\lambda\in\reals$ were an eigenvalue of $S$ with eigenvector $u\in U$, $u\ne 0$. Then:
$$0 = \big(S^2 + bS + cI\big)u = \big(\lambda^2 + b\lambda + c\big)u = \bigg(\big(\lambda + \frac{b}{2}\big)^2 + c - \frac{b^2}{4}\bigg)u$$
Because $b^2<4c$, the term in large parentheses is positive. This implies that $u=0$, contradicting $u\ne 0$. Thus, $S$ has *no eigenvalues* and *no eigenvectors*.

Let $W$ be a subspace of $U$ that is invariant under $S$ and has the *largest even dimension* among such subspaces.\
If $W \ne U$, then there exists $u \in U$ such that $u \notin W$.\
Let $Z = \text{span}(u, Su)$. $Z$ is invariant under $S$ because $S(Su) = -bSu - cu \in Z$. Also, $\dim Z = 2$, because otherwise $Su$ would be a scalar multiple of $u$, making $u$ an eigenvector of $S$, which we've shown has no eigenvectors. Then:
$$\dim(W + Z) = \dim W + \dim Z - \dim(W\cap Z) = \dim W + 2$$
Here $W\cap Z = \{0\}$, because otherwise $W\cap Z$ would be a one-dimensional subspace invariant under $S$ (it is not all of $Z$ since $u\notin W$), which is impossible since $S$ has no eigenvectors.\
Because $W + Z$ is invariant under $S$, this equation shows that there exists a subspace of $U$ that is invariant under $S$ and has *even dimension* greater than $\dim W$, contradicting the choice of $W$. Thus, $W = U$ and $U$ **has even dimension**

***Big Result:* Operators on Odd-Dimensional Vector Spaces Have Eigenvalues**
- Every operator on an odd-dimensional vector space has an eigenvalue

### Exercises 5B

In [2]:
import numpy as np

M = np.ones((5, 5))
v = np.array([1, 2, 3, 4, 5])

v @ M

array([15., 15., 15., 15., 15.])

## 5C Upper-Triangular Matrices

**Definition 5.35: Matrix of an Operator**
- For operator $T\in\mathcal{L}(V)$, the matrix of $T$ w.r.t. a *basis* $v_1,...,v_n$ of $V$ is the $[n\times n]$ matrix $\mathcal{M}(T)$ whose entries $A_{j,k}$ are defined by: $$Tv_k = A_{1,k}v_1 + \cdots + A_{n,k}v_n$$
- Note that the $k^{th}$ column of $\mathcal{M}(T)$ consists of the coefficients of $T$ applied to the $k^{th}$ basis vector


For a finite-dimensional complex vector space, $T$ is guaranteed to have an eigenvalue $\lambda$. Choosing a basis whose first vector is a corresponding eigenvector gives a matrix representation of $T$ whose first column is all zeros except for $\lambda$ as its first entry. I.e. 
$$\begin{bmatrix} \lambda & \\ 0 & * \\ \vdots & \\ 0 & \end{bmatrix}$$
This is because $Tv_1 = \lambda v_1$ for the eigenvector $v_1$. In other words, the coefficients for this column are $A_{1,1} = \lambda$ and $A_{j,1} = 0$ for all $j > 1$.

**Definition 5.37: Diagonal of a Matrix**
- The diagonal of a *square matrix* consists of the entries on the line from the upper left corner to the bottom right corner

**Definition 5.38: Upper-Triangular Matrix**
- A square matrix is called *upper triangular* if all entries below the diagonal are $0$

**Result 5.39: Conditions for Upper-Triangular Matrix**
- For $T\in\mathcal{L}(V)$ and basis $v_1,...,v_n$ of $V$, the following are equivalent:
    - The matrix of $T$ w.r.t. $v_1,...,v_n$ is upper triangular
    - $\text{span}(v_1,...,v_k)$ is invariant under $T$ for each $k = 1,...,n$
    - $Tv_k \in \text{span}(v_1,...,v_k)$ for each $k = 1,..., n$

**Result 5.40: Equation Satisfied by Operator with Upper-Triangular Matrix**
- If $V$ has a basis with respect to which an operator $T$ has an upper-triangular matrix with diagonal entries $\lambda_1, ..., \lambda_n$, then: $$(T - \lambda_1I) \ \cdots \ (T - \lambda_n I) = 0 $$

This may be proven by induction, beginning with the recognition that $Tv_1 = \lambda_1 v_1 \ \implies \ (T - \lambda_1 I)v_1 = 0$ and by noting the pattern $(T - \lambda_k I)v_k \in \text{span}(v_1,...,v_{k-1})$ (this should be pretty obvious by considering the columns of an upper-triangular matrix)

**Result 5.41: Determination of Eigenvalues from Upper-Triangular Matrix**
- If $T\in\mathcal{L}(V)$ has an upper-triangular matrix w.r.t. some basis of $V$, then the eigenvalues of $T$ are precisely the entries on the diagonal of that upper-triangular matrix
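Results 5.40 and 5.41 are easy to check on a concrete matrix. A minimal sketch, assuming a hypothetical upper-triangular matrix `A` with diagonal entries $2, 3, 2$:

```python
import numpy as np

A = np.array([
    [2.0, 1.0, 4.0],
    [0.0, 3.0, 5.0],
    [0.0, 0.0, 2.0],
])
n = A.shape[0]
diag = np.diag(A)

# 5.40: (T - lambda_1 I) ... (T - lambda_n I) = 0, with lambda_k the diagonal entries
product = np.eye(n)
for lam in diag:
    product = product @ (A - lam * np.eye(n))
print(np.allclose(product, 0))

# 5.41: the eigenvalues are the diagonal entries (compared as sorted lists)
eigvals = np.linalg.eigvals(A)
print(np.allclose(np.sort(eigvals.real), np.sort(diag)))
```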

**Result 5.44: Necessary and Sufficient Condition to have an Upper-Triangular Matrix**
- For finite-dimensional $V$, an operator $T\in\mathcal{L}(V)$ has an upper-triangular matrix w.r.t. some basis of $V$ if and only if the minimal polynomial of $T$ equals $(z - \lambda_1) \cdots (z - \lambda_m)$ for some $\lambda_1, ..., \lambda_m \in \mathbb{F}$
- In other words, an operator has an upper-triangular matrix w.r.t. some basis if and only if its minimal polynomial may be factorized with first degree polynomials

**Result 5.47: If $\mathbb{F} = \mathbb{C}$ Then Every Operator on $V$ has an Upper-Triangular Matrix**
- All linear operators on finite-dimensional complex vector spaces have an upper-triangular matrix
- This result follows from result 5.44 and the second version of the fundamental theorem of algebra (4.13) (which states that every non-constant polynomial on the complex field has a unique factorization with first-degree polynomials)

**NOTE:** The basis vectors $v_2, ..., v_n$ w.r.t. which an operator $T$ has an upper-triangular matrix representation do not necessarily need to be eigenvectors.
- **Only** columns whose entries *above* the diagonal are all zero (the entries below are already zero) are associated with eigenvectors
    - This follows from the column expansion. Let $v_k$ be an eigenvector with eigenvalue $\lambda_k$, then: $$Tv_k = A_{1,k}v_1 + \cdots + A_{k,k} v_k + \cdots + A_{n, k}v_n = \lambda_k v_k$$ so, because $v_1,...,v_n$ is a basis, $A_{k,k} = \lambda_k$ and $A_{j,k} = 0$ for all $j\ne k$

## 5D Diagonalizable Operators

### Diagonal Matrices

**Definition 5.48: Diagonal Matrix**
- A diagonal matrix is a square matrix that is 0 everywhere except possibly on the diagonal

If an operator has a diagonal matrix, then the entries on the diagonal of its diagonal matrix ***are its eigenvalues***

**Definition 5.50: Diagonalizable**
- An operator on $V$ is diagonalizable if the operator has a diagonal matrix w.r.t. some basis of $V$

A diagonalizable operator *need not* have a diagonal matrix w.r.t. *every* basis of $V$

**Definition 5.52: Eigenspace**
- For $T\in\mathcal{L}(V)$ and $\lambda \in \mathbb{F}$, the eigenspace of $T$ corresponding to $\lambda$ is the subspace $E(\lambda, T)$ of $V$ defined by:
$$E(\lambda, T) = \text{null}(T - \lambda I) = \{v \in V : Tv = \lambda v\}$$

**Result 5.54: Sum of Eigenspaces is a Direct Sum**
- For distinct eigenvalues $\lambda_1,...,\lambda_m$ of an operator $T$ on finite-dimensional $V$: $$E(\lambda_1, T) + \cdots + E(\lambda_m, T)$$ is a direct sum. Furthermore: $$\dim E(\lambda_1, T) + \cdots + \dim E(\lambda_m, T) \le \dim V$$

The fact that the eigenspaces form a direct sum follows directly from the fact that eigenvectors associated with distinct eigenvalues are linearly independent (note the eigenspaces are not disjoint, since they all contain $0$). As such, a sum of vectors taken one from each eigenspace only equals zero when all of the vectors are zero (the trivial solution), which is the condition for a direct sum from 1.45.

### Conditions Equivalent to Diagonalizability

**Result 5.55: Equivalent Conditions to Diagonalizability**
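- For finite-dimensional $V$, operator $T\in\mathcal{L}(V)$, and $\lambda_1,...,\lambda_m$ the distinct eigenvalues of $T$, the following are equivalent:
    - $T$ is diagonalizable
    - $V$ has a basis consisting of eigenvectors of $T$
    - $V = E(\lambda_1, T) \oplus \cdots \oplus E(\lambda_m, T)$
    - $\dim V = \dim E(\lambda_1, T) + \cdots + \dim E(\lambda_m, T)$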

**Result 5.58: $\dim V$ Eigenvalues Implies Diagonalizability**
- If an operator $T\in\mathcal{L}(V)$ has $\dim V$ *distinct* eigenvalues, then it is diagonalizable

Notably, this is not a bidirectional statement. This is because operators may be diagonalizable with *fewer* than $\dim V$ distinct eigenvalues. E.g.,
$$T\in\mathcal{L}(\mathbb{F}^3) : T(x,y,z) = (6x, 6y, 7z)$$
This operator has only two distinct eigenvalues, $6$ and $7$, but it has a diagonal matrix w.r.t. the standard basis. It has one two-dimensional eigenspace associated with the eigenvalue $6$ and one one-dimensional eigenspace associated with the eigenvalue $7$.
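The eigenspace dimensions in this example can be computed as $\dim E(\lambda, T) = \dim V - \text{rank}(T - \lambda I)$. A minimal sketch (the helper `eigenspace_dim` is hypothetical, and $\mathbb{F}=\reals$ is assumed):

```python
import numpy as np

A = np.diag([6.0, 6.0, 7.0])   # T(x, y, z) = (6x, 6y, 7z) in the standard basis
n = A.shape[0]

def eigenspace_dim(A, lam):
    """dim E(lam, T) = dim null(T - lam I) = n - rank(T - lam I)."""
    size = A.shape[0]
    return int(size - np.linalg.matrix_rank(A - lam * np.eye(size)))

dims = {lam: eigenspace_dim(A, lam) for lam in (6.0, 7.0)}
print(dims)
print(sum(dims.values()) == n)   # dimensions add up to dim V, so T is diagonalizable
```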

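One particularly useful application of diagonal matrices is calculating high powers of an operator: for an eigenvector $v$ with eigenvalue $\lambda$, $T^k v = \lambda^k v$, so w.r.t. a basis of eigenvectors the matrix of $T^k$ is diagonal with entries $\lambda_j^k$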
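As a sketch of this idea, the power of a diagonalizable matrix can be computed from its eigen-decomposition and compared with repeated multiplication. The matrix `A` is a hypothetical example with eigenvalues $2$ and $5$, and `P` denotes the matrix whose columns are eigenvectors:

```python
import numpy as np

A = np.array([[4.0, 1.0], [2.0, 3.0]])   # eigenvalues 2 and 5
k = 10

eigvals, P = np.linalg.eig(A)            # columns of P are eigenvectors of A
# In the eigenbasis A^k is diagonal with entries lambda^k; change basis back with P
A_k = P @ np.diag(eigvals ** k) @ np.linalg.inv(P)

print(np.allclose(A_k, np.linalg.matrix_power(A, k)))
```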

**Result 5.62: Necessary and Sufficient Condition for Diagonalizability**
- For $T\in\mathcal{L}(V)$ and finite-dimensional $V$, $T$ is diagonalizable *if and only if* the minimal polynomial of $T$ equals $(z - \lambda_1) \cdots (z - \lambda_m)$ for some *distinct* set of $\lambda_1, ..., \lambda_m \in \mathbb{F}$

**Result 5.65: Restriction of Diagonalizable Operator to Invariant Subspace**
- Suppose $T\in\mathcal{L}(V)$ is diagonalizable and $U$ is a subspace of $V$ that is invariant under $T$. Then $T|_U$ is a diagonalizable operator on $U$

*Proof:*\
Fairly simple. Because $T$ is diagonalizable, the minimal polynomial of $T$ is a product of first-degree factors $(z - \lambda_j)$ with distinct $\lambda_j$ (5.62). The minimal polynomial of $T$ is a polynomial multiple of the minimal polynomial of $T|_U$ by 5.31 (minimal polynomial of restriction operator). Thus, the minimal polynomial of $T|_U$ must also be a product of distinct first-degree factors, and so $T|_U$ is diagonalizable (5.62 again).

### Gershgorin Disk Theorem

**Definition 5.66: Gershgorin Disks**
- Let $T\in\mathcal{L}(V)$, $v_1,...,v_n$ be a basis of $V$, and $A$ be the matrix of $T$ w.r.t. this basis.\
 Then a *Gershgorin disk* of $T$ w.r.t. this basis is a set of the form: $$\bigg\{ z\in\mathbb{F} : |z - A_{j,j}| \le \sum_{k=1, k \ne j}^{n} |A_{j, k}| \bigg\}$$ for some $j\in \{1,...,n\}$

Since there are $n$ choices for $j$, an operator $T$ has $n$ Gershgorin disks w.r.t. a given basis (one per row of $A$). In the special case of a diagonal matrix, each Gershgorin disk consists of a *single point* that is a diagonal entry of $A$. When the non-diagonal entries of $A$ are small, each eigenvalue of $T$ is near a diagonal entry of $A$. This follows from the result below:

**Result 5.67: Gershgorin Disk Theorem**
- For $T\in\mathcal{L}(V)$ and basis $v_1,...,v_n$ of $V$, each eigenvalue of $T$ is *contained within* a Gershgorin disk of $T$ with respect to the basis $v_1,...,v_n$

I guess this is handy for approximating eigenvalues...
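A minimal sketch of the theorem in NumPy, assuming a hypothetical matrix `A` with small off-diagonal entries. Each disk is stored as a center (diagonal entry) and a radius (sum of absolute values of the other entries in that row):

```python
import numpy as np

A = np.array([
    [10.0, 1.0, 0.5],
    [0.2, 5.0, 0.3],
    [0.1, 0.4, -3.0],
])

centers = np.diag(A)
radii = np.sum(np.abs(A), axis=1) - np.abs(centers)   # row sums without the diagonal

eigvals = np.linalg.eigvals(A)
# Each eigenvalue must lie in at least one disk |z - center_j| <= radius_j
in_some_disk = [bool(np.any(np.abs(lam - centers) <= radii)) for lam in eigvals]

print(list(zip(centers, radii)))
print(eigvals)
print(all(in_some_disk))
```

Because the radii here are small, the disks give rough bounds on the eigenvalues without computing them.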

## 5E Commuting Operators

**Definition 5.71: Commute**
- Two operators on the same vector space commute if $ST = TS$
- Two square matrices of the same size commute if $AB = BA$

**Result 5.74: Commuting Operators Correspond to Commuting Matrices**
- For operators $S, T \in \mathcal{L}(V)$ with matrices $\mathcal{M}(S)$ and $\mathcal{M}(T)$ both w.r.t. the same basis of $V$, \
$S$ and $T$ commute if and only if $\mathcal{M}(S)$ and $\mathcal{M}(T)$ commute
$$ST = TS \iff \mathcal{M}(ST) = \mathcal{M}(TS) \iff \mathcal{M}(S)\mathcal{M}(T) = \mathcal{M}(T)\mathcal{M}(S)$$

**Result 5.75: Eigenspace is Invariant Under Commuting Operator**
- If $S,T\in\mathcal{L}(V)$ commute, then $E(\lambda,S)$ is invariant under $T$

*Proof:*\
Let $v\in E(\lambda, S)$, then: $$S(Tv) = (ST)v = (TS)v = T(Sv) = T(\lambda v) = \lambda Tv$$ Thus $S(Tv) = \lambda (Tv)$, i.e. $Tv\in E(\lambda, S)$.
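The chain of equalities can be checked on a small example. This is a sketch with hypothetical matrices: `S` is diagonal with eigenspaces $\text{span}(e_1, e_2)$ and $\text{span}(e_3, e_4)$, and `T` is block diagonal so that it commutes with `S`:

```python
import numpy as np

S = np.diag([4.0, 4.0, 5.0, 5.0])
T = np.array([
    [1.0, 2.0, 0.0, 0.0],
    [3.0, 4.0, 0.0, 0.0],
    [0.0, 0.0, 5.0, 6.0],
    [0.0, 0.0, 7.0, 8.0],
])
commute = np.allclose(S @ T, T @ S)

v = np.array([1.0, -2.0, 0.0, 0.0])      # v is in E(4, S)
Tv = T @ v
stays_in_eigenspace = np.allclose(S @ Tv, 4 * Tv)   # S(Tv) = 4 (Tv)

print(commute, stays_in_eigenspace)
```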

**Result 5.76: Simultaneous Diagonalizability $\iff$ Commutativity**
- Two diagonalizable operators on the same vector space have diagonal matrices *w.r.t. the same basis* if and only if the two operators commute

**Result 5.78: Common Eigenvector for Commuting Operators**
- Every pair of commuting operators on a finite-dimensional nonzero complex vector space has a common eigenvector

**Result 5.80: Commuting Operators are Simultaneously Upper Triangularizable**
- Let $V$ be a finite-dimensional complex vector space and let $S,T \in \mathcal{L}(V)$ commute, then there exists a basis of $V$ with respect to which **both** $S$ and $T$ have upper-triangular matrices

**Result 5.81: Eigenvalues of Sum and Product of Commuting Operators**
- Let $V$ be a finite-dimensional complex vector space and $S,T\in\mathcal{L}(V)$ commute, then:
    - Every eigenvalue of $S + T$ is an eigenvalue of $S$ plus an eigenvalue of $T$
    - Every eigenvalue of $ST$ is an eigenvalue of $S$ times an eigenvalue of $T$
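A numerical sanity check of 5.81, as a sketch only. The commuting pair is built by taking `T` to be a polynomial in `S` (an assumption made to guarantee commutativity), and `all_in` is a hypothetical helper that tests membership up to a tolerance:

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])
S = A
T = A @ A + np.eye(2)          # a polynomial in A, so it commutes with S

eig_S = np.linalg.eigvals(S)
eig_T = np.linalg.eigvals(T)
sums = np.add.outer(eig_S, eig_T).ravel()            # all (eigenvalue of S) + (eigenvalue of T)
products = np.multiply.outer(eig_S, eig_T).ravel()   # all (eigenvalue of S) * (eigenvalue of T)

def all_in(values, candidates, tol=1e-8):
    """True if every entry of values is within tol of some entry of candidates."""
    return all(np.min(np.abs(candidates - val)) < tol for val in values)

sum_ok = all_in(np.linalg.eigvals(S + T), sums)
prod_ok = all_in(np.linalg.eigvals(S @ T), products)
print(sum_ok, prod_ok)
```

Note that the result only says each eigenvalue of $S+T$ is *some* such sum; not every sum of eigenvalues needs to be an eigenvalue of $S + T$.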

### Exercises 5E

#### 1. Give an example of commuting operators $S, T \in \mathcal{L}(\mathbb{F}^4)$ such that there is a subspace of $\mathbb{F}^4$ that is invariant under $S$ but not invariant under $T$, and vice versa

This is a really valuable question. The crux is this: commuting operators on a finite-dimensional complex vector space share a basis w.r.t. which both have upper-triangular matrices (5.80), and commuting *diagonalizable* operators even share an eigenbasis (5.76), but it is not necessarily the case that they share eigenspaces, or invariant subspaces more generally. E.g:

In [9]:
import numpy as np
from pprint import pprint

S = np.array([[1, 0, 0, 0], [0, 2, 0, 0], [0, 0, 3, 0], [0, 0, 0, 4]])
T = np.array([[4, 0, 0, 0], [0, 4, 0, 0], [0, 0, 5, 0], [0, 0, 0, 5]])

pprint(S); pprint(T)

array([[1, 0, 0, 0],
       [0, 2, 0, 0],
       [0, 0, 3, 0],
       [0, 0, 0, 4]])
array([[4, 0, 0, 0],
       [0, 4, 0, 0],
       [0, 0, 5, 0],
       [0, 0, 0, 5]])


Ok, these matrices represent two operators that clearly commute. However, the first has four distinct eigenspaces, each of which is one-dimensional. The second has only two eigenspaces, each of which is two-dimensional. Thus, the same eigenbasis corresponds to different eigenspaces for each. 

Despite these different eigenspaces, any subspace spanned by a subset of the standard basis vectors is invariant under both, because those vectors are eigenvectors of both operators. Other subspaces can behave differently: $\text{span}(e_1 + e_2)$ lies in an eigenspace of $T$ and so is invariant under $T$, but $S(e_1 + e_2) = e_1 + 2e_2$, so it is not invariant under $S$.

This pair only works in one direction, though. Because $S$ has four distinct eigenvalues, every subspace invariant under $S$ is spanned by some of the standard basis vectors (by 5.65 the restriction of $S$ to it is diagonalizable), and each such subspace is also invariant under $T$.

Since the exercise needs both directions, we need a different pair. If we only had to show exclusivity in one direction, then we could even use the identity operator, but going in both directions is a bit trickier. One option is a pair of commuting operators that are not diagonalizable.

One example found with Gemini:

In [None]:
S = np.array([
    [0, 1, 0, 0], 
    [0, 0, 0, 0], 
    [0, 0, 0, 1], 
    [0, 0, 0, 0]
])
T = np.array([
    [0, 0, 1, 0], 
    [0, 0, 0, 1], 
    [0, 0, 0, 0], 
    [0, 0, 0, 0]
])

pprint(T @ S); pprint(S @ T)

array([[0, 0, 0, 1],
       [0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0]])
array([[0, 0, 0, 1],
       [0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0]])


The subspace $\text{span}(e_3, e_4)$ is invariant under $S$ ($Se_3 = 0$ and $Se_4 = e_3$, both in the span), but is not invariant under $T$ ($Te_3 = e_1$ and $Te_4 = e_2$, which are outside of the span). Meanwhile, $\text{span}(e_2, e_4)$ is invariant under $T$ ($Te_2 = 0$ and $Te_4 = e_2$) but not under $S$ ($Se_2 = e_1$).
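These invariance claims can also be checked mechanically. A subspace spanned by the columns of a matrix is invariant under an operator exactly when appending the images of those columns does not raise the rank. The helper `is_invariant` below is a hypothetical name for this test:

```python
import numpy as np

S = np.array([[0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 0]], dtype=float)
T = np.array([[0, 0, 1, 0], [0, 0, 0, 1], [0, 0, 0, 0], [0, 0, 0, 0]], dtype=float)

def is_invariant(op, basis):
    """True if span(columns of basis) is invariant under op (rank test)."""
    augmented = np.hstack([basis, op @ basis])
    return bool(np.linalg.matrix_rank(augmented) == np.linalg.matrix_rank(basis))

e = np.eye(4)
U = e[:, [2, 3]]   # span(e3, e4)
W = e[:, [1, 3]]   # span(e2, e4)

results = (is_invariant(S, U), is_invariant(T, U), is_invariant(T, W), is_invariant(S, W))
print(results)
```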

#### 4. Prove or give a counterexample: all diagonal matrices commute with all upper-triangular matrices of the same size.

This is not always true. Counterexample:

In [21]:
A = np.array([[2, 0], [0, 5]])
B = np.array([[1, 3], [0, 2]])

print(A @ B)
print(B @ A)

[[ 2  6]
 [ 0 10]]
[[ 2 15]
 [ 0 10]]


In general, commuting operators *preserve each other's eigenspaces*. That is, when $A$ and $B$ commute, applying $B$ to a vector in an eigenspace of $A$ keeps that vector within the same eigenspace of $A$. This is result 5.75 (an eigenspace is invariant under a commuting operator). Thus, for an upper-triangular matrix to commute with a diagonal matrix, every eigenspace of the diagonal matrix must be invariant under the upper-triangular matrix.

For a diagonal matrix $A$ with repeated eigenvalues, an upper-triangular matrix that commutes with it may only have non-zero off-diagonal entries in positions $(j,k)$ where the diagonal entries of $A$ agree, i.e. $A_{j,j} = A_{k,k}$. E.g.:

In [26]:
A = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 2]])
B = np.array([[1, 2, 0], [0, 3, 0], [0, 0, 1]])

print(A @ B)
print(B @ A)

[[1 2 0]
 [0 3 0]
 [0 0 2]]
[[1 2 0]
 [0 3 0]
 [0 0 2]]


When the diagonal matrix has no repeated eigenvalues, the off-diagonal entries of any upper-triangular matrix that commutes with it must all be zero (i.e. the upper-triangular matrix must itself be diagonal).
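Both observations follow from an entrywise formula: for a diagonal $A$ with diagonal entries $a_1,...,a_n$, $(AB)_{j,k} = a_jB_{j,k}$ and $(BA)_{j,k} = B_{j,k}a_k$, so $(AB - BA)_{j,k} = (a_j - a_k)B_{j,k}$. A minimal sketch checking this on a hypothetical upper-triangular `B` that does *not* commute with `A`:

```python
import numpy as np

a = np.array([1.0, 1.0, 2.0])              # diagonal entries of A
A = np.diag(a)
B = np.array([
    [1.0, 2.0, 7.0],
    [0.0, 3.0, 4.0],
    [0.0, 0.0, 1.0],
])

commutator = A @ B - B @ A
predicted = (a[:, None] - a[None, :]) * B   # entry (j, k) is (a_j - a_k) * B[j, k]

print(np.allclose(commutator, predicted))
print(commutator)
```

The only non-zero entries of the commutator sit where `B` links positions with different diagonal entries of `A`; the entry of `B` linking the two repeated eigenvalues contributes nothing.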