diff --git a/content/notes/math/linear_algebra.mdx b/content/notes/math/linear_algebra.mdx index 6e612f4..d8bc384 100644 --- a/content/notes/math/linear_algebra.mdx +++ b/content/notes/math/linear_algebra.mdx @@ -21,7 +21,7 @@ This notation can be abbreviated by writing the entries as $\boldsymbol{A} = (a_ The *main diagonal* of an $m\times n$ matrix $\boldsymbol{A}$ is sequence of entries $(a_{i,i})_{i=1}^{\min\Set{m,n}}$. An $m\times n$ matrix $\boldsymbol{A}$ is called diagonal if its off-diagonal entries are zero, and denoted $\boldsymbol{A} = \mathrm{diag}(a_{i,i})_{i=1}^n$. A square matrix is *upper triangular* if all of its entries below the main diagonal are $0$. Similarly, a square matrix is *lower triangular* if all of its entries above the main diagonal are $0$. -The transpose of $\boldsymbol{A}\in \mathcal{M}_{m,n}(\mathbb{F})$ is the matrix $\boldsymbol{A}^T$ defined by $[\boldsymbol{A}^T]_{i,j} = \boldsymbol{A}_{j,i}$. A matrix is symmetric if $\boldsymbol{A} = \boldsymbol{A}^T$ and skew-symmetric if $\boldsymbol{A}^T = -\boldsymbol{A}$. All $n\times n$ diagonal matrices are by definition symmetric. +The transpose of $\boldsymbol{A}\in \mathcal{M}_{m,n}(\mathbb{F})$ is the matrix $\boldsymbol{A}^\top$ defined by $[\boldsymbol{A}^\top]_{i,j} = \boldsymbol{A}_{j,i}$. A matrix is symmetric if $\boldsymbol{A} = \boldsymbol{A}^\top$ and skew-symmetric if $\boldsymbol{A}^\top = -\boldsymbol{A}$. All $n\times n$ diagonal matrices are by definition symmetric. The conjugate transpose (adjoint) of a complex $m\times n$ matrix $\boldsymbol{A}\in\mathcal{M}_{m,n}(\mathbb{C})$ is the matrix $\boldsymbol{A}^*$ defined by $[\boldsymbol{A}^*]_{i,j} = \overline{\boldsymbol{A}}_{j,i}$. @@ -35,11 +35,11 @@ $$ The transpose operation has the following properties for $A,B\in\mathcal{M}_{m,n}$ and $\lambda\in\mathbb{F}$ -1. **(injection):** $(\boldsymbol{A}^T)^T = \boldsymbol{A}$ -2. $(\boldsymbol{A} + \boldsymbol{B})^T = \boldsymbol{A}^T + \boldsymbol{B}^T$ -3. $(\lambda \boldsymbol{A})^T = \lambda \boldsymbol{A}^T$ -4. $(\boldsymbol{A}\boldsymbol{B})^T = \boldsymbol{B}^T \boldsymbol{A}^T$ -5. $\det(\boldsymbol{A}^T) = \det(\boldsymbol{A})$ +1. **(injection):** $(\boldsymbol{A}^\top)^\top = \boldsymbol{A}$ +2. $(\boldsymbol{A} + \boldsymbol{B})^\top = \boldsymbol{A}^\top + \boldsymbol{B}^\top$ +3. $(\lambda \boldsymbol{A})^\top = \lambda \boldsymbol{A}^\top$ +4. $(\boldsymbol{A}\boldsymbol{B})^\top = \boldsymbol{B}^\top \boldsymbol{A}^\top$ +5. $\det(\boldsymbol{A}^\top) = \det(\boldsymbol{A})$ ## Matrix multiplication @@ -151,7 +151,7 @@ $$ Two matrices $\boldsymbol{A}, \boldsymbol{B}\in\mathcal{M}_{m,n}$ are *congruent* if there exists an invertible matrix $\boldsymbol{P}$ such that $$ - \boldsymbol{A} = \boldsymbol{PBP}^T + \boldsymbol{A} = \boldsymbol{PBP}^\top $$ @@ -341,7 +341,7 @@ For $\boldsymbol{A}\in M_{m,n}(\mathbb{F})$, the following claims are equivalent 4. $\operatorname{rank}(\boldsymbol{A}) = n$ 5. The linear map $f_{\boldsymbol{A}}:\mathbb{F}^n \to\mathbb{F}^m$ by $\boldsymbol{x}\mapsto \boldsymbol{A}\boldsymbol{x}$ is injective. -For square matrices, i.e. $\boldsymbol{A}\in M_{n,n}(\mathbb{F})$, it follows that $\ker(\boldsymbol{A}) = \Set{0} \iff \operatorname{ran}(\boldsymbol{A}) = \mathbb{F}^n$, or equivalently that $f_\boldsymbol{A}$ is bijective. +For square matrices, i.e. $\boldsymbol{A}\in M_{n,n}(\mathbb{F})$, it follows that $\ker(\boldsymbol{A}) = \Set{0} \iff \operatorname{ran}(\boldsymbol{A}) = \mathbb{F}^n$, or equivalently that $f_{\boldsymbol{A}}$ is bijective. 
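+
+As a quick numerical sanity check of the rank criteria above (an added sketch, not part of the source notes; the matrix and tolerance are arbitrary), `numpy.linalg.matrix_rank` and the SVD can confirm that a full-column-rank matrix has a trivial kernel:
+
+```python
+import numpy as np
+
+rng = np.random.default_rng(0)
+A = rng.standard_normal((5, 3))        # a generic 5x3 matrix, almost surely of rank 3
+
+print(np.linalg.matrix_rank(A) == A.shape[1])   # rank(A) = n, so x -> Ax is injective
+
+# ker(A) is spanned by right singular vectors belonging to (numerically) zero singular values
+_, s, _ = np.linalg.svd(A)
+print(np.sum(s < 1e-12) == 0)                   # ker(A) = {0}, consistent with rank(A) = n
+```
+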
## Determinants @@ -369,13 +369,13 @@ The measure $\operatorname{vol}_n(\boldsymbol{u}_1,\dots,\boldsymbol{u}_n)$ give The determinant is a function $\det:\mathcal{M}_{n}(\mathbb{F}) \to\mathbb{F}$ with the following properties for $\boldsymbol{A} \in \mathcal{M}_{n}(\mathbb{F})$ -1. If $\boldsymbol{A} = \left[\begin{smallmatrix} \shortmid & ~ & \shortmid \\ \boldsymbol{a}_1 & \dots & \boldsymbol{a}_n \\ \shortmid & ~ & \shortmid \end{smallmatrix}\right]$ then $\operatorname{vol}_n (\boldsymbol{a}_1,\dots,\boldsymbol{a_n}) = |\det(\boldsymbol{A})|$ with the geometric interpretation that the column vectors of $\boldsymbol{A}$ span a parallelepiped. +1. If $\boldsymbol{A} = \left[\begin{smallmatrix} \shortmid & & \shortmid \\ \boldsymbol{a}_1 & \dots & \boldsymbol{a}_n \\ \shortmid & & \shortmid \end{smallmatrix}\right]$ then $\operatorname{vol}_n (\boldsymbol{a}_1,\dots,\boldsymbol{a_n}) = |\det(\boldsymbol{A})|$ with the geometric interpretation that the column vectors of $\boldsymbol{A}$ span a parallelepiped. 2. $\det(\boldsymbol{A}) = 0$ if and only if $\left[\begin{smallmatrix} \shortmid \\ \boldsymbol{a}_1 \\ \shortmid \end{smallmatrix}\right], \dots, \left[\begin{smallmatrix} \shortmid \\ \boldsymbol{a}_n \\ \shortmid \end{smallmatrix}\right]$ are linearly dependent, or equivalently if $\boldsymbol{A}$ is not invertible. 3. the sign of $\det(\boldsymbol{A})$ gives the orientation of the parallelepiped spanned by $\boldsymbol{A}$. In particular, $\det(\boldsymbol{I}_n) = 1$. -The determinant of $\boldsymbol{A} = \left[\begin{smallmatrix} \shortmid & ~ & \shortmid \\ \boldsymbol{a}_1 & \dots & \boldsymbol{a}_n \\ \shortmid & ~ & \shortmid \end{smallmatrix}\right] \in\mathcal{M}_{n}(\mathbb{F})$ is given by the Leibniz formula +The determinant of $\boldsymbol{A} = \left[\begin{smallmatrix} \shortmid & & \shortmid \\ \boldsymbol{a}_1 & \dots & \boldsymbol{a}_n \\ \shortmid & & \shortmid \end{smallmatrix}\right] \in\mathcal{M}_{n}(\mathbb{F})$ is given by the Leibniz formula $$ \det(\boldsymbol{A}) = \sum_{\sigma\in S_n} \left( \operatorname{sgn}(\sigma) \prod_{i=1}^n a_{i,\sigma(i)} \right) @@ -475,7 +475,7 @@ $$ Let $\boldsymbol{A}\in\mathcal{M}_n (\boldsymbol{F})$ be an $n\times n$-matrix. The *adjugate* of $\boldsymbol{A}$ is the transpose of the cofactor matrix $\boldsymbol{C}$ of $\boldsymbol{A}$ $$ - \operatorname{adj}(\boldsymbol{A}) = \boldsymbol{C}^T + \operatorname{adj}(\boldsymbol{A}) = \boldsymbol{C}^\top $$ @@ -577,18 +577,18 @@ $$ &\left[\begin{array}{c c|c} a_{11} & a_{12} & b_1 \\ a_{21} & a_{22} & b_2 - \end{array}\right]~ + \end{array}\right] \begin{matrix} - ~ \\ + \\ \scriptsize{\boldsymbol{R}_2 - \frac{a_{21}}{a_{11}}\boldsymbol{R}_1} \end{matrix} \\ \sim& \left[\begin{array}{c c|c} a_{11} & a_{12} & b_1 \\ 0 & a_{22} - \frac{a_{21}}{a_{11}}a_{12} & b_2 - \frac{a_{21}}{a_{11}} - \end{array}\right]~ + \end{array}\right] \begin{matrix} - ~ \\ + \\ \scriptsize{a_{11}\boldsymbol{R}_2} \end{matrix} \\ @@ -1436,14 +1436,14 @@ Let $\mathrm{T}\in\mathcal{L}(V,W)$, where $\dim(V) = \dim(W) < \infty$. Then $\ Any $m\times n$ matrix $\boldsymbol{A}$ over $\mathbb{F}$ defines a linear transformation $\mathrm{T}_{\boldsymbol{A}}:\mathbb{F}^n\to\mathbb{F}^m$ in the form of the multiplication map $\boldsymbol{v}\mapsto \boldsymbol{A}\boldsymbol{v}$. -1. 
If $\boldsymbol{A}$ is an $m\times n$ matrix over $\mathbb{F}$, then the multiplication function $\mathrm{T}_\boldsymbol{A}:\mathbb{F}^n \to\mathbb{F}^m$ defined by $\boldsymbol{v} \mapsto \boldsymbol{A}\boldsymbol{v}$ is a linear map, i.e. $\mathrm{T}_\boldsymbol{A} \in\mathcal{L}(\mathbb{F}^n,\mathbb{F}^m)$. -2. If $\mathrm{T}\in\mathcal{L}(\mathbb{F}^n,\mathbb{F}^m)$ then $\mathrm{T} = \mathrm{T}_\boldsymbol{A}$ where for the standard basis $E = \Set{ \boldsymbol{e}_i }_{i=1}^n$ +1. If $\boldsymbol{A}$ is an $m\times n$ matrix over $\mathbb{F}$, then the multiplication function $\mathrm{T}_{\boldsymbol{A}}:\mathbb{F}^n \to\mathbb{F}^m$ defined by $\boldsymbol{v} \mapsto \boldsymbol{A}\boldsymbol{v}$ is a linear map, i.e. $\mathrm{T}_{\boldsymbol{A}} \in\mathcal{L}(\mathbb{F}^n,\mathbb{F}^m)$. +2. If $\mathrm{T}\in\mathcal{L}(\mathbb{F}^n,\mathbb{F}^m)$ then $\mathrm{T} = \mathrm{T}_{\boldsymbol{A}}$ where for the standard basis $E = \Set{ \boldsymbol{e}_i }_{i=1}^n$ $$ \boldsymbol{A} = \begin{bmatrix} - \shortmid & ~ & \shortmid \\ + \shortmid & & \shortmid \\ \mathrm{T}(\boldsymbol{e}_1) & \cdots & \mathrm{T}(\boldsymbol{e}_n) \\ - \shortmid & ~ & \shortmid + \shortmid & & \shortmid \end{bmatrix} \in\mathcal{M}_{m,n}(\mathbb{F}) $$ @@ -1460,7 +1460,7 @@ $$ showing that $\mathrm{T}_{\boldsymbol{A}} \in\mathcal{L}(\mathbb{F}^n,\mathbb{F}^m)$. -**(2)**: Let $E = \Set{ \boldsymbol{e}_i }_{i=1}^n$ be the standard basis of $\mathbb{F}^n$. If a vector $\boldsymbol{v}\in V$ has coordinates $[\boldsymbol{v}]_E = \left[(\beta_i)_{i=1}^n\right]^T \in\mathbb{F}^n$ then $\boldsymbol{v}$ can be written as the linear combination +**(2)**: Let $E = \Set{ \boldsymbol{e}_i }_{i=1}^n$ be the standard basis of $\mathbb{F}^n$. If a vector $\boldsymbol{v}\in V$ has coordinates $[\boldsymbol{v}]_E = \left[(\beta_i)_{i=1}^n\right]^\top \in\mathbb{F}^n$ then $\boldsymbol{v}$ can be written as the linear combination $$ \boldsymbol{v} = \sum_{i=1}^n \beta_i \boldsymbol{e}_i @@ -1472,7 +1472,7 @@ $$ \begin{align*} \mathrm{T}(\boldsymbol{v}) =& \mathrm{T} \left(\sum_{i=1}^n \beta_i \boldsymbol{e}_i \right) = \sum_{i=1}^n \beta_i \mathrm{T}(\boldsymbol{e}_i) \\ =& \begin{bmatrix} \mathrm{T}(\boldsymbol{e}_1) & \cdots & \mathrm{T}(\boldsymbol{e}_n) \end{bmatrix} [\boldsymbol{v}]_E \\ - =& \boldsymbol{A}[\boldsymbol{v}]_E = \mathrm{T}_\boldsymbol{A} (\boldsymbol{v}) + =& \boldsymbol{A}[\boldsymbol{v}]_E = \mathrm{T}_{\boldsymbol{A}} (\boldsymbol{v}) \end{align*} $$ @@ -1482,8 +1482,8 @@ Hence $\boldsymbol{A} = \begin{bmatrix} \mathrm{T}(\boldsymbol{e}_1) & \cdots & Let $\boldsymbol{A}$ be an $m\times n$ matrix over $F$. -1. $\mathrm{T}_\boldsymbol{A}:\mathbb{F}^n \to\mathbb{F}^m$ is injective if and only if $\operatorname{rank}(\boldsymbol{A}) = n$. -2. $\mathrm{T}_\boldsymbol{A}:\mathbb{F}^n \to\mathbb{F}^m$ is surjective if and only if $\operatorname{rank}(\boldsymbol{A}) = m$. +1. $\mathrm{T}_{\boldsymbol{A}}:\mathbb{F}^n \to\mathbb{F}^m$ is injective if and only if $\operatorname{rank}(\boldsymbol{A}) = n$. +2. $\mathrm{T}_{\boldsymbol{A}}:\mathbb{F}^n \to\mathbb{F}^m$ is surjective if and only if $\operatorname{rank}(\boldsymbol{A}) = m$. ### Change of basis matrices @@ -1512,7 +1512,7 @@ Hence $[\boldsymbol{v}]_C = \boldsymbol{M}_{B,C}[\boldsymbol{v}]_B$ and $\boldsy
Proof -Since $\varphi_{B,C}$ is an operator on $\mathbb{F}^n$ it has the form $\mathrm{T}_\boldsymbol{M}$ where $\boldsymbol{M}\in\mathcal{M}_n$ +Since $\varphi_{B,C}$ is an operator on $\mathbb{F}^n$ it has the form $\mathrm{T}_{\boldsymbol{M}}$ where $\boldsymbol{M}\in\mathcal{M}_n$ $$ \begin{align*} @@ -2485,7 +2485,7 @@ Note that several notations are commonly used for the transpose of $\mathrm{T}$, - $\mathrm{T}^\times$ - $\mathrm{T}^\#$ - $\mathrm{T}'$ -- $\mathrm{T}^T$ +- $\mathrm{T}^\top$ @@ -2644,7 +2644,7 @@ $$ \end{align*} $$ -Comparing the expression, we see that $[\mathrm{T}^*]_{C^*, B^*} = ([\mathrm{T}]_{B,C})^T$. +Comparing the expressions, we see that $[\mathrm{T}^*]_{C^*, B^*} = ([\mathrm{T}]_{B,C})^\top$.
@@ -2821,12 +2821,12 @@ The scalar $\lambda$ is the *determinant* of $\mathrm{T}$. The determinant of $\ Let $V$ be an $n$-dimensional $\mathbb{F}$-vector space. Given a linear operator $\mathrm{T}:V\to V$, there exists a unique scalar $\lambda\in\mathbb{F}$ such that for every alternating linear $n$-form $f$ $$ - f(\mathrm{T}\boldsymbol{v}_1,\dots,\mathrm{T}\boldsymbol{v}_n) = \lambda f(\boldsymbol{v}_1,\dots,\boldsymbol{v},x_n) + f(\mathrm{T}\boldsymbol{v}_1,\dots,\mathrm{T}\boldsymbol{v}_n) = \lambda f(\boldsymbol{v}_1,\dots,\boldsymbol{v}_n) $$ $$ \begin{CD} -V^n @>{\mathrm{T}^{n \times}}>> V^n \\ +V^n @>{\mathrm{T}^{\times n}}>> V^n \\ @V{f}VV @VV{f}V \\ \mathbb{F} @>>{\lambda}> \mathbb{F} \end{CD} $$ @@ -3758,8 +3757,7 @@ $$ Also $$ \begin{align*} - &\langle\beta\mathbf{v}\rangle = \langle\mathbf{v}\rangl \\ - \iff& (o(\mathbf{v}),\beta) = 1 \\ + &\langle\beta\mathbf{v}\rangle = \langle\mathbf{v}\rangle \iff (o(\mathbf{v}),\beta) = 1 \\ \iff& o(\beta\mathbf{v}) = o(\mathbf{v}) \end{align*} $$ @@ -4200,7 +4199,7 @@ $$ \end{align*} $$ -Then $pM = \bigoplus_{i=1}^s p\langle\mathbf{v}_i \rangle$ and $pM = \bigoplus_{i=1}^t p\langle\mathbf{u}_i \rangle$. However, $p\langle \mathbf{v}_1 \rangle = \langle p\mathbf{v}_1 \rangle$ is a cyclic submodule of $M$ with annihilator $\langle p^{e_i - 1}\rangle$ and so by the induction hypothesis $\mathbf{s} = \mathbf{t}$ and $e_1 = f_1,\dots,e_s = f_s$, concluding the proof of uniqueness. +Then $pM = \bigoplus_{i=1}^s p\langle\mathbf{v}_i \rangle$ and $pM = \bigoplus_{i=1}^t p\langle\mathbf{u}_i \rangle$. However, $p\langle \mathbf{v}_1 \rangle = \langle p\mathbf{v}_1 \rangle$ is a cyclic submodule of $M$ with annihilator $\langle p^{e_1 - 1}\rangle$ and so by the induction hypothesis $s = t$ and $e_1 = f_1,\dots,e_s = f_s$, concluding the proof of uniqueness. **(3):** Suppose $g:M\cong N$ and $M$ has annihilator chain @@ -4250,7 +4249,7 @@ $$ $$ is a primary submodule with annihilator $\langle p_i^{e_i} \rangle$. Finally, each primary submodule $M_{p_i}$ can be written as a direct sum of cyclic submodules, so that $$ - M = \underbrace{[\langle \mathbf{v}_{1,1}\oplus\cdots\oplus\langle \mathbf{v}_{i, k_1} \rangle]}_{M_{p_1}} \oplus\cdots\oplus \underbrace{[\langle \mathbf{v}_{1,1}\oplus\cdots\oplus\langle \mathbf{v}_{i, k_n} \rangle]}_{M_{p_n}} + M = \underbrace{[\langle\mathbf{v}_{1,1}\rangle\oplus\cdots\oplus\langle \mathbf{v}_{1, k_1} \rangle]}_{M_{p_1}} \oplus\cdots\oplus \underbrace{[\langle\mathbf{v}_{n,1}\rangle\oplus\cdots\oplus\langle \mathbf{v}_{n, k_n} \rangle]}_{M_{p_n}} $$ where $\operatorname{ann}(\langle\mathbf{v}_{i,j}\rangle) = \langle p_i^{e_{i,j}}\rangle$ and the terms in each cyclic decomposition can be arranged so that, for each $i$ $$ @@ -4259,7 +4258,7 @@ $$ or equivalently $e_i = e_{i,1} \geq\cdots\geq e_{i,k_i}$. 2. As for uniqueness, suppose $$ - M = \underbrace{[\langle \mathbf{u}_{1,1}\oplus\cdots\oplus\langle \mathbf{u}_{i, k_1} \rangle]}_{N_{q_1}} \oplus\cdots\oplus \underbrace{[\langle \mathbf{u}_{1,1}\oplus\cdots\oplus\langle \mathbf{u}_{i, k_n} \rangle]}_{N_{q_n}} + M = \underbrace{[\langle\mathbf{u}_{1,1}\rangle\oplus\cdots\oplus\langle \mathbf{u}_{1, k_1} \rangle]}_{N_{q_1}} \oplus\cdots\oplus \underbrace{[\langle\mathbf{u}_{n,1}\rangle\oplus\cdots\oplus\langle \mathbf{u}_{n, k_n} \rangle]}_{N_{q_n}} $$ is also a primary cyclic decomposition of $M$. Then - The number of summands is the same in both decompositions; in fact, $m = n$ and after possible reindexing $k_u = j_u$ for all $u$. 
@@ -4732,7 +4731,7 @@ $$ [\mathrm{T}]_B = \begin{bmatrix} 0 & 0 & \cdots & 0 & -a_0 \\ 1 & 0 & \cdots & 0 & -a_1 \\ - 0 & 1 & \ddots & ~ & \vdots \\ + 0 & 1 & \ddots & & \vdots \\ \vdots & \vdots & \ddots & 0 & -a_{n-2} \\ 0 & 0 & \cdots & 1 & -a_{n-1} \\ \end{bmatrix} @@ -4753,7 +4752,7 @@ $$ C[p(x)] = \begin{bmatrix} 0 & 0 & \cdots & 0 & -a_0 \\ 1 & 0 & \cdots & 0 & -a_1 \\ - 0 & 1 & \ddots & ~ & \vdots \\ + 0 & 1 & \ddots & & \vdots \\ \vdots & \vdots & \ddots & 0 & -a_{n-2} \\ 0 & 0 & \cdots & 1 & -a_{n-1} \\ \end{bmatrix} @@ -4930,7 +4929,7 @@ where the polynomials $s_k(x)$ are the invariant factors of $\mathrm{T}$ and $s_ 2. Each similarity class $\mathcal{S}$ of matrices contains a matrix $R$ in the invariant factor form of rational canonical form. Moreover, the set of matrices in $\mathcal{S}$ that have this form is the set of matrices obtained from $M$ by reordering the block diagonal matrices. Any such matrix is called and *invariant factor version* of a *rational canoncial form* of $\mathbf{A}$. 3. The dimension of $V$ is the sum of the degrees of the invariant factors of $\mathrm{T}$ $$ - \dim(V) = \sum_{i=1}^n \sum_{j=1} \deg(s_i) + \dim(V) = \sum_{i=1}^n \deg(s_i) $$ @@ -4976,7 +4975,7 @@ $$ =& \begin{bmatrix} x & 0 & \cdots & 0 & a_0 \\ -1 & x & \cdots & 0 & a_1 \\ - 0 & -1 & \ddots & ~ & \vdots \\ + 0 & -1 & \ddots & & \vdots \\ \vdots & \vdots & \ddots & x & a_{n-2} \\ 0 & 0 & \cdots & -1 & x + a_{n-1} \end{bmatrix} @@ -5134,7 +5133,7 @@ Every $n\times n$ matrix $\boldsymbol{A}\in\mathcal{M}_n (\mathbb{F})$ has an ei
Proof -Assume $\boldsymbol{A}$ represent a linear operator $\mathrm{T}_\boldsymbol{A}:V\to V$ and choose any nonzero $\boldsymbol{v}\in V$. Consider the vectors $\Set{\boldsymbol{A}^i \boldsymbol{v}}_{i=0}^n$, where $\boldsymbol{A}^0 = \boldsymbol{I}$. Since this set has $n + 1$ vectors it must be linearly dependent. Thus, +Assume $\boldsymbol{A}$ represents a linear operator $\mathrm{T}_{\boldsymbol{A}}:V\to V$ and choose any nonzero $\boldsymbol{v}\in V$. Consider the vectors $\Set{\boldsymbol{A}^i \boldsymbol{v}}_{i=0}^n$, where $\boldsymbol{A}^0 = \boldsymbol{I}$. Since this set has $n + 1$ vectors it must be linearly dependent. Thus, $$ \begin{align*} @@ -5273,7 +5272,7 @@ $$ c_{n-1} =& (-1)^1 \sum_i \lambda_i \\ c_{n-2} =& (-1)^2 \sum_{i < j} \lambda_i \lambda_j \\ c_{n-3} =& (-1)^3 \sum_{i < j < k} \lambda_i \lambda_j \lambda_k \\ - \vdot& \\ + \vdots& \\ c_0 =& (-1)^n \prod_{i=1}^n \lambda_i \end{align*} $$ @@ -5382,7 +5381,7 @@ Thus, for this basis the matrix of $\mathrm{T}|_{\langle\mathbf{v}_{i,j}\rangle} $$ \mathbf{J}(\lambda_i, e_{i,j}) = \begin{bmatrix} \lambda_i & 0 & \cdots & \cdots & 0 \\ - 1 & \lambda_i & \ddots & ~ $ \vdots \\ + 1 & \lambda_i & \ddots & & \vdots \\ 0 & 1 & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & \ddots & 0 \\ 0 & \c @@ -5517,10 +5516,10 @@ A matrix $\mathbf{A}\in\mathcal{M}_n (\mathbb{F})$ is *almost upper triangular* $$ \mathbf{A} = \begin{bmatrix} - \mathbf{A}_1 & ~ & * & ~ \\ - ~ & \mathbf{A}_2 & ~ & ~ \\ - ~ & ~ & \ddots & ~ \\ - ~ & \mathbf{0} & ~ & \mathbf{A}_k + \mathbf{A}_1 & & * & \\ + & \mathbf{A}_2 & & \\ + & & \ddots & \\ + & \mathbf{0} & & \mathbf{A}_k \end{bmatrix} $$ @@ -6020,8 +6019,8 @@ To show that $\ker(G(B)) = \Set{\boldsymbol{0}}$, choose $\boldsymbol{\beta}\in\ Let $\langle\cdot, \cdot\rangle: V\times V \to\mathbb{F}$ be an inner product on the $\mathbb{F}$-vector space $V$. Suppose $U\subseteq V$ is a $k$-dimensional subspace. A set $B = \Set{\boldsymbol{b}_i}_{i=1}^{m\leq k}$ is called -- *orthogonal system* (OS) if $\langle\boldsymbol{b}_i,\boldsymbol{b}_j\langle = 0$ for all $i\neq j$ -- *orthonormal system* (ONS) if $\langle\boldsymbol{b}_i,\boldsymbol{b}_j\langle = \delta_{ij}$, where $\delta$ is the Kronecker delta function +- *orthogonal system* (OS) if $\langle\boldsymbol{b}_i,\boldsymbol{b}_j\rangle = 0$ for all $i\neq j$ +- *orthonormal system* (ONS) if $\langle\boldsymbol{b}_i,\boldsymbol{b}_j\rangle = \delta_{ij}$, where $\delta$ is the Kronecker delta function - *orthogonal basis* if it is an OS and a basis of $U$ - *orthonormal basis* if it is an OSN and a basis of $U$ @@ -6144,7 +6143,7 @@ $$ ### QR factorization -The Gram-Schmidt process can be used to factor any real or complex matrix into a product of a matrix with orthogonal columns and an uppoer triangular matrix. Suppose that $\mathbf{A} = \left[\begin{smallmatrix} \shortmid & ~ & \shortmid \\ \mathbf{v}_1 & \cdots & \mathbf{v}_n \\ \shortmid & ~ & \shortmid \end{smallmatrix}\right] \in \mathcal{M}_{m,n} (\mathbb{F})$ is an $m\times n$ matrix where $n \leq m$. The Gram-Schmidt process applied to these columns gives orthogonal vectors $\mathbf{O} = \left[\begin{smallmatrix} \shortmid & ~ & \shortmid \\ \mathbf{u}_1 & \cdots & \mathbf{u}_n \\ \shortmid & ~ & \shortmid \end{smallmatrix}\right]$ for which +The Gram-Schmidt process can be used to factor any real or complex matrix into a product of a matrix with orthogonal columns and an upper triangular matrix. 
Suppose that $\mathbf{A} = \left[\begin{smallmatrix} \shortmid & & \shortmid \\ \mathbf{v}_1 & \cdots & \mathbf{v}_n \\ \shortmid & & \shortmid \end{smallmatrix}\right] \in \mathcal{M}_{m,n} (\mathbb{F})$ is an $m\times n$ matrix where $n \leq m$. The Gram-Schmidt process applied to these columns gives orthogonal vectors $\mathbf{O} = \left[\begin{smallmatrix} \shortmid & & \shortmid \\ \mathbf{u}_1 & \cdots & \mathbf{u}_n \\ \shortmid & & \shortmid \end{smallmatrix}\right]$ for which $$ \langle \mathbf{u}_1,\dots,\mathbf{u}_k \rangle = \langle \mathbf{v}_1,\dots,\mathbf{v}_k \rangle,\; \forall k \leq n $$ @@ -6162,18 +6161,18 @@ $$ In matrix terms $$ - \begin{bmatrix} \shortmid & ~ & \shortmid \\ \mathbf{v}_1 & \cdots & \mathbf{v}_n \\ \shortmid & ~ & \shortmid \end{bmatrix} = \begin{bmatrix} \shortmid & ~ & \shortmid \\ \mathbf{u}_1 & \cdots & \mathbf{u}_n \\ \shortmid & ~ & \shortmid \end{bmatrix} \begin{bmatrix} 1 & \lambda_{2,1} & \cdots & \lambda_{n,1} \\ ~ & 1 & \cdots & \lambda_{n,2} \\ ~ & ~ & \ddots & \vdots \\ ~ & ~ & ~ & 1 \end{bmatrix} + \begin{bmatrix} \shortmid & & \shortmid \\ \mathbf{v}_1 & \cdots & \mathbf{v}_n \\ \shortmid & & \shortmid \end{bmatrix} = \begin{bmatrix} \shortmid & & \shortmid \\ \mathbf{u}_1 & \cdots & \mathbf{u}_n \\ \shortmid & & \shortmid \end{bmatrix} \begin{bmatrix} 1 & \lambda_{2,1} & \cdots & \lambda_{n,1} \\ & 1 & \cdots & \lambda_{n,2} \\ & & \ddots & \vdots \\ & & & 1 \end{bmatrix} $$ That is $\mathbf{A} = \mathbf{OB}$ where $\mathbf{O}$ has orthogonal columns and $\mathbf{B}$ is upper triangular. We may normalize the nonzero columns $\mathbf{u}_i$ of $\mathbf{O}$ and move the positive constants to $\mathbf{B}$. In particular, if $\alpha_i = \lVert \mathbf{u}_i \rVert$ for $\mathbf{u}_i \neq \mathbf{0}$ and $\alpha_i = 1$ for $\mathbf{u}_i = \mathbf{0}$, then $$ - \begin{bmatrix} \shortmid & ~ & \shortmid \\ \mathbf{v}_1 & \cdots & \mathbf{v}_n \\ \shortmid & ~ & \shortmid \end{bmatrix} = \begin{bmatrix} \shortmid & ~ & \shortmid \\ \frac{\mathbf{u}_1}{\alpha_1} & \cdots & \frac{\mathbf{v}_n}{\alpha_n} \\ \shortmid & ~ & \shortmid \end{bmatrix} \begin{bmatrix} \alpha_1 & \alpha_1 \lambda_{2,1} & \cdots & \alpha_1 \lambda_{n,1} \\ ~ & \alpha_2 & \cdots & \alpha_2 \lambda_{n,2} \\ ~ & ~ & \ddots & \vdots \\ ~ & ~ & ~ & \alpha_n \end{bmatrix} + \begin{bmatrix} \shortmid & & \shortmid \\ \mathbf{v}_1 & \cdots & \mathbf{v}_n \\ \shortmid & & \shortmid \end{bmatrix} = \begin{bmatrix} \shortmid & & \shortmid \\ \frac{\mathbf{u}_1}{\alpha_1} & \cdots & \frac{\mathbf{u}_n}{\alpha_n} \\ \shortmid & & \shortmid \end{bmatrix} \begin{bmatrix} \alpha_1 & \alpha_1 \lambda_{2,1} & \cdots & \alpha_1 \lambda_{n,1} \\ & \alpha_2 & \cdots & \alpha_2 \lambda_{n,2} \\ & & \ddots & \vdots \\ & & & \alpha_n \end{bmatrix} $$ That is $\mathbf{A} = \mathbf{QR}$ where the columns of $\mathbf{Q}$ are orthogonal and each column is either a unit vector or the zero vector and $\mathbf{R}$ is upper triangular with positive entries on the main diagonal. Moreover, if the vectors $\mathbf{v}_1,\dots,\mathbf{v}_n$ are linearly independent, then the columns of $\mathbf{Q}$ are nonzero. Also, if $m = n$ and $\mathbf{A}$ is nonsingular, then $\mathbf{Q}$ is unitary/orthogonal. -If the columns of $\mathbf{A}$ are not linearly independent, we can make one final adjustment to this matrix factorization. If a column $\frac{\mathbf{u}_i}{\alpha_i}$ is zero, then we may replace this column by any vector as long as we replace the $(i,i)$th entry $\alpha_i$ in $\mathbb{R}$ by $0$. 
Thus, we can take nonzero columns of $\mathbf{Q}, extend to an orthonormal basis for the span of the columns of $\mathbf{Q}$ and replace the zero columns of $\mathbf{Q}$ by the additional members of this orthogonal basis. In this way, $\mathbf{Q}$ is replaced by a unitary/orthogonal matrix $\mathbf{Q}'$ and $\mathbf{R}$ is replaced by an upper triangular matrix $\mathbf{R}'$ that has nonnegative entries on the main diagonal. +If the columns of $\mathbf{A}$ are not linearly independent, we can make one final adjustment to this matrix factorization. If a column $\frac{\mathbf{u}_i}{\alpha_i}$ is zero, then we may replace this column by any vector as long as we replace the $(i,i)$th entry $\alpha_i$ in $\mathbf{R}$ by $0$. Thus, we can take the nonzero columns of $\mathbf{Q}$, extend to an orthonormal basis for the span of the columns of $\mathbf{Q}$ and replace the zero columns of $\mathbf{Q}$ by the additional members of this orthonormal basis. In this way, $\mathbf{Q}$ is replaced by a unitary/orthogonal matrix $\mathbf{Q}'$ and $\mathbf{R}$ is replaced by an upper triangular matrix $\mathbf{R}'$ that has nonnegative entries on the main diagonal. Let $\mathbf{A}\in\mathcal{M}_{m,n}(\mathbf{F})$ where $\mathbb{F} = \Set{\mathbb{C},\R}$. There exists a matrix $\mathbf{Q}\in\mathcal{M}_{m,n}(\mathbb{F})$ with orthonormal columns and an upper triangular matrix $\mathbf{R}\in\mathcal{M}_n (\mathbb{F})$ with nonnegative real entries on the main diagonal for which @@ -6203,17 +6202,17 @@ $$ \mathbf{QRx} = \mathbf{u} $$ -and since $\mathbf{Q}^{-1} = \mathbf{Q}^*$, we have +and since $\mathbf{Q}^{-1} = \mathbf{Q}^\dagger$, we have $$ - \mathbf{Rx} = \mathbf{Q}^* \mathbf{u} + \mathbf{Rx} = \mathbf{Q}^\dagger \mathbf{u} $$ which is an upper triangular system easily solved by back substitution. The QR factorization can be used to approximate the eigenvalues of a matrix in a process called the *QR algorithm*. Specifically, if $\mathbf{A} = \mathbf{A}_0$ is an $n\times n$ matrix, define a sequence of matrices as follows: 1. Let $\mathbf{A}_0 = \mathbf{Q}_0 \mathbf{R}_0$ be the QR factorization of $\mathbf{A}_0$ and let $\mathbf{A}_1 = \mathbf{R}_0 \mathbf{Q}_0$ -2. Once $\mathbf{A}_k$ has been defined, let $\mathbf{A}_k = \mathbf{Q}_k \mathbf{R}_k$ be the QR factorization of $\mathbf{A}_k$ and let $\mathbf{A}_{k+1}$ = \mathbf{R}_k \mathbf{Q}_k +2. Once $\mathbf{A}_k$ has been defined, let $\mathbf{A}_k = \mathbf{Q}_k \mathbf{R}_k$ be the QR factorization of $\mathbf{A}_k$ and let $\mathbf{A}_{k+1} = \mathbf{R}_k \mathbf{Q}_k$ Then $\mathbf{A}_k$ is unitarily/orthogonally similar to $\mathbf{A}$ since @@ -6250,7 +6249,6 @@ Let $V = \ell^2$ and let $M$ be the set of all vectors of the form $\mathbf{e}_i On the other hand, the vector space span of $M$ is the subspace $S$ of all sequences in $\ell^2$ that have finite support, i.e. have only a finite number of nonzero terms. Since $\operatorname{span}(M) = S \neq \ell^2$, we see that $M$ is not a Hamel basis for the vector space $\ell^2$. - ### The projection theorem and best approximations Orthonormal bases have a great practical advantage over arbitrary bases. If $B = \Set{\mathbf{v}_i}_{i=1}^n$ is a basis for a vector space $V$, then each $\mathbf{v}\in V$ has the form @@ -6318,10 +6316,10 @@ Hence, $\lVert \mathbf{v} - \mathbf{s} \rVert$ is smallest if and only if $\math -If $S$ is a finite-dimensional subspace of an inner product space $V$, then $S = S \ocirc S^\perp$. 
In particular, if $\mathbf{v}\in V$, then +If $S$ is a finite-dimensional subspace of an inner product space $V$, then $V = S \odot S^\perp$. In particular, if $\mathbf{v}\in V$, then $$ - \mathbf{v} = \tilde{\mathbf{v}} + (\mathbf{v} - \tilde{\mathbf{v}}) \in S \ocirc S^\perp + \mathbf{v} = \tilde{\mathbf{v}} + (\mathbf{v} - \tilde{\mathbf{v}}) \in S \odot S^\perp $$ It follows that $\dim(V) = \dim(S) + \dim(S^\perp)$. @@ -6329,7 +6327,7 @@
Proof -We have seen that $\mathbf{v} - \tilde{\mathbf{v}} \in S^\perp$ and so $V = S + S^\perp$. However, $S \cap S^\perp = \Set{\mathbf{0}}$ and so $V = S \ocirc S^\perp$. +We have seen that $\mathbf{v} - \tilde{\mathbf{v}} \in S^\perp$ and so $V = S + S^\perp$. However, $S \cap S^\perp = \Set{\mathbf{0}}$ and so $V = S \odot S^\perp$.
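+
+As a small numerical companion to the best-approximation discussion (added here as a sketch; it assumes the standard inner product on $\R^6$ and uses a reduced QR factorization to obtain an orthonormal basis of the subspace):
+
+```python
+import numpy as np
+
+rng = np.random.default_rng(1)
+A = rng.standard_normal((6, 3))        # columns span a 3-dimensional subspace S of R^6
+v = rng.standard_normal(6)
+
+Q, _ = np.linalg.qr(A)                 # orthonormal basis of S
+v_hat = Q @ (Q.T @ v)                  # best approximation: sum of <v, u_i> u_i
+
+print(np.allclose(A.T @ (v - v_hat), 0))                          # v - v_hat is orthogonal to S
+s_other = A @ rng.standard_normal(3)                              # an arbitrary competitor in S
+print(np.linalg.norm(v - v_hat) <= np.linalg.norm(v - s_other))   # v_hat minimizes the distance
+```
+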
@@ -6414,7 +6412,7 @@ Let $V$ be a finite-dimensional inner product space. **(1):** Assume $\mathrm{T}$ is surjective. If $f = 0$, then $\mathrm{R}_f = 0$, so let us assume $f \neq 0$. Then $K = \ker(f)$ has codimension $1$ and so $$ - V = \langle\mathbf{w}\rangle \ocirc K,\; \mathbf{w}\in K^\perp + V = \langle\mathbf{w}\rangle \odot K,\; \mathbf{w}\in K^\perp $$ Letting $\mathbf{x} = \alpha\mathbf{w}$ for $\alpha\in\mathbb{F}$, we require that $f(\mathbf{v}) = \langle\mathbf{v},\alpha\mathbf{w}\rangle$. Since this clearly holds for any $\mathbf{v}\in K$, it is sufficient to show that it holds for $\mathbf{v} = \mathbf{w}$, i.e. @@ -6448,7 +6446,586 @@ $$
-## Positive definite matrices +# Structure theory for normal operators + +## The adjoint of a linear operator + + +Let $V$ and $W$ be finite-dimensional inner product spaces over $\mathbb{F}$ and let $\mathrm{T}\in\mathcal{L}(V,W)$. Then there is a unique function $\mathrm{T}^\dagger: W\to V$ defined by the condition + +$$ + \langle \mathrm{T}\mathbf{v}, \mathbf{w}\rangle = \langle \mathbf{v}, \mathrm{T}^\dagger \mathbf{w} \rangle,\; \mathbf{v}\in V, \mathbf{w}\in W +$$ + +The function $\mathrm{T}^\dagger \in \mathcal{L}(W, V)$ is the *adjoint* of $\mathrm{T}$. + +
+Proof + +If $\mathrm{T}^\dagger$ exists, then it is unique, for if + +$$ + \langle\mathrm{T}\mathbf{v},\mathbf{w}\rangle = \langle\mathbf{v},\mathrm{S}\mathbf{w}\rangle +$$ + +then $\langle\mathbf{v},\mathrm{S}\mathbf{w}\rangle = \langle\mathbf{v},\mathrm{T}^\dagger \mathbf{w}\rangle$ for all $\mathbf{v}\in V$ and $\mathbf{w}\in W$ and so $\mathrm{S} = \mathrm{T}^\dagger$. + +We seek a linear map $\mathrm{T}^\dagger : W\to V$ for which $\langle \mathrm{T}\mathbf{v}, \mathbf{w}\rangle = \langle \mathbf{v}, \mathrm{T}^\dagger \mathbf{w} \rangle$. By the Riesz representation theorem, for each $\mathbf{w}\in W$, the linear functional $f_\mathbf{w} \in V^*$ defined by + +$$ + f_\mathbf{w} \mathbf{v} = \langle\mathrm{T}\mathbf{v},\mathbf{w}\rangle +$$ + +has the form + +$$ + f_\mathbf{w} \mathbf{v} = \langle\mathbf{v}, \mathrm{R}_{f_\mathbf{w}}\rangle +$$ + +where $\mathrm{R}_{f_\mathbf{w}} \in V$ is the Riesz vector for $f_\mathbf{w}$. If $\mathrm{T}^\dagger :W\to V$ is defined by + +$$ + \mathrm{T}^\dagger \mathbf{w} = \mathrm{R}_{f_\mathbf{w}} = \mathrm{R}(f_\mathbf{w}) +$$ + +where $\mathrm{R}$ is the Riesz map, then + +$$ + \langle\mathbf{v},\mathrm{T}^\dagger \mathbf{w}\rangle = \langle\mathbf{v},\mathrm{R}_{f_\mathbf{w}}\rangle = f_\mathbf{w}\mathbf{v} = \langle\mathrm{T}\mathbf{v},\mathbf{w}\rangle +$$ + +Finally, since $\mathrm{T}^\dagger = \mathrm{R}\circ f$ is the composition of the Riesz map $\mathrm{R}$ and the map $f:\mathbf{w}\mapsto f_\mathbf{w}$ and since both of these maps are conjugate linear, their composition is linear. +
+
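+
+A minimal numerical check of the defining property (added as an illustration; it assumes the standard inner product on $\mathbb{C}^n$, with respect to which the adjoint of a matrix is its conjugate transpose):
+
+```python
+import numpy as np
+
+rng = np.random.default_rng(2)
+T = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))   # T : C^3 -> C^4
+v = rng.standard_normal(3) + 1j * rng.standard_normal(3)
+w = rng.standard_normal(4) + 1j * rng.standard_normal(4)
+
+def ip(x, y):
+    # <x, y> = sum_i x_i * conj(y_i), conjugate linear in the second argument
+    return np.vdot(y, x)
+
+T_dag = T.conj().T                     # conjugate transpose represents the adjoint
+print(np.allclose(ip(T @ v, w), ip(v, T_dag @ w)))   # <Tv, w> = <v, T†w>
+```
+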
+ + +Let $V$ and $W$ be finite-dimensional $\mathbb{F}$-inner product spaces. For every $\mathrm{S},\mathrm{T}\in\mathcal{L}(V,W)$ and $\alpha\in\mathbb{F}$ +1. $(\mathrm{S} + \mathrm{T})^\dagger = \mathrm{S}^\dagger + \mathrm{T}^\dagger$ +2. $(\alpha\mathrm{T})^\dagger = \bar{\alpha}\mathrm{T}^\dagger$ +3. $\mathrm{T}^{\dagger\dagger} = \mathrm{T}$ and so $\langle\mathrm{T}^\dagger \mathbf{v},\mathbf{w}\rangle = \langle\mathbf{v},\mathrm{T}\mathbf{w}\rangle$ +4. If $V = W$ then $(\mathrm{S}\mathrm{T})^\dagger = \mathrm{T}^\dagger \mathrm{S}^\dagger$ +5. If $\mathrm{T}$ is invertible, then $(\mathrm{T}^{-1})^\dagger = (\mathrm{T}^\dagger)^{-1}$ +6. If $V = W$ and $p(x)\in\R[x]$, then $p(\mathrm{T})^\dagger = p(\mathrm{T}^\dagger)$ + +Moreover, if $\mathrm{T}\in\mathcal{L}(V)$ is a linear operator and $S$ is a subspace of $V$, then +7. $S$ is $\mathrm{T}$-invariant if and only if $S^\perp$ is $\mathrm{T}^\dagger$-invariant +8. $(S,S^\perp)$ reduces $\mathrm{T}$ if and only if $S$ is both $\mathrm{T}$-invariant and $\mathrm{T}^\dagger$-invariant, in which case +$$ + (\mathrm{T}|_S)^\dagger = (\mathrm{T}^\dagger)|_S +$$ + 
+Proof + +**(7):** Let $\mathbf{s}\in S$ and $\mathbf{z}\in S^\perp$ and write + +$$ + \langle\mathrm{T}^\dagger \mathbf{z}, \mathbf{s}\rangle = \langle \mathbf{z}, \mathrm{T}\mathbf{s}\rangle +$$ + +If $S$ is $\mathrm{T}$-invariant, then $\langle\mathrm{T}^\dagger \mathbf{z}, \mathbf{s}\rangle = 0$ for all $\mathbf{s}\in S$ and so $\mathrm{T}^\dagger \mathbf{z} \in S^\perp$ and $S^\perp$ is $\mathrm{T}^\dagger$-invariant. Conversely, if $S^\perp$ is $\mathrm{T}^\dagger$-invariant, then $\langle\mathbf{z},\mathrm{T}\mathbf{s}\rangle = 0$ for all $\mathbf{z}\in S^\perp$ and so $\mathrm{T}\mathbf{s}\in S^{\perp\perp} = S$. Hence $S$ is $\mathrm{T}$-invariant. + +**(8):** The first statement follows from **(7)** applied to both $S$ and $S^\perp$. For the second statement, since $S$ is both $\mathrm{T}$-invariant and $\mathrm{T}^\dagger$-invariant, if $\mathbf{s},\mathbf{t}\in S$, then + +$$ + \langle \mathbf{s}, (\mathrm{T}^\dagger)|_S (\mathbf{t}) \rangle = \langle \mathbf{s}, \mathrm{T}^\dagger \mathbf{t} \rangle = \langle \mathrm{T}\mathbf{s}, \mathbf{t} \rangle = \langle\mathrm{T}|_S (\mathbf{s}), \mathbf{t} \rangle +$$ + +Hence, by definition of adjoint, $(\mathrm{T}^\dagger)|_S = (\mathrm{T}|_S)^\dagger$. +
+
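+
+The algebraic rules above are easy to spot-check for matrices (an added sketch, with the conjugate transpose playing the role of the adjoint on $\mathbb{C}^n$):
+
+```python
+import numpy as np
+
+rng = np.random.default_rng(3)
+S = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
+T = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
+alpha = 2.0 - 1.5j
+dag = lambda M: M.conj().T
+
+print(np.allclose(dag(S + T), dag(S) + dag(T)))              # (S + T)† = S† + T†
+print(np.allclose(dag(alpha * T), np.conj(alpha) * dag(T)))  # (aT)† = conj(a) T†
+print(np.allclose(dag(dag(T)), T))                           # T†† = T
+print(np.allclose(dag(S @ T), dag(T) @ dag(S)))              # (ST)† = T† S†
+```
+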
+ + +Let $\mathrm{T}\in\mathcal{L}(V,W)$, where $V$ and $W$ are finite-dimensional inner product spaces. +1. $\ker(\mathrm{T}^\dagger) = \operatorname{ran}(\mathrm{T})^\perp$ and $\operatorname{ran}(\mathrm{T}^\dagger) = \ker(\mathrm{T})^\perp$, and so +$$ +\begin{align*} + \mathrm{T} \text{ surjective} \iff& \mathrm{T}^\dagger \text{ injective} \\ + \mathrm{T} \text{ injective} \iff& \mathrm{T}^\dagger \text{ surjective} +\end{align*} +$$ +2. $\ker(\mathrm{T}^\dagger \mathrm{T}) = \ker(\mathrm{T})$ and $\ker(\mathrm{T}\mathrm{T}^\dagger) = \ker(\mathrm{T}^\dagger)$ +3. $\operatorname{ran}(\mathrm{T}^\dagger \mathrm{T}) = \operatorname{ran}(\mathrm{T}^\dagger)$ and $\operatorname{ran}(\mathrm{T}\mathrm{T}^\dagger) = \operatorname{ran}(\mathrm{T})$ +4. $(\mathrm{P}_{S,T})^\dagger = \mathrm{P}_{T^\perp, S^\perp}$ + 
+Proof + +**(1):** We have + +$$ +\begin{align*} + \mathbf{u} \in\ker(\mathrm{T}^\dagger) \iff& \mathrm{T}^\dagger \mathbf{u} = \mathbf{0} \\ + \iff& \langle \mathrm{T}^\dagger \mathbf{u}, V \rangle = \Set{\mathbf{0}} \\ + \iff& \langle \mathbf{u}, \mathrm{T}V \rangle = \Set{\mathbf{0}} \\ + \iff& \mathbf{u}\in\operatorname{ran}(\mathrm{T})^\perp +\end{align*} +$$ + +and so $\ker(\mathrm{T}^\dagger) = \operatorname{ran}(\mathrm{T})^\perp$. The second identity follows by replacing $\mathrm{T}$ with $\mathrm{T}^\dagger$ and taking complements. + +**(2):** It is clear that $\ker(\mathrm{T}) \subseteq\ker(\mathrm{T}^\dagger \mathrm{T})$. For the reverse inclusion, we have + +$$ +\begin{align*} + \mathrm{T}^\dagger \mathrm{T}\mathbf{u} = \mathbf{0} \implies& \langle \mathrm{T}^\dagger \mathrm{T}\mathbf{u}, \mathbf{u} \rangle = 0 \\ + \implies& \langle \mathrm{T}\mathbf{u}, \mathrm{T}\mathbf{u} \rangle = 0 \\ + \implies& \mathrm{T}\mathbf{u} = \mathbf{0} +\end{align*} +$$ + +and so $\ker(\mathrm{T}^\dagger \mathrm{T}) \subseteq\ker(\mathrm{T})$. The second identity follows from the first by replacing $\mathrm{T}$ with $\mathrm{T}^\dagger$. +
+
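+
+The kernel and range identities can also be checked numerically (an added sketch; the rank-deficient matrix below is an arbitrary example):
+
+```python
+import numpy as np
+
+rng = np.random.default_rng(4)
+A = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 4))   # 5x4 with rank 3, so ker(A) is nontrivial
+A_dag = A.conj().T
+
+# ker(A†A) = ker(A): a null-space basis of A†A is annihilated by A itself
+_, s, Vh = np.linalg.svd(A_dag @ A)
+N = Vh[s < 1e-10].T                    # columns span ker(A†A)
+print(N.shape[1] > 0 and np.allclose(A @ N, 0, atol=1e-6))
+
+# ran(AA†) = ran(A): the two ranges have the same dimension
+print(np.linalg.matrix_rank(A @ A_dag) == np.linalg.matrix_rank(A))
+```
+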
+ +### Relation between the algebraic adjoint and Hermitian adjoint + + +Let $\mathrm{T}\in\mathcal{L}(V,W)$, where $V$ and $W$ are finite-dimensional inner product spaces. +1. The algebraic adjoint $\mathrm{T}^* :W^* \to V^*$ and the Hermitian adjoint $\mathrm{T}^\dagger : W\to V$ are related by +$$ + \mathrm{T}^* = (\mathrm{R}^V)^{-1} \circ \mathrm{T}^\dagger \circ \mathrm{R}^W +$$ +where $\mathrm{R}^V$ and $\mathrm{R}^W$ are the conjugate Riesz isomorphisms on $V$ and $W$, respectively. +2. If $B$ and $C$ are ordered orthonormal bases for $V$ and $W$, respectively, then +$$ + [\mathrm{T}^\dagger]_{C,B} = ([\mathrm{T}]_{B,C})^\dagger +$$ +In other words, the matrix of the adjoint $\mathrm{T}^\dagger$ is the adjoint (conjugate transpose) of the matrix of $\mathrm{T}$. + +$$ +\begin{CD} + V^* @<{\mathrm{T}^*}<< W^* \\ + @V{\mathrm{R}^V}VV @VV{\mathrm{R}^W}V \\ + V @<{\mathrm{T}^\dagger}<< W +\end{CD} +$$ + 
+Proof + +**(1):** The composite $\mathrm{S}:W^* \to V^*$ defined by + +$$ + \mathrm{S} = (\mathrm{R}^V)^{-1} \circ\mathrm{T}^\dagger \circ\mathrm{R}^W +$$ + +is linear. Moreover, for all $f\in W^*$ and $\mathbf{v}\in V$ + +$$ +\begin{align*} + (\mathrm{T}^* (f))\mathbf{v} =& f(\mathrm{T}\mathbf{v}) \\ + =& \langle\mathrm{T}\mathbf{v}, \mathrm{R}^W (f)\rangle \\ + =& \langle \mathbf{v}, \mathrm{T}^\dagger \mathrm{R}^W (f) \rangle \\ + =& [(\mathrm{R}^V)^{-1} (\mathrm{T}^\dagger \mathrm{R}^W (f))](\mathbf{v}) \\ + =& (\mathrm{S}f)\mathbf{v} +\end{align*} +$$ + +showing that $\mathrm{S} = \mathrm{T}^*$. Hence, the relationship between $\mathrm{T}^*$ and $\mathrm{T}^\dagger$ is + +$$ + \mathrm{T}^* = (\mathrm{R}^V)^{-1} \circ \mathrm{T}^\dagger \circ \mathrm{R}^W +$$ + +Loosely speaking, the Riesz functions are like "change of variables" functions from linear functionals to vectors, and we can say that $\mathrm{T}^\dagger$ does to Riesz vectors what $\mathrm{T}^*$ does to the corresponding linear functionals. Put another way, $\mathrm{T}^*$ and $\mathrm{T}^\dagger$ are the same, up to conjugate Riesz isomorphism. + +**(2):** Suppose that $B = (\mathbf{b}_i)_{i=1}^n$ and $C = (\mathbf{c}_i)_{i=1}^m$ are ordered orthonormal bases for $V$ and $W$, respectively, then + +$$ +\begin{align*} + ([\mathrm{T}^\dagger]_{C,B})_{i,j} =& \langle \mathrm{T}^\dagger \mathbf{c}_j, \mathbf{b}_i \rangle \\ + =& \langle \mathbf{c}_j, \mathrm{T}\mathbf{b}_i \rangle \\ + =& \overline{\langle \mathrm{T}\mathbf{b}_i, \mathbf{c}_j \rangle} \\ + =& \overline{([\mathrm{T}]_{B,C})_{j,i}} +\end{align*} +$$ + +showing that $[\mathrm{T}^\dagger]_{C,B}$ and $[\mathrm{T}]_{B,C}$ are matrix adjoints (conjugate transposes). +
+
+ +## Orthogonal projections + + +A projection of the form $\mathrm{P}_{S,S^\perp}$ is called *orthogonal*. Equivalently, a projection $\mathrm{P}$ is orthogonal if $\ker(\mathrm{P}) \perp \operatorname{ran}(\mathrm{P})$. + + +Note that an orthogonal projection is a different concept from two projections $\mathrm{P}, \mathrm{S}$ being orthogonal to each other, i.e. satisfying $\mathrm{PS} = \mathrm{SP} = 0$. + + +Let $V$ be a finite-dimensional inner product space. The following are equivalent for an operator $\mathrm{P}$ on $V$: +1. $\mathrm{P}$ is an orthogonal projection +2. $\mathrm{P}$ is idempotent and self-adjoint +3. $\mathrm{P}$ is idempotent and does not expand lengths, i.e. +$$ + \lVert\mathrm{P}\mathbf{v}\rVert\leq\lVert\mathbf{v}\rVert,\; \mathbf{v}\in V +$$ + 
+Proof + +**(1) $\iff$ (2):** Since $(\mathrm{P}_{S,T})^\dagger = \mathrm{P}_{T^\perp, S^\perp}$ it follows that $\mathrm{P} = \mathrm{P}^\dagger$. + +**(1) $\iff$ (3):** Let $\mathrm{P} = \mathrm{P}_{S,S^\perp}$. Then, if $\mathbf{v} = \mathbf{s} + \mathbf{t}$ for $\mathbf{s}\in S$ and $\mathbf{t}\in S^\perp$, it follows that + +$$ + \lVert\mathbf{v}\rVert^2 = \lVert\mathbf{s}\rVert^2 + \lVert\mathbf{t}\rVert^2 \geq \lVert\mathbf{s}\rVert^2 = \lVert\mathrm{P}\mathbf{v}\rVert^2 +$$ + +Suppose that **(3)** holds, then + +$$ + \operatorname{ran}(\mathrm{P})\oplus\ker(\mathrm{P}) = V = \ker(\mathrm{P})^\perp \odot \ker(\mathrm{P}) +$$ + +and we want to show that the first direct sum is orthogonal. If $\mathbf{w}\in\operatorname{ran}(\mathrm{P})$, then $\mathbf{w} = \mathbf{x} + \mathbf{y}$, where $\mathbf{x}\in\ker(\mathrm{P})$ and $\mathbf{y}\in\ker(\mathrm{P})^\perp$. Hence + +$$ + \mathbf{w} = \mathrm{P}\mathbf{w} = \mathrm{P}\mathbf{x} + \mathrm{P}\mathbf{y} = \mathrm{P}\mathbf{y} +$$ + +and so the orthogonality of $\mathbf{x}$ and $\mathbf{y}$ implies that + +$$ + \lVert\mathbf{x}\rVert^2 + \lVert\mathbf{y}\rVert^2 = \lVert\mathbf{w}\rVert^2 = \lVert\mathrm{P}\mathbf{y}\rVert^2 \leq \lVert\mathbf{y}\rVert^2 +$$ + +Hence, $\mathbf{x} = \mathbf{0}$, and so $\operatorname{ran}(\mathrm{P}) \subseteq\ker(\mathrm{P})^\perp$, which implies that $\operatorname{ran}(\mathrm{P}) = \ker(\mathrm{P})^\perp$. +
+
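+
+A short numerical illustration of the characterization above (added here; it builds the orthogonal projection onto a column space from an orthonormal basis):
+
+```python
+import numpy as np
+
+rng = np.random.default_rng(5)
+Q, _ = np.linalg.qr(rng.standard_normal((5, 2)))   # orthonormal basis of a 2-dimensional subspace S
+P = Q @ Q.conj().T                                 # orthogonal projection onto S
+v = rng.standard_normal(5)
+
+print(np.allclose(P @ P, P))                       # idempotent
+print(np.allclose(P, P.conj().T))                  # self-adjoint
+print(np.linalg.norm(P @ v) <= np.linalg.norm(v))  # does not expand lengths
+```
+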
+ +### Orthogonal resolutions of the identity + + +An *orthogonal resolution of the identity* is a resolution of the identity $\sum_{i=1}^k \mathrm{P}_i = \mathrm{I}$ in which each projection $\mathrm{P}_i$ is orthogonal. + + + +Let $V$ be an inner product space. Orthogonal resolutions of the identity on $V$ correspond to orthogonal direct sum decompositions of $V$ as follows: +1. If $\sum_{i=1}^k \mathrm{P}_i = \mathrm{I}$ is an orthogonal resolution of the identity, then +$$ + V = \bigodot_{i=1}^k \operatorname{ran}(\mathrm{P}_i) +$$ +where $\mathrm{P}_i$ is the orthogonal projection onto $\operatorname{ran}(\mathrm{P}_i)$. +2. Conversely, if $V = \bigodot_{i=1}^k S_i$ and if $\mathrm{P}_i$ is the orthogonal projection onto $S_i$, then $\sum_{i=1}^k \mathrm{P}_i = \mathrm{I}$. + 
+Proof + +**(1):** If $\sum_{i=1}^k \mathrm{P}_i = \mathrm{I}$ is an orthogonal resolution of the identity, it follows that + +$$ + V = \bigoplus_{i=1}^k \operatorname{ran}(\mathrm{P}_i) +$$ + +However, since the $\mathrm{P}_i$ are pairwise orthogonal and self-adjoint, it follows that, for $i\neq j$, + +$$ + \langle\mathrm{P}_i \mathbf{v}, \mathrm{P}_j \mathbf{w} \rangle = \langle\mathbf{v}, \mathrm{P}_i \mathrm{P}_j \mathbf{w} \rangle = \langle \mathbf{v}, \mathbf{0} \rangle = 0 +$$ + +and so + +$$ + V = \bigodot_{i=1}^k \operatorname{ran}(\mathrm{P}_i) +$$ + +**(2):** For the converse, we know that $\sum_{i=1}^k \mathrm{P}_i = \mathrm{I}$ is a resolution of the identity, where $\mathrm{P}_i$ is a projection onto $\operatorname{ran}(\mathrm{P}_i)$ along + +$$ + \ker(\mathrm{P}_i) = \bigodot_{j\neq i} \operatorname{ran}(\mathrm{P}_j) = \operatorname{ran}(\mathrm{P}_i)^\perp +$$ + +Hence, $\mathrm{P}_i$ is orthogonal. +
+
+ +## Unitary diagonalizability + + +A linear operator $\mathrm{T}\in\mathcal{L}(V)$ is *unitarily diagonalizable* (when $V$ is complex) or *orthogonally diagonalizable* (when $V$ is real) if there is an ordered orthonormal basis $O = (\mathbf{u}_i)_{i=1}^n$ of $V$ for which the matrix $[\mathrm{T}]_O$ is diagonal, or equivalently if + +$$ + \mathrm{T}\mathbf{u}_i = \lambda_i \mathbf{u}_i,\; \forall i=1,\dots,n +$$ + + + +Let $V$ be a finite-dimensional inner product space and let $\mathrm{T}\in\mathcal{L}(V)$. The following are equivalent: +1. $\mathrm{T}$ is unitarily (orthogonally) diagonalizable +2. $V$ has an orthonormal basis that consists entirely of eigenvectors of $\mathrm{T}$ +3. $V$ has the form $V = \bigodot_{i=1}^k E_{\lambda_i}$ where $E_{\lambda_i}$ is the eigenspace of the eigenvalue $\lambda_i$ + + +## Normal operators + + +1. A linear operator $\mathrm{T}\in\mathcal{L}(V)$ on an inner product space $V$ is *normal* if it commutes with its adjoint +$$ + \mathrm{TT}^\dagger = \mathrm{T}^\dagger \mathrm{T} +$$ +2. A matrix $\mathbf{A}\in\mathcal{M}_n (\mathbb{F})$ is *normal* if $\mathbf{A}$ commutes with its adjoint $\mathbf{A}^\dagger$. + + +If $\mathrm{T}$ is normal and $O$ is an ordered orthonormal basis of $V$, then + +$$ + [\mathrm{T}]_O [\mathrm{T}]_O^\dagger = [\mathrm{T}]_O [\mathrm{T}^\dagger]_O = [\mathrm{TT}^\dagger]_O +$$ + +and + +$$ + [\mathrm{T}]_O^\dagger [\mathrm{T}]_O = [\mathrm{T}^\dagger]_O [\mathrm{T}]_O = [\mathrm{T}^\dagger \mathrm{T}]_O +$$ + +and so $\mathrm{T}$ is normal if and only if $[\mathrm{T}]_O$ is normal for some, and hence all, orthonormal bases for $V$. Note that this does not hold for bases that are not orthonormal. + + +Let $\mathrm{T}\in\mathcal{L}(V)$ be a normal operator. +1. The following are also normal operators: + - $\mathrm{T}|_S$ if $(S,S^\perp)$ reduces $\mathrm{T}$ + - $\mathrm{T}^\dagger$ + - $\mathrm{T}^{-1}$ if $\mathrm{T}$ is invertible + - $p(\mathrm{T})$ for any polynomial $p(x) \in\mathbb{F}[x]$ +2. For any $\mathbf{v},\mathbf{w}\in V$ +$$ + \langle\mathrm{T}\mathbf{v},\mathrm{T}\mathbf{w}\rangle = \langle\mathrm{T}^\dagger \mathbf{v}, \mathrm{T}^\dagger \mathbf{w}\rangle +$$ +and in particular +$$ + \lVert\mathrm{T}\mathbf{v}\rVert = \lVert\mathrm{T}^\dagger \mathbf{v}\rVert +$$ +and so +$$ + \ker(\mathrm{T}^\dagger) = \ker(\mathrm{T}) +$$ +3. For any integer $k\geq 1$ +$$ + \ker(\mathrm{T}^k) = \ker(\mathrm{T}) +$$ +4. The minimal polynomial $m_\mathrm{T}(x)$ is a product of distinct prime monic polynomials +5. $\mathrm{T}\mathbf{v} = \lambda\mathbf{v} \iff \mathrm{T}^\dagger \mathbf{v} = \bar{\lambda}\mathbf{v}$ +6. If $S$ and $T$ are submodules of $V_\mathrm{T}$ with relatively prime orders, then $S\perp T$ +7. If $\lambda$ and $\mu$ are distinct eigenvalues of $\mathrm{T}$, then $E_\lambda \perp E_\mu$ + 
+Proof + +**(2):** Normality implies that + +$$ + \langle \mathrm{T}\mathbf{v},\mathrm{T}\mathbf{w}\rangle = \langle\mathrm{T}^\dagger \mathrm{T}\mathbf{v}, \mathbf{w}\rangle = \langle \mathrm{TT}^\dagger \mathbf{v}, \mathbf{w}\rangle = \langle \mathrm{T}^\dagger \mathbf{v}, \mathrm{T}^\dagger \mathbf{w} \rangle +$$ + +**(3):** Consider the operator $\mathrm{S} = \mathrm{T}^\dagger \mathrm{T}$ which is self-adjoint, i.e. + +$$ + \mathrm{S}^\dagger = (\mathrm{T}^\dagger \mathrm{T})^\dagger = \mathrm{T}^\dagger \mathrm{T} = \mathrm{S} +$$ + +If $\mathrm{S}^k \mathbf{v} = \mathbf{0}$ for $k > 1$, then + +$$ + 0 = \langle\mathrm{S}^k \mathbf{v}, \mathrm{S}^{k-2} \mathbf{v} \rangle = \langle \mathrm{S}^{k-1}\mathbf{v}, \mathrm{S}^{k-1}\mathbf{v} \rangle +$$ + +and so $\mathrm{S}^{k-1} \mathbf{v} = \mathbf{0}$. Continuing in this way gives $\mathrm{S}\mathbf{v} = \mathbf{0}$. If $\mathrm{T}^k \mathbf{v} = \mathbf{0}$ for $k > 1$, then + +$$ + \mathrm{S}^k \mathbf{v} = (\mathrm{T}^\dagger \mathrm{T})^k \mathbf{v} = (\mathrm{T}^\dagger)^k \mathrm{T}^k \mathbf{v} = \mathbf{0} +$$ +and so $\mathrm{S}\mathbf{v} = \mathbf{0}$. Hence + +$$ + 0 = \langle\mathrm{S}\mathbf{v},\mathbf{v}\rangle = \langle\mathrm{T}^\dagger \mathrm{T}\mathbf{v},\mathbf{v}\rangle = \langle\mathrm{T}\mathbf{v},\mathrm{T}\mathbf{v}\rangle +$$ + +and so $\mathrm{T}\mathbf{v} = \mathbf{0}$. + +**(4):** Suppose that $m_\mathrm{T} (x) = p^e (x)q(x)$ where $p(x)$ is monic and prime. Then for any $\mathbf{v}\in V$ + +$$ + p^e (\mathrm{T})[q(\mathrm{T})\mathbf{v}] = 0 +$$ + +and since $p(\mathrm{T})$ is also normal, **(3)** implies that + +$$ + p(\mathrm{T})[q(\mathrm{T})\mathbf{v}] = 0 +$$ + +for all $\mathbf{v}\in V$. Hence, $p(\mathrm{T})q(\mathrm{T}) = 0$, which implies that $e = 1$. Thus, the prime factors of $m_\mathrm{T}(x)$ appear only to the first power. + +**(5):** This follows from **(2)** applied to the normal operator $\mathrm{T} - \lambda\mathrm{I}$ + +$$ + \ker(\mathrm{T} - \lambda\mathrm{I}) = \ker[(\mathrm{T} - \lambda\mathrm{I})^\dagger] = \ker(\mathrm{T}^\dagger - \bar{\lambda}\mathrm{I}) +$$ + +**(6):** If $o(S) = p(x)$ and $o(T) = q(x)$, then there are polynomials $a(x)$ and $b(x)$ for which $a(x)p(x) + b(x)q(x) = 1$ and so + +$$ + a(\mathrm{T})p(\mathrm{T}) + b(\mathrm{T})q(\mathrm{T}) = \mathrm{I} +$$ + +Now, $\mathrm{A} = a(\mathrm{T})p(\mathrm{T})$ annihilates $S$ and $\mathrm{B} = b(\mathrm{T})q(\mathrm{T})$ annihilates $T$. Thus, $\mathrm{B}^\dagger$ also annihilates $T$ and so + +$$ + \langle S, T \rangle = \langle (\mathrm{A} + \mathrm{B})S, T\rangle = \langle \mathrm{B}S, T \rangle = \langle S, \mathrm{B}^\dagger T \rangle = \Set{0} +$$ + +**(7):** This follows from **(6)**, since $o(E_\lambda) = x - \lambda$ and $o(E_\mu) = x - \mu$ are relatively prime when $\lambda\neq\mu$. Alternatively, for $\mathbf{v}\in E_\lambda$ and $\mathbf{w}\in E_\mu$, we have + +$$ + \lambda\langle\mathbf{v},\mathbf{w}\rangle = \langle\mathrm{T}\mathbf{v},\mathbf{w}\rangle = \langle\mathbf{v},\mathrm{T}^\dagger \mathbf{w}\rangle = \langle\mathbf{v}, \bar{\mu}\mathbf{w}\rangle = \mu\langle\mathbf{v},\mathbf{w}\rangle +$$ + +and so $\lambda\neq\mu$ implies that $\langle\mathbf{v},\mathbf{w}\rangle = 0$. +
+
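+
+The sketch below (an added illustration) constructs a normal matrix that is neither Hermitian nor unitary and checks two of the properties above, namely $\lVert\mathrm{T}\mathbf{v}\rVert = \lVert\mathrm{T}^\dagger\mathbf{v}\rVert$ and the eigenvector correspondence $\mathrm{T}\mathbf{v} = \lambda\mathbf{v} \iff \mathrm{T}^\dagger\mathbf{v} = \bar{\lambda}\mathbf{v}$:
+
+```python
+import numpy as np
+
+rng = np.random.default_rng(6)
+U, _ = np.linalg.qr(rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4)))  # a random unitary
+lam = rng.standard_normal(4) + 1j * rng.standard_normal(4)
+A = U @ np.diag(lam) @ U.conj().T      # normal by construction (unitarily diagonalizable)
+A_dag = A.conj().T
+
+print(np.allclose(A @ A_dag, A_dag @ A))                              # A commutes with its adjoint
+
+v = rng.standard_normal(4) + 1j * rng.standard_normal(4)
+print(np.isclose(np.linalg.norm(A @ v), np.linalg.norm(A_dag @ v)))   # ||Av|| = ||A†v||
+
+u = U[:, 0]                            # eigenvector of A with eigenvalue lam[0]
+print(np.allclose(A @ u, lam[0] * u))
+print(np.allclose(A_dag @ u, np.conj(lam[0]) * u))                    # A†u = conj(lam[0]) u
+```
+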
+ +### The spectral theorem for normal operators + + +Let $V$ be a finite-dimensional complex inner product space and let $\mathrm{T}\in\mathcal{L}(V)$. The following are equivalent: +1. $\mathrm{T}$ is normal +2. $\mathrm{T}$ is unitarily diagonalizable, i.e. $V_\mathrm{T} = \bigodot_{i=1}^k E_{\lambda_i}$ +3. $\mathrm{T}$ has an orthogonal spectral resolution $\mathrm{T} = \sum_{i=1}^k \lambda_i \mathrm{P}_i$ where $\sum_{i=1}^k \mathrm{P}_i = \mathrm{I}$ and $\mathrm{P}_i$ is orthogonal for all $i$, in which case $\Set{\lambda_i}_{i=1}^k$ is the spectrum of $\mathrm{T}$ and $\operatorname{ran}(\mathrm{P}_i) = E_{\lambda_i}$ and $\ker(\mathrm{P}_i) = \bigodot_{j\neq i} E_{\lambda_j}$ + +
+Proof + +**(2) $\iff$ (3):** From the diagonalizability characterization, we know that $V_\mathrm{T} = \bigoplus_{i=1}^k E_{\lambda_i}$ if and only if $\mathrm{T} = \sum_{i=1}^k \lambda_i \mathrm{P}_i$. In this case $\operatorname{ran}(\mathrm{P}_i) = E_{\lambda_i}$ and $\ker(\mathrm{P}_i) = \bigoplus_{j\neq i} E_{\lambda_j}$. However, $E_{\lambda_i} \perp E_{\lambda_j}$ for $i\neq j$ if and only if + +$$ + \operatorname{ran}(\mathrm{P}_i) \perp \ker(\mathrm{P}_i) +$$ + +That is, if and only if each $\mathrm{P}_i$ is orthogonal. Hence, the direct sum $V_\mathrm{T} = \bigoplus_{i=1}^k E_{\lambda_i}$ is an orthogonal sum if and only if each projection is orthogonal. +
+
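+
+The orthogonal spectral resolution can be exhibited numerically (an added sketch; the unitary matrix and eigenvalues below are arbitrary choices, and the projections are formed from the columns of the eigenbasis):
+
+```python
+import numpy as np
+
+rng = np.random.default_rng(7)
+U, _ = np.linalg.qr(rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4)))  # orthonormal eigenbasis
+lam = np.array([2.0 + 1.0j, -1.0, 0.5j, 3.0])                                        # distinct eigenvalues
+A = U @ np.diag(lam) @ U.conj().T                                                    # a normal operator on C^4
+
+P = [np.outer(U[:, i], U[:, i].conj()) for i in range(4)]   # P_i projects onto the i-th eigenspace
+
+print(np.allclose(sum(P), np.eye(4)))                        # orthogonal resolution of the identity
+print(np.allclose(P[0] @ P[1], 0))                           # the projections are pairwise orthogonal
+print(np.allclose(sum(l * Pi for l, Pi in zip(lam, P)), A))  # A = sum_i lambda_i P_i
+```
+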
+ + +A linear operator $\mathrm{T}$ on a finite-dimensional real inner product space is normal if and only if + +$$ + V = \left(\bigodot_{i=1}^k E_{\lambda_i} \right) \odot \left(\bigodot_{j=1}^m W_j \right) +$$ + +where $\Set{\lambda_i}_{i=1}^k$ is the spectrum of $\mathrm{T}$ and each $W_j$ is an indecomposable two-dimensional $\mathrm{T}$-invariant subspace with an ordered basis $B_j$ for which + +$$ + [\mathrm{T}]_{B_j} = \begin{bmatrix} a_j & -b_j \\ b_j & a_j \end{bmatrix} +$$ + 
+Proof + +We only need to show that if $V$ has such a decomposition, then $\mathrm{T}$ is normal. However, + +$$ + [\mathrm{T}]_{B_j} [\mathrm{T}]_{B_j}^\top = (a_j^2 + b_j^2)\mathbf{I}_2 = [\mathrm{T}]_{B_j}^\top [\mathrm{T}]_{B_j} +$$ + +and so $[\mathrm{T}]_{B_j}$ is normal. It follows that $\mathrm{T}$ is normal. +
+
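+
+For the real case, a normal operator can be assembled from real eigenvalues and $2\times 2$ rotation-scaling blocks as in the theorem above; the sketch below (an added example) builds such an operator and verifies normality:
+
+```python
+import numpy as np
+
+def rot_scale(a, b):
+    # the 2x2 block [[a, -b], [b, a]] from the theorem above
+    return np.array([[a, -b], [b, a]])
+
+rng = np.random.default_rng(8)
+Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))            # a random orthogonal change of basis
+blocks = [np.array([[3.0]]), rot_scale(1.0, 2.0), rot_scale(-0.5, 1.5)]
+D = np.zeros((5, 5))
+i = 0
+for B in blocks:                                            # place the blocks on the diagonal
+    k = B.shape[0]
+    D[i:i+k, i:i+k] = B
+    i += k
+A = Q @ D @ Q.T
+
+print(np.allclose(A @ A.T, A.T @ A))                        # A is normal
+print(np.allclose(np.sort_complex(np.linalg.eigvals(rot_scale(1.0, 2.0))),
+                  np.sort_complex(np.array([1 + 2j, 1 - 2j]))))   # each block contributes a ± bi
+```
+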
+ +## Self-adjoint operators + + +A linear operator $\mathrm{T}\in\mathcal{L}(V)$ on an $\mathbb{F}$-inner product space $V$ is *self-adjoint* if + +$$ + \mathrm{T}^\dagger = \mathrm{T} +$$ + +A self-adjoint operator is also called *Hermitian* when $\mathbb{F} = \mathbb{C}$ and *symmetric* when $\mathbb{F} = \R$, in which case $\mathrm{T} = \mathrm{T}^\top$. + + + +A linear operator $\mathrm{T}\in\mathcal{L}(V)$ on an $\mathbb{F}$-inner product space $V$ is *skew self-adjoint* if + +$$ + \mathrm{T}^\dagger = -\mathrm{T} +$$ + +A skew self-adjoint operator is also called *skew-Hermitian* when $\mathbb{F} = \mathbb{C}$ and *skew-symmetric* when $\mathbb{F} = \R$, in which case $\mathrm{T} = -\mathrm{T}^\top$. + + + +A linear operator $\mathrm{T}\in\mathcal{L}(V)$ on an $\mathbb{F}$-inner product space $V$ is *unitary* for $\mathbb{F} = \mathbb{C}$ and orthogonal for $\mathbb{F} = \R$ if $\mathrm{T}$ is invertible and + +$$ + \mathrm{T}^\dagger = \mathrm{T}^{-1} +$$ + + + +Let $\mathrm{T}\in\mathcal{L}(V)$ be a linear operator on an $\mathbb{F}$-inner product space $V$. The *quadratic form* associated with $\mathrm{T}$ is the function $q_\mathrm{T}: V\to\mathbb{F}$ defined by + +$$ + q_\mathrm{T} (\mathbf{v}) = \langle\mathrm{T}\mathbf{v}, \mathbf{v}\rangle +$$ + + + +Let $V$ be a finite-dimensional inner product space and let $\mathrm{S},\mathrm{T}\in\mathcal{L}(V)$ be linear operators on $V$. +1. If $\mathrm{S}$ and $\mathrm{T}$ are self-adjoint, then so are the following + - $\mathrm{S} + \mathrm{T}$ + - $\mathrm{T}^{-1}$ if $\mathrm{T}$ is invertible + - $p(\mathrm{T})$, for any real polynomial $p(x)\in\R[x]$ +2. A complex operator $\mathrm{T}$ is Hermitian if and only if the quadratic form $q_\mathrm{T}$ is real for all $\mathbf{v}\in V$ +3. If $\mathrm{T}$ is a complex operator or a real symmetric operator, then $\mathrm{T} = 0 \iff q_\mathrm{T} = 0$ +4. The characteristic polynomial $c_\mathrm{T}(x)$ of a self-adjoint operator $\mathrm{T}$ splits over $\R$, i.e. all complex roots of $c_\mathrm{T}(x)$ are real. Hence, the minimal polynomial $m_\mathrm{T}(x)$ of $\mathrm{T}$ is the product of distinct monic linear factors over $\R$. + +
+Proof + +**(2):** If $\mathrm{T}$ is Hermitian, then + +$$ + \langle\mathrm{T}\mathbf{v}, \mathbf{v}\rangle = \langle\mathbf{v},\mathrm{T}\mathbf{v}\rangle = \overline{\langle\mathrm{T}\mathbf{v},\mathbf{v}\rangle} +$$ + +and so $q_\mathrm{T}(\mathbf{v}) = \langle\mathrm{T}\mathbf{v},\mathbf{v}\rangle$ is real. Conversely, if $\langle\mathrm{T}\mathbf{v},\mathbf{v}\rangle\in\R$, then + +$$ + \langle\mathbf{v},\mathrm{T}\mathbf{v}\rangle = \langle\mathrm{T}\mathbf{v},\mathbf{v}\rangle = \langle\mathbf{v}, \mathrm{T}^\dagger \mathbf{v}\rangle +$$ + +and so $\langle\mathbf{v}, (\mathrm{T} - \mathrm{T}^\dagger)\mathbf{v}\rangle = 0$ for all $\mathbf{v}\in V$, which by **(3)** gives $\mathrm{T} = \mathrm{T}^\dagger$. + +**(3):** We only need to prove that $q_\mathrm{T} = 0$ implies $\mathrm{T} = 0$ when $\mathbb{F} = \R$ and $\mathrm{T}$ is symmetric. If $q_\mathrm{T} = 0$, then + +$$ +\begin{align*} + 0 =& \langle\mathrm{T}(\mathbf{x} + \mathbf{y}), \mathbf{x} + \mathbf{y}\rangle \\ + =& \langle\mathrm{T}\mathbf{x}, \mathbf{x}\rangle + \langle\mathrm{T}\mathbf{y}, \mathbf{y}\rangle + \langle\mathrm{T}\mathbf{x}, \mathbf{y}\rangle + \langle\mathrm{T}\mathbf{y}, \mathbf{x}\rangle \\ + =& \langle\mathrm{T}\mathbf{x}, \mathbf{y}\rangle + \langle\mathrm{T}\mathbf{y}, \mathbf{x}\rangle \\ + =& \langle\mathrm{T}\mathbf{x}, \mathbf{y}\rangle + \langle\mathbf{y}, \mathrm{T}\mathbf{x}\rangle \\ + =& \langle\mathrm{T}\mathbf{x}, \mathbf{y}\rangle + \langle\mathrm{T}\mathbf{x}, \mathbf{y}\rangle \\ + =& 2\langle\mathrm{T}\mathbf{x}, \mathbf{y}\rangle +\end{align*} +$$ + +Setting $\mathbf{y} = \mathrm{T}\mathbf{x}$ gives $\lVert\mathrm{T}\mathbf{x}\rVert^2 = 0$ for every $\mathbf{x}$, and so $\mathrm{T} = 0$. + +**(4):** If $\mathrm{T}$ is Hermitian for $\mathbb{F} = \mathbb{C}$ and $\mathrm{T}\mathbf{v} = \lambda\mathbf{v}$, then + +$$ + \lambda\mathbf{v} = \mathrm{T}\mathbf{v} = \mathrm{T}^\dagger \mathbf{v} = \bar{\lambda}\mathbf{v} +$$ + +and so $\lambda = \bar{\lambda}$ is real. + +If $\mathrm{T}$ is symmetric for $\mathbb{F} = \R$, note that a nonreal root of $c_\mathrm{T}(x)$ is not an eigenvalue of $\mathrm{T}$. If $\mathbf{A} = [\mathrm{T}]_O$ for any ordered orthonormal basis $O$ for $V$, then $c_\mathrm{T}(x) = c_\mathbf{A}(x)$. Now, $\mathbf{A}$ is a real symmetric matrix, but can be thought of as a complex Hermitian matrix with real entries. As such, it represents a Hermitian linear operator on the complex space $\mathbb{C}^n$ and so, by what we have just shown, all (complex) roots of its characteristic polynomial are real. However, the characteristic polynomial of $\mathbf{A}$ is the same, whether we think of $\mathbf{A}$ as a real or complex matrix and so the result follows. +
+
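+
+A brief numerical check of items 2 and 4 above (an added sketch; the Hermitian matrix below is an arbitrary example):
+
+```python
+import numpy as np
+
+rng = np.random.default_rng(9)
+M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
+A = (M + M.conj().T) / 2               # a Hermitian (self-adjoint) matrix
+
+v = rng.standard_normal(4) + 1j * rng.standard_normal(4)
+q = np.vdot(v, A @ v)                  # quadratic form <Av, v> = v† A v
+print(np.isclose(q.imag, 0.0))         # q_T(v) is real for a Hermitian operator
+
+print(np.allclose(np.linalg.eigvals(A).imag, 0.0))   # all eigenvalues are real, so c_T(x) splits over R
+```
+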
+ +### Positive definite matrices An $n\times n$ matrix $\boldsymbol{A}\in\mathcal{M}_{n}(\mathbb{F})$ is *positive definite* if @@ -6506,6 +7083,7 @@ For a self-adjoint $n\times n$ matrix $\boldsymbol{A}\in\mathcal{M}_{n}(\mathbb{ 4. The determinants of leading principal minors of $\boldsymbol{A}$ are positive. **(Sylvester's criterion)** + # Metric vector space ## Matrices