diff --git a/content/notes/math/linear_algebra.mdx b/content/notes/math/linear_algebra.mdx index 6e612f4..d8bc384 100644 --- a/content/notes/math/linear_algebra.mdx +++ b/content/notes/math/linear_algebra.mdx @@ -21,7 +21,7 @@ This notation can be abbreviated by writing the entries as $\boldsymbol{A} = (a_ The *main diagonal* of an $m\times n$ matrix $\boldsymbol{A}$ is sequence of entries $(a_{i,i})_{i=1}^{\min\Set{m,n}}$. An $m\times n$ matrix $\boldsymbol{A}$ is called diagonal if its off-diagonal entries are zero, and denoted $\boldsymbol{A} = \mathrm{diag}(a_{i,i})_{i=1}^n$. A square matrix is *upper triangular* if all of its entries below the main diagonal are $0$. Similarly, a square matrix is *lower triangular* if all of its entries above the main diagonal are $0$. -The transpose of $\boldsymbol{A}\in \mathcal{M}_{m,n}(\mathbb{F})$ is the matrix $\boldsymbol{A}^T$ defined by $[\boldsymbol{A}^T]_{i,j} = \boldsymbol{A}_{j,i}$. A matrix is symmetric if $\boldsymbol{A} = \boldsymbol{A}^T$ and skew-symmetric if $\boldsymbol{A}^T = -\boldsymbol{A}$. All $n\times n$ diagonal matrices are by definition symmetric. +The transpose of $\boldsymbol{A}\in \mathcal{M}_{m,n}(\mathbb{F})$ is the matrix $\boldsymbol{A}^\top$ defined by $[\boldsymbol{A}^\top]_{i,j} = \boldsymbol{A}_{j,i}$. A matrix is symmetric if $\boldsymbol{A} = \boldsymbol{A}^\top$ and skew-symmetric if $\boldsymbol{A}^\top = -\boldsymbol{A}$. All $n\times n$ diagonal matrices are by definition symmetric. The conjugate transpose (adjoint) of a complex $m\times n$ matrix $\boldsymbol{A}\in\mathcal{M}_{m,n}(\mathbb{C})$ is the matrix $\boldsymbol{A}^*$ defined by $[\boldsymbol{A}^*]_{i,j} = \overline{\boldsymbol{A}}_{j,i}$. @@ -35,11 +35,11 @@ $$ The transpose operation has the following properties for $A,B\in\mathcal{M}_{m,n}$ and $\lambda\in\mathbb{F}$ -1. **(injection):** $(\boldsymbol{A}^T)^T = \boldsymbol{A}$ -2. $(\boldsymbol{A} + \boldsymbol{B})^T = \boldsymbol{A}^T + \boldsymbol{B}^T$ -3. $(\lambda \boldsymbol{A})^T = \lambda \boldsymbol{A}^T$ -4. $(\boldsymbol{A}\boldsymbol{B})^T = \boldsymbol{B}^T \boldsymbol{A}^T$ -5. $\det(\boldsymbol{A}^T) = \det(\boldsymbol{A})$ +1. **(injection):** $(\boldsymbol{A}^\top)^\top = \boldsymbol{A}$ +2. $(\boldsymbol{A} + \boldsymbol{B})^\top = \boldsymbol{A}^\top + \boldsymbol{B}^\top$ +3. $(\lambda \boldsymbol{A})^\top = \lambda \boldsymbol{A}^\top$ +4. $(\boldsymbol{A}\boldsymbol{B})^\top = \boldsymbol{B}^\top \boldsymbol{A}^\top$ +5. $\det(\boldsymbol{A}^\top) = \det(\boldsymbol{A})$ ## Matrix multiplication @@ -151,7 +151,7 @@ $$ Two matrices $\boldsymbol{A}, \boldsymbol{B}\in\mathcal{M}_{m,n}$ are *congruent* if there exists an invertible matrix $\boldsymbol{P}$ such that $$ - \boldsymbol{A} = \boldsymbol{PBP}^T + \boldsymbol{A} = \boldsymbol{PBP}^\top $$ @@ -341,7 +341,7 @@ For $\boldsymbol{A}\in M_{m,n}(\mathbb{F})$, the following claims are equivalent 4. $\operatorname{rank}(\boldsymbol{A}) = n$ 5. The linear map $f_{\boldsymbol{A}}:\mathbb{F}^n \to\mathbb{F}^m$ by $\boldsymbol{x}\mapsto \boldsymbol{A}\boldsymbol{x}$ is injective. -For square matrices, i.e. $\boldsymbol{A}\in M_{n,n}(\mathbb{F})$, it follows that $\ker(\boldsymbol{A}) = \Set{0} \iff \operatorname{ran}(\boldsymbol{A}) = \mathbb{F}^n$, or equivalently that $f_\boldsymbol{A}$ is bijective. +For square matrices, i.e. $\boldsymbol{A}\in M_{n,n}(\mathbb{F})$, it follows that $\ker(\boldsymbol{A}) = \Set{0} \iff \operatorname{ran}(\boldsymbol{A}) = \mathbb{F}^n$, or equivalently that $f_{\boldsymbol{A}}$ is bijective. 
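+
+As a quick numerical sanity check of the rank criteria above (an added sketch, not part of the source notes; the matrix and tolerance are arbitrary), `numpy.linalg.matrix_rank` and the SVD can confirm that a full-column-rank matrix has a trivial kernel:
+
+```python
+import numpy as np
+
+rng = np.random.default_rng(0)
+A = rng.standard_normal((5, 3))        # a generic 5x3 matrix, almost surely of rank 3
+
+print(np.linalg.matrix_rank(A) == A.shape[1])   # rank(A) = n, so x -> Ax is injective
+
+# ker(A) is spanned by right singular vectors belonging to (numerically) zero singular values
+_, s, _ = np.linalg.svd(A)
+print(np.sum(s < 1e-12) == 0)                   # ker(A) = {0}, consistent with rank(A) = n
+```
+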
## Determinants @@ -369,13 +369,13 @@ The measure $\operatorname{vol}_n(\boldsymbol{u}_1,\dots,\boldsymbol{u}_n)$ give The determinant is a function $\det:\mathcal{M}_{n}(\mathbb{F}) \to\mathbb{F}$ with the following properties for $\boldsymbol{A} \in \mathcal{M}_{n}(\mathbb{F})$ -1. If $\boldsymbol{A} = \left[\begin{smallmatrix} \shortmid & ~ & \shortmid \\ \boldsymbol{a}_1 & \dots & \boldsymbol{a}_n \\ \shortmid & ~ & \shortmid \end{smallmatrix}\right]$ then $\operatorname{vol}_n (\boldsymbol{a}_1,\dots,\boldsymbol{a_n}) = |\det(\boldsymbol{A})|$ with the geometric interpretation that the column vectors of $\boldsymbol{A}$ span a parallelepiped. +1. If $\boldsymbol{A} = \left[\begin{smallmatrix} \shortmid & & \shortmid \\ \boldsymbol{a}_1 & \dots & \boldsymbol{a}_n \\ \shortmid & & \shortmid \end{smallmatrix}\right]$ then $\operatorname{vol}_n (\boldsymbol{a}_1,\dots,\boldsymbol{a_n}) = |\det(\boldsymbol{A})|$ with the geometric interpretation that the column vectors of $\boldsymbol{A}$ span a parallelepiped. 2. $\det(\boldsymbol{A}) = 0$ if and only if $\left[\begin{smallmatrix} \shortmid \\ \boldsymbol{a}_1 \\ \shortmid \end{smallmatrix}\right], \dots, \left[\begin{smallmatrix} \shortmid \\ \boldsymbol{a}_n \\ \shortmid \end{smallmatrix}\right]$ are linearly dependent, or equivalently if $\boldsymbol{A}$ is not invertible. 3. the sign of $\det(\boldsymbol{A})$ gives the orientation of the parallelepiped spanned by $\boldsymbol{A}$. In particular, $\det(\boldsymbol{I}_n) = 1$. -The determinant of $\boldsymbol{A} = \left[\begin{smallmatrix} \shortmid & ~ & \shortmid \\ \boldsymbol{a}_1 & \dots & \boldsymbol{a}_n \\ \shortmid & ~ & \shortmid \end{smallmatrix}\right] \in\mathcal{M}_{n}(\mathbb{F})$ is given by the Leibniz formula +The determinant of $\boldsymbol{A} = \left[\begin{smallmatrix} \shortmid & & \shortmid \\ \boldsymbol{a}_1 & \dots & \boldsymbol{a}_n \\ \shortmid & & \shortmid \end{smallmatrix}\right] \in\mathcal{M}_{n}(\mathbb{F})$ is given by the Leibniz formula $$ \det(\boldsymbol{A}) = \sum_{\sigma\in S_n} \left( \operatorname{sgn}(\sigma) \prod_{i=1}^n a_{i,\sigma(i)} \right) @@ -475,7 +475,7 @@ $$ Let $\boldsymbol{A}\in\mathcal{M}_n (\boldsymbol{F})$ be an $n\times n$-matrix. The *adjugate* of $\boldsymbol{A}$ is the transpose of the cofactor matrix $\boldsymbol{C}$ of $\boldsymbol{A}$ $$ - \operatorname{adj}(\boldsymbol{A}) = \boldsymbol{C}^T + \operatorname{adj}(\boldsymbol{A}) = \boldsymbol{C}^\top $$ @@ -577,18 +577,18 @@ $$ &\left[\begin{array}{c c|c} a_{11} & a_{12} & b_1 \\ a_{21} & a_{22} & b_2 - \end{array}\right]~ + \end{array}\right] \begin{matrix} - ~ \\ + \\ \scriptsize{\boldsymbol{R}_2 - \frac{a_{21}}{a_{11}}\boldsymbol{R}_1} \end{matrix} \\ \sim& \left[\begin{array}{c c|c} a_{11} & a_{12} & b_1 \\ 0 & a_{22} - \frac{a_{21}}{a_{11}}a_{12} & b_2 - \frac{a_{21}}{a_{11}} - \end{array}\right]~ + \end{array}\right] \begin{matrix} - ~ \\ + \\ \scriptsize{a_{11}\boldsymbol{R}_2} \end{matrix} \\ @@ -1436,14 +1436,14 @@ Let $\mathrm{T}\in\mathcal{L}(V,W)$, where $\dim(V) = \dim(W) < \infty$. Then $\ Any $m\times n$ matrix $\boldsymbol{A}$ over $\mathbb{F}$ defines a linear transformation $\mathrm{T}_{\boldsymbol{A}}:\mathbb{F}^n\to\mathbb{F}^m$ in the form of the multiplication map $\boldsymbol{v}\mapsto \boldsymbol{A}\boldsymbol{v}$. -1. 
If $\boldsymbol{A}$ is an $m\times n$ matrix over $\mathbb{F}$, then the multiplication function $\mathrm{T}_\boldsymbol{A}:\mathbb{F}^n \to\mathbb{F}^m$ defined by $\boldsymbol{v} \mapsto \boldsymbol{A}\boldsymbol{v}$ is a linear map, i.e. $\mathrm{T}_\boldsymbol{A} \in\mathcal{L}(\mathbb{F}^n,\mathbb{F}^m)$. -2. If $\mathrm{T}\in\mathcal{L}(\mathbb{F}^n,\mathbb{F}^m)$ then $\mathrm{T} = \mathrm{T}_\boldsymbol{A}$ where for the standard basis $E = \Set{ \boldsymbol{e}_i }_{i=1}^n$ +1. If $\boldsymbol{A}$ is an $m\times n$ matrix over $\mathbb{F}$, then the multiplication function $\mathrm{T}_{\boldsymbol{A}}:\mathbb{F}^n \to\mathbb{F}^m$ defined by $\boldsymbol{v} \mapsto \boldsymbol{A}\boldsymbol{v}$ is a linear map, i.e. $\mathrm{T}_{\boldsymbol{A}} \in\mathcal{L}(\mathbb{F}^n,\mathbb{F}^m)$. +2. If $\mathrm{T}\in\mathcal{L}(\mathbb{F}^n,\mathbb{F}^m)$ then $\mathrm{T} = \mathrm{T}_{\boldsymbol{A}}$ where for the standard basis $E = \Set{ \boldsymbol{e}_i }_{i=1}^n$ $$ \boldsymbol{A} = \begin{bmatrix} - \shortmid & ~ & \shortmid \\ + \shortmid & & \shortmid \\ \mathrm{T}(\boldsymbol{e}_1) & \cdots & \mathrm{T}(\boldsymbol{e}_n) \\ - \shortmid & ~ & \shortmid + \shortmid & & \shortmid \end{bmatrix} \in\mathcal{M}_{m,n}(\mathbb{F}) $$ @@ -1460,7 +1460,7 @@ $$ showing that $\mathrm{T}_{\boldsymbol{A}} \in\mathcal{L}(\mathbb{F}^n,\mathbb{F}^m)$. -**(2)**: Let $E = \Set{ \boldsymbol{e}_i }_{i=1}^n$ be the standard basis of $\mathbb{F}^n$. If a vector $\boldsymbol{v}\in V$ has coordinates $[\boldsymbol{v}]_E = \left[(\beta_i)_{i=1}^n\right]^T \in\mathbb{F}^n$ then $\boldsymbol{v}$ can be written as the linear combination +**(2)**: Let $E = \Set{ \boldsymbol{e}_i }_{i=1}^n$ be the standard basis of $\mathbb{F}^n$. If a vector $\boldsymbol{v}\in V$ has coordinates $[\boldsymbol{v}]_E = \left[(\beta_i)_{i=1}^n\right]^\top \in\mathbb{F}^n$ then $\boldsymbol{v}$ can be written as the linear combination $$ \boldsymbol{v} = \sum_{i=1}^n \beta_i \boldsymbol{e}_i @@ -1472,7 +1472,7 @@ $$ \begin{align*} \mathrm{T}(\boldsymbol{v}) =& \mathrm{T} \left(\sum_{i=1}^n \beta_i \boldsymbol{e}_i \right) = \sum_{i=1}^n \beta_i \mathrm{T}(\boldsymbol{e}_i) \\ =& \begin{bmatrix} \mathrm{T}(\boldsymbol{e}_1) & \cdots & \mathrm{T}(\boldsymbol{e}_n) \end{bmatrix} [\boldsymbol{v}]_E \\ - =& \boldsymbol{A}[\boldsymbol{v}]_E = \mathrm{T}_\boldsymbol{A} (\boldsymbol{v}) + =& \boldsymbol{A}[\boldsymbol{v}]_E = \mathrm{T}_{\boldsymbol{A}} (\boldsymbol{v}) \end{align*} $$ @@ -1482,8 +1482,8 @@ Hence $\boldsymbol{A} = \begin{bmatrix} \mathrm{T}(\boldsymbol{e}_1) & \cdots & Let $\boldsymbol{A}$ be an $m\times n$ matrix over $F$. -1. $\mathrm{T}_\boldsymbol{A}:\mathbb{F}^n \to\mathbb{F}^m$ is injective if and only if $\operatorname{rank}(\boldsymbol{A}) = n$. -2. $\mathrm{T}_\boldsymbol{A}:\mathbb{F}^n \to\mathbb{F}^m$ is surjective if and only if $\operatorname{rank}(\boldsymbol{A}) = m$. +1. $\mathrm{T}_{\boldsymbol{A}}:\mathbb{F}^n \to\mathbb{F}^m$ is injective if and only if $\operatorname{rank}(\boldsymbol{A}) = n$. +2. $\mathrm{T}_{\boldsymbol{A}}:\mathbb{F}^n \to\mathbb{F}^m$ is surjective if and only if $\operatorname{rank}(\boldsymbol{A}) = m$. ### Change of basis matrices @@ -1512,7 +1512,7 @@ Hence $[\boldsymbol{v}]_C = \boldsymbol{M}_{B,C}[\boldsymbol{v}]_B$ and $\boldsy
Proof -Since $\varphi_{B,C}$ is an operator on $\mathbb{F}^n$ it has the form $\mathrm{T}_\boldsymbol{M}$ where $\boldsymbol{M}\in\mathcal{M}_n$ +Since $\varphi_{B,C}$ is an operator on $\mathbb{F}^n$ it has the form $\mathrm{T}_{\boldsymbol{M}}$ where $\boldsymbol{M}\in\mathcal{M}_n$ $$ \begin{align*} @@ -2485,7 +2485,7 @@ Note that several notations are commonly used for the transpose of $\mathrm{T}$, - $\mathrm{T}^\times$ - $\mathrm{T}^\#$ - $\mathrm{T}'$ -- $\mathrm{T}^T$ +- $\mathrm{T}^\top$ @@ -2644,7 +2644,7 @@ $$ \end{align*} $$ -Comparing the expression, we see that $[\mathrm{T}^*]_{C^*, B^*} = ([\mathrm{T}]_{B,C})^T$. +Comparing the expressions, we see that $[\mathrm{T}^*]_{C^*, B^*} = ([\mathrm{T}]_{B,C})^\top$.
@@ -2821,12 +2821,12 @@ The scalar $\lambda$ is the *determinant* of $\mathrm{T}$. The determinant of $\ Let $V$ be an $n$-dimensional $\mathbb{F}$-vector space. Given a linear operator $\mathrm{T}:V\to V$, there exists a unique scalar $\lambda\in\mathbb{F}$ such that for every alternating linear $n$-form $f$ $$ - f(\mathrm{T}\boldsymbol{v}_1,\dots,\mathrm{T}\boldsymbol{v}_n) = \lambda f(\boldsymbol{v}_1,\dots,\boldsymbol{v},x_n) + f(\mathrm{T}\boldsymbol{v}_1,\dots,\mathrm{T}\boldsymbol{v}_n) = \lambda f(\boldsymbol{v}_1,\dots,\boldsymbol{v}_n) $$ $$ \begin{CD} -V^n @>{\mathrm{T}^{n \times}}>> V^n \\ +V^n @>{\mathrm{T}^{\times n}}>> V^n \\ @V{f}VV @VV{f}V \\ \mathbb{F} @>>{\lambda}> \mathbb{F} \end{CD} $$ @@ -3758,8 +3757,7 @@ $$ Also $$ \begin{align*} - &\langle\beta\mathbf{v}\rangle = \langle\mathbf{v}\rangl \\ - \iff& (o(\mathbf{v}),\beta) = 1 \\ + &\langle\beta\mathbf{v}\rangle = \langle\mathbf{v}\rangle \iff (o(\mathbf{v}),\beta) = 1 \\ \iff& o(\beta\mathbf{v}) = o(\mathbf{v}) \end{align*} $$ @@ -4200,7 +4199,7 @@ $$ \end{align*} $$ -Then $pM = \bigoplus_{i=1}^s p\langle\mathbf{v}_i \rangle$ and $pM = \bigoplus_{i=1}^t p\langle\mathbf{u}_i \rangle$. However, $p\langle \mathbf{v}_1 \rangle = \langle p\mathbf{v}_1 \rangle$ is a cyclic submodule of $M$ with annihilator $\langle p^{e_i - 1}\rangle$ and so by the induction hypothesis $\mathbf{s} = \mathbf{t}$ and $e_1 = f_1,\dots,e_s = f_s$, concluding the proof of uniqueness. +Then $pM = \bigoplus_{i=1}^s p\langle\mathbf{v}_i \rangle$ and $pM = \bigoplus_{i=1}^t p\langle\mathbf{u}_i \rangle$. However, $p\langle \mathbf{v}_1 \rangle = \langle p\mathbf{v}_1 \rangle$ is a cyclic submodule of $M$ with annihilator $\langle p^{e_1 - 1}\rangle$ and so by the induction hypothesis $s = t$ and $e_1 = f_1,\dots,e_s = f_s$, concluding the proof of uniqueness. **(3):** Suppose $g:M\cong N$ and $M$ has annihilator chain @@ -4250,7 +4249,7 @@ $$ $$ is a primary submodule with annihilator $\langle p_i^{e_i} \rangle$. Finally, each primary submodule $M_{p_i}$ can be written as a direct sum of cyclic submodules, so that $$ - M = \underbrace{[\langle \mathbf{v}_{1,1}\oplus\cdots\oplus\langle \mathbf{v}_{i, k_1} \rangle]}_{M_{p_1}} \oplus\cdots\oplus \underbrace{[\langle \mathbf{v}_{1,1}\oplus\cdots\oplus\langle \mathbf{v}_{i, k_n} \rangle]}_{M_{p_n}} + M = \underbrace{[\langle\mathbf{v}_{1,1}\rangle\oplus\cdots\oplus\langle \mathbf{v}_{1, k_1} \rangle]}_{M_{p_1}} \oplus\cdots\oplus \underbrace{[\langle\mathbf{v}_{n,1}\rangle\oplus\cdots\oplus\langle \mathbf{v}_{n, k_n} \rangle]}_{M_{p_n}} $$ where $\operatorname{ann}(\langle\mathbf{v}_{i,j}\rangle) = \langle p_i^{e_{i,j}}\rangle$ and the terms in each cyclic decomposition can be arranged so that, for each $i$ $$ @@ -4259,7 +4258,7 @@ $$ or equivalently $e_i = e_{i,1} \geq\cdots\geq e_{i,k_i}$. 2. As for uniqueness, suppose $$ - M = \underbrace{[\langle \mathbf{u}_{1,1}\oplus\cdots\oplus\langle \mathbf{u}_{i, k_1} \rangle]}_{N_{q_1}} \oplus\cdots\oplus \underbrace{[\langle \mathbf{u}_{1,1}\oplus\cdots\oplus\langle \mathbf{u}_{i, k_n} \rangle]}_{N_{q_n}} + M = \underbrace{[\langle\mathbf{u}_{1,1}\rangle\oplus\cdots\oplus\langle \mathbf{u}_{1, k_1} \rangle]}_{N_{q_1}} \oplus\cdots\oplus \underbrace{[\langle\mathbf{u}_{n,1}\rangle\oplus\cdots\oplus\langle \mathbf{u}_{n, k_n} \rangle]}_{N_{q_n}} $$ is also a primary cyclic decomposition of $M$. Then - The number of summands is the same in both decompositions; in fact, $m = n$ and after possible reindexing $k_u = j_u$ for all $u$. 
@@ -4732,7 +4731,7 @@ $$ [\mathrm{T}]_B = \begin{bmatrix} 0 & 0 & \cdots & 0 & -a_0 \\ 1 & 0 & \cdots & 0 & -a_1 \\ - 0 & 1 & \ddots & ~ & \vdots \\ + 0 & 1 & \ddots & & \vdots \\ \vdots & \vdots & \ddots & 0 & -a_{n-2} \\ 0 & 0 & \cdots & 1 & -a_{n-1} \\ \end{bmatrix} @@ -4753,7 +4752,7 @@ $$ C[p(x)] = \begin{bmatrix} 0 & 0 & \cdots & 0 & -a_0 \\ 1 & 0 & \cdots & 0 & -a_1 \\ - 0 & 1 & \ddots & ~ & \vdots \\ + 0 & 1 & \ddots & & \vdots \\ \vdots & \vdots & \ddots & 0 & -a_{n-2} \\ 0 & 0 & \cdots & 1 & -a_{n-1} \\ \end{bmatrix} @@ -4930,7 +4929,7 @@ where the polynomials $s_k(x)$ are the invariant factors of $\mathrm{T}$ and $s_ 2. Each similarity class $\mathcal{S}$ of matrices contains a matrix $R$ in the invariant factor form of rational canonical form. Moreover, the set of matrices in $\mathcal{S}$ that have this form is the set of matrices obtained from $M$ by reordering the block diagonal matrices. Any such matrix is called and *invariant factor version* of a *rational canoncial form* of $\mathbf{A}$. 3. The dimension of $V$ is the sum of the degrees of the invariant factors of $\mathrm{T}$ $$ - \dim(V) = \sum_{i=1}^n \sum_{j=1} \deg(s_i) + \dim(V) = \sum_{i=1}^n \deg(s_i) $$ @@ -4976,7 +4975,7 @@ $$ =& \begin{bmatrix} x & 0 & \cdots & 0 & a_0 \\ -1 & x & \cdots & 0 & a_1 \\ - 0 & -1 & \ddots & ~ & \vdots \\ + 0 & -1 & \ddots & & \vdots \\ \vdots & \vdots & \ddots & x & a_{n-2} \\ 0 & 0 & \cdots & -1 & x + a_{n-1} \end{bmatrix} @@ -5134,7 +5133,7 @@ Every $n\times n$ matrix $\boldsymbol{A}\in\mathcal{M}_n (\mathbb{F})$ has an ei
Proof -Assume $\boldsymbol{A}$ represent a linear operator $\mathrm{T}_\boldsymbol{A}:V\to V$ and choose any nonzero $\boldsymbol{v}\in V$. Consider the vectors $\Set{\boldsymbol{A}^i \boldsymbol{v}}_{i=0}^n$, where $\boldsymbol{A}^0 = \boldsymbol{I}$. Since this set has $n + 1$ vectors it must be linearly dependent. Thus, +Assume $\boldsymbol{A}$ represents a linear operator $\mathrm{T}_{\boldsymbol{A}}:V\to V$ and choose any nonzero $\boldsymbol{v}\in V$. Consider the vectors $\Set{\boldsymbol{A}^i \boldsymbol{v}}_{i=0}^n$, where $\boldsymbol{A}^0 = \boldsymbol{I}$. Since this set has $n + 1$ vectors it must be linearly dependent. Thus, $$ \begin{align*} @@ -5273,7 +5272,7 @@ $$ c_{n-1} =& (-1)^1 \sum_i \lambda_i \\ c_{n-2} =& (-1)^2 \sum_{i < j} \lambda_i \lambda_j \\ c_{n-3} =& (-1)^3 \sum_{i < j < k} \lambda_i \lambda_j \lambda_k \\ - \vdot& \\ + \vdots& \\ c_0 =& (-1)^n \prod_{i=1}^n \lambda_i \end{align*} $$ @@ -5382,7 +5381,7 @@ Thus, for this basis the matrix of $\mathrm{T}|_{\langle\mathbf{v}_{i,j}\rangle} $$ \mathbf{J}(\lambda_i, e_{i,j}) = \begin{bmatrix} \lambda_i & 0 & \cdots & \cdots & 0 \\ - 1 & \lambda_i & \ddots & ~ $ \vdots \\ + 1 & \lambda_i & \ddots & & \vdots \\ 0 & 1 & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & \ddots & 0 \\ 0 & \c @@ -5517,10 +5516,10 @@ A matrix $\mathbf{A}\in\mathcal{M}_n (\mathbb{F})$ is *almost upper triangular* $$ \mathbf{A} = \begin{bmatrix} - \mathbf{A}_1 & ~ & * & ~ \\ - ~ & \mathbf{A}_2 & ~ & ~ \\ - ~ & ~ & \ddots & ~ \\ - ~ & \mathbf{0} & ~ & \mathbf{A}_k + \mathbf{A}_1 & & * & \\ + & \mathbf{A}_2 & & \\ + & & \ddots & \\ + & \mathbf{0} & & \mathbf{A}_k \end{bmatrix} $$ @@ -6020,8 +6019,8 @@ To show that $\ker(G(B)) = \Set{\boldsymbol{0}}$, choose $\boldsymbol{\beta}\in\ Let $\langle\cdot, \cdot\rangle: V\times V \to\mathbb{F}$ be an inner product on the $\mathbb{F}$-vector space $V$. Suppose $U\subseteq V$ is a $k$-dimensional subspace. A set $B = \Set{\boldsymbol{b}_i}_{i=1}^{m\leq k}$ is called -- *orthogonal system* (OS) if $\langle\boldsymbol{b}_i,\boldsymbol{b}_j\langle = 0$ for all $i\neq j$ -- *orthonormal system* (ONS) if $\langle\boldsymbol{b}_i,\boldsymbol{b}_j\langle = \delta_{ij}$, where $\delta$ is the Kronecker delta function +- *orthogonal system* (OS) if $\langle\boldsymbol{b}_i,\boldsymbol{b}_j\rangle = 0$ for all $i\neq j$ +- *orthonormal system* (ONS) if $\langle\boldsymbol{b}_i,\boldsymbol{b}_j\rangle = \delta_{ij}$, where $\delta$ is the Kronecker delta function - *orthogonal basis* if it is an OS and a basis of $U$ - *orthonormal basis* if it is an OSN and a basis of $U$ @@ -6144,7 +6143,7 @@ $$ ### QR factorization -The Gram-Schmidt process can be used to factor any real or complex matrix into a product of a matrix with orthogonal columns and an uppoer triangular matrix. Suppose that $\mathbf{A} = \left[\begin{smallmatrix} \shortmid & ~ & \shortmid \\ \mathbf{v}_1 & \cdots & \mathbf{v}_n \\ \shortmid & ~ & \shortmid \end{smallmatrix}\right] \in \mathcal{M}_{m,n} (\mathbb{F})$ is an $m\times n$ matrix where $n \leq m$. The Gram-Schmidt process applied to these columns gives orthogonal vectors $\mathbf{O} = \left[\begin{smallmatrix} \shortmid & ~ & \shortmid \\ \mathbf{u}_1 & \cdots & \mathbf{u}_n \\ \shortmid & ~ & \shortmid \end{smallmatrix}\right]$ for which +The Gram-Schmidt process can be used to factor any real or complex matrix into a product of a matrix with orthogonal columns and an upper triangular matrix. 
Suppose that $\mathbf{A} = \left[\begin{smallmatrix} \shortmid & & \shortmid \\ \mathbf{v}_1 & \cdots & \mathbf{v}_n \\ \shortmid & & \shortmid \end{smallmatrix}\right] \in \mathcal{M}_{m,n} (\mathbb{F})$ is an $m\times n$ matrix where $n \leq m$. The Gram-Schmidt process applied to these columns gives orthogonal vectors $\mathbf{O} = \left[\begin{smallmatrix} \shortmid & & \shortmid \\ \mathbf{u}_1 & \cdots & \mathbf{u}_n \\ \shortmid & & \shortmid \end{smallmatrix}\right]$ for which $$ \langle \mathbf{u}_1,\dots,\mathbf{u}_k \rangle = \langle \mathbf{v}_1,\dots,\mathbf{v}_k \rangle,\; \forall k \leq n $$ @@ -6162,18 +6161,18 @@ $$ In matrix terms $$ - \begin{bmatrix} \shortmid & ~ & \shortmid \\ \mathbf{v}_1 & \cdots & \mathbf{v}_n \\ \shortmid & ~ & \shortmid \end{bmatrix} = \begin{bmatrix} \shortmid & ~ & \shortmid \\ \mathbf{u}_1 & \cdots & \mathbf{u}_n \\ \shortmid & ~ & \shortmid \end{bmatrix} \begin{bmatrix} 1 & \lambda_{2,1} & \cdots & \lambda_{n,1} \\ ~ & 1 & \cdots & \lambda_{n,2} \\ ~ & ~ & \ddots & \vdots \\ ~ & ~ & ~ & 1 \end{bmatrix} + \begin{bmatrix} \shortmid & & \shortmid \\ \mathbf{v}_1 & \cdots & \mathbf{v}_n \\ \shortmid & & \shortmid \end{bmatrix} = \begin{bmatrix} \shortmid & & \shortmid \\ \mathbf{u}_1 & \cdots & \mathbf{u}_n \\ \shortmid & & \shortmid \end{bmatrix} \begin{bmatrix} 1 & \lambda_{2,1} & \cdots & \lambda_{n,1} \\ & 1 & \cdots & \lambda_{n,2} \\ & & \ddots & \vdots \\ & & & 1 \end{bmatrix} $$ That is $\mathbf{A} = \mathbf{OB}$ where $\mathbf{O}$ has orthogonal columns and $\mathbf{B}$ is upper triangular. We may normalize the nonzero columns $\mathbf{u}_i$ of $\mathbf{O}$ and move the positive constants to $\mathbf{B}$. In particular, if $\alpha_i = \lVert \mathbf{u}_i \rVert$ for $\mathbf{u}_i \neq \mathbf{0}$ and $\alpha_i = 1$ for $\mathbf{u}_i = \mathbf{0}$, then $$ - \begin{bmatrix} \shortmid & ~ & \shortmid \\ \mathbf{v}_1 & \cdots & \mathbf{v}_n \\ \shortmid & ~ & \shortmid \end{bmatrix} = \begin{bmatrix} \shortmid & ~ & \shortmid \\ \frac{\mathbf{u}_1}{\alpha_1} & \cdots & \frac{\mathbf{v}_n}{\alpha_n} \\ \shortmid & ~ & \shortmid \end{bmatrix} \begin{bmatrix} \alpha_1 & \alpha_1 \lambda_{2,1} & \cdots & \alpha_1 \lambda_{n,1} \\ ~ & \alpha_2 & \cdots & \alpha_2 \lambda_{n,2} \\ ~ & ~ & \ddots & \vdots \\ ~ & ~ & ~ & \alpha_n \end{bmatrix} + \begin{bmatrix} \shortmid & & \shortmid \\ \mathbf{v}_1 & \cdots & \mathbf{v}_n \\ \shortmid & & \shortmid \end{bmatrix} = \begin{bmatrix} \shortmid & & \shortmid \\ \frac{\mathbf{u}_1}{\alpha_1} & \cdots & \frac{\mathbf{u}_n}{\alpha_n} \\ \shortmid & & \shortmid \end{bmatrix} \begin{bmatrix} \alpha_1 & \alpha_1 \lambda_{2,1} & \cdots & \alpha_1 \lambda_{n,1} \\ & \alpha_2 & \cdots & \alpha_2 \lambda_{n,2} \\ & & \ddots & \vdots \\ & & & \alpha_n \end{bmatrix} $$ That is $\mathbf{A} = \mathbf{QR}$ where the columns of $\mathbf{Q}$ are orthogonal and each column is either a unit vector or the zero vector and $\mathbf{R}$ is upper triangular with positive entries on the main diagonal. Moreover, if the vectors $\mathbf{v}_1,\dots,\mathbf{v}_n$ are linearly independent, then the columns of $\mathbf{Q}$ are nonzero. Also, if $m = n$ and $\mathbf{A}$ is nonsingular, then $\mathbf{Q}$ is unitary/orthogonal. -If the columns of $\mathbf{A}$ are not linearly independent, we can make one final adjustment to this matrix factorization. If a column $\frac{\mathbf{u}_i}{\alpha_i}$ is zero, then we may replace this column by any vector as long as we replace the $(i,i)$th entry $\alpha_i$ in $\mathbb{R}$ by $0$. 
Thus, we can take nonzero columns of $\mathbf{Q}, extend to an orthonormal basis for the span of the columns of $\mathbf{Q}$ and replace the zero columns of $\mathbf{Q}$ by the additional members of this orthogonal basis. In this way, $\mathbf{Q}$ is replaced by a unitary/orthogonal matrix $\mathbf{Q}'$ and $\mathbf{R}$ is replaced by an upper triangular matrix $\mathbf{R}'$ that has nonnegative entries on the main diagonal. +If the columns of $\mathbf{A}$ are not linearly independent, we can make one final adjustment to this matrix factorization. If a column $\frac{\mathbf{u}_i}{\alpha_i}$ is zero, then we may replace this column by any vector as long as we replace the $(i,i)$th entry $\alpha_i$ in $\mathbf{R}$ by $0$. Thus, we can take the nonzero columns of $\mathbf{Q}$, extend to an orthonormal basis for the span of the columns of $\mathbf{Q}$ and replace the zero columns of $\mathbf{Q}$ by the additional members of this orthonormal basis. In this way, $\mathbf{Q}$ is replaced by a unitary/orthogonal matrix $\mathbf{Q}'$ and $\mathbf{R}$ is replaced by an upper triangular matrix $\mathbf{R}'$ that has nonnegative entries on the main diagonal. Let $\mathbf{A}\in\mathcal{M}_{m,n}(\mathbf{F})$ where $\mathbb{F} = \Set{\mathbb{C},\R}$. There exists a matrix $\mathbf{Q}\in\mathcal{M}_{m,n}(\mathbb{F})$ with orthonormal columns and an upper triangular matrix $\mathbf{R}\in\mathcal{M}_n (\mathbb{F})$ with nonnegative real entries on the main diagonal for which @@ -6203,17 +6202,17 @@ $$ \mathbf{QRx} = \mathbf{u} $$ -and since $\mathbf{Q}^{-1} = \mathbf{Q}^*$, we have +and since $\mathbf{Q}^{-1} = \mathbf{Q}^\dagger$, we have $$ - \mathbf{Rx} = \mathbf{Q}^* \mathbf{u} + \mathbf{Rx} = \mathbf{Q}^\dagger \mathbf{u} $$ which is an upper triangular system easily solved by back substitution. The QR factorization can be used to approximate the eigenvalues of a matrix in a process called the *QR algorithm*. Specifically, if $\mathbf{A} = \mathbf{A}_0$ is an $n\times n$ matrix, define a sequence of matrices as follows: 1. Let $\mathbf{A}_0 = \mathbf{Q}_0 \mathbf{R}_0$ be the QR factorization of $\mathbf{A}_0$ and let $\mathbf{A}_1 = \mathbf{R}_0 \mathbf{Q}_0$ -2. Once $\mathbf{A}_k$ has been defined, let $\mathbf{A}_k = \mathbf{Q}_k \mathbf{R}_k$ be the QR factorization of $\mathbf{A}_k$ and let $\mathbf{A}_{k+1}$ = \mathbf{R}_k \mathbf{Q}_k +2. Once $\mathbf{A}_k$ has been defined, let $\mathbf{A}_k = \mathbf{Q}_k \mathbf{R}_k$ be the QR factorization of $\mathbf{A}_k$ and let $\mathbf{A}_{k+1} = \mathbf{R}_k \mathbf{Q}_k$ Then $\mathbf{A}_k$ is unitarily/orthogonally similar to $\mathbf{A}$ since @@ -6250,7 +6249,6 @@ Let $V = \ell^2$ and let $M$ be the set of all vectors of the form $\mathbf{e}_i On the other hand, the vector space span of $M$ is the subspace $S$ of all sequences in $\ell^2$ that have finite support, i.e. have only a finite number of nonzero terms. Since $\operatorname{span}(M) = S \neq \ell^2$, we see that $M$ is not a Hamel basis for the vector space $\ell^2$. - ### The projection theorem and best approximations Orthonormal bases have a great practical advantage over arbitrary bases. If $B = \Set{\mathbf{v}_i}_{i=1}^n$ is a basis for a vector space $V$, then each $\mathbf{v}\in V$ has the form @@ -6318,10 +6316,10 @@ Hence, $\lVert \mathbf{v} - \mathbf{s} \rVert$ is smallest if and only if $\math -If $S$ is a finite-dimensional subspace of an inner product space $V$, then $S = S \ocirc S^\perp$. 
In particular, if $\mathbf{v}\in V$, then +If $S$ is a finite-dimensional subspace of an inner product space $V$, then $V = S \odot S^\perp$. In particular, if $\mathbf{v}\in V$, then $$ - \mathbf{v} = \tilde{\mathbf{v}} + (\mathbf{v} - \tilde{\mathbf{v}}) \in S \ocirc S^\perp + \mathbf{v} = \tilde{\mathbf{v}} + (\mathbf{v} - \tilde{\mathbf{v}}) \in S \odot S^\perp $$ It follows that $\dim(V) = \dim(S) + \dim(S^\perp)$. @@ -6329,7 +6327,7 @@
Proof -We have seen that $\mathbf{v} - \tilde{\mathbf{v}} \in S^\perp$ and so $V = S + S^\perp$. However, $S \cap S^\perp = \Set{\mathbf{0}}$ and so $V = S \ocirc S^\perp$. +We have seen that $\mathbf{v} - \tilde{\mathbf{v}} \in S^\perp$ and so $V = S + S^\perp$. However, $S \cap S^\perp = \Set{\mathbf{0}}$ and so $V = S \odot S^\perp$.
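+
+As a small numerical companion to the best-approximation discussion (added here as a sketch; it assumes the standard inner product on $\R^6$ and uses a reduced QR factorization to obtain an orthonormal basis of the subspace):
+
+```python
+import numpy as np
+
+rng = np.random.default_rng(1)
+A = rng.standard_normal((6, 3))        # columns span a 3-dimensional subspace S of R^6
+v = rng.standard_normal(6)
+
+Q, _ = np.linalg.qr(A)                 # orthonormal basis of S
+v_hat = Q @ (Q.T @ v)                  # best approximation: sum of <v, u_i> u_i
+
+print(np.allclose(A.T @ (v - v_hat), 0))                          # v - v_hat is orthogonal to S
+s_other = A @ rng.standard_normal(3)                              # an arbitrary competitor in S
+print(np.linalg.norm(v - v_hat) <= np.linalg.norm(v - s_other))   # v_hat minimizes the distance
+```
+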
@@ -6414,7 +6412,7 @@ Let $V$ be a finite-dimensional inner product space. **(1):** Assume $\mathrm{T}$ is surjective. If $f = 0$, then $\mathrm{R}_f = 0$, so let us assume $f \neq 0$. Then $K = \ker(f)$ has codimension $1$ and so $$ - V = \langle\mathbf{w}\rangle \ocirc K,\; \mathbf{w}\in K^\perp + V = \langle\mathbf{w}\rangle \odot K,\; \mathbf{w}\in K^\perp $$ Letting $\mathbf{x} = \alpha\mathbf{w}$ for $\alpha\in\mathbb{F}$, we require that $f(\mathbf{v}) = \langle\mathbf{v},\alpha\mathbf{w}\rangle$. Since this clearly holds for any $\mathbf{v}\in K$, it is sufficient to show that it holds for $\mathbf{v} = \mathbf{w}$, i.e. @@ -6448,7 +6446,586 @@ $$
-## Positive definite matrices +# Structure theory for normal operators + +## The adjoint of a linear operator + + +Let $V$ and $W$ be finite-dimensional inner product spaces over $\mathbb{F}$ and let $\mathrm{T}\in\mathcal{L}(V,W)$. Then there is a unique function $\mathrm{T}^\dagger: W\to V$ defined by the condition + +$$ + \langle \mathrm{T}\mathbf{v}, \mathbf{w}\rangle = \langle \mathbf{v}, \mathrm{T}^\dagger \mathbf{w} \rangle,\; \mathbf{v}\in V, \mathbf{w}\in W +$$ + +The function $\mathrm{T}^\dagger \in \mathcal{L}(W, V)$ is the *adjoint* of $\mathrm{T}$. + +
+Proof + +If $\mathrm{T}^\dagger$ exists, then it is unique, for if + +$$ + \langle\mathrm{T}\mathbf{v},\mathbf{w}\rangle = \langle\mathbf{v},\mathrm{S}\mathbf{w}\rangle +$$ + +then $\langle\mathbf{v},\mathrm{S}\mathbf{w}\rangle = \langle\mathbf{v},\mathrm{T}^\dagger \mathbf{w}\rangle$ for all $\mathbf{v}\in V$ and $\mathbf{w}\in W$ and so $\mathrm{S} = \mathrm{T}^\dagger$. + +We seek a linear map $\mathrm{T}^\dagger : W\to V$ for which $\langle \mathrm{T}\mathbf{v}, \mathbf{w}\rangle = \langle \mathbf{v}, \mathrm{T}^\dagger \mathbf{w} \rangle$. By the Riesz representation theorem, for each $\mathbf{w}\in W$, the linear functional $f_\mathbf{w} \in V^*$ defined by + +$$ + f_\mathbf{w} \mathbf{v} = \langle\mathrm{T}\mathbf{v},\mathbf{w}\rangle +$$ + +has the form + +$$ + f_\mathbf{w} \mathbf{v} = \langle\mathbf{v}, \mathrm{R}_{f_\mathbf{w}}\rangle +$$ + +where $\mathrm{R}_{f_\mathbf{w}} \in V$ is the Riesz vector for $f_\mathbf{w}$. If $\mathrm{T}^\dagger :W\to V$ is defined by + +$$ + \mathrm{T}^\dagger \mathbf{w} = \mathrm{R}_{f_\mathbf{w}} = \mathrm{R}(f_\mathbf{w}) +$$ + +where $\mathrm{R}$ is the Riesz map, then + +$$ + \langle\mathbf{v},\mathrm{T}^\dagger \mathbf{w}\rangle = \langle\mathbf{v},\mathrm{R}_{f_\mathbf{w}}\rangle = f_\mathbf{w}\mathbf{v} = \langle\mathrm{T}\mathbf{v},\mathbf{w}\rangle +$$ + +Finally, since $\mathrm{T}^\dagger = \mathrm{R}\circ f$ is the composition of the Riesz map $\mathrm{R}$ and the map $f:\mathbf{w}\mapsto f_\mathbf{w}$ and since both of these maps are conjugate linear, their composition is linear. +
+
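+
+A minimal numerical check of the defining property (added as an illustration; it assumes the standard inner product on $\mathbb{C}^n$, with respect to which the adjoint of a matrix is its conjugate transpose):
+
+```python
+import numpy as np
+
+rng = np.random.default_rng(2)
+T = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))   # T : C^3 -> C^4
+v = rng.standard_normal(3) + 1j * rng.standard_normal(3)
+w = rng.standard_normal(4) + 1j * rng.standard_normal(4)
+
+def ip(x, y):
+    # <x, y> = sum_i x_i * conj(y_i), conjugate linear in the second argument
+    return np.vdot(y, x)
+
+T_dag = T.conj().T                     # conjugate transpose represents the adjoint
+print(np.allclose(ip(T @ v, w), ip(v, T_dag @ w)))   # <Tv, w> = <v, T†w>
+```
+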
+ + +Let $V$ and $W$ be finite-dimensional $\mathbb{F}$-inner product spaces. For every $\mathrm{S},\mathrm{T}\in\mathcal{L}(V,W)$ and $\alpha\in\mathbb{F}$ +1. $(\mathrm{S} + \mathrm{T})^\dagger = \mathrm{S}^\dagger + \mathrm{T}^\dagger$ +2. $(\alpha\mathrm{T})^\dagger = \bar{\alpha}\mathrm{T}^\dagger$ +3. $\mathrm{T}^{\dagger\dagger} = \mathrm{T}$ and so $\langle\mathrm{T}^\dagger \mathbf{v},\mathbf{w}\rangle = \langle\mathbf{v},\mathrm{T}\mathbf{w}\rangle$ +4. If $V = W$ then $(\mathrm{S}\mathrm{T})^\dagger = \mathrm{T}^\dagger \mathrm{S}^\dagger$ +5. If $\mathrm{T}$ is invertible, then $(\mathrm{T}^{-1})^\dagger = (\mathrm{T}^\dagger)^{-1}$ +6. If $V = W$ and $p(x)\in\R[x]$, then $p(\mathrm{T})^\dagger = p(\mathrm{T}^\dagger)$ + +Moreover, if $\mathrm{T}\in\mathcal{L}(V)$ is a linear operator and $S$ is a subspace of $V$, then +7. $S$ is $\mathrm{T}$-invariant if and only if $S^\perp$ is $\mathrm{T}^\dagger$-invariant +8. $(S,S^\perp)$ reduces $\mathrm{T}$ if and only if $S$ is both $\mathrm{T}$-invariant and $\mathrm{T}^\dagger$-invariant, in which case +$$ + (\mathrm{T}|_S)^\dagger = (\mathrm{T}^\dagger)|_S +$$ + 
+Proof + +**(7):** Let $\mathbf{s}\in S$ and $\mathbf{z}\in S^\perp$ and write + +$$ + \langle\mathrm{T}^\dagger \mathbf{z}, \mathbf{s}\rangle = \langle \mathbf{z}, \mathrm{T}\mathbf{s}\rangle +$$ + +If $S$ is $\mathrm{T}$-invariant, then $\langle\mathrm{T}^\dagger \mathbf{z}, \mathbf{s}\rangle = 0$ for all $\mathbf{s}\in S$ and so $\mathrm{T}^\dagger \mathbf{z} \in S^\perp$ and $S^\perp$ is $\mathrm{T}^\dagger$-invariant. Conversely, if $S^\perp$ is $\mathrm{T}^\dagger$-invariant, then $\langle\mathbf{z},\mathrm{T}\mathbf{s}\rangle = 0$ for all $\mathbf{z}\in S^\perp$ and so $\mathrm{T}\mathbf{s}\in S^{\perp\perp} = S$. Hence $S$ is $\mathrm{T}$-invariant. + +**(8):** The first statement follows from **(7)** applied to both $S$ and $S^\perp$. For the second statement, since $S$ is both $\mathrm{T}$-invariant and $\mathrm{T}^\dagger$-invariant, if $\mathbf{s},\mathbf{t}\in S$, then + +$$ + \langle \mathbf{s}, (\mathrm{T}^\dagger)|_S (\mathbf{t}) \rangle = \langle \mathbf{s}, \mathrm{T}^\dagger \mathbf{t} \rangle = \langle \mathrm{T}\mathbf{s}, \mathbf{t} \rangle = \langle\mathrm{T}|_S (\mathbf{s}), \mathbf{t} \rangle +$$ + +Hence, by definition of adjoint, $(\mathrm{T}^\dagger)|_S = (\mathrm{T}|_S)^\dagger$. +
+
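+
+The algebraic rules above are easy to spot-check for matrices (an added sketch, with the conjugate transpose playing the role of the adjoint on $\mathbb{C}^n$):
+
+```python
+import numpy as np
+
+rng = np.random.default_rng(3)
+S = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
+T = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
+alpha = 2.0 - 1.5j
+dag = lambda M: M.conj().T
+
+print(np.allclose(dag(S + T), dag(S) + dag(T)))              # (S + T)† = S† + T†
+print(np.allclose(dag(alpha * T), np.conj(alpha) * dag(T)))  # (aT)† = conj(a) T†
+print(np.allclose(dag(dag(T)), T))                           # T†† = T
+print(np.allclose(dag(S @ T), dag(T) @ dag(S)))              # (ST)† = T† S†
+```
+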
+ + +Let $\mathrm{T}\in\mathcal{L}(V,W)$, where $V$ and $W$ are finite-dimensional inner product spaces. +1. $\ker(\mathrm{T}^\dagger) = \operatorname{ran}(\mathrm{T})^\perp$ and $\operatorname{ran}(\mathrm{T}^\dagger) = \ker(\mathrm{T})^\perp$, and so +$$ +\begin{align*} + \mathrm{T} \text{ surjective} \iff& \mathrm{T}^\dagger \text{ injective} \\ + \mathrm{T} \text{ injective} \iff& \mathrm{T}^\dagger \text{ surjective} +\end{align*} +$$ +2. $\ker(\mathrm{T}^\dagger \mathrm{T}) = \ker(\mathrm{T})$ and $\ker(\mathrm{T}\mathrm{T}^\dagger) = \ker(\mathrm{T}^\dagger)$ +3. $\operatorname{ran}(\mathrm{T}^\dagger \mathrm{T}) = \operatorname{ran}(\mathrm{T}^\dagger)$ and $\operatorname{ran}(\mathrm{T}\mathrm{T}^\dagger) = \operatorname{ran}(\mathrm{T})$ +4. $(\mathrm{P}_{S,T})^\dagger = \mathrm{P}_{T^\perp, S^\perp}$ + 
+Proof + +**(1):** We have + +$$ +\begin{align*} + \mathbf{u} \in\ker(\mathrm{T}^\dagger) \iff& \mathrm{T}^\dagger \mathbf{u} = \mathbf{0} \\ + \iff& \langle \mathrm{T}^\dagger \mathbf{u}, V \rangle = \Set{\mathbf{0}} \\ + \iff& \langle \mathbf{u}, \mathrm{T}V \rangle = \Set{\mathbf{0}} \\ + \iff& \mathbf{u}\in\operatorname{ran}(\mathrm{T})^\perp +\end{align*} +$$ + +and so $\ker(\mathrm{T}^\dagger) = \operatorname{ran}(\mathrm{T})^\perp$. The second identity follows by replacing $\mathrm{T}$ with $\mathrm{T}^\dagger$ and taking complements. + +**(2):** It is clear that $\ker(\mathrm{T}) \subseteq\ker(\mathrm{T}^\dagger \mathrm{T})$. For the reverse inclusion, we have + +$$ +\begin{align*} + \mathrm{T}^\dagger \mathrm{T}\mathbf{u} = \mathbf{0} \implies& \langle \mathrm{T}^\dagger \mathrm{T}\mathbf{u}, \mathbf{u} \rangle = 0 \\ + \implies& \langle \mathrm{T}\mathbf{u}, \mathrm{T}\mathbf{u} \rangle = 0 \\ + \implies& \mathrm{T}\mathbf{u} = \mathbf{0} +\end{align*} +$$ + +and so $\ker(\mathrm{T}^\dagger \mathrm{T}) \subseteq\ker(\mathrm{T})$. The second identity follows from the first by replacing $\mathrm{T}$ with $\mathrm{T}^\dagger$. +
+
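+
+The kernel and range identities can also be checked numerically (an added sketch; the rank-deficient matrix below is an arbitrary example):
+
+```python
+import numpy as np
+
+rng = np.random.default_rng(4)
+A = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 4))   # 5x4 with rank 3, so ker(A) is nontrivial
+A_dag = A.conj().T
+
+# ker(A†A) = ker(A): a null-space basis of A†A is annihilated by A itself
+_, s, Vh = np.linalg.svd(A_dag @ A)
+N = Vh[s < 1e-10].T                    # columns span ker(A†A)
+print(N.shape[1] > 0 and np.allclose(A @ N, 0, atol=1e-6))
+
+# ran(AA†) = ran(A): the two ranges have the same dimension
+print(np.linalg.matrix_rank(A @ A_dag) == np.linalg.matrix_rank(A))
+```
+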
+ +### Relation between the algebraic adjoint and Hermitian adjoint + + +Let $\mathrm{T}\in\mathcal{L}(V,W)$, where $V$ and $W$ are finite-dimensional inner product spaces. +1. The algebraic adjoint $\mathrm{T}^* :W^* \to V^*$ and the Hermitian adjoint $\mathrm{T}^\dagger : W\to V$ are related by +$$ + \mathrm{T}^* = (\mathrm{R}^V)^{-1} \circ \mathrm{T}^\dagger \circ \mathrm{R}^W +$$ +where $\mathrm{R}^V$ and $\mathrm{R}^W$ are the conjugate Riesz isomorphisms on $V$ and $W$, respectively. +2. If $B$ and $C$ are ordered orthonormal bases for $V$ and $W$, respectively, then +$$ + [\mathrm{T}^\dagger]_{C,B} = ([\mathrm{T}]_{B,C})^\dagger +$$ +In other words, the matrix of the adjoint $\mathrm{T}^\dagger$ is the adjoint (conjugate transpose) of the matrix of $\mathrm{T}$. + +$$ +\begin{CD} + V^* @<{\mathrm{T}^*}<< W^* \\ + @V{\mathrm{R}^V}VV @VV{\mathrm{R}^W}V \\ + V @<{\mathrm{T}^\dagger}<< W +\end{CD} +$$ + 
+Proof + +**(1):** The composite $\mathrm{S}:W^* \to V^*$ defined by + +$$ + \mathrm{S} = (\mathrm{R}^V)^{-1} \circ\mathrm{T}^\dagger \circ\mathrm{R}^W +$$ + +is linear. Moreover, for all $f\in W^*$ and $\mathbf{v}\in V$ + +$$ +\begin{align*} + (\mathrm{T}^* (f))\mathbf{v} =& f(\mathrm{T}\mathbf{v}) \\ + =& \langle\mathrm{T}\mathbf{v}, \mathrm{R}^W (f)\rangle \\ + =& \langle \mathbf{v}, \mathrm{T}^\dagger \mathrm{R}^W (f) \rangle \\ + =& [(\mathrm{R}^V)^{-1} (\mathrm{T}^\dagger \mathrm{R}^W (f))](\mathbf{v}) \\ + =& (\mathrm{S}f)\mathbf{v} +\end{align*} +$$ + +showing that $\mathrm{S} = \mathrm{T}^*$. Hence, the relationship between $\mathrm{T}^*$ and $\mathrm{T}^\dagger$ is + +$$ + \mathrm{T}^* = (\mathrm{R}^V)^{-1} \circ \mathrm{T}^\dagger \circ \mathrm{R}^W +$$ + +Loosely speaking, the Riesz functions are like "change of variables" functions from linear functionals to vectors, and we can say that $\mathrm{T}^\dagger$ does to Riesz vectors what $\mathrm{T}^*$ does to the corresponding linear functionals. Put another way, $\mathrm{T}^*$ and $\mathrm{T}^\dagger$ are the same, up to conjugate Riesz isomorphism. + +**(2):** Suppose that $B = (\mathbf{b}_i)_{i=1}^n$ and $C = (\mathbf{c}_i)_{i=1}^m$ are ordered orthonormal bases for $V$ and $W$, respectively, then + +$$ +\begin{align*} + ([\mathrm{T}^\dagger]_{C,B})_{i,j} =& \langle \mathrm{T}^\dagger \mathbf{c}_j, \mathbf{b}_i \rangle \\ + =& \langle \mathbf{c}_j, \mathrm{T}\mathbf{b}_i \rangle \\ + =& \overline{\langle \mathrm{T}\mathbf{b}_i, \mathbf{c}_j \rangle} \\ + =& \overline{([\mathrm{T}]_{B,C})_{j,i}} +\end{align*} +$$ + +showing that $[\mathrm{T}^\dagger]_{C,B}$ and $[\mathrm{T}]_{B,C}$ are matrix adjoints (conjugate transposes). +
+
+ +## Orthogonal projections + + +A projection of the form $\mathrm{P}_{S,S^\perp}$ is called *orthogonal*. Equivalently, a projection $\mathrm{P}$ is orthogonal if $\ker(\mathrm{P}) \perp \operatorname{ran}(\mathrm{P})$. + + +Note that an orthogonal projection is a different concept from two projections $\mathrm{P}, \mathrm{S}$ being orthogonal to each other, i.e. satisfying $\mathrm{PS} = \mathrm{SP} = 0$. + + +Let $V$ be a finite-dimensional inner product space. The following are equivalent for an operator $\mathrm{P}$ on $V$: +1. $\mathrm{P}$ is an orthogonal projection +2. $\mathrm{P}$ is idempotent and self-adjoint +3. $\mathrm{P}$ is idempotent and does not expand lengths, i.e. +$$ + \lVert\mathrm{P}\mathbf{v}\rVert\leq\lVert\mathbf{v}\rVert,\; \mathbf{v}\in V +$$ + 
+Proof + +**(1) $\iff$ (2):** Since $(\mathrm{P}_{S,T})^\dagger = \mathrm{P}_{T^\perp, S^\perp}$ it follows that $\mathrm{P} = \mathrm{P}^\dagger$. + +**(1) $\iff$ (3):** Let $\mathrm{P} = \mathrm{P}_{S,S^\perp}$. Then, if $\mathbf{v} = \mathbf{s} + \mathbf{t}$ for $\mathbf{s}\in S$ and $\mathbf{t}\in S^\perp$, it follows that + +$$ + \lVert\mathbf{v}\rVert^2 = \lVert\mathbf{s}\rVert^2 + \lVert\mathbf{t}\rVert^2 \geq \lVert\mathbf{s}\rVert^2 = \lVert\mathrm{P}\mathbf{v}\rVert^2 +$$ + +Suppose that **(3)** holds, then + +$$ + \operatorname{ran}(\mathrm{P})\oplus\ker(\mathrm{P}) = V = \ker(\mathrm{P})^\perp \odot \ker(\mathrm{P}) +$$ + +and we want to show that the first direct sum is orthogonal. If $\mathbf{w}\in\operatorname{ran}(\mathrm{P})$, then $\mathbf{w} = \mathbf{x} + \mathbf{y}$, where $\mathbf{x}\in\ker(\mathrm{P})$ and $\mathbf{y}\in\ker(\mathrm{P})^\perp$. Hence + +$$ + \mathbf{w} = \mathrm{P}\mathbf{w} = \mathrm{P}\mathbf{x} + \mathrm{P}\mathbf{y} = \mathrm{P}\mathbf{y} +$$ + +and so the orthogonality of $\mathbf{x}$ and $\mathbf{y}$ implies that + +$$ + \lVert\mathbf{x}\rVert^2 + \lVert\mathbf{y}\rVert^2 = \lVert\mathbf{w}\rVert^2 = \lVert\mathrm{P}\mathbf{y}\rVert^2 \leq \lVert\mathbf{y}\rVert^2 +$$ + +Hence, $\mathbf{x} = \mathbf{0}$, and so $\operatorname{ran}(\mathrm{P}) \subseteq\ker(\mathrm{P})^\perp$, which implies that $\operatorname{ran}(\mathrm{P}) = \ker(\mathrm{P})^\perp$. +
+
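+
+A short numerical illustration of the characterization above (added here; it builds the orthogonal projection onto a column space from an orthonormal basis):
+
+```python
+import numpy as np
+
+rng = np.random.default_rng(5)
+Q, _ = np.linalg.qr(rng.standard_normal((5, 2)))   # orthonormal basis of a 2-dimensional subspace S
+P = Q @ Q.conj().T                                 # orthogonal projection onto S
+v = rng.standard_normal(5)
+
+print(np.allclose(P @ P, P))                       # idempotent
+print(np.allclose(P, P.conj().T))                  # self-adjoint
+print(np.linalg.norm(P @ v) <= np.linalg.norm(v))  # does not expand lengths
+```
+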
+ +### Orthogonal resolutions of the identity + + +An *orthogonal resolution of the identity* is a resolution of the identity $\sum_{i=1}^k \mathrm{P}_i = \mathrm{I}$ in which each projection $\mathrm{P}_i$ is orthogonal. + + + +Let $V$ be an inner product space. Orthogonal resolutions of the identity on $V$ correspond to orthogonal direct sum decompositions of $V$ as follows: +1. If $\sum_{i=1}^k \mathrm{P}_i = \mathrm{I}$ is an orthogonal resolution of the identity, then +$$ + V = \bigodot_{i=1}^k \operatorname{ran}(\mathrm{P}_i) +$$ +where $\mathrm{P}_i$ is the orthogonal projection onto $\operatorname{ran}(\mathrm{P}_i)$. +2. Conversely, if $V = \bigodot_{i=1}^k S_i$ and if $\mathrm{P}_i$ is the orthogonal projection onto $S_i$, then $\sum_{i=1}^k \mathrm{P}_i = \mathrm{I}$. + 
+Proof + +**(1):** If $\sum_{i=1}^k \mathrm{P}_i = \mathrm{I}$ is an orthogonal resolution of the identity, it follows that + +$$ + V = \bigoplus_{i=1}^k \operatorname{ran}(\mathrm{P}_i) +$$ + +However, since the $\mathrm{P}_i$ are pairwise orthogonal and self-adjoint, it follows that, for $i\neq j$, + +$$ + \langle\mathrm{P}_i \mathbf{v}, \mathrm{P}_j \mathbf{w} \rangle = \langle\mathbf{v}, \mathrm{P}_i \mathrm{P}_j \mathbf{w} \rangle = \langle \mathbf{v}, \mathbf{0} \rangle = 0 +$$ + +and so + +$$ + V = \bigodot_{i=1}^k \operatorname{ran}(\mathrm{P}_i) +$$ + +**(2):** For the converse, we know that $\sum_{i=1}^k \mathrm{P}_i = \mathrm{I}$ is a resolution of the identity, where $\mathrm{P}_i$ is a projection onto $\operatorname{ran}(\mathrm{P}_i)$ along + +$$ + \ker(\mathrm{P}_i) = \bigodot_{j\neq i} \operatorname{ran}(\mathrm{P}_j) = \operatorname{ran}(\mathrm{P}_i)^\perp +$$ + +Hence, $\mathrm{P}_i$ is orthogonal. +
+
+ +## Unitary diagonalizability + + +A linear operator $\mathrm{T}\in\mathcal{L}(V)$ is *unitarily diagonalizable* (when $V$ is complex) or *orthogonally diagonalizable* (when $V$ is real) if there is an ordered orthonormal basis $O = (\mathbf{u}_i)_{i=1}^n$ of $V$ for which the matrix $[\mathrm{T}]_O$ is diagonal, or equivalently if + +$$ + \mathrm{T}\mathbf{u}_i = \lambda_i \mathbf{u}_i,\; \forall i=1,\dots,n +$$ + + + +Let $V$ be a finite-dimensional inner product space and let $\mathrm{T}\in\mathcal{L}(V)$. The following are equivalent: +1. $\mathrm{T}$ is unitarily (orthogonally) diagonalizable +2. $V$ has an orthonormal basis that consists entirely of eigenvectors of $\mathrm{T}$ +3. $V$ has the form $V = \bigodot_{i=1}^k E_{\lambda_i}$ where $E_{\lambda_i}$ is the eigenspace of the eigenvalue $\lambda_i$ + + +## Normal operators + + +1. A linear operator $\mathrm{T}\in\mathcal{L}(V)$ on an inner product space $V$ is *normal* if it commutes with its adjoint +$$ + \mathrm{TT}^\dagger = \mathrm{T}^\dagger \mathrm{T} +$$ +2. A matrix $\mathbf{A}\in\mathcal{M}_n (\mathbb{F})$ is *normal* if $\mathbf{A}$ commutes with its adjoint $\mathbf{A}^\dagger$. + + +If $\mathrm{T}$ is normal and $O$ is an ordered orthonormal basis of $V$, then + +$$ + [\mathrm{T}]_O [\mathrm{T}]_O^\dagger = [\mathrm{T}]_O [\mathrm{T}^\dagger]_O = [\mathrm{TT}^\dagger]_O +$$ + +and + +$$ + [\mathrm{T}]_O^\dagger [\mathrm{T}]_O = [\mathrm{T}^\dagger]_O [\mathrm{T}]_O = [\mathrm{T}^\dagger \mathrm{T}]_O +$$ + +and so $\mathrm{T}$ is normal if and only if $[\mathrm{T}]_O$ is normal for some, and hence all, orthonormal bases for $V$. Note that this does not hold for bases that are not orthonormal. + + +Let $\mathrm{T}\in\mathcal{L}(V)$ be a normal operator. +1. The following are also normal operators: + - $\mathrm{T}|_S$ if $(S,S^\perp)$ reduces $\mathrm{T}$ + - $\mathrm{T}^\dagger$ + - $\mathrm{T}^{-1}$ if $\mathrm{T}$ is invertible + - $p(\mathrm{T})$ for any polynomial $p(x) \in\mathbb{F}[x]$ +2. For any $\mathbf{v},\mathbf{w}\in V$ +$$ + \langle\mathrm{T}\mathbf{v},\mathrm{T}\mathbf{w}\rangle = \langle\mathrm{T}^\dagger \mathbf{v}, \mathrm{T}^\dagger \mathbf{w}\rangle +$$ +and in particular +$$ + \lVert\mathrm{T}\mathbf{v}\rVert = \lVert\mathrm{T}^\dagger \mathbf{v}\rVert +$$ +and so +$$ + \ker(\mathrm{T}^\dagger) = \ker(\mathrm{T}) +$$ +3. For any integer $k\geq 1$ +$$ + \ker(\mathrm{T}^k) = \ker(\mathrm{T}) +$$ +4. The minimal polynomial $m_\mathrm{T}(x)$ is a product of distinct prime monic polynomials +5. $\mathrm{T}\mathbf{v} = \lambda\mathbf{v} \iff \mathrm{T}^\dagger \mathbf{v} = \bar{\lambda}\mathbf{v}$ +6. If $S$ and $T$ are submodules of $V_\mathrm{T}$ with relatively prime orders, then $S\perp T$ +7. If $\lambda$ and $\mu$ are distinct eigenvalues of $\mathrm{T}$, then $E_\lambda \perp E_\mu$ + 
+Proof + +**(2):** Normality implies that + +$$ + \langle \mathrm{T}\mathbf{v},\mathrm{T}\mathbf{w}\rangle = \langle\mathrm{T}^\dagger \mathrm{T}\mathbf{v}, \mathbf{w}\rangle = \langle \mathrm{TT}^\dagger \mathbf{v}, \mathbf{w}\rangle = \langle \mathrm{T}^\dagger \mathbf{v}, \mathrm{T}^\dagger \mathbf{w} \rangle +$$ + +**(3):** Consider the operator $\mathrm{S} = \mathrm{T}^\dagger \mathrm{T}$ which is self-adjoint, i.e. + +$$ + \mathrm{S}^\dagger = (\mathrm{T}^\dagger \mathrm{T})^\dagger = \mathrm{T}^\dagger \mathrm{T} = \mathrm{S} +$$ + +If $\mathrm{S}^k \mathbf{v} = \mathbf{0}$ for $k > 1$, then + +$$ + 0 = \langle\mathrm{S}^k \mathbf{v}, \mathrm{S}^{k-2} \mathbf{v} \rangle = \langle \mathrm{S}^{k-1}\mathbf{v}, \mathrm{S}^{k-1}\mathbf{v} \rangle +$$ + +and so $\mathrm{S}^{k-1} \mathbf{v} = \mathbf{0}$. Continuing in this way gives $\mathrm{S}\mathbf{v} = \mathbf{0}$. If $\mathrm{T}^k \mathbf{v} = \mathbf{0}$ for $k > 1$, then + +$$ + \mathrm{S}^k \mathbf{v} = (\mathrm{T}^\dagger \mathrm{T})^k \mathbf{v} = (\mathrm{T}^\dagger)^k \mathrm{T}^k \mathbf{v} = \mathbf{0} +$$ +and so $\mathrm{S}\mathbf{v} = \mathbf{0}$. Hence + +$$ + 0 = \langle\mathrm{S}\mathbf{v},\mathbf{v}\rangle = \langle\mathrm{T}^\dagger \mathrm{T}\mathbf{v},\mathbf{v}\rangle = \langle\mathrm{T}\mathbf{v},\mathrm{T}\mathbf{v}\rangle +$$ + +and so $\mathrm{T}\mathbf{v} = \mathbf{0}$. + +**(4):** Suppose that $m_\mathrm{T} (x) = p^e (x)q(x)$ where $p(x)$ is monic and prime. Then for any $\mathbf{v}\in V$ + +$$ + p^e (\mathrm{T})[q(\mathrm{T})\mathbf{v}] = 0 +$$ + +and since $p(\mathrm{T})$ is also normal, **(3)** implies that + +$$ + p(\mathrm{T})[q(\mathrm{T})\mathbf{v}] = 0 +$$ + +for all $\mathbf{v}\in V$. Hence, $p(\mathrm{T})q(\mathrm{T}) = 0$, which implies that $e = 1$. Thus, the prime factors of $m_\mathrm{T}(x)$ appear only to the first power. + +**(5):** This follows from **(2)** applied to the normal operator $\mathrm{T} - \lambda\mathrm{I}$ + +$$ + \ker(\mathrm{T} - \lambda\mathrm{I}) = \ker[(\mathrm{T} - \lambda\mathrm{I})^\dagger] = \ker(\mathrm{T}^\dagger - \bar{\lambda}\mathrm{I}) +$$ + +**(6):** If $o(S) = p(x)$ and $o(T) = q(x)$, then there are polynomials $a(x)$ and $b(x)$ for which $a(x)p(x) + b(x)q(x) = 1$ and so + +$$ + a(\mathrm{T})p(\mathrm{T}) + b(\mathrm{T})q(\mathrm{T}) = \mathrm{I} +$$ + +Now, $\mathrm{A} = a(\mathrm{T})p(\mathrm{T})$ annihilates $S$ and $\mathrm{B} = b(\mathrm{T})q(\mathrm{T})$ annihilates $T$. Thus, $\mathrm{B}^\dagger$ also annihilates $T$ and so + +$$ + \langle S, T \rangle = \langle (\mathrm{A} + \mathrm{B})S, T\rangle = \langle \mathrm{B}S, T \rangle = \langle S, \mathrm{B}^\dagger T \rangle = \Set{0} +$$ + +**(7):** This follows from **(6)**, since $o(E_\lambda) = x - \lambda$ and $o(E_\mu) = x - \mu$ are relatively prime when $\lambda\neq\mu$. Alternatively, for $\mathbf{v}\in E_\lambda$ and $\mathbf{w}\in E_\mu$, we have + +$$ + \lambda\langle\mathbf{v},\mathbf{w}\rangle = \langle\mathrm{T}\mathbf{v},\mathbf{w}\rangle = \langle\mathbf{v},\mathrm{T}^\dagger \mathbf{w}\rangle = \langle\mathbf{v}, \bar{\mu}\mathbf{w}\rangle = \mu\langle\mathbf{v},\mathbf{w}\rangle +$$ + +and so $\lambda\neq\mu$ implies that $\langle\mathbf{v},\mathbf{w}\rangle = 0$. +
+
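+
+The sketch below (an added illustration) constructs a normal matrix that is neither Hermitian nor unitary and checks two of the properties above, namely $\lVert\mathrm{T}\mathbf{v}\rVert = \lVert\mathrm{T}^\dagger\mathbf{v}\rVert$ and the eigenvector correspondence $\mathrm{T}\mathbf{v} = \lambda\mathbf{v} \iff \mathrm{T}^\dagger\mathbf{v} = \bar{\lambda}\mathbf{v}$:
+
+```python
+import numpy as np
+
+rng = np.random.default_rng(6)
+U, _ = np.linalg.qr(rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4)))  # a random unitary
+lam = rng.standard_normal(4) + 1j * rng.standard_normal(4)
+A = U @ np.diag(lam) @ U.conj().T      # normal by construction (unitarily diagonalizable)
+A_dag = A.conj().T
+
+print(np.allclose(A @ A_dag, A_dag @ A))                              # A commutes with its adjoint
+
+v = rng.standard_normal(4) + 1j * rng.standard_normal(4)
+print(np.isclose(np.linalg.norm(A @ v), np.linalg.norm(A_dag @ v)))   # ||Av|| = ||A†v||
+
+u = U[:, 0]                            # eigenvector of A with eigenvalue lam[0]
+print(np.allclose(A @ u, lam[0] * u))
+print(np.allclose(A_dag @ u, np.conj(lam[0]) * u))                    # A†u = conj(lam[0]) u
+```
+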
+ +### The spectral theorem for normal operators + + +Let $V$ be a finite-dimensional complex inner product space and let $\mathrm{T}\in\mathcal{L}(V)$. The following are equivalent: +1. $\mathrm{T}$ is normal +2. $\mathrm{T}$ is unitarily diagonalizable, i.e. $V_\mathrm{T} = \bigodot_{i=1}^k E_{\lambda_i}$ +3. $\mathrm{T}$ has an orthogonal spectral resolution $\mathrm{T} = \sum_{i=1}^k \lambda_i \mathrm{P}_i$ where $\sum_{i=1}^k \mathrm{P}_i = \mathrm{I}$ and $\mathrm{P}_i$ is orthogonal for all $i$, in which case $\Set{\lambda_i}_{i=1}^k$ is the spectrum of $\mathrm{T}$ and $\operatorname{ran}(\mathrm{P}_i) = E_{\lambda_i}$ and $\ker(\mathrm{P}_i) = \bigodot_{j\neq i} E_{\lambda_j}$ + +
+Proof + +**(2) $\iff$ (3):** From the diagonalizability characterization, we know that $V_\mathrm{T} = \bigoplus_{i=1}^k E_{\lambda_i}$ if and only if $\mathrm{T} = \sum_{i=1}^k \lambda_i \mathrm{P}_i$. In this case $\operatorname{ran}(\mathrm{P}_i) = E_{\lambda_i}$ and $\ker(\mathrm{P}_i) = \bigoplus_{j\neq i} E_{\lambda_j}$. However, $E_{\lambda_i} \perp E_{\lambda_j}$ for $i\neq j$ if and only if + +$$ + \operatorname{ran}(\mathrm{P}_i) \perp \ker(\mathrm{P}_i) +$$ + +That is, if and only if each $\mathrm{P}_i$ is orthogonal. Hence, the direct sum $V_\mathrm{T} = \bigoplus_{i=1}^k E_{\lambda_i}$ is an orthogonal sum if and only if each projection is orthogonal. +
+
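+
+The orthogonal spectral resolution can be exhibited numerically (an added sketch; the unitary matrix and eigenvalues below are arbitrary choices, and the projections are formed from the columns of the eigenbasis):
+
+```python
+import numpy as np
+
+rng = np.random.default_rng(7)
+U, _ = np.linalg.qr(rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4)))  # orthonormal eigenbasis
+lam = np.array([2.0 + 1.0j, -1.0, 0.5j, 3.0])                                        # distinct eigenvalues
+A = U @ np.diag(lam) @ U.conj().T                                                    # a normal operator on C^4
+
+P = [np.outer(U[:, i], U[:, i].conj()) for i in range(4)]   # P_i projects onto the i-th eigenspace
+
+print(np.allclose(sum(P), np.eye(4)))                        # orthogonal resolution of the identity
+print(np.allclose(P[0] @ P[1], 0))                           # the projections are pairwise orthogonal
+print(np.allclose(sum(l * Pi for l, Pi in zip(lam, P)), A))  # A = sum_i lambda_i P_i
+```
+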
+ + +A linear operator $\mathrm{T}$ on a finite-dimensional real inner product space is normal if and only if + +$$ + V = \left(\bigodot_{i=1}^k E_{\lambda_i} \right) \odot \left(\bigodot_{j=1}^m W_j \right) +$$ + +where $\Set{\lambda_i}_{i=1}^k$ is the spectrum of $\mathrm{T}$ and each $W_j$ is an indecomposable two-dimensional $\mathrm{T}$-invariant subspace with an ordered basis $B_j$ for which + +$$ + [\mathrm{T}]_{B_j} = \begin{bmatrix} a_j & -b_j \\ b_j & a_j \end{bmatrix} +$$ + 
+Proof + +We only need to show that if $V$ has such a decomposition, then $\mathrm{T}$ is normal. However, + +$$ + [\mathrm{T}]_{B_j} [\mathrm{T}]_{B_j}^\top = (a_j^2 + b_j^2)\mathbf{I}_2 = [\mathrm{T}]_{B_j}^\top [\mathrm{T}]_{B_j} +$$ + +and so $[\mathrm{T}]_{B_j}$ is normal. It follows that $\mathrm{T}$ is normal. +
+
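+
+For the real case, a normal operator can be assembled from real eigenvalues and $2\times 2$ rotation-scaling blocks as in the theorem above; the sketch below (an added example) builds such an operator and verifies normality:
+
+```python
+import numpy as np
+
+def rot_scale(a, b):
+    # the 2x2 block [[a, -b], [b, a]] from the theorem above
+    return np.array([[a, -b], [b, a]])
+
+rng = np.random.default_rng(8)
+Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))            # a random orthogonal change of basis
+blocks = [np.array([[3.0]]), rot_scale(1.0, 2.0), rot_scale(-0.5, 1.5)]
+D = np.zeros((5, 5))
+i = 0
+for B in blocks:                                            # place the blocks on the diagonal
+    k = B.shape[0]
+    D[i:i+k, i:i+k] = B
+    i += k
+A = Q @ D @ Q.T
+
+print(np.allclose(A @ A.T, A.T @ A))                        # A is normal
+print(np.allclose(np.sort_complex(np.linalg.eigvals(rot_scale(1.0, 2.0))),
+                  np.sort_complex(np.array([1 + 2j, 1 - 2j]))))   # each block contributes a ± bi
+```
+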
+ +## Self-adjoint operators + + +A linear operator $\mathrm{T}\in\mathcal{L}(V)$ on an $\mathbb{F}$-inner product space $V$ is *self-adjoint* if + +$$ + \mathrm{T}^\dagger = \mathrm{T} +$$ + +A self-adjoint operator is also called *Hermitian* when $\mathbb{F} = \mathbb{C}$ and *symmetric* when $\mathbb{F} = \R$, in which case $\mathrm{T} = \mathrm{T}^\top$. + + + +A linear operator $\mathrm{T}\in\mathcal{L}(V)$ on an $\mathbb{F}$-inner product space $V$ is *skew self-adjoint* if + +$$ + \mathrm{T}^\dagger = -\mathrm{T} +$$ + +A skew self-adjoint operator is also called *skew-Hermitian* when $\mathbb{F} = \mathbb{C}$ and *skew-symmetric* when $\mathbb{F} = \R$, in which case $\mathrm{T} = -\mathrm{T}^\top$. + + + +A linear operator $\mathrm{T}\in\mathcal{L}(V)$ on an $\mathbb{F}$-inner product space $V$ is *unitary* for $\mathbb{F} = \mathbb{C}$ and orthogonal for $\mathbb{F} = \R$ if $\mathrm{T}$ is invertible and + +$$ + \mathrm{T}^\dagger = \mathrm{T}^{-1} +$$ + + + +Let $\mathrm{T}\in\mathcal{L}(V)$ be a linear operator on an $\mathbb{F}$-inner product space $V$. The *quadratic form* associated with $\mathrm{T}$ is the function $q_\mathrm{T}: V\to\mathbb{F}$ defined by + +$$ + q_\mathrm{T} (\mathbf{v}) = \langle\mathrm{T}\mathbf{v}, \mathbf{v}\rangle +$$ + + + +Let $V$ be a finite-dimensional inner product space and let $\mathrm{S},\mathrm{T}\in\mathcal{L}(V)$ be linear operators on $V$. +1. If $\mathrm{S}$ and $\mathrm{T}$ are self-adjoint, then so are the following + - $\mathrm{S} + \mathrm{T}$ + - $\mathrm{T}^{-1}$ if $\mathrm{T}$ is invertible + - $p(\mathrm{T})$, for any real polynomial $p(x)\in\R[x]$ +2. A complex operator $\mathrm{T}$ is Hermitian if and only if the quadratic form $q_\mathrm{T}$ is real for all $\mathbf{v}\in V$ +3. If $\mathrm{T}$ is a complex operator or a real symmetric operator, then $\mathrm{T} = 0 \iff q_\mathrm{T} = 0$ +4. The characteristic polynomial $c_\mathrm{T}(x)$ of a self-adjoint operator $\mathrm{T}$ splits over $\R$, i.e. all complex roots of $c_\mathrm{T}(x)$ are real. Hence, the minimal polynomial $m_\mathrm{T}(x)$ of $\mathrm{T}$ is the product of distinct monic linear factors over $\R$. + +
+Proof + +**(2):** If $\mathrm{T}$ is Hermitian, then + +$$ + \langle\mathrm{T}\mathbf{v}, \mathbf{v}\rangle = \langle\mathbf{v},\mathrm{T}\mathbf{v}\rangle = \overline{\langle\mathrm{T}\mathbf{v},\mathbf{v}\rangle} +$$ + +and so $q_\mathrm{T}(\mathbf{v}) = \langle\mathrm{T}\mathbf{v},\mathbf{v}\rangle$ is real. Conversely, if $\langle\mathrm{T}\mathbf{v},\mathbf{v}\rangle\in\R$, then + +$$ + \langle\mathbf{v},\mathrm{T}\mathbf{v}\rangle = \langle\mathrm{T}\mathbf{v},\mathbf{v}\rangle = \langle\mathbf{v}, \mathrm{T}^\dagger \mathbf{v}\rangle +$$ + +and so $\langle\mathbf{v}, (\mathrm{T} - \mathrm{T}^\dagger)\mathbf{v}\rangle = 0$ for all $\mathbf{v}\in V$, which by **(3)** gives $\mathrm{T} = \mathrm{T}^\dagger$. + +**(3):** We only need to prove that $q_\mathrm{T} = 0$ implies $\mathrm{T} = 0$ when $\mathbb{F} = \R$ and $\mathrm{T}$ is symmetric. If $q_\mathrm{T} = 0$, then + +$$ +\begin{align*} + 0 =& \langle\mathrm{T}(\mathbf{x} + \mathbf{y}), \mathbf{x} + \mathbf{y}\rangle \\ + =& \langle\mathrm{T}\mathbf{x}, \mathbf{x}\rangle + \langle\mathrm{T}\mathbf{y}, \mathbf{y}\rangle + \langle\mathrm{T}\mathbf{x}, \mathbf{y}\rangle + \langle\mathrm{T}\mathbf{y}, \mathbf{x}\rangle \\ + =& \langle\mathrm{T}\mathbf{x}, \mathbf{y}\rangle + \langle\mathrm{T}\mathbf{y}, \mathbf{x}\rangle \\ + =& \langle\mathrm{T}\mathbf{x}, \mathbf{y}\rangle + \langle\mathbf{y}, \mathrm{T}\mathbf{x}\rangle \\ + =& \langle\mathrm{T}\mathbf{x}, \mathbf{y}\rangle + \langle\mathrm{T}\mathbf{x}, \mathbf{y}\rangle \\ + =& 2\langle\mathrm{T}\mathbf{x}, \mathbf{y}\rangle +\end{align*} +$$ + +Setting $\mathbf{y} = \mathrm{T}\mathbf{x}$ gives $\lVert\mathrm{T}\mathbf{x}\rVert^2 = 0$ for every $\mathbf{x}$, and so $\mathrm{T} = 0$. + +**(4):** If $\mathrm{T}$ is Hermitian for $\mathbb{F} = \mathbb{C}$ and $\mathrm{T}\mathbf{v} = \lambda\mathbf{v}$, then + +$$ + \lambda\mathbf{v} = \mathrm{T}\mathbf{v} = \mathrm{T}^\dagger \mathbf{v} = \bar{\lambda}\mathbf{v} +$$ + +and so $\lambda = \bar{\lambda}$ is real. + +If $\mathrm{T}$ is symmetric for $\mathbb{F} = \R$, note that a nonreal root of $c_\mathrm{T}(x)$ is not an eigenvalue of $\mathrm{T}$. If $\mathbf{A} = [\mathrm{T}]_O$ for any ordered orthonormal basis $O$ for $V$, then $c_\mathrm{T}(x) = c_\mathbf{A}(x)$. Now, $\mathbf{A}$ is a real symmetric matrix, but can be thought of as a complex Hermitian matrix with real entries. As such, it represents a Hermitian linear operator on the complex space $\mathbb{C}^n$ and so, by what we have just shown, all (complex) roots of its characteristic polynomial are real. However, the characteristic polynomial of $\mathbf{A}$ is the same, whether we think of $\mathbf{A}$ as a real or complex matrix and so the result follows. +
+
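+
+A brief numerical check of items 2 and 4 above (an added sketch; the Hermitian matrix below is an arbitrary example):
+
+```python
+import numpy as np
+
+rng = np.random.default_rng(9)
+M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
+A = (M + M.conj().T) / 2               # a Hermitian (self-adjoint) matrix
+
+v = rng.standard_normal(4) + 1j * rng.standard_normal(4)
+q = np.vdot(v, A @ v)                  # quadratic form <Av, v> = v† A v
+print(np.isclose(q.imag, 0.0))         # q_T(v) is real for a Hermitian operator
+
+print(np.allclose(np.linalg.eigvals(A).imag, 0.0))   # all eigenvalues are real, so c_T(x) splits over R
+```
+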
+ +### Positive definite matrices An $n\times n$ matrix $\boldsymbol{A}\in\mathcal{M}_{n}(\mathbb{F})$ is *positive definite* if @@ -6506,6 +7083,7 @@ For a self-adjoint $n\times n$ matrix $\boldsymbol{A}\in\mathcal{M}_{n}(\mathbb{ 4. The determinants of leading principal minors of $\boldsymbol{A}$ are positive. **(Sylvester's criterion)** + # Metric vector space ## Matrices