# Introduction to Partial Differential Equations
---

## Chapter 1: Preliminaries (Calculus, Linear Algebra, ODEs, and Python)
---

## Want to use Colab? [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://githubtocolab.com/CU-Denver-MathStats-OER/Intro-PDEs-Theory-and-Computations/blob/main/Chp1/Chp1Sec5.ipynb)

---

## Prepping the environment for interactive plots in Colab
---

In [None]:
if 'google.colab' in str(get_ipython()):
    print('Running on CoLab - installing missing packages')
    !pip install ipympl
    from IPython.display import clear_output
    clear_output()
    exit()
else:
    print('Not running on CoLab - assuming environment has necessary packages')

In [None]:
%matplotlib widget
if 'google.colab' in str(get_ipython()):
    from google.colab import output
    output.enable_custom_widget_manager()

## Creative Commons License Information
---

<a rel="license" href="http://creativecommons.org/licenses/by-nc/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-nc/4.0/80x15.png" /></a><br /><span xmlns:dct="http://purl.org/dc/terms/" property="dct:title">Introduction to Partial Differential Equations: Theory and Computations</span> by <a xmlns:cc="http://creativecommons.org/ns#" href="https://github.com/CU-Denver-MathStats-OER/Intro-PDEs-Theory-and-Computations" property="cc:attributionName" rel="cc:attributionURL">Troy Butler</a> is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc/4.0/">Creative Commons Attribution-NonCommercial 4.0 International License</a>.<br />Based on a work at <a xmlns:dct="http://purl.org/dc/terms/" href="https://github.com/CU-Denver-MathStats-OER/Intro-PDEs-Theory-and-Computations" rel="dct:source">https://github.com/CU-Denver-MathStats-OER/Intro-PDEs-Theory-and-Computations</a>.

---
## Section 1.5: Linear Algebra and Differential Equations
---

As we come to the end of Chapter 1, we make some important connections between linear algebra and differential equations that serve us well throughout the rest of this course. Along the way, we review some useful linear algebra concepts and how to use the [`linalg` sublibrary of `numpy`](https://numpy.org/doc/stable/reference/routines.linalg.html) to perform some useful calculations. 

In [None]:
import numpy as np
from numpy.linalg import matrix_rank
import numpy.linalg as linalg  

In [None]:
import matplotlib.pyplot as plt
%matplotlib widget

In [None]:
from ipywidgets import interact, interactive, fixed, interact_manual
import ipywidgets as widgets

---
### Section 1.5.1: A high-level overview of linear algebra concepts
---

Note that we could allow the vectors and matrices discussed below to have complex components, but we are limiting ourselves to real spaces for now. 
The extension to complex spaces is straightforward in most cases since $\mathbb{C}$ is [isometrically isomorphic](https://en.wikipedia.org/wiki/Isometry) to $\mathbb{R}^2$ (i.e., they are "basically the same" and the more significant differences arise when considering behavior of functions on these spaces). 

The primary focus is on reviewing linear algebra properties related to the [Banach space](https://en.wikipedia.org/wiki/Banach_space) $\mathbb{R}^n$ (recall that a Banach space is a complete normed linear space). However, we will also discuss/present ideas in the context of some polynomial spaces of finite order (representing a finite-dimensional function space) and also make a few connections to useful perspectives we take when studying PDEs throughout this course. 

---
### Section 1.5.2: Linear Independence
--- 

A set of vectors $V=\{v_i\}_{i=1}^k\subset\mathbb{R}^n$ is **linearly independent** if the only linear combination that produces the $n$-dimensional zero vector is the trivial linear combination, i.e., for $\{c_i\}_{i=1}^k\subset\mathbb{R}$

$$
    \sum_{i=1}^k c_i v_i = 0\in\mathbb{R}^n \Longleftrightarrow c_i = 0 \ \forall 1\leq i\leq k.
$$

Otherwise, the set of vectors is **linearly dependent**.

Other important concepts related to a set of vectors $V$ are

  - The **span** of a set of vectors $V$ is a subspace of $\mathbb{R}^n$ defined by the set of all possible linear combinations of vectors in $V$, i.e.,
  <br><br>
  $$
      \operatorname{span} V := \left\{\sum_{i=1}^k c_i v_i \in\mathbb{R}^n \, : \, c_i\in\mathbb{R} \ \forall 1\leq i\leq k\right\}.
  $$
  <br><br>
  - A linearly independent spanning set for a subspace is called a **basis**. 

In [None]:
# The span of a single vector in $\mathbb{R}^2$ is a vector subspace conceptualized
# by a line through the origin pointing in both the positive and negative directions
# of the vector. Any nonzero vector that lies on this line can be used as a basis for
# this subspace.
def plot_span_vector_2d(v_x, v_y, xlim=[-5, 5], ylim=[-5, 5], fignum=0):
    plt.figure(fignum)
    plt.clf()
    
    # Define a range of constant multiples of the vector
    cs = np.linspace(-50, 50, 100)
    # Plot the span
    plt.plot(cs*v_x, cs*v_y)  
    # Plot the vector
    plt.arrow(0, 0, v_x, v_y, 
              head_width=0.25, width=0.1, color='r', alpha=1)
    
    # Various plotting properties
    plt.xlim(xlim)
    plt.ylim(ylim)
    plt.axvline(0, ls=':', c='k')
    plt.axhline(0, ls=':', c='k')
    plt.tight_layout()  # adjust the layout before displaying the figure
    plt.show()
    return

In [None]:
%reset -f out 

%matplotlib widget
interact_manual(plot_span_vector_2d, 
                v_x = widgets.BoundedFloatText(value=-2, 
                                               min=-2,
                                               max=2,
                                               step=0.1,
                                               description='x-direction of vector'),
                v_y = widgets.BoundedFloatText(value=1, 
                                               min=-2,
                                               max=2,
                                               step=0.1,
                                               description='y-direction of vector'),
                xlim = fixed([-5, 5]),
                ylim = fixed([-5, 5]),
                fignum = fixed(0)
               )

---
#### Theorem: Any collection of $n+1$ vectors in $\mathbb{R}^n$ forms a linearly dependent set.
---

**Proof:**
Let $e_i$ denote the standard basis vectors in $\mathbb{R}^n$ for $i=1,2,\ldots, n$. 

Let $V=\{v_1, v_2, \ldots, v_{n+1}\}$ denote a set of $n+1$ vectors from $\mathbb{R}^n$. Assume that this set is linearly independent.

For $v_1$, there exist constants $\alpha_i$, for $i=1,2,\ldots, n$, such that
$$
   \large v_1 = \sum_{i=1}^n \alpha_ie_i.
$$
Since $V$ is linearly independent, $v_1\neq 0$, so at least one $\alpha_i$ is nonzero. Assume without loss of generality that $\alpha_1\neq 0$. 
Then, 
$$
    \large e_1 = \frac{1}{\alpha_1}\left(v_1 - \sum_{i=2}^n \alpha_i e_i\right).
$$
This implies that $\{v_1, e_2, e_3, \ldots, e_n\}$ spans $\mathbb{R}^n$. 
Thus, for $v_2$, there exist (new) constants $\alpha_i$, for $i=1,2,\ldots, n$, such that
$$
    \large v_2 = \alpha_1 v_1 + \sum_{i=2}^n \alpha_i e_i.
$$
Since the set $V$ is linearly independent, $v_2$ is not a multiple of $v_1$, so we must have that at least one $\alpha_i\neq 0$ for an $i=2,3,\ldots, n$. 
Again, without loss of generality, assume that $\alpha_2 \neq 0$, and as before arrive at the conclusion that $\{v_1, v_2, e_3, e_4, \ldots, e_n\}$ is a spanning set for $\mathbb{R}^n$.

We can repeat this argument until we have replaced each $e_i$ with the associated $v_i$, so that $\{v_1,v_2,\ldots, v_n\}$ forms a spanning set for $\mathbb{R}^n$.
However, this immediately implies that $v_{n+1}$ is a linear combination of these vectors, which contradicts the assumption of linear independence. $\Box$
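
The theorem can also be observed numerically. The following is a minimal sketch, assuming $n=3$ and four randomly generated vectors (the names `V` and `c` are illustrative only). It uses the singular value decomposition to produce coefficients, not all zero, of a linear combination that gives the zero vector.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
# Columns of V are n+1 vectors in R^n (random, purely for illustration)
V = rng.standard_normal((n, n + 1))

# The rank can never exceed n, so the n+1 columns cannot be independent
rank = np.linalg.matrix_rank(V)

# V has at most n nonzero singular values, so the last right-singular
# vector lies in the null space of V. Its entries are coefficients c
# (not all zero) with c_1 v_1 + ... + c_{n+1} v_{n+1} = 0.
_, _, Vt = np.linalg.svd(V)
c = Vt[-1]

print('rank =', rank)
print('coefficients =', c)
print('combination  =', V @ c)  # zero up to round-off
```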

---
#### Spaces of polynomials are finite-dimensional vector spaces
---

Let $\mathcal{P}_n$ denote the space of all real-valued polynomials of order less than or equal to $n$, i.e., 

$$
    \mathcal{P}_n := \{ a_0 + a_1x + \cdots + a_n x^n \, : \, a_i\in\mathbb{R}, 0\leq i\leq n\}.
$$

A standard problem in linear algebra is to show that $\mathcal{P}_n$ is an $(n+1)$-dimensional vector space. 

Each polynomial function is then viewed as a vector in the space, and a useful basis is given by the set of $n+1$ monomials defined by $V = \{ 1, x, x^2, \ldots, x^n\}$ (note that $1 = x^0$). In other words, this is a set of linearly independent vectors whose span is $\mathcal{P}_n$. 

For a specific example, consider $\mathcal{P}_2$ and note that the three monomials $\{1, x, x^2\}$ suffice to describe every polynomial of order at most 2, and no smaller set does. 
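
One way to make the "polynomials are vectors" viewpoint concrete is to store each element of $\mathcal{P}_2$ as its coefficient vector $(a_0, a_1, a_2)$ with respect to the monomial basis. The sketch below (with arbitrarily chosen polynomials `p` and `q`) suggests that adding and scaling polynomials is the same as adding and scaling these coefficient vectors.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Coefficient vectors (a_0, a_1, a_2) with respect to the basis {1, x, x^2}
p = np.array([1.0, -2.0, 3.0])   # 1 - 2x + 3x^2
q = np.array([4.0, 0.0, -1.0])   # 4 - x^2

# Vector-space operations on polynomials act on the coefficients
r = 2.0 * p + q                  # 6 - 4x + 5x^2

# Compare against evaluating the polynomials pointwise
x = np.linspace(-1, 1, 5)
lhs = P.polyval(x, r)
rhs = 2.0 * P.polyval(x, p) + P.polyval(x, q)

print(r)
print(np.allclose(lhs, rhs))
```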

---
### Section 1.5.3: Solvability of a linear system
---

Suppose $A\in\mathbb{R}^{k\times n}$ and we are interested in the solvability of the linear system given by

$$
    Ax=b, 
$$

where $x\in\mathbb{R}^n$ and $b\in\mathbb{R}^k$. We refer to the vector $b$ as the "data" and $x$ as the "solution" to this system. 

Let $\operatorname{Cols}(A)$ denote the set of $k$-dimensional vectors defined by the columns of $A$, i.e., $\operatorname{Cols}(A) = \{ v_i\in\mathbb{R}^k\, : \, v_i = i\text{th column of }A\}$. 

The vector subspace of $\mathbb{R}^k$ defined by $\operatorname{span} \operatorname{Cols}(A)$ is called the column space of $A$. 

- In order for a solution to $Ax=b$ to exist, we need $b\in \operatorname{span} \operatorname{Cols}(A) $. In other words, $b$ must lie in the column space of $A$.

- $\operatorname{Cols}(A)$ may not form a basis for the space of admissible data (i.e., the data for which there exists a solution to the system) because the columns of $A$ may not be linearly independent. 

- The dimension of the column space is called the **rank** of the matrix.
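
The solvability condition can be tested numerically by comparing ranks: appending $b$ as an extra column leaves the rank unchanged exactly when $b$ is in the column space. The helper `is_solvable` below is a hypothetical name introduced only for this sketch, and the test is subject to the floating-point tolerance used by `matrix_rank`.

```python
import numpy as np
from numpy.linalg import matrix_rank

def is_solvable(A, b):
    """Return True if b is (numerically) in the column space of A."""
    augmented = np.column_stack([A, b])
    return matrix_rank(augmented) == matrix_rank(A)

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])   # rank 2
b_in = A[:, 0] + 2 * A[:, 1]    # a combination of columns of A
b_out = np.array([1, 0, 0])     # not in the column space of A

print(is_solvable(A, b_in))
print(is_solvable(A, b_out))
```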

In this course, we focus primarily on linear differential equations and use $L$ to denote the linear differential operator. The differential equation is then written as $Lu=f$ where $f$ is the "data" and $u$ is the solution we are after. Note the immediate symmetry between the notation $Ax=b$ and $Lu=f$. When we use finite difference approximations in place of derivatives, $Lu=f$ is often rewritten as $Av=b$ where $v$ is a finite-dimensional vector with components representing the approximation to $u$ at a particular grid point in the spatial domain. The solvability of $Lu=f$ is often directly tied to the solvability of $Av=b$.

Below, we give examples of square matrices defined in terms of vectors and explore the linear dependence or independence of these vectors by investigating the rank of the matrix.

In [None]:
# These are linearly dependent vectors used to construct a matrix
v1 = np.array([1, 2, 3])
v2 = np.array([4, 5, 6])
v3 = np.array([7, 8, 9])

A = np.array([v1, v2, v3]).transpose()  # transpose interchanges rows and columns
print(A)

In [None]:
matrix_rank(A)

In [None]:
# As we will see in Chapter 2, the following vectors are inspired by a 
# centered finite difference approximation used to discretize $-u''=f$
u1 = np.array([2, -1, 0])
u2 = np.array([-1, 2, -1])
u3 = np.array([0, -1, 2])

A = np.array([u1, u2, u3]).transpose()

print(A)

In [None]:
print(matrix_rank(A))

The above example hints at the fact that the way in which we discretize $-u''=f$ in this course should lead to a solvable linear system of equations because the subsequent matrix-vector equation defined by this discretization will involve a nonsingular matrix. We define these concepts below more carefully.

---
#### Polynomial interpolation
---

Consider $\mathcal{P}_1$, which is the space of all linear real-valued functions of the form $a_0+a_1x$. Recall that we can fit a line through any two points. Thus, if we are given points $\{(x_0, y_0), (x_1, y_1)\}$, we can seek the $a_0$ and $a_1$ defining the line that passes through these points by solving the linear system defined by

$$
    \begin{pmatrix}
        1 & x_0 \\
        1 & x_1 \\
    \end{pmatrix}
    \begin{pmatrix}
        a_0 \\
        a_1
    \end{pmatrix} 
    =
    \begin{pmatrix}
        y_0 \\
        y_1
    \end{pmatrix}.
$$

Note that as long as $x_0\neq x_1$, the matrix above will always have rank 2 because the two columns will always be linearly independent. Moreover, the column space will always be $\mathbb{R}^2$ as long as $x_0\neq x_1$, which implies we can always find a solution to the system. This is consistent with our recollection that we can fit a line through any two points. 

What if $x_0=x_1$? 

- The matrix will have rank 1 because the two columns are now linearly dependent (the second column will just be equal to $x_0$ times the first column in this case). One of two things will happen as a result. 

  - If $y_0=y_1$, then there are an infinite number of solutions because there are an infinite number of linear functions that pass through the point $(x_0,y_0)$. 
  
  - However, if $y_0\neq y_1$, then the data vector is no longer in the column space of the matrix, so there is *no* solution. While you may object to this since you can envision a *vertical* line through the points $(x_0,y_0)$ and $(x_0, y_1)$, recall that a vertical line is *not* the graph of a function (it fails the aptly named "vertical line test") and does not belong to the space $\mathcal{P}_1$.
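
These cases can be explored with a few lines of code. The sketch below uses made-up points, and `line_matrix` is a hypothetical helper. It computes the rank of the $2\times 2$ matrix for distinct and repeated $x$-coordinates, then the rank of the matrix augmented with the data vector in the two repeated-coordinate sub-cases.

```python
import numpy as np
from numpy.linalg import matrix_rank

def line_matrix(x0, x1):
    """Matrix of the 2x2 system for fitting a_0 + a_1*x through two points."""
    return np.array([[1.0, x0], [1.0, x1]])

# Distinct x-coordinates: rank 2 and a unique line through (0,1) and (2,5)
A = line_matrix(0.0, 2.0)
a0, a1 = np.linalg.solve(A, np.array([1.0, 5.0]))
print(matrix_rank(A), a0, a1)

# Repeated x-coordinate: the rank drops to 1
A_rep = line_matrix(2.0, 2.0)
print(matrix_rank(A_rep))

# Appending the data keeps rank 1 if y0 = y1 (solutions exist) and
# raises it to 2 if y0 != y1 (no solution)
same_y = np.column_stack([A_rep, np.array([3.0, 3.0])])
diff_y = np.column_stack([A_rep, np.array([3.0, 4.0])])
print(matrix_rank(same_y), matrix_rank(diff_y))
```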
  
We can of course generalize the above ideas to $\mathcal{P}_n$ where any set of $n+1$ points in $\mathbb{R}^2$ with distinct $x$-coordinates can be used to uniquely define a polynomial of order at most $n$ that interpolates (i.e., "passes through") this data (notice how the $y$-coordinates of these points do in fact define the so-called "data" vector). It is important to note that if $n\geq 2$, it is possible that the function in $\mathcal{P}_n$ that interpolates a set of $n+1$ data points is given by a polynomial of order $k<n$. For example, suppose $n=10$ but the $11$ points we are given all lie on a *horizontal* line. Then $a_1=a_2=\cdots=a_{10}=0$ and only $a_0$ has the potential to be non-zero (it will be equal to the common $y$-value of the data points, which must all agree if the data do in fact lie on a horizontal line). 

Why bring this up? Well, certain types of data that are "restricted" (in a sense) often lead to solutions that are themselves restricted to a particular smaller-dimensional subspace of the vector space defining all possible solutions to the problem. This matters in PDEs where the problem $Lu=f$ may be investigated for a class of data $f$ parameterized in some way (e.g., to model typically expected variations in source terms one will encounter in a laboratory or field setting). 

One final note: What if we are given more than $n+1$ data points for determining a function in $\mathcal{P}_n$ that interpolates this data? Again assuming that the $x$-coordinates are all distinct, there are two cases. Either no solution exists (there is no interpolant), or a unique solution exists, in which case more data are given than is necessary (the problem is solvable but over-determined). The rank of the matrix we form cannot grow beyond $n+1$ because there are only $n+1$ columns no matter how many data points we consider. However, we often turn to *regression* to find a polynomial of *best fit* defined by minimizing the sum of squared errors when this occurs (assuming, of course, that the data are now "noisy" and shouldn't be interpolated at all, in which case we often prefer many more than $n+1$ data points). 
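
As a rough illustration of the regression idea, the sketch below fits a polynomial in $\mathcal{P}_1$ to 20 synthetic noisy samples of the line $y = 1 + 2x$. The data, the noise level, and the random seed are all assumptions made for this example. It uses `np.vander` to build the matrix with columns $1, x, \ldots, x^n$ and `np.linalg.lstsq` to minimize the sum of squared errors.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical noisy samples of the line y = 1 + 2x
x = np.linspace(-5, 5, 20)
y = 1.0 + 2.0 * x + 0.1 * rng.standard_normal(x.size)

n = 1
# Columns are 1, x, ..., x^n, so A has 20 rows but only n+1 columns
A = np.vander(x, n + 1, increasing=True)

# Least-squares solution; lstsq also reports the rank of A
coeffs, residual, rank, _ = np.linalg.lstsq(A, y, rcond=None)

print('rank =', rank)            # cannot exceed n+1
print('coefficients =', coeffs)  # should be close to (1, 2)
```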

In [None]:
import sympy as sym

In [None]:
class polynomial_interpolant(object):
    def __init__(self, data):
        '''
        Initialize object for constructing a polynomial interpolant through 
        the given array of data points.
        
        Parameters
        ----------
        data : numpy array of shape (n+1, 2)
            The (x,y) points where each of the points defines a row in the array
        '''
        self.data = data
        self.n = len(data)-1
        self.construct_poly()
        
    def order(self):
        print('Polynomial order is ', self.n)
    
    def compute_poly_coeffs(self):
        self.A = np.zeros((self.n+1, self.n+1))
        self.A[:,0] = np.ones(self.n+1)
        for i in range(1, self.n+1):
            self.A[:,i] = self.data[:,0]**i
        self.poly_coeffs = np.linalg.solve(self.A, self.data[:,1])
        
    def construct_poly(self):
        self.compute_poly_coeffs()
        x = sym.symbols('x')
        self.poly = 0
        for i in range(self.n+1):
            self.poly += self.poly_coeffs[i]*x**i

In [None]:
poly1 = polynomial_interpolant(np.random.uniform(low=-5, high=5, size=(3,2)))
poly1.poly

In [None]:
p = sym.lambdify(sym.symbols('x'), poly1.poly)

In [None]:
%matplotlib widget
plt.figure(1)

x = np.linspace(-5, 5, 100)
plt.plot(x, p(x))

plt.scatter(poly1.data[:,0], poly1.data[:,1], marker='s', color='r', s=100)

In [None]:
poly2 = polynomial_interpolant(np.random.uniform(low=-5, high=5, size=(5,2)))
poly2.poly

In [None]:
p = sym.lambdify(sym.symbols('x'), poly2.poly)

In [None]:
%matplotlib widget
plt.figure(2)

x = np.linspace(-5, 5, 100)
plt.plot(x, p(x))

plt.scatter(poly2.data[:,0], poly2.data[:,1], marker='s', color='r', s=100)

---
### Section 1.5.4: Singular and Nonsingular Matrices
---

We only use the terms singular and nonsingular matrices to refer to square matrices $A\in\mathbb{R}^{n\times n}$. 

If matrix $A$ has a multiplicative inverse $A^{-1}$ so that $A^{-1}A= I$ where $I$ is the $n\times n$ identity matrix, then $A$ is said to be nonsingular. 
Otherwise, we say that $A$ is singular.

   - If $A$ is nonsingular, then $Ax=b$ can be solved for all $b\in\mathbb{R}^n$ and the solution is unique.

   - If $A$ is singular, then $Ax=b$ either has no solution or an infinite number of solutions. It will have an infinite number of solutions if $b$ is in the span of the columns of $A$.
   
   - The **rank** of a matrix is the number of linearly independent columns (or rows) in the matrix (and the column and row ranks are exactly the same, which is why we just say rank in general). The rank of a nonsingular $n\times n$ matrix is $n$.
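
To see why a singular system with admissible data has infinitely many solutions, note that adding any multiple of a null-space vector to one solution gives another solution. The following sketch uses a $3\times 3$ singular matrix; the particular solution and the null-space vector were worked out by hand for this matrix and are not produced by a library routine.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])   # singular (rank 2)
b = A[:, 0]                                        # lies in the column space

x_particular = np.array([1, 0, 0])   # picks out the first column of A
null_vec = np.array([1, -2, 1])      # satisfies A @ null_vec = 0

# Every x_particular + t * null_vec solves A x = b
for t in [-1.0, 0.0, 2.5]:
    x = x_particular + t * null_vec
    print(t, np.allclose(A @ x, b))
```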

In [None]:
A1 = np.array([[2, -1, 0], [-1, 2, -1], [0, -1, 2]])
A2 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print('A1 = \n', A1)
print('\nThe rank of A1 is ', matrix_rank(A1))
print('\nThe inverse of A1 is \n\n', np.linalg.inv(A1))

print('\nA2 = \n', A2)
print('\nThe rank of A2 is ', matrix_rank(A2))
print('\nThe inverse of A2 is (uh-oh)\n\n', np.linalg.inv(A2))

In [None]:
b1 = np.array([1, 2, 1])

x1 = np.linalg.solve(A1, b1)
print(x1)

# check solution, should be equal to b1
print(np.dot(A1, x1))

In [None]:
b2 = np.array([1, 4, 7])  # This is the first column of A2, so a solution should exist

x2 = np.linalg.solve(A2, b2)  # But A2 is singular: this may raise an error or return an unreliable result
print(x2)

In [None]:
x2 = np.linalg.lstsq(A2, b2, rcond=None)[0]  # What if we try to find the vector of "best fit" with least squares?
print(x2)

# check solution, should be equal to b2
print(np.dot(A2, x2))

---
### Section 1.5.5: Eigenvalues and Eigenvectors
---

This again only makes sense if we are discussing square matrices $A\in\mathbb{R}^{n\times n}$ (the eigenvalues and eigenvectors may in fact be complex valued even if $A$ is not). We say that $\lambda$ is an eigenvalue of $A$ if there exists a nonzero vector $x$ such that 
$$
\large Ax=\lambda x.
$$

- The span of all eigenvectors associated with an eigenvalue is called the ***eigenspace***. 

- Using eigenvectors as a basis for the space on which a linear operator acts often makes problems involving that operator very straightforward to solve. In particular, if an operator is self-adjoint (for real-valued matrices we usually say symmetric, which just means $A=A^\top$), then all the eigenvalues are real, eigenvectors associated with distinct eigenvalues are orthogonal, and the eigenvectors can be chosen to form an orthogonal basis. In this case, it is very easy to solve problems involving the linear operator using the eigenvectors as a basis for the solution space. 
 
When studying a linear differential equation written as $Lu=f$, we often step back and look for eigenvalues and eigenfunctions (remember functions are types of vectors) of the operator $L$, i.e.,  $Lu=\lambda u$, where $u$ also is required to satisfy any potential boundary conditions. It turns out that the eigenfunctions we encounter in this class correspond to solving $Lu=f$ via Fourier series.
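
The following sketch illustrates how an orthonormal eigenvector basis can be used to solve $Ax=b$ when $A$ is symmetric. It uses `np.linalg.eigh`, which is intended for symmetric (Hermitian) matrices. The matrix and right-hand side are chosen for illustration, and this is not meant as a recommended general-purpose solver.

```python
import numpy as np

A = np.array([[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]])
b = np.array([1.0, 2.0, 1.0])

# eigh returns real eigenvalues and orthonormal eigenvectors
# stored as the columns of Q
eig_vals, Q = np.linalg.eigh(A)
print(np.allclose(Q.T @ Q, np.eye(3)))

# Expand b in the eigenbasis, divide each coefficient by its eigenvalue,
# and reassemble: x = sum_j ((b, q_j) / lambda_j) q_j
coeffs = Q.T @ b
x = Q @ (coeffs / eig_vals)

print(x)
print(np.allclose(A @ x, b))
```

Because the eigenvectors are orthonormal, no linear system has to be solved to find the expansion coefficients of $b$; inner products suffice.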

In [None]:
eig_vals1, eig_vecs1 = np.linalg.eig(A1)
eig_vals2, eig_vecs2 = np.linalg.eig(A2)

print('Eig. values of A1, \n', eig_vals1)
print('\nEig. vectors of A1 arranged as a matrix, \n', eig_vecs1)

print('\nEig. values of A2, \n', eig_vals2)
print('\nEig. vectors of A2 arranged as a matrix, \n', eig_vecs2)

In [None]:
print(np.dot(A1, eig_vecs1))

print()

print(np.dot(eig_vecs1, np.diag(eig_vals1)))

---
#### Theorem: Let $(\lambda, x)$ be an eigenvalue/eigenvector pair associated with a matrix $A\in\mathbb{R}^{n\times n}$ and $\alpha\in\mathbb{R}$, then $\gamma=1+\alpha\lambda$ and $x$ defines an eigenvalue/eigenvector pair for the matrix $B=I+\alpha A$. 
---

**Proof:**
$$
\begin{align}
    Bx  = & (I+\alpha A)x \\
        = & Ix + \alpha Ax \\
        = & x + \alpha \lambda x \\
        = & (1+\alpha\lambda)x \\
        = & \gamma x. \ \Box
\end{align}
$$

---
#### Corollary: Let $(\lambda, x)$ be an eigenvalue/eigenvector pair associated with a matrix $A\in\mathbb{R}^{n\times n}$ and $\alpha\in\mathbb{R}$, then $\gamma = \alpha + \lambda$ and $x$ defines an eigenvalue/eigenvector pair for the matrix $B=\alpha I + A$.
---

The proof is a straightforward modification of the prior argument. 
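
Both results are easy to check numerically. The sketch below uses an arbitrarily chosen symmetric matrix and $\alpha = 0.5$, and verifies that the eigenvectors of $A$ are also eigenvectors of $I+\alpha A$ and $\alpha I + A$ with the shifted eigenvalues.

```python
import numpy as np

A = np.array([[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]])
alpha = 0.5
I = np.eye(3)

lam, X = np.linalg.eig(A)     # columns of X are eigenvectors of A

# Theorem: B = I + alpha*A has eigenpairs (1 + alpha*lambda, x)
B = I + alpha * A
print(np.allclose(B @ X, X @ np.diag(1 + alpha * lam)))

# Corollary: C = alpha*I + A has eigenpairs (alpha + lambda, x)
C = alpha * I + A
print(np.allclose(C @ X, X @ np.diag(alpha + lam)))
```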

---
### Section 1.5.6: Euclidean Inner Product and the Associated Norm
---

We write the Euclidean inner product of two vectors $x,y\in\mathbb{R}^n$ as

$$
\large (x,y) = \sum_{j=1}^n x_jy_j,
$$

and the associated (Euclidean) norm is defined by 

$$
 \large   \|x\| = (x,x)^{1/2}
$$

While all norms are equivalent on a finite-dimensional vector space, the Euclidean norm (also called the 2-norm) is the only one of the standard $p$-norms that is *induced* by an inner product. An inner product structure is very useful to have on a vector space because it provides two things: a natural way to define a norm and a geometric notion of "angles" between vectors. In other words, we can define and investigate orthogonality (i.e., whether or not two vectors are "perpendicular" to each other).

- Two vectors $x$ and $y$ are ***orthogonal*** if $(x,y)=0$.

- A set of vectors $V=\{v_1, v_2, \ldots, v_k\}$ is an ***orthogonal set*** if $(v_i,v_j)=0$ for all $i\neq j$. If, in addition, $||v_i||=1$ for all $i=1,2,\ldots, k$, then the set is called ***orthonormal***.

Of all the properties that a norm satisfies, perhaps the two most important are the triangle inequality
<br><br>
$$
\large \|x+y\| \leq \|x\| + \|y\|
$$
<br><br>
and the Cauchy-Schwarz inequality for any inner-product induced norm
<br><br>
$$
\large |(x,y)|\leq \|x\|\cdot\|y\|.
$$

Note that the Cauchy-Schwarz inequality only holds for inner-product induced norms (i.e., on an inner product space where we use the inner-product induced norm), but the triangle inequality holds on any normed linear vector space. 
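
The geometric notion of an angle can be sketched in code through the relation $\cos\theta = (x,y)/(\|x\|\,\|y\|)$ for nonzero vectors. The function name `angle_between` and the sample vectors are assumptions made for this illustration.

```python
import numpy as np

def angle_between(x, y):
    """Angle in radians between nonzero vectors x and y."""
    cos_theta = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
    # Cauchy-Schwarz guarantees |cos_theta| <= 1; clip guards against round-off
    return np.arccos(np.clip(cos_theta, -1.0, 1.0))

x = np.array([1.0, 0.0])
y = np.array([1.0, 1.0])
z = np.array([0.0, 3.0])

print(np.degrees(angle_between(x, y)))   # x and y meet at 45 degrees
print(np.degrees(angle_between(x, z)))   # x and z are orthogonal
```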

In [None]:
print(u1)
print(u2)
print(np.dot(u1, u2))

In [None]:
print(u1)
print(u3)
print(np.dot(u1, u3))

In [None]:
print(u2)
print(u3)
print(np.dot(u2, u3))

In [None]:
print(v1)
print(v3)
print(np.dot(v1, v3))

In [None]:
print(np.linalg.norm(u1))

print(np.linalg.norm(u2))

print(np.linalg.norm(v1))

print(np.linalg.norm(v3))

In [None]:
# Check Cauchy-Schwarz inequality: |(x,y)| <= ||x||*||y||
print(abs(np.dot(u1, u3)))
print(np.linalg.norm(u1)*np.linalg.norm(u3))

In [None]:
print(np.dot(v1, v3))
print(np.linalg.norm(v1)*np.linalg.norm(v3))

---
#### Theorem: The Pythagorean theorem can be generalized to inner product spaces. 
---

**Proof:**

Let $x,y\in\mathbb{R}^n$ be orthogonal. Then,
\begin{eqnarray}
   ||x+y||^2 &=& (x+y,x+y) \\
              &=& (x,x) + \underbrace{(x,y)}_{=0} + \underbrace{(y,x)}_{=0} + (y,y) \\
              &=& (x,x) + (y,y) \\
              &=& ||x||^2 + ||y||^2 \ \Box
\end{eqnarray}
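
A quick numerical check of this identity, using a pair of orthogonal vectors chosen by hand for illustration:

```python
import numpy as np

x = np.array([1.0, 2.0, 0.0])
y = np.array([-2.0, 1.0, 3.0])

# The inner product vanishes, so x and y are orthogonal
print(np.dot(x, y))

# Both sides of the generalized Pythagorean theorem
lhs = np.linalg.norm(x + y)**2
rhs = np.linalg.norm(x)**2 + np.linalg.norm(y)**2
print(lhs, rhs)
```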

---
#### Theorem: A set of orthonormal vectors forms a linearly independent set. 
---

**Proof:**

Let $V=\{v_1,v_2,\ldots, v_k\}\subset\mathbb{R}^n$ be a set of orthonormal vectors and let $\alpha_i$ for $1\leq i\leq k$ be any constants such that
$$
    \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_k v_k = 0.
$$
Consider any $1\leq j\leq k$ and take the inner product of both sides of the above equation with $v_j$. 
The orthonormality of the vectors (i.e., $(v_i,v_j)=0$ for $i\neq j$ and $(v_j,v_j)=1$) implies
$$
    \alpha_j = 0.
$$
Since the $j\in\{1,2,\ldots, k\}$ was arbitrary, this shows that the only linear combination of the vectors in $V$ that produces the zero vector is in fact the trivial linear combination. 
Thus, the vectors $V$ are linearly independent. $\Box$

---
#### Theorem: An orthonormal set of $n$ vectors in $\mathbb{R}^n$ is a basis for $\mathbb{R}^n$ and the (unique) coefficients we use to write a vector $z\in\mathbb{R}^n$ as a linear combination of this orthonormal set are easily determined via inner products. 
---

**Proof:**

Let $Y=\{y_1,y_2,\ldots, y_n\}\subset\mathbb{R}^n$ be an orthonormal set.
Then, by the previous theorem, this is a linearly independent set of $n$ vectors.
Thus, it is a basis for $\mathbb{R}^n$ by a standard result in linear algebra.
Finally, for any $z\in\mathbb{R}^n$, there exist constants $\{c_1,c_2,\ldots, c_n\}\subset\mathbb{R}$ such that
$$
    z = \sum_{j=1}^n c_jy_j.
$$
These constants can easily be determined by taking the inner product of both sides of the equation with $y_k$ for $1\leq k\leq n$ and exploiting the orthonormality of the set to reveal that
$$
    (z, y_k) = c_k \ \forall \ 1\leq k\leq n. \ \Box
$$
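
The coefficient formula $c_k = (z, y_k)$ can be tried out numerically. In the sketch below, an orthonormal basis is generated from the QR factorization of a random matrix. This is one convenient way to obtain such a basis and is an assumption of the example, not something required by the theorem.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
# QR factorization of a random square matrix; the columns of Y are orthonormal
Y, _ = np.linalg.qr(rng.standard_normal((n, n)))

z = rng.standard_normal(n)

# c_k = (z, y_k) for each column y_k of Y
c = Y.T @ z

# Reassemble z from the coefficients: z = sum_k c_k y_k
z_rebuilt = Y @ c

print(np.allclose(Y.T @ Y, np.eye(n)))
print(np.allclose(z_rebuilt, z))
```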

---
### Section 1.5.7: Positive Definite Matrices
---

If $A\in\mathbb{R}^{n\times n}$ is symmetric and $x^\top Ax>0$ for all nonzero $x\in\mathbb{R}^n$, then $A$ is said to be ***positive definite***. We often write SPD to denote a symmetric positive definite matrix. 
If $A$ is symmetric and $x^\top Ax\geq 0$ for all $x\in\mathbb{R}^n$, then $A$ is said to be ***positive semidefinite***.

- A SPD matrix is nonsingular.

- A symmetric matrix has only real eigenvalues, and it is positive definite if and only if all of its eigenvalues are strictly positive. If all of the eigenvalues are nonnegative (some may be zero), then the symmetric matrix is positive semidefinite. 
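
The eigenvalue characterization suggests a simple numerical test. The function `is_spd` and its tolerance `tol` are hypothetical choices for this sketch: it first checks symmetry and then checks that every eigenvalue returned by `np.linalg.eigvalsh` is strictly positive.

```python
import numpy as np

def is_spd(A, tol=1e-12):
    """Check symmetry, then test that every eigenvalue exceeds tol."""
    if not np.allclose(A, A.T):
        return False
    return bool(np.all(np.linalg.eigvalsh(A) > tol))

A1 = np.array([[2, -1, 0], [-1, 2, -1], [0, -1, 2]])   # symmetric
A2 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])       # not symmetric
S = np.array([[1, 1], [1, 1]])     # symmetric with eigenvalues 0 and 2

print(is_spd(A1), is_spd(A2), is_spd(S))
```

The matrix `S` is only positive semidefinite, so the strict test rejects it.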

In [None]:
# A simple way to check symmetry
print(A1 - np.transpose(A1))
print(A2 - A2.T)

---
#### Theorem: The sum of SPD matrices is also SPD.
---

**Proof:**

Suppose $A,B\in\mathbb{R}^{n\times n}$ are SPD and let $C=A+B$.
Then, 

$$
    C^\top = (A+B)^\top = A^\top + B^\top = A+B = C,
$$

so $C$ is symmetric.
Let $x\in\mathbb{R}^n$ be nonzero.
Then,

$$
    x^\top Cx = x^\top(A+B)x = x^\top Ax + x^\top Bx > 0,
$$

so $C$ is positive definite. $\Box$

---
#### Theorem: Let $A\in\mathbb{R}^{n\times n}$ be nonsingular and $B=A^\top A$, then $B$ is SPD.
---

**Proof:**

Let $A\in\mathbb{R}^{n\times n}$ be nonsingular and $B=A^\top A$. Then,

$$
    B^\top = (A^\top A)^\top = A^\top (A^\top)^\top = A^\top A = B,
$$

so $B$ is symmetric.
Let $x\in\mathbb{R}^n$ be nonzero, then

$$
    x^\top B x = x^\top A^\top A x = (Ax, Ax) = ||Ax||^2.
$$

Since $A$ is nonsingular and $x\neq 0$, then $Ax\neq 0$, so $||Ax||^2>0$. 
Thus, 

$$
    x^\top B x > 0, 
$$

which implies $B$ is positive definite. $\Box$
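
The last two theorems can be checked numerically for a specific nonsingular matrix. In this sketch the matrix is an arbitrary choice that happens to be SPD itself, which lets the same example also illustrate the sum of two SPD matrices.

```python
import numpy as np

rng = np.random.default_rng(3)
A = np.array([[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]])   # nonsingular
B = A.T @ A

# B is symmetric with strictly positive eigenvalues
print(np.allclose(B, B.T))
print(np.linalg.eigvalsh(B))

# The quadratic form equals ||Ax||^2 for any x
x = rng.standard_normal(3)
print(np.isclose(x @ B @ x, np.linalg.norm(A @ x)**2))

# The sum of two SPD matrices (A is SPD here) is also SPD
C = A + B
print(np.linalg.eigvalsh(C))
```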

---
## Navigation:

- [Previous](https://github.com/CU-Denver-MathStats-OER/Intro-PDEs-Theory-and-Computations/blob/main/Chp1/Chp1Sec4.ipynb)

- [Next](https://github.com/CU-Denver-MathStats-OER/Intro-PDEs-Theory-and-Computations/blob/main/Chp1/Chp1Sec6.ipynb)
---