# **Higher Order Linear Equations**

---

### **Introduction**

This notebook introduces concepts and techniques useful for studying higher order equations. 

---

### **Author**
**Junichi Koganemaru**  

---

### **Last Updated**
**January 28, 2025**

## Introduction

In this notebook we aim to study general $n$-th order linear differential equations of the form

$$
a_n(t) y^{(n)}(t) + a_{n-1}(t) y^{(n-1)}(t) + \cdots + a_1(t) y'(t) + a_0(t) y(t) = f(t), \; t \in \mathbb{R}
$$
where $a_0, ..., a_n, f: \mathbb{R} \to \mathbb{R}$ are assumed to be continuous. 

The focus for us will be the case when $n = 2$, as many physical systems are modeled via second order differential equations. However, we will see that the theory for higher order equations is completely analogous.

First, we go over an example to illustrate why the concepts we introduce in the next section are useful.

> **Example:**
> Consider the second order linear differential equation
> $$
> y''(t) + 2y'(t) + 2y(t) = f(t), \; t \in \mathbb{R},
> $$
> where $f$ is assumed to be continuous. If $y: \mathbb{R} \to \mathbb{R}$ is a solution to the equation above, consider the **vector-valued** function $\boldsymbol{Y}: \mathbb{R} \to \mathbb{R}^2$ defined via
> $$
> \boldsymbol{Y}(t) = \begin{pmatrix} y(t) \\ y'(t) \end{pmatrix}.
> $$
> Here $\boldsymbol{Y}$ takes in a value $t \in \mathbb{R}$ and gives back a **column vector** in $\mathbb{R}^2$. We will carefully introduce operations that can be performed on vector-valued functions later; for now we just need to know how to differentiate them. The derivative of a vector-valued function is defined **component-wise**, i.e. 
> $$
> \boldsymbol{Y}'(t) = \begin{pmatrix} \frac{d}{dt} [y(t)] \\ \frac{d}{dt} [y'(t)] \end{pmatrix} = \begin{pmatrix} y'(t) \\ y''(t) \end{pmatrix}, \; t \in \mathbb{R}.
> $$
> Note that since $y$ is a solution to the differential equation, we have that $y''(t) = -2y'(t) - 2y(t) + f(t)$. Therefore, we have 
> $$
> \boldsymbol{Y}'(t) = \begin{pmatrix} y'(t) \\ y''(t) \end{pmatrix} = \begin{pmatrix} y'(t) \\ -2y'(t) - 2y(t) + f(t) \end{pmatrix} =  \begin{pmatrix} 0 \cdot y(t) + 1 \cdot y'(t) \\ - 2 \cdot y(t) -2 \cdot y'(t)  \end{pmatrix} + \begin{pmatrix} 0 \\ f(t) \end{pmatrix}.
> $$
> Note that the first term on the right hand side only depends on the entries of $\boldsymbol{Y}$. Using some notation that we will go over later, we can write the above equation as
> $$
> \boldsymbol{Y}'(t) =  \begin{pmatrix}  0 & 1 \\ -2 & -2 \end{pmatrix} \begin{pmatrix} y(t) \\ y'(t) \end{pmatrix} + \begin{pmatrix} 0 \\ f(t) \end{pmatrix},
> $$
> where 
> $$ 
> A = \begin{pmatrix}  0 & 1 \\ -2 & -2 \end{pmatrix}
> $$
> is a **matrix** recording the coefficients of the components of $\boldsymbol{Y}$ on the right hand side of the equation. We thus obtain a **first order vector-valued** equation  
> $$
> \boldsymbol{Y}'(t) = A \boldsymbol{Y}(t) + \boldsymbol{F}(t) \Longleftrightarrow \boldsymbol{Y}'(t) - A  \boldsymbol{Y}(t) =\boldsymbol{F}(t), \; t \in \mathbb{R},
> $$
> where $\boldsymbol{F}: \mathbb{R} \to \mathbb{R}^2$ is defined via $\boldsymbol{F}(t) = \begin{pmatrix} 0 \\ f(t) \end{pmatrix}$. This is a **first order linear differential equation** for the vector-valued function $\boldsymbol{Y}$.

The upshot of this example is that we can rewrite a second order scalar differential equation as a first order vector-valued differential equation. 

In fact, this can be done for general $n$-th order linear equations as well, which means that we can always trade the order of the differential equation for the number of components of the vector-valued function $\boldsymbol{Y}$. 
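
The rewriting above can be sketched in code. The following is a minimal illustration, assuming a constant coefficient equation with leading coefficient 1; the helper names `companion_matrix` and `mat_vec`, and the test solution $y(t) = e^{-t}\cos(t)$ of the homogeneous equation $y'' + 2y' + 2y = 0$, are choices made for this sketch and do not appear in the notes.

```python
import math

def companion_matrix(coeffs):
    """Matrix A for y^(n) + a_{n-1} y^(n-1) + ... + a_0 y = f, given [a_0, ..., a_{n-1}].

    Hypothetical helper: assumes constant coefficients and leading coefficient 1.
    """
    n = len(coeffs)
    # Rows 1..n-1 encode (y, y', ...)' = (y', y'', ...)
    rows = [[1.0 if j == i + 1 else 0.0 for j in range(n)] for i in range(n - 1)]
    # The last row solves the equation for the highest derivative
    rows.append([-float(a) for a in coeffs])
    return rows

def mat_vec(A, v):
    """Matrix-vector product for lists of lists."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

A = companion_matrix([2, 2])  # y'' + 2y' + 2y = 0, so a_0 = 2, a_1 = 2

def Y(t):
    # Y(t) = (y(t), y'(t)) for the assumed solution y(t) = e^{-t} cos(t)
    y = math.exp(-t) * math.cos(t)
    dy = -math.exp(-t) * (math.cos(t) + math.sin(t))
    return [y, dy]

def Y_prime(t):
    # Component-wise derivative (y'(t), y''(t))
    dy = -math.exp(-t) * (math.cos(t) + math.sin(t))
    d2y = 2 * math.exp(-t) * math.sin(t)
    return [dy, d2y]

# Largest difference between Y'(t) and A Y(t) over a few sample points
residual = max(
    abs(lhs - rhs)
    for t in [0.0, 0.5, 1.0, 2.0]
    for lhs, rhs in zip(Y_prime(t), mat_vec(A, Y(t)))
)
print(A)
print(residual)
```

The matrix produced for this equation matches the matrix $A$ from the example, and the residual is zero up to floating point rounding.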

While this is extremely powerful, as it shows that any $n$-th order scalar equation can be studied as a first order vector-valued equation, it requires one to be comfortable with the language of linear algebra. We will take this approach later as it gives us a unified framework for studying linear equations.

For now, we will only study second order equations as scalar equations. However, the example shows that the theory of $n$-th order linear equations is closely connected to concepts from linear algebra. As such, we need to introduce some preliminary notions below. 

## Preliminaries 

### Linear Independence
Linear independence is a notion from linear algebra that encodes the "dependencies" among a collection of objects.

It is then useful in this discussion to borrow some terminology from linear algebra.

> **Definition (Linear combination of functions)**
> Given $n$ continuous functions $y_1, \ldots, y_n: I \to \mathbb{R}$ on an interval $I$, a linear combination of these functions is another continuous function $y: I \to \mathbb{R}$ which can be written as 
> $$
> y(t) = c_1 y_1(t) + c_2 y_2(t) + \ldots + c_n y_n(t), \; t \in I
> $$
> where $c_1, \ldots, c_n \in \mathbb{R}$ are constants.

To motivate our discussion, let's first consider an example.

> **Example**
> Consider the functions $y_1, y_2, y_3: \mathbb{R} \to \mathbb{R}$ defined via $y_1(t) = t, y_2(t) = t+1, y_3(t) = 2t+1$ for $t \in \mathbb{R}$, and suppose that $y$ is a linear combination of $y_1, y_2, y_3$: 
> $$
> y(t) = c_1 y_1(t) + c_2 y_2(t) + c_3 y_3(t), \; t \in \mathbb{R} 
> $$
> for some constants $c_1, c_2, c_3$. Notice that here we have 
> $$
> y_1 + y_2 = y_3,
> $$
> so in the equation above, we can replace $y_3$ and write 
> $$
> y(t) = c_1 y_1(t) + c_2 y_2(t) + c_3(y_1(t) + y_2(t)) = C_1 y_1(t) + C_2 y_2(t), \; t \in \mathbb{R}
> $$
> where $C_1 = c_1 + c_3, C_2 = c_2 + c_3$.
   

  What this shows us is that because $y_3$ is "dependent" on $y_1$ and $y_2$, even though in the definition of $y$ there seem to be three "building blocks", in reality only two "building blocks" are required to create $y$. In other words, any linear combination of $y_1, y_2, y_3$ can always be written as a linear combination of $y_1, y_2$.
  
  In this sense, the appearance of $y_3$ is redundant.

We will see later that the general solution to a homogeneous linear differential equation is given as a linear combination of functions. What we want to do is find a way to capture the notion of "non-redundancy". This is where the notion of **linear independence** comes in. First, let us give an intuitive definition of what it means for a set of functions to be linearly independent. 

> **Definition (Linear independence I)**
> A set of continuous functions defined over an interval $I$ is said to be *linearly independent* if none of the functions can be written as the linear combination of the other functions in the set, on the interval $I$.

In the previous example, the set $\{y_1, y_2, y_3\}$ is not linearly independent over $\mathbb{R}$ because $y_3$ is a linear combination of $y_1$ and $y_2$. 


The previous definition, while conceptually clear, is hard to apply in practice. Instead we will often use the following alternative definition.

> **Definition (Linear independence II)**
> A set of continuous functions $y_1, \ldots, y_n: I \to \mathbb{R}$ is said to be linearly independent on an interval $I$ if 
> $$
> c_1 y_1(t) + \ldots + c_n y_n(t) = 0 \; \text{for all} \; t \in I
> $$
> implies $c_1 = c_2 = \ldots = c_n = 0$.


Let's try to see why the two definitions are equivalent to each other. Suppose there exist constants $c_1, \ldots, c_n$, not all zero, such that 
$$
c_1 y_1(t) + \ldots + c_n y_n(t)= 0, \; t \in I.
$$
Up to relabeling of the indices, we can suppose without loss of generality that $c_1 \neq 0$. Then we can immediately write 
$$
y_1(t) = -\frac{1}{c_1} \left( c_2 y_2(t) + \ldots + c_n y_n(t) \right), \; t \in I
$$
meaning that $y_1$ is a linear combination of the other functions in the set. On the other hand, if a function is already a linear combination of the other functions, say $y_1 = b_2 y_2 + \ldots + b_n y_n$ on $I$, then $y_1 - b_2 y_2 - \ldots - b_n y_n = 0$ on $I$, which is a linear combination producing the zero function whose coefficients are not all zero (the coefficient of $y_1$ is $1$). So the two definitions are equivalent to each other.

**Remark:** According to this definition, any set containing the zero function is automatically linearly dependent.


> **Example**
> Consider the functions $y_1, y_2, y_3$ defined via $y_1(t) = e^t, y_2(t) = e^{-t}, y_3(t) = e^{t} + e^{-t}$ for $t \in \mathbb{R}$. Notice that
> $$
> y_3(t) = y_1(t) + y_2(t) \Longleftrightarrow y_1(t) + y_2(t) - y_3(t) = 0, \; t \in \mathbb{R}
> $$
> for all real values of $t$. So if $c_1 y_1(t) + c_2 y_2(t) + c_3 y_3(t) = 0$ for all $t \in \mathbb{R}$, it's not necessarily true that $c_1 = c_2 = c_3 = 0$. Therefore these three functions are linearly dependent.

> **Example**
> Consider the functions $y_1, y_2, y_3$ defined via $y_1(t) = \sin^2(t), y_2(t) = \cos^2(t), y_3(t) = 1$ for $t \in \mathbb{R}$. Since 
> $$
> y_1(t) + y_2(t) = \sin^2(t) + \cos^2(t) = 1 = y_3(t)
> $$
> for all $t \in \mathbb{R}$, we see that these functions are linearly dependent.


## The Wronskian

In practice, it is usually cumbersome to check linear independence via either of the previous definitions when the set contains more than two functions. So instead, we'll again borrow some ideas from linear algebra and encode the dependence/independence of these functions in a single object: the Wronskian.

> **Definition**
> Let $I \subseteq \mathbb{R}$ be an interval and let $y_1, ..., y_n \in C^{n-1}(I ; \mathbb{R})$. Then the *Wronskian* of this collection of functions is a function $W: I \to \mathbb{R}$ defined as a **determinant** of a matrix,
> $$
> W(y_1, y_2, ..., y_n) (t) := \det \begin{pmatrix}
> y_1(t) & y_2(t) & \cdots & y_n(t) \\
> y_1'(t) & y_2'(t) & \cdots & y_n'(t) \\
> \vdots & \vdots & \ddots &\vdots  \\
> y_1^{(n-1)}(t) & y_2^{(n-1)}(t) & \cdots & y_n^{(n-1)}(t) 
> \end{pmatrix}, \; t \in I.
> $$

The following proposition establishes a connection between the Wronskian and linear independence.

> **Proposition**
> Let $y_1, \ldots , y_n: I \to \mathbb{R}$ be $n$ **analytic functions** over an interval $I$. Then the set of functions is linearly dependent if and only if $W(y_1,\ldots ,y_n)$ is identically zero on $I$.

In other words,
1. If the Wronskian of $n$ analytic functions is identically zero over an interval $I$, then the set of functions is linearly dependent.
2. Otherwise, the set of functions is linearly independent.

This means that as long as we can compute the Wronskian of a set of analytic functions, we can determine whether they are linearly independent or not.

**Remark:** Analytic functions are smooth functions that admit a local power series representation at every point in their respective domains. In practice, many functions that we encounter are analytic, so this theorem is quite useful. Constant functions, trigonometric functions, polynomials, and exponential functions are all examples of analytic functions.


If the functions in question are solutions to a linear differential equation, we can say something even stronger. 

> **Theorem (Cauchy–Kovalevskaya)** 
> Let $y_1, ..., y_n \in C^{n}(I ; \mathbb{R})$ be $n$ solutions to an $n$-th order homogeneous linear differential equation on an interval $I$ of the form 
> $$
> y^{(n)}(t) + a_{n-1}(t) y^{(n-1)}(t) + \ldots  + a_1(t)y'(t) + a_0(t) y(t) = 0, \; t \in I
> $$
> where $\{a_i\}_{i=0}^{n-1}$ are *analytic* over $I$. Then the following hold. 
> 1. $y_1, ..., y_n$ are analytic 
> 2. Either $W(y_1, y_2, ..., y_n)(t) = 0$ for all $t \in I$ and the set $\{y_1, ..., y_n\}$ is linearly dependent, or $W(y_1, y_2, ..., y_n)(t) \neq 0$ for all $t \in I$ and the set $\{y_1, ..., y_n\}$ is linearly independent.

**Remark**: This theorem is stated only for functions that are solutions to homogeneous equations of a specific form (leading coefficient must be 1 and the coefficients must be analytic). By item 2, as long as the Wronskian does not vanish at one point, we can immediately conclude that the set of functions is linearly independent.


## Determinants 
Next we discuss how to compute the Wronskian by going over how to compute determinants.

### Determinants of $2 \times 2$ matrices

We first consider the determinant of $2 \times 2$ matrices.

> **Definition**  
> Consider the square matrix  
> $$  
> A = \begin{pmatrix}  
> a & b \\  
> c & d  
> \end{pmatrix}.  
> $$  
> The *determinant* of $A$ is a number associated to the matrix $A$, defined by  
> $$  
> \det A := ad - bc.  
> $$  

**Remark** 
The determinant is sometimes denoted with vertical bars: 
$$
\det(A) = \begin{vmatrix}
a & b \\
c & d
\end{vmatrix}.
$$

> **Example:**
> Let 
> $$
> A = \begin{pmatrix} 
> 1 & 2 \\
> 3 & 4 
> \end{pmatrix}, B = \begin{pmatrix} 
> 2 & 3 \\
> 4 & 5
> \end{pmatrix}. 
> $$
> Then
> $$
> \det A = 1 \times 4 - 2 \times 3 = -2, \det B = 2 \times 5 - 3 \times 4 = -2.
> $$
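
The $2 \times 2$ formula is short enough to check directly in code. This is a minimal sketch; the function name `det2` is a hypothetical helper introduced here.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    return a * d - b * c

A = [[1, 2], [3, 4]]
B = [[2, 3], [4, 5]]
print(det2(A), det2(B))  # both equal -2, matching the example
```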



### Determinants of $3 \times 3$ matrices

Suppose $A$ is a $3 \times 3$ matrix of the form
$$
A = \begin{pmatrix}
a & b & c\\
d & e & f\\
g & h & i
\end{pmatrix}.
$$
The simplest way to calculate $\det(A)$ is to use
$$
\det(A) = \begin{vmatrix}
a & b & c\\
d & e & f\\
g & h & i
\end{vmatrix} = a  \begin{vmatrix}
e & f \\
h & i
\end{vmatrix} - d \begin{vmatrix}
b & c \\
h & i
\end{vmatrix}  + g\begin{vmatrix}
b & c \\
e & f
\end{vmatrix}
= a(ei - fh) - d(bi - ch) + g( bf - ce).
$$

One way to interpret this is that we are expanding along the first column, and we're multiplying each element in the column by the determinant of the matrix obtained from ignoring the row and column containing that element. Here's the way to visualize it (the smaller matrices are called *minors*)
$$
\begin{vmatrix}
a & \cdots & \cdots \\
\vdots & \textcolor{red}{e} & \textcolor{red}{f}\\
\vdots & \textcolor{red}{h} & \textcolor{red}{i}
\end{vmatrix}  \quad \quad \begin{vmatrix}
\vdots & \textcolor{red}{b} & \textcolor{red}{c}\\
d & \cdots & \cdots\\
\vdots & \textcolor{red}{h} & \textcolor{red}{i}
\end{vmatrix} \quad \quad 
\begin{vmatrix}
\vdots & \textcolor{red}{b} & \textcolor{red}{c}\\
\vdots & \textcolor{red}{e} & \textcolor{red}{f}\\
g & \cdots & \cdots
\end{vmatrix}
$$

Notice that the signs in the expansion flipped from $+a$ to $-d$ to $+g$. This is important. 

In general one can expand along any row or any column, as long as the right sign is placed in front of each element in the expansion. The sign associated with each element is given in the matrix on the right:
$$
\begin{pmatrix}
a & b & c\\
d & e & f\\
g & h & i
\end{pmatrix} \quad \quad \begin{pmatrix}
+ & - & +\\
- & +& -\\
+ & - & +
\end{pmatrix}
$$

So in the equation above
$$
\det(A) = \textcolor{red}{+a}  \begin{vmatrix}
e & f \\
h & i
\end{vmatrix} \textcolor{red}{-d} \begin{vmatrix}
b & c \\
h & i
\end{vmatrix}  \textcolor{red}{+g} \begin{vmatrix}
b & c \\
e & f
\end{vmatrix},
$$
we took
$$
\begin{pmatrix}
\textcolor{red}{a} & b & c\\
\textcolor{red}{d} & e & f\\
\textcolor{red}{g} & h & i
\end{pmatrix} \quad \quad \begin{pmatrix}
\textcolor{red}{+} & - & +\\
\textcolor{red}{-} & +& -\\
\textcolor{red}{+} & - & +
\end{pmatrix}.
$$

If we were to expand along the second column then we'd have the minors as 
$$
\begin{pmatrix}
\cdots & b & \cdots \\
\textcolor{blue}{d} & \vdots  & \textcolor{blue}{f}\\
\textcolor{blue}{g} & \vdots & \textcolor{blue}{i}
\end{pmatrix}  \quad \quad \begin{pmatrix}
\textcolor{blue}{a} & \vdots &  \textcolor{blue}{c} \\
\cdots & e  & \cdots\\
\textcolor{blue}{g} & \vdots & \textcolor{blue}{i}
\end{pmatrix} \quad \quad 
\begin{pmatrix}
\textcolor{blue}{a} & \vdots &  \textcolor{blue}{c} \\
\textcolor{blue}{d} & \vdots  & \textcolor{blue}{f}\\
\cdots& h & \cdots
\end{pmatrix}
$$
and thus 
$$
\det(A) =  \textcolor{red}{-b}  \begin{vmatrix}
d & f \\
g & i
\end{vmatrix}  \textcolor{red}{+e} \begin{vmatrix}
a & c \\
g & i
\end{vmatrix} \textcolor{red}{-h} \begin{vmatrix}
a & c \\
d & f
\end{vmatrix}.
$$

> **Example:**
> Let 
> $$
> A = \begin{pmatrix} 
> 1 & 2 & 3 \\
> 4 & 5 & 6 \\
> 7 & 8 & 9
> \end{pmatrix}.
> $$
> According to the discussion above, we can evaluate $\det A$ along the second row:
> $$
> \det A = -4 \times \det \begin{pmatrix} 
> 2 & 3 \\
> 8 & 9
> \end{pmatrix} + 5 \times \det \begin{pmatrix} 
> 1 & 3 \\
> 7 & 9
> \end{pmatrix} - 6 \times  \det \begin{pmatrix} 
> 1 & 2 \\
> 7 & 8
> \end{pmatrix}.
> $$
> We can also evaluate along the second column:
> $$
> \det A = -2 \times \det \begin{pmatrix} 
> 4 & 6 \\
> 7 & 9
> \end{pmatrix} + 5 \times \det  \begin{pmatrix} 
> 1 & 3 \\
> 7 & 9
> \end{pmatrix} - 8 \times \det \begin{pmatrix} 
> 1 & 3 \\
> 4 & 6
> \end{pmatrix}.  
> $$
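
Both expansions in the example can be evaluated numerically to confirm that they agree. A small sketch follows; the helper `det2` and the variable names `row2` and `col2` are introduced here for illustration.

```python
def det2(a, b, c, d):
    """Determinant of the 2x2 matrix [[a, b], [c, d]]."""
    return a * d - b * c

# Expansion along the second row (signs -, +, -)
row2 = -4 * det2(2, 3, 8, 9) + 5 * det2(1, 3, 7, 9) - 6 * det2(1, 2, 7, 8)

# Expansion along the second column (signs -, +, -)
col2 = -2 * det2(4, 6, 7, 9) + 5 * det2(1, 3, 7, 9) - 8 * det2(1, 3, 4, 6)

print(row2, col2)
```

Both expansions evaluate to $0$, so this particular matrix has determinant zero.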


### Determinants of $n \times n$ matrices

Now we consider determinants of $n \times n$ matrices.

> **Definition (Determinants via cofactor expansion along first row)**  
> Let $A = (a_{ij})$ be an $n \times n$ square matrix. Denote by $\tilde{A}_{ij}$ the $(n-1) \times (n-1)$ matrix obtained from deleting the row and column containing $a_{ij}$ from $A$. Then we define the determinant of $A$ to be 
> $$
> \det A := \sum_{k=1}^n (-1)^{1 +k} a_{1k} \det \tilde{A}_{1k}.
> $$

One can show that, in fact, you can expand along any column or any row. 

> **Proposition (Cofactor expansion along any row)**  
> Let $i$ be any integer between $1$ and $n$. Then the determinant of $A$ is equal to 
> $$
> \det A = \sum_{k=1}^n (-1)^{i + k} a_{ik} \det \tilde{A}_{ik}.
> $$

> **Proposition (Cofactor expansion along any column)**  
> Let $j$ be any integer between $1$ and $n$. Then the determinant of $A$ is equal to 
> $$
> \det A = \sum_{k=1}^n (-1)^{j + k} a_{kj} \det \tilde{A}_{kj}.
> $$
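
The cofactor formulas translate into a short recursive routine. The sketch below is an illustration only: the names `minor`, `det_row`, and `det_col` and the sample matrix `M` are assumptions of this example, indices are 0-based (which leaves the sign $(-1)^{i+k}$ unchanged), and the recursion is far too slow for large matrices.

```python
def minor(A, i, j):
    """Matrix obtained by deleting row i and column j (0-based) from A."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]

def det_row(A, i=0):
    """Cofactor expansion of det A along row i."""
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** (i + k) * A[i][k] * det_row(minor(A, i, k)) for k in range(n))

def det_col(A, j=0):
    """Cofactor expansion of det A along column j."""
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** (k + j) * A[k][j] * det_row(minor(A, k, j)) for k in range(n))

M = [[2, 0, 1],
     [1, 3, -1],
     [0, 5, 4]]

print([det_row(M, i) for i in range(3)])
print([det_col(M, j) for j in range(3)])
```

Every row and every column gives the same value, as the two propositions assert.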


## Examples calculating Wronskians

> **Example**
> Consider the functions $y_1, y_2: \mathbb{R} \to \mathbb{R}$ defined via 
> $$
> y_1(x) = e^x, \; y_2(x) = e^{2x}.
> $$
> Then 
> $$
> W(y_1,y_2) (x) = \det \begin{pmatrix}
> e^x & e^{2x} \\
> e^x & 2e^{2x}
> \end{pmatrix} = 2e^{3x} - e^{3x} = e^{3x} \neq 0 \; \text{for all} \; x \in \mathbb{R}.
> $$
> Therefore the two functions are linearly independent over any interval $I \subseteq \mathbb{R}$.  
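
A numerical spot check of this Wronskian is sketched below. The derivatives are entered by hand, and the helper name `wronskian2` and the sample points are choices made for this illustration.

```python
import math

def wronskian2(y1, dy1, y2, dy2, x):
    """W(y1, y2)(x) = y1(x) y2'(x) - y2(x) y1'(x), with derivatives supplied by hand."""
    return y1(x) * dy2(x) - y2(x) * dy1(x)

y1, dy1 = math.exp, math.exp                # y1(x) = e^x and its derivative
y2 = lambda x: math.exp(2 * x)              # y2(x) = e^{2x}
dy2 = lambda x: 2 * math.exp(2 * x)         # y2'(x) = 2 e^{2x}

samples = [-1.0, 0.0, 1.5]
values = [wronskian2(y1, dy1, y2, dy2, x) for x in samples]
for x, w in zip(samples, values):
    print(x, w, math.exp(3 * x))            # the last two columns should agree
```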

> **Example**
> Consider the functions $y_1, y_2: \mathbb{R} \to \mathbb{R}$ defined via 
> $$
> y_1(t) = t^2, \; y_2(t) = t^3.
> $$
> Then 
> $$
> W(y_1,y_2) (t) = \det \begin{pmatrix}
> t^2 & t^3 \\
> 2t & 3t^2
> \end{pmatrix} = 3t^4 - 2t^4 = t^4 \neq 0 \; \text{for all} \; t \neq 0.
> $$
> Therefore the two functions are linearly independent on any interval $I \subseteq \mathbb{R}$, even if it includes $t = 0$.  

**Remark:**
Since the Wronskian in the previous example vanishes at $t = 0$ but does not vanish identically on any interval $I$ containing $0$, on any such interval there is no equation of the form 
$$
y''(t) + a_1(t) y'(t) + a_0(t)y(t) = 0, \; t \in I
$$
with analytic coefficients that has these two functions as solutions: otherwise the Wronskian would either be never zero or vanish identically on $I$. 

However, if $I$ does not contain 0, then the Wronskian never vanishes on $I$ and this is possible. One can check that $y_1, y_2: (0,\infty) \to \mathbb{R}$ are solutions to 
$$
y''(t) - \frac{4}{t} y'(t) + \frac{6}{t^2} y(t) = 0, \; t > 0.
$$
Indeed, substituting $y_1(t) = t^2$ gives $2 - 8 + 6 = 0$, and substituting $y_2(t) = t^3$ gives $6t - 12t + 6t = 0$. 
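
One way to find the coefficients of such an equation is to require that both $y_1(t) = t^2$ and $y_2(t) = t^3$ satisfy $y'' + a_1(t) y' + a_0(t) y = 0$ and to solve the resulting $2 \times 2$ linear system for $a_1(t), a_0(t)$ at each $t \neq 0$ by Cramer's rule. The Wronskian appears in the denominator, which is why $t = 0$ must be excluded. The sketch below does this in exact arithmetic; the function name `coefficients` is a hypothetical helper.

```python
from fractions import Fraction

def coefficients(t):
    """Solve y_k'' + a1 y_k' + a0 y_k = 0 (k = 1, 2) for (a1, a0) at a point t != 0."""
    t = Fraction(t)
    y1, dy1, d2y1 = t**2, 2 * t, Fraction(2)   # t^2 and its derivatives
    y2, dy2, d2y2 = t**3, 3 * t**2, 6 * t      # t^3 and its derivatives
    W = y1 * dy2 - y2 * dy1                    # Wronskian, equal to t^4
    a1 = -(y1 * d2y2 - y2 * d2y1) / W          # Cramer's rule
    a0 = (dy1 * d2y2 - dy2 * d2y1) / W
    return a1, a0

for t in [1, 2, Fraction(1, 2)]:
    print(t, coefficients(t))
```

At each sample point the computed values agree with $a_1(t) = -4/t$ and $a_0(t) = 6/t^2$.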


> **Example**
> Consider the functions $y_1, y_2, y_3: \mathbb{R} \to \mathbb{R}$ defined via 
> $$
> y_1(t) = \sin^2(t), \; y_2(t) = \cos^2(t), \; y_3(t) = 1.
> $$
> Then 
> $$
> W(y_1,y_2, y_3) (t) = \det \begin{pmatrix}
> \sin^2(t) & \cos^2(t) & 1 \\
> 2 \sin(t) \cos(t) & -2 \sin(t)\cos(t) & 0 \\
> 2 \cos(2t) & -2 \cos(2t) & 0
> \end{pmatrix}
> = \det \begin{pmatrix}
> 2 \sin(t) \cos(t) & -2 \sin(t)\cos(t) \\
> 2 \cos(2t) & -2 \cos(2t)
> \end{pmatrix}
> = 0 \; \text{for all} \; t \in \mathbb{R}.
> $$
> Therefore the three functions are linearly dependent on any interval $I \subseteq \mathbb{R}$.
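
The same $3 \times 3$ Wronskian can be evaluated numerically at a few points as a sanity check. In this sketch the matrix entries are typed in from the example, `det3` expands along the first column as in the determinant section, and the names `det3`, `wronskian_matrix`, and `values` are introduced here.

```python
import math

def det3(m):
    """3x3 determinant via cofactor expansion along the first column."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - d * (b * i - c * h) + g * (b * f - c * e)

def wronskian_matrix(t):
    """Rows: the functions sin^2, cos^2, 1, then their first and second derivatives."""
    s, c = math.sin(t), math.cos(t)
    return [
        [s * s, c * c, 1.0],
        [2 * s * c, -2 * s * c, 0.0],
        [2 * math.cos(2 * t), -2 * math.cos(2 * t), 0.0],
    ]

values = [det3(wronskian_matrix(t)) for t in [0.0, 0.3, 1.0, 2.5]]
print(values)  # all zero up to floating point rounding
```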

Next, we state a version of the existence and uniqueness theorem for higher order linear differential equations.

> **Theorem (Existence and uniqueness theorem)**
> Let $t_0 \in \mathbb{R}$ and let $I$ be an interval containing $t_0$. Consider an initial value problem of the form 
> $$
> \begin{cases}
> a_n(t) y^{(n)}(t) + \ldots + a_1(t) y'(t) + a_0(t)y(t)  = 0, & t \in I \\
> y^{(k)}(t_0) = y_k, \; 0 \le k \le n-1.
> \end{cases} 
> $$
> If $a_n, \ldots, a_0: I \to \mathbb{R} $ are continuous and $a_n(t) \neq 0$ for all $t \in I$, then there exists a unique solution that is defined globally on $I$. 

As a consequence of the existence and uniqueness theorem, we can deduce the structure of general solutions to homogeneous linear equations. 

> **Proposition**
> Let $I$ be an interval and let $a_n, \ldots, a_0: I \to \mathbb{R}$ be continuous, and assume $a_n(t) \neq 0$ for all $t \in I$. Given an $n$-th order homogeneous linear ODE of the form 
> $$
> L(y) (t) = a_n(t) y^{(n)}(t) + \ldots + a_1(t) y'(t) + a_0(t)y(t)  = 0,
> $$
> the general solution to this equation is given by 
> $$
> y_h(t) = c_1 y_1(t) + \ldots + c_n y_n(t), \; t \in I
> $$
> where $y_1, \ldots, y_n$ are solutions to the equation that are linearly independent over $I$ and $c_1,\ldots,c_n$ are arbitrary constants.

**Justification:** For the sake of simplicity we will only show this for the case of $n=2$. Suppose we'd like to identify the unique solution to the IVP
$$
\begin{cases}
a_2(t) y''(t) + a_1(t) y'(t) + a_0(t)y(t) = 0, \; t \in I \\
y(t_0) = c_1 \\
y'(t_0) = c_2.
\end{cases}
$$
By the existence and uniqueness theorem, there exist unique solutions $y_1, y_2$ defined over $I$ to the IVPs 
$$
\begin{cases}
a_2(t) y''(t) + a_1(t) y'(t) + a_0(t)y(t) = 0, \; t \in I\\
y(t_0) = 1 \\
y'(t_0) = 0.
\end{cases}
$$
and 
$$
\begin{cases}
a_2(t) y''(t) + a_1(t) y'(t) + a_0(t)y(t) = 0, \; t \in I\\
y(t_0) = 0 \\
y'(t_0) = 1
\end{cases}
$$
respectively, since the coefficients $a_2, a_1, a_0$ are assumed to be continuous and $a_2(t) \neq 0$ for all $t \in I$. Then if we define the function $y$ via $y(t) = c_1y_1(t) + c_2y_2(t), t \in I$, by linearity,
$$
L(y) = c_1 L(y_1) + c_2 L(y_2) = c_1 (0) + c_2(0) = 0,
$$
and also 
\begin{align}
y(t_0) &= c_1 y_1(t_0) + c_2 y_2(t_0) = c_1 (1) + c_2 (0) = c_1 \\
y'(t_0) &= c_1 y_1'(t_0) + c_2 y_2'(t_0) = c_1 (0) + c_2 (1) = c_2.
\end{align}
So by uniqueness, $y = c_1 y_1(t) + c_2 y_2(t)$ is the unique solution to the IVP.

To check that $y_1, y_2$ are linearly independent, we can check either via the definition of linear independence or via the Wronskian. If we want to check via definition, suppose $d_1 y_1(t) + d_2 y_2(t) = 0$ for all $t \in I$. Note that this implies $d_1 y_1'(t) + d_2 y_2'(t) = 0$ for all $t \in I$. In particular, we have 
\begin{align}
d_1 y_1(t_0) + d_2 y_2(t_0) &= 0 \\
d_1 y_1'(t_0) + d_2 y_2'(t_0) &= 0,
\end{align}
or 
\begin{align}
d_1 &= 0\\
d_2 &= 0.
\end{align}
This shows that $y_1, y_2$ are linearly independent over $I$.

We can also verify linear independence directly via the Wronskian. We note that 
$$
W(y_1,y_2)(t_0) = \det \begin{pmatrix}
y_1(t_0) & y_2(t_0) \\
y_1'(t_0) & y_2'(t_0)
\end{pmatrix} = \det \begin{pmatrix}
1 & 0 \\
0 & 1
\end{pmatrix} = 1 \neq 0.
$$
Since the Wronskian is non-zero at the point $t_0 \in I$, the two functions are linearly independent over $I$. $\square$



This shows that if we want to find the general solution to the homogeneous equation, it suffices to identify $n$ linearly independent "building blocks." This leads to the notion of a **fundamental set of solutions**.

> **Definition**
> Let $I$ be an interval and let $a_n, \ldots, a_0: I \to \mathbb{R}$ be continuous, and assume $a_n(t) \neq 0$ for all $t \in I$. A set of $n$ functions $\{y_1, \ldots, y_n\} \subseteq C^{n}(I ; \mathbb{R})$ is called a **fundamental set of solutions** to the $n$-th order homogeneous linear ODE 
> $$
> a_n(t) y^{(n)}(t) + \ldots + a_1(t) y'(t) + a_0(t)y(t)  = 0, \; t \in I
> $$
> if the functions are linearly independent over $I$ and the general solution to the equation is given by
> $$
> y_h(t) = c_1 y_1(t) + \ldots + c_n y_n(t), \; t \in I.
> $$

In general, identifying a fundamental set of solutions is not easy. However, there is a special case where one can be identified explicitly. 

## Constant coefficient linear differential equations

In the case where the equation in consideration is a homogeneous **constant coefficient** linear differential equation of the form 
$$
a_n y^{(n)} + \ldots + a_1 y' + a_0 y = 0, 
$$
where $a_n \neq 0$, the general solution is given by 
$$y_h(t) = c_1 y_1(t) + \ldots + c_n y_n(t), t \in \mathbb{R},$$
and the linearly independent functions $y_1, \ldots, y_n$ can be identified through the **characteristic equation**
$$a_n r^n + \ldots + a_1 r + a_0 = 0.$$

By the fundamental theorem of algebra we know that this polynomial has exactly $n$ (possibly complex) roots, counted with multiplicity. Suppose $\lambda_1, \lambda_2, ..., \lambda_n$ are these roots.

1. If $\lambda_1, ..., \lambda_n$ are all real and they are all distinct, then the general solution is 
$$
y(t) = c_1 e^{\lambda_1 t} + c_2 e^{\lambda_2 t} + ... + c_n e^{\lambda_n t}.
$$
2. If we have $k$ repeated real roots, say $\lambda_1 = \lambda_2 = ... = \lambda_k$ (and $\lambda_{k+1}, ..., \lambda_n$ are real distinct roots), then 
$$
y(t) = c_1 \textcolor{red}{e^{\lambda_1 t}} + c_2 \textcolor{red}{te^{\lambda_2 t}} + c_3 \textcolor{red}{t^2 e^{\lambda_3 t}} + \ldots + c_k \textcolor{red}{t^{k-1} e^{\lambda_k t}}  + c_{k+1} e^{\lambda_{k+1} t} + \ldots + c_n e^{\lambda_n t}.
$$
3. If $\lambda_i = \alpha + \beta i$ (with $\beta \neq 0$) is a complex root of multiplicity 1, then, since the coefficients $a_0, \ldots, a_n$ are real, $\lambda_j = \overline{\lambda_i} = \alpha - \beta i$ must also be a root. Then we replace $c_i e^{\lambda_i t} + c_j e^{\lambda_j t}$ (which is still valid, but we are interested only in real-valued solutions) with $c_i e^{\alpha t} \cos(\beta t) + c_j e^{\alpha t} \sin(\beta t)$. 
4. If $\lambda = \alpha + \beta i$ is a complex root of multiplicity $k$, then its conjugate $\overline{\lambda} = \alpha - \beta i$ must also be a root of multiplicity $k$. In this case the solution is the same as the solution given in case (2), except we replace the complex exponentials in the previous equation with sines and cosines, as in case (3).

This is not too difficult to justify using the aforementioned linear algebraic framework. For now, we will simply verify this through examples. 

## Examples

> **Example:**
> Consider the IVP
> $$
> \begin{cases}
> y''(t) + t y'(t) + t^2 y(t) = 0, \; t \in I \\
> y(t_0) = a_1 \\
> y'(t_0) =a_2.	
> \end{cases}
> $$
> Recall that by the existence and uniqueness theorem for higher order equations, as long as the coefficients are continuous and the leading coefficient does not vanish on an interval $I$, there will always exist a unique solution to this IVP that is defined over $I$, for all values of $t_0,a_1, a_2$.
> Note that since the equation is linear, the general solution is still given by 
> $$
> y(t) = c_1 y_1(t) + c_2 y_2(t), \; t \in I
> $$
> but since the coefficients are not constant functions, there's no simple mechanism to identify $y_1, y_2$.

> **Example:**
> Consider the IVP
> $$
> \begin{cases}
> y''(t) - 3y'(t) + 2 y(t) = 0, \;t \in \mathbb{R} \\
> y(0) = a\\
> y'(0) = b.
> \end{cases}
> $$
> The associated characteristic polynomial is 
> $$
> r^2 - 3r + 2 = (r-1)(r-2) = 0,
> $$
> so its roots are $r = 1, r= 2$. This means that the general solution to the equation is given by 
> $$
> y(t) = c_1 e^{t} + c_2 e^{2t}, \; t \in \mathbb{R}.
> $$
> To identify the (unique) solution to the IVP, we need to identify the coefficients $c_1, c_2$ using the initial conditions. We have 
> $$
> \begin{aligned}
> y(0) &= c_1 e^{0} + c_2 e^{0} = c_1 + c_2, \\
> y'(0) &= c_1 e^{0} + 2 c_2 e^{0} = c_1 + 2c_2,
> \end{aligned}
> $$
> so for $y$ to be a solution to the IVP we must have 
> $$
> \begin{aligned}
> c_1 + c_2 &= a, \\
> c_1 + 2c_2 &= b.
> \end{aligned}
> $$
> This implies that $c_2 = b-a, c_1 = 2a-b$. So the unique solution to the IVP is 
> $$
> y(t) = (2a-b) e^t + (b-a) e^{2t}, \; t \in \mathbb{R}.
> $$
> Notice that a unique solution exists for all values of $a,b$. This is in line with the existence and uniqueness theorem for higher order equations, since the coefficients are constant functions (hence continuous).  
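
The formulas $c_1 = 2a - b$, $c_2 = b - a$ can be checked numerically. In the sketch below the function name `ivp_solution` and the sample initial data $a = 1$, $b = -2$ are assumptions made for illustration; the derivatives of $y$ are written out by hand.

```python
import math

def ivp_solution(a, b):
    """Return y, y', y'' for the solution of y'' - 3y' + 2y = 0, y(0) = a, y'(0) = b."""
    c1, c2 = 2 * a - b, b - a
    y = lambda t: c1 * math.exp(t) + c2 * math.exp(2 * t)
    dy = lambda t: c1 * math.exp(t) + 2 * c2 * math.exp(2 * t)
    d2y = lambda t: c1 * math.exp(t) + 4 * c2 * math.exp(2 * t)
    return y, dy, d2y

y, dy, d2y = ivp_solution(a=1.0, b=-2.0)

# Initial conditions: should reproduce a and b
print(y(0.0), dy(0.0))

# Residual of the differential equation at a few sample points
residual = max(abs(d2y(t) - 3 * dy(t) + 2 * y(t)) for t in [0.0, 0.5, 1.0])
print(residual)
```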

> **Example:**
> Consider the equation 
> $$
> y''(t) - y(t) = 0, \; t \in \mathbb{R}.
> $$
> The associated characteristic polynomial is 
> $$
> r^2 - 1 =(r-1)(r+1) = 0.
> $$
> So the general solution to the equation is given by 
> $$
> y(t) = c_1 e^{-t} + c_2 e^{t}, \; t \in \mathbb{R},
> $$
> where $c_1, c_2$ are arbitrary constants. 
> 
> Notice that we can also write the general solution as
> $$
> y(t) = d_1 (e^{-t} + e^{t}) + d_2 e^t, \; t \in \mathbb{R}
> $$
> where $d_1, d_2$ are arbitrary constants, since
> $$
> d_1 (e^{-t} + e^{t}) + d_2 e^t = d_1 e^{-t} + (d_1 + d_2) e^t, \; t \in \mathbb{R}
> $$
> and 
> $$
> c_1 e^{-t} + c_2 e^{t} = c_1 (e^{-t} + e^{t}) + (c_2 - c_1) e^{t}, \; t \in \mathbb{R}.
> $$
> In general, the "building blocks" of the general solution are not unique. What's important is that they are solutions to the homogeneous equation and that they are linearly independent. 

> **Example:**
> Consider the equation 
> $$
> y''(t) - 2y'(t) + y(t) = 0, \; t \in \mathbb{R}.
> $$
> The associated characteristic polynomial is 
> $$
> r^2 - 2r + 1 =(r-1)^2 = 0.
> $$
> Here, $r = 1$ is a repeated root. So the general solution to the equation is 
> $$
> y(t) = c_1 e^t + c_2 t e^t, \; t \in \mathbb{R}.
> $$
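
As a quick check of why the extra factor of $t$ works (a worked verification added for illustration, with $y_2$ denoting the second building block): the function $y_2(t) = te^t$ has $y_2'(t) = (1+t)e^t$ and $y_2''(t) = (2+t)e^t$, so
$$
y_2''(t) - 2y_2'(t) + y_2(t) = \big( (2+t) - 2(1+t) + t \big) e^t = 0, \; t \in \mathbb{R}.
$$
Moreover,
$$
W(e^t, te^t)(t) = e^t \cdot (1+t)e^t - te^t \cdot e^t = e^{2t} \neq 0 \; \text{for all} \; t \in \mathbb{R},
$$
so $e^t$ and $te^t$ are linearly independent solutions.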

> **Example**
> Consider the equation 
> $$
> y^{(3)}(t) + 3 y''(t) + 3 y'(t) + y(t) = 0, \; t \in \mathbb{R}.
> $$
> The associated characteristic polynomial is 
> $$
> r^3 +3r^2 + 3r + 1 =(r+1)^3 = 0.
> $$
> Here, $r = -1$ is a repeated root of multiplicity 3. So the general solution to the equation is 
> $$
> y(t) = c_1 e^{-t} + c_2 t e^{-t} + c_3 t^2 e^{-t}, \; t \in \mathbb{R}.
> $$

> **Example:**
> Consider the equation
> $$
> y'''(x) - y(x) = 0, \; x \in \mathbb{R}.
> $$
> The associated characteristic polynomial is 
> $$
> r^3 - 1 =0. 
> $$
> The roots of this polynomial are given by the *3rd roots of unity*:
> $$
> \begin{split}
> r_1 &= \exp \left( \frac{2\pi i}{3} \cdot 0 \right) = 1, \\
> r_2 &=  \exp \left( \frac{2\pi i}{3} \cdot 1 \right) = \cos \left( \frac{2\pi}{3} \right) + i \sin \left( \frac{2\pi}{3} \right) = -\frac{1}{2} + i \frac{\sqrt{3}}{2}, \\
> r_3 &=  \exp \left( \frac{2\pi i}{3} \cdot 2 \right) = \cos \left( \frac{4\pi}{3} \right) + i \sin \left( \frac{4\pi}{3} \right) = -\frac{1}{2} - i \frac{\sqrt{3}}{2}.
> \end{split}
> $$
> Therefore the general solution is given by 
> $$
> y(x) = c_1e^{x} + c_2 e^{-x/2} \cos \left( \frac{\sqrt{3}}{2} x\right) + c_3 e^{-x/2} \sin \left( \frac{\sqrt{3}}{2} x\right), \; x \in \mathbb{R}.
> $$
> Another way to find the roots to the characteristic polynomial is by using the factorization
> $$
> r^3 - 1 = (r-1) (r^2 + r + 1 ).
> $$
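
The three cube roots of unity can be generated and checked with Python's standard `cmath` module. This is only a numerical sanity check; the variable names `roots` and `errors` are introduced here.

```python
import cmath

# r_k = exp(2 pi i k / 3) for k = 0, 1, 2
roots = [cmath.exp(2j * cmath.pi * k / 3) for k in range(3)]

# Each root should satisfy r^3 - 1 = 0 up to rounding
errors = [abs(r**3 - 1) for r in roots]

for r, err in zip(roots, errors):
    print(r, err)
```

The real and imaginary parts of the second and third roots give the decay rate $-\tfrac{1}{2}$ and the frequency $\tfrac{\sqrt{3}}{2}$ that appear in the general solution.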

> **Example**
> Consider the equation
> $$
> x'''(t) - x''(t) - 4x(t) = 0, \; t \in \mathbb{R}.
> $$
> The associated characteristic polynomial is 
> $$
> r^3 - r^2 - 4 = 0.
> $$
> By inspection, $r = 2$ is a root to this polynomial. Then via long division we see that 
> $$
> r^3 - r^2 - 4 = (r-2) (r^2 + r + 2) =0.
> $$
> Therefore the roots of this polynomial are 
> $$
> \begin{split}
> r_1 &= 2\\
> r_2 &= \frac{-1 - \sqrt{1 - 8}}{2} = -\frac{1}{2} - i \frac{\sqrt{7}}{2}\\
> r_3 &= -\frac{1}{2} + i \frac{\sqrt{7}}{2}.
> \end{split}
> $$
> So the general solution is given by 
> $$
> x(t) = c_1e^{2t} + c_2 e^{-t/2} \cos \left( \frac{\sqrt{7}}{2} t\right) + c_3 e^{-t/2} \sin \left( \frac{\sqrt{7}}{2} t\right), \; t \in \mathbb{R}.
> $$
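
The roots found by inspection and the quadratic formula can be substituted back into the characteristic polynomial as a check. A minimal sketch, where the names `p`, `disc`, and `roots` are introduced for illustration:

```python
import cmath

def p(r):
    """Characteristic polynomial r^3 - r^2 - 4."""
    return r**3 - r**2 - 4

disc = cmath.sqrt(1 - 8)  # square root of the discriminant of r^2 + r + 2
roots = [2, (-1 - disc) / 2, (-1 + disc) / 2]

for r in roots:
    print(r, abs(p(r)))  # the second column should be zero up to rounding
```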


## Appendix: complex numbers
> **Definition**
> A *complex number* is a number of the form $z = a+ib$, where $a,b$ are real numbers and $i$ is the imaginary unit satisfying $i^2 = -1$. $a$ is referred to as the *real part* of $z$ and denoted as $\Re(z)$. $b$ is referred to as the *imaginary part* of $z$ and denoted as $\Im(z)$. 

We usually denote the set of complex numbers with $\mathbb{C}$.

> **Definition**
> The *complex conjugate* of a complex number $a + ib$ is the complex number $a - ib$. We often denote the complex conjugate of a complex number $z$ by $\overline{z}$.

> **Example**
> The complex conjugate of $3 + 2i$ is 
> $$
> \overline{3+2i} = 3-2i.
> $$

> **Theorem (Fundamental theorem of algebra)**
> Let $p(z)$ be a polynomial of degree $n \geq 1$ with possibly complex coefficients. Then $p$ has exactly $n$ complex roots, counted with multiplicity.

> **Proposition**
> If a complex number $z = a+ib$ is a root of a polynomial $p$ with **real** coefficients, then its conjugate $\overline{z}$ is also a root of $p$.
