## Linear dynamical and time invariant systems ##

### Content ###
This is a list of things that will eventually be in here. The stuff I talked about on 2/22/19 more or less already is, although there are a few alternative angles I want to expand on the same material. That was:

- Basics
    - Abstract definition of a linear dynamical system (LDS)
    - Definition of time-invariant version (LTI)
    - Composition
    - Feedback (for the future)

- Characterization of LTI systems
    - Transfer functions
    - Impulse response
    - Frequency response
    - Step response (for the future)

- Solution to LDS via integrating factors

More things for the future:
- More on convolution
- More on Fourier, Laplace
- Matrix exponential, eigenvalues, eigenvectors
- Volterra series

- Applications
    - Low pass aspect of LDS
    - Spring mass system
    - Langevin equation in Brownian motion
    - Langevin equation as neural network model
    - Markov chain mixing
    - 1-dim potential well
    - Relation to Liouville's theorem 
    - Linearized systems
    - Kalman filters
    - Building dynamics from exponential filters

### Basics ###
A **linear dynamical system** can be given the following abstract definition:

$$ \dot{x}(t) = f(x(t),u(t)) \text{, s.t. } f \text{ is linear in } x \text{ and } u. $$

Since every (finite dimensional) linear operator has a matrix representation, this can also be given as

$$ \dot{x}(t) = A(t)x(t) + B(t)u(t) $$

where A and B are matrices, and x and u are vectors. A is called a dynamics matrix, and B is called an input or coupling matrix. Equations of this form are often called **state space models**. Such models often posit that x is also latent, but is observable through another (distinct) linear transformation. Linear dynamical systems are interesting because linear algebra and calculus (i.e. dynamics) form the basis of most applied mathematical knowledge.

Another useful property is time invariance. By definition, **time invariant** systems have response properties that do not depend on when, in absolute time, a signal is applied. Applying an input now should be equivalent to applying the same input later. Time invariance is therefore a kind of translation invariance (invariance under shifts of the time axis), which is worth noting if one is interested in group properties of operators and the like.

Setting aside the specificity of dynamical systems for a moment and thinking only of linearity and time invariance, consider a system

$$ y(t) = S(x(t)) = A(t)x(t)$$

where $S$ denotes the function computed by the system on some time varying input $x$. Since $S$ is a linear transformation, it has a matrix representation as multiplication. (If that part is unclear, linear algebra books typically prove that the space of linear operators between finite dimensional spaces is isomorphic to a space of matrices.) Formally, if $D_{\tau}$ is a shift operator (so that $D_{\tau}y(t) = y(t-\tau)$), time invariance of a system requires

$$ D_{\tau}S(x(t)) = S(x(t-\tau))$$

so that we also now know

$$ D_{\tau}(A(t)x(t)) = A(t)x(t-\tau) $$

which implies $A$ is a constant: the left side is $A(t-\tau)x(t-\tau)$, so we need $A(t-\tau) = A(t)$ for every $\tau$. As another notable aside, linearity and time invariance can both be thought of as stating which operators commute with the system. Since composition is a binary operation, i.e. it takes 2 things and produces 1 new one, commutativity means that changing the order of composition doesn't change the result. So if $T$ takes linear combinations of signals (scaling and adding them), $D$ is a delay, and $y=Sx$ is a system, what we have said is

$$ D \circ S = S \circ D $$

$$ T \circ S = S \circ T $$

which emphasizes a nice symmetry, and is suggestive of how invariance is a cool and interesting concept. The fact that we use it here to constrain the behavior of a system is typical of its usefulness. The takeaway is that it's worth being on the lookout for these types of things.

Finally, since the system operators $S$ for any LTI systems commute with these operators, compositions of such systems obey the same properties. They will be linear and time invariant since we can move $D$ and $T$ operators in and out of both systems together just as well as we can move them in and out of each individually.
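The claim that compositions inherit time invariance can be spot-checked numerically. The following is a minimal sketch, not a proof: the two systems (`gain` and `echo`), the test signal, and the tolerance are all made up for illustration, and signals are represented as plain Python functions of $t$.

```python
import math

def shift(x, tau):
    """Delay operator D_tau: returns the signal t -> x(t - tau)."""
    return lambda t: x(t - tau)

def gain(x):
    """Illustrative LTI system: multiply the signal by a constant."""
    return lambda t: 3.0 * x(t)

def echo(x):
    """Illustrative LTI system: add a half-strength copy delayed by one time unit."""
    return lambda t: x(t) + 0.5 * x(t - 1.0)

def compose(s1, s2):
    """Composite system: apply s2 first, then s1."""
    return lambda x: s1(s2(x))

def commutes_with_shift(system, x, tau, times, tol=1e-9):
    """Check D_tau(S x) == S(D_tau x) at the sample times."""
    lhs = shift(system(x), tau)
    rhs = system(shift(x, tau))
    return all(abs(lhs(t) - rhs(t)) < tol for t in times)

x = lambda t: math.sin(t) + 0.2 * t * t      # arbitrary test signal
times = [0.1 * k for k in range(-50, 51)]
cascade = compose(gain, echo)

print(commutes_with_shift(gain, x, 0.7, times))
print(commutes_with_shift(echo, x, 0.7, times))
print(commutes_with_shift(cascade, x, 0.7, times))
```

Passing on one test signal only shows the check is consistent with the argument above; the argument itself is what covers every input.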


### A few examples ###
- A linear system which is time invariant: $Sx = ax $
- A linear system which is not time invariant: $Sx = tx $
- A nonlinear system which is time invariant: $Sx = x^2 $
- A nonlinear system which is not time invariant: $ Sx = t + x^2 $

The first above is linear because substituting $x \mapsto c_1x_1 + c_2x_2$ gives $c_1Sx_{1} + c_2Sx_{2}$. The second is not time invariant because the system has an internal clock; substituting $x(t) \mapsto x(t-\tau)$ does not give $y(t-\tau)$.

- Another linear system which is time invariant: $Sx = \frac{d}{dt}x$

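Derivatives are linear because they're (limits of) tiny subtractions, and subtraction is linear. They're time invariant because differentiating a shifted signal gives the shifted derivative.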
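Both properties can be tested numerically on a handful of systems. This is only a sketch under assumptions: the five systems below are illustrative choices (scaling, multiplication by $t$, squaring, $t + x^2$, and a central-difference stand-in for $\frac{d}{dt}$), and a single pair of test signals can expose a violation but cannot prove a property holds in general.

```python
import math

H = 0.01  # step for the central-difference approximation of d/dt

def shift(x, tau):
    """Delay operator: t -> x(t - tau)."""
    return lambda t: x(t - tau)

# Illustrative systems; each maps a signal (a function of t) to a signal.
SYSTEMS = {
    "a*x": lambda x: (lambda t: 2.0 * x(t)),
    "t*x": lambda x: (lambda t: t * x(t)),
    "x^2": lambda x: (lambda t: x(t) ** 2),
    "t + x^2": lambda x: (lambda t: t + x(t) ** 2),
    "d/dt": lambda x: (lambda t: (x(t + H) - x(t - H)) / (2.0 * H)),
}

TIMES = [0.25 * k for k in range(-20, 21)]

def same(f, g, tol=1e-9):
    """True if two signals agree at every sample time."""
    return all(abs(f(t) - g(t)) < tol for t in TIMES)

def is_linear(S):
    """Compare S(c1*x1 + c2*x2) with c1*S(x1) + c2*S(x2)."""
    x1, x2, c1, c2 = math.sin, math.exp, 1.5, -0.5
    mix = lambda t: c1 * x1(t) + c2 * x2(t)
    y1, y2 = S(x1), S(x2)
    return same(S(mix), lambda t: c1 * y1(t) + c2 * y2(t))

def is_time_invariant(S, tau=0.6):
    """Compare D_tau(S x) with S(D_tau x)."""
    x = math.sin
    return same(shift(S(x), tau), S(shift(x, tau)))

for name, S in SYSTEMS.items():
    print(name, "linear:", is_linear(S), "time invariant:", is_time_invariant(S))
```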

## Characterizing LTI systems ##
### Eigenfunctions of an LTI system ###
A **transfer function**, or **response function**, is a function describing the input-output relation of a system. LTI systems have some easy to characterize response functions.

The simplest possible response function would be multiplication by a number. We know it distributes over addition and commutes with scalar multiplication, satisfying linearity, and that if the multiplier doesn't depend on time it will be time invariant. This leads us to inspect the operator equation

$$ Ax = ax $$

which is an eigenvalue equation for $A$. It's worth pointing out for clarity's sake that matrices are doing multiple forms of work here. In the above, $x$ is a function, i.e. an infinite dimensional vector with dimensions indexed by time. $A$ is an infinite dimensional linear operator. But we treat this just like the case where $x$ is a finite dimensional vector, so we can think about the eigenvalue equation in the same way. The multiple duties of $A$ arise if we take $x$ to be a vector of functions, in which case $A$ describes the coupling between functions which determines the output. Right now we care about $x$ being *one* function, and $A$ being the simple corresponding operation on that function. For an LTI system, there are a couple of ways to get at system eigenfunctions. Here's a bottom up version.

### Eigenfunctions of the shift operator are eigenfunctions of the system ###
Looking at a few properties of the system, one gets a suspicion for what eigenfunctions it might have. Note that if $x$ is an eigenfunction of $S$, we have

$$ Sx = ax \Rightarrow DSx = SDx \Rightarrow ax(t-\tau) = Sx(t-\tau) $$

so that $x(t-\tau)$ is also an eigenvector, with the same eigenvalue $a$. The simplest possible case would be for these eigenvectors to have the same functional form, from the perspective of $S$, which would be very convenient. Being "the same" in terms of $S$ basically means one being a multiple of the other, since $S$ is a homogeneous system. In that case, we would have

$$ x(t-\tau) = Dx = b(\tau)x(t) $$

which is getting highly suggestive. It seems like we want a function that converts addition into multiplication, so that a shift of the argument becomes a multiplicative factor. Exponentials are the prototypes for this, so we consider $x = \exp(\lambda t)$, for which $b(\tau) = \exp(-\lambda \tau)$. It turns out that the symmetry between $b$ and $x$ (both are exponentials) is crucial for this supposition to work. It's not obvious how to verify that $x$ is an eigenfunction for $S$ in a generic way, but the shared form alleviates this. Observe that

$$ Dy = DSx = SDx = Sbx = bSx = by $$

We've now said that if $x$ takes this form, $y$ is an eigenfunction of $D$ with the same eigenvalue. This can be a bit of a red herring, since eigenfunctions having identical eigenvalues doesn't imply they're the same. Every vector is an eigenvector of the identity operator, and all of them have the same eigenvalue, for instance. Here's where the symmetry of $b$ and $x$ comes in, however. Look at $t = 0$:

$$ y(t-\tau) = b(\tau)y(t) \Rightarrow y(-\tau) = b(\tau)y(0) $$

So $y$ *does* in fact have the same functional form as $x$: renaming $-\tau$ as $t$ gives $y(t) = y(0)b(-t)$, and $b(-t) = e^{\lambda t}$ when $x = e^{\lambda t}$. Our eigenvectors of $D$ are therefore eigenvectors of $S$. This looks to me like a generalization of finite dimensional commuting operators being simultaneously diagonalizable, but in an infinite dimensional space. I haven't seen a general proof of this for Hilbert spaces, although presumably there is one, given some sort of niceness constraints. Anyway, google simultaneous diagonalization if this symmetry stuff is starting to sound super cool and important.
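The chain of claims above can be checked on a concrete case. The sketch below assumes a made-up LTI system (a weighted sum of delayed copies of the input) and an arbitrary complex rate `lam`; it verifies that $e^{\lambda t}$ is an eigenfunction of the shift with eigenvalue $e^{-\lambda \tau}$, that it is an eigenfunction of the system, and that the output $y$ obeys the same shift relation.

```python
import cmath

lam = complex(-0.3, 2.0)             # arbitrary complex rate (an assumption)
x = lambda t: cmath.exp(lam * t)     # candidate eigenfunction

def shift(f, tau):
    """Delay operator D_tau: t -> f(t - tau)."""
    return lambda t: f(t - tau)

def system(f):
    """Hypothetical LTI system: a weighted sum of delayed copies of the input."""
    return lambda t: 0.5 * f(t) + 0.3 * f(t - 1.0) + 0.2 * f(t - 2.5)

tau = 0.8
b = cmath.exp(-lam * tau)            # predicted eigenvalue of D_tau
y = system(x)
a = y(0.0) / x(0.0)                  # eigenvalue of S, read off at t = 0

times = [0.5 * k for k in range(-6, 7)]
shift_ok = all(abs(shift(x, tau)(t) - b * x(t)) < 1e-9 for t in times)   # Dx = bx
system_ok = all(abs(y(t) - a * x(t)) < 1e-9 for t in times)              # Sx = ax
output_ok = all(abs(shift(y, tau)(t) - b * y(t)) < 1e-9 for t in times)  # Dy = by

print(shift_ok, system_ok, output_ok)
```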


### Frequency response ###
That exponentials are eigenfunctions of LTI systems is especially convenient, because the ones with purely imaginary exponents, which neither grow nor decay, form one of the most natural function space bases to work with. This is the Fourier basis, which has elements given as

$$ e_{\omega} = \cos(\omega t) + i\sin(\omega t) = e^{i\omega t} $$

Why is this a basis? For $\omega \neq \phi$ we have inner products (over time, with constant function weighting):

$$ \int \cos(\omega t)\sin(\omega t) \, dt = \int \cos(\omega t)\sin(\phi t) \, dt = 0 $$
$$ \int \cos(\omega t)\cos(\phi t) \, dt = \frac{1}{2}\int \cos((\omega - \phi) t) \, dt + \frac{1}{2}\int \cos((\omega + \phi) t) \, dt = 0 $$

where the former are zero because sine is antisymmetric about the origin. That is, draw the graph of sine and note that things before zero are the negative of things after zero. As for the square terms, we have

$$ \int_{-T}^{T} f^{2}(\omega t) \, dt  = \mathcal{O}(T) $$
$$ f \in \{\sin, \cos\} $$

where $[-T,T]$ is the integration window. There are some technicalities associated with normalizing this, but they don't matter practically, since we don't actually deal with infinite time windows. As for theoretically, they impose the constraint that any function you want to project onto Fourier basis elements needs to go to zero at plus/minus infinity. Don't worry about that though, and if the projection part is unclear, just wait a few paragraphs.
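The orthogonality statements above are easy to probe numerically. The following sketch uses a hand-rolled trapezoid rule and two arbitrarily chosen frequencies (both assumptions made for illustration). Over a finite window the cross terms stay bounded while the square terms grow like $T$, so after dividing by the window length $2T$ the cross terms shrink toward zero and the square term approaches $1/2$.

```python
import math

def integrate(f, T, n=20000):
    """Trapezoid rule for the integral of f over [-T, T]."""
    h = 2.0 * T / n
    total = 0.5 * (f(-T) + f(T))
    for k in range(1, n):
        total += f(-T + k * h)
    return total * h

omega, phi = 2.0, 3.0   # two distinct, arbitrarily chosen frequencies

def averages(T):
    """Window-averaged inner products: (cos*sin, cos*cos, cos^2)."""
    cs = integrate(lambda t: math.cos(omega * t) * math.sin(phi * t), T)
    cc = integrate(lambda t: math.cos(omega * t) * math.cos(phi * t), T)
    sq = integrate(lambda t: math.cos(omega * t) ** 2, T)
    return cs / (2 * T), cc / (2 * T), sq / (2 * T)

for T in (10.0, 100.0):
    print(T, averages(T))
```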

Now one might ask, what's the point of using the complex version rather than the real basis? Basically, packaging. When we compute things like the power at a given frequency we take the norm of the Fourier coefficients anyway.

Getting back to the point, we can decompose functions into sines and cosines, using the standard basis expansion technique which is to take inner products with all the basis elements. This will give a number to attach to each basis element. If we send this signal, as a sum, to our system, we get

$$ x(t) = \int a_{\omega}e^{i\omega t} \, d\omega \Rightarrow y(t) = Sx = \int H(\omega)a_{\omega}e^{i\omega t} \, d\omega $$

The complex numbers $H(\omega)$, which depend on frequency, are just the eigenvalues for each exponential eigenfunction which we found above. In this context, they are called **frequency response functions**. What we have just shown is that **frequency response functions fully characterize an LTI system**. It should also be clear that the same recipe, expanding a signal in basis elements and characterizing the system's response to each element, gives a response function for any choice of basis. For example, one could expand the time varying signal $x$ in a set of orthogonal polynomials, such as the Legendre polynomials, and get a set of response functions for those. It isn't always practical to use a given basis in different situations, but the symmetries of the problem (in this case time invariance) have, in principle, something to tell you about what an especially appropriate one might be.
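Here is a small numerical illustration of the frequency response idea, with a finite sum standing in for the integral. Everything concrete in it is an assumption chosen for the example: the system is a made-up weighted sum of delays, `H` is its frequency response worked out by hand from those delays, and `coeffs` holds a few arbitrary Fourier coefficients.

```python
import cmath

def system(f):
    """Hypothetical LTI system: weighted sum of delayed copies of the input."""
    return lambda t: 0.5 * f(t) + 0.3 * f(t - 1.0) + 0.2 * f(t - 2.5)

def H(omega):
    """Frequency response of `system`: the eigenvalue attached to exp(i*omega*t)."""
    return (0.5
            + 0.3 * cmath.exp(-1j * omega * 1.0)
            + 0.2 * cmath.exp(-1j * omega * 2.5))

# A signal built from a few Fourier components: frequency -> coefficient a_omega.
coeffs = {0.0: 1.0, 1.5: 0.5 - 0.25j, -1.5: 0.5 + 0.25j, 4.0: 0.1j, -4.0: -0.1j}

def x(t):
    """Input: sum of a_omega * exp(i*omega*t)."""
    return sum(a * cmath.exp(1j * w * t) for w, a in coeffs.items())

def y_predicted(t):
    """Output predicted by scaling each component by H(omega)."""
    return sum(H(w) * a * cmath.exp(1j * w * t) for w, a in coeffs.items())

y_direct = system(x)   # output computed by actually running the system
times = [0.3 * k for k in range(-10, 11)]
max_err = max(abs(y_direct(t) - y_predicted(t)) for t in times)
print(max_err)
```

The two outputs agree to rounding error, which is the sense in which knowing $H(\omega)$ is enough to know the system.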

In our case, another obvious choice of basis to look at is just the extension of the standard one-hot vector basis into infinite dimensions.

### Impulse response functions ###
In a finite dimensional setting we usually choose a particular basis of one-hot vectors as

$$ e_{n} = \begin{bmatrix}0, 0, \dots, 0, 1, 0, \dots, 0 \end{bmatrix}^{T}$$

where the nth entry is 1. As a side note, any orthonormal basis can actually be represented this way, so we're actually choosing a representation rather than an actual basis. The above could stand for the Fourier basis on a finite dimensional space for example. Individual vectors would be discretized sines / cosines. In two dimensions, we would have [-1,1] and [1,1] (but appropriately normalized). An orthogonal matrix would transform these to the standard basis, and the same orthogonal matrix would give us our original vectors in terms of our new sines and cosines written as one-hots. Then the coefficients of a vector would represent its sine and cosine (antisymmetric and symmetric, since dim-2) aspects. If that doesn't make sense because it sounds wrong, let me know, whereas if that doesn't make sense because it doesn't sound meaningful, don't worry about it...
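The two dimensional case can be spelled out concretely. In this sketch the matrix `Q` (rows are the normalized vectors $[-1,1]$ and $[1,1]$) and the test vector `v` are illustrative choices; the point is that `Q` sends the two basis vectors to one-hots, and applying it to any vector gives that vector's antisymmetric and symmetric coefficients.

```python
import math

s = 1.0 / math.sqrt(2.0)
# Rows are the normalized basis vectors [-1, 1] (antisymmetric) and [1, 1] (symmetric).
Q = [[-s, s],
     [s, s]]

def matvec(M, v):
    """Matrix-vector product for small lists of lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    """Transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*M)]

v = [3.0, 1.0]
coeffs = matvec(Q, v)                    # [antisymmetric part, symmetric part]
rebuilt = matvec(transpose(Q), coeffs)   # back to the original coordinates
one_hots = [matvec(Q, row) for row in Q] # basis vectors expressed in the new basis

print(coeffs)
print(rebuilt)
print(one_hots)
```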

The infinite dimensional analogue of the one-hot vector basis is the Dirac delta basis. This is basically because inner products go from being sums to being integrals. An inner product with one of the one-hot vectors is just the element in that spot of the vector you're taking a product with. The Dirac delta is defined so as to pick out function values at specific times, which are the analogues of indexes, under integration. So we can represent a function as

$$ x(t) = \int x(\tau)\delta(t-\tau) d\tau $$

in the same way we would represent an element of a vector as the whole vector's inner product with a one-hot basis element. This representation is an instance of a more generic operation called a **convolution**, where $x$ is convolved with a delta function. Since our system is **linear**, it commutes compositionally with the integral. Since the system is **time invariant** the response to every delta function is the same, but shifted. This shows that

$$ y(t) = S \circ x(t) = \int x(\tau)D_{\tau}S\delta(t) \, d\tau \equiv \int x(\tau)D_{\tau}h(t) \, d\tau = \int x(\tau)h(t-\tau) \, d\tau $$

where $h(t)$ is *defined* as the function for which these equations hold, and is called the **impulse response function**. The name refers to the fact that the delta function represents an infinitely short ("impulse") input to the system. 
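A discrete-time version makes the impulse response argument easy to test. The sketch below assumes a made-up causal system (a leaky accumulator started at rest) and an arbitrary input; a one-hot sequence plays the role of the delta function, and the sum over $k$ plays the role of the integral over $\tau$.

```python
def system(x):
    """Hypothetical discrete LTI system: y[n] = 0.6 * y[n-1] + x[n], starting at rest."""
    y, prev = [], 0.0
    for value in x:
        prev = 0.6 * prev + value
        y.append(prev)
    return y

def convolve(x, h):
    """Causal discrete convolution, truncated to len(x) samples."""
    return [sum(x[k] * h[n - k] for k in range(n + 1)) for n in range(len(x))]

N = 12
delta = [1.0] + [0.0] * (N - 1)     # discrete impulse (a one-hot vector)
h = system(delta)                   # impulse response of the system
x = [1.0, -2.0, 0.5, 0.0, 3.0, 1.0, 0.0, 0.0, -1.0, 2.0, 0.0, 0.5]

y_direct = system(x)                # run the system on the input
y_conv = convolve(x, h)             # convolve the input with the impulse response
max_err = max(abs(p - q) for p, q in zip(y_direct, y_conv))
print(max_err)
```

The system is only ever probed with a single impulse, yet the convolution reproduces its response to an unrelated input.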

## Integrating factors ##
The one dimensional linear dynamical system 

$$ \dot{x} = ax + u $$

can be solved with a variety of methods. The canonical method makes use of integrating factors. This is just the transformation of a function which is not a derivative of some other function into one that is. For example

$$\frac{d}{dt}(xy) = \dot{x}y + x\dot{y}$$

implies that functions with the form on the r.h.s. can be integrated easily. The main upshot of this approach is that functions which are derivatives of other functions have path independent integrals. This is called the fundamental theorem of calculus. Returning to the dynamical system, we have

\begin{align*}
\dot{x} - ax &= u \\
f\dot{x} - fax &= fu \\
\frac{d}{dt}(fx) &= fu  \\
fx &= \int fu \, dt + c  \\
x &= f^{-1} \left( \int fu \, dt + c \right)
\end{align*}

for a function $f$ such that 

\begin{align*}
\dot{f} &= -af \\
\frac{\dot{f}}{f} &= -a \\
\log{f} &= -\int a \, dt + c \\
f(t) &= c\exp\left(-\int a(t) \, dt\right)
\end{align*}

Now in particular, if we have a time invariant system, $a$ is a constant. This means that

$$ f(t) = ce^{-at} $$

and consequently

$$ x = x(0)e^{at} + e^{at} \int_{0}^{t} e^{-a \tau}u(\tau) d\tau $$

rearranging slightly, and assuming the system starts at rest so that $x(0) = 0$ (otherwise the map from $u$ to $x$ would not be linear),

$$ x(t) = \int_{0}^{t} e^{a(t- \tau)}u(\tau) d\tau $$

which shows that **linear dynamical systems are convolutions with an exponential filter**. Now it's worth noting that all the mechanics of this equation are the same if we take 

$$ a \mapsto A $$
$$ u \mapsto Bu $$

with x, u as vectors. In this case we have to interpret a matrix exponential, which I will leave until next time (along with the low-pass filtering property).
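For the scalar case, the convolution formula can be compared against a direct simulation. This is a sketch under assumptions: the rate `a`, the input `u`, the step sizes, and the use of forward Euler and the trapezoid rule are all illustrative choices, and the system is started at rest with $x(0) = 0$.

```python
import math

a = -1.5                                  # assumed (stable) rate constant
u = lambda t: math.sin(2.0 * t) + 1.0     # assumed input signal

def euler(t_end, dt=1e-4):
    """Integrate x' = a*x + u from x(0) = 0 with forward Euler."""
    x, t = 0.0, 0.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        x += dt * (a * x + u(t))
        t += dt
    return x

def convolution(t_end, n=20000):
    """Trapezoid rule for the integral of exp(a*(t - tau)) * u(tau) over [0, t]."""
    h = t_end / n
    g = lambda tau: math.exp(a * (t_end - tau)) * u(tau)
    total = 0.5 * (g(0.0) + g(t_end))
    for k in range(1, n):
        total += g(k * h)
    return total * h

x_euler = euler(3.0)
x_conv = convolution(3.0)
print(x_euler, x_conv, abs(x_euler - x_conv))
```

The two numbers agree up to the discretization error of the Euler step, which shrinks as `dt` is reduced.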


## Applications ##
...will be elaborated on in the future.