In [None]:
from resources.workspace import *

In the [previous tutorial](T3%20-%20Univariate%20Kalman%20filtering.ipynb) we saw that the Kalman filter (KF) produces reasonable results for straight lines (in so far as linear regression does!).
What about more intricate time series?

### The model
The straight line (of the previous tutorial) could result from discretizing the model
\begin{align*}
\frac{dx}{dt} &= a \, , \\
x(0) &= 0 \, ,
\end{align*}
using `dt = 1`.
Now, instead, we're going to consider the model
$$ \frac{d^M x}{dt^M} = 0 \, .$$

This can be written as a first-order vector ODE (i.e. a coupled system of ODEs):
$$ \frac{d x^m}{dt} = x^{m+1} \, , \quad \frac{d x^M}{dt} = 0 \, ,$$
where the superscript $m = 1,\ldots,M$ indexes the elements of the state vector, and the first equation holds for $m < M$.

To make it more interesting, we'll add two terms to this evolution model:  
 - damping: $\beta x^m$, with $\beta < 0$;
 - noise: $\frac{d q^m}{dt}$.  

Thus,
$$ \frac{d x^m}{dt} = \beta x^m + x^{m+1} + \frac{d q^m}{dt} \, ,$$
where $q^m$ is the noise process, and $\beta = \log(0.9)$.

Discretized, with a time step `dt=1`, this yields
$$ x^m_{k+1} = 0.9 x^m_k + x^{m+1}_k + q^m_k \, .$$

In summary, $\mathbf{x}_{k+1} = \mathbf{F} \mathbf{x}_k + \mathbf{q}_k$, with $\mathbf{F}$ as below.

In [None]:
M = 4 # model order (and also ndim)
F_matrix = 0.9*eye(M) + diag(ones(M-1),1)
print(F_matrix)
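
As a quick sanity check of the discretization, the sketch below rebuilds the same matrix with explicit `numpy` names and applies one noise-free step. The names `decay`, `F`, `x`, and `x_next` are illustrative only and not part of DAPPER; the test state is arbitrary.

```python
import numpy as np

M = 4                        # model order, as in the cell above
beta = np.log(0.9)           # damping coefficient from the text
decay = np.exp(beta * 1.0)   # damping factor over one step with dt=1

# Same construction as F_matrix, written with explicit numpy names
F = decay*np.eye(M) + np.diag(np.ones(M-1), 1)

# F is upper triangular, so its eigenvalues are its diagonal entries
spectral_radius = np.max(np.abs(np.diag(F)))

# One noise-free step from an arbitrary (illustrative) state
x = np.array([1.0, 2.0, 3.0, 4.0])
x_next = F @ x
print(decay, spectral_radius)
print(x_next)
```

Since every eigenvalue has modulus below 1, the noise-free system decays towards zero in the long run, which is the role of the damping term.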

### Estimation by the Kalman filter (and smoother) with DAPPER

Note that this is an $M$-dimensional time series. 
However, we'll only observe the first (0th) component.
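
To make "observing only the 0th component" concrete, here is a minimal sketch of a selection matrix. It assumes the observation operator acts as a plain row-selection of the state; `H`, `x`, and `y` are illustrative names, and this is not necessarily how DAPPER implements `partial_direct_Obs`.

```python
import numpy as np

M = 4
# Row 0 of the identity matrix: a (1, M) matrix that selects x[0]
H = np.eye(M)[[0]]

x = np.array([1.0, 2.0, 3.0, 4.0])  # arbitrary illustrative state
y = H @ x                           # noise-free observation: picks out x[0]
print(H)
print(y)
```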

We shall not write the code for the multivariate Kalman filter,
because it already exists in DAPPER in `da_methods.py` and is called `ExtKF()`.

The following code configures an experiment based on the above model. Don't worry about the specifics. We'll get back to how to use DAPPER later.


In [None]:
# Forecast dynamics
Dyn = linear_model_setup(F_matrix)
Dyn['noise'] = 0.0001*(1+arange(M))

# Initial conditions
X0 = GaussRV(M=M,C=0.02*arange(M))

# Observe 0th component only
Obs = partial_direct_Obs(M,[0])
Obs['noise'] = 1000

# Time settings
t = Chronology(dt=1,dtObs=5,K=250)

# Wrap-up
HMM = HiddenMarkovModel(Dyn,Obs,t,X0)

The following generates (simulates) a synthetic truth (`xx`) and observations (`yy`):

In [None]:
xx,yy = simulate(HMM)
for m,x in enumerate(xx.T):
    plt.plot(x,label="x^%d"%m)
plt.legend();

Now we'll run assimilation methods on the data. Firstly, the KF, available as `ExtKF` in DAPPER:

In [None]:
stats_KF = ExtKF(store_u=1).assimilate(HMM,xx,yy)

We'll also run the "Kalman smoother" available as `ExtRTS`.
Without going into details, this method is based on the Kalman *filter* but,
being a *smoother*,
it also goes backwards and updates previous estimates with future (relatively speaking) observations.

In [None]:
stats_KS = ExtRTS(store_u=1).assimilate(HMM,xx,yy)
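
For intuition about the backward pass, here is a minimal scalar sketch of a Kalman filter followed by a Rauch-Tung-Striebel (RTS) smoother. It is an illustration under simplifying assumptions (scalar state, direct observation at every step), not DAPPER's `ExtRTS`; the function name and all parameter values are hypothetical.

```python
import numpy as np

def kalman_filter_and_rts(y, F, Q, R, m0, P0):
    """Scalar KF (forward pass) followed by RTS smoother (backward pass)."""
    K = len(y)
    mf, Pf = np.zeros(K), np.zeros(K)  # forecast mean and variance
    ma, Pa = np.zeros(K), np.zeros(K)  # analysis mean and variance
    m, P = m0, P0
    for k in range(K):
        # Forecast step
        mf[k] = F*m
        Pf[k] = F*P*F + Q
        # Analysis step, with Kalman gain G
        G = Pf[k] / (Pf[k] + R)
        ma[k] = mf[k] + G*(y[k] - mf[k])
        Pa[k] = (1 - G)*Pf[k]
        m, P = ma[k], Pa[k]
    # Backward pass: correct earlier analyses using later information
    ms, Ps = ma.copy(), Pa.copy()
    for k in range(K-2, -1, -1):
        J = Pa[k]*F / Pf[k+1]          # smoother gain
        ms[k] = ma[k] + J*(ms[k+1] - mf[k+1])
        Ps[k] = Pa[k] + J**2*(Ps[k+1] - Pf[k+1])
    return ma, Pa, ms, Ps

# Illustrative synthetic experiment (all values are arbitrary assumptions)
rng = np.random.default_rng(0)
F, Q, R, m0, P0, K = 0.9, 0.1, 1.0, 0.0, 1.0, 100
x = np.zeros(K)
x_prev = rng.normal(m0, np.sqrt(P0))
for k in range(K):
    x[k] = F*x_prev + rng.normal(scale=np.sqrt(Q))
    x_prev = x[k]
y = x + rng.normal(scale=np.sqrt(R), size=K)

ma, Pa, ms, Ps = kalman_filter_and_rts(y, F, Q, R, m0, P0)
print("Mean filter variance:  ", Pa.mean())
print("Mean smoother variance:", Ps.mean())
```

The smoothed variance never exceeds the filter variance, and the two coincide at the final time, where no future observations are available.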

### Estimation by "time series analysis"
The following methods perform time series analysis of the observations, and are mainly derived from signal processing theory.
Since derivatives can be approximated by finite differences, it is plausible that the above model could also be written as an AR(M) process. Thus these methods should perform quite well.
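
The link to AR(M) can be checked for the undamped, noise-free model $d^M x / dt^M = 0$: setting the $M$-th backward difference to zero gives a recursion in the $M$ previous values, with binomial coefficients. The sketch below verifies this on a cubic polynomial; the names `ar_coeffs` and `pred` are illustrative, and the damped, noisy model of this tutorial would have different coefficients.

```python
import numpy as np
from math import comb

M = 4
# From sum_{j=0}^{M} (-1)^j C(M,j) x[k-j] = 0, solve for x[k]
ar_coeffs = np.array([(-1)**(j+1) * comb(M, j) for j in range(1, M+1)])
print(ar_coeffs)

# A cubic polynomial sampled at integer times satisfies d^4 x / dt^4 = 0
k = np.arange(10)
x = 2.0 - k + 0.5*k**2 + 0.1*k**3

# Predict each x[i] from its M predecessors, ordered x[i-1], ..., x[i-M]
pred = np.array([ar_coeffs @ x[i-M:i][::-1] for i in range(M, len(x))])
print(np.allclose(pred, x[M:]))
```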

In [None]:
# Tools
import scipy.signal as sp_sp
normalize = lambda x: x / x.sum()
truncate  = lambda x,n: np.hstack([x[:n],zeros(len(x)-n)])

# We only estimate the 0-th component.
signal = yy[:,0]

# Estimated signals
ESig = {} 
ESig['Gaussian']       = sp_sp.convolve(signal, normalize(sp_sp.windows.gaussian(30,3)),'same')
ESig['Wiener']         = sp_sp.wiener(signal)
ESig['Butter']         = sp_sp.filtfilt(*sp_sp.butter(10, 0.12), signal, padlen=len(signal)//10)
ESig['Spline']         = sp.interpolate.UnivariateSpline(t.kkObs,signal,s=1e4)(t.kkObs)
ESig['Trunc. Fourier'] = np.fft.irfft(truncate(np.fft.rfft(signal),len(signal)//14), n=len(signal))
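
To see what one of these estimators does in isolation, the following self-contained sketch applies the truncated-Fourier idea to a synthetic noisy sine wave. The test signal, noise level, and variable names are assumptions made for illustration and are unrelated to the experiment above.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200
k = np.arange(N)
clean = np.sin(2*np.pi*3*k/N)        # exactly 3 periods: a single low Fourier bin
noisy = clean + rng.normal(size=N)   # unit-variance white noise

def truncate(coeffs, n):
    """Keep the first n Fourier coefficients and zero the rest."""
    out = np.zeros_like(coeffs)
    out[:n] = coeffs[:n]
    return out

n_keep = N // 14                     # same ratio as in the cell above
smooth = np.fft.irfft(truncate(np.fft.rfft(noisy), n_keep), n=N)

err_noisy  = np.mean(np.abs(noisy  - clean))
err_smooth = np.mean(np.abs(smooth - clean))
print("Error before filtering:", err_noisy)
print("Error after filtering: ", err_smooth)
```

The cutoff `n_keep` is a tuning parameter: too small and the signal itself is removed, too large and most of the noise passes through.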

### Comparison
The following code plots the results.

In [None]:
%matplotlib notebook

@interact(Visible=SelectMultiple(options=['Truth','Kalman smoother','Kalman filter','My Method']+list(ESig)))
def plot_results(Visible):
    plt.figure(figsize=(9,5))
    plt.plot(t.kkObs,yy,'k.',alpha=0.4,label="Obs")
    if 'Truth'           in Visible: plt.plot(t.kk   ,xx[:,0]           ,'k',label="Truth")
    if 'Kalman smoother' in Visible: plt.plot(t.kk   ,stats_KS.mu.u[:,0],'m',label="K. smoother")
    #if'Kalman filter u' in Visible: plt.plot(t.kk   ,stats_KF.mu.u[:,0],'b',label="K. filter (u)")
    #if'Kalman filter f' in Visible: plt.plot(t.kkObs,stats_KF.mu.f[:,0],'b',label="K. filter (f)")
    #if'Kalman filter a' in Visible: plt.plot(t.kkObs,stats_KF.mu.a[:,0],'b',label="K. filter (a)")
    if 'Kalman filter'   in Visible:
        pw_xxf, pw_xxa = weave_fa(stats_KF.mu.f[:,0],stats_KF.mu.a[:,0])
        pw_kkf, pw_kka = weave_fa(t.kkObs)
        plt.plot(pw_kkf,pw_xxf,'b',label="KF. forecast")
        plt.plot(pw_kka,pw_xxa,'c',label="KF. analysis")
    
    if 'My Method' in Visible and 'stats_MM' in globals():
        pw_xxf, pw_xxa = weave_fa(stats_MM.mu.f[:,0],stats_MM.mu.a[:,0])
        pw_kkf, pw_kka = weave_fa(t.kkObs)
        plt.plot(pw_kkf,pw_xxf,'y',label=stats_MM.config.da_method.__name__+" forecast")
        plt.plot(pw_kka,pw_xxa,'g',label=stats_MM.config.da_method.__name__+" analysis")
    
    for method, estimate in ESig.items():
        if method in Visible: plt.plot(t.kkObs, estimate,label=method)
    
    plt.ylabel(r'$x^0$, $y$, and $\hat{x}^0$')
    plt.xlabel('Time index ($k$)')
    plt.legend()
    plt.show()

Visually, it's hard to imagine better performance than from the Kalman smoother.
However, recall the advantage of the Kalman filter (and smoother): *they know the forecast model that generated the truth*.

Moreover, since the noise levels Q and R are given to the DA methods (though not the actual realizations of the random noises), they do not need any *tuning*, unlike signal processing filters. Nor do they require choosing among the myriad of signal processing filters [out there](https://docs.scipy.org/doc/scipy/reference/signal.html#module-scipy.signal).

In [None]:
def average_error(estimate_at_obs_times):
    return np.mean(np.abs(xx[t.kkObs,0] - estimate_at_obs_times))

for method, estimate in ESig.items():
    print(method   , average_error(estimate))
print('K. smoother', average_error(stats_KS.mu.u[t.kkObs,0]))
print('K. filter'  , average_error(stats_KF.mu.a[:,0]))
# print('My Method', average_error(stats_MM.mu.a[:,0])) # uncomment after Exc 8

**Exc 2:** Theoretically, in the long run, the Kalman smoother should yield the optimal result. Verify this by increasing the experiment length to `K=10**4`.

**Exc 4:** Re-run the experiment with different parameters, for example the observation noise strength or the observation interval `dtObs`.  
[Results will differ even if you change nothing, because the truth and observation noises are stochastic.]

**Exc 6:** Right before executing the assimilations (but after simulating the truth and obs), change $R$ by inserting:

    HMM.h.noise = GaussRV(C=0.01*eye(1))
    
What happens to the estimates of the Kalman filter and smoother?

**Exc 8*:** Try out different methods from DAPPER by replacing `MyMethod` below with one of the following:
 - Climatology
 - Var3D
 - OptInterp
 - EnKF
 - EnKS
 - PartFilt

You typically also need to set (and possibly tune) some method parameters. Otherwise you will get an error (or possibly the method will perform very badly). You may find (some) documentation for each method in its source code...

In [None]:
stats_MM = MyMethod(param1=val1,...).assimilate(HMM,xx,yy)

### Summary
Like linear regression, time series analysis is also a subset of state estimation and DA [(much of time series analysis can be formulated as state estimation)](https://www.google.com/search?q="We+now+demonstrate+how+to+put+these+models+into+state+space+form"). Moreover, DA methods produce uncertainty quantification, something which is usually more obscure with time series analysis methods. Still, the best is yet to come: DA methods should have the capacity to handle inhomogeneous, multivariate, sparsely observed, chaotic systems (which is more fun than stochastically-driven signals such as the above example).

### Next: [Multivariate Kalman](T5%20-%20Multivariate%20Kalman.ipynb)