# Geostatistics

## 3 Excursus: Time series

Time series analysis is a discipline of its own and is beyond the scope of this lecture. Nevertheless, a few basics will be covered that relate to geostatistics. We will cover:

* Describing a time series
* Time series decomposition

In [1]:
import pandas as pd
import numpy as np

from bokeh.plotting import output_notebook, show, figure
from bokeh.layouts import row, column, gridplot
from bokeh.models import HoverTool

output_notebook()

## 3.1 Describing a time series

A time series has two major features that we will look at. Both originate in the **ordered characteristic** of a time series. First we will look at the dynamics (the change of value over time) and second at the *autocorrelation*.

Load all data from the provided sample file and use the second column as the `data` object.

In [2]:
df = pd.read_csv('data/sample_data.txt', sep=r'\s+')  # raw string avoids an invalid-escape warning
data = df.iloc[:,1]

### 3.1.1 Dynamics

The dynamics of a time series describe its **change over time**.

For a series $x$ with observations $x_{t}, x_{t+1}, \ldots, x_{t+n}$, where $t$ is the starting point in time and $n$ the length of the time series, the dynamics of $x$, denoted $d_x$, can be defined as:

$$d_x = \frac{\partial x}{\partial t}$$

For the discrete case of a sample we can simply take the differences:

$$d_{x_i} = x_i - x_{i - 1}$$
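As a small illustration of the discrete definition, the following sketch applies it to a made-up series (the values are hypothetical and not taken from the sample file). It shows that `pandas.Series.diff` and `numpy.diff` compute the same differences. pandas keeps the original length and marks the first element, which has no predecessor, as `NaN`, while NumPy returns an array that is one element shorter.

```python
import numpy as np
import pandas as pd

# hypothetical toy series, not the lecture's sample data
x = pd.Series([2.0, 3.5, 3.0, 5.0, 4.0])

# pandas: d_i = x_i - x_{i-1}; the first entry has no predecessor and is NaN
d_pandas = x.diff()

# numpy: same differences, but the result is one element shorter
d_numpy = np.diff(x.values)

print(d_pandas.tolist())  # [nan, 1.5, -0.5, 2.0, -1.0]
print(d_numpy.tolist())   # [1.5, -0.5, 2.0, -1.0]
```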

In [3]:
# time series plot
ts = figure(title='time series', width=800, height=200)
ts_line = ts.line(data.index, data.values)

# dynamics plot
diff = figure(title='Time series - dynamics', width=800, height=200, x_range=ts.x_range)
diff_line = diff.line(data.diff().index, data.diff().values, color='firebrick', alpha=0.7)

# create the hover tool
h1 = HoverTool(renderers=[ts_line], tooltips=[('','@y')], mode='vline')
h2 = HoverTool(renderers=[diff_line], tooltips=[('','@y')], mode='vline')
ts.add_tools(h1)
diff.add_tools(h2)

In [4]:
show(gridplot([[ts], [diff]]))

From this we can observe several things:

* The time series is not stable over time: its level changes, which suggests a trend or a seasonality.
* The dynamics are much more stable: they vary within a range, but show no trend.
* From what we can see here, there seems to be a large random component in the time series.
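The contrast between a drifting series and its stable differences can be reproduced with synthetic data. The sketch below builds a hypothetical series from a linear trend plus Gaussian noise (slope, noise level and seed are arbitrary assumptions, not properties of the sample file) and compares the mean of the first and the second half, once for the series and once for its differences.

```python
import numpy as np

rng = np.random.default_rng(42)   # fixed seed, arbitrary choice
t = np.arange(1000)

# hypothetical series: linear trend (slope 0.05) plus random noise
x = 0.05 * t + rng.normal(0.0, 1.0, size=t.size)
d = np.diff(x)

half = t.size // 2
# the series itself drifts: the two halves have clearly different means
print(x[:half].mean(), x[half:].mean())
# the differences fluctuate around the slope in both halves
print(d[:half].mean(), d[half:].mean())
```

For a purely linear trend, differencing turns the trend into a constant offset (the slope), which is why the differenced series looks stable even though the original does not.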

### 3.1.2 Autocorrelation

The concept of autocorrelation can be used to determine how similar a time series is to itself.
To be more precise: **when** is it more self-similar and when is it not? To answer this, a measure of **correlation** is chosen.
The correlation between two samples measures their similarity.
The difference from the correlation calculated in the last lecture is that we calculate it over and over again. Each time, we shift the whole time series by one more time step (the size of the shift is called the **lag**). This way, we correlate the time series with **itself, but moments ago.**

In summary:

* Find out **when** a time series showed similar values
* Learn about periodicity
* Shift the time series by different time steps (*lags*)
* For discrete time series, use the normed covariance
* Visualize how the correlation relates to the lag

The *normed covariance* $\rho$ is defined as:

$$ \rho_{t_1, t_2} = \frac{\operatorname{Cov}(X_{t_1}, X_{t_2})}{\sigma_{t_1} \cdot \sigma_{t_2}} $$

for a time series $X$ at two points in time $t_1, t_2$ and their standard deviations $\sigma_{t_1}, \sigma_{t_2}$.

For **stationary** time series, $\sigma_{t_1} = \sigma_{t_2} = \sigma$, so the autocorrelation simplifies to the covariance divided by the variance $\sigma^2$.
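To illustrate how the normed covariance reveals periodicity, here is a minimal sketch on a synthetic sine wave with a period of 50 time steps (a hypothetical signal, not the sample data). The helper `autocorr_at_lag` is an illustrative name introduced here. It correlates only the overlapping parts of the series and its lagged copy using `np.corrcoef`, which is one of several common ways to handle the edges.

```python
import numpy as np

def autocorr_at_lag(x, lag):
    """Pearson correlation between x and itself shifted by `lag` steps.

    Only the overlapping part of both copies is used (no wrap-around).
    """
    x = np.asarray(x, dtype=float)
    if lag == 0:
        return 1.0
    return np.corrcoef(x[:-lag], x[lag:])[0, 1]

# hypothetical periodic signal: sine wave with a period of 50 time steps
t = np.arange(500)
x = np.sin(2 * np.pi * t / 50)

for lag in (0, 25, 50):
    print(lag, round(autocorr_at_lag(x, lag), 3))
```

At a lag of half a period (25 steps) the shifted copy is the mirror image of the original, so the correlation is close to -1. At a full period (50 steps) both copies coincide again and the correlation returns to about +1.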

In [5]:
n  = 500
ac = []
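for i in range(n):
    # shift by i places (the lag); the first i values are appended in reverse order
    shifted = np.concatenate((data.values[i:], data.values[:i][::-1]))
    # normed covariance: divide the covariance by the product of the standard deviations
    # (ddof=0 so that np.cov matches the default of np.std)
    cf = np.cov(data.values, shifted, ddof=0) / (np.std(data.values) * np.std(shifted))
    # equivalent: cf = np.corrcoef(data.values, shifted)
    # use either (1,0) or (0,1) of the resulting matrix
    ac.append(cf[0,1])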

*You will find an exercise to implement this yourself.*

In [6]:
autocorr = figure(title='Autocorrelation', width=800, height=350)

renderer = autocorr.line(range(n), ac, color='firebrick', alpha=0.7)
hover = HoverTool(renderers=[renderer], tooltips=[('Corr: ','@y')], mode='vline')

autocorr.add_tools(hover)

In [7]:
show(autocorr)