# Experimentation 1: Basic Properties of Time Series -- Stationarity
We will investigate basic statistical properties of time series together with the concept of stationarity.

The idea is to generate time series from random numbers drawn in Python and then investigate their statistical properties.
The introduction will provide you with additional information on helpful Python functionality.

Some of the results will be obvious, as we know the statistics of the random numbers used,
but the experiments should also show where the difference lies between *the average value of one random time series* and *the average of an ensemble of time series*.
This should give a better understanding of how statistical properties can describe an individual realisation of a time series as well as the set of all possible time series which might be realised by a specific process.

Please also look into the accompanying notes to get a better understanding of the concepts and functions used in this notebook.
(Please note that the accompanying notes may show up with a delay.)

And please use the forum on itslearning to discuss results, questions or what is unclear to you.

## What to do
For
- white noise and
- a random walk

please determine for one realisation of white noise and a random walk each
- mean & variance
- the autocorrelation
- and test stationarity with the Augmented Dickey-Fuller test

and for an ensemble of realisations (try 1000)
- the mean
- and the variance

over these different realisations (for each time point), i.e., creating the average time series.

## Discussion
Prepare for the discussion in class, which will take place either in groups or with your neighbours:
- How do mean & variance behave for white noise and a random walk? Where is the difference?
- Describe the autocorrelation function. What does the autocorrelation tell us about the signal?
- How do the statistics for the individual and the ensemble of realisations for white noise and random walks differ? What is noteworthy in your opinion? Did you expect this behaviour?
- How do the statistics of slices of the time series compare to the ensemble behaviours?
- Discuss the relation of the statistical measures in the different perspectives to the concept of stationarity. What does stationarity tell you about a time series?
  And how does the stationarity relate to the statistical observations collected above?

Feel free to use the forum on itslearning.

## Good to Know
- NumPy provides the mean & var functions, either directly in functional style (```np.mean(x)```) or object-oriented as member functions of its array class (```x.mean()```). Use whichever seems more descriptive to you.
- Statsmodels provides the function [adfuller](https://www.statsmodels.org/dev/generated/statsmodels.tsa.stattools.adfuller.html) for testing stationarity with the [Augmented Dickey-Fuller test](https://en.wikipedia.org/wiki/Augmented_Dickey%E2%80%93Fuller_test). The second entry of the returned result is the p-value. The null hypothesis of the test is that the time series has a unit root, i.e., is non-stationary. If the p-value is below a chosen threshold (0.05 is a common choice), we reject the null hypothesis and assume the time series to be stationary.
- Pandas provides ```pandas.plotting.autocorrelation_plot``` to directly plot the autocorrelation function.
An alternative is ```statsmodels.graphics.tsaplots.plot_acf```.
Please try both and find out which resulting graph you personally prefer, i.e., which can you read better.
But be warned that not every laptop will be able to complete both functions on the given data.
One is fast and the other isn't.
- To create a [random number generator with numpy](https://numpy.org/doc/stable/reference/random/index.html#random-quick-start), import ```default_rng``` from ```numpy.random``` and call it. Use the generator's normal(…) member function to draw a series of random values.
- To prevent multiple plots from ending up in the same figure, you can issue a ```plt.figure()``` before plotting, to make sure a new figure is created.
- Slicing is the Python feature used to access a part of an array. Try the following slicing example if you are not familiar with this feature.
```
d = np.linspace(1, 15, 15)
print('Length of time series in samples:', len(d))
print('First 5 samples:', d[:5])
print('Some 5 samples:', d[4:9])
print('Last 5 samples: ', d[-5:])
```
- Normally, the return value of the last command will be printed in a cell's output. If you want to suppress that output, end the last line with a semicolon (```;```).

In [None]:
import numpy as np                            # Array library
import pandas as pd                           # Advanced data frames & csv reading


# Jupyter lab supports interactive Matplotlib plots
# using "widget"
%matplotlib widget

# Jupyter lab doesn't support notebook,
# which was the preferred method for jupyter notebooks.
#%matplotlib notebook
#%matplotlib inline


from matplotlib import pyplot as plt

import seaborn as sns                         # Advanced plotting, support for data frames

# Adjust plot size & resolution for inline display.
# Tune to your needs.
plt.rcParams['figure.figsize'] = [9, 5.56]
plt.rcParams['figure.dpi'] = 100

# Augmented Dickey-Fuller test
from statsmodels.tsa.stattools import adfuller, pacf, acf

from pandas.plotting import autocorrelation_plot

from numpy.random import default_rng

# White Noise
To generate Gaussian noise (a special case of white noise), use ```numpy.random.default_rng``` to create a generator from which you can draw a time series of length 5000 with mean 0 and variance 1.

Plot the resulting time series with matplotlib, e.g.,
```
plt.figure()
plt.plot(white_noise)
plt.title('White Noise')
```

Determine the mean & variance before you perform the Augmented Dickey-Fuller test.

Use pandas.plotting.autocorrelation_plot to get the autocorrelation function.
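To see what such a plot displays, the sample autocorrelation can also be computed by hand. The following is a minimal sketch using plain NumPy; the helper name `sample_autocorrelation`, the seed, and the maximum lag of 50 are hypothetical choices, and the plotting functions mentioned above remain the intended tools for the exercise.

```python
import numpy as np
from numpy.random import default_rng


def sample_autocorrelation(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    centred = x - x.mean()
    c0 = np.sum(centred ** 2)  # lag-0 term, used for normalisation
    return np.array([
        np.sum(centred[:n - k] * centred[k:]) / c0
        for k in range(max_lag + 1)
    ])


rng = default_rng(0)  # seed is an assumption, for reproducibility only
noise = rng.normal(size=5000)

acf_noise = sample_autocorrelation(noise, max_lag=50)
print('Lag 0:', acf_noise[0])
print('Largest |autocorrelation| for lags 1..50:', np.abs(acf_noise[1:]).max())
```

For white noise, the value at lag 0 is 1 by construction, and the values at all other lags should stay close to 0.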

# Random Walk
Realise a Random Walk by summing up Gaussian Noise. Then plot the resulting time series.

Determine the mean & variance before you perform the Augmented Dickey-Fuller test.

Use pandas.plotting.autocorrelation_plot to get the autocorrelation function.
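One way to realise the random walk is a cumulative sum over the noise samples. The sketch below assumes a seed and variable names that are not part of the exercise; the statistics and the test can then be applied in the same way as for white noise.

```python
import numpy as np
from numpy.random import default_rng

rng = default_rng(1)  # seed is an assumption, for reproducibility only

# Gaussian steps with mean 0 and variance 1.
steps = rng.normal(loc=0.0, scale=1.0, size=5000)

# Each sample of the walk is the sum of all steps up to that time point.
random_walk = np.cumsum(steps)

print('Mean:    ', random_walk.mean())
print('Variance:', random_walk.var())
```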

# Investigating an Ensemble of Realisations
Create 1000 realisations of white noise and random walks each. Investigate the mean and the variance over these realisations, i.e., per time step, by plotting them. Use 5000 time steps again.

Also, plot 100 time steps of 30 individual random walks.

**Notes**
- When working with a number of realisations, use 2D-arrays. Matplotlib can plot a 2D-array as a set of plots. Please note that in this case the individual realisations are expected in the columns of the array!
- To create a zero-filled array of length N, use numpy.zeros, i.e., ```np.zeros(N)```.
- To create a zero-filled 2D-array with r rows and c columns, use ```np.zeros((r, c))```.
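A possible layout for the ensemble, following the notes above, is sketched here. It draws the whole 2D-array in one call instead of filling a zero-initialised array in a loop, which is a design choice of this sketch and not a requirement; the seed and variable names are assumptions. Plotting is left out.

```python
import numpy as np
from numpy.random import default_rng

n_steps = 5000         # time steps (rows)
n_realisations = 1000  # realisations (columns)

rng = default_rng(2)   # seed is an assumption, for reproducibility only

# One realisation per column, one time step per row.
noise_ensemble = rng.normal(size=(n_steps, n_realisations))

# Summing along the time axis turns each column into a random walk.
walk_ensemble = np.cumsum(noise_ensemble, axis=0)

# Statistics over the realisations, i.e., one value per time step.
noise_mean = noise_ensemble.mean(axis=1)
noise_var = noise_ensemble.var(axis=1)
walk_mean = walk_ensemble.mean(axis=1)
walk_var = walk_ensemble.var(axis=1)

print('White noise, ensemble variance at first/last step:', noise_var[0], noise_var[-1])
print('Random walk, ensemble variance at first/last step:', walk_var[0], walk_var[-1])

# 100 time steps of 30 individual random walks, ready for plt.plot(...).
walk_subset = walk_ensemble[:100, :30]
```

With `axis=1` the average runs over the columns, so each result is again a time series of length `n_steps`.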

# How do Mean and Variance Vary with Time in a Time Series?
Investigate the local mean and variance of one time series.

To do this, use slicing to access parts of the array which contains the time series.
These slices behave just like arrays.

Create a new time series by looking at equally sized chunks of the original time series and determining the mean / variance for each of these. As a result, you will get a time series of the local mean / variance of the original time series.

Compare to the ensemble results.
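The chunking described above could look like the following sketch. The chunk length of 100 samples, the seed, and the variable names are assumptions; other chunk sizes are worth trying.

```python
import numpy as np
from numpy.random import default_rng

rng = default_rng(3)  # seed is an assumption, for reproducibility only
series = rng.normal(size=5000)

chunk_size = 100  # assumed chunk length in samples

# Start indices of consecutive, non-overlapping chunks of equal size.
starts = range(0, len(series) - chunk_size + 1, chunk_size)

# Each slice behaves like an array, so mean() and var() are available.
local_mean = np.array([series[s:s + chunk_size].mean() for s in starts])
local_var = np.array([series[s:s + chunk_size].var() for s in starts])

print('Number of chunks:', len(local_mean))
print('Range of local mean:    ', local_mean.min(), local_mean.max())
print('Range of local variance:', local_var.min(), local_var.max())
```

Replacing `series` with a random walk, e.g., `np.cumsum(series)`, allows a direct comparison of the two processes.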