# Time Series Analysis

This chapter illustrates exploratory methods for understanding time series data: decomposing a series into components and measuring the correlation between two series.

## Preamble

In [None]:
import pandas
import seaborn
import matplotlib.pyplot as plt
import numpy

# Register the pandas date formatters and converters with matplotlib
pandas.plotting.register_matplotlib_converters()

In [None]:
import data_science_learning_paths
data_science_learning_paths.setup_plot_style(dark=True)

## Decomposition

This section contains an example of how a time series can be decomposed into **additive components** - **trend**, **seasonality**, and **residuals**.
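
To make the idea of additive components concrete, here is a minimal sketch on synthetic data, using only NumPy. The series, its parameters, and all variable names are made up for illustration. The moving-average approach shown here is one common way to obtain such a decomposition and is not necessarily identical to what `seasonal_decompose` does internally.

```python
import numpy as np

# Synthetic monthly series (assumption: invented values, not the USA temperature data)
period = 12
n_years = 10
t = np.arange(period * n_years)
true_trend = 0.05 * t
true_seasonal = 10.0 * np.sin(2 * np.pi * t / period)
rng = np.random.default_rng(0)
noise = rng.normal(scale=0.5, size=t.size)
series = 50.0 + true_trend + true_seasonal + noise

# Trend: centered moving average spanning one full period.
# For an even period the two end points get half weight.
weights = np.r_[0.5, np.ones(period - 1), 0.5] / period
half = period // 2
trend = np.full(series.shape, np.nan)  # the edges stay undefined (NaN)
trend[half:-half] = np.convolve(series, weights, mode="valid")

# Seasonality: mean of the detrended values per position in the period,
# shifted so that the seasonal pattern sums to zero
detrended = series - trend
seasonal_means = np.array(
    [np.nanmean(detrended[i::period]) for i in range(period)]
)
seasonal_means -= seasonal_means.mean()
seasonal = np.tile(seasonal_means, n_years)

# Residuals: whatever trend and seasonality do not explain
residual = series - trend - seasonal
```

By construction, `trend + seasonal + residual` reproduces `series` wherever the trend is defined, which is what "additive" means here.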

Speaking of seasonality - what could be a better example than the weather? Below, we use a time series of the monthly average temperature in the USA.

In [None]:
usa_temp = data_science_learning_paths.datasets.read_usa_temperature()

In [None]:
usa_temp.head()

In [None]:
usa_temp["Value"].plot()

In [None]:
import statsmodels

In [None]:
from statsmodels.tsa.seasonal import seasonal_decompose

In [None]:
seasonal_decompose(
    usa_temp["Value"],
    model="additive",
).plot();

In [None]:
decomposition = seasonal_decompose(
    usa_temp["Value"],
    model="additive",
)

Is global warming visible? Let's look at the trend component of the temperature, averaged per year:

In [None]:
decomposition.trend.resample("y").mean().plot()

In [None]:
decomposition.seasonal["2016":"2018"].plot()

## Correlation

### Normalized Cross-Correlation

**Normalized cross-correlation** is a similarity measure that can be applied to two time series. The series must have the same length but may differ in magnitude. It has the following properties:
- The higher the value, the more similar the course of the two series.
- The maximum value is 1, reached when the two series are exactly the same or differ only by a positive scaling factor.
- The minimum value is -1, reached when the two series are exact opposites, i.e. one is a negative multiple of the other.

$$\operatorname{norm\_corr}(x, y)=\frac{\sum_{i=0}^{n-1} x[i] \cdot y[i]}{\sqrt{\sum_{i=0}^{n-1} x[i]^{2} \cdot \sum_{i=0}^{n-1} y[i]^{2}}}$$
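
The properties listed above can be checked numerically on a tiny example. This is a minimal sketch: the helper name `norm_corr` and the sample values are chosen for illustration only.

```python
import numpy as np

def norm_corr(x, y):
    """Normalized cross-correlation of two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.sum(x * y) / np.sqrt(np.sum(x**2) * np.sum(y**2))

x = np.array([1.0, -2.0, 3.0, 0.5])  # arbitrary example values

same = norm_corr(x, x)          # identical series
scaled = norm_corr(x, 4.0 * x)  # same course, different magnitude
opposite = norm_corr(x, -x)     # mirrored series
orthogonal = norm_corr([1.0, 0.0], [0.0, 1.0])  # no overlap at all

print(same, scaled, opposite, orthogonal)
```

Identical and positively scaled series both reach the maximum of 1, the mirrored series reaches the minimum of -1, and the non-overlapping pair yields 0.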

In [None]:
def normalized_crosscorrelation(a, b):
    """Return the normalized cross-correlation of two equal-length arrays."""
    corr = numpy.sum(a * b) / numpy.sqrt(numpy.sum(a**2) * numpy.sum(b**2))
    return corr

In [None]:
usa_temp["Value"]["1900": "1910"].plot()
usa_temp["Value"]["2000": "2010"].plot()

In [None]:
normalized_crosscorrelation(
    usa_temp["Value"]["1900": "1910"].values,
    usa_temp["Value"]["2000": "2010"].values
)


In [None]:
normalized_crosscorrelation(
    usa_temp["Value"]["1900": "1910"].values,
    -usa_temp["Value"]["2000": "2010"].values
)
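
One caveat when interpreting such values: the formula does not subtract the mean, so two series whose values all lie well above zero tend to score close to 1 even when their fluctuations are unrelated. The following sketch illustrates this with synthetic random data. The offset of 50, the helper name `norm_corr`, and the mean-centering step are assumptions made for illustration and are not part of the analysis above.

```python
import numpy as np

def norm_corr(x, y):
    """Normalized cross-correlation of two equal-length arrays."""
    return np.sum(x * y) / np.sqrt(np.sum(x**2) * np.sum(y**2))

# Two independent random series (synthetic, for illustration only)
rng = np.random.default_rng(42)
a = rng.normal(size=120)
b = rng.normal(size=120)

# A shared offset dominates the sums and pushes the value towards 1
raw = norm_corr(a + 50.0, b + 50.0)

# Subtracting each series' mean first compares only the fluctuations
centered = norm_corr(a - a.mean(), b - b.mean())

print(f"with offset: {raw:.4f}, mean-centered: {centered:.4f}")
```

The mean-centered variant is the Pearson correlation coefficient, so `centered` agrees with `np.corrcoef(a, b)[0, 1]`.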


## References

- [Understanding Cross-Correlation, Auto-Correlation, Normalization and Time Shift](https://anomaly.io/understand-auto-cross-correlation-normalized-shift/#cross_correlation)

---
_This notebook is licensed under a [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/). Copyright © 2018-2025 [Point 8 GmbH](https://point-8.de)_