# Transforming Time Series Data

**In this lecture you will learn how to:**

* Apply a log transform to a time series in order to make it easier to forecast.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

## Recap: Calendar Adjustments

We have already met one type of transformation: **the calendar adjustment**.

We did this to remove some of the noise in the time series caused by the different number of days in each month.  Removing this noise makes the forecasting problem easier.

In [None]:
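# Load monthly alcohol sales with the DATE column parsed as a DatetimeIndex
sales = pd.read_csv('data/Alcohol_Sales.csv', index_col='DATE', parse_dates=True)
# Declare the frequency explicitly: 'MS' = month start
sales.index.freq = 'MS'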

In [None]:
sales.plot(figsize=(12,4))

In [None]:
sales.index.days_in_month

In [None]:
# Calendar adjustment: monthly total divided by days in the month = mean sales per day
sales_rate = sales['sales'] / sales.index.days_in_month

In [None]:
sales_rate.head(5)

### Back-transforming a calendar adjustment

After you have produced a forecast on the transformed data, you will usually need to back-transform it to its original units.  This is straightforward for calendar adjustments: just reverse the operation by multiplying the daily rate by the number of days in each month.

In [None]:
sales_rate * sales_rate.index.days_in_month
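
The round trip can be checked on a small example. The sketch below uses made-up monthly totals (the `toy_*` names and values are assumptions for illustration, not the Alcohol Sales data) chosen so that every month has the same daily rate; the calendar adjustment removes the month-length effect and the back-transform recovers the original totals.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly totals for Jan-Jun 2020 (illustrative values only)
idx = pd.date_range('2020-01-01', periods=6, freq='MS')
toy_sales = pd.Series([310.0, 290.0, 310.0, 300.0, 310.0, 300.0], index=idx)

# Forward transform: monthly total -> mean per day
toy_rate = toy_sales / toy_sales.index.days_in_month

# Back-transform: mean per day -> monthly total
toy_back = toy_rate * toy_rate.index.days_in_month

print(toy_rate)
print(np.allclose(toy_back, toy_sales))
```

When back-transforming a forecast, the days in the month must come from the index of the forecast period, not from the training data.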

## Dealing with increasing variance

One property of many seasonal time series with trend is that the size of the seasonal peaks and troughs increases with the level of the series. This property can negatively impact the accuracy of forecasts.  An example of increasing variance is illustrated by the Alcohol Sales data below.  The orange line is a 12-month moving average.  You can see that as the level increases, the fluctuations either side of the moving average also increase.

In [None]:
# 12-month trailing moving average to show the level of the series
ma = sales_rate.rolling(window=12).mean()
ax = sales_rate.plot(figsize=(12,4), label='mean sales per day')
ma.plot(ax=ax, label='MA_12')
ax.legend()

### Taking the natural logarithm stabilises variance 

In [None]:
log_sales = np.log(sales_rate)

In [None]:
ma = log_sales.rolling(window=12).mean()
ax = log_sales.plot(figsize=(12,4), label='log sales')
ma.plot(ax=ax, label='MA_12')
ax.legend()
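
To see why the log helps, here is a minimal sketch on a synthetic series (an assumption for illustration, not the Alcohol Sales data) built as an exponential trend multiplied by a fixed seasonal pattern. The helper `seasonal_spread` is a hypothetical name; it measures the gap between the highest and lowest value in each calendar year. On the raw scale the gap grows with the level, while on the log scale it stays the same every year.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series: exponential trend x fixed seasonal pattern (illustrative only)
idx = pd.date_range('2000-01-01', periods=120, freq='MS')
t = np.arange(120)
trend = 100 * np.exp(0.01 * t)
season = 1 + 0.2 * np.sin(2 * np.pi * t / 12)
toy = pd.Series(trend * season, index=idx)

def seasonal_spread(series):
    """Return max minus min within each calendar year."""
    by_year = series.groupby(series.index.year)
    return by_year.max() - by_year.min()

raw_spread = seasonal_spread(toy)          # grows as the level rises
log_spread = seasonal_spread(np.log(toy))  # constant from year to year

print(raw_spread.round(1))
print(log_spread.round(3))
```

Real data will not be this clean, so expect the log to reduce the growth in spread rather than remove it completely.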

#### Back-transforming logged data

This is straightforward with NumPy.  We simply need to exponentiate the data using the function `np.exp()`.

In [None]:
np.exp(log_sales)
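
When both a calendar adjustment and a log transform have been applied, the back-transforms are undone in reverse order: exponentiate first, then multiply by the days in each month. The sketch below assumes made-up monthly totals (`toy_sales` and its values are hypothetical) and checks that the round trip recovers them.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly totals standing in for the original series
idx = pd.date_range('2021-01-01', periods=4, freq='MS')
toy_sales = pd.Series([620.0, 588.0, 682.0, 630.0], index=idx)

# Forward: calendar adjustment, then natural log
toy_log = np.log(toy_sales / toy_sales.index.days_in_month)

# Backward: undo the steps in reverse order (exp first, then days in month)
toy_back = np.exp(toy_log) * toy_log.index.days_in_month

print(np.allclose(toy_back, toy_sales))
```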

### Power Transformations

An alternative to taking the logarithm of the data is to take the cube root or the square root.

In [None]:
cbrt_sales = np.cbrt(sales_rate)

In [None]:
cbrt_sales.plot(figsize=(12,4))

In [None]:
sqrt_sales = np.sqrt(sales_rate)
sqrt_sales.plot(figsize=(12,4))

In [None]:
ax = log_sales.plot(figsize=(12,4), label='log')
cbrt_sales.plot(ax=ax, label='cbrt')
ax.legend()
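
Power transformations also need to be back-transformed after forecasting. A minimal sketch, assuming a small made-up series of positive values (`toy_rate` is hypothetical): the cube root is reversed by cubing and the square root by squaring.

```python
import numpy as np
import pandas as pd

# Hypothetical positive values standing in for the daily sales rate
toy_rate = pd.Series([8.0, 27.0, 64.0, 125.0])

# Forward transforms
cbrt_toy = np.cbrt(toy_rate)
sqrt_toy = np.sqrt(toy_rate)

# Back-transforms: raise to the inverse power
cbrt_back = cbrt_toy ** 3
sqrt_back = sqrt_toy ** 2

print(np.allclose(cbrt_back, toy_rate))
print(np.allclose(sqrt_back, toy_rate))
```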