# What is Trend? #

The **trend** component of a time series represents a persistent, long-term change in the mean of the series. The trend is the slowest-moving part of a series, the part representing the largest time scale of importance. In a time series of product sales, an increasing trend might be the effect of a market expansion as more people become aware of the product year by year.

<figure style="padding: 1em;">
<img src="https://i.imgur.com/7cxTLCt.png" width=800, alt="">
<figcaption style="textalign: center; font-style: italic"><center>Trend patterns in four time series.</center></figcaption>
</figure>

In this course, we'll focus on trends in the mean. More generally, however, a trend is any persistent and slow-moving property of a series -- time series commonly have trends in their variation or in the frequency of their outliers, for instance.

# Moving Average Plots #

To see what kind of trend a time series might have, we can use a **moving average plot**. To compute a moving average of a time series, we compute the average of the values within a sliding window of some defined width. Each point on the graph represents the average of all the values in the series that fall within the window on either side.

<figure style="padding: 1em;">
<img src="https://i.imgur.com/EZOXiPs.gif" width=800, alt="An animated plot showing an undulating curve slowly increasing with a moving average line developing from left to right within a window of 12 points (in red).">
<figcaption style="textalign: center; font-style: italic"><center>A moving average plot illustrating a linear trend. Each point on the curve (blue) is the average of the points (red) within a window of size 12.
</center></figcaption>
</figure>

The moving average plot is meant to smooth over any short-term fluctuations in the series so that only long-term changes remain. For a change to be a part of the trend, it should occur over a longer period than any seasonal changes. To visualize a trend, therefore, we take an average over a period longer than any seasonal period in the series. For instance, if a time series had daily observations and an annual (yearly) seasonality, we would use a moving average with a 365-day window.
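
As a minimal sketch of this idea, the example below builds a made-up monthly series (a linear trend plus a 12-month cycle; all values are illustrative assumptions, not course data) and smooths it with a window equal to the seasonal period. Because the window spans exactly one full cycle, the seasonal part averages out and only the trend's slope should remain.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly series: linear trend (slope 0.5) plus a 12-month cycle.
t = np.arange(48)
series = pd.Series(
    0.5 * t + np.sin(2 * np.pi * t / 12),
    index=pd.period_range("2000-01", periods=48, freq="M"),
)

# A window spanning one full seasonal period averages the cycle away.
moving_average = series.rolling(window=12, center=True).mean()

# Step-to-step changes in the smoothed series; each should be close to
# the slope of the underlying trend.
steps = moving_average.dropna().diff().dropna()
print(steps.round(6).unique())
```

A window shorter than the seasonal period would leave part of the cycle in the smoothed curve, which is why the window is chosen to be at least as long as the longest seasonal period.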

# The Time Dummy #

In this course, we will model trend as a time-dependent property. When a property is time dependent, we can think of it as a function that takes time steps as inputs and produces the property as an output. To model time-dependent properties, therefore, we want to include time steps as a feature in our dataset. Such a feature is called a **time dummy**. A time dummy is essentially an ordinal encoding of the time index.

| Date    | Time |
|---------|------|
| 1959-06 | 1.0  |
| 1959-07 | 2.0  |
| 1959-08 | 3.0  |
| 1959-09 | 4.0  |
| 1959-11 | 6.0  |
| ...     | ...  |

A time dummy implicitly includes all the timestamps within the range of dates of the time series, including those with no observation. Generally, this means that if your time series skips a date, your time dummy should skip the corresponding code. Above, the Date column skipped `1959-10`, so `Time` needed to skip `5.0`.
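
One way to build such a time dummy by hand is sketched below. It assumes monthly periods and uses the dates from the table; the variable names are illustrative. The idea is to number every month in the full date range and then keep only the codes for the months that were actually observed.

```python
import pandas as pd

# Observed months from the table above; 1959-10 is missing.
dates = pd.PeriodIndex(
    ["1959-06", "1959-07", "1959-08", "1959-09", "1959-11"], freq="M"
)

# Number every month in the full range, then keep only the observed ones.
full_range = pd.period_range(dates.min(), dates.max(), freq="M")
codes = pd.Series(range(1, len(full_range) + 1), index=full_range, dtype="float64")
time_dummy = codes.reindex(dates).rename("Time")

print(time_dummy)
```

Numbering the rows directly (1, 2, 3, 4, 5) would instead treat the two-month gap as a single step and distort the spacing between observations.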

# Engineering Trend #

When used with linear regression, the time dummy alone will produce a linear trend. The data frame just above would produce a trend model:

```
trend = a * time + b
```

where `a` (the slope) and `b` (the intercept) are learned by the linear regression algorithm. 
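
To make the formula concrete, here is a small sketch that recovers `a` and `b` from made-up, noiseless data. It uses NumPy's least-squares solver as a stand-in for a linear regression library, and the data values are assumptions chosen only for illustration.

```python
import numpy as np

# Hypothetical data: a noiseless series with slope 2 and intercept 5.
time = np.arange(1, 11, dtype=float)
y = 2.0 * time + 5.0

# Design matrix: the time dummy plus a column of ones for the intercept.
X = np.column_stack([time, np.ones_like(time)])
(a, b), *_ = np.linalg.lstsq(X, y, rcond=None)

trend = a * time + b
print(f"a = {a:.2f}, b = {b:.2f}")
```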

To fit a non-linear trend, you can either:
1. use an algorithm that can learn non-linear associations (like a neural network), or,
2. transform the time dummy and use linear regression.

In this course, we will use the second approach. For example, to fit a quadratic trend (a parabola), we could add a new feature that's the square of `Time`, like so:

| Date    | Time  (linear) | Time Squared (quadratic) |
|---------|----------------|--------------------------|
| 1959-06 | 1.0            | 1.0                      |
| 1959-07 | 2.0            | 4.0                      |
| 1959-08 | 3.0            | 9.0                      |
| 1959-09 | 4.0            | 16.0                     |
| 1959-11 | 6.0            | 36.0                     |
| ...     | ...            | ...                      |

This would create a trend model:
```
trend = a * time ** 2 + b * time + c
```

where, as before, the coefficients `a`, `b`, and `c` are learned by linear regression.
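
The sketch below illustrates why a quadratic trend can still be fit by linear regression: the model is linear in its features, even though one feature is the square of the time dummy. The data is a made-up noiseless parabola, and NumPy's least-squares solver stands in for a regression library.

```python
import numpy as np

# Hypothetical noiseless parabola.
time = np.arange(1, 21, dtype=float)
y = 0.5 * time**2 - 3.0 * time + 10.0

# Features: time squared, time, and a constant column for the intercept.
X = np.column_stack([time**2, time, np.ones_like(time)])
(a, b, c), *_ = np.linalg.lstsq(X, y, rcond=None)

trend = a * time**2 + b * time + c
print(f"a = {a:.2f}, b = {b:.2f}, c = {c:.2f}")
```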

The trend curves in the figure below were both fit using a time dummy and scikit-learn's `LinearRegression`:

<figure style="padding: 1em;">
<img src="https://i.imgur.com/KFYlgGm.png" width=800, alt="Above, Cars Sold in Quebec: an undulating plot gradually increasing from 1960-01 to 1968-12 with a linear trend-line superimposed. Below, Plastics Production in Australia: an undulating plot with a concave-up quadratic trend-line superimposed.">
<figcaption style="textalign: center; font-style: italic"><center><strong>Top:</strong> Series with a linear trend. <strong>Bottom:</strong> Series with a quadratic trend.
</center></figcaption>
</figure>

To fit other kinds of trends, you could consider transforming the time dummy with higher powers (like cubic), other mathematical functions (like `np.exp`), or even [basis functions](https://scikit-learn.org/stable/modules/linear_model.html#polynomial-regression). (See the exercise in this lesson for ways this can go wrong, however.)
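
As a rough illustration of transforming the time dummy with another function, the sketch below fits an exponential-shaped trend. It assumes the growth rate (`rate`) is known in advance, which is a simplification made only for this example; the data is made up and noiseless.

```python
import numpy as np

time = np.arange(1, 41, dtype=float)
rate = 0.1  # assumed known here; in practice it would have to be chosen or estimated
y = 3.0 + 2.0 * np.exp(rate * time)  # hypothetical series

# Features: the exponentially transformed time dummy and a constant column.
X = np.column_stack([np.exp(rate * time), np.ones_like(time)])
(a, b), *_ = np.linalg.lstsq(X, y, rcond=None)

trend = a * np.exp(rate * time) + b
```

Features like this grow very quickly outside the training range, so forecasts built on them can become extreme after only a few steps.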

# Example - Tunnel Traffic #

In this example we'll create a trend model for the *Tunnel Traffic* dataset. Though not required for the exercise, you might like to examine the code in the hidden cell to see how to use Pandas to prepare a time series.

In [None]:
#$HIDE_INPUT$
from pathlib import Path
from warnings import simplefilter

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

simplefilter("ignore")  # ignore warnings to clean up output cells

# Set Matplotlib defaults
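# Matplotlib 3.6 renamed the seaborn styles; use whichever name is available
style_name = "seaborn-v0_8-whitegrid"
if style_name not in plt.style.available:
    style_name = "seaborn-whitegrid"
plt.style.use(style_name)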
plt.rc("figure", autolayout=True, figsize=(11, 5))
plt.rc(
    "axes",
    labelweight="bold",
    labelsize="large",
    titleweight="bold",
    titlesize=14,
    titlepad=10,
)
plot_params = dict(
    color="0.75",
    style=".-",
    markeredgecolor="0.25",
    markerfacecolor="0.25",
    legend=False,
)
%config InlineBackend.figure_format = 'retina'


# Load Tunnel Traffic dataset
data_dir = Path("../input/ts-course-data")
tunnel = pd.read_csv(data_dir / "tunnel.csv", parse_dates=["Day"])

# Create a time series in Pandas by setting the index to a date
# column. We parsed "Day" as a date type by using `parse_dates` when
# loading the data.
tunnel = tunnel.set_index("Day")

# By default, Pandas creates a `DatetimeIndex` with dtype `Timestamp`
# (equivalent to `np.datetime64`), representing a time series as a
# sequence of measurements taken at single moments. A `PeriodIndex`,
# on the other hand, represents a time series as a sequence of
# quantities accumulated over periods of time. Periods are often
# easier to work with, so that's what we'll use in this course.
tunnel = tunnel.to_period()

Let's make a moving average plot to see what kind of trend this series has. As we saw in the Lesson 1 example, the *Tunnel Traffic* series has an annual seasonality, so we'll use a 365-day window.

To create a moving average, first use the `rolling` method to begin a windowed computation. Follow this with the `mean` method to compute the average over the window.

In [None]:
moving_average = tunnel.rolling(
    window=365,       # 365-day window
    center=True,      # puts the average at the center of the window
    min_periods=183,  # choose about half the window size
).mean()              # compute the mean (could also do median, std, min, max, ...)

ax = tunnel.plot(style=".", color="0.5")
moving_average.plot(
    ax=ax, linewidth=3, title="Tunnel Traffic - 365-Day Moving Average", legend=False,
);
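
To see what `center` and `min_periods` do, it can help to try them on a tiny series. The sketch below uses a made-up seven-point series and a window of 3. Without `min_periods`, the endpoints have no full window and come out as missing; with `min_periods=2`, they are averaged over the two values that are available.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])

# Requires a full window of 3, so the first and last points are NaN.
strict = s.rolling(window=3, center=True).mean()

# Allows partial windows of at least 2 values at the edges.
relaxed = s.rolling(window=3, center=True, min_periods=2).mean()

print(pd.DataFrame({"strict": strict, "relaxed": relaxed}))
```

The same trade-off applies to the 365-day window above: a smaller `min_periods` extends the curve closer to the ends of the series, at the cost of averaging over fewer points there.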

When creating time series features, we can avoid a lot of tricky edge cases by using a library function instead of using Pandas directly. `DeterministicProcess` is a utility from the `statsmodels` library for creating time series features. It accommodates missing timestamps and can generate features for times outside of the training data (much like Prophet's `make_future_dataframe` we saw in Lesson 1). The `constant` argument creates a feature for the y-intercept of the graph, and the `order` argument creates a feature for the trend. "Order" here refers to polynomial order: 1 is linear, 2 is quadratic, 3 is cubic, and so on. The moving average plot suggests that the trend is close to linear, so we'll create a time dummy of order 1.

In [None]:
from statsmodels.tsa.deterministic import DeterministicProcess

dp = DeterministicProcess(
    index=tunnel.index,  # dates from the training data
    constant=True,       # the level
    order=1,             # the trend
    drop=True,           # drop terms to avoid collinearity
)
X = dp.in_sample()  # features for the training data

X.head()
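
For intuition, the features created above are roughly what you would get by building a constant column and a time dummy by hand. The sketch below does this with plain Pandas on a short, hypothetical daily index standing in for `tunnel.index`; the column names `const` and `trend` are assumed to match the ones `DeterministicProcess` produces.

```python
import numpy as np
import pandas as pd

# Hypothetical daily index standing in for `tunnel.index`.
index = pd.period_range("2003-11-01", periods=5, freq="D")

X_manual = pd.DataFrame(
    {
        "const": 1.0,  # the level (intercept)
        "trend": np.arange(1, len(index) + 1, dtype=float),  # the time dummy
    },
    index=index,
)

print(X_manual)
```

This hand-built version does not handle gaps in the index or out-of-sample dates, which is part of what the library utility takes care of.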

By the way, a "deterministic process" is a model for a time series that is non-random or completely determined. Our trend and seasonal features are deterministic because they depend only on dates and times, which are fixed. In contrast, a "stochastic process" is a series that is random or unpredictable to some degree. Anything truly requiring a forecast is stochastic. The distinction between deterministic and stochastic processes is fundamental to forecasting, and we'll return to it again in Lesson 4.

Now let's create the trend model. As mentioned before, we'll use `LinearRegression` from scikit-learn.

In [None]:
from sklearn.linear_model import LinearRegression

y = tunnel["NumVehicles"]  # the target

# The intercept is the same as the `const` feature from
# DeterministicProcess. LinearRegression behaves badly with duplicated
# features, so we need to be sure to exclude it here.
model = LinearRegression(fit_intercept=False)
model.fit(X, y)

y_pred = pd.Series(
    model.predict(X),
    index=y.index,
)


ax = tunnel.plot(style=".", color="0.5", title="Tunnel Traffic - Linear Trend")
_ = y_pred.plot(ax=ax, linewidth=3, label="Trend")

The trend discovered by our `LinearRegression` model is almost identical to the moving average plot, which suggests that a linear trend was the right decision in this case.

To make a forecast, we apply our model to "out of sample" features. "Out of sample" refers to times outside of the observation period of the training data, which would include the future times of a forecast. Here's how we could make a 30-day forecast:

In [None]:
# Use a separate name so the training features in `X` are not overwritten
X_fore = dp.out_of_sample(steps=30)

y_fore = pd.Series(
    model.predict(X_fore),
    index=X_fore.index,
)

y_fore.head()
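
Conceptually, an out-of-sample forecast from a trend model just continues the time dummy past the end of the training data and applies the fitted coefficients. The sketch below shows this with made-up data and NumPy's least-squares solver; it is an illustration of the idea, not a description of how `out_of_sample` is implemented.

```python
import numpy as np
import pandas as pd

# Hypothetical training data: 100 daily observations with a linear trend.
index = pd.period_range("2005-01-01", periods=100, freq="D")
time = np.arange(1, 101, dtype=float)
y = pd.Series(50.0 + 0.3 * time, index=index)

# Fit intercept and slope on the in-sample time dummy.
X_in = np.column_stack([np.ones_like(time), time])
coef, *_ = np.linalg.lstsq(X_in, y.to_numpy(), rcond=None)

# Out-of-sample features continue the time dummy past the training period.
steps = 30
time_out = np.arange(len(time) + 1, len(time) + steps + 1, dtype=float)
X_out = np.column_stack([np.ones_like(time_out), time_out])
index_out = pd.period_range(index[-1] + 1, periods=steps, freq="D")

y_fore = pd.Series(X_out @ coef, index=index_out)
print(y_fore.head())
```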

Let's plot a portion of the series to see the trend forecast for the next 30 days:

In [None]:
#$HIDE_INPUT$
ax = tunnel["2005-05":].plot(title="Tunnel Traffic - Linear Trend Forecast", **plot_params)
ax = y_pred["2005-05":].plot(ax=ax, linewidth=3, label="Trend")
ax = y_fore.plot(ax=ax, linewidth=3, label="Trend Forecast", color="C3")
_ = ax.legend()

---

The trend models we learned about in this lesson turn out to be useful for a number of reasons. Besides acting as a baseline or starting point for more sophisticated models, we can also use them as a component in a "hybrid model" with algorithms unable to learn trends (like XGBoost and random forests). We'll learn more about this technique in Lesson 5.

# Your Turn #