
## 📘 Stationarity in Time Series – Focused Intuition





# 📌 Section 1: Imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_theme(style="whitegrid")

# Intro

# =============================================================

This post is meant to provide a concise but comprehensive overview of the concept of stationarity and of the different types of stationarity defined in academic literature dealing with time series analysis.

Future posts will aim to provide similarly concise overviews of detection of non-stationarity in time series data and of the different ways to transform non-stationary time series into stationary ones.¹

(from Stationarity in time series analysis)

# =============================================================

## Introduction

Predicting the next data point in a time series is very valuable and is called forecasting.

One requirement to accurately forecast the next data point is to ensure that the time series is stationary. In this article, we will discuss:

- What a stationary time series is
- How to make a time series stationary
- How to test that a time series is indeed stationary
- Why we need a stationary time series


# =============================================================

(from Stationarity For Time Series)

Stationarity describes the concept that the statistical features of a time series do not change over time.

Thus, some time series forecasting models, such as autoregressive models, rely on the stationarity of the time series.

In this article, you will learn:

- What stationarity is,
- Why it is important,
- Ways to check for stationarity, and
- Techniques you can apply when a time series is non-stationary

(from Stationarity in Time Series — A Comprehensive Guide)

<br><br><br><br><br><br>
<br><br><br><br><br><br>
<br><br><br><br><br><br>

## 1️⃣ What is Stationarity?

## Raw Definition


Intuitively, stationarity means that the statistical properties of the process do not change over time. However, several different notions of stationarity have been suggested in econometric literature over the years.

(from Stationarity in time series analysis)

# ============================================================

**Stationarity** means that the **statistical properties** of the time series (mean, variance, autocorrelation) are **constant over time**.

(from alternative prompt)

# ============================================================

Stationarity describes the concept that how a time series is changing will remain the same in the future [3].

(from Stationarity in Time Series — A Comprehensive Guide)

# ============================================================

Let’s look at a definition I found a while back:

    Stationarity implies that taking consecutive samples of data with the same size should have identical covariances regardless of the starting point.

Not that easy to process, I know, but let’s break it down. The above definition is a definition for something known as weak-form stationarity, or “covariance stationarity”, as stated in some resources. There exists one other type of stationarity, called strict stationarity, and it implies that samples of identical size have identical distribution. This form is very restrictive, and we rarely observe it, so for doing TSA, the term “stationarity” is used to describe covariance stationarity.

(from What is Stationarity in Time Series and why should you care)

## Key Characteristics

A time series is **stationary** if its **statistical properties do not change over time**. These properties include:

* **Mean** (average value)
* **Variance** (spread or variability)
* **Covariance structure** (how values relate to each other over time)
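
To make these properties concrete, here is a minimal sketch that compares the mean and variance of the two halves of a simulated series. The helper name `half_stats` and both simulated series are assumptions made for illustration; they do not come from the sources quoted here.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000

# Two illustrative series (assumed for this sketch)
white_noise = rng.normal(loc=0.0, scale=1.0, size=n)  # stationary
random_walk = np.cumsum(rng.normal(size=n))           # non-stationary

def half_stats(x):
    """Return (mean, variance) for the first and the second half of x."""
    first, second = x[: len(x) // 2], x[len(x) // 2 :]
    return (first.mean(), first.var()), (second.mean(), second.var())

wn_first, wn_second = half_stats(white_noise)
rw_first, rw_second = half_stats(random_walk)

print("white noise:", wn_first, wn_second)
print("random walk:", rw_first, rw_second)
```

For the white noise the two halves should report nearly the same mean and variance, while the random walk will typically show clearly different values in each half.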

# ============================================================

In general, a time series is stationary if it does not exhibit any long term trends or obvious seasonality. Mathematically we have:

- A constant variance through time
- A constant mean through time
- The statistical properties of the time series do not change

(from Stationarity For Time Series)

# ============================================================

A time series has to satisfy the following conditions to be considered stationary:

- Constant mean — average value doesn’t change over time.
- Constant variance — variance doesn’t change over time.
- Constant covariance — covariance between periods of identical length doesn’t change over time.

(from Introduction to stationarity)

# ============================================================

1. Constant Mean: A stationary time series should exhibit a consistent average value over time. If the mean changes, it suggests a shift in the underlying behavior of the process.
2. Constant Variance: The variance of the time series, representing the spread of data points, should remain constant. Fluctuations in variance can make it challenging to make accurate predictions.
3. Constant Autocorrelation: Autocorrelation measures the correlation between a time series and its lagged values. In a stationary series, the strength and pattern of autocorrelation should be consistent throughout.

(from Understanding Predictive Maintenance — Unit Roots and Stationarity)

# ============================================================

In mathematical terms, a time series is stationary when its statistical properties are independent of time [3]:

- constant mean,
- constant variance,
- and covariance is independent of time.

(from Stationarity in Time Series — A Comprehensive Guide)

# ============================================================

Okay, I get that, but what does it mean for time series to be stationary? This one’s easy. For a time series to be classified as stationary (covariance stationarity), it must satisfy 3 conditions:

1. Constant mean
2. Constant variance
3. Constant covariance between periods of identical distance

The last one might be a bit trickier to understand at first, so let’s explore it a bit further. All it states is that the covariance between time periods of identical lengths (let’s say 10 days/hours/minutes) should be identical to the covariance of some other period of the same length:

IMAGE

(from What is Stationarity in Time Series and why should you care)
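
The third condition can be checked numerically. The sketch below estimates the covariance at a few lags in two different windows of the same series. The AR(1) process, its coefficient `phi` and the helper `lag_cov` are assumptions chosen for illustration, not something the quoted articles use.

```python
import numpy as np

def lag_cov(x, k):
    """Sample covariance between x[t] and x[t+k]."""
    a, b = x[:-k], x[k:]
    return np.mean((a - a.mean()) * (b - b.mean()))

rng = np.random.default_rng(0)
n, phi = 200_000, 0.6

# AR(1): x[t] = phi * x[t-1] + e[t] with |phi| < 1 (assumed example of a stationary process)
e = rng.normal(size=n)
x = np.empty(n)
x[0] = e[0]
for t in range(1, n):
    x[t] = phi * x[t - 1] + e[t]

early, late = x[: n // 2], x[n // 2 :]
for k in (1, 5, 10):
    # For a covariance-stationary process both windows give similar values at each lag
    print(k, round(lag_cov(early, k), 3), round(lag_cov(late, k), 3))
```

The covariance changes with the lag `k`, but for a fixed lag it should be roughly the same in the early and the late window.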


## Other


An important distinction to make before diving into these definitions is that stationarity — of any kind — is a property of a stochastic process, and not of any finite or infinite realization of it (i.e. a time series of values).

(from Stationarity in time series analysis)
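
One way to see the difference between a process and a single realization is to simulate many realizations and look across them at a fixed time index. This is a minimal sketch; the random walk and the sizes `n_paths` and `n_steps` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n_paths, n_steps = 5_000, 400

# Each row is one realization of the same random-walk process (assumed example)
paths = np.cumsum(rng.normal(size=(n_paths, n_steps)), axis=1)

# Variance across realizations at a fixed time index is a property of the process
var_at_100 = paths[:, 99].var()
var_at_400 = paths[:, 399].var()
print(var_at_100, var_at_400)
```

With unit-variance steps the variance across realizations grows roughly in proportion to the time index, which is why the random-walk process is non-stationary even though any single path may look calm over a short window.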

# ============================================================

<br><br><br><br><br><br>
<br><br><br><br><br><br>
<br><br><br><br><br><br>

# =================================================

## 2️⃣ Types of Stationarity

## Weak and Strong


There are two main types:

* **Strict Stationarity**: Distribution is constant across time (stronger).
* **Weak Stationarity**: Mean and variance are constant; covariance depends only on lag, not time.

(from alternative prompt)

# ============================================================

This is the definition of weak-form stationarity. Another type of stationarity is strict stationarity. It implies that samples of identical size have identical distribution [5]. Since strict stationarity is restrictive and rare, this article will only focus on weak-form stationarity.

(from Stationarity in Time Series — A Comprehensive Guide)

# ============================================================

Let’s start by examining one of the formal definitions:

    A stationary process is a stochastic process whose unconditional joint probability distribution does not change when shifted in time. Consequently, parameters such as mean and variance also do not change over time. (Source: Wikipedia)

Strictly speaking, the definition above describes strict (strong) stationarity, because it requires the entire joint distribution to be invariant to shifts in time. It is very restrictive, so you won’t see it often in practice.

The form you will usually work with in time series analysis is weak-form stationarity, which only requires a constant mean, a constant variance, and a covariance that depends only on the lag.

(from Time Series From Scratch — Stationarity Tests and Automation)

# ============================================================

### Strong stationarity

Strong stationarity requires the shift-invariance (in time) of the finite-dimensional distributions of a stochastic process. This means that the distribution of a finite sub-sequence of random variables of the stochastic process remains the same as we shift it along the time index axis. For example, all i.i.d. stochastic processes are stationary.³

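Formally, the discrete stochastic process 𝑿={xᵢ ; i∈ℤ} is stationary if

$$F_X(x_{t_1+\tau}, \ldots, x_{t_n+\tau}) = F_X(x_{t_1}, \ldots, x_{t_n})$$

for any finite index set T={t₁,…,tₙ}⊂ℤ with n∈ℕ and any τ∈ℤ, where F_X denotes the joint cumulative distribution function of the indexed variables. [Cox & Miller, 1965] For continuous stochastic processes the condition is similar, with T⊂ℝ, n∈ℕ and any τ∈ℝ instead.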

This is the most common definition of stationarity, and it is commonly referred to simply as stationarity. It is sometimes also referred to as strict-sense stationarity or strong-sense stationarity.

    Note: This definition does not assume the existence/finiteness of any moment of the random variables composing the stochastic process!

### Weak stationarity

Weak stationarity only requires the shift-invariance (in time) of the first moment and the cross moment (the auto-covariance). This means the process has the same mean at all time points, and that the covariance between the values at any two time points, t and t−k, depends only on k, the difference between the two times, and not on the location of the points along the time axis.

Formally, the process {xᵢ ; i∈ℤ} is weakly stationary if:
1. The first moment of xᵢ is constant; i.e. ∀i, E[xᵢ]=𝜇
2. The second moment of xᵢ is finite for all i; i.e. ∀i, E[xᵢ²]<∞ (which of course also implies E[(xᵢ-𝜇)²]<∞; i.e. that the variance is finite for all i)
3. The cross moment, i.e. the auto-covariance, depends only on the difference u-v; i.e. ∀u,v,a, cov(xᵤ, xᵥ)=cov(xᵤ₊ₐ, xᵥ₊ₐ)

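The third condition implies that every lag 𝜏∈ℕ has a constant covariance value associated with it. Writing γ(𝜏) for the auto-covariance at lag 𝜏:

$$\operatorname{cov}(x_t, x_{t-\tau}) = \gamma(\tau) \quad \text{for all } t$$

Note that this directly implies that the variance of the process is also constant, since for all t we get

$$\operatorname{Var}(x_t) = \operatorname{cov}(x_t, x_t) = \gamma(0) = \sigma^2$$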

This paints a specific picture of weakly stationary processes as those with constant mean and variance. Their properties are contrasted nicely with those of their counterparts in Figure 2 below.

IMAGE

Other common names for weak stationarity are wide-sense stationarity, weak-sense stationarity, covariance stationarity and second order stationarity². Confusingly enough, it is also sometimes referred to simply as stationarity, depending on context (see [Boshnakov, 2011] for an example); in geo-statistical literature, for example, this is the dominant notion of stationarity. [Myers, 1989]

    Note: Strong stationarity does not imply weak stationarity, nor does the latter imply the former! An exception is the class of Gaussian processes, for which weak stationarity does imply strong stationarity.
    The reason strong stationarity does not imply weak stationarity is that it does not mean the process necessarily has a finite second moment; e.g. an IID process with standard Cauchy distribution is strictly stationary but has no finite second moment⁴ (see [Myers, 1989]). Indeed, having a finite second moment is a necessary and sufficient condition for the weak stationarity of a strongly stationary process.
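
The Cauchy counterexample can be illustrated with a short simulation. This is only a sketch: the sample size and the checkpoints are arbitrary assumptions, and a simulation can suggest, but not prove, that a moment does not exist.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 100_000

normal_iid = rng.normal(size=n)           # finite variance: strictly and weakly stationary
cauchy_iid = rng.standard_cauchy(size=n)  # no finite variance: strictly stationary only

# Sample variance over growing prefixes of each series
checkpoints = [1_000, 10_000, 100_000]
normal_var = [normal_iid[:m].var() for m in checkpoints]
cauchy_var = [cauchy_iid[:m].var() for m in checkpoints]

print("normal:", np.round(normal_var, 3))
print("cauchy:", np.round(cauchy_var, 1))
```

The normal sample variance settles near 1 as more data arrive, whereas the Cauchy sample variance is dominated by a few extreme draws and does not settle.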

White Noise Process: A white noise process is a serially uncorrelated stochastic process with a mean of zero and a constant and finite variance.

Formally, the process {xᵢ ; i∈ℤ} is a white noise process if:
1. The first moment of xᵢ is always zero; i.e. ∀i, E[xᵢ]=0
2. The second moment of xᵢ is constant and finite for all i; i.e. ∀i, E[xᵢ²]=σ²<∞
3. The cross moment E[xᵤ xᵥ] is zero when u≠v; i.e. ∀u,v w. u≠v, cov(xᵤ, xᵥ)=0

Note that this implies that every white noise process is a weakly stationary process. If, additionally, every variable xᵢ follows a normal distribution with zero mean and the same variance σ², then the process is said to be a Gaussian white noise process.
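
As a quick numerical check of the white noise conditions, the sketch below simulates Gaussian white noise and estimates its autocorrelation at the first few lags. The helper `sample_acf`, the standard deviation of 2 and the sample size are assumptions made for this example.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag."""
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[: len(x) - k], x[k:]) / denom for k in range(max_lag + 1)])

rng = np.random.default_rng(3)
gaussian_white_noise = rng.normal(loc=0.0, scale=2.0, size=20_000)

acf = sample_acf(gaussian_white_noise, max_lag=5)
print(np.round(acf, 3))  # lag 0 is 1 by construction; the other lags should be near 0
```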

(from Stationarity in time series analysis)





### N-th order stationarity

Very close to the definition of strong stationarity, N-th order stationarity demands the shift-invariance (in time) of the distribution of any n samples of the stochastic process, for all n up to order N.

Thus, the same condition is required:

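$$F_X(x_{t_1+\tau}, \ldots, x_{t_n+\tau}) = F_X(x_{t_1}, \ldots, x_{t_n})$$

for any index set T={t₁,…,tₙ}⊂ℤ with n∈{1,…,N} and any τ∈ℤ, where F_X denotes the joint cumulative distribution function.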

Naturally, stationarity to a certain order N does not imply stationarity of any higher order (but the converse is true). An interesting thread in mathoverflow showcases both an example of a 1st order stationary process that is not 2nd order stationary, and an example of a 2nd order stationary process that is not 3rd order stationary.

Note that stationarity of the N-th order for N=2 is surprisingly not equivalent to weak stationarity, even though the latter is sometimes referred to as second-order stationarity. [Myers, 1989] Like with strong stationarity, the condition which 2nd order stationarity sets for the distribution of any two samples of 𝑿 does not imply that 𝑿 has finite moments. And similarly, having a finite second moment is a sufficient and necessary condition for a 2nd order stationary process to also be a weakly stationary process.

### First-order stationarity

The term first-order stationarity is sometimes used to describe a series whose mean never changes with time, but for which any other moment (like variance) can change. [Boshnakov, 2011]

Again, note that this definition is not equivalent to N-th order stationarity for N=1, as the latter entails that xᵢ are all identically distributed for a process 𝑿={xᵢ ; i∈ℤ}. For example, a process where xᵢ~𝓝(𝜇,f(i)) where f(i)=1 for even values of i and f(i)=2 for odd values has a constant mean over time, but xᵢ are not identically distributed. As a result, such a process satisfies this specific definition of first-order stationarity, but not N-th order stationarity for N=1.
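
The example process above is easy to simulate. In this sketch the second parameter of 𝓝 is read as a variance, and the value of `mu` and the sample size are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(11)
n, mu = 200_000, 5.0

i = np.arange(n)
variance = np.where(i % 2 == 0, 1.0, 2.0)        # f(i): 1 for even i, 2 for odd i
x = rng.normal(loc=mu, scale=np.sqrt(variance))  # scale is a standard deviation

even, odd = x[0::2], x[1::2]
print("means    :", even.mean(), odd.mean())
print("variances:", even.var(), odd.var())
```

Both sub-series share the same mean, but their variances differ, so the variables are not identically distributed.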

### Cyclostationarity

A stochastic process is cyclostationary if the joint distribution of any set of samples is invariant over a time shift of mP, where m∈ℤ and P∈ℕ is the period of the process:

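$$F_X(x_{t_1+mP}, \ldots, x_{t_n+mP}) = F_X(x_{t_1}, \ldots, x_{t_n})$$

for any time indices t₁,…,tₙ, where F_X denotes the joint cumulative distribution function.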

Cyclostationarity is prominent in signal processing.
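
A toy cyclostationary process can be sketched as independent noise whose standard deviation depends only on the phase within the period. The period `P = 4` and the `phase_std` values are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
P, n_cycles = 4, 50_000

# Standard deviation depends only on the phase within the period (assumed toy model)
phase_std = np.array([1.0, 2.0, 3.0, 4.0])
x = rng.normal(size=(n_cycles, P)) * phase_std  # row = one cycle, column = phase

var_by_phase = x.var(axis=0)                     # differs from phase to phase
var_first_half = x[: n_cycles // 2].var(axis=0)  # but is stable from cycle to cycle
var_second_half = x[n_cycles // 2 :].var(axis=0)

print(np.round(var_by_phase, 2))
print(np.round(var_first_half, 2), np.round(var_second_half, 2))
```

The variance is not constant over time, so the process is not stationary, yet it repeats with period `P`, so shifting by whole periods leaves the statistics unchanged.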

IMAGE

### Trend stationarity

A stochastic process is trend stationary if an underlying trend (function solely of time) can be removed, leaving a stationary process. Meaning, the process can be expressed as yᵢ=f(i)+εᵢ, where f(i) is any function f:ℝ→ℝ and εᵢ is a stationary stochastic process with a mean of zero.

In the presence of a shock (a significant and rapid one-off change to the value of the series), trend-stationary processes are mean-reverting; i.e. over time, the series will converge again towards the growing (or shrinking) mean, which is not affected by the shock.
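
Removing a deterministic trend can be sketched with a least-squares fit. The linear form of f, the coefficients and the white-noise residual are assumptions made for this example; the definition above allows any function of time.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 2_000
t = np.arange(n)

# y[i] = f(i) + eps[i] with a linear f (assumed); eps is zero-mean white noise
true_slope, true_intercept = 0.05, 10.0
y = true_intercept + true_slope * t + rng.normal(scale=1.0, size=n)

# Estimate the trend by least squares and subtract it
slope, intercept = np.polyfit(t, y, deg=1)
residual = y - (intercept + slope * t)

print(slope, intercept)
print(residual[: n // 2].mean(), residual[n // 2 :].mean())
```

After subtracting the fitted line, the residual should fluctuate around zero in both halves of the sample.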

IMAGE

### Joint stationarity

Intuitive extensions exist of all of the above types of stationarity for pairs of stochastic processes. For example, for a pair of stochastic processes 𝑿 and 𝒀, joint strong stationarity is defined by the same condition of strong stationarity, but is simply imposed on the joint cumulative distribution function of the two processes. Weak stationarity and N-th order stationarity can be extended in the same way (the latter to M-N-th order joint stationarity).

### The intrinsic hypothesis

A weaker form of weak stationarity, prominent in geostatistical literature (see [Myers 1989] and [Fischer et al. 1996], for example). The intrinsic hypothesis holds for a stochastic process 𝑿={Xᵢ} if:

1. The expected difference between values at any two places separated by distance r is zero: E[xᵢ-xᵢ₊ᵣ]=0
2. The variance of differences, given by Var[xᵢ-xᵢ₊ᵣ], exists (i.e. it’s finite) and depends only on the distance r.

This notion implies weak stationarity of the difference Xᵢ-Xᵢ₊ᵣ, and was extended with a definition of N-th order intrinsic hypothesis.
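
A random walk is a convenient example of a process that is not weakly stationary but does satisfy the two conditions above. The sketch below checks this numerically; the helper `increment_stats` and the simulation settings are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(21)
n = 400_000
x = np.cumsum(rng.normal(size=n))  # random walk with unit-variance steps (assumed example)

def increment_stats(x, r):
    """Mean and variance of x[i] - x[i+r] over all valid i."""
    d = x[:-r] - x[r:]
    return d.mean(), d.var()

for r in (1, 5, 20):
    mean_r, var_r = increment_stats(x, r)
    # The mean of the differences is near 0 and their variance is near r
    print(r, round(mean_r, 3), round(var_r, 2))
```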

### Locally stationary stochastic processes

An important class of non-stationary processes are locally stationary (LS) processes. One intuitive definition for LS processes, given in [Cardinali & Nason, 2010], is that their statistical properties change slowly over time. Alternatively, [Dahlhaus, 2012] defines them (informally) as processes which locally at each time point are close to a stationary process but whose characteristics (covariances, parameters, etc.) are gradually changing in an unspecific way as time evolves. A formal definition can be found in [Vogt, 2012], and [Dahlhaus, 2012] provides a rigorous review of the subject.

LS processes are of importance because they somewhat bridge the gap between the thoroughly explored sub-class of parametric non-stationary processes (see the following section) and the uncharted waters of the wider family of non-parametric processes, in that they have received rigorous treatment and a corresponding set of analysis tools akin to those enjoyed by parametric processes. A great online resource on the topic is the home page of Prof. Guy Nason, who names LS processes as his main research interest.

### The typology of notions of stationarity

The following typology figure, partial as it may be, can help understand the relations between the different notions of stationarity we just went over:

IMAGE

### Parametric notions of non-stationarity

The definitions of stationarity presented so far have been non-parametric; i.e., they did not assume a model for the data-generating process, and thus apply to any stochastic process. The related concepts of difference stationarity and unit root processes, however, require a brief introduction to stochastic process modeling.

The topic of stochastic modeling is also relevant insofar as various simple models can be used to create stochastic processes (see figure 5).

IMAGE


(from Stationarity in time series analysis)



<br><br><br><br><br><br>
<br><br><br><br><br><br>
<br><br><br><br><br><br>

# ==================================================================

## 3️⃣ Why is Stationarity Important?



Many statistical and machine learning models assume that data is stationary. If the properties of the series change over time, it becomes harder to model or forecast accurately.

# ============================================================

Before diving into formal definitions of stationarity, and the related concepts upon which it builds, it is worth considering why the concept of stationarity has become important in time series analysis and its various applications.

In the most intuitive sense, stationarity means that the statistical properties of a process generating a time series do not change over time. It does not mean that the series does not change over time, just that the way it changes does not itself change over time. The algebraic equivalent is thus a linear function, perhaps, and not a constant one; the value of a linear function changes as 𝒙 grows, but the way it changes remains constant — it has a constant slope; one value that captures that rate of change.

IMAGE

Why is this important? First, because stationary processes are easier to analyze. Without a formal definition for processes generating time series data (yet; they are called stochastic processes and we will get to them in a moment), it is already clear that stationary processes are a sub-class of a wider family of possible models of reality. This sub-class is much easier to model and investigate. The above informal definition also hints that such processes should be possible to predict, as the way they change is predictable.

Although it sounds a bit streetlight effect-ish that simpler theories or models should become more prominent, it is actually quite a common pattern in science, and for good reason. In many cases simple models can be surprisingly useful, either as building blocks in constructing more elaborate ones, or as helpful approximations to complex phenomena. As it turns out, this is also true for stationary processes.

Due to these properties, stationarity has become a common assumption for many practices and tools in time series analysis. These include trend estimation, forecasting and causal inference, among others.

The final reason, thus, for stationarity’s importance is its ubiquity in time series analysis, making the ability to understand, detect and model it necessary for the application of many prominent tools and procedures in time series analysis. Indeed, for many cases involving time series, you will find that you have to be able to determine if the data was generated by a stationary process, and possibly to transform it so it has the properties of a sample generated by such a process.

Hopefully, I have convinced you by now that understanding stationarity is important if you want to deal with time series data, and we can proceed to introducing the subject more formally.

(from Stationarity in time series analysis)

# ============================================================

## Why Stationarity is a Big Deal

Imagine your predictive models as expert navigators sailing through the sea of data. To navigate smoothly, they prefer calm waters — that’s where stationarity comes in. Stationary data is like a serene ocean, where patterns stay consistent. But, if your data is a stormy sea with waves of ups and downs (non-stationary), accurate predictions become a real challenge. That’s why we need to spot these storms and transform our data into a peaceful pond for effective time series analysis.

### Real-world Implications

Data stationarity isn’t just a tech thing; it’s everywhere, influencing decisions from finance to predicting the weather. In finance, where precision is key for risk and return estimates, assuming stationarity is like having a reliable compass. Climate scientists rely on stationary models to predict long-term weather patterns — it’s like having a trustworthy weather app for Earth’s future.

### Journey to Insightful Analysis

Getting our data stationary is more than a tech quest; it’s an adventure toward clarity. It’s like transforming a chaotic treasure map into a clear guide that helps analysts and decision-makers make sense of it all. In the dynamic world of time-dependent data, stationarity becomes our trusty map, guiding us to understand the patterns beneath the surface and making our journey through data waters much smoother.

Alright, now that we get why it’s cool to have calm data, let’s learn how to make it chill. But wait, before we get our hands dirty with code, let me introduce you to something called “unit roots.” Think of them as the special ingredients that affect how our data behaves. Knowing about unit roots is like having a secret recipe to turn our wavy, wild data into a smooth pond, ready for us to dive in and explore. So, get ready for the next part of our journey!

(from Understanding Predictive Maintenance — Unit Roots and Stationarity)

# ============================================================

The question remains: why do we need to ensure our time series is stationary?

Well, there are a few reasons:

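- Most forecasting models assume the data is stationary
- Making a series stationary (for example by differencing) often reduces the dependence between data points, although a stationary series can still be autocorrelated
- Stationary data is, in general, easier to analyse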

(from Stationarity For Time Series)

# ============================================================

Some time series forecasting models (e.g., autoregressive models) require a stationary time series because they are easier to model due to their constant statistical properties [3]. Thus, you should make your time series stationary if it is not (see What Can You Do When A Time Series Is Not Stationary?).

(from Stationarity in Time Series — A Comprehensive Guide)

# ============================================================

That’s clear now, but why do we need stationarity? 2 reasons (the most important ones), my friend:

1. Stationary processes are easier to analyze
2. Stationarity is assumed by most of the algorithms


(from What is Stationarity in Time Series and why should you care)

# ============================================================

You should care about stationarity for two reasons:

- Stationary processes are easier to analyze.
- Most forecasting algorithms assume a series is stationary.

(from Time Series From Scratch — Stationarity Tests and Automation)

# ============================================================

# Detecting stationarity in time series data

Stationarity is an important concept in time series analysis. For a concise (but thorough) introduction to the topic, and the reasons that make it important, take a look at my previous blog post on the topic. Without reiterating too much, suffice it to say that:

1. Stationarity means that the statistical properties of a time series (or rather the process generating it) do not change over time.
2. Stationarity is important because many useful analytical tools and statistical tests and models rely on it.

As such, the ability to determine whether a time series is stationary is important. Rather than deciding between two strict options, this usually means being able to ascertain, with high probability, that a series is generated by a stationary process.

<br><br><br><br><br><br>
<br><br><br><br><br><br>
<br><br><br><br><br><br>

## 2️⃣ What if the Data is Not Stationary?

The figure below is a clear example of what non-stationary data looks like. The plot on the left has a strong positive trend with strong seasonality. Although this tells us a lot about the characteristics of the data, the series is not stationary, so traditional time series models that assume stationarity cannot be applied to it directly. We need to transform the data in order to flatten the increasing variance.

IMAGE

Since the data is non-stationary, you could perform a transformation to convert it into a stationary dataset. The most common transforms are differencing and the logarithmic transform.

(from Why Does Stationarity Matter in Time Series Analysis?)

# ============================================================

## What Can You Do When A Time Series Is Not Stationary?

You can apply different transformations to a non-stationary time series to try to make it stationary:

- Differencing
- Detrending by model fitting
- Log transformation
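
As a minimal sketch of two of these transformations combined, the example below applies a log transform followed by first differencing. The simulated series (a geometric random walk with drift) and its parameters are assumptions chosen so that both the level and the spread grow over time.

```python
import numpy as np

rng = np.random.default_rng(13)
n, growth = 5_000, 0.01

# Geometric random walk with drift (assumed toy series):
# both the level and the size of the fluctuations grow over time
log_level = np.cumsum(growth + rng.normal(scale=0.02, size=n))
series = 100.0 * np.exp(log_level)

log_diff = np.diff(np.log(series))  # log transform, then first difference

first, second = log_diff[: n // 2], log_diff[n // 2 :]
print(first.mean(), second.mean())
print(first.std(), second.std())
```

The transformed series should show a similar mean and spread in both halves, even though the raw series grows by many orders of magnitude. Whether a log transform is appropriate depends on the data; it requires strictly positive values.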


# ============================================================


<br><br><br><br><br><br>
<br><br><br><br><br><br>
<br><br><br><br><br><br>
<br><br><br><br><br><br>
<br><br><br><br><br><br>

# =======================================================================
## 4️⃣ Detecting stationarity

## Dickey-Fuller Test (DF)

# ============================================================

#### The Dickey-Fuller Test

The Dickey-Fuller test was the first statistical test developed to test the null hypothesis that a unit root is present in an autoregressive model of a given time series, and that the process is thus not stationary. The original test treats the case of a simple lag-1 AR model. The test has three versions, which differ in the model of the unit root process they test for:

1. Test for a unit root: ∆yᵢ = δyᵢ₋₁ + uᵢ
2. Test for a unit root with drift: ∆yᵢ = a₀ + δyᵢ₋₁ + uᵢ
3. Test for a unit root with drift and deterministic time trend:
    ∆yᵢ = a₀ + a₁*t + δyᵢ₋₁ + uᵢ

The choice of which version to use, which can significantly affect the size and power of the test, can rely on prior knowledge or on structured strategies for series of ordered tests, allowing the discovery of the most fitting version.
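
The first version of the test (no drift, no trend) can be sketched with plain NumPy by regressing ∆yᵢ on yᵢ₋₁ and computing the t-statistic of δ. This is an illustrative sketch, not the statsmodels implementation; the function name `df_statistic` and the simulated series are assumptions. The statistic has to be compared against Dickey-Fuller critical values rather than Student-t ones (roughly -1.95 at the 5% level for this version in large samples).

```python
import numpy as np

def df_statistic(y):
    """OLS estimate and t-statistic of delta in dy[i] = delta * y[i-1] + u[i]."""
    y = np.asarray(y, dtype=float)
    dy, y_lag = np.diff(y), y[:-1]
    delta = np.dot(y_lag, dy) / np.dot(y_lag, y_lag)  # OLS slope without intercept
    resid = dy - delta * y_lag
    sigma2 = np.dot(resid, resid) / (len(dy) - 1)     # residual variance
    std_err = np.sqrt(sigma2 / np.dot(y_lag, y_lag))
    return delta, delta / std_err

rng = np.random.default_rng(2)
white_noise = rng.normal(size=1_000)             # no unit root
random_walk = np.cumsum(rng.normal(size=1_000))  # unit root (delta = 0)

delta_wn, t_wn = df_statistic(white_noise)
delta_rw, t_rw = df_statistic(random_walk)
print(delta_wn, t_wn)
print(delta_rw, t_rw)
```

For the white noise the estimate of δ is close to -1 and the statistic is strongly negative, so the unit root hypothesis is rejected. For the random walk the estimate of δ is close to 0.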

Extensions of the test were developed to accommodate more complex models and data; these include the Augmented Dickey-Fuller (ADF) (using AR of any order p and supporting modeling of time trends), the Phillips-Perron test (PP) (adding robustness to unspecified autocorrelation and heteroscedasticity) and the ADF-GLS test (locally de-trending data to deal with constant and linear trends).

Python implementations can be found in the statsmodels and ARCH packages.

(from Detecting stationarity in time series data)


## Augmented Dickey-Fuller Test (ADF)

# ============================================================

### 📏 Augmented Dickey-Fuller Test (ADF)

* H₀: The series is **non-stationary**
* H₁: The series is **stationary**

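```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

# Example inputs (assumed here, since the original series are not defined in this cell)
rng = np.random.default_rng(0)
stationary_series = rng.normal(size=500)       # white noise
random_walk = np.cumsum(rng.normal(size=500))  # unit-root process

def adf_test(ts, title="ADF Test"):
    """Run the ADF test and print the statistic, p-value and critical values."""
    print(f"--- {title} ---")
    result = adfuller(ts)
    print(f"ADF Statistic: {result[0]:.4f}")
    print(f"p-value: {result[1]:.4f}")
    for key, value in result[4].items():
        print(f"Critical Value ({key}): {value:.4f}")
    # Rejecting H0 (unit root) at the 5% level is read as evidence of stationarity
    print("Stationary" if result[1] < 0.05 else "Non-Stationary")
    print()

adf_test(stationary_series, "ADF Test - Stationary Series")
adf_test(random_walk, "ADF Test - Random Walk")
```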


# ============================================================

There are several unit root tests you can use to check for stationarity. This article will focus on the most popular ones:

- Augmented Dickey-Fuller test [2]
- Kwiatkowski-Phillips-Schmidt-Shin test [4].

### How to test for stationarity with Augmented Dickey-Fuller test

The hypotheses for the Augmented Dickey-Fuller (ADF) test are [2]:

1. Null hypothesis (H0): The time series is not stationary because there is a unit root (if p-value > 0.05)
2. Alternative hypothesis (H1): The time series is stationary because there is no unit root (if p-value ≤ 0.05)

In Python, we can use the adfuller function from the statsmodels.tsa.stattools module [8].

from statsmodels.tsa.stattools import adfuller
result = adfuller(df["example"].values)

The time series is stationary if we can reject the null hypothesis of the ADF test:

- If the p-value (result[1]) ≤ 0.05
- If the test statistic (result[0]) is more extreme than the critical value (result[4]["1%"], result[4]["5%"], and result[4]["10%"])

IMAGE

Below are the results from the ADF test for the sample dataset:

IMAGE

(from Stationarity in Time Series — A Comprehensive Guide)

# ============================================================

## Augmented Dickey-Fuller (ADF) Test

Although the visual test is a quick-and-dirty method to detect stationarity, most cases will not be as easy as the one above. Statistical tests let us check for stationarity more rigorously. The ADF test, a type of unit root test, reports how strongly the data contradict the null hypothesis. A p-value below a chosen threshold (1% or 5%) suggests we reject the null hypothesis.

    Null Hypothesis H0 = If failed to be rejected, it suggests the time series has a unit root, meaning it is non-stationary

    Alternative Hypothesis H1 = The null hypothesis is rejected and suggests the time series does not have a unit root, meaning it is stationary

The easiest way to implement this test into your code is to use the adfuller() function in the statsmodels library.

(from Why Does Stationarity Matter in Time Series Analysis?)

```python
from statsmodels.tsa.stattools import adfuller

def ADF_Cal(x):
    """Run the ADF test and report the strictest level at which H0 is rejected."""
    result = adfuller(x)
    ADF_stat = result[0]
    p = result[1]
    print("ADF Statistic: %f" % ADF_stat)
    print("p-value: %f" % p)
    print("Critical Values")
    levels = [.01, .05, .1]
    rejected = False
    # statsmodels returns the critical values keyed '1%', '5%', '10%', in that order
    for level, (key, value) in zip(levels, result[4].items()):
        print('\t%s: %.3f' % (key, value))
        if ADF_stat < value:
            # Statistic is more negative than the critical value: reject H0 at this level
            print("Reject H0 (unit root) at the {:.0f}% significance level".format(level * 100))
            rejected = True
            break
    if not rejected:
        print("Cannot reject H0 at the 10% level: no evidence that the data is stationary")

print("Calculating ADF test for X...")
ADF_Cal(X)  # X is the series under test, defined elsewhere in the source notebook
```

# ============================================================

## Testing For Stationarity

Visually, the data is now stationary. However, there are more quantitative techniques to determine if the data is indeed stationary.

One such method is the Augmented Dickey-Fuller (ADF) test. This is a statistical hypothesis test (a unit root test) whose null hypothesis is that the series is non-stationary.

The statsmodels package provides an easy-to-use function for carrying out the ADF test:

(from Stationarity For Time Series)

```python
from statsmodels.tsa.stattools import adfuller

def adf_test(series):
    """Using an ADF test to determine if a series is stationary"""
    test_results = adfuller(series)
    print('ADF Statistic: ', test_results[0])
    print('P-Value: ', test_results[1])
    print('Critical Values:')
    for thres, adf_stat in test_results[4].items():
        print('\t%s: %.2f' % (thres, adf_stat))


adf_test(data["Passenger_Diff_Log"][1:])
```

Running this function we get the following output:

ADF Statistic: -2.717131
P-Value: 0.071121
Critical Values:
        1%: -3.48
        5%: -2.88
        10%: -2.58

Our ADF p-value (7.1%) lies between 5% and 10%, so whether we reject the null hypothesis depends on the significance level we choose: we reject at 10% but fail to reject at 5%.

We could difference the series further to strengthen the evidence of stationarity if we want.

    If you're interested in how the ADF test works mathematically, refer to the links I provided in the references section.

The ADF test is not the only test available for stationarity. There is also the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test, whose null hypothesis is reversed: the series is stationary (around a level or a deterministic trend).

    To learn more about the process of hypothesis testing, see the references section.

# ============================================================


## Testing for Stationarity

Some time back, two good fellas named David Dickey and Wayne Fuller developed a test for stationarity. You guessed it, it's called the Dickey-Fuller test, or DF test for short. Later, an improved version was developed that accounts for higher-order autocorrelation by adding lagged differences to the test regression; it's called the Augmented Dickey-Fuller test (ADF test).

The entire test boils down to a simple hypothesis testing, where:

- H0: Time series is not stationary
- HA: Time series is stationary

This means that we can easily calculate the test statistic and compare it to critical values. If the test statistic is lower than the critical value, we can reject the null hypothesis and declare the time series stationary.

The ADF test from Python's statsmodels library returns the following:

- Test statistic
- P-value
- Number of lags used
- Number of observations used
- 1%, 5%, and 10% critical values
- The maximized information criterion (only relevant to the automatic lag selection; you can usually ignore it)

For simplicity's sake, I will compare the p-value to a 0.05 significance level, but you can, later on, compare the test statistic to the 1% critical value if you want. Without further ado, let's get started!

## Imports and Dataset

With regards to the libraries you'll need, two of them are the usual suspects, Numpy and Pandas, but you'll also need to import stattools from the statsmodels library:

CODE IMAGE

Now you can read in the dataset from the provided URL, and do some setup to make everything as it needs to be:

CODE IMAGE

If you've done any time series analysis before, I'm sure you're familiar with this dataset. For those who are not, this is what the first couple of rows look like:

CODE IMAGE

Let’s make a quick visualization also, just to eyeball if the time series is stationary by default:

IMAGE

Just with a quick look, it’s easy to determine the time series is not stationary. The average value changes over time and the peaks in the seasonal periods seem to get only larger.

It would be nice, however, to determine stationarity analytically. That’s what the next section will cover.

## Performing the ADF-Test

Remember the import you did from the statsmodels library? We're gonna use it now to test for stationarity. The stattools module contains the adfuller function, to which you can pass your time series data:

CODE IMAGE

Well, the situation is not great. As expected, the time series isn't stationary, and the p-value (0.99) gives no grounds to reject the null hypothesis. Let's explore differencing the series, that is, subtracting the previous value from the current one. The pandas method is called diff(); its periods argument (default 1) sets how many steps back the subtracted value lies:


CODE IMAGE

After the same test is performed on the differenced time series, the p-value drops to just slightly above the usual significance level, which is not quite satisfactory yet.

In case you're wondering why we're dropping missing values, here's the reason:

CODE IMAGE

As you can see, there is nothing to subtract from the first value, so differencing leaves a missing value there. The ADF test will fail if a time series with missing data is provided, so keep that in mind.
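Because the original code cells survive only as image placeholders, here is a minimal sketch of the missing-value behaviour. The toy values are made up for illustration and are not the airline passengers data. It also shows a detail that is easy to trip over: in pandas, `diff(periods=2)` is a lag-2 difference, while second-order differencing means applying `diff()` twice.

```python
import pandas as pd

# Toy series (hypothetical values, not the dataset used in the article)
series = pd.Series([112, 118, 132, 129, 121])

# First-order differencing: each value minus the previous one
diff_1 = series.diff()                 # same as series.diff(periods=1)
print(diff_1.tolist())                 # the first entry is NaN

# Drop the leading NaN before handing the series to adfuller()
diff_1_clean = diff_1.dropna()
print(diff_1_clean.tolist())

# Second-order differencing: difference the differenced series
diff_2_clean = series.diff().diff().dropna()
print(diff_2_clean.tolist())

# Lag-2 difference: x_t - x_{t-2}, which is a different transformation
lag_2_clean = series.diff(periods=2).dropna()
print(lag_2_clean.tolist())
```

Each round of differencing costs one observation at the start of the series, which is why `dropna()` has to follow every `diff()` before testing.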

We can easily try other differencing orders to see if the p-value drops. Let's try with order = 2:

CODE IMAGE

The p-value is now below the significance level — so the time series can be declared as stationary.

Doing this entire process manually can be tedious — even unmanageable if you have to deal with lots of time series data. Let’s imagine you want to automate some portion of time series model training — this would be a great place to start if you’re gonna use algorithms that require stationary series.

That’s the reason why I decided to make a function that will handle the process for you. I won’t explain it much here, as it is properly commented:

CODE IMAGE

Now the declared function can be easily used:

CODE IMAGE

And just quickly to verify the results — we’ll test for stationarity of supposedly stationary time series:

CODE IMAGE

Looks like everything is good: the differencing order is 2 (as calculated manually), and the time series is stationary according to the p-value.

(from What is Stationarity in Time Series and why should you care)

# ============================================================

## Augmented Dickey-Fuller (ADF) helps us

Imagine you have a line of ants moving in a certain direction. The ADF test checks if the ants are marching with a purpose (stationary) or if they’re randomly scattered all over the place (non-stationary).

The ADF test involves a bit of math, but let’s simplify it:

- Null Hypothesis (H0): This is like the default assumption. The null hypothesis for ADF is that the data has a unit root, which means it’s non-stationary. It’s like saying the ants are wandering randomly.
    H0:The data has a unit root (non-stationary)
- Alternative Hypothesis (H1): This is what we’re trying to prove. The alternative hypothesis is that the data is stationary, like the ants marching in a clear line.
    H1:The data is stationary
- Test Statistic: The ADF test calculates a number called the test statistic. The more negative this number, the stronger the evidence that the data is stationary.
- P-value: This is a probability score. If the p-value is small (less than a certain threshold, like 0.05), we reject the null hypothesis in favor of the alternative, saying our data is probably stationary.

This is not very complicated: just run the test and check the p-value.

```python
from statsmodels.tsa.stattools import adfuller
# Perform the Augmented Dickey-Fuller (ADF) test for stationarity
(adf_statistic, adf_p_value, adf_lags,
 adf_nobs, adf_critical_values, adf_icbest) = adfuller(stationary_series)

# Check if the series is stationary based on the p-value
is_stationary = adf_p_value < 0.05  # Using a significance level of 0.05
```

Most of the time you will probably use the ADF test like this:

```python
# What you will probably use most of the time
_, adf_p_value, _, _, _, _ = adfuller(stationary_series)
```

But I will explain what is behind these variables:

- adf_statistic: The test statistic from the ADF test, indicating the strength of evidence against the null hypothesis of non-stationarity.
- adf_p_value: The p-value associated with the null hypothesis. A lower p-value suggests stronger evidence against non-stationarity.
- adf_lags: The number of lags used in the test.
- adf_nobs: The number of observations used in the ADF test.
- adf_critical_values: The critical values for the test statistic at various significance levels.
- adf_icbest: The maximized information criterion (AIC by default) from the automatic lag selection. The regression results themselves are only returned when adfuller is called with regresults=True.

While chaos might seem daunting, we can transform it into our ally by understanding and harnessing its patterns. In the realm of data and analysis, chaos can be a powerful force that, when properly channeled, provides insights, predictions, and a clearer path forward. It’s all about turning the unpredictable into an advantage, making chaos our strategic companion in the journey of exploration and understanding.

## Stationarity check

First, let's generate the series.

```python
# Generate series
stationary_series_pseudorandom = generate_stationary_series_pseudorandom()
stationary_series_random = generate_stationary_series_pseudorandom()


titles = [
    'stationary_series_pseudorandom',
    'stationary_series_random'
]

# Pass the series in the same order as their titles
plot_multiple_series(stationary_series_pseudorandom, stationary_series_random,
                     titles=titles)


_, adf_p_value, _, _, _, _= adfuller(stationary_series_pseudorandom)
print(f'PseudoRandom adf p-value: {adf_p_value}')
_, adf_p_value, _, _, _, _= adfuller(stationary_series_random)
print(f'TrueRandom adf p-value: {adf_p_value}')
```

When the p-value is very small (<0.05), it provides evidence against the null hypothesis, suggesting that your data is likely stationary.

So, in this case, with a p-value much smaller than 0.05, you have the confidence to say, “Yes, our data is stationary.”

Now, let’s take a moment to crunch the numbers. Our pseudorandom boasts a P-value approximately 2 million times smaller than the truly random one.

Why does this happen? Pseudorandom numbers are generated by algorithms, introducing a level of determinism. These algorithms can unintentionally introduce patterns or structure into the data. On the other hand, truly random data, like atmospheric noise, is more likely to exhibit the characteristics of pure randomness. The ADF test, keen on detecting patterns indicative of non-stationarity, may find less evidence of such patterns in truly random data, leading to a relatively higher P-value.

(from Understanding Predictive Maintenance — Unit Roots and Stationarity)

# ============================================================

## ADF test — How to test for stationarity

A while back, David Dickey and Wayne Fuller developed a test for stationarity: the Dickey-Fuller test. It was later improved and renamed the Augmented Dickey-Fuller test, or ADF test for short.

It boils down to a simple hypothesis test:

- Null hypothesis (H0) — Time series is not stationary.
- Alternative hypothesis (H1) — Time series is stationary.

In Python, the ADF test returns the following:

- Test statistic
- P-value
- Number of lags used
- Number of observations used
- 1%, 5%, and 10% critical values
- The maximized information criterion (don't worry about it)

If the returned p-value is higher than 0.05, we fail to reject the null hypothesis and treat the time series as non-stationary. 0.05 is the standard threshold, but you're free to change it.

Let’s implement the ADF test next. We’ll start with the library imports, dataset loading, and visualization for the airline passengers dataset:


```python
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller

import matplotlib.pyplot as plt
from matplotlib import rcParams
from cycler import cycler

rcParams['figure.figsize'] = 18, 5
rcParams['axes.spines.top'] = False
rcParams['axes.spines.right'] = False
rcParams['axes.prop_cycle'] = cycler(color=['#365977'])
rcParams['lines.linewidth'] = 2.5


# Load
df = pd.read_csv('data/airline-passengers.csv', index_col='Month', parse_dates=True)

# Visualize
plt.title('Airline Passengers dataset', size=20)
plt.plot(df);
```


Here's what the dataset looks like:

IMAGE

It doesn’t look stationary at all, but let’s verify that with a test:

    # ADF stationarity test
    # Returns: (test statistic, p-value, number of lags used, number of observations used, {critical values}, maximized information criterion)
    adfuller(df['Passengers'])

(from Time Series From Scratch — Stationarity Tests and Automation)

# ============================================================



# ============================================================



## KPSS Test

# ============================================================

### 📏 KPSS Test (opposite hypothesis)

* H₀: The series is **stationary**
* H₁: The series is **non-stationary**

```python
from statsmodels.tsa.stattools import kpss

def kpss_test(ts, title="KPSS Test"):
    print(f"--- {title} ---")
    statistic, p_value, lags, crit = kpss(ts, regression='c', nlags="auto")
    print(f"KPSS Statistic: {statistic:.4f}")
    print(f"p-value: {p_value:.4f}")
    for key, value in crit.items():
        print(f"Critical Value ({key}): {value:.4f}")
    print("Non-Stationary" if p_value < 0.05 else "Stationary")
    print()

kpss_test(stationary_series, "KPSS Test - Stationary Series")
kpss_test(random_walk, "KPSS Test - Random Walk")
```

# ============================================================


There are several unit root tests you can use to check for stationarity. This article will focus on the most popular ones:

- Augmented Dickey-Fuller test [2]
- Kwiatkowski-Phillips-Schmidt-Shin test [4].

### How to test for stationarity with Kwiatkowski-Phillips-Schmidt-Shin test

The hypotheses for the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test are [4]:

1. Null hypothesis (H0): The time series is stationary because there is no unit root (if p-value > 0.05)
2. Alternative hypothesis (H1): The time series is not stationary because there is a unit root (if p-value ≤ 0.05)

The more positive the KPSS test statistic, the more likely we are to reject the null hypothesis (and conclude that we have a non-stationary time series).

In Python, we can use the kpss function from the statsmodels.tsa.stattools module [9]. Passing the argument regression = 'ct' specifies that the test's null hypothesis is that the data is trend stationary; the default, regression = 'c', tests for stationarity around a constant level. [9]

    from statsmodels.tsa.stattools import kpss

    result = kpss(df["example"].values,
                  regression="ct")

The time series is stationary if we fail to reject the null hypothesis of the KPSS test:

- If the p-value (result[1]) > 0.05
- If the test statistic (result[0]) is less extreme than the critical value (result[3]["1%"], result[3]["2.5%"], result[3]["5%"], and result[3]["10%"])

IMAGE

Below are the results from the KPSS test for the sample dataset:

IMAGE

(from Stationarity in Time Series — A Comprehensive Guide)

# ============================================================

#### The KPSS Test

Another prominent test for the presence of a unit root is the KPSS test. [Kwiatkowski et al, 1992] In contrast to the Dickey-Fuller family of tests, the null hypothesis assumes stationarity around a mean or a linear trend, while the alternative is the presence of a unit root.

The test is based on linear regression, breaking up the series into three parts: a deterministic trend (βt), a random walk (rₜ), and a stationary error (εₜ), with the regression equation:

$$x_t = r_t + \beta t + \varepsilon_t, \qquad r_t = r_{t-1} + u_t$$

where the uₜ are iid with mean zero and variance σ², and r₀ acts as the intercept. The null hypothesis is thus stated to be H₀: σ²=0 while the alternative is Hₐ: σ²>0. Whether the stationarity in the null hypothesis is around a mean or a trend is determined by setting β=0 (in which case x is stationary around the mean r₀) or β≠0, respectively.

The KPSS test is often used to complement Dickey-Fuller-type tests. I will touch on how to interpret such combined results in a future post.

Python implementations can be found in the statsmodels and ARCH packages.
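As a rough illustration of how the two tests complement each other, the four possible outcomes of running ADF and KPSS side by side are often read as shown below. This follows the reading given in the statsmodels documentation rather than anything stated in this post, and the function name `interpret_adf_kpss` is hypothetical. It takes p-values you have already computed, so it makes no assumptions about how the tests were run.

```python
def interpret_adf_kpss(adf_p_value, kpss_p_value, alpha=0.05):
    """Combine ADF (H0: unit root) and KPSS (H0: stationary) outcomes.

    Hypothetical helper; the four-way reading is a common heuristic,
    not a formal joint test.
    """
    adf_rejects_unit_root = adf_p_value < alpha
    kpss_rejects_stationarity = kpss_p_value < alpha

    if adf_rejects_unit_root and not kpss_rejects_stationarity:
        return "both tests agree: stationary"
    if not adf_rejects_unit_root and kpss_rejects_stationarity:
        return "both tests agree: non-stationary (unit root)"
    if not adf_rejects_unit_root and not kpss_rejects_stationarity:
        return "possibly trend-stationary: consider detrending"
    return "possibly difference-stationary: consider differencing"


print(interpret_adf_kpss(adf_p_value=0.01, kpss_p_value=0.10))
print(interpret_adf_kpss(adf_p_value=0.60, kpss_p_value=0.01))
```

Keep in mind that the statsmodels `kpss` function reports p-values only within the range 0.01 to 0.1, so a returned value at either end is a bound rather than an exact figure.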

(from Detecting stationarity in time series data)

# ============================================================



## Visually


# ============================================================

You can test for stationarity with statistical tests, but sometimes plotting a time series can give you a rough estimate. Here’s an image showing stationary vs. non-stationary series:

IMAGE

A stationary series is centered around some value, doesn’t have too many spikes and unexpected variations, and doesn’t show drastic behavior changes from one part to the other.

(from Time Series From Scratch — Stationarity Tests and Automation)


# ============================================================


## How to visually assess stationarity

You can visually assess the stationarity of a time series by mentally dividing the time series in half and comparing the mean, amplitude, and cycle length from the first half to the second half of the time series.

- constant mean: The mean value of the first half of the time series should be similar to that of the second half.
- constant variance: The amplitude of the first half of the time series should be similar to that of the second half.
- covariance is independent of time: The cycle length in the first half of the time series should be similar to that in the second half. The cycles should not depend on time (e.g., not weekly or monthly, etc.).
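The "split it in half" check on the mean and variance can also be done numerically. The sketch below is only a rough sanity check under assumed synthetic data; `compare_halves` is a hypothetical helper, and it says nothing about the covariance criterion.

```python
import numpy as np


def compare_halves(series):
    """Return ((mean, variance) of first half, (mean, variance) of second half)."""
    values = np.asarray(series, dtype=float)
    first, second = np.array_split(values, 2)
    return (first.mean(), first.var()), (second.mean(), second.var())


rng = np.random.default_rng(42)
noise = rng.normal(size=1000)               # no trend: halves should look alike
trending = noise + 0.01 * np.arange(1000)   # the mean drifts upward over time

print("noise:   ", compare_halves(noise))
print("trending:", compare_halves(trending))
```

Similar numbers in both halves do not prove stationarity; they only fail to contradict it, which is why the statistical tests are still needed.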

IMAGE

For our examples, the assessment result is visualized below:

IMAGE

(from Stationarity in Time Series — A Comprehensive Guide)

# ============================================================

## Visualizations

The most basic methods for stationarity detection rely on plotting the data, or functions of it, and determining visually whether they present some known property of stationary (or non-stationary) data.

### Looking at the data

Trying to determine whether a time series was generated by a stationary process just by looking at its plot is a dubious venture. However, there are some basic properties of non-stationary data that we can look for. Let’s take as an example the following nice plots from [Hyndman & Athanasopoulos, 2018]:

IMAGE

[Hyndman & Athanasopoulos, 2018] give several heuristics used to rule out stationarity in the above plots, corresponding to the basic characteristics of stationary processes (which we’ve discussed previously):

- Prominent seasonality can be observed in series (d), (h) and (i).
- Noticeable trends and changing levels can be seen in series (a), (c), (e), (f) and (i).
- Series (i) shows increasing variance.

The authors also add that although the strong cycles in series (g) might appear to make it non-stationary, these cycles are aperiodic, so their timing is unpredictable (they are driven by the dynamics of the lynx population, which depend partly on available food). This leaves series (b) and (g) as the only stationary series.

If, like me, you didn’t find at least some of these observations trivial to make by looking at the above figure, you are not the only one. Indeed, this is not a very dependable method to detect stationarity, and it is usually used to get an initial impression of the data rather than to make definite assertions.

## How Do You Test For Stationarity?

You can test a time series for stationarity in two ways:

1. Intuitive approach: Visual assessment
2. Statistical approach: Unit root test

For this section, we will recreate a few examples Hyndman and Athanasopoulos [3] used to explain the visual assessment of stationarity and extend their usage also to explain testing for stationarity with unit root testing. The data is taken from the related fma R-package [1].

IMAGE


## How to statistically assess stationarity — a detour on unit root tests

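A series with a unit root contains a stochastic trend; the textbook example is the random walk (with or without drift). Because the shocks to such a series accumulate rather than die out, its long-run behaviour can't be predicted, which means: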

- Unit root present: not stationary (unpredictable)
- Unit root absent: stationary
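To make the distinction concrete, here is a small simulation sketch. It assumes the simplest possible processes: a driftless random walk x_t = x_{t-1} + ε_t (a unit root) and a stationary AR(1) x_t = 0.5·x_{t-1} + ε_t, both with standard normal shocks. The parameter choices are illustrative only. With a unit root the variance across simulated paths keeps growing with time; without one it settles.

```python
import numpy as np

rng = np.random.default_rng(1)
n_paths, n_steps = 2000, 400
shocks = rng.normal(size=(n_paths, n_steps))

# Unit root: x_t = x_{t-1} + eps_t, i.e. a cumulative sum of the shocks
random_walk = shocks.cumsum(axis=1)

# Stationary AR(1): x_t = phi * x_{t-1} + eps_t with |phi| < 1
phi = 0.5
ar1 = np.zeros_like(shocks)
for t in range(1, n_steps):
    ar1[:, t] = phi * ar1[:, t - 1] + shocks[:, t]

# Variance across paths at an early and a late time point
for name, paths in [("random walk", random_walk), ("AR(1)", ar1)]:
    early, late = paths[:, 99].var(), paths[:, 399].var()
    print(f"{name}: var at t=100 -> {early:.2f}, var at t=400 -> {late:.2f}")
```

For the random walk the variance is proportional to the number of accumulated shocks, so it should roughly quadruple between the two time points, while the AR(1) variance should hover near 1/(1 − 0.5²) ≈ 1.33 at both.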

To test for stationarity with a unit root test, you will state your initial assumption in the form of two competing hypotheses [6]:

- Null hypothesis (H0) — e.g., the time series is stationary (no unit root present)
- Alternative hypothesis (H1) — e.g., the time series is not stationary (unit root present)

Then you will assess whether to reject or not to reject the null hypothesis based on two approaches:

- The p-value approach:
    If the p-value > 0.05, fail to reject the null hypothesis.
    If the p-value ≤ 0.05, reject the null hypothesis.
- The critical value approach:
    If the test statistic is less extreme than the critical value, fail to reject the null hypothesis.
    If the test statistic is more extreme than the critical value, reject the null hypothesis.
    The critical value approach should be used when the p-value is close to significant (e.g., around 0.05) [8].

There are several unit root tests you can use to check for stationarity. This article will focus on the most popular ones:

- Augmented Dickey-Fuller test [2]
- Kwiatkowski-Phillips-Schmidt-Shin test [4].

(moved from here)

(from Stationarity in Time Series — A Comprehensive Guide)



## Stationarity and ACF Plots

# ============================================================

## Looking at Autocorrelation Function (ACF) plots

Autocorrelation is the correlation of a signal with a delayed copy (a lag) of itself, as a function of the delay. When plotting the value of the ACF for increasing lags (a plot called a correlogram), the values tend to decay to zero quickly for stationary time series (see figure 1, right), while for non-stationary data the decay happens more slowly (see figure 1, left).

IMAGE

Alternatively, [Nielsen, 2006] suggests that plotting correlograms based on both autocorrelations and scaled autocovariances, and comparing them, provides a better way of discriminating between stationary and non-stationary data.
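The fast-versus-slow decay of the plain ACF is easy to see on synthetic data. The sketch below computes the sample autocorrelation by hand with NumPy (the helper `sample_acf` is hypothetical, and statsmodels offers ready-made `acf` and `plot_acf` functions) for white noise and for a random walk built from the same shocks.

```python
import numpy as np


def sample_acf(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[: len(x) - k], x[k:]) / denom
                     for k in range(max_lag + 1)])


rng = np.random.default_rng(7)
white_noise = rng.normal(size=1000)   # stationary
random_walk = white_noise.cumsum()    # non-stationary (unit root)

acf_wn = sample_acf(white_noise, 10)
acf_rw = sample_acf(random_walk, 10)
print("white noise:", np.round(acf_wn, 2))
print("random walk:", np.round(acf_rw, 2))
```

The white-noise values should drop to near zero right after lag 0, while the random-walk values should stay close to 1 across all ten lags.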


# ============================================================

## Parametric tests

Another, more rigorous approach to detecting stationarity in time series data is to use statistical tests developed to detect specific types of stationarity, namely those brought about by simple parametric models of the generating stochastic process (see my previous post for details).

I’ll present here the most prominent tests. I’ll also name Python implementations for each test, assuming I have found any. For R implementations see the CRAN Task View: Time Series Analysis (also here).

## Unit root tests

#### The Dickey-Fuller Test
(moved from here)

#### The KPSS Test
(moved from here)


#### The Zivot and Andrews Test

The aforementioned tests do not allow for the possibility of a structural break — an abrupt change involving a change in the mean or other parameters of the process. Assuming the time of the break as an exogenous phenomenon, Perron showed that the power to reject a unit root decreases when the stationary alternative is true and a structural break is ignored.

[Zivot and Andrews, 1992] propose a unit root test in which they assume that the exact time of the break-point is unknown. Following Perron’s characterization of the form of structural break, Zivot and Andrews proceed with three models to test for a unit root:

- Model A: Permits a one-time change in the level of the series.
- Model B: Allows for a one-time change in the slope of the trend function.
- Model C: Combines one-time changes in the level and the slope of the trend function of the series.

Hence, to test for a unit root against the alternative of a one-time structural break, Zivot and Andrews use the following regression equations corresponding to the above three models: [Waheed et al, 2006]

EQUATIONS

A Python implementation can be found in the ARCH package and here.

## Semi-parametric unit root tests

#### Variance Ratio Test

[Breitung, 2002] suggested a non-parametric test for the presence of a unit root based on a variance ratio statistic. The null hypothesis is that the process is I(1) (integrated of order one) while the alternative is that it is I(0). I list this test as semi-parametric because it tests for a specific, model-based, notion of stationarity.

## Non-parametric tests

In the wake of the limitations of parametric tests, and the recognition they cover only a narrow sub-class of possible cases encountered in real data, a class of non-parametric tests for stationarity has emerged in time series analysis literature.

Naturally, these tests open up a promising avenue for investigating time series data: you no longer have to assume very simple parametric models happen to apply to your data to find out whether it is stationary or not, or risk not discovering a complex form of the phenomenon not captured by these models.

The reality of it, however, is more complex; there aren’t, at the moment, any widely-applicable non-parametric tests that encompass all real-life scenarios generating time series data. Instead, these tests limit themselves to specific types of data or processes. Also, I was not able to find implementations for any of the following tests.

I’ll mention here the few that I have encountered:

### A Nonparametric Test for Stationarity in Continuous-Time Markov Processes

[Kanaya, 2011] suggests a nonparametric stationarity test for univariate time-homogeneous Markov processes only, constructs a kernel-based test statistic, and conducts Monte Carlo simulations to study the finite-sample size and power properties of the test.

### A nonparametric test for stationarity in functional time series

[Delft et al, 2017] suggest a nonparametric stationarity test limited to functional time series — data obtained by separating a continuous (in nature) time record into natural consecutive intervals, for example days. Note that [Delft and Eichler, 2018] have proposed a test for local stationarity for functional time series (see my previous post for some references on local stationarity). Also, [Vogt & Dette, 2015] suggest a nonparametric method to estimate a smooth change point in a locally stationary framework.

### A nonparametric test for stationarity based on local Fourier analysis

[Basu et al, 2009] suggest what may be the most applicable nonparametric test for stationarity presented here, as it is applicable to any zero-mean discrete-time random process (and I assume here any finite sample of a discrete process you may have can easily be transformed to have zero mean).

## Final Words

That’s it. I hope the above review gave you some idea as to how to approach the issue of detecting stationarity in your data. I also hope that it exposed you to the complexities of this task; due to the lack of implementations of the handful of nonparametric tests out there, you will be forced to make strong assumptions about your data, and interpret the results you get with the required amount of doubt.

As to the question of what to do once you have detected some type of stationarity in your data, I hope to touch on this in a future post. As always, I’d love to hear about things I’ve missed or was wrong about. Cheers!

## References

## Academic literature

- Basu, P., Rudoy, D., & Wolfe, P. J. (2009, April). A nonparametric test for stationarity based on local Fourier analysis. In 2009 IEEE International Conference on Acoustics, Speech and Signal Processing (pp. 3005–3008). IEEE.
- Breitung, J. (2002). Nonparametric tests for unit roots and cointegration. Journal of econometrics, 108(2), 343–363.
- Cardinali, A., & Nason, G. P. (2018). Practical powerful wavelet packet tests for second-order stationarity. Applied and Computational Harmonic Analysis, 44(3), 558–583.
- Hyndman, R. J., & Athanasopoulos, G. (2018). Forecasting: principles and practice. OTexts.
- Kanaya, S. (2011). A nonparametric test for stationarity in continuous-time Markov processes. Job Market Paper, University of Oxford.
- Kwiatkowski, D., Phillips, P. C., Schmidt, P., & Shin, Y. (1992). Testing the null hypothesis of stationarity against the alternative of a unit root: How sure are we that economic time series have a unit root?. Journal of econometrics, 54(1–3), 159–178.
- Nielsen, B. (2006). Correlograms for non‐stationary autoregressions. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68(4), 707–720.
- Waheed, M., Alam, T., & Ghauri, S. P. (2006). Structural breaks and unit root: evidence from Pakistani macroeconomic time series. Available at SSRN 963958.
- van Delft, A., Characiejus, V., & Dette, H. (2017). A nonparametric test for stationarity in functional time series. arXiv preprint arXiv:1708.05248.
- van Delft, A. and Eichler, M. (2018). “Locally stationary functional time series.” Electronic Journal of Statistics, 12:107–170.
- Vogt, M., & Dette, H. (2015). Detecting gradual changes in locally stationary processes. The Annals of Statistics, 43(2), 713–740.
- Zivot, E., & Andrews, D. W. K. (1992). Further evidence on the great crash, the oil-price shock, and the unit-root hypothesis. Journal of Business & Economic Statistics, 10(3), 251–270.

## Online references

- Data transformations and forecasting models: what to use and when
- Forecasting Flow Chart
- Documentation of the egcm R package
- “Non-Stationary Time Series and Unit Root Tests” by Heino Bohn Nielsen
- How to interpret Zivot & Andrews unit root test?


<br><br><br><br><br><br>
<br><br><br><br><br><br>
<br><br><br><br><br><br>

# =======================================================
## 5️⃣ Stochastic Processes

https://medium.com/data-science/stationarity-in-time-series-analysis-90c94f27322

# Stationarity in time series analysis


## A formal definition for stochastic processes

Before introducing more formal notions for stationarity, a few precursory definitions are required. This section is meant to provide a quick overview of basic concepts in time series analysis and stochastic process theory required for further reading. Feel free to skip ahead if you are familiar with them.

Time series: Commonly, a time series (x₁, …, xₑ) is assumed to be a sequence of real values taken at successive equally spaced⁶ points in time, from time t=1 to time t=e.

Lag: For some specific time point r, the observation xᵣ₋ᵢ (i periods back) is called the i-th lag of xᵣ. A time series Y generated by back-shifting another time series X by i time steps is also sometimes called the i-th lag of X, or an i-lag of X. This transformation is called both the backshift operator, commonly denoted as B(∙), and the lag operator, commonly denoted as L(∙); thus, L(Xᵣ)=Xᵣ₋₁. Powers of the operators are defined as Lⁱ(Xᵣ)=Xᵣ₋ᵢ.
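In pandas, the lag operator corresponds to `Series.shift`. The following is a minimal sketch with made-up values; note that the first i entries of an i-lag are undefined and show up as NaN.

```python
import pandas as pd

# Toy series (hypothetical values)
x = pd.Series([10, 12, 15, 11, 9], name="x")

lag_1 = x.shift(1)   # L(x_t)   = x_{t-1}
lag_2 = x.shift(2)   # L^2(x_t) = x_{t-2}

print(pd.DataFrame({"x": x, "L(x)": lag_1, "L^2(x)": lag_2}))

# Applying L twice gives the same series as L^2
print(lag_1.shift(1).equals(lag_2))
```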

## Stochastic Processes

A common approach in the analysis of time series data is to consider the observed time series as part of a realization of a stochastic process. Two cursory definitions are required before defining stochastic processes.

Probability Space: A probability space is a triple (Ω, F, P), where
(i) Ω is a nonempty set, called the sample space.
(ii) F is a σ-algebra of subsets of Ω, i.e. a family of subsets closed with respect to countable union and complement with respect to Ω.
(iii) P is a probability measure defined for all members of F.

Random Variable: A real random variable or real stochastic variable on (Ω,F,P) is a function x:Ω→ℝ, such that the inverse image of any interval (-∞,a] belongs to F; i.e. a measurable function.

We can now define what a stochastic process is.

Stochastic Process: A real stochastic process is a family of real random variables 𝑿={xᵢ(ω); i∈T}, all defined on the same probability space (Ω, F, P). The set T is called the index set of the process. If T⊂ℤ, then the process is called a discrete stochastic process. If T is an interval of ℝ, then the process is called a continuous stochastic process.

Finite Dimensional Distribution: For a finite set of integers T={t₁, …, tₙ}, the joint distribution function of 𝑿={Xᵢ(ω); i∈T} is defined by

EQUATION

Which for a stochastic process 𝑿 is also commonly denoted as:

EQUATION

The finite dimensional distribution of a stochastic process is then defined to be the set of all such joint distribution functions for all such finite integer sets T of any size n. For a discrete process it is thus the set:

EQUATION

Intuitively, this represents a projection of the process onto a finite-dimensional vector space (in this case, a finite set of time points).

### Basic concepts in stochastic process modeling

The forecasting of future values is a common task in the study of time series data. To make forecasts, some assumptions need to be made regarding the Data Generating Process (DGP), the mechanism generating the data. These assumptions often take the form of an explicit model of the process, and are also often used when modeling stochastic processes for other tasks, such as anomaly detection or causal inference. We will go over the three most common such models.

The autoregressive (AR) model: A time series modeled using an AR model is assumed to be generated as a linear function of its past values, plus a random noise/error:

$$x_t = c + \sum_{i=1}^{p} \phi_i x_{t-i} + \varepsilon_t$$

This is a memory-based model, in the sense that each value is correlated with the p preceding values; an AR model with lag p is denoted with AR(p). The coefficients 𝜙ᵢ are weights measuring the influence of these preceding values on the value xₜ, c is a constant intercept and εₜ is a univariate white noise process (commonly assumed to be Gaussian).
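The AR description above can be turned into a short simulation. This is a minimal sketch, not a reference implementation: the function name `simulate_ar`, the zero start-up values and the Gaussian noise are all assumptions chosen for illustration.

```python
import numpy as np

def simulate_ar(phi, c=0.0, size=500, sigma=1.0, seed=0):
    """Simulate x[t] = c + sum_i phi[i] * x[t-1-i] + eps[t] with Gaussian white noise."""
    rng = np.random.default_rng(seed)
    phi = np.asarray(phi, dtype=float)
    p = len(phi)
    eps = rng.normal(scale=sigma, size=size + p)
    x = np.zeros(size + p)            # the first p entries are zero start-up values
    for t in range(p, size + p):
        past = x[t - p:t][::-1]       # x[t-1], x[t-2], ..., x[t-p]
        x[t] = c + phi @ past + eps[t]
    return x[p:]                      # drop the start-up values

ar1 = simulate_ar(phi=[0.7], size=2000)
lag1_corr = np.corrcoef(ar1[:-1], ar1[1:])[0, 1]
print(f"sample lag-1 autocorrelation: {lag1_corr:.2f}")
```

For an AR(1) process the lag-1 autocorrelation should come out close to the coefficient, which is the "memory" the text refers to.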

The vector autoregressive (VAR) model generalizes the univariate case of the AR model to the multivariate case; now each element of the vector x[t] of length k can be modeled as a linear function of all the elements of the past p vectors:

EQUATION

where c is a vector of k constants (the intercepts), Aᵢ are time-invariant k×k matrices and e={eᵢ ; i∈ℤ} is a white noise multivariate process of k variables.

The moving average (MA) model: A time series modeled using a moving average model, denoted with MA(q), is assumed to be generated as a linear function of the last q+1 random shocks generated by εᵢ, a univariate white noise process:

EQUATION

As with autoregressive models, a vector generalization, VMA, exists.
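A moving average process can be sketched in the same spirit. The parameter names `theta` (shock weights) and `mu` (mean level) are assumptions, as the source does not name them; the weight on the current shock is fixed to 1, giving q+1 weighted shocks in total.

```python
import numpy as np

def simulate_ma(theta, mu=0.0, size=500, sigma=1.0, seed=0):
    """Simulate x[t] = mu + eps[t] + sum_i theta[i] * eps[t-1-i]."""
    rng = np.random.default_rng(seed)
    weights = np.concatenate(([1.0], np.asarray(theta, dtype=float)))  # q+1 shock weights
    q = len(weights) - 1
    eps = rng.normal(scale=sigma, size=size + q)
    # mode="valid" keeps only the outputs that use all q+1 shocks
    return mu + np.convolve(eps, weights, mode="valid")

def sample_autocorr(x, k):
    """Sample autocorrelation of x at lag k."""
    x = x - x.mean()
    return float(np.dot(x[:-k], x[k:]) / np.dot(x, x))

ma2 = simulate_ma(theta=[0.6, 0.3], size=5000)
print(f"lag 1: {sample_autocorr(ma2, 1):.2f}, lag 3: {sample_autocorr(ma2, 3):.2f}")
```

Because an MA(q) value only depends on the last q+1 shocks, its autocorrelation is zero in theory beyond lag q, so the lag-3 estimate for this MA(2) example should be near zero.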

The autoregressive moving average (ARMA) model: A time series modeled using an ARMA(p,q) model is assumed to be generated as a linear function of the last p values and the last q+1 random shocks generated by εᵢ, a univariate white noise process:

EQUATION

The ARMA model can be generalized in a variety of ways, for example to deal with non-linearity or with exogenous variables, to the multivariate case (VARMA) or to deal with (a specific type of) non-stationary data (ARIMA).

### Difference stationary processes

With a basic understanding of common stochastic process models, we can now discuss the related concept of difference stationary processes and unit roots. This concept relies on the assumption that the stochastic process in question can be written as an autoregressive process of order p, denoted as AR(p):

EQUATION

Where εᵢ are usually uncorrelated white-noise processes (for all times t). We can write the same process as:

EQUATION

The part inside the parentheses on the left is called the characteristic equation of the process. We can consider the roots of this equation:

EQUATION

If m=1 is a root of the equation then the stochastic process is said to be a difference stationary process, or integrated. This means that the process can be transformed into a weakly-stationary process by applying a certain type of transformation to it, called differencing.
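To see what a root at m=1 looks like numerically, here is a hedged sketch. It assumes the common convention that an AR(p) process with coefficients φ₁, …, φₚ has the characteristic equation mᵖ − φ₁mᵖ⁻¹ − … − φₚ = 0; the helper names and the tolerance are illustrative choices.

```python
import numpy as np

def characteristic_roots(phi):
    """Roots of m^p - phi_1 m^(p-1) - ... - phi_p = 0 for AR coefficients phi."""
    coeffs = np.concatenate(([1.0], -np.asarray(phi, dtype=float)))
    return np.roots(coeffs)

def count_unit_roots(phi, tol=1e-6):
    """Count roots numerically equal to 1 (the tolerance is an illustrative choice)."""
    return int(np.sum(np.abs(characteristic_roots(phi) - 1.0) < tol))

random_walk_phi = [1.0]           # x[t] = x[t-1] + eps[t]
ar2_unit_root_phi = [1.5, -0.5]   # m^2 - 1.5 m + 0.5 = (m - 1)(m - 0.5)
stationary_phi = [0.5]            # x[t] = 0.5 x[t-1] + eps[t]

for name, phi in [("random walk", random_walk_phi),
                  ("AR(2) with unit root", ar2_unit_root_phi),
                  ("stationary AR(1)", stationary_phi)]:
    print(name, characteristic_roots(phi), count_unit_roots(phi))
```

Under this convention, m=1 is a root exactly when the AR coefficients sum to 1, which gives a quick sanity check on the examples.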

IMAGE

Difference stationary processes have an order of integration, which is the number of times the differencing operator must be applied to it in order to achieve weak stationarity. A process that has to be differenced r times is said to be integrated of order r, denoted by I(r). This coincides exactly with the multiplicity of the root m=1; meaning, if m=1 is a root of multiplicity r of the characteristic equation, then the process is integrated of order r.
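The order of integration can be illustrated by construction. The sketch below builds I(1) and I(2) series by cumulatively summing white noise (an assumption made purely for illustration) and then undoes the integration with `np.diff`.

```python
import numpy as np

rng = np.random.default_rng(42)
noise = rng.normal(size=300)       # weakly stationary white noise

i1 = np.cumsum(noise)              # integrated once: I(1), a random walk
i2 = np.cumsum(i1)                 # integrated twice: I(2)

d1_of_i1 = np.diff(i1, n=1)        # one difference recovers the noise
d2_of_i2 = np.diff(i2, n=2)        # two differences are needed here

print(np.allclose(d1_of_i1, noise[1:]), np.allclose(d2_of_i2, noise[2:]))
```

Each differencing step shortens the series by one observation, so differencing r times loses the first r points.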

(from Stationarity in time series analysis)

## Unit root processes

A common sub-type of difference stationary processes is the class of processes integrated of order 1, also called unit root processes. The simplest example of such a process is the following autoregressive model:

$$x_t = x_{t-1} + \varepsilon_t$$

Unit root processes, and difference stationary processes generally, are interesting because they are non-stationary processes that can be easily transformed into weakly stationary processes. As a result, while the term is not used interchangeably with non-stationarity, the questions regarding them sometimes are.

I thought it worth mentioning here, as sometimes tests and procedures to check whether a process has a unit root (a common example is the Dickey-Fuller test) are mistakenly thought of as procedures for testing non-stationarity (as a later post in this series touches upon). It is thus important to remember that these are distinct notions, and that while every process with a unit root is non-stationary, and so is every process integrated of order r>1, the opposite is far from true.

### Semi-parametric unit root processes

Another definition of interest is a wider, and less parametric, sub-class of non-stationary processes, which can be referred to as semi-parametric unit root processes. The definition was introduced in [Davidson, 2002], but a concise overview of it can be found in [Breitung, 2002].

If you are interested in the concept of stationarity, or have stumbled into the topic while working with time series data, then I hope you have found this post a good introduction to the subject. Some references and useful links are found below.

As I have mentioned, a later post in this series provides a similar overview of methods for detecting non-stationarity, and another will provide the same for transforming non-stationary time series data.

Also, please feel free to get in touch with me with any comments and thoughts on the post or the topic.

## References
## Academic Literature

- [Boshnakov, 2011] G. Boshnakov. On First and Second Order Stationarity of Random Coefficient Models. Linear Algebra Appl. 434, 415–423. 2011.
- [Breitung, 2002] Breitung, J. (2002). Nonparametric tests for unit roots and cointegration. Journal of econometrics, 108(2), 343–363.
- [Cardinali & Nason, 2010] Cardinali, A., & Nason, G. P. (2010). Costationarity of locally stationary time series. Journal of Time Series Econometrics, 2(2).
- [Cox & Miller, 1965] Cox, D. R.; and Miller, H. D., 1965, The Theory of Stochastic Processes: Methuen, London, 398 p.
- [Dahlhaus, 2012] Dahlhaus, R. (2012). Locally stationary processes. In Handbook of statistics (Vol. 30, pp. 351–413). Elsevier.
- [Davidson, 2002] Davidson, J., 2002. Establishing conditions for the functional central limit theorem in nonlinear and semiparametric time series processes, Journal of Econometrics 106, 243–269.
- [Dyrhovden, 2016] Dyrhovden, Sigve Brix. 2016. Stochastic unit-root processes. The University of Bergen.
- [Fischer et al. 1996] Fischer, M. Scholten, H. J. and Unwin, D. Editors. Spatial analytical perspectives on GIS. Bristol, PA : Taylor & Francis, — GISDATA ; 4.
- [Myers, 1989] Myers, D.E., 1989. To be or not to be . . . stationary? That is the question. Math. Geol. 21, 347–362.
- [Nason, 2006] Nason, GP 2006, Stationary and non-stationary time series. in H Mader & SC Coles (eds), Statistics in Volcanology. The Geological Society, pp. 129–142.
- [Vogt, 2012] Vogt, M. (2012). Nonparametric regression for locally stationary time series. The Annals of Statistics, 40(5), 2601–2633.

## Online References

- A Gentle Introduction to Handling a Non-Stationary Time Series in Python at Analytics Vidhya
- Unit Root at Wikipedia
- Lesson 4: Stationary stochastic processes from Umberto Triacca’s course on stochastic processes
- Roots of characteristic equation reciprocal to roots of its inverse
- Stochastic Process Characteristics
- Trend-Stationary vs. Difference-Stationary Processes
- The home page of Prof. Guy Nason

## Footnotes

1. The phrasing here is not strictly accurate, since, as we will soon see, time series cannot be stationary themselves; only the processes generating them can. I have used it, however, so as not to assume any knowledge for the opening paragraphs.
2. The common synonym of weak-sense stationarity as second order stationarity is probably related to (but should not be confused with) the concept of second order stochastic processes, which are defined as stochastic processes that have a finite second moment (i.e. variance).
3. Note that the opposite is not true. Not every stationary process is composed of IID variables; stationarity means that the joint distribution of variables doesn’t depend on time, but they may still depend on each other.
4. This is also a good example of the fact that IID does not imply weak stationarity; since it does imply strong stationarity, however, it implies weak stationarity under the same necessary and sufficient condition as strong stationarity does: having finite second moments.
5. One minor but interesting notion of stationarity is p-stationary processes.
6. There are also formal ways to treat time series whose samples are not equally spaced.

<br><br><br><br><br><br>
<br><br><br><br><br><br>
<br><br><br><br><br><br>

# =======================================================
## 5️⃣ Unit Root

https://medium.com/data-science/understanding-predictive-maintenance-unit-roots-and-stationarity-f05322f7b6df

# Understanding Predictive Maintenance — Unit Roots and Stationarity

## Article Purpose

In this article, we’re diving into the critical concepts of unit roots and stationarity. Buckle up for an exploration into why checking stationarity is crucial, what unit roots are, and how these elements play a key role in our predictive maintenance arsenal. We will also master the chaos!
This article is part of the series Understanding Predictive Maintenance. I plan to create the entire series in a similar style.

Check the whole series in this link. Ensure you don’t miss out on new articles by following me.




## Unit Roots — Mischievous Time Travelers in Data’s History Book

Unit roots are fundamental concepts in time series analysis, playing a pivotal role in understanding the behavior and characteristics of real-world data. In this exploration, we’ll delve into what unit roots are, why they are important in real data analysis, and how they influence the predictive maintenance landscape. Of course, we will do some experiments in the hands-on section.

## What are unit roots?

A unit root in a time series variable implies a stochastic process where the variable’s value at any given time is influenced by its past values. Formally, a unit root suggests non-stationarity, indicating that the variable does not revert to a constant mean over time.

EQUATIONS

The presence of unit roots introduces persistence into the time series, leading to challenges in modeling and forecasting. The Augmented Dickey-Fuller (ADF) test and other statistical methods are employed to detect the existence of unit roots, providing a quantitative measure of non-stationarity.

Unit roots are like the storytellers of our data, weaving narratives that extend beyond individual moments and create a continuous storyline. They signify the persistence of historical influences, introducing an element of memory into the numerical fabric of our datasets.

Imagine your dataset as a historical novel, with each data point representing a chapter in the unfolding tale. Unit roots, in this context, are the recurring motifs and characters that leave an indelible mark on the narrative, guiding the plot with a subtle yet consistent influence.

## Why is it important for us?

Understanding unit roots is fundamental for time series analysts and modelers. Non-stationary data poses challenges, as traditional models often assume stationarity for accurate predictions. Analysts must address unit roots by employing transformations, such as differencing, to induce stationarity and facilitate model development.

In predictive maintenance scenarios, unit roots play a crucial role in ensuring the accuracy of forecasting models. The long-term influence embedded in unit roots can significantly impact the reliability of predictions, making their identification and mitigation paramount for effective maintenance strategies.

As we navigate this technical exploration, we will delve deeper into unit root testing methodologies, interpret the results, and explore strategies for handling non-stationary time series data. The theoretical underpinnings of unit roots provide a solid foundation for the practical applications that follow in our analytical journey.


## How “random” is your random?

Let’s kick things off by generating a straightforward stationary series, but here’s a heads-up: not all “random” is created equal. There are two main flavors of randomness — true random and pseudorandom. Chances are, you’ve been hanging out with pseudorandom more often because that’s the go-to for computers.

In computing, generating truly random numbers is a challenge because computers are deterministic machines. Pseudorandom numbers, as the name suggests, are not genuinely random but instead are generated by algorithms that simulate randomness. These algorithms start with an initial value called a seed and use it to produce a sequence of numbers that appears random.

## Seeds

The seed is a crucial element in pseudorandom number generation. It serves as the starting point for the algorithm. If you use the same seed, you’ll get the same sequence of pseudorandom numbers every time. This determinism can be advantageous in scenarios where you want reproducibility. For example, if you’re running a simulation or an experiment that involves randomness, setting the seed allows you to recreate the exact sequence of random numbers.

On the flip side, changing the seed results in a different sequence of pseudorandom numbers. This property is often used to introduce variability in simulations or to provide different initial conditions for algorithms that use randomness.

In summary, pseudorandom numbers are generated by algorithms, and the seed is the starting point for these algorithms. Controlling the seed allows you to control the sequence of pseudorandom numbers, providing a balance between determinism and variability in computer-generated randomness.
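The effect of the seed is easy to check directly. This sketch uses NumPy's `default_rng` generator rather than the global `np.random.seed` call; the helper name `draw` and the seed values are arbitrary choices for illustration.

```python
import numpy as np

def draw(seed, size=5):
    """Draw `size` standard-normal values from a generator started at `seed`."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal(size)

same_a = draw(seed=1992)
same_b = draw(seed=1992)
different = draw(seed=2024)

print(np.array_equal(same_a, same_b))      # identical seeds, identical sequences
print(np.array_equal(same_a, different))   # a different seed gives a different sequence
```

Keeping the generator local to a function, instead of seeding global state, makes it harder for one experiment to silently change the random numbers of another.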

Time to generate our pseudorandom distribution.

```python
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(1992) # WOW this is our deterministic seed.

def generate_stationary_series_pseudorandom(size=100):
    stationary_series = np.random.randn(size)
    return stationary_series
```

## Can we use true randomness?

Now we might feel surprised that even the randomness we use most of the time is deterministic. But can we get true randomness, with no determinism behind it?

Well, good news! We can tap into something truly physical — atmospheric noise. Remember those flickering black and white dots on your TV screen? That’s our atmospheric noise, and we’re going to harness it to whip up some genuine randomness. So, your TV’s not just for shows; it’s your ticket out of the deterministic world.

```python
import requests

def generate_stationary_series_random(size=100):
    # Fetch truly random integers from the random.org atmospheric noise API
    url = (f'https://www.random.org/integers/?num={size}&min=-10000&max=10000'
           '&col=1&base=10&format=plain&rnd=new')
    response = requests.get(url, timeout=10)  # timeout so a stalled request cannot hang forever
    if response.status_code == 200:
        # The plain-text response holds one integer per line
        stationary_series = [int(value) for value in response.text.strip().split('\n')]
        return stationary_series
    else:
        raise RuntimeError(f"Failed to fetch random values. Status code: {response.status_code}")
```


Using this function we can generate true randomness. Hooray!


(from Understanding Predictive Maintenance — Unit Roots and Stationarity)

# ===============================================================================================


## Unit Root

We haven’t spoken about unit roots yet, so we’ll cover them now. A unit root process (also called a difference stationary process) contains a stochastic trend; the classic example is the random walk, with or without drift. If a time series has a unit root, it shows a systematic pattern that is unpredictable [5].

The reason why it’s called a unit root is because of the mathematics behind the process. At a basic level, a process can be written as a series of monomials (expressions with a single term). Each monomial corresponds to a root. If one of these roots is equal to 1, then that’s a unit root [5].

IMAGE

Explained visually, if a series were purely stationary, then any spike in the curve would eventually return to its previous value; in the figure above, the previous value is the x-axis itself. However, if a series has a unit root, then it could settle at any point between the Pure Random Walk (blue) and the Purely Stationary (green). In this example it is the dotted line, but in reality we have no way of knowing where!

In summary, the stationarity of a time series determines how easily it can be decomposed and forecasted using statistical techniques. In practice, stationarity is assessed with tests such as ADF (whose null hypothesis is that a unit root is present) and KPSS (whose null hypothesis is that the series is stationary). Most of the time (all the time) a forex series will not be stationary, which is one of the main reasons WHY it is so difficult to forecast. We will see a concrete example of this shortly.


(from Explaining Stationarity and its Impact on Forecasting Accuracy)

<br><br><br><br><br><br>
<br><br><br><br><br><br>
<br><br><br><br><br><br>

# =======================================================
## 5️⃣ Making a Series Stationary

To make the time series stationary, we can apply transformations to the data.

---

More likely than not your time series will not be stationary which means that you will have to identify the trends present in your series and manipulate the data to become stationary. After the trends are removed you can apply advanced modeling techniques while maintaining the valuable knowledge of the separated trends, which will be used later.

## Differencing


# ============================================================

```python
# First-order difference
diff_series = pd.Series(random_walk).diff().dropna()

# Plot and test
plot_series(diff_series, "First Difference of Random Walk")
adf_test(diff_series, "ADF Test - Differenced Series")
kpss_test(diff_series, "KPSS Test - Differenced Series")
```

## Difference Transform

Differencing is a transform that helps stabilize the mean of a time series by removing changes in its level, which can reduce or eliminate trend and seasonality. The first-order difference transform consists of taking the data point at the current time and subtracting the point before it. The result is a dataset of differences between consecutive points. If the first-order difference is stationary and random (white noise), then the original series follows a “random walk” model.

EQUATION

IMAGE

In this case, differencing does not produce the desired results. Even though the mean is stable, the variance just keeps increasing. In some cases, using the second-order difference transform would work but I decided to try out the logarithmic transform instead.

(from Why Does Stationarity Matter in Time Series Analysis?)


# ============================================================


### Differencing Transform

The most common transformation is to difference the time series. This is calculating the numerical change between each successive data point. Mathematically, this is written as:

$$d(t) = y(t) - y(t-1)$$

Where d(t) is the difference at time t between the data points y(t) and y(t-1).

We can plot the differenced data by using the diff() pandas method to simply calculate the differenced data as a column of our data-frame:

```python
# Take the difference and plot it
data["Passenger_Diff"] = data["#Passengers"].diff()

plotting(title='Airline Passengers', data=data, x='Month', y='Passenger_Diff',
         x_label='Date', y_label='Passengers<br>Difference Transform')
```

Is the data now stationary? No.

The mean is now constant and is oscillating about zero. However, we can clearly see the variance is still increasing through time.

(from Stationarity For Time Series)

# ============================================================

## Differencing

Differencing calculates the difference between two consecutive observations. It stabilizes the mean of a time series and thus reduces the trend [3].

df["example_diff"] = df["example"].diff()


IMAGE

If you want to expand your knowledge on differencing, you should have a look at fractional differencing.

(from Stationarity in Time Series — A Comprehensive Guide)


# ============================================================

## 3) Differencing

Another method for removing trends in time series data is differencing. This is the process of subtracting from each observation the value observed x periods earlier, where x is the lag. For instance, in the S&P 500 example, if the lag is one year then the differenced value on January 1, 2020 is equal to the actual price observed on January 1, 2020 minus the value observed on January 1, 2019. The Pandas library’s .diff(periods=x) method can be used to calculate an array of differenced values. The periods parameter denotes the lag used, counted in rows. My values are in daily increments, which means a lag of 365 is equal to a year and a lag of 1 is equal to a day.

IMAGE

The differencing also removed the upward trend of the time series although the variance is still time dependent.
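As a small illustration of the `periods` argument, the sketch below differences a toy daily series at lag 1 and lag 3 and compares the result with an explicit shift-and-subtract. The dates and prices are made up for the example and are not the S&P 500 data.

```python
import pandas as pd

dates = pd.date_range("2020-01-01", periods=8, freq="D")
prices = pd.Series([100.0, 102.0, 101.0, 105.0, 107.0, 106.0, 110.0, 111.0], index=dates)

diff_1 = prices.diff(periods=1)        # change versus the previous row
diff_3 = prices.diff(periods=3)        # change versus three rows earlier
manual_3 = prices - prices.shift(3)    # the same computation written out

print(pd.DataFrame({"price": prices, "diff_1": diff_1, "diff_3": diff_3}))
```

Note that `periods` counts rows, not calendar time, so for data with gaps (such as trading days) a lag of x rows is not the same as x calendar days.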

(from Time Series From Scratch — Stationarity Tests and Automation)


# ============================================================



# ============================================================



## Transformations

# ============================================================

I organized my price data in a pandas dataframe. The index is set to the date column and the dates are sorted in ascending order. I obtained a year’s worth of OHLC prices ending on January 9, 2020.

TABLE

There are several transformations available in Python’s NumPy library including logarithms, square roots, and more. I created a new column for a few of these transformations by applying them to the adjusted closing price column.

```python
# Create transformation columns of the adjusted close price

# Calculate the log of the adjusted close prices
inx_df['adj_close_log'] = np.log(inx_df['adj_close'])

# Calculate the square root of the adjusted close prices
inx_df['adj_close_sqrt'] = np.sqrt(inx_df['adj_close'])

# Calculate the cubed root of the adjusted close prices
inx_df['adj_close_cbrt'] = np.cbrt(inx_df['adj_close'])
```

TABLE

No single transformation method will universally turn all time series stationary, you will have to test them for yourself. The visualization of the logarithmic transformation is below.

IMAGE

This particular transformation didn’t fully accomplish stationarity for this series. The range of the prices changed drastically and the upward trend of the series has been reduced which is a good first step. Logarithmic functions are inverses of exponential functions with the same base.

(from Time Series From Scratch — Stationarity Tests and Automation)


## Logarithmic Transform

# ============================================================

Sometimes, differencing is not enough to remove trends in all non-stationary data. The logarithmic transform takes the log of each point and changes the data into a logarithmic scale, which helps stabilize a variance that grows with the level of the series. Because the log alone does not remove a trend, in this workflow the logarithmic transform is followed by the difference transform.

IMAGE

As you can see above, the mean and variance level out and become constant. There are no signs of trends or strong seasonality.

(from Why Does Stationarity Matter in Time Series Analysis?)

# ============================================================

To stabilise the variance, we apply the natural logarithm transform to the original data:

```python
# Import numpy
import numpy as np

# Take the log and plot it
data["Passenger_Log"] = np.log(data["#Passengers"])

plotting(title='Airline Passengers', data=data, x='Month',
         y='Passenger_Log', x_label='Date', y_label='Passenger<br>Log Transform')
```

The fluctuations are now on a consistent scale, but there is still a trend. Therefore, we now again have to apply the difference transform.

(from Stationarity For Time Series)


# ============================================================

## Log transformation

Log transformation stabilizes the variance of a time series [8].

```python
df["example_log"] = np.log(df["example"].values)
```

IMAGE

As you can see, both the detrending with model fitting as well as the log transform alone did not make our example time series stationary. You can also combine different techniques to make a time series stationary:

IMAGE

(from Stationarity in Time Series — A Comprehensive Guide)

# ============================================================



# ============================================================



## Logarithmic and Difference Transform

# ============================================================

### Log Transformation + Differencing (useful for trending/seasonal series)

```python
# Simulate exponential growth series (multiplicative noise keeps every value positive for the log)
t = np.arange(100)
exp_series = np.exp(0.03 * t + np.random.normal(scale=0.05, size=100))

plot_series(exp_series, "Exponential Trend Series")

# Log + diff
log_series = np.log(exp_series)
log_diff = pd.Series(log_series).diff().dropna()

plot_series(log_diff, "Log-Differenced Series")
adf_test(log_diff, "ADF Test - Log-Differenced Series")
```

# ============================================================

### Logarithmic and Difference Transform

Applying both logarithmic and difference transforms:

```python
# Take the difference and log and plot it
data["Passenger_Diff_Log"] = data["Passenger_Log"].diff()

plotting(title='Airline Passengers', data=data, x='Month', y='Passenger_Diff_Log',
         x_label='Date', y_label='Passenger<br>Log and Difference')
```

Is the data now stationary? Yes!

As we can see, the mean and variance are now constant and the series has no long-term trend.

(from Stationarity For Time Series)

# ============================================================


# ============================================================



# ============================================================



## Others ?

# ============================================================

## Detrending by (linear) model fitting

Another way to remove the trend from a non-stationary time series is to fit a simple model (e.g., linear regression) to the data and then to model the residuals from that fit.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Fit model (e.g., linear model) with the time index as the only feature
X = np.arange(len(df)).reshape(-1, 1)
y = df["example"].values
model = LinearRegression()
model.fit(X, y)

# Calculate trend
trend = model.predict(X)

# Detrend
df["example_detrend"] = df["example"].values - trend
```

IMAGE

(from Stationarity in Time Series — A Comprehensive Guide)

# ============================================================

## 2) Rolling Means

You can subtract the rolling mean from a time series. This works especially well when the mean is time dependent. A rolling mean is the mean of the previous x number of observations in the series, where the time between each observation is consistent. You have to decide which time window works best for your data. Because I am using daily trading data I selected a window of 20 because that is how many trading days there are in a month, although this is not a universal window for financial data.

Pandas’ .rolling() method can be used to calculate this rolling mean. For instance, the code to calculate a 20 day rolling mean for my data is:

```python
inx_df['adj_close'].rolling(window=20).mean()
```

IMAGE

I created a new array of the rolling mean subtracted from the original closing price column and charted it below to see if this improved stationarity in the series.
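The subtraction step described above could look like the following sketch. The synthetic `prices` series is a hypothetical stand-in for the adjusted close column, since the original data is not available here, and the 20-row window mirrors the choice discussed in the text.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
n = 250
# Hypothetical stand-in for a price column: upward drift plus noise
prices = pd.Series(100 + 0.2 * np.arange(n) + rng.normal(scale=1.0, size=n), name="adj_close")

window = 20
rolling_mean = prices.rolling(window=window).mean()
detrended = (prices - rolling_mean).dropna()   # the first window-1 rows have no rolling mean

print(f"std before: {prices.std():.2f}, std after: {detrended.std():.2f}")
```

A trailing rolling mean lags behind a rising series, so the detrended values sit at a roughly constant positive offset rather than exactly around zero. For a linear trend with slope s, that offset is about s × (window − 1) / 2.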

IMAGE

This series appears to be much closer to stationarity. The upward trend is virtually gone, but the variance still changes over time. For financial data, it is perfectly reasonable to remove the exponentially weighted rolling mean from the original data as well. The weighted rolling mean assigns a greater weight to more recent observations. In pandas this is calculated with the .ewm() method; for my data the code is as follows:

```python
inx_df['adj_close'].ewm(span=20).mean()  # span=20 is an illustrative choice; ewm() needs one of com, span, halflife or alpha
```

There are several parameters available in this method which determine the individual weights of the observations including com, span, and halflife.
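These parameters are different ways of specifying the same smoothing factor alpha. As a minimal sketch with made-up prices, the following shows that `span` and `com` values corresponding to the same alpha give identical results, using the conversions alpha = 2 / (span + 1) and alpha = 1 / (1 + com).

```python
import numpy as np
import pandas as pd

prices = pd.Series([100.0, 101.5, 99.8, 102.3, 103.1, 102.7, 104.0, 105.2])

by_span = prices.ewm(span=19).mean()     # alpha = 2 / (span + 1) = 0.1
by_com = prices.ewm(com=9).mean()        # alpha = 1 / (1 + com) = 0.1
by_alpha = prices.ewm(alpha=0.1).mean()  # alpha given directly

print(np.allclose(by_span, by_com), np.allclose(by_span, by_alpha))
```

A smaller alpha (larger span) spreads the weight over more past observations and gives a smoother mean; which value suits a given series is a judgment call to test on your own data.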

IMAGE

(from Time Series From Scratch — Stationarity Tests and Automation)

## What to do when tests differ?

# ============================================================

Because there are several stationarity types, we can combine the ADF and KPSS tests to determine what transformations to make [7]:

- If the ADF test result is stationary and the KPSS test result is non-stationary, the time series is difference stationary — Apply differencing to time series and check for stationarity again [7].
- If the ADF test result is non-stationary and the KPSS test result is stationary, the time series is trend stationary — Detrend time series and check for stationarity again [7].
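The two rules above can be sketched as a small decision helper that takes the p-values of both tests. The function name, the 0.05 threshold and the two "tests agree" branches are assumptions added for completeness; the p-values themselves would come from whatever test implementation you use (for example the `adfuller` and `kpss` functions in statsmodels).

```python
def interpret_adf_kpss(adf_pvalue, kpss_pvalue, alpha=0.05):
    """Combine ADF and KPSS p-values into a suggested next step.

    ADF null hypothesis: a unit root is present (rejecting it suggests stationarity).
    KPSS null hypothesis: the series is stationary (rejecting it suggests non-stationarity).
    """
    adf_stationary = adf_pvalue < alpha
    kpss_stationary = kpss_pvalue >= alpha

    if adf_stationary and kpss_stationary:
        return "stationary: no transformation needed"
    if not adf_stationary and not kpss_stationary:
        return "non-stationary: transform and test again"
    if adf_stationary and not kpss_stationary:
        return "difference stationary: apply differencing and test again"
    return "trend stationary: detrend and test again"

print(interpret_adf_kpss(adf_pvalue=0.01, kpss_pvalue=0.01))
print(interpret_adf_kpss(adf_pvalue=0.40, kpss_pvalue=0.10))
```

The direction of each comparison matters because the two tests have opposite null hypotheses: a small ADF p-value points toward stationarity, while a small KPSS p-value points away from it.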

(from Stationarity in Time Series — A Comprehensive Guide)


<br><br><br><br><br><br>
<br><br><br><br><br><br>
<br><br><br><br><br><br>

# =======================================================

## Hands on Testing and Making Stationary



## Hands-on experience

Now it is time to get our hands dirty with code. We will run some experiments to help you get familiar with the article's concepts. I recommend that you reproduce them. Before we dive deeply into stationarity, I want to ask you a question.

And now we will add a couple of examples of how we can make this data non-stationary by breaking the key rules of stationarity. After the explanations we will plot all of them.

## Linear Trend (Non-Constant Mean)

```python
def generate_non_stationary_linear_trend(size=100):
    time = np.arange(size)
    linear_trend = 0.5 * time
    non_stationary_series = np.random.randn(size) + linear_trend
    return non_stationary_series
```

Introducing a linear trend to violate the constant mean rule means adding a systematic increase or decrease over time. In the case of the non-stationary linear trend series, the values increase linearly over time. This violates the constant mean rule because the average value of the series is changing, indicating a shift in the underlying behavior of the process. Note that this particular series is a deterministic trend plus white noise, so it is trend-stationary rather than a unit root process: subtracting the fitted line leaves stationary noise, whereas a unit root would make random shocks persist.

## Sine Amplitude (Non-Constant Variance)

```python
def generate_non_stationary_sin_amplitude(size=100):
    time = np.arange(size)
    amplitude = 0.5 + 0.02 * time
    sin_amplitude_component = amplitude * np.sin(2 * np.pi * time / 10)
    non_stationary_series = np.random.randn(size) + sin_amplitude_component
    return non_stationary_series
```

Adding a sinusoidal component with increasing amplitude violates the constant variance rule. In this series, the amplitude of the sinusoidal component grows linearly with time. As a result, the spread of the data points widens as time passes, so the variance measured over any window depends on where that window sits. As with the linear trend, the non-stationarity here comes from a deterministic component added to white noise, not from a unit root.

## Exponential Growth (Non-Constant Autocorrelation)

Create results plot.

```python
def generate_non_stationary_exponential_growth(size=100, growth_rate=0.05):
    time = np.arange(size)
    exponential_growth_component = np.exp(growth_rate * time)
    non_stationary_series = np.random.randn(size) + exponential_growth_component
    return non_stationary_series
```

Incorporating an exponential growth pattern makes the level of the series rise ever faster, so the mean is not constant and the sample autocorrelation pattern changes as the values increase. The exponential component here is deterministic, but the series shares a key symptom with unit-root processes: it does not revert to a constant mean over time, which makes it hard to model and forecast directly.
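One way to see the structure of exponential growth is to look at it on a log scale, where it becomes a straight line. The sketch below works on the deterministic component only (the noise term is left out on purpose, which is an assumption made to keep the arithmetic exact).

```python
import numpy as np

growth_rate = 0.05
time = np.arange(100)
exponential_component = np.exp(growth_rate * time)

# First differences of the raw component keep growing over time...
raw_diff = np.diff(exponential_component)

# ...while first differences of the log are constant and equal to growth_rate
log_diff = np.diff(np.log(exponential_component))

print("Raw differences (first, last):", raw_diff[0], raw_diff[-1])
print("Log differences (first, last):", log_diff[0], log_diff[-1])
```

This is why a log transform followed by differencing is a natural candidate for series that grow exponentially.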

## Start the experiments

Execute the code to generate the time series and plot the results.

```python
# Example usage
stationary_series_pseudorandom = generate_stationary_series_pseudorandom()
non_stationary_linear_trend_series = generate_non_stationary_linear_trend()
non_stationary_sin_amplitude_series = generate_non_stationary_sin_amplitude()
non_stationary_exponential_growth_series = generate_non_stationary_exponential_growth()

# Visualize the examples
plot_multiple_series(stationary_series_pseudorandom, 
                     non_stationary_linear_trend_series, 
                     non_stationary_sin_amplitude_series, 
                     non_stationary_exponential_growth_series,
                     titles=[
                         'Stationary series',
                         'Linear Trend (Non-Constant Mean)',
                         'Sinusoidal Amplitude (Non-Constant Variance)',
                         'Exponential Growth (Non-Constant Autocorrelation)'
                     ])
```

IMAGE

Spotting a linear trend or exponential growth during exploratory data analysis is relatively straightforward, as these patterns exhibit clear visual cues. However, distinguishing between stationary and non-stationary states becomes challenging when dealing with sinusoidal amplitude. Visually, it’s hard to differentiate whether the amplitude is stationary or non-stationary just by looking at the data.

IMAGE

This case shows the power of statistical tests, which give us a more reliable tool than visual inspection.

```python
from statsmodels.tsa.stattools import adfuller

# adfuller returns a tuple whose second element (index 1) is the p-value.
# Test the same series that were generated and plotted above.
adf_p_value_stationary = adfuller(stationary_series_pseudorandom)[1]
adf_p_value_linear_trend = adfuller(non_stationary_linear_trend_series)[1]
adf_p_value_sin_amplitude = adfuller(non_stationary_sin_amplitude_series)[1]
adf_p_value_exponential_growth = adfuller(non_stationary_exponential_growth_series)[1]

# Print the results
print(f'PseudoRandom ADF P-value (Stationary Series): {adf_p_value_stationary}')
print(f'PseudoRandom ADF P-value (Linear Trend): {adf_p_value_linear_trend}')
print(f'PseudoRandom ADF P-value (Sinusoidal Amplitude): {adf_p_value_sin_amplitude}')
print(f'PseudoRandom ADF P-value (Exponential Growth): {adf_p_value_exponential_growth}')
```

The ADF test provides a clear distinction between stationary and non-stationary time series. In the first case, we can confidently reject the null hypothesis of a unit root, indicating that the time series is stationary. For the other cases, we fail to reject the null hypothesis, so we treat the data as non-stationary. In particular, for the sinusoidal amplitude series, where the non-stationarity is hard to spot visually, the ADF test does not allow us to reject the null hypothesis.
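The decision rule behind these statements is simple enough to write down. The helper below is a hypothetical convenience function (its name and wording are not from the article or from statsmodels), and the two p-values passed to it are made-up numbers used only to exercise both branches.

```python
def interpret_adf_p_value(p_value: float, alpha: float = 0.05) -> str:
    """Turn an ADF p-value into a plain-language conclusion.

    The ADF null hypothesis is that the series has a unit root (non-stationary).
    """
    if p_value < alpha:
        return "reject H0: evidence of stationarity"
    # A large p-value is absence of evidence, not proof of non-stationarity
    return "fail to reject H0: no evidence of stationarity"


# Made-up p-values, only to show both outcomes
print(interpret_adf_p_value(0.001))
print(interpret_adf_p_value(0.60))
```

Note the asymmetry: a small p-value supports stationarity, while a large one only means the test could not rule out a unit root.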

## Practice the transformation

Now, let’s have some fun with transformations and attempt to convert our non-stationary time series into a stationary one — like a bit of reverse engineering. In real-life scenarios, determining the exact transformation needed is often a trial-and-error process. I recommend conducting exploratory data analysis, plotting the time series, and making empirical attempts. If a transformation renders the series stationary, you not only achieve stationarity but also gain valuable insights into the characteristics of your data.

```python
def make_linear_trend_stationary(series):
    # Subtract the linear trend to make the mean constant.
    time = np.arange(len(series))
    linear_trend = 0.5 * time # Somehow we have found this trend :)
    stationary_series = series - linear_trend
    return stationary_series

def make_sin_amplitude_stationary(series):
    # Apply differencing to stabilize and make the variance constant.
    diff_series = np.diff(series)
    return diff_series

def make_exponential_growth_stationary(series, epsilon=1e-8):
    # Add a small constant to avoid zero or negative values
    series = np.where(series <= 0, epsilon, series)
    
    # Add a small constant to avoid non-finite values
    series += epsilon

    # Apply the log for stabilization
    series = np.log(series)
    
    # Take the first difference to remove the exponential growth
    stationary_series = np.diff(series)
    
    return stationary_series
```
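In `make_linear_trend_stationary` the slope `0.5` is hard-coded because we happen to know how the data was generated. With real data the trend has to be estimated. A minimal sketch of one option, assuming a straight-line trend and using `np.polyfit` for a least-squares fit (the function name `detrend_with_fitted_line` and the seed are hypothetical):

```python
import numpy as np


def detrend_with_fitted_line(series):
    """Remove a linear trend whose slope and intercept are estimated from the data."""
    time = np.arange(len(series))
    # polyfit returns coefficients from highest degree to lowest: slope, intercept
    slope, intercept = np.polyfit(time, series, deg=1)
    return series - (slope * time + intercept), slope


rng = np.random.default_rng(2)  # assumed seed for reproducibility
series = rng.standard_normal(100) + 0.5 * np.arange(100)

detrended, estimated_slope = detrend_with_fitted_line(series)
print(f"Estimated slope: {estimated_slope:.3f}")
print(f"Mean after detrending: {detrended.mean():.3f}")
```

The estimated slope lands close to the true value of 0.5, and the detrended series is centered on zero.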

Having defined our transformation functions, it’s time to put them to work. Let’s apply these transformations to our non-stationary time series and see if we can successfully induce stationarity.

```python
# Apply transformations to the non-stationary series generated earlier
stationary_linear_trend = make_linear_trend_stationary(non_stationary_linear_trend_series)
stationary_sin_amplitude = make_sin_amplitude_stationary(non_stationary_sin_amplitude_series)
stationary_exponential_growth = make_exponential_growth_stationary(non_stationary_exponential_growth_series)

# Perform ADF test for the transformed series
adf_p_value_stationary_linear_trend = adfuller(stationary_linear_trend)[1]
adf_p_value_stationary_sin_amplitude = adfuller(stationary_sin_amplitude)[1]
adf_p_value_stationary_exponential_growth = adfuller(stationary_exponential_growth)[1]

# Print the results
print(f'ADF P-value (Stationary Linear Trend): {adf_p_value_stationary_linear_trend}')
print(f'ADF P-value (Stationary Sinusoidal Amplitude): {adf_p_value_stationary_sin_amplitude}')
print(f'ADF P-value (Stationary Exponential Growth): {adf_p_value_stationary_exponential_growth}')
```

And here is how the transformed data looks:

IMAGE

Great news! With our data now stationary, we can confidently reject the null hypothesis in each case.

(from Understanding Predictive Maintenance — Unit Roots and Stationarity)

# ============================================================

Here are the results:

IMAGE

The P-value is just over 0.99, providing strong evidence that the dataset isn’t stationary. You’ve learned the concept of differencing in the previous articles. Now you’ll use it to calculate the N-th order difference. Here’s how the procedure looks for the first and second order:


```python
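# First and second order difference
# Note: .diff(2) computes the lag-2 difference y[t] - y[t-2];
# differencing the series twice would be .diff().diff()
df['Passengers_Diff1'] = df['Passengers'].diff()
df['Passengers_Diff2'] = df['Passengers'].diff(2)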

# Don't forget to drop missing values
df = df.dropna()

# Plot
plt.title('Airline Passengers dataset with First and Second order difference', size=20)
plt.plot(df['Passengers'], label='Passengers')
plt.plot(df['Passengers_Diff1'], label='First-order difference', color='orange')
plt.plot(df['Passengers_Diff2'], label='Second-order difference', color='green')
plt.legend();
```
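A word of caution on terminology: in pandas, `Series.diff(2)` computes the lag-2 difference `y[t] - y[t-2]`, whereas a second-order difference in the usual sense means differencing twice, i.e. `Series.diff().diff()`. The two are generally not the same. The NumPy sketch below shows the gap on a small made-up array.

```python
import numpy as np

# Made-up values following y[t] = (t + 1) ** 2
y = np.array([1.0, 4.0, 9.0, 16.0, 25.0, 36.0])

# Lag-2 difference: y[t] - y[t-2] (the NumPy counterpart of Series.diff(2))
lag_2_difference = y[2:] - y[:-2]

# Second-order difference: difference the first difference again
second_order_difference = np.diff(y, n=2)

print("Lag-2 difference:       ", lag_2_difference)         # values 8, 12, 16, 20
print("Second-order difference:", second_order_difference)  # values 2, 2, 2, 2
```

For a quadratic series the second-order difference is constant, while the lag-2 difference still trends upward.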

And here’s the visualization:

IMAGE

The differenced series looks more promising than the original data, but let’s use the ADF test to verify that claim:


```python
# Perform ADF test
adf_diff_1 = adfuller(df['Passengers_Diff1'])
adf_diff_2 = adfuller(df['Passengers_Diff2'])

# Extract P-values
p_1 = adf_diff_1[1]
p_2 = adf_diff_2[1]

# Print
print(f'P-value for 1st order difference: {np.round(p_1, 5)}')
print(f'P-value for 2nd order difference: {np.round(p_2, 5)}')
```

Here’s the output:

IMAGE

The first-order difference didn’t make the time series stationary, at least not at the usual significance level. Second-order differencing did the trick.

You can see how manual testing of different differencing orders can be tedious. That’s why you’ll write an automation function next.

## Automating stationarity tests

The automation function will accept the following parameters:

- data: pd.Series — time series values, without the datetime information
- alpha: float = 0.05 — significance level, set to 0.05 by default
- max_diff_order: int = 10 — the maximum differencing order to try

A Python dictionary is returned, containing the differencing_order and time_series keys. The first one is self-explanatory, and the second one contains the differenced time series.

The function will first check if the series is already stationary. If that’s the case, it’s returned as-is. If not, the ADF test is performed for every differencing order up to max_diff_order. The function keeps track of P-values and returns the one with the lowest differencing order that’s below the significance level alpha.

Here’s the entire function:

```python
def make_stationary(data: pd.Series, alpha: float = 0.05, max_diff_order: int = 10) -> dict:
    # Test to see if the time series is already stationary
    if adfuller(data)[1] < alpha:
        return {
            'differencing_order': 0,
            'time_series': np.array(data)
        }
    
    # A list to store P-Values
    p_values = []
    
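    # Test for differencing orders from 1 to max_diff_order (included)
    # Note: data.diff(i) is the lag-i difference y[t] - y[t-i]
    for i in range(1, max_diff_order + 1):
        # Perform ADF test
        result = adfuller(data.diff(i).dropna())
        # Append P-value
        p_values.append((i, result[1]))
        
    # Keep only those where P-value is lower than significance level
    significant = [p for p in p_values if p[1] < alpha]
    # Fail with a clear message instead of an IndexError when nothing qualifies
    if not significant:
        raise ValueError(f'No differencing order up to {max_diff_order} gave a P-value below {alpha}')
    # Sort by the differencing order
    significant = sorted(significant, key=lambda x: x[0])
    
    # Get the differencing order
    diff_order = significant[0][0]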
    
    # Make the time series stationary
    stationary_series = data.diff(diff_order).dropna()
    
    return {
        'differencing_order': diff_order,
        'time_series': np.array(stationary_series)
    }
```
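If you prefer the differencing order to mean "number of times the series was differenced", a variant along the following lines is possible. This is a sketch under stated assumptions, not the article's function: `find_differencing_order` and its `p_value_fn` parameter are hypothetical names, and the test is passed in as a callable so that the example runs without statsmodels. The `toy_p_value` stand-in is deliberately crude and only exists to make the demo self-contained.

```python
from typing import Callable

import numpy as np


def find_differencing_order(
    values,
    p_value_fn: Callable[[np.ndarray], float],
    alpha: float = 0.05,
    max_diff_order: int = 10,
) -> dict:
    """Return the smallest d whose d-times differenced series passes the test."""
    values = np.asarray(values, dtype=float)
    for d in range(max_diff_order + 1):
        differenced = np.diff(values, n=d)  # n=0 returns the input unchanged
        if p_value_fn(differenced) < alpha:
            return {'differencing_order': d, 'time_series': differenced}
    raise ValueError(f'No order up to {max_diff_order} gave a P-value below {alpha}')


def toy_p_value(x: np.ndarray) -> float:
    """Crude stand-in for a real test: 'stationary' only if x is constant."""
    return 0.01 if x.max() - x.min() < 1e-9 else 0.5


quadratic = np.arange(20, dtype=float) ** 2
result = find_differencing_order(quadratic, toy_p_value)
print(result['differencing_order'])
```

With statsmodels available, passing `p_value_fn=lambda x: adfuller(x)[1]` would plug in the ADF test in place of the toy function.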

Let’s now use it to make the airline passengers dataset stationary:

```python
ap_stationary = make_stationary(
    data=df['Passengers']
)

plt.title(f"Stationary Airline Passengers Dataset - Order = {ap_stationary['differencing_order']}", size=20)
plt.plot(ap_stationary['time_series']);
```

Here’s the visualization:

IMAGE

Just like before, second-order differencing is required to make the dataset stationary. But what if you decide on a different significance level? Well, take a look for yourself:

```python
ap_stationary = make_stationary(
    data=df['Passengers'],
    alpha=0.01
)

plt.title(f"Stationary Airline Passengers Dataset - Order = {ap_stationary['differencing_order']}", size=20)
plt.plot(ap_stationary['time_series']);
```

Here’s the chart:

IMAGE

You’ll have to difference the dataset eight times for the significance level of 0.01. It would be a nightmare to revert, so you should probably stick with a higher significance level.
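To see what reverting involves, here is a minimal sketch for a single first-order difference. The input values are made up for illustration. The key point is that the first original value must be kept, because differencing throws the level away.

```python
import numpy as np

original = np.array([10.0, 12.0, 15.0, 11.0, 14.0])  # made-up values

# Forward step: first-order difference (one element shorter than the input)
first_difference = np.diff(original)

# Reverse step: cumulative sum of the differences, anchored at the first value
restored = np.concatenate(
    ([original[0]], original[0] + np.cumsum(first_difference))
)

print(restored)
```

Every extra round of differencing needs its own stored starting value and its own cumulative sum, which is why high orders quickly become unwieldy to undo.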

(from Time Series From Scratch — Stationarity Tests and Automation)

# ============================================================


<br><br><br><br><br><br>
<br><br><br><br><br><br>
<br><br><br><br><br><br>

# ===============================================

## 3️⃣ Visualizing Stationary vs Non-Stationary Series

Let's generate a few examples to visualize what *stationary* and *non-stationary* behavior looks like.


### ▶️ Example 1: Stationary Series (White Noise)


💡 This series has:

* Constant mean (\~0)
* Constant variance
* No trend



In [None]:
np.random.seed(42)
stationary = np.random.normal(loc=0, scale=1, size=100)

plt.figure(figsize=(12, 4))
plt.plot(stationary)
plt.title("Stationary Series: White Noise")
plt.xlabel("Time")
plt.ylabel("Value")
plt.tight_layout()
plt.show()

### ▶️ Example 2: Non-Stationary Series (Trend)


💡 You can see:

* Increasing **mean** over time
* Variance might still be constant, but the presence of a **trend** makes it non-stationary


In [None]:
trend = np.linspace(0, 10, 100) + np.random.normal(scale=1, size=100)

plt.figure(figsize=(12, 4))
plt.plot(trend)
plt.title("Non-Stationary Series: Linear Trend")
plt.xlabel("Time")
plt.ylabel("Value")
plt.tight_layout()
plt.show()


### ▶️ Example 3: Non-Stationary Series (Changing Variance)

💡 Characteristics:

* Mean is constant
* But variance is **increasing over time** — makes it non-stationary



In [None]:
variance_change = np.random.normal(loc=0, scale=np.linspace(1, 5, 100), size=100)

plt.figure(figsize=(12, 4))
plt.plot(variance_change)
plt.title("Non-Stationary Series: Changing Variance")
plt.xlabel("Time")
plt.ylabel("Value")
plt.tight_layout()
plt.show()


### ▶️ Example 4: Non-Stationary Series (Seasonality)

💡 This series has:

* Constant mean (on average)
* Constant variance
* But a **repeating pattern** → still considered non-stationary in many cases


In [None]:
t = np.arange(100)
seasonal = 10 + np.sin(2 * np.pi * t / 12) + np.random.normal(scale=0.5, size=100)

plt.figure(figsize=(12, 4))
plt.plot(seasonal)
plt.title("Non-Stationary Series: Seasonality")
plt.xlabel("Time")
plt.ylabel("Value")
plt.tight_layout()
plt.show()
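
A repeating pattern like this one can be removed by seasonal differencing, that is, subtracting the value from one full period earlier. The sketch below applies it to the deterministic part of the series only (the noise term is omitted on purpose, which is an assumption made so the result is exact).

```python
import numpy as np

period = 12
t = np.arange(100)
seasonal_component = 10 + np.sin(2 * np.pi * t / period)  # noise left out on purpose

# Seasonal difference: subtract the value one full period earlier
seasonal_diff = seasonal_component[period:] - seasonal_component[:-period]

print("Largest absolute value after seasonal differencing:", np.abs(seasonal_diff).max())
```

The result is zero up to floating-point error, so the seasonal pattern is gone; with noise included, only the differenced noise would remain.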


## 4️⃣ How to Check Visually

To visually check for stationarity:

* Look for **trends**: rising or falling average → non-stationary
* Look for **variance shifts**: wider or tighter fluctuations over time → non-stationary
* Look for **seasonal patterns**: repeating cycles → often treated as non-stationary
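
The visual checks above can be backed up with a few numbers. A minimal sketch, assuming it is enough to cut the series into a handful of consecutive blocks and compare their means and standard deviations (the helper `block_stats`, the seed, and the series length of 200 are all choices made for this illustration):

```python
import numpy as np


def block_stats(series, n_blocks=4):
    """Mean and standard deviation of consecutive, roughly equal-sized blocks."""
    blocks = np.array_split(np.asarray(series, dtype=float), n_blocks)
    means = np.array([block.mean() for block in blocks])
    stds = np.array([block.std() for block in blocks])
    return means, stds


rng = np.random.default_rng(3)  # assumed seed for reproducibility
white_noise = rng.normal(size=200)
trend = np.linspace(0, 10, 200) + rng.normal(size=200)

noise_means, noise_stds = block_stats(white_noise)
trend_means, trend_stds = block_stats(trend)

print("White noise block means:", np.round(noise_means, 2))
print("Trend block means:      ", np.round(trend_means, 2))
```

Block means that stay near one level suggest a constant mean, while block means that climb steadily point to a trend; the block standard deviations play the same role for variance shifts.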

---

## ✅ Summary

| Type              | Stationary? | Why                                                               |
| ----------------- | ----------- | ----------------------------------------------------------------- |
| White Noise       | ✅ Yes       | Mean and variance constant, no pattern                            |
| Trend             | ❌ No        | Mean increases over time                                          |
| Changing Variance | ❌ No        | Variance increases over time                                      |
| Seasonality       | ❌ No\*      | Regular pattern, violates time-invariance (\*may vary by context) |


<br><br><br><br><br><br>
<br><br><br><br><br><br>
<br><br><br><br><br><br>

# ===============================================
## 5️⃣ Conclusion

Overall, understanding stationarity is vital for knowing how to approach the data. If the data is non-stationary, certain transforms may help turn it into stationary data. Differencing and logarithmic transforms are common techniques to make data stationary. Neither method is inherently better than the other; the user needs to try each and inspect the results before making a sound judgment. Quantitative tools such as the ADF test can give us a proper understanding of the properties of our data.

(from Why Does Stationarity Matter in Time Series Analysis?)

---



## Conclusion

In this article we have described what a stationary time series is and how you can apply various transforms to make your data stationary. The log transform helps to stabilise the variance and the difference transform stabilises the mean. We can then test for stationarity using the ADF test. The main importance of stationarity is that most forecasting models assume that the data has that property. In my next article we will cover one of these forecasting models.

The full code that generated the data, plots and ADF test in this post can be viewed here:

LINK

## References and Further Reading

- Forecasting: Principles and Practice: https://otexts.com/fpp2/
- ADF Test: https://www.machinelearningplus.com/time-series/augmented-dickey-fuller-test/
- Hypothesis Testing: https://towardsdatascience.com/z-test-simply-explained-80b346e0e239


(from Stationarity For Time Series)

# ============================================================


## Summary

In time series forecasting, a time series whose statistical properties (mean, variance, and covariance) are constant, and therefore do not depend on time, is described as stationary.
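Written out, these three conditions correspond to what is usually called weak (or covariance) stationarity. The symbols below ($X_t$ for the series, $\mu$, $\sigma^2$, and $\gamma(h)$) are notation introduced here for illustration and do not appear in the original article:

$$
\begin{aligned}
\mathbb{E}[X_t] &= \mu && \text{for all } t \\
\operatorname{Var}(X_t) &= \sigma^2 < \infty && \text{for all } t \\
\operatorname{Cov}(X_t, X_{t+h}) &= \gamma(h) && \text{for all } t \text{ and each lag } h
\end{aligned}
$$

The last line says that the covariance between two observations depends only on how far apart they are, not on where in the series they sit.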

Because of the constant statistical characteristics, a stationary time series is easier to model than a non-stationary time series. Thus, a lot of time series forecasting models assume stationarity.

Stationarity can be checked either by visual assessment or by a statistical approach. The statistical approach checks for a unit root, an indicator of non-stationarity. The two most popular unit root tests are ADF and KPSS. Both are available in the `statsmodels.tsa.stattools` module of the Python statsmodels library [8,9].

If a time series is non-stationary, you can try to make it stationary by differencing, log transforming, or removing the trend.

## Dataset

All datasets are taken from the fma R package.

[1] Hyndman RJ (2023). fma: Data sets from “Forecasting: methods and applications” by Makridakis, Wheelwright & Hyndman (1998). R package version 2.5, http://pkg.robjhyndman.com/fma/.

License: GPL-3 (https://cran.r-project.org/web/packages/fma/index.html)

## References

[2] Dickey, D. A. and Fuller, W. A. (1979). Distribution of the estimates for autoregressive time series with a unit root. J. Am. Stat. Assoc. 74, 427–431.

[3] R. J. Hyndman, & G. Athanasopoulos (2021) Forecasting: principles and practice, 3rd edition, OTexts: Melbourne, Australia. OTexts.com/fpp3 . (Accessed on September 26, 2022).

[4] Kwiatkowski, D., Phillips, P. C., Schmidt, P., & Shin, Y. (1992). Testing the null hypothesis of stationarity against the alternative of a unit root: How sure are we that economic time series have a unit root?. Journal of econometrics, 54(1–3), 159–178.

[5] D. C. Montgomery, C. L. Jennings, Murat Kulahci (2015) Introduction to Time Series Analysis and Forecasting, 2nd edition, John Wiley & Sons.

[6] PennState (2023). S.3 Hypothesis Testing (Accessed on September 26, 2022).

[7] statsmodels (2023). Stationarity and detrending (ADF/KPSS) (Accessed on March 10, 2023).

[8] statsmodels (2023). statsmodels.tsa.stattools.adfuller (Accessed on September 26, 2022).

[9] statsmodels (2023). statsmodels.tsa.stattools.kpss (Accessed on September 26, 2022).

(from Stationarity in Time Series — A Comprehensive Guide)

# ============================================================



## Conclusion

And there you have it — everything you should know about stationarity. The whole concept will get clearer in a couple of articles when you start with modeling and forecasting. For now, remember that a stationary process is easier to analyze and is required by most forecasting models.

There are still a couple of things left to cover before forecasting. These include train/test splits, metrics, and evaluations. All of these will be covered in the next article, so stay tuned.

# =============================================================
