# Estimating the Forecastability of a Time Series

(Sourced from: https://www.machinelearningplus.com/time-series/time-series-analysis-python/)

- The more regular and repeatable patterns a time series has, the easier it is to forecast. 

- The ‘Approximate Entropy’ quantifies the regularity and unpredictability of fluctuations in a time series. **The higher the approximate entropy, the harder the series is to forecast.**

- A better alternative is the ‘Sample Entropy’. It is similar to approximate entropy but estimates complexity more consistently, even for short time series.

- For example, a short random time series can have a lower ‘approximate entropy’ than a more ‘regular’ time series, whereas a longer random time series will have a higher ‘approximate entropy’. Sample Entropy handles this problem better (see the results below).

https://en.wikipedia.org/wiki/Approximate_entropy

https://en.wikipedia.org/wiki/Sample_entropy
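
For reference, the standard definitions behind the two measures can be summarised as follows. The notation ($u$ for the series, $N$ for its length, $m$ for the template length, $r$ for the tolerance) is chosen here to match the arguments of the functions defined below; it is a summary of the usual textbook definitions, not something stated in the source article.

$$x_i^{(m)} = (u_i, u_{i+1}, \dots, u_{i+m-1}), \qquad d\big(x_i^{(m)}, x_j^{(m)}\big) = \max_{0 \le k < m} |u_{i+k} - u_{j+k}|$$

$$C_i^m(r) = \frac{\#\{\, j : d(x_i^{(m)}, x_j^{(m)}) \le r \,\}}{N - m + 1}, \qquad \Phi^m(r) = \frac{1}{N - m + 1} \sum_{i=1}^{N-m+1} \ln C_i^m(r)$$

$$\mathrm{ApEn}(m, r, N) = \Phi^m(r) - \Phi^{m+1}(r)$$

$$\mathrm{SampEn}(m, r, N) = -\ln \frac{A}{B}$$

Here $B$ is the number of template pairs of length $m$ with $d \le r$ and $i \ne j$, and $A$ is the same count for length $m + 1$. Approximate entropy counts each template as a match for itself, while sample entropy excludes self-matches, which is the main reason the two behave differently on short series.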

## Import libraries

In [1]:
import numpy as np
import pandas as pd
import sys
import matplotlib.pyplot as plt

## Load data

In [2]:
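# Two real-world series, each with a `date` index and a `value` column
url = "https://raw.githubusercontent.com/selva86/datasets/master/sunspotarea.csv"
sunspots_df = pd.read_csv(url, parse_dates=["date"], index_col="date")
url2 = "https://raw.githubusercontent.com/selva86/datasets/master/a10.csv"
a10_df = pd.read_csv(url2, parse_dates=["date"], index_col="date")

# Two random integer series of different lengths. No seed is set, so these
# values (and the entropies computed from them) change on every run.
rand_small = np.random.randint(0, 100, size=36)
rand_big = np.random.randint(0, 100, size=136)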

## Approximate Entropy

In [3]:
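def AproximateEntropy(U, m, r):
    """Compute the approximate entropy of U for template length m and tolerance r."""
    # Convert to a NumPy array so that U[i] is positional, also for a pandas Series
    U = np.asarray(U, dtype=float)
    N = len(U)

    def _maxdist(x_i, x_j):
        # Chebyshev distance between two templates
        return max(abs(ua - va) for ua, va in zip(x_i, x_j))

    def _phi(m):
        # All overlapping templates of length m
        x = [U[i:i + m] for i in range(N - m + 1)]
        # Fraction of templates within r of each template (self-match included)
        C = [sum(1 for x_j in x if _maxdist(x_i, x_j) <= r) / (N - m + 1.0) for x_i in x]
        return (N - m + 1.0)**(-1) * sum(np.log(C))

    return abs(_phi(m + 1) - _phi(m))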

In [4]:
print(AproximateEntropy(sunspots_df.value, m=2, r=0.2*np.std(sunspots_df.value)))
print(AproximateEntropy(a10_df.value, m=2, r=0.2*np.std(a10_df.value)))
print(AproximateEntropy(rand_small, m=2, r=0.2*np.std(rand_small)))
print(AproximateEntropy(rand_big, m=2, r=0.2*np.std(rand_big)))

0.6514704970333534
0.5374775224973489
0.40521594801760186
0.79382081886467
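
The length dependence described in the introduction can be checked on synthetic data. The sketch below is a vectorised variant of the function above, not the notebook's own code: the name `approx_entropy`, the `r_factor` argument, the seed, and the series lengths are all illustrative assumptions. It compares a periodic sine wave with short and long white-noise series.

```python
import numpy as np

def approx_entropy(u, m=2, r_factor=0.2):
    """Approximate entropy with tolerance r = r_factor * std(u) (hypothetical helper)."""
    u = np.asarray(u, dtype=float)
    r = r_factor * np.std(u)

    def phi(k):
        # All overlapping templates of length k, shape (len(u) - k + 1, k)
        templates = np.lib.stride_tricks.sliding_window_view(u, k)
        # Chebyshev distance between every pair of templates
        dist = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=2)
        # Fraction of templates within r of each template (self-match included)
        c = np.mean(dist <= r, axis=1)
        return np.mean(np.log(c))

    return abs(phi(m + 1) - phi(m))

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
sine = np.sin(2 * np.pi * np.arange(200) / 20)  # perfectly periodic series
noise_short = rng.normal(size=36)               # short white-noise series
noise_long = rng.normal(size=500)               # long white-noise series

apen_sine = approx_entropy(sine)
apen_short = approx_entropy(noise_short)
apen_long = approx_entropy(noise_long)
print(f"sine (n=200):  {apen_sine:.3f}")
print(f"noise (n=36):  {apen_short:.3f}")
print(f"noise (n=500): {apen_long:.3f}")
```

The short noise series should score well below the long one even though both are equally unpredictable, because with few points most templates match only themselves. That bias is what makes approximate entropy hard to compare across series of different lengths.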


## Sample Entropy

In [5]:
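def SampleEntropy(U, m, r):
    """Compute the sample entropy of U for template length m and tolerance r."""
    # Convert to a NumPy array so that U[i] is positional, also for a pandas Series
    U = np.asarray(U, dtype=float)
    N = len(U)

    def _maxdist(x_i, x_j):
        # Chebyshev distance between two templates
        return max(abs(ua - va) for ua, va in zip(x_i, x_j))

    def _phi(m):
        # All overlapping templates of length m
        x = [U[i:i + m] for i in range(N - m + 1)]
        # Number of template pairs within r of each other, excluding self-matches
        C = [sum(1 for j in range(len(x)) if i != j and _maxdist(x[i], x[j]) <= r) for i in range(len(x))]
        return sum(C)

    # Evaluates to inf (with a NumPy warning) when no templates of length m + 1 match
    return -np.log(_phi(m + 1) / _phi(m))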

In [6]:
print(SampleEntropy(sunspots_df.value, m=2, r=0.2*np.std(sunspots_df.value)))
print(SampleEntropy(a10_df.value, m=2, r=0.2*np.std(a10_df.value)))
print(SampleEntropy(rand_small, m=2, r=0.2*np.std(rand_small)))
print(SampleEntropy(rand_big, m=2, r=0.2*np.std(rand_big)))

0.7853311366380039
0.41887013457621214
inf
2.2472349979108808
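
The `inf` for the 36-point random series means that no pair of templates of length `m + 1` matched, so the ratio inside the logarithm was zero. One way to make that case explicit is to return NaN instead. The sketch below is a vectorised variant under that assumption; the name `sample_entropy_safe`, the `r_factor` argument, and the NaN convention are illustrative choices, not part of the original notebook.

```python
import numpy as np

def sample_entropy_safe(u, m=2, r_factor=0.2):
    """Sample entropy that returns NaN when the estimate is undefined (hypothetical helper)."""
    u = np.asarray(u, dtype=float)
    r = r_factor * np.std(u)

    def count_matches(k):
        # All overlapping templates of length k
        templates = np.lib.stride_tricks.sliding_window_view(u, k)
        # Chebyshev distance between every pair of templates
        dist = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=2)
        # Ordered pairs within r, minus the self-matches on the diagonal
        return int(np.sum(dist <= r)) - len(templates)

    b = count_matches(m)      # matching pairs of length m
    a = count_matches(m + 1)  # matching pairs of length m + 1
    if a == 0 or b == 0:
        return float("nan")   # too few matches to estimate anything
    return -np.log(a / b)

# A strictly increasing ramp has no matching templates at all
ramp_value = sample_entropy_safe(np.arange(10.0))
# An exactly repeating pattern is highly regular, so its entropy is close to zero
pattern_value = sample_entropy_safe(np.tile([1.0, 2.0, 3.0, 4.0], 10))
print(ramp_value, pattern_value)
```

Returning NaN makes it harder to mistake "not enough data" for "extremely unpredictable" when results are sorted or plotted. A longer series or a larger tolerance is usually needed before the estimate becomes meaningful.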



