A framework for approaching time series modeling and forecasting problems consists of the following questions: <br>
1. Inputs vs. Outputs: What are the inputs and outputs for a forecast?
2. Endogenous vs. Exogenous: Are the input variables influenced by other variables in the system (endogenous) or not (exogenous)?
3. Unstructured vs. Structured: Does the series lack any obvious systematic time-dependent pattern (unstructured), or does it show one, such as a trend or seasonality (structured)?
4. Regression vs. Classification: Is the goal to predict a numeric quantity or a class label, and are there alternate ways to frame the forecasting problem?
5. Univariate vs. Multivariate: Are one or multiple variables measured over time, for the inputs and likewise for the outputs?
6. Single-step vs. Multi-step: Is the forecast for the next time step only, or for more than one future time step?
7. Static vs. Dynamic: Is the forecast model fit once, or refit on newly available data prior to each prediction?
8. Contiguous vs. Discontiguous: Are the observations made at uniform intervals over time (contiguous) or not (discontiguous)? <br>

Next let's discover exactly how to transform a time series data set into a three-dimensional structure ready for fitting a CNN or LSTM model. 

In [1]:
import numpy as np
import pandas as pd

#### Time Series to Supervised - 2D Data Preparation Basics

In [2]:
# split a univariate sequence into samples
# Eg: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
#      X      y
# [1, 2, 3], [4]
# [2, 3, 4], [5]
# [3, 4, 5], [6]
def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the sequence
        if end_ix > len(sequence)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return X, y

    
# define univariate time series
series = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
# transform to a supervised learning problem
X, y = split_sequence(series, 3)
X = pd.DataFrame(X)
y = pd.DataFrame(y)
print(X)
print(y)

   0  1  2
0  1  2  3
1  2  3  4
2  3  4  5
3  4  5  6
4  5  6  7
5  6  7  8
6  7  8  9
    0
0   4
1   5
2   6
3   7
4   8
5   9
6  10
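
The same windowing idea can be extended to the multi-step case from the framework. The following is a minimal sketch, not part of the original notebook: the function name `split_sequence_multistep` and its parameters `n_steps_in` and `n_steps_out` are hypothetical names chosen for illustration.

```python
import numpy as np

def split_sequence_multistep(sequence, n_steps_in, n_steps_out):
    """Split a univariate sequence into (input window, multi-step output) pairs."""
    X, y = [], []
    # the last valid start index must leave room for both the input and output windows
    for i in range(len(sequence) - n_steps_in - n_steps_out + 1):
        in_end = i + n_steps_in
        out_end = in_end + n_steps_out
        X.append(sequence[i:in_end])        # n_steps_in past observations
        y.append(sequence[in_end:out_end])  # n_steps_out future observations
    return np.array(X), np.array(y)

series = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
# 3 input steps, 2 output steps (illustrative values)
X_multi, y_multi = split_sequence_multistep(series, 3, 2)
print(X_multi.shape, y_multi.shape)
print(X_multi[0], y_multi[0])
```

With 10 observations, 3 input steps and 2 output steps, this yields 6 samples; the first pairs the inputs 1, 2, 3 with the outputs 4, 5.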


#### Time Series to deep learning inputs - 3D Data Preparation Basics
The three dimensions of this input are:
- Samples: One sequence is one sample. A batch is comprised of one or more samples.
- Time Steps: One time step is one point of observation in the sample. One sample is comprised of multiple time steps.
- Features: One feature is one observation at a time step. One time step is comprised of one or more features.

This expected three-dimensional structure of input data is often summarized using the array shape notation of: <b>[samples, timesteps, features]</b>

For example, for the 2D array X of shape (7, 3) produced above: <br>
- X.shape[0] is the number of rows in the 2D array, in this case the number of samples.
- X.shape[1] is the number of columns in the 2D array, in this case the number of time steps in each sample.
- A 2D array has no X.shape[2]. After reshaping to 3D, X.shape[2] is the number of features, which is 1 for a univariate series.
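
As a quick illustration of how the shape indices map onto [samples, timesteps, features], the sketch below uses placeholder arrays of zeros. The names `X_2d` and `X_3d` are illustrative only and do not appear elsewhere in this notebook.

```python
import numpy as np

# placeholder 2D array: 7 samples (rows), 3 time steps (columns)
X_2d = np.zeros((7, 3))
print(X_2d.ndim, X_2d.shape)

# add a trailing feature axis of size 1 for a univariate series
X_3d = X_2d.reshape((X_2d.shape[0], X_2d.shape[1], 1))
n_samples, n_timesteps, n_features = X_3d.shape
print(n_samples, n_timesteps, n_features)
```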

In [3]:
# transform univariate 2d to 3d
# split a univariate sequence into samples
def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the sequence
        if end_ix > len(sequence)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return np.array(X), np.array(y)

# define univariate time series
series = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print(series.shape)
# transform to a supervised learning problem
X, y = split_sequence(series, 3)
print(X.shape, y.shape)
# transform input from [samples, features] to [samples, timesteps, features]
X = X.reshape((X.shape[0], X.shape[1], 1))
print(X.shape)

(10,)
(7, 3) (7,)
(7, 3, 1)
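
As an alternative to the explicit loop, the same windows can be built without Python-level iteration. This is a sketch that assumes NumPy 1.20 or later, where `numpy.lib.stride_tricks.sliding_window_view` is available; the variable names `windows`, `X_win` and `y_win` are illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

series = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
n_steps = 3

# each window holds n_steps inputs followed by the one value to predict
windows = sliding_window_view(series, n_steps + 1)

# copy so the results are independent, writable arrays rather than read-only views
X_win = windows[:, :-1].copy()
y_win = windows[:, -1].copy()

# add the feature axis: [samples, timesteps] -> [samples, timesteps, features]
X_win = X_win[..., np.newaxis]
print(X_win.shape, y_win.shape)
```

This should give the same shapes as the loop-based version above, (7, 3, 1) for the inputs and (7,) for the outputs.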


#### How to prepare Univariate time series

In [4]:
# Example data set
# define the dataset
data = list()
n = 5000
for i in range(n):
    data.append([i+1, (i+1)*10])
data = np.array(data)
print(data.shape)
print(data[0:5])
# drop the time column (the first column), keeping only the values
data = data[:, 1]
print(data.shape)
print(data[0:5])
# split into samples (e.g. 5000/200 = 25)
samples = list()
length = 200
# step over the 5,000 in jumps of 200
for i in range(0,n,length):
    # grab from i to i + 200
    sample = data[i:i+length]
    samples.append(sample)
# convert list of arrays into 2d array
data = np.array(samples)
print(data.shape)
# reshape into [samples, timesteps, features]
data = data.reshape((len(samples), length, 1))
print(data.shape)

(5000, 2)
[[ 1 10]
 [ 2 20]
 [ 3 30]
 [ 4 40]
 [ 5 50]]
(5000,)
[10 20 30 40 50]
(25, 200)
(25, 200, 1)
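
The example above works cleanly because 5,000 divides evenly by 200. When it does not, the final slice is shorter than the rest and the samples cannot be stacked into a regular array. One possible way to handle this, sketched below, is to trim the series to a whole number of samples first. Dropping the oldest observations is an assumption made here for illustration (padding or dropping the newest are other options), and the 5,050-observation series is made up for the example.

```python
import numpy as np

# illustrative series of 5,050 observations, not a multiple of 200
values = np.arange(1, 5051) * 10
length = 200

# number of complete samples that fit in the series
n_full = len(values) // length

# drop the oldest observations so the remainder divides evenly (an assumption)
trimmed = values[len(values) - n_full * length:]

# reshape directly into [samples, timesteps, features]
samples_3d = trimmed.reshape((n_full, length, 1))
print(samples_3d.shape)
```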
