# About
Univariate time series are datasets consisting of a single series of observations with a temporal ordering. A model must learn from the past observations in the series to predict the next value in the sequence.
- The CNN model learns a function that maps a sequence of past observations (the input) to a single output observation.

## Note:
* The chosen configuration of the models is arbitrary and not optimized for each problem; that was not the goal.

# Libraries

In [1]:
%run "/home/cesar/Python_NBs/HDL_Project/HDL_Project/global_fv.ipynb"

In [2]:
import mysql.connector

In [3]:
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

from sklearn.model_selection import train_test_split

from numpy import array
from tensorflow.keras import Sequential
# Import every layer from tensorflow.keras so that keras and tensorflow.keras objects are not mixed
from tensorflow.keras.layers import Conv1D, MaxPooling1D, Flatten, Dense

from sklearn.metrics import mean_squared_error
from sklearn.metrics import mean_absolute_error

# User-Defined Functions

In [4]:
def split_sequence(sequence, n_steps):
    """
    Transform a univariate time series into a supervised learning problem.

    The sequence is divided into multiple input/output patterns called samples, where
    n_steps time steps are used as input and 1 time step is used as output for the
    one-step prediction that is being learned.
    """
    
    # Defining variable lists 
    X, y = list(), list()
    
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps
        
        # check if we are beyond the sequence
        if end_ix > len(sequence)-1:
            break
            
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)
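
To make the windowing concrete, here is a minimal sketch of the same transformation on a toy sequence. It uses NumPy's `sliding_window_view` (assumed available, NumPy 1.20 or later) rather than the loop above; the helper name `split_sequence_vectorized` and the toy data are illustrative and not part of the original notebook.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def split_sequence_vectorized(sequence, n_steps):
    """Hypothetical vectorized counterpart of split_sequence (illustration only)."""
    seq = np.asarray(sequence)
    # Each window holds n_steps input values followed by the single target value.
    # Note: unlike the loop version, this raises ValueError if the sequence
    # is shorter than n_steps + 1.
    windows = sliding_window_view(seq, n_steps + 1)
    return windows[:, :-1].copy(), windows[:, -1].copy()


toy_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
X_toy, y_toy = split_sequence_vectorized(toy_seq, n_steps=3)

# Print each input window next to the value it should predict
for x_row, target in zip(X_toy, y_toy):
    print(x_row, target)
```

With nine observations and `n_steps=3`, this yields six samples: the first maps `[10, 20, 30]` to `40` and the last maps `[60, 70, 80]` to `90`.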

In [5]:
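def reshaping_1D(X, n_features):
    # Reshape X from [samples, timesteps] into [samples, timesteps, features], the layout Conv1D expects
    return X.reshape((X.shape[0], X.shape[1], n_features))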
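
As a quick illustration of what this reshape does, the sketch below applies the same call to a small made-up array (`X_demo` is a hypothetical name, not data from the notebook). The values are unchanged; only a trailing feature axis of length 1 is added.

```python
import numpy as np

# 4 samples with 3 time steps each (toy values)
X_demo = np.arange(12).reshape(4, 3)

# Same reshape as reshaping_1D with n_features = 1
X_demo_3d = X_demo.reshape((X_demo.shape[0], X_demo.shape[1], 1))

print(X_demo.shape, "->", X_demo_3d.shape)  # (4, 3) -> (4, 3, 1)
```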

In [6]:
def quick_test(model, test_id, X, n_steps, n_features):
    x_input = X[test_id]
    x_input = x_input.reshape((1, n_steps, n_features))
    # yhat: one-step-ahead prediction for the selected sample
    return model.predict(x_input, verbose=0)

# Data

## Input data and parameters

In [7]:
# ----------------------
# ----- Parameters -----
# ----------------------
# Input sequence: the Monterrey column of the `HDL_PM2d5` table
raw_seq = qdata("SELECT Monterrey FROM `HDL_PM2d5`")
# choose a number of time steps
n_steps = 5
# Number of features (Univariate example)
n_features = 1

## Data preparation

In [8]:
# ----------------------
# ------ Command -------
# ----------------------
# Splitting the data into samples, then into training and test sets
X, y = split_sequence(raw_seq, n_steps)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=101)

# Reshaping X from [samples, timesteps] into [samples, timesteps, features]
X_train = reshaping_1D(X_train, n_features)
X_test = reshaping_1D(X_test, n_features)
# ----------------------
# --- Visualization ----
# ----------------------
#for i in range(2):
#    print(X[i], y[i])
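
The split above shuffles the windows before dividing them (`train_test_split` shuffles by default), and consecutive windows overlap in all but one value. A chronological hold-out is one alternative worth considering for time series. The sketch below is an assumption-laden illustration, not the notebook's method: `chronological_split` is a hypothetical helper and the arrays are toy data.

```python
import numpy as np


def chronological_split(X, y, test_size=0.25):
    """Hypothetical helper: hold out the last `test_size` fraction of samples, keeping order."""
    n_test = int(np.ceil(len(X) * test_size))
    split = len(X) - n_test
    return X[:split], X[split:], y[:split], y[split:]


# Toy data: 10 ordered samples with 2 time steps each
X_ordered = np.arange(20).reshape(10, 2)
y_ordered = np.arange(10)

X_tr, X_te, y_tr, y_te = chronological_split(X_ordered, y_ordered, test_size=0.25)
print(len(y_tr), "training samples,", len(y_te), "test samples")
```

Passing `shuffle=False` to `train_test_split` should give a comparable ordered split without a custom helper.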

# CNN Model
1. The CNN model has a **convolutional hidden** layer that operates over a 1D sequence.
2. In some cases, such as very long input sequences, this is followed by a **second convolutional layer**.
3. A **pooling layer** then distills the output of the convolutional layer to its most salient elements.

* The convolutional and pooling layers are followed by a dense **fully connected layer** that interprets the features extracted by the convolutional part of the model.
* A **flatten layer** is used between the convolutional layers and the dense layer to reduce the feature maps to a single one-dimensional vector.

In [9]:
# ----------------------
# ------ Command -------
# ----------------------
# We can define a 1D CNN Model for univariate time series forecasting as follows:
model = Sequential()
model.add(Conv1D(64, 2, activation='sigmoid', input_shape=(n_steps, n_features)))
model.add(MaxPooling1D())
model.add(Flatten())
model.add(Dense(50, activation='tanh'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')

# fit model
model.fit(X_train, y_train, epochs=1000, verbose=0)

<keras.callbacks.History at 0x7f5cc0619ca0>
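
The shapes and parameter counts of the model above can be traced by hand. The sketch below does the arithmetic in plain Python, assuming the Keras defaults that the cell relies on implicitly (`padding='valid'` and stride 1 for `Conv1D`, `pool_size=2` for `MaxPooling1D`). The variable names are illustrative, and the totals should agree with what `model.summary()` reports.

```python
n_steps, n_features = 5, 1
filters, kernel_size = 64, 2
pool_size = 2          # assumed Keras default for MaxPooling1D
dense_units = 50

# Output lengths along the time axis
conv_steps = n_steps - kernel_size + 1                     # 'valid' padding, stride 1 -> 4
pool_steps = (conv_steps - pool_size) // pool_size + 1     # stride equals pool_size -> 2
flat_size = pool_steps * filters                           # Flatten -> 128

# Trainable parameters (weights + biases) per layer
conv_params = (kernel_size * n_features + 1) * filters     # 192
dense_params = (flat_size + 1) * dense_units               # 6450
out_params = dense_units + 1                               # 51
total_params = conv_params + dense_params + out_params     # 6693

print("time steps after conv/pool:", conv_steps, pool_steps)
print("flattened size:", flat_size)
print("total trainable parameters:", total_params)
```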

In [10]:
test_id = 0
    
quick_test(model, test_id, X_test, n_steps, n_features)    

array([[29.46634]], dtype=float32)

# Error Metrics

In [11]:
test = model.predict(X_test, verbose=0)

## RMSE

In [12]:
# squared=False returns the root of the MSE (RMSE) instead of the MSE
mean_squared_error(y_test, test, squared=False)

5.833566038157185
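
For reference, the RMSE can also be computed directly with NumPy. This is a minimal sketch with made-up values (the `rmse` helper is hypothetical). It flattens both inputs first, because `model.predict` returns an array of shape `(n, 1)` and subtracting that from a 1D target array would otherwise broadcast to an `(n, n)` matrix. It may also be handy where the `squared` argument is unavailable, since recent scikit-learn releases deprecate it.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Hypothetical helper: root mean squared error on flattened inputs."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()   # handles (n, 1) predictions
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Toy targets and column-shaped predictions, mimicking model.predict output
y_true_demo = [3.0, 5.0, 7.0]
y_pred_demo = [[2.0], [5.0], [9.0]]
rmse_demo = rmse(y_true_demo, y_pred_demo)
print(rmse_demo)
```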

## MAE

In [13]:
mean_absolute_error(y_test, test)

2.585869125097883
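
The RMSE reported earlier (about 5.83) is more than twice the MAE (about 2.59). One possible reading, not verified on this dataset, is that a few large errors dominate, since squaring weights them more heavily. The toy sketch below uses made-up error values to show the effect: two error sets with the same MAE can have very different RMSE.

```python
import math


def mae_from_errors(errors):
    """Mean of the absolute errors."""
    return sum(abs(e) for e in errors) / len(errors)


def rmse_from_errors(errors):
    """Square root of the mean squared error."""
    return math.sqrt(sum(e ** 2 for e in errors) / len(errors))


even_errors = [3, 3, 3, 3]      # errors spread evenly
spiky_errors = [1, 1, 1, 9]     # same MAE, one large error

for name, errs in [("even", even_errors), ("spiky", spiky_errors)]:
    print(name, "MAE:", mae_from_errors(errs), "RMSE:", round(rmse_from_errors(errs), 3))
```

Both sets have an MAE of 3.0, but the RMSE rises from 3.0 to roughly 4.58 once a single large error is present.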