# Time series with LSTM

## <font color="#fcc200"> 1. Introduction

The goal of this kernel is to work with time series using Keras. I will develop a simple example using a dataset with weather information for Delhi from 2013 to 2017.
The objective is to predict the temperature from daily measurements such as wind speed, pressure and humidity.
To that end, I will preprocess the data, transforming variables and reshaping them into the format Keras expects. Finally, I will create an LSTM neural network, which will be trained and saved so that it can be used for future predictions and for visualizing the results.

## <font color="#fcc200"> 2. Import libraries

In [1]:
import numpy as np
import pandas as pd

from keras.models import Sequential, load_model
from keras.layers import Dense, LSTM
from keras.callbacks import ModelCheckpoint, TensorBoard

from sklearn.metrics import mean_squared_error
from sklearn.metrics import r2_score as R2_score
from sklearn.preprocessing import MinMaxScaler

import matplotlib.pyplot as plt
from matplotlib.ticker import MaxNLocator

from IPython.display import display_html

import os


## <font color="#fcc200"> 3. Load data

There are two different datasets: training and test. Both have the same structure and together contain the data collected in Delhi from January 2013 to April 2017.

The columns of both datasets are the same. See the description below:

* date: Date in YYYY-MM-DD format.
* meantemp: Mean temperature, averaged over multiple 3-hour intervals in a day.
* humidity: Humidity value for the day (units are grams of water vapor per cubic meter of air).
* wind_speed: Wind speed, measured in km/h.
* meanpressure: Pressure reading (measured in atm).
    
To make it easier, I will concatenate both sets and just work with a single dataframe.

In [1]:
data_train = pd.read_csv('../input/daily-climate-time-series-data/DailyDelhiClimateTrain.csv',sep=',')
data_train.head()

In [1]:
data_train.shape

In [1]:
data_test = pd.read_csv('../input/daily-climate-time-series-data/DailyDelhiClimateTest.csv',sep=',')
data_test.head()

In [1]:
data_test.shape

In [1]:
data = pd.concat([data_train,data_test], ignore_index=True)  # rebuild a unique 0..n-1 index
data.head()

In [1]:
data.shape

In [1]:
data.isnull().sum()

In [1]:
data.describe()

## <font color="#fcc200"> 4. Initial preprocessing and data visualization

I will convert the date column to datetime and rename some of the columns to simplify the steps that follow.

In [1]:
data['date'] = pd.to_datetime(data['date'])

In [1]:
data = data.rename(columns={"meantemp":"temp","wind_speed":"wind","meanpressure":"pressure"})

In [1]:
data.head()

In [1]:
print("Starting date of time series: ", data.date.min())
print("Final date of time series:    ", data.date.max())

In [1]:
dates = data['date'].values
temp  = data['temp'].values
humidity = data['humidity'].values
wind = data['wind'].values
pressure = data['pressure'].values

Let's plot the temperature over the whole time frame (January 2013 to April 2017).
We can see the cyclical behavior each year: as expected, the temperature rises in summer and falls in winter.

In [1]:
plt.figure(figsize=(15,5))
plt.plot(dates, temp)
plt.title('Temperature average',
          fontsize=20);

Zooming in on the plot above, we see that the temperature changes from one day to the next, but follows a trend that depends on the time of year.

In [1]:
plt.figure(figsize=(15,5))
plt.plot(dates, temp, 'o-')
plt.title('Temperature average (3 hours interval)', fontsize=20)
plt.axis([dates[-150],dates[-1],0,50]);

The idea is to transform the variable to be predicted. I will use a logarithmic transformation: the logarithm of the ratio between each day's temperature and the previous day's. But first, let's plot the histogram of the original variable and of the transformed one.

The purpose of this transformation is to standardize the original variable and to dampen possible outliers.
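
As a minimal sketch of why the log ratio is convenient (the toy values below are made up for illustration and do not come from the dataset): a doubling and a halving of the temperature map to increments of equal size and opposite sign, so the transformed values are centered around zero.

```python
import numpy as np

# Hypothetical toy series: the value doubles, then halves again.
toy_temp = np.array([10.0, 20.0, 10.0])

# Log ratio between each day and the previous one.
toy_logratio = np.log(toy_temp[1:] / toy_temp[:-1])

print(toy_logratio)        # increments of +log(2) and -log(2)
print(toy_logratio.sum())  # the two increments cancel out
```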

In [1]:
plt.figure(figsize=(15,5))
plt.subplot(1,2,1)
plt.hist(temp, bins=30)
plt.xlabel('Temperature', fontsize=20)
plt.subplot(1,2,2)
aux = np.log( temp[1:] / temp[0:-1]  )
plt.hist(aux, bins=30)
plt.xlabel('Logarithmic temperature increment', fontsize=20)
plt.show()
print("Temperature average                      :", temp.mean())
print("Logarithmic temperature increment average:", aux.mean())

In [1]:
plt.figure(figsize=(15,5))
plt.plot(dates[1:], aux)
plt.title('Temperature (Logarithmic)',
          fontsize=20);

## <font color="#fcc200"> 5. Variable transformation

I will create the function used to transform the original variable (temperature) and the one needed to undo the transformation. This second function will be used to obtain the final predictions.

In [1]:
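eps = 1e-11  # small offset that avoids division by zero and log(0)

NAN = np.nan  # the np.NAN alias was removed in NumPy 2.0

# Logarithmic transformation

def transform_logratios(serie):
    # Log ratio between each value and the previous one;
    # the first element has no predecessor, so it is set to NaN.
    aux = np.log((serie[1:]+eps) / (serie[0:-1]+eps))
    return np.hstack(([NAN], aux))

def inverse_transform_logratios(log_ratio, temp_prev):
    # Recover the temperature from the log ratio and the previous day's value.
    return np.multiply(temp_prev, np.exp(log_ratio))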

In [1]:
transform = transform_logratios
inverse_transform = inverse_transform_logratios
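
The round trip between the two functions can be checked on a small example. The sketch below re-implements both steps under the hypothetical names `to_logratios` and `from_logratios`, with made-up temperatures, so that it runs on its own. It is an illustration of the idea, not a replacement for the functions above.

```python
import numpy as np

eps_demo = 1e-11  # same role as `eps` above: avoids division by zero

def to_logratios(series):
    """Log ratio of each value to the previous one; first entry is NaN."""
    ratios = np.log((series[1:] + eps_demo) / (series[:-1] + eps_demo))
    return np.hstack(([np.nan], ratios))

def from_logratios(log_ratio, prev_value):
    """Undo the transformation given the previous day's value."""
    return prev_value * np.exp(log_ratio)

toy_temp = np.array([10.0, 12.0, 9.0, 15.0])  # made-up values
toy_log = to_logratios(toy_temp)

# Rebuild days 1..3 from the log ratios and the previous day's temperature.
recovered = from_logratios(toy_log[1:], toy_temp[:-1])
print(np.allclose(recovered, toy_temp[1:]))
```

Note that the inverse needs the previous day's true temperature, which is why the previous-day arrays are built later alongside the training and test sets.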

In [1]:
scaler = MinMaxScaler()
transformed = scaler.fit_transform(data.loc[:, ["humidity","wind","pressure"]])

In [1]:
transformed = pd.DataFrame(transformed, columns = ["humidity_s","wind_s","pressure_s"])
transformed.head()

In [1]:
humidity_s = transformed['humidity_s'].values
wind_s = transformed['wind_s'].values
pressure_s = transformed['pressure_s'].values
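
The scaler above is fitted on the concatenated training and test rows. A common alternative, sketched here with plain NumPy and made-up numbers, is to compute the minimum and maximum on the training rows only and reuse them for the test rows, so that nothing from the test period influences the preprocessing. The names `toy_train`, `toy_test` and `min_max_scale` are hypothetical.

```python
import numpy as np

# Hypothetical feature matrix: rows are days, columns are features.
toy_train = np.array([[50.0, 2.0], [70.0, 6.0], [90.0, 4.0]])
toy_test = np.array([[80.0, 8.0]])

# Min-max statistics come from the training rows only.
col_min = toy_train.min(axis=0)
col_max = toy_train.max(axis=0)

def min_max_scale(values):
    """Scale each column to [0, 1] relative to the training range."""
    return (values - col_min) / (col_max - col_min)

train_scaled = min_max_scale(toy_train)
test_scaled = min_max_scale(toy_test)  # may fall outside [0, 1]

print(train_scaled)
print(test_scaled)
```

With scikit-learn, the same effect would be obtained by calling `fit` on the training rows and `transform` on both sets.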

## <font color="#fcc200"> 6. Data windowing

In order to work with Keras, the data needs to be reshaped and prepared. Categorical variables need to be converted to dummy variables (not necessary here, since this dataset does not need any conversion) and, very importantly, the data needs to be windowed.

In other words, Keras will process the data in windows, and we have to define these windows. For that purpose, I will use the following functions (named `winnowing` in the code), where:

* series: all variables involved in the model, including the one to be predicted.
* target: index of the variable to predict.
* prev_known: flags the variables whose values are known in advance and can therefore be used to improve the model. For example, information about weekends or holidays helps when predicting supermarket sales, since sales differ between weekdays and weekends. It is important to make sure these variables would also be available in advance in a production model. None of the variables in this kernel qualifies, since weather variables cannot be known beforehand.
* W_in: the input window, that is, the number of past days passed to the network for each prediction.
* W_out: the output window, which is 1 here since we only want the next-day prediction.
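
To make the window layout concrete, here is a minimal sketch with made-up data. The hypothetical helper `simple_windows` is a simplified stand-in for the windowing function defined below: it assumes no variable is known in advance, a single output step, and it drops the incomplete windows at the start instead of padding them with NaN.

```python
import numpy as np

def simple_windows(features, target, w_in):
    """Build (samples, w_in, n_features) inputs and next-step targets."""
    n = len(target)
    # Window i holds the w_in days that precede day i.
    X = np.stack([features[i - w_in:i] for i in range(w_in, n)])
    y = target[w_in:]
    return X, y

# Made-up data: 10 days, 4 variables per day.
toy_features = np.arange(40, dtype=float).reshape(10, 4)
toy_target = toy_features[:, 0]

X_toy, y_toy = simple_windows(toy_features, toy_target, w_in=6)
print(X_toy.shape, y_toy.shape)
```

The three axes of `X_toy` (samples, time steps, variables) are the layout that a Keras LSTM layer expects as input.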

In [1]:
def winnowing(series, target, prev_known,
               W_in=1, W_out=1):
    n = len(series[0])
    dataX = NAN*np.ones((n,W_in,len(series)))
    if any(s.dtype == object for s in series):  # np.sometrue was removed in NumPy 2.0
        dataX = dataX.astype(object)
    if W_out==1:
        dataY = series[target].copy()
    else:
        dataY = NAN*np.ones((n,W_out))
        if series[target].dtype == object:
            dataY = dataY.astype(object)
        dataY[:,0] = series[target].copy()
        for i in range(1,W_out):
            dataY[:-i,i] = dataY[i:,0].copy()
    
    for i in range(n):
        for j,s in enumerate(prev_known):
            int_s = int(s) 
            ini_X = max([0,W_in-i-int_s])
            dataX[i, ini_X:,j] = \
            series[j][max([0,i-W_in+int_s]):min([n,i+int_s])]
    
    return dataX, dataY


In [1]:
def my_dfs_display(dfs,names):
    df_styler = []
    for df,n in zip(dfs,names):
        df_styler.append(df.style.set_table_attributes("style='display:inline'").\
                         set_caption(n))
    display_html(df_styler[0]._repr_html_()+"__"+df_styler[1]._repr_html_(),
                 raw=True)

In [1]:
def info_winnowing(X,Y,names_series,name_target,times=None):
    c0  = '\033[1m'  
    c1  = '\033[0m'  
    W_in = X.shape[1]
    if len(Y.shape)==1:
        W_out = 1
    else:
        W_out = Y.shape[1]
    print(len(X), "windows created \n")
    print("X.shape={}".format(X.shape)," Y.shape={}".format(Y.shape),"\n")
    for t in range(len(X)):
        print(c0,"Window %d:"%t, c1)
        if times is None:
            names_ts = ["t="+str(t+i-W_in) for i in range(W_in)]
            names_ts_pred = ["t="+str(t+i) for i in range(W_out)]
        else:
            times = list(times)
            if (t-W_in)<0:
                names_ts = ["?"+str(i) for i in range(W_in-t)] + times[:t]
            else:
                names_ts = times[(t-W_in):t]
            if (t+W_out-1)>=len(times):
                names_ts_pred = times[t:] + ["?"+str(i) for i in range(W_out-(len(times)-t))]
            else:
                names_ts_pred = times[t:(t+W_out)]
        aux1 = pd.DataFrame(X[t].T,columns=names_ts,index=names_series)
        aux2 = pd.DataFrame([Y[t]],columns=names_ts_pred,
                            index=[name_target])
        if W_out==1:
            my_dfs_display((aux1,aux2),
                           ("X[{}].shape={}".format(t,X[t].shape),
                            "Y[{}]={}".format(t,Y[t])))
        else:
            my_dfs_display((aux1,aux2),
                           ("X[{}].shape={}".format(t,X[t].shape),
                            "Y[{}].shape={}".format(t,Y[t].shape)))


In [1]:
logratio_temp = transform(temp)

series = [logratio_temp, humidity_s, wind_s, pressure_s]
prev_known = [False, False, False, False]

In [1]:
print(np.shape(series))
print(np.shape(prev_known))

In [1]:
lookback = 6  # Window_in

X, y = winnowing (series, target=0, prev_known=prev_known,
                  W_in=lookback)

print(X.shape, np.shape(y))

I have used a window of 6 days. After trying different window sizes, 6 days gave me the best predictions.

Below we can see the final structure and how it will be passed to the neural network: each window spans 6 days, with 4 variables per day.

In [1]:
info_winnowing(X[:20],y[:20],
                 names_series=["logratio_temp",
                                 "humidity_s", "wind_s",
                                 "pressure_s"],
                 name_target="logratio_temp",
                 times=dates)


In [1]:
print(X.shape)
print(np.shape(temp))

## <font color="#fcc200"> 7. Training and test sets

Next, I will split the dataset into training and test sets.

In [1]:
X_train = X[(lookback+1):len(data_train)]
y_train = y[(lookback+1):len(data_train)]
temp_train = temp[(lookback+1):len(data_train)]
temp_test  = temp[len(data_train):]
X_test  = X[len(data_train):]
y_test  = y[len(data_train):]

print(np.shape(temp_train))
print(np.shape(temp_test))

In [1]:
temp_prev_train =  np.hstack(( [NAN], temp_train[:-1]))
temp_prev_test  =  np.hstack(( temp_train[-1:],
                                      temp_test[:-1]))
dates_train     = dates[(lookback+1):len(data_train)]
dates_test      = dates[len(data_train):]

In [1]:
print(X_train.shape, y_train.shape)
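
The split is chronological, and each test day needs the previous day's temperature to undo the log-ratio transformation later. A small sketch with made-up values and a hypothetical split point `n_train` shows how the two arrays line up:

```python
import numpy as np

toy_temp = np.array([20.0, 21.0, 19.0, 22.0, 23.0, 24.0])  # made-up values
n_train = 4  # hypothetical split point: first 4 days train, last 2 test

temp_tr, temp_te = toy_temp[:n_train], toy_temp[n_train:]

# Previous-day temperature for every test day: the last training value
# followed by all test values except the final one.
prev_te = np.hstack((temp_tr[-1:], temp_te[:-1]))

for prev, actual in zip(prev_te, temp_te):
    print(f"previous day: {prev:.1f}  ->  day to predict: {actual:.1f}")
```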

## <font color="#fcc200"> 8. Model with Keras

Finally, the network needs to be created. It is a really simple network, where the window size needs to be specified again.

In [1]:
model = Sequential()
model.add(LSTM(10, input_shape=(lookback, X_train.shape[2]),
#              kernel_regularizer='l1'
              )
         )
model.add(Dense(1,
#                kernel_regularizer='l1'
               )
         )
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['mse']) # 'RMSprop'
# mean_absolute_error

import keras.backend as K
print(K.get_value(model.optimizer.learning_rate))  # `learning_rate` is the current attribute name

In [1]:
model.optimizer.learning_rate

In [1]:
model.summary()
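
The parameter counts reported by `model.summary()` can be reproduced by hand. The sketch below assumes 4 input variables per time step (as in `series` above) and the default LSTM configuration with biases enabled. The helper names are hypothetical.

```python
def lstm_param_count(units, n_features):
    """Weights of a standard LSTM layer: 4 gates, each with an input
    kernel, a recurrent kernel, and a bias vector."""
    return 4 * (units * n_features + units * units + units)

def dense_param_count(n_inputs, n_outputs):
    """Weight matrix plus one bias per output."""
    return n_inputs * n_outputs + n_outputs

lstm_params = lstm_param_count(units=10, n_features=4)
dense_params = dense_param_count(n_inputs=10, n_outputs=1)
print(lstm_params, dense_params, lstm_params + dense_params)  # 600 11 611
```

Note that the window length does not appear in the formula: the same weights are reused at every time step.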

To train and save the final network, the following code keeps the model that gives the best result (lowest mean squared error on the validation set).

In [1]:
def training_graphic(tr_mse, val_mse):
    ax=plt.figure(figsize=(10,4)).gca()
    plt.plot(1+np.arange(len(tr_mse)), tr_mse)
    plt.plot(1+np.arange(len(val_mse)), val_mse)
    plt.title('mse', fontsize=18)
    plt.xlabel('epoch', fontsize=18)
    plt.ylabel('mse', fontsize=18)
    plt.legend(['Training', 'Validation'], loc='upper left')
    ax.xaxis.set_major_locator(MaxNLocator(integer=True))
    plt.show()

In [1]:
epochs = 200
batch_size = 64
Nval = 200
control_val = True
save_training_tensorboard = False


if not control_val:
    history = model.fit(X_train, y_train, epochs=epochs,
                        batch_size=batch_size, verbose=2)
    
else:    
    acum_tr_mse = []
    acum_val_mse = []
    filepath="best_model.h5"
    checkpoint = ModelCheckpoint(filepath, monitor='val_mse', verbose=2,
                                 save_best_only=True,
                                 mode='min') 

    if save_training_tensorboard:
        callbacks_list = [TensorBoard(log_dir='logs'), checkpoint]  # 'logs' is an assumed directory
    else:
        callbacks_list = [checkpoint]
    
    for e in range(epochs):
        history = model.fit(X_train[:-Nval], y_train[:-Nval],
                            batch_size=batch_size,
                            epochs=1,
                            callbacks=callbacks_list,
                            verbose=0,
                            validation_data=(X_train[-Nval:], y_train[-Nval:]))
        
        acum_tr_mse  += history.history['mse']
        acum_val_mse += history.history['val_mse']
        
        if (e+1)%50 == 0:
            training_graphic(acum_tr_mse, acum_val_mse)
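
The idea behind saving only the best model can be illustrated without Keras. The following sketch uses made-up validation errors and a hypothetical helper `best_epoch`. It mimics a "save only on strict improvement" rule and is not the library's actual implementation.

```python
def best_epoch(val_losses):
    """Return (epoch_index, loss) of the lowest validation loss,
    keeping the earliest epoch on ties."""
    best_idx, best_loss = None, float("inf")
    for idx, loss in enumerate(val_losses):
        if loss < best_loss:  # a strict improvement would trigger a save
            best_idx, best_loss = idx, loss
    return best_idx, best_loss

# Hypothetical validation MSE per epoch.
toy_val_mse = [0.040, 0.031, 0.035, 0.029, 0.033]
print(best_epoch(toy_val_mse))
```

In this toy run, the model from the fourth epoch (index 3) would be the one left on disk, even though training continued afterwards.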

We can now load the trained model and simply make predictions with it.

In [1]:
model = load_model('best_model.h5') 

## <font color="#fcc200"> 9. Predictions

We just need to apply the model to get the predictions and undo the logarithmic transformation. We then plot the predictions and compare the model's errors with those of naive forecasts at different time lags.

In [1]:
y_train_prediction = model.predict(X_train).flatten()
y_test_prediction = model.predict(X_test).flatten()

In [1]:
temp_train_pred = inverse_transform(y_train_prediction,
                                          temp_prev_train)
temp_test_pred  = inverse_transform(y_test_prediction,
                                          temp_prev_test)

In [1]:
temp_train_pred

In [1]:
plt.figure(figsize=(15,7))
plt.plot(dates_train, temp_train, '--', c='royalblue',
         label="Training")
plt.plot(dates_train, temp_train_pred,  c='darkorange',
         label="Training daily predictions")

plt.xticks(fontsize=12)
plt.yticks(fontsize=14)
plt.axis([dates_train[4],dates_train[-1],0,75])
plt.legend(fontsize=14);

In [1]:
plt.figure(figsize=(15,5))
plt.plot(dates_train, temp_train, '--', c='royalblue',
         label='Training')
plt.plot(dates_train, temp_train_pred,  c='darkorange',
         label='Training predictions')
plt.plot(dates_test, temp_test, '--',   c='green',
         label='Test')
plt.plot(dates_test, temp_test_pred,    c='red',
         label='Test predictions')
plt.xticks(fontsize=12)
plt.yticks(fontsize=14)
plt.title('Daily predictions', fontsize=16)
plt.legend(fontsize=14);

In [1]:
plt.figure(figsize=(15,5))
plt.plot(dates_train, temp_train, '--', c='royalblue',
         label='Training')
plt.plot(dates_train, temp_train_pred,  c='darkorange',
         label='Training predictions')
plt.plot(dates_test, temp_test, '--',   c='green',
         label='Test')
plt.plot(dates_test, temp_test_pred,    c='red',
         label='Test predictions')
plt.title('Daily predictions (zoom)', fontsize=16)
plt.legend(fontsize=14)
plt.xticks(fontsize=12)
plt.yticks(fontsize=14)
plt.axis([dates_train[-200],dates_test[-100],0,50]);

In [1]:
# R2 scores
# The "Interval" rows are naive baselines: each value is predicted
# by the value observed 1 day, 1 week, 4 weeks or 1 year earlier.
print("R2 - Training        : ",
      R2_score(temp_train[1:], temp_train_pred[1:]))
print("R2 - Test            : ",
      R2_score(temp_test, temp_test_pred))
print("R2 - Interval 1 day  : ",
      R2_score(temp_test[1:], temp_test[:-1]))
print("R2 - Interval 1 week : ",
      R2_score(temp_test[7:], temp_test[:-7]))
print("R2 - Interval 4 weeks: ",
      R2_score(temp_test[28:], temp_test[:-28]))
print("R2 - Interval 1 year : ",
      R2_score(temp_train[7*52:], temp_train[:-7*52]))  # computed on the training series

In [1]:
# RMSEs
sqrt = np.sqrt
print("RMSE - Training      : ",
      sqrt(mean_squared_error(temp_train[1:],
                              temp_train_pred[1:])))
print("RMSE - Test          : ",
      sqrt(mean_squared_error(temp_test,
                              temp_test_pred)))
print("RMSE - Interval 1 day    : ",
      sqrt(mean_squared_error(temp_test[1:],
                              temp_test[:-1])))
print("RMSE - Interval 1 week : ",
      sqrt(mean_squared_error(temp_test[7:],
                              temp_test[:-7])))
print("RMSE - Interval 4 weeks: ",
      sqrt(mean_squared_error(temp_test[28:],
                              temp_test[:-28])))
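
The "Interval" scores above act as naive reference points: a useful model should at least beat the forecast "tomorrow equals today". As a self-contained sketch with made-up temperatures, the same comparison can be written with plain NumPy. The helper names `rmse`, `r2` and `persistence_scores` are hypothetical.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def persistence_scores(series, lag):
    """Score the naive forecast 'value at t equals value at t - lag'."""
    return rmse(series[lag:], series[:-lag]), r2(series[lag:], series[:-lag])

toy_temp = np.array([20.0, 22.0, 21.0, 25.0, 24.0, 27.0])  # made-up values
toy_rmse, toy_r2 = persistence_scores(toy_temp, lag=1)
print(toy_rmse, toy_r2)
```

A negative R2, as in this toy case, means the naive forecast does worse than always predicting the mean of the series.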