Team members:
> Felipe Henrique Bastos Costa<br/>
Gabriel Barroso da Silva Lima<br/>
Jonatas Travessa Souza de Barros<br/>

# Data
For this analysis we use a dataset of electrical energy consumption records.
Link to the original dataset: https://www.kaggle.com/kandij/electric-production


## Exploring the data

In [None]:
#Reading the dataset
import pandas as pd

df = pd.read_csv("../input/electric-production/Electric_Production.csv")
df

In [None]:
#setting DATE as index
df = df.set_index('DATE')
df.index = pd.to_datetime(df.index, format='%m-%d-%Y')
df

The dataset has 397 records, measured monthly from January 1985 to January 2018.
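
As a quick sanity check on the date range, the sketch below counts months forward from the first record. It assumes only the two facts stated in the text (397 monthly records, the first one in January 1985); the helper name `add_months` is made up for this illustration.

```python
def add_months(year, month, n):
    """Return the (year, month) that lies n months after the given (year, month)."""
    index = year * 12 + (month - 1) + n   # months counted from year 0
    return index // 12, index % 12 + 1

n_records = 397
# The last record lies n_records - 1 months after the first one.
last_year, last_month = add_months(1985, 1, n_records - 1)
print(last_year, last_month)  # 2018 1
```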

In [None]:
import matplotlib.pyplot as plt
import numpy as np

#Plotting the measurements 
plt.figure(figsize=(13,4))
plt.plot(df.index ,df['Value'])
plt.title("There are " + str(len(df)) + " measurements")
plt.xlabel('Year', fontsize=16)
plt.ylabel('Power consumption (%)', fontsize=16);

# Train

## Train and test data

In [None]:
dftm = df['Value']

In [None]:
#Function to split train and test
def split_train_test(show_graph=0):
    
    train = dftm[:318].values
    train = train.reshape((len(train), 1))
    test = dftm[318:].values
    test = test.reshape((len(test), 1))
    
    if show_graph:
        
        plt.figure(figsize=(13,4))
        plt.plot(np.arange(len(train)),train, label='train')
        plt.plot(np.arange(len(train), len(train)+len(test)),test, label='test')
        plt.legend();
        
    return train, test

In [None]:
# Splitting off the last 79 months (about 6.5 years) for testing
train, test = split_train_test(show_graph=1)
#print(test)

## Imports

In [None]:
#Imports to train LSTM and GRU
from tensorflow.keras.preprocessing.sequence import TimeseriesGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM, GRU
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

## Common functions
Defining some reusable code 

In [None]:
# This function plots the training and validation loss
def show_loss(history):
    history_dict = history.history
    
    # The first training loss value was usually so large that it hid the details
    # of the graph, so the plot starts at epoch 2.
    loss_values = history_dict['loss'][1:]
    val_loss_values = history_dict['val_loss'][1:]
    
    # Epoch numbers on the x-axis start at 2 to match the skipped first epoch
    epochs_x = range(2, len(loss_values) + 2)
    plt.figure(figsize=(5,5))
    plt.plot(epochs_x, loss_values, 'bo', label='Training loss')
    plt.plot(epochs_x, val_loss_values, 'b', label='Validation loss')
    plt.title('Training and validation loss from epoch 2 onwards')
    plt.xlabel('Epochs')
    plt.ylabel('Loss')
    plt.legend()
    plt.show()

In [None]:
# This function is to show the predictions
def show_prediction(model, length, best_model):
    
    #load the best model
    model.load_weights(best_model)

    # Forecast the whole test period one month at a time.
    test_predictions = []
    first_eval_batch = train[-length:]
    current_batch = first_eval_batch.reshape((1, length, 1))
    for i in range(len(test)):
        # get the prediction one time step ahead ([0] takes the single row of the (1, 1) output)
        current_pred = model.predict(current_batch)[0]
        # store prediction
        test_predictions.append(current_pred)
        # update batch to now include prediction and drop first value
        current_batch = np.append(current_batch[:,1:,:],[[current_pred]],axis=1)
        
        
    #Comparing test data and predictions
    plt.figure(figsize=(13,4))
    plt.plot(np.arange(len(train)), train, label='Train')
    plt.plot(np.arange(len(train),len(train)+len(test)),test, label='Test')
    plt.plot(np.arange(len(train),len(train)+len(test)),test_predictions, label='Prediction')
    plt.legend()
    
    return test_predictions
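
The loop in `show_prediction` is a recursive (autoregressive) forecast: each prediction is fed back into the input window for the next step. The sketch below reproduces that mechanism without Keras, using a stand-in predictor (the mean of the window) in place of the trained network. `recursive_forecast` and `mean_predictor` are hypothetical names used only here.

```python
from collections import deque

def mean_predictor(window):
    """Stand-in for model.predict: forecasts the mean of the current window."""
    return sum(window) / len(window)

def recursive_forecast(history, length, steps, predict=mean_predictor):
    """Forecast `steps` values ahead, feeding each prediction back into the window."""
    window = deque(history[-length:], maxlen=length)  # last `length` observations
    forecasts = []
    for _ in range(steps):
        y_hat = predict(list(window))
        forecasts.append(y_hat)
        window.append(y_hat)  # maxlen drops the oldest value automatically
    return forecasts

print(recursive_forecast([1.0, 2.0, 3.0, 4.0], length=2, steps=3))
# [3.5, 3.75, 3.625]
```

Because every step after the first depends on earlier predictions rather than on observed values, errors can compound over a long test horizon.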

In [None]:
# This function prints and returns the mean squared error
def mse(test_predictions):
    loss = np.mean(np.square(test[:,0] - np.array(test_predictions)[:,0]), axis=-1)
    print("mse: "+str(loss))
          
    return loss

## Choosing the length

<div style="text-align: justify"> 
    We tested several values for length (the number of past time steps fed to the network for each prediction), and the best results came with 6, meaning each model was trained on input windows spanning 6 months. This is an interesting result: the dataset values are energy consumption percentages for a location that, unfortunately, the dataset's creator did not disclose. Still, in many regions of the world the season changes sharply every 6 months. In the northern hemisphere, for example, January is winter and July, 6 months later, is summer. 
</div>
<div style="text-align: justify"> 
    People commonly use more or less electricity depending on the season. So over the 6 months covered by each training window, consumption tends to rise steadily or fall steadily, depending on the seasons at the start and end of the window.
</div>
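
To make the role of `length` concrete, the sketch below builds the same kind of (input window, next value) pairs that `TimeseriesGenerator` yields when the data and the targets are the same series. It uses plain Python lists, and `make_windows` is a hypothetical helper, not part of Keras.

```python
def make_windows(series, length):
    """Pair each run of `length` consecutive values with the value that follows it."""
    inputs, targets = [], []
    for start in range(len(series) - length):
        inputs.append(series[start:start + length])
        targets.append(series[start + length])
    return inputs, targets

months = list(range(1, 11))            # ten dummy monthly readings: 1..10
X, y = make_windows(months, length=6)
print(len(X))        # 4 (a series of n points gives n - length windows)
print(X[0], y[0])    # [1, 2, 3, 4, 5, 6] 7
```

With the 318-point training split used in this notebook, the same rule gives 312 training samples.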

In [None]:
length = 6

## LSTM

In [None]:
# LSTM
# The plan is to run the following trainings:
# The length will be 6

# Number of epochs: 20
    # Training 1: 10 neurons
    
# Number of epochs: 40
    # Training 2: 10 neurons
     
# Number of epochs: 20
    # Training 3: 30 neurons (20 in one layer and 10 in another)
    
# Number of epochs: 40
    # Training 4: 30 neurons (20 in one layer and 10 in another)
    
# Save the results as we go, to present them in a table

### LSTM 10 cells

In [None]:
# This function builds the generators, defines the model, and trains it
def train_lstm_10_cells(epochs, length, train, test, save_model_to):
    
    #train and test generators
    generator = TimeseriesGenerator(train,train,length=length, batch_size=1)

    validation_generator = TimeseriesGenerator(test,test,length=length, batch_size=1)
    
    # Model
    model = Sequential()
    model.add(LSTM(10,activation='relu', input_shape=(length,1)))
    model.add(Dense(1))
    model.compile(optimizer='adam', loss='mse')
    model.summary()
    
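    # Train. With patience >= epochs, early stopping never triggers in these runs;
    # the checkpoint keeps the weights with the lowest validation loss.
    early_stop = EarlyStopping(monitor='val_loss', patience=40)
    ckpt = ModelCheckpoint(save_model_to, save_best_only=True, monitor='val_loss', verbose=1)
    
    # model.fit accepts generators directly; fit_generator is deprecated in TensorFlow 2.x
    history = model.fit(
           generator,
           epochs=epochs,
           validation_data=validation_generator,
           callbacks=[early_stop, ckpt])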
    
    
    return model, history



#### LSTM 10 cells, epochs = 20

In [None]:
#defining epochs
epochs = 20

In [None]:
#Splitting train and test data
train, test = split_train_test(show_graph=0)

In [None]:
#Compile and train
model, history = train_lstm_10_cells(epochs=epochs, 
                                     length=length, 
                                     train=train, 
                                     test=test, 
                                     save_model_to='model_lstm1.hdf5')

In [None]:
#loss graphic
show_loss(history)

In [None]:
#prediction
prediction = show_prediction(model, length=length, best_model='model_lstm1.hdf5')

In [None]:
#evaluation
mse_lstm_1 = mse(prediction)

#### LSTM 10 cells, epochs = 40

In [None]:
#defining epochs
epochs = 40

In [None]:
#Splitting train and test data
train, test = split_train_test(show_graph=0)

In [None]:
#Compile and train
model, history = train_lstm_10_cells(epochs=epochs, 
                                     length=length, 
                                     train=train, 
                                     test=test,
                                     save_model_to='model_lstm2.hdf5')

In [None]:
#loss graphic
show_loss(history)

In [None]:
#prediction
prediction = show_prediction(model, length=length, best_model='model_lstm2.hdf5')

In [None]:
#evaluation
mse_lstm_2 = mse(prediction)

### LSTM 30 cells

In [None]:
# This function builds the generators, defines the model, and trains it
def train_lstm_30_cells(epochs, length, train, test, save_model_to):
    
    #train and test generators
    generator = TimeseriesGenerator(train,train,length=length, batch_size=1)

    validation_generator = TimeseriesGenerator(test,test,length=length, batch_size=1)
    
    # Model
    model = Sequential()
    model.add(LSTM(20, activation='relu', return_sequences=True, input_shape=(length,1)))
    model.add(LSTM(10, activation='relu'))
    model.add(Dense(1))
    model.compile(optimizer='adam', loss='mse')
    model.summary()
    
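    # Train. With patience >= epochs, early stopping never triggers in these runs;
    # the checkpoint keeps the weights with the lowest validation loss.
    early_stop = EarlyStopping(monitor='val_loss', patience=50)
    ckpt = ModelCheckpoint(save_model_to, save_best_only=True, monitor='val_loss', verbose=1)
    # model.fit accepts generators directly; fit_generator is deprecated in TensorFlow 2.x
    history = model.fit(
           generator,
           epochs=epochs,
           validation_data=validation_generator,
           callbacks=[early_stop, ckpt])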
    
    
    return model, history

#### LSTM 30 cells, epochs = 20

In [None]:
#defining epochs
epochs = 20

In [None]:
#Splitting train and test data
train, test = split_train_test(show_graph=0)

In [None]:
#Compile and train
model, history = train_lstm_30_cells(epochs=epochs, 
                                     length=length, 
                                     train=train, 
                                     test=test,
                                     save_model_to='model_lstm3.hdf5')

In [None]:
#loss graphic
show_loss(history)

In [None]:
#prediction
prediction = show_prediction(model, length=length, best_model='model_lstm3.hdf5')

In [None]:
#evaluation
mse_lstm_3 = mse(prediction)

#### LSTM 30 cells, epochs = 40

In [None]:
#defining epochs
epochs = 40

In [None]:
#Splitting train and test data
train, test = split_train_test(show_graph=0)

In [None]:
#Compile and train
model, history = train_lstm_30_cells(epochs=epochs, 
                                     length=length, 
                                     train=train, 
                                     test=test,
                                     save_model_to='model_lstm4.hdf5')

In [None]:
#loss graphic
show_loss(history)

In [None]:
#prediction
prediction = show_prediction(model, length=length, best_model='model_lstm4.hdf5')

In [None]:
#evaluation
mse_lstm_4 = mse(prediction)

## GRU


In [None]:
# GRU
# The plan is to run the following trainings:
# The length will be 6

# Number of epochs: 20
    # Training 1: 10 neurons
    
# Number of epochs: 40
    # Training 2: 10 neurons
     
# Number of epochs: 20
    # Training 3: 30 neurons (20 in one layer and 10 in another)
    
# Number of epochs: 40
    # Training 4: 30 neurons (20 in one layer and 10 in another)
    
# Save the results as we go, to present them in a table

### GRU 10 cells

In [None]:
# This function builds the generators, defines the model, and trains it
def train_gru_10_cells(epochs, length, train, test, save_model_to):
    
    #train and test generators
    generator = TimeseriesGenerator(train,train,length=length, batch_size=1)

    validation_generator = TimeseriesGenerator(test,test,length=length, batch_size=1)
    
    # Model
    model = Sequential()
    model.add(GRU(10,activation='relu', input_shape=(length,1)))
    model.add(Dense(1))
    model.compile(optimizer='adam', loss='mse')
    model.summary()
    
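    # Train. With patience >= epochs, early stopping never triggers in these runs;
    # the checkpoint keeps the weights with the lowest validation loss.
    early_stop = EarlyStopping(monitor='val_loss', patience=50)
    ckpt = ModelCheckpoint(save_model_to, save_best_only=True, monitor='val_loss', verbose=1)
    # model.fit accepts generators directly; fit_generator is deprecated in TensorFlow 2.x
    history = model.fit(
           generator,
           epochs=epochs,
           validation_data=validation_generator,
           callbacks=[early_stop, ckpt])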
    
    
    return model, history



#### GRU 10 cells, epochs = 20

In [None]:
#defining epochs
epochs = 20

In [None]:
#Splitting train and test data
train, test = split_train_test(show_graph=0)

In [None]:
#Compile and train
model, history = train_gru_10_cells(epochs=epochs, 
                                     length=length, 
                                     train=train, 
                                     test=test,
                                     save_model_to='model_gru1.hdf5')

In [None]:
#loss graphic
show_loss(history)

In [None]:
#prediction
prediction = show_prediction(model, length=length, best_model='model_gru1.hdf5')

In [None]:
#evaluation
mse_gru_1 = mse(prediction)

#### GRU 10 cells, epochs = 40

In [None]:
#defining epochs
epochs = 40

In [None]:
#Splitting train and test data
train, test = split_train_test(show_graph=0)

In [None]:
#Compile and train
model, history = train_gru_10_cells(epochs=epochs, 
                                     length=length, 
                                     train=train, 
                                     test=test,
                                     save_model_to='model_gru2.hdf5')

In [None]:
#loss graphic
show_loss(history)

In [None]:
#prediction
prediction = show_prediction(model, length=length, best_model='model_gru2.hdf5')

In [None]:
#evaluation
mse_gru_2 = mse(prediction)

### GRU 30 cells

In [None]:
# This function builds the generators, defines the model, and trains it
def train_gru_30_cells(epochs, length, train, test, save_model_to):
    
    #train and test generators
    generator = TimeseriesGenerator(train,train,length=length, batch_size=1)

    validation_generator = TimeseriesGenerator(test,test,length=length, batch_size=1)
    
    # Model
    model = Sequential()
    model.add(GRU(20, activation='relu', return_sequences=True, input_shape=(length,1)))
    model.add(GRU(10, activation='relu'))
    model.add(Dense(1))
    model.compile(optimizer='adam', loss='mse')
    model.summary()
    
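    # Train. With patience >= epochs, early stopping never triggers in these runs;
    # the checkpoint keeps the weights with the lowest validation loss.
    early_stop = EarlyStopping(monitor='val_loss', patience=40)
    ckpt = ModelCheckpoint(save_model_to, save_best_only=True, monitor='val_loss', verbose=1)
    # model.fit accepts generators directly; fit_generator is deprecated in TensorFlow 2.x
    history = model.fit(
           generator,
           epochs=epochs,
           validation_data=validation_generator,
           callbacks=[early_stop, ckpt])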
    
    
    return model, history

#### GRU 30 cells, epochs = 20

In [None]:
#defining epochs
epochs = 20

In [None]:
#Splitting train and test data
train, test = split_train_test(show_graph=0)

In [None]:
#Compile and train
model, history = train_gru_30_cells(epochs=epochs, 
                                     length=length, 
                                     train=train, 
                                     test=test,
                                     save_model_to='model_gru3.hdf5')

In [None]:
#loss graphic
show_loss(history)

In [None]:
#prediction
prediction = show_prediction(model, length=length, best_model='model_gru3.hdf5')

In [None]:
#evaluation
mse_gru_3 = mse(prediction)

#### GRU 30 cells, epochs = 40

In [None]:
#defining epochs
epochs = 40

In [None]:
#Splitting train and test data
train, test = split_train_test(show_graph=0)

In [None]:
#Compile and train
model, history = train_gru_30_cells(epochs=epochs, 
                                     length=length, 
                                     train=train, 
                                     test=test,
                                     save_model_to='model_gru4.hdf5')

In [None]:
#loss graphic
show_loss(history)

In [None]:
#prediction
prediction = show_prediction(model, length=length, best_model='model_gru4.hdf5')

In [None]:
#evaluation
mse_gru_4 = mse(prediction)

# Results

In [None]:
df_results = {'base model': ['LSTM', 'LSTM', 'LSTM', 'LSTM',
                             'GRU', 'GRU', 'GRU', 'GRU'], 
              'architecture(cells)': ['10', '10', '30(20+10)', '30(20+10)',
                                      '10', '10', '30(20+10)', '30(20+10)'], 
              'epochs': ['20', '40', '20', '40',
                         '20', '40', '20', '40'],
             'mse': [mse_lstm_1, mse_lstm_2, mse_lstm_3, mse_lstm_4,
                                        mse_gru_1, mse_gru_2, mse_gru_3, mse_gru_4]} 
pd.DataFrame(df_results)

# Discussion

<div style="text-align: justify">
We added this discussion section to share our impressions from the training runs. Overall, the MSE (mean squared error) of both the LSTM and the GRU varied widely between runs. Sometimes we got excellent results, such as an MSE of 19, which in our case corresponds to an error of about 4.4% (the square root of the MSE, since the values are percentages). Other times the results were poor, such as an MSE around 800, corresponding to an error of about 28%. This volatility stems from the stochastic nature of the training process, for example the random weight initialization.
</div>
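
The error figures quoted above follow from taking the square root of the MSE, which brings it back to the units of the series. A minimal sketch of that conversion, with `rmse_from_mse` as an illustrative helper name:

```python
import math

def rmse_from_mse(mse_value):
    """Root mean squared error: the MSE expressed in the units of the data."""
    return math.sqrt(mse_value)

for mse_value in (19, 800):
    print(mse_value, round(rmse_from_mse(mse_value), 2))
# 19 4.36
# 800 28.28
```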

<div style="text-align: justify">
However, we noticed some patterns. For example, the LSTM seems to do slightly better than the GRU, both in MSE and in the predicted curve. The LSTM runs needed fewer attempts before the predicted curve correctly captured the wave pattern, its amplitude, and its trend (a slight increase in energy consumption). With the GRU, most runs produced a forecast that kept the oscillating pattern of consumption but got the amplitude and the trend wrong. It was therefore common for the GRU to produce a forecast with a much too steep upward trend, or even a downward trend in consumption.
</div>

<div style="text-align: justify">
These observations might be explained by the LSTM cell being able to carry more information over time than the GRU cell, which is simpler and may therefore have more difficulty retaining longer-term information.
</div>
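
One way to quantify "simpler" is the number of trainable parameters in the recurrent layer. The sketch below uses the standard counting formulas: an LSTM has four weight blocks (three gates plus the candidate state), a GRU has three. The GRU count assumes the TensorFlow 2 Keras default `reset_after=True`, which keeps two bias vectors per block; the function names are illustrative.

```python
def lstm_params(units, input_dim):
    """4 blocks, each with input weights, recurrent weights and one bias vector."""
    return 4 * (units * (input_dim + units) + units)

def gru_params(units, input_dim, reset_after=True):
    """3 blocks; reset_after=True (assumed Keras default) uses two bias vectors."""
    biases = 2 * units if reset_after else units
    return 3 * (units * (input_dim + units) + biases)

print(lstm_params(10, 1))  # 480
print(gru_params(10, 1))   # 390
```

For the single-layer models here (10 units, one input feature), the LSTM layer has about 23% more parameters than the GRU layer. That alone does not determine which one forecasts better.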

<div style="text-align: justify">
Another important observation was that the MSE tended to get worse as the number of epochs increased, which may be linked to overfitting. Further evidence of possible overfitting in some cases is that, in most runs, the validation loss stayed consistently above the training loss.
</div>
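
A simple way to inspect this is to compare the epoch with the lowest validation loss against the final epoch. The sketch below does that on a dictionary shaped like Keras's `history.history`. The loss values are made up purely for illustration and are not results from this notebook; `best_epoch` is a hypothetical helper.

```python
def best_epoch(history_dict):
    """Return (1-based epoch, val_loss) at which the validation loss was lowest."""
    val_losses = history_dict['val_loss']
    idx = min(range(len(val_losses)), key=val_losses.__getitem__)
    return idx + 1, val_losses[idx]

# Illustrative numbers only (not measured): training loss keeps falling
# while validation loss bottoms out and then rises.
fake_history = {'loss': [50.0, 30.0, 20.0, 15.0, 12.0],
                'val_loss': [60.0, 40.0, 35.0, 38.0, 45.0]}

epoch, val = best_epoch(fake_history)
print(epoch, val)  # 3 35.0
```

If the best epoch falls well before the last one, training longer is not helping on held-out data. The `ModelCheckpoint(save_best_only=True)` callback used in the training functions saves that epoch's weights, which `show_prediction` then loads.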