# Long Short-Term Memory

This Jupyter notebook implements a long short-term memory (LSTM) model with Keras, using pandas and scikit-learn (sklearn) for data preparation. It is intended for testing the algorithm and for completing the [forecasting.md](https://github.com/Hurence/historian/blob/forecasting/docs/forecasting.md) document.

In [1]:
import time
import sklearn.linear_model as sk
import pandas as pd
from sklearn.metrics import r2_score
from sklearn.preprocessing import MinMaxScaler
import matplotlib.pyplot as plt
# LSTM
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import Dropout

In [2]:
# Load the dataset
# ts_data = pd.read_csv('data/dataHistorian.csv', sep=';', encoding='cp1252')
ts_data = pd.read_csv('data/it-data-4metrics.csv', sep=',')

ts_data.head()

|   | metric_id | timestamp | value | metric_name | warn | crit | min | max |
|---|---|---|---|---|---|---|---|---|
| 0 | 091c334c-a90a-4d8f-ba75-2c936220cd64 | 1575157723 | 13.375 | cpu_prct_used | 85.0 | 95.0 |  |  |
| 1 | 091c334c-a90a-4d8f-ba75-2c936220cd64 | 1575157423 | 13.5 | cpu_prct_used | 85.0 | 95.0 |  |  |
| 2 | 091c334c-a90a-4d8f-ba75-2c936220cd64 | 1575157123 | 13.375 | cpu_prct_used | 85.0 | 95.0 |  |  |
| 3 | 091c334c-a90a-4d8f-ba75-2c936220cd64 | 1575156823 | 13.5 | cpu_prct_used | 85.0 | 95.0 |  |  |
| 4 | 091c334c-a90a-4d8f-ba75-2c936220cd64 | 1575156523 | 13.75 | cpu_prct_used | 85.0 | 95.0 |  |  |
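
The `timestamp` column in the preview above looks like Unix epoch seconds, and the newest sample is listed first. A minimal sketch, assuming that interpretation, of converting the column to datetimes and sorting oldest-first, which is the order a sequence model usually expects. The `sample` frame is a made-up stand-in built from three of the rows shown above, and the `datetime` column name is only illustrative.

```python
import pandas as pd

# Hypothetical stand-in built from the first rows printed above.
sample = pd.DataFrame({
    "timestamp": [1575157723, 1575157423, 1575157123],
    "value": [13.375, 13.5, 13.375],
})

# Assumption: the timestamps are Unix epoch seconds.
sample["datetime"] = pd.to_datetime(sample["timestamp"], unit="s")

# Sort oldest-first so the rows follow the order in which samples were recorded.
sample = sample.sort_values("timestamp").reset_index(drop=True)

# Spacing in seconds between consecutive samples, useful for spotting gaps.
step = sample["timestamp"].diff().dropna()
print(sample)
print(step.unique())
```

If the spacing is not constant on the real data, resampling or gap handling may be needed before building training sequences.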


In [13]:
# Keep only the first four columns: metric_id, timestamp, value, metric_name
ts_data = ts_data.iloc[:,0:4]
ts_data.head()

|   | metric_id | timestamp | value | metric_name |
|---|---|---|---|---|
| 0 | 091c334c-a90a-4d8f-ba75-2c936220cd64 | 1575157723 | 13.375 | cpu_prct_used |
| 1 | 091c334c-a90a-4d8f-ba75-2c936220cd64 | 1575157423 | 13.5 | cpu_prct_used |
| 2 | 091c334c-a90a-4d8f-ba75-2c936220cd64 | 1575157123 | 13.375 | cpu_prct_used |
| 3 | 091c334c-a90a-4d8f-ba75-2c936220cd64 | 1575156823 | 13.5 | cpu_prct_used |
| 4 | 091c334c-a90a-4d8f-ba75-2c936220cd64 | 1575156523 | 13.75 | cpu_prct_used |
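
Slicing by position with `iloc[:, 0:4]` depends on the column order in the CSV file. A minimal sketch of selecting the same columns by name instead, which keeps working if the file layout changes. The one-row `sample` frame is hypothetical and only mirrors the column names shown in the first preview.

```python
import pandas as pd

# Hypothetical one-row frame with the same columns as the dataset preview.
sample = pd.DataFrame({
    "metric_id": ["091c334c-a90a-4d8f-ba75-2c936220cd64"],
    "timestamp": [1575157723],
    "value": [13.375],
    "metric_name": ["cpu_prct_used"],
    "warn": [85.0],
    "crit": [95.0],
    "min": [float("nan")],
    "max": [float("nan")],
})

# Columns to keep, listed explicitly instead of relying on their position.
keep = ["metric_id", "timestamp", "value", "metric_name"]

by_name = sample[keep]             # selection by column name
by_position = sample.iloc[:, 0:4]  # selection by position, as in the cell above

print(by_name.equals(by_position))
```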


In [4]:
# Build two dictionaries: dic_name maps each metric_name to its list of metric_ids,
# and dic_id maps each metric_id to its metric_name
dic_name = {}
dic_id = {}
for indx in ts_data.index:
    if ts_data['metric_name'][indx] not in dic_name.keys():
        dic_name[ts_data['metric_name'][indx]] = []
    if ts_data['metric_id'][indx] not in dic_name[ts_data['metric_name'][indx]]:
        dic_name[ts_data['metric_name'][indx]].append(ts_data['metric_id'][indx])
        dic_id[ts_data['metric_id'][indx]] = [ts_data['metric_name'][indx]]
keys_name = list(dic_name.keys())
keys_id = list(dic_id.keys())
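
The row-by-row loop above can be slow on large frames. A minimal sketch of building the same two mappings with vectorised pandas operations. The `sample` frame, the `id-*` identifiers and the `other_metric` name are invented for illustration; only `cpu_prct_used` comes from the dataset.

```python
import pandas as pd

# Hypothetical data: two ids for one metric name, one id for another.
sample = pd.DataFrame({
    "metric_id": ["id-a", "id-a", "id-b", "id-c"],
    "metric_name": ["cpu_prct_used", "cpu_prct_used", "cpu_prct_used", "other_metric"],
})

# One row per distinct (metric_name, metric_id) pair, in order of first appearance.
pairs = sample[["metric_name", "metric_id"]].drop_duplicates()

# metric_name -> list of metric_ids (sort=False keeps the order of first appearance)
dic_name = pairs.groupby("metric_name", sort=False)["metric_id"].apply(list).to_dict()

# metric_id -> [metric_name], the same single-element list shape the loop produces
dic_id = {mid: [name] for name, mid in zip(pairs["metric_name"], pairs["metric_id"])}

keys_name = list(dic_name)
keys_id = list(dic_id)
print(dic_name)
print(dic_id)
```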

In [5]:
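# Select the rows that belong to the first metric_id
indexNames = ts_data[ts_data['metric_id'] == keys_id[0]].index
# iloc accepts these labels only because ts_data still has its default 0..n-1 index
data = ts_data.iloc[indexNames]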
data

|   | metric_id | timestamp | value | metric_name |
|---|---|---|---|---|
| 0 | 091c334c-a90a-4d8f-ba75-2c936220cd64 | 1575157723 | 13.375 | cpu_prct_used |
| 1 | 091c334c-a90a-4d8f-ba75-2c936220cd64 | 1575157423 | 13.500 | cpu_prct_used |
| 2 | 091c334c-a90a-4d8f-ba75-2c936220cd64 | 1575157123 | 13.375 | cpu_prct_used |
| 3 | 091c334c-a90a-4d8f-ba75-2c936220cd64 | 1575156823 | 13.500 | cpu_prct_used |
| 4 | 091c334c-a90a-4d8f-ba75-2c936220cd64 | 1575156523 | 13.750 | cpu_prct_used |
| ... | ... | ... | ... | ... |
| 1695 | 091c334c-a90a-4d8f-ba75-2c936220cd64 | 1574641218 | 16.250 | cpu_prct_used |
| 1696 | 091c334c-a90a-4d8f-ba75-2c936220cd64 | 1574640918 | 17.125 | cpu_prct_used |
| 1697 | 091c334c-a90a-4d8f-ba75-2c936220cd64 | 1574640618 | 15.125 | cpu_prct_used |
| 1698 | 091c334c-a90a-4d8f-ba75-2c936220cd64 | 1574640318 | 14.625 | cpu_prct_used |


In [12]:
# Rescale the values of the selected metric to the [0, 1] range
scaler = MinMaxScaler(feature_range=(0, 1))
data.loc[:, 'value'] = scaler.fit_transform(data['value'].to_numpy().reshape(-1, 1))
data

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  isetter(loc, value[:, i].tolist())


|   | metric_id | timestamp | value | metric_name |
|---|---|---|---|---|
| 0 | 091c334c-a90a-4d8f-ba75-2c936220cd64 | 1575157723 | 0.285303 | cpu_prct_used |
| 1 | 091c334c-a90a-4d8f-ba75-2c936220cd64 | 1575157423 | 0.288184 | cpu_prct_used |
| 2 | 091c334c-a90a-4d8f-ba75-2c936220cd64 | 1575157123 | 0.285303 | cpu_prct_used |
| 3 | 091c334c-a90a-4d8f-ba75-2c936220cd64 | 1575156823 | 0.288184 | cpu_prct_used |
| 4 | 091c334c-a90a-4d8f-ba75-2c936220cd64 | 1575156523 | 0.293948 | cpu_prct_used |
| ... | ... | ... | ... | ... |
| 1695 | 091c334c-a90a-4d8f-ba75-2c936220cd64 | 1574641218 | 0.351585 | cpu_prct_used |
| 1696 | 091c334c-a90a-4d8f-ba75-2c936220cd64 | 1574640918 | 0.371758 | cpu_prct_used |
| 1697 | 091c334c-a90a-4d8f-ba75-2c936220cd64 | 1574640618 | 0.325648 | cpu_prct_used |
| 1698 | 091c334c-a90a-4d8f-ba75-2c936220cd64 | 1574640318 | 0.314121 | cpu_prct_used |
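
The warning printed in the output above is pandas' notice that `data` was derived from `ts_data` by indexing, so an in-place assignment may target a temporary copy. A minimal sketch of one common way to avoid it: take an explicit `.copy()` of the selected rows before scaling. The `ts_sample` frame and its `id-*` values are hypothetical stand-ins for the real data.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical stand-in for ts_data with two metric ids.
ts_sample = pd.DataFrame({
    "metric_id": ["id-a", "id-a", "id-a", "id-b"],
    "value": [10.0, 15.0, 20.0, 99.0],
})

# .copy() makes `data` an independent frame, so the assignment below
# no longer targets a slice of ts_sample.
data = ts_sample.loc[ts_sample["metric_id"] == "id-a"].copy()

scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(data[["value"]])  # 2-D input of shape (n_samples, 1)
data["value"] = scaled.ravel()

# The fitted scaler can map scaled values back to the original units,
# which is useful later for reading predictions in real units.
restored = scaler.inverse_transform(data[["value"]]).ravel()
print(data)
print(restored)
```

Keeping the fitted `scaler` object around matters because the same minimum and maximum must be reused when converting model outputs back to the original scale.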
