# Train and Test an LSTM model

This script shows how to train, test and save a model using our modules.

## Data Management

First, some data management procedures are required, so import the data_manner.py file.

In [1]:
# If this script is running in another folder, change the base path to the /src folder.
import sys
sys.path.append("../src")

# Then import the data_manner.py file.
import data_manner

Next, define the variables used to collect the data from a specific repository.

In [2]:
# specific code of the data repository.
repo = "p971074907"
# country and state acronyms separated by a ":"
path = "brl:rn"
# columns (or features) to be extracted from the database, separated by ":"
feature = "date:newDeaths:newCases:"
# start date for the data request.
begin = "2020-05-01"
# finish date for the data request.
end = "2021-07-01"
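
The `path` and `feature` values above pack several fields into one colon-separated string. How the modules parse them internally is not shown here; the following is only a minimal sketch of one way such strings could be split, and `split_fields` is a hypothetical helper, not part of data_manner.py.

```python
def split_fields(spec):
    """Split a colon-separated string, dropping empty pieces
    (for example the one produced by a trailing ':')."""
    return [piece for piece in spec.split(":") if piece]

# Hypothetical usage with the values defined above.
country, state = split_fields("brl:rn")
features = split_fields("date:newDeaths:newCases:")

print(country, state)  # brl rn
print(features)        # ['date', 'newDeaths', 'newCases']
```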

In our modules, almost all procedures use class objects and methods. To collect data from a web .csv file or from our repository data file, create a DataConstructor() instance and call its .collect_dataframe() method with the values just defined.

In [3]:
# creating the DataConstructor instance
data_constructor = data_manner.DataConstructor()
# collect data from repository.
collected_data = data_constructor.collect_dataframe(path, repo, feature, begin, end)

The collected data holds N features, where each feature is a vector with one entry per day of the requested period.

In [4]:
print("Feature 0 (newDeaths) from 2020-05-01 to 2021-07-01: length ", len(collected_data[0]))
print("Feature 1 (newCases) from 2020-05-01 to 2021-07-01: length ", len(collected_data[1]))

Feature 0 (newDeaths) from 2020-05-01 to 2021-07-01: length  427
Feature 1 (newCases) from 2020-05-01 to 2021-07-01: length  427


Internally, some data processing is applied to deal with variation in the data, such as a moving average or data differencing.
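
To make these two operations concrete, here is a small NumPy sketch. It is an illustration only: the window length of 7, the function names, and the toy series are assumptions, and the modules may implement the processing differently.

```python
import numpy as np

def moving_average(series, window=7):
    """Mean over a sliding window; returns len(series) - window + 1 points."""
    kernel = np.ones(window) / window
    return np.convolve(series, kernel, mode="valid")

def difference(series):
    """First-order difference: x[t] - x[t-1]."""
    return np.diff(series)

# Toy daily counts (made up for illustration).
daily = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])

smoothed = moving_average(daily, window=7)  # two points, approximately 4 and 5
changes = difference(daily)                 # seven points, each equal to 1
print(smoothed, changes)
```

Smoothing shortens the series by `window - 1` points and differencing by one point, which is worth keeping in mind when comparing lengths before and after processing.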

With the collected data, call the .build_train_test() method to split it into train and test sets and arrange each set into the shape the model expects.

In [5]:
# To change the test size, modify the configure.json param "data_test_size_in_days", in the /doc folder.
train, test = data_constructor.build_train_test(collected_data)

print("Train X and Train target shapes: ", train.x.shape, train.y.shape)
print("Test X and Test target shapes: ", test.x.shape, test.y.shape)

Train X and Train target shapes:  (378, 7, 2) (378, 7, 1)
Test X and Test target shapes:  (4, 7, 2) (4, 7, 1)
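
The shapes above follow a (samples, days per window, features) layout. The sketch below shows one common way to build such windows from a (features, days) array. It is an assumption-laden illustration: the stride, the choice of the first feature as the target, and the `make_windows` name are hypothetical, and it does not claim to reproduce the exact sample counts printed above.

```python
import numpy as np

def make_windows(data, window=7, stride=1):
    """Turn a (n_features, n_days) array into
    x: (n, window, n_features) and y: (n, window, 1).

    Each x window is followed in time by its y window, and y keeps
    only the first feature (an assumption for this sketch).
    """
    series = np.asarray(data, dtype=float).T  # (n_days, n_features)
    xs, ys = [], []
    last_start = len(series) - 2 * window
    for start in range(0, last_start + 1, stride):
        xs.append(series[start:start + window])
        ys.append(series[start + window:start + 2 * window, :1])
    return np.array(xs), np.array(ys)

# Toy data: 2 features over 30 days.
toy = np.arange(60).reshape(2, 30)
x, y = make_windows(toy)
print(x.shape, y.shape)  # (17, 7, 2) (17, 7, 1)
```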


## Creating and training a model

As mentioned above, to create a model, instantiate the class of the desired model.

In our modules, the models folder contains one file for each implemented model. This test uses an LSTM model, so import the lstm_manner.py file.

In [6]:
# The system path was already changed above, so there is no need to change it again.

# access the models/artificial path and import
from models.artificial import lstm_manner

Now, create an LSTM model constructor instance and call the .creating() method to set up the model architecture defined in the configure.json file (/doc).

In [7]:
lstm_model = lstm_manner.ModelLSTM(path)
lstm_model.creating()

Once the model is created with the desired architecture, call the .fiting() method, passing train.x and train.y, to train it with the internal default routine and the parameters defined in the configure.json file.

In [8]:
# the verbose argument must be zero if the machine used to train the model has no GPU.
lstm_model.fiting(train.x, train.y, verbose=0)

True

## Testing the just trained model

Once the model is trained, call the .predicting() method of the model instance, passing the test.x data, and store the result in a variable.

In [9]:
yhat = lstm_model.predicting(test.x)

With the predicted values, use methods such as .calculate_rmse() and .calculate_mse() to compute those metrics for the forecast.

In [10]:
predicted_rmse = lstm_model.calculate_rmse(test.y, yhat)
predicted_mse = lstm_model.calculate_mse(test.y, yhat)

print("RMSE to data Test: ", predicted_rmse)
print("MSE to data Test: ", predicted_mse)

RMSE to data Test:  [14.35729610622325, 4.11787575083366, 3.082998106701867, 26.25647594600296]
MSE to data Test:  [206.13195148177329, 16.95690069930388, 9.504877325927295, 689.4025291030321]


The returned values are the metric computed for each test.x input, one value per test sample.
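
For reference, a per-sample MSE and RMSE can be sketched with NumPy as below. This is not the code behind .calculate_mse() or .calculate_rmse(); the function names and toy arrays are assumptions. It does match one property visible in the output above: each RMSE is the square root of the corresponding MSE (for example, 14.357² ≈ 206.13).

```python
import numpy as np

def mse_per_sample(y_true, y_pred):
    """Mean squared error computed separately for each sample (first axis)."""
    y_true = np.asarray(y_true, dtype=float).reshape(len(y_true), -1)
    y_pred = np.asarray(y_pred, dtype=float).reshape(len(y_pred), -1)
    return np.mean((y_true - y_pred) ** 2, axis=1)

def rmse_per_sample(y_true, y_pred):
    """Root mean squared error for each sample."""
    return np.sqrt(mse_per_sample(y_true, y_pred))

# Toy data shaped like the targets: (samples, 7, 1).
y_true = np.zeros((2, 7, 1))
y_pred = np.stack([np.full((7, 1), 3.0), np.full((7, 1), 4.0)])

print(mse_per_sample(y_true, y_pred))   # [ 9. 16.]
print(rmse_per_sample(y_true, y_pred))  # [3. 4.]
```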

## Saving the trained model

Whether or not the metrics are satisfactory, the model can be saved by calling the .saving() method.

A .h5 file will be saved to the /dbs/fitted_model path, named with a unique ID (UUID).

Along with the model, a file named metada.json (/doc) will also be saved, storing information regarding the training of the model.

In [11]:
lstm_model.saving()
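
The naming scheme described above can be illustrated with the standard library. The sketch below only generates a UUID-based file name and writes a small JSON record to a temporary folder; it does not save a real model, and the `save_metadata` helper, the metadata file name, and the `epochs` value are hypothetical.

```python
import json
import tempfile
import uuid
from pathlib import Path

def save_metadata(folder, info):
    """Create a UUID-based model file name and write a JSON record beside it.

    Returns (model_path, meta_path). No model file is actually written.
    """
    model_id = uuid.uuid4().hex              # 32 random hexadecimal characters
    folder = Path(folder)
    model_path = folder / f"{model_id}.h5"   # where a model file would go
    meta_path = folder / f"{model_id}.json"  # hypothetical metadata file name
    meta_path.write_text(json.dumps({"model_id": model_id, **info}, indent=2))
    return model_path, meta_path

with tempfile.TemporaryDirectory() as tmp:
    # "epochs" is a made-up value used only for illustration.
    model_path, meta_path = save_metadata(tmp, {"path": "brl:rn", "epochs": 10})
    saved = json.loads(meta_path.read_text())
    print(model_path.name, saved["path"])
```

Storing the ID inside the metadata record is one simple way to link a saved model file back to the settings used to train it.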