# MLP Example

This demo takes you through all the steps involved in predicting a tag's value from the values of the other tags in a data source.

We'll use the scikit-learn implementation of a multi-layer perceptron (MLP), which is a simple neural network.

This example uses the demo data source "IP Datasource 2", so make sure you have authorised it.

Set up the Intelligent Plant clients.

In [None]:
import intelligent_plant.app_store_client as app_store_client
import os
app_store = app_store_client.AppStoreClient(os.environ["ACCESS_TOKEN"])
data_core = app_store.get_data_core_client()

In [None]:
import intelligent_plant.utility as utility
import matplotlib.pyplot as plt
import pandas as pd

We'll query all of the tags in IP Datasource 2 and make a list of just their names.

In [None]:
dsn = "IP Datasource 2"

Query the data source for all tags. Tags are requested in pages of 50 at a time.

For larger data sources you should use the tag filter options to find the tags relevant to the problem you are trying to solve.

In [None]:
tags = []

page_size = 50
page_num = 1
while True:
    page = data_core.get_tags(dsn, page_num, page_size)

    # Append the page just fetched to the list of all tags.
    tags += page

    page_num += 1

    # A page shorter than the requested page size is the last page.
    if len(page) < page_size:
        break

# Reduce the tag metadata to just the tag names and filter out the "TIME" tag.
tag_names = [tag["Id"] for tag in tags if tag["Id"] != "TIME"]
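
The paging loop above can be factored into a small reusable helper. The sketch below is an illustration only: `fetch_all_pages`, `fake_get_tags` and `FAKE_TAGS` are hypothetical names, and the fake function stands in for the real client so that the example runs without a connection.

```python
def fetch_all_pages(fetch_page, page_size=50):
    """Collect items from a 1-indexed paged source until a short page is returned."""
    items = []
    page_num = 1
    while True:
        page = fetch_page(page_num, page_size)
        items += page
        # A page shorter than the requested size is the last page.
        if len(page) < page_size:
            return items
        page_num += 1


# Hypothetical stand-in for a data source that holds 120 tags.
FAKE_TAGS = [{"Id": f"Tag {i}"} for i in range(120)]


def fake_get_tags(page_num, page_size):
    """Return one page of the fake tag list (pages are numbered from 1)."""
    start = (page_num - 1) * page_size
    return FAKE_TAGS[start:start + page_size]


all_tags = fetch_all_pages(fake_get_tags, page_size=50)
print(len(all_tags))
```

With the real client the call might look like `fetch_all_pages(lambda n, size: data_core.get_tags(dsn, n, size))`, assuming `get_tags` takes its arguments in the order used in the cell above.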

Fetch 10 days of data for the selected tags, with one point per hour. The data will be interpolated.

In [None]:
all_data = data_core.get_processed_data({dsn: tag_names}, "*-10d", "*", "1h", "interp")

Convert the returned data into a data frame to make it easier to work with.

In [None]:
all_data_frame = utility.query_result_to_data_frame(all_data)

Remove the timestamp column from the data. It is ignored in this simple example.

In [None]:
all_data_frame.drop('TimeStamp', axis=1, inplace=True)

In [None]:
all_data_frame

We can easily plot the contents of a data frame.

In [None]:
all_data_frame.plot(legend=False)

In [None]:
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

Set up a processing pipeline. This encapsulates the scaling step and the MLP in a single object, which allows us to train and test them together.

In [None]:
pipe = make_pipeline(StandardScaler(), MLPRegressor(solver='lbfgs', alpha=1e-5, hidden_layer_sizes=(100,), activation='logistic'))
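
To make the scaling step concrete, the sketch below standardises a single column of made-up values by hand, using only the standard library. It illustrates the formula z = (x - mean) / std and is not scikit-learn's implementation; `standardise` is a hypothetical helper. The population standard deviation (`pstdev`) is used, which matches what `StandardScaler` computes.

```python
from statistics import fmean, pstdev


def standardise(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = fmean(values)
    # Fall back to 1.0 for a constant column to avoid dividing by zero.
    std = pstdev(values) or 1.0
    return [(v - mean) / std for v in values]


# Made-up readings with mean 5 and population standard deviation 2.
readings = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
scaled = standardise(readings)
print(scaled)
```

After scaling, every input tag has a comparable range, so no single tag dominates the MLP's training because of its units.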

Randomly split the data into training and testing sets.

We will try to predict the value of the last tag, "WI Pump D_Suct_PI", by using the values of all the other tags.

In [None]:
X_train, X_test, y_train, y_test = train_test_split(all_data_frame[tag_names[:-1]], all_data_frame[tag_names[-1]])
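
By default `train_test_split` holds out 25% of the rows for testing, and passing `random_state` makes the shuffle repeatable between runs. The sketch below shows the idea of a seeded random split using only the standard library. It is an illustration, not scikit-learn's implementation, and `split_rows` is a hypothetical helper.

```python
import random


def split_rows(rows, test_fraction=0.25, seed=0):
    """Shuffle row indices with a fixed seed and split the rows into train and test lists."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = int(len(rows) * test_fraction)
    test_idx = indices[:n_test]
    train_idx = indices[n_test:]
    return [rows[i] for i in train_idx], [rows[i] for i in test_idx]


# Stand-in for 240 hourly rows (10 days of data).
rows = list(range(240))
train, test = split_rows(rows)
print(len(train), len(test))
```

Because the seed is fixed, calling `split_rows(rows)` again returns the same split, which makes scores comparable from one run to the next.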

The MLP will try to learn the relationship between the input tags (X_train) and the target tag (y_train).

Use the training data to train the model.

In [None]:
pipe.fit(X_train, y_train)

In [None]:
prediction = pipe.predict(X_test)

In [None]:
prediction_df = pd.DataFrame({ "actual": y_test, "prediction": prediction }).reset_index(drop=True)

In [None]:
prediction_df.plot()

The score function can be used to determine the coefficient of determination (R^2) of our model. A score of 1 means that the prediction is perfect.

In [None]:
pipe.score(X_test, y_test)
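
As a worked illustration of the coefficient of determination, the sketch below computes R^2 = 1 - SS_res / SS_tot by hand on made-up numbers. `r_squared` is a hypothetical helper written for this example, not part of scikit-learn.

```python
def r_squared(actual, predicted):
    """Return 1 - (residual sum of squares) / (total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


actual = [1.0, 2.0, 3.0, 4.0]

perfect = r_squared(actual, actual)                 # predictions match exactly
baseline = r_squared(actual, [2.5, 2.5, 2.5, 2.5])  # always predict the mean
poor = r_squared(actual, [4.0, 3.0, 2.0, 1.0])      # predictions are reversed

print(perfect, baseline, poor)
```

A model that always predicts the mean of the target scores 0, and a model that does worse than that scores below 0, so the score is not limited to the range 0 to 1.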

In [None]:
from sklearn.metrics import mean_absolute_error

Other metrics can also be calculated, such as the mean absolute error. In this case 0 would be perfect.

In [None]:
mean_absolute_error(y_test, prediction)
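
For reference, the mean absolute error is the average of |actual - predicted| over all test rows. The sketch below computes it by hand on made-up numbers. `mean_abs_error` is a hypothetical helper for illustration, not the scikit-learn function.

```python
def mean_abs_error(actual, predicted):
    """Average the absolute differences between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


actual = [10.0, 12.0, 14.0]
predicted = [11.0, 12.0, 11.0]

# Absolute errors are 1, 0 and 3, so the mean is 4 / 3.
mae = mean_abs_error(actual, predicted)
print(mae)
```

Unlike R^2, the mean absolute error is expressed in the same units as the predicted tag, so it reads directly as a typical error size.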