# Runner notebook

This notebook runs the model that you just trained. The second section (2. User input) expects the experiment id of the run that just finished. You can optionally replace the dummy datasets with your own, but the only supported inputs are the weather and greenhouse climate datasets.


<a id='f'></a>
## 1. Imports

Imports the necessary packages.

In [1]:
import os
import pickle

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from sklearn.metrics import mean_absolute_error, r2_score, mean_squared_error, mean_absolute_percentage_error

## 2. User input

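Fill in your input. The notebook expects:
- experiment_id: The id that is assigned when you run the project with MLflow (MLflow reports it as the run ID). It is printed at the end of the terminal output when the run finishes, and it is the name of the folder directly under `mlruns/0` that section 3 reads from.
- file_name_greenhouse (OPTIONAL): Name of the greenhouse climate dataset (a CSV file in the `input` folder)
- file_name_weather (OPTIONAL): Name of the weather dataset (a CSV file in the `input` folder)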

In [2]:
experiment_id = "USER_INPUT"
file_name_greenhouse = "dummy_greenhouse.csv"
file_name_weather = "dummy_weather.csv"

## 3. Load the model

Loads the trained model from the MLflow artifacts folder using the pickle package.

In [None]:
path_model = os.path.abspath(f"../../mlruns/0/{experiment_id}/artifacts/model/model.pkl")
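# Open the file in a context manager so the handle is closed after loading
with open(path_model, 'rb') as model_file:
    model = pickle.load(model_file)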
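If the experiment id is mistyped, the cell above fails with a fairly bare error. The following is a minimal sketch of a loader that checks the path first; `load_pickled_model` is a hypothetical helper that is not part of this project, and the demonstration uses a stand-in dictionary instead of a real model. Only unpickle files that you created yourself, since pickle can execute code while loading.

```python
import os
import pickle
import tempfile


def load_pickled_model(path_model):
    """Hypothetical helper: load a pickled object, failing early with a readable message."""
    if not os.path.isfile(path_model):
        raise FileNotFoundError(
            f"No model file at {path_model}; check experiment_id and the mlruns folder."
        )
    with open(path_model, "rb") as model_file:
        return pickle.load(model_file)


# Self-contained demonstration with a stand-in object instead of a real model.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "model.pkl")
    with open(demo_path, "wb") as model_file:
        pickle.dump({"kind": "stand-in"}, model_file)

    # Loading an existing file returns the pickled object.
    demo_model = load_pickled_model(demo_path)

    # Loading a missing file raises FileNotFoundError with the message above.
    try:
        load_pickled_model(os.path.join(tmp_dir, "missing.pkl"))
        missing_raised = False
    except FileNotFoundError:
        missing_raised = True
```

In this notebook the helper would be called as `model = load_pickled_model(path_model)`.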

## 4. Load the test data

Loads the test data from the `input` folder into pandas DataFrames.

In [None]:
path_greenhouse_csv = os.path.abspath(f'input/{file_name_greenhouse}')
path_weather_csv = os.path.abspath(f'input/{file_name_weather}')

In [None]:
df_greenhouse = pd.read_csv(path_greenhouse_csv)
df_weather = pd.read_csv(path_weather_csv)

## 5. Preprocess the data

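Preprocesses the data: indexes both datasets by time, joins them, selects the features, fills in the empty values, adds the engineered features, and runs the model on the result.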

In [None]:
df_greenhouse = df_greenhouse.set_index('time')
df_weather = df_weather.set_index('time')

In [None]:
data = df_weather.join(df_greenhouse)
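`DataFrame.join` aligns rows on the index and defaults to a left join, so every weather row is kept and greenhouse values are missing wherever the greenhouse dataset has no matching `time`. A small sketch with made-up values (the `_demo` frames and their numbers are illustrative only):

```python
import pandas as pd

# Made-up weather rows for time steps 1, 2, 3.
df_weather_demo = pd.DataFrame(
    {"time": [1, 2, 3], "Tout": [10.0, 11.0, 12.0]}
).set_index("time")

# Made-up greenhouse rows for time steps 2, 3, 4.
df_greenhouse_demo = pd.DataFrame(
    {"time": [2, 3, 4], "Tair": [20.0, 21.0, 22.0]}
).set_index("time")

# Left join on the index: keeps time steps 1, 2, 3 from the weather frame.
# Tair is NaN at time step 1, and greenhouse time step 4 is dropped.
joined_demo = df_weather_demo.join(df_greenhouse_demo)
print(joined_demo)
```

Rows without a greenhouse match are the ones that the later fill step replaces with column means.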

In [None]:
# Feature selection
data = data[["Iglob", "PARout", "Rhout", "Tout", "HumDef", "Tair", "t_heat_sp"]]

In [None]:
# Fill in the empty values
data = data.fillna(data.mean())
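`data.mean()` returns one mean per column (ignoring missing values), and `fillna` uses each column's own mean for that column's gaps. A short sketch with made-up numbers:

```python
import numpy as np
import pandas as pd

# Made-up frame with one gap in each column.
demo = pd.DataFrame(
    {"Tout": [10.0, np.nan, 14.0], "Rhout": [80.0, 90.0, np.nan]}
)

# Column means skip NaN: Tout -> 12.0, Rhout -> 85.0.
filled = demo.fillna(demo.mean())
print(filled)
```

Note that a column consisting only of missing values has no mean, so it would stay empty after this step.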

In [None]:
# Feature engineering
data = data.assign(
    t=lambda df: np.arange(len(df.index)) + 1,            # running row counter, starting at 1
    hour_of_day=lambda df: np.arange(len(df.index)) + 1,  # computed the same way as t
    month=6,                                              # constant value for every row
)
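To make the result of the feature engineering step concrete, the sketch below applies the same `assign` call to a three-row frame. The index labels and `Tout` values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Made-up three-row frame standing in for the joined data.
demo = pd.DataFrame(
    {"Tout": [10.0, 11.0, 12.0]}, index=["00:00", "00:05", "00:10"]
)

demo = demo.assign(
    t=lambda df: np.arange(len(df.index)) + 1,            # 1, 2, 3
    hour_of_day=lambda df: np.arange(len(df.index)) + 1,  # also 1, 2, 3
    month=6,                                              # 6 in every row
)
print(demo)
```

As written, `hour_of_day` is a row counter identical to `t` rather than a value derived from the `time` index, and `month` is fixed at 6 regardless of the dates in the data.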

In [None]:
data['model_pred'] = model.predict(data.drop(columns="Tair"))

## 6. Output

Reports the error metrics of the model predictions on the test data and plots the predictions against the measured values.

In [None]:
# Calculate the metrics
# The built-in round() works whether scikit-learn returns a Python float or a NumPy float
mae = round(mean_absolute_error(data.Tair, data.model_pred), 3)
mse = round(mean_squared_error(data.Tair, data.model_pred), 3)
rmse = round(np.sqrt(mean_squared_error(data.Tair, data.model_pred)), 3)
mape = round(mean_absolute_percentage_error(data.Tair, data.model_pred), 3)
r2 = round(r2_score(data.Tair, data.model_pred), 3)

print(f'MAE: {mae}')
print(f'MSE: {mse}')
print(f'RMSE: {rmse}')
print(f'MAPE: {mape}')
print(f'r2: {r2}')
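For reference, the sketch below computes the same five metrics by hand with NumPy on made-up values and compares them with scikit-learn. The arrays are illustrative and unrelated to the datasets used here. Note that `mean_absolute_percentage_error` returns a fraction (0.05 means 5%), not a percentage.

```python
import numpy as np
from sklearn.metrics import (
    mean_absolute_error,
    mean_absolute_percentage_error,
    mean_squared_error,
    r2_score,
)

# Made-up measured and predicted temperatures.
y_true = np.array([20.0, 22.0, 24.0, 26.0])
y_pred = np.array([21.0, 21.0, 25.0, 25.0])

errors = y_true - y_pred  # [-1, 1, -1, 1]

mae_manual = np.mean(np.abs(errors))            # mean absolute error
mse_manual = np.mean(errors ** 2)               # mean squared error
rmse_manual = np.sqrt(mse_manual)               # root of the MSE, same unit as y_true
mape_manual = np.mean(np.abs(errors / y_true))  # mean absolute error relative to y_true

# R^2 compares the squared errors with the spread of y_true around its mean.
ss_res = np.sum(errors ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2_manual = 1 - ss_res / ss_tot

# The manual values agree with scikit-learn.
print(np.isclose(mae_manual, mean_absolute_error(y_true, y_pred)))
print(np.isclose(mse_manual, mean_squared_error(y_true, y_pred)))
print(np.isclose(mape_manual, mean_absolute_percentage_error(y_true, y_pred)))
print(np.isclose(r2_manual, r2_score(y_true, y_pred)))
```

Because MAPE divides by the measured value, it becomes unstable when `Tair` is close to zero.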

In [None]:
# Plot the test data with the predictions
fig, ax = plt.subplots(figsize=(18,6))
data[['Tair']].plot(ax=ax, c='blue')
data[['model_pred']].plot(ax=ax, c='red')
ax.legend(["test set", "model prediction"], prop={'size': 15});