# Train a model
This example notebook trains a model for the location with id 287. The data for this location is in the 'data' folder. 
First, the prediction job is defined; it holds the properties of both training and prediction. The model is then trained with ```train_model_pipeline()```. 

In [None]:
import pandas as pd
import IPython
from openstef.pipeline.train_model import train_model_pipeline
from openstef.pipeline.create_forecast import create_forecast_pipeline
from openstef.data_classes.prediction_job import PredictionJobDataClass



In [None]:
# Define properties of training/prediction. We call this a 'prediction_job'
pj = dict(id=287,
        model='xgb', 
        quantiles=[10,30,50,70,90],
        forecast_type="demand",
        lat=52.0,
        lon=5.0,
        horizon_minutes=47*60,
        resolution_minutes=15,
        name="Example",          
        hyper_params={}, # Note, this should become optional
        feature_names=None, # Note, this should become optional
        default_modelspecs=None,
       )
pj=PredictionJobDataClass(**pj)

# Load input data
input_data = pd.read_csv('data/get_model_input_pid_287.csv', index_col='index', parse_dates=True)

# Split in training and forecasting data
train_data = input_data.iloc[:-200,:] # everything except last 200 rows (~ 48 hours)
to_forecast_data = input_data.iloc[-200:,:] # last 200 rows


In [None]:
# Print the train data. 
# For every timestamp, both the load and the feature data are available. 
train_data.head()

In [None]:
to_forecast_data.head()
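The split above relies on positional slicing, which follows the same rules as Python list slicing. As a minimal sketch, a plain list of timestamps stands in for the DataFrame index here; the start date and the row count of 1000 are made-up values, not taken from the data file. It shows that the two slices are disjoint and together cover every row, and it works out how many hours the held-out rows span at the 15-minute resolution of the prediction job.

```python
from datetime import datetime, timedelta

RESOLUTION_MINUTES = 15   # same value as resolution_minutes in the prediction job
N_FORECAST_ROWS = 200     # same hold-out size as in the split above

# Hypothetical stand-in for the index of input_data: 1000 timestamps, 15 minutes apart.
start = datetime(2021, 1, 1)
timestamps = [start + timedelta(minutes=RESOLUTION_MINUTES * i) for i in range(1000)]

train_part = timestamps[:-N_FORECAST_ROWS]      # everything except the last 200 rows
forecast_part = timestamps[-N_FORECAST_ROWS:]   # only the last 200 rows

# Hours covered by the hold-out rows (row count times resolution).
holdout_hours = N_FORECAST_ROWS * RESOLUTION_MINUTES / 60
# Number of 15-minute steps in the 47-hour horizon of the prediction job.
horizon_steps = 47 * 60 // RESOLUTION_MINUTES

print(len(train_part), len(forecast_part), holdout_hours, horizon_steps)
```

The first forecast timestamp directly follows the last training timestamp, so no row is used for both training and forecasting.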

# Train a model
Train the model with the high-level pipeline ```train_model_pipeline```. The model and the reports on the training process are stored in the locations given by mlflow_tracking_uri and artifact_folder, here ./mlflow_trained_models and ./mlflow_artifacts. 
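The tracking URI and artifact folder used in this notebook are relative paths, so where the output ends up depends on the directory the notebook kernel was started from. The following sketch, using only the standard library, prints the absolute locations before training. `resolve_output_dirs` is a hypothetical helper for illustration and is not part of openstef; the two folder names are the ones passed to the training call in this notebook.

```python
from pathlib import Path


def resolve_output_dirs(*relative_dirs):
    """Map each relative folder name to its absolute path under the current working directory."""
    return {name: Path(name).resolve() for name in relative_dirs}


# Folder names as used for mlflow_tracking_uri and artifact_folder in this notebook.
output_dirs = resolve_output_dirs("./mlflow_trained_models", "./mlflow_artifacts")
for name, absolute in output_dirs.items():
    print(f"{name} -> {absolute}")
```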

In [None]:
train_model_pipeline(
    pj,
    train_data,
    check_old_model_age=False,
    mlflow_tracking_uri="./mlflow_trained_models",
    artifact_folder="./mlflow_artifacts",
    )

You can find the trained model in ./trained_models, along with reports on the training process. Below, the Predictor0.25 and Predictor47.0 plots are shown, as well as the weight plot. The predictor plots show {to be filled in}.
The weight plot shows the importance and weight of every feature.
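The three reports are embedded as iframes, one per HTML file. As a sketch of how such an HTML string can be assembled without repeating the markup, the hypothetical helper `build_report_iframes` below (not part of openstef or IPython) builds one `<iframe>` per report name. The folder layout `<base_dir>/<job id>/<report>.html` is assumed from the paths used in this notebook.

```python
from html import escape


def build_report_iframes(base_dir, job_id, report_names, width=800, height=400):
    """Return one HTML string with an <iframe> per report file."""
    frames = []
    for name in report_names:
        # Escape the path so it is safe inside a quoted HTML attribute.
        src = escape(f"{base_dir}/{job_id}/{name}.html", quote=True)
        frames.append(f'<iframe src="{src}" width="{width}" height="{height}"></iframe>')
    return "\n".join(frames)


html = build_report_iframes(
    "./trained_models", 287, ["Predictor0.25", "Predictor47.0", "weight_plot"]
)
print(html)
```

In a notebook, the resulting string can be passed to `IPython.display.HTML` to render the frames.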

In [None]:
## Inspect local files
IPython.display.HTML(f"<iframe src=./trained_models/{pj['id']}/Predictor0.25.html width=800 height=400></iframe>"
                     f"<iframe src=./trained_models/{pj['id']}/Predictor47.0.html width=800 height=400></iframe>"
                     f"<iframe src=./trained_models/{pj['id']}/weight_plot.html width=800 height=400></iframe>")

In [None]:
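from pathlib import Path
import plotly.io as pio

# This ensures Plotly output works in multiple places:
# plotly_mimetype: VS Code notebook UI
# notebook: "Jupyter: Export to HTML" command in VS Code
# See https://plotly.com/python/renderers/#multiple-renderers
pio.renderers.default = "plotly_mimetype+notebook"

# Collect the HTML reports below the artifact folder used for training.
# Assumption: the reports are stored as .html files somewhere under this folder.
html_files = sorted(Path("./mlflow_artifacts").rglob("*.html"))

# The reports are standalone HTML pages, so render each one with IPython's HTML display.
for html_file in html_files:
    html_content = html_file.read_text(encoding="utf-8")
    IPython.display.display(IPython.display.HTML(html_content))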