# Training and Evaluating Models with the Data-Driven Library

This notebook provides an overview of the tools built into this library for extracting predictions from your trained DDM and for evaluating its performance.

---

We use `hydra` to manage the configuration of our datasets and models. The default configuration is in the `conf/config.yaml` file:

```YAML
defaults:
  - data: house_energy.yaml
  - model: xgboost.yaml
  - simulator: house_energy_simulator.yaml
```

Note that the configuration file points to three additional configuration files, one for each component: the data, the model, and the simulator.

While the configuration file already specifies default values, you can override any element of it using the `overrides` option. For example, we can override the data configuration to use the file `data/cartpole-100K-cts.csv.yaml` instead, and the model configuration to use the file `model/SVR.yaml`.
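
As a rough illustration of what an override list does, the sketch below mimics two common override forms in plain Python: `group=option` changes which file a config group loads, and a dotted `a.b.c=value` sets a nested value. This is not Hydra's implementation. The helper `apply_overrides` and the `max_depth` key are hypothetical, and values are kept as strings for simplicity.

```python
import copy

def apply_overrides(config, overrides):
    """Return a copy of `config` with `key=value` overrides applied.

    Hypothetical helper: dotted keys walk into nested dicts, which only
    approximates how Hydra resolves overrides.
    """
    result = copy.deepcopy(config)
    for item in overrides:
        dotted_key, value = item.split("=", 1)
        *parents, leaf = dotted_key.split(".")
        node = result
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return result

# Stand-in for the choices listed under `defaults` in conf/config.yaml
defaults = {
    "data": "house_energy",
    "model": "xgboost",
    "simulator": "house_energy_simulator",
}
selected = apply_overrides(defaults, ["data=cartpole-100K-cts.csv", "model=SVR"])
print(selected)

# A dotted key reaches into a nested section (max_depth is a made-up key)
model_cfg = {"model": {"name": "xgboost", "build_params": {"max_depth": "3"}}}
tuned = apply_overrides(model_cfg, ["model.build_params.max_depth=6"])
print(tuned)
```

Entries that are not overridden, such as `simulator` here, keep their default values.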


In [None]:
cd ..

In [None]:
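try:
    from hydra import initialize, compose  # Hydra >= 1.1
except ImportError:
    from hydra.experimental import initialize, compose  # older Hydra releases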
from hydra.core.global_hydra import GlobalHydra
from omegaconf import DictConfig, ListConfig, OmegaConf
from model_loader import available_models
import logging
import matplotlib.pyplot as plt
import numpy as np
from rich import print
from rich.logging import RichHandler
import copy
from assessment_metrics_loader import available_metrics

logging.basicConfig(
    level=logging.INFO,
    format="%(message)s",
    datefmt="[%X]",
    handlers=[RichHandler()]
)
logger = logging.getLogger("ddm_notebook")
logger.setLevel(logging.INFO)

In [None]:
GlobalHydra.instance().clear() 
initialize(config_path="../conf", job_name="model_train_validate")
cfg = compose(config_name="config", overrides=["data=house-energy", "model=xgboost"])

## 1. Importing the Dataset

In [None]:
# Extract features from yaml file
input_cols = cfg['data']['inputs']
output_cols = cfg['data']['outputs']
augmented_cols = cfg['data']['augmented_cols']
dataset_path = cfg['data']['path']
iteration_order = cfg['data']['iteration_order']
episode_col = cfg['data']['episode_col']
iteration_col = cfg['data']['iteration_col']
max_rows = cfg['data']['max_rows']
diff_state = cfg['data']['diff_state']
test_perc = cfg['data']['test_perc']

In [None]:
print("DATA STRUCTURE SELECTED:")
print(" - input_cols:", input_cols)
print(" - augmented_cols:", augmented_cols)
print(" - output_cols:", output_cols)

## 2. Model Definition

The `available_models` dictionary provides wrappers for the available models in this repository. We utilize `cfg["model"]` to load and build the model specified in the `model.yaml` file.

### Hyperparameters

Every model has its own hyperparameters, specified through the `cfg["model"]["build_params"]` dictionary, which can be modified directly in the dictionary below or through the `hydra` overrides.
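
The look-up-then-build pattern can be sketched with a plain dictionary of classes. Everything in this sketch is hypothetical (`ToyMeanModel`, `toy_registry`, the `shrinkage` parameter); the wrappers in `available_models` expose a richer interface.

```python
class ToyMeanModel:
    """Hypothetical wrapper that predicts a scaled training mean."""

    def build_model(self, shrinkage=1.0):
        # Hyperparameters arrive as keyword arguments, as with build_params
        self.shrinkage = shrinkage

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)

    def predict(self, X):
        return [self.shrinkage * self.mean_ for _ in X]

# Name -> class registry, playing the role of `available_models`
toy_registry = {"toy_mean": ToyMeanModel}

# Stand-in for the `model` section of the composed configuration
toy_cfg = {"model": {"name": "toy_mean", "build_params": {"shrinkage": 0.5}}}

ModelClass = toy_registry[toy_cfg["model"]["name"]]
toy_model = ModelClass()
toy_model.build_model(**toy_cfg["model"]["build_params"])
toy_model.fit(X=[[0], [1], [2]], y=[2.0, 4.0, 6.0])
toy_preds = toy_model.predict([[3], [4]])
print(toy_preds)
```

Because the hyperparameters are unpacked with `**`, changing an entry in the `build_params` dictionary changes the corresponding keyword argument without touching the training code.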

In [None]:
cfg["model"]["build_params"]

## 3. Train the Model

In [None]:
def train_models(config=cfg):
    """Build, fit, and score the model named in `config["model"]`."""
    Model = available_models[config["model"]["name"]]
    logger.info(f"Model type: {Model}")
    model = Model()
    logger.info(f'Building model with parameters: {config["model"]["build_params"]}')
    model.build_model(
        **config["model"]["build_params"]
    )
    logger.info(f"Loading data from {dataset_path}")
    X, y = model.load_csv(
        input_cols=input_cols,
        output_cols=output_cols,
        augm_cols=list(augmented_cols),
        dataset_path=dataset_path,
        iteration_order=iteration_order,
        episode_col=episode_col,
        iteration_col=iteration_col,
        max_rows=max_rows,
    )
    # The splits are exposed as globals so later cells can reuse them.
    global X_train, y_train, episode_ids_train, X_test, y_test, episode_ids_test
    # Chronological split: the last `test_perc` fraction of rows is held out.
    train_id_end = int(np.floor(X.shape[0] * (1 - test_perc)))
    X_train, y_train = X[:train_id_end], y[:train_id_end]
    episode_ids_train = model.episode_ids[:train_id_end]
    X_test, y_test = X[train_id_end:], y[train_id_end:]
    episode_ids_test = model.episode_ids[train_id_end:]

    logger.info("Fitting model...")
    model.fit(X_train, y_train)
    logger.info("Model trained!")
    y_pred = model.predict(X_test)
    r2_score = available_metrics["r2_score"]
    logger.info(f"R^2 score is {r2_score(y_test, y_pred)} for the test set.")

    return model

In [None]:
model = train_models(cfg)

### Save Model

In [None]:
model.save_model(filename=cfg["model"]["saver"]["filename"])

### Data Structure of Saved Model

In [None]:
logger.info(f"Input_cols:  {model.features}")
logger.info(f"Output_cols: {model.labels}")

## 4. Model Evaluations

We provide three methods for evaluating the errors of our trained models:

1. Model predictive error: using a specified metric (such as R^2 or RMSE) and a test set, we evaluate the metric on the test set.
2. Visualization of per-iteration predictions on a test set.
3. Visualization of sequential predictions on a test set. Sequential prediction refers to feeding the predicted output back into the input over a full episode.
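
The difference between per-iteration and sequential prediction (items 2 and 3) can be seen on a toy one-dimensional system. This is a sketch under assumed dynamics, not the library's `predict_sequentially`: both the true system (`true_step`) and the slightly wrong model (`toy_step`) are made up for illustration.

```python
def true_step(state, action):
    # Assumed ground-truth dynamics of the toy system
    return 0.9 * state + action

def toy_step(state, action):
    # Hypothetical learned model with a small coefficient error
    return 0.85 * state + action

actions = [0.1] * 20
true_states = [1.0]
for a in actions:
    true_states.append(true_step(true_states[-1], a))

# Per-iteration: every prediction starts from the TRUE state at time t
per_iteration = [toy_step(s, a) for s, a in zip(true_states[:-1], actions)]

# Sequential: every prediction starts from the PREVIOUS prediction
sequential = []
state = true_states[0]
for a in actions:
    state = toy_step(state, a)
    sequential.append(state)

targets = true_states[1:]
per_iteration_err = [abs(p - t) for p, t in zip(per_iteration, targets)]
sequential_err = [abs(p - t) for p, t in zip(sequential, targets)]

print(f"per-iteration error, first/last: {per_iteration_err[0]:.3f} / {per_iteration_err[-1]:.3f}")
print(f"sequential error,    first/last: {sequential_err[0]:.3f} / {sequential_err[-1]:.3f}")
```

In this toy setup the per-iteration error stays flat because each step is corrected by the true state, while the sequential error grows as the model's own mistakes are fed back in. A model can therefore score well per iteration and still drift over a full episode.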

### 4.1. Overall Prediction Score
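
For reference, the metrics mentioned above follow standard definitions: $R^2 = 1 - SS_{res}/SS_{tot}$, and RMSE is the square root of the mean squared error. The sketch below computes both by hand for a single output column. The helpers `r2` and `rmse` are hypothetical, not the functions stored in `available_metrics`, and the numbers are made up.

```python
import math

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Undefined (division by zero) when y_true is constant.
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean squared error, in the units of the target."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)

toy_truth = [1.0, 2.0, 3.0, 4.0]
toy_perfect = [1.0, 2.0, 3.0, 4.0]     # perfect predictions
toy_baseline = [2.5, 2.5, 2.5, 2.5]    # always predicts the mean of toy_truth

print("perfect: ", r2(toy_truth, toy_perfect), rmse(toy_truth, toy_perfect))
print("baseline:", r2(toy_truth, toy_baseline), rmse(toy_truth, toy_baseline))
```

A model that always predicts the mean of the targets has an R^2 of zero, so R^2 reads as the improvement over that baseline, whereas RMSE is expressed in the units of the output itself.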

In [None]:
# Select your scoring method: r2_score, root_mean_squared_error, or mean_squared_error
scoring_method = available_metrics["r2_score"]

In [None]:
# Evaluate the model using the test set
per_iteration_eval_table = model.evaluate(X_test, y_test, scoring_method, marginal=True)

In [None]:
if (per_iteration_eval_table["score"] < 0.7).any():
    logger.warning("Per-iteration assessment R^2 is low. Please review your model.")

per_iteration_eval_table

### 4.2. Per-Iteration Predictions

In [None]:
# Use the input columns at time t to predict the output column(s) at time t+1
y_preds = model.predict(X_test)

In [None]:
# Plot all prediction results
label_count = np.shape(y_preds)[1]
for i in range(label_count):
    fig = plt.figure(figsize=(20,5))
    plt.plot(y_test[:,i], "green")
    plt.plot(y_preds[:,i], "brown", linestyle='--')
    plt.title(f"Per-iteration predictions: {model.labels[i]}")
    plt.xlabel("Iteration")
    plt.legend(["Truth", "Prediction"])

In [None]:
# Zoom in on a specific section
iteration_start = -50
iteration_stop = -1

# Define which input column is the action
action_col = "action_command"

# Plot action changes -- zoomed in
fig = plt.figure(figsize=(20,5))
action_idx = model.features.index(action_col)
plt.title(f"Plot [{iteration_start}:{iteration_stop}] actions (ensure action is not stale)")
plt.plot(X_test[iteration_start:iteration_stop,action_idx])
plt.grid()
    
# Plot output changes -- zoomed in
fig = plt.figure(figsize=(20,5))
label_idx = 0
plt.title(f"Plot [{iteration_start}:{iteration_stop}] state preds: {model.labels[label_idx]}")
plt.plot(y_test[iteration_start:iteration_stop,label_idx], "green")
plt.plot(y_preds[iteration_start:iteration_stop,label_idx], "brown", linestyle='--')
plt.xlabel("Iteration")
plt.legend(["Truth", "Prediction"])
plt.grid()

### 4.3. Sequential Predictions

In [None]:
# Feed the predicted output back into the input for a full episode.
preds_sequentially = model.predict_sequentially(X_test, episode_ids=episode_ids_test)

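In [None]:
# Boolean mask selecting the rows of the last test episode
episode_idx = episode_ids_test == episode_ids_test[-1]
episode_idx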

In [None]:
# Plot sequential predictions for the last test episode.
# Do you notice any error propagation (i.e., predictions deviating more from the truth over time)?
episode_idx = episode_ids_test == episode_ids_test[-1]
label_count = np.shape(y_preds)[1]
for i in range(label_count):
    fig = plt.figure(figsize=(20,5))
    plt.plot(y_test[episode_idx,i], "green")
    plt.plot(preds_sequentially[episode_idx,i], "brown", linestyle='--')
    plt.title(f"Sequential predictions: {model.labels[i]}")
    plt.xlabel("Iteration")
    plt.legend(["Truth", "Prediction"])

## 5. Comparing Model Evaluations

To compare several models, use this section to store each model's scores between runs.

1. Choose a value for the `model_name` tag and run this section.
2. Change the configuration in `config.yaml` (located in the `conf` folder).
3. Rerun Sections 1-4, starting from the cell that composes the configuration, up to this section.
4. Set a new value for the `model_name` tag and run this section again.
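
The bookkeeping behind these steps amounts to keying each run's scores by a model tag and then lining the scores up by output variable. A plain-Python sketch of that idea, with made-up scores and hypothetical names (`toy_runs`, `side_by_side`), is shown here; the cells that follow do the equivalent with the tables returned by `model.evaluate`.

```python
# Hypothetical per-output scores collected after two separate runs
toy_runs = {}
toy_runs["xgb"] = {"output_a": 0.93, "output_b": 0.81}
toy_runs["svr"] = {"output_a": 0.88}  # this run did not score output_b

def side_by_side(runs):
    """Build {output: {model_tag: score}}, using None for missing scores."""
    outputs = sorted({name for scores in runs.values() for name in scores})
    return {
        out: {tag: scores.get(out) for tag, scores in runs.items()}
        for out in outputs
    }

comparison = side_by_side(toy_runs)
for output_name, scores in comparison.items():
    print(output_name, scores)
```

The `None` entries play the role of the missing cells that an outer merge produces when one model has no score for a given output.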

In [None]:
# Select a tag that identifies the current model in the comparison
model_name = "xgb"

In [None]:
# per-iteration score
model_per_it_scores = copy.deepcopy(per_iteration_eval_table)

In [None]:
# initialize models dictionary if it doesn't exist already
if 'models_dict' not in locals():
    models_dict = dict()

In [None]:
# append tables to model using selected model name as key
models_dict[model_name] = (model_per_it_scores,)

In [None]:
models_dict

In [None]:
# Prefix score columns with the model name (skipped if already prefixed).
# The loop variable is called `name` so it does not overwrite `model_name`.
for name, score_tables in models_dict.items():
    for score_table in score_tables:
        for col_name in score_table.columns:
            if "score" in col_name and name not in col_name:
                score_table.rename(columns={col_name: f"{name}_{col_name}"}, inplace=True)

In [None]:
# Merge the score tables of all models into a single table
all_scores = None
for score_tables in models_dict.values():
    for score_table in score_tables:
        if all_scores is None:
            all_scores = score_table
        else:
            all_scores = all_scores.merge(score_table, how="outer")

all_scores