# DRL usage example

In this notebook example, we will use Stable Baselines 3 to train and load a deep reinforcement learning agent. 

Note that *Sinergym* is agnostic to the DRL algorithm: although it provides custom callbacks and close integration with SB3, it can be used with any DRL library compatible with the Gymnasium interface.
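To make the "agnostic" point concrete, here is a minimal sketch of an interaction loop that relies only on the Gymnasium-style `reset()` and `step()` calls. `ToyEnv` and `run_episode` are hypothetical names introduced for illustration; `ToyEnv` is a stand-in, not a *Sinergym* environment, and the policy is simply a random function.

```python
import random


class ToyEnv:
    """Hypothetical stand-in exposing the Gymnasium reset/step signature."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self, seed=None, options=None):
        self.t = 0
        return [0.0], {}

    def step(self, action):
        self.t += 1
        obs = [float(self.t)]
        reward = -abs(action)                # toy reward: penalize large actions
        terminated = False                   # this toy task has no terminal state
        truncated = self.t >= self.horizon   # time limit reached
        return obs, reward, terminated, truncated, {}


def run_episode(env, policy):
    """Run one episode using only reset() and step()."""
    obs, info = env.reset()
    total, steps = 0.0, 0
    terminated = truncated = False
    while not (terminated or truncated):
        action = policy(obs)
        obs, reward, terminated, truncated, info = env.step(action)
        total += reward
        steps += 1
    return total, steps


rng = random.Random(0)
total_reward, n_steps = run_episode(
    ToyEnv(horizon=5), lambda obs: rng.uniform(-1.0, 1.0))
print(n_steps, total_reward)
```

Any library whose agent can map an observation to an action could be plugged in as `policy` here.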

**Note: For more up-to-date examples, see the [Deep Reinforcement Learning section](https://ugr-sail.github.io/sinergym/compilation/main/pages/deep-reinforcement-learning.html) in the Sinergym documentation. It references some available scripts in Sinergym and explains how they work.**

## Training a model

We will use the `train_agent.py` script provided with *Sinergym*. This script leverages all the capabilities of *Sinergym* to work with deep reinforcement learning algorithms, easily configuring the environment and agent parameters by using a JSON configuration file.

For more details on how to run `train_agent.py`, please refer to [Train a model](https://ugr-sail.github.io/sinergym/compilation/main/pages/deep-reinforcement-learning.html#model-training).

In [6]:
import sys
from datetime import datetime

import gymnasium as gym
import numpy as np
import wandb
from stable_baselines3 import *
from stable_baselines3.common.callbacks import CallbackList
from stable_baselines3.common.logger import HumanOutputFormat
from stable_baselines3.common.logger import Logger as SB3Logger

import sinergym
from sinergym.utils.callbacks import *
from sinergym.utils.constants import *
from sinergym.utils.logger import WandBOutputFormat
from sinergym.utils.rewards import *
from sinergym.utils.wrappers import *

First, let's define some configurations...

In [None]:
# Environment ID
environment = 'Eplus-5zone-mixed-continuous-stochastic-v1'

# Training episodes
episodes = 5

# Name of the experiment
experiment_date = datetime.today().strftime('%Y-%m-%d_%H:%M')
experiment_name = 'SB3_PPO-' + environment + \
    '-episodes-' + str(episodes)
experiment_name += '_' + experiment_date
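The same naming scheme can be wrapped in a small helper so that training and evaluation names are built consistently. This is only a sketch: `build_experiment_name` is a hypothetical helper, not part of *Sinergym*, and the timestamp is passed in explicitly so the result is reproducible.

```python
from datetime import datetime


def build_experiment_name(algorithm, environment, episodes, when):
    """Compose '<algorithm>-<environment>-episodes-<n>_<timestamp>'."""
    stamp = when.strftime('%Y-%m-%d_%H:%M')
    return f'{algorithm}-{environment}-episodes-{episodes}_{stamp}'


name = build_experiment_name(
    'SB3_PPO',
    'Eplus-5zone-mixed-continuous-stochastic-v1',
    5,
    datetime(2024, 9, 10, 7, 57))
print(name)
```

Note that the `%H:%M` format places a colon in the name, which ends up in directory names and in any remote identifier derived from it.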

Now, we are ready to create the Gymnasium environment.

We will use the previously defined environment name. Just remember that you can [change the default environment configuration](https://ugr-sail.github.io/sinergym/compilation/main/pages/notebooks/change_environment.html#Changing-an-environment-registered-in-Sinergym). We will also create an evaluation environment (`eval_env`).

If desired, we can replace the environment name with the experiment name.

In [8]:
env = gym.make(environment, env_name=experiment_name)
eval_env = gym.make(environment, env_name=experiment_name+'_EVALUATION')

[ENVIRONMENT] (INFO) : Creating Gymnasium environment.
[ENVIRONMENT] (INFO) : Name: SB3_PPO-Eplus-5zone-mixed-continuous-stochastic-v1-episodes-5_2024-09-10_07:57
[MODELING] (INFO) : Experiment working directory created.
[MODELING] (INFO) : Working directory: /workspaces/sinergym/examples/Eplus-env-SB3_PPO-Eplus-5zone-mixed-continuous-stochastic-v1-episodes-5_2024-09-10_07:57-res1
[MODELING] (INFO) : Model Config is correct.
[MODELING] (INFO) : Update building model Output:Variable with variable names.
[MODELING] (INFO) : Update building model Output:Meter with meter names.
[MODELING] (INFO) : Runperiod established.
[MODELING] (INFO) : Episode length (seconds): 31536000.0
[MODELING] (INFO) : timestep size (seconds): 900.0
[MODELING] (INFO) : timesteps per episode: 35040
[REWARD] (INFO) : Reward function initialized.
[ENVIRONMENT] (INFO)

We can also add some wrappers to the environment. We will use an action and observation normalization wrapper and the *Sinergym* logger.
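As an illustration of what action normalization implies, the agent emits actions in `[-1, 1]` and these must be mapped back to the real action range before reaching the simulator. The sketch below uses the usual linear rescaling; it is an assumption about the general technique, not a copy of the `NormalizeAction` implementation. The bounds are hypothetical values chosen for the example.

```python
def denormalize_action(action, low, high):
    """Linearly map each component from [-1, 1] to [low_i, high_i]."""
    return [lo + (a + 1.0) * 0.5 * (hi - lo)
            for a, lo, hi in zip(action, low, high)]


# Hypothetical setpoint bounds (degrees Celsius), for illustration only.
low = [15.0, 22.5]
high = [22.5, 30.0]

print(denormalize_action([-1.0, 1.0], low, high))  # [15.0, 30.0]
print(denormalize_action([0.0, 0.0], low, high))   # [18.75, 26.25]
```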

Normalization is highly recommended for DRL algorithms. The logger monitors and records environment interactions, and the collected data is then dumped into CSV files and/or Weights and Biases.
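To show why normalization helps, here is a minimal sketch of running mean/variance normalization, which brings observations with very different scales (for example, temperatures and power readings) to comparable magnitudes. `RunningNormalizer` is a hypothetical class written for this example using Welford's online update; it is not the `NormalizeObservation` implementation, and the sample values are invented.

```python
import math


class RunningNormalizer:
    """Track per-feature mean and variance online and z-score observations."""

    def __init__(self, size, epsilon=1e-8):
        self.count = 0
        self.mean = [0.0] * size
        self.m2 = [0.0] * size   # running sum of squared deviations
        self.epsilon = epsilon

    def update(self, obs):
        # Welford's online update, one feature at a time.
        self.count += 1
        for i, x in enumerate(obs):
            delta = x - self.mean[i]
            self.mean[i] += delta / self.count
            self.m2[i] += delta * (x - self.mean[i])

    @property
    def var(self):
        if self.count == 0:
            return [1.0] * len(self.m2)
        return [m2 / self.count for m2 in self.m2]

    def normalize(self, obs):
        return [(x - m) / math.sqrt(v + self.epsilon)
                for x, m, v in zip(obs, self.mean, self.var)]


normalizer = RunningNormalizer(size=2)
for sample in ([20.0, 1000.0], [22.0, 3000.0], [24.0, 5000.0]):
    normalizer.update(sample)

print(normalizer.mean)
print(normalizer.normalize([24.0, 5000.0]))
```

After normalization, both features land on a similar scale even though the raw values differ by orders of magnitude.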

In [None]:
env = NormalizeObservation(env)
env = NormalizeAction(env)
env = LoggerWrapper(env)
env = CSVLogger(env)

# Uncomment the following lines to log to WandB (remember to set the API key as an environment variable)
# env = WandBLogger(env,
#                  entity='test-project',
#                  project_name='sail_ugr',
#                  run_name=experiment_name,
#                  group='Train_example',
#                  tags=['DRL', 'PPO', '5zone', 'continuous', 'stochastic', 'v1'],
#                  save_code = True,
#                  dump_frequency = 1000,
#                  artifact_save = False)

eval_env = NormalizeObservation(eval_env)
eval_env = NormalizeAction(eval_env)
eval_env = LoggerWrapper(eval_env)
eval_env = CSVLogger(eval_env)

# The evaluation env does not need to be wrapped with WandBLogger: its results are added to the same WandB session as the training env by the Sinergym LoggerEvalCallback
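Since WandB logging depends on the API key being available, a small guard can decide whether to enable it. This is a sketch: `wandb_enabled` is a hypothetical helper, and it assumes the key is provided through the `WANDB_API_KEY` environment variable read by the `wandb` client.

```python
import os


def wandb_enabled(environ=os.environ):
    """Return True when a non-empty WANDB_API_KEY is present."""
    return bool(environ.get('WANDB_API_KEY'))


print(wandb_enabled({'WANDB_API_KEY': 'dummy-key'}))  # True
print(wandb_enabled({}))                              # False
```

The result could be used to wrap the environment with `WandBLogger` only when the key is set.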

[WRAPPER NormalizeObservation] (INFO) : Wrapper initialized.
[WRAPPER NormalizeAction] (INFO) : New normalized action Space: Box(-1.0, 1.0, (2,), float32)
[WRAPPER NormalizeAction] (INFO) : Wrapper initialized
[WRAPPER LoggerWrapper] (INFO) : Wrapper initialized.
[WRAPPER CSVLogger] (INFO) : Wrapper initialized.
[WRAPPER NormalizeObservation] (INFO) : Wrapper initialized.
[WRAPPER NormalizeAction] (INFO) : New normalized action Space: Box(-1.0, 1.0, (2,), float32)
[WRAPPER NormalizeAction] (INFO) : Wrapper initialized
[WRAPPER LoggerWrapper] (INFO) : Wrapper initialized.
[WRAPPER CSVLogger] (INFO) : Wrapper initialized.


At this point, the environment is set up and ready to use. We will create a sample PPO model.

In [10]:
# In this case, all the hyperparameters are the default ones
model = PPO('MlpPolicy', env, verbose=1)

Using cpu device
Wrapping the env with a `Monitor` wrapper
Wrapping the env in a DummyVecEnv.


If `WandBLogger` is active, we can log all the hyperparameters as follows:

In [11]:
# Register hyperparameters in wandb if it is wrapped
if is_wrapped(env, WandBLogger):
    experiment_params = {
        'sinergym-version': sinergym.__version__,
        'python-version': sys.version
    }
    # experiment_params.update(conf)
    env.get_wrapper_attr('wandb_run').config.update(experiment_params)

Evaluations run periodically, each over a fixed number of episodes, to determine whether the current version of the model improves on the best one obtained so far.

Where the generated output is stored depends on the logger wrapper configuration. We will use `LoggerEvalCallback` to print and save the current best model during training, saving data to local CSV files and, if enabled, to WandB.
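The idea of periodic evaluation with best-model tracking can be sketched in a few lines. This is a simplified illustration, not the `LoggerEvalCallback` implementation: `evaluation_episodes` and `BestModelTracker` are hypothetical names, the scheduling rule (evaluate after every `eval_freq_episodes` training episodes) is an assumption, and the reward values are invented.

```python
def evaluation_episodes(total_episodes, eval_freq_episodes):
    """Training episodes after which an evaluation would be triggered."""
    return [ep for ep in range(1, total_episodes + 1)
            if ep % eval_freq_episodes == 0]


class BestModelTracker:
    """Remember the best mean evaluation reward seen so far."""

    def __init__(self):
        self.best_mean_reward = float('-inf')
        self.best_episode = None

    def update(self, episode, eval_rewards):
        mean_reward = sum(eval_rewards) / len(eval_rewards)
        if mean_reward > self.best_mean_reward:
            self.best_mean_reward = mean_reward
            self.best_episode = episode
            return True    # the caller would save the model here
        return False


schedule = evaluation_episodes(total_episodes=5, eval_freq_episodes=2)
print(schedule)  # [2, 4]

tracker = BestModelTracker()
# Hypothetical mean rewards from one evaluation episode each.
print(tracker.update(2, [-1.2]))  # True: first evaluation is the best so far
print(tracker.update(4, [-1.5]))  # False: worse than the stored best
```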

In [None]:
callbacks = []

# Set up Evaluation logging and saving best model
eval_callback = LoggerEvalCallback(
    eval_env=eval_env,
    train_env=env,
    n_eval_episodes=1,
    eval_freq_episodes=2,
    deterministic=True)

callbacks.append(eval_callback)
callback = CallbackList(callbacks)

To add the SB3 logging values in the same WandB session, we need to create a compatible WandB output format (which calls the WandB log method during training).

*Sinergym* provides `WandBOutputFormat` for this purpose:

In [None]:
# wandb logger and setting in SB3
if is_wrapped(env, WandBLogger):
    logger = SB3Logger(
        folder=None,
        output_formats=[
            HumanOutputFormat(
                sys.stdout,
                max_length=120),
            WandBOutputFormat()])
    model.set_logger(logger)

This is the total number of time steps for training:

In [14]:
timesteps = episodes * (env.get_wrapper_attr('timestep_per_episode') - 1)
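As a quick check of the arithmetic, the value can be reproduced from the figures reported in the environment log (episode length of 31536000 s and timestep size of 900 s). The variable names below are chosen for this example only.

```python
episode_length_s = 31536000.0   # "Episode length (seconds)" from the log
timestep_size_s = 900.0         # "timestep size (seconds)" from the log
episodes = 5

timesteps_per_episode = int(episode_length_s / timestep_size_s)
total_timesteps = episodes * (timesteps_per_episode - 1)

print(timesteps_per_episode, total_timesteps)  # 35040 175195
```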

Now, it is time to train the model with the previously defined callback. This may take a few minutes, depending on your computer.

In [15]:
model.learn(
    total_timesteps=timesteps,
    callback=callback,
    log_interval=100)

#----------------------------------------------------------------------------------------------#
[ENVIRONMENT] (INFO) : Starting a new episode.
[ENVIRONMENT] (INFO) : Episode 1: SB3_PPO-Eplus-5zone-mixed-continuous-stochastic-v1-episodes-5_2024-09-10_07:57
#----------------------------------------------------------------------------------------------#
[MODELING] (INFO) : Episode directory created.
[MODELING] (INFO) : Weather file USA_NY_New.York-J.F.Kennedy.Intl.AP.744860_TMY3.epw used.
[MODELING] (INFO) : Adapting weather to building model.
[MODELING] (INFO) : Weather noise applied in columns: ['drybulb']
[ENVIRONMENT] (INFO) : Saving episode output path.
[ENVIRONMENT] (INFO) : Episode 1 started.
[SIMULATOR] (INFO) : handlers initialized.
[SIMULATOR] (INFO) : handlers are ready.
[SIMULATOR] (INFO) : System is ready.
[WRAPPER NormalizeObservation] 

[WRAPPER CSVLogger] (INFO) : Environment closed, data updated in monitor and progress.csv.
Simulation Progress [Episode 1]: 100%|██████████| 100/100 [00:21<00:00,  4.75%/s, 100% completed]
[ENVIRONMENT] (INFO) : Environment closed. [SB3_PPO-Eplus-5zone-mixed-continuous-stochastic-v1-episodes-5_2024-09-10_07:57_EVALUATION]
[WRAPPER NormalizeObservation] (INFO) : Saving normalization calibration data.
#----------------------------------------------------------------------------------------------#
[ENVIRONMENT] (INFO) : Starting a new episode.
[ENVIRONMENT] (INFO) : Episode 3: SB3_PPO-Eplus-5zone-mixed-continuous-stochastic-v1-episodes-5_2024-09-10_07:57
#----------------------------------------------------------------------------------------------#
[MODELING] (INFO) : Episode directory created.
[MODELING] (INFO) : Weather file USA_NY_New.York-J.F.Kennedy.Intl.AP.744860_TMY3.epw used.
[MODELING] (I

<stable_baselines3.ppo.ppo.PPO at 0x7f7ec7eb7740>

Once the training process has finished, the last version of the model is saved. The `mean` and `var` values used for normalization are also stored in the *Sinergym* training output folder so that they can be reused for model evaluation.


Visit the [NormalizeObservation documentation](https://ugr-sail.github.io/sinergym/compilation/main/pages/wrappers.html#normalizeobservation) for additional information.

In [16]:
model.save(env.get_wrapper_attr('workspace_path') + '/model')

Finally, remember to close the environment.

If WandB is active, this will save both artifacts and output data remotely.

In [17]:
env.close()

[WRAPPER CSVLogger] (INFO) : Environment closed, data updated in monitor and progress.csv.
Simulation Progress [Episode 6]:   4%|▍         | 4/100 [00:01<00:28,  3.39%/s, 4% completed]
[ENVIRONMENT] (INFO) : Environment closed. [SB3_PPO-Eplus-5zone-mixed-continuous-stochastic-v1-episodes-5_2024-09-10_07:57]
[WRAPPER NormalizeObservation] (INFO) : Saving normalization calibration data.


Although results are stored locally, you can also follow the execution of any experiments in WandB:

- Once in WandB, you will see the corresponding projects created:

![wandb_projects1](https://github.com/ugr-sail/sinergym/blob/main/images/wandb_projects1.png?raw=true)

- The training hyperparameters:

![wandb_training_hyperparameters](https://github.com/ugr-sail/sinergym/blob/main/images/wandb_training_hyperparameters.png?raw=true)

- Registered artifacts (if evaluation is enabled, the best model obtained is also registered):

![wandb_training_artifact](https://github.com/ugr-sail/sinergym/blob/main/images/wandb_training_artifact.png?raw=true)

- Real-time visualization of metrics:

![wandb_training_charts](https://github.com/ugr-sail/sinergym/blob/main/images/wandb_training_charts.png?raw=true)

## Loading and evaluating a trained model

We will use the `load_agent.py` script for loading a trained model. For more details on how to run `load_agent.py`, please refer to [Load a trained model](https://ugr-sail.github.io/sinergym/compilation/main/pages/deep-reinforcement-learning.html#model-loading).

First, we define the *Sinergym* environment to be used for testing the loaded agent, and the name of the evaluation experiment:

In [None]:
# Environment ID
environment = 'Eplus-5zone-mixed-continuous-stochastic-v1'

# Episodes
episodes = 5

# Evaluation name
evaluation_date = datetime.today().strftime('%Y-%m-%d_%H:%M')
evaluation_name = 'SB3_PPO-EVAL-' + environment + \
    '-episodes-' + str(episodes)
evaluation_name += '_' + evaluation_date

We will now create the Gymnasium environment. We can use the evaluation experiment name to rename the environment.

**It is essential to wrap the environment with the same wrappers used for training if the action or observation spaces were modified**.

If you are loading a pre-trained model and using the observation space normalization wrapper, you should use the means and variances calibrated during the training process for a fair evaluation. The `mean` and `var` values are saved automatically as .txt files in the *Sinergym* training output directory. You can pass them as a list or `numpy` array, or set the .txt path directly in the constructor.

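It is also important to deactivate the calibration update during evaluations. During training, `LoggerEvalCallback` does this automatically for the evaluation environment; when building the environment by hand, as in the next cell, pass `automatic_update=False`.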
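The effect of reusing frozen calibration values can be sketched as follows: the statistics are written to text files after training, read back, and applied without further updates. The file names (`mean.txt`, `var.txt`), the one-value-per-line layout, the helper names, and the numbers are all assumptions made for this example; they are not guaranteed to match the files *Sinergym* writes.

```python
import math
import os
import tempfile


def save_values(path, values):
    """Write one float per line (layout assumed for this example)."""
    with open(path, 'w') as f:
        f.write('\n'.join(repr(v) for v in values))


def load_values(path):
    """Read back the floats written by save_values."""
    with open(path) as f:
        return [float(line) for line in f if line.strip()]


def normalize(obs, mean, var, epsilon=1e-8):
    """Apply fixed statistics; nothing is updated during evaluation."""
    return [(x - m) / math.sqrt(v + epsilon)
            for x, m, v in zip(obs, mean, var)]


with tempfile.TemporaryDirectory() as tmp:
    mean_path = os.path.join(tmp, 'mean.txt')   # hypothetical file name
    var_path = os.path.join(tmp, 'var.txt')     # hypothetical file name
    save_values(mean_path, [22.0, 3000.0])      # invented training statistics
    save_values(var_path, [4.0, 1.0e6])
    mean = load_values(mean_path)
    var = load_values(var_path)

print(normalize([24.0, 2000.0], mean, var))
```

Because `mean` and `var` stay fixed, two evaluation runs with the same observations produce the same normalized inputs for the model.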

In [None]:
evaluation_env = gym.make(environment, env_name=evaluation_name)
evaluation_env = NormalizeObservation(evaluation_env, mean=env.get_wrapper_attr(
    "mean"), var=env.get_wrapper_attr("var"), automatic_update=False)
evaluation_env = NormalizeAction(evaluation_env)
evaluation_env = LoggerWrapper(evaluation_env)
evaluation_env = CSVLogger(evaluation_env)
# If you want to log the evaluation interactions to WandB, use WandBLogger too

[ENVIRONMENT] (INFO) : Creating Gymnasium environment.
[ENVIRONMENT] (INFO) : Name: SB3_PPO-EVAL-Eplus-5zone-mixed-continuous-stochastic-v1-episodes-5_2024-09-10_08:01
[MODELING] (INFO) : Experiment working directory created.
[MODELING] (INFO) : Working directory: /workspaces/sinergym/examples/Eplus-env-SB3_PPO-EVAL-Eplus-5zone-mixed-continuous-stochastic-v1-episodes-5_2024-09-10_08:01-res1
[MODELING] (INFO) : Model Config is correct.
[MODELING] (INFO) : Update building model Output:Variable with variable names.
[MODELING] (INFO) : Update building model Output:Meter with meter names.
[MODELING] (INFO) : Runperiod established.
[MODELING] (INFO) : Episode length (seconds): 31536000.0
[MODELING] (INFO) : timestep size (seconds): 900.0
[MODELING] (INFO) : timesteps per episode: 35040
[REWARD] (INFO) : Reward function initialized.
[ENVIRONME

A Stable Baselines 3 PPO model can be loaded from a local file or from a remote artifact stored in WandB. The cell below downloads the artifact from WandB and then loads the model from the downloaded file.

In [None]:
# get wandb artifact path for loading the model
if is_wrapped(evaluation_env, WandBLogger):
    wandb_run = evaluation_env.get_wrapper_attr('wandb_run')
else:
    wandb_run = wandb.init(entity='sail_ugr')

load_artifact_entity = 'sail_ugr'
load_artifact_project = 'sinergym'
load_artifact_name = experiment_name
load_artifact_tag = 'latest'
load_artifact_model_path = 'Sinergym_output/evaluation/best_model.zip'
wandb_path = load_artifact_entity + '/' + load_artifact_project + \
    '/' + load_artifact_name + ':' + load_artifact_tag

# Download artifact
artifact = wandb_run.use_artifact(wandb_path)
artifact.get_path(load_artifact_model_path).download('.')

# Set model path to local wandb file downloaded
model_path = './' + load_artifact_model_path
model = PPO.load(model_path)

[34m[1mwandb[0m: Currently logged in as: [33malex_ugr[0m ([33msail_ugr[0m). Use [1m`wandb login --relogin`[0m to force relogin


CommError: invalid alias. aliases must be specified as collectionName:alias (Error 400: Bad Request)
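One plausible cause of the `CommError` above is that the experiment name embeds a `%H:%M` timestamp, so the artifact path ends up with two colons and the `name:alias` form can no longer be parsed. This is an assumption, not a confirmed diagnosis. The sketch below builds the path with a sanitized name; `sanitize_artifact_name` and `artifact_path` are hypothetical helpers, and the character whitelist (letters, digits, `.`, `_`, `-`) reflects the usual WandB artifact naming restriction as understood here.

```python
import re


def sanitize_artifact_name(name):
    """Replace characters outside [A-Za-z0-9._-] with a dash."""
    return re.sub(r'[^A-Za-z0-9._-]', '-', name)


def artifact_path(entity, project, name, alias='latest'):
    """Build 'entity/project/name:alias' with exactly one colon."""
    return f'{entity}/{project}/{sanitize_artifact_name(name)}:{alias}'


raw_name = ('SB3_PPO-Eplus-5zone-mixed-continuous-stochastic-v1'
            '-episodes-5_2024-09-10_07:57')
wandb_path = artifact_path('sail_ugr', 'sinergym', raw_name)
print(wandb_path)
```

The sanitized name only helps if it matches the name under which the artifact was actually registered, so it is worth checking the artifact list in the WandB project first.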

It should be noted that the model can be loaded from an artifact belonging to a different entity or project, provided that it is accessible. This is independent of the entity or project that is being used to register the evaluation of the loaded model.

The next step is to use the model to predict actions and interact with the environment.

In [None]:
for i in range(episodes):
    obs, info = evaluation_env.reset()
    rewards = []
    truncated = terminated = False
    current_month = 0
    while not (terminated or truncated):
        a, _ = model.predict(obs)
        obs, reward, terminated, truncated, info = evaluation_env.step(a)
        rewards.append(reward)
        if info['month'] != current_month:
            current_month = info['month']
            print(info['month'], sum(rewards))
    print(
        'Episode ',
        i,
        'Mean reward: ',
        np.mean(rewards),
        'Cumulative reward: ',
        sum(rewards))
evaluation_env.close()
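The loop above prints the running cumulative reward each time the month changes. If a per-month breakdown is preferred, the rewards can be grouped by the `month` value reported in `info`. This is a sketch only: `monthly_totals` is a hypothetical helper and the sample data is invented.

```python
from collections import defaultdict


def monthly_totals(months, rewards):
    """Sum rewards separately for each month label."""
    totals = defaultdict(float)
    for month, reward in zip(months, rewards):
        totals[month] += reward
    return dict(totals)


# Invented per-step data: the month reported at each step and its reward.
months = [1, 1, 2, 2, 2]
rewards = [-1.0, -2.0, -0.5, -0.5, -1.0]

print(monthly_totals(months, rewards))  # {1: -3.0, 2: -2.0}
```

Inside the evaluation loop, `months` would be filled with `info['month']` at every step alongside `rewards`.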

#----------------------------------------------------------------------------------------------#
[ENVIRONMENT] (INFO) : Starting a new episode... [SB3_PPO-Eplus-5zone-mixed-continuous-stochastic-v1-episodes-5_2024-08-07_15:40] [Episode 7]
#----------------------------------------------------------------------------------------------#
[MODELING] (INFO) : Episode directory created [/workspaces/sinergym/examples/Eplus-env-SB3_PPO-Eplus-5zone-mixed-continuous-stochastic-v1-episodes-5_2024-08-07_15:40-res1/Eplus-env-sub_run7]
[MODELING] (INFO) : Weather file USA_NY_New.York-J.F.Kennedy.Intl.AP.744860_TMY3.epw used.
[MODELING] (INFO) : Adapting weather to building model. [USA_NY_New.York-J.F.Kennedy.Intl.AP.744860_TMY3.epw]
[ENVIRONMENT] (INFO) : Saving episode output path... [/workspaces/sinergym/examples/Eplus-env-SB3_PPO-Eplus-5zone-mixed-continuous-stochastic-v1-episodes-5_2024-08-07_15:40-res1/Eplus-env-sub_run7/output]

[SIMULATOR] (INFO) : Running EnergyPlus with args: ['-w', '/workspaces/sinergym/examples/Eplus-env-SB3_PPO-Eplus-5zone-mixed-continuous-stochastic-v1-episodes-5_2024-08-07_15:40-res1/Eplus-env-sub_run7/USA_NY_New.York-J.F.Kennedy.Intl.AP.744860_TMY3_Random_1.0_0.0_0.001.epw', '-d', '/workspaces/sinergym/examples/Eplus-env-SB3_PPO-Eplus-5zone-mixed-continuous-stochastic-v1-episodes-5_2024-08-07_15:40-res1/Eplus-env-sub_run7/output', '/workspaces/sinergym/examples/Eplus-env-SB3_PPO-Eplus-5zone-mixed-continuous-stochastic-v1-episodes-5_2024-08-07_15:40-res1/Eplus-env-sub_run7/5ZoneAutoDXVAV.epJSON']
[ENVIRONMENT] (INFO) : Episode 7 started.
[WRAPPER NormalizeObservation] (INFO) : Saving normalization calibration data... [SB3_PPO-Eplus-5zone-mixed-continuous-stochastic-v1-episodes-5_2024-08-07_15:40]
[WRAPPER NormalizeObservation] (INFO) : Saving normalization calibration data... [SB3_PPO-Eplus-5zone-mixed-continuous-stochastic-v1-episodes-5_2024

[SIMULATOR] (INFO) : Running EnergyPlus with args: ['-w', '/workspaces/sinergym/examples/Eplus-env-SB3_PPO-Eplus-5zone-mixed-continuous-stochastic-v1-episodes-5_2024-08-07_15:40-res1/Eplus-env-sub_run8/USA_NY_New.York-J.F.Kennedy.Intl.AP.744860_TMY3_Random_1.0_0.0_0.001.epw', '-d', '/workspaces/sinergym/examples/Eplus-env-SB3_PPO-Eplus-5zone-mixed-continuous-stochastic-v1-episodes-5_2024-08-07_15:40-res1/Eplus-env-sub_run8/output', '/workspaces/sinergym/examples/Eplus-env-SB3_PPO-Eplus-5zone-mixed-continuous-stochastic-v1-episodes-5_2024-08-07_15:40-res1/Eplus-env-sub_run8/5ZoneAutoDXVAV.epJSON']
[ENVIRONMENT] (INFO) : Episode 8 started.
[SIMULATOR] (INFO) : handlers are ready.
[SIMULATOR] (INFO) : System is ready.
[WRAPPER NormalizeObservation] (INFO) : Saving normalization calibration data... [SB3_PPO-Eplus-5zone-mixed-continuous-stochastic-v1-episodes-5_2024-08-07_15:40]
[WRAPPER NormalizeObservation] (INFO) : Savin

[SIMULATOR] (INFO) : Running EnergyPlus with args: ['-w', '/workspaces/sinergym/examples/Eplus-env-SB3_PPO-Eplus-5zone-mixed-continuous-stochastic-v1-episodes-5_2024-08-07_15:40-res1/Eplus-env-sub_run9/USA_NY_New.York-J.F.Kennedy.Intl.AP.744860_TMY3_Random_1.0_0.0_0.001.epw', '-d', '/workspaces/sinergym/examples/Eplus-env-SB3_PPO-Eplus-5zone-mixed-continuous-stochastic-v1-episodes-5_2024-08-07_15:40-res1/Eplus-env-sub_run9/output', '/workspaces/sinergym/examples/Eplus-env-SB3_PPO-Eplus-5zone-mixed-continuous-stochastic-v1-episodes-5_2024-08-07_15:40-res1/Eplus-env-sub_run9/5ZoneAutoDXVAV.epJSON']
[ENVIRONMENT] (INFO) : Episode 9 started.
[SIMULATOR] (INFO) : handlers are ready.
[SIMULATOR] (INFO) : System is ready.
[WRAPPER NormalizeObservation] (INFO) : Saving normalization calibration data... [SB3_PPO-Eplus-5zone-mixed-continuous-stochastic-v1-episodes-5_2024-08-07_15:40]
[WRAPPER NormalizeObservation] (INFO) : Savin

[SIMULATOR] (INFO) : handlers are ready.
[SIMULATOR] (INFO) : System is ready.
[WRAPPER NormalizeObservation] (INFO) : Saving normalization calibration data... [SB3_PPO-Eplus-5zone-mixed-continuous-stochastic-v1-episodes-5_2024-08-07_15:40]
[WRAPPER NormalizeObservation] (INFO) : Saving normalization calibration data... [SB3_PPO-Eplus-5zone-mixed-continuous-stochastic-v1-episodes-5_2024-08-07_15:40]
1 -1.7162450174060817
2 -2515.235832234475
3 -4835.124638771841
4 -7407.433874239768
5 -10197.88881457567
6 -13209.917010355572

[SIMULATOR] (INFO) : handlers are ready.
[SIMULATOR] (INFO) : System is ready.
[WRAPPER NormalizeObservation] (INFO) : Saving normalization calibration data... [SB3_PPO-Eplus-5zone-mixed-continuous-stochastic-v1-episodes-5_2024-08-07_15:40]
[WRAPPER NormalizeObservation] (INFO) : Saving normalization calibration data... [SB3_PPO-Eplus-5zone-mixed-continuous-stochastic-v1-episodes-5_2024-08-07_15:40]
1 -1.9855085492179452
2 -2525.9765795852795
3 -4836.447992908011
4 -7415.106488666047
5 -10199.115705433655
6 -13226.748838894438

The results obtained by the loaded model are stored locally, and can also be monitored through WandB if `WandBLogger` was used.

- When checking out the WandB project list, you will see that the `sinergym_evaluations` project now includes a new run:

![wandb_project2](https://github.com/ugr-sail/sinergym/blob/main/images/wandb_project2.png?raw=true)

- This includes the set of tracked hyperparameters, and the previous training artifact used to load the model:

![wandb_evaluating_hyperparameters](https://github.com/ugr-sail/sinergym/blob/main/images/wandb_evaluating_hyperparameters.png?raw=true)

- The evaluation output, containing the registered artifact and CSV files created by the logger:

![wandb_evaluating_artifact](https://github.com/ugr-sail/sinergym/blob/main/images/wandb_evaluating_artifact.png?raw=true)