# Reinforcement Learning control strategies for Electric Vehicles fleet Virtual Power Plants
This thesis develops an RL agent that manages a Virtual Power Plant (VPP) through EV charging stations in a household environment. The main optimization objectives of the VPP are valley filling, peak shaving, and a zero resulting load over time. The main actions taken to reach these objectives are storing energy from renewable sources and pushing power into the grid at times of high demand. The VPP environment is built on ELVIS (Electric Vehicles Infrastructure Simulator), an open library from DAI-Labor: https://github.com/dailab/elvis

The thesis code is currently available at: https://github.com/francescomaldonato/RL_VPP_Thesis
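To make the objectives concrete, here is a minimal sketch (not the thesis implementation) of a single idealized battery applied to a net-load series. It charges from surplus renewable power when the load is negative (valley filling) and discharges when the load is positive (peak shaving), which pushes the resulting load toward zero. The function name `flatten_load`, the capacity and power limits, and the sample values are illustrative assumptions.

```python
def flatten_load(net_load_kw, capacity_kwh, max_power_kw, step_h=0.25, soc_kwh=0.0):
    """Greedy storage control that drives the resulting load toward zero.

    net_load_kw: household load minus renewable generation per step (kW).
    Returns the resulting load per step (kW) and the final stored energy (kWh).
    """
    resulting = []
    for load in net_load_kw:
        if load < 0:  # surplus renewable power: charge (valley filling)
            room_kw = (capacity_kwh - soc_kwh) / step_h
            power = min(-load, max_power_kw, room_kw)
            soc_kwh += power * step_h
            resulting.append(load + power)
        else:  # demand exceeds generation: discharge (peak shaving)
            available_kw = soc_kwh / step_h
            power = min(load, max_power_kw, available_kw)
            soc_kwh -= power * step_h
            resulting.append(load - power)
    return resulting, soc_kwh


# Hypothetical 15-minute net-load samples in kW
net_load = [-4.0, -2.0, 1.0, 3.0]
resulting, soc = flatten_load(net_load, capacity_kwh=10.0, max_power_kw=5.0)
print(resulting, soc)
```

A greedy rule like this has no foresight, which is one reason a learned policy can be attractive: it may also account for EV departure times and the energy each vehicle needs when it leaves.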

Author: Francesco Maldonato

## VPP simulator Notebook based on EVs arrival, with StableBaselines3 trained model loaded [MaskablePPO]

Installing required packages and dependencies

In [1]:
%%capture
!pip install py-elvis==0.2.1
!pip install pyyaml==5.4
!pip install plotly==5.9.0
!pip install -U kaleido==0.2.1

!pip install stable-baselines3[extra]==1.6.1
!pip install sb3-contrib==1.6.1
!pip install gym==0.20.0
!pip install -q wandb==0.13.4

In [2]:
#Cloning repository and changing directory
!git clone https://github.com/francescomaldonato/RL_VPP_Thesis.git
%cd RL_VPP_Thesis/
%ls

Cloning into 'RL_VPP_Thesis'...
remote: Enumerating objects: 517, done.
remote: Counting objects: 100% (124/124), done.
remote: Compressing objects: 100% (59/59), done.
remote: Total 517 (delta 65), reused 121 (delta 64), pack-reused 393
Receiving objects: 100% (517/517), 188.99 MiB | 21.30 MiB/s, done.
Resolving deltas: 100% (214/214), done.
Checking out files: 100% (223/223), done.
/content/RL_VPP_Thesis
Agent_trainer_notebooks/          RL_VPP_Thesis/
Algorithm_simulator_notebooks/    trained_models/
data/                             VPP_environment.py
EV_experiment_notebooks/          VPP_simulator.ipynb
Hyperparameters_sweep_notebooks/  wandb/
README.md


In [3]:
import yaml
import numpy as np
from VPP_environment import VPPEnv, VPP_Scenario_config
from elvis.config import ScenarioConfig
import os
import torch
import random
import wandb
from sb3_contrib import MaskablePPO #The available algorithm in sb3-contrib for the custom environment with MultiInputPolicy
from sb3_contrib.common.maskable.utils import get_action_masks
import stable_baselines3 as sb3
from stable_baselines3.common.env_checker import check_env
from sb3_contrib.common.maskable.evaluation import evaluate_policy

#Check if cuda device is available for training
print("Torch-Cuda available device:", torch.cuda.is_available())
sb3.get_system_info() #Prints the system info itself; wrapping it in print() would also dump the returned tuple
!wandb --version

Torch-Cuda available device: False
OS: Linux-5.10.133+-x86_64-with-Ubuntu-18.04-bionic #1 SMP Fri Aug 26 08:44:51 UTC 2022
Python: 3.7.14
Stable-Baselines3: 1.6.1
PyTorch: 1.12.1+cu113
GPU Enabled: False
Numpy: 1.21.6
Gym: 0.20.0

wandb, version 0.13.4


In [4]:
# Ensure deterministic behavior
torch.backends.cudnn.deterministic = True
random.seed(0)
torch.manual_seed(0)
torch.cuda.manual_seed_all(0)
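
The cell above seeds Python's `random` module and PyTorch, but not NumPy. If any part of the simulation draws from NumPy's global generator (an assumption, not verified here), runs may still differ from one another. A hedged sketch of a helper that seeds every generator it can find; the name `seed_everything` is hypothetical and not part of the repository.

```python
import os
import random


def seed_everything(seed: int = 0) -> None:
    """Seed the Python RNG, plus NumPy and PyTorch when they are installed."""
    random.seed(seed)
    os.environ["PYTHONHASHSEED"] = str(seed)  # only affects processes started afterwards
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # safe to call without a GPU
    except ImportError:
        pass


# Re-seeding reproduces the same random draw
seed_everything(0)
first = random.random()
seed_everything(0)
second = random.random()
print(first == second)
```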

## Load ELVIS YAML config file
This section loads the EV arrival simulation parameters from a YAML config file in the 'data/config_builder/' folder.

In [5]:
#Loading paths for input data
current_folder = ''
VPP_training_data_input_path = current_folder + 'data/data_training/environment_table/' + 'Environment_data_2019.csv'
VPP_testing_data_input_path = current_folder + 'data/data_testing/environment_table/' + 'Environment_data_2020.csv'
VPP_validating_data_input_path = current_folder + 'data/data_validating/environment_table/' + 'Environment_data_2018.csv'
elvis_input_folder = current_folder + 'data/config_builder/'

case = 'wohnblock_household_simulation_adaptive.yaml' #(loaded by default, 20 EV arrivals per week with 50% average battery SOC)

#To try different simulation parameters, uncomment one of the lines below
#case = 'wohnblock_household_simulation_adaptive_10.yaml' #(10 EV arrivals per week with 50% average battery SOC)
#case = 'wohnblock_household_simulation_adaptive_15.yaml' #(15 EV arrivals per week with 50% average battery SOC)
#case = 'wohnblock_household_simulation_adaptive_25.yaml' #(25 EV arrivals per week with 50% average battery SOC)
#case = 'wohnblock_household_simulation_adaptive_30.yaml' #(30 EV arrivals per week with 50% average battery SOC)
#case = 'wohnblock_household_simulation_adaptive_35.yaml' #(35 EV arrivals per week with 50% average battery SOC)

with open(elvis_input_folder + case, 'r') as file:
    yaml_str = yaml.full_load(file)

elvis_config_file = ScenarioConfig.from_yaml(yaml_str)
VPP_config_file = VPP_Scenario_config(yaml_str)

print(elvis_config_file)
print(VPP_config_file)

Vehicle types: <generator object ScenarioConfig.__str__.<locals>.<genexpr> at 0x7f1857b6cad0>Mean parking time: 23.99
Std deviation of parking time: 1
Mean value of the SOC distribution: 0.5
Std deviation of the SOC distribution: 0.1
Max parking time: 24
Number of charging events per week: 20
Vehicles are disconnected only depending on their parking time
Queue length: 0
Opening hours: None
Scheduling policy: Uncontrolled

{'start_date': '2022-01-01T00:00:00', 'end_date': '2023-01-01T00:00:00', 'resolution': '0:15:00', 'num_households': 4, 'solar_power': 16, 'wind_power': 12, 'EV_types': [{'battery': {'capacity': 100, 'efficiency': 1, 'max_charge_power': 150, 'min_charge_power': 0}, 'brand': 'Tesla', 'model': 'Model S', 'probability': 1}], 'charging_stations_n': 4, 'EVs_n': 20, 'EVs_n_max': 1044, 'mean_park': 23.99, 'std_deviation_park': 1, 'EVs_mean_soc': 50.0, 'EVs_std_deviation_soc': 10.0, 'EV_load_max': 44, 'EV_load_rated': 14.8, 'EV_load_min': 1, 'houseRWload_max': 10, 'av_max_ener
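
The `start_date`, `end_date` and `resolution` fields printed above fix the length of one simulated episode. As a rough sketch using only those three values (the helper `parse_resolution` is hypothetical, and whether the environment counts the final timestamp is not confirmed by this notebook), the number of 15-minute steps can be derived as follows:

```python
from datetime import datetime, timedelta

# Values copied from the printed VPP config; all other fields are omitted
config = {
    "start_date": "2022-01-01T00:00:00",
    "end_date": "2023-01-01T00:00:00",
    "resolution": "0:15:00",
}


def parse_resolution(text: str) -> timedelta:
    """Convert an 'H:MM:SS' string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


start = datetime.fromisoformat(config["start_date"])
end = datetime.fromisoformat(config["end_date"])
step = parse_resolution(config["resolution"])

n_steps = (end - start) // step          # steps in the half-open interval [start, end)
steps_per_day = timedelta(days=1) // step
print(n_steps, steps_per_day)
```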

In [6]:
#TESTING Environment initialization
env = VPPEnv(VPP_testing_data_input_path, elvis_config_file, VPP_config_file)
#env.plot_VPP_input_data()

Charging event: 1, Arrival time: 2022-01-01 12:45:00, Parking_time: 24, Leaving_time: 2022-01-02 12:45:00, SOC: 0.5879173004613288, SOC target: 1.0, Connected car: Tesla, Model S 
 ... 
 Charging event: 1043, Arrival time: 2022-12-31 19:30:00, Parking_time: 24, Leaving_time: 2023-01-01 19:30:00, SOC: 0.5734653089209693, SOC target: 1.0, Connected car: Tesla, Model S 

-DATASET: House&RW_energy_sum=kWh  -21214.64 , Grid_used_en=kWh  4947.18 , RE-to-vehicle_unused_en=kWh  -26161.81 , Total_selling_cost=€  -489.75 , Grid_cost=€  233.11
- ELVIS.Simulation (Av.EV_SOC=  50.0 %):
 Sum_Energy=kWh  21593.77 , Grid_used_en=kWh  38863.94 , RE-to-vehicle_unused_en=kWh  17270.17 , Total_selling_cost=€  896.29 , Grid_cost=€  1387.13 , Charging_events=  1043 
- Exp.VPP_goals: Grid_used_en=kWh 0, RE-to-vehicle_unused_en=kWh 0, Grid_cost=€ 0 , Av.EV_en_left=kWh  75.08
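
In the ELVIS summary above, `Sum_Energy` equals `Grid_used_en` minus the unused renewable energy (38863.94 - 17270.17 = 21593.77 kWh); the DATASET line reports the unused renewable energy with a negative sign instead. The sketch below shows one way such totals can be derived from a net-load power series. The function name and sign convention are assumptions and not the code in `VPP_environment.py`.

```python
def energy_balance(load_kw, step_h=0.25):
    """Split a net-load series (kW) into grid energy and unused renewable energy (kWh)."""
    grid_used = sum(p for p in load_kw if p > 0) * step_h    # energy drawn from the grid
    re_unused = sum(-p for p in load_kw if p < 0) * step_h   # surplus renewable energy
    return {
        "sum_energy": grid_used - re_unused,
        "grid_used": grid_used,
        "re_unused": re_unused,
    }


# Hypothetical 15-minute samples in kW
balance = energy_balance([4.0, -2.0, 0.0, 6.0, -4.0])
print(balance)
```

Under this convention, the VPP goal of zero grid energy and zero unused renewable energy corresponds to a load series that is zero at every step.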


In [7]:
env.plot_ELVIS_data()

In [8]:
#Function to check custom environment and output additional warnings if needed
check_env(env)
#env.plot_reward_functions()

- ELVIS.Simulation (Av.EV_SOC=  50.0 %):
 Sum_Energy=kWh  20329.24 , Grid_used_en=kWh  37744.01 , RE-to-vehicle_unused_en=kWh  17414.78 , Total_selling_cost=€  869.78 , Grid_cost=€  1355.65 , Av.EV_en_left=kWh  100.0 , Charging_events=  1043 
- Exp.VPP_goals: Grid_used_en=kWh 0, RE-to-vehicle_unused_en=kWh 0, Grid_cost=€ 0 , Av.EV_en_left=kWh  75.08
Simulating VPP....


In [9]:
## Wandb login to load models
#In Colab, uncomment the wandb.login line below (%env takes unquoted NAME=value pairs):
%env WANDB_DISABLE_CODE=True
%env WANDB_NOTEBOOK_NAME=Simulator_notebooks/MaskablePPO_VPP_simulator.ipynb
#wandb.login(relogin=True)

#In local notebook, uncomment below:
#your_wandb_login_code = "0123456789abcdefghijklmnopqrstwxyzàèìòù0" #example length
#!wandb login {your_wandb_login_code}

env: WANDB_DISABLE_CODE=True
env: WANDB_NOTEBOOK_NAME=Simulator_notebooks/MaskablePPO_VPP_simulator.ipynb


In [10]:
#Loading the trained model, from the local directory or from previous wandb training runs
MaskablePPO_path = "trained_models/MaskablePPO_models/model_MaskablePPO_"

model_id = "8mq440dz"
model = MaskablePPO.load(MaskablePPO_path + model_id, env=env)

# run_id_restore = "2y2dqvyn"
# model = wandb.restore(f'model_{run_id_restore}.zip', run_path=f"francesco_maldonato/RL_VPP_Thesis/{run_id_restore}")

## Testing dataset VPP Simulation using the loaded trained model

In [11]:
#TEST Model
episodes = 1
for episode in range(1, episodes+1):
    obs = env.reset()
    done = False
    score = 0
    # Recurrent state returned by predict(); MaskablePPO is not a recurrent policy, so this is expected to stay None
    lstm_states = None
    num_envs = 1
    # Episode start signals, only meaningful for recurrent policies
    episode_starts = np.ones((num_envs,), dtype=bool)
    while not done:
        #env.render()
        action_masks = get_action_masks(env)
        # Deterministic prediction with the trained model [should improve performance]
        action, lstm_states = model.predict(obs, state=lstm_states, episode_start=episode_starts, action_masks=action_masks, deterministic=True)
        env.lstm_state = lstm_states
        obs, reward, done, info = env.step(action)
        episode_starts = done
        score += reward
    print('Episode:{} Score:{}'.format(episode, score))

VPP_table = env.VPP_table
#print(env.lstm_states_list)

- ELVIS.Simulation (Av.EV_SOC=  50.0 %):
 Sum_Energy=kWh  21921.45 , Grid_used_en=kWh  39397.31 , RE-to-vehicle_unused_en=kWh  17475.87 , Total_selling_cost=€  902.16 , Grid_cost=€  1400.17 , Av.EV_en_left=kWh  100.0 , Charging_events=  1043 
- Exp.VPP_goals: Grid_used_en=kWh 0, RE-to-vehicle_unused_en=kWh 0, Grid_cost=€ 0 , Av.EV_en_left=kWh  75.08
Simulating VPP....
- VPP.Simulation results
 LOAD_INFO: Sum_Energy=KWh  -11087.8 , Grid_used_en=KWh  1787.49 , RE-to-vehicle_unused_en=KWh  12875.29 , Total_selling_cost=€  -264.37 , Grid_cost=€  69.25 
 EV_INFO: Av.EV_energy_leaving=kWh  59.58 , Std.EV_energy_leaving=kWh  21.92 , EV_departures =  1040 , EV_queue_left =  0
SCORE:  Cumulative_reward= 219895.19 - Step_rewars (load_t= 225625.69, EVs_energy_t= -10518.96)
 - Final_rewards (Av.EVs_energy= 5285.1, Grid_used_en= -1035.55, RE-to-vehicle_unused_en= -10064.01, Grid_cost= 10602.92)
Episode:1 Score:219895.1939222926
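
The results above can be turned into relative savings against the uncontrolled ELVIS baseline printed in the same cell. The sketch below only reuses the figures from that output; the helper name `reduction_percent` is an assumption. Both grid energy and grid cost come out roughly 95% lower with the trained agent on the testing dataset.

```python
def reduction_percent(baseline, controlled):
    """Relative reduction of `controlled` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - controlled) / baseline


# Figures copied from the ELVIS and VPP summaries of the testing run
elvis = {"grid_used_kwh": 39397.31, "grid_cost_eur": 1400.17}
vpp = {"grid_used_kwh": 1787.49, "grid_cost_eur": 69.25}

reductions = {key: reduction_percent(elvis[key], vpp[key]) for key in elvis}
for key, value in reductions.items():
    print(key, round(value, 2))
```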


In [12]:
env.plot_VPP_energies()


In [13]:
VPP_table.head(15000)

time,0,1,2,3,EVs_id,actions,mask_truth,ev_charged_pwr,ev_discharged_pwr,load,load_reward,EV_reward,rewards
2022-01-01 00:00:00,0.000000,0.000000,0.000000,0.000000,"[0, 0, 0, 0]","[2, 0, 0, 0]","[False, True, True, True]",0.000000,0.0,1.437116,-5.560844,0.0,-5.560844
2022-01-01 00:15:00,0.000000,0.000000,0.000000,0.000000,"[0, 0, 0, 0]","[1, 0, 0, 0]","[False, True, True, True]",0.000000,0.0,3.929980,-3.723783,0.0,-3.723783
2022-01-01 00:30:00,0.000000,0.000000,0.000000,0.000000,"[0, 0, 0, 0]","[2, 0, 0, 0]","[False, True, True, True]",0.000000,0.0,2.734270,-2.165263,0.0,-2.165263
2022-01-01 00:45:00,0.000000,0.000000,0.000000,0.000000,"[0, 0, 0, 0]","[2, 0, 0, 0]","[False, True, True, True]",0.000000,0.0,1.799158,-1.534470,0.0,-1.534470
2022-01-01 01:00:00,0.000000,0.000000,0.000000,0.000000,"[0, 0, 0, 0]","[2, 0, 0, 0]","[False, True, True, True]",0.000000,0.0,1.420682,-0.607602,0.0,-0.607602
...,...,...,...,...,...,...,...,...,...,...,...,...,...
2022-06-06 04:45:00,44.583286,34.058304,64.984482,51.623146,"[2536, 2535, 2537, 2538]","[2, 1, 1, 0]","[False, True, True, True]",11.063604,-1.0,0.000000,15.000000,0.0,15.000000
2022-06-06 05:00:00,44.333286,35.623512,66.549690,51.623146,"[2536, 2535, 2537, 2538]","[2, 1, 1, 0]","[False, True, True, True]",12.521656,-1.0,0.000000,15.000000,0.0,15.000000
2022-06-06 05:15:00,44.083286,37.312637,68.238815,51.623146,"[2536, 2535, 2537, 2538]","[2, 1, 1, 0]","[False, True, True, True]",13.512993,-1.0,0.000000,15.000000,0.0,15.000000
2022-06-06 05:30:00,43.833286,38.979549,69.905731,51.623146,"[2536, 2535, 2537, 2538]","[2, 1, 1, 0]","[False, True, True, True]",13.335301,-1.0,0.000000,15.000000,0.0,15.000000
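
The `actions` and `mask_truth` columns hold one entry per charging station, and the simulation loop passes `action_masks` to `model.predict`. Conceptually, action masking restricts the policy to the actions that are currently allowed before the best one is picked. The following sketch illustrates that idea with plain Python; the scores, masks and the helper `masked_argmax` are hypothetical, and the exact meaning of each action index in this environment is not stated in the notebook.

```python
import math


def masked_argmax(scores, mask):
    """Return the index of the highest-scoring action among those the mask allows."""
    best_index, best_score = None, -math.inf
    for index, (score, allowed) in enumerate(zip(scores, mask)):
        if allowed and score > best_score:
            best_index, best_score = index, score
    if best_index is None:
        raise ValueError("mask forbids every action")
    return best_index


# Hypothetical scores for 2 charging stations with 3 possible actions each
station_scores = [[0.2, 1.5, -0.3], [2.0, 0.1, 0.4]]
station_masks = [[True, False, True], [False, True, True]]

actions = [masked_argmax(s, m) for s, m in zip(station_scores, station_masks)]
print(actions)
```

In both stations the unmasked best action is forbidden, so the selection falls back to the best remaining one.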


In [None]:
#env.plot_Elvis_results()

In [15]:
env.plot_VPP_results()


In [16]:
env.plot_VPP_supply_demand()


In [17]:
env.plot_VPP_Elvis_comparison()

In [18]:
env.plot_rewards_results()


In [19]:
env.plot_rewards_stats()

In [20]:
env.plot_EVs_kpi()

In [21]:
env.plot_load_kpi()

In [22]:
env.plot_yearly_load_log()


## Validating dataset VPP Simulation using the loaded trained model

In [23]:
#VALIDATING Environment initialization
env = VPPEnv(VPP_validating_data_input_path, elvis_config_file, VPP_config_file)

Charging event: 3130, Arrival time: 2022-01-01 06:15:00, Parking_time: 24, Leaving_time: 2022-01-02 06:15:00, SOC: 0.6360169958559092, SOC target: 1.0, Connected car: Tesla, Model S 
 ... 
 Charging event: 4172, Arrival time: 2022-12-31 21:30:00, Parking_time: 24, Leaving_time: 2023-01-01 21:30:00, SOC: 0.5969565584052647, SOC target: 1.0, Connected car: Tesla, Model S 

-DATASET: House&RW_energy_sum=kWh  -30085.39 , Grid_used_en=kWh  2136.67 , RE-to-vehicle_unused_en=kWh  -32222.06 , Total_selling_cost=€  -1187.15 , Grid_cost=€  113.34
- ELVIS.Simulation (Av.EV_SOC=  50.0 %):
 Sum_Energy=kWh  11440.74 , Grid_used_en=kWh  33605.46 , RE-to-vehicle_unused_en=kWh  22164.71 , Total_selling_cost=€  545.43 , Grid_cost=€  1462.8 , Charging_events=  1043 
- Exp.VPP_goals: Grid_used_en=kWh 0, RE-to-vehicle_unused_en=kWh 0, Grid_cost=€ 0 , Av.EV_en_left=kWh  80.89


In [24]:
#Function to check custom environment and output additional warnings if needed
check_env(env)
#env.plot_VPP_input_data()

- ELVIS.Simulation (Av.EV_SOC=  50.0 %):
 Sum_Energy=kWh  13464.14 , Grid_used_en=kWh  35142.74 , RE-to-vehicle_unused_en=kWh  21678.6 , Total_selling_cost=€  644.68 , Grid_cost=€  1533.56 , Av.EV_en_left=kWh  100.0 , Charging_events=  1043 
- Exp.VPP_goals: Grid_used_en=kWh 0, RE-to-vehicle_unused_en=kWh 0, Grid_cost=€ 0 , Av.EV_en_left=kWh  80.89
Simulating VPP....


In [25]:
#model = PPO.load(PPO_path + model_run_ID, env = env)
model = MaskablePPO.load(MaskablePPO_path + model_id, env=env)

In [26]:
#VALIDATE Model
episodes = 1
for episode in range(1, episodes+1):
    obs = env.reset()
    done = False
    score = 0
    # Recurrent state returned by predict(); MaskablePPO is not a recurrent policy, so this is expected to stay None
    lstm_states = None
    num_envs = 1
    # Episode start signals, only meaningful for recurrent policies
    episode_starts = np.ones((num_envs,), dtype=bool)
    while not done:
        #env.render()
        action_masks = get_action_masks(env)
        # Deterministic prediction with the trained model [should improve performance]
        action, lstm_states = model.predict(obs, state=lstm_states, episode_start=episode_starts, action_masks=action_masks, deterministic=True)
        env.lstm_state = lstm_states
        obs, reward, done, info = env.step(action)
        episode_starts = done
        score += reward
    print('Episode:{} Score:{}'.format(episode, score))

VPP_table = env.VPP_table
#print(env.lstm_states_list)

- ELVIS.Simulation (Av.EV_SOC=  50.0 %):
 Sum_Energy=kWh  13090.22 , Grid_used_en=kWh  35097.07 , RE-to-vehicle_unused_en=kWh  22006.85 , Total_selling_cost=€  677.12 , Grid_cost=€  1576.03 , Av.EV_en_left=kWh  100.0 , Charging_events=  1043 
- Exp.VPP_goals: Grid_used_en=kWh 0, RE-to-vehicle_unused_en=kWh 0, Grid_cost=€ 0 , Av.EV_en_left=kWh  80.89
Simulating VPP....
- VPP.Simulation results
 LOAD_INFO: Sum_Energy=KWh  -13753.75 , Grid_used_en=KWh  1170.07 , RE-to-vehicle_unused_en=KWh  14923.82 , Total_selling_cost=€  -527.35 , Grid_cost=€  51.73 
 EV_INFO: Av.EV_energy_leaving=kWh  65.43 , Std.EV_energy_leaving=kWh  21.65 , EV_departures =  1043 , EV_queue_left =  0
SCORE:  Cumulative_reward= 231422.11 - Step_rewars (load_t= 205356.74, EVs_energy_t= 11827.07)
 - Final_rewards (Av.EVs_energy= 9983.71, Grid_used_en= -544.42, RE-to-vehicle_unused_en= -8965.2, Grid_cost= 13764.2)
Episode:1 Score:231422.11172783977


In [27]:
env.plot_VPP_energies()


In [28]:
VPP_table.head(15000)

time,0,1,2,3,EVs_id,actions,mask_truth,ev_charged_pwr,ev_discharged_pwr,load,load_reward,EV_reward,rewards
2022-01-01 00:00:00,0.000000,0.000000,0.000000,0.000000,"[0, 0, 0, 0]","[0, 0, 0, 0]","[True, True, True, True]",0.000000,0.0,-3.035789,-5.458403,0.0,-5.458403
2022-01-01 00:15:00,0.000000,0.000000,0.000000,0.000000,"[0, 0, 0, 0]","[0, 0, 0, 0]","[True, True, True, True]",0.000000,0.0,-4.504244,-5.040315,0.0,-5.040315
2022-01-01 00:30:00,0.000000,0.000000,0.000000,0.000000,"[0, 0, 0, 0]","[0, 0, 0, 0]","[True, True, True, True]",0.000000,0.0,-4.044346,-3.714730,0.0,-3.714730
2022-01-01 00:45:00,0.000000,0.000000,0.000000,0.000000,"[0, 0, 0, 0]","[0, 0, 0, 0]","[True, True, True, True]",0.000000,0.0,-3.228838,-5.565442,0.0,-5.565442
2022-01-01 01:00:00,0.000000,0.000000,0.000000,0.000000,"[0, 0, 0, 0]","[0, 0, 0, 0]","[True, True, True, True]",0.000000,0.0,-4.621987,-6.535266,0.0,-6.535266
...,...,...,...,...,...,...,...,...,...,...,...,...,...
2022-06-06 04:45:00,65.805206,46.526154,81.752731,59.287613,"[5647, 5648, 5649, 5646]","[2, 1, 1, 0]","[False, True, True, True]",12.447645,-1.0,0.000000,15.000000,0.0,15.000000
2022-06-06 05:00:00,65.555206,48.368942,83.595520,59.287613,"[5647, 5648, 5649, 5646]","[2, 1, 1, 0]","[False, True, True, True]",14.742314,-1.0,0.000000,15.000000,0.0,15.000000
2022-06-06 05:15:00,65.305206,50.255310,85.481888,59.287613,"[5647, 5648, 5649, 5646]","[2, 1, 1, 0]","[False, True, True, True]",15.090948,-1.0,0.000000,15.000000,0.0,15.000000
2022-06-06 05:30:00,65.055206,52.108799,87.335373,59.287613,"[5647, 5648, 5649, 5646]","[2, 1, 1, 0]","[False, True, True, True]",14.827907,-1.0,0.000000,15.000000,0.0,15.000000
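
When a table like the one above is exported as CSV text, the list-valued columns (`EVs_id`, `actions`, `mask_truth`) arrive as quoted strings. A small sketch of turning them back into Python lists with the standard library follows; it assumes a single header row and uses one row copied from the output above.

```python
import ast
import csv
import io

# One header row and one data row copied from the table output
raw = '''time,0,1,2,3,EVs_id,actions,mask_truth,ev_charged_pwr,ev_discharged_pwr,load,load_reward,EV_reward,rewards
2022-06-06 04:45:00,65.805206,46.526154,81.752731,59.287613,"[5647, 5648, 5649, 5646]","[2, 1, 1, 0]","[False, True, True, True]",12.447645,-1.0,0.000000,15.000000,0.0,15.000000
'''

rows = []
for record in csv.DictReader(io.StringIO(raw)):
    for column in ("EVs_id", "actions", "mask_truth"):
        # literal_eval safely parses "[2, 1, 1, 0]" into [2, 1, 1, 0]
        record[column] = ast.literal_eval(record[column])
    record["rewards"] = float(record["rewards"])
    rows.append(record)

print(rows[0]["actions"], rows[0]["mask_truth"], rows[0]["rewards"])
```

`ast.literal_eval` is used instead of `eval` because it only accepts Python literals, which is safer for text read from a file.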


In [None]:
#env.plot_Elvis_results()

In [30]:
env.plot_VPP_results()



In [31]:
env.plot_VPP_supply_demand()


In [32]:
env.plot_VPP_Elvis_comparison()

In [33]:
env.plot_rewards_results()


In [34]:
env.plot_rewards_stats()

In [35]:
env.plot_EVs_kpi()

In [36]:
env.plot_load_kpi()

In [37]:
env.plot_yearly_load_log()


## Training dataset VPP Simulation using the loaded trained model

In [38]:
#TRAINING Environment initialization
env = VPPEnv(VPP_training_data_input_path, elvis_config_file, VPP_config_file)

Charging event: 6259, Arrival time: 2022-01-01 07:15:00, Parking_time: 24, Leaving_time: 2022-01-02 07:15:00, SOC: 0.4245623790785612, SOC target: 1.0, Connected car: Tesla, Model S 
 ... 
 Charging event: 7301, Arrival time: 2022-12-31 16:30:00, Parking_time: 24, Leaving_time: 2023-01-01 16:30:00, SOC: 0.4280773320918259, SOC target: 1.0, Connected car: Tesla, Model S 

-DATASET: House&RW_energy_sum=kWh  -34117.7 , Grid_used_en=kWh  1556.25 , RE-to-vehicle_unused_en=kWh  -35673.95 , Total_selling_cost=€  -1196.64 , Grid_cost=€  97.86
- ELVIS.Simulation (Av.EV_SOC=  50.0 %):
 Sum_Energy=kWh  8592.43 , Grid_used_en=kWh  32315.28 , RE-to-vehicle_unused_en=kWh  23722.84 , Total_selling_cost=€  506.97 , Grid_cost=€  1371.62 , Charging_events=  1043 
- Exp.VPP_goals: Grid_used_en=kWh 0, RE-to-vehicle_unused_en=kWh 0, Grid_cost=€ 0 , Av.EV_en_left=kWh  84.2


In [39]:
#Function to check custom environment and output additional warnings if needed
check_env(env)
#env.plot_VPP_input_data()

- ELVIS.Simulation (Av.EV_SOC=  50.0 %):
 Sum_Energy=kWh  9375.24 , Grid_used_en=kWh  32797.47 , RE-to-vehicle_unused_en=kWh  23422.23 , Total_selling_cost=€  534.25 , Grid_cost=€  1390.64 , Av.EV_en_left=kWh  100.0 , Charging_events=  1043 
- Exp.VPP_goals: Grid_used_en=kWh 0, RE-to-vehicle_unused_en=kWh 0, Grid_cost=€ 0 , Av.EV_en_left=kWh  84.2
Simulating VPP....


In [40]:
#model = PPO.load(PPO_path + model_run_ID, env = env)
model = MaskablePPO.load(MaskablePPO_path + model_id, env=env)

In [41]:
#Run the model on the TRAINING dataset
episodes = 1
for episode in range(1, episodes+1):
    obs = env.reset()
    done = False
    score = 0
    # Recurrent state returned by predict(); MaskablePPO is not a recurrent policy, so this is expected to stay None
    lstm_states = None
    num_envs = 1
    # Episode start signals, only meaningful for recurrent policies
    episode_starts = np.ones((num_envs,), dtype=bool)
    while not done:
        #env.render()
        action_masks = get_action_masks(env)
        # Deterministic prediction with the trained model [should improve performance]
        action, lstm_states = model.predict(obs, state=lstm_states, episode_start=episode_starts, action_masks=action_masks, deterministic=True)
        env.lstm_state = lstm_states
        obs, reward, done, info = env.step(action)
        episode_starts = done
        score += reward
    print('Episode:{} Score:{}'.format(episode, score))

VPP_table = env.VPP_table
#print(env.lstm_states_list)

- ELVIS.Simulation (Av.EV_SOC=  50.0 %):
 Sum_Energy=kWh  9066.49 , Grid_used_en=kWh  31695.8 , RE-to-vehicle_unused_en=kWh  22629.31 , Total_selling_cost=€  492.46 , Grid_cost=€  1316.7 , Av.EV_en_left=kWh  100.0 , Charging_events=  1043 
- Exp.VPP_goals: Grid_used_en=kWh 0, RE-to-vehicle_unused_en=kWh 0, Grid_cost=€ 0 , Av.EV_en_left=kWh  84.2
Simulating VPP....
- VPP.Simulation results
 LOAD_INFO: Sum_Energy=KWh  -15915.49 , Grid_used_en=KWh  967.58 , RE-to-vehicle_unused_en=KWh  16883.07 , Total_selling_cost=€  -557.22 , Grid_cost=€  39.75 
 EV_INFO: Av.EV_energy_leaving=kWh  67.6 , Std.EV_energy_leaving=kWh  22.03 , EV_departures =  1041 , EV_queue_left =  0
SCORE:  Cumulative_reward= 222022.25 - Step_rewars (load_t= 192511.49, EVs_energy_t= 15327.61)
 - Final_rewards (Av.EVs_energy= 10587.4, Grid_used_en= -429.68, RE-to-vehicle_unused_en= -10238.83, Grid_cost= 14264.26)
Episode:1 Score:222022.2532852979
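
With all three simulations complete, the headline figures can be placed side by side. The sketch below only collects numbers printed in the three result cells (grid energy used, average EV energy at departure, the expected EV energy goal, and cumulative reward); the dictionary layout and the "EV gap" column are illustrative choices, not output of the environment.

```python
# Figures copied from the VPP simulation results of each dataset
runs = {
    "testing (2020 data)": {"grid_kwh": 1787.49, "ev_leaving_kwh": 59.58, "ev_goal_kwh": 75.08, "reward": 219895.19},
    "validating (2018 data)": {"grid_kwh": 1170.07, "ev_leaving_kwh": 65.43, "ev_goal_kwh": 80.89, "reward": 231422.11},
    "training (2019 data)": {"grid_kwh": 967.58, "ev_leaving_kwh": 67.6, "ev_goal_kwh": 84.2, "reward": 222022.25},
}

print(f"{'dataset':<24}{'grid kWh':>10}{'EV gap kWh':>12}{'reward':>12}")
for name, run in runs.items():
    # Gap between the expected average EV energy and the energy actually left in the EVs
    ev_gap = run["ev_goal_kwh"] - run["ev_leaving_kwh"]
    print(f"{name:<24}{run['grid_kwh']:>10.2f}{ev_gap:>12.2f}{run['reward']:>12.2f}")

lowest_grid = min(runs, key=lambda name: runs[name]["grid_kwh"])
print("lowest grid energy:", lowest_grid)
```

The training dataset shows the lowest grid energy, while the testing dataset, which the agent has not seen, shows the highest; the gap to the expected EV energy is of similar size in all three runs.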


In [42]:
env.plot_VPP_energies()


In [43]:
VPP_table.head(15000)

time,0,1,2,3,EVs_id,actions,mask_truth,ev_charged_pwr,ev_discharged_pwr,load,load_reward,EV_reward,rewards
2022-01-01 00:00:00,0.000000,0.0,0.000000,0.0,"[0, 0, 0, 0]","[0, 0, 0, 0]","[True, True, True, True]",0.000000,0.0,-4.633643,-4.540252,0.0,-4.540252
2022-01-01 00:15:00,0.000000,0.0,0.000000,0.0,"[0, 0, 0, 0]","[0, 0, 0, 0]","[True, True, True, True]",0.000000,0.0,-3.724151,-4.059758,0.0,-4.059758
2022-01-01 00:30:00,0.000000,0.0,0.000000,0.0,"[0, 0, 0, 0]","[0, 0, 0, 0]","[True, True, True, True]",0.000000,0.0,-3.435855,-3.466625,0.0,-3.466625
2022-01-01 00:45:00,0.000000,0.0,0.000000,0.0,"[0, 0, 0, 0]","[0, 0, 0, 0]","[True, True, True, True]",0.000000,0.0,-3.079975,-5.746813,0.0,-5.746813
2022-01-01 01:00:00,0.000000,0.0,0.000000,0.0,"[0, 0, 0, 0]","[0, 0, 0, 0]","[True, True, True, True]",0.000000,0.0,-4.821495,-6.080078,0.0,-6.080078
...,...,...,...,...,...,...,...,...,...,...,...,...,...
2022-06-06 04:45:00,97.102890,0.0,59.708019,0.0,"[8785, 0, 8786, 0]","[1, 0, 0, 0]","[False, True, False, True]",6.604815,0.0,0.000000,15.000000,0.0,15.000000
2022-06-06 05:00:00,99.712433,0.0,59.708019,0.0,"[8785, 0, 8786, 0]","[1, 0, 0, 0]","[False, True, False, True]",10.438157,0.0,0.000000,-7.583130,0.0,-7.583130
2022-06-06 05:15:00,99.989998,0.0,59.708019,0.0,"[8785, 0, 8786, 0]","[1, 0, 0, 0]","[False, True, False, True]",1.110269,0.0,-6.841443,-9.538791,0.0,-9.538791
2022-06-06 05:30:00,99.989998,0.0,59.708019,0.0,"[8785, 0, 8786, 0]","[1, 0, 0, 0]","[False, True, False, True]",0.000009,0.0,-8.992670,-9.648014,0.0,-9.648014


In [None]:
#env.plot_Elvis_results()

In [45]:
env.plot_VPP_results()


In [46]:
env.plot_VPP_supply_demand()


In [47]:
env.plot_VPP_Elvis_comparison()

In [48]:
env.plot_rewards_results()


In [49]:
env.plot_rewards_stats()

In [50]:
env.plot_EVs_kpi()

In [51]:
env.plot_actions_kpi()

In [52]:
env.plot_load_kpi()

In [53]:
env.plot_yearly_load_log()


In [54]:
#env.close()