# Reinforcement Learning control strategies for Electric Vehicles fleet Virtual Power Plants
This notebook accompanies a thesis on the development of an RL agent that manages a Virtual Power Plant (VPP) through EV charging stations in a household environment. The main optimization objectives of the VPP are valley filling, peak shaving, and a resulting load as close to zero as possible over time. The main actions taken to reach these objectives are storing renewable energy and pushing power into the grid at times of high demand. The VPP environment is built on ELVIS (Electric Vehicles Infrastructure Simulator), an open library from DAI-Labor: https://github.com/dailab/elvis. The thesis code is currently available at https://github.com/francescomaldonato/RL_VPP_Thesis.

Author: Francesco Maldonato
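
To make the objectives above concrete, here is a minimal rule-based sketch of valley filling and peak shaving. It is not the RL agent and not part of the thesis code. It assumes a net household load per 15-minute step in kW (positive when the house draws from the grid, negative when renewables produce a surplus) and a single battery; `dispatch_step`, `capacity_kwh` and `max_power_kw` are hypothetical names and values chosen for illustration.

```python
def dispatch_step(net_load_kw, soc_kwh, capacity_kwh=100.0, max_power_kw=11.0, dt_h=0.25):
    """Return (battery_power_kw, new_soc_kwh, resulting_load_kw) for one time step.

    battery_power_kw > 0 means charging (valley filling with surplus renewables),
    battery_power_kw < 0 means discharging (peak shaving during demand).
    """
    if net_load_kw < 0:
        # Surplus: charge with as much of it as power and capacity limits allow.
        room_kw = (capacity_kwh - soc_kwh) / dt_h
        power = min(-net_load_kw, max_power_kw, room_kw)
    else:
        # Demand: discharge to cover it, limited by power and stored energy.
        available_kw = soc_kwh / dt_h
        power = -min(net_load_kw, max_power_kw, available_kw)
    new_soc = soc_kwh + power * dt_h
    resulting_load = net_load_kw + power
    return power, new_soc, resulting_load


soc = 50.0
for load in [-8.0, -3.0, 4.0, 20.0]:
    power, soc, resulting = dispatch_step(load, soc)
    print(f"load={load:6.1f} kW  battery={power:6.1f} kW  resulting={resulting:5.1f} kW  soc={soc:5.1f} kWh")
```

A resulting load of zero means the step was fully balanced; the last step shows a peak that the power limit only partly shaves.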

## VPP simulator Notebook based on EVs arrival, with StableBaselines3 trained model loaded [TRPO]

Installing required packages and dependencies

In [1]:
%%capture
!pip install py-elvis==0.2.1
!pip install pyyaml==5.4
!pip install plotly==5.9.0
!pip install -U kaleido==0.2.1

!pip install stable-baselines3[extra]==1.6.1
!pip install stable-baselines==1.6.1
!pip install sb3-contrib==1.6.1
!pip install gym==0.20.0
!pip install -q wandb==0.13.4

In [2]:
#Cloning repository and changing directory
!git clone https://github.com/francescomaldonato/RL_VPP_Thesis.git
%cd RL_VPP_Thesis/
%ls

Cloning into 'RL_VPP_Thesis'...
remote: Enumerating objects: 517, done.
remote: Counting objects: 100% (124/124), done.
remote: Compressing objects: 100% (59/59), done.
remote: Total 517 (delta 65), reused 121 (delta 64), pack-reused 393
Receiving objects: 100% (517/517), 188.99 MiB | 20.58 MiB/s, done.
Resolving deltas: 100% (214/214), done.
Checking out files: 100% (223/223), done.
/content/RL_VPP_Thesis
Agent_trainer_notebooks/          RL_VPP_Thesis/
Algorithm_simulator_notebooks/    trained_models/
data/                             VPP_environment.py
EV_experiment_notebooks/          VPP_simulator.ipynb
Hyperparameters_sweep_notebooks/  wandb/
README.md


In [3]:
import yaml
import numpy as np
from VPP_environment import VPPEnv, VPP_Scenario_config
from elvis.config import ScenarioConfig
import os
import torch
import random
import wandb
from sb3_contrib import TRPO #The sb3-contrib algorithm used here on the custom environment with MultiInputPolicy
from sb3_contrib.common.maskable.utils import get_action_masks
import stable_baselines3 as sb3
from stable_baselines3.common.env_checker import check_env

#Check if cuda device is available for training
print("Torch-Cuda available device:", torch.cuda.is_available())
print(sb3.get_system_info())
!wandb --version

Torch-Cuda available device: False
OS: Linux-5.10.133+-x86_64-with-Ubuntu-18.04-bionic #1 SMP Fri Aug 26 08:44:51 UTC 2022
Python: 3.7.14
Stable-Baselines3: 1.6.1
PyTorch: 1.12.1+cu113
GPU Enabled: False
Numpy: 1.21.6
Gym: 0.20.0

({'OS': 'Linux-5.10.133+-x86_64-with-Ubuntu-18.04-bionic #1 SMP Fri Aug 26 08:44:51 UTC 2022', 'Python': '3.7.14', 'Stable-Baselines3': '1.6.1', 'PyTorch': '1.12.1+cu113', 'GPU Enabled': 'False', 'Numpy': '1.21.6', 'Gym': '0.20.0'}, 'OS: Linux-5.10.133+-x86_64-with-Ubuntu-18.04-bionic #1 SMP Fri Aug 26 08:44:51 UTC 2022\nPython: 3.7.14\nStable-Baselines3: 1.6.1\nPyTorch: 1.12.1+cu113\nGPU Enabled: False\nNumpy: 1.21.6\nGym: 0.20.0\n')
wandb, version 0.13.4


In [4]:
# Ensure deterministic behavior
torch.backends.cudnn.deterministic = True
random.seed(0)
torch.manual_seed(0)
torch.cuda.manual_seed_all(0)
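
The cell above seeds Python's `random` and PyTorch but not NumPy, and the ELVIS simulation totals printed later in this notebook differ from cell to cell, which suggests some random source is not covered. A minimal sketch of a broader seeding helper follows; `seed_everything` is a hypothetical name, and whether ELVIS or the environment draws from the NumPy global generator is an assumption.

```python
import os
import random


def seed_everything(seed=0):
    """Seed the Python, NumPy and PyTorch generators that are installed."""
    random.seed(seed)
    os.environ["PYTHONHASHSEED"] = str(seed)  # only affects subprocesses started later
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # safe to call without a GPU
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass


seed_everything(0)
first = random.random()
seed_everything(0)
assert random.random() == first  # same seed, same draw
```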

## Load ELVIS YAML config file
This section loads the EV arrival simulation parameters from a YAML config file in the 'data/config_builder/' folder.
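
The next cell switches scenarios by commenting file names in and out. As an alternative sketch, the file name can be derived from the weekly arrival count; `scenario_file` is a hypothetical helper, and the set of available counts is taken from the file names listed in that cell.

```python
def scenario_file(arrivals_per_week=20):
    """Map a weekly EV arrival count to the matching YAML file name."""
    available = (10, 15, 20, 25, 30, 35)
    if arrivals_per_week not in available:
        raise ValueError(f"No scenario for {arrivals_per_week} arrivals/week; choose from {available}")
    base = "wohnblock_household_simulation_adaptive"
    # The default 20-arrival scenario has no numeric suffix.
    suffix = "" if arrivals_per_week == 20 else f"_{arrivals_per_week}"
    return f"{base}{suffix}.yaml"


print(scenario_file())
print(scenario_file(35))
```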

In [5]:
#Loading paths for input data
current_folder = ''
VPP_training_data_input_path = current_folder + 'data/data_training/environment_table/' + 'Environment_data_2019.csv'
VPP_testing_data_input_path = current_folder + 'data/data_testing/environment_table/' + 'Environment_data_2020.csv'
VPP_validating_data_input_path = current_folder + 'data/data_validating/environment_table/' + 'Environment_data_2018.csv'
elvis_input_folder = current_folder + 'data/config_builder/'

case = 'wohnblock_household_simulation_adaptive.yaml' #(loaded by default, 20 EVs arrivals per week with 50% average battery)

#Try different simulation parameters, uncomment below
#case = 'wohnblock_household_simulation_adaptive_10.yaml' #(10 EVs arrivals per week with 50% average battery) 
#case = 'wohnblock_household_simulation_adaptive_15.yaml' #(15 EVs arrivals per week with 50% average battery)
#case = 'wohnblock_household_simulation_adaptive_25.yaml' #(25 EVs arrivals per week with 50% average battery) 
#case = 'wohnblock_household_simulation_adaptive_30.yaml' #(30 EVs arrivals per week with 50% average battery) 
#case = 'wohnblock_household_simulation_adaptive_35.yaml' #(35 EVs arrivals per week with 50% average battery) 

with open(elvis_input_folder + case, 'r') as file:
    yaml_str = yaml.full_load(file)

elvis_config_file = ScenarioConfig.from_yaml(yaml_str)
VPP_config_file = VPP_Scenario_config(yaml_str)

print(elvis_config_file)
print(VPP_config_file)

Vehicle types: <generator object ScenarioConfig.__str__.<locals>.<genexpr> at 0x7f08c4c4fd50>
Mean parking time: 23.99
Std deviation of parking time: 1
Mean value of the SOC distribution: 0.5
Std deviation of the SOC distribution: 0.1
Max parking time: 24
Number of charging events per week: 20
Vehicles are disconnected only depending on their parking time
Queue length: 0
Opening hours: None
Scheduling policy: Uncontrolled

{'start_date': '2022-01-01T00:00:00', 'end_date': '2023-01-01T00:00:00', 'resolution': '0:15:00', 'num_households': 4, 'solar_power': 16, 'wind_power': 12, 'EV_types': [{'battery': {'capacity': 100, 'efficiency': 1, 'max_charge_power': 150, 'min_charge_power': 0}, 'brand': 'Tesla', 'model': 'Model S', 'probability': 1}], 'charging_stations_n': 4, 'EVs_n': 20, 'EVs_n_max': 1044, 'mean_park': 23.99, 'std_deviation_park': 1, 'EVs_mean_soc': 50.0, 'EVs_std_deviation_soc': 10.0, 'EV_load_max': 44, 'EV_load_rated': 14.8, 'EV_load_min': 1, 'houseRWload_max': 10, 'av_max_ener
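
The `start_date`, `end_date` and `resolution` values printed above fix the length of one simulated year. A short sketch of that arithmetic follows; it assumes the environment advances one step per resolution interval, which the notebook does not state explicitly.

```python
from datetime import datetime, timedelta

start = datetime.fromisoformat("2022-01-01T00:00:00")
end = datetime.fromisoformat("2023-01-01T00:00:00")

# Parse the 'H:MM:SS' resolution string into a timedelta.
hours, minutes, seconds = (int(part) for part in "0:15:00".split(":"))
resolution = timedelta(hours=hours, minutes=minutes, seconds=seconds)

steps_per_episode = (end - start) // resolution
steps_per_day = timedelta(days=1) // resolution
weeks = (end - start) / timedelta(weeks=1)
expected_events = 20 * weeks  # 'EVs_n' arrivals per week

print(steps_per_episode, steps_per_day, round(expected_events))
```

The rounded event count comes to 1043, which agrees with the `Charging_events= 1043` figure reported by the ELVIS simulation in the following cells.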

In [6]:
#TESTING Environment initialization
env = VPPEnv(VPP_testing_data_input_path, elvis_config_file, VPP_config_file)
#env.plot_VPP_input_data()

Charging event: 1, Arrival time: 2022-01-01 07:45:00, Parking_time: 23.092332859494963, Leaving_time: 2022-01-02 06:50:32.398294, SOC: 0.7140393505209827, SOC target: 1.0, Connected car: Tesla, Model S 
 ... 
 Charging event: 1043, Arrival time: 2022-12-31 17:15:00, Parking_time: 23.47966825725162, Leaving_time: 2023-01-01 16:43:46.805726, SOC: 0.5263909801628447, SOC target: 1.0, Connected car: Tesla, Model S 

-DATASET: House&RW_energy_sum=kWh  -21214.64 , Grid_used_en=kWh  4947.18 , RE-to-vehicle_unused_en=kWh  -26161.81 , Total_selling_cost=€  -489.75 , Grid_cost=€  233.11
- ELVIS.Simulation (Av.EV_SOC=  50.0 %):
 Sum_Energy=kWh  21771.26 , Grid_used_en=kWh  38697.47 , RE-to-vehicle_unused_en=kWh  16926.21 , Total_selling_cost=€  897.79 , Grid_cost=€  1389.49 , Charging_events=  1043 
- Exp.VPP_goals: Grid_used_en=kWh 0, RE-to-vehicle_unused_en=kWh 0, Grid_cost=€ 0 , Av.EV_en_left=kWh  75.08
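
In the totals printed above, the energy sum appears to equal the grid energy used minus the magnitude of the unused renewable energy. This reading is inferred from the numbers, not taken from the environment code, and `energy_balance_gap` is a hypothetical helper for checking it.

```python
def energy_balance_gap(sum_energy, grid_used, re_unused):
    """Return sum_energy - (grid_used - |re_unused|); close to 0 if the totals are consistent."""
    return sum_energy - (grid_used - abs(re_unused))


# Values copied from the DATASET and ELVIS.Simulation lines above (kWh).
dataset_gap = energy_balance_gap(-21214.64, 4947.18, -26161.81)
elvis_gap = energy_balance_gap(21771.26, 38697.47, 16926.21)
print(f"dataset gap: {dataset_gap:.2f} kWh, ELVIS gap: {elvis_gap:.2f} kWh")
```

Both gaps stay within rounding error of the two-decimal printout, whichever sign the log uses for the unused renewable energy.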


In [7]:
env.plot_ELVIS_data()

In [8]:
#Function to check custom environment and output additional warnings if needed
check_env(env)
#env.plot_reward_functions()

- ELVIS.Simulation (Av.EV_SOC=  50.0 %):
 Sum_Energy=kWh  21908.1 , Grid_used_en=kWh  39108.02 , RE-to-vehicle_unused_en=kWh  17199.92 , Total_selling_cost=€  935.87 , Grid_cost=€  1416.43 , Av.EV_en_left=kWh  100.0 , Charging_events=  1043 
- Exp.VPP_goals: Grid_used_en=kWh 0, RE-to-vehicle_unused_en=kWh 0, Grid_cost=€ 0 , Av.EV_en_left=kWh  75.08
Simulating VPP....


In [9]:
## Wandb login to load models
#In Colab, uncomment below:
%env "WANDB_DISABLE_CODE" True
%env "WANDB_NOTEBOOK_NAME" "Simulator_notebooks/TRPO_VPP_simulator.ipynb"
os.environ['WANDB_NOTEBOOK_NAME'] = 'Simulator_notebooks/TRPO_VPP_simulator.ipynb'
#wandb.login(relogin=True)

#In local notebook, uncomment below:
#your_wandb_login_code = "0123456789abcdefghijklmnopqrstwxyzàèìòù0" #placeholder string showing the expected length
#!wandb login {your_wandb_login_code}

env: "WANDB_DISABLE_CODE"=True
env: "WANDB_NOTEBOOK_NAME"="Simulator_notebooks/TRPO_VPP_simulator.ipynb"


In [10]:
#Load the trained model from the local directory (or restore it from a previous wandb run, see below)
#Despite its name, this path prefix points to the TRPO models
RecurrentPPO_path = "trained_models/TRPO_models/model_TRPO_"

model_id = "2ydih28d"
model = TRPO.load(RecurrentPPO_path + model_id, env=env)

# run_id_restore = "2y2dqvyn"
# model = wandb.restore(f'model_{run_id_restore}.zip', run_path=f"francesco_maldonato/RL_VPP_Thesis/{run_id_restore}")
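
The cell above builds the checkpoint path from a prefix and a run ID, without the `.zip` extension. A small sketch of a check that fails early when the file is missing is shown below; `resolve_model_path` is a hypothetical helper, and the temporary directory only stands in for `trained_models/TRPO_models/`.

```python
import tempfile
from pathlib import Path


def resolve_model_path(prefix, model_id):
    """Return the checkpoint path for a run ID, raising early if the file is missing."""
    path = Path(f"{prefix}{model_id}")
    # Checkpoints are stored as .zip archives; add the suffix if it was left out.
    if path.suffix != ".zip":
        path = path.with_name(path.name + ".zip")
    if not path.is_file():
        raise FileNotFoundError(f"No trained model at {path}")
    return path


# Demo against a temporary directory containing an empty placeholder file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "model_TRPO_demo.zip").touch()
    found = resolve_model_path(f"{tmp}/model_TRPO_", "demo")
    print(found.name)
```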

## Testing dataset VPP Simulation using the loaded trained model
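
The evaluation cells in this notebook all follow the same Gym-style loop: reset, then predict and step until `done`. Here is a minimal sketch of that loop, with hypothetical `ToyEnv` and `ToyModel` classes standing in for `VPPEnv` and the trained TRPO model; their observations and rewards are invented for illustration.

```python
import random


class ToyEnv:
    """Hypothetical stand-in for VPPEnv with the Gym 0.20 reset/step signature."""

    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0  # observation

    def step(self, action):
        self.t += 1
        reward = 1.0 if action == 1 else -1.0
        done = self.t >= self.episode_length
        return float(self.t), reward, done, {}


class ToyModel:
    """Hypothetical stand-in for a trained model; predict() returns (action, state)."""

    def predict(self, obs, state=None, episode_start=None, deterministic=True):
        action = 1 if deterministic else random.choice([0, 1])
        return action, state


def run_episode(env, model):
    """Roll out one episode and return the cumulative reward."""
    obs, done, score, state = env.reset(), False, 0.0, None
    while not done:
        action, state = model.predict(obs, state=state, deterministic=True)
        obs, reward, done, info = env.step(action)
        score += reward
    return score


print(run_episode(ToyEnv(), ToyModel()))
```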

In [11]:
#TEST Model
episodes = 1
for episode in range(1, episodes+1):
    obs = env.reset()
    done = False
    score = 0
    # State returned by predict(); only recurrent policies use it, so it stays None with TRPO
    lstm_states = None
    num_envs = 1
    # Episode-start flags, used by recurrent policies to reset their state
    episode_starts = np.ones((num_envs,), dtype=bool)
    while not done:
        #env.render()
        action_masks = get_action_masks(env) #Computed here but not passed to predict()
        # Deterministic prediction with the trained model [should improve performance]
        action, lstm_states = model.predict(obs, state=lstm_states, episode_start=episode_starts, deterministic=True)
        env.lstm_state = lstm_states
        obs, reward, done, info = env.step(action)
        episode_starts = done
        score+=reward
    print('Episode:{} Score:{}'.format(episode, score))

VPP_table = env.VPP_table
#print(env.lstm_states_list)

- ELVIS.Simulation (Av.EV_SOC=  50.0 %):
 Sum_Energy=kWh  20881.66 , Grid_used_en=kWh  38030.94 , RE-to-vehicle_unused_en=kWh  17149.28 , Total_selling_cost=€  867.51 , Grid_cost=€  1351.35 , Av.EV_en_left=kWh  100.0 , Charging_events=  1043 
- Exp.VPP_goals: Grid_used_en=kWh 0, RE-to-vehicle_unused_en=kWh 0, Grid_cost=€ 0 , Av.EV_en_left=kWh  75.08
Simulating VPP....
- VPP.Simulation results
 LOAD_INFO: Sum_Energy=KWh  -5274.39 , Grid_used_en=KWh  1657.22 , RE-to-vehicle_unused_en=KWh  6931.61 , Total_selling_cost=€  -124.59 , Grid_cost=€  54.52 
 EV_INFO: Av.EV_energy_leaving=kWh  65.89 , Std.EV_energy_leaving=kWh  26.77 , EV_departures =  1035 , EV_queue_left =  4
SCORE:  Cumulative_reward= 355838.29 - Step_rewars (load_t= 330624.58, EVs_energy_t= 2330.76)
 - Final_rewards (Av.EVs_energy= 15339.21, Grid_used_en= -962.27, RE-to-vehicle_unused_en= -3828.61, Grid_cost= 12334.61)
Episode:1 Score:355838.29026398266
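
To put the test run into perspective, the sketch below compares the uncontrolled ELVIS figures with the VPP figures printed above. The values are copied from that output; `reduction_pct` is a hypothetical helper.

```python
def reduction_pct(baseline, controlled):
    """Percentage by which `controlled` is lower than `baseline`."""
    return 100.0 * (baseline - controlled) / baseline


# Values copied from the ELVIS.Simulation and VPP.Simulation lines of the test run.
elvis = {"grid_used_kwh": 38030.94, "re_unused_kwh": 17149.28, "grid_cost_eur": 1351.35}
vpp = {"grid_used_kwh": 1657.22, "re_unused_kwh": 6931.61, "grid_cost_eur": 54.52}

for key in elvis:
    print(f"{key}: {reduction_pct(elvis[key], vpp[key]):.1f} % lower with the agent")

# Shortfall of the average energy left in departing EVs against the stated goal.
ev_goal_kwh, ev_actual_kwh = 75.08, 65.89
print(f"EV energy shortfall: {ev_goal_kwh - ev_actual_kwh:.2f} kWh")
```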


In [12]:
env.plot_VPP_energies()


In [13]:
VPP_table.head(15000)

| time | 0 | 1 | 2 | 3 | EVs_id | actions | mask_truth | ev_charged_pwr | ev_discharged_pwr | load | load_reward | EV_reward | rewards |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2022-01-01 00:00:00 | 0.0 | 0.000000 | 0.0 | 0.0 | [0, 0, 0, 0] | [2, 2, 0, 2] | [False, False, True, False] | 0.000000 | 0.0 | 1.943705 | -5.147057 | 0.0 | -5.147057 |
| 2022-01-01 00:15:00 | 0.0 | 0.000000 | 0.0 | 0.0 | [0, 0, 0, 0] | [2, 2, 0, 2] | [False, False, True, False] | 0.000000 | 0.0 | 3.612744 | -2.785023 | 0.0 | -2.785023 |
| 2022-01-01 00:30:00 | 0.0 | 0.000000 | 0.0 | 0.0 | [0, 0, 0, 0] | [2, 2, 0, 2] | [False, False, True, False] | 0.000000 | 0.0 | 2.171014 | -3.203094 | 0.0 | -3.203094 |
| 2022-01-01 00:45:00 | 0.0 | 0.000000 | 0.0 | 0.0 | [0, 0, 0, 0] | [2, 2, 0, 2] | [False, False, True, False] | 0.000000 | 0.0 | 2.421856 | -1.718590 | 0.0 | -1.718590 |
| 2022-01-01 01:00:00 | 0.0 | 0.000000 | 0.0 | 0.0 | [0, 0, 0, 0] | [2, 2, 0, 2] | [False, False, True, False] | 0.000000 | 0.0 | 1.531154 | -1.534368 | 0.0 | -1.534368 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2022-06-06 04:45:00 | 0.0 | 99.989998 | 0.0 | 0.0 | [0, 2522, 0, 0] | [0, 1, 0, 0] | [True, True, True, True] | 0.000009 | 0.0 | -8.607005 | -11.461537 | 0.0 | -11.461537 |
| 2022-06-06 05:00:00 | 0.0 | 99.989998 | 0.0 | 0.0 | [0, 2522, 0, 0] | [0, 1, 0, 0] | [True, True, True, True] | 0.000009 | 0.0 | -11.107690 | -12.315652 | 0.0 | -12.315652 |
| 2022-06-06 05:15:00 | 0.0 | 99.989998 | 0.0 | 0.0 | [0, 2522, 0, 0] | [0, 1, 0, 0] | [True, True, True, True] | 0.000009 | 0.0 | -12.047217 | -12.905924 | 0.0 | -12.905924 |
| 2022-06-06 05:30:00 | 0.0 | 99.989998 | 0.0 | 0.0 | [0, 2522, 0, 0] | [0, 1, 0, 0] | [True, True, True, True] | 0.000009 | 0.0 | -12.696516 | -11.730226 | 0.0 | -11.730226 |


In [14]:
#env.plot_Elvis_results()

In [15]:
env.plot_VPP_results()


In [16]:
env.plot_VPP_supply_demand()


In [17]:
env.plot_VPP_Elvis_comparison()

In [18]:
env.plot_rewards_results()


In [19]:
env.plot_rewards_stats()

In [20]:
env.plot_EVs_kpi()

In [21]:
env.plot_load_kpi()

In [22]:
env.plot_yearly_load_log()


## Validating dataset VPP Simulation using the loaded trained model

In [23]:
#VALIDATING Environment initialization
env = VPPEnv(VPP_validating_data_input_path, elvis_config_file, VPP_config_file)

Charging event: 3130, Arrival time: 2022-01-01 10:00:00, Parking_time: 23.50403169288236, Leaving_time: 2022-01-02 09:30:14.514094, SOC: 0.45234121333848615, SOC target: 1.0, Connected car: Tesla, Model S 
 ... 
 Charging event: 4172, Arrival time: 2022-12-31 20:45:00, Parking_time: 23.99591148501052, Leaving_time: 2023-01-01 20:44:45.281346, SOC: 0.6398540364044486, SOC target: 1.0, Connected car: Tesla, Model S 

-DATASET: House&RW_energy_sum=kWh  -30085.39 , Grid_used_en=kWh  2136.67 , RE-to-vehicle_unused_en=kWh  -32222.06 , Total_selling_cost=€  -1187.15 , Grid_cost=€  113.34
- ELVIS.Simulation (Av.EV_SOC=  50.0 %):
 Sum_Energy=kWh  12357.22 , Grid_used_en=kWh  33896.11 , RE-to-vehicle_unused_en=kWh  21538.9 , Total_selling_cost=€  617.99 , Grid_cost=€  1488.69 , Charging_events=  1043 
- Exp.VPP_goals: Grid_used_en=kWh 0, RE-to-vehicle_unused_en=kWh 0, Grid_cost=€ 0 , Av.EV_en_left=kWh  80.89


In [24]:
#Function to check custom environment and output additional warnings if needed
check_env(env)
#env.plot_VPP_input_data()

- ELVIS.Simulation (Av.EV_SOC=  50.0 %):
 Sum_Energy=kWh  12384.5 , Grid_used_en=kWh  34050.26 , RE-to-vehicle_unused_en=kWh  21665.77 , Total_selling_cost=€  580.21 , Grid_cost=€  1470.56 , Av.EV_en_left=kWh  100.0 , Charging_events=  1043 
- Exp.VPP_goals: Grid_used_en=kWh 0, RE-to-vehicle_unused_en=kWh 0, Grid_cost=€ 0 , Av.EV_en_left=kWh  80.89
Simulating VPP....


In [25]:
#Reload the trained TRPO model on the validating environment
model = TRPO.load(RecurrentPPO_path + model_id, env=env)

In [26]:
#TEST Model
episodes = 1
for episode in range(1, episodes+1):
    obs = env.reset()
    done = False
    score = 0
    # State returned by predict(); only recurrent policies use it, so it stays None with TRPO
    lstm_states = None
    num_envs = 1
    # Episode-start flags, used by recurrent policies to reset their state
    episode_starts = np.ones((num_envs,), dtype=bool)
    while not done:
        #env.render()
        action_masks = get_action_masks(env) #Computed here but not passed to predict()
        # Deterministic prediction with the trained model [should improve performance]
        action, lstm_states = model.predict(obs, state=lstm_states, episode_start=episode_starts, deterministic=True)
        env.lstm_state = lstm_states
        obs, reward, done, info = env.step(action)
        episode_starts = done
        score+=reward
    print('Episode:{} Score:{}'.format(episode, score))

VPP_table = env.VPP_table
#print(env.lstm_states_list)

- ELVIS.Simulation (Av.EV_SOC=  50.0 %):
 Sum_Energy=kWh  13110.89 , Grid_used_en=kWh  34801.66 , RE-to-vehicle_unused_en=kWh  21690.77 , Total_selling_cost=€  631.72 , Grid_cost=€  1521.21 , Av.EV_en_left=kWh  100.0 , Charging_events=  1043 
- Exp.VPP_goals: Grid_used_en=kWh 0, RE-to-vehicle_unused_en=kWh 0, Grid_cost=€ 0 , Av.EV_en_left=kWh  80.89
Simulating VPP....
- VPP.Simulation results
 LOAD_INFO: Sum_Energy=KWh  -7770.19 , Grid_used_en=KWh  1185.22 , RE-to-vehicle_unused_en=KWh  8955.41 , Total_selling_cost=€  -300.39 , Grid_cost=€  50.52 
 EV_INFO: Av.EV_energy_leaving=kWh  71.21 , Std.EV_energy_leaving=kWh  26.87 , EV_departures =  1042 , EV_queue_left =  0
SCORE:  Cumulative_reward= 360786.88 - Step_rewars (load_t= 316074.59, EVs_energy_t= 18119.0)
 - Final_rewards (Av.EVs_energy= 17467.87, Grid_used_en= -573.73, RE-to-vehicle_unused_en= -3991.27, Grid_cost= 13690.42)
Episode:1 Score:360786.8779955192


In [27]:
env.plot_VPP_energies()


In [28]:
VPP_table.head(15000)

| time | 0 | 1 | 2 | 3 | EVs_id | actions | mask_truth | ev_charged_pwr | ev_discharged_pwr | load | load_reward | EV_reward | rewards |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2022-01-01 00:00:00 | 0.000000 | 0.0 | 0.0 | 0.000000 | [0, 0, 0, 0] | [1, 0, 2, 1] | [False, True, False, False] | 0.000000 | 0.0 | -3.504846 | -4.980275 | 0.0 | -4.980275 |
| 2022-01-01 00:15:00 | 0.000000 | 0.0 | 0.0 | 0.000000 | [0, 0, 0, 0] | [1, 0, 2, 1] | [False, True, False, False] | 0.000000 | 0.0 | -3.988165 | -4.984477 | 0.0 | -4.984477 |
| 2022-01-01 00:30:00 | 0.000000 | 0.0 | 0.0 | 0.000000 | [0, 0, 0, 0] | [1, 0, 2, 1] | [False, True, False, False] | 0.000000 | 0.0 | -3.990686 | -4.703691 | 0.0 | -4.703691 |
| 2022-01-01 00:45:00 | 0.000000 | 0.0 | 0.0 | 0.000000 | [0, 0, 0, 0] | [1, 0, 2, 1] | [False, True, False, False] | 0.000000 | 0.0 | -3.822215 | -5.207320 | 0.0 | -5.207320 |
| 2022-01-01 01:00:00 | 0.000000 | 0.0 | 0.0 | 0.000000 | [0, 0, 0, 0] | [1, 0, 2, 1] | [False, True, False, False] | 0.000000 | 0.0 | -4.228052 | -5.050865 | 0.0 | -5.050865 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2022-06-06 04:45:00 | 59.164333 | 0.0 | 0.0 | 99.989998 | [5641, 0, 0, 5640] | [1, 0, 0, 1] | [True, True, True, False] | 5.296895 | 0.0 | -5.296878 | -7.938946 | 0.0 | -7.938946 |
| 2022-06-06 05:00:00 | 60.972546 | 0.0 | 0.0 | 99.989998 | [5641, 0, 0, 5640] | [1, 0, 0, 1] | [True, True, True, False] | 7.232858 | 0.0 | -7.232841 | -7.341176 | 0.0 | -7.341176 |
| 2022-06-06 05:15:00 | 62.616371 | 0.0 | 0.0 | 99.989998 | [5641, 0, 0, 5640] | [1, 0, 0, 1] | [True, True, True, False] | 6.575311 | 0.0 | -6.575294 | -7.951340 | 0.0 | -7.951340 |
| 2022-06-06 05:30:00 | 64.427994 | 0.0 | 0.0 | 99.989998 | [5641, 0, 0, 5640] | [1, 0, 0, 1] | [True, True, True, False] | 7.246491 | 0.0 | -7.246474 | -7.514350 | 0.0 | -7.514350 |


In [29]:
#env.plot_Elvis_results()

In [30]:
env.plot_VPP_results()



In [31]:
env.plot_VPP_supply_demand()


In [32]:
env.plot_VPP_Elvis_comparison()

In [33]:
env.plot_rewards_results()


In [34]:
env.plot_rewards_stats()

In [35]:
env.plot_EVs_kpi()

In [36]:
env.plot_load_kpi()

In [37]:
env.plot_yearly_load_log()


## Training dataset VPP Simulation using the loaded trained model

In [38]:
#TRAINING Environment initialization
env = VPPEnv(VPP_training_data_input_path, elvis_config_file, VPP_config_file)

Charging event: 6259, Arrival time: 2022-01-01 05:30:00, Parking_time: 24, Leaving_time: 2022-01-02 05:30:00, SOC: 0.4094442193849053, SOC target: 1.0, Connected car: Tesla, Model S 
 ... 
 Charging event: 7301, Arrival time: 2022-12-31 11:00:00, Parking_time: 23.352935532920565, Leaving_time: 2023-01-01 10:21:10.567919, SOC: 0.4931915790673214, SOC target: 1.0, Connected car: Tesla, Model S 

-DATASET: House&RW_energy_sum=kWh  -34117.7 , Grid_used_en=kWh  1556.25 , RE-to-vehicle_unused_en=kWh  -35673.95 , Total_selling_cost=€  -1196.64 , Grid_cost=€  97.86
- ELVIS.Simulation (Av.EV_SOC=  50.0 %):
 Sum_Energy=kWh  8803.48 , Grid_used_en=kWh  32356.93 , RE-to-vehicle_unused_en=kWh  23553.46 , Total_selling_cost=€  515.17 , Grid_cost=€  1367.44 , Charging_events=  1043 
- Exp.VPP_goals: Grid_used_en=kWh 0, RE-to-vehicle_unused_en=kWh 0, Grid_cost=€ 0 , Av.EV_en_left=kWh  84.2


In [39]:
#Function to check custom environment and output additional warnings if needed
check_env(env)
#env.plot_VPP_input_data()

- ELVIS.Simulation (Av.EV_SOC=  50.0 %):
 Sum_Energy=kWh  8325.71 , Grid_used_en=kWh  31543.91 , RE-to-vehicle_unused_en=kWh  23218.21 , Total_selling_cost=€  452.83 , Grid_cost=€  1314.52 , Av.EV_en_left=kWh  100.0 , Charging_events=  1043 
- Exp.VPP_goals: Grid_used_en=kWh 0, RE-to-vehicle_unused_en=kWh 0, Grid_cost=€ 0 , Av.EV_en_left=kWh  84.2
Simulating VPP....


In [40]:
#Reload the trained TRPO model on the training environment
model = TRPO.load(RecurrentPPO_path + model_id, env=env)

In [41]:
#TEST Model
episodes = 1
for episode in range(1, episodes+1):
    obs = env.reset()
    done = False
    score = 0
    # State returned by predict(); only recurrent policies use it, so it stays None with TRPO
    lstm_states = None
    num_envs = 1
    # Episode-start flags, used by recurrent policies to reset their state
    episode_starts = np.ones((num_envs,), dtype=bool)
    while not done:
        #env.render()
        action_masks = get_action_masks(env) #Computed here but not passed to predict()
        # Deterministic prediction with the trained model [should improve performance]
        action, lstm_states = model.predict(obs, state=lstm_states, episode_start=episode_starts, deterministic=True)
        env.lstm_state = lstm_states
        obs, reward, done, info = env.step(action)
        episode_starts = done
        score+=reward
    print('Episode:{} Score:{}'.format(episode, score))

VPP_table = env.VPP_table
#print(env.lstm_states_list)

- ELVIS.Simulation (Av.EV_SOC=  50.0 %):
 Sum_Energy=kWh  8768.01 , Grid_used_en=kWh  32124.59 , RE-to-vehicle_unused_en=kWh  23356.58 , Total_selling_cost=€  487.34 , Grid_cost=€  1345.95 , Av.EV_en_left=kWh  100.0 , Charging_events=  1043 
- Exp.VPP_goals: Grid_used_en=kWh 0, RE-to-vehicle_unused_en=kWh 0, Grid_cost=€ 0 , Av.EV_en_left=kWh  84.2
Simulating VPP....
- VPP.Simulation results
 LOAD_INFO: Sum_Energy=KWh  -9281.63 , Grid_used_en=KWh  1229.38 , RE-to-vehicle_unused_en=KWh  10511.0 , Total_selling_cost=€  -321.56 , Grid_cost=€  50.37 
 EV_INFO: Av.EV_energy_leaving=kWh  73.78 , Std.EV_energy_leaving=kWh  25.76 , EV_departures =  1043 , EV_queue_left =  0
SCORE:  Cumulative_reward= 347817.53 - Step_rewars (load_t= 297118.22, EVs_energy_t= 25431.83)
 - Final_rewards (Av.EVs_energy= 17810.89, Grid_used_en= -745.67, RE-to-vehicle_unused_en= -4687.94, Grid_cost= 12890.21)
Episode:1 Score:347817.5336348704
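
With all three runs complete, the headline figures can be placed side by side. The sketch below copies the values from the outputs printed in this notebook; the year labels come from the input file names, and the tuple layout is only an illustrative choice.

```python
runs = {
    # dataset: (cumulative reward, grid used kWh, avg EV energy at departure kWh, goal kWh)
    "testing (2020)": (355838.29, 1657.22, 65.89, 75.08),
    "validating (2018)": (360786.88, 1185.22, 71.21, 80.89),
    "training (2019)": (347817.53, 1229.38, 73.78, 84.2),
}

print(f"{'dataset':<18}{'reward':>12}{'grid kWh':>10}{'EV kWh':>8}{'goal gap':>10}")
for name, (reward, grid, ev_energy, goal) in runs.items():
    print(f"{name:<18}{reward:>12.2f}{grid:>10.2f}{ev_energy:>8.2f}{goal - ev_energy:>10.2f}")

# Dataset with the highest cumulative reward.
best = max(runs, key=lambda name: runs[name][0])
print("highest cumulative reward:", best)
```

In every run the average energy of departing EVs falls short of the stated goal, so the goal gap column is a quick way to see how much EV charge the agent trades for load balancing.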


In [42]:
env.plot_VPP_energies()


In [43]:
VPP_table.head(15000)

| time | 0 | 1 | 2 | 3 | EVs_id | actions | mask_truth | ev_charged_pwr | ev_discharged_pwr | load | load_reward | EV_reward | rewards |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2022-01-01 00:00:00 | 0.000000 | 0.000000 | 0.0 | 0.000000 | [0, 0, 0, 0] | [1, 0, 2, 1] | [False, True, False, False] | 0.000000 | 0.0 | -4.033347 | -3.064015 | 0.0 | -3.064015 |
| 2022-01-01 00:15:00 | 0.000000 | 0.000000 | 0.0 | 0.000000 | [0, 0, 0, 0] | [1, 0, 2, 1] | [False, True, False, False] | 0.000000 | 0.0 | -2.838409 | -3.177235 | 0.0 | -3.177235 |
| 2022-01-01 00:30:00 | 0.000000 | 0.000000 | 0.0 | 0.000000 | [0, 0, 0, 0] | [1, 0, 2, 1] | [False, True, False, False] | 0.000000 | 0.0 | -2.906341 | -4.471724 | 0.0 | -4.471724 |
| 2022-01-01 00:45:00 | 0.000000 | 0.000000 | 0.0 | 0.000000 | [0, 0, 0, 0] | [1, 0, 2, 1] | [False, True, False, False] | 0.000000 | 0.0 | -3.683034 | -5.373669 | 0.0 | -5.373669 |
| 2022-01-01 01:00:00 | 0.000000 | 0.000000 | 0.0 | 0.000000 | [0, 0, 0, 0] | [1, 0, 2, 1] | [False, True, False, False] | 0.000000 | 0.0 | -4.411036 | -5.448201 | 0.0 | -5.448201 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2022-06-06 04:45:00 | 46.379505 | 98.401428 | 0.0 | 66.104126 | [8799, 8797, 0, 8798] | [2, 1, 0, 1] | [False, False, True, True] | 7.096490 | -1.0 | 0.000000 | 15.000000 | 0.0 | 15.000000 |
| 2022-06-06 05:00:00 | 46.129505 | 99.780899 | 0.0 | 67.483597 | [8799, 8797, 0, 8798] | [0, 1, 2, 1] | [False, False, False, True] | 11.035752 | -1.0 | 0.000000 | -5.051330 | 0.0 | -5.051330 |
| 2022-06-06 05:15:00 | 46.129505 | 99.989998 | 0.0 | 68.706810 | [8799, 8797, 0, 8798] | [0, 1, 0, 1] | [False, False, True, True] | 5.729270 | 0.0 | -4.056463 | -5.441390 | 0.0 | -5.441390 |
| 2022-06-06 05:30:00 | 46.129505 | 99.989998 | 0.0 | 69.828194 | [8799, 8797, 0, 8798] | [0, 1, 0, 1] | [False, False, True, True] | 4.485546 | 0.0 | -4.485529 | -5.747812 | 0.0 | -5.747812 |


In [44]:
#env.plot_Elvis_results()

In [45]:
env.plot_VPP_results()


In [46]:
env.plot_VPP_supply_demand()


In [47]:
env.plot_VPP_Elvis_comparison()

In [48]:
env.plot_rewards_results()


In [49]:
env.plot_rewards_stats()

In [50]:
env.plot_EVs_kpi()

In [51]:
env.plot_actions_kpi()

In [52]:
env.plot_load_kpi()

In [53]:
env.plot_yearly_load_log()


In [54]:
#env.close()