# Reinforcement Learning control strategies for Electric Vehicles fleet Virtual Power Plants
This thesis develops an RL agent that manages a Virtual Power Plant (VPP) through EV charging stations in a household environment. The main optimization objectives of the VPP are valley filling, peak shaving, and zero resulting load over time. The main actions used to reach these objectives are storing renewable energy and pushing power into the grid at times of high demand. The VPP environment is built on ELVIS (Electric Vehicles Infrastructure Simulator), an open library from DAI-Labor: https://github.com/dailab/elvis. The thesis code is currently available at: https://github.com/francescomaldonato/RL_VPP_Thesis

Author: Francesco Maldonato
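
As a rough illustration of how the objectives in the introduction can be quantified, the sketch below splits a net load profile into over-consumption (energy drawn while the resulting load is positive) and under-consumption (surplus energy while it is negative). The function name `split_energy`, the sample profile, and the sign convention are assumptions made for illustration; the 15-minute step matches the `resolution` value of the scenario configs loaded in this notebook. It is not the environment's actual implementation.

```python
from typing import Iterable, Tuple

STEP_HOURS = 0.25  # 15-minute resolution, as in the scenario config


def split_energy(load_kw: Iterable[float], step_hours: float = STEP_HOURS) -> Tuple[float, float, float]:
    """Return (net, over, under) energy in kWh for a net load profile in kW.

    Assumed convention: positive load is demand on the grid (over-consume),
    negative load is unused surplus (under-consume, reported as a magnitude).
    """
    loads = list(load_kw)
    over = sum(p for p in loads if p > 0) * step_hours
    under = sum(-p for p in loads if p < 0) * step_hours
    return over - under, over, under


# Hypothetical one-hour profile (kW): two steps of demand, two of surplus.
net, over, under = split_energy([4.0, 2.0, -1.0, -3.0])
print(net, over, under)
```

Under this convention, a perfectly balanced VPP would drive both `over` and `under` toward zero, which corresponds to the zero resulting load objective.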

## VPP experiment tester Notebook based on EVs arrival, with StableBaselines3 trained model (Recurrent PPO) [10-15-20-25-30-35 EVs per week test]

Installing required packages and dependencies

In [1]:
%%capture
!pip install py-elvis==0.2.1
!pip install pyyaml==5.4
!pip install plotly==5.9.0
!pip install -U kaleido==0.2.1

!pip install stable-baselines3[extra]==1.6.1
!pip install sb3-contrib==1.6.1
!pip install gym==0.20.0
!pip install -q wandb==0.13.4

In [3]:
#Cloning repository and changing directory
!git clone https://github.com/francescomaldonato/RL_VPP_Thesis.git
%cd RL_VPP_Thesis/
%ls

In [4]:
import yaml
import numpy as np
import pandas as pd
from VPP_environment import VPPEnv, VPP_Scenario_config
from elvis.config import ScenarioConfig
import os
import torch
import random
import wandb
from sb3_contrib import RecurrentPPO #Algorithm available in sb3-contrib for the custom environment with MultiInputPolicy
from sb3_contrib.common.maskable.utils import get_action_masks
import stable_baselines3 as sb3
from stable_baselines3.common.env_checker import check_env

#Check if a CUDA device is available for training
print("Torch-Cuda available device:", torch.cuda.is_available())
_ = sb3.get_system_info() #Prints the system info itself; wrapping it in print() would also dump the returned tuple
#!wandb --version

Torch-Cuda available device: False
OS: Linux-5.10.133+-x86_64-with-Ubuntu-18.04-bionic #1 SMP Fri Aug 26 08:44:51 UTC 2022
Python: 3.7.14
Stable-Baselines3: 1.6.1
PyTorch: 1.12.1+cu113
GPU Enabled: False
Numpy: 1.21.6
Gym: 0.20.0


In [5]:
# Ensure deterministic behavior
torch.backends.cudnn.deterministic = True
random.seed(0)
torch.manual_seed(0)
torch.cuda.manual_seed_all(0)

## Load ELVIS YAML config file
This section loads the EV arrival simulation parameters from a YAML config file in the 'data/config_builder/' folder.

In [6]:
#Loading paths for input data
current_folder = ''
VPP_training_data_input_path = current_folder + 'data/data_training/environment_table/' + 'Environment_data_2019.csv'
VPP_testing_data_input_path = current_folder + 'data/data_testing/environment_table/' + 'Environment_data_2020.csv'
VPP_validating_data_input_path = current_folder + 'data/data_validating/environment_table/' + 'Environment_data_2018.csv'
elvis_input_folder = current_folder + 'data/config_builder/'

#case = 'wohnblock_household_simulation_adaptive.yaml' #(loaded by default, 20 EVs arrivals per week with 50% average battery)

#Try different simulation parameters, uncomment below
#case = 'wohnblock_household_simulation_adaptive_10.yaml' #(10 EVs arrivals per week with 50% average battery) 
#case = 'wohnblock_household_simulation_adaptive_15.yaml' #(15 EVs arrivals per week with 50% average battery)
#case = 'wohnblock_household_simulation_adaptive_25.yaml' #(25 EVs arrivals per week with 50% average battery) 
#case = 'wohnblock_household_simulation_adaptive_30.yaml' #(30 EVs arrivals per week with 50% average battery) 
case = 'wohnblock_household_simulation_adaptive_35.yaml' #(35 EVs arrivals per week with 50% average battery) 

with open(elvis_input_folder + case, 'r') as file:
    yaml_str = yaml.full_load(file)

elvis_config_file = ScenarioConfig.from_yaml(yaml_str)
VPP_config_file = VPP_Scenario_config(yaml_str)

print(elvis_config_file)
print(VPP_config_file)

Vehicle types: <generator object ScenarioConfig.__str__.<locals>.<genexpr> at 0x7fcb7638f2d0>Mean parking time: 23.99
Std deviation of parking time: 1
Mean value of the SOC distribution: 0.5
Std deviation of the SOC distribution: 0.1
Max parking time: 24
Number of charging events per week: 35
Vehicles are disconnected only depending on their parking time
Queue length: 0
Opening hours: None
Scheduling policy: Uncontrolled

{'start_date': '2022-01-01T00:00:00', 'end_date': '2023-01-01T00:00:00', 'resolution': '0:15:00', 'num_households': 4, 'solar_power': 16, 'wind_power': 12, 'EV_types': [{'battery': {'capacity': 100, 'efficiency': 1, 'max_charge_power': 150, 'min_charge_power': 0}, 'brand': 'Tesla', 'model': 'Model S', 'probability': 1}], 'charging_stations_n': 4, 'EVs_n': 35, 'EVs_n_max': 1827, 'mean_park': 23.99, 'std_deviation_park': 1, 'EVs_mean_soc': 50.0, 'EVs_std_deviation_soc': 10.0, 'EV_load_max': 44, 'EV_load_rated': 14.8, 'EV_load_min': 1, 'houseRWload_max': 10, 'av_max_ener
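
The scenario files used in this notebook follow a simple naming pattern, so the `case` string can also be derived from the weekly arrival count instead of commenting lines in and out. A minimal sketch, assuming only the six files referenced in this notebook exist; `case_for` is a hypothetical helper, not part of the repository.

```python
BASE = "wohnblock_household_simulation_adaptive"
AVAILABLE = (10, 15, 20, 25, 30, 35)  # weekly arrival counts referenced in this notebook


def case_for(evs_per_week: int) -> str:
    """Return the YAML file name for a weekly EV arrival count."""
    if evs_per_week not in AVAILABLE:
        raise ValueError(f"no scenario file known for {evs_per_week} EVs per week")
    # The 20-EV scenario is the default file and carries no numeric suffix.
    suffix = "" if evs_per_week == 20 else f"_{evs_per_week}"
    return f"{BASE}{suffix}.yaml"


print(case_for(35))
```

The returned name can then be appended to `elvis_input_folder` exactly as the `case` variable is.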

In [7]:
#TESTING Environment initialization
env = VPPEnv(VPP_testing_data_input_path, elvis_config_file, VPP_config_file)
#env.plot_VPP_input_data()
#Function to check custom environment and output additional warnings if needed
check_env(env)
#env.plot_reward_functions()

Charging event: 1, Arrival time: 2022-01-01 10:30:00, Parking_time: 24, Leaving_time: 2022-01-02 10:30:00, SOC: 0.3777073053315077, SOC target: 1.0, Connected car: Tesla, Model S 
 ... 
 Charging event: 1825, Arrival time: 2022-12-31 22:15:00, Parking_time: 23.46607430533517, Leaving_time: 2023-01-01 21:42:57.867499, SOC: 0.35541539370306874, SOC target: 1.0, Connected car: Tesla, Model S 

-DATASET: House&RW_energy_sum=kWh  -21214.64 , over-consume=kWh  4947.18 , under-consume=kWh  -26161.81 , Total_cost=€  -489.75 , overcost=€  233.11
- ELVIS.Simulation (Av.EV_SOC=  50.0 %):
 Sum_Energy=kWh  34262.26 , over-consume=kWh  48986.66 , under-consume=kWh  14724.4 , Total_cost=€  1304.32 , overcost=€  1736.36 , Charging_events=  1825 
- Exp.VPP_goals: Energy_consumed=kWh 0, Av.load=kW 0, Std.load=kW 0, Total_cost=€ 0 , Av.EV_en_left=kWh  64.34
- ELVIS.Simulation (Av.EV_SOC=  50.0 %):
 Sum_Energy=kWh  34381.42 , over-consume=kWh  50362.61 , under-consume=kWh  15981.19 , Total_cost=€  1308.44
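
The `Charging_events` counts printed in this notebook (1825, 1565, 1304, 1043, 783 and 522 events for 35, 30, 25, 20, 15 and 10 EVs per week) are consistent with scaling the weekly arrival rate to a 365-day year and rounding up. The sketch below reproduces that arithmetic. It is an observation that matches the logged counts, not necessarily how ELVIS computes them internally, and `yearly_charging_events` is a hypothetical name.

```python
def yearly_charging_events(evs_per_week: int, days: int = 365) -> int:
    """Scale a weekly arrival count to `days` days, rounding up."""
    # Integer ceiling division avoids floating-point error before rounding.
    return (evs_per_week * days + 6) // 7


for n in (35, 30, 25, 20, 15, 10):
    print(n, yearly_charging_events(n))
```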

In [8]:
#Loading the trained model, from a local directory or from a previous wandb training run
RecurrentPPO_path = "trained_models/RecurrentPPO_models/model_RecurrentPPO_"

#model_id = "s37o8q0n"
model_id = "333ckz0i"
model = RecurrentPPO.load(RecurrentPPO_path + model_id, env=env)

# run_id_restore = "2y2dqvyn"
# model = wandb.restore(f'model_{run_id_restore}.zip', run_path=f"francesco_maldonato/RL_VPP_Thesis/{run_id_restore}")
results_table = []



Wrapping the env with a `Monitor` wrapper
Wrapping the env in a DummyVecEnv.


## Testing dataset VPP Simulation using the loaded trained model [35 EVs per week]

In [9]:
#TEST Model
episodes = 10
results = []
for episode in range(1, episodes+1):
    obs = env.reset()
    done = False
    score = 0
    # cell and hidden state of the LSTM
    lstm_states = None
    num_envs = 1
    # Episode start signals are used to reset the lstm states
    episode_starts = np.ones((num_envs,), dtype=bool)
    while not done:
        #env.render()
        action_masks = get_action_masks(env)
        action, lstm_states = model.predict(obs, state=lstm_states, episode_start=episode_starts, deterministic=True) #Now using our trained model with deterministic prediction [should improve performances]
        env.lstm_state = lstm_states
        obs, reward, done, info = env.step(action)
        episode_starts = done
        score+=reward
    print('Episode:{} Score:{}'.format(episode, score))
    results.append(env.quick_results)

stacked_results = np.stack(results)

#Append a summary row holding the mean of each metric column (1 to 5) over all episodes
means = [np.mean(stacked_results[:, i].astype('float32')) for i in range(1, 6)]
results.append(np.array([str(env.EVs_n)+"_EVs_mean", *means]))
results_table.extend(results)
#Save the VPP table
#VPP_table = env.save_VPP_table(save_path='data/environment_optimized_output/VPP_table.csv')
#VPP_table = env.VPP_table

- ELVIS.Simulation (Av.EV_SOC=  50.0 %):
 Sum_Energy=kWh  34828.86 , over-consume=kWh  49267.3 , under-consume=kWh  14438.44 , Total_cost=€  1285.81 , overcost=€  1711.08 , Av.EV_en_left=kWh  100.0 , Charging_events=  1825 
- Exp.VPP_goals: Energy_consumed=kWh 0, Av.load=kW 0, Std.load=kW 0, Total_cost=€ 0 , Av.EV_en_left=kWh  64.34
Simulating VPP....
- VPP.Simulation results
 LOAD_INFO: Sum_Energy=KWh  -623.62 , over-consume=KWh  1116.61 , under-consume=KWh  1740.23 , Total_cost=€  -3.11 , Overcost=€  40.4 
 EV_INFO: Av.EV_energy_leaving=kWh  61.19 , Std.EV_energy_leaving=kWh  23.15 , EV_departures =  1819 , EV_queue_left =  2
SCORE:  Cumulative_reward= 494101.31 - Step_rewars (load_t= 458008.84, EVs_energy_t= -2532.99)
 - Final_rewards (EVs_energy= 21233.73, Overconsume= -108.75, Underconsume= 1986.81, Overcost= 15513.67)
Episode:1 Score:494101.3112564697
- ELVIS.Simulation (Av.EV_SOC=  50.0 %):
 Sum_Energy=kWh  35295.36 , over-consume=kWh  50222.15 , under-consume=kWh  14926.79 , To
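
The evaluation loop above threads the LSTM state through successive `predict` calls and flags the first step of each episode so that the recurrent state is reset. The sketch below isolates that pattern in a reusable function. `run_episode`, `CountingEnv`, and `EchoModel` are hypothetical names: the two stub classes only mimic the `reset`/`step` and `predict` call shapes used in this notebook so the example runs without the trained model, and they are not part of Stable-Baselines3 or the thesis code.

```python
def run_episode(env, model):
    """Run one episode and return the cumulative reward."""
    obs = env.reset()
    lstm_states = None       # no recurrent state before the first step
    episode_start = [True]   # one flag per environment; True resets the LSTM state
    done, score = False, 0.0
    while not done:
        action, lstm_states = model.predict(
            obs, state=lstm_states, episode_start=episode_start, deterministic=True
        )
        obs, reward, done, info = env.step(action)
        episode_start = [done]
        score += reward
    return score


class CountingEnv:
    """Stub environment: ends after `horizon` steps, reward 1.0 per step."""

    def __init__(self, horizon=3):
        self.horizon, self.t = horizon, 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= self.horizon, {}


class EchoModel:
    """Stub policy: its 'state' counts the calls made since the last episode start."""

    def predict(self, obs, state=None, episode_start=None, deterministic=False):
        calls = 0 if state is None or episode_start[0] else state
        return 0, calls + 1


print(run_episode(CountingEnv(horizon=3), EchoModel()))
```

With the real `RecurrentPPO` model, the notebook passes a NumPy boolean array (`np.ones((num_envs,), dtype=bool)`) for `episode_start` rather than a plain list.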

## Testing dataset VPP Simulation using the loaded trained model [30 EVs per week]

In [10]:
#case = 'wohnblock_household_simulation_adaptive.yaml' #(loaded by default, 20 EVs arrivals per week with 50% average battery)

#Try different simulation parameters, uncomment below
#case = 'wohnblock_household_simulation_adaptive_10.yaml' #(10 EVs arrivals per week with 50% average battery) 
#case = 'wohnblock_household_simulation_adaptive_15.yaml' #(15 EVs arrivals per week with 50% average battery)
#case = 'wohnblock_household_simulation_adaptive_25.yaml' #(25 EVs arrivals per week with 50% average battery) 
case = 'wohnblock_household_simulation_adaptive_30.yaml' #(30 EVs arrivals per week with 50% average battery) 
#case = 'wohnblock_household_simulation_adaptive_35.yaml' #(35 EVs arrivals per week with 50% average battery) 

with open(elvis_input_folder + case, 'r') as file:
    yaml_str = yaml.full_load(file)

elvis_config_file = ScenarioConfig.from_yaml(yaml_str)
VPP_config_file = VPP_Scenario_config(yaml_str)

#print(elvis_config_file)
#print(VPP_config_file)

#TESTING Environment initialization
env = VPPEnv(VPP_testing_data_input_path, elvis_config_file, VPP_config_file)
#Function to check custom environment and output additional warnings if needed
check_env(env)

model = RecurrentPPO.load(RecurrentPPO_path + model_id, env=env)

#TEST Model
episodes = 10
results = []
for episode in range(1, episodes+1):
    obs = env.reset()
    done = False
    score = 0
    # cell and hidden state of the LSTM
    lstm_states = None
    num_envs = 1
    # Episode start signals are used to reset the lstm states
    episode_starts = np.ones((num_envs,), dtype=bool)
    while not done:
        #env.render()
        action_masks = get_action_masks(env)
        action, lstm_states = model.predict(obs, state=lstm_states, episode_start=episode_starts, deterministic=True) #Now using our trained model with deterministic prediction [should improve performances]
        env.lstm_state = lstm_states
        obs, reward, done, info = env.step(action)
        episode_starts = done
        score+=reward
    print('Episode:{} Score:{}'.format(episode, score))
    results.append(env.quick_results)

stacked_results = np.stack(results)
#Append a summary row holding the mean of each metric column (1 to 5) over all episodes
means = [np.mean(stacked_results[:, i].astype('float32')) for i in range(1, 6)]
results.append(np.array([str(env.EVs_n)+"_EVs_mean", *means]))
results_table.extend(results)
#Save the VPP table
#VPP_table = env.save_VPP_table(save_path='data/environment_optimized_output/VPP_table.csv')
#VPP_table = env.VPP_table

Charging event: 21901, Arrival time: 2022-01-01 11:30:00, Parking_time: 23.513976979508254, Leaving_time: 2022-01-02 11:00:50.317126, SOC: 0.5201048429008761, SOC target: 1.0, Connected car: Tesla, Model S 
 ... 
 Charging event: 23465, Arrival time: 2022-12-31 14:30:00, Parking_time: 24, Leaving_time: 2023-01-01 14:30:00, SOC: 0.33909614465306115, SOC target: 1.0, Connected car: Tesla, Model S 

-DATASET: House&RW_energy_sum=kWh  -21214.64 , over-consume=kWh  4947.18 , under-consume=kWh  -26161.81 , Total_cost=€  -489.75 , overcost=€  233.11
- ELVIS.Simulation (Av.EV_SOC=  50.0 %):
 Sum_Energy=kWh  31449.62 , over-consume=kWh  46633.2 , under-consume=kWh  15183.58 , Total_cost=€  1187.25 , overcost=€  1634.42 , Charging_events=  1565 
- Exp.VPP_goals: Energy_consumed=kWh 0, Av.load=kW 0, Std.load=kW 0, Total_cost=€ 0 , Av.EV_en_left=kWh  66.72
- ELVIS.Simulation (Av.EV_SOC=  50.0 %):
 Sum_Energy=kWh  32114.39 , over-consume=kWh  47403.07 , under-consume=kWh  15288.67 , Total_cost=€  1

## Testing dataset VPP Simulation using the loaded trained model [25 EVs per week]

In [11]:
#case = 'wohnblock_household_simulation_adaptive.yaml' #(loaded by default, 20 EVs arrivals per week with 50% average battery)

#Try different simulation parameters, uncomment below
#case = 'wohnblock_household_simulation_adaptive_10.yaml' #(10 EVs arrivals per week with 50% average battery) 
#case = 'wohnblock_household_simulation_adaptive_15.yaml' #(15 EVs arrivals per week with 50% average battery)
case = 'wohnblock_household_simulation_adaptive_25.yaml' #(25 EVs arrivals per week with 50% average battery) 
#case = 'wohnblock_household_simulation_adaptive_30.yaml' #(30 EVs arrivals per week with 50% average battery) 
#case = 'wohnblock_household_simulation_adaptive_35.yaml' #(35 EVs arrivals per week with 50% average battery) 

with open(elvis_input_folder + case, 'r') as file:
    yaml_str = yaml.full_load(file)

elvis_config_file = ScenarioConfig.from_yaml(yaml_str)
VPP_config_file = VPP_Scenario_config(yaml_str)

#print(elvis_config_file)
#print(VPP_config_file)

#TESTING Environment initialization
env = VPPEnv(VPP_testing_data_input_path, elvis_config_file, VPP_config_file)
#Function to check custom environment and output additional warnings if needed
check_env(env)

model = RecurrentPPO.load(RecurrentPPO_path + model_id, env=env)

#TEST Model
episodes = 10
results = []
for episode in range(1, episodes+1):
    obs = env.reset()
    done = False
    score = 0
    # cell and hidden state of the LSTM
    lstm_states = None
    num_envs = 1
    # Episode start signals are used to reset the lstm states
    episode_starts = np.ones((num_envs,), dtype=bool)
    while not done:
        #env.render()
        action_masks = get_action_masks(env)
        action, lstm_states = model.predict(obs, state=lstm_states, episode_start=episode_starts, deterministic=True) #Now using our trained model with deterministic prediction [should improve performances]
        env.lstm_state = lstm_states
        obs, reward, done, info = env.step(action)
        episode_starts = done
        score+=reward
    print('Episode:{} Score:{}'.format(episode, score))
    results.append(env.quick_results)

stacked_results = np.stack(results)
#Append a summary row holding the mean of each metric column (1 to 5) over all episodes
means = [np.mean(stacked_results[:, i].astype('float32')) for i in range(1, 6)]
results.append(np.array([str(env.EVs_n)+"_EVs_mean", *means]))
results_table.extend(results)
#Save the VPP table
#VPP_table = env.save_VPP_table(save_path='data/environment_optimized_output/VPP_table.csv')
#VPP_table = env.VPP_table

Charging event: 40681, Arrival time: 2022-01-01 09:00:00, Parking_time: 23.909455482455495, Leaving_time: 2022-01-02 08:54:34.039737, SOC: 0.40094317622868453, SOC target: 1.0, Connected car: Tesla, Model S 
 ... 
 Charging event: 41984, Arrival time: 2022-12-31 18:30:00, Parking_time: 22.922282019087472, Leaving_time: 2023-01-01 17:25:20.215269, SOC: 0.621295513204638, SOC target: 1.0, Connected car: Tesla, Model S 

-DATASET: House&RW_energy_sum=kWh  -21214.64 , over-consume=kWh  4947.18 , under-consume=kWh  -26161.81 , Total_cost=€  -489.75 , overcost=€  233.11
- ELVIS.Simulation (Av.EV_SOC=  50.0 %):
 Sum_Energy=kWh  27647.67 , over-consume=kWh  43890.21 , under-consume=kWh  16242.55 , Total_cost=€  1078.47 , overcost=€  1549.08 , Charging_events=  1304 
- Exp.VPP_goals: Energy_consumed=kWh 0, Av.load=kW 0, Std.load=kW 0, Total_cost=€ 0 , Av.EV_en_left=kWh  70.06
- ELVIS.Simulation (Av.EV_SOC=  50.0 %):
 Sum_Energy=kWh  26452.34 , over-consume=kWh  42716.56 , under-consume=kWh  162

## Testing dataset VPP Simulation using the loaded trained model [20 EVs per week]

In [12]:
case = 'wohnblock_household_simulation_adaptive.yaml' #(loaded by default, 20 EVs arrivals per week with 50% average battery)

#Try different simulation parameters, uncomment below
#case = 'wohnblock_household_simulation_adaptive_10.yaml' #(10 EVs arrivals per week with 50% average battery) 
#case = 'wohnblock_household_simulation_adaptive_15.yaml' #(15 EVs arrivals per week with 50% average battery)
#case = 'wohnblock_household_simulation_adaptive_25.yaml' #(25 EVs arrivals per week with 50% average battery) 
#case = 'wohnblock_household_simulation_adaptive_30.yaml' #(30 EVs arrivals per week with 50% average battery) 
#case = 'wohnblock_household_simulation_adaptive_35.yaml' #(35 EVs arrivals per week with 50% average battery) 

with open(elvis_input_folder + case, 'r') as file:
    yaml_str = yaml.full_load(file)

elvis_config_file = ScenarioConfig.from_yaml(yaml_str)
VPP_config_file = VPP_Scenario_config(yaml_str)

#print(elvis_config_file)
#print(VPP_config_file)

#TESTING Environment initialization
env = VPPEnv(VPP_testing_data_input_path, elvis_config_file, VPP_config_file)
#Function to check custom environment and output additional warnings if needed
check_env(env)

model = RecurrentPPO.load(RecurrentPPO_path + model_id, env=env)

#TEST Model
episodes = 10
results = []
for episode in range(1, episodes+1):
    obs = env.reset()
    done = False
    score = 0
    # cell and hidden state of the LSTM
    lstm_states = None
    num_envs = 1
    # Episode start signals are used to reset the lstm states
    episode_starts = np.ones((num_envs,), dtype=bool)
    while not done:
        #env.render()
        action_masks = get_action_masks(env)
        action, lstm_states = model.predict(obs, state=lstm_states, episode_start=episode_starts, deterministic=True) #Now using our trained model with deterministic prediction [should improve performances]
        env.lstm_state = lstm_states
        obs, reward, done, info = env.step(action)
        episode_starts = done
        score+=reward
    print('Episode:{} Score:{}'.format(episode, score))
    results.append(env.quick_results)

stacked_results = np.stack(results)
#Append a summary row holding the mean of each metric column (1 to 5) over all episodes
means = [np.mean(stacked_results[:, i].astype('float32')) for i in range(1, 6)]
results.append(np.array([str(env.EVs_n)+"_EVs_mean", *means]))
results_table.extend(results)
#Save the VPP table
#VPP_table = env.save_VPP_table(save_path='data/environment_optimized_output/VPP_table.csv')
#VPP_table = env.VPP_table

Charging event: 56329, Arrival time: 2022-01-01 05:15:00, Parking_time: 23.871661576692404, Leaving_time: 2022-01-02 05:07:17.981676, SOC: 0.475612665146233, SOC target: 1.0, Connected car: Tesla, Model S 
 ... 
 Charging event: 57371, Arrival time: 2022-12-31 22:15:00, Parking_time: 23.13127028383383, Leaving_time: 2023-01-01 21:22:52.573022, SOC: 0.4620725846788589, SOC target: 1.0, Connected car: Tesla, Model S 

-DATASET: House&RW_energy_sum=kWh  -21214.64 , over-consume=kWh  4947.18 , under-consume=kWh  -26161.81 , Total_cost=€  -489.75 , overcost=€  233.11
- ELVIS.Simulation (Av.EV_SOC=  50.0 %):
 Sum_Energy=kWh  20965.79 , over-consume=kWh  37875.0 , under-consume=kWh  16909.21 , Total_cost=€  834.32 , overcost=€  1335.94 , Charging_events=  1043 
- Exp.VPP_goals: Energy_consumed=kWh 0, Av.load=kW 0, Std.load=kW 0, Total_cost=€ 0 , Av.EV_en_left=kWh  75.08
- ELVIS.Simulation (Av.EV_SOC=  50.0 %):
 Sum_Energy=kWh  22242.56 , over-consume=kWh  39289.59 , under-consume=kWh  17047.0

## Testing dataset VPP Simulation using the loaded trained model [15 EVs per week]


In [13]:
#case = 'wohnblock_household_simulation_adaptive.yaml' #(loaded by default, 20 EVs arrivals per week with 50% average battery)

#Try different simulation parameters, uncomment below
#case = 'wohnblock_household_simulation_adaptive_10.yaml' #(10 EVs arrivals per week with 50% average battery) 
case = 'wohnblock_household_simulation_adaptive_15.yaml' #(15 EVs arrivals per week with 50% average battery)
#case = 'wohnblock_household_simulation_adaptive_25.yaml' #(25 EVs arrivals per week with 50% average battery) 
#case = 'wohnblock_household_simulation_adaptive_30.yaml' #(30 EVs arrivals per week with 50% average battery) 
#case = 'wohnblock_household_simulation_adaptive_35.yaml' #(35 EVs arrivals per week with 50% average battery) 

with open(elvis_input_folder + case, 'r') as file:
    yaml_str = yaml.full_load(file)

elvis_config_file = ScenarioConfig.from_yaml(yaml_str)
VPP_config_file = VPP_Scenario_config(yaml_str)

#print(elvis_config_file)
#print(VPP_config_file)

#TESTING Environment initialization
env = VPPEnv(VPP_testing_data_input_path, elvis_config_file, VPP_config_file)
#Function to check custom environment and output additional warnings if needed
check_env(env)

model = RecurrentPPO.load(RecurrentPPO_path + model_id, env=env)

#TEST Model
episodes = 10
results = []
for episode in range(1, episodes+1):
    obs = env.reset()
    done = False
    score = 0
    # cell and hidden state of the LSTM
    lstm_states = None
    num_envs = 1
    # Episode start signals are used to reset the lstm states
    episode_starts = np.ones((num_envs,), dtype=bool)
    while not done:
        #env.render()
        action_masks = get_action_masks(env)
        action, lstm_states = model.predict(obs, state=lstm_states, episode_start=episode_starts, deterministic=True) #Now using our trained model with deterministic prediction [should improve performances]
        env.lstm_state = lstm_states
        obs, reward, done, info = env.step(action)
        episode_starts = done
        score+=reward
    print('Episode:{} Score:{}'.format(episode, score))
    results.append(env.quick_results)

stacked_results = np.stack(results)
#Append a summary row holding the mean of each metric column (1 to 5) over all episodes
means = [np.mean(stacked_results[:, i].astype('float32')) for i in range(1, 6)]
results.append(np.array([str(env.EVs_n)+"_EVs_mean", *means]))
results_table.extend(results)
#Save the VPP table
#VPP_table = env.save_VPP_table(save_path='data/environment_optimized_output/VPP_table.csv')
#VPP_table = env.VPP_table

Charging event: 68845, Arrival time: 2022-01-01 15:00:00, Parking_time: 22.99208398693717, Leaving_time: 2022-01-02 13:59:31.502353, SOC: 0.6256760243166473, SOC target: 1.0, Connected car: Tesla, Model S 
 ... 
 Charging event: 69627, Arrival time: 2022-12-31 15:00:00, Parking_time: 24, Leaving_time: 2023-01-01 15:00:00, SOC: 0.5312848381047637, SOC target: 1.0, Connected car: Tesla, Model S 

-DATASET: House&RW_energy_sum=kWh  -21214.64 , over-consume=kWh  4947.18 , under-consume=kWh  -26161.81 , Total_cost=€  -489.75 , overcost=€  233.11
- ELVIS.Simulation (Av.EV_SOC=  50.0 %):
 Sum_Energy=kWh  13585.86 , over-consume=kWh  32260.83 , under-consume=kWh  18674.97 , Total_cost=€  626.68 , overcost=€  1163.45 , Charging_events=  783 
- Exp.VPP_goals: Energy_consumed=kWh 0, Av.load=kW 0, Std.load=kW 0, Total_cost=€ 0 , Av.EV_en_left=kWh  83.41
- ELVIS.Simulation (Av.EV_SOC=  50.0 %):
 Sum_Energy=kWh  14216.51 , over-consume=kWh  32882.58 , under-consume=kWh  18666.07 , Total_cost=€  640.

## Testing dataset VPP Simulation using the loaded trained model [10 EVs per week]


In [14]:
#case = 'wohnblock_household_simulation_adaptive.yaml' #(loaded by default, 20 EVs arrivals per week with 50% average battery)

#Try different simulation parameters, uncomment below
case = 'wohnblock_household_simulation_adaptive_10.yaml' #(10 EVs arrivals per week with 50% average battery) 
#case = 'wohnblock_household_simulation_adaptive_15.yaml' #(15 EVs arrivals per week with 50% average battery)
#case = 'wohnblock_household_simulation_adaptive_25.yaml' #(25 EVs arrivals per week with 50% average battery) 
#case = 'wohnblock_household_simulation_adaptive_30.yaml' #(30 EVs arrivals per week with 50% average battery) 
#case = 'wohnblock_household_simulation_adaptive_35.yaml' #(35 EVs arrivals per week with 50% average battery) 

with open(elvis_input_folder + case, 'r') as file:
    yaml_str = yaml.full_load(file)

elvis_config_file = ScenarioConfig.from_yaml(yaml_str)
VPP_config_file = VPP_Scenario_config(yaml_str)

#print(elvis_config_file)
#print(VPP_config_file)

#TESTING Environment initialization
env = VPPEnv(VPP_testing_data_input_path, elvis_config_file, VPP_config_file)
#Function to check custom environment and output additional warnings if needed
check_env(env)

model = RecurrentPPO.load(RecurrentPPO_path + model_id, env=env)

#TEST Model
episodes = 10
results = []
for episode in range(1, episodes+1):
    obs = env.reset()
    done = False
    score = 0
    # cell and hidden state of the LSTM
    lstm_states = None
    num_envs = 1
    # Episode start signals are used to reset the lstm states
    episode_starts = np.ones((num_envs,), dtype=bool)
    while not done:
        #env.render()
        action_masks = get_action_masks(env)
        action, lstm_states = model.predict(obs, state=lstm_states, episode_start=episode_starts, deterministic=True) #Now using our trained model with deterministic prediction [should improve performances]
        env.lstm_state = lstm_states
        obs, reward, done, info = env.step(action)
        episode_starts = done
        score+=reward
    print('Episode:{} Score:{}'.format(episode, score))
    results.append(env.quick_results)

stacked_results = np.stack(results)
#Append a summary row holding the mean of each metric column (1 to 5) over all episodes
means = [np.mean(stacked_results[:, i].astype('float32')) for i in range(1, 6)]
results.append(np.array([str(env.EVs_n)+"_EVs_mean", *means]))
results_table.extend(results)
#Save the VPP table
#VPP_table = env.save_VPP_table(save_path='data/environment_optimized_output/VPP_table.csv')
#VPP_table = env.VPP_table

Charging event: 78241, Arrival time: 2022-01-01 10:30:00, Parking_time: 23.77738311466849, Leaving_time: 2022-01-02 10:16:38.579213, SOC: 0.5678533035591383, SOC target: 1.0, Connected car: Tesla, Model S 
 ... 
 Charging event: 78762, Arrival time: 2022-12-30 13:30:00, Parking_time: 24, Leaving_time: 2022-12-31 13:30:00, SOC: 0.5354761645811792, SOC target: 1.0, Connected car: Tesla, Model S 

-DATASET: House&RW_energy_sum=kWh  -21214.64 , over-consume=kWh  4947.18 , under-consume=kWh  -26161.81 , Total_cost=€  -489.75 , overcost=€  233.11
- ELVIS.Simulation (Av.EV_SOC=  50.0 %):
 Sum_Energy=kWh  4087.49 , over-consume=kWh  24274.81 , under-consume=kWh  20187.32 , Total_cost=€  314.04 , overcost=€  884.72 , Charging_events=  522 
- Exp.VPP_goals: Energy_consumed=kWh 0, Av.load=kW 0, Std.load=kW 0, Total_cost=€ 0 , Av.EV_en_left=kWh  100.12
- ELVIS.Simulation (Av.EV_SOC=  50.0 %):
 Sum_Energy=kWh  3741.02 , over-consume=kWh  23970.71 , under-consume=kWh  20229.69 , Total_cost=€  291.35

In [17]:
dataframe_results = pd.DataFrame(np.stack(results_table))
dataframe_results = dataframe_results.rename(columns={0:"Name",1:"underconsume",2:"overconsume",3:"overcost",4:"av_EV_energy_left",5:"cumulative_reward"})
dataframe_results.to_csv("data/algorithms_results/algorithms_results_table/EV_experiments_testing.csv")
dataframe_results.head()

|  | Name | underconsume | overconsume | overcost | av_EV_energy_left | cumulative_reward |
|---|---|---|---|---|---|---|
| 0 | 35_EVs | 1740.2299760413978 | 1116.610748278452 | 40.402471919896186 | 61.1935769734227 | 494101.31125646905 |
| 1 | 35_EVs | 2040.072032628344 | 1269.3679357680962 | 45.34016568319158 | 60.99147676284563 | 485968.2610965516 |
| 2 | 35_EVs | 1665.674890127379 | 1218.847777189692 | 44.33780118804909 | 60.902504047930144 | 494275.97204137314 |
| 3 | 35_EVs | 1888.572752910802 | 1163.182166110694 | 40.450890238305945 | 61.35232785024917 | 496216.68346189056 |
| 4 | 35_EVs | 2510.634329579343 | 1402.0289256254084 | 49.10034210311356 | 61.254037675430205 | 478613.11590071855 |
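
The per-scenario summary rows appended as `<N>_EVs_mean` can be cross-checked from the episode rows. A minimal standard-library sketch follows, using the five 35-EV rows shown above with `cumulative_reward` rounded to two decimals. `mean_by_name` is a hypothetical helper; on the full table, a pandas `groupby` on the `Name` column would serve the same purpose.

```python
from collections import defaultdict
from statistics import mean

# (Name, cumulative_reward) pairs copied from the table above, rounded to 2 decimals.
rows = [
    ("35_EVs", 494101.31),
    ("35_EVs", 485968.26),
    ("35_EVs", 494275.97),
    ("35_EVs", 496216.68),
    ("35_EVs", 478613.12),
]


def mean_by_name(pairs):
    """Group values by scenario name and return the mean of each group."""
    groups = defaultdict(list)
    for name, value in pairs:
        groups[name].append(value)
    return {name: mean(values) for name, values in groups.items()}


print(mean_by_name(rows))
```

Because only the first five of the ten episodes are shown, the value computed here is a partial mean and will generally differ from the stored `35_EVs_mean` row.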
