<div style="font-size:200%;font-weight:bold">Energy Storage System</div>

This notebook demonstrates how to train an RL agent for Energy Storage System (ESS) arbitrage. The simulated energy environment is based on the paper [Arbitrage of Energy Storage in Electricity Markets with Deep Reinforcement Learning](https://arxiv.org/abs/1904.12232) and uses [this sample dataset](https://aemo.com.au/en/energy-systems/electricity/national-electricity-market-nem/data-nem/aggregated-data).

# Prerequisites

Ensure that your Python virtual environment has the energy storage system Python package installed (`pip install -e .`). Then execute the next two cells to download the sample data and merge it into a local file called `data/sample-data.csv`.

In [None]:
%%capture
!bash download_data.sh

In [None]:
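import glob

import pandas as pd

# Collect the downloaded CSV files (sorted so the read order is deterministic).
files = sorted(glob.glob("data/PRICE_AND_DEMAND*.csv"))

# Merge them into one frame, order the rows by settlement time, and save.
df = pd.concat([pd.read_csv(f) for f in files], axis=0, ignore_index=True)
df.sort_values('SETTLEMENTDATE', inplace=True)
df.to_csv("data/sample-data.csv", index=False)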

# Global config

In [None]:
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
%load_ext autoreload
%autoreload 2

import numpy as np
import pandas as pd
from typing import List

from energy_storage_system.agents import MovingAveragePriceAgent, PriceVsCostAgent, RandomAgent
from energy_storage_system.envs import SimpleBattery
from energy_storage_system.utils import evaluate_episode, plot_reward, plot_analysis, train

env_config = {
    "MAX_STEPS_PER_EPISODE": 168,
    "LOCAL": True,  # True means to use data from local src folder instead of S3.
    "FILEPATH": "data/sample-data.csv"
}
env = SimpleBattery(env_config)
episodes = 3000

The next cell defines a helper function `train_eval()` that trains an agent, evaluates it, and plots the results. This function will be used to evaluate three baseline agents:

1. a random agent
2. an agent that considers market price vs cost
3. an agent that considers the moving average of market price

In [None]:
def train_eval(env, agent, episodes) -> pd.DataFrame:
    """Helper function to train, evaluate, and plot."""
    # Training
    train_results = train(env, agent, episodes)
    plot_reward(train_results.rewards_list)  # Jupyter autoplots the returned fig
    print("Average rewards across training episodes:", train_results.mean_rewards)

    # Evaluation
    df_eval = evaluate_episode(agent, env)
    plot_analysis(df_eval)  # Jupyter autoplots the returned fig

    return df_eval

# Random Agent

Train an agent that acts randomly.

**Policy evaluation and observation**: the agent's actions are entirely random, regardless of price and cost.
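
A random policy of this kind can be sketched in a few lines. This is a minimal illustration, not the implementation of `RandomAgent`: the function name `random_action` is hypothetical, the action codes are the ones listed later in this notebook, and the sketch uses NumPy's `Generator` API rather than the global `np.random.seed` used in the cells here.

```python
import numpy as np

# Action codes as listed later in this notebook.
CHARGE, DISCHARGE, HOLD = 0, 1, 2


def random_action(rng: np.random.Generator) -> int:
    """Hypothetical helper: pick one of the three actions uniformly at random."""
    return int(rng.integers(0, 3))  # upper bound is exclusive, so this yields 0, 1, or 2


# A seeded generator makes the action sequence reproducible across runs.
rng = np.random.default_rng(1)
actions = [random_action(rng) for _ in range(10)]
print(actions)
```

Because the choice ignores price and cost, any reward this policy earns is a useful lower reference point for the other agents.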

In [None]:
np.random.seed(1)
df_eval_random = train_eval(env, RandomAgent(), episodes)

# Market price vs cost agent

This agent behaves as follows:

- SELL: when the market price is higher than the cost
- BUY: when the market price is lower than the cost
- HOLD: otherwise

**Policy evaluation and observation**: the agent discharges (sell: 1) when the price is higher than the cost, and charges (buy: 0) when the price is lower than the cost. The action encoding is:

    CHARGE = 0
    DISCHARGE = 1
    HOLD = 2
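
The rule above can be written as a small standalone function. This is a sketch under stated assumptions, not the code of `PriceVsCostAgent`: the name `price_vs_cost_action` is hypothetical, and `cost` stands for whatever cost figure the environment exposes to the agent.

```python
# Action codes from the listing above.
CHARGE, DISCHARGE, HOLD = 0, 1, 2


def price_vs_cost_action(price: float, cost: float) -> int:
    """Hypothetical helper: map a (price, cost) pair to an action code."""
    if price > cost:
        return DISCHARGE  # SELL: the market pays more than the cost
    if price < cost:
        return CHARGE     # BUY: the market price is below the cost
    return HOLD           # price equals cost: do nothing


# Illustrative prices against an illustrative cost of 50.0.
actions = [price_vs_cost_action(p, cost=50.0) for p in (40.0, 50.0, 65.0)]
print(actions)  # [0, 2, 1]
```

With strict inequalities, HOLD only occurs when the price exactly equals the cost, so in practice this policy trades at almost every step.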

In [None]:
np.random.seed(1)
df_eval_price_vs_cost = train_eval(env, PriceVsCostAgent(), episodes)

# Save the evaluation episode.
df_eval_price_vs_cost.to_csv("result_price_vs_cost_agent.csv", index=False)

# Market price vs historical price agent

This agent behaves as follows:

- SELL: when the market price is higher than the average price over the past 5 days
- BUY: when the market price is lower than the average price over the past 5 days
- HOLD: otherwise

**Policy evaluation and observation**: the agent sells when the market price rises above the average of the last 5 days, and buys when the market price drops below it. The action encoding is:

    CHARGE = 0
    DISCHARGE = 1
    HOLD = 2
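
The moving-average rule can be sketched with a fixed-length price history. This is an illustration only, not the code of `MovingAveragePriceAgent`: the class name `MovingAverageRule` is hypothetical, and `window` is expressed in number of observations, so converting "5 days" into a window size depends on the environment's step length, which is assumed here rather than taken from the package.

```python
from collections import deque

# Action codes from the listing above.
CHARGE, DISCHARGE, HOLD = 0, 1, 2


class MovingAverageRule:
    """Hypothetical sketch: compare each price with the mean of recent prices."""

    def __init__(self, window: int):
        # A deque with maxlen drops the oldest price automatically.
        self.history = deque(maxlen=window)

    def act(self, price: float) -> int:
        if not self.history:
            action = HOLD  # no past prices yet, so there is nothing to compare against
        else:
            average = sum(self.history) / len(self.history)
            if price > average:
                action = DISCHARGE  # SELL above the recent average
            elif price < average:
                action = CHARGE     # BUY below the recent average
            else:
                action = HOLD
        # Record the price only after deciding, so the average uses past prices.
        self.history.append(price)
        return action


rule = MovingAverageRule(window=3)
actions = [rule.act(p) for p in (10.0, 20.0, 15.0, 5.0, 15.0)]
print(actions)  # [2, 1, 2, 0, 1]
```

Unlike the price-vs-cost rule, this policy needs no cost signal: it reacts only to how the current price compares with its own recent history.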


In [None]:
np.random.seed(1)
df_eval_ma = train_eval(env, MovingAveragePriceAgent(), episodes)

# Save the evaluation episode.
df_eval_ma.to_csv("result_hist_price_agent.csv", index=False)

# SageMaker RL - DQN

The next step is to use the DQN algorithm running on SageMaker RL. Please refer to the separate notebook for more information.

