<div style="background:#FFFFAA">
<img src="./logo.jpg" width="150" align="left" border="20">
<h1>L2RPN Starting Kit </h1> 

<br>This code was tested with <br>
Python 3.6.6 |Anaconda custom (64-bit)| (default, Nov 2018, 11:07:29) (https://anaconda.org/)<br>
<i> Adapted for Chalab by Isabelle Guyon from original code of Balázs Kégl</i> <br>
<a href="http://www.datascience-paris-saclay.fr">Paris Saclay Center for Data Science (CDS)</a>
</center>
<p>
ALL INFORMATION, SOFTWARE, DOCUMENTATION, AND DATA ARE PROVIDED "AS-IS". The CDS, CHALEARN, AND/OR OTHER ORGANIZERS OR CODE AUTHORS DISCLAIM ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR ANY PARTICULAR PURPOSE, AND THE WARRANTY OF NON-INFRINGEMENT OF ANY THIRD PARTY'S INTELLECTUAL PROPERTY RIGHTS. IN NO EVENT SHALL AUTHORS AND ORGANIZERS BE LIABLE FOR ANY SPECIAL, 
INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF SOFTWARE, DOCUMENTS, MATERIALS, PUBLICATIONS, OR INFORMATION MADE AVAILABLE FOR THE CHALLENGE. 
</div>

<div style="background:#FFFFAA">
    <h2>Introduction </h2>
    <p> 
     <br>
The goal of this challenge is to apply Reinforcement Learning (RL) to power grid management by designing RL agents that automate the control of the power grid. The data used in this challenge comes from <a href="https://github.com/MarvinLer/pypownet">pypownet</a>, a simulator made by Marvin Lerousseau. It can emulate a power grid of any size and electrical properties, subject to a set of temporal injections at discretized time-steps.

References and credits: <br>
pypownet was founded by Marvin Lerousseau (Pypownet, 2017. https://github.com/MarvinLer/pypownet). The competition protocol was designed by Isabelle Guyon. Our mentors are Balthazar Donon and Antoine Marot. The baseline methods were inspired by work performed by Kimang Khun.
 <br> 
</div>

In [22]:
model_dir = 'example_submission/'
problem_dir = 'ingestion_program/'  
score_dir = 'scoring_program/'
input_dir = 'public_data/'
output_dir = 'output/'
from sys import path
# Make the kit's directories importable
for directory in (model_dir, problem_dir, score_dir, input_dir, output_dir):
    path.append(directory)
%matplotlib inline
# Auto-reload libraries (this causes some problems with pickles in Python 3); comment out the next two lines to disable
%load_ext autoreload
%autoreload 2
import seaborn as sns; sns.set()
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload



<div style="background:#FFFFAA">
    <h1> Step 1: Exploratory data analysis </h1>
<p>
We provide data with the starting kit.
    <br>
</div>

## Electrical grid
<div >
<img src="./ExampleGrid.JPG" width="750" align="left" border="20">
    <br>
    <br>
(courtesy of Marvin Lerousseau)
</div>

During the challenge, a grid of 14 substations is given. 20 lines connect the nodes of the network.

For the following example, we take the hard level, in the case where there are 11 loads and 5 productions. Furthermore, only the data for January is shown.

In [23]:
data_dir = 'public_data/hard'              # Change this to the directory where you put the input data
!ls $data_dir*

chronics  configuration.yaml  reference_grid.m


For convenience, we load the data as a "pandas" data frame, so we can use "pandas" to explore the data.

<div style="background:#FFFFAA">
<h1>Step 2: Building an Agent</h1>
</div>

<div style="background:#FFFFAA">
    <h2>Loading data with pypownet</h2>
    <p>
We reload the data with the environment class of pypownet.
   <br>
    
To win, flows in a line have to stay under a threshold. Above this threshold, the line will overheat and after a certain amount of overheating, the line will break. Thermal limits are already defined in pypownet.
</div>

In [24]:
import os
import pypownet.environment
import pypownet.runner
data_dir = 'public_data'  
environment = pypownet.environment.RunEnv(parameters_folder=os.path.abspath(data_dir),
                                              game_level="hard",
                                              chronic_looping_mode='natural', start_id=0,
                                              game_over_mode="soft")


                     GAME PARAMETERS
    loadflow_backend: pypower
    loadflow_mode: AC
    n_timesteps_consecutive_soft_overflow_breaks: 10
    n_timesteps_hard_overflow_is_broken: 10
    n_timesteps_horizon_maintenance: 48
    max_number_loads_game_over: 6
    hard_overflow_coefficient: 1.0
    n_timesteps_soft_overflow_is_broken: 10
    max_seconds_per_timestep: 1.0
    max_number_prods_game_over: 3



<div style="background:#FFFFAA">
    <h2>Building an agent</h2>
    <p>
We provide an example agent (for reinforcement learning) in the `starting_kit/example_submission` directory. It is a very naive agent: it does nothing. Replace it with your own agent.
    </div>

## Scoring the results of an agent

<div style="background:#FFFFAA">
    <br>
    <p>
<b>The metric chosen for your challenge</b> is identified in the "metric.txt" file found in the `scoring_program/` directory. The function "get_metric" searches first for a metric having that name in my_metric.py, then in libscores.py, then in sklearn.metrics.
    <br>
The aim of a reinforcement learning problem is to maximize the reward function.

When running the agent, two values are returned: the first one is the reward of the last timestep and the second one is the cumulative reward over all the iterations of the run. The reward indicates whether or not the game is heading towards a game over.

Specifically, our reward function is composed of 5 subrewards. They describe the proportion of isolated productions, the proportion of isolated loads, the cost of an action, the amount of change between the current grid and the initial grid, and lastly the capacity usage of the lines. 
    </div>
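
As a rough illustration of the reward description above, the sketch below sums five subrewards into a per-timestep reward and accumulates it over a run. The helper `run_rewards`, the ordering of the subrewards and all numeric values are made up for the example; the actual terms are computed by the environment's reward signal.

```python
def run_rewards(subrewards_per_step):
    """Return (last_reward, cumulative_reward) for a run.

    subrewards_per_step: one list of subrewards per timestep, ordered here as
    [isolated productions, isolated loads, action cost, distance to the
    initial grid, line capacity usage] (ordering assumed for illustration).
    """
    last_reward = 0.0
    cumulative_reward = 0.0
    for subrewards in subrewards_per_step:
        last_reward = sum(subrewards)      # reward of the current timestep
        cumulative_reward += last_reward   # running total over the run
    return last_reward, cumulative_reward


# Two hypothetical timesteps with made-up subreward values
steps = [
    [0.0, 0.0, -0.5, 0.0, -0.25],
    [0.0, -1.0, 0.0, -0.25, -0.5],
]
last, total = run_rewards(steps)
print(last, total)
```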

In [25]:
from scoring_program import libscores
from libscores import get_metric
metric_name, scoring_function = get_metric()
print('Using scoring metric:', metric_name)
# Uncomment the next line to display the code of the scoring metric
#??scoring_function

Using scoring metric: reward
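
The lookup order described earlier (my_metric.py, then libscores.py, then sklearn.metrics) amounts to searching a list of namespaces for the first one that defines the requested name. The sketch below illustrates this pattern with stand-in namespaces. `find_metric` is a hypothetical helper, not the `get_metric` of libscores.py, whose implementation is not shown in this notebook.

```python
from types import SimpleNamespace


def find_metric(name, namespaces):
    """Return the first callable called `name` found in `namespaces`, searched in order."""
    for namespace in namespaces:
        metric = getattr(namespace, name, None)
        if callable(metric):
            return metric
    raise ValueError('Metric %r not found' % name)


# Stand-ins for my_metric.py and libscores.py (contents made up for the example)
my_metric_stub = SimpleNamespace()
libscores_stub = SimpleNamespace(reward=lambda values: sum(values))

# 'reward' is absent from the first namespace, so the second one is used
scoring_function = find_metric('reward', [my_metric_stub, libscores_stub])
print(scoring_function([1.0, -2.5]))
```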


In [26]:
import time
start = time.time()
end = time.time()
print(end-start)

3.123283386230469e-05


In [27]:
import pypownet.agent

class CustomAgent(pypownet.agent.Agent):
    """
    An example of a baseline controller that does nothing: at each timestep it returns the do-nothing action, leaving
    the grid unchanged.
    """
    def __init__(self, environment):
        super().__init__(environment)
    
    def act(self, observation):
        """ Produces an action given an observation of the environment. Takes as argument an observation of the current
        power grid, and returns the chosen action."""
        # Sanity check: an observation is a structured object defined in the environment file.
        assert isinstance(observation, pypownet.environment.Observation)
        #print(" DO NOTHING AGENT !!! ")
        action_space = self.environment.action_space

        # Implement your policy here
        # Example of the do-nothing policy that produces no action (i.e. an action that does nothing) each time
        do_nothing_action = action_space.get_do_nothing_action()

        # Sanity check: verify the good overall structure of the returned action; raises exceptions if not valid
        assert action_space.verify_action_shape(do_nothing_action)
        return do_nothing_action

        # No learning (i.e. self.feed_reward does pass)

In [28]:
import logging
import sys
import time
start = time.time()
NUMBER_ITERATIONS = 1000

submission_dir = 'example_submission'
sys.path.append(submission_dir)

if not os.path.exists(output_dir):
    os.makedirs(output_dir)
log_path = os.path.abspath(os.path.join(output_dir, 'runner.log'))


open(log_path, 'w').close()
submitted_controler = CustomAgent(environment)
# Instantiate a runner that will save the run statistics in the log_path file, to be parsed and processed
# by the scoring program
phase_runner = pypownet.runner.Runner(environment, submitted_controler, verbose=True, vverbose=False,
                                      log_filepath=log_path)
phase_runner.ch.setLevel(logging.ERROR)
# Run the planned experiment of this phase with the submitted model
score = phase_runner.loop(iterations=NUMBER_ITERATIONS)
print("cumulative rewards : {}".format(score))
end = time.time()
print(end-start)

cumulative rewards : -362.6689729381084
12.930731534957886


<div style="background:#FFFFAA">
    <b> Save the best agent. </b> It should be a class named <code>Submission</code>, saved in "example_submission/submission.py". Uncomment the line <i>%%writefile example_submission/submission.py</i> to save the agent.
</div>

In [29]:
#%%writefile example_submission/submission.py
import pypownet.agent
import pypownet.environment
import numpy as np
import os

class Submission(pypownet.agent.Agent):
    """
    An example of a baseline controller that switches the status of every power line whose capacity usage exceeds 1
    (i.e. every overloaded line), and otherwise does nothing.
    """

    def __init__(self, environment):
        super().__init__(environment)
        self.verbose = True

    def act(self, observation):
        # Sanity check: an observation is a structured object defined in the environment file.
        assert isinstance(observation, pypownet.environment.Observation)
        action_space = self.environment.action_space

        # Create template of action with no switch activated (do-nothing action)
        action = action_space.get_do_nothing_action()

        # Select lines to switch: activate the status switch of each line whose capacity usage exceeds 1
        lines_load = observation.get_lines_capacity_usage()
        nb_lines = len(lines_load)
        assert nb_lines == action_space.lines_status_subaction_length
        for i in range(nb_lines):
            if lines_load[i] > 1:
                action_space.set_lines_status_switch_from_id(action=action, line_id=i, new_switch_value=1)
                if self.verbose:
                    print('Action chosen: switching status of line %d' % i)

        # Test the reward on the environment (the reward is only known after this simulation)
        reward_aslist = self.environment.simulate(action, do_sum=False)
        reward = sum(reward_aslist)
        if self.verbose:
            print('reward: [', ', '.join(['%.2f' % c for c in reward_aslist]), '] =', reward)

        # No learning (i.e. self.feed_reward does pass)
        return action


<div style="background:#FFFFAA">
<h1> Step 3: Making a submission </h1> 

<h2> Unit testing </h2> 

It is <b><span style="color:red">important that you test your submission files before submitting them</span></b>. All you have to do to make a submission is modify the file <code>submission.py</code> in the <code>starting_kit/example_submission/</code> directory, then run this test to make sure everything works fine. This is the actual program that will be run on the server to test your submission. 
<br>
Keep the sample code simple.
</div>

In [30]:
!python $problem_dir/ingestion.py $input_dir $input_dir/res $problem_dir $model_dir

input dir: /home/tp-home010/nbereux/projects/Grid/starting_kit/public_data
output dir: /home/tp-home010/nbereux/projects/Grid/starting_kit/public_data/res
program dir: /home/tp-home010/nbereux/projects/Grid/starting_kit/ingestion_program
submission dir: /home/tp-home010/nbereux/projects/Grid/starting_kit/example_submission
input content ['__pycache__', 'easy', 'hard', 'level0', 'medium', 'res', 'reward_signal.py']
output content ['runner.log']
program content ['__pycache__', 'data_converter.py', 'data_io.py', 'data_manager.py', 'ingestion.py', 'metadata']
submission content ['metadata', '__pycache__', 'baseline_agents.py', 'my_agents.py', 'preprocessing.py', 'submission.py', 'saved_actions.csv', 'saved_rewards.csv', 'saved_states.csv']
Using custom reward signal CustomRewardSignal of file /home/tp-home010/nbereux/projects/Grid/starting_kit/public_data/reward_signal.py

                     GAME PARAMETERS
    loadflow_mode: AC
    n_timesteps_horizon_maintenance: 48
    n_timesteps_con


<div style="background:#FFFFAA">
Also test the scoring program:
    </div>

In [31]:
scoring_output_dir = 'output'
!python $score_dir/evaluate.py $input_dir $scoring_output_dir

public_data/
output
step : 8, cumulative rewards : -1.59


<div style="background:#FFFFAA">
    <h1> Preparing the submission </h1>

Zip the contents of `example_submission/` (without the directory itself), or download the challenge public_data and run the command in the previous cell, after replacing sample_data by public_data.
Then zip the contents of `sample_result_submission/` (without the directory).
<b><span style="color:red">Do NOT zip the data with your submissions</span></b>.

In [32]:
import datetime 
from data_io import zipdir
the_date = datetime.datetime.now().strftime("%y-%m-%d-%H-%M")
sample_code_submission = 'sample_code_submission_' + the_date + '.zip' 
zipdir(sample_code_submission, model_dir) 
print("Submit one of these files:\n" + sample_code_submission + "\n")

Submit one of these files:
sample_code_submission_19-04-12-09-34.zip
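
Zipping the contents "without the directory" means that archive entries are stored relative to the submission directory, so that `submission.py` sits at the root of the zip. The sketch below shows one way to do this with the standard `zipfile` module. `zip_contents` is a hypothetical helper written for illustration and may differ from the `zipdir` function of `data_io`.

```python
import os
import zipfile


def zip_contents(archive_path, source_dir):
    """Zip every file under source_dir, storing paths relative to source_dir.

    archive_path should be located outside source_dir, otherwise the archive
    being written would be picked up by the directory walk.
    """
    with zipfile.ZipFile(archive_path, 'w', zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(source_dir):
            for filename in files:
                full_path = os.path.join(root, filename)
                # arcname drops the leading directory so files sit at the zip root
                archive.write(full_path, arcname=os.path.relpath(full_path, source_dir))
```

Listing the archive with `zipfile.ZipFile(archive_path).namelist()` before submitting is a quick way to check that no data files were included by mistake.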

