<a class="anchor" id="nutshell"></a>
<h2 style="font-family:'Verdana',sans-serif; color:#1D7874;">L2RPN ICAPS Opponent</h2>
For a more detailed explanation of the opponent, please refer to the notebook <i>9_EnvironmentModifications.ipynb</i> of the official grid2op getting started repository. It is available at <a href="https://github.com/rte-france/Grid2Op/tree/master/getting_started"> https://github.com/rte-france/Grid2Op/tree/master/getting_started </a> (if you download grid2op) and can be run interactively (without any installation) in a browser thanks to mybinder at this link:
    <a href="https://mybinder.org/v2/gh/rte-france/Grid2Op/master"><img src="utils/img/badge_logo.svg"></a>

<a class="anchor" id="why_opp"></a>
<h3 style="font-family:'Verdana',sans-serif; color:#1D7874;">A. Why an opponent</h3>

<p style="font-family:'Verdana','sans-serif'; color:#393D3F; text-align:justify; font-size:14px;">
    In the power system literature, many robustness criteria are used. In most of this literature (and for real-time operation at most TSOs!), the "N-1" security criterion is the most common one. One of the consequences of this criterion is that: </p>
    <ul style="font-family:'Verdana','sans-serif'; color:#393D3F; text-align:justify; font-size:14px;">
        <li> at each time step, the flow on each powerline should be lower than the "thermal limit" [in case of grid2op this translates to np.all(obs.rho < 1)] </li>
        <li> if any single element of the grid fails, the above condition must still hold </li>
    </ul>
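<p style="font-family:'Verdana','sans-serif'; color:#393D3F; text-align:justify; font-size:14px;">
    The two conditions above can be sketched in a few lines. This is only an illustration: <code>rho_after_outage</code> is a hypothetical callable (not a grid2op function) that returns the loading of every powerline once a given line is disconnected, and the toy redistribution rule used in the example is not a real power flow.
</p>

```python
import numpy as np


def is_safe(rho):
    """True when every powerline is below its thermal limit (rho < 1)."""
    return bool(np.all(np.asarray(rho) < 1.0))


def is_n_minus_1_safe(rho_now, rho_after_outage):
    """Check the current state, then every single-line outage.

    rho_after_outage is a hypothetical helper: it maps a line id to the
    vector of loadings obtained once that line is disconnected.
    """
    if not is_safe(rho_now):
        return False
    return all(is_safe(rho_after_outage(line_id)) for line_id in range(len(rho_now)))


def make_toy_outage(rho_now):
    """Toy rule (NOT a power flow): the lost flow is split equally among the other lines."""
    rho_now = np.asarray(rho_now, dtype=float)

    def rho_after_outage(line_id):
        rho = rho_now.copy()
        rho += rho_now[line_id] / (len(rho_now) - 1)
        rho[line_id] = 0.0  # the disconnected line carries no flow
        return rho

    return rho_after_outage


lightly_loaded = [0.5, 0.6, 0.4]
heavily_loaded = [0.5, 0.9, 0.4]
print(is_n_minus_1_safe(lightly_loaded, make_toy_outage(lightly_loaded)))
print(is_safe(heavily_loaded), is_n_minus_1_safe(heavily_loaded, make_toy_outage(heavily_loaded)))
```

<p style="font-family:'Verdana','sans-serif'; color:#393D3F; text-align:justify; font-size:14px;">
    With grid2op, such a helper could for instance be built on top of <code>obs.simulate</code>; this is a suggestion on our side, not something this notebook does.
</p>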
<p style="font-family:'Verdana','sans-serif'; color:#393D3F; text-align:justify; font-size:14px;">
    This means that an agent should be robust not only in the current state, but also in every hypothetical state where one of the powerlines is disconnected.
    <br><br>
    This criterion has many motivations.
    <br>
    Powergrids cover large areas, usually the size of a whole country or state, and count hundreds of thousands of pieces of "equipment" (varying in size from the tiny screws that fix two things together to transformers the size of a building) spread across hundreds of square kilometers (or miles). Under these conditions, the probability that at least one piece of equipment fails is far from negligible: if each object fails with probability <i>p</i>, then the probability that at least one object among <i>N</i> fails is [assuming independent failures] <i>1 - (1 - p)^N</i>, which goes quickly to 1 when <i>N</i> is 10000, for example, even if <i>p</i> is really small.
    <br>
    The gigantic size of a powergrid also means that something is likely to be going wrong somewhere. The "failure" of a piece of equipment has dozens of possible causes: it is in bad shape (not enough maintenance), or it suffers from a natural external aggression (wind storm, lightning strike, a tree falling on it, an outside temperature high enough to cause thermal issues, etc.). Once again, the larger the grid, the higher the odds that these conditions arise.
    <br>
    A powergrid can also be the "victim" of malicious attacks (a person hacks a piece of software or physically attacks the equipment with a bulldozer or a bomb, for example), or it can be operated outside its standard operating range (human error, bad data sent to the control center, etc.).
    <br><br>
    Most of these phenomena have a common consequence: one or more powerlines will be disconnected from the grid. This is why the criterion presented at the beginning of this section is often chosen as the security criterion.
    <br>
    Unfortunately, this criterion implies a significant computational burden. The grid proposed for this competition has 59 powerlines. Making sure your agent is "safe" according to this standard power system criterion would therefore require, at each time step, 60 times the computation done today (we would need to compute not just one state of the grid, but one with all powerlines connected plus one for each of the 59 possible powerline disconnections). This is not recommended in practice.
    <br><br>
    This is why we decided to introduce an "opponent" in this track. This opponent has the power to force the disconnection of any powerline it wants. Concretely, it acts by looking at the observation (just like your agent) and chooses to disconnect a powerline for a given number of time steps. During this time, it is not possible to reconnect that powerline. The opponent has limits: it cannot attack over and over again (which would make the problem intractable).
    <br>
    As a thought experiment, imagine that the opponent simulates the outcome of every possible powerline disconnection and then disconnects the worst one (i.e. the one leading to the worst grid state). An agent that successfully manages all the scenarios in this setting is then robust according to the power system "N-1" security criterion.
    <br>
    The opponent forces the agents to be as resilient as possible, without requiring too much computing power.
    <br><br>
    When an attack occurs, the powerline is automatically <i>disconnected</i>. Note that it will stay <i>disconnected</i> even after the "work" of the opponent is over, until you actually reconnect it. This means that if you don't reconnect the attacked lines, they will remain disconnected until the end of the scenario, leaving you more vulnerable to future attacks!
    <br><br>
    For this track you will be asked to study the powergrid shown below:
</p>

In [None]:
import numpy as np
import matplotlib
import grid2op
import re
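# compare versions numerically: a plain string comparison would misorder e.g. "1.10.0" and "1.2.0"
installed_version = tuple(int(part) for part in re.findall(r"\d+", grid2op.__version__)[:3])
assert installed_version >= (1, 1, 0), "You need grid2op at least 1.1.0 to compete in this track."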
from grid2op.PlotGrid import PlotMatplot
env_opp = grid2op.make("l2rpn_icaps_2021_small", difficulty="0")
env_opp.seed(3)  # for reproducible experiments
obs = env_opp.reset()
plot_helper_opp = PlotMatplot(env_opp.observation_space, width=1920, height=1080, line_id=False)
_ = plot_helper_opp.plot_layout()
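
<p style="font-family:'Verdana','sans-serif'; color:#393D3F; text-align:justify; font-size:14px;">
    The two back-of-the-envelope numbers used in section A can be checked with a short calculation. The sketch below assumes independent failures with identical probability <i>p</i>; the values of <i>p</i> are arbitrary illustrations, not measured failure rates.
</p>

```python
def prob_at_least_one_failure(p, n):
    """Probability that at least one of n independent items fails,
    when each item fails with probability p."""
    return 1.0 - (1.0 - p) ** n


# illustrative values of p only (not real equipment failure rates)
for p in (1e-5, 1e-4, 1e-3):
    prob = prob_at_least_one_failure(p, 10_000)
    print("p = {:g}, N = 10000 -> P(at least one failure) = {:.3f}".format(p, prob))

# cost of a full N-1 check on the competition grid:
# one state with every line connected + one state per single-line outage
n_line = 59
n_powerflows = 1 + n_line
print("Grid states to compute per time step: {}".format(n_powerflows))
```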

<a class="anchor" id="opp_model"></a>
<h3 style="font-family:'Verdana',sans-serif; color:#1D7874;">B. Grid2op Opponent modeling</h3>
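
<p style="font-family:'Verdana','sans-serif'; color:#393D3F; text-align:justify; font-size:14px;">
    Before looking at the real environment, the rules stated in section A (forced disconnection for a given number of time steps, no reconnection during that time, no automatic reconnection afterwards) can be summarised with a toy model. <code>ToyAttackedLine</code> is a hypothetical class written for this illustration; it is not grid2op code, and its exact timing convention (when the counter starts and ends) is an assumption.
</p>

```python
class ToyAttackedLine:
    """Toy model of a single powerline under the opponent rules (not grid2op code)."""

    def __init__(self):
        self.connected = True
        self.cooldown = 0  # time steps during which the agent cannot act on the line

    def attack(self, duration):
        # the opponent forces the disconnection for `duration` time steps
        self.connected = False
        self.cooldown = duration

    def step(self, reconnect=False):
        # a reconnection request is only honoured once the cooldown has expired
        if reconnect and self.cooldown == 0:
            self.connected = True
        # time passes: the cooldown shrinks, but the line is NOT reconnected automatically
        self.cooldown = max(self.cooldown - 1, 0)
        return self.connected


line = ToyAttackedLine()
line.attack(duration=3)
during_attack = [line.step(reconnect=True) for _ in range(3)]  # requests are ignored
after_attack_idle = line.step()                                 # still disconnected
after_reconnect = line.step(reconnect=True)                     # the agent reconnects it
print(during_attack, after_attack_idle, after_reconnect)
```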

In [None]:
line_id_opp = 56
reconnect_action_opp = env_opp.action_space({"set_line_status": [(line_id_opp, +1)]})
do_nothing_opp = env_opp.action_space()
_ = plot_helper_opp.plot_obs(obs)

<p style="font-family:'Verdana','sans-serif'; color:#393D3F; text-align:justify; font-size:14px;">
    We know (only because we set the seed and coded this environment) that an attack will happen at time step 5, so we "do nothing" until then. To be transparent, we also show the state of the powergrid.
</p>    

In [None]:
for i in range(5):
    obs, reward, done, info = env_opp.step(do_nothing_opp)
_ = plot_helper_opp.plot_obs(obs)

<p style="font-family:'Verdana','sans-serif'; color:#393D3F; text-align:justify; font-size:14px;">
    Now let's do another step. As you will see, a powerline will be disconnected (we know this because we made this environment, but there is no way for an agent to actually know this information).
    </p>   

In [None]:
obs, reward, done, info = env_opp.step(do_nothing_opp)
_ = plot_helper_opp.plot_obs(obs)
print("The flow on this powerline is {:.1f}%"\
      "".format(100*obs.rho[line_id_opp]))
print("This powerline is unavailable for {} time steps".format(obs.time_before_cooldown_line[line_id_opp]))
print("I can also spot an attack by looking at the \"info\" dictionary, which tells me that an attack is taking " \
      "place on powerline: {}, and this attack will last {} time steps (in total, it started this time step "\
      "so it will be over in 47 = 48 - 1 time steps)." \
      "".format(np.where(info["opponent_attack_line"])[0], info["opponent_attack_duration"]))
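
<p style="font-family:'Verdana','sans-serif'; color:#393D3F; text-align:justify; font-size:14px;">
    The lookup used in the cell above can be wrapped in a small helper. This is a minimal sketch: <code>attacked_lines</code> is a hypothetical name, and the helper assumes that <code>info["opponent_attack_line"]</code> is either a boolean vector (one entry per powerline) or <code>None</code> / absent when no attack is taking place.
</p>

```python
import numpy as np


def attacked_lines(info):
    """Return the ids of the powerlines currently attacked by the opponent.

    Assumption: info["opponent_attack_line"] is a boolean vector with one
    entry per powerline, or None / missing when there is no attack.
    """
    attack = info.get("opponent_attack_line")
    if attack is None:
        return np.array([], dtype=int)
    return np.flatnonzero(attack)


# stand-in dictionaries for illustration (a real `info` comes from env.step)
print(attacked_lines({"opponent_attack_line": None}))
print(attacked_lines({"opponent_attack_line": np.array([False, True, False])}))
```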

In [None]:
obs, reward, done, info = env_opp.step(do_nothing_opp)
_ = plot_helper_opp.plot_obs(obs)
print("The powerline will be unavailable for another {} time steps."\
      "".format(obs.time_before_cooldown_line[line_id_opp]))

<p style="font-family:'Verdana','sans-serif'; color:#393D3F; text-align:justify; font-size:14px;">
There is nothing really interesting in here, so we will do nothing for 33 time steps
</p>

In [None]:
for i in range(33):
    obs, reward, done, info = env_opp.step(do_nothing_opp)
_ = plot_helper_opp.plot_obs(obs)
print("The next maintenance is scheduled in {} time steps (-1 = never)"\
      "".format(obs.time_next_maintenance[line_id_opp]))
print("The powerline will be unavailable for another {} time steps."\
      "".format(obs.time_before_cooldown_line[line_id_opp]))

<p style="font-family:'Verdana','sans-serif'; color:#393D3F; text-align:justify; font-size:14px;">
We see here that the powerline can be reconnected (<code>obs.time_before_cooldown_line[line_id_opp] == 0</code>), but it has not been reconnected automatically (it is still disconnected). In the next cell we reconnect it, now that it is possible.
</p>

In [None]:
# and now reconnect it
obs, reward, done, info = env_opp.step(reconnect_action_opp)
print("Can I act on the powerline: {}".format(obs.time_before_cooldown_line[line_id_opp] == 0))
print("Number of time steps before I can act on it again: {}".format(obs.time_before_cooldown_line[line_id_opp]))
print("Is the powerline connected: {}".format(obs.line_status[line_id_opp]))
print("The flow on it is {:.1f}A".format(obs.a_or[line_id_opp]))
_ = plot_helper_opp.plot_obs(obs)
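
<p style="font-family:'Verdana','sans-serif'; color:#393D3F; text-align:justify; font-size:14px;">
    Since attacked powerlines are never reconnected automatically, an agent may want to look for lines that are disconnected and available again at every time step. The helper below is a minimal sketch of that check; <code>lines_to_reconnect</code> is a hypothetical name, and the function only relies on the two observation vectors already used in this section.
</p>

```python
import numpy as np


def lines_to_reconnect(line_status, time_before_cooldown_line):
    """Ids of the powerlines that are disconnected AND can be acted upon."""
    line_status = np.asarray(line_status, dtype=bool)
    cooldown = np.asarray(time_before_cooldown_line)
    return np.flatnonzero(~line_status & (cooldown == 0))


# toy vectors for illustration: line 1 is reconnectable, line 2 is still in cooldown
print(lines_to_reconnect([True, False, False], [0, 0, 3]))

# With a real observation, this could be used as follows (same action as in the cell above):
# for line_id in lines_to_reconnect(obs.line_status, obs.time_before_cooldown_line):
#     action = env_opp.action_space({"set_line_status": [(line_id, +1)]})
```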