# Notebook for M3 assessment

This is a mock-up notebook for the M3 deliverable of the AIRGo project.

### Import the libraries

In [None]:
import grid2op
from grid2op.PlotGrid import PlotMatplot
from grid2op.Backend.PandaPowerBackend import PandaPowerBackend
from grid2op.Agent import DoNothingAgent
from grid2op.Episode import EpisodeData
import numpy as np
import os
import shutil
from grid2op.gym_compat import GymEnv
from gym import Env
from gym.utils.env_checker import check_env
import tqdm
from grid2op.Runner import Runner

### Create a Grid2op environment

Here we load the `l2rpn_case14_sandbox` environment. In the context of our project, the final test should use the French network as a whole.

As you can see, we set a seed so that the experiment is reproducible and the train/val/test sets are always the same.

In the final setup, the backend would be changed to PypowsyblBackend.

The `make` function is highly customizable: many of its parameters, as well as other classes, can be changed.
For more details: https://grid2op.readthedocs.io/en/latest/makeenv.html#grid2op.MakeEnv.make

In [None]:
env = grid2op.make("l2rpn_case14_sandbox", backend=PandaPowerBackend())
max_iter = 5  # we limit the number of iterations to reduce computation time. Put -1 if you don't want to limit it
env.seed(42)
obs = env.reset()

Create your train, validation and test environments. Warning: this cell should be run only once!

In [None]:
nm_env_train, nm_env_val, nm_env_test = env.train_val_split_random(
    pct_val=1., pct_test=1., add_for_test="test")

In [None]:
train_env = grid2op.make("l2rpn_case14_sandbox_train")

### Visualize the network and the data associated with each node

In [None]:
plot_helper = PlotMatplot(train_env.observation_space)
_ = plot_helper.plot_layout()

In [None]:
_ = plot_helper.plot_obs(obs)

### Different types of actions

<strong>There are five main types of possible actions</strong>:
* Injection actions
* Connection/disconnection of a line
* Topological configuration at every substation  

     <em>If the right parameters are given:</em>
* Redispatching
* Curtailment



For more details: https://grid2op.readthedocs.io/en/latest/action.html
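
As a rough illustration of the list above, the sketch below builds one action per type from plain dictionaries. It assumes `action_space` behaves like `env.action_space` (a callable that turns a dictionary into an action). The helper name `build_example_actions` and the indices `line_id` and `gen_id` are hypothetical and chosen only for illustration. Whether a redispatching or curtailment action is legal depends on the generator and on the environment parameters. Injection is left out of the sketch.

```python
def build_example_actions(action_space, line_id=0, gen_id=0):
    """Return one example action per action type.

    `action_space` is expected to behave like `env.action_space`.
    `line_id` and `gen_id` are placeholder indices (assumptions).
    """
    return {
        # An empty dictionary is the "do nothing" action
        "do_nothing": action_space({}),
        # -1 disconnects a powerline, +1 reconnects it
        "disconnect_line": action_space({"set_line_status": [(line_id, -1)]}),
        # Move the origin side of a powerline to busbar 2 of its substation
        "change_topology": action_space({"set_bus": {"lines_or_id": [(line_id, 2)]}}),
        # Ask a generator for +1 MW (only meaningful if it is dispatchable)
        "redispatch": action_space({"redispatch": [(gen_id, 1.0)]}),
        # Cap a generator at 80% of its maximum output (renewables only)
        "curtail": action_space({"curtail": [(gen_id, 0.8)]}),
    }
```

In the notebook this would be called as `build_example_actions(train_env.action_space)`, and each returned action can be printed to inspect its effect.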

### Create an agent

An agent is the algorithm that takes actions (any of those listed in the cell above) based on observations of the grid and on the rewards it receives.

In our case we chose the pre-implemented DoNothingAgent, which takes no action at any time step of the simulation. It is also possible to create a custom agent that follows the Grid2op framework and rules.

For more information: https://grid2op.readthedocs.io/en/latest/agent.html

This agent should be replaced with your personal RL agent.

In [None]:
my_agent = DoNothingAgent
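
To give an idea of what a custom agent could decide, here is a minimal sketch of a hand-written policy: reconnect the first disconnected powerline that is out of cooldown, otherwise do nothing. The function name `reconnect_or_do_nothing` is hypothetical. The sketch assumes the observation exposes `line_status` and `time_before_cooldown_line`, and that `action_space` behaves like `env.action_space`. It is an illustration, not the project's agent.

```python
import numpy as np


def reconnect_or_do_nothing(observation, action_space):
    """Reconnect the first disconnected powerline that is out of cooldown,
    otherwise return the do-nothing action."""
    line_status = np.asarray(observation.line_status, dtype=bool)
    cooldown = np.asarray(observation.time_before_cooldown_line)
    # Indices of lines that are disconnected and can be acted on right now
    candidates = np.flatnonzero(~line_status & (cooldown == 0))
    if candidates.size > 0:
        line_id = int(candidates[0])
        return action_space({"set_line_status": [(line_id, +1)]})
    # Empty dictionary: do nothing
    return action_space({})
```

This logic could serve as the body of the `act(self, observation, reward, done=False)` method of a class deriving from `grid2op.Agent.BaseAgent`, using `self.action_space` as the action space.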

### Train an agent

We use the training environment for the training phase of the model.

It is also possible to use a complete gym environment.  
For more details: https://grid2op.readthedocs.io/en/latest/gym.html

In [None]:
gym_env = GymEnv(train_env)  # wrap the training environment, as stated above

We can see all the possible actions that can be taken.

In [None]:
gym_env.action_space

And also the possible observations.

In [None]:
gym_env.observation_space

These can be changed to fit the more classical form expected by reinforcement learning algorithms that deal with a discrete action space, using

```python
from grid2op.gym_compat import DiscreteActSpace
# train_env is the training environment created earlier in this notebook
gym_env.action_space = DiscreteActSpace(train_env.action_space,
                                        attr_to_keep=["set_bus", "set_line_status_simple"])
```  
and  
```python
from grid2op.gym_compat import BoxGymObsSpace
# Keep only the line loading ratio "rho" in the observation
gym_env.observation_space = BoxGymObsSpace(train_env.observation_space,
                                           attr_to_keep=["rho"])
gym_env.observation_space
```
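
Before plugging in a learning algorithm, it can help to check that the wrapped environment can be stepped with random actions. The sketch below is one way to do this. The function name `random_rollout` is hypothetical, and the code assumes `gym_env` follows either the older gym API (`reset` returns the observation, `step` returns four values) or the newer gym/gymnasium API (`reset` returns `(obs, info)`, `step` returns five values), depending on the installed version.

```python
def random_rollout(gym_env, max_steps=10):
    """Play random actions and return (number of steps, cumulative reward)."""
    reset_out = gym_env.reset()
    # Newer API returns (obs, info); older API returns the observation only
    obs = reset_out[0] if isinstance(reset_out, tuple) else reset_out
    total_reward, n_steps = 0.0, 0
    for _ in range(max_steps):
        step_out = gym_env.step(gym_env.action_space.sample())
        if len(step_out) == 5:
            # Newer API: separate "terminated" and "truncated" flags
            obs, reward, terminated, truncated, _info = step_out
            done = terminated or truncated
        else:
            # Older API: a single "done" flag
            obs, reward, done, _info = step_out
        total_reward += float(reward)
        n_steps += 1
        if done:
            break
    return n_steps, total_reward
```

For example, `random_rollout(gym_env, max_steps=max_iter)` would play at most `max_iter` random steps on the environment defined above.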

Because our agent, `DoNothingAgent`, cannot be trained, here is an example of what training could look like with a neural network.

Once you have your own agent, you can run some learning iterations using:
```python
from YOUR_PACKAGE import YOUR_MODEL
nn_model = YOUR_MODEL(env=gym_env,
                      learning_rate=1e-3,
                      policy="YOUR_POLICY",
                      policy_kwargs={"net_arch": [100, 100, 100]},  # just an example of architecture
                      n_steps=2,
                      batch_size=8,
                      verbose=True,
                      )
```  
and
```python
nn_model.learn(total_timesteps=LEARNING_ITERATION)
```

### Evaluate your agent

In [None]:
save_path = "saved_agent_DoNothingAgent"
path_save_results = "{}_results".format(save_path)
shutil.rmtree(path_save_results, ignore_errors=True)


runner = Runner(**env.get_params_for_runner(),
                agentClass=my_agent
               )
res = runner.run(nb_episode=1, 
                 max_iter=max_iter,
#                  pbar=tqdm,
                 path_save=f"./{path_save_results}")

In [None]:
print("The results for DoNothing agent are:")
for _, chron_name, cum_reward, nb_time_step, max_ts in res:
    msg_tmp = "\tFor chronics with id {}\n".format(chron_name)
    msg_tmp += "\t\t - cumulative reward: {:.6f}\n".format(cum_reward)
    msg_tmp += "\t\t - number of time steps completed: {:.0f} / {:.0f}".format(nb_time_step, max_ts)
    print(msg_tmp)
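
When several episodes are run, it can be convenient to aggregate the results instead of reading them one by one. The sketch below assumes each entry of `res` starts with the same five fields as in the loop above. The helper name `summarize_results` and the returned keys are hypothetical.

```python
def summarize_results(res):
    """Aggregate Runner results into a few summary statistics."""
    rewards, completion = [], []
    for row in res:
        # Only the first five fields are used, matching the loop above
        _, _chron_name, cum_reward, nb_time_step, max_ts = row[:5]
        rewards.append(float(cum_reward))
        # Fraction of the episode survived by the agent
        completion.append(nb_time_step / max_ts if max_ts else 0.0)
    n = len(rewards)
    return {
        "n_episodes": n,
        "mean_reward": sum(rewards) / n if n else 0.0,
        "mean_completion": sum(completion) / n if n else 0.0,
    }
```

Calling `summarize_results(res)` after the runner has finished returns a small dictionary that can be compared across agents.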

In [None]:
os.listdir(path_save_results)
EpisodeData.list_episode(path_save_results)


In [None]:
all_episodes = EpisodeData.list_episode(path_save_results)
this_episode = EpisodeData.from_disk(*all_episodes[0])
li_actions = this_episode.actions

Extract all the actions taken by the agent as dictionaries

In [None]:
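# Convert every action of the episode to a dictionary
li_dict_act = [act.as_dict() for act in li_actions]
dict_act_ = li_dict_act[-1]  # dictionary of the last action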

In [None]:
dict_act_

We can see it is empty, which is expected because our agent does not take any action.

We can now check some observed values for the episode. Here, for example, we look at the status of the lines (connected/disconnected) at every step and count the number of actual disconnections.

In [None]:
li_observations = this_episode.observations
nb_real_disc = 0
for obs_ in li_observations:
    nb_real_disc += (~obs_.line_status).sum()  # line_status is a boolean array
print(f'Total number of disconnected powerlines accumulated over all the timesteps: {nb_real_disc}')
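
The total above does not say which powerlines were disconnected. A possible refinement, sketched below, stacks the per-step statuses into a matrix and counts the disconnected steps for each line. The function name `disconnections_per_line` is hypothetical, and the sketch assumes at least one observation, each exposing a boolean `line_status` vector of the same length.

```python
import numpy as np


def disconnections_per_line(observations):
    """Count, for each powerline, the number of steps it was disconnected."""
    # Shape: (number of steps, number of powerlines)
    status = np.stack([np.asarray(obs.line_status, dtype=bool)
                       for obs in observations])
    # A line is disconnected where its status is False
    return (~status).sum(axis=0)
```

With the episode loaded above, `disconnections_per_line(li_observations)` would return one count per powerline, and its sum equals the total printed by the previous cell.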

In [None]:
actions_count = {}
for act in li_actions:
    act_as_vect = tuple(act.to_vect())
    if act_as_vect not in actions_count:
        actions_count[act_as_vect] = 0
    actions_count[act_as_vect] += 1
print("The agent took {} distinct actions:\n".format(len(actions_count)))

In [None]:
for act in li_actions:
    print(act)