# Reinforcement Learning: an introductory example


## Libraries

In [None]:
%pip install -r requirements.txt

In [None]:
%matplotlib inline
# import libraries
import matplotlib
import networkx as nx
import numpy as np  # used below for matrix indexing and for loading the stored result
from RL.input import input_data
from RL.execution import execution as rl_exec

In [None]:
matplotlib.rcParams['figure.figsize'] = [16, 8]

In [None]:
# load input class
grid_data = input_data.InputData()

## Introducing the example network   

In this notebook we look at a small neighbourhood with one net station (MSR), 9 houses and a few streets connecting the houses to the MSR. First, the neighbourhood is drawn as a network graph with nodes and edges.

The goal is to develop an algorithm that thinks like an engineer and is capable of designing an electrical grid that meets certain constraints.

### Import and prepare data  

Here the grid data is imported and the nodes and edges are prepared for conversion into a networkx graph. In the plots below, the regular network nodes are drawn in blue, the MSR node in red and the household nodes in green.

In [None]:
# import csv files and prepare the grid data, i.e. create node and edge list and a positional
# dictionary of nodes
grid_data.import_csv_as_df()
grid_data.prepare_grid_data()

In [None]:
grid_data.df_edges.head()

In [None]:
grid_data.df_nodes.head()

### Create graph

Define a networkx graph object and add nodes and edges to the graph.

In [None]:
grid_data.update_networkx_graph()
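
As a rough illustration of what building such a graph involves, the sketch below adds a few nodes and weighted edges to a `networkx` graph. The node ids, coordinates and lengths are made up, and `build_graph` is a hypothetical helper; it is not the implementation of `update_networkx_graph`, which is not shown in this notebook.

```python
import networkx as nx


def build_graph(nodes, edges):
    """Build an undirected graph from node and edge records.

    nodes: iterable of (node_id, (x, y)) pairs (hypothetical layout)
    edges: iterable of (start_id, end_id, length) triples
    """
    graph = nx.Graph()
    for node_id, pos in nodes:
        graph.add_node(node_id, pos=pos)
    # The third item of each triple is stored under the "length" attribute.
    graph.add_weighted_edges_from(edges, weight="length")
    return graph


# Made-up toy data: one station (node 0) and three street nodes.
toy_nodes = [(0, (0.0, 0.0)), (1, (0.0, 1.0)), (2, (-1.0, 1.0)), (3, (1.0, 1.0))]
toy_edges = [(0, 1, 10.0), (1, 2, 15.0), (1, 3, 12.0)]

toy_graph = build_graph(toy_nodes, toy_edges)

# Positional dictionary in the form the nx.draw_networkx_* functions expect.
toy_pos = nx.get_node_attributes(toy_graph, "pos")

# Sum of the "length" attribute over all edges.
total_length = toy_graph.size(weight="length")
```

Keeping the positions in a separate dictionary is what allows the same graph to be drawn with different layouts, as the plotting cells in this notebook do with `nodes_pos` and `household_nodes_pos`.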

The data has been prepared. Draw the grid of this simple example neighbourhood to see what it looks like.

In [None]:
nx.draw_networkx_nodes(grid_data.network_graph, 
                       pos=grid_data.nodes_pos, 
                       nodelist=grid_data.node_list, node_color='b', node_size=100)
nx.draw_networkx_nodes(grid_data.network_graph, 
                       pos=grid_data.nodes_pos, 
                       nodelist=grid_data.node_list_msr, node_color='r', node_size=150)
nx.draw_networkx_nodes(grid_data.network_graph, 
                       pos=grid_data.household_nodes_pos, 
                       nodelist=grid_data.node_list_households, node_color='g', node_size=50)
nx.draw_networkx_edges(grid_data.network_graph, 
                       pos=grid_data.nodes_pos,
                       edgelist=grid_data.edge_tuples, edge_color='lightgrey', width=10)

### Possible solutions

In the real world a net station has a number of free cables, which may differ per type of station: for instance 8, but also 4, 5 or sometimes even 12.
In this small neighbourhood example the engineer can use at most 2 cables to connect all 9 houses, and may also use just 1 cable if that seems the best option. Some possible solutions are:

- Use two cables: one from the MSR to the 5 houses on the left and one from the MSR to the 4 houses on the right.
- Use one cable straight ahead and split it to the left and right with a branch joint ("aftakmof").
- Use one cable without a branch joint, going immediately to the left and following the road. This requires at least three connection joints ("verbindingsmoffen").
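
To compare designs like these, each one can be scored on total cable length and the number of joints. The sketch below does this for the three options. All lengths, joint counts and cost weights are invented placeholders, and `Design` and `design_cost` are hypothetical names, not the reward function used by the `RL` package.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Design:
    name: str
    cables: int            # number of cables leaving the station
    cable_length: float    # total length in metres (placeholder value)
    branch_joints: int     # "aftakmoffen"
    connection_joints: int # "verbindingsmoffen"


def design_cost(design, cost_per_metre=1.0, cost_per_joint=25.0):
    """Weighted sum of cable length and joints; both weights are assumptions."""
    joints = design.branch_joints + design.connection_joints
    return design.cable_length * cost_per_metre + joints * cost_per_joint


MAX_CABLES = 2  # limit stated in the text for this neighbourhood

# Placeholder numbers, chosen only to make the comparison concrete.
candidates = [
    Design("two cables, left and right", 2, 180.0, 0, 0),
    Design("one cable with a branch joint", 1, 150.0, 1, 0),
    Design("one cable following the road", 1, 160.0, 0, 3),
]

# Discard designs that need more cables than the station offers, then rank.
feasible = [d for d in candidates if d.cables <= MAX_CABLES]
ranked = sorted(feasible, key=design_cost)
for d in ranked:
    print(f"{d.name}: cost {design_cost(d):.1f}")
```

The ranking only reflects the placeholder numbers. The point is that a design is judged on several quantities at once, which is what the reward signal has to capture during training.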

## Demo

### Small demo with 500 episodes

In [None]:
# set number of episodes
rl_exec.params.episodes = 500

In [None]:
# Train the algorithm
for t in range(rl_exec.params.episodes):

    # run train iteration
    rl_exec.train_grid_planning()

    # report progress every 100 episodes
    if t % 100 == 0:
        print("episode: " + str(t))
        print("agents reward: " + str(rl_exec.env.reward))
        print("cable length: " + str(rl_exec.env.env_matrix[np.where(rl_exec.env.env_matrix[:, 6] == 1)[0], 4].sum()))
        # count the joints ("moffen") over all cables used in this episode
        n_moffen = 0
        for cable in rl_exec.env.cables_used:
            n_moffen += rl_exec.env.determine_number_of_moffen(rl_exec.env.env_matrix, rl_exec.env.grid.df_nodes, cable)
        print("number of moffen used: " + str(n_moffen))
        print("cables used: " + str(rl_exec.env.cables_used))
        print("matrix: " + str(rl_exec.env.env_matrix))
        print("max state value: " + str(max(rl_exec.agent.state_value)))

    # reset object for new run
    rl_exec.reset_env_elements()
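
The loop above follows the usual episodic pattern: run one episode, occasionally report progress, then reset the environment. The internals of `train_grid_planning` are not shown in this notebook, so the sketch below swaps in generic tabular Q-learning on a made-up five-state corridor to show what one training iteration typically involves. The environment, hyperparameters and function names are all assumptions for illustration.

```python
import random

N_STATES = 5          # states 0..4; reaching state 4 ends the episode
ACTIONS = (-1, +1)    # step left or step right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1  # assumed learning rate, discount, exploration


def choose_action(q_row, rng):
    """Epsilon-greedy choice with random tie-breaking."""
    if rng.random() < EPSILON:
        return rng.randrange(len(ACTIONS))      # explore
    best = max(q_row)
    return rng.choice([a for a, value in enumerate(q_row) if value == best])


def run_episode(q, rng, max_steps=100):
    """Run one episode and update the value table q in place."""
    state, total_reward = 0, 0.0
    for _ in range(max_steps):
        action = choose_action(q[state], rng)
        next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
        done = next_state == N_STATES - 1
        reward = 1.0 if done else -0.01          # small cost per step
        # Q-learning target: reward plus discounted best value of the next state.
        target = reward if done else reward + GAMMA * max(q[next_state])
        q[state][action] += ALPHA * (target - q[state][action])
        state, total_reward = next_state, total_reward + reward
        if done:
            break
    return total_reward


rng = random.Random(0)
q_table = [[0.0, 0.0] for _ in range(N_STATES)]
for episode in range(500):
    episode_reward = run_episode(q_table, rng)
    if episode % 100 == 0:
        print(f"episode: {episode}, reward: {episode_reward:.2f}")

# Greedy action per non-terminal state after training.
greedy_policy = [ACTIONS[row.index(max(row))] for row in q_table[:-1]]
```

In this toy setting there is nothing to reset between episodes because each call to `run_episode` starts from state 0. In the notebook that role is played by `reset_env_elements`.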

In [None]:
# show result
rl_exec.execute_grid_planning()

In [None]:
# retrieve the environment matrix that holds the planning result
result = rl_exec.env.env_matrix
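
The progress report in the training loop computes the cable length by selecting the rows of the environment matrix that have a 1 in column 6 and summing column 4 over those rows. The sketch below reproduces that indexing pattern on a made-up matrix. The meaning of the columns (column 4 as edge length, column 6 as a "cable is laid here" flag) is an assumption inferred from that print statement, not something the notebook states.

```python
import numpy as np

# Made-up matrix: one row per edge, 7 columns; only columns 4 and 6 matter here.
toy_matrix = np.zeros((4, 7))
toy_matrix[:, 4] = [10.0, 15.0, 12.0, 8.0]  # assumed: edge length
toy_matrix[:, 6] = [1, 0, 1, 1]             # assumed: 1 if the edge carries a cable

# Same pattern as in the training loop: row indices via np.where, then column 4.
used_rows = np.where(toy_matrix[:, 6] == 1)[0]
length_where = toy_matrix[used_rows, 4].sum()

# Equivalent and slightly shorter: index with the boolean mask directly.
length_mask = toy_matrix[toy_matrix[:, 6] == 1, 4].sum()
```

Both expressions select the same rows, so the shorter boolean-mask form could replace the `np.where` call if readability matters more than matching the existing code.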

In [None]:
edge_tuples1, edge_tuples2 = grid_data.create_nodes_and_edges_result(result)
grid_data.create_cable1_pos()
grid_data.create_cable2_pos()

In [None]:
nx.draw_networkx_nodes(grid_data.network_graph, 
                       pos=grid_data.nodes_pos, 
                       nodelist=grid_data.node_list, node_color='b', node_size=100)
nx.draw_networkx_nodes(grid_data.network_graph, 
                       pos=grid_data.nodes_pos, 
                       nodelist=grid_data.node_list_msr, node_color='r', node_size=150)
nx.draw_networkx_nodes(grid_data.network_graph, 
                       pos=grid_data.household_nodes_pos, 
                       nodelist=grid_data.node_list_households, node_color='g', node_size=50)
nx.draw_networkx_nodes(grid_data.network_graph, 
                       pos=grid_data.cable1_pos, 
                       nodelist=[8,9,10,11,12,13,14], node_color='purple', node_size=50)
nx.draw_networkx_nodes(grid_data.network_graph, 
                       pos=grid_data.cable2_pos, 
                       nodelist=[14,15,16,17,18,19,20], node_color='orange', node_size=50)
nx.draw_networkx_edges(grid_data.network_graph, 
                       pos=grid_data.cable1_pos,
                       edgelist=edge_tuples1, edge_color='purple', width=2)
nx.draw_networkx_edges(grid_data.network_graph, 
                       pos=grid_data.cable2_pos,
                       edgelist=edge_tuples2, edge_color='orange', width=2)

### Final result of 200k episodes

In [None]:
final_result = np.asmatrix(np.load("data/matrix_result.npy"))

In [None]:
final_result

In [None]:
edge_tuples1, edge_tuples2 = grid_data.create_nodes_and_edges_result(final_result)
grid_data.create_cable1_pos()
grid_data.create_cable2_pos()

In [None]:
nx.draw_networkx_nodes(grid_data.network_graph, 
                       pos=grid_data.nodes_pos, 
                       nodelist=grid_data.node_list, node_color='b', node_size=100)
nx.draw_networkx_nodes(grid_data.network_graph, 
                       pos=grid_data.nodes_pos, 
                       nodelist=grid_data.node_list_msr, node_color='r', node_size=150)
nx.draw_networkx_nodes(grid_data.network_graph, 
                       pos=grid_data.household_nodes_pos, 
                       nodelist=grid_data.node_list_households, node_color='g', node_size=50)
nx.draw_networkx_nodes(grid_data.network_graph, 
                       pos=grid_data.cable1_pos, 
                       nodelist=[8,9,10,11,12,13,14], node_color='purple', node_size=50)
nx.draw_networkx_nodes(grid_data.network_graph, 
                       pos=grid_data.cable2_pos, 
                       nodelist=[14,15,16,17,18,19,20], node_color='orange', node_size=50)
nx.draw_networkx_edges(grid_data.network_graph, 
                       pos=grid_data.cable1_pos,
                       edgelist=edge_tuples1, edge_color='purple', width=2)
nx.draw_networkx_edges(grid_data.network_graph, 
                       pos=grid_data.cable2_pos,
                       edgelist=edge_tuples2, edge_color='orange', width=2)