## Tutorial 2: Developing an original *Agent*
This tutorial demonstrates how to develop your own heuristic algorithm using the *Agent* interface. 

*Agent* base classes are as follows: 

- `Agent`
- `KSPAgent`
- `PrioritizedKSPAgent`
- `KSPDRLAgent` (used in **Tutorial 3**)

Select the base *Agent* that matches your routing algorithm. 

In [1]:
!pip install git+https://github.com/Optical-Networks-Group/rsa-rl.git


## Experimental Settings
For evaluation, prepare an *Environment* and an evaluation function. 
See **Tutorial 1** for details if you have not read it yet. 

In [2]:
import functools

from rsarl.envs import DeepRMSAEnv, make_multiprocess_vector_env
from rsarl.requester import UniformRequester
from rsarl.networks import SingleFiberNetwork
from rsarl.evaluator import batch_warming_up, batch_evaluation, batch_summary

In [3]:
# exp settings
n_envs, seed = 5, 0
n_requests = 10000

# build network
net = SingleFiberNetwork("nsf", n_slot=100, is_weight=True)
# build requester
requester = UniformRequester(
    net.n_nodes,
    avg_service_time=10,
    avg_request_arrival_rate=12)
# build env
env = DeepRMSAEnv(net, requester)
envs = make_multiprocess_vector_env(env, n_envs, seed, test=True)

In [4]:
def _evaluation(envs, agent, n_requests): 
    # start simulation
    _ = envs.reset()
    # warm up the network before measuring
    batch_warming_up(envs, agent, n_requests=3000)
    # evaluation
    experiences = batch_evaluation(envs, agent, n_requests=n_requests)
    # calc performance
    blocking_probs, avg_utils, total_rewards = batch_summary(experiences)

    for env_id, (blocking_prob, avg_util, total_reward) in enumerate(zip(blocking_probs, avg_utils, total_rewards)):
        print(f'[{env_id}-th ENV]Blocking Probability: {blocking_prob}')
        print(f'[{env_id}-th ENV]Avg. Slot-utilization: {avg_util}')
        print(f'[{env_id}-th ENV]Total Rewards: {total_reward}')
    
evaluation = functools.partial(_evaluation, envs=envs, n_requests=n_requests)

## Case 1: Develop your algorithm by using *PrioritizedKSPAgent*
First, you will develop an *Agent* based on ***PrioritizedKSPAgent***.   
*PrioritizedKSPAgent* uses a routing algorithm that prefers shorter paths among the k shortest paths. 
Thus, you only need to develop the spectrum assignment algorithm by overriding the ***assign_spectrum*** method. 

In [5]:
from rsarl.agents import PrioritizedKSPAgent
from rsarl.algorithms import SpectrumAssignment

In [6]:
class SampleAgent(PrioritizedKSPAgent):
    
    def assign_spectrum(self, net, path: list, n_req_slot: int) -> int:
        """
            net: Network
            path: list of node ids
            n_req_slot: number of slots required on the path
            returns: start index of the slots to assign, or None if unavailable
        """
        # available slots along the path
        path_slot = net.path_slot(path)
        
        # develop your own spectrum assignment algorithm here
        # e.g., random fit is provided by RSA-RL as follows:
        # slot_index = SpectrumAssignment.random(path_slot, n_req_slot)
        # placeholder: always start at slot index 0
        slot_index = 0
        
        # return the start index of the slots to assign
        # if no slots are available, return None
        return slot_index
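
As an illustration of what could replace the `slot_index = 0` placeholder, here is a minimal first-fit sketch. It assumes that `path_slot` is a one-dimensional array in which a truthy value marks a free slot; that convention is an assumption and should be checked against RSA-RL before use. The helper name `first_fit` is hypothetical and not part of the library.

```python
import numpy as np


def first_fit(path_slot, n_req_slot):
    """Return the lowest start index of `n_req_slot` contiguous free slots.

    Assumption: truthy entries of `path_slot` mark free slots.
    Returns None when no sufficiently long free run exists.
    """
    free = np.asarray(path_slot, dtype=bool)
    run = 0  # length of the current run of free slots
    for i, is_free in enumerate(free):
        run = run + 1 if is_free else 0
        if run == n_req_slot:
            # the run ends at i, so it starts n_req_slot - 1 slots earlier
            return i - n_req_slot + 1
    return None


# toy slot map: slots 1-2 and 4-6 are free
print(first_fit([0, 1, 1, 0, 1, 1, 1], 3))
```

Under the same assumption, `assign_spectrum` could return `first_fit(path_slot, n_req_slot)` instead of a fixed index.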

In [7]:
agent = SampleAgent(k=5)
# pre-calculate the k shortest paths for all node pairs
agent.prepare_ksp_table(net)

In [8]:
evaluation(agent=agent)

[0-th ENV]Blocking Probability: 90.28
[0-th ENV]Avg. Slot-utilization: 0.02019813636363636
[0-th ENV]Total Rewards: -8056.0
[1-th ENV]Blocking Probability: 90.23
[1-th ENV]Avg. Slot-utilization: 0.020700681818181816
[1-th ENV]Total Rewards: -8046.0
[2-th ENV]Blocking Probability: 90.56
[2-th ENV]Avg. Slot-utilization: 0.02177254545454545
[2-th ENV]Total Rewards: -8112.0
[3-th ENV]Blocking Probability: 90.7
[3-th ENV]Avg. Slot-utilization: 0.020649681818181817
[3-th ENV]Total Rewards: -8140.0
[4-th ENV]Blocking Probability: 90.29
[4-th ENV]Avg. Slot-utilization: 0.021674318181818183
[4-th ENV]Total Rewards: -8058.0


## Case 2: Develop your algorithm by using *KSPAgent*
Second, if you do not want to use the ***priority*** of the *k*-shortest paths, select ***KSPAgent***, which only provides the ***k-shortest path table*** and lets you choose among the paths yourself. 
As an example, let's implement the *Entropy Agent* proposed in the paper: https://ieeexplore.ieee.org/document/6647621.   
Note that the *Agent* interface is constrained by ***Action***: `act` returns its decision as an *Action*. 

In [9]:
import numpy as np
from rsarl.data import Action
from rsarl.agents import KSPAgent
from rsarl.utils import cal_slot, sort_tuple
from rsarl.utils.fragmentation import edge_based_entropy

In [10]:
class EntropyAgent(KSPAgent):

    def act(self, observation):
        # get current network
        net = observation.net
        # unpack the current request
        src, dst, bandwidth, duration = observation.request
        # get the pre-calculated k shortest paths
        sd_tuple = (src, dst)
        paths = self.path_table[sort_tuple(sd_tuple)]

        # Search KSP
        candidates = []
        for _k in range(self.k):
            path = paths[_k]
            # physical length of the path
            path_len = net.distance(path)
            n_req_slot = cal_slot(bandwidth, path_len)
            # calc entropy
            ent = edge_based_entropy(net, path, n_req_slot)
            min_ent = np.min(ent)
            slot_index = np.argmin(ent)
            # candidate (k-path, slot-idx, n_req_slot, entropy)
            candidates.append((_k, int(slot_index), n_req_slot, min_ent))

        # search the minimum entropy among k-shortest paths
        path_id, start_idx, n_req_slot, _ = min(candidates, key=lambda item:item[3])
        path = paths[path_id]

        act = Action(path, start_idx, n_req_slot, duration)
        return act
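
To give some intuition for the quantity being minimized, the sketch below computes a Shannon-entropy-style fragmentation measure over a single slot map: contiguous blocks of equal state (used or free) are treated as fragments, and the entropy of their relative sizes is returned. This is only an illustration of the idea; it is not necessarily what `edge_based_entropy` computes, and the function name `fragmentation_entropy` is hypothetical.

```python
import numpy as np


def fragmentation_entropy(slots):
    """Shannon entropy over contiguous blocks of equal state in a slot map.

    Assumption: `slots` is a non-empty 1-D sequence of 0/1 values.
    A single block gives 0; more, smaller blocks give a larger value.
    """
    slots = np.asarray(slots, dtype=int)
    n = slots.size
    # positions where the state changes mark the block boundaries
    boundaries = np.flatnonzero(np.diff(slots)) + 1
    block_sizes = np.diff(np.concatenate(([0], boundaries, [n])))
    p = block_sizes / n
    return float(-np.sum(p * np.log(p)))


# two blocks of equal size versus four blocks of size one
print(fragmentation_entropy([1, 1, 0, 0]))
print(fragmentation_entropy([1, 0, 1, 0]))
```

The more fragmented map yields the larger value, which is why an agent of this kind picks the assignment with the minimum entropy.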

In [11]:
agent = EntropyAgent(k=5)
# pre-calculate the k shortest paths for all node pairs
agent.prepare_ksp_table(net)

In [12]:
evaluation(agent=agent)

[0-th ENV]Blocking Probability: 13.120000000000001
[0-th ENV]Avg. Slot-utilization: 0.5234910454545454
[0-th ENV]Total Rewards: 7376.0
[1-th ENV]Blocking Probability: 11.61
[1-th ENV]Avg. Slot-utilization: 0.5151460454545455
[1-th ENV]Total Rewards: 7678.0
[2-th ENV]Blocking Probability: 13.01
[2-th ENV]Avg. Slot-utilization: 0.5207506818181817
[2-th ENV]Total Rewards: 7398.0
[3-th ENV]Blocking Probability: 12.839999999999998
[3-th ENV]Avg. Slot-utilization: 0.5237660454545454
[3-th ENV]Total Rewards: 7432.0
[4-th ENV]Blocking Probability: 12.49
[4-th ENV]Avg. Slot-utilization: 0.5172477727272727
[4-th ENV]Total Rewards: 7502.0


## Case 3: Develop routing and spectrum assignment algorithm by using *Agent*
Finally, if you want to use a different routing algorithm, use ***Agent*** directly. 
As with *KSPAgent*, the *Agent* interface is constrained by *Action*. 

In [13]:
from rsarl.data import Action
from rsarl.agents import Agent

In [14]:
class SampleAgent2(Agent):

    def act(self, observation):
        # get current network
        net = observation.net
        # unpack the current request
        src, dst, bandwidth, duration = observation.request
        
        # develop your own routing and spectrum assignment algorithms
        # placeholder: no action is taken, so every request is blocked
        act = None
        # path = 
        # start_idx = 
        # n_req_slot = 
        # act = Action(path, start_idx, n_req_slot, duration)
        return act
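
As one possible starting point for a custom routing step, the sketch below runs Dijkstra's algorithm on a plain adjacency dictionary. The dictionary `adj` is a hypothetical stand-in for the network topology; how to obtain the topology from `net` is not covered here and would need to be checked against RSA-RL. The function name `shortest_path` is likewise only illustrative.

```python
import heapq


def shortest_path(adj, src, dst):
    """Dijkstra over adj = {node: {neighbor: weight}}.

    Returns the list of node ids from src to dst, or None if unreachable.
    Assumption: all edge weights are non-negative.
    """
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue  # stale heap entry
        visited.add(u)
        if u == dst:
            break
        for v, w in adj.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None
    # walk the predecessor links back from dst to src
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


# toy triangle: going 0 -> 1 -> 2 (cost 2) beats the direct edge 0 -> 2 (cost 4)
toy = {0: {1: 1, 2: 4}, 1: {0: 1, 2: 1}, 2: {0: 4, 1: 1}}
print(shortest_path(toy, 0, 2))
```

A path found this way could then be combined with a spectrum assignment step to fill in `path`, `start_idx`, and `n_req_slot` for the *Action*.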

In [15]:
agent = SampleAgent2()
evaluation(agent=agent)

[0-th ENV]Blocking Probability: 100.0
[0-th ENV]Avg. Slot-utilization: 0.0
[0-th ENV]Total Rewards: -10000.0
[1-th ENV]Blocking Probability: 100.0
[1-th ENV]Avg. Slot-utilization: 0.0
[1-th ENV]Total Rewards: -10000.0
[2-th ENV]Blocking Probability: 100.0
[2-th ENV]Avg. Slot-utilization: 0.0
[2-th ENV]Total Rewards: -10000.0
[3-th ENV]Blocking Probability: 100.0
[3-th ENV]Avg. Slot-utilization: 0.0
[3-th ENV]Total Rewards: -10000.0
[4-th ENV]Blocking Probability: 100.0
[4-th ENV]Avg. Slot-utilization: 0.0
[4-th ENV]Total Rewards: -10000.0


## Conclusion
That's all! 
This tutorial demonstrated how to develop your own heuristic *Agent*. 
The next tutorial demonstrates how to develop your own *Agent* with deep reinforcement learning. 